Message ID | 20200812012005.1919255-1-nathanl@linux.ibm.com (mailing list archive)
---|---
State | Superseded
Series | [v2] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch powerpc/merge (f04b169e953c4db1a3a3c1d23eea09c726f01ee5) |
snowpatch_ozlabs/build-ppc64le | warning | Upstream build failed, couldn't test patch |
snowpatch_ozlabs/build-ppc64be | warning | Upstream build failed, couldn't test patch |
snowpatch_ozlabs/build-ppc64e | warning | Upstream build failed, couldn't test patch |
snowpatch_ozlabs/build-pmac32 | warning | Upstream build failed, couldn't test patch |
snowpatch_ozlabs/checkpatch | warning | total: 0 errors, 2 warnings, 0 checks, 31 lines checked |
snowpatch_ozlabs/needsstable | warning | Please consider tagging this patch for stable! |
Le 12/08/2020 à 03:20, Nathan Lynch a écrit :
> The drmem lmb list can have hundreds of thousands of entries, and
> unfortunately lookups take the form of linear searches. As long as
> this is the case, traversals have the potential to monopolize the CPU
> and provoke lockup reports, workqueue stalls, and the like unless
> they explicitly yield.
>
> Rather than placing cond_resched() calls within various
> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
> expression of the loop macro itself so users can't omit it.
>
> Call cond_resched() on every 20th element. Each iteration of the loop
> in DLPAR code paths can involve around ten RTAS calls which can each
> take up to 250us, so this ensures the check is performed at worst
> every few milliseconds.
>
> Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/drmem.h | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> Changes since v1:
> * Add bounds assertions in drmem_lmb_next().
> * Call cond_resched() in the iterator on only every 20th element
>   instead of on every iteration, to reduce overhead in tight loops.
>
> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
> index 17ccc6474ab6..583277e30dd2 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -8,6 +8,9 @@
>  #ifndef _ASM_POWERPC_LMB_H
>  #define _ASM_POWERPC_LMB_H
>
> +#include <linux/bug.h>
> +#include <linux/sched.h>
> +
>  struct drmem_lmb {
>  	u64 base_addr;
>  	u32 drc_index;
> @@ -26,8 +29,21 @@ struct drmem_lmb_info {
>
>  extern struct drmem_lmb_info *drmem_info;
>
> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
> +{
> +	const unsigned int resched_interval = 20;
> +
> +	BUG_ON(lmb < drmem_info->lmbs);
> +	BUG_ON(lmb >= drmem_info->lmbs + drmem_info->n_lmbs);

BUG_ON() shall be avoided unless absolutely necessary.
Wouldn't WARN_ON() together with an early return be enough ?

> +
> +	if ((lmb - drmem_info->lmbs) % resched_interval == 0)
> +		cond_resched();

Do you need something that precise ? Can't you use 16 or 32 and use a
logical AND instead of a MODULO ?

And what guarantees that lmb is always an element of a table based at
drmem_info->lmbs ?

What about:

static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb,
		struct drmem_lmb *start)
{
	const unsigned int resched_interval = 16;

	if ((++lmb - start) & resched_interval == 0)
		cond_resched();

	return lmb;
}

#define for_each_drmem_lmb_in_range(lmb, start, end)			\
	for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb, start))

> +
> +	return ++lmb;
> +}
> +
>  #define for_each_drmem_lmb_in_range(lmb, start, end)		\
> -	for ((lmb) = (start); (lmb) < (end); (lmb)++)
> +	for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb))
>
>  #define for_each_drmem_lmb(lmb)			\
>  	for_each_drmem_lmb_in_range((lmb),		\

Christophe
Hi Christophe,

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
>> +{
>> +	const unsigned int resched_interval = 20;
>> +
>> +	BUG_ON(lmb < drmem_info->lmbs);
>> +	BUG_ON(lmb >= drmem_info->lmbs + drmem_info->n_lmbs);
>
> BUG_ON() shall be avoided unless absolutely necessary.
> Wouldn't WARN_ON() together with an early return be enough ?

Not sure what a sensible early return behavior would be. If the iterator
doesn't advance the pointer the behavior will be a hang.

BUG_ON a bounds-check failure is appropriate here; many users of this
API will corrupt memory otherwise.

>> +
>> +	if ((lmb - drmem_info->lmbs) % resched_interval == 0)
>> +		cond_resched();
>
> Do you need something that precise ? Can't you use 16 or 32 and use a
> logical AND instead of a MODULO ?

Eh if you're executing in this code you've already lost with respect to
performance considerations at this level, see the discussion on v1. I'll
use 16 since I'm going to reroll the patch though.

> And what guarantees that lmb is always an element of a table based at
> drmem_info->lmbs ?

Well, that's its only intended use right now. There should not be any
other arrays of drmem_lmb objects, and I hope we don't gain any.

> What about:
>
> static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb,
> 		struct drmem_lmb *start)
> {
> 	const unsigned int resched_interval = 16;
>
> 	if ((++lmb - start) & resched_interval == 0)
	                      ^^^
Did you mean '%' here? The bitwise AND doesn't do what I want.

Otherwise, making drmem_lmb_next() more general by adding a 'start'
argument could ease refactoring to come, so I'll do that.
Le 12/08/2020 à 15:46, Nathan Lynch a écrit :
> Hi Christophe,
>
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
>>> +{
>>> +	const unsigned int resched_interval = 20;
>>> +
>>> +	BUG_ON(lmb < drmem_info->lmbs);
>>> +	BUG_ON(lmb >= drmem_info->lmbs + drmem_info->n_lmbs);
>>
>> BUG_ON() shall be avoided unless absolutely necessary.
>> Wouldn't WARN_ON() together with an early return be enough ?
>
> Not sure what a sensible early return behavior would be. If the iterator
> doesn't advance the pointer the behavior will be a hang.

I was thinking about returning lmb++ without checking the cond_resched
stuff.

> BUG_ON a bounds-check failure is appropriate here; many users of this
> API will corrupt memory otherwise.

It looks really strange to me to do bounds-checks in an iterator like
this. Should be checked before entering the loop.

>>> +
>>> +	if ((lmb - drmem_info->lmbs) % resched_interval == 0)
>>> +		cond_resched();
>>
>> Do you need something that precise ? Can't you use 16 or 32 and use a
>> logical AND instead of a MODULO ?
>
> Eh if you're executing in this code you've already lost with respect to
> performance considerations at this level, see the discussion on v1. I'll
> use 16 since I'm going to reroll the patch though.
>
>> And what guarantees that lmb is always an element of a table based at
>> drmem_info->lmbs ?
>
> Well, that's its only intended use right now. There should not be any
> other arrays of drmem_lmb objects, and I hope we don't gain any.
>
>> What about:
>>
>> static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb,
>> 		struct drmem_lmb *start)
>> {
>> 	const unsigned int resched_interval = 16;
>>
>> 	if ((++lmb - start) & resched_interval == 0)
> 	                      ^^^
> Did you mean '%' here? The bitwise AND doesn't do what I want.

I meant '& (resched_interval - 1)' indeed. But yes we can leave the %,
GCC will change a % 16 to an & 15.

> Otherwise, making drmem_lmb_next() more general by adding a 'start'
> argument could ease refactoring to come, so I'll do that.

Yes, the main idea is to avoid the access to a global variable in such
a helper.

Christophe
On 8/11/20 6:20 PM, Nathan Lynch wrote:
> The drmem lmb list can have hundreds of thousands of entries, and
> unfortunately lookups take the form of linear searches. As long as
> this is the case, traversals have the potential to monopolize the CPU
> and provoke lockup reports, workqueue stalls, and the like unless
> they explicitly yield.
>
> Rather than placing cond_resched() calls within various
> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
> expression of the loop macro itself so users can't omit it.
>
> Call cond_resched() on every 20th element. Each iteration of the loop
> in DLPAR code paths can involve around ten RTAS calls which can each
> take up to 250us, so this ensures the check is performed at worst
> every few milliseconds.
>
> Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/drmem.h | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> Changes since v1:
> * Add bounds assertions in drmem_lmb_next().
> * Call cond_resched() in the iterator on only every 20th element
>   instead of on every iteration, to reduce overhead in tight loops.
>
> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
> index 17ccc6474ab6..583277e30dd2 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -8,6 +8,9 @@
>  #ifndef _ASM_POWERPC_LMB_H
>  #define _ASM_POWERPC_LMB_H
>
> +#include <linux/bug.h>
> +#include <linux/sched.h>
> +
>  struct drmem_lmb {
>  	u64 base_addr;
>  	u32 drc_index;
> @@ -26,8 +29,21 @@ struct drmem_lmb_info {
>
>  extern struct drmem_lmb_info *drmem_info;
>
> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
> +{
> +	const unsigned int resched_interval = 20;
> +
> +	BUG_ON(lmb < drmem_info->lmbs);
> +	BUG_ON(lmb >= drmem_info->lmbs + drmem_info->n_lmbs);

I think BUG_ON is a pretty big no-no these days unless there is no
other option.

-Tyrel

> +
> +	if ((lmb - drmem_info->lmbs) % resched_interval == 0)
> +		cond_resched();
> +
> +	return ++lmb;
> +}
> +
>  #define for_each_drmem_lmb_in_range(lmb, start, end)		\
> -	for ((lmb) = (start); (lmb) < (end); (lmb)++)
> +	for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb))
>
>  #define for_each_drmem_lmb(lmb)			\
>  	for_each_drmem_lmb_in_range((lmb),		\
Tyrel Datwyler <tyreld@linux.ibm.com> writes:
> On 8/11/20 6:20 PM, Nathan Lynch wrote:
>> The drmem lmb list can have hundreds of thousands of entries, and
>> unfortunately lookups take the form of linear searches. As long as
>> this is the case, traversals have the potential to monopolize the CPU
>> and provoke lockup reports, workqueue stalls, and the like unless
>> they explicitly yield.
>>
>> Rather than placing cond_resched() calls within various
>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>> expression of the loop macro itself so users can't omit it.
>>
>> Call cond_resched() on every 20th element. Each iteration of the loop
>> in DLPAR code paths can involve around ten RTAS calls which can each
>> take up to 250us, so this ensures the check is performed at worst
>> every few milliseconds.
>>
>> Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
>> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
>> ---
>>  arch/powerpc/include/asm/drmem.h | 18 +++++++++++++++++-
>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> Changes since v1:
>> * Add bounds assertions in drmem_lmb_next().
>> * Call cond_resched() in the iterator on only every 20th element
>>   instead of on every iteration, to reduce overhead in tight loops.
>>
>> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
>> index 17ccc6474ab6..583277e30dd2 100644
>> --- a/arch/powerpc/include/asm/drmem.h
>> +++ b/arch/powerpc/include/asm/drmem.h
>> @@ -8,6 +8,9 @@
>>  #ifndef _ASM_POWERPC_LMB_H
>>  #define _ASM_POWERPC_LMB_H
>>
>> +#include <linux/bug.h>
>> +#include <linux/sched.h>
>> +
>>  struct drmem_lmb {
>>  	u64 base_addr;
>>  	u32 drc_index;
>> @@ -26,8 +29,21 @@ struct drmem_lmb_info {
>>
>>  extern struct drmem_lmb_info *drmem_info;
>>
>> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
>> +{
>> +	const unsigned int resched_interval = 20;
>> +
>> +	BUG_ON(lmb < drmem_info->lmbs);
>> +	BUG_ON(lmb >= drmem_info->lmbs + drmem_info->n_lmbs);
>
> I think BUG_ON is a pretty big no-no these days unless there is no
> other option.

It's complicated, but yes we would like to avoid adding them if we can.

In a case like this there is no other option, *if* the check has to be
in drmem_lmb_next().

But I don't think we really need to check that there.

If for some reason this was called with an *lmb pointing outside of the
lmbs array it would confuse the cond_resched() logic, but it's not worth
crashing the box for that IMHO.

cheers
Michael Ellerman <mpe@ellerman.id.au> writes:
> Tyrel Datwyler <tyreld@linux.ibm.com> writes:
>> On 8/11/20 6:20 PM, Nathan Lynch wrote:
>>>
>>> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
>>> +{
>>> +	const unsigned int resched_interval = 20;
>>> +
>>> +	BUG_ON(lmb < drmem_info->lmbs);
>>> +	BUG_ON(lmb >= drmem_info->lmbs + drmem_info->n_lmbs);
>>
>> I think BUG_ON is a pretty big no-no these days unless there is no
>> other option.
>
> It's complicated, but yes we would like to avoid adding them if we can.
>
> In a case like this there is no other option, *if* the check has to be
> in drmem_lmb_next().
>
> But I don't think we really need to check that there.
>
> If for some reason this was called with an *lmb pointing outside of the
> lmbs array it would confuse the cond_resched() logic, but it's not worth
> crashing the box for that IMHO.

The BUG_ONs are pretty much orthogonal to the cond_resched().

It's not apparent from the context of the change, but some users of the
for_each_drmem_lmb* macros modify elements of the drmem_info->lmbs
array. If the lmb passed to drmem_lmb_next() violates the bounds check
(say, if the callsite inappropriately modifies it within the loop), such
users are guaranteed to corrupt other objects in memory. This was my
thinking in adding the BUG_ONs, and I don't see another place to do it.
Nathan Lynch <nathanl@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Tyrel Datwyler <tyreld@linux.ibm.com> writes:
>>> On 8/11/20 6:20 PM, Nathan Lynch wrote:
>>>>
>>>> +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
>>>> +{
>>>> +	const unsigned int resched_interval = 20;
>>>> +
>>>> +	BUG_ON(lmb < drmem_info->lmbs);
>>>> +	BUG_ON(lmb >= drmem_info->lmbs + drmem_info->n_lmbs);
>>>
>>> I think BUG_ON is a pretty big no-no these days unless there is no
>>> other option.
>>
>> It's complicated, but yes we would like to avoid adding them if we can.
>>
>> In a case like this there is no other option, *if* the check has to be
>> in drmem_lmb_next().
>>
>> But I don't think we really need to check that there.
>>
>> If for some reason this was called with an *lmb pointing outside of the
>> lmbs array it would confuse the cond_resched() logic, but it's not worth
>> crashing the box for that IMHO.
>
> The BUG_ONs are pretty much orthogonal to the cond_resched().

Yes that was kind of my point. We don't need them to implement the
cond_resched() logic.

> It's not apparent from the context of the change, but some users of the
> for_each_drmem_lmb* macros modify elements of the drmem_info->lmbs
> array. If the lmb passed to drmem_lmb_next() violates the bounds check
> (say, if the callsite inappropriately modifies it within the loop), such
> users are guaranteed to corrupt other objects in memory. This was my
> thinking in adding the BUG_ONs, and I don't see another place to do
> it.

If it's really something we're worried about, I think we'd be better off
putting checks for that in the code that's doing those modifications.
That way you have enough context to do something more nuanced than a
BUG(), ie. you can print a useful message and fail whatever operation is
in progress.

If we did that then we could also add those BUG_ONs() as a safety net.

What you really want is a way for the for_each_xxx() construct to
express that there was an error traversing the list, but there isn't
really a nice way to do that in C.

cheers
diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index 17ccc6474ab6..583277e30dd2 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -8,6 +8,9 @@
 #ifndef _ASM_POWERPC_LMB_H
 #define _ASM_POWERPC_LMB_H
 
+#include <linux/bug.h>
+#include <linux/sched.h>
+
 struct drmem_lmb {
 	u64 base_addr;
 	u32 drc_index;
@@ -26,8 +29,21 @@ struct drmem_lmb_info {
 
 extern struct drmem_lmb_info *drmem_info;
 
+static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
+{
+	const unsigned int resched_interval = 20;
+
+	BUG_ON(lmb < drmem_info->lmbs);
+	BUG_ON(lmb >= drmem_info->lmbs + drmem_info->n_lmbs);
+
+	if ((lmb - drmem_info->lmbs) % resched_interval == 0)
+		cond_resched();
+
+	return ++lmb;
+}
+
 #define for_each_drmem_lmb_in_range(lmb, start, end)		\
-	for ((lmb) = (start); (lmb) < (end); (lmb)++)
+	for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb))
 
 #define for_each_drmem_lmb(lmb)			\
 	for_each_drmem_lmb_in_range((lmb),		\
The drmem lmb list can have hundreds of thousands of entries, and
unfortunately lookups take the form of linear searches. As long as
this is the case, traversals have the potential to monopolize the CPU
and provoke lockup reports, workqueue stalls, and the like unless
they explicitly yield.

Rather than placing cond_resched() calls within various
for_each_drmem_lmb() loop blocks in the code, put it in the iteration
expression of the loop macro itself so users can't omit it.

Call cond_resched() on every 20th element. Each iteration of the loop
in DLPAR code paths can involve around ten RTAS calls which can each
take up to 250us, so this ensures the check is performed at worst
every few milliseconds.

Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/include/asm/drmem.h | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Changes since v1:
* Add bounds assertions in drmem_lmb_next().
* Call cond_resched() in the iterator on only every 20th element
  instead of on every iteration, to reduce overhead in tight loops.