Message ID | 1288984844.2665.52.camel@edumazet-laptop
---|---
State | Not Applicable, archived
Delegated to | David Miller
On Fri, 05 Nov 2010 20:20:44 +0100 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Friday 05 November 2010 at 11:28 -0700, Andrew Morton wrote:
> > But we haven't established that there _is_ duplicated code which needs
> > that treatment.
> >
> > Scanning arch/x86/include/asm/atomic.h, perhaps ATOMIC_INIT() is a
> > candidate. But I'm not sure that it _should_ be hoisted up - if every
> > architecture happens to do it the same way then that's just a fluke.
>
> Not sure I understand you. I was trying to avoid recursive includes, but
> that should be protected anyway. I see a lot of code that could be
> factorized in this new header (atomic_inc_not_zero() for example)

Ah. I wasn't able to see much duplicated code at all, so I wasn't sure that we needed to bother about this issue.

yup, atomic_inc_not_zero() looks like a candidate.

> [PATCH v3] atomic: add atomic_inc_not_zero_hint()

Let's go with this for now ;)

I'll assume that you intend to make use of this function soon, and it looks safe enough to sneak it into 2.6.37-rc2, IMO. If Linus shouts at me then we could merge it into 2.6.38-rc1 via net-next, but I think straight-to-mainline is best.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Friday 05 November 2010 at 12:39 -0700, Andrew Morton wrote:
> Ah. I wasn't able to see much duplicated code at all, so I wasn't sure
> that we needed to bother about this issue.
>
> yup, atomic_inc_not_zero() looks like a candidate.

yes, and atomic_add_unless()...

> > [PATCH v3] atomic: add atomic_inc_not_zero_hint()
>
> Let's go with this for now ;)
>
> I'll assume that you intend to make use of this function soon, and it
> looks safe enough to sneak it into 2.6.37-rc2, IMO. If Linus shouts at
> me then we could merge it into 2.6.38-rc1 via net-next, but I think
> straight-to-mainline is best.

Well, I don't expect to use it before 2.6.38, no hurry Andrew, but it probably can be merged earlier, since it has no user yet. It'll help our job for sure.

Thanks
On Fri, Nov 05, 2010 at 08:20:44PM +0100, Eric Dumazet wrote:
> On Friday 05 November 2010 at 11:28 -0700, Andrew Morton wrote:
> > But we haven't established that there _is_ duplicated code which needs
> > that treatment.
> >
> > Scanning arch/x86/include/asm/atomic.h, perhaps ATOMIC_INIT() is a
> > candidate. But I'm not sure that it _should_ be hoisted up - if every
> > architecture happens to do it the same way then that's just a fluke.
>
> Not sure I understand you. I was trying to avoid recursive includes, but
> that should be protected anyway. I see a lot of code that could be
> factorized in this new header (atomic_inc_not_zero() for example)
>
> Thanks
>
> [PATCH v3] atomic: add atomic_inc_not_zero_hint()
>
> Followup of perf tools session in Netfilter WorkShop 2010
>
> In the network stack we make heavy use of atomic_inc_not_zero() in
> contexts where we know the probable value of the atomic before the
> increment (2 for udp sockets for example)
>
> Using a special version of atomic_inc_not_zero() giving this hint can
> help the processor use fewer bus transactions.
>
> On x86 (MESI protocol) for example, this avoids entering Shared state,
> because "lock cmpxchg" issues an RFO (Read For Ownership)
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Christoph Lameter <cl@linux-foundation.org>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Andi Kleen <andi@firstfloor.org>
> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
> Cc: David Miller <davem@davemloft.net>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Looks quite good to me!

Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

> Cc: Nick Piggin <npiggin@kernel.dk>
> ---
> V3: adds the include <asm/atomic.h>
>     if hint is null, use atomic_inc_not_zero() (Paul suggestion)
> V2: add #ifndef atomic_inc_not_zero_hint
>     kerneldoc changes
>     test that hint is not null
>     Meant to be included at end of arch/*/asm/atomic.h files
>
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> new file mode 100644
> index 0000000..5a7df87
> --- /dev/null
> +++ b/include/linux/atomic.h
> @@ -0,0 +1,37 @@
> +#ifndef _LINUX_ATOMIC_H
> +#define _LINUX_ATOMIC_H
> +#include <asm/atomic.h>
> +
> +/**
> + * atomic_inc_not_zero_hint - increment if not null
> + * @v: pointer of type atomic_t
> + * @hint: probable value of the atomic before the increment
> + *
> + * This version of atomic_inc_not_zero() gives a hint of probable
> + * value of the atomic. This helps processor to not read the memory
> + * before doing the atomic read/modify/write cycle, lowering
> + * number of bus transactions on some arches.
> + *
> + * Returns: 0 if increment was not done, 1 otherwise.
> + */
> +#ifndef atomic_inc_not_zero_hint
> +static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
> +{
> +	int val, c = hint;
> +
> +	/* sanity test, should be removed by compiler if hint is a constant */
> +	if (!hint)
> +		return atomic_inc_not_zero(v);
> +
> +	do {
> +		val = atomic_cmpxchg(v, c, c + 1);
> +		if (val == c)
> +			return 1;
> +		c = val;
> +	} while (c);
> +
> +	return 0;
> +}
> +#endif
> +
> +#endif /* _LINUX_ATOMIC_H */
prefetchw() would be too much overhead?
On Fri, Nov 12, 2010 at 01:14:12PM -0600, Christoph Lameter wrote:
> > prefetchw() would be too much overhead?

No idea. Where do you believe that prefetchw() should be added?

							Thanx, Paul
On Sat, 13 Nov 2010, Paul E. McKenney wrote:
> On Fri, Nov 12, 2010 at 01:14:12PM -0600, Christoph Lameter wrote:
> > > prefetchw() would be too much overhead?
>
> No idea. Where do you believe that prefetchw() should be added?

It is another way to get an exclusive cache line for situations like this. No need to give a hint.
On Mon, Nov 15, 2010 at 07:57:10AM -0600, Christoph Lameter wrote:
> On Sat, 13 Nov 2010, Paul E. McKenney wrote:
> > On Fri, Nov 12, 2010 at 01:14:12PM -0600, Christoph Lameter wrote:
> > > > prefetchw() would be too much overhead?
> >
> > No idea. Where do you believe that prefetchw() should be added?
>
> It is another way to get an exclusive cache line for situations like
> this. No need to give a hint.

prefetchw doesn't work on Intel (or rather is equivalent to prefetch); for Intel you always need to explicitly write to get an exclusive line.

-Andi
On Mon, 15 Nov 2010, Andi Kleen wrote:
> > It is another way to get an exclusive cache line for situations like
> > this. No need to give a hint.
>
> prefetchw doesn't work on Intel (or rather is equivalent to prefetch),
> for Intel you always need to explicitly write to get an exclusive line.

Argh. You mean x86. Itanium could do it and is also by Intel. Could you please change that for x86 as well? Otherwise we will get more of these weird code twisters.
On Monday 15 November 2010 at 07:57 -0600, Christoph Lameter wrote:
> On Sat, 13 Nov 2010, Paul E. McKenney wrote:
> > On Fri, Nov 12, 2010 at 01:14:12PM -0600, Christoph Lameter wrote:
> > > > prefetchw() would be too much overhead?
> >
> > No idea. Where do you believe that prefetchw() should be added?
>
> It is another way to get an exclusive cache line for situations like
> this. No need to give a hint.

Exclusive access? As soon as another cpu takes it again, you lose.

It's not really the same thing... Maybe you miss the 'hint' intention entirely. We know the probable value of the counter; we don't want to read it.

In fact, prefetchw() is useful when you can issue it many cycles before the memory read you are going to perform [before the write]. On contended cache lines, it's a waste, because by the time your cpu is going to read memory, then perform the atomic compare_and_exchange(), another cpu might have dirtied the location again.

This is what we noticed during Netfilter Workshop 2010: a high performance cost at both atomic_read() and atomic_cmpxchg(). We tried prefetchw() and it was a performance drop. That was with only 16 cpus contending on a neighbour refcnt, and 5 million frames per second (5 million atomic increments, 5 million atomic decrements).

prefetchw() should be used in very specific spots, when a cpu is going to write into a private area (not potentially accessed by other cpus). We use it for example in __alloc_skb(), a bit before memset().

By the way, atomic_inc_not_zero_hint() is less code than [prefetchw(), atomic_inc_not_zero()]. Using one instruction [cmpxchg] with the memory pointer is better than three [prefetchw(), read(), cmpxchg()], particularly if you have high contention on the cache line.
On Mon, 15 Nov 2010, Eric Dumazet wrote:
> Exclusive access? As soon as another cpu takes it again, you lose.

Sure, but you want to avoid the fetch in shared mode here.

> It's not really the same thing... Maybe you miss the 'hint' intention
> entirely. We know the probable value of the counter; we don't want to
> read it.

Ok, maybe in this case you can predict the value, but in general it is difficult to always provide an expected value. It would be easier to be able to tell the processor that the cacheline should not be fetched as shared but immediately in exclusive state.

> atomic_read() and atomic_cmpxchg(). We tried prefetchw() and it was a
> performance drop. It was with only 16 cpus contending on neighbour

Does prefetchw work? Andi claims that prefetchw is not working on x86, and I doubt that you ran tests on Itanium.
> > atomic_read() and atomic_cmpxchg(). We tried prefetchw() and it was a
> > performance drop. It was with only 16 cpus contending on neighbour
>
> Does prefetchw work? Andi claims that prefetchw is not working on
> x86 and I doubt that you ran tests on Itanium.

AMD supports it due to their MOESI protocol, but it's not supported in MESIF as used by Intel QPI. The kernel maps it on Intel to ordinary prefetch.

-Andi
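In portable C, the write-intent hint under discussion is what GCC's `__builtin_prefetch` expresses with a second argument of 1; per Andi's point, on CPUs without a real prefetchw this degrades to an ordinary prefetch or to nothing. A sketch of the pattern Eric endorses (private buffer, hint issued well before the stores); `alloc_zeroed` is an illustrative name, not a kernel API:

```c
#include <stdlib.h>
#include <string.h>

/* Write-intent prefetch: rw=1 means "I am about to write this line".
 * It is only ever a hint; correctness never depends on it. */
static inline void prefetchw(const void *p)
{
	__builtin_prefetch(p, 1);
}

/* The __alloc_skb()-style pattern: the buffer is private (no other cpu
 * touches it), so the prefetch cannot be wasted by contention, and the
 * fetch overlaps whatever setup work runs before the memset. */
char *alloc_zeroed(size_t len)
{
	char *buf = malloc(len);

	if (buf) {
		prefetchw(buf);
		/* ... other setup work would overlap the fetch here ... */
		memset(buf, 0, len);
	}
	return buf;
}
```

On a contended line the same hint can hurt, as Eric's neighbour-refcnt numbers show: another cpu may dirty the line between the prefetch and the cmpxchg, turning the prefetch into pure extra coherence traffic.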
On Monday 15 November 2010 at 08:25 -0600, Christoph Lameter wrote:
> On Mon, 15 Nov 2010, Eric Dumazet wrote:
> > Exclusive access? As soon as another cpu takes it again, you lose.
>
> Sure, but you want to avoid the fetch in shared mode here.

Yes, this is what cmpxchg() does for sure.

> > It's not really the same thing... Maybe you miss the 'hint' intention
> > entirely. We know the probable value of the counter; we don't want to
> > read it.
>
> Ok, maybe in this case you can predict the value, but in general it is
> difficult to always provide an expected value. It would be easier to be
> able to tell the processor that the cacheline should not be fetched as
> shared but immediately in exclusive state.

Maybe it's not clear, but atomic_inc_not_zero_hint() is going to be used only in contexts where we know the expected value, and not as a generic replacement for atomic_inc_not_zero(). Even if the cache line is already hot in this cpu's cache, it should be faster or the same speed.

Then, in high-contention contexts, using atomic_inc_not_zero_hint() with whatever initial hint might also be a win over atomic_inc_not_zero(), but we try to remove such contexts ;) And two atomic_cmpxchg() are probably slower in non-contended contexts, in particular if the cache line is already hot in this cpu's cache.

> > atomic_read() and atomic_cmpxchg(). We tried prefetchw() and it was a
> > performance drop. It was with only 16 cpus contending on neighbour
>
> Does prefetchw work? Andi claims that prefetchw is not working on
> x86 and I doubt that you ran tests on Itanium.

In fact, in benchmarks, prefetch() or prefetchw() are a pain on x86, or at least "perf tools" show artifacts on them (a high number of cycles consumed on these instructions). Andi had a patch to disable prefetch() in list iterators, and it's a win.

I don't have an Itanium platform to run tests. Is cmpxchg() that bad on ia64? I also have only old AMD cpus, so I cannot say if recent ones handle prefetchw() better...
diff --git a/include/linux/atomic.h b/include/linux/atomic.h
new file mode 100644
index 0000000..5a7df87
--- /dev/null
+++ b/include/linux/atomic.h
@@ -0,0 +1,37 @@
+#ifndef _LINUX_ATOMIC_H
+#define _LINUX_ATOMIC_H
+#include <asm/atomic.h>
+
+/**
+ * atomic_inc_not_zero_hint - increment if not null
+ * @v: pointer of type atomic_t
+ * @hint: probable value of the atomic before the increment
+ *
+ * This version of atomic_inc_not_zero() gives a hint of probable
+ * value of the atomic. This helps processor to not read the memory
+ * before doing the atomic read/modify/write cycle, lowering
+ * number of bus transactions on some arches.
+ *
+ * Returns: 0 if increment was not done, 1 otherwise.
+ */
+#ifndef atomic_inc_not_zero_hint
+static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
+{
+	int val, c = hint;
+
+	/* sanity test, should be removed by compiler if hint is a constant */
+	if (!hint)
+		return atomic_inc_not_zero(v);
+
+	do {
+		val = atomic_cmpxchg(v, c, c + 1);
+		if (val == c)
+			return 1;
+		c = val;
+	} while (c);
+
+	return 0;
+}
+#endif
+
+#endif /* _LINUX_ATOMIC_H */