From patchwork Wed Nov 2 00:14:34 2011
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 123194
X-Patchwork-Delegate: davem@davemloft.net
Message-ID: <1320192874.4728.19.camel@edumazet-laptop>
Subject: Re: [RFC] should VM_BUG_ON(cond) really evaluate cond
From: Eric Dumazet
To: Linus Torvalds
Cc: Andi Kleen, Ben Hutchings, linux-kernel, netdev, Andrew Morton
Date: Wed, 02 Nov 2011 01:14:34 +0100
In-Reply-To: References: <1319772566.6759.27.camel@deadeye>
 <1319777025.23112.67.camel@edumazet-laptop>
 <1319803781.23112.113.camel@edumazet-laptop>
 <1319813252.23112.122.camel@edumazet-laptop>
 <1319964754.13597.26.camel@edumazet-laptop>
 <20111030095918.GA19676@one.firstfloor.org>
 <1319987765.13597.60.camel@edumazet-laptop>
 <1319996494.13597.69.camel@edumazet-laptop>
 <1319997593.13597.76.camel@edumazet-laptop>
X-Mailing-List: netdev@vger.kernel.org

On Sunday, 30 October 2011 at 11:09 -0700, Linus Torvalds wrote:
> Argh. Ok. Testing a refcount in a const struct doesn't make much
> sense, but there does seem to be perfectly valid uses of it
> (sk_wmem_alloc etc).
>
> Annoying. I guess we have to have those casts. Grr.
>

OK, please check the following patch then.

Thanks !

[PATCH v3] atomic: introduce ACCESS_AT_MOST_ONCE() helper

In commit 4e60c86bd9e ("gcc-4.6: mm: fix unused but set warnings"),
Andi forced VM_BUG_ON(cond) to evaluate cond, even if CONFIG_DEBUG_VM
is not set:

#ifdef CONFIG_DEBUG_VM
#define VM_BUG_ON(cond) BUG_ON(cond)
#else
#define VM_BUG_ON(cond) do { (void)(cond); } while (0)
#endif

As a side effect, get_page()/put_page_testzero() perform extra bus
transactions on a contended cache line in some workloads (tcp_sendmsg()
for example, where a page acts as a shared buffer):

 0,05 : ffffffff815e4775: je     ffffffff815e4970
 0,05 : ffffffff815e477b: mov    0x1c(%r9),%eax   // useless
 3,32 : ffffffff815e477f: mov    (%r9),%rax       // useless
 0,51 : ffffffff815e4782: lock incl 0x1c(%r9)
 3,87 : ffffffff815e4787: mov    (%r9),%rax
 0,00 : ffffffff815e478a: test   $0x80,%ah
 0,00 : ffffffff815e478d: jne    ffffffff815e49f2

That's because both atomic_read() and constant_test_bit() use a
volatile qualifier, so the compiler is forced to perform the read even
when the result is optimized away.

Linus suggested using an asm("") trick, placed in a variant of
ACCESS_ONCE(), allowing the compiler to omit the memory read when the
result is unused.
This patch introduces the ACCESS_AT_MOST_ONCE() helper and uses it in
the x86 implementations of atomic_read() and constant_test_bit(). It is
also used in the x86_64 atomic64_read() implementation.

On x86_64, this reduces vmlinux text a bit (with CONFIG_DEBUG_VM=n):

# size vmlinux.old vmlinux.new
    text    data     bss      dec    hex filename
10706848 2894216 1540096 15141160 e70928 vmlinux.old
10704040 2894216 1540096 15138352 e6fe30 vmlinux.new

Basically an extension of a prior patch from Linus.

Signed-off-by: Eric Dumazet
---
 arch/x86/include/asm/atomic.h      |    5 +----
 arch/x86/include/asm/atomic64_64.h |    6 ++----
 arch/x86/include/asm/bitops.h      |    6 ++++--
 include/asm-generic/atomic.h       |    2 +-
 include/linux/compiler.h           |   10 ++++++++++
 5 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 58cb6d4..2581008 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -20,10 +20,7 @@
  *
  * Atomically reads the value of @v.
  */
-static inline int atomic_read(const atomic_t *v)
-{
-	return (*(volatile int *)&(v)->counter);
-}
+#define atomic_read(v)	ACCESS_AT_MOST_ONCE(*(int *)&(v)->counter)

 /**
  * atomic_set - set atomic variable
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 0e1cbfc..15bbad4 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -1,6 +1,7 @@
 #ifndef _ASM_X86_ATOMIC64_64_H
 #define _ASM_X86_ATOMIC64_64_H

+#include <linux/compiler.h>
 #include <linux/types.h>
 #include <asm/alternative.h>
 #include <asm/cmpxchg.h>
@@ -16,10 +17,7 @@
  * Atomically reads the value of @v.
  * Doesn't imply a read memory barrier.
  */
-static inline long atomic64_read(const atomic64_t *v)
-{
-	return (*(volatile long *)&(v)->counter);
-}
+#define atomic64_read(v)	ACCESS_AT_MOST_ONCE(*(long *)&(v)->counter)

 /**
  * atomic64_set - set atomic64 variable
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 1775d6e..6dcf4b1 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -308,8 +308,10 @@ static inline int test_and_change_bit(int nr, volatile unsigned long *addr)

 static __always_inline int constant_test_bit(unsigned int nr, const volatile unsigned long *addr)
 {
-	return ((1UL << (nr % BITS_PER_LONG)) &
-		(addr[nr / BITS_PER_LONG])) != 0;
+	unsigned long *word = (unsigned long *)addr + (nr / BITS_PER_LONG);
+	unsigned long bit = 1UL << (nr % BITS_PER_LONG);
+
+	return (bit & ACCESS_AT_MOST_ONCE(*word)) != 0;
 }

 static inline int variable_test_bit(int nr, volatile const unsigned long *addr)
diff --git a/include/asm-generic/atomic.h b/include/asm-generic/atomic.h
index e37963c..76ab683 100644
--- a/include/asm-generic/atomic.h
+++ b/include/asm-generic/atomic.h
@@ -39,7 +39,7 @@
  * Atomically reads the value of @v.
  */
 #ifndef atomic_read
-#define atomic_read(v)	(*(volatile int *)&(v)->counter)
+#define atomic_read(v)	ACCESS_AT_MOST_ONCE(*(int *)&(v)->counter)
 #endif

 /**
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 320d6c9..307f342 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -308,4 +308,14 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
  */
 #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

+/*
+ * Like ACCESS_ONCE, but can be optimized away if nothing uses the value,
+ * and/or merged with previous non-ONCE accesses.
+ */
+#define ACCESS_AT_MOST_ONCE(x)			\
+	({ typeof(x) __y;			\
+	   asm("" : "=r" (__y) : "0" (x));	\
+	   __y;					\
+	})
+
 #endif /* __LINUX_COMPILER_H */