From patchwork Sun Aug 7 05:33:45 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Pitre X-Patchwork-Id: 656444 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3s6TlN2mQ8z9s5M for ; Sun, 7 Aug 2016 15:35:12 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b=AwsvgDxY; dkim-atps=neutral Received: from ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3s6TlN1jKTzDqhR for ; Sun, 7 Aug 2016 15:35:12 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b=AwsvgDxY; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from mail-qt0-x233.google.com (mail-qt0-x233.google.com [IPv6:2607:f8b0:400d:c0d::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3s6Tjq3zMxzDqYj for ; Sun, 7 Aug 2016 15:33:50 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b=AwsvgDxY; dkim-atps=neutral Received: by mail-qt0-x233.google.com with SMTP id w38so191439627qtb.0 for ; Sat, 06 Aug 2016 22:33:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=RKTH20hfdz8v+TqJy8Pghw50/ZQk0AluQEjyqh6ji8E=; 
b=AwsvgDxYTGUDea3zeZxSilgYTN2TpmuObbbq4qwIbHwZggl3r3n+qWs6f/uxj6Rpi6 +uS81zWZPShz4QzCFSX9NUUIKkmuvTaeKfl20uSjGLYGr5zxCz5OsUaUUnXy06aFxPOY dpHQ1ERgI430RV/tEVBgGAfhxXO86YKOwNsbQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version; bh=RKTH20hfdz8v+TqJy8Pghw50/ZQk0AluQEjyqh6ji8E=; b=j7CbNcD6qfOFNMnaodFQsX66eOZ6mBf73tEQADRg5pf9NqnEgHi9qWxfg5kMSSI5FK K6ppyBu+4I30OD07dIiaS0CLxxq8jRzX273oUdSvlaLH9WbQw9PdaaybAquFZnMLwKcI PFcERDcXuMIzjNCmEyVn2DQVV/5iDo7i4BcqX75B8j2wG1k2fqVKjH/fpQhXYwngcEWR 3miLMlxZtxHzf87gTJjfo60RLfDjFyReNqK+cmVBbNX8UKnpD5X1B14IWZD56iKTMqnk mSN3kCUYEqNi0lGIx9geK7bf5gkvGs1TpqxFL8D/WvynmR1nOVT3LhXDrwkvMeHUyosa JNmw== X-Gm-Message-State: AEkoouvTiD6ka36GESIHtSsXjvAnsOP0NNQ5LpfoQbU3leB0U45k2XO9l9hPwBMiqqSpCTus X-Received: by 10.200.36.174 with SMTP id s43mr21573094qts.52.1470548028231; Sat, 06 Aug 2016 22:33:48 -0700 (PDT) Received: from xanadu.home ([2607:fa48:6e39:d410:feaa:14ff:fea7:ed77]) by smtp.gmail.com with ESMTPSA id p188sm14107527qkf.6.2016.08.06.22.33.46 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 06 Aug 2016 22:33:47 -0700 (PDT) Date: Sun, 7 Aug 2016 01:33:45 -0400 (EDT) From: Nicolas Pitre To: Nicholas Piggin Subject: Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination In-Reply-To: <1470399123-8455-3-git-send-email-npiggin@gmail.com> Message-ID: References: <1470399123-8455-1-git-send-email-npiggin@gmail.com> <1470399123-8455-3-git-send-email-npiggin@gmail.com> User-Agent: Alpine 2.20 (LFD 67 2015-01-07) MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, Stephen Rothwell , Arnd Bergmann , linux-kbuild@vger.kernel.org, Alan Modra , 
 linuxppc-dev@lists.ozlabs.org
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

On Fri, 5 Aug 2016, Nicholas Piggin wrote:

> Introduce LINKER_DCE option for architectures to select if they want
> to build with -ffunction-sections, -fdata-sections, and link with
> --gc-sections. It requires some work (documented) to ensure all
> unreferenced entrypoints are live, and requires toolchain and
> build verification, so it is made a per-arch option for now.
>
> On a random powerpc64le build, this yields a significant size saving,
> it boots and runs fine, but there is a lot I haven't tested as yet,
> so these savings may be reduced if there are bugs in the link.
>
>     text    data     bss      dec filename
> 11169741 1180744 1923176 14273661 vmlinux
> 10445269 1004127 1919707 13369103 vmlinux.dce
>
> ~700K text, ~170K data, 6% removed from kernel image size.
>
> Signed-off-by: Nicholas Piggin

I played with that too. However, this needs distinct sections for
exception tables and the like; otherwise the backward references from
the final exception table to the functions responsible for those
exception entries have the effect of pulling in all those functions
even if their entry point is never referenced, making --gc-sections
less effective. I managed to fix this only with a change to gas
(accepted upstream).

But once that is solved, you then have the missing forward reference
problem, i.e. nothing actually references those individual exception
entry sections and ld happily drops them all. Having a KEEP() on each
of them is unworkable and defeats the purpose anyway. That requires a
dummy reloc to trick ld into pulling in those sections when the parent
section is also pulled in.

Please see attached a subset of the slides I presented at ELC and
Linaro Connect last year to illustrate those issues. Also attached is
a sample patch partially implementing those changes.

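The backward/forward reference problem described above can be illustrated with a toy reachability model in C. This is only a sketch under stated assumptions: it is not how ld implements --gc-sections, and every name in it (the section labels, `link_model`, the three layouts) is hypothetical and chosen for illustration. A section is kept if it is a root or if a kept section holds a relocation against it.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical section labels, for illustration only. */
enum sec { ENTRY, FN_USED, FN_UNUSED, TAB_SHARED, TAB_USED, TAB_UNUSED, NSEC };

/* The three layouts discussed in the mail. */
enum layout { SHARED_TABLE, SPLIT_TABLES, SPLIT_WITH_DUMMY_RELOC };

struct model {
	bool root[NSEC];      /* kept unconditionally (entry point, KEEP()) */
	bool ref[NSEC][NSEC]; /* ref[a][b]: section a holds a reloc against b */
	bool kept[NSEC];      /* result of the mark phase */
};

/* Mark s and everything reachable from it through relocations. */
void mark(struct model *m, int s)
{
	if (m->kept[s])
		return;
	m->kept[s] = true;
	for (int t = 0; t < NSEC; t++)
		if (m->ref[s][t])
			mark(m, t);
}

/* Build the reference graph for one layout and run the mark phase. */
struct model link_model(enum layout l)
{
	struct model m;

	memset(&m, 0, sizeof(m));
	m.root[ENTRY] = true;
	m.ref[ENTRY][FN_USED] = true;	/* only FN_USED is ever called */

	if (l == SHARED_TABLE) {
		/* One table, kept by the linker script, pointing back
		 * at every function that contributed an entry. */
		m.root[TAB_SHARED] = true;
		m.ref[TAB_SHARED][FN_USED] = true;
		m.ref[TAB_SHARED][FN_UNUSED] = true;
	} else {
		/* One table section per function: backward refs only. */
		m.ref[TAB_USED][FN_USED] = true;
		m.ref[TAB_UNUSED][FN_UNUSED] = true;
	}
	if (l == SPLIT_WITH_DUMMY_RELOC) {
		/* Dummy reloc: each function references its own table. */
		m.ref[FN_USED][TAB_USED] = true;
		m.ref[FN_UNUSED][TAB_UNUSED] = true;
	}

	for (int s = 0; s < NSEC; s++)
		if (m.root[s])
			mark(&m, s);
	return m;
}
```

In this model, `link_model(SHARED_TABLE)` keeps `FN_UNUSED` because the shared table references it, `link_model(SPLIT_TABLES)` drops `FN_UNUSED` but also drops `TAB_USED`, and `link_model(SPLIT_WITH_DUMMY_RELOC)` keeps `TAB_USED` exactly because `FN_USED` is kept.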
In short, I'm very glad to see that this might steer interest across
multiple architectures. I felt like this was becoming much more
intrusive than I expected and that maybe LTO was a better bet after
all. But LTO has its evils too, and I'm willing to look at gc-sections
again if there is interest from others as well.

Nicolas

commit 1d7ec46257dc546bc7b87439788514fc4650a2b1
Author: Nicolas Pitre
Date:   Mon Oct 26 10:16:14 2015 -0400

    ARM: pushlinkedsection introduction

    Signed-off-by: Nicolas Pitre

diff --git a/Makefile b/Makefile
index d5b3739119..75541414cb 100644
--- a/Makefile
+++ b/Makefile
@@ -775,6 +775,10 @@ ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC)), y)
 	KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
 endif
+# Named subsections
+KBUILD_AFLAGS += -Wa,--sectname-subst
+KBUILD_CFLAGS += -Wa,--sectname-subst
+
 include scripts/Makefile.kasan
 include scripts/Makefile.extrawarn
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index b2bc8e1147..70161c9bfa 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -88,6 +88,17 @@
 #endif
 /*
+ * Special .pushsection wrapper with explicit dependency to prevent
+ * garbage collection of the specified section. This is needed when no
+ * explicit symbol references are made to this section.
+ */
+	.macro	.pushlinkedsection name:vararg
+	.reloc	. - 1, R_ARM_NONE, 9909f
+	.pushsection \name
+9909:
+	.endm
+
+/*
  * Enable and disable interrupts
  */
 #if __LINUX_ARM_ARCH__ >= 6
@@ -239,7 +250,7 @@
 #define USER(x...)	\
 9999:	x;	\
-	.pushsection __ex_table,"a";	\
+	.pushlinkedsection __ex_table.%S,"a";	\
 	.align	3;	\
 	.long	9999b,9001f;	\
 	.popsection
@@ -253,7 +264,7 @@
  * ALT_SMP( W(instr) ... )
  */
 #define ALT_UP(instr...)	\
-	.pushsection ".alt.smp.init", "a"	;\
+	.pushlinkedsection ".alt.smp.init.%S", "a"	;\
 	.long	9998b	;\
 9997:	instr	;\
 	.if . - 9997b == 2	;\
@@ -265,7 +276,7 @@
 	.popsection
 #define ALT_UP_B(label)	\
 	.equ	up_b_offset, label - 9998b	;\
-	.pushsection ".alt.smp.init", "a"	;\
+	.pushlinkedsection ".alt.smp.init.%S", "a"	;\
 	.long	9998b	;\
 	W(b)	. + up_b_offset	;\
 	.popsection
@@ -375,7 +386,7 @@ THUMB( orr \reg , \reg , #PSR_T_BIT )
 	.error	"Unsupported inc macro argument"
 	.endif
-	.pushsection __ex_table,"a"
+	.pushlinkedsection __ex_table.%S,"a"
 	.align	3
 	.long	9999b, \abort
 	.popsection
@@ -416,7 +427,7 @@ THUMB( orr \reg , \reg , #PSR_T_BIT )
 	.error	"Unsupported inc macro argument"
 	.endif
-	.pushsection __ex_table,"a"
+	.pushlinkedsection __ex_table.%S,"a"
 	.align	3
 	.long	9999b, \abort
 	.popsection
diff --git a/arch/arm/include/asm/bug.h b/arch/arm/include/asm/bug.h
index e7335a9214..0cbb6ef4b5 100644
--- a/arch/arm/include/asm/bug.h
+++ b/arch/arm/include/asm/bug.h
@@ -3,6 +3,7 @@
 #include
 #include
+#include
 #include
 #ifdef CONFIG_BUG
@@ -39,9 +40,9 @@ do { \
 		".pushsection .rodata.str, \"aMS\", %progbits, 1\n" \
 		"2:\t.asciz " #__file "\n" \
 		".popsection\n" \
-		".pushsection __bug_table,\"a\"\n" \
+		__pushlinkedsection("__bug_table.%S,\"a\"") "\n"\
 		".align 2\n" \
-		"3:\t.word 1b, 2b\n" \
+		"\t.word 1b, 2b\n" \
 		"\t.hword " #__line ", 0\n" \
 		".popsection"); \
 	unreachable(); \
diff --git a/arch/arm/include/asm/compiler.h b/arch/arm/include/asm/compiler.h
index 29fe85e594..3bfdd749a3 100644
--- a/arch/arm/include/asm/compiler.h
+++ b/arch/arm/include/asm/compiler.h
@@ -24,5 +24,14 @@
 	".endif; " \
 	".endif\n\t"
+/*
+ * Special .pushsection wrapper with explicit dependency to prevent
+ * garbage collection of the specified section. This is needed when no
+ * explicit symbol references are made to this section.
+ */
+#define __pushlinkedsection(name) \
+	".reloc . - 1, R_ARM_NONE, 9909f\n" \
+	"\t.pushsection " name "\n" \
+	"9909: "
 #endif /* __ASM_ARM_COMPILER_H */
diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index 6795368ad0..3540e42084 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -5,11 +5,12 @@
 #include
 #include
+#include
 #include
 #define __futex_atomic_ex_table(err_reg) \
 	"3:\n" \
-	" .pushsection __ex_table,\"a\"\n" \
+	" " __pushlinkedsection("__ex_table.%S,\"a\"") "\n" \
 	" .align 3\n" \
 	" .long 1b, 4f, 2b, 4f\n" \
 	" .popsection\n" \
diff --git a/arch/arm/include/asm/jump_label.h b/arch/arm/include/asm/jump_label.h
index 34f7b6980d..54e2a5ec11 100644
--- a/arch/arm/include/asm/jump_label.h
+++ b/arch/arm/include/asm/jump_label.h
@@ -4,6 +4,7 @@
 #ifndef __ASSEMBLY__
 #include
+#include
 #include
 #define JUMP_LABEL_NOP_SIZE 4
@@ -12,7 +13,7 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
 {
 	asm_volatile_goto("1:\n\t"
 		 WASM(nop) "\n\t"
-		 ".pushsection __jump_table, \"aw\"\n\t"
+		 __pushlinkedsection("__jump_table.%S, \"aw\"") "\n\t"
 		 ".word 1b, %l[l_yes], %c0\n\t"
 		 ".popsection\n\t"
 		 : : "i" (&((char *)key)[branch]) : : l_yes);
@@ -26,7 +27,7 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
 {
 	asm_volatile_goto("1:\n\t"
 		 WASM(b) " %l[l_yes]\n\t"
-		 ".pushsection __jump_table, \"aw\"\n\t"
+		 __pushlinkedsection("__jump_table.%S, \"aw\"") "\n\t"
 		 ".word 1b, %l[l_yes], %c0\n\t"
 		 ".popsection\n\t"
 		 : : "i" (&((char *)key)[branch]) : : l_yes);
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 98d58bb04a..d5cc34e9a7 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -18,6 +18,8 @@
 #include
 #include
+#include
+
 #ifdef CONFIG_NEED_MACH_MEMORY_H
 #include
 #endif
@@ -172,7 +174,7 @@ extern const void *__pv_table_begin, *__pv_table_end;
 #define __pv_stub(from,to,instr,type) \
 	__asm__("@ __pv_stub\n" \
 	"1: " instr " %0, %1, %2\n" \
-	" .pushsection .pv_table,\"a\"\n" \
+	" " __pushlinkedsection(".pv_table.%S,\"a\"") "\n" \
 	" .long 1b\n" \
 	" .popsection\n" \
 	: "=r" (to) \
@@ -181,7 +183,7 @@ extern const void *__pv_table_begin, *__pv_table_end;
 #define __pv_stub_mov_hi(t) \
 	__asm__ volatile("@ __pv_stub_mov\n" \
 	"1: mov %R0, %1\n" \
-	" .pushsection .pv_table,\"a\"\n" \
+	" " __pushlinkedsection(".pv_table.%S,\"a\"") "\n" \
 	" .long 1b\n" \
 	" .popsection\n" \
 	: "=r" (t) \
@@ -191,7 +193,7 @@ extern const void *__pv_table_begin, *__pv_table_end;
 	__asm__ volatile("@ __pv_add_carry_stub\n" \
 	"1: adds %Q0, %1, %2\n" \
 	" adc %R0, %R0, #0\n" \
-	" .pushsection .pv_table,\"a\"\n" \
+	" " __pushlinkedsection(".pv_table.%S,\"a\"") "\n" \
 	" .long 1b\n" \
 	" .popsection\n" \
 	: "+r" (y) \
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index 8a1e8e995d..8c535eacea 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -19,6 +19,7 @@
 #ifdef __KERNEL__
+#include
 #include
 #include
 #include
@@ -93,7 +94,7 @@ unsigned long get_wchan(struct task_struct *p);
 #ifdef CONFIG_SMP
 #define __ALT_SMP_ASM(smp, up) \
 	"9998: " smp "\n" \
-	" .pushsection \".alt.smp.init\", \"a\"\n" \
+	" " __pushlinkedsection("\".alt.smp.init.%S\", \"a\"") "\n" \
 	" .long 9998b\n" \
 	" " up "\n" \
 	" .popsection\n"
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 8cc85a4ebe..5e7e404894 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -357,7 +357,7 @@ do { \
 	" mov %1, #0\n" \
 	" b 2b\n" \
 	" .popsection\n" \
-	" .pushsection __ex_table,\"a\"\n" \
+	" " __pushlinkedsection("__ex_table.%S,\"a\"") "\n" \
 	" .align 3\n" \
 	" .long 1b, 3b\n" \
 	" .popsection" \
@@ -429,7 +429,7 @@ do { \
 	"3: mov %0, %3\n" \
 	" b 2b\n" \
 	" .popsection\n" \
-	" .pushsection __ex_table,\"a\"\n" \
+	" " __pushlinkedsection("__ex_table.%S,\"a\"") "\n" \
 	" .align 3\n" \
 	" .long 1b, 3b\n" \
 	" .popsection" \
@@ -479,7 +479,7 @@ do { \
 	"4: mov %0, %3\n" \
 	" b 3b\n" \
 	" .popsection\n" \
-	" .pushsection __ex_table,\"a\"\n" \
+	" " __pushlinkedsection("__ex_table.%S,\"a\"") "\n" \
 	" .align 3\n" \
 	" .long 1b, 4b\n" \
 	" .long 2b, 4b\n" \
diff --git a/arch/arm/include/asm/word-at-a-time.h b/arch/arm/include/asm/word-at-a-time.h
index 5831dce4b5..348a462d3e 100644
--- a/arch/arm/include/asm/word-at-a-time.h
+++ b/arch/arm/include/asm/word-at-a-time.h
@@ -8,6 +8,7 @@
  * Heavily based on the x86 algorithm.
  */
 #include
+#include
 struct word_at_a_time {
 	const unsigned long one_bits, high_bits;
@@ -84,7 +85,7 @@ static inline unsigned long load_unaligned_zeropad(const void *addr)
 #endif
 	" b 2b\n"
 	" .popsection\n"
-	" .pushsection __ex_table,\"a\"\n"
+	" " __pushlinkedsection("__ex_table.%S,\"a\"") "\n"
 	" .align 3\n"
 	" .long 1b, 3b\n"
 	" .popsection"
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 3e1c26eb32..5047757c34 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -564,7 +564,7 @@ ENDPROC(__und_usr)
 4:	str	r4, [sp, #S_PC]	@ retry current instruction
 	ret	r9
 	.popsection
-	.pushsection __ex_table,"a"
+	.pushlinkedsection __ex_table.%S,"a"
 	.long	1b, 4b
 #if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && CONFIG_CPU_V7
 	.long	2b, 4b
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 8b60fde5ce..6885382931 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -36,7 +36,7 @@
 #define ARM_CPU_KEEP(x)
 #endif
-#if (defined(CONFIG_SMP_ON_UP) && !defined(CONFIG_DEBUG_SPINLOCK)) || \
+#if 0 // (defined(CONFIG_SMP_ON_UP) && !defined(CONFIG_DEBUG_SPINLOCK)) || \
 	defined(CONFIG_GENERIC_BUG)
 #define ARM_EXIT_KEEP(x) x
 #define ARM_EXIT_DISCARD(x)
diff --git a/arch/arm/lib/backtrace.S b/arch/arm/lib/backtrace.S
index fab5a50503..238c7de114 100644
--- a/arch/arm/lib/backtrace.S
+++ b/arch/arm/lib/backtrace.S
@@ -104,7 +104,7 @@ for_each_frame: tst frame, mask @ Check for address exceptions
 no_frame:	ldmfd	sp!, {r4 - r8, pc}
 ENDPROC(c_backtrace)
-	.pushsection __ex_table,"a"
+	.pushlinkedsection __ex_table.%S,"a"
 	.align	3
 	.long	1001b, 1006b
 	.long	1002b, 1006b
diff --git a/arch/arm/lib/getuser.S b/arch/arm/lib/getuser.S
index 8ecfd15c3a..e2c6a5649f 100644
--- a/arch/arm/lib/getuser.S
+++ b/arch/arm/lib/getuser.S
@@ -132,7 +132,7 @@ __get_user_bad:
 ENDPROC(__get_user_bad)
 ENDPROC(__get_user_bad8)
-.pushsection __ex_table, "a"
+.pushlinkedsection __ex_table.%S, "a"
 	.long	1b, __get_user_bad
 	.long	2b, __get_user_bad
 	.long	3b, __get_user_bad
diff --git a/arch/arm/lib/putuser.S b/arch/arm/lib/putuser.S
index 38d660d370..b52f4a264e 100644
--- a/arch/arm/lib/putuser.S
+++ b/arch/arm/lib/putuser.S
@@ -88,7 +88,7 @@ __put_user_bad:
 	ret	lr
 ENDPROC(__put_user_bad)
-.pushsection __ex_table, "a"
+.pushlinkedsection __ex_table.%S, "a"
 	.long	1b, __put_user_bad
 	.long	2b, __put_user_bad
 	.long	3b, __put_user_bad
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 00b7f7de28..a2e6f47edb 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -22,6 +22,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -206,7 +207,7 @@ union offset_union {
 	"3: mov %0, #1\n" \
 	" b 2b\n" \
 	" .popsection\n" \
-	" .pushsection __ex_table,\"a\"\n" \
+	" " __pushlinkedsection("__ex_table.%S,\"a\"") "\n" \
 	" .align 3\n" \
 	" .long 1b, 3b\n" \
 	" .popsection\n" \
@@ -266,7 +267,7 @@ union offset_union {
 	"4: mov %0, #1\n" \
 	" b 3b\n" \
 	" .popsection\n" \
-	" .pushsection __ex_table,\"a\"\n" \
+	" " __pushlinkedsection("__ex_table.%S,\"a\"") "\n" \
 	" .align 3\n" \
 	" .long 1b, 4b\n" \
 	" .long 2b, 4b\n" \
@@ -306,7 +307,7 @@ union offset_union {
 	"6: mov %0, #1\n" \
 	" b 5b\n" \
 	" .popsection\n" \
-	" .pushsection __ex_table,\"a\"\n" \
+	" " __pushlinkedsection("__ex_table.%S,\"a\"") "\n" \
 	" .align 3\n" \
 	" .long 1b, 6b\n" \
 	" .long 2b, 6b\n" \
diff --git a/arch/arm/nwfpe/entry.S b/arch/arm/nwfpe/entry.S
index 39c20afad7..8f566c87c2 100644
--- a/arch/arm/nwfpe/entry.S
+++ b/arch/arm/nwfpe/entry.S
@@ -119,7 +119,7 @@ next:
 .Lfix:	ret	r9	@ let the user eat segfaults
 	.popsection
-	.pushsection __ex_table,"a"
+	.pushlinkedsection __ex_table.%S,"a"
 	.align	3
 	.long	.Lx1, .Lfix
 	.popsection