diff mbox

[2/5] kbuild: allow archs to select build for link dead code/data elimination

Message ID 12986071.MeSB5hmlsH@wuerfel (mailing list archive)
State Not Applicable
Headers show

Commit Message

Arnd Bergmann Aug. 7, 2016, 8:26 p.m. UTC
On Sunday, August 7, 2016 7:27:39 PM CEST Alan Modra wrote:
> 
> If it can, then Nicholas' patch should be:
> 
>         *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text .text.*)
> 
> If you can't put .text.fixup too far away then you may as well just use
> 
>         *(.text .text.*)

I tried this version:


but that failed to link an allyesconfig kernel because of references
from .fixup to .text.*. Trying your version now:

*(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text .text.*)

	Arnd

Comments

Alan Modra Aug. 7, 2016, 11:49 p.m. UTC | #1
On Sun, Aug 07, 2016 at 10:26:19PM +0200, Arnd Bergmann wrote:
> On Sunday, August 7, 2016 7:27:39 PM CEST Alan Modra wrote:
> > 
> > If it can, then Nicholas' patch should be:
> > 
> >         *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text .text.*)
> > 
> > If you can't put .text.fixup too far away then you may as well just use
> > 
> >         *(.text .text.*)
> 
> I tried this version:
> 
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index b1f8828e9eac..fc210dacac9a 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -438,7 +438,9 @@
>   * during second ld run in second ld pass when generating System.map */
>  #define TEXT_TEXT							\
>  		ALIGN_FUNCTION();					\
> -		*(.text.hot .text .text.fixup .text.unlikely .text.*)	\
> +		*(.text.hot .text.hot.*)				\
> +		*(.text.unlikely .text.fixup .text.unlikely.*)		\
> +		*(.text .text.*)					\
>  		*(.ref.text)						\
>  	MEM_KEEP(init.text)						\
>  	MEM_KEEP(exit.text)						\
> 
> but that failed to link an allyesconfig kernel because of references
> from .fixup to .text.*. Trying your version now:

Well then, that proves you can't put .text.fixup too far aways from
the associated input section.

> *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text .text.*)

Which means this is guaranteed to fail when you test it properly using
gcc's profiling options, in order to generate .text.hot* and/or
.text.unlikely* sections.

It seems to me the right thing to do would be to change kernel asm to
generate .text.foo.fixup for any .text.foo section.  A gas feature
available with binutils-2.26 enabled by --sectname-subst might help
with implementing that.
Arnd Bergmann Aug. 8, 2016, 3:14 p.m. UTC | #2
On Monday, August 8, 2016 9:19:47 AM CEST Alan Modra wrote:
> On Sun, Aug 07, 2016 at 10:26:19PM +0200, Arnd Bergmann wrote:
> > On Sunday, August 7, 2016 7:27:39 PM CEST Alan Modra wrote:
> > > 
> > > If it can, then Nicholas' patch should be:
> > > 
> > >         *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text .text.*)
> > > 
> > > If you can't put .text.fixup too far away then you may as well just use
> > > 
> > >         *(.text .text.*)
> > 
> > I tried this version:
> > 
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index b1f8828e9eac..fc210dacac9a 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -438,7 +438,9 @@
> >   * during second ld run in second ld pass when generating System.map */
> >  #define TEXT_TEXT                                                    \
> >               ALIGN_FUNCTION();                                       \
> > -             *(.text.hot .text .text.fixup .text.unlikely .text.*)   \
> > +             *(.text.hot .text.hot.*)                                \
> > +             *(.text.unlikely .text.fixup .text.unlikely.*)          \
> > +             *(.text .text.*)                                        \
> >               *(.ref.text)                                            \
> >       MEM_KEEP(init.text)                                             \
> >       MEM_KEEP(exit.text)                                             \
> > 
> > but that failed to link an allyesconfig kernel because of references
> > from .fixup to .text.*. Trying your version now:
> 
> Well then, that proves you can't put .text.fixup too far aways from
> the associated input section.
> 
> > *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text .text.*)
> 
> Which means this is guaranteed to fail when you test it properly using
> gcc's profiling options, in order to generate .text.hot* and/or
> .text.unlikely* sections.

I've investigated further and it seems that "*(.text.fixup) *(.text .text.*)"
fails just because we list .text.fixup twice. The .text.fixup section
was originally[1] introduced to work around the same link error that
it is causing now: if we use recursive linking, merging .text and .text.fixup
helps avoid the problems of sections that are >32MB before the final
link.

I have reverted that patch now, so ARM uses ".fixup" again like every
other architecture does, and now "*(.fixup) *(.text .text.*)" works
correctly, while ""*(.fixup) *(.text .fixup .text.*)" also fails
the same way that I saw before:

drivers/scsi/sg.o:(.fixup+0x4): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.fixup+0xc): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.fixup+0x14): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.fixup+0x1c): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.fixup+0x24): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.fixup+0x2c): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.fixup+0x34): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.fixup+0x3c): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.fixup+0x44): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'

I don't understand what led Andi Kleen to also move .text.hot and
.text.unlikely together with .text [2], but this may have
been a related issue.

> It seems to me the right thing to do would be to change kernel asm to
> generate .text.foo.fixup for any .text.foo section.  A gas feature
> available with binutils-2.26 enabled by --sectname-subst might help
> with implementing that.

I think this what Nico wanted to use anyway to eliminate more functions
from the output.

	Arnd


[1] http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=8321
    http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=8322

[2] https://lkml.org/lkml/2015/7/19/377
Alan Modra Aug. 8, 2016, 11:50 p.m. UTC | #3
On Mon, Aug 08, 2016 at 05:14:27PM +0200, Arnd Bergmann wrote:
> I have reverted that patch now, so ARM uses ".fixup" again like every
> other architecture does, and now "*(.fixup) *(.text .text.*)" works
> correctly, while ""*(.fixup) *(.text .fixup .text.*)" also fails
> the same way that I saw before:

That is really odd.  The linker isn't supposed to treat those script
snippets differently.  First match for .fixup wins.

$ cat > fixup1.s <<\EOF
 .global _start
 .text
_start:
 .dc.a .L2
.L1:
 .section ".fixup","ax",%progbits
.L2:
 .dc.a .L1
EOF
$ cat > fixup2.s <<\EOF
 .section ".text.xyz","ax",%progbits
 .dc.a .L2
.L1:

 .section ".fixup","ax",%progbits
.L2:
 .dc.a .L1
EOF
$ cat > fixup.lnk <<\EOF
SECTIONS {
  .text : { *(.fixup) *(.text .fixup .text.*) }
}
EOF
$ as -o fixup1.o fixup1.s 
$ as -o fixup2.o fixup2.s 
$ ld -o fixup -T fixup.lnk -Map fixup.map fixup1.o fixup2.o
$ cat fixup.map

Memory Configuration

Name             Origin             Length             Attributes
*default*        0x0000000000000000 0xffffffffffffffff

Linker script and memory map


.text           0x0000000000000000       0x10
 *(.fixup)
 .fixup         0x0000000000000000        0x4 fixup1.o
 .fixup         0x0000000000000004        0x4 fixup2.o
 *(.text .fixup .text.*)
 .text          0x0000000000000008        0x4 fixup1.o
                0x0000000000000008                _start
 .text          0x000000000000000c        0x0 fixup2.o
 .text.xyz      0x000000000000000c        0x4 fixup2.o
[snip]
Andi Kleen Aug. 9, 2016, 3:16 a.m. UTC | #4
> I don't understand what led Andi Kleen to also move .text.hot and
> .text.unlikely together with .text [2], but this may have
> been a related issue.

The goal was just to move .hot and .unlikely all together, so that
they are clustered and use the minimum amount of cache. On x86 doesn't
matter where they are exactly, as long as each is together.
If they are not explicitely listed then the linker interleaves
them with the normal text, which defeats the purpose.

-Andi
Arnd Bergmann Aug. 9, 2016, 10:10 p.m. UTC | #5
On Tuesday, August 9, 2016 9:20:16 AM CEST Alan Modra wrote:
> On Mon, Aug 08, 2016 at 05:14:27PM +0200, Arnd Bergmann wrote:
> > I have reverted that patch now, so ARM uses ".fixup" again like every
> > other architecture does, and now "*(.fixup) *(.text .text.*)" works
> > correctly, while ""*(.fixup) *(.text .fixup .text.*)" also fails
> > the same way that I saw before:
> 
> That is really odd.  The linker isn't supposed to treat those script
> snippets differently.  First match for .fixup wins.

Sorry for my mistake. I checked again and cannot reproduce what I
thought I saw earlier. "*(.fixup) *(.text .text.*)" fails
as would be expected.

	Arnd
Arnd Bergmann Aug. 9, 2016, 10:29 p.m. UTC | #6
On Monday, August 8, 2016 8:16:05 PM CEST Andi Kleen wrote:
> > I don't understand what led Andi Kleen to also move .text.hot and
> > .text.unlikely together with .text [2], but this may have
> > been a related issue.
> >
> > [2] https://lkml.org/lkml/2015/7/19/377
>
> The goal was just to move .hot and .unlikely all together, so that
> they are clustered and use the minimum amount of cache. On x86 doesn't
> matter where they are exactly, as long as each is together.
> If they are not explicitely listed then the linker interleaves
> them with the normal text, which defeats the purpose.

I still don't see it, my reading of your patch is that you did
the opposite, by changing the description that puts all .text.hot
in front of .text, and all .text.unlikely after exit.text into
one that mixes them with .text. What am I missing here?

	Arnd
Andi Kleen Aug. 9, 2016, 11:08 p.m. UTC | #7
On Wed, Aug 10, 2016 at 12:29:29AM +0200, Arnd Bergmann wrote:
> On Monday, August 8, 2016 8:16:05 PM CEST Andi Kleen wrote:
> > > I don't understand what led Andi Kleen to also move .text.hot and
> > > .text.unlikely together with .text [2], but this may have
> > > been a related issue.
> > >
> > > [2] https://lkml.org/lkml/2015/7/19/377
> >
> > The goal was just to move .hot and .unlikely all together, so that
> > they are clustered and use the minimum amount of cache. On x86 doesn't
> > matter where they are exactly, as long as each is together.
> > If they are not explicitely listed then the linker interleaves
> > them with the normal text, which defeats the purpose.
> 
> I still don't see it, my reading of your patch is that you did
> the opposite, by changing the description that puts all .text.hot
> in front of .text, and all .text.unlikely after exit.text into
> one that mixes them with .text. What am I missing here?

.text.hot is actually not used, the critical part is .text.unlikely
which was not listed and was interleaved before the patch.

-Andi
Andi Kleen Aug. 10, 2016, 12:37 a.m. UTC | #8
On Wed, Aug 10, 2016 at 12:29:29AM +0200, Arnd Bergmann wrote:
> On Monday, August 8, 2016 8:16:05 PM CEST Andi Kleen wrote:
> > > I don't understand what led Andi Kleen to also move .text.hot and
> > > .text.unlikely together with .text [2], but this may have
> > > been a related issue.
> > >
> > > [2] https://lkml.org/lkml/2015/7/19/377
> >
> > The goal was just to move .hot and .unlikely all together, so that
> > they are clustered and use the minimum amount of cache. On x86 doesn't
> > matter where they are exactly, as long as each is together.
> > If they are not explicitely listed then the linker interleaves
> > them with the normal text, which defeats the purpose.
> 
> I still don't see it, my reading of your patch is that you did
> the opposite, by changing the description that puts all .text.hot
> in front of .text, and all .text.unlikely after exit.text into
> one that mixes them with .text. What am I missing here?

No it doesn't mix .unlikely with .text, .unlikely is all in one place.

-Andi
diff mbox

Patch

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index b1f8828e9eac..fc210dacac9a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -438,7 +438,9 @@ 
  * during second ld run in second ld pass when generating System.map */
 #define TEXT_TEXT							\
 		ALIGN_FUNCTION();					\
-		*(.text.hot .text .text.fixup .text.unlikely .text.*)	\
+		*(.text.hot .text.hot.*)				\
+		*(.text.unlikely .text.fixup .text.unlikely.*)		\
+		*(.text .text.*)					\
 		*(.ref.text)						\
 	MEM_KEEP(init.text)						\
 	MEM_KEEP(exit.text)						\