
powerpc/64s: Add load address to plt branch targets before moved to linked location for non-relocatable kernels

Message ID 20210421021721.1539289-1-jniethe5@gmail.com (mailing list archive)
State Superseded

Checks

Context                         Check    Description
snowpatch_ozlabs/apply_patch    success  Successfully applied on branch powerpc/merge (40f5c8e99b3f2f53db08055f415af2aac416360e)
snowpatch_ozlabs/build-ppc64le  success  Build succeeded
snowpatch_ozlabs/build-ppc64be  success  Build succeeded
snowpatch_ozlabs/build-ppc64e   success  Build succeeded
snowpatch_ozlabs/build-pmac32   success  Build succeeded
snowpatch_ozlabs/checkpatch     warning  total: 0 errors, 1 warnings, 0 checks, 103 lines checked
snowpatch_ozlabs/needsstable    success  Patch has no Fixes tags

Commit Message

Jordan Niethe April 21, 2021, 2:17 a.m. UTC
Large branches will go through the plt which includes a stub that loads
a target address from the .branch_lt section. On a relocatable kernel the
targets in .branch_lt have relocations so they will be fixed up for
where the kernel is running by relocate().

A non-relocatable kernel has no relocations. However, until the kernel
is moved down to its linked address, it is expected to be able to run
wherever it is loaded. On pseries machines, prom_init() is called
before the kernel is running at the linked address.

Certain configs, such as STRICT_KERNEL_RWX, result in a large kernel
(because of the larger data shift):

config DATA_SHIFT
	int "Data shift" if DATA_SHIFT_BOOL
	default 24 if STRICT_KERNEL_RWX && PPC64

These large kernels lead to prom_init()'s final call to __start()
generating a plt branch:

bl      c000000002000018 <00000078.plt_branch.__start>

This results in the kernel jumping to the linked address of __start,
0xc000000000000000, when it really needs to jump to
0xc000000000000000 + the runtime address, because the kernel is still
running at the load address.
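
As a rough illustration of the address arithmetic described above, the
following C sketch models a single .branch_lt entry. The helper name
plt_target_at_load() and its parameter names are hypothetical and not part
of the patch; the load address 0x400000 used in the note below is the one
shown in the qemu log in this message.

```c
#include <stdint.h>

/* Linked address of __start, as stated in the commit message. */
#define LINKED_START 0xc000000000000000ULL

/*
 * Hypothetical helper (not in the patch): the value a .branch_lt entry
 * must hold while the kernel still runs at its load address, i.e. the
 * linked target plus the runtime (load) address.
 */
static uint64_t plt_target_at_load(uint64_t linked_target, uint64_t load_addr)
{
	return linked_target + load_addr;
}
```

With the qemu load address, plt_target_at_load(LINKED_START, 0x400000)
gives 0xc000000000400000, whereas the unadjusted entry still holds
0xc000000000000000.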

The first 256 bytes have already been copied to address 0, so the kernel
will run until it reaches

b	__start_initialization_multiplatform

Because there is nothing yet at __start_initialization_multiplatform,
this will inevitably crash. At this point the exception handlers are
still those of Open Firmware (OF).

On phyp this will look like:

OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 5.12.0-rc3-63029-gada7d7e600c0 (gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1), GNU ld version 2.30-93.el8) #1 SMP Wed Apr 7 07:24:20 EDT 2021
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=/vmlinuz-5.12.0-rc3-63029-gada7d7e600c0
Max number of cores passed to firmware: 256 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 000000000edc0000
  alloc_top    : 0000000020000000
  alloc_top_hi : 0000000020000000
  rmo_top      : 0000000020000000
  ram_top      : 0000000020000000
instantiating rtas at 0x000000001ec30000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x000000000edd0000 -> 0x000000000edd1809
Device tree struct  0x000000000ede0000 -> 0x000000000edf0000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x000000000a710000 ...
DEFAULT CATCH!, exception-handler=fffffffffffffff6
at   %SRR0: 0000000000000f20   %SRR1: 8000000000081000
Open Firmware exception handler entered from non-OF code
Client's Fix Pt Regs:
 00 000000000c713134 0000000008a9fc00 000000000caf9c00 000000000edc0000
 04 000000000a710000 0000000000000000 0000000000000000 0000000000000000
 08 0000000000000000 0000000000000000 000000000a7200fc 0000000000003003
 0c c000000000000000 0000000000000000 0000000000000000 000000000b5a9820
 10 000000000b5a9b38 000000000b5a9988 000000000b5a9f38 000000000b660c10
 14 000000000b5a9f60 00000000013d0000 000000001ec30000 000000001ec30000
 18 000000000b5a9840 000000000a710000 0000000000000028 000000000edc0008
 1c 000000000edc0000 000000000cb60000 0000000000000000 000000000edc0000
Special Regs:
    %IV: 00000700     %CR: 44000202    %XER: 00000000  %DSISR: 00000000
  %SRR0: 0000000000000f20   %SRR1: 8000000000081000
    %LR: 000000000c71326c    %CTR: c000000000000000
   %DAR: 0000000000000000
Virtual PID = 0
DEFAULT CATCH!, throw-code=fffffffffffffff6
Call History
------------
throw  - c3f05c
$call-method  - c4f0b4
(poplocals)  - c40a00
key-fillq  - c4f4cc
?xoff  - c4f5b4
(poplocals)  - c40a00
(stdout-write)  - c4fa64
(emit)  - c4fb3c
space  - c4dfc8
quit  - c5336c
quit  - c53100
My Fix Pt Regs:
 00 800000000000b002 0000000000000000 00000000deadbeef 0000000000c4f0b0
 04 0000000008bfff80 00000000deadbeef 0000000000000004 0000000000c09010
 08 0000000000000005 0000000000000000 0000000000000000 0000000000000000
 0c 80000000072a40a8 0000000000000000 0000000000000000 0000000008d2cf30
 10 0000000000e7d968 0000000000e7d968 0000000000c4f0a8 0000000000c4f0b4
 14 fffffffffffffff6 0000000008bfff80 c8ff21fbd0ff41fb f8ffe1fbb1fd21f8
 18 0000000000c19000 0000000000c3e000 0000000000c1af80 0000000000c1cfc0
 1c 0000000000c26000 0000000000c460f0 0000000000c17fa8 0000000000c16fe0
Special Regs:
    %IV: 00000900     %CR: 84800208    %XER: 00040010  %DSISR: 00000000
  %SRR0: 0000000000c3eec8   %SRR1: 800000000000b002
    %LR: 0000000000c3f05c    %CTR: 0000000000c4f0b0
   %DAR: 0000000000000000
...

On qemu it will just appear to be stuck after
Booting Linux via __start() @ 0x0000000000400000 ...:

SLOF **********************************************************************
QEMU Starting
 Build Date = Apr  9 2021 14:13:31
 FW Version = git-33a7322de13e9dca
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /vdevice/l-lan@71000002
Populating /vdevice/v-scsi@71000003
       SCSI: Looking for devices
          8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
Populating /pci@800000020000000
Scanning USB
Using default console: /vdevice/vty@71000000
Detected RAM kernel at 400000 (25baa08 bytes)

  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 5.12.0-rc3-00128-g87a8d2180282 (powerpc64le-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #85 SMP Sun Apr 18 19:30:55 AEST 2021
Detected machine type: 0000000000000101
command line: nokaslr
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000003df0000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000080000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000080000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000003e00000 -> 0x0000000003e00ab2
Device tree struct  0x0000000003e10000 -> 0x0000000003e20000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000000400000 ...

To fix this, do some "relocation" of the plt target addresses on
non-relocatable kernels before running at the linked address. Before
calling prom_init(), add the runtime address to all the targets in
.branch_lt with relocate_plt(). Have relocate_plt() save the offset it
added in p_branch_lt_off. After prom_init() calls __start(), subtract the
offset saved in p_branch_lt_off to return the targets to their original
addresses.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/sections.h |  2 +
 arch/powerpc/kernel/head_64.S       | 66 +++++++++++++++++++++++++++++
 arch/powerpc/kernel/vmlinux.lds.S   |  2 +
 3 files changed, 70 insertions(+)
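
The relocate/unrelocate scheme described in the commit message can be
sketched in C roughly as follows. This is only a model under stated
assumptions, not the patch itself (which is written in assembly): the names
branch_lt_off, adjust_targets(), relocate_plt_model() and
unrelocate_plt_model() are hypothetical, and the table is passed in as a
plain array rather than located through linker symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for p_branch_lt_off: the offset that was added, 0 if none. */
static uint64_t branch_lt_off;

/* Add delta to every 8-byte target; unsigned wraparound allows subtraction. */
static void adjust_targets(uint64_t *start, uint64_t *end, uint64_t delta)
{
	for (uint64_t *p = start; p < end; p++)
		*p += delta;
}

/* Model of relocate_plt: offset all targets and remember the offset. */
static void relocate_plt_model(uint64_t *start, uint64_t *end,
			       uint64_t runtime_addr)
{
	adjust_targets(start, end, runtime_addr);
	branch_lt_off = runtime_addr;
}

/* Model of unrelocate_plt: undo the saved offset, if there is one. */
static void unrelocate_plt_model(uint64_t *start, uint64_t *end)
{
	if (branch_lt_off == 0)
		return;
	adjust_targets(start, end, 0 - branch_lt_off);
}
```

As in the assembly, the saved offset is not cleared after it has been
subtracted, so this model assumes the unrelocate step runs only once.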

Comments

Christophe Leroy April 21, 2021, 9:01 a.m. UTC | #1
On 21/04/2021 at 04:17, Jordan Niethe wrote:
> Large branches will go through the plt which includes a stub that loads
> a target address from the .branch_lt section. On a relocatable kernel the
> targets in .branch_lt have relocations so they will be fixed up for
> where the kernel is running by relocate().
> 
> A non-relocatable kernel has no relocations. However, until the kernel
> is moved down to its linked address, it is expected to be able to run
> wherever it is loaded. On pseries machines, prom_init() is called
> before the kernel is running at the linked address.
> 
> Certain configs, such as STRICT_KERNEL_RWX, result in a large kernel
> (because of the larger data shift):

The same problem occurs on 32s; see the discussion at https://bugzilla.kernel.org/show_bug.cgi?id=208181#c14


Christophe Leroy April 21, 2021, 11:59 a.m. UTC | #2
On 21/04/2021 at 04:17, Jordan Niethe wrote:
> Large branches will go through the plt which includes a stub that loads
> a target address from the .branch_lt section. On a relocatable kernel the
> targets in .branch_lt have relocations so they will be fixed up for
> where the kernel is running by relocate().
> 
> A non-relocatable kernel has no relocations. However, until the kernel
> is moved down to its linked address, it is expected to be able to run
> wherever it is loaded. On pseries machines, prom_init() is called
> before the kernel is running at the linked address.
> 
> Certain configs, such as STRICT_KERNEL_RWX, result in a large kernel
> (because of the larger data shift):
> 
> config DATA_SHIFT
> 	int "Data shift" if DATA_SHIFT_BOOL
> 	default 24 if STRICT_KERNEL_RWX && PPC64
> 
> These large kernels lead to prom_init()'s final call to __start()
> generating a plt branch:
> 
> bl      c000000002000018 <00000078.plt_branch.__start>
> 
> This results in the kernel jumping to the linked address of __start,
> 0xc000000000000000, when it really needs to jump to
> 0xc000000000000000 + the runtime address, because the kernel is still
> running at the load address.


On ppc32 it seems to be different. I can't find plt_branch or lt_branch or anything similar.

It looks like the stubs are placed at the end of the .head section, and just after prom_init:

c0003858 <setup_disp_bat>:
c0003858:	7d 08 02 a6 	mflr    r8
c000385c:	48 00 d5 35 	bl      c0010d90 <reloc_offset>
c0003860:	7d 08 03 a6 	mtlr    r8
c0003864:	3d 03 c2 04 	addis   r8,r3,-15868
c0003868:	39 08 1d 08 	addi    r8,r8,7432
c000386c:	2c 08 00 00 	cmpwi   r8,0
c0003870:	4d 82 00 20 	beqlr
c0003874:	81 68 00 00 	lwz     r11,0(r8)
c0003878:	81 08 00 04 	lwz     r8,4(r8)
c000387c:	7d 1f 83 a6 	mtdbatl 3,r8
c0003880:	7d 7e 83 a6 	mtdbatu 3,r11
c0003884:	4e 80 00 20 	blr
c0003888:	3d 80 c2 00 	lis     r12,-15872
c000388c:	39 8c 16 dc 	addi    r12,r12,5852
c0003890:	7d 89 03 a6 	mtctr   r12
c0003894:	4e 80 04 20 	bctr
c0003898:	3d 80 c2 01 	lis     r12,-15871
c000389c:	39 8c e1 38 	addi    r12,r12,-7880
c00038a0:	7d 89 03 a6 	mtctr   r12
c00038a4:	4e 80 04 20 	bctr
c00038a8:	3d 80 c2 00 	lis     r12,-15872
c00038ac:	39 8c 74 d0 	addi    r12,r12,29904
c00038b0:	7d 89 03 a6 	mtctr   r12
c00038b4:	4e 80 04 20 	bctr
c00038b8:	3d 80 c2 00 	lis     r12,-15872
c00038bc:	39 8c 73 38 	addi    r12,r12,29496
c00038c0:	7d 89 03 a6 	mtctr   r12
c00038c4:	4e 80 04 20 	bctr
c00038c8:	3d 80 c2 01 	lis     r12,-15871
c00038cc:	39 8c 83 6c 	addi    r12,r12,-31892
c00038d0:	7d 89 03 a6 	mtctr   r12
c00038d4:	4e 80 04 20 	bctr
c00038d8:	3d 80 c2 01 	lis     r12,-15871
c00038dc:	39 8c 8f 08 	addi    r12,r12,-28920
c00038e0:	7d 89 03 a6 	mtctr   r12
c00038e4:	4e 80 04 20 	bctr

Disassembly of section .text:

c0004000 <Reset_virt>:


c20016dc <prom_init>:
c20016dc:	94 21 ff 50 	stwu    r1,-176(r1)
c20016e0:	7c 08 02 a6 	mflr    r0
c20016e4:	42 9f 00 05 	bcl     20,4*cr7+so,c20016e8 <prom_init+0xc>
c20016e8:	bd c1 00 68 	stmw    r14,104(r1)
c20016ec:	7f c8 02 a6 	mflr    r30
c20016f0:	90 01 00 b4 	stw     r0,180(r1)
c20016f4:	7c bb 2b 78 	mr      r27,r5
c20016f8:	80 1e ff f0 	lwz     r0,-16(r30)
....
c20026d4:	4a 00 ed 69 	bl      c001143c <reloc_got2>
c20026d8:	7f e3 fb 78 	mr      r3,r31
c20026dc:	7f 24 cb 78 	mr      r4,r25
c20026e0:	39 20 00 00 	li      r9,0
c20026e4:	39 00 00 00 	li      r8,0
c20026e8:	38 e0 00 00 	li      r7,0
c20026ec:	38 c0 00 00 	li      r6,0
c20026f0:	38 a0 00 00 	li      r5,0
c20026f4:	48 00 00 61 	bl      c2002754 <prom_init+0x1078>
c20026f8:	38 60 00 00 	li      r3,0
c20026fc:	80 01 00 b4 	lwz     r0,180(r1)
c2002700:	81 c1 00 68 	lwz     r14,104(r1)
c2002704:	81 e1 00 6c 	lwz     r15,108(r1)
c2002708:	7c 08 03 a6 	mtlr    r0
c200270c:	82 01 00 70 	lwz     r16,112(r1)
c2002710:	82 21 00 74 	lwz     r17,116(r1)
c2002714:	82 41 00 78 	lwz     r18,120(r1)
c2002718:	82 61 00 7c 	lwz     r19,124(r1)
c200271c:	82 81 00 80 	lwz     r20,128(r1)
c2002720:	82 a1 00 84 	lwz     r21,132(r1)
c2002724:	82 c1 00 88 	lwz     r22,136(r1)
c2002728:	82 e1 00 8c 	lwz     r23,140(r1)
c200272c:	83 01 00 90 	lwz     r24,144(r1)
c2002730:	83 21 00 94 	lwz     r25,148(r1)
c2002734:	83 41 00 98 	lwz     r26,152(r1)
c2002738:	83 61 00 9c 	lwz     r27,156(r1)
c200273c:	83 81 00 a0 	lwz     r28,160(r1)
c2002740:	83 a1 00 a4 	lwz     r29,164(r1)
c2002744:	83 c1 00 a8 	lwz     r30,168(r1)
c2002748:	83 e1 00 ac 	lwz     r31,172(r1)
c200274c:	38 21 00 b0 	addi    r1,r1,176
c2002750:	4e 80 00 20 	blr
c2002754:	3d 80 c0 00 	lis     r12,-16384
c2002758:	39 8c 00 0c 	addi    r12,r12,12
c200275c:	7d 89 03 a6 	mtctr   r12
c2002760:	4e 80 04 20 	bctr


Any idea how GNU ld does it and how we can alter it, or force generation of a dedicated section
like on PPC64?

Christophe
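
The ppc32 stubs in the listing above build their branch target from
lis/addi immediates rather than loading it from a table. As a small worked
example, the C sketch below reproduces that arithmetic; the helper name
lis_addi_target() is hypothetical and only serves to check the numbers in
the disassembly.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the address built by "lis rX,hi; addi rX,rX,lo".
 * lis places the 16-bit immediate in the upper halfword; addi adds a
 * sign-extended 16-bit immediate. Arithmetic wraps modulo 2^32.
 */
static uint32_t lis_addi_target(int16_t hi, int16_t lo)
{
	uint32_t upper = (uint32_t)(uint16_t)hi << 16;

	return upper + (uint32_t)(int32_t)lo;
}
```

For the stub at c0003888 (lis -15872, addi 5852) this yields 0xc20016dc,
the address of prom_init in the listing, and for the stub at c2002754
(lis -16384, addi 12) it yields 0xc000000c. Both are absolute linked
addresses encoded in the instructions themselves.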

Patch

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 324d7b298ec3..f087f5cd5a50 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -30,6 +30,8 @@  extern char __end_interrupts[];
 
 extern char __prom_init_toc_start[];
 extern char __prom_init_toc_end[];
+extern char __branch_lt_start[];
+extern char __branch_lt_end[];
 
 #ifdef CONFIG_PPC_POWERNV
 extern char start_real_trampolines[];
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index ece7f97bafff..28a6c2abd3ab 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -560,8 +560,11 @@  __boot_from_prom:
 	/* Relocate code for where we are now */
 	mr	r3,r26
 	bl	relocate
+#else
+	bl	relocate_plt
 #endif
 
+
 	/* Restore parameters */
 	mr	r3,r31
 	mr	r4,r30
@@ -600,6 +603,8 @@  __after_prom_start:
 	/* IVPR needs to be set after relocation. */
 	bl	init_core_book3e
 #endif
+#else
+	bl	unrelocate_plt
 #endif
 
 /*
@@ -901,6 +906,67 @@  _GLOBAL(relative_toc)
 .balign 8
 p_toc:	.8byte	__toc_start + 0x8000 - 0b
 
+/*
+ * A large non-relocatable kernel may generate branches that go through the
+ * plt. Before the kernel is copied down to its link location, the target
+ * addresses in the .branch_lt section need to be offset by the runtime
+ * address. The offset then needs to be removed before the kernel runs at the
+ * linked address.  When relocate_plt is called, the current runtime address
+ * is added to all of the target addresses in .branch_lt and that address is
+ * stored in p_branch_lt_off.  When unrelocate_plt is called, if there is an
+ * offset saved in p_branch_lt_off, it is subtracted from the addresses in
+ * .branch_lt to return them to their original targets.
+ */
+#ifndef CONFIG_RELOCATABLE
+#define RELOCATE_MODE 0
+#define UNRELOCATE_MODE 1
+unrelocate_plt:
+	li	r16,UNRELOCATE_MODE
+	b	+8
+relocate_plt:
+	li	r16,RELOCATE_MODE
+	mflr	r0
+	bcl	20,31,$+4
+0:	mflr	r11
+
+	ld	r12,(p_branch_lt_start - 0b)(r11)
+	add	r12,r12,r11
+	ld	r14,(p_branch_lt_end - 0b)(r11)
+	add	r14,r14,r11
+	ld	r15,(p_branch_lt_off - 0b)(r11)
+
+	/* Adding runtime address or subtracting p_branch_lt_off? */
+	cmpdi	r16,UNRELOCATE_MODE
+	bne	5f
+	cmpdi	r15,0
+	beq	4f
+	mr	r10,r15
+	neg	r10,r10
+	b	2f
+5:	mr	r10,r26
+
+	/* Iterate over all targets in .branch_lt */
+2:	cmpd	r12,r14
+	bge	6f
+	ld	r13,0(r12)
+	add	r13,r13, r10
+	std	r13,0(r12)
+	addi	r12,r12, 8
+	b	2b
+
+6:	cmpdi	r16,RELOCATE_MODE
+	bne	4f
+	std	r26,(p_branch_lt_off - 0b)(r11)
+
+4:	mtlr	r0
+	blr
+
+.balign 8
+p_branch_lt_start:	.8byte	__branch_lt_start - 0b
+p_branch_lt_end:	.8byte	__branch_lt_end - 0b
+p_branch_lt_off:	.8byte	0
+#endif
+
 /*
  * This is where the main kernel code starts.
  */
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 72fa3c00229a..99085558ad3a 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -317,7 +317,9 @@  SECTIONS
 #endif
 		*(.data.rel*)
 		*(.toc1)
+		__branch_lt_start = .;
 		*(.branch_lt)
+		__branch_lt_end = .;
 	}
 
 	.opd : AT(ADDR(.opd) - LOAD_OFFSET) {