diff mbox series

OF: Add a separate direct kernel loading word

Message ID 20220712004624.284935-1-jniethe5@gmail.com
State New

Commit Message

Jordan Niethe July 12, 2022, 12:46 a.m. UTC
Currently, go-64 is used for booting a kernel from qemu (i.e. -kernel).
However, there is an expectation from users that this should be able to
boot not just vmlinux kernels but things like zImages too.

The bootwrapper of a BE zImage is a 32-bit ELF. Attempting to load that
with go-64 means that it will be run with MSR_SF set (64-bit mode). This
crashes early in boot (usually due to what should be 32-bit operations
being done with 64-bit registers eventually leading to an incorrect
address being generated and branched to).

Note that our 64-bit payloads are prepared to enter with MSR_SF cleared
and set it themselves very early.

Add a new word named go-direct that will execute any simple payload
in-place and will enter with MSR_SF cleared. This allows booting a BE
zImage from qemu with -machine kernel-addr=0.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 board-qemu/slof/OF.fs | 5 ++---
 slof/fs/boot.fs       | 6 ++++++
 2 files changed, 8 insertions(+), 3 deletions(-)

Comments

Segher Boessenkool July 12, 2022, 1:48 p.m. UTC | #1
Hi!

On Tue, Jul 12, 2022 at 10:46:24AM +1000, Jordan Niethe wrote:
> Currently, go-64 is used for booting a kernel from qemu (i.e. -kernel).
> However, there is an expectation from users that this should be able to
> boot not just vmlinux kernels but things like Zimages too.
> 
> The bootwrapper of a BE zImage is a 32-bit ELF. Attempting to load that
> with go-64 means that it will be ran with MSR_SF set (64-bit mode). This
> crashes early in boot (usually due to what should be 32-bit operations
> being done with 64-bit registers eventually leading to an incorrect
> address being generated and branched to).

In PowerPC, all operations are done the same in SF=0 and SF=1 modes,
except:
  - For addressing storage, the high 32 bits are ignored if SF=0;
  - For bdz and bdnz (bc with BO[2]=0) the high 32 bits are ignored;
  - For integer record form instructions ("dot instructions"), the high
    32 bits are ignored.
Everything else is done exactly the same with SF=0 -- the high 32 bits
of everything are set in exactly the same way, for example.

In practice, what bites you is the first item, when doing table jumps:
it ends up jumping to 0xffffffff12345678 instead of 0x0000000012345678
when running 32-bit code with SF=1.  You get about two million
instructions into yaboot before it blows up, for example :-)  Dot insns
are very common, but you do end up with the result properly (sign-)
extended most of the time :-)
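
To illustrate the first item, here is a minimal sketch in Python (not SLOF
code) of how the SF bit changes the address that is actually used. The
helper name effective_address is made up for this example; it only models
the truncation rule described above, nothing else about the CPU.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide in both modes


def effective_address(reg_value: int, sf: int) -> int:
    """Model the address the CPU uses for a 64-bit register value.

    With SF=0 the high 32 bits are ignored when addressing storage;
    with SF=1 all 64 bits are used.
    """
    reg_value &= MASK64
    return reg_value if sf else reg_value & 0xFFFFFFFF


# The table-jump target from the text, with junk in the high 32 bits.
target = 0xFFFFFFFF12345678
print(hex(effective_address(target, sf=0)))  # 0x12345678
print(hex(effective_address(target, sf=1)))  # 0xffffffff12345678
```

The register contents are identical in both cases; only the interpretation
of the value as an address differs, which is why 32-bit code can run for a
long time with SF=1 before it fails.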

> Note that our 64-bit payloads are prepared to enter with MSR_SF cleared
> and set it themselves very early.
> 
> Add a new word named go-direct that will execute any simple payload
> in-place and will enter with MSR_SF cleared. This allows booting a BE
> zImage from qemu with -machine kernel-addr=0.

Ouch.  So you run 64-bit programs in 32-bit mode as well, just hoping
they will deal with it?  Not a good idea :-(  Current Linux is fine with
it, but are other payloads, including future Linux?

"init-program" is supposed to set the MSR state correctly (in ciregs
>srr1), based on the ELF headers (and btw the same is true for the LE
flag etc).  A little ELF parsing is needed.
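
As a rough sketch of the kind of ELF parsing meant here, written in Python
rather than Forth: the class and data-encoding bytes of e_ident are enough
to pick the SF and LE state. The function name elf_modes and the returned
dictionary are assumptions for illustration only; the offsets and constant
values are the standard ELF ones.

```python
EI_CLASS, EI_DATA = 4, 5            # byte offsets inside e_ident
ELFCLASS32, ELFCLASS64 = 1, 2       # 32-bit / 64-bit objects
ELFDATA2LSB, ELFDATA2MSB = 1, 2     # little-endian / big-endian


def elf_modes(header: bytes):
    """Return the desired SF/LE state for an ELF image, or None.

    None means the buffer is not a recognisable ELF header (for example
    a raw binary blob), so no mode information is available.
    """
    if len(header) < 6 or header[:4] != b"\x7fELF":
        return None
    elf_class, elf_data = header[EI_CLASS], header[EI_DATA]
    if elf_class not in (ELFCLASS32, ELFCLASS64):
        return None
    if elf_data not in (ELFDATA2LSB, ELFDATA2MSB):
        return None
    return {"sf": elf_class == ELFCLASS64, "le": elf_data == ELFDATA2LSB}


# A 32-bit big-endian header prefix, like a BE zImage bootwrapper would have.
be32 = b"\x7fELF" + bytes([ELFCLASS32, ELFDATA2MSB]) + bytes(10)
print(elf_modes(be32))  # {'sf': False, 'le': False}
```

This only works when the firmware can still see the ELF header of the
image it is about to start.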

Hope this helps,


Segher
Jordan Niethe July 13, 2022, 3:38 a.m. UTC | #2
On Tue, Jul 12, 2022 at 11:49 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Hi!
>
> On Tue, Jul 12, 2022 at 10:46:24AM +1000, Jordan Niethe wrote:
> > Currently, go-64 is used for booting a kernel from qemu (i.e. -kernel).
> > However, there is an expectation from users that this should be able to
> > boot not just vmlinux kernels but things like Zimages too.
> >
> > The bootwrapper of a BE zImage is a 32-bit ELF. Attempting to load that
> > with go-64 means that it will be ran with MSR_SF set (64-bit mode). This
> > crashes early in boot (usually due to what should be 32-bit operations
> > being done with 64-bit registers eventually leading to an incorrect
> > address being generated and branched to).
>
> In PowerPC, all operations are done the same in SF=0 and SF=1 modes,
> except:
>   - For addressing storage, the high 32 bits are ignored if SF=0;
>   - For bdz and bdnz (bc with BO[2]=1) the high 32 bits are ignored;
>   - For integer record form instructions ("dot instructions"), the high
>     32 bits are ignored.
> Everything else is done exactly the same with SF=0 -- the high 32 bits
> of everything are set in exactly the same way, for example.
>
> In practice, what bites you is the first item, when doing table jumps:
> it ends up jumping to 0xffffffff12345678 instead of 0x0000000012345678
> when running 32-bit code with SF=1.  You get about two million
> instructions into yaboot before it blows up, for example :-)  Dot insns
> are very common, but you do end up with the result properly (sign-)
> extended most of the time :-)

Ah OK, it is the first case that is happening. For example, here the
crash comes after branching to $0000000104002f24 when it should be
branching to $0000000004002f24. Like this:


=> 0x4002cd8:    add     r7,r7,r5
2: /x $r7 = 0xffff566c
3: /x $r5 = 0x400d8b8
(gdb) stepi
1: x/i $pc
=> 0x4002cdc:    mtctr   r7
2: /x $r7 = 0x104002f24
3: /x $r5 = 0x400d8b8
(gdb) stepi
1: x/i $pc
=> 0x4002ce0:    bctr
2: /x $r7 = 0x104002f24
3: /x $r5 = 0x400d8b8

>
> > Note that our 64-bit payloads are prepared to enter with MSR_SF cleared
> > and set it themselves very early.
> >
> > Add a new word named go-direct that will execute any simple payload
> > in-place and will enter with MSR_SF cleared. This allows booting a BE
> > zImage from qemu with -machine kernel-addr=0.
>
> Ouch.  So you run 64-bit programs in 32-bit mode as well, just hoping
> they will deal with it?  Not a good idea :-(  Current Linux is fine with
> it, but are other payloads, including future Linux?

When booting with -kernel from qemu, SLOF already skips the
"Preparing ELF-Format Programs for Execution" steps (it runs the payload
in place and always enters the client in BE mode).
So I thought these "-kernel" payloads weren't being treated as ELFs
but as the kind of client program described in the PowerPC Processor
Binding for 1275 (Section 8.2.1 Initial Register Values).
That section seemed to indicate that a client program will be entered in
32-bit mode, so I thought it might be OK to do...

>
> "init-program" is supposed to set the MSR state correctly (in ciregs
> >srr1), based on the ELF headers (and btw the same is true for the LE
> flag etc).  A little ELF parsing is needed.

When booting from memory with the -kernel option, qemu has already
loaded the kernel into memory and tells SLOF where to jump to, right?
SLOF is not looking at the ELF at all in this case, is it?

Thanks,
Jordan

>
> Hope this helps,
>
>
> Segher
Alexey Kardashevskiy July 13, 2022, 4:13 a.m. UTC | #3
On 7/12/22 23:48, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Jul 12, 2022 at 10:46:24AM +1000, Jordan Niethe wrote:
>> Currently, go-64 is used for booting a kernel from qemu (i.e. -kernel).
>> However, there is an expectation from users that this should be able to
>> boot not just vmlinux kernels but things like Zimages too.
>>
>> The bootwrapper of a BE zImage is a 32-bit ELF. Attempting to load that
>> with go-64 means that it will be ran with MSR_SF set (64-bit mode). This
>> crashes early in boot (usually due to what should be 32-bit operations
>> being done with 64-bit registers eventually leading to an incorrect
>> address being generated and branched to).
> 
> In PowerPC, all operations are done the same in SF=0 and SF=1 modes,
> except:
>    - For addressing storage, the high 32 bits are ignored if SF=0;
>    - For bdz and bdnz (bc with BO[2]=1) the high 32 bits are ignored;
>    - For integer record form instructions ("dot instructions"), the high
>      32 bits are ignored.
> Everything else is done exactly the same with SF=0 -- the high 32 bits
> of everything are set in exactly the same way, for example.
> 
> In practice, what bites you is the first item, when doing table jumps:
> it ends up jumping to 0xffffffff12345678 instead of 0x0000000012345678
> when running 32-bit code with SF=1.  You get about two million
> instructions into yaboot before it blows up, for example :-)

It crashes a lot sooner with a BE zImage :)

>  Dot insns
> are very common, but you do end up with the result properly (sign-)
> extended most of the time :-)
> 
>> Note that our 64-bit payloads are prepared to enter with MSR_SF cleared
>> and set it themselves very early.
>>
>> Add a new word named go-direct that will execute any simple payload
>> in-place and will enter with MSR_SF cleared. This allows booting a BE
>> zImage from qemu with -machine kernel-addr=0.
> 
> Ouch.  So you run 64-bit programs in 32-bit mode as well, just hoping
> they will deal with it?  Not a good idea :-(  Current Linux is fine with
> it, but are other payloads, including future Linux?

https://www.devicetree.org/open-firmware/bindings/ppc/release/ppc-2_1.html 
says "Upon entry to the client program, the following registers shall 
contain the following values: ... msr ... SF=0, 32-bit mode". I'd think 
that it is up to that client program to adjust MSR_SF, not the firmware.

> "init-program" is supposed to set the MSR state correctly (in ciregs
>> srr1), based on the ELF headers (and btw the same is true for the LE
> flag etc).  A little ELF parsing is needed.


This is the case when QEMU runs with "-kernel zImage": QEMU loads the
ELF and SLOF has no idea what the binary was. QEMU does store a
"/chosen/qemu,boot-kernel-le" property but there is nothing for 32-bit.

> 
> Hope this helps,

Always does! :)

> 
> 
> Segher
Segher Boessenkool July 13, 2022, 6:11 p.m. UTC | #4
Hi!

On Wed, Jul 13, 2022 at 02:13:06PM +1000, Alexey Kardashevskiy wrote:
> On 7/12/22 23:48, Segher Boessenkool wrote:
> >Ouch.  So you run 64-bit programs in 32-bit mode as well, just hoping
> >they will deal with it?  Not a good idea :-(  Current Linux is fine with
> >it, but are other payloads, including future Linux?
> 
> https://www.devicetree.org/open-firmware/bindings/ppc/release/ppc-2_1.html 
> says "Upon entry to the client program, the following registers shall 
> contain the following values: ... msr ... SF=0, 32-bit mode". I'd think 
> that it is up to that client program to adjust MSR_SF, not the firmware.

The (later) PowerPC CHRP binding says (in 10.4):
  The data format of a client program compliant with this specification
  shall be either ELF (Executable and Linkage Format) as defined by
  [19], and extended by section 10.4.1.1., or PE (Portable Executable)
  as defined by [17]. The standard ELF format contains explicit
  indication as to the program's execution modes (e.g., 32- or 64-bit,
  Big- or Little-Endian). CHRP only supports the 32-bit version (i.e.,
  ELFCLASS32) for 32 and 64 bit platforms.  Note: other client program
  formats may be supported, in an implementation specific manner, by an
  Open Firmware implementation.

Supporting 64-bit ELF is thus explicitly allowed as well, and various
implementations do just that.

> >"init-program" is supposed to set the MSR state correctly (in ciregs
> >>srr1), based on the ELF headers (and btw the same is true for the LE
> >flag etc).  A little ELF parsing is needed.
> 
> This is the case when QEMU runs with "-kernel zImage" - QEMU loads the 
> ELF and SLOF has no idea what was the binary. QEMU does store a 
> "/chosen/qemu,boot-kernel-le" property but there is nothing for 32bit.

Ah, that sucks.  Then you'll just have to do what you have to do, no
matter how awful it is :-(

> >Hope this helps,
> 
> Always does! :)

Great to hear that :-)


Segher
Segher Boessenkool July 13, 2022, 6:20 p.m. UTC | #5
On Wed, Jul 13, 2022 at 01:38:03PM +1000, Jordan Niethe wrote:
> Ah ok, it is the first case happening. For example, here the crash
> comes after going to $0000000104002f24
> when it should be going to $0000000004002f24. Like this:
> 
> => 0x4002cd8:    add     r7,r7,r5
> 2: /x $r7 = 0xffff566c

A 64-bit program gets that r7 from a sign-extended load (lwa), which
would have given 0xffffffffffff566c.  A program compiled with -m32 uses
lwz here though (there is no lwa on actual 32-bit implementations, and
on some implementations lwa is slower than lwz as well).
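
The arithmetic can be checked with a small Python sketch using the values
from the gdb trace. The helpers lwz, lwa and add64 are simplified models
written for this example (they take the loaded word directly instead of
reading memory), not real emulation of the instructions.

```python
MASK64 = (1 << 64) - 1


def lwz(word: int) -> int:
    """Load word and zero: the high 32 bits of the register become 0."""
    return word & 0xFFFFFFFF


def lwa(word: int) -> int:
    """Load word algebraic: sign-extend the 32-bit value to 64 bits."""
    word &= 0xFFFFFFFF
    return word | 0xFFFFFFFF00000000 if word & 0x80000000 else word


def add64(a: int, b: int) -> int:
    """64-bit add; the register result is the same for SF=0 and SF=1."""
    return (a + b) & MASK64


offset, base = 0xFFFF566C, 0x0400D8B8  # table entry and r5 from the trace

print(hex(add64(lwz(offset), base)))  # 0x104002f24 (the bad target)
print(hex(add64(lwa(offset), base)))  # 0x4002f24   (the intended target)
```

With SF=0 the first result is still usable, because the high 32 bits are
ignored when the address is formed; with SF=1 the stray bit 32 becomes
part of the branch target.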

> > "init-program" is supposed to set the MSR state correctly (in ciregs
> > >srr1), based on the ELF headers (and btw the same is true for the LE
> > flag etc).  A little ELF parsing is needed.
> 
> When booting from memory with the -kernel option, qemu has already
> loaded the kernel into memory and tells SLOF where to jump into?
> SLOF is not looking at the ELF at all in this case is it?

But *is* there an ELF loaded, or just a binary blob?  If the info is
there it would be beneficial to use it, raw blobs have more opportunity
to go wrong (and almost no opportunity to report what is wrong).

Thanks,


Segher
Alexey Kardashevskiy July 19, 2022, 4:20 a.m. UTC | #6
On 12/07/2022 10:46, Jordan Niethe wrote:
> Currently, go-64 is used for booting a kernel from qemu (i.e. -kernel).
> However, there is an expectation from users that this should be able to
> boot not just vmlinux kernels but things like Zimages too.
> 
> The bootwrapper of a BE zImage is a 32-bit ELF. Attempting to load that
> with go-64 means that it will be ran with MSR_SF set (64-bit mode). This
> crashes early in boot (usually due to what should be 32-bit operations
> being done with 64-bit registers eventually leading to an incorrect
> address being generated and branched to).
> 
> Note that our 64-bit payloads are prepared to enter with MSR_SF cleared
> and set it themselves very early.
> 
> Add a new word named go-direct that will execute any simple payload
> in-place and will enter with MSR_SF cleared. This allows booting a BE
> zImage from qemu with -machine kernel-addr=0.
> 
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>


Thanks, applied.

> ---
>   board-qemu/slof/OF.fs | 5 ++---
>   slof/fs/boot.fs       | 6 ++++++
>   2 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/board-qemu/slof/OF.fs b/board-qemu/slof/OF.fs
> index f0fc9c684b8e..3bcb2af94bdd 100644
> --- a/board-qemu/slof/OF.fs
> +++ b/board-qemu/slof/OF.fs
> @@ -303,10 +303,9 @@ set-default-console
>   : (boot-ram)
>       direct-ram-boot-size 0<> IF
>           ." Booting from memory..." cr
> -        s" go-args 2@ " evaluate
> -        direct-ram-boot-base 0
> +        s" direct-ram-boot-base to go-entry" evaluate
>           s" true state-valid ! " evaluate
> -        s" disable-watchdog go-64" evaluate
> +        s" disable-watchdog go-direct" evaluate
>       THEN
>   ;
>   
> diff --git a/slof/fs/boot.fs b/slof/fs/boot.fs
> index 6d16c54d2af4..a6dfdf3f6c1c 100644
> --- a/slof/fs/boot.fs
> +++ b/slof/fs/boot.fs
> @@ -112,6 +112,12 @@ defer go ( -- )
>       claim-list elf-release 0 to claim-list
>   ;
>   
> +: go-direct ( -- )
> +    0 ciregs >r3 ! 0 ciregs >r4 ! 0 ciregs >r2 !
> +    msr@ 7fffffffffffffff and 2000 or ciregs >srr1 !
> +    go-args 2@ go-entry call-client
> +;
> +
>   : set-le ( -- )
>       1 ciregs >r13 !
>   ;

Patch

diff --git a/board-qemu/slof/OF.fs b/board-qemu/slof/OF.fs
index f0fc9c684b8e..3bcb2af94bdd 100644
--- a/board-qemu/slof/OF.fs
+++ b/board-qemu/slof/OF.fs
@@ -303,10 +303,9 @@  set-default-console
 : (boot-ram)
     direct-ram-boot-size 0<> IF
         ." Booting from memory..." cr
-        s" go-args 2@ " evaluate
-        direct-ram-boot-base 0
+        s" direct-ram-boot-base to go-entry" evaluate
         s" true state-valid ! " evaluate
-        s" disable-watchdog go-64" evaluate
+        s" disable-watchdog go-direct" evaluate
     THEN
 ;
 
diff --git a/slof/fs/boot.fs b/slof/fs/boot.fs
index 6d16c54d2af4..a6dfdf3f6c1c 100644
--- a/slof/fs/boot.fs
+++ b/slof/fs/boot.fs
@@ -112,6 +112,12 @@  defer go ( -- )
     claim-list elf-release 0 to claim-list
 ;
 
+: go-direct ( -- )
+    0 ciregs >r3 ! 0 ciregs >r4 ! 0 ciregs >r2 !
+    msr@ 7fffffffffffffff and 2000 or ciregs >srr1 !
+    go-args 2@ go-entry call-client
+;
+
 : set-le ( -- )
     1 ciregs >r13 !
 ;
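
For readers less used to Forth, the srr1 computation in go-direct
("msr@ 7fffffffffffffff and 2000 or") can be sketched in Python as below.
The function name go_direct_srr1 is made up for this note; the constants
assume the usual PowerPC MSR layout, where SF is the most significant bit
and 0x2000 is the FP (floating-point available) bit.

```python
MSR_SF = 1 << 63   # sixty-four-bit mode
MSR_FP = 0x2000    # floating-point available


def go_direct_srr1(msr: int) -> int:
    """Mirror 'msr@ 7fffffffffffffff and 2000 or' from go-direct."""
    return (msr & 0x7FFFFFFFFFFFFFFF) | 0x2000


# Example: firmware running in 64-bit mode with some other bit (0x1000) set.
current_msr = MSR_SF | 0x1000
srr1 = go_direct_srr1(current_msr)
print(hex(srr1))                 # 0x3000
print(bool(srr1 & MSR_SF))       # False: the client is entered with SF=0
print(bool(srr1 & MSR_FP))       # True
```

All other MSR bits are passed through unchanged, so only the SF and FP
state differs from what the firmware itself was running with.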