Message ID | YTPIdbUCmwagL5/D@os.inf.tu-dresden.de |
---|---|
State | New |
Series | arm: Launching EFI-enabled arm32 Linux |
On Sat, 4 Sept 2021 at 20:26, Adam Lackorzynski <adam@l4re.org> wrote:
> while trying to launch an EFI-enabled arm32 Linux binary (zImage) I
> noticed I get an undefined instruction exception on the first
> instruction. Now this is a bit special because Linux uses a nop
> instruction there that also is a PE file signature ('MZ') such that the
> CPU runs over it and the file is still recognized as a PE binary. Linux
> uses 0x13105a4d (tstne r0, #0x4d000) as the instruction (see also
> arch/arm/boot/compressed/head.S and efi-header.S in Linux).
> However, QEMU's instruction decoder will only recognize TST with bits
> 12-15 being 0, which this instruction is not fulfilling, and thus the
> undef exception. I guess other CPU implementations will allow this
> encoding. So while investigating I was doing the following to make Linux
> proceed. I also believe this was working in a previous version of QEMU.
>
> diff --git a/target/arm/a32.decode b/target/arm/a32.decode
> index fcd8cd4f7d..222553750e 100644
> --- a/target/arm/a32.decode
> +++ b/target/arm/a32.decode
> @@ -127,7 +127,7 @@ ADD_rri .... 001 0100 . .... .... ............ @s_rri_rot
>  ADC_rri .... 001 0101 . .... .... ............ @s_rri_rot
>  SBC_rri .... 001 0110 . .... .... ............ @s_rri_rot
>  RSC_rri .... 001 0111 . .... .... ............ @s_rri_rot
> -TST_xri .... 001 1000 1 .... 0000 ............ @S_xri_rot
> +TST_xri .... 001 1000 1 .... ---- ............ @S_xri_rot
>  TEQ_xri .... 001 1001 1 .... 0000 ............ @S_xri_rot
>  CMP_xri .... 001 1010 1 .... 0000 ............ @S_xri_rot
>  CMN_xri .... 001 1011 1 .... 0000 ............ @S_xri_rot
>
>
> Any thoughts on this?

If your guest code is relying on bits [15:12] in the TST (immediate)
Arm encoding being non-zero then it is broken. In the v8A Arm ARM
DDI 0487G.b, section F5.1.262, these bits are noted as "(0)", which
means RES0, should-be-zero. In F1.7.2 this is described as meaning
that if the bit is 1 then the behaviour is CONSTRAINED UNPREDICTABLE,
and can result in any of:
 * UNDEF (this is what QEMU chooses)
 * NOP
 * executes as if the bit were 0
 * any destination registers become UNKNOWN

This was true also for v7A. Even back as far as ARMv5 these bits
are marked as "SBZ" (should-be-zero).

Since this is all in the UNPREDICTABLE zone, there are presumably
some CPUs that do execute this insn as either a NOP or ignoring
the incorrectly set bits; but I would not be surprised if there
are also some CPUs that behave like QEMU and UNDEF them.

Looking at the code where this is used, I think it probably needs
to abandon the goal of having the insn be a true or nearly-true NOP.
Since this is the first insn the kernel executes, it doesn't
really have to be a NOP, as long as it doesn't trash the registers
where the bootloader passed it information (r0, r1, r2).
Unless there are other undocumented constraints on this
instruction pattern, you might try

  0xe2255a4d ; eor r5, r5, 0x4d000

That's not a NOP on its own, but if you use it twice in a row
then it is, and you can make sure the use in head.S arranges to
put two of those and then revert to more normal-looking NOPs
for the rest of its run of NOPs. (This doesn't work for
CONFIG_THUMB2_KERNEL, but neither does the current insn pattern
I think.)

thanks
-- PMM
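The encodings discussed above can be checked by hand. The following is a minimal Python sketch, not code from QEMU or Linux; the helper names `ror32` and `decode_dp_imm` are made up for illustration, and only the A32 "data-processing (immediate)" field layout is assumed. It shows that both words start with the bytes 'MZ' when stored little-endian, that the TST word has non-zero bits [15:12], and that applying the suggested EOR twice restores r5.

```python
import struct

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def decode_dp_imm(insn):
    """Split an A32 data-processing (immediate) word into its fields."""
    return {
        "cond": (insn >> 28) & 0xF,
        "opcode": (insn >> 21) & 0xF,
        "s": (insn >> 20) & 0x1,
        "rn": (insn >> 16) & 0xF,
        # Rd for most opcodes; should-be-zero for TST/TEQ/CMP/CMN.
        "bits_15_12": (insn >> 12) & 0xF,
        # 8-bit immediate rotated right by twice the 4-bit rotate field.
        "imm": ror32(insn & 0xFF, 2 * ((insn >> 8) & 0xF)),
    }

TSTNE = 0x13105A4D  # tstne r0, #0x4d000 (current Linux header word)
EOR = 0xE2255A4D    # eor r5, r5, #0x4d000 (suggested replacement)

for word in (TSTNE, EOR):
    first_two = struct.pack("<I", word)[:2]
    print(hex(word), first_two, decode_dp_imm(word))

# Two EORs with the same immediate cancel out, so the pair acts as a NOP.
r5 = 0x12345678
imm = decode_dp_imm(EOR)["imm"]
print(hex((r5 ^ imm) ^ imm))
```

For the TST word the field at bits [15:12] decodes to 5, which is the part the architecture marks as should-be-zero; for the EOR word the same field is simply the destination register r5.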
On Sat, 4 Sep 2021 21:26:45 +0200
Adam Lackorzynski <adam@l4re.org> wrote:

Hi Adam,

> Any thoughts on this?

thanks for the report, I was looking at this and have a kernel patch
to fix this properly as Peter suggested. And while I agree on the
problem, I was struggling to reproduce this in reality: both with
-kernel and when booting through U-Boot the "Z" bit is set, which lets
QEMU not even bother about the rest of the encoding - the condition
flags don't match, so it proceeds. If I change the __nop to use "tsteq",
I see it hanging due to the missing exception handler, but not with
"tstne".
So can you say how you spotted this issue? This would be needed as a
justification for patching the guts of the ARM Linux kernel port.

Cheers,
Andre
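The interaction between the condition code and the strict decode described here can be modelled in a few lines. This is only a sketch of the behaviour reported in this thread, not QEMU code: the function names are hypothetical, only the EQ, NE and AL conditions are modelled, and "undef" stands for the choice of treating non-zero bits [15:12] in TST (immediate) as an undefined instruction.

```python
EQ, NE, AL = 0x0, 0x1, 0xE  # A32 condition codes used in this sketch

def condition_passes(cond, z_flag):
    """Evaluate the condition field against the Z flag (EQ/NE/AL only)."""
    if cond == EQ:
        return z_flag
    if cond == NE:
        return not z_flag
    if cond == AL:
        return True
    raise ValueError("condition not modelled in this sketch")

def strict_step(insn, z_flag):
    """Model an implementation that UNDEFs on non-zero SBZ bits in TST (imm)."""
    if not condition_passes(insn >> 28, z_flag):
        return "skipped"  # condition failed, rest of the encoding is irrelevant
    is_tst_imm = (insn & 0x0FF00000) == 0x03100000
    if is_tst_imm and (insn >> 12) & 0xF:
        return "undef"
    return "executed"

TSTNE = 0x13105A4D  # tstne r0, #0x4d000
TSTEQ = 0x03105A4D  # same word with the EQ condition

print(strict_step(TSTNE, z_flag=True))   # Z set: NE fails
print(strict_step(TSTNE, z_flag=False))  # Z clear: NE passes, SBZ bits checked
print(strict_step(TSTEQ, z_flag=True))   # Z set: EQ passes, SBZ bits checked
```

Under this model, "tstne" only faults when the kernel is entered with Z clear, while "tsteq" faults when Z is set, which is the pattern described above.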
Hi Andre,

On Mon Sep 06, 2021 at 16:34:03 +0100, Andre Przywara wrote:
> thanks for the report, I was looking at this and have a kernel patch
> to fix this properly as Peter suggested. And while I agree on the
> problem, I was struggling to reproduce this in reality: both with
> -kernel and when booting through U-Boot the "Z" bit is set, which lets
> QEMU not even bother about the rest of the encoding - the condition
> flags don't match, so it proceeds. If I change the __nop to use "tsteq",
> I see it hanging due to the missing exception handler, but not with
> "tstne".
> So can you say how you spotted this issue? This would be needed as a
> justification for patching the guts of the ARM Linux kernel port.

Good point with the condition flags. I'm doing this with our own vmm
where I'm loading the binary directly as the payload (as mandated by the
header), and where psr is set to a defined value with all flags cleared.
If I set the Z bit then it also works (of course).
Looking around a bit in QEMU as well as U-Boot, it looks like it is
rather by luck how the flags are set.

Thanks for doing the Linux patch, I'll scrap mine, and also thanks to
Peter for the idea!

Adam
On Wed, 8 Sep 2021 01:25:04 +0200
Adam Lackorzynski <adam@l4re.org> wrote:

Hi Adam,

> Good point with the condition flags. I'm doing this with our own vmm
> where I'm loading the binary directly as the payload (as mandated by the
> header), and where psr is set to a defined value with all flags cleared.

Right, I was thinking something like this.

> If I set the Z bit then it also works (of course).
> Looking around a bit in QEMU as well as U-Boot, it looks like it is
> rather by luck how the flags are set.

Yes, the kernel boot protocol doesn't say anything about the condition
flags, so any combination would be valid and we were just lucky before.
I did also test on a Cortex-A7 and A53, both ignore bits [15:12] (so
execute the instruction as if they were 0), which explains why it works
on real hardware.

> Thanks for doing the Linux patch, I'll scrap mine, and also thanks to
> Peter for the idea!

Oh, didn't want to cut you off, if you want to have the commit: be my
guest!
Otherwise I will send something tomorrow, with a Reported-by: to you.

Greetings to the Elbe!

Cheers,
Andre
Hi Andre,

On Wed Sep 08, 2021 at 00:47:10 +0100, Andre Przywara wrote:
> Oh, didn't want to cut you off, if you want to have the commit: be my
> guest!
> Otherwise I will send something tomorrow, with a Reported-by: to you.

No, that's fine, I'm happy this is taken care of :)

> Greetings to the Elbe!

Thanks, greetings back :-)

Adam
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index fcd8cd4f7d..222553750e 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -127,7 +127,7 @@ ADD_rri .... 001 0100 . .... .... ............ @s_rri_rot
 ADC_rri .... 001 0101 . .... .... ............ @s_rri_rot
 SBC_rri .... 001 0110 . .... .... ............ @s_rri_rot
 RSC_rri .... 001 0111 . .... .... ............ @s_rri_rot
-TST_xri .... 001 1000 1 .... 0000 ............ @S_xri_rot
+TST_xri .... 001 1000 1 .... ---- ............ @S_xri_rot
 TEQ_xri .... 001 1001 1 .... 0000 ............ @S_xri_rot
 CMP_xri .... 001 1010 1 .... 0000 ............ @S_xri_rot
 CMN_xri .... 001 1011 1 .... 0000 ............ @S_xri_rot
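To see why the one-field change in this patch alters the decode result, the two TST_xri patterns can be turned into a mask/value pair and compared against the instruction word. This is a simplified model written for illustration only; it is not QEMU's decodetree implementation, the function names are hypothetical, and it assumes that '0' and '1' are fixed bits while '.' and '-' both mean "not fixed".

```python
def pattern_to_mask_value(pattern):
    """Convert a 32-character bit pattern into (fixed-bit mask, fixed-bit value)."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("pattern must describe exactly 32 bits")
    mask = value = 0
    for ch in bits:  # most significant bit first
        mask <<= 1
        value <<= 1
        if ch in "01":
            mask |= 1
            value |= int(ch)
        elif ch not in ".-":
            raise ValueError(f"unexpected pattern character {ch!r}")
    return mask, value

def matches(pattern, insn):
    """True if every fixed bit of the pattern agrees with the instruction word."""
    mask, value = pattern_to_mask_value(pattern)
    return (insn & mask) == value

OLD_TST = ".... 001 1000 1 .... 0000 ............"  # bits [15:12] must be 0
NEW_TST = ".... 001 1000 1 .... ---- ............"  # bits [15:12] ignored

insn = 0x13105A4D  # tstne r0, #0x4d000, bits [15:12] == 0b0101

print(matches(OLD_TST, insn))  # False: falls through to UNDEF
print(matches(NEW_TST, insn))  # True: decoded as TST
```

With the original pattern the fixed-bit mask is 0x0ff0f000, so the non-zero bits [15:12] prevent a match; with the patched pattern the mask shrinks to 0x0ff00000 and the word is accepted as TST.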