diff mbox series

Revert "Fix data abort caused by mis-aligning FIT data"

Message ID 20201019214026.888876-1-marex@denx.de
State Accepted
Commit 5675ed7cb645f5ec13958726992daeeed16fd114
Delegated to: Tom Rini
Headers show
Series Revert "Fix data abort caused by mis-aligning FIT data" | expand

Commit Message

Marek Vasut Oct. 19, 2020, 9:40 p.m. UTC
This reverts commit eb39d8ba5f0d1468b01b89a2a464d18612d3ea76.
The commit breaks booting of fitImage by SPL, the system simply hangs.
This is because on arm32, the fitImage and all of its content can be
aligned to 4 bytes and U-Boot expects just that.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Reuben Dowle <reuben.dowle@4rf.com>
Cc: Tom Rini <trini@konsulko.com>
---
 common/spl/spl_fit.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Comments

Reuben Dowle Oct. 19, 2020, 9:50 p.m. UTC | #1
The alignment of 8 bytes would also work if code was expecting 4 byte alignment. So the explanation you give for reverting this does not make sense to me.

The version I use in production uses 4 byte alignment, but on advice of Tom Rini I extended to 8 bytes. Maybe we could switch to just forcing 4 bytes, although I can't see why 8 byte would not work?

Note also that I was getting SPL data abort crashes on my arm32 target when image size was not 4 byte aligned. So reverting this patch will break my platform.

-----Original Message-----
From: Marek Vasut <marex@denx.de> 
Sent: Tuesday, 20 October 2020 10:40 am
To: u-boot@lists.denx.de
Cc: Marek Vasut <marex@denx.de>; Reuben Dowle <reuben.dowle@4rf.com>; Tom Rini <trini@konsulko.com>
Subject: [PATCH] Revert "Fix data abort caused by mis-aligning FIT data"

This reverts commit eb39d8ba5f0d1468b01b89a2a464d18612d3ea76.
The commit breaks booting of fitImage by SPL, the system simply hangs.
This is because on arm32, the fitImage and all of its content can be aligned to 4 bytes and U-Boot expects just that.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Reuben Dowle <reuben.dowle@4rf.com>
Cc: Tom Rini <trini@konsulko.com>
---
 common/spl/spl_fit.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c index 0e27ad1d6a..a90d821c82 100644
--- a/common/spl/spl_fit.c
+++ b/common/spl/spl_fit.c
@@ -349,12 +349,9 @@ static int spl_fit_append_fdt(struct spl_image_info *spl_image,
 
 	/*
 	 * Use the address following the image as target address for the
-	 * device tree. Load address is aligned to 8 bytes to match the required
-	 * alignment specified for linux arm [1] and arm 64 [2] booting
-	 * [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm/booting.rst#n126
-	 * [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm64/booting.rst#n45
+	 * device tree.
 	 */
-	image_info.load_addr = ALIGN(spl_image->load_addr + spl_image->size, 8);
+	image_info.load_addr = spl_image->load_addr + spl_image->size;
 
 	/* Figure out which device tree the board wants to use */
 	node = spl_fit_get_image_node(fit, images, FIT_FDT_PROP, index++);
--
2.28.0
Marek Vasut Oct. 19, 2020, 9:59 p.m. UTC | #2
On 10/19/20 11:50 PM, Reuben Dowle wrote:
> The alignment of 8 bytes would also work if code was expecting 4 byte alignment. So the explanation you give for reverting this does not make sense to me.

Well, since U-Boot 2020.10-rc5, any STM32MP1 board does no longer boot
and if I revert this patch, it works again (per git bisect). But this
also applies to any other arm32 boards which load fitImage in SPL, all
of those boards are broken in U-Boot 2020.10.

It seems that the end of the U-Boot image is at 4-byte aligned offset on
arm32, and that is where the DT is also loaded ; but your patch forces
the alignment to 8-bytes, so suddenly the DT location is 4-bytes off.

> The version I use in production uses 4 byte alignment, but on advice of Tom Rini I extended to 8 bytes. Maybe we could switch to just forcing 4 bytes, although I can't see why 8 byte would not work?
> 
> Note also that I was getting SPL data abort crashes on my arm32 target when image size was not 4 byte aligned. So reverting this patch will break my platform.

Which platform is that ?

(also, please do not top-post)

[...]

> This reverts commit eb39d8ba5f0d1468b01b89a2a464d18612d3ea76.
> The commit breaks booting of fitImage by SPL, the system simply hangs.
> This is because on arm32, the fitImage and all of its content can be aligned to 4 bytes and U-Boot expects just that.
> 
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Reuben Dowle <reuben.dowle@4rf.com>
> Cc: Tom Rini <trini@konsulko.com>
> ---
>  common/spl/spl_fit.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c index 0e27ad1d6a..a90d821c82 100644
> --- a/common/spl/spl_fit.c
> +++ b/common/spl/spl_fit.c
> @@ -349,12 +349,9 @@ static int spl_fit_append_fdt(struct spl_image_info *spl_image,
> 
>  /*
>   * Use the address following the image as target address for the
> - * device tree. Load address is aligned to 8 bytes to match the required
> - * alignment specified for linux arm [1] and arm 64 [2] booting
> - * [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm/booting.rst#n126
> - * [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm64/booting.rst#n45
> + * device tree.
>   */
> -image_info.load_addr = ALIGN(spl_image->load_addr + spl_image->size, 8);
> +image_info.load_addr = spl_image->load_addr + spl_image->size;
> 
>  /* Figure out which device tree the board wants to use */
>  node = spl_fit_get_image_node(fit, images, FIT_FDT_PROP, index++);
> --
> 2.28.0

[...]
Reuben Dowle Oct. 19, 2020, 10:17 p.m. UTC | #3
> -----Original Message-----
> From: Marek Vasut <marex@denx.de>
> Sent: Tuesday, 20 October 2020 10:59 am
> To: Reuben Dowle <reuben.dowle@4rf.com>; u-boot@lists.denx.de
> Cc: Tom Rini <trini@konsulko.com>
> Subject: Re: [PATCH] Revert "Fix data abort caused by mis-aligning FIT data"
> 
> On 10/19/20 11:50 PM, Reuben Dowle wrote:
> > The alignment of 8 bytes would also work if code was expecting 4 byte
> alignment. So the explanation you give for reverting this does not make
> sense to me.
> 
> Well, since U-Boot 2020.10-rc5, any STM32MP1 board does no longer boot
> and if I revert this patch, it works again (per git bisect). But this also applies to
> any other arm32 boards which load fitImage in SPL, all of those boards are
> broken in U-Boot 2020.10.

Well if it breaks these boards then we need to do something to fix this. Maybe reverting this patch is a good idea to fix this breakage, especially since others were not running into the same issue as me. But I would like to address the issue I was facing, so we need to figure out how to do something that works for my situation and supports those other boards.

> It seems that the end of the U-Boot image is at 4-byte aligned offset on
> arm32, and that is where the DT is also loaded ; but your patch forces the
> alignment to 8-bytes, so suddenly the DT location is 4-bytes off.

The issue I was facing is that spl_image->size was not a multiple of 4 bytes (alignment depended on the contents of the DT, which got misaligned when it was updated with secure boot information). Maybe an alternative way to address this would be to force alignment in the image generation process (either dtc or mkimage).

Moving image_info.load_addr by a few bytes to ensure alignment should not cause any following code to fail. Maybe the SPL is so close to size limit that wasting the 4 bytes pushes it over the limit?

> > The version I use in production uses 4 byte alignment, but on advice of Tom
> Rini I extended to 8 bytes. Maybe we could switch to just forcing 4 bytes,
> although I can't see why 8 byte would not work?
> >
> > Note also that I was getting SPL data abort crashes on my arm32 target
> when image size was not 4 byte aligned. So reverting this patch will break my
> platform.
> 
> Which platform is that ?

I am using a custom platform which is based off of the clearfog. I have changes to uboot to support our custom hardware, secure boot, etc.
Tom Rini Oct. 19, 2020, 10:45 p.m. UTC | #4
On Mon, Oct 19, 2020 at 11:59:22PM +0200, Marek Vasut wrote:
> On 10/19/20 11:50 PM, Reuben Dowle wrote:
> > The alignment of 8 bytes would also work if code was expecting 4 byte alignment. So the explanation you give for reverting this does not make sense to me.
> 
> Well, since U-Boot 2020.10-rc5, any STM32MP1 board does no longer boot
> and if I revert this patch, it works again (per git bisect). But this
> also applies to any other arm32 boards which load fitImage in SPL, all
> of those boards are broken in U-Boot 2020.10.
> 
> It seems that the end of the U-Boot image is at 4-byte aligned offset on
> arm32, and that is where the DT is also loaded ; but your patch forces
> the alignment to 8-bytes, so suddenly the DT location is 4-bytes off.

I think this needs some more investigation to figure out what's going
on and where the underlying bugs are.  This section of the code is where
U-Boot is saying it will copy the device tree to.  If we're using a
device tree in place that's NOT being copied (and someone else has
ensured 8 byte alignment of) we need to set the address of where the
device tree is, at that time.
Marek Vasut Oct. 19, 2020, 10:46 p.m. UTC | #5
On 10/20/20 12:17 AM, Reuben Dowle wrote:

[...]

>> On 10/19/20 11:50 PM, Reuben Dowle wrote:
>>> The alignment of 8 bytes would also work if code was expecting 4 byte
>> alignment. So the explanation you give for reverting this does not make
>> sense to me.
>>
>> Well, since U-Boot 2020.10-rc5, any STM32MP1 board does no longer boot
>> and if I revert this patch, it works again (per git bisect). But this also applies to
>> any other arm32 boards which load fitImage in SPL, all of those boards are
>> broken in U-Boot 2020.10.
> 
> Well if it breaks these boards then we need to do something to fix this. Maybe reverting this patch is a good idea to fix this breakage, especially since others were not running into the same issue as me. But I would like to address the issue I was facing, so we need to figure out how to do something that works for my situation and supports those other boards.
> 
>> It seems that the end of the U-Boot image is at 4-byte aligned offset on
>> arm32, and that is where the DT is also loaded ; but your patch forces the
>> alignment to 8-bytes, so suddenly the DT location is 4-bytes off.
> 
> The issue I was facing is that spl_image->size was not a multiple of 4 bytes (alignment depended on the contents of the DT, which got misaligned when it was updated with secure boot information).

The patch you posted aligns the load address to 8 bytes, not to 4 bytes.
So why didn't the ALIGN use 4 byte alignment ?

> Maybe an alternative way to address this would be to force alignment in the image generation process (either dtc or mkimage).

mkimage -E / -B already can do that for you.

> Moving image_info.load_addr by a few bytes to ensure alignment should not cause any following code to fail. Maybe the SPL is so close to size limit that wasting the 4 bytes pushes it over the limit?

No. If the DT which is supposed to be at the end of the
spl_image->load_addr + spl_image->size is suddenly a few bytes off (due
to the ALIGN()), then it is inaccessible and the boot fails. The
spl_image->load_addr + spl_image->size should already be 4 byte aligned.

>>> The version I use in production uses 4 byte alignment, but on advice of Tom
>> Rini I extended to 8 bytes. Maybe we could switch to just forcing 4 bytes,
>> although I can't see why 8 byte would not work?
>>>
>>> Note also that I was getting SPL data abort crashes on my arm32 target
>> when image size was not 4 byte aligned. So reverting this patch will break my
>> platform.
>>
>> Which platform is that ?
> 
> I am using a custom platform which is based off of the clearfog. I have changes to uboot to support our custom hardware, secure boot, etc.

So, are you triggering the problem with custom-patched U-Boot, not with
mainline ? Could it be that some of the custom patches triggered the
problem instead ?
Marek Vasut Oct. 19, 2020, 10:54 p.m. UTC | #6
On 10/20/20 12:45 AM, Tom Rini wrote:
> On Mon, Oct 19, 2020 at 11:59:22PM +0200, Marek Vasut wrote:
>> On 10/19/20 11:50 PM, Reuben Dowle wrote:
>>> The alignment of 8 bytes would also work if code was expecting 4 byte alignment. So the explanation you give for reverting this does not make sense to me.
>>
>> Well, since U-Boot 2020.10-rc5, any STM32MP1 board does no longer boot
>> and if I revert this patch, it works again (per git bisect). But this
>> also applies to any other arm32 boards which load fitImage in SPL, all
>> of those boards are broken in U-Boot 2020.10.
>>
>> It seems that the end of the U-Boot image is at 4-byte aligned offset on
>> arm32, and that is where the DT is also loaded ; but your patch forces
>> the alignment to 8-bytes, so suddenly the DT location is 4-bytes off.
> 
> I think this needs some more investigation to figure out what's going
> on and where the underlying bugs are.  This section of the code is where
> U-Boot is saying it will copy the device tree to.  If we're using a
> device tree in place that's NOT being copied (and someone else has
> ensured 8 byte alignment of) we need to set the address of where the
> device tree is, at that time.

The problem is that the previous alignment was 4 byte, now it is 8 byte
and that breaks all the other assumptions. So, this patch should be
reverted to fix the platforms which used to work (or use ALIGN(...4),
which is the same as reverting it really).

And likely the signed image which caused the breakage should be
generated with mkimage -E / -B 8, which would insure the alignment, so
then there is no need to change anything in the code itself.
Tom Rini Oct. 19, 2020, 10:58 p.m. UTC | #7
On Tue, Oct 20, 2020 at 12:54:35AM +0200, Marek Vasut wrote:
> On 10/20/20 12:45 AM, Tom Rini wrote:
> > On Mon, Oct 19, 2020 at 11:59:22PM +0200, Marek Vasut wrote:
> >> On 10/19/20 11:50 PM, Reuben Dowle wrote:
> >>> The alignment of 8 bytes would also work if code was expecting 4 byte alignment. So the explanation you give for reverting this does not make sense to me.
> >>
> >> Well, since U-Boot 2020.10-rc5, any STM32MP1 board does no longer boot
> >> and if I revert this patch, it works again (per git bisect). But this
> >> also applies to any other arm32 boards which load fitImage in SPL, all
> >> of those boards are broken in U-Boot 2020.10.
> >>
> >> It seems that the end of the U-Boot image is at 4-byte aligned offset on
> >> arm32, and that is where the DT is also loaded ; but your patch forces
> >> the alignment to 8-bytes, so suddenly the DT location is 4-bytes off.
> > 
> > I think this needs some more investigation to figure out what's going
> > on and where the underlying bugs are.  This section of the code is where
> > U-Boot is saying it will copy the device tree to.  If we're using a
> > device tree in place that's NOT being copied (and someone else has
> > ensured 8 byte alignment of) we need to set the address of where the
> > device tree is, at that time.
> 
> The problem is that the previous alignment was 4 byte, now it is 8 byte
> and that breaks all the other assumptions. So, this patch should be
> reverted to fix the platforms which used to work (or use ALIGN(...4),
> which is the same as reverting it really).
> 
> And likely the signed image which caused the breakage should be
> generated with mkimage -E / -B 8, which would insure the alignment, so
> then there is no need to change anything in the code itself.

4 byte alignment is wrong.
Marek Vasut Oct. 19, 2020, 11:02 p.m. UTC | #8
On 10/20/20 12:58 AM, Tom Rini wrote:
> On Tue, Oct 20, 2020 at 12:54:35AM +0200, Marek Vasut wrote:
>> On 10/20/20 12:45 AM, Tom Rini wrote:
>>> On Mon, Oct 19, 2020 at 11:59:22PM +0200, Marek Vasut wrote:
>>>> On 10/19/20 11:50 PM, Reuben Dowle wrote:
>>>>> The alignment of 8 bytes would also work if code was expecting 4 byte alignment. So the explanation you give for reverting this does not make sense to me.
>>>>
>>>> Well, since U-Boot 2020.10-rc5, any STM32MP1 board does no longer boot
>>>> and if I revert this patch, it works again (per git bisect). But this
>>>> also applies to any other arm32 boards which load fitImage in SPL, all
>>>> of those boards are broken in U-Boot 2020.10.
>>>>
>>>> It seems that the end of the U-Boot image is at 4-byte aligned offset on
>>>> arm32, and that is where the DT is also loaded ; but your patch forces
>>>> the alignment to 8-bytes, so suddenly the DT location is 4-bytes off.
>>>
>>> I think this needs some more investigation to figure out what's going
>>> on and where the underlying bugs are.  This section of the code is where
>>> U-Boot is saying it will copy the device tree to.  If we're using a
>>> device tree in place that's NOT being copied (and someone else has
>>> ensured 8 byte alignment of) we need to set the address of where the
>>> device tree is, at that time.
>>
>> The problem is that the previous alignment was 4 byte, now it is 8 byte
>> and that breaks all the other assumptions. So, this patch should be
>> reverted to fix the platforms which used to work (or use ALIGN(...4),
>> which is the same as reverting it really).
>>
>> And likely the signed image which caused the breakage should be
>> generated with mkimage -E / -B 8, which would insure the alignment, so
>> then there is no need to change anything in the code itself.
> 
> 4 byte alignment is wrong.

Please elaborate why is 4 byte alignment wrong? 8 byte alignment
obviously breaks existing boards which are in-tree and worked for
multiple releases.
Reuben Dowle Oct. 19, 2020, 11:02 p.m. UTC | #9
> The problem is that the previous alignment was 4 byte, now it is 8 byte and
> that breaks all the other assumptions. So, this patch should be reverted to fix
> the platforms which used to work (or use ALIGN(...4), which is the same as
> reverting it really).

What assumptions? Any code that assumes 4 byte alignment will also work on 8 byte alignment.

Reverting is not the same as assuming ALIGN(...4) if incoming data is not already aligned to 4 bytes (as was the case when I saw crashes).

> 
> And likely the signed image which caused the breakage should be generated
> with mkimage -E / -B 8, which would insure the alignment, so then there is no
> need to change anything in the code itself.

Interesting. I had not noticed the -B parameter previously. I had originally fixed this issue on an older version of uboot that did not have that option, and later rebased the fix to newer uboot. I would need to do some testing to see if this would fix it as well.
Alex G. Oct. 19, 2020, 11:09 p.m. UTC | #10
On 10/19/20 6:02 PM, Marek Vasut wrote:
> On 10/20/20 12:58 AM, Tom Rini wrote:
>> On Tue, Oct 20, 2020 at 12:54:35AM +0200, Marek Vasut wrote:
>>> On 10/20/20 12:45 AM, Tom Rini wrote:
>>>> On Mon, Oct 19, 2020 at 11:59:22PM +0200, Marek Vasut wrote:
>>>>> On 10/19/20 11:50 PM, Reuben Dowle wrote:
>>>>>> The alignment of 8 bytes would also work if code was expecting 4 byte alignment. So the explanation you give for reverting this does not make sense to me.
>>>>>
>>>>> Well, since U-Boot 2020.10-rc5, any STM32MP1 board does no longer boot
>>>>> and if I revert this patch, it works again (per git bisect). But this
>>>>> also applies to any other arm32 boards which load fitImage in SPL, all
>>>>> of those boards are broken in U-Boot 2020.10.
>>>>>
>>>>> It seems that the end of the U-Boot image is at 4-byte aligned offset on
>>>>> arm32, and that is where the DT is also loaded ; but your patch forces
>>>>> the alignment to 8-bytes, so suddenly the DT location is 4-bytes off.
>>>>
>>>> I think this needs some more investigation to figure out what's going
>>>> on and where the underlying bugs are.  This section of the code is where
>>>> U-Boot is saying it will copy the device tree to.  If we're using a
>>>> device tree in place that's NOT being copied (and someone else has
>>>> ensured 8 byte alignment of) we need to set the address of where the
>>>> device tree is, at that time.
>>>
>>> The problem is that the previous alignment was 4 byte, now it is 8 byte
>>> and that breaks all the other assumptions. So, this patch should be
>>> reverted to fix the platforms which used to work (or use ALIGN(...4),
>>> which is the same as reverting it really).
>>>
>>> And likely the signed image which caused the breakage should be
>>> generated with mkimage -E / -B 8, which would insure the alignment, so
>>> then there is no need to change anything in the code itself.
>>
>> 4 byte alignment is wrong.
> 
> Please elaborate why is 4 byte alignment wrong? 8 byte alignment
> obviously breaks existing boards which are in-tree and worked for
> multiple releases.

The reverted change linked to some kernel documentation that requires 
64-bit alignment. I agree with the alignment requirement.

Im my opinion, there are two things that need to be done:

First is to look at an ALIGNED address for the fdt. A summary inspection 
of board_fdt_blob_setup() tells us this is done via the "_end" linker 
symbol.

Second is to put things in the right place. For FIT, the code, as is, is 
correct, but this alignment is not guaranteed for legacy images. I think 
somebody mentioned changing the arguments to mkimage to achieve this.

I've tried to fix the first point by aligning the _end symbol (appendix 
A). Unfortunately, this is causing other build issues that I don't know 
how to deal with.

Alex


APPENDIX A:


-- a/arch/arm/cpu/u-boot.lds
+++ b/arch/arm/cpu/u-boot.lds
@@ -196,7 +196,6 @@ SECTIONS
          * for FIT images.
          * See common/spl/spl_fit.c: spl_fit_append_fdt
          */
+       . = ALIGN(8);
         .end :
         {
                 *(.__end)
Marek Vasut Oct. 19, 2020, 11:11 p.m. UTC | #11
On 10/20/20 1:02 AM, Reuben Dowle wrote:
>> The problem is that the previous alignment was 4 byte, now it is 8 byte and
>> that breaks all the other assumptions. So, this patch should be reverted to fix
>> the platforms which used to work (or use ALIGN(...4), which is the same as
>> reverting it really).
> 
> What assumptions? Any code that assumes 4 byte alignment will also work on 8 byte alignment.
> 
> Reverting is not the same as assuming ALIGN(...4) if incoming data is not already aligned to 4 bytes (as was the case when I saw crashes).

Can the incoming data _not_ be 4 byte aligned ?
How can this be replicated ?

>> And likely the signed image which caused the breakage should be generated
>> with mkimage -E / -B 8, which would insure the alignment, so then there is no
>> need to change anything in the code itself.
> 
> Interesting. I had not noticed the -B parameter previously. I had originally fixed this issue on an older version of uboot that did not have that option, and later rebased the fix to newer uboot. I would need to do some testing to see if this would fix it as well.

I believe this is the way to handle this if you are building a custom DT
for U-Boot -- just make sure it has the correct parameters. I think this
is also related to:
20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
Reuben Dowle Oct. 19, 2020, 11:13 p.m. UTC | #12
> The reverted change linked to some kernel documentation that requires 64-
> bit alignment. I agree with the alignment requirement.
> 
> Im my opinion, there are two things that need to be done:
> 
> First is to look at an ALIGNED address for the fdt. A summary inspection of
> board_fdt_blob_setup() tells us this is done via the "_end" linker symbol.

The linker script can only control padding of the executable, but won't affect the alignment of the fdt that can be appended to this later by mkimage.

> Second is to put things in the right place. For FIT, the code, as is, is correct,
> but this alignment is not guaranteed for legacy images. I think somebody
> mentioned changing the arguments to mkimage to achieve this.
> 
> I've tried to fix the first point by aligning the _end symbol (appendix A).
> Unfortunately, this is causing other build issues that I don't know how to deal
> with.
> 
> Alex
> 
> 
> APPENDIX A:
> 
> 
> -- a/arch/arm/cpu/u-boot.lds
> +++ b/arch/arm/cpu/u-boot.lds
> @@ -196,7 +196,6 @@ SECTIONS
>           * for FIT images.
>           * See common/spl/spl_fit.c: spl_fit_append_fdt
>           */
> +       . = ALIGN(8);
>          .end :
>          {
>                  *(.__end)
Reuben Dowle Oct. 20, 2020, 12:27 a.m. UTC | #13
> > What assumptions? Any code that assumes 4 byte alignment will also work
> on 8 byte alignment.
> >
> > Reverting is not the same as assuming ALIGN(...4) if incoming data is not
> already aligned to 4 bytes (as was the case when I saw crashes).
> 
> Can the incoming data _not_ be 4 byte aligned ?
> How can this be replicated ?

In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).

Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.

> > Interesting. I had not noticed the -B parameter previously. I had originally
> fixed this issue on an older version of uboot that did not have that option,
> and later rebased the fix to newer uboot. I would need to do some testing to
> see if this would fix it as well.
> 
> I believe this is the way to handle this if you are building a custom DT for U-
> Boot -- just make sure it has the correct parameters. I think this is also related
> to:
> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")

I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
Alex G. Oct. 20, 2020, 12:31 a.m. UTC | #14
On 10/19/20 6:13 PM, Reuben Dowle wrote:
> 
>> The reverted change linked to some kernel documentation that requires 64-
>> bit alignment. I agree with the alignment requirement.
>>
>> Im my opinion, there are two things that need to be done:
>>
>> First is to look at an ALIGNED address for the fdt. A summary inspection of
>> board_fdt_blob_setup() tells us this is done via the "_end" linker symbol.
> 
> The linker script can only control padding of the executable, but won't 
> affect the alignment of the fdt that can be appended to this later by 
> mkimage.

Which I've addressed in the second paragraph :)

>> Second is to put things in the right place. For FIT, the code, as is, is correct,
>> but this alignment is not guaranteed for legacy images. I think somebody
>> mentioned changing the arguments to mkimage to achieve this.
>>
>> I've tried to fix the first point by aligning the _end symbol (appendix A).
>> Unfortunately, this is causing other build issues that I don't know how to deal
>> with.
>>
>> Alex

> ------------------------------------------------------------------------
> The information in this email communication (inclusive of attachments) 
> is confidential to 4RF Limited and the intended recipient(s). If you are 
> not the intended recipient(s), please note that any use, disclosure, 
> distribution or copying of this information or any part thereof is 
> strictly prohibited and that the author accepts no liability for the 
> consequences of any action taken on the basis of the information 
> provided. If you have received this email in error, please notify the 
> sender immediately by return email and then delete all instances of this 
> email from your system. 4RF Limited will not accept responsibility for 
> any consequences associated with the use of this email (including, but 
> not limited to, damages sustained as a result of any viruses and/or any 
> action or lack of action taken in reliance on it).

Per your instructions, I have deleted this message and all copies of it 
from my computer.
Marek Vasut Oct. 20, 2020, 9:05 a.m. UTC | #15
On 10/20/20 2:27 AM, Reuben Dowle wrote:
>>> What assumptions? Any code that assumes 4 byte alignment will also work
>> on 8 byte alignment.
>>>
>>> Reverting is not the same as assuming ALIGN(...4) if incoming data is not
>> already aligned to 4 bytes (as was the case when I saw crashes).
>>
>> Can the incoming data _not_ be 4 byte aligned ?
>> How can this be replicated ?
> 
> In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
> 
> Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
> 
>>> Interesting. I had not noticed the -B parameter previously. I had originally
>> fixed this issue on an older version of uboot that did not have that option,
>> and later rebased the fix to newer uboot. I would need to do some testing to
>> see if this would fix it as well.
>>
>> I believe this is the way to handle this if you are building a custom DT for U-
>> Boot -- just make sure it has the correct parameters. I think this is also related
>> to:
>> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
> 
> I will look into this, although unfortunately I am very busy with other projects right now so can't retest th

In that case, I would propose to revert this to fix existing boards in
mainline, and if your tests are not successful, revisit this issue.

I am rather sure what you are hitting is related to the mkimage patch
above, there was a lengthy discussion on that topic before.
Tom Rini Oct. 20, 2020, 2:07 p.m. UTC | #16
On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
> On 10/20/20 2:27 AM, Reuben Dowle wrote:
> >>> What assumptions? Any code that assumes 4 byte alignment will also work
> >> on 8 byte alignment.
> >>>
> >>> Reverting is not the same as assuming ALIGN(...4) if incoming data is not
> >> already aligned to 4 bytes (as was the case when I saw crashes).
> >>
> >> Can the incoming data _not_ be 4 byte aligned ?
> >> How can this be replicated ?
> > 
> > In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
> > 
> > Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
> > 
> >>> Interesting. I had not noticed the -B parameter previously. I had originally
> >> fixed this issue on an older version of uboot that did not have that option,
> >> and later rebased the fix to newer uboot. I would need to do some testing to
> >> see if this would fix it as well.
> >>
> >> I believe this is the way to handle this if you are building a custom DT for U-
> >> Boot -- just make sure it has the correct parameters. I think this is also related
> >> to:
> >> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
> > 
> > I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
> 
> In that case, I would propose to revert this to fix existing boards in
> mainline, and if your tests are not successful, revisit this issue.
> 
> I am rather sure what you are hitting is related to the mkimage patch
> above, there was a lengthy discussion on that topic before.

This gets back to what I was saying earlier in this thread.  But to
expand on it, we have been, but cannot, use the same variable for both
"this is where we have the DTB at runtime to use" and "this is where the
DTB happens to exist when we get here".  For the case of "we copy the
device tree to $address", $address must be 8 bytes aligned.  For the
case of "we use an externally provided DTB in place" I don't like the
idea of, and worry a lot about, assuming it's going to be 8 byte
aligned.  But I can set that aside for the moment.  That said, in that
second case we need to set $address to where the device tree is.

That all said, I'm still not quite sure how you're ending up in the
place you're ending up.  Which if/else paths in spl_fit_append_fdt() is
your exact platform hitting, and where is what in memory?
Marek Vasut Oct. 20, 2020, 2:29 p.m. UTC | #17
On 10/20/20 4:07 PM, Tom Rini wrote:
> On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
>> On 10/20/20 2:27 AM, Reuben Dowle wrote:
>>>>> What assumptions? Any code that assumes 4 byte alignment will also work
>>>> on 8 byte alignment.
>>>>>
>>>>> Reverting is not the same as assuming ALIGN(...4) if incoming data is not
>>>> already aligned to 4 bytes (as was the case when I saw crashes).
>>>>
>>>> Can the incoming data _not_ be 4 byte aligned ?
>>>> How can this be replicated ?
>>>
>>> In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
>>>
>>> Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
>>>
>>>>> Interesting. I had not noticed the -B parameter previously. I had originally
>>>> fixed this issue on an older version of uboot that did not have that option,
>>>> and later rebased the fix to newer uboot. I would need to do some testing to
>>>> see if this would fix it as well.
>>>>
>>>> I believe this is the way to handle this if you are building a custom DT for U-
>>>> Boot -- just make sure it has the correct parameters. I think this is also related
>>>> to:
>>>> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
>>>
>>> I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
>>
>> In that case, I would propose to revert this to fix existing boards in
>> mainline, and if your tests are not successful, revisit this issue.
>>
>> I am rather sure what you are hitting is related to the mkimage patch
>> above, there was a lengthy discussion on that topic before.
> 
> This gets back to what I was saying earlier in this thread.  But to
> expand on it, we have been, but cannot, use the same variable for both
> "this is where we have the DTB at runtime to use" and "this is where the
> DTB happens to exist when we get here".  For the case of "we copy the
> device tree to $address", $address must be 8 bytes aligned.  For the
> case of "we use an externally provided DTB in place" I don't like the
> idea of, and worry a lot about, assuming it's going to be 8 byte
> aligned.  But I can set that aside for the moment.  That said, in that
> second case we need to set $address to where the device tree is.

I don't think I understand this paragraph at all ?

> That all said, I'm still not quite sure how you're ending up in the
> place you're ending up.  Which if/else paths in spl_fit_append_fdt() is
> your exact platform hitting, and where is what in memory?

Is this a question for me or for Reuben ?
Tom Rini Oct. 20, 2020, 2:32 p.m. UTC | #18
On Tue, Oct 20, 2020 at 04:29:36PM +0200, Marek Vasut wrote:
> On 10/20/20 4:07 PM, Tom Rini wrote:
> > On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
> >> On 10/20/20 2:27 AM, Reuben Dowle wrote:
> >>>>> What assumptions? Any code that assumes 4 byte alignment will also work
> >>>> on 8 byte alignment.
> >>>>>
> >>>>> Reverting is not the same as assuming ALIGN(...4) if incoming data is not
> >>>> already aligned to 4 bytes (as was the case when I saw crashes).
> >>>>
> >>>> Can the incoming data _not_ be 4 byte aligned ?
> >>>> How can this be replicated ?
> >>>
> >>> In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
> >>>
> >>> Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
> >>>
> >>>>> Interesting. I had not noticed the -B parameter previously. I had originally
> >>>> fixed this issue on an older version of uboot that did not have that option,
> >>>> and later rebased the fix to newer uboot. I would need to do some testing to
> >>>> see if this would fix it as well.
> >>>>
> >>>> I believe this is the way to handle this if you are building a custom DT for U-
> >>>> Boot -- just make sure it has the correct parameters. I think this is also related
> >>>> to:
> >>>> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
> >>>
> >>> I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
> >>
> >> In that case, I would propose to revert this to fix existing boards in
> >> mainline, and if your tests are not successful, revisit this issue.
> >>
> >> I am rather sure what you are hitting is related to the mkimage patch
> >> above, there was a lengthy discussion on that topic before.
> > 
> > This gets back to what I was saying earlier in this thread.  But to
> > expand on it, we have been, but cannot, use the same variable for both
> > "this is where we have the DTB at runtime to use" and "this is where the
> > DTB happens to exist when we get here".  For the case of "we copy the
> > device tree to $address", $address must be 8 bytes aligned.  For the
> > case of "we use an externally provided DTB in place" I don't like the
> > idea of, and worry a lot about, assuming it's going to be 8 byte
> > aligned.  But I can set that aside for the moment.  That said, in that
> > second case we need to set $address to where the device tree is.
> 
> I don't think I understand this paragraph at all ?

OK.  Maybe I can better explain it after you walk through how changing
the "copy the DTB to this address" to be 8 byte aligned is leading to
failure, as I ask below.

> > That all said, I'm still not quite sure how you're ending up in the
> > place you're ending up.  Which if/else paths in spl_fit_append_fdt() is
> > your exact platform hitting, and where is what in memory?
> 
> Is this a question for me or for Reuben ?

For you, the person with the current failure.  Please walk me through
how / where that function is now failing.  With address values
(approximate if you can't get the exact ones).
Alex G. Oct. 20, 2020, 2:38 p.m. UTC | #19
On 10/20/20 9:32 AM, Tom Rini wrote:
> On Tue, Oct 20, 2020 at 04:29:36PM +0200, Marek Vasut wrote:
>> On 10/20/20 4:07 PM, Tom Rini wrote:
>>> On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
>>>> On 10/20/20 2:27 AM, Reuben Dowle wrote:
>>>>>>> What assumptions? Any code that assumes 4 byte alignment will also work
>>>>>> on 8 byte alignment.
>>>>>>>
>>>>>>> Reverting is not the same as assuming ALIGN(...4) if incoming data is not
>>>>>> already aligned to 4 bytes (as was the case when I saw crashes).
>>>>>>
>>>>>> Can the incoming data _not_ be 4 byte aligned ?
>>>>>> How can this be replicated ?
>>>>>
>>>>> In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
>>>>>
>>>>> Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
>>>>>
>>>>>>> Interesting. I had not noticed the -B parameter previously. I had originally
>>>>>> fixed this issue on an older version of uboot that did not have that option,
>>>>>> and later rebased the fix to newer uboot. I would need to do some testing to
>>>>>> see if this would fix it as well.
>>>>>>
>>>>>> I believe this is the way to handle this if you are building a custom DT for U-
>>>>>> Boot -- just make sure it has the correct parameters. I think this is also related
>>>>>> to:
>>>>>> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
>>>>>
>>>>> I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
>>>>
>>>> In that case, I would propose to revert this to fix existing boards in
>>>> mainline, and if your tests are not successful, revisit this issue.
>>>>
>>>> I am rather sure what you are hitting is related to the mkimage patch
>>>> above, there was a lengthy discussion on that topic before.
>>>
>>> This gets back to what I was saying earlier in this thread.  But to
>>> expand on it, we have been, but cannot, use the same variable for both
>>> "this is where we have the DTB at runtime to use" and "this is where the
>>> DTB happens to exist when we get here".  For the case of "we copy the
>>> device tree to $address", $address must be 8 bytes aligned.  For the
>>> case of "we use an externally provided DTB in place" I don't like the
>>> idea of, and worry a lot about, assuming it's going to be 8 byte
>>> aligned.  But I can set that aside for the moment.  That said, in that
>>> second case we need to set $address to where the device tree is.
>>
>> I don't think I understand this paragraph at all ?
> 
> OK.  Maybe I can better explain it after you walk through how changing
> the "copy the DTB to this address" to be 8 byte aligned is leading to
> failure, as I ask below.

(gdb) print gd->fdt_blob
$13 = (const void *) 0xc01cf85c
(gdb) x/4x gd->fdt_blob
0xc01cf85c:    0xc01b814c     0xedfe0dd0     0xa0670100     0x38000000

Notice how u-boot thinks fdt_blob is at 0xc01cf85c. However it was 
actually loaded at 0xc01cf860, as evidenced by the doodfeed signature. 
This is with the 8-byte alignment.

Alex
Marek Vasut Oct. 20, 2020, 2:42 p.m. UTC | #20
On 10/20/20 4:32 PM, Tom Rini wrote:
> On Tue, Oct 20, 2020 at 04:29:36PM +0200, Marek Vasut wrote:
>> On 10/20/20 4:07 PM, Tom Rini wrote:
>>> On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
>>>> On 10/20/20 2:27 AM, Reuben Dowle wrote:
>>>>>>> What assumptions? Any code that assumes 4 byte alignment will also work
>>>>>> on 8 byte alignment.
>>>>>>>
>>>>>>> Reverting is not the same as assuming ALIGN(...4) if incoming data is not
>>>>>> already aligned to 4 bytes (as was the case when I saw crashes).
>>>>>>
>>>>>> Can the incoming data _not_ be 4 byte aligned ?
>>>>>> How can this be replicated ?
>>>>>
>>>>> In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
>>>>>
>>>>> Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
>>>>>
>>>>>>> Interesting. I had not noticed the -B parameter previously. I had originally
>>>>>> fixed this issue on an older version of uboot that did not have that option,
>>>>>> and later rebased the fix to newer uboot. I would need to do some testing to
>>>>>> see if this would fix it as well.
>>>>>>
>>>>>> I believe this is the way to handle this if you are building a custom DT for U-
>>>>>> Boot -- just make sure it has the correct parameters. I think this is also related
>>>>>> to:
>>>>>> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
>>>>>
>>>>> I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
>>>>
>>>> In that case, I would propose to revert this to fix existing boards in
>>>> mainline, and if your tests are not successful, revisit this issue.
>>>>
>>>> I am rather sure what you are hitting is related to the mkimage patch
>>>> above, there was a lengthy discussion on that topic before.
>>>
>>> This gets back to what I was saying earlier in this thread.  But to
>>> expand on it, we have been, but cannot, use the same variable for both
>>> "this is where we have the DTB at runtime to use" and "this is where the
>>> DTB happens to exist when we get here".  For the case of "we copy the
>>> device tree to $address", $address must be 8 bytes aligned.  For the
>>> case of "we use an externally provided DTB in place" I don't like the
>>> idea of, and worry a lot about, assuming it's going to be 8 byte
>>> aligned.  But I can set that aside for the moment.  That said, in that
>>> second case we need to set $address to where the device tree is.
>>
>> I don't think I understand this paragraph at all ?
> 
> OK.  Maybe I can better explain it after you walk through how changing
> the "copy the DTB to this address" to be 8 byte aligned is leading to
> failure, as I ask below.
> 
>>> That all said, I'm still not quite sure how you're ending up in the
>>> place you're ending up.  Which if/else paths in spl_fit_append_fdt() is
>>> your exact platform hitting, and where is what in memory?
>>
>> Is this a question for me or for Reuben ?
> 
> For you, the person with the current failure.  Please walk me through
> how / where that function is now failing.  With address values
> (approximate if you can't get the exact ones).

I currently don't have time for that, sorry. But Alex (on CC) was
debugging it yesterday, so I will defer there.
Tom Rini Oct. 20, 2020, 3:54 p.m. UTC | #21
On Tue, Oct 20, 2020 at 09:38:52AM -0500, Alex G. wrote:
> On 10/20/20 9:32 AM, Tom Rini wrote:
> > On Tue, Oct 20, 2020 at 04:29:36PM +0200, Marek Vasut wrote:
> > > On 10/20/20 4:07 PM, Tom Rini wrote:
> > > > On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
> > > > > On 10/20/20 2:27 AM, Reuben Dowle wrote:
> > > > > > > > What assumptions? Any code that assumes 4 byte alignment will also work
> > > > > > > on 8 byte alignment.
> > > > > > > > 
> > > > > > > > Reverting is not the same as assuming ALIGN(...4) if incoming data is not
> > > > > > > already aligned to 4 bytes (as was the case when I saw crashes).
> > > > > > > 
> > > > > > > Can the incoming data _not_ be 4 byte aligned ?
> > > > > > > How can this be replicated ?
> > > > > > 
> > > > > > In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
> > > > > > 
> > > > > > Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
> > > > > > 
> > > > > > > > Interesting. I had not noticed the -B parameter previously. I had originally
> > > > > > > fixed this issue on an older version of uboot that did not have that option,
> > > > > > > and later rebased the fix to newer uboot. I would need to do some testing to
> > > > > > > see if this would fix it as well.
> > > > > > > 
> > > > > > > I believe this is the way to handle this if you are building a custom DT for U-
> > > > > > > Boot -- just make sure it has the correct parameters. I think this is also related
> > > > > > > to:
> > > > > > > 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
> > > > > > 
> > > > > > I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
> > > > > 
> > > > > In that case, I would propose to revert this to fix existing boards in
> > > > > mainline, and if your tests are not successful, revisit this issue.
> > > > > 
> > > > > I am rather sure what you are hitting is related to the mkimage patch
> > > > > above, there was a lengthy discussion on that topic before.
> > > > 
> > > > This gets back to what I was saying earlier in this thread.  But to
> > > > expand on it, we have been, but cannot, use the same variable for both
> > > > "this is where we have the DTB at runtime to use" and "this is where the
> > > > DTB happens to exist when we get here".  For the case of "we copy the
> > > > device tree to $address", $address must be 8 bytes aligned.  For the
> > > > case of "we use an externally provided DTB in place" I don't like the
> > > > idea of, and worry a lot about, assuming it's going to be 8 byte
> > > > aligned.  But I can set that aside for the moment.  That said, in that
> > > > second case we need to set $address to where the device tree is.
> > > 
> > > I don't think I understand this paragraph at all ?
> > 
> > OK.  Maybe I can better explain it after you walk through how changing
> > the "copy the DTB to this address" to be 8 byte aligned is leading to
> > failure, as I ask below.
> 
> (gdb) print gd->fdt_blob
> $13 = (const void *) 0xc01cf85c
> (gdb) x/4x gd->fdt_blob
> 0xc01cf85c:    0xc01b814c     0xedfe0dd0     0xa0670100     0x38000000
> 
> Notice how u-boot thinks fdt_blob is at 0xc01cf85c. However it was actually
> loaded at 0xc01cf860, as evidenced by the doodfeed signature. This is with
> the 8-byte alignment.

Ah, thanks.  So we're in this part of the code (conditions changed to results):
        if (TRUE) {
                debug("%s: cannot find FDT node\n", __func__);

                /*
                 * U-Boot did not find a device tree inside the FIT image. Use
                 * the U-Boot device tree instead.
                 */
                if (TRUE)
                        memcpy((void *)image_info.load_addr, gd->fdt_blob,
                               fdt_totalsize(gd->fdt_blob));
 

So, our destination is what changed but gd->fdt_blob is 4 lower than the
correct value.  Isn't that the problem?  Or am I missing something
still?  Thanks!
Alex G. Oct. 20, 2020, 5:01 p.m. UTC | #22
On 10/20/20 10:54 AM, Tom Rini wrote:
> On Tue, Oct 20, 2020 at 09:38:52AM -0500, Alex G. wrote:
>> On 10/20/20 9:32 AM, Tom Rini wrote:
>>> On Tue, Oct 20, 2020 at 04:29:36PM +0200, Marek Vasut wrote:
>>>> On 10/20/20 4:07 PM, Tom Rini wrote:
>>>>> On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
>>>>>> On 10/20/20 2:27 AM, Reuben Dowle wrote:
>>>>>>>>> What assumptions? Any code that assumes 4 byte alignment will also work
>>>>>>>> on 8 byte alignment.
>>>>>>>>>
>>>>>>>>> Reverting is not the same as assuming ALIGN(...4) if incoming data is not
>>>>>>>> already aligned to 4 bytes (as was the case when I saw crashes).
>>>>>>>>
>>>>>>>> Can the incoming data _not_ be 4 byte aligned ?
>>>>>>>> How can this be replicated ?
>>>>>>>
>>>>>>> In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
>>>>>>>
>>>>>>> Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
>>>>>>>
>>>>>>>>> Interesting. I had not noticed the -B parameter previously. I had originally
>>>>>>>> fixed this issue on an older version of uboot that did not have that option,
>>>>>>>> and later rebased the fix to newer uboot. I would need to do some testing to
>>>>>>>> see if this would fix it as well.
>>>>>>>>
>>>>>>>> I believe this is the way to handle this if you are building a custom DT for U-
>>>>>>>> Boot -- just make sure it has the correct parameters. I think this is also related
>>>>>>>> to:
>>>>>>>> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
>>>>>>>
>>>>>>> I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
>>>>>>
>>>>>> In that case, I would propose to revert this to fix existing boards in
>>>>>> mainline, and if your tests are not successful, revisit this issue.
>>>>>>
>>>>>> I am rather sure what you are hitting is related to the mkimage patch
>>>>>> above, there was a lengthy discussion on that topic before.
>>>>>
>>>>> This gets back to what I was saying earlier in this thread.  But to
>>>>> expand on it, we have been, but cannot, use the same variable for both
>>>>> "this is where we have the DTB at runtime to use" and "this is where the
>>>>> DTB happens to exist when we get here".  For the case of "we copy the
>>>>> device tree to $address", $address must be 8 bytes aligned.  For the
>>>>> case of "we use an externally provided DTB in place" I don't like the
>>>>> idea of, and worry a lot about, assuming it's going to be 8 byte
>>>>> aligned.  But I can set that aside for the moment.  That said, in that
>>>>> second case we need to set $address to where the device tree is.
>>>>
>>>> I don't think I understand this paragraph at all ?
>>>
>>> OK.  Maybe I can better explain it after you walk through how changing
>>> the "copy the DTB to this address" to be 8 byte aligned is leading to
>>> failure, as I ask below.
>>
>> (gdb) print gd->fdt_blob
>> $13 = (const void *) 0xc01cf85c
>> (gdb) x/4x gd->fdt_blob
>> 0xc01cf85c:    0xc01b814c     0xedfe0dd0     0xa0670100     0x38000000
>>
>> Notice how u-boot thinks fdt_blob is at 0xc01cf85c. However it was actually
>> loaded at 0xc01cf860, as evidenced by the doodfeed signature. This is with
>> the 8-byte alignment.
> 
> Ah, thanks.  So we're in this part of the code (conditions changed to results):
>          if (TRUE) {
>                  debug("%s: cannot find FDT node\n", __func__);
> 
>                  /*
>                   * U-Boot did not find a device tree inside the FIT image. Use
>                   * the U-Boot device tree instead.
>                   */
>                  if (TRUE)
>                          memcpy((void *)image_info.load_addr, gd->fdt_blob,
>                                 fdt_totalsize(gd->fdt_blob));
>   
> 
> So, our destination is what changed but gd->fdt_blob is 4 lower than the
> correct value.  Isn't that the problem?  Or am I missing something
> still?  Thanks!

That is incorrect:

	node = 465;
	...
	} else { 

		ret = spl_load_fit_image(info, sector, fit, base_offset,
				node,&image_info);

I think there is confusion between the writer of the FDT and the reader. 
This code is in SPL, the writer. It writes the FDT to an address I shall 
call x1. The reader, which is u-boot TPL, will look at address x2.

It's obvious that for the FDT to be passed successfully, x1 == x2


## For x1:

image_info.load_addr = ALIGN(spl_image->load_addr + spl_image->size, 8);

	(gdb) print/x (spl_image->load_addr + spl_image->size)
	$19 = 0xc01cf85c
	(gdb) print/x image_info->load_addr
	$20 = 0xc01cf860

x1 is 0xc01cf860


## For x2:

	__weak void *board_fdt_blob_setup(void)
	{
		/* FDT is at end of image */ 

		fdt_blob = (ulong *)&_end;

	(gdb) print &_end
	$22 = (char (*)[]) 0xc01cf85c

x2 = 0xc01cf85c

As I have hopefully demonstrated, SPL and TPL do _not_ agree on the 
location of the FDT, hence the failure.

Alex
Tom Rini Oct. 20, 2020, 6:10 p.m. UTC | #23
On Tue, Oct 20, 2020 at 12:01:02PM -0500, Alex G. wrote:
> On 10/20/20 10:54 AM, Tom Rini wrote:
> > On Tue, Oct 20, 2020 at 09:38:52AM -0500, Alex G. wrote:
> > > On 10/20/20 9:32 AM, Tom Rini wrote:
> > > > On Tue, Oct 20, 2020 at 04:29:36PM +0200, Marek Vasut wrote:
> > > > > On 10/20/20 4:07 PM, Tom Rini wrote:
> > > > > > On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
> > > > > > > On 10/20/20 2:27 AM, Reuben Dowle wrote:
> > > > > > > > > > What assumptions? Any code that assumes 4 byte alignment will also work
> > > > > > > > > on 8 byte alignment.
> > > > > > > > > > 
> > > > > > > > > > Reverting is not the same as assuming ALIGN(...4) if incoming data is not
> > > > > > > > > already aligned to 4 bytes (as was the case when I saw crashes).
> > > > > > > > > 
> > > > > > > > > Can the incoming data _not_ be 4 byte aligned ?
> > > > > > > > > How can this be replicated ?
> > > > > > > > 
> > > > > > > > In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
> > > > > > > > 
> > > > > > > > Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
> > > > > > > > 
> > > > > > > > > > Interesting. I had not noticed the -B parameter previously. I had originally
> > > > > > > > > fixed this issue on an older version of uboot that did not have that option,
> > > > > > > > > and later rebased the fix to newer uboot. I would need to do some testing to
> > > > > > > > > see if this would fix it as well.
> > > > > > > > > 
> > > > > > > > > I believe this is the way to handle this if you are building a custom DT for U-
> > > > > > > > > Boot -- just make sure it has the correct parameters. I think this is also related
> > > > > > > > > to:
> > > > > > > > > 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
> > > > > > > > 
> > > > > > > > I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
> > > > > > > 
> > > > > > > In that case, I would propose to revert this to fix existing boards in
> > > > > > > mainline, and if your tests are not successful, revisit this issue.
> > > > > > > 
> > > > > > > I am rather sure what you are hitting is related to the mkimage patch
> > > > > > > above, there was a lengthy discussion on that topic before.
> > > > > > 
> > > > > > This gets back to what I was saying earlier in this thread.  But to
> > > > > > expand on it, we have been, but cannot, use the same variable for both
> > > > > > "this is where we have the DTB at runtime to use" and "this is where the
> > > > > > DTB happens to exist when we get here".  For the case of "we copy the
> > > > > > device tree to $address", $address must be 8 bytes aligned.  For the
> > > > > > case of "we use an externally provided DTB in place" I don't like the
> > > > > > idea of, and worry a lot about, assuming it's going to be 8 byte
> > > > > > aligned.  But I can set that aside for the moment.  That said, in that
> > > > > > second case we need to set $address to where the device tree is.
> > > > > 
> > > > > I don't think I understand this paragraph at all ?
> > > > 
> > > > OK.  Maybe I can better explain it after you walk through how changing
> > > > the "copy the DTB to this address" to be 8 byte aligned is leading to
> > > > failure, as I ask below.
> > > 
> > > (gdb) print gd->fdt_blob
> > > $13 = (const void *) 0xc01cf85c
> > > (gdb) x/4x gd->fdt_blob
> > > 0xc01cf85c:    0xc01b814c     0xedfe0dd0     0xa0670100     0x38000000
> > > 
> > > Notice how u-boot thinks fdt_blob is at 0xc01cf85c. However it was actually
> > > loaded at 0xc01cf860, as evidenced by the doodfeed signature. This is with
> > > the 8-byte alignment.
> > 
> > Ah, thanks.  So we're in this part of the code (conditions changed to results):
> >          if (TRUE) {
> >                  debug("%s: cannot find FDT node\n", __func__);
> > 
> >                  /*
> >                   * U-Boot did not find a device tree inside the FIT image. Use
> >                   * the U-Boot device tree instead.
> >                   */
> >                  if (TRUE)
> >                          memcpy((void *)image_info.load_addr, gd->fdt_blob,
> >                                 fdt_totalsize(gd->fdt_blob));
> > 
> > So, our destination is what changed but gd->fdt_blob is 4 lower than the
> > correct value.  Isn't that the problem?  Or am I missing something
> > still?  Thanks!
> 
> That is incorrect:
> 
> 	node = 465;
> 	...
> 	} else {
> 
> 		ret = spl_load_fit_image(info, sector, fit, base_offset,
> 				node,&image_info);
> 
> I think there is confusion between the writer of the FDT and the reader.
> This code is in SPL, the writer. It writes the FDT to an address I shall
> call x1. The reader, which is u-boot TPL, will look at address x2.
> 
> It's obvious that for the FDT to be passed successfully, x1 == x2
> 
> 
> ## For x1:
> 
> image_info.load_addr = ALIGN(spl_image->load_addr + spl_image->size, 8);
> 
> 	(gdb) print/x (spl_image->load_addr + spl_image->size)
> 	$19 = 0xc01cf85c
> 	(gdb) print/x image_info->load_addr
> 	$20 = 0xc01cf860
> 
> x1 is 0xc01cf860
> 
> 
> ## For x2:
> 
> 	__weak void *board_fdt_blob_setup(void)
> 	{
> 		/* FDT is at end of image */
> 
> 		fdt_blob = (ulong *)&_end;
> 
> 	(gdb) print &_end
> 	$22 = (char (*)[]) 0xc01cf85c
> 
> x2 = 0xc01cf85c
> 
> As I have hopefully demonstrated, SPL and TPL do _not_ agree on the location
> of the FDT, hence the failure.

Thanks, that's helpful.  I guess we have a few things to sort out.
First, we don't have a general good way to pass from prior stage to next
stage "the device tree to use is HERE".  That's I think why we're in
this particular mess.  Second, the comment in spl_fit_append_fdt() needs
to be made to reflect where the consumer will be looking for it.  Third,
and harder, is that we need to make sure that we're using the device
tree at an 8 byte aligned address, due to the alignment requirements of
our overall next consumers.

I don't expect you (or Marek) to solve #1.  For #2, a follow-up on the
revert to clarify the comment block would be much appreciated.  For #3,
that too requires some harder work as to me the more stringent
requirement is that once full U-Boot is at the point of using and
validating the device tree, it needs to be 8-byte aligned.
Alex G. Oct. 21, 2020, 5:11 p.m. UTC | #24
On 10/20/20 1:10 PM, Tom Rini wrote:
> On Tue, Oct 20, 2020 at 12:01:02PM -0500, Alex G. wrote:
>> On 10/20/20 10:54 AM, Tom Rini wrote:
>>> On Tue, Oct 20, 2020 at 09:38:52AM -0500, Alex G. wrote:
>>>> On 10/20/20 9:32 AM, Tom Rini wrote:
>>>>> On Tue, Oct 20, 2020 at 04:29:36PM +0200, Marek Vasut wrote:
>>>>>> On 10/20/20 4:07 PM, Tom Rini wrote:
>>>>>>> On Tue, Oct 20, 2020 at 11:05:40AM +0200, Marek Vasut wrote:
>>>>>>>> On 10/20/20 2:27 AM, Reuben Dowle wrote:
>>>>>>>>>>> What assumptions? Any code that assumes 4 byte alignment will also work
>>>>>>>>>> on 8 byte alignment.
>>>>>>>>>>>
>>>>>>>>>>> Reverting is not the same as assuming ALIGN(...4) if incoming data is not
>>>>>>>>>> already aligned to 4 bytes (as was the case when I saw crashes).
>>>>>>>>>>
>>>>>>>>>> Can the incoming data _not_ be 4 byte aligned ?
>>>>>>>>>> How can this be replicated ?
>>>>>>>>>
>>>>>>>>> In my case I have an offline signing process (separate from build server to keep secure boot keys safe), and this runs a script which also patches the main uboot device tree with some extra properties, then updates main uboot dtb with kernel signature, then finally updates the spl dtb with the uboot signature. I think when mkimage patches the dtb with the signatures, this results in the alignment issues (the unsigned bootloader direct from the uboot make process does not experience this issue).
>>>>>>>>>
>>>>>>>>> Possibly using mkimage to add padding would also fix the alignment issue I see at boot time.
>>>>>>>>>
>>>>>>>>>>> Interesting. I had not noticed the -B parameter previously. I had originally
>>>>>>>>>> fixed this issue on an older version of uboot that did not have that option,
>>>>>>>>>> and later rebased the fix to newer uboot. I would need to do some testing to
>>>>>>>>>> see if this would fix it as well.
>>>>>>>>>>
>>>>>>>>>> I believe this is the way to handle this if you are building a custom DT for U-
>>>>>>>>>> Boot -- just make sure it has the correct parameters. I think this is also related
>>>>>>>>>> to:
>>>>>>>>>> 20a154f95b ("mkimage: fit: Do not tail-pad fitImage with external data")
>>>>>>>>>
>>>>>>>>> I will look into this, although unfortunately I am very busy with other projects right now so can't retest th
>>>>>>>>
>>>>>>>> In that case, I would propose to revert this to fix existing boards in
>>>>>>>> mainline, and if your tests are not successful, revisit this issue.
>>>>>>>>
>>>>>>>> I am rather sure what you are hitting is related to the mkimage patch
>>>>>>>> above, there was a lengthy discussion on that topic before.
>>>>>>>
>>>>>>> This gets back to what I was saying earlier in this thread.  But to
>>>>>>> expand on it, we have been, but cannot, use the same variable for both
>>>>>>> "this is where we have the DTB at runtime to use" and "this is where the
>>>>>>> DTB happens to exist when we get here".  For the case of "we copy the
>>>>>>> device tree to $address", $address must be 8 bytes aligned.  For the
>>>>>>> case of "we use an externally provided DTB in place" I don't like the
>>>>>>> idea of, and worry a lot about, assuming it's going to be 8 byte
>>>>>>> aligned.  But I can set that aside for the moment.  That said, in that
>>>>>>> second case we need to set $address to where the device tree is.
>>>>>>
>>>>>> I don't think I understand this paragraph at all ?
>>>>>
>>>>> OK.  Maybe I can better explain it after you walk through how changing
>>>>> the "copy the DTB to this address" to be 8 byte aligned is leading to
>>>>> failure, as I ask below.
>>>>
>>>> (gdb) print gd->fdt_blob
>>>> $13 = (const void *) 0xc01cf85c
>>>> (gdb) x/4x gd->fdt_blob
>>>> 0xc01cf85c:    0xc01b814c     0xedfe0dd0     0xa0670100     0x38000000
>>>>
>>>> Notice how u-boot thinks fdt_blob is at 0xc01cf85c. However it was actually
>>>> loaded at 0xc01cf860, as evidenced by the doodfeed signature. This is with
>>>> the 8-byte alignment.
>>>
>>> Ah, thanks.  So we're in this part of the code (conditions changed to results):
>>>           if (TRUE) {
>>>                   debug("%s: cannot find FDT node\n", __func__);
>>>
>>>                   /*
>>>                    * U-Boot did not find a device tree inside the FIT image. Use
>>>                    * the U-Boot device tree instead.
>>>                    */
>>>                   if (TRUE)
>>>                           memcpy((void *)image_info.load_addr, gd->fdt_blob,
>>>                                  fdt_totalsize(gd->fdt_blob));
>>>
>>> So, our destination is what changed but gd->fdt_blob is 4 lower than the
>>> correct value.  Isn't that the problem?  Or am I missing something
>>> still?  Thanks!
>>
>> That is incorrect:
>>
>> 	node = 465;
>> 	...
>> 	} else {
>>
>> 		ret = spl_load_fit_image(info, sector, fit, base_offset,
>> 				node,&image_info);
>>
>> I think there is confusion between the writer of the FDT and the reader.
>> This code is in SPL, the writer. It writes the FDT to an address I shall
>> call x1. The reader, which is u-boot TPL, will look at address x2.
>>
>> It's obvious that for the FDT to be passed successfully, x1 == x2
>>
>>
>> ## For x1:
>>
>> image_info.load_addr = ALIGN(spl_image->load_addr + spl_image->size, 8);
>>
>> 	(gdb) print/x (spl_image->load_addr + spl_image->size)
>> 	$19 = 0xc01cf85c
>> 	(gdb) print/x image_info->load_addr
>> 	$20 = 0xc01cf860
>>
>> x1 is 0xc01cf860
>>
>>
>> ## For x2:
>>
>> 	__weak void *board_fdt_blob_setup(void)
>> 	{
>> 		/* FDT is at end of image */
>>
>> 		fdt_blob = (ulong *)&_end;
>>
>> 	(gdb) print &_end
>> 	$22 = (char (*)[]) 0xc01cf85c
>>
>> x2 = 0xc01cf85c
>>
>> As I have hopefully demonstrated, SPL and TPL do _not_ agree on the location
>> of the FDT, hence the failure.
> 
> Thanks, that's helpful.  I guess we have a few things to sort out.
> First, we don't have a general good way to pass from prior stage to next
> stage "the device tree to use is HERE".  That's I think why we're in
> this particular mess.  Second, the comment in spl_fit_append_fdt() needs
> to be made to reflect where the consumer will be looking for it.  Third,
> and harder, is that we need to make sure that we're using the device
> tree at an 8 byte aligned address, due to the alignment requirements of
> our overall next consumers.
> 
> I don't expect you (or Marek) to solve #1.  For #2, a follow-up on the
> revert to clarify the comment block would be much appreciated.  For #3,
> that too requires some harder work as to me the more stringent
> requirement is that once full U-Boot is at the point of using and
> validating the device tree, it needs to be 8-byte aligned.

Let's do this. Let's merge this revert, and un-break things. 
Re-introduce the FDT alignment once the other issues are ironed out.

Regarding #1, why not use r2, like everyone else? At least on arm.

Alex
Tom Rini Oct. 21, 2020, 11:12 p.m. UTC | #25
On Mon, Oct 19, 2020 at 11:40:26PM +0200, Marek Vasut wrote:

> This reverts commit eb39d8ba5f0d1468b01b89a2a464d18612d3ea76.
> The commit breaks booting of fitImage by SPL, the system simply hangs.
> This is because on arm32, the fitImage and all of its content can be
> aligned to 4 bytes and U-Boot expects just that.
> 
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Reuben Dowle <reuben.dowle@4rf.com>
> Cc: Tom Rini <trini@konsulko.com>
> Signed-off-by: Marek Vasut <marex@denx.de>

Applied to u-boot/master, thanks!
diff mbox series

Patch

diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c
index 0e27ad1d6a..a90d821c82 100644
--- a/common/spl/spl_fit.c
+++ b/common/spl/spl_fit.c
@@ -349,12 +349,9 @@  static int spl_fit_append_fdt(struct spl_image_info *spl_image,
 
 	/*
 	 * Use the address following the image as target address for the
-	 * device tree. Load address is aligned to 8 bytes to match the required
-	 * alignment specified for linux arm [1] and arm 64 [2] booting
-	 * [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm/booting.rst#n126
-	 * [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm64/booting.rst#n45
+	 * device tree.
 	 */
-	image_info.load_addr = ALIGN(spl_image->load_addr + spl_image->size, 8);
+	image_info.load_addr = spl_image->load_addr + spl_image->size;
 
 	/* Figure out which device tree the board wants to use */
 	node = spl_fit_get_image_node(fit, images, FIT_FDT_PROP, index++);