diff mbox series

arm: stm32mp: Fix board_get_usable_ram_top() again

Message ID 20230105012222.238075-1-marex@denx.de
State Accepted
Commit a2e0b041d662e4c80502cd5c9a8326d026f9deb1
Delegated to: Patrick Delaunay
Headers show
Series arm: stm32mp: Fix board_get_usable_ram_top() again | expand

Commit Message

Marek Vasut Jan. 5, 2023, 1:22 a.m. UTC
Do not access gd->ram_size and assume this is actual valid RAM size. Since commit
777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow")
the RAM size may be less than gd->ram_size , call get_effective_memsize() to get
the limited value instead.

The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM
at 0xc0000000 hang on boot, which is a grave defect.

Signed-off-by: Marek Vasut <marex@denx.de>
---
Cc: Pali Rohar <pali@kernel.org>
Cc: Patrice Chotard <patrice.chotard@foss.st.com>
Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
Cc: Tom Rini <trini@konsulko.com>
---
 arch/arm/mach-stm32mp/dram_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Patrick Delaunay Jan. 5, 2023, 10:03 a.m. UTC | #1
Hi,

On 1/5/23 02:22, Marek Vasut wrote:
> Do not access gd->ram_size and assume this is actual valid RAM size. Since commit
> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow")
> the RAM size may be less than gd->ram_size , call get_effective_memsize() to get
> the limited value instead.
>
> The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM
> at 0xc0000000 hang on boot, which is a grave defect.
>
> Signed-off-by: Marek Vasut <marex@denx.de>
> ---
> Cc: Pali Rohar <pali@kernel.org>
> Cc: Patrice Chotard <patrice.chotard@foss.st.com>
> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
> Cc: Tom Rini <trini@konsulko.com>
> ---
>   arch/arm/mach-stm32mp/dram_init.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c
> index 9346fa8546d..80ba5c27741 100644
> --- a/arch/arm/mach-stm32mp/dram_init.c
> +++ b/arch/arm/mach-stm32mp/dram_init.c
> @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size)
>   
>   	/* found enough not-reserved memory to relocated U-Boot */
>   	lmb_init(&lmb);
> -	lmb_add(&lmb, gd->ram_base, gd->ram_size);
> +	lmb_add(&lmb, gd->ram_base, get_effective_memsize());
>   	boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob);
>   	/* add 8M for reserved memory for display, fdt, gd,... */
>   	size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE),


Thanks for the patch

Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com>


Even if I don't uderstood where the code blocked here...

I need to cros-check it.




But I think it is more a regression introduced by

777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow")


in this patch I don't understood the assumption:

* It is required that ram_base + ram_size must be representable by
* phys_size_t type and must be aligned by direct access


why add now this limitation ?

why phys_size_t here and not phys_addr_t or unsigned long, the ram_base type ?


why align on 4kB and not on MMU_SECTION_SIZE ?



until now on ARM 32bits U-Boot is working with

   ram_base = 0xC0000000 (unsigned long)
   ram_size = 0x40000000 (phys_size_t)

overflow for ram_top (phys_addr_t) = ram_base + get_effective_memsize() was previously managed in the code
as we work only on base address + size (in board_f.c, in lmb library)

I correct in the code overflow issues, for example in commit 54be09cd8f6e ("arm: caches: manage phys
addr_t overflow in mmu_set_region_dcache_behaviour").


moreover for me it is strange to reduce the size of the DDR banks by default
(see resutl of bdinfo command) only just to avoid overflow:

common/board_f.c:267:	gd->bd->bi_dram[0].size = get_effective_memsize();


For me the real assumption is on the end of RAM:

"ram_base + ram_size - 1" must be representable by 'ram_base' type
(phys_addr_t or unsigned long ?)



+
+	/*
+	 * Check for overflow and limit ram size.
+	 * It is required that ram_base + ram_size - 1 must be representable
+	 * by ram_base type, unsigned long.
+	 */
+	if (gd->ram_base - 1 + ram_size < gd->ram_base)
+		ram_size = ((unsigned long)~0x0ULL) - gd->ram_base;



=> no need to remove 4KB is here... base + size can reach the MAX value for the type + 1


Patrick
Marek Vasut Jan. 5, 2023, 12:11 p.m. UTC | #2
On 1/5/23 11:03, Patrick DELAUNAY wrote:
> Hi,

Hi,

> On 1/5/23 02:22, Marek Vasut wrote:
>> Do not access gd->ram_size and assume this is actual valid RAM size. 
>> Since commit
>> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check 
>> for overflow")
>> the RAM size may be less than gd->ram_size , call 
>> get_effective_memsize() to get
>> the limited value instead.
>>
>> The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM
>> at 0xc0000000 hang on boot, which is a grave defect.
>>
>> Signed-off-by: Marek Vasut <marex@denx.de>
>> ---
>> Cc: Pali Rohar <pali@kernel.org>
>> Cc: Patrice Chotard <patrice.chotard@foss.st.com>
>> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
>> Cc: Tom Rini <trini@konsulko.com>
>> ---
>>   arch/arm/mach-stm32mp/dram_init.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/mach-stm32mp/dram_init.c 
>> b/arch/arm/mach-stm32mp/dram_init.c
>> index 9346fa8546d..80ba5c27741 100644
>> --- a/arch/arm/mach-stm32mp/dram_init.c
>> +++ b/arch/arm/mach-stm32mp/dram_init.c
>> @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t 
>> total_size)
>>       /* found enough not-reserved memory to relocated U-Boot */
>>       lmb_init(&lmb);
>> -    lmb_add(&lmb, gd->ram_base, gd->ram_size);
>> +    lmb_add(&lmb, gd->ram_base, get_effective_memsize());
>>       boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob);
>>       /* add 8M for reserved memory for display, fdt, gd,... */
>>       size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, 
>> MMU_SECTION_SIZE),
> 
> 
> Thanks for the patch
> 
> Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com>

Thank you.

> Even if I don't uderstood where the code blocked here...
> 
> I need to cros-check it.

Please do, thank you.

> But I think it is more a regression introduced by
> 
> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for 
> overflow")

I agree that commit likely breaks more systems than just STM32MP15xx , 
and I am tempted to say, it should be reverted before release.

Tom ?

> in this patch I don't understood the assumption:
> 
> * It is required that ram_base + ram_size must be representable by
> * phys_size_t type and must be aligned by direct access
> 
> 
> why add now this limitation ?
> 
> why phys_size_t here and not phys_addr_t or unsigned long, the ram_base 
> type ?
> 
> 
> why align on 4kB and not on MMU_SECTION_SIZE ?
> 
> 
> 
> until now on ARM 32bits U-Boot is working with
> 
>    ram_base = 0xC0000000 (unsigned long)
>    ram_size = 0x40000000 (phys_size_t)
> 
> overflow for ram_top (phys_addr_t) = ram_base + get_effective_memsize() 
> was previously managed in the code
> as we work only on base address + size (in board_f.c, in lmb library)
> 
> I correct in the code overflow issues, for example in commit 
> 54be09cd8f6e ("arm: caches: manage phys
> addr_t overflow in mmu_set_region_dcache_behaviour").
> 
> 
> moreover for me it is strange to reduce the size of the DDR banks by 
> default
> (see resutl of bdinfo command) only just to avoid overflow:
> 
> common/board_f.c:267:    gd->bd->bi_dram[0].size = get_effective_memsize();
> 
> 
> For me the real assumption is on the end of RAM:
> 
> "ram_base + ram_size - 1" must be representable by 'ram_base' type
> (phys_addr_t or unsigned long ?)
> 
> 
> 
> +
> +    /*
> +     * Check for overflow and limit ram size.
> +     * It is required that ram_base + ram_size - 1 must be representable
> +     * by ram_base type, unsigned long.
> +     */
> +    if (gd->ram_base - 1 + ram_size < gd->ram_base)
> +        ram_size = ((unsigned long)~0x0ULL) - gd->ram_base;
> 
> 
> 
> => no need to remove 4KB is here... base + size can reach the MAX value 
> for the type + 1

Note that this -4 kiB is exactly what triggers the problem , i.e. if 
get_effective_memsize() returns 0x4000_0000 , all is fine, if it returns 
0x3fff0000 then the system hangs. I also tried subtracting 
MMU_SECTION_SIZE instead of 4k and that does not help, system still hangs.
Patrick Delaunay Jan. 5, 2023, 6:31 p.m. UTC | #3
Hi Marek,

On 1/5/23 13:11, Marek Vasut wrote:
> On 1/5/23 11:03, Patrick DELAUNAY wrote:
>> Hi,
>
> Hi,
>
>> On 1/5/23 02:22, Marek Vasut wrote:
>>> Do not access gd->ram_size and assume this is actual valid RAM size. 
>>> Since commit
>>> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check 
>>> for overflow")
>>> the RAM size may be less than gd->ram_size , call 
>>> get_effective_memsize() to get
>>> the limited value instead.
>>>
>>> The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM
>>> at 0xc0000000 hang on boot, which is a grave defect.
>>>
>>> Signed-off-by: Marek Vasut <marex@denx.de>
>>> ---
>>> Cc: Pali Rohar <pali@kernel.org>
>>> Cc: Patrice Chotard <patrice.chotard@foss.st.com>
>>> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
>>> Cc: Tom Rini <trini@konsulko.com>
>>> ---
>>>   arch/arm/mach-stm32mp/dram_init.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm/mach-stm32mp/dram_init.c 
>>> b/arch/arm/mach-stm32mp/dram_init.c
>>> index 9346fa8546d..80ba5c27741 100644
>>> --- a/arch/arm/mach-stm32mp/dram_init.c
>>> +++ b/arch/arm/mach-stm32mp/dram_init.c
>>> @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t 
>>> total_size)
>>>       /* found enough not-reserved memory to relocated U-Boot */
>>>       lmb_init(&lmb);
>>> -    lmb_add(&lmb, gd->ram_base, gd->ram_size);
>>> +    lmb_add(&lmb, gd->ram_base, get_effective_memsize());
>>>       boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob);
>>>       /* add 8M for reserved memory for display, fdt, gd,... */
>>>       size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, 
>>> MMU_SECTION_SIZE),
>>
>>
>> Thanks for the patch
>>
>> Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com>
>
> Thank you.
>
>> Even if I don't uderstood where the code blocked here...
>>
>> I need to cros-check it.
>
> Please do, thank you.


Tested, see after....


>
>> But I think it is more a regression introduced by
>>
>> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check 
>> for overflow")
>
> I agree that commit likely breaks more systems than just STM32MP15xx , 
> and I am tempted to say, it should be reverted before release.
>
> Tom ?
>
>> in this patch I don't understood the assumption:
>>
>> * It is required that ram_base + ram_size must be representable by
>> * phys_size_t type and must be aligned by direct access
>>
>>
>> why add now this limitation ?
>>
>> why phys_size_t here and not phys_addr_t or unsigned long, the 
>> ram_base type ?
>>
>>
>> why align on 4kB and not on MMU_SECTION_SIZE ?
>>
>>
>>
>> until now on ARM 32bits U-Boot is working with
>>
>>    ram_base = 0xC0000000 (unsigned long)
>>    ram_size = 0x40000000 (phys_size_t)
>>
>> overflow for ram_top (phys_addr_t) = ram_base + 
>> get_effective_memsize() was previously managed in the code
>> as we work only on base address + size (in board_f.c, in lmb library)
>>
>> I correct in the code overflow issues, for example in commit 
>> 54be09cd8f6e ("arm: caches: manage phys
>> addr_t overflow in mmu_set_region_dcache_behaviour").
>>
>>
>> moreover for me it is strange to reduce the size of the DDR banks by 
>> default
>> (see resutl of bdinfo command) only just to avoid overflow:
>>
>> common/board_f.c:267:    gd->bd->bi_dram[0].size = 
>> get_effective_memsize();
>>
>>
>> For me the real assumption is on the end of RAM:
>>
>> "ram_base + ram_size - 1" must be representable by 'ram_base' type
>> (phys_addr_t or unsigned long ?)
>>
>>
>>
>> +
>> +    /*
>> +     * Check for overflow and limit ram size.
>> +     * It is required that ram_base + ram_size - 1 must be 
>> representable
>> +     * by ram_base type, unsigned long.
>> +     */
>> +    if (gd->ram_base - 1 + ram_size < gd->ram_base)
>> +        ram_size = ((unsigned long)~0x0ULL) - gd->ram_base;
>>
>>
>>
>> => no need to remove 4KB is here... base + size can reach the MAX 
>> value for the type + 1
>
> Note that this -4 kiB is exactly what triggers the problem , i.e. if 
> get_effective_memsize() returns 0x4000_0000 , all is fine, if it 
> returns 0x3fff0000 then the system hangs. I also tried subtracting 
> MMU_SECTION_SIZE instead of 4k and that does not help, system still 
> hangs.


ok


I tested on STM32MP157C-EV1 on my side...

with 1GiB mermory size

U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE)



U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100)

stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
CPU: STM32MP157C?? Rev.Z
Model: STMicroelectronics STM32MP157C eval daughter on eval mother
Board: stm32mp1 in trusted mode (st,stm32mp157c-ev1)
stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
DRAM:  1 GiB
....


STM32MP> bdinfo
boot_params = 0x00000000
DRAM bank   = 0x00000000
-> start    = 0xc0000000
-> size     = 0x3ffff000         <<<<<< the patch is applied - 4kiB here !!!
flashstart  = 0x00000000
flashsize   = 0x00000000
flashoffset = 0x00000000
baudrate    = 115200 bps
relocaddr   = 0xfdb1d000
reloc off   = 0x3da1d000
Build       = 32-bit
current eth = ethernet@5800a000
ethaddr     = 00:80:e1:01:60:da
IP addr     = <NULL>
fdt_blob    = 0xfbb040a0
new_fdt     = 0xfbb040a0
fdt_size    = 0x00016de0
Video       = display-controller@5a001000 inactive
lmb_dump_all:
  memory.cnt  = 0x1
  memory[0]    [0xc0000000-0xffffefff], 0x3ffff000 bytes flags: 0        
<<<<
  reserved.cnt  = 0x6
  reserved[0]    [0x10000000-0x10045fff], 0x00046000 bytes flags: 4
  reserved[1]    [0x30000000-0x3003ffff], 0x00040000 bytes flags: 4
  reserved[2]    [0x38000000-0x3800ffff], 0x00010000 bytes flags: 4
  reserved[3]    [0xe8000000-0xefffffff], 0x08000000 bytes flags: 4
  reserved[4]    [0xfbaffbb8-0xfdffffff], 0x02500448 bytes flags: 0
  reserved[5]    [0xfe000000-0xffffffff], 0x02000000 bytes flags: 4     <<<<
devicetree  = board
arch_number = 0x00000000
TLB addr    = 0xfdff0000
irq_sp      = 0xfbb03e10
sp start    = 0xfbb03e00
Early malloc usage: 2924 / 20000
STM32MP>


But we have strange LMB reserved memory [0xfe000000-0xffffffff] from DT

=> it is outside of  memory[0]    [0xc0000000-0xffffefff]


I think the -4 kiB should be removed at least....



I test also with basic boot = SPL and U-Boot, no issue on my side



Patrick
Pali Rohár Jan. 5, 2023, 7:16 p.m. UTC | #4
Hello!

On Thursday 05 January 2023 11:03:07 Patrick DELAUNAY wrote:
> Hi,
> 
> On 1/5/23 02:22, Marek Vasut wrote:
> > Do not access gd->ram_size and assume this is actual valid RAM size. Since commit
> > 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow")
> > the RAM size may be less than gd->ram_size , call get_effective_memsize() to get
> > the limited value instead.
> > 
> > The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM
> > at 0xc0000000 hang on boot, which is a grave defect.
> > 
> > Signed-off-by: Marek Vasut <marex@denx.de>
> > ---
> > Cc: Pali Rohar <pali@kernel.org>
> > Cc: Patrice Chotard <patrice.chotard@foss.st.com>
> > Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
> > Cc: Tom Rini <trini@konsulko.com>
> > ---
> >   arch/arm/mach-stm32mp/dram_init.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c
> > index 9346fa8546d..80ba5c27741 100644
> > --- a/arch/arm/mach-stm32mp/dram_init.c
> > +++ b/arch/arm/mach-stm32mp/dram_init.c
> > @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size)
> >   	/* found enough not-reserved memory to relocated U-Boot */
> >   	lmb_init(&lmb);
> > -	lmb_add(&lmb, gd->ram_base, gd->ram_size);
> > +	lmb_add(&lmb, gd->ram_base, get_effective_memsize());
> >   	boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob);
> >   	/* add 8M for reserved memory for display, fdt, gd,... */
> >   	size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE),
> 
> 
> Thanks for the patch
> 
> Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com>
> 
> 
> Even if I don't uderstood where the code blocked here...
> 
> I need to cros-check it.
> 
> 
> 
> 
> But I think it is more a regression introduced by
> 
> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow")
> 
> 
> in this patch I don't understood the assumption:
> 
> * It is required that ram_base + ram_size must be representable by
> * phys_size_t type and must be aligned by direct access
> 
> 
> why add now this limitation ?
> 
> why phys_size_t here and not phys_addr_t or unsigned long, the ram_base type ?

Well, I think that it is quite confusing when it is phys_addr_t and when
phys_size_t. Because U-Boot on more places is doing addition:
phys_addr_t + phys_size_t.

And I was confused by it too, because get_effective_memsize() function
works only with variables of phys_size_t type.

So I would agree that it should be mentioned as phys_addr_t.

But is there any platform on which phys_addr_t and phys_size_t types
have different sizes?

Why it is not ram_base? Because there is no such type. Why it is not
phys_addr_t? Because some 32-bit platforms support more than 4GB of
physical memory and unsigned long type is small for them. Also there are
64-bit platforms with just 32-bit data bus (but I'm not sure if these
platforms are handled somehow specially with just 32-bit phys_addr_t).

> why align on 4kB and not on MMU_SECTION_SIZE ?

Because of my mistake and seems that nobody spotted or reported this
issue yet. I agree it should be some platform specific macro (I have not
checked now if MMU_SECTION_SIZE is the correct one).

> until now on ARM 32bits U-Boot is working with
> 
>   ram_base = 0xC0000000 (unsigned long)
>   ram_size = 0x40000000 (phys_size_t)
> 
> overflow for ram_top (phys_addr_t) = ram_base + get_effective_memsize() was previously managed in the code
> as we work only on base address + size (in board_f.c, in lmb library)
> 
> I correct in the code overflow issues, for example in commit 54be09cd8f6e ("arm: caches: manage phys
> addr_t overflow in mmu_set_region_dcache_behaviour").
> 
> 
> moreover for me it is strange to reduce the size of the DDR banks by default
> (see resutl of bdinfo command) only just to avoid overflow:
> 
> common/board_f.c:267:	gd->bd->bi_dram[0].size = get_effective_memsize();

Yes, this really looks strange to reduce total size of DDR banks to the
effective memory size which U-Boot can use or map. There is also config
option CONFIG_MAX_MEM_MAPPED to allow user to reduce mappable memory in
U-Boot to more strict value.

For sure DDR bank size should be set to the real size.

For example 32-bit U-Boot can map maximally 4GB of RAM, but on some
(32-bit) platforms there can be more than 4GB of DDR and operating
system may support to use more than 4GB of it. So DDR bank size passed
from bootloader to OS should be precise and not reduced to size
supported by bootloader.

> For me the real assumption is on the end of RAM:
> 
> "ram_base + ram_size - 1" must be representable by 'ram_base' type
> (phys_addr_t or unsigned long ?)

You are missing there +1. That is restriction in the U-Boot as it
requires that one byte after the RAM can be stored in both phys_addr_t
and phys_size_t variables.

Again, I wrote code comment slightly wrong. It should say that
ram_base + get_effective_memsize() must be representable by phys_addr_t
type.

> 
> 
> +
> +	/*
> +	 * Check for overflow and limit ram size.
> +	 * It is required that ram_base + ram_size - 1 must be representable
> +	 * by ram_base type, unsigned long.
> +	 */
> +	if (gd->ram_base - 1 + ram_size < gd->ram_base)
> +		ram_size = ((unsigned long)~0x0ULL) - gd->ram_base;
> 
> 
> 
> => no need to remove 4KB is here... base + size can reach the MAX value for the type + 1
> 
> 
> Patrick
> 

For example U-Boot code in common/board_f.c sets gd->ram_top to value:

gd->ram_top = gd->ram_base + get_effective_memsize();

So your above change with -1 just break this code.

I think that the best would be to fix U-Boot code so that it never use
last byte +1 in phys_addr_t.
Pali Rohár Jan. 5, 2023, 7:25 p.m. UTC | #5
Ok, so it is working...

On Thursday 05 January 2023 19:31:19 Patrick DELAUNAY wrote:
> I tested on STM32MP157C-EV1 on my side...
> 
> with 1GiB mermory size
> 
> U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE)
> 
> 
> 
> U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100)
> 
> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
> CPU: STM32MP157C?? Rev.Z
> Model: STMicroelectronics STM32MP157C eval daughter on eval mother
> Board: stm32mp1 in trusted mode (st,stm32mp157c-ev1)
> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
> DRAM:  1 GiB
> ....
> 
> 
> STM32MP> bdinfo
> boot_params = 0x00000000
> DRAM bank   = 0x00000000
> -> start    = 0xc0000000
> -> size     = 0x3ffff000         <<<<<< the patch is applied - 4kiB here !!!

But this is wrong. DRAM bank size should not be changed. I think that in
U-Boot is another issue which propagates mapped U-Boot RAM size/range to
the places where should be real DDR size.

> flashstart  = 0x00000000
> flashsize   = 0x00000000
> flashoffset = 0x00000000
> baudrate    = 115200 bps
> relocaddr   = 0xfdb1d000
> reloc off   = 0x3da1d000
> Build       = 32-bit
> current eth = ethernet@5800a000
> ethaddr     = 00:80:e1:01:60:da
> IP addr     = <NULL>
> fdt_blob    = 0xfbb040a0
> new_fdt     = 0xfbb040a0
> fdt_size    = 0x00016de0
> Video       = display-controller@5a001000 inactive
> lmb_dump_all:
>  memory.cnt  = 0x1
>  memory[0]    [0xc0000000-0xffffefff], 0x3ffff000 bytes flags: 0        <<<<
>  reserved.cnt  = 0x6
>  reserved[0]    [0x10000000-0x10045fff], 0x00046000 bytes flags: 4
>  reserved[1]    [0x30000000-0x3003ffff], 0x00040000 bytes flags: 4
>  reserved[2]    [0x38000000-0x3800ffff], 0x00010000 bytes flags: 4
>  reserved[3]    [0xe8000000-0xefffffff], 0x08000000 bytes flags: 4
>  reserved[4]    [0xfbaffbb8-0xfdffffff], 0x02500448 bytes flags: 0
>  reserved[5]    [0xfe000000-0xffffffff], 0x02000000 bytes flags: 4     <<<<
> devicetree  = board
> arch_number = 0x00000000
> TLB addr    = 0xfdff0000
> irq_sp      = 0xfbb03e10
> sp start    = 0xfbb03e00
> Early malloc usage: 2924 / 20000
> STM32MP>
> 
> 
> But we have strange LMB reserved memory [0xfe000000-0xffffffff] from DT
> 
> => it is outside of  memory[0]    [0xc0000000-0xffffefff]
> 
> 
> I think the -4 kiB should be removed at least....
> 
> 
> 
> I test also with basic boot = SPL and U-Boot, no issue on my side
> 
> 
> 
> Patrick
>
Marek Vasut Jan. 5, 2023, 8:35 p.m. UTC | #6
On 1/5/23 19:31, Patrick DELAUNAY wrote:
> Hi Marek,

Hi,

[...]

> I tested on STM32MP157C-EV1 on my side...
> 
> with 1GiB mermory size
> 
> U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE)
> 
> 
> 
> U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100)

Note that I am not running into this problem on next, but on current 
master . Also, I am using basic boot. Can you retest ?

[...]
Marek Vasut Jan. 5, 2023, 8:37 p.m. UTC | #7
On 1/5/23 20:25, Pali Rohar wrote:
> Ok, so it is working...

No, it is NOT working because the test performed is using completely 
different codebase.

The STM32MP15xx platform I have on my desk is hanging on boot with 
current U-Boot master, i.e upcoming release. With commit
777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for 
overflow")
reverted, the platform boots fine again.

[...]
Patrick Delaunay Jan. 6, 2023, 9:09 a.m. UTC | #8
Hi,


On 1/5/23 21:35, Marek Vasut wrote:
> On 1/5/23 19:31, Patrick DELAUNAY wrote:
>> Hi Marek,
>
> Hi,
>
> [...]
>
>> I tested on STM32MP157C-EV1 on my side...
>>
>> with 1GiB mermory size
>>
>> U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE)
>>
>>
>>
>> U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100)
>
> Note that I am not running into this problem on next, but on current 
> master . Also, I am using basic boot. Can you retest ?
>
> [...]


I retest on master for STM32MP157C-EV1, basic defconfig.

SHA1 = 8d6cbf5e6bc4e10e38b954763fa025caed517cc2


no blocking issue on my side....  I go up to U-Boot console.

but it is the same binary than you (not the same defconfig)

so your blcoking point is perhaps only masked.


regards.


Patrick
Patrick Delaunay Jan. 6, 2023, 9:13 a.m. UTC | #9
Hi,

On 1/5/23 20:25, Pali Rohar wrote:
> Ok, so it is working...
>
> On Thursday 05 January 2023 19:31:19 Patrick DELAUNAY wrote:
>> I tested on STM32MP157C-EV1 on my side...
>>
>> with 1GiB mermory size
>>
>> U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE)
>>
>>
>>
>> U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100)
>>
>> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
>> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
>> CPU: STM32MP157C?? Rev.Z
>> Model: STMicroelectronics STM32MP157C eval daughter on eval mother
>> Board: stm32mp1 in trusted mode (st,stm32mp157c-ev1)
>> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
>> DRAM:  1 GiB
>> ....
>>
>>
>> STM32MP> bdinfo
>> boot_params = 0x00000000
>> DRAM bank   = 0x00000000
>> -> start    = 0xc0000000
>> -> size     = 0x3ffff000         <<<<<< the patch is applied - 4kiB here !!!
> But this is wrong. DRAM bank size should not be changed. I think that in
> U-Boot is another issue which propagates mapped U-Boot RAM size/range to
> the places where should be real DDR size.


Yes it is wrong

and it is introduced by your patch...


in the STM32MP15x platfrorm are using the genric weak function


./common/board_f.c:267


__weak int dram_init_banksize(void)
{
     gd->bd->bi_dram[0].start = gd->ram_base;
     gd->bd->bi_dram[0].size = get_effective_memsize();

     return 0;
}

For me:

get_effective_memsize() must return the effective size of the DDR

and NOT -4KiB to avoid side effects in other part of the code (LMB for 
example)


I think today we need to revert your patch.


regards

Patrick
Patrick Delaunay Jan. 6, 2023, 12:16 p.m. UTC | #10
Hi,

On 1/5/23 20:16, Pali Rohar wrote:
> Hello!
>
> On Thursday 05 January 2023 11:03:07 Patrick DELAUNAY wrote:
>> Hi,
>>
>> On 1/5/23 02:22, Marek Vasut wrote:
>>> Do not access gd->ram_size and assume this is actual valid RAM size. Since commit
>>> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow")
>>> the RAM size may be less than gd->ram_size , call get_effective_memsize() to get
>>> the limited value instead.
>>>
>>> The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM
>>> at 0xc0000000 hang on boot, which is a grave defect.
>>>
>>> Signed-off-by: Marek Vasut <marex@denx.de>
>>> ---
>>> Cc: Pali Rohar <pali@kernel.org>
>>> Cc: Patrice Chotard <patrice.chotard@foss.st.com>
>>> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
>>> Cc: Tom Rini <trini@konsulko.com>
>>> ---
>>>    arch/arm/mach-stm32mp/dram_init.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c
>>> index 9346fa8546d..80ba5c27741 100644
>>> --- a/arch/arm/mach-stm32mp/dram_init.c
>>> +++ b/arch/arm/mach-stm32mp/dram_init.c
>>> @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size)
>>>    	/* found enough not-reserved memory to relocated U-Boot */
>>>    	lmb_init(&lmb);
>>> -	lmb_add(&lmb, gd->ram_base, gd->ram_size);
>>> +	lmb_add(&lmb, gd->ram_base, get_effective_memsize());
>>>    	boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob);
>>>    	/* add 8M for reserved memory for display, fdt, gd,... */
>>>    	size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE),
>>
>> Thanks for the patch
>>
>> Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com>
>>
>>
>> Even if I don't uderstood where the code blocked here...
>>
>> I need to cros-check it.
>>
>>
>>
>>
>> But I think it is more a regression introduced by
>>
>> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow")
>>
>>
>> in this patch I don't understood the assumption:
>>
>> * It is required that ram_base + ram_size must be representable by
>> * phys_size_t type and must be aligned by direct access
>>
>>
>> why add now this limitation ?
>>
>> why phys_size_t here and not phys_addr_t or unsigned long, the ram_base type ?
> Well, I think that it is quite confusing when it is phys_addr_t and when
> phys_size_t. Because U-Boot on more places is doing addition:
> phys_addr_t + phys_size_t.
>
> And I was confused by it too, because get_effective_memsize() function
> works only with variables of phys_size_t type.
>
> So I would agree that it should be mentioned as phys_addr_t.
>
> But is there any platform on which phys_addr_t and phys_size_t types
> have different sizes?


I don't know, it is just more clear

=> addr overflow based on address type


>
> Why it is not ram_base? Because there is no such type. Why it is not
> phys_addr_t? Because some 32-bit platforms support more than 4GB of
> physical memory and unsigned long type is small for them. Also there are
> 64-bit platforms with just 32-bit data bus (but I'm not sure if these
> platforms are handled somehow specially with just 32-bit phys_addr_t).
>
>> why align on 4kB and not on MMU_SECTION_SIZE ?
> Because of my mistake and seems that nobody spotted or reported this
> issue yet. I agree it should be some platform specific macro (I have not
> checked now if MMU_SECTION_SIZE is the correct one).


I propose MMU_SECTION_SIZE to allow MMU configuration up to end

of effective ram top.


>
>> until now on ARM 32bits U-Boot is working with
>>
>>    ram_base = 0xC0000000 (unsigned long)
>>    ram_size = 0x40000000 (phys_size_t)
>>
>> overflow for ram_top (phys_addr_t) = ram_base + get_effective_memsize() was previously managed in the code
>> as we work only on base address + size (in board_f.c, in lmb library)
>>
>> I correct in the code overflow issues, for example in commit 54be09cd8f6e ("arm: caches: manage phys
>> addr_t overflow in mmu_set_region_dcache_behaviour").
>>
>>
>> moreover for me it is strange to reduce the size of the DDR banks by default
>> (see resutl of bdinfo command) only just to avoid overflow:
>>
>> common/board_f.c:267:	gd->bd->bi_dram[0].size = get_effective_memsize();
> Yes, this really looks strange to reduce total size of DDR banks to the
> effective memory size which U-Boot can use or map. There is also config
> option CONFIG_MAX_MEM_MAPPED to allow user to reduce mappable memory in
> U-Boot to more strict value.
>
> For sure DDR bank size should be set to the real size.
>
> For example 32-bit U-Boot can map maximally 4GB of RAM, but on some
> (32-bit) platforms there can be more than 4GB of DDR and operating
> system may support to use more than 4GB of it. So DDR bank size passed
> from bootloader to OS should be precise and not reduced to size
> supported by bootloader.
>
>> For me the real assumption is on the end of RAM:
>>
>> "ram_base + ram_size - 1" must be representable by 'ram_base' type
>> (phys_addr_t or unsigned long ?)
> You are missing there +1. That is restriction in the U-Boot as it
> requires that one byte after the RAM can be stored in both phys_addr_t
> and phys_size_t variables.
>
> Again, I wrote code comment slightly wrong. It should say that
> ram_base + get_effective_memsize() must be representable by phys_addr_t
> type.


For me it is not a problem, 0x0 for RAM TOP  should be (max of 
phys_addr_t + 1)


=> relocation address of U-Boot is (ram_top - u-boot size) is correctly 
working

but perhaps I miss something....

and I dig deeper.


Anyway on STM32MP15X platform with 1GB DDR, for basic boot

ram_base = 0xC0000000

ram_size = 0x40000000


=> and it is working (with ram_top = 0x0)


>
>>
>> +
>> +	/*
>> +	 * Check for overflow and limit ram size.
>> +	 * It is required that ram_base + ram_size - 1 must be representable
>> +	 * by ram_base type, unsigned long.
>> +	 */
>> +	if (gd->ram_base - 1 + ram_size < gd->ram_base)
>> +		ram_size = ((unsigned long)~0x0ULL) - gd->ram_base;
>>
>>
>>
>> => no need to remove 4KB is here... base + size can reach the MAX value for the type + 1
>>
>>
>> Patrick
>>
> For example U-Boot code in common/board_f.c sets gd->ram_top to value:
>
> gd->ram_top = gd->ram_base + get_effective_memsize();
>
> So your above change with -1 just break this code.


For me ram_top = 0x0 is functional ( for u32 equivalant with overflow to 
0x1 0000 0000)


>
> I think that the best would be to fix U-Boot code so that it never use
> last byte +1 in phys_addr_t.


I see that over platform don't change this common part of U-Boot

but reduce the usable RAM by U-Boot by using the weak function:


board_get_usable_ram_top()


common/board_f.c:376:    gd->ram_top = 
board_get_usable_ram_top(gd->mon_len);


and if you see the default implementation of this function :


/* Get the top of usable RAM */
__weak phys_size_t gd->ram_base(phys_size_t total_size)
{
#if defined(CONFIG_SYS_SDRAM_BASE) && CONFIG_SYS_SDRAM_BASE > 0
     /*
      * Detect whether we have so much RAM that it goes past the end of our
      * 32-bit address space. If so, clip the usable RAM so it doesn't.
      */
     if (gd->ram_top < CONFIG_SYS_SDRAM_BASE)
         /*
          * Will wrap back to top of 32-bit space when reservations
          * are made.
          */
         return 0;
#endif
     return gd->ram_top;
}


the 32-bis address space overflow is already detected here when

gd->ram_top =  gd->ram_base +  gd->ram_size  < CONFIG_SYS_SDRAM_BASE


=> ram_top = board_get_usable_ram_top()

                     = 0


So I think you can revert your patch....

and perhaps check your implementation of this weak function if you 
overidde it


Regards


Patrick
Marek Vasut Jan. 6, 2023, 2:53 p.m. UTC | #11
On 1/6/23 10:13, Patrick DELAUNAY wrote:
> Hi,

Hi,

> For me:
> 
> get_effective_memsize() must return the effective size of the DDR
> 
> and NOT -4KiB to avoid side effects in other part of the code (LMB for 
> example)
> 
> 
> I think today we need to revert your patch.

I agree it would be good to revert the patch for the v2023.01 release , 
and then re-add it once all the issues it triggers are explained or 
resolved in satisfactory manner.
Pali Rohár Jan. 6, 2023, 4:40 p.m. UTC | #12
On Friday 06 January 2023 10:13:34 Patrick DELAUNAY wrote:
> Hi,
> 
> On 1/5/23 20:25, Pali Rohar wrote:
> > Ok, so it is working...
> > 
> > On Thursday 05 January 2023 19:31:19 Patrick DELAUNAY wrote:
> > > I tested on STM32MP157C-EV1 on my side...
> > > 
> > > with 1GiB mermory size
> > > 
> > > U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE)
> > > 
> > > 
> > > 
> > > U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100)
> > > 
> > > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
> > > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
> > > CPU: STM32MP157C?? Rev.Z
> > > Model: STMicroelectronics STM32MP157C eval daughter on eval mother
> > > Board: stm32mp1 in trusted mode (st,stm32mp157c-ev1)
> > > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1)
> > > DRAM:  1 GiB
> > > ....
> > > 
> > > 
> > > STM32MP> bdinfo
> > > boot_params = 0x00000000
> > > DRAM bank   = 0x00000000
> > > -> start    = 0xc0000000
> > > -> size     = 0x3ffff000         <<<<<< the patch is applied - 4kiB here !!!
> > But this is wrong. DRAM bank size should not be changed. I think that in
> > U-Boot is another issue which propagates mapped U-Boot RAM size/range to
> > the places where should be real DDR size.
> 
> 
> Yes it is wrong
> 
> and it is introduced by your patch...
> 
> 
> in the STM32MP15x platfrorm are using the genric weak function
> 
> 
> ./common/board_f.c:267
> 
> 
> __weak int dram_init_banksize(void)
> {
>     gd->bd->bi_dram[0].start = gd->ram_base;
>     gd->bd->bi_dram[0].size = get_effective_memsize();
> 
>     return 0;
> }
> 
> For me:
> 
> get_effective_memsize() must return the effective size of the DDR
> 
> and NOT -4KiB to avoid side effects in other part of the code (LMB for
> example)

Ok, this needs to be reworked. But seems it is not just simple issue.

> 
> I think today we need to revert your patch.

Hard revert without anything else is not the solution. As it will break
all 36-bit powerpc boards with 4GB of DDR RAM on which is u-boot running
in 32-bit mode.
Patrick Delaunay Jan. 6, 2023, 4:56 p.m. UTC | #13
Hi Marek,

On 1/5/23 02:22, Marek Vasut wrote:
> Do not access gd->ram_size and assume this is actual valid RAM size. Since commit
> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow")
> the RAM size may be less than gd->ram_size , call get_effective_memsize() to get
> the limited value instead.
>
> The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM
> at 0xc0000000 hang on boot, which is a grave defect.
>
> Signed-off-by: Marek Vasut <marex@denx.de>
> ---
> Cc: Pali Rohar <pali@kernel.org>
> Cc: Patrice Chotard <patrice.chotard@foss.st.com>
> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
> Cc: Tom Rini <trini@konsulko.com>
> ---
>   arch/arm/mach-stm32mp/dram_init.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c
> index 9346fa8546d..80ba5c27741 100644
> --- a/arch/arm/mach-stm32mp/dram_init.c
> +++ b/arch/arm/mach-stm32mp/dram_init.c
> @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size)
>   
>   	/* found enough not-reserved memory to relocated U-Boot */
>   	lmb_init(&lmb);
> -	lmb_add(&lmb, gd->ram_base, gd->ram_size);
> +	lmb_add(&lmb, gd->ram_base, get_effective_memsize());
>   	boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob);
>   	/* add 8M for reserved memory for display, fdt, gd,... */
>   	size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE),


Applied to u-boot-stm/master, thanks!


For my point a view , this patch is an acceptable workaround for master
branch and v2023.01 delivery.

but the initial commit 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to
check should be revisited or reverted (in master or in next ?).

For details see comments in patch

"arm: stm32mp: Fix board_get_usable_ram_top() again"

http://patchwork.ozlabs.org/project/uboot/patch/20230105012222.238075-1-marex@denx.de/

https://lore.kernel.org/u-boot/20230105012222.238075-1-marex@denx.de/


Regards
Patrick
diff mbox series

Patch

diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c
index 9346fa8546d..80ba5c27741 100644
--- a/arch/arm/mach-stm32mp/dram_init.c
+++ b/arch/arm/mach-stm32mp/dram_init.c
@@ -51,7 +51,7 @@  phys_size_t board_get_usable_ram_top(phys_size_t total_size)
 
 	/* found enough not-reserved memory to relocated U-Boot */
 	lmb_init(&lmb);
-	lmb_add(&lmb, gd->ram_base, gd->ram_size);
+	lmb_add(&lmb, gd->ram_base, get_effective_memsize());
 	boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob);
 	/* add 8M for reserved memory for display, fdt, gd,... */
 	size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE),