Message ID | 20230105012222.238075-1-marex@denx.de |
---|---|
State | Accepted |
Commit | a2e0b041d662e4c80502cd5c9a8326d026f9deb1 |
Delegated to: | Patrick Delaunay |
Headers | show |
Series | arm: stm32mp: Fix board_get_usable_ram_top() again | expand |
Hi, On 1/5/23 02:22, Marek Vasut wrote: > Do not access gd->ram_size and assume this is actual valid RAM size. Since commit > 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow") > the RAM size may be less than gd->ram_size , call get_effective_memsize() to get > the limited value instead. > > The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM > at 0xc0000000 hang on boot, which is a grave defect. > > Signed-off-by: Marek Vasut <marex@denx.de> > --- > Cc: Pali Rohar <pali@kernel.org> > Cc: Patrice Chotard <patrice.chotard@foss.st.com> > Cc: Patrick Delaunay <patrick.delaunay@foss.st.com> > Cc: Tom Rini <trini@konsulko.com> > --- > arch/arm/mach-stm32mp/dram_init.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c > index 9346fa8546d..80ba5c27741 100644 > --- a/arch/arm/mach-stm32mp/dram_init.c > +++ b/arch/arm/mach-stm32mp/dram_init.c > @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size) > > /* found enough not-reserved memory to relocated U-Boot */ > lmb_init(&lmb); > - lmb_add(&lmb, gd->ram_base, gd->ram_size); > + lmb_add(&lmb, gd->ram_base, get_effective_memsize()); > boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob); > /* add 8M for reserved memory for display, fdt, gd,... */ > size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE), Thanks for the patch Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com> Even if I don't uderstood where the code blocked here... I need to cros-check it. But I think it is more a regression introduced by 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow") in this patch I don't understood the assumption: * It is required that ram_base + ram_size must be representable by * phys_size_t type and must be aligned by direct access why add now this limitation ? why phys_size_t here and not phys_addr_t or unsigned long, the ram_base type ? why align on 4kB and not on MMU_SECTION_SIZE ? until now on ARM 32bits U-Boot is working with ram_base = 0xC0000000 (unsigned long) ram_size = 0x40000000 (phys_size_t) overflow for ram_top (phys_addr_t) = ram_base + get_effective_memsize() was previously managed in the code as we work only on base address + size (in board_f.c, in lmb library) I correct in the code overflow issues, for example in commit 54be09cd8f6e ("arm: caches: manage phys addr_t overflow in mmu_set_region_dcache_behaviour"). moreover for me it is strange to reduce the size of the DDR banks by default (see resutl of bdinfo command) only just to avoid overflow: common/board_f.c:267: gd->bd->bi_dram[0].size = get_effective_memsize(); For me the real assumption is on the end of RAM: "ram_base + ram_size - 1" must be representable by 'ram_base' type (phys_addr_t or unsigned long ?) + + /* + * Check for overflow and limit ram size. + * It is required that ram_base + ram_size - 1 must be representable + * by ram_base type, unsigned long. + */ + if (gd->ram_base - 1 + ram_size < gd->ram_base) + ram_size = ((unsigned long)~0x0ULL) - gd->ram_base; => no need to remove 4KB is here... base + size can reach the MAX value for the type + 1 Patrick
On 1/5/23 11:03, Patrick DELAUNAY wrote: > Hi, Hi, > On 1/5/23 02:22, Marek Vasut wrote: >> Do not access gd->ram_size and assume this is actual valid RAM size. >> Since commit >> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check >> for overflow") >> the RAM size may be less than gd->ram_size , call >> get_effective_memsize() to get >> the limited value instead. >> >> The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM >> at 0xc0000000 hang on boot, which is a grave defect. >> >> Signed-off-by: Marek Vasut <marex@denx.de> >> --- >> Cc: Pali Rohar <pali@kernel.org> >> Cc: Patrice Chotard <patrice.chotard@foss.st.com> >> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com> >> Cc: Tom Rini <trini@konsulko.com> >> --- >> arch/arm/mach-stm32mp/dram_init.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/arm/mach-stm32mp/dram_init.c >> b/arch/arm/mach-stm32mp/dram_init.c >> index 9346fa8546d..80ba5c27741 100644 >> --- a/arch/arm/mach-stm32mp/dram_init.c >> +++ b/arch/arm/mach-stm32mp/dram_init.c >> @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t >> total_size) >> /* found enough not-reserved memory to relocated U-Boot */ >> lmb_init(&lmb); >> - lmb_add(&lmb, gd->ram_base, gd->ram_size); >> + lmb_add(&lmb, gd->ram_base, get_effective_memsize()); >> boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob); >> /* add 8M for reserved memory for display, fdt, gd,... */ >> size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, >> MMU_SECTION_SIZE), > > > Thanks for the patch > > Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com> Thank you. > Even if I don't uderstood where the code blocked here... > > I need to cros-check it. Please do, thank you. > But I think it is more a regression introduced by > > 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for > overflow") I agree that commit likely breaks more systems than just STM32MP15xx , and I am tempted to say, it should be reverted before release. Tom ? > in this patch I don't understood the assumption: > > * It is required that ram_base + ram_size must be representable by > * phys_size_t type and must be aligned by direct access > > > why add now this limitation ? > > why phys_size_t here and not phys_addr_t or unsigned long, the ram_base > type ? > > > why align on 4kB and not on MMU_SECTION_SIZE ? > > > > until now on ARM 32bits U-Boot is working with > > ram_base = 0xC0000000 (unsigned long) > ram_size = 0x40000000 (phys_size_t) > > overflow for ram_top (phys_addr_t) = ram_base + get_effective_memsize() > was previously managed in the code > as we work only on base address + size (in board_f.c, in lmb library) > > I correct in the code overflow issues, for example in commit > 54be09cd8f6e ("arm: caches: manage phys > addr_t overflow in mmu_set_region_dcache_behaviour"). > > > moreover for me it is strange to reduce the size of the DDR banks by > default > (see resutl of bdinfo command) only just to avoid overflow: > > common/board_f.c:267: gd->bd->bi_dram[0].size = get_effective_memsize(); > > > For me the real assumption is on the end of RAM: > > "ram_base + ram_size - 1" must be representable by 'ram_base' type > (phys_addr_t or unsigned long ?) > > > > + > + /* > + * Check for overflow and limit ram size. > + * It is required that ram_base + ram_size - 1 must be representable > + * by ram_base type, unsigned long. > + */ > + if (gd->ram_base - 1 + ram_size < gd->ram_base) > + ram_size = ((unsigned long)~0x0ULL) - gd->ram_base; > > > > => no need to remove 4KB is here... base + size can reach the MAX value > for the type + 1 Note that this -4 kiB is exactly what triggers the problem , i.e. if get_effective_memsize() returns 0x4000_0000 , all is fine, if it returns 0x3fff0000 then the system hangs. I also tried subtracting MMU_SECTION_SIZE instead of 4k and that does not help, system still hangs.
Hi Marek, On 1/5/23 13:11, Marek Vasut wrote: > On 1/5/23 11:03, Patrick DELAUNAY wrote: >> Hi, > > Hi, > >> On 1/5/23 02:22, Marek Vasut wrote: >>> Do not access gd->ram_size and assume this is actual valid RAM size. >>> Since commit >>> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check >>> for overflow") >>> the RAM size may be less than gd->ram_size , call >>> get_effective_memsize() to get >>> the limited value instead. >>> >>> The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM >>> at 0xc0000000 hang on boot, which is a grave defect. >>> >>> Signed-off-by: Marek Vasut <marex@denx.de> >>> --- >>> Cc: Pali Rohar <pali@kernel.org> >>> Cc: Patrice Chotard <patrice.chotard@foss.st.com> >>> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com> >>> Cc: Tom Rini <trini@konsulko.com> >>> --- >>> arch/arm/mach-stm32mp/dram_init.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/arch/arm/mach-stm32mp/dram_init.c >>> b/arch/arm/mach-stm32mp/dram_init.c >>> index 9346fa8546d..80ba5c27741 100644 >>> --- a/arch/arm/mach-stm32mp/dram_init.c >>> +++ b/arch/arm/mach-stm32mp/dram_init.c >>> @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t >>> total_size) >>> /* found enough not-reserved memory to relocated U-Boot */ >>> lmb_init(&lmb); >>> - lmb_add(&lmb, gd->ram_base, gd->ram_size); >>> + lmb_add(&lmb, gd->ram_base, get_effective_memsize()); >>> boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob); >>> /* add 8M for reserved memory for display, fdt, gd,... */ >>> size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, >>> MMU_SECTION_SIZE), >> >> >> Thanks for the patch >> >> Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com> > > Thank you. > >> Even if I don't uderstood where the code blocked here... >> >> I need to cros-check it. > > Please do, thank you. Tested, see after.... > >> But I think it is more a regression introduced by >> >> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check >> for overflow") > > I agree that commit likely breaks more systems than just STM32MP15xx , > and I am tempted to say, it should be reverted before release. > > Tom ? > >> in this patch I don't understood the assumption: >> >> * It is required that ram_base + ram_size must be representable by >> * phys_size_t type and must be aligned by direct access >> >> >> why add now this limitation ? >> >> why phys_size_t here and not phys_addr_t or unsigned long, the >> ram_base type ? >> >> >> why align on 4kB and not on MMU_SECTION_SIZE ? >> >> >> >> until now on ARM 32bits U-Boot is working with >> >> ram_base = 0xC0000000 (unsigned long) >> ram_size = 0x40000000 (phys_size_t) >> >> overflow for ram_top (phys_addr_t) = ram_base + >> get_effective_memsize() was previously managed in the code >> as we work only on base address + size (in board_f.c, in lmb library) >> >> I correct in the code overflow issues, for example in commit >> 54be09cd8f6e ("arm: caches: manage phys >> addr_t overflow in mmu_set_region_dcache_behaviour"). >> >> >> moreover for me it is strange to reduce the size of the DDR banks by >> default >> (see resutl of bdinfo command) only just to avoid overflow: >> >> common/board_f.c:267: gd->bd->bi_dram[0].size = >> get_effective_memsize(); >> >> >> For me the real assumption is on the end of RAM: >> >> "ram_base + ram_size - 1" must be representable by 'ram_base' type >> (phys_addr_t or unsigned long ?) >> >> >> >> + >> + /* >> + * Check for overflow and limit ram size. >> + * It is required that ram_base + ram_size - 1 must be >> representable >> + * by ram_base type, unsigned long. >> + */ >> + if (gd->ram_base - 1 + ram_size < gd->ram_base) >> + ram_size = ((unsigned long)~0x0ULL) - gd->ram_base; >> >> >> >> => no need to remove 4KB is here... base + size can reach the MAX >> value for the type + 1 > > Note that this -4 kiB is exactly what triggers the problem , i.e. if > get_effective_memsize() returns 0x4000_0000 , all is fine, if it > returns 0x3fff0000 then the system hangs. I also tried subtracting > MMU_SECTION_SIZE instead of 4k and that does not help, system still > hangs. ok I tested on STM32MP157C-EV1 on my side... with 1GiB mermory size U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE) U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100) stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) CPU: STM32MP157C?? Rev.Z Model: STMicroelectronics STM32MP157C eval daughter on eval mother Board: stm32mp1 in trusted mode (st,stm32mp157c-ev1) stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) DRAM: 1 GiB .... STM32MP> bdinfo boot_params = 0x00000000 DRAM bank = 0x00000000 -> start = 0xc0000000 -> size = 0x3ffff000 <<<<<< the patch is applied - 4kiB here !!! flashstart = 0x00000000 flashsize = 0x00000000 flashoffset = 0x00000000 baudrate = 115200 bps relocaddr = 0xfdb1d000 reloc off = 0x3da1d000 Build = 32-bit current eth = ethernet@5800a000 ethaddr = 00:80:e1:01:60:da IP addr = <NULL> fdt_blob = 0xfbb040a0 new_fdt = 0xfbb040a0 fdt_size = 0x00016de0 Video = display-controller@5a001000 inactive lmb_dump_all: memory.cnt = 0x1 memory[0] [0xc0000000-0xffffefff], 0x3ffff000 bytes flags: 0 <<<< reserved.cnt = 0x6 reserved[0] [0x10000000-0x10045fff], 0x00046000 bytes flags: 4 reserved[1] [0x30000000-0x3003ffff], 0x00040000 bytes flags: 4 reserved[2] [0x38000000-0x3800ffff], 0x00010000 bytes flags: 4 reserved[3] [0xe8000000-0xefffffff], 0x08000000 bytes flags: 4 reserved[4] [0xfbaffbb8-0xfdffffff], 0x02500448 bytes flags: 0 reserved[5] [0xfe000000-0xffffffff], 0x02000000 bytes flags: 4 <<<< devicetree = board arch_number = 0x00000000 TLB addr = 0xfdff0000 irq_sp = 0xfbb03e10 sp start = 0xfbb03e00 Early malloc usage: 2924 / 20000 STM32MP> But we have strange LMB reserved memory [0xfe000000-0xffffffff] from DT => it is outside of memory[0] [0xc0000000-0xffffefff] I think the -4 kiB should be removed at least.... I test also with basic boot = SPL and U-Boot, no issue on my side Patrick
Hello! On Thursday 05 January 2023 11:03:07 Patrick DELAUNAY wrote: > Hi, > > On 1/5/23 02:22, Marek Vasut wrote: > > Do not access gd->ram_size and assume this is actual valid RAM size. Since commit > > 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow") > > the RAM size may be less than gd->ram_size , call get_effective_memsize() to get > > the limited value instead. > > > > The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM > > at 0xc0000000 hang on boot, which is a grave defect. > > > > Signed-off-by: Marek Vasut <marex@denx.de> > > --- > > Cc: Pali Rohar <pali@kernel.org> > > Cc: Patrice Chotard <patrice.chotard@foss.st.com> > > Cc: Patrick Delaunay <patrick.delaunay@foss.st.com> > > Cc: Tom Rini <trini@konsulko.com> > > --- > > arch/arm/mach-stm32mp/dram_init.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c > > index 9346fa8546d..80ba5c27741 100644 > > --- a/arch/arm/mach-stm32mp/dram_init.c > > +++ b/arch/arm/mach-stm32mp/dram_init.c > > @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size) > > /* found enough not-reserved memory to relocated U-Boot */ > > lmb_init(&lmb); > > - lmb_add(&lmb, gd->ram_base, gd->ram_size); > > + lmb_add(&lmb, gd->ram_base, get_effective_memsize()); > > boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob); > > /* add 8M for reserved memory for display, fdt, gd,... */ > > size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE), > > > Thanks for the patch > > Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com> > > > Even if I don't uderstood where the code blocked here... > > I need to cros-check it. > > > > > But I think it is more a regression introduced by > > 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow") > > > in this patch I don't understood the assumption: > > * It is required that ram_base + ram_size must be representable by > * phys_size_t type and must be aligned by direct access > > > why add now this limitation ? > > why phys_size_t here and not phys_addr_t or unsigned long, the ram_base type ? Well, I think that it is quite confusing when it is phys_addr_t and when phys_size_t. Because U-Boot on more places is doing addition: phys_addr_t + phys_size_t. And I was confused by it too, because get_effective_memsize() function works only with variables of phys_size_t type. So I would agree that it should be mentioned as phys_addr_t. But is there any platform on which phys_addr_t and phys_size_t types have different sizes? Why it is not ram_base? Because there is no such type. Why it is not phys_addr_t? Because some 32-bit platforms support more than 4GB of physical memory and unsigned long type is small for them. Also there are 64-bit platforms with just 32-bit data bus (but I'm not sure if these platforms are handled somehow specially with just 32-bit phys_addr_t). > why align on 4kB and not on MMU_SECTION_SIZE ? Because of my mistake and seems that nobody spotted or reported this issue yet. I agree it should be some platform specific macro (I have not checked now if MMU_SECTION_SIZE is the correct one). > until now on ARM 32bits U-Boot is working with > > ram_base = 0xC0000000 (unsigned long) > ram_size = 0x40000000 (phys_size_t) > > overflow for ram_top (phys_addr_t) = ram_base + get_effective_memsize() was previously managed in the code > as we work only on base address + size (in board_f.c, in lmb library) > > I correct in the code overflow issues, for example in commit 54be09cd8f6e ("arm: caches: manage phys > addr_t overflow in mmu_set_region_dcache_behaviour"). > > > moreover for me it is strange to reduce the size of the DDR banks by default > (see resutl of bdinfo command) only just to avoid overflow: > > common/board_f.c:267: gd->bd->bi_dram[0].size = get_effective_memsize(); Yes, this really looks strange to reduce total size of DDR banks to the effective memory size which U-Boot can use or map. There is also config option CONFIG_MAX_MEM_MAPPED to allow user to reduce mappable memory in U-Boot to more strict value. For sure DDR bank size should be set to the real size. For example 32-bit U-Boot can map maximally 4GB of RAM, but on some (32-bit) platforms there can be more than 4GB of DDR and operating system may support to use more than 4GB of it. So DDR bank size passed from bootloader to OS should be precise and not reduced to size supported by bootloader. > For me the real assumption is on the end of RAM: > > "ram_base + ram_size - 1" must be representable by 'ram_base' type > (phys_addr_t or unsigned long ?) You are missing there +1. That is restriction in the U-Boot as it requires that one byte after the RAM can be stored in both phys_addr_t and phys_size_t variables. Again, I wrote code comment slightly wrong. It should say that ram_base + get_effective_memsize() must be representable by phys_addr_t type. > > > + > + /* > + * Check for overflow and limit ram size. > + * It is required that ram_base + ram_size - 1 must be representable > + * by ram_base type, unsigned long. > + */ > + if (gd->ram_base - 1 + ram_size < gd->ram_base) > + ram_size = ((unsigned long)~0x0ULL) - gd->ram_base; > > > > => no need to remove 4KB is here... base + size can reach the MAX value for the type + 1 > > > Patrick > For example U-Boot code in common/board_f.c sets gd->ram_top to value: gd->ram_top = gd->ram_base + get_effective_memsize(); So your above change with -1 just break this code. I think that the best would be to fix U-Boot code so that it never use last byte +1 in phys_addr_t.
Ok, so it is working... On Thursday 05 January 2023 19:31:19 Patrick DELAUNAY wrote: > I tested on STM32MP157C-EV1 on my side... > > with 1GiB mermory size > > U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE) > > > > U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100) > > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) > CPU: STM32MP157C?? Rev.Z > Model: STMicroelectronics STM32MP157C eval daughter on eval mother > Board: stm32mp1 in trusted mode (st,stm32mp157c-ev1) > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) > DRAM: 1 GiB > .... > > > STM32MP> bdinfo > boot_params = 0x00000000 > DRAM bank = 0x00000000 > -> start = 0xc0000000 > -> size = 0x3ffff000 <<<<<< the patch is applied - 4kiB here !!! But this is wrong. DRAM bank size should not be changed. I think that in U-Boot is another issue which propagates mapped U-Boot RAM size/range to the places where should be real DDR size. > flashstart = 0x00000000 > flashsize = 0x00000000 > flashoffset = 0x00000000 > baudrate = 115200 bps > relocaddr = 0xfdb1d000 > reloc off = 0x3da1d000 > Build = 32-bit > current eth = ethernet@5800a000 > ethaddr = 00:80:e1:01:60:da > IP addr = <NULL> > fdt_blob = 0xfbb040a0 > new_fdt = 0xfbb040a0 > fdt_size = 0x00016de0 > Video = display-controller@5a001000 inactive > lmb_dump_all: > memory.cnt = 0x1 > memory[0] [0xc0000000-0xffffefff], 0x3ffff000 bytes flags: 0 <<<< > reserved.cnt = 0x6 > reserved[0] [0x10000000-0x10045fff], 0x00046000 bytes flags: 4 > reserved[1] [0x30000000-0x3003ffff], 0x00040000 bytes flags: 4 > reserved[2] [0x38000000-0x3800ffff], 0x00010000 bytes flags: 4 > reserved[3] [0xe8000000-0xefffffff], 0x08000000 bytes flags: 4 > reserved[4] [0xfbaffbb8-0xfdffffff], 0x02500448 bytes flags: 0 > reserved[5] [0xfe000000-0xffffffff], 0x02000000 bytes flags: 4 <<<< > devicetree = board > arch_number = 0x00000000 > TLB addr = 0xfdff0000 > irq_sp = 0xfbb03e10 > sp start = 0xfbb03e00 > Early malloc usage: 2924 / 20000 > STM32MP> > > > But we have strange LMB reserved memory [0xfe000000-0xffffffff] from DT > > => it is outside of memory[0] [0xc0000000-0xffffefff] > > > I think the -4 kiB should be removed at least.... > > > > I test also with basic boot = SPL and U-Boot, no issue on my side > > > > Patrick >
On 1/5/23 19:31, Patrick DELAUNAY wrote: > Hi Marek, Hi, [...] > I tested on STM32MP157C-EV1 on my side... > > with 1GiB mermory size > > U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE) > > > > U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100) Note that I am not running into this problem on next, but on current master . Also, I am using basic boot. Can you retest ? [...]
On 1/5/23 20:25, Pali Rohar wrote:
> Ok, so it is working...
No, it is NOT working because the test performed is using completely
different codebase.
The STM32MP15xx platform I have on my desk is hanging on boot with
current U-Boot master, i.e upcoming release. With commit
777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for
overflow")
reverted, the platform boots fine again.
[...]
Hi, On 1/5/23 21:35, Marek Vasut wrote: > On 1/5/23 19:31, Patrick DELAUNAY wrote: >> Hi Marek, > > Hi, > > [...] > >> I tested on STM32MP157C-EV1 on my side... >> >> with 1GiB mermory size >> >> U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE) >> >> >> >> U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100) > > Note that I am not running into this problem on next, but on current > master . Also, I am using basic boot. Can you retest ? > > [...] I retest on master for STM32MP157C-EV1, basic defconfig. SHA1 = 8d6cbf5e6bc4e10e38b954763fa025caed517cc2 no blocking issue on my side.... I go up to U-Boot console. but it is the same binary than you (not the same defconfig) so your blcoking point is perhaps only masked. regards. Patrick
Hi, On 1/5/23 20:25, Pali Rohar wrote: > Ok, so it is working... > > On Thursday 05 January 2023 19:31:19 Patrick DELAUNAY wrote: >> I tested on STM32MP157C-EV1 on my side... >> >> with 1GiB mermory size >> >> U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE) >> >> >> >> U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100) >> >> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) >> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) >> CPU: STM32MP157C?? Rev.Z >> Model: STMicroelectronics STM32MP157C eval daughter on eval mother >> Board: stm32mp1 in trusted mode (st,stm32mp157c-ev1) >> stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) >> DRAM: 1 GiB >> .... >> >> >> STM32MP> bdinfo >> boot_params = 0x00000000 >> DRAM bank = 0x00000000 >> -> start = 0xc0000000 >> -> size = 0x3ffff000 <<<<<< the patch is applied - 4kiB here !!! > But this is wrong. DRAM bank size should not be changed. I think that in > U-Boot is another issue which propagates mapped U-Boot RAM size/range to > the places where should be real DDR size. Yes it is wrong and it is introduced by your patch... in the STM32MP15x platfrorm are using the genric weak function ./common/board_f.c:267 __weak int dram_init_banksize(void) { gd->bd->bi_dram[0].start = gd->ram_base; gd->bd->bi_dram[0].size = get_effective_memsize(); return 0; } For me: get_effective_memsize() must return the effective size of the DDR and NOT -4KiB to avoid side effects in other part of the code (LMB for example) I think today we need to revert your patch. regards Patrick
Hi, On 1/5/23 20:16, Pali Rohar wrote: > Hello! > > On Thursday 05 January 2023 11:03:07 Patrick DELAUNAY wrote: >> Hi, >> >> On 1/5/23 02:22, Marek Vasut wrote: >>> Do not access gd->ram_size and assume this is actual valid RAM size. Since commit >>> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow") >>> the RAM size may be less than gd->ram_size , call get_effective_memsize() to get >>> the limited value instead. >>> >>> The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM >>> at 0xc0000000 hang on boot, which is a grave defect. >>> >>> Signed-off-by: Marek Vasut <marex@denx.de> >>> --- >>> Cc: Pali Rohar <pali@kernel.org> >>> Cc: Patrice Chotard <patrice.chotard@foss.st.com> >>> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com> >>> Cc: Tom Rini <trini@konsulko.com> >>> --- >>> arch/arm/mach-stm32mp/dram_init.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c >>> index 9346fa8546d..80ba5c27741 100644 >>> --- a/arch/arm/mach-stm32mp/dram_init.c >>> +++ b/arch/arm/mach-stm32mp/dram_init.c >>> @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size) >>> /* found enough not-reserved memory to relocated U-Boot */ >>> lmb_init(&lmb); >>> - lmb_add(&lmb, gd->ram_base, gd->ram_size); >>> + lmb_add(&lmb, gd->ram_base, get_effective_memsize()); >>> boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob); >>> /* add 8M for reserved memory for display, fdt, gd,... */ >>> size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE), >> >> Thanks for the patch >> >> Reviewed-by: Patrick Delaunay <patrick.delaunay@foss.st.com> >> >> >> Even if I don't uderstood where the code blocked here... >> >> I need to cros-check it. >> >> >> >> >> But I think it is more a regression introduced by >> >> 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow") >> >> >> in this patch I don't understood the assumption: >> >> * It is required that ram_base + ram_size must be representable by >> * phys_size_t type and must be aligned by direct access >> >> >> why add now this limitation ? >> >> why phys_size_t here and not phys_addr_t or unsigned long, the ram_base type ? > Well, I think that it is quite confusing when it is phys_addr_t and when > phys_size_t. Because U-Boot on more places is doing addition: > phys_addr_t + phys_size_t. > > And I was confused by it too, because get_effective_memsize() function > works only with variables of phys_size_t type. > > So I would agree that it should be mentioned as phys_addr_t. > > But is there any platform on which phys_addr_t and phys_size_t types > have different sizes? I don't know, it is just more clear => addr overflow based on address type > > Why it is not ram_base? Because there is no such type. Why it is not > phys_addr_t? Because some 32-bit platforms support more than 4GB of > physical memory and unsigned long type is small for them. Also there are > 64-bit platforms with just 32-bit data bus (but I'm not sure if these > platforms are handled somehow specially with just 32-bit phys_addr_t). > >> why align on 4kB and not on MMU_SECTION_SIZE ? > Because of my mistake and seems that nobody spotted or reported this > issue yet. I agree it should be some platform specific macro (I have not > checked now if MMU_SECTION_SIZE is the correct one). I propose MMU_SECTION_SIZE to allow MMU configuration up to end of effective ram top. > >> until now on ARM 32bits U-Boot is working with >> >> ram_base = 0xC0000000 (unsigned long) >> ram_size = 0x40000000 (phys_size_t) >> >> overflow for ram_top (phys_addr_t) = ram_base + get_effective_memsize() was previously managed in the code >> as we work only on base address + size (in board_f.c, in lmb library) >> >> I correct in the code overflow issues, for example in commit 54be09cd8f6e ("arm: caches: manage phys >> addr_t overflow in mmu_set_region_dcache_behaviour"). >> >> >> moreover for me it is strange to reduce the size of the DDR banks by default >> (see resutl of bdinfo command) only just to avoid overflow: >> >> common/board_f.c:267: gd->bd->bi_dram[0].size = get_effective_memsize(); > Yes, this really looks strange to reduce total size of DDR banks to the > effective memory size which U-Boot can use or map. There is also config > option CONFIG_MAX_MEM_MAPPED to allow user to reduce mappable memory in > U-Boot to more strict value. > > For sure DDR bank size should be set to the real size. > > For example 32-bit U-Boot can map maximally 4GB of RAM, but on some > (32-bit) platforms there can be more than 4GB of DDR and operating > system may support to use more than 4GB of it. So DDR bank size passed > from bootloader to OS should be precise and not reduced to size > supported by bootloader. > >> For me the real assumption is on the end of RAM: >> >> "ram_base + ram_size - 1" must be representable by 'ram_base' type >> (phys_addr_t or unsigned long ?) > You are missing there +1. That is restriction in the U-Boot as it > requires that one byte after the RAM can be stored in both phys_addr_t > and phys_size_t variables. > > Again, I wrote code comment slightly wrong. It should say that > ram_base + get_effective_memsize() must be representable by phys_addr_t > type. For me it is not a problem, 0x0 for RAM TOP should be (max of phys_addr_t + 1) => relocation address of U-Boot is (ram_top - u-boot size) is correctly working but perhaps I miss something.... and I dig deeper. Anyway on STM32MP15X platform with 1GB DDR, for basic boot ram_base = 0xC0000000 ram_size = 0x40000000 => and it is working (with ram_top = 0x0) > >> >> + >> + /* >> + * Check for overflow and limit ram size. >> + * It is required that ram_base + ram_size - 1 must be representable >> + * by ram_base type, unsigned long. >> + */ >> + if (gd->ram_base - 1 + ram_size < gd->ram_base) >> + ram_size = ((unsigned long)~0x0ULL) - gd->ram_base; >> >> >> >> => no need to remove 4KB is here... base + size can reach the MAX value for the type + 1 >> >> >> Patrick >> > For example U-Boot code in common/board_f.c sets gd->ram_top to value: > > gd->ram_top = gd->ram_base + get_effective_memsize(); > > So your above change with -1 just break this code. For me ram_top = 0x0 is functional ( for u32 equivalant with overflow to 0x1 0000 0000) > > I think that the best would be to fix U-Boot code so that it never use > last byte +1 in phys_addr_t. I see that over platform don't change this common part of U-Boot but reduce the usable RAM by U-Boot by using the weak function: board_get_usable_ram_top() common/board_f.c:376: gd->ram_top = board_get_usable_ram_top(gd->mon_len); and if you see the default implementation of this function : /* Get the top of usable RAM */ __weak phys_size_t gd->ram_base(phys_size_t total_size) { #if defined(CONFIG_SYS_SDRAM_BASE) && CONFIG_SYS_SDRAM_BASE > 0 /* * Detect whether we have so much RAM that it goes past the end of our * 32-bit address space. If so, clip the usable RAM so it doesn't. */ if (gd->ram_top < CONFIG_SYS_SDRAM_BASE) /* * Will wrap back to top of 32-bit space when reservations * are made. */ return 0; #endif return gd->ram_top; } the 32-bis address space overflow is already detected here when gd->ram_top = gd->ram_base + gd->ram_size < CONFIG_SYS_SDRAM_BASE => ram_top = board_get_usable_ram_top() = 0 So I think you can revert your patch.... and perhaps check your implementation of this weak function if you overidde it Regards Patrick
On 1/6/23 10:13, Patrick DELAUNAY wrote: > Hi, Hi, > For me: > > get_effective_memsize() must return the effective size of the DDR > > and NOT -4KiB to avoid side effects in other part of the code (LMB for > example) > > > I think today we need to revert your patch. I agree it would be good to revert the patch for the v2023.01 release , and then re-add it once all the issues it triggers are explained or resolved in satisfactory manner.
On Friday 06 January 2023 10:13:34 Patrick DELAUNAY wrote: > Hi, > > On 1/5/23 20:25, Pali Rohar wrote: > > Ok, so it is working... > > > > On Thursday 05 January 2023 19:31:19 Patrick DELAUNAY wrote: > > > I tested on STM32MP157C-EV1 on my side... > > > > > > with 1GiB mermory size > > > > > > U-Boot is booting on next TOP (for trusted boot with TF-A and OP-TEE) > > > > > > > > > > > > U-Boot 2023.01-rc4-00386-gb429e78942de (Jan 05 2023 - 17:44:04 +0100) > > > > > > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) > > > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) > > > CPU: STM32MP157C?? Rev.Z > > > Model: STMicroelectronics STM32MP157C eval daughter on eval mother > > > Board: stm32mp1 in trusted mode (st,stm32mp157c-ev1) > > > stm32_smc: Failed to exec svc=82001003 op=1 in secure mode (err = -1) > > > DRAM: 1 GiB > > > .... > > > > > > > > > STM32MP> bdinfo > > > boot_params = 0x00000000 > > > DRAM bank = 0x00000000 > > > -> start = 0xc0000000 > > > -> size = 0x3ffff000 <<<<<< the patch is applied - 4kiB here !!! > > But this is wrong. DRAM bank size should not be changed. I think that in > > U-Boot is another issue which propagates mapped U-Boot RAM size/range to > > the places where should be real DDR size. > > > Yes it is wrong > > and it is introduced by your patch... > > > in the STM32MP15x platfrorm are using the genric weak function > > > ./common/board_f.c:267 > > > __weak int dram_init_banksize(void) > { > gd->bd->bi_dram[0].start = gd->ram_base; > gd->bd->bi_dram[0].size = get_effective_memsize(); > > return 0; > } > > For me: > > get_effective_memsize() must return the effective size of the DDR > > and NOT -4KiB to avoid side effects in other part of the code (LMB for > example) Ok, this needs to be reworked. But seems it is not just simple issue. > > I think today we need to revert your patch. Hard revert without anything else is not the solution. As it will break all 36-bit powerpc boards with 4GB of DDR RAM on which is u-boot running in 32-bit mode.
Hi Marek, On 1/5/23 02:22, Marek Vasut wrote: > Do not access gd->ram_size and assume this is actual valid RAM size. Since commit > 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow") > the RAM size may be less than gd->ram_size , call get_effective_memsize() to get > the limited value instead. > > The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM > at 0xc0000000 hang on boot, which is a grave defect. > > Signed-off-by: Marek Vasut <marex@denx.de> > --- > Cc: Pali Rohar <pali@kernel.org> > Cc: Patrice Chotard <patrice.chotard@foss.st.com> > Cc: Patrick Delaunay <patrick.delaunay@foss.st.com> > Cc: Tom Rini <trini@konsulko.com> > --- > arch/arm/mach-stm32mp/dram_init.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c > index 9346fa8546d..80ba5c27741 100644 > --- a/arch/arm/mach-stm32mp/dram_init.c > +++ b/arch/arm/mach-stm32mp/dram_init.c > @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size) > > /* found enough not-reserved memory to relocated U-Boot */ > lmb_init(&lmb); > - lmb_add(&lmb, gd->ram_base, gd->ram_size); > + lmb_add(&lmb, gd->ram_base, get_effective_memsize()); > boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob); > /* add 8M for reserved memory for display, fdt, gd,... */ > size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE), Applied to u-boot-stm/master, thanks! For my point a view , this patch is an acceptable workaround for master branch and v2023.01 delivery. but the initial commit 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check should be revisited or reverted (in master or in next ?). For details see comments in patch "arm: stm32mp: Fix board_get_usable_ram_top() again" http://patchwork.ozlabs.org/project/uboot/patch/20230105012222.238075-1-marex@denx.de/ https://lore.kernel.org/u-boot/20230105012222.238075-1-marex@denx.de/ Regards Patrick
diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c index 9346fa8546d..80ba5c27741 100644 --- a/arch/arm/mach-stm32mp/dram_init.c +++ b/arch/arm/mach-stm32mp/dram_init.c @@ -51,7 +51,7 @@ phys_size_t board_get_usable_ram_top(phys_size_t total_size) /* found enough not-reserved memory to relocated U-Boot */ lmb_init(&lmb); - lmb_add(&lmb, gd->ram_base, gd->ram_size); + lmb_add(&lmb, gd->ram_base, get_effective_memsize()); boot_fdt_add_mem_rsv_regions(&lmb, (void *)gd->fdt_blob); /* add 8M for reserved memory for display, fdt, gd,... */ size = ALIGN(SZ_8M + CONFIG_SYS_MALLOC_LEN + total_size, MMU_SECTION_SIZE),
Do not access gd->ram_size and assume this is actual valid RAM size. Since commit 777aaaa706b ("common/memsize.c: Fix get_effective_memsize() to check for overflow") the RAM size may be less than gd->ram_size , call get_effective_memsize() to get the limited value instead. The aforementioned commit makes STM32MP15xx boards with 1 GiB of DRAM at 0xc0000000 hang on boot, which is a grave defect. Signed-off-by: Marek Vasut <marex@denx.de> --- Cc: Pali Rohar <pali@kernel.org> Cc: Patrice Chotard <patrice.chotard@foss.st.com> Cc: Patrick Delaunay <patrick.delaunay@foss.st.com> Cc: Tom Rini <trini@konsulko.com> --- arch/arm/mach-stm32mp/dram_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)