diff mbox series

[U-Boot] fs/fat: device not booting since conversion to dir iterators

Message ID 3BB206AB2B1BD448954845CE6FF69A8E01CB5312DB@NT-Mail07.beckhoff.com
State RFC
Delegated to: Tom Rini
Headers show
Series [U-Boot] fs/fat: device not booting since conversion to dir iterators | expand

Commit Message

Patrick Brünn Nov. 30, 2017, 11:22 a.m. UTC
Since commit 8eafae209c35932d9a6560809c55ee4641534236
my mx53cx9020_defconfig stopped booting.
I could "fix" that commit with this patch:


My device is powered by an i.MX53 and configured to
load u-boot from sdcard. That sdcard doesn't have any
FAT partition on it:
Disk /dev/mmcblk0: 471 MiB, 493879296 bytes, 964608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xbbd9d8f6

Device         Boot Start    End Sectors  Size Id Type
/dev/mmcblk0p1 *     2048 489471  487424  238M 83 Linux

Maybe my sdcard layout is broken, too. If you need any
additional info or have any idea whatelse I could try,
please let me know.

Thanks in advance,
Patrick
Beckhoff Automation GmbH & Co. KG | Managing Director: Dipl. Phys. Hans Beckhoff
Registered office: Verl, Germany | Register court: Guetersloh HRA 7075

Comments

Patrick Brünn Dec. 13, 2017, 10:15 a.m. UTC | #1
>From: Patrick Brünn
>Sent: Donnerstag, 30. November 2017 12:22
>
>Since commit 8eafae209c35932d9a6560809c55ee4641534236
>my mx53cx9020_defconfig stopped booting.
>...
>It seems important to read into do_fat_read_at_block. If I
>allocate a temporary buffer and read into that one, I still
>can't boot. That seems strange to me as, I couldn't find
>any uninitialized access to do_fat_read_at_block anywhere.
>
I believe I finally tracked it down to the linker.

Without accessing do_fat_read_at_block within fat.c, I find
.bss. do_fat_read_at_block as "discarded input section" in
my u-boot.map

With a patch like this:
diff --git a/fs/fat/fat.c b/fs/fat/fat.c
index d16883fa10..8e70fdbc8d 100644
--- a/fs/fat/fat.c
+++ b/fs/fat/fat.c
@@ -1157,6 +1157,7 @@ int fat_opendir(const char *filename, struct fs_dir_stream **dirsp)
                return -ENOMEM;
        memset(dir, 0, sizeof(*dir));

+       if (do_fat_read_at_block[0])
+               printf("value doesn't matter, the access is important\n");
+
        ret = fat_itr_root(&dir->itr, &dir->fsdata);
        if (ret)
                goto fail_free_dir;

The section isn't discarded and my board boots again.


Another workaround would be:
diff --git a/arch/arm/config.mk b/arch/arm/config.mk
index 124b44a337..02f61fcc3c 100644
--- a/arch/arm/config.mk
+++ b/arch/arm/config.mk
@@ -17,7 +17,7 @@ CFLAGS_NON_EFI := -fno-pic -ffixed-r9 -ffunction-sections -fdata-sections
 CFLAGS_EFI := -fpic -fshort-wchar

 LDFLAGS_FINAL += --gc-sections
-PLATFORM_RELFLAGS += -ffunction-sections -fdata-sections \
+PLATFORM_RELFLAGS += -ffunction-sections \
                     -fno-common -ffixed-r9
 PLATFORM_RELFLAGS += $(call cc-option, -msoft-float) \
       $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))

Or a hack like:
diff --git a/arch/arm/cpu/u-boot.lds b/arch/arm/cpu/u-boot.lds
index 37d4c605ac..b916cbc733 100644
--- a/arch/arm/cpu/u-boot.lds
+++ b/arch/arm/cpu/u-boot.lds
@@ -219,6 +219,7 @@ SECTIONS
        }

        .bss_end __bss_limit (OVERLAY) : {
+               KEEP(*(.bss.do_fat_read_at_block));
                KEEP(*(.__bss_end));
        }

I also tried to mark do_fat_read_at_block as:
volatile, __used and/or __attribute__((externally_visible))

This doesn't help.

Moving it into fat_write.c doesn't help either(with/without static)

My toolchain is:
~/u-boot$ arm-linux-gnueabihf-gcc -v
Using built-in specs.
COLLECT_GCC=arm-linux-gnueabihf-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc-cross/arm-linux-gnueabihf/6/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-armhf-cross/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-armhf-cross --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-armhf-cross --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libgcj --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=arm-linux-gnueabihf --program-prefix=arm-linux-gnueabihf- --includedir=/usr/arm-linux-gnueabihf/include
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18)

~/u-boot$ arm-linux-gnueabihf-ld -v
GNU ld (GNU Binutils for Debian) 2.28


Does anyone have an idea what else I can try to prevent my linker from removing
do_fat_read_at_block?

Regards,
Patrick
--
Beckhoff Automation GmbH & Co. KG | Managing Director: Dipl. Phys. Hans Beckhoff
Registered office: Verl, Germany | Register court: Guetersloh HRA 7075
Simon Glass Dec. 19, 2017, 3:41 p.m. UTC | #2
Hi Patrick,

On 13 December 2017 at 03:15, Patrick Brünn <P.Bruenn@beckhoff.com> wrote:
>>From: Patrick Brünn
>>Sent: Donnerstag, 30. November 2017 12:22
>>
>>Since commit 8eafae209c35932d9a6560809c55ee4641534236
>>my mx53cx9020_defconfig stopped booting.
>>...
>>It seems important to read into do_fat_read_at_block. If I
>>allocate a temporary buffer and read into that one, I still
>>can't boot. That seems strange to me as, I couldn't find
>>any uninitialized access to do_fat_read_at_block anywhere.
>>
> I believe I finally tracked it down to the linker.
>
> Without accessing do_fat_read_at_block within fat.c, I find
> .bss. do_fat_read_at_block as "discarded input section" in
> my u-boot.map
>
> With a patch like this:
> diff --git a/fs/fat/fat.c b/fs/fat/fat.c
> index d16883fa10..8e70fdbc8d 100644
> --- a/fs/fat/fat.c
> +++ b/fs/fat/fat.c
> @@ -1157,6 +1157,7 @@ int fat_opendir(const char *filename, struct fs_dir_stream **dirsp)
>                 return -ENOMEM;
>         memset(dir, 0, sizeof(*dir));
>
> +       if (do_fat_read_at_block[0])
> +               printf("value doesn't matter, the access is important\n");
> +
>         ret = fat_itr_root(&dir->itr, &dir->fsdata);
>         if (ret)
>                 goto fail_free_dir;
>
> The section isn't discarded and my board boots again.
>
>
> Another workaround would be:
> diff --git a/arch/arm/config.mk b/arch/arm/config.mk
> index 124b44a337..02f61fcc3c 100644
> --- a/arch/arm/config.mk
> +++ b/arch/arm/config.mk
> @@ -17,7 +17,7 @@ CFLAGS_NON_EFI := -fno-pic -ffixed-r9 -ffunction-sections -fdata-sections
>  CFLAGS_EFI := -fpic -fshort-wchar
>
>  LDFLAGS_FINAL += --gc-sections
> -PLATFORM_RELFLAGS += -ffunction-sections -fdata-sections \
> +PLATFORM_RELFLAGS += -ffunction-sections \
>                      -fno-common -ffixed-r9
>  PLATFORM_RELFLAGS += $(call cc-option, -msoft-float) \
>        $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
>
> Or a hack like:
> diff --git a/arch/arm/cpu/u-boot.lds b/arch/arm/cpu/u-boot.lds
> index 37d4c605ac..b916cbc733 100644
> --- a/arch/arm/cpu/u-boot.lds
> +++ b/arch/arm/cpu/u-boot.lds
> @@ -219,6 +219,7 @@ SECTIONS
>         }
>
>         .bss_end __bss_limit (OVERLAY) : {
> +               KEEP(*(.bss.do_fat_read_at_block));
>                 KEEP(*(.__bss_end));
>         }
>
> I also tried to mark do_fat_read_at_block as:
> volatile, __used and/or __attribute__((externally_visible))
>
> This doesn't help.
>
> Moving it into fat_write.c doesn't help either(with/without static)
>
> My toolchain is:
> ~/u-boot$ arm-linux-gnueabihf-gcc -v
> Using built-in specs.
> COLLECT_GCC=arm-linux-gnueabihf-gcc
> COLLECT_LTO_WRAPPER=/usr/lib/gcc-cross/arm-linux-gnueabihf/6/lto-wrapper
> Target: arm-linux-gnueabihf
> Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-armhf-cross/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-armhf-cross --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-armhf-cross --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libgcj --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=arm-linux-gnueabihf --program-prefix=arm-linux-gnueabihf- --includedir=/usr/arm-linux-gnueabihf/include
> Thread model: posix
> gcc version 6.3.0 20170516 (Debian 6.3.0-18)
>
> ~/u-boot$ arm-linux-gnueabihf-ld -v
> GNU ld (GNU Binutils for Debian) 2.28
>
>
> Does anyone have an idea what else I can try to prevent my linker from removing
> do_fat_read_at_block?

This doesn't seem to be used unless you have enabled CONFIG_FAT_WRITE. Have you?

I wonder if the problem is somewhere else, and the presence of this
buffer is just moving things apart in the data space, and thus masking
the problem?

Regards,
Simon
Patrick Brünn Dec. 20, 2017, 4:27 a.m. UTC | #3
Hi Simon,

>From: sjg@google.com [mailto:sjg@google.com] On Behalf Of Simon Glass

>Sent: Dienstag, 19. Dezember 2017 16:42

>Hi Patrick,

>

>On 13 December 2017 at 03:15, Patrick Brünn <P.Bruenn@beckhoff.com>

>wrote:

>>>From: Patrick Brünn

>>>Sent: Donnerstag, 30. November 2017 12:22

>>>

>>>Since commit 8eafae209c35932d9a6560809c55ee4641534236

>>>my mx53cx9020_defconfig stopped booting.

>>>...

>>>It seems important to read into do_fat_read_at_block. If I

>>>allocate a temporary buffer and read into that one, I still

>>>can't boot. That seems strange to me as, I couldn't find

>>>any uninitialized access to do_fat_read_at_block anywhere.

>>>

>> I believe I finally tracked it down to the linker.

>>

>> Without accessing do_fat_read_at_block within fat.c, I find

>> .bss. do_fat_read_at_block as "discarded input section" in

>> my u-boot.map

>>

>> With a patch like this:

>> diff --git a/fs/fat/fat.c b/fs/fat/fat.c

>> index d16883fa10..8e70fdbc8d 100644

>> --- a/fs/fat/fat.c

>> +++ b/fs/fat/fat.c

>> @@ -1157,6 +1157,7 @@ int fat_opendir(const char *filename, struct

>fs_dir_stream **dirsp)

>>                 return -ENOMEM;

>>         memset(dir, 0, sizeof(*dir));

>>

>> +       if (do_fat_read_at_block[0])

>> +               printf("value doesn't matter, the access is important\n");

>> +

>>         ret = fat_itr_root(&dir->itr, &dir->fsdata);

>>         if (ret)

>>                 goto fail_free_dir;

>>

>> The section isn't discarded and my board boots again.

>>

>>

>> Another workaround would be:

>> diff --git a/arch/arm/config.mk b/arch/arm/config.mk

>> index 124b44a337..02f61fcc3c 100644

>> --- a/arch/arm/config.mk

>> +++ b/arch/arm/config.mk

>> @@ -17,7 +17,7 @@ CFLAGS_NON_EFI := -fno-pic -ffixed-r9 -ffunction-

>sections -fdata-sections

>>  CFLAGS_EFI := -fpic -fshort-wchar

>>

>>  LDFLAGS_FINAL += --gc-sections

>> -PLATFORM_RELFLAGS += -ffunction-sections -fdata-sections \

>> +PLATFORM_RELFLAGS += -ffunction-sections \

>>                      -fno-common -ffixed-r9

>>  PLATFORM_RELFLAGS += $(call cc-option, -msoft-float) \

>>        $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))

>>

>> Or a hack like:

>> diff --git a/arch/arm/cpu/u-boot.lds b/arch/arm/cpu/u-boot.lds

>> index 37d4c605ac..b916cbc733 100644

>> --- a/arch/arm/cpu/u-boot.lds

>> +++ b/arch/arm/cpu/u-boot.lds

>> @@ -219,6 +219,7 @@ SECTIONS

>>         }

>>

>>         .bss_end __bss_limit (OVERLAY) : {

>> +               KEEP(*(.bss.do_fat_read_at_block));

>>                 KEEP(*(.__bss_end));

>>         }

>>

>> I also tried to mark do_fat_read_at_block as:

>> volatile, __used and/or __attribute__((externally_visible))

>>

>> This doesn't help.

>>

>> Moving it into fat_write.c doesn't help either(with/without static)

>>

>> My toolchain is:

>> ~/u-boot$ arm-linux-gnueabihf-gcc -v

>> Using built-in specs.

>> COLLECT_GCC=arm-linux-gnueabihf-gcc

>> COLLECT_LTO_WRAPPER=/usr/lib/gcc-cross/arm-linux-gnueabihf/6/lto-

>wrapper

>> Target: arm-linux-gnueabihf

>> Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18' --

>with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-

>languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-

>suffix=-6 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --

>without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls

>--with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-

>libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-

>object --disable-libitm --disable-libquadmath --enable-plugin --enable-

>default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk

>--enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-armhf-

>cross/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-

>6-armhf-cross --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-armhf-

>cross --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar

>--disable-libgcj --with-target-system-zlib --enable-objc-gc=auto --enable-

>multiarch --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16

>--with-float=hard --with-mode=thumb --enable-checking=release --

>build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=arm-linux-

>gnueabihf --program-prefix=arm-linux-gnueabihf- --includedir=/usr/arm-

>linux-gnueabihf/include

>> Thread model: posix

>> gcc version 6.3.0 20170516 (Debian 6.3.0-18)

>>

>> ~/u-boot$ arm-linux-gnueabihf-ld -v

>> GNU ld (GNU Binutils for Debian) 2.28

>>

>>

>> Does anyone have an idea what else I can try to prevent my linker from

>removing

>> do_fat_read_at_block?

>

>This doesn't seem to be used unless you have enabled CONFIG_FAT_WRITE.

>Have you?

>

>I wonder if the problem is somewhere else, and the presence of this

>buffer is just moving things apart in the data space, and thus masking

>the problem?

>

Yes, I hope I finally found it. Please see the first patch of this series[1].
Don't using that global variable anymore fixes my problem, according to the
documentation it makes sense. Even if I am still not 100% sure. I just don't understand
why this thing was working for so long for at least two boards.
Does it make sense to you?

Thanks for your input,
Patrick

[1] https://www.mail-archive.com/u-boot@lists.denx.de/msg272444.html

Beckhoff Automation GmbH & Co. KG | Managing Director: Dipl. Phys. Hans Beckhoff
Registered office: Verl, Germany | Register court: Guetersloh HRA 7075
diff mbox series

Patch

diff --git a/fs/fat/fat.c b/fs/fat/fat.c
index bbba7947ee..b99ec85097 100644
--- a/fs/fat/fat.c
+++ b/fs/fat/fat.c
@@ -1035,6 +1035,9 @@  int file_fat_ls(const char *dir)
        int files = 0, dirs = 0;
        int ret;

+       //fix_fat
+       disk_read(0, PREFETCH_BLOCKS, do_fat_read_at_block);
+
        ret = fat_itr_root(itr, &fsdata);
        if (ret)
                return ret;

Since then file_fat_ls() has been moved/replaced with
fs_ls_generic(). So I had to adjust that workaround.
And added the dummy disk_read() into fs_opendir() callback:
diff --git a/fs/fat/fat.c b/fs/fat/fat.c
index 7fe78439cf..be6eaeb68e 100644
--- a/fs/fat/fat.c
+++ b/fs/fat/fat.c
@@ -1155,6 +1155,7 @@  int fat_opendir(const char *filename, struct fs_dir_stream **dirsp)
        if (!dir)
                return -ENOMEM;

+       disk_read(0, PREFETCH_BLOCKS, do_fat_read_at_block);
        ret = fat_itr_root(&dir->itr, &dir->fsdata);
        if (ret)
                goto fail_free_dir;

It seems important to read into do_fat_read_at_block. If I
allocate a temporary buffer and read into that one, I still
can't boot. That seems strange to me as, I couldn't find
any uninitialized access to do_fat_read_at_block anywhere.

Another workaround is to disable CMD_FAT:
diff --git a/configs/mx53cx9020_defconfig b/configs/mx53cx9020_defconfig
index b704f32fbf..2d560aabc1 100644
--- a/configs/mx53cx9020_defconfig
+++ b/configs/mx53cx9020_defconfig
@@ -15,7 +15,7 @@  CONFIG_CMD_MII=y
 CONFIG_CMD_PING=y
 CONFIG_CMD_EXT2=y
 CONFIG_CMD_EXT4=y
-CONFIG_CMD_FAT=y
+CONFIG_DOS_PARTITION=y
 CONFIG_CMD_FS_GENERIC=y
 CONFIG_OF_CONTROL=y
 CONFIG_ENV_IS_IN_MMC=y