Message ID | 56BD1768.2040601@wwwdotorg.org |
---|---|
State | RFC |
Delegated to: | Joe Hershberger |
Hi Stephen,

On 11 February 2016 at 16:21, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 12/13/2015 08:46 PM, Simon Glass wrote:
>>
>> Applied to u-boot-dm/next.
>
>
> I've found another strange problem, at least triggered/exposed by this
> patch:
>
> On at least either Dalmore or Jetson TK1, using USB Ethernet (hence with
> RTL8169 support disabled in include/configs/jetson-tk1.h[1]), I find that if
> I execute the following commands at or after this patch, then the system
> reboots during DHCP operation:
>
> save mmc 1:1 $loadaddr /dfu_dummy.bin 0x3c0
> usb start
> setenv autoload no
> dhcp
>
> yields:
>
> ====================
> Tegra124 (Jetson TK1) # dhcp
> Waiting for Ethernet connection... done.
> BOOTP broadcast 1
> DHCP client bound to address 10.20.204.50 (1 ms)
> data abort
> pc : [<fff6f1d4>]  lr : [<fff59fd0>]
> reloc pc : [<801291d4>]  lr : [<80113fd0>]
> sp : fda4e720  ip : 450088df  fp : fda60048
> r10: fffa4fe3  r9 : fda53ee0  r8 : 00000000
> r7 : 00000000  r6 : 00000000  r5 : 00000000  r4 : fda60048
> r3 : 00000383  r2 : 00000000  r1 : 00000000  r0 : e10f354b
> Flags: NzCv  IRQs off  FIQs off  Mode SVC_32
> Resetting CPU ...
>
> resetting ...
> ====================
>
> However, if I execute those commands at the commit before this patch, then
> everything is OK.
>
> The "save" command is definitely required to trigger this issue. The
> partition being saved to is a 1024MiB ext4 filesystem that's almost empty.
> If I omit the save, or save to a 1024MiB FAT filesystem instead, there's no
> error. This leads me to suspect some kind of memory corruption rather than a
> direct problem with this patch. Due to ext4 interaction, also CCing Łukasz
> in case he has any quick ideas.
>
> I'll go track down the PC where the error occurs and try and add some debug
> spew etc. to see what's up. Any other ideas appreciated though.
>
> (This is the problem with writing test systems; they show up bugs!)
>
>
>
> [1] i.e. I have this change made locally so that no PCIe Ethernet device
> exists, which causes the USB Ethernet to be used by default:
>
> diff --git a/include/configs/jetson-tk1.h b/include/configs/jetson-tk1.h
> index 23b2e436167c..af26b055b70b 100644
> --- a/include/configs/jetson-tk1.h
> +++ b/include/configs/jetson-tk1.h
> @@ -62,7 +62,6 @@
>  #define CONFIG_CMD_PCI
>
>  /* PCI networking support */
> -#define CONFIG_RTL8169
>
>  /* General networking support */
>  #define CONFIG_CMD_DHCP
>

I have seen some odd things on Seaboard which is a bit more forgiving
with RAM at 0 (although not Jetson-TK1, right?). It can sometimes mask
an issue where a driver has not auto-allocated space, but the driver
is using it anyway (i.e. using a NULL pointer as its private data).

I can't see anything wrong from inspection. The per-child data (struct
usb_device) appears to be cache-aligned as expected.

Which USB Ethernet driver are you using?

It would be great to fix this.

Regards,
Simon
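
For anyone tracking down the faulting PC from the dump above, here is a minimal sketch of the address arithmetic. It assumes that the "reloc pc" and "reloc ... lr" values are the pre-relocation (link-time) addresses, so the difference from the run-time pc is the distance U-Boot relocated itself. The function names are made up for illustration; only the four hex values come from the report.

```python
# Values copied from the data abort dump in the report.
PC = 0xFFF6F1D4        # run-time program counter
RELOC_PC = 0x801291D4  # same location, link-time address (assumed meaning of "reloc pc")
LR = 0xFFF59FD0        # run-time link register
RELOC_LR = 0x80113FD0  # same location, link-time address


def reloc_offset(runtime_addr: int, link_addr: int) -> int:
    """Return how far the image moved: run-time address minus link-time address."""
    return runtime_addr - link_addr


def to_link_addr(runtime_addr: int, offset: int) -> int:
    """Map a run-time code address back to the address used in the ELF image."""
    return runtime_addr - offset


offset = reloc_offset(PC, RELOC_PC)
print(f"relocation offset: {offset:#x}")
print(f"lr maps back to:   {to_link_addr(LR, offset):#x}")
```

Both pairs give the same offset, so the dump is self-consistent. The link-time addresses (0x801291d4 and 0x80113fd0) are the ones to look up in System.map or with a tool such as addr2line against the u-boot ELF file from the same build.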
On 02/11/2016 05:10 PM, Simon Glass wrote:
> Hi Stephen,
>
> On 11 February 2016 at 16:21, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 12/13/2015 08:46 PM, Simon Glass wrote:
>>>
>>> Applied to u-boot-dm/next.
>>
>>
>> I've found another strange problem, at least triggered/exposed by this
>> patch:
>>
>> On at least either Dalmore or Jetson TK1, using USB Ethernet (hence with
>> RTL8169 support disabled in include/configs/jetson-tk1.h[1]), I find that if
>> I execute the following commands at or after this patch, then the system
>> reboots during DHCP operation:
>>
>> save mmc 1:1 $loadaddr /dfu_dummy.bin 0x3c0
>> usb start
>> setenv autoload no
>> dhcp
>>
>> yields:
>>
>> ====================
>> Tegra124 (Jetson TK1) # dhcp
>> Waiting for Ethernet connection... done.
>> BOOTP broadcast 1
>> DHCP client bound to address 10.20.204.50 (1 ms)
>> data abort
>> pc : [<fff6f1d4>]  lr : [<fff59fd0>]
>> reloc pc : [<801291d4>]  lr : [<80113fd0>]
>> sp : fda4e720  ip : 450088df  fp : fda60048
>> r10: fffa4fe3  r9 : fda53ee0  r8 : 00000000
>> r7 : 00000000  r6 : 00000000  r5 : 00000000  r4 : fda60048
>> r3 : 00000383  r2 : 00000000  r1 : 00000000  r0 : e10f354b
>> Flags: NzCv  IRQs off  FIQs off  Mode SVC_32
>> Resetting CPU ...
>>
>> resetting ...
>> ====================
>>
>> However, if I execute those commands at the commit before this patch, then
>> everything is OK.
>>
>> The "save" command is definitely required to trigger this issue. The
>> partition being saved to is a 1024MiB ext4 filesystem that's almost empty.
>> If I omit the save, or save to a 1024MiB FAT filesystem instead, there's no
>> error. This leads me to suspect some kind of memory corruption rather than a
>> direct problem with this patch. Due to ext4 interaction, also CCing Łukasz
>> in case he has any quick ideas.
>>
>> I'll go track down the PC where the error occurs and try and add some debug
>> spew etc. to see what's up. Any other ideas appreciated though.
>>
>> (This is the problem with writing test systems; they show up bugs!)
>>
>>
>>
>> [1] i.e. I have this change made locally so that no PCIe Ethernet device
>> exists, which causes the USB Ethernet to be used by default:
>>
>> diff --git a/include/configs/jetson-tk1.h b/include/configs/jetson-tk1.h
>> index 23b2e436167c..af26b055b70b 100644
>> --- a/include/configs/jetson-tk1.h
>> +++ b/include/configs/jetson-tk1.h
>> @@ -62,7 +62,6 @@
>>  #define CONFIG_CMD_PCI
>>
>>  /* PCI networking support */
>> -#define CONFIG_RTL8169
>>
>>  /* General networking support */
>>  #define CONFIG_CMD_DHCP
>>
>
> I have seen some odd things on Seaboard which is a bit more forgiving
> with RAM at 0 (although not Jetson-TK1, right?). It can sometimes mask
> an issue where a driver has not auto-allocated space, but the driver
> is using it anyway (i.e. using a NULL pointer as its private data).
>
> I can't see anything wrong from inspection. The per-child data (struct
> usb_device) appears to be cache-aligned as expected.
>
> Which USB Ethernet driver are you using?

asix.

It looks like something inside the dhcp command is trashing something in
the hush shell "pipe" state...

BTW, can sandbox support USB Ethernet and MMC (or perhaps any block
device that I can "save" on)? I'd be curious if the problem could repro
there, what with its better debugging tools. However, if the issue is
fluky malloc heap layout due to command history, I guess we'd have to be
pretty lucky...
Hi Stephen,

On 11 February 2016 at 17:45, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 02/11/2016 05:10 PM, Simon Glass wrote:
>>
>> Hi Stephen,
>>
>> On 11 February 2016 at 16:21, Stephen Warren <swarren@wwwdotorg.org>
>> wrote:
>>>
>>> On 12/13/2015 08:46 PM, Simon Glass wrote:
>>>>
>>>>
>>>> Applied to u-boot-dm/next.
>>>
>>>
>>>
>>> I've found another strange problem, at least triggered/exposed by this
>>> patch:
>>>
>>> On at least either Dalmore or Jetson TK1, using USB Ethernet (hence with
>>> RTL8169 support disabled in include/configs/jetson-tk1.h[1]), I find that
>>> if
>>> I execute the following commands at or after this patch, then the system
>>> reboots during DHCP operation:
>>>
>>> save mmc 1:1 $loadaddr /dfu_dummy.bin 0x3c0
>>> usb start
>>> setenv autoload no
>>> dhcp
>>>
>>> yields:
>>>
>>> ====================
>>> Tegra124 (Jetson TK1) # dhcp
>>> Waiting for Ethernet connection... done.
>>> BOOTP broadcast 1
>>> DHCP client bound to address 10.20.204.50 (1 ms)
>>> data abort
>>> pc : [<fff6f1d4>]  lr : [<fff59fd0>]
>>> reloc pc : [<801291d4>]  lr : [<80113fd0>]
>>> sp : fda4e720  ip : 450088df  fp : fda60048
>>> r10: fffa4fe3  r9 : fda53ee0  r8 : 00000000
>>> r7 : 00000000  r6 : 00000000  r5 : 00000000  r4 : fda60048
>>> r3 : 00000383  r2 : 00000000  r1 : 00000000  r0 : e10f354b
>>> Flags: NzCv  IRQs off  FIQs off  Mode SVC_32
>>> Resetting CPU ...
>>>
>>> resetting ...
>>> ====================
>>>
>>> However, if I execute those commands at the commit before this patch,
>>> then
>>> everything is OK.
>>>
>>> The "save" command is definitely required to trigger this issue. The
>>> partition being saved to is a 1024MiB ext4 filesystem that's almost
>>> empty.
>>> If I omit the save, or save to a 1024MiB FAT filesystem instead, there's
>>> no
>>> error. This leads me to suspect some kind of memory corruption rather
>>> than a
>>> direct problem with this patch. Due to ext4 interaction, also CCing
>>> Łukasz
>>> in case he has any quick ideas.
>>>
>>> I'll go track down the PC where the error occurs and try and add some
>>> debug
>>> spew etc. to see what's up. Any other ideas appreciated though.
>>>
>>> (This is the problem with writing test systems; they show up bugs!)
>>>
>>>
>>>
>>> [1] i.e. I have this change made locally so that no PCIe Ethernet device
>>> exists, which causes the USB Ethernet to be used by default:
>>>
>>> diff --git a/include/configs/jetson-tk1.h b/include/configs/jetson-tk1.h
>>> index 23b2e436167c..af26b055b70b 100644
>>> --- a/include/configs/jetson-tk1.h
>>> +++ b/include/configs/jetson-tk1.h
>>> @@ -62,7 +62,6 @@
>>>  #define CONFIG_CMD_PCI
>>>
>>>  /* PCI networking support */
>>> -#define CONFIG_RTL8169
>>>
>>>  /* General networking support */
>>>  #define CONFIG_CMD_DHCP
>>>
>>
>> I have seen some odd things on Seaboard which is a bit more forgiving
>> with RAM at 0 (although not Jetson-TK1, right?). It can sometimes mask
>> an issue where a driver has not auto-allocated space, but the driver
>> is using it anyway (i.e. using a NULL pointer as its private data).
>>
>> I can't see anything wrong from inspection. The per-child data (struct
>> usb_device) appears to be cache-aligned as expected.
>>
>> Which USB Ethernet driver are you using?
>
>
> asix.
>
> It looks like something inside the dhcp command is trashing something in the
> hush shell "pipe" state...

Wow, that's novel.

>
> BTW, can sandbox support USB Ethernet and MMC (or perhaps any block device
> that I can "save" on)? I'd be curious if the problem could repro there, what
> with its better debugging tools. However, if the issue is fluky malloc heap
> layout due to command history, I guess we'd have to be pretty lucky...

It supports USB keyboard and Flash. You could probably add a simulated
USB Ethernet without much trouble. See drivers/usb/emul.

It supports MMC at the top level but there is no back-end. As it
happens I'm working on driver model blk support but it isn't ready.

The USB flash device only supports read at present, but I suppose
adding it wouldn't be hard. But I'm not sure savenv supports USB
devices. You could use SPI flash but it isn't a block device. Hmmm...

Regards,
Simon