
[U-Boot,v2,16/26] dm: tegra: net: Convert tegra boards to driver model for Ethernet

Message ID 56BD1768.2040601@wwwdotorg.org
State RFC
Delegated to: Joe Hershberger

Commit Message

Stephen Warren Feb. 11, 2016, 11:21 p.m. UTC
On 12/13/2015 08:46 PM, Simon Glass wrote:
> Applied to u-boot-dm/next.

I've found another strange problem, at least triggered/exposed by this 
patch:

On either Dalmore or Jetson TK1 (at least), using USB Ethernet (hence with 
RTL8169 support disabled in include/configs/jetson-tk1.h[1]), I find 
that if I execute the following commands at or after this patch, then 
the system reboots during DHCP operation:

save mmc 1:1 $loadaddr /dfu_dummy.bin 0x3c0
usb start
setenv autoload no
dhcp

yields:

Comments

Simon Glass Feb. 12, 2016, 12:10 a.m. UTC | #1
Hi Stephen,

On 11 February 2016 at 16:21, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 12/13/2015 08:46 PM, Simon Glass wrote:
>>
>> Applied to u-boot-dm/next.
>
>
> I've found another strange problem, at least triggered/exposed by this
> patch:
>
> On at least either Dalmore or Jetson TK1, using USB Ethernet (hence with
> RTL8169 support disabled in include/configs/jetson-tk1.h[1]), I find that if
> I execute the following commands at or after this patch, then the system
> reboots during DHCP operation:
>
> save mmc 1:1 $loadaddr /dfu_dummy.bin 0x3c0
> usb start
> setenv autoload no
> dhcp
>
> yields:
>
> ====================
> Tegra124 (Jetson TK1) # dhcp
> Waiting for Ethernet connection... done.
> BOOTP broadcast 1
> DHCP client bound to address 10.20.204.50 (1 ms)
> data abort
> pc : [<fff6f1d4>]          lr : [<fff59fd0>]
> reloc pc : [<801291d4>]    lr : [<80113fd0>]
> sp : fda4e720  ip : 450088df     fp : fda60048
> r10: fffa4fe3  r9 : fda53ee0     r8 : 00000000
> r7 : 00000000  r6 : 00000000     r5 : 00000000  r4 : fda60048
> r3 : 00000383  r2 : 00000000     r1 : 00000000  r0 : e10f354b
> Flags: NzCv  IRQs off  FIQs off  Mode SVC_32
> Resetting CPU ...
>
> resetting ...
> ====================
>
> However, if I execute those commands at the commit before this patch, then
> everything is OK.
>
> The "save" command is definitely required to trigger this issue. The
> partition being saved to is a 1024MiB ext4 filesystem that's almost empty.
> If I omit the save, or save to a 1024MiB FAT filesystem instead, there's no
> error. This leads me to suspect some kind of memory corruption rather than a
> direct problem with this patch. Due to ext4 interaction, also CCing Łukasz
> in case he has any quick ideas.
>
> I'll go track down the PC where the error occurs and try and add some debug
> spew etc. to see what's up. Any other ideas appreciated though.
>
> (This is the problem with writing test systems; they show up bugs!)
>
>
>
> [1] i.e. I have this change made locally so that no PCIe Ethernet device
> exists, which causes the USB Ethernet to be used by default:
>
> diff --git a/include/configs/jetson-tk1.h b/include/configs/jetson-tk1.h
> index 23b2e436167c..af26b055b70b 100644
> --- a/include/configs/jetson-tk1.h
> +++ b/include/configs/jetson-tk1.h
> @@ -62,7 +62,6 @@
>  #define CONFIG_CMD_PCI
>
>  /* PCI networking support */
> -#define CONFIG_RTL8169
>
>  /* General networking support */
>  #define CONFIG_CMD_DHCP
>

I have seen some odd things on Seaboard, which is a bit more forgiving
because its RAM starts at address 0 (although that is not the case on
Jetson-TK1, right?). That can sometimes mask an issue where a driver has
not auto-allocated its private data but uses it anyway (i.e. it ends up
using a NULL pointer as its private data).
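
The failure mode described above can be sketched in plain C. This is only
an illustration under stated assumptions: struct fake_driver, struct
fake_device, fake_probe() and fake_get_priv_checked() are hypothetical
stand-ins and are not U-Boot's driver-model types. In U-Boot of this
period the equivalent pieces would be the driver's priv_auto_alloc_size
field and dev_get_priv(). If the size is left at zero, the private
pointer stays NULL; on a board with RAM at address 0 a write through it
may go unnoticed, while elsewhere it would more likely fault.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver description (not U-Boot's struct driver). */
struct fake_driver {
	const char *name;
	size_t priv_size;	/* plays the role of priv_auto_alloc_size */
};

/* Hypothetical stand-in for a bound device (not U-Boot's struct udevice). */
struct fake_device {
	const struct fake_driver *drv;
	void *priv;
};

/*
 * Bind a device to a driver and allocate zeroed private data only if the
 * driver asked for some. Returns 0 on success, -1 if allocation fails.
 */
static int fake_probe(struct fake_device *dev, const struct fake_driver *drv)
{
	dev->drv = drv;
	dev->priv = NULL;
	if (drv->priv_size) {
		dev->priv = calloc(1, drv->priv_size);
		if (!dev->priv)
			return -1;
	}
	return 0;
}

/*
 * Checked accessor: reports a missing allocation instead of silently
 * handing a NULL pointer to the caller.
 */
static void *fake_get_priv_checked(const struct fake_device *dev)
{
	if (!dev->priv)
		fprintf(stderr, "%s: no private data allocated\n",
			dev->drv->name);
	return dev->priv;
}
```

A temporary check of this kind in the suspect driver's accessor path is
one cheap way to rule the NULL-private-data case in or out.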

I can't see anything wrong from inspection. The per-child data (struct
usb_device) appears to be cache-aligned as expected.

Which USB Ethernet driver are you using?

It would be great to fix this.

Regards,
Simon

Stephen Warren Feb. 12, 2016, 12:45 a.m. UTC | #2
On 02/11/2016 05:10 PM, Simon Glass wrote:
> Hi Stephen,
>
>
> I have seen some odd things on Seaboard which is a bit more forgiving
> with RAM at 0 (although not Jetson-TK1, right?). It can sometimes mask
> an issue where a driver has not auto-allocated space, but the driver
> is using it anyway (i.e. using a NULL pointer as its private data).
>
> I can't see anything wrong from inspection. The per-child data (struct
> usb_device) appears to be cache-aligned as expected.
>
> Which USB Ethernet driver are you using?

asix.

It looks like something inside the dhcp command is trashing something in 
the hush shell "pipe" state...
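
One way to narrow down which step trashes a given structure is to
checksum the suspect region before an operation and compare afterwards.
The following is a minimal sketch, not code from U-Boot: struct watch,
watch_arm() and watch_check() are hypothetical helper names, and the
address and length of the region to watch (for example the hush state in
question) would have to be supplied by hand.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simple rolling checksum over a memory region. */
static uint32_t region_sum(const void *p, size_t len)
{
	const unsigned char *b = p;
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = sum * 31u + b[i];
	return sum;
}

/* Hypothetical record of a watched region and its last known checksum. */
struct watch {
	const char *label;
	const void *addr;
	size_t len;
	uint32_t sum;
};

/* Record the current contents of the region. */
static void watch_arm(struct watch *w, const char *label,
		      const void *addr, size_t len)
{
	w->label = label;
	w->addr = addr;
	w->len = len;
	w->sum = region_sum(addr, len);
}

/* Return 1 (and print a message) if the region changed since watch_arm(). */
static int watch_check(const struct watch *w, const char *where)
{
	if (region_sum(w->addr, w->len) == w->sum)
		return 0;
	printf("%s: region '%s' changed\n", where, w->label);
	return 1;
}
```

Arming the watch before the dhcp command runs and calling watch_check()
at a few points inside the network path would bracket the offending
write, provided the region is not legitimately modified in between.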

BTW, can sandbox support USB Ethernet and MMC (or perhaps any block 
device that I can "save" on)? I'd be curious if the problem could repro 
there, what with its better debugging tools. However, if the issue is 
fluky malloc heap layout due to command history, I guess we'd have to be 
pretty lucky...

Simon Glass Feb. 12, 2016, 8:04 p.m. UTC | #3
Hi Stephen,

On 11 February 2016 at 17:45, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 02/11/2016 05:10 PM, Simon Glass wrote:
>>
>> Hi Stephen,
>>
>>
>> I have seen some odd things on Seaboard which is a bit more forgiving
>> with RAM at 0 (although not Jetson-TK1, right?). It can sometimes mask
>> an issue where a driver has not auto-allocated space, but the driver
>> is using it anyway (i.e. using a NULL pointer as its private data).
>>
>> I can't see anything wrong from inspection. The per-child data (struct
>> usb_device) appears to be cache-aligned as expected.
>>
>> Which USB Ethernet driver are you using?
>
>
> asix.
>
> It looks like something inside the dhcp command is trashing something in the
> hush shell "pipe" state...

Wow, that's novel.

>
> BTW, can sandbox support USB Ethernet and MMC (or perhaps any block device
> that I can "save" on)? I'd be curious if the problem could repro there, what
> with its better debugging tools. However, if the issue is fluky malloc heap
> layout due to command history, I guess we'd have to be pretty lucky...

It supports USB keyboard and Flash. You could probably add a simulated
USB Ethernet without much trouble. See drivers/usb/emul.

It supports MMC at the top level but there is no back-end. As it
happens I'm working on driver model blk support but it isn't ready.
The USB flash device only supports read at present, but I suppose
adding write support wouldn't be hard. But I'm not sure savenv supports
USB devices. You could use SPI flash but it isn't a block device.

Hmmm...

Regards,
Simon

Patch

====================
Tegra124 (Jetson TK1) # dhcp
Waiting for Ethernet connection... done.
BOOTP broadcast 1
DHCP client bound to address 10.20.204.50 (1 ms)
data abort
pc : [<fff6f1d4>]	   lr : [<fff59fd0>]
reloc pc : [<801291d4>]	   lr : [<80113fd0>]
sp : fda4e720  ip : 450088df	 fp : fda60048
r10: fffa4fe3  r9 : fda53ee0	 r8 : 00000000
r7 : 00000000  r6 : 00000000	 r5 : 00000000  r4 : fda60048
r3 : 00000383  r2 : 00000000	 r1 : 00000000  r0 : e10f354b
Flags: NzCv  IRQs off  FIQs off  Mode SVC_32
Resetting CPU ...

resetting ...
====================

However, if I execute those commands at the commit before this patch, 
then everything is OK.

The "save" command is definitely required to trigger this issue. The 
partition being saved to is a 1024MiB ext4 filesystem that's almost 
empty. If I omit the save, or save to a 1024MiB FAT filesystem instead, 
there's no error. This leads me to suspect some kind of memory 
corruption rather than a direct problem with this patch. Due to ext4 
interaction, also CCing Łukasz in case he has any quick ideas.

I'll go track down the PC where the error occurs and try and add some 
debug spew etc. to see what's up. Any other ideas appreciated though.
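
For locating the faulting PC, the abort dump already gives both the
run-time address (pc) and the pre-relocation one (reloc pc); the
difference between them is the relocation offset, and it is the same for
lr. The small sketch below just redoes that arithmetic with the values
from the dump above. The helper names are made up for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Offset between a run-time (relocated) address and its link-time address. */
static uint32_t reloc_offset(uint32_t runtime_addr, uint32_t linktime_addr)
{
	return runtime_addr - linktime_addr;
}

/* Map a run-time address back to the address used in the linked image. */
static uint32_t to_linktime(uint32_t runtime_addr, uint32_t offset)
{
	return runtime_addr - offset;
}

/* Print one register value in both run-time and link-time form. */
static void print_linktime(const char *label, uint32_t runtime_addr,
			   uint32_t offset)
{
	printf("%s: runtime 0x%08" PRIx32 " -> link-time 0x%08" PRIx32 "\n",
	       label, runtime_addr, to_linktime(runtime_addr, offset));
}
```

With pc = 0xfff6f1d4 and reloc pc = 0x801291d4 the offset works out to
0x7fe46000, and applying it to lr = 0xfff59fd0 gives 0x80113fd0, which
matches the dump. The link-time values are the ones to look up in
u-boot.map, or with something like "addr2line -e u-boot 0x801291d4" from
the matching cross toolchain (the exact tool prefix depends on the
toolchain and is an assumption here).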

(This is the problem with writing test systems; they show up bugs!)



[1] i.e. I have this change made locally so that no PCIe Ethernet device 
exists, which causes the USB Ethernet to be used by default:

diff --git a/include/configs/jetson-tk1.h b/include/configs/jetson-tk1.h
index 23b2e436167c..af26b055b70b 100644
--- a/include/configs/jetson-tk1.h
+++ b/include/configs/jetson-tk1.h
@@ -62,7 +62,6 @@ 
  #define CONFIG_CMD_PCI

  /* PCI networking support */
-#define CONFIG_RTL8169

  /* General networking support */
  #define CONFIG_CMD_DHCP