diff mbox

[U-Boot] Revert "fdt: Fix fdtdec_get_addr_size() for 64-bit"

Message ID 1438560830-31221-1-git-send-email-sjg@chromium.org
State Accepted
Delegated to: Simon Glass
Headers show

Commit Message

Simon Glass Aug. 3, 2015, 12:13 a.m. UTC
This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.

This function has a few problems. It calls fdt_parent_offset() which as
mentioned in code review is very slow.

https://patchwork.ozlabs.org/patch/499482/
https://patchwork.ozlabs.org/patch/452604/

It also happens to break SPI flash on Minnowboard max which is how I noticed
that this was applied. I can send a patch to tidy that up, but in any case
I think we should consider a revert until the function is better implemented.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 lib/fdtdec.c | 56 ++++++++++++++++++++------------------------------------
 1 file changed, 20 insertions(+), 36 deletions(-)

Comments

Stephen Warren Aug. 3, 2015, 3:12 p.m. UTC | #1
On 08/02/2015 06:13 PM, Simon Glass wrote:
> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>
> This function has a few problems. It calls fdt_parent_offset() which as
> mentioned in code review is very slow.
>
> https://patchwork.ozlabs.org/patch/499482/
> https://patchwork.ozlabs.org/patch/452604/
>
> It also happens to break SPI flash on Minnowboard max which is how I noticed
> that this was applied. I can send a patch to tidy that up, but in any case
> I think we should consider a revert until the function is better implemented.

The fact that the function needs to perform a slow operation is not a 
good reason for a revert. The slowness of the operation is just a matter 
of fact with DT not having uplinks in its data structure, and U-Boot 
using those data structures directly.

You'd requested during review that I additionally implement a faster 
version of the function in the case where the parent node is already 
known, and said it was fine if that happened in a later patch. I have 
this on my TODO list, but it's only been a couple of days.

Why not just fix the bug since you said you could? That seems far better 
than breaking the newly added Tegra210 support.

P.S. What is the issue with SPI flash? The commit description doesn't 
mention this at all.
Bin Meng Aug. 3, 2015, 3:40 p.m. UTC | #2
Hi Simon,

On Mon, Aug 3, 2015 at 8:13 AM, Simon Glass <sjg@chromium.org> wrote:
> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>
> This function has a few problems. It calls fdt_parent_offset() which as
> mentioned in code review is very slow.
>
> https://patchwork.ozlabs.org/patch/499482/
> https://patchwork.ozlabs.org/patch/452604/
>
> It also happens to break SPI flash on Minnowboard max which is how I noticed
> that this was applied. I can send a patch to tidy that up, but in any case
> I think we should consider a revert until the function is better implemented.
>

Ah, today I rebased my working branch on top of u-boot-x86/master and
also noticed that SPI flash on Intel Bayley Bay is no longer working.
I haven't got time today to trace the broken commit, but looks that
you have caught it.

I will try this revert commit tomorrow.

> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  lib/fdtdec.c | 56 ++++++++++++++++++++------------------------------------
>  1 file changed, 20 insertions(+), 36 deletions(-)
>
> diff --git a/lib/fdtdec.c b/lib/fdtdec.c
> index a954051..aac4f8d 100644
> --- a/lib/fdtdec.c
> +++ b/lib/fdtdec.c
> @@ -90,45 +90,29 @@ const char *fdtdec_get_compatible(enum fdt_compat_id id)
>  fdt_addr_t fdtdec_get_addr_size(const void *blob, int node,
>                 const char *prop_name, fdt_size_t *sizep)
>  {
> -       const fdt32_t *ptr, *end;
> -       int parent, na, ns, len;
> -       fdt_addr_t addr;
> +       const fdt_addr_t *cell;
> +       int len;
>
>         debug("%s: %s: ", __func__, prop_name);
> -
> -       parent = fdt_parent_offset(blob, node);
> -       if (parent < 0) {
> -               debug("(no parent found)\n");
> -               return FDT_ADDR_T_NONE;
> -       }
> -
> -       na = fdt_address_cells(blob, parent);
> -       ns = fdt_size_cells(blob, parent);
> -
> -       ptr = fdt_getprop(blob, node, prop_name, &len);
> -       if (!ptr) {
> -               debug("(not found)\n");
> -               return FDT_ADDR_T_NONE;
> -       }
> -
> -       end = ptr + len / sizeof(*ptr);
> -
> -       if (ptr + na + ns > end) {
> -               debug("(not enough data: expected %d bytes, got %d bytes)\n",
> -                     (na + ns) * 4, len);
> -               return FDT_ADDR_T_NONE;
> -       }
> -
> -       addr = fdtdec_get_number(ptr, na);
> -
> -       if (sizep) {
> -               *sizep = fdtdec_get_number(ptr + na, ns);
> -               debug("addr=%pa, size=%pa\n", &addr, sizep);
> -       } else {
> -               debug("%pa\n", &addr);
> +       cell = fdt_getprop(blob, node, prop_name, &len);
> +       if (cell && ((!sizep && len == sizeof(fdt_addr_t)) ||
> +                    len == sizeof(fdt_addr_t) * 2)) {
> +               fdt_addr_t addr = fdt_addr_to_cpu(*cell);
> +               if (sizep) {
> +                       const fdt_size_t *size;
> +
> +                       size = (fdt_size_t *)((char *)cell +
> +                                       sizeof(fdt_addr_t));
> +                       *sizep = fdt_size_to_cpu(*size);
> +                       debug("addr=%08lx, size=%llx\n",
> +                             (ulong)addr, (u64)*sizep);
> +               } else {
> +                       debug("%08lx\n", (ulong)addr);
> +               }
> +               return addr;
>         }
> -
> -       return addr;
> +       debug("(not found)\n");
> +       return FDT_ADDR_T_NONE;
>  }
>
>  fdt_addr_t fdtdec_get_addr(const void *blob, int node,
> --

Regards,
Bin
Simon Glass Aug. 3, 2015, 3:52 p.m. UTC | #3
Hi Stephen,

On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>
>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>
>> This function has a few problems. It calls fdt_parent_offset() which as
>> mentioned in code review is very slow.
>>
>> https://patchwork.ozlabs.org/patch/499482/
>> https://patchwork.ozlabs.org/patch/452604/
>>
>> It also happens to break SPI flash on Minnowboard max which is how I
>> noticed
>> that this was applied. I can send a patch to tidy that up, but in any case
>> I think we should consider a revert until the function is better
>> implemented.
>
>
> The fact that the function needs to perform a slow operation is not a good
> reason for a revert. The slowness of the operation is just a matter of fact
> with DT not having uplinks in its data structure, and U-Boot using those
> data structures directly.
>
> You'd requested during review that I additionally implement a faster version
> of the function in the case where the parent node is already known, and said
> it was fine if that happened in a later patch. I have this on my TODO list,
> but it's only been a couple of days.

I didn't expect this to go to mainline before your new patch.

>
> Why not just fix the bug since you said you could? That seems far better
> than breaking the newly added Tegra210 support.

I do have a patch but I'm not 100% comfortable with the approach. I
don't see a good reason to move away from the idea of fdt_addr_t and
fdt_addr_t being set correctly for the platform. Or maybe I
misunderstand the problem the patch was trying to fix. As I said it
did not have a commit message, so who knows :-)

>
> P.S. What is the issue with SPI flash? The commit description doesn't
> mention this at all.

It calls that function expecting it to pick up an address and size
from two consecutive cells. With this patch, that fails (unless the
property happens to be "reg").

Actually I do favour a revert as it will allow us to discuss the fix
with a clean slate.

Regards,
Simon
Tom Rini Aug. 3, 2015, 5:25 p.m. UTC | #4
On Mon, Aug 03, 2015 at 09:52:50AM -0600, Simon Glass wrote:
> Hi Stephen,
> 
> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
> > On 08/02/2015 06:13 PM, Simon Glass wrote:
> >>
> >> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
> >>
> >> This function has a few problems. It calls fdt_parent_offset() which as
> >> mentioned in code review is very slow.
> >>
> >> https://patchwork.ozlabs.org/patch/499482/
> >> https://patchwork.ozlabs.org/patch/452604/
> >>
> >> It also happens to break SPI flash on Minnowboard max which is how I
> >> noticed
> >> that this was applied. I can send a patch to tidy that up, but in any case
> >> I think we should consider a revert until the function is better
> >> implemented.
> >
> >
> > The fact that the function needs to perform a slow operation is not a good
> > reason for a revert. The slowness of the operation is just a matter of fact
> > with DT not having uplinks in its data structure, and U-Boot using those
> > data structures directly.
> >
> > You'd requested during review that I additionally implement a faster version
> > of the function in the case where the parent node is already known, and said
> > it was fine if that happened in a later patch. I have this on my TODO list,
> > but it's only been a couple of days.
> 
> I didn't expect this to go to mainline before your new patch.
> 
> >
> > Why not just fix the bug since you said you could? That seems far better
> > than breaking the newly added Tegra210 support.
> 
> I do have a patch but I'm not 100% comfortable with the approach. I
> don't see a good reason to move away from the idea of fdt_addr_t and
> fdt_addr_t being set correctly for the platform. Or maybe I
> misunderstand the problem the patch was trying to fix. As I said it
> did not have a commit message, so who knows :-)
> 
> >
> > P.S. What is the issue with SPI flash? The commit description doesn't
> > mention this at all.
> 
> It calls that function expecting it to pick up an address and size
> from two consecutive cells. With this patch, that fails (unless the
> property happens to be "reg").
> 
> Actually I do favour a revert as it will allow us to discuss the fix
> with a clean slate.

Well, we can unbreak one set of stuff and break another set of stuff,
that's not a clean slate for the now broken platforms.  If it helps to
come up with a better patch we can step one revert and step two better,
but I don't want to apply without step two ready.

... speaking as someone who has people poking me about if the SPI
flasher they sent me works so I can try U-Boot on x86 things.
Simon Glass Aug. 3, 2015, 5:27 p.m. UTC | #5
Hi Tom.

On 3 August 2015 at 11:25, Tom Rini <trini@konsulko.com> wrote:
> On Mon, Aug 03, 2015 at 09:52:50AM -0600, Simon Glass wrote:
>> Hi Stephen,
>>
>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> > On 08/02/2015 06:13 PM, Simon Glass wrote:
>> >>
>> >> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>> >>
>> >> This function has a few problems. It calls fdt_parent_offset() which as
>> >> mentioned in code review is very slow.
>> >>
>> >> https://patchwork.ozlabs.org/patch/499482/
>> >> https://patchwork.ozlabs.org/patch/452604/
>> >>
>> >> It also happens to break SPI flash on Minnowboard max which is how I
>> >> noticed
>> >> that this was applied. I can send a patch to tidy that up, but in any case
>> >> I think we should consider a revert until the function is better
>> >> implemented.
>> >
>> >
>> > The fact that the function needs to perform a slow operation is not a good
>> > reason for a revert. The slowness of the operation is just a matter of fact
>> > with DT not having uplinks in its data structure, and U-Boot using those
>> > data structures directly.
>> >
>> > You'd requested during review that I additionally implement a faster version
>> > of the function in the case where the parent node is already known, and said
>> > it was fine if that happened in a later patch. I have this on my TODO list,
>> > but it's only been a couple of days.
>>
>> I didn't expect this to go to mainline before your new patch.
>>
>> >
>> > Why not just fix the bug since you said you could? That seems far better
>> > than breaking the newly added Tegra210 support.
>>
>> I do have a patch but I'm not 100% comfortable with the approach. I
>> don't see a good reason to move away from the idea of fdt_addr_t and
>> fdt_addr_t being set correctly for the platform. Or maybe I
>> misunderstand the problem the patch was trying to fix. As I said it
>> did not have a commit message, so who knows :-)
>>
>> >
>> > P.S. What is the issue with SPI flash? The commit description doesn't
>> > mention this at all.
>>
>> It calls that function expecting it to pick up an address and size
>> from two consecutive cells. With this patch, that fails (unless the
>> property happens to be "reg").
>>
>> Actually I do favour a revert as it will allow us to discuss the fix
>> with a clean slate.
>
> Well, we can unbreak one set of stuff and break another set of stuff,
> that's not a clean slate for the now broken platforms.  If it helps to
> come up with a better patch we can step one revert and step two better,
> but I don't want to apply without step two ready.
>
> ... speaking as someone who has people poking me about if the SPI
> flasher they sent me works so I can try U-Boot on x86 things.

Fair enough - I don't want to break things either. Perhaps we can
apply a revert and the new patch one after the other. But really
however we can fix this is fine with me.

Regards,
Simon
Stephen Warren Aug. 3, 2015, 6:20 p.m. UTC | #6
On 08/03/2015 09:52 AM, Simon Glass wrote:
> Hi Stephen,
>
> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>
>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>
>>> This function has a few problems. It calls fdt_parent_offset() which as
>>> mentioned in code review is very slow.
>>>
>>> https://patchwork.ozlabs.org/patch/499482/
>>> https://patchwork.ozlabs.org/patch/452604/
>>>
>>> It also happens to break SPI flash on Minnowboard max which is how I
>>> noticed
>>> that this was applied. I can send a patch to tidy that up, but in any case
>>> I think we should consider a revert until the function is better
>>> implemented.
>>
>>
>> The fact that the function needs to perform a slow operation is not a good
>> reason for a revert. The slowness of the operation is just a matter of fact
>> with DT not having uplinks in its data structure, and U-Boot using those
>> data structures directly.
>>
>> You'd requested during review that I additionally implement a faster version
>> of the function in the case where the parent node is already known, and said
>> it was fine if that happened in a later patch. I have this on my TODO list,
>> but it's only been a couple of days.
>
> I didn't expect this to go to mainline before your new patch.

I didn't get that message from the thread; you wrote:

Simon Glass wrote:
 > Stephen Warren wrote:
 >> Simon Glass wrote:
 >>> Also, how about (in addition) a
 >>> version of this function that works for devices? Like:
 >>>
 >>> device_get_addr_size(struct udevice *dev, ...)
 >>>
 >>> so that it can handle this for you.
 >>
 >> That sounds like a separate patch?
 >
 > Yes, but I think we should get it in there so that people don't start
 > using this (wildly inefficient) function when they don't need to. I
 > think by passing in the parent node we force people to think about the
 > cost.
 >
 > Yes the driver model function can be a separate patch, but let's get
 > it in at about the same time. We have dev_get_addr() so could have
 > dev_get_addr_size().

That sounds to like you'd like a followup patch soon after this patch 
(my assumption would be within a few days or weeks) to solve your 
comments, not that you wanted a replacement patch.

>> Why not just fix the bug since you said you could? That seems far better
>> than breaking the newly added Tegra210 support.
>
> I do have a patch but I'm not 100% comfortable with the approach. I
> don't see a good reason to move away from the idea of fdt_addr_t and
> fdt_addr_t being set correctly for the platform. Or maybe I
> misunderstand the problem the patch was trying to fix. As I said it
> did not have a commit message, so who knows :-)

The size of fdt_addr_t isn't relevant *at all* when parsing DT. The only 
thing that matters is #address-cells/#size-cells. Those properties are 
what tell the parsing code how many bytes to read from the reg property. 
Whether the resultant value fits into the code's internal representation 
is an internal detail of the code, not part of the semantics of DT 
itself or how to parse it.

If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed 
quite likely that everything will just happen to work most of the time. 
However, this is purely an accident and not something that anything 
should rely upon.

(I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which is 
admittedly a little /unexpected/ for a 64-bit U-Boot build, but in 
practice is perfectly /legal/ and will work out just fine since, except 
perhaps for RAM sizes, I don't believe any value in DT will actually 
require more than 32-bits to represent)

>> P.S. What is the issue with SPI flash? The commit description doesn't
>> mention this at all.
>
> It calls that function expecting it to pick up an address and size
> from two consecutive cells. With this patch, that fails (unless the
> property happens to be "reg").

I don't see anything in the new code that interprets the property name 
in any way, or compares it against "reg". Can you point out where the 
code is doing something differently based on property name?

Perhaps you mean instead that something is calling 
fdtdec_get_addr_size() on a property that's not intended to be affected 
by #address-call/#size-cells? If so, I would assert that's a bug in the 
calling code; if fdtdec_get_addr_size() is ever intended to parse the 
reg property, then fdtdec_get_addr_size() must always honor 
#address-call/#size-cells since that's how the reg property works. If 
some code wants to parse a property where #address-call/#size-cells 
shouldn't be honored, but instead some fixed size assumed, then a 
different function should be called instead (or perhaps some additional 
function parameters added to specify override values for 
#address-call/#size-cells).

I think this is all stemming from drivers/mtd/spi/sf_probe.c 
spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
"memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi 
node, the code and DT content are clearly inconsistent; For this node 
#address-cells=<1>, #size-cells=<0> which makes sense given that the 
address is a chip-select and hence has no size. So, the code should not 
assume that the memory-map property can be parsed in the same way as a 
reg property.

I note that the memory-map property doesn't exist in the Linux kernel's 
DT binding documentation database, or code, hence hasn't been reviewed 
by the DT binding maintainers.
Thierry Reding Aug. 4, 2015, 3:27 p.m. UTC | #7
On Sun, Aug 02, 2015 at 06:13:50PM -0600, Simon Glass wrote:
> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
> 
> This function has a few problems. It calls fdt_parent_offset() which as
> mentioned in code review is very slow.
> 
> https://patchwork.ozlabs.org/patch/499482/
> https://patchwork.ozlabs.org/patch/452604/
> 
> It also happens to break SPI flash on Minnowboard max which is how I noticed
> that this was applied. I can send a patch to tidy that up, but in any case
> I think we should consider a revert until the function is better implemented.

I suspect that what breaks is when you try to parse the "memory-map"
property from this (taken from arch/x86/dts/minnowmax.dts):

	spi {
		#address-cells = <1>;
		#size-cells = <0>;
		compatible = "intel,ich-spi";
		spi-flash@0 {
			reg = <0>;
			compatible = "stmicro,n25q064a", "spi-flash";
			memory-map = <0xff800000 0x00800000>;
		};
	};

That's obviously not going to work because as Stephen and I have pointed
out it specifies an address range in a different address space than that
of the SPI flash. The spi parent node defines #address-cells and
#size-cells, both of which define the layout of the "reg" property (and
some others like "ranges"). I don't see a way around this other than to
read two individual cells from the "memory-map" property. I don't know
of a concept in DT that would allow you to translate from memory-map in
the spi-flash@0 node to the address space in the parent node of the spi
node (the root node).

Thierry
Simon Glass Aug. 5, 2015, 4:08 a.m. UTC | #8
Hi Stephen,

On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>
>> Hi Stephen,
>>
>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>
>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>
>>>>
>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>>
>>>> This function has a few problems. It calls fdt_parent_offset() which as
>>>> mentioned in code review is very slow.
>>>>
>>>> https://patchwork.ozlabs.org/patch/499482/
>>>> https://patchwork.ozlabs.org/patch/452604/
>>>>
>>>> It also happens to break SPI flash on Minnowboard max which is how I
>>>> noticed
>>>> that this was applied. I can send a patch to tidy that up, but in any
>>>> case
>>>> I think we should consider a revert until the function is better
>>>> implemented.
>>>
>>>
>>>
>>> The fact that the function needs to perform a slow operation is not a
>>> good
>>> reason for a revert. The slowness of the operation is just a matter of
>>> fact
>>> with DT not having uplinks in its data structure, and U-Boot using those
>>> data structures directly.
>>>
>>> You'd requested during review that I additionally implement a faster
>>> version
>>> of the function in the case where the parent node is already known, and
>>> said
>>> it was fine if that happened in a later patch. I have this on my TODO
>>> list,
>>> but it's only been a couple of days.
>>
>>
>> I didn't expect this to go to mainline before your new patch.
>
>
> I didn't get that message from the thread; you wrote:
>
> Simon Glass wrote:
>> Stephen Warren wrote:
>>> Simon Glass wrote:
>>>> Also, how about (in addition) a
>>>> version of this function that works for devices? Like:
>>>>
>>>> device_get_addr_size(struct udevice *dev, ...)
>>>>
>>>> so that it can handle this for you.
>>>
>>> That sounds like a separate patch?
>>
>> Yes, but I think we should get it in there so that people don't start
>> using this (wildly inefficient) function when they don't need to. I
>> think by passing in the parent node we force people to think about the
>> cost.
>>
>> Yes the driver model function can be a separate patch, but let's get
>> it in at about the same time. We have dev_get_addr() so could have
>> dev_get_addr_size().
>
> That sounds to like you'd like a followup patch soon after this patch (my
> assumption would be within a few days or weeks) to solve your comments, not
> that you wanted a replacement patch.

I will take that feedback to be a little more forceful in future, sorry.

>
>>> Why not just fix the bug since you said you could? That seems far better
>>> than breaking the newly added Tegra210 support.
>>
>>
>> I do have a patch but I'm not 100% comfortable with the approach. I
>> don't see a good reason to move away from the idea of fdt_addr_t and
>> fdt_addr_t being set correctly for the platform. Or maybe I
>> misunderstand the problem the patch was trying to fix. As I said it
>> did not have a commit message, so who knows :-)
>
>
> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The only
> thing that matters is #address-cells/#size-cells. Those properties are what
> tell the parsing code how many bytes to read from the reg property. Whether
> the resultant value fits into the code's internal representation is an
> internal detail of the code, not part of the semantics of DT itself or how
> to parse it.
>
> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
> quite likely that everything will just happen to work most of the time.
> However, this is purely an accident and not something that anything should
> rely upon.
>
> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which is
> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in practice
> is perfectly /legal/ and will work out just fine since, except perhaps for
> RAM sizes, I don't believe any value in DT will actually require more than
> 32-bits to represent)

I would like to have the types match the platform where possible. At
least the types should not be smaller than the platform - e.g. if
CONFIG_PHYS_64BIT is not defined we should not support 64-bit
address/size in the device tree. This is for efficiency. We don't want
to force all the U-Boot code to 64-bit suddenly. This will bloat
things for no benefit.

>
>>> P.S. What is the issue with SPI flash? The commit description doesn't
>>> mention this at all.
>>
>>
>> It calls that function expecting it to pick up an address and size
>> from two consecutive cells. With this patch, that fails (unless the
>> property happens to be "reg").
>
>
> I don't see anything in the new code that interprets the property name in
> any way, or compares it against "reg". Can you point out where the code is
> doing something differently based on property name?
>
> Perhaps you mean instead that something is calling fdtdec_get_addr_size() on
> a property that's not intended to be affected by #address-call/#size-cells?
> If so, I would assert that's a bug in the calling code; if
> fdtdec_get_addr_size() is ever intended to parse the reg property, then
> fdtdec_get_addr_size() must always honor #address-call/#size-cells since
> that's how the reg property works. If some code wants to parse a property
> where #address-call/#size-cells shouldn't be honored, but instead some fixed
> size assumed, then a different function should be called instead (or perhaps
> some additional function parameters added to specify override values for
> #address-call/#size-cells).
>
> I think this is all stemming from drivers/mtd/spi/sf_probe.c
> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi node,
> the code and DT content are clearly inconsistent; For this node
> #address-cells=<1>, #size-cells=<0> which makes sense given that the address
> is a chip-select and hence has no size. So, the code should not assume that
> the memory-map property can be parsed in the same way as a reg property.
>
> I note that the memory-map property doesn't exist in the Linux kernel's DT
> binding documentation database, or code, hence hasn't been reviewed by the
> DT binding maintainers.

The function comment says:

 * Look up an address property in a node and return it as an address.
 * The property must hold one address with a length. This is only tested
 * on 32-bit machines.

My intention was that this would be an efficient way to decode an
address and size from a device tree. To some extent regmap may take
over this role (IMO we should turn to drop fdtdec one day one we have
more infrastructure). But I'd like it to work efficiently for 32-bit
machines. The new function could hardly be less efficient.

I think we should consider the case where the physical address size of
U-Boot and the device tree do not match as a corner case. I certainly
don't want device tree to add loads of pointless code for 'normal'
platforms.

Re memory-map, yes it doesn't seem to be possible to do what it is
trying to do (and Thierry says the same below). It is quite weird to
have a SPI peripheral which is also memory mapped.

Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
64-bit Tegra, what is actually wrong with the way the function was?
This is a boot loader so we should be willing to make some
simplifications.

Regards,
Simon
Stephen Warren Aug. 5, 2015, 6:22 p.m. UTC | #9
On 08/04/2015 10:08 PM, Simon Glass wrote:
> Hi Stephen,
>
> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>
>>> Hi Stephen,
>>>
>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>
>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>
>>>>>
>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>>>
>>>>> This function has a few problems. It calls fdt_parent_offset() which as
>>>>> mentioned in code review is very slow.
>>>>>
>>>>> https://patchwork.ozlabs.org/patch/499482/
>>>>> https://patchwork.ozlabs.org/patch/452604/
>>>>>
>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>>>>> noticed
>>>>> that this was applied. I can send a patch to tidy that up, but in any
>>>>> case
>>>>> I think we should consider a revert until the function is better
>>>>> implemented.
>>>>
>>>>
>>>>
>>>> The fact that the function needs to perform a slow operation is not a
>>>> good
>>>> reason for a revert. The slowness of the operation is just a matter of
>>>> fact
>>>> with DT not having uplinks in its data structure, and U-Boot using those
>>>> data structures directly.
>>>>
>>>> You'd requested during review that I additionally implement a faster
>>>> version
>>>> of the function in the case where the parent node is already known, and
>>>> said
>>>> it was fine if that happened in a later patch. I have this on my TODO
>>>> list,
>>>> but it's only been a couple of days.
>>>
>>>
>>> I didn't expect this to go to mainline before your new patch.
>>
>>
>> I didn't get that message from the thread; you wrote:
>>
>> Simon Glass wrote:
>>> Stephen Warren wrote:
>>>> Simon Glass wrote:
>>>>> Also, how about (in addition) a
>>>>> version of this function that works for devices? Like:
>>>>>
>>>>> device_get_addr_size(struct udevice *dev, ...)
>>>>>
>>>>> so that it can handle this for you.
>>>>
>>>> That sounds like a separate patch?
>>>
>>> Yes, but I think we should get it in there so that people don't start
>>> using this (wildly inefficient) function when they don't need to. I
>>> think by passing in the parent node we force people to think about the
>>> cost.
>>>
>>> Yes the driver model function can be a separate patch, but let's get
>>> it in at about the same time. We have dev_get_addr() so could have
>>> dev_get_addr_size().
>>
>> That sounds to like you'd like a followup patch soon after this patch (my
>> assumption would be within a few days or weeks) to solve your comments, not
>> that you wanted a replacement patch.
>
> I will take that feedback to be a little more forceful in future, sorry.
>
>>
>>>> Why not just fix the bug since you said you could? That seems far better
>>>> than breaking the newly added Tegra210 support.
>>>
>>>
>>> I do have a patch but I'm not 100% comfortable with the approach. I
>>> don't see a good reason to move away from the idea of fdt_addr_t and
>>> fdt_addr_t being set correctly for the platform. Or maybe I
>>> misunderstand the problem the patch was trying to fix. As I said it
>>> did not have a commit message, so who knows :-)
>>
>>
>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The only
>> thing that matters is #address-cells/#size-cells. Those properties are what
>> tell the parsing code how many bytes to read from the reg property. Whether
>> the resultant value fits into the code's internal representation is an
>> internal detail of the code, not part of the semantics of DT itself or how
>> to parse it.
>>
>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>> quite likely that everything will just happen to work most of the time.
>> However, this is purely an accident and not something that anything should
>> rely upon.
>>
>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which is
>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in practice
>> is perfectly /legal/ and will work out just fine since, except perhaps for
>> RAM sizes, I don't believe any value in DT will actually require more than
>> 32-bits to represent)
>
> I would like to have the types match the platform where possible. At
> least the types should not be smaller than the platform - e.g. if
> CONFIG_PHYS_64BIT is not defined we should not support 64-bit

In general, there's no "should not" here; we "cannot". If there's a 
64-bit value in the DT (with bits above bit 31 set), then it can't be 
stored in a 32-bit variable.

That said, if a DT has #address-cells=<2>, but the actual values stored 
in all reg values have 0 for the MS word, that'll actually work just 
fine. Note that it's 100% legal to set #address-cells=<100> and just 
write a bunch of extra zeros into the property. Silly perhaps, but 
perfectly legal. Since the function should adapt to whatever 
#address-cells value is in the DT, supporting that case isn't any more 
work, so there's no reason we shouldn't.

> address/size in the device tree. This is for efficiency. We don't want
> to force all the U-Boot code to 64-bit suddenly. This will bloat
> things for no benefit.

We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However, 
that's unrelated to using the correct algorithm to parse DT.

>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>>>> mention this at all.
>>>
>>> It calls that function expecting it to pick up an address and size
>>> from two consecutive cells. With this patch, that fails (unless the
>>> property happens to be "reg").
...
>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi node,
>> the code and DT content are clearly inconsistent; For this node
>> #address-cells=<1>, #size-cells=<0> which makes sense given that the address
>> is a chip-select and hence has no size. So, the code should not assume that
>> the memory-map property can be parsed in the same way as a reg property.
>>
>> I note that the memory-map property doesn't exist in the Linux kernel's DT
>> binding documentation database, or code, hence hasn't been reviewed by the
>> DT binding maintainers.
>
> The function comment says:
>
>   * Look up an address property in a node and return it as an address.
>   * The property must hold one address with a length. This is only tested
>   * on 32-bit machines.

And Thierry fixed it for systems with #address-cells > 1. Perhaps that 
part of the function comment should have been removed in the commit.

> My intention was that this would be an efficient way to decode an
> address and size from a device tree. To some extent regmap may take
> over this role (IMO we should turn to drop fdtdec one day one we have
> more infrastructure). But I'd like it to work efficiently for 32-bit
> machines. The new function could hardly be less efficient.

Again, there's no way in general to make it more efficient. The 
efficiency issue is directly implied by the DT data structures.

In the special case where the parent node is already known, we can 
certainly introduce an alternate function that is more efficient. You've 
already asked for that, and as I said, it's in my TODO list.

> I think we should consider the case where the physical address size of
> U-Boot and the device tree do not match as a corner case. I certainly
> don't want device tree to add loads of pointless code for 'normal'
> platforms.

It's not a corner case. It's a fundamental part of the DT schema. If 
U-Boot is going to use DT, it should actually use *DT*, not some-very 
similar-but-not-quite-DT format with all kinds of implicit and unstated 
exceptions, limitations, and assumptions. This implies fully honoring 
all aspects of how DT works when parsing it, not picking and choosing 
features because some are inconvenient or annoying. If U-Boot doesn't 
want to correctly implement DT support, it should just drop it completely.

As an aside, when instantiating devices, I hope one day that U-Boot will 
parse DT top-down in a hierarchical way, rather than simply searching 
for any node anywhere in the DT without regard for whether any parent 
node is enabled, has a driver, has had the driver initialize, etc. 
U-Boot ignores this right now, and is only getting away with this 
accidentally. Without hacky workarounds in drivers, this won't continue 
to work for all Tegra HW (e.g. host1x graphics/display sub-modules, 
AHUB's audio-related sub-modules), since parent drivers must initialize 
before child drivers in order to enable various register buses, 
including clocks/resets affecting those buses etc.

Again, if it's simply too inconvenient or bloated to implement DT 
properly in U-Boot, let's just drop it entirely. A halfway solution is 
the worst of both worlds. I'm not convinced the full implications of how 
to (and the need to) correctly and fully support DT have were well 
thought through before (control) DT support was added to U-Boot.

> Re memory-map, yes it doesn't seem to be possible to do what it is
> trying to do (and Thierry says the same below). It is quite weird to
> have a SPI peripheral which is also memory mapped.
>
> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
> 64-bit Tegra, what is actually wrong with the way the function was?

With the DT files now checked into U-Boot, I think it would accidentally 
work, since we just happen to have set #address-cells=<2>, 
#size-cells=<2>, and that would just happen to match sizeof(fdt_addr_t).

However note this is an accident on a couple of levels:

a) This is because the code assumes that sizeof(fdt_addr_t) == 
#address-cells * 4. This is an invalid assumption since it does not 
correctly honor the DT schema. It hard-codes the size of values whereas 
DT schema says the size is defined by the #xxx-cells properties.

b) The original Tegra210 DTs that TomW posted had #address-cells=<1>, 
#size-cells=<1>. I asked him to change that to match what I expected to 
be in the Linux kernel's Tegra210 DTs. However, if he'd rejected my 
request or I hadn't noticed that, then with CONFIG_PHYS_64BIT set, 
fdtdec_get_addr_size() would have attempted to read twice as many bytes 
as it should have from the property. It's entirely plausible that 
someone could have come along later and realized CONFIG_PHYS_64BIT was 
set incorrectly and enabled it.

> This is a boot loader so we should be willing to make some
> simplifications.

When dealing with internal bootloader details, sure assumptions, 
simplifications, etc. can be made.

However, DT is an externally defined standard. The content of DT must be 
identical across all OSs (SW stacks, bootloader) and not influenced by 
requirements/... of any specific individual OS's (SW stack, bootloader) 
quirks. We can't just pick and choose which parts of it we care about. 
Well, perhaps if we stop calling it DT we could.
Simon Glass Aug. 5, 2015, 11:45 p.m. UTC | #10
Hi Stephen,

On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 08/04/2015 10:08 PM, Simon Glass wrote:
>>
>> Hi Stephen,
>>
>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>
>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>>
>>>>
>>>> Hi Stephen,
>>>>
>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>
>>>>>
>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>>>>
>>>>>> This function has a few problems. It calls fdt_parent_offset() which
>>>>>> as
>>>>>> mentioned in code review is very slow.
>>>>>>
>>>>>> https://patchwork.ozlabs.org/patch/499482/
>>>>>> https://patchwork.ozlabs.org/patch/452604/
>>>>>>
>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>>>>>> noticed
>>>>>> that this was applied. I can send a patch to tidy that up, but in any
>>>>>> case
>>>>>> I think we should consider a revert until the function is better
>>>>>> implemented.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The fact that the function needs to perform a slow operation is not a
>>>>> good
>>>>> reason for a revert. The slowness of the operation is just a matter of
>>>>> fact
>>>>> with DT not having uplinks in its data structure, and U-Boot using
>>>>> those
>>>>> data structures directly.
>>>>>
>>>>> You'd requested during review that I additionally implement a faster
>>>>> version
>>>>> of the function in the case where the parent node is already known, and
>>>>> said
>>>>> it was fine if that happened in a later patch. I have this on my TODO
>>>>> list,
>>>>> but it's only been a couple of days.
>>>>
>>>>
>>>>
>>>> I didn't expect this to go to mainline before your new patch.
>>>
>>>
>>>
>>> I didn't get that message from the thread; you wrote:
>>>
>>> Simon Glass wrote:
>>>>
>>>> Stephen Warren wrote:
>>>>>
>>>>> Simon Glass wrote:
>>>>>>
>>>>>> Also, how about (in addition) a
>>>>>> version of this function that works for devices? Like:
>>>>>>
>>>>>> device_get_addr_size(struct udevice *dev, ...)
>>>>>>
>>>>>> so that it can handle this for you.
>>>>>
>>>>>
>>>>> That sounds like a separate patch?
>>>>
>>>>
>>>> Yes, but I think we should get it in there so that people don't start
>>>> using this (wildly inefficient) function when they don't need to. I
>>>> think by passing in the parent node we force people to think about the
>>>> cost.
>>>>
>>>> Yes the driver model function can be a separate patch, but let's get
>>>> it in at about the same time. We have dev_get_addr() so could have
>>>> dev_get_addr_size().
>>>
>>>
>>> That sounds to like you'd like a followup patch soon after this patch (my
>>> assumption would be within a few days or weeks) to solve your comments,
>>> not
>>> that you wanted a replacement patch.
>>
>>
>> I will take that feedback to be a little more forceful in future, sorry.
>>
>>>
>>>>> Why not just fix the bug since you said you could? That seems far
>>>>> better
>>>>> than breaking the newly added Tegra210 support.
>>>>
>>>>
>>>>
>>>> I do have a patch but I'm not 100% comfortable with the approach. I
>>>> don't see a good reason to move away from the idea of fdt_addr_t and
>>>> fdt_addr_t being set correctly for the platform. Or maybe I
>>>> misunderstand the problem the patch was trying to fix. As I said it
>>>> did not have a commit message, so who knows :-)
>>>
>>>
>>>
>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The only
>>> thing that matters is #address-cells/#size-cells. Those properties are
>>> what
>>> tell the parsing code how many bytes to read from the reg property.
>>> Whether
>>> the resultant value fits into the code's internal representation is an
>>> internal detail of the code, not part of the semantics of DT itself or
>>> how
>>> to parse it.
>>>
>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>>> quite likely that everything will just happen to work most of the time.
>>> However, this is purely an accident and not something that anything
>>> should
>>> rely upon.
>>>
>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which is
>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
>>> practice
>>> is perfectly /legal/ and will work out just fine since, except perhaps
>>> for
>>> RAM sizes, I don't believe any value in DT will actually require more
>>> than
>>> 32-bits to represent)
>>
>>
>> I would like to have the types match the platform where possible. At
>> least the types should not be smaller than the platform - e.g. if
>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
>
>
> In general, there's no "should not" here; we "cannot". If there's a 64-bit
> value in the DT (with bits above bit 31 set), then it can't be stored in a
> 32-bit variable.
>
> That said, if a DT has #address-cells=<2>, but the actual values stored in
> all reg values have 0 for the MS word, that'll actually work just fine. Note
> that it's 100% legal to set #address-cells=<100> and just write a bunch of
> extra zeros into the property. Silly perhaps, but perfectly legal. Since the
> function should adapt to whatever #address-cells value is in the DT,
> supporting that case isn't any more work, so there's no reason we shouldn't.
>
>> address/size in the device tree. This is for efficiency. We don't want
>> to force all the U-Boot code to 64-bit suddenly. This will bloat
>> things for no benefit.
>
>
> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
> that's unrelated to using the correct algorithm to parse DT.
>
>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>>>>> mention this at all.
>>>>
>>>>
>>>> It calls that function expecting it to pick up an address and size
>>>> from two consecutive cells. With this patch, that fails (unless the
>>>> property happens to be "reg").
>
> ...
>>>
>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
>>> node,
>>> the code and DT content are clearly inconsistent; For this node
>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
>>> address
>>> is a chip-select and hence has no size. So, the code should not assume
>>> that
>>> the memory-map property can be parsed in the same way as a reg property.
>>>
>>> I note that the memory-map property doesn't exist in the Linux kernel's
>>> DT
>>> binding documentation database, or code, hence hasn't been reviewed by
>>> the
>>> DT binding maintainers.
>>
>>
>> The function comment says:
>>
>>   * Look up an address property in a node and return it as an address.
>>   * The property must hold one address with a length. This is only tested
>>   * on 32-bit machines.
>
>
> And Thierry fixed it for systems with #address-cells > 1. Perhaps that part
> of the function comment should have been removed in the commit.
>
>> My intention was that this would be an efficient way to decode an
>> address and size from a device tree. To some extent regmap may take
>> over this role (IMO we should turn to drop fdtdec one day one we have
>> more infrastructure). But I'd like it to work efficiently for 32-bit
>> machines. The new function could hardly be less efficient.
>
>
> Again, there's no way in general to make it more efficient. The efficiency
> issue is directly implied by the DT data structures.
>
> In the special case where the parent node is already known, we can certainly
> introduce an alternate function that is more efficient. You've already asked
> for that, and as I said, it's in my TODO list.
>
>> I think we should consider the case where the physical address size of
>> U-Boot and the device tree do not match as a corner case. I certainly
>> don't want device tree to add loads of pointless code for 'normal'
>> platforms.
>
>
> It's not a corner case. It's a fundamental part of the DT schema. If U-Boot
> is going to use DT, it should actually use *DT*, not some-very
> similar-but-not-quite-DT format with all kinds of implicit and unstated
> exceptions, limitations, and assumptions. This implies fully honoring all
> aspects of how DT works when parsing it, not picking and choosing features
> because some are inconvenient or annoying. If U-Boot doesn't want to
> correctly implement DT support, it should just drop it completely.
>
> As an aside, when instantiating devices, I hope one day that U-Boot will
> parse DT top-down in a hierarchical way, rather than simply searching for
> any node anywhere in the DT without regard for whether any parent node is
> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
> this right now, and is only getting away with this accidentally. Without
> hacky workarounds in drivers, this won't continue to work for all Tegra HW
> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
> sub-modules), since parent drivers must initialize before child drivers in
> order to enable various register buses, including clocks/resets affecting
> those buses etc.
>
> Again, if it's simply too inconvenient or bloated to implement DT properly
> in U-Boot, let's just drop it entirely. A halfway solution is the worst of
> both worlds. I'm not convinced the full implications of how to (and the need
> to) correctly and fully support DT have were well thought through before
> (control) DT support was added to U-Boot.
>
>> Re memory-map, yes it doesn't seem to be possible to do what it is
>> trying to do (and Thierry says the same below). It is quite weird to
>> have a SPI peripheral which is also memory mapped.
>>
>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
>> 64-bit Tegra, what is actually wrong with the way the function was?
>
>
> With the DT files now checked into U-Boot, I think it would accidentally
> work, since we just happen to have set #address-cells=<2>, #size-cells=<2>,
> and that would just happen to match sizeof(fdt_addr_t).
>
> However note this is an accident on a couple of levels:
>
> a) This is because the code assumes that sizeof(fdt_addr_t) ==
> #address-cells * 4. This is an invalid assumption since it does not
> correctly honor the DT schema. It hard-codes the size of values whereas DT
> schema says the size is defined by the #xxx-cells properties.
>
> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
> #size-cells=<1>. I asked him to change that to match what I expected to be
> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request or
> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
> fdtdec_get_addr_size() would have attempted to read twice as many bytes as
> it should have from the property. It's entirely plausible that someone could
> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly and
> enabled it.
>
>> This is a boot loader so we should be willing to make some
>> simplifications.
>
>
> When dealing with internal bootloader details, sure assumptions,
> simplifications, etc. can be made.
>
> However, DT is an externally defined standard. The content of DT must be
> identical across all OSs (SW stacks, bootloader) and not influenced by
> requirements/... of any specific individual OS's (SW stack, bootloader)
> quirks. We can't just pick and choose which parts of it we care about. Well,
> perhaps if we stop calling it DT we could.

So I think in summary:

- 64-bit machines should have CONFIG_PHYS_64BIT set correctly
- then fdtdec_get_addr_size() would work as expected
- I want to make this cases as efficient as possible since it will be
called in SPL
- You are concerned that making assumptions like this violates the DT spec

One option is to split the functions into two - one that works in SPL
and makes assumptions, and one that does not and does things the hard
way. I suppose we could also add checks/warnings that the DT is
'well-formed' and that the address size matches the machine it is
running on.

In any case we do need to get rid of the parent lookup in this
function. So can either you or Thierry submit a patch to do this? The
parent should instead be a parameter to the function. You may need to
create a stub function which looks it up before calling
fdtdec_get_addr_size().

I'll see how things look in SPL.

Regards,
Simon
Michal Suchanek Aug. 6, 2015, 7:09 a.m. UTC | #11
Hello,

On 6 August 2015 at 01:45, Simon Glass <sjg@chromium.org> wrote:
> Hi Stephen,
>
> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>>>
>>> Hi Stephen,
>>>
>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>
>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>>>
>>>>>
>>>>> Hi Stephen,
>>>>>
>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>>
>>>>>>
>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>>>
>>>>>>>

>>> This is a boot loader so we should be willing to make some
>>> simplifications.
>>
>>
>> When dealing with internal bootloader details, sure assumptions,
>> simplifications, etc. can be made.
>>
>> However, DT is an externally defined standard. The content of DT must be
>> identical across all OSs (SW stacks, bootloader) and not influenced by
>> requirements/... of any specific individual OS's (SW stack, bootloader)
>> quirks. We can't just pick and choose which parts of it we care about. Well,
>> perhaps if we stop calling it DT we could.
>
> So I think in summary:
>
> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
> - then fdtdec_get_addr_size() would work as expected
> - I want to make this cases as efficient as possible since it will be
> called in SPL
> - You are concerned that making assumptions like this violates the DT spec
>
> One option is to split the functions into two - one that works in SPL
> and makes assumptions, and one that does not and does things the hard
> way. I suppose we could also add checks/warnings that the DT is
> 'well-formed' and that the address size matches the machine it is
> running on.

This depends on the way the binding is defined. If the binding
description says that the value has as much data as defined in
#address-cells and #size-cells then you have to parse those values to
interpret the binding correctly. When the binding description says the
value is always 32bit or always 64bit or whatever you must always read
a value with that many bits.

Prime example of this are MTD partitions defined in dt.

If your flash is small you can get away with 32bit partitions and you
would define #address-cells and #size-cells as 1 even on 64bit system.
When your flash is large you need 64bits to store the partition size
and offset and you will define #address-cells and #size-cells as 2
even on 32bit system because it won't work otherwise.

Thanks

Michal
Stephen Warren Aug. 6, 2015, 6:43 p.m. UTC | #12
On 08/06/2015 01:09 AM, Michal Suchanek wrote:
> Hello,
>
> On 6 August 2015 at 01:45, Simon Glass <sjg@chromium.org> wrote:
>> Hi Stephen,
>>
>> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>>>>
>>>> Hi Stephen,
>>>>
>>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>
>>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>>>>
>>>>>>
>>>>>> Hi Stephen,
>>>>>>
>>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>>>>
>>>>>>>>
>
>>>> This is a boot loader so we should be willing to make some
>>>> simplifications.
>>>
>>>
>>> When dealing with internal bootloader details, sure assumptions,
>>> simplifications, etc. can be made.
>>>
>>> However, DT is an externally defined standard. The content of DT must be
>>> identical across all OSs (SW stacks, bootloader) and not influenced by
>>> requirements/... of any specific individual OS's (SW stack, bootloader)
>>> quirks. We can't just pick and choose which parts of it we care about. Well,
>>> perhaps if we stop calling it DT we could.
>>
>> So I think in summary:
>>
>> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
>> - then fdtdec_get_addr_size() would work as expected
>> - I want to make this cases as efficient as possible since it will be
>> called in SPL
>> - You are concerned that making assumptions like this violates the DT spec
>>
>> One option is to split the functions into two - one that works in SPL
>> and makes assumptions, and one that does not and does things the hard
>> way. I suppose we could also add checks/warnings that the DT is
>> 'well-formed' and that the address size matches the machine it is
>> running on.
>
> This depends on the way the binding is defined. If the binding
> description says that the value has as much data as defined in
> #address-cells and #size-cells then you have to parse those values to
> interpret the binding correctly. When the binding description says the
> value is always 32bit or always 64bit or whatever you must always read
> a value with that many bits.
>
> Prime example of this are MTD partitions defined in dt.
>
> If your flash is small you can get away with 32bit partitions and you
> would define #address-cells and #size-cells as 1 even on 64bit system.
> When your flash is large you need 64bits to store the partition size
> and offset and you will define #address-cells and #size-cells as 2
> even on 32bit system because it won't work otherwise.

I agree entirely.

FWIW, my part of this discussion was mainly targeted at the "reg" 
property, where #address-cells/#size-cells are defined as applying as 
part of the base DT definition. Indeed other bindings can be define this 
way to (or not) at the whim of the binding author. The bug with Intel 
flash was triggered by using the same piece of code to parse a property 
where #address-cells/#size-cells don't apply.
Stephen Warren Aug. 6, 2015, 7:03 p.m. UTC | #13
On 08/05/2015 05:45 PM, Simon Glass wrote:
> Hi Stephen,
>
> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>>>
>>> Hi Stephen,
>>>
>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>
>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>>>
>>>>>
>>>>> Hi Stephen,
>>>>>
>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>>
>>>>>>
>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>>>>>
>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
>>>>>>> as
>>>>>>> mentioned in code review is very slow.
>>>>>>>
>>>>>>> https://patchwork.ozlabs.org/patch/499482/
>>>>>>> https://patchwork.ozlabs.org/patch/452604/
>>>>>>>
>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>>>>>>> noticed
>>>>>>> that this was applied. I can send a patch to tidy that up, but in any
>>>>>>> case
>>>>>>> I think we should consider a revert until the function is better
>>>>>>> implemented.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The fact that the function needs to perform a slow operation is not a
>>>>>> good
>>>>>> reason for a revert. The slowness of the operation is just a matter of
>>>>>> fact
>>>>>> with DT not having uplinks in its data structure, and U-Boot using
>>>>>> those
>>>>>> data structures directly.
>>>>>>
>>>>>> You'd requested during review that I additionally implement a faster
>>>>>> version
>>>>>> of the function in the case where the parent node is already known, and
>>>>>> said
>>>>>> it was fine if that happened in a later patch. I have this on my TODO
>>>>>> list,
>>>>>> but it's only been a couple of days.
>>>>>
>>>>>
>>>>>
>>>>> I didn't expect this to go to mainline before your new patch.
>>>>
>>>>
>>>>
>>>> I didn't get that message from the thread; you wrote:
>>>>
>>>> Simon Glass wrote:
>>>>>
>>>>> Stephen Warren wrote:
>>>>>>
>>>>>> Simon Glass wrote:
>>>>>>>
>>>>>>> Also, how about (in addition) a
>>>>>>> version of this function that works for devices? Like:
>>>>>>>
>>>>>>> device_get_addr_size(struct udevice *dev, ...)
>>>>>>>
>>>>>>> so that it can handle this for you.
>>>>>>
>>>>>>
>>>>>> That sounds like a separate patch?
>>>>>
>>>>>
>>>>> Yes, but I think we should get it in there so that people don't start
>>>>> using this (wildly inefficient) function when they don't need to. I
>>>>> think by passing in the parent node we force people to think about the
>>>>> cost.
>>>>>
>>>>> Yes the driver model function can be a separate patch, but let's get
>>>>> it in at about the same time. We have dev_get_addr() so could have
>>>>> dev_get_addr_size().
>>>>
>>>>
>>>> That sounds to like you'd like a followup patch soon after this patch (my
>>>> assumption would be within a few days or weeks) to solve your comments,
>>>> not
>>>> that you wanted a replacement patch.
>>>
>>>
>>> I will take that feedback to be a little more forceful in future, sorry.
>>>
>>>>
>>>>>> Why not just fix the bug since you said you could? That seems far
>>>>>> better
>>>>>> than breaking the newly added Tegra210 support.
>>>>>
>>>>>
>>>>>
>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
>>>>> misunderstand the problem the patch was trying to fix. As I said it
>>>>> did not have a commit message, so who knows :-)
>>>>
>>>>
>>>>
>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The only
>>>> thing that matters is #address-cells/#size-cells. Those properties are
>>>> what
>>>> tell the parsing code how many bytes to read from the reg property.
>>>> Whether
>>>> the resultant value fits into the code's internal representation is an
>>>> internal detail of the code, not part of the semantics of DT itself or
>>>> how
>>>> to parse it.
>>>>
>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>>>> quite likely that everything will just happen to work most of the time.
>>>> However, this is purely an accident and not something that anything
>>>> should
>>>> rely upon.
>>>>
>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which is
>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
>>>> practice
>>>> is perfectly /legal/ and will work out just fine since, except perhaps
>>>> for
>>>> RAM sizes, I don't believe any value in DT will actually require more
>>>> than
>>>> 32-bits to represent)
>>>
>>>
>>> I would like to have the types match the platform where possible. At
>>> least the types should not be smaller than the platform - e.g. if
>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
>>
>>
>> In general, there's no "should not" here; we "cannot". If there's a 64-bit
>> value in the DT (with bits above bit 31 set), then it can't be stored in a
>> 32-bit variable.
>>
>> That said, if a DT has #address-cells=<2>, but the actual values stored in
>> all reg values have 0 for the MS word, that'll actually work just fine. Note
>> that it's 100% legal to set #address-cells=<100> and just write a bunch of
>> extra zeros into the property. Silly perhaps, but perfectly legal. Since the
>> function should adapt to whatever #address-cells value is in the DT,
>> supporting that case isn't any more work, so there's no reason we shouldn't.
>>
>>> address/size in the device tree. This is for efficiency. We don't want
>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
>>> things for no benefit.
>>
>>
>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
>> that's unrelated to using the correct algorithm to parse DT.
>>
>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>>>>>> mention this at all.
>>>>>
>>>>>
>>>>> It calls that function expecting it to pick up an address and size
>>>>> from two consecutive cells. With this patch, that fails (unless the
>>>>> property happens to be "reg").
>>
>> ...
>>>>
>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
>>>> node,
>>>> the code and DT content are clearly inconsistent; For this node
>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
>>>> address
>>>> is a chip-select and hence has no size. So, the code should not assume
>>>> that
>>>> the memory-map property can be parsed in the same way as a reg property.
>>>>
>>>> I note that the memory-map property doesn't exist in the Linux kernel's
>>>> DT
>>>> binding documentation database, or code, hence hasn't been reviewed by
>>>> the
>>>> DT binding maintainers.
>>>
>>>
>>> The function comment says:
>>>
>>>    * Look up an address property in a node and return it as an address.
>>>    * The property must hold one address with a length. This is only tested
>>>    * on 32-bit machines.
>>
>>
>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that part
>> of the function comment should have been removed in the commit.
>>
>>> My intention was that this would be an efficient way to decode an
>>> address and size from a device tree. To some extent regmap may take
>>> over this role (IMO we should turn to drop fdtdec one day one we have
>>> more infrastructure). But I'd like it to work efficiently for 32-bit
>>> machines. The new function could hardly be less efficient.
>>
>>
>> Again, there's no way in general to make it more efficient. The efficiency
>> issue is directly implied by the DT data structures.
>>
>> In the special case where the parent node is already known, we can certainly
>> introduce an alternate function that is more efficient. You've already asked
>> for that, and as I said, it's in my TODO list.
>>
>>> I think we should consider the case where the physical address size of
>>> U-Boot and the device tree do not match as a corner case. I certainly
>>> don't want device tree to add loads of pointless code for 'normal'
>>> platforms.
>>
>>
>> It's not a corner case. It's a fundamental part of the DT schema. If U-Boot
>> is going to use DT, it should actually use *DT*, not some-very
>> similar-but-not-quite-DT format with all kinds of implicit and unstated
>> exceptions, limitations, and assumptions. This implies fully honoring all
>> aspects of how DT works when parsing it, not picking and choosing features
>> because some are inconvenient or annoying. If U-Boot doesn't want to
>> correctly implement DT support, it should just drop it completely.
>>
>> As an aside, when instantiating devices, I hope one day that U-Boot will
>> parse DT top-down in a hierarchical way, rather than simply searching for
>> any node anywhere in the DT without regard for whether any parent node is
>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
>> this right now, and is only getting away with this accidentally. Without
>> hacky workarounds in drivers, this won't continue to work for all Tegra HW
>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
>> sub-modules), since parent drivers must initialize before child drivers in
>> order to enable various register buses, including clocks/resets affecting
>> those buses etc.
>>
>> Again, if it's simply too inconvenient or bloated to implement DT properly
>> in U-Boot, let's just drop it entirely. A halfway solution is the worst of
>> both worlds. I'm not convinced the full implications of how to (and the need
>> to) correctly and fully support DT have were well thought through before
>> (control) DT support was added to U-Boot.
>>
>>> Re memory-map, yes it doesn't seem to be possible to do what it is
>>> trying to do (and Thierry says the same below). It is quite weird to
>>> have a SPI peripheral which is also memory mapped.
>>>
>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
>>> 64-bit Tegra, what is actually wrong with the way the function was?
>>
>>
>> With the DT files now checked into U-Boot, I think it would accidentally
>> work, since we just happen to have set #address-cells=<2>, #size-cells=<2>,
>> and that would just happen to match sizeof(fdt_addr_t).
>>
>> However note this is an accident on a couple of levels:
>>
>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
>> #address-cells * 4. This is an invalid assumption since it does not
>> correctly honor the DT schema. It hard-codes the size of values whereas DT
>> schema says the size is defined by the #xxx-cells properties.
>>
>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
>> #size-cells=<1>. I asked him to change that to match what I expected to be
>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request or
>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
>> fdtdec_get_addr_size() would have attempted to read twice as many bytes as
>> it should have from the property. It's entirely plausible that someone could
>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly and
>> enabled it.
>>
>>> This is a boot loader so we should be willing to make some
>>> simplifications.
>>
>>
>> When dealing with internal bootloader details, sure assumptions,
>> simplifications, etc. can be made.
>>
>> However, DT is an externally defined standard. The content of DT must be
>> identical across all OSs (SW stacks, bootloader) and not influenced by
>> requirements/... of any specific individual OS's (SW stack, bootloader)
>> quirks. We can't just pick and choose which parts of it we care about. Well,
>> perhaps if we stop calling it DT we could.
>
> So I think in summary:
>
> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly

It turns out that arch/arm/include/asm/config.h already enables this for 
all ARM64 platforms.

As such, we can in fact go ahead with reverting this patch, and U-Boot 
will still function on Tegra210 boards.

In the short term, I think that means TomR should just apply this revert 
patch, and we don't need to send any additional patches. In the slightly 
longer term, we should add some comments to fdtdec_get_addr_size() 
describing its problems, and slightly longer term, add back Thierry's 
patch, but in a way that lets callers specify whether 
#address-cells/#size-cells should be used, or whether caller-supplied 
hard-coded values should be used.

I apologize for not noticing this earlier; I'd assumed that since none 
of the Tegra-specific files in include/configs/ set this flag, nor any 
of the Tegra-specific Kconfig files, it wasn't set, and hence a revert 
of the patch would break Tegra210 support.

> - then fdtdec_get_addr_size() would work as expected

I take issue with *works* as expected", since I would expect the 
function to implement DT parsing rules completely.

However, it's certainly true to say that it will generate the desired 
results, even if it doesn't /work/ (isn't implemented) as expected.

(I suppose this depends on whether you're talking about "works" w.r.t to 
correctness of the returned results or side-effects, or w.r.t. how the 
internal implementation works.)

> - I want to make this cases as efficient as possible since it will be
> called in SPL
> - You are concerned that making assumptions like this violates the DT spec
>
> One option is to split the functions into two - one that works in SPL
> and makes assumptions, and one that does not and does things the hard
> way.

Why should SPL be any different? U-Boot should either parse DT correctly 
or not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra 
does very little (and isn't even present on Tegra210), but in general, 
can't someone use storage drivers, filesystems, etc. in SPL to choose 
the next stage to load, read GPIOs or other data sources to select 
between different boot paths, perhaps even interact with a user? If so, 
then assuming that SPL can somehow implements a reduced set of features, 
and hence can make assumptions that non-SPL can't, seems quite 
dangerous. We could only do that if we put some active checks in the 
U-Boot makefiles to ensure that nobody enabled anything in the SPL 
besides a set of strictly audited features.

 > I suppose we could also add checks/warnings that the DT is
> 'well-formed' and that the address size matches the machine it is
> running on.

Yes, we certainly should do that.

> In any case we do need to get rid of the parent lookup in this
> function. So can either you or Thierry submit a patch to do this? The
> parent should instead be a parameter to the function. You may need to
> create a stub function which looks it up before calling
> fdtdec_get_addr_size().

Yes, we can't remove the parent lookup in all cases. However, we can 
avoid it in the cases where the caller can supply it. I think that's 
most, but not quite all.

> I'll see how things look in SPL.
>
> Regards,
> Simon
>
Simon Glass Aug. 9, 2015, 3:08 p.m. UTC | #14
Hi Stephen,

On 6 August 2015 at 13:03, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 08/05/2015 05:45 PM, Simon Glass wrote:
>>
>> Hi Stephen,
>>
>> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>
>>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>>>>
>>>>
>>>> Hi Stephen,
>>>>
>>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>
>>>>>
>>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Stephen,
>>>>>>
>>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>>>>>>
>>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
>>>>>>>> as
>>>>>>>> mentioned in code review is very slow.
>>>>>>>>
>>>>>>>> https://patchwork.ozlabs.org/patch/499482/
>>>>>>>> https://patchwork.ozlabs.org/patch/452604/
>>>>>>>>
>>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>>>>>>>> noticed
>>>>>>>> that this was applied. I can send a patch to tidy that up, but in
>>>>>>>> any
>>>>>>>> case
>>>>>>>> I think we should consider a revert until the function is better
>>>>>>>> implemented.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The fact that the function needs to perform a slow operation is not a
>>>>>>> good
>>>>>>> reason for a revert. The slowness of the operation is just a matter
>>>>>>> of
>>>>>>> fact
>>>>>>> with DT not having uplinks in its data structure, and U-Boot using
>>>>>>> those
>>>>>>> data structures directly.
>>>>>>>
>>>>>>> You'd requested during review that I additionally implement a faster
>>>>>>> version
>>>>>>> of the function in the case where the parent node is already known,
>>>>>>> and
>>>>>>> said
>>>>>>> it was fine if that happened in a later patch. I have this on my TODO
>>>>>>> list,
>>>>>>> but it's only been a couple of days.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I didn't expect this to go to mainline before your new patch.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I didn't get that message from the thread; you wrote:
>>>>>
>>>>> Simon Glass wrote:
>>>>>>
>>>>>>
>>>>>> Stephen Warren wrote:
>>>>>>>
>>>>>>>
>>>>>>> Simon Glass wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Also, how about (in addition) a
>>>>>>>> version of this function that works for devices? Like:
>>>>>>>>
>>>>>>>> device_get_addr_size(struct udevice *dev, ...)
>>>>>>>>
>>>>>>>> so that it can handle this for you.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That sounds like a separate patch?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yes, but I think we should get it in there so that people don't start
>>>>>> using this (wildly inefficient) function when they don't need to. I
>>>>>> think by passing in the parent node we force people to think about the
>>>>>> cost.
>>>>>>
>>>>>> Yes the driver model function can be a separate patch, but let's get
>>>>>> it in at about the same time. We have dev_get_addr() so could have
>>>>>> dev_get_addr_size().
>>>>>
>>>>>
>>>>>
>>>>> That sounds to like you'd like a followup patch soon after this patch
>>>>> (my
>>>>> assumption would be within a few days or weeks) to solve your comments,
>>>>> not
>>>>> that you wanted a replacement patch.
>>>>
>>>>
>>>>
>>>> I will take that feedback to be a little more forceful in future, sorry.
>>>>
>>>>>
>>>>>>> Why not just fix the bug since you said you could? That seems far
>>>>>>> better
>>>>>>> than breaking the newly added Tegra210 support.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
>>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
>>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
>>>>>> misunderstand the problem the patch was trying to fix. As I said it
>>>>>> did not have a commit message, so who knows :-)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The
>>>>> only
>>>>> thing that matters is #address-cells/#size-cells. Those properties are
>>>>> what
>>>>> tell the parsing code how many bytes to read from the reg property.
>>>>> Whether
>>>>> the resultant value fits into the code's internal representation is an
>>>>> internal detail of the code, not part of the semantics of DT itself or
>>>>> how
>>>>> to parse it.
>>>>>
>>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>>>>> quite likely that everything will just happen to work most of the time.
>>>>> However, this is purely an accident and not something that anything
>>>>> should
>>>>> rely upon.
>>>>>
>>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which
>>>>> is
>>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
>>>>> practice
>>>>> is perfectly /legal/ and will work out just fine since, except perhaps
>>>>> for
>>>>> RAM sizes, I don't believe any value in DT will actually require more
>>>>> than
>>>>> 32-bits to represent)
>>>>
>>>>
>>>>
>>>> I would like to have the types match the platform where possible. At
>>>> least the types should not be smaller than the platform - e.g. if
>>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
>>>
>>>
>>>
>>> In general, there's no "should not" here; we "cannot". If there's a
>>> 64-bit
>>> value in the DT (with bits above bit 31 set), then it can't be stored in
>>> a
>>> 32-bit variable.
>>>
>>> That said, if a DT has #address-cells=<2>, but the actual values stored
>>> in
>>> all reg values have 0 for the MS word, that'll actually work just fine.
>>> Note
>>> that it's 100% legal to set #address-cells=<100> and just write a bunch
>>> of
>>> extra zeros into the property. Silly perhaps, but perfectly legal. Since
>>> the
>>> function should adapt to whatever #address-cells value is in the DT,
>>> supporting that case isn't any more work, so there's no reason we
>>> shouldn't.
>>>
>>>> address/size in the device tree. This is for efficiency. We don't want
>>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
>>>> things for no benefit.
>>>
>>>
>>>
>>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
>>> that's unrelated to using the correct algorithm to parse DT.
>>>
>>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>>>>>>> mention this at all.
>>>>>>
>>>>>>
>>>>>>
>>>>>> It calls that function expecting it to pick up an address and size
>>>>>> from two consecutive cells. With this patch, that fails (unless the
>>>>>> property happens to be "reg").
>>>
>>>
>>> ...
>>>>>
>>>>>
>>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
>>>>> node,
>>>>> the code and DT content are clearly inconsistent; For this node
>>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
>>>>> address
>>>>> is a chip-select and hence has no size. So, the code should not assume
>>>>> that
>>>>> the memory-map property can be parsed in the same way as a reg
>>>>> property.
>>>>>
>>>>> I note that the memory-map property doesn't exist in the Linux kernel's
>>>>> DT
>>>>> binding documentation database, or code, hence hasn't been reviewed by
>>>>> the
>>>>> DT binding maintainers.
>>>>
>>>>
>>>>
>>>> The function comment says:
>>>>
>>>>    * Look up an address property in a node and return it as an address.
>>>>    * The property must hold one address with a length. This is only
>>>> tested
>>>>    * on 32-bit machines.
>>>
>>>
>>>
>>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that
>>> part
>>> of the function comment should have been removed in the commit.
>>>
>>>> My intention was that this would be an efficient way to decode an
>>>> address and size from a device tree. To some extent regmap may take
>>>> over this role (IMO we should turn to drop fdtdec one day one we have
>>>> more infrastructure). But I'd like it to work efficiently for 32-bit
>>>> machines. The new function could hardly be less efficient.
>>>
>>>
>>>
>>> Again, there's no way in general to make it more efficient. The
>>> efficiency
>>> issue is directly implied by the DT data structures.
>>>
>>> In the special case where the parent node is already known, we can
>>> certainly
>>> introduce an alternate function that is more efficient. You've already
>>> asked
>>> for that, and as I said, it's in my TODO list.
>>>
>>>> I think we should consider the case where the physical address size of
>>>> U-Boot and the device tree do not match as a corner case. I certainly
>>>> don't want device tree to add loads of pointless code for 'normal'
>>>> platforms.
>>>
>>>
>>>
>>> It's not a corner case. It's a fundamental part of the DT schema. If
>>> U-Boot
>>> is going to use DT, it should actually use *DT*, not some-very
>>> similar-but-not-quite-DT format with all kinds of implicit and unstated
>>> exceptions, limitations, and assumptions. This implies fully honoring all
>>> aspects of how DT works when parsing it, not picking and choosing
>>> features
>>> because some are inconvenient or annoying. If U-Boot doesn't want to
>>> correctly implement DT support, it should just drop it completely.
>>>
>>> As an aside, when instantiating devices, I hope one day that U-Boot will
>>> parse DT top-down in a hierarchical way, rather than simply searching for
>>> any node anywhere in the DT without regard for whether any parent node is
>>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
>>> this right now, and is only getting away with this accidentally. Without
>>> hacky workarounds in drivers, this won't continue to work for all Tegra
>>> HW
>>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
>>> sub-modules), since parent drivers must initialize before child drivers
>>> in
>>> order to enable various register buses, including clocks/resets affecting
>>> those buses etc.
>>>
>>> Again, if it's simply too inconvenient or bloated to implement DT
>>> properly
>>> in U-Boot, let's just drop it entirely. A halfway solution is the worst
>>> of
>>> both worlds. I'm not convinced the full implications of how to (and the
>>> need
>>> to) correctly and fully support DT have were well thought through before
>>> (control) DT support was added to U-Boot.
>>>
>>>> Re memory-map, yes it doesn't seem to be possible to do what it is
>>>> trying to do (and Thierry says the same below). It is quite weird to
>>>> have a SPI peripheral which is also memory mapped.
>>>>
>>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
>>>> 64-bit Tegra, what is actually wrong with the way the function was?
>>>
>>>
>>>
>>> With the DT files now checked into U-Boot, I think it would accidentally
>>> work, since we just happen to have set #address-cells=<2>,
>>> #size-cells=<2>,
>>> and that would just happen to match sizeof(fdt_addr_t).
>>>
>>> However note this is an accident on a couple of levels:
>>>
>>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
>>> #address-cells * 4. This is an invalid assumption since it does not
>>> correctly honor the DT schema. It hard-codes the size of values whereas
>>> DT
>>> schema says the size is defined by the #xxx-cells properties.
>>>
>>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
>>> #size-cells=<1>. I asked him to change that to match what I expected to
>>> be
>>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request
>>> or
>>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
>>> fdtdec_get_addr_size() would have attempted to read twice as many bytes
>>> as
>>> it should have from the property. It's entirely plausible that someone
>>> could
>>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly
>>> and
>>> enabled it.
>>>
>>>> This is a boot loader so we should be willing to make some
>>>> simplifications.
>>>
>>>
>>>
>>> When dealing with internal bootloader details, sure assumptions,
>>> simplifications, etc. can be made.
>>>
>>> However, DT is an externally defined standard. The content of DT must be
>>> identical across all OSs (SW stacks, bootloader) and not influenced by
>>> requirements/... of any specific individual OS's (SW stack, bootloader)
>>> quirks. We can't just pick and choose which parts of it we care about.
>>> Well,
>>> perhaps if we stop calling it DT we could.
>>
>>
>> So I think in summary:
>>
>> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
>
>
> It turns out that arch/arm/include/asm/config.h already enables this for all
> ARM64 platforms.
>
> As such, we can in fact go ahead with reverting this patch, and U-Boot will
> still function on Tegra210 boards.
>
> In the short term, I think that means TomR should just apply this revert
> patch, and we don't need to send any additional patches. In the slightly
> longer term, we should add some comments to fdtdec_get_addr_size()
> describing its problems, and slightly longer term, add back Thierry's patch,
> but in a way that lets callers specify whether #address-cells/#size-cells
> should be used, or whether caller-supplied hard-coded values should be used.

OK great. But I see your new patch so I think we can apply both at the
same time.

>
> I apologize for not noticing this earlier; I'd assumed that since none of
> the Tegra-specific files in include/configs/ set this flag, nor any of the
> Tegra-specific Kconfig files, it wasn't set, and hence a revert of the patch
> would break Tegra210 support.

No problem - I assumed it would also.

>
>> - then fdtdec_get_addr_size() would work as expected
>
>
> I take issue with *works* as expected", since I would expect the function to
> implement DT parsing rules completely.
>
> However, it's certainly true to say that it will generate the desired
> results, even if it doesn't /work/ (isn't implemented) as expected.
>
> (I suppose this depends on whether you're talking about "works" w.r.t to
> correctness of the returned results or side-effects, or w.r.t. how the
> internal implementation works.)
>
>> - I want to make this cases as efficient as possible since it will be
>> called in SPL
>> - You are concerned that making assumptions like this violates the DT spec
>>
>> One option is to split the functions into two - one that works in SPL
>> and makes assumptions, and one that does not and does things the hard
>> way.
>
>
> Why should SPL be any different? U-Boot should either parse DT correctly or
> not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra does
> very little (and isn't even present on Tegra210), but in general, can't
> someone use storage drivers, filesystems, etc. in SPL to choose the next
> stage to load, read GPIOs or other data sources to select between different
> boot paths, perhaps even interact with a user? If so, then assuming that SPL
> can somehow implements a reduced set of features, and hence can make
> assumptions that non-SPL can't, seems quite dangerous. We could only do that
> if we put some active checks in the U-Boot makefiles to ensure that nobody
> enabled anything in the SPL besides a set of strictly audited features.

Yes we could do that and it would avoid pain later. I suppose SPL on
Tegra is a bit of a special case since there is really no size limit.
For some chips the limits are pretty severe so I am quite sensitive to
code size even at the expense of extra debug time when something
breaks.

>
>> I suppose we could also add checks/warnings that the DT is
>>
>> 'well-formed' and that the address size matches the machine it is
>> running on.
>
>
> Yes, we certainly should do that.
>
>> In any case we do need to get rid of the parent lookup in this
>> function. So can either you or Thierry submit a patch to do this? The
>> parent should instead be a parameter to the function. You may need to
>> create a stub function which looks it up before calling
>> fdtdec_get_addr_size().
>
>
> Yes, we can't remove the parent lookup in all cases. However, we can avoid
> it in the cases where the caller can supply it. I think that's most, but not
> quite all.

My main concern is dev_get_addr() since that is the official way of
reading a device address now. See also regmap_init_mem() which does
things its own way.

>
>
>> I'll see how things look in SPL.
>>
>> Regards,
>> Simon
>>
>

Regards,
Simon
Bin Meng Aug. 14, 2015, 8:10 a.m. UTC | #15
Hi,

On Sun, Aug 9, 2015 at 11:08 PM, Simon Glass <sjg@chromium.org> wrote:
> Hi Stephen,
>
> On 6 August 2015 at 13:03, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 08/05/2015 05:45 PM, Simon Glass wrote:
>>>
>>> Hi Stephen,
>>>
>>> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>
>>>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>>>>>
>>>>>
>>>>> Hi Stephen,
>>>>>
>>>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>>
>>>>>>
>>>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Stephen,
>>>>>>>
>>>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>>>>>>>
>>>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
>>>>>>>>> as
>>>>>>>>> mentioned in code review is very slow.
>>>>>>>>>
>>>>>>>>> https://patchwork.ozlabs.org/patch/499482/
>>>>>>>>> https://patchwork.ozlabs.org/patch/452604/
>>>>>>>>>
>>>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>>>>>>>>> noticed
>>>>>>>>> that this was applied. I can send a patch to tidy that up, but in
>>>>>>>>> any
>>>>>>>>> case
>>>>>>>>> I think we should consider a revert until the function is better
>>>>>>>>> implemented.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The fact that the function needs to perform a slow operation is not a
>>>>>>>> good
>>>>>>>> reason for a revert. The slowness of the operation is just a matter
>>>>>>>> of
>>>>>>>> fact
>>>>>>>> with DT not having uplinks in its data structure, and U-Boot using
>>>>>>>> those
>>>>>>>> data structures directly.
>>>>>>>>
>>>>>>>> You'd requested during review that I additionally implement a faster
>>>>>>>> version
>>>>>>>> of the function in the case where the parent node is already known,
>>>>>>>> and
>>>>>>>> said
>>>>>>>> it was fine if that happened in a later patch. I have this on my TODO
>>>>>>>> list,
>>>>>>>> but it's only been a couple of days.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I didn't expect this to go to mainline before your new patch.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I didn't get that message from the thread; you wrote:
>>>>>>
>>>>>> Simon Glass wrote:
>>>>>>>
>>>>>>>
>>>>>>> Stephen Warren wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Simon Glass wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Also, how about (in addition) a
>>>>>>>>> version of this function that works for devices? Like:
>>>>>>>>>
>>>>>>>>> device_get_addr_size(struct udevice *dev, ...)
>>>>>>>>>
>>>>>>>>> so that it can handle this for you.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> That sounds like a separate patch?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yes, but I think we should get it in there so that people don't start
>>>>>>> using this (wildly inefficient) function when they don't need to. I
>>>>>>> think by passing in the parent node we force people to think about the
>>>>>>> cost.
>>>>>>>
>>>>>>> Yes the driver model function can be a separate patch, but let's get
>>>>>>> it in at about the same time. We have dev_get_addr() so could have
>>>>>>> dev_get_addr_size().
>>>>>>
>>>>>>
>>>>>>
>>>>>> That sounds to like you'd like a followup patch soon after this patch
>>>>>> (my
>>>>>> assumption would be within a few days or weeks) to solve your comments,
>>>>>> not
>>>>>> that you wanted a replacement patch.
>>>>>
>>>>>
>>>>>
>>>>> I will take that feedback to be a little more forceful in future, sorry.
>>>>>
>>>>>>
>>>>>>>> Why not just fix the bug since you said you could? That seems far
>>>>>>>> better
>>>>>>>> than breaking the newly added Tegra210 support.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
>>>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
>>>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
>>>>>>> misunderstand the problem the patch was trying to fix. As I said it
>>>>>>> did not have a commit message, so who knows :-)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The
>>>>>> only
>>>>>> thing that matters is #address-cells/#size-cells. Those properties are
>>>>>> what
>>>>>> tell the parsing code how many bytes to read from the reg property.
>>>>>> Whether
>>>>>> the resultant value fits into the code's internal representation is an
>>>>>> internal detail of the code, not part of the semantics of DT itself or
>>>>>> how
>>>>>> to parse it.
>>>>>>
>>>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>>>>>> quite likely that everything will just happen to work most of the time.
>>>>>> However, this is purely an accident and not something that anything
>>>>>> should
>>>>>> rely upon.
>>>>>>
>>>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which
>>>>>> is
>>>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
>>>>>> practice
>>>>>> is perfectly /legal/ and will work out just fine since, except perhaps
>>>>>> for
>>>>>> RAM sizes, I don't believe any value in DT will actually require more
>>>>>> than
>>>>>> 32-bits to represent)
>>>>>
>>>>>
>>>>>
>>>>> I would like to have the types match the platform where possible. At
>>>>> least the types should not be smaller than the platform - e.g. if
>>>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
>>>>
>>>>
>>>>
>>>> In general, there's no "should not" here; we "cannot". If there's a
>>>> 64-bit
>>>> value in the DT (with bits above bit 31 set), then it can't be stored in
>>>> a
>>>> 32-bit variable.
>>>>
>>>> That said, if a DT has #address-cells=<2>, but the actual values stored
>>>> in
>>>> all reg values have 0 for the MS word, that'll actually work just fine.
>>>> Note
>>>> that it's 100% legal to set #address-cells=<100> and just write a bunch
>>>> of
>>>> extra zeros into the property. Silly perhaps, but perfectly legal. Since
>>>> the
>>>> function should adapt to whatever #address-cells value is in the DT,
>>>> supporting that case isn't any more work, so there's no reason we
>>>> shouldn't.
>>>>
>>>>> address/size in the device tree. This is for efficiency. We don't want
>>>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
>>>>> things for no benefit.
>>>>
>>>>
>>>>
>>>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
>>>> that's unrelated to using the correct algorithm to parse DT.
>>>>
>>>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>>>>>>>> mention this at all.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> It calls that function expecting it to pick up an address and size
>>>>>>> from two consecutive cells. With this patch, that fails (unless the
>>>>>>> property happens to be "reg").
>>>>
>>>>
>>>> ...
>>>>>>
>>>>>>
>>>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>>>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>>>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
>>>>>> node,
>>>>>> the code and DT content are clearly inconsistent; For this node
>>>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
>>>>>> address
>>>>>> is a chip-select and hence has no size. So, the code should not assume
>>>>>> that
>>>>>> the memory-map property can be parsed in the same way as a reg
>>>>>> property.
>>>>>>
>>>>>> I note that the memory-map property doesn't exist in the Linux kernel's
>>>>>> DT
>>>>>> binding documentation database, or code, hence hasn't been reviewed by
>>>>>> the
>>>>>> DT binding maintainers.
>>>>>
>>>>>
>>>>>
>>>>> The function comment says:
>>>>>
>>>>>    * Look up an address property in a node and return it as an address.
>>>>>    * The property must hold one address with a length. This is only
>>>>> tested
>>>>>    * on 32-bit machines.
>>>>
>>>>
>>>>
>>>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that
>>>> part
>>>> of the function comment should have been removed in the commit.
>>>>
>>>>> My intention was that this would be an efficient way to decode an
>>>>> address and size from a device tree. To some extent regmap may take
>>>>> over this role (IMO we should turn to drop fdtdec one day one we have
>>>>> more infrastructure). But I'd like it to work efficiently for 32-bit
>>>>> machines. The new function could hardly be less efficient.
>>>>
>>>>
>>>>
>>>> Again, there's no way in general to make it more efficient. The
>>>> efficiency
>>>> issue is directly implied by the DT data structures.
>>>>
>>>> In the special case where the parent node is already known, we can
>>>> certainly
>>>> introduce an alternate function that is more efficient. You've already
>>>> asked
>>>> for that, and as I said, it's in my TODO list.
>>>>
>>>>> I think we should consider the case where the physical address size of
>>>>> U-Boot and the device tree do not match as a corner case. I certainly
>>>>> don't want device tree to add loads of pointless code for 'normal'
>>>>> platforms.
>>>>
>>>>
>>>>
>>>> It's not a corner case. It's a fundamental part of the DT schema. If
>>>> U-Boot
>>>> is going to use DT, it should actually use *DT*, not some-very
>>>> similar-but-not-quite-DT format with all kinds of implicit and unstated
>>>> exceptions, limitations, and assumptions. This implies fully honoring all
>>>> aspects of how DT works when parsing it, not picking and choosing
>>>> features
>>>> because some are inconvenient or annoying. If U-Boot doesn't want to
>>>> correctly implement DT support, it should just drop it completely.
>>>>
>>>> As an aside, when instantiating devices, I hope one day that U-Boot will
>>>> parse DT top-down in a hierarchical way, rather than simply searching for
>>>> any node anywhere in the DT without regard for whether any parent node is
>>>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
>>>> this right now, and is only getting away with this accidentally. Without
>>>> hacky workarounds in drivers, this won't continue to work for all Tegra
>>>> HW
>>>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
>>>> sub-modules), since parent drivers must initialize before child drivers
>>>> in
>>>> order to enable various register buses, including clocks/resets affecting
>>>> those buses etc.
>>>>
>>>> Again, if it's simply too inconvenient or bloated to implement DT
>>>> properly
>>>> in U-Boot, let's just drop it entirely. A halfway solution is the worst
>>>> of
>>>> both worlds. I'm not convinced the full implications of how to (and the
>>>> need
>>>> to) correctly and fully support DT have were well thought through before
>>>> (control) DT support was added to U-Boot.
>>>>
>>>>> Re memory-map, yes it doesn't seem to be possible to do what it is
>>>>> trying to do (and Thierry says the same below). It is quite weird to
>>>>> have a SPI peripheral which is also memory mapped.
>>>>>
>>>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
>>>>> 64-bit Tegra, what is actually wrong with the way the function was?
>>>>
>>>>
>>>>
>>>> With the DT files now checked into U-Boot, I think it would accidentally
>>>> work, since we just happen to have set #address-cells=<2>,
>>>> #size-cells=<2>,
>>>> and that would just happen to match sizeof(fdt_addr_t).
>>>>
>>>> However note this is an accident on a couple of levels:
>>>>
>>>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
>>>> #address-cells * 4. This is an invalid assumption since it does not
>>>> correctly honor the DT schema. It hard-codes the size of values whereas
>>>> DT
>>>> schema says the size is defined by the #xxx-cells properties.
>>>>
>>>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
>>>> #size-cells=<1>. I asked him to change that to match what I expected to
>>>> be
>>>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request
>>>> or
>>>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
>>>> fdtdec_get_addr_size() would have attempted to read twice as many bytes
>>>> as
>>>> it should have from the property. It's entirely plausible that someone
>>>> could
>>>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly
>>>> and
>>>> enabled it.
>>>>
>>>>> This is a boot loader so we should be willing to make some
>>>>> simplifications.
>>>>
>>>>
>>>>
>>>> When dealing with internal bootloader details, sure assumptions,
>>>> simplifications, etc. can be made.
>>>>
>>>> However, DT is an externally defined standard. The content of DT must be
>>>> identical across all OSs (SW stacks, bootloader) and not influenced by
>>>> requirements/... of any specific individual OS's (SW stack, bootloader)
>>>> quirks. We can't just pick and choose which parts of it we care about.
>>>> Well,
>>>> perhaps if we stop calling it DT we could.
>>>
>>>
>>> So I think in summary:
>>>
>>> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
>>
>>
>> It turns out that arch/arm/include/asm/config.h already enables this for all
>> ARM64 platforms.
>>
>> As such, we can in fact go ahead with reverting this patch, and U-Boot will
>> still function on Tegra210 boards.
>>
>> In the short term, I think that means TomR should just apply this revert
>> patch, and we don't need to send any additional patches. In the slightly
>> longer term, we should add some comments to fdtdec_get_addr_size()
>> describing its problems, and slightly longer term, add back Thierry's patch,
>> but in a way that lets callers specify whether #address-cells/#size-cells
>> should be used, or whether caller-supplied hard-coded values should be used.
>
> OK great. But I see your new patch so I think we can apply both at the
> same time.
>
>>
>> I apologize for not noticing this earlier; I'd assumed that since none of
>> the Tegra-specific files in include/configs/ set this flag, nor any of the
>> Tegra-specific Kconfig files, it wasn't set, and hence a revert of the patch
>> would break Tegra210 support.
>
> No problem - I assumed it would also.
>
>>
>>> - then fdtdec_get_addr_size() would work as expected
>>
>>
>> I take issue with *works* as expected", since I would expect the function to
>> implement DT parsing rules completely.
>>
>> However, it's certainly true to say that it will generate the desired
>> results, even if it doesn't /work/ (isn't implemented) as expected.
>>
>> (I suppose this depends on whether you're talking about "works" w.r.t to
>> correctness of the returned results or side-effects, or w.r.t. how the
>> internal implementation works.)
>>
>>> - I want to make this cases as efficient as possible since it will be
>>> called in SPL
>>> - You are concerned that making assumptions like this violates the DT spec
>>>
>>> One option is to split the functions into two - one that works in SPL
>>> and makes assumptions, and one that does not and does things the hard
>>> way.
>>
>>
>> Why should SPL be any different? U-Boot should either parse DT correctly or
>> not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra does
>> very little (and isn't even present on Tegra210), but in general, can't
>> someone use storage drivers, filesystems, etc. in SPL to choose the next
>> stage to load, read GPIOs or other data sources to select between different
>> boot paths, perhaps even interact with a user? If so, then assuming that SPL
>> can somehow implements a reduced set of features, and hence can make
>> assumptions that non-SPL can't, seems quite dangerous. We could only do that
>> if we put some active checks in the U-Boot makefiles to ensure that nobody
>> enabled anything in the SPL besides a set of strictly audited features.
>
> Yes we could do that and it would avoid pain later. I suppose SPL on
> Tegra is a bit of a special case since there is really no size limit.
> For some chips the limits are pretty severe so I am quite sensitive to
> code size even at the expense of extra debug time when something
> breaks.
>
>>
>>> I suppose we could also add checks/warnings that the DT is
>>>
>>> 'well-formed' and that the address size matches the machine it is
>>> running on.
>>
>>
>> Yes, we certainly should do that.
>>
>>> In any case we do need to get rid of the parent lookup in this
>>> function. So can either you or Thierry submit a patch to do this? The
>>> parent should instead be a parameter to the function. You may need to
>>> create a stub function which looks it up before calling
>>> fdtdec_get_addr_size().
>>
>>
>> Yes, we can't remove the parent lookup in all cases. However, we can avoid
>> it in the cases where the caller can supply it. I think that's most, but not
>> quite all.
>
> My main concern is dev_get_addr() since that is the official way of
> reading a device address now. See also regmap_init_mem() which does
> things its own way.
>
>>
>>
>>> I'll see how things look in SPL.
>>>
>>> Regards,
>>> Simon
>>>
>>
>

Do we have any conclusion about commit 5b34436? Today I started to
check the pre-relocatoin DM PCI UART issue, but found it is now broken
due to this commit. The broken part is at
ns16550_serial_ofdata_to_platdata() in drivers/serial/ns16550.c, in
which it has:

    addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
#ifdef CONFIG_PCI
    if (addr == FDT_ADDR_T_NONE) {
...

Before commit 5b34436, the old behavior is that the call to
fdtdec_get_addr() returns FDT_ADDR_T_NONE so that we can trap into the
PCI logic. But with commit 5b34436, addr is now zero which just bypass
this logic.

As for why addr is now zero, this is because fdtdec_get_number() can
only handle a 64-bit number at most. However for PCI reg, it has 3
cells. So if I have the following encoding:

reg = <0x00025100 0x0 0x0 0x0 0x0
           0x01025110 0x0 0x0 0x0 0x0>;

The addr will be assigned to zero after two rounds of left shift by 32-bit.

I can certainly change ns16550 driver to test the return value against
0 now, but I think this fdtdec_get_addr() does not cover all cases.
Please advise.

Regards,
Bin
Thierry Reding Aug. 14, 2015, 8:32 a.m. UTC | #16
On Fri, Aug 14, 2015 at 04:10:32PM +0800, Bin Meng wrote:
> Hi,
> 
> On Sun, Aug 9, 2015 at 11:08 PM, Simon Glass <sjg@chromium.org> wrote:
> > Hi Stephen,
> >
> > On 6 August 2015 at 13:03, Stephen Warren <swarren@wwwdotorg.org> wrote:
> >> On 08/05/2015 05:45 PM, Simon Glass wrote:
> >>>
> >>> Hi Stephen,
> >>>
> >>> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
> >>>>
> >>>> On 08/04/2015 10:08 PM, Simon Glass wrote:
> >>>>>
> >>>>>
> >>>>> Hi Stephen,
> >>>>>
> >>>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Hi Stephen,
> >>>>>>>
> >>>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
> >>>>>>>>>
> >>>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
> >>>>>>>>> as
> >>>>>>>>> mentioned in code review is very slow.
> >>>>>>>>>
> >>>>>>>>> https://patchwork.ozlabs.org/patch/499482/
> >>>>>>>>> https://patchwork.ozlabs.org/patch/452604/
> >>>>>>>>>
> >>>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
> >>>>>>>>> noticed
> >>>>>>>>> that this was applied. I can send a patch to tidy that up, but in
> >>>>>>>>> any
> >>>>>>>>> case
> >>>>>>>>> I think we should consider a revert until the function is better
> >>>>>>>>> implemented.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> The fact that the function needs to perform a slow operation is not a
> >>>>>>>> good
> >>>>>>>> reason for a revert. The slowness of the operation is just a matter
> >>>>>>>> of
> >>>>>>>> fact
> >>>>>>>> with DT not having uplinks in its data structure, and U-Boot using
> >>>>>>>> those
> >>>>>>>> data structures directly.
> >>>>>>>>
> >>>>>>>> You'd requested during review that I additionally implement a faster
> >>>>>>>> version
> >>>>>>>> of the function in the case where the parent node is already known,
> >>>>>>>> and
> >>>>>>>> said
> >>>>>>>> it was fine if that happened in a later patch. I have this on my TODO
> >>>>>>>> list,
> >>>>>>>> but it's only been a couple of days.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> I didn't expect this to go to mainline before your new patch.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> I didn't get that message from the thread; you wrote:
> >>>>>>
> >>>>>> Simon Glass wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> Stephen Warren wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Simon Glass wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Also, how about (in addition) a
> >>>>>>>>> version of this function that works for devices? Like:
> >>>>>>>>>
> >>>>>>>>> device_get_addr_size(struct udevice *dev, ...)
> >>>>>>>>>
> >>>>>>>>> so that it can handle this for you.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> That sounds like a separate patch?
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Yes, but I think we should get it in there so that people don't start
> >>>>>>> using this (wildly inefficient) function when they don't need to. I
> >>>>>>> think by passing in the parent node we force people to think about the
> >>>>>>> cost.
> >>>>>>>
> >>>>>>> Yes the driver model function can be a separate patch, but let's get
> >>>>>>> it in at about the same time. We have dev_get_addr() so could have
> >>>>>>> dev_get_addr_size().
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> That sounds to like you'd like a followup patch soon after this patch
> >>>>>> (my
> >>>>>> assumption would be within a few days or weeks) to solve your comments,
> >>>>>> not
> >>>>>> that you wanted a replacement patch.
> >>>>>
> >>>>>
> >>>>>
> >>>>> I will take that feedback to be a little more forceful in future, sorry.
> >>>>>
> >>>>>>
> >>>>>>>> Why not just fix the bug since you said you could? That seems far
> >>>>>>>> better
> >>>>>>>> than breaking the newly added Tegra210 support.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
> >>>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
> >>>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
> >>>>>>> misunderstand the problem the patch was trying to fix. As I said it
> >>>>>>> did not have a commit message, so who knows :-)
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The
> >>>>>> only
> >>>>>> thing that matters is #address-cells/#size-cells. Those properties are
> >>>>>> what
> >>>>>> tell the parsing code how many bytes to read from the reg property.
> >>>>>> Whether
> >>>>>> the resultant value fits into the code's internal representation is an
> >>>>>> internal detail of the code, not part of the semantics of DT itself or
> >>>>>> how
> >>>>>> to parse it.
> >>>>>>
> >>>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
> >>>>>> quite likely that everything will just happen to work most of the time.
> >>>>>> However, this is purely an accident and not something that anything
> >>>>>> should
> >>>>>> rely upon.
> >>>>>>
> >>>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which
> >>>>>> is
> >>>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
> >>>>>> practice
> >>>>>> is perfectly /legal/ and will work out just fine since, except perhaps
> >>>>>> for
> >>>>>> RAM sizes, I don't believe any value in DT will actually require more
> >>>>>> than
> >>>>>> 32-bits to represent)
> >>>>>
> >>>>>
> >>>>>
> >>>>> I would like to have the types match the platform where possible. At
> >>>>> least the types should not be smaller than the platform - e.g. if
> >>>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
> >>>>
> >>>>
> >>>>
> >>>> In general, there's no "should not" here; we "cannot". If there's a
> >>>> 64-bit
> >>>> value in the DT (with bits above bit 31 set), then it can't be stored in
> >>>> a
> >>>> 32-bit variable.
> >>>>
> >>>> That said, if a DT has #address-cells=<2>, but the actual values stored
> >>>> in
> >>>> all reg values have 0 for the MS word, that'll actually work just fine.
> >>>> Note
> >>>> that it's 100% legal to set #address-cells=<100> and just write a bunch
> >>>> of
> >>>> extra zeros into the property. Silly perhaps, but perfectly legal. Since
> >>>> the
> >>>> function should adapt to whatever #address-cells value is in the DT,
> >>>> supporting that case isn't any more work, so there's no reason we
> >>>> shouldn't.
> >>>>
> >>>>> address/size in the device tree. This is for efficiency. We don't want
> >>>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
> >>>>> things for no benefit.
> >>>>
> >>>>
> >>>>
> >>>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
> >>>> that's unrelated to using the correct algorithm to parse DT.
> >>>>
> >>>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
> >>>>>>>> mention this at all.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> It calls that function expecting it to pick up an address and size
> >>>>>>> from two consecutive cells. With this patch, that fails (unless the
> >>>>>>> property happens to be "reg").
> >>>>
> >>>>
> >>>> ...
> >>>>>>
> >>>>>>
> >>>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
> >>>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
> >>>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
> >>>>>> node,
> >>>>>> the code and DT content are clearly inconsistent; For this node
> >>>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
> >>>>>> address
> >>>>>> is a chip-select and hence has no size. So, the code should not assume
> >>>>>> that
> >>>>>> the memory-map property can be parsed in the same way as a reg
> >>>>>> property.
> >>>>>>
> >>>>>> I note that the memory-map property doesn't exist in the Linux kernel's
> >>>>>> DT
> >>>>>> binding documentation database, or code, hence hasn't been reviewed by
> >>>>>> the
> >>>>>> DT binding maintainers.
> >>>>>
> >>>>>
> >>>>>
> >>>>> The function comment says:
> >>>>>
> >>>>>    * Look up an address property in a node and return it as an address.
> >>>>>    * The property must hold one address with a length. This is only
> >>>>> tested
> >>>>>    * on 32-bit machines.
> >>>>
> >>>>
> >>>>
> >>>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that
> >>>> part
> >>>> of the function comment should have been removed in the commit.
> >>>>
> >>>>> My intention was that this would be an efficient way to decode an
> >>>>> address and size from a device tree. To some extent regmap may take
> >>>>> over this role (IMO we should turn to drop fdtdec one day one we have
> >>>>> more infrastructure). But I'd like it to work efficiently for 32-bit
> >>>>> machines. The new function could hardly be less efficient.
> >>>>
> >>>>
> >>>>
> >>>> Again, there's no way in general to make it more efficient. The
> >>>> efficiency
> >>>> issue is directly implied by the DT data structures.
> >>>>
> >>>> In the special case where the parent node is already known, we can
> >>>> certainly
> >>>> introduce an alternate function that is more efficient. You've already
> >>>> asked
> >>>> for that, and as I said, it's in my TODO list.
> >>>>
> >>>>> I think we should consider the case where the physical address size of
> >>>>> U-Boot and the device tree do not match as a corner case. I certainly
> >>>>> don't want device tree to add loads of pointless code for 'normal'
> >>>>> platforms.
> >>>>
> >>>>
> >>>>
> >>>> It's not a corner case. It's a fundamental part of the DT schema. If
> >>>> U-Boot
> >>>> is going to use DT, it should actually use *DT*, not some-very
> >>>> similar-but-not-quite-DT format with all kinds of implicit and unstated
> >>>> exceptions, limitations, and assumptions. This implies fully honoring all
> >>>> aspects of how DT works when parsing it, not picking and choosing
> >>>> features
> >>>> because some are inconvenient or annoying. If U-Boot doesn't want to
> >>>> correctly implement DT support, it should just drop it completely.
> >>>>
> >>>> As an aside, when instantiating devices, I hope one day that U-Boot will
> >>>> parse DT top-down in a hierarchical way, rather than simply searching for
> >>>> any node anywhere in the DT without regard for whether any parent node is
> >>>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
> >>>> this right now, and is only getting away with this accidentally. Without
> >>>> hacky workarounds in drivers, this won't continue to work for all Tegra
> >>>> HW
> >>>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
> >>>> sub-modules), since parent drivers must initialize before child drivers
> >>>> in
> >>>> order to enable various register buses, including clocks/resets affecting
> >>>> those buses etc.
> >>>>
> >>>> Again, if it's simply too inconvenient or bloated to implement DT
> >>>> properly
> >>>> in U-Boot, let's just drop it entirely. A halfway solution is the worst
> >>>> of
> >>>> both worlds. I'm not convinced the full implications of how to (and the
> >>>> need
> >>>> to) correctly and fully support DT have were well thought through before
> >>>> (control) DT support was added to U-Boot.
> >>>>
> >>>>> Re memory-map, yes it doesn't seem to be possible to do what it is
> >>>>> trying to do (and Thierry says the same below). It is quite weird to
> >>>>> have a SPI peripheral which is also memory mapped.
> >>>>>
> >>>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
> >>>>> 64-bit Tegra, what is actually wrong with the way the function was?
> >>>>
> >>>>
> >>>>
> >>>> With the DT files now checked into U-Boot, I think it would accidentally
> >>>> work, since we just happen to have set #address-cells=<2>,
> >>>> #size-cells=<2>,
> >>>> and that would just happen to match sizeof(fdt_addr_t).
> >>>>
> >>>> However note this is an accident on a couple of levels:
> >>>>
> >>>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
> >>>> #address-cells * 4. This is an invalid assumption since it does not
> >>>> correctly honor the DT schema. It hard-codes the size of values whereas
> >>>> DT
> >>>> schema says the size is defined by the #xxx-cells properties.
> >>>>
> >>>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
> >>>> #size-cells=<1>. I asked him to change that to match what I expected to
> >>>> be
> >>>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request
> >>>> or
> >>>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
> >>>> fdtdec_get_addr_size() would have attempted to read twice as many bytes
> >>>> as
> >>>> it should have from the property. It's entirely plausible that someone
> >>>> could
> >>>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly
> >>>> and
> >>>> enabled it.
> >>>>
> >>>>> This is a boot loader so we should be willing to make some
> >>>>> simplifications.
> >>>>
> >>>>
> >>>>
> >>>> When dealing with internal bootloader details, sure assumptions,
> >>>> simplifications, etc. can be made.
> >>>>
> >>>> However, DT is an externally defined standard. The content of DT must be
> >>>> identical across all OSs (SW stacks, bootloader) and not influenced by
> >>>> requirements/... of any specific individual OS's (SW stack, bootloader)
> >>>> quirks. We can't just pick and choose which parts of it we care about.
> >>>> Well,
> >>>> perhaps if we stop calling it DT we could.
> >>>
> >>>
> >>> So I think in summary:
> >>>
> >>> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
> >>
> >>
> >> It turns out that arch/arm/include/asm/config.h already enables this for all
> >> ARM64 platforms.
> >>
> >> As such, we can in fact go ahead with reverting this patch, and U-Boot will
> >> still function on Tegra210 boards.
> >>
> >> In the short term, I think that means TomR should just apply this revert
> >> patch, and we don't need to send any additional patches. In the slightly
> >> longer term, we should add some comments to fdtdec_get_addr_size()
> >> describing its problems, and slightly longer term, add back Thierry's patch,
> >> but in a way that lets callers specify whether #address-cells/#size-cells
> >> should be used, or whether caller-supplied hard-coded values should be used.
> >
> > OK great. But I see your new patch so I think we can apply both at the
> > same time.
> >
> >>
> >> I apologize for not noticing this earlier; I'd assumed that since none of
> >> the Tegra-specific files in include/configs/ set this flag, nor any of the
> >> Tegra-specific Kconfig files, it wasn't set, and hence a revert of the patch
> >> would break Tegra210 support.
> >
> > No problem - I assumed it would also.
> >
> >>
> >>> - then fdtdec_get_addr_size() would work as expected
> >>
> >>
> >> I take issue with *works* as expected", since I would expect the function to
> >> implement DT parsing rules completely.
> >>
> >> However, it's certainly true to say that it will generate the desired
> >> results, even if it doesn't /work/ (isn't implemented) as expected.
> >>
> >> (I suppose this depends on whether you're talking about "works" w.r.t to
> >> correctness of the returned results or side-effects, or w.r.t. how the
> >> internal implementation works.)
> >>
> >>> - I want to make this cases as efficient as possible since it will be
> >>> called in SPL
> >>> - You are concerned that making assumptions like this violates the DT spec
> >>>
> >>> One option is to split the functions into two - one that works in SPL
> >>> and makes assumptions, and one that does not and does things the hard
> >>> way.
> >>
> >>
> >> Why should SPL be any different? U-Boot should either parse DT correctly or
> >> not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra does
> >> very little (and isn't even present on Tegra210), but in general, can't
> >> someone use storage drivers, filesystems, etc. in SPL to choose the next
> >> stage to load, read GPIOs or other data sources to select between different
> >> boot paths, perhaps even interact with a user? If so, then assuming that SPL
> >> can somehow implements a reduced set of features, and hence can make
> >> assumptions that non-SPL can't, seems quite dangerous. We could only do that
> >> if we put some active checks in the U-Boot makefiles to ensure that nobody
> >> enabled anything in the SPL besides a set of strictly audited features.
> >
> > Yes we could do that and it would avoid pain later. I suppose SPL on
> > Tegra is a bit of a special case since there is really no size limit.
> > For some chips the limits are pretty severe so I am quite sensitive to
> > code size even at the expense of extra debug time when something
> > breaks.
> >
> >>
> >>> I suppose we could also add checks/warnings that the DT is
> >>>
> >>> 'well-formed' and that the address size matches the machine it is
> >>> running on.
> >>
> >>
> >> Yes, we certainly should do that.
> >>
> >>> In any case we do need to get rid of the parent lookup in this
> >>> function. So can either you or Thierry submit a patch to do this? The
> >>> parent should instead be a parameter to the function. You may need to
> >>> create a stub function which looks it up before calling
> >>> fdtdec_get_addr_size().
> >>
> >>
> >> Yes, we can't remove the parent lookup in all cases. However, we can avoid
> >> it in the cases where the caller can supply it. I think that's most, but not
> >> quite all.
> >
> > My main concern is dev_get_addr() since that is the official way of
> > reading a device address now. See also regmap_init_mem() which does
> > things its own way.
> >
> >>
> >>
> >>> I'll see how things look in SPL.
> >>>
> >>> Regards,
> >>> Simon
> >>>
> >>
> >
> 
> Do we have any conclusion about commit 5b34436? Today I started to
> check the pre-relocatoin DM PCI UART issue, but found it is now broken
> due to this commit. The broken part is at
> ns16550_serial_ofdata_to_platdata() in drivers/serial/ns16550.c, in
> which it has:
> 
>     addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
> #ifdef CONFIG_PCI
>     if (addr == FDT_ADDR_T_NONE) {
> ...
> 
> Before commit 5b34436, the old behavior is that the call to
> fdtdec_get_addr() returns FDT_ADDR_T_NONE so that we can trap into the
> PCI logic. But with commit 5b34436, addr is now zero which just bypass
> this logic.
> 
> As for why addr is now zero, this is because fdtdec_get_number() can
> only handle a 64-bit number at most. However for PCI reg, it has 3
> cells. So if I have the following encoding:
> 
> reg = <0x00025100 0x0 0x0 0x0 0x0
>            0x01025110 0x0 0x0 0x0 0x0>;
> 
> The addr will be assigned to zero after two rounds of left shift by 32-bit.
> 
> I can certainly change ns16550 driver to test the return value against
> 0 now, but I think this fdtdec_get_addr() does not cover all cases.
> Please advise.

This type of simple address parsing breaks down in the face of PCI. PCI
requires 3 address cells because it has different types of address
spaces. fdtdec_get_address() actually does the right thing here. It
returns the "address" associated with the entry. The address is the
64-bit value you get by concatenating cells 0 and 1. Cell 2 contains
additional meta data such as the PCI address and the type of memory that
you are dealing with (configuration space, I/O, memory-mapped).

I'd suggest that we add code to properly check whether this is a PCI
device.

Also, why isn't this code obtaining the memory address from the base
address registers?

Thierry
Bin Meng Aug. 14, 2015, 8:44 a.m. UTC | #17
Hi Thierry,

On Fri, Aug 14, 2015 at 4:32 PM, Thierry Reding <treding@nvidia.com> wrote:
> On Fri, Aug 14, 2015 at 04:10:32PM +0800, Bin Meng wrote:
>> Hi,
>>
>> On Sun, Aug 9, 2015 at 11:08 PM, Simon Glass <sjg@chromium.org> wrote:
>> > Hi Stephen,
>> >
>> > On 6 August 2015 at 13:03, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> >> On 08/05/2015 05:45 PM, Simon Glass wrote:
>> >>>
>> >>> Hi Stephen,
>> >>>
>> >>> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> >>>>
>> >>>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>> >>>>>
>> >>>>>
>> >>>>> Hi Stephen,
>> >>>>>
>> >>>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> >>>>>>
>> >>>>>>
>> >>>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Hi Stephen,
>> >>>>>>>
>> >>>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org>
>> >>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>> >>>>>>>>>
>> >>>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
>> >>>>>>>>> as
>> >>>>>>>>> mentioned in code review is very slow.
>> >>>>>>>>>
>> >>>>>>>>> https://patchwork.ozlabs.org/patch/499482/
>> >>>>>>>>> https://patchwork.ozlabs.org/patch/452604/
>> >>>>>>>>>
>> >>>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>> >>>>>>>>> noticed
>> >>>>>>>>> that this was applied. I can send a patch to tidy that up, but in
>> >>>>>>>>> any
>> >>>>>>>>> case
>> >>>>>>>>> I think we should consider a revert until the function is better
>> >>>>>>>>> implemented.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> The fact that the function needs to perform a slow operation is not a
>> >>>>>>>> good
>> >>>>>>>> reason for a revert. The slowness of the operation is just a matter
>> >>>>>>>> of
>> >>>>>>>> fact
>> >>>>>>>> with DT not having uplinks in its data structure, and U-Boot using
>> >>>>>>>> those
>> >>>>>>>> data structures directly.
>> >>>>>>>>
>> >>>>>>>> You'd requested during review that I additionally implement a faster
>> >>>>>>>> version
>> >>>>>>>> of the function in the case where the parent node is already known,
>> >>>>>>>> and
>> >>>>>>>> said
>> >>>>>>>> it was fine if that happened in a later patch. I have this on my TODO
>> >>>>>>>> list,
>> >>>>>>>> but it's only been a couple of days.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> I didn't expect this to go to mainline before your new patch.
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> I didn't get that message from the thread; you wrote:
>> >>>>>>
>> >>>>>> Simon Glass wrote:
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Stephen Warren wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Simon Glass wrote:
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Also, how about (in addition) a
>> >>>>>>>>> version of this function that works for devices? Like:
>> >>>>>>>>>
>> >>>>>>>>> device_get_addr_size(struct udevice *dev, ...)
>> >>>>>>>>>
>> >>>>>>>>> so that it can handle this for you.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> That sounds like a separate patch?
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Yes, but I think we should get it in there so that people don't start
>> >>>>>>> using this (wildly inefficient) function when they don't need to. I
>> >>>>>>> think by passing in the parent node we force people to think about the
>> >>>>>>> cost.
>> >>>>>>>
>> >>>>>>> Yes the driver model function can be a separate patch, but let's get
>> >>>>>>> it in at about the same time. We have dev_get_addr() so could have
>> >>>>>>> dev_get_addr_size().
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> That sounds to like you'd like a followup patch soon after this patch
>> >>>>>> (my
>> >>>>>> assumption would be within a few days or weeks) to solve your comments,
>> >>>>>> not
>> >>>>>> that you wanted a replacement patch.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> I will take that feedback to be a little more forceful in future, sorry.
>> >>>>>
>> >>>>>>
>> >>>>>>>> Why not just fix the bug since you said you could? That seems far
>> >>>>>>>> better
>> >>>>>>>> than breaking the newly added Tegra210 support.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
>> >>>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
>> >>>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
>> >>>>>>> misunderstand the problem the patch was trying to fix. As I said it
>> >>>>>>> did not have a commit message, so who knows :-)
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The
>> >>>>>> only
>> >>>>>> thing that matters is #address-cells/#size-cells. Those properties are
>> >>>>>> what
>> >>>>>> tell the parsing code how many bytes to read from the reg property.
>> >>>>>> Whether
>> >>>>>> the resultant value fits into the code's internal representation is an
>> >>>>>> internal detail of the code, not part of the semantics of DT itself or
>> >>>>>> how
>> >>>>>> to parse it.
>> >>>>>>
>> >>>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>> >>>>>> quite likely that everything will just happen to work most of the time.
>> >>>>>> However, this is purely an accident and not something that anything
>> >>>>>> should
>> >>>>>> rely upon.
>> >>>>>>
>> >>>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which
>> >>>>>> is
>> >>>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
>> >>>>>> practice
>> >>>>>> is perfectly /legal/ and will work out just fine since, except perhaps
>> >>>>>> for
>> >>>>>> RAM sizes, I don't believe any value in DT will actually require more
>> >>>>>> than
>> >>>>>> 32-bits to represent)
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> I would like to have the types match the platform where possible. At
>> >>>>> least the types should not be smaller than the platform - e.g. if
>> >>>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
>> >>>>
>> >>>>
>> >>>>
>> >>>> In general, there's no "should not" here; we "cannot". If there's a
>> >>>> 64-bit
>> >>>> value in the DT (with bits above bit 31 set), then it can't be stored in
>> >>>> a
>> >>>> 32-bit variable.
>> >>>>
>> >>>> That said, if a DT has #address-cells=<2>, but the actual values stored
>> >>>> in
>> >>>> all reg values have 0 for the MS word, that'll actually work just fine.
>> >>>> Note
>> >>>> that it's 100% legal to set #address-cells=<100> and just write a bunch
>> >>>> of
>> >>>> extra zeros into the property. Silly perhaps, but perfectly legal. Since
>> >>>> the
>> >>>> function should adapt to whatever #address-cells value is in the DT,
>> >>>> supporting that case isn't any more work, so there's no reason we
>> >>>> shouldn't.
>> >>>>
>> >>>>> address/size in the device tree. This is for efficiency. We don't want
>> >>>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
>> >>>>> things for no benefit.
>> >>>>
>> >>>>
>> >>>>
>> >>>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
>> >>>> that's unrelated to using the correct algorithm to parse DT.
>> >>>>
>> >>>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>> >>>>>>>> mention this at all.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> It calls that function expecting it to pick up an address and size
>> >>>>>>> from two consecutive cells. With this patch, that fails (unless the
>> >>>>>>> property happens to be "reg").
>> >>>>
>> >>>>
>> >>>> ...
>> >>>>>>
>> >>>>>>
>> >>>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>> >>>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>> >>>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
>> >>>>>> node,
>> >>>>>> the code and DT content are clearly inconsistent; For this node
>> >>>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
>> >>>>>> address
>> >>>>>> is a chip-select and hence has no size. So, the code should not assume
>> >>>>>> that
>> >>>>>> the memory-map property can be parsed in the same way as a reg
>> >>>>>> property.
>> >>>>>>
>> >>>>>> I note that the memory-map property doesn't exist in the Linux kernel's
>> >>>>>> DT
>> >>>>>> binding documentation database, or code, hence hasn't been reviewed by
>> >>>>>> the
>> >>>>>> DT binding maintainers.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> The function comment says:
>> >>>>>
>> >>>>>    * Look up an address property in a node and return it as an address.
>> >>>>>    * The property must hold one address with a length. This is only
>> >>>>> tested
>> >>>>>    * on 32-bit machines.
>> >>>>
>> >>>>
>> >>>>
>> >>>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that
>> >>>> part
>> >>>> of the function comment should have been removed in the commit.
>> >>>>
>> >>>>> My intention was that this would be an efficient way to decode an
>> >>>>> address and size from a device tree. To some extent regmap may take
>> >>>>> over this role (IMO we should turn to drop fdtdec one day one we have
>> >>>>> more infrastructure). But I'd like it to work efficiently for 32-bit
>> >>>>> machines. The new function could hardly be less efficient.
>> >>>>
>> >>>>
>> >>>>
>> >>>> Again, there's no way in general to make it more efficient. The
>> >>>> efficiency
>> >>>> issue is directly implied by the DT data structures.
>> >>>>
>> >>>> In the special case where the parent node is already known, we can
>> >>>> certainly
>> >>>> introduce an alternate function that is more efficient. You've already
>> >>>> asked
>> >>>> for that, and as I said, it's in my TODO list.
>> >>>>
>> >>>>> I think we should consider the case where the physical address size of
>> >>>>> U-Boot and the device tree do not match as a corner case. I certainly
>> >>>>> don't want device tree to add loads of pointless code for 'normal'
>> >>>>> platforms.
>> >>>>
>> >>>>
>> >>>>
>> >>>> It's not a corner case. It's a fundamental part of the DT schema. If
>> >>>> U-Boot
>> >>>> is going to use DT, it should actually use *DT*, not some-very
>> >>>> similar-but-not-quite-DT format with all kinds of implicit and unstated
>> >>>> exceptions, limitations, and assumptions. This implies fully honoring all
>> >>>> aspects of how DT works when parsing it, not picking and choosing
>> >>>> features
>> >>>> because some are inconvenient or annoying. If U-Boot doesn't want to
>> >>>> correctly implement DT support, it should just drop it completely.
>> >>>>
>> >>>> As an aside, when instantiating devices, I hope one day that U-Boot will
>> >>>> parse DT top-down in a hierarchical way, rather than simply searching for
>> >>>> any node anywhere in the DT without regard for whether any parent node is
>> >>>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
>> >>>> this right now, and is only getting away with this accidentally. Without
>> >>>> hacky workarounds in drivers, this won't continue to work for all Tegra
>> >>>> HW
>> >>>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
>> >>>> sub-modules), since parent drivers must initialize before child drivers
>> >>>> in
>> >>>> order to enable various register buses, including clocks/resets affecting
>> >>>> those buses etc.
>> >>>>
>> >>>> Again, if it's simply too inconvenient or bloated to implement DT
>> >>>> properly
>> >>>> in U-Boot, let's just drop it entirely. A halfway solution is the worst
>> >>>> of
>> >>>> both worlds. I'm not convinced the full implications of how to (and the
>> >>>> need
>> >>>> to) correctly and fully support DT have were well thought through before
>> >>>> (control) DT support was added to U-Boot.
>> >>>>
>> >>>>> Re memory-map, yes it doesn't seem to be possible to do what it is
>> >>>>> trying to do (and Thierry says the same below). It is quite weird to
>> >>>>> have a SPI peripheral which is also memory mapped.
>> >>>>>
>> >>>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
>> >>>>> 64-bit Tegra, what is actually wrong with the way the function was?
>> >>>>
>> >>>>
>> >>>>
>> >>>> With the DT files now checked into U-Boot, I think it would accidentally
>> >>>> work, since we just happen to have set #address-cells=<2>,
>> >>>> #size-cells=<2>,
>> >>>> and that would just happen to match sizeof(fdt_addr_t).
>> >>>>
>> >>>> However note this is an accident on a couple of levels:
>> >>>>
>> >>>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
>> >>>> #address-cells * 4. This is an invalid assumption since it does not
>> >>>> correctly honor the DT schema. It hard-codes the size of values whereas
>> >>>> DT
>> >>>> schema says the size is defined by the #xxx-cells properties.
>> >>>>
>> >>>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
>> >>>> #size-cells=<1>. I asked him to change that to match what I expected to
>> >>>> be
>> >>>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request
>> >>>> or
>> >>>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
>> >>>> fdtdec_get_addr_size() would have attempted to read twice as many bytes
>> >>>> as
>> >>>> it should have from the property. It's entirely plausible that someone
>> >>>> could
>> >>>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly
>> >>>> and
>> >>>> enabled it.
>> >>>>
>> >>>>> This is a boot loader so we should be willing to make some
>> >>>>> simplifications.
>> >>>>
>> >>>>
>> >>>>
>> >>>> When dealing with internal bootloader details, sure assumptions,
>> >>>> simplifications, etc. can be made.
>> >>>>
>> >>>> However, DT is an externally defined standard. The content of DT must be
>> >>>> identical across all OSs (SW stacks, bootloader) and not influenced by
>> >>>> requirements/... of any specific individual OS's (SW stack, bootloader)
>> >>>> quirks. We can't just pick and choose which parts of it we care about.
>> >>>> Well,
>> >>>> perhaps if we stop calling it DT we could.
>> >>>
>> >>>
>> >>> So I think in summary:
>> >>>
>> >>> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
>> >>
>> >>
>> >> It turns out that arch/arm/include/asm/config.h already enables this for all
>> >> ARM64 platforms.
>> >>
>> >> As such, we can in fact go ahead with reverting this patch, and U-Boot will
>> >> still function on Tegra210 boards.
>> >>
>> >> In the short term, I think that means TomR should just apply this revert
>> >> patch, and we don't need to send any additional patches. In the slightly
>> >> longer term, we should add some comments to fdtdec_get_addr_size()
>> >> describing its problems, and slightly longer term, add back Thierry's patch,
>> >> but in a way that lets callers specify whether #address-cells/#size-cells
>> >> should be used, or whether caller-supplied hard-coded values should be used.
>> >
>> > OK great. But I see your new patch so I think we can apply both at the
>> > same time.
>> >
>> >>
>> >> I apologize for not noticing this earlier; I'd assumed that since none of
>> >> the Tegra-specific files in include/configs/ set this flag, nor any of the
>> >> Tegra-specific Kconfig files, it wasn't set, and hence a revert of the patch
>> >> would break Tegra210 support.
>> >
>> > No problem - I assumed it would also.
>> >
>> >>
>> >>> - then fdtdec_get_addr_size() would work as expected
>> >>
>> >>
>> >> I take issue with *works* as expected", since I would expect the function to
>> >> implement DT parsing rules completely.
>> >>
>> >> However, it's certainly true to say that it will generate the desired
>> >> results, even if it doesn't /work/ (isn't implemented) as expected.
>> >>
>> >> (I suppose this depends on whether you're talking about "works" w.r.t to
>> >> correctness of the returned results or side-effects, or w.r.t. how the
>> >> internal implementation works.)
>> >>
>> >>> - I want to make this cases as efficient as possible since it will be
>> >>> called in SPL
>> >>> - You are concerned that making assumptions like this violates the DT spec
>> >>>
>> >>> One option is to split the functions into two - one that works in SPL
>> >>> and makes assumptions, and one that does not and does things the hard
>> >>> way.
>> >>
>> >>
>> >> Why should SPL be any different? U-Boot should either parse DT correctly or
>> >> not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra does
>> >> very little (and isn't even present on Tegra210), but in general, can't
>> >> someone use storage drivers, filesystems, etc. in SPL to choose the next
>> >> stage to load, read GPIOs or other data sources to select between different
>> >> boot paths, perhaps even interact with a user? If so, then assuming that SPL
>> >> can somehow implements a reduced set of features, and hence can make
>> >> assumptions that non-SPL can't, seems quite dangerous. We could only do that
>> >> if we put some active checks in the U-Boot makefiles to ensure that nobody
>> >> enabled anything in the SPL besides a set of strictly audited features.
>> >
>> > Yes we could do that and it would avoid pain later. I suppose SPL on
>> > Tegra is a bit of a special case since there is really no size limit.
>> > For some chips the limits are pretty severe so I am quite sensitive to
>> > code size even at the expense of extra debug time when something
>> > breaks.
>> >
>> >>
>> >>> I suppose we could also add checks/warnings that the DT is
>> >>>
>> >>> 'well-formed' and that the address size matches the machine it is
>> >>> running on.
>> >>
>> >>
>> >> Yes, we certainly should do that.
>> >>
>> >>> In any case we do need to get rid of the parent lookup in this
>> >>> function. So can either you or Thierry submit a patch to do this? The
>> >>> parent should instead be a parameter to the function. You may need to
>> >>> create a stub function which looks it up before calling
>> >>> fdtdec_get_addr_size().
>> >>
>> >>
>> >> Yes, we can't remove the parent lookup in all cases. However, we can avoid
>> >> it in the cases where the caller can supply it. I think that's most, but not
>> >> quite all.
>> >
>> > My main concern is dev_get_addr() since that is the official way of
>> > reading a device address now. See also regmap_init_mem() which does
>> > things its own way.
>> >
>> >>
>> >>
>> >>> I'll see how things look in SPL.
>> >>>
>> >>> Regards,
>> >>> Simon
>> >>>
>> >>
>> >
>>
>> Do we have any conclusion about commit 5b34436? Today I started to
>> check the pre-relocatoin DM PCI UART issue, but found it is now broken
>> due to this commit. The broken part is at
>> ns16550_serial_ofdata_to_platdata() in drivers/serial/ns16550.c, in
>> which it has:
>>
>>     addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
>> #ifdef CONFIG_PCI
>>     if (addr == FDT_ADDR_T_NONE) {
>> ...
>>
>> Before commit 5b34436, the old behavior is that the call to
>> fdtdec_get_addr() returns FDT_ADDR_T_NONE so that we can trap into the
>> PCI logic. But with commit 5b34436, addr is now zero which just bypass
>> this logic.
>>
>> As for why addr is now zero, this is because fdtdec_get_number() can
>> only handle a 64-bit number at most. However for PCI reg, it has 3
>> cells. So if I have the following encoding:
>>
>> reg = <0x00025100 0x0 0x0 0x0 0x0
>>            0x01025110 0x0 0x0 0x0 0x0>;
>>
>> The addr will be assigned to zero after two rounds of left shift by 32-bit.
>>
>> I can certainly change ns16550 driver to test the return value against
>> 0 now, but I think this fdtdec_get_addr() does not cover all cases.
>> Please advise.
>
> This type of simple address parsing breaks down in the face of PCI. PCI
> requires 3 address cells because it has different types of address
> spaces. fdtdec_get_address() actually does the right thing here. It
> returns the "address" associated with the entry. The address is the
> 64-bit value you get by concatenating cells 0 and 1. Cell 2 contains
> additional meta data such as the PCI address and the type of memory that
> you are dealing with (configuration space, I/O, memory-mapped).

I think you wanted to say fdtdec_get_address() returns cell 1 and 2
and cell 0 contains the meta data.

>
> I'd suggest that we add code to properly check whether this is a PCI
> device.

fdtdec_get_number() will only get the last 2 cells into the returned
addr. If we have more than 2 cells (in the PCI case), we end up get
cell 1 and 2 which is zero. And if we have 4 cells, we will get cell 2
and 3.

>
> Also, why isn't this code obtaining the memory address from the base
> address registers?
>

I think we discussed this before. For simplification, we use device
tree to pass the BAR number to the driver, otherwise we will end up
dealing with a large code block of PCI vendor ID and device ID
switch/case to determine which BAR we should read.

To fix this, I think I can create a dedicated PCI UART driver and move
the PCI logic codes from ns16550_serial_ofdata_to_platdata() to the
new driver.

Regards,
Bin
Michal Suchanek Aug. 14, 2015, 9:01 a.m. UTC | #18
On 14 August 2015 at 10:10, Bin Meng <bmeng.cn@gmail.com> wrote:

> Do we have any conclusion about commit 5b34436? Today I started to
> check the pre-relocatoin DM PCI UART issue, but found it is now broken
> due to this commit. The broken part is at
> ns16550_serial_ofdata_to_platdata() in drivers/serial/ns16550.c, in
> which it has:
>
>     addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
> #ifdef CONFIG_PCI
>     if (addr == FDT_ADDR_T_NONE) {
> ...
>
> Before commit 5b34436, the old behavior is that the call to
> fdtdec_get_addr() returns FDT_ADDR_T_NONE so that we can trap into the
> PCI logic. But with commit 5b34436, addr is now zero which just bypass
> this logic.
>
> As for why addr is now zero, this is because fdtdec_get_number() can
> only handle a 64-bit number at most. However for PCI reg, it has 3
> cells. So if I have the following encoding:
>
> reg = <0x00025100 0x0 0x0 0x0 0x0
>            0x01025110 0x0 0x0 0x0 0x0>;
>
> The addr will be assigned to zero after two rounds of left shift by 32-bit.
>
> I can certainly change ns16550 driver to test the return value against
> 0 now, but I think this fdtdec_get_addr() does not cover all cases.
> Please advise.
>

Hello,

What do you expect the fdtdec_get_addr to do?

Any 32bit (or 64bit number on 64bit archs) is valid return value so
there is no possibility to return an error. This is probably a problem
with the interface. If there is more than can fit or there is an error
you will never know.

eg. Linux has this code for decoding numbers which can have 1-2 cells:

                reg = of_get_property(pp, "reg", &len);
                if (!reg) {
blah
                }

                a_cells = of_n_addr_cells(pp); /* scan parents for
#address-cells */
                s_cells = of_n_size_cells(pp);
                (*pparts)[i].offset = of_read_number(reg, a_cells);
                (*pparts)[i].size = of_read_number(reg + a_cells, s_cells);


If you read an address that can be 3 cells then you need a function
that returns cell array rather than a single integer. Or you can check
the address length and decide if you can fit that into an integer on
your platform.

Thanks

Michal
Bin Meng Aug. 14, 2015, 9:08 a.m. UTC | #19
Hi Michal,

On Fri, Aug 14, 2015 at 5:01 PM, Michal Suchanek <hramrach@gmail.com> wrote:
> On 14 August 2015 at 10:10, Bin Meng <bmeng.cn@gmail.com> wrote:
>
>> Do we have any conclusion about commit 5b34436? Today I started to
>> check the pre-relocatoin DM PCI UART issue, but found it is now broken
>> due to this commit. The broken part is at
>> ns16550_serial_ofdata_to_platdata() in drivers/serial/ns16550.c, in
>> which it has:
>>
>>     addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
>> #ifdef CONFIG_PCI
>>     if (addr == FDT_ADDR_T_NONE) {
>> ...
>>
>> Before commit 5b34436, the old behavior is that the call to
>> fdtdec_get_addr() returns FDT_ADDR_T_NONE so that we can trap into the
>> PCI logic. But with commit 5b34436, addr is now zero which just bypass
>> this logic.
>>
>> As for why addr is now zero, this is because fdtdec_get_number() can
>> only handle a 64-bit number at most. However for PCI reg, it has 3
>> cells. So if I have the following encoding:
>>
>> reg = <0x00025100 0x0 0x0 0x0 0x0
>>            0x01025110 0x0 0x0 0x0 0x0>;
>>
>> The addr will be assigned to zero after two rounds of left shift by 32-bit.
>>
>> I can certainly change ns16550 driver to test the return value against
>> 0 now, but I think this fdtdec_get_addr() does not cover all cases.
>> Please advise.
>>
>
> Hello,
>
> What do you expect the fdtdec_get_addr to do?
>
> Any 32bit (or 64bit number on 64bit archs) is valid return value so
> there is no possibility to return an error. This is probably a problem
> with the interface. If there is more than can fit or there is an error
> you will never know.

Agreed.

>
> eg. Linux has this code for decoding numbers which can have 1-2 cells:
>
>                 reg = of_get_property(pp, "reg", &len);
>                 if (!reg) {
> blah
>                 }
>
>                 a_cells = of_n_addr_cells(pp); /* scan parents for
> #address-cells */
>                 s_cells = of_n_size_cells(pp);
>                 (*pparts)[i].offset = of_read_number(reg, a_cells);
>                 (*pparts)[i].size = of_read_number(reg + a_cells, s_cells);
>
>
> If you read an address that can be 3 cells then you need a function
> that returns cell array rather than a single integer. Or you can check
> the address length and decide if you can fit that into an integer on
> your platform.
>

Yep, we have an API fdtdec_get_pci_addr() to parse a PCI reg. I was
raising this because I see this broke the existing codes and there was
discussion on reverting this commit.

Regards,
Bin
Simon Glass Aug. 14, 2015, 10:04 a.m. UTC | #20
Hi,

On 14 August 2015 at 03:08, Bin Meng <bmeng.cn@gmail.com> wrote:
> Hi Michal,
>
> On Fri, Aug 14, 2015 at 5:01 PM, Michal Suchanek <hramrach@gmail.com> wrote:
>> On 14 August 2015 at 10:10, Bin Meng <bmeng.cn@gmail.com> wrote:
>>
>>> Do we have any conclusion about commit 5b34436? Today I started to
>>> check the pre-relocatoin DM PCI UART issue, but found it is now broken
>>> due to this commit. The broken part is at
>>> ns16550_serial_ofdata_to_platdata() in drivers/serial/ns16550.c, in
>>> which it has:
>>>
>>>     addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
>>> #ifdef CONFIG_PCI
>>>     if (addr == FDT_ADDR_T_NONE) {
>>> ...
>>>
>>> Before commit 5b34436, the old behavior is that the call to
>>> fdtdec_get_addr() returns FDT_ADDR_T_NONE so that we can trap into the
>>> PCI logic. But with commit 5b34436, addr is now zero which just bypass
>>> this logic.
>>>
>>> As for why addr is now zero, this is because fdtdec_get_number() can
>>> only handle a 64-bit number at most. However for PCI reg, it has 3
>>> cells. So if I have the following encoding:
>>>
>>> reg = <0x00025100 0x0 0x0 0x0 0x0
>>>            0x01025110 0x0 0x0 0x0 0x0>;
>>>
>>> The addr will be assigned to zero after two rounds of left shift by 32-bit.
>>>
>>> I can certainly change ns16550 driver to test the return value against
>>> 0 now, but I think this fdtdec_get_addr() does not cover all cases.
>>> Please advise.
>>>
>>
>> Hello,
>>
>> What do you expect the fdtdec_get_addr to do?
>>
>> Any 32bit (or 64bit number on 64bit archs) is valid return value so
>> there is no possibility to return an error. This is probably a problem
>> with the interface. If there is more than can fit or there is an error
>> you will never know.
>
> Agreed.
>
>>
>> eg. Linux has this code for decoding numbers which can have 1-2 cells:
>>
>>                 reg = of_get_property(pp, "reg", &len);
>>                 if (!reg) {
>> blah
>>                 }
>>
>>                 a_cells = of_n_addr_cells(pp); /* scan parents for
>> #address-cells */
>>                 s_cells = of_n_size_cells(pp);
>>                 (*pparts)[i].offset = of_read_number(reg, a_cells);
>>                 (*pparts)[i].size = of_read_number(reg + a_cells, s_cells);
>>
>>
>> If you read an address that can be 3 cells then you need a function
>> that returns cell array rather than a single integer. Or you can check
>> the address length and decide if you can fit that into an integer on
>> your platform.
>>
>
> Yep, we have an API fdtdec_get_pci_addr() to parse a PCI reg. I was
> raising this because I see this broke the existing codes and there was
> discussion on reverting this commit.

I plan to apply this revert to u-boot-x86 (where SPI is currently
broken) and (once it has a bit more testing) also this patch which I
think makes the change in a safer way:

https://patchwork.ozlabs.org/patch/504918/

At present that patch breaks at least one x86 board and I have not dug
into it yet.

The revert should not break tegra, according to Stephen.

Regards,
Simon
Thierry Reding Aug. 14, 2015, 2:06 p.m. UTC | #21
On Fri, Aug 14, 2015 at 04:44:28PM +0800, Bin Meng wrote:
> Hi Thierry,
> 
> On Fri, Aug 14, 2015 at 4:32 PM, Thierry Reding <treding@nvidia.com> wrote:
> > On Fri, Aug 14, 2015 at 04:10:32PM +0800, Bin Meng wrote:
> >> Hi,
> >>
> >> On Sun, Aug 9, 2015 at 11:08 PM, Simon Glass <sjg@chromium.org> wrote:
> >> > Hi Stephen,
> >> >
> >> > On 6 August 2015 at 13:03, Stephen Warren <swarren@wwwdotorg.org> wrote:
> >> >> On 08/05/2015 05:45 PM, Simon Glass wrote:
> >> >>>
> >> >>> Hi Stephen,
> >> >>>
> >> >>> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
> >> >>>>
> >> >>>> On 08/04/2015 10:08 PM, Simon Glass wrote:
> >> >>>>>
> >> >>>>>
> >> >>>>> Hi Stephen,
> >> >>>>>
> >> >>>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> Hi Stephen,
> >> >>>>>>>
> >> >>>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org>
> >> >>>>>>> wrote:
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
> >> >>>>>>>>>
> >> >>>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
> >> >>>>>>>>> as
> >> >>>>>>>>> mentioned in code review is very slow.
> >> >>>>>>>>>
> >> >>>>>>>>> https://patchwork.ozlabs.org/patch/499482/
> >> >>>>>>>>> https://patchwork.ozlabs.org/patch/452604/
> >> >>>>>>>>>
> >> >>>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
> >> >>>>>>>>> noticed
> >> >>>>>>>>> that this was applied. I can send a patch to tidy that up, but in
> >> >>>>>>>>> any
> >> >>>>>>>>> case
> >> >>>>>>>>> I think we should consider a revert until the function is better
> >> >>>>>>>>> implemented.
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> The fact that the function needs to perform a slow operation is not a
> >> >>>>>>>> good
> >> >>>>>>>> reason for a revert. The slowness of the operation is just a matter
> >> >>>>>>>> of
> >> >>>>>>>> fact
> >> >>>>>>>> with DT not having uplinks in its data structure, and U-Boot using
> >> >>>>>>>> those
> >> >>>>>>>> data structures directly.
> >> >>>>>>>>
> >> >>>>>>>> You'd requested during review that I additionally implement a faster
> >> >>>>>>>> version
> >> >>>>>>>> of the function in the case where the parent node is already known,
> >> >>>>>>>> and
> >> >>>>>>>> said
> >> >>>>>>>> it was fine if that happened in a later patch. I have this on my TODO
> >> >>>>>>>> list,
> >> >>>>>>>> but it's only been a couple of days.
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> I didn't expect this to go to mainline before your new patch.
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> I didn't get that message from the thread; you wrote:
> >> >>>>>>
> >> >>>>>> Simon Glass wrote:
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> Stephen Warren wrote:
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> Simon Glass wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> Also, how about (in addition) a
> >> >>>>>>>>> version of this function that works for devices? Like:
> >> >>>>>>>>>
> >> >>>>>>>>> device_get_addr_size(struct udevice *dev, ...)
> >> >>>>>>>>>
> >> >>>>>>>>> so that it can handle this for you.
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> That sounds like a separate patch?
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> Yes, but I think we should get it in there so that people don't start
> >> >>>>>>> using this (wildly inefficient) function when they don't need to. I
> >> >>>>>>> think by passing in the parent node we force people to think about the
> >> >>>>>>> cost.
> >> >>>>>>>
> >> >>>>>>> Yes the driver model function can be a separate patch, but let's get
> >> >>>>>>> it in at about the same time. We have dev_get_addr() so could have
> >> >>>>>>> dev_get_addr_size().
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> That sounds to like you'd like a followup patch soon after this patch
> >> >>>>>> (my
> >> >>>>>> assumption would be within a few days or weeks) to solve your comments,
> >> >>>>>> not
> >> >>>>>> that you wanted a replacement patch.
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> I will take that feedback to be a little more forceful in future, sorry.
> >> >>>>>
> >> >>>>>>
> >> >>>>>>>> Why not just fix the bug since you said you could? That seems far
> >> >>>>>>>> better
> >> >>>>>>>> than breaking the newly added Tegra210 support.
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
> >> >>>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
> >> >>>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
> >> >>>>>>> misunderstand the problem the patch was trying to fix. As I said it
> >> >>>>>>> did not have a commit message, so who knows :-)
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The
> >> >>>>>> only
> >> >>>>>> thing that matters is #address-cells/#size-cells. Those properties are
> >> >>>>>> what
> >> >>>>>> tell the parsing code how many bytes to read from the reg property.
> >> >>>>>> Whether
> >> >>>>>> the resultant value fits into the code's internal representation is an
> >> >>>>>> internal detail of the code, not part of the semantics of DT itself or
> >> >>>>>> how
> >> >>>>>> to parse it.
> >> >>>>>>
> >> >>>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
> >> >>>>>> quite likely that everything will just happen to work most of the time.
> >> >>>>>> However, this is purely an accident and not something that anything
> >> >>>>>> should
> >> >>>>>> rely upon.
> >> >>>>>>
> >> >>>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which
> >> >>>>>> is
> >> >>>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
> >> >>>>>> practice
> >> >>>>>> is perfectly /legal/ and will work out just fine since, except perhaps
> >> >>>>>> for
> >> >>>>>> RAM sizes, I don't believe any value in DT will actually require more
> >> >>>>>> than
> >> >>>>>> 32-bits to represent)
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> I would like to have the types match the platform where possible. At
> >> >>>>> least the types should not be smaller than the platform - e.g. if
> >> >>>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> In general, there's no "should not" here; we "cannot". If there's a
> >> >>>> 64-bit
> >> >>>> value in the DT (with bits above bit 31 set), then it can't be stored in
> >> >>>> a
> >> >>>> 32-bit variable.
> >> >>>>
> >> >>>> That said, if a DT has #address-cells=<2>, but the actual values stored
> >> >>>> in
> >> >>>> all reg values have 0 for the MS word, that'll actually work just fine.
> >> >>>> Note
> >> >>>> that it's 100% legal to set #address-cells=<100> and just write a bunch
> >> >>>> of
> >> >>>> extra zeros into the property. Silly perhaps, but perfectly legal. Since
> >> >>>> the
> >> >>>> function should adapt to whatever #address-cells value is in the DT,
> >> >>>> supporting that case isn't any more work, so there's no reason we
> >> >>>> shouldn't.
> >> >>>>
> >> >>>>> address/size in the device tree. This is for efficiency. We don't want
> >> >>>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
> >> >>>>> things for no benefit.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
> >> >>>> that's unrelated to using the correct algorithm to parse DT.
> >> >>>>
> >> >>>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
> >> >>>>>>>> mention this at all.
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> It calls that function expecting it to pick up an address and size
> >> >>>>>>> from two consecutive cells. With this patch, that fails (unless the
> >> >>>>>>> property happens to be "reg").
> >> >>>>
> >> >>>>
> >> >>>> ...
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
> >> >>>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
> >> >>>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
> >> >>>>>> node,
> >> >>>>>> the code and DT content are clearly inconsistent; For this node
> >> >>>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
> >> >>>>>> address
> >> >>>>>> is a chip-select and hence has no size. So, the code should not assume
> >> >>>>>> that
> >> >>>>>> the memory-map property can be parsed in the same way as a reg
> >> >>>>>> property.
> >> >>>>>>
> >> >>>>>> I note that the memory-map property doesn't exist in the Linux kernel's
> >> >>>>>> DT
> >> >>>>>> binding documentation database, or code, hence hasn't been reviewed by
> >> >>>>>> the
> >> >>>>>> DT binding maintainers.
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> The function comment says:
> >> >>>>>
> >> >>>>>    * Look up an address property in a node and return it as an address.
> >> >>>>>    * The property must hold one address with a length. This is only
> >> >>>>> tested
> >> >>>>>    * on 32-bit machines.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that
> >> >>>> part
> >> >>>> of the function comment should have been removed in the commit.
> >> >>>>
> >> >>>>> My intention was that this would be an efficient way to decode an
> >> >>>>> address and size from a device tree. To some extent regmap may take
> >> >>>>> over this role (IMO we should turn to drop fdtdec one day one we have
> >> >>>>> more infrastructure). But I'd like it to work efficiently for 32-bit
> >> >>>>> machines. The new function could hardly be less efficient.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> Again, there's no way in general to make it more efficient. The
> >> >>>> efficiency
> >> >>>> issue is directly implied by the DT data structures.
> >> >>>>
> >> >>>> In the special case where the parent node is already known, we can
> >> >>>> certainly
> >> >>>> introduce an alternate function that is more efficient. You've already
> >> >>>> asked
> >> >>>> for that, and as I said, it's in my TODO list.
> >> >>>>
> >> >>>>> I think we should consider the case where the physical address size of
> >> >>>>> U-Boot and the device tree do not match as a corner case. I certainly
> >> >>>>> don't want device tree to add loads of pointless code for 'normal'
> >> >>>>> platforms.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> It's not a corner case. It's a fundamental part of the DT schema. If
> >> >>>> U-Boot
> >> >>>> is going to use DT, it should actually use *DT*, not some-very
> >> >>>> similar-but-not-quite-DT format with all kinds of implicit and unstated
> >> >>>> exceptions, limitations, and assumptions. This implies fully honoring all
> >> >>>> aspects of how DT works when parsing it, not picking and choosing
> >> >>>> features
> >> >>>> because some are inconvenient or annoying. If U-Boot doesn't want to
> >> >>>> correctly implement DT support, it should just drop it completely.
> >> >>>>
> >> >>>> As an aside, when instantiating devices, I hope one day that U-Boot will
> >> >>>> parse DT top-down in a hierarchical way, rather than simply searching for
> >> >>>> any node anywhere in the DT without regard for whether any parent node is
> >> >>>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
> >> >>>> this right now, and is only getting away with this accidentally. Without
> >> >>>> hacky workarounds in drivers, this won't continue to work for all Tegra
> >> >>>> HW
> >> >>>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
> >> >>>> sub-modules), since parent drivers must initialize before child drivers
> >> >>>> in
> >> >>>> order to enable various register buses, including clocks/resets affecting
> >> >>>> those buses etc.
> >> >>>>
> >> >>>> Again, if it's simply too inconvenient or bloated to implement DT
> >> >>>> properly
> >> >>>> in U-Boot, let's just drop it entirely. A halfway solution is the worst
> >> >>>> of
> >> >>>> both worlds. I'm not convinced the full implications of how to (and the
> >> >>>> need
> >> >>>> to) correctly and fully support DT have were well thought through before
> >> >>>> (control) DT support was added to U-Boot.
> >> >>>>
> >> >>>>> Re memory-map, yes it doesn't seem to be possible to do what it is
> >> >>>>> trying to do (and Thierry says the same below). It is quite weird to
> >> >>>>> have a SPI peripheral which is also memory mapped.
> >> >>>>>
> >> >>>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
> >> >>>>> 64-bit Tegra, what is actually wrong with the way the function was?
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> With the DT files now checked into U-Boot, I think it would accidentally
> >> >>>> work, since we just happen to have set #address-cells=<2>,
> >> >>>> #size-cells=<2>,
> >> >>>> and that would just happen to match sizeof(fdt_addr_t).
> >> >>>>
> >> >>>> However note this is an accident on a couple of levels:
> >> >>>>
> >> >>>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
> >> >>>> #address-cells * 4. This is an invalid assumption since it does not
> >> >>>> correctly honor the DT schema. It hard-codes the size of values whereas
> >> >>>> DT
> >> >>>> schema says the size is defined by the #xxx-cells properties.
> >> >>>>
> >> >>>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
> >> >>>> #size-cells=<1>. I asked him to change that to match what I expected to
> >> >>>> be
> >> >>>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request
> >> >>>> or
> >> >>>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
> >> >>>> fdtdec_get_addr_size() would have attempted to read twice as many bytes
> >> >>>> as
> >> >>>> it should have from the property. It's entirely plausible that someone
> >> >>>> could
> >> >>>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly
> >> >>>> and
> >> >>>> enabled it.
> >> >>>>
> >> >>>>> This is a boot loader so we should be willing to make some
> >> >>>>> simplifications.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> When dealing with internal bootloader details, sure assumptions,
> >> >>>> simplifications, etc. can be made.
> >> >>>>
> >> >>>> However, DT is an externally defined standard. The content of DT must be
> >> >>>> identical across all OSs (SW stacks, bootloader) and not influenced by
> >> >>>> requirements/... of any specific individual OS's (SW stack, bootloader)
> >> >>>> quirks. We can't just pick and choose which parts of it we care about.
> >> >>>> Well,
> >> >>>> perhaps if we stop calling it DT we could.
> >> >>>
> >> >>>
> >> >>> So I think in summary:
> >> >>>
> >> >>> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
> >> >>
> >> >>
> >> >> It turns out that arch/arm/include/asm/config.h already enables this for all
> >> >> ARM64 platforms.
> >> >>
> >> >> As such, we can in fact go ahead with reverting this patch, and U-Boot will
> >> >> still function on Tegra210 boards.
> >> >>
> >> >> In the short term, I think that means TomR should just apply this revert
> >> >> patch, and we don't need to send any additional patches. In the slightly
> >> >> longer term, we should add some comments to fdtdec_get_addr_size()
> >> >> describing its problems, and slightly longer term, add back Thierry's patch,
> >> >> but in a way that lets callers specify whether #address-cells/#size-cells
> >> >> should be used, or whether caller-supplied hard-coded values should be used.
> >> >
> >> > OK great. But I see your new patch so I think we can apply both at the
> >> > same time.
> >> >
> >> >>
> >> >> I apologize for not noticing this earlier; I'd assumed that since none of
> >> >> the Tegra-specific files in include/configs/ set this flag, nor any of the
> >> >> Tegra-specific Kconfig files, it wasn't set, and hence a revert of the patch
> >> >> would break Tegra210 support.
> >> >
> >> > No problem - I assumed it would also.
> >> >
> >> >>
> >> >>> - then fdtdec_get_addr_size() would work as expected
> >> >>
> >> >>
> >> >> I take issue with *works* as expected", since I would expect the function to
> >> >> implement DT parsing rules completely.
> >> >>
> >> >> However, it's certainly true to say that it will generate the desired
> >> >> results, even if it doesn't /work/ (isn't implemented) as expected.
> >> >>
> >> >> (I suppose this depends on whether you're talking about "works" w.r.t to
> >> >> correctness of the returned results or side-effects, or w.r.t. how the
> >> >> internal implementation works.)
> >> >>
> >> >>> - I want to make this cases as efficient as possible since it will be
> >> >>> called in SPL
> >> >>> - You are concerned that making assumptions like this violates the DT spec
> >> >>>
> >> >>> One option is to split the functions into two - one that works in SPL
> >> >>> and makes assumptions, and one that does not and does things the hard
> >> >>> way.
> >> >>
> >> >>
> >> >> Why should SPL be any different? U-Boot should either parse DT correctly or
> >> >> not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra does
> >> >> very little (and isn't even present on Tegra210), but in general, can't
> >> >> someone use storage drivers, filesystems, etc. in SPL to choose the next
> >> >> stage to load, read GPIOs or other data sources to select between different
> >> >> boot paths, perhaps even interact with a user? If so, then assuming that SPL
> >> >> can somehow implements a reduced set of features, and hence can make
> >> >> assumptions that non-SPL can't, seems quite dangerous. We could only do that
> >> >> if we put some active checks in the U-Boot makefiles to ensure that nobody
> >> >> enabled anything in the SPL besides a set of strictly audited features.
> >> >
> >> > Yes we could do that and it would avoid pain later. I suppose SPL on
> >> > Tegra is a bit of a special case since there is really no size limit.
> >> > For some chips the limits are pretty severe so I am quite sensitive to
> >> > code size even at the expense of extra debug time when something
> >> > breaks.
> >> >
> >> >>
> >> >>> I suppose we could also add checks/warnings that the DT is
> >> >>>
> >> >>> 'well-formed' and that the address size matches the machine it is
> >> >>> running on.
> >> >>
> >> >>
> >> >> Yes, we certainly should do that.
> >> >>
> >> >>> In any case we do need to get rid of the parent lookup in this
> >> >>> function. So can either you or Thierry submit a patch to do this? The
> >> >>> parent should instead be a parameter to the function. You may need to
> >> >>> create a stub function which looks it up before calling
> >> >>> fdtdec_get_addr_size().
> >> >>
> >> >>
> >> >> Yes, we can't remove the parent lookup in all cases. However, we can avoid
> >> >> it in the cases where the caller can supply it. I think that's most, but not
> >> >> quite all.
> >> >
> >> > My main concern is dev_get_addr() since that is the official way of
> >> > reading a device address now. See also regmap_init_mem() which does
> >> > things its own way.
> >> >
> >> >>
> >> >>
> >> >>> I'll see how things look in SPL.
> >> >>>
> >> >>> Regards,
> >> >>> Simon
> >> >>>
> >> >>
> >> >
> >>
> >> Do we have any conclusion about commit 5b34436? Today I started to
> >> check the pre-relocatoin DM PCI UART issue, but found it is now broken
> >> due to this commit. The broken part is at
> >> ns16550_serial_ofdata_to_platdata() in drivers/serial/ns16550.c, in
> >> which it has:
> >>
> >>     addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
> >> #ifdef CONFIG_PCI
> >>     if (addr == FDT_ADDR_T_NONE) {
> >> ...
> >>
> >> Before commit 5b34436, the old behavior is that the call to
> >> fdtdec_get_addr() returns FDT_ADDR_T_NONE so that we can trap into the
> >> PCI logic. But with commit 5b34436, addr is now zero which just bypass
> >> this logic.
> >>
> >> As for why addr is now zero, this is because fdtdec_get_number() can
> >> only handle a 64-bit number at most. However for PCI reg, it has 3
> >> cells. So if I have the following encoding:
> >>
> >> reg = <0x00025100 0x0 0x0 0x0 0x0
> >>            0x01025110 0x0 0x0 0x0 0x0>;
> >>
> >> The addr will be assigned to zero after two rounds of left shift by 32-bit.
> >>
> >> I can certainly change ns16550 driver to test the return value against
> >> 0 now, but I think this fdtdec_get_addr() does not cover all cases.
> >> Please advise.
> >
> > This type of simple address parsing breaks down in the face of PCI. PCI
> > requires 3 address cells because it has different types of address
> > spaces. fdtdec_get_address() actually does the right thing here. It
> > returns the "address" associated with the entry. The address is the
> > 64-bit value you get by concatenating cells 0 and 1. Cell 2 contains
> > additional meta data such as the PCI address and the type of memory that
> > you are dealing with (configuration space, I/O, memory-mapped).
> 
> I think you wanted to say fdtdec_get_address() returns cell 1 and 2
> and cell 0 contains the meta data.

That depends on how you look at it. But yes, I think we're talking about
the same thing.

> > I'd suggest that we add code to properly check whether this is a PCI
> > device.
> 
> fdtdec_get_number() will only get the last 2 cells into the returned
> addr. If we have more than 2 cells (in the PCI case), we end up get
> cell 1 and 2 which is zero. And if we have 4 cells, we will get cell 2
> and 3.

Yes. All the more reason to special-case here. fdtdec_get_address() here
will inevitably lead to pain for anything that exceeds 64-bit (2 cells).

> > Also, why isn't this code obtaining the memory address from the base
> > address registers?
> >
> 
> I think we discussed this before. For simplification, we use device
> tree to pass the BAR number to the driver, otherwise we will end up
> dealing with a large code block of PCI vendor ID and device ID
> switch/case to determine which BAR we should read.

That's really abusing DT, but like you said, I think we discussed this
before and disagreed the same way before. =)

Thierry
Bin Meng Aug. 14, 2015, 2:29 p.m. UTC | #22
Hi Thierry,

On Fri, Aug 14, 2015 at 10:06 PM, Thierry Reding <treding@nvidia.com> wrote:
> On Fri, Aug 14, 2015 at 04:44:28PM +0800, Bin Meng wrote:
>> Hi Thierry,
>>
>> On Fri, Aug 14, 2015 at 4:32 PM, Thierry Reding <treding@nvidia.com> wrote:
>> > On Fri, Aug 14, 2015 at 04:10:32PM +0800, Bin Meng wrote:
>> >> Hi,
>> >>
>> >> On Sun, Aug 9, 2015 at 11:08 PM, Simon Glass <sjg@chromium.org> wrote:
>> >> > Hi Stephen,
>> >> >
>> >> > On 6 August 2015 at 13:03, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> >> >> On 08/05/2015 05:45 PM, Simon Glass wrote:
>> >> >>>
>> >> >>> Hi Stephen,
>> >> >>>
>> >> >>> On 5 August 2015 at 12:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> >> >>>>
>> >> >>>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> Hi Stephen,
>> >> >>>>>
>> >> >>>>> On 3 August 2015 at 12:20, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> Hi Stephen,
>> >> >>>>>>>
>> >> >>>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren@wwwdotorg.org>
>> >> >>>>>>> wrote:
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>> >> >>>>>>>>>
>> >> >>>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
>> >> >>>>>>>>> as
>> >> >>>>>>>>> mentioned in code review is very slow.
>> >> >>>>>>>>>
>> >> >>>>>>>>> https://patchwork.ozlabs.org/patch/499482/
>> >> >>>>>>>>> https://patchwork.ozlabs.org/patch/452604/
>> >> >>>>>>>>>
>> >> >>>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>> >> >>>>>>>>> noticed
>> >> >>>>>>>>> that this was applied. I can send a patch to tidy that up, but in
>> >> >>>>>>>>> any
>> >> >>>>>>>>> case
>> >> >>>>>>>>> I think we should consider a revert until the function is better
>> >> >>>>>>>>> implemented.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> The fact that the function needs to perform a slow operation is not a
>> >> >>>>>>>> good
>> >> >>>>>>>> reason for a revert. The slowness of the operation is just a matter
>> >> >>>>>>>> of
>> >> >>>>>>>> fact
>> >> >>>>>>>> with DT not having uplinks in its data structure, and U-Boot using
>> >> >>>>>>>> those
>> >> >>>>>>>> data structures directly.
>> >> >>>>>>>>
>> >> >>>>>>>> You'd requested during review that I additionally implement a faster
>> >> >>>>>>>> version
>> >> >>>>>>>> of the function in the case where the parent node is already known,
>> >> >>>>>>>> and
>> >> >>>>>>>> said
>> >> >>>>>>>> it was fine if that happened in a later patch. I have this on my TODO
>> >> >>>>>>>> list,
>> >> >>>>>>>> but it's only been a couple of days.
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> I didn't expect this to go to mainline before your new patch.
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> I didn't get that message from the thread; you wrote:
>> >> >>>>>>
>> >> >>>>>> Simon Glass wrote:
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> Stephen Warren wrote:
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Simon Glass wrote:
>> >> >>>>>>>>>
>> >> >>>>>>>>>
>> >> >>>>>>>>> Also, how about (in addition) a
>> >> >>>>>>>>> version of this function that works for devices? Like:
>> >> >>>>>>>>>
>> >> >>>>>>>>> device_get_addr_size(struct udevice *dev, ...)
>> >> >>>>>>>>>
>> >> >>>>>>>>> so that it can handle this for you.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> That sounds like a separate patch?
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> Yes, but I think we should get it in there so that people don't start
>> >> >>>>>>> using this (wildly inefficient) function when they don't need to. I
>> >> >>>>>>> think by passing in the parent node we force people to think about the
>> >> >>>>>>> cost.
>> >> >>>>>>>
>> >> >>>>>>> Yes the driver model function can be a separate patch, but let's get
>> >> >>>>>>> it in at about the same time. We have dev_get_addr() so could have
>> >> >>>>>>> dev_get_addr_size().
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> That sounds to like you'd like a followup patch soon after this patch
>> >> >>>>>> (my
>> >> >>>>>> assumption would be within a few days or weeks) to solve your comments,
>> >> >>>>>> not
>> >> >>>>>> that you wanted a replacement patch.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> I will take that feedback to be a little more forceful in future, sorry.
>> >> >>>>>
>> >> >>>>>>
>> >> >>>>>>>> Why not just fix the bug since you said you could? That seems far
>> >> >>>>>>>> better
>> >> >>>>>>>> than breaking the newly added Tegra210 support.
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
>> >> >>>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
>> >> >>>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
>> >> >>>>>>> misunderstand the problem the patch was trying to fix. As I said it
>> >> >>>>>>> did not have a commit message, so who knows :-)
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The
>> >> >>>>>> only
>> >> >>>>>> thing that matters is #address-cells/#size-cells. Those properties are
>> >> >>>>>> what
>> >> >>>>>> tell the parsing code how many bytes to read from the reg property.
>> >> >>>>>> Whether
>> >> >>>>>> the resultant value fits into the code's internal representation is an
>> >> >>>>>> internal detail of the code, not part of the semantics of DT itself or
>> >> >>>>>> how
>> >> >>>>>> to parse it.
>> >> >>>>>>
>> >> >>>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>> >> >>>>>> quite likely that everything will just happen to work most of the time.
>> >> >>>>>> However, this is purely an accident and not something that anything
>> >> >>>>>> should
>> >> >>>>>> rely upon.
>> >> >>>>>>
>> >> >>>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which
>> >> >>>>>> is
>> >> >>>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
>> >> >>>>>> practice
>> >> >>>>>> is perfectly /legal/ and will work out just fine since, except perhaps
>> >> >>>>>> for
>> >> >>>>>> RAM sizes, I don't believe any value in DT will actually require more
>> >> >>>>>> than
>> >> >>>>>> 32-bits to represent)
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> I would like to have the types match the platform where possible. At
>> >> >>>>> least the types should not be smaller than the platform - e.g. if
>> >> >>>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> In general, there's no "should not" here; we "cannot". If there's a
>> >> >>>> 64-bit
>> >> >>>> value in the DT (with bits above bit 31 set), then it can't be stored in
>> >> >>>> a
>> >> >>>> 32-bit variable.
>> >> >>>>
>> >> >>>> That said, if a DT has #address-cells=<2>, but the actual values stored
>> >> >>>> in
>> >> >>>> all reg values have 0 for the MS word, that'll actually work just fine.
>> >> >>>> Note
>> >> >>>> that it's 100% legal to set #address-cells=<100> and just write a bunch
>> >> >>>> of
>> >> >>>> extra zeros into the property. Silly perhaps, but perfectly legal. Since
>> >> >>>> the
>> >> >>>> function should adapt to whatever #address-cells value is in the DT,
>> >> >>>> supporting that case isn't any more work, so there's no reason we
>> >> >>>> shouldn't.
>> >> >>>>
>> >> >>>>> address/size in the device tree. This is for efficiency. We don't want
>> >> >>>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
>> >> >>>>> things for no benefit.
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
>> >> >>>> that's unrelated to using the correct algorithm to parse DT.
>> >> >>>>
>> >> >>>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>> >> >>>>>>>> mention this at all.
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> It calls that function expecting it to pick up an address and size
>> >> >>>>>>> from two consecutive cells. With this patch, that fails (unless the
>> >> >>>>>>> property happens to be "reg").
>> >> >>>>
>> >> >>>>
>> >> >>>> ...
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>> >> >>>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>> >> >>>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
>> >> >>>>>> node,
>> >> >>>>>> the code and DT content are clearly inconsistent; For this node
>> >> >>>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
>> >> >>>>>> address
>> >> >>>>>> is a chip-select and hence has no size. So, the code should not assume
>> >> >>>>>> that
>> >> >>>>>> the memory-map property can be parsed in the same way as a reg
>> >> >>>>>> property.
>> >> >>>>>>
>> >> >>>>>> I note that the memory-map property doesn't exist in the Linux kernel's
>> >> >>>>>> DT
>> >> >>>>>> binding documentation database, or code, hence hasn't been reviewed by
>> >> >>>>>> the
>> >> >>>>>> DT binding maintainers.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> The function comment says:
>> >> >>>>>
>> >> >>>>>    * Look up an address property in a node and return it as an address.
>> >> >>>>>    * The property must hold one address with a length. This is only
>> >> >>>>> tested
>> >> >>>>>    * on 32-bit machines.
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that
>> >> >>>> part
>> >> >>>> of the function comment should have been removed in the commit.
>> >> >>>>
>> >> >>>>> My intention was that this would be an efficient way to decode an
>> >> >>>>> address and size from a device tree. To some extent regmap may take
>> >> >>>>> over this role (IMO we should turn to drop fdtdec one day one we have
>> >> >>>>> more infrastructure). But I'd like it to work efficiently for 32-bit
>> >> >>>>> machines. The new function could hardly be less efficient.
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> Again, there's no way in general to make it more efficient. The
>> >> >>>> efficiency
>> >> >>>> issue is directly implied by the DT data structures.
>> >> >>>>
>> >> >>>> In the special case where the parent node is already known, we can
>> >> >>>> certainly
>> >> >>>> introduce an alternate function that is more efficient. You've already
>> >> >>>> asked
>> >> >>>> for that, and as I said, it's in my TODO list.
>> >> >>>>
>> >> >>>>> I think we should consider the case where the physical address size of
>> >> >>>>> U-Boot and the device tree do not match as a corner case. I certainly
>> >> >>>>> don't want device tree to add loads of pointless code for 'normal'
>> >> >>>>> platforms.
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> It's not a corner case. It's a fundamental part of the DT schema. If
>> >> >>>> U-Boot
>> >> >>>> is going to use DT, it should actually use *DT*, not some-very
>> >> >>>> similar-but-not-quite-DT format with all kinds of implicit and unstated
>> >> >>>> exceptions, limitations, and assumptions. This implies fully honoring all
>> >> >>>> aspects of how DT works when parsing it, not picking and choosing
>> >> >>>> features
>> >> >>>> because some are inconvenient or annoying. If U-Boot doesn't want to
>> >> >>>> correctly implement DT support, it should just drop it completely.
>> >> >>>>
>> >> >>>> As an aside, when instantiating devices, I hope one day that U-Boot will
>> >> >>>> parse DT top-down in a hierarchical way, rather than simply searching for
>> >> >>>> any node anywhere in the DT without regard for whether any parent node is
>> >> >>>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
>> >> >>>> this right now, and is only getting away with this accidentally. Without
>> >> >>>> hacky workarounds in drivers, this won't continue to work for all Tegra
>> >> >>>> HW
>> >> >>>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
>> >> >>>> sub-modules), since parent drivers must initialize before child drivers
>> >> >>>> in
>> >> >>>> order to enable various register buses, including clocks/resets affecting
>> >> >>>> those buses etc.
>> >> >>>>
>> >> >>>> Again, if it's simply too inconvenient or bloated to implement DT
>> >> >>>> properly
>> >> >>>> in U-Boot, let's just drop it entirely. A halfway solution is the worst
>> >> >>>> of
>> >> >>>> both worlds. I'm not convinced the full implications of how to (and the
>> >> >>>> need
>> >> >>>> to) correctly and fully support DT have were well thought through before
>> >> >>>> (control) DT support was added to U-Boot.
>> >> >>>>
>> >> >>>>> Re memory-map, yes it doesn't seem to be possible to do what it is
>> >> >>>>> trying to do (and Thierry says the same below). It is quite weird to
>> >> >>>>> have a SPI peripheral which is also memory mapped.
>> >> >>>>>
>> >> >>>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
>> >> >>>>> 64-bit Tegra, what is actually wrong with the way the function was?
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> With the DT files now checked into U-Boot, I think it would accidentally
>> >> >>>> work, since we just happen to have set #address-cells=<2>,
>> >> >>>> #size-cells=<2>,
>> >> >>>> and that would just happen to match sizeof(fdt_addr_t).
>> >> >>>>
>> >> >>>> However note this is an accident on a couple of levels:
>> >> >>>>
>> >> >>>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
>> >> >>>> #address-cells * 4. This is an invalid assumption since it does not
>> >> >>>> correctly honor the DT schema. It hard-codes the size of values whereas
>> >> >>>> DT
>> >> >>>> schema says the size is defined by the #xxx-cells properties.
>> >> >>>>
>> >> >>>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
>> >> >>>> #size-cells=<1>. I asked him to change that to match what I expected to
>> >> >>>> be
>> >> >>>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request
>> >> >>>> or
>> >> >>>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
>> >> >>>> fdtdec_get_addr_size() would have attempted to read twice as many bytes
>> >> >>>> as
>> >> >>>> it should have from the property. It's entirely plausible that someone
>> >> >>>> could
>> >> >>>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly
>> >> >>>> and
>> >> >>>> enabled it.
>> >> >>>>
>> >> >>>>> This is a boot loader so we should be willing to make some
>> >> >>>>> simplifications.
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> When dealing with internal bootloader details, sure assumptions,
>> >> >>>> simplifications, etc. can be made.
>> >> >>>>
>> >> >>>> However, DT is an externally defined standard. The content of DT must be
>> >> >>>> identical across all OSs (SW stacks, bootloader) and not influenced by
>> >> >>>> requirements/... of any specific individual OS's (SW stack, bootloader)
>> >> >>>> quirks. We can't just pick and choose which parts of it we care about.
>> >> >>>> Well,
>> >> >>>> perhaps if we stop calling it DT we could.
>> >> >>>
>> >> >>>
>> >> >>> So I think in summary:
>> >> >>>
>> >> >>> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
>> >> >>
>> >> >>
>> >> >> It turns out that arch/arm/include/asm/config.h already enables this for all
>> >> >> ARM64 platforms.
>> >> >>
>> >> >> As such, we can in fact go ahead with reverting this patch, and U-Boot will
>> >> >> still function on Tegra210 boards.
>> >> >>
>> >> >> In the short term, I think that means TomR should just apply this revert
>> >> >> patch, and we don't need to send any additional patches. In the slightly
>> >> >> longer term, we should add some comments to fdtdec_get_addr_size()
>> >> >> describing its problems, and slightly longer term, add back Thierry's patch,
>> >> >> but in a way that lets callers specify whether #address-cells/#size-cells
>> >> >> should be used, or whether caller-supplied hard-coded values should be used.
>> >> >
>> >> > OK great. But I see your new patch so I think we can apply both at the
>> >> > same time.
>> >> >
>> >> >>
>> >> >> I apologize for not noticing this earlier; I'd assumed that since none of
>> >> >> the Tegra-specific files in include/configs/ set this flag, nor any of the
>> >> >> Tegra-specific Kconfig files, it wasn't set, and hence a revert of the patch
>> >> >> would break Tegra210 support.
>> >> >
>> >> > No problem - I assumed it would also.
>> >> >
>> >> >>
>> >> >>> - then fdtdec_get_addr_size() would work as expected
>> >> >>
>> >> >>
>> >> >> I take issue with *works* as expected", since I would expect the function to
>> >> >> implement DT parsing rules completely.
>> >> >>
>> >> >> However, it's certainly true to say that it will generate the desired
>> >> >> results, even if it doesn't /work/ (isn't implemented) as expected.
>> >> >>
>> >> >> (I suppose this depends on whether you're talking about "works" w.r.t to
>> >> >> correctness of the returned results or side-effects, or w.r.t. how the
>> >> >> internal implementation works.)
>> >> >>
>> >> >>> - I want to make this cases as efficient as possible since it will be
>> >> >>> called in SPL
>> >> >>> - You are concerned that making assumptions like this violates the DT spec
>> >> >>>
>> >> >>> One option is to split the functions into two - one that works in SPL
>> >> >>> and makes assumptions, and one that does not and does things the hard
>> >> >>> way.
>> >> >>
>> >> >>
>> >> >> Why should SPL be any different? U-Boot should either parse DT correctly or
>> >> >> not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra does
>> >> >> very little (and isn't even present on Tegra210), but in general, can't
>> >> >> someone use storage drivers, filesystems, etc. in SPL to choose the next
>> >> >> stage to load, read GPIOs or other data sources to select between different
>> >> >> boot paths, perhaps even interact with a user? If so, then assuming that SPL
>> >> >> can somehow implements a reduced set of features, and hence can make
>> >> >> assumptions that non-SPL can't, seems quite dangerous. We could only do that
>> >> >> if we put some active checks in the U-Boot makefiles to ensure that nobody
>> >> >> enabled anything in the SPL besides a set of strictly audited features.
>> >> >
>> >> > Yes we could do that and it would avoid pain later. I suppose SPL on
>> >> > Tegra is a bit of a special case since there is really no size limit.
>> >> > For some chips the limits are pretty severe so I am quite sensitive to
>> >> > code size even at the expense of extra debug time when something
>> >> > breaks.
>> >> >
>> >> >>
>> >> >>> I suppose we could also add checks/warnings that the DT is
>> >> >>>
>> >> >>> 'well-formed' and that the address size matches the machine it is
>> >> >>> running on.
>> >> >>
>> >> >>
>> >> >> Yes, we certainly should do that.
>> >> >>
>> >> >>> In any case we do need to get rid of the parent lookup in this
>> >> >>> function. So can either you or Thierry submit a patch to do this? The
>> >> >>> parent should instead be a parameter to the function. You may need to
>> >> >>> create a stub function which looks it up before calling
>> >> >>> fdtdec_get_addr_size().
>> >> >>
>> >> >>
>> >> >> Yes, we can't remove the parent lookup in all cases. However, we can avoid
>> >> >> it in the cases where the caller can supply it. I think that's most, but not
>> >> >> quite all.
>> >> >
>> >> > My main concern is dev_get_addr() since that is the official way of
>> >> > reading a device address now. See also regmap_init_mem() which does
>> >> > things its own way.
>> >> >
>> >> >>
>> >> >>
>> >> >>> I'll see how things look in SPL.
>> >> >>>
>> >> >>> Regards,
>> >> >>> Simon
>> >> >>>
>> >> >>
>> >> >
>> >>
>> >> Do we have any conclusion about commit 5b34436? Today I started to
>> >> check the pre-relocatoin DM PCI UART issue, but found it is now broken
>> >> due to this commit. The broken part is at
>> >> ns16550_serial_ofdata_to_platdata() in drivers/serial/ns16550.c, in
>> >> which it has:
>> >>
>> >>     addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
>> >> #ifdef CONFIG_PCI
>> >>     if (addr == FDT_ADDR_T_NONE) {
>> >> ...
>> >>
>> >> Before commit 5b34436, the old behavior is that the call to
>> >> fdtdec_get_addr() returns FDT_ADDR_T_NONE so that we can trap into the
>> >> PCI logic. But with commit 5b34436, addr is now zero which just bypass
>> >> this logic.
>> >>
>> >> As for why addr is now zero, this is because fdtdec_get_number() can
>> >> only handle a 64-bit number at most. However for PCI reg, it has 3
>> >> cells. So if I have the following encoding:
>> >>
>> >> reg = <0x00025100 0x0 0x0 0x0 0x0
>> >>            0x01025110 0x0 0x0 0x0 0x0>;
>> >>
>> >> The addr will be assigned to zero after two rounds of left shift by 32-bit.
>> >>
>> >> I can certainly change ns16550 driver to test the return value against
>> >> 0 now, but I think this fdtdec_get_addr() does not cover all cases.
>> >> Please advise.
>> >
>> > This type of simple address parsing breaks down in the face of PCI. PCI
>> > requires 3 address cells because it has different types of address
>> > spaces. fdtdec_get_address() actually does the right thing here. It
>> > returns the "address" associated with the entry. The address is the
>> > 64-bit value you get by concatenating cells 0 and 1. Cell 2 contains
>> > additional meta data such as the PCI address and the type of memory that
>> > you are dealing with (configuration space, I/O, memory-mapped).
>>
>> I think you wanted to say fdtdec_get_address() returns cell 1 and 2
>> and cell 0 contains the meta data.
>
> That depends on how you look at it. But yes, I think we're talking about
> the same thing.
>
>> > I'd suggest that we add code to properly check whether this is a PCI
>> > device.
>>
>> fdtdec_get_number() will only get the last 2 cells into the returned
>> addr. If we have more than 2 cells (in the PCI case), we end up get
>> cell 1 and 2 which is zero. And if we have 4 cells, we will get cell 2
>> and 3.
>
> Yes. All the more reason to special-case here. fdtdec_get_address() here
> will inevitably lead to pain for anything that exceeds 64-bit (2 cells).
>
>> > Also, why isn't this code obtaining the memory address from the base
>> > address registers?
>> >
>>
>> I think we discussed this before. For simplification, we use device
>> tree to pass the BAR number to the driver, otherwise we will end up
>> dealing with a large code block of PCI vendor ID and device ID
>> switch/case to determine which BAR we should read.
>
> That's really abusing DT, but like you said, I think we discussed this
> before and disagreed the same way before. =)
>

Well, I am open for discussions. I don't think current implementation
is abusing DT. IMHO, DT is only suitable for describing devices that
are not probable. For PCI devices (not the PCI host controller) since
they have a unique vendor ID & device ID pair, we can hardcoded all
those stuff in the driver itself. We don't need specify anything in
the device tree (like BAR number, b.d.f etc). Current PCI <reg>
bindings comes from Open Firmware spec. I believe the original intent
was to have bootloader fill in the <reg> property with bootloader
configured values and pass it to the OS so that OS does not need
re-configure the bus. Well, I would call abusing DT that we describe a
PCI device in the device tree. We shouldn't do that.

Anyway, given we have driver model PCI support now, I think I can
create a new PCI ns16550 driver using driver model. In the new driver,
I can continue using current implementation to get the BAR, but if
anyone thinks that we should never using device tree to describe a PCI
device, I can do a vendor ID & device ID match game :-) Be warned that
the PCI ns16550 driver may get bigger and bigger (see Linux driver
8250_pci.c and you know what I am talking about)

Regards,
Bin
Simon Glass Aug. 14, 2015, 4:50 p.m. UTC | #23
On 6 August 2015 at 13:03, Stephen Warren <swarren@wwwdotorg.org> wrote:
[snip]

> It turns out that arch/arm/include/asm/config.h already enables this for all
> ARM64 platforms.
>
> As such, we can in fact go ahead with reverting this patch, and U-Boot will
> still function on Tegra210 boards.

Applied to u-boot-x86.
Simon Glass Sept. 2, 2015, 4:58 p.m. UTC | #24
Hi Tom,

On 2 September 2015 at 10:52, Tom Warren <TWarren@nvidia.com> wrote:
> Simon, et al,
>
>> -----Original Message-----
>> From: sjg@google.com [mailto:sjg@google.com] On Behalf Of Simon Glass
>> Sent: Friday, August 14, 2015 3:05 AM
>> To: Bin Meng
>> Cc: Michal Suchanek; Tom Rini; Stephen Warren; U-Boot Mailing List; Tom
>> Warren; Thierry Reding
>> Subject: Re: [U-Boot] [PATCH] Revert "fdt: Fix fdtdec_get_addr_size() for 64-
>> bit"
>>
>> Hi,
>>
>
>
> [snip]
>
>>
>> I plan to apply this revert to u-boot-x86 (where SPI is currently
>> broken) and (once it has a bit more testing) also this patch which I think makes
>> the change in a safer way:
>>
>> https://patchwork.ozlabs.org/patch/504918/
>>
>> At present that patch breaks at least one x86 board and I have not dug into it
>> yet.
>>
>> The revert should not break tegra, according to Stephen.
>
> Unfortunately, my testing on P2571 with TOT u-boot-tegra (rebased against TOT u-boot/master this morning) shows that that is not true.
>
> The revert of the disputed 'fdtdec_get_addr_size' patch _does_ break Tegra 64-bit (P2571, at least). Nyan-big is OK.  With Simon's revert in place, my board just loops on SPL signon, so I assume it's faulting, etc. in CPU init.  Note that this is the current state of TOT u-boot/master.
>
> If I 'revert' Simon's revert, I can boot my P2571 to the command prompt again.  If I leave his revert of fdtdec_get_addr_size in, but use Stephen's rewrite patch (as listed by Simon above), I can again boot to the command prompt on my P2571 (haven't tested my Nyan-big yet).
>
> I've got some patches pending in u-boot-tegra/master that I wanted to send a PR for, but I can't until I can assure that at least a small sample of boards (T124 and T210, in this case) still boot.
>
> Since Stephen's patch isn't a Tegra patch, I don't think I should apply it to u-boot-tegra for the PR.  I think it's best handled by TomR in u-boot/master, especially considering the controversy surrounding the previous attempt, and Simon's assertion that it still needs work for at least on x86 board.
>
> Please put your heads together and solve this, since we're now broken on 64-bit T210 Tegra  (Stephen - I know you are OOTO and may not be able to test this on your T210 boards, but I'd like someone else to verify my findings ASAP if possible).

Oh dear.

I think the solution is to apply Stephen's patch. However that
currently breaks other boards. I'm planning to figure out why but have
not got to it yet. I'll make sure it gets done by early next week.

Regards,
Simon
Stephen Warren Sept. 2, 2015, 8:04 p.m. UTC | #25
On 09/02/2015 09:52 AM, Tom Warren wrote:
> Simon, et al,
> 
>> Simon Glass wrote at Friday, August 14, 2015 3:05 AM:
>> I plan to apply this revert to u-boot-x86 (where SPI is currently
>> broken) and (once it has a bit more testing) also this patch which I think makes
>> the change in a safer way:
>>
>> https://patchwork.ozlabs.org/patch/504918/
>>
>> At present that patch breaks at least one x86 board and I have not dug into it
>> yet.
>>
>> The revert should not break tegra, according to Stephen.
> 
> Unfortunately, my testing on P2571 with TOT u-boot-tegra (rebased against TOT u-boot/master this morning) shows that that is not true.
> 
> The revert of the disputed 'fdtdec_get_addr_size' patch _does_ break Tegra 64-bit (P2571, at least). Nyan-big is OK.  With Simon's revert in place, my board just loops on SPL signon, so I assume it's faulting, etc. in CPU init.  Note that this is the current state of TOT u-boot/master.

I'm a bit confused. So far, we don't support SPL on T210 since we assume
some other bootloader runs on the boot CPU and starts just the main
U-Boot on the main CPU. It sounds like you're testing some local-only
SPL support?
Stephen Warren Sept. 2, 2015, 8:54 p.m. UTC | #26
On 09/02/2015 01:39 PM, Tom Warren wrote:
> 
> 
>> -----Original Message-----
>> From: Stephen Warren
>> Sent: Wednesday, September 02, 2015 1:05 PM
>> To: Tom Warren; Simon Glass
>> Cc: Bin Meng; Thierry Reding; Tom Rini; U-Boot Mailing List
>> Subject: Re: [U-Boot] [PATCH] Revert "fdt: Fix fdtdec_get_addr_size() for 64-
>> bit"
>>
>> On 09/02/2015 09:52 AM, Tom Warren wrote:
>>> Simon, et al,
>>>
>>>> Simon Glass wrote at Friday, August 14, 2015 3:05 AM:
>>>> I plan to apply this revert to u-boot-x86 (where SPI is currently
>>>> broken) and (once it has a bit more testing) also this patch which I
>>>> think makes the change in a safer way:
>>>>
>>>> https://patchwork.ozlabs.org/patch/504918/
>>>>
>>>> At present that patch breaks at least one x86 board and I have not
>>>> dug into it yet.
>>>>
>>>> The revert should not break tegra, according to Stephen.
>>>
>>> Unfortunately, my testing on P2571 with TOT u-boot-tegra (rebased against
>> TOT u-boot/master this morning) shows that that is not true.
>>>
>>> The revert of the disputed 'fdtdec_get_addr_size' patch _does_ break Tegra
>> 64-bit (P2571, at least). Nyan-big is OK.  With Simon's revert in place, my board
>> just loops on SPL signon, so I assume it's faulting, etc. in CPU init.  Note that this
>> is the current state of TOT u-boot/master.
>>
>> I'm a bit confused. So far, we don't support SPL on T210 since we assume some
>> other bootloader runs on the boot CPU and starts just the main U-Boot on the
>> main CPU. It sounds like you're testing some local-only SPL support?
>
> Currently there are a couple of ways to get T210 64-bit U-Boot loaded/executed. The first is the way I've always done it (during development and today) - use a 32-bit SPL that I built when I first ported 32-bit U-Boot to T210. I've saved away the SPL portion as a binary, and combine it with the current 64-bit T210 U-Boot proper when building my image.  It's always worked up to now.  What I'm seeing is a failure in the 64-bit CPU U-Boot portion. I just mentioned the looping SPL signon symptom because that's what I see as the indicator of a broken 64-bit image.

Oh I see; the SPL is only looping because the main U-Boot binary
crashes/... and resets the CPU, hence re-executing the SPL. I thought
you were referring to a loop purely within SPL.

Now it makes more sense.

> The other way is to use your proprietary NV bootloader for the 32-bit portion (this will become the de facto standard when we release said NV bootloader code as open-source, or a binary first-stage loader 'tool').  I haven't tried that, since my way works and is an easy part of my workflow.
> 
> If you can point me to your boot CPU loader internally, I can try your method and see if it makes a difference, but I doubt it will.

I sent you an internal email message.

Perhaps you could also try my upstream U-Boot dev branch at:

repo git://github.com/swarren/u-boot.git branch tegra_dev

That has the revert of the original patch in, but also has the following
replacement which you'd want to revert (or perhaps best: try with and
without it):

c1fd5e1d5586 fdt: add new fdt address parsing functions

I'm sure I tested Simon's revert at the time I said it was OK. I wonder
if the revert used to work fine, but something since then fails if the
revert is in place? I would try testing this now, but I'm travelling so
it's a bit more painful.
Stephen Warren Sept. 2, 2015, 11:43 p.m. UTC | #27
On 09/02/2015 01:54 PM, Stephen Warren wrote:
> On 09/02/2015 01:39 PM, Tom Warren wrote:
>>
>>
>>> -----Original Message-----
>>> From: Stephen Warren
>>> Sent: Wednesday, September 02, 2015 1:05 PM
>>> To: Tom Warren; Simon Glass
>>> Cc: Bin Meng; Thierry Reding; Tom Rini; U-Boot Mailing List
>>> Subject: Re: [U-Boot] [PATCH] Revert "fdt: Fix fdtdec_get_addr_size() for 64-
>>> bit"
>>>
>>> On 09/02/2015 09:52 AM, Tom Warren wrote:
>>>> Simon, et al,
>>>>
>>>>> Simon Glass wrote at Friday, August 14, 2015 3:05 AM:
>>>>> I plan to apply this revert to u-boot-x86 (where SPI is currently
>>>>> broken) and (once it has a bit more testing) also this patch which I
>>>>> think makes the change in a safer way:
>>>>>
>>>>> https://patchwork.ozlabs.org/patch/504918/
>>>>>
>>>>> At present that patch breaks at least one x86 board and I have not
>>>>> dug into it yet.
>>>>>
>>>>> The revert should not break tegra, according to Stephen.
>>>>
>>>> Unfortunately, my testing on P2571 with TOT u-boot-tegra (rebased against
>>> TOT u-boot/master this morning) shows that that is not true.
>>>>
>>>> The revert of the disputed 'fdtdec_get_addr_size' patch _does_ break Tegra
>>> 64-bit (P2571, at least). Nyan-big is OK.  With Simon's revert in place, my board
>>> just loops on SPL signon, so I assume it's faulting, etc. in CPU init.  Note that this
>>> is the current state of TOT u-boot/master.
>>>
>>> I'm a bit confused. So far, we don't support SPL on T210 since we assume some
>>> other bootloader runs on the boot CPU and starts just the main U-Boot on the
>>> main CPU. It sounds like you're testing some local-only SPL support?
>>
>> Currently there are a couple of ways to get T210 64-bit U-Boot loaded/executed. The first is the way I've always done it (during development and today) - use a 32-bit SPL that I built when I first ported 32-bit U-Boot to T210. I've saved away the SPL portion as a binary, and combine it with the current 64-bit T210 U-Boot proper when building my image.  It's always worked up to now.  What I'm seeing is a failure in the 64-bit CPU U-Boot portion. I just mentioned the looping SPL signon symptom because that's what I see as the indicator of a broken 64-bit image.
> 
> Oh I see; the SPL is only looping because the main U-Boot binary
> crashes/... and resets the CPU, hence re-executing the SPL. I thought
> you were referring to a loop purely within SPL.
> 
> Now it makes more sense.
> 
>> The other way is to use your proprietary NV bootloader for the 32-bit portion (this will become the de facto standard when we release said NV bootloader code as open-source, or a binary first-stage loader 'tool').  I haven't tried that, since my way works and is an easy part of my workflow.
>>
>> If you can point me to your boot CPU loader internally, I can try your method and see if it makes a difference, but I doubt it will.
> 
> I sent you an internal email message.
> 
> Perhaps you could also try my upstream U-Boot dev branch at:
> 
> repo git://github.com/swarren/u-boot.git branch tegra_dev
> 
> That has the revert of the original patch in, but also has the following
> replacement which you'd want to revert (or perhaps best: try with and
> without it):
> 
> c1fd5e1d5586 fdt: add new fdt address parsing functions
> 
> I'm sure I tested Simon's revert at the time I said it was OK. I wonder
> if the revert used to work fine, but something since then fails if the
> revert is in place? I would try testing this now, but I'm travelling so
> it's a bit more painful.

I worked out how to remote control my device, and tested the current
u-boot-tegra/master (which contains the patch revert this email thread
is about) with and without "fdt: add new fdt address parsing functions"
removed, and it works fine for me.

When you're concatenating SPL+U-Boot+DTB, are you using the DTB from the
same source tree as the main U-Boot (likely by getting U-Boot+DTB from
u-boot-dtb.bin in that source tree)?
Simon Glass Sept. 17, 2015, 1:10 a.m. UTC | #28
Hi Tom,

On 16 September 2015 at 15:46, Tom Warren <TWarren@nvidia.com> wrote:
> Simon,
>
>> -----Original Message-----
>> From: Tom Warren
>> Sent: Wednesday, September 02, 2015 7:02 PM
>> To: 'Stephen Warren'; Simon Glass
>> Cc: U-Boot Mailing List; Thierry Reding; Tom Rini
>> Subject: RE: [U-Boot] [PATCH] Revert "fdt: Fix fdtdec_get_addr_size() for 64-
>> bit"
>>
>> > -----Original Message-----
>> > From: Stephen Warren [mailto:swarren@wwwdotorg.org]
>> > Sent: Wednesday, September 02, 2015 4:44 PM
>> > To: Tom Warren; Simon Glass
>> > Cc: U-Boot Mailing List; Thierry Reding; Tom Rini
>> > Subject: Re: [U-Boot] [PATCH] Revert "fdt: Fix fdtdec_get_addr_size()
>> > for 64- bit"
>> >
>> > On 09/02/2015 01:54 PM, Stephen Warren wrote:
>> > > On 09/02/2015 01:39 PM, Tom Warren wrote:
>> > >>
>> > >>
>> > >>> -----Original Message-----
>> > >>> From: Stephen Warren
>> > >>> Sent: Wednesday, September 02, 2015 1:05 PM
>> > >>> To: Tom Warren; Simon Glass
>> > >>> Cc: Bin Meng; Thierry Reding; Tom Rini; U-Boot Mailing List
>> > >>> Subject: Re: [U-Boot] [PATCH] Revert "fdt: Fix
>> > >>> fdtdec_get_addr_size() for 64- bit"
>> > >>>
>> > >>> On 09/02/2015 09:52 AM, Tom Warren wrote:
>> > >>>> Simon, et al,
>> > >>>>
>> > >>>>> Simon Glass wrote at Friday, August 14, 2015 3:05 AM:
>> > >>>>> I plan to apply this revert to u-boot-x86 (where SPI is
>> > >>>>> currently
>> > >>>>> broken) and (once it has a bit more testing) also this patch
>> > >>>>> which I think makes the change in a safer way:
>> > >>>>>
>> > >>>>> https://patchwork.ozlabs.org/patch/504918/
>> > >>>>>
>> > >>>>> At present that patch breaks at least one x86 board and I have
>> > >>>>> not dug into it yet.
>> > >>>>>
>> > >>>>> The revert should not break tegra, according to Stephen.
>> > >>>>
>> > >>>> Unfortunately, my testing on P2571 with TOT u-boot-tegra (rebased
>> > >>>> against
>> > >>> TOT u-boot/master this morning) shows that that is not true.
>> > >>>>
>> > >>>> The revert of the disputed 'fdtdec_get_addr_size' patch _does_
>> > >>>> break Tegra
>> > >>> 64-bit (P2571, at least). Nyan-big is OK.  With Simon's revert in
>> > >>> place, my board just loops on SPL signon, so I assume it's
>> > >>> faulting, etc. in CPU init.  Note that this is the current state of TOT u-
>> boot/master.
>> > >>>
>> > >>> I'm a bit confused. So far, we don't support SPL on T210 since we
>> > >>> assume some other bootloader runs on the boot CPU and starts just
>> > >>> the main U-Boot on the main CPU. It sounds like you're testing
>> > >>> some local-
>> > only SPL support?
>> > >>
>> > >> Currently there are a couple of ways to get T210 64-bit U-Boot
>> > loaded/executed. The first is the way I've always done it (during
>> > development and today) - use a 32-bit SPL that I built when I first
>> > ported 32-bit U-Boot to T210. I've saved away the SPL portion as a
>> > binary, and combine it with the current 64-bit T210 U-Boot proper when
>> > building my image.  It's always worked up to now.  What I'm seeing is
>> > a failure in the 64-bit CPU U-Boot portion. I just mentioned the
>> > looping SPL signon symptom because that's what I see as the indicator of a
>> broken 64-bit image.
>> > >
>> > > Oh I see; the SPL is only looping because the main U-Boot binary
>> > > crashes/... and resets the CPU, hence re-executing the SPL. I
>> > > thought you were referring to a loop purely within SPL.
>> > >
>> > > Now it makes more sense.
>> > >
>> > >> The other way is to use your proprietary NV bootloader for the
>> > >> 32-bit
>> > portion (this will become the de facto standard when we release said
>> > NV bootloader code as open-source, or a binary first-stage loader
>> > 'tool').  I haven't tried that, since my way works and is an easy part of my
>> workflow.
>> > >>
>> > >> If you can point me to your boot CPU loader internally, I can try
>> > >> your
>> > method and see if it makes a difference, but I doubt it will.
>> > >
>> > > I sent you an internal email message.
>> > >
>> > > Perhaps you could also try my upstream U-Boot dev branch at:
>> > >
>> > > repo git://github.com/swarren/u-boot.git branch tegra_dev
>> > >
>> > > That has the revert of the original patch in, but also has the
>> > > following replacement which you'd want to revert (or perhaps best:
>> > > try with and without it):
>> > >
>> > > c1fd5e1d5586 fdt: add new fdt address parsing functions
>> > >
>> > > I'm sure I tested Simon's revert at the time I said it was OK. I
>> > > wonder if the revert used to work fine, but something since then
>> > > fails if the revert is in place? I would try testing this now, but
>> > > I'm travelling so it's a bit more painful.
>> >
>> > I worked out how to remote control my device, and tested the current
>> > u-boot- tegra/master (which contains the patch revert this email
>> > thread is about) with and without "fdt: add new fdt address parsing
>> functions"
>> > removed, and it works fine for me.
>> >
>> > When you're concatenating SPL+U-Boot+DTB, are you using the DTB from
>> > the same source tree as the main U-Boot (likely by getting U-Boot+DTB
>> > from u- boot-dtb.bin in that source tree)?
>> Yes
>>
>
> I'm not sure if this was the last thread on this (I was on vacation for a few days), but have you resolved the problem you had with Stephen's new 'fdt: Add new fdt address parsing functions" patch? I'd really like to get this resolved so I can send a PR to TomR.

Yes I believe the issues are resolved (not with the patch BTW - the
only issue with the patch was the #define DEBUG which I removed).

The patch has just been applied to mainline.

Regards,
Simon
diff mbox

Patch

diff --git a/lib/fdtdec.c b/lib/fdtdec.c
index a954051..aac4f8d 100644
--- a/lib/fdtdec.c
+++ b/lib/fdtdec.c
@@ -90,45 +90,29 @@  const char *fdtdec_get_compatible(enum fdt_compat_id id)
 fdt_addr_t fdtdec_get_addr_size(const void *blob, int node,
 		const char *prop_name, fdt_size_t *sizep)
 {
-	const fdt32_t *ptr, *end;
-	int parent, na, ns, len;
-	fdt_addr_t addr;
+	const fdt_addr_t *cell;
+	int len;
 
 	debug("%s: %s: ", __func__, prop_name);
-
-	parent = fdt_parent_offset(blob, node);
-	if (parent < 0) {
-		debug("(no parent found)\n");
-		return FDT_ADDR_T_NONE;
-	}
-
-	na = fdt_address_cells(blob, parent);
-	ns = fdt_size_cells(blob, parent);
-
-	ptr = fdt_getprop(blob, node, prop_name, &len);
-	if (!ptr) {
-		debug("(not found)\n");
-		return FDT_ADDR_T_NONE;
-	}
-
-	end = ptr + len / sizeof(*ptr);
-
-	if (ptr + na + ns > end) {
-		debug("(not enough data: expected %d bytes, got %d bytes)\n",
-		      (na + ns) * 4, len);
-		return FDT_ADDR_T_NONE;
-	}
-
-	addr = fdtdec_get_number(ptr, na);
-
-	if (sizep) {
-		*sizep = fdtdec_get_number(ptr + na, ns);
-		debug("addr=%pa, size=%pa\n", &addr, sizep);
-	} else {
-		debug("%pa\n", &addr);
+	cell = fdt_getprop(blob, node, prop_name, &len);
+	if (cell && ((!sizep && len == sizeof(fdt_addr_t)) ||
+		     len == sizeof(fdt_addr_t) * 2)) {
+		fdt_addr_t addr = fdt_addr_to_cpu(*cell);
+		if (sizep) {
+			const fdt_size_t *size;
+
+			size = (fdt_size_t *)((char *)cell +
+					sizeof(fdt_addr_t));
+			*sizep = fdt_size_to_cpu(*size);
+			debug("addr=%08lx, size=%llx\n",
+			      (ulong)addr, (u64)*sizep);
+		} else {
+			debug("%08lx\n", (ulong)addr);
+		}
+		return addr;
 	}
-
-	return addr;
+	debug("(not found)\n");
+	return FDT_ADDR_T_NONE;
 }
 
 fdt_addr_t fdtdec_get_addr(const void *blob, int node,