[U-Boot,v2] dm: core: Enable optional use of fdt_translate_address()

Message ID 1441343506-28473-1-git-send-email-sr@denx.de
State Superseded
Delegated to: Simon Glass

Commit Message

Stefan Roese Sept. 4, 2015, 5:11 a.m. UTC
The current "simple" address translation, simple_bus_translate(), does not
work on some platforms (e.g. MVEBU), because many nodes there use more
complex "ranges" properties (multiple tuples etc.). This patch enables the
optional use of the common fdt_translate_address() function, which handles
this translation correctly.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Simon Glass <sjg@chromium.org>
Cc: Bin Meng <bmeng.cn@gmail.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
---
v2:
- Rework code a bit as suggested by Simon. Also added some comments
  to make the use of the code paths more clear.

 drivers/core/Kconfig  | 30 ++++++++++++++++++++++++++++++
 drivers/core/device.c | 20 ++++++++++++++++++++
 2 files changed, 50 insertions(+)

Comments

Simon Glass Sept. 9, 2015, 6:07 p.m. UTC | #1
+Stephen

Hi Stefan,

On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>
> The current "simple" address translation simple_bus_translate() is not
> working on some platforms (e.g. MVEBU). As here more complex "ranges"
> properties are used in many nodes (multiple tuples etc). This patch
> enables the optional use of the common fdt_translate_address() function
> which handles this translation correctly.
>
> Signed-off-by: Stefan Roese <sr@denx.de>
> Cc: Simon Glass <sjg@chromium.org>
> Cc: Bin Meng <bmeng.cn@gmail.com>
> Cc: Marek Vasut <marex@denx.de>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
> v2:
> - Rework code a bit as suggested by Simon. Also added some comments
>   to make the use of the code paths more clear.


While this works I'm reluctant to commit it as is. The call to
fdt_parent_offset() is very slow.

I wonder if this code should be copied into a new file in
drivers/core/, tidied up and updated to use dev->parent?

Other options:
- Add a library to unflatten the tree - but this would not be very
useful in SPL or before relocation due to memory/speed constraints
- Add a helper to find a node parent which uses a cached tree scan to
build a table of previous nodes (or some other means to go backwards
in the tree)
- Worry about it later and go ahead with this patch
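
To make the second option concrete, here is a rough sketch of a parent table
built in one pass. It is not U-Boot or libfdt code: the token stream, the
names build_parent_table() and example_tree, and the MAX_DEPTH limit are all
made up for illustration, and a real device tree blob carries properties and
names that this model ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a flattened tree: only node open/close tokens. */
enum token { BEGIN_NODE, END_NODE };

#define MAX_DEPTH 16	/* arbitrary nesting limit for this sketch */

/*
 * Fill parent[i] for every token index i in a single pass. For a
 * BEGIN_NODE token it is the index of the enclosing node's BEGIN_NODE
 * token (-1 for the root); for any other token it is -1.
 * Returns 0 on success, -1 if the stream is unbalanced or too deep.
 */
static int build_parent_table(const enum token *toks, size_t count,
			      int *parent)
{
	int stack[MAX_DEPTH];	/* open nodes on the path from the root */
	int depth = 0;

	assert(toks && parent);
	for (size_t i = 0; i < count; i++) {
		parent[i] = -1;
		if (toks[i] == BEGIN_NODE) {
			if (depth == MAX_DEPTH)
				return -1;
			if (depth > 0)
				parent[i] = stack[depth - 1];
			stack[depth++] = (int)i;
		} else {
			if (depth == 0)
				return -1;	/* unbalanced END_NODE */
			depth--;
		}
	}
	return depth == 0 ? 0 : -1;
}

/* root { a { b { } } c { d { } } } */
static const enum token example_tree[] = {
	BEGIN_NODE,		/* 0: root */
	BEGIN_NODE,		/* 1: a */
	BEGIN_NODE, END_NODE,	/* 2: b, 3: end b */
	END_NODE,		/* 4: end a */
	BEGIN_NODE,		/* 5: c */
	BEGIN_NODE, END_NODE,	/* 6: d, 7: end d */
	END_NODE,		/* 8: end c */
	END_NODE,		/* 9: end root */
};
```

Once the table exists, a parent lookup is a single array read (parent[6] is 5
for node d above). The cost is one int per token of extra memory, which is
the kind of overhead that may not be acceptable in SPL or before relocation.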
>
>
>  drivers/core/Kconfig  | 30 ++++++++++++++++++++++++++++++
>  drivers/core/device.c | 20 ++++++++++++++++++++
>  2 files changed, 50 insertions(+)
>
> diff --git a/drivers/core/Kconfig b/drivers/core/Kconfig
> index 41f4e69..15681df 100644
> --- a/drivers/core/Kconfig
> +++ b/drivers/core/Kconfig
> @@ -120,4 +120,34 @@ config SPL_SIMPLE_BUS
>           Supports the 'simple-bus' driver, which is used on some systems
>           in SPL.
>
> +config OF_TRANSLATE
> +       bool "Translate addresses using fdt_translate_address"
> +       depends on DM && OF_CONTROL
> +       default y
> +       help
> +         If this option is enabled, the reg property will be translated
> +         using the fdt_translate_address() function. This is necessary
> +         on some platforms (e.g. MVEBU) using complex "ranges"
> +         properties in many nodes, as this translation is not handled
> +         correctly by the default simple_bus_translate() function.
> +
> +         If this option is not enabled, simple_bus_translate() will be
> +         used for the address translation. This function is faster and
> +         smaller in size than fdt_translate_address().
> +
> +config SPL_OF_TRANSLATE
> +       bool "Translate addresses using fdt_translate_address"
> +       depends on SPL_DM && SPL_OF_CONTROL
> +       default n
> +       help
> +         If this option is enabled, the reg property will be translated
> +         using the fdt_translate_address() function. This is necessary
> +         on some platforms (e.g. MVEBU) using complex "ranges"
> +         properties in many nodes, as this translation is not handled
> +         correctly by the default simple_bus_translate() function.
> +
> +         If this option is not enabled, simple_bus_translate() will be
> +         used for the address translation. This function is faster and
> +         smaller in size than fdt_translate_address().
> +
>  endmenu
> diff --git a/drivers/core/device.c b/drivers/core/device.c
> index 0ccd443..c543203 100644
> --- a/drivers/core/device.c
> +++ b/drivers/core/device.c
> @@ -11,6 +11,7 @@
>
>  #include <common.h>
>  #include <fdtdec.h>
> +#include <fdt_support.h>
>  #include <malloc.h>
>  #include <dm/device.h>
>  #include <dm/device-internal.h>
> @@ -581,6 +582,25 @@ fdt_addr_t dev_get_addr(struct udevice *dev)
>  #if CONFIG_IS_ENABLED(OF_CONTROL)
>         fdt_addr_t addr;
>
> +       if (CONFIG_IS_ENABLED(OF_TRANSLATE)) {
> +               const fdt32_t *reg;
> +
> +               reg = fdt_getprop(gd->fdt_blob, dev->of_offset, "reg", NULL);
> +               if (!reg)
> +                       return FDT_ADDR_T_NONE;
> +
> +               /*
> +                * Use the full-fledged translate function for complex
> +                * bus setups.
> +                */
> +               return fdt_translate_address((void *)gd->fdt_blob,
> +                                            dev->of_offset, reg);
> +       }
> +
> +       /*
> +        * Use the "simple" translate function for less complex
> +        * bus setups.
> +        */
>         addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
>         if (CONFIG_IS_ENABLED(SIMPLE_BUS) && addr != FDT_ADDR_T_NONE) {
>                 if (device_get_uclass_id(dev->parent) == UCLASS_SIMPLE_BUS)
> --
> 2.5.1


Regards,
Simon
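
For readers unfamiliar with the "multiple tuples" problem from the commit
message, the following is a minimal sketch of the difference between
consulting only the first "ranges" tuple and walking all of them. It is a
simplified model, not the code of simple_bus_translate() or
fdt_translate_address(): struct range, both function names, ADDR_NONE and the
window values are assumptions made up for this example, and only a single bus
level is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_NONE UINT64_MAX	/* stand-in for FDT_ADDR_T_NONE */

/* One decoded "ranges" tuple: child base, parent base, window size. */
struct range {
	uint64_t child_base;
	uint64_t parent_base;
	uint64_t size;
};

/* Single-tuple lookup: only ranges[0] is ever consulted. */
static uint64_t translate_first_only(const struct range *ranges,
				     size_t count, uint64_t addr)
{
	assert(ranges || count == 0);
	if (count > 0 && addr >= ranges[0].child_base &&
	    addr - ranges[0].child_base < ranges[0].size)
		return addr - ranges[0].child_base + ranges[0].parent_base;
	return addr;	/* no match: address passes through untranslated */
}

/* Multi-tuple lookup: walk every tuple until one window contains addr. */
static uint64_t translate_all(const struct range *ranges, size_t count,
			      uint64_t addr)
{
	assert(ranges || count == 0);
	if (count == 0)
		return addr;	/* empty "ranges" means a 1:1 mapping */
	for (size_t i = 0; i < count; i++) {
		const struct range *r = &ranges[i];

		if (addr >= r->child_base && addr - r->child_base < r->size)
			return addr - r->child_base + r->parent_base;
	}
	return ADDR_NONE;	/* address is not visible on the parent bus */
}

/* Hypothetical bus with two windows; the values are made up. */
static const struct range example_ranges[] = {
	{ 0x00000000, 0xf1000000, 0x00100000 },
	{ 0x10000000, 0xf8000000, 0x00010000 },
};
```

With these made-up windows, child address 0x10000100 maps to 0xf8000100
through the second tuple when all tuples are walked, while the first-only
variant returns it unchanged, so a driver would end up using the wrong
address.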
Stefan Roese Sept. 10, 2015, 5:54 a.m. UTC | #2
Hi Simon,

On 09.09.2015 20:07, Simon Glass wrote:
> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>
>> The current "simple" address translation simple_bus_translate() is not
>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>> properties are used in many nodes (multiple tuples etc). This patch
>> enables the optional use of the common fdt_translate_address() function
>> which handles this translation correctly.
>>
>> Signed-off-by: Stefan Roese <sr@denx.de>
>> Cc: Simon Glass <sjg@chromium.org>
>> Cc: Bin Meng <bmeng.cn@gmail.com>
>> Cc: Marek Vasut <marex@denx.de>
>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>> ---
>> v2:
>> - Rework code a bit as suggested by Simon. Also added some comments
>>    to make the use of the code paths more clear.
>
>
> While this works I'm reluctant to commit it as is. The call to
> fdt_parent_offset() is very slow.

You've mentioned this before. But how slow could this function really 
be? And it should not be called that often via dev_get_addr(). Usually 
only once for each driver in the probe function. Or am I missing something?

> I wonder if this code should be copied into a new file in
> drivers/core/, tidied up and updated to use dev->parent?

You mean fdt_translate_address()? It references many functions from 
fdt_support.c though which we would need to duplicate here as well.

> Other options:
> - Add a library to unflatten the tree - but this would not be very
> useful in SPL or before relocation due to memory/speed constraints
> - Add a helper to find a node parent which uses a cached tree scan to
> build a table of previous nodes (or some other means to go backwards
> in the tree)
> - Worry about it later and go ahead with this patch

I see no problem with deferring this patch (or a "better" version of it)
until after this release. The Marvell mvebu DM patches are also not targeted
for this release.

Thanks,
Stefan
Simon Glass Sept. 11, 2015, 12:42 a.m. UTC | #3
Hi Stefan,

On 9 September 2015 at 22:54, Stefan Roese <sr@denx.de> wrote:
> Hi Simon,
>
> On 09.09.2015 20:07, Simon Glass wrote:
>>
>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>
>>>
>>> The current "simple" address translation simple_bus_translate() is not
>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>> properties are used in many nodes (multiple tuples etc). This patch
>>> enables the optional use of the common fdt_translate_address() function
>>> which handles this translation correctly.
>>>
>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>> Cc: Simon Glass <sjg@chromium.org>
>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>> Cc: Marek Vasut <marex@denx.de>
>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>> ---
>>> v2:
>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>    to make the use of the code paths more clear.
>>
>>
>>
>> While this works I'm reluctant to commit it as is. The call to
>> fdt_parent_offset() is very slow.
>
>
> You've mentioned this before. But how slow could this function really be?

It scans the tree from the start. There is no back link.

> And it should not be called that often via dev_get_addr(). Usually only once
> for each driver in the probe function. Or am I missing something?

Sounds correct.

>
>> I wonder if this code should be copied into a new file in
>> drivers/core/, tidied up and updated to use dev->parent?
>
>
> You mean fdt_translate_address()? It references many functions from
> fdt_support.c though which we would need to duplicate here as well.
>

Right. Seems like a pain.

>> Other options:
>> - Add a library to unflatten the tree - but this would not be very
>> useful in SPL or before relocation due to memory/speed constraints
>> - Add a helper to find a node parent which uses a cached tree scan to
>> build a table of previous nodes (or some other means to go backwards
>> in the tree)
>> - Worry about it later and go ahead with this patch
>
>
> I see no problems to defer this patch (or a "better" version of it) to after
> this release. The Marvell mvebu DM patches are also not targeted for this
> release.

OK - and if the time slowdown is not too large then we can just use
this patch, particularly as it is an optional CONFIG. Can you check
how much slower it is to use your new case versus the original code?

Regards,
Simon
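
The "no back link" point can be illustrated with a toy model. This is only a
sketch and not the libfdt implementation of fdt_parent_offset(): the token
stream, parent_by_scan(), example_tree and MAX_DEPTH are hypothetical names
for this example. What it shows is that, without a stored parent pointer,
finding a node's parent means walking from the start of the tree up to that
node.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a flattened tree: only node open/close tokens. */
enum token { BEGIN_NODE, END_NODE };

#define MAX_DEPTH 16	/* arbitrary nesting limit for this sketch */

/*
 * Return the index of the BEGIN_NODE token of the parent of the node that
 * starts at index target, or -1 for the root or on malformed input.
 * The number of tokens visited is stored in *visited when it is non-NULL.
 */
static int parent_by_scan(const enum token *toks, size_t count,
			  size_t target, size_t *visited)
{
	int stack[MAX_DEPTH];	/* open nodes on the path from the root */
	int depth = 0;

	assert(toks && target < count);
	for (size_t i = 0; i <= target; i++) {
		if (visited)
			*visited = i + 1;
		if (toks[i] == BEGIN_NODE) {
			if (i == target)
				return depth ? stack[depth - 1] : -1;
			if (depth == MAX_DEPTH)
				return -1;
			stack[depth++] = (int)i;
		} else {
			if (depth == 0)
				return -1;	/* unbalanced END_NODE */
			depth--;
		}
	}
	return -1;	/* target was not a BEGIN_NODE token */
}

/* root { a { b { } } c { d { } } } */
static const enum token example_tree[] = {
	BEGIN_NODE,		/* 0: root */
	BEGIN_NODE,		/* 1: a */
	BEGIN_NODE, END_NODE,	/* 2: b, 3: end b */
	END_NODE,		/* 4: end a */
	BEGIN_NODE,		/* 5: c */
	BEGIN_NODE, END_NODE,	/* 6: d, 7: end d */
	END_NODE,		/* 8: end c */
	END_NODE,		/* 9: end root */
};
```

In this model, looking up the parent of node d (index 6) returns index 5 and
visits 7 tokens, so the cost of each lookup grows with the node's position in
the blob. Whether that matters in practice depends on how often the lookup
runs, which is the question raised above.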
Stefan Roese Sept. 11, 2015, 5:41 a.m. UTC | #4
Hi Simon,

On 11.09.2015 02:42, Simon Glass wrote:
>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>
>>>>
>>>> The current "simple" address translation simple_bus_translate() is not
>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>> enables the optional use of the common fdt_translate_address() function
>>>> which handles this translation correctly.
>>>>
>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>> Cc: Simon Glass <sjg@chromium.org>
>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>> Cc: Marek Vasut <marex@denx.de>
>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>> ---
>>>> v2:
>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>     to make the use of the code paths more clear.
>>>
>>>
>>>
>>> While this works I'm reluctant to commit it as is. The call to
>>> fdt_parent_offset() is very slow.
>>
>>
>> You've mentioned this before. But how slow could this function really be?
>
> It scans the tree from the start. There is no back link.
>
>> And it should not be called that often via dev_get_addr(). Usually only once
>> for each driver in the probe function. Or am I missing something?
>
> Sounds correct.

So it really shouldn't make a big difference.

>>
>>> I wonder if this code should be copied into a new file in
>>> drivers/core/, tidied up and updated to use dev->parent?
>>
>>
>> You mean fdt_translate_address()? It references many functions from
>> fdt_support.c though which we would need to duplicate here as well.
>>
>
> Right. Seems like a pain.
>
>>> Other options:
>>> - Add a library to unflatten the tree - but this would not be very
>>> useful in SPL or before relocation due to memory/speed constraints
>>> - Add a helper to find a node parent which uses a cached tree scan to
>>> build a table of previous nodes (or some other means to go backwards
>>> in the tree)
>>> - Worry about it later and go ahead with this patch
>>
>>
>> I see no problems to defer this patch (or a "better" version of it) to after
>> this release. The Marvell mvebu DM patches are also not targeted for this
>> release.
>
> OK - and if the time slowdown is not too large then we can just use
> this patch, particularly as it is an optional CONFIG. Can you check
> how much slower it is to use your new case versus the original code?

Marvell MVEBU won't boot without this option enabled, so I can't really
compare it here. Someone with a platform that doesn't need this option
enabled is better placed to run this test and compare the results.

Thanks,
Stefan
Stephen Warren Sept. 11, 2015, 5:07 p.m. UTC | #5
On 09/09/2015 11:07 AM, Simon Glass wrote:
> +Stephen
> 
> Hi Stefan,
> 
> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>
>> The current "simple" address translation simple_bus_translate() is not
>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>> properties are used in many nodes (multiple tuples etc). This patch
>> enables the optional use of the common fdt_translate_address() function
>> which handles this translation correctly.
>>
>> Signed-off-by: Stefan Roese <sr@denx.de>
>> Cc: Simon Glass <sjg@chromium.org>
>> Cc: Bin Meng <bmeng.cn@gmail.com>
>> Cc: Marek Vasut <marex@denx.de>
>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>> ---
>> v2:
>> - Rework code a bit as suggested by Simon. Also added some comments
>>   to make the use of the code paths more clear.
> 
> 
> While this works I'm reluctant to commit it as is. The call to
> fdt_parent_offset() is very slow.
> 
> I wonder if this code should be copied into a new file in
> drivers/core/, tidied up and updated to use dev->parent?
> 
> Other options:
> - Add a library to unflatten the tree - but this would not be very
> useful in SPL or before relocation due to memory/speed constraints
> - Add a helper to find a node parent which uses a cached tree scan to
> build a table of previous nodes (or some other means to go backwards
> in the tree)
> - Worry about it later and go ahead with this patch

I haven't looked at the code in detail, but I'm surprised there's a
Kconfig option for this, for either SPL or main U-Boot. In general, this
feature is simply a required part of parsing DT, so surely the code
should always be enabled. Without it, we're only getting lucky if DT
works (lucky the DT doesn't happen to contain a ranges property). Sure
the code does some searching through the DT, and that's slower than not
doing it, but I don't see how we can support DT without parsing DT
correctly. Now admittedly some platforms' DTs happen not to contain
ranges that require this code in practice. However, I feel that's a bit
of a micro-optimization, and a rather error-prone one at that. What if
someone pulls a more complete DT into U-Boot and suddenly the code is
required and they have to spend ages tracking down their problem to
missing functionality in a core DT parsing API - something they'd be
unlikely to initially suspect?
Stefan Roese Sept. 14, 2015, 5:25 a.m. UTC | #6
Hi Stephen,

On 11.09.2015 19:07, Stephen Warren wrote:
> On 09/09/2015 11:07 AM, Simon Glass wrote:
>> +Stephen
>>
>> Hi Stefan,
>>
>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>
>>> The current "simple" address translation simple_bus_translate() is not
>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>> properties are used in many nodes (multiple tuples etc). This patch
>>> enables the optional use of the common fdt_translate_address() function
>>> which handles this translation correctly.
>>>
>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>> Cc: Simon Glass <sjg@chromium.org>
>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>> Cc: Marek Vasut <marex@denx.de>
>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>> ---
>>> v2:
>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>    to make the use of the code paths more clear.
>>
>>
>> While this works I'm reluctant to commit it as is. The call to
>> fdt_parent_offset() is very slow.
>>
>> I wonder if this code should be copied into a new file in
>> drivers/core/, tidied up and updated to use dev->parent?
>>
>> Other options:
>> - Add a library to unflatten the tree - but this would not be very
>> useful in SPL or before relocation due to memory/speed constraints
>> - Add a helper to find a node parent which uses a cached tree scan to
>> build a table of previous nodes (or some other means to go backwards
>> in the tree)
>> - Worry about it later and go ahead with this patch
>
> I haven't looked at the code in detail, but I'm surprised there's a
> Kconfig option for this, for either SPL or main U-Boot. In general, this
> feature is simply a required part of parsing DT, so surely the code
> should always be enabled. Without it, we're only getting lucky if DT
> works (lucky the DT doesn't happen to contain a ranges property).

Yes. I was also a bit surprised that the current (limited) address
translation implementation works on the platforms using this interface
right now.

> Sure
> the code does some searching through the DT, and that's slower than not
> doing it, but I don't see how we can support DT without parsing DT
> correctly. Now admittedly some platforms' DTs happen not to contain
> ranges that require this code in practice. However, I feel that's a bit
> of a micro-optimization, and a rather error-prone one at that. What if
> someone pulls a more complete DT into U-Boot and suddenly the code is
> required and they have to spend ages tracking down their problem to
> missing functionality in a core DT parsing API - something they'd be
> unlikely to initially suspect.

Ack. However, I definitely understand Simon's arguments about code size
here. On some platforms with limited RAM for SPL, this additional code
for "correct" ranges parsing and address translation might break the
size limit. Not sure how to handle this. At least a comment in the code
explaining the limitations of simple_bus_translate() would be helpful.

Thanks,
Stefan
Thomas Chou Sept. 15, 2015, 7:31 a.m. UTC | #7
Hi Stefan,

On 09/04/2015 01:11 PM, Stefan Roese wrote:
> The current "simple" address translation simple_bus_translate() is not
> working on some platforms (e.g. MVEBU). As here more complex "ranges"
> properties are used in many nodes (multiple tuples etc). This patch
> enables the optional use of the common fdt_translate_address() function
> which handles this translation correctly.
>
> Signed-off-by: Stefan Roese <sr@denx.de>
> Cc: Simon Glass <sjg@chromium.org>
> Cc: Bin Meng <bmeng.cn@gmail.com>
> Cc: Marek Vasut <marex@denx.de>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
> v2:
> - Rework code a bit as suggested by Simon. Also added some comments
>    to make the use of the code paths more clear.
>

It works great on a nios2 board. Thanks a lot.

Tested-by: Thomas Chou <thomas@wytron.com.tw>

Best regards,
Thomas Chou
Stephen Warren Sept. 21, 2015, 6:06 p.m. UTC | #8
On 09/13/2015 11:25 PM, Stefan Roese wrote:
> Hi Stephen,
>
> On 11.09.2015 19:07, Stephen Warren wrote:
>> On 09/09/2015 11:07 AM, Simon Glass wrote:
>>> +Stephen
>>>
>>> Hi Stefan,
>>>
>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>
>>>> The current "simple" address translation simple_bus_translate() is not
>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>> enables the optional use of the common fdt_translate_address() function
>>>> which handles this translation correctly.
>>>>
>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>> Cc: Simon Glass <sjg@chromium.org>
>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>> Cc: Marek Vasut <marex@denx.de>
>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>> ---
>>>> v2:
>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>    to make the use of the code paths more clear.
>>>
>>>
>>> While this works I'm reluctant to commit it as is. The call to
>>> fdt_parent_offset() is very slow.
>>>
>>> I wonder if this code should be copied into a new file in
>>> drivers/core/, tidied up and updated to use dev->parent?
>>>
>>> Other options:
>>> - Add a library to unflatten the tree - but this would not be very
>>> useful in SPL or before relocation due to memory/speed constraints
>>> - Add a helper to find a node parent which uses a cached tree scan to
>>> build a table of previous nodes (or some other means to go backwards
>>> in the tree)
>>> - Worry about it later and go ahead with this patch
>>
>> I haven't looked at the code in detail, but I'm surprised there's a
>> Kconfig option for this, for either SPL or main U-Boot. In general, this
>> feature is simply a required part of parsing DT, so surely the code
>> should always be enabled. Without it, we're only getting lucky if DT
>> works (lucky the DT doesn't happen to contain a ranges property).
>
> Yes. I was also a bit surprised, that this current (limited)
> implementation to translate the address worked on the platforms using
> this interface right now.
>
>> Sure
>> the code does some searching through the DT, and that's slower than not
>> doing it, but I don't see how we can support DT without parsing DT
>> correctly. Now admittedly some platforms' DTs happen not to contain
>> ranges that require this code in practice. However, I feel that's a bit
>> of a micro-optimization, and a rather error-prone one at that. What if
>> someone pulls a more complete DT into U-Boot and suddenly the code is
>> required and they have to spend ages tracking down their problem to
>> missing functionality in a core DT parsing API - something they'd be
>> unlikely to initially suspect.
>
> Ack. However, I definitely understand Simon's arguments about code size
> here. On some platforms with limited RAM for SPL this additional code
> for "correct" ranges parsing and address translation might break the
> size limit. Not sure how to handle this. At least a comment in the code
> would be helpful, explaining that simple_bus_translate() is limited here
> in some aspects.

So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see 
that might be pushing some extremely constrained binaries over a limit 
if that function isn't already included in the binary. However, if we 
are in that situation, I have a really hard time believing this one 
patch/function will be the only issue; we'll constantly be hitting a 
wall where we can't fix issues in DT parsing, DT handling, or other code 
in these binaries since the fix will bloat the binary too much.

In those cases, I rather question whether DT support is the correct 
approach; completely dropping DT support from those binaries would 
likely remove large amounts of code and replace it with a tiny amount of 
constant data. It seems like that'd be the best approach all around 
since it'd head off the issue completely.
Simon Glass Oct. 3, 2015, 12:50 p.m. UTC | #9
Hi Stephen,

On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 09/13/2015 11:25 PM, Stefan Roese wrote:
>>
>> Hi Stephen,
>>
>> On 11.09.2015 19:07, Stephen Warren wrote:
>>>
>>> On 09/09/2015 11:07 AM, Simon Glass wrote:
>>>>
>>>> +Stephen
>>>>
>>>> Hi Stefan,
>>>>
>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>>
>>>>>
>>>>> The current "simple" address translation simple_bus_translate() is not
>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>>> enables the optional use of the common fdt_translate_address() function
>>>>> which handles this translation correctly.
>>>>>
>>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>>> Cc: Simon Glass <sjg@chromium.org>
>>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>>> Cc: Marek Vasut <marex@denx.de>
>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>>> ---
>>>>> v2:
>>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>>    to make the use of the code paths more clear.
>>>>
>>>>
>>>>
>>>> While this works I'm reluctant to commit it as is. The call to
>>>> fdt_parent_offset() is very slow.
>>>>
>>>> I wonder if this code should be copied into a new file in
>>>> drivers/core/, tidied up and updated to use dev->parent?
>>>>
>>>> Other options:
>>>> - Add a library to unflatten the tree - but this would not be very
>>>> useful in SPL or before relocation due to memory/speed constraints
>>>> - Add a helper to find a node parent which uses a cached tree scan to
>>>> build a table of previous nodes (or some other means to go backwards
>>>> in the tree)
>>>> - Worry about it later and go ahead with this patch
>>>
>>>
>>> I haven't looked at the code in detail, but I'm surprised there's a
>>> Kconfig option for this, for either SPL or main U-Boot. In general, this
>>> feature is simply a required part of parsing DT, so surely the code
>>> should always be enabled. Without it, we're only getting lucky if DT
>>> works (lucky the DT doesn't happen to contain a ranges property).
>>
>>
>> Yes. I was also a bit surprised, that this current (limited)
>> implementation to translate the address worked on the platforms using
>> this interface right now.
>>
>>> Sure
>>> the code does some searching through the DT, and that's slower than not
>>> doing it, but I don't see how we can support DT without parsing DT
>>> correctly. Now admittedly some platforms' DTs happen not to contain
>>> ranges that require this code in practice. However, I feel that's a bit
>>> of a micro-optimization, and a rather error-prone one at that. What if
>>> someone pulls a more complete DT into U-Boot and suddenly the code is
>>> required and they have to spend ages tracking down their problem to
>>> missing functionality in a core DT parsing API - something they'd be
>>> unlikely to initially suspect.
>>
>>
>> Ack. However, I definitely understand Simon's arguments about code size
>> here. On some platforms with limited RAM for SPL this additional code
>> for "correct" ranges parsing and address translation might break the
>> size limit. Not sure how to handle this. At least a comment in the code
>> would be helpful, explaining that simple_bus_translate() is limited here
>> in some aspects.
>
>
> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that
> might be pushing some extremely constrained binaries over a limit if that
> function isn't already included in the binary. However, if we are in that
> situation, I have a really hard time believing this one patch/function will
> be the only issue; we'll constantly be hitting a wall where we can't fix
> issues in DT parsing, DT handling, or other code in these binaries since the
> fix will bloat the binary too much.
>
> In those cases, I rather question whether DT support is the correct
> approach; completely dropping DT support from those binaries would likely
> remove large amounts of code and replace it with a tiny amount of constant
> data. It seems like that'd be the best approach all around since it'd head
> off the issue completely.

U-Boot is not Linux - code size is important. We can enable features
when needed. At present we can enable driver model and device tree
with a ~5KB binary hit including a small device tree. I'd like to keep
that down as low as possible. Otherwise we will end up with SPL being
unable to use driver model / device tree on lots of platforms. As time
goes by and SoCs become more and more complex, this will be a pain.
We'll end up forking the driver model.

Of course trade-offs can change over time but that's the way I see it
at the moment.

Regards,
Simon
Stephen Warren Oct. 3, 2015, 7:17 p.m. UTC | #10
On 10/03/2015 06:50 AM, Simon Glass wrote:
> Hi Stephen,
> 
> On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 09/13/2015 11:25 PM, Stefan Roese wrote:
>>>
>>> Hi Stephen,
>>>
>>> On 11.09.2015 19:07, Stephen Warren wrote:
>>>>
>>>> On 09/09/2015 11:07 AM, Simon Glass wrote:
>>>>>
>>>>> +Stephen
>>>>>
>>>>> Hi Stefan,
>>>>>
>>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>>>
>>>>>>
>>>>>> The current "simple" address translation simple_bus_translate() is not
>>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>>>> enables the optional use of the common fdt_translate_address() function
>>>>>> which handles this translation correctly.
>>>>>>
>>>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>>>> Cc: Simon Glass <sjg@chromium.org>
>>>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>>>> Cc: Marek Vasut <marex@denx.de>
>>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>>>> ---
>>>>>> v2:
>>>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>>>    to make the use of the code paths more clear.
>>>>>
>>>>>
>>>>>
>>>>> While this works I'm reluctant to commit it as is. The call to
>>>>> fdt_parent_offset() is very slow.
>>>>>
>>>>> I wonder if this code should be copied into a new file in
>>>>> drivers/core/, tidied up and updated to use dev->parent?
>>>>>
>>>>> Other options:
>>>>> - Add a library to unflatten the tree - but this would not be very
>>>>> useful in SPL or before relocation due to memory/speed constraints
>>>>> - Add a helper to find a node parent which uses a cached tree scan to
>>>>> build a table of previous nodes (or some other means to go backwards
>>>>> in the tree)
>>>>> - Worry about it later and go ahead with this patch
>>>>
>>>>
>>>> I haven't looked at the code in detail, but I'm surprised there's a
>>>> Kconfig option for this, for either SPL or main U-Boot. In general, this
>>>> feature is simply a required part of parsing DT, so surely the code
>>>> should always be enabled. Without it, we're only getting lucky if DT
>>>> works (lucky the DT doesn't happen to contain a ranges property).
>>>
>>>
>>> Yes. I was also a bit surprised, that this current (limited)
>>> implementation to translate the address worked on the platforms using
>>> this interface right now.
>>>
>>>> Sure
>>>> the code does some searching through the DT, and that's slower than not
>>>> doing it, but I don't see how we can support DT without parsing DT
>>>> correctly. Now admittedly some platforms' DTs happen not to contain
>>>> ranges that require this code in practice. However, I feel that's a bit
>>>> of a micro-optimization, and a rather error-prone one at that. What if
>>>> someone pulls a more complete DT into U-Boot and suddenly the code is
>>>> required and they have to spend ages tracking down their problem to
>>>> missing functionality in a core DT parsing API - something they'd be
>>>> unlikely to initially suspect.
>>>
>>>
>>> Ack. However, I definitely understand Simon's arguments about code size
>>> here. On some platforms with limited RAM for SPL this additional code
>>> for "correct" ranges parsing and address translation might break the
>>> size limit. Not sure how to handle this. At least a comment in the code
>>> would be helpful, explaining that simple_bus_translate() is limited here
>>> in some aspects.
>>
>>
>> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that
>> might be pushing some extremely constrained binaries over a limit if that
>> function isn't already included in the binary. However, if we are in that
>> situation, I have a really hard time believing this one patch/function will
>> be the only issue; we'll constantly be hitting a wall where we can't fix
>> issues in DT parsing, DT handling, or other code in these binaries since the
>> fix will bloat the binary too much.
>>
>> In those cases, I rather question whether DT support is the correct
>> approach; completely dropping DT support from those binaries would likely
>> remove large amounts of code and replace it with a tiny amount of constant
>> data. It seems like that'd be the best approach all around since it'd head
>> off the issue completely.
> 
> U-Boot is not Linux - code size is important. We can enable features
> when needed.

Only if they're not mandatory parts of other features that we've made an
arbitrary decision to use. Correctness trumps optimization in absolutely
all cases.
Simon Glass Oct. 4, 2015, 1:02 a.m. UTC | #11
Hi Stephen,

On 3 October 2015 at 20:17, Stephen Warren <swarren@wwwdotorg.org> wrote:
> Only if they're not mandatory parts of other features that we've made an
> arbitrary decision to use. Correctness trumps optimization in absolutely
> all cases.

This patch adds the ability to support complex multi-level range
properties for those boards that need it (only one so far). I think it
is a reasonable feature to have. We can perhaps improve the
implementation as I mentioned earlier in this thread, but only at the
cost of more code and development. The only shortcoming I am aware of
is that it moves up the tree looking for parent nodes, and this
involves scanning the device tree repeatedly. We can address this
later if it becomes a performance issue.
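
For readers unfamiliar with the mechanism, here is a minimal sketch of
multi-level "ranges" translation. It is not the fdt_translate_address()
code: struct range, struct bus, translate_one(), translate_address() and
the example tree are all made up for illustration, and the parent pointer
stands in for the parent lookup, which in a flattened tree costs a scan
(the slowness mentioned above).

```c
#include <stddef.h>
#include <stdint.h>

#define ADDR_NONE UINT64_MAX

/* One "ranges" tuple: child-bus base, parent-bus base, window size. */
struct range {
	uint64_t child_base;
	uint64_t parent_base;
	uint64_t size;
};

/* Hypothetical stand-in for a bus node; not a libfdt or U-Boot type. */
struct bus {
	const struct bus *parent;   /* NULL for the root node */
	const struct range *ranges; /* this node's "ranges" tuples */
	size_t nranges;
};

/* Translate addr across one bus level by finding the tuple that covers it. */
static uint64_t translate_one(const struct bus *bus, uint64_t addr)
{
	for (size_t i = 0; i < bus->nranges; i++) {
		const struct range *r = &bus->ranges[i];

		if (addr >= r->child_base && addr - r->child_base < r->size)
			return r->parent_base + (addr - r->child_base);
	}
	return ADDR_NONE; /* no window covers this address */
}

/*
 * Walk from the device's bus towards the root, translating at every level.
 * The root itself has no "ranges", so the walk stops once it is reached.
 * An empty "ranges" property (identity mapping) is not modelled here.
 */
static uint64_t translate_address(const struct bus *bus, uint64_t addr)
{
	for (; bus && bus->parent; bus = bus->parent) {
		addr = translate_one(bus, addr);
		if (addr == ADDR_NONE)
			return ADDR_NONE;
	}
	return addr;
}

/* Made-up example tree: root -> mbus (two windows) -> regs (one window). */
static const struct range mbus_ranges[] = {
	{ 0x00000000, 0xf1000000, 0x00100000 },
	{ 0x10000000, 0xe0000000, 0x08000000 },
};
static const struct range regs_ranges[] = {
	{ 0x00000000, 0x10000000, 0x00010000 },
};
static const struct bus root_bus = { NULL, NULL, 0 };
static const struct bus mbus = { &root_bus, mbus_ranges, 2 };
static const struct bus regs_bus = { &mbus, regs_ranges, 1 };
```

With this made-up tree, translate_address(&regs_bus, 0x1200) first maps
0x1200 to 0x10001200 on the mbus level, then the second mbus tuple maps
that to 0xe0001200. An address outside every window gives ADDR_NONE.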

While only one platform currently needs this feature, others may
follow, and as you point out if a platform needs this but we do not
support it, then it would be a failure to correctly parse valid device
tree semantics. But I can't agree that we must do everything or
nothing. One might argue that only the hush parser provides a correct
shell, or that simple malloc() does not implement memory allocation
correctly, or that only SHA256 is suitable as a hash, or that
snprintf() should always check its buffer size, or indeed that printf()
should support every format parameter, even in SPL. U-Boot is full of
such compromises and that contributes to its flexibility.

There is of course the risk that some poor soul may bring in an
updated device tree file for a platform which suddenly starts needing
ranges where it did not before. Hopefully they will remember that they
changed the device tree and hopefully after a bit of searching they find
this thread and they will know to define CONFIG_OF_TRANSLATE. But I am
more worried about the hopeful punter who wants to fit things into a
small SPL. We should try to make this easy from the start, and
allowing some of device tree's less common features to be optional is
the lesser of the two evils IMO.

Acked-by: Simon Glass <sjg@chromium.org>

Regards,
Simon
Stefan Roese Oct. 4, 2015, 7:35 a.m. UTC | #12
Hi Simon,

On 04.10.2015 03:02, Simon Glass wrote:
> This patch adds the ability to support complex multi-level range
> properties for those boards that need it (only one so far).

It's actually already 2 platforms, as Thomas Chou also needs this for
NIOS (or NIOS2). Thomas, please correct me if I'm wrong.

> I think it
> is a reasonable feature to have. We can perhaps improve the
> implementation as I mentioned earlier in this thread, but only at the
> cost of more code and development. The only shortcoming I am aware of
> is that it moves up the tree looking for parent nodes, and this
> involves scanning the device tree repeatedly. We can address this
> later if it becomes a performance issue.
>
> While only one platform currently needs this feature, others may
> follow, and as you point out if a platform needs this but we do not
> support it, then it would be a failure to correctly parse valid device
> tree semantics. But I can't agree that we must do everything or
> nothing. One might argue that only the hush parser provides a correct
> shell, or that simple malloc() does not implement memory allocation
> correctly, or that only SHA256 is suitable as a hash, or that
> snprintf() should always check its buffer size, or indeed that printf()
> should support every format parameter, even in SPL. U-Boot is full of
> such compromises and that contributes to its flexibility.
>
> There is of course the risk that some poor soul may bring in an
> updated device tree file for a platform which suddenly starts needing
> ranges where it did not before. Hopefully they will remember that they
> changed the device tree and hopefully after a bit of searching they find
> this thread and they will know to define CONFIG_OF_TRANSLATE. But I am
> more worried about the hopeful punter who wants to fit things into a
> small SPL. We should try to make this easy from the start, and
> allowing some of device tree's less common features to be optional is
> the lesser of the two evils IMO.
>
> Acked-by: Simon Glass <sjg@chromium.org>

Thanks,
Stefan
Thomas Chou Oct. 4, 2015, 11:38 a.m. UTC | #13
On 10/04/2015 03:35 PM, Stefan Roese wrote:
> Hi Simon,
>
> On 04.10.2015 03:02, Simon Glass wrote:
>> This patch adds the ability to support complex multi-level range
>> properties for those boards that need it (only one so far).
>
> It's actually already 2 platforms, as Thomas Chou also needs this for
> NIOS (or NIOS2). Thomas, please correct me if I'm wrong.

Yes, nios2 and socfpga MUST have this ranges translation.

Acked-by: Thomas Chou <thomas@wytron.com.tw>

Stephen Warren Oct. 5, 2015, 1:22 a.m. UTC | #14
On 10/03/2015 07:02 PM, Simon Glass wrote:
> This patch adds the ability to support complex multi-level range
> properties for those boards that need it (only one so far). I think it
> is a reasonable feature to have. We can perhaps improve the
> implementation as I mentioned earlier in this thread, but only at the
> cost of more code and development. The only shortcoming I am aware of
> is that it moves up the tree looking for parent nodes, and this
> involves scanning the device tree repeatedly. We can address this
> later if it becomes a performance issue.
> 
> While only one platform currently needs this feature, others may
> follow, and as you point out if a platform needs this but we do not
> support it, then it would be a failure to correctly parse valid device
> tree semantics. But I can't agree that we must do everything or
> nothing. One might argue that only the hush parser provides a correct
> shell, or that simple malloc() does not implement memory allocation
> correctly, or that only SHA256 is suitable as a hash, or that
> snprintf() should always check its buffer size, or indeed that printf()
> should support every format parameter, even in SPL. U-Boot is full of
> such compromises and that contributes to its flexibility.

I believe that a primary difference between the examples above and this
DT parsing feature is that the examples above are all different options
for implementing a conceptual feature (e.g. different hash algorithms,
all of which implement the ability to hash some data), whereas
supporting ranges in DT is a (fundamental) part of a single feature (DT
support), rather than a different implementation of "parsing DT".
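
To make the failure mode concrete, here is a rough sketch, assuming a
reduced translator that only ever looks at the first "ranges" tuple. This
is an illustrative model, not U-Boot's simple_bus_translate() itself;
struct range, translate_first_tuple(), translate_all_tuples() and the
example windows are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* One "ranges" tuple: child-bus base, parent-bus base, window size. */
struct range {
	uint64_t child_base;
	uint64_t parent_base;
	uint64_t size;
};

/* True if addr falls inside the child-side window of tuple r. */
static int range_covers(const struct range *r, uint64_t addr)
{
	return addr >= r->child_base && addr - r->child_base < r->size;
}

/*
 * Reduced translator (assumed model): only the first tuple is considered,
 * and an address outside it is passed through unchanged, with no error.
 */
static uint64_t translate_first_tuple(const struct range *r, size_t n,
				      uint64_t addr)
{
	if (n > 0 && range_covers(&r[0], addr))
		return r[0].parent_base + (addr - r[0].child_base);
	return addr;
}

/* Fuller translator: every tuple is checked before giving up. */
static uint64_t translate_all_tuples(const struct range *r, size_t n,
				     uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (range_covers(&r[i], addr))
			return r[i].parent_base + (addr - r[i].child_base);
	}
	return addr;
}

/* Made-up bus with two windows, as a multi-tuple "ranges" would describe. */
static const struct range example_ranges[] = {
	{ 0x00000000, 0xf1000000, 0x00100000 },
	{ 0x10000000, 0xe0000000, 0x08000000 },
};
```

For an address in the first window (0x1200) both functions agree on
0xf1001200. For one in the second window (0x10001200) the reduced version
returns it untranslated, while the fuller version returns 0xe0001200.
Nothing flags the difference, which is the debugging trap described above.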
Simon Glass Oct. 6, 2015, 2:17 p.m. UTC | #15
Hi Stephen,

On 5 October 2015 at 02:22, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 10/03/2015 07:02 PM, Simon Glass wrote:
>> Hi Stephen,
>>
>> On 3 October 2015 at 20:17, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> On 10/03/2015 06:50 AM, Simon Glass wrote:
>>>> Hi Stephen,
>>>>
>>>> On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>> On 09/13/2015 11:25 PM, Stefan Roese wrote:
>>>>>>
>>>>>> Hi Stephen,
>>>>>>
>>>>>> On 11.09.2015 19:07, Stephen Warren wrote:
>>>>>>>
>>>>>>> On 09/09/2015 11:07 AM, Simon Glass wrote:
>>>>>>>>
>>>>>>>> +Stephen
>>>>>>>>
>>>>>>>> Hi Stefan,
>>>>>>>>
>>>>>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The current "simple" address translation simple_bus_translate() is not
>>>>>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>>>>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>>>>>>> enables the optional use of the common fdt_translate_address() function
>>>>>>>>> which handles this translation correctly.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>>>>>>> Cc: Simon Glass <sjg@chromium.org>
>>>>>>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>>>>>>> Cc: Marek Vasut <marex@denx.de>
>>>>>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>>>>>>> ---
>>>>>>>>> v2:
>>>>>>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>>>>>>    to make the use of the code paths more clear.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> While this works I'm reluctant to commit it as is. The call to
>>>>>>>> fdt_parent_offset() is very slow.
>>>>>>>>
>>>>>>>> I wonder if this code should be copied into a new file in
>>>>>>>> drivers/core/, tidied up and updated to use dev->parent?
>>>>>>>>
>>>>>>>> Other options:
>>>>>>>> - Add a library to unflatten the tree - but this would not be very
>>>>>>>> useful in SPL or before relocation due to memory/speed constraints
>>>>>>>> - Add a helper to find a node parent which uses a cached tree scan to
>>>>>>>> build a table of previous nodes (or some other means to go backwards
>>>>>>>> in the tree)
>>>>>>>> - Worry about it later and go ahead with this patch
>>>>>>>
>>>>>>>
>>>>>>> I haven't looked at the code in detail, but I'm surprised there's a
>>>>>>> Kconfig option for this, for either SPL or main U-Boot. In general, this
>>>>>>> feature is simply a required part of parsing DT, so surely the code
>>>>>>> should always be enabled. Without it, we're only getting lucky if DT
>>>>>>> works (lucky the DT doesn't happen to contain a ranges property).
>>>>>>
>>>>>>
>>>>>> Yes. I was also a bit surprised, that this current (limited)
>>>>>> implementation to translate the address worked on the platforms using
>>>>>> this interface right now.
>>>>>>
>>>>>>> Sure
>>>>>>> the code does some searching through the DT, and that's slower than not
>>>>>>> doing it, but I don't see how we can support DT without parsing DT
>>>>>>> correctly. Now admittedly some platforms' DTs happen not to contain
>>>>>>> ranges that require this code in practice. However, I feel that's a bit
>>>>>>> of a micro-optimization, and a rather error-prone one at that. What if
>>>>>>> someone pulls a more complete DT into U-Boot and suddenly the code is
>>>>>>> required and they have to spend ages tracking down their problem to
>>>>>>> missing functionality in a core DT parsing API - something they'd be
>>>>>>> unlikely to initially suspect.
>>>>>>
>>>>>>
>>>>>> Ack. However, I definitely understand Simon's arguments about code size
>>>>>> here. On some platforms with limited RAM for SPL this additional code
>>>>>> for "correct" ranges parsing and address translation might break the
>>>>>> size limit. Not sure how to handle this. At least a comment in the code
>>>>>> would be helpful, explaining that simple_bus_translate() is limited here
>>>>>> in some aspects.
>>>>>
>>>>>
>>>>> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that
>>>>> might be pushing some extremely constrained binaries over a limit if that
>>>>> function isn't already included in the binary. However, if we are in that
>>>>> situation, I have a really hard time believing this one patch/function will
>>>>> be the only issue; we'll constantly be hitting a wall where we can't fix
>>>>> issues in DT parsing, DT handling, or other code in these binaries since the
>>>>> fix will bloat the binary too much.
>>>>>
>>>>> In those cases, I rather question whether DT support is the correct
>>>>> approach; completely dropping DT support from those binaries would likely
>>>>> remove large amounts of code and replace it with a tiny amount of constant
>>>>> data. It seems like that'd be the best approach all around since it'd head
>>>>> off the issue completely.
>>>>
>>>> U-Boot is not Linux - code size is important. We can enable features
>>>> when needed.
>>>
>>> Only if they're not mandatory parts of other features that we've made an
>>> arbitrary decision to use. Correctness trumps optimization in absolutely
>>> all cases.
>>
>> This patch adds the ability to support complex multi-level range
>> properties for those boards that need it (only one so far). I think it
>> is a reasonable feature to have. We can perhaps improve the
>> implementation as I mentioned earlier in this thread, but only at the
>> cost of more code and development. The only shortcoming I am aware of
>> is that it moves up the tree looking for parent nodes, and this
>> involves scanning the device tree repeatedly. We can address this
>> later if it becomes a performance issue.
>>
>> While only one platform currently needs this feature, others may
>> follow, and as you point out if a platform needs this but we do not
>> support it, then it would be a failure to correctly parse valid device
>> tree semantics. But I can't agree that we must do everything or
>> nothing. One might argue that only the hush parser provides a correct
>> shell, or that simple malloc() does not implement memory allocation
>> correctly, or that only SHA256 is suitable as a hash, or that
>> snprintf() should always check its buffer size, or indeed that printf()
>> should support every format parameter, even in SPL. U-Boot is full of
>> such compromises and that contributes to its flexibility.
>
> I believe that a primary difference between the examples above and this
> DT parsing feature is that the examples above are all different options
> for implementing a conceptual feature (e.g. different hash algorithms,
> all of which implement the ability to hash some data), whereas
> supporting ranges in DT is a (fundamental) part of a single feature (DT
> support), rather than a different implementation of "parsing DT".

There was a discussion about implementing a version of printf() for
SPL which just outputs the format string and ignores the parameters.
Arguably this fails your test, but is still useful. I don't see that
DT parsing is any different.

Regards,
Simon
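
For readers following the thread, the multi-tuple, multi-level "ranges"
translation being debated can be illustrated with a small standalone model.
This is only a sketch under stated assumptions, not the code of
fdt_translate_address() or simple_bus_translate(): struct range, struct bus,
translate_one(), translate() and all addresses below are hypothetical, each
field is treated as a single 64-bit value rather than #address-cells and
#size-cells worth of cells, and parent links are plain pointers instead of
offsets into a flattened tree.

```c
#include <stddef.h>
#include <stdint.h>

#define ADDR_NONE UINT64_MAX	/* hypothetical "no translation" marker */

/* One <child-base parent-base size> tuple of a "ranges" property. */
struct range {
	uint64_t child_base;
	uint64_t parent_base;
	uint64_t size;
};

/* A bus node: its ranges tuples plus a link to the parent bus. */
struct bus {
	const struct range *ranges;
	size_t nranges;		/* 0 models an empty "ranges;" (1:1 mapping) */
	const struct bus *parent;
};

/* Translate one level up; every tuple is checked, not just the first. */
static uint64_t translate_one(const struct bus *bus, uint64_t addr)
{
	if (bus->nranges == 0)
		return addr;

	for (size_t i = 0; i < bus->nranges; i++) {
		const struct range *r = &bus->ranges[i];

		if (addr >= r->child_base && addr - r->child_base < r->size)
			return r->parent_base + (addr - r->child_base);
	}
	return ADDR_NONE;
}

/* Walk towards the root, applying each bus level's ranges in turn. */
static uint64_t translate(const struct bus *bus, uint64_t addr)
{
	for (; bus && addr != ADDR_NONE; bus = bus->parent)
		addr = translate_one(bus, addr);
	return addr;
}

/* Hypothetical two-level layout with two tuples per level. */
static const struct range outer_ranges[] = {
	{ 0x10000000, 0xf1000000, 0x100000 },
	{ 0x20000000, 0xf2000000, 0x100000 },
};
static const struct bus outer = { outer_ranges, 2, NULL };

static const struct range inner_ranges[] = {
	{ 0x00000, 0x10000000, 0x1000 },
	{ 0x10000, 0x20000000, 0x1000 },
};
static const struct bus inner = { inner_ranges, 2, &outer };
```

In this model, a device on the inner bus with reg 0x10040 matches the second
inner tuple and then the second outer tuple. A translation that only looks at
one tuple, or only at one bus level, cannot resolve it. In a flattened tree the
step from a bus to its parent is not a pointer dereference but a scan of the
blob, which is the cost raised earlier in the thread.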

Patch

diff --git a/drivers/core/Kconfig b/drivers/core/Kconfig
index 41f4e69..15681df 100644
--- a/drivers/core/Kconfig
+++ b/drivers/core/Kconfig
@@ -120,4 +120,34 @@  config SPL_SIMPLE_BUS
 	  Supports the 'simple-bus' driver, which is used on some systems
 	  in SPL.
 
+config OF_TRANSLATE
+	bool "Translate addresses using fdt_translate_address"
+	depends on DM && OF_CONTROL
+	default y
+	help
+	  If this option is enabled, the reg property will be translated
+	  using the fdt_translate_address() function. This is necessary
+	  on some platforms (e.g. MVEBU) using complex "ranges"
+	  properties in many nodes, as this translation is not handled
+	  correctly in the default simple_bus_translate() function.
+
+	  If this option is not enabled, simple_bus_translate() will be
+	  used for the address translation. This function is faster and
+	  smaller in size than fdt_translate_address().
+
+config SPL_OF_TRANSLATE
+	bool "Translate addresses using fdt_translate_address in SPL"
+	depends on SPL_DM && SPL_OF_CONTROL
+	default n
+	help
+	  If this option is enabled, the reg property will be translated
+	  using the fdt_translate_address() function. This is necessary
+	  on some platforms (e.g. MVEBU) using complex "ranges"
+	  properties in many nodes, as this translation is not handled
+	  correctly in the default simple_bus_translate() function.
+
+	  If this option is not enabled, simple_bus_translate() will be
+	  used for the address translation. This function is faster and
+	  smaller in size than fdt_translate_address().
+
 endmenu
diff --git a/drivers/core/device.c b/drivers/core/device.c
index 0ccd443..c543203 100644
--- a/drivers/core/device.c
+++ b/drivers/core/device.c
@@ -11,6 +11,7 @@ 
 
 #include <common.h>
 #include <fdtdec.h>
+#include <fdt_support.h>
 #include <malloc.h>
 #include <dm/device.h>
 #include <dm/device-internal.h>
@@ -581,6 +582,25 @@  fdt_addr_t dev_get_addr(struct udevice *dev)
 #if CONFIG_IS_ENABLED(OF_CONTROL)
 	fdt_addr_t addr;
 
+	if (CONFIG_IS_ENABLED(OF_TRANSLATE)) {
+		const fdt32_t *reg;
+
+		reg = fdt_getprop(gd->fdt_blob, dev->of_offset, "reg", NULL);
+		if (!reg)
+			return FDT_ADDR_T_NONE;
+
+		/*
+		 * Use the full-fledged translate function for complex
+		 * bus setups.
+		 */
+		return fdt_translate_address((void *)gd->fdt_blob,
+					     dev->of_offset, reg);
+	}
+
+	/*
+	 * Use the "simple" translate function for less complex
+	 * bus setups.
+	 */
 	addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
 	if (CONFIG_IS_ENABLED(SIMPLE_BUS) && addr != FDT_ADDR_T_NONE) {
 		if (device_get_uclass_id(dev->parent) == UCLASS_SIMPLE_BUS)