Message ID | 1441343506-28473-1-git-send-email-sr@denx.de |
---|---|
State | Superseded |
Delegated to: | Simon Glass |
+Stephen

Hi Stefan,

On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>
> The current "simple" address translation simple_bus_translate() is not
> working on some platforms (e.g. MVEBU). As here more complex "ranges"
> properties are used in many nodes (multiple tuples etc). This patch
> enables the optional use of the common fdt_translate_address() function
> which handles this translation correctly.
>
> Signed-off-by: Stefan Roese <sr@denx.de>
> Cc: Simon Glass <sjg@chromium.org>
> Cc: Bin Meng <bmeng.cn@gmail.com>
> Cc: Marek Vasut <marex@denx.de>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
> v2:
> - Rework code a bit as suggested by Simon. Also added some comments
>   to make the use of the code paths more clear.

While this works I'm reluctant to commit it as is. The call to
fdt_parent_offset() is very slow.

I wonder if this code should be copied into a new file in
drivers/core/, tidied up and updated to use dev->parent?

Other options:
- Add a library to unflatten the tree - but this would not be very
  useful in SPL or before relocation due to memory/speed constraints
- Add a helper to find a node parent which uses a cached tree scan to
  build a table of previous nodes (or some other means to go backwards
  in the tree)
- Worry about it later and go ahead with this patch

>
>
>  drivers/core/Kconfig  | 30 ++++++++++++++++++++++++++++++
>  drivers/core/device.c | 20 ++++++++++++++++++++
>  2 files changed, 50 insertions(+)
>
> diff --git a/drivers/core/Kconfig b/drivers/core/Kconfig
> index 41f4e69..15681df 100644
> --- a/drivers/core/Kconfig
> +++ b/drivers/core/Kconfig
> @@ -120,4 +120,34 @@ config SPL_SIMPLE_BUS
>            Supports the 'simple-bus' driver, which is used on some systems
>            in SPL.
>
> +config OF_TRANSLATE
> +        bool "Translate addresses using fdt_translate_address"
> +        depends on DM && OF_CONTROL
> +        default y
> +        help
> +          If this option is enabled, the reg property will be translated
> +          using the fdt_translate_address() function. This is necessary
> +          on some platforms (e.g. MVEBU) using complex "ranges"
> +          properties in many nodes. As this translation is not handled
> +          correctly in the default simple_bus_translate() function.
> +
> +          If this option is not enabled, simple_bus_translate() will be
> +          used for the address translation. This function is faster and
> +          smaller in size than fdt_translate_address().
> +
> +config SPL_OF_TRANSLATE
> +        bool "Translate addresses using fdt_translate_address"
> +        depends on SPL_DM && SPL_OF_CONTROL
> +        default n
> +        help
> +          If this option is enabled, the reg property will be translated
> +          using the fdt_translate_address() function. This is necessary
> +          on some platforms (e.g. MVEBU) using complex "ranges"
> +          properties in many nodes. As this translation is not handled
> +          correctly in the default simple_bus_translate() function.
> +
> +          If this option is not enabled, simple_bus_translate() will be
> +          used for the address translation. This function is faster and
> +          smaller in size than fdt_translate_address().
> +
>  endmenu
> diff --git a/drivers/core/device.c b/drivers/core/device.c
> index 0ccd443..c543203 100644
> --- a/drivers/core/device.c
> +++ b/drivers/core/device.c
> @@ -11,6 +11,7 @@
>
>  #include <common.h>
>  #include <fdtdec.h>
> +#include <fdt_support.h>
>  #include <malloc.h>
>  #include <dm/device.h>
>  #include <dm/device-internal.h>
> @@ -581,6 +582,25 @@ fdt_addr_t dev_get_addr(struct udevice *dev)
>  #if CONFIG_IS_ENABLED(OF_CONTROL)
>          fdt_addr_t addr;
>
> +        if (CONFIG_IS_ENABLED(OF_TRANSLATE)) {
> +                const fdt32_t *reg;
> +
> +                reg = fdt_getprop(gd->fdt_blob, dev->of_offset, "reg", NULL);
> +                if (!reg)
> +                        return FDT_ADDR_T_NONE;
> +
> +                /*
> +                 * Use the full-fledged translate function for complex
> +                 * bus setups.
> +                 */
> +                return fdt_translate_address((void *)gd->fdt_blob,
> +                                             dev->of_offset, reg);
> +        }
> +
> +        /*
> +         * Use the "simple" translate function for less complex
> +         * bus setups.
> +         */
>          addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
>          if (CONFIG_IS_ENABLED(SIMPLE_BUS) && addr != FDT_ADDR_T_NONE) {
>                  if (device_get_uclass_id(dev->parent) == UCLASS_SIMPLE_BUS)
> --
> 2.5.1

Regards,
Simon

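One of the options listed above is a helper that builds a table of parent nodes from a single scan, so that a parent lookup does not need to rescan from the start. A minimal sketch of that idea follows. It is not libfdt code: the begin/end token stream, the fixed limits and the name build_parent_table() are all hypothetical, chosen only to show how one pass with a small stack can record every node's parent.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a flattened tree: only "node begins" and
 * "node ends" markers. This is NOT the real FDT encoding.
 */
enum tok { TOK_BEGIN, TOK_END };

#define MAX_NODES 32
#define MAX_DEPTH 16
#define NO_PARENT (-1)

/*
 * Walk the stream once. Nodes are numbered in order of appearance and
 * parent[n] receives the number of node n's parent (NO_PARENT for the
 * root). Returns the node count, or -1 if the stream is malformed or
 * exceeds the fixed limits above.
 */
static int build_parent_table(const enum tok *toks, size_t ntoks,
                              int parent[MAX_NODES])
{
    int stack[MAX_DEPTH];   /* currently open nodes, innermost last */
    int depth = 0;
    int count = 0;

    assert(toks != NULL && parent != NULL);

    for (size_t i = 0; i < ntoks; i++) {
        if (toks[i] == TOK_BEGIN) {
            if (count >= MAX_NODES || depth >= MAX_DEPTH)
                return -1;
            parent[count] = depth ? stack[depth - 1] : NO_PARENT;
            stack[depth++] = count++;
        } else {
            if (depth == 0)
                return -1;  /* end marker without a matching begin */
            depth--;
        }
    }
    return depth == 0 ? count : -1;
}
```

After the one-off scan, each parent lookup is a single array read. The trade-off is the memory for the table, which matters for the SPL and pre-relocation cases mentioned above.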
Hi Simon,

On 09.09.2015 20:07, Simon Glass wrote:
> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>
>> The current "simple" address translation simple_bus_translate() is not
>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>> properties are used in many nodes (multiple tuples etc). This patch
>> enables the optional use of the common fdt_translate_address() function
>> which handles this translation correctly.
>>
>> Signed-off-by: Stefan Roese <sr@denx.de>
>> Cc: Simon Glass <sjg@chromium.org>
>> Cc: Bin Meng <bmeng.cn@gmail.com>
>> Cc: Marek Vasut <marex@denx.de>
>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>> ---
>> v2:
>> - Rework code a bit as suggested by Simon. Also added some comments
>>   to make the use of the code paths more clear.
>
>
> While this works I'm reluctant to commit it as is. The call to
> fdt_parent_offset() is very slow.

You've mentioned this before. But how slow could this function really
be? And it should not be called that often via dev_get_addr(). Usually
only once for each driver in the probe function. Or am I missing
something?

> I wonder if this code should be copied into a new file in
> drivers/core/, tidied up and updated to use dev->parent?

You mean fdt_translate_address()? It references many functions from
fdt_support.c though which we would need to duplicate here as well.

> Other options:
> - Add a library to unflatten the tree - but this would not be very
> useful in SPL or before relocation due to memory/speed constraints
> - Add a helper to find a node parent which uses a cached tree scan to
> build a table of previous nodes (or some other means to go backwards
> in the tree)
> - Worry about it later and go ahead with this patch

I see no problems to defer this patch (or a "better" version of it) to
after this release. The Marvell mvebu DM patches are also not targeted
for this release.

Thanks,
Stefan

Hi Stefan,

On 9 September 2015 at 22:54, Stefan Roese <sr@denx.de> wrote:
> Hi Simon,
>
> On 09.09.2015 20:07, Simon Glass wrote:
>>
>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>
>>>
>>> The current "simple" address translation simple_bus_translate() is not
>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>> properties are used in many nodes (multiple tuples etc). This patch
>>> enables the optional use of the common fdt_translate_address() function
>>> which handles this translation correctly.
>>>
>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>> Cc: Simon Glass <sjg@chromium.org>
>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>> Cc: Marek Vasut <marex@denx.de>
>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>> ---
>>> v2:
>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>   to make the use of the code paths more clear.
>>
>>
>>
>> While this works I'm reluctant to commit it as is. The call to
>> fdt_parent_offset() is very slow.
>
>
> You've mentioned this before. But how slow could this function really be?

It scans the tree from the start. There is no back link.

> And it should not be called that often via dev_get_addr(). Usually only once
> for each driver in the probe function. Or am I missing something?

Sounds correct.

>
>> I wonder if this code should be copied into a new file in
>> drivers/core/, tidied up and updated to use dev->parent?
>
>
> You mean fdt_translate_address()? It references many functions from
> fdt_support.c though which we would need to duplicate here as well.
>

Right. Seems like a pain.

>> Other options:
>> - Add a library to unflatten the tree - but this would not be very
>> useful in SPL or before relocation due to memory/speed constraints
>> - Add a helper to find a node parent which uses a cached tree scan to
>> build a table of previous nodes (or some other means to go backwards
>> in the tree)
>> - Worry about it later and go ahead with this patch
>
>
> I see no problems to defer this patch (or a "better" version of it) to after
> this release. The Marvell mvebu DM patches are also not targeted for this
> release.

OK - and if the time slowdown is not too large then we can just use
this patch, particularly as it is an optional CONFIG. Can you check
how much slower it is to use your new case versus the original code?

Regards,
Simon

Hi Simon,

On 11.09.2015 02:42, Simon Glass wrote:
>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>
>>>>
>>>> The current "simple" address translation simple_bus_translate() is not
>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>> enables the optional use of the common fdt_translate_address() function
>>>> which handles this translation correctly.
>>>>
>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>> Cc: Simon Glass <sjg@chromium.org>
>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>> Cc: Marek Vasut <marex@denx.de>
>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>> ---
>>>> v2:
>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>   to make the use of the code paths more clear.
>>>
>>>
>>>
>>> While this works I'm reluctant to commit it as is. The call to
>>> fdt_parent_offset() is very slow.
>>
>>
>> You've mentioned this before. But how slow could this function really be?
>
> It scans the tree from the start. There is no back link.
>
>> And it should not be called that often via dev_get_addr(). Usually only once
>> for each driver in the probe function. Or am I missing something?
>
> Sounds correct.

So it really shouldn't make a big difference.

>>
>>> I wonder if this code should be copied into a new file in
>>> drivers/core/, tidied up and updated to use dev->parent?
>>
>>
>> You mean fdt_translate_address()? It references many functions from
>> fdt_support.c though which we would need to duplicate here as well.
>>
>
> Right. Seems like a pain.
>
>>> Other options:
>>> - Add a library to unflatten the tree - but this would not be very
>>> useful in SPL or before relocation due to memory/speed constraints
>>> - Add a helper to find a node parent which uses a cached tree scan to
>>> build a table of previous nodes (or some other means to go backwards
>>> in the tree)
>>> - Worry about it later and go ahead with this patch
>>
>>
>> I see no problems to defer this patch (or a "better" version of it) to after
>> this release. The Marvell mvebu DM patches are also not targeted for this
>> release.
>
> OK - and if the time slowdown is not too large then we can just use
> this patch, particularly as it is an optional CONFIG. Can you check
> how much slower it is to use your new case versus the original code?

Marvell MVEBU won't boot without this option enabled. So I can't really
compare it here. Someone with a platform that doesn't need this option
enabled can definitely better do this test and compare the results.

Thanks,
Stefan

On 09/09/2015 11:07 AM, Simon Glass wrote:
> +Stephen
>
> Hi Stefan,
>
> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>
>> The current "simple" address translation simple_bus_translate() is not
>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>> properties are used in many nodes (multiple tuples etc). This patch
>> enables the optional use of the common fdt_translate_address() function
>> which handles this translation correctly.
>>
>> Signed-off-by: Stefan Roese <sr@denx.de>
>> Cc: Simon Glass <sjg@chromium.org>
>> Cc: Bin Meng <bmeng.cn@gmail.com>
>> Cc: Marek Vasut <marex@denx.de>
>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>> ---
>> v2:
>> - Rework code a bit as suggested by Simon. Also added some comments
>>   to make the use of the code paths more clear.
>
>
> While this works I'm reluctant to commit it as is. The call to
> fdt_parent_offset() is very slow.
>
> I wonder if this code should be copied into a new file in
> drivers/core/, tidied up and updated to use dev->parent?
>
> Other options:
> - Add a library to unflatten the tree - but this would not be very
> useful in SPL or before relocation due to memory/speed constraints
> - Add a helper to find a node parent which uses a cached tree scan to
> build a table of previous nodes (or some other means to go backwards
> in the tree)
> - Worry about it later and go ahead with this patch

I haven't looked at the code in detail, but I'm surprised there's a
Kconfig option for this, for either SPL or main U-Boot. In general, this
feature is simply a required part of parsing DT, so surely the code
should always be enabled. Without it, we're only getting lucky if DT
works (lucky the DT doesn't happen to contain a ranges property). Sure
the code does some searching through the DT, and that's slower than not
doing it, but I don't see how we can support DT without parsing DT
correctly. Now admittedly some platforms' DTs happen not to contain
ranges that require this code in practice. However, I feel that's a bit
of a micro-optimization, and a rather error-prone one at that. What if
someone pulls a more complete DT into U-Boot and suddenly the code is
required and they have to spend ages tracking down their problem to
missing functionality in a core DT parsing API - something they'd be
unlikely to initially suspect.

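To make the correctness argument concrete, here is a minimal sketch of one level of "ranges" translation where a bus exposes more than one window (the "multiple tuples" case from the commit message). It is not the U-Boot or libfdt implementation: struct range, ADDR_NONE and translate_one_level() are hypothetical names, and the tuples are assumed to be already decoded into plain integers rather than read as cells sized by #address-cells and #size-cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded "ranges" tuple: <child-base parent-base size>. */
struct range {
    uint64_t child_base;
    uint64_t parent_base;
    uint64_t size;
};

#define ADDR_NONE UINT64_MAX   /* hypothetical "no translation" marker */

/*
 * Translate a child-bus address into the parent's address space by
 * checking every tuple, not just the first one.
 */
static uint64_t translate_one_level(const struct range *r, size_t n,
                                    uint64_t addr)
{
    assert(r != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Subtract first so the bounds check cannot overflow. */
        if (addr >= r[i].child_base &&
            addr - r[i].child_base < r[i].size)
            return r[i].parent_base + (addr - r[i].child_base);
    }
    return ADDR_NONE;
}
```

With two windows, an address that falls in the second one only translates if the second tuple is examined. Code that considers a single tuple would return the wrong result or no result for it, which is the kind of hard-to-trace failure described above.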
Hi Stephen,

On 11.09.2015 19:07, Stephen Warren wrote:
> On 09/09/2015 11:07 AM, Simon Glass wrote:
>> +Stephen
>>
>> Hi Stefan,
>>
>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>
>>> The current "simple" address translation simple_bus_translate() is not
>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>> properties are used in many nodes (multiple tuples etc). This patch
>>> enables the optional use of the common fdt_translate_address() function
>>> which handles this translation correctly.
>>>
>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>> Cc: Simon Glass <sjg@chromium.org>
>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>> Cc: Marek Vasut <marex@denx.de>
>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>> ---
>>> v2:
>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>   to make the use of the code paths more clear.
>>
>>
>> While this works I'm reluctant to commit it as is. The call to
>> fdt_parent_offset() is very slow.
>>
>> I wonder if this code should be copied into a new file in
>> drivers/core/, tidied up and updated to use dev->parent?
>>
>> Other options:
>> - Add a library to unflatten the tree - but this would not be very
>> useful in SPL or before relocation due to memory/speed constraints
>> - Add a helper to find a node parent which uses a cached tree scan to
>> build a table of previous nodes (or some other means to go backwards
>> in the tree)
>> - Worry about it later and go ahead with this patch
>
> I haven't looked at the code in detail, but I'm surprised there's a
> Kconfig option for this, for either SPL or main U-Boot. In general, this
> feature is simply a required part of parsing DT, so surely the code
> should always be enabled. Without it, we're only getting lucky if DT
> works (lucky the DT doesn't happen to contain a ranges property).

Yes. I was also a bit surprised, that this current (limited)
implementation to translate the address worked on the platforms using
this interface right now.

> Sure
> the code does some searching through the DT, and that's slower than not
> doing it, but I don't see how we can support DT without parsing DT
> correctly. Now admittedly some platforms' DTs happen not to contain
> ranges that require this code in practice. However, I feel that's a bit
> of a micro-optimization, and a rather error-prone one at that. What if
> someone pulls a more complete DT into U-Boot and suddenly the code is
> required and they have to spend ages tracking down their problem to
> missing functionality in a core DT parsing API - something they'd be
> unlikely to initially suspect.

Ack. However, I definitely understand Simon's arguments about code size
here. On some platforms with limited RAM for SPL this additional code
for "correct" ranges parsing and address translation might break the
size limit. Not sure how to handle this. At least a comment in the code
would be helpful, explaining that simple_bus_translate() is limited here
in some aspects.

Thanks,
Stefan

Hi Stefan,

On 09/04/2015 01:11 PM, Stefan Roese wrote:
> The current "simple" address translation simple_bus_translate() is not
> working on some platforms (e.g. MVEBU). As here more complex "ranges"
> properties are used in many nodes (multiple tuples etc). This patch
> enables the optional use of the common fdt_translate_address() function
> which handles this translation correctly.
>
> Signed-off-by: Stefan Roese <sr@denx.de>
> Cc: Simon Glass <sjg@chromium.org>
> Cc: Bin Meng <bmeng.cn@gmail.com>
> Cc: Marek Vasut <marex@denx.de>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
> v2:
> - Rework code a bit as suggested by Simon. Also added some comments
>   to make the use of the code paths more clear.
>

It works great on nios2 board. Thanks a lot.

Tested-by: Thomas Chou <thomas@wytron.com.tw>

Best regards,
Thomas Chou

On 09/13/2015 11:25 PM, Stefan Roese wrote:
> Hi Stephen,
>
> On 11.09.2015 19:07, Stephen Warren wrote:
>> On 09/09/2015 11:07 AM, Simon Glass wrote:
>>> +Stephen
>>>
>>> Hi Stefan,
>>>
>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>
>>>> The current "simple" address translation simple_bus_translate() is not
>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>> enables the optional use of the common fdt_translate_address() function
>>>> which handles this translation correctly.
>>>>
>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>> Cc: Simon Glass <sjg@chromium.org>
>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>> Cc: Marek Vasut <marex@denx.de>
>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>> ---
>>>> v2:
>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>   to make the use of the code paths more clear.
>>>
>>>
>>> While this works I'm reluctant to commit it as is. The call to
>>> fdt_parent_offset() is very slow.
>>>
>>> I wonder if this code should be copied into a new file in
>>> drivers/core/, tidied up and updated to use dev->parent?
>>>
>>> Other options:
>>> - Add a library to unflatten the tree - but this would not be very
>>> useful in SPL or before relocation due to memory/speed constraints
>>> - Add a helper to find a node parent which uses a cached tree scan to
>>> build a table of previous nodes (or some other means to go backwards
>>> in the tree)
>>> - Worry about it later and go ahead with this patch
>>
>> I haven't looked at the code in detail, but I'm surprised there's a
>> Kconfig option for this, for either SPL or main U-Boot. In general, this
>> feature is simply a required part of parsing DT, so surely the code
>> should always be enabled. Without it, we're only getting lucky if DT
>> works (lucky the DT doesn't happen to contain a ranges property).
>
> Yes. I was also a bit surprised, that this current (limited)
> implementation to translate the address worked on the platforms using
> this interface right now.
>
>> Sure
>> the code does some searching through the DT, and that's slower than not
>> doing it, but I don't see how we can support DT without parsing DT
>> correctly. Now admittedly some platforms' DTs happen not to contain
>> ranges that require this code in practice. However, I feel that's a bit
>> of a micro-optimization, and a rather error-prone one at that. What if
>> someone pulls a more complete DT into U-Boot and suddenly the code is
>> required and they have to spend ages tracking down their problem to
>> missing functionality in a core DT parsing API - something they'd be
>> unlikely to initially suspect.
>
> Ack. However, I definitely understand Simon's arguments about code size
> here. On some platforms with limited RAM for SPL this additional code
> for "correct" ranges parsing and address translation might break the
> size limit. Not sure how to handle this. At least a comment in the code
> would be helpful, explaining that simple_bus_translate() is limited here
> in some aspects.

So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see
that might be pushing some extremely constrained binaries over a limit
if that function isn't already included in the binary. However, if we
are in that situation, I have a really hard time believing this one
patch/function will be the only issue; we'll constantly be hitting a
wall where we can't fix issues in DT parsing, DT handling, or other code
in these binaries since the fix will bloat the binary too much.

In those cases, I rather question whether DT support is the correct
approach; completely dropping DT support from those binaries would
likely remove large amounts of code and replace it with a tiny amount of
constant data. It seems like that'd be the best approach all around
since it'd head off the issue completely.

Hi Stephen,

On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 09/13/2015 11:25 PM, Stefan Roese wrote:
>>
>> Hi Stephen,
>>
>> On 11.09.2015 19:07, Stephen Warren wrote:
>>>
>>> On 09/09/2015 11:07 AM, Simon Glass wrote:
>>>>
>>>> +Stephen
>>>>
>>>> Hi Stefan,
>>>>
>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>>
>>>>>
>>>>> The current "simple" address translation simple_bus_translate() is not
>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>>> enables the optional use of the common fdt_translate_address() function
>>>>> which handles this translation correctly.
>>>>>
>>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>>> Cc: Simon Glass <sjg@chromium.org>
>>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>>> Cc: Marek Vasut <marex@denx.de>
>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>>> ---
>>>>> v2:
>>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>>   to make the use of the code paths more clear.
>>>>
>>>>
>>>>
>>>> While this works I'm reluctant to commit it as is. The call to
>>>> fdt_parent_offset() is very slow.
>>>>
>>>> I wonder if this code should be copied into a new file in
>>>> drivers/core/, tidied up and updated to use dev->parent?
>>>>
>>>> Other options:
>>>> - Add a library to unflatten the tree - but this would not be very
>>>> useful in SPL or before relocation due to memory/speed constraints
>>>> - Add a helper to find a node parent which uses a cached tree scan to
>>>> build a table of previous nodes (or some other means to go backwards
>>>> in the tree)
>>>> - Worry about it later and go ahead with this patch
>>>
>>>
>>> I haven't looked at the code in detail, but I'm surprised there's a
>>> Kconfig option for this, for either SPL or main U-Boot. In general, this
>>> feature is simply a required part of parsing DT, so surely the code
>>> should always be enabled. Without it, we're only getting lucky if DT
>>> works (lucky the DT doesn't happen to contain a ranges property).
>>
>>
>> Yes. I was also a bit surprised, that this current (limited)
>> implementation to translate the address worked on the platforms using
>> this interface right now.
>>
>>> Sure
>>> the code does some searching through the DT, and that's slower than not
>>> doing it, but I don't see how we can support DT without parsing DT
>>> correctly. Now admittedly some platforms' DTs happen not to contain
>>> ranges that require this code in practice. However, I feel that's a bit
>>> of a micro-optimization, and a rather error-prone one at that. What if
>>> someone pulls a more complete DT into U-Boot and suddenly the code is
>>> required and they have to spend ages tracking down their problem to
>>> missing functionality in a core DT parsing API - something they'd be
>>> unlikely to initially suspect.
>>
>>
>> Ack. However, I definitely understand Simon's arguments about code size
>> here. On some platforms with limited RAM for SPL this additional code
>> for "correct" ranges parsing and address translation might break the
>> size limit. Not sure how to handle this. At least a comment in the code
>> would be helpful, explaining that simple_bus_translate() is limited here
>> in some aspects.
>
>
> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that
> might be pushing some extremely constrained binaries over a limit if that
> function isn't already included in the binary. However, if we are in that
> situation, I have a really hard time believing this one patch/function will
> be the only issue; we'll constantly be hitting a wall where we can't fix
> issues in DT parsing, DT handling, or other code in these binaries since the
> fix will bloat the binary too much.
>
> In those cases, I rather question whether DT support is the correct
> approach; completely dropping DT support from those binaries would likely
> remove large amounts of code and replace it with a tiny amount of constant
> data. It seems like that'd be the best approach all around since it'd head
> off the issue completely.

U-Boot is not Linux - code size is important. We can enable features
when needed.

At present we can enable driver model and device tree with a ~5KB
binary hit including a small device tree. I'd like to keep that down as
low as possible. Otherwise we will end up with SPL being unable to use
driver model / device tree on lots of platforms. As time goes by and
SoCs become more and more complex, this will be a pain. We'll end up
forking the driver model.

Of course trade-offs can change over time but that's the way I see it
at the moment.

Regards,
Simon

On 10/03/2015 06:50 AM, Simon Glass wrote:
> Hi Stephen,
>
> On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 09/13/2015 11:25 PM, Stefan Roese wrote:
>>>
>>> Hi Stephen,
>>>
>>> On 11.09.2015 19:07, Stephen Warren wrote:
>>>>
>>>> On 09/09/2015 11:07 AM, Simon Glass wrote:
>>>>>
>>>>> +Stephen
>>>>>
>>>>> Hi Stefan,
>>>>>
>>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>>>
>>>>>>
>>>>>> The current "simple" address translation simple_bus_translate() is not
>>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>>>> enables the optional use of the common fdt_translate_address() function
>>>>>> which handles this translation correctly.
>>>>>>
>>>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>>>> Cc: Simon Glass <sjg@chromium.org>
>>>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>>>> Cc: Marek Vasut <marex@denx.de>
>>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>>>> ---
>>>>>> v2:
>>>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>>>   to make the use of the code paths more clear.
>>>>>
>>>>>
>>>>>
>>>>> While this works I'm reluctant to commit it as is. The call to
>>>>> fdt_parent_offset() is very slow.
>>>>>
>>>>> I wonder if this code should be copied into a new file in
>>>>> drivers/core/, tidied up and updated to use dev->parent?
>>>>>
>>>>> Other options:
>>>>> - Add a library to unflatten the tree - but this would not be very
>>>>> useful in SPL or before relocation due to memory/speed constraints
>>>>> - Add a helper to find a node parent which uses a cached tree scan to
>>>>> build a table of previous nodes (or some other means to go backwards
>>>>> in the tree)
>>>>> - Worry about it later and go ahead with this patch
>>>>
>>>>
>>>> I haven't looked at the code in detail, but I'm surprised there's a
>>>> Kconfig option for this, for either SPL or main U-Boot. In general, this
>>>> feature is simply a required part of parsing DT, so surely the code
>>>> should always be enabled. Without it, we're only getting lucky if DT
>>>> works (lucky the DT doesn't happen to contain a ranges property).
>>>
>>>
>>> Yes. I was also a bit surprised, that this current (limited)
>>> implementation to translate the address worked on the platforms using
>>> this interface right now.
>>>
>>>> Sure
>>>> the code does some searching through the DT, and that's slower than not
>>>> doing it, but I don't see how we can support DT without parsing DT
>>>> correctly. Now admittedly some platforms' DTs happen not to contain
>>>> ranges that require this code in practice. However, I feel that's a bit
>>>> of a micro-optimization, and a rather error-prone one at that. What if
>>>> someone pulls a more complete DT into U-Boot and suddenly the code is
>>>> required and they have to spend ages tracking down their problem to
>>>> missing functionality in a core DT parsing API - something they'd be
>>>> unlikely to initially suspect.
>>>
>>>
>>> Ack. However, I definitely understand Simon's arguments about code size
>>> here. On some platforms with limited RAM for SPL this additional code
>>> for "correct" ranges parsing and address translation might break the
>>> size limit. Not sure how to handle this. At least a comment in the code
>>> would be helpful, explaining that simple_bus_translate() is limited here
>>> in some aspects.
>>
>>
>> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that
>> might be pushing some extremely constrained binaries over a limit if that
>> function isn't already included in the binary. However, if we are in that
>> situation, I have a really hard time believing this one patch/function will
>> be the only issue; we'll constantly be hitting a wall where we can't fix
>> issues in DT parsing, DT handling, or other code in these binaries since the
>> fix will bloat the binary too much.
>>
>> In those cases, I rather question whether DT support is the correct
>> approach; completely dropping DT support from those binaries would likely
>> remove large amounts of code and replace it with a tiny amount of constant
>> data. It seems like that'd be the best approach all around since it'd head
>> off the issue completely.
>
> U-Boot is not Linux - code size is important. We can enable features
> when needed.

Only if they're not mandatory parts of other features that we've made an
arbitrary decision to use. Correctness trumps optimization in absolutely
all cases.

Hi Stephen,

On 3 October 2015 at 20:17, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 10/03/2015 06:50 AM, Simon Glass wrote:
>> Hi Stephen,
>>
>> On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> On 09/13/2015 11:25 PM, Stefan Roese wrote:
>>>>
>>>> Hi Stephen,
>>>>
>>>> On 11.09.2015 19:07, Stephen Warren wrote:
>>>>>
>>>>> On 09/09/2015 11:07 AM, Simon Glass wrote:
>>>>>>
>>>>>> +Stephen
>>>>>>
>>>>>> Hi Stefan,
>>>>>>
>>>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote:
>>>>>>>
>>>>>>>
>>>>>>> The current "simple" address translation simple_bus_translate() is not
>>>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges"
>>>>>>> properties are used in many nodes (multiple tuples etc). This patch
>>>>>>> enables the optional use of the common fdt_translate_address() function
>>>>>>> which handles this translation correctly.
>>>>>>>
>>>>>>> Signed-off-by: Stefan Roese <sr@denx.de>
>>>>>>> Cc: Simon Glass <sjg@chromium.org>
>>>>>>> Cc: Bin Meng <bmeng.cn@gmail.com>
>>>>>>> Cc: Marek Vasut <marex@denx.de>
>>>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>>>>>>> ---
>>>>>>> v2:
>>>>>>> - Rework code a bit as suggested by Simon. Also added some comments
>>>>>>>   to make the use of the code paths more clear.
>>>>>>
>>>>>>
>>>>>>
>>>>>> While this works I'm reluctant to commit it as is. The call to
>>>>>> fdt_parent_offset() is very slow.
>>>>>>
>>>>>> I wonder if this code should be copied into a new file in
>>>>>> drivers/core/, tidied up and updated to use dev->parent?
>>>>>>
>>>>>> Other options:
>>>>>> - Add a library to unflatten the tree - but this would not be very
>>>>>> useful in SPL or before relocation due to memory/speed constraints
>>>>>> - Add a helper to find a node parent which uses a cached tree scan to
>>>>>> build a table of previous nodes (or some other means to go backwards
>>>>>> in the tree)
>>>>>> - Worry about it later and go ahead with this patch
>>>>>
>>>>>
>>>>> I haven't looked at the code in detail, but I'm surprised there's a
>>>>> Kconfig option for this, for either SPL or main U-Boot. In general, this
>>>>> feature is simply a required part of parsing DT, so surely the code
>>>>> should always be enabled. Without it, we're only getting lucky if DT
>>>>> works (lucky the DT doesn't happen to contain a ranges property).
>>>>
>>>>
>>>> Yes. I was also a bit surprised, that this current (limited)
>>>> implementation to translate the address worked on the platforms using
>>>> this interface right now.
>>>>
>>>>> Sure
>>>>> the code does some searching through the DT, and that's slower than not
>>>>> doing it, but I don't see how we can support DT without parsing DT
>>>>> correctly. Now admittedly some platforms' DTs happen not to contain
>>>>> ranges that require this code in practice. However, I feel that's a bit
>>>>> of a micro-optimization, and a rather error-prone one at that. What if
>>>>> someone pulls a more complete DT into U-Boot and suddenly the code is
>>>>> required and they have to spend ages tracking down their problem to
>>>>> missing functionality in a core DT parsing API - something they'd be
>>>>> unlikely to initially suspect.
>>>>
>>>>
>>>> Ack. However, I definitely understand Simon's arguments about code size
>>>> here. On some platforms with limited RAM for SPL this additional code
>>>> for "correct" ranges parsing and address translation might break the
>>>> size limit. Not sure how to handle this. At least a comment in the code
>>>> would be helpful, explaining that simple_bus_translate() is limited here
>>>> in some aspects.
>>>
>>>
>>> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that
>>> might be pushing some extremely constrained binaries over a limit if that
>>> function isn't already included in the binary. However, if we are in that
>>> situation, I have a really hard time believing this one patch/function will
>>> be the only issue; we'll constantly be hitting a wall where we can't fix
>>> issues in DT parsing, DT handling, or other code in these binaries since the
>>> fix will bloat the binary too much.
>>>
>>> In those cases, I rather question whether DT support is the correct
>>> approach; completely dropping DT support from those binaries would likely
>>> remove large amounts of code and replace it with a tiny amount of constant
>>> data. It seems like that'd be the best approach all around since it'd head
>>> off the issue completely.
>>
>> U-Boot is not Linux - code size is important. We can enable features
>> when needed.
>
> Only if they're not mandatory parts of other features that we've made an
> arbitrary decision to use. Correctness trumps optimization in absolutely
> all cases.

This patch adds the ability to support complex multi-level range
properties for those boards that need it (only one so far). I think it
is a reasonable feature to have. We can perhaps improve the
implementation as I mentioned earlier in this thread, but only at the
cost of more code and development.

The only shortcoming I am aware of is that it moves up the tree looking
for parent nodes, and this involves scanning the device tree repeatedly.
We can address this later if it becomes a performance issue.

While only one platform currently needs this feature, others may follow,
and as you point out if a platform needs this but we do not support it,
then we would be failing to correctly parse valid device tree semantics.
But I can't agree that we must do everything or nothing. One might argue that only the hush parser provides a correct shell, or that simple malloc() does not implement memory allocation correctly, or that only SHA256 is suitable as a hash, or that snprintf() should always check its buffer size, or indeed that printf() should support every format parameter, even in SPL. U-Boot is full of such compromises and that contributes to its flexibility.

There is of course the risk that some poor soul may bring in an updated device tree file for a platform which suddenly starts needing ranges where it did not before. Hopefully they will remember that they changed the device tree, and hopefully after a bit of searching they will find this thread and know to define CONFIG_OF_TRANSLATE. But I am more worried about the hopeful punter who wants to fit things into a small SPL. We should try to make this easy from the start, and allowing some of device tree's less common features to be optional is the lesser of the two evils IMO.

Acked-by: Simon Glass <sjg@chromium.org>

Regards,
Simon
Hi Simon, On 04.10.2015 03:02, Simon Glass wrote: > Hi Stephen, > > On 3 October 2015 at 20:17, Stephen Warren <swarren@wwwdotorg.org> wrote: >> On 10/03/2015 06:50 AM, Simon Glass wrote: >>> Hi Stephen, >>> >>> On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote: >>>> On 09/13/2015 11:25 PM, Stefan Roese wrote: >>>>> >>>>> Hi Stephen, >>>>> >>>>> On 11.09.2015 19:07, Stephen Warren wrote: >>>>>> >>>>>> On 09/09/2015 11:07 AM, Simon Glass wrote: >>>>>>> >>>>>>> +Stephen >>>>>>> >>>>>>> Hi Stefan, >>>>>>> >>>>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote: >>>>>>>> >>>>>>>> >>>>>>>> The current "simple" address translation simple_bus_translate() is not >>>>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges" >>>>>>>> properties are used in many nodes (multiple tuples etc). This patch >>>>>>>> enables the optional use of the common fdt_translate_address() function >>>>>>>> which handles this translation correctly. >>>>>>>> >>>>>>>> Signed-off-by: Stefan Roese <sr@denx.de> >>>>>>>> Cc: Simon Glass <sjg@chromium.org> >>>>>>>> Cc: Bin Meng <bmeng.cn@gmail.com> >>>>>>>> Cc: Marek Vasut <marex@denx.de> >>>>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> >>>>>>>> --- >>>>>>>> v2: >>>>>>>> - Rework code a bit as suggested by Simon. Also added some comments >>>>>>>> to make the use of the code paths more clear. >>>>>>> >>>>>>> >>>>>>> >>>>>>> While this works I'm reluctant to commit it as is. The call to >>>>>>> fdt_parent_offset() is very slow. >>>>>>> >>>>>>> I wonder if this code should be copied into a new file in >>>>>>> drivers/core/, tidied up and updated to use dev->parent? 
>>>>>>> >>>>>>> Other options: >>>>>>> - Add a library to unflatten the tree - but this would not be very >>>>>>> useful in SPL or before relocation due to memory/speed constraints >>>>>>> - Add a helper to find a node parent which uses a cached tree scan to >>>>>>> build a table of previous nodes (or some other means to go backwards >>>>>>> in the tree) >>>>>>> - Worry about it later and go ahead with this patch >>>>>> >>>>>> >>>>>> I haven't looked at the code in detail, but I'm surprised there's a >>>>>> Kconfig option for this, for either SPL or main U-Boot. In general, this >>>>>> feature is simply a required part of parsing DT, so surely the code >>>>>> should always be enabled. Without it, we're only getting lucky if DT >>>>>> works (lucky the DT doesn't happen to contain a ranges property). >>>>> >>>>> >>>>> Yes. I was also a bit surprised, that this current (limited) >>>>> implementation to translate the address worked on the platforms using >>>>> this interface right now. >>>>> >>>>>> Sure >>>>>> the code does some searching through the DT, and that's slower than not >>>>>> doing it, but I don't see how we can support DT without parsing DT >>>>>> correctly. Now admittedly some platforms' DTs happen not to contain >>>>>> ranges that require this code in practice. However, I feel that's a bit >>>>>> of a micro-optimization, and a rather error-prone one at that. What if >>>>>> someone pulls a more complete DT into U-Boot and suddenly the code is >>>>>> required and they have to spend ages tracking down their problem to >>>>>> missing functionality in a core DT parsing API - something they'd be >>>>>> unlikely to initially suspect. >>>>> >>>>> >>>>> Ack. However, I definitely understand Simon's arguments about code size >>>>> here. On some platforms with limited RAM for SPL this additional code >>>>> for "correct" ranges parsing and address translation might break the >>>>> size limit. Not sure how to handle this. 
At least a comment in the code >>>>> would be helpful, explaining that simple_bus_translate() is limited here >>>>> in some aspects. >>>> >>>> >>>> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that >>>> might be pushing some extremely constrained binaries over a limit if that >>>> function isn't already included in the binary. However, if we are in that >>>> situation, I have a really hard time believing this one patch/function will >>>> be the only issue; we'll constantly be hitting a wall where we can't fix >>>> issues in DT parsing, DT handling, or other code in these binaries since the >>>> fix will bloat the binary too much. >>>> >>>> In those cases, I rather question whether DT support is the correct >>>> approach; completely dropping DT support from those binaries would likely >>>> remove large amounts of code and replace it with a tiny amount of constant >>>> data. It seems like that'd be the best approach all around since it'd head >>>> of the issue completely. >>> >>> U-Boot is not Linux - code size is important. We can enable features >>> when needed. >> >> Only if they're not mandatory parts of other features that we've made an >> arbitrary decision to use. Correctness trumps optimization in absolutely >> all cases. > > This patch adds the ability to support complex multi-level range > properties for those boards that need it (only one so far).

It's actually already two platforms, as Thomas Chou also needs this for NIOS (or NIOS2). Thomas, please correct me if I'm wrong.

> I think it > is a reasonable feature to have. We can perhaps improve the > implementation as I mentioned earlier in this thread, but only at the > cost of more code and development. The only shortcoming I am aware of > is that it moves up the tree looking for parent nodes, and this > involves scanning the device tree repeatedly. We can address this > later if it becomes a performance issue.
> > While only one platform currently needs this feature, others may > follow, and as you point out if a platform needs this but we do not > support it, then it would be a failing to correctly parse valid device > tree semantics. But I can't agree that we must do everything or > nothing. One might argue that only the hush parser provides a correct > shell, or that simple malloc() does not implement memory allocation > correctly, or that only SHA256 is suitable as a hash, or that > snprintf() should always check its buffer size, or indeed that prinf() > should support every format parameter, even in SPL. U-Boot is full of > such compromises and that contributes to its flexibility. > > There is of course the risk that some poor soul may bring in an > updated device tree file for a platform which suddenly starts needing > ranges where it did not before. Hopefully they will remember that they > changed the device tree and hopefully after bit of searching they find > this thread and they will know to define CONFIG_OF_TRANSLATE. But I am > more worried about the hopeful punter who wants to fit things into a > small SPL. We should try to make this easy from the start, and > allowing some of device tree's less common features to be optional is > the lesser of the two evils IMO. > > Acked-by: Simon Glass <sjg@chromium.org> Thanks, Stefan
On 10/04/2015 03:35 PM, Stefan Roese wrote: > Hi Simon, > > On 04.10.2015 03:02, Simon Glass wrote: >> Hi Stephen, >> >> On 3 October 2015 at 20:17, Stephen Warren <swarren@wwwdotorg.org> wrote: >>> On 10/03/2015 06:50 AM, Simon Glass wrote: >>>> Hi Stephen, >>>> >>>> On 21 September 2015 at 19:06, Stephen Warren >>>> <swarren@wwwdotorg.org> wrote: >>>>> On 09/13/2015 11:25 PM, Stefan Roese wrote: >>>>>> >>>>>> Hi Stephen, >>>>>> >>>>>> On 11.09.2015 19:07, Stephen Warren wrote: >>>>>>> >>>>>>> On 09/09/2015 11:07 AM, Simon Glass wrote: >>>>>>>> >>>>>>>> +Stephen >>>>>>>> >>>>>>>> Hi Stefan, >>>>>>>> >>>>>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> The current "simple" address translation simple_bus_translate() >>>>>>>>> is not >>>>>>>>> working on some platforms (e.g. MVEBU). As here more complex >>>>>>>>> "ranges" >>>>>>>>> properties are used in many nodes (multiple tuples etc). This >>>>>>>>> patch >>>>>>>>> enables the optional use of the common fdt_translate_address() >>>>>>>>> function >>>>>>>>> which handles this translation correctly. >>>>>>>>> >>>>>>>>> Signed-off-by: Stefan Roese <sr@denx.de> >>>>>>>>> Cc: Simon Glass <sjg@chromium.org> >>>>>>>>> Cc: Bin Meng <bmeng.cn@gmail.com> >>>>>>>>> Cc: Marek Vasut <marex@denx.de> >>>>>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> >>>>>>>>> --- >>>>>>>>> v2: >>>>>>>>> - Rework code a bit as suggested by Simon. Also added some >>>>>>>>> comments >>>>>>>>> to make the use of the code paths more clear. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> While this works I'm reluctant to commit it as is. The call to >>>>>>>> fdt_parent_offset() is very slow. >>>>>>>> >>>>>>>> I wonder if this code should be copied into a new file in >>>>>>>> drivers/core/, tidied up and updated to use dev->parent? 
>>>>>>>> >>>>>>>> Other options: >>>>>>>> - Add a library to unflatten the tree - but this would not be very >>>>>>>> useful in SPL or before relocation due to memory/speed constraints >>>>>>>> - Add a helper to find a node parent which uses a cached tree >>>>>>>> scan to >>>>>>>> build a table of previous nodes (or some other means to go >>>>>>>> backwards >>>>>>>> in the tree) >>>>>>>> - Worry about it later and go ahead with this patch >>>>>>> >>>>>>> >>>>>>> I haven't looked at the code in detail, but I'm surprised there's a >>>>>>> Kconfig option for this, for either SPL or main U-Boot. In >>>>>>> general, this >>>>>>> feature is simply a required part of parsing DT, so surely the code >>>>>>> should always be enabled. Without it, we're only getting lucky if DT >>>>>>> works (lucky the DT doesn't happen to contain a ranges property). >>>>>> >>>>>> >>>>>> Yes. I was also a bit surprised, that this current (limited) >>>>>> implementation to translate the address worked on the platforms using >>>>>> this interface right now. >>>>>> >>>>>>> Sure >>>>>>> the code does some searching through the DT, and that's slower >>>>>>> than not >>>>>>> doing it, but I don't see how we can support DT without parsing DT >>>>>>> correctly. Now admittedly some platforms' DTs happen not to contain >>>>>>> ranges that require this code in practice. However, I feel that's >>>>>>> a bit >>>>>>> of a micro-optimization, and a rather error-prone one at that. >>>>>>> What if >>>>>>> someone pulls a more complete DT into U-Boot and suddenly the >>>>>>> code is >>>>>>> required and they have to spend ages tracking down their problem to >>>>>>> missing functionality in a core DT parsing API - something they'd be >>>>>>> unlikely to initially suspect. >>>>>> >>>>>> >>>>>> Ack. However, I definitely understand Simon's arguments about code >>>>>> size >>>>>> here. 
On some platforms with limited RAM for SPL this additional code >>>>>> for "correct" ranges parsing and address translation might break the >>>>>> size limit. Not sure how to handle this. At least a comment in the >>>>>> code >>>>>> would be helpful, explaining that simple_bus_translate() is >>>>>> limited here >>>>>> in some aspects. >>>>> >>>>> >>>>> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can >>>>> see that >>>>> might be pushing some extremely constrained binaries over a limit >>>>> if that >>>>> function isn't already included in the binary. However, if we are >>>>> in that >>>>> situation, I have a really hard time believing this one >>>>> patch/function will >>>>> be the only issue; we'll constantly be hitting a wall where we >>>>> can't fix >>>>> issues in DT parsing, DT handling, or other code in these binaries >>>>> since the >>>>> fix will bloat the binary too much. >>>>> >>>>> In those cases, I rather question whether DT support is the correct >>>>> approach; completely dropping DT support from those binaries would >>>>> likely >>>>> remove large amounts of code and replace it with a tiny amount of >>>>> constant >>>>> data. It seems like that'd be the best approach all around since >>>>> it'd head >>>>> of the issue completely. >>>> >>>> U-Boot is not Linux - code size is important. We can enable features >>>> when needed. >>> >>> Only if they're not mandatory parts of other features that we've made an >>> arbitrary decision to use. Correctness trumps optimization in absolutely >>> all cases. >> >> This patch adds the ability to support complex multi-level range >> properties for those boards that need it (only one so far). > > Its actually already 2 platforms. As Thomas Chou also needs this for > NIOS (or NIOS2). Thomas, please correct me if I'm wrong. Yes, nios2 and socfpga MUST have this ranges translation. Acked-by: Thomas Chou <thomas@wytron.com.tw> > >> I think it >> is a reasonable feature to have. 
We can perhaps improve the >> implementation as I mentioned earlier in this thread, but only at the >> cost of more code and development. The only shortcoming I am aware of >> is that it moves up the tree looking for parent nodes, and this >> involves scanning the device tree repeatedly. We can address this >> later if it becomes a performance issue. >> >> While only one platform currently needs this feature, others may >> follow, and as you point out if a platform needs this but we do not >> support it, then it would be a failing to correctly parse valid device >> tree semantics. But I can't agree that we must do everything or >> nothing. One might argue that only the hush parser provides a correct >> shell, or that simple malloc() does not implement memory allocation >> correctly, or that only SHA256 is suitable as a hash, or that >> snprintf() should always check its buffer size, or indeed that prinf() >> should support every format parameter, even in SPL. U-Boot is full of >> such compromises and that contributes to its flexibility. >> >> There is of course the risk that some poor soul may bring in an >> updated device tree file for a platform which suddenly starts needing >> ranges where it did not before. Hopefully they will remember that they >> changed the device tree and hopefully after bit of searching they find >> this thread and they will know to define CONFIG_OF_TRANSLATE. But I am >> more worried about the hopeful punter who wants to fit things into a >> small SPL. We should try to make this easy from the start, and >> allowing some of device tree's less common features to be optional is >> the lesser of the two evils IMO. >> >> Acked-by: Simon Glass <sjg@chromium.org> > > Thanks, > Stefan > >
On 10/03/2015 07:02 PM, Simon Glass wrote: > Hi Stephen, > > On 3 October 2015 at 20:17, Stephen Warren <swarren@wwwdotorg.org> wrote: >> On 10/03/2015 06:50 AM, Simon Glass wrote: >>> Hi Stephen, >>> >>> On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote: >>>> On 09/13/2015 11:25 PM, Stefan Roese wrote: >>>>> >>>>> Hi Stephen, >>>>> >>>>> On 11.09.2015 19:07, Stephen Warren wrote: >>>>>> >>>>>> On 09/09/2015 11:07 AM, Simon Glass wrote: >>>>>>> >>>>>>> +Stephen >>>>>>> >>>>>>> Hi Stefan, >>>>>>> >>>>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote: >>>>>>>> >>>>>>>> >>>>>>>> The current "simple" address translation simple_bus_translate() is not >>>>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges" >>>>>>>> properties are used in many nodes (multiple tuples etc). This patch >>>>>>>> enables the optional use of the common fdt_translate_address() function >>>>>>>> which handles this translation correctly. >>>>>>>> >>>>>>>> Signed-off-by: Stefan Roese <sr@denx.de> >>>>>>>> Cc: Simon Glass <sjg@chromium.org> >>>>>>>> Cc: Bin Meng <bmeng.cn@gmail.com> >>>>>>>> Cc: Marek Vasut <marex@denx.de> >>>>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> >>>>>>>> --- >>>>>>>> v2: >>>>>>>> - Rework code a bit as suggested by Simon. Also added some comments >>>>>>>> to make the use of the code paths more clear. >>>>>>> >>>>>>> >>>>>>> >>>>>>> While this works I'm reluctant to commit it as is. The call to >>>>>>> fdt_parent_offset() is very slow. >>>>>>> >>>>>>> I wonder if this code should be copied into a new file in >>>>>>> drivers/core/, tidied up and updated to use dev->parent? 
>>>>>>> >>>>>>> Other options: >>>>>>> - Add a library to unflatten the tree - but this would not be very >>>>>>> useful in SPL or before relocation due to memory/speed constraints >>>>>>> - Add a helper to find a node parent which uses a cached tree scan to >>>>>>> build a table of previous nodes (or some other means to go backwards >>>>>>> in the tree) >>>>>>> - Worry about it later and go ahead with this patch >>>>>> >>>>>> >>>>>> I haven't looked at the code in detail, but I'm surprised there's a >>>>>> Kconfig option for this, for either SPL or main U-Boot. In general, this >>>>>> feature is simply a required part of parsing DT, so surely the code >>>>>> should always be enabled. Without it, we're only getting lucky if DT >>>>>> works (lucky the DT doesn't happen to contain a ranges property). >>>>> >>>>> >>>>> Yes. I was also a bit surprised, that this current (limited) >>>>> implementation to translate the address worked on the platforms using >>>>> this interface right now. >>>>> >>>>>> Sure >>>>>> the code does some searching through the DT, and that's slower than not >>>>>> doing it, but I don't see how we can support DT without parsing DT >>>>>> correctly. Now admittedly some platforms' DTs happen not to contain >>>>>> ranges that require this code in practice. However, I feel that's a bit >>>>>> of a micro-optimization, and a rather error-prone one at that. What if >>>>>> someone pulls a more complete DT into U-Boot and suddenly the code is >>>>>> required and they have to spend ages tracking down their problem to >>>>>> missing functionality in a core DT parsing API - something they'd be >>>>>> unlikely to initially suspect. >>>>> >>>>> >>>>> Ack. However, I definitely understand Simon's arguments about code size >>>>> here. On some platforms with limited RAM for SPL this additional code >>>>> for "correct" ranges parsing and address translation might break the >>>>> size limit. Not sure how to handle this. 
At least a comment in the code >>>>> would be helpful, explaining that simple_bus_translate() is limited here >>>>> in some aspects. >>>> >>>> >>>> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that >>>> might be pushing some extremely constrained binaries over a limit if that >>>> function isn't already included in the binary. However, if we are in that >>>> situation, I have a really hard time believing this one patch/function will >>>> be the only issue; we'll constantly be hitting a wall where we can't fix >>>> issues in DT parsing, DT handling, or other code in these binaries since the >>>> fix will bloat the binary too much. >>>> >>>> In those cases, I rather question whether DT support is the correct >>>> approach; completely dropping DT support from those binaries would likely >>>> remove large amounts of code and replace it with a tiny amount of constant >>>> data. It seems like that'd be the best approach all around since it'd head >>>> of the issue completely. >>> >>> U-Boot is not Linux - code size is important. We can enable features >>> when needed. >> >> Only if they're not mandatory parts of other features that we've made an >> arbitrary decision to use. Correctness trumps optimization in absolutely >> all cases. > > This patch adds the ability to support complex multi-level range > properties for those boards that need it (only one so far). I think it > is a reasonable feature to have. We can perhaps improve the > implementation as I mentioned earlier in this thread, but only at the > cost of more code and development. The only shortcoming I am aware of > is that it moves up the tree looking for parent nodes, and this > involves scanning the device tree repeatedly. We can address this > later if it becomes a performance issue. 
> > While only one platform currently needs this feature, others may > follow, and as you point out if a platform needs this but we do not > support it, then it would be a failing to correctly parse valid device > tree semantics. But I can't agree that we must do everything or > nothing. One might argue that only the hush parser provides a correct > shell, or that simple malloc() does not implement memory allocation > correctly, or that only SHA256 is suitable as a hash, or that > snprintf() should always check its buffer size, or indeed that prinf() > should support every format parameter, even in SPL. U-Boot is full of > such compromises and that contributes to its flexibility.

I believe that a primary difference between the examples above and this DT parsing feature is that the examples above are all different options for implementing a conceptual feature (e.g. different hash algorithms, all of which implement the ability to hash some data), whereas supporting ranges in DT is a (fundamental) part of a single feature (DT support), rather than a different implementation of "parsing DT".
Hi Stephen, On 5 October 2015 at 02:22, Stephen Warren <swarren@wwwdotorg.org> wrote: > On 10/03/2015 07:02 PM, Simon Glass wrote: >> Hi Stephen, >> >> On 3 October 2015 at 20:17, Stephen Warren <swarren@wwwdotorg.org> wrote: >>> On 10/03/2015 06:50 AM, Simon Glass wrote: >>>> Hi Stephen, >>>> >>>> On 21 September 2015 at 19:06, Stephen Warren <swarren@wwwdotorg.org> wrote: >>>>> On 09/13/2015 11:25 PM, Stefan Roese wrote: >>>>>> >>>>>> Hi Stephen, >>>>>> >>>>>> On 11.09.2015 19:07, Stephen Warren wrote: >>>>>>> >>>>>>> On 09/09/2015 11:07 AM, Simon Glass wrote: >>>>>>>> >>>>>>>> +Stephen >>>>>>>> >>>>>>>> Hi Stefan, >>>>>>>> >>>>>>>> On Thursday, 3 September 2015, Stefan Roese <sr@denx.de> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> The current "simple" address translation simple_bus_translate() is not >>>>>>>>> working on some platforms (e.g. MVEBU). As here more complex "ranges" >>>>>>>>> properties are used in many nodes (multiple tuples etc). This patch >>>>>>>>> enables the optional use of the common fdt_translate_address() function >>>>>>>>> which handles this translation correctly. >>>>>>>>> >>>>>>>>> Signed-off-by: Stefan Roese <sr@denx.de> >>>>>>>>> Cc: Simon Glass <sjg@chromium.org> >>>>>>>>> Cc: Bin Meng <bmeng.cn@gmail.com> >>>>>>>>> Cc: Marek Vasut <marex@denx.de> >>>>>>>>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> >>>>>>>>> --- >>>>>>>>> v2: >>>>>>>>> - Rework code a bit as suggested by Simon. Also added some comments >>>>>>>>> to make the use of the code paths more clear. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> While this works I'm reluctant to commit it as is. The call to >>>>>>>> fdt_parent_offset() is very slow. >>>>>>>> >>>>>>>> I wonder if this code should be copied into a new file in >>>>>>>> drivers/core/, tidied up and updated to use dev->parent? 
>>>>>>>> >>>>>>>> Other options: >>>>>>>> - Add a library to unflatten the tree - but this would not be very >>>>>>>> useful in SPL or before relocation due to memory/speed constraints >>>>>>>> - Add a helper to find a node parent which uses a cached tree scan to >>>>>>>> build a table of previous nodes (or some other means to go backwards >>>>>>>> in the tree) >>>>>>>> - Worry about it later and go ahead with this patch >>>>>>> >>>>>>> >>>>>>> I haven't looked at the code in detail, but I'm surprised there's a >>>>>>> Kconfig option for this, for either SPL or main U-Boot. In general, this >>>>>>> feature is simply a required part of parsing DT, so surely the code >>>>>>> should always be enabled. Without it, we're only getting lucky if DT >>>>>>> works (lucky the DT doesn't happen to contain a ranges property). >>>>>> >>>>>> >>>>>> Yes. I was also a bit surprised, that this current (limited) >>>>>> implementation to translate the address worked on the platforms using >>>>>> this interface right now. >>>>>> >>>>>>> Sure >>>>>>> the code does some searching through the DT, and that's slower than not >>>>>>> doing it, but I don't see how we can support DT without parsing DT >>>>>>> correctly. Now admittedly some platforms' DTs happen not to contain >>>>>>> ranges that require this code in practice. However, I feel that's a bit >>>>>>> of a micro-optimization, and a rather error-prone one at that. What if >>>>>>> someone pulls a more complete DT into U-Boot and suddenly the code is >>>>>>> required and they have to spend ages tracking down their problem to >>>>>>> missing functionality in a core DT parsing API - something they'd be >>>>>>> unlikely to initially suspect. >>>>>> >>>>>> >>>>>> Ack. However, I definitely understand Simon's arguments about code size >>>>>> here. On some platforms with limited RAM for SPL this additional code >>>>>> for "correct" ranges parsing and address translation might break the >>>>>> size limit. Not sure how to handle this. 
At least a comment in the code >>>>>> would be helpful, explaining that simple_bus_translate() is limited here >>>>>> in some aspects. >>>>> >>>>> >>>>> So in my AArch64 build, fdt_translate_address is 0x270 bytes. I can see that >>>>> might be pushing some extremely constrained binaries over a limit if that >>>>> function isn't already included in the binary. However, if we are in that >>>>> situation, I have a really hard time believing this one patch/function will >>>>> be the only issue; we'll constantly be hitting a wall where we can't fix >>>>> issues in DT parsing, DT handling, or other code in these binaries since the >>>>> fix will bloat the binary too much. >>>>> >>>>> In those cases, I rather question whether DT support is the correct >>>>> approach; completely dropping DT support from those binaries would likely >>>>> remove large amounts of code and replace it with a tiny amount of constant >>>>> data. It seems like that'd be the best approach all around since it'd head >>>>> of the issue completely. >>>> >>>> U-Boot is not Linux - code size is important. We can enable features >>>> when needed. >>> >>> Only if they're not mandatory parts of other features that we've made an >>> arbitrary decision to use. Correctness trumps optimization in absolutely >>> all cases. >> >> This patch adds the ability to support complex multi-level range >> properties for those boards that need it (only one so far). I think it >> is a reasonable feature to have. We can perhaps improve the >> implementation as I mentioned earlier in this thread, but only at the >> cost of more code and development. The only shortcoming I am aware of >> is that it moves up the tree looking for parent nodes, and this >> involves scanning the device tree repeatedly. We can address this >> later if it becomes a performance issue. 
>> >> While only one platform currently needs this feature, others may >> follow, and as you point out if a platform needs this but we do not >> support it, then it would be a failing to correctly parse valid device >> tree semantics. But I can't agree that we must do everything or >> nothing. One might argue that only the hush parser provides a correct >> shell, or that simple malloc() does not implement memory allocation >> correctly, or that only SHA256 is suitable as a hash, or that >> snprintf() should always check its buffer size, or indeed that prinf() >> should support every format parameter, even in SPL. U-Boot is full of >> such compromises and that contributes to its flexibility. > > I believe that a primary difference between the examples above and this > DT parsing feature are that the examples above are all different options > for implementing a conceptual feature (e.g. different hash algorithms, > all of which implement the ability to hash some data), whereas > supporting ranges in DT is a (fundamental) part of a single feature (DT > support), rather than a different implementation of "parsing DT". There was a discussion about implementing a version of printf() for SPL which just outputs the format string and ignores the parameters. Arguably this fails your test, but is still useful. I don't see that DT parsing is any different. Regards, Simon
diff --git a/drivers/core/Kconfig b/drivers/core/Kconfig
index 41f4e69..15681df 100644
--- a/drivers/core/Kconfig
+++ b/drivers/core/Kconfig
@@ -120,4 +120,34 @@ config SPL_SIMPLE_BUS
 	  Supports the 'simple-bus' driver, which is used on some systems
 	  in SPL.
 
+config OF_TRANSLATE
+	bool "Translate addresses using fdt_translate_address"
+	depends on DM && OF_CONTROL
+	default y
+	help
+	  If this option is enabled, the reg property will be translated
+	  using the fdt_translate_address() function. This is necessary
+	  on some platforms (e.g. MVEBU) using complex "ranges"
+	  properties in many nodes. As this translation is not handled
+	  correctly in the default simple_bus_translate() function.
+
+	  If this option is not enabled, simple_bus_translate() will be
+	  used for the address translation. This function is faster and
+	  smaller in size than fdt_translate_address().
+
+config SPL_OF_TRANSLATE
+	bool "Translate addresses using fdt_translate_address"
+	depends on SPL_DM && SPL_OF_CONTROL
+	default n
+	help
+	  If this option is enabled, the reg property will be translated
+	  using the fdt_translate_address() function. This is necessary
+	  on some platforms (e.g. MVEBU) using complex "ranges"
+	  properties in many nodes. As this translation is not handled
+	  correctly in the default simple_bus_translate() function.
+
+	  If this option is not enabled, simple_bus_translate() will be
+	  used for the address translation. This function is faster and
+	  smaller in size than fdt_translate_address().
+
 endmenu
diff --git a/drivers/core/device.c b/drivers/core/device.c
index 0ccd443..c543203 100644
--- a/drivers/core/device.c
+++ b/drivers/core/device.c
@@ -11,6 +11,7 @@
 
 #include <common.h>
 #include <fdtdec.h>
+#include <fdt_support.h>
 #include <malloc.h>
 #include <dm/device.h>
 #include <dm/device-internal.h>
@@ -581,6 +582,25 @@ fdt_addr_t dev_get_addr(struct udevice *dev)
 #if CONFIG_IS_ENABLED(OF_CONTROL)
 	fdt_addr_t addr;
 
+	if (CONFIG_IS_ENABLED(OF_TRANSLATE)) {
+		const fdt32_t *reg;
+
+		reg = fdt_getprop(gd->fdt_blob, dev->of_offset, "reg", NULL);
+		if (!reg)
+			return FDT_ADDR_T_NONE;
+
+		/*
+		 * Use the full-fledged translate function for complex
+		 * bus setups.
+		 */
+		return fdt_translate_address((void *)gd->fdt_blob,
+					     dev->of_offset, reg);
+	}
+
+	/*
+	 * Use the "simple" translate function for less complex
+	 * bus setups.
+	 */
 	addr = fdtdec_get_addr(gd->fdt_blob, dev->of_offset, "reg");
 	if (CONFIG_IS_ENABLED(SIMPLE_BUS) && addr != FDT_ADDR_T_NONE) {
 		if (device_get_uclass_id(dev->parent) == UCLASS_SIMPLE_BUS)
The current "simple" address translation simple_bus_translate() is not
working on some platforms (e.g. MVEBU), as more complex "ranges"
properties (multiple tuples etc.) are used in many nodes there. This patch
enables the optional use of the common fdt_translate_address() function,
which handles this translation correctly.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Simon Glass <sjg@chromium.org>
Cc: Bin Meng <bmeng.cn@gmail.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
---
v2:
- Rework code a bit as suggested by Simon. Also added some comments
  to make the use of the code paths more clear.

 drivers/core/Kconfig  | 30 ++++++++++++++++++++++++++++++
 drivers/core/device.c | 20 ++++++++++++++++++++
 2 files changed, 50 insertions(+)
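
To make the "multiple tuples" point concrete, here is a minimal sketch of a single level of "ranges" translation. It is not the U-Boot implementation: it assumes every field fits in one 64-bit value (a real tree takes the field widths from #address-cells and #size-cells), and `struct range`, `ADDR_NONE` and `translate_one_level()` are hypothetical names used only for illustration. Each tuple maps a window of child-bus addresses onto the parent bus; translating means finding the window that contains the address and rebasing it. A full translation would repeat this step for each bus level on the way up to the root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_NONE UINT64_MAX	/* stand-in for "no translation found" */

/* One "ranges" tuple: <child-bus-address parent-bus-address length> */
struct range {
	uint64_t child_base;
	uint64_t parent_base;
	uint64_t size;
};

/*
 * Translate addr from the child bus to the parent bus by searching all
 * tuples. Hypothetical helper, not a U-Boot or libfdt function.
 */
uint64_t translate_one_level(const struct range *ranges, size_t count,
			     uint64_t addr)
{
	size_t i;

	assert(ranges || count == 0);

	for (i = 0; i < count; i++) {
		const struct range *r = &ranges[i];

		/* offset check is written to avoid overflow in base + size */
		if (addr >= r->child_base && addr - r->child_base < r->size)
			return r->parent_base + (addr - r->child_base);
	}
	return ADDR_NONE;
}
```

With two windows on one bus, an address in the second window only translates correctly if every tuple is considered, which is the situation the commit message describes.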