diff mbox series

[U-Boot,3/9] SPL: read and store arch property from U-Boot image

Message ID 20190221013034.9099-4-andre.przywara@arm.com
State Deferred
Delegated to: Jagannadha Sutradharudu Teki
Headers show
Series sunxi: Allow FEL capable SPLs with 32bit builds | expand

Commit Message

Andre Przywara Feb. 21, 2019, 1:30 a.m. UTC
Read the specified "arch" value from a legacy or FIT U-Boot image and
store it in our SPL data structure.
This allows loaders to take the target architecture in account for
custom loading procedures.
Having the complete string -> arch mapping for FIT based images in the
SPL would be too big, so we leave it up to architectures (or boards) to
overwrite the weak function that does the actual translation, possibly
covering only the required subset there.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 common/spl/spl.c     | 1 +
 common/spl/spl_fit.c | 8 ++++++++
 include/spl.h        | 3 ++-
 3 files changed, 11 insertions(+), 1 deletion(-)

Comments

Alexander Graf Feb. 21, 2019, 11:47 a.m. UTC | #1
On 02/21/2019 02:30 AM, Andre Przywara wrote:
> Read the specified "arch" value from a legacy or FIT U-Boot image and
> store it in our SPL data structure.
> This allows loaders to take the target architecture in account for
> custom loading procedures.
> Having the complete string -> arch mapping for FIT based images in the
> SPL would be too big, so we leave it up to architectures (or boards) to
> overwrite the weak function that does the actual translation, possibly
> covering only the required subset there.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Reviewed-by: Alexander Graf <agraf@suse.de>

I don't fully buy the argument that the generic mapping would be too big 
though. Realistically you should be able to get away with 1 or 2 
branches per case, no? So that would be maybe 40 instructions?


Alex
Philipp Tomsich Feb. 21, 2019, 11:50 a.m. UTC | #2
> On 21.02.2019, at 12:47, Alexander Graf <agraf@suse.de> wrote:
> 
> On 02/21/2019 02:30 AM, Andre Przywara wrote:
>> Read the specified "arch" value from a legacy or FIT U-Boot image and
>> store it in our SPL data structure.
>> This allows loaders to take the target architecture in account for
>> custom loading procedures.
>> Having the complete string -> arch mapping for FIT based images in the
>> SPL would be too big, so we leave it up to architectures (or boards) to
>> overwrite the weak function that does the actual translation, possibly
>> covering only the required subset there.
>> 
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> 
> Reviewed-by: Alexander Graf <agraf@suse.de>
> 
> I don't fully buy the argument that the generic mapping would be too big though. Realistically you should be able to get away with 1 or 2 branches per case, no? So that would be maybe 40 instructions?

The question I believe is not just code size (should be minimal) but the table
size for the mapping (assuming it’s a table).
Do we have any data on this to understand what order-of-magnitude this
“too big” is?

—Phil.
Andre Przywara Feb. 21, 2019, noon UTC | #3
On Thu, 21 Feb 2019 12:47:04 +0100
Alexander Graf <agraf@suse.de> wrote:

Hi Alex,

> On 02/21/2019 02:30 AM, Andre Przywara wrote:
> > Read the specified "arch" value from a legacy or FIT U-Boot image and
> > store it in our SPL data structure.
> > This allows loaders to take the target architecture in account for
> > custom loading procedures.
> > Having the complete string -> arch mapping for FIT based images in the
> > SPL would be too big, so we leave it up to architectures (or boards) to
> > overwrite the weak function that does the actual translation, possibly
> > covering only the required subset there.
> >
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>  
> 
> Reviewed-by: Alexander Graf <agraf@suse.de>

Thanks for having a look!

> I don't fully buy the argument that the generic mapping would be too big 
> though. Realistically you should be able to get away with 1 or 2 
> branches per case, no? So that would be maybe 40 instructions?

You are apparently not in the Allwinner SPL mindset ;-)
I get excited when I find a way of saving 100 bytes (see the writel_relaxed
patch), so wasting 160 bytes when we will probably never need to return
IH_ARCH_X86_64 doesn't sound right to me.

And Philipp is right: the canonical way would be to use uimage_arch[] from
common/image.c, which is quite big.

We could have a big #ifdef cascade, something like:
+#ifdef CONFIG_ARM
+	if (!strcmp(arch_str, "arm"))
+		return IH_ARCH_ARM;
+
+	if (!strcmp(arch_str, "arm64"))
+		return IH_ARCH_ARM64;
+#endif
+#ifdef CONFIG_X86
+	if (!strcmp(arch_str, "x86"))
+		return IH_ARCH_I386;
+
+	if (!strcmp(arch_str, "x86_64"))
+		return IH_ARCH_X86_64;
+#endif
....

I am just not sure it's worth to introduce this out of the blue *now*,
without use cases and without proper testing.

Cheers,
Andre.
Philipp Tomsich Feb. 21, 2019, 1:22 p.m. UTC | #4
> On 21.02.2019, at 13:00, Andre Przywara <andre.przywara@arm.com> wrote:
> 
> On Thu, 21 Feb 2019 12:47:04 +0100
> Alexander Graf <agraf@suse.de> wrote:
> 
> Hi Alex,
> 
>> On 02/21/2019 02:30 AM, Andre Przywara wrote:
>>> Read the specified "arch" value from a legacy or FIT U-Boot image and
>>> store it in our SPL data structure.
>>> This allows loaders to take the target architecture in account for
>>> custom loading procedures.
>>> Having the complete string -> arch mapping for FIT based images in the
>>> SPL would be too big, so we leave it up to architectures (or boards) to
>>> overwrite the weak function that does the actual translation, possibly
>>> covering only the required subset there.
>>> 
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>  
>> 
>> Reviewed-by: Alexander Graf <agraf@suse.de>
> 
> Thanks for having a look!
> 
>> I don't fully buy the argument that the generic mapping would be too big 
>> though. Realistically you should be able to get away with 1 or 2 
>> branches per case, no? So that would be maybe 40 instructions?
> 
> You are apparently not in the Allwinner SPL mindset ;-)

I am sometimes struggling against the 192KB limit on the RK3399.  So while
my current problems (SPL on the A31 was a different story) may sound like
“luxury” to you, I do understand the pain.

> I get excited when I find a way of saving 100 bytes (see the writel_relaxed
> patch), so wasting 160 bytes when we will probably never need to return
> IH_ARCH_X86_64 doesn't sound right to me.
> 
> And Philipp is right: the canonical way would be to use uimage_arch[] from
> common/image.c, which is quite big.
> 
> We could have a big #ifdef cascade, something like:
> +#ifdef CONFIG_ARM
> +	if (!strcmp(arch_str, "arm"))
> +		return IH_ARCH_ARM;
> +
> +	if (!strcmp(arch_str, "arm64"))
> +		return IH_ARCH_ARM64;
> +#endif
> +#ifdef CONFIG_X86
> +	if (!strcmp(arch_str, , "x86"))
> +		return IH_ARCH_I386;
> +
> +	if (!strcmp(arch_str, "x86_64"))
> +		return IH_ARCH_X86_64;
> +#endif

Could we have a preprocessor-guarded table of the form
	ONLY_ON(ARM)( { “arm”, IH_ARCH_ARM }, )
	ONLY_ON(ARM)( { “arm64”, IH_ARCH_ARM64 }, )
for this?

Then the function would be
	for ( …, curr_element = …, ... ) {
		if (!strcmp(arch_str, curr_element.arch_str))
			return curr_element.ih_value;
which should be less than the hypothetical 40 instructions, I would hope.

Then again, I didn’t send this through a compiler to look at the assembly
generated.  And I have been surprised by code-size on AArch64 before.

I usually am in favor of cleaning up the common code paths, so my personal
preference would be in figuring out a way how we can change uimage_arch[]
to become useful in a space-constrained SPL context.

Phil.
Alexander Graf Feb. 22, 2019, 8:17 a.m. UTC | #5
On 21.02.19 14:22, Philipp Tomsich wrote:
> 
> 
>> On 21.02.2019, at 13:00, Andre Przywara <andre.przywara@arm.com> wrote:
>>
>> On Thu, 21 Feb 2019 12:47:04 +0100
>> Alexander Graf <agraf@suse.de> wrote:
>>
>> Hi Alex,
>>
>>> On 02/21/2019 02:30 AM, Andre Przywara wrote:
>>>> Read the specified "arch" value from a legacy or FIT U-Boot image and
>>>> store it in our SPL data structure.
>>>> This allows loaders to take the target architecture in account for
>>>> custom loading procedures.
>>>> Having the complete string -> arch mapping for FIT based images in the
>>>> SPL would be too big, so we leave it up to architectures (or boards) to
>>>> overwrite the weak function that does the actual translation, possibly
>>>> covering only the required subset there.
>>>>
>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>  
>>>
>>> Reviewed-by: Alexander Graf <agraf@suse.de>
>>
>> Thanks for having a look!
>>
>>> I don't fully buy the argument that the generic mapping would be too big 
>>> though. Realistically you should be able to get away with 1 or 2 
>>> branches per case, no? So that would be maybe 40 instructions?
>>
>> You are apparently not in the Allwinner SPL mindset ;-)
> 
> I am sometimes struggling against the 192KB limit on the RK3399.  So while
> my current problems (SPL on the A31 was a different story) may sound like
> “luxury” to you, I do understand the pain.
> 
>> I get excited when I find a way of saving 100 bytes (see the writel_relaxed
>> patch), so wasting 160 bytes when we will probably never need to return
>> IH_ARCH_X86_64 doesn't sound right to me.
>>
>> And Philipp is right: the canonical way would be to use uimage_arch[] from
>> common/image.c, which is quite big.
>>
>> We could have a big #ifdef cascade, something like:
>> +#ifdef CONFIG_ARM
>> +	if (!strcmp(arch_str, "arm"))
>> +		return IH_ARCH_ARM;
>> +
>> +	if (!strcmp(arch_str, "arm64"))
>> +		return IH_ARCH_ARM64;
>> +#endif
>> +#ifdef CONFIG_X86
>> +	if (!strcmp(arch_str, , "x86"))
>> +		return IH_ARCH_I386;
>> +
>> +	if (!strcmp(arch_str, "x86_64"))
>> +		return IH_ARCH_X86_64;
>> +#endif
> 
> Could we have a preprocessor-guarded table of the form
> 	ONLY_ON(ARM)( { “arm”, IH_ARCH_ARM }, )
> 	ONLY_ON(ARM)( { “arm64”, IH_ARCH_ARM64 }, )
> for this?
> 
> Then the function would be
> 	for ( …, curr_element = …, ... ) {
> 		if (!strcmp(arch_str, curr_element.arch_str))
> 			return curr_element.ih_value;
> which should be less than the hypothetical 40 instructions, I would hope.
> 
> Then again, I didn’t send this through a compiler to look at the assembly
> generated.  And I have been surprised by code-size on AArch64 before.
> 
> I usually am in favor of cleaning up the common code paths, so my personal
> preference would be in figuring out a way how we can change uimage_arch[]
> to become useful in a space-constrained SPL context.

I was actually thinking more along the lines of

#ifdef CONFIG_ARM
  if (arch[0] == 'a') {
    if (arch[3] == '6') /* "arm64" */
      return IH_ARCH_ARM64;
    else
      return IH_ARCH_ARM;
  }
#endif

etc.

If you put that in a generic function that can be used across the board,
you will also less likely have target specific bugs sneak in. And the
resulting code should be much smaller than your strcmp.


Alex
diff mbox series

Patch

diff --git a/common/spl/spl.c b/common/spl/spl.c
index 2e2af1b28e..7d25e72bbb 100644
--- a/common/spl/spl.c
+++ b/common/spl/spl.c
@@ -272,6 +272,7 @@  int spl_parse_image_header(struct spl_image_info *spl_image,
 #endif
 
 		spl_image->os = image_get_os(header);
+		spl_image->arch = image_get_arch(header);
 		spl_image->name = image_get_name(header);
 		debug(SPL_TPL_PROMPT
 		      "payload image: %32s load addr: 0x%lx size: %d\n",
diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c
index db436268cb..d583946571 100644
--- a/common/spl/spl_fit.c
+++ b/common/spl/spl_fit.c
@@ -340,6 +340,11 @@  static int spl_fit_image_get_os(const void *fit, int noffset, uint8_t *os)
 #endif
 }
 
+__weak u8 spl_genimg_get_arch_id(const char *arch_str)
+{
+	return IH_ARCH_DEFAULT;
+}
+
 int spl_load_simple_fit(struct spl_image_info *spl_image,
 			struct spl_load_info *info, ulong sector, void *fit)
 {
@@ -351,6 +356,7 @@  int spl_load_simple_fit(struct spl_image_info *spl_image,
 	int images, ret;
 	int base_offset, hsize, align_len = ARCH_DMA_MINALIGN - 1;
 	int index = 0;
+	const char *arch_str;
 
 	/*
 	 * For FIT with external data, figure out where the external images
@@ -467,6 +473,8 @@  int spl_load_simple_fit(struct spl_image_info *spl_image,
 	else
 		spl_image->os = IH_OS_U_BOOT;
 #endif
+	arch_str = fdt_getprop(fit, node, "arch", NULL);
+	spl_image->arch = spl_genimg_get_arch_id(arch_str);
 
 	/*
 	 * Booting a next-stage U-Boot may require us to append the FDT.
diff --git a/include/spl.h b/include/spl.h
index f09909e189..aac85150a7 100644
--- a/include/spl.h
+++ b/include/spl.h
@@ -64,7 +64,6 @@  static inline bool u_boot_first_phase(void)
 
 struct spl_image_info {
 	const char *name;
-	u8 os;
 	uintptr_t load_addr;
 	uintptr_t entry_point;
 #if CONFIG_IS_ENABLED(LOAD_FIT)
@@ -79,6 +78,8 @@  struct spl_image_info {
 	ulong dcrc_length;
 	ulong dcrc;
 #endif
+	u8 os;
+	u8 arch;
 };
 
 /*