Patchwork arm-linux-user: fix elfload.c's AT_HWCAP reflected cpu features.

Submitter Benoit Canet
Date Nov. 9, 2011, 1:04 p.m.
Message ID <1320843872-2402-2-git-send-email-benoit.canet@gmail.com>
Permalink /patch/124545/
State New

Comments

Benoit Canet - Nov. 9, 2011, 1:04 p.m.
The cpu capabilities passed by the elf loader in AT_HWCAP were
a constant.
Make AT_HWCAP reflect the emulated cpu features in order to give
correct clues to eglibc.

Fix :  [Bug 887516] [NEW] VFP support reported for the PXA270

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
---
 linux-user/elfload.c |   31 +++++++++++++++++++++++++++----
 1 files changed, 27 insertions(+), 4 deletions(-)
Andreas Färber - Nov. 9, 2011, 1:34 p.m.
On 09.11.2011 14:04, Benoît Canet wrote:
> The cpu capabilities passed by the elf loader in AT_HWCAP were
> a constant.
> Make AT_HWCAP reflect the emulated cpu features in order to give
> correct clues to eglibc.
> 
> Fix :  [Bug 887516] [NEW] VFP support reported for the PXA270
> 
> Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
> ---
>  linux-user/elfload.c |   31 +++++++++++++++++++++++++++----
>  1 files changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index a413976..5d81ec1 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -375,10 +375,33 @@ bool guest_validate_base(unsigned long guest_base)
>      return 1; /* All good */
>  }
>  
> -#define ELF_HWCAP (ARM_HWCAP_ARM_SWP | ARM_HWCAP_ARM_HALF               \
> -                   | ARM_HWCAP_ARM_THUMB | ARM_HWCAP_ARM_FAST_MULT      \
> -                   | ARM_HWCAP_ARM_FPA | ARM_HWCAP_ARM_VFP              \
> -                   | ARM_HWCAP_ARM_NEON | ARM_HWCAP_ARM_VFPv3 )
> +
> +#define ELF_HWCAP get_elf_hwcap()

I assume ELF_HWCAP is being used in architecture-independent code? Or
would it be feasible to replace all occurrences with the function call?

> +
> +static uint32_t get_elf_hwcap(void)
> +{
> +    CPUState *e = thread_env;
> +    uint32_t hwcaps = 0;
> +
> +    hwcaps |= ARM_HWCAP_ARM_SWP;
> +    hwcaps |= ARM_HWCAP_ARM_HALF;
> +    hwcaps |= ARM_HWCAP_ARM_THUMB;
> +    hwcaps |= ARM_HWCAP_ARM_FAST_MULT;
> +    hwcaps |= ARM_HWCAP_ARM_FPA;
> +
> +    /* prove for the extra features */

probe?

> +#define GET_FEATURE(feat, hwcap) \
> +    do {if (arm_feature(e, feat)) { hwcaps |= hwcap; } } while (0)

This doesn't return anything, it sets the hwcap flag. SET_HWCAP maybe?

> +    GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
> +    GET_FEATURE(ARM_FEATURE_IWMMXT, ARM_HWCAP_ARM_IWMMXT);
> +    GET_FEATURE(ARM_FEATURE_THUMB2EE, ARM_HWCAP_ARM_THUMBEE);
> +    GET_FEATURE(ARM_FEATURE_NEON, ARM_HWCAP_ARM_NEON);
> +    GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
> +    GET_FEATURE(ARM_FEATURE_VFP_FP16, ARM_HWCAP_ARM_VFPv3D16);

I'm wondering if this translation table would be better placed in
target-arm/helper.c where we fiddle with the features in the first
place. We could loop through all features here and call a function that
returns the hwcap or 0 and |= it. Others and I will be adding new
features, and we risk forgetting to adapt this here.
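
One way to read the suggestion above is a lookup table from feature to hwcap bit, walked in a loop. The following is a minimal standalone sketch under that reading, not QEMU code: the `TBL_*` enums, their values, and both helper functions are hypothetical stand-ins for the `arm_features` enum, the `ARM_HWCAP_ARM_*` constants and `arm_feature()`.

```c
#include <stdint.h>

/* Hypothetical feature indices; they stand in for QEMU's arm_features enum. */
enum {
    TBL_FEAT_VFP,
    TBL_FEAT_IWMMXT,
    TBL_FEAT_NEON,
    TBL_FEAT_V7,        /* example of a feature with no hwcap bit */
    TBL_FEAT_COUNT
};

/* Hypothetical hwcap bits; the positions are illustrative only. */
enum {
    TBL_CAP_VFP    = 1 << 0,
    TBL_CAP_IWMMXT = 1 << 1,
    TBL_CAP_NEON   = 1 << 2
};

/* Return the hwcap bit for one feature, or 0 if the feature has none. */
static uint32_t tbl_feature_to_hwcap(int feature)
{
    /* Entries left out of the initializer default to 0. */
    static const uint32_t table[TBL_FEAT_COUNT] = {
        [TBL_FEAT_VFP]    = TBL_CAP_VFP,
        [TBL_FEAT_IWMMXT] = TBL_CAP_IWMMXT,
        [TBL_FEAT_NEON]   = TBL_CAP_NEON
    };

    if (feature < 0 || feature >= TBL_FEAT_COUNT) {
        return 0;
    }
    return table[feature];
}

/* Loop over all features and OR in the hwcap of each one that is set. */
static uint32_t tbl_hwcaps_from_features(uint32_t features)
{
    uint32_t hwcaps = 0;
    int f;

    for (f = 0; f < TBL_FEAT_COUNT; f++) {
        if (features & (1u << f)) {
            hwcaps |= tbl_feature_to_hwcap(f);
        }
    }
    return hwcaps;
}
```

With this shape, a newly added feature only needs one new table entry next to the feature definition; the loop on the linux-user side would not change.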

I was told that VFP_FP16 is *not* VFPv3-D16. Please remove that for now.

> +#undef GET_FEATURE
> +
> +    return hwcaps;
> +}
>  
>  #endif
>  

Andreas
Benoit Canet - Nov. 9, 2011, 1:51 p.m.
> I assume ELF_HWCAP is being used in architecture-independent code? Or
> would it be feasible to replace all occurrences with the function call?


Many architectures in elfload.c hardcode ELF_HWCAP before putting it into the
AT_HWCAP ELF field, which glibc uses to guess the CPU capabilities.
So I don't think it is really arch-independent.
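
For context, this is roughly how a guest program can look at the AT_HWCAP value that the loader places in the auxiliary vector. It is a minimal sketch that assumes a Linux guest with a glibc that provides `getauxval()` (2.16 or later, so newer than this thread); the helper name `auxv_has_hwcap` is made up for the example.

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval() and AT_HWCAP */

/*
 * Return true if every bit in 'mask' is set in the AT_HWCAP entry of the
 * auxiliary vector. getauxval() returns 0 when the entry is missing, in
 * which case only an empty mask matches.
 */
static bool auxv_has_hwcap(unsigned long mask)
{
    unsigned long hwcap = getauxval(AT_HWCAP);

    return (hwcap & mask) == mask;
}
```

A guest would pass the architecture's hwcap bit for the feature it cares about, so a wrongly set bit in AT_HWCAP makes such a check succeed for hardware the emulated CPU does not have.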


> I'm wondering if this translation table would be better placed in
> target-arm/helper.c where we fiddle with the features in the first
> place. We could loop through all features here and call a function that
> returns the hwcap or 0 and |= it. Others and I will be adding new
> features, and we risk forgetting to adapt this here.

I modeled this patch on the PPC behavior. I can rewrite it if needed, but
PPC will need a rewrite too.
Peter Maydell - Nov. 9, 2011, 2:01 p.m.
2011/11/9 Benoît Canet <benoit.canet@gmail.com>:
> +static uint32_t get_elf_hwcap(void)
> +{
> +    CPUState *e = thread_env;
> +    uint32_t hwcaps = 0;
> +
> +    hwcaps |= ARM_HWCAP_ARM_SWP;
> +    hwcaps |= ARM_HWCAP_ARM_HALF;
> +    hwcaps |= ARM_HWCAP_ARM_THUMB;
> +    hwcaps |= ARM_HWCAP_ARM_FAST_MULT;
> +    hwcaps |= ARM_HWCAP_ARM_FPA;

After looking at the Linux kernel code I've changed my mind on this one:
we shouldn't set the FPA hwcap, because we don't model any CPU with FPA
hardware and the kernel doesn't set this hwcap even if it is providing
emulated FPA via nwfpe, so we shouldn't either.

> +    /* prove for the extra features */
> +#define GET_FEATURE(feat, hwcap) \
> +    do {if (arm_feature(e, feat)) { hwcaps |= hwcap; } } while (0)
> +    GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
> +    GET_FEATURE(ARM_FEATURE_IWMMXT, ARM_HWCAP_ARM_IWMMXT);
> +    GET_FEATURE(ARM_FEATURE_THUMB2EE, ARM_HWCAP_ARM_THUMBEE);
> +    GET_FEATURE(ARM_FEATURE_NEON, ARM_HWCAP_ARM_NEON);
> +    GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
> +    GET_FEATURE(ARM_FEATURE_VFP_FP16, ARM_HWCAP_ARM_VFPv3D16);

As Andreas says, this one's wrong. We don't currently implement
any CPUs with VFPv3D16 (ie only 16 double registers) so we never
need to set this hwcap. (ARM_FEATURE_VFP_FP16 means "we implement
half-precision VFP".)

Missing:
   /* Strictly should be ARM_FEATURE_V5TE but we don't distinguish
    * as all our v5 cores are v5TE at the moment
    */
   GET_FEATURE(ARM_FEATURE_V5, ARM_HWCAP_ARM_EDSP);

(although maybe we should just bite the bullet and define the feature
bit...)

> +#undef GET_FEATURE
> +
> +    return hwcaps;
> +}

While we're here we might as well update the hwcaps list
based on the most recent kernel:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob;f=arch/arm/kernel/setup.c;h=7e7977ab994ff92ee4ded30ee728d92ed6c3a520;hb=HEAD#l986

So that's
    ARM_HWCAP_ARM_TLS       = 1 << 14,
    ARM_HWCAP_ARM_VFPv4     = 1 << 15,
    ARM_HWCAP_ARM_IDIVA     = 1 << 16,
    ARM_HWCAP_ARM_IDIVT     = 1 << 17,

and
    GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
    GET_FEATURE(ARM_FEATURE_VFP4, ARM_HWCAP_ARM_VFPv4);
    GET_FEATURE(ARM_FEATURE_ARM_DIV, ARM_HWCAP_ARM_IDIVA);
    GET_FEATURE(ARM_FEATURE_THUMB_DIV, ARM_HWCAP_ARM_IDIVT);
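
Taken together, the changes requested in this review (no FPA, no VFPv3D16, EDSP added, plus the four newer hwcaps) could look roughly like the sketch below. It is a standalone illustration, not the patch itself: the `SK_*` names, the `sketch_cpu` struct and `sketch_has_feature()` are hypothetical stand-ins for `ARM_FEATURE_*`, `ARM_HWCAP_ARM_*`, `CPUState` and `arm_feature()`, and the bit positions are placeholders rather than the kernel's values. The macro uses the `SET_HWCAP` name proposed earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder feature indices (stand-ins for the arm_features enum). */
enum {
    SK_FEAT_VFP, SK_FEAT_IWMMXT, SK_FEAT_THUMB2EE, SK_FEAT_NEON,
    SK_FEAT_VFP3, SK_FEAT_V5, SK_FEAT_V6K, SK_FEAT_VFP4,
    SK_FEAT_ARM_DIV, SK_FEAT_THUMB_DIV
};

/* Placeholder hwcap bits; positions do NOT match the kernel ABI. */
enum {
    SK_HWCAP_SWP     = 1 << 0,  SK_HWCAP_HALF    = 1 << 1,
    SK_HWCAP_THUMB   = 1 << 2,  SK_HWCAP_FAST_MULT = 1 << 3,
    SK_HWCAP_VFP     = 1 << 4,  SK_HWCAP_EDSP    = 1 << 5,
    SK_HWCAP_IWMMXT  = 1 << 6,  SK_HWCAP_THUMBEE = 1 << 7,
    SK_HWCAP_NEON    = 1 << 8,  SK_HWCAP_VFPV3   = 1 << 9,
    SK_HWCAP_TLS     = 1 << 10, SK_HWCAP_VFPV4   = 1 << 11,
    SK_HWCAP_IDIVA   = 1 << 12, SK_HWCAP_IDIVT   = 1 << 13
};

/* Minimal stand-in for the emulated CPU state: one bit per feature. */
struct sketch_cpu {
    uint32_t features;
};

static bool sketch_has_feature(const struct sketch_cpu *cpu, int feat)
{
    return ((cpu->features >> feat) & 1u) != 0;
}

static uint32_t sketch_elf_hwcap(const struct sketch_cpu *cpu)
{
    /* Always-on bits; FPA is deliberately left out. */
    uint32_t hwcaps = SK_HWCAP_SWP | SK_HWCAP_HALF
                    | SK_HWCAP_THUMB | SK_HWCAP_FAST_MULT;

    /* Probe for the optional features. */
#define SET_HWCAP(feat, hwcap) \
    do { if (sketch_has_feature(cpu, feat)) { hwcaps |= (hwcap); } } while (0)
    SET_HWCAP(SK_FEAT_VFP, SK_HWCAP_VFP);
    SET_HWCAP(SK_FEAT_IWMMXT, SK_HWCAP_IWMMXT);
    SET_HWCAP(SK_FEAT_THUMB2EE, SK_HWCAP_THUMBEE);
    SET_HWCAP(SK_FEAT_NEON, SK_HWCAP_NEON);
    SET_HWCAP(SK_FEAT_VFP3, SK_HWCAP_VFPV3);
    SET_HWCAP(SK_FEAT_V5, SK_HWCAP_EDSP);   /* v5 treated as v5TE */
    SET_HWCAP(SK_FEAT_V6K, SK_HWCAP_TLS);
    SET_HWCAP(SK_FEAT_VFP4, SK_HWCAP_VFPV4);
    SET_HWCAP(SK_FEAT_ARM_DIV, SK_HWCAP_IDIVA);
    SET_HWCAP(SK_FEAT_THUMB_DIV, SK_HWCAP_IDIVT);
#undef SET_HWCAP

    return hwcaps;
}
```

For a hypothetical CPU that has only the iwMMXt and v5 feature bits set, this returns the four base bits plus the iwMMXt and EDSP bits, and the VFP bit stays clear.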

-- PMM
Peter Maydell - Nov. 9, 2011, 2:06 p.m.
2011/11/9 Andreas Färber <afaerber@suse.de>:
> I'm wondering if this translation table would be better placed in
> target-arm/helper.c where we fiddle with the features in the first
> place. We could loop through all features here and call a function that
> returns the hwcap or 0 and |= it. Others and I will be adding new
> features, and we risk forgetting to adapt this here.

Hmm. It's really linux-specific so there's a good argument for
leaving it in linux-user. On the other hand I did just add some
extra features (arm div, vfpv4) without fixing the hwcaps so I
see your point...

Maybe we should just have a comment in cpu.h next to the
arm_features enum?

-- PMM
Andreas Färber - Nov. 9, 2011, 2:41 p.m.
On 09.11.2011 15:06, Peter Maydell wrote:
> 2011/11/9 Andreas Färber <afaerber@suse.de>:
>> I'm wondering if this translation table would be better placed in
>> target-arm/helper.c where we fiddle with the features in the first
>> place. We could loop through all features here and call a function that
>> returns the hwcap or 0 and |= it. Others and I will be adding new
>> features, and we risk forgetting to adapt this here.
> 
> Hmm. It's really linux-specific so there's a good argument for
> leaving it in linux-user. On the other hand I did just add some
> extra features (arm div, vfpv4) without fixing the hwcaps so I
> see your point...
> 
> Maybe we should just have a comment in cpu.h next to the
> arm_features enum?

That might do, too.

Andreas

Patch

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a413976..5d81ec1 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -375,10 +375,33 @@  bool guest_validate_base(unsigned long guest_base)
     return 1; /* All good */
 }
 
-#define ELF_HWCAP (ARM_HWCAP_ARM_SWP | ARM_HWCAP_ARM_HALF               \
-                   | ARM_HWCAP_ARM_THUMB | ARM_HWCAP_ARM_FAST_MULT      \
-                   | ARM_HWCAP_ARM_FPA | ARM_HWCAP_ARM_VFP              \
-                   | ARM_HWCAP_ARM_NEON | ARM_HWCAP_ARM_VFPv3 )
+
+#define ELF_HWCAP get_elf_hwcap()
+
+static uint32_t get_elf_hwcap(void)
+{
+    CPUState *e = thread_env;
+    uint32_t hwcaps = 0;
+
+    hwcaps |= ARM_HWCAP_ARM_SWP;
+    hwcaps |= ARM_HWCAP_ARM_HALF;
+    hwcaps |= ARM_HWCAP_ARM_THUMB;
+    hwcaps |= ARM_HWCAP_ARM_FAST_MULT;
+    hwcaps |= ARM_HWCAP_ARM_FPA;
+
+    /* prove for the extra features */
+#define GET_FEATURE(feat, hwcap) \
+    do {if (arm_feature(e, feat)) { hwcaps |= hwcap; } } while (0)
+    GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
+    GET_FEATURE(ARM_FEATURE_IWMMXT, ARM_HWCAP_ARM_IWMMXT);
+    GET_FEATURE(ARM_FEATURE_THUMB2EE, ARM_HWCAP_ARM_THUMBEE);
+    GET_FEATURE(ARM_FEATURE_NEON, ARM_HWCAP_ARM_NEON);
+    GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
+    GET_FEATURE(ARM_FEATURE_VFP_FP16, ARM_HWCAP_ARM_VFPv3D16);
+#undef GET_FEATURE
+
+    return hwcaps;
+}
 
 #endif