diff mbox series

[v3,3/3] aarch64: Add explicit checks for implicit LSE/LSE2 requirements.

Message ID 20240103012828.2446443-4-victor.donascimento@arm.com
State New
Headers show
Series Libatomic: Add LSE128 atomics support for AArch64 | expand

Commit Message

Victor Do Nascimento Jan. 3, 2024, 1:28 a.m. UTC
At present, Evaluation of both `has_lse2(hwcap)' and
`has_lse128(hwcap)' may require issuing an `mrs' instruction to query
a system register.  This instruction, when issued from user-space
results in a trap by the kernel which then returns the value read in
by the system register.  Given the undesirable nature of the
computational expense associated with the context switch, it is
important to implement mechanisms to, wherever possible, forgo the
operation.

In light of this, given how other architectural requirements serving
as prerequisites have long been assigned HWCAP bits by the kernel, we
can inexpensively query for their availability before attempting to
read any system registers.  Where one of these early tests fail, we
can assert that the main feature of interest (be it LSE2 or LSE128)
cannot be present, allowing us to return from the function early and
skip the unnecessary expensive kernel-mediated access to system
registers.

libatomic/ChangeLog:

	* config/linux/aarch64/host-config.h (has_lse2): Add test for LSE.
	(has_lse128): Add test for LSE2.
---
 libatomic/config/linux/aarch64/host-config.h | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Richard Sandiford Jan. 5, 2024, 11:58 a.m. UTC | #1
Victor Do Nascimento <victor.donascimento@arm.com> writes:
> At present, Evaluation of both `has_lse2(hwcap)' and
> `has_lse128(hwcap)' may require issuing an `mrs' instruction to query
> a system register.  This instruction, when issued from user-space
> results in a trap by the kernel which then returns the value read in
> by the system register.  Given the undesirable nature of the
> computational expense associated with the context switch, it is
> important to implement mechanisms to, wherever possible, forgo the
> operation.
>
> In light of this, given how other architectural requirements serving
> as prerequisites have long been assigned HWCAP bits by the kernel, we
> can inexpensively query for their availability before attempting to
> read any system registers.  Where one of these early tests fail, we
> can assert that the main feature of interest (be it LSE2 or LSE128)
> cannot be present, allowing us to return from the function early and
> skip the unnecessary expensive kernel-mediated access to system
> registers.
>
> libatomic/ChangeLog:
>
> 	* config/linux/aarch64/host-config.h (has_lse2): Add test for LSE.
> 	(has_lse128): Add test for LSE2.
> ---
>  libatomic/config/linux/aarch64/host-config.h | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
> index c5485d63855..3be4db6e5f8 100644
> --- a/libatomic/config/linux/aarch64/host-config.h
> +++ b/libatomic/config/linux/aarch64/host-config.h
> @@ -53,8 +53,13 @@
>  static inline bool
>  has_lse2 (unsigned long hwcap)
>  {
> +  /* Check for LSE2.  */
>    if (hwcap & HWCAP_USCAT)
>      return true;
> +  /* No point checking further for atomic 128-bit load/store if LSE
> +     prerequisite not met.  */
> +  if (!(hwcap & HWCAP_ATOMICS))
> +    return false;

This part is OK.

>    if (!(hwcap & HWCAP_CPUID))
>      return false;
>  
> @@ -76,12 +81,14 @@ has_lse2 (unsigned long hwcap)
>  static inline bool
>  has_lse128 (unsigned long hwcap)
>  {
> -  if (!(hwcap & HWCAP_CPUID))
> -    return false;
> +  /* In the absence of HWCAP_CPUID, we are unable to check for LSE128, return.
> +     If feature check available, check LSE2 prerequisite before proceeding.  */
> +  if (!(hwcap & HWCAP_CPUID) || !(hwcap & HWCAP_USCAT))
> +     return false;

The inconsistency feels wrong here.  If we're saying that HWCAP_USCAT
is now so old that we don't need to fall back on CPUID, then it feels
like we should have the courage of our convictions and do the same for
has_lse2.  If instead we still want to support libcs that predate
HWCAP_USCAT, we should do the same here too.

>    unsigned long isar0;
>    asm volatile ("mrs %0, ID_AA64ISAR0_EL1" : "=r" (isar0));
>    if (AT_FEAT_FIELD (isar0) >= 3)
> -    return true;
> +      return true;

The original formatting was correct.

Thanks,
Richard

>    return false;
>  }
Richard Sandiford Jan. 5, 2024, 12:08 p.m. UTC | #2
Richard Sandiford <richard.sandiford@arm.com> writes:
> Victor Do Nascimento <victor.donascimento@arm.com> writes:
>> At present, Evaluation of both `has_lse2(hwcap)' and
>> `has_lse128(hwcap)' may require issuing an `mrs' instruction to query
>> a system register.  This instruction, when issued from user-space
>> results in a trap by the kernel which then returns the value read in
>> by the system register.  Given the undesirable nature of the
>> computational expense associated with the context switch, it is
>> important to implement mechanisms to, wherever possible, forgo the
>> operation.
>>
>> In light of this, given how other architectural requirements serving
>> as prerequisites have long been assigned HWCAP bits by the kernel, we
>> can inexpensively query for their availability before attempting to
>> read any system registers.  Where one of these early tests fail, we
>> can assert that the main feature of interest (be it LSE2 or LSE128)
>> cannot be present, allowing us to return from the function early and
>> skip the unnecessary expensive kernel-mediated access to system
>> registers.
>>
>> libatomic/ChangeLog:
>>
>> 	* config/linux/aarch64/host-config.h (has_lse2): Add test for LSE.
>> 	(has_lse128): Add test for LSE2.
>> ---
>>  libatomic/config/linux/aarch64/host-config.h | 13 ++++++++++---
>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
>> index c5485d63855..3be4db6e5f8 100644
>> --- a/libatomic/config/linux/aarch64/host-config.h
>> +++ b/libatomic/config/linux/aarch64/host-config.h
>> @@ -53,8 +53,13 @@
>>  static inline bool
>>  has_lse2 (unsigned long hwcap)
>>  {
>> +  /* Check for LSE2.  */
>>    if (hwcap & HWCAP_USCAT)
>>      return true;
>> +  /* No point checking further for atomic 128-bit load/store if LSE
>> +     prerequisite not met.  */
>> +  if (!(hwcap & HWCAP_ATOMICS))
>> +    return false;
>
> This part is OK.
>
>>    if (!(hwcap & HWCAP_CPUID))
>>      return false;
>>  
>> @@ -76,12 +81,14 @@ has_lse2 (unsigned long hwcap)
>>  static inline bool
>>  has_lse128 (unsigned long hwcap)
>>  {
>> -  if (!(hwcap & HWCAP_CPUID))
>> -    return false;
>> +  /* In the absence of HWCAP_CPUID, we are unable to check for LSE128, return.
>> +     If feature check available, check LSE2 prerequisite before proceeding.  */
>> +  if (!(hwcap & HWCAP_CPUID) || !(hwcap & HWCAP_USCAT))
>> +     return false;
>
> The inconsistency feels wrong here.  If we're saying that HWCAP_USCAT
> is now so old that we don't need to fall back on CPUID, then it feels
> like we should have the courage of our convictions and do the same for
> has_lse2.  If instead we still want to support libcs that predate
> HWCAP_USCAT, we should do the same here too.

Sorry, scratch that, I'd misread has_lse2.  The CPUID fallback there is
only for Neoverse N1, which we know doesn't support LSE128.

So the patch is OK with the formatting fixed: the returns should be
indented by their original amount.

Thanks,
Richard

>>    unsigned long isar0;
>>    asm volatile ("mrs %0, ID_AA64ISAR0_EL1" : "=r" (isar0));
>>    if (AT_FEAT_FIELD (isar0) >= 3)
>> -    return true;
>> +      return true;
>
> The original formatting was correct.
>
> Thanks,
> Richard
>
>>    return false;
>>  }
diff mbox series

Patch

diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
index c5485d63855..3be4db6e5f8 100644
--- a/libatomic/config/linux/aarch64/host-config.h
+++ b/libatomic/config/linux/aarch64/host-config.h
@@ -53,8 +53,13 @@ 
 static inline bool
 has_lse2 (unsigned long hwcap)
 {
+  /* Check for LSE2.  */
   if (hwcap & HWCAP_USCAT)
     return true;
+  /* No point checking further for atomic 128-bit load/store if LSE
+     prerequisite not met.  */
+  if (!(hwcap & HWCAP_ATOMICS))
+    return false;
   if (!(hwcap & HWCAP_CPUID))
     return false;
 
@@ -76,12 +81,14 @@  has_lse2 (unsigned long hwcap)
 static inline bool
 has_lse128 (unsigned long hwcap)
 {
-  if (!(hwcap & HWCAP_CPUID))
-    return false;
+  /* In the absence of HWCAP_CPUID, we are unable to check for LSE128, return.
+     If feature check available, check LSE2 prerequisite before proceeding.  */
+  if (!(hwcap & HWCAP_CPUID) || !(hwcap & HWCAP_USCAT))
+     return false;
   unsigned long isar0;
   asm volatile ("mrs %0, ID_AA64ISAR0_EL1" : "=r" (isar0));
   if (AT_FEAT_FIELD (isar0) >= 3)
-    return true;
+      return true;
   return false;
 }