
[AArch64] Set SLOW_BYTE_ACCESS

Message ID DB6PR0801MB20538090D623425906F46EDA832F0@DB6PR0801MB2053.eurprd08.prod.outlook.com
State: New
Series: [AArch64] Set SLOW_BYTE_ACCESS

Commit Message

Wilco Dijkstra Nov. 17, 2017, 3:21 p.m. UTC
Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation on
practically any target.

I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
from GCC as it's confusing and useless.

OK for commit until we get rid of it?

ChangeLog:
2017-11-17  Wilco Dijkstra  <wdijkstr@arm.com>

    gcc/
	* config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--

Comments

James Greenhalgh Nov. 17, 2017, 3:53 p.m. UTC | #1
On Fri, Nov 17, 2017 at 03:21:31PM +0000, Wilco Dijkstra wrote:
> Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> bitfields by their declared type, which results in better code generation on
> practically any target.
> 
> I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
> from GCC as it's confusing and useless.

> -/* Define this macro to be non-zero if accessing less than a word of
> -   memory is no faster than accessing a word of memory, i.e., if such
> -   accesses require more than one instruction or if there is no
> -   difference in cost.
> -   Although there's no difference in instruction count or cycles,
> -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
> -   if we don't have to, for power-saving reasons.  */
> -#define SLOW_BYTE_ACCESS		0
> +/* Contrary to all documentation, this enables wide bitfield accesses,
> +   which results in better code when accessing multiple bitfields.  */
> +#define SLOW_BYTE_ACCESS		1

Why don't we fix the documentation and the name of this macro so we don't
need the comment?

I'd rather us fix the macro to behave in a sensible way than record the
opposite of what we mean just because the documentation is flawed.

Thanks,
James
Wilco Dijkstra May 15, 2018, 1:24 p.m. UTC | #2
ping




From: Wilco Dijkstra
Sent: 17 November 2017 15:21
To: GCC Patches
Cc: nd
Subject: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
  

Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation on
practically any target.

I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
from GCC as it's confusing and useless.

OK for commit until we get rid of it?

ChangeLog:
2017-11-17  Wilco Dijkstra  <wdijkstr@arm.com>

    gcc/
        * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -769,14 +769,9 @@ typedef struct
    if given data not on the nominal alignment.  */
 #define STRICT_ALIGNMENT                TARGET_STRICT_ALIGN
 
-/* Define this macro to be non-zero if accessing less than a word of
-   memory is no faster than accessing a word of memory, i.e., if such
-   accesses require more than one instruction or if there is no
-   difference in cost.
-   Although there's no difference in instruction count or cycles,
-   in AArch64 we don't want to expand to a sub-word to a 64-bit access
-   if we don't have to, for power-saving reasons.  */
-#define SLOW_BYTE_ACCESS               0
+/* Contrary to all documentation, this enables wide bitfield accesses,
+   which results in better code when accessing multiple bitfields.  */
+#define SLOW_BYTE_ACCESS               1
 
 #define NO_FUNCTION_CSE 1
Richard Earnshaw (lists) May 15, 2018, 3:56 p.m. UTC | #3
On 15/05/18 14:24, Wilco Dijkstra wrote:
> 
> ping
> 
> 

I see nothing about you addressing James' comment from 17th November...


> 
> 
> From: Wilco Dijkstra
> Sent: 17 November 2017 15:21
> To: GCC Patches
> Cc: nd
> Subject: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
>   
> 
> Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> bitfields by their declared type, which results in better code generation on
> practically any target.
> 
> I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
> from GCC as it's confusing and useless.
> 
> OK for commit until we get rid of it?
> 
> ChangeLog:
> 2017-11-17  Wilco Dijkstra  <wdijkstr@arm.com>
> 
>     gcc/
>         * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
> --
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -769,14 +769,9 @@ typedef struct
>     if given data not on the nominal alignment.  */
>  #define STRICT_ALIGNMENT                TARGET_STRICT_ALIGN
>  
> -/* Define this macro to be non-zero if accessing less than a word of
> -   memory is no faster than accessing a word of memory, i.e., if such
> -   accesses require more than one instruction or if there is no
> -   difference in cost.
> -   Although there's no difference in instruction count or cycles,
> -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
> -   if we don't have to, for power-saving reasons.  */
> -#define SLOW_BYTE_ACCESS               0
> +/* Contrary to all documentation, this enables wide bitfield accesses,
> +   which results in better code when accessing multiple bitfields.  */
> +#define SLOW_BYTE_ACCESS               1
>  
>  #define NO_FUNCTION_CSE 1
> 
>     
>
Wilco Dijkstra May 15, 2018, 4:01 p.m. UTC | #4
Hi,
 
> I see nothing about you addressing James' comment from 17th November...

I addressed that in a separate patch, see https://patchwork.ozlabs.org/patch/839126/

Wilco
Richard Earnshaw (lists) May 15, 2018, 4:54 p.m. UTC | #5
On 15/05/18 17:01, Wilco Dijkstra wrote:
> Hi,
>  
>> I see nothing about you addressing James' comment from 17th November...
> 
> I addressed that in a separate patch, see https://patchwork.ozlabs.org/patch/839126/
> 
> Wilco
> 

Which doesn't appear to have been approved.  Did you follow up with Jeff?

R.
Wilco Dijkstra May 15, 2018, 4:58 p.m. UTC | #6
Hi,

> Which doesn't appear to have been approved.  Did you follow up with Jeff?

I'll get back to that one at some point - it'll take some time to agree on a way
forward with the callback.

Wilco
Richard Earnshaw (lists) May 16, 2018, 1:48 p.m. UTC | #7
On 15/05/18 17:58, Wilco Dijkstra wrote:
> Hi,
> 
>> Which doesn't appear to have been approved.  Did you follow up with Jeff?
> 
> I'll get back to that one at some point - it'll take some time to agree on a way
> forward with the callback.
> 
> Wilco
>     
> 

So it seems to me that this should then be queued until that is resolved.

R.
Wilco Dijkstra May 16, 2018, 1:56 p.m. UTC | #8
Richard Earnshaw wrote:
 
>>> Which doesn't appear to have been approved.  Did you follow up with Jeff?
>> 
>> I'll get back to that one at some point - it'll take some time to agree on a way
>> forward with the callback.
>> 
>> Wilco
>>     
>> 
>
> So it seems to me that this should then be queued until that is resolved.

Why? The patch as is doesn't at all depend on the resolution of how to improve
the callback. If we stopped all patches until GCC is 100% perfect we'd never
make any progress. 

Wilco
Richard Earnshaw (lists) May 16, 2018, 2:42 p.m. UTC | #9
On 16/05/18 14:56, Wilco Dijkstra wrote:
> Richard Earnshaw wrote:
>  
>>>> Which doesn't appear to have been approved.  Did you follow up with Jeff?
>>>
>>> I'll get back to that one at some point - it'll take some time to agree on a way
>>> forward with the callback.
>>>
>>> Wilco
>>>      
>>>
>>
>> So it seems to me that this should then be queued until that is resolved.
> 
> Why? The patch as is doesn't at all depend on the resolution of how to improve
> the callback. If we stopped all patches until GCC is 100% perfect we'd never
> make any progress. 
> 
> Wilco
> 

Because we don't want to build up technical debt for things that should
and can be fixed properly.

R.
Patch

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -769,14 +769,9 @@ typedef struct
    if given data not on the nominal alignment.  */
 #define STRICT_ALIGNMENT		TARGET_STRICT_ALIGN
 
-/* Define this macro to be non-zero if accessing less than a word of
-   memory is no faster than accessing a word of memory, i.e., if such
-   accesses require more than one instruction or if there is no
-   difference in cost.
-   Although there's no difference in instruction count or cycles,
-   in AArch64 we don't want to expand to a sub-word to a 64-bit access
-   if we don't have to, for power-saving reasons.  */
-#define SLOW_BYTE_ACCESS		0
+/* Contrary to all documentation, this enables wide bitfield accesses,
+   which results in better code when accessing multiple bitfields.  */
+#define SLOW_BYTE_ACCESS		1
 
 #define NO_FUNCTION_CSE	1