[ARM] Reload register class fix for NEON constants

Message ID 4DB595B3.3080708@ispras.ru
State New

Commit Message

Dmitry Melnik April 25, 2011, 3:39 p.m. UTC
Hi All,

The attached patch changes the reload class for NEON constant vectors
from GENERAL_REGS to NO_REGS, so that they are no longer loaded through
the ARM general registers.
The issue was found in this code from libevas:

void
_op_blend_p_caa_dp(unsigned *s, unsigned *e, unsigned *d, unsigned c) {
     while (d < e) {
          *d = ( (((((*s) >> 8) & 0x00ff00ff) * (c)) & 0xff00ff00) +
                 (((((*s) & 0x00ff00ff) * (c)) >> 8) & 0x00ff00ff) );
          //*d = (*s) & 0x00ff00ff;
          d++;
          s++;
     }
}
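
As a rough illustration of what the loop body computes (this is not part of
the patch), the expression can be applied to a single pixel. The helper name
blend_one_pixel is hypothetical, and reading c as a scale factor where 256
means "unchanged" is an assumption, not something stated above.

```c
#include <stdint.h>

/* Hypothetical helper: the loop body from _op_blend_p_caa_dp applied to one
   32-bit pixel.  The 0x00ff00ff mask selects two of the four 8-bit channels
   at a time, so both can be scaled by c with one multiplication each. */
static uint32_t blend_one_pixel(uint32_t s, uint32_t c)
{
    /* Channels in bits 31..24 and 15..8: shift down, scale, mask back. */
    uint32_t hi = (((s >> 8) & 0x00ff00ffu) * c) & 0xff00ff00u;
    /* Channels in bits 23..16 and 7..0: scale, shift down, mask. */
    uint32_t lo = (((s & 0x00ff00ffu) * c) >> 8) & 0x00ff00ffu;
    return hi + lo;
}
```

Under that assumption, blend_one_pixel(0x80402010, 128) halves every channel
and gives 0x40201008. The 0x00ff00ff mask is the constant that the vectorized
loop has to materialize in a NEON register.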

Original asm:

.L4:
         adr     r8, .L10
         ldmia   r8, {r8-fp}
         ...
         vmov    d22, r8, r9  @ v4si
         vmov    d23, sl, fp
         vand    q12, q8, q11
         ...
         bhi     .L4

.L10:
         .word   16711935 @ 0xff00ff
         .word   16711935
         .word   16711935
         .word   16711935

Fixed asm:

.L4:
         vmov.i16        q11, #255  @ v4si
         ...
         vand    q12, q8, q11
         bhi     .L4
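
The reason a single vmov.i16 can replace the literal pool load is that four
32-bit words of 0x00ff00ff have the same byte image as eight 16-bit lanes of
255. A minimal sketch that checks this on the host (the function name is
hypothetical and the check only models the register contents, it does not
execute any NEON instruction):

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if a v4si constant of 0x00ff00ff words is byte-for-byte
   identical to eight 16-bit lanes each holding 255, which is what
   "vmov.i16 q11, #255" writes into a 128-bit register. */
static int v4si_mask_equals_v8hi_255(void)
{
    uint32_t words[4] = { 0x00ff00ffu, 0x00ff00ffu, 0x00ff00ffu, 0x00ff00ffu };
    uint16_t lanes[8];
    int i;

    for (i = 0; i < 8; i++)
        lanes[i] = 255;

    /* Both arrays are 16 bytes; the comparison holds for either endianness. */
    return memcmp(words, lanes, sizeof words) == 0;
}
```

The same value appears in the .L10 pool above in decimal: 16711935 is
0x00ff00ff.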

This fix yields a 3.7% gain on the (reduced) expedite test suite, and up
to 15% on the affected tests.

Ok for trunk?


--
Best regards,
    Dmitry

Comments

Richard Earnshaw May 4, 2011, 9:53 a.m. UTC | #1
On Mon, 2011-04-25 at 19:39 +0400, Dmitry Melnik wrote:
> 2011-04-22  Sergey Grechanik  <mouseentity@ispras.ru>
>
>         * config/arm/arm.c (coproc_secondary_reload_class): Treat constant
>         vectors the same way as memory locations to prevent loading them
>         through the ARM general registers.

Just say:

	* arm.c (coproc_secondary_reload_class): Return NO_REGS for constant
	vectors.

Otherwise OK.

R.

Patch

2011-04-22  Sergey Grechanik  <mouseentity@ispras.ru>

	* config/arm/arm.c (coproc_secondary_reload_class): Treat constant
	vectors the same way as memory locations to prevent loading them 
	through the ARM general registers.

--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9152,7 +9152,7 @@  coproc_secondary_reload_class (enum machine_mode mode, rtx x, bool wb)
   /* The neon move patterns handle all legitimate vector and struct
      addresses.  */
   if (TARGET_NEON
-      && MEM_P (x)
+      && (MEM_P (x) || GET_CODE (x) == CONST_VECTOR)
       && (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
          || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
          || VALID_NEON_STRUCT_MODE (mode)))
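
The effect of the changed condition can be sketched with a toy model. Every
name below (the toy_ enums and toy_secondary_reload_class) is hypothetical
and greatly simplified: the real coproc_secondary_reload_class works on rtx
values and has further cases that are not shown in this hunk. The model only
captures the behaviour described in the commit message, namely that constant
vectors used to get GENERAL_REGS and now get NO_REGS, like memory operands.

```c
#include <stdbool.h>

/* Toy stand-ins for the rtx codes and register classes involved. */
enum toy_rtx_code { TOY_MEM, TOY_CONST_VECTOR, TOY_OTHER };
enum toy_reg_class { TOY_NO_REGS, TOY_GENERAL_REGS };

/* neon_mode stands for "mode is a vector int, vector float or NEON struct
   mode".  patched selects the condition before or after the change. */
static enum toy_reg_class
toy_secondary_reload_class(bool target_neon, bool neon_mode,
                           enum toy_rtx_code code, bool patched)
{
    bool handled_by_neon_moves =
        code == TOY_MEM || (patched && code == TOY_CONST_VECTOR);

    if (target_neon && neon_mode && handled_by_neon_moves)
        return TOY_NO_REGS;      /* no intermediate register needed */

    /* Simplification: everything else goes through the core registers. */
    return TOY_GENERAL_REGS;
}
```

With patched set to false a TOY_CONST_VECTOR operand yields TOY_GENERAL_REGS,
which corresponds to the ldmia/vmov sequence in the original asm; with
patched set to true it yields TOY_NO_REGS.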