Patchwork Tighten ARM's CANNOT_CHANGE_MODE_CLASS

login
register
mail settings
Submitter Richard Sandiford
Date March 24, 2011, 3:40 p.m.
Message ID <g47hbov856.fsf@linaro.org>
Download mbox | patch
Permalink /patch/88216/
State New
Headers show

Comments

Richard Sandiford - March 24, 2011, 3:40 p.m.
We currently generate very poor code for tests like:

#include <arm_neon.h>

void
foo (uint32_t *a, uint32_t *b, uint32_t *c)
{
  uint32x4x3_t x, y;

  x = vld3q_u32 (a);
  y = vld3q_u32 (b);
  x.val[0] = vaddq_u32 (x.val[0], y.val[0]);
  x.val[1] = vaddq_u32 (x.val[1], y.val[1]);
  x.val[2] = vaddq_u32 (x.val[2], y.val[2]);
  vst3q_u32 (a, x);
}

This is because we force the uint32x4x3_t values to the stack and
then load and store the individual vectors.

What we actually want is for the uint32x4x3_t values to be stored
in registers, and for the individual vectors to be accessed as
subregs of those registers.  The first part involves some middle-end
mode changes (see recent gcc@ thread), while the second part requires
a change to ARM's CANNOT_CHANGE_MODE_CLASS.

CANNOT_CHANGE_MODE_CLASS is defined as:

/* FPA registers can't do subreg as all values are reformatted to internal
   precision.  VFP registers may only be accessed in the mode they
   were set.  */
#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)		\
   ? reg_classes_intersect_p (FPA_REGS, (CLASS))	\
     || reg_classes_intersect_p (VFP_REGS, (CLASS))	\
   : 0)

But this VFP restriction appears to apply only to VFPv1; thanks to
Peter Maydell for the archaeology.

Tested on arm-linux-gnueabi.  OK to install?

This doesn't have any direct benefit without the middle-end mode change,
but it needs to go in first in order for that change not to regress.

Richard


gcc/
	* config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Restrict FPA_REGS
	case to VFPv1.
Richard Earnshaw - March 24, 2011, 4:47 p.m.
On Thu, 2011-03-24 at 15:40 +0000, Richard Sandiford wrote:
> We currently generate very poor code for tests like:
> 
> #include <arm_neon.h>
> 
> void
> foo (uint32_t *a, uint32_t *b, uint32_t *c)
> {
>   uint32x4x3_t x, y;
> 
>   x = vld3q_u32 (a);
>   y = vld3q_u32 (b);
>   x.val[0] = vaddq_u32 (x.val[0], y.val[0]);
>   x.val[1] = vaddq_u32 (x.val[1], y.val[1]);
>   x.val[2] = vaddq_u32 (x.val[2], y.val[2]);
>   vst3q_u32 (a, x);
> }
> 
> This is because we force the uint32x4x3_t values to the stack and
> then load and store the individual vectors.
> 
> What we actually want is for the uint32x4x3_t values to be stored
> in registers, and for the individual vectors to be accessed as
> subregs of those registers.  The first part involves some middle-end
> mode changes (see recent gcc@ thread), while the second part requires
> a change to ARM's CANNOT_CHANGE_MODE_CLASS.
> 
> CANNOT_CHANGE_MODE_CLASS is defined as:
> 
> /* FPA registers can't do subreg as all values are reformatted to internal
>    precision.  VFP registers may only be accessed in the mode they
>    were set.  */
> #define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
>   (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)		\
>    ? reg_classes_intersect_p (FPA_REGS, (CLASS))	\
>      || reg_classes_intersect_p (VFP_REGS, (CLASS))	\
>    : 0)
> 
> But this VFP restriction appears to apply only to VFPv1; thanks to
> Peter Maydell for the archaeology.
> 
> Tested on arm-linux-gnueabi.  OK to install?
> 
> This doesn't have any direct benefit without the middle-end mode change,
> but it needs to go in first in order for that change not to regress.
> 
> Richard
> 
> 
> gcc/
> 	* config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Restrict FPA_REGS
> 	case to VFPv1.
> 

GCC doesn't support VFPv1 (see the all_fpus table), and I don't think
many chips based on that ever escaped into the wild world, so I'm not
worried about trying to add that now.

So it's probably safe to just kill that check for VFP entirely.

R.

Patch

Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	2011-03-24 13:47:14.000000000 +0000
+++ gcc/config/arm/arm.h	2011-03-24 15:26:19.000000000 +0000
@@ -1167,12 +1167,14 @@  #define IRA_COVER_CLASSES						     \
 }
 
 /* FPA registers can't do subreg as all values are reformatted to internal
-   precision.  VFP registers may only be accessed in the mode they
+   precision.  VFPv1 registers may only be accessed in the mode they
    were set.  */
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
-  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)		\
-   ? reg_classes_intersect_p (FPA_REGS, (CLASS))	\
-     || reg_classes_intersect_p (VFP_REGS, (CLASS))	\
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)		\
+  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)			\
+   ? (reg_classes_intersect_p (FPA_REGS, (CLASS))		\
+      || (TARGET_VFP						\
+	  && arm_fpu_desc->rev == 1				\
+	  && reg_classes_intersect_p (VFP_REGS, (CLASS))))	\
    : 0)
 
 /* The class value for index registers, and the one for base regs.  */