Patchwork [RFC,ARM] Cortex A8 Neon description fix.

login
register
mail settings
Submitter Ramana Radhakrishnan
Date Aug. 17, 2010, 3 p.m.
Message ID <1282057209.28522.45.camel@e102325-lin.cambridge.arm.com>
Download mbox | patch
Permalink /patch/61921/
State New
Headers show

Comments

Ramana Radhakrishnan - Aug. 17, 2010, 3 p.m.
Hi, 

So, I've been playing with Neon pipeline descriptions and noticed this with the A8 Neon pipeline description.



Consider the following testcase :

#include <arm_neon.h>

void neon_add(float * __restrict out, float * __restrict a, float * __restrict
b)
{
    float32x2_t tmp1, tmp2;
    tmp1 = vset_lane_f32(*a, tmp1, 0);
    tmp2 = vset_lane_f32(*b, tmp2, 0);
    tmp1 = vadd_f32(tmp1, tmp2);
    *out = vget_lane_f32(tmp1, 0);
}



There are 2 attributes for every pattern in the ARM backend "type" which is by default "alu" for all the insn 
patterns (neon_type which is none for all integer and VFP instructions but set to something for Neon instructions.). 
Because the first reservation unit in the A8 pipeline description defines the reservation for all insns of "type" alu, 
by default all Neon patterns appear to get scheduler reservation behaviour as defined by cortex_a8_default. 

Looking at the output of -fdump-rtl-sched2 from before and after on trunk. It doesn't seem to be using any of the 
Neon functional units defined in cortex-a8-neon.md. The bit that got me interested was the fact that the 
vadd.f32 d16, d17, d16 appears to be scheduled as per the reservation of cortex_a8_default which sounds to be absolutely wrong ! 

There is another option ofcourse to change the default value of the "type" attribute to be none, but that would mean
a careful audit every single pattern in the ARM backend to have the right "alu" "type" rather than just relying on the default
value that we end up giving it depending on alternatives that match. 



With this simple patch now applied - I get :

;;   ======================================================
;;   -- basic block 2 from 42 to 46 -- after reload
;;   ======================================================

;;        0-->    42 r3=0x0                            :cortex_a8_default
;;        0-->    10 r2=[r2]                           :cortex_a8_load_store_1
;;        2-->    18 d16=unspec[r3] 91                 :cortex_a8_neon_perm
;;        2-->     8 r3=[r1]                           :cortex_a8_load_store_1
;;        3-->    20 d17=d16                           :cortex_a8_neon_dp
;;        5-->     9 d16=unspec[r3,d16,0x0] 170        :cortex_a8_neon_perm
;;        6-->    11 d17=unspec[r2,d17,0x0] 170        :cortex_a8_neon_perm
;;        7-->    12 d16=unspec[d16,d17,0x3] 72        :cortex_a8_neon_fadd
;;       12-->    14 [r0]=vec_select                   :cortex_a8_neon_ls_2
;;       13-->    46 return                            :cortex_a8_load_store_1

rather than :

;;   ======================================================
;;   -- basic block 2 from 42 to 45 -- after reload
;;   ======================================================

;;        0-->    42 r3=0x0                            :cortex_a8_default
;;        0-->    10 r2=[r2]                           :cortex_a8_load_store_1
;;        1-->    18 d16=unspec[r3] 91                 :cortex_a8_default
;;        1-->     8 r3=[r1]                           :cortex_a8_load_store_1
;;        2-->    20 d17=d16                           :cortex_a8_default
;;        3-->    11 d17=unspec[r2,d17,0x0] 170        :cortex_a8_default
;;        3-->     9 d16=unspec[r3,d16,0x0] 170        :cortex_a8_default
;;        4-->    12 d16=unspec[d16,d17,0x3] 72        :cortex_a8_default
;;        5-->    14 [r0]=vec_select                   :cortex_a8_default
;;        5-->    45 return                            :cortex_a8_load_store_1
;;      Ready list (final):  


Options ? 

cheers
Ramana

2010-08-17  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

	* config/arm/cortex-a8.md: Fix include of cortex-a8-neon.md
Ramana Radhakrishnan - Aug. 17, 2010, 4:05 p.m.
Ah - ignore this. Just realized that Jie had fixed this on trunk and I'd
been using a 4.5 based compiler. 

cheers
Ramana

Patch

diff --git a/gcc/config/arm/cortex-a8.md b/gcc/config/arm/cortex-a8.md
index e982e04..a351d59 100644
--- a/gcc/config/arm/cortex-a8.md
+++ b/gcc/config/arm/cortex-a8.md
@@ -32,6 +32,10 @@ 
 (define_cpu_unit "cortex_a8_alu0" "cortex_a8")
 (define_cpu_unit "cortex_a8_alu1" "cortex_a8")
 
+;; NEON (including VFP) instructions.
+
+(include "cortex-a8-neon.md")
+
 ;; The usual flow of an instruction through the pipelines.
 (define_reservation "cortex_a8_default"
                     "cortex_a8_alu0|cortex_a8_alu1")
@@ -270,7 +274,4 @@ 
        (eq_attr "type" "call"))
   "cortex_a8_issue_branch")
 
-;; NEON (including VFP) instructions.
-
-(include "cortex-a8-neon.md")