diff mbox series

[Aarch64] PR rtl-optimization/87763 - Fix lsl_asr_sbfiz.c test by checking for subregs

Message ID e68f222dcc3cc33cc85bef6c87c2b8fca173943f.camel@marvell.com
State New
Headers show
Series [Aarch64] PR rtl-optimization/87763 - Fix lsl_asr_sbfiz.c test by checking for subregs | expand

Commit Message

Steve Ellcey Jan. 29, 2019, 10:36 p.m. UTC
So the various tests that started failing with r265398 seem to need
different fixes.  This particular fix is for the
gcc.target/aarch64/lsl_asr_sbfiz.c failure.  The problem is that the
instructions we are trying to match to *ashiftsi_extv_bfiz now have
explicit subregs in them where they didn't before.   The new version
is:

(set (reg:SI 93)
    (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 95) 0)
                (const_int 3 [0x3])
                (const_int 0 [0])) 0)
        (const_int 19 [0x13])))


The subreg's were not there before.  My proposed fix is to add an new
instruction like *ashiftsi_extv_bfiz but with the subregs.  This fixes
lsl_asr_sbfiz.c.  Does this seem like the right way to fix this?

Steve Ellcey
sellcey@marvell.com


2018-01-29  Steve Ellcey  <sellcey@marvell.com>

	PR rtl-optimization/87763
	* config/aarch64/aarch64.md (*ashiftsi_extv_bfiz_alt):
	New Instruction.

 ;; zero-extension of the X-reg.

Comments

Andrew Pinski Jan. 29, 2019, 10:51 p.m. UTC | #1
On Tue, Jan 29, 2019 at 2:36 PM Steve Ellcey <sellcey@marvell.com> wrote:
>
> So the various tests that started failing with r265398 seem to need
> different fixes.  This particular fix is for the
> gcc.target/aarch64/lsl_asr_sbfiz.c failure.  The problem is that the
> instructions we are trying to match to *ashiftsi_extv_bfiz now have
> explicit subregs in them where they didn't before.   The new version
> is:
>
> (set (reg:SI 93)
>     (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 95) 0)
>                 (const_int 3 [0x3])
>                 (const_int 0 [0])) 0)
>         (const_int 19 [0x13])))
>
>
> The subreg's were not there before.  My proposed fix is to add an new
> instruction like *ashiftsi_extv_bfiz but with the subregs.  This fixes
> lsl_asr_sbfiz.c.  Does this seem like the right way to fix this?

Seems to me rather this should have been simplified to just:
 (set (reg:SI 93)
     (ashift:SI (sign_extract:SI (reg:SI 95)
                 (const_int 3 [0x3])
                 (const_int 0 [0]))
         (const_int 19 [0x13])))

Because the two subreg cancel each other out.
This would be a thing to add to simplify-rtx.c.

Thanks,
Andrew

>
> Steve Ellcey
> sellcey@marvell.com
>
>
> 2018-01-29  Steve Ellcey  <sellcey@marvell.com>
>
>         PR rtl-optimization/87763
>         * config/aarch64/aarch64.md (*ashiftsi_extv_bfiz_alt):
>         New Instruction.
>
> diff --git a/gcc/config/aarch64/aarch64.md
> b/gcc/config/aarch64/aarch64.md
> index b7f6fe0f135..d65230c4837 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -5531,6 +5531,22 @@
>    [(set_attr "type" "bfx")]
>  )
>
> +(define_insn "*ashiftsi_extv_bfiz_alt"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +       (ashift:SI
> +         (subreg:SI
> +           (sign_extract:DI
> +             (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
> +             (match_operand 2 "aarch64_simd_shift_imm_offset_si" "n")
> +             (const_int 0))
> +           0)
> +         (match_operand 3 "aarch64_simd_shift_imm_si" "n")))]
> +  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
> +            1, GET_MODE_BITSIZE (SImode) - 1)"
> +  "sbfiz\\t%w0, %w1, %3, %2"
> +  [(set_attr "type" "bfx")]
> +)
> +
>  ;; When the bit position and width of the equivalent extraction add up
> to 32
>  ;; we can use a W-reg LSL instruction taking advantage of the implicit
>  ;; zero-extension of the X-reg.
Segher Boessenkool Jan. 29, 2019, 11:59 p.m. UTC | #2
On Tue, Jan 29, 2019 at 02:51:30PM -0800, Andrew Pinski wrote:
> On Tue, Jan 29, 2019 at 2:36 PM Steve Ellcey <sellcey@marvell.com> wrote:
> > So the various tests that started failing with r265398 seem to need
> > different fixes.  This particular fix is for the
> > gcc.target/aarch64/lsl_asr_sbfiz.c failure.  The problem is that the
> > instructions we are trying to match to *ashiftsi_extv_bfiz now have
> > explicit subregs in them where they didn't before.   The new version
> > is:
> >
> > (set (reg:SI 93)
> >     (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 95) 0)
> >                 (const_int 3 [0x3])
> >                 (const_int 0 [0])) 0)
> >         (const_int 19 [0x13])))
> >
> >
> > The subreg's were not there before.  My proposed fix is to add an new
> > instruction like *ashiftsi_extv_bfiz but with the subregs.  This fixes
> > lsl_asr_sbfiz.c.  Does this seem like the right way to fix this?
> 
> Seems to me rather this should have been simplified to just:
>  (set (reg:SI 93)
>      (ashift:SI (sign_extract:SI (reg:SI 95)
>                  (const_int 3 [0x3])
>                  (const_int 0 [0]))
>          (const_int 19 [0x13])))

Yes.

> Because the two subreg cancel each other out.

Well, why did it ever think of using DI at all?

> This would be a thing to add to simplify-rtx.c.

This is probably specific to combine actually.


Segher
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64.md
b/gcc/config/aarch64/aarch64.md
index b7f6fe0f135..d65230c4837 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5531,6 +5531,22 @@ 
   [(set_attr "type" "bfx")]
 )
 
+(define_insn "*ashiftsi_extv_bfiz_alt"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(ashift:SI
+	  (subreg:SI
+	    (sign_extract:DI
+	      (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
+	      (match_operand 2 "aarch64_simd_shift_imm_offset_si" "n")
+	      (const_int 0))
+	    0)
+	  (match_operand 3 "aarch64_simd_shift_imm_si" "n")))]
+  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
+	     1, GET_MODE_BITSIZE (SImode) - 1)"
+  "sbfiz\\t%w0, %w1, %3, %2"
+  [(set_attr "type" "bfx")]
+)
+
 ;; When the bit position and width of the equivalent extraction add up
to 32
 ;; we can use a W-reg LSL instruction taking advantage of the implicit