diff mbox

[ARM] Tie operand 1 to operand 0 in AESMC pattern when fusing AES/AESMC

Message ID 57484FA6.1060600@foss.arm.com
State New
Headers show

Commit Message

Kyrill Tkachov May 27, 2016, 1:46 p.m. UTC
On 20/05/16 11:04, Kyrill Tkachov wrote:
> Hi all,
>
> The recent -frename-registers change exposed a deficiency in the way we fuse AESE/AESMC instruction
> pairs in arm.
>
> Basically we want to enforce:
>     AESE Vn, _
>     AESMC Vn, Vn
>
> to enable the fusion, but regrename comes along and renames the output Vn register in AESMC to something
> else, killing the fusion in the hardware.
>
> The solution in this patch is to add an alternative that ties the input and output registers in the AESMC pattern
> and enable that alternative when the fusion is enabled.
>
> With this patch I've confirmed that the above preferred register sequence is kept even with -frename-registers
> when tuning for a cpu that enables the fusion and that the chain is broken by regrename otherwise and have
> seen the appropriate improvement in a proprietary benchmark (that I cannot name) that exercises this sequence.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf.
>
> Ok for trunk?
>

Following James's feedback on the AArch64 version, this slightly modified version uses the enum type for the argument of the new function.
Is this ok instead?

Thanks,
Kyrill

2016-05-27  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/arm/arm.c (arm_fusion_enabled_p): New function.
     * config/arm/arm-protos.h (arm_fusion_enabled_p): Declare prototype.
     * config/arm/crypto.md (crypto_<crypto_pattern>, CRYPTO_UNARY):
     Add "=w,0" alternative.  Enable it when AES/AESMC fusion is enabled.

Comments

Ramana Radhakrishnan June 1, 2016, 9:02 a.m. UTC | #1
On 27/05/16 14:46, Kyrill Tkachov wrote:
> 
> On 20/05/16 11:04, Kyrill Tkachov wrote:
>> Hi all,
>>
>> The recent -frename-registers change exposed a deficiency in the way we fuse AESE/AESMC instruction
>> pairs in arm.
>>
>> Basically we want to enforce:
>>     AESE Vn, _
>>     AESMC Vn, Vn
>>
>> to enable the fusion, but regrename comes along and renames the output Vn register in AESMC to something
>> else, killing the fusion in the hardware.
>>
>> The solution in this patch is to add an alternative that ties the input and output registers in the AESMC pattern
>> and enable that alternative when the fusion is enabled.
>>
>> With this patch I've confirmed that the above preferred register sequence is kept even with -frename-registers
>> when tuning for a cpu that enables the fusion and that the chain is broken by regrename otherwise and have
>> seen the appropriate improvement in a proprietary benchmark (that I cannot name) that exercises this sequence.
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf.
>>
>> Ok for trunk?
>>
> 
> Following James's feedback on the AArch64 version, this slightly modified version uses the enum type for the argument of the new function.
> Is this ok instead?
> 
> Thanks,
> Kyrill
> 
> 2016-05-27  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * config/arm/arm.c (arm_fusion_enabled_p): New function.
>     * config/arm/arm-protos.h (arm_fusion_enabled_p): Declare prototype.
>     * config/arm/crypto.md (crypto_<crypto_pattern>, CRYPTO_UNARY):
>     Add "=w,0" alternative.  Enable it when AES/AESMC fusion is enabled.
> 


Ok.

ramana
diff mbox

Patch

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index cf221d6793eaf0959f2713fe0903a5d8602ec2f4..12a781de13f2f7816cc2b16b04835d87c83f7abb 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -320,6 +320,7 @@  extern int vfp3_const_double_for_bits (rtx);
 
 extern void arm_emit_coreregs_64bit_shift (enum rtx_code, rtx, rtx, rtx, rtx,
 					   rtx);
+extern bool arm_fusion_enabled_p (tune_params::fuse_ops);
 extern bool arm_valid_symbolic_address_p (rtx);
 extern bool arm_validize_comparison (rtx *, rtx *, rtx *);
 #endif /* RTX_CODE */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5110d9e989d605a9e2c262e6007b89a1c7dc7080..39a24c06c123b86883134368ef39794abf11898b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -29704,6 +29704,13 @@  aarch_macro_fusion_pair_p (rtx_insn* prev, rtx_insn* curr)
   return false;
 }
 
+/* Return true iff the instruction fusion described by OP is enabled.  */
+bool
+arm_fusion_enabled_p (tune_params::fuse_ops op)
+{
+  return current_tune->fusible_ops & op;
+}
+
 /* Implement the TARGET_ASAN_SHADOW_OFFSET hook.  */
 
 static unsigned HOST_WIDE_INT
diff --git a/gcc/config/arm/crypto.md b/gcc/config/arm/crypto.md
index c6f17270b1dbaf6dc43eb1e9b8a182dbb0f5a1e1..0f510f069408471fcbf6751f161e984f39929813 100644
--- a/gcc/config/arm/crypto.md
+++ b/gcc/config/arm/crypto.md
@@ -18,14 +18,27 @@ 
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
+
+;; When AES/AESMC fusion is enabled we want the register allocation to
+;; look like:
+;;    AESE Vn, _
+;;    AESMC Vn, Vn
+;; So prefer to tie operand 1 to operand 0 when fusing.
+
 (define_insn "crypto_<crypto_pattern>"
-  [(set (match_operand:<crypto_mode> 0 "register_operand" "=w")
+  [(set (match_operand:<crypto_mode> 0 "register_operand" "=w,w")
         (unspec:<crypto_mode> [(match_operand:<crypto_mode> 1
-                       "register_operand" "w")]
+                       "register_operand" "0,w")]
          CRYPTO_UNARY))]
   "TARGET_CRYPTO"
   "<crypto_pattern>.<crypto_size_sfx>\\t%q0, %q1"
-  [(set_attr "type" "<crypto_type>")]
+  [(set_attr "type" "<crypto_type>")
+   (set_attr_alternative "enabled"
+     [(if_then_else (match_test
+		       "arm_fusion_enabled_p (tune_params::FUSE_AES_AESMC)")
+		     (const_string "yes" )
+		     (const_string "no"))
+      (const_string "yes")])]
 )
 
 (define_insn "crypto_<crypto_pattern>"