diff mbox

PR target/71549: Convert V1TImode register to TImode in debug insn

Message ID CAMe9rOqqKBM5ZWdcboaYzgVv69=Yn3sqdbgBRaUv-BbOQaGzxA@mail.gmail.com
State New
Headers show

Commit Message

H.J. Lu June 21, 2016, 12:48 p.m. UTC
On Mon, Jun 20, 2016 at 11:53 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Jun 20, 2016 at 10:31 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>> On 20 Jun 09:45, H.J. Lu wrote:
>>> On Mon, Jun 20, 2016 at 7:30 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>> > 2016-06-20 16:39 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>> >> On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>>> >>> TImode register referenced in debug insn can be converted to V1TImode
>>> >>> by scalar to vector optimization.  We need to convert a debug insn if
>>> >>> it has a variable in a TImode register.
>>> >>>
>>> >>> Tested on x86-64.  OK for trunk?
>>> >>
>>> >> Ilya, does this approach look good to you? Also, does DImode STV
>>> >> conversion need similar handling of debug insns?
>>> >
>>> > DImode conversion doesn't change register mode (i.e. never calls
>>> > PUT_MODE for registers).  That keeps debug instructions valid.
>>> >
>>> > Overall I don't like the idea of having debug insns in candidates
>>> > set and in chains.  Looks like it is possible to have a chain
>>> > consisting of a debug insn only which is weird (otherwise I don't
>>> > see why we may return false in timode_scalar_chain::convert_insn).
>>>
>>> Yes, it can happen:
>>>
>>> (insn 11 8 12 2 (parallel [
>>>             (set (reg/v:TI 91 [ <retval> ])
>>>                 (plus:TI (reg/v:TI 92 [ a ])
>>>                     (reg/v:TI 96 [ b ])))
>>>             (clobber (reg:CC 17 flags))
>>>         ]) y.i:5 210 {*addti3_doubleword}
>>>      (expr_list:REG_UNUSED (reg:CC 17 flags)
>>>         (nil)))
>>> (debug_insn 12 11 13 2 (var_location:TI w (reg/v:TI 91 [ <retval> ])) y.i:5 -1
>>>      (nil))
>>>
>>>
>>> > What about other possible register uses?  If debug insns are added
>>> > to candidates then NONDEBUG_INSN_P check for uses in
>>> > timode_check_non_convertible_regs becomes invalid, right?
>>>
>>> Debug insn has no impact on STV decision.  We just need to convert
>>> register referenced in debug insn from V1TImode to TImode in
>>> timode_scalar_chain::convert_insn.
>>>
>>> > If we have (or want) to fix some register uses then it's probably
>>> > would be better to visit register uses when we convert its mode
>>> > and make required fix-ups.  It seems better to me to not involve
>>> > debug insns in analysis phase.
>>>
>>> Here is the updated patch to add debug insn, which references the
>>> TImode register which will be converted to V1TImode to queue.
>>> I am testing it now.
>>>
>>
>> You still count and dump debug insns as optimized ones.  Also we
>> try to use virtual functions to cover differences in DI and TI
>> optimizations and introducing additional TARGET_64BIT in common
>> STV code is undesirable.
>>
>> Also your conversion now depends on instructions processing order.
>> You will fail to process debug insn before non-debug ones. Required
>> order is not guaranteed because processing depends on instruction
>> UIDs only.
>>
>> I propose to modify transformation phase only like in the patch
>> (untested) below.  I rely on your code which assumes the only
>> possible usage in debug insn is VAR_LOCATION.
>>
>> Thanks,
>> Ilya
>> --
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index c5e5e12..ec955f0 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -3139,6 +3139,7 @@ class timode_scalar_chain : public scalar_chain
>>
>>   private:
>>    void mark_dual_mode_def (df_ref def);
>> +  void fix_debug_reg_uses (rtx reg);
>>    void convert_insn (rtx_insn *insn);
>>    /* We don't convert registers to difference size.  */
>>    void convert_registers () {}
>> @@ -3790,6 +3791,34 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn)
>>    df_insn_rescan (insn);
>>  }
>>
>> +/* Fix uses of converted REG in debug insns.  */
>> +
>> +void
>> +timode_scalar_chain::fix_debug_reg_uses (rtx reg)
>> +{
>> +  df_ref ref;
>> +  for (ref = DF_REG_USE_CHAIN (REGNO (reg)); ref; ref = DF_REF_NEXT_REG (ref))
>> +    {
>> +      rtx_insn *insn = DF_REF_INSN (ref);
>> +
>> +      if (DEBUG_INSN_P (insn))
>> +       {
>> +         /* It must be a debug insn with a TImode variable in register.  */
>> +         rtx val = PATTERN (insn);
>> +         gcc_assert (GET_MODE (val) == TImode
>> +                     && GET_CODE (val) == VAR_LOCATION);
>> +         rtx loc = PAT_VAR_LOCATION_LOC (val);
>> +         gcc_assert (REG_P (loc)
>> +                     && GET_MODE (loc) == V1TImode);
>> +         /* Convert V1TImode register, which has been updated by a SET
>> +             insn before, to SUBREG TImode.  */
>> +         PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0);
>> +         df_insn_rescan (insn);
>> +         return;
>> +       }
>> +    }
>> +}
>> +
>>  /* Convert INSN from TImode to V1T1mode.  */
>>
>>  void
>> @@ -3806,8 +3835,10 @@ timode_scalar_chain::convert_insn (rtx_insn *insn)
>>         rtx tmp = find_reg_equal_equiv_note (insn);
>>         if (tmp)
>>           PUT_MODE (XEXP (tmp, 0), V1TImode);
>> +       PUT_MODE (dst, V1TImode);
>> +       fix_debug_reg_uses (dst);
>>        }
>> -      /* FALLTHRU */
>> +      break;
>>      case MEM:
>>        PUT_MODE (dst, V1TImode);
>>        break;
>
> Won't be it waste CPU cycles when there is no debug insin
> which use TImode registers?
>

Here is the updated patch.  Tested on x86-64.  OK for trunk?

Thanks.

Comments

Ilya Enkovich June 21, 2016, 1:57 p.m. UTC | #1
2016-06-21 15:48 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
> On Mon, Jun 20, 2016 at 11:53 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Mon, Jun 20, 2016 at 10:31 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>> On 20 Jun 09:45, H.J. Lu wrote:
>>>> On Mon, Jun 20, 2016 at 7:30 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>>> > 2016-06-20 16:39 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>>> >> On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>>>> >>> TImode register referenced in debug insn can be converted to V1TImode
>>>> >>> by scalar to vector optimization.  We need to convert a debug insn if
>>>> >>> it has a variable in a TImode register.
>>>> >>>
>>>> >>> Tested on x86-64.  OK for trunk?
>>>> >>
>>>> >> Ilya, does this approach look good to you? Also, does DImode STV
>>>> >> conversion need similar handling of debug insns?
>>>> >
>>>> > DImode conversion doesn't change register mode (i.e. never calls
>>>> > PUT_MODE for registers).  That keeps debug instructions valid.
>>>> >
>>>> > Overall I don't like the idea of having debug insns in candidates
>>>> > set and in chains.  Looks like it is possible to have a chain
>>>> > consisting of a debug insn only which is weird (otherwise I don't
>>>> > see why we may return false in timode_scalar_chain::convert_insn).
>>>>
>>>> Yes, it can happen:
>>>>
>>>> (insn 11 8 12 2 (parallel [
>>>>             (set (reg/v:TI 91 [ <retval> ])
>>>>                 (plus:TI (reg/v:TI 92 [ a ])
>>>>                     (reg/v:TI 96 [ b ])))
>>>>             (clobber (reg:CC 17 flags))
>>>>         ]) y.i:5 210 {*addti3_doubleword}
>>>>      (expr_list:REG_UNUSED (reg:CC 17 flags)
>>>>         (nil)))
>>>> (debug_insn 12 11 13 2 (var_location:TI w (reg/v:TI 91 [ <retval> ])) y.i:5 -1
>>>>      (nil))
>>>>
>>>>
>>>> > What about other possible register uses?  If debug insns are added
>>>> > to candidates then NONDEBUG_INSN_P check for uses in
>>>> > timode_check_non_convertible_regs becomes invalid, right?
>>>>
>>>> Debug insn has no impact on STV decision.  We just need to convert
>>>> register referenced in debug insn from V1TImode to TImode in
>>>> timode_scalar_chain::convert_insn.
>>>>
>>>> > If we have (or want) to fix some register uses then it's probably
>>>> > would be better to visit register uses when we convert its mode
>>>> > and make required fix-ups.  It seems better to me to not involve
>>>> > debug insns in analysis phase.
>>>>
>>>> Here is the updated patch to add debug insn, which references the
>>>> TImode register which will be converted to V1TImode to queue.
>>>> I am testing it now.
>>>>
>>>
>>> You still count and dump debug insns as optimized ones.  Also we
>>> try to use virtual functions to cover differences in DI and TI
>>> optimizations and introducing additional TARGET_64BIT in common
>>> STV code is undesirable.
>>>
>>> Also your conversion now depends on instructions processing order.
>>> You will fail to process debug insn before non-debug ones. Required
>>> order is not guaranteed because processing depends on instruction
>>> UIDs only.
>>>
>>> I propose to modify transformation phase only like in the patch
>>> (untested) below.  I rely on your code which assumes the only
>>> possible usage in debug insn is VAR_LOCATION.
>>>
>>> Thanks,
>>> Ilya
>>> --
>>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>>> index c5e5e12..ec955f0 100644
>>> --- a/gcc/config/i386/i386.c
>>> +++ b/gcc/config/i386/i386.c
>>> @@ -3139,6 +3139,7 @@ class timode_scalar_chain : public scalar_chain
>>>
>>>   private:
>>>    void mark_dual_mode_def (df_ref def);
>>> +  void fix_debug_reg_uses (rtx reg);
>>>    void convert_insn (rtx_insn *insn);
>>>    /* We don't convert registers to difference size.  */
>>>    void convert_registers () {}
>>> @@ -3790,6 +3791,34 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn)
>>>    df_insn_rescan (insn);
>>>  }
>>>
>>> +/* Fix uses of converted REG in debug insns.  */
>>> +
>>> +void
>>> +timode_scalar_chain::fix_debug_reg_uses (rtx reg)
>>> +{
>>> +  df_ref ref;
>>> +  for (ref = DF_REG_USE_CHAIN (REGNO (reg)); ref; ref = DF_REF_NEXT_REG (ref))
>>> +    {
>>> +      rtx_insn *insn = DF_REF_INSN (ref);
>>> +
>>> +      if (DEBUG_INSN_P (insn))
>>> +       {
>>> +         /* It must be a debug insn with a TImode variable in register.  */
>>> +         rtx val = PATTERN (insn);
>>> +         gcc_assert (GET_MODE (val) == TImode
>>> +                     && GET_CODE (val) == VAR_LOCATION);
>>> +         rtx loc = PAT_VAR_LOCATION_LOC (val);
>>> +         gcc_assert (REG_P (loc)
>>> +                     && GET_MODE (loc) == V1TImode);
>>> +         /* Convert V1TImode register, which has been updated by a SET
>>> +             insn before, to SUBREG TImode.  */
>>> +         PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0);
>>> +         df_insn_rescan (insn);
>>> +         return;
>>> +       }
>>> +    }
>>> +}
>>> +
>>>  /* Convert INSN from TImode to V1T1mode.  */
>>>
>>>  void
>>> @@ -3806,8 +3835,10 @@ timode_scalar_chain::convert_insn (rtx_insn *insn)
>>>         rtx tmp = find_reg_equal_equiv_note (insn);
>>>         if (tmp)
>>>           PUT_MODE (XEXP (tmp, 0), V1TImode);
>>> +       PUT_MODE (dst, V1TImode);
>>> +       fix_debug_reg_uses (dst);
>>>        }
>>> -      /* FALLTHRU */
>>> +      break;
>>>      case MEM:
>>>        PUT_MODE (dst, V1TImode);
>>>        break;
>>
>> Won't be it waste CPU cycles when there is no debug insin
>> which use TImode registers?
>>
>
> Here is the updated patch.  Tested on x86-64.  OK for trunk?

I'm OK with this version.

Thanks,
Ilya

>
> Thanks.
>
>
> --
> H.J.
diff mbox

Patch

From df026e33ebae5fef48edc2c1c3a9a7036095aff9 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Sun, 19 Jun 2016 12:47:45 -0700
Subject: [PATCH] Convert V1TImode register to TImode in debug insn

TImode register referenced in debug insn can be converted to V1TImode
by scalar to vector optimization.  After converting a TImode register
to V1TImode, we need to check all debug insns on its use chain to
convert the V1TImode register to SUBREG TImode.

gcc/

2016-06-21  H.J. Lu  <hongjiu.lu@intel.com>
	    Ilya Enkovich  <ilya.enkovich@intel.com>

	PR target/71549
	* config/i386/i386.c (timode_scalar_chain::fix_debug_reg_uses):
	New member function to convert V1TImode register to SUBREG
	TImode in debug insn.
	(timode_scalar_chain::convert_insn): Call fix_debug_reg_uses
	after changing register mode to V1TImode.

gcc/testsuite/

2016-06-21  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/71549
	* gcc.target/i386/pr71549.c: New test.
---
 gcc/config/i386/i386.c                  | 38 ++++++++++++++++++++++++++++++++-
 gcc/testsuite/gcc.target/i386/pr71549.c | 24 +++++++++++++++++++++
 2 files changed, 61 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71549.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56a5b9c..0dd09ce 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3139,6 +3139,7 @@  class timode_scalar_chain : public scalar_chain
 
  private:
   void mark_dual_mode_def (df_ref def);
+  void fix_debug_reg_uses (rtx reg);
   void convert_insn (rtx_insn *insn);
   /* We don't convert registers to difference size.  */
   void convert_registers () {}
@@ -3790,6 +3791,39 @@  dimode_scalar_chain::convert_insn (rtx_insn *insn)
   df_insn_rescan (insn);
 }
 
+/* Fix uses of converted REG in debug insns.  */
+
+void
+timode_scalar_chain::fix_debug_reg_uses (rtx reg)
+{
+  if (!flag_var_tracking)
+    return;
+
+  df_ref ref;
+  for (ref = DF_REG_USE_CHAIN (REGNO (reg));
+       ref;
+       ref = DF_REF_NEXT_REG (ref))
+    {
+      rtx_insn *insn = DF_REF_INSN (ref);
+      if (DEBUG_INSN_P (insn))
+	{
+	  /* It may be a debug insn with a TImode variable in
+	     register.  */
+	  rtx val = PATTERN (insn);
+	  if (GET_MODE (val) != TImode)
+	    continue;
+	  gcc_assert (GET_CODE (val) == VAR_LOCATION);
+	  rtx loc = PAT_VAR_LOCATION_LOC (val);
+	  gcc_assert (REG_P (loc)
+		      && GET_MODE (loc) == V1TImode);
+	  /* Convert V1TImode register, which has been updated by a SET
+	     insn before, to SUBREG TImode.  */
+	  PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0);
+	  df_insn_rescan (insn);
+	}
+    }
+}
+
 /* Convert INSN from TImode to V1T1mode.  */
 
 void
@@ -3806,8 +3840,10 @@  timode_scalar_chain::convert_insn (rtx_insn *insn)
 	rtx tmp = find_reg_equal_equiv_note (insn);
 	if (tmp)
 	  PUT_MODE (XEXP (tmp, 0), V1TImode);
+	PUT_MODE (dst, V1TImode);
+	fix_debug_reg_uses (dst);
       }
-      /* FALLTHRU */
+      break;
     case MEM:
       PUT_MODE (dst, V1TImode);
       break;
diff --git a/gcc/testsuite/gcc.target/i386/pr71549.c b/gcc/testsuite/gcc.target/i386/pr71549.c
new file mode 100644
index 0000000..8aac891
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr71549.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -g" } */
+
+struct S1
+{
+  int f0;
+  int f1;
+  int f2;
+  int:4;
+} a, b;
+
+void
+fn1 (struct S1 p1)
+{
+  a = p1;
+  int c = p1.f0;
+}
+
+int
+main ()
+{
+  fn1 (b);
+  return 0;
+}
-- 
2.5.5