Patchwork Reduce conservativeness in REE using machine model (issue6631066)

login
register
mail settings
Submitter Teresa Johnson
Date Oct. 11, 2012, 9:44 p.m.
Message ID <20121011214412.5AC876128C@tjsboxrox.mtv.corp.google.com>
Download mbox | patch
Permalink /patch/191007/
State New
Headers show

Comments

Teresa Johnson - Oct. 11, 2012, 9:44 p.m.
Revised patch to address conservative behavior in redundant extend
elimination that was resulting in redundant extends not being
removed. Now uses a new target hook machine_mode_from_attr_mode
which is currently enabled only for i386.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for trunk?

Thanks,
Teresa

2012-10-11  Teresa Johnson  <tejohnson@google.com>

	* doc/tm.texi: Document TARGET_MACHINE_MODE_FROM_ATTR_MODE.
	* doc/tm.texi.in: Regenerated.
	* targhooks.c (default_machine_mode_from_attr_mode): New function.
	* targhooks.h (default_machine_mode_from_attr_mode): Declare.
	* target.def (machine_mode_from_attr_mode): New target hook.
	* ree.c (get_mode): New function.
	(add_removable_extension): Use get_mode to obtain the
        machine mode for comparison with other extends.
	* config/i386/i386.c (ix86_machine_mode_from_attr_mode): New function.


--
This patch is available for review at http://codereview.appspot.com/6631066
Steven Bosscher - Oct. 11, 2012, 9:50 p.m.
On Thu, Oct 11, 2012 at 11:44 PM, Teresa Johnson wrote:
> +  mode = targetm.machine_mode_from_attr_mode(insn);

Nit: space between "..._mode" and "(".

A test case would also be Nice To Have.

Looks OK to me otherwise, but I can't approve it.

Ciao!
Steven
Jakub Jelinek - Oct. 12, 2012, 8:23 a.m.
On Thu, Oct 11, 2012 at 02:44:12PM -0700, Teresa Johnson wrote:
> Revised patch to address conservative behavior in redundant extend
> elimination that was resulting in redundant extends not being
> removed. Now uses a new target hook machine_mode_from_attr_mode
> which is currently enabled only for i386.

I still don't like it, the hook still is about how it is implemented
instead of what target property it wants to ask (the important thing
there is that a {QI,HI} -> SImode zero extension instruction on x86_64
performs {QI,HI} -> DImode extension actually).  That isn't the case for any
other modes, isn't the case for sign extension etc.

Can you please post a testcase first?

Given the recent ree.c changes to remember the performed operations and
their original modes (struct ext_modified), perhaps the
"Second, make sure the reaching definitions don't feed another and"...
check could be made less strict or even removed, but for that a testcase is
really needed.

	Jakub
Teresa Johnson - Oct. 15, 2012, 9:04 p.m.
On Fri, Oct 12, 2012 at 1:23 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Oct 11, 2012 at 02:44:12PM -0700, Teresa Johnson wrote:
>> Revised patch to address conservative behavior in redundant extend
>> elimination that was resulting in redundant extends not being
>> removed. Now uses a new target hook machine_mode_from_attr_mode
>> which is currently enabled only for i386.
>
> I still don't like it, the hook still is about how it is implemented
> instead of what target property it wants to ask (the important thing
> there is that a {QI,HI} -> SImode zero extension instruction on x86_64
> performs {QI,HI} -> DImode extension actually).  That isn't the case for any
> other modes, isn't the case for sign extension etc.

That's true, although for sign extends the attr modes being set in
i386.md ensure that this won't do the wrong thing as the the attr
modes in the machine desc file match the machine_mode. However, this
ends up leading to the conservative behavior remaining for sign
extends (see testcase below).

>
> Can you please post a testcase first?

This was exposed by the snappy decompression code. However, I was able
to reproduce it by modifying the testcase submitted with the fix that
introduced this checking (gcc.c-torture/execute/20111227-1.c for
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01744.html). This test
case was failing because a sign_extend was being combined with a
zero_extend, so the instruction code check below fixed it, and the
mode check was unnecessary:

      /* Second, make sure the reaching definitions don't feed another and
         different extension.  FIXME: this obviously can be improved.  */
      for (def = defs; def; def = def->next)
        if ((idx = def_map[INSN_UID(DF_REF_INSN (def->ref))])
            && (cand = &VEC_index (ext_cand, *insn_list, idx - 1))
            && (cand->code != code || cand->mode != mode))

Here is my modified test case that exposes the conservative behavior I
saw in snappy:

------------------------
extern void abort (void);

unsigned short s;
unsigned int i;
unsigned long l;
unsigned char v = -1;

void __attribute__((noinline,noclone))
bar (int t)
{
  if (t == 2 && s != 0xff)
    abort ();
  if (t == 1 && i != 0xff)
    abort ();
  if (t == 0 && l != 0xff)
    abort ();
}

void __attribute__((noinline,noclone))
foo (unsigned char *a, int t)
{
  unsigned char r = v;

  if (t == 2)
    s = (unsigned short) r;
  else if (t == 1)
    i = (unsigned int) r;
  else if (t == 0)
    l = (unsigned long) r;
  bar (t);
}

int main(void)
{
  foo (&v, 0);
  foo (&v, 1);
  foo (&v, 2);
  return 0;
}
-----------------------

With trunk, there are currently 3 movzbl generated for foo():

	movzbl	v(%rip), %eax
	movzbl	%al, %eax
	movzbl	%al, %eax

With my fix this goes down to 1 movzbl. However, if the test case is
modified to use signed instead of unsigned, we still end up with 3
movsbl, of which 2 are redundant:

 	movsbw	v(%rip), %ax
	movsbq	%al, %rax
	movsbl	%al, %eax

A single movsbq will suffice. But because of the attr mode settings
for sign extends I mentioned above, my patch does not help here.

>
> Given the recent ree.c changes to remember the performed operations and
> their original modes (struct ext_modified), perhaps the
> "Second, make sure the reaching definitions don't feed another and"...
> check could be made less strict or even removed, but for that a testcase is
> really needed.

I believe that we can remove the mode check from the above code
altogether. The reason is that the ree.c code will always select the
widest mode when combining extends (in merge_def_and_ext). So with a
simple change to the ree.c code to simply avoid the mode checking both
the unsigned and signed cases get addressed. In the signed case we are
left with a single movs:

	movsbq	v(%rip), %rax

I didn't see any test failures in a x86_64-unknown-linux-gnu bootstrap
and regression test run. If this seems reasonable, I can follow up
with the patch (which is trivial), and I can also submit the 2 new
test cases (the signed test case is included below).

Thanks,
Teresa


extern void abort (void);

signed short s;
signed int i;
signed long l;
signed char v = -1;

void __attribute__((noinline,noclone))
bar (int t)
{
  if (t == 2 && s != -1)
    abort ();
  if (t == 1 && i != -1)
    abort ();
  if (t == 0 && l != -1)
    abort ();
}

void __attribute__((noinline,noclone))
foo (signed char *a, int t)
{
  signed char r = v;

  if (t == 2)
    s = (signed short) r;
  else if (t == 1)
    i = (signed int) r;
  else if (t == 0)
    l = (signed long) r;
  bar (t);
}

int main(void)
{
  foo (&v, 0);
  foo (&v, 1);
  foo (&v, 2);
  return 0;
}



>
>         Jakub
Xinliang David Li - Oct. 16, 2012, 6:53 a.m.
The change to remove mode check looks good to me.


Not directly related to this bug but somehow related: the REE has
obvious conservativeness regarding partial redundancy (i.e., not all
reaching def has the bits properly extended). However, there are bugs
preventing elimination even with full redundancy. The following is a
modified example -- the zero extend after the call to bar is
redundant, but not removed. This needs to be tracked separately.

/* compile with -O2 -fdisable-rtl-ce1 -fdisable-rtl-ce2
-fdisable-rtl-ce3 -DFULL */

unsigned short s;
unsigned long l;
unsigned char v = -1;
unsigned char v2 = -1;
unsigned char v3 = -1;

extern bar (int t);

void __attribute__((noinline,noclone))
foo (unsigned char *a, int t, char c)
{
  unsigned char r = c;

  if (t == 0)
    r = v;
  else if (t == 1)
    r = v2;
#ifdef FULL
  else
    r = v3;
#endif

   bar (t);
   s = r;

}

David


On Mon, Oct 15, 2012 at 2:04 PM, Teresa Johnson <tejohnson@google.com> wrote:
> On Fri, Oct 12, 2012 at 1:23 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Thu, Oct 11, 2012 at 02:44:12PM -0700, Teresa Johnson wrote:
>>> Revised patch to address conservative behavior in redundant extend
>>> elimination that was resulting in redundant extends not being
>>> removed. Now uses a new target hook machine_mode_from_attr_mode
>>> which is currently enabled only for i386.
>>
>> I still don't like it, the hook still is about how it is implemented
>> instead of what target property it wants to ask (the important thing
>> there is that a {QI,HI} -> SImode zero extension instruction on x86_64
>> performs {QI,HI} -> DImode extension actually).  That isn't the case for any
>> other modes, isn't the case for sign extension etc.
>
> That's true, although for sign extends the attr modes being set in
> i386.md ensure that this won't do the wrong thing as the the attr
> modes in the machine desc file match the machine_mode. However, this
> ends up leading to the conservative behavior remaining for sign
> extends (see testcase below).
>
>>
>> Can you please post a testcase first?
>
> This was exposed by the snappy decompression code. However, I was able
> to reproduce it by modifying the testcase submitted with the fix that
> introduced this checking (gcc.c-torture/execute/20111227-1.c for
> http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01744.html). This test
> case was failing because a sign_extend was being combined with a
> zero_extend, so the instruction code check below fixed it, and the
> mode check was unnecessary:
>
>       /* Second, make sure the reaching definitions don't feed another and
>          different extension.  FIXME: this obviously can be improved.  */
>       for (def = defs; def; def = def->next)
>         if ((idx = def_map[INSN_UID(DF_REF_INSN (def->ref))])
>             && (cand = &VEC_index (ext_cand, *insn_list, idx - 1))
>             && (cand->code != code || cand->mode != mode))
>
> Here is my modified test case that exposes the conservative behavior I
> saw in snappy:
>
> ------------------------
> extern void abort (void);
>
> unsigned short s;
> unsigned int i;
> unsigned long l;
> unsigned char v = -1;
>
> void __attribute__((noinline,noclone))
> bar (int t)
> {
>   if (t == 2 && s != 0xff)
>     abort ();
>   if (t == 1 && i != 0xff)
>     abort ();
>   if (t == 0 && l != 0xff)
>     abort ();
> }
>
> void __attribute__((noinline,noclone))
> foo (unsigned char *a, int t)
> {
>   unsigned char r = v;
>
>   if (t == 2)
>     s = (unsigned short) r;
>   else if (t == 1)
>     i = (unsigned int) r;
>   else if (t == 0)
>     l = (unsigned long) r;
>   bar (t);
> }
>
> int main(void)
> {
>   foo (&v, 0);
>   foo (&v, 1);
>   foo (&v, 2);
>   return 0;
> }
> -----------------------
>
> With trunk, there are currently 3 movzbl generated for foo():
>
>         movzbl  v(%rip), %eax
>         movzbl  %al, %eax
>         movzbl  %al, %eax
>
> With my fix this goes down to 1 movzbl. However, if the test case is
> modified to use signed instead of unsigned, we still end up with 3
> movsbl, of which 2 are redundant:
>
>         movsbw  v(%rip), %ax
>         movsbq  %al, %rax
>         movsbl  %al, %eax
>
> A single movsbq will suffice. But because of the attr mode settings
> for sign extends I mentioned above, my patch does not help here.
>
>>
>> Given the recent ree.c changes to remember the performed operations and
>> their original modes (struct ext_modified), perhaps the
>> "Second, make sure the reaching definitions don't feed another and"...
>> check could be made less strict or even removed, but for that a testcase is
>> really needed.
>
> I believe that we can remove the mode check from the above code
> altogether. The reason is that the ree.c code will always select the
> widest mode when combining extends (in merge_def_and_ext). So with a
> simple change to the ree.c code to simply avoid the mode checking both
> the unsigned and signed cases get addressed. In the signed case we are
> left with a single movs:
>
>         movsbq  v(%rip), %rax
>
> I didn't see any test failures in a x86_64-unknown-linux-gnu bootstrap
> and regression test run. If this seems reasonable, I can follow up
> with the patch (which is trivial), and I can also submit the 2 new
> test cases (the signed test case is included below).
>
> Thanks,
> Teresa
>
>
> extern void abort (void);
>
> signed short s;
> signed int i;
> signed long l;
> signed char v = -1;
>
> void __attribute__((noinline,noclone))
> bar (int t)
> {
>   if (t == 2 && s != -1)
>     abort ();
>   if (t == 1 && i != -1)
>     abort ();
>   if (t == 0 && l != -1)
>     abort ();
> }
>
> void __attribute__((noinline,noclone))
> foo (signed char *a, int t)
> {
>   signed char r = v;
>
>   if (t == 2)
>     s = (signed short) r;
>   else if (t == 1)
>     i = (signed int) r;
>   else if (t == 0)
>     l = (signed long) r;
>   bar (t);
> }
>
> int main(void)
> {
>   foo (&v, 0);
>   foo (&v, 1);
>   foo (&v, 2);
>   return 0;
> }
>
>
>
>>
>>         Jakub
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413

Patch

Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revision 192265)
+++ doc/tm.texi	(working copy)
@@ -10468,6 +10468,10 @@  In order to enforce the representation of @code{mo
 @code{mode}.
 @end deftypefn
 
+@deftypefn {Target Hook} {enum machine_mode} TARGET_MACHINE_MODE_FROM_ATTR_MODE (rtx @var{insn})
+If @var{insn} has an attr_mode that is equivalent to a @code{machine_mode},  return the corresponding @code{machine_mode}, otherwise return  @code{MAX_MACHINE_MODE}.
+@end deftypefn
+
 @defmac STORE_FLAG_VALUE
 A C expression describing the value returned by a comparison operator
 with an integral mode and stored by a store-flag instruction
Index: doc/tm.texi.in
===================================================================
--- doc/tm.texi.in	(revision 192265)
+++ doc/tm.texi.in	(working copy)
@@ -10326,6 +10326,8 @@  In order to enforce the representation of @code{mo
 @code{mode}.
 @end deftypefn
 
+@hook TARGET_MACHINE_MODE_FROM_ATTR_MODE
+
 @defmac STORE_FLAG_VALUE
 A C expression describing the value returned by a comparison operator
 with an integral mode and stored by a store-flag instruction
Index: targhooks.c
===================================================================
--- targhooks.c	(revision 192265)
+++ targhooks.c	(working copy)
@@ -250,6 +250,14 @@  default_mode_rep_extended (enum machine_mode mode
   return UNKNOWN;
 }
 
+/* The default implementation of TARGET_MACHINE_MODE_FROM_ATTR_MODE.  */
+
+enum machine_mode
+default_machine_mode_from_attr_mode (rtx insn ATTRIBUTE_UNUSED)
+{
+  return MAX_MACHINE_MODE;
+}
+
 /* Generic hook that takes a CUMULATIVE_ARGS pointer and returns true.  */
 
 bool
Index: targhooks.h
===================================================================
--- targhooks.h	(revision 192265)
+++ targhooks.h	(working copy)
@@ -47,6 +47,7 @@  extern unsigned HOST_WIDE_INT default_shift_trunca
   (enum machine_mode);
 extern unsigned int default_min_divisions_for_recip_mul (enum machine_mode);
 extern int default_mode_rep_extended (enum machine_mode, enum machine_mode);
+enum machine_mode default_machine_mode_from_attr_mode (rtx insn);
 
 extern tree default_stack_protect_guard (void);
 extern tree default_external_stack_protect_fail (void);
Index: target.def
===================================================================
--- target.def	(revision 192265)
+++ target.def	(working copy)
@@ -1576,6 +1576,17 @@  DEFHOOK
  int, (enum machine_mode mode, enum machine_mode rep_mode),
  default_mode_rep_extended)
 
+/* If the machine description for an rtl INSN defines the
+   attr_mode, and that mode is equivalent to a machine_mode, return
+   the corresponding machine_mode. Return MAX_MACHINE_MODE otherwise.  */
+DEFHOOK
+(machine_mode_from_attr_mode,
+ "If @var{insn} has an attr_mode that is equivalent to a @code{machine_mode},\
+  return the corresponding @code{machine_mode}, otherwise return\
+  @code{MAX_MACHINE_MODE}.",
+ enum machine_mode, (rtx insn),
+ default_machine_mode_from_attr_mode)
+
 /* True if MODE is valid for a pointer in __attribute__((mode("MODE"))).  */
 DEFHOOK
 (valid_pointer_mode,
Index: ree.c
===================================================================
--- ree.c	(revision 192265)
+++ ree.c	(working copy)
@@ -756,6 +756,20 @@  combine_reaching_defs (ext_cand *cand, const_rtx s
   return false;
 }
 
+/* Given an INSN, obtain the attr_mode specified by the machine
+   description, and map it to the corresponding machine_mode. If the
+   attr_mode isn't specified, return the machine mode for DEST.  */
+
+static enum machine_mode
+get_mode (rtx insn, rtx dest)
+{
+  enum machine_mode mode;
+  mode = targetm.machine_mode_from_attr_mode(insn);
+  if (mode == MAX_MACHINE_MODE)
+    mode = GET_MODE (dest);
+  return mode;
+}
+
 /* Add an extension pattern that could be eliminated.  */
 
 static void
@@ -775,7 +789,7 @@  add_removable_extension (const_rtx expr, rtx insn,
   src = SET_SRC (expr);
   code = GET_CODE (src);
   dest = SET_DEST (expr);
-  mode = GET_MODE (dest);
+  mode = get_mode (insn, dest);
 
   if (REG_P (dest)
       && (code == SIGN_EXTEND || code == ZERO_EXTEND)
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 192265)
+++ config/i386/i386.c	(working copy)
@@ -15074,6 +15074,48 @@  ix86_print_operand_address (FILE *file, rtx addr)
     }
 }
 
+/* Implementation of TARGET_MACHINE_MODE_FROM_ATTR_MODE.  */
+
+static enum machine_mode
+ix86_machine_mode_from_attr_mode (rtx insn)
+{
+  switch (get_attr_mode (insn))
+    {
+      case MODE_QI:
+        return QImode;
+      case MODE_HI:
+        return HImode;
+      case MODE_SI:
+        return SImode;
+      case MODE_DI:
+        return DImode;
+      case MODE_TI:
+        return TImode;
+      case MODE_OI:
+        return OImode;
+      case MODE_SF:
+        return SFmode;
+      case MODE_DF:
+        return DFmode;
+      case MODE_XF:
+        return XFmode;
+      case MODE_TF:
+        return TFmode;
+      case MODE_V8SF:
+        return V8SFmode;
+      case MODE_V4DF:
+        return V4DFmode;
+      case MODE_V4SF:
+        return V4SFmode;
+      case MODE_V2DF:
+        return V2DFmode;
+      case MODE_V2SF:
+        return V2SFmode;
+      default:
+        return MAX_MACHINE_MODE;
+    }
+}
+
 /* Implementation of TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA.  */
 
 static bool
@@ -41171,6 +41213,9 @@  ix86_memmodel_check (unsigned HOST_WIDE_INT val)
 #undef TARGET_EXPAND_TO_RTL_HOOK
 #define TARGET_EXPAND_TO_RTL_HOOK ix86_maybe_switch_abi
 
+#undef TARGET_MACHINE_MODE_FROM_ATTR_MODE
+#define TARGET_MACHINE_MODE_FROM_ATTR_MODE ix86_machine_mode_from_attr_mode
+
 #undef TARGET_LEGITIMATE_ADDRESS_P
 #define TARGET_LEGITIMATE_ADDRESS_P ix86_legitimate_address_p