Patchwork Fix memset recognition, paper over PR53346

login
register
mail settings
Submitter Richard Guenther
Date May 18, 2012, 1:08 p.m.
Message ID <Pine.LNX.4.64.1205181507310.5860@jbgna.fhfr.qr>
Download mbox | patch
Permalink /patch/160111/
State New
Headers show

Comments

Richard Guenther - May 18, 2012, 1:08 p.m.
On Fri, 18 May 2012, Richard Guenther wrote:

> 
> In PR53346 we vectorize a simple memset loop very inefficiently.
> But of course we should have detected this and transformed the
> loop into a memset!  Seems like we only do that if the original
> loop does sth else than memset as well.
> 
> Fixed as follows.
> 
> Bootstrap and regtest on x86_64-unknown-linux-gnu ongoing
> (I suppose that will really stress loop distribution now ;))

Committed as follows, with extra testsuite adjustments to avoid
memset loops.

Richard.

2012-05-18  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/53346
	* tree-loop-distribution.c (ldist_gen): Make sure to apply
	builtin transform even when only a single partition with
	all reads/writes exists.

	* gcc.dg/tree-ssa/ldist-18.c: New testcase.
	* gcc.target/i386/incoming-10.c: Adjust.
	* gcc.target/i386/incoming-11.c: Likewise.
	* gcc.target/i386/pr46295.c: Likewise.

Index: gcc/tree-loop-distribution.c
===================================================================
*** gcc/tree-loop-distribution.c	(revision 187650)
--- gcc/tree-loop-distribution.c	(working copy)
*************** ldist_gen (struct loop *loop, struct gra
*** 1131,1138 ****
    BITMAP_FREE (processed);
    nbp = VEC_length (bitmap, partitions);
  
!   if (nbp <= 1
!       || partition_contains_all_rw (rdg, partitions))
      goto ldist_done;
  
    if (dump_file && (dump_flags & TDF_DETAILS))
--- 1131,1141 ----
    BITMAP_FREE (processed);
    nbp = VEC_length (bitmap, partitions);
  
!   if (nbp == 0
!       || (nbp == 1
! 	  && !can_generate_builtin (rdg, VEC_index (bitmap, partitions, 0)))
!       || (nbp > 1
! 	  && partition_contains_all_rw (rdg, partitions)))
      goto ldist_done;
  
    if (dump_file && (dump_flags & TDF_DETAILS))
Index: gcc/testsuite/gcc.dg/tree-ssa/ldist-18.c
===================================================================
*** gcc/testsuite/gcc.dg/tree-ssa/ldist-18.c	(revision 0)
--- gcc/testsuite/gcc.dg/tree-ssa/ldist-18.c	(revision 0)
***************
*** 0 ****
--- 1,12 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+ 
+ void foo (int *p, int n)
+ {
+   int i;
+   for (i = 0; i < n; ++i)
+     p[i] = 0;
+ }
+ 
+ /* { dg-final { scan-tree-dump "generated memset zero" "ldist" } } */
+ /* { dg-final { cleanup-tree-dump "ldist" } } */

Patch

Index: gcc/testsuite/gcc.target/i386/incoming-10.c
===================================================================
--- gcc/testsuite/gcc.target/i386/incoming-10.c	(revision 187653)
+++ gcc/testsuite/gcc.target/i386/incoming-10.c	(working copy)
@@ -12,7 +12,7 @@  void f()
 {
 	int i;
 	struct s s;
-	for (i = 0; i < sizeof(s.x) / sizeof(*s.x); i++) s.x[i] = 0;
+	for (i = 0; i < sizeof(s.x) / sizeof(*s.x); i++) s.x[i] = 1;
 	g(&s);
 }
 
Index: gcc/testsuite/gcc.target/i386/incoming-11.c
===================================================================
--- gcc/testsuite/gcc.target/i386/incoming-11.c	(revision 187653)
+++ gcc/testsuite/gcc.target/i386/incoming-11.c	(working copy)
@@ -10,9 +10,9 @@  int q[100];
 void f()
 {
 	int i;
-	for (i = 0; i < 100; i++) p[i] = 0;
+	for (i = 0; i < 100; i++) p[i] = 1;
 	g();
-	for (i = 0; i < 100; i++) q[i] = 0;
+	for (i = 0; i < 100; i++) q[i] = 1;
 }
 
 /* { dg-final { scan-assembler "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
Index: gcc/testsuite/gcc.target/i386/pr46295.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr46295.c	(revision 187653)
+++ gcc/testsuite/gcc.target/i386/pr46295.c	(working copy)
@@ -9,7 +9,7 @@  void Parse_Vector ()
    EXPRESS Express;
    int Terms;
    for (Terms = 0; Terms < 5; Terms++)
-     Express[Terms] = 0.0;
+     Express[Terms] = 1.0;
    Parse_Rel_Factor(Express,&Terms);
 }