diff mbox

Add value range support into memcpy/memset expansion

Message ID 20131119012339.GA29879@kam.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka Nov. 19, 2013, 1:23 a.m. UTC
Hi,
this patch fixes two issues with memcpy testcase - silences warning and updates
the template as suggested by Uros in the PR.  The testcase still fails on i386.
This is because we end up with:
;; Function t (t, funcdef_no=0, decl_uid=1763, symbol_order=2)

t (unsigned int c)
{
  void * b.0_4;
  void * a.1_5;

  <bb 2>:
  if (c_2(D) <= 9)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  b.0_4 = b;
  a.1_5 = a;
  memcpy (a.1_5, b.0_4, c_2(D));

  <bb 4>:
  return;

}
and we have no useful value range on c_2 because assert_expr was removed,
while in 64bit version there is a cast in bb 3 that preserves the info.
Solving this is an independent (and I guess not terribly easy) problem.

Regtested x86_64-linux, will commit it shortly.

Comments

Richard Biener Nov. 19, 2013, 8:50 a.m. UTC | #1
On Tue, 19 Nov 2013, Jan Hubicka wrote:

> Hi,
> this patch fixes two issues with memcpy testcase - silences warning and updates
> the template as suggested by Uros in the PR.  The testcase still fails on i386.
> This is because we end up with:
> ;; Function t (t, funcdef_no=0, decl_uid=1763, symbol_order=2)
> 
> t (unsigned int c)
> {
>   void * b.0_4;
>   void * a.1_5;
> 
>   <bb 2>:
>   if (c_2(D) <= 9)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
> 
>   <bb 3>:
>   b.0_4 = b;
>   a.1_5 = a;
>   memcpy (a.1_5, b.0_4, c_2(D));
> 
>   <bb 4>:
>   return;
> 
> }
> and we have no useful value range on c_2 because assert_expr was removed,
> while in 64bit version there is a cast in bb 3 that preserves the info.
> Solving this is an independent (and I guess not terribly easy) problem.

Hmm, I thought Jakub fixed this already (with the checking whether
there are any uses of c_2(D) before the conditional)?  Or is this
a different case?

Richard.

> Regtested x86_64-linux, will commit it shortly.
> 
> Index: ChangeLog
> ===================================================================
> --- ChangeLog	(revision 204984)
> +++ ChangeLog	(working copy)
> @@ -1,3 +1,10 @@
> +2013-11-18  Jan Hubicka  <jh@suse.cz>
> +	    Uros Bizjak  <ubizjak@gmail.com>
> +
> +	PR middle-end/59175
> +	* gcc.target/i386/memcpy-2.c: Fix template;
> +	add +1 so the testcase passes at 32bit.
> +
>  2013-11-18  Dominique d'Humieres  <dominiq@lps.ens.fr>
>  
>  	* c-c++-common/cilk-plus/PS/reduction-3.c: Use stdlib.h.
> Index: gcc.target/i386/memcpy-2.c
> ===================================================================
> --- gcc.target/i386/memcpy-2.c	(revision 204984)
> +++ gcc.target/i386/memcpy-2.c	(working copy)
> @@ -1,11 +1,11 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2" } */
> -/* Memcpy should be inlined because block size is known.  */
> -/* { dg-final { scan-assembler-not "memcpy" } } */
>  void *a;
>  void *b;
>  t(unsigned int c)
>  {
>    if (c<10)
> -    memcpy (a,b,c);
> +    __builtin_memcpy (a,b,c+1);
>  }
> +/* Memcpy should be inlined because block size is known.  */
> +/* { dg-final { scan-assembler-not "(jmp|call)\[\\t \]*memcpy" } } */
> 
>
Jakub Jelinek Nov. 19, 2013, 9:23 a.m. UTC | #2
On Tue, Nov 19, 2013 at 09:50:56AM +0100, Richard Biener wrote:
> > this patch fixes two issues with memcpy testcase - silences warning and updates
> > the template as suggested by Uros in the PR.  The testcase still fails on i386.
> > This is because we end up with:
> > ;; Function t (t, funcdef_no=0, decl_uid=1763, symbol_order=2)
> > 
> > t (unsigned int c)
> > {
> >   void * b.0_4;
> >   void * a.1_5;
> > 
> >   <bb 2>:
> >   if (c_2(D) <= 9)
> >     goto <bb 3>;
> >   else
> >     goto <bb 4>;
> > 
> >   <bb 3>:
> >   b.0_4 = b;
> >   a.1_5 = a;
> >   memcpy (a.1_5, b.0_4, c_2(D));
> > 
> >   <bb 4>:
> >   return;
> > 
> > }
> > and we have no useful value range on c_2 because assert_expr was removed,
> > while in 64bit version there is a cast in bb 3 that preserves the info.
> > Solving this is an independent (and I guess not terribly easy) problem.
> 
> Hmm, I thought Jakub fixed this already (with the checking whether
> there are any uses of c_2(D) before the conditional)?  Or is this
> a different case?

It was a different case, I've only handled the case where you have
if (c >= 10) __builtin_unreachable ();
and c doesn't have any immediate uses before this form of assertion.
In Honza's testcase there is no __builtin_unreachable, but instead
return at that point, changing the value range of c_2(D) in his case
would be far more controversial - for __builtin_unreachable () it means
if you pass c_2(D) 10 to the function and the if (c >= 10) test is
reached, it will be invalid (from this POV that transformation isn't
100% valid either, because there is no proof that if you call the function,
then that condition stmt will be executed).  While in the above case
c_2(D) of 10 is completely valid.

Perhaps better might be for __builtin_unreachable () to just add
there an internal pass through call (perhaps doing nothing at all)
that would just preserve the range and non-zero bits info, we could perhaps
allow some extra optimizations on it (like, if we propagate to the first
argument anything other than some SSA_NAME, we just remove the call and
propagate it further), the question is what else should we do so that it
doesn't inhibit some optimizations unnecessarily.  But if the user
cared enough to insert an __builtin_unreachable () assertion, perhaps it
might be worth preserving that info.

That said, for Honza's case keeping around some internal pass through call
might be even far more expensive.

	Jakub
diff mbox

Patch

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 204984)
+++ ChangeLog	(working copy)
@@ -1,3 +1,10 @@ 
+2013-11-18  Jan Hubicka  <jh@suse.cz>
+	    Uros Bizjak  <ubizjak@gmail.com>
+
+	PR middle-end/59175
+	* gcc.target/i386/memcpy-2.c: Fix template;
+	add +1 so the testcase passes at 32bit.
+
 2013-11-18  Dominique d'Humieres  <dominiq@lps.ens.fr>
 
 	* c-c++-common/cilk-plus/PS/reduction-3.c: Use stdlib.h.
Index: gcc.target/i386/memcpy-2.c
===================================================================
--- gcc.target/i386/memcpy-2.c	(revision 204984)
+++ gcc.target/i386/memcpy-2.c	(working copy)
@@ -1,11 +1,11 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2" } */
-/* Memcpy should be inlined because block size is known.  */
-/* { dg-final { scan-assembler-not "memcpy" } } */
 void *a;
 void *b;
 t(unsigned int c)
 {
   if (c<10)
-    memcpy (a,b,c);
+    __builtin_memcpy (a,b,c+1);
 }
+/* Memcpy should be inlined because block size is known.  */
+/* { dg-final { scan-assembler-not "(jmp|call)\[\\t \]*memcpy" } } */