Patchwork Revision 166517 caused PR 46414

login
register
mail settings
Submitter H.J. Lu
Date Nov. 10, 2010, 5:38 p.m.
Message ID <AANLkTi=k4FA73id1SSQBAY5cK2ymRV8CoFtEd858LhYk@mail.gmail.com>
Download mbox | patch
Permalink /patch/70662/
State New
Headers show

Comments

H.J. Lu - Nov. 10, 2010, 5:38 p.m.
2010/11/10 Jan Hubicka <hubicka@ucw.cz>:
>> 2010/11/10 Jan Hubicka <hubicka@ucw.cz>:
>> >> On Wed, Nov 10, 2010 at 7:18 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> >> The complette unrilling seems sane here.
>> >> >> This seems like good idea to me and i am not quite sure why 32bit compilation does not do the
>> >> >> transofrm. I will check now.
>> >> >
>> >> > The reason is word size.  We compute move costs based on this, so vectorized moves are more expensive
>> >> > in 32bit cost model than 64bit.
>> >> > Perhaps there is way to hook vectorizer cost model in here?
>> >> >
>> >>
>> >> Shouldn't we use vector size to compute move cost for vector move?
>> >
>> > That would work for me.  I am not terribly familiar with vector costs here, could you try to patch
>> > estimate_move_cost in tree-inline.c?
>> > We are very simplistic here assuming that pretty much all operations have cost of 1, so I guess
>> > making all vector moves to have cost 1 is fine. The main reason for that function is to catch large
>> > structure copies.
>> >
>>
>> How about this patch?
>> +  if (TREE_CODE (type) == VECTOR_TYPE)
>> +    {
>> +      enum machine_mode inner = TYPE_MODE (TREE_TYPE (type));
>> +      enum machine_mode simd
>> +     = targetm.vectorize.preferred_simd_mode (inner);
>> +      return GET_MODE_SIZE (TYPE_MODE (type)) / GET_MODE_SIZE (simd);
>> +    }
>
> You need to round up, so we don't end up with cost of 0 for small blocks. Other than that it seems fine to me.
> We will probably eventually need to also update the expander to use vector instructions when possible to match
> the costs, but I guess most of vectors we see in the code are already of proper size.
>

Like this?  OK for trunk?

Thanks.
Jan Hubicka - Nov. 10, 2010, 7:56 p.m.
> 2010/11/10 Jan Hubicka <hubicka@ucw.cz>:
> >> 2010/11/10 Jan Hubicka <hubicka@ucw.cz>:
> >> >> On Wed, Nov 10, 2010 at 7:18 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >> >> The complette unrilling seems sane here.
> >> >> >> This seems like good idea to me and i am not quite sure why 32bit compilation does not do the
> >> >> >> transofrm. I will check now.
> >> >> >
> >> >> > The reason is word size.  We compute move costs based on this, so vectorized moves are more expensive
> >> >> > in 32bit cost model than 64bit.
> >> >> > Perhaps there is way to hook vectorizer cost model in here?
> >> >> >
> >> >>
> >> >> Shouldn't we use vector size to compute move cost for vector move?
> >> >
> >> > That would work for me.  I am not terribly familiar with vector costs here, could you try to patch
> >> > estimate_move_cost in tree-inline.c?
> >> > We are very simplistic here assuming that pretty much all operations have cost of 1, so I guess
> >> > making all vector moves to have cost 1 is fine. The main reason for that function is to catch large
> >> > structure copies.
> >> >
> >>
> >> How about this patch?
> >> +  if (TREE_CODE (type) == VECTOR_TYPE)
> >> +    {
> >> +      enum machine_mode inner = TYPE_MODE (TREE_TYPE (type));
> >> +      enum machine_mode simd
> >> +     = targetm.vectorize.preferred_simd_mode (inner);
> >> +      return GET_MODE_SIZE (TYPE_MODE (type)) / GET_MODE_SIZE (simd);
> >> +    }
> >
> > You need to round up, so we don't end up with cost of 0 for small blocks. Other than that it seems fine to me.
> > We will probably eventually need to also update the expander to use vector instructions when possible to match
> > the costs, but I guess most of vectors we see in the code are already of proper size.
> >
> 
> Like this?  OK for trunk?

This seems fine, thanks!
Honza

Patch

diff --git a/gcc/testsuite/gcc.target/i386/recip-vec-sqrtf-avx.c b/gcc/testsuite/gcc.target/i386/recip-vec-sqrtf-avx.c
index 506df88..5a8e696 100644
--- a/gcc/testsuite/gcc.target/i386/recip-vec-sqrtf-avx.c
+++ b/gcc/testsuite/gcc.target/i386/recip-vec-sqrtf-avx.c
@@ -32,4 +32,4 @@  void t3(void)
 }
 
 /* Last loop is small enough to be fully unrolled.  */
-/* { dg-final { scan-assembler-times "vrsqrtps\[ \\t\]+\[^\n\]*%ymm" 4 } } */
+/* { dg-final { scan-assembler-times "vrsqrtps\[ \\t\]+\[^\n\]*%ymm" 6 } } */
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index fc470a7..2c05835 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3249,6 +3249,16 @@  estimate_move_cost (tree type)
 
   gcc_assert (!VOID_TYPE_P (type));
 
+  if (TREE_CODE (type) == VECTOR_TYPE)
+    {
+      enum machine_mode inner = TYPE_MODE (TREE_TYPE (type));
+      enum machine_mode simd
+	= targetm.vectorize.preferred_simd_mode (inner);
+      int simd_mode_size = GET_MODE_SIZE (simd);
+      return ((GET_MODE_SIZE (TYPE_MODE (type)) + simd_mode_size - 1)
+	      / simd_mode_size);
+    }
+
   size = int_size_in_bytes (type);
 
   if (size < 0 || size > MOVE_MAX_PIECES * MOVE_RATIO (!optimize_size))