Message ID | CAK=A3=0aokhOEYE3kiEnON3QNY=fXiC=OcO_NjxGepkv_9fCuw@mail.gmail.com |
---|---|
State | New |
Headers | show |
Ping? thanks, Cong On Wed, Mar 19, 2014 at 11:39 AM, Cong Hou <congh@google.com> wrote: > On Tue, Mar 18, 2014 at 4:43 AM, Richard Biener <rguenther@suse.de> wrote: >> >> On Mon, 17 Mar 2014, Cong Hou wrote: >> >> > On Mon, Mar 17, 2014 at 6:44 AM, Richard Biener <rguenther@suse.de> wrote: >> > > On Fri, 14 Mar 2014, Cong Hou wrote: >> > > >> > >> On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener <rguenther@suse.de> wrote: >> > >> > On Fri, 14 Mar 2014, Jakub Jelinek wrote: >> > >> > >> > >> >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote: >> > >> >> > > Consider this fact and if there are alias checks, we can safely remove >> > >> >> > > the epilogue if the maximum trip count of the loop is less than or >> > >> >> > > equal to the calculated threshold. >> > >> >> > >> > >> >> > You have to consider n % vf != 0, so an argument on only maximum >> > >> >> > trip count or threshold cannot work. >> > >> >> >> > >> >> Well, if you only check if maximum trip count is <= vf and you know >> > >> >> that for n < vf the vectorized loop + it's epilogue path will not be taken, >> > >> >> then perhaps you could, but it is a very special case. >> > >> >> Now, the question is when we are guaranteed we enter the scalar versioned >> > >> >> loop instead for n < vf, is that in case of versioning for alias or >> > >> >> versioning for alignment? >> > >> > >> > >> > I think neither - I have plans to do the cost model check together >> > >> > with the versioning condition but didn't get around to implement that. >> > >> > That would allow stronger max bounds for the epilogue loop. >> > >> >> > >> In vect_transform_loop(), check_profitability will be set to true if >> > >> th >= VF-1 and the number of iteration is unknown (we only consider >> > >> unknown trip count here), where th is calculated based on the >> > >> parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum >> > >> value VF-1. If the loop needs to be versioned, then >> > >> check_profitability with true value will be passed to >> > >> vect_loop_versioning(), in which an enhanced loop bound check >> > >> (considering cost) will be built. So I think if the loop is versioned >> > >> and n < VF, then we must enter the scalar version, and in this case >> > >> removing epilogue should be safe when the maximum trip count <= th+1. >> > > >> > > You mean exactly in the case where the profitability check ensures >> > > that n % vf == 0? Thus effectively if n == maximum trip count? >> > > That's quite a special case, no? >> > >> > >> > Yes, it is a special case. But it is in this special case that those >> > warnings are thrown out. Also, I think declaring an array with VF*N as >> > length is not unusual. >> >> Ok, but then for the patch compute the cost model threshold once >> in vect_analyze_loop_2 and store it in a new >> LOOP_VINFO_COST_MODEL_THRESHOLD. > > > Done. > > >> Also you have to check >> the return value from max_stmt_executions_int as that may return >> -1 if the number cannot be computed (or isn't representable in >> a HOST_WIDE_INT). > > > It will be converted to unsigned type so that -1 means infinity. > > >> You also should check for >> LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT which should have the >> same effect on the cost model check. > > > Done. > > >> >> >> The existing condition is already complicated enough - adding new >> stuff warrants comments before the (sub-)checks. > > > OK. Comments added. > > Below is the revised patch. Bootstrapped and tested on a x86-64 machine. > > > Cong > > > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index e1d8666..eceefb3 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,18 @@ > +2014-03-11 Cong Hou <congh@google.com> > + > + PR tree-optimization/60505 > + * tree-vectorizer.h (struct _stmt_vec_info): Add th field as the > + threshold of number of iterations below which no vectorization will be > + done. > + * tree-vect-loop.c (new_loop_vec_info): > + Initialize LOOP_VINFO_COST_MODEL_THRESHOLD. > + * tree-vect-loop.c (vect_analyze_loop_operations): > + Set LOOP_VINFO_COST_MODEL_THRESHOLD. > + * tree-vect-loop.c (vect_transform_loop): > + Use LOOP_VINFO_COST_MODEL_THRESHOLD. > + * tree-vect-loop.c (vect_analyze_loop_2): Check the maximum number > + of iterations of the loop and see if we should build the epilogue. > + > 2014-03-10 Jakub Jelinek <jakub@redhat.com> > > PR ipa/60457 > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index 41b6875..09ec1c0 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,8 @@ > +2014-03-11 Cong Hou <congh@google.com> > + > + PR tree-optimization/60505 > + * gcc.dg/vect/pr60505.c: New test. > + > 2014-03-10 Jakub Jelinek <jakub@redhat.com> > > PR ipa/60457 > diff --git a/gcc/testsuite/gcc.dg/vect/pr60505.c > b/gcc/testsuite/gcc.dg/vect/pr60505.c > new file mode 100644 > index 0000000..6940513 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/pr60505.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-Wall -Werror" } */ > + > +void foo(char *in, char *out, int num) > +{ > + int i; > + char ovec[16] = {0}; > + > + for(i = 0; i < num ; ++i) > + out[i] = (ovec[i] = in[i]); > + out[num] = ovec[num/2]; > +} > diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c > index df6ab6f..1c78e11 100644 > --- a/gcc/tree-vect-loop.c > +++ b/gcc/tree-vect-loop.c > @@ -933,6 +933,7 @@ new_loop_vec_info (struct loop *loop) > LOOP_VINFO_NITERS (res) = NULL; > LOOP_VINFO_NITERS_UNCHANGED (res) = NULL; > LOOP_VINFO_COST_MODEL_MIN_ITERS (res) = 0; > + LOOP_VINFO_COST_MODEL_THRESHOLD (res) = 0; > LOOP_VINFO_VECTORIZABLE_P (res) = 0; > LOOP_VINFO_PEELING_FOR_ALIGNMENT (res) = 0; > LOOP_VINFO_VECT_FACTOR (res) = 0; > @@ -1579,6 +1580,8 @@ vect_analyze_loop_operations (loop_vec_info > loop_vinfo, bool slp) > || min_profitable_iters > min_scalar_loop_bound)) > th = (unsigned) min_profitable_iters; > > + LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = th; > + > if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) > && LOOP_VINFO_INT_NITERS (loop_vinfo) <= th) > { > @@ -1625,6 +1628,7 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) > bool ok, slp = false; > int max_vf = MAX_VECTORIZATION_FACTOR; > int min_vf = 2; > + unsigned int th; > > /* Find all data references in the loop (which correspond to vdefs/vuses) > and analyze their evolution in the loop. Also adjust the minimal > @@ -1769,6 +1773,10 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) > > /* Decide whether we need to create an epilogue loop to handle > remaining scalar iterations. */ > + th = ((LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) + 1) > + / LOOP_VINFO_VECT_FACTOR (loop_vinfo)) > + * LOOP_VINFO_VECT_FACTOR (loop_vinfo); > + > if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) > && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0) > { > @@ -1779,7 +1787,14 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) > } > else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > || (tree_ctz (LOOP_VINFO_NITERS (loop_vinfo)) > - < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo)))) > + < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo)) > + /* In case of versioning, check if the maximum number of > + iterations is greater than th. If they are identical, > + the epilogue is unnecessary. */ > + && ((!LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo) > + && !LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo)) > + || (unsigned HOST_WIDE_INT)max_stmt_executions_int > + (LOOP_VINFO_LOOP (loop_vinfo)) > th))) > LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true; > > /* If an epilogue loop is required make sure we can create one. */ > @@ -5775,9 +5790,7 @@ vect_transform_loop (loop_vec_info loop_vinfo) > by our caller. If the threshold makes all loops profitable that > run at least the vectorization factor number of times checking > is pointless, too. */ > - th = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND) > - * LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 1); > - th = MAX (th, LOOP_VINFO_COST_MODEL_MIN_ITERS (loop_vinfo)); > + th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo); > if (th >= LOOP_VINFO_VECT_FACTOR (loop_vinfo) - 1 > && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)) > { > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > index f8efe47..f2087e2 100644 > --- a/gcc/tree-vectorizer.h > +++ b/gcc/tree-vectorizer.h > @@ -264,6 +264,11 @@ typedef struct _loop_vec_info { > values unknown at compile time. */ > int min_profitable_iters; > > + /* Threshold of number of iterations below which vectorzation will not be > + performed. It is calculated from MIN_PROFITABLE_ITERS and > + PARAM_MIN_VECT_LOOP_BOUND. */ > + unsigned int th; > + > /* Is the loop vectorizable? */ > bool vectorizable; > > @@ -382,6 +387,7 @@ typedef struct _loop_vec_info { > cost model. */ > #define LOOP_VINFO_NITERS_UNCHANGED(L) (L)->num_iters_unchanged > #define LOOP_VINFO_COST_MODEL_MIN_ITERS(L) (L)->min_profitable_iters > +#define LOOP_VINFO_COST_MODEL_THRESHOLD(L) (L)->th > #define LOOP_VINFO_VECTORIZABLE_P(L) (L)->vectorizable > #define LOOP_VINFO_VECT_FACTOR(L) (L)->vectorization_factor > #define LOOP_VINFO_PTR_MASK(L) (L)->ptr_mask > > > > > >> >> >> Richard.
On Fri, 28 Mar 2014, Cong Hou wrote: > Ping? Ok. Thanks, Richard. > > thanks, > Cong > > > On Wed, Mar 19, 2014 at 11:39 AM, Cong Hou <congh@google.com> wrote: > > On Tue, Mar 18, 2014 at 4:43 AM, Richard Biener <rguenther@suse.de> wrote: > >> > >> On Mon, 17 Mar 2014, Cong Hou wrote: > >> > >> > On Mon, Mar 17, 2014 at 6:44 AM, Richard Biener <rguenther@suse.de> wrote: > >> > > On Fri, 14 Mar 2014, Cong Hou wrote: > >> > > > >> > >> On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener <rguenther@suse.de> wrote: > >> > >> > On Fri, 14 Mar 2014, Jakub Jelinek wrote: > >> > >> > > >> > >> >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote: > >> > >> >> > > Consider this fact and if there are alias checks, we can safely remove > >> > >> >> > > the epilogue if the maximum trip count of the loop is less than or > >> > >> >> > > equal to the calculated threshold. > >> > >> >> > > >> > >> >> > You have to consider n % vf != 0, so an argument on only maximum > >> > >> >> > trip count or threshold cannot work. > >> > >> >> > >> > >> >> Well, if you only check if maximum trip count is <= vf and you know > >> > >> >> that for n < vf the vectorized loop + it's epilogue path will not be taken, > >> > >> >> then perhaps you could, but it is a very special case. > >> > >> >> Now, the question is when we are guaranteed we enter the scalar versioned > >> > >> >> loop instead for n < vf, is that in case of versioning for alias or > >> > >> >> versioning for alignment? > >> > >> > > >> > >> > I think neither - I have plans to do the cost model check together > >> > >> > with the versioning condition but didn't get around to implement that. > >> > >> > That would allow stronger max bounds for the epilogue loop. > >> > >> > >> > >> In vect_transform_loop(), check_profitability will be set to true if > >> > >> th >= VF-1 and the number of iteration is unknown (we only consider > >> > >> unknown trip count here), where th is calculated based on the > >> > >> parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum > >> > >> value VF-1. If the loop needs to be versioned, then > >> > >> check_profitability with true value will be passed to > >> > >> vect_loop_versioning(), in which an enhanced loop bound check > >> > >> (considering cost) will be built. So I think if the loop is versioned > >> > >> and n < VF, then we must enter the scalar version, and in this case > >> > >> removing epilogue should be safe when the maximum trip count <= th+1. > >> > > > >> > > You mean exactly in the case where the profitability check ensures > >> > > that n % vf == 0? Thus effectively if n == maximum trip count? > >> > > That's quite a special case, no? > >> > > >> > > >> > Yes, it is a special case. But it is in this special case that those > >> > warnings are thrown out. Also, I think declaring an array with VF*N as > >> > length is not unusual. > >> > >> Ok, but then for the patch compute the cost model threshold once > >> in vect_analyze_loop_2 and store it in a new > >> LOOP_VINFO_COST_MODEL_THRESHOLD. > > > > > > Done. > > > > > >> Also you have to check > >> the return value from max_stmt_executions_int as that may return > >> -1 if the number cannot be computed (or isn't representable in > >> a HOST_WIDE_INT). > > > > > > It will be converted to unsigned type so that -1 means infinity. > > > > > >> You also should check for > >> LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT which should have the > >> same effect on the cost model check. > > > > > > Done. > > > > > >> > >> > >> The existing condition is already complicated enough - adding new > >> stuff warrants comments before the (sub-)checks. > > > > > > OK. Comments added. > > > > Below is the revised patch. Bootstrapped and tested on a x86-64 machine. > > > > > > Cong > > > > > > > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > > index e1d8666..eceefb3 100644 > > --- a/gcc/ChangeLog > > +++ b/gcc/ChangeLog > > @@ -1,3 +1,18 @@ > > +2014-03-11 Cong Hou <congh@google.com> > > + > > + PR tree-optimization/60505 > > + * tree-vectorizer.h (struct _stmt_vec_info): Add th field as the > > + threshold of number of iterations below which no vectorization will be > > + done. > > + * tree-vect-loop.c (new_loop_vec_info): > > + Initialize LOOP_VINFO_COST_MODEL_THRESHOLD. > > + * tree-vect-loop.c (vect_analyze_loop_operations): > > + Set LOOP_VINFO_COST_MODEL_THRESHOLD. > > + * tree-vect-loop.c (vect_transform_loop): > > + Use LOOP_VINFO_COST_MODEL_THRESHOLD. > > + * tree-vect-loop.c (vect_analyze_loop_2): Check the maximum number > > + of iterations of the loop and see if we should build the epilogue. > > + > > 2014-03-10 Jakub Jelinek <jakub@redhat.com> > > > > PR ipa/60457 > > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > > index 41b6875..09ec1c0 100644 > > --- a/gcc/testsuite/ChangeLog > > +++ b/gcc/testsuite/ChangeLog > > @@ -1,3 +1,8 @@ > > +2014-03-11 Cong Hou <congh@google.com> > > + > > + PR tree-optimization/60505 > > + * gcc.dg/vect/pr60505.c: New test. > > + > > 2014-03-10 Jakub Jelinek <jakub@redhat.com> > > > > PR ipa/60457 > > diff --git a/gcc/testsuite/gcc.dg/vect/pr60505.c > > b/gcc/testsuite/gcc.dg/vect/pr60505.c > > new file mode 100644 > > index 0000000..6940513 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/vect/pr60505.c > > @@ -0,0 +1,12 @@ > > +/* { dg-do compile } */ > > +/* { dg-additional-options "-Wall -Werror" } */ > > + > > +void foo(char *in, char *out, int num) > > +{ > > + int i; > > + char ovec[16] = {0}; > > + > > + for(i = 0; i < num ; ++i) > > + out[i] = (ovec[i] = in[i]); > > + out[num] = ovec[num/2]; > > +} > > diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c > > index df6ab6f..1c78e11 100644 > > --- a/gcc/tree-vect-loop.c > > +++ b/gcc/tree-vect-loop.c > > @@ -933,6 +933,7 @@ new_loop_vec_info (struct loop *loop) > > LOOP_VINFO_NITERS (res) = NULL; > > LOOP_VINFO_NITERS_UNCHANGED (res) = NULL; > > LOOP_VINFO_COST_MODEL_MIN_ITERS (res) = 0; > > + LOOP_VINFO_COST_MODEL_THRESHOLD (res) = 0; > > LOOP_VINFO_VECTORIZABLE_P (res) = 0; > > LOOP_VINFO_PEELING_FOR_ALIGNMENT (res) = 0; > > LOOP_VINFO_VECT_FACTOR (res) = 0; > > @@ -1579,6 +1580,8 @@ vect_analyze_loop_operations (loop_vec_info > > loop_vinfo, bool slp) > > || min_profitable_iters > min_scalar_loop_bound)) > > th = (unsigned) min_profitable_iters; > > > > + LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = th; > > + > > if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) > > && LOOP_VINFO_INT_NITERS (loop_vinfo) <= th) > > { > > @@ -1625,6 +1628,7 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) > > bool ok, slp = false; > > int max_vf = MAX_VECTORIZATION_FACTOR; > > int min_vf = 2; > > + unsigned int th; > > > > /* Find all data references in the loop (which correspond to vdefs/vuses) > > and analyze their evolution in the loop. Also adjust the minimal > > @@ -1769,6 +1773,10 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) > > > > /* Decide whether we need to create an epilogue loop to handle > > remaining scalar iterations. */ > > + th = ((LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) + 1) > > + / LOOP_VINFO_VECT_FACTOR (loop_vinfo)) > > + * LOOP_VINFO_VECT_FACTOR (loop_vinfo); > > + > > if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) > > && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0) > > { > > @@ -1779,7 +1787,14 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) > > } > > else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > > || (tree_ctz (LOOP_VINFO_NITERS (loop_vinfo)) > > - < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo)))) > > + < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo)) > > + /* In case of versioning, check if the maximum number of > > + iterations is greater than th. If they are identical, > > + the epilogue is unnecessary. */ > > + && ((!LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo) > > + && !LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo)) > > + || (unsigned HOST_WIDE_INT)max_stmt_executions_int > > + (LOOP_VINFO_LOOP (loop_vinfo)) > th))) > > LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true; > > > > /* If an epilogue loop is required make sure we can create one. */ > > @@ -5775,9 +5790,7 @@ vect_transform_loop (loop_vec_info loop_vinfo) > > by our caller. If the threshold makes all loops profitable that > > run at least the vectorization factor number of times checking > > is pointless, too. */ > > - th = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND) > > - * LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 1); > > - th = MAX (th, LOOP_VINFO_COST_MODEL_MIN_ITERS (loop_vinfo)); > > + th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo); > > if (th >= LOOP_VINFO_VECT_FACTOR (loop_vinfo) - 1 > > && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)) > > { > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > > index f8efe47..f2087e2 100644 > > --- a/gcc/tree-vectorizer.h > > +++ b/gcc/tree-vectorizer.h > > @@ -264,6 +264,11 @@ typedef struct _loop_vec_info { > > values unknown at compile time. */ > > int min_profitable_iters; > > > > + /* Threshold of number of iterations below which vectorzation will not be > > + performed. It is calculated from MIN_PROFITABLE_ITERS and > > + PARAM_MIN_VECT_LOOP_BOUND. */ > > + unsigned int th; > > + > > /* Is the loop vectorizable? */ > > bool vectorizable; > > > > @@ -382,6 +387,7 @@ typedef struct _loop_vec_info { > > cost model. */ > > #define LOOP_VINFO_NITERS_UNCHANGED(L) (L)->num_iters_unchanged > > #define LOOP_VINFO_COST_MODEL_MIN_ITERS(L) (L)->min_profitable_iters > > +#define LOOP_VINFO_COST_MODEL_THRESHOLD(L) (L)->th > > #define LOOP_VINFO_VECTORIZABLE_P(L) (L)->vectorizable > > #define LOOP_VINFO_VECT_FACTOR(L) (L)->vectorization_factor > > #define LOOP_VINFO_PTR_MASK(L) (L)->ptr_mask > > > > > > > > > > > >> > >> > >> Richard. > >
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index e1d8666..eceefb3 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,18 @@ +2014-03-11 Cong Hou <congh@google.com> + + PR tree-optimization/60505 + * tree-vectorizer.h (struct _stmt_vec_info): Add th field as the + threshold of number of iterations below which no vectorization will be + done. + * tree-vect-loop.c (new_loop_vec_info): + Initialize LOOP_VINFO_COST_MODEL_THRESHOLD. + * tree-vect-loop.c (vect_analyze_loop_operations): + Set LOOP_VINFO_COST_MODEL_THRESHOLD. + * tree-vect-loop.c (vect_transform_loop): + Use LOOP_VINFO_COST_MODEL_THRESHOLD. + * tree-vect-loop.c (vect_analyze_loop_2): Check the maximum number + of iterations of the loop and see if we should build the epilogue. + 2014-03-10 Jakub Jelinek <jakub@redhat.com> PR ipa/60457 diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 41b6875..09ec1c0 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-03-11 Cong Hou <congh@google.com> + + PR tree-optimization/60505 + * gcc.dg/vect/pr60505.c: New test. + 2014-03-10 Jakub Jelinek <jakub@redhat.com> PR ipa/60457 diff --git a/gcc/testsuite/gcc.dg/vect/pr60505.c b/gcc/testsuite/gcc.dg/vect/pr60505.c new file mode 100644 index 0000000..6940513 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/pr60505.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-Wall -Werror" } */ + +void foo(char *in, char *out, int num) +{ + int i; + char ovec[16] = {0}; + + for(i = 0; i < num ; ++i) + out[i] = (ovec[i] = in[i]); + out[num] = ovec[num/2]; +} diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index df6ab6f..1c78e11 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -933,6 +933,7 @@ new_loop_vec_info (struct loop *loop) LOOP_VINFO_NITERS (res) = NULL; LOOP_VINFO_NITERS_UNCHANGED (res) = NULL; LOOP_VINFO_COST_MODEL_MIN_ITERS (res) = 0; + LOOP_VINFO_COST_MODEL_THRESHOLD (res) = 0; LOOP_VINFO_VECTORIZABLE_P (res) = 0; LOOP_VINFO_PEELING_FOR_ALIGNMENT (res) = 0; LOOP_VINFO_VECT_FACTOR (res) = 0; @@ -1579,6 +1580,8 @@ vect_analyze_loop_operations (loop_vec_info loop_vinfo, bool slp) || min_profitable_iters > min_scalar_loop_bound)) th = (unsigned) min_profitable_iters; + LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = th; + if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) && LOOP_VINFO_INT_NITERS (loop_vinfo) <= th) { @@ -1625,6 +1628,7 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) bool ok, slp = false; int max_vf = MAX_VECTORIZATION_FACTOR; int min_vf = 2; + unsigned int th; /* Find all data references in the loop (which correspond to vdefs/vuses) and analyze their evolution in the loop. Also adjust the minimal @@ -1769,6 +1773,10 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) /* Decide whether we need to create an epilogue loop to handle remaining scalar iterations. */ + th = ((LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) + 1) + / LOOP_VINFO_VECT_FACTOR (loop_vinfo)) + * LOOP_VINFO_VECT_FACTOR (loop_vinfo); + if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0) { @@ -1779,7 +1787,14 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo) } else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) || (tree_ctz (LOOP_VINFO_NITERS (loop_vinfo)) - < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo)))) + < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo)) + /* In case of versioning, check if the maximum number of + iterations is greater than th. If they are identical, + the epilogue is unnecessary. */ + && ((!LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo) + && !LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo)) + || (unsigned HOST_WIDE_INT)max_stmt_executions_int + (LOOP_VINFO_LOOP (loop_vinfo)) > th))) LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true; /* If an epilogue loop is required make sure we can create one. */ @@ -5775,9 +5790,7 @@ vect_transform_loop (loop_vec_info loop_vinfo) by our caller. If the threshold makes all loops profitable that run at least the vectorization factor number of times checking is pointless, too. */ - th = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND) - * LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 1); - th = MAX (th, LOOP_VINFO_COST_MODEL_MIN_ITERS (loop_vinfo)); + th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo); if (th >= LOOP_VINFO_VECT_FACTOR (loop_vinfo) - 1 && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)) { diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index f8efe47..f2087e2 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -264,6 +264,11 @@ typedef struct _loop_vec_info { values unknown at compile time. */ int min_profitable_iters; + /* Threshold of number of iterations below which vectorzation will not be + performed. It is calculated from MIN_PROFITABLE_ITERS and + PARAM_MIN_VECT_LOOP_BOUND. */ + unsigned int th; + /* Is the loop vectorizable? */ bool vectorizable; @@ -382,6 +387,7 @@ typedef struct _loop_vec_info { cost model. */ #define LOOP_VINFO_NITERS_UNCHANGED(L) (L)->num_iters_unchanged #define LOOP_VINFO_COST_MODEL_MIN_ITERS(L) (L)->min_profitable_iters +#define LOOP_VINFO_COST_MODEL_THRESHOLD(L) (L)->th #define LOOP_VINFO_VECTORIZABLE_P(L) (L)->vectorizable #define LOOP_VINFO_VECT_FACTOR(L) (L)->vectorization_factor #define LOOP_VINFO_PTR_MASK(L) (L)->ptr_mask