Message ID | VI1PR0802MB21767511F259C08E5E23F7A2E7190@VI1PR0802MB2176.eurprd08.prod.outlook.com |
---|---|
State | New |
On Tue, Apr 18, 2017 at 12:54 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> When loop versioning is required in vectorization, we can merge niter check for vect
> peeling with the check for loop versioning, thus save one check/branch for vectorized
> loop.
> Is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
> 2017-04-11 Bin Cheng <bin.cheng@arm.com>
>
>     * tree-vect-loop-manip.c (vect_do_peeling): Don't skip vector loop
>     if versioning is required.
>     * tree-vect-loop.c (vect_analyze_loop_2): Merge niter check for loop
>     peeling with the check for versioning.

On Thu, May 11, 2017 at 12:06 PM, Richard Biener <richard.guenther@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 12:54 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> When loop versioning is required in vectorization, we can merge niter check for vect
>> peeling with the check for loop versioning, thus save one check/branch for vectorized
>> loop.
>> Is it OK?
>
> Ok.

Applied @r248959.

Thanks,
bin

>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>> 2017-04-11 Bin Cheng <bin.cheng@arm.com>
>>
>>     * tree-vect-loop-manip.c (vect_do_peeling): Don't skip vector loop
>>     if versioning is required.
>>     * tree-vect-loop.c (vect_analyze_loop_2): Merge niter check for loop
>>     peeling with the check for versioning.

Another one sorry, but:

Bin Cheng <Bin.Cheng@arm.com> writes:
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index af874e7..98caa5e 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -2214,6 +2214,36 @@ start_over:
>          }
>      }
>
> +  /* During peeling, we need to check if number of loop iterations is
> +     enough for both peeled prolog loop and vector loop. This check
> +     can be merged along with threshold check of loop versioning, so
> +     increase threshold for this case if necessary. */
> +  if (LOOP_REQUIRES_VERSIONING (loop_vinfo)
> +      && (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
> +          || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)))
> +    {
> +      unsigned niters_th;
> +
> +      /* Niters for peeled prolog loop. */
> +      if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0)
> +        {
> +          struct data_reference *dr = LOOP_VINFO_UNALIGNED_DR (loop_vinfo);
> +          tree vectype = STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr)));
> +
> +          niters_th = TYPE_VECTOR_SUBPARTS (vectype) - 1;
> +        }
> +      else
> +        niters_th = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo);
> +
> +      /* Niters for at least one iteration of vectorized loop. */
> +      niters_th += LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> +      /* One additional iteration because of peeling for gap. */
> +      if (!LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo))
> +        niters_th++;

is the ! intentional here? It looks like it should be adding 1 when
peeling for gaps _is_ needed.

> +      if (LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) < niters_th)
> +        LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = niters_th;
> +    }
> +
>    gcc_assert (vectorization_factor
>                == (unsigned)LOOP_VINFO_VECT_FACTOR (loop_vinfo));

On Sat, Jun 10, 2017 at 11:06 AM, Richard Sandiford <richard.sandiford@linaro.org> wrote:
> Another one sorry, but:
>
> Bin Cheng <Bin.Cheng@arm.com> writes:
>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>> index af874e7..98caa5e 100644
>> --- a/gcc/tree-vect-loop.c
>> +++ b/gcc/tree-vect-loop.c
>> @@ -2214,6 +2214,36 @@ start_over:
>>          }
>>      }
>>
>> +  /* During peeling, we need to check if number of loop iterations is
>> +     enough for both peeled prolog loop and vector loop. This check
>> +     can be merged along with threshold check of loop versioning, so
>> +     increase threshold for this case if necessary. */
>> +  if (LOOP_REQUIRES_VERSIONING (loop_vinfo)
>> +      && (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
>> +          || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)))
>> +    {
>> +      unsigned niters_th;
>> +
>> +      /* Niters for peeled prolog loop. */
>> +      if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0)
>> +        {
>> +          struct data_reference *dr = LOOP_VINFO_UNALIGNED_DR (loop_vinfo);
>> +          tree vectype = STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr)));
>> +
>> +          niters_th = TYPE_VECTOR_SUBPARTS (vectype) - 1;
>> +        }
>> +      else
>> +        niters_th = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo);
>> +
>> +      /* Niters for at least one iteration of vectorized loop. */
>> +      niters_th += LOOP_VINFO_VECT_FACTOR (loop_vinfo);
>> +      /* One additional iteration because of peeling for gap. */
>> +      if (!LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo))
>> +        niters_th++;
>
> is the ! intentional here? It looks like it should be adding 1 when
> peeling for gaps _is_ needed.

Hi Richard,
Thanks for spotting this. This one is a typo on my side. The comment says
one additional iteration for peeling for gaps, but the code does the opposite.
How to fix this depends on the answer to the previous question. If th stands
for the minimum niters of the vector loop, we need:

>> +      /* One additional iteration because of peeling for gap. */
>> +      if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo))
>> +        niters_th++;

If it stands for the maximum niters of the scalar loop, we need:

>> +      /* One additional iteration because of peeling for gap. */
>> +      if (!LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo))
>> +        niters_th--;

Thanks,
bin

>
>> +      if (LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) < niters_th)
>> +        LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = niters_th;
>> +    }
>> +
>>    gcc_assert (vectorization_factor
>>                == (unsigned)LOOP_VINFO_VECT_FACTOR (loop_vinfo));

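For reference, the first of the two fixes in the reply above (th read as the minimum niters needed to enter the vector loop) can be sketched outside GCC roughly as follows. This is only an illustration: `struct peel_info`, `min_niters_threshold` and `merged_threshold` are hypothetical names standing in for the `LOOP_VINFO_*` accessors, not GCC code, and which reading of th is correct was still open at this point in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the loop_vec_info fields the patch reads.  */
struct peel_info
{
  int peeling_for_alignment;     /* < 0: prolog niters unknown at compile time.  */
  unsigned vector_subparts;      /* Elements per vector of the unaligned ref.  */
  unsigned vect_factor;          /* Vectorization factor.  */
  bool peeling_for_gaps;         /* Epilog must run at least one iteration.  */
  unsigned cost_model_threshold; /* Threshold already chosen by the cost model.  */
};

/* Minimum niters for prolog + one vector iteration (+ one for gaps),
   assuming th means "minimum niters of the vector loop".  */
static unsigned
min_niters_threshold (const struct peel_info *info)
{
  unsigned niters_th;

  assert (info->vect_factor > 0 && info->vector_subparts > 0);

  /* Niters for the peeled prolog loop: worst case when unknown.  */
  if (info->peeling_for_alignment < 0)
    niters_th = info->vector_subparts - 1;
  else
    niters_th = (unsigned) info->peeling_for_alignment;

  /* At least one iteration of the vectorized loop.  */
  niters_th += info->vect_factor;

  /* One additional iteration when peeling for gaps IS needed.  */
  if (info->peeling_for_gaps)
    niters_th++;

  return niters_th;
}

/* The versioning threshold is only ever raised, never lowered.  */
static unsigned
merged_threshold (const struct peel_info *info)
{
  unsigned niters_th = min_niters_threshold (info);

  return info->cost_model_threshold < niters_th
	 ? niters_th : info->cost_model_threshold;
}
```

With an unknown prolog count, 4 elements per vector, a vectorization factor of 4 and peeling for gaps, the sketch yields 3 + 4 + 1 = 8.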
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 0fc8cd3..0ff474d 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -1686,9 +1686,11 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
   /* Prolog loop may be skipped. */
   bool skip_prolog = (prolog_peeling != 0);
-  /* Skip to epilog if scalar loop may be preferred. It's only used when
-     we peel for epilog loop. */
-  bool skip_vector = (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo));
+  /* Skip to epilog if scalar loop may be preferred. It's only needed
+     when we peel for epilog loop and when it hasn't been checked with
+     loop versioning. */
+  bool skip_vector = (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+                      && !LOOP_REQUIRES_VERSIONING (loop_vinfo));
   /* Epilog loop must be executed if the number of iterations for epilog
      loop is known at compile time, otherwise we need to add a check at
      the end of vector loop and skip to the end of epilog loop. */
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index af874e7..98caa5e 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2214,6 +2214,36 @@ start_over:
         }
     }
 
+  /* During peeling, we need to check if number of loop iterations is
+     enough for both peeled prolog loop and vector loop. This check
+     can be merged along with threshold check of loop versioning, so
+     increase threshold for this case if necessary. */
+  if (LOOP_REQUIRES_VERSIONING (loop_vinfo)
+      && (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
+          || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)))
+    {
+      unsigned niters_th;
+
+      /* Niters for peeled prolog loop. */
+      if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0)
+        {
+          struct data_reference *dr = LOOP_VINFO_UNALIGNED_DR (loop_vinfo);
+          tree vectype = STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr)));
+
+          niters_th = TYPE_VECTOR_SUBPARTS (vectype) - 1;
+        }
+      else
+        niters_th = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo);
+
+      /* Niters for at least one iteration of vectorized loop. */
+      niters_th += LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+      /* One additional iteration because of peeling for gap. */
+      if (!LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo))
+        niters_th++;
+      if (LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) < niters_th)
+        LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = niters_th;
+    }
+
   gcc_assert (vectorization_factor
               == (unsigned)LOOP_VINFO_VECT_FACTOR (loop_vinfo));

Hi,
When loop versioning is required in vectorization, we can merge niter check for vect
peeling with the check for loop versioning, thus save one check/branch for vectorized
loop.
Is it OK?

Thanks,
bin

2017-04-11 Bin Cheng <bin.cheng@arm.com>

    * tree-vect-loop-manip.c (vect_do_peeling): Don't skip vector loop
    if versioning is required.
    * tree-vect-loop.c (vect_analyze_loop_2): Merge niter check for loop
    peeling with the check for versioning.

From bd54e2524a4047328ba4847ad013db2bbe5850fe Mon Sep 17 00:00:00 2001
From: Bin Cheng <binche01@e108451-lin.cambridge.arm.com>
Date: Thu, 16 Mar 2017 16:40:50 +0000
Subject: [PATCH 32/33] save-vect_peeling-niters-check-20170225.txt

---
 gcc/tree-vect-loop-manip.c |  8 +++++---
 gcc/tree-vect-loop.c       | 30 ++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 3 deletions(-)
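
The saving described in the cover letter can be pictured with a hand-written analogue of a versioned loop. This is a rough sketch, not what GCC emits: `VF`, `ranges_disjoint` and `add_one_merged` are hypothetical names, the inner `j` loop merely stands in for one vector operation, and the prolog loop is left out. The point is that once the versioning guard also tests `n >= niters_th` (with `niters_th >= VF`), the vector loop can be entered without a second iteration-count check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VF = 4 };  /* Hypothetical vectorization factor.  */

/* Hypothetical runtime alias check of the kind used for loop versioning.  */
static int
ranges_disjoint (const int *a, const int *b, size_t n)
{
  uintptr_t pa = (uintptr_t) a, pb = (uintptr_t) b;
  size_t bytes = n * sizeof (int);

  return pa + bytes <= pb || pb + bytes <= pa;
}

/* dst[i] = src[i] + 1, with the niters check folded into the versioning guard.  */
static void
add_one_merged (int *dst, const int *src, size_t n, size_t niters_th)
{
  size_t i = 0;

  assert (niters_th >= VF);

  /* Single merged guard: alias check plus niters threshold.  */
  if (n >= niters_th && ranges_disjoint (dst, src, n))
    {
      /* "Vector" loop: the guard guarantees at least one full iteration,
	 so no separate skip-vector check is needed before entering.  */
      do
	{
	  for (size_t j = 0; j < VF; j++)
	    dst[i + j] = src[i + j] + 1;
	  i += VF;
	}
      while (i + VF <= n);
    }

  /* Scalar version when the guard fails, or epilog after the vector loop.  */
  for (; i < n; i++)
    dst[i] = src[i] + 1;
}
```

Without the merge, the vectorized version would need its own `n >= niters_th` test after the alias check had already passed, which is the extra check/branch the patch removes.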