Patchwork cost model patch

Submitter Xinliang David Li
Date Sept. 27, 2013, 5:22 p.m.
Message ID <CAAkRFZJizGjqbwmJ5QTQURjDp9U_1YKqZijOjYmeLF9zf0_MkQ@mail.gmail.com>
Permalink /patch/278653/
State New

Comments

Xinliang David Li - Sept. 27, 2013, 5:22 p.m.
Please review the changes.html change and suggest better wording if possible:

Index: htdocs/gcc-4.9/changes.html
vectorizer is turned on at the expense of not getting the maximum
potential runtime speedup. The 'cheap' model will be the default when
vectorizer is turned on at <code>-O2</code>. To override this, use
option <code>-fvect-cost-model=[cheap|dynamic|unlimited]</code>.
   </ul>

 <h2>New Languages and Language specific improvements</h2>

thanks,

David
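
For readers who want to try the option described in the proposed changes.html entry, here is a minimal sketch of a loop it would affect. The function name add_arrays and the file name add.c are made up for illustration; only the flags -O2, -ftree-loop-vectorize and -fvect-cost-model come from this thread. Because the pointers are not restrict-qualified, vectorizing this loop generally needs runtime alias checks between dst and each source, which is the kind of versioning the cost model decides on.

```c
#include <stddef.h>

/*
 * Hypothetical kernel: two source arrays and one destination.
 * Without restrict, the compiler cannot prove that dst does not
 * overlap a or b, so a vectorized version typically has to be
 * guarded by runtime alias checks.
 *
 * Possible invocations, using only the flags named in this thread:
 *   gcc -O2 -ftree-loop-vectorize -c add.c
 *       (cheap model by default, per the proposal)
 *   gcc -O2 -ftree-loop-vectorize -fvect-cost-model=dynamic -c add.c
 *   gcc -O2 -ftree-loop-vectorize -fvect-cost-model=unlimited -c add.c
 */
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Whether the loop is actually vectorized under each model depends on the target and the final parameter tuning, so treat the commands as a starting point for comparison rather than a statement of expected results.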


On Thu, Sep 26, 2013 at 11:09 AM, Xinliang David Li <davidxl@google.com> wrote:
> On Thu, Sep 26, 2013 at 7:37 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Thu, Sep 26, 2013 at 1:10 AM, Xinliang David Li <davidxl@google.com> wrote:
>>> I took the liberty of picking up Richard's original fvect-cost-model
>>> patch and made some modifications.
>>>
>>> What has not changed:
>>> 1) option -ftree-vect-loop-version is removed;
>>> 2) three cost models are introduced: cheap, dynamic, and unlimited;
>>> 3) unless explicitly specified, cheap model is the default at O2 (e.g.
>>> when -ftree-loop-vectorize is used with -O2), and dynamic mode is the
>>> default for O3 and FDO
>>> 4) alignment based versioning is disabled with cheap model.
>>>
>>> What has changed:
>>> 1) peeling is also disabled with cheap model;
>>> 2) alias check condition limit is reduced with cheap model, but not
>>> completely suppressed. Runtime alias check is a pretty important
>>> enabler.
>>> 3) tree if conversion changes are not included.
>>>
>>> Does this patch look reasonable?
>>
>> In principle yes.  Note that it changes the behavior of -O2 -ftree-vectorize
>> as -ftree-vectorize does not imply changing the default cost model.  I am
>> fine with that, but eventually this will have some testsuite fallout.  This
>> reorg would also need documenting in changes.html to make people
>> aware of this.
>
>
> Here is the proposed change:
>
>
> Index: htdocs/gcc-4.9/changes.html
> ===================================================================
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
> retrieving revision 1.26
> diff -u -r1.26 changes.html
> --- htdocs/gcc-4.9/changes.html 26 Aug 2013 14:16:31 -0000 1.26
> +++ htdocs/gcc-4.9/changes.html 26 Sep 2013 18:02:33 -0000
> @@ -37,6 +37,7 @@
>    <ul>
>      <li>AddressSanitizer, a fast memory error detector, is now
> available on ARM.
>      </li>
> +    <li>GCC introduces a new cost model for vectorizer, called
> 'cheap' model. The new cost model is intended to minimize compile
> time, code size, and potential negative runtime impact introduced when
> vectorizer is turned on at the expense of not getting the maximum
> potential runtime speedup. The 'cheap' model will be the default when
> vectorizer is turned on at <code>-O2</code>. To override this, use
> option <code>-fvect-cost-model=[cheap|dynamic|unlimited]</code>.
>    </ul>
>
>  <h2>New Languages and Language specific improvements</h2>
>
>
>>
>> With completely disabling alignment peeling and alignment versioning
>> you cut out targets that have no way of performing unaligned accesses.
>> From looking at vect_no_align these are mips, sparc, ia64 and some arm.
>> A compromise for them would be to allow peeling a single iteration
>> and some alignment checks (like up to two?).
>>
>
> Possibly. I think target owners can choose to do target specific
> tunings as follow up.
>
>
>> Reducing the number of allowed alias-checks is ok, but I'd reduce it
>> more than to 6 (was that an arbitrary number or is that the result of
>> some benchmarking?)
>>
>
> yes -- we found that it is not uncommon to have a loop with 2 or 3
> distinct source addresses and 1 or 2 target addresses.
>
> There are also tuning opportunities. For instance, in cases where
> source addresses are derived from the same base, a consolidated alias
> check (against the whole access range instead of just checking cross
> 1-unrolled iteration dependence) can be done.
>
>> I suppose all of the params could use some benchmarking to select
>> a sweet spot in code size vs. runtime.
>
> Agree.
>
>
>>
>> I suppose the patch is ok as-is (if it actually works) if you provide
>> a changelog and propose an entry for changes.html.  We can
>> tune the params for the cheap model as followup.
>
> Ok. I will do more testing and check in the patch with proper
> ChangeLog. The changes.html change will be done separately.
>
> thanks,
>
> David
>
>
>>
>> Thanks for picking this up,
>> Richard.
>>
>>> thanks,
>>>
>>> David
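
The consolidated alias check mentioned in the thread (testing the whole access range once when several sources share a base) can be pictured with a hand-written sketch. This is not what GCC emits and not part of the patch; ranges_overlap and pair_sum are hypothetical names, and the two loop bodies are deliberately identical so that the guard only selects which copy runs.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if byte ranges [p, p + plen) and [q, q + qlen) overlap. */
static int ranges_overlap(const void *p, size_t plen,
                          const void *q, size_t qlen)
{
    uintptr_t ps = (uintptr_t)p;
    uintptr_t qs = (uintptr_t)q;

    return ps < qs + qlen && qs < ps + plen;
}

/*
 * Both sources, base[i] and base[i + 1], derive from the same base,
 * so one check of dst against the whole source range
 * [base, base + n + 1) replaces two separate pairwise checks.
 */
void pair_sum(float *dst, const float *base, size_t n)
{
    size_t i;

    if (!ranges_overlap(dst, n * sizeof *dst,
                        base, (n + 1) * sizeof *base)) {
        /* No overlap: this copy is the one a vectorizer could widen. */
        for (i = 0; i < n; i++)
            dst[i] = base[i] + base[i + 1];
    } else {
        /* Possible overlap: keep the scalar, in-order version. */
        for (i = 0; i < n; i++)
            dst[i] = base[i] + base[i + 1];
    }
}
```

The point of the sketch is only the shape of the guard: one range test per group of same-base accesses, so the number of runtime checks grows with the number of distinct bases rather than the number of references.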

Patch

===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.26
diff -u -r1.26 changes.html
--- htdocs/gcc-4.9/changes.html 26 Aug 2013 14:16:31 -0000 1.26
+++ htdocs/gcc-4.9/changes.html 26 Sep 2013 18:02:33 -0000
@@ -37,6 +37,7 @@ 
   <ul>
     <li>AddressSanitizer, a fast memory error detector, is now
available on ARM.
     </li>
+    <li>GCC introduces a new cost model for vectorizer, called
'cheap' model. The new cost model is intended to minimize compile
time, code size, and potential negative runtime impact introduced when