Message ID | 5d7ca95c-8a61-6d07-dd5b-e7c2d03072ff@codesourcery.com |
---|---|
State | New |
Headers | show |
Series | GCN: Add -mlow-precision-sqrt for double-precision sqrt [PR105246] (was: Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations) | expand |
On 09/09/2022 13:20, Tobias Burnus wrote: > However, the pre-existing 'sqrt' problem still is real. It also applies > to reverse sqrt ("v_rsq"), but that's for whatever reason not used for GCN. > > This patch now adds a commandline flag - off by default - to choose > whether this behavior is wanted. I did use the same name as aarch64, > https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#index-mlow-precision-sqrt (the latter also has -mlow-precision-recip-sqrt, which is not (yet) sensible for GCN.) > > This patch was manually tested for all combinations and I also looked at > insn-recog.cc, given that it is my first .md patch – it it seems to work > fine. > > OK for mainline – or are there comments or more suggestions? I also > included some word for the release notes. No, thank you. I don't see any value in adding an option no one cares about (but we still have to maintain and test). I think it will make sense to drop the double-precision insn definition and fall back to libm in that case. Kwok is currently reviewing all the libm functions and can probably include this one. Andrew
gcc-13/changes.html - GCN: document -mlow-precision-sqrt diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html index 390193ca..d335eab3 100644 --- a/htdocs/gcc-13/changes.html +++ b/htdocs/gcc-13/changes.html @@ -179,6 +179,12 @@ a work-in-progress.</p> <li>Support for the Instinct MI200 series devices (<a href="https://gcc.gnu.org/onlinedocs/gcc/AMD-GCN-Options.html"> <code>gfx90a</code></a>) has been added.</li> + <li>The <code>-mlow-precision-sqrt</code> option (disabled by default) + has been added to use the hardware <code>sqrt</code> also for + double-precision floating point arguments; note that the result + only has much a reduced accurary of 2<sup>29</sup> ULP. This + option requires <code>-funsafe-math-optimizations</code> + (implied by <code>-ffast-math</code>) in addition.</li> </ul> <!-- <h3 id="arc">ARC</h3> -->