diff mbox

Define 3-argument overloads of std::hypot for C++17 (P0030R1)

Message ID 20160927151158.GA6188@redhat.com
State New
Headers show

Commit Message

Jonathan Wakely Sept. 27, 2016, 3:11 p.m. UTC
This adds the new 3D std::hypot() functions. This implementation seems
to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
hypot(hypot(x, y), z), and should be a bit more accurate at very large
or very small values due to reducing the arguments by the largest one.
Improvements welcome though, as this is not my forte.

The test might not be very good, but tests some small integer values
and some other values where accuracy is lost for one or other of the
alternative implementations mentioned above. If this FAILs for some
32-bit targets we might need to adjust the tolerances or the
dg-options.

	* doc/xml/manual/status_cxx2017.xml: Update status.
	* include/c_global/cmath (hypot): Add three-dimensional overloads.
	* testsuite/26_numerics/headers/cmath/hypot.cc: New.

Tested powerpc64le-linux and x86_64-linux, committed to trunk.
commit 33fb40d1445fbc711d64a1ff8b9a0a8f092a2e7e
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Sep 27 15:44:45 2016 +0100

    Define 3-argument overloads of std::hypot for C++17 (P0030R1)
    
    	* doc/xml/manual/status_cxx2017.xml: Update status.
    	* include/c_global/cmath (hypot): Add three-dimensional overloads.
    	* testsuite/26_numerics/headers/cmath/hypot.cc: New.

Comments

Marc Glisse Sept. 27, 2016, 9:50 p.m. UTC | #1
On Tue, 27 Sep 2016, Jonathan Wakely wrote:

> This adds the new 3D std::hypot() functions. This implementation seems
> to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
> hypot(hypot(x, y), z), and should be a bit more accurate at very large
> or very small values due to reducing the arguments by the largest one.
> Improvements welcome though, as this is not my forte.

I understand the claims about accuracy, but the one about speed seems very 
surprising to me. Was your test specifically with denormals on 
not-so-recent hardware, or specifically when sqrt does a library call for 
errno? Otherwise, I have a hard time believing that 3 multiplications and 
2 additions can be slower than 4 multiplications, 2 additions, plus a 
bunch of tests and divisions.
Joseph Myers Sept. 27, 2016, 11:28 p.m. UTC | #2
On Tue, 27 Sep 2016, Jonathan Wakely wrote:

> This adds the new 3D std::hypot() functions. This implementation seems
> to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
> hypot(hypot(x, y), z), and should be a bit more accurate at very large
> or very small values due to reducing the arguments by the largest one.
> Improvements welcome though, as this is not my forte.

Should I take it from this implementation that C++ is not concerned by 
certain considerations that would arise for C: spurious underflow 
exceptions from the scaling when some arguments much larger than others; 
spurious "invalid" exceptions from the comparisons when any argument is 
(quiet) NaN; handling of mixed (quiet) NaN and Inf arguments (where ISO C 
Annex F has Inf returned, not NaN)?
Jonathan Wakely Sept. 28, 2016, 9:51 a.m. UTC | #3
On 27/09/16 23:28 +0000, Joseph Myers wrote:
>On Tue, 27 Sep 2016, Jonathan Wakely wrote:
>
>> This adds the new 3D std::hypot() functions. This implementation seems
>> to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
>> hypot(hypot(x, y), z), and should be a bit more accurate at very large
>> or very small values due to reducing the arguments by the largest one.
>> Improvements welcome though, as this is not my forte.
>
>Should I take it from this implementation that C++ is not concerned by
>certain considerations that would arise for C: spurious underflow
>exceptions from the scaling when some arguments much larger than others;
>spurious "invalid" exceptions from the comparisons when any argument is
>(quiet) NaN; handling of mixed (quiet) NaN and Inf arguments (where ISO C
>Annex F has Inf returned, not NaN)?

The entire spec from the C++ draft is:  Returns: √ x^2 + y^2 + z^2

It doesn't mention "without undue overflow or underflow" as the C
standard does for hypot(x, y). Handling those conditions would of
course be nice, but I'm not capable of doing so, and I'd rather have a
poor implementation than none.

Again, improvements are welcome. I'm not the right person to do that.
Jonathan Wakely Sept. 28, 2016, 10 a.m. UTC | #4
On 27/09/16 23:50 +0200, Marc Glisse wrote:
>On Tue, 27 Sep 2016, Jonathan Wakely wrote:
>
>>This adds the new 3D std::hypot() functions. This implementation seems
>>to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
>>hypot(hypot(x, y), z), and should be a bit more accurate at very large
>>or very small values due to reducing the arguments by the largest one.
>>Improvements welcome though, as this is not my forte.
>
>I understand the claims about accuracy, but the one about speed seems 
>very surprising to me. Was your test specifically with denormals on 
>not-so-recent hardware, or specifically when sqrt does a library call 
>for errno? Otherwise, I have a hard time believing that 3 
>multiplications and 2 additions can be slower than 4 multiplications, 
>2 additions, plus a bunch of tests and divisions.

I didn't test any denormals, just similar inputs to the ones in the
new testcase. Looking at my crappy benchmark again it seems I'd
commented out the sqrt(x*x+y*y+z*z) version in favour of a call to
gsl_hypot3 ... oops!

So what I committed is faster than hypot(hypot(x, y), z) but as you
guessed, slower than the simple sqrt(x*x+y*y+z*z).

I'm happy to defer to you in terms of what would be a better
implementation, as I'm very unlikely to ever need to use this. I don't
know whether we should be optimizing for speed or accuracy. But since
nobody else had contributed an implementation (and this was the one
C++17 feature libc++ has that we don't :-) I decided to add it.
Marc Glisse Sept. 28, 2016, 11:31 a.m. UTC | #5
On Wed, 28 Sep 2016, Jonathan Wakely wrote:

> On 27/09/16 23:28 +0000, Joseph Myers wrote:
>> On Tue, 27 Sep 2016, Jonathan Wakely wrote:
>> 
>>> This adds the new 3D std::hypot() functions. This implementation seems
>>> to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
>>> hypot(hypot(x, y), z), and should be a bit more accurate at very large
>>> or very small values due to reducing the arguments by the largest one.
>>> Improvements welcome though, as this is not my forte.
>> 
>> Should I take it from this implementation that C++ is not concerned by
>> certain considerations that would arise for C: spurious underflow
>> exceptions from the scaling when some arguments much larger than others;
>> spurious "invalid" exceptions from the comparisons when any argument is
>> (quiet) NaN; handling of mixed (quiet) NaN and Inf arguments (where ISO C
>> Annex F has Inf returned, not NaN)?
>
> The entire spec from the C++ draft is:  Returns: √ x^2 + y^2 + z^2
>
> It doesn't mention "without undue overflow or underflow" as the C
> standard does for hypot(x, y). Handling those conditions would of
> course be nice, but I'm not capable of doing so, and I'd rather have a
> poor implementation than none.

p0030r0 had "In addition to the two-argument versions of certain math
functions in <cmath> and its float and long double overloads, C++ adds
three-argument versions of these functions that follow similar semantics
as their two-argument counterparts."

One could interpret that as an intent to follow the C semantics. On the
other hand, the paper mentions speed and convenience as the main selling
points. Maybe LWG could clarify the intent?
Jonathan Wakely Sept. 28, 2016, 11:37 a.m. UTC | #6
On 28/09/16 13:31 +0200, Marc Glisse wrote:
>On Wed, 28 Sep 2016, Jonathan Wakely wrote:
>
>>On 27/09/16 23:28 +0000, Joseph Myers wrote:
>>>On Tue, 27 Sep 2016, Jonathan Wakely wrote:
>>>
>>>>This adds the new 3D std::hypot() functions. This implementation seems
>>>>to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
>>>>hypot(hypot(x, y), z), and should be a bit more accurate at very large
>>>>or very small values due to reducing the arguments by the largest one.
>>>>Improvements welcome though, as this is not my forte.
>>>
>>>Should I take it from this implementation that C++ is not concerned by
>>>certain considerations that would arise for C: spurious underflow
>>>exceptions from the scaling when some arguments much larger than others;
>>>spurious "invalid" exceptions from the comparisons when any argument is
>>>(quiet) NaN; handling of mixed (quiet) NaN and Inf arguments (where ISO C
>>>Annex F has Inf returned, not NaN)?
>>
>>The entire spec from the C++ draft is:  Returns: √ x^2 + y^2 + z^2
>>
>>It doesn't mention "without undue overflow or underflow" as the C
>>standard does for hypot(x, y). Handling those conditions would of
>>course be nice, but I'm not capable of doing so, and I'd rather have a
>>poor implementation than none.
>
>p0030r0 had "In addition to the two-argument versions of certain math
>functions in <cmath> and its float and long double overloads, C++ adds
>three-argument versions of these functions that follow similar semantics
>as their two-argument counterparts."
>
>One could interpret that as an intent to follow the C semantics. On the
>other hand, the paper mentions speed and convenience as the main selling
>points. Maybe LWG could clarify the intent?

I expect the answer will be that it's QoI.
Joseph Myers Sept. 28, 2016, 12:40 p.m. UTC | #7
On Wed, 28 Sep 2016, Jonathan Wakely wrote:

> On 27/09/16 23:28 +0000, Joseph Myers wrote:
> > On Tue, 27 Sep 2016, Jonathan Wakely wrote:
> > 
> > > This adds the new 3D std::hypot() functions. This implementation seems
> > > to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
> > > hypot(hypot(x, y), z), and should be a bit more accurate at very large
> > > or very small values due to reducing the arguments by the largest one.
> > > Improvements welcome though, as this is not my forte.
> > 
> > Should I take it from this implementation that C++ is not concerned by
> > certain considerations that would arise for C: spurious underflow
> > exceptions from the scaling when some arguments much larger than others;
> > spurious "invalid" exceptions from the comparisons when any argument is
> > (quiet) NaN; handling of mixed (quiet) NaN and Inf arguments (where ISO C
> > Annex F has Inf returned, not NaN)?
> 
> The entire spec from the C++ draft is:  Returns: √ x^2 + y^2 + z^2

As a further issue: even if you ignore exceptions and Annex F issues, and 
don't have NaN arguments, if you have an Inf argument it will be the 
largest, and so your code will do Inf/Inf, and so return NaN, if any 
argument is Inf (when obviously the correct answer in that case is Inf).
Jonathan Wakely Sept. 28, 2016, 12:46 p.m. UTC | #8
On 28/09/16 12:40 +0000, Joseph Myers wrote:
>On Wed, 28 Sep 2016, Jonathan Wakely wrote:
>
>> On 27/09/16 23:28 +0000, Joseph Myers wrote:
>> > On Tue, 27 Sep 2016, Jonathan Wakely wrote:
>> >
>> > > This adds the new 3D std::hypot() functions. This implementation seems
>> > > to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
>> > > hypot(hypot(x, y), z), and should be a bit more accurate at very large
>> > > or very small values due to reducing the arguments by the largest one.
>> > > Improvements welcome though, as this is not my forte.
>> >
>> > Should I take it from this implementation that C++ is not concerned by
>> > certain considerations that would arise for C: spurious underflow
>> > exceptions from the scaling when some arguments much larger than others;
>> > spurious "invalid" exceptions from the comparisons when any argument is
>> > (quiet) NaN; handling of mixed (quiet) NaN and Inf arguments (where ISO C
>> > Annex F has Inf returned, not NaN)?
>>
>> The entire spec from the C++ draft is:  Returns: √ x^2 + y^2 + z^2
>
>As a further issue: even if you ignore exceptions and Annex F issues, and
>don't have NaN arguments, if you have an Inf argument it will be the
>largest, and so your code will do Inf/Inf, and so return NaN, if any
>argument is Inf (when obviously the correct answer in that case is Inf).

I've opened https://gcc.gnu.org/PR77776 but I don't plan to work on
it.
Rainer Orth Sept. 29, 2016, 8:54 a.m. UTC | #9
Hi Jonathan,

> This adds the new 3D std::hypot() functions. This implementation seems
> to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
> hypot(hypot(x, y), z), and should be a bit more accurate at very large
> or very small values due to reducing the arguments by the largest one.
> Improvements welcome though, as this is not my forte.
>
> The test might not be very good, but tests some small integer values
> and some other values where accuracy is lost for one or other of the
> alternative implementations mentioned above. If this FAILs for some
> 32-bit targets we might need to adjust the tolerances or the
> dg-options.
>
> 	* doc/xml/manual/status_cxx2017.xml: Update status.
> 	* include/c_global/cmath (hypot): Add three-dimensional overloads.
> 	* testsuite/26_numerics/headers/cmath/hypot.cc: New.
>
> Tested powerpc64le-linux and x86_64-linux, committed to trunk.

the new test currently FAILs on Solaris 12 (both SPARC and x86):

+FAIL: 26_numerics/headers/cmath/hypot.cc (test for excess errors)
+WARNING: 26_numerics/headers/cmath/hypot.cc compilation failed to produce execu
table

Excess errors:
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: no matching function for call to 'hypot(double, double, double)'
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: no matching function for call to 'hypot(double, double, double)'
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: template argument 2 is invalid

and many more.

	Rainer
Jonathan Wakely Sept. 29, 2016, 9:10 a.m. UTC | #10
On 29/09/16 10:54 +0200, Rainer Orth wrote:
>Hi Jonathan,
>
>> This adds the new 3D std::hypot() functions. This implementation seems
>> to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
>> hypot(hypot(x, y), z), and should be a bit more accurate at very large
>> or very small values due to reducing the arguments by the largest one.
>> Improvements welcome though, as this is not my forte.
>>
>> The test might not be very good, but tests some small integer values
>> and some other values where accuracy is lost for one or other of the
>> alternative implementations mentioned above. If this FAILs for some
>> 32-bit targets we might need to adjust the tolerances or the
>> dg-options.
>>
>> 	* doc/xml/manual/status_cxx2017.xml: Update status.
>> 	* include/c_global/cmath (hypot): Add three-dimensional overloads.
>> 	* testsuite/26_numerics/headers/cmath/hypot.cc: New.
>>
>> Tested powerpc64le-linux and x86_64-linux, committed to trunk.
>
>the new test currently FAILs on Solaris 12 (both SPARC and x86):
>
>+FAIL: 26_numerics/headers/cmath/hypot.cc (test for excess errors)
>+WARNING: 26_numerics/headers/cmath/hypot.cc compilation failed to produce execu
>table
>
>Excess errors:
>/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: no matching function for call to 'hypot(double, double, double)'
>/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: no matching function for call to 'hypot(double, double, double)'
>/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: template argument 2 is invalid
>
>and many more.

That would suggest Solaris uses include/c_std/cmath (where I forgot to
add the new overloads) rather than include/c_global/cmath ... is that
right?
Jonathan Wakely Sept. 29, 2016, 9:24 a.m. UTC | #11
On 29/09/16 10:10 +0100, Jonathan Wakely wrote:
>On 29/09/16 10:54 +0200, Rainer Orth wrote:
>>Hi Jonathan,
>>
>>>This adds the new 3D std::hypot() functions. This implementation seems
>>>to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
>>>hypot(hypot(x, y), z), and should be a bit more accurate at very large
>>>or very small values due to reducing the arguments by the largest one.
>>>Improvements welcome though, as this is not my forte.
>>>
>>>The test might not be very good, but tests some small integer values
>>>and some other values where accuracy is lost for one or other of the
>>>alternative implementations mentioned above. If this FAILs for some
>>>32-bit targets we might need to adjust the tolerances or the
>>>dg-options.
>>>
>>>	* doc/xml/manual/status_cxx2017.xml: Update status.
>>>	* include/c_global/cmath (hypot): Add three-dimensional overloads.
>>>	* testsuite/26_numerics/headers/cmath/hypot.cc: New.
>>>
>>>Tested powerpc64le-linux and x86_64-linux, committed to trunk.
>>
>>the new test currently FAILs on Solaris 12 (both SPARC and x86):
>>
>>+FAIL: 26_numerics/headers/cmath/hypot.cc (test for excess errors)
>>+WARNING: 26_numerics/headers/cmath/hypot.cc compilation failed to produce execu
>>table
>>
>>Excess errors:
>>/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: no matching function for call to 'hypot(double, double, double)'
>>/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: no matching function for call to 'hypot(double, double, double)'
>>/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc:38: error: template argument 2 is invalid
>>
>>and many more.
>
>That would suggest Solaris uses include/c_std/cmath (where I forgot to
>add the new overloads) rather than include/c_global/cmath ... is that
>right?

Alternatively it's using c_global/cmath but _GLIBCXX_USE_C99_MATH_TR1
is not defined, as the new overloads are inside that block. I thought
that was defined for Solaris though, which is why we have the
__CORRECT_ISO_CPP11_MATH_H_PROTO there in that file.

Once again I wish we had a Solaris machine in the compile farm, or it
was possible to install a Solaris VM and get OS updates without paying
Oracle for the privilege.
Rainer Orth Sept. 29, 2016, 10:31 a.m. UTC | #12
Hi Jonathan,

> That would suggest Solaris uses include/c_std/cmath (where I forgot to
> add the new overloads) rather than include/c_global/cmath ... is that
> right?

no, include/cmath points to the c_global version.

	Rainer
Rainer Orth Sept. 29, 2016, 10:39 a.m. UTC | #13
Hi Jonathan,

>>That would suggest Solaris uses include/c_std/cmath (where I forgot to
>>add the new overloads) rather than include/c_global/cmath ... is that
>>right?
>
> Alternatively it's using c_global/cmath but _GLIBCXX_USE_C99_MATH_TR1
> is not defined, as the new overloads are inside that block. I thought
> that was defined for Solaris though, which is why we have the
> __CORRECT_ISO_CPP11_MATH_H_PROTO there in that file.

it is indeed, at least initially.  What I see in preprocessed input is
this:

ro@galeras 208 > grep _GLIBCXX_USE_C99_MATH testsuite/normal4/hypot.ii
#define _GLIBCXX_USE_C99_MATH _GLIBCXX11_USE_C99_MATH
#define _GLIBCXX_USE_C99_MATH_TR1 1
#undef _GLIBCXX_USE_C99_MATH
#undef _GLIBCXX_USE_C99_MATH_TR1

It turns out the #undef's are from <math.h>:

#if __cplusplus >= 201103L
#undef  _GLIBCXX_USE_C99_MATH
#undef  _GLIBCXX_USE_C99_MATH_TR1
#endif

No idea what this nonsense is trying to accomplish!  It's already in
Solaris 11.3, however.

> Once again I wish we had a Solaris machine in the compile farm, or it
> was possible to install a Solaris VM and get OS updates without paying
> Oracle for the privilege.

That's easily doable: Solaris is free for development use; you get
access to the release (11, 11.1, 11.2, 11.3, ...) versions, just not to
patches/updates.  They do have a program for free academic use of
Solaris, including updates, these days: https://academy.oracle.com/.

	Rainer
Jonathan Wakely Sept. 29, 2016, 10:47 a.m. UTC | #14
On 29/09/16 12:39 +0200, Rainer Orth wrote:
>Hi Jonathan,
>
>>>That would suggest Solaris uses include/c_std/cmath (where I forgot to
>>>add the new overloads) rather than include/c_global/cmath ... is that
>>>right?
>>
>> Alternatively it's using c_global/cmath but _GLIBCXX_USE_C99_MATH_TR1
>> is not defined, as the new overloads are inside that block. I thought
>> that was defined for Solaris though, which is why we have the
>> __CORRECT_ISO_CPP11_MATH_H_PROTO there in that file.
>
>it is indeed, at least initially.  What I see in preprocessed input is
>this:
>
>ro@galeras 208 > grep _GLIBCXX_USE_C99_MATH testsuite/normal4/hypot.ii
>#define _GLIBCXX_USE_C99_MATH _GLIBCXX11_USE_C99_MATH
>#define _GLIBCXX_USE_C99_MATH_TR1 1
>#undef _GLIBCXX_USE_C99_MATH
>#undef _GLIBCXX_USE_C99_MATH_TR1
>
>It turns out the #undef's are from <math.h>:
>
>#if __cplusplus >= 201103L
>#undef  _GLIBCXX_USE_C99_MATH
>#undef  _GLIBCXX_USE_C99_MATH_TR1
>#endif
>
>No idea what this nonsense is trying to accomplish!  It's already in
>Solaris 11.3, however.

Wow.

If only there was some way the Solaris team could contact us so we
could coordinate and stop adding more and more hacks to mess with each
others headers. But I assume they don't have access to the www or
email, because the only other explanation is too rude to say on a
public list.


>> Once again I wish we had a Solaris machine in the compile farm, or it
>> was possible to install a Solaris VM and get OS updates without paying
>> Oracle for the privilege.
>
>That's easily doable: Solaris is free for development use; you get
>access to the release (11, 11.1, 11.2, 11.3, ...) versions, just not to
>patches/updates.

Yes, but because the libc headers get changed by patches and updates,
I found it was useless to install it in a VM. Because I had outdated
headers I couldn't test how our build system interacts with a fully
updated version of Solaris.

>They do have a program for free academic use of
>Solaris, including updates, these days: https://academy.oracle.com/.

That might be useful, I'll see if I qualify. Thanks.
Rainer Orth Sept. 29, 2016, 10:53 a.m. UTC | #15
Hi Jonathan,

>>It turns out the #undef's are from <math.h>:
>>
>>#if __cplusplus >= 201103L
>>#undef  _GLIBCXX_USE_C99_MATH
>>#undef  _GLIBCXX_USE_C99_MATH_TR1
>>#endif
>>
>>No idea what this nonsense is trying to accomplish!  It's already in
>>Solaris 11.3, however.
>
> Wow.
>
> If only there was some way the Solaris team could contact us so we
> could coordinate and stop adding more and more hacks to mess with each
> others headers. But I assume they don't have access to the www or
> email, because the only other explanation is too rude to say on a
> public list.

it very much depends who you are dealing with: some are quite good at
coordinating with other/external groups, while others, well, don't get
me started...

I'll see what I can dig up here: my goal has been to get rid of the need
for fixincludes on newer Solaris versions, but progress has been slow ;-(

We'll probably have to undo this via fixincludes since at least on
Solaris 11.x it's already in the wild.  Maybe this can be stopped before
Solaris 12 ships, though...

>>> Once again I wish we had a Solaris machine in the compile farm, or it
>>> was possible to install a Solaris VM and get OS updates without paying
>>> Oracle for the privilege.
>>
>>That's easily doable: Solaris is free for development use; you get
>>access to the release (11, 11.1, 11.2, 11.3, ...) versions, just not to
>>patches/updates.
>
> Yes, but because the libc headers get changed by patches and updates,
> I found it was useless to install it in a VM. Because I had outdated
> headers I couldn't test how our build system interacts with a fully
> updated version of Solaris.

True: there have certainly issues like this in the past; that's why I'm
trying to keep even older Solaris versions up to date WRT patches.

>>They do have a program for free academic use of
>>Solaris, including updates, these days: https://academy.oracle.com/.
>
> That might be useful, I'll see if I qualify. Thanks.

Another option might be to get Oracle grant a similar exception to the
compile farm.  I'll ask around if there's any chance of this.

	Rainer
Andre Vieira (lists) Sept. 29, 2016, 1:37 p.m. UTC | #16
Hi Jonathan,

On 27/09/16 16:11, Jonathan Wakely wrote:
> 
> The test might not be very good, but tests some small integer values
> and some other values where accuracy is lost for one or other of the
> alternative implementations mentioned above. If this FAILs for some
> 32-bit targets we might need to adjust the tolerances or the
> dg-options.

On arm-none-eabi I'm seeing a failure for the long double type and inputs:
{ 1e-2l, 1e-4l, 1e-4l, 0.010000999950004999375l }

The abs(frac) is higher than the toler: 1.73455e-16 vs 1e-16. Is that a
reasonable difference? Should we raise toler3 to 1e-15?

The last line is also too high:
  { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
Yields a frac of: 1.28198e-16

Those are the only ones that pass the 1e-16 threshold.

Cheers,
Andre
Jonathan Wakely Sept. 29, 2016, 1:50 p.m. UTC | #17
On 29/09/16 14:37 +0100, Andre Vieira (lists) wrote:
>Hi Jonathan,
>
>On 27/09/16 16:11, Jonathan Wakely wrote:
>>
>> The test might not be very good, but tests some small integer values
>> and some other values where accuracy is lost for one or other of the
>> alternative implementations mentioned above. If this FAILs for some
>> 32-bit targets we might need to adjust the tolerances or the
>> dg-options.
>
>On arm-none-eabi I'm seeing a failure for the long double type and inputs:
>{ 1e-2l, 1e-4l, 1e-4l, 0.010000999950004999375l }
>
>The abs(frac) is higher than the toler: 1.73455e-16 vs 1e-16. Is that a
>reasonable difference? Should we raise toler3 to 1e-15?
>
>The last line is also too high:
>  { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
>Yields a frac of: 1.28198e-16
>
>Those are the only ones that pass the 1e-16 threshold.

Yes, it makes sense to raise it, although Ed's reworking the code and
tests anyway.
Szabolcs Nagy Sept. 30, 2016, 9:35 a.m. UTC | #18
On 29/09/16 14:37, Andre Vieira (lists) wrote:
> 
> On arm-none-eabi I'm seeing a failure for the long double type and inputs:
> { 1e-2l, 1e-4l, 1e-4l, 0.010000999950004999375l }
> 
> The abs(frac) is higher than the toler: 1.73455e-16 vs 1e-16. Is that a
> reasonable difference? Should we raise toler3 to 1e-15?
> 
> The last line is also too high:
>   { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
> Yields a frac of: 1.28198e-16
> 
> Those are the only ones that pass the 1e-16 threshold.
> 

i think the tolerance should be defined in
terms of LDBL_EPSILON (or numeric_limits).

n*LDBL_EPSILON tolerance would accept hypot
with about n ulp error.
Szabolcs Nagy Sept. 30, 2016, 1:13 p.m. UTC | #19
On 30/09/16 10:35, Szabolcs Nagy wrote:
> On 29/09/16 14:37, Andre Vieira (lists) wrote:
>>
>> On arm-none-eabi I'm seeing a failure for the long double type and inputs:
>> { 1e-2l, 1e-4l, 1e-4l, 0.010000999950004999375l }
>>
>> The abs(frac) is higher than the toler: 1.73455e-16 vs 1e-16. Is that a
>> reasonable difference? Should we raise toler3 to 1e-15?
>>
>> The last line is also too high:
>>   { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
>> Yields a frac of: 1.28198e-16
>>
>> Those are the only ones that pass the 1e-16 threshold.
>>
> 
> i think the tolerance should be defined in
> terms of LDBL_EPSILON (or numeric_limits).
> 
> n*LDBL_EPSILON tolerance would accept hypot
> with about n ulp error.
> 

now i see that there are huge ulp errors..

toler = 10*eps;

should work for all formats, but currently there
is >1000 ulp error on one of the double test cases..
so tolerance is carefully set to avoid triggering
the failure there

i'd set toler to 10000*eps if this test case is not
for testing hypot quality.
Jonathan Wakely Sept. 30, 2016, 1:17 p.m. UTC | #20
On 30/09/16 14:13 +0100, Szabolcs Nagy wrote:
>On 30/09/16 10:35, Szabolcs Nagy wrote:
>> On 29/09/16 14:37, Andre Vieira (lists) wrote:
>>>
>>> On arm-none-eabi I'm seeing a failure for the long double type and inputs:
>>> { 1e-2l, 1e-4l, 1e-4l, 0.010000999950004999375l }
>>>
>>> The abs(frac) is higher than the toler: 1.73455e-16 vs 1e-16. Is that a
>>> reasonable difference? Should we raise toler3 to 1e-15?
>>>
>>> The last line is also too high:
>>>   { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
>>> Yields a frac of: 1.28198e-16
>>>
>>> Those are the only ones that pass the 1e-16 threshold.
>>>
>>
>> i think the tolerance should be defined in
>> terms of LDBL_EPSILON (or numeric_limits).
>>
>> n*LDBL_EPSILON tolerance would accept hypot
>> with about n ulp error.
>>
>
>now i see that there are huge ulp errors..
>
>toler = 10*eps;
>
>should work for all formats, but currently there
>is >1000 ulp error on one of the double test cases..
>so tolerance is carefully set to avoid triggering
>the failure there
>
>i'd set toler to 10000*eps if this test case is not
>for testing hypot quality.

Ed is already re-working the code, and probably the tests.

I'd prefer to wait for his new implementation, but if the FAIL is a
problem now then go ahead and adjust the tolerance.
diff mbox

Patch

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index 76eaaa0..4ead6b9 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -633,14 +633,13 @@  Feature-testing recommendations for C++</link>.
     </row>
 
     <row>
-      <?dbhtml bgcolor="#C8B0B0" ?>
       <entry> Proposal to Introduce a 3-Argument Overload to <code>std::hypot</code> </entry>
       <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0030r1.pdf">
 	P0030R1
 	</link>
       </entry>
-      <entry align="center"> No </entry>
+      <entry align="center"> 7 </entry>
       <entry><code> __cpp_lib_hypot >= 201603 </code></entry>
     </row>
 
diff --git a/libstdc++-v3/include/c_global/cmath b/libstdc++-v3/include/c_global/cmath
index 6db9dee..fffa0e7 100644
--- a/libstdc++-v3/include/c_global/cmath
+++ b/libstdc++-v3/include/c_global/cmath
@@ -1455,6 +1455,46 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       return hypot(__type(__x), __type(__y));
     }
 
+#if __cplusplus > 201402L
+#define __cpp_lib_hypot 201603
+  // [c.math.hypot3], three-dimensional hypotenuse
+
+  template<typename _Tp>
+    inline _Tp
+    __hypot3(_Tp __x, _Tp __y, _Tp __z)
+    {
+      __x = std::abs(__x);
+      __y = std::abs(__y);
+      __z = std::abs(__z);
+      if (_Tp __a = __x < __y ? __y < __z ? __z : __y : __x < __z ? __z : __x)
+	return __a * std::sqrt((__x / __a) * (__x / __a)
+			       + (__y / __a) * (__y / __a)
+			       + (__z / __a) * (__z / __a));
+      else
+	return {};
+    }
+
+  inline float
+  hypot(float __x, float __y, float __z)
+  { return std::__hypot3<float>(__x, __y, __z); }
+
+  inline double
+  hypot(double __x, double __y, double __z)
+  { return std::__hypot3<double>(__x, __y, __z); }
+
+  inline long double
+  hypot(long double __x, long double __y, long double __z)
+  { return std::__hypot3<long double>(__x, __y, __z); }
+
+  template<typename _Tp, typename _Up, typename _Vp>
+    typename __gnu_cxx::__promote_3<_Tp, _Up, _Vp>::__type
+    hypot(_Tp __x, _Up __y, _Vp __z)
+    {
+      using __type = typename __gnu_cxx::__promote_3<_Tp, _Up, _Vp>::__type;
+      return std::__hypot3<__type>(__x, __y, __z);
+    }
+#endif // C++17
+
 #ifndef __CORRECT_ISO_CPP11_MATH_H_PROTO
   constexpr int
   ilogb(float __x)
diff --git a/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc b/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc
new file mode 100644
index 0000000..4a6841c
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/headers/cmath/hypot.cc
@@ -0,0 +1,138 @@ 
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++17" }
+// { dg-do run { target c++1z } }
+
+#include <cmath>
+#include <type_traits>
+#if defined(__TEST_DEBUG)
+#include <iostream>
+#define VERIFY(A) \
+if (!(A)) \
+  { \
+    std::cout << "line " << __LINE__ \
+      << "  max_abs_frac = " << max_abs_frac \
+      << "  tolerance = " << toler \
+      << std::endl; \
+  }
+#else
+#include <testsuite_hooks.h>
+#endif
+
+using std::is_same_v;
+static_assert(is_same_v<double, decltype(std::hypot(0.0, 0.0, 0.0))>);
+static_assert(is_same_v<double, decltype(std::hypot(0.0f, 0.0, 0.0))>);
+static_assert(is_same_v<double, decltype(std::hypot(0.0, 0.0f, 0.0))>);
+static_assert(is_same_v<double, decltype(std::hypot(0.0, 0.0, 0.0f))>);
+static_assert(is_same_v<double, decltype(std::hypot(0.0f, 0.0f, 0.0))>);
+static_assert(is_same_v<double, decltype(std::hypot(0.0f, 0.0, 0))>);
+static_assert(is_same_v<long double, decltype(std::hypot(0.0f, 0.0, 0.0l))>);
+static_assert(is_same_v<long double, decltype(std::hypot(0, 0.0, 0.0l))>);
+
+template<typename T> struct testcase_hypot { T x, y, z, f0; };
+
+template<typename Tp, unsigned int Num>
+  void
+  test(const testcase_hypot<Tp> (&data)[Num], Tp toler)
+  {
+    bool test __attribute__((unused)) = true;
+    const Tp eps = std::numeric_limits<Tp>::epsilon();
+    Tp max_abs_diff = -Tp(1);
+    Tp max_abs_frac = -Tp(1);
+    unsigned int num_datum = Num;
+    for (unsigned int i = 0; i < num_datum; ++i)
+      {
+	const Tp f = std::hypot(data[i].x, data[i].y, data[i].z);
+	const Tp f0 = data[i].f0;
+	const Tp diff = f - f0;
+	if (std::abs(diff) > max_abs_diff)
+  	 max_abs_diff = std::abs(diff);
+	if (std::abs(f0) > Tp(10) * eps && std::abs(f) > Tp(10) * eps)
+	  {
+	    const Tp frac = diff / f0;
+	    if (std::abs(frac) > max_abs_frac)
+	      max_abs_frac = std::abs(frac);
+	  }
+      }
+    VERIFY(max_abs_frac < toler);
+  }
+
+const testcase_hypot<double> data1[] = {
+  { 0.0, 0.0, 0.0, 0.0 },
+  { 0.0, 1.0, 1.0, std::sqrt(2.0) },
+  { 1.0, 1.0, 1.0, std::sqrt(3.0) },
+  { 1.0, 2.0, 2.0, 3.0 },
+  { 2.0, 3.0, 6.0, 7.0 },
+  { 1.0, 4.0, 8.0, 9.0 },
+  { 4.0, 4.0, 7.0, 9.0 },
+  { 12.0, 16.0, 21.0, 29.0 },
+  { 1e8, 1., 1e-8, 1e8 },
+  { 1., 1e8, 1e-8, 1e8 },
+  { 1e-8, 1., 1e8, 1e8 },
+  { 1e-2, 1e-4, 1e-4, 0.01000099995 },
+  { 214748364., 214748364., 214748364., 371955077.2902952 }
+};
+const double toler1 = 1e-12;
+
+const testcase_hypot<float> data2[] = {
+  { 0.0f, 0.0f, 0.0f, 0.0f },
+  { 0.0f, 1.0f, 1.0f, std::sqrt(2.0f) },
+  { 1.0f, 1.0f, 1.0f, std::sqrt(3.0f) },
+  { 1.0f, 2.0f, 2.0f, 3.0f },
+  { 2.0f, 3.0f, 6.0f, 7.0f },
+  { 1.0f, 4.0f, 8.0f, 9.0f },
+  { 4.0f, 4.0f, 7.0f, 9.0f },
+  { 12.0f, 16.0f, 21.0f, 29.0f },
+  { 1e8f, 1.f, 1e-8f, 1e8f },
+  { 1.f, 1e8f, 1e-8f, 1e8f },
+  { 1e-8f, 1.f, 1e8f, 1e8f },
+  { 1e-2f, 1e-4f, 1e-4f, 0.010001f },
+  { 214748364.f, 214748364.f, 214748364.f, 371955072.f }
+};
+const float toler2 = 1e-7f;
+
+const testcase_hypot<long double> data3[] = {
+  { 0.0l, 0.0l, 0.0l, 0.0l },
+  { 0.0l, 1.0l, 1.0l, std::sqrt(2.0l) },
+  { 1.0l, 1.0l, 1.0l, std::sqrt(3.0l) },
+  { 1.0l, 2.0l, 2.0l, 3.0l },
+  { 2.0l, 3.0l, 6.0l, 7.0l },
+  { 1.0l, 4.0l, 8.0l, 9.0l },
+  { 4.0l, 4.0l, 7.0l, 9.0l },
+  { 12.0l, 16.0l, 21.0l, 29.0l },
+  { 1e8l, 1.l, 1e-8l, 1e8l },
+  { 1.l, 1e8l, 1e-8l, 1e8l },
+  { 1e-8l, 1.l, 1e8l, 1e8l },
+  { 1e-2l, 1e-4l, 1e-4l, 0.010000999950004999375l },
+  { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
+};
+const long double toler3 = 1e-16l;
+
+void
+test01()
+{
+  test(data1, toler1);
+  test(data2, toler2);
+  test(data3, toler3);
+}
+
+int
+main()
+{
+  test01();
+}