[00/31] VAX: Bring the port up to date (yes, MODE_CC conversion is included)

Message ID alpine.LFD.2.21.2010251020560.866917@eddie.linux-mips.org

Message

Maciej W. Rozycki Nov. 20, 2020, 3:38 a.m. UTC
Hi,

 [Paul, there's a PDP11 piece for you further down here and then 29/31.]

 This is the much-desired refurbishment of the VAX backend.  A little bit 
past the end of Stage 1, which I apologise for and which I do hope is not 
going to make it a no-no for GCC 11.  I feel quite satisfied anyway I was 
able to overcome all the difficulties outside the development itself I was 
faced with throughout this effort and fit it into quite a tight schedule 
between my departure from Western Digital effective Sep 1st and now.

 Special thanks to Anders "Ragge" Magnusson for persuading me, on my trip 
to Luleå, Sweden back in 2015, to adopt Lizzie, his VAXstation 4000/60 he 
used to use for VAX/NetBSD development and decided to part with, as she 
turned out to be the only VAX machine in my possession ready to undertake 
the task of GCC verification, and also quite a mighty one for such a 
mature piece of hardware.

 The port has turned out to have some issues, which I decided to address 
so as not to have to propagate or correct breakage with the MODE_CC update 
itself, hence the 28 preparatory patches.  I might have skipped maybe two 
changes as not really necessary, such as the addition of the `movmem' pattern, 
but they were really low-hanging fruit, and then easy to lose if not done 
right away.  I have split MODE_CC conversion test cases off due to the 
size of the change.

 Then there is a fix for the PDP11 backend addressing an issue I found in 
the handling of floating-point comparisons.  Unlike all the other changes 
this one has not been regression-tested, not even built as I have no idea 
how to prepare a development environment for a PDP11 target (also none of 
my VAX pieces is old enough to support PDP11 machine code execution).

 Still I am fairly sure it is a correct change to make, and you should be 
able to confirm it quite easily perhaps by picking the same test case from 
31/31 that I used for the example RTL dump in 28/31 and using it along 
with said dump to match what the PDP11 backend produces.  Maybe you can 
use these test cases for PDP11 verification as well, as they are pretty 
generic except for the assembly match patterns of course.

 These changes have been regression-tested throughout development with the 
`vax-netbsdelf' target running NetBSD 9.0, using said VAXstation 4000/60, 
which uses the Mariah implementation of the VAX architecture.  The host 
used was `powerpc64le-linux-gnu' and occasionally `x86_64-linux-gnu' as 
well; changes outside the VAX backend were all natively bootstrapped and 
regression-tested with both these hosts.

 Target regression-testing has been done across all the components that 
build (01/31 is required to build libgomp at `-O2'), meaning the following 
parts have been excluded for the reasons stated:

1. libada -- not ported to VAX/NetBSD, machine/OS bindings not present.

2. libgfortran -- oddly enough for Fortran a piece requires IEEE 754
   floating-point arithmetic (possibly a porting problem too).

3. libgo -- not ported to VAX/NetBSD, machine/OS bindings are not present.

and the absence of the respective libraries caused failures with the 
respective frontends as well.

 One regression has been nominally caused, in C frontend testing:

FAIL: gcc.dg/lto/pr55660 c_lto_pr55660_0.o-c_lto_pr55660_1.o link, -O2 -flto -flto-partition=none -fuse-linker-plugin -fno-fat-lto-objects

however it is a symptom of an unrelated bug in the LTO wrapper, which 
clears the PIC flag unconditionally:

    case LTO_LINKER_OUTPUT_EXEC: /* Normal executable */
      flag_pic = 0;
      flag_pie = 0;
      flag_shlib = 0;
      break;

and causes a legitimate assembly warning:

/tmp/ccG0X3DQ.s: Assembler messages:
/tmp/ccG0X3DQ.s:17: Warning: Symbol n used as immediate operand in PIC mode.
/tmp/ccG0X3DQ.s:26: Warning: Symbol n used as immediate operand in PIC mode.

similarly to a preexisting failure for the same test case at `-O0':

FAIL: gcc.dg/lto/pr55660 c_lto_pr55660_0.o-c_lto_pr55660_1.o link, -O0 -flto -flto-partition=none -fuse-linker-plugin

and numerous other ones.  I'll file a PR to track this problem and see if 
I can address it quickly now that I'm done with the MODE_CC conversion, 
with the understanding that it may not be suitable for GCC 11 at this 
point of the development cycle.

 As I have refreshed the tree again for this submission and verification 
takes just short of 48 hours per run, I'll be scheduling another full cycle 
and expect to have updated results in about a week's time as, all being 
well, I imagine I'll have to go through three runs: one for the base 
results, one for the preparatory changes, and then one for the final 
results.  I'll see if I can arrange and run some benchmarking too.

 See individual change descriptions for details and code quality stats.

 Last but not least, for easier access I have made these changes available at 
<git://gcc.gnu.org/git/gcc.git>, `users/macro/vax-mode-cc' branch.

 Comments, questions, concerns?

  Maciej

Comments

Anders Magnusson Nov. 20, 2020, 7:58 a.m. UTC | #1
Morning Maciej,
>   Then there is a fix for the PDP11 backend addressing an issue I found in
> the handling of floating-point comparisons.  Unlike all the other changes
> this one has not been regression-tested, not even built as I have no idea
> how to prepare a development environment for a PDP11 target (also none of
> my VAX pieces is old enough to support PDP11 machine code execution).
You could use SIMH w/ 2.11BSD, or if you want to test it on real 
hardware I have an 11/83 where you could test?

-- R
Toon Moene Nov. 21, 2020, 9:02 p.m. UTC | #2
[ cc'd to the fortran mailing list to hopefully get some more knowledgeable 
input ... ]

On 11/20/20 4:38 AM, Maciej W. Rozycki wrote:

> 2. libgfortran -- oddly enough for Fortran a piece requires IEEE 754
>     floating-point arithmetic (possibly a porting problem too).

gcc/libgfortran/config.h.in does have:

/* Define to 1 if you have the <ieeefp.h> header file. */
#undef HAVE_IEEEFP_H

So perhaps it does do the "right thing" if you do not have this header 
file on your VAX operating system.

The Fortran Standard allows an implementation *not* to have IEEE 
floating point support ...

Kind regards,
Paul Koning Nov. 23, 2020, 3:48 p.m. UTC | #3
> On Nov 19, 2020, at 10:38 PM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> 
> Hi,
> 
> [Paul, there's a PDP11 piece for you further down here and then 29/31.]
> 
> ...
> 
> Then there is a fix for the PDP11 backend addressing an issue I found in 
> the handling of floating-point comparisons.  Unlike all the other changes 
> this one has not been regression-tested, not even built as I have no idea 
> how to prepare a development environment for a PDP11 target (also none of 
> my VAX pieces is old enough to support PDP11 machine code execution).

I agree this is a correct change, interesting that it was missed before.  You'd expect some ICE issues from that mistake.  Perhaps there were and I didn't realize the cause; the PDP11 test run is not yet fully clean.

I've hacked together a primitive newlib based "bare metal" execution test setup that uses SIMH, but it's not a particularly clean setup.  And it hasn't been posted, I hope to be able to do that at some point.

Thanks for the fix.

	paul
Maciej W. Rozycki Nov. 23, 2020, 8:31 p.m. UTC | #4
On Fri, 20 Nov 2020, Anders Magnusson wrote:

> >   Then there is a fix for the PDP11 backend addressing an issue I found in
> > the handling of floating-point comparisons.  Unlike all the other changes
> > this one has not been regression-tested, not even built as I have no idea
> > how to prepare a development environment for a PDP11 target (also none of
> > my VAX pieces is old enough to support PDP11 machine code execution).
> You could use SIMH w/ 2.11BSD, or if you want to test it on real hardware I
> have an 11/83 where you could test?

 This promises a lot of fun, but regrettably I think I cannot afford the 
time required to get this set up right now.  And getting at a piece of 
hardware, real or simulated, is the least of the problem here.  Getting a 
proper cross-compilation environment set up is, as Paul's recent response 
also indicates.  And to verify this change it is all that is needed really 
(though running full testing never hurts).

 Running testing across the Internet, while supported by DejaGNU, is going 
to be terribly slow even (or maybe especially) if all the security issues 
around it have been sorted.  And having peeked at the PDP-11 compatibility 
chapter of the VAX 032 spec I note that it does not actually include FPU 
emulation, so it would have to be a real PDP-11 piece anyway.

 Anyway once the patches have been merged, which given all the approvals 
from Jeff I think is going to take until verification completes by the end 
of this week, I need to move on.

 First I need to figure out why an attempt to use a DEFPA FDDI network 
interface with my brand new POWER9 system makes the system firmware spew a 
flood of warning messages on the main console (there's the BMC console as 
well) and the interface fails to initialise and work; the MAC address is 
retrieved as all-zeros.  I only brought the system up enough to complete 
this effort and I need to complete the rest while I'm onsite (owing to the 
pandemic).

 NB as a precaution to anyone considering using POWER9 for software 
development: I have learnt the hard way with this very VAX/GCC effort of 
one embarrassing shortcoming of the architecture when it comes to this kind 
of use.  It only implements a single hardware watchpoint, and then a 
security-impaired one, which therefore has to be force-enabled (with Linux 
anyway) before it can be used (cf. Documentation/powerpc/dawr-power9.rst 
in Linux sources).  Consequently I had to take extra care not to have more 
than one watchpoint set at any given time or GDB would resort to using 
software watchpoints, which are painfully slow.

 I used to think watchpoints are taken for granted nowadays and before getting 
this system I didn't even think such a silly shortcoming would be possible 
in 2020 with a performance chip where the cost of a couple watchpoints is 
nothing in terms of silicon.  I mean Intel x86 has had *four* of them as 
standard since 1985(!) in every single even most crippled implementation, 
and myself I have used this feature since 1991.  And even embedded chips 
such as MIPS ones usually have at least two these days.

 Based on my quick research this shortcoming has now been corrected with 
POWER10, but I find the 35 years required to get on par with x86 a bit 
disappointing, and of course I won't be able to make use of it anyway.

 Second I need to find a new day job.

  Maciej
Maciej W. Rozycki Nov. 23, 2020, 9:51 p.m. UTC | #5
On Sat, 21 Nov 2020, Toon Moene wrote:

> > 2. libgfortran -- oddly enough for Fortran a piece requires IEEE 754
> >     floating-point arithmetic (possibly a porting problem too).
> 
> gcc/libgfortran/config.h.in does have:
> 
> /* Define to 1 if you have the <ieeefp.h> header file. */
> #undef HAVE_IEEEFP_H
> 
> So perhaps it does do the "right thing" if you do not have this header file on
> your VAX operating system.

 Well it does not:

In file included from .../libgfortran/generated/maxval_r4.c:26:
.../libgfortran/generated/maxval_r4.c: In function 'maxval_r4':
.../libgfortran/libgfortran.h:292:30: warning: target format does not support infinity
  292 | # define GFC_REAL_4_INFINITY __builtin_inff ()
      |                              ^~~~~~~~~~~~~~
.../libgfortran/generated/maxval_r4.c:149:19:
note: in expansion of macro 'GFC_REAL_4_INFINITY'
  149 |         result = -GFC_REAL_4_INFINITY;
      |                   ^~~~~~~~~~~~~~~~~~~
.../libgfortran/generated/maxval_r4.c: In function 'mmaxval_r4':
.../libgfortran/libgfortran.h:292:30: warning: target format does not support infinity
  292 | # define GFC_REAL_4_INFINITY __builtin_inff ()
      |                              ^~~~~~~~~~~~~~
.../libgfortran/generated/maxval_r4.c:363:19:
note: in expansion of macro 'GFC_REAL_4_INFINITY'
  363 |         result = -GFC_REAL_4_INFINITY;
      |                   ^~~~~~~~~~~~~~~~~~~
{standard input}: Assembler messages:
{standard input}:204: Fatal error: Can't relocate expression
make[3]: *** [Makefile:3358: maxval_r4.lo] Error 1

with the offending assembly instruction at 204 being:

	movf $0f+QNaN,%r2

and the `QNaN' part of the operand being what GAS complains about (of 
course it is a problem too that GCC lets the infinity intrinsics through 
with a mere warning and rubbish emitted rather than bailing out right 
away, but they are not supposed to be requested in the first place as the 
notion of infinity is specific to IEEE 754 FP and the VAX FP format used 
here does not support such an FP datum).

 The absence of IEEE 754 FP is correctly recognised by the configuration 
script:

checking for ieeefp.h... no
[...]
configure: FPU dependent file will be fpu-generic.h
configure: Support for IEEE modules: no

> The Fortran Standard allows an implementation *not* to have IEEE floating
> point support ...

 Given how long before IEEE 754 Fortran was invented I would be rather 
surprised if it was the other way round, which is why I have suspected, as 
I have noted in the piece quoted above, a problem with the VAX/NetBSD port 
of libgfortran.

  Maciej
Thomas Koenig Nov. 23, 2020, 10:12 p.m. UTC | #6
Am 23.11.20 um 22:51 schrieb Maciej W. Rozycki:
>> /* Define to 1 if you have the <ieeefp.h> header file. */
>> #undef HAVE_IEEEFP_H
>>
>> So perhaps it does do the "right thing" if you do not have this header file on
>> your VAX operating system.
>   Well it does not:
> 
> In file included from .../libgfortran/generated/maxval_r4.c:26:
> .../libgfortran/generated/maxval_r4.c: In function 'maxval_r4':
> .../libgfortran/libgfortran.h:292:30: warning: target format does not support infinity
>    292 | # define GFC_REAL_4_INFINITY __builtin_inff ()
>        |                              ^~~~~~~~~~~~~~

This is guarded with

#ifdef __FLT_HAS_INFINITY__
# define GFC_REAL_4_INFINITY __builtin_inff ()
#endif

I don't know how or why __FLT_HAS_INFINITY__ is set for a target which
does not support it, but if you get rid of that macro, that particular
problem should be solved.

Best regards

	Thomas
Maciej W. Rozycki Nov. 24, 2020, 4:28 a.m. UTC | #7
On Mon, 23 Nov 2020, Thomas Koenig wrote:

> >   Well it does not:
> > 
> > In file included from .../libgfortran/generated/maxval_r4.c:26:
> > .../libgfortran/generated/maxval_r4.c: In function 'maxval_r4':
> > .../libgfortran/libgfortran.h:292:30: warning: target format does not
> > support infinity
> >    292 | # define GFC_REAL_4_INFINITY __builtin_inff ()
> >        |                              ^~~~~~~~~~~~~~
> 
> This is guarded with
> 
> #ifdef __FLT_HAS_INFINITY__
> # define GFC_REAL_4_INFINITY __builtin_inff ()
> #endif
> 
> I don't know how or why __FLT_HAS_INFINITY__ is set for a target which
> does not support it, but if you get rid of that macro, that particular
> problem should be solved.

 Thanks for the hint; I didn't look into it any further not to distract 
myself from the scope of the project.  I have now, and the check you have 
quoted is obviously broken (as are all the remaining similar ones), given:

$ vax-netbsdelf-gcc -E -dM - < /dev/null | sort | grep _HAS_
#define __DBL_HAS_DENORM__ 0
#define __DBL_HAS_INFINITY__ 0
#define __DBL_HAS_QUIET_NAN__ 0
#define __FLT_HAS_DENORM__ 0
#define __FLT_HAS_INFINITY__ 0
#define __FLT_HAS_QUIET_NAN__ 0
#define __LDBL_HAS_DENORM__ 0
#define __LDBL_HAS_INFINITY__ 0
#define __LDBL_HAS_QUIET_NAN__ 0
$ 

which looks reasonable to me.  This seems straightforward to fix, so 
I'll include it along with verification I am about to schedule (assuming 
that this will be enough for libgfortran to actually build; obviously it 
hasn't been tried by anyone with such a setup for a while now, as these 
libgfortran checks date back to 2009).
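
 To illustrate why a bare `#ifdef' guard misfires with the definitions 
listed above, here is a minimal sketch.  DEMO_HAS_INFINITY is a made-up 
stand-in for __FLT_HAS_INFINITY__ so that the behaviour can be reproduced 
on any host, and both function names are hypothetical; this is not the 
actual libgfortran change:

```c
/* DEMO_HAS_INFINITY is a hypothetical stand-in for __FLT_HAS_INFINITY__,
   defined to 0 the way the VAX target defines the real macro.  */
#define DEMO_HAS_INFINITY 0

/* Returns 1 if an `#ifdef' guard lets the infinity code through.  */
int
guard_with_ifdef (void)
{
#ifdef DEMO_HAS_INFINITY
  return 1;  /* Taken: the macro is defined, even though its value is 0.  */
#else
  return 0;
#endif
}

/* Returns 1 if an `#if' guard lets the infinity code through.  */
int
guard_with_if (void)
{
#if DEMO_HAS_INFINITY
  return 1;
#else
  return 0;  /* Taken: the value of the macro is tested, and it is 0.  */
#endif
}
```

 With the macro always defined, only the value test tells targets with 
and without infinity apart.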

  Maciej
Maciej W. Rozycki Nov. 24, 2020, 5:27 a.m. UTC | #8
On Tue, 24 Nov 2020, Maciej W. Rozycki wrote:

> > I don't know how or why __FLT_HAS_INFINITY__ is set for a target which
> > does not support it, but if you get rid of that macro, that particular
> > problem should be solved.
> 
>  Thanks for the hint; I didn't look into it any further not to distract 
> myself from the scope of the project.  I have now, and the check you have 
> quoted is obviously broken (as are all the remaining similar ones), given:
> 
> $ vax-netbsdelf-gcc -E -dM - < /dev/null | sort | grep _HAS_
> #define __DBL_HAS_DENORM__ 0
> #define __DBL_HAS_INFINITY__ 0
> #define __DBL_HAS_QUIET_NAN__ 0
> #define __FLT_HAS_DENORM__ 0
> #define __FLT_HAS_INFINITY__ 0
> #define __FLT_HAS_QUIET_NAN__ 0
> #define __LDBL_HAS_DENORM__ 0
> #define __LDBL_HAS_INFINITY__ 0
> #define __LDBL_HAS_QUIET_NAN__ 0
> $ 
> 
> which looks reasonable to me.  This seems straightforward to fix to me, so 
> I'll include it along with verification I am about to schedule (assuming 
> that this will be enough for libgfortran to actually build; obviously it 
> hasn't been tried by anyone with such a setup for a while now, as these 
> libgfortran checks date back to 2009).

 Well, it is still broken, owing to NetBSD failing to implement POSIX 2008 
locale handling correctly, apparently deliberately[1], and missing 
uselocale(3)[2] while still providing newlocale(3).  This confuses our 
conditionals and consequently:

.../libgfortran/io/transfer.c: In function 'data_transfer_init_worker':
.../libgfortran/io/transfer.c:3416:30: error:
'old_locale_lock' undeclared (first use in this function)
 3416 |       __gthread_mutex_lock (&old_locale_lock);
      |                              ^~~~~~~~~~~~~~~

etc.

 We can probably work around it by downgrading to setlocale(3) for NetBSD 
(i.e. whenever either function is missing) unless someone from the NetBSD 
community contributes a better implementation (they seem to prefer their 
own non-standard printf_l(3) library API).
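
 A minimal sketch of the kind of conditional that would give this 
fallback follows.  The DEMO_* names are hypothetical stand-ins for 
autoconf-style feature macros, set here to mimic the NetBSD situation 
described above; this is not the actual libgfortran code:

```c
/* Assumed configure results for NetBSD as described above: newlocale(3)
   is present, uselocale(3) is not.  The DEMO_ names are hypothetical.  */
#define DEMO_HAVE_NEWLOCALE 1
/* DEMO_HAVE_USELOCALE is deliberately left undefined.  */

/* Use the per-thread POSIX 2008 API only if both functions are present;
   if either one is missing fall back to setlocale(3).  */
#if defined (DEMO_HAVE_NEWLOCALE) && defined (DEMO_HAVE_USELOCALE)
# define DEMO_USE_PER_THREAD_LOCALE 1
#else
# define DEMO_USE_PER_THREAD_LOCALE 0
#endif

/* Returns 1 if the per-thread locale API would be used, 0 if the code
   would take the setlocale(3) path instead.  */
int
uses_per_thread_locale (void)
{
  return DEMO_USE_PER_THREAD_LOCALE;
}
```

 Testing for the pair of functions in one place means a system providing 
only one of them cannot end up with half of each code path compiled in.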

References:

[1] Martin Husemann, "Re: uselocale() function", 
    <https://mail-index.netbsd.org/netbsd-users/2017/02/14/msg019351.html>

[2] "The Open Group Base Specifications Issue 7, 2018 edition", IEEE Std 
    1003.1-2017 (Revision of IEEE Std 1003.1-2008),
    <https://pubs.opengroup.org/onlinepubs/9699919799/functions/uselocale.html>

  Maciej
Maciej W. Rozycki Nov. 24, 2020, 6:04 a.m. UTC | #9
On Tue, 24 Nov 2020, Maciej W. Rozycki wrote:

> > > I don't know how or why __FLT_HAS_INFINITY__ is set for a target which
> > > does not support it, but if you get rid of that macro, that particular
> > > problem should be solved.
> > 
> >  Thanks for the hint; I didn't look into it any further not to distract 
> > myself from the scope of the project.  I have now, and the check you have 
> > quoted is obviously broken (as are all the remaining similar ones), given:
> > 
> > $ vax-netbsdelf-gcc -E -dM - < /dev/null | sort | grep _HAS_
> > #define __DBL_HAS_DENORM__ 0
> > #define __DBL_HAS_INFINITY__ 0
> > #define __DBL_HAS_QUIET_NAN__ 0
> > #define __FLT_HAS_DENORM__ 0
> > #define __FLT_HAS_INFINITY__ 0
> > #define __FLT_HAS_QUIET_NAN__ 0
> > #define __LDBL_HAS_DENORM__ 0
> > #define __LDBL_HAS_INFINITY__ 0
> > #define __LDBL_HAS_QUIET_NAN__ 0
> > $ 
> > 
> > which looks reasonable to me.  This seems straightforward to fix to me, so 
> > I'll include it along with verification I am about to schedule (assuming 
> > that this will be enough for libgfortran to actually build; obviously it 
> > hasn't been tried by anyone with such a setup for a while now, as these 
> > libgfortran checks date back to 2009).
> 
>  Well, it is still broken, owing to NetBSD failing to implement POSIX 2008 
> locale handling correctly, apparently deliberately[1], and missing 
> uselocale(3)[2] while still providing newlocale(3).  This confuses our 
> conditionals and consequently:
> 
> .../libgfortran/io/transfer.c: In function 'data_transfer_init_worker':
> .../libgfortran/io/transfer.c:3416:30: error:
> 'old_locale_lock' undeclared (first use in this function)
>  3416 |       __gthread_mutex_lock (&old_locale_lock);
>       |                              ^~~~~~~~~~~~~~~
> 
> etc.
> 
>  We can probably work it around by downgrading to setlocale(3) for NetBSD 
> (i.e. whenever either function is missing) unless someone from the NetBSD 
> community contributes a better implementation (they seem to prefer their 
> own non-standard printf_l(3) library API).

 And now:

In file included from .../libgfortran/intrinsics/erfc_scaled.c:33:
.../libgfortran/intrinsics/erfc_scaled_inc.c:
In function 'erfc_scaled_r4':
.../libgfortran/intrinsics/erfc_scaled_inc.c:169:15: warning: target format does not support infinity
  169 |         res = __builtin_inf ();
      |               ^~~~~~~~~~~~~
In file included from .../libgfortran/intrinsics/erfc_scaled.c:39:
.../libgfortran/intrinsics/erfc_scaled_inc.c:
In function 'erfc_scaled_r8':
.../libgfortran/intrinsics/erfc_scaled_inc.c:82:15: warning: floating constant exceeds range of 'double'
   82 |               xbig = 26.543, xhuge = 6.71e+7, xmax = 2.53e+307;
      |               ^~~~
.../libgfortran/intrinsics/erfc_scaled_inc.c:169:15: warning: target format does not support infinity
  169 |         res = __builtin_inf ();
      |               ^~~~~~~~~~~~~

and:

.../libgfortran/intrinsics/c99_functions.c: In function 'tgamma':
.../libgfortran/intrinsics/c99_functions.c:1866:3: warning: floating constant truncated to zero [-Woverflow]
 1866 |   static const double xminin = 2.23e-308;
      |   ^~~~~~
.../libgfortran/intrinsics/c99_functions.c:1868:60: warning: target format does not support infinity
 1868 |   static const double xnan = __builtin_nan ("0x0"), xinf = __builtin_inf ();
      |                                                            ^~~~~~~~~~~~~
{standard input}: Assembler messages:
{standard input}:487: Fatal error: Junk at end of expression "QNaN"
make[3]: *** [Makefile:6466: c99_functions.lo] Error 1

I am going to give up at this point, as porting libgfortran to non-IEEE FP 
is surely well beyond what I can afford to do right now.

  Maciej
Thomas Koenig Nov. 24, 2020, 6:16 a.m. UTC | #10
Am 24.11.20 um 07:04 schrieb Maciej W. Rozycki:
> I am going to give up at this point, as porting libgfortran to non-IEEE FP
> is surely well beyond what I can afford to do right now.

Can you file a PR about this? Eliminating __builtin_inf and friends
sounds doable.

And does anybody know what we should return in cases where the result
exceeds the maximum representable number?

Best regards

	Thomas
Maciej W. Rozycki Nov. 25, 2020, 5:07 p.m. UTC | #11
On Mon, 23 Nov 2020, Paul Koning wrote:

> > Then there is a fix for the PDP11 backend addressing an issue I found in 
> > the handling of floating-point comparisons.  Unlike all the other changes 
> > this one has not been regression-tested, not even built as I have no idea 
> > how to prepare a development environment for a PDP11 target (also none of 
> > my VAX pieces is old enough to support PDP11 machine code execution).
> 
> I agree this is a correct change, interesting that it was missed before.  
> You'd expect some ICE issues from that mistake.  Perhaps there were and 
> I didn't realize the cause; the PDP11 test run is not yet fully clean.

 Nothing like that, I wouldn't expect an ICE here.  Just as none happened 
with the VAX backend before a test case made me realise a corresponding 
change was required.  It's just a pessimisation: the RTL simply doesn't 
match and the comparison to remove stays.

> I've hacked together a primitive newlib based "bare metal" execution 
> test setup that uses SIMH, but it's not a particularly clean setup.  
> And it hasn't been posted, I hope to be able to do that at some point.

 Hmm, I gather those systems are able to run some kind of BSD Unix: don't 
they support the r-commands which would allow you to run DejaGNU testing 
with a realistic environment PDP-11 hardware would be usually used with, 
possibly on actual hardware even?  I always feel a bit uneasy about the 
accuracy of any simulation (having suffered from bugs in QEMU causing 
false negatives in software verification).

 While I would expect old BSD libc to miss some of the C language features 
considered standard nowadays, I think at least the C GCC frontend runtime 
(libgcc.a) and the test suite do not overall rely on their presence, and 
any individual test cases that do can be easily excluded.

> Thanks for the fix.

 I take it as an approval and will apply the change then along with the 
rest of the series.  Thank you for your review.

  Maciej
Maciej W. Rozycki Nov. 25, 2020, 6:26 p.m. UTC | #12
On Tue, 24 Nov 2020, Maciej W. Rozycki wrote:

> I am going to give up at this point, as porting libgfortran to non-IEEE FP 
> is surely well beyond what I can afford to do right now.

 I have now posted fixes for the issues handled so far.

 For the other pieces that are missing perhaps the work I did many years 
ago to port glibc 2.4 (the last one I was able to cook up without NPTL), 
and specifically libm within, to the never-upstreamed VAX/Linux target 
might be useful to complete the effort, as there seems to be an overlap 
here.  That port hasn't been fully verified though and I do not promise 
doing any work related to it anytime either.  The glibc patches continue 
being available online to download and use under the terms of the GNU GPL 
for anyone though.

  Maciej
Maciej W. Rozycki Nov. 25, 2020, 6:36 p.m. UTC | #13
On Fri, 20 Nov 2020, Maciej W. Rozycki wrote:

>  These changes have been regression-tested throughout development with the 
> `vax-netbsdelf' target running NetBSD 9.0, using said VAXstation 4000/60, 
> which uses the Mariah implementation of the VAX architecture.  The host 
> used was `powerpc64le-linux-gnu' and occasionally `x86_64-linux-gnu' as 
> well; changes outside the VAX backend were all natively bootstrapped and 
> regression-tested with both these hosts.

 I forgot to note that I have been going through this final verification 
with the native compiler and the `vax-netbsdelf' cross-compiler built with 
it both configured with `--disable-werror'.  This is due to a recent 
regression with the Go frontend causing a build error otherwise:

.../gcc/go/gofrontend/go-diagnostics.cc: In function 'std::string expand_message(const char*, va_list)':
.../gcc/go/gofrontend/go-diagnostics.cc:110:61: error: '<anonymous>' may be used uninitialized [-Werror=maybe-uninitialized]
  110 |                      "memory allocation failed in vasprintf");
      |                                                             ^
In file included from .../prev-powerpc64le-linux-gnu/libstdc++-v3/include/string:55,
                 from .../gcc/go/go-system.h:34,
                 from .../gcc/go/gofrontend/go-linemap.h:10,
                 from .../gcc/go/gofrontend/go-diagnostics.h:10,
                 from .../gcc/go/gofrontend/go-diagnostics.cc:7:
.../prev-powerpc64le-linux-gnu/libstdc++-v3/include/bits/basic_string.h:525:7: note: by argument 3 of type 'const std::allocator<char>&' to 'std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, const _Alloc&) [with <template-parameter-2-1> = std::allocator<char>; _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' declared here
  525 |       basic_string(const _CharT* __s, const _Alloc& __a = _Alloc())
      |       ^~~~~~~~~~~~
.../gcc/go/gofrontend/go-diagnostics.cc:110:61: note: '<anonymous>' declared here
  110 |                      "memory allocation failed in vasprintf");
      |                                                             ^
cc1plus: all warnings being treated as errors
make[3]: *** [.../gcc/go/Make-lang.in:242: go/go-diagnostics.o] Error 1

the cause for which I decided I could not afford the time to track down.  
Perhaps it has been fixed since, but mentioning it in case it has not.

 Earlier verification iterations were done with `--enable-werror-always'.

  Maciej
Maciej W. Rozycki Nov. 25, 2020, 7:22 p.m. UTC | #14
On Tue, 24 Nov 2020, Thomas Koenig wrote:

> > I am going to give up at this point, as porting libgfortran to non-IEEE FP
> > is surely well beyond what I can afford to do right now.
> 
> Can you file a PR about this? Eliminating __builtin_inf and friends
> sounds doable.

 There's more to that unfortunately.  I would have done it right away if 
it was so easy.

> And does anybody know what we should return in cases where the result
> exceeds the maximum representable number?

 Presumably the standard has it (implementation-specific for non-IEEE-754 
I suppose; in the VAX FP ISA an underflow optionally traps, and otherwise 
produces a zero result, while an overflow traps unconditionally and keeps 
the destination operand unchanged).
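
 As a sketch of one possible convention only (saturating to the largest 
finite value where the format has no infinity), and not something the 
standard or libgfortran is known to mandate, a helper could look as 
follows.  The name `overflow_result' is hypothetical:

```c
#include <float.h>

/* Value a library routine might return on positive overflow.  This is
   an assumed convention for illustration, not libgfortran's behaviour.  */
double
overflow_result (void)
{
#if defined (__DBL_HAS_INFINITY__) && __DBL_HAS_INFINITY__
  /* The format has infinity (e.g. IEEE 754): return it.  */
  return __builtin_inf ();
#else
  /* No infinity in the format (e.g. VAX FP): saturate instead.  */
  return DBL_MAX;
#endif
}
```

 Whether saturating, trapping or something else is appropriate for a 
given intrinsic would still need to be decided case by case.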

  Maciej
Joseph Myers Nov. 25, 2020, 10:20 p.m. UTC | #15
On Wed, 25 Nov 2020, Maciej W. Rozycki wrote:

>  For the other pieces that are missing perhaps my work I did many years 
> ago to port glibc 2.4 (the last one I was able to cook up without NPTL), 
> and specifically libm within, to the never-upstreamed VAX/Linux target 
> might be useful to complete the effort, as there seems to be an overlap 
> here.  That port hasn't been fully verified though and I do not promise 
> doing any work related to it anytime either.  The glibc patches continue 
> being available online to download and use under the terms of the GNU GPL 
> for anyone though.

I think I mentioned before: if you wish to bring a VAX port back to 
current glibc, I think it would make more sense to use software IEEE 
floating point rather than adding new support to glibc for a non-IEEE 
floating-point format.
Li, Pan2 via Gcc-patches Nov. 25, 2020, 10:26 p.m. UTC | #16
On Tue, Nov 24, 2020 at 05:27:10AM +0000, Maciej W. Rozycki wrote:
> On Tue, 24 Nov 2020, Maciej W. Rozycki wrote:
> 
> > > I don't know how or why __FLT_HAS_INFINITY__ is set for a target which
> > > does not support it, but if you get rid of that macro, that particular
> > > problem should be solved.
> > 
> >  Thanks for the hint; I didn't look into it any further not to distract 
> > myself from the scope of the project.  I have now, and the check you have 
> > quoted is obviously broken (as are all the remaining similar ones), given:
> > 
> > $ vax-netbsdelf-gcc -E -dM - < /dev/null | sort | grep _HAS_
> > #define __DBL_HAS_DENORM__ 0
> > #define __DBL_HAS_INFINITY__ 0
> > #define __DBL_HAS_QUIET_NAN__ 0
> > #define __FLT_HAS_DENORM__ 0
> > #define __FLT_HAS_INFINITY__ 0
> > #define __FLT_HAS_QUIET_NAN__ 0
> > #define __LDBL_HAS_DENORM__ 0
> > #define __LDBL_HAS_INFINITY__ 0
> > #define __LDBL_HAS_QUIET_NAN__ 0
> > $ 
> > 
> > which looks reasonable to me.  This seems straightforward to fix to me, so 
> > I'll include it along with verification I am about to schedule (assuming 
> > that this will be enough for libgfortran to actually build; obviously it 
> > hasn't been tried by anyone with such a setup for a while now, as these 
> > libgfortran checks date back to 2009).
> 
>  Well, it is still broken, owing to NetBSD failing to implement POSIX 2008 
> locale handling correctly, apparently deliberately[1], and missing 
> uselocale(3)[2] while still providing newlocale(3).  This confuses our 
> conditionals and consequently:
> 
> .../libgfortran/io/transfer.c: In function 'data_transfer_init_worker':
> .../libgfortran/io/transfer.c:3416:30: error:
> 'old_locale_lock' undeclared (first use in this function)
>  3416 |       __gthread_mutex_lock (&old_locale_lock);
>       |                              ^~~~~~~~~~~~~~~
> 
> etc.
> 
>  We can probably work it around by downgrading to setlocale(3) for NetBSD 
> (i.e. whenever either function is missing) unless someone from the NetBSD 
> community contributes a better implementation (they seem to prefer their 
> own non-standard printf_l(3) library API).
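
A side note on the `_HAS_' dump quoted above: every one of those macros is defined, just with the value 0, so a check written with `#ifdef' would succeed on VAX even though the format has no infinity.  The following is a minimal sketch of the difference, assuming the broken libgfortran check tested definedness (the check itself is not shown in this message); the function names are made up for illustration.

```c
#include <float.h>

/* Hypothetical helper: tests the VALUE of the predefined macro.
   In #if an undefined identifier evaluates to 0, so this is also safe
   with compilers that do not predefine it.  */
int float_has_infinity_by_value(void)
{
#if __FLT_HAS_INFINITY__
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: tests only whether the macro is DEFINED.
   With the dump quoted above (macro defined to 0) this still reports
   an infinity, which is the presumed failure mode.  */
int float_has_infinity_by_definedness(void)
{
#ifdef __FLT_HAS_INFINITY__
  return 1;
#else
  return 0;
#endif
}
```

On an IEEE-754 host both helpers agree; they would only diverge on a target like the one in the dump, where the macro is defined to 0.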

Hi Maciej,

I've been building successfully with setting:
export ac_cv_func_freelocale=no
export ac_cv_func_newlocale=no
export ac_cv_func_uselocale=no

I think the code to avoid these functions already exists, but just the
configure tests need tuning.

Also, this is amazing work!
Ian Lance Taylor Nov. 26, 2020, 2:46 p.m. UTC | #17
On Wed, Nov 25, 2020 at 10:36 AM Maciej W. Rozycki <macro@linux-mips.org> wrote:
>
> On Fri, 20 Nov 2020, Maciej W. Rozycki wrote:
>
> >  These changes have been regression-tested throughout development with the
> > `vax-netbsdelf' target running NetBSD 9.0, using said VAXstation 4000/60,
> > which uses the Mariah implementation of the VAX architecture.  The host
> > used was `powerpc64le-linux-gnu' and occasionally `x86_64-linux-gnu' as
> > well; changes outside the VAX backend were all natively bootstrapped and
> > regression-tested with both these hosts.
>
>  I forgot to note that I have been going through this final verification
> with the native compiler and the `vax-netbsdelf' cross-compiler built with
> it both configured with `--disable-werror'.  This is due to a recent
> regression with the Go frontend causing a build error otherwise:
>
> .../gcc/go/gofrontend/go-diagnostics.cc: In function 'std::string expand_message(const char*, va_list)':
> .../gcc/go/gofrontend/go-diagnostics.cc:110:61: error: '<anonymous>' may be used uninitialized [-Werror=maybe-uninitialized]
>   110 |                      "memory allocation failed in vasprintf");
>       |                                                             ^
> In file included from .../prev-powerpc64le-linux-gnu/libstdc++-v3/include/string:55,
>                  from .../gcc/go/go-system.h:34,
>                  from .../gcc/go/gofrontend/go-linemap.h:10,
>                  from .../gcc/go/gofrontend/go-diagnostics.h:10,
>                  from .../gcc/go/gofrontend/go-diagnostics.cc:7:
> .../prev-powerpc64le-linux-gnu/libstdc++-v3/include/bits/basic_string.h:525:7: note: by argument 3 of type 'const std::allocator<char>&' to 'std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, const _Alloc&) [with <template-parameter-2-1> = std::allocator<char>; _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' declared here
>   525 |       basic_string(const _CharT* __s, const _Alloc& __a = _Alloc())
>       |       ^~~~~~~~~~~~
> .../gcc/go/gofrontend/go-diagnostics.cc:110:61: note: '<anonymous>' declared here
>   110 |                      "memory allocation failed in vasprintf");
>       |                                                             ^
> cc1plus: all warnings being treated as errors
> make[3]: *** [.../gcc/go/Make-lang.in:242: go/go-diagnostics.o] Error 1
>
> the cause for which I decided I could not afford the time to track down.
> Perhaps it has been fixed since, but mentioning it in case it has not.
>
>  Earlier verification iterations were done with `--enable-werror-always'.

I just want to note that as far as I can tell this is not a bug in the
Go frontend code.

Ian
Maciej W. Rozycki Nov. 26, 2020, 5:59 p.m. UTC | #18
On Wed, 25 Nov 2020, coypu@sdf.org wrote:

> I've been building successfully with setting:
> export ac_cv_func_freelocale=no
> export ac_cv_func_newlocale=no
> export ac_cv_func_uselocale=no

 It's a workaround really, though surely one that works; I've done things 
like this myself regularly, especially for cross-compilation where the 
right setting cannot be determined by `configure' and the default is 
either wrong or bails out (it used to be the case with type sizes, very 
painful, until a clever trick was invented with one version of autoconf).

 Overall a plain `/path/to/configure' invocation is expected to do the 
right thing though for whatever the default configuration of a software 
piece is.

 Also by "building successfully" do you mean someone has actually ported 
libgfortran locally for VAX FP support, or is it just with NetBSD ports 
that use IEEE-754 FP?

> I think the code to avoid these functions already exists, but just the
> configure tests need tuning.

 I chose to update actual code instead as I found it more straightforward. 
Fix committed now; commit beb9afcaf146 ("libgfortran: Verify the presence 
of all functions for POSIX 2008 locale").
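
A rough sketch of the shape of such a fix, not the contents of the commit: use the POSIX 2008 per-thread locale calls only when all three are present, and fall back to setlocale(3) otherwise.  The `HAVE_*' names follow the usual autoconf convention and are assumed here, as are the combined macro and the helper name.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Assumed autoconf-style macros; the POSIX 2008 API is used only if
   every function it needs has been found.  */
#if defined(HAVE_NEWLOCALE) && defined(HAVE_USELOCALE) \
    && defined(HAVE_FREELOCALE)
# define HAVE_FULL_POSIX_2008_LOCALE 1
#endif

/* Hypothetical helper: format V into BUF with the "C" numeric locale
   and return what snprintf returned.  */
int format_double_c_locale(char *buf, size_t size, double v)
{
  int len;
  assert(buf != NULL && size > 0);
#ifdef HAVE_FULL_POSIX_2008_LOCALE
  /* Per-thread switch; other threads keep their locale.  If newlocale
     fails the value is formatted under the current locale.  */
  locale_t c_locale = newlocale(LC_NUMERIC_MASK, "C", (locale_t) 0);
  locale_t old = c_locale != (locale_t) 0 ? uselocale(c_locale) : (locale_t) 0;
  len = snprintf(buf, size, "%g", v);
  if (c_locale != (locale_t) 0)
    {
      uselocale(old);
      freelocale(c_locale);
    }
#else
  /* Process-wide fallback; a real caller would have to serialize this
     with a lock, as setlocale(3) affects all threads.  */
  char saved[256];
  const char *cur = setlocale(LC_NUMERIC, NULL);
  int restore = cur != NULL && strlen(cur) < sizeof saved;
  if (restore)
    {
      strcpy(saved, cur);          /* The returned string may be reused.  */
      setlocale(LC_NUMERIC, "C");
    }
  len = snprintf(buf, size, "%g", v);
  if (restore)
    setlocale(LC_NUMERIC, saved);
#endif
  return len;
}
```

With none of the `HAVE_*' macros defined the setlocale(3) branch is compiled, which is the situation described for NetBSD where only some of the functions exist.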

> Also, this is amazing work!

 Thanks, appreciated, although I've been doing this kind of thing for 
some 20 years now (and much longer when it comes to all kinds of computer 
programming and fiddling with microprocessors), so it's bread and butter 
to me really.  I'd have had to switch fields by now if I used not to get 
such stuff right.

  Maciej
Maciej W. Rozycki Nov. 26, 2020, 6:01 p.m. UTC | #19
On Wed, 25 Nov 2020, Joseph Myers wrote:

> >  For the other pieces that are missing perhaps my work I did many years 
> > ago to port glibc 2.4 (the last one I was able to cook up without NPTL), 
> > and specifically libm within, to the never-upstreamed VAX/Linux target 
> > might be useful to complete the effort, as there seems to be an overlap 
> > here.  That port hasn't been fully verified though and I do not promise 
> > doing any work related to it anytime either.  The glibc patches continue 
> > being available online to download and use under the terms of the GNU GPL 
> > for anyone though.
> 
> I think I mentioned before: if you wish to bring a VAX port back to 
> current glibc, I think it would make more sense to use software IEEE 
> floating point rather than adding new support to glibc for a non-IEEE 
> floating-point format.

 Right, that would be the path of least resistance (non-FP VAX bits are 
obviously limited and boil down to the OS interface and some handwritten 
assembly such as for optimised string ops), and surely the least involving 
one maintenance-wise, so perhaps the only acceptable compromise these days 
given that VAX is a niche of a niche now.

 The kernel part would have to happen first though, and the old effort 
became stuck in 2.6 days, so clearly not suitable for upstreaming.  Back 
in the day I did enough kernel updates to get interrupt handling right, 
i.e. the IPL stuff, based on what the Alpha port does, which is really the 
same, and then on top of it ptrace(2) support along with a native GDB and 
a `gdbserver' port so that I could actually debug the outstanding userland 
issues.

 But that was surely not enough even back then and is even less so now.  
FWIW I was able to run single-user `bash' (with `ncurses', etc.) and some 
other programs; native GCC crashed as did GDB, due to a bug leading to 
stack exhaustion, but `gdbserver' worked along with single-stepping, etc., 
so that was a good starting point.

 The VAX/NetBSD port however does use hardware FP in their libm as far as 
I can tell, so I guess it would be reasonable for libgfortran to do so as 
well.  I haven't checked how correct their implementation actually is, but 
barring evidence otherwise I would assume they did the right thing.  

 Without all the NaN/denormal stuff the handling should be much simpler I 
believe, though I gather some numeric algorithms do rely on the presence 
of denormals to produce reasonable results in some boundary cases.  These 
would be lost or a different algorithm would have to be chosen for the 
respective computations I suppose.

  Maciej
Maciej W. Rozycki Nov. 26, 2020, 6:07 p.m. UTC | #20
On Thu, 26 Nov 2020, Ian Lance Taylor wrote:

> I just want to note that as far as I can tell this is not a bug in the
> Go frontend code.

 Ack.  I thought it might be the case as the file in question hasn't 
changed for a while now, so it wasn't a Go frontend change that triggered 
the warning.  Just a heads-up anyway.

  Maciej
Martin Husemann Nov. 26, 2020, 6:08 p.m. UTC | #21
On Thu, Nov 26, 2020 at 06:01:31PM +0000, Maciej W. Rozycki wrote:
>  The VAX/NetBSD port however does use hardware FP in their libm as far as 
> I can tell, so I guess it would be reasonable for libgfortran to do so as 
> well.  I haven't checked how correct their implementation actually is, but 
> barring evidence otherwise I would assume they did the right thing.  

It does, but it is not totally correct in all places (due to gcc issues
some parts have not received good testing, and others clearly are broken,
eg. when tables are used that have not been adjusted for the different
limits in VAX float/double formats).

Should be a lot easier to fix with newer gcc soonish (I hope).

Martin
Maciej W. Rozycki Nov. 26, 2020, 7:35 p.m. UTC | #22
On Thu, 26 Nov 2020, Maciej W. Rozycki wrote:

> > Also, this is amazing work!
> 
>  Thanks, appreciated, although I've been doing this kind of thing for 
> some 20 years now (and much longer when it comes to all kinds of computer 
> programming and fiddling with microprocessors), so it's bread and butter 
> to me really.  I'd have had to switch fields by now if I used not to get 
> such stuff right.

 On second thoughts I realised this may have sounded a bit harsh even 
though meant to be a joke really.  Apologies for that.  I feel really 
happy all the conditions clicked and I was in the right place and time to 
save the VAX port from obsoletion.  And thank you for your appreciation!

  Maciej
Paul Koning Nov. 28, 2020, 6:48 p.m. UTC | #23
> On Nov 25, 2020, at 12:07 PM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> 
> On Mon, 23 Nov 2020, Paul Koning wrote:
> 
>>> ...
> 
>> I've hacked together a primitive newlib based "bare metal" execution 
>> test setup that uses SIMH, but it's not a particularly clean setup.  
>> And it hasn't been posted, I hope to be able to do that at some point.
> 
> Hmm, I gather those systems are able to run some kind of BSD Unix: don't 
> they support the r-commands which would allow you to run DejaGNU testing 
> with a realistic environment PDP-11 hardware would be usually used with, 
> possibly on actual hardware even?  I always feel a bit uneasy about the 
> accuracy of any simulation (having suffered from bugs in QEMU causing 
> false negatives in software verification).

Fair enough.  But SIMH is a full system emulator with a very large amount of history and expertise involved in its creation.  It's also known to run every PDP-11 OS and most diagnostics.  Yes, it certainly runs BSD 2.x; the reason I didn't use that approach is that I don't know it well. 

> While I would expect old BSD libc to miss some of the C language features 
> considered standard nowadays, I think at least the C GCC frontend runtime 
> (libgcc.a) and the test suite do not overall rely on their presence, and 
> any individual test cases that do can be easily excluded.
> 
>> Thanks for the fix.
> 
> I take it as an approval and will apply the change then along with the 
> rest of the series.  Thank you for your review.

I should have been explicit.  Yes, I approve that change, thanks.

	paul
Martin Sebor Nov. 29, 2020, 5:56 p.m. UTC | #24
On 11/25/20 11:36 AM, Maciej W. Rozycki wrote:
> On Fri, 20 Nov 2020, Maciej W. Rozycki wrote:
> 
>>   These changes have been regression-tested throughout development with the
>> `vax-netbsdelf' target running NetBSD 9.0, using said VAXstation 4000/60,
>> which uses the Mariah implementation of the VAX architecture.  The host
>> used was `powerpc64le-linux-gnu' and occasionally `x86_64-linux-gnu' as
>> well; changes outside the VAX backend were all natively bootstrapped and
>> regression-tested with both these hosts.
> 
>   I forgot to note that I have been going through this final verification
> with the native compiler and the `vax-netbsdelf' cross-compiler built with
> it both configured with `--disable-werror'.  This is due to a recent
> regression with the Go frontend causing a build error otherwise:
> 
> .../gcc/go/gofrontend/go-diagnostics.cc: In function 'std::string expand_message(const char*, va_list)':
> .../gcc/go/gofrontend/go-diagnostics.cc:110:61: error: '<anonymous>' may be used uninitialized [-Werror=maybe-uninitialized]
>    110 |                      "memory allocation failed in vasprintf");
>        |                                                             ^
> In file included from .../prev-powerpc64le-linux-gnu/libstdc++-v3/include/string:55,
>                   from .../gcc/go/go-system.h:34,
>                   from .../gcc/go/gofrontend/go-linemap.h:10,
>                   from .../gcc/go/gofrontend/go-diagnostics.h:10,
>                   from .../gcc/go/gofrontend/go-diagnostics.cc:7:
> .../prev-powerpc64le-linux-gnu/libstdc++-v3/include/bits/basic_string.h:525:7: note: by argument 3 of type 'const std::allocator<char>&' to 'std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, const _Alloc&) [with <template-parameter-2-1> = std::allocator<char>; _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' declared here
>    525 |       basic_string(const _CharT* __s, const _Alloc& __a = _Alloc())
>        |       ^~~~~~~~~~~~
> .../gcc/go/gofrontend/go-diagnostics.cc:110:61: note: '<anonymous>' declared here
>    110 |                      "memory allocation failed in vasprintf");
>        |                                                             ^
> cc1plus: all warnings being treated as errors
> make[3]: *** [.../gcc/go/Make-lang.in:242: go/go-diagnostics.o] Error 1
> 
> the cause for which I decided I could not afford the time to track down.
> Perhaps it has been fixed since, but mentioning it in case it has not.

I wouldn't expect to see this warning after r11-5073.  If it persists,
can you please open a bug and attach the translation unit to it?

Thanks
Martin

> 
>   Earlier verification iterations were done with `--enable-werror-always'.
> 
>    Maciej
>
Maciej W. Rozycki Dec. 7, 2020, 2:25 p.m. UTC | #25
On Sun, 29 Nov 2020, Martin Sebor wrote:

> > Perhaps it has been fixed since, but mentioning it in case it has not.
> 
> I wouldn't expect to see this warning after r11-5073.  If it persists,
> can you please open a bug and attach the translation unit to it?

 Indeed, the issue has gone now, now that I have pushed the VAX changes 
and rebuilt a refreshed tree.  Thanks for looking into it.

  Maciej
Maciej W. Rozycki Dec. 8, 2020, 2:38 p.m. UTC | #26
On Thu, 26 Nov 2020, Martin Husemann wrote:

> >  The VAX/NetBSD port however does use hardware FP in their libm as far as 
> > I can tell, so I guess it would be reasonable for libgfortran to do so as 
> > well.  I haven't checked how correct their implementation actually is, but 
> > barring evidence otherwise I would assume they did the right thing.  
> 
> It does, but it is not totally correct in all places (due to gcc issues
> some parts have not received good testing, and others clearly are broken,
> eg. when tables are used that have not been adjusted for the different
> limits in VAX float/double formats).

 I have realised that with my VAX/Linux effort, more than 10 years ago, I 
did not encounter such issues, and I did port all the GCC components the 
compiler provided at the time (although the port of libjava could have 
been only partially functional as I didn't properly verify the IEEE<->VAX 
FP conversion stubs I have necessarily implemented), though what I chose was 
4.1.2 rather than the most recent version (to avoid the need to port NPTL 
right away).  I should have tripped over this issue then, but I did not.

 So with the objective of this effort out of the way I have now looked 
into what happened with libgfortran here and realised that the cause of 
the compilation error was an attempt to provide a standard ISO C function 
missing from NetBSD's libc or libm (even though it's declared).  Indeed:

$ grep tgamma usr/include/math.h
double	tgamma(double);
float	tgammaf(float);
long double	tgammal(long double);
$ readelf -s usr/lib/libc.so usr/lib/libm.so usr/lib/libc.a usr/lib/libm.a | grep tgamma
$ 

So clearly something went wrong there and I think it's that that has to be 
fixed rather than the fallback implementations in libgfortran (which I 
gather have been only provided for legacy systems that do not implement a 
full ISO C environment and are no longer maintained).  I suspect that once 
this function (and any other ones that may be missing) has been supplied 
by the system libraries libgfortran will just work out of the box.

 Here's the full list of math functions that the `configure' script in 
libgfortran reports as missing:

checking for acosl... no
checking for acoshf... no
checking for acoshl... no
checking for asinl... no
checking for asinhf... no
checking for asinhl... no
checking for atan2l... no
checking for atanl... no
checking for atanhl... no
checking for cosl... no
checking for coshl... no
checking for expl... no
checking for fmaf... no
checking for fma... no
checking for fmal... no
checking for frexpf... no
checking for frexpl... no
checking for logl... no
checking for log10l... no
checking for clog10f... no
checking for clog10... no
checking for clog10l... no
checking for nextafterf... no
checking for nextafter... no
checking for nextafterl... no
checking for lroundl... no
checking for llroundf... no
checking for llround... no
checking for llroundl... no
checking for sinl... no
checking for sinhl... no
checking for tanl... no
checking for tanhl... no
checking for erfcl... no
checking for j0f... no
checking for j1f... no
checking for jnf... no
checking for jnl... no
checking for y0f... no
checking for y1f... no
checking for ynf... no
checking for ynl... no
checking for tgamma... no
checking for tgammaf... no
checking for lgammaf... no

Except for the Bessel functions these are a part of ISO C; `long double' 
versions, some of which appear missing unlike their `float' or `double' 
counterparts, should probably just alias to the corresponding `double' 
versions as I doubt we want to get into the H-floating format, largely 
missing from actual VAX hardware and meant to be emulated by the OS.
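
To illustrate what aliasing a `long double' entry point to its `double' counterpart amounts to, here is a minimal sketch using a plain wrapper.  The name `fallback_frexpl' is hypothetical and this is not NetBSD's code; frexp is used as the example because the list above reports frexpl as missing but not frexp.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in for a missing frexpl: forward to the `double'
   version.  This loses nothing where `long double' has the same format
   as `double'; where `long double' is wider the argument is rounded
   to `double' first.  */
long double fallback_frexpl(long double x, int *exp)
{
  assert(exp != NULL);
  return (long double) frexp((double) x, exp);
}
```

A true symbol alias would avoid the extra call, but it is only valid when both types share one representation and calling convention, so the wrapper is the more conservative form.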

 Please note that this is with NetBSD 9 rather than 9.1 (which has only 
been recently released and therefore I decided not to get distracted with 
an upgrade) and I don't know if it has been fixed in the latter release.

  Maciej
Martin Husemann Dec. 8, 2020, 3:22 p.m. UTC | #27
On Tue, Dec 08, 2020 at 02:38:59PM +0000, Maciej W. Rozycki wrote:
>  Here's the full list of math functions that the `configure' script in 
> libgfortran reports as missing:
> 
> checking for acosl... no
> checking for acoshf... no
[..]
> Except for the Bessel functions these are a part of ISO C; `long double' 
> versions, some of which appear missing unlike their `float' or `double' 
> counterparts, should probably just alias to the corresponding `double' 
> versions as I doubt we want to get into the H-floating format, largely 
> missing from actual VAX hardware and meant to be emulated by the OS.

Thanks for the list - I'll add the aliases soonish (they are likely already
there for the IEEE versions but missing from the vax code) and check
what remains missing then.

Martin
Maciej W. Rozycki Dec. 9, 2020, 2:06 p.m. UTC | #28
On Sat, 28 Nov 2020, Paul Koning wrote:

> > Hmm, I gather those systems are able to run some kind of BSD Unix: don't 
> > they support the r-commands which would allow you to run DejaGNU testing 
> > with a realistic environment PDP-11 hardware would be usually used with, 
> > possibly on actual hardware even?  I always feel a bit uneasy about the 
> > accuracy of any simulation (having suffered from bugs in QEMU causing 
> > false negatives in software verification).
> 
> Fair enough.  But SIMH is a full system emulator with a very large 
> amount of history and expertise involved in its creation.  It's also 
> known to run every PDP-11 OS and most diagnostics.  Yes, it certainly 
> runs BSD 2.x; the reason I didn't use that approach is that I don't know 
> it well.

 This all sounds great.  Do you happen to know if it is cycle-accurate 
with respect to individual hardware microarchitectures simulated?  That 
would be required for performance evaluation of compiler-generated code.

  Maciej
Paul Koning Dec. 10, 2020, 1:33 a.m. UTC | #29
> On Dec 9, 2020, at 9:06 AM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> 
> On Sat, 28 Nov 2020, Paul Koning wrote:
> 
>>> Hmm, I gather those systems are able to run some kind of BSD Unix: don't 
>>> they support the r-commands which would allow you to run DejaGNU testing 
>>> with a realistic environment PDP-11 hardware would be usually used with, 
>>> possibly on actual hardware even?  I always feel a bit uneasy about the 
>>> accuracy of any simulation (having suffered from bugs in QEMU causing 
>>> false negatives in software verification).
>> 
>> Fair enough.  But SIMH is a full system emulator with a very large 
>> amount of history and expertise involved in its creation.  It's also 
>> known to run every PDP-11 OS and most diagnostics.  Yes, it certainly 
>> runs BSD 2.x; the reason I didn't use that approach is that I don't know 
>> it well.
> 
> This all sounds great.  Do you happen to know if it is cycle-accurate 
> with respect to individual hardware microarchitectures simulated?  That 
> would be required for performance evaluation of compiler-generated code.

No, it isn't.  I believe it just charges one time unit per instruction, with the possible exception of CIS instructions. 

I don't know of any cycle accurate PDP-11 emulators.  It's not even clear if it is possible to build one, given the asynchronous operation of the UNIBUS.  It certainly would be extremely difficult since even the documented timing is amazingly complex, never mind the possibility that the reality is different from what is documented.

The pdp11 back end uses a very rough approximation of the documented 11/70 timing, but GCC doesn't make it easy (or maybe not even possible) to use the full timing details.  It's not something I'd expect to refine a whole lot further.

More interesting would be to tweak the optimizing machinery to improve parts that either have bitrotted or never actually worked. The code generation for auto-increment etc. isn't particularly effective and I think that's a known limitation.  Ditto indirect addressing, since few other machines have that.  (VAX does, of course; it might benefit too.)  And with LRA things are more limited still, again this seems to be known and is caused by the focus on modern machine architectures.

	paul
Maciej W. Rozycki Dec. 11, 2020, 2:54 p.m. UTC | #30
On Wed, 9 Dec 2020, Paul Koning wrote:

> > This all sounds great.  Do you happen to know if it is cycle-accurate 
> > with respect to individual hardware microarchitectures simulated?  That 
> > would be required for performance evaluation of compiler-generated code.
> 
> No, it isn't.  I believe it just charges one time unit per instruction, 
> with the possible exception of CIS instructions.

 Fair enough, from experience most CPU emulators are instruction-accurate 
only.  Of all the generally available emulators I came across (and looked 
into closely enough; maybe I missed something) only ones for the Z80 were 
cycle-accurate, and I believe the MAME project has had cycle-accurate 
emulation, both down to the system level and both out of necessity, as 
software they were written for was often unforgiving when it comes to any 
discrepancy with respect to original hardware.

 Commercially, MIPS Technologies used to have cycle-accurate MIPSsim, 
actually used for hardware verification, and taking into account all the 
implementation details such as the TLB and caches of individual CPU cores 
supported.  And you could choose the topology of these resources according 
to what actual silicon could have.  Some LV hardware has had it too for 
evaluation purposes:

YAMON> scpu
Current settings :
  I-Cache bytes per way = 0x1000
  I-Cache associativity = 4
  D-Cache bytes per way = 0x1000
  D-Cache associativity = 4
  MMU                   = tlb
YAMON> scpu -a
Available settings :
  I-Cache bytes per way : 0x1000, 0x0
  I-Cache associativity : 4, 3, 2, 1
  D-Cache bytes per way : 0x1000, 0x0
  D-Cache associativity : 4, 3, 2, 1
  MMU types             : tlb, fixed
YAMON> scpu -i 0x1000 2
YAMON> scpu -d 0x1000 2
YAMON> scpu fixed
YAMON> scpu
Current settings :
  I-Cache bytes per way = 0x1000
  I-Cache associativity = 2
  D-Cache bytes per way = 0x1000
  D-Cache associativity = 2
  MMU                   = fixed
YAMON> 

But then even cycle-accurate MIPSsim would not take every parameter of a 
system into account, such as the latency of peripheral components.  Not 
sure about DRAM either, though being predictable I guess that might have 
been simulated.

> I don't know of any cycle accurate PDP-11 emulators.  It's not even 
> clear if it is possible to build one, given the asynchronous operation 
> of the UNIBUS.  It certainly would be extremely difficult since even the 
> documented timing is amazingly complex, never mind the possibility that 
> the reality is different from what is documented.

 For the purpose of compiler's performance evaluation however I don't 
think we need to go down as far as the external bus, so however UNIBUS 
performs should not really matter.  Even with the modern systems all the 
pipeline descriptions and operation timings we have recorded within GCC 
reflect perfect operating conditions such as hot caches, no TLB misses, no 
branch mispredictions, to say nothing of disruption to all that caused by 
hardware interrupts and context switches.

 So I guess with cycle-accurate PDP-11 emulation it would be sufficient if 
relative CPU instruction execution timings were correctly reflected, such 
as the latency of say MOV vs DIV, as I am fairly sure they are not even 
close to being equivalent.  But that does come at a cost; cycle-accurate 
MIPSsim was much slower than its instruction-accurate counterpart which 
also existed.
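
 A small sketch of why relative timings matter for this purpose, assuming a made-up three-instruction machine: with one time unit charged per instruction the shorter sequence always looks faster, while even a crude per-instruction weight can reverse the ranking.  The opcode set and the weights below are illustrative only and are not documented PDP-11 or VAX timings.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_MOV, OP_ADD, OP_DIV, OP_COUNT };

/* Illustrative relative weights, NOT documented timings.  */
static const int weighted_cost[OP_COUNT] = { 1, 1, 10 };

/* Instruction-accurate model: one time unit per instruction.  */
int flat_time(const enum op *seq, size_t n)
{
  (void) seq;
  return (int) n;
}

/* Model with relative per-instruction timings.  */
int weighted_time(const enum op *seq, size_t n)
{
  int total = 0;
  for (size_t i = 0; i < n; i++)
    {
      assert((int) seq[i] < OP_COUNT);
      total += weighted_cost[seq[i]];
    }
  return total;
}
```

Under these assumed weights a two-instruction sequence containing a DIV costs more than three cheap instructions, whereas the flat model ranks it as the faster of the two.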

> The pdp11 back end uses a very rough approximation of the documented 
> 11/70 timing, but GCC doesn't make it easy (or maybe not even possible) 
> to use the full timing details.  It's not something I'd expect to refine 
> a whole lot further.

 Understood.

> More interesting would be to tweak the optimizing machinery to improve 
> parts that either have bitrotted or never actually worked. The code 
> generation for auto-increment etc. isn't particularly effective and I 
> think that's a known limitation.  Ditto indirect addressing, since few 
> other machines have that.  (VAX does, of course; it might benefit too.)  
> And with LRA things are more limited still, again this seems to be known 
> and is caused by the focus on modern machine architectures.

 Correctness absolutely has to take precedence over performance, but that 
does not mean the latter has to be completely ignored either.  And the 
presence of tools may only help with that.  We may not have the resources 
available commercially significant ports have, but that does not mean we 
should decide upfront to abandon any kind of performance QA.  I think we 
can still act professionally and try to do our best to make the quality of 
code produced as good as possible within our available resources.

 FWIW,

  Maciej
Paul Koning Dec. 11, 2020, 9:50 p.m. UTC | #31
> On Dec 11, 2020, at 9:54 AM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> 
> On Wed, 9 Dec 2020, Paul Koning wrote:
> 
>>> This all sounds great.  Do you happen to know if it is cycle-accurate 
>>> with respect to individual hardware microarchitectures simulated?  That 
>>> would be required for performance evaluation of compiler-generated code.
>> 
>> No, it isn't.  I believe it just charges one time unit per instruction, 
>> with the possible exception of CIS instructions.
> 
> Fair enough, from experience most CPU emulators are instruction-accurate 
> only.  Of all the generally available emulators I came across (and looked 
> into closely enough; maybe I missed something) only ones for the Z80 were 
> cycle-accurate, and I believe the MAME project has had cycle-accurate 
> emulation, both down to the system level and both out of necessity, as 
> software they were written for was often unforgiving when it comes to any 
> discrepancy with respect to original hardware.

I know of a cycle-accurate CDC 6000 simulator, but I think that was a one man project never released.

> Commercially, MIPS Technologies used to have cycle-accurate MIPSsim, 
> actually used for hardware verification, and taking into account all the 
> implementation details such as the TLB and caches of individual CPU cores 
> supported.

There was also a simulator with capabilities like that for the SB-1 CPU core of the Sibyte SB-1250 SoC. 

> ...
>> I don't know of any cycle accurate PDP-11 emulators.  It's not even 
>> clear if it is possible to build one, given the asynchronous operation 
>> of the UNIBUS.  It certainly would be extremely difficult since even the 
>> documented timing is amazingly complex, never mind the possibility that 
>> the reality is different from what is documented.
> 
> For the purpose of compiler's performance evaluation however I don't 
> think we need to go down as far as the external bus, so however UNIBUS 
> performs should not really matter.  Even with the modern systems all the 
> pipeline descriptions and operation timings we have recorded within GCC 
> reflect perfect operating conditions such as hot caches, no TLB misses, no 
> branch mispredictions, to say nothing of disruption to all that caused by 
> hardware interrupts and context switches.

True, but I was thinking of models where the UNIBUS is used for memory.  The real issue is that the documented timings are full of strange numbers.  There isn't a timing for a given instruction, but rather a whole pile of numbers depending on the addressing modes, with occasional exceptions to a pattern (for example, some register to register operations are faster than the general pattern for the operation and addressing mode costs would suggest).  And it's hard to find a number that can be used as the "cycle time" where each time value is a small multiple of that basic number.  That's an issue both for a timing simulation and also for the GCC instruction scheduler and instruction cost models -- I ended up rounding things rather drastically and trimming out some detail in order to have the cost values be small integers and not blow up the size of the scheduler state machine.
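
The kind of rounding described above can be sketched as follows.  This is only an illustration of the idea, not the pdp11 back end's code: the function name, the choice of nanoseconds, the base unit and the cap are all assumptions.

```c
#include <assert.h>

/* Hypothetical helper: map a documented timing to a small integer cost
   by rounding to the nearest multiple of UNIT_NS and clamping the
   result to 1..MAX_COST, so that odd timings collapse onto a few cost
   values.  */
int timing_to_cost(int timing_ns, int unit_ns, int max_cost)
{
  assert(timing_ns >= 0 && unit_ns > 0 && max_cost >= 1);
  int cost = (timing_ns + unit_ns / 2) / unit_ns;   /* Round to nearest.  */
  if (cost < 1)
    cost = 1;             /* Every instruction costs something.  */
  if (cost > max_cost)
    cost = max_cost;      /* Keep the range of cost values small.  */
  return cost;
}
```

The coarser the unit and the lower the cap, the fewer distinct cost values remain, at the price of losing detail such as the faster register-to-register cases.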

> So I guess with cycle-accurate PDP-11 emulation it would be sufficient if 
> relative CPU instruction execution timings were correctly reflected, such 
> as the latency of say MOV vs DIV, as I am fairly sure they are not even 
> close to being equivalent.  But that does come at a cost; cycle-accurate 
> MIPSsim was much slower than its instruction-accurate counterpart which 
> also existed.
> 
> ...
>> More interesting would be to tweak the optimizing machinery to improve 
>> parts that either have bitrotted or never actually worked. The code 
>> generation for auto-increment etc. isn't particularly effective and I 
>> think that's a known limitation.  Ditto indirect addressing, since few 
>> other machines have that.  (VAX does, of course; it might benefit too.)  
>> And with LRA things are more limited still, again this seems to be known 
>> and is caused by the focus on modern machine architectures.
> 
> Correctness absolutely has to take precedence over performance, but that 
> does not mean the latter has to be completely ignored either.  And the 
> presence of tools may only help with that.  We may not have the resources 
> available commercially significant ports have, but that does not mean we 
> should decide upfront to abandon any kind of performance QA.  I think we 
> can still act professionally and try to do our best to make the quality of 
> code produced as good as possible within our available resources.

Definitely.  For me, one complication is that the key bits of the common code are things I don't really understand and there isn't much documentation, especially for the LRA case.  Some of what is documented apparently hasn't been correct in many years, and possibly was never correct.  I think some of the auto-increment facilities fall in that category.

	paul