
Allow MIPS call-saved-{4-6}.c tests to correctly run for micromips

Message ID 0DA23CC379F5F945ACB41CF394B9827720F4BF02@LEMAIL01.le.imgtec.org
State New

Commit Message

Andrew Bennett Jan. 13, 2015, 5:22 p.m. UTC
Hi,

The call-saved-{4-6}.c tests in the mips testsuite fail for micromips.  The reason is
that micromips uses the swm and lwm instructions to save/restore the call-saved
registers rather than the sw and lw instructions.  The swm and lwm instructions only
list the range of registers to use, i.e. $16-$25, and hence some of the scan-assembler
patterns fail.  This fix adds the NOCOMPRESSION attribute to the foo function to
force the tests to always compile as plain MIPS.
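
For illustration, the tests have the following shape; the foo body below matches
the patch at the end of this message, while the dg-final pattern and assembly
fragments are illustrative assumptions rather than copies from the testsuite:

  /* Sketch only: the scan pattern and assembly shown are assumptions
     for illustration, not the exact testsuite content.  */

  void bar (void);

  void
  foo (int x)
  {
    /* Forces every call-saved register to be saved and restored.  */
    __builtin_unwind_init ();
    bar ();
  }

  /* A pattern along the lines of
       { dg-final { scan-assembler "\\\$20" } }
     matches a plain MIPS prologue, which stores each register separately:
       sw      $20,24($sp)
     but microMIPS folds the stores into one range instruction, e.g.
       swm     $16-$23,...
     where $20 never appears by name, so the scan finds no match.  */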
 
I have tested this for both mips and micromips, and the tests now pass successfully.  
The ChangeLog and patch are below.

Ok to commit?


Many thanks,


Andrew


testsuite/

	* gcc.target/mips/call-saved-4.c: Add NOCOMPRESSION attribute.
	* gcc.target/mips/call-saved-5.c: Likewise.
	* gcc.target/mips/call-saved-6.c: Likewise.

Comments

Maciej W. Rozycki Jan. 13, 2015, 6:38 p.m. UTC | #1
On Tue, 13 Jan 2015, Andrew Bennett wrote:

> The call-saved-{4-6}.c tests in the mips testsuite fail for micromips.  The reason is
> that micromips uses the swm and lwm instructions to save/restore the call-saved
> registers rather than the sw and lw instructions.  The swm and lwm instructions only
> list the range of registers to use, i.e. $16-$25, and hence some of the scan-assembler
> patterns fail.  This fix adds the NOCOMPRESSION attribute to the foo function to
> force the tests to always compile as plain MIPS.
>  
> I have tested this for both mips and micromips, and the tests now pass successfully.  
> The ChangeLog and patch are below.

 Hmm, instead of trying to avoid testing microMIPS code generation just to 
satisfy the test suite I'd rather see the test cases updated so that 
LWM/SWM register ranges are expected and accepted whenever microMIPS code 
is produced.  These scan patterns can be made conditional.
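
To illustrate, a hedged sketch of what such conditional patterns might look like
(the "micromips" effective-target keyword is hypothetical, not a confirmed
mips.exp or DejaGnu name):

  /* Hedged sketch of conditional scan patterns; "micromips" as an
     effective-target selector is assumed for illustration only.

     { dg-final { scan-assembler "swm\t\\\$16-" { target micromips } } }
     { dg-final { scan-assembler "sw\t\\\$16," { target { ! micromips } } } }

     Each per-register check would need both a range form and an
     individual form, chosen by the compression mode in effect.  */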

  Maciej
Richard Sandiford Jan. 13, 2015, 7:49 p.m. UTC | #2
"Maciej W. Rozycki" <macro@linux-mips.org> writes:
> On Tue, 13 Jan 2015, Andrew Bennett wrote:
>
>> The call-saved-{4-6}.c tests in the mips testsuite fail for micromips.
>> The reason is that micromips uses the swm and lwm instructions to
>> save/restore the call-saved registers rather than the sw and lw
>> instructions.  The swm and lwm instructions only list the range of
>> registers to use, i.e. $16-$25, and hence some of the scan-assembler
>> patterns fail.  This fix adds the NOCOMPRESSION attribute to the foo
>> function to force the tests to always compile as plain MIPS.
>>  
>> I have tested this for both mips and micromips, and the tests now pass
>> successfully.
>> The ChangeLog and patch are below.
>
>  Hmm, instead of trying to avoid testing microMIPS code generation just to 
> satisfy the test suite I'd rather see the test cases updated so that 
> LWM/SWM register ranges are expected and accepted whenever microMIPS code 
> is produced.  These scan patterns can be made conditional.

FWIW I think Andrew's patch is correct.  If we want to test microMIPS
output against micromips-specific regexps, we should add a separate test
that forces micromips, so that it gets tested regardless of people's
RUNTESTFLAGS.  Doing that shouldn't hold up Andrew's patch though.
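
A sketch of the kind of separate test this suggests, assuming that forcing
-mmicromips plus a range-form pattern is all it takes (the file and the
pattern are assumptions, not testsuite content):

  /* Hypothetical companion test: force microMIPS and accept the
     swm/lwm range form.  */
  /* { dg-options "-mmicromips" } */

  void bar (void);

  void
  foo (int x)
  {
    __builtin_unwind_init ();
    bar ();
  }

  /* { dg-final { scan-assembler "swm\t\\\$16-" } } */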

Wherever possible gcc.target/mips should not have conditional dg-finals.

Thanks,
Richard
Matthew Fortune Jan. 13, 2015, 8:02 p.m. UTC | #3
Richard Sandiford <rdsandiford@googlemail.com>  writes:
> "Maciej W. Rozycki" <macro@linux-mips.org> writes:
> > On Tue, 13 Jan 2015, Andrew Bennett wrote:
> >
> >> The call-saved-{4-6}.c tests in the mips testsuite fail for micromips.
> >> The reason is that micromips uses the swm and lwm instructions to
> >> save/restore the call-saved registers rather than the sw and lw
> >> instructions.  The swm and lwm instructions only list the range of
> >> registers to use, i.e. $16-$25, and hence some of the scan-assembler
> >> patterns fail.  This fix adds the NOCOMPRESSION attribute to the foo
> >> function to force the tests to always compile as plain MIPS.
> >>
> >> I have tested this for both mips and micromips, and the tests now
> >> pass successfully.
> >> The ChangeLog and patch are below.
> >
> >  Hmm, instead of trying to avoid testing microMIPS code generation
> > just to satisfy the test suite I'd rather see the test cases updated
> > so that LWM/SWM register ranges are expected and accepted whenever
> > microMIPS code is produced.  These scan patterns can be made
> > conditional.
> 
> FWIW I think Andrew's patch is correct.  If we want to test microMIPS
> output against micromips-specific regexps, we should add a separate test
> that forces micromips, so that it gets tested regardless of people's
> RUNTESTFLAGS.  Doing that shouldn't hold up Andrew's patch though.
> 
> Wherever possible gcc.target/mips should not have conditional dg-
> finals.

I was going to suggest a follow-up patch to add copies of the three tests
as Richard suggests. I haven't yet done a micromips run of the testsuite
to check for any other issues like this, but I suspect problems are limited
to the tests that I recently added.

I certainly agree that we shouldn't just ignore micromips expected output
given it is pretty easy to test.

Please go ahead and commit this patch so we clean up the test results for
GCC 5, in case you (or anyone else) don't get around to submitting the extra
test cases before we hit stage 4.

Thanks,
Matthew
Maciej W. Rozycki Jan. 13, 2015, 8:27 p.m. UTC | #4
On Tue, 13 Jan 2015, Matthew Fortune wrote:

> > >> I have tested this for both mips and micromips, and the tests now
> > >> pass successfully.
> > >> The ChangeLog and patch are below.
> > >
> > >  Hmm, instead of trying to avoid testing microMIPS code generation
> > > just to satisfy the test suite I'd rather see the test cases updated
> > > so that LWM/SWM register ranges are expected and accepted whenever
> > > microMIPS code is produced.  These scan patterns can be made
> > > conditional.
> > 
> > FWIW I think Andrew's patch is correct.  If we want to test microMIPS
> > output against micromips-specific regexps, we should add a separate test
> > that forces micromips, so that it gets tested regardless of people's
> > RUNTESTFLAGS.  Doing that shouldn't hold up Andrew's patch though.

 Taking care that the default compilation mode does not conflict (e.g. 
MIPS16, incompatible) and taking any exceptions into account (e.g. n64, 
unsupported) I presume, right?

> > Wherever possible gcc.target/mips should not have conditional dg-
> > finals.

 OK, if that's been the policy.

> I was going to suggest a follow up patch to add copies of the three tests
> as Richard suggests. I haven't yet done a micromips run of the testsuite
> to check for any other issues like this but I suspect problems are limited
> to the tests that I recently added.

 Please always try to test changes reasonably, i.e. at least o32, 
o32/MIPS16, o32/microMIPS, n32, n64, and then Linux and ELF if applicable, 
plus any options that may be relevant, unless it is absolutely clear 
ABI/ISA variations do not matter for the change proposed.  Running 
regression tests is just processing time after all (cheap!); you don't 
have to spend your brain cycles on it (expensive!).

  Maciej
Richard Sandiford Jan. 14, 2015, 7:21 a.m. UTC | #5
"Maciej W. Rozycki" <macro@linux-mips.org> writes:
> On Tue, 13 Jan 2015, Matthew Fortune wrote:
>
>> > >> I have tested this for both mips and micromips, and the tests now
>> > >> pass successfully.
>> > >> The ChangeLog and patch are below.
>> > >
>> > >  Hmm, instead of trying to avoid testing microMIPS code generation
>> > > just to satisfy the test suite I'd rather see the test cases updated
>> > > so that LWM/SWM register ranges are expected and accepted whenever
>> > > microMIPS code is produced.  These scan patterns can be made
>> > > conditional.
>> > 
>> > FWIW I think Andrew's patch is correct.  If we want to test microMIPS
>> > output against micromips-specific regexps, we should add a separate test
>> > that forces micromips, so that it gets tested regardless of people's
>> > RUNTESTFLAGS.  Doing that shouldn't hold up Andrew's patch though.
>
>  Taking care that the default compilation mode does not conflict (e.g. 
> MIPS16, incompatible) and taking any exceptions into account (e.g. n64, 
> unsupported) I presume, right?

mips.exp sorts that out for you.  Adding "-mmicromips" or "(-mmicromips)"
to dg-options forces (or at least is supposed to force) the overall flags
to be compatible with microMIPS.

The aim of mips.exp is to avoid skipping tests wherever possible.  If
someone runs the testsuite with -mips16 and we have a -micromips test,
it's better to remove -mips16 for that test than to skip the test entirely.
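
A minimal sketch of that convention, going by the description above (the
behaviour in the comment is taken from this explanation, not from the
mips.exp sources):

  /* Forcing microMIPS for one test.  Per the explanation above, mips.exp
     reconciles this with the user's RUNTESTFLAGS (e.g. dropping a
     conflicting -mips16) rather than skipping the test.  */
  /* { dg-options "-mmicromips" } */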

>> I was going to suggest a follow up patch to add copies of the three tests
>> as Richard suggests. I haven't yet done a micromips run of the testsuite
>> to check for any other issues like this but I suspect problems are limited
>> to the tests that I recently added.
>
>  Please always try to test changes reasonably, i.e. at least o32, 
> o32/MIPS16, o32/microMIPS, n32, n64, and then Linux and ELF if applicable, 
> plus any options that may be relevant, unless it is absolutely clear 
> ABI/ISA variations do not matter for a change proposed.

TBH this seems a bit much.  On the one hand it's more testing than you'd
get for almost any other target, but on the other it leaves out important
differences like MIPS I vs. MIPS II vs. MIPS32, MIPS III vs. MIPS IV vs. MIPS64,
r1 vs. r2 vs. r6, Octeon vs. Loongson vs. vanilla, DSP vs. no DSP, etc.
I think we just have to accept that there are so many possible
combinations that we can't test everything that's potentially relevant.
I think it's more useful to be flexible than prescribe a particular list.

Having everyone test the same multilib combinations on the same target
isn't necessarily a good thing anyway.  Diversity in testing (between
developers) is useful too.

Thanks,
Richard
Maciej W. Rozycki Jan. 14, 2015, 12:17 p.m. UTC | #6
On Wed, 14 Jan 2015, Richard Sandiford wrote:

> >  Taking care that the default compilation mode does not conflict (e.g. 
> > MIPS16, incompatible) and taking any exceptions into account (e.g. n64, 
> > unsupported) I presume, right?
> 
> mips.exp sorts that out for you.  Adding "-mmicromips" or "(-mmicromips)"
> to dg-options forces (or at least is supposed to force) the overall flags
> to be compatible with microMIPS.
> 
> The aim of mips.exp is to avoid skipping tests wherever possible.  If
> someone runs the testsuite with -mips16 and we have a -micromips test,
> it's better to remove -mips16 for that test than to skip the test entirely.

 OK, good to know, thanks; that works for compilation tests only though.  
For execution tests, however, what if the target hardware used is incompatible 
or there is no suitable C library (multilib configuration) available for 
the option requested?  E.g. any hardware supporting MIPS16 won't run 
microMIPS code, n64 tests won't link if there's no n64 multilib, etc.

> >  Please always try to test changes reasonably, i.e. at least o32, 
> > o32/MIPS16, o32/microMIPS, n32, n64, and then Linux and ELF if applicable, 
> > plus any options that may be relevant, unless it is absolutely clear 
> > ABI/ISA variations do not matter for a change proposed.
> 
> TBH this seems a bit much.  On the one hand it's more testing than you'd
> get for almost any other target, but on the other it leaves out important
> differences like MIPS I vs. MIPS II vs. MIPS32, MIPS III vs. MIPS IV vs. MIPS64,
> r1 vs. r2 vs. r6, Octeon vs. Loongson vs. vanilla, DSP vs. no DSP, etc.

 I disagree; I listed what I consider the base set of configurations for 
the MIPS target, spanning the major target variations:

- MIPS/MIPS16/microMIPS can be treated almost as distinct processor 
  architectures; the instruction sets have much in common in spirit, but 
  there are enough pitfalls and traps,

- n32 covers a substantially different calling convention plus (for Linux) 
  SVR4 PIC code that normally isn't used for executables with o32 these 
  days,

- n64 covers all that n32 does plus a 64-bit target.

I realise ELF testing may be difficult for some people due to the hassle 
with setting up the runtime, so skipping an ELF target can be justified; 
otherwise I think it makes sense to run such testing for at least one 
configuration from the list above for good measure.  As is running some 
of them with big and some of them with little endianness.

 You've got a point with architecture levels or processor models.  I think 
r6 should be treated as a distinct architecture and tested as the fourth 
variant alongside MIPS/MIPS16/microMIPS, but such a test environment may not 
yet be available to many.  The rest I'm afraid will mostly matter for 
changes made to the middle end rather than the MIPS backend, in which case 
chances are MIPS testing will not be run at all.  A test bot (similar to 
JBG's build bot, but extended to run testing too) can help in this case; I 
don't know if anyone runs one.

 As to DSP, MSA, hard-float, soft-float, 2008-NaN, etc., I'd only expect 
to run appropriate testing (i.e. with `-mdsp', etc.) across the 
configurations named above whenever relevant code is changed; some of this 
stuff is irrelevant or unavailable for some of the configurations above 
(e.g. n64 DSP, IIRC), or may have no influence (e.g. the NaN encoding), in 
which case it may be justified to skip them.

> I think we just have to accept that there are so many possible
> combinations that we can't test everything that's potentially relevant.
> I think it's more useful to be flexible than prescribe a particular list.

 Of course flexibility is needed, I absolutely agree.  I consider the list 
I quoted the base set; I've used it for all recent submissions.  Then for 
each individual change I've asked myself: does it make sense to run all 
this testing?  If for example a change touched `if (TARGET_MICROMIPS)' 
code only, then clearly running any non-microMIPS testing adds no value.  
And then: will this testing provide enough coverage?  If not, then what 
else needs to be covered?

 As I say, testing is cheap: you can fire off a bunch of test suites in 
parallel under Linux booted in QEMU's system emulation mode.  In my 
experience, on decent x86 hardware whole GCC/G++ testing across the 
5 configurations named will complete in just a few hours, which you can 
spend doing something else.  And if any issues are found then the patch 
submitter, who's often the actual author and knows his code the best, is 
in the best position to understand what happened.

 OTOH chasing down a problem later on is expensive (difficult): first it 
has to be narrowed down, often based on a user bug report rather than the 
discovery of a test-suite regression.  Even making a reproducible test 
case from such a report may be tough.  And then you have the choice of 
either debugging the problem from scratch, or (if you have an easy way to 
figure out it is a regression, such as by passing the test case through an 
older version of the compiler whose binary you already have handy) 
bisecting the tree to find the offending commit (not readily available 
with SVN AFAIK, though I have done it manually in the past) and 
starting from there.  Both ways are tedious and time consuming.

> Having everyone test the same multilib combinations on the same target
> isn't necessarily a good thing anyway.  Diversity in testing (between
> developers) is useful too.

 Sure, people will undoubtedly use different default options, I'm sure 
folks at Cavium will compile for Octeon rather than the base architecture 
for example.  Other people may have DSP enabled.  Etc., etc...  That IMHO 
does not preclude testing across more than just a single configuration.

 FWIW,

  Maciej
Richard Sandiford Jan. 14, 2015, 7:47 p.m. UTC | #7
"Maciej W. Rozycki" <macro@linux-mips.org> writes:
> On Wed, 14 Jan 2015, Richard Sandiford wrote:
>
>> >  Taking care that the default compilation mode does not conflict (e.g. 
>> > MIPS16, incompatible) and taking any exceptions into account (e.g. n64, 
>> > unsupported) I presume, right?
>> 
>> mips.exp sorts that out for you.  Adding "-mmicromips" or "(-mmicromips)"
>> to dg-options forces (or at least is supposed to force) the overall flags
>> to be compatible with microMIPS.
>> 
>> The aim of mips.exp is to avoid skipping tests wherever possible.  If
>> someone runs the testsuite with -mips16 and we have a -micromips test,
>> it's better to remove -mips16 for that test than to skip the test entirely.
>
>  OK, good to know, thanks; that works for compilation tests only though.  
> For execution tests however what if target hardware used is incompatible 
> or there is no suitable C library (multilib configuration) available for 
> the option requested?  E.g. any hardware supporting MIPS16 won't run 
> microMIPS code, n64 tests won't link if there's no n64 multilib, etc.

In those cases it just does a compile-only test, again on the basis that
it's better than skipping the test entirely.  See the big comment at the
beginning of mips.exp if you're interested in the specific details of
how this works and what is supported.

>> >  Please always try to test changes reasonably, i.e. at least o32, 
>> > o32/MIPS16, o32/microMIPS, n32, n64, and then Linux and ELF if applicable, 
>> > plus any options that may be relevant, unless it is absolutely clear 
>> > ABI/ISA variations do not matter for a change proposed.
>> 
>> TBH this seems a bit much.  On the one hand it's more testing than you'd
>> get for almost any other target, but on the other it leaves out important
>> differences like MIPS I vs. MIPS II vs. MIPS32, MIPS III vs. MIPS IV vs. MIPS64,
>> r1 vs. r2 vs. r6, Octeon vs. Loongson vs. vanilla, DSP vs. no DSP, etc.
>
>  I disagree, I listed what I consider the base set of configurations for 
> the MIPS target, spanning the major target variations:
>
> - MIPS/MIPS16/microMIPS can be treated almost as distinct processor 
>   architectures; the instruction sets have much in common in spirit, but 
>   there are enough pitfalls and traps,
>
> - n32 covers a substantially different calling convention plus (for Linux) 
>   SVR4 PIC code that normally isn't used for executables with o32 these 
>   days,
>
> - n64 covers all that n32 does plus a 64-bit target.
>
> I realise ELF testing may be difficult for some people due to the hassle 
> with setting up the runtime, so to skip an ELF target can be justified; 
> otherwise I think it makes sense to run such testing for at least one 
> configuration from the list above for a good measure.  As is running some 
> of them with the big and some of them with the little endianness.
>
>  You've got a point with architecture levels or processor models.  I think 
> r6 should be treated as a distinct architecture and tested as the fourth 
> variant along MIPS/MIPS16/microMIPS, but such a test environment may not 
> yet be available to many.  The rest I'm afraid will mostly matter for 
> changes made to the middle end rather than the MIPS backend, in which case 
> chances are MIPS testing will not be run at all.  A test bot (similar to 
> JBG's build bot, but extended to run testing too) can help in this case; I 
> don't know if anyone runs one.
>
>  As to DSP, MSA, hard-float, soft-float, 2008-NaN, etc., I'd only expect 
> to run appropriate testing (i.e. with `-mdsp', etc.) across the 
> configurations named above whenever relevant code is changed; some of this 
> stuff is irrelevant or unavailable for some of the configurations above 
> (e.g. n64 DSP, IIRC), or may have no influence (e.g. the NaN encoding), in 
> which case it may be justified to skip them.

But soft vs. hard float in particular is a significant difference in
terms of the ABI.  Especially when it comes to MIPS16 interworking
(but even apart from that).

>> I think we just have to accept that there are so many possible
>> combinations that we can't test everything that's potentially relevant.
>> I think it's more useful to be flexible than prescribe a particular list.
>
>  Of course flexibility is needed, I absolutely agree.  I consider the list 
> I quoted the base set, I've used it for all recent submissions.  Then for 
> each individual change I've asked myself: does it make sense to run all 
> this testing?  If for example a change touched `if (TARGET_MICROMIPS)' 
> code only, then clearly running any non-microMIPS testing adds no value.  
> And then: will this testing provide enough coverage?  If not, then what 
> else needs to be covered?
>
>  As I say, testing is cheap: you can fire off a bunch of test suites in 
> parallel under Linux booted in QEMU's system emulation mode.
> From my experience on decent x86 hardware whole GCC/G++ testing across the 
> 5 configurations named will complete in just a few hours, that you can 
> spend doing something else.  And if any issues are found then the patch 
> submitter, who's often the actual author and knows his code the best, is 
> in the best position to understand what happened.
>
>  OTOH chasing down a problem later on is expensive (difficult), first it 
> has to be narrowed down, often based on a user bug report rather than the 
> discovery of a test-suite regression.  Even making a reproducible test 
> case from such a report may be tough.  And then you have the choice of 
> either debugging the problem from scratch, or (if you have an easy way to 
> figure out it is a regression, such as by passing the test case through an 
> older version of the compiler whose binary you already have handy) 
> bisecting the tree to find the offending commit (not readily available 
> with SVN AFAIK, but I had cases I did it manually in the past) and 
> starting from there.  Both ways are tedious and time consuming.
>
>> Having everyone test the same multilib combinations on the same target
>> isn't necessarily a good thing anyway.  Diversity in testing (between
>> developers) is useful too.
>
>  Sure, people will undoubtedly use different default options, I'm sure 
> folks at Cavium will compile for Octeon rather than the base architecture 
> for example.  Other people may have DSP enabled.  Etc., etc...  That IMHO 
> does not preclude testing across more than just a single configuration.

Yeah, but that's just the way it goes.  By trying to get everyone to
test with the options that matter to you, you're reducing the amount of
work you have to do when tracking regressions on those targets, but you're
saying that people who care about Octeon or the opposite floatness have
to go through the process you describe as "tedious and time consuming".

And you don't avoid that process anyway, since people making changes
to target-independent parts of GCC are just as likely to introduce
a regression as those making changes to MIPS-only code.  If testing
is cheap and takes only a small number of hours, and if you want to
make it less tedious to track down a regression, continuous testing
would give you a narrow window for each regression.

Submitters should be free to test on what matters to them rather than
have to test a canned set of multilibs on specific configurations.

Thanks,
Richard
Matthew Fortune Jan. 14, 2015, 8:24 p.m. UTC | #8
Richard Sandiford <rdsandiford@googlemail.com> writes:
> "Maciej W. Rozycki" <macro@linux-mips.org> writes:
> > On Wed, 14 Jan 2015, Richard Sandiford wrote:
> >> I think we just have to accept that there are so many possible
> >> combinations that we can't test everything that's potentially relevant.
> >> I think it's more useful to be flexible than prescribe a particular list.
> >
> >  Of course flexibility is needed, I absolutely agree.  I consider the
> > list I quoted the base set, I've used it for all recent submissions.
> > Then for each individual change I've asked myself: does it make sense
> > to run all this testing?  If for example a change touched `if
> > (TARGET_MICROMIPS)'
> > code only, then clearly running any non-microMIPS testing adds no
> > value.
> > And then: will this testing provide enough coverage?  If not, then
> > what else needs to be covered?
> >
> >  As I say, testing is cheap: you can fire off a bunch of test suites in
> > parallel under Linux booted in QEMU's system emulation mode.
> > From my experience on decent x86 hardware whole GCC/G++ testing across
> > the
> > 5 configurations named will complete in just a few hours, that you can
> > spend doing something else.  And if any issues are found then the
> > patch submitter, who's often the actual author and knows his code the
> > best, is in the best position to understand what happened.
> >
> >  OTOH chasing down a problem later on is expensive (difficult), first
> > it has to be narrowed down, often based on a user bug report rather
> > than the discovery of a test-suite regression.  Even making a
> > reproducible test case from such a report may be tough.  And then you
> > have the choice of either debugging the problem from scratch, or (if
> > you have an easy way to figure out it is a regression, such as by
> > passing the test case through an older version of the compiler whose
> > binary you already have handy) bisecting the tree to find the
> > offending commit (not readily available with SVN AFAIK, but I had
> > cases I did it manually in the past) and starting from there.  Both
> > ways are tedious and time consuming.
> >
> >> Having everyone test the same multilib combinations on the same
> >> target isn't necessarily a good thing anyway.  Diversity in testing
> >> (between
> >> developers) is useful too.
> >
> >  Sure, people will undoubtedly use different default options, I'm sure
> > folks at Cavium will compile for Octeon rather than the base
> > architecture for example.  Other people may have DSP enabled.  Etc.,
> > etc...  That IMHO does not preclude testing across more than just a
> > single configuration.
> 
> Yeah, but that's just the way it goes.  By trying to get everyone to
> test with the options that matter to you, you're reducing the amount of
> work you have to do when tracking regressions on those targets, but
> you're saying that people who care about Octeon or the opposite
> floatness have to go through the process you describe as "tedious and
> time consuming".
> 
> And you don't avoid that process anyway, since people making changes to
> target-independent parts of GCC are just as likely to introduce a
> regression as those making changes to MIPS-only code.  If testing is
> cheap and takes only a small number of hours, and if you want to make it
> less tedious to track down a regression, continuous testing would give
> you a narrow window for each regression.
> 
> Submitters should be free to test on what matters to them rather than
> have to test a canned set of multilibs on specific configurations.

One of my main concerns is enabling contributions from less experienced
developers and those who don't have the infrastructure available to
perform wide regression testing.

I would not want to instil fear in anyone that because they didn't test
a specific ISA/revision then they shouldn't bother submitting their
patch. The review process is fairly intense in GNU projects, and the
retesting of code can easily stack up with just a few configurations.
Frankly, I dread having to do anything remotely like FPXX ever again
as the testing drove me bonkers. I believe there is a point where we
have to accept that some issues may have to be fixed after the initial
patch is committed. There have been several configuration-related issues
addressed after FPXX was committed, but having the code in tree and getting
feedback from other people's favourite configuration testing can actually
help speed up development as well.

The majority of test failures for different MIPS configurations tend
to come from the tests with expected output. Trying to ensure a test
builds correctly for any set of test options and has the correct
output is exceptionally hard, and there is a general theme of not
over-specifying test options, so that a test takes on the personality
of the user's test options where possible. Personally I am happy to
go through at regular intervals and look at the results for a wide range
of configurations and fix them up. It takes significantly less time to
do one pass through for that kind of issue than for every new test.
None of that reduces the need for thorough testing of a change to
prove it is functionally correct though. So in that sense I agree that
multiple configurations have to be tried if there is risk of breaking
code-gen in different ways for different configs.

On the original issue of micromips...
I did manage to get round to a test run of micromips for mips.exp today,
and although I haven't checked back in history, it looks like we have
had micromips expected-output failures for a significant time. I'll
try to address some of the failures but don't want to destabilise the
testsuite for non-micromips at this late stage.

Thanks,
Matthew
Maciej W. Rozycki Jan. 30, 2015, 12:09 p.m. UTC | #9
On Wed, 14 Jan 2015, Richard Sandiford wrote:

> >> The aim of mips.exp is to avoid skipping tests wherever possible.  If
> >> someone runs the testsuite with -mips16 and we have a -micromips test,
> >> it's better to remove -mips16 for that test than to skip the test entirely.
> >
> >  OK, good to know, thanks; that works for compilation tests only though.  
> > For execution tests however what if target hardware used is incompatible 
> > or there is no suitable C library (multilib configuration) available for 
> > the option requested?  E.g. any hardware supporting MIPS16 won't run 
> > microMIPS code, n64 tests won't link if there's no n64 multilib, etc.
> 
> In those cases it just does a compile-only test, again on the basis that
> it's better than skipping the test entirely.  See the big comment at the
> beginning of mips.exp if you're interested in the specific details of
> how this works and what is supported.

 Very good indeed then, and thanks for clearing that up!  I have 
actually been quite impressed by the robustness of the mips.exp harness, 
and missed its power while chasing test-suite failures for another 
architecture, where the lack of attention to multilib defaults led to silly 
option combinations causing inevitable errors.

> >> TBH this seems a bit much.  On the one hand it's more testing than you'd
> >> get for almost any other target, but on the other it leaves out important
> >> differences like MIPS I vs. MIPS II vs. MIPS32, MIPS III vs. MIPS IV vs. MIPS64,
> >> r1 vs. r2 vs. r6, Octeon vs. Loongson vs. vanilla, DSP vs. no DSP, etc.
> >
> >  I disagree, I listed what I consider the base set of configurations for 
> > the MIPS target, spanning the major target variations:
> >
> > - MIPS/MIPS16/microMIPS can be treated almost as distinct processor 
> >   architectures; the instruction sets have much in common in spirit, but 
> >   there are enough pitfalls and traps,
> >
> > - n32 covers a substantially different calling convention plus (for Linux) 
> >   SVR4 PIC code that normally isn't used for executables with o32 these 
> >   days,
> >
> > - n64 covers all that n32 does plus a 64-bit target.
> >
> > I realise ELF testing may be difficult for some people due to the hassle 
> > with setting up the runtime, so to skip an ELF target can be justified; 
> > otherwise I think it makes sense to run such testing for at least one 
> > configuration from the list above for a good measure.  As is running some 
> > of them with the big and some of them with the little endianness.
> >
> >  You've got a point with architecture levels or processor models.  I think 
> > r6 should be treated as a distinct architecture and tested as the fourth 
> > variant along MIPS/MIPS16/microMIPS, but such a test environment may not 
> > yet be available to many.  The rest I'm afraid will mostly matter for 
> > changes made to the middle end rather than the MIPS backend, in which case 
> > chances are MIPS testing will not be run at all.  A test bot (similar to 
> > JBG's build bot, but extended to run testing too) can help in this case; I 
> > don't know if anyone runs one.
> >
> >  As to DSP, MSA, hard-float, soft-float, 2008-NaN, etc., I'd only expect 
> > to run appropriate testing (i.e. with `-mdsp', etc.) across the 
> > configurations named above whenever relevant code is changed; some of this 
> > stuff is irrelevant or unavailable for some of the configurations above 
> > (e.g. n64 DSP, IIRC), or may have no influence (e.g. the NaN encoding), in 
> > which case it may be justified to skip them.
> 
> But soft vs. hard float in particular is a significant difference in
> terms of the ABI.  Especially when it comes to MIPS16 interworking
> (but even apart from that).

 And indeed I'd expect such testing to be done for changes involving FP!

> >  Sure, people will undoubtedly use different default options, I'm sure 
> > folks at Cavium will compile for Octeon rather than the base architecture 
> > for example.  Other people may have DSP enabled.  Etc., etc...  That IMHO 
> > does not preclude testing across more than just a single configuration.
> 
> Yeah, but that's just the way it goes.  By trying to get everyone to
> test with the options that matter to you, you're reducing the amount of
> work you have to do when tracking regressions on those targets, but you're
> saying that people who care about Octeon or the opposite floatness have
> to go through the process you describe as "tedious and time consuming".

 Well, when the submitter does not do enough testing, then the maintainer 
has to do that instead, assuming that he finds a given change useful and 
wants to accept it.  And the maintainer's processing power is limited and 
will eventually run out.  The more testing other people do, the more 
resources the maintainer has left to do his job.

> And you don't avoid that process anyway, since people making changes
> to target-independent parts of GCC are just as likely to introduce
> a regression as those making changes to MIPS-only code.  If testing
> is cheap and takes only a small number of hours, and if you want to
> make it less tedious to track down a regression, continuous testing
> would give you a narrow window for each regression.

 Yes, this is one way to monitor regressions triggered by generic changes.

> Submitters should be free to test on what matters to them rather than
> have to test a canned set of multilibs on specific configurations.

 Well, as I say, this really boils down to who is going to do the testing.  
I'll be more willing to accept a patch right away that has been broadly 
tested than one that has only received minimal testing.  But that of course 
depends on how broad the scope of the change is.

  Maciej
Maciej W. Rozycki Jan. 30, 2015, 12:41 p.m. UTC | #10
On Wed, 14 Jan 2015, Matthew Fortune wrote:

> > Yeah, but that's just the way it goes.  By trying to get everyone to
> > test with the options that matter to you, you're reducing the amount of
> > work you have to do when tracking regressions on those targets, but
> > you're saying that people who care about Octeon or the opposite
> > floatness have to go through the process you describe as "tedious and
> > time consuming".
> > 
> > And you don't avoid that process anyway, since people making changes to
> > target-independent parts of GCC are just as likely to introduce a
> > regression as those making changes to MIPS-only code.  If testing is
> > cheap and takes only a small number of hours, and if you want to make it
> > less tedious to track down a regression, continuous testing would give
> > you a narrow window for each regression.
> > 
> > Submitters should be free to test on what matters to them rather than
> > have to test a canned set of multilibs on specific configurations.
> 
> One of my main concerns is in enabling contribution from less experienced
> developers and those that don't have the infrastructure available to
> perform wide regression testing.

 Common sense applies of course as far as I am concerned.  I'll set 
expectations differently for Joe Random contributing improvements he made 
in his free time than for a company doing it professionally with all the 
infrastructure behind it.

> I would not want to instil fear in anyone that because they didn't test
> a specific ISA/revision then they shouldn't bother submitting their
> patch. The review process is fairly intense in GNU projects and the
> retest of code can easily stack up with just with a few configurations.

 Again, common sense applies.  As soon as *you* are happy with the testing, 
go submit your change!  The worst that can happen (again, as far as I am 
concerned) is that I'll ask you whether you tested it or can test it for 
FOO -- if I conclude that it is important.  And you can do it if you are 
able to, or you can say that you don't have the resources for that.  And 
that will be fine -- it'll just mean I'll have to find some time to push 
it through my testing.

 Did you ever see me discourage anyone from submitting any change on any 
basis, BTW?

> On the original issue of micromips...
> I did manage to get round to a test run of micromips for mips.exp today
> and although I haven't checked back in history it looks like we have
> had micromips expected output failures for a significant time. I'll
> try to address some of the failures but don't want to destabilise the
> testsuite for non-micromips at this late stage.

 Thanks!  While at it, I wonder whether we shouldn't actually have a third 
variant covering MIPS16e SAVE/RESTORE sequences too.  AFAICT we only have 
legacy MIPS16 covered (isa_rev=0).

  Maciej

Patch

diff --git a/gcc/testsuite/gcc.target/mips/call-saved-4.c b/gcc/testsuite/gcc.target/mips/call-saved-4.c
index 846ea32..92881c4 100644
--- a/gcc/testsuite/gcc.target/mips/call-saved-4.c
+++ b/gcc/testsuite/gcc.target/mips/call-saved-4.c
@@ -3,7 +3,7 @@ 
 
 void bar (void);
 
-void
+NOCOMPRESSION void
 foo (int x)
 {
   __builtin_unwind_init ();
diff --git a/gcc/testsuite/gcc.target/mips/call-saved-5.c b/gcc/testsuite/gcc.target/mips/call-saved-5.c
index 2937b31..152b28f 100644
--- a/gcc/testsuite/gcc.target/mips/call-saved-5.c
+++ b/gcc/testsuite/gcc.target/mips/call-saved-5.c
@@ -3,7 +3,7 @@ 
 
 void bar (void);
 
-void
+NOCOMPRESSION void
 foo (int x)
 {
   __builtin_unwind_init ();
diff --git a/gcc/testsuite/gcc.target/mips/call-saved-6.c b/gcc/testsuite/gcc.target/mips/call-saved-6.c
index 0d1a4c8..a384d4a 100644
--- a/gcc/testsuite/gcc.target/mips/call-saved-6.c
+++ b/gcc/testsuite/gcc.target/mips/call-saved-6.c
@@ -3,7 +3,7 @@ 
 
 void bar (void);
 
-void
+NOCOMPRESSION void
 foo (int x)
 {
   __builtin_unwind_init ();