[roland/test-unsupported] Let tests result in UNSUPPORTED; use that for unbuildable C++ cases

Message ID 20150307010200.B51122C3B39@topped-with-meat.com
State New

Commit Message

Roland McGrath March 7, 2015, 1:02 a.m. UTC
This gives our test infrastructure the beginnings of handling UNSUPPORTED
as an outcome for a test (as opposed to PASS, FAIL, or ERROR).
evaluate-test.sh now follows the Autotest convention of exit code 77
meaning a test that cannot be built or cannot be usefully run,
i.e. UNSUPPORTED.
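
As a hedged sketch (not actual glibc test code; the function name and the
C++-compiler precondition are illustrative), a test following this
convention might look like:

```shell
#!/bin/sh
# Sketch: a test that cannot run usefully in this configuration
# signals UNSUPPORTED by exiting 77, the Autotest convention that
# evaluate-test.sh now recognizes.

check_cxx_test () (
  # Hypothetical precondition: this test needs a C++ compiler.
  if [ -z "${CXX-}" ]; then
    exit 77   # recorded as UNSUPPORTED; does not fail 'make check'
  fi
  # ... real test body would go here ...
  exit 0      # recorded as PASS
)
```

The subshell body `( ... )` lets the function use `exit` without
terminating the caller, mirroring how a standalone test binary would
report its status.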

For tests that cannot be built, the infrastructure for subdir makefiles now
supports a tests-unsupported variable.  This patch uses that for all the
missing-C++ cases that I touched earlier this week.

We have many existing cases of tests that do compile but are turned into
no-ops that always pass either at compile time or at run time.  Once this
lands, we should start converting those such that they report UNSUPPORTED
rather than PASS.  But that is a later step.

First I'd like to reach consensus on the script/makefile level changes here.

Notably, I've changed the summarizing stuff so that it prints out all
results that are neither PASS nor XFAIL, so that catches all ERROR and
XPASS as well as the new UNSUPPORTED.  Similarly, I've changed the logic
that decides whether 'make check' as a whole succeeded or not so that it
rejects anything that is not PASS, XFAIL, or UNSUPPORTED, rather than
rejecting only things that are exactly FAIL or ERROR.  This means that any
existing XPASS cases now cause 'make check' to fail.  I think that's the
right thing, since XPASS means something funny is going on and either the
test needs to be changed or the XFAIL expectation needs to be dropped.  I
also think it's definitely the right thing to both display and barf on any
statuses that are not in the expected set.
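
The summarizing logic described above can be sketched in plain shell
(mirroring the stricter behavior as posted, where XPASS also fails the
run; the function name is illustrative and the input is assumed to be a
.sum file with one "RESULT: testname" line per test):

```shell
#!/bin/sh
# Sketch of the summarize-tests logic as a shell function.

summarize () {
  sum=$1
  # Display every anomalous line: anything that is not PASS or XFAIL,
  # which catches ERROR, FAIL, XPASS, and the new UNSUPPORTED.
  grep -Ev '^(PASS|XFAIL):' "$sum" || true
  echo "Summary of test results:"
  sed 's/:.*//' "$sum" | sort | uniq -c
  # Overall status: fail if any line is not PASS, XFAIL, or UNSUPPORTED.
  if grep -Eqv '^(PASS|XFAIL|UNSUPPORTED):' "$sum"; then
    return 1
  fi
  return 0
}
```

Note the asymmetry: UNSUPPORTED is printed in the anomaly listing so it
draws attention, but it does not make the overall run fail.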


Thanks,
Roland


2015-03-06  Roland McGrath  <roland@hack.frob.com>

	* scripts/evaluate-test.sh: Grok exit code 77 as UNSUPPORTED and exit
	with 0 in that case.
	* Makefile (summarize-tests): New canned sequence, factored out of
	commands for targets tests and xtests.  Display summary lines that
	don't start with PASS: or XFAIL: rather than ones that do start with
	ERROR: or FAIL:.  Make the commands fail if any summary lines fail
	to start with PASS: or XFAIL: or UNSUPPORTED: rather than if any
	do start with ERROR: or FAIL:.
	* dlfcn/Makefile (tests): Add bug-atexit3 back here unconditionally
	(except for [$(build-shared) = yes]).
	(tests-unsupported) [$(CXX) empty]: Add bug-atexit3.
	(LDLIBS-bug-atexit3-lib.so): Conditionalize on [$(CXX) nonempty].
	($(objpfx)bug-atexit3, $(objpfx)bug-atexit3.out): Likewise.
	* nptl/Makefile: Revert 2015-03-04 changes.
	[$(CXX) empty] (tests-unsupported): New variable.
	* debug/Makefile: Likewise.

Comments

Joseph Myers March 7, 2015, 1:37 a.m. UTC | #1
On Fri, 6 Mar 2015, Roland McGrath wrote:

> evaluate-test.sh now follows the Autotest convention of exit code 77
> meaning a test that cannot be built or cannot be usefully run,
> i.e. UNSUPPORTED.

I wonder if any existing tests return a count of errors (incrementing a 
variable on error rather than setting it to 1, then returning that 
variable) and could potentially return 77 by accident.  But such tests 
would already be broken if they returned 256, so any issues can be 
considered pre-existing.
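
The hazard can be sketched as follows (a hypothetical test, not code
from glibc): a test that exits with a count of failed checks collides
with the new meaning of exit code 77, and already wrapped modulo 256
before this patch.

```shell
#!/bin/sh
# Hypothetical failure-counting test: exits with the number of
# failed checks rather than a simple 0/1 status.

run_counting_test () (
  failures=$1
  # ... imagine $failures checks failed ...
  exit "$failures"
)
```

With exactly 77 failures this would now be misreported as UNSUPPORTED
instead of FAIL; with 256 failures the status already wrapped to 0
(PASS), which is why the bug class is pre-existing.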

> We have many existing cases of tests that do compile but are turned into
> no-ops that always pass either at compile time or at run time.  Once this
> lands, we should start converting those such that they report UNSUPPORTED
> rather than PASS.  But that is a later step.

And if cross-compiling without test-wrapper (so run-built-tests = no), in 
principle all affected tests should be UNSUPPORTED.  And at some point 
there's the issue of allowing tests to report status for their individual 
assertions (so a test might have some assertions that pass and some that 
are unsupported).

> Notably, I've changed the summarizing stuff so that it prints out all
> results that are neither PASS nor XFAIL, so that catches all ERROR and
> XPASS as well as the new UNSUPPORTED.  Similarly, I've changed the logic
> that decides whether 'make check' as a whole succeeded or not so that it
> rejects anything that is not PASS, XFAIL, or UNSUPPORTED, rather than
> rejecting only things that are exactly FAIL or ERROR.  This means that any
> existing XPASS cases now cause 'make check' to fail.  I think that's the
> right thing, since XPASS means something funny is going on and either the
> test needs to be changed or the XFAIL expectation needs to be dropped.  I
> also think it's definitely the right thing to both display and barf on any
> statuses that are not in the expected set.

I don't think XPASSes should cause "make check" to fail - or at least, 
that's contrary to how we use XFAILs at present.  XPASSes appear right now 
for cases where the expected failure depends on conditions there is 
otherwise no need to test for in configure or makefiles.  In conform/, 
that includes (a) tests requiring GCC 4.7 or later, (b) tests that may 
depend on whether some non-glibc package has installed ndbm.h, (c) tests 
that are expected to fail on at least one of x86_64 and x86 but where the 
failure is for an architecture-specific (library or header) issue and so 
the tests pass on some other platforms.  Only (c) is a case where the 
XFAILs could simply be moved to sysdeps if desired and the relevant 
sysdeps condition is worked out.  It's not clear to me that it's useful to 
condition these XFAILs more precisely.  And XFAILs might well be useful 
for a few nptl/ and rt/ tests that are known to fail intermittently, 
possibly because of Linux kernel issues (where such a test issue is known, 
it seems better to me for the test to be variably XPASS or XFAIL, than 
variably PASS or FAIL).
Carlos O'Donell March 8, 2015, 7:23 p.m. UTC | #2
Roland,

Thanks for tackling this.

On 03/06/2015 08:02 PM, Roland McGrath wrote:
> This gives our test infrastructure the beginnings of handling UNSUPPORTED
> as an outcome for a test (as opposed to PASS, FAIL, or ERROR).
> evaluate-test.sh now follows the Autotest convention of exit code 77
> meaning a test that cannot be built or cannot be usefully run,
> i.e. UNSUPPORTED.

OK.

> For tests that cannot be built, the infrastructure for subdir makefiles now
> supports a tests-unsupported variable.  This patch uses that for all the
> missing-C++ cases that I touched earlier this week.

OK.
 
> We have many existing cases of tests that do compile but are turned into
> no-ops that always pass either at compile time or at run time.  Once this
> lands, we should start converting those such that they report UNSUPPORTED
> rather than PASS.  But that is a later step.

Agreed.

> First I'd like to reach consensus on the script/makefile level changes here.
> 
> Notably, I've changed the summarizing stuff so that it prints out all
> results that are neither PASS nor XFAIL, so that catches all ERROR and
> XPASS as well as the new UNSUPPORTED.  Similarly, I've changed the logic
> that decides whether 'make check' as a whole succeeded or not so that it
> rejects anything that is not PASS, XFAIL, or UNSUPPORTED, rather than
> rejecting only things that are exactly FAIL or ERROR.  This means that any
> existing XPASS cases now cause 'make check' to fail.  I think that's the
> right thing, since XPASS means something funny is going on and either the
> test needs to be changed or the XFAIL expectation needs to be dropped.  I
> also think it's definitely the right thing to both display and barf on any
> statuses that are not in the expected set.

I'm fine with the output being more verbose.

I'm less fine with XPASS failing the entire make check run. There are cases
that I want to XFAIL for Fedora that include intermittent test failures related
to kernel or glibc issues. If those tests pass and become XPASS, then `make check`
should not fail. Likewise if you fix a binutils or gcc bug which fixes the test
in question then the XPASS should not fail, but it should be printed to draw
your attention to the fact that it's passing now.

This is entirely subjective though, and I'm not willing to hold up this good
change just for this case which I haven't implemented yet.

> 2015-03-06  Roland McGrath  <roland@hack.frob.com>
> 
> 	* scripts/evaluate-test.sh: Grok exit code 77 as UNSUPPORTED and exit
> 	with 0 in that case.
> 	* Makefile (summarize-tests): New canned sequence, factored out of
> 	commands for targets tests and xtests.  Display summary lines that
> 	don't start with PASS: or XFAIL: rather than ones that do start with
> 	ERROR: or FAIL:.  Make the commands fail if any summary lines fail
> 	to start with PASS: or XFAIL: or UNSUPPORTED: rather than if any
> 	do start with ERROR: or FAIL:.
> 	* dlfcn/Makefile (tests): Add bug-atexit3 back here unconditionally
> 	(except for [$(build-shared) = yes]).
> 	(tests-unsupported) [$(CXX) empty]: Add bug-atexit3.
> 	(LDLIBS-bug-atexit3-lib.so): Conditionalize on [$(CXX) nonempty].
> 	($(objpfx)bug-atexit3, $(objpfx)bug-atexit3.out): Likewise.
> 	* nptl/Makefile: Revert 2015-03-04 changes.
> 	[$(CXX) empty] (tests-unsupported): New variable.
> 	* debug/Makefile: Likewise.
> 
> --- a/Makefile
> +++ b/Makefile
> @@ -316,6 +316,13 @@ $(objpfx)begin-end-check.out: scripts/begin-end-check.pl
>  	$(evaluate-test)
>  endif
>  
> +define summarize-tests
> +@egrep -v '^(PASS|XFAIL):' $(objpfx)$1 || true
> +@echo "Summary of test results$2:"
> +@sed 's/:.*//' < $(objpfx)$1 | sort | uniq -c
> +@egrep -q -v '^(PASS|XFAIL|UNSUPPORTED):' $(objpfx)$1 && false
> +endef
> +

Not OK. See discussion of XPASS above.

>  tests-special-notdir = $(patsubst $(objpfx)%, %, $(tests-special))
>  tests: $(tests-special)
>  	$(..)scripts/merge-test-results.sh -s $(objpfx) "" \
> @@ -324,22 +331,12 @@ tests: $(tests-special)
>  	$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-tests.sum \
>  	  $(sort $(subdirs) .) \
>  	  > $(objpfx)tests.sum
> -	@grep '^ERROR:' $(objpfx)tests.sum || true
> -	@grep '^FAIL:' $(objpfx)tests.sum || true
> -	@echo "Summary of test results:"
> -	@sed 's/:.*//' < $(objpfx)tests.sum | sort | uniq -c
> -	@if grep -q '^ERROR:' $(objpfx)tests.sum; then exit 1; fi
> -	@if grep -q '^FAIL:' $(objpfx)tests.sum; then exit 1; fi
> +	$(call summarize-tests,tests.sum)

OK.

>  xtests:
>  	$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-xtests.sum \
>  	  $(sort $(subdirs)) \
>  	  > $(objpfx)xtests.sum
> -	@grep '^ERROR:' $(objpfx)xtests.sum || true
> -	@grep '^FAIL:' $(objpfx)xtests.sum || true
> -	@echo "Summary of test results for extra tests:"
> -	@sed 's/:.*//' < $(objpfx)xtests.sum | sort | uniq -c
> -	@if grep -q '^ERROR:' $(objpfx)xtests.sum; then exit 1; fi
> -	@if grep -q '^FAIL:' $(objpfx)xtests.sum; then exit 1; fi
> +	$(call summarize-tests,xtests.sum, for extra tests)

OK.

>  
>  # The realclean target is just like distclean for the parent, but we want
>  # the subdirs to know the difference in case they care.
> --- a/Rules
> +++ b/Rules
> @@ -198,6 +198,18 @@ $(objpfx)%.out: /dev/null $(objpfx)%	# Make it 2nd arg for canned sequence.
>  	$(make-test-out) > $@; \
>  	$(evaluate-test)
>  
> +# tests-unsupported lists tests that we will not try to build at all in
> +# this configuration.  Note this runs every time because it does not
> +# actually create its target.  The dependency on Makefile is meant to
> +# ensure that it runs after a Makefile change to add a test to the list
> +# when it previously ran and produced a .out file (probably for a failure).
> +ifneq "$(strip $(tests-unsupported))" ""
> +$(tests-unsupported:%=$(objpfx)%.out): $(objpfx)%.out: Makefile
> +	@rm -f $@
> +	$(..)scripts/evaluate-test.sh $(patsubst $(common-objpfx)%.out,%,$@) \
> +				      77 false false > $(@:.out=.test-result)
> +endif
> +

OK.

RFE: Could we have a target that actually lists information about the tests
*before* you run them? That way a developer can verify what is about to
get tested?

>  endif	# tests
>  
>  
> --- a/debug/Makefile
> +++ b/debug/Makefile
> @@ -133,11 +133,13 @@ LDFLAGS-tst-backtrace6 = -rdynamic
>  
>  tests = backtrace-tst tst-longjmp_chk tst-chk1 tst-chk2 tst-chk3 \
>  	tst-lfschk1 tst-lfschk2 tst-lfschk3 test-strcpy_chk test-stpcpy_chk \
> +	tst-chk4 tst-chk5 tst-chk6 tst-lfschk4 tst-lfschk5 tst-lfschk6 \
>  	tst-longjmp_chk2 tst-backtrace2 tst-backtrace3 tst-backtrace4 \
>  	tst-backtrace5 tst-backtrace6
>  
> -ifneq (,$(CXX))
> -tests += tst-chk4 tst-chk5 tst-chk6 tst-lfschk4 tst-lfschk5 tst-lfschk6
> +ifeq (,$(CXX))
> +tests-unsupported = tst-chk4 tst-chk5 tst-chk6 \
> +		    tst-lfschk4 tst-lfschk5 tst-lfschk6

OK.

>  endif
>  
>  extra-libs = libSegFault libpcprofile
> --- a/dlfcn/Makefile
> +++ b/dlfcn/Makefile
> @@ -36,7 +36,7 @@ endif
>  ifeq (yes,$(build-shared))
>  tests = glrefmain failtest tst-dladdr default errmsg1 tstcxaatexit \
>  	bug-dlopen1 bug-dlsym1 tst-dlinfo bug-atexit1 bug-atexit2 \
> -	tstatexit bug-dl-leaf tst-rec-dlopen
> +	bug-atexit3 tstatexit bug-dl-leaf tst-rec-dlopen
>  endif
>  modules-names = glreflib1 glreflib2 glreflib3 failtestmod defaultmod1 \
>  		defaultmod2 errmsg1mod modatexit modcxaatexit \
> @@ -59,8 +59,9 @@ tststatic4-ENV = $(tststatic-ENV)
>  tststatic5-ENV = $(tststatic-ENV)
>  
>  ifneq (,$(CXX))
> -tests += bug-atexit3
>  modules-names += bug-atexit3-lib
> +else
> +tests-unsupported += bug-atexit3

OK.

>  endif
>  endif
>  
> @@ -136,9 +137,11 @@ $(objpfx)bug-atexit1.out: $(objpfx)bug-atexit1-lib.so
>  $(objpfx)bug-atexit2: $(libdl)
>  $(objpfx)bug-atexit2.out: $(objpfx)bug-atexit2-lib.so
>  
> +ifneq (,$(CXX))
>  LDLIBS-bug-atexit3-lib.so = -lstdc++ -lgcc_eh
>  $(objpfx)bug-atexit3: $(libdl)
>  $(objpfx)bug-atexit3.out: $(objpfx)bug-atexit3-lib.so
> +endif

OK. Skip the lib if we're just going to mark bug-atexit3 UNSUPPORTED.

>  
>  $(objpfx)bug-dl-leaf: $(objpfx)bug-dl-leaf-lib.so
>  $(objpfx)bug-dl-leaf.out: $(objpfx)bug-dl-leaf-lib-cb.so
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -245,8 +245,8 @@ tests = tst-typesizes \
>  	tst-cancel6 tst-cancel7 tst-cancel8 tst-cancel9 tst-cancel10 \
>  	tst-cancel11 tst-cancel12 tst-cancel13 tst-cancel14 tst-cancel15 \
>  	tst-cancel16 tst-cancel17 tst-cancel18 tst-cancel19 tst-cancel20 \
> -	tst-cancel21 tst-cancel22 tst-cancel23 $(if $(CXX),tst-cancel24) \
> -	tst-cancel25 tst-cancel-self tst-cancel-self-cancelstate \
> +	tst-cancel21 tst-cancel22 tst-cancel23 tst-cancel24 tst-cancel25 \
> +	tst-cancel-self tst-cancel-self-cancelstate \

OK.

>  	tst-cancel-self-canceltype tst-cancel-self-testcancel \
>  	tst-cleanup0 tst-cleanup1 tst-cleanup2 tst-cleanup3 tst-cleanup4 \
>  	tst-flock1 tst-flock2 \
> @@ -364,19 +364,14 @@ link-libc-static := $(common-objpfx)libc.a $(static-gnulib) \
>  		    $(common-objpfx)libc.a
>  
>  tests-static += tst-locale1 tst-locale2 tst-stackguard1-static \
> -		tst-cancel21-static tst-cond8-static \
> +		tst-cancel21-static tst-cancel24-static tst-cond8-static \

OK.

>  		tst-mutex8-static tst-mutexpi8-static tst-sem11-static \
>  		tst-sem12-static
> -tests += tst-stackguard1-static tst-cancel21-static \
> +tests += tst-stackguard1-static tst-cancel21-static tst-cancel24-static \

OK.

>  	 tst-cond8-static tst-mutex8-static tst-mutexpi8-static \
>  	 tst-sem11-static tst-sem12-static
>  xtests-static += tst-setuid1-static
>  
> -ifneq (,$(CXX))
> -tests += tst-cancel24-static
> -tests-static += tst-cancel24-static
> -endif
> -

OK.

>  # These tests are linked with libc before libpthread
>  tests-reverse += tst-cancel5 tst-cancel23 tst-vfork1x tst-vfork2x
>  
> @@ -388,6 +383,11 @@ tests-special += $(objpfx)tst-tls6.out $(objpfx)tst-cleanup0-cmp.out \
>  endif
>  endif
>  
> +ifeq (,$(CXX))
> +# These tests require a C++ compiler and runtime.
> +tests-unsupported += tst-cancel24 tst-cancel24-static
> +endif

OK.

> +
>  include ../Rules
>  
>  ifeq (yes,$(build-shared))
> --- a/scripts/evaluate-test.sh
> +++ b/scripts/evaluate-test.sh
> @@ -25,15 +25,20 @@ orig_rc=$rc
>  xfail=$3
>  stop_on_failure=$4
>  
> -if [ $rc -eq 0 ]; then
> -  result="PASS"
> -else
> -  result="FAIL"
> -fi
> -
> -if $xfail; then
> -  result="X$result"
> +if [ $rc -eq 77 ]; then
> +  result="UNSUPPORTED"

OK.

>    rc=0
> +else
> +  if [ $rc -eq 0 ]; then
> +    result="PASS"
> +  else
> +    result="FAIL"
> +  fi
> +
> +  if $xfail; then
> +    result="X$result"
> +    rc=0
> +  fi
>  fi

OK.

>  
>  echo "$result: $test_name"
> 

Cheers,
Carlos.
Joseph Myers March 9, 2015, 4:03 p.m. UTC | #3
On Sun, 8 Mar 2015, Carlos O'Donell wrote:

> I'm less fine with XPASS failing the entire make check run. There are cases
> that I want to XFAIL for Fedora that include intermittent test failures related
> to kernel or glibc issues. If those tests pass and become XPASS, then `make check`

I'd like to see such XFAILs go in the main glibc repository, not just in 
distribution versions, if the cause of the failure isn't distribution 
glibc changes.  (It would be desirable to comment them e.g. "this fails 
intermittently with kernels before commit abcdef, which was in kernel g.h 
and later", so we can tell when such XFAILs are no longer needed - or if 
the failure is well-enough understood to know there's a glibc bug 
involved, the comment should point to that bug in Bugzilla.  We could do 
with a policy on how conditional to try to make such XFAILs.)

Generically, I think listing known failures on the per-release wiki pages 
is worse than XFAILing them, at least after some analysis has been done to 
understand the cause of the failure and how to tell when it's been fixed.
Roland McGrath March 10, 2015, 10:10 p.m. UTC | #4
> I wonder if any existing tests return a count of errors (incrementing a 
> variable on error rather than setting it to 1, then returning that 
> variable) and could potentially return 77 by accident.  But such tests 
> would already be broken if they returned 256, so any issues can be 
> considered pre-existing.

Right.  (To be pedantic, there might be such cases where there are >= 77
but < 256 errors possible in the test, in which case it would be a new
issue in practice.)  Offhand, I found test-efgcvt.c seems to be in this
category.  There might be others.

I think we should formalize our use of exit status values in tests and then
audit all the tests to make sure they conform.  That would go along with an
audit to make sure everything is using test-skeleton and so forth.  I think
with this as a stated plan, it's OK to install the new infrastructure with
the possible new risk of FAIL->UNSUPPORTED for this corner case.

> And if cross-compiling without test-wrapper (so run-built-tests = no), in 
> principle all affected tests should be UNSUPPORTED.  

In principle, yes.  But I don't think it's especially useful in the real
world.  Certainly it can wait until further cleanup of the test
infrastructure we will do at some point in the future.

> And at some point there's the issue of allowing tests to report status
> for their individual assertions (so a test might have some assertions
> that pass and some that are unsupported).

Something like that I don't think we'll achieve before a complete overhaul
of our entire approach to tests.

> I don't think XPASSes should cause "make check" to fail - or at least, 
> that's contrary to how we use XFAILs at present. [...]

OK.  I used the logic that XPASS is an unexpected anomaly that merits
investigation.  I still tend to think that this is what it should mean.
But it was not the intent of this change to change existing policy on the
meaning of XFAIL or XPASS.  We can revisit all the subtleties you raised
later on.  For now, I'll change my code to admit XPASS for purposes of the
'make check' exit status (but still list them in the summary output).


Thanks,
Roland
Roland McGrath March 10, 2015, 10:12 p.m. UTC | #5
> RFE: Could we have a target that actually lists information about the tests
> *before* you run them? That way a developer can verify what is about to
> get tested?

I don't know exactly what you have in mind.  It seems orthogonal to what
I've worked on here.  Feel free to post such a change.


Thanks,
Roland
Roland McGrath March 10, 2015, 10:14 p.m. UTC | #6
I've committed it with the only change from what I posted being not making
XPASS cause 'make check' to exit nonzero.  Everything else discussed can be
done in follow-on changes, none of which I am particularly planning to do.
Andreas Schwab March 10, 2015, 10:54 p.m. UTC | #7
Roland McGrath <roland@hack.frob.com> writes:

>> I don't think XPASSes should cause "make check" to fail - or at least, 
>> that's contrary to how we use XFAILs at present. [...]
>
> OK.  I used the logic that XPASS is an unexpected anomaly that merits
> investigation.

The dejagnu framework treats XPASS as an unsuccessful outcome.

Andreas.
Carlos O'Donell March 11, 2015, 3:17 a.m. UTC | #8
On 03/10/2015 06:14 PM, Roland McGrath wrote:
> I've committed it with the only change from what I posted being not making
> XPASS cause 'make check' to exit nonzero.  Everything else discussed can be
> done in follow-on changes, none of which I am particularly planning to do.

Thanks for this.

Cheers,
Carlos.

Patch

--- a/Makefile
+++ b/Makefile
@@ -316,6 +316,13 @@  $(objpfx)begin-end-check.out: scripts/begin-end-check.pl
 	$(evaluate-test)
 endif
 
+define summarize-tests
+@egrep -v '^(PASS|XFAIL):' $(objpfx)$1 || true
+@echo "Summary of test results$2:"
+@sed 's/:.*//' < $(objpfx)$1 | sort | uniq -c
+@egrep -q -v '^(PASS|XFAIL|UNSUPPORTED):' $(objpfx)$1 && false
+endef
+
 tests-special-notdir = $(patsubst $(objpfx)%, %, $(tests-special))
 tests: $(tests-special)
 	$(..)scripts/merge-test-results.sh -s $(objpfx) "" \
@@ -324,22 +331,12 @@  tests: $(tests-special)
 	$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-tests.sum \
 	  $(sort $(subdirs) .) \
 	  > $(objpfx)tests.sum
-	@grep '^ERROR:' $(objpfx)tests.sum || true
-	@grep '^FAIL:' $(objpfx)tests.sum || true
-	@echo "Summary of test results:"
-	@sed 's/:.*//' < $(objpfx)tests.sum | sort | uniq -c
-	@if grep -q '^ERROR:' $(objpfx)tests.sum; then exit 1; fi
-	@if grep -q '^FAIL:' $(objpfx)tests.sum; then exit 1; fi
+	$(call summarize-tests,tests.sum)
 xtests:
 	$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-xtests.sum \
 	  $(sort $(subdirs)) \
 	  > $(objpfx)xtests.sum
-	@grep '^ERROR:' $(objpfx)xtests.sum || true
-	@grep '^FAIL:' $(objpfx)xtests.sum || true
-	@echo "Summary of test results for extra tests:"
-	@sed 's/:.*//' < $(objpfx)xtests.sum | sort | uniq -c
-	@if grep -q '^ERROR:' $(objpfx)xtests.sum; then exit 1; fi
-	@if grep -q '^FAIL:' $(objpfx)xtests.sum; then exit 1; fi
+	$(call summarize-tests,xtests.sum, for extra tests)
 
 # The realclean target is just like distclean for the parent, but we want
 # the subdirs to know the difference in case they care.
--- a/Rules
+++ b/Rules
@@ -198,6 +198,18 @@  $(objpfx)%.out: /dev/null $(objpfx)%	# Make it 2nd arg for canned sequence.
 	$(make-test-out) > $@; \
 	$(evaluate-test)
 
+# tests-unsupported lists tests that we will not try to build at all in
+# this configuration.  Note this runs every time because it does not
+# actually create its target.  The dependency on Makefile is meant to
+# ensure that it runs after a Makefile change to add a test to the list
+# when it previously ran and produced a .out file (probably for a failure).
+ifneq "$(strip $(tests-unsupported))" ""
+$(tests-unsupported:%=$(objpfx)%.out): $(objpfx)%.out: Makefile
+	@rm -f $@
+	$(..)scripts/evaluate-test.sh $(patsubst $(common-objpfx)%.out,%,$@) \
+				      77 false false > $(@:.out=.test-result)
+endif
+
 endif	# tests
 
 
--- a/debug/Makefile
+++ b/debug/Makefile
@@ -133,11 +133,13 @@  LDFLAGS-tst-backtrace6 = -rdynamic
 
 tests = backtrace-tst tst-longjmp_chk tst-chk1 tst-chk2 tst-chk3 \
 	tst-lfschk1 tst-lfschk2 tst-lfschk3 test-strcpy_chk test-stpcpy_chk \
+	tst-chk4 tst-chk5 tst-chk6 tst-lfschk4 tst-lfschk5 tst-lfschk6 \
 	tst-longjmp_chk2 tst-backtrace2 tst-backtrace3 tst-backtrace4 \
 	tst-backtrace5 tst-backtrace6
 
-ifneq (,$(CXX))
-tests += tst-chk4 tst-chk5 tst-chk6 tst-lfschk4 tst-lfschk5 tst-lfschk6
+ifeq (,$(CXX))
+tests-unsupported = tst-chk4 tst-chk5 tst-chk6 \
+		    tst-lfschk4 tst-lfschk5 tst-lfschk6
 endif
 
 extra-libs = libSegFault libpcprofile
--- a/dlfcn/Makefile
+++ b/dlfcn/Makefile
@@ -36,7 +36,7 @@  endif
 ifeq (yes,$(build-shared))
 tests = glrefmain failtest tst-dladdr default errmsg1 tstcxaatexit \
 	bug-dlopen1 bug-dlsym1 tst-dlinfo bug-atexit1 bug-atexit2 \
-	tstatexit bug-dl-leaf tst-rec-dlopen
+	bug-atexit3 tstatexit bug-dl-leaf tst-rec-dlopen
 endif
 modules-names = glreflib1 glreflib2 glreflib3 failtestmod defaultmod1 \
 		defaultmod2 errmsg1mod modatexit modcxaatexit \
@@ -59,8 +59,9 @@  tststatic4-ENV = $(tststatic-ENV)
 tststatic5-ENV = $(tststatic-ENV)
 
 ifneq (,$(CXX))
-tests += bug-atexit3
 modules-names += bug-atexit3-lib
+else
+tests-unsupported += bug-atexit3
 endif
 endif
 
@@ -136,9 +137,11 @@  $(objpfx)bug-atexit1.out: $(objpfx)bug-atexit1-lib.so
 $(objpfx)bug-atexit2: $(libdl)
 $(objpfx)bug-atexit2.out: $(objpfx)bug-atexit2-lib.so
 
+ifneq (,$(CXX))
 LDLIBS-bug-atexit3-lib.so = -lstdc++ -lgcc_eh
 $(objpfx)bug-atexit3: $(libdl)
 $(objpfx)bug-atexit3.out: $(objpfx)bug-atexit3-lib.so
+endif
 
 $(objpfx)bug-dl-leaf: $(objpfx)bug-dl-leaf-lib.so
 $(objpfx)bug-dl-leaf.out: $(objpfx)bug-dl-leaf-lib-cb.so
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -245,8 +245,8 @@  tests = tst-typesizes \
 	tst-cancel6 tst-cancel7 tst-cancel8 tst-cancel9 tst-cancel10 \
 	tst-cancel11 tst-cancel12 tst-cancel13 tst-cancel14 tst-cancel15 \
 	tst-cancel16 tst-cancel17 tst-cancel18 tst-cancel19 tst-cancel20 \
-	tst-cancel21 tst-cancel22 tst-cancel23 $(if $(CXX),tst-cancel24) \
-	tst-cancel25 tst-cancel-self tst-cancel-self-cancelstate \
+	tst-cancel21 tst-cancel22 tst-cancel23 tst-cancel24 tst-cancel25 \
+	tst-cancel-self tst-cancel-self-cancelstate \
 	tst-cancel-self-canceltype tst-cancel-self-testcancel \
 	tst-cleanup0 tst-cleanup1 tst-cleanup2 tst-cleanup3 tst-cleanup4 \
 	tst-flock1 tst-flock2 \
@@ -364,19 +364,14 @@  link-libc-static := $(common-objpfx)libc.a $(static-gnulib) \
 		    $(common-objpfx)libc.a
 
 tests-static += tst-locale1 tst-locale2 tst-stackguard1-static \
-		tst-cancel21-static tst-cond8-static \
+		tst-cancel21-static tst-cancel24-static tst-cond8-static \
 		tst-mutex8-static tst-mutexpi8-static tst-sem11-static \
 		tst-sem12-static
-tests += tst-stackguard1-static tst-cancel21-static \
+tests += tst-stackguard1-static tst-cancel21-static tst-cancel24-static \
 	 tst-cond8-static tst-mutex8-static tst-mutexpi8-static \
 	 tst-sem11-static tst-sem12-static
 xtests-static += tst-setuid1-static
 
-ifneq (,$(CXX))
-tests += tst-cancel24-static
-tests-static += tst-cancel24-static
-endif
-
 # These tests are linked with libc before libpthread
 tests-reverse += tst-cancel5 tst-cancel23 tst-vfork1x tst-vfork2x
 
@@ -388,6 +383,11 @@  tests-special += $(objpfx)tst-tls6.out $(objpfx)tst-cleanup0-cmp.out \
 endif
 endif
 
+ifeq (,$(CXX))
+# These tests require a C++ compiler and runtime.
+tests-unsupported += tst-cancel24 tst-cancel24-static
+endif
+
 include ../Rules
 
 ifeq (yes,$(build-shared))
--- a/scripts/evaluate-test.sh
+++ b/scripts/evaluate-test.sh
@@ -25,15 +25,20 @@  orig_rc=$rc
 xfail=$3
 stop_on_failure=$4
 
-if [ $rc -eq 0 ]; then
-  result="PASS"
-else
-  result="FAIL"
-fi
-
-if $xfail; then
-  result="X$result"
+if [ $rc -eq 77 ]; then
+  result="UNSUPPORTED"
   rc=0
+else
+  if [ $rc -eq 0 ]; then
+    result="PASS"
+  else
+    result="FAIL"
+  fi
+
+  if $xfail; then
+    result="X$result"
+    rc=0
+  fi
 fi
 
 echo "$result: $test_name"