testsuite: Fix subexpressions with `scan-assembler-times'

Message ID alpine.DEB.2.20.2311190446360.5892@tpp.orcam.me.uk
State New

Commit Message

Maciej W. Rozycki Nov. 19, 2023, 11:27 a.m. UTC
`scan-assembler-times' handles expressions that contain subexpressions, 
as produced by capturing parentheses `()', in an odd way that is 
inconsistent with `scan-assembler', `scan-assembler-not', etc.  The 
problem comes from calling `regexp' with `-inline -all', which causes 
the data that would otherwise be placed in match variables to be 
returned as a list, with one extra element per subexpression for each 
match.

Consequently if we have say:

/* { dg-final { scan-assembler-times "\\s(foo|bar)\\s" 1 } } */

in a test case and there is a lone `foo' present in output being matched, 
then our invocation of `regexp -inline -all' in `scan-assembler-times' 
will return:

{ foo } foo

and that in turn will confuse our match count calculation as `llength' 
will return 2 rather than 1, making the test fail even though `foo' was 
only actually matched once.
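
The miscount described above can be reproduced in a plain `tclsh' session.  
This is only a minimal sketch: the `text' string is made-up assembler 
output, not taken from any real test case, and the variable names are 
hypothetical.

```tcl
# Hypothetical assembler output containing a single `foo'.
set text "\tcall foo \n\tret\n"
set pattern {\s(foo|bar)\s}

# With -inline -all, each match contributes the overall match
# followed by one element per capturing subexpression.
set matches [regexp -inline -all -- $pattern $text]
set old_count [llength $matches]

puts $old_count   ;# prints 2, although `foo' was matched only once
```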

It seems unclear why we chose to call `regexp' in such an odd way in the 
first place just to figure out the number of matches.  The first version 
of TCL that supports the `-all' option to `regexp' is 8.3, and according 
to its documentation[1][2] `regexp' already returns the number of matches 
found whenever `-all' has been used *unless* `-inline' has also been used.

Remove the `-inline' option then along with the `llength' invocation.
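
As a rough illustration of the difference, the sketch below compares the 
two ways of counting on made-up input holding two matches.  The `text' 
string and the variable names are assumptions made for the example only.

```tcl
# Hypothetical assembler output with one `foo' and one `bar'.
set text "\tcall foo \n\tcall bar \n"
set pattern {\s(foo|bar)\s}

# Without -inline, -all makes regexp return the number of matches.
set new_count [regexp -all -- $pattern $text]

# The previous approach counts list elements instead, so every
# capturing subexpression inflates the result.
set old_count [llength [regexp -inline -all -- $pattern $text]]

puts "$new_count $old_count"   ;# prints "2 4"
```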

References:

[1] "Tcl Built-In Commands - regexp manual page", 
    <https://www.tcl.tk/man/tcl8.2.3/TclCmd/regexp.html>

[2] "Tcl Built-In Commands - regexp manual page", 
    <https://www.tcl.tk/man/tcl8.3/TclCmd/regexp.html>

	gcc/testsuite/
	* lib/scanasm.exp (scan-assembler-times): Remove the `-inline' 
	option to `regexp' and the wrapping `llength' call.
---
Hi,

 Verified with the `riscv64-linux-gnu' target and the C language
testsuite.  OK to apply?

  Maciej
---
 gcc/testsuite/lib/scanasm.exp |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

gcc-test-scan-assembler-times-count.diff

Comments

Jeff Law Nov. 19, 2023, 11:44 p.m. UTC | #1
On 11/19/23 04:27, Maciej W. Rozycki wrote:
>   Verified with the `riscv64-linux-gnu' target and the C language
> testsuite.  OK to apply?
Not sure why it is the way it is -- I walked back to Zdenek's change 
which introduced `scan-assembler-times' and found nothing about the 
`-inline' argument.

OK, but be on the lookout for scan-asm problems on other targets over 
the next few days.

Jeff

Maciej W. Rozycki Nov. 22, 2023, 2:13 a.m. UTC | #2
On Sun, 19 Nov 2023, Jeff Law wrote:

> >   Verified with the `riscv64-linux-gnu' target and the C language
> > testsuite.  OK to apply?
> Not sure why it is the way it is -- I walked back to Zdenek's change which
> introduced `scan-assembler-times' and found nothing about the `-inline'
> argument.

 I went through our history beforehand too and found nothing interesting 
either.  My only suspicion is that it may have happened as a consequence 
of the somewhat confusing regexp(n) TCL documentation, which just says:

"Determines whether the regular expression exp matches part or all of 
string and returns 1 if it does, 0 if it does not, unless -inline is 
specified (see below)."

and then you need to dive into the description of `-all' to find out it 
actually returns the number of matches rather than just 1 or 0:

"Causes the regular expression to be matched as many times as possible in 
the string, returning the total number of matches found."

I guess maybe Zdenek missed the part after the comma?

> OK, but be on the lookout for scan-asm problems on other targets over the next
> few days.

 Good point.  I have grepped our testsuite for instances and found only 
one (as opposed to numerous non-captured subexpressions), specifically 
gcc/testsuite/gcc.target/arm/pr53447-5.c, well-documented as working 
around the quirk.  I've posted a change to avoid the quirk with this case: 
<https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637710.html> and 
I mean to apply it just before this `scan-assembler-times' fix.
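
 For reference, a minimal sketch of why non-captured subexpressions were 
never affected, using made-up input rather than anything from the 
testsuite (all names here are hypothetical):

```tcl
# Hypothetical assembler output containing a single `foo'.
set text "\tcall foo \n"
set capturing    {\s(foo|bar)\s}
set noncapturing {\s(?:foo|bar)\s}

# Old counting method: a capturing group adds an extra list element,
# a non-capturing `(?:...)' group does not.
set old_capturing    [llength [regexp -inline -all -- $capturing $text]]
set old_noncapturing [llength [regexp -inline -all -- $noncapturing $text]]

# New counting method: the return value is the number of matches
# regardless of how the pattern groups its alternatives.
set new_capturing [regexp -all -- $capturing $text]

puts "$old_capturing $old_noncapturing $new_capturing"   ;# prints "2 1 1"
```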

  Maciej

Patch

Index: gcc/gcc/testsuite/lib/scanasm.exp
===================================================================
--- gcc.orig/gcc/testsuite/lib/scanasm.exp
+++ gcc/gcc/testsuite/lib/scanasm.exp
@@ -505,7 +505,7 @@  proc scan-assembler-times { args } {
     close $fd
     regsub -all {(^|\n)[[:space:]]*\.section[[:space:]]*\.gnu\.lto_(?:[^\n]*\n(?![[:space:]]*\.(section|text|data|bss)))*[^\n]*\n} $text {\1} text
 
-    set result_count [llength [regexp -inline -all -- $pattern $text]]
+    set result_count [regexp -all -- $pattern $text]
     if {$result_count == $times} {
 	pass "$testcase scan-assembler-times $pp_pattern $times"
     } else {