diff mbox

print-rtx.c: add 'h', v' and 'p' prefixes to regnos

Message ID 1474484478-55517-1-git-send-email-dmalcolm@redhat.com
State New
Headers show

Commit Message

David Malcolm Sept. 21, 2016, 7:01 p.m. UTC
On Tue, 2016-09-20 at 15:12 -0400, David Malcolm wrote:
> On Tue, 2016-09-20 at 09:20 -0600, Jeff Law wrote:
> > On 09/20/2016 08:34 AM, Bernd Schmidt wrote:
> > > On 09/20/2016 04:32 PM, David Malcolm wrote:
> > > >
> > > > To summarize so far: you want every pseudo to have its regno
> > > > dumped
> > > > with a 'p' prefix, and renumber them on dump (and then on
> > > > load).
> > > > OK.
> > >
> > > Renumbering is not helpful because it interferes with the view
> > > you
> > > have
> > > in the debugger. So, IMO just a prefix, and maybe
> > Yea, I guess it does since we want the numbers in the dump to be
> > the
> > same that we used to access the arrays.  So prefixing in the dump
> > with
> > adjustment of the number in the reader.
>
> To check I understand: am I right in thinking you want:
> (A) the numbers in the dump to be unmodified when dumping, so that we
> can easily look up values in arrays without confusion, and
> (B) regnums in the dump gain a 'p' prefix for values >=
>  FIRST_PSEUDO_REGISTER, so that humans and parsers can easily see
> when
> the regs are pseudos, and that
> (C) the parser will detect if a 'p'-prefixed regno actually has the
> same number as a hard reg (which can happen e.g. when a .md file
> changes, or when sharing .rtl dumps between targets), and remap the
> values on load accordingly
>
> ?
>
> (in which case we do need the regno_remapper class, or something like
> it)
>
> > >
> > > >   (reg/f:DI v1 virtual-stack-vars)
> > >
> > > this.
> > Doesn't the same apply to the number of virtual stack regs?  Those
> > are
> > in the same array as pseudos.  So ISTM we prefix in the dump, but
> > do
> > adjustment of the number in the reader?
>
> Presumably we could use "v" rather than "p" as the prefix for the
> first
> 5 pseudos (up to LAST_VIRTUAL_REGISTER), doing any adjustment at load
> time, rather than at dump time.  So the above example would look
> like:
>
>    (reg/f:DI v82 virtual-stack-vars)
>
> i.e. the 82 for x86_64's virtual-stack-vars would be prefixed with a
> 'v', and the loader would adjust it to be the current target's value
> for VIRTUAL_STACK_VARS_REGNUM.
>
> Do you like the idea of prefixing regnums of hardregs with 'h'? (so
> that all regnos get a one-char prefix) e.g.
>   (reg/i:SI h0 ax)
>   (reg/i:SF h21 xmm0)

(Replying to myself, in the hope of better demonstrating the idea)

The following patch implements this idea for RTL dumps, so that all REGNO
values in dumps get a one character prefix: 'h' for hard registers, 'v'
for virtual registers, and 'p' for non-virtual pseudos (making it easier
for both humans and parsers to grok the meaning of a REGNO).

For example, on compiling this:

  int test (int i)
  {
    return i * i;
  }

for target==x86_64, the dump for expand looks like this:

(note 1 0 4 NOTE_INSN_DELETED)
(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 4 3 2 (set (mem/c:SI (plus:DI (reg/f:DI v82 virtual-stack-vars)
                (const_int -4 [0xfffffffffffffffc])) [1 i+0 S4 A32])
        (reg:SI h5 di [ i ])) ../../test.c:2 -1
     (nil))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI p89)
        (mem/c:SI (plus:DI (reg/f:DI v82 virtual-stack-vars)
                (const_int -4 [0xfffffffffffffffc])) [1 i+0 S4 A32])) ../../test.c:3 -1
     (nil))
(insn 7 6 10 2 (parallel [
            (set (reg:SI p87 [ _2 ])
                (mult:SI (reg:SI p89)
                    (mem/c:SI (plus:DI (reg/f:DI v82 virtual-stack-vars)
                            (const_int -4 [0xfffffffffffffffc])) [1 i+0 S4 A32])))
            (clobber (reg:CC h17 flags))
        ]) ../../test.c:3 -1
     (nil))
(insn 10 7 14 2 (set (reg:SI p88 [ <retval> ])
        (reg:SI p87 [ _2 ])) ../../test.c:3 -1
     (nil))
(insn 14 10 15 2 (set (reg/i:SI h0 ax)
        (reg:SI p88 [ <retval> ])) ../../test.c:4 -1
     (nil))
(insn 15 14 0 2 (use (reg/i:SI h0 ax)) ../../test.c:4 -1
     (nil))

Note the 'v' prefix here:
  (reg/f:DI v82 virtual-stack-vars)
the 'h' prefix here:
  (reg/i:SI h0 ax)
and the 'p' prefix here:
  (reg:SI p88 [ <retval> ])

Successfully bootstrapped on x86_64-pc-linux-gnu.
There are various regression test failures involving scan-rtl-dump
due to regexes not matching e.g.
 PASS -> FAIL : gcc.target/i386/pr20020-1.c scan-rtl-dump expand "\\(set \\(reg/i:TI 0 ax\\)"
 PASS -> FAIL : gcc.target/i386/pr20020-1.c scan-rtl-dump expand "\\(set \\(reg:TI [0-9]* \\[ <retval> \\]\\)"
If the approach is OK, I can do an updated patch that also fixes up the
relevant tests (adding the appropriate prefixes); this would touch
multiple targets.

gcc/ChangeLog:
	* print-rtl.c (print_rtx): When dumping regnos, prefix the
	number with 'h', 'v' or 'p', depending on whether it is a
	hard register, a virtual register, or a non-virtual pseudo.
	* doc/rtl.texi (Registers and Memory): Document the above change,
	adding the prefix to examples that specifically refer to hard or
	pseudo regs in the text.
---
 gcc/doc/rtl.texi | 26 +++++++++++++++++++-------
 gcc/print-rtl.c  | 18 +++++++++---------
 2 files changed, 28 insertions(+), 16 deletions(-)

Comments

Jeff Law Sept. 28, 2016, 4:23 p.m. UTC | #1
On 09/21/2016 01:01 PM, David Malcolm wrote:
>>
>> Presumably we could use "v" rather than "p" as the prefix for the
>> first
>> 5 pseudos (up to LAST_VIRTUAL_REGISTER), doing any adjustment at load
>> time, rather than at dump time.  So the above example would look
>> like:
>>
>>    (reg/f:DI v82 virtual-stack-vars)
>>
>> i.e. the 82 for x86_64's virtual-stack-vars would be prefixed with a
>> 'v', and the loader would adjust it to be the current target's value
>> for VIRTUAL_STACK_VARS_REGNUM.
>>
>> Do you like the idea of prefixing regnums of hardregs with 'h'? (so
>> that all regnos get a one-char prefix) e.g.
>>   (reg/i:SI h0 ax)
>>   (reg/i:SF h21 xmm0)
>
> (Replying to myself, in the hope of better demonstrating the idea)
>
> The following patch implements this idea for RTL dumps, so that all REGNO
> values in dumps get a one character prefix: 'h' for hard registers, 'v'
> for virtual registers, and 'p' for non-virtual pseudos (making it easier
> for both humans and parsers to grok the meaning of a REGNO).
I think you nailed it.  h, v & p prefixing for each of the register 
types, but leaving the actual register number as-is in the dump file.


> Successfully bootstrapped on x86_64-pc-linux-gnu.
> There are various regression test failures involving scan-rtl-dump
> due to regexes not matching e.g.
>  PASS -> FAIL : gcc.target/i386/pr20020-1.c scan-rtl-dump expand "\\(set \\(reg/i:TI 0 ax\\)"
>  PASS -> FAIL : gcc.target/i386/pr20020-1.c scan-rtl-dump expand "\\(set \\(reg:TI [0-9]* \\[ <retval> \\]\\)"
> If the approach is OK, I can do an updated patch that also fixes up the
> relevant tests (adding the appropriate prefixes); this would touch
> multiple targets.
Yea.  This obviously highlights some of the long term issues with making 
changes into the dump format.  As I said in our meeting yesterday, I do 
understand the desire to nail down the format :-)  But I also want to 
use the opportunity we have to make the dumps easier for your work to 
read & interpret.

jeff
Bernd Schmidt Sept. 28, 2016, 4:30 p.m. UTC | #2
On 09/28/2016 06:23 PM, Jeff Law wrote:

>>>   (reg/i:SI h0 ax)
>>>   (reg/i:SF h21 xmm0)
>>
>> (Replying to myself, in the hope of better demonstrating the idea)
>>
>> The following patch implements this idea for RTL dumps, so that all REGNO
>> values in dumps get a one character prefix: 'h' for hard registers, 'v'
>> for virtual registers, and 'p' for non-virtual pseudos (making it easier
>> for both humans and parsers to grok the meaning of a REGNO).
> I think you nailed it.  h, v & p prefixing for each of the register
> types, but leaving the actual register number as-is in the dump file.
>
I'm actually no longer quite so sure this buys us much: a port might 
have an actual register named "h0", leading to confusion. Virtual and 
hard registers also already have their real name printed after the number.

A "p" prefix for pseudos might still be a good idea, but there's still 
the issue of a real "p0" register name causing confusion.


Bernd
Jeff Law Sept. 28, 2016, 4:36 p.m. UTC | #3
On 09/28/2016 10:30 AM, Bernd Schmidt wrote:
> On 09/28/2016 06:23 PM, Jeff Law wrote:
>
>>>>   (reg/i:SI h0 ax)
>>>>   (reg/i:SF h21 xmm0)
>>>
>>> (Replying to myself, in the hope of better demonstrating the idea)
>>>
>>> The following patch implements this idea for RTL dumps, so that all
>>> REGNO
>>> values in dumps get a one character prefix: 'h' for hard registers, 'v'
>>> for virtual registers, and 'p' for non-virtual pseudos (making it easier
>>> for both humans and parsers to grok the meaning of a REGNO).
>> I think you nailed it.  h, v & p prefixing for each of the register
>> types, but leaving the actual register number as-is in the dump file.
>>
> I'm actually no longer quite so sure this buys us much: a port might
> have an actual register named "h0", leading to confusion. Virtual and
> hard registers also already have their real name printed after the number.
>
> A "p" prefix for pseudos might still be a good idea, but there's still
> the issue of a real "p0" register name causing confusion.
So how do you think we should deal with distinguishing between the 
different registers that may appear in a dump file?

The case I'm worried about is the register meanings in a testsuite dump 
file changing over time if/when new hard registers are added to the port 
or we introduce new virtual registers.

FOr hard regs and virtuals we can probably map backwards using their 
names.  So given a register in a dump, if we can't reverse map it back 
to a hard reg or a virtual, then we assume its a pseudo?

jeff
Bernd Schmidt Sept. 28, 2016, 4:49 p.m. UTC | #4
On 09/28/2016 06:36 PM, Jeff Law wrote:
>> A "p" prefix for pseudos might still be a good idea, but there's still
>> the issue of a real "p0" register name causing confusion.
> So how do you think we should deal with distinguishing between the
> different registers that may appear in a dump file?

I think the main problem we were trying to solve is making sure we can 
make future-proof dumps. So that would argue for allowing h0, v0, p0 
syntax in the reader, but not printing them out that way by default.

Also, if we don't already, allow hard regs to be specified by name in 
the reader, and maybe even require it for virtuals.


Bernd
David Malcolm Sept. 29, 2016, 12:53 p.m. UTC | #5
On Wed, 2016-09-28 at 18:49 +0200, Bernd Schmidt wrote:
> On 09/28/2016 06:36 PM, Jeff Law wrote:
> > > A "p" prefix for pseudos might still be a good idea, but there's
> > > still
> > > the issue of a real "p0" register name causing confusion.
> > So how do you think we should deal with distinguishing between the
> > different registers that may appear in a dump file?
> 
> I think the main problem we were trying to solve is making sure we
> can 
> make future-proof dumps. So that would argue for allowing h0, v0, p0 
> syntax in the reader, but not printing them out that way by default.
> 
> Also, if we don't already, allow hard regs to be specified by name in
> the reader, and maybe even require it for virtuals.

Yes: the problems I'm trying to solve here are:
(a) making dumps more resilient against changing .md files, and
(b) where possible, allowing target-independent dump fragments where
everything is a pseudo

The issue I ran into was parsing the "r" format code, for a regno,
where print_rtx can print various things after the actual number.  My
hope with the prefix patch was to give the parser more of a hint as to
the category of reg (and to perhaps make dumps easier for humans to
read).

But it looks like the "r" format code is only ever used by REG, which
means there's always a closing parenthesis at the end of the material
emitted for the "r" format code.  So given that I *think* that the
parser already has all it needs, and that the patch is redundant.

So my plan for the reader is:
- read the number emitted by "r"
- see if there's a name after the number.  If there is, assume a hard
or virtual reg, and try to parse the name:
  - if the name is recognized, use the target's current number for that
name (future-proofing virtuals against .md changes)
  - if the name is not recognized, issue a fatal error (it's probably a
typo, or maybe a backport from a future version of gcc, or a target
incompatibility)
  - if there's no name, assume it's a pseudo.  Remap all pseudos in a
postprocessing phase to ensure that they *are* pseudos (even if the .md
has gained some hard regs in the meantime), leaving the numbering
untouched if possible (the common case).

...and to drop the regno-prefixing idea from this patch.

Hopefully this sounds sane.

(I'm also working on a function-dumping patch, which will cover CFG
information and various "crtl" information and other state that can't
be reconstructed and hence ought to be in the dump).

Dave
Jeff Law Sept. 29, 2016, 5:22 p.m. UTC | #6
On 09/28/2016 10:49 AM, Bernd Schmidt wrote:
> On 09/28/2016 06:36 PM, Jeff Law wrote:
>>> A "p" prefix for pseudos might still be a good idea, but there's still
>>> the issue of a real "p0" register name causing confusion.
>> So how do you think we should deal with distinguishing between the
>> different registers that may appear in a dump file?
>
> I think the main problem we were trying to solve is making sure we can
> make future-proof dumps. So that would argue for allowing h0, v0, p0
> syntax in the reader, but not printing them out that way by default.
Correct, I'm mostly concerned with future proofing.

I'm certainly OK with a flag to create dumps in a form that's easier to 
parse, but leaving things as-is by default.


>
> Also, if we don't already, allow hard regs to be specified by name in
> the reader, and maybe even require it for virtuals.
Works for me.

jeff
diff mbox

Patch

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 1b3f47e..450c041 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1715,6 +1715,9 @@  registers and to main memory.
 @cindex hard registers
 @cindex pseudo registers
 @item (reg:@var{m} @var{n})
+@item (reg:@var{m} h@var{n})
+@item (reg:@var{m} v@var{n})
+@item (reg:@var{m} p@var{n})
 For small values of the integer @var{n} (those that are less than
 @code{FIRST_PSEUDO_REGISTER}), this stands for a reference to machine
 register number @var{n}: a @dfn{hard register}.  For larger values of
@@ -1723,6 +1726,12 @@  The compiler's strategy is to generate code assuming an unlimited
 number of such pseudo registers, and later convert them into hard
 registers or into memory references.
 
+Hard register numbers are printed in debugging dumps with a ``h''
+prefix.  Pseudo register numbers are printed with a a ``v'' or ``p''
+prefix (see the description of @code{FIRST_VIRTUAL_REGISTER} and
+@code{LAST_VIRTUAL_REGISTER} below).  The ``h'' prefix can be omitted in
+machine description files.
+
 @var{m} is the machine mode of the reference.  It is necessary because
 machines can generally refer to each register in more than one mode.
 For example, a register may contain a full word but there may be
@@ -1763,7 +1772,10 @@  Some pseudo register numbers, those within the range of
 appear during the RTL generation phase and are eliminated before the
 optimization phases.  These represent locations in the stack frame that
 cannot be determined until RTL generation for the function has been
-completed.  The following virtual register numbers are defined:
+completed.  Such numbers are printed in dumps with a 'v' prefix;
+the other non-virtual pseudo register numbers are printed with a 'p'
+prefix.
+The following virtual register numbers are defined:
 
 @table @code
 @findex VIRTUAL_INCOMING_ARGS_REGNUM
@@ -3115,14 +3127,14 @@  side-effects are computed, and second all the actual side-effects are
 performed.  For example,
 
 @smallexample
-(parallel [(set (reg:SI 1) (mem:SI (reg:SI 1)))
-           (set (mem:SI (reg:SI 1)) (reg:SI 1))])
+(parallel [(set (reg:SI h1) (mem:SI (reg:SI h1)))
+           (set (mem:SI (reg:SI h1)) (reg:SI h1))])
 @end smallexample
 
 @noindent
 says unambiguously that the values of hard register 1 and the memory
 location addressed by it are interchanged.  In both places where
-@code{(reg:SI 1)} appears as a memory address it refers to the value
+@code{(reg:SI h1)} appears as a memory address it refers to the value
 in register 1 @emph{before} the execution of the insn.
 
 It follows that it is @emph{incorrect} to use @code{parallel} and
@@ -3270,7 +3282,7 @@  reference of which this expression serves as the address.  Here is an
 example of its use:
 
 @smallexample
-(mem:DF (pre_dec:SI (reg:SI 39)))
+(mem:DF (pre_dec:SI (reg:SI p39)))
 @end smallexample
 
 @noindent
@@ -3308,8 +3320,8 @@  where @var{z} is an index register and @var{i} is a constant.
 Here is an example of its use:
 
 @smallexample
-(mem:SF (post_modify:SI (reg:SI 42) (plus (reg:SI 42)
-                                          (reg:SI 48))))
+(mem:SF (post_modify:SI (reg:SI p42) (plus (reg:SI p42)
+                                           (reg:SI p48))))
 @end smallexample
 
 This says to modify pseudo register 42 by adding the contents of pseudo
diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index a905127..309dcd1 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -476,24 +476,24 @@  print_rtx (const_rtx in_rtx)
 	  unsigned int regno = REGNO (in_rtx);
 #ifndef GENERATOR_FILE
 	  if (regno < FIRST_PSEUDO_REGISTER)
-	    fprintf (outfile, " %d %s", regno, reg_names[regno]);
+	    fprintf (outfile, " h%d %s", regno, reg_names[regno]);
 	  else if (regno <= LAST_VIRTUAL_REGISTER)
 	    {
 	      if (regno == VIRTUAL_INCOMING_ARGS_REGNUM)
-		fprintf (outfile, " %d virtual-incoming-args", regno);
+		fprintf (outfile, " v%d virtual-incoming-args", regno);
 	      else if (regno == VIRTUAL_STACK_VARS_REGNUM)
-		fprintf (outfile, " %d virtual-stack-vars", regno);
+		fprintf (outfile, " v%d virtual-stack-vars", regno);
 	      else if (regno == VIRTUAL_STACK_DYNAMIC_REGNUM)
-		fprintf (outfile, " %d virtual-stack-dynamic", regno);
+		fprintf (outfile, " v%d virtual-stack-dynamic", regno);
 	      else if (regno == VIRTUAL_OUTGOING_ARGS_REGNUM)
-		fprintf (outfile, " %d virtual-outgoing-args", regno);
+		fprintf (outfile, " v%d virtual-outgoing-args", regno);
 	      else if (regno == VIRTUAL_CFA_REGNUM)
-		fprintf (outfile, " %d virtual-cfa", regno);
+		fprintf (outfile, " v%d virtual-cfa", regno);
 	      else if (regno == VIRTUAL_PREFERRED_STACK_BOUNDARY_REGNUM)
-		fprintf (outfile, " %d virtual-preferred-stack-boundary",
+		fprintf (outfile, " v%d virtual-preferred-stack-boundary",
 			 regno);
 	      else
-		fprintf (outfile, " %d virtual-reg-%d", regno,
+		fprintf (outfile, " v%d virtual-reg-%d", regno,
 			 regno-FIRST_VIRTUAL_REGISTER);
 	    }
 	  else
@@ -501,7 +501,7 @@  print_rtx (const_rtx in_rtx)
 	    if (flag_dump_unnumbered && is_insn)
 	      fputc ('#', outfile);
 	    else
-	      fprintf (outfile, " %d", regno);
+	      fprintf (outfile, " p%d", regno);
 
 #ifndef GENERATOR_FILE
 	  if (REG_ATTRS (in_rtx))