[docs,2/5] Update "Instruction Patterns" in md.texi

Message ID 1420543302-11008-3-git-send-email-james.greenhalgh@arm.com
State New

Commit Message

James Greenhalgh Jan. 6, 2015, 11:21 a.m. UTC
Hi,

This patch updates the second section of md.texi - "Everything about
Patterns".

I was aiming to:

  * Remove outdated details of the compiler.
  * Remove long or obscure words that, while accurate, only served to
    obfuscate a simple idea.
  * Refer to similar things in a consistent fashion - in particular
    trying to keep consistent use of "insn" and "pattern".
  * Remove superfluous examples, or waffling.

OK?

Thanks,
James

---
2015-01-06  James Greenhalgh  <james.greenhalgh@arm.com>

	* doc/md.texi (Instruction Patterns): Update text.
	(Example): Update text.

Comments

Jeff Law Jan. 8, 2015, 10 p.m. UTC | #1
On 01/06/15 04:21, James Greenhalgh wrote:
> Hi,
>
> This patch updates the second section of md.texi - "Everything about
> Patterns".
>
> I was aiming to:
>
>    * Remove outdated details of the compiler.
>    * Remove long or obscure words that, while accurate, only served to
>      obfuscate a simple idea.
>    * Refer to similar things in a consistent fashion - in particular
>      trying to keep consistent use of "insn" and "pattern".
>    * Remove superfluous examples, or waffling.
>
> OK?
>
> Thanks,
> James
>
> ---
> 2015-01-06  James Greenhalgh  <james.greenhalgh@arm.com>
>
> 	* doc/md.texi (Instruction Patterns): Update text.
> 	(Example): Update text.
>
>
> 0002-Patch-docs-2-5-Update-Instruction-Patterns-in-md.tex.patch
>
>
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 0277f14..b852981 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -115,85 +115,98 @@ emit the final assembly code.  For this purpose, names are ignored.  All
>   @cindex instruction patterns
>
>   @findex define_insn
> -Each instruction pattern contains an incomplete RTL expression, with pieces
> -to be filled in later, operand constraints that restrict how the pieces can
> -be filled in, and an output pattern or C code to generate the assembler
> -output, all wrapped up in a @code{define_insn} expression.
> +A @code{define_insn} expression is used to define instruction patterns
> +to which insns may be matched.  A @code{define_insn} expression contains
> +an incomplete RTL expression, with pieces to be filled in later, operand
> +constraints that restrict how the pieces can be filled in, and an output
> +template or C code to generate the assembler output.
>
> -A @code{define_insn} is an RTL expression containing four or five operands:
> +A @code{define_insn} contains either four or five components:
How about
A @code{define_insn} contains the following components:

>
>   @enumerate
>   @item
> -An optional name.  The presence of a name indicate that this instruction
> -pattern can perform a certain standard job for the RTL-generation
> -pass of the compiler.  This pass knows certain names and will use
> -the instruction patterns with those names, if the names are defined
> -in the machine description.
> -
> -The absence of a name is indicated by writing an empty string
> -where the name should go.  Nameless instruction patterns are never
> -used for generating RTL code, but they may permit several simpler insns
> -to be combined later on.
> -
> -Names that are not thus known and used in RTL-generation have no
> -effect; they are equivalent to no name at all.
> +The @dfn{insn name}: When expanding from gimple to RTL, and when performing
> +optimizations, the compiler looks for patterns with certain names,
IIRC reload looks for specific named patterns as well.   There may be 
other places that look for standard named patterns.

Which makes me wonder how hard we should try to nail this down.  Maybe
something along the lines of "passes which generate new insns may look
for standard names".

> +collectively known as the standard pattern names (@pxref{Standard Names}).
> +The target-independent infrastructure in the compiler which references
> +these names is generally accessed through the interfaces defined
> +in @code{optabs.c}.
Hmm, I'm not sure I'd call out optabs.c here because looking up standard 
names happens all over the place.

> +
> +Names that are not listed as one of the standard pattern names are not
> +used directly by the target-independent code.  However, machine
> +descriptions may themselves make use of named patterns in
> +@code{define_expand} or @code{define_split} expressions.
"make use of named patterns when generating insns" or something similar?
In theory that covers us if we have other define_foo things that want
to look at named patterns in the future.

> +
> +It is also possible to define a nameless instruction pattern.  This uses
> +an empty string in place of the name.  Nameless instruction patterns cannot
> +be used when generating RTL code, but they may be matched against during
> +the combine and split passes of the compiler.
Wouldn't necessarily call out combine here -- nameless patterns could be 
matched anytime RTL is changed.  Just changing an operand from a 
constant to a register or vice-versa may trigger the use of a nameless 
pattern.

> +
> +Where names are given to instruction patterns, these must be unique
> +in the machine description file.
We can have multiple .md files, so probably unique across the machine 
description files for the given target.

>
>   @item
> -The @dfn{output template}: a string that says how to output matching
> -insns as assembler code.  @samp{%} in this string specifies where
> -to substitute the value of an operand.  @xref{Output Template}.
> +The @dfn{output template} or @dfn{output statement}: This is either
> +a string, or a fragment of C code which returns a string.
>
> -When simple substitution isn't general enough, you can specify a piece
> -of C code to compute the output.  @xref{Output Statement}.
> +If it is a string, that string forms the output template and defines how
> +a matched insn should be output as assembler code
> +(@pxref{Output Template}).  If it is a fragment of C code, this should
> +return a string which will be used as the output template
> +(@pxref{Output Statement}).
Can't we have multiple output templates (one per constraint)?  Ah, 
that's discussed later.  No worries.

If I haven't commented, then those hunks should be considered OK -- you
can check those hunks in if you want.

jeff
Segher Boessenkool Jan. 9, 2015, 2:18 a.m. UTC | #2
On Thu, Jan 08, 2015 at 03:00:02PM -0700, Jeff Law wrote:
> On 01/06/15 04:21, James Greenhalgh wrote:
> >-A @code{define_insn} is an RTL expression containing four or five operands:
> >+A @code{define_insn} contains either four or five components:
> How about
> A @code{define_insn} contains the following components:
> 
> >
> >  @enumerate
> >  @item
> >-An optional name.  The presence of a name indicate that this instruction
> >-pattern can perform a certain standard job for the RTL-generation
> >-pass of the compiler.  This pass knows certain names and will use
> >-the instruction patterns with those names, if the names are defined
> >-in the machine description.
> >-
> >-The absence of a name is indicated by writing an empty string
> >-where the name should go.  Nameless instruction patterns are never
> >-used for generating RTL code, but they may permit several simpler insns
> >-to be combined later on.
> >-
> >-Names that are not thus known and used in RTL-generation have no
> >-effect; they are equivalent to no name at all.
> >+The @dfn{insn name}: When expanding from gimple to RTL, and when performing
> >+optimizations, the compiler looks for patterns with certain names,
> IIRC reload looks for specific named patterns as well.   There may be 
> other places that look for standard named patterns.
> 
> Which makes me wonder how hard we should try to nail this down.  Maybe 
> something along the lines of passes which generate new insns may look 
> for standard names.

This whole business with standard names isn't specific to define_insn
(also define_insn_and_split and define_expand), so perhaps it should
be moved elsewhere?

Patterns with names that start with a '*' behave like nameless patterns
as well (except where the name is printed, like in dump files).  This
should be mentioned in the same place too I think (it currently is
mentioned _somewhere_ I think, but I cannot find it, hrm).


Segher
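
The three naming cases raised in this thread (standard names, names
beginning with '*', and nameless patterns) can be sketched as follows.
This is only an illustration for a hypothetical target: the mnemonics
and the "*subsi3_example" name are assumptions and are not taken from
any real machine description.

```
;; Hypothetical target: mnemonics and operand choices are illustrative only.

;; Standard pattern name: passes that generate new insns may look this up.
(define_insn "addsi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""
  "add\t%0, %1, %2")

;; Name beginning with '*': printed in RTL dumps, otherwise treated
;; like a nameless pattern.  The name itself is made up for this sketch.
(define_insn "*subsi3_example"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (minus:SI (match_operand:SI 1 "register_operand" "r")
                  (match_operand:SI 2 "register_operand" "r")))]
  ""
  "sub\t%0, %1, %2")

;; Nameless pattern: an empty string in place of the name.  It is never
;; used to generate RTL directly, but insns can still be matched to it.
(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
        (not:SI (match_operand:SI 1 "register_operand" "r")))]
  ""
  "not\t%0, %1")
```

All three use an empty condition string, which is treated as always
true.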
Richard Sandiford Jan. 12, 2015, 10:25 p.m. UTC | #3
James Greenhalgh <james.greenhalgh@arm.com> writes:
>  @node Example
>  @section Example of @code{define_insn}
>  @cindex @code{define_insn} example
>  
> -Here is an actual example of an instruction pattern, for the 68000/68020.
> +Here is an example of an instruction pattern, taken from the machine
> +description for the 68000/68020.
>  
>  @smallexample
>  (define_insn "tstsi"

Might be mission creep again, but: tstsi is no longer a named pattern.
Maybe we should put something more modern in there...

Thanks,
Richard

Patch

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 0277f14..b852981 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -115,85 +115,98 @@  emit the final assembly code.  For this purpose, names are ignored.  All
 @cindex instruction patterns
 
 @findex define_insn
-Each instruction pattern contains an incomplete RTL expression, with pieces
-to be filled in later, operand constraints that restrict how the pieces can
-be filled in, and an output pattern or C code to generate the assembler
-output, all wrapped up in a @code{define_insn} expression.
+A @code{define_insn} expression is used to define instruction patterns
+to which insns may be matched.  A @code{define_insn} expression contains
+an incomplete RTL expression, with pieces to be filled in later, operand
+constraints that restrict how the pieces can be filled in, and an output
+template or C code to generate the assembler output.
 
-A @code{define_insn} is an RTL expression containing four or five operands:
+A @code{define_insn} contains either four or five components:
 
 @enumerate
 @item
-An optional name.  The presence of a name indicate that this instruction
-pattern can perform a certain standard job for the RTL-generation
-pass of the compiler.  This pass knows certain names and will use
-the instruction patterns with those names, if the names are defined
-in the machine description.
-
-The absence of a name is indicated by writing an empty string
-where the name should go.  Nameless instruction patterns are never
-used for generating RTL code, but they may permit several simpler insns
-to be combined later on.
-
-Names that are not thus known and used in RTL-generation have no
-effect; they are equivalent to no name at all.
+The @dfn{insn name}: When expanding from gimple to RTL, and when performing
+optimizations, the compiler looks for patterns with certain names,
+collectively known as the standard pattern names (@pxref{Standard Names}).
+The target-independent infrastructure in the compiler which references
+these names is generally accessed through the interfaces defined
+in @code{optabs.c}.
+
+Names that are not listed as one of the standard pattern names are not
+used directly by the target-independent code.  However, machine
+descriptions may themselves make use of named patterns in
+@code{define_expand} or @code{define_split} expressions.
+
+It is also possible to define a nameless instruction pattern.  This uses
+an empty string in place of the name.  Nameless instruction patterns cannot
+be used when generating RTL code, but they may be matched against during
+the combine and split passes of the compiler.
+
+Where names are given to instruction patterns, these must be unique
+in the machine description file.
 
 For the purpose of debugging the compiler, you may also specify a
 name beginning with the @samp{*} character.  Such a name is used only
-for identifying the instruction in RTL dumps; it is entirely equivalent
-to having a nameless pattern for all other purposes.
+for identifying the instruction in RTL dumps; it is equivalent to having
+a nameless pattern for all other purposes.  Names beginning with the
+@samp{*} character are not required to be unique.
 
 @item
-The @dfn{RTL template} (@pxref{RTL Template}) is a vector of incomplete
-RTL expressions which show what the instruction should look like.  It is
-incomplete because it may contain @code{match_operand},
+The @dfn{RTL template}: This is a vector of incomplete RTL expressions
+which describe the semantics of the instruction (@pxref{RTL Template}).
+It is incomplete because it may contain @code{match_operand},
 @code{match_operator}, and @code{match_dup} expressions that stand for
 operands of the instruction.
 
-If the vector has only one element, that element is the template for the
-instruction pattern.  If the vector has multiple elements, then the
-instruction pattern is a @code{parallel} expression containing the
-elements described.
+If the vector has multiple elements, the RTL template is treated as a
+@code{parallel} expression.
 
 @item
 @cindex pattern conditions
 @cindex conditions, in patterns
-A condition.  This is a string which contains a C expression that is
-the final test to decide whether an insn body matches this pattern.
+The condition: This is a string which contains a C expression.  When the
+compiler attempts to match RTL against a pattern, the condition is
+evaluated.  If the condition evaluates to @code{true}, the match is
+permitted.  The condition may be an empty string, which is treated
+as always @code{true}.
 
 @cindex named patterns and conditions
-For a named pattern, the condition (if present) may not depend on
-the data in the insn being matched, but only the target-machine-type
-flags.  The compiler needs to test these conditions during
-initialization in order to learn exactly which named instructions are
-available in a particular run.
+For a named pattern, the condition may not depend on the data in the
+insn being matched, but only the target-machine-type flags.  The compiler
+needs to test these conditions during initialization in order to learn
+exactly which named instructions are available in a particular run.
 
 @findex operands
 For nameless patterns, the condition is applied only when matching an
 individual insn, and only after the insn has matched the pattern's
 recognition template.  The insn's operands may be found in the vector
-@code{operands}.  For an insn where the condition has once matched, it
-can't be used to control register allocation, for example by excluding
-certain hard registers or hard register combinations.
+@code{operands}.
+
+For an insn where the condition has once matched, it
+cannot later be used to control register allocation by excluding
+certain register or value combinations.
 
 @item
-The @dfn{output template}: a string that says how to output matching
-insns as assembler code.  @samp{%} in this string specifies where
-to substitute the value of an operand.  @xref{Output Template}.
+The @dfn{output template} or @dfn{output statement}: This is either
+a string, or a fragment of C code which returns a string.
 
-When simple substitution isn't general enough, you can specify a piece
-of C code to compute the output.  @xref{Output Statement}.
+If it is a string, that string forms the output template and defines how
+a matched insn should be output as assembler code
+(@pxref{Output Template}).  If it is a fragment of C code, this should
+return a string which will be used as the output template
+(@pxref{Output Statement}).
 
 @item
-Optionally, a vector containing the values of attributes for insns matching
-this pattern.  @xref{Insn Attributes}.
+The @dfn{insn attributes}: This is an optional vector containing the values of
+attributes for insns matching this pattern (@pxref{Insn Attributes}).
 @end enumerate
 
 @node Example
 @section Example of @code{define_insn}
 @cindex @code{define_insn} example
 
-Here is an actual example of an instruction pattern, for the 68000/68020.
+Here is an example of an instruction pattern, taken from the machine
+description for the 68000/68020.
 
 @smallexample
 (define_insn "tstsi"
@@ -223,12 +236,12 @@  This can also be written using braced strings:
 @})
 @end smallexample
 
-This is an instruction that sets the condition codes based on the value of
-a general operand.  It has no condition, so any insn whose RTL description
-has the form shown may be handled according to this pattern.  The name
-@samp{tstsi} means ``test a @code{SImode} value'' and tells the RTL generation
-pass that, when it is necessary to test such a value, an insn to do so
-can be constructed using this pattern.
+This describes an instruction which sets the condition codes based on the
+value of a general operand.  It has no condition, so any insn with an RTL
+description of the form shown may be matched to this pattern.  The name
+@samp{tstsi} means ``test a @code{SImode} value'' and tells the RTL
+generation pass that, when it is necessary to test such a value, an insn
+to do so can be constructed using this pattern.
 
 The output control string is a piece of C code which chooses which
 output template to return based on the kind of operand and the specific