===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/ast-optimizer.html,v
retrieving revision 1.10
@@ -91,7 +91,7 @@
See <a href="https://gcc.gnu.org/ml/gcc-patches/2001-07/msg00859.html">this
thread</a>.</p>
-<h3><a name="ssa_for_trees">SSA for trees</a></h3>
+<h3 id="ssa_for_trees">SSA for trees</h3>
<p>The tree SSA infrastructure is maintained by <a
href="mailto:dnovillo@redhat.com">Diego Novillo
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cfg.html,v
retrieving revision 1.25
@@ -114,7 +114,7 @@
algorithms and keep code size under control. The purpose of this pass
is to minimize the number of branches and cache misses. It uses code
duplication to avoid jumps. A trivial example is copying the return
-instruction instead of jumping to it. See <a href="#6">[6]</a> for a
+instruction instead of jumping to it. See <a href="#ref6">[6]</a> for a
detailed description.</p>
<h4>Implementation in GCC</h4>
@@ -356,7 +356,7 @@
<h4>Implementation in GCC</h4>
<p>We plan to do loop peeling and superblock formation in single pass
-as described in <a href="#3">[3]</a>.</p>
+as described in <a href="#ref3">[3]</a>.</p>
<p>Enhancements that are currently only in the cfg-branch:</p>
<ul>
@@ -404,7 +404,7 @@
look like a real file format.</p>
<p>There are a few references to much more advanced profiling systems
-in <a href="#3">[3]</a>.</p>
+in <a href="#ref3">[3]</a>.</p>
<p>
This is done using <code>NOTE_INSN_PREDICTION</code> emitted in the
stream converted to <code>REG_PREDICTION</code> later. For instance,
@@ -450,55 +450,55 @@
homepage. Some other papers:</p>
<dl>
-<dt><a name="1">[1]</a></dt>
+<dt>[1]</dt>
<dd><a href="https://doi.org/10.1145/155090.155119">Branch
Prediction for Free; Ball and Larus; PLDI '93.</a></dd>
-<dt><a name="2">[2]</a></dt>
+<dt>[2]</dt>
<dd><a href="https://doi.org/10.1145/192724.192725">Static
Branch Frequency and Program Profile Analysis; Wu and Larus;
MICRO-27.</a></dd>
-<dt><a name="3">[3]</a></dt>
+<dt id="ref3">[3]</dt>
<dd>Design and Analysis of Profile-Based Optimization in Compaq's
Compilation Tools for Alpha; Journal of Instruction-Level Parallelism 3
(2000) 1-25.</dd>
-<dt><a name="4">[4]</a></dt>
+<dt>[4]</dt>
<dd><a href=
"http://www.lighterra.com/papers/valuerangeprop/Patterson1995-ValueRangeProp.pdf">Accurate
Static Branch Prediction by Value Range Propagation; Jason R. C.
Patterson (jasonp@fit.qut.edu.au), 1995</a></dd>
-<dt><a name="5">[5]</a></dt>
+<dt>[5]</dt>
<dd><a href="https://doi.org/10.1145/258916.258932">Near-optimal
Intraprocedural Branch Alignment; Cliff Young, David S. Johnson,
David R. Karger, Michael D. Smith, ACM 1997</a></dd>
-<dt><a name="6">[6]</a></dt>
+<dt id="ref6">[6]</dt>
<dd><a href="https://doi.org/10.1145/305138.305178">Software
Trace Cache; International Conference on Supercomputing, 1999</a></dd>
-<dt><a name="7">[7]</a></dt>
+<dt>[7]</dt>
<dd><a href="https://doi.org/10.1002/spe.4380211204">Using
Profile Information to Assist Classic Code Optimizations; Pohua P.
Chang, Scott A. Mahlke, and Wen-mei W. Hwu, 1991</a></dd>
-<dt><a name="8">[8]</a></dt>
+<dt>[8]</dt>
<dd><a href=
"http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.1922">Hyperblock
Performance Optimizations For ILP Processors; David Isaac August,
1996 (Master Thesis)</a></dd>
-<dt><a name="9">[9]</a></dt>
+<dt>[9]</dt>
<dd><a href="https://doi.org/10.1145/173262.155118">Reverse
If-Conversion; Nancy J. Warter, Scott A. Mahlke, Wen-mei W. Hwu, B.
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cfo.html,v
retrieving revision 1.11
@@ -19,7 +19,7 @@
<li><a href="#todo">To do</a></li>
</ul>
-<h2><a name="news">Latest News</a></h2>
+<h2 id="news">Latest News</h2>
<dl>
<dt>2005-08-11</dt>
@@ -33,7 +33,7 @@
</p></dd>
</dl>
-<h2><a name="intro">Introduction</a></h2>
+<h2 id="intro">Introduction</h2>
<p>Code factoring is the name of a class of useful optimization techniques
developed especially for code size reduction. These approaches aim to reduce
@@ -43,7 +43,7 @@
size optimization of GCC with code factoring methods (code motion and merging
algorithms). The implementation currently resides on the branch.</p>
-<h2><a name="contributing">Contributing</a></h2>
+<h2 id="contributing">Contributing</h2>
<p>Checkout the cfo-branch branch from
<a href="../svn.html">our respository</a>.</p>
@@ -52,7 +52,7 @@
[cfo] in the subject. The usual contribution and testing rules apply. This
branch is maintained by <a href="mailto:loki@gcc.gnu.org">Gabor Loki</a>.</p>
-<h2><a name="documentation">Documentation</a></h2>
+<h2 id="documentation">Documentation</h2>
<p>The project includes the following two code factoring algorithms:</p>
<ul>
@@ -146,7 +146,7 @@
<a href="ftp://gcc.gnu.org/pub/gcc/summit/2004/Code%20Factoring.pdf">
GCC Summit Proceedings (2004)</a>.</p>
-<h2><a name="features">Features</a></h2>
+<h2 id="features">Features</h2>
<p>Currently the following algorithms are implemented on the branch:</p>
@@ -157,7 +157,7 @@
<li>Sequence abstraction on Tree (<code>-ftree-seqabstr</code>)</li>
</ul>
-<h2><a name="preresults">Preliminary results</a></h2>
+<h2 id="preresults">Preliminary results</h2>
<p>The following results have been prepared using the
<a href="http://szeged.github.io/csibe/">CSiBE</a> benchmark with respect
@@ -218,7 +218,7 @@
</tr>
</table>
-<h2><a name="todo">To do</a></h2>
+<h2 id="todo">To do</h2>
<ul>
<li>Implement procedural abstraction on IPA (interprocedural version of
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cli.html,v
retrieving revision 1.31
@@ -18,7 +18,7 @@
<li><a href="#readings">Readings</a></li>
</ul>
-<h2><a name="news">Latest News</a></h2>
+<h2 id="news">Latest News</h2>
<dl>
<dt>2009-07</dt>
@@ -79,7 +79,7 @@
<dd><p>Creation of st/cli branch.</p></dd>
</dl>
-<h2><a name="intro">Introduction</a></h2>
+<h2 id="intro">Introduction</h2>
<p>
CLI is a framework that defines a platform independent format for
executables and a run-time environment for the execution of applications.
@@ -130,7 +130,7 @@
<a href="mailto:gabriele.svelto@gmail.com">Gabriele Svelto</a>.
</p>
-<h2><a name="contributing">Contributing</a></h2>
+<h2 id="contributing">Contributing</h2>
<p>Check out <code>st/cli-be</code> branch following the instructions found in the
<a href="../svn.html">SVN documentation</a>.</p>
@@ -150,7 +150,7 @@
front end and the CLI binutils (both Mono based and DotGnu based) .
</p>
-<h2><a name="internals">The CLI back end</a></h2>
+<h2 id="internals">The CLI back end</h2>
<p>
Unlike a typical GCC back end, the CLI backnend stops the compilation flow
at the end of the middle-end passes and, without going through any RTL
@@ -168,7 +168,7 @@
data type information that is not preserved across RTL.
</p>
-<h3><a name="mmodel">Target machine model</a></h3>
+<h3 id="mmodel">Target machine model</h3>
<p>
Like existing GCC back ends, CLI is truly seen as a target machine
and, as such, it follows GCC policy about the organization of the
@@ -215,7 +215,7 @@
for CLI target.</li>
</ul>
-<h3><a name="cil_ir">The CIL intermediate representation</a></h3>
+<h3 id="cil_ir">The CIL intermediate representation</h3>
<p>
In our branch the traditional compilation flow of GCC has been altered to make
CIL code emission more robust and to improve the quality of the emitted code.
@@ -244,7 +244,7 @@
and greatly simplifies code emission.
</p>
-<h3><a name="gimple2cil">GIMPLE to CIL translation pass</a></h3>
+<h3 id="gimple2cil">GIMPLE to CIL translation pass</h3>
<p>
GIMPLE/generic code is lowered into CIL code by descending the original trees
and expanding them one node at a time. Many GIMPLE statements can be translated
@@ -272,7 +272,7 @@
<code>__builtin_memcpy</code> is turned into a <code>cpblk</code> instruction.
</p>
-<h3><a name="optimizations">CIL-specific optimizations</a></h3>
+<h3 id="optimizations">CIL-specific optimizations</h3>
<p>
The GIMPLE/generic conversion phase sometimes emits some fairly poor CIL code.
The code quality actually depends much on the depth of the input trees. Very
@@ -318,7 +318,7 @@
a fall through if possible.</li>
</ul>
-<h3><a name="emission">Assembly emission</a></h3>
+<h3 id="emission">Assembly emission</h3>
<p>
The last pass which emits CIL code is fairly straightforward as the intermediate
representation maps one-to-one with the CIL code. This phase however contains a
@@ -341,7 +341,7 @@
probabilities.
</p>
-<h2><a name="frontend">The CLI front end</a></h2>
+<h2 id="frontend">The CLI front end</h2>
<p>The objective of the project was to create a new GCC front end able
to take a .NET executable as input, and produce optimized native code
@@ -454,24 +454,24 @@
unsupported feature. In those cases, those methods can be skipped,
allowing the user to provide a native implementation if necessary.</p>
-<h2><a name="readings">Readings</a></h2>
+<h2 id="readings">Readings</h2>
<dl>
-<dt><a name="1">[1]</a></dt>
+<dt>[1]</dt>
<dd>
ECMA, <a href="http://www.ecma-international.org/publications/standards/Ecma-335.htm">
<i>Common Language Infrastructure (CLI)</i></a>, 4th edition, June 2006.
</dd>
-<dt><a name="2">[2]</a></dt>
+<dt>[2]</dt>
<dd>
John Gough, <i>Compiling for the .NET Common Language Runtime (CLR)</i>,
Prentice Hall, ISBN 0-13-062296-6.
</dd>
-<dt><a name="3">[3]</a></dt>
+<dt>[3]</dt>
<dd>
Serge Liden, <i>Inside Microsoft .NET IL Assembler</i>, Microsoft Press,
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cpplib.html,v
retrieving revision 1.24
@@ -109,7 +109,7 @@
levels of wrapper headers.</li>
</ol>
-<h2><a name="charset">Character set issues</a></h2>
+<h2 id="charset">Character set issues</h2>
<p>Proper non-ASCII character handling is a hard problem. Users want
to be able to write comments and strings in their native language.
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/documentation.html,v
retrieving revision 1.11
@@ -47,8 +47,7 @@
<hr />
-<h2><a name="frontend_middleend_interface">Fully
-document the interface of front ends to GCC</a></h2>
+<h2 id="frontend_middleend_interface">Fully document the interface of front ends to GCC</h2>
<p>Fully document the interface of front ends to GCC, that is, the
<code>tree</code>, <code>cgraph</code>, and <code>langhooks</code>
@@ -67,8 +66,7 @@
most of these manuals are obsolete.</p>
-<h2><a name="internals_and_porting">Better documentation of how GCC
-works and how to port it</a></h2>
+<h2 id="internals_and_porting">Better documentation of how GCC works and how to port it</h2>
<p>The porting manual describes what used to be the proper way to
write a GCC back end. It is several years out of date. Find all the
@@ -169,8 +167,7 @@
</ol>
-<h2><a name="RTL">Fully document the back-end intermediate
-language data structures</a></h2>
+<h2 id="RTL">Fully document the back-end intermediate language data structures</h2>
<p>Document every RTX code and accessor macro, every insn name, every
<code>tm.h</code> macro and every target hook thoroughly. (See <a
@@ -187,8 +184,7 @@
<code>targetm</code> structure for target hooks.</p>
-<h2><a name="improve_manual_index">Improve the indexing
-of the GCC manual.</a></h2>
+<h2 id="improve_manual_index">Improve the indexing of the GCC manual.</h2>
<p>All command-line options should be indexed, and there should be index
entries for the text of all error messages that might be confusing, if
@@ -197,8 +193,7 @@
gcc-bugs</a> about this.</p>
-<h2><a name="external_documents">Roll information in external documents
-the official manual.</a></h2>
+<h2 id="external_documents">Roll information in external documents into the official manual.</h2>
<p>Start with the <a href="../readings.html">readings list</a> and the
secondary Texinfo documents in the source tree, such as
@@ -206,8 +201,7 @@
favorite FAQ from the lists and roll it into the manual.</p>
-<h2><a name="user_level_documentation">Improve user and installation
-documentation.</a></h2>
+<h2 id="user_level_documentation">Improve user and installation documentation.</h2>
<ul>
<li>Add information on relevant standards. Document the exact semantics
@@ -227,8 +221,7 @@
</ul>
-<h2><a name="revisit_actual_bugs">Revisit the list of "Actual Bugs"
-in the manual</a></h2>
+<h2 id="revisit_actual_bugs">Revisit the list of "Actual Bugs" in the manual</h2>
<p>Go through the list of "Actual Bugs" in
<code>gcc/doc/trouble.texi</code>. Work out what they refer to, if
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/ia64.html,v
retrieving revision 1.25
@@ -8,7 +8,7 @@
<body>
<h1>Projects to improve performance on IA-64</h1>
<!-- table of contents start -->
-<h2><a name="toc">Contents:</a></h2>
+<h2 id="toc">Contents:</h2>
<ul>
<li><a href="#short_term_projects">Short-term projects</a></li>
@@ -32,7 +32,7 @@
working on related improvements so that adverse interactions can be
detected early.</p>
-<h2><a name="short_term_projects">Short-term projects</a></h2>
+<h2 id="short_term_projects">Short-term projects</h2>
<ul>
<li>Track memory origin to allow better alias analysis
@@ -211,7 +211,7 @@
</ul>
-<h2><a name="long_term_projects">Long-term and infrastructure projects</a></h2>
+<h2 id="long_term_projects">Long-term and infrastructure projects</h2>
<ul>
<li>Region formation heuristics
@@ -352,7 +352,7 @@
</ul>
-<h2><a name="tool_projects">Tools: performance tools, benchmarks, etc.</a></h2>
+<h2 id="tool_projects">Tools: performance tools, benchmarks, etc.</h2>
<ul>
<li>Analyze benchmark results to identify important optimizations
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/index.html,v
retrieving revision 1.72
@@ -39,7 +39,7 @@
<hr />
-<h2><a name="improve_the_installation_procedure">Improve the installation procedure</a></h2>
+<h2 id="improve_the_installation_procedure">Improve the installation procedure</h2>
<ul>
<li>See <a href="https://gcc.gnu.org/ml/gcc/2000-11/msg00556.html">a
@@ -72,7 +72,7 @@
href="https://gcc.gnu.org/PR346">PR other/346</a>.</li>
</ul>
-<h2><a name="simpler_porting">Simpler porting</a></h2>
+<h2 id="simpler_porting">Simpler porting</h2>
<p>Right now, describing the target machine's instructions is done
cleanly, but describing its addressing mode is done with several
@@ -95,7 +95,7 @@
in the RTL expression.</li>
</ul>
-<h2><a name="generalize_the_machine_model">Generalize the machine model</a></h2>
+<h2 id="generalize_the_machine_model">Generalize the machine model</h2>
<p>Some new compiler features may be needed to do a good job on
machines where static data needs to be addressed using base registers.</p>
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/optimize.html,v
retrieving revision 1.15
@@ -20,7 +20,7 @@
<!-- table of contents start -->
-<h1><a name="toc">Table of Contents</a></h1>
+<h1 id="toc">Table of Contents</h1>
<ul>
<li><a href="#putting_constants_in_special_sections">Putting constants in special sections</a></li>
<li><a href="#un_cse">Un-CSE</a></li>
@@ -37,7 +37,7 @@
<!-- table of contents end -->
<hr />
-<h2><a name="putting_constants_in_special_sections">Putting constants in special sections.</a></h2>
+<h2 id="putting_constants_in_special_sections">Putting constants in special sections.</h2>
<p>If a function has been placed in a special
section via attributes, we may want to put its static data and string
@@ -46,7 +46,7 @@
kernel.)</p>
<hr />
-<h2><a name="un_cse">Un-cse.</a></h2>
+<h2 id="un_cse">Un-cse.</h2>
<p>Perhaps we should have an un-cse step right after cse, which tries to
replace a reg with its value if the value can be substituted for the
@@ -55,7 +55,7 @@
change is really an improvement.</p>
<hr />
-<h2><a name="clean_up_how_cse_works">Clean up how cse works.</a></h2>
+<h2 id="clean_up_how_cse_works">Clean up how cse works.</h2>
<p>The scheme is that each value has just one hash entry. The
first_same_value and next_same_value chains are no longer needed.</p>
@@ -110,7 +110,7 @@
If the value is constant, it is always explicitly constant.</p>
<hr />
-<h2><a name="loop_optimization">Loop optimization</a></h2>
+<h2 id="loop_optimization">Loop optimization</h2>
<p>Strength reduction and iteration variable elimination could be
smarter. They should know how to decide which iteration variables are
@@ -124,7 +124,7 @@
within the loop by computing that value at the loop end.</p>
<hr />
-<h2><a name="using_constraints_on_values">Using constraints on values</a></h2>
+<h2 id="using_constraints_on_values">Using constraints on values</h2>
<p>Many operations could be simplified based on knowledge of the
minimum and maximum possible values of a register at any particular
@@ -142,7 +142,7 @@
specified to exit if negative.</p>
<hr />
-<h2><a name="change_the_type_of_a_variable">Change the type of a variable</a></h2>
+<h2 id="change_the_type_of_a_variable">Change the type of a variable</h2>
<p>Sometimes a variable is declared as <code>int</code>, it is
assigned only once from a value of type <code>char</code>, and then it
@@ -152,20 +152,20 @@
declaration of the variable and change all the places that use it.</p>
<hr />
-<h2><a name="better_handling_for_very_sparse_switches">Better handling for very sparse switches</a></h2>
+<h2 id="better_handling_for_very_sparse_switches">Better handling for very sparse switches</h2>
<p>There may be cases where it would be better to compile a switch
statement to use a fixed hash table rather than the current
combination of jump tables and binary search.</p>
<hr />
-<h2><a name="order_of_subexpressions">Order of subexpressions</a></h2>
+<h2 id="order_of_subexpressions">Order of subexpressions</h2>
<p>It might be possible to make better code by paying attention to the
order in which to generate code for subexpressions of an expression.</p>
<hr />
-<h2><a name="distributive_law">Distributive law</a></h2>
+<h2 id="distributive_law">Distributive law</h2>
<p>The C expression <code>*(X + 4 * (Y + C))</code> compiles better on
certain machines if rewritten as <code>*(X + 4*C + 4*Y)</code> because
@@ -175,7 +175,7 @@
<p>Some work has been done on this, in combine.c.</p>
<hr />
-<h2><a name="better_builtin_string_functions">Better builtin string functions</a></h2>
+<h2 id="better_builtin_string_functions">Better builtin string functions</h2>
<p>Although GCC implements numerous optimizations of the standard C
library's string, math and I/O functions, there are still plenty more
@@ -249,7 +249,7 @@
<hr />
-<h2><a name="data_prefetch">Data prefetch support</a></h2>
+<h2 id="data_prefetch">Data prefetch support</h2>
<p>Loads from memory can take many cycles if the loaded data is not in
a cache line. Cache misses can bring a CPU to a halt for several 100
@@ -264,7 +264,7 @@
are in development or already supported by GCC.</p>
<hr />
-<h2><a name="target-specific">Target specific optimizer deficiencies</a></h2>
+<h2 id="target-specific">Target specific optimizer deficiencies</h2>
<p>Almost all code transformations implemented in GCC are target
independent by design, but how well they work depends on how accurately
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/prefetch.html,v
retrieving revision 1.33
@@ -8,7 +8,7 @@
<body>
<h1>Data Prefetch Support</h1>
-<h2><a name="toc">Contents</a></h2>
+<h2 id="toc">Contents</h2>
<ul>
<li><a href="#intro">Introduction</a></li>
<li><a href="#elements">Elements of Data Prefetch Support</a>
@@ -42,7 +42,7 @@
<li><a href="#refs">References</a></li>
</ul>
-<h2><a name="intro">Introduction</a></h2>
+<h2 id="intro">Introduction</h2>
<p>The framework for data prefetch in GCC supports capabilities
of a variety of targets. Optimizations within GCC that involve prefetching
@@ -81,7 +81,7 @@
it to <a href="mailto:janis187@us.ibm.com">Janis Johnson,
&lt;janis187@us.ibm.com&gt;</a>.</p>
-<h2><a name="elements">Elements of Data Prefetch Support</a></h2>
+<h2 id="elements">Elements of Data Prefetch Support</h2>
<p>Data prefetch, or cache management, instructions allow a compiler
or an assembly language programmer to minimize cache-miss latency
@@ -90,7 +90,7 @@
they affect the performance but not the functionality of software in
which they are used.</p>
-<h3><a name="locality">Locality</a></h3>
+<h3 id="locality">Locality</h3>
<p>Data prefetch instructions often include information about the
<em>locality</em> of expected accesses to prefetched memory. Such
@@ -113,7 +113,7 @@
<p>Locality hints determined in GCC optimization passes can be ignored in
the machine description for targets that do not support them.</p>
-<h3><a name="write">Read or Write Access</a></h3>
+<h3 id="write">Read or Write Access</h3>
<p>Some data prefetch instructions make a distinction between memory
which is expected to be read and memory which is expected to be written.
@@ -128,13 +128,13 @@
that define both kinds of instructions but do not support prefetch for
writes.</p>
-<h3><a name="size">Size of block to access</a></h3>
+<h3 id="size">Size of block to access</h3>
<p>The amount of data accessed by a data prefetch instruction is
usually a cache line, whose size is usually implementation specific,
but is sometimes a specified number of bytes.</p>
-<h3><a name="base_update">Base update</a></h3>
+<h3 id="base_update">Base update</h3>
<p>At least one target's data prefetch instructions has a
<em>base update</em> form, which modifies the prefetch address after
@@ -142,14 +142,14 @@
on load and store instructions for some targets, and this could be
taken into consideration in code that uses data prefetch.</p>
-<h3><a name="faulting">Faulting v. Non-faulting</a></h3>
+<h3 id="faulting">Faulting v. Non-faulting</h3>
<p>Some architectures provide prefetch instructions that cause
faults when the address to prefetch is invalid or not cacheable.
The data prefetch support in GCC assumes that only non-faulting
prefetch instructions will be used.</p>
-<h3><a name="misc">Miscellaneous Features</a></h3>
+<h3 id="misc">Miscellaneous Features</h3>
<p>Some prefetch instructions have requirements about address alignment.
These can be handled in the machine description; optimization passes
@@ -162,7 +162,7 @@
<li>number of bytes prefetched</li>
</ul>
-<h2><a name="rules">Guidelines for Prefetching Data</a></h2>
+<h2 id="rules">Guidelines for Prefetching Data</h2>
<p>Prefetch timing is important. The data should be in the cache
by the time it is accessed, but without a delay that would allow
@@ -189,7 +189,7 @@
arrays in loops with loop unrolling
[<a href="#ref_23">23</a>][<a href="#ref_26">26</a>].</p>
-<h2><a name="targets">Data Prefetch Support on GCC Targets</a></h2>
+<h2 id="targets">Data Prefetch Support on GCC Targets</h2>
<p>Variants of prefetch commands that fault are not included here.
Some implementations of these architectures recognize data prefetch
@@ -205,7 +205,7 @@
technical documentation for that processor; the references provide a
starting point for that information.</p>
-<h3><a name="summary">Summary</a></h3>
+<h3 id="summary">Summary</h3>
<table border="1" cellspacing="0" cellpadding="5">
<tr>
@@ -303,7 +303,7 @@
</tr>
</table>
-<h3><a name="3dnow">3DNow!</a></h3>
+<h3 id="3dnow">3DNow!</h3>
<p>The 3DNow! technology from AMD extends the x86 instruction set, primarily
to support floating point computations. Processors that support this
@@ -323,7 +323,7 @@
Future AMD K86 processors might extend the <code>PREFETCH</code>
instruction format.</p>
-<h3><a name="alpha">Alpha</a></h3>
+<h3 id="alpha">Alpha</h3>
<p>The Alpha architecture supports data prefetch via load instructions
with a destination of register <code>R31</code> or <code>F31</code>, which
@@ -371,7 +371,7 @@
<p>These instructions are meant to help with very long memory latencies
and are not useful on existing Alpha implementations (through 21264).</p>
-<h3><a name="altivec">AltiVec</a></h3>
+<h3 id="altivec">AltiVec</h3>
<p>Data prefetch support in the AltiVec instruction set architecture
is quite different from that of other architectures that GCC supports.
@@ -463,7 +463,7 @@
specifying a data stream for each prefetch and keeping track of which ones
are in use.</p>
-<h3><a name="ia32_sse">IA-32 SSE</a></h3>
+<h3 id="ia32_sse">IA-32 SSE</h3>
<p>The IA-32 Streaming SIMD Extension (SSE) instructions are used on several
platforms, including the Pentium III and Pentium 4 [<a href="#ref_6">6</a>]
@@ -498,7 +498,7 @@
<p>There are no alignment requirements for the address. The size of the
line prefetched is implementation dependent, but a minimum of 32 bytes.</p>
-<h3><a name="ia64">IA-64</a></h3>
+<h3 id="ia64">IA-64</h3>
<p>The <code>lfetch</code> (Line Prefetch) instruction has versions for
read and write prefetches, and an optional modifier to specify the
@@ -538,7 +538,7 @@
The base update forms of these instructions imply a prefetch, and
have a completer that specifies the locality of the memory access.</p>
-<h3><a name="mips">MIPS</a></h3>
+<h3 id="mips">MIPS</h3>
<p>The <code>PREF</code> (Prefetch) instruction, supported by MIPS32
[<a href="#ref_9">9</a>] and MIPS64 [<a href="#ref_10">10</a>],
@@ -592,7 +592,7 @@
<p>The <code>PREFX</code> (Prefetch Indexed) instruction, supported by MIPS64,
differs in the addressing mode and is for use with floating point data.</p>
-<h3><a name="mmix">MMIX</a></h3>
+<h3 id="mmix">MMIX</h3>
<p>MMIX has the following data prefetch instructions
[<a href="#ref_11">11</a>][<a href="#ref_12">12</a>]:</p>
@@ -620,7 +620,7 @@
<code>STUNC</code>, which request that the data not be cached because
it is unlikely to be accessed again soon.</p>
-<h3><a name="hppa">PA-RISC</a></h3>
+<h3 id="hppa">PA-RISC</h3>
<p>A normal load to register <code>GR0</code> prefetches data.
The data prefetch instructions are [<a href="#ref_13">13</a>]:</p>
@@ -645,7 +645,7 @@
the low order part of the address is ignored.
</p>
-<h3><a name="powerpc">PowerPC</a></h3>
+<h3 id="powerpc">PowerPC</h3>
<p>The PowerPC provides the following data prefetch instructions
[<a href="#ref_14">14</a>]:</p>
@@ -664,7 +664,7 @@
<p>There are no alignment restrictions on the address of the data to
prefetch.</p>
-<h3><a name="sh_34">SuperH</a></h3>
+<h3 id="sh_34">SuperH</h3>
<p>The SuperH RISC engine architecture defines the <code>PREF</code> (Prefetch
Data to the Cache) instruction.</p>
@@ -675,7 +675,7 @@
<p>For the SH-4, the instruction moves 32 bytes of data starting at a 32-byte
boundary into the operand cache [<a href="#ref_17">17</a>].</p>
-<h3><a name="sparc">SPARC</a></h3>
+<h3 id="sparc">SPARC</h3>
<p>The SPARC version 9 instruction set architecture defines
the <code>PREFETCH</code> (Prefetch Data) and
@@ -723,7 +723,7 @@
<p>There are no alignment restrictions on the address to prefetch; the
instructions ignore the 5 least significant bits.</p>
-<h3><a name="xscale">XScale</a></h3>
+<h3 id="xscale">XScale</h3>
<p>The Intel XScale processor includes ARM's DSP-enhanced instructions,
including the <code>PLD</code> (Preload) instruction.
@@ -733,7 +733,7 @@
<p>NOTE: More investigation is necessary; [<a href="#ref_23">23</a>]
has an example that implies that base update might be available.</p>
-<h2><a name="refs">References</a></h2>
+<h2 id="refs">References</h2>
<p>These references need cleanup and should actually be used in the text
above that uses the information. Many of the links will likely be out
@@ -742,99 +742,99 @@
<p>References to cache control instructions for specific architectures:</p>
-<p><a name="ref_1">[1]</a>
+<p id="ref_1">[1]
<em>3DNow![tm] Technology Manual</em>, AMD, 29128G/0, March 2000.</p>
-<p><a name="ref_2">[2]</a>
+<p id="ref_2">[2]
<em>Alpha Architecture Handbook</em>, Compaq, Version 4, October 1998,
Order Number EC-QD2KC-TE;
see pages 4-139 and A-8.</p>
-<p><a name="ref_3">[3]</a>
+<p id="ref_3">[3]
<em>Alpha 21264 Hardware Reference Manual</em>, July 1999;
see section 2.6.</p>
-<p><a name="ref_4">[4]</a>
+<p id="ref_4">[4]
<em>AltiVec Technology Programming Environments Manual</em>, 11/1998, Rev. 0.1;
Page 5-9 has usage recommendations.</p>
-<p><a name="ref_5">[5]</a>
+<p id="ref_5">[5]
<em>AMD Extensions to the 3DNow![tm] and MMX[tm] Instruction Sets</em>, AMD,
Publication 22466D, March 2000.</p>
-<p><a name="ref_6">[6]</a>
+<p id="ref_6">[6]
<em>The IA-32 Intel Architecture Software Developer's Manual, Volume 2:
Instruction Set Reference</em>.</p>
-<p><a name="ref_8">[8]</a>
+<p id="ref_8">[8]
<em>Intel Itanium[tm] Architecture Software Developer's Manual Vol. 3
rev. 2.1: Instruction Set Reference</em>.</p>
-<p><a name="ref_9">[9]</a>
+<p id="ref_9">[9]
<em>MIPS32[tm] Architecture for Programmers; Volume II: The MIPS32[tm]
Instruction Set</em>, MIPS Technologies, Document Number MD00086,
Revision 0.95, March 12, 2001 search from www.mips.com.</p>
-<p><a name="ref_10">[10]</a>
+<p id="ref_10">[10]
<em>MIPS64[tm] Architecture for Programmers; Volume II: The MIPS64[tm]
Instruction Set</em>, MIPS Technologies, Document Number MD00087,
Revision 0.95, March 12, 2001;
search from www.mips.com.</p>
-<p><a name="ref_11">[11]</a>
+<p id="ref_11">[11]
<a href="https://www-cs-faculty.stanford.edu/~knuth/mmop.html">
MMIX Op Codes</a>, Don Knuth.</p>
-<p><a name="ref_12">[12]</a>
+<p id="ref_12">[12]
<em>The Art of Computer Programming, Fascicle 1: MMIX</em>, Don Knuth,
Addison Wesley Longman, 2001;
<a href="https://www-cs-faculty.stanford.edu/~knuth/fasc1.ps.gz">
http://www-cs-faculty.stanford.edu/~uno/fasc1.ps.gz</a>.</p>
-<p><a name="ref_13">[13]</a>
+<p id="ref_13">[13]
<em>PA-RISC 2.0 Instruction Set Architecture</em>;
see <em>Memory Reference Instructions</em> in Chapter 6.</p>
-<p><a name="ref_14">[14]</a>
+<p id="ref_14">[14]
<em>PowerPC Microprocessor 32-bit Family: The Programming Environments</em>,
page 5-8.</p>
-<p><a name="ref_15">[15]</a>
+<p id="ref_15">[15]
<em>The SPARC Architecture Manual</em>, Version 9, SPARC International,
SAV09R1459912, 1994-2000; see A.42.</p>
-<p><a name="ref_16">[16]</a>
+<p id="ref_16">[16]
<em>SuperH[tm] RISC Engine SH-3/SH-3E/SH3-DSP Programming Manual</em>,
ADE-602-096B, Rev. 3.0, 9/25/00, Hitachi, Ltd.</p>
-<p><a name="ref_17">[17]</a>
+<p id="ref_17">[17]
<em>SuperH[tm] RISC Engine SH-4 Programming Manual</em>,
ADE-602-156D, Rev. 5.0, 4/19/2001, Hitachi, Ltd.</p>
-<p><a name="ref_18">[18]</a>
+<p id="ref_18">[18]
<em>UltraSPARC[tm] User's Manual</em>,
Sun Microsystems, Part No: 802-7720-02, July 1997, pages 36-37.</p>
-<p><a name="ref_19">[19]</a>
+<p id="ref_19">[19]
<em>UltraSPARC[tm]-II High Performance 64-bit RISC Processor</em>,
Sun Microelectronics Application Notes,
section 5.0: Software Prefetch and Multiple-Outstanding Misses</p>
-<p><a name="ref_20">[20]</a>
+<p id="ref_20">[20]
<em>UltraSPARC[tm]-IIi User's Manual</em>,
Sun Microsystems, Part No: 805-0087-01, 1997.</p>
<p>References to uses of data prefetch instructions:</p>
-<p><a name="ref_21a">[21a]</a>
+<p id="ref_21a">[21a]
<em>Compiler Writer's Guide for the Alpha 21264</em>,
Order Number EC-RJ66A-TE, June 1999.</p>
-<p><a name="ref_21b">[21b]</a>
+<p id="ref_21b">[21b]
<em>Compiler Writer's Guide for the 21264/21364</em>,
Order Number EC-0100A-TE, January 2002.</p>
-<p><a name="ref_22">[22]</a>
+<p id="ref_22">[22]
<em>Compiler-Based Prefetching for Recursive Data Structures</em>,
Chi-Keung Luk and Todd C. Mowry, linked from
<a href="http://www.cs.cmu.edu/~tcm/Papers.html">
@@ -842,21 +842,21 @@
That location also has links to several other papers about data prefetch
by Todd C. Mowry.</p>
-<p><a name="ref_23">[23]</a>
+<p id="ref_23">[23]
<em>Intel(r) XScale[tm] Core Developer's Manual</em>, December 2000;
section A.4.4 is "Prefetch Considerations" in the Optimization Guide.</p>
-<p><a name="ref_24">[24]</a>
+<p id="ref_24">[24]
<em>Optimizing 3DNow! Real-Time Graphics</em>, Dr. Dobb's Journal August 2000,
Max I. Fomitchev.</p>
-<p><a name="ref_25">[25]</a>
+<p id="ref_25">[25]
<em>An Overview of the Intel IA-64 Compiler</em>,
Carole Dulong, Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel Lavery,
Wei Li, John Ng, and David Sehr, all of Microcomputer Software Laboratory,
Intel Corporation, <em>Intel Technology Journal</em>, 4th quarter 1999.</p>
-<p><a name="ref_26">[26]</a>
+<p id="ref_26">[26]
<em>UltraSPARC[tm]-II Enhancements: Support for Software Controlled
Prefetch</em>, Sun Microsystems, July 1996, WPR-0002.</p>
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/sched-treegion.html,v
retrieving revision 1.8
@@ -16,7 +16,7 @@
<li><a href="#readings">Readings</a></li>
</ul>
-<h2><a name="news">Latest News</a></h2>
+<h2 id="news">Latest News</h2>
<dl>
<dt>2005-02-03</dt>
@@ -27,7 +27,7 @@
<dd><p>Creation of sched-treegion-branch.</p></dd>
</dl>
-<h2><a name="intro">Introduction</a></h2>
+<h2 id="intro">Introduction</h2>
<p>
Instruction scheduling is a critical phase of compilation for extracting
large amounts of ILP from a program. In this work we present the status
@@ -37,7 +37,7 @@
codes size and ILP. The implementation
currently resides on the sched-treegion-branch.</p>
-<h2><a name="tree_form">Natural Treegion Formation</a></h2>
+<h2 id="tree_form">Natural Treegion Formation</h2>
<p>
A treegion is a non-linear, single-entry, multiple-exit region of code
@@ -71,7 +71,7 @@
successor blocks.
</p>
-<h2><a name="tail_dup">Tail Duplication</a></h2>
+<h2 id="tail_dup">Tail Duplication</h2>
<p>
In this section we discuss our method for efficient tail duplication, with
@@ -109,7 +109,7 @@
basic blocks and/or the number of instructions contained with in the treegion.
</p>
-<h2><a name="tree_sched">Treegion scheduling - Tree Traversal Scheduling</a></h2>
+<h2 id="tree_sched">Treegion scheduling - Tree Traversal Scheduling</h2>
<p>
Due to the acyclic nature of treegions, the Haifa scheduler does not
@@ -147,7 +147,7 @@
<p>The implementation currently resides on the branch.</p>
-<h2><a name="contributing">Contributing</a></h2>
+<h2 id="contributing">Contributing</h2>
<p>Checkout the sched-treegion branch following the instructions found in the
<a href="../svn.html">SVN documentation</a>.</p>
@@ -157,14 +157,14 @@
rules apply. This branch is maintained by
<a href="mailto:mcrosier@unity.ncsu.edu">Chad Rosier</a>.</p>
-<h2><a name="readings">Readings</a></h2>
+<h2 id="readings">Readings</h2>
<p>Lots of useful information is present at the <a
href="http://tinker.cc.gatech.edu">TINKER Microarchitecture and
Compiler Research</a> homepage. More relevant papers:</p>
<dl>
-<dt><a name="1">[1]</a></dt>
+<dt>[1]</dt>
<dd>
H. Zhou, and T.M. Conte,
@@ -174,7 +174,7 @@
and Computer Architectures (INTERACT-6), Cambridge, MA, February 2002.
</dd>
-<dt><a name="2">[2]</a></dt>
+<dt>[2]</dt>
<dd>
H. Zhou, M. D. Jennings, and T. M. Conte,
@@ -184,7 +184,7 @@
Compilers for Parallel Computing (LCPC'01), Cumberland Falls, KY, August 2001.
</dd>
-<dt><a name="3">[3]</a></dt>
+<dt>[3]</dt>
<dd>
W. A. Havanki, S. Banerjia, and T. M. Conte,
@@ -194,7 +194,7 @@
(HPCA-4), Las Vegas, Feb. 1998.
</dd>
-<dt><a name="4">[4]</a></dt>
+<dt>[4]</dt>
<dd>
S. Banerjia, W.A. Havanki, and T.M. Conte,
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/x86.html,v
retrieving revision 1.3
@@ -20,7 +20,7 @@
turn it back on.)</p>
<!-- table of contents start -->
-<h1><a name="toc">Table of Contents</a></h1>
+<h1 id="toc">Table of Contents</h1>
<ul>
<li><a href="#csefail">Failure of common subexpression elimination</a></li>
<li><a href="#storemerge">Store merging</a></li>
@@ -31,7 +31,7 @@
</ul>
<hr />
-<h2><a name="csefail">Failure of common subexpression elimination</a></h2>
+<h2 id="csefail">Failure of common subexpression elimination</h2>
<p>(12 Nov 2004, reconfirmed with trunk revision 156706) Common
subexpression elimination cannot merge calculations that take
@@ -115,7 +115,7 @@
to <code>.L2</code>.</p>
<hr />
-<h2><a name="storemerge">Store merging</a></h2>
+<h2 id="storemerge">Store merging</h2>
<p>(12 Nov 2004, reconfirmed with trunk revision 156706) GCC
frequently generates multiple narrow writes to adjacent memory
@@ -192,7 +192,7 @@
advantage is less obvious here.</p>
<hr />
-<h2><a name="volatile">Volatile inhibits too many optimizations</a></h2>
+<h2 id="volatile">Volatile inhibits too many optimizations</h2>
<p>(12 Nov 2004, reconfirmed with trunk revision 156706) GCC refuses
to perform in-memory operations on volatile variables, on architectures
@@ -232,7 +232,7 @@
standard may take issue with the difference - we aren't sure.</p>
<hr />
-<h2><a name="rndmode">Unnecessary changes of rounding mode</a></h2>
+<h2 id="rndmode">Unnecessary changes of rounding mode</h2>
<p>(12 Nov 2004, reconfirmed with trunk revision 156706) GCC does not
remember the state of the floating point control register, so it
@@ -339,7 +339,7 @@
</pre>
<hr />
-<h2><a name="fpmove">Moving floating point through integer registers</a></h2>
+<h2 id="fpmove">Moving floating point through integer registers</h2>
<p>(22 Jan 2000, reconfirmed with trunk revision 156706) GCC knows how
to move <code>float</code> quantities using integer instructions. This
@@ -477,8 +477,7 @@
</table>
<hr />
-<h2><a name="pathetic-loop">More pathetic failures of loop
-optimization</a></h2>
+<h2 id="pathetic-loop">More pathetic failures of loop optimization</h2>
<p>(25 Aug 2001) Consider the following code, which is a trimmed down
version of a real function that does something sensible.</p>
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/tree-ssa/index.html,v
retrieving revision 1.46
@@ -24,7 +24,7 @@
</ul>
<hr />
-<h2><a name="news">Latest News</a></h2>
+<h2 id="news">Latest News</h2>
<dl>
<dt>2004-05-13</dt>
@@ -54,7 +54,7 @@
</dl>
<hr />
-<h3><a name="intro">Introduction</a></h3>
+<h3 id="intro">Introduction</h3>
<p>The goal of this project is to build an optimization framework for trees
based on the Static Single Assignment (SSA) form [<a
@@ -62,7 +62,7 @@
<code>tree-ssa-20020619-branch</code> branch.</p>
<hr />
-<h3><a name="documentation">Documentation</a></h3>
+<h3 id="documentation">Documentation</h3>
<p>A high-level overview of GENERIC/GIMPLE and the SSA implementation may
be found in the <a href="ftp://gcc.gnu.org/pub/gcc/summit/2003/">Proceedings
@@ -75,7 +75,7 @@
href="https://gcc.gnu.org/wiki/GettingStarted">GCC Wiki</a>.</p>
<hr />
-<h3><a name="contributing">Contributing</a></h3>
+<h3 id="contributing">Contributing</h3>
<p>Checkout the <code>tree-ssa-20020619-branch</code> branch
in <a href="../../svn.html">our respository</a>.</p>
@@ -98,7 +98,7 @@
Diego Novillo, Sebastian Pop, Graham Stott and Jeff Sturm.</p>
<hr />
-<h3><a name="stability">Branch stability</a></h3>
+<h3 id="stability">Branch stability</h3>
<p>Every patch submitted for review must either fix a PR or adress one
of the issues mentioned in the <a
@@ -115,7 +115,7 @@
<hr />
-<h2><a name="gimple">GENERIC and GIMPLE</a></h2>
+<h2 id="gimple">GENERIC and GIMPLE</h2>
<p>While GCC trees contain sufficient information for
implementing SSA, there are two major problems that make this
@@ -180,7 +180,7 @@
in GIMPLE form are defined in <code>tree-simple.[ch]</code>.</p>
<hr />
-<h2><a name="ssa">SSA implementation</a></h2>
+<h2 id="ssa">SSA implementation</h2>
<p>Having trees in GIMPLE form enables language-independent analysis
and transformation passes. Currently, we are implementing an SSA pass
@@ -211,7 +211,7 @@
</ol>
<hr />
-<h2><a name="unparse">Unparsing C trees</a></h2>
+<h2 id="unparse">Unparsing C trees</h2>
<p>The file <code>tree-pretty-print.c</code> implements several debugging
functions that given a GENERIC tree node, they print a C representation of
@@ -219,12 +219,12 @@
help when debugging transformations done by the transformation passes.</p>
<hr />
-<h2><a name="tb">Tree Browser</a></h2>
+<h2 id="tb">Tree Browser</h2>
For debugging, browsing, discovering, and playing with trees you can
use the <a href="tree-browser.html">Tree Browser</a> directly from gdb.
<hr />
-<h2><a name="status">Implementation Status</a></h2>
+<h2 id="status">Implementation Status</h2>
<p>This is a short list of the work that has already been finished or
is ongoing.</p>
@@ -275,7 +275,7 @@
existing DejaGNU testing framework.</p>
<hr />
-<h2><a name="todo">TODO list</a></h2>
+<h2 id="todo">TODO list</h2>
<p>This is a loosely organized list of unimplemented features,
possible improvement, and planned analyses and optimizations.
@@ -359,33 +359,33 @@
<h2>References</h2>
<dl>
-<dt><a name="cytron.ea-91">[1]</a></dt>
+<dt id="cytron.ea-91">[1]</dt>
<dd>R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and K. Zadeck.
Efficiently Computing Static Single Assignment Form and the Control Dependence Graph.
<em>ACM Transactions on Programming Languages and Systems</em>, 13(4): 451-490, October 1991.</dd>
-<dt><a name="hendren.ea-92">[2]</a></dt>
+<dt id="hendren.ea-92">[2]</dt>
<dd>L. Hendren, C. Donawa, M. Emami, G. Gao, Justiani, and B. Sridharan.
Designing the McCAT compiler based on a family of structured intermediate representations.
In <em>Proceedings of the 5th International Workshop on Languages
and Compilers for Parallel Computing</em>, pages 406-420. Lecture Notes in
Computer Science, no. 457, Springer-Verlag, August 1992.</dd>
-<dt><a name="morgan-98">[3]</a></dt>
+<dt id="morgan-98">[3]</dt>
<dd>Robert Morgan.
<em>Building an Optimizing Compiler</em>, Butterworth-Heinemann, 1998.</dd>
-<dt><a name="wegman.ea-91">[4]</a></dt>
+<dt id="wegman.ea-91">[4]</dt>
<dd>Mark N. Wegman and F. Kenneth Zadeck.
Constant Propagation with Conditional Branches.
<em>ACM Transactions on Programming Languages and Systems</em>, 13(2): 181-210, April 1991.</dd>
-<dt><a name="chow.ea-97">[5]</a></dt>
+<dt id="chow.ea-97">[5]</dt>
<dd>Robbert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow.
Partial Redundancy Elimination in SSA Form.
<em>ACM Transactions on Programming Languages and Systems</em>, 21(3): 627-676, 1999.</dd>
-<dt><a name="patterson-95">[6]</a></dt>
+<dt id="patterson-95">[6]</dt>
<dd>Jason R. C. Patterson.
Accurate Static Branch Prediction by Value Range Propagation.
<em>Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation</em>, pages 67-78, June 1995.</dd>
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/tree-ssa/vectorization.html,v
retrieving revision 1.34
@@ -31,7 +31,7 @@
Implementation</a></li>
</ul>
- <h2><a name="news">Latest News</a></h2>
+ <h2 id="news">Latest News</h2>
<dl>
<dt>2011-10-23</dt>
<dd>
@@ -110,7 +110,7 @@
</dl>
- <h2><a name="vec_todo">Contributing</a></h2>
+ <h2 id="vec_todo">Contributing</h2>
<p>This project was started by Dorit (Naishlos) Nuzman. Current contributors
to this project include Revital Eres, Richard Guenther, Jakub Jelinek, Michael Matz,
@@ -122,7 +122,7 @@
https://gcc.gnu.org/wiki/VectorizationTasks</a>.
</p>
- <h2><a name="using">Using the Vectorizer</a></h2>
+ <h2 id="using">Using the Vectorizer</h2>
<p>Vectorization is enabled by the flag
<code>-ftree-vectorize</code> and by default
@@ -164,7 +164,7 @@
<p>"feature" indicates
the vectorization capabilities demonstrated by the
- example.</p><strong><a name="example1">example1:</a></strong>
+ example.</p><strong id="example1">example1:</strong>
<pre>
int a[256], b[256], c[256];
@@ -175,7 +175,7 @@
a[i] = b[i] + c[i];
}
}
-</pre><strong><a name="example2">example2</a>:</strong>
+</pre><strong id="example2">example2:</strong>
<pre>
int a[256], b[256], c[256];
@@ -194,7 +194,7 @@
a[i] = b[i]&c[i]; i++;
}
}
-</pre><strong><a name="example3">example3</a>:</strong>
+</pre><strong id="example3">example3:</strong>
<pre>
typedef int aint __attribute__ ((__aligned__(16)));
@@ -205,7 +205,7 @@
*p++ = *q++;
}
}
-</pre><strong><a name="example4">example4</a></strong>:
+</pre><strong id="example4">example4</strong>:
<pre>
typedef int aint __attribute__ ((__aligned__(16)));
@@ -230,7 +230,7 @@
b[i] = (j > MAX ? MAX : 0);
}
}
-</pre><strong><a name="example5">example5</a></strong>:
+</pre><strong id="example5">example5</strong>:
<pre>
struct a {
@@ -250,7 +250,7 @@
A = LOG(X); B = LOG(Y); C = A + B
PRINT*, C(500000)
END
-</pre><strong><a name="example7">example7</a></strong>:
+</pre><strong id="example7">example7</strong>:
<pre>
int a[256], b[256];
@@ -262,7 +262,7 @@
a[i] = b[i+x];
}
}
-</pre><a name="example8"><strong>example8</strong>:</a>
+</pre id="example8"><strong>example8</strong>:
<pre>
int a[M][N];
@@ -276,7 +276,7 @@
}
}
}
-</pre><a name="example9"><strong>example9</strong>:</a>
+</pre id="example9"><strong>example9</strong>:
<pre>
unsigned int ub[N], uc[N];
@@ -547,7 +547,7 @@
https://gcc.gnu.org/wiki/VectorizationTasks</a>
and a list of vectorizer missed-optimization PRs in the GCC bug tracker.</p>
- <h2><a name="oldnews">Previous News and Status</a></h2>
+ <h2 id="oldnews">Previous News and Status</h2>
<dl>
<dt>2007-09-17</dt>
@@ -785,7 +785,7 @@
</dl>
<dl>
- <dt><a name="status4.0"><strong>2005-03-01, mainline (final 4.0 status)</strong></a></dt>
+ <dt id="status4.0"><strong>2005-03-01, mainline (final 4.0 status)</strong></dt>
<dd>
Description of vectorizable loops:
@@ -1510,7 +1510,7 @@
</dd>
</dl>
- <h2><a name="References">References/Documentation</a></h2>
+ <h2 id="References">References/Documentation</h2>
<ol>
<li>"Vapor SIMD: Auto-vectorize once, run everywhere",
Dorit Nuzman, Sergei Dyshel, Erven Rohou, Ira Rosen, Kevin Williams,
@@ -1554,7 +1554,7 @@
Constraints", Alexandre E. Eichenberger, Peng Wu, Kevin O'brien,
PLDI'04, June 9-11 2004.</li>
- <li><a name="kenedy-book"></a>"Optimizing
+ <li id="kenedy-book">"Optimizing
Compilers for Modern Architectures - A dependence based
approach", Randy Allen & Ken Kennedy, Morgan Kaufmann
Publishers, San Francisco, San Diego, New York (2001).</li>
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cfg.html,v
retrieving revision 1.26
retrieving revision 1.27
@@ -480,7 +480,7 @@
Intraprocedural Branch Alignment; Cliff Young, David S. Johnson,
David R. Karger, Michael D. Smith, ACM 1997</a></dd>
-<dt id="6">[6]</dt>
+<dt id="ref6">[6]</dt>
<dd><a href="https://doi.org/10.1145/305138.305178">Software
Trace Cache; International Conference on Supercomputing, 1999</a></dd>
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cfo.html,v
retrieving revision 1.12
retrieving revision 1.13
@@ -33,7 +33,7 @@
</p></dd>
</dl>
-<h2 id="intro">Introduction</a></h2>
+<h2 id="intro">Introduction</h2>
<p>Code factoring is the name of a class of useful optimization techniques
developed especially for code size reduction. These approaches aim to reduce
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/prefetch.html,v
retrieving revision 1.34
retrieving revision 1.35
@@ -25,7 +25,7 @@
<li><a href="#targets">Data Prefetch Support on GCC Targets</a>
<ul>
<li><a href="#summary">Summary</a></li>
- <li><a href="#3dnow">3DNow!</a></li>
+ <li><a href="#threednow">3DNow!</a></li>
<li><a href="#alpha">Alpha</a></li>
<li><a href="#altivec">AltiVec</a></li>
<li><a href="#ia32_sse">IA-32 SSE</a></li>
@@ -303,7 +303,7 @@
</tr>
</table>
-<h3 id="3dnow">3DNow!</h3>
+<h3 id="threednow">3DNow!</h3>
<p>The 3DNow! technology from AMD extends the x86 instruction set, primarily
to support floating point computations. Processors that support this
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/tree-ssa/vectorization.html,v
retrieving revision 1.35
retrieving revision 1.36
@@ -262,7 +262,7 @@
a[i] = b[i+x];
}
}
-</pre id="example8"><strong>example8</strong>:
+</pre><strong id="example8">example8</strong>:
<pre>
int a[M][N];
@@ -276,7 +276,7 @@
}
}
}
-</pre id="example9"><strong>example9</strong>:
+</pre><strong id="example9">example9</strong>:
<pre>
unsigned int ub[N], uc[N];