[4/3] Header file reduction - Tools for contrib

Message ID 5613A1F9.3030407@codesourcery.com
State New

Commit Message

Bernd Schmidt Oct. 6, 2015, 10:27 a.m. UTC
On 10/05/2015 11:18 PM, Andrew MacLeod wrote:
> Here's the patch to add all the tools to contrib/headers.

Small patches should not be sent in compressed form, it makes reading 
and quoting them harder. This message is only intended to contain the 
patch in plain text so that I can quote it in further replies.

> There are 9 tools I used over the run of the project.  They were
> developed in various stages and iterations, but I tried to at least have
> some common interface things, and I tried some cleaning up and
> documentation.  No commenting on the quality of python code... :-) I was
> learning python on the fly.    Im sure some things are QUITE awful.,
>
> There is a readme file which gives a common use cases for each tool
>
> Some of the tools are for analysis, aggregation, or flattening, some for
> visualization, and some are for the include reduction. I would have just
> filed them away somewhere, but  Jeff suggested I contribute them in case
> someone wants to do something with them down the road... which
> presumably also includes me :-)   Less chance of losing them this way.
>
> They need more polishing, but I'm tired of looking at them. I will
> return to them down the road and see about cleaning them up a bit more.
> They still aren't perfect by any means, but should do their job safely.
> when used properly.   Comments in the code vary from good to absent,
> depending on how irritable I was at the time I was working on itl
>
> I will soon also provide a modified config-list.mk which still works
> like the current one, but allows for easy overrides of certain things
> the include reducer requires..  until now I've just made a copy of
> config-list.mk and modified it for my own means.
>
> The 2 tools for include reduction are  gcc-order-headers   and
> reduce-headers
>
> what the process/conditions for checking things into contrib?  I've
> never had to do it before :-)
>
> Andrew
>

Comments

Bernd Schmidt Oct. 6, 2015, 12:02 p.m. UTC | #1
>> There are 9 tools I used over the run of the project.  They were
>> developed in various stages and iterations, but I tried to at least have
>> some common interface things, and I tried some cleaning up and
>> documentation.

I'll probably have to make multiple passes over this. A disclaimer 
first, I have done enough Python programming to develop a dislike for 
the language, but not enough to call myself an expert.

General comments first. Where applicable, I think we should apply the 
same coding standards to Python as we do for C/C++. That means things 
like function comments documenting parameters. They are absent for the 
most part in this patch, and I won't point out individual instances.
Also, I think the documentation should follow our usual rules. There are 
spelling and grammar problems. I will point out what I find (only the 
first instance for recurring problems), but please proofread the whole 
thing for the next submission. The Thunderbird spellchecker actually is 
pointing out a lot of these. Capitalize starts of sentences, write full 
sentences and terminate with punctuation.

> No commenting on the quality of python code... :-) I was
> learning python on the fly.    Im sure some things are QUITE awful.,

Yeah, the general impression is of fairly ad-hoc code. Not sure how much 
can be done about this.

> + trigger the help message.  Help may specify additonal functionality to what is

"additional"

> + - For*all*  tools, option format for specifying filenames must have no spaces

Space after "For", remove double space. This pattern occurs very often - 
something your editor does maybe?

> + - Many of the tools are required to be run from the core gcc source directory
> + containing coretypes.h  typically that is  in gcc/gcc from a source checkout.

Odd whitespace, and probably lack of punctuation before "typically".

> + gcc-order-headers
> + -----------------
> +   This will reorder any primary backend headers files into a canonical order
> +   which will resolve any hidden dependencies they may have.  Any unknown
> +   headers will simply be occur after the recognized core files, and retain the
> +   same relative ordering they had.

Grammar ("be occur").

This sounds like the intention is to move recognized core files (I 
assume these are the ones in the "order" array in the tool) to the 
start, and leave everything else alone? I was a bit confused about this at 
first; I see for example "timevar.h" moving around without being present 
in the list, but it looks like it gets added implicitly through being 
included by df.h. (Incidentally, this looks like another case like the 
obstack one - a file using timevars should include timevar.h IMO, even 
if it also includes df.h).

> +
> +   Must be run in the core gcc source directory

"This tool must be run in the core gcc source directory." (Punctuation 
and grammar).

> +   Any files which are changed are output, and the original is saved with a
> +   .bak extention.
> +
> +   ex.:     gcc-order-headers tree-ssa.c c/c-decl.c

It might be more useful to produce a diff rather than modify files in-place.
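Something like this would do it - a minimal sketch, where filename,
old_lines and new_lines are placeholders for whatever the tool already
has in hand:

  import sys
  import difflib

  def print_unified_diff (filename, old_lines, new_lines):
    # Emit a reviewable unified diff instead of rewriting the file
    # in place and leaving a .bak copy behind.
    for line in difflib.unified_diff (old_lines, new_lines,
                                      fromfile=filename, tofile=filename):
      sys.stdout.write (line)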

> +   if any header files are included within a conditional code block, the tool
> +   will issue a message and not change the file.  When this happens, you can
> +   manually inspect the file, and if reorder it will be fine, rerun the command

"if reorder it will be fine"?

> +   It does a 4 level deep find of all source files from the current directory
> +   and look in each of those for a #include of the specified headers.  So expect
> +   a little bit of slowness.

"looks"?

> +
> +   -i limits the search to only other header files.
> +   -c limits the search to .c and .cc files.
> +   -a shows only source files which include*all*  specified headers.

Whitespace.

> +   it is good practive to run 'gcc-order-headers' on a source file before trying

"practice"

> +   Any desired target builds should be built in one directory using a modified
> +   config-list.mk file which doesnt delete the build directory when its done.

"doesn't", or more probably "does not" in documentation.

> +   The tool will analyze a source file and attempt to remove each non-conditional
> +   header from last to first in the file.:
> +     It will first attempt to build the native all-gcc target.
> +     If that succeeds, it will attempt to build any target build .o files
> +     If that suceeds, it will check to see if there are any conditional

"succeeds"
> +        compilation dependencies between this header file and the source file or
> +        any header whihch have already been determined as non-removable.

"whihch"

> +     If all these tests are passed, the header file is determined to be removable
> +        and is removed from the source file.
> +     This continues until all headers have been checked.

One thing I've wondered about - have you tried checking for object file 
differences?
As far as I can tell the dependency checking does not check for undefs. 
Is that correct? I think that needs to be added.
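For the object file check, a minimal sketch of the idea - hash the .o
produced before and after a removal (this assumes the object files are
otherwise deterministic; embedded timestamps or paths would need
stripping first):

  import hashlib

  def object_file_digest (path):
    # Hash the object file so a before/after comparison can flag any
    # change in generated code, not just whether compilation succeeds.
    with open (path, 'rb') as f:
      return hashlib.sha1 (f.read ()).hexdigest ()

  # e.g. object_file_digest ('before/ira.o') == object_file_digest ('after/ira.o')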

> +   At this point, the a bootstrap is attempted in the native build, and if that

"the a"

> +   A small subset of targets has been determined to provide excellent coverage,
> +   at least as of Aug 31/15 .  A fullset of targets reduced all of the files

"fullset", and whitespace. Determined how?

> +   making up libbackend.a.  All of the features which requires target testing
> +   were found to be triggered by one or more of these targets.  They are
> +   actually known to the tool, and when checkiong target, it will check those

"checkiong".

> +   targets first, then the rest.  It is mostly safe to do a reduction with just
> +   these targets, at least until some new whacky target comes along.
> +   building config-list.mk with :
> +   LIST="aarch64-linux-gnu arm-netbsdelf avr-rtems c6x-elf epiphany-elf hppa2.0-hpux10.1 i686-mingw32crt i686-pc-msdosdjgpp mipsel-elf powerpc-eabisimaltivec rs6000-ibm-aix5.1.0 sh-superh-elf sparc64-elf spu-elf"

I think I get what you're trying to say, but the whole paragraph could 
be rewritten for clarity.

> +     reduce-headers.log :  All the compilation failure output that tool tried.

"the tool"?

> + 			     and why it thinks taht is the case

"taht".

> +     $src.c.log  : for each failed header removal, the compilation
> + 		  messages as to why it failed.
> +     $header.h.log: The same log is put into the relevent header log as well.

"relevant"

> +   The tool will aggregate all these and generate a graph of the dependencies
> +   exposed during compilation.  red lines indicate dependecies that are

"depndecies"

> +   presednt because a head file physically includes another header. Black lines

"presednt"

> +   represent data dependencies causing compilation if the header isnt present.

"is not"

> + for x in sys.argv[1:]:
> +   if x[0:2] == "-h":
> +     usage = True
> +   else:
> +     src.append (x)

There are getopt and argparse libraries available for python. I seem to 
recall fighting them at some point because they didn't quite do what I 
expected from a C getopt, so it may not be worth it trying to use them.
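For reference, the argparse equivalent of the hand-rolled loop above
would be roughly this (an untested sketch; argparse is in the stdlib
from 2.7 on, and -h/--help comes for free):

  import argparse

  parser = argparse.ArgumentParser (
      description="Count the number of occurrences of all includes.")
  parser.add_argument ("files", nargs="+", help="source files to scan")
  src = parser.parse_args ().files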

> + if not usage and len(src) > 0:
> +
> +   incl = { }

Watch the extra blank lines.

> +         if dup.get(d) == None:

I think we want to be consistent with our C style? I.e., extra space 
before parentheses.

> +   l.sort(key=lambda tup:tup[0], reverse=True)

And spaces around things like = operators.

> +     # Don't put diagnostic*.h into the ordering list, its special since

"it is". Many instances, please grep for "its" and fix them all.

> +     # various front ends have to set GCC_DIAG_STYLE before including it
> +     # for each file, we'll tailor where it belongs by looking at the dup
> +     # list and seeing which file is included, and position it appropriately.

From that comment it's not entirely clear how they are handled. Please 
expand documentation of this mechanism.

> +   # rtl.h gets tagged as a duplicate includer for all of coretypes, but thats

"that's"

> + # process diagnostic.h first.. it's special since GCC_DIAG_STYLE can be
> + # overridden by languages, but must be done so by a file included BEFORE it.
> + # so make sure it isn't seen as inclujded by one of those files by making it

"inclujded"

> + # Now crate the master ordering list

"create".

> + for i in order:
> +   create_master_list (os.path.basename (i), False)

I found myself wanting to pass True. The tool could use a "-v" flag.

> +   print " -s Show the cananoical order of known includes"

"canonical"

> +   print "Multi-line comments after a #include can also cause failuer, they must be turned"

"failuer"

> + ignore = [ "coretypes_h",
> + 	     "machmode_h",
> + 	     "signop_h",
> + 	     "wide_int_h",
> + 	     "double_int_h",
> + 	     "real_h",
> + 	     "fixed_value_h",
> + 	     "hash_table_h",
> + 	       "statistics_h",
> + 	       "ggc_h",
> + 	       "vec_h",
> + 	       "hashtab_h",
> + 	       "inchash_h",
> + 	       "mem_stats_traits_h",
> + 	       "hash_map_traits_h",
> + 	       "mem_stats_h",
> + 	       "hash_map_h",
> + 	     "hash_set_h",
> + 	     "input_h",
> + 	       "line_map_h",
> + 	     "is_a_h",
> + 	   "system_h",
> + 	   "config_h" ]

Is the random indentation indicating some kind of nesting? If not, 
please fix.

> +   for line in logfile:
> +     if len (line) > 21 and line[:21] in depstring:
> +       if newinc:
> +         incfrom = list()
> + 	newinc = False

It looks like you are mixing tab and space indentation. For a language 
like Python, that is absolutely scary. Please fix throughout (I think 
only spaces is probably best).

> + if dohelp:
> +   print "Generates a graph of the include web for specified files."
> +   print "Usage:  [-finput_file] [-h] [-ooutput] [file1 ... [filen]]"
> +   print "	-finput_file : Input file is file containing a list of files"
> +   print "	-ooutput : Specifies output to output.dot and output.png"
> +   print "                  defaults to graph.dot and graph.png"
> +   print "	-nnum : specifies the # of edges beyond which sfdp is invoked. def=0"
> +   print "       -a : Aggregate all .c files to 1 file.  Shows only include web."
> +   print "       -at :  Aggregate, but don't include terminal.h to .c links. "
> +   print "	-h : help"

The formatting of the help output seems somewhat random. Also "is a file"?

> +   if len(inc) > 0:
> + #    inc2 = re.findall (ur"defined *\((.+?)\)", inc[0])
> +     inc2 = re.findall (ur"defined\s*\((.+?)\)", inc[0])

Intentionally commented out?

> +
> + def process_ii (filen):
> +   return process_include_info (filen, False, False)
> +
> + def process_ii_macro (filen):
> +   return process_include_info (filen, True, False)
> +
> + def process_ii_src (filen):
> +   return process_include_info (filen, False, True)
> +
> + def process_ii_macro_src (filen):
> +   return process_include_info (filen, True, True)
> +
> + def ii_base (iinfo):
> +   return iinfo[0]
> +
> + def ii_path (iinfo):
> +   return iinfo[1]
> +
> + def ii_include_list (iinfo):
> +   return iinfo[2]
> +
> + def ii_include_list_cond (iinfo):
> +   return iinfo[3]
> +
> + def ii_include_list_non_cond (iinfo):
> +   l = ii_include_list (iinfo)
> +   for n in ii_include_list_cond (iinfo):
> +     l.remove (n)
> +   return l
> +
> + def ii_macro_consume (iinfo):
> +   return iinfo[4]
> +
> + def ii_macro_define (iinfo):
> +   return iinfo[5]
> +
> + def ii_src (iinfo):
> +   return iinfo[6]
> +
> + def ii_src_line (iinfo):
> +   return iinfo[7]

That's a lot of little functions with pretty much no clue for the reader 
what's going on. It looks like maybe there's an array where a struct 
should have been used?

> + # extract data for include file name_h and enter it into the dictionary.
> + # this doesnt change once read in.  use_requies is True if you want to

"does not", "use_requies"

> + # find FIND in src, and replace it with the list of includes in REPLACE
> + # remove any duplicates of find or replace, and if some of hte replace

"hte"

> + # includes occur earlier in the inlude chain, leave them.

"inlude"

> + # compensate for this stupid warning that should be an error for
> + # inlined templates
> + def get_make_rc (rc, output):
> +   rc = rc % 1280
> +   if rc == 0:
> +     # This is not considered a fatal error for a build!  /me rolls eyes
> +     h = re.findall ("warning: inline function.*used but never defined", output)
> +     if len(h) != 0:
> +       rc = 1
> +   return rc;

What's this about?

> +   print " -a  : Show only files which*all*  listed files are included"

Whitespace around *all*. Seems to happen quite often.

> + # given a header name, normalize it.  ie  cp/cp-tree.h could be in gcc, while

Formatting, capitalization.

> + # the same header could be referenecd from within the cp subdirectory as

"referenced"

> + # Adds a header file and it's sub includes to the global dictionary if they

This time, "its".

> + # aren't already there.  SPecify s_path since different build directories may

"SPecify"

> + if usage:
> +   print "Attempts to remove extraneous include files from source files. "
> +   print " "
> +   print "Should be run from the main gcc source directory, and works on a target"
> +   print "directory, as we attempt to make the 'all' target."
> +   print " "
> +   print "By default, gcc-reorder-includes is run on each file before attempting"
> +   print "to remove includes. this removes duplicates and puts some headers in a"
> +   print "canonical ordering"
> +   print " "
> +   print "The build directory should be ready to compile via make. Time is saved "

Space at the end of the line (two cases in this block).

> +   print " "
> +   print " show in a hierarchical visual format how many times each header file"
> +   print " is included ina source file.  Should be run from the source directory"

"ina".


Bernd
Andrew MacLeod Oct. 6, 2015, 2:04 p.m. UTC | #2
On 10/06/2015 08:02 AM, Bernd Schmidt wrote:
>
>>> There are 9 tools I used over the run of the project.  They were
>>> developed in various stages and iterations, but I tried to at least 
>>> have
>>> some common interface things, and I tried some cleaning up and
>>> documentation.
>
> I'll probably have to make multiple passes over this. A disclaimer 
> first, I have done enough Python programming to develop a dislike for 
> the language, but not enough to call myself an expert.
>
> General comments first. Where applicable, I think we should apply the 
> same coding standards to Python as we do for C/C++. That means things 
> like function comments documenting parameters. They are absent for the 
> most part in this patch, and I won't point out individual instances.
> Also, I think the documentation should follow our usual rules. There 
> are spelling and grammar problems. I will point out what I find (only 
> the first instance for recurring problems), but please proofread the 
> whole thing for the next submission. The Thunderbird spellchecker 
> actually is pointing out a lot of these. Capitalize starts of 
> sentences, write full sentences and terminate with punctuation.
>

I primarily submitted it early because you wanted to look at the tools 
before the code patch, which is the one I care about, since the longer it 
goes, the more effort it is to update the patch to mainline.  I 
apologize for not proofreading it as much as I usually do.  My longer 
term intention was to polish the readme stuff and put it into each tool 
as well.

However, none of the other tools or scripts in contrib subscribe to 
commenting every function the same as we do for C/C++.  I did put 
comments in many places where it wasn't obvious what was going on to 
help with readability, but in other cases it seemed obvious enough not to 
bother.  I don't mind adding missing ones that are important, but I do 
not see why every function needs to have the full C/C++ coding standard 
applied to it when no other tool does.  These certainly appear as good 
as, if not better than, the existing scripts...


>> No commenting on the quality of python code... :-) I was
>> learning python on the fly.    Im sure some things are QUITE awful.,
>
> Yeah, the general impression is of fairly ad-hoc code. Not sure how 
> much can be done about this.
They were never intended as general purpose tools; they were developed 
over multiple iterations and bugfixing and never properly designed. 
They were never originally intended for public submission, so they 
suffer... and I'm not interested in rewriting them yet again.

Andrew
Bernd Schmidt Oct. 6, 2015, 2:56 p.m. UTC | #3
On 10/06/2015 04:04 PM, Andrew MacLeod wrote:

> I primarily submitted it early because you wanted to look at the tools
> before the code patch, which is the one I care about since the longer it
> goes, the more effort it is to update the patch to mainline.

The problem is that the generated patch is impossible to review on its 
own. It's just a half a megabyte dump of changes that can't 
realistically be verified for correctness. Reading it can throw up some 
interesting questions which can then (hopefully) be answered by 
reference to the tools, such as "why does timevar.h move?" For that to 
work, the tools need at least to have a minimum level of readability. 
They are the important part here, not the generated patch. (Unless you 
find a reviewer who's less risk-averse than me and is willing to approve 
the whole set and hope for the best.)

I suspect you'll have to regenerate the includes patch anyway, because 
of the missing #undef tracking I mentioned.

Let's consider the timevar.h example a bit more. Does the include have 
to move? I don't see anything in that file that looks like a dependency, 
and include files that need it are already including it. Is the fact 
that df.h includes it in any way material for generating an order of 
headers? IMO, no, it's an unnecessary change indicating a bug in the 
script, and any kind of unnecessary change in a patch like this makes it 
so much harder to verify. I think the canonical order that's produced 
should probably ignore files included from other headers so that these 
are left alone in their original order.

I'd still like more explanations of special cases in the tools like the 
diagnostic.h area as well as
     # seed tm.h with options.h since its a build file and won't be seen.
and I think we need to understand what makes them special in a way that 
makes the rest of the algorithm not handle them correctly (so that we 
don't overlook any other such cases).


Bernd
Joseph Myers Oct. 6, 2015, 4:31 p.m. UTC | #4
On Tue, 6 Oct 2015, Bernd Schmidt wrote:

> General comments first. Where applicable, I think we should apply the same
> coding standards to Python as we do for C/C++. That means things like function

FWIW, glibc's rule is to follow PEP 8 formatting for Python code.

https://sourceware.org/glibc/wiki/Style_and_Conventions#Code_formatting_in_python_sources
Andrew MacLeod Oct. 6, 2015, 7:18 p.m. UTC | #5
On 10/06/2015 08:02 AM, Bernd Schmidt wrote:
>
>
> This sounds like the intention is to move recognized core files (I 
> assume these are the ones in the "order" array in the tool) to the 
> start, and leaving everything alone? I was a bit confused about this 
> at first; I see for example "timevar.h" moving around without being 
> present in the list, but it looks like it gets added implicitly 
> through being included by df.h. (Incidentally, this looks like another 
> case like the obstack one - a file using timevars should include 
> timevar.h IMO, even if it also includes df.h).
>

Ordering the includes is perhaps more complex than you realize. It is 
more complex than I realized when I first started it. It took a long and 
very frustrating period to get it working properly.

There are implicit dependencies between some include files.  The primary 
ordering list is to provide a canonical order for key files so that 
those dependencies are automatically taken care of.  Until now we've 
managed it by hand.  The problem is that the dependencies are not 
necessarily always from the main header file; they may come from one of 
the headers that were included in it.  There are lots of dependencies on 
symtab.h for instance, which comes from tree.h.  Some other source files 
don't need tree.h, but they do need symtab.h.  If symtab.h isn't in the 
ordering list and the header which uses it is (like cgraph.h), the tool 
would move cgraph.h above symtab.h and the result doesn't work.

The solution is to take that initial canonical list, and fully expand it 
to include everything that those headers include.  This gives a linear 
canonical list of close to 100 files.  It means things like timevar.h 
(which is included by df.h) are in this "ordering":
<...>
regset.h
alloc-pool.h
timevar.h
df.h
tm_p.h
gimple-iterator
<...>

A source file which does not include df.h but includes timevar.h must 
keep it in this same relative ordering, or some other header from the 
ordering list which uses timevar.h may no longer compile.  (timevar.h 
would end up after everything in the canonical list instead of in front 
of the other file.)

This means that any of those 100 header files which occur in a source 
file should occur in this order.  The original version of the tool tried 
to spell out this exact order, but I realized that was not maintainable 
as headers change, and it was actually far simpler to specify the core 
ones in the tool, and let it do the expansion based on what is in the 
current tree.
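
In pseudo-code, the expansion is just a depth-first walk over the seed
list. This is a simplified sketch of the idea, not the tool's actual
code; includes_of() stands in for however the tool reads a header's
#include list:

  visited = set ()
  master_list = list ()

  def create_master_list (header):
    # Depth-first: order everything HEADER includes ahead of HEADER
    # itself; each header is placed once, the first time it is reached.
    if header in visited:
      return
    visited.add (header)
    for inc in includes_of (header):
      create_master_list (inc)
    master_list.append (header)

  for h in order:       # the seed list of core headers
    create_master_list (h)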

This also means that taken as a snapshot, you are going to see things 
like timevar.h move around in apparently random fashion... but it is not 
random.  It will be in front of any and all headers listed after it in 
the ordering.  Any headers which don't appear in the canonical list will 
simply retain their current order in the source file, but AFTER all the 
ones in the canonical list.

This also made it fairly easy to remove redundant includes that have 
been seen already by including some other header... I just build the 
list of headers that have been seen already, as sketched below.
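
Roughly (flattened_includes and transitive_includes are stand-ins for
what the tool already computes):

  seen = set ()
  kept = list ()
  for h in flattened_includes:   # the source file's includes, in order
    if h not in seen:
      kept.append (h)
      seen.add (h)
      seen.update (transitive_includes (h))  # everything h pulls in itself
    # else: an earlier header already provides it, so it can be dropped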

There are a couple of specialty cases that are handled.
The 'exclude processing' list contains headers which shouldn't be 
expanded like above.  They can cause irreconcilable problems when 
expanded, especially the front end files.  They do need to be ordered, 
since diagnostics require them to be included first in order to satisfy 
the requirement that GCC_DIAG_STYLE be defined before diagnostic.h is 
included.  Plus most of them include tree.h and/or diagnostic.h 
themselves, but we don't want them to impact the ordering for the 
backend files.

That list puts those core files in an appropriate place canonically, but 
doesn't expand into the file, because the order we get for the different 
front ends would be different.  Finally, diagnostic*.h and friends are 
removed from the list and put at the end to ensure everything that might 
be needed by them is available.  Again, the front end files would have 
made it much earlier than we wanted for the backend files.

I also disagree with the assertion that "a file using timevars should 
include timevar.h IMO, even if it also includes df.h".  It could, but I 
don't see the value, and I doubt anyone really cares much.  If someone 
ever removes the only thing that does bring in timevar.h, you simply add 
it then.  That is just part of updating headers.  I'm sure before I run 
this patch not every file which uses timevar.h actually physically 
includes it.  This process will set us to a somewhat consistent state.

It's simple enough to remove the ones that are redundant in an 
automated way, and very difficult to determine whether they are not 
required but contain content that is used.

The fully expanded canonical list looks something like this:

safe-ctype.h
filenames.h
libiberty.h
hwint.h
system.h
insn-modes.h
machmode.h
signop.h
wide-int.h
double-int.h
real.h
fixed-value.h
statistics.h
gtype-desc.h
ggc.h
vec.h
hashtab.h
inchash.h
mem-stats-traits.h
hash-traits.h
hash-map-traits.h
mem-stats.h
hash-map.h
hash-table.h
hash-set.h
line-map.h
input.h
is-a.h
memory-block.h
coretypes.h
options.h
tm.h
function.h
obstack.h
bitmap.h
sbitmap.h
basic-block.h
dominance.h
cfg.h
backend.h
insn-codes.h
hard-reg-set.h
target.h
genrtl.h
rtl.h
c-target.h
c-target-def.h
symtab.h
tree-core.h
tree-check.h
tree.h
cp-tree.h
c-common.h
c-tree.h
gfortran.h
tree-ssa-alias.h
gimple-expr.h
gimple.h
predict.h
cfghooks.h
regset.h
alloc-pool.h
timevar.h
df.h
tm_p.h
gimple-iterators.h
stringpool.h
tree-ssa-operands.h
gimple-ssa.h
tree-ssanames.h
tree-phinodes.h
ssa-iterators.h
ssa.h
expmed.h
insn-opinit.h
optabs-query.h
optabs-libfuncs.h
insn-config.h
optabs.h
regs.h
emit-rtl.h
ira.h
recog.h
ira-int.h
streamer-hooks.h
plugin-api.h
gcov-iov.h
gcov-io.h
wide-int-print.h
pretty-print.h
bversion.h
lto-streamer.h
data-streamer.h
tree-streamer.h
gimple-streamer.h




>
> Intentionally commented out?
>
>> +
>> + def process_ii (filen):
>> +   return process_include_info (filen, False, False)
>> +
>> + def process_ii_macro (filen):
>> +   return process_include_info (filen, True, False)
>> +
>> + def process_ii_src (filen):
>> +   return process_include_info (filen, False, True)
>> +
>> + def process_ii_macro_src (filen):
>> +   return process_include_info (filen, True, True)
>> +
>> + def ii_base (iinfo):
>> +   return iinfo[0]
>> +
>> + def ii_path (iinfo):
>> +   return iinfo[1]
>> +
>> + def ii_include_list (iinfo):
>> +   return iinfo[2]
>> +
>> + def ii_include_list_cond (iinfo):
>> +   return iinfo[3]
>> +
>> + def ii_include_list_non_cond (iinfo):
>> +   l = ii_include_list (iinfo)
>> +   for n in ii_include_list_cond (iinfo):
>> +     l.remove (n)
>> +   return l
>> +
>> + def ii_macro_consume (iinfo):
>> +   return iinfo[4]
>> +
>> + def ii_macro_define (iinfo):
>> +   return iinfo[5]
>> +
>> + def ii_src (iinfo):
>> +   return iinfo[6]
>> +
>> + def ii_src_line (iinfo):
>> +   return iinfo[7]
>
> That's a lot of little functions with pretty much no clue for the 
> reader what's going on. It looks like maybe there's an array where a 
> struct should have been used?
>

There once was a large comment at the start of process_include_info 
describing the return value vector... they simply access it.  I'm not 
sure where it went.  I will find and put the big comment back in.

Andrew
Andrew MacLeod Oct. 6, 2015, 7:19 p.m. UTC | #6
On 10/06/2015 10:56 AM, Bernd Schmidt wrote:
> On 10/06/2015 04:04 PM, Andrew MacLeod wrote:
>
>> I primarily submitted it early because you wanted to look at the tools
>> before the code patch, which is the one I care about since the longer it
>> goes, the more effort it is to update the patch to mainline.
>
> The problem is that the generated patch is impossible to review on its 
> own. It's just a half a megabyte dump of changes that can't 
> realistically be verified for correctness. Reading it can throw up 
> some interesting questions which can then (hopefully) be answered by 
> reference to the tools, such as "why does timevar.h move?" For that to 
> work, the tools need at least to have a minimum level of readability. 
> They are the important part here, not the generated patch. (Unless you 
> find a reviewer who's less risk-averse than me and is willing to 
> approve the whole set and hope for the best.)

I don't get your fear.  I could have created that patch by hand; it would 
just take a long time, and would likely be less complete, but just as large.

I'm not changing functionality.  ALL the tool is doing is removing 
header files which aren't needed to compile.  It goes to great pains to 
make sure it doesn't remove a silent dependency that conditional 
compilation might introduce.  Other than that, the sanity check is that 
everything compiles on every target and regression tests show nothing.   
Since we're doing this with just include files, and not changing 
functionality, I'm not sure what your primary concern is.  You are 
unlikely to ever be able to read the patch and decide for yourself 
whether removing expr.h from the header list is correct or not.  Much 
like if I proposed the same thing by hand.

Yes, I added the other tool which reorders the headers and removes 
duplicates, and perhaps that is what is causing you the angst.  The 
canonical ordering was developed by taking current practice and adding 
in other core files which had ordering issues that showed up during the 
reduction process.  Reordering all files to this order should actually 
resolve more issues than it causes.  I can generate and provide that as 
a patch if you want to look at it separately... I don't know what that 
buys you.  You could match the includes to the master list to make sure 
the tool did its job by itself, I guess.

The tools are unlikely to ever be used again... Jeff suggested I provide 
them to contrib just in case someone decided to do something with them 
someday; they wouldn't be lost, or at least they wouldn't have to track 
me down to get them.

If we discover that one or more of the tools does continue to have some 
life, well then maybe at that point it's worth putting some time into 
refining it a bit better.


> I suspect you'll have to regenerate the includes patch anyway, because 
> of the missing #undef tracking I mentioned.

I don't see that #undef is relevant at all.  All the conditional 
dependencies care about is "MAY DEFINE".  It's conservative in that if 
something could be defined, we'll assume it is and not remove any file 
which may depend on it.  To undefine something in a MAY DEFINE world 
doesn't mean anything.



>
> Let's consider the timevar.h example a bit more. Does the include have 
> to move? I don't see anything in that file that looks like a 
> dependency, and include files that need it are already including it. 
> Is the fact that df.h includes it in any way material for generating 
> an order of headers? IMO, no, it's an unnecessary change indicating a 
> bug in the script, and any kind of unnecessary change in a patch like 
> this makes it so much harder to verify. I think the canonical order 
> that's produced should probably ignore files included from other 
> headers so that these are left alone in their original order.
>
I covered this in the last note.  Pretty much every file is going to 
have a "core" of up to 95 files reordered into the canonical form, 
which, taken as a snapshot of any given file, may look arbitrary but is 
in fact a specific subset of the canonical ordering.  You can't only 
order some parts of it because there are subtle dependencies between the 
files which force you to look at them all.  Trust me, I didn't start by 
reordering all of them this way... it developed over time.


> I'd still like more explanations of special cases in the tools like 
> the diagnostic.h area as well as
>     # seed tm.h with options.h since its a build file and won't be seen.
> and I think we need to understand what makes them special in a way 
> that makes the rest of the algorithm not handle them correctly (so 
> that we don't overlook any other such cases).
>
See the other note; it's because of the front end files/diagnostic 
dependencies or irreconcilable cycles because of what a header 
includes.  Any other case would have shown up the way those did 
during development.

Andrew
Bernd Schmidt Oct. 6, 2015, 8:37 p.m. UTC | #7
On 10/06/2015 09:19 PM, Andrew MacLeod wrote:
> I dont get your fear.  I could have created that patch by hand, it would
> just take a long time, and would likely be less complete, but just as
> large.
>
> I'm not  changing functionality.  ALL the tool is doing is removing
> header files which aren't needed to compile.  It goes to great pains to
> make sure it doesn't remove a silent dependency that conditional
> compilation might introduce.  Other than that, the sanity check is that
> everything compiles on every target and regression tests show nothing.
> Since we're doing this with just include files, and not changing
> functionality, Im not sure what your primary concern is?

My concern is that I've seen occasions in the past where "harmless 
cleanups" that were not intended to alter functionality introduced 
severe and subtle bugs that went unnoticed for a significant amount of 
time. If a change does not alter functionality, then there is a valid 
question of "why apply it then?", and the question of correctness 
becomes very important (to me anyway). The patch was produced by a 
fairly complex process, and I'd want to at least be able to convince 
myself that the process is correct.

Anyhow, I'll step back from this, you're probably better served by 
someone else reviewing the patch.


Bernd
Jeff Law Oct. 6, 2015, 9:27 p.m. UTC | #8
On 10/06/2015 08:04 AM, Andrew MacLeod wrote:
>>> No commenting on the quality of python code... :-) I was
>>> learning python on the fly.    Im sure some things are QUITE awful.,
>>
>> Yeah, the general impression is of fairly ad-hoc code. Not sure how
>> much can be done about this.
> they were never intended as general purpose tools, they were developed
> over multiple iterations and bugfixing and never properly designed..
> they were never originally intended for public submission, so they
> suffer...  and I'm not interested in rewriting them yet again
So a little background for Bernd.

The tangled mess that our header files have been makes it extremely 
difficult to do something like introduce new classes/interfaces to 
improve the separation of various parts of GCC.  Consider the case if we 
wanted to drop trees from gimple onward by initially wrapping trees in a 
trivially compatible class, then converting files one by one to use the 
new representation.

We'd want to be able to do the conversion, then ensure ourselves that 
the old interfaces couldn't sneak in.  Getting there required some 
significant header file deconstruction, then reconstruction.

So Andrew set forth to try and untangle the mess of dependencies, remove 
unnecessary includes, etc etc.  He had the good sense to write some 
scripts to help :-0

A few months ago, as this stage of refactoring header files was nearing 
completion, I asked Andrew how we were going to prevent things from 
getting into the sorry shape we were in last year.  From that discussion 
came the suggestion that he should polish up his scripts and submit them 
for inclusion into the contrib/ subdirectory for future reference/use.

Ideally we'd occasionally run those scripts to ensure that we don't muck 
things up too badly again in the future.

Anyway, that's how we got here.  The scripts are just helper tools, but 
I wouldn't consider them a core part of GCC.  Obviously the cleaner and 
easier to run, the better.

It's interesting that a lot of work done by Andrew has ended up 
mirroring stuff I'm reading these days in Feathers' book.


Jeff
Jeff Law Oct. 6, 2015, 9:29 p.m. UTC | #9
On 10/06/2015 02:37 PM, Bernd Schmidt wrote:
> On 10/06/2015 09:19 PM, Andrew MacLeod wrote:
>> I dont get your fear.  I could have created that patch by hand, it would
>> just take a long time, and would likely be less complete, but just as
>> large.
>>
>> I'm not  changing functionality.  ALL the tool is doing is removing
>> header files which aren't needed to compile.  It goes to great pains to
>> make sure it doesn't remove a silent dependency that conditional
>> compilation might introduce.  Other than that, the sanity check is that
>> everything compiles on every target and regression tests show nothing.
>> Since we're doing this with just include files, and not changing
>> functionality, Im not sure what your primary concern is?
>
> My concern is that I've seen occasions in the past where "harmless
> cleanups" that were not intended to alter functionality introduced
> severe and subtle bugs that went unnoticed for a significant amount of
> time. If a change does not alter functionality, then there is a valid
> question of "why apply it then?", and the question of correctness
> becomes very important (to me anyway). The patch was produced by a
> fairly complex process, and I'd want to at least be able to convince
> myself that the process is correct.
A very valid concern.  In fact, one could argue that one of the long 
term problems we're likely to face as a project is the inability to do 
this kind of refactoring with high degrees of confidence that we're not 
breaking things.



>
> Anyhow, I'll step back from this, you're probably better served by
> someone else reviewing the patch.
That's fine.  I don't mind covering this.

jeff
Andrew MacLeod Oct. 6, 2015, 10:43 p.m. UTC | #10
On 10/06/2015 04:37 PM, Bernd Schmidt wrote:
> On 10/06/2015 09:19 PM, Andrew MacLeod wrote:
>> I dont get your fear.  I could have created that patch by hand, it would
>> just take a long time, and would likely be less complete, but just as
>> large.
>>
>> I'm not  changing functionality.  ALL the tool is doing is removing
>> header files which aren't needed to compile.  It goes to great pains to
>> make sure it doesn't remove a silent dependency that conditional
>> compilation might introduce.  Other than that, the sanity check is that
>> everything compiles on every target and regression tests show nothing.
>> Since we're doing this with just include files, and not changing
>> functionality, Im not sure what your primary concern is?
>
> My concern is that I've seen occasions in the past where "harmless 
> cleanups" that were not intended to alter functionality introduced 
> severe and subtle bugs that went unnoticed for a significant amount of 
> time. If a change does not alter functionality, then there is a valid 
> question of "why apply it then?", and the question of correctness 
> becomes very important (to me anyway). The patch was produced by a 
> fairly complex process, and I'd want to at least be able to convince 
> myself that the process is correct.
>
> Anyhow, I'll step back from this, you're probably better served by 
> someone else reviewing the patch.
>
>
> Bernd
I do get it.  And I have spent a lot of time trying to make sure none of 
those sorts of bugs come in, and ultimately have tried to be 
conservative... after all, it's better to have the tool leave an include 
than remove one that may be required.

Ultimately, these changes are unlikely to introduce an issue, but there 
is a very slight possibility.  Any issues that do surface should be of 
the "not using a pattern" kind because a conditional compilation code 
case was somehow missed. I'm hoping for none of those obviously.  
Anyway, the tool does seem to work on all the tests I have looked at.  
If any bugs are uncovered by this, then they are also latent issues we 
didn't know about that should be exposed and fixed anyway.

I am fine if we'd like to separate the patches into the reordering and 
the deleting.  It's not a lot of effort on my part, just a lot of time 
compiling for the reducer in the background... and we can do them as 2 
commits if that is helpful.

What I don't want to do is spend a lot more time massaging the tools for 
contrib, because I am sick of looking at them right now, and no one is in 
a hurry to use them anyway... if anyone ever does. :-)  The documentation 
grammar should certainly be fixed up, and I will add some comments around 
the questions you had.

We could also do a small scale submission on half a dozen files: provide 
the reorder patch, and then the reduction patch with the logs, if that 
helps whoever is reviewing get comfortable with what the tool is doing; 
then it's easier to simply acknowledge the mechanical nature of the large 
commit.

Perhaps it would be educational anyway.

I'll do it however you guys want... I just want to get it done :-)

Andrew
Andrew MacLeod Oct. 7, 2015, 4:35 p.m. UTC | #11
I went through and addressed the comments.  Just for info, a few replies:

>> +     # various front ends have to set GCC_DIAG_STYLE before 
>> including it
>> +     # for each file, we'll tailor where it belongs by looking at 
>> the dup
>> +     # list and seeing which file is included, and position it 
>> appropriately.
>
> From that comment it's not entirely clear how they are handled. Please 
> expand documentation of this mechanism.

I modified the comments in a couple of places to hopefully make it clearer.
>> + for i in order:
>> +   create_master_list (os.path.basename (i), False)
>
> I found myself wanting to pass True. The tool could use a "-v" flag.
>
I changed the existing -s flag to -v, and simply passed the value 
here...  Now you see the final list, as well as the list of where each 
one came from.
>> +   for line in logfile:
>> +     if len (line) > 21 and line[:21] in depstring:
>> +       if newinc:
>> +         incfrom = list()
>> +     newinc = False
>
> It looks like you are mixing tab and space indentation. For a language 
> like Python, that is absolutely scary. Please fix throughout (I think 
> only spaces is probably best).
>

vi is doing that automatically for me... I will expandtabs everything.
>> + # compensate for this stupid warning that should be an error for
>> + # inlined templates
>> + def get_make_rc (rc, output):
>> +   rc = rc % 1280
>> +   if rc == 0:
>> +     # This is not considered a fatal error for a build!  /me rolls 
>> eyes
>> +     h = re.findall ("warning: inline function.*used but never 
>> defined", output)
>> +     if len(h) != 0:
>> +       rc = 1
>> +   return rc;
>
> What's this about?
I've updated the comment to be clearer.  Apparently it's only a warning 
to use a template inline function with no definition.  I suspect this is 
some oddball C++ thing :-).  Maybe it can be resolved at link time 
somehow?  Anyway, what I found is that the return code from this is 0 
since it's just a warning.  So the tool would remove the header file, and 
when I later try to link and build an object, it becomes a fatal link 
error with the function used but undefined.

It shows up when checking target builds, since I only try to build the .o 
file there rather than build and link.  So the tool checks the output 
from the compilation, and if it sees this error, decides to be 
conservative and report it as a build error, and thus it will leave the 
header file in the source.

>
>> +   print " -a  : Show only files which*all*  listed files are included"
>
> Whitespace around *all*. Seems to happen quite often.

Yeah, that is very odd.  In the code here, there is a space in front of 
every single one of those.  I simply changed all these to 'all' instead 
of '*all*'.

I'm also going to add a few more comments to functions in 
gcc-order-headers and reduce-headers, as well as utils.py

Andrew
David Malcolm Oct. 8, 2015, 4:31 p.m. UTC | #12
On Tue, 2015-10-06 at 14:02 +0200, Bernd Schmidt wrote:
[...]
> > No commenting on the quality of python code... :-) I was
> > learning python on the fly.    Im sure some things are QUITE awful.,

[...]

> > + def ii_base (iinfo):
> > +   return iinfo[0]
> > +
> > + def ii_path (iinfo):
> > +   return iinfo[1]
> > +
> > + def ii_include_list (iinfo):
> > +   return iinfo[2]
> > +
> > + def ii_include_list_cond (iinfo):
> > +   return iinfo[3]
> > +
> > + def ii_include_list_non_cond (iinfo):
> > +   l = ii_include_list (iinfo)
> > +   for n in ii_include_list_cond (iinfo):
> > +     l.remove (n)
> > +   return l
> > +
> > + def ii_macro_consume (iinfo):
> > +   return iinfo[4]
> > +
> > + def ii_macro_define (iinfo):
> > +   return iinfo[5]
> > +
> > + def ii_src (iinfo):
> > +   return iinfo[6]
> > +
> > + def ii_src_line (iinfo):
> > +   return iinfo[7]
> 
> That's a lot of little functions with pretty much no clue for the reader 
> what's going on. It looks like maybe there's an array where a struct 
> should have been used?

FWIW, this kind of thing is often made a lot neater and easier to debug
by using "namedtuple" from within the "collections" module in the
standard library:

https://docs.python.org/2/library/collections.html#collections.namedtuple

which lets you refer e.g. to field 5 of the tuple as a "define"
attribute.

  iinfo.define

and avoid all these accessor functions (and you can add methods and
properties, giving e.g. a "list_non_cond").

Not that I'm asking you to rewrite it; merely that namedtuple is one of
many gems in the python stdlib that are worth knowing about.
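
For example, a sketch of the shape (field names and sample values made
up to match the accessors above):

  from collections import namedtuple

  # One named field per slot of the old 8-element tuple.
  IncludeInfo = namedtuple ("IncludeInfo",
      ["base", "path", "include_list", "include_list_cond",
       "macro_consume", "macro_define", "src", "src_line"])

  iinfo = IncludeInfo ("tree.h", "gcc/tree.h", [], [], [], [], False, [])
  print iinfo.macro_define    # instead of ii_macro_define (iinfo)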

[...]

Hope this is constructive
Dave

Patch

Index: contrib/headers/ChangeLog
===================================================================
*** contrib/headers/ChangeLog	(revision 0)
--- contrib/headers/ChangeLog	(working copy)
***************
*** 0 ****
--- 1,12 ----
+ 2015-10-06  Andrew MacLeod  <amacleod@redhat.com>
+ 
+ 	* README : New File.
+ 	* count-headers : New File.
+ 	* gcc-order-headers : New File.
+ 	* graph-header-logs : New File.
+ 	* graph-include-web : New File.
+ 	* headerutils.py : New File.
+ 	* included-by : New File.
+ 	* reduce-headers : New File.
+ 	* replace-header : New File.
+ 	* show-headers : New File.
Index: contrib/headers/README
===================================================================
*** contrib/headers/README	(revision 0)
--- contrib/headers/README	(working copy)
***************
*** 0 ****
--- 1,282 ----
+ Quick start documentation for the header file utilities.  
+ 
+ This isn't a full breakdown of the tools, just they typical use scenarios.
+ 
+ - Each tool accepts -h to show its usage. usually no parameters will also
+ trigger the help message.  Help may specify additonal functionality to what is
+ listed here.
+ 
+ - For *all* tools, option format for specifying filenames must have no spaces
+ between the option and filename.
+ ie.:     tool -lfilename.h  target.h
+ 
+ - Many of the tools are required to be run from the core gcc source directory
+ containing coretypes.h  typically that is  in gcc/gcc from a source checkout.
+ For these tools to work on files not in this directory, their path needs to be
+ specified on the command line, 
+ ie.:     tool c/c-decl.c  lto/lto.c
+ 
+ - options can be intermixed with filenames anywhere on the command line
+ ie.   tool ssa.h rtl.h -a   is equivalent to 
+       tool ssa.h -a rtl.h
+ 
+ 
+ 
+ 
+ 
+ gcc-order-headers
+ -----------------
+   This will reorder any primary backend headers files into a canonical order
+   which will resolve any hidden dependencies they may have.  Any unknown
+   headers will simply be occur after the recognized core files, and retain the
+   same relative ordering they had.
+  
+   Must be run in the core gcc source directory
+ 
+   simply execute the command listing any files you wish to process on the
+   command line.
+ 
+   Any files which are changed are output, and the original is saved with a
+   .bak extention.
+ 
+   ex.:     gcc-order-headers tree-ssa.c c/c-decl.c
+ 
+   -s will list all of the known headers in their canonical order. It does not
+   show which of those headers include other headers, just the final canonical
+   ordering.
+ 
+   if any header files are included within a conditional code block, the tool
+   will issue a message and not change the file.  When this happens, you can
+   manually inspect the file, and if reorder it will be fine, rerun the command
+   with -i on the files.  This will ignore the conditional error condition
+   and perform the re-ordering anyway.
+   
+   If any #include line has the beginning of a multi-line comment, it will also
+   refuse to process the file until that is resolved. 
+  
+ 
+ 
+ 
+ show-headers
+ ------------
+   This will show the include structure for any given file. Each level of nesting
+   is indented, and when any duplicate headers are seen, they have their
+   duplicate number shown
+ 
+   -i may be used to specify alternate search directories for headers to parse.
+ 
+   Must be run in the core gcc source directory
+ 
+   ex.: show-headers -i../../build/gcc -i../libcpp tree-ssa.c
+ 	tree-ssa.c
+ 	  config.h
+ 	    auto-host.h
+ 	    ansidecl.h  (1)
+ 	  system.h
+ 	    safe-ctype.h
+ 	    filenames.h
+ 	      hashtab.h  (1)
+ 		ansidecl.h  (2)
+ 	    libiberty.h
+ 	      ansidecl.h  (3)
+ 	    hwint.h
+ 	  coretypes.h
+ 	    machmode.h  (1)
+ 	      insn-modes.h  (1)
+ 	    signop.h
+ 	  <...>
+ 
+ 
+ 
+ 
+ count-headers
+ -------------
+   simply count all the headers found in the specified files. A summary is 
+   printed showing occurrences from high to low.
+ 
+   ex.:    count-headers  tree*.c
+ 	    86 : coretypes.h
+ 	    86 : config.h
+ 	    86 : system.h
+ 	    86 : tree.h
+ 	    82 : backend.h
+ 	    80 : gimple.h
+ 	    72 : gimple-iterator.h
+ 	    70 : ssa.h
+ 	    68 : fold-const.h
+             <...>
+ 
+ 
+ 
+ included-by
+ -----------
+   This tool will search all the .c,.cc and .h files and output a list of files
+   which include the specified header(s).
+ 
+   It does a 4 level deep find of all source files from the current directory
+   and look in each of those for a #include of the specified headers.  So expect
+   a little bit of slowness.
+ 
+   -i limits the search to only other header files.
+   -c limits the search to .c and .cc files.
+   -a shows only source files which include *all* specified headers.
+   -f allows you to specify a file which contains a list of source files to
+      check rather than performing the much slower find command.
+ 
+   ex: included-by tree-vectorizer.h
+ 	config/aarch64/aarch64.c
+ 	config/i386/i386.c
+ 	config/rs6000/rs6000.c
+ 	tree-loop-distribution.c
+ 	tree-parloops.c
+ 	tree-ssa-loop-ivopts.c
+ 	tree-ssa-loop.c
+ 
+ 
+ 
+ 
+ replace-header
+ --------------
+   This tool simply replaces a single header file with one or more other headers.
+   -r specifies the include to replace, and one or more -f options specify the
+   replacement headers, in the order they occur.
+   
+   This is commonly used in conjunction with 'included-by' to change all 
+   occurrences of a header file to something else, or to insert new headers 
+   before or after.  
+ 
+   ex:  to insert #include "before.h" before every occurence of tree.h in all
+   .c and .cc source files:
+ 
+   replace-header -rtree.h -fbefore.h -ftree.h `included-by -c tree.h`
+ 
+ 
+ 
+ 
+ reduce-headers
+ --------------
+ 
+   This tool removes any header files which are not needed from a source file.
+ 
+   This tool must be run for the core gcc source directory, and requires either
+   a native build and sometimes target builds, depending on what you are trying
+   to reduce.
+ 
+   it is good practive to run 'gcc-order-headers' on a source file before trying
+   to reduce it.  This removes duplicates and performs some simplifications 
+   which reduce the chances of the reduction tool missing things.
+   
+   start with a completely bootstrapped native compiler.
+ 
+   Any desired target builds should be built in one directory using a modified
+   config-list.mk file which doesnt delete the build directory when its done.
+   any target directories which do not successfully complete a 'make all-gcc'
+   may cause the tool to not reduce anything.
+   (todo - provide a config-list.mk that leaves successful target builds, but
+           deletes ones which do not compile)
+ 
+   The tool will examine all the target builds to determine which targets build
+   the file, and include those targets in the testing.
+   
+ 
+ 
+   The tool will analyze a source file and attempt to remove each non-conditional
+   header from last to first in the file.:
+     It will first attempt to build the native all-gcc target.
+     If that succeeds, it will attempt to build any target build .o files
+     If that suceeds, it will check to see if there are any conditional
+        compilation dependencies between this header file and the source file or
+        any header whihch have already been determined as non-removable.
+     If all these tests are passed, the header file is determined to be removable
+        and is removed from the source file.
+     This continues until all headers have been checked.
+   At this point, the a bootstrap is attempted in the native build, and if that
+      passes the file is considered reduced.
+ 
+   Any files from the config subdirectory require target builds to be present
+   in order to proceed.
+ 
+   A small subset of targets has been determined to provide excellent coverage,
+   at least as of Aug 31/15 .  A fullset of targets reduced all of the files
+   making up libbackend.a.  All of the features which requires target testing 
+   were found to be triggered by one or more of these targets.  They are
+   actually known to the tool, and when checkiong target, it will check those
+   targets first, then the rest.  It is mostly safe to do a reduction with just
+   these targets, at least until some new whacky target comes along.
+   building config-list.mk with :
+   LIST="aarch64-linux-gnu arm-netbsdelf avr-rtems c6x-elf epiphany-elf hppa2.0-hpux10.1 i686-mingw32crt i686-pc-msdosdjgpp mipsel-elf powerpc-eabisimaltivec rs6000-ibm-aix5.1.0 sh-superh-elf sparc64-elf spu-elf"
+ 
+   -b specifies the native bootstrapped build root directory
+   -t specifies a target build root directory that config-list.mk was run from
+   -f is used to limit the headers for consideration.
+ 
+   example:
+ 
+   mkdir gcc          // checkout gcc in subdir gcc
+   mdsir build        // boostrap gcc in subdir build
+   mkdir target       // create target directory and run config-list.mk
+   cd gcc/gcc
+ 
+   reduce-headers -b../../build -t../../targets -falias.h -fexpr.h tree*.c  (1)
+        #  This will attempt to remove only alias.h and expr.h from tree*.c
+ 
+   reduce-headers -b../../build -t../../targets tree-ssa-live.c
+        #  This will attempt to remove all header files from tree-ssa-live.c
+   
+ 
+   the tool will generate a number of log files:
+ 
+     reduce-headers.log :  All the compilation failure output that tool tried.
+     reduce-headers.sum : One line summary of what happened to each source file.
+ 
+   (All the remaining logs are appended to, so if you run the tool multiple times
+   these files are just added to. You must physically remove them yourself.)
+ 
+     reduce-headers-kept.log: List of all the successful compiles that were
+                              ignored because of conditional macro dependencies
+ 			     and why it thinks taht is the case
+     $src.c.log  : for each failed header removal, the compilation
+ 		  messages as to why it failed.
+     $header.h.log: The same log is put into the relevent header log as well.
+ 
+ 
+ a sample output from ira.c.log:
+ 
+ Compilation failed:
+  for shrink-wrap.h:
+ 
+  ============================================
+  /gcc/2015-09-09/gcc/gcc/ira.c: In function ‘bool split_live_ranges_for_shrink_wrap()’:
+  /gcc/2015-09-09/gcc/gcc/ira.c:4839:8: error: ‘SHRINK_WRAPPING_ENABLED’ was not declared in this scope
+     if (!SHRINK_WRAPPING_ENABLED)
+             ^
+ 	    make: *** [ira.o] Error 1
+ 
+ 
+ The same message would be put into shrink-wrap.h.log.
+ 
+ 
+ 
+ graph-header-logs
+ -----------------
+   This tool will parse all the messages from the .c.log files, looking for
+   failures that show up in other headers, meaning there is a compilation
+   dependency between the 2 header files.
+ 
+   The tool will aggregate all these and generate a graph of the dependencies
+   exposed during compilation.  Red lines indicate dependencies that are
+   present because a header file physically includes another header.  Black
+   lines represent data dependencies causing compilation failure if the
+   header isn't present.
+ 
+   ex.: graph-header-logs *.c.log
+ 
+ 
+ 
+ graph-include-web
+ -----------------
+   This tool can be used to visualize the include structure in files.  It
+   rapidly becomes useless if you specify too many files, but it can be
+   useful for finding cycles and redundancies, or simply to see what a single
+   file looks like.
+ 
+   ex.: graph-include-web tree.c
Index: contrib/headers/count-headers
===================================================================
*** contrib/headers/count-headers	(revision 0)
--- contrib/headers/count-headers	(working copy)
***************
*** 0 ****
--- 1,63 ----
+ #! /usr/bin/python2
+ import os.path
+ import sys
+ import shlex
+ import re
+ 
+ from headerutils import *
+ 
+ 
+ 
+ usage = False
+ src = list()
+ flist = { }
+ process_h = True
+ process_c = True
+ verbose = False
+ all_inc = True
+ level = 0
+ 
+ only_use_list = list()
+ 
+ 
+ 
+ for x in sys.argv[1:]:
+   if x[0:2] == "-h":
+     usage = True
+   else:
+     src.append (x)
+ 
+ 
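+ # Count each unique include once per listed file, then print the totals in
+ # descending order of occurrence.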
+ if not usage and len(src) > 0:
+ 
+   incl = { }
+   for fn in src:
+     data = readwholefile (fn)
+     dup = { }
+     for line in data:
+       d = find_pound_include (line, True, True)
+       if d != "" and d[-2:] ==".h":
+         if dup.get(d) == None:
+ 	  if incl.get(d) == None:
+ 	    incl[d] = 1
+ 	  else:
+ 	    incl[d] = incl[d]+ 1
+ 	  dup[d] = 1
+ 
+   l = list()
+   for i in incl:
+     l.append ((incl[i], i))
+   l.sort(key=lambda tup:tup[0], reverse=True)
+ 
+   for f in l:
+     print str(f[0]) + " : " + f[1]
+ 
+ else:
+   print "count-headers file1 [filen]"
+   print "Count the number of occurrences of all includes across all listed files"
+ 
+  
+ 
+ 
+ 
+ 

Property changes on: contrib/headers/count-headers
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: contrib/headers/gcc-order-headers
===================================================================
*** contrib/headers/gcc-order-headers	(revision 0)
--- contrib/headers/gcc-order-headers	(working copy)
***************
*** 0 ****
--- 1,366 ----
+ #! /usr/bin/python2
+ import os
+ import sys
+ import shlex
+ import re
+ 
+ from headerutils import *
+ import Queue
+ 
+ file_list = list ()
+ usage = False
+ 
+ ignore_conditional = False
+ 
+ order = [
+   "system.h",
+   "coretypes.h",
+   "backend.h",
+   "target.h",
+   "rtl.h",
+   "c-family/c-target.h",
+   "c-family/c-target-def.h",
+   "tree.h",
+   "cp/cp-tree.h",
+   "c-family/c-common.h",  # these must come before diagnostic.h
+   "c/c-tree.h",
+   "fortran/gfortran.h",
+   "gimple.h",
+   "cfghooks.h",
+   "df.h",
+   "tm_p.h",
+   "gimple-iterators.h",
+   "ssa.h",
+   "expmed.h",
+   "optabs.h",
+   "regs.h",
+   "ira.h",
+   "ira-int.h",
+   "gimple-streamer.h"
+ 
+ ]
+ 
+ exclude_special = [  "bversion.h", "obstack.h", "insn-codes.h", "hooks.h" ]
+ includes = { }
+ dups = { }
+ exclude_processing = [ "tree-vectorizer.h" , "c-target.h", "c-target-def.h", "cp-tree.h", "c-common.h", "c-tree.h", "gfortran.h" ]
+ 
+ master_list = list()
+ # Where each include file comes from in the source.
+ h_from = { }
+ 
+ # Create the master ordering list; this is the desired order of headers.
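+ # FN is the basename of a header file; recursively add FN's sub-includes and
+ # then FN itself to the master list.  VERBOSE prints each header along with
+ # the file which first included it.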
+ def create_master_list (fn, verbose):
+   if fn not in exclude_processing:
+     for x in includes[fn][1]:
+       create_master_list (x, verbose)
+   if not fn in master_list:
+     # Don't put diagnostic*.h into the ordering list; it's special since
+     # various front ends have to set GCC_DIAG_STYLE before including it.
+     # For each file, we'll tailor where it belongs by looking at the dup
+     # list, seeing which file is included, and positioning it appropriately.
+     if fn != "diagnostic.h" and fn != "diagnostic-core.h":
+       master_list.append (fn)
+       if (verbose):
+ 	print fn + "      included by: " + includes[fn][0]
+ 
+ 
+ 
+ def print_dups ():
+   if dups:
+     print "\nduplicated includes"
+   for i in dups:
+     string =  "dup : " + i + " : "
+     string += includes[i][0] 
+     for i2 in dups[i]:
+       string += ", "+i2
+     print string
+ 
+ 
+ def process_known_dups ():
+   # rtl.h gets tagged as a duplicate includer for all of coretypes.h, but
+   # that's really only for generator files.
+   rtl_remove = includes["coretypes.h"][1] + ["statistics.h", "vec.h"]
+   for i in rtl_remove:
+     if dups[i] and "rtl.h" in dups[i]:
+       dups[i].remove("rtl.h")
+     if not dups[i]:
+       dups.pop (i, None)
+ 
+   # make sure diagnostic.h is the owner of diagnostic-core.h
+   if includes["diagnostic-core.h"][0] != "diagnostic.h":
+     dups["diagnostic-core.h"].append (includes["diagnostic-core.h"][0])
+     includes["diagnostic-core.h"] = ("diagnostic.h", includes["diagnostic-core.h"][1])
+ 
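+ # Return the header in HEADER_LIST which includes HEADER, directly or
+ # indirectly, or "" if none does.  diagnostic.h and diagnostic-core.h get
+ # special handling via the duplicate-includer table.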
+ def indirectly_included (header, header_list):
+   nm = os.path.basename (header)
+   while nm and includes.get(nm):
+     if includes[nm][0] in header_list:
+       return includes[nm][0]
+     nm = includes[nm][0]
+ 
+   if header == "diagnostic-core.h":
+     if dups.get("diagnostic-core.h"):
+       for f in dups["diagnostic-core.h"]:
+ 	if f in header_list:
+ 	  return f
+     else:
+       if header in header_list:
+ 	return header
+     # Now check if diagnostics is included indirectly anywhere
+     header = "diagnostic.h"
+ 
+   if header == "diagnostic.h":
+     if dups.get("diagnostic.h"):
+       for f in dups["diagnostic.h"]:
+ 	if f in header_list:
+ 	  return f
+     else:
+       if header in header_list:
+ 	return header 
+ 
+   return ""
+ 
+ 
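+ # SRC_H is the list of headers the source file includes; DESIRED_ORDER is the
+ # canonical master ordering.  Return the new, ordered list of headers the
+ # file should include directly.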
+ def get_new_order (src_h, desired_order):
+   new_order = list()
+   for h in desired_order:
+     if h in master_list:
+       # find what included this
+       iclist = list()
+       ib = includes[h][0]
+       while ib:
+         iclist.insert(0, ib)
+ 	ib = includes[ib][0]
+       if iclist:
+ 	for x in iclist:
+ 	  if x in src_h and x not in exclude_processing:
+ 	    if x not in new_order and x[:10] != "diagnostic" and h not in exclude_special:
+ 	      new_order.append (x)
+ 	      break
+       else:
+ 	if h not in new_order:
+ 	  new_order.append (h)
+ 
+   f = ""
+   if "diagnostic.h" in src_h:
+     f = "diagnostic.h"
+   elif "diagnostic-core.h" in src_h:
+     f = "diagnostic-core.h"
+ 
+  
+   if f:
+     ii = indirectly_included (f, src_h)
+     if not ii or ii == f:
+       new_order.append (f)
+ 
+   return new_order
+         
+     
+ 
+ # stack of files to process
+ process_stack = list()
+ 
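+ # INFO is a (filename, includer) pair.  Record the file's unique include list
+ # in the includes table, push its sub-includes onto the process stack, and
+ # note any duplicate includer in the dups table.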
+ def process_one (info):
+   i = info[0]
+   owner = info[1]
+   name = os.path.basename(i)
+   if os.path.exists (i):
+     if includes.get(name) == None:
+       l = find_unique_include_list (i)
+       # create a list which has just basenames in it
+       new_list = list()
+       for x in l:
+ 	new_list.append (os.path.basename (x))
+ 	process_stack.append((x, name))
+       includes[name] = (owner, new_list)
+     elif owner:
+       if dups.get(name) == None:
+         dups[name] = [ owner ]
+       else:
+         dups[name].append (owner)
+   else:
+     # Seed tm.h with options.h since it's a build file and won't be seen.
+     if not includes.get(name):
+       if name == "tm.h":
+ 	includes[name] = (owner, [ "options.h" ])
+ 	includes["options.h"] = ("tm.h", list())
+       else:
+ 	includes[name] = (owner, list())
+ 
+ 
+ show_master = False
+ 
+ for arg in sys.argv[1:]:
+   if arg[0:1] == "-":
+     if arg[0:2] == "-h":
+       usage = True
+     elif arg[0:2] == "-i":
+       ignore_conditional = True
+     elif arg[0:2] == "-s":
+       show_master = True
+     else:
+       print "Error: unrecognized option " + arg
+   elif os.path.exists(arg):
+     file_list.append (arg)
+   else:
+     print "Error: file " + arg + " Does not exist."
+     usage = True
+ 
+ if not file_list and not show_master:
+   usage = True
+ 
+ if not usage and not os.path.exists ("coretypes.h"):
+   usage = True
+   print "Error: Must run command in main gcc source directory containing coretypes.h\n"
+ 
+ # Process diagnostic.h first; it's special since GCC_DIAG_STYLE can be
+ # overridden by languages, but that must be done by a file included BEFORE it.
+ # So make sure it isn't seen as included by one of those files by making it
+ # appear to be included by the src file.
+ process_stack.insert (0, ("diagnostic.h", ""))
+ 
+ # Add the list of files in reverse order since it is processed as a stack later
+ for i in order:
+   process_stack.insert (0, (i, "") )
+ 
+ # Build up the library of what header files include what other files.
+ while process_stack:
+   info = process_stack.pop ()
+   process_one (info)
+ 
+ # Now create the master ordering list.
+ for i in order:
+   create_master_list (os.path.basename (i), False)
+ 
+ # Handle warts in the duplicate list.
+ process_known_dups ()
+ desired_order = master_list
+ 
+ if show_master:
+   print " Canonical order of gcc include files: "
+   for x in master_list:
+     print x
+   print " "
+ 
+ if usage:
+   print "gcc-order-headers [-i] [-s] file1 [filen]"
+   print "    Ensures gcc's headers files are included in a normalized form with"
+   print "    redundant headers removed.  The original files are saved in filename.bak"
+   print "    Outputs a list of files which changed."
+   print " -i ignore conditional compilation."
+   print "    Use after examining the file to be sure includes within #ifs are safe"
+   print "    Any headers within conditional sections will be ignored."
+   print " -s Show the cananoical order of known includes"
+   sys.exit(0)
+ 
+ 
+ didnt_do = list ()
+ 
+ for fn in file_list:
+   nest = 0
+   src_h = list()
+   src_line = { }
+ 
+   master_list = list()
+   includes = { }
+   dups = { }
+ 
+   iinfo = process_ii_src (fn)
+   src = ii_src (iinfo)
+   include_list = ii_include_list (iinfo)
+ 
+   if ii_include_list_cond (iinfo):
+     if not ignore_conditional:
+       print fn + ": Cannot process due to conditional compilation of includes"
+       didnt_do.append (fn)
+       src = list ()
+ 
+   if not src:
+     continue
+ 
+   process_stack = list()
+   # prime the stack with headers in the main ordering list so we get them in
+   # this order.
+   for d in order:
+     if d in include_list:
+       process_stack.insert (0, (d, ""))
+ 
+   for d in include_list:
+       nm = os.path.basename(d)
+       src_h.append (nm)
+       iname = d
+       iname2 = os.path.dirname (fn) + "/" + d
+       if not os.path.exists (d) and os.path.exists (iname2):
+         iname = iname2
+       if iname not in process_stack:
+ 	process_stack.insert (0, (iname, ""))
+       src_line[nm] = ii_src_line(iinfo)[d]
+       if src_line[nm].find("/*") != -1 and src_line[nm].find("*/") == -1:
+         # This means we have a multi-line comment; abort.
+ 	print fn + ": Cannot process due to a multi-line comment:"
+ 	print "        " + src_line[nm]
+ 	if fn not in didnt_do:
+ 	  didnt_do.append (fn)
+ 	src = list ()
+ 
+   if not src:
+     continue
+ 
+   # Now create the list of includes as seen by the source file.
+   while process_stack:
+     info = process_stack.pop ()
+     process_one (info)
+  
+   for i in include_list:
+     create_master_list (os.path.basename (i), False)
+ 
+   new_src = list()
+   header_added = list()
+   new_order = list()
+   for line in src:
+     d = find_pound_include (line, True, True)
+     if not d or d[-2:] != ".h":
+       new_src.append (line)
+     else:
+       if d == order[0] and not new_order:
+         new_order = get_new_order (src_h, desired_order)
+ 	for i in new_order:
+ 	  new_src.append (src_line[i])
+ 	  # if not seen, add it.
+ 	  if i not in header_added:
+ 	    header_added.append (i)
+       else:
+ 	nm = os.path.basename(d)
+ 	if nm not in header_added:
+ 	  iby = indirectly_included (nm, src_h)
+ 	  if not iby:
+ 	    new_src.append (line)
+ 	    header_added.append (nm)
+ 
+   if src != new_src:
+     os.rename (fn, fn + ".bak")
+     fl = open(fn,"w")
+     for line in new_src:
+       fl.write (line)
+     fl.close ()
+     print fn 
+ 
+  
+ if didnt_do:
+   print "\n\n Did not process the following files due to conditional dependencies:"
+   str = ""
+   for x in didnt_do:
+     str += x + " "
+   print str
+   print "\n"
+   print "Please examine to see if they are safe to process, and re-try with -i. "
+   print "Safeness is determined by checking whether any of the reordered headers are"
+   print "within a conditional and could be hauled out of the conditional, thus changing"
+   print "what the compiler will see."
+   print "Multi-line comments after a #include can also cause failuer, they must be turned"
+   print "into single line comments or removed."
+ 
+ 
+ 
+ 

Property changes on: contrib/headers/gcc-order-headers
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: contrib/headers/graph-header-logs
===================================================================
*** contrib/headers/graph-header-logs	(revision 0)
--- contrib/headers/graph-header-logs	(working copy)
***************
*** 0 ****
--- 1,226 ----
+ #! /usr/bin/python2
+ import os.path
+ import sys
+ import shlex
+ import re
+ 
+ from headerutils import *
+ 
+ header_roots = { }
+ extra_edges = list()
+ verbose = False
+ verbosity = 0
+ nodes = list()
+ 
+ def unpretty (name):
+   if name[-2:] == "_h":
+     name = name[:-2] + ".h"
+   return name.replace("_", "-")
+ 
+ def pretty_name (name):
+   name = os.path.basename (name)
+   return name.replace(".","_").replace("-","_").replace("/","_").replace("+","_");
+ 
+ depstring = ("In file included from", "                 from")
+ 
+ ignore = [ "coretypes_h",
+ 	     "machmode_h",
+ 	     "signop_h",
+ 	     "wide_int_h",
+ 	     "double_int_h",
+ 	     "real_h",
+ 	     "fixed_value_h",
+ 	     "hash_table_h",
+ 	       "statistics_h",
+ 	       "ggc_h",
+ 	       "vec_h",
+ 	       "hashtab_h",
+ 	       "inchash_h",
+ 	       "mem_stats_traits_h",
+ 	       "hash_map_traits_h",
+ 	       "mem_stats_h",
+ 	       "hash_map_h",
+ 	     "hash_set_h",
+ 	     "input_h",
+ 	       "line_map_h",
+ 	     "is_a_h",
+ 	   "system_h",
+ 	   "config_h" ]
+ 
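+ # Parse LOGFILE, the lines of the compilation log for HEADER, recording each
+ # error message along with the header it occurred in and the include chain
+ # which led there.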
+ def process_log_file (header, logfile):
+   if header_roots.get (header) != None:
+     print "Error: already processed log file: " + header + ".log"
+     return
+   hname = pretty_name (header)
+   header_roots[hname] = { }
+   
+   sline = list();
+   incfrom = list()
+   newinc = True
+   for line in logfile:
+     if len (line) > 21 and line[:21] in depstring:
+       if newinc:
+         incfrom = list()
+ 	newinc = False
+       fn = re.findall(ur".*/(.*?):", line)
+       if len(fn) != 1:
+         continue
+       if fn[0][-2:] != ".h":
+         continue
+       n = pretty_name (fn[0])
+       if n not in ignore:
+ 	incfrom.append (n)
+       continue
+     newinc = True
+     note = re.findall (ur"^.*note: (.*)", line)
+     if len(note) > 0:
+       sline.append (("note", note[0]))
+     else:
+       err_msg = re.findall (ur"^.*: error: (.*)", line)
+       if len(err_msg) == 1:
+ 	msg = err_msg[0]
+ 	if (len (re.findall("error: forward declaration", line))) != 0:
+ 	  continue
+ 	path = re.findall (ur"^(.*?):.*error: ", line)
+ 	if len(path) != 1:
+ 	  continue
+ 	if path[0][-2:] != ".h":
+ 	  continue
+ 	fname = pretty_name (path[0])
+ 	if fname in ignore or fname[0:3] == "gt_":
+ 	  continue
+ 	sline.append (("error", msg, fname, incfrom))
+ 
+   print str(len(sline)) + " lines to process"
+   lastline = "note"
+   for line in sline:
+     if line[0] != "note" and lastline[0] == "error":
+       fname = lastline[2]
+       msg = lastline[1]
+       incfrom = lastline[3]
+       string = ""
+       ofname = fname
+       if len(incfrom) != 0:
+ 	for t in incfrom:
+ 	  string = string + t + " : "
+ 	  ee = (fname, t)
+ 	  if ee not in extra_edges:
+ 	    extra_edges.append (ee)
+ 	  fname = t
+ 	  print string
+ 
+       if hname not in nodes:
+ 	nodes.append(hname)
+       if fname not in nodes:
+ 	nodes.append (ofname)
+       for y in incfrom:
+ 	if y not in nodes:
+ 	  nodes.append (y)
+ 
+ 
+       if header_roots[hname].get(fname) == None:
+ 	header_roots[hname][fname] = list()
+       if msg not in header_roots[hname][fname]:
+ 	print string + ofname + " : " +msg
+ 	header_roots[hname][fname].append (msg)
+     lastline = line;
+ 
+ 
+ dotname = "graph.dot"
+ graphname = "graph.png"
+ 
+ 
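+ # Process the log for each file in FILE_LIST and write the aggregated
+ # dependency graph, in dot format, to the file named by dotname.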
+ def build_dot_file (file_list):
+   output = open(dotname, "w")
+   output.write ("digraph incweb {\n");
+   for x in file_list:
+     if os.path.exists (x) and x[-4:] == ".log":
+       header =  x[:-4]
+       logfile = open(x).read().splitlines()
+       process_log_file (header, logfile)
+     elif os.path.exists (x + ".log"):
+       logfile = open(x + ".log").read().splitlines()
+       process_log_file (x, logfile)
+ 
+   for n in nodes:
+     fn = unpretty(n)
+     label = n + " [ label = \"" + fn  + "\" ];"
+     output.write (label + "\n")
+     if os.path.exists (fn):
+       h = open(fn).read().splitlines()
+       for l in h:
+         t = find_pound_include (l, True, False)
+ 	if t != "":
+ 	  t = pretty_name (t)
+ 	  if t in ignore or t[-2:] != "_h":
+ 	    continue
+ 	  if t not in nodes:
+ 	    nodes.append (t)
+ 	  ee = (t, n)
+ 	  if ee not in extra_edges:
+ 	    extra_edges.append (ee)
+ 
+   depcount = list()
+   for h in header_roots:
+     for dep in header_roots[h]:
+       label = " [ label = "+ str(len(header_roots[h][dep])) + " ];"
+       string = h + " -> " + dep + label
+       output.write (string + "\n");
+       if verbose:
+         depcount.append ((h, dep, len(header_roots[h][dep])))
+ 
+   for ee in extra_edges:
+     string = ee[0] + " -> " + ee[1] + "[ color=red ];"
+     output.write (string + "\n");
+ 
+   
+   if verbose:
+     depcount.sort(key=lambda tup:tup[2])
+     for x in depcount:
+       print " ("+str(x[2])+ ") : " + x[0] + " -> " + x[1]
+       if (x[2] <= verbosity):
+         for l in header_roots[x[0]][x[1]]:
+ 	  print "            " + l
+ 
+   output.write ("}\n");
+ 
+ 
+ files = list()
+ dohelp = False
+ edge_thresh = 0
+ for arg in sys.argv[1:]:
+   if arg[0:2] == "-o":
+     dotname = arg[2:]+".dot"
+     graphname = arg[2:]+".png"
+   elif arg[0:2] == "-h":
+     dohelp = True
+   elif arg[0:2] == "-v":
+     verbose = True
+     if len(arg) > 2:
+       verbosity = int (arg[2:])
+       if (verbosity == 9):
+         verbosity = 9999
+   elif arg[0:1] == "-":
+     print "Unrecognized option " + arg
+     dohelp = True
+   else:
+     files.append (arg)
+     
+ if len(sys.argv) == 1:
+   dohelp = True
+ 
+ if dohelp:
+   print "Parses the log files from remove-include processes to generate"
+   print " dependency graphs for the include web for specified files."
+   print "Usage:  [-nnum] [-h] [-v[n]] [-ooutput] file1 [[file2] ... [filen]]"
+   print "	-ooutput : Specifies output to output.dot and output.png"
+   print "                   Defaults to 'graph.dot and graph.png"
+   print "       -vn : verbose mode, shows the number of connections, and if n"
+   print "             is specifies, show the messages if # < n. 9 is infinity"
+   print "	-h : help"
+ else:
+   print files
+   build_dot_file (files)
+   os.system ("dot -Tpng " + dotname + " -o" + graphname)
+ 
+ 

Property changes on: contrib/headers/graph-header-logs
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: contrib/headers/graph-include-web
===================================================================
*** contrib/headers/graph-include-web	(revision 0)
--- contrib/headers/graph-include-web	(working copy)
***************
*** 0 ****
--- 1,122 ----
+ #! /usr/bin/python2
+ import os.path
+ import sys
+ import shlex
+ import re
+ 
+ from headerutils import *
+ 
+ def pretty_name (name):
+   return name.replace(".","_").replace("-","_").replace("/","_").replace("+","_");
+ 
+ 
+ include_files = list()
+ edges = 0
+ one_c = False
+ clink = list()
+ noterm = False
+ 
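+ # Write the include edges for FILEN to OUTPUT in dot format, aggregating all
+ # .c files into a single node when -a is in effect.  Return True if FILEN
+ # includes nothing, i.e. it is a terminal node.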
+ def build_inclist (output, filen):
+   global edges
+   global one_c
+   global clink
+   global noterm
+   inc = build_include_list (filen)
+   if one_c and filen[-2:] == ".c":
+     pn = "all_c"
+   else:
+     pn = pretty_name(filen)
+   for nm in inc:
+     if pn == "all_c":
+       if nm not in clink:
+         if len(build_include_list(nm)) != 0 or not noterm:
+ 	  output.write (pretty_name(nm) + " -> " + pn + ";\n")
+ 	  edges = edges + 1
+ 	  if nm not in include_files:
+ 	    include_files.append(nm)
+ 	clink.append (nm)
+     else:
+       output.write (pretty_name(nm) + " -> " + pn + ";\n")
+       edges = edges + 1
+       if nm not in include_files:
+ 	include_files.append(nm)
+   return len(inc) == 0
+ 
+ dotname = "graph.dot"
+ graphname = "graph.png"
+ 
+ def build_dot_file (file_list):
+   global one_c
+   output = open(dotname, "w")
+   output.write ("digraph incweb {\n");
+   if one_c:
+     output.write ("all_c [shape=box];\n");
+   for x in file_list:
+     if x[-2:] == ".h":
+       include_files.append (x)
+     elif os.path.exists (x):
+       build_inclist (output, x)
+       if not one_c:
+ 	output.write (pretty_name (x) + "[shape=box];\n")
+ 
+   for x in include_files:
+     term = build_inclist (output, x)
+     if term:
+       output.write (pretty_name(x) + " [style=filled];\n")
+ 
+   output.write ("}\n");
+ 
+ 
+ files = list()
+ dohelp = False
+ edge_thresh = 0
+ for arg in sys.argv[1:]:
+   if arg[0:2] == "-o":
+     dotname = arg[2:]+".dot"
+     graphname = arg[2:]+".png"
+   elif arg[0:2] == "-h":
+     dohelp = True
+   elif arg[0:2] == "-a":
+     one_c = True
+     if arg[0:3] == "-at":
+       noterm = True
+   elif arg[0:2] == "-f":
+     if not os.path.exists (arg[2:]):
+       print "Option " + arg +" doesn't specify a proper file"
+       dohelp = True
+     else:
+       sfile = open (arg[2:], "r")
+       srcdata = sfile.readlines()
+       sfile.close()
+       for x in srcdata:
+ 	files.append(x.rstrip())
+   elif arg[0:2] == "-n":
+     edge_thresh = int (arg[2:])
+   elif arg[0:1] == "-":
+     print "Unrecognized option " + arg
+     dohelp = True
+   else:
+     files.append (arg)
+     
+ if len(sys.argv) == 1:
+   dohelp = True
+ 
+ if dohelp:
+   print "Generates a graph of the include web for specified files."
+   print "Usage:  [-finput_file] [-h] [-ooutput] [file1 ... [filen]]"
+   print "	-finput_file : Input file is file containing a list of files"
+   print "	-ooutput : Specifies output to output.dot and output.png"
+   print "                  defaults to graph.dot and graph.png"
+   print "	-nnum : specifies the # of edges beyond which sfdp is invoked. def=0"
+   print "       -a : Aggregate all .c files to 1 file.  Shows only include web."
+   print "       -at :  Aggregate, but don't include terminal.h to .c links. "
+   print "	-h : help"
+ else:
+   print files
+   build_dot_file (files)
+   if edges > edge_thresh:
+     os.system ("sfdp -Tpng " + dotname + " -o" + graphname)
+   else:
+     os.system ("dot -Tpng " + dotname + " -o" + graphname)
+ 
+ 

Property changes on: contrib/headers/graph-include-web
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: contrib/headers/headerutils.py
===================================================================
*** contrib/headers/headerutils.py	(revision 0)
--- contrib/headers/headerutils.py	(working copy)
***************
*** 0 ****
--- 1,500 ----
+ #! /usr/bin/python2
+ import os.path
+ import sys
+ import shlex
+ import re
+ import subprocess
+ import shutil
+ import pickle
+ 
+ import multiprocessing 
+ 
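+ # Return the file name from a #include "..." directive on LINE, or "" if
+ # there is none.  USE_OUTSIDE accepts names which don't exist in the current
+ # directory; USE_SLASH accepts names containing a '/'.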
+ def find_pound_include (line, use_outside, use_slash):
+   inc = re.findall (ur"^\s*#\s*include\s*\"(.+?)\"", line)
+   if len(inc) == 1:
+     nm = inc[0]
+     if use_outside or os.path.exists (nm):
+       if use_slash or '/' not in nm:
+ 	return nm
+   return ""
+ 
+ def find_system_include (line):
+   inc = re.findall (ur"^\s*#\s*include\s*<(.+?)>", line)
+   if len(inc) == 1:
+     return inc[0]
+   return ""
+   
+ def find_pound_define (line):
+   inc = re.findall (ur"^\s*#\s*define ([A-Za-z0-9_]+)", line)
+   if len(inc) != 0:
+     if len(inc) > 1:
+       print "What? more than 1 match in #define??"
+       print inc
+       sys.exit(5)
+     return inc[0];
+   return ""
+ 
+ def is_pound_if (line):
+   inc = re.findall ("^\s*#\s*if\s", line)
+   if not inc:
+     inc = re.findall ("^\s*#\s*if[n]?def\s", line)
+   if inc:
+     return True
+   return False
+ 
+ def is_pound_endif (line):
+   inc = re.findall ("^\s*#\s*endif", line)
+   if inc:
+     return True
+   return False
+ 
+ def find_pound_if (line):
+   inc = re.findall (ur"^\s*#\s*if\s+(.*)", line)
+   if len(inc) == 0:
+     inc = re.findall (ur"^\s*#\s*elif\s+(.*)", line)
+   if len(inc) > 0:
+ #    inc2 = re.findall (ur"defined *\((.+?)\)", inc[0])
+     inc2 = re.findall (ur"defined\s*\((.+?)\)", inc[0])
+     inc3 = re.findall (ur"defined\s+([a-zA-Z0-9_]+)", inc[0])
+     for yy in inc3:
+       inc2.append (yy)
+     return inc2
+   else:
+     inc = re.findall (ur"^\s*#\s*ifdef\s(.*)", line)
+     if len(inc) == 0:
+       inc = re.findall (ur"^\s*#\s*ifndef\s(.*)", line)
+     if len(inc) > 0:
+       inc2 = re.findall ("[A-Za-z_][A-Za-z_0-9]*", inc[0])
+       return inc2
+   if len(inc) == 0:
+     return list ()
+   print "WTF. more than one line returned for find_pound_if"
+   print inc
+   sys.exit(5)
+ 
+ # An empty include-info tuple, matching the shape process_include_info returns.
+ empty_iinfo = ("", "", list(), list(), list(), list(), list(), { })
+ 
+ # Find all relevant include data.
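+ # FILEN is the file to process.  DO_MACROS requests macro consume/define
+ # analysis; KEEP_SRC retains the source lines in the result.  Returns a tuple
+ # (basename, dirname, includes, conditional includes, macros consumed,
+ # macros defined, source lines, include-line map).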
+ def process_include_info (filen, do_macros, keep_src):
+   header = False
+   if not os.path.exists (filen):
+     return empty_iinfo
+ 
+   sfile = open (filen, "r");
+   data = sfile.readlines()
+   sfile.close()
+ 
+   # Ignore the initial #ifdef HEADER_H in header files
+   if filen[-2:] == ".h":
+     nest = -1
+     header = True
+   else:
+     nest = 0
+ 
+   macout = list ()
+   macin = list()
+   incl = list()
+   cond_incl = list()
+   src_line = { }
+   guard = ""
+ 
+   for line in (data):
+     if is_pound_if (line):
+       nest += 1
+     elif is_pound_endif (line):
+       nest -= 1
+ 
+     nm = find_pound_include (line, True, True)
+     if nm != "" and nm not in incl and nm[-2:] == ".h":
+       incl.append (nm)
+       if nest > 0:
+         cond_incl.append (nm)
+       if keep_src:
+         src_line[nm] = line
+       continue
+ 
+     if do_macros:
+       d = find_pound_define (line)
+       if d:
+         if d not in macout:
+ 	  macout.append (d);
+ 	  continue
+ 
+       d = find_pound_if (line)
+       if d:
+         # The first #if in a header file should be the guard
+         if header and len (d) == 1 and guard == "":
+ 	  if d[0][-2:] == "_H":
+ 	    guard = d
+ 	  else:
+ 	    guard = "Guess there was no guard..."
+ 	else:
+ 	  for mac in d:
+ 	    if mac != "defined" and mac not in macin:
+ 	      macin.append (mac);
+ 
+   if not keep_src:
+     data = list()
+ 
+   return (os.path.basename (filen), os.path.dirname (filen), incl, cond_incl,
+ 	  macin, macout, data, src_line)
+ 
+ def process_ii (filen):
+   return process_include_info (filen, False, False)
+ 
+ def process_ii_macro (filen):
+   return process_include_info (filen, True, False)
+ 
+ def process_ii_src (filen):
+   return process_include_info (filen, False, True)
+ 
+ def process_ii_macro_src (filen):
+   return process_include_info (filen, True, True)
+ 
+ def ii_base (iinfo):
+   return iinfo[0]
+ 
+ def ii_path (iinfo):
+   return iinfo[1]
+ 
+ def ii_include_list (iinfo):
+   return iinfo[2]
+ 
+ def ii_include_list_cond (iinfo):
+   return iinfo[3]
+ 
+ def ii_include_list_non_cond (iinfo):
+   l = ii_include_list (iinfo)
+   for n in ii_include_list_cond (iinfo):
+     l.remove (n)
+   return l
+ 
+ def ii_macro_consume (iinfo):
+   return iinfo[4]
+   
+ def ii_macro_define (iinfo):
+   return iinfo[5]
+ 
+ def ii_src (iinfo):
+   return iinfo[6]
+ 
+ def ii_src_line (iinfo):
+   return iinfo[7]
+ 
+ def ii_read (fname):
+   f = open (fname, 'rb')
+   incl = pickle.load (f)
+   consumes = pickle.load (f)
+   defines = pickle.load (f)
+   obj = (fname,fname,incl,list(), list(), consumes, defines, list(), list())
+   return obj
+ 
+ def ii_write (fname, obj):
+   f = open (fname, 'wb')
+   pickle.dump (obj[2], f)
+   pickle.dump (obj[4], f)
+   pickle.dump (obj[5], f)
+   f.close ()
+ 
+ 
+ # Find files matching pattern NAME, returned in a list.
+ # CURRENT is True if you want to include the current directory.
+ # DEEPER is True if you want to search 3 levels below the current directory.
+ # Any files within testsuite directories are ignored.
+ 
+ def find_gcc_files (name, current, deeper):
+   files = list()
+   command = ""
+   if current:
+     if not deeper:
+       command = "find -maxdepth 1 -name " + name + " -not -path \"./testsuite/*\""
+     else:
+       command = "find -maxdepth 4 -name " + name + " -not -path \"./testsuite/*\""
+   else:
+     if deeper:
+       command = "find -maxdepth 4 -mindepth 2 -name " + name + " -not -path \"./testsuite/*\""
+ 
+   if command != "":
+     f = os.popen (command)
+     for x in f:
+       if x[0] == ".":
+         fn = x.rstrip()[2:]
+       else:
+ 	fn = x
+       files.append(fn)
+ 
+   return files
+ 
+ # Find the list of unique include names found in a list of source lines.
+ def find_unique_include_list_src (data):
+   found = list ()
+   for line in data:
+     d = find_pound_include (line, True, True)
+     if d and d not in found and d[-2:] == ".h":
+       found.append (d)
+   return found
+ 
+ # Find the list of unique include names found in a file.
+ def find_unique_include_list (filen):
+   data = open (filen).read().splitlines()
+   return find_unique_include_list_src (data)
+ 
+ 
+ # Create the macin, macout, and incl vectors for a file FILEN.
+ # macin are the macros that are used in #if* conditional expressions.
+ # macout are the macros which are #defined.
+ # incl is the list of include files encountered.
+ # Returned as a tuple of the filename followed by the triplet of lists:
+ # (filen, macin, macout, incl).
+ 
+ def create_macro_in_out (filen):
+   sfile = open (filen, "r");
+   data = sfile.readlines()
+   sfile.close()
+ 
+   macout = list ()
+   macin = list()
+   incl = list()
+ 
+   for line in (data):
+     d = find_pound_define (line)
+     if d != "":
+       if d not in macout:
+ 	macout.append (d);
+       continue
+ 
+     d = find_pound_if (line)
+     if len(d) != 0:
+       for mac in d:
+ 	if mac != "defined" and mac not in macin:
+ 	  macin.append (mac);
+       continue
+ 
+     nm = find_pound_include (line, True, True)
+     if nm != "" and nm not in incl:
+       incl.append (nm)
+ 
+   return (filen, macin, macout, incl)
+ 
+ # Create the macro information for FILEN, and create .macin, .macout, and
+ # .incl files.  Return the created macro tuple.
+ def create_include_data_files (filen):
+ 
+   macros = create_macro_in_out (filen)
+   depends = macros[1]
+   defines = macros[2]
+   incls = macros[3]
+   
+   disp_message = filen
+   if len (defines) > 0:
+     disp_message = disp_message + " " + str(len (defines)) + " #defines"
+   dfile = open (filen + ".macout", "w")
+   for x in defines:
+     dfile.write (x + "\n")
+   dfile.close ()
+ 
+   if len (depends) > 0:
+     disp_message = disp_message + " " + str(len (depends)) + " #if dependencies"
+   dfile = open (filen + ".macin", "w")
+   for x in depends:
+     dfile.write (x + "\n")
+   dfile.close ()
+ 
+   if len (incls) > 0:
+     disp_message = disp_message + " " + str(len (incls)) + " #includes"
+   dfile = open (filen + ".incl", "w")
+   for x in incls:
+     dfile.write (x + "\n")
+   dfile.close ()
+ 
+   return macros
+ 
+ 
+ 
+ # Extract data for include file NAME_H and enter it into the dictionary.
+ # This doesn't change once read in.  USE_REQUIRES is True if you want to
+ # prime the values with already created .requires and .provides files.
+ def get_include_data (name_h, use_requires):
+   macin = list()
+   macout = list()
+   incl = list ()
+   if use_requires and os.path.exists (name_h + ".requires"):
+     macin = open (name_h + ".requires").read().splitlines()
+   elif os.path.exists (name_h + ".macin"):
+     macin = open (name_h + ".macin").read().splitlines()
+ 
+   if use_requires and os.path.exists (name_h + ".provides"):
+     macout  = open (name_h + ".provides").read().splitlines()
+   elif os.path.exists (name_h + ".macout"):
+     macout  = open (name_h + ".macout").read().splitlines()
+ 
+   if os.path.exists (name_h + ".incl"):
+     incl = open (name_h + ".incl").read().splitlines()
+ 
+   if len(macin) == 0 and len(macout) == 0 and len(incl) == 0:
+     return ()
+   data = ( name_h, macin, macout, incl )
+   return data
+   
+ # Find FIND in SRC, and replace it with the list of includes in REPLACE.
+ # Remove any duplicates of FIND or REPLACE, and if some of the REPLACE
+ # includes occur earlier in the include chain, leave them.
+ # Return the new SRC only if anything changed.
+ def find_replace_include (find, replace, src):
+   res = list()
+   seen = { }
+   anything = False
+   for line in src:
+     inc = find_pound_include (line, True, True)
+     if inc == find:
+       for y in replace:
+         if seen.get(y) == None:
+ 	  res.append("#include \""+y+"\"\n")
+ 	  seen[y] = True
+ 	  if y != find:
+ 	    anything = True
+ # If FIND isn't in the replacement list, we are deleting it, so that's a change.
+       if find not in replace:
+         anything = True
+     else:
+       if inc in replace:
+         if seen.get(inc) == None:
+ 	  res.append (line)
+ 	  seen[inc] = True
+       else:
+ 	res.append (line)
+ 
+   if (anything):
+     return res
+   else:
+     return list()
+       
+ 
+ # Pass in a require and a provide dictionary to be filled in from the
+ # .requires and .provides files listed in require-provide.master.
+ def read_require_provides (require, provide):
+   if not os.path.exists ("require-provide.master"):
+     print "require-provide.master file is not available. please run data collection."
+     sys.exit(1)
+   incl_list = open("require-provide.master").read().splitlines()
+   for f in incl_list:
+     if os.path.exists (f+".requires"):
+       require[os.path.basename (f)] = open (f + ".requires").read().splitlines()
+     else:
+       require[os.path.basename (f)] = list ()
+     if os.path.exists (f+".provides"):
+       provide[os.path.basename (f)] = open (f + ".provides").read().splitlines()
+     else:
+       provide [os.path.basename (f)] = list ()
+ 
+    
+ def build_include_list (filen):
+   include_files = list()
+   sfile = open (filen, "r")
+   data = sfile.readlines()
+   sfile.close()
+   for line in data:
+     nm = find_pound_include (line, False, False)
+     if nm != "" and nm[-2:] == ".h":
+       if nm not in include_files:
+ 	include_files.append(nm)
+   return include_files
+  
+ def build_reverse_include_list (filen):
+   include_files = list()
+   sfile = open (filen, "r")
+   data = sfile.readlines()
+   sfile.close()
+   for line in reversed(data):
+     nm = find_pound_include (line, False, False)
+     if nm != "":
+       if nm not in include_files:
+ 	include_files.append(nm)
+   return include_files
+      
+ # Compensate for the "used but never defined" warning, which really ought
+ # to be an error for inlined templates.
+ def get_make_rc (rc, output):
+   rc = rc % 1280
+   if rc == 0:
+     # Make does not consider this warning fatal for a build, so promote
+     # it to an error here.
+     h = re.findall ("warning: inline function.*used but never defined", output)
+     if len(h) != 0:
+       rc = 1
+   return rc;
+ 
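+ # Run "make MAKE_OPT" in BUILD_DIR, with a parallelism of twice the CPU
+ # count.  Return a (return code, stderr output) pair.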
+ def get_make_output (build_dir, make_opt):
+   devnull = open('/dev/null', 'w')
+   at_a_time = multiprocessing.cpu_count() * 2
+   make = "make -j"+str(at_a_time)+ " "
+   if build_dir != "":
+     command = "cd " + build_dir +"; " + make + make_opt
+   else:
+     command = make + make_opt
+   process = subprocess.Popen(command, stdout=devnull, stderr=subprocess.PIPE, shell=True)
+   output = process.communicate();
+   rc = get_make_rc (process.returncode, output[1])
+   return (rc , output[1])
+ 
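+ # COMMAND_LIST is a list of (target name, make command) pairs to run in
+ # parallel.  Return the (rc, stderr, target) triplet of the first failure,
+ # or one with rc == 0 if they all succeed.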
+ def spawn_makes (command_list):
+   devnull = open('/dev/null', 'w')
+   rc = (0,"", "")
+   proc_res = list()
+   text = "  Trying target builds : "
+   for command_pair in command_list:
+     tname = command_pair[0]
+     command = command_pair[1]
+     text += tname + ", "
+     c = subprocess.Popen(command, bufsize=-1, stdout=devnull, stderr=subprocess.PIPE, shell=True)
+     proc_res.append ((c, tname))
+ 
+   print text[:-2]
+ 
+   for p in proc_res:
+     output = p[0].communicate()
+     ret = (get_make_rc (p[0].returncode, output[1]), output[1], p[1])
+     if (ret[0] != 0):
+       # Just record the first one.
+       if rc[0] == 0:
+ 	rc = ret;
+   return rc
+ 
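+ # Run "make MAKE_OPT" for each (target, build directory) pair in TARG_LIST,
+ # spawning AT_A_TIME makes at a time; 0 selects twice the CPU count.  Stop
+ # at the first failing batch and return its result triplet.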
+ def get_make_output_parallel (targ_list, make_opt, at_a_time):
+   command = list()
+   targname = list()
+   if at_a_time == 0:
+     at_a_time = multiprocessing.cpu_count() * 2
+   proc_res = [0] * at_a_time
+   for x in targ_list:
+     if make_opt[-2:] == ".o":
+       s = "cd " + x[1] + "/gcc/; make " + make_opt
+     else:
+       s = "cd " + x[1] +"; make " + make_opt
+     command.append ((x[0],s))
+ 
+   num = len(command) 
+   rc = (0,"", "")
+   loops = num // at_a_time
+   
+   if (loops > 0):
+     for idx in range (loops):
+       ret = spawn_makes (command[idx*at_a_time:(idx+1)*at_a_time])
+       if ret[0] != 0:
+         rc = ret
+         break
+ 
+   if (rc[0] == 0):
+     leftover = num % at_a_time
+     if (leftover > 0):
+       ret = spawn_makes (command[-leftover:])
+       if ret[0] != 0:
+         rc = ret
+ 
+   return rc
+ 
+ 
+ def readwholefile (src_file):
+   sfile = open (src_file, "r")
+   src_data = sfile.readlines()
+   sfile.close()
+   return src_data
+ 

Property changes on: contrib/headers/headerutils.py
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: contrib/headers/included-by
===================================================================
*** contrib/headers/included-by	(revision 0)
--- contrib/headers/included-by	(working copy)
***************
*** 0 ****
--- 1,112 ----
+ #! /usr/bin/python2
+ import os.path
+ import sys
+ import shlex
+ import re
+ 
+ from headerutils import *
+ 
+ 
+ 
+ usage = False
+ src = list()
+ flist = { }
+ process_h = False
+ process_c = False
+ verbose = False
+ level = 0
+ match_all = False
+ num_match = 1
+ 
+ file_list = list()
+ current = True
+ deeper = True
+ scanfiles = True
+ for x in sys.argv[1:]:
+   if x[0:2] == "-h":
+     usage = True
+   elif x[0:2] == "-i":
+     process_h = True
+   elif x[0:2] == "-s" or x[0:2] == "-c":
+     process_c = True
+   elif x[0:2] == "-v":
+     verbose = True
+   elif x[0:2] == "-a":
+     match_all = True
+   elif x[0:2] == "-n":
+     num_match = int(x[2:])
+   elif x[0:2] == "-1":
+     deeper = False
+   elif x[0:2] == "-2":
+     current = False
+   elif x[0:2] == "-f":
+     file_list = open (x[2:]).read().splitlines()
+     scanfiles = False
+   elif x[0] == "-":
+     print "Error: Unknown option " + x
+     usage = True
+   else:
+     src.append (x)
+ 
+ if match_all:
+   num_match = len (src)
+ 
+ if not process_h and not process_c:
+   process_h = True
+   process_c = True
+ 
+ if len(src) == 0:
+   usage = True
+ 
+ if not usage:
+   if scanfiles:
+     if process_h:
+       file_list = find_gcc_files ("\*.h", current, deeper)
+     if process_c:
+       file_list = file_list + find_gcc_files ("\*.c", current, deeper)
+       file_list = file_list + find_gcc_files ("\*.cc", current, deeper)
+   else:
+     newlist = list()
+     for x in file_list:
+       if process_h and x[-2:] == ".h":
+ 	newlist.append (x)
+       elif process_c and (x[-2:] == ".c" or x[-3:] == ".cc"):
+ 	newlist.append (x)
+     file_list = newlist;
+      
+   file_list.sort()
+   for fn in file_list:
+     found = find_unique_include_list (fn)
+     careabout = list()
+     output = ""
+     for inc in found:
+       if inc in src:
+         careabout.append (inc)
+         if output == "":
+ 	  output = fn
+         if verbose:
+ 	  output = output + " [" + inc +"]"
+     if len (careabout) < num_match:
+         output = ""
+     if output != "":
+       print output
+ else:
+   print "included-by [-h] [-i] [-c] [-v] [-a] [-nx] file1 [file2] ... [filen]"
+   print "find the list of all files in subdirectories that include any of "
+   print "the listed files. processed to a depth of 3 subdirs"
+   print " -h  : Show this message"
+   print " -i  : process only header files (*.h) for #include"
+   print " -c  : process only source files (*.c *.cc) for #include"
+   print "       If nothing is specified, defaults to -i -c"
+   print " -s  : Same as -c."
+   print " -v  : Show which include(s) were found"
+   print " -nx : Only list files which have at least x different matches. Default = 1"
+   print " -a  : Show only files which *all* listed files are included"
+   print "       This is equivilent to -nT where T == # of items in list"
+   print " -flistfile  : Show only files contained in the list of files"
+ 
+  
+ 
+ 
+ 
+ 

Property changes on: contrib/headers/included-by
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: contrib/headers/reduce-headers
===================================================================
*** contrib/headers/reduce-headers	(revision 0)
--- contrib/headers/reduce-headers	(working copy)
***************
*** 0 ****
--- 1,596 ----
+ #! /usr/bin/python2
+ import os.path
+ import sys
+ import shlex
+ import re
+ import tempfile
+ import copy
+ 
+ from headerutils import *
+ 
+ requires = { }
+ provides = { }
+ 
+ no_remove = [ "system.h", "coretypes.h", "config.h" , "bconfig.h", "backend.h" ]
+ 
+ # These targets are the ones which provide "coverage".  Typically, if any
+ # target is going to fail compilation, it's one of these.  This was determined
+ # during the initial runs of reduce-headers: on a full set of target builds,
+ # every failure which occurred was triggered by one of these.
+ # This list is used during target-list construction simply to put any of these
+ # *first* in the candidate list, increasing the probability that a failure is
+ # found quickly.
+ target_priority = [
+     "aarch64-linux-gnu",
+     "arm-netbsdelf",
+     "avr-rtems",
+     "c6x-elf",
+     "epiphany-elf",
+     "hppa2.0-hpux10.1",
+     "i686-mingw32crt",
+     "i686-pc-msdosdjgpp",
+     "mipsel-elf",
+     "powerpc-eabisimaltivec",
+     "rs6000-ibm-aix5.1.0",
+     "sh-superh-elf",
+     "sparc64-elf",
+     "spu-elf"
+ ]
+ 
+ 
+ target_dir = ""
+ build_dir = ""
+ ignore_list = list()
+ target_builds = list()
+ 
+ target_dict = { }
+ header_dict = { }
+ search_path = [ ".", "../include", "../libcpp/include" ]
+ 
+ remove_count = { }
+ 
+ 
+ # Given a header name, normalize it.  E.g., cp/cp-tree.h could be in gcc,
+ # while the same header could be referenced from within the cp subdirectory
+ # as just cp-tree.h.
+ # For now, just assume basenames are unique.
+ 
+ def normalize_header (header):
+   return os.path.basename (header)
+ 
+ 
+ # Add a header file and its sub-includes to the global dictionary if they
+ # aren't already there.  Specify S_PATH since different build directories may
+ # append themselves on demand to the global list.
+ # Return the entry for the specified header, knowing all sub-entries are
+ # complete.
+ 
+ def get_header_info (header, s_path):
+   global header_dict
+   global empty_iinfo
+   process_list = list ()
+   location = ""
+   bname = ""
+   bname_iinfo = empty_iinfo
+   for path in s_path:
+     if os.path.exists (path + "/" + header):
+       location = path + "/" + header
+       break
+ 
+   if location:
+     bname = normalize_header (location)
+     if header_dict.get (bname):
+       bname_iinfo = header_dict[bname]
+       loc2 = ii_path (bname_iinfo)+ "/" + bname
+       if loc2[:2] == "./":
+         loc2 = loc2[2:]
+       if location[:2] == "./":
+         location = location[2:]
+       if loc2 != location:
+         # Don't use the cache if it isn't the right one.
+         bname_iinfo = process_ii_macro (location)
+       return bname_iinfo
+ 
+     bname_iinfo = process_ii_macro (location)
+     header_dict[bname] = bname_iinfo
+     # Now descend into the include tree.
+     for i in ii_include_list (bname_iinfo):
+       get_header_info (i, s_path)
+   else:
+     # If the file isn't in the source directories, look in the build and target
+     # directories.  If it's here, then aggregate all the versions.
+     location = build_dir + "/gcc/" + header
+     build_inc = target_inc = False
+     if os.path.exists (location):
+       build_inc = True
+     for x in target_dict:
+       location = target_dict[x] + "/gcc/" + header
+       if os.path.exists (location):
+ 	target_inc = True
+ 	break
+ 
+     if (build_inc or target_inc):
+       bname = normalize_header(header)
+       defines = set()
+       consumes = set()
+       incl = set()
+       if build_inc:
+ 	iinfo = process_ii_macro (build_dir + "/gcc/" + header)
+ 	defines = set (ii_macro_define (iinfo))
+ 	consumes = set (ii_macro_consume (iinfo))
+ 	incl = set (ii_include_list (iinfo))
+ 
+       if (target_inc):
+ 	for x in target_dict:
+ 	  location = target_dict[x] + "/gcc/" + header
+ 	  if os.path.exists (location):
+ 	    iinfo = process_ii_macro (location)
+ 	    defines.update (ii_macro_define (iinfo))
+ 	    consumes.update (ii_macro_consume (iinfo))
+ 	    incl.update (ii_include_list (iinfo))
+ 
+       bname_iinfo = (header, "build", list(incl), list(), list(consumes), list(defines), list(), list())
+ 
+       header_dict[bname] = bname_iinfo
+       for i in incl:
+ 	get_header_info (i, s_path)
+ 
+   return bname_iinfo
+ 
+ 
+ # Return a list of all headers brought in, transitively, by this header.
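+ # FNAME must be a normalized name already recorded in the global header_dict.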
+ def all_headers (fname):
+   global header_dict
+   headers_stack = list()
+   headers_list = list()
+   if header_dict.get (fname) == None:
+     return list ()
+   for y in ii_include_list (header_dict[fname]):
+     headers_stack.append (y)
+ 
+   while headers_stack:
+     h = headers_stack.pop ()
+     hn = normalize_header (h)
+     if hn not in headers_list:
+       headers_list.append (hn)
+       if header_dict.get(hn):
+ 	for y in ii_include_list (header_dict[hn]):
+ 	  if normalize_header (y) not in headers_list:
+ 	    headers_stack.append (y)
+ 
+   return headers_list
+ 
+ 
+ 
+ 
+ # Search BLD_DIR for all target tuples, confirm that they have a build path of
+ # bld_dir/target-tuple/gcc, and build a dictionary of build paths indexed by
+ # target tuple.
+ 
+ def build_target_dict (bld_dir, just_these):
+   global target_dict
+   target_dict = { }
+   error = False
+   if os.path.exists (bld_dir):
+     if just_these:
+       ls = just_these
+     else:
+       ls = os.listdir(bld_dir)
+     for t in ls:
+       if t.find("-") != -1:
+ 	target = t.strip()
+ 	tpath = bld_dir + "/" + target
+ 	if not os.path.exists (tpath + "/gcc"):
+ 	  print "Error: gcc build directory for target " + t + " does not exist: " + tpath + "/gcc"
+ 	  error = True
+ 	else:
+ 	  target_dict[target] = tpath
+ 
+   if error:
+     target_dict = { }
+ 
+ def get_obj_name (src_file):
+   if src_file[-2:] == ".c":
+     return src_file.replace (".c", ".o")
+   elif src_file[-3:] == ".cc":
+     return src_file.replace (".cc", ".o")
+   return ""
+ 
+ def target_obj_exists (target, obj_name):
+   global target_dict
+   # Look in a subdir if the source has a subdir, then check the gcc base directory.
+   if target_dict.get(target):
+     obj = target_dict[target] + "/gcc/" + obj_name
+     if not os.path.exists (obj):
+       obj = target_dict[target] + "/gcc/" + os.path.basename(obj_name)
+     if os.path.exists (obj):
+       return True
+   return False
+  
+ # Given a src file, return a list of targets which may build this file.
+ def find_targets (src_file):
+   global target_dict
+   targ_list = list()
+   obj_name = get_obj_name (src_file)
+   if not obj_name:
+     print "Error: " + src_file + " - Cannot determine object name."
+     return list()
+ 
+   # Put the high priority targets which tend to trigger failures first
+   for target in target_priority:
+     if target_obj_exists (target, obj_name):
+       targ_list.append ((target, target_dict[target]))
+ 
+   for target in target_dict:
+     if target not in target_priority and target_obj_exists (target, obj_name):
+       targ_list.append ((target, target_dict[target]))
+         
+   return targ_list
+ 
+ 
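+ # Attempt to remove each include from SRC_FILE in turn, rebuilding after each
+ # removal.  H_LIST, if not empty, limits the headers considered.  VERBOSE, if
+ # set, is an open log file to which progress is written.  Return a one-line
+ # summary of the result.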
+ def try_to_remove (src_file, h_list, verbose):
+   global target_dict
+   global header_dict
+   global build_dir
+ 
+   # build from scratch each time
+   header_dict = { }
+   summary = ""
+   rmcount = 0
+ 
+   because = { }
+   src_info = process_ii_macro_src (src_file)
+   src_data = ii_src (src_info)
+   if src_data:
+     inclist = ii_include_list_non_cond (src_info)
+     # work is done if there are no includes to check
+     if not inclist:
+       return src_file + ": No include files to attempt to remove"
+ 
+     # work on the include list in reverse.
+     inclist.reverse()
+ 
+     # Get the target list 
+     targ_list = list()
+     targ_list = find_targets (src_file)
+ 
+     spath = search_path
+     if os.path.dirname (src_file):
+       spath.append (os.path.dirname (src_file))
+ 
+     hostbuild = True
+     if src_file.find("config/") != -1:
+       # Config files don't usually build on the host.
+       hostbuild = False
+       obn = get_obj_name (os.path.basename (src_file))
+       if obn and os.path.exists (build_dir + "/gcc/" + obn):
+         hostbuild = True
+       if not target_dict:
+         summary = src_file + ": Target builds are required for config files.  None found."
+ 	print summary
+ 	return summary
+       if not targ_list:
+         summary = src_file + ": Cannot find any targets which build this file."
+ 	print summary
+ 	return summary
+ 
+     if hostbuild:
+       # confirm it actually builds before we do anything
+       print "Confirming source file builds"
+       res = get_make_output (build_dir + "/gcc", "all")
+       if res[0] != 0:
+         message = "Error: " + src_file + " does not build currently."
+ 	summary = src_file + " does not build on host."
+ 	print message
+ 	print res[1]
+ 	if verbose:
+ 	  verbose.write (message + "\n")
+ 	  verbose.write (res[1]+ "\n")
+ 	return summary
+ 
+     src_requires = set (ii_macro_consume (src_info))
+     for macro in src_requires:
+       because[macro] = src_file
+     header_seen = list ()
+ 
+     os.rename (src_file, src_file + ".bak")
+     src_orig = copy.deepcopy (src_data)
+     src_tmp = copy.deepcopy (src_data)
+ 
+     try:
+       # Process the includes from bottom to top.  This is because later
+       # includes are known to be needed, so any dependency from this
+       # header is a true dependency.
+       for inc_file in inclist:
+ 	inc_file_norm = normalize_header (inc_file)
+ 	
+ 	if inc_file in no_remove:
+ 	  continue
+ 	if len (h_list) != 0 and inc_file_norm not in h_list:
+ 	  continue
+ 	if inc_file_norm[0:3] == "gt-":
+ 	  continue
+ 	if inc_file_norm[0:6] == "gtype-":
+ 	  continue
+ 	if inc_file_norm.replace(".h",".c") == os.path.basename(src_file):
+ 	  continue
+ 	     
+ 	lookfor = ii_src_line(src_info)[inc_file]
+ 	src_tmp.remove (lookfor)
+ 	message = "Trying " + src_file + " without " + inc_file
+ 	print message
+ 	if verbose:
+ 	  verbose.write (message + "\n")
+ 	out = open(src_file, "w")
+ 	for line in src_tmp:
+ 	  out.write (line)
+ 	out.close()
+ 	  
+ 	keep = False
+ 	if hostbuild:
+ 	  res = get_make_output (build_dir + "/gcc", "all")
+ 	else:
+ 	  res = (0, "")
+ 
+ 	rc = res[0]
+ 	message = "Passed Host build"
+ 	if (rc != 0):
+ 	  # host build failed
+ 	  message  = "Compilation failed:\n";
+ 	  keep = True
+ 	else:
+ 	  if targ_list:
+ 	    objfile = get_obj_name (src_file)
+ 	    t1 = targ_list[0]
+ 	    if objfile and os.path.exists(t1[1] +"/gcc/"+objfile):
+ 	      res = get_make_output_parallel (targ_list, objfile, 0)
+ 	    else:
+ 	      res = get_make_output_parallel (targ_list, "all-gcc", 0)
+ 	    rc = res[0]
+ 	    if rc != 0:
+ 	      message = "Compilation failed on TARGET : " + res[2]
+ 	      keep = True
+ 	    else:
+ 	      message = "Passed host and target builds"
+ 
+         if keep:
+ 	  print message + "\n"
+ 
+ 	if (rc != 0):
+ 	  if verbose:
+ 	    verbose.write (message + "\n");
+ 	    verbose.write (res[1])
+ 	    verbose.write ("\n");
+ 	    if os.path.exists (inc_file):
+ 	      ilog = open(inc_file+".log","a")
+ 	      ilog.write (message + " for " + src_file + ":\n\n");
+ 	      ilog.write ("============================================\n");
+ 	      ilog.write (res[1])
+ 	      ilog.write ("\n");
+ 	      ilog.close()
+ 	    if os.path.exists (src_file):
+ 	      ilog = open(src_file+".log","a")
+ 	      ilog.write (message + " for " +inc_file + ":\n\n");
+ 	      ilog.write ("============================================\n");
+ 	      ilog.write (res[1])
+ 	      ilog.write ("\n");
+ 	      ilog.close()
+ 
+ 	# Given a sequence where:
+ 	# #include "tm.h"
+ 	# #include "target.h"  // includes tm.h
+ 
+ 	# target.h was required, and when attempting to remove tm.h we'd see that
+ 	# all the macro definitions are "required" since they all look like:
+ 	# #ifndef HAVE_blah
+ 	# #define HAVE_blah
+ 	# #endif
+ 
+ 	# When target.h was found to be required, tm.h will be tagged as included,
+ 	# so when we get this far, we know we don't have to check the macros for
+ 	# tm.h since we know it's already been included.
+ 
+         if inc_file_norm not in header_seen:
+ 	  iinfo = get_header_info (inc_file, spath)
+ 	  newlist = all_headers (inc_file_norm)
+ 	  if ii_path(iinfo) == "build" and not target_dict:
+ 	    keep = True
+ 	    text = message + " : Will not remove a build file without some targets."
+ 	    print text
+ 	    ilog = open(src_file+".log","a")
+ 	    ilog.write (text +"\n")
+ 	    ilog.write ("============================================\n");
+ 	    ilog.close()
+ 	    ilog = open("reduce-headers-kept.log","a")
+ 	    ilog.write (src_file + " " + text +"\n")
+ 	    ilog.close()
+ 	else:
+ 	  newlist = list()
+ 	if not keep and inc_file_norm not in header_seen:
+ 	  # now look for any macro requirements.
+ 	  for h in newlist:
+ 	    if not h in header_seen:
+ 	      if header_dict.get(h):
+ 		defined = ii_macro_define (header_dict[h])
+ 		for dep in defined:
+ 		  if dep in src_requires and dep not in ignore_list:
+ 		    keep = True;
+ 		    text = message + ", but must keep " + inc_file + " because it provides " + dep 
+ 		    if because.get(dep) != None:
+ 		      text = text + " Possibly required by " + because[dep]
+ 		    print text
+ 		    ilog = open(inc_file+".log","a")
+ 		    ilog.write (because[dep] + ": Requires " + dep + " in " + src_file + "\n")
+ 		    ilog.write ("============================================\n");
+ 		    ilog.close()
+ 		    ilog = open(src_file+".log","a")
+ 		    ilog.write (text +"\n")
+ 		    ilog.write ("============================================\n");
+ 		    ilog.close()
+ 		    ilog = open("reduce-headers-kept.log","a")
+ 		    ilog.write (src_file + " " + text +"\n")
+ 		    ilog.close()
+ 		    if verbose:
+ 		      verbose.write (text + "\n")
+ 
+ 	if keep:
+ 	  # add all headers 'consumes' to src_requires list, and mark as seen
+ 	  for h in newlist:
+ 	    if not h in header_seen:
+ 	      header_seen.append (h)
+ 	      if header_dict.get(h):
+ 		consume = ii_macro_consume (header_dict[h])
+ 		for dep in consume:
+ 		  if dep not in src_requires:
+ 		    src_requires.add (dep)
+ 		    if because.get(dep) == None:
+ 		      because[dep] = inc_file
+ 
+ 	  src_tmp = copy.deepcopy (src_data)
+ 	else:
+ 	  print message + "  --> removing " + inc_file + "\n"
+ 	  rmcount += 1
+ 	  if verbose:
+ 	    verbose.write (message + "  --> removing " + inc_file + "\n")
+ 	  if remove_count.get(inc_file) == None:
+ 	    remove_count[inc_file] = 1
+ 	  else:
+ 	    remove_count[inc_file] += 1
+ 	  src_data = copy.deepcopy (src_tmp)
+     except:
+       print "Interruption: restoring original file"
+       out = open(src_file, "w")
+       for line in src_orig:
+ 	out.write (line)
+       out.close()
+       raise
+ 
+     # Copy the current version, since it's the "right" one now.
+     out = open(src_file, "w")
+     for line in src_data:
+       out.write (line)
+     out.close()
+     
+     # Try a final host bootstrap build to make sure everything is kosher.
+     if hostbuild:
+       res = get_make_output (build_dir, "all")
+       rc = res[0]
+       if (rc != 0):
+ 	# Host build failed; restore the original version.
+ 	print "Error: " + src_file + " failed to bootstrap at the end.  Restoring original."
+ 	print "        Bad version saved at " + src_file + ".bad"
+ 	os.rename (src_file, src_file + ".bad")
+ 	out = open(src_file, "w")
+ 	for line in src_orig:
+ 	  out.write (line)
+ 	out.close()
+ 	return src_file + ": failed to build after reduction.  Restored original"
+ 
+     if src_data == src_orig:
+       summary = src_file + ": No change."
+     else:
+       summary = src_file + ": Reduction performed, "+str(rmcount)+" includes removed."
+   print summary
+   return summary
+ 
+ only_h = list ()
+ ignore_cond = False
+ 
+ usage = False
+ src = list()
+ only_targs = list ()
+ for x in sys.argv[1:]:
+   if x[0:2] == "-b":
+     build_dir = x[2:]
+   elif x[0:2] == "-f":
+     fn = normalize_header (x[2:])
+     if fn not in only_h:
+       only_h.append (fn)
+   elif x[0:2] == "-h":
+     usage = True
+   elif x[0:2] == "-d":
+     ignore_cond = True
+   elif x[0:2] == "-D":
+     ignore_list.append(x[2:])
+   elif x[0:2] == "-T":
+     only_targs.append(x[2:])
+   elif x[0:2] == "-t":
+     target_dir = x[2:]
+   elif x[0] == "-":
+     print "Error: unrecognized option " + x
+     usage = True
+   else:
+     if not os.path.exists (x):
+       print "Error: specified file " + x + " does not exist."
+       usage = True
+     else:
+       src.append (x)
+ 
+ if target_dir:
+   build_target_dict (target_dir, only_targs)
+ 
+ if build_dir == "" and target_dir == "":
+   print "Error: Must specify a build directory and/or a target directory."
+   usage = True
+ 
+ if build_dir and not os.path.exists (build_dir):
+     print "Error: specified build directory does not exist : " + build_dir
+     usage = True
+ 
+ if target_dir and not os.path.exists (target_dir):
+     print "Error: specified target directory does not exist : " + target_dir
+     usage = True
+ 
+ if usage:
+   print "Attempts to remove extraneous include files from source files."
+   print " "
+   print "Should be run from the main gcc source directory, and operates on a build"
+   print "directory, as we attempt to make the 'all' target."
+   print " "
+   print "By default, gcc-order-headers is run on each file before attempting to"
+   print "remove includes.  This removes duplicates and puts some headers in a"
+   print "canonical ordering."
+   print " "
+   print "The build directory should be ready to compile via make.  Time is saved"
+   print "if the build is already complete, so that only changes need to be built."
+   print " "
+   print "Usage: reduce-headers [options] file1.c [file2.c] ... [filen.c]"
+   print "      -bdir    : the root build directory in which to attempt building .o files."
+   print "      -tdir    : the target build directory."
+   print "      -d       : Ignore conditional macro dependencies."
+   print " "
+   print "      -Dmacro  : Ignore a specific macro for dependencies."
+   print "      -Ttarget : Only consider the given target in the target directory."
+   print "      -fheader : Specifies a specific .h file to be considered."
+   print " "
+   print "      -D, -T, and -f can be specified multiple times and are aggregated."
+   print " "
+   print "  The original file will be in filen.bak."
+   print " "
+   sys.exit (0)
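+ 
+ # A typical invocation, from the main gcc source directory, might look like
+ # this (illustrative paths, assuming the script is on your PATH):
+ #
+ #   reduce-headers -b../build -t../targets tree.c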
+  
+ if only_h:
+   print "Attempting to remove only these files:"
+   for x in only_h:
+     print x
+   print " "
+ 
+ logfile = open("reduce-headers.log","w")
+ 
+ for x in src:
+   msg = try_to_remove (x, only_h, logfile)
+   ilog = open("reduce-headers.sum","a")
+   ilog.write (msg + "\n")
+   ilog.close()
+ 
+ ilog = open("reduce-headers.sum","a")
+ ilog.write ("===============================================================\n")
+ for x in remove_count:
+   msg = x + ": Removed " + str(remove_count[x]) + " times."
+   print msg
+   logfile.write (msg + "\n")
+   ilog.write (msg + "\n")
+ 
+ 
+ 
+ 
+ 

Property changes on: contrib/headers/reduce-headers
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: contrib/headers/replace-header
===================================================================
*** contrib/headers/replace-header	(revision 0)
--- contrib/headers/replace-header	(working copy)
***************
*** 0 ****
--- 1,53 ----
+ #! /usr/bin/python2
+ import os.path
+ import sys
+ import shlex
+ import re
+ 
+ from headerutils import *
+ 
+ 
+ files = list()
+ replace = list()
+ find = ""
+ usage = False
+ 
+ for x in sys.argv[1:]:
+   if x[0:2] == "-h":
+     usage = True
+   elif x[0:2] == "-f" and find == "":
+     find = x[2:]
+   elif x[0:2] == "-r":
+     replace.append (x[2:])
+   elif x[0:1] == "-":
+     print "Error: unrecognized option " + x
+     usage = True
+   else:
+     files.append (x)
+ 
+ if find == "":
+   usage = True
+ 
+ if usage:
+   print "replace-header -fheader -rheader [-rheader] file1 [filen]"
+   print " "
+   print "Replace the -f header with the given -r header(s) in each listed file."
+   sys.exit(0)
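+ 
+ # Example (illustrative names, assuming the script is on your PATH): replace
+ # every include of foo.h with bar.h and baz.h in two source files:
+ #
+ #   replace-header -ffoo.h -rbar.h -rbaz.h file1.c file2.c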
+ 
+ string = ""
+ for x in replace:
+   string = string + " '"+x+"'"
+ print "Replacing '" + find + "' with" + string
+ 
+ for x in files:
+   src = readwholefile (x)
+   src = find_replace_include (find, replace, src)
+   if len(src) > 0:
+     print x + ": Changed"
+     out = open(x, "w")
+     for line in src:
+       out.write (line)
+     out.close ()
+   else:
+     print x
+ 
+ 
+ 

Property changes on: contrib/headers/replace-header
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: contrib/headers/show-headers
===================================================================
*** contrib/headers/show-headers	(revision 0)
--- contrib/headers/show-headers	(working copy)
***************
*** 0 ****
--- 1,108 ----
+ #! /usr/bin/python2
+ import os.path
+ import sys
+ import shlex
+ import re
+ 
+ from headerutils import *
+ 
+ 
+ tabstop = 2
+ padding = "                                                                  "
+ seen = { }
+ output = list()
+ sawcore = False
+ 
+ incl_dirs = [".", "../include", "../../build/gcc", "../libcpp/include" ]
+ 
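+ # Find the line in OUTPUT which first displayed INC and append the marker
+ # "  (1)" to it, so a header's first occurrence is flagged once duplicates
+ # of it are seen.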
+ def append_1 (output, inc):
+   for n,t in enumerate (output):
+     if t.find(inc) != -1:
+       t += "  (1)"
+       output[n] = t
+       return
+ 
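+ # Headers which rtl.h itself provides when built for generator files,
+ # mirroring coretypes.h.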
+ rtl_core = [ "machmode.h" , "signop.h" , "wide-int.h" , "double-int.h" , "real.h" , "fixed-value.h" , "statistics.h" , "vec.h" , "hash-table.h" , "hash-set.h" , "input.h" , "is-a.h" ]
+ 
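+ # Find INC in the include search directories and return the list of unique
+ # headers it includes, or an empty list if INC cannot be found.  When
+ # coretypes.h has already been seen, the core headers rtl.h re-includes for
+ # generator files are filtered out of the result.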
+ def find_include_data (inc):
+   global sawcore
+   for x in incl_dirs:
+     nm = x+"/"+inc
+     if os.path.exists (nm):
+       info = find_unique_include_list (nm)
+       # rtl.h mimics coretypes.h for generator files; remove those headers
+       # if coretypes.h has been seen.
+       if inc == "coretypes.h":
+         sawcore = True
+       elif inc == "rtl.h" and sawcore:
+         for i in rtl_core:
+           if i in info:
+             info.remove (i)
+       return info
+   return list()
+ 
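+ # Output INC at the given indentation level, then recursively process each
+ # header it includes one level deeper.  A header already seen is not expanded
+ # again; it is displayed with a count of how many times it has appeared.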
+ def process_include (inc, indent):
+   if inc[-2:] != ".h":
+     return
+   if seen.get(inc) == None:
+     seen[inc] = 1
+     output.append (padding[:indent*tabstop] + os.path.basename (inc))
+     info = find_include_data (inc)
+     for y in info:
+       process_include (y, indent+1)
+   else:
+     seen[inc] += 1
+     if seen[inc] == 2:
+       append_1(output, inc)
+     output.append (padding[:indent*tabstop] + os.path.basename (inc) + "  ("+str(seen[inc])+")")
+ 
+     
+ 
+ blddir = [ "." ]
+ usage = False
+ src = list()
+ 
+ for x in sys.argv[1:]:
+   if x[0:2] == "-i":
+     bld = x[2:]
+     print "Include search dir : " + bld
+     blddir.append (bld)
+   elif x[0:2] == "-h":
+     usage = True
+   else:
+     src.append (x)
+ 
+ if len(src) != 1:
+   usage = True
+ 
+ if usage:
+   print "show-headers [-idir] file1"
+   print " "
+   print " Show in a hierarchical visual format how many times each header file"
+   print " is included in a source file.  Should be run from the source directory."
+   print "  files from find-include-depends"
+   print "      -i : Specifies one or more directories to search for includes;"
+   print "           defaults to looking in:"
+   print "           ., ../include, ../libcpp/include, and ../../build/gcc."
+   print "           Specifying anything else creates a new list starting with '.'."
+   sys.exit(0)
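+ 
+ # Example (illustrative, assuming the script is on your PATH): show the
+ # include hierarchy of tree.c, also searching the build directory for
+ # generated headers:
+ #
+ #   show-headers -i../../build/gcc tree.c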
+ 
+ 
+ if len(blddir) > 1:
+   incl_dirs = blddir
+ 
+ x = src[0]
+ # If the source file is in a subdirectory, add that subdirectory to the search list.
+ srcpath = os.path.dirname(x)
+ if srcpath:
+   incl_dirs.append (srcpath)
+ 
+ output = list()
+ sawcore = False
+ incl = find_unique_include_list (x)
+ for inc in incl:
+   process_include (inc, 1)
+ print "\n" + x
+ for line in output:
+   print line
+ 
+ 

Property changes on: contrib/headers/show-headers
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property