Patchwork Speed up genattrtab

Submitter Jakub Jelinek
Date June 16, 2010, 7:09 p.m.
Message ID <20100616190924.GQ7811@tyan-ft48-01.lab.bos.redhat.com>
Permalink /patch/55924/
State New

Comments

Jakub Jelinek - June 16, 2010, 7:09 p.m.
On Wed, Jun 16, 2010 at 04:33:34PM +0200, Jakub Jelinek wrote:
> On Tue, Jun 15, 2010 at 09:16:33PM +0200, Jakub Jelinek wrote:
> > On Tue, Jun 15, 2010 at 11:39:47AM -0700, Mark Mitchell wrote:
> > I believe on x86_64/i686 the most time is spent in compiling
> > internal_dfa_insn_code, primarily because there are so many different
> > schedulings.
> > It is a big switch on recog_memoized, where most of the cases first
> > compare the ix86_schedule var to some enum.  I guess it would certainly be
> > faster to compile if the big function were instead split into a separate
> > function for each schedule, with internal_dfa_insn_code made a function
> > pointer; how it would actually perform at runtime would need to be benchmarked.
> 
> Here is a WIP untested patch.

And here is a patch I've actually bootstrapped/regtested on x86_64-linux and
i686-linux.  It was an --enable-checking=release build (both arches built
simultaneously), so the timing isn't exact, but the time between the
config.status and compare timestamps was
1112s -> 843s x86_64
736s -> 630s i686.
Times from config.status to end of make were
2453s -> 2111s x86_64
1190s -> 1100s i686
and times from config.status to end of make check were
4033s -> 3805s x86_64
3331s -> 3303s i686
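
For reference, the relative reduction implied by each pair of figures above is (old - new) / old.  A minimal sketch in C; the helper names reduction_pct and print_reduction are made up for illustration and are not part of the patch:

```c
#include <assert.h>
#include <stdio.h>

/* Percentage reduction going from OLD_S seconds to NEW_S seconds.
   OLD_S must be positive.  */
double
reduction_pct (double old_s, double new_s)
{
  assert (old_s > 0.0);
  return (old_s - new_s) / old_s * 100.0;
}

/* Print one measurement; LABEL is free-form text naming it.  */
void
print_reduction (const char *label, double old_s, double new_s)
{
  printf ("%-24s %6.0fs -> %6.0fs  (%.1f%% less)\n",
	  label, old_s, new_s, reduction_pct (old_s, new_s));
}
```

For example, reduction_pct (1112, 843) for the x86_64 config.status-to-compare interval comes out at roughly 24%.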

Except for a few expected differences (insn-attrtab.o, files including
insn-attr.h and *checksum*), gcc/*.o had no differences on stripped objects.

2010-06-16  Jakub Jelinek  <jakub@redhat.com>

	* Makefile.in (cfgexpand.o): Depend on $(INSN_ATTR_H).
	* genattrtab.c (check_tune_attr, find_tune_attr): New functions.
	(make_automaton_attrs): If find_tune_attr returns non-NULL,
	write separate internal_dfa_insn_code_* and insn_default_latency_*
	functions for each attribute's value and emit init_sched_attrs
	function and function pointers.
	* genattr.c (const_attrs, reservations): New variables.
	(gen_attr): Add const attributes to const_attrs vector.
	(check_tune_attr, find_tune_attr): New functions.
	(main): Add reservations to reservations vector.  If find_tune_attr
	returns true, add prototype for init_sched_attrs and make
	internal_dfa_insn_code and insn_default_latency function pointers,
	otherwise define init_sched_attrs as dummy macro.
	* cfgexpand.c: Include insn-attr.h.
	(gimple_expand_cfg): Call init_sched_attrs.



	Jakub
Mark Mitchell - June 16, 2010, 9:22 p.m.
Jakub Jelinek wrote:

> 2010-06-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* Makefile.in (cfgexpand.o): Depend on $(INSN_ATTR_H).
> 	* genattrtab.c (check_tune_attr, find_tune_attr): New functions.

This should have no impact on compile-time for things compiled with GCC,
correct?  If so, for avoidance of doubt, while I haven't reviewed the
patch in detail, I certainly have no objections to it.  Let me know if
you need help getting it reviewed.

Thanks,
Jakub Jelinek - June 17, 2010, 9:38 a.m.
On Wed, Jun 16, 2010 at 02:22:58PM -0700, Mark Mitchell wrote:
> Jakub Jelinek wrote:
> 
> > 2010-06-16  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	* Makefile.in (cfgexpand.o): Depend on $(INSN_ATTR_H).
> > 	* genattrtab.c (check_tune_attr, find_tune_attr): New functions.
> 
> This should have no impact on compile-time for things compiled with GCC,
> correct?  If so, for avoidance of doubt, while I haven't reviewed the
> patch in detail, I certainly have no objections to it.  Let me know if

It has compile time impact, but a mixed one.  The negative performance
impact is that internal_dfa_insn_code and insn_default_latency calls
are no longer direct function calls (on the targets which have some cpu/tune
attribute tested in all reservations), but are function pointers and thus
indirect calls.  The pointer isn't changing much usually though (unless
optimize/target attribute/pragma is used, it shouldn't change at all).
How much this costs depends on the host CPU (and whether the host CPU
is able to cache the call target if the fn pointer isn't changing;
currently the init_sched_attrs call which is called once per function
always writes the fn pointers, usually with the same value as it already has
- would it help for some host CPUs if the function instead computed the
fn pointers into temporary variable and wrote the fn pointer var only
if the temporary is different from its current contents?).
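
The "write only if different" idea can be sketched in isolation.  This is a toy model, not the generated code: the enum, the two per-tuning functions and the write counter are invented for illustration, and the real functions take an rtx rather than an int.  Whether skipping the store helps any particular host CPU is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tunings; the real values come from the target's
   cpu/tune attribute.  */
enum tune { TUNE_A, TUNE_B };

/* Stand-ins for the per-tuning internal_dfa_insn_code_* functions.  */
static int code_for_a (int insn) { return insn * 2; }
static int code_for_b (int insn) { return insn * 2 + 1; }

/* The pointer every caller goes through (an indirect call).  */
int (*insn_code_fn) (int);

/* Number of actual stores to the pointer; exists only so the effect
   of the comparison is observable.  */
unsigned int n_pointer_writes;

/* Pick the function for tuning T into a temporary and store it into
   the global pointer only if it differs from the current value.  */
void
init_dispatch (enum tune t)
{
  int (*fn) (int) = NULL;

  switch (t)
    {
    case TUNE_A: fn = code_for_a; break;
    case TUNE_B: fn = code_for_b; break;
    }
  assert (fn != NULL);

  if (fn != insn_code_fn)
    {
      insn_code_fn = fn;
      n_pointer_writes++;
    }
}
```

Calling init_dispatch repeatedly with the same tuning leaves the pointer's memory untouched after the first call; only a change of tuning causes a store.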

The advantage is that the text size of the functions shrinks a lot
(at least on the architectures I've looked at - i?86/x86_64, powerpc{,64}
and s390{,x} the .text size of all the per tuning functions together
is smaller than the .text size of the old monster functions, the sum of
all the per tuning function .rodata sizes (jump tables) usually slightly
grew, but still for each individual function both sizes are much smaller),
which means that unless optimize/target attribute is used heavily and every
function uses different tuning, the new code is much more i-cache and
d-cache friendly.  Plus, many extract_insn_cached or
extract_constrain_insn_cached calls could go away - if say only one tuning
was interested in that additional info and all others don't care for
some particular insn, the new code will call it only in the function
for the tuning that needs it and not in the other tuning functions.

The last arch I've looked at was arm - there the patch doesn't make any
difference (except for #define init_sched_attrs() do { } while (0) in
insn-attr.h) because arm currently doesn't have a single const attribute
that is used in tests for all reservations.  The solution there could be
to create a new const attribute that would combine the attributes currently
used in define_insn_reservation tests (say a bitfield containing the other
attrs); I'll leave that to the arm maintainers if they wish to do that.

	Jakub
Jan Hubicka - June 17, 2010, 11:34 a.m.
> On Wed, Jun 16, 2010 at 02:22:58PM -0700, Mark Mitchell wrote:
> > Jakub Jelinek wrote:
> > 
> > > 2010-06-16  Jakub Jelinek  <jakub@redhat.com>
> > > 
> > > 	* Makefile.in (cfgexpand.o): Depend on $(INSN_ATTR_H).
> > > 	* genattrtab.c (check_tune_attr, find_tune_attr): New functions.
> > 
> > This should have no impact on compile-time for things compiled with GCC,
> > correct?  If so, for avoidance of doubt, while I haven't reviewed the
> > patch in detail, I certainly have no objections to it.  Let me know if
> 
> It has compile time impact, but a mixed one.  The negative performance
> impact is that internal_dfa_insn_code and insn_default_latency calls
> are no longer direct function calls (on the targets which have some cpu/tune
> attribute tested in all reservations), but are function pointers and thus
> indirect calls.  The pointer isn't changing much usually though (unless
> optimized/target attribute/pragma is used, it shouldn't change at all).
> How much this costs depends on the host CPU (and whether the host CPU
> is able to cache target CPU if the fn pointer isn't changing;
> currently the init_sched_attrs call which is called once per function
> always writes the fn pointers, usually with the same value as it already has
> - would it help for some host CPUs if the function instead computed the
> fn pointers into temporary variable and wrote the fn pointer var only
> if the temporary is different from its current contents?).

I think in general we have gone the way of function pointers instead of
macro machinery with direct calls, even in hot parts of the program (we
have targhooks in general_operand and friends; the dataflow branch
has indirect calls in its inner loop, etc.).

So I would not worry about this particular case; it is no worse than
existing practice.
> 
> The advantage is that the text size of the functions shrinks a lot
> (at least on the architectures I've looked at - i?86/x86_64, powerpc{,64}
> and s390{,x} the .text size of all the per tuning functions together
> is smaller than the .text size of the old monster functions, the sum of
> all the per tuning function .rodata sizes (jump tables) usually slightly
> grew, but still for each individual function both sizes are much smaller),
> which means that unless optimize/target attribute is used heavily and every
> function uses different tuning, the new code is much more i-cache and
> d-cache friendly.  Plus, many extract_insn_cached or
> extract_constrain_insn_cached calls could go away - if say only one tuning
> was interested in that additional info and all others don't care for
> some particular insn, the new code will call it only in the function
> for the tuning that needs it and not in the other tuning functions.

Oprofiling the compilation of small files, even with an LTO-linked binary,
we do have a lot of system overhead (it is over 70% for empty file compilation).

I guess the cost of mmapping large binaries plus the memory dirtied at startup
accounts for a lot here.

Honza
> 
> The last arch I've looked at was arm - there the patch doesn't make any
> difference (except for #define init_sched_attrs() do { } while (0) in
> insn-attr.h) because arm currently doesn't have a single const attribute
> that is used in tests for all reservations.  The solution there could be
> to create a new const attribute that would combine the attributes currently
> used in define_insn_reservation tests (say a bitfield containing the other
> attrs), will leave that to arm maintainers if they wish to do that.
> 
> 	Jakub
Michael Matz - June 17, 2010, 1:03 p.m.
Hello,

On Thu, 17 Jun 2010, Jakub Jelinek wrote:

> On Wed, Jun 16, 2010 at 02:22:58PM -0700, Mark Mitchell wrote:
> > Jakub Jelinek wrote:
> > 
> > > 2010-06-16  Jakub Jelinek  <jakub@redhat.com>
> > > 
> > > 	* Makefile.in (cfgexpand.o): Depend on $(INSN_ATTR_H).
> > > 	* genattrtab.c (check_tune_attr, find_tune_attr): New functions.
> > 
> > This should have no impact on compile-time for things compiled with GCC,
> > correct?  If so, for avoidance of doubt, while I haven't reviewed the
> > patch in detail, I certainly have no objections to it.  Let me know if
> 
> It has compile time impact, but a mixed one.

Same setup as with the other measurements.

arch     name   gen_u  st1_u  big_u  kde_u
alpha    clean  0      0.52   40.92  29.32
alpha    jj     0      0.46   41.01  29.14
arm      clean  6      27.10  47.61  41.30
arm      jj     6      27.03  46.93  41.14
crisv32  clean  0      0.21   34.16  24.44
crisv32  jj     0      0.21   34.18  24.46
hppa     clean  0      1.10   38.37  28.44
hppa     jj     0      0.89   38.34  28.27
i386     clean  44     33.39  30.64  26.51
i386     jj     3      10.06  30.54  26.56
ia64     clean  1      1.82   64.58  45.89
ia64     jj     0      1.54   64.57  47.07
mips     clean  74     14.68  44.01
mips     jj     74     14.61  44.13
powerpc  clean  56     48.49  42.49  30.57
powerpc  jj     1      4.10   42.46  30.28
s390x    clean  0      1.96   41.84  28.99
s390x    jj     0      1.40   41.64  28.80
sh       clean  0      1.35   47.57  34.35
sh       jj     0      1.37   47.57  34.41
sparc    clean  0      1.08   36.73  29.71
sparc    jj     0      1.01   36.40  29.39
x86_64   clean  52     42.40  28.02  25.71
x86_64   jj     3      12.25  27.92  25.64


Ciao,
Michael.
Mark Mitchell - June 17, 2010, 2:47 p.m.
Michael Matz wrote:

>>> This should have no impact on compile-time for things compiled with GCC,
>>> correct?  If so, for avoidance of doubt, while I haven't reviewed the
>>> patch in detail, I certainly have no objections to it.  Let me know if
>> It has compile time impact, but a mixed one.
> 
> Same setup as with the other measurements.

Is it easy for you to toss that into a spreadsheet and get a geometric
mean of the impact across architectures?  Compared to the earlier table,
this looks like it has less negative impact in many cases and positive
impact in some cases.  I'm hoping that it's a wash overall...
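
A geometric mean over per-architecture ratios could be computed along these lines.  This is only a sketch: the function name geomean_ratio is an assumption, and it finds the mean by bisection so that it builds without libm (the more usual formulation is the exponential of the mean of the logarithms).

```c
#include <assert.h>
#include <stddef.h>

/* Geometric mean of the ratios NEW_T[i] / OLD_T[i] for N > 0 pairs of
   positive timings.  A result below 1.0 means the new timings are
   smaller overall.  The mean always lies between the smallest and the
   largest ratio, so bisect within that interval.  */
double
geomean_ratio (const double *old_t, const double *new_t, size_t n)
{
  double lo, hi;
  size_t i;
  int iter;

  assert (n > 0);
  assert (old_t[0] > 0.0 && new_t[0] > 0.0);
  lo = hi = new_t[0] / old_t[0];
  for (i = 1; i < n; i++)
    {
      double r;

      assert (old_t[i] > 0.0 && new_t[i] > 0.0);
      r = new_t[i] / old_t[i];
      if (r < lo)
	lo = r;
      if (r > hi)
	hi = r;
    }

  for (iter = 0; iter < 100; iter++)
    {
      double mid = lo + (hi - lo) / 2.0;
      double prod = 1.0;

      /* The product of (ratio / mid) equals 1 at the geometric mean
	 and decreases as mid grows.  */
      for (i = 0; i < n; i++)
	prod *= new_t[i] / old_t[i] / mid;
      if (prod > 1.0)
	lo = mid;
      else
	hi = mid;
    }
  return lo + (hi - lo) / 2.0;
}
```

One would pass a single column of the table, with the "clean" rows as old_t and the "jj" rows as new_t.  Columns containing zeros (gen_u for several targets) cannot be used this way, since the ratio is undefined there.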
Michael Matz - June 17, 2010, 4:07 p.m.
Hi,

On Thu, 17 Jun 2010, Mark Mitchell wrote:

> Michael Matz wrote:
> 
> >>> This should have no impact on compile-time for things compiled with GCC,
> >>> correct?  If so, for avoidance of doubt, while I haven't reviewed the
> >>> patch in detail, I certainly have no objections to it.  Let me know if
> >> It has compile time impact, but a mixed one.
> > 
> > Same setup as with the other measurements.
> 
> Is it easy for you to toss that into a spreadsheet and get a geometric 
> mean of the impact across architectures?  Compared to the earlier table, 
> this looks like it has less negative impact in many cases and positive 
> impact in some cases.  I'm hoping that it's a wash overall...

It certainly has no negative impact I can measure.  The numbers have to be 
taken with a grain of salt; they were not gathered in a statistically sound 
way (only two runs, without removing outliers and averaging results or taking 
the minimum).  So every difference of less than, say, 1% contains much noise.
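
Taking the minimum over several runs and ignoring differences under a noise threshold might look like the following.  This is a hedged sketch; the names min_time and differs_beyond_noise are invented here, and the 1% figure is just the rough bound mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Smallest of the N > 0 timings in RUNS.  */
double
min_time (const double *runs, size_t n)
{
  double best;
  size_t i;

  assert (n > 0);
  best = runs[0];
  for (i = 1; i < n; i++)
    if (runs[i] < best)
      best = runs[i];
  return best;
}

/* Return true if OLD_T and NEW_T differ by more than THRESHOLD,
   given as a fraction of OLD_T (0.01 for 1%).  */
bool
differs_beyond_noise (double old_t, double new_t, double threshold)
{
  double diff = new_t > old_t ? new_t - old_t : old_t - new_t;

  assert (old_t > 0.0 && threshold >= 0.0);
  return diff > threshold * old_t;
}
```

With a 1% threshold, the i386 st1_u change (33.39 to 10.06) counts as a real difference, while the i386 big_u change (30.64 to 30.54) does not.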

I think we should go with Jakub's patch.  I'll then rework mine to only 
contain the non-controversial stuff.


Ciao,
Michael.
Mark Mitchell - June 17, 2010, 4:16 p.m.
Michael Matz wrote:

> It certainly has no negative impact I can measure.

I'm happy with the results, then.

Thanks,

Patch

--- gcc/Makefile.in.jj	2010-06-15 10:37:06.000000000 +0200
+++ gcc/Makefile.in	2010-06-16 18:41:41.000000000 +0200
@@ -3192,7 +3192,7 @@  cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    coretypes.h $(TREE_DUMP_H) $(EXCEPT_H) langhooks.h $(TREE_PASS_H) $(RTL_H) \
    $(DIAGNOSTIC_H) $(TOPLEV_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
    value-prof.h $(TREE_INLINE_H) $(TARGET_H) $(SSAEXPAND_H) \
-   tree-pretty-print.h gimple-pretty-print.h $(BITMAP_H) sbitmap.h
+   tree-pretty-print.h gimple-pretty-print.h $(BITMAP_H) sbitmap.h $(INSN_ATTR_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h $(TOPLEV_H) $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
--- gcc/genattrtab.c.jj	2010-06-11 09:38:08.000000000 +0200
+++ gcc/genattrtab.c	2010-06-16 19:19:57.000000000 +0200
@@ -1,6 +1,6 @@ 
 /* Generate code from machine description to compute values of attributes.
    Copyright (C) 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
-   2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
+   2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
    Free Software Foundation, Inc.
    Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu)
 
@@ -4372,6 +4372,69 @@  process_bypasses (void)
 	r->bypassed = true;
 }
 
+/* Check that attribute NAME is used in define_insn_reservation condition
+   EXP.  Return true if it is.  */
+static bool
+check_tune_attr (const char *name, rtx exp)
+{
+  switch (GET_CODE (exp))
+    {
+    case AND:
+      if (check_tune_attr (name, XEXP (exp, 0)))
+	return true;
+      return check_tune_attr (name, XEXP (exp, 1));
+
+    case IOR:
+      return (check_tune_attr (name, XEXP (exp, 0))
+	      && check_tune_attr (name, XEXP (exp, 1)));
+
+    case EQ_ATTR:
+      return XSTR (exp, 0) == name;
+
+    default:
+      return false;
+    }
+}
+
+/* Try to find a const attribute (usually cpu or tune) that is used
+   in all define_insn_reservation conditions.  */
+static struct attr_desc *
+find_tune_attr (rtx exp)
+{
+  struct attr_desc *attr;
+
+  switch (GET_CODE (exp))
+    {
+    case AND:
+    case IOR:
+      attr = find_tune_attr (XEXP (exp, 0));
+      if (attr)
+	return attr;
+      return find_tune_attr (XEXP (exp, 1));
+
+    case EQ_ATTR:
+      if (XSTR (exp, 0) == alternative_name)
+	return NULL;
+
+      attr = find_attr (&XSTR (exp, 0), 0);
+      gcc_assert (attr);
+
+      if (attr->is_const && !attr->is_special)
+	{
+	  struct insn_reserv *decl;
+
+	  for (decl = all_insn_reservs; decl; decl = decl->next)
+	    if (! check_tune_attr (attr->name, decl->condexp))
+	      return NULL;
+	  return attr;
+	}
+      return NULL;
+
+    default:
+      return NULL;
+    }
+}
+
 /* Create all of the attributes that describe automaton properties.  */
 static void
 make_automaton_attrs (void)
@@ -4379,28 +4442,154 @@  make_automaton_attrs (void)
   int i;
   struct insn_reserv *decl;
   rtx code_exp, lats_exp, byps_exp;
+  struct attr_desc *tune_attr;
 
   if (n_insn_reservs == 0)
     return;
 
-  code_exp = rtx_alloc (COND);
-  lats_exp = rtx_alloc (COND);
+  tune_attr = find_tune_attr (all_insn_reservs->condexp);
+  if (tune_attr != NULL)
+    {
+      rtx *condexps = XNEWVEC (rtx, n_insn_reservs * 3);
+      struct attr_value *val;
+      bool first = true;
+
+      gcc_assert (tune_attr->is_const
+		  && !tune_attr->is_special
+		  && !tune_attr->is_numeric);
+      for (val = tune_attr->first_value; val; val = val->next)
+	{
+	  if (val == tune_attr->default_val)
+	    continue;
+	  gcc_assert (GET_CODE (val->value) == CONST_STRING);
+	  printf ("static int internal_dfa_insn_code_%s (rtx);\n"
+		  "static int insn_default_latency_%s (rtx);\n",
+		  XSTR (val->value, 0), XSTR (val->value, 0));
+	}
+
+      printf ("\n");
+      printf ("int (*internal_dfa_insn_code) (rtx);\n");
+      printf ("int (*insn_default_latency) (rtx);\n");
+      printf ("\n");
+      printf ("void\n");
+      printf ("init_sched_attrs (void)\n");
+      printf ("{\n");
+
+      for (val = tune_attr->first_value; val; val = val->next)
+	{
+	  int j;
+	  char *name;
+	  rtx test = attr_rtx (EQ_ATTR, tune_attr->name, XSTR (val->value, 0));
+
+	  if (val == tune_attr->default_val)
+	    continue;
+	  for (decl = all_insn_reservs, i = 0;
+	       decl;
+	       decl = decl->next)
+	    {
+	      rtx ctest = test;
+	      rtx condexp
+		= simplify_and_tree (decl->condexp, &ctest, -2, 0);
+	      if (condexp == false_rtx)
+		continue;
+	      if (condexp == true_rtx)
+		break;
+	      condexps[i] = condexp;
+	      condexps[i + 1] = make_numeric_value (decl->insn_num);
+	      condexps[i + 2] = make_numeric_value (decl->default_latency);
+	      i += 3;
+	    }
+
+	  code_exp = rtx_alloc (COND);
+	  lats_exp = rtx_alloc (COND);
+
+	  j = i / 3 * 2;
+	  XVEC (code_exp, 0) = rtvec_alloc (j);
+	  XVEC (lats_exp, 0) = rtvec_alloc (j);
+
+	  if (decl)
+	    {
+	      XEXP (code_exp, 1) = make_numeric_value (decl->insn_num);
+	      XEXP (lats_exp, 1) = make_numeric_value (decl->default_latency);
+	    }
+	  else
+	    {
+	      XEXP (code_exp, 1) = make_numeric_value (n_insn_reservs + 1);
+	      XEXP (lats_exp, 1) = make_numeric_value (0);
+	    }
+
+	  while (i > 0)
+	    {
+	      i -= 3;
+	      j -= 2;
+	      XVECEXP (code_exp, 0, j) = condexps[i];
+	      XVECEXP (lats_exp, 0, j) = condexps[i];
+
+	      XVECEXP (code_exp, 0, j + 1) = condexps[i + 1];
+	      XVECEXP (lats_exp, 0, j + 1) = condexps[i + 2];
+	    }
 
-  XVEC (code_exp, 0) = rtvec_alloc (n_insn_reservs * 2);
-  XVEC (lats_exp, 0) = rtvec_alloc (n_insn_reservs * 2);
+	  name = XNEWVEC (char,
+			  sizeof ("*internal_dfa_insn_code_")
+			  + strlen (XSTR (val->value, 0)));
+	  strcpy (name, "*internal_dfa_insn_code_");
+	  strcat (name, XSTR (val->value, 0));
+	  make_internal_attr (name, code_exp, ATTR_NONE);
+	  strcpy (name, "*insn_default_latency_");
+	  strcat (name, XSTR (val->value, 0));
+	  make_internal_attr (name, lats_exp, ATTR_NONE);
+	  XDELETEVEC (name);
 
-  XEXP (code_exp, 1) = make_numeric_value (n_insn_reservs + 1);
-  XEXP (lats_exp, 1) = make_numeric_value (0);
+	  if (first)
+	    {
+	      printf ("  if (");
+	      first = false;
+	    }
+	  else
+	    printf ("  else if (");
+	  write_test_expr (test, 0);
+	  printf (")\n");
+	  printf ("    {\n");
+	  printf ("      internal_dfa_insn_code\n");
+	  printf ("        = internal_dfa_insn_code_%s;\n",
+		  XSTR (val->value, 0));
+	  printf ("      insn_default_latency\n");
+	  printf ("        = insn_default_latency_%s;\n",
+		  XSTR (val->value, 0));
+	  printf ("    }\n");
+	}
+
+      printf ("  else\n");
+      printf ("    gcc_unreachable ();\n");
+      printf ("}\n");
+      printf ("\n");
 
-  for (decl = all_insn_reservs, i = 0;
-       decl;
-       decl = decl->next, i += 2)
+      XDELETEVEC (condexps);
+    }
+  else
     {
-      XVECEXP (code_exp, 0, i)   = decl->condexp;
-      XVECEXP (lats_exp, 0, i)   = decl->condexp;
+      code_exp = rtx_alloc (COND);
+      lats_exp = rtx_alloc (COND);
+
+      XVEC (code_exp, 0) = rtvec_alloc (n_insn_reservs * 2);
+      XVEC (lats_exp, 0) = rtvec_alloc (n_insn_reservs * 2);
 
-      XVECEXP (code_exp, 0, i+1) = make_numeric_value (decl->insn_num);
-      XVECEXP (lats_exp, 0, i+1) = make_numeric_value (decl->default_latency);
+      XEXP (code_exp, 1) = make_numeric_value (n_insn_reservs + 1);
+      XEXP (lats_exp, 1) = make_numeric_value (0);
+
+      for (decl = all_insn_reservs, i = 0;
+	   decl;
+	   decl = decl->next, i += 2)
+	{
+	  XVECEXP (code_exp, 0, i)   = decl->condexp;
+	  XVECEXP (lats_exp, 0, i)   = decl->condexp;
+
+	  XVECEXP (code_exp, 0, i+1) = make_numeric_value (decl->insn_num);
+	  XVECEXP (lats_exp, 0, i+1)
+	    = make_numeric_value (decl->default_latency);
+	}
+      make_internal_attr ("*internal_dfa_insn_code", code_exp, ATTR_NONE);
+      make_internal_attr ("*insn_default_latency",   lats_exp, ATTR_NONE);
     }
 
   if (n_bypasses == 0)
@@ -4423,8 +4612,6 @@  make_automaton_attrs (void)
 	  }
     }
 
-  make_internal_attr ("*internal_dfa_insn_code", code_exp, ATTR_NONE);
-  make_internal_attr ("*insn_default_latency",   lats_exp, ATTR_NONE);
   make_internal_attr ("*bypass_p",               byps_exp, ATTR_NONE);
 }
 
--- gcc/genattr.c.jj	2010-06-11 09:38:08.000000000 +0200
+++ gcc/genattr.c	2010-06-16 19:18:10.000000000 +0200
@@ -1,6 +1,6 @@ 
 /* Generate attribute information (insn-attr.h) from machine description.
-   Copyright (C) 1991, 1994, 1996, 1998, 1999, 2000, 2003, 2004, 2007, 2008
-   Free Software Foundation, Inc.
+   Copyright (C) 1991, 1994, 1996, 1998, 1999, 2000, 2003, 2004, 2007, 2008,
+   2010  Free Software Foundation, Inc.
    Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu)
 
 This file is part of GCC.
@@ -40,12 +40,18 @@  write_upcase (const char *str)
     putchar (TOUPPER(*str));
 }
 
+static VEC (rtx, heap) *const_attrs, *reservations;
+
+
 static void
 gen_attr (rtx attr)
 {
   const char *p, *tag;
   int is_const = GET_CODE (XEXP (attr, 2)) == CONST;
 
+  if (is_const)
+    VEC_safe_push (rtx, heap, const_attrs, attr);
+
   printf ("#define HAVE_ATTR_%s\n", XSTR (attr, 0));
 
   /* If numeric attribute, don't need to write an enum.  */
@@ -92,6 +98,68 @@  extern int insn_current_length (rtx);\n\
     }
 }
 
+/* Check that attribute NAME is used in define_insn_reservation condition
+   EXP.  Return true if it is.  */
+static bool
+check_tune_attr (const char *name, rtx exp)
+{
+  switch (GET_CODE (exp))
+    {
+    case AND:
+      if (check_tune_attr (name, XEXP (exp, 0)))
+	return true;
+      return check_tune_attr (name, XEXP (exp, 1));
+
+    case IOR:
+      return (check_tune_attr (name, XEXP (exp, 0))
+	      && check_tune_attr (name, XEXP (exp, 1)));
+
+    case EQ_ATTR:
+      return strcmp (XSTR (exp, 0), name) == 0;
+
+    default:
+      return false;
+    }
+}
+
+/* Try to find a const attribute (usually cpu or tune) that is used
+   in all define_insn_reservation conditions.  */
+static bool
+find_tune_attr (rtx exp)
+{
+  unsigned int i;
+  rtx attr;
+
+  switch (GET_CODE (exp))
+    {
+    case AND:
+    case IOR:
+      if (find_tune_attr (XEXP (exp, 0)))
+	return true;
+      return find_tune_attr (XEXP (exp, 1));
+
+    case EQ_ATTR:
+      if (strcmp (XSTR (exp, 0), "alternative") == 0)
+	return false;
+
+      for (i = 0; VEC_iterate (rtx, const_attrs, i, attr); i++)
+	if (strcmp (XSTR (attr, 0), XSTR (exp, 0)) == 0)
+	  {
+	    unsigned int j;
+	    rtx resv;
+
+	    for (j = 0; VEC_iterate (rtx, reservations, j, resv); j++)
+	      if (! check_tune_attr (XSTR (attr, 0), XEXP (resv, 2)))
+		return false;
+	    return true;
+	  }
+      return false;
+
+    default:
+      return false;
+    }
+}
+
 int
 main (int argc, char **argv)
 {
@@ -162,11 +230,16 @@  main (int argc, char **argv)
         }
 
       else if (GET_CODE (desc) == DEFINE_INSN_RESERVATION)
-	num_insn_reservations++;
+	{
+	  num_insn_reservations++;
+	  VEC_safe_push (rtx, heap, reservations, desc);
+	}
     }
 
   if (num_insn_reservations > 0)
     {
+      bool has_tune_attr
+	= find_tune_attr (XEXP (VEC_index (rtx, reservations, 0), 2));
       /* Output interface for pipeline hazards recognition based on
 	 DFA (deterministic finite state automata.  */
       printf ("\n#define INSN_SCHEDULING\n");
@@ -181,10 +254,24 @@  main (int argc, char **argv)
       printf ("#define CPU_UNITS_QUERY 0\n");
       printf ("#endif\n\n");
       /* Interface itself: */
-      printf ("/* Internal insn code number used by automata.  */\n");
-      printf ("extern int internal_dfa_insn_code (rtx);\n\n");
-      printf ("/* Insn latency time defined in define_insn_reservation. */\n");
-      printf ("extern int insn_default_latency (rtx);\n\n");
+      if (has_tune_attr)
+	{
+	  printf ("/* Initialize fn pointers for internal_dfa_insn_code\n");
+	  printf ("   and insn_default_latency.  */\n");
+	  printf ("extern void init_sched_attrs (void);\n\n");
+	  printf ("/* Internal insn code number used by automata.  */\n");
+	  printf ("extern int (*internal_dfa_insn_code) (rtx);\n\n");
+	  printf ("/* Insn latency time defined in define_insn_reservation. */\n");
+	  printf ("extern int (*insn_default_latency) (rtx);\n\n");
+	}
+      else
+	{
+	  printf ("#define init_sched_attrs() do { } while (0)\n\n");
+	  printf ("/* Internal insn code number used by automata.  */\n");
+	  printf ("extern int internal_dfa_insn_code (rtx);\n\n");
+	  printf ("/* Insn latency time defined in define_insn_reservation. */\n");
+	  printf ("extern int insn_default_latency (rtx);\n\n");
+	}
       printf ("/* Return nonzero if there is a bypass for given insn\n");
       printf ("   which is a data producer.  */\n");
       printf ("extern int bypass_p (rtx);\n\n");
--- gcc/cfgexpand.c.jj	2010-06-07 11:24:33.000000000 +0200
+++ gcc/cfgexpand.c	2010-06-16 18:41:04.000000000 +0200
@@ -47,6 +47,7 @@  along with GCC; see the file COPYING3.  
 #include "ssaexpand.h"
 #include "bitmap.h"
 #include "sbitmap.h"
+#include "insn-attr.h" /* For INSN_SCHEDULING.  */
 
 /* This variable holds information helping the rewriting of SSA trees
    into RTL.  */
@@ -3761,6 +3762,10 @@  gimple_expand_cfg (void)
   set_curr_insn_block (DECL_INITIAL (current_function_decl));
   prologue_locator = curr_insn_locator ();
 
+#ifdef INSN_SCHEDULING
+  init_sched_attrs ();
+#endif
+
   /* Make sure first insn is a note even if we don't want linenums.
      This makes sure the first insn will never be deleted.
      Also, final expects a note to appear there.  */
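
As an illustration of the check_tune_attr recursion in the patch above, here is a toy model on a simplified expression tree.  The struct, the enum and the name tests_attr are stand-ins invented for this sketch and do not use the real rtx accessors.  An AND condition qualifies if either operand tests the attribute, while an IOR condition qualifies only if both operands do.  The names are compared with strcmp, as in the genattr.c copy; the genattrtab.c copy compares the string pointers directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified condition codes; C_OTHER covers everything else.  */
enum code { C_AND, C_IOR, C_EQ_ATTR, C_OTHER };

struct expr
{
  enum code code;
  const char *attr;		/* Attribute name, for C_EQ_ATTR only.  */
  const struct expr *op0, *op1;	/* Operands, for C_AND and C_IOR.  */
};

/* Return true if condition EXP cannot hold without a test of
   attribute NAME being involved.  */
bool
tests_attr (const char *name, const struct expr *exp)
{
  assert (name != NULL && exp != NULL);
  switch (exp->code)
    {
    case C_AND:
      /* One conjunct testing NAME is enough.  */
      return tests_attr (name, exp->op0) || tests_attr (name, exp->op1);

    case C_IOR:
      /* Every alternative must test NAME.  */
      return tests_attr (name, exp->op0) && tests_attr (name, exp->op1);

    case C_EQ_ATTR:
      return strcmp (exp->attr, name) == 0;

    default:
      return false;
    }
}
```

So a condition of the form (cpu AND type) qualifies for "cpu", but (cpu IOR type) does not, because the second alternative can match without looking at the tuning at all.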