Patchwork [PR,47382] We cannot simply fold OBJ_TYPE_REF at all in 4.6

login
register
mail settings
Submitter Martin Jambor
Date Jan. 25, 2011, 1:57 p.m.
Message ID <20110125135736.GB27592@virgil.arch.suse.de>
Download mbox | patch
Permalink /patch/80360/
State New
Headers show

Comments

Martin Jambor - Jan. 25, 2011, 1:57 p.m.
Hi,

PR 47382 is another devirtualization issue but at least this time I
think I did not cause it.  The problem is that even when I reverted
gimple-folding of OBJ_TYPE_REF to what is being done in 4.5, the IL
now looks different and thus we still miscompile stuff.

For the PR testcase, 4.5 early inliner produces:

int main() ()
{
  struct A * this.1;
  int (*__vtbl_ptr_type) (void) * D.1820;
  int (*__vtbl_ptr_type) (void) D.1819;
  struct B b;
  int D.1793;

<bb 2>:
  b.D.1713._vptr.A = &_ZTV1A[2];
  b.D.1713._vptr.A = &_ZTV1B[2];
  D.1793_1 = 0;
  b.D.1713._vptr.A = &_ZTV1B[2];
  this.1_5 = &b.D.1713;
  this.1_5->_vptr.A = &_ZTV1A[2];
  D.1820_6 = this.1_5->_vptr.A;
  D.1819_7 = *D.1820_6;
  OBJ_TYPE_REF(D.1819_7;this.1_5->0) (this.1_5);

<L2>:
<L3>:
  return D.1793_1;

}

Whereas in 4.6 the IL (after disabling O_T_R folding) looks like:

int main() ()
{
  int (*__vtbl_ptr_type) (void) * D.1819;
  int (*__vtbl_ptr_type) (void) D.1818;
  struct B b;
  int D.1794;

<bb 2>:
  MEM[(struct A *)&b]._vptr.A = &_ZTV1A[2];
  b.D.1714._vptr.A = &_ZTV1B[2];
  D.1794_1 = 0;
  b.D.1714._vptr.A = &_ZTV1B[2];
  MEM[(struct A *)&b]._vptr.A = &_ZTV1A[2];
  D.1819_5 = MEM[(struct A *)&b]._vptr.A;
  D.1818_6 = *D.1819_5;
  OBJ_TYPE_REF(D.1818_6;&b->0) (&b);

<L2>:
<L3>:
  return D.1794_1;

}

Notice the subtle but important difference in the 1st (counted from
zero) argument of the OBJ_TYPE_REFs.

O_T_R folding as we've done it so far (since I first looked at the
source) relied on the fact that &DECL means we do not call a virtual
method of an ancestor but of the whole (outer-most) object, which
avoids many problems, including those with changing dynamic type.
However this apparently does not hold true for the current trunk.

Since the optimizations that lead to this seem all valid and useful in
other circumstances, I guess that we simply cannot fold OBJ_TYPE_REFs
without scanning for dynamic type changes at all and so the patch
below disables it.

Nevertheless, we do have a testcase in our testsuite
(g++.dg/opt/devirt1.C), that actually checks that this folding is
being done and so I have xfailed it.  Without LTO, our only chance of
optimizing code like this (with constructor in a different compilation
unit) is to use types and so I have had a look at what my patches I
have not committed do with this source.  Unfortunately, they bail out
when they try to make sure that the destination of the call is not an
complex type of thunk (only this-adjusting thunks are not too complex
to be represented in the call graph now).  This is done using call
graph nodes and since the call target does not have one, I give up.

Is there any way of determining if the function declaration extracted
from BINFO_VIRTUALS is a thunk?  I use TREE_PURPOSE to get the delta
by which to adjust this pointer already, do all thunks have a non-zero
delta in TREE_PURPOSE in BINFO_VIRTUALS?  It would be better to fail
to devirtualize only for thunks (even simple ones) than for all
methods...

Anyway, the patch below successfully bootstrapped and tested on
x86_64-linux.  OK for trunk?

Thanks,

Martin


2011-01-24  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/47382
	* gimple-fold.c (gimple_fold_obj_type_ref_call): Removed.
	(gimple_fold_call): Do not call gimple_fold_obj_type_ref_call.

	* testsuite/g++.dg/torture/pr47382.C: New test.
	* testsuite/g++.dg/opt/devirt1.C: Xfail.
Richard Guenther - Jan. 25, 2011, 2:31 p.m.
On Tue, 25 Jan 2011, Martin Jambor wrote:

> Hi,
> 
> PR 47382 is another devirtualization issue but at least this time I
> think I did not cause it.  The problem is that even when I reverted
> gimple-folding of OBJ_TYPE_REF to what is being done in 4.5, the IL
> now looks different and thus we still miscompile stuff.
> 
> For the PR testcase, 4.5 early inliner produces:
> 
> int main() ()
> {
>   struct A * this.1;
>   int (*__vtbl_ptr_type) (void) * D.1820;
>   int (*__vtbl_ptr_type) (void) D.1819;
>   struct B b;
>   int D.1793;
> 
> <bb 2>:
>   b.D.1713._vptr.A = &_ZTV1A[2];
>   b.D.1713._vptr.A = &_ZTV1B[2];
>   D.1793_1 = 0;
>   b.D.1713._vptr.A = &_ZTV1B[2];
>   this.1_5 = &b.D.1713;
>   this.1_5->_vptr.A = &_ZTV1A[2];
>   D.1820_6 = this.1_5->_vptr.A;
>   D.1819_7 = *D.1820_6;
>   OBJ_TYPE_REF(D.1819_7;this.1_5->0) (this.1_5);
> 
> <L2>:
> <L3>:
>   return D.1793_1;
> 
> }
> 
> Whereas in 4.6 the IL (after disabling O_T_R folding) looks like:
> 
> int main() ()
> {
>   int (*__vtbl_ptr_type) (void) * D.1819;
>   int (*__vtbl_ptr_type) (void) D.1818;
>   struct B b;
>   int D.1794;
> 
> <bb 2>:
>   MEM[(struct A *)&b]._vptr.A = &_ZTV1A[2];
>   b.D.1714._vptr.A = &_ZTV1B[2];
>   D.1794_1 = 0;
>   b.D.1714._vptr.A = &_ZTV1B[2];
>   MEM[(struct A *)&b]._vptr.A = &_ZTV1A[2];
>   D.1819_5 = MEM[(struct A *)&b]._vptr.A;
>   D.1818_6 = *D.1819_5;
>   OBJ_TYPE_REF(D.1818_6;&b->0) (&b);
> 
> <L2>:
> <L3>:
>   return D.1794_1;
> 
> }
> 
> Notice the subtle but important difference in the 1st (counted from
> zero) argument of the OBJ_TYPE_REFs.
> 
> O_T_R folding as we've done it so far (since I first looked at the
> source) relied on the fact that &DECL means we do not call a virtual
> method of an ancestor but of the whole (outer-most) object, which
> avoids many problems, including those with changing dynamic type.
> However this apparently does not hold true for the current trunk.

As I mentioned repeatedly pointer types have no semantic meaning ;)

> Since the optimizations that lead to this seem all valid and useful in
> other circumstances, I guess that we simply cannot fold OBJ_TYPE_REFs
> without scanning for dynamic type changes at all and so the patch
> below disables it.
> 
> Nevertheless, we do have a testcase in our testsuite
> (g++.dg/opt/devirt1.C), that actually checks that this folding is
> being done and so I have xfailed it.  Without LTO, our only chance of
> optimizing code like this (with constructor in a different compilation
> unit) is to use types and so I have had a look at what my patches I
> have not committed do with this source.  Unfortunately, they bail out
> when they try to make sure that the destination of the call is not an
> complex type of thunk (only this-adjusting thunks are not too complex
> to be represented in the call graph now).  This is done using call
> graph nodes and since the call target does not have one, I give up.
> 
> Is there any way of determining if the function declaration extracted
> from BINFO_VIRTUALS is a thunk?  I use TREE_PURPOSE to get the delta
> by which to adjust this pointer already, do all thunks have a non-zero
> delta in TREE_PURPOSE in BINFO_VIRTUALS?  It would be better to fail
> to devirtualize only for thunks (even simple ones) than for all
> methods...
> 
> Anyway, the patch below successfully bootstrapped and tested on
> x86_64-linux.  OK for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 2011-01-24  Martin Jambor  <mjambor@suse.cz>
> 
> 	PR tree-optimization/47382
> 	* gimple-fold.c (gimple_fold_obj_type_ref_call): Removed.
> 	(gimple_fold_call): Do not call gimple_fold_obj_type_ref_call.
> 
> 	* testsuite/g++.dg/torture/pr47382.C: New test.
> 	* testsuite/g++.dg/opt/devirt1.C: Xfail.
> 
> Index: icln/gcc/gimple-fold.c
> ===================================================================
> --- icln.orig/gcc/gimple-fold.c
> +++ icln/gcc/gimple-fold.c
> @@ -1443,38 +1443,6 @@ gimple_adjust_this_by_delta (gimple_stmt
>    gimple_call_set_arg (call_stmt, 0, tmp);
>  }
>  
> -/* Fold a call statement to OBJ_TYPE_REF to a direct call, if possible.  GSI
> -   determines the statement, generating new statements is allowed only if
> -   INPLACE is false.  Return true iff the statement was changed.  */
> -
> -static bool
> -gimple_fold_obj_type_ref_call (gimple_stmt_iterator *gsi)
> -{
> -  gimple stmt = gsi_stmt (*gsi);
> -  tree ref = gimple_call_fn (stmt);
> -  tree obj = OBJ_TYPE_REF_OBJECT (ref);
> -  tree binfo, fndecl, delta;
> -  HOST_WIDE_INT token;
> -
> -  if (TREE_CODE (obj) != ADDR_EXPR)
> -    return false;
> -  obj = TREE_OPERAND (obj, 0);
> -  if (!DECL_P (obj)
> -      || TREE_CODE (TREE_TYPE (obj)) != RECORD_TYPE)
> -    return false;
> -  binfo = TYPE_BINFO (TREE_TYPE (obj));
> -  if (!binfo)
> -    return false;
> -
> -  token = tree_low_cst (OBJ_TYPE_REF_TOKEN (ref), 1);
> -  fndecl = gimple_get_virt_mehtod_for_binfo (token, binfo, &delta, false);
> -  if (!fndecl)
> -    return false;
> -  gcc_assert (integer_zerop (delta));
> -  gimple_call_set_fndecl (stmt, fndecl);
> -  return true;
> -}
> -
>  /* Attempt to fold a call statement referenced by the statement iterator GSI.
>     The statement may be replaced by another statement, e.g., if the call
>     simplifies to a constant value. Return true if any changes were made.
> @@ -1500,17 +1468,6 @@ gimple_fold_call (gimple_stmt_iterator *
>  	  return true;
>  	}
>      }
> -  else
> -    {
> -      /* ??? Should perhaps do this in fold proper.  However, doing it
> -         there requires that we create a new CALL_EXPR, and that requires
> -         copying EH region info to the new node.  Easier to just do it
> -         here where we can just smash the call operand.  */
> -      callee = gimple_call_fn (stmt);
> -      if (TREE_CODE (callee) == OBJ_TYPE_REF)
> -	return gimple_fold_obj_type_ref_call (gsi);
> -    }
> -
>    return false;
>  }
>  
> Index: icln/gcc/testsuite/g++.dg/torture/pr47382.C
> ===================================================================
> --- /dev/null
> +++ icln/gcc/testsuite/g++.dg/torture/pr47382.C
> @@ -0,0 +1,30 @@
> +// { dg-do run }
> +
> +extern "C" void abort ();
> +
> +struct A
> +{
> +  inline ~A ();
> +  virtual void foo () {}
> +};
> +
> +struct B : A
> +{
> +  virtual void foo () { abort(); }
> +};
> +
> +static inline void middleman (A *a)
> +{
> +  a->foo ();
> +}
> +
> +inline A::~A ()
> +{
> +  middleman (this);
> +}
> +
> +int main ()
> +{
> +   B b;
> +   return 0;
> +}
> Index: icln/gcc/testsuite/g++.dg/opt/devirt1.C
> ===================================================================
> --- icln.orig/gcc/testsuite/g++.dg/opt/devirt1.C
> +++ icln/gcc/testsuite/g++.dg/opt/devirt1.C
> @@ -1,6 +1,6 @@
>  // { dg-do compile }
>  // { dg-options "-O" }
> -// { dg-final { scan-assembler "xyzzy" } }
> +// { dg-final { scan-assembler "xyzzy" { xfail *-*-* } } }
>  
>  struct S { S(); virtual void xyzzy(); };
>  inline void foo(S *s) { s->xyzzy(); }
> 
>
Maxim Kuvyrkov - Feb. 3, 2011, 9:18 p.m.
<trimming CC:>

On Jan 25, 2011, at 4:57 PM, Martin Jambor wrote:

> Hi,
> 
> PR 47382 is another devirtualization issue but at least this time I
> think I did not cause it.  The problem is that even when I reverted
> gimple-folding of OBJ_TYPE_REF to what is being done in 4.5, the IL
> now looks different and thus we still miscompile stuff.

Martin,

Is it expected that this patch stops GCC from devirtualizing cases when inlining functions with references for parameters?

GCC mainline just before your patch was optimizing the following testcase to have no virtual calls.  I wonder how this can be fixed non-intrusively to qualify for Stage 4.

====================
/* Verify that the inliner makes good decisions and the example
   is optimized to 4 printf()s in main().  */
// { dg-do compile }
// { dg-options "-O2 -fdump-tree-optimized"  }

#include <stdio.h>

class Calculable
{
public:
	virtual unsigned char calculate() = 0;
	virtual ~Calculable() {}
};

class X : public Calculable
{
public:
	unsigned char calculate() { return 1; }
};

class Y : public Calculable
{
public:
	virtual unsigned char calculate() { return 2; }
};

static void print(X& c)
{
	printf("%d\n", c.calculate());
	printf("+1: %d\n", c.calculate() + 1);
}

static void print(Y& c)
{
	printf("%d\n", c.calculate());
	printf("+1: %d\n", c.calculate() + 1);
}

int main()
{
	X x;
	Y y;

	print(x);
	print(y);

	return 0;
}

// { dg-final { scan-tree-dump "printf \\(\"%d\\\\n\", 1\\);" "optimized" } }
// { dg-final { scan-tree-dump "printf \\(\"\\+1: %d\\\\n\", 2\\);" "optimized" } }
// { dg-final { scan-tree-dump "printf \\(\"%d\\\\n\", 2\\);" "optimized" } }
// { dg-final { scan-tree-dump "printf \\(\"\\+1: %d\\\\n\", 3\\);" "optimized" } }
=====================

Thank you,

--
Maxim Kuvyrkov
CodeSourcery
+7-812-677-6839
Martin Jambor - Feb. 8, 2011, 5:53 p.m.
Hi,

sorry for a late (and a rather brief) reply.

On Fri, Feb 04, 2011 at 12:18:42AM +0300, Maxim Kuvyrkov wrote:
> <trimming CC:>
> 
> On Jan 25, 2011, at 4:57 PM, Martin Jambor wrote:
> 
> > Hi,
> > 
> > PR 47382 is another devirtualization issue but at least this time I
> > think I did not cause it.  The problem is that even when I reverted
> > gimple-folding of OBJ_TYPE_REF to what is being done in 4.5, the IL
> > now looks different and thus we still miscompile stuff.
> 
> Martin,
> 
> Is it expected that this patch stops GCC from devirtualizing cases
> when inlining functions with references for parameters?

This patch basically disables all intraprocedural devirtualization
simply because that transformation relies on assumptions that no
longer hold true.  That leaves only devirtualization within inlining
and IPA-CP but those do not work intraprocedurally (i.e. when the
object is within the same function as the call).  And in your
testcase, early inlining puts all virtual calls to main() where the
objects are.

> 
> GCC mainline just before your patch was optimizing the following
> testcase to have no virtual calls.  I wonder how this can be fixed
> non-intrusively to qualify for Stage 4.

I have patches for this and hope I will post them soon (probably only as an
RFC, however).  They in fact are in the gcc-patches archives:
http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01033.html and 
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01214.html 

Unfortunately, I am afraid we would need to make a _very_ compelling
case for them to be included at this point.

Martin


> 
> ====================
> /* Verify that the inliner makes good decisions and the example
>    is optimized to 4 printf()s in main().  */
> // { dg-do compile }
> // { dg-options "-O2 -fdump-tree-optimized"  }
> 
> #include <stdio.h>
> 
> class Calculable
> {
> public:
> 	virtual unsigned char calculate() = 0;
> 	virtual ~Calculable() {}
> };
> 
> class X : public Calculable
> {
> public:
> 	unsigned char calculate() { return 1; }
> };
> 
> class Y : public Calculable
> {
> public:
> 	virtual unsigned char calculate() { return 2; }
> };
> 
> static void print(X& c)
> {
> 	printf("%d\n", c.calculate());
> 	printf("+1: %d\n", c.calculate() + 1);
> }
> 
> static void print(Y& c)
> {
> 	printf("%d\n", c.calculate());
> 	printf("+1: %d\n", c.calculate() + 1);
> }
> 
> int main()
> {
> 	X x;
> 	Y y;
> 
> 	print(x);
> 	print(y);
> 
> 	return 0;
> }
> 
> // { dg-final { scan-tree-dump "printf \\(\"%d\\\\n\", 1\\);" "optimized" } }
> // { dg-final { scan-tree-dump "printf \\(\"\\+1: %d\\\\n\", 2\\);" "optimized" } }
> // { dg-final { scan-tree-dump "printf \\(\"%d\\\\n\", 2\\);" "optimized" } }
> // { dg-final { scan-tree-dump "printf \\(\"\\+1: %d\\\\n\", 3\\);" "optimized" } }
> =====================
> 
> Thank you,
> 
> --
> Maxim Kuvyrkov
> CodeSourcery
> +7-812-677-6839
>
Maxim Kuvyrkov - Sept. 22, 2011, 6:36 a.m.
On 9/02/2011, at 6:53 AM, Martin Jambor wrote:

> 
> This patch basically disables all intraprocedural devirtualization
> simply because that transformation relies on assumptions that no
> longer hold true.  That leaves only devirtualization within inlining
> and IPA-CP but those do not work intraprocedurally (i.e. when the
> object is within the same function as the call).  And in your
> testcase, early inlining puts all virtual calls to main() where the
> objects are.
> 
>> 
>> GCC mainline just before your patch was optimizing the following
>> testcase to have no virtual calls.  I wonder how this can be fixed
>> non-intrusively to qualify for Stage 4.
> 
> I have patches for this and hope I will post them soon (probably only as an
> RFC, however).  They in fact are in the gcc-patches archives:
> http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01033.html and 
> http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01214.html 
> 
> Unfortunately, I am afraid we would need to make a _very_ compelling
> case for them to be included at this point.

Martin,

Ping.

I am about to submit improvements to inlining that take devirtualization opportunities into account and your two above patches help improve devirtualization quite a bit.  Do you plan to commit them some time soon?

For reference, I'm attaching the patch that adds several new devirtualization testcases.  Not optimizing the very simple inline-devirt-1.C is just embarrassing. 

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
Martin Jambor - Sept. 23, 2011, 2:19 p.m.
Hi,

On Thu, Sep 22, 2011 at 06:36:43PM +1200, Maxim Kuvyrkov wrote:
> On 9/02/2011, at 6:53 AM, Martin Jambor wrote:
> 
> > 
> > This patch basically disables all intraprocedural devirtualization
> > simply because that transformation relies on assumptions that no
> > longer hold true.  That leaves only devirtualization within inlining
> > and IPA-CP but those do not work intraprocedurally (i.e. when the
> > object is within the same function as the call).  And in your
> > testcase, early inlining puts all virtual calls to main() where the
> > objects are.
> > 
> >> 
> >> GCC mainline just before your patch was optimizing the following
> >> testcase to have no virtual calls.  I wonder how this can be fixed
> >> non-intrusively to qualify for Stage 4.
> > 
> > I have patches for this and hope I will post them soon (probably only as an
> > RFC, however).  They in fact are in the gcc-patches archives:
> > http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01033.html and 
> > http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01214.html 
> > 
> > Unfortunately, I am afraid we would need to make a _very_ compelling
> > case for them to be included at this point.
> 
> Martin,
> 
> Ping.
> 
> I am about to submit improvements to inlining that take devirtualization opportunities into account and your two above patches help improve devirtualization quite a bit.  Do you plan to commit them some time soon?
> 
> For reference, I'm attaching the patch that adds several new devirtualization testcases.  Not optimizing the very simple inline-devirt-1.C is just embarrassing. 
> 

Well, we have tried the patches when LTO-building Firefox in spring
and they did not really bring about much improvement.  Therefore we
concluded that more is needed - such as try to track malloced objects
and do devirtualization on them whenever possible because the
automatically allocated objects are not enough.

Also, because for other reasons (mainly Fortran array info) I plan to
implement jump functions for structure components, I planned to
eventually replace the new-dynamic-type-identification patch with a
jump function for the actual vptr value.

However, both of these are really 4.8 material and since the patches
probably need only minor updates, it might be worthwhile to do that so
that gcc can handle the "embarrassing" simple cases.  So I will do
that (though it might need to wait for about a week), re-try them on
Firefox and probably propose them for submission.

By the way, do you evaluate (your) devirtualization by compiling any
real software (other than Firefox)?  Do the patches make any
difference there?

Thanks,

Martin
Maxim Kuvyrkov - Sept. 30, 2011, 3:02 a.m.
On 24/09/2011, at 2:19 AM, Martin Jambor wrote:

> However, both of these are really 4.8 material and since the patches
> probably need only minor updates, it might be worthwhile to do that so
> that gcc can handle the "embarrassing" simple cases.  So I will do
> that (though it might need to wait for about a week), re-try them on
> Firefox and probably propose them for submission.

Great!  Thank you.

> 
> By the way, do you evaluate (your) devirtualization by compiling any
> real software (other than Firefox)?  Do the patches make any
> difference there?

The devirtualization and inlining improvements with your patches in the mix provided very good improvements on certain proprietary source base.  Code size shrank by 8-10% and performance improved by 3-10% (depending on hardware).  There was some amount of parameter tuning involved, so the improvements with default settings will not be as big, but they still should be noticeable.  [The testing was done with a 4.6-based toolchain.]

The optimizations that we implemented targeted coding patterns that are distilled in the 9 inline-devirt-*.C testcases that I posted.  The actual code base was not micro-optimized, and this makes me confident that the improvements are general enough to be valuable to broad bodies of code.

Regards,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
Maxim Kuvyrkov - Sept. 30, 2011, 5:56 a.m.
On 30/09/2011, at 4:02 PM, Maxim Kuvyrkov wrote:

> On 24/09/2011, at 2:19 AM, Martin Jambor wrote:
> 
>> However, both of these are really 4.8 material and since the patches
>> probably need only minor updates, it might be worthwhile to do that so
>> that gcc can handle the "embarrassing" simple cases.  So I will do
>> that (though it might need to wait for about a week), re-try them on
>> Firefox and probably propose them for submission.
> 
> Great!  Thank you.

Here is one of your patches updated to the latest mainline.  Just add water^W changelog.

It bootstraps, and I'm regtesting it on x86_64-linux-gnu {-m64,-m32} now.

Regards,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
Maxim Kuvyrkov - Sept. 30, 2011, 8:34 p.m.
On 30/09/2011, at 6:56 PM, Maxim Kuvyrkov wrote:

> On 30/09/2011, at 4:02 PM, Maxim Kuvyrkov wrote:
> 
>> On 24/09/2011, at 2:19 AM, Martin Jambor wrote:
>> 
>>> However, both of these are really 4.8 material and since the patches
>>> probably need only minor updates, it might be worthwhile to do that so
>>> that gcc can handle the "embarrassing" simple cases.  So I will do
>>> that (though it might need to wait for about a week), re-try them on
>>> Firefox and probably propose them for submission.
>> 
>> Great!  Thank you.
> 
> Here is one of your patches updated to the latest mainline.  Just add water^W changelog.
> 
> It bootstraps, and I'm regtesting it on x86_64-linux-gnu {-m64,-m32} now.

The regtest passed fine (C and C++ languages only).  The only failure is the testcase for PR43411, which this patch removes from XFAILs.  It seems the testcase requires your other patch to PASS.

HTH,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
Maxim Kuvyrkov - Sept. 30, 2011, 9:23 p.m.
On 30/09/2011, at 6:56 PM, Maxim Kuvyrkov wrote:

> On 30/09/2011, at 4:02 PM, Maxim Kuvyrkov wrote:
> 
>> On 24/09/2011, at 2:19 AM, Martin Jambor wrote:
>> 
>>> However, both of these are really 4.8 material and since the patches
>>> probably need only minor updates, it might be worthwhile to do that so
>>> that gcc can handle the "embarrassing" simple cases.  So I will do
>>> that (though it might need to wait for about a week), re-try them on
>>> Firefox and probably propose them for submission.
>> 
>> Great!  Thank you.
> 
> Here is one of your patches updated to the latest mainline.  Just add water^W changelog.
> 
> It bootstraps, and I'm regtesting it on x86_64-linux-gnu {-m64,-m32} now.

Attached is a follow up patch to your 2 devirtualization patches.  Now that compute_known_type_jump_func may be called with SSA_NAME argument (from ipa_try_devirtualize_immediately) we need to make it handle SSA_NAMEs.

One way to do so it to defer the job to detect_type_change_ssa, below patch implements this approach.  Note that the patch relies on detect_type_change to initialize JFUNC, as is done in your patch at http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01033.html.  Without that patch we need to initialize JFUNC inside compute_known_type_jump_func, which may be a cleaner solution.

I encourage you to include this small fix in the main body of one of your patches and claim credit for it.  You will spare me a test-submission cycle that way :-).

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
Maxim Kuvyrkov - Nov. 22, 2011, 6:45 a.m.
On 30/09/2011, at 6:56 PM, Maxim Kuvyrkov wrote:

> On 30/09/2011, at 4:02 PM, Maxim Kuvyrkov wrote:
> 
>> On 24/09/2011, at 2:19 AM, Martin Jambor wrote:
>> 
>>> However, both of these are really 4.8 material and since the patches
>>> probably need only minor updates, it might be worthwhile to do that so
>>> that gcc can handle the "embarrassing" simple cases.  So I will do
>>> that (though it might need to wait for about a week), re-try them on
>>> Firefox and probably propose them for submission.
>> 
>> Great!  Thank you.
> 
> Here is one of your patches updated to the latest mainline.  Just add water^W changelog.
> 
> It bootstraps, and I'm regtesting it on x86_64-linux-gnu {-m64,-m32} now.

Ping.

Martin, do you plan to have this pushed in for GCC 4.7?

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
Martin Jambor - Nov. 22, 2011, 10:38 a.m.
Hi,

On Tue, Nov 22, 2011 at 07:45:39PM +1300, Maxim Kuvyrkov wrote:
> On 30/09/2011, at 6:56 PM, Maxim Kuvyrkov wrote:
> 
> > On 30/09/2011, at 4:02 PM, Maxim Kuvyrkov wrote:
> > 
> >> On 24/09/2011, at 2:19 AM, Martin Jambor wrote:
> >> 
> >>> However, both of these are really 4.8 material and since the patches
> >>> probably need only minor updates, it might be worthwhile to do that so
> >>> that gcc can handle the "embarrassing" simple cases.  So I will do
> >>> that (though it might need to wait for about a week), re-try them on
> >>> Firefox and probably propose them for submission.
> >> 
> >> Great!  Thank you.
> > 
> > Here is one of your patches updated to the latest mainline.  Just add water^W changelog.
> > 
> > It bootstraps, and I'm regtesting it on x86_64-linux-gnu {-m64,-m32} now.
> 
> Ping.
> 
> Martin, do you plan to have this pushed in for GCC 4.7?
> 

well, there were two patches.  I have managed to update and push
trough one of them in time
(http://gcc.gnu.org/ml/gcc-patches/2011-11/msg00086.html) but
unfortunately I have not managed to do the same with the second one.
It's recent incarnation is here: 

http://gcc.gnu.org/ml/gcc-patches/2011-11/msg00095.html

However, since the stage 1 ended and I still wasn't able to
demonstrate any real impact anywhere (other than my semi-silly example
attached to that patch), I gave up.  It is unfortunate but I also had
other pressing tasks and the patch does not do anything on simple
programs and I have not been able to compile Firefox even without LTO
with the current trunk to try it on something complex.

Therefore at the moment I see no other option but to queue it to stage 1
again.

Thanks,

Martin

Patch

Index: icln/gcc/gimple-fold.c
===================================================================
--- icln.orig/gcc/gimple-fold.c
+++ icln/gcc/gimple-fold.c
@@ -1443,38 +1443,6 @@  gimple_adjust_this_by_delta (gimple_stmt
   gimple_call_set_arg (call_stmt, 0, tmp);
 }
 
-/* Fold a call statement to OBJ_TYPE_REF to a direct call, if possible.  GSI
-   determines the statement, generating new statements is allowed only if
-   INPLACE is false.  Return true iff the statement was changed.  */
-
-static bool
-gimple_fold_obj_type_ref_call (gimple_stmt_iterator *gsi)
-{
-  gimple stmt = gsi_stmt (*gsi);
-  tree ref = gimple_call_fn (stmt);
-  tree obj = OBJ_TYPE_REF_OBJECT (ref);
-  tree binfo, fndecl, delta;
-  HOST_WIDE_INT token;
-
-  if (TREE_CODE (obj) != ADDR_EXPR)
-    return false;
-  obj = TREE_OPERAND (obj, 0);
-  if (!DECL_P (obj)
-      || TREE_CODE (TREE_TYPE (obj)) != RECORD_TYPE)
-    return false;
-  binfo = TYPE_BINFO (TREE_TYPE (obj));
-  if (!binfo)
-    return false;
-
-  token = tree_low_cst (OBJ_TYPE_REF_TOKEN (ref), 1);
-  fndecl = gimple_get_virt_mehtod_for_binfo (token, binfo, &delta, false);
-  if (!fndecl)
-    return false;
-  gcc_assert (integer_zerop (delta));
-  gimple_call_set_fndecl (stmt, fndecl);
-  return true;
-}
-
 /* Attempt to fold a call statement referenced by the statement iterator GSI.
    The statement may be replaced by another statement, e.g., if the call
    simplifies to a constant value. Return true if any changes were made.
@@ -1500,17 +1468,6 @@  gimple_fold_call (gimple_stmt_iterator *
 	  return true;
 	}
     }
-  else
-    {
-      /* ??? Should perhaps do this in fold proper.  However, doing it
-         there requires that we create a new CALL_EXPR, and that requires
-         copying EH region info to the new node.  Easier to just do it
-         here where we can just smash the call operand.  */
-      callee = gimple_call_fn (stmt);
-      if (TREE_CODE (callee) == OBJ_TYPE_REF)
-	return gimple_fold_obj_type_ref_call (gsi);
-    }
-
   return false;
 }
 
Index: icln/gcc/testsuite/g++.dg/torture/pr47382.C
===================================================================
--- /dev/null
+++ icln/gcc/testsuite/g++.dg/torture/pr47382.C
@@ -0,0 +1,30 @@ 
+// { dg-do run }
+
+extern "C" void abort ();
+
+struct A
+{
+  inline ~A ();
+  virtual void foo () {}
+};
+
+struct B : A
+{
+  virtual void foo () { abort(); }
+};
+
+static inline void middleman (A *a)
+{
+  a->foo ();
+}
+
+inline A::~A ()
+{
+  middleman (this);
+}
+
+int main ()
+{
+   B b;
+   return 0;
+}
Index: icln/gcc/testsuite/g++.dg/opt/devirt1.C
===================================================================
--- icln.orig/gcc/testsuite/g++.dg/opt/devirt1.C
+++ icln/gcc/testsuite/g++.dg/opt/devirt1.C
@@ -1,6 +1,6 @@ 
 // { dg-do compile }
 // { dg-options "-O" }
-// { dg-final { scan-assembler "xyzzy" } }
+// { dg-final { scan-assembler "xyzzy" { xfail *-*-* } } }
 
 struct S { S(); virtual void xyzzy(); };
 inline void foo(S *s) { s->xyzzy(); }