diff mbox

Eliminate write-only variables

Message ID 20140518205956.GG1828@kam.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka May 18, 2014, 8:59 p.m. UTC
Sandra,
> This patch seems quite similar in purpose to the
> remove_local_statics optimization that Mentor has proposed, although
> the implementation is quite different.  Here is the last version of
> our patch, prepared by Bernd Schmidt last year:
> 
> https://gcc.gnu.org/ml/gcc-patches/2013-06/msg00317.html

Thanks for pointer, I did not notice this patch!
The approach is indeed very different.  So the patch works on function basis
and cares only about local statics of functions that was not inlined?
> 
> I think we can drop our patch from our local tree now, but it
> includes a large number of test cases which I think are worth
> keeping on mainline.  A few of them fail with your implementation,
> though -- which might be genuine bugs, or just different limitations
> of the two approaches.  Can you take a look?
> 
> The failing tests are remove-local-statics-{4,5,7,12,14b}.c.

+/* Verify that we don't eliminate a global static variable.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "global_static" } } */
+
+static int global_static;
+
+int
+test1 (int x)
+{
+  global_static = x;
+
+  return global_static + x;
+}

here test1 optimizes into

  global_static=x;
  return x+x;

this makes global_static write only and thus it is correctly eliminated.
So I think this testcase is bogus.

+++ b/gcc/testsuite/gcc.dg/remove-local-statics-5.c
@@ -0,0 +1,24 @@
+/* Verify that we do not eliminate a static local variable whose uses
+   are dominated by a def when the function calls setjmp.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+#include <setjmp.h>
+
+int
+foo (int x)
+{
+  static int thestatic;
+  int retval;
+  jmp_buf env;
+
+  thestatic = x;
+
+  retval = thestatic + x;
+
+  setjmp (env);
+
+  return retval;
+}

I belive this is similar case.  I do not see setjmp changing anything here, since
local optimizers turns retval = x+x;
What it was intended to test?


Here we get after early optimizations:

int
test1 (int x)
{
  static int thestatic;
  int thestatic.0_5;
  int thestatic.1_7;
  int _8;

  <bb 2>:
  if (x_2(D) < 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  thestatic = x_2(D);
  goto <bb 5>;

  <bb 4>:
  thestatic.0_5 = -x_2(D);
  thestatic = thestatic.0_5;

  <bb 5>:
  thestatic.1_7 = thestatic;
  _8 = thestatic.1_7 + x_2(D);
  return _8;

}

and thus we still have bogus read from thestatic.  Because my analysis works at IPA level,
we won't benefit from fact that dom2 eventually cleans it up as:
int
test1 (int x)
{
  static int thestatic;
  int thestatic.0_5;
  int thestatic.1_7;
  int _8;
  int prephitmp_10;

  <bb 2>:
  if (x_2(D) < 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  thestatic = x_2(D);
  goto <bb 5>;

  <bb 4>:
  thestatic.0_5 = -x_2(D);
  thestatic = thestatic.0_5;

  <bb 5>:
  # prephitmp_10 = PHI <x_2(D)(3), thestatic.0_5(4)>
  thestatic.1_7 = prephitmp_10;
  _8 = thestatic.1_7 + x_2(D);
  return _8;

}

Richi, is there a way to teach early FRE to get this transformation?
I see it is a partial redundancy problem...

+/* Verify that we do not eliminate a static variable when it is declared
+   in a function that has nested functions.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+int test1 (int x)
+{
+  static int thestatic;
+
+  int nested_test1 (int x)
+  {
+    return x + thestatic;
+  }
+
+  thestatic = x;
+
+  return thestatic + x + nested_test1 (x);
+}

Here we work hard enough to optimize test1 as:
int
test1 (int x)
{
  static int thestatic;
  int _4;
  int _5;

  <bb 2>:
  thestatic = x_2(D);
  _4 = x_2(D) + x_2(D);
  _5 = _4 + _4;
  return _5;

}

thus inlining nested_test1 during early optimization. This makes the removal valid.

+/* Verify that we do not eliminate a static local variable if the function
+   containing it is inlined.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "thestatic" } } */
+
+int
+test2 (int x)
+{
+  if (x < 0)
+    return 0;
+  else
+    return test1 (x - 1);
+}
+
+inline int
+test1 (int x)
+{
+  static int thestatic;
+  int y;
+
+  thestatic = x;
+
+  y = test2 (thestatic - 1);
+
+  return y + x;
+}

Here thestatic becomes write only during early optimization, so again we can correctly eliminate it.

Sandra,
do you think you can drop the testcases that are not valid and commit the valid one minus
remove-local-statics-7.c for which we can fill in enhancement request?

For cases like local-statics-7 your approach can be "saved" by adding simple IPA analysis
to look for static vars that are used only by one function and keeping your DSE code active
for them, so we can still get rid of this special case assignments during late compilation.
I am however not quite convinced it is worth the effort - do you have some real world
cases where it helps?

I am rather thinking about cutting the passmanager queue once again after main
tree optimization and re-running IPA unreachable code removal after them. This
should help with rather common cases where we optimize out code as effect
of inlining.

This would basically mean running pass_all_optimizations from late IPA pass
and scheduling one extra fixup_cfg and perhaps DCE pass at begginig of
pass_all_optimizations.

Honza

Comments

Sandra Loosemore May 19, 2014, 2:45 a.m. UTC | #1
On 05/18/2014 02:59 PM, Jan Hubicka wrote:
> Sandra,
>> This patch seems quite similar in purpose to the
>> remove_local_statics optimization that Mentor has proposed, although
>> the implementation is quite different.  Here is the last version of
>> our patch, prepared by Bernd Schmidt last year:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2013-06/msg00317.html
>
> Thanks for pointer, I did not notice this patch!
> The approach is indeed very different.  So the patch works on function basis
> and cares only about local statics of functions that was not inlined?

Yes.  I should probably mention here that we did the analysis and 
initial implementation of this optimization 7+ years ago against GCC 
4.2, and in some cases we were being conservative in deciding the 
optimization was not valid because the information required for more 
detailed analysis wasn't being collected in the right place back then, 
etc.

>> The failing tests are remove-local-statics-{4,5,7,12,14b}.c.
>
> +/* Verify that we don't eliminate a global static variable.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "global_static" } } */
> +
> +static int global_static;
> +
> +int
> +test1 (int x)
> +{
> +  global_static = x;
> +
> +  return global_static + x;
> +}
>
> here test1 optimizes into
>
>    global_static=x;
>    return x+x;
>
> this makes global_static write only and thus it is correctly eliminated.
> So I think this testcase is bogus.

Yes, I agree that this one was for a restriction of our implementation 
approach.

> +++ b/gcc/testsuite/gcc.dg/remove-local-statics-5.c
> @@ -0,0 +1,24 @@
> +/* Verify that we do not eliminate a static local variable whose uses
> +   are dominated by a def when the function calls setjmp.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "thestatic" } } */
> +
> +#include <setjmp.h>
> +
> +int
> +foo (int x)
> +{
> +  static int thestatic;
> +  int retval;
> +  jmp_buf env;
> +
> +  thestatic = x;
> +
> +  retval = thestatic + x;
> +
> +  setjmp (env);
> +
> +  return retval;
> +}
>
> I belive this is similar case.  I do not see setjmp changing anything here, since
> local optimizers turns retval = x+x;
> What it was intended to test?

Hmmmm, I'm guessing this was some concern about invalid code motion 
around a setjmp.  Our original analysis document lists "F does not call 
setjmp" as a requirement for the optimization, so this was probably a 
case where we were being excessively conservative.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/remove-local-statics-7.c
> @@ -0,0 +1,19 @@
> +/* Verify that we eliminate a static local variable where it is defined
> +   along all paths leading to a use.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-not "thestatic" } } */
> +
> +int
> +test1 (int x)
> +{
> +  static int thestatic;
> +
> +  if (x < 0)
> +    thestatic = x;
> +  else
> +    thestatic = -x;
> +
> +  return thestatic + x;
> +}
>
> Here we get after early optimizations:
>
> int
> test1 (int x)
> {
>    static int thestatic;
>    int thestatic.0_5;
>    int thestatic.1_7;
>    int _8;
>
>    <bb 2>:
>    if (x_2(D) < 0)
>      goto <bb 3>;
>    else
>      goto <bb 4>;
>
>    <bb 3>:
>    thestatic = x_2(D);
>    goto <bb 5>;
>
>    <bb 4>:
>    thestatic.0_5 = -x_2(D);
>    thestatic = thestatic.0_5;
>
>    <bb 5>:
>    thestatic.1_7 = thestatic;
>    _8 = thestatic.1_7 + x_2(D);
>    return _8;
>
> }
>
> and thus we still have bogus read from thestatic.  Because my analysis works at IPA level,
> we won't benefit from fact that dom2 eventually cleans it up as:
> int
> test1 (int x)
> {
>    static int thestatic;
>    int thestatic.0_5;
>    int thestatic.1_7;
>    int _8;
>    int prephitmp_10;
>
>    <bb 2>:
>    if (x_2(D) < 0)
>      goto <bb 3>;
>    else
>      goto <bb 4>;
>
>    <bb 3>:
>    thestatic = x_2(D);
>    goto <bb 5>;
>
>    <bb 4>:
>    thestatic.0_5 = -x_2(D);
>    thestatic = thestatic.0_5;
>
>    <bb 5>:
>    # prephitmp_10 = PHI <x_2(D)(3), thestatic.0_5(4)>
>    thestatic.1_7 = prephitmp_10;
>    _8 = thestatic.1_7 + x_2(D);
>    return _8;
>
> }
>
> Richi, is there a way to teach early FRE to get this transformation?
> I see it is a partial redundancy problem...

Hmmmm, bummer that we don't get this one for free.  :-(

> +/* Verify that we do not eliminate a static variable when it is declared
> +   in a function that has nested functions.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "thestatic" } } */
> +
> +int test1 (int x)
> +{
> +  static int thestatic;
> +
> +  int nested_test1 (int x)
> +  {
> +    return x + thestatic;
> +  }
> +
> +  thestatic = x;
> +
> +  return thestatic + x + nested_test1 (x);
> +}
>
> Here we work hard enough to optimize test1 as:
> int
> test1 (int x)
> {
>    static int thestatic;
>    int _4;
>    int _5;
>
>    <bb 2>:
>    thestatic = x_2(D);
>    _4 = x_2(D) + x_2(D);
>    _5 = _4 + _4;
>    return _5;
>
> }
>
> thus inlining nested_test1 during early optimization. This makes the removal valid.

Yes.  This is one we had to punt on due to the one-function-at-a-time 
approach.

> +/* Verify that we do not eliminate a static local variable if the function
> +   containing it is inlined.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "thestatic" } } */
> +
> +int
> +test2 (int x)
> +{
> +  if (x < 0)
> +    return 0;
> +  else
> +    return test1 (x - 1);
> +}
> +
> +inline int
> +test1 (int x)
> +{
> +  static int thestatic;
> +  int y;
> +
> +  thestatic = x;
> +
> +  y = test2 (thestatic - 1);
> +
> +  return y + x;
> +}
>
> Here thestatic becomes write only during early optimization, so again we can correctly eliminate it.

OK.

> Sandra,
> do you think you can drop the testcases that are not valid and commit the valid one minus
> remove-local-statics-7.c for which we can fill in enhancement request?

OK.  Keep the original numbering or re-number them to fill up the holes 
left by the deletions?

> For cases like local-statics-7 your approach can be "saved" by adding simple IPA analysis
> to look for static vars that are used only by one function and keeping your DSE code active
> for them, so we can still get rid of this special case assignments during late compilation.
> I am however not quite convinced it is worth the effort - do you have some real world
> cases where it helps?

Um, the well-known benchmark.  ;-)

> I am rather thinking about cutting the passmanager queue once again after main
> tree optimization and re-running IPA unreachable code removal after them. This
> should help with rather common cases where we optimize out code as effect
> of inlining.
>
> This would basically mean running pass_all_optimizations from late IPA pass
> and scheduling one extra fixup_cfg and perhaps DCE pass at begginig of
> pass_all_optimizations.
>
> Honza
>

-Sandra
Jan Hubicka May 19, 2014, 3:41 a.m. UTC | #2
> 
> Hmmmm, I'm guessing this was some concern about invalid code motion
> around a setjmp.  Our original analysis document lists "F does not
> call setjmp" as a requirement for the optimization, so this was
> probably a case where we were being excessively conservative.

I suppose it was because you needed to prove that the value stored in static
variable is dead at the function return and setjmp may let you to jump
from somewhere else.  This should transparently work with mainline approach.

> >
> >Richi, is there a way to teach early FRE to get this transformation?
> >I see it is a partial redundancy problem...
> 
> Hmmmm, bummer that we don't get this one for free.  :-(

Yeah, lets not forget about this one. On the plus side we got quite few cases
your code didn't ;)

> >Sandra,
> >do you think you can drop the testcases that are not valid and commit the valid one minus
> >remove-local-statics-7.c for which we can fill in enhancement request?
> 
> OK.  Keep the original numbering or re-number them to fill up the
> holes left by the deletions?

Since the original testcases never hit mainline, I would preffer them to be renumbered ;)

> 
> >For cases like local-statics-7 your approach can be "saved" by adding simple IPA analysis
> >to look for static vars that are used only by one function and keeping your DSE code active
> >for them, so we can still get rid of this special case assignments during late compilation.
> >I am however not quite convinced it is worth the effort - do you have some real world
> >cases where it helps?
> 
> Um, the well-known benchmark.  ;-)

Very informative, does my implementation handle it well? ;)

I suppose for benchmarks using static where they should not, the analysis that static
is used only in one function and pass to turn it into automatic variable would still
make sense. The approach removing write only variables and relying on FRE to completely
clean up is bit fragile by requiring very complex machinery to work perfectly...

Honza
> 
> >I am rather thinking about cutting the passmanager queue once again after main
> >tree optimization and re-running IPA unreachable code removal after them. This
> >should help with rather common cases where we optimize out code as effect
> >of inlining.
> >
> >This would basically mean running pass_all_optimizations from late IPA pass
> >and scheduling one extra fixup_cfg and perhaps DCE pass at begginig of
> >pass_all_optimizations.
> >
> >Honza
> >
> 
> -Sandra
>
Richard Biener May 19, 2014, 9:26 a.m. UTC | #3
On Sun, May 18, 2014 at 10:59 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Sandra,
>> This patch seems quite similar in purpose to the
>> remove_local_statics optimization that Mentor has proposed, although
>> the implementation is quite different.  Here is the last version of
>> our patch, prepared by Bernd Schmidt last year:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2013-06/msg00317.html
>
> Thanks for pointer, I did not notice this patch!
> The approach is indeed very different.  So the patch works on function basis
> and cares only about local statics of functions that was not inlined?
>>
>> I think we can drop our patch from our local tree now, but it
>> includes a large number of test cases which I think are worth
>> keeping on mainline.  A few of them fail with your implementation,
>> though -- which might be genuine bugs, or just different limitations
>> of the two approaches.  Can you take a look?
>>
>> The failing tests are remove-local-statics-{4,5,7,12,14b}.c.
>
> +/* Verify that we don't eliminate a global static variable.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "global_static" } } */
> +
> +static int global_static;
> +
> +int
> +test1 (int x)
> +{
> +  global_static = x;
> +
> +  return global_static + x;
> +}
>
> here test1 optimizes into
>
>   global_static=x;
>   return x+x;
>
> this makes global_static write only and thus it is correctly eliminated.
> So I think this testcase is bogus.
>
> +++ b/gcc/testsuite/gcc.dg/remove-local-statics-5.c
> @@ -0,0 +1,24 @@
> +/* Verify that we do not eliminate a static local variable whose uses
> +   are dominated by a def when the function calls setjmp.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "thestatic" } } */
> +
> +#include <setjmp.h>
> +
> +int
> +foo (int x)
> +{
> +  static int thestatic;
> +  int retval;
> +  jmp_buf env;
> +
> +  thestatic = x;
> +
> +  retval = thestatic + x;
> +
> +  setjmp (env);
> +
> +  return retval;
> +}
>
> I belive this is similar case.  I do not see setjmp changing anything here, since
> local optimizers turns retval = x+x;
> What it was intended to test?
>
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/remove-local-statics-7.c
> @@ -0,0 +1,19 @@
> +/* Verify that we eliminate a static local variable where it is defined
> +   along all paths leading to a use.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-not "thestatic" } } */
> +
> +int
> +test1 (int x)
> +{
> +  static int thestatic;
> +
> +  if (x < 0)
> +    thestatic = x;
> +  else
> +    thestatic = -x;
> +
> +  return thestatic + x;
> +}
>
> Here we get after early optimizations:
>
> int
> test1 (int x)
> {
>   static int thestatic;
>   int thestatic.0_5;
>   int thestatic.1_7;
>   int _8;
>
>   <bb 2>:
>   if (x_2(D) < 0)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>
>   <bb 3>:
>   thestatic = x_2(D);
>   goto <bb 5>;
>
>   <bb 4>:
>   thestatic.0_5 = -x_2(D);
>   thestatic = thestatic.0_5;
>
>   <bb 5>:
>   thestatic.1_7 = thestatic;
>   _8 = thestatic.1_7 + x_2(D);
>   return _8;
>
> }
>
> and thus we still have bogus read from thestatic.  Because my analysis works at IPA level,
> we won't benefit from fact that dom2 eventually cleans it up as:
> int
> test1 (int x)
> {
>   static int thestatic;
>   int thestatic.0_5;
>   int thestatic.1_7;
>   int _8;
>   int prephitmp_10;
>
>   <bb 2>:
>   if (x_2(D) < 0)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>
>   <bb 3>:
>   thestatic = x_2(D);
>   goto <bb 5>;
>
>   <bb 4>:
>   thestatic.0_5 = -x_2(D);
>   thestatic = thestatic.0_5;
>
>   <bb 5>:
>   # prephitmp_10 = PHI <x_2(D)(3), thestatic.0_5(4)>
>   thestatic.1_7 = prephitmp_10;
>   _8 = thestatic.1_7 + x_2(D);
>   return _8;
>
> }
>
> Richi, is there a way to teach early FRE to get this transformation?
> I see it is a partial redundancy problem...

Yeah, nothing FRE can do about (needs insert of a PHI node).
Eventually for this particular case running phiopt early would solve it.

> +/* Verify that we do not eliminate a static variable when it is declared
> +   in a function that has nested functions.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "thestatic" } } */
> +
> +int test1 (int x)
> +{
> +  static int thestatic;
> +
> +  int nested_test1 (int x)
> +  {
> +    return x + thestatic;
> +  }
> +
> +  thestatic = x;
> +
> +  return thestatic + x + nested_test1 (x);
> +}
>
> Here we work hard enough to optimize test1 as:
> int
> test1 (int x)
> {
>   static int thestatic;
>   int _4;
>   int _5;
>
>   <bb 2>:
>   thestatic = x_2(D);
>   _4 = x_2(D) + x_2(D);
>   _5 = _4 + _4;
>   return _5;
>
> }
>
> thus inlining nested_test1 during early optimization. This makes the removal valid.
>
> +/* Verify that we do not eliminate a static local variable if the function
> +   containing it is inlined.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "thestatic" } } */
> +
> +int
> +test2 (int x)
> +{
> +  if (x < 0)
> +    return 0;
> +  else
> +    return test1 (x - 1);
> +}
> +
> +inline int
> +test1 (int x)
> +{
> +  static int thestatic;
> +  int y;
> +
> +  thestatic = x;
> +
> +  y = test2 (thestatic - 1);
> +
> +  return y + x;
> +}
>
> Here thestatic becomes write only during early optimization, so again we can correctly eliminate it.
>
> Sandra,
> do you think you can drop the testcases that are not valid and commit the valid one minus
> remove-local-statics-7.c for which we can fill in enhancement request?
>
> For cases like local-statics-7 your approach can be "saved" by adding simple IPA analysis
> to look for static vars that are used only by one function and keeping your DSE code active
> for them, so we can still get rid of this special case assignments during late compilation.
> I am however not quite convinced it is worth the effort - do you have some real world
> cases where it helps?
>
> I am rather thinking about cutting the passmanager queue once again after main
> tree optimization and re-running IPA unreachable code removal after them. This
> should help with rather common cases where we optimize out code as effect
> of inlining.
>
> This would basically mean running pass_all_optimizations from late IPA pass
> and scheduling one extra fixup_cfg and perhaps DCE pass at begginig of
> pass_all_optimizations.
>
> Honza
Joseph Myers May 19, 2014, 5:45 p.m. UTC | #4
On Sun, 18 May 2014, Sandra Loosemore wrote:

> Hmmmm, I'm guessing this was some concern about invalid code motion around a
> setjmp.  Our original analysis document lists "F does not call setjmp" as a
> requirement for the optimization, so this was probably a case where we were
> being excessively conservative.

Reading my internal analysis from 24 Nov 2005, it was in terms of 
converting a function-local static variable to automatic.  The concern may 
well have been about "the values of objects of automatic storage duration 
that are local to the function containing the invocation of the 
corresponding setjmp macro that do not have volatile-qualified type and 
have been changed between the setjmp invocation and longjmp call are 
indeterminate" meaning that program that's valid when the variable is 
static can become invalid on converting it to automatic.  Of course that 
depends on the particular uses of the variable (whether it's possible for 
it to be changed between the two calls); I think we can presume the 
existing optimizations (that deal with everything except the final store) 
already deal with keeping the transformations valid in the presence of 
setjmp / longjmp.
Sandra Loosemore May 20, 2014, 3:59 a.m. UTC | #5
On 05/18/2014 08:45 PM, Sandra Loosemore wrote:
> On 05/18/2014 02:59 PM, Jan Hubicka wrote:
>> For cases like local-statics-7 your approach can be "saved" by adding
>> simple IPA analysis
>> to look for static vars that are used only by one function and keeping
>> your DSE code active
>> for them, so we can still get rid of this special case assignments
>> during late compilation.
>> I am however not quite convinced it is worth the effort - do you have
>> some real world
>> cases where it helps?
>
> Um, the well-known benchmark.  ;-)

I looked at the generated code for this benchmark and see that your 
patch is indeed not getting rid of all the pointless static variables, 
while ours is, so this is quite disappointing.  I'm thinking now that we 
will still need to retain our patch at least locally, although we'd 
really like to get it on trunk if possible.

Here is another testcase vaguely based on the same kind of 
signal-processing algorithm as the benchmark, but coded without 
reference to it.  I have arm-none-eabi builds around for both 4.9.0 with 
our remove_local_statics patch applied, and trunk with your patch.  With 
-O2, our optimization produces 23 instructions and gets rid of all 3 
statics, yours produces 33 and only gets rid of 1 of them.

Of course it's lame to use static variables in this way, but OTOH it's 
lame if GCC can't recognize them and optimize them away, too.

-Sandra
Jan Hubicka May 20, 2014, 4:29 a.m. UTC | #6
> On 05/18/2014 08:45 PM, Sandra Loosemore wrote:
> >On 05/18/2014 02:59 PM, Jan Hubicka wrote:
> >>For cases like local-statics-7 your approach can be "saved" by adding
> >>simple IPA analysis
> >>to look for static vars that are used only by one function and keeping
> >>your DSE code active
> >>for them, so we can still get rid of this special case assignments
> >>during late compilation.
> >>I am however not quite convinced it is worth the effort - do you have
> >>some real world
> >>cases where it helps?
> >
> >Um, the well-known benchmark.  ;-)
> 
> I looked at the generated code for this benchmark and see that your
> patch is indeed not getting rid of all the pointless static
> variables, while ours is, so this is quite disappointing.  I'm
> thinking now that we will still need to retain our patch at least
> locally, although we'd really like to get it on trunk if possible.

Yours patch can really be improved by adding simple IPA analysis to work
out what variables are refernced by one function only instead of using
not-longer-that-relevant local static info.
As last IPA pass for each static variable with no address taken, look at all
references and see if they all come from one function or functions inlined to
it.
> 
> Here is another testcase vaguely based on the same kind of
> signal-processing algorithm as the benchmark, but coded without
> reference to it.  I have arm-none-eabi builds around for both 4.9.0
> with our remove_local_statics patch applied, and trunk with your
> patch.  With -O2, our optimization produces 23 instructions and gets
> rid of all 3 statics, yours produces 33 and only gets rid of 1 of
> them.
> 
> Of course it's lame to use static variables in this way, but OTOH
> it's lame if GCC can't recognize them and optimize them away, too.
> 
> -Sandra
> 

> void
> fir (int *coeffs, int coeff_length, int *input, int input_length, int *output)
> {
>   static int *coeffp;
>   static int *inputp;
>   static int *outputp;

Here inputp read is rmeoved only at 101.dceloop1 pass. That is really late.
coeffp is removed late, too.

>   int i, c, acc;
> 
>   for (i = 0; i < input_length; i++)
>     {
>       coeffp = coeffs;
>       inputp = input + i;
>       outputp = output + i;
>       acc = 0;
>       for (c = 0; c < coeff_length; c++)
> 	{
> 	  acc += *coeffp * *inputp;
> 	  coeffp++;
> 	  inputp--;
> 	}

It seems like AA can not work out the fact that inputp is unchanged until that
late.  Richi, any ideas?

Honza
>       *outputp = acc;
>     }
> }
> 
>
diff mbox

Patch

--- /dev/null
+++ b/gcc/testsuite/gcc.dg/remove-local-statics-7.c
@@ -0,0 +1,19 @@ 
+/* Verify that we eliminate a static local variable where it is defined
+   along all paths leading to a use.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "thestatic" } } */
+
+int
+test1 (int x)
+{
+  static int thestatic;
+
+  if (x < 0)
+    thestatic = x;
+  else
+    thestatic = -x;
+
+  return thestatic + x;
+}