diff mbox

[RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)

Message ID 4C5954A3.8090003@codesourcery.com
State New
Headers show

Commit Message

Jie Zhang Aug. 4, 2010, 11:53 a.m. UTC
On 08/02/2010 09:01 PM, Martin Jambor wrote:
> Hi,
>
> On Sat, Jul 31, 2010 at 12:09:42AM +0800, Jie Zhang wrote:
>> PR tree-optimization/45144 shows an issue that SRA causes. I used
>> arm-none-eabi target as an example in PR tree-optimization/45144.
>> But the same issue can also been seen on x86_64-linux-gnu target
>> using the same test case in the PR.
>>
>> SRA completely scalarizes a small record. But when the record is
>> used later as a whole, GCC has to make the record out of the scalar
>> parts. When the record contains bit-fields, GCC generates ugly code
>> to assemble the scalar parts into a record.
>>
>> Until the aggregates copy propagation is implemented, I think it
>> would better to disable full scalarization for such records. The
>> patch is attached. It's bootstrapped on x86_64-linux-gnu and
>> regression tested.
>>
>> Is it OK for now? We can remove it after aggregates copy propagation
>> is implemented.
>>
>> Will it be better to add bit-field check in
>> type_consists_of_records_p instead of using a new function
>> "type_contains_bit_field_p"?
>>
>
> When I was implementing the total scalarization bit of SRA I thought
> of disabling it for structures with bit-fields too.  I did not really
> examine the effects in any way but I never expected this to result in
> nice code at places where we use SRA to do poor-man's copy
> propagation.  However, eventually I decided to keep the total
> scalarization for these structures because doing so can save stack
> space and it would be shame if adding one such field to a structure
> would make us use the space again (in fact, total scalarization was
> introduced as a fix to unnecessary stack-frame setup bugs like PR
> 42585).  But given your results with kernel and gcc, I don't object to
> disabling it... people will scream if something slows down for them.
>
> On the other hand, if we decide to go this way, we need to do the
> check at a different place, going over the whole type whenever looking
> at an assignment is not necessary and is wasteful.  The check would be
> most appropriate as a part of type_consists_of_records_p where it
> would be performed only once for each variable in question.
>

Thanks for your comment! How about this version? I moved the check into 
type_consists_of_records_p. Bootstrapped and regression tested on 
x86_64. Also checked by building linux kernel and made sure there were 
no regressions.


Regards,

Comments

Richard Biener Aug. 4, 2010, 12:23 p.m. UTC | #1
On Wed, Aug 4, 2010 at 1:53 PM, Jie Zhang <jie@codesourcery.com> wrote:
> On 08/02/2010 09:01 PM, Martin Jambor wrote:
>>
>> Hi,
>>
>> On Sat, Jul 31, 2010 at 12:09:42AM +0800, Jie Zhang wrote:
>>>
>>> PR tree-optimization/45144 shows an issue that SRA causes. I used
>>> arm-none-eabi target as an example in PR tree-optimization/45144.
>>> But the same issue can also been seen on x86_64-linux-gnu target
>>> using the same test case in the PR.
>>>
>>> SRA completely scalarizes a small record. But when the record is
>>> used later as a whole, GCC has to make the record out of the scalar
>>> parts. When the record contains bit-fields, GCC generates ugly code
>>> to assemble the scalar parts into a record.
>>>
>>> Until the aggregates copy propagation is implemented, I think it
>>> would better to disable full scalarization for such records. The
>>> patch is attached. It's bootstrapped on x86_64-linux-gnu and
>>> regression tested.
>>>
>>> Is it OK for now? We can remove it after aggregates copy propagation
>>> is implemented.
>>>
>>> Will it be better to add bit-field check in
>>> type_consists_of_records_p instead of using a new function
>>> "type_contains_bit_field_p"?
>>>
>>
>> When I was implementing the total scalarization bit of SRA I thought
>> of disabling it for structures with bit-fields too.  I did not really
>> examine the effects in any way but I never expected this to result in
>> nice code at places where we use SRA to do poor-man's copy
>> propagation.  However, eventually I decided to keep the total
>> scalarization for these structures because doing so can save stack
>> space and it would be shame if adding one such field to a structure
>> would make us use the space again (in fact, total scalarization was
>> introduced as a fix to unnecessary stack-frame setup bugs like PR
>> 42585).  But given your results with kernel and gcc, I don't object to
>> disabling it... people will scream if something slows down for them.
>>
>> On the other hand, if we decide to go this way, we need to do the
>> check at a different place, going over the whole type whenever looking
>> at an assignment is not necessary and is wasteful.  The check would be
>> most appropriate as a part of type_consists_of_records_p where it
>> would be performed only once for each variable in question.
>>
>
> Thanks for your comment! How about this version? I moved the check into
> type_consists_of_records_p. Bootstrapped and regression tested on x86_64.
> Also checked by building linux kernel and made sure there were no
> regressions.

This is ok if Martin is ok with it.

Thanks,
Richard.

>
> Regards,
> --
> Jie Zhang
> CodeSourcery
>
Martin Jambor Aug. 4, 2010, 7:40 p.m. UTC | #2
Hi,

On Wed, Aug 04, 2010 at 02:23:23PM +0200, Richard Guenther wrote:
> On Wed, Aug 4, 2010 at 1:53 PM, Jie Zhang <jie@codesourcery.com> wrote:
> > Thanks for your comment! How about this version? I moved the check into
> > type_consists_of_records_p. Bootstrapped and regression tested on x86_64.
> > Also checked by building linux kernel and made sure there were no
> > regressions.
> 
> This is ok if Martin is ok with it.
> 
> Thanks,
> Richard.


Yes, it's fine, I can't really think of any other quick way of dealing
with this.

Thanks,

Martin
Jie Zhang Aug. 5, 2010, 3:11 a.m. UTC | #3
On 08/05/2010 03:40 AM, Martin Jambor wrote:
> Hi,
>
> On Wed, Aug 04, 2010 at 02:23:23PM +0200, Richard Guenther wrote:
>> On Wed, Aug 4, 2010 at 1:53 PM, Jie Zhang<jie@codesourcery.com>  wrote:
>>> Thanks for your comment! How about this version? I moved the check into
>>> type_consists_of_records_p. Bootstrapped and regression tested on x86_64.
>>> Also checked by building linux kernel and made sure there were no
>>> regressions.
>>
>> This is ok if Martin is ok with it.
>>
> Yes, it's fine, I can't really think of any other quick way of dealing
> with this.
>
Thanks Richard and Martin. I have committed it on trunk.
diff mbox

Patch


	PR tree-optimization/45144
	* tree-sra.c (type_consists_of_records_p): Return false
	if the record contains bit-field.

	testsuite/
	PR tree-optimization/45144
	* gcc.dg/tree-ssa/pr45144.c: New test.

Index: testsuite/gcc.dg/tree-ssa/pr45144.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/pr45144.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/pr45144.c	(revision 0)
@@ -0,0 +1,46 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void baz (unsigned);
+
+extern unsigned buf[];
+
+struct A
+{
+  unsigned a1:10;
+  unsigned a2:3;
+  unsigned:19;
+};
+
+union TMP
+{
+  struct A a;
+  unsigned int b;
+};
+
+static unsigned
+foo (struct A *p)
+{
+  union TMP t;
+  struct A x;
+  
+  x = *p;
+  t.a = x;
+  return t.b;
+}
+
+void
+bar (unsigned orig, unsigned *new)
+{
+  struct A a;
+  union TMP s;
+
+  s.b = orig;
+  a = s.a;
+  if (a.a1)
+    baz (a.a2);
+  *new = foo (&a);
+}
+
+/* { dg-final { scan-tree-dump "x = a;" "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: tree-sra.c
===================================================================
--- tree-sra.c	(revision 162725)
+++ tree-sra.c	(working copy)
@@ -811,7 +811,7 @@  create_access (tree expr, gimple stmt, b
 /* Return true iff TYPE is a RECORD_TYPE with fields that are either of gimple
    register types or (recursively) records with only these two kinds of fields.
    It also returns false if any of these records has a zero-size field as its
-   last field.  */
+   last field or has a bit-field.  */
 
 static bool
 type_consists_of_records_p (tree type)
@@ -827,6 +827,9 @@  type_consists_of_records_p (tree type)
       {
 	tree ft = TREE_TYPE (fld);
 
+	if (DECL_BIT_FIELD (fld))
+	  return false;
+
 	if (!is_gimple_reg_type (ft)
 	    && !type_consists_of_records_p (ft))
 	  return false;