Patchwork fix incorrect SRA transformation on non-integral VIEW_CONVERT argument

login
register
mail settings
Submitter Richard Guenther
Date April 25, 2012, 2:16 p.m.
Message ID <CAFiYyc0MrXj-rU8vdCRu4G1vNeUu-jMMz-v1nVsiKx+A0sQ=2g@mail.gmail.com>
Download mbox | patch
Permalink /patch/154946/
State New
Headers show

Comments

Richard Guenther - April 25, 2012, 2:16 p.m.
On Wed, Apr 25, 2012 at 3:37 PM, Olivier Hainque <hainque@adacore.com> wrote:
> Hello,
>
> For the  "PA(1).Z := 44;" assignment in the attached Ada
> testcase, we observe the gcc 4.5 SRA pass performing an
> invalid transformation, turning:
>
>  struct {
>    system__pack_48__bits_48 OBJ;
>  } D.1432;
>
>  D.1432.OBJ = D.1435;
>  T1b.F = VIEW_CONVERT_EXPR<struct pt__point>(D.1432);
>
> into:
>
>  SR.12_17 = D.1435_3;
>  T1b.F = VIEW_CONVERT_EXPR<struct pt__point>(SR.12_17);
>
> where we have
>
>    <var_decl D.1432
>     type <record_type 0x7ffff7fb72a0 BLK
>        size <integer_cst 0x7ffff7fac960 constant 48>
>
> and
>
>    <var_decl SR.12
>     type <integer_type system__pack_48__bits_48
>        size <integer_cst 0x7ffff7ecd870 64>
>
>        type <integer_type system__pack_48__bits_48___UMT
>            size <integer_cst 64>
>
> At least the change in size is problematic because the conversion
> outcome might differ after the replacement, in particular on
> big-endian targets.
>
> mainline does something slightly different with the same effects
> eventually (same testcase failure on sparc-solaris). The attached patch
> is a proposal to address this at the point where we already check
> for VCE in the access creation process, disqualifying scalarization
> for a VCE argument of non-integral size.
>
> We (AdaCore) have been using this internally for a while now.
> I also checked that it fixes the observable problem on sparc, then
> bootstrapped and regtested on i686-suse-linux.
>
> OK to commit ?
>
> Thanks in advance for your feedback,

Does this really cover all problematic cases?  Ah, the existing code
already rejects all VIEW_CONVERT_EXPRs down in the reference
chain.

I think much better would be to simply disallow any toplevel
VIEW_CONVERT_EXPR of BLKmode, so, something like


Does that fix your problems, too?  If so I prefer that.

Thanks,
Richard.

> Olivier
>
> 2012-04-25  Olivier Hainque  <hainque@adacore.com>
>
>        * tree-sra.c (create_access): Additional IN_VCE argument, telling
>        if EXPR is VIEW_CONVERT_EXPR input.  Disqualify base if access size
>        is not that of an integer mode in this case.
>        (build_access_from_expr_1): Adjust caller.
>
>        testsuite/
>        * gnat.dg/sra_vce[_decls].adb: New testcase.
>
Olivier Hainque - April 25, 2012, 9:29 p.m.
Thanks for your feedback Richard,

On Apr 25, 2012, at 16:16 , Richard Guenther wrote:
> I think much better would be to simply disallow any toplevel
> VIEW_CONVERT_EXPR of BLKmode,

> Does that fix your problems, too?  If so I prefer that.

 Hmm, I think that this would fix the particular testscase at
 hand but not quite the underlying issue I was aiming at.

 The basic problem I am trying to address is SRA turning
 VCE (<some access>) into VCE (<scalar>) where the replacement
 <scalar> is of a different size than <some access>.

 The idea of locating the check in create_access is that
 this is where the <access> size is computed.

 I'm pretty sure that excluding just on BLKmode for
 TYPE(<access>) doesn't catch them all. 

 In particular, I'm pretty sure that we can get component
 refs of integral modes that access a smaller range of bits
 than what the mode conveys. It is common with packing or
 rep clauses in Ada.

 I'll see if I can come up with a case tomorrow.

 I'm actually also slightly confused by the processing of
 VCE inputs in SRA, as VCE (<composite>) and VCE (<scalar>)
 are not supposed to result in the same outcome for identical
 bit size and patterns on big endian. But I might just still
 be missing something here.

 Olivier
Richard Guenther - April 26, 2012, 8:48 a.m.
On Wed, Apr 25, 2012 at 11:29 PM, Olivier Hainque <hainque@adacore.com> wrote:
> Thanks for your feedback Richard,
>
> On Apr 25, 2012, at 16:16 , Richard Guenther wrote:
>> I think much better would be to simply disallow any toplevel
>> VIEW_CONVERT_EXPR of BLKmode,
>
>> Does that fix your problems, too?  If so I prefer that.
>
>  Hmm, I think that this would fix the particular testscase at
>  hand but not quite the underlying issue I was aiming at.
>
>  The basic problem I am trying to address is SRA turning
>  VCE (<some access>) into VCE (<scalar>) where the replacement
>  <scalar> is of a different size than <some access>.
>
>  The idea of locating the check in create_access is that
>  this is where the <access> size is computed.
>
>  I'm pretty sure that excluding just on BLKmode for
>  TYPE(<access>) doesn't catch them all.
>
>  In particular, I'm pretty sure that we can get component
>  refs of integral modes that access a smaller range of bits
>  than what the mode conveys. It is common with packing or
>  rep clauses in Ada.

Yeah, well - the tree verification code in tree-cfg.c is not enforcing
any constraints here and the docs are not clear either.  My view is
that we don't want the size of the VIEW_CONVERT_EXPR differ
from the size of its operand - you should use a BIT_FIELD_REF
or a MEM_REF for that (yes, VIEW_CONVERT_EXPRs could be
removed or treated as short-cut for a BIT_FIELD_REF that covers
the whole object).

>  I'll see if I can come up with a case tomorrow.
>
>  I'm actually also slightly confused by the processing of
>  VCE inputs in SRA, as VCE (<composite>) and VCE (<scalar>)
>  are not supposed to result in the same outcome for identical
>  bit size and patterns on big endian. But I might just still
>  be missing something here.

Well, it's not clear to me either what the semantics of a VIEW_CONVERT_EXPR
is when you apply it to memory and the result is not of the same size as
the operand.  Naively a VIEW_CONVERT_EXPR is *(T *)&op?  Then a
VIEW_CONVERT_EXPR <T, op> would be the same as
MEM_REF <T, &op, (T' *)0> with the advantage that for the MEM_REF you
can specify the alias set that is supposed to be in effect.  Similar
it would be the same as BIT_FIELD_REF <T, &op, sizeof (T) * BITS_PER_UNIT, 0>.

Can you formally relate those three representations and tell me why
VIEW_CONVERT_EXPR is useful (not only convenient because of less operands)
to use on lvalues (thus memory, compared to registers or constants)?

Thanks,
Richard.

>  Olivier
>
Olivier Hainque - April 30, 2012, 2:18 p.m.
Hello Richard,

Thanks for the constructive exchange :-)

On Apr 26, 2012, at 10:48 , Richard Guenther wrote:
>>  In particular, I'm pretty sure that we can get component
>>  refs of integral modes that access a smaller range of bits
>>  than what the mode conveys. It is common with packing or
>>  rep clauses in Ada.
> 
> Yeah, well - the tree verification code in tree-cfg.c is not enforcing
> any constraints here and the docs are not clear either.  My view is
> that we don't want the size of the VIEW_CONVERT_EXPR differ from the
> size of its operand

 I agree the current situation is not ideal.

 What you suggest corresponds to what is currently documented
 (difference => undefined behavior), but I'm pretty sure that the Ada compiler
 relies on some cases to behave in some specific manner. For tagged types
 IIRC.

 It would certainly be nice to get rid of this discrepancy at some point.

 The issue I was trying to address was a bit different: SRA changing the
 VCE argument "access" size, which seems incorrect regardless of the source
 destination size consistency. I can see how they are connected though and
 will see if I can come up with a different approach.

[...]

> Naively a VIEW_CONVERT_EXPR is *(T *)&op?  Then a
> VIEW_CONVERT_EXPR <T, op> would be the same as
> MEM_REF <T, &op, (T' *)0> with the advantage that for the MEM_REF you
> can specify the alias set that is supposed to be in effect.  Similar
> it would be the same as BIT_FIELD_REF <T, &op, sizeof (T) * BITS_PER_UNIT, 0>.

> Can you formally relate those three representations and tell me why
> VIEW_CONVERT_EXPR is useful (not only convenient because of less operands)
> to use on lvalues (thus memory, compared to registers or constants)?

 I have ideas on how they are supposed to relate (corresponding to your
 intuitive understanding, what is documented), but MEM_REF and BIT_FIELD_REF
 were introduced much more recently though, so I'm pretty sure that there are
 details that escape me.
Olivier Hainque - May 3, 2012, 8:21 a.m.
On Apr 30, 2012, at 16:18 , Olivier Hainque wrote:
>> Can you formally relate those three representations and tell me why
>> VIEW_CONVERT_EXPR is useful (not only convenient because of less operands)
>> to use on lvalues (thus memory, compared to registers or constants)?
> 
> I have ideas on how they are supposed to relate (corresponding to your
> intuitive understanding, what is documented), but MEM_REF and BIT_FIELD_REF
> were introduced much more recently though, so I'm pretty sure that there are
> details that escape me.

 One area of potential difference came to mind yesterday: regarding the
 processing of type alignment differences. VCE to more aligned (of the
 same size) would make a temp copy to yield a correctly aligned object.

 Would MEM_REF do that as well ?
Richard Guenther - May 3, 2012, 10:24 a.m.
On Thu, May 3, 2012 at 10:21 AM, Olivier Hainque <hainque@adacore.com> wrote:
>
> On Apr 30, 2012, at 16:18 , Olivier Hainque wrote:
>>> Can you formally relate those three representations and tell me why
>>> VIEW_CONVERT_EXPR is useful (not only convenient because of less operands)
>>> to use on lvalues (thus memory, compared to registers or constants)?
>>
>> I have ideas on how they are supposed to relate (corresponding to your
>> intuitive understanding, what is documented), but MEM_REF and BIT_FIELD_REF
>> were introduced much more recently though, so I'm pretty sure that there are
>> details that escape me.
>
>  One area of potential difference came to mind yesterday: regarding the
>  processing of type alignment differences. VCE to more aligned (of the
>  same size) would make a temp copy to yield a correctly aligned object.
>
>  Would MEM_REF do that as well ?

A temporary copy?  You mean if I do VIEW_CONVERT_EXPR <double> (longlong)
on x86 with -malign-double (so long long is 4 byte aligned and double is 8 byte
aligned) the access guarantees only 4 byte alignment but if we "spill" that to
a temporary the temporary will use double as type and thus get larger alignment
of 8 bytes?

MEM_REF <double> (&longlong) would assume 8-byte alignment of longlong
unless you use a double type variant with 4-byte alignment.

Richard.
Olivier Hainque - May 3, 2012, 1:40 p.m.
On May 3, 2012, at 12:24 , Richard Guenther wrote:
>>  One area of potential difference came to mind yesterday: regarding the
>>  processing of type alignment differences. VCE to more aligned (of the
>>  same size) would make a temp copy to yield a correctly aligned object.
>> 
>>  Would MEM_REF do that as well ?
> 
> A temporary copy?

 At least in some cases.

>  You mean if I do VIEW_CONVERT_EXPR <double> (longlong)
> on x86 with -malign-double (so long long is 4 byte aligned and double is 8 byte
> aligned) the access guarantees only 4 byte alignment but if we "spill" that to
> a temporary the temporary will use double as type and thus get larger alignment
> of 8 bytes?

 Probably not on x86. I am referring to this piece of the RTL expansion:

 expand_expr_real_1()

    case VIEW_CONVERT_EXPR:
      ...
      /* At this point, OP0 is in the correct mode.  If the output type is
         such that the operand is known to be aligned, indicate that it is.
         Otherwise, we need only be concerned about alignment for non-BLKmode
         results.  */
      if (MEM_P (op0))
      ...
         else if (STRICT_ALIGNMENT
                   && mode != BLKmode
                   && MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (mode))
            {
              tree inner_type = TREE_TYPE (treeop0);
              HOST_WIDE_INT temp_size
                = MAX (int_size_in_bytes (inner_type),
                       (HOST_WIDE_INT) GET_MODE_SIZE (mode));
              rtx new_rtx
                = assign_stack_temp_for_type (mode, temp_size, 0, type);
              rtx new_with_op0_mode
                = adjust_address (new_rtx, GET_MODE (op0), 0);

              gcc_assert (!TREE_ADDRESSABLE (exp));

              if (GET_MODE (op0) == BLKmode)
                emit_block_move (new_with_op0_mode, op0,
                                 GEN_INT (GET_MODE_SIZE (mode)),
                                 (modifier == EXPAND_STACK_PARM
                                  ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));
              else
                emit_move_insn (new_with_op0_mode, op0);

              op0 = new_rtx;
            }

 Not sure how to trigger this though, and the documentation of
 VIEW_CONVERT_EXPR is pretty silent about it :-(

Patch

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c      (revision 186817)
+++ gcc/tree-sra.c      (working copy)
@@ -1001,9 +1001,10 @@  build_access_from_expr_1 (tree expr, gim

   /* We need to dive through V_C_Es in order to get the size of its parameter
      and not the result type.  Ada produces such statements.  We are also
-     capable of handling the topmost V_C_E but not any of those buried in other
-     handled components.  */
-  if (TREE_CODE (expr) == VIEW_CONVERT_EXPR)
+     capable of handling the topmost V_C_E if it has a suitable mode but
+     not any of those buried in other handled components.  */
+  if (TREE_CODE (expr) == VIEW_CONVERT_EXPR
+      && TYPE_MODE (TREE_TYPE (expr)) != BLKmode)
     expr = TREE_OPERAND (expr, 0);

   if (contains_view_convert_expr_p (expr))