diff mbox

Fix PR tree-optimization/51315

Message ID 201112062324.09783.ebotcazou@adacore.com
State New
Headers show

Commit Message

Eric Botcazou Dec. 6, 2011, 10:24 p.m. UTC
Hi,

this is a regression present on mainline and 4.6 branch at -O for the SPARC.  
The compiler generates an unaligned access for the memcpy call in:

  char *s

  union
  {
    int32_t i;
    char a[sizeof (int32_t)];
  }
  v;

  memcpy (v.a, s, sizeof (int32_t));

The memcpy call is folded to an assignment between a couple of MEM_REFs:

  MEM[(char * {ref-all})&v] = MEM[(char * {ref-all})s_1];

with type 'char a[4]'.  The problem is that SRA turns this into:

  v$i_27 = MEM[(char * {ref-all})s_1].i;

but s isn't aligned enough for an int32_t access.


I don't think we can scalarize in this case on strict-alignment targets.  The 
SRA pass already contains an alignment check, but it is purely internal to the 
RHS and this is clearly not sufficient anymore with MEM_REFs.

The proposed fix is to enhance this alignment check to take into account both 
the RHS and the LHS (much like the memcpy folder).  I think it subsumes the 
previous check, which could be viewed as a check involving the RHS and the 
type of the LHS.  But there is a hitch: get_object_alignment is conservative 
for MEM_REF (unlike for INDIRECT_REF) so a trick is needed in order not to 
pessimize in some cases.

Tested on SPARC/Solaris.  OK for mainline and 4.6 branch?


2011-12-06  Eric Botcazou  <ebotcazou@adacore.com>

	PR tree-optimization/51315
	* tree-sra.c (tree_non_mode_aligned_mem_p): Rename to...
	(tree_non_aligned_mem_p): ...this.  Add ALIGN parameter.  Look into
	MEM_REFs and deal with simple dereferences specially.
	(build_accesses_from_assign): Adjust for above change.
	(access_precludes_ipa_sra_p): Likewise.


2011-12-06  Eric Botcazou  <ebotcazou@adacore.com>

	* gcc.c-torture/execute/20111206-1.c: New test.

Comments

Richard Biener Dec. 7, 2011, 12:04 p.m. UTC | #1
On Tue, Dec 6, 2011 at 11:24 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> Hi,
>
> this is a regression present on mainline and 4.6 branch at -O for the SPARC.
> The compiler generates an unaligned access for the memcpy call in:
>
>  char *s
>
>  union
>  {
>    int32_t i;
>    char a[sizeof (int32_t)];
>  }
>  v;
>
>  memcpy (v.a, s, sizeof (int32_t));
>
> The memcpy call is folded to an assignment between a couple of MEM_REFs:
>
>  MEM[(char * {ref-all})&v] = MEM[(char * {ref-all})s_1];
>
> with type 'char a[4]'.  The problem is that SRA turns this into:
>
>  v$i_27 = MEM[(char * {ref-all})s_1].i;
>
> but s isn't aligned enough for an int32_t access.
>
>
> I don't think we can scalarize in this case on strict-alignment targets.  The
> SRA pass already contains an alignment check, but it is purely internal to the
> RHS and this is clearly not sufficient anymore with MEM_REFs.
>
> The proposed fix is to enhance this alignment check to take into account both
> the RHS and the LHS (much like the memcpy folder).  I think it subsumes the
> previous check, which could be viewed as a check involving the RHS and the
> type of the LHS.  But there is a hitch: get_object_alignment is conservative
> for MEM_REF (unlike for INDIRECT_REF) so a trick is needed in order not to
> pessimize in some cases.
>
> Tested on SPARC/Solaris.  OK for mainline and 4.6 branch?

You are basically (but non-optimally) locally re-implementing
what expr.c:get_object_or_type_alignment does, for the
bare MEM_REF case (you don't consider offsets that
do not change the alignment, nor alignment information
present on the SSA name).

Note that in expr.c gete_object_or_type_alignment is only
called on [TARGET_]MEM_REFs (I didn't export that function
exactly because of its restrictions, if we want to export it
we should assert it is only called for [TARGET_]MEM_REFs).

Thus, apart from the exp_align computation the patch looks ok
(it's only non-optimal, not incorrect).

Thanks,
Richard.

>
> 2011-12-06  Eric Botcazou  <ebotcazou@adacore.com>
>
>        PR tree-optimization/51315
>        * tree-sra.c (tree_non_mode_aligned_mem_p): Rename to...
>        (tree_non_aligned_mem_p): ...this.  Add ALIGN parameter.  Look into
>        MEM_REFs and deal with simple dereferences specially.
>        (build_accesses_from_assign): Adjust for above change.
>        (access_precludes_ipa_sra_p): Likewise.
>
>
> 2011-12-06  Eric Botcazou  <ebotcazou@adacore.com>
>
>        * gcc.c-torture/execute/20111206-1.c: New test.
>
>
> --
> Eric Botcazou
Eric Botcazou Dec. 7, 2011, 12:34 p.m. UTC | #2
> You are basically (but non-optimally) locally re-implementing
> what expr.c:get_object_or_type_alignment does, for the
> bare MEM_REF case (you don't consider offsets that
> do not change the alignment, nor alignment information
> present on the SSA name).

Gosh.  And now I distinctly remember suggesting that the function be moved to 
the same file as the others...

> Note that in expr.c gete_object_or_type_alignment is only
> called on [TARGET_]MEM_REFs (I didn't export that function
> exactly because of its restrictions, if we want to export it
> we should assert it is only called for [TARGET_]MEM_REFs).

OK, let's export it, move it to builtins.c and and add the assert.

> Thus, apart from the exp_align computation the patch looks ok
> (it's only non-optimal, not incorrect).

I'll post an adjusted version.  Thanks for the review.
diff mbox

Patch

Index: tree-sra.c
===================================================================
--- tree-sra.c	(revision 181993)
+++ tree-sra.c	(working copy)
@@ -1067,26 +1067,31 @@  disqualify_ops_if_throwing_stmt (gimple
   return false;
 }
 
-/* Return true iff type of EXP is not sufficiently aligned.  */
+/* Return true if EXP is a memory reference less aligned than ALIGN.  This is
+   invoked only on strict-alignment targets.  */
 
 static bool
-tree_non_mode_aligned_mem_p (tree exp)
+tree_non_aligned_mem_p (tree exp, unsigned int align)
 {
-  enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
-  unsigned int align;
+  unsigned int exp_align;
 
   if (TREE_CODE (exp) == VIEW_CONVERT_EXPR)
     exp = TREE_OPERAND (exp, 0);
 
-  if (TREE_CODE (exp) == SSA_NAME
-      || TREE_CODE (exp) == MEM_REF
-      || mode == BLKmode
-      || is_gimple_min_invariant (exp)
-      || !STRICT_ALIGNMENT)
+  if (TREE_CODE (exp) == SSA_NAME || is_gimple_min_invariant (exp))
     return false;
 
-  align = get_object_alignment (exp);
-  if (GET_MODE_ALIGNMENT (mode) > align)
+  /* get_object_alignment will fall back to BITS_PER_UNIT if it cannot
+     compute an explicit alignment.  Pretend that dereferenced pointers
+     are always aligned on strict-alignment targets.  */
+  if (TREE_CODE (exp) == MEM_REF
+      && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME
+      && integer_zerop (TREE_OPERAND (exp, 1)))
+    exp_align = TYPE_ALIGN (TREE_TYPE (exp));
+  else
+    exp_align = get_object_alignment (exp);
+
+  if (exp_align < align)
     return true;
 
   return false;
@@ -1120,7 +1125,9 @@  build_accesses_from_assign (gimple stmt)
   if (lacc)
     {
       lacc->grp_assignment_write = 1;
-      lacc->grp_unscalarizable_region |= tree_non_mode_aligned_mem_p (rhs);
+      if (STRICT_ALIGNMENT
+	  && tree_non_aligned_mem_p (rhs, get_object_alignment (lhs)))
+        lacc->grp_unscalarizable_region = 1;
     }
 
   if (racc)
@@ -1129,7 +1136,9 @@  build_accesses_from_assign (gimple stmt)
       if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
 	  && !is_gimple_reg_type (racc->type))
 	bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
-      racc->grp_unscalarizable_region |= tree_non_mode_aligned_mem_p (lhs);
+      if (STRICT_ALIGNMENT
+	  && tree_non_aligned_mem_p (lhs, get_object_alignment (rhs)))
+        racc->grp_unscalarizable_region = 1;
     }
 
   if (lacc && racc
@@ -3705,7 +3714,8 @@  access_precludes_ipa_sra_p (struct acces
 	  || gimple_code (access->stmt) == GIMPLE_ASM))
     return true;
 
-  if (tree_non_mode_aligned_mem_p (access->expr))
+  if (STRICT_ALIGNMENT
+      && tree_non_aligned_mem_p (access->expr, TYPE_ALIGN (access->type)))
     return true;
 
   return false;