diff mbox

PATCH: PR target/64905: unsigned short is loaded with 4-byte load (movl)

Message ID 20150202213947.GA27865@intel.com
State New
Headers show

Commit Message

H.J. Lu Feb. 2, 2015, 9:39 p.m. UTC
This patch fixes a long standing bug where aligned_operand ignores
alignment of memory operand less than 32 bits.  It drops address
decomposition and returns false if alignment of memory operand less
is than 32 bits. Tested on Linux/x86-64.   OK for trunk, 4.9 and 4.8
branches?


H.J.
---
gcc/

	PR target/64905
	* config/i386/predicates.md (aligned_operand): Don't decompose
	address.  Return false if alignment of memory operand is less
	than 32 bits.

gcc/testsuite/

	PR target/64905
	* gcc.target/i386/pr64905.c: New file.
---
 gcc/config/i386/predicates.md           | 33 +--------------------------------
 gcc/testsuite/gcc.target/i386/pr64905.c | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr64905.c

Comments

Uros Bizjak Feb. 4, 2015, 1:21 p.m. UTC | #1
On Mon, Feb 2, 2015 at 10:39 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> This patch fixes a long standing bug where aligned_operand ignores
> alignment of memory operand less than 32 bits.  It drops address
> decomposition and returns false if alignment of memory operand less
> is than 32 bits. Tested on Linux/x86-64.   OK for trunk, 4.9 and 4.8
> branches?

Can you please find some references in gcc mainlig lists why and for
what reason is the predicate written in the current way? Are there
some (older?) processors that require this approach, so a tuning flag
should be used here?

OTOH, this is not a regression, so not a stage-4 material.

Uros.
Uros Bizjak Feb. 4, 2015, 1:52 p.m. UTC | #2
On Wed, Feb 4, 2015 at 2:21 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Feb 2, 2015 at 10:39 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>> This patch fixes a long standing bug where aligned_operand ignores
>> alignment of memory operand less than 32 bits.  It drops address
>> decomposition and returns false if alignment of memory operand less
>> is than 32 bits. Tested on Linux/x86-64.   OK for trunk, 4.9 and 4.8
>> branches?
>
> Can you please find some references in gcc mainlig lists why and for
> what reason is the predicate written in the current way? Are there
> some (older?) processors that require this approach, so a tuning flag
> should be used here?

After some more thinking, it looks the failure is due to:

emit-rtl.c:  REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = STACK_BOUNDARY;

The testcase forces the pointer to %rbp (== HARD_FRAME_POINTER_REGNUM
in the above line), so the predicate thinks that the value is aligned,
since %rbp has its REGNO_POINTER_ALIGN set to STACK_BOUNDARY.

Looks like generic RTL infrastructure problem to me, the
REGNO_POINTER_ALIGNMENT of hard_frame_pointer should be cleared when
H_F_P is omitted and reused.

Please let's move discussion back to the PR.

Adding CC.

Uros.
diff mbox

Patch

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 0f314cc..98dbcba 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1095,9 +1095,6 @@ 
 (define_predicate "aligned_operand"
   (match_operand 0 "general_operand")
 {
-  struct ix86_address parts;
-  int ok;
-
   /* Registers and immediate operands are always "aligned".  */
   if (!MEM_P (op))
     return true;
@@ -1121,35 +1118,7 @@ 
       || GET_CODE (op) == POST_INC)
     return true;
 
-  /* Decode the address.  */
-  ok = ix86_decompose_address (op, &parts);
-  gcc_assert (ok);
-
-  if (parts.base && GET_CODE (parts.base) == SUBREG)
-    parts.base = SUBREG_REG (parts.base);
-  if (parts.index && GET_CODE (parts.index) == SUBREG)
-    parts.index = SUBREG_REG (parts.index);
-
-  /* Look for some component that isn't known to be aligned.  */
-  if (parts.index)
-    {
-      if (REGNO_POINTER_ALIGN (REGNO (parts.index)) * parts.scale < 32)
-	return false;
-    }
-  if (parts.base)
-    {
-      if (REGNO_POINTER_ALIGN (REGNO (parts.base)) < 32)
-	return false;
-    }
-  if (parts.disp)
-    {
-      if (!CONST_INT_P (parts.disp)
-	  || (INTVAL (parts.disp) & 3))
-	return false;
-    }
-
-  /* Didn't find one -- this must be an aligned address.  */
-  return true;
+  return false;
 })
 
 ;; Return true if OP is memory operand with a displacement.
diff --git a/gcc/testsuite/gcc.target/i386/pr64905.c b/gcc/testsuite/gcc.target/i386/pr64905.c
new file mode 100644
index 0000000..bc87d85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr64905.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-Os -ffixed-rax -ffixed-rbx -ffixed-rcx -ffixed-rdx -ffixed-rdi -ffixed-rsi -ffixed-r8 -ffixed-r9 -ffixed-r10 -ffixed-r11 -ffixed-r12 -ffixed-r13 -ffixed-r14 -ffixed-r15" } */
+/* { dg-final { scan-assembler-not "movl\[ \t\]0\\(%.*\\), %.*" } } */
+
+typedef unsigned short uint16_t;
+uint16_t a_global;
+
+void __attribute__ ((noinline))
+function (uint16_t **a_p)
+{
+  // unaligned access by address in %rbp: mov    0x0(%rbp),%ebp
+  a_global = **a_p;
+}
+
+int main(int argc, char **argv)
+{
+  uint16_t array [4] = { 1, 2, 3, 4 };
+  uint16_t *array_elem_p = &array [3];
+
+  function (&array_elem_p);
+  return 0;
+}