From patchwork Tue Sep 21 09:49:02 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 65289
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
    by ozlabs.org (Postfix) with SMTP id 1C5FBB6F01
    for ; Tue, 21 Sep 2010 19:48:10 +1000 (EST)
Received: (qmail 19170 invoked by alias); 21 Sep 2010 09:48:08 -0000
Received: (qmail 19155 invoked by uid 22791); 21 Sep 2010 09:48:07 -0000
X-SWARE-Spam-Status: No, hits=-6.2 required=5.0
    tests=AWL, BAYES_00, RCVD_IN_DNSWL_HI, SPF_HELO_PASS, T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28)
    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP;
    Tue, 21 Sep 2010 09:48:02 +0000
Received: from int-mx03.intmail.prod.int.phx2.redhat.com
    (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.16])
    by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o8L9m07w007012
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
    Tue, 21 Sep 2010 05:48:01 -0400
Received: from tyan-ft48-01.lab.bos.redhat.com
    (tyan-ft48-01.lab.bos.redhat.com [10.16.42.4])
    by int-mx03.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP
    id o8L9lt3F009392
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Tue, 21 Sep 2010 05:47:57 -0400
Received: from tyan-ft48-01.lab.bos.redhat.com
    (tyan-ft48-01.lab.bos.redhat.com [127.0.0.1])
    by tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4) with ESMTP
    id o8L9n366028687; Tue, 21 Sep 2010 11:49:03 +0200
Received: (from jakub@localhost)
    by tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4/Submit)
    id o8L9n3u8028686; Tue, 21 Sep 2010 11:49:03 +0200
Date: Tue, 21 Sep 2010 11:49:02 +0200
From: Jakub Jelinek
To: "H.J. Lu"
Cc: Richard Henderson, Richard Guenther, gcc-patches@gcc.gnu.org
Subject: Re: 4.4/4.5 PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression]
    crash on vector code with -m32 -msse
Message-ID: <20100921094902.GQ1269@tyan-ft48-01.lab.bos.redhat.com>
Reply-To: Jakub Jelinek
References:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.20 (2009-12-10)
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

On Mon, Sep 20, 2010 at 04:45:07PM -0700, H.J. Lu wrote:
> Here is the patch for 4.4/4.5. fold_builtin_memory_op is very different
> in 4.4/4.5. Simple backport doesn't work. Does this patch make
> any senses?

Your builtins.c part doesn't make sense. The code already handles the
alignment, see the

          srctype = build_qualified_type (desttype, 0);
          if (src_align < (int) TYPE_ALIGN (srctype))
            {
              if (AGGREGATE_TYPE_P (srctype)
                  || SLOW_UNALIGNED_ACCESS (TYPE_MODE (srctype), src_align))
                return NULL_TREE;

              srctype = build_variant_type_copy (srctype);
              TYPE_ALIGN (srctype) = src_align;
              TYPE_USER_ALIGN (srctype) = 1;
              TYPE_PACKED (srctype) = 1;
            }

hunk and the similar hunk for desttype.
The difference is just that 4.6 creates a valid unaligned MEM_REF while
4.5 creates a valid unaligned VIEW_CONVERT_EXPR (VCE). As you can see
above, in 4.5 and earlier it was using a packed type properly regardless
of whether the target was STRICT_ALIGNMENT or not.

The problem is at expansion time: the VCE isn't expanded properly.

So, IMHO we need something like this (so far untested; 4.5 version):

2010-09-21  Jakub Jelinek

	PR middle-end/45678
	* expr.c (expand_expr_real_1) : If op0 isn't sufficiently
	aligned and there is a movmisalignM insn for mode, use it to load
	op0 into a temporary register.

	Backport from mainline
	2010-09-20  Jakub Jelinek

	PR middle-end/45678
	* cfgexpand.c (expand_one_stack_var_at): Limit alignment to
	crtl->max_used_stack_slot_alignment.

2010-09-21  Jakub Jelinek

	Backport from mainline
	2010-09-17  Richard Guenther
		    H.J. Lu

	PR middle-end/45678
	* gcc.dg/torture/pr45678-1.c: New.
	* gcc.dg/torture/pr45678-2.c: Likewise.

	Jakub

--- gcc/expr.c.jj	2010-09-20 22:42:42.000000000 +0200
+++ gcc/expr.c	2010-09-21 11:31:26.286778101 +0200
@@ -9387,10 +9387,32 @@ expand_expr_real_1 (tree exp, rtx target
          results.  */
       if (MEM_P (op0))
         {
+          enum insn_code icode;
           op0 = copy_rtx (op0);
 
           if (TYPE_ALIGN_OK (type))
             set_mem_align (op0, MAX (MEM_ALIGN (op0), TYPE_ALIGN (type)));
+          else if (mode != BLKmode
+                   && MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (mode)
+                   /* If the target does have special handling for unaligned
+                      loads of mode then use them.  */
+                   && ((icode = optab_handler (movmisalign_optab,
+                                               mode)->insn_code)
+                       != CODE_FOR_nothing))
+            {
+              rtx reg, insn;
+
+              op0 = adjust_address (op0, mode, 0);
+              /* We've already validated the memory, and we're creating a
+                 new pseudo destination.  The predicates really can't
+                 fail.  */
+              reg = gen_reg_rtx (mode);
+
+              /* Nor can the insn generator.  */
+              insn = GEN_FCN (icode) (reg, op0);
+              emit_insn (insn);
+              return reg;
+            }
           else if (STRICT_ALIGNMENT
                    && mode != BLKmode
                    && MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (mode))
--- gcc/cfgexpand.c.jj	2010-06-11 11:06:01.000000000 +0200
+++ gcc/cfgexpand.c	2010-09-21 11:36:58.331377579 +0200
@@ -705,7 +705,7 @@ static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
   /* Alignment is unsigned.   */
-  unsigned HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align, max_align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
@@ -722,10 +722,9 @@ expand_one_stack_var_at (tree decl, HOST
   offset -= frame_phase;
   align = offset & -offset;
   align *= BITS_PER_UNIT;
-  if (align == 0)
-    align = STACK_BOUNDARY;
-  else if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
-    align = MAX_SUPPORTED_STACK_ALIGNMENT;
+  max_align = crtl->max_used_stack_slot_alignment;
+  if (align == 0 || align > max_align)
+    align = max_align;
 
   DECL_ALIGN (decl) = align;
   DECL_USER_ALIGN (decl) = 0;
--- gcc/testsuite/gcc.dg/torture/pr45678-1.c.jj	2010-09-21 11:37:37.744364834 +0200
+++ gcc/testsuite/gcc.dg/torture/pr45678-1.c	2010-09-17 16:40:41.000000000 +0200
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+
+typedef float V __attribute__ ((vector_size (16)));
+V g;
+float d[4] = { 4, 3, 2, 1 };
+
+int
+main ()
+{
+  V e;
+  __builtin_memcpy (&e, &d, sizeof (d));
+  V f = { 5, 15, 25, 35 };
+  e = e * f;
+  g = e;
+  return 0;
+}
--- gcc/testsuite/gcc.dg/torture/pr45678-2.c.jj	2010-09-21 11:37:41.167614502 +0200
+++ gcc/testsuite/gcc.dg/torture/pr45678-2.c	2010-09-18 19:50:55.000000000 +0200
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+
+typedef float V __attribute__ ((vector_size (16)));
+V g;
+
+int
+main ()
+{
+  float d[4] = { 4, 3, 2, 1 };
+  V e;
+  __builtin_memcpy (&e, &d, sizeof (d));
+  V f = { 5, 15, 25, 35 };
+  e = e * f;
+  g = e;
+  return 0;
+}
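
A note on the cfgexpand.c arithmetic, for anyone reading the hunk cold:
offset & -offset isolates the lowest set bit of the frame offset, i.e. the
largest power of two dividing it, and the patch then caps the result at the
alignment the frame is known to provide. Below is a minimal standalone
sketch of that computation only; it is not GCC code. The name
slot_align_bits is hypothetical, CHAR_BIT stands in for BITS_PER_UNIT, and
the max_align_bits parameter stands in for
crtl->max_used_stack_slot_alignment.

```c
#include <limits.h>  /* CHAR_BIT: assumed stand-in for GCC's BITS_PER_UNIT */

/* Hypothetical helper, not part of GCC: return the alignment, in bits,
   that may be assumed for a stack slot at byte OFFSET, capped at
   MAX_ALIGN_BITS.  */
unsigned long long
slot_align_bits (long long offset, unsigned long long max_align_bits)
{
  /* Work in unsigned arithmetic so that negating the offset is well
     defined; frame offsets are often negative.  */
  unsigned long long uoff = (unsigned long long) offset;

  /* x & -x keeps only the lowest set bit of x, which is the largest
     power of two that divides the offset (0 if the offset is 0).  */
  unsigned long long align = uoff & (0ULL - uoff);

  /* Convert from bytes to bits.  */
  align *= CHAR_BIT;

  /* Offset 0 tells us nothing, and a larger value would promise more
     than the frame guarantees; fall back to the cap in both cases.  */
  if (align == 0 || align > max_align_bits)
    align = max_align_bits;

  return align;
}
```

With a cap of 128 bits, an offset of -24 yields 64, while an offset of -32
would compute 256 and is capped to 128; an offset of 0 also yields the cap.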