From: Richard Henderson
Date: Thu, 22 Jul 2010 14:52:52 -0700
To: "H.J. Lu"
CC: Bernd Schmidt, GCC Patches, ubizjak@gmail.com
Subject: Fix target/45027 [was: x86_64 varargs setup jump table]

On 07/21/2010 04:15 PM, H.J. Lu wrote:
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45027

Fixed thus.

Honza and I had a discussion about this on IRC.  His original patch,

  http://gcc.gnu.org/ml/gcc-patches/2010-04/msg01120.html

had some problems in it, but worked anyway for a complicated set of
reasons.

The following patch removes the optimization that attempts to leave
the stack alignment at 64 bits rather than storing the varargs SSE
registers in full into a 128-bit-aligned slot.

The only optimization that still seems interesting to me would be to
properly track the true type of the data in each varargs slot.  In a
restricted set of circumstances we might then be able to copy-propagate
the incoming SSE register to its use.  That would require a significant
amount of work in pass_stdarg, and really doesn't seem worth the
effort.

r~

	* config/i386/i386.c (setup_incoming_varargs_64): Force the use
	of V4SFmode for the SSE saves; increase stack alignment if needed.
	(ix86_gimplify_va_arg): Don't increase stack alignment here.
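[Editor's note: for readers unfamiliar with this code path, here is a
minimal sketch of the kind of variadic code it handles.  This is an
illustration only, not the testcase from PR 45027.  The caller passes
the double arguments in SSE registers xmm0-xmm7; the prologue built by
setup_incoming_varargs_64 spills them into the register save area, and
va_arg reads them back from there.]

#include <stdarg.h>
#include <stdio.h>

/* Variadic callee: the incoming SSE argument registers are spilled
   into the register save area by the prologue; each va_arg of a
   floating-point type loads from that area.  */
static double
sum (int count, ...)
{
  va_list ap;
  double total = 0.0;
  int i;

  va_start (ap, count);
  for (i = 0; i < count; i++)
    total += va_arg (ap, double);
  va_end (ap);

  return total;
}

int
main (void)
{
  printf ("%f\n", sum (3, 1.0, 2.0, 3.0));	/* prints 6.000000 */
  return 0;
}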
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d9dc571..596a6db 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -7073,7 +7073,7 @@ setup_incoming_varargs_64 (CUMULATIVE_ARGS *cum)
 
   /* FPR size of varargs save area.  We don't need it if we don't pass
      anything in SSE registers.  */
-  if (cum->sse_nregs && cfun->va_list_fpr_size)
+  if (TARGET_SSE && cfun->va_list_fpr_size)
     ix86_varargs_fpr_size = X86_64_SSE_REGPARM_MAX * 16;
   else
     ix86_varargs_fpr_size = 0;
@@ -7112,12 +7112,13 @@ setup_incoming_varargs_64 (CUMULATIVE_ARGS *cum)
       emit_jump_insn (gen_cbranchqi4 (test, XEXP (test, 0), XEXP (test, 1),
                                       label));
 
-      /* If we've determined that we're only loading scalars (and not
-         vector data) then we can store doubles instead.  */
-      if (crtl->stack_alignment_needed < 128)
-        smode = DFmode;
-      else
-        smode = V4SFmode;
+      /* ??? If !TARGET_SSE_TYPELESS_STORES, would we perform better if
+         we used movdqa (i.e. TImode) instead?  Perhaps even better would
+         be if we could determine the real mode of the data, via a hook
+         into pass_stdarg.  Ignore all that for now.  */
+      smode = V4SFmode;
+      if (crtl->stack_alignment_needed < GET_MODE_ALIGNMENT (smode))
+        crtl->stack_alignment_needed = GET_MODE_ALIGNMENT (smode);
 
       max = cum->sse_regno + cfun->va_list_fpr_size / 16;
       if (max > X86_64_SSE_REGPARM_MAX)
@@ -7549,8 +7550,7 @@ ix86_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
         arg_boundary = MAX_SUPPORTED_STACK_ALIGNMENT;
 
       /* Care for on-stack alignment if needed.  */
-      if (arg_boundary <= 64
-          || integer_zerop (TYPE_SIZE (type)))
+      if (arg_boundary <= 64 || size == 0)
         t = ovf;
       else
         {
@@ -7561,9 +7561,8 @@ ix86_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
           t = build2 (BIT_AND_EXPR, TREE_TYPE (t), t, size_int (-align));
           t = fold_convert (TREE_TYPE (ovf), t);
 
-          if (crtl->stack_alignment_needed < arg_boundary)
-            crtl->stack_alignment_needed = arg_boundary;
         }
+
       gimplify_expr (&t, pre_p, NULL, is_gimple_val, fb_rvalue);
       gimplify_assign (addr, t, pre_p);
 
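[Editor's note: a sketch of the case that motivates always emitting the
full 128-bit V4SFmode (movaps) stores rather than 64-bit DFmode stores.
Again illustrative only, not part of the patch or the PR testcase.  A
vector argument passed through "..." occupies an entire SSE register,
so its save-area slot must hold all 128 bits and be 16-byte aligned for
the aligned load that va_arg performs.]

#include <stdarg.h>
#include <xmmintrin.h>

/* Fetch the first __m128 passed through "...".  The va_arg access is a
   full 16-byte load, so the slot the prologue wrote must contain the
   whole register, not just its low 64 bits, and must be 128-bit
   aligned.  */
__m128
first_vector (int count, ...)
{
  va_list ap;
  __m128 v;

  va_start (ap, count);
  v = va_arg (ap, __m128);
  va_end (ap);

  return v;
}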