From patchwork Mon Nov 12 04:49:34 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans-Peter Nilsson X-Patchwork-Id: 198317 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id A10452C0086 for ; Mon, 12 Nov 2012 15:49:47 +1100 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1353300590; h=Comment: DomainKey-Signature:Received:Received:Received:Received:Received: Date:From:To:Subject:Message-ID:User-Agent:MIME-Version: Content-Type:Mailing-List:Precedence:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:Sender:Delivered-To; bh=Pa0mXfH Ek4DFNQcNVh25X+/0eNY=; b=V6uHaT0pVZSq2zRuExuuiJr1sDCp0WFQ9xB4g6H /rKelAZaCE7PSLkyL0YIgQkkYlj0S0AQOTR/y2t9pEx3h1LYBQsA8xkYUN28mZC9 0t6+akhu72GGHEPOG7GDOwyymumIcHJf4vXmLZFDzMz96L14Eno83oRMFAjKPX1j 5wtk= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:Date:From:To:Subject:Message-ID:User-Agent:MIME-Version:Content-Type:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=UI0rGbmG2XCoxL/ps6Rjq2Y3dV0cVe5t7XPnpVaHn7eZOZlOKnhEudlQ04DyVD Ff7rrSmz9SFsPzNPLCJ4mNjCjIxtTPi3xkHYstj8f3tRb9Ms7QmB11WspRowfaOx IXc0aaGrfa7pk7aVg8aaFJNUo53W21y+AtUUC+syOKxnQ=; Received: (qmail 31207 invoked by alias); 12 Nov 2012 04:49:42 -0000 Received: (qmail 31198 invoked by uid 22791); 12 Nov 2012 04:49:41 -0000 X-SWARE-Spam-Status: No, hits=-2.1 required=5.0 tests=AWL,BAYES_00 X-Spam-Check-By: sourceware.org Received: from dair.pair.com (HELO dair.pair.com) (209.68.1.49) by sourceware.org (qpsmtpd/0.43rc1) with SMTP; Mon, 12 Nov 2012 04:49:36 +0000 Received: (qmail 27645 invoked by uid 20157); 12 Nov 2012 04:49:34 -0000 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 12 Nov 2012 04:49:34 -0000 Date: Sun, 11 Nov 2012 23:49:34 -0500 (EST) From: Hans-Peter Nilsson To: gcc-patches@gcc.gnu.org Subject: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver Message-ID: User-Agent: Alpine 2.02 (BSF 1266 2009-07-14) MIME-Version: 1.0 X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org The problem exposed in PR55030 (repeatable on x86_64-linux with -m32 at r192676) is that the fake-frame-pointer "frame" is replaced with the actual-frame-pointer "bp" in cse1, around the critical insn in __builtin_setjmp_receiver that restores their defined offset. The patch in PR55030/r192676 removed a clobber of the frame-pointer, and cse1 felt free to replace the former register with the latter, as in the following expansion of __builtin_setjmp_receiver (in builtins.c: expand_builtin_setjmp_receiver) from a reduced test-case, see the PR. (You might find the setup/restore code having an unexpected order; there are two __builtin_setjmps in series and the setup of the second one is storing the wrong value for the frame-pointer.) The pre-transformation dump in cse1 says: (code_label/s 13 51 14 3 2 "" [2 uses]) (note 14 13 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK) (insn 15 14 78 3 (use (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1 (nil)) (insn 78 15 17 3 (set (reg/f:SI 20 frame) (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1 (nil)) (insn 17 78 56 3 (unspec_volatile [ (const_int 0 [0]) ] UNSPECV_BLOCKAGE) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 643 {blockage} (nil)) (insn 56 17 57 3 (set (reg/f:SI 74) (symbol_ref:SI ("chk_fail_buf") [flags 0x40] )) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal} (nil)) (insn 57 56 58 3 (set (mem:SI (reg/f:SI 74) [5 S4 A8]) (reg/f:SI 20 frame)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal} (nil)) Before r192676 and after the reversion in r192701, there was/is that weird extra "(clobber (reg/f:SI 6 bp)" inserted before insn 17. But without that, we notice post-cse1-transformation, a change in insn 57: (code_label/s 13 51 14 3 2 "" [3 uses]) (note 14 13 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK) (insn 15 14 78 3 (use (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1 (nil)) (insn 78 15 17 3 (set (reg/f:SI 20 frame) (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1 (nil)) (insn 17 78 56 3 (unspec_volatile [ (const_int 0 [0]) ] UNSPECV_BLOCKAGE) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 643 {blockage} (nil)) (insn 56 17 57 3 (set (reg/f:SI 74) (symbol_ref:SI ("chk_fail_buf") [flags 0x40] )) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal} (nil)) (insn 57 56 58 3 (set (mem:SI (reg/f:SI 74) [5 S4 A8]) (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal} (nil)) It seemed wrong for this change to be stopped *only* by that (clobber bp). Stepping through the code for insn 78 and 57, I was looking for related special-casing of frame- and other similar registers in cse.c and was a bit confused when I found none (not counting hash_rtx_cb). (I admit lack of special-casing is a good sign, though. :) Then I was blinded by the obvious; the next insn, insn 17, is a "blockage" insn. This is a target-specific blockage insn, but with the general form found in all targets defining it. The default blockage is an empty asm-volatile, which is what cse_insn recognized. The blockage insn is there to "prevent scheduling" of the critical insns and register values. It's almost obvious that an unspec_volatile should have that effect "too"; at least that's intended by the code in expand_builtin_setjmp_receiver. Luckily (or unluckily regarding the presence of the bug) *this* cse code is not run post-frame-layout (post-reload-cse does not use this code), or it seems people would soon notice register values used from the wrong side of the blockage, considering its critical use at the restore of the stack-pointer. As mentioned in the previous patch, , clobbering the frame-pointer (and then using it) does not seem the logical alternative to the patch below; the blockage insn should just do its job. I updated comments and documentation so it's more apparent that register uses should also not be moved across the blockage; see the patched comments. Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after 192677. Ok to commit? gcc: PR middle-end/55030 * builtins.c (expand_builtin_setjmp_receiver): Update comment regarding purpose of blockage. * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for the head comment. * doc/md.texi (blockage): Update similarly. Change wording to require one of two forms, rather than implying a wider choice. * cse.c (cse_insn): Where checking for blocking insns, treat UNSPEC_VOLATILE as blocking, besides volatile ASM. brgds, H-P Index: gcc/builtins.c =================================================================== --- gcc/builtins.c (revision 192677) +++ gcc/builtins.c (working copy) @@ -963,7 +963,8 @@ expand_builtin_setjmp_receiver (rtx rece /* We must not allow the code we just generated to be reordered by scheduling. Specifically, the update of the frame pointer must - happen immediately, not later. */ + happen immediately, not later. Similarly, we must block + (frame-related) register values to be used across this code. */ emit_insn (gen_blockage ()); } Index: gcc/cse.c =================================================================== --- gcc/cse.c (revision 192677) +++ gcc/cse.c (working copy) @@ -5660,10 +5660,11 @@ cse_insn (rtx insn) invalidate (XEXP (dest, 0), GET_MODE (dest)); } - /* A volatile ASM invalidates everything. */ + /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything. */ if (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) + && ((GET_CODE (PATTERN (insn)) == ASM_OPERANDS + && MEM_VOLATILE_P (PATTERN (insn))) + || GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE)) flush_hash_table (); /* Don't cse over a call to setjmp; on some machines (eg VAX) Index: gcc/emit-rtl.c =================================================================== --- gcc/emit-rtl.c (revision 192677) +++ gcc/emit-rtl.c (working copy) @@ -364,8 +364,8 @@ get_reg_attrs (tree decl, int offset) #if !HAVE_blockage -/* Generate an empty ASM_INPUT, which is used to block attempts to schedule - across this insn. */ +/* Generate an empty ASM_INPUT, which is used to block attempts to schedule, + and to block register equivalences to be seen across this insn. */ rtx gen_blockage (void) Index: gcc/doc/md.texi =================================================================== --- gcc/doc/md.texi (revision 192677) +++ gcc/doc/md.texi (working copy) @@ -5832,8 +5832,9 @@ the values of operands 1 and 2. @item @samp{blockage} This pattern defines a pseudo insn that prevents the instruction -scheduler from moving instructions across the boundary defined by the -blockage insn. Normally an UNSPEC_VOLATILE pattern. +scheduler and other passes from moving instructions and using register +equivalences across the boundary defined by the blockage insn. +This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM. @cindex @code{memory_barrier} instruction pattern @item @samp{memory_barrier}