From patchwork Mon Nov 19 03:54:31 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans-Peter Nilsson X-Patchwork-Id: 199919 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 767CB2C008E for ; Mon, 19 Nov 2012 14:54:46 +1100 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1353902087; h=Comment: DomainKey-Signature:Received:Received:Received:Received:Received: Date:From:To:cc:Subject:In-Reply-To:Message-ID:References: User-Agent:MIME-Version:Content-Type:Mailing-List:Precedence: List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender: Delivered-To; bh=t5eK2gby6LMZEKg50eVNP9Tw2VA=; b=LToCJEa5aDU5yqq liLCiFJtDFdn7iEvmevIzVZJVdSs+bQajPZUdmogM9HaHYc0bYyekSuODYNJZuPk NsnVUvNND3uJjY6HJ1tyKjxvEHGcFrvGEg6EkhH34hX0QDGBXWMfeWLiDlhHYRnN QIsFwCu8yls9HHWZuSAqjb1DvfRA= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:Date:From:To:cc:Subject:In-Reply-To:Message-ID:References:User-Agent:MIME-Version:Content-Type:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=Rrg83DpY3W0PPFzXzNsIQNX6gFwZ2Zof/x4UPfAQprH8jVYQXa4SllsQ5Lif40 5dtHnE72xqxe0UU/BZ/HxRuDsemqRSAIllRoJ7uEqf8BmqqROeRrenxiQrljnXXa IBTCUTa+eBMgx2hovHZae4uvSE1vYzY6WNyR9CLl0dZ7o=; Received: (qmail 3942 invoked by alias); 19 Nov 2012 03:54:41 -0000 Received: (qmail 3924 invoked by uid 22791); 19 Nov 2012 03:54:41 -0000 X-SWARE-Spam-Status: No, hits=-2.9 required=5.0 tests=AWL, BAYES_00, KHOP_THREADED X-Spam-Check-By: sourceware.org Received: from dair.pair.com (HELO dair.pair.com) (209.68.1.49) by sourceware.org (qpsmtpd/0.43rc1) with SMTP; Mon, 19 Nov 2012 03:54:32 +0000 Received: (qmail 85545 invoked by uid 20157); 19 Nov 2012 03:54:31 -0000 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 19 Nov 2012 03:54:31 -0000 Date: Sun, 18 Nov 2012 22:54:31 -0500 (EST) From: Hans-Peter Nilsson To: Eric Botcazou cc: gcc-patches@gcc.gnu.org, Alexandre Oliva Subject: Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver In-Reply-To: <13011180.NBQR3vZSIa@polaris> Message-ID: References: <13011180.NBQR3vZSIa@polaris> User-Agent: Alpine 2.02 (BSF 1266 2009-07-14) MIME-Version: 1.0 X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org On Mon, 12 Nov 2012, Eric Botcazou wrote: > > This is a target-specific blockage insn, but with the general form > > found in all targets defining it. The default blockage is an empty > > asm-volatile, which is what cse_insn recognized. The blockage insn is > > there to "prevent scheduling" of the critical insns and register > > values. It's almost obvious that an unspec_volatile should have that > > effect "too"; at least that's intended by the code in > > expand_builtin_setjmp_receiver. Luckily (or unluckily regarding the > > presence of the bug) *this* cse code is not run post-frame-layout > > (post-reload-cse does not use this code), or it seems people would > > soon notice register values used from the wrong side of the blockage, > > considering its critical use at the restore of the stack-pointer. > > As mentioned in the previous patch, > > , clobbering > > the frame-pointer (and then using it) does not seem the logical > > alternative to the patch below; the blockage insn should just do its job. > > Agreed. > > > I updated comments and documentation so it's more apparent that > > register uses should also not be moved across the blockage; see the > > patched comments. > > > > Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after > > 192677. Ok to commit? > > > > gcc: > > PR middle-end/55030 > > * builtins.c (expand_builtin_setjmp_receiver): Update comment > > regarding purpose of blockage. > > * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for > > the head comment. > > * doc/md.texi (blockage): Update similarly. Change wording to > > require one of two forms, rather than implying a wider choice. > > * cse.c (cse_insn): Where checking for blocking insns, treat > > UNSPEC_VOLATILE as blocking, besides volatile ASM. > > That's fine on principle, but there is a predicate for this (volatile_insn_p) > so I think we should use it here. Moreover, cselib_process_insn has the same > check so we should adjust it as well, which in turn means that dse.c:scan_insn > should probably be adjusted too. OK with these changes, thanks. Doh, I should have looked for that construct in other places. Thanks for the review. Here's an updated patch with that. Unfortunately, it causes regressions; read on for a very brief analysis. For x86_64-linux (base multilib): Running /home/hp/gcctop/tmp/pr55030-2n/gcc/gcc/testsuite/gcc.dg/guality/guality.exp ... ... FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg7 == 30 And for -m32: FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 12 arg1 == 1 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg1 == 1 I looked into gcc.dg/guality/pr36728-1.c at base -O1. The generated assembly code modulo debug info, is the same. The value of arg7 is optimized out, says gdb in gcc.log. The problem doesn't seem to be any md-generated frame-related-barrier as was my first guess, but the volatile asms in the source(!). The first one looks like this (and the second one similar) in pr36728-1.c.168r.cse1 (r193583): (insn 26 25 27 2 (parallel [ (set (mem/c:SI (plus:DI (reg/f:DI 20 frame) (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4 A256]) (asm_operands/v:SI ("") ("=m") 0 [ (mem/c:SI (plus:DI (reg/f:DI 20 frame) (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4 A256]) ] [ (asm_input:SI ("m") (null):0) ] [] /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12)) (clobber (reg:QI 18 fpsr)) (clobber (reg:QI 17 flags)) ]) /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12 -1 (nil)) It's not caught by the previous test: - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) ...but since it is a volatile asm (as seen in the source and by the /v on the asm_operands) volatile_insn_p does catch it (as expected and intended) and down the road for some reason, gcc can't recover the arg7 contents. Somewhat expected, but this volatile asm is also more volatile than intended; a clobber list for example as seen above inserted by the md, is now redundant. I'm not sure what to do with this. Try changing volatile_insn_p adding a parameter to optionally allow volatile asms with outputs to pass? But then, when *should* that distinction be done, to let such volatile asms be allowed to pass as not-really-as-volatile-as-we-look-for? I'd think "never" for any of the the patched code, or maybe "only when looking at effects on memory". Maybe there's some known gotcha in debug handling. I'm CC:ing Alexandre, in case he happens to have input to this off the top of his head (or has copious spare time :) PS. without the dse.c and cselib.c changes, it passed testing. gcc: PR middle-end/55030 * builtins.c (expand_builtin_setjmp_receiver): Update comment regarding purpose of blockage. * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for the head comment. * rtlanal.c (volatile_insn_p): Ditto. * doc/md.texi (blockage): Update similarly. Change wording to require one of two forms, rather than implying a wider choice. * cse.c (cse_insn): Where checking for blocking insns, use volatile_insn_p instead of manual check for volatile ASM. * dse.c (scan_insn): Ditto. * cselib.c (cselib_process_insn): Ditto. brgds, H-P Index: gcc/dse.c =================================================================== --- gcc/dse.c (revision 192677) +++ gcc/dse.c (working copy) @@ -2522,8 +2522,7 @@ scan_insn (bb_info_t bb_info, rtx insn) /* Cselib clears the table for this case, so we have to essentially do the same. */ if (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) + && volatile_insn_p (PATTERN (insn))) { add_wild_read (bb_info); insn_info->cannot_delete = true; Index: gcc/rtlanal.c =================================================================== --- gcc/rtlanal.c (revision 192677) +++ gcc/rtlanal.c (working copy) @@ -2083,8 +2083,8 @@ remove_node_from_expr_list (const_rtx no /* Nonzero if X contains any volatile instructions. These are instructions which may cause unpredictable machine state instructions, and thus no - instructions should be moved or combined across them. This includes - only volatile asms and UNSPEC_VOLATILE instructions. */ + instructions or register uses should be moved or combined across them. + This includes only volatile asms and UNSPEC_VOLATILE instructions. */ int volatile_insn_p (const_rtx x) Index: gcc/builtins.c =================================================================== --- gcc/builtins.c (revision 192677) +++ gcc/builtins.c (working copy) @@ -963,7 +963,8 @@ expand_builtin_setjmp_receiver (rtx rece /* We must not allow the code we just generated to be reordered by scheduling. Specifically, the update of the frame pointer must - happen immediately, not later. */ + happen immediately, not later. Similarly, we must block + (frame-related) register values to be used across this code. */ emit_insn (gen_blockage ()); } Index: gcc/cselib.c =================================================================== --- gcc/cselib.c (revision 192677) +++ gcc/cselib.c (working copy) @@ -2607,13 +2607,12 @@ cselib_process_insn (rtx insn) cselib_current_insn = insn; - /* Forget everything at a CODE_LABEL, a volatile asm, or a setjmp. */ + /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp. */ if (LABEL_P (insn) || (CALL_P (insn) && find_reg_note (insn, REG_SETJMP, NULL)) || (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn)))) + && volatile_insn_p (PATTERN (insn)))) { cselib_reset_table (next_uid); cselib_current_insn = NULL_RTX; Index: gcc/cse.c =================================================================== --- gcc/cse.c (revision 192677) +++ gcc/cse.c (working copy) @@ -5660,10 +5660,9 @@ cse_insn (rtx insn) invalidate (XEXP (dest, 0), GET_MODE (dest)); } - /* A volatile ASM invalidates everything. */ + /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything. */ if (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) + && volatile_insn_p (PATTERN (insn))) flush_hash_table (); /* Don't cse over a call to setjmp; on some machines (eg VAX) Index: gcc/emit-rtl.c =================================================================== --- gcc/emit-rtl.c (revision 192677) +++ gcc/emit-rtl.c (working copy) @@ -364,8 +364,8 @@ get_reg_attrs (tree decl, int offset) #if !HAVE_blockage -/* Generate an empty ASM_INPUT, which is used to block attempts to schedule - across this insn. */ +/* Generate an empty ASM_INPUT, which is used to block attempts to schedule, + and to block register equivalences to be seen across this insn. */ rtx gen_blockage (void) Index: gcc/doc/md.texi =================================================================== --- gcc/doc/md.texi (revision 192677) +++ gcc/doc/md.texi (working copy) @@ -5832,8 +5832,9 @@ the values of operands 1 and 2. @item @samp{blockage} This pattern defines a pseudo insn that prevents the instruction -scheduler from moving instructions across the boundary defined by the -blockage insn. Normally an UNSPEC_VOLATILE pattern. +scheduler and other passes from moving instructions and using register +equivalences across the boundary defined by the blockage insn. +This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM. @cindex @code{memory_barrier} instruction pattern @item @samp{memory_barrier}