From patchwork Mon Jan 30 16:57:23 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Georg-Johann Lay X-Patchwork-Id: 138613 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 1F05DB6EFE for ; Tue, 31 Jan 2012 03:58:02 +1100 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1328547483; h=Comment: DomainKey-Signature:Received:Received:Received:Received: Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type:Mailing-List:Precedence: List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender: Delivered-To; bh=be9OcICpPK+/VJ/hkFD30G8BdUs=; b=qgEnl28JlUbVIF8 Ydjd8KuU22u+NQXvu8FT34cAFApdcRWnZFEAlFNtphxwqzEIqQnc47BF/SXXFDNW zlZhX84vjV0VXO09VNHeZAVgO8TMRp8fSLgAbePkljmcKyd409PhqDwQdcIOBxtV vojEnqYdp2qvBVl99ab4EX1z6Nl8= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:X-RZG-AUTH:X-RZG-CLASS-ID:Received:Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject:References:In-Reply-To:Content-Type:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=PShNDe42YLYZUz3eZHzVZmgSL+5uSJn1jLB7pxu1N4c9/2nO8e5CUY7MyPAd0c UIWhaas8b67sH0pvSRJC6zSF2w1DrNJkU61yEPLHodsTZBV65wtbSLeb5PYbA5jE FGLvshycgI0NKHFUj1pT6T09Z1qe0k/pTFG+0Ct9PZ1IY=; Received: (qmail 22818 invoked by alias); 30 Jan 2012 16:57:54 -0000 Received: (qmail 22802 invoked by uid 22791); 30 Jan 2012 16:57:51 -0000 X-SWARE-Spam-Status: No, hits=-1.5 required=5.0 tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, RCVD_IN_DNSWL_NONE X-Spam-Check-By: sourceware.org Received: from mo-p00-ob.rzone.de (HELO mo-p00-ob.rzone.de) (81.169.146.161) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 30 Jan 2012 16:57:37 +0000 X-RZG-AUTH: :LXoWVUeid/7A29J/hMvvT2k715jHQaJercGObUOFkj18odoYNahU4Q== X-RZG-CLASS-ID: mo00 Received: from [192.168.0.22] (business-188-111-022-002.static.arcor-ip.net [188.111.22.2]) by smtp.strato.de (jimi mo25) (RZmta 27.6 AUTH) with ESMTPA id 2011dco0UGSBp2 ; Mon, 30 Jan 2012 17:57:24 +0100 (MET) Message-ID: <4F26CBF3.5050806@gjlay.de> Date: Mon, 30 Jan 2012 17:57:23 +0100 From: Georg-Johann Lay User-Agent: Thunderbird 2.0.0.24 (X11/20100302) MIME-Version: 1.0 To: gcc-patches@gcc.gnu.org CC: rdsandiford@googlemail.com Subject: Re: [Patch] PR51374: combine.c and volatile correctness References: <4F205FC3.1020107@gjlay.de> <87obtm94zr.fsf@firetop.home> In-Reply-To: <87obtm94zr.fsf@firetop.home> X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Richard Sandiford wrote: > Georg-Johann Lay writes: >> This patch fixes PR51374 by more strictly updating mem_last_set. >> Sloppy handling of mem_last_set can lead to error in volatile correctness >> because combine then is allowed to drag one volatile access over an other >> volatile thing (volatile access, asm volatile, unspec_volatile...). >> >> As explained in the source comment, any volatile might change memory, even a >> volatile read. > > This looks too broad to me. Volatile MEMs don't usually alias > non-volatile ones (so for example, reads from volatile memory shouldn't > affect reads from non-volatile memory). The patch tried to fix the issue. If there is less invasive solutions, that's always preferred, of course. > What do you think about instead changing: > > /* If there are any volatile insns between INSN and I3, reject, because > they might affect machine state. */ > > for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p)) > if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN (p))) > return 0; > > to: > > /* If INSN contains volatile references (specifically volatile MEMs), > we cannot combine across any other volatile references. > Even if INSN doesn't contain volatile references, any intervening > volatile insn might affect machine state. */ > > pred = volatile_refs_p (insn) ? volatile_refs_p : volatile_insn_p; > for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p)) > if (NONDEBUG_INSN_P (p) && p != succ && p != succ2 && pred (PATTERN (p))) > return 0; This would still allow a volatile_refs_p to cross a volatile_insn_p? And some lines above there is this: /* If INSN contains anything volatile, or is an `asm' (whether volatile or not), reject, unless nothing volatile comes between it and I3 */ if (GET_CODE (src) == ASM_OPERANDS || volatile_refs_p (src)) { /* Make sure neither succ nor succ2 contains a volatile reference. */ if (succ2 != 0 && volatile_refs_p (PATTERN (succ2))) return 0; if (succ != 0 && volatile_refs_p (PATTERN (succ))) return 0; /* We'll check insns between INSN and I3 below. */ } Would our change (whatever it will look like) make that code superfluous? > As you say in the PR notes, we've already handled volatile_insn_p (insn) > by this stage, so the new loop is handling: > > volatile_refs_p (insn) && !volatile_insn_p (insn) There is this part of an if-clause in can_combine_p: /* Make sure that the value that is to be substituted for the register does not use any registers whose values alter in between. However, If the insns are adjacent, a use can't cross a set even though we think it might (this can happen for a sequence of insns each setting the same destination; last_set of that register might point to a NOTE). If INSN has a REG_EQUIV note, the register is always equivalent to the memory so the substitution is valid even if there are intervening stores. Also, don't move a volatile asm or UNSPEC_VOLATILE across any other insns. */ || (! all_adjacent && (((!MEM_P (src) || ! find_reg_note (insn, REG_EQUIV, src)) && use_crosses_set_p (src, DF_INSN_LUID (insn))) || (GET_CODE (src) == ASM_OPERANDS && MEM_VOLATILE_P (src)) || GET_CODE (src) == UNSPEC_VOLATILE)) so it's not a verbatim volatile_insn_p. Maybe it's for historical reason or for some subtle differences not explained; only someone familiar with combine will know... perhaps performance issue. >> Moreover, writes of the shape (set (zero_extract (mem ... update mem_last_set. > > Good catch. I think we should use note_stores for added safety though, > rather than call zero_extract out as a special case. (I realise the > function already handles subregs, but still.) record_dead_and_set_regs_1 is called through note_stores so that (set (zero_extract (mem)) is handled already and thus no change to record_dead_and_set_regs_1 is needed, right? Attached you find a new, tentative patch. It resolves the issue in my small test program. However, I think someone with more insight into combine should take over the patch. Johann PR rtl-optimization/51374 * combine.c (can_combine_p): Don't allow volatile_refs_p insns to cross other volatile_refs_p insns. Index: combine.c =================================================================== --- combine.c (revision 183695) +++ combine.c (working copy) @@ -1947,12 +1947,21 @@ can_combine_p (rtx insn, rtx i3, rtx pre && REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER) return 0; - /* If there are any volatile insns between INSN and I3, reject, because - they might affect machine state. */ + /* If INSN contains volatile references (specifically volatile MEMs), + we cannot combine across any other volatile references. + Even if INSN doesn't contain volatile references, any intervening + volatile insn might affect machine state. */ + { + int (*predicat) (const_rtx); - for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p)) - if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN (p))) - return 0; + predicat = volatile_refs_p (PATTERN (insn)) + ? volatile_refs_p + : volatile_insn_p; + + for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p)) + if (INSN_P (p) && p != succ && p != succ2 && predicat (PATTERN (p))) + return 0; + } /* If INSN contains an autoincrement or autodecrement, make sure that register is not used between there and I3, and not already used in