From patchwork Wed Jan 25 22:01:08 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 719901 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3v7zXk3kXrz9rxl for ; Thu, 26 Jan 2017 09:01:41 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="Bw5Je4th"; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:references:mime-version :content-type:in-reply-to; q=dns; s=default; b=pQ3sOFEvXyVRCTZSw lqgd5wgmDsEPcydbeH833avCIF75QRvfpcmZsYKnX0c1VovWmZuHvcpDupjN3t0M 7Q4/v184qrLcP/uWaCQ1ufYy6bDjUOOtcjW3IVM54Gu/H7Tioy1RXzZ+B1nNN56r rkwSsNr0cFsWP1w1rTM2NMwgLU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:references:mime-version :content-type:in-reply-to; s=default; bh=c5d7QAMmjqcJX3h2keaUZOp H7SI=; b=Bw5Je4thBn3H+bjZMzK67x64rUNrVLvMmj2YHkQLsYJkQDA1hn3E/Dv VGZeO9A7w+nTnSkbupyBjo/O3fXMXJyiGDHkN4Yt7wOacWRfnUPx4nF1nrGAX9h+ OQ9Bx0eyOS/+hO/crKWLZ/ns2q0dxip+w5bdHG19/J0T2P3LylQE= Received: (qmail 81056 invoked by alias); 25 Jan 2017 22:01:18 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 81030 invoked by uid 89); 25 Jan 2017 22:01:17 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-5.1 required=5.0 tests=BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=H*i:sk:0ae12f6, H*f:sk:0ae12f6, H*MI:sk:0ae12f6, swapped X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 25 Jan 2017 22:01:14 +0000 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6E08EC05678D; Wed, 25 Jan 2017 22:01:14 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-117-185.ams2.redhat.com [10.36.117.185]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id v0PM1CJg016445 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 25 Jan 2017 17:01:13 -0500 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id v0PM19rb012766; Wed, 25 Jan 2017 23:01:10 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id v0PM18ah012765; Wed, 25 Jan 2017 23:01:08 +0100 Date: Wed, 25 Jan 2017 23:01:08 +0100 From: Jakub Jelinek To: Jeff Law Cc: Uros Bizjak , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Use fld b; fld a; instead of fld a; fld b; fxch %st(1) in reg-stack (PR target/70465) Message-ID: <20170125220108.GU1867@tucnak> Reply-To: Jakub Jelinek References: <20170125210604.GS1867@tucnak> <0ae12f62-8d59-f750-123e-a4fa7b525773@redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <0ae12f62-8d59-f750-123e-a4fa7b525773@redhat.com> User-Agent: Mutt/1.7.1 (2016-10-04) X-IsSubscribed: yes On Wed, Jan 25, 2017 at 02:43:34PM -0700, Jeff Law wrote: > > 2017-01-25 Jakub Jelinek > > > > PR target/70465 > > * reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1); > > emit fld b; fld a; if possible. > > > > * gcc.target/i386/pr70465.c: New test. > So please comment on the general approach you're taking here. I have a > pretty good sense of what you're doing, mostly because I pondered something > similar. But I doubt others coming across the code would see the overall > structure as quickly. Does the following updated patch explain it sufficiently? > > + if (i2 != NULL_RTX > > + && (i2set = single_set (i2)) != NULL_RTX) > > + { > > + /* Instead of fld a; fld b; fxch %st(1); just > > + use fld b; fld a; if possible. */ > > + rtx i2dest = *get_true_reg (&SET_DEST (i2set)); > > + if (REG_P (i2dest) > > + && REGNO (i2dest) == FIRST_STACK_REG > > + && MEM_P (SET_SRC (i2set)) > > + && !side_effects_p (SET_SRC (i2set)) > > + && !modified_between_p (SET_SRC (i1set), i2, i1)) > And here we're trying to verify that the insn found above (I2) is pushing > another value onto the FP register stack and that the value in I2 is not > modified between I1 and I2. No, that last call (as I've tried to explain in the new comment) wants to ensure that there are no stores in between i2 and i1 that might alias with the second load's memory (then it would be invalid to move it before i2) and that the address of the memory doesn't depend on something set after i2. 2017-01-25 Jakub Jelinek PR target/70465 * reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1); emit fld b; fld a; if possible. * gcc.target/i386/pr70465.c: New test. Jakub --- gcc/reg-stack.c.jj 2017-01-25 17:17:46.086112137 +0100 +++ gcc/reg-stack.c 2017-01-25 22:58:00.403259702 +0100 @@ -887,6 +887,77 @@ emit_swap_insn (rtx_insn *insn, stack_pt && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX) return; + + /* Instead of + fld a + fld b + fxch %st(1) + just use + fld b + fld a + if possible. */ + + if (REG_P (i1dest) + && REGNO (i1dest) == FIRST_STACK_REG + && MEM_P (SET_SRC (i1set)) + && !side_effects_p (SET_SRC (i1set)) + && hard_regno == FIRST_STACK_REG + 1 + && i1 != BB_HEAD (current_block)) + { + /* i1 is the last insn that involves stack regs before insn, and + is known to be a load without other side-effects, i.e. fld b + in the above comment. */ + rtx_insn *i2 = NULL; + rtx i2set; + rtx_insn *tmp = PREV_INSN (i1); + rtx_insn *limit = PREV_INSN (BB_HEAD (current_block)); + /* Find the previous insn involving stack regs, but don't pass a + block boundary. */ + while (tmp != limit) + { + if (LABEL_P (tmp) + || CALL_P (tmp) + || NOTE_INSN_BASIC_BLOCK_P (tmp) + || (NONJUMP_INSN_P (tmp) + && stack_regs_mentioned (tmp))) + { + i2 = tmp; + break; + } + tmp = PREV_INSN (tmp); + } + if (i2 != NULL_RTX + && (i2set = single_set (i2)) != NULL_RTX) + { + rtx i2dest = *get_true_reg (&SET_DEST (i2set)); + /* If the last two insns before insn that involve + stack regs are loads, where the latter (i1) + pushes onto the register stack and thus + moves the value from the first load (i2) from + %st to %st(1), consider swapping them. */ + if (REG_P (i2dest) + && REGNO (i2dest) == FIRST_STACK_REG + && MEM_P (SET_SRC (i2set)) + /* Ensure i2 doesn't have other side-effects. */ + && !side_effects_p (SET_SRC (i2set)) + /* And that the two instructions can actually be + swapped, i.e. there shouldn't be any stores + in between i2 and i1 that might alias with + the i1 memory, and the memory address can't + use registers set in between i2 and i1. */ + && !modified_between_p (SET_SRC (i1set), i2, i1)) + { + /* Move i1 (fld b above) right before i2 (fld a + above. */ + remove_insn (i1); + SET_PREV_INSN (i1) = NULL_RTX; + SET_NEXT_INSN (i1) = NULL_RTX; + set_block_for_insn (i1, NULL); + emit_insn_before (i1, i2); + return; + } + } + } } /* Avoid emitting the swap if this is the first register stack insn --- gcc/testsuite/gcc.target/i386/pr70465.c.jj 2017-01-25 22:49:21.986853620 +0100 +++ gcc/testsuite/gcc.target/i386/pr70465.c 2017-01-25 22:49:21.985853633 +0100 @@ -0,0 +1,12 @@ +/* PR target/70465 */ +/* { dg-do compile { target ia32 } } */ +/* { dg-options "-O2 -mfpmath=387 -fomit-frame-pointer" } */ +/* { dg-final { scan-assembler-not "fxch\t%st.1" } } */ + +double +atan2 (double y, double x) +{ + double res = 0.0; + asm ("fpatan" : "=t" (res) : "u" (y), "0" (x) : "st(1)"); + return res; +}