From: Jakub Jelinek
Date: Mon, 1 Feb 2016 09:32:07 +0100
To: Richard Biener, Jeff Law, Bernd Schmidt
Cc: gcc-patches@gcc.gnu.org, James Greenhalgh
Subject: [PATCH] Ensure noce_convert_multiple_sets handles only multiple
 sets (PR rtl-optimization/69570)
Message-ID: <20160201083207.GB3017@tucnak.redhat.com>
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 576365

Hi!

While looking at this PR (which is most likely a reg-stack or RA bug
triggered by the ifcvt noce_convert_multiple_sets additions), I've noticed
that despite the "multiple_sets" in the name it actually attempts to handle
not just multiple sets, but also the single set case, which is already
handled by various calls later on in noce_process_if_block.
Additionally, it handles that case worse than the calls we have had for
years, because it always creates temporary pseudos to store the results
into and then copies them back to the desired registers at the end.  If
there is just a single set, this temporary is unnecessary and unfortunately
negatively affects RA (we get larger code with more spills/fills in
*.reload/postreload).

So, I'd like to change noce_convert_multiple_sets to really apply to
multiple sets only.  While this makes the issue latent again (and I'll try
to analyze it), IMHO it is the right step forward.

Bootstrapped/regtested on
{x86_64,i686,ppc64,ppc64le,s390,s390x,aarch64}-linux, ok for trunk?

2016-02-01  Jakub Jelinek

	PR rtl-optimization/69570
	* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Return true only
	if there is more than one set, not if there is a single set.

	* g++.dg/opt/pr69570.C: New test.

	Jakub

--- gcc/ifcvt.c.jj	2016-01-21 17:53:32.000000000 +0100
+++ gcc/ifcvt.c	2016-01-31 13:47:34.171323086 +0100
@@ -3295,7 +3295,7 @@ bb_ok_for_noce_convert_multiple_sets (ba
   if (count > limit)
     return false;
 
-  return count > 0;
+  return count > 1;
 }
 
 /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert
--- gcc/testsuite/g++.dg/opt/pr69570.C.jj	2016-01-31 22:49:03.747216450 +0100
+++ gcc/testsuite/g++.dg/opt/pr69570.C	2016-01-31 22:49:18.861009011 +0100
@@ -0,0 +1,70 @@
+// PR rtl-optimization/69570
+// { dg-do run }
+// { dg-options "-O2" }
+// { dg-additional-options "-fpic" { target fpic } }
+// { dg-additional-options "-march=i686" { target ia32 } }
+
+template <typename T> inline const T &
+min (const T &a, const T &b)
+{
+  if (b < a)
+    return b;
+  return a;
+}
+
+template <typename T> inline const T &
+max (const T &a, const T &b)
+{
+  if (a < b)
+    return b;
+  return a;
+}
+
+static inline void
+foo (unsigned x, unsigned y, unsigned z, double &h, double &s, double &l)
+{
+  double r = x / 255.0;
+  double g = y / 255.0;
+  double b = z / 255.0;
+  double m = max (r, max (g, b));
+  double n = min (r, min (g, b));
+  double d = m - n;
+  double e = m + n;
+  h = 0.0, s = 0.0, l = e / 2.0;
+  if (d > 0.0)
+    {
+      s = l > 0.5 ? d / (2.0 - e) : d / e;
+      if (m == r && m != g)
+        h = (g - b) / d + (g < b ? 6.0 : 0.0);
+      if (m == g && m != b)
+        h = (b - r) / d + 2.0;
+      if (m == b && m != r)
+        h = (r - g) / d + 4.0;
+      h /= 6.0;
+    }
+}
+
+__attribute__ ((noinline, noclone))
+void bar (unsigned x[3], double y[3])
+{
+  double h, s, l;
+  foo (x[0], x[1], x[2], h, s, l);
+  y[0] = h;
+  y[1] = s;
+  y[2] = l;
+}
+
+int
+main ()
+{
+  unsigned x[3] = { 0, 128, 0 };
+  double y[3];
+
+  bar (x, y);
+  if (__builtin_fabs (y[0] - 0.33333) > 0.001
+      || __builtin_fabs (y[1] - 1) > 0.001
+      || __builtin_fabs (y[2] - 0.25098) > 0.001)
+    __builtin_abort ();
+
+  return 0;
+}
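
For readers without ifcvt.c at hand, here is a minimal standalone sketch of
what the one-line change does to the predicate.  It is not the GCC code:
Insn, is_simple_set, ok_for_multiple_sets and the limit parameter are
hypothetical stand-ins, and the real function checks considerably more
about each insn.

```cpp
#include <vector>

// Hypothetical stand-in for an RTL insn; the real check looks at far more.
struct Insn
{
  bool is_simple_set;
};

// Simplified model of the predicate after the patch: accept the block only
// if every insn is a simple set, there are at most LIMIT of them, and there
// are at least two of them.
bool
ok_for_multiple_sets (const std::vector<Insn> &block, unsigned limit)
{
  unsigned count = 0;
  for (const Insn &insn : block)
    {
      if (!insn.is_simple_set)
        return false;
      count++;
    }

  if (count > limit)
    return false;

  // Was "count > 0" before the patch, which also accepted a single set.
  return count > 1;
}
```

In this model a block with a single set is rejected, so it would be left to
the older single-set handling, while a block of two sets within the limit
is still accepted.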