From patchwork Fri Jan 12 17:11:01 2018
X-Patchwork-Submitter: Vladimir Makarov
X-Patchwork-Id: 860062
To: "gcc-patches@gcc.gnu.org"
From: Vladimir Makarov
Subject: patch to fix PR80481
Message-ID: <42f636f5-ef34-1e0b-a920-b73252435629@redhat.com>
Date: Fri, 12 Jan 2018 12:11:01 -0500

The following patch fixes

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80481

While forming an allocation thread in a multi-region function, a
conflicting allocno was added to the thread, which resulted in the
generation of additional moves.  The patch prevents conflicting
allocnos from being included in allocation threads.

The patch was successfully bootstrapped and tested on x86-64.  It has
an insignificant effect on x86-64 SPEC2000 rates and code size.

Committed as rev. 256590.

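For readers unfamiliar with caps, the idea behind get_cap_member in the patch can be sketched outside of GCC. The following is only an illustration under stated assumptions: struct toy_allocno, toy_get_cap_member and toy_conflict_p are hypothetical names, each allocno is given a single inclusive live range, and none of this is the real IRA data structure or API. It shows why a conflict check that looks at a cap directly misses the overlap, while one that first walks down to the non-cap allocno finds it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an IRA allocno.  A cap only
   points at the allocno it was created from; it keeps no live range
   of its own in this toy model.  */
struct toy_allocno
{
  struct toy_allocno *cap_member; /* Non-NULL if this allocno is a cap.  */
  int start, finish;              /* One inclusive live range (non-caps).  */
};

/* If A is a cap, follow the chain down to the non-cap allocno it was
   created from.  Otherwise return A itself.  */
static struct toy_allocno *
toy_get_cap_member (struct toy_allocno *a)
{
  assert (a != NULL);
  while (a->cap_member != NULL)
    a = a->cap_member;
  return a;
}

/* Return true if the live ranges behind A1 and A2 overlap.  Caps are
   resolved first, so the ranges of the underlying allocnos are the
   ones compared.  */
static bool
toy_conflict_p (struct toy_allocno *a1, struct toy_allocno *a2)
{
  a1 = toy_get_cap_member (a1);
  a2 = toy_get_cap_member (a2);
  return a1->start <= a2->finish && a2->start <= a1->finish;
}
```

In this model a cap of a cap resolves to the same underlying allocno, so the overlap test gives the same answer no matter which level of the region tree the allocno is seen from.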
Index: ChangeLog
===================================================================
--- ChangeLog   (revision 256589)
+++ ChangeLog   (working copy)
@@ -1,3 +1,11 @@
+2018-01-12  Vladimir Makarov
+
+       PR rtl-optimization/80481
+       * ira-color.c (get_cap_member): New function.
+       (allocnos_conflict_by_live_ranges_p): Use it.
+       (slot_coalesced_allocno_live_ranges_intersect_p): Add assert.
+       (setup_slot_coalesced_allocno_live_ranges): Ditto.
+
 2018-01-12  Uros Bizjak
 
        PR target/83628
Index: ira-color.c
===================================================================
--- ira-color.c (revision 256350)
+++ ira-color.c (working copy)
@@ -1905,6 +1905,18 @@ assign_hard_reg (ira_allocno_t a, bool r
 /* An array used to sort copies.  */
 static ira_copy_t *sorted_copies;
 
+/* If allocno A is a cap, return non-cap allocno from which A is
+   created.  Otherwise, return A.  */
+static ira_allocno_t
+get_cap_member (ira_allocno_t a)
+{
+  ira_allocno_t member;
+
+  while ((member = ALLOCNO_CAP_MEMBER (a)) != NULL)
+    a = member;
+  return a;
+}
+
 /* Return TRUE if live ranges of allocnos A1 and A2 intersect.  It is
    used to find a conflict for new allocnos or allocnos with the
    different allocno classes.  */
@@ -1924,6 +1936,10 @@ allocnos_conflict_by_live_ranges_p (ira_
       && ORIGINAL_REGNO (reg1) == ORIGINAL_REGNO (reg2))
     return false;
 
+  /* We don't keep live ranges for caps because they can be quite big.
+     Use ranges of non-cap allocno from which caps are created.  */
+  a1 = get_cap_member (a1);
+  a2 = get_cap_member (a2);
   for (i = 0; i < n1; i++)
     {
       ira_object_t c1 = ALLOCNO_OBJECT (a1, i);
@@ -4027,7 +4043,7 @@ slot_coalesced_allocno_live_ranges_inter
     {
       int i;
       int nr = ALLOCNO_NUM_OBJECTS (a);
-
+      gcc_assert (ALLOCNO_CAP_MEMBER (a) == NULL);
       for (i = 0; i < nr; i++)
         {
           ira_object_t obj = ALLOCNO_OBJECT (a, i);
@@ -4057,6 +4073,7 @@ setup_slot_coalesced_allocno_live_ranges
        a = ALLOCNO_COALESCE_DATA (a)->next)
     {
       int nr = ALLOCNO_NUM_OBJECTS (a);
+      gcc_assert (ALLOCNO_CAP_MEMBER (a) == NULL);
       for (i = 0; i < nr; i++)
         {
           ira_object_t obj = ALLOCNO_OBJECT (a, i);
Index: testsuite/ChangeLog
===================================================================
--- testsuite/ChangeLog (revision 256589)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2018-01-12  Vladimir Makarov
+
+       PR rtl-optimization/80481
+       * g++.dg/pr80481.C: New.
+
 2018-01-12  Uros Bizjak
 
        PR target/83628
Index: testsuite/g++.dg/pr80481.C
===================================================================
--- testsuite/g++.dg/pr80481.C  (nonexistent)
+++ testsuite/g++.dg/pr80481.C  (working copy)
@@ -0,0 +1,70 @@
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-Ofast -funroll-loops -fopenmp -march=knl" }
+// { dg-final { scan-assembler-not "vmovaps" } }
+
+#include <math.h>
+
+#include <xmmintrin.h>
+
+#define max(a, b) ( (a) > (b) ? (a) : (b) )
+
+struct Sdata {
+  float w;
+  float s;
+  float r;
+  float t;
+  float v;
+};
+ extern int N1, N2, N3;
+
+#define func(p, up, down) ((p)*(up) + (1.0f-(p)) * (down))
+
+void foo (Sdata *in, int idx, float *out)
+{
+  float* y1 = (float*)_mm_malloc(sizeof(float) * N1,16);
+  float* y2 = (float*)_mm_malloc(sizeof(float) * N1,16);
+  float* y3 = (float*)_mm_malloc(sizeof(float) * N1,16);
+  float* y4 = (float*)_mm_malloc(sizeof(float) * N1,16);
+
+  for (int k = idx; k < idx + N3; k++) {
+    float x1 = in[k].r;
+    float x2 = in[k].s;
+    float x3 = in[k].w;
+    float x4 = in[k].v;
+    float x5 = in[k].t;
+    x5 /= N2;
+    float u = exp(x4 * sqrt(x5));
+    float d = exp(-x4 * sqrt(x5));
+    float a = exp(x1 * x5);
+    float m = exp(-x1 * x5);
+    float p = (a - d) / (u - d);
+    y2[0] = x2;
+    y3[0] = float(1.f);
+    for (int i = 1; i <= N2; i++) {
+      y2[i] = u * y2[i - 1];
+      y3[i] = d * y3[i - 1];
+    }
+#pragma omp simd
+    for (int i = 0; i <= N2; i++) {
+      y1[i] =
+        max((x3 - y2[N2 - i] * y3[i]), float(0.f));
+    }
+    for (int i = N2 - 1; i >= 0; i--) {
+#pragma omp simd
+      for (int j = 0; j <= i; j++) {
+        y4[j] = func(p,y1[j],y1[j+1]) * m;
+      }
+#pragma omp simd
+      for (int j = 0; j <= i; j++) {
+        float t1 = y2[i - j] * y3[j];
+        float t2 = max(x3 - t1, float(0.f));
+        y1[j] = max(t2, y4[j]);
+      }
+    }
+    out[k] = y1[0];
+  }
+  _mm_free(y1);
+  _mm_free(y2);
+  _mm_free(y3);
+  _mm_free(y4);
+}