From patchwork Tue Oct 25 11:21:40 2016
X-Patchwork-Submitter: Bin Cheng
X-Patchwork-Id: 686421
From: Bin Cheng
To: "gcc-patches@gcc.gnu.org"
CC: nd
Subject: [PATCH GCC][2/4]Simplify (cond (cmp (convert (x), c1), x, c2)) into (minmax (x, c))
Date: Tue, 25 Oct 2016 11:21:40 +0000

Hi,
The second patch simplifies (cond (cmp (convert (x), c1), x, c2)) into
(minmax (x, c)).  As commented in the patch, this is done if:

+     1) Comparison's operands are promoted from a smaller type.
+     2) Const c1 equals c2 after canonicalizing comparison.
+     3) Comparison has tree code LT, LE, GT or GE.
+   This specific pattern is needed when (cmp (convert x) c) may not
+   be simplified by comparison patterns because of multiple uses of
+   x.  It also makes sense here because simplifying across multiple
+   referred var is always beneficial for complicated cases.
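
To make the three conditions above concrete, here is a small sketch in plain C that checks two source-level forms against min/max over every 16-bit input.  It is an illustration only, not part of the patch: the helper names (cond_lt, cond_gt, min_us, max_s, count_mismatches) are made up for this example, and it assumes 16-bit short and 32-bit int.  The casts stand in for (convert x) in the pattern.

```c
#include <limits.h>

/* LT with c1 == c2 + 1: x < 256 is equivalent to x <= 255, so the
   conditional should behave like min (x, 255).  */
static unsigned short cond_lt (unsigned short x)
{
  return (unsigned int) x < 256u ? x : 255;
}

/* GT with c1 == c2 - 1: x > -1 is equivalent to x >= 0, so the
   conditional should behave like max (x, 0).  */
static short cond_gt (short x)
{
  return (int) x > -1 ? x : 0;
}

static unsigned short min_us (unsigned short a, unsigned short b)
{
  return a < b ? a : b;
}

static short max_s (short a, short b)
{
  return a > b ? a : b;
}

/* Return the number of 16-bit inputs on which a conditional form and
   its min/max counterpart disagree.  */
int count_mismatches (void)
{
  int bad = 0;
  for (int v = 0; v <= USHRT_MAX; v++)
    if (cond_lt ((unsigned short) v) != min_us ((unsigned short) v, 255))
      bad++;
  for (int v = SHRT_MIN; v <= SHRT_MAX; v++)
    if (cond_gt ((short) v) != max_s ((short) v, 0))
      bad++;
  return bad;
}
```

In both forms the conversion keeps the sign and widens the precision, which is what condition 1) requires; a sign-changing conversion would not be covered by this check.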

It also adds a call to fold_stmt in tree-if-conv.c so that the generated
cond_expr statement gets a chance to be simplified.

Bootstrapped and tested on x86_64 and AArch64.  It introduces the failure
below on both x86_64 and AArch64:
FAIL: gcc.dg/vect/slp-cond-3.c
I believe it reveals a defect in vect-slp.  In the call to fold_stmt in ifcvt,
canonicalization transforms _145 = _95 <= _96 ? _149 : _147 into
_145 = _95 > _96 ? _147 : _149.  As a result, this stmt has a different code
from the first one of the SLP instance.  IMO, SLP should be improved to handle
operand swapping; apparently the current support is not sufficient.

It also introduces more failures on AArch64 (and probably other targets), as below:
FAIL: gcc.dg/vect/pr65947-1.c -flto -ffat-lto-objects  scan-tree-dump-times vect "LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/pr65947-1.c -flto -ffat-lto-objects  scan-tree-dump-times vect "condition expression based on integer induction." 4
FAIL: gcc.dg/vect/pr65947-1.c scan-tree-dump-times vect "LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/pr65947-1.c scan-tree-dump-times vect "condition expression based on integer induction." 4
FAIL: gcc.dg/vect/pr65947-13.c -flto -ffat-lto-objects  scan-tree-dump-times vect "LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/pr65947-13.c scan-tree-dump-times vect "LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/pr65947-4.c -flto -ffat-lto-objects  scan-tree-dump-times vect "LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/pr65947-4.c -flto -ffat-lto-objects  scan-tree-dump-times vect "condition expression based on integer induction." 4
FAIL: gcc.dg/vect/pr65947-4.c scan-tree-dump-times vect "LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/pr65947-4.c scan-tree-dump-times vect "condition expression based on integer induction." 4
FAIL: gcc.dg/vect/pr77503.c -flto -ffat-lto-objects  scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/pr77503.c scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/vect-pr69848.c -flto -ffat-lto-objects  scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/vect-pr69848.c scan-tree-dump vect "vectorized 1 loops"

Again, these failures reveal a defect in the vectorizer: operand swapping is
not supported for COND_REDUCTION.  I will send another two patches, independent
of this patch set, resolving these failures.  Is this OK?

Thanks,
bin

2016-10-21  Bin Cheng

	* tree-if-conv.c (ifcvt_follow_ssa_use_edges): New func.
	(predicate_scalar_phi): Call fold_stmt using the new valueize func.
	* match.pd ((cond (cmp (convert (x), c1), x, c2)) -> (minmax (x, c))):
	New pattern.

gcc/testsuite/ChangeLog
2016-10-21  Bin Cheng

	* gcc.dg/fold-condcmpconv-1.c: New test.
	* gcc.dg/fold-condcmpconv-2.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index 7365bc1..7523b2f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1930,6 +1930,59 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (integer_zerop (@0))
    @2)))
 
+#if GIMPLE
+/* (cond (cmp (convert (x) c1) x c2)) -> (minmax (x c)) if:
+     1) Comparison's operands are promoted from a smaller type.
+     2) Const c1 equals c2 after canonicalizing comparison.
+     3) Comparison has tree code LT, LE, GT or GE.
+   This specific pattern is needed when (cmp (convert x) c) may not
+   be simplified by comparison patterns because of multiple uses of
+   x.  It also makes sense here because simplifying across multiple
+   referred var is always beneficial for complicated cases.  */
+(for cmp (lt le gt ge)
+ (simplify
+  (cond (cmp@0 (convert@3 @1) INTEGER_CST@4) @1 INTEGER_CST@2)
+  (with
+   {
+     tree op_type = TREE_TYPE (@1), cmp_type = TREE_TYPE (@3);
+     enum tree_code code = TREE_CODE (@0), cmp_code = TREE_CODE (@0);
+
+     if (TYPE_SIGN (cmp_type) == TYPE_SIGN (op_type)
+         && TYPE_PRECISION (cmp_type) > TYPE_PRECISION (op_type))
+       {
+         if (wi::to_widest (@4) == (wi::to_widest (@2) - 1))
+           {
+             /* X <= Y - 1 is equivalent to X < Y.  */
+             if (cmp_code == LE_EXPR)
+               code = LT_EXPR;
+             /* X > Y - 1 is equivalent to X >= Y.  */
+             if (cmp_code == GT_EXPR)
+               code = GE_EXPR;
+           }
+         if (wi::to_widest (@4) == (wi::to_widest (@2) + 1))
+           {
+             /* X < Y + 1 is equivalent to X <= Y.  */
+             if (cmp_code == LT_EXPR)
+               code = LE_EXPR;
+             /* X >= Y + 1 is equivalent to X > Y.  */
+             if (cmp_code == GE_EXPR)
+               code = GT_EXPR;
+           }
+         if (code != cmp_code || wi::to_widest (@2) == wi::to_widest (@4))
+           {
+             if (cmp_code == LT_EXPR || cmp_code == LE_EXPR)
+               code = MIN_EXPR;
+             if (cmp_code == GT_EXPR || cmp_code == GE_EXPR)
+               code = MAX_EXPR;
+           }
+       }
+   }
+   (if (code == MAX_EXPR)
+    (max @1 @2)
+    (if (code == MIN_EXPR)
+     (min @1 @2))))))
+#endif
+
 (for cnd (cond vec_cond)
  /* A ? B : (A ? X : C) -> A ? B : C.  */
  (simplify
diff --git a/gcc/testsuite/gcc.dg/fold-condcmpconv-1.c b/gcc/testsuite/gcc.dg/fold-condcmpconv-1.c
new file mode 100644
index 0000000..321294f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-condcmpconv-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-ifcvt" } */
+
+int foo (unsigned short a[], unsigned int x)
+{
+  unsigned int i;
+  for (i = 0; i < 1000; i++)
+    {
+      x = a[i];
+      a[i] = (unsigned short)(x >= 255 ? 255 : x);
+    }  return x;
+}
+
+/* { dg-final { scan-tree-dump " = MIN_EXPR <" "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/fold-condcmpconv-2.c b/gcc/testsuite/gcc.dg/fold-condcmpconv-2.c
new file mode 100644
index 0000000..5d3ef4a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-condcmpconv-2.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-ifcvt" } */
+
+int foo (short a[], int x)
+{
+  unsigned int i;
+  for (i = 0; i < 1000; i++)
+    {
+      x = a[i];
+      a[i] = (short)(x <= 0 ? 0 : x);
+    }  return x;
+}
+
+
+/* { dg-final { scan-tree-dump " = MAX_EXPR <" "ifcvt" } } */
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 0a20189..24cca2e 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1749,6 +1749,14 @@ gen_phi_arg_condition (gphi *phi, vec *occur,
   return cond;
 }
 
+/* Local valueization callback that follows all-use SSA edges.  */
+
+static tree
+ifcvt_follow_ssa_use_edges (tree val)
+{
+  return val;
+}
+
 /* Replace a scalar PHI node with a COND_EXPR using COND as condition.
    This routine can handle PHI nodes with more than two arguments.
 
@@ -1844,6 +1852,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
                                     arg0, arg1);
       new_stmt = gimple_build_assign (res, rhs);
       gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+      gimple_stmt_iterator new_gsi = gsi_for_stmt (new_stmt);
+      fold_stmt (&new_gsi, ifcvt_follow_ssa_use_edges);
       update_stmt (new_stmt);
 
       if (dump_file && (dump_flags & TDF_DETAILS))
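
For readers who want to trace the code selection in the match.pd hunk of this patch without building GCC, the decision can be modelled in plain C roughly as follows.  This is a sketch under assumptions: the enums and select_minmax are hypothetical names, not GCC API; long long stands in for widest_int (so constants at the extreme ends of long long are out of scope); and the type check of condition 1) is assumed to have passed already.

```c
/* Hypothetical stand-ins for the tree codes used by the pattern.  */
enum cmp_code { CMP_LT, CMP_LE, CMP_GT, CMP_GE };
enum result { RES_NONE, RES_MIN, RES_MAX };

/* Model of the selection for (cond (cmp (convert x) c1) x c2):
   canonicalize the comparison against c2 when c1 is off by one, then
   pick MIN or MAX if the canonical constant matches c2.  */
enum result select_minmax (enum cmp_code cmp, long long c1, long long c2)
{
  enum cmp_code code = cmp;

  if (c1 == c2 - 1)
    {
      /* X <= Y - 1 is X < Y; X > Y - 1 is X >= Y.  */
      if (cmp == CMP_LE)
        code = CMP_LT;
      if (cmp == CMP_GT)
        code = CMP_GE;
    }
  if (c1 == c2 + 1)
    {
      /* X < Y + 1 is X <= Y; X >= Y + 1 is X > Y.  */
      if (cmp == CMP_LT)
        code = CMP_LE;
      if (cmp == CMP_GE)
        code = CMP_GT;
    }

  /* Either the comparison was rewritten to use c2, or c1 already
     equals c2; otherwise leave the expression alone.  */
  if (code != cmp || c1 == c2)
    return (cmp == CMP_LT || cmp == CMP_LE) ? RES_MIN : RES_MAX;
  return RES_NONE;
}
```

For example, select_minmax (CMP_LT, 256, 255) models x < 256 ? x : 255 and yields RES_MIN, while select_minmax (CMP_LT, 254, 255) yields RES_NONE because x < 254 ? x : 255 is not a min.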