From patchwork Fri Apr 20 01:58:38 2012
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 153905
Message-ID: <1334887118.32653.6.camel@gnopaine>
Subject: [PATCH] Fix PR44214
From: "William J. Schmidt"
To: gcc-patches@gcc.gnu.org
Cc: rguenther@suse.de, bergner@vnet.ibm.com
Date: Thu, 19 Apr 2012 20:58:38 -0500

This enhances constant folding for division by complex and vector
constants.  When -freciprocal-math is present, such divisions are
converted into multiplies by the constant reciprocal.  When every
element of a vector constant has an exact reciprocal, the same
conversion is done whenever optimizing, without requiring
-freciprocal-math.  I did not implement logic for exact reciprocals of
complex constants because either (a) the likelihood of occurrence
doesn't justify the complexity, or (b) I'm lazy.  Your choice.  ;)

Bootstrapped with no new regressions on powerpc64-unknown-linux-gnu.
Ok for trunk?

Thanks,
Bill


gcc:

2012-04-19  Bill Schmidt

        PR rtl-optimization/44214
        * fold-const.c (exact_inverse): New function.
        (fold_binary_loc): Fold vector and complex division by constant into
        multiply by reciprocal with flag_reciprocal_math; fold vector division
        by constant into multiply by reciprocal with exact inverse.

gcc/testsuite:

2012-04-19  Bill Schmidt

        PR rtl-optimization/44214
        * gcc.target/powerpc/pr44214-1.c: New test.
        * gcc.dg/pr44214-2.c: Likewise.
        * gcc.target/powerpc/pr44214-3.c: Likewise.


Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c (revision 186573)
+++ gcc/fold-const.c (working copy)
@@ -9693,6 +9693,48 @@ fold_addr_of_array_ref_difference (location_t loc,
   return NULL_TREE;
 }
 
+/* If the real or vector real constant CST of type TYPE has an exact
+   inverse, return it, else return NULL.  */
+
+static tree
+exact_inverse (tree type, tree cst)
+{
+  REAL_VALUE_TYPE r;
+  tree unit_type, *elts;
+  enum machine_mode mode;
+  unsigned vec_nelts, i;
+
+  switch (TREE_CODE (cst))
+    {
+    case REAL_CST:
+      r = TREE_REAL_CST (cst);
+
+      if (exact_real_inverse (TYPE_MODE (type), &r))
+        return build_real (type, r);
+
+      return NULL_TREE;
+
+    case VECTOR_CST:
+      vec_nelts = VECTOR_CST_NELTS (cst);
+      elts = XALLOCAVEC (tree, vec_nelts);
+      unit_type = TREE_TYPE (type);
+      mode = TYPE_MODE (unit_type);
+
+      for (i = 0; i < vec_nelts; i++)
+        {
+          r = TREE_REAL_CST (VECTOR_CST_ELT (cst, i));
+          if (!exact_real_inverse (mode, &r))
+            return NULL_TREE;
+          elts[i] = build_real (unit_type, r);
+        }
+
+      return build_vector (type, elts);
+
+    default:
+      return NULL_TREE;
+    }
+}
+
 /* Fold a binary expression of code CODE and type TYPE with operands
    OP0 and OP1.  LOC is the location of the resulting expression.
    Return the folded expression if folding is successful.  Otherwise,
@@ -11734,23 +11776,25 @@ fold_binary_loc (location_t loc,
          so only do this if -freciprocal-math.  We can actually
          always safely do it if ARG1 is a power of two, but it's hard to
          tell if it is or not in a portable manner.  */
-      if (TREE_CODE (arg1) == REAL_CST)
+      if (TREE_CODE (arg1) == REAL_CST
+          || (TREE_CODE (arg1) == COMPLEX_CST
+              && COMPLEX_FLOAT_TYPE_P (TREE_TYPE (arg1)))
+          || (TREE_CODE (arg1) == VECTOR_CST
+              && VECTOR_FLOAT_TYPE_P (TREE_TYPE (arg1))))
         {
           if (flag_reciprocal_math
-              && 0 != (tem = const_binop (code, build_real (type, dconst1),
+              && 0 != (tem = fold_binary (code, type, build_one_cst (type),
                                           arg1)))
             return fold_build2_loc (loc, MULT_EXPR, type, arg0, tem);
-          /* Find the reciprocal if optimizing and the result is exact.  */
-          if (optimize)
+          /* Find the reciprocal if optimizing and the result is exact.
+             TODO: Complex reciprocal not implemented.  */
+          if (optimize
+              && TREE_CODE (arg1) != COMPLEX_CST)
             {
-              REAL_VALUE_TYPE r;
-              r = TREE_REAL_CST (arg1);
-              if (exact_real_inverse (TYPE_MODE(TREE_TYPE(arg0)), &r))
-                {
-                  tem = build_real (type, r);
-                  return fold_build2_loc (loc, MULT_EXPR, type,
-                                          fold_convert_loc (loc, type, arg0), tem);
-                }
+              tree inverse = exact_inverse (TREE_TYPE (arg0), arg1);
+
+              if (inverse)
+                return fold_build2_loc (loc, MULT_EXPR, type, arg0, inverse);
             }
         }
       /* Convert A/B/C to A/(B*C).  */
Index: gcc/testsuite/gcc.target/powerpc/pr44214-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr44214-3.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr44214-3.c (revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=power7 -fdump-tree-optimized" } */
+
+void do_div (vector double *a, vector double *b)
+{
+  *a = *b / (vector double) { 2.0, 2.0 };
+}
+
+/* Since 2.0 has an exact reciprocal, constant folding should multiply *b
+   by the reciprocals of the vector elements.  As a result there should be
+   one vector multiply and zero divides in the optimized code.  The string
+   " * " occurs 3 times: one multiply and two indirect parameters.  */
+
+/* { dg-final { scan-tree-dump-times " \\\* " 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " / " 0 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/testsuite/gcc.target/powerpc/pr44214-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr44214-1.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr44214-1.c (revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mcpu=power7 -fdump-tree-optimized" } */
+
+void do_div (vector double *a, vector double *b)
+{
+  *a = *b / (vector double) { 2.0, 3.0 };
+}
+
+/* Constant folding should multiply *b by the reciprocals of the
+   vector elements.  As a result there should be one vector multiply
+   and zero divides in the optimized code.  The string " * " occurs
+   3 times: one multiply and two indirect parameters.  */
+
+/* { dg-final { scan-tree-dump-times " \\\* " 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " / " 0 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/testsuite/gcc.dg/pr44214-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr44214-2.c (revision 0)
+++ gcc/testsuite/gcc.dg/pr44214-2.c (revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-optimized" } */
+
+void do_div (_Complex double *a, _Complex double *b)
+{
+  *a = *b / (4.0 - 5.0fi);
+}
+
+/* Constant folding should multiply *b by the reciprocal of 4-5i
+   = 4/41 + (5/41)i.  As a result there should be 4 multiplies and
+   zero divides in the optimized code.  The string " * " occurs 6
+   times: 4 multiplies and 2 indirect parameters.  */
+
+/* { dg-final { scan-tree-dump-times " \\\* " 6 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " / " 0 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
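
For readers who want to see why the exact-inverse case is safe without
-freciprocal-math, here is a minimal standalone sketch in plain C.  It
is not GCC code: has_exact_inverse and vector_exact_inverse are
hypothetical helpers that only imitate the shape of exact_real_inverse
and of the VECTOR_CST case in exact_inverse above, and they assume IEEE
binary doubles.  Under that assumption a reciprocal is exact only when
the divisor is a power of two, and the vector version succeeds only if
every element passes.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's exact_real_inverse, IEEE doubles only.
   Stores 1/d in *inv and returns 1 when that reciprocal is exact,
   otherwise returns 0.  */
static int
has_exact_inverse (double d, double *inv)
{
  int exp;

  if (d == 0.0 || !isfinite (d))
    return 0;

  /* frexp gives d = m * 2^exp with 0.5 <= |m| < 1, so d is a power of
     two exactly when |m| == 0.5.  */
  if (fabs (frexp (d, &exp)) != 0.5)
    return 0;

  *inv = 1.0 / d;

  /* Reject reciprocals that overflow (d was a very small subnormal).  */
  return isfinite (*inv);
}

/* Imitates the VECTOR_CST case: succeed only if every element has an
   exact inverse.  On success out[0..n-1] holds the reciprocals; on
   failure the contents of out are unspecified.  */
static int
vector_exact_inverse (const double *cst, size_t n, double *out)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (!has_exact_inverse (cst[i], &out[i]))
      return 0;

  return 1;
}
```

With these helpers, { 2.0, 2.0 } (the constant in pr44214-3.c) yields
{ 0.5, 0.5 }, and x / 2.0 equals x * 0.5 bit for bit because both are
the correctly rounded value of the same real number.  { 2.0, 3.0 } (the
constant in pr44214-1.c) is rejected because 1/3 is not representable,
which is why that test needs -ffast-math.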