From patchwork Fri Oct 21 15:23:39 2016
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 685177
Date: Fri, 21 Oct 2016 17:23:39 +0200
From: Jakub Jelinek
To: Uros Bizjak
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fold __builtin_ia32_[tl]zcnt_u{16,32,64} (PR target/78057)
Message-ID: <20161021152339.GK7282@tucnak.redhat.com>

Hi!

This patch adds folding for the new ia32 md builtins.  If a call can be
folded into a constant, that is done in ix86_fold_builtin.  If it can be
folded into the corresponding generic __builtin_c[lt]z* (which has e.g.
the advantage that VRP knows what values the result can have), that is
done in the gimple_fold_builtin target hook.  The latter is only done
when the argument is known to be non-zero, because the generic builtins
are undefined for 0, while the ia32 ones return the operand precision.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

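To make the zero-input corner case concrete, here is a minimal sketch of the semantics the constant folding has to preserve.  It is not code from the patch: ref_tzcnt, ref_lzcnt and check_test_constants are hypothetical helper names, and the model is plain portable C rather than anything GCC uses internally.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model (hypothetical helper, not GCC code): number of trailing
   zero bits in the low PREC bits of X.  Returns PREC when those bits are
   all zero, which is the case the generic __builtin_ctz* leave undefined.  */
static unsigned
ref_tzcnt (uint64_t x, unsigned prec)
{
  unsigned n = 0;
  while (n < prec && ((x >> n) & 1) == 0)
    n++;
  return n;
}

/* Likewise for leading zero bits of a PREC-bit operand.  */
static unsigned
ref_lzcnt (uint64_t x, unsigned prec)
{
  unsigned n = 0;
  while (n < prec && ((x >> (prec - 1 - n)) & 1) == 0)
    n++;
  return n;
}

/* The constant cases exercised by the new test, checked against the model.  */
static void
check_test_constants (void)
{
  assert (ref_tzcnt (16, 16) == 4);
  assert (ref_tzcnt (0, 16) == 16);
  assert (ref_lzcnt (0x1ff, 16) == 7);
  assert (ref_lzcnt (0, 16) == 16);
  assert (ref_tzcnt (8, 32) == 3);
  assert (ref_tzcnt (0, 32) == 32);
  assert (ref_lzcnt (0x3fffffff, 32) == 2);
  assert (ref_lzcnt (0, 32) == 32);
  assert (ref_tzcnt (4, 64) == 2);
  assert (ref_tzcnt (0, 64) == 64);
  assert (ref_lzcnt (0x1fffffff, 64) == 35);
  assert (ref_lzcnt (0, 64) == 64);
}
```

The zero case returning the precision is what the integer_zerop check in the patch handles explicitly before handing non-zero constants to fold_const_call.
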
2016-10-21  Jakub Jelinek

        PR target/78057
        * config/i386/i386.c: Include fold-const-call.h, tree-vrp.h
        and tree-ssanames.h.
        (ix86_fold_builtin): Fold IX86_BUILTIN_[LT]ZCNT{16,32,64}
        with INTEGER_CST argument.
        (ix86_gimple_fold_builtin): New function.
        (TARGET_GIMPLE_FOLD_BUILTIN): Define.

        * gcc.target/i386/pr78057.c: New test.

        Jakub

--- gcc/config/i386/i386.c.jj   2016-10-21 11:36:33.135677698 +0200
+++ gcc/config/i386/i386.c      2016-10-21 11:57:58.248530521 +0200
@@ -77,6 +77,9 @@ along with GCC; see the file COPYING3.
 #include "case-cfn-macros.h"
 #include "regrename.h"
 #include "dojump.h"
+#include "fold-const-call.h"
+#include "tree-vrp.h"
+#include "tree-ssanames.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -33332,6 +33335,40 @@ ix86_fold_builtin (tree fndecl, int n_ar
             return build_real (type, inf);
           }
 
+        case IX86_BUILTIN_TZCNT16:
+        case IX86_BUILTIN_TZCNT32:
+        case IX86_BUILTIN_TZCNT64:
+          gcc_assert (n_args == 1);
+          if (TREE_CODE (args[0]) == INTEGER_CST)
+            {
+              tree type = TREE_TYPE (TREE_TYPE (fndecl));
+              tree arg = args[0];
+              if (fn_code == IX86_BUILTIN_TZCNT16)
+                arg = fold_convert (short_unsigned_type_node, arg);
+              if (integer_zerop (arg))
+                return build_int_cst (type, TYPE_PRECISION (TREE_TYPE (arg)));
+              else
+                return fold_const_call (CFN_CTZ, type, arg);
+            }
+          break;
+
+        case IX86_BUILTIN_LZCNT16:
+        case IX86_BUILTIN_LZCNT32:
+        case IX86_BUILTIN_LZCNT64:
+          gcc_assert (n_args == 1);
+          if (TREE_CODE (args[0]) == INTEGER_CST)
+            {
+              tree type = TREE_TYPE (TREE_TYPE (fndecl));
+              tree arg = args[0];
+              if (fn_code == IX86_BUILTIN_LZCNT16)
+                arg = fold_convert (short_unsigned_type_node, arg);
+              if (integer_zerop (arg))
+                return build_int_cst (type, TYPE_PRECISION (TREE_TYPE (arg)));
+              else
+                return fold_const_call (CFN_CLZ, type, arg);
+            }
+          break;
+
         default:
           break;
         }
@@ -33344,6 +33381,67 @@ ix86_fold_builtin (tree fndecl, int n_ar
   return NULL_TREE;
 }
 
+/* Fold a MD builtin (use ix86_fold_builtin for folding into
+   constant) in GIMPLE.  */
+
+bool
+ix86_gimple_fold_builtin (gimple_stmt_iterator *gsi)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  tree fndecl = gimple_call_fndecl (stmt);
+  gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
+  int n_args = gimple_call_num_args (stmt);
+  enum ix86_builtins fn_code = (enum ix86_builtins) DECL_FUNCTION_CODE (fndecl);
+  tree decl = NULL_TREE;
+  tree arg0;
+
+  switch (fn_code)
+    {
+    case IX86_BUILTIN_TZCNT32:
+      decl = builtin_decl_implicit (BUILT_IN_CTZ);
+      goto fold_tzcnt_lzcnt;
+
+    case IX86_BUILTIN_TZCNT64:
+      decl = builtin_decl_implicit (BUILT_IN_CTZLL);
+      goto fold_tzcnt_lzcnt;
+
+    case IX86_BUILTIN_LZCNT32:
+      decl = builtin_decl_implicit (BUILT_IN_CLZ);
+      goto fold_tzcnt_lzcnt;
+
+    case IX86_BUILTIN_LZCNT64:
+      decl = builtin_decl_implicit (BUILT_IN_CLZLL);
+      goto fold_tzcnt_lzcnt;
+
+    fold_tzcnt_lzcnt:
+      gcc_assert (n_args == 1);
+      arg0 = gimple_call_arg (stmt, 0);
+      if (TREE_CODE (arg0) == SSA_NAME && decl && gimple_call_lhs (stmt))
+        {
+          int prec = TYPE_PRECISION (TREE_TYPE (arg0));
+          if (!expr_not_equal_to (arg0, wi::zero (prec)))
+            return false;
+
+          location_t loc = gimple_location (stmt);
+          gimple *g = gimple_build_call (decl, 1, arg0);
+          gimple_set_location (g, loc);
+          tree lhs = make_ssa_name (integer_type_node);
+          gimple_call_set_lhs (g, lhs);
+          gsi_insert_before (gsi, g, GSI_SAME_STMT);
+          g = gimple_build_assign (gimple_call_lhs (stmt), NOP_EXPR, lhs);
+          gimple_set_location (g, loc);
+          gsi_replace (gsi, g, true);
+          return true;
+        }
+      break;
+
+    default:
+      break;
+    }
+
+  return false;
+}
+
 /* Make builtins to detect cpu type and features supported.  NAME is
    the builtin name, CODE is the builtin code, and FTYPE is the function
    type of the builtin.  */
@@ -50531,6 +50629,9 @@ ix86_addr_space_zero_address_valid (addr
 #undef TARGET_FOLD_BUILTIN
 #define TARGET_FOLD_BUILTIN ix86_fold_builtin
 
+#undef TARGET_GIMPLE_FOLD_BUILTIN
+#define TARGET_GIMPLE_FOLD_BUILTIN ix86_gimple_fold_builtin
+
 #undef TARGET_COMPARE_VERSION_PRIORITY
 #define TARGET_COMPARE_VERSION_PRIORITY ix86_compare_version_priority
 
--- gcc/testsuite/gcc.target/i386/pr78057.c.jj  2016-10-21 11:57:58.249530508 +0200
+++ gcc/testsuite/gcc.target/i386/pr78057.c     2016-10-21 11:57:58.249530508 +0200
@@ -0,0 +1,43 @@
+/* PR target/78057 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbmi -mlzcnt -fdump-tree-optimized" } */
+
+extern void link_error (void);
+
+int
+foo (int x)
+{
+  if (__builtin_ia32_tzcnt_u16 (16) != 4
+      || __builtin_ia32_tzcnt_u16 (0) != 16
+      || __builtin_ia32_lzcnt_u16 (0x1ff) != 7
+      || __builtin_ia32_lzcnt_u16 (0) != 16
+      || __builtin_ia32_tzcnt_u32 (8) != 3
+      || __builtin_ia32_tzcnt_u32 (0) != 32
+      || __builtin_ia32_lzcnt_u32 (0x3fffffff) != 2
+      || __builtin_ia32_lzcnt_u32 (0) != 32
+#ifdef __x86_64__
+      || __builtin_ia32_tzcnt_u64 (4) != 2
+      || __builtin_ia32_tzcnt_u64 (0) != 64
+      || __builtin_ia32_lzcnt_u64 (0x1fffffff) != 35
+      || __builtin_ia32_lzcnt_u64 (0) != 64
+#endif
+     )
+    link_error ();
+  x += 2;
+  if (x == 0)
+    return 5;
+  return __builtin_ia32_tzcnt_u32 (x)
+         + __builtin_ia32_lzcnt_u32 (x)
+#ifdef __x86_64__
+         + __builtin_ia32_tzcnt_u64 (x)
+         + __builtin_ia32_lzcnt_u64 (x)
+#endif
+         ;
+}
+
+/* { dg-final { scan-tree-dump-not "link_error" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "__builtin_ia32_\[lt]zcnt" "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_ctz " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_clz " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_ctzll " 1 "optimized" { target lp64 } } } */
+/* { dg-final { scan-tree-dump-times "__builtin_clzll " 1 "optimized" { target lp64 } } } */
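
The second half of the test relies on the range information established by the x == 0 early return.  As a rough user-level illustration of that idea (not the GIMPLE transformation itself), the sketch below assumes GCC or Clang and a 32-bit unsigned int; tzcnt32_model, lzcnt32_model and sum_counts are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __builtin_ia32_tzcnt_u32: defined for every
   input, returning 32 for zero.  Assumes unsigned int is 32 bits wide.  */
static unsigned
tzcnt32_model (uint32_t x)
{
  return x ? (unsigned) __builtin_ctz (x) : 32u;
}

/* Hypothetical stand-in for __builtin_ia32_lzcnt_u32.  */
static unsigned
lzcnt32_model (uint32_t x)
{
  return x ? (unsigned) __builtin_clz (x) : 32u;
}

/* Mirrors the shape of the tail of foo in the test: once the zero case
   has returned early, x is known to be non-zero, so the generic builtins
   can be used directly without the zero check.  */
static int
sum_counts (uint32_t x)
{
  x += 2;
  if (x == 0)
    return 5;
  /* Sanity check that the guarded generic builtins agree with the models.  */
  assert ((unsigned) __builtin_ctz (x) == tzcnt32_model (x));
  assert ((unsigned) __builtin_clz (x) == lzcnt32_model (x));
  return __builtin_ctz (x) + __builtin_clz (x);
}
```

In the patch the same condition is checked with expr_not_equal_to on the SSA name, and the fold is skipped whenever a zero argument cannot be ruled out.
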