From patchwork Tue Jul 28 15:12:26 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kumar, Venkataramanan"
X-Patchwork-Id: 501284
From: "Kumar, Venkataramanan"
To: "Richard Beiner (richard.guenther@gmail.com)", "gcc-patches@gcc.gnu.org"
Subject: [RFC] [Patch]: Try and vectorize with shift for mult expr with power 2 integer constant.
Date: Tue, 28 Jul 2015 15:12:26 +0000
Message-ID: <7794A52CE4D579448B959EED7DD0A4723DD1F787@satlexdag06.amd.com>

Hi Richard,

For Aarch64 target, I was trying to vectorize the expression "arr[i]=arr[i]*4;" via vector
shift instructions, since it doesn't have vector mults.

unsigned long int __attribute__ ((aligned (64)))arr[100];
int i;
#if 1
void test_vector_shifts()
{
        for(i=0; i<=99;i++)
        arr[i]=arr[i]<<2;
}
#endif

void test_vectorshift_via_mul()
{
        for(i=0; i<=99;i++)
        arr[i]=arr[i]*4;
}

I found a similar PR and your comments at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952#c6.
Based on that and the IRC discussion I had with you, I added a vector recog pattern that transforms mults to shifts. The vectorizer is now able to generate vector shifts for the above test case.
The PR case also gets vectorized: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952#c10.

This is just an initial patch and only tries to optimize multiplication by integer power-of-2 constants.
I wanted to get feedback on this. I bootstrapped and reg tested on aarch64-none-linux-gnu.

Regards,
Venkat.

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index f034635..948203d 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -76,6 +76,10 @@ static gimple vect_recog_vector_vector_shift_pattern (vec<gimple> *,
                                                       tree *, tree *);
 static gimple vect_recog_divmod_pattern (vec<gimple> *,
                                          tree *, tree *);
+
+static gimple vect_recog_multconst_pattern (vec<gimple> *,
+                                            tree *, tree *);
+
 static gimple vect_recog_mixed_size_cond_pattern (vec<gimple> *,
                                                   tree *, tree *);
 static gimple vect_recog_bool_pattern (vec<gimple> *, tree *, tree *);
@@ -90,6 +94,7 @@ static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
         vect_recog_rotate_pattern,
         vect_recog_vector_vector_shift_pattern,
         vect_recog_divmod_pattern,
+        vect_recog_multconst_pattern,
         vect_recog_mixed_size_cond_pattern,
         vect_recog_bool_pattern};
 
@@ -2147,6 +2152,87 @@ vect_recog_vector_vector_shift_pattern (vec<gimple> *stmts,
   return pattern_stmt;
 }
 
+static gimple
+vect_recog_multconst_pattern (vec<gimple> *stmts,
+                              tree *type_in, tree *type_out)
+{
+  gimple last_stmt = stmts->pop ();
+  tree oprnd0, oprnd1, vectype, itype;
+  gimple pattern_stmt;
+  enum tree_code rhs_code;
+  optab optab;
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+
+  if (!is_gimple_assign (last_stmt))
+    return NULL;
+
+  rhs_code = gimple_assign_rhs_code (last_stmt);
+  switch (rhs_code)
+    {
+    case MULT_EXPR:
+      break;
+    default:
+      return NULL;
+    }
+
+  if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo))
+    return NULL;
+
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
+  itype = TREE_TYPE (oprnd0);
+
+  if (TREE_CODE (oprnd0) != SSA_NAME
+      || TREE_CODE (oprnd1) != INTEGER_CST
+      || TREE_CODE (itype) != INTEGER_TYPE
+      || TYPE_PRECISION (itype) != GET_MODE_PRECISION (TYPE_MODE (itype)))
+    return NULL;
+
+  vectype = get_vectype_for_scalar_type (itype);
+  if (vectype == NULL_TREE)
+    return NULL;
+
+  /* If the target can handle vectorized multiplication natively,
+     don't attempt to optimize this.  */
+  optab = optab_for_tree_code (rhs_code, vectype, optab_default);
+  if (optab != unknown_optab)
+    {
+      machine_mode vec_mode = TYPE_MODE (vectype);
+      int icode = (int) optab_handler (optab, vec_mode);
+      if (icode != CODE_FOR_nothing)
+        return NULL;
+    }
+
+  /* If target cannot handle vector left shift then we cannot
+     optimize and bail out.  */
+  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
+  if (!optab
+      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+    return NULL;
+
+  if (integer_pow2p (oprnd1))
+    {
+      /* Pattern detected.  */
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_NOTE, vect_location,
+                         "vect_recog_multconst_pattern: detected:\n");
+
+      tree shift;
+      shift = build_int_cst (itype, tree_log2 (oprnd1));
+      pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+                                          LSHIFT_EXPR, oprnd0, shift);
+      if (dump_enabled_p ())
+        dump_gimple_stmt_loc (MSG_NOTE, vect_location, TDF_SLIM, pattern_stmt,
+                              0);
+      stmts->safe_push (last_stmt);
+      *type_in = vectype;
+      *type_out = vectype;
+      return pattern_stmt;
+    }
+
+  return NULL;
+}
+
 /* Detect a signed division by a constant that wouldn't be
    otherwise vectorized:
 
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 48c1f8d..833fe4b 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1131,7 +1131,7 @@ extern void vect_slp_transform_bb (basic_block);
    Additional pattern recognition functions can (and will) be added
    in the future.  */
 typedef gimple (* vect_recog_func_ptr) (vec<gimple> *, tree *, tree *);
-#define NUM_PATTERNS 12
+#define NUM_PATTERNS 13
 void vect_pattern_recog (loop_vec_info, bb_vec_info);
 
 /* In tree-vectorizer.c.  */
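
For readers who want to sanity-check the rewrite outside the compiler, here is a minimal scalar sketch of what the pattern does to each element: a multiply by a power-of-2 constant becomes a left shift by log2 of that constant. It is not part of the patch. The helpers is_pow2, exact_log2_u64, mul_by_const and count_mismatches are hypothetical names that only mirror the roles of integer_pow2p and tree_log2 above, and the sketch assumes unsigned 64-bit elements as in the test case, so it says nothing about signed types.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the role of integer_pow2p:
   nonzero iff c has exactly one bit set.  */
static int is_pow2(uint64_t c)
{
    return c != 0 && (c & (c - 1)) == 0;
}

/* Hypothetical stand-in for the role of tree_log2:
   exact log2 of a power of two, or -1 for any other value.  */
static int exact_log2_u64(uint64_t c)
{
    int n = 0;

    if (!is_pow2(c))
        return -1;
    while ((c >>= 1) != 0)
        n++;
    return n;
}

/* Scalar model of the rewrite: x * c becomes x << log2(c) when c is a
   power of two; any other constant keeps the multiply.  */
static uint64_t mul_by_const(uint64_t x, uint64_t c)
{
    int shift = exact_log2_u64(c);

    return shift >= 0 ? x << shift : x * c;
}

/* Compare plain multiplication with the rewritten form over 100 values,
   matching the array size of the test case.  Returns the number of
   elements where the two forms disagree.  */
static size_t count_mismatches(uint64_t c)
{
    size_t bad = 0;

    for (uint64_t i = 0; i < 100; i++) {
        /* Spread the bits so high-order wraparound is exercised too;
           unsigned arithmetic wraps modulo 2^64.  */
        uint64_t x = i * 0x9E3779B97F4A7C15ull;

        if (x * c != mul_by_const(x, c))
            bad++;
    }
    return bad;
}
```

Calling count_mismatches(4) corresponds to the arr[i]*4 versus arr[i]<<2 pair in the test case. For unsigned operands both forms wrap modulo 2^64, so the count should be zero.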