From patchwork Tue Aug 4 16:49:14 2015
X-Patchwork-Submitter: "Kumar, Venkataramanan"
X-Patchwork-Id: 503704
From: "Kumar, Venkataramanan"
To: Richard Biener
CC: Jeff Law, Jakub Jelinek, "gcc-patches@gcc.gnu.org"
Subject: RE: [RFC] [Patch]: Try and vectorize with shift for mult expr with power 2 integer constant.
Date: Tue, 4 Aug 2015 16:49:14 +0000
Message-ID: <7794A52CE4D579448B959EED7DD0A4723DD2081C@satlexdag06.amd.com>

Hi Richard,

> -----Original Message-----
> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Tuesday, August 04, 2015 4:07 PM
> To: Kumar, Venkataramanan
> Cc: Jeff Law; Jakub Jelinek; gcc-patches@gcc.gnu.org
> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult expr with
> power 2 integer constant.
>
> On Tue, Aug 4, 2015 at 10:52 AM, Kumar, Venkataramanan
> wrote:
> > Hi Jeff,
> >
> >> -----Original Message-----
> >> From: gcc-patches-owner@gcc.gnu.org
> >> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Jeff Law
> >> Sent: Monday, August 03, 2015 11:42 PM
> >> To: Kumar, Venkataramanan; Jakub Jelinek
> >> Cc: Richard Beiner (richard.guenther@gmail.com);
> >> gcc-patches@gcc.gnu.org
> >> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult
> >> expr with power 2 integer constant.
> >>
> >> On 08/02/2015 05:03 AM, Kumar, Venkataramanan wrote:
> >> > Hi Jakub,
> >> >
> >> > Thank you for reviewing the patch.
> >> >
> >> > I have incorporated your comments in the attached patch.
> >> Note Jakub is on PTO for the next 3 weeks.
> >
> > Thank you for this information.
> >
> >>
> >> >
> >> > vectorize_mults_via_shift.diff.txt
> >> >
> >> >
> >> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
> >> > b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
> >> Jakub would probably like more testcases :-)
> >>
> >> The most obvious thing to test would be other shift factors.
> >>
> >> A negative test to verify we don't try to turn a multiply by
> >> non-constant or multiply by a constant that is not a power of 2 into shifts.
> >
> > I have added negative test in the attached patch.
> >
> >>
> >> [ Would it make sense, for example, to turn a multiply by 3 into a
> >> shift-add sequence?  As Jakub said, choose_mult_variant can be your
> >> friend. ]
> >
> > Yes I will do that in a follow up patch.
> >
> > The new change log becomes
> >
> > gcc/ChangeLog
> > 2015-08-04  Venkataramanan Kumar
> >
> >     * tree-vect-patterns.c (vect_recog_mult_pattern): New function for
> >     vectorizing multiplication patterns.
> >     * tree-vectorizer.h: Adjust the number of patterns.
> >
> > gcc/testsuite/ChangeLog
> > 2015-08-04  Venkataramanan Kumar
> >
> >     * gcc.dg/vect/vect-mult-pattern-1.c: New
> >     * gcc.dg/vect/vect-mult-pattern-2.c: New
> >
> > Bootstrapped and reg tested on aarch64-unknown-linux-gnu.
> >
> > Ok for trunk ?
>
> +  if (TREE_CODE (oprnd0) != SSA_NAME
> +      || TREE_CODE (oprnd1) != INTEGER_CST
> +      || TREE_CODE (itype) != INTEGER_TYPE
>
> INTEGRAL_TYPE_P (itype)
>
> +  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
> +  if (!optab
> +      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
> +        return NULL;
> +
>
> indent of the return stmt looks wrong
>
> +  /* Handle constant operands that are postive or negative powers of 2.  */
> +  if ( wi::exact_log2 (oprnd1) != -1 ||
> +      wi::exact_log2 (wi::neg (oprnd1)) != -1)
>
> no space after (, || goes to the next line.
>
> +    {
> +      tree shift;
> +
> +      if (wi::exact_log2 (oprnd1) != -1)
>
> please cache wi::exact_log2
>
> in fact the first if () looks redundant if you simply put an else return NULL
> after a else if (wi::exact_log2 (wi::neg (oprnd1)) != -1)
>
> Note that the issue with INT_MIN is that wi::neg (INT_MIN) is INT_MIN
> again, but it seems that wi::exact_log2 returns -1 in that case so you are fine
> (and in fact not handling this case).
>

I have addressed your review comments in the attached patch.

For the INT_MIN case, I am getting vectorized output with the patch.
I believe x86_64 also vectorizes, but does not negate the result.

#include <limits.h>

unsigned long int __attribute__ ((aligned (64)))arr[100];
int i;
#if 1
void test_vector_shifts()
{
  for(i=0; i<=99;i++)
    arr[i]=arr[i] * INT_MIN;
}
#endif

void test_vectorshift_via_mul()
{
  for(i=0; i<=99;i++)
    arr[i]=arr[i]*(-INT_MIN);
}

Before
---------
        ldr     x1, [x0]
        neg     x1, x1, lsl 31
        str     x1, [x0], 8
        cmp     x0, x2

After
-------
        ldr     q0, [x0]
        shl     v0.2d, v0.2d, 31
        neg     v0.2d, v0.2d
        str     q0, [x0], 16
        cmp     x1, x0

Is this fine?

> Thanks,
> Richard.
>
> >>
> >> > @@ -2147,6 +2152,140 @@ vect_recog_vector_vector_shift_pattern (vec<gimple> *stmts,
> >> >     return pattern_stmt;
> >> >   }
> >> >
> >> > +/* Detect multiplication by constant which are postive or negatives of power 2,
> >> s/postive/positive/
> >>
> >>
> >> Jeff
> >
> > Regards,
> > Venkat.

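On the INT_MIN case: because arr is unsigned, INT_MIN is converted to the
array's unsigned type before the multiply, so in modular arithmetic the
product equals a left shift by 31 followed by a negate, which is what the
shl/neg pair in the "After" listing computes. The following is a minimal C
sketch of that identity and is not part of the patch. It assumes a 32-bit
int and uses uint64_t in place of the 64-bit unsigned long on aarch64; the
helper names mul_by_int_min and shift_then_negate are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: int is 32 bits wide, so INT_MIN is -2^31.  */

/* Multiply as the scalar loop does: INT_MIN is converted to the unsigned
   64-bit element type, giving 2^64 - 2^31.  */
static uint64_t mul_by_int_min (uint64_t x)
{
  return x * (uint64_t) (int64_t) INT_MIN;
}

/* Shift-then-negate form, mirroring "shl ..., 31" followed by "neg".
   Both operations wrap modulo 2^64 on an unsigned type.  */
static uint64_t shift_then_negate (uint64_t x)
{
  return -(x << 31);
}
```

Under these assumptions the two helpers should agree for every input, for
example mul_by_int_min (3) == shift_then_negate (3).
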
diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-1.c b/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-1.c
new file mode 100644
index 0000000..764d0e3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+
+unsigned long int __attribute__ ((aligned (64)))arr[100];
+int i;
+
+void test_for_vectorshifts_via_mul_with_power2_const ()
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * 4;
+}
+
+void test_for_vectorshifts_via_mul_with_negative_power2_const ()
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * (-4);
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" {target { ! { vect_int_mult } } } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect" {target { ! { vect_int_mult } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-2.c b/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-2.c
new file mode 100644
index 0000000..77e8cff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-2.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+
+unsigned long int __attribute__ ((aligned (64)))arr[100];
+int i;
+
+void negative_test_for_vectorshifts_via_mul_with_const ()
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * 123;
+}
+
+void negative_test_for_vectorshifts_via_mul_with_negative_const ()
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * (-123);
+}
+
+void negative_test_for_vectorshifts_via_mul_with_varable (int x)
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * x;
+}
+
+
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 3 "vect" {target { ! { vect_int_mult } } } } } */
+/* { dg-final { scan-tree-dump-not "vect_recog_mult_pattern: detected" "vect" {target { ! { vect_int_mult } } } } } */
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index f034635..bc3117d 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -76,6 +76,10 @@ static gimple vect_recog_vector_vector_shift_pattern (vec<gimple> *,
                                                        tree *, tree *);
 static gimple vect_recog_divmod_pattern (vec<gimple> *,
                                          tree *, tree *);
+
+static gimple vect_recog_mult_pattern (vec<gimple> *,
+                                       tree *, tree *);
+
 static gimple vect_recog_mixed_size_cond_pattern (vec<gimple> *,
                                                   tree *, tree *);
 static gimple vect_recog_bool_pattern (vec<gimple> *, tree *, tree *);
@@ -90,6 +94,7 @@ static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
         vect_recog_rotate_pattern,
         vect_recog_vector_vector_shift_pattern,
         vect_recog_divmod_pattern,
+        vect_recog_mult_pattern,
         vect_recog_mixed_size_cond_pattern,
         vect_recog_bool_pattern};
 
@@ -2147,6 +2152,140 @@ vect_recog_vector_vector_shift_pattern (vec<gimple> *stmts,
   return pattern_stmt;
 }
 
+/* Detect multiplication by constant which are positive or negatives of power 2,
+   and convert them to shift patterns.
+
+   Mult with constants that are positive power of two.
+   type a_t;
+   type b_t
+   S1: b_t = a_t * n
+
+   or
+
+   Mult with constants that are negative power of two.
+   S2: b_t = a_t * -n
+
+   Input/Output:
+
+   STMTS: Contains a stmt from which the pattern search begins,
+   i.e. the mult stmt.  Convert the mult operation to LSHIFT if
+   constant operand is a power of 2.
+   type a_t, b_t
+   S1': b_t = a_t << log2 (n)
+
+   Convert the mult operation to LSHIFT and followed by a NEGATE
+   if constant operand is a negative power of 2.
+   type a_t, b_t, res_T;
+   S2': b_t = a_t << log2 (n)
+   S3': res_T = - (b_t)
+
+ Output:
+
+  * TYPE_IN: The type of the input arguments to the pattern.
+
+  * TYPE_OUT: The type of the output of this pattern.
+
+  * Return value: A new stmt that will be used to replace the multiplication
+    S1 or S2 stmt.  */
+
+static gimple
+vect_recog_mult_pattern (vec<gimple> *stmts,
+                         tree *type_in, tree *type_out)
+{
+  gimple last_stmt = stmts->pop ();
+  tree oprnd0, oprnd1, vectype, itype;
+  gimple pattern_stmt, def_stmt;
+  optab optab;
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+  int power2_val, power2_neg_val;
+  tree shift;
+
+  if (!is_gimple_assign (last_stmt))
+    return NULL;
+
+  if (gimple_assign_rhs_code (last_stmt) != MULT_EXPR)
+    return NULL;
+
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
+  itype = TREE_TYPE (oprnd0);
+
+  if (TREE_CODE (oprnd0) != SSA_NAME
+      || TREE_CODE (oprnd1) != INTEGER_CST
+      || !INTEGRAL_TYPE_P (itype)
+      || TYPE_PRECISION (itype) != GET_MODE_PRECISION (TYPE_MODE (itype)))
+    return NULL;
+
+  vectype = get_vectype_for_scalar_type (itype);
+  if (vectype == NULL_TREE)
+    return NULL;
+
+  /* If the target can handle vectorized multiplication natively,
+     don't attempt to optimize this.  */
+  optab = optab_for_tree_code (MULT_EXPR, vectype, optab_default);
+  if (optab != unknown_optab)
+    {
+      machine_mode vec_mode = TYPE_MODE (vectype);
+      int icode = (int) optab_handler (optab, vec_mode);
+      if (icode != CODE_FOR_nothing)
+        return NULL;
+    }
+
+  /* If target cannot handle vector left shift then we cannot
+     optimize and bail out.  */
+  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
+  if (!optab
+      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+    return NULL;
+
+  power2_val = wi::exact_log2 (oprnd1);
+  power2_neg_val = wi::exact_log2 (wi::neg (oprnd1));
+
+  /* Handle constant operands that are positive or negative powers of 2.  */
+  if (power2_val != -1)
+    {
+      shift = build_int_cst (itype, power2_val);
+      pattern_stmt
+        = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+                               LSHIFT_EXPR, oprnd0, shift);
+    }
+  else if (power2_neg_val != -1)
+    {
+      /* If the target cannot handle vector NEGATE then we cannot
+         do the optimization.  */
+      optab = optab_for_tree_code (NEGATE_EXPR, vectype, optab_vector);
+      if (!optab
+          || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+        return NULL;
+
+      shift = build_int_cst (itype, power2_neg_val);
+      def_stmt
+        = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+                               LSHIFT_EXPR, oprnd0, shift);
+      new_pattern_def_seq (stmt_vinfo, def_stmt);
+      pattern_stmt
+        = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+                               NEGATE_EXPR, gimple_assign_lhs (def_stmt));
+    }
+  else
+    return NULL;
+
+  /* Pattern detected.  */
+  if (dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location,
+                     "vect_recog_mult_pattern: detected:\n");
+
+  if (dump_enabled_p ())
+    dump_gimple_stmt_loc (MSG_NOTE, vect_location, TDF_SLIM,
+                          pattern_stmt, 0);
+
+  stmts->safe_push (last_stmt);
+  *type_in = vectype;
+  *type_out = vectype;
+
+  return pattern_stmt;
+}
+
 /* Detect a signed division by a constant that wouldn't be
    otherwise vectorized:
 
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index dfa8795..b490af4 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1132,7 +1132,7 @@ extern void vect_slp_transform_bb (basic_block);
    Additional pattern recognition functions can (and will) be added
    in the future.  */
 typedef gimple (* vect_recog_func_ptr) (vec<gimple> *, tree *, tree *);
-#define NUM_PATTERNS 12
+#define NUM_PATTERNS 13
 void vect_pattern_recog (loop_vec_info, bb_vec_info);
 
 /* In tree-vectorizer.c.  */
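
To experiment with the decision logic of vect_recog_mult_pattern outside
GCC, it can be mimicked on scalars. The sketch below is an illustration
under stated assumptions, not the GCC implementation: exact_log2_u64 is a
hypothetical stand-in for wi::exact_log2, mult_via_shift is a hypothetical
name, and operands are treated as unsigned 64-bit values so that shifts and
negation wrap modulo 2^64. It does not model the optab checks that the real
pattern performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return log2 (v) if v is an exact power of two, otherwise -1.
   Hypothetical stand-in for wi::exact_log2.  */
static int exact_log2_u64 (uint64_t v)
{
  if (v == 0 || (v & (v - 1)) != 0)
    return -1;

  int n = 0;
  while ((v >>= 1) != 0)
    n++;
  return n;
}

/* Try to compute x * c using a shift, plus a negate when c is the
   negative of a power of two.  Returns true and stores the product in
   *result on success; returns false for any other constant.  */
static bool mult_via_shift (uint64_t x, int64_t c, uint64_t *result)
{
  int power2_val = exact_log2_u64 ((uint64_t) c);
  int power2_neg_val = exact_log2_u64 (-(uint64_t) c);

  if (power2_val != -1)
    *result = x << power2_val;            /* b = a << log2 (n)  */
  else if (power2_neg_val != -1)
    *result = -(x << power2_neg_val);     /* res = -(a << log2 (n))  */
  else
    return false;

  return true;
}
```

For example, mult_via_shift (7, -4, &r) succeeds and leaves r equal to
7 * -4 modulo 2^64, while a constant such as 123 is rejected, in line with
the negative tests in vect-mult-pattern-2.c.
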