From patchwork Mon Apr 20 06:03:38 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xingxing Pan
X-Patchwork-Id: 462711
Message-ID: <553496BA.2020006@marvell.com>
Date: Mon, 20 Apr 2015 14:03:38 +0800
From: Xingxing Pan
To: "ramrad01@arm.com"
CC: Julian Brown, James Greenhalgh, Kyrill Tkachov, Ramana Radhakrishnan,
 Richard Earnshaw, "nickc@redhat.com", Xinyu Qi, Liping Gao, Joey Ye,
 "gcc-patches@gcc.gnu.org"
Subject: Re: [PATCH] [ARM] Fix widen-sum pattern in neon.md.
References: <54F85B61.8090901@marvell.com>

On 04/15/2015 03:13 AM, Ramana Radhakrishnan wrote:
> On Thu, Mar 5, 2015 at 1:34 PM, Xingxing Pan wrote:
>> Hi,
>>
>> The expanding of widen-sum pattern always fails. The vectorizer expects the
>> operands to have the same size, while the current implementation of
>> widen-sum pattern does not conform to this.
>>
>> This patch implements the widen-sum pattern with vpadal. Change the vaddw
>> pattern to anonymous. Add widen-sum test cases for neon.
>>
>
> Can you please respin addressing James and Kyrill's comments ?
>
>
> Ramana
>
>> --
>> Regards,
>> Xingxing

Hi,

Sorry for the late response.

The pattern is rewritten to utilize neon_vpadal's "0" constraints.

I have run vect.exp and neon.exp on an armv7 board.

vect.exp has two new XFAILs:
XFAIL: gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
XFAIL: gcc.dg/vect/slp-reduc-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1

This is because the widen-sum optimization precedes SLP. The xfail predicate
vect_widen_sum_hi_to_si becomes true when widen-sum is enabled.

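For readers less familiar with vpadal, here is a minimal scalar sketch of the arithmetic that the rewritten widen-sum pattern relies on. It is only a model of a pairwise add-and-accumulate-long step on 64-bit vectors (8 unsigned byte lanes folded into 4 halfword accumulator lanes); it is not code from the patch, and the names padal_u8_model and widen_sum_u8 are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model (an assumption for illustration, not GCC code) of one
   unsigned pairwise add-and-accumulate-long step on 64-bit vectors:
   each halfword accumulator lane receives the sum of two adjacent
   byte lanes of the input.  */
static void
padal_u8_model (uint16_t acc[4], const uint8_t in[8])
{
  for (int lane = 0; lane < 4; lane++)
    acc[lane] = (uint16_t) (acc[lane] + in[2 * lane] + in[2 * lane + 1]);
}

/* Widening sum of n bytes (n must be a multiple of 8): accumulate with
   the model above, then reduce the four lanes once at the end.  */
static uint16_t
widen_sum_u8 (const uint8_t *data, int n)
{
  uint16_t acc[4] = { 0, 0, 0, 0 };

  assert (n % 8 == 0);
  for (int i = 0; i < n; i += 8)
    padal_u8_model (acc, data + i);

  return (uint16_t) (acc[0] + acc[1] + acc[2] + acc[3]);
}
```

The accumulator has half as many lanes as the input, each twice as wide, which matches the shape described for V_widen_sum_d in the patch. The accumulator is both read and written in each step, which is presumably why the "0" constraint matters for the real instruction.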
neon.exp has four new XFAILs:
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c scan-tree-dump-times vect "pattern recognized.*w\\+" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c scan-rtl-dump-times expand "UNSPEC_VPADAL" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s.c scan-tree-dump-times vect "pattern recognized.*w\\+" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s.c scan-rtl-dump-times expand "UNSPEC_VPADAL" 1

If the widen-sum pattern is successfully expanded, "w+" and "UNSPEC_VPADAL"
should appear in the dump files, as they do for the other vect-widen-sum-*.c
tests. But vect-widen-sum-char2short-s[-d].c is special because at tree level
the signed operations are converted into unsigned operations, which destroys
the widen-sum pattern. That is due to the workaround for
PR tree-optimization/25125. I have added xfail, following
gcc.dg/vect/vect-reduc-pattern-2c.c.

commit c44b5bd19efb029b8bbd4e3c7e2d631bdc482b7c
Author: Xingxing Pan
Date:   Sun Apr 19 15:54:43 2015 +0800

    Fix widen-sum pattern in neon.md.

    gcc/
    2015-04-19  Xingxing Pan

        * config/arm/iterators.md (VWSD): New.
        (V_widen_sum_d): New.
        * config/arm/neon.md (widen_ssum3): Redefined.
        (widen_usum3): Ditto.
        (neon_svaddw3): New anonymous define_insn.
        (neon_uvaddw3): Ditto.

    gcc/testsuite/
    2015-04-19  Xingxing Pan

        * gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c: New.
        * gcc.target/arm/neon/vect-widen-sum-char2short-s.c: New.
        * gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c: New.
        * gcc.target/arm/neon/vect-widen-sum-char2short-u.c: New.
        * gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c: New.
        * gcc.target/arm/neon/vect-widen-sum-short2int-s.c: New.
        * gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c: New.
        * gcc.target/arm/neon/vect-widen-sum-short2int-u.c: New.
        * lib/target-supports.exp
        (check_effective_target_vect_widen_sum_hi_to_si_pattern): Return 1 for
        ARM NEON.
        (check_effective_target_vect_widen_sum_hi_to_si): Ditto.
        (check_effective_target_vect_widen_sum_qi_to_hi): Ditto.

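As a rough illustration of the note above about vect-widen-sum-char2short-s[-d].c, the sketch below writes the signed char-to-short summation twice: once as in the test source, and once in the form it roughly takes after the narrowed addition is done in an unsigned type. The second function is a hand-written approximation for discussion, not a dump from the compiler, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Source-level form: widening sum of signed chars into a signed short.  */
static int16_t
sum_signed (const int8_t *data, int n)
{
  int16_t sum = 0;

  for (int i = 0; i < n; i++)
    sum += data[i];
  return sum;
}

/* Hand-written approximation (an assumption, not compiler output) of the
   same loop once the addition is carried out in an unsigned 16-bit type.
   The conversion back to int16_t is implementation-defined in C for
   out-of-range values; GCC wraps modulo 2^16.  */
static int16_t
sum_via_unsigned (const int8_t *data, int n)
{
  int16_t sum = 0;

  for (int i = 0; i < n; i++)
    sum = (int16_t) (uint16_t) ((uint16_t) data[i] + (uint16_t) sum);
  return sum;
}
```

Both functions return the same value, but in the second form the loop body is no longer a plain "widen the signed element, then add", which is the shape a widening-summation recognizer looks for.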
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f7f8ab7..f73278d 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -95,6 +95,9 @@
 ;; Widenable modes.
 (define_mode_iterator VW [V8QI V4HI V2SI])
 
+;; Widenable modes. Used by widen sum.
+(define_mode_iterator VWSD [V8QI V4HI V16QI V8HI])
+
 ;; Narrowable modes.
 (define_mode_iterator VN [V8HI V4SI V2DI])
 
@@ -555,9 +558,14 @@
 ;; Same as V_widen, but lower-case.
 (define_mode_attr V_widen_l [(V8QI "v8hi") (V4HI "v4si") ( V2SI "v2di")])
 
-;; Widen. Result is half the number of elements, but widened to double-width.
+;; Widen. Result is half the number of elements, but widened to double-width.
 (define_mode_attr V_unpack [(V16QI "V8HI") (V8HI "V4SI") (V4SI "V2DI")])
 
+;; Widen. Result is half the number of elements, but widened to double-width.
+;; Used by widen sum.
+(define_mode_attr V_widen_sum_d [(V8QI "V4HI") (V4HI "V2SI")
+                                 (V16QI "V8HI") (V8HI "V4SI")])
+
 ;; Conditions to be used in extenddi patterns.
 (define_mode_attr qhs_zextenddi_cond [(SI "") (HI "&& arm_arch6") (QI "")])
 (define_mode_attr qhs_sextenddi_cond [(SI "") (HI "&& arm_arch6")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 63c327e..839883f 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1174,7 +1174,29 @@
 
 ;; Widening operations
 
-(define_insn "widen_ssum3"
+(define_expand "widen_usum3"
+  [(match_operand: 0 "s_register_operand" "")
+   (match_operand:VWSD 1 "s_register_operand" "")
+   (match_operand: 2 "s_register_operand" "")]
+  "TARGET_NEON"
+  {
+    emit_insn (gen_neon_vpadalu (operands[0], operands[2], operands[1]));
+    DONE;
+  }
+)
+
+(define_expand "widen_ssum3"
+  [(match_operand: 0 "s_register_operand" "")
+   (match_operand:VWSD 1 "s_register_operand" "")
+   (match_operand: 2 "s_register_operand" "")]
+  "TARGET_NEON"
+  {
+    emit_insn (gen_neon_vpadals (operands[0], operands[2], operands[1]));
+    DONE;
+  }
+)
+
+(define_insn "*neon_svaddw3"
   [(set (match_operand: 0 "s_register_operand" "=w")
         (plus: (sign_extend:
                           (match_operand:VW 1 "s_register_operand" "%w"))
@@ -1184,7 +1206,7 @@
   [(set_attr "type" "neon_add_widen")]
 )
 
-(define_insn "widen_usum3"
+(define_insn "*neon_uvaddw3"
   [(set (match_operand: 0 "s_register_operand" "=w")
         (plus: (zero_extend:
                           (match_operand:VW 1 "s_register_operand" "%w"))
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c
new file mode 100644
index 0000000..8d0278c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { xfail *-*-* } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef signed char STYPE1;
+typedef signed short STYPE2;
+extern void abort (void);
+
+#define N 128
+STYPE1 sdata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+ssum ()
+{
+  int i;
+  STYPE2 sum = 0;
+  STYPE2 check_sum = 0;
+
+  /* widenning sum: sum chars into short.
+
+     Like gcc.dg/vect/vect-reduc-pattern-2c.c, the widening-summation pattern
+     is currently not detected because of this patch:
+
+     2005-12-26 Kazu Hirata
+     PR tree-optimization/25125
+  */
+
+  for (i = 0; i < N; i++)
+    {
+      sdata[i] = i*2;
+      check_sum += sdata[i];
+      /* Avoid vectorization. */
+      if (y)
+        abort ();
+    }
+
+  /* widenning sum: sum chars into int. */
+  for (i = 0; i < N; i++)
+    {
+      sum += sdata[i];
+    }
+
+  /* check results: */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  ssum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s.c
new file mode 100644
index 0000000..f7384c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { xfail *-*-* } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef signed char STYPE1;
+typedef signed short STYPE2;
+extern void abort (void);
+
+#define N 128
+STYPE1 sdata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+ssum ()
+{
+  int i;
+  STYPE2 sum = 0;
+  STYPE2 check_sum = 0;
+
+  /* widenning sum: sum chars into short.
+
+     Like gcc.dg/vect/vect-reduc-pattern-2c.c, the widening-summation pattern
+     is currently not detected because of this patch:
+
+     2005-12-26 Kazu Hirata
+     PR tree-optimization/25125
+  */
+
+  for (i = 0; i < N; i++)
+    {
+      sdata[i] = i*2;
+      check_sum += sdata[i];
+      /* Avoid vectorization. */
+      if (y)
+        abort ();
+    }
+
+  /* widenning sum: sum chars into int. */
+  for (i = 0; i < N; i++)
+    {
+      sum += sdata[i];
+    }
+
+  /* check results: */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  ssum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c
new file mode 100644
index 0000000..35f8fa7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef unsigned char UTYPE1;
+typedef unsigned short UTYPE2;
+extern void abort (void);
+
+#define N 128
+UTYPE1 udata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+usum ()
+{
+  int i;
+  UTYPE2 sum = 0;
+  UTYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      udata[i] = i*2;
+      check_sum += udata[i];
+      /* Avoid vectorization. */
+      if (y)
+        abort ();
+    }
+
+  /* widenning sum: sum chars into int. */
+  for (i = 0; i < N; i++)
+    {
+      sum += udata[i];
+    }
+
+  /* check results: */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  usum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u.c
new file mode 100644
index 0000000..38af5f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef unsigned char UTYPE1;
+typedef unsigned short UTYPE2;
+extern void abort (void);
+
+#define N 128
+UTYPE1 udata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+usum ()
+{
+  int i;
+  UTYPE2 sum = 0;
+  UTYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      udata[i] = i*2;
+      check_sum += udata[i];
+      /* Avoid vectorization. */
+      if (y)
+        abort ();
+    }
+
+  /* widenning sum: sum chars into int. */
+  for (i = 0; i < N; i++)
+    {
+      sum += udata[i];
+    }
+
+  /* check results: */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  usum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c
new file mode 100644
index 0000000..ef765de
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target arm_neon } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef signed short STYPE1;
+typedef signed int STYPE2;
+extern void abort (void);
+
+#define N 128
+STYPE1 sdata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+ssum ()
+{
+  int i;
+  STYPE2 sum = 0;
+  STYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      sdata[i] = i*2;
+      check_sum += sdata[i];
+      /* Avoid vectorization. */
+      if (y)
+        abort ();
+    }
+
+  /* widenning sum: sum chars into int. */
+  for (i = 0; i < N; i++)
+    {
+      sum += sdata[i];
+    }
+
+  /* check results: */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  ssum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s.c
new file mode 100644
index 0000000..fb38d56
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef signed short STYPE1;
+typedef signed int STYPE2;
+extern void abort (void);
+
+#define N 128
+STYPE1 sdata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+ssum ()
+{
+  int i;
+  STYPE2 sum = 0;
+  STYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      sdata[i] = i*2;
+      check_sum += sdata[i];
+      /* Avoid vectorization. */
+      if (y)
+        abort ();
+    }
+
+  /* widenning sum: sum chars into int. */
+  for (i = 0; i < N; i++)
+    {
+      sum += sdata[i];
+    }
+
+  /* check results: */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  ssum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c
new file mode 100644
index 0000000..5a3dfd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef unsigned short UTYPE1;
+typedef unsigned int UTYPE2;
+extern void abort (void);
+
+#define N 128
+UTYPE1 udata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+usum ()
+{
+  int i;
+  UTYPE2 sum = 0;
+  UTYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      udata[i] = i*2;
+      check_sum += udata[i];
+      /* Avoid vectorization. */
+      if (y)
+        abort ();
+    }
+
+  /* widenning sum: sum chars into int. */
+  for (i = 0; i < N; i++)
+    {
+      sum += udata[i];
+    }
+
+  /* check results: */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  usum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u.c
new file mode 100644
index 0000000..770b08d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef unsigned short UTYPE1;
+typedef unsigned int UTYPE2;
+extern void abort (void);
+
+#define N 128
+UTYPE1 udata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+usum ()
+{
+  int i;
+  UTYPE2 sum = 0;
+  UTYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      udata[i] = i*2;
+      check_sum += udata[i];
+      /* Avoid vectorization. */
+      if (y)
+        abort ();
+    }
+
+  /* widenning sum: sum chars into int. */
+  for (i = 0; i < N; i++)
+    {
+      sum += udata[i];
+    }
+
+  /* check results: */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  usum ();
+  return 0;
+}
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index f632d00..477ab53 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3795,6 +3795,7 @@ proc check_effective_target_vect_widen_sum_hi_to_si_pattern { } {
     } else {
         set et_vect_widen_sum_hi_to_si_pattern_saved 0
         if { [istarget powerpc*-*-*]
+             || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok])
              || [istarget ia64-*-*] } {
             set et_vect_widen_sum_hi_to_si_pattern_saved 1
         }
@@ -3818,7 +3819,8 @@ proc check_effective_target_vect_widen_sum_hi_to_si { } {
     } else {
         set et_vect_widen_sum_hi_to_si_saved [check_effective_target_vect_unpack]
         if { [istarget powerpc*-*-*]
-             || [istarget ia64-*-*] } {
+             || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok])
+             || [istarget ia64-*-*] } {
             set et_vect_widen_sum_hi_to_si_saved 1
         }
     }
@@ -3841,7 +3843,7 @@ proc check_effective_target_vect_widen_sum_qi_to_hi { } {
     } else {
         set et_vect_widen_sum_qi_to_hi_saved 0
         if { [check_effective_target_vect_unpack]
-             || [check_effective_target_arm_neon_ok]
+             || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok])
              || [istarget ia64-*-*] } {
             set et_vect_widen_sum_qi_to_hi_saved 1
         }