From patchwork Fri Oct 28 08:57:44 2011
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 122362
Date: Fri, 28 Oct 2011 10:57:44 +0200
From: Jakub Jelinek
To: Uros Bizjak
Cc: Richard Henderson, Kirill Yukhin, gcc-patches@gcc.gnu.org
Subject: [PATCH] Cleanup AVX2 vector/vector shifts (take 2)
Message-ID: <20111028085744.GQ1052@tyan-ft48-01.lab.bos.redhat.com>
Reply-To: Jakub Jelinek
References: <20111027195035.GO1052@tyan-ft48-01.lab.bos.redhat.com>

On Thu, Oct 27, 2011 at 10:07:13PM +0200, Uros Bizjak wrote:
> Please use expressive RTX forms for expanders, similar to the above
> define_insn RTX. You can avoid calling gen_avx2_lshrv at the end
> of c code. Also, expanders can have nonimmediate_operand as operand 2
> and conditionally move it to register in C code block if needed.

Like this?

In addition, the patch enables all three patterns for V2DImode with
-mxop as well (all of this depends on some solution for the
vectorizable_shift ICE I posted yesterday).  Except for the XOP-only
left shift pattern, the expanders now use nonimmediate_operand for the
last operand; even the XOP patterns that start by negating the last
operand can accept nonimmediate_operand, because neg<mode>2 does.

2011-10-28  Jakub Jelinek

	* config/i386/sse.md (VI4SD_AVX2): Removed.
	(VI48_AVX2, VI128_128, VI48_128, VI48_256): New mode iterators.
	(vashl<mode>3): Use VI12_128 iterator instead of VI124_128.
	Add another expander using VI48_128 iterator for
	TARGET_AVX2 || TARGET_XOP and another using VI48_256 iterator
	for TARGET_AVX2.
	(vlshr<mode>3): Likewise.  Change register_operand predicate to
	nonimmediate_operand on last operand in the VI12_128 expander.
	(vashr<mode>3): Use VI128_128 iterator instead of VI124_128.
	(vashrv4si3, vashrv8si3): New expanders.
	(avx2_ashrvv8si, avx2_ashrvv4si, avx2_vv8si, avx2_vv2di): Removed.
	(avx2_ashrv<mode>): New insn with VI4_AVX2 iterator.
	(avx2_v): Macroize using VI48_AVX2 iterator.  Simplify pattern.

	* gcc.dg/vshift-1.c: New test.
	* gcc.dg/vshift-2.c: New test.
	* gcc.target/i386/xop-vshift-1.c: New test.
	* gcc.target/i386/xop-vshift-2.c: New test.
	* gcc.target/i386/avx2-vshift-1.c: New test.

	Jakub

--- gcc/config/i386/sse.md.jj	2011-10-28 09:59:54.000000000 +0200
+++ gcc/config/i386/sse.md	2011-10-28 10:29:22.000000000 +0200
@@ -125,8 +125,9 @@ (define_mode_iterator VI248_AVX2
    (V8SI "TARGET_AVX2") V4SI
    (V4DI "TARGET_AVX2") V2DI])
 
-(define_mode_iterator VI4SD_AVX2
-  [V4SI V4DI])
+(define_mode_iterator VI48_AVX2
+  [(V8SI "TARGET_AVX2") V4SI
+   (V4DI "TARGET_AVX2") V2DI])
 
 (define_mode_iterator V48_AVX2
   [V4SF V2DF
@@ -191,11 +192,14 @@ (define_mode_iterator VI_256 [V32QI V16H
 (define_mode_iterator VI12_128 [V16QI V8HI])
 (define_mode_iterator VI14_128 [V16QI V4SI])
 (define_mode_iterator VI124_128 [V16QI V8HI V4SI])
+(define_mode_iterator VI128_128 [V16QI V8HI V2DI])
 (define_mode_iterator VI24_128 [V8HI V4SI])
 (define_mode_iterator VI248_128 [V8HI V4SI V2DI])
+(define_mode_iterator VI48_128 [V4SI V2DI])
 
 ;; Random 256bit vector integer mode combinations
 (define_mode_iterator VI124_256 [V32QI V16HI V8SI])
+(define_mode_iterator VI48_256 [V8SI V4DI])
 
 ;; Int-float size matches
 (define_mode_iterator VI4F_128 [V4SI V4SF])
@@ -11265,11 +11269,10 @@ (define_insn "xop_vrotl<mode>3"
    (set_attr "mode" "TI")])
 
 ;; XOP packed shift instructions.
-;; FIXME: add V2DI back in
 (define_expand "vlshr<mode>3"
-  [(match_operand:VI124_128 0 "register_operand" "")
-   (match_operand:VI124_128 1 "register_operand" "")
-   (match_operand:VI124_128 2 "register_operand" "")]
+  [(match_operand:VI12_128 0 "register_operand" "")
+   (match_operand:VI12_128 1 "register_operand" "")
+   (match_operand:VI12_128 2 "nonimmediate_operand" "")]
   "TARGET_XOP"
 {
   rtx neg = gen_reg_rtx (<MODE>mode);
@@ -11278,10 +11281,33 @@ (define_expand "vlshr<mode>3"
   DONE;
 })
 
+(define_expand "vlshr<mode>3"
+  [(set (match_operand:VI48_128 0 "register_operand" "")
+        (lshiftrt:VI48_128
+          (match_operand:VI48_128 1 "register_operand" "")
+          (match_operand:VI48_128 2 "nonimmediate_operand" "")))]
+  "TARGET_AVX2 || TARGET_XOP"
+{
+  if (!TARGET_AVX2)
+    {
+      rtx neg = gen_reg_rtx (<MODE>mode);
+      emit_insn (gen_neg<mode>2 (neg, operands[2]));
+      emit_insn (gen_xop_lshl<mode>3 (operands[0], operands[1], neg));
+      DONE;
+    }
+})
+
+(define_expand "vlshr<mode>3"
+  [(set (match_operand:VI48_256 0 "register_operand" "")
+        (lshiftrt:VI48_256
+          (match_operand:VI48_256 1 "register_operand" "")
+          (match_operand:VI48_256 2 "nonimmediate_operand" "")))]
+  "TARGET_AVX2")
+
 (define_expand "vashr<mode>3"
-  [(match_operand:VI124_128 0 "register_operand" "")
-   (match_operand:VI124_128 1 "register_operand" "")
-   (match_operand:VI124_128 2 "register_operand" "")]
+  [(match_operand:VI128_128 0 "register_operand" "")
+   (match_operand:VI128_128 1 "register_operand" "")
+   (match_operand:VI128_128 2 "nonimmediate_operand" "")]
   "TARGET_XOP"
 {
   rtx neg = gen_reg_rtx (<MODE>mode);
@@ -11290,16 +11316,59 @@ (define_expand "vashr<mode>3"
   DONE;
 })
 
+(define_expand "vashrv4si3"
+  [(set (match_operand:V4SI 0 "register_operand" "")
+        (ashiftrt:V4SI (match_operand:V4SI 1 "register_operand" "")
+                       (match_operand:V4SI 2 "nonimmediate_operand" "")))]
+  "TARGET_AVX2 || TARGET_XOP"
+{
+  if (!TARGET_AVX2)
+    {
+      rtx neg = gen_reg_rtx (V4SImode);
+      emit_insn (gen_negv4si2 (neg, operands[2]));
+      emit_insn (gen_xop_ashlv4si3 (operands[0], operands[1], neg));
+      DONE;
+    }
+})
+
+(define_expand "vashrv8si3"
+  [(set (match_operand:V8SI 0 "register_operand" "")
+        (ashiftrt:V8SI (match_operand:V8SI 1 "register_operand" "")
+                       (match_operand:V8SI 2 "nonimmediate_operand" "")))]
+  "TARGET_AVX2")
+
 (define_expand "vashl<mode>3"
-  [(match_operand:VI124_128 0 "register_operand" "")
-   (match_operand:VI124_128 1 "register_operand" "")
-   (match_operand:VI124_128 2 "register_operand" "")]
+  [(match_operand:VI12_128 0 "register_operand" "")
+   (match_operand:VI12_128 1 "register_operand" "")
+   (match_operand:VI12_128 2 "register_operand" "")]
   "TARGET_XOP"
 {
   emit_insn (gen_xop_ashl<mode>3 (operands[0], operands[1], operands[2]));
   DONE;
 })
 
+(define_expand "vashl<mode>3"
+  [(set (match_operand:VI48_128 0 "register_operand" "")
+        (ashift:VI48_128
+          (match_operand:VI48_128 1 "register_operand" "")
+          (match_operand:VI48_128 2 "nonimmediate_operand" "")))]
+  "TARGET_AVX2 || TARGET_XOP"
+{
+  if (!TARGET_AVX2)
+    {
+      operands[2] = force_reg (<MODE>mode, operands[2]);
+      emit_insn (gen_xop_ashl<mode>3 (operands[0], operands[1], operands[2]));
+      DONE;
+    }
+})
+
+(define_expand "vashl<mode>3"
+  [(set (match_operand:VI48_256 0 "register_operand" "")
+        (ashift:VI48_256
+          (match_operand:VI48_256 1 "register_operand" "")
+          (match_operand:VI48_256 2 "nonimmediate_operand" "")))]
+  "TARGET_AVX2")
+
 (define_insn "xop_ashl<mode>3"
   [(set (match_operand:VI_128 0 "register_operand" "=x,x")
         (if_then_else:VI_128
@@ -12401,249 +12470,28 @@ (define_expand "avx2_inserti128"
   DONE;
 })
 
-(define_insn "avx2_ashrvv8si"
-  [(set (match_operand:V8SI 0 "register_operand" "=x")
-        (vec_concat:V8SI
-          (vec_concat:V4SI
-            (vec_concat:V2SI
-              (ashiftrt:SI
-                (vec_select:SI
-                  (match_operand:V8SI 1 "register_operand" "x")
-                  (parallel [(const_int 0)]))
-                (vec_select:SI
-                  (match_operand:V8SI 2 "nonimmediate_operand" "xm")
-                  (parallel [(const_int 0)])))
-              (ashiftrt:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 1)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 1)]))))
-            (vec_concat:V2SI
-              (ashiftrt:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 2)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 2)])))
-              (ashiftrt:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 3)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 3)])))))
-          (vec_concat:V4SI
-            (vec_concat:V2SI
-              (ashiftrt:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 0)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 0)])))
-              (ashiftrt:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 1)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 1)]))))
-            (vec_concat:V2SI
-              (ashiftrt:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 2)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 2)])))
-              (ashiftrt:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 3)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 3)])))))))]
-  "TARGET_AVX2"
-  "vpsravd\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "type" "sseishft")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "OI")])
-
-(define_insn "avx2_ashrvv4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=x")
-        (vec_concat:V4SI
-          (vec_concat:V2SI
-            (ashiftrt:SI
-              (vec_select:SI
-                (match_operand:V4SI 1 "register_operand" "x")
-                (parallel [(const_int 0)]))
-              (vec_select:SI
-                (match_operand:V4SI 2 "nonimmediate_operand" "xm")
-                (parallel [(const_int 0)])))
-            (ashiftrt:SI
-              (vec_select:SI
-                (match_dup 1)
-                (parallel [(const_int 1)]))
-              (vec_select:SI
-                (match_dup 2)
-                (parallel [(const_int 1)]))))
-          (vec_concat:V2SI
-            (ashiftrt:SI
-              (vec_select:SI
-                (match_dup 1)
-                (parallel [(const_int 2)]))
-              (vec_select:SI
-                (match_dup 2)
-                (parallel [(const_int 2)])))
-            (ashiftrt:SI
-              (vec_select:SI
-                (match_dup 1)
-                (parallel [(const_int 3)]))
-              (vec_select:SI
-                (match_dup 2)
-                (parallel [(const_int 3)]))))))]
+(define_insn "avx2_ashrv<mode>"
+  [(set (match_operand:VI4_AVX2 0 "register_operand" "=x")
+        (ashiftrt:VI4_AVX2 (match_operand:VI4_AVX2 1 "register_operand" "x")
+                           (match_operand:VI4_AVX2 2 "nonimmediate_operand"
+                                                     "xm")))]
   "TARGET_AVX2"
   "vpsravd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseishft")
    (set_attr "prefix" "vex")
-   (set_attr "mode" "TI")])
-
-(define_insn "avx2_vv8si"
-  [(set (match_operand:V8SI 0 "register_operand" "=x")
-        (vec_concat:V8SI
-          (vec_concat:V4SI
-            (vec_concat:V2SI
-              (lshift:SI
-                (vec_select:SI
-                  (match_operand:V8SI 1 "register_operand" "x")
-                  (parallel [(const_int 0)]))
-                (vec_select:SI
-                  (match_operand:V8SI 2 "nonimmediate_operand" "xm")
-                  (parallel [(const_int 0)])))
-              (lshift:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 1)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 1)]))))
-            (vec_concat:V2SI
-              (lshift:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 2)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 2)])))
-              (lshift:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 3)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 3)])))))
-          (vec_concat:V4SI
-            (vec_concat:V2SI
-              (lshift:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 0)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 0)])))
-              (lshift:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 1)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 1)]))))
-            (vec_concat:V2SI
-              (lshift:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 2)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 2)])))
-              (lshift:SI
-                (vec_select:SI
-                  (match_dup 1)
-                  (parallel [(const_int 3)]))
-                (vec_select:SI
-                  (match_dup 2)
-                  (parallel [(const_int 3)])))))))]
-  "TARGET_AVX2"
-  "vpvd\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "type" "sseishft")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "OI")])
+   (set_attr "mode" "")])
 
 (define_insn "avx2_v"
-  [(set (match_operand:VI4SD_AVX2 0 "register_operand" "=x")
-        (vec_concat:VI4SD_AVX2
-          (vec_concat:
-            (lshift:
-              (vec_select:
-                (match_operand:VI4SD_AVX2 1 "register_operand" "x")
-                (parallel [(const_int 0)]))
-              (vec_select:
-                (match_operand:VI4SD_AVX2 2 "nonimmediate_operand" "xm")
-                (parallel [(const_int 0)])))
-            (lshift:
-              (vec_select:
-                (match_dup 1)
-                (parallel [(const_int 1)]))
-              (vec_select:
-                (match_dup 2)
-                (parallel [(const_int 1)]))))
-          (vec_concat:
-            (lshift:
-              (vec_select:
-                (match_dup 1)
-                (parallel [(const_int 2)]))
-              (vec_select:
-                (match_dup 2)
-                (parallel [(const_int 2)])))
-            (lshift:
-              (vec_select:
-                (match_dup 1)
-                (parallel [(const_int 3)]))
-              (vec_select:
-                (match_dup 2)
-                (parallel [(const_int 3)]))))))]
+  [(set (match_operand:VI48_AVX2 0 "register_operand" "=x")
+        (lshift:VI48_AVX2 (match_operand:VI48_AVX2 1 "register_operand" "x")
+                          (match_operand:VI48_AVX2 2 "nonimmediate_operand"
+                                                     "xm")))]
   "TARGET_AVX2"
   "vpv\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseishft")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
 
-(define_insn "avx2_vv2di"
-  [(set (match_operand:V2DI 0 "register_operand" "=x")
-        (vec_concat:V2DI
-          (lshift:DI
-            (vec_select:DI
-              (match_operand:V2DI 1 "register_operand" "x")
-              (parallel [(const_int 0)]))
-            (vec_select:DI
-              (match_operand:V2DI 2 "nonimmediate_operand" "xm")
-              (parallel [(const_int 0)])))
-          (lshift:DI
-            (vec_select:DI
-              (match_dup 1)
-              (parallel [(const_int 1)]))
-            (vec_select:DI
-              (match_dup 2)
-              (parallel [(const_int 1)])))))]
-  "TARGET_AVX2"
-  "vpvq\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "type" "sseishft")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "TI")])
-
 (define_insn "avx_vec_concat<mode>"
   [(set (match_operand:V_256 0 "register_operand" "=x,x")
         (vec_concat:V_256
--- gcc/testsuite/gcc.dg/vshift-1.c.jj	2011-10-28 10:02:26.000000000 +0200
+++ gcc/testsuite/gcc.dg/vshift-1.c	2011-10-28 10:02:26.000000000 +0200
@@ -0,0 +1,132 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+#include <stdlib.h>
+
+#define N 64
+
+#ifndef TYPE1
+#define TYPE1 int
+#define TYPE2 long long
+#endif
+
+signed TYPE1 a[N], b[N], g[N];
+unsigned TYPE1 c[N], h[N];
+signed TYPE2 d[N], e[N], j[N];
+unsigned TYPE2 f[N], k[N];
+
+__attribute__((noinline)) void
+f1 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    g[i] = a[i] << b[i];
+}
+
+__attribute__((noinline)) void
+f2 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    g[i] = a[i] >> b[i];
+}
+
+__attribute__((noinline)) void
+f3 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    h[i] = c[i] >> b[i];
+}
+
+__attribute__((noinline)) void
+f4 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] << e[i];
+}
+
+__attribute__((noinline)) void
+f5 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] >> e[i];
+}
+
+__attribute__((noinline)) void
+f6 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    k[i] = f[i] >> e[i];
+}
+
+__attribute__((noinline)) void
+f7 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] << b[i];
+}
+
+__attribute__((noinline)) void
+f8 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] >> b[i];
+}
+
+__attribute__((noinline)) void
+f9 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    k[i] = f[i] >> b[i];
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 0; i < N; i++)
+    {
+      asm ("");
+      c[i] = (random () << 1) | (random () & 1);
+      b[i] = (i * 85) & (sizeof (TYPE1) * __CHAR_BIT__ - 1);
+      a[i] = c[i];
+      d[i] = (random () << 1) | (random () & 1);
+      d[i] |= (unsigned long long) c[i] << 32;
+      e[i] = (i * 85) & (sizeof (TYPE2) * __CHAR_BIT__ - 1);
+      f[i] = d[i];
+    }
+  f1 ();
+  f3 ();
+  f4 ();
+  f6 ();
+  for (i = 0; i < N; i++)
+    if (g[i] != (signed TYPE1) (a[i] << b[i])
+        || h[i] != (unsigned TYPE1) (c[i] >> b[i])
+        || j[i] != (signed TYPE2) (d[i] << e[i])
+        || k[i] != (unsigned TYPE2) (f[i] >> e[i]))
+      abort ();
+  f2 ();
+  f5 ();
+  f9 ();
+  for (i = 0; i < N; i++)
+    if (g[i] != (signed TYPE1) (a[i] >> b[i])
+        || j[i] != (signed TYPE2) (d[i] >> e[i])
+        || k[i] != (unsigned TYPE2) (f[i] >> b[i]))
+      abort ();
+  f7 ();
+  for (i = 0; i < N; i++)
+    if (j[i] != (signed TYPE2) (d[i] << b[i]))
+      abort ();
+  f8 ();
+  for (i = 0; i < N; i++)
+    if (j[i] != (signed TYPE2) (d[i] >> b[i]))
+      abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/vshift-2.c.jj	2011-10-28 10:02:26.000000000 +0200
+++ gcc/testsuite/gcc.dg/vshift-2.c	2011-10-28 10:02:26.000000000 +0200
@@ -0,0 +1,7 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+#define TYPE1 char
+#define TYPE2 short
+
+#include "vshift-1.c"
--- gcc/testsuite/gcc.target/i386/xop-vshift-1.c.jj	2011-10-28 10:02:26.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/xop-vshift-1.c	2011-10-28 10:02:26.000000000 +0200
@@ -0,0 +1,140 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mxop" } */
+/* { dg-require-effective-target xop } */
+
+#ifndef CHECK_H
+#define CHECK_H "xop-check.h"
+#endif
+
+#ifndef TEST
+#define TEST xop_test
+#endif
+
+#include CHECK_H
+
+#define N 64
+
+#ifndef TYPE1
+#define TYPE1 int
+#define TYPE2 long long
+#endif
+
+signed TYPE1 a[N], b[N], g[N];
+unsigned TYPE1 c[N], h[N];
+signed TYPE2 d[N], e[N], j[N];
+unsigned TYPE2 f[N], k[N];
+
+__attribute__((noinline)) void
+f1 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    g[i] = a[i] << b[i];
+}
+
+__attribute__((noinline)) void
+f2 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    g[i] = a[i] >> b[i];
+}
+
+__attribute__((noinline)) void
+f3 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    h[i] = c[i] >> b[i];
+}
+
+__attribute__((noinline)) void
+f4 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] << e[i];
+}
+
+__attribute__((noinline)) void
+f5 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] >> e[i];
+}
+
+__attribute__((noinline)) void
+f6 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    k[i] = f[i] >> e[i];
+}
+
+__attribute__((noinline)) void
+f7 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] << b[i];
+}
+
+__attribute__((noinline)) void
+f8 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] >> b[i];
+}
+
+__attribute__((noinline)) void
+f9 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    k[i] = f[i] >> b[i];
+}
+
+static void
+TEST ()
+{
+  int i;
+  for (i = 0; i < N; i++)
+    {
+      asm ("");
+      c[i] = (random () << 1) | (random () & 1);
+      b[i] = (i * 85) & (sizeof (TYPE1) * __CHAR_BIT__ - 1);
+      a[i] = c[i];
+      d[i] = (random () << 1) | (random () & 1);
+      d[i] |= (unsigned long long) c[i] << 32;
+      e[i] = (i * 85) & (sizeof (TYPE2) * __CHAR_BIT__ - 1);
+      f[i] = d[i];
+    }
+  f1 ();
+  f3 ();
+  f4 ();
+  f6 ();
+  for (i = 0; i < N; i++)
+    if (g[i] != (signed TYPE1) (a[i] << b[i])
+        || h[i] != (unsigned TYPE1) (c[i] >> b[i])
+        || j[i] != (signed TYPE2) (d[i] << e[i])
+        || k[i] != (unsigned TYPE2) (f[i] >> e[i]))
+      abort ();
+  f2 ();
+  f5 ();
+  f9 ();
+  for (i = 0; i < N; i++)
+    if (g[i] != (signed TYPE1) (a[i] >> b[i])
+        || j[i] != (signed TYPE2) (d[i] >> e[i])
+        || k[i] != (unsigned TYPE2) (f[i] >> b[i]))
+      abort ();
+  f7 ();
+  for (i = 0; i < N; i++)
+    if (j[i] != (signed TYPE2) (d[i] << b[i]))
+      abort ();
+  f8 ();
+  for (i = 0; i < N; i++)
+    if (j[i] != (signed TYPE2) (d[i] >> b[i]))
+      abort ();
+}
--- gcc/testsuite/gcc.target/i386/xop-vshift-2.c.jj	2011-10-28 10:02:26.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/xop-vshift-2.c	2011-10-28 10:02:26.000000000 +0200
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mxop" } */
+/* { dg-require-effective-target xop } */
+
+#define TYPE1 char
+#define TYPE2 short
+
+#include "xop-vshift-1.c"
--- gcc/testsuite/gcc.target/i386/avx2-vshift-1.c.jj	2011-10-28 10:02:26.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx2-vshift-1.c	2011-10-28 10:02:26.000000000 +0200
@@ -0,0 +1,13 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mavx2" } */
+/* { dg-require-effective-target avx2 } */
+
+#ifndef CHECK_H
+#define CHECK_H "avx2-check.h"
+#endif
+
+#ifndef TEST
+#define TEST avx2_test
+#endif
+
+#include "xop-vshift-1.c"
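
For readers less familiar with XOP: the !TARGET_AVX2 paths in the right
shift expanders negate operand 2 and then emit the XOP left shift
pattern.  That works because the XOP per-element shifts take a signed
count, shifting left for a non-negative count and right for a negative
one.  A minimal scalar sketch of the idea in C follows.  It models a
single 32-bit lane only, assumes counts in [-31, 31], and the names
model_xop_shl and model_vlshr are made up for illustration; they are
not GCC or intrinsic names.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of an XOP-style logical
   shift: a non-negative count shifts left, a negative count shifts
   right.  Counts are assumed to lie in [-31, 31].  */
static uint32_t
model_xop_shl (uint32_t x, int count)
{
  return count >= 0 ? x << count : x >> -count;
}

/* Logical right shift expressed as "negate the count, then shift
   left", mirroring the shape of the !TARGET_AVX2 expander path.  */
static uint32_t
model_vlshr (uint32_t x, int count)
{
  return model_xop_shl (x, -count);
}
```

With this model, model_vlshr (0xF0, 4) gives 0xF, the same as the plain
C expression 0xF0u >> 4.  The arithmetic right shift case would follow
the same shape with a sign-propagating right shift in the model.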