From patchwork Wed May 4 19:48:10 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 618605 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3r0TB665WVz9t4h for ; Thu, 5 May 2016 05:48:46 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=hcpLjCAN; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; q=dns; s=default; b=wQadhR3SaUDoujqIQnxEQWay2uWVm j2tk/qCUvGS1OWTs1G8udbDOQYc/fCT0wVT8qhDqk7QS4rE57QoBBlT78LsV8zwQ wDJKjDtTY0rrMvL6LlhbJcnBGpX3KUJSXmPiLeFI6W9kcU1Y1B5SkNBfSnc8l0gu F4YjVGjr9JXHy0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; s=default; bh=QN9wwI1X7erEzfMiNXKWDhvMf2o=; b=hcp LjCANor27jvXVqoccdGoLDexwKHV+B27MwKqVvpRw6MFq4HbnC3XKkfp/k3Hr23M zkWZUG/RQszewxfYF6i6s095+9hYZ7lzvt33rcTXN3WztNy70G1AhwYoWy4vqA2p ZNFzUTT5NdnEWc7xF3BOHX7oPKx9pLDcvhsKyy6Y= Received: (qmail 100190 invoked by alias); 4 May 2016 19:48:39 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 100016 invoked by uid 89); 4 May 2016 19:48:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-4.0 required=5.0 tests=BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=sk:avx512b X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Wed, 04 May 2016 19:48:20 +0000 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9798BC0D7F07; Wed, 4 May 2016 19:48:14 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-113-135.phx2.redhat.com [10.3.113.135]) by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u44JmD0X013094 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 4 May 2016 15:48:14 -0400 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id u44JmBUq013690; Wed, 4 May 2016 21:48:11 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id u44JmA0B013689; Wed, 4 May 2016 21:48:10 +0200 Date: Wed, 4 May 2016 21:48:10 +0200 From: Jakub Jelinek To: Uros Bizjak , Kirill Yukhin Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] Improve *pmaddwd Message-ID: <20160504194810.GR26501@tucnak.zalov.cz> Reply-To: Jakub Jelinek MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.24 (2015-08-30) X-IsSubscribed: yes Hi! As the testcase shows, we unnecessarily disallow xmm16+, even when we can use them for -mavx512bw. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-05-04 Jakub Jelinek * config/i386/sse.md (*avx2_pmaddwd, *sse2_pmaddwd): Use v instead of x in vex or maybe_vex alternatives, use maybe_evex instead of vex in prefix. * gcc.target/i386/avx512bw-vpmaddwd-3.c: New test. Jakub --- gcc/config/i386/sse.md.jj 2016-05-04 14:36:08.000000000 +0200 +++ gcc/config/i386/sse.md 2016-05-04 15:16:44.180894303 +0200 @@ -9803,19 +9817,19 @@ (define_expand "avx2_pmaddwd" "ix86_fixup_binary_operands_no_copy (MULT, V16HImode, operands);") (define_insn "*avx2_pmaddwd" - [(set (match_operand:V8SI 0 "register_operand" "=x") + [(set (match_operand:V8SI 0 "register_operand" "=x,v") (plus:V8SI (mult:V8SI (sign_extend:V8SI (vec_select:V8HI - (match_operand:V16HI 1 "nonimmediate_operand" "%x") + (match_operand:V16HI 1 "nonimmediate_operand" "%x,v") (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 8) (const_int 10) (const_int 12) (const_int 14)]))) (sign_extend:V8SI (vec_select:V8HI - (match_operand:V16HI 2 "nonimmediate_operand" "xm") + (match_operand:V16HI 2 "nonimmediate_operand" "xm,vm") (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 8) (const_int 10) @@ -9836,7 +9850,8 @@ (define_insn "*avx2_pmaddwd" "TARGET_AVX2 && ix86_binary_operator_ok (MULT, V16HImode, operands)" "vpmaddwd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseiadd") - (set_attr "prefix" "vex") + (set_attr "isa" "*,avx512bw") + (set_attr "prefix" "vex,evex") (set_attr "mode" "OI")]) (define_expand "sse2_pmaddwd" @@ -9866,17 +9881,17 @@ (define_expand "sse2_pmaddwd" "ix86_fixup_binary_operands_no_copy (MULT, V8HImode, operands);") (define_insn "*sse2_pmaddwd" - [(set (match_operand:V4SI 0 "register_operand" "=x,x") + [(set (match_operand:V4SI 0 "register_operand" "=x,x,v") (plus:V4SI (mult:V4SI (sign_extend:V4SI (vec_select:V4HI - (match_operand:V8HI 1 "vector_operand" "%0,x") + (match_operand:V8HI 1 "vector_operand" "%0,x,v") (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)]))) (sign_extend:V4SI (vec_select:V4HI - (match_operand:V8HI 2 "vector_operand" "xBm,xm") + (match_operand:V8HI 2 "vector_operand" "xBm,xm,vm") (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)])))) (mult:V4SI @@ -9891,12 +9906,13 @@ (define_insn "*sse2_pmaddwd" "TARGET_SSE2 && ix86_binary_operator_ok (MULT, V8HImode, operands)" "@ pmaddwd\t{%2, %0|%0, %2} + vpmaddwd\t{%2, %1, %0|%0, %1, %2} vpmaddwd\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,avx") + [(set_attr "isa" "noavx,avx,avx512bw") (set_attr "type" "sseiadd") (set_attr "atom_unit" "simul") - (set_attr "prefix_data16" "1,*") - (set_attr "prefix" "orig,vex") + (set_attr "prefix_data16" "1,*,*") + (set_attr "prefix" "orig,vex,evex") (set_attr "mode" "TI")]) (define_insn "avx512dq_mul3" --- gcc/testsuite/gcc.target/i386/avx512bw-vpmaddwd-3.c.jj 2016-05-04 16:37:21.196223424 +0200 +++ gcc/testsuite/gcc.target/i386/avx512bw-vpmaddwd-3.c 2016-05-04 16:37:51.867819502 +0200 @@ -0,0 +1,24 @@ +/* { dg-do assemble { target { avx512bw && { avx512vl && { ! ia32 } } } } } */ +/* { dg-options "-O2 -mavx512bw -mavx512vl" } */ + +#include + +void +f1 (__m128i x, __m128i y) +{ + register __m128i a __asm ("xmm16"), b __asm ("xmm17"); + a = x; b = y; + asm volatile ("" : "+v" (a), "+v" (b)); + a = _mm_madd_epi16 (a, b); + asm volatile ("" : "+v" (a)); +} + +void +f2 (__m256i x, __m256i y) +{ + register __m256i a __asm ("xmm16"), b __asm ("xmm17"); + a = x; b = y; + asm volatile ("" : "+v" (a), "+v" (b)); + a = _mm256_madd_epi16 (a, b); + asm volatile ("" : "+v" (a)); +}