From patchwork Thu Mar 16 09:29:52 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrill Tkachov
X-Patchwork-Id: 739720
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Message-ID: <58CA5B10.3020605@foss.arm.com>
Date: Thu, 16 Mar 2017 09:29:52 +0000
From: Kyrill Tkachov
To: GCC Patches
CC: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh
Subject: [PATCH][AArch64] Use 'x' constraint for vector HFmode
 multiplication by indexed element instructions

Hi all,

The advsimd-intrinsics.exp tests for the fmul and fmulx instructions
that perform a multiplication by indexed element have started generating
invalid assembly in my testing. For example:

Error: register number out of range 0 to 15 at operand 3 -- `fmulx v24.8h,v23.8h,v22.h[0]'

The problem is that the indexed vector register (v22 in this case) has
to be in V0-V15 when accessed as a 16-bit element. The constraints on
the pattern don't reflect this. We already have the h_con constraint
that's supposed to do what we want, but it incorrectly returns the "w"
constraint for HF inner modes and it isn't applied in all the patterns
that it needs to be (it's needed for the FMLA, FMLS, FMUL, FMULX by
element patterns).
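The register restriction described above can be sketched as a small model. This is an illustration only, not GCC or assembler code: the function names are hypothetical, and the field-width reasoning in the comments (a 4-bit register field for 16-bit elements, because the element index needs an extra encoding bit) comes from the A64 encoding rather than from this message.

```python
# Hypothetical model of the indexed-operand register limit.
# Names are illustrative; nothing here is a GCC or binutils API.

def max_indexed_register(element_bits: int) -> int:
    """Return the highest vector register usable as the indexed operand."""
    # 16-bit elements: the register field is only 4 bits wide, so V0-V15.
    # 32-bit and 64-bit elements: a 5-bit field, so V0-V31.
    return 15 if element_bits == 16 else 31


def check_indexed_operand(reg: int, element_bits: int) -> None:
    """Raise ValueError if `reg` cannot be encoded as the indexed operand."""
    limit = max_indexed_register(element_bits)
    if not 0 <= reg <= limit:
        raise ValueError(f"register number out of range 0 to {limit}")


# v22 is acceptable as a 32-bit indexed element ...
check_indexed_operand(22, 32)

# ... but not as a 16-bit one, which is the failing case in the report.
try:
    check_indexed_operand(22, 16)
    message = None
except ValueError as err:
    message = str(err)
```

In the compiler the same limit is expressed through register constraints: "x" restricts the operand to the lower half of the vector register file, while "w" allows any vector register.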

This patch fixes those issues by changing h_con to return the "x"
constraint for HF inner modes and applying it to all the operands that
need it in aarch64-simd.md.

With this patch the advsimd-intrinsics.exp tests now generate valid
assembly and don't complain, so no new regression tests are added.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2017-03-16  Kyrylo Tkachov

    * config/aarch64/iterators.md (h_con): Return "x" for V4HF and V8HF.
    * config/aarch64/aarch64-simd.md (*aarch64_fma4_elt_from_dup<mode>):
    Use h_con constraint for operand 1.
    (*aarch64_fnma4_elt_from_dup<mode>): Likewise.
    (*aarch64_mulx_elt_from_dup<mode>): Likewise for operand 2.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b3db7be..bf13f07 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1660,7 +1660,7 @@ (define_insn "*aarch64_fma4_elt_from_dup<mode>"
   [(set (match_operand:VMUL 0 "register_operand" "=w")
     (fma:VMUL
       (vec_duplicate:VMUL
-	  (match_operand:<VEL> 1 "register_operand" "w"))
+	  (match_operand:<VEL> 1 "register_operand" "<h_con>"))
       (match_operand:VMUL 2 "register_operand" "w")
       (match_operand:VMUL 3 "register_operand" "0")))]
   "TARGET_SIMD"
@@ -1739,7 +1739,7 @@ (define_insn "*aarch64_fnma4_elt_from_dup<mode>"
       (neg:VMUL
         (match_operand:VMUL 2 "register_operand" "w"))
       (vec_duplicate:VMUL
-	(match_operand:<VEL> 1 "register_operand" "w"))
+	(match_operand:<VEL> 1 "register_operand" "<h_con>"))
       (match_operand:VMUL 3 "register_operand" "0")))]
   "TARGET_SIMD"
   "fmls\t%0.<Vtype>, %2.<Vtype>, %1.<Vetype>[0]"
@@ -3191,7 +3191,7 @@ (define_insn "*aarch64_mulx_elt_from_dup<mode>"
 	(unspec:VHSDF
 	 [(match_operand:VHSDF 1 "register_operand" "w")
 	  (vec_duplicate:VHSDF
-	    (match_operand:<VEL> 2 "register_operand" "w"))]
+	    (match_operand:<VEL> 2 "register_operand" "<h_con>"))]
 	 UNSPEC_FMULX))]
   "TARGET_SIMD"
   "fmulx\t%0.<Vtype>, %1.<Vtype>, %2.<Vetype>[0]";
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 1ddf6ad..43be7fd 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -749,11 +749,11 @@ (define_mode_attr vswap_width_name [(V8QI "to_128") (V16QI "to_64")
 				    (DF    "to_128") (V2DF  "to_64")])
 
 ;; For certain vector-by-element multiplication instructions we must
-;; constrain the HI cases to use only V0-V15.  This is covered by
+;; constrain the 16-bit cases to use only V0-V15.  This is covered by
 ;; the 'x' constraint.  All other modes may use the 'w' constraint.
 (define_mode_attr h_con [(V2SI "w") (V4SI "w")
 			 (V4HI "x") (V8HI "x")
-			 (V4HF "w") (V8HF "w")
+			 (V4HF "x") (V8HF "x")
 			 (V2SF "w") (V4SF "w")
 			 (V2DF "w") (DF "w")])
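
As a rough illustration of what the h_con change does, the mode attribute can be modelled as a lookup table from vector mode to constraint letter. The tables below copy the values from the iterators.md hunk; the helper name and the set of 16-bit modes are assumptions made for this sketch and are not part of GCC.

```python
# Sketch only: models the h_con mode attribute as a dictionary.
# Values are taken from the iterators.md hunk; the helper is hypothetical.

H_CON_BEFORE = {
    "V2SI": "w", "V4SI": "w",
    "V4HI": "x", "V8HI": "x",
    "V4HF": "w", "V8HF": "w",
    "V2SF": "w", "V4SF": "w",
    "V2DF": "w", "DF": "w",
}

# The patch changes only the two half-precision float vector modes.
H_CON_AFTER = dict(H_CON_BEFORE, V4HF="x", V8HF="x")

# Modes in the table whose elements are 16 bits wide (assumed grouping).
SIXTEEN_BIT_MODES = {"V4HI", "V8HI", "V4HF", "V8HF"}


def wrongly_constrained(table: dict) -> list:
    """List 16-bit modes that are not restricted to V0-V15 via 'x'."""
    return sorted(m for m in SIXTEEN_BIT_MODES if table[m] != "x")


before = wrongly_constrained(H_CON_BEFORE)  # the HF modes the patch targets
after = wrongly_constrained(H_CON_AFTER)    # empty once the patch is applied
```

The table alone is not sufficient: a pattern only picks up the restriction if its indexed operand actually uses the h_con attribute as its constraint, which is what the aarch64-simd.md hunks arrange.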