From patchwork Mon Jan 15 11:23:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrill Tkachov X-Patchwork-Id: 860817 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-471260-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="t8LAtF+F"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3zKrb44l9dz9sBd for ; Mon, 15 Jan 2018 22:24:00 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; q= dns; s=default; b=u9K7k/v6ArTXqUaiYMl0EvJQJcf9/TFkswIpGRdhBqBkFR OQCoJBar9ezT6c0oi6JDEy4V/hDIrH4yFHHiWpJFCTJrI1lz7GsAjU+ZCj5jD496 muQg8savjpsSAQzHsupHAiX/gL8cjHyXJlrFBCn7U56GP5Hvdi+LdOo/VaKT0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; s= default; bh=Ld4K3VZIDRcES56MJjIjl8YpS7M=; b=t8LAtF+F6rDJNF2hppL5 gkfVxZhjPyNqmFeQVZmiEoqt/f8+mV6tEioNmd62cRrS6Udhh/WxX8A8zOToskQ3 PvZkCTAckhF5nWnN8fWiJPz5neMGS88r8FORT6q045PK+GG5bOvxQ095wPA/e7bU uPAKgDtz+r3tEbrWl8Xwj74= Received: (qmail 88674 invoked by alias); 15 Jan 2018 11:23:53 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 88663 invoked by uid 89); 15 Jan 2018 11:23:52 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-24.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, T_RP_MATCHES_RCVD, URIBL_ABUSE_SURBL autolearn=ham version=3.3.2 spammy= X-HELO: foss.arm.com Received: from usa-sjc-mx-foss1.foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 15 Jan 2018 11:23:50 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0DBF880D for ; Mon, 15 Jan 2018 03:23:49 -0800 (PST) Received: from [10.2.207.77] (e100706-lin.cambridge.arm.com [10.2.207.77]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id ADD323F59A for ; Mon, 15 Jan 2018 03:23:48 -0800 (PST) Message-ID: <5A5C8F42.6040503@foss.arm.com> Date: Mon, 15 Jan 2018 11:23:46 +0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: "gcc-patches@gcc.gnu.org" Subject: [PATCH][arm] PR target/83687: Fix invalid combination of VSUB + VABS into VABD Hi all, In this wrong-code bug we combine a VSUB.I8 and a VABS.S8 into a VABD.S8 instruction . This combination is not valid for integer operands because in the VABD instruction the semantics are that the difference is computed in notionally infinite precision and the absolute difference is computed on that, whereas for a VSUB.I8 + VABS.S8 sequence the VSUB operation will perform any wrapping that's needed for the 8-bit signed type before the VABS gets its hands on it. This leads to the wrong-code in the PR where the expected sequence from the intrinsics: VSUB + VABS of two vectors {-100, -100, -100...}, {100, 100, 100...} gives a result of {56, 56, 56...} (-100 - 100) but GCC optimises it into a single VABD of {-100, -100, -100...}, {100, 100, 100...} which produces a result of {200, 200, 200...} The transformation is still valid for floating-point operands, which is why it was added in the first place I believe (r178817) but this patch disables it for integer operands. The HFmode variants though only exist for TARGET_NEON_FP16INST, so this patch adds the appropriate guards to the new mode iterator Bootstrapped and tested on arm-none-linux-gnueabihf. Committing to trunk. Thanks, Kyrill 2018-01-15 Kyrylo Tkachov PR target/83687 * config/arm/iterators.md (VF): New mode iterator. * config/arm/neon.md (neon_vabd_2): Use the above. Remove integer-related logic from pattern. (neon_vabd_3): Likewise. 2018-01-15 Kyrylo Tkachov PR target/83687 * gcc.target/arm/neon-combine-sub-abs-into-vabd.c: Delete integer tests. * gcc.target/arm/pr83687.c: New test. diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index 5772aa99cc92de66ef4438b76632e86325a96ef2..0b2d42399d22ba89a976e39bef6182d31173c1ef 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -119,6 +119,10 @@ (define_mode_iterator VN [V8HI V4SI V2DI]) ;; All supported vector modes (except singleton DImode). (define_mode_iterator VDQ [V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF V2SF V4SF V2DI]) +;; All supported floating-point vector modes (except V2DF). +(define_mode_iterator VF [(V4HF "TARGET_NEON_FP16INST") + (V8HF "TARGET_NEON_FP16INST") V2SF V4SF]) + ;; All supported vector modes (except those with 64-bit integer elements). (define_mode_iterator VDQW [V8QI V16QI V4HI V8HI V2SI V4SI V2SF V4SF]) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 59fb6435da8abfe46254558e8646cd4606acb4fa..6a6f5d737715e4100adee8fb7de1d6211da3d85c 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -6706,28 +6706,22 @@ (define_expand "vec_pack_trunc_" }) (define_insn "neon_vabd_2" - [(set (match_operand:VDQ 0 "s_register_operand" "=w") - (abs:VDQ (minus:VDQ (match_operand:VDQ 1 "s_register_operand" "w") - (match_operand:VDQ 2 "s_register_operand" "w"))))] - "TARGET_NEON && (! || flag_unsafe_math_optimizations)" + [(set (match_operand:VF 0 "s_register_operand" "=w") + (abs:VF (minus:VF (match_operand:VF 1 "s_register_operand" "w") + (match_operand:VF 2 "s_register_operand" "w"))))] + "TARGET_NEON && flag_unsafe_math_optimizations" "vabd. %0, %1, %2" - [(set (attr "type") - (if_then_else (ne (symbol_ref "") (const_int 0)) - (const_string "neon_fp_abd_s") - (const_string "neon_abd")))] + [(set_attr "type" "neon_fp_abd_s")] ) (define_insn "neon_vabd_3" - [(set (match_operand:VDQ 0 "s_register_operand" "=w") - (abs:VDQ (unspec:VDQ [(match_operand:VDQ 1 "s_register_operand" "w") - (match_operand:VDQ 2 "s_register_operand" "w")] - UNSPEC_VSUB)))] - "TARGET_NEON && (! || flag_unsafe_math_optimizations)" + [(set (match_operand:VF 0 "s_register_operand" "=w") + (abs:VF (unspec:VF [(match_operand:VF 1 "s_register_operand" "w") + (match_operand:VF 2 "s_register_operand" "w")] + UNSPEC_VSUB)))] + "TARGET_NEON && flag_unsafe_math_optimizations" "vabd. %0, %1, %2" - [(set (attr "type") - (if_then_else (ne (symbol_ref "") (const_int 0)) - (const_string "neon_fp_abd_s") - (const_string "neon_abd")))] + [(set_attr "type" "neon_fp_abd_s")] ) ;; Copy from core-to-neon regs, then extend, not vice-versa diff --git a/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c b/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c index fe3d78b308cde0338300785cf7cb6ca77a831e3d..784714f0e87d8cd1216af948c61cdb87319e02cd 100644 --- a/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c +++ b/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c @@ -12,31 +12,3 @@ float32x2_t f_sub_abs_to_vabd_32(float32x2_t val1, float32x2_t val2) return res; } /* { dg-final { scan-assembler "vabd\.f32" } }*/ - -#include -int8x8_t sub_abs_to_vabd_8(int8x8_t val1, int8x8_t val2) -{ - int8x8_t sres = vsub_s8(val1, val2); - int8x8_t res = vabs_s8 (sres); - - return res; -} -/* { dg-final { scan-assembler "vabd\.s8" } }*/ - -int16x4_t sub_abs_to_vabd_16(int16x4_t val1, int16x4_t val2) -{ - int16x4_t sres = vsub_s16(val1, val2); - int16x4_t res = vabs_s16 (sres); - - return res; -} -/* { dg-final { scan-assembler "vabd\.s16" } }*/ - -int32x2_t sub_abs_to_vabd_32(int32x2_t val1, int32x2_t val2) -{ - int32x2_t sres = vsub_s32(val1, val2); - int32x2_t res = vabs_s32 (sres); - - return res; -} -/* { dg-final { scan-assembler "vabd\.s32" } }*/ diff --git a/gcc/testsuite/gcc.target/arm/pr83687.c b/gcc/testsuite/gcc.target/arm/pr83687.c new file mode 100644 index 0000000000000000000000000000000000000000..42754138660739d9fbffcd337460e26de94f736f --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr83687.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O2" } */ +/* { dg-add-options arm_neon } */ + +#include + +__attribute__ ((noinline)) int8_t +testFunction1 (int8_t a, int8_t b) +{ + volatile int8x16_t sub = vsubq_s8 (vdupq_n_s8 (a), vdupq_n_s8 (b)); + int8x16_t abs = vabsq_s8 (sub); + return vgetq_lane_s8 (abs, 0); +} + +__attribute__ ((noinline)) int8_t +testFunction2 (int8_t a, int8_t b) +{ + int8x16_t sub = vsubq_s8 (vdupq_n_s8 (a), vdupq_n_s8 (b)); + int8x16_t abs = vabsq_s8 (sub); + return vgetq_lane_s8 (abs, 0); +} + +int +main (void) +{ + if (testFunction1 (-100, 100) != testFunction2 (-100, 100)) + __builtin_abort (); + + return 0; +}