From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Subject: Add utility functions for rtx broadcast/duplicate constants
Date: Thu, 20 Aug 2015 11:34:02 +0100
Message-ID: <878u969rl1.fsf@e105548-lin.cambridge.arm.com>

Several pieces of code want to know whether all elements of a CONST_VECTOR
are equal, and I'm about to add some more to simplify-rtx.c.  This patch
adds some utility functions for that.

I don't think we're really helping ourselves by having the shift amount
in "v16qi << 3" be:

  (const_vector:V16QI [
      (const_int 3) (const_int 3) (const_int 3) (const_int 3)
      (const_int 3) (const_int 3) (const_int 3) (const_int 3)
      (const_int 3) (const_int 3) (const_int 3) (const_int 3)
      (const_int 3) (const_int 3) (const_int 3) (const_int 3)
  ])

so I wanted to leave open the possibility of using:

  (const:V16QI (vec_duplicate:V16QI (const_int 3)))

in future.  The interface therefore passes back the duplicated element
rather than leaving callers to use CONST_VECTOR_ELT (c, 0)
(== XVECEXP (c, 0, 0)).

unwrap_const_vec_duplicate is mostly for code that handles vector
operations equivalently to scalar ops.
The follow-on simplify-rtx.c code makes more use of this.  It also came
in useful for the tilegx/tilepro predicates.

Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
I also built cross-compilers for s390x-linux-gnu, spu-elf, tilepro-elf
and tilegx-elf and checked by hand that the affected code still worked.
(Well, except for the SPU case.  That's handling vector constants in
which the elements are symbolic addresses, such as
{ &foo, &foo, &foo, &foo }.  Such vectors don't seem to be treated as
constants at the gimple level and the initial rtl code that we generate
is too complex for later optimisations to convert back to a constant,
so I wasn't sure how best to trigger it.)  OK to install?

Thanks,
Richard


gcc/
	* rtl.h (rtvec_all_equal_p): Declare.
	(const_vec_duplicate_p, unwrap_const_vec_duplicate): New functions.
	* rtl.c (rtvec_all_equal_p): New function.
	* expmed.c (expand_mult): Use unwrap_const_vec_duplicate.
	* config/aarch64/aarch64.c (aarch64_vect_float_const_representable_p)
	(aarch64_simd_dup_constant): Use const_vec_duplicate_p.
	* config/arm/arm.c (neon_vdup_constant): Likewise.
	* config/s390/s390.c (s390_contiguous_bitmask_vector_p): Likewise.
	* config/tilegx/constraints.md (W, Y): Likewise.
	* config/tilepro/constraints.md (W, Y): Likewise.
	* config/spu/spu.c (spu_legitimate_constant_p): Likewise.
	(classify_immediate): Use unwrap_const_vec_duplicate.
	* config/tilepro/predicates.md (reg_or_v4s8bit_operand): Likewise.
	(reg_or_v2s8bit_operand): Likewise.
	* config/tilegx/predicates.md (reg_or_v8s8bit_operand): Likewise.
	(reg_or_v4s8bit_operand): Likewise.

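A minimal sketch, not part of the patch, of the design point made in the
message: callers are handed the duplicated element instead of indexing
element 0 themselves, so they need not care how the broadcast is
represented.  Everything below (ToyRtx, ToyCode, toy_vec_duplicate_p,
toy_unwrap_vec_duplicate, toy_uniform_shift_amount) is a hypothetical toy
model and does not use GCC's rtl.h.  The VecDuplicate case models the
possible future representation only; the patch as posted handles
CONST_VECTOR alone.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-ins for rtx codes.  Hypothetical; these are not GCC's definitions.
enum class ToyCode { ConstInt, ConstVector, VecDuplicate };

struct ToyRtx {
  ToyCode code;
  long value;                        // meaningful for ConstInt only
  std::vector<const ToyRtx *> elts;  // meaningful for ConstVector only
  const ToyRtx *dup;                 // meaningful for VecDuplicate only
};

inline ToyRtx toy_const_int(long v) {
  return ToyRtx{ToyCode::ConstInt, v, {}, nullptr};
}
inline ToyRtx toy_const_vector(std::vector<const ToyRtx *> elts) {
  return ToyRtx{ToyCode::ConstVector, 0, std::move(elts), nullptr};
}
inline ToyRtx toy_vec_duplicate(const ToyRtx *elt) {
  return ToyRtx{ToyCode::VecDuplicate, 0, {}, elt};
}

// Return true if X broadcasts a single element, storing it in *ELT if so.
inline bool toy_vec_duplicate_p(const ToyRtx *x, const ToyRtx **elt) {
  if (x->code == ToyCode::VecDuplicate) {  // assumed future representation
    *elt = x->dup;
    return true;
  }
  if (x->code != ToyCode::ConstVector || x->elts.empty())
    return false;
  // Toy equality: elements must be integer constants with the same value.
  for (const ToyRtx *e : x->elts)
    if (e->code != ToyCode::ConstInt || e->value != x->elts[0]->value)
      return false;
  *elt = x->elts[0];
  return true;
}

// If X broadcasts a single element, return that element, otherwise X.
inline const ToyRtx *toy_unwrap_vec_duplicate(const ToyRtx *x) {
  const ToyRtx *elt = nullptr;
  return toy_vec_duplicate_p(x, &elt) ? elt : x;
}

// A caller that treats scalars and broadcast vectors alike: true if X is a
// uniform integer shift amount, which is then stored in *AMOUNT.
inline bool toy_uniform_shift_amount(const ToyRtx *x, long *amount) {
  const ToyRtx *elt = toy_unwrap_vec_duplicate(x);
  if (elt->code != ToyCode::ConstInt)
    return false;
  *amount = elt->value;
  return true;
}
```

In this toy, toy_uniform_shift_amount gives the same answer for a scalar
constant, for a four-element vector of that constant and for the
vec_duplicate-style wrapper, without ever inspecting the representation.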
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 382be2c..9b2ea2c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9879,31 +9879,10 @@ sizetochar (int size)
 static bool
 aarch64_vect_float_const_representable_p (rtx x)
 {
-  int i = 0;
-  REAL_VALUE_TYPE r0, ri;
-  rtx x0, xi;
-
-  if (GET_MODE_CLASS (GET_MODE (x)) != MODE_VECTOR_FLOAT)
-    return false;
-
-  x0 = CONST_VECTOR_ELT (x, 0);
-  if (!CONST_DOUBLE_P (x0))
-    return false;
-
-  REAL_VALUE_FROM_CONST_DOUBLE (r0, x0);
-
-  for (i = 1; i < CONST_VECTOR_NUNITS (x); i++)
-    {
-      xi = CONST_VECTOR_ELT (x, i);
-      if (!CONST_DOUBLE_P (xi))
-        return false;
-
-      REAL_VALUE_FROM_CONST_DOUBLE (ri, xi);
-      if (!REAL_VALUES_EQUAL (r0, ri))
-        return false;
-    }
-
-  return aarch64_float_const_representable_p (x0);
+  rtx elt;
+  return (GET_MODE_CLASS (GET_MODE (x)) == MODE_VECTOR_FLOAT
+          && const_vec_duplicate_p (x, &elt)
+          && aarch64_float_const_representable_p (elt));
 }
 
 /* Return true for valid and false for invalid.  */
@@ -10366,28 +10345,15 @@ aarch64_simd_dup_constant (rtx vals)
 {
   machine_mode mode = GET_MODE (vals);
   machine_mode inner_mode = GET_MODE_INNER (mode);
-  int n_elts = GET_MODE_NUNITS (mode);
-  bool all_same = true;
   rtx x;
-  int i;
-
-  if (GET_CODE (vals) != CONST_VECTOR)
-    return NULL_RTX;
-
-  for (i = 1; i < n_elts; ++i)
-    {
-      x = CONST_VECTOR_ELT (vals, i);
-      if (!rtx_equal_p (x, CONST_VECTOR_ELT (vals, 0)))
-        all_same = false;
-    }
 
-  if (!all_same)
+  if (!const_vec_duplicate_p (vals, &x))
     return NULL_RTX;
 
   /* We can load this constant by using DUP and a constant in a
      single ARM register.  This will be cheaper than a vector
      load.  */
 
-  x = copy_to_mode_reg (inner_mode, CONST_VECTOR_ELT (vals, 0));
+  x = copy_to_mode_reg (inner_mode, x);
   return gen_rtx_VEC_DUPLICATE (mode, x);
 }
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index da77244..c2095a3 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12607,22 +12607,12 @@ neon_vdup_constant (rtx vals)
 {
   machine_mode mode = GET_MODE (vals);
   machine_mode inner_mode = GET_MODE_INNER (mode);
-  int n_elts = GET_MODE_NUNITS (mode);
-  bool all_same = true;
   rtx x;
-  int i;
 
   if (GET_CODE (vals) != CONST_VECTOR || GET_MODE_SIZE (inner_mode) > 4)
     return NULL_RTX;
 
-  for (i = 0; i < n_elts; ++i)
-    {
-      x = XVECEXP (vals, 0, i);
-      if (i > 0 && !rtx_equal_p (x, XVECEXP (vals, 0, 0)))
-        all_same = false;
-    }
-
-  if (!all_same)
+  if (!const_vec_duplicate_p (vals, &x))
     /* The elements are not all the same.  We could handle repeating
        patterns of a mode larger than INNER_MODE here (e.g. int8x8_t
        {0, C, 0, C, 0, C, 0, C} which can be loaded using
@@ -12633,7 +12623,7 @@ neon_vdup_constant (rtx vals)
      single ARM register.  This will be cheaper than a vector
      load.  */
 
-  x = copy_to_mode_reg (inner_mode, XVECEXP (vals, 0, 0));
+  x = copy_to_mode_reg (inner_mode, x);
   return gen_rtx_VEC_DUPLICATE (mode, x);
 }
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 5814694..54b6b7d 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2258,23 +2258,14 @@ s390_contiguous_bitmask_vector_p (rtx op, int *start, int *end)
 {
   unsigned HOST_WIDE_INT mask;
   int length, size;
+  rtx elt;
 
-  if (!VECTOR_MODE_P (GET_MODE (op))
-      || GET_CODE (op) != CONST_VECTOR
-      || !CONST_INT_P (XVECEXP (op, 0, 0)))
+  if (!const_vec_duplicate_p (op, &elt)
+      || !CONST_INT_P (elt))
     return false;
 
-  if (GET_MODE_NUNITS (GET_MODE (op)) > 1)
-    {
-      int i;
-
-      for (i = 1; i < GET_MODE_NUNITS (GET_MODE (op)); ++i)
-        if (!rtx_equal_p (XVECEXP (op, 0, i), XVECEXP (op, 0, 0)))
-          return false;
-    }
-
   size = GET_MODE_UNIT_BITSIZE (GET_MODE (op));
-  mask = UINTVAL (XVECEXP (op, 0, 0));
+  mask = UINTVAL (elt);
   if (s390_contiguous_bitmask_p (mask, size, start,
                                  end != NULL ? &length : NULL))
     {
diff --git a/gcc/config/spu/spu.c b/gcc/config/spu/spu.c
index ca76287..05c81f5 100644
--- a/gcc/config/spu/spu.c
+++ b/gcc/config/spu/spu.c
@@ -3185,11 +3185,8 @@ classify_immediate (rtx op, machine_mode mode)
       && mode == V4SImode
       && GET_CODE (op) == CONST_VECTOR
       && GET_CODE (CONST_VECTOR_ELT (op, 0)) != CONST_INT
-      && GET_CODE (CONST_VECTOR_ELT (op, 0)) != CONST_DOUBLE
-      && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)
-      && CONST_VECTOR_ELT (op, 1) == CONST_VECTOR_ELT (op, 2)
-      && CONST_VECTOR_ELT (op, 2) == CONST_VECTOR_ELT (op, 3))
-    op = CONST_VECTOR_ELT (op, 0);
+      && GET_CODE (CONST_VECTOR_ELT (op, 0)) != CONST_DOUBLE)
+    op = unwrap_const_vec_duplicate (op);
 
   switch (GET_CODE (op))
     {
@@ -3507,9 +3504,7 @@ spu_legitimate_constant_p (machine_mode mode, rtx x)
       && (GET_CODE (CONST_VECTOR_ELT (x, 0)) == SYMBOL_REF
           || GET_CODE (CONST_VECTOR_ELT (x, 0)) == LABEL_REF
           || GET_CODE (CONST_VECTOR_ELT (x, 0)) == CONST))
-    return CONST_VECTOR_ELT (x, 0) == CONST_VECTOR_ELT (x, 1)
-           && CONST_VECTOR_ELT (x, 1) == CONST_VECTOR_ELT (x, 2)
-           && CONST_VECTOR_ELT (x, 2) == CONST_VECTOR_ELT (x, 3);
+    return const_vec_duplicate_p (x);
 
   if (GET_CODE (x) == CONST_VECTOR
       && !const_vector_immediate_p (x))
diff --git a/gcc/config/tilegx/constraints.md b/gcc/config/tilegx/constraints.md
index 783e1ca..f47d0f6 100644
--- a/gcc/config/tilegx/constraints.md
+++ b/gcc/config/tilegx/constraints.md
@@ -96,21 +96,14 @@
   "An 8-element vector constant with identical elements"
   (and (match_code "const_vector")
        (match_test "CONST_VECTOR_NUNITS (op) == 8")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 2)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 3)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 4)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 5)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 6)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 7)")))
+       (match_test "const_vec_duplicate_p (op)")))
 
 (define_constraint "Y"
   "A 4-element vector constant with identical elements"
   (and (match_code "const_vector")
        (match_test "CONST_VECTOR_NUNITS (op) == 4")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 2)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 3)")))
+       (match_test "const_vec_duplicate_p (op)")))
+
 (define_constraint "Z0"
   "The integer constant 0xffffffff"
   (and (match_code "const_int")
diff --git a/gcc/config/tilegx/predicates.md b/gcc/config/tilegx/predicates.md
index 4cbebf1..ce04660 100644
--- a/gcc/config/tilegx/predicates.md
+++ b/gcc/config/tilegx/predicates.md
@@ -112,14 +112,8 @@
   (ior (match_operand 0 "register_operand")
        (and (match_code "const_vector")
             (match_test "CONST_VECTOR_NUNITS (op) == 8
-                         && satisfies_constraint_I (CONST_VECTOR_ELT (op, 0))
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 2)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 3)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 4)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 5)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 6)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 7)"))))
+                         && (satisfies_constraint_I
+                             (unwrap_const_vec_duplicate (op)))"))))
 
 ;; Return 1 if OP is a 4-element vector constant with identical signed
 ;; 8-bit elements or any register.
@@ -127,10 +121,8 @@
   (ior (match_operand 0 "register_operand")
        (and (match_code "const_vector")
             (match_test "CONST_VECTOR_NUNITS (op) == 4
-                         && satisfies_constraint_I (CONST_VECTOR_ELT (op, 0))
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 2)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 3)"))))
+                         && (satisfies_constraint_I
+                             (unwrap_const_vec_duplicate (op)))"))))
 
 ;; Return 1 if the operand is a valid second operand to an add insn.
 (define_predicate "add_operand"
diff --git a/gcc/config/tilepro/constraints.md b/gcc/config/tilepro/constraints.md
index 4d13fb0..3ab9ab7 100644
--- a/gcc/config/tilepro/constraints.md
+++ b/gcc/config/tilepro/constraints.md
@@ -90,12 +90,10 @@
   "A 4-element vector constant with identical elements"
   (and (match_code "const_vector")
        (match_test "CONST_VECTOR_NUNITS (op) == 4")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 2)")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 3)")))
+       (match_test "const_vec_duplicate_p (op)")))
 
 (define_constraint "Y"
   "A 2-element vector constant with identical elements"
   (and (match_code "const_vector")
        (match_test "CONST_VECTOR_NUNITS (op) == 2")
-       (match_test "CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)")))
+       (match_test "const_vec_duplicate_p (op)")))
diff --git a/gcc/config/tilepro/predicates.md b/gcc/config/tilepro/predicates.md
index 00d2bb9..ab62d20 100644
--- a/gcc/config/tilepro/predicates.md
+++ b/gcc/config/tilepro/predicates.md
@@ -75,10 +75,8 @@
   (ior (match_operand 0 "register_operand")
        (and (match_code "const_vector")
             (match_test "CONST_VECTOR_NUNITS (op) == 4
-                         && satisfies_constraint_I (CONST_VECTOR_ELT (op, 0))
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 2)
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 3)"))))
+                         && (satisfies_constraint_I
+                             (unwrap_const_vec_duplicate (op)))"))))
 
 ;; Return 1 if OP is a 2-element vector constant with identical signed
 ;; 8-bit elements or any register.
@@ -86,8 +84,8 @@
   (ior (match_operand 0 "register_operand")
        (and (match_code "const_vector")
             (match_test "CONST_VECTOR_NUNITS (op) == 2
-                         && satisfies_constraint_I (CONST_VECTOR_ELT (op, 0))
-                         && CONST_VECTOR_ELT (op, 0) == CONST_VECTOR_ELT (op, 1)"))))
+                         && (satisfies_constraint_I
+                             (unwrap_const_vec_duplicate (op)))"))))
 
 ;; Return 1 if the operand is a valid second operand to an add insn.
 (define_predicate "add_operand"
diff --git a/gcc/expmed.c b/gcc/expmed.c
index 59b2919..604a957 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -3117,15 +3117,7 @@ expand_mult (machine_mode mode, rtx op0, rtx op1, rtx target,
 
   /* For vectors, there are several simplifications that can be made if
      all elements of the vector constant are identical.  */
-  scalar_op1 = op1;
-  if (GET_CODE (op1) == CONST_VECTOR)
-    {
-      int i, n = CONST_VECTOR_NUNITS (op1);
-      scalar_op1 = CONST_VECTOR_ELT (op1, 0);
-      for (i = 1; i < n; ++i)
-        if (!rtx_equal_p (scalar_op1, CONST_VECTOR_ELT (op1, i)))
-          goto skip_scalar;
-    }
+  scalar_op1 = unwrap_const_vec_duplicate (op1);
 
   if (INTEGRAL_MODE_P (mode))
     {
@@ -3254,7 +3246,6 @@ expand_mult (machine_mode mode, rtx op0, rtx op1, rtx target,
                               target, unsignedp, OPTAB_LIB_WIDEN);
         }
     }
- skip_scalar:
 
   /* This used to use umul_optab if unsigned, but for non-widening multiply
      there is no difference between signed and unsigned.  */
diff --git a/gcc/rtl.c b/gcc/rtl.c
index b1b485e..3c8bdc1 100644
--- a/gcc/rtl.c
+++ b/gcc/rtl.c
@@ -657,6 +657,31 @@ rtx_equal_p (const_rtx x, const_rtx y)
   return 1;
 }
 
+/* Return true if all elements of VEC are equal.  */
+
+bool
+rtvec_all_equal_p (const_rtvec vec)
+{
+  const_rtx first = RTVEC_ELT (vec, 0);
+  /* Optimize the important special case of a vector of constants.
+     The main use of this function is to detect whether every element
+     of CONST_VECTOR is the same.  */
+  switch (GET_CODE (first))
+    {
+    CASE_CONST_UNIQUE:
+      for (int i = 1, n = GET_NUM_ELEM (vec); i < n; ++i)
+        if (first != RTVEC_ELT (vec, i))
+          return false;
+      return true;
+
+    default:
+      for (int i = 1, n = GET_NUM_ELEM (vec); i < n; ++i)
+        if (!rtx_equal_p (first, RTVEC_ELT (vec, i)))
+          return false;
+      return true;
+    }
+}
+
 /* Return an indication of which type of insn should have X as a body.
    In generator files, this can be UNKNOWN if the answer is only known
    at (GCC) runtime.  Otherwise the value is CODE_LABEL, INSN, CALL_INSN
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 5e02397..ac56133 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2678,6 +2678,42 @@ extern unsigned int rtx_size (const_rtx);
 extern rtx shallow_copy_rtx_stat (const_rtx MEM_STAT_DECL);
 #define shallow_copy_rtx(a) shallow_copy_rtx_stat (a MEM_STAT_INFO)
 extern int rtx_equal_p (const_rtx, const_rtx);
+extern bool rtvec_all_equal_p (const_rtvec);
+
+/* Return true if X is a vector constant with a duplicated element value.  */
+
+inline bool
+const_vec_duplicate_p (const_rtx x)
+{
+  return GET_CODE (x) == CONST_VECTOR && rtvec_all_equal_p (XVEC (x, 0));
+}
+
+/* Return true if X is a vector constant with a duplicated element value.
+   Store the duplicated element in *ELT if so.  */
+
+template <typename T>
+inline bool
+const_vec_duplicate_p (T x, T *elt)
+{
+  if (const_vec_duplicate_p (x))
+    {
+      *elt = CONST_VECTOR_ELT (x, 0);
+      return true;
+    }
+  return false;
+}
+
+/* If X is a vector constant with a duplicated element value, return that
+   element value, otherwise return X.  */
+
+template <typename T>
+inline T
+unwrap_const_vec_duplicate (T x)
+{
+  if (const_vec_duplicate_p (x))
+    x = CONST_VECTOR_ELT (x, 0);
+  return x;
+}
 
 /* In emit-rtl.c */
 extern rtvec gen_rtvec_v (int, rtx *);
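
A minimal sketch, not part of the patch, of the two comparison paths in
rtvec_all_equal_p.  ToyNode, toy_equal_p and toy_all_equal_p are
hypothetical names; the `unique` flag stands in for the codes matched by
CASE_CONST_UNIQUE, under the assumption that equal values of such
constants are always represented by one shared object, so that pointer
identity is a sufficient test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy node.  `unique` models "covered by CASE_CONST_UNIQUE":
// equal values are assumed to share a single object.
struct ToyNode {
  bool unique;
  std::string text;
};

// Structural comparison, playing the role of rtx_equal_p in this toy.
inline bool toy_equal_p(const ToyNode *a, const ToyNode *b) {
  return a == b || (a->unique == b->unique && a->text == b->text);
}

// Return true if all elements of VEC are equal.  Like the function in the
// patch, this assumes VEC has at least one element.
inline bool toy_all_equal_p(const std::vector<const ToyNode *> &vec) {
  const ToyNode *first = vec[0];
  if (first->unique) {
    // Fast path: pointer identity is enough for uniquely shared constants.
    for (std::size_t i = 1; i < vec.size(); ++i)
      if (first != vec[i])
        return false;
    return true;
  }
  // General path: fall back to a structural comparison.
  for (std::size_t i = 1; i < vec.size(); ++i)
    if (!toy_equal_p(first, vec[i]))
      return false;
  return true;
}
```

In this toy, two distinct objects that both spell "&foo" compare equal
only through the structural path, which is the kind of element the
default case exists for.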