From patchwork Wed Jul 17 08:32:15 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 1133187
Subject: [PATCH, rs6000] Support vrotr<mode>3 for int vector types
From: "Kewen.Lin"
To: GCC Patches
Cc: Jakub Jelinek, Richard Biener, richard.sandiford@arm.com,
 Segher Boessenkool, Bill Schmidt
References: <232a38b1-76c2-476d-1be0-a1958e5624bb@linux.ibm.com>
 <20190715085929.GO2125@tucnak>
 <32f89c4f-cd2d-a7bd-16d2-26fed6bb5f56@linux.ibm.com>
Date: Wed, 17 Jul 2019 16:32:15 +0800
In-Reply-To: <32f89c4f-cd2d-a7bd-16d2-26fed6bb5f56@linux.ibm.com>
Message-Id: <27be90e6-4beb-5c4c-a163-9b136490d783@linux.ibm.com>

Hi all,

This patch follows the suggestion to improve the rs6000 backend instead of
the generic expander.  I think this is the better solution?  I had thought
a generic expander change might benefit other targets suffering from
similar issues, but the previous RFC seems too restricted to const rotation
counts, although it could be extended.
Any comments on the pros and cons of the two approaches would be really
helpful to me (a noob).

Regression testing has just been launched.  Is it OK for trunk if it is
bootstrapped and regression-tested on powerpc64le-unknown-linux-gnu?

Thanks,
Kewen

----

gcc/ChangeLog

2019-07-17  Kewen Lin

        * config/rs6000/predicates.md (vint_reg_or_const_vector): New predicate.
        * config/rs6000/vector.md (vrotr<mode>3): New define_expand.

gcc/testsuite/ChangeLog

2019-07-17  Kewen Lin

        * gcc.target/powerpc/vec_rotate-1.c: New test.
        * gcc.target/powerpc/vec_rotate-2.c: New test.

on 2019/7/16 4:45 PM, Kewen.Lin wrote:
> Hi all,
>
> Based on the previous comments (thank you!), I tried to update the
> handling in expander and vectorizer.  Middle-end optimizes lrotate
> with const rotation count to rrotate all the time, it makes vectorizer
> fail to vectorize if rrotate isn't supported on the target.  We can at
> least teach it on const rotation count, the cost should be the same?
> At the same time, the expander already tries to use the opposite
> rotation optable for scalar, we can teach it to deal with vector as well.
>
> Is it on the right track and reasonable?
>
>
> Thanks,
> Kewen
>

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 8ca98299950..c4c74630d26 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -163,6 +163,17 @@
   return VINT_REGNO_P (REGNO (op));
 })
 
+;; Return 1 if op is a vector register that operates on integer vectors
+;; or if op is a const vector with integer vector modes.
+(define_predicate "vint_reg_or_const_vector"
+  (match_code "reg,subreg,const_vector")
+{
+  if (GET_CODE (op) == CONST_VECTOR && GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+    return 1;
+
+  return vint_operand (op, mode);
+})
+
 ;; Return 1 if op is a vector register to do logical operations on (and, or,
 ;; xor, etc.)
 (define_predicate "vlogical_operand"
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 70bcfe02e22..5c6a344e452 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -1260,6 +1260,32 @@
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "")
 
+;; Expanders for rotatert to make use of vrotl
+(define_expand "vrotr<mode>3"
+  [(set (match_operand:VEC_I 0 "vint_operand")
+        (rotatert:VEC_I (match_operand:VEC_I 1 "vint_operand")
+                        (match_operand:VEC_I 2 "vint_reg_or_const_vector")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  machine_mode inner_mode = GET_MODE_INNER (<MODE>mode);
+  unsigned int bits = GET_MODE_PRECISION (inner_mode);
+  rtx imm_vec = gen_const_vec_duplicate (<MODE>mode, GEN_INT (bits));
+  rtx rot_count = gen_reg_rtx (<MODE>mode);
+  if (GET_CODE (operands[2]) == CONST_VECTOR)
+    {
+      imm_vec = simplify_const_binary_operation (MINUS, <MODE>mode, imm_vec,
+                                                 operands[2]);
+      rot_count = force_reg (<MODE>mode, imm_vec);
+    }
+  else
+    {
+      rtx imm_reg = force_reg (<MODE>mode, imm_vec);
+      emit_insn (gen_sub<mode>3 (rot_count, imm_reg, operands[2]));
+    }
+  emit_insn (gen_vrotl<mode>3 (operands[0], operands[1], rot_count));
+  DONE;
+})
+
 ;; Expanders for arithmetic shift left on each vector element
 (define_expand "vashl<mode>3"
   [(set (match_operand:VEC_I 0 "vint_operand")
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c
new file mode 100644
index 00000000000..80aca1a94a5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c
@@ -0,0 +1,46 @@
+/* { dg-options "-O3" } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power, mainly
+   for the case rotation count is const number.  */
+
+#define N 256
+unsigned long long sud[N], rud[N];
+unsigned int suw[N], ruw[N];
+unsigned short suh[N], ruh[N];
+unsigned char sub[N], rub[N];
+
+void
+testULL ()
+{
+  for (int i = 0; i < 256; ++i)
+    rud[i] = (sud[i] >> 8) | (sud[i] << (sizeof (sud[0]) * 8 - 8));
+}
+
+void
+testUW ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruw[i] = (suw[i] >> 8) | (suw[i] << (sizeof (suw[0]) * 8 - 8));
+}
+
+void
+testUH ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruh[i] = (unsigned short) (suh[i] >> 9)
+             | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - 9));
+}
+
+void
+testUB ()
+{
+  for (int i = 0; i < 256; ++i)
+    rub[i] = (unsigned char) (sub[i] >> 5)
+             | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - 5));
+}
+
+/* { dg-final { scan-assembler {\mvrld\M} } } */
+/* { dg-final { scan-assembler {\mvrlw\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+/* { dg-final { scan-assembler {\mvrlb\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c
new file mode 100644
index 00000000000..affda6c023b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c
@@ -0,0 +1,47 @@
+/* { dg-options "-O3" } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power, mainly
+   for the case rotation count isn't const number.  */
+
+#define N 256
+unsigned long long sud[N], rud[N];
+unsigned int suw[N], ruw[N];
+unsigned short suh[N], ruh[N];
+unsigned char sub[N], rub[N];
+extern unsigned char rot_cnt;
+
+void
+testULL ()
+{
+  for (int i = 0; i < 256; ++i)
+    rud[i] = (sud[i] >> rot_cnt) | (sud[i] << (sizeof (sud[0]) * 8 - rot_cnt));
+}
+
+void
+testUW ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruw[i] = (suw[i] >> rot_cnt) | (suw[i] << (sizeof (suw[0]) * 8 - rot_cnt));
+}
+
+void
+testUH ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruh[i] = (unsigned short) (suh[i] >> rot_cnt)
+             | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - rot_cnt));
+}
+
+void
+testUB ()
+{
+  for (int i = 0; i < 256; ++i)
+    rub[i] = (unsigned char) (sub[i] >> rot_cnt)
+             | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - rot_cnt));
+}
+
+/* { dg-final { scan-assembler {\mvrld\M} } } */
+/* { dg-final { scan-assembler {\mvrlw\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+/* { dg-final { scan-assembler {\mvrlb\M} } } */
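
For reference, the new rotatert expander relies on the identity that a right rotation by n equals a left rotation by (element bits - n). The scalar model below is only a sketch of that identity for 32-bit elements and is not part of the patch. The helper names (rotl32, rotr32_ref, rotr32_via_rotl, check_rotr_identity) are made up for illustration, and the masking in rotl32 assumes the vector rotate-left instruction only looks at the low-order bits of each count element.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for one 32-bit lane of a vector rotate-left.
   The count is reduced modulo 32, on the assumption that the hardware
   ignores the high-order bits of the count.  */
static uint32_t
rotl32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Reference right rotation, written directly.  */
static uint32_t
rotr32_ref (uint32_t x, unsigned int n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Model of what the expander emits: compute (bits - count), then rotate
   left by that amount.  For count 0 this passes 32, which rotl32 reduces
   to 0.  */
static uint32_t
rotr32_via_rotl (uint32_t x, unsigned int n)
{
  return rotl32 (x, 32 - n);
}

/* Check the identity for every count in [0, 31] on a few sample values.  */
static void
check_rotr_identity (void)
{
  static const uint32_t samples[] = { 0u, 1u, 0x80000000u, 0x12345678u,
                                      0xffffffffu };
  for (unsigned int s = 0; s < sizeof (samples) / sizeof (samples[0]); ++s)
    for (unsigned int n = 0; n < 32; ++n)
      assert (rotr32_via_rotl (samples[s], n) == rotr32_ref (samples[s], n));
}
```

Calling check_rotr_identity () from a test driver exercises all 32 counts. The count-0 case is the one worth watching, since (bits - 0) is only a valid rotate amount if the rotate instruction wraps the count.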