From patchwork Tue Jul 16 08:45:19 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 1132524
Subject: [RFC] Consider lrotate const rotation in vectorizer
From: "Kewen.Lin"
To: GCC Patches
Cc: Jakub Jelinek, Richard Biener, richard.sandiford@arm.com,
 Segher Boessenkool, Bill Schmidt
References: <232a38b1-76c2-476d-1be0-a1958e5624bb@linux.ibm.com>
 <20190715085929.GO2125@tucnak>
Date: Tue, 16 Jul 2019 16:45:19 +0800
Message-Id: <32f89c4f-cd2d-a7bd-16d2-26fed6bb5f56@linux.ibm.com>

Hi all,

Based on the previous comments (thank you!), I tried to update the
handling in the expander and the vectorizer.  The middle-end always
canonicalizes an lrotate with a constant rotation count to an rrotate,
which makes the vectorizer fail to vectorize if rrotate isn't supported
on the target.  We can at least teach it to handle a constant rotation
count; the cost should be the same?
At the same time, the expander already tries to use the opposite
rotation optab for scalars; we can teach it to deal with vectors as
well.

Is it on the right track and reasonable?

Thanks,
Kewen

--------------

One informal patch to help describe this new thought:

diff --git a/gcc/optabs.c b/gcc/optabs.c
index a0e361b8bfe..ebebb0ad145 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -1273,6 +1273,7 @@ expand_binop (machine_mode mode, optab binoptab, rtx op0, rtx op1,
   if (mclass == MODE_VECTOR_INT)
     {
       optab otheroptab = unknown_optab;
+      optab otheroptab1 = unknown_optab;
 
       if (binoptab == ashl_optab)
         otheroptab = vashl_optab;
@@ -1281,23 +1282,50 @@ expand_binop (machine_mode mode, optab binoptab, rtx op0, rtx op1,
       else if (binoptab == lshr_optab)
         otheroptab = vlshr_optab;
       else if (binoptab == rotl_optab)
-        otheroptab = vrotl_optab;
+        {
+          otheroptab = vrotl_optab;
+          otheroptab1 = vrotr_optab;
+        }
       else if (binoptab == rotr_optab)
-        otheroptab = vrotr_optab;
+        {
+          otheroptab = vrotr_optab;
+          otheroptab1 = vrotl_optab;
+        }
+
+      bool other_ok = (otheroptab && (icode = optab_handler (otheroptab, mode)) != CODE_FOR_nothing);
+      bool other1_ok = false;
+      if (!other_ok && otheroptab1)
+        other1_ok
+          = ((icode = optab_handler (otheroptab1, mode)) != CODE_FOR_nothing)
+            && SCALAR_INT_MODE_P (GET_MODE_INNER (mode));
 
-      if (otheroptab
-          && (icode = optab_handler (otheroptab, mode)) != CODE_FOR_nothing)
+      if (other_ok || other1_ok)
         {
           /* The scalar may have been extended to be too wide.  Truncate
              it back to the proper size to fit in the broadcast vector.  */
           scalar_mode inner_mode = GET_MODE_INNER (mode);
-          if (!CONST_INT_P (op1)
-              && (GET_MODE_BITSIZE (as_a <scalar_mode> (GET_MODE (op1)))
+          rtx newop1 = op1;
+          if (other1_ok)
+            {
+              unsigned int bits = GET_MODE_PRECISION (inner_mode);
+
+              if (CONST_INT_P (op1))
+                newop1 = gen_int_shift_amount (int_mode, bits - INTVAL (op1));
+              else if (targetm.shift_truncation_mask (int_mode) == bits - 1)
+                newop1 = negate_rtx (GET_MODE (op1), op1);
+              else
+                newop1 = expand_binop (GET_MODE (op1), sub_optab,
+                                       gen_int_mode (bits, GET_MODE (op1)), op1,
+                                       NULL_RTX, unsignedp, OPTAB_DIRECT);
+            }
+          if (!CONST_INT_P (newop1)
+              && (GET_MODE_BITSIZE (as_a <scalar_mode> (GET_MODE (newop1)))
                   > GET_MODE_BITSIZE (inner_mode)))
-            op1 = force_reg (inner_mode,
-                             simplify_gen_unary (TRUNCATE, inner_mode, op1,
-                                                 GET_MODE (op1)));
-          rtx vop1 = expand_vector_broadcast (mode, op1);
+            newop1 = force_reg (inner_mode,
+                                simplify_gen_unary (TRUNCATE, inner_mode,
+                                                    newop1, GET_MODE (newop1)));
+
+          rtx vop1 = expand_vector_broadcast (mode, newop1);
           if (vop1)
             {
               temp = expand_binop_directly (icode, mode, otheroptab, op0, vop1,
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index ff952d6f464..c05ce1acba4 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -2039,6 +2039,15 @@ vect_recog_rotate_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
   if (optab1
       && optab_handler (optab1, TYPE_MODE (vectype)) != CODE_FOR_nothing)
     return NULL;
+  /* middle-end canonicalizing LROTATE to RROTATE with const rotation count,
+     let's try the LROTATE as well.  */
+  if (rhs_code == RROTATE_EXPR && TREE_CODE(oprnd1) == INTEGER_CST)
+    {
+      optab1 = optab_for_tree_code (LROTATE_EXPR, vectype, optab_vector);
+      if (optab1
+          && optab_handler (optab1, TYPE_MODE (vectype)) != CODE_FOR_nothing)
+        return NULL;
+    }
 
   if (is_a <bb_vec_info> (vinfo) || dt != vect_internal_def)
     {
@@ -2046,6 +2055,14 @@ vect_recog_rotate_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
       if (optab2
           && optab_handler (optab2, TYPE_MODE (vectype)) != CODE_FOR_nothing)
         return NULL;
+      if (rhs_code == RROTATE_EXPR && TREE_CODE(oprnd1) == INTEGER_CST)
+        {
+          optab2 = optab_for_tree_code (LROTATE_EXPR, vectype, optab_scalar);
+          if (optab2
+              && optab_handler (optab2, TYPE_MODE (vectype))
+                   != CODE_FOR_nothing)
+            return NULL;
+        }
     }
 
   /* If vector/vector or vector/scalar shifts aren't supported by the target,
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 21046931243..9e1a2f971a1 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -5665,6 +5665,12 @@ vectorizable_shift (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
   if (!scalar_shift_arg)
     {
       optab = optab_for_tree_code (code, vectype, optab_vector);
+
+      if (TREE_CODE (op1) == INTEGER_CST && code == RROTATE_EXPR
+          && !(optab
+               && optab_handler (optab, TYPE_MODE (vectype))
+                    != CODE_FOR_nothing))
+        optab = optab_for_tree_code (LROTATE_EXPR, vectype, optab_vector);
       if (dump_enabled_p ())
         dump_printf_loc (MSG_NOTE, vect_location,
                          "vector/vector shift/rotate found.\n");
@@ -5696,7 +5702,10 @@ vectorizable_shift (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
           else
             {
               optab = optab_for_tree_code (code, vectype, optab_vector);
-              if (optab
+              if (TREE_CODE (op1) == INTEGER_CST && code == RROTATE_EXPR
+                  && !(optab && optab_handler (optab, TYPE_MODE (vectype))))
+                optab = optab_for_tree_code (LROTATE_EXPR, vectype, optab_vector);
+              if (optab
                   && (optab_handler (optab, TYPE_MODE (vectype))
                       != CODE_FOR_nothing))
                 {
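
For reference, here is a minimal sketch (not part of the patch) of the
identity that the constant-count handling relies on: for an N-bit
element, a left rotate by c gives the same result as a right rotate by
N - c.  The helper names below (rotl32, rotr32, rotl_by_8,
check_rotate_identity) are made up for illustration and are not GCC
functions, and 32-bit elements are assumed.  The loop in rotl_by_8 is
only meant to show the general shape of code the RFC is about, not a
confirmed testcase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate X left by N bits.  N is masked so both shifts stay in range.  */
static inline uint32_t
rotl32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Rotate X right by N bits, with the same masking.  */
static inline uint32_t
rotr32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}

/* Hypothetical example loop: every element is rotated left by the
   constant 8.  Per the RFC, the middle-end would see this as a right
   rotate by 32 - 8 = 24.  */
void
rotl_by_8 (uint32_t *restrict dst, const uint32_t *restrict src, size_t len)
{
  for (size_t i = 0; i < len; i++)
    dst[i] = rotl32 (src[i], 8);
}

/* Assert that rotl by n equals rotr by 32 - n for every n in 1..31.  */
void
check_rotate_identity (uint32_t x)
{
  for (unsigned int n = 1; n < 32; n++)
    assert (rotl32 (x, n) == rotr32 (x, 32 - n));
}
```

Because the count is a compile-time constant, the adjusted count
(bits - c) is a constant too, so switching to the opposite rotate does
not add a runtime subtraction in this case.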