From patchwork Tue Jul 13 08:52:55 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 1504466
To: GCC Patches
Subject: [RFC/PATCH] vect: Recog mul_highpart pattern
Date: Tue, 13 Jul 2021 16:52:55 +0800
List-Id: Gcc-patches mailing list
From: "Kewen.Lin"
Reply-To: "Kewen.Lin"
Cc: Richard Sandiford, Bill Schmidt, Segher Boessenkool

Hi,

When I added support for the multiply highpart instructions newly
introduced in Power10, I noticed that the vectorizer currently doesn't
try to vectorize the multiply highpart pattern.  I hope this isn't
intentional?

This patch extends the existing mulhs pattern handling to cover
multiply highpart.

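To make the shapes concrete, here is a minimal sketch of scalar loops of the
kind the pattern matcher looks at: multiply high with scaling, the same with
rounding, and the plain multiply highpart that this patch adds.  The function
names are hypothetical and chosen only for illustration; they are not taken
from the patch or the testsuite, and whether a given loop actually gets
vectorized still depends on what the target supports.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustration only: hypothetical function names, not from the patch.
   TYPE is int32_t and the inputs are int16_t, so bitsize (TYPE) / 2 is 16.
   Right-shifting a negative value is implementation-defined in ISO C;
   GCC implements it as an arithmetic shift, which is assumed here.  */

/* Multiply high with scaling: shift by bitsize (TYPE) / 2 - 1 = 15.  */
void
mulhs_s16 (int16_t *restrict res, const int16_t *a, const int16_t *b,
	   size_t n)
{
  for (size_t i = 0; i < n; i++)
    res[i] = (int16_t) (((int32_t) a[i] * (int32_t) b[i]) >> 15);
}

/* The same with rounding: shift by bitsize (TYPE) / 2 - 2 = 14,
   add 1, then shift by 1.  */
void
mulhrs_s16 (int16_t *restrict res, const int16_t *a, const int16_t *b,
	    size_t n)
{
  for (size_t i = 0; i < n; i++)
    res[i] = (int16_t) (((((int32_t) a[i] * (int32_t) b[i]) >> 14) + 1) >> 1);
}

/* Plain multiply highpart: shift by bitsize (TYPE) / 2 = 16.  */
void
mul_highpart_s16 (int16_t *restrict res, const int16_t *a, const int16_t *b,
		  size_t n)
{
  for (size_t i = 0; i < n; i++)
    res[i] = (int16_t) (((int32_t) a[i] * (int32_t) b[i]) >> 16);
}
```

For example, with both inputs equal to 1000 the three loops store 30, 31 and
15 respectively; only the shift amount (and the rounding step) differs between
them.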

An alternative would be to recognize the mul_highpart operation in a more
general place that also applies to scalar code, when the target supports
the optab for the scalar operation.  That relies on the assumption that a
target supporting the vector version of multiply highpart also provides
the scalar version.  However, I noticed that can_mult_highpart_p can check
for and handle mult_highpart well even without mul_highpart optab support,
so I think recognizing this pattern in the vectorizer is the better choice.

Is this on the right track?

Bootstrapped & regtested on powerpc64le-linux-gnu P9,
x86_64-redhat-linux and aarch64-linux-gnu.

BR,
Kewen
-----
gcc/ChangeLog:

	* tree-vect-patterns.c (vect_recog_mulhs_pattern): Add support to
	recog normal multiply highpart.

---
 gcc/tree-vect-patterns.c | 67 ++++++++++++++++++++++++++++------------
 1 file changed, 48 insertions(+), 19 deletions(-)

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index b2e7fc2cc7a..9253c8088e9 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -1896,8 +1896,15 @@ vect_recog_over_widening_pattern (vec_info *vinfo,
 
    1) Multiply high with scaling
      TYPE res = ((TYPE) a * (TYPE) b) >> c;
+     Here, c is bitsize (TYPE) / 2 - 1.
+
    2) ... or also with rounding
      TYPE res = (((TYPE) a * (TYPE) b) >> d + 1) >> 1;
+     Here, d is bitsize (TYPE) / 2 - 2.
+
+   3) Normal multiply high
+     TYPE res = ((TYPE) a * (TYPE) b) >> e;
+     Here, e is bitsize (TYPE) / 2.
 
    where only the bottom half of res is used.  */
 
@@ -1942,7 +1949,6 @@ vect_recog_mulhs_pattern (vec_info *vinfo,
   stmt_vec_info mulh_stmt_info;
   tree scale_term;
   internal_fn ifn;
-  unsigned int expect_offset;
 
   /* Check for the presence of the rounding term.  */
   if (gimple_assign_rhs_code (rshift_input_stmt) == PLUS_EXPR)
@@ -1991,25 +1997,37 @@ vect_recog_mulhs_pattern (vec_info *vinfo,
 
       /* Get the scaling term.  */
       scale_term = gimple_assign_rhs2 (plus_input_stmt);
+      /* Check that the scaling factor is correct.  */
+      if (TREE_CODE (scale_term) != INTEGER_CST)
+	return NULL;
+
+      /* Check pattern 2).  */
+      if (wi::to_widest (scale_term) + target_precision + 2
+	  != TYPE_PRECISION (lhs_type))
+	return NULL;
 
-      expect_offset = target_precision + 2;
       ifn = IFN_MULHRS;
     }
   else
     {
       mulh_stmt_info = rshift_input_stmt_info;
       scale_term = gimple_assign_rhs2 (last_stmt);
+      /* Check that the scaling factor is correct.  */
+      if (TREE_CODE (scale_term) != INTEGER_CST)
+	return NULL;
 
-      expect_offset = target_precision + 1;
-      ifn = IFN_MULHS;
+      /* Check for pattern 1).  */
+      if (wi::to_widest (scale_term) + target_precision + 1
+	  == TYPE_PRECISION (lhs_type))
+	ifn = IFN_MULHS;
+      /* Check for pattern 3).  */
+      else if (wi::to_widest (scale_term) + target_precision
+	       == TYPE_PRECISION (lhs_type))
+	ifn = IFN_LAST;
+      else
+	return NULL;
     }
 
-  /* Check that the scaling factor is correct.  */
-  if (TREE_CODE (scale_term) != INTEGER_CST
-      || wi::to_widest (scale_term) + expect_offset
-	   != TYPE_PRECISION (lhs_type))
-    return NULL;
-
   /* Check whether the scaling input term can be seen as two widened
      inputs multiplied together.  */
   vect_unpromoted_value unprom_mult[2];
@@ -2029,9 +2047,14 @@ vect_recog_mulhs_pattern (vec_info *vinfo,
 
   /* Check for target support.  */
   tree new_vectype = get_vectype_for_scalar_type (vinfo, new_type);
-  if (!new_vectype
-      || !direct_internal_fn_supported_p
-	    (ifn, new_vectype, OPTIMIZE_FOR_SPEED))
+  if (!new_vectype)
+    return NULL;
+  if (ifn != IFN_LAST
+      && !direct_internal_fn_supported_p (ifn, new_vectype, OPTIMIZE_FOR_SPEED))
+    return NULL;
+  else if (ifn == IFN_LAST
+	   && !can_mult_highpart_p (TYPE_MODE (new_vectype),
+				    TYPE_UNSIGNED (new_type)))
     return NULL;
 
   /* The IR requires a valid vector type for the cast result, even though
@@ -2040,14 +2063,20 @@ vect_recog_mulhs_pattern (vec_info *vinfo,
   if (!*type_out)
     return NULL;
 
-  /* Generate the IFN_MULHRS call.  */
+  gimple *mulhrs_stmt;
   tree new_var = vect_recog_temp_ssa_var (new_type, NULL);
   tree new_ops[2];
-  vect_convert_inputs (vinfo, last_stmt_info, 2, new_ops, new_type,
-		       unprom_mult, new_vectype);
-  gcall *mulhrs_stmt
-    = gimple_build_call_internal (ifn, 2, new_ops[0], new_ops[1]);
-  gimple_call_set_lhs (mulhrs_stmt, new_var);
+  vect_convert_inputs (vinfo, last_stmt_info, 2, new_ops, new_type, unprom_mult,
+		       new_vectype);
+  if (ifn == IFN_LAST)
+    mulhrs_stmt = gimple_build_assign (new_var, MULT_HIGHPART_EXPR, new_ops[0],
+				       new_ops[1]);
+  else
+    {
+      /* Generate the IFN_MULHRS call.  */
+      mulhrs_stmt = gimple_build_call_internal (ifn, 2, new_ops[0], new_ops[1]);
+      gimple_call_set_lhs (mulhrs_stmt, new_var);
+    }
   gimple_set_location (mulhrs_stmt, gimple_location (last_stmt));
 
   if (dump_enabled_p ())