From patchwork Thu Nov 14 18:47:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Prathamesh Kulkarni
X-Patchwork-Id: 1195071
From: Prathamesh Kulkarni
Date: Fri, 15 Nov 2019 00:17:36 +0530
Subject: [SVE] PR89007 - Implement generic vector average expansion
To: gcc Patches, Richard Sandiford

Hi,
As suggested in the PR, when the target has no direct support for the
averaging internal function, the attached patch falls back to distributing
the right shift over the addition (plus_expr) instead of using the
widening -> arithmetic -> narrowing sequence.

Bootstrapped and tested on x86_64-unknown-linux-gnu and aarch64-linux-gnu.
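For reference, here is a scalar sketch of the identity the fallback relies on.
It is not part of the patch: the function names are made up for illustration,
and it assumes unsigned 8-bit elements, matching the new test. Shifting each
operand first loses at most one unit, which is added back from the low bits
(OR for ceil rounding, AND for floor rounding), so every intermediate value
fits in the narrow type.

```c
#include <assert.h>
#include <stdint.h>

/* Reference forms: widen to int, add, optionally round up, shift, narrow. */
uint8_t avg_ceil_wide(uint8_t a, uint8_t b)
{
  return (uint8_t) ((a + b + 1) >> 1);
}

uint8_t avg_floor_wide(uint8_t a, uint8_t b)
{
  return (uint8_t) ((a + b) >> 1);
}

/* Fallback forms: shift each operand, add, then add the correction bit.
   The largest possible value is 127 + 127 + 1 = 255, so nothing here
   needs more than 8 bits.  */
uint8_t avg_ceil_narrow(uint8_t a, uint8_t b)
{
  return (uint8_t) ((a >> 1) + (b >> 1) + ((a | b) & 1));
}

uint8_t avg_floor_narrow(uint8_t a, uint8_t b)
{
  return (uint8_t) ((a >> 1) + (b >> 1) + ((a & b) & 1));
}

/* Count the input pairs for which the two forms disagree,
   checking every combination of 8-bit operands.  */
int count_mismatches(void)
{
  int bad = 0;
  for (int a = 0; a < 256; a++)
    for (int b = 0; b < 256; b++)
      {
        uint8_t x = (uint8_t) a, y = (uint8_t) b;
        if (avg_ceil_wide (x, y) != avg_ceil_narrow (x, y))
          bad++;
        if (avg_floor_wide (x, y) != avg_floor_narrow (x, y))
          bad++;
      }
  return bad;
}
```

Calling count_mismatches () from a small driver is enough to check the
identity over all 65536 operand pairs. The sketch sticks to unsigned operands
because right-shifting a negative signed value is implementation-defined in C.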
OK to commit ?

Thanks,
Prathamesh

2019-11-15  Prathamesh Kulkarni

	PR tree-optimization/89007
	* tree-vect-patterns.c (vect_recog_average_pattern): If there is no
	target support available, generate code to distribute rshift over plus
	and add one depending upon floor or ceil rounding.

testsuite/
	* gcc.target/aarch64/sve/pr89007.c: New test.

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr89007.c b/gcc/testsuite/gcc.target/aarch64/sve/pr89007.c
new file mode 100644
index 00000000000..b682f3f3b74
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr89007.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#define N 1024
+unsigned char dst[N];
+unsigned char in1[N];
+unsigned char in2[N];
+
+void
+foo ()
+{
+  for( int x = 0; x < N; x++ )
+    dst[x] = (in1[x] + in2[x] + 1) >> 1;
+}
+
+/* { dg-final { scan-assembler-not {\tuunpklo\t} } } */
+/* { dg-final { scan-assembler-not {\tuunpkhi\t} } } */
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 8ebbcd76b64..7025a3b4dc2 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -2019,22 +2019,59 @@ vect_recog_average_pattern (stmt_vec_info last_stmt_info, tree *type_out)
 
   /* Check for target support.  */
   tree new_vectype = get_vectype_for_scalar_type (vinfo, new_type);
-  if (!new_vectype
-      || !direct_internal_fn_supported_p (ifn, new_vectype,
-					  OPTIMIZE_FOR_SPEED))
+
+  if (!new_vectype)
     return NULL;
 
+  bool ifn_supported
+    = direct_internal_fn_supported_p (ifn, new_vectype, OPTIMIZE_FOR_SPEED);
+
   /* The IR requires a valid vector type for the cast result, even though
      it's likely to be discarded.  */
   *type_out = get_vectype_for_scalar_type (vinfo, type);
   if (!*type_out)
     return NULL;
 
-  /* Generate the IFN_AVG* call.  */
   tree new_var = vect_recog_temp_ssa_var (new_type, NULL);
   tree new_ops[2];
   vect_convert_inputs (last_stmt_info, 2, new_ops, new_type,
 		       unprom, new_vectype);
+
+  if (!ifn_supported)
+    {
+      /* If there is no target support available, generate code
+	 to distribute rshift over plus and add one depending
+	 upon floor or ceil rounding.  */
+
+      tree one_cst = build_one_cst (new_type);
+
+      tree tmp1 = vect_recog_temp_ssa_var (new_type, NULL);
+      gassign *g1 = gimple_build_assign (tmp1, RSHIFT_EXPR, new_ops[0], one_cst);
+
+      tree tmp2 = vect_recog_temp_ssa_var (new_type, NULL);
+      gassign *g2 = gimple_build_assign (tmp2, RSHIFT_EXPR, new_ops[1], one_cst);
+
+      tree tmp3 = vect_recog_temp_ssa_var (new_type, NULL);
+      gassign *g3 = gimple_build_assign (tmp3, PLUS_EXPR, tmp1, tmp2);
+
+      tree tmp4 = vect_recog_temp_ssa_var (new_type, NULL);
+      tree_code c = (ifn == IFN_AVG_CEIL) ? BIT_IOR_EXPR : BIT_AND_EXPR;
+      gassign *g4 = gimple_build_assign (tmp4, c, new_ops[0], new_ops[1]);
+
+      tree tmp5 = vect_recog_temp_ssa_var (new_type, NULL);
+      gassign *g5 = gimple_build_assign (tmp5, BIT_AND_EXPR, tmp4, one_cst);
+
+      gassign *g6 = gimple_build_assign (new_var, PLUS_EXPR, tmp3, tmp5);
+
+      append_pattern_def_seq (last_stmt_info, g1, new_vectype);
+      append_pattern_def_seq (last_stmt_info, g2, new_vectype);
+      append_pattern_def_seq (last_stmt_info, g3, new_vectype);
+      append_pattern_def_seq (last_stmt_info, g4, new_vectype);
+      append_pattern_def_seq (last_stmt_info, g5, new_vectype);
+      return vect_convert_output (last_stmt_info, type, g6, new_vectype);
+    }
+
+  /* Generate the IFN_AVG* call.  */
   gcall *average_stmt = gimple_build_call_internal (ifn, 2, new_ops[0],
 						    new_ops[1]);
   gimple_call_set_lhs (average_stmt, new_var);