From patchwork Mon Dec 11 21:22:45 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 847228
Date: Mon, 11 Dec 2017 22:22:45 +0100
From: Jakub Jelinek
To: Richard Biener, Kilian Verhetsel
Cc: Alan Hayward, GCC Patches, nd
Subject: [PATCH] Fix result for conditional reductions matching at index 0 (PR tree-optimization/80631)
Message-ID: <20171211212245.GS2353@tucnak>
Reply-To: Jakub Jelinek
References: <87zi7fbn07.fsf@uclouvain.be> <87wp2ib6aj.fsf@uclouvain.be> <20171208181501.GB2353@tucnak> <87po7lleh4.fsf@uclouvain.be> <20171211131134.GL2353@tucnak> <20171211135150.GM2353@tucnak>
 <87o9n5kxno.fsf@uclouvain.be>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <87o9n5kxno.fsf@uclouvain.be>
User-Agent: Mutt/1.7.1 (2016-10-04)
X-IsSubscribed: yes

On Mon, Dec 11, 2017 at 06:00:11PM +0100, Kilian Verhetsel wrote:
> Jakub Jelinek writes:
> > Of course it can be done efficiently, what we care most is that the body of
> > the vectorized loop is efficient.
>
> That's fair, I was looking at the x86 assembly being generated when a single
> vectorized iteration was enough (because that is the context in which I
> first encountered this bug):
>
>     int f(unsigned int *x, unsigned int k) {
>       unsigned int result = 8;
>       for (unsigned int i = 0; i < 8; i++) {
>         if (x[i] == k) result = i;
>       }
>       return result;
>     }
>
> where the vpand instruction this generates would have to be replaced
> with a variable blend if the default value weren't 0 — although I had
> not realized even SSE4.1 on x86 includes such an instruction, making
> this point less relevant.

So, here is my version of the patch, independent from your change.
As I said, your patch is still highly valuable if it will be another
STMT_VINFO_VEC_REDUCTION_TYPE kind to be used for the cases like the above
testcase, where base is equal to TYPE_MIN_VALUE, or future improvement of
base being variable, but TYPE_OVERFLOW_UNDEFINED iterator, where all we need
is that the maximum number of iterations is smaller than the maximum of the
type we use for the reduction phi.

The patch also handles negative steps, though for now only on signed types
(for unsigned it can't be really negative, but perhaps we could treat
unsigned values with the msb set as if they were negative and consider
overflows in that direction).

Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux,
bootstrapped on powerpc64-linux, regtest there ongoing.  Ok for trunk?

The patch prefers to emit what we were emitting if possible (i.e.
zero value for the COND_EXPR never hit) - building a zero vector is usually
cheaper than any other; if that is not possible, it checks whether
initial_def can be used for that value - then we can avoid the
res == induc_val ? initial_def : res; conditional move; if even that is not
possible, it attempts to use any other value.  If no value can be found, it
uses COND_REDUCTION for now, which is more expensive, but correct.

2017-12-11  Jakub Jelinek

	PR tree-optimization/80631
	* tree-vect-loop.c (get_initial_def_for_reduction): Fix comment typo.
	(vect_create_epilog_for_reduction): Add INDUC_VAL and INDUC_CODE
	arguments, for INTEGER_INDUC_COND_REDUCTION use INDUC_VAL instead of
	hardcoding zero as the value if COND_EXPR is never true.  For
	INTEGER_INDUC_COND_REDUCTION don't emit the final COND_EXPR if
	INDUC_VAL is equal to INITIAL_DEF, and use INDUC_CODE instead of
	hardcoding MAX_EXPR as the reduction operation.
	(is_nonwrapping_integer_induction): Allow negative step.
	(vectorizable_reduction): Compute INDUC_VAL and INDUC_CODE for
	vect_create_epilog_for_reduction, if no value is suitable, don't
	use INTEGER_INDUC_COND_REDUCTION for now.  Formatting fixes.

	* gcc.dg/vect/pr80631-1.c: New test.
	* gcc.dg/vect/pr80631-2.c: New test.
	* gcc.dg/vect/pr65947-13.c: Expect integer induc cond reduction
	vectorization.

	Jakub

--- gcc/tree-vect-loop.c.jj	2017-12-11 14:57:38.000000000 +0100
+++ gcc/tree-vect-loop.c	2017-12-11 16:59:06.930720928 +0100
@@ -4034,7 +4034,7 @@ get_initial_def_for_reduction (gimple *s
       case MULT_EXPR:
       case BIT_AND_EXPR:
         {
-          /* ADJUSMENT_DEF is NULL when called from
+          /* ADJUSTMENT_DEF is NULL when called from
              vect_create_epilog_for_reduction to vectorize double reduction.  */
           if (adjustment_def)
             *adjustment_def = init_val;
@@ -4283,6 +4283,11 @@ get_initial_defs_for_reduction (slp_tree
      DOUBLE_REDUC is TRUE if double reduction phi nodes should be handled.
      SLP_NODE is an SLP node containing a group of reduction statements. The
      first one in this group is STMT.
+     INDUC_VAL is for INTEGER_INDUC_COND_REDUCTION the value to use for the case
+       when the COND_EXPR is never true in the loop.  For MAX_EXPR, it needs to
+       be smaller than any value of the IV in the loop, for MIN_EXPR larger than
+       any value of the IV in the loop.
+     INDUC_CODE is the code for epilog reduction if INTEGER_INDUC_COND_REDUCTION.
 
      This function:
      1. Creates the reduction def-use cycles: sets the arguments for
@@ -4330,7 +4335,8 @@ vect_create_epilog_for_reduction (vec reduction_phis,
                                   bool double_reduc,
                                   slp_tree slp_node,
-                                  slp_instance slp_node_instance)
+                                  slp_instance slp_node_instance,
+                                  tree induc_val, enum tree_code induc_code)
 {
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   stmt_vec_info prev_phi_info;
@@ -4419,6 +4425,18 @@ vect_create_epilog_for_reduction (vec (phi), zero_vec,
+                  add_phi_arg (as_a (phi), induc_val_vec,
                                loop_preheader_edge (loop), UNKNOWN_LOCATION);
                 }
               else
@@ -4983,14 +5002,16 @@ vect_create_epilog_for_reduction (vec= 0; k--)
+    if (v[k] == 77)
+      r = k;
+  if (r != 0)
+    abort ();
+}
+
+__attribute__((noipa)) void
+f2 (void)
+{
+  int k, r = 4;
+  for (k = 7; k >= 0; k--)
+    if (v[k] == 79)
+      r = k;
+  if (r != 2)
+    abort ();
+}
+
+__attribute__((noipa)) void
+f3 (void)
+{
+  int k, r = -17;
+  for (k = 7; k >= 0; k--)
+    if (v[k] == 78)
+      r = k;
+  if (r != -17)
+    abort ();
+}
+
+__attribute__((noipa)) void
+f4 (void)
+{
+  int k, r = 7;
+  for (k = 7; k >= 0; k--)
+    if (v[k] == 78)
+      r = k;
+  if (r != 7)
+    abort ();
+}
+
+__attribute__((noipa)) void
+f5 (void)
+{
+  int k, r = -1;
+  for (k = 7; k >= 0; k--)
+    if (v[k] == 3)
+      r = k;
+  if (r != 3)
+    abort ();
+}
+
+int
+main ()
+{
+  check_vect ();
+  f1 ();
+  f2 ();
+  f3 ();
+  f4 ();
+  f5 ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 5 "vect" { target vect_condition } } } */
+/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 10 "vect" { target vect_condition } } } */
--- gcc/testsuite/gcc.dg/vect/pr65947-13.c.jj	2017-06-23 17:04:40.000000000 +0200
+++ gcc/testsuite/gcc.dg/vect/pr65947-13.c	2017-12-11 21:04:15.822886161 +0100
@@ -6,8 +6,7 @@ extern void abort (void) __attribute__ (
 
 #define N 32
 
-/* Simple condition reduction with a reversed loop.
-   Will fail to vectorize to a simple case.  */
+/* Simple condition reduction with a reversed loop.  */
 
 int
 condition_reduction (int *a, int min_v)
@@ -42,4 +41,4 @@ main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-not "condition expression based on integer induction." "vect" } } */
+/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 4 "vect" } } */
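
Illustration of the value selection described in the cover text, as a
minimal scalar sketch in plain C rather than GCC internals.  The names
pick_sentinel, last_match, LANES and enum reduc_op are hypothetical and
exist only for this example; the fallback branch merely stands in for
COND_REDUCTION, and the induction variable is assumed not to wrap.  The
sketch models a 4-lane reduction that records matching IV values per lane,
reduces the lanes with MAX (positive step) or MIN (negative step), and maps
the never-hit value back to initial_def at the end.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define LANES 4 /* hypothetical vector width for this sketch */

/* Hypothetical stand-in for INDUC_CODE (MAX_EXPR or MIN_EXPR).  */
enum reduc_op { REDUC_MAX, REDUC_MIN };

/* Choose a "never hit" value outside the range of an IV that starts at
   BASE and moves by STEP without wrapping.  Preference order follows the
   cover text: zero, then INITIAL_DEF, then any other usable value.
   Returns false when nothing suitable exists.  */
static bool
pick_sentinel (int base, int step, int initial_def,
               int *sentinel, enum reduc_op *op)
{
  assert (step != 0);
  if (step > 0)
    {
      *op = REDUC_MAX;              /* every IV value is >= base */
      if (base > 0)
        *sentinel = 0;
      else if (initial_def < base)
        *sentinel = initial_def;
      else if (base > INT_MIN)
        *sentinel = base - 1;
      else
        return false;
    }
  else
    {
      *op = REDUC_MIN;              /* every IV value is <= base */
      if (base < 0)
        *sentinel = 0;
      else if (initial_def > base)
        *sentinel = initial_def;
      else if (base < INT_MAX)
        *sentinel = base + 1;
      else
        return false;
    }
  return true;
}

/* Scalar model of: r = initial_def; for N iterations of k starting at
   BASE with stride STEP: if (v[k] == key) r = k;  */
static int
last_match (const int *v, int base, int step, int n, int key,
            int initial_def)
{
  int sentinel;
  enum reduc_op op;

  if (!pick_sentinel (base, step, initial_def, &sentinel, &op))
    {
      /* Fallback: plain scalar loop, standing in for COND_REDUCTION.  */
      int r = initial_def;
      for (int i = 0; i < n; i++)
        if (v[base + i * step] == key)
          r = base + i * step;
      return r;
    }

  int lanes[LANES];
  for (int l = 0; l < LANES; l++)
    lanes[l] = sentinel;

  /* "Vector body": each lane keeps the IV value of its latest match.  */
  for (int i = 0; i < n; i++)
    {
      int k = base + i * step;
      if (v[k] == key)
        lanes[i % LANES] = k;
    }

  /* Epilogue: reduce across lanes, then undo the sentinel.  */
  int res = lanes[0];
  for (int l = 1; l < LANES; l++)
    if (op == REDUC_MAX ? lanes[l] > res : lanes[l] < res)
      res = lanes[l];

  return res == sentinel ? initial_def : res;
}
```

With base 0, step 1 and a match at index 0, the sketch picks -1 as the
never-hit value, so a result of 0 stays distinguishable from "no match";
a hardcoded zero cannot tell those two cases apart.  When initial_def
itself is chosen, the final selection returns initial_def either way,
which is why the conditional move can be dropped in that case.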