From patchwork Tue Jul 18 07:25:01 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 789950
Date: Tue, 18 Jul 2017 09:25:01 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fix PR81418
User-Agent: Alpine 2.20 (LSU 67 2015-01-07)

The following fixes a missed check in vectorizable_reduction. We cannot
handle a lane-reducing reduction operation like DOT_PROD_EXPR when not
using a single def-use cycle, because we need individual reduction
vector elements in other vector stmts.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-06-18  Richard Biener

	PR tree-optimization/81418
	* tree-vect-loop.c (vectorizable_reduction): Properly compute
	vectype_in.  Verify that with lane-reducing reduction operations
	we have a single def-use cycle.

	* gcc.dg/torture/pr81418.c: New testcase.

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	(revision 250270)
+++ gcc/tree-vect-loop.c	(working copy)
@@ -5642,7 +5642,10 @@ vectorizable_reduction (gimple *stmt, gi
           if (k == 1
               && gimple_assign_rhs_code (reduc_stmt) == COND_EXPR)
             continue;
-          vectype_in = get_vectype_for_scalar_type (TREE_TYPE (op));
+          tem = get_vectype_for_scalar_type (TREE_TYPE (op));
+          if (! vectype_in
+              || TYPE_VECTOR_SUBPARTS (tem) < TYPE_VECTOR_SUBPARTS (vectype_in))
+            vectype_in = tem;
           break;
         }
       gcc_assert (vectype_in);
@@ -6213,26 +6216,6 @@ vectorizable_reduction (gimple *stmt, gi
         }
     }
 
-  if (!vec_stmt) /* transformation not required. */
-    {
-      if (first_p)
-        vect_model_reduction_cost (stmt_info, epilog_reduc_code, ncopies);
-      STMT_VINFO_TYPE (stmt_info) = reduc_vec_info_type;
-      return true;
-    }
-
-  /* Transform. */
-
-  if (dump_enabled_p ())
-    dump_printf_loc (MSG_NOTE, vect_location, "transform reduction.\n");
-
-  /* FORNOW: Multiple types are not supported for condition. */
-  if (code == COND_EXPR)
-    gcc_assert (ncopies == 1);
-
-  /* Create the destination vector */
-  vec_dest = vect_create_destination_var (scalar_dest, vectype_out);
-
   /* In case the vectorization factor (VF) is bigger than the number
      of elements that we can fit in a vectype (nunits), we have to generate
      more than one vector stmt - i.e - we need to "unroll" the
@@ -6276,6 +6259,41 @@ vectorizable_reduction (gimple *stmt, gi
   else
     epilog_copies = ncopies;
 
+  /* If the reduction stmt is one of the patterns that have lane
+     reduction embedded we cannot handle the case of ! single_defuse_cycle. */
+  if ((ncopies > 1
+       && ! single_defuse_cycle)
+      && (code == DOT_PROD_EXPR
+          || code == WIDEN_SUM_EXPR
+          || code == SAD_EXPR))
+    {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                         "multi def-use cycle not possible for lane-reducing "
+                         "reduction operation\n");
+      return false;
+    }
+
+  if (!vec_stmt) /* transformation not required. */
+    {
+      if (first_p)
+        vect_model_reduction_cost (stmt_info, epilog_reduc_code, ncopies);
+      STMT_VINFO_TYPE (stmt_info) = reduc_vec_info_type;
+      return true;
+    }
+
+  /* Transform. */
+
+  if (dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location, "transform reduction.\n");
+
+  /* FORNOW: Multiple types are not supported for condition. */
+  if (code == COND_EXPR)
+    gcc_assert (ncopies == 1);
+
+  /* Create the destination vector */
+  vec_dest = vect_create_destination_var (scalar_dest, vectype_out);
+
   prev_stmt_info = NULL;
   prev_phi_info = NULL;
   if (slp_node)
Index: gcc/testsuite/gcc.dg/torture/pr81418.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/pr81418.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr81418.c	(working copy)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-loop-optimize" } */
+
+int
+ol (int ku)
+{
+  int zq = 0;
+
+  while (ku < 1)
+    {
+      int y6;
+
+      for (y6 = 0; y6 < 3; ++y6)
+        zq += (char)ku;
+      ++ku;
+    }
+
+  return zq;
+}