From patchwork Thu Oct 17 12:07:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1178491 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-511214-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="vH91soza"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46v7GH4VyQz9sPJ for ; Thu, 17 Oct 2019 23:07:51 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=l71OcOMRDbyHIXWxT4HbMxPovEJthyXeblDGKkrCvhUWb8o2K/ 8sEUuuZCtS9erkst81XV+nupaYJvCjLr1bGBGzXkxTfnd+EjeVM9zZfAKKswe5Vc D97G3EqVnMwSM1CqezI5skYvNlhmkOsq9dSjsKS/Ntqkt2BYEkFhAA5YE= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to :from:subject:message-id:date:mime-version:content-type; s= default; bh=h6Otx1lrQtVQ5w/TMw7GCszSN6s=; b=vH91sozatjXG4z4lEgMi L37+7M0CnmgVfQDyuQG1dOr/I6tAs6uyk3MoFaVNcaj7r2kIjvEmPHZeBEKzIjIf RPGDflbgGYEOqIoGVXDPOzEY7MSVgKfnITsqs05mMqsQnfpObui1Ey/p3WbHL0JD eETsJ1ruU4Pi1u3ROQkaHE0= Received: (qmail 93831 invoked by alias); 17 Oct 2019 12:07:45 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 93823 invoked by uid 89); 17 Oct 2019 12:07:44 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.1 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: foss.arm.com Received: from Unknown (HELO foss.arm.com) (217.140.110.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 17 Oct 2019 12:07:43 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AB6EC1BC0 for ; Thu, 17 Oct 2019 05:07:35 -0700 (PDT) Received: from [10.2.206.37] (e107157-lin.cambridge.arm.com [10.2.206.37]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6DC153F718 for ; Thu, 17 Oct 2019 05:07:35 -0700 (PDT) To: gcc-patches From: "Andre Vieira (lists)" Subject: [committed][vect] Outline code into new function: determine_peel_for_niter Message-ID: <8acf945f-781c-48ff-2de8-9dd27351e339@arm.com> Date: Thu, 17 Oct 2019 13:07:34 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 X-IsSubscribed: yes Hi, This piece of code was pre-approved by richi. Retested by bootstrapping and regression testing on x86_64 (AVX512) and aarch64. Committed in revision r277103. gcc/ChangeLog: 2019-10-17 Andre Vieira * tree-vect-loop.c (determine_peel_for_niter): New function contained outlined code from ... (vect_analyze_loop_2): ... here. diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 39125aa4fc6a17657bde76921c80980f18621600..b00d68b9de938148c24e088024b2e01bdceb2e81 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -1809,6 +1809,57 @@ vect_dissolve_slp_only_groups (loop_vec_info loop_vinfo) } } + +/* Decides whether we need to create an epilogue loop to handle + remaining scalar iterations and sets PEELING_FOR_NITERS accordingly. */ + +void +determine_peel_for_niter (loop_vec_info loop_vinfo) +{ + LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = false; + + unsigned HOST_WIDE_INT const_vf; + HOST_WIDE_INT max_niter + = likely_max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo)); + + unsigned th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo); + if (!th && LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo)) + th = LOOP_VINFO_COST_MODEL_THRESHOLD (LOOP_VINFO_ORIG_LOOP_INFO + (loop_vinfo)); + + if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) + /* The main loop handles all iterations. */ + LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = false; + else if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) + && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) >= 0) + { + /* Work out the (constant) number of iterations that need to be + peeled for reasons other than niters. */ + unsigned int peel_niter = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo); + if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)) + peel_niter += 1; + if (!multiple_p (LOOP_VINFO_INT_NITERS (loop_vinfo) - peel_niter, + LOOP_VINFO_VECT_FACTOR (loop_vinfo))) + LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true; + } + else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) + /* ??? When peeling for gaps but not alignment, we could + try to check whether the (variable) niters is known to be + VF * N + 1. That's something of a niche case though. */ + || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) + || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant (&const_vf) + || ((tree_ctz (LOOP_VINFO_NITERS (loop_vinfo)) + < (unsigned) exact_log2 (const_vf)) + /* In case of versioning, check if the maximum number of + iterations is greater than th. If they are identical, + the epilogue is unnecessary. */ + && (!LOOP_REQUIRES_VERSIONING (loop_vinfo) + || ((unsigned HOST_WIDE_INT) max_niter + > (th / const_vf) * const_vf)))) + LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true; +} + + /* Function vect_analyze_loop_2. Apply a set of analyses on LOOP, and create a loop_vec_info struct @@ -1936,7 +1987,6 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo, bool &fatal, unsigned *n_stmts) vect_compute_single_scalar_iteration_cost (loop_vinfo); poly_uint64 saved_vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo); - unsigned th; /* Check the SLP opportunities in the loop, analyze and build SLP trees. */ ok = vect_analyze_slp (loop_vinfo, *n_stmts); @@ -1976,9 +2026,6 @@ start_over: LOOP_VINFO_INT_NITERS (loop_vinfo)); } - HOST_WIDE_INT max_niter - = likely_max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo)); - /* Analyze the alignment of the data-refs in the loop. Fail if a data reference is found that cannot be vectorized. */ @@ -2082,42 +2129,7 @@ start_over: return opt_result::failure_at (vect_location, "Loop costings not worthwhile.\n"); - /* Decide whether we need to create an epilogue loop to handle - remaining scalar iterations. */ - th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo); - - unsigned HOST_WIDE_INT const_vf; - if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) - /* The main loop handles all iterations. */ - LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = false; - else if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) - && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) >= 0) - { - /* Work out the (constant) number of iterations that need to be - peeled for reasons other than niters. */ - unsigned int peel_niter = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo); - if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)) - peel_niter += 1; - if (!multiple_p (LOOP_VINFO_INT_NITERS (loop_vinfo) - peel_niter, - LOOP_VINFO_VECT_FACTOR (loop_vinfo))) - LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true; - } - else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) - /* ??? When peeling for gaps but not alignment, we could - try to check whether the (variable) niters is known to be - VF * N + 1. That's something of a niche case though. */ - || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) - || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant (&const_vf) - || ((tree_ctz (LOOP_VINFO_NITERS (loop_vinfo)) - < (unsigned) exact_log2 (const_vf)) - /* In case of versioning, check if the maximum number of - iterations is greater than th. If they are identical, - the epilogue is unnecessary. */ - && (!LOOP_REQUIRES_VERSIONING (loop_vinfo) - || ((unsigned HOST_WIDE_INT) max_niter - > (th / const_vf) * const_vf)))) - LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true; - + determine_peel_for_niter (loop_vinfo); /* If an epilogue loop is required make sure we can create one. */ if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo))