From patchwork Fri Jul 21 15:37:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Hubicka
X-Patchwork-Id: 1810987
Date: Fri, 21 Jul 2023 17:37:07 +0200
To: gcc-patches@gcc.gnu.org
Subject: Implement flat loop profile detection
List-Id: Gcc-patches mailing list
From: Jan Hubicka
Reply-To: Jan Hubicka

Hi,
this patch adds maybe_flat_loop_profile, which can be used in loop profile
update to detect situations where the profile may be unrealistically flat and
should not be scaled down after vectorizing, unrolling and other transforms
that assume the loop has a high iteration count even if the CFG profile says
otherwise.

A profile is flat if it was statically detected and at that time we either had
no idea about the actual number of iterations or artificially capped them.  So
the function considers flat all profiles that have guessed or lower
reliability in their count and for which there is no
nb_iterations_upper_bound, nb_iterations_likely_upper_bound or
nb_iterations_estimate that would prove that the profile iteration count is
high enough.

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

	* cfgloop.h (maybe_flat_loop_profile): Declare.
	* cfgloopanal.cc (maybe_flat_loop_profile): New function.
	* tree-cfg.cc (print_loop_info): Print info about flat profiles.

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 269694c7962..22293e1c237 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -407,6 +407,7 @@ gcov_type expected_loop_iterations_unbounded (const class loop *,
 extern bool expected_loop_iterations_by_profile (const class loop *loop,
                                                  sreal *ret,
                                                  bool *reliable = NULL);
+extern bool maybe_flat_loop_profile (const class loop *);
 extern unsigned expected_loop_iterations (class loop *);
 extern rtx doloop_condition_get (rtx_insn *);
 
diff --git a/gcc/cfgloopanal.cc b/gcc/cfgloopanal.cc
index c86a537f024..d8923b27e5d 100644
--- a/gcc/cfgloopanal.cc
+++ b/gcc/cfgloopanal.cc
@@ -303,6 +303,67 @@ expected_loop_iterations_by_profile (const class loop *loop, sreal *ret,
   return true;
 }
 
+/* Return true if loop CFG profile may be unrealistically flat.
+   This is a common case, since average loops iterate only about 5 times.
+   In the case we do not have profile feedback or do not know real number of
+   iterations during profile estimation, we are likely going to predict it with
+   similar low iteration count.  For static loop profiles we also artificially
+   cap profile of loops with known large iteration count so they do not appear
+   significantly more hot than other loops with unknown iteration counts.
+
+   For loop optimization heuristics we ignore CFG profile and instead
+   use get_estimated_loop_iterations API which returns estimate
+   only when it is realistic.  For unknown counts some optimizations,
+   like vectorizer or unroller make guess that iteration count will
+   be large.  In this case we need to avoid scaling down the profile
+   after the loop transform.  */
+
+bool
+maybe_flat_loop_profile (const class loop *loop)
+{
+  bool reliable;
+  sreal ret;
+
+  if (!expected_loop_iterations_by_profile (loop, &ret, &reliable))
+    return true;
+
+  /* Reliable CFG estimates ought never be flat.  Sanity check with
+     nb_iterations_estimate.  If those differ, it is a bug in profile
+     updating code.  */
+  if (reliable)
+    {
+      int64_t intret = ret.to_nearest_int ();
+      if (loop->any_estimate
+          && (wi::ltu_p (intret * 2, loop->nb_iterations_estimate)
+              || wi::gtu_p (intret, loop->nb_iterations_estimate * 2)))
+        {
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            fprintf (dump_file,
+                     "Loop %i has inconsistent iterations estimates: "
+                     "reliable CFG based iteration estimate is %f "
+                     "while nb_iterations_estimate is %i\n",
+                     loop->num,
+                     ret.to_double (),
+                     (int)loop->nb_iterations_estimate.to_shwi ());
+          return true;
+        }
+      return false;
+    }
+
+  /* Allow some margin of error and see if we are close to known bounds.
+     sreal (9,-3) is 9/8  */
+  int64_t intret = (ret * sreal (9, -3)).to_nearest_int ();
+  if (loop->any_upper_bound && wi::geu_p (intret, loop->nb_iterations_upper_bound))
+    return false;
+  if (loop->any_likely_upper_bound
+      && wi::geu_p (intret, loop->nb_iterations_likely_upper_bound))
+    return false;
+  if (loop->any_estimate
+      && wi::geu_p (intret, loop->nb_iterations_estimate))
+    return false;
+  return true;
+}
+
 /* Returns expected number of iterations of LOOP, according to
    measured or guessed profile.
 
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index a6c97a04662..c65af8cc800 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -8523,8 +8523,11 @@ print_loop_info (FILE *file, const class loop *loop, const char *prefix)
   bool reliable;
   sreal iterations;
   if (loop->num && expected_loop_iterations_by_profile (loop, &iterations, &reliable))
-    fprintf (file, "\n%siterations by profile: %f %s", prefix,
-             iterations.to_double (), reliable ? "(reliable)" : "(unreliable)");
+    {
+      fprintf (file, "\n%siterations by profile: %f (%s%s)", prefix,
+               iterations.to_double (), reliable ? "reliable" : "unreliable",
+               maybe_flat_loop_profile (loop) ? ", maybe flat" : "");
+    }
 }
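
For readers who want to experiment with the decision logic outside of GCC, the following is a minimal standalone sketch of the same checks. It is not the GCC implementation: LoopBounds, ProfileIterations and maybe_flat are hypothetical stand-ins for class loop, the result of expected_loop_iterations_by_profile and maybe_flat_loop_profile, and plain double and uint64_t replace sreal and widest_int. The dump-file diagnostic is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the iteration bounds stored in GCC's class loop.
// An empty optional models any_upper_bound / any_likely_upper_bound /
// any_estimate being false.
struct LoopBounds {
  std::optional<std::uint64_t> upper_bound;
  std::optional<std::uint64_t> likely_upper_bound;
  std::optional<std::uint64_t> estimate;
};

// Hypothetical stand-in for what expected_loop_iterations_by_profile returns.
struct ProfileIterations {
  double iterations;  // iteration count implied by the CFG profile
  bool reliable;      // true if the count is better than "guessed"
};

// Returns true if the profile may be unrealistically flat.
bool maybe_flat(const std::optional<ProfileIterations>& profile,
                const LoopBounds& loop) {
  // No iteration count can be derived from the profile at all.
  if (!profile) return true;
  assert(profile->iterations >= 0.0);

  if (profile->reliable) {
    // A reliable profile is only suspicious if it disagrees with the
    // recorded estimate by more than a factor of two in either direction.
    const auto n =
        static_cast<std::uint64_t>(std::llround(profile->iterations));
    return loop.estimate &&
           (n * 2 < *loop.estimate || n > *loop.estimate * 2);
  }

  // Unreliable profile: allow a 9/8 margin and accept the profile only if
  // it reaches one of the known bounds or the estimate.
  const auto n = static_cast<std::uint64_t>(
      std::llround(profile->iterations * 9.0 / 8.0));
  if (loop.upper_bound && n >= *loop.upper_bound) return false;
  if (loop.likely_upper_bound && n >= *loop.likely_upper_bound) return false;
  if (loop.estimate && n >= *loop.estimate) return false;
  return true;
}
```

With this model, a guessed profile of 5 iterations on a loop with no recorded bounds is reported as maybe flat, while the same profile on a loop whose upper bound is 6 is not, because 5 * 9/8 rounds to 6 and reaches the bound.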