From patchwork Tue Jun 25 12:59:02 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1122039
Date: Tue, 25 Jun 2019 14:59:02 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: Jan Hubicka
Subject: [PATCH] Try fix PR90911

PR90911 reports a slowdown of 456.hmmer with the recent introduction
of vectorizer versioning of outer loops, more specifically the case
of re-using if-conversion created versions.

The patch below adjusts the edge probabilities and scales the loop
bodies in two steps, delaying scalar_loop scaling until all peeling
is done.  This restores the profile mismatches to the same state as
on the GCC 9 branch and seems to fix the observed slowdown of
456.hmmer.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Honza, does this look OK?

Thanks,
Richard.

2019-06-25  Richard Biener

	* tree-vectorizer.h (_loop_vec_info::scalar_loop_scaling): New field.
	(LOOP_VINFO_SCALAR_LOOP_SCALING): New.
	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
	scalar_loop_scaling.
	(vect_transform_loop): Scale scalar loop profile if needed.
	* tree-vect-loop-manip.c (vect_loop_versioning): When re-using
	the loop copy from if-conversion adjust edge probabilities and
	scale the vectorized loop body profile, queue the scalar profile
	for updating after peeling.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h (revision 272636)
+++ gcc/tree-vectorizer.h (working copy)
@@ -548,6 +548,9 @@ typedef struct _loop_vec_info : public v
   /* Mark loops having masked stores.  */
   bool has_mask_store;
 
+  /* Queued scaling factor for the scalar loop.  */
+  profile_probability scalar_loop_scaling;
+
   /* If if-conversion versioned this loop before conversion, this is the
      loop version without if-conversion.  */
   struct loop *scalar_loop;
@@ -603,6 +606,7 @@ typedef struct _loop_vec_info : public v
 #define LOOP_VINFO_PEELING_FOR_NITER(L) (L)->peeling_for_niter
 #define LOOP_VINFO_NO_DATA_DEPENDENCIES(L) (L)->no_data_dependencies
 #define LOOP_VINFO_SCALAR_LOOP(L) (L)->scalar_loop
+#define LOOP_VINFO_SCALAR_LOOP_SCALING(L) (L)->scalar_loop_scaling
 #define LOOP_VINFO_HAS_MASK_STORE(L) (L)->has_mask_store
 #define LOOP_VINFO_SCALAR_ITERATION_COST(L) (L)->scalar_cost_vec
 #define LOOP_VINFO_SINGLE_SCALAR_ITERATION_COST(L) (L)->single_scalar_iteration_cost
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c (revision 272636)
+++ gcc/tree-vect-loop.c (working copy)
@@ -835,6 +835,7 @@ _loop_vec_info::_loop_vec_info (struct l
     operands_swapped (false),
     no_data_dependencies (false),
     has_mask_store (false),
+    scalar_loop_scaling (profile_probability::uninitialized ()),
     scalar_loop (NULL),
     orig_loop_info (NULL)
 {
@@ -8562,6 +8563,10 @@ vect_transform_loop (loop_vec_info loop_
   epilogue = vect_do_peeling (loop_vinfo, niters, nitersm1, &niters_vector,
                               &step_vector, &niters_vector_mult_vf, th,
                               check_profitability, niters_no_overflow);
+  if (LOOP_VINFO_SCALAR_LOOP (loop_vinfo)
+      && LOOP_VINFO_SCALAR_LOOP_SCALING (loop_vinfo).initialized_p ())
+    scale_loop_frequencies (LOOP_VINFO_SCALAR_LOOP (loop_vinfo),
+                            LOOP_VINFO_SCALAR_LOOP_SCALING (loop_vinfo));
 
   if (niters_vector == NULL_TREE)
     {
Index: gcc/tree-vect-loop-manip.c
===================================================================
--- gcc/tree-vect-loop-manip.c (revision 272636)
+++ gcc/tree-vect-loop-manip.c (working copy)
@@ -3114,8 +3114,17 @@ vect_loop_versioning (loop_vec_info loop
                                    GSI_SAME_STMT);
         }
 
-      /* ???  if-conversion uses profile_probability::always () but
-         prob below is profile_probability::likely ().  */
+      /* if-conversion uses profile_probability::always () for both paths,
+         reset the paths probabilities appropriately.  */
+      edge te, fe;
+      extract_true_false_edges_from_block (condition_bb, &te, &fe);
+      te->probability = prob;
+      fe->probability = prob.invert ();
+      /* We can scale loops counts immediately but have to postpone
+         scaling the scalar loop because we re-use it during peeling.  */
+      scale_loop_frequencies (loop_to_version, prob);
+      LOOP_VINFO_SCALAR_LOOP_SCALING (loop_vinfo) = prob.invert ();
+
       nloop = scalar_loop;
       if (dump_enabled_p ())
         dump_printf_loc (MSG_NOTE, vect_location,