From patchwork Wed Jun 2 21:15:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew MacLeod X-Patchwork-Id: 1486884 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=cw4HqSsM; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4FwMK43mznz9sW5 for ; Thu, 3 Jun 2021 07:16:23 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 09EFF398797A for ; Wed, 2 Jun 2021 21:16:21 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 09EFF398797A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1622668581; bh=zwfRehRz9tOOlsNgsbDYYqdttN1g5iBSUGUEj5X9yWk=; h=Subject:To:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=cw4HqSsMocnKc4wU/Kc6dklUVahMaIx0nJJ6B+DFST9BxVwOl+oeXB9d6QXM34vSz Lihb0slpe4P65I6y0+cNv1mu1MIDKu/6mPmhv3g0gnbF30uhsL99yTLikp303rPY9E qze1kejCNVPVx1uBWIX9jveugvcZVQbaJpAUQ4yk= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by sourceware.org (Postfix) with ESMTP id AB3B73986828 for ; Wed, 2 Jun 2021 21:15:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AB3B73986828 Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-501-w93fd233Nu-DL-RDxs9zRQ-1; Wed, 02 Jun 2021 17:15:56 -0400 X-MC-Unique: w93fd233Nu-DL-RDxs9zRQ-1 Received: by mail-qv1-f71.google.com with SMTP id n4-20020ad44a240000b029021cbf9668daso2766362qvz.23 for ; Wed, 02 Jun 2021 14:15:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:subject:to:cc:message-id:date:user-agent :mime-version:content-language; bh=zwfRehRz9tOOlsNgsbDYYqdttN1g5iBSUGUEj5X9yWk=; b=lkEx52J7FzVMou+05qQ3zcAXpJod9We1hsyNBrdqGfXoe6YW2wpKq7oJ/H288UtpOb 1Bx84AdU/dmguntFCSvsmRdxjdYCYpygxY2qAWUCWQ08B4MpW5kyXt95KAgRGSbQowlL Nx0PVzEC2dUDLkyywvjDnxyk2QuWaZyYpfYgPFA07/F6rsY/adAHOpOOLRNxhA+B+2cD xr6xOOnkOcVwJFQfsucooXA1LG7+/hkfcmavCu/Lpgr4ir5XrN/oywQKkiZdvuThFzBY ZzSqlKjJUbEwTJc95d0uQO1XUY4NZ+5xqhykFQPOq4cAyHzQLoy3C9GQiQySendPnx0M 8IZA== X-Gm-Message-State: AOAM532canN8WMQgE/w1+fRkGUmWlQCxkz5bSDGHNxsWbYIq+t2rry23 ZBRtSEHh+UUfA4RCVy5uCBGDZIj3n3Q89AGAfCAzA9CUWmrvN6CzsiJCiRNtSzX2bheg+17H2yx y3ALGBQhHZS638en0ew== X-Received: by 2002:ad4:596c:: with SMTP id eq12mr19319216qvb.30.1622668555657; Wed, 02 Jun 2021 14:15:55 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy7MGhsNOigHucx+x5xmG2IHcXt+7OdfmuUrFi9ERCCxkNh7RgJsZjpzuRCl2Fbaday5R7YCw== X-Received: by 2002:ad4:596c:: with SMTP id eq12mr19319188qvb.30.1622668555427; Wed, 02 Jun 2021 14:15:55 -0700 (PDT) Received: from ?IPv6:2607:fea8:a25d:e700::ef8c? ([2607:fea8:a25d:e700::ef8c]) by smtp.gmail.com with ESMTPSA id a21sm602135qtm.54.2021.06.02.14.15.54 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 02 Jun 2021 14:15:54 -0700 (PDT) Subject: [PATCH][RFC] Sparse on entry cache for Ranger. To: gcc-patches Message-ID: <2dc2b54e-e737-e03d-e103-541dd4847255@redhat.com> Date: Wed, 2 Jun 2021 17:15:53 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-CA X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_INVALID, DKIM_SIGNED, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Andrew MacLeod via Gcc-patches From: Andrew MacLeod Reply-To: Andrew MacLeod Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" As mentioned earlier, I abstracted the on-entry cache at the beginning of stage1. This was to make it easier to port future changes back to GCC11 so we could provide alternate representations to deal with memory issues, or what have you. This patch introduces a sparse representation of the cache which is used  when the number of basic blocks gets too large. I commandeered the bitmap code since its efficient and has been working a long time, and added 2 routines to get and set 4 bits (quads) at a time.  This allows me to use a bitmap like its a sparse array which can contain a value between 0 and 15, and is conveniently pre-initialized to values of 0 at no cost :-)   This is then used as an index into a small local table to store ranges for the name.  Its limiting in that an ssa-name will not be able to have more than 15 unique ranges, but this happens in less than 8% of all cases in the data I collected, and most of those are switches.. any ranges after the 15 slots are full revert to VARYING.  The values for VARYING and UNDEFINED are pre-populated, and for pointers, I also pre-populate [0,0] and ~[0, 0]. This also adds --param=evrp-sparse-threshold=  which allows the threshold between the full vector and this new sparse representation to be changed. It defaults to a value of 800. I've done various performance runs, and this seems to be a reasonably balanced number. In fact its a 28% improvement in EVRP compile time over 390 files from a gcc bootstrap with minimal impact on missed opportunities. I've also tried to see if using less than 15 values has any significant effect (The lookup is linear when setting), but it does not appear to. I've also bootstrapped with the sparse threshold at 0 to ensure there aren't any issues. My thoughts are I would put this into trunk, and assuming nothing comes up  over the next couple of days, port it back to GCC11 to resolve 100299 and other excessive memory consumption PRs there as well. given that its reusing bitmap code for the sparse representation, it seems like it would be low risk. Are we OK with the addition of the bitmap_get_quad and bitmap_set_quad routines in bitmap.c?  It seems like they might be useful to others.  They are simple tweaks of bitmap_set_bit and bitmap_bit_p.. just dealing with 4 bits at a time.  I could make them local if this is a problem, but i don't have access to the bitmap internals there. Bootstraps on x86_64-pc-linux-gnu with no regressions. Andrew PS in PR10299 we spend a fraction of a second in EVRP now. From 9408efe462c7a664858d986f69082eb86f2a0007 Mon Sep 17 00:00:00 2001 From: Andrew MacLeod Date: Wed, 2 Jun 2021 15:20:02 -0400 Subject: [PATCH 3/3] Implement a sparse bitmap represenation for Rangers on-entry cache. Use s sparse representation for the on entry cache, and utilize it when the number of basic blocks in the function exceeds param_evrp_sparse_threshold. PR tree-optimization/PR100299 * gimple-range-cache.cc (class sbr_sparse_bitmap): New. (sbr_sparse_bitmap::sbr_sparse_bitmap): New. (sbr_sparse_bitmap::set_bb_range): New. (sbr_sparse_bitmap::get_bb_range): New. (sbr_sparse_bitmap::bb_range_p): New. (block_range_cache::block_range_cache): initialize bitmap obstack. (block_range_cache::~block_range_cache): Destruct obstack. (block_range_cache::set_bb_range): Decide when to utilze the sparse on entry cache. * gimple-range-cache.h (block_range_cache): Add bitmap obstack. * params.opt (-param=evrp-sparse-threshold): New. --- gcc/gimple-range-cache.cc | 126 +++++++++++++++++++++++++++++++++++++- gcc/gimple-range-cache.h | 1 + gcc/params.opt | 4 ++ 3 files changed, 128 insertions(+), 3 deletions(-) diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc index 3d1ca3b94ca..658f1bc9342 100644 --- a/gcc/gimple-range-cache.cc +++ b/gcc/gimple-range-cache.cc @@ -236,12 +236,119 @@ sbr_vector::bb_range_p (const basic_block bb) return m_tab[bb->index] != NULL; } +// This class implements the on entry cache via a sparse bitmap. +// It uses the quad bit routines to access 4 bits at a time. +// A value of 0 (the default) means there is no entry, and a value of +// 1 thru SBR_NUM represents an element in the m_range vector. +// Varying is given the first value (1) and pre-cached. +// SBR_NUM + 1 represents the value of UNDEFINED, and is never stored. +// SBR_NUM is the number of values that can be cached. +// Indexes are 1..SBR_NUM and are stored locally at m_range[0..SBR_NUM-1] + +#define SBR_NUM 14 +#define SBR_UNDEF SBR_NUM + 1 +#define SBR_VARYING 1 + +class sbr_sparse_bitmap : public ssa_block_ranges +{ +public: + sbr_sparse_bitmap (tree t, irange_allocator *allocator, bitmap_obstack *bm); + virtual void set_bb_range (const basic_block bb, const irange &r) OVERRIDE; + virtual bool get_bb_range (irange &r, const basic_block bb) OVERRIDE; + virtual bool bb_range_p (const basic_block bb) OVERRIDE; +private: + irange_allocator *m_irange_allocator; + irange *m_range[SBR_NUM]; + bitmap bitvec; + tree m_type; +}; + +// Initialize a block cache for an ssa_name of type T. + +sbr_sparse_bitmap::sbr_sparse_bitmap (tree t, irange_allocator *allocator, + bitmap_obstack *bm) +{ + gcc_checking_assert (TYPE_P (t)); + m_type = t; + bitvec = BITMAP_ALLOC (bm); + m_irange_allocator = allocator; + // Pre-cache varying. + m_range[0] = m_irange_allocator->allocate (2); + m_range[0]->set_varying (t); + // Pre-cache zero and non-zero values for pointers. + if (POINTER_TYPE_P (t)) + { + m_range[1] = m_irange_allocator->allocate (2); + m_range[1]->set_nonzero (t); + m_range[2] = m_irange_allocator->allocate (2); + m_range[2]->set_zero (t); + } + else + m_range[1] = m_range[2] = NULL; + // Clear SBR_NUM entries. + for (int x = 3; x < SBR_NUM; x++) + m_range[x] = 0; +} + +// Set the range on entry to basic block BB to R. + +void +sbr_sparse_bitmap::set_bb_range (const basic_block bb, const irange &r) +{ + if (r.undefined_p ()) + { + bitmap_set_quad (bitvec, bb->index, SBR_UNDEF); + return; + } + + // Loop thru the values to see if R is already present. + for (int x = 0; x < SBR_NUM; x++) + if (!m_range[x] || r == *(m_range[x])) + { + if (!m_range[x]) + m_range[x] = m_irange_allocator->allocate (r); + bitmap_set_quad (bitvec, bb->index, x + 1); + return; + } + // All values are taken, default to VARYING. + bitmap_set_quad (bitvec, bb->index, SBR_VARYING); + return; +} + +// Return the range associated with block BB in R. Return false if +// there is no range. + +bool +sbr_sparse_bitmap::get_bb_range (irange &r, const basic_block bb) +{ + int value = bitmap_get_quad (bitvec, bb->index); + + if (!value) + return false; + + gcc_checking_assert (value <= SBR_UNDEF); + if (value == SBR_UNDEF) + r.set_undefined (); + else + r = *(m_range[value - 1]); + return true; +} + +// Return true if a range is present. + +bool +sbr_sparse_bitmap::bb_range_p (const basic_block bb) +{ + return (bitmap_get_quad (bitvec, bb->index) != 0); +} + // ------------------------------------------------------------------------- // Initialize the block cache. block_range_cache::block_range_cache () { + bitmap_obstack_initialize (&m_bitmaps); m_ssa_ranges.create (0); m_ssa_ranges.safe_grow_cleared (num_ssa_names); m_irange_allocator = new irange_allocator; @@ -254,6 +361,7 @@ block_range_cache::~block_range_cache () delete m_irange_allocator; // Release the vector itself. m_ssa_ranges.release (); + bitmap_obstack_release (&m_bitmaps); } // Set the range for NAME on entry to block BB to R. @@ -269,9 +377,21 @@ block_range_cache::set_bb_range (tree name, const basic_block bb, if (!m_ssa_ranges[v]) { - void *r = m_irange_allocator->get_memory (sizeof (sbr_vector)); - m_ssa_ranges[v] = new (r) sbr_vector (TREE_TYPE (name), - m_irange_allocator); + // Use sparse representation if there are too many basic blocks. + if (last_basic_block_for_fn (cfun) > param_evrp_sparse_threshold) + { + void *r = m_irange_allocator->get_memory (sizeof (sbr_sparse_bitmap)); + m_ssa_ranges[v] = new (r) sbr_sparse_bitmap (TREE_TYPE (name), + m_irange_allocator, + &m_bitmaps); + } + else + { + // Otherwise use the default vector implemntation. + void *r = m_irange_allocator->get_memory (sizeof (sbr_vector)); + m_ssa_ranges[v] = new (r) sbr_vector (TREE_TYPE (name), + m_irange_allocator); + } } m_ssa_ranges[v]->set_bb_range (bb, r); } diff --git a/gcc/gimple-range-cache.h b/gcc/gimple-range-cache.h index 4af461d2aa3..ce4449a08db 100644 --- a/gcc/gimple-range-cache.h +++ b/gcc/gimple-range-cache.h @@ -61,6 +61,7 @@ private: ssa_block_ranges &get_block_ranges (tree name); ssa_block_ranges *query_block_ranges (tree name); irange_allocator *m_irange_allocator; + bitmap_obstack m_bitmaps; }; // This global cache is used with the range engine as markers for what diff --git a/gcc/params.opt b/gcc/params.opt index 0d0dcd216f6..18e6036c4f4 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -126,6 +126,10 @@ Maximum size (in bytes) of objects tracked bytewise by dead store elimination. Common Joined UInteger Var(param_early_inlining_insns) Init(6) Optimization Param Maximal estimated growth of function body caused by early inlining of single call. +-param=evrp-sparse-threshold= +Common Joined UInteger Var(param_evrp_sparse_threshold) Init(800) Optimization Param +Maximum number of basic blocks before EVRP uses a sparse cache. + -param=evrp-mode= Common Joined Var(param_evrp_mode) Enum(evrp_mode) Init(EVRP_MODE_EVRP_FIRST) Param Optimization --param=evrp-mode=[legacy|ranger|legacy-first|ranger-first|ranger-trace|ranger-debug|trace|debug] Specifies the mode Early VRP should operate in. -- 2.17.2