From patchwork Fri May 3 19:18:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Jambor
X-Patchwork-Id: 1931166
From: Martin Jambor
To: GCC Patches
Cc: Richard Biener, Jan Hubicka
Subject: [PATCH] sra: Do not leave work for DSE (that it can sometimes not perform)
User-Agent: Notmuch/0.38.2 (https://notmuchmail.org) Emacs/29.3 (x86_64-suse-linux-gnu)
Date: Fri, 03 May 2024 21:18:55 +0200
List-Id: Gcc-patches mailing list

Hi,

when looking again at the g++.dg/tree-ssa/pr109849.C testcase we
discovered that it generates terrible store-to-load forwarding stalls
because SRA was leaving behind aggregate loads while all the stores
were done by scalar parts, and DSE failed to remove the useless load.
SRA already has all the knowledge needed to remove the statement, so
this small patch makes it do so.

With this patch, the g++.dg/tree-ssa/pr109849.C micro-benchmark runs 9
times faster (on an AMD EPYC 75F3 machine).

Bootstrapped and tested on x86_64.  OK for master?

Given that the patch is simple but can sometimes have a large benefit,
could it possibly be backported to the gcc-14 branch in a few weeks,
even though it is not a regression (at least not in the last decade)?
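For readers who do not have the testcases at hand, the kind of code
affected can be sketched roughly as follows.  This is a hypothetical
reduction, not taken from pr109849.C or ssa-dse-26.c; the names
struct pair and sum_pair are made up for illustration, and whether a
given GCC version scalarizes it exactly as described is an assumption.

```c
/* Hypothetical illustration; none of these names come from the patch.  */
struct pair
{
  int first;
  int second;
};

/* The memory behind P is not a local aggregate of this function, so
   (by assumption) SRA cannot treat it as a candidate.  CUR is local
   and only ever accessed field by field, so it is expected to be
   fully covered by scalar replacements.  */
int
sum_pair (const struct pair *p)
{
  /* Aggregate copy from a non-candidate into CUR.  Once the fields of
     CUR are loaded individually from *P, this statement does no
     useful work; it is the kind of statement the patch removes in SRA
     rather than leaving for DSE.  */
  struct pair cur = *p;

  return cur.first + cur.second;
}
```

Compiling such a file with -O2 -fdump-tree-esra (the dump option used
in the updated ssa-dse-26.c) should make it possible to check whether
the aggregate copy into cur survives early SRA.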
Thanks,

Martin


gcc/ChangeLog:

2024-04-18  Martin Jambor

        * tree-sra.cc (sra_modify_assign): Remove the original statement
        also when dealing with a store to a fully covered aggregate from a
        non-candidate.

gcc/testsuite/ChangeLog:

2024-04-23  Martin Jambor

        * g++.dg/tree-ssa/pr109849.C: Also check that the aggregate store
        to cur disappears.
        * gcc.dg/tree-ssa/ssa-dse-26.c: Instead of relying on DSE,
        check that the unwanted stores were removed at early SRA time.
---
 gcc/testsuite/g++.dg/tree-ssa/pr109849.C   |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c |  6 +++---
 gcc/tree-sra.cc                            | 14 ++++++++++++--
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
index cd348c0f590..d06dbb10482 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-sra" } */
+/* { dg-options "-O2 -fdump-tree-sra -fdump-tree-optimized" } */
 
 #include
 typedef unsigned int uint32_t;
@@ -29,3 +29,4 @@ main()
 }
 
 /* { dg-final { scan-tree-dump "Created a replacement for stack offset" "sra"} } */
+/* { dg-final { scan-tree-dump-not "cur = MEM" "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
index 43152de5616..1d01392c595 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dse1-details -fno-short-enums -fno-tree-fre" } */
+/* { dg-options "-O2 -fdump-tree-esra -fno-short-enums -fno-tree-fre" } */
 /* { dg-skip-if "we want a BIT_FIELD_REF from fold_truth_andor" { ! lp64 } } */
 /* { dg-skip-if "temporary variable names are not x and y" { mmix-knuth-mmixware } } */
 
@@ -31,5 +31,5 @@ constraint_equal (struct constraint a, struct constraint b)
     && constraint_expr_equal (a.rhs, b.rhs);
 }
 
-/* { dg-final { scan-tree-dump-times "Deleted dead store: x = " 2 "dse1" } } */
-/* { dg-final { scan-tree-dump-times "Deleted dead store: y = " 2 "dse1" } } */
+/* { dg-final { scan-tree-dump-not "x = " "esra" } } */
+/* { dg-final { scan-tree-dump-not "y = " "esra" } } */
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 32fa28911f2..8040b0c5645 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -4854,8 +4854,18 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
                  But use the RHS aggregate to load from to expose more
                  optimization opportunities.  */
               if (access_has_children_p (lacc))
-                generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
-                                         0, 0, gsi, true, true, loc);
+                {
+                  generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
+                                           0, 0, gsi, true, true, loc);
+                  if (lacc->grp_covered)
+                    {
+                      unlink_stmt_vdef (stmt);
+                      gsi_remove (& orig_gsi, true);
+                      release_defs (stmt);
+                      sra_stats.deleted++;
+                      return SRA_AM_REMOVED;
+                    }
+                }
             }
 
           return SRA_AM_NONE;