From patchwork Wed May 1 17:30:36 2024
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 1930305
Date: Wed, 1 May 2024 11:30:36 -0600
From: Jeff Law
Subject: [committed] [RISC-V] Fix detection of store pair fusion cases
To: "gcc-patches@gcc.gnu.org"

We've got the ability to count the number of store pair fusions happening
in the front-end of the pipeline.  When comparing some code from last year
vs the current trunk we saw a fairly dramatic drop.

The problem is that the store pair fusion detection code was actively
harmful due to a minor bug in checking offsets.  So instead of pairing up
8 byte stores such as sp+0 with sp+8, it tried to pair up sp+8 and sp+16.

Given uarch sensitivity I didn't try to pull together a testcase.  But we
could certainly see the undesirable behavior in benchmarks as simplistic
as dhrystone up through spec2017.

Anyway, this was bootstrapped a while back.  Also verified through our
performance counters that store pair fusion rates are back up.
Regression tested with crosses a few minutes ago.

Pushing to the trunk and coordination branch.

jeff

commit fad93e7617ce1aafb006983a71b6edc9ae1eb2d1
Author: Jeff Law
Date:   Wed May 1 11:28:41 2024 -0600

    [committed] [RISC-V] Fix detection of store pair fusion cases

    We've got the ability to count the number of store pair fusions
    happening in the front-end of the pipeline.  When comparing some code
    from last year vs the current trunk we saw a fairly dramatic drop.

    The problem is that the store pair fusion detection code was actively
    harmful due to a minor bug in checking offsets.  So instead of pairing
    up 8 byte stores such as sp+0 with sp+8, it tried to pair up sp+8 and
    sp+16.

    Given uarch sensitivity I didn't try to pull together a testcase.
    But we could certainly see the undesirable behavior in benchmarks as
    simplistic as dhrystone up through spec2017.

    Anyway, this was bootstrapped a while back.  Also verified through our
    performance counters that store pair fusion rates are back up.
    Regression tested with crosses a few minutes ago.

    gcc/
        * config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Break out
        tests for easier debugging in store pair fusion case.  Fix
        offset check in same.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0f62b295b96..24d1ead3902 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8874,26 +8874,43 @@ riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
       extract_base_offset_in_addr (SET_DEST (prev_set), &base_prev, &offset_prev);
       extract_base_offset_in_addr (SET_DEST (curr_set), &base_curr, &offset_curr);
 
-      /* The two stores must be contained within opposite halves of the same
-         16 byte aligned block of memory.  We know that the stack pointer and
-         the frame pointer have suitable alignment.  So we just need to check
-         the offsets of the two stores for suitable alignment.
-
-         Originally the thought was to check MEM_ALIGN, but that was reporting
-         incorrect alignments, even for SP/FP accesses, so we gave up on that
-         approach.  */
-      if (base_prev != NULL_RTX
-          && base_curr != NULL_RTX
-          && REG_P (base_prev)
-          && REG_P (base_curr)
-          && REGNO (base_prev) == REGNO (base_curr)
-          && (REGNO (base_prev) == STACK_POINTER_REGNUM
-              || REGNO (base_prev) == HARD_FRAME_POINTER_REGNUM)
-          && ((INTVAL (offset_prev) == INTVAL (offset_curr) + 8
-               && (INTVAL (offset_prev) % 16) == 0)
-              || ((INTVAL (offset_curr) == INTVAL (offset_prev) + 8)
-                  && (INTVAL (offset_curr) % 16) == 0)))
-        return true;
+      /* Fail if we did not find both bases.  */
+      if (base_prev == NULL_RTX || base_curr == NULL_RTX)
+        return false;
+
+      /* Fail if either base is not a register.  */
+      if (!REG_P (base_prev) || !REG_P (base_curr))
+        return false;
+
+      /* Fail if the bases are not the same register.  */
+      if (REGNO (base_prev) != REGNO (base_curr))
+        return false;
+
+      /* Originally the thought was to check MEM_ALIGN, but that was
+         reporting incorrect alignments, even for SP/FP accesses, so we
+         gave up on that approach.  Instead just check for stack/hfp
+         which we know are aligned.  */
+      if (REGNO (base_prev) != STACK_POINTER_REGNUM
+          && REGNO (base_prev) != HARD_FRAME_POINTER_REGNUM)
+        return false;
+
+      /* The two stores must be contained within opposite halves of the
+         same 16 byte aligned block of memory.  We know that the stack
+         pointer and the frame pointer have suitable alignment.  So we
+         just need to check the offsets of the two stores for suitable
+         alignment.  */
+      /* Get the smaller offset into OFFSET_PREV.  */
+      if (INTVAL (offset_prev) > INTVAL (offset_curr))
+        std::swap (offset_prev, offset_curr);
+
+      /* If the smaller offset (OFFSET_PREV) is not 16 byte aligned,
+         then fail.  */
+      if ((INTVAL (offset_prev) % 16) != 0)
+        return false;
+
+      /* The higher offset must be 8 bytes more than the lower
+         offset.  */
+      return (INTVAL (offset_prev) + 8 == INTVAL (offset_curr));
     }
 }
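For reference, the offset rule at the heart of the fix can be exercised in
isolation.  The sketch below is not GCC code: the helper names and the plain
int64_t offsets are illustrative stand-ins for the INTVALs used in
riscv_macro_fusion_pair_p, and the base-register checks are assumed to have
already passed.  It simply mirrors the removed and the new offset conditions
so the sp+0/sp+8 vs sp+8/sp+16 behavior described above can be checked with
any C++ compiler at hand.

// Illustrative only -- not part of the patch.  Mirrors the removed and the
// new offset checks from riscv_macro_fusion_pair_p using plain integers.
#include <algorithm>
#include <cassert>
#include <cstdint>

// Old (buggy) condition: effectively required the *higher* of the two
// offsets to be 16 byte aligned.
static bool old_offsets_fuse_p (int64_t offset_prev, int64_t offset_curr)
{
  return (offset_prev == offset_curr + 8 && offset_prev % 16 == 0)
         || (offset_curr == offset_prev + 8 && offset_curr % 16 == 0);
}

// New condition: the *lower* offset must be 16 byte aligned and the higher
// offset must be exactly 8 bytes above it, i.e. the two 8 byte stores fill
// opposite halves of one 16 byte aligned block.
static bool new_offsets_fuse_p (int64_t offset_prev, int64_t offset_curr)
{
  if (offset_prev > offset_curr)
    std::swap (offset_prev, offset_curr);
  if (offset_prev % 16 != 0)
    return false;
  return offset_prev + 8 == offset_curr;
}

int main ()
{
  // sp+0 / sp+8 share one 16 byte block: should fuse, but the old check
  // rejected the pair.
  assert (!old_offsets_fuse_p (0, 8));
  assert (new_offsets_fuse_p (0, 8));

  // sp+8 / sp+16 straddle two 16 byte blocks: should not fuse, but the old
  // check accepted the pair.
  assert (old_offsets_fuse_p (8, 16));
  assert (!new_offsets_fuse_p (8, 16));

  // The order in which the two stores appear does not matter.
  assert (new_offsets_fuse_p (24, 16));
  return 0;
}

The key difference is which of the two offsets must be 16 byte aligned: the
old condition tested the higher offset, while the new one tests the lower
offset, so only pairs that fill both halves of a single 16 byte aligned
block are treated as store pair fusion candidates.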