From patchwork Wed Jul 5 23:27:24 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 1804105
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] x86: Properly find the maximum stack slot alignment
Date: Wed, 5 Jul 2023 16:27:24 -0700
Message-ID: <20230705232724.631992-1-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.41.0
X-Patchwork-Original-From: "H.J. Lu via Gcc-patches"
From: "H.J. Lu"
Reply-To: "H.J. Lu"

Don't assume that stack slots can only be accessed by stack or frame
registers.  Also check memory accesses from registers defined by stack
or frame registers.

gcc/

        PR target/109780
        * config/i386/i386.cc (ix86_set_with_register_source): New.
        (ix86_find_all_stack_access): Likewise.
        (ix86_find_max_used_stack_alignment): Also check memory accesses
        from registers defined by stack or frame registers.

gcc/testsuite/

        PR target/109780
        * g++.target/i386/pr109780-1.C: New test.
        * gcc.target/i386/pr109780-1.c: Likewise.
        * gcc.target/i386/pr109780-2.c: Likewise.
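
Not part of the patch: a rough standalone model of the propagation the
commit message describes, to show why the search has to be repeated until
nothing changes.  ToyInsn, kSP, kNumRegs, propagate_once and
stack_slot_access_regs are made-up names for illustration; the real code
works on hard registers and DF def chains, which this sketch does not
attempt to reproduce.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for an insn: one destination register,
// the registers it reads, and whether its source is a register expression
// (as opposed to a memory load or an integer constant).
struct ToyInsn {
  int dest;
  std::vector<int> uses;
  bool register_source;
};

constexpr int kNumRegs = 16;  // assumption: size of the toy register file
constexpr int kSP = 7;        // assumption: toy stack pointer number
using RegSet = std::bitset<kNumRegs>;

// One pass over all insns: add every register that is defined from a
// register already in SET.  Returns true if SET grew, so the caller
// knows another pass is needed.
bool propagate_once(const std::vector<ToyInsn> &insns, RegSet &set) {
  bool changed = false;
  for (const ToyInsn &insn : insns) {
    if (!insn.register_source || set.test(insn.dest))
      continue;
    for (int use : insn.uses)
      if (set.test(use)) {
        set.set(insn.dest);
        changed = true;
        break;
      }
  }
  return changed;
}

// Seed with the stack pointer and iterate to a fixed point.
RegSet stack_slot_access_regs(const std::vector<ToyInsn> &insns) {
  RegSet set;
  set.set(kSP);
  while (propagate_once(insns, set))
    ;
  return set;
}
```

With the insns ordered as "r2 = r1" followed by "r1 = sp + 8", the first
pass only picks up r1 and a second pass is needed to pick up r2, which is
the reason for the repeat loop.  A register loaded from memory or set from
a constant is never added.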
---
 gcc/config/i386/i386.cc                    | 145 ++++++++++++++++++---
 gcc/testsuite/g++.target/i386/pr109780-1.C |  72 ++++++++++
 gcc/testsuite/gcc.target/i386/pr109780-1.c |  14 ++
 gcc/testsuite/gcc.target/i386/pr109780-2.c |  21 +++
 4 files changed, 233 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr109780-1.C
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109780-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109780-2.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index caca74d6dec..85dd8cb0581 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -8084,6 +8084,72 @@ output_probe_stack_range (rtx reg, rtx end)
   return "";
 }
 
+/* Check if PAT is a SET with register source.  */
+
+static void
+ix86_set_with_register_source (rtx, const_rtx pat, void *data)
+{
+  if (GET_CODE (pat) != SET)
+    return;
+
+  rtx src = SET_SRC (pat);
+  if (MEM_P (src) || CONST_INT_P (src))
+    return;
+
+  bool *may_use_register = (bool *) data;
+  *may_use_register = true;
+}
+
+/* Find all stack slot access registers.  */
+
+static bool
+ix86_find_all_stack_access (HARD_REG_SET &stack_slot_access)
+{
+  bool repeat = false;
+
+  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (GENERAL_REGNO_P (i)
+        && !TEST_HARD_REG_BIT (stack_slot_access, i))
+      for (df_ref def = DF_REG_DEF_CHAIN (i);
+           def != NULL;
+           def = DF_REF_NEXT_REG (def))
+        {
+          if (DF_REF_IS_ARTIFICIAL (def))
+            continue;
+
+          rtx_insn *insn = DF_REF_INSN (def);
+
+          bool may_use_register = false;
+          note_stores (insn, ix86_set_with_register_source,
+                       &may_use_register);
+
+          if (!may_use_register)
+            continue;
+
+          df_ref use;
+          FOR_EACH_INSN_USE (use, insn)
+            {
+              rtx reg = DF_REF_REG (use);
+
+              if (!REG_P (reg))
+                continue;
+
+              /* Skip if stack slot access register isn't used.  */
+              if (!TEST_HARD_REG_BIT (stack_slot_access,
+                                      REGNO (reg)))
+                continue;
+
+              /* Add this register to stack_slot_access.  */
+              add_to_hard_reg_set (&stack_slot_access, Pmode, i);
+
+              /* Repeat if a register is added to stack_slot_access.  */
+              repeat = true;
+            }
+        }
+
+  return repeat;
+}
+
 /* Set stack_frame_required to false if stack frame isn't required.
    Update STACK_ALIGNMENT to the largest alignment, in bits, of stack
    slot used if stack frame is required and CHECK_STACK_SLOT is true.  */
@@ -8092,15 +8158,23 @@ static void
 ix86_find_max_used_stack_alignment (unsigned int &stack_alignment,
                                     bool check_stack_slot)
 {
-  HARD_REG_SET set_up_by_prologue, prologue_used;
+  HARD_REG_SET set_up_by_prologue, prologue_used, stack_slot_access;
   basic_block bb;
 
   CLEAR_HARD_REG_SET (prologue_used);
   CLEAR_HARD_REG_SET (set_up_by_prologue);
+  CLEAR_HARD_REG_SET (stack_slot_access);
   add_to_hard_reg_set (&set_up_by_prologue, Pmode, STACK_POINTER_REGNUM);
   add_to_hard_reg_set (&set_up_by_prologue, Pmode, ARG_POINTER_REGNUM);
   add_to_hard_reg_set (&set_up_by_prologue, Pmode,
                        HARD_FRAME_POINTER_REGNUM);
+  /* Stack slot can be accessed by stack pointer, frame pointer or
+     registers defined by stack pointer or frame pointer.  */
+  add_to_hard_reg_set (&stack_slot_access, Pmode,
+                       STACK_POINTER_REGNUM);
+  if (frame_pointer_needed)
+    add_to_hard_reg_set (&stack_slot_access, Pmode,
+                         HARD_FRAME_POINTER_REGNUM);
 
   /* The preferred stack alignment is the minimum stack alignment.  */
   if (stack_alignment > crtl->preferred_stack_boundary)
@@ -8108,32 +8182,65 @@ ix86_find_max_used_stack_alignment (unsigned int &stack_alignment,
 
   bool require_stack_frame = false;
 
+  /* Find all stack slot access registers.  */
+  while (ix86_find_all_stack_access (stack_slot_access))
+    ;
+
   FOR_EACH_BB_FN (bb, cfun)
     {
       rtx_insn *insn;
       FOR_BB_INSNS (bb, insn)
-        if (NONDEBUG_INSN_P (insn)
-            && requires_stack_frame_p (insn, prologue_used,
-                                       set_up_by_prologue))
+        if (NONDEBUG_INSN_P (insn))
           {
-            require_stack_frame = true;
+            if (!require_stack_frame)
+              {
+                if (requires_stack_frame_p (insn, prologue_used,
+                                            set_up_by_prologue))
+                  require_stack_frame = true;
+                else
+                  /* Skip if stack frame isn't required.  */
+                  continue;
+              }
 
-            if (check_stack_slot)
+            /* Stop if stack frame is required, but we don't need to
+               check stack slot.  */
+            if (!check_stack_slot)
+              break;
+
+            /* Find stack slot access register use.  */
+            bool stack_slot_register_p = false;
+            df_ref use;
+            FOR_EACH_INSN_USE (use, insn)
               {
-                /* Find the maximum stack alignment.  */
-                subrtx_iterator::array_type array;
-                FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL)
-                  if (MEM_P (*iter)
-                      && (reg_mentioned_p (stack_pointer_rtx,
-                                           *iter)
-                          || reg_mentioned_p (frame_pointer_rtx,
-                                              *iter)))
-                    {
-                      unsigned int alignment = MEM_ALIGN (*iter);
-                      if (alignment > stack_alignment)
-                        stack_alignment = alignment;
-                    }
+                rtx reg = DF_REF_REG (use);
+
+                if (!REG_P (reg))
+                  continue;
+
+                /* Stop if stack slot access register is used.  */
+                if (TEST_HARD_REG_BIT (stack_slot_access,
+                                       REGNO (reg)))
+                  {
+                    stack_slot_register_p = true;
+                    break;
+                  }
               }
+
+            /* Skip if stack slot access registers are unused.  */
+            if (!stack_slot_register_p)
+              continue;
+
+            /* This insn may reference stack slot.  Find the maximum
+               stack slot alignment.  */
+            subrtx_iterator::array_type array;
+            FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL)
+              if (MEM_P (*iter))
+                {
+                  unsigned int alignment = MEM_ALIGN (*iter);
+                  if (alignment > stack_alignment)
+                    stack_alignment = alignment;
+                  break;
+                }
           }
     }
 
diff --git a/gcc/testsuite/g++.target/i386/pr109780-1.C b/gcc/testsuite/g++.target/i386/pr109780-1.C
new file mode 100644
index 00000000000..7e3eabdec94
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr109780-1.C
@@ -0,0 +1,72 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target c++17 } */
+/* { dg-options "-O2 -mavx2 -mtune=haswell" } */
+
+template struct remove_reference {
+  using type = __remove_reference(_Tp);
+};
+template struct MaybeStorageBase {
+  T val;
+  struct Union {
+    ~Union();
+  } mStorage;
+};
+template struct MaybeStorage : MaybeStorageBase {
+  char mIsSome;
+};
+template ::type>
+constexpr MaybeStorage Some(T &&);
+template constexpr MaybeStorage Some(T &&aValue) {
+  return {aValue};
+}
+template struct Span {
+  int operator[](long idx) {
+    int *__trans_tmp_4;
+    if (__builtin_expect(idx, 0))
+      *(int *)__null = false;
+    __trans_tmp_4 = storage_.data();
+    return __trans_tmp_4[idx];
+  }
+  struct {
+    int *data() { return data_; }
+    int *data_;
+  } storage_;
+};
+struct Variant {
+  template Variant(RefT) {}
+};
+long from_i, from___trans_tmp_9;
+namespace js::intl {
+struct DecimalNumber {
+  Variant string_;
+  unsigned long significandStart_;
+  unsigned long significandEnd_;
+  bool zero_ = false;
+  bool negative_;
+  template DecimalNumber(CharT string) : string_(string) {}
+  template
+  static MaybeStorage from(Span);
+  void from();
+};
+} // namespace js::intl
+void js::intl::DecimalNumber::from() {
+  Span __trans_tmp_3;
+  from(__trans_tmp_3);
+}
+template
+MaybeStorage
+js::intl::DecimalNumber::from(Span chars) {
+  DecimalNumber number(chars);
+  if (auto ch = chars[from_i]) {
+    from_i++;
+    number.negative_ = ch == '-';
+  }
+  while (from___trans_tmp_9 && chars[from_i])
+    ;
+  if (chars[from_i])
+    while (chars[from_i - 1])
+      number.zero_ = true;
+  return Some(number);
+}
+
+/* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-32,\[^\\n\]*sp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr109780-1.c b/gcc/testsuite/gcc.target/i386/pr109780-1.c
new file mode 100644
index 00000000000..6b06947f2a5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr109780-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=skylake" } */
+
+char perm[64];
+
+void
+__attribute__((noipa))
+foo (int n)
+{
+  for (int i = 0; i < n; ++i)
+    perm[i] = i;
+}
+
+/* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-32,\[^\\n\]*sp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr109780-2.c b/gcc/testsuite/gcc.target/i386/pr109780-2.c
new file mode 100644
index 00000000000..152da06c6ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr109780-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=skylake" } */
+
+#define N 9
+
+void
+f (double x, double y, double *res)
+{
+  y = -y;
+  for (int i = 0; i < N; ++i)
+    {
+      double tmp = y;
+      y = x;
+      x = tmp;
+      res[i] = i;
+    }
+  res[N] = y * y;
+  res[N + 1] = x;
+}
+
+/* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-32,\[^\\n\]*sp" } } */
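
Not part of the patch: the dg-final pattern shared by the three tests
rejects any `and` instruction that masks a stack pointer register with -32
on a single line of assembly, which is what a 32-byte dynamic stack
realignment looks like.  The sketch below shows what the pattern does and
does not match.  It assumes that std::regex's ECMAScript syntax treats this
particular pattern the same way Tcl's regexp does, and has_32byte_realign
is a hypothetical helper, not something in the testsuite.

```cpp
#include <cassert>
#include <regex>
#include <string>

// The scan-assembler-not pattern with the Tcl backslash escaping removed.
// [^\n]* keeps the whole match on one line of assembly.
bool has_32byte_realign(const std::string &asm_text) {
  static const std::regex re(R"(and[lq]?[^\n]*-32,[^\n]*sp)");
  return std::regex_search(asm_text, re);
}
```

For example, "andq $-32, %rsp" and "andl $-32, %esp" would make a test
fail, while "andq $-16, %rsp" would not, and neither would an `and` with
-32 on some other register followed by a use of %rsp on the next line.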