From patchwork Mon Mar 23 01:51:03 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2 via Gcc-patches"
X-Patchwork-Id: 1259811
Date: Mon, 23 Mar 2020 12:21:03 +1030
To: Segher Boessenkool
Subject: [RS6000] PR94145, make PLT loads volatile
Message-ID: <20200323015103.GS4583@bubble.grove.modra.org>
References: <20200312024850.GE5384@bubble.grove.modra.org>
 <20200312165717.GG22482@gate.crashing.org>
 <20200312233601.GH5384@bubble.grove.modra.org>
 <20200313154038.GR22482@gate.crashing.org>
 <20200313230002.GB23597@bubble.grove.modra.org>
 <20200318215359.GO22482@gate.crashing.org>
Content-Disposition: inline
In-Reply-To: <20200318215359.GO22482@gate.crashing.org>
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Patchwork-Original-From: Alan Modra via Gcc-patches
From: "Li, Pan2 via Gcc-patches"
Reply-To: Alan Modra
Cc: gcc-patches@gcc.gnu.org

On Wed, Mar 18, 2020 at 04:53:59PM -0500, Segher Boessenkool wrote:
> Could you please send a new patch (could be the same patch even) that
> is easier to review for me?

The PLT is volatile. On PowerPC it is a bss style section which the
dynamic loader initialises to point at resolver stubs (called glink on
PowerPC64) to support lazy resolution of function addresses. The
first call to a given function goes via the dynamic loader symbol
resolver, which updates the PLT entry for that function and calls the
function. The second call, if there is one and we don't have a
multi-threaded race, will use the updated PLT entry and thus avoid the
relatively slow symbol resolver path.

Calls via the PLT are like calls via a function pointer, except that
no initialised function pointer is volatile like the PLT. All
initialised function pointers are resolved at program startup to point
at the function or are left as NULL. There is no support for lazy
resolution of any user visible function pointer.
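
To make the mechanism above concrete, here is a rough user-space model
of one lazily resolved PLT entry. It is only a sketch: plt_foo,
foo_resolver_stub, foo_real, calls_with_reload and calls_with_cached
are hypothetical names made up for illustration, and an ordinary
function pointer variable stands in for the real PLT slot that the
dynamic loader would patch.

```c
#include <assert.h>

typedef int (*fn_t) (int);

/* Counts trips through the (hypothetical) slow resolver path.  */
static int resolver_calls;

static int foo_real (int x) { return x + 1; }
static int foo_resolver_stub (int x);

/* Hypothetical stand-in for one PLT entry.  It is volatile because
   the "dynamic loader" below rewrites it behind the caller's back.  */
static fn_t volatile plt_foo = foo_resolver_stub;

/* Models lazy resolution: patch the slot, then call the target.  */
static int foo_resolver_stub (int x)
{
  resolver_calls++;
  plt_foo = foo_real;
  return foo_real (x);
}

/* Put the model back in its unresolved start-up state.  */
static void reset_model (void)
{
  resolver_calls = 0;
  plt_foo = foo_resolver_stub;
}

/* Reload the slot before every call, as a normal PLT call does.
   Returns how many calls went through the resolver.  */
int calls_with_reload (int n)
{
  assert (n >= 0);
  reset_model ();
  for (int i = 0; i < n; i++)
    plt_foo (i);
  return resolver_calls;
}

/* Read the slot once before the first call and reuse that copy.
   Returns how many calls went through the resolver.  */
int calls_with_cached (int n)
{
  assert (n >= 0);
  reset_model ();
  fn_t cached = plt_foo;
  for (int i = 0; i < n; i++)
    cached (i);
  return resolver_calls;
}
```

In this model calls_with_reload (3) goes through the resolver once,
while calls_with_cached (3) goes through it on all three calls,
because the copy was taken before the slot was patched.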

So why does any of this matter to gcc? Well, normally the PLT call
mechanism happens entirely behind gcc's back, but since we implemented
inline PLT calls (effectively putting the PLT code stub that loads the
PLT entry inline and making that code sequence scheduled), the load of
the PLT entry is visible to gcc. That load then is subject to gcc
optimization, for example in

/* -S -mcpu=future -mpcrel -mlongcall -O2. */
int foo (int);
void bar (void)
{
  while (foo(0))
    foo (99);
}

we see the PLT load for foo being hoisted out of the loop and stashed
in a call-saved register. If that happens to be the first call to
foo, then the stashed value is that for the resolver stub, and every
call to foo in the loop will then go via the slow resolver path. Not
a good idea. Also, if foo turns out to be a local function and the
linker replaces the PLT calls with direct calls to foo then gcc has
just wasted a call-saved register.

This patch teaches gcc that the PLT loads are volatile. The change
doesn't affect other loads of function pointers and thus has no effect
on normal indirect function calls. Note that because the
"optimization" this patch prevents can only occur over function calls,
the only place gcc can stash PLT loads is in call-saved registers or
in other memory. I'm reasonably confident that this change will be
neutral or positive for the "ld -z now" case where the PLT is not
volatile, in code where there is any register pressure. Even if gcc
could be taught to recognise cases where the PLT is resolved, you'd
need to discount use of registers to cache PLT loads by some factor
involving the chance that those calls would be converted to direct
calls.

	PR target/94145
	* config/rs6000/rs6000.c (rs6000_longcall_ref): Use unspec_volatile
	for PLT16_LO and PLT_PCREL.
	* config/rs6000/rs6000.md (UNSPEC_PLT16_LO, UNSPEC_PLT_PCREL): Remove.
	(UNSPECV_PLT16_LO, UNSPECV_PLT_PCREL): Define.
	(pltseq_plt16_lo_, pltseq_plt_pcrel): Use unspec_volatile.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 07f7cf516ba..68046fdb5ee 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -19274,8 +19274,9 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
   if (rs6000_pcrel_p (cfun))
     {
       rtx reg = gen_rtx_REG (Pmode, regno);
-      rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
-                              UNSPEC_PLT_PCREL);
+      rtx u = gen_rtx_UNSPEC_VOLATILE (Pmode,
+                                       gen_rtvec (3, base, call_ref, arg),
+                                       UNSPECV_PLT_PCREL);
       emit_insn (gen_rtx_SET (reg, u));
       return reg;
     }
@@ -19294,8 +19295,9 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
       rtx reg = gen_rtx_REG (Pmode, regno);
       rtx hi = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
                                UNSPEC_PLT16_HA);
-      rtx lo = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, reg, call_ref, arg),
-                               UNSPEC_PLT16_LO);
+      rtx lo = gen_rtx_UNSPEC_VOLATILE (Pmode,
+                                        gen_rtvec (3, reg, call_ref, arg),
+                                        UNSPECV_PLT16_LO);
       emit_insn (gen_rtx_SET (reg, hi));
       emit_insn (gen_rtx_SET (reg, lo));
       return reg;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ad88b6783af..5a8e9de670b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -148,8 +148,6 @@
    UNSPEC_SI_FROM_SF
    UNSPEC_PLTSEQ
    UNSPEC_PLT16_HA
-   UNSPEC_PLT16_LO
-   UNSPEC_PLT_PCREL
   ])
 
 ;;
@@ -178,6 +176,8 @@
    UNSPECV_MTFSB1               ; Set FPSCR Field bit to 1
    UNSPECV_SPLIT_STACK_RETURN   ; A camouflaged return
    UNSPECV_SPEC_BARRIER         ; Speculation barrier
+   UNSPECV_PLT16_LO
+   UNSPECV_PLT_PCREL
   ])
 
 ; The three different kinds of epilogue.
@@ -10359,10 +10359,10 @@
 
 (define_insn "*pltseq_plt16_lo_"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
-        (unspec:P [(match_operand:P 1 "gpc_reg_operand" "b")
-                   (match_operand:P 2 "symbol_ref_operand" "s")
-                   (match_operand:P 3 "" "")]
-                  UNSPEC_PLT16_LO))]
+        (unspec_volatile:P [(match_operand:P 1 "gpc_reg_operand" "b")
+                            (match_operand:P 2 "symbol_ref_operand" "s")
+                            (match_operand:P 3 "" "")]
+                           UNSPECV_PLT16_LO))]
   "TARGET_PLTSEQ"
 {
   return rs6000_pltseq_template (operands, RS6000_PLTSEQ_PLT16_LO);
@@ -10382,10 +10382,10 @@
 
 (define_insn "*pltseq_plt_pcrel"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
-        (unspec:P [(match_operand:P 1 "" "")
-                   (match_operand:P 2 "symbol_ref_operand" "s")
-                   (match_operand:P 3 "" "")]
-                  UNSPEC_PLT_PCREL))]
+        (unspec_volatile:P [(match_operand:P 1 "" "")
+                            (match_operand:P 2 "symbol_ref_operand" "s")
+                            (match_operand:P 3 "" "")]
+                           UNSPECV_PLT_PCREL))]
   "HAVE_AS_PLTSEQ && TARGET_ELF && rs6000_pcrel_p (cfun)"
 {