From patchwork Wed Nov 22 21:45:31 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Botcazou
X-Patchwork-Id: 840551
From: Eric Botcazou
To: gcc-patches@gcc.gnu.org
Subject: Fix PR rtl-optimization/83030
Date: Wed, 22 Nov 2017 22:45:31 +0100
Message-ID: <32159344.6WgmTslxEn@polaris>

This is a regression present on mainline for SPARC, in the form of the
failure of g++.dg/tree-prof/partition1.C. The delayed-branch scheduling
pass happily deletes a CROSSING_JUMP_P jump insn, which was added precisely
to bridge the gap between the hot and cold sections.
It turns out that CROSSING_JUMP_P is not documented at all (whereas the
now-dead REG_CROSSING_JUMP still is), so the patch also does a bit of
housekeeping work.

Tested on x86-64/Linux and SPARC64/Linux, applied on the mainline.


2017-11-22  Eric Botcazou

	PR rtl-optimization/83030
	* doc/rtl.texi (Flags in an RTL Expression): Alphabetize, add entry
	for CROSSING_JUMP_P and mention usage of 'jump' for JUMP_INSNs.
	(Insns): Delete entry for REG_CROSSING_JUMP in register notes.
	* bb-reorder.c (update_crossing_jump_flags): Do not test whether the
	CROSSING_JUMP_P flag is already set before setting it.
	* cfgrtl.c (fixup_partition_crossing): Likewise.
	* reorg.c (relax_delay_slots): Do not consider a CROSSING_JUMP_P insn
	as useless.

Index: doc/rtl.texi
===================================================================
--- doc/rtl.texi	(revision 255000)
+++ doc/rtl.texi	(working copy)
@@ -565,6 +565,16 @@ that are used in certain types of expres
 are accessed with the following macros, which expand into lvalues.
 
 @table @code
+@findex CROSSING_JUMP_P
+@cindex @code{jump_insn} and @samp{/j}
+@item CROSSING_JUMP_P (@var{x})
+Nonzero in a @code{jump_insn} if it crosses between hot and cold sections,
+which could potentially be very far apart in the executable. The presence
+of this flag indicates to other optimizations that this branching instruction
+should not be ``collapsed'' into a simpler branching construct. It is used
+when the optimization to partition basic blocks into hot and cold sections
+is turned on.
+
 @findex CONSTANT_POOL_ADDRESS_P
 @cindex @code{symbol_ref} and @samp{/u}
 @cindex @code{unchanging}, in @code{symbol_ref}
@@ -577,37 +587,6 @@ In either case GCC assumes these address
 perhaps with the help of base registers.
 Stored in the @code{unchanging} field and printed as @samp{/u}.
 
-@findex RTL_CONST_CALL_P
-@cindex @code{call_insn} and @samp{/u}
-@cindex @code{unchanging}, in @code{call_insn}
-@item RTL_CONST_CALL_P (@var{x})
-In a @code{call_insn} indicates that the insn represents a call to a
-const function. Stored in the @code{unchanging} field and printed as
-@samp{/u}.
-
-@findex RTL_PURE_CALL_P
-@cindex @code{call_insn} and @samp{/i}
-@cindex @code{return_val}, in @code{call_insn}
-@item RTL_PURE_CALL_P (@var{x})
-In a @code{call_insn} indicates that the insn represents a call to a
-pure function. Stored in the @code{return_val} field and printed as
-@samp{/i}.
-
-@findex RTL_CONST_OR_PURE_CALL_P
-@cindex @code{call_insn} and @samp{/u} or @samp{/i}
-@item RTL_CONST_OR_PURE_CALL_P (@var{x})
-In a @code{call_insn}, true if @code{RTL_CONST_CALL_P} or
-@code{RTL_PURE_CALL_P} is true.
-
-@findex RTL_LOOPING_CONST_OR_PURE_CALL_P
-@cindex @code{call_insn} and @samp{/c}
-@cindex @code{call}, in @code{call_insn}
-@item RTL_LOOPING_CONST_OR_PURE_CALL_P (@var{x})
-In a @code{call_insn} indicates that the insn represents a possibly
-infinite looping call to a const or pure function. Stored in the
-@code{call} field and printed as @samp{/c}. Only true if one of
-@code{RTL_CONST_CALL_P} or @code{RTL_PURE_CALL_P} is true.
-
 @findex INSN_ANNULLED_BRANCH_P
 @cindex @code{jump_insn} and @samp{/u}
 @cindex @code{call_insn} and @samp{/u}
@@ -702,6 +681,29 @@ Stored in the @code{call} field and prin
 Nonzero in a @code{mem} if the memory reference holds a pointer.
 Stored in the @code{frame_related} field and printed as @samp{/f}.
 
+@findex MEM_READONLY_P
+@cindex @code{mem} and @samp{/u}
+@cindex @code{unchanging}, in @code{mem}
+@item MEM_READONLY_P (@var{x})
+Nonzero in a @code{mem}, if the memory is statically allocated and read-only.
+
+Read-only in this context means never modified during the lifetime of the
+program, not necessarily in ROM or in write-disabled pages. A common
+example of the later is a shared library's global offset table. This
+table is initialized by the runtime loader, so the memory is technically
+writable, but after control is transferred from the runtime loader to the
+application, this memory will never be subsequently modified.
+
+Stored in the @code{unchanging} field and printed as @samp{/u}.
+
+@findex PREFETCH_SCHEDULE_BARRIER_P
+@cindex @code{prefetch} and @samp{/v}
+@cindex @code{volatile}, in @code{prefetch}
+@item PREFETCH_SCHEDULE_BARRIER_P (@var{x})
+In a @code{prefetch}, indicates that the prefetch is a scheduling barrier.
+No other INSNs will be moved over it.
+Stored in the @code{volatil} field and printed as @samp{/v}.
+
 @findex REG_FUNCTION_VALUE_P
 @cindex @code{reg} and @samp{/i}
 @cindex @code{return_val}, in @code{reg}
@@ -731,6 +733,37 @@ The same hard register may be used also 
 functions called by this one, but @code{REG_FUNCTION_VALUE_P} is zero
 in this kind of use.
 
+@findex RTL_CONST_CALL_P
+@cindex @code{call_insn} and @samp{/u}
+@cindex @code{unchanging}, in @code{call_insn}
+@item RTL_CONST_CALL_P (@var{x})
+In a @code{call_insn} indicates that the insn represents a call to a
+const function. Stored in the @code{unchanging} field and printed as
+@samp{/u}.
+
+@findex RTL_PURE_CALL_P
+@cindex @code{call_insn} and @samp{/i}
+@cindex @code{return_val}, in @code{call_insn}
+@item RTL_PURE_CALL_P (@var{x})
+In a @code{call_insn} indicates that the insn represents a call to a
+pure function. Stored in the @code{return_val} field and printed as
+@samp{/i}.
+
+@findex RTL_CONST_OR_PURE_CALL_P
+@cindex @code{call_insn} and @samp{/u} or @samp{/i}
+@item RTL_CONST_OR_PURE_CALL_P (@var{x})
+In a @code{call_insn}, true if @code{RTL_CONST_CALL_P} or
+@code{RTL_PURE_CALL_P} is true.
+
+@findex RTL_LOOPING_CONST_OR_PURE_CALL_P
+@cindex @code{call_insn} and @samp{/c}
+@cindex @code{call}, in @code{call_insn}
+@item RTL_LOOPING_CONST_OR_PURE_CALL_P (@var{x})
+In a @code{call_insn} indicates that the insn represents a possibly
+infinite looping call to a const or pure function. Stored in the
+@code{call} field and printed as @samp{/c}. Only true if one of
+@code{RTL_CONST_CALL_P} or @code{RTL_PURE_CALL_P} is true.
+
 @findex RTX_FRAME_RELATED_P
 @cindex @code{insn} and @samp{/f}
 @cindex @code{call_insn} and @samp{/f}
@@ -765,21 +798,6 @@ computation performed by this instructio
 This flag is required for exception handling support on targets with RTL
 prologues.
 
-@findex MEM_READONLY_P
-@cindex @code{mem} and @samp{/u}
-@cindex @code{unchanging}, in @code{mem}
-@item MEM_READONLY_P (@var{x})
-Nonzero in a @code{mem}, if the memory is statically allocated and read-only.
-
-Read-only in this context means never modified during the lifetime of the
-program, not necessarily in ROM or in write-disabled pages. A common
-example of the later is a shared library's global offset table. This
-table is initialized by the runtime loader, so the memory is technically
-writable, but after control is transferred from the runtime loader to the
-application, this memory will never be subsequently modified.
-
-Stored in the @code{unchanging} field and printed as @samp{/u}.
-
 @findex SCHED_GROUP_P
 @cindex @code{insn} and @samp{/s}
 @cindex @code{call_insn} and @samp{/s}
@@ -879,14 +897,6 @@ Stored in the @code{volatil} field and p
 Most uses of @code{SYMBOL_REF_FLAG} are historic and may be subsumed
 by @code{SYMBOL_REF_FLAGS}. Certainly use of @code{SYMBOL_REF_FLAGS}
 is mandatory if the target requires more than one bit of storage.
-
-@findex PREFETCH_SCHEDULE_BARRIER_P
-@cindex @code{prefetch} and @samp{/v}
-@cindex @code{volatile}, in @code{prefetch}
-@item PREFETCH_SCHEDULE_BARRIER_P (@var{x})
-In a @code{prefetch}, indicates that the prefetch is a scheduling barrier.
-No other INSNs will be moved over it.
-Stored in the @code{volatil} field and printed as @samp{/v}.
 @end table
 
 These are the fields to which the above macros refer:
@@ -974,6 +984,8 @@ In a @code{set}, 1 means it is for a ret
 
 In a @code{call_insn}, 1 means it is a sibling call.
 
+In a @code{jump_insn}, 1 means it is a crossing jump.
+
 In an RTL dump, this flag is represented as @samp{/j}.
 
 @findex unchanging
@@ -3910,16 +3922,6 @@ multiple targets; the last label in the 
 insn-field) goes into the @code{JUMP_LABEL} field and does not have a
 @code{REG_LABEL_TARGET} note. @xref{Insns, JUMP_LABEL}.
 
-@findex REG_CROSSING_JUMP
-@item REG_CROSSING_JUMP
-This insn is a branching instruction (either an unconditional jump or
-an indirect jump) which crosses between hot and cold sections, which
-could potentially be very far apart in the executable. The presence
-of this note indicates to other optimizations that this branching
-instruction should not be ``collapsed'' into a simpler branching
-construct. It is used when the optimization to partition basic blocks
-into hot and cold sections is turned on.
-
 @findex REG_SETJMP
 @item REG_SETJMP
 Appears attached to each @code{CALL_INSN} to @code{setjmp} or a
Index: bb-reorder.c
===================================================================
--- bb-reorder.c	(revision 255000)
+++ bb-reorder.c	(working copy)
@@ -2239,10 +2239,7 @@ update_crossing_jump_flags (void)
     FOR_EACH_EDGE (e, ei, bb->succs)
       if (e->flags & EDGE_CROSSING)
         {
-          if (JUMP_P (BB_END (bb))
-              /* Some flags were added during fix_up_fall_thru_edges, via
-                 force_nonfallthru_and_redirect. */
-              && !CROSSING_JUMP_P (BB_END (bb)))
+          if (JUMP_P (BB_END (bb)))
             CROSSING_JUMP_P (BB_END (bb)) = 1;
           break;
         }
Index: cfgrtl.c
===================================================================
--- cfgrtl.c	(revision 255000)
+++ cfgrtl.c	(working copy)
@@ -1333,8 +1333,7 @@ fixup_partition_crossing (edge e)
   if (BB_PARTITION (e->src) != BB_PARTITION (e->dest))
     {
       e->flags |= EDGE_CROSSING;
-      if (JUMP_P (BB_END (e->src))
-          && !CROSSING_JUMP_P (BB_END (e->src)))
+      if (JUMP_P (BB_END (e->src)))
         CROSSING_JUMP_P (BB_END (e->src)) = 1;
     }
   else if (BB_PARTITION (e->src) == BB_PARTITION (e->dest))
Index: reorg.c
===================================================================
--- reorg.c	(revision 255000)
+++ reorg.c	(working copy)
@@ -3361,10 +3361,11 @@ relax_delay_slots (rtx_insn *first)
         }
 
       /* See if we have a simple (conditional) jump that is useless. */
-      if (! INSN_ANNULLED_BRANCH_P (delay_jump_insn)
-          && ! condjump_in_parallel_p (delay_jump_insn)
+      if (!CROSSING_JUMP_P (delay_jump_insn)
+          && !INSN_ANNULLED_BRANCH_P (delay_jump_insn)
+          && !condjump_in_parallel_p (delay_jump_insn)
           && prev_active_insn (as_a<rtx_insn *> (target_label)) == insn
-          && ! BARRIER_P (prev_nonnote_insn (as_a<rtx_insn *> (target_label)))
+          && !BARRIER_P (prev_nonnote_insn (as_a<rtx_insn *> (target_label)))
           /* If the last insn in the delay slot sets CC0 for some insn,
              various code assumes that it is in a delay slot. We could
              put it back where it belonged and delete the register notes,