From patchwork Sun Aug 26 13:08:38 2018
X-Patchwork-Submitter: Gerald Pfeifer
X-Patchwork-Id: 962246
Date: Sun, 26 Aug 2018 15:08:38 +0200 (CEST)
From: Gerald Pfeifer
To: gcc-patches@gcc.gnu.org
Subject: [wwwdocs] news/profiledriven.html -- avoid

This updates news/profiledriven.html, where in addition to using id
attributes we need to change the names of the ids since numbers are
not acceptable.  I decided to simply use "ref1" instead of "1" and
so forth.

Applied.

Gerald

Index: news/profiledriven.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/news/profiledriven.html,v
retrieving revision 1.13
diff -u -r1.13 profiledriven.html
--- news/profiledriven.html 2 Jun 2018 21:16:18 -0000 1.13
+++ news/profiledriven.html 26 Aug 2018 11:57:34 -0000
@@ -82,7 +82,7 @@

GCC contains a static branch predictor which is able to guess the common direction of any branch without an experimental run based on
-href="#1">[1]. The predictor consists of a set of simple
+href="#ref1">[1]. The predictor consists of a set of simple
heuristics that expose common behavior of programs, for instance that loops usually loop more than once, pointers are non-null and integers usually positive. The original predictor has been contributed by

For this project the predictor has been extended to use
-Dempster-Shaffer theory [2] to combine the used
+Dempster-Shaffer theory [2] to combine the used
heuristics to give the expected branch probability and a new mechanism has been added for other optimization passes of GCC to annotate branches. For instance, the loop optimizer is sometimes able to
@@ -104,7 +104,7 @@
probabilities into expected frequencies of executions of the individual basic blocks, so that the static profile looks identical to the feedback driven profile for the rest of the compiler. Wu and
-Larus [2] report that this algorithm can accurately
+Larus [2] report that this algorithm can accurately
identify hot spots in a program even at intraprocedural level.
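
The combination step mentioned in the patched text (merging several heuristics into one expected branch probability) can be illustrated with a small sketch. It assumes the standard two-outcome (taken / not taken) form of Dempster's rule of combination. The function names and the sample probabilities are hypothetical and are not taken from GCC's sources.

```python
from functools import reduce


def combine_taken_probability(p1: float, p2: float) -> float:
    """Combine two independent estimates that a branch is taken.

    Two-hypothesis (taken / not taken) form of Dempster's rule:
    multiply the agreeing evidence for each outcome and renormalize.
    """
    agree_taken = p1 * p2
    agree_not_taken = (1.0 - p1) * (1.0 - p2)
    total = agree_taken + agree_not_taken
    if total == 0.0:
        # The two estimates flatly contradict each other (1.0 vs 0.0).
        raise ValueError("heuristics are in total conflict")
    return agree_taken / total


def combine_all(probabilities):
    """Fold any number of estimates, starting from 0.5 (no evidence)."""
    return reduce(combine_taken_probability, probabilities, 0.5)


# Hypothetical hit rates for two heuristics that both match one branch.
combined = combine_all([0.88, 0.60])
print(round(combined, 4))
```

Two heuristics that agree reinforce each other, so the combined estimate is higher than either input, while an estimate of 0.5 carries no evidence and leaves the other value unchanged.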

@@ -132,7 +132,7 @@

The experimental results show that the current implementation of branch predictors successfully guesses about 76% of the branches
-(compared to 70% reported by [1]). About half of the
+(compared to 70% reported by [1]). About half of the
branches are guessed with 90% success. A perfect branch predictor based on the profile feedback guesses 94% of the branches correctly.

@@ -144,7 +144,7 @@
gives an overall difference of about 3%. We hope to enlarge this gap in the future by better use of the profile information and by implementing better static predictors. As reported in
-href="#3">[3], the benefit for real world applications is higher
+href="#ref3">[3], the benefit for real world applications is higher
than for benchmarks, as applications tend to have larger working sets and benefit more from reduced code size.

@@ -189,7 +189,7 @@

A number of further optimizations are possible. For instance
-href="#3">[3] describes superblock formation, loop peeling, loop
+href="#ref3">[3] describes superblock formation, loop peeling, loop
inlining and some other minor optimizations. Work continues on a separate branch to introduce better infrastructure for control flow graph manipulation (such as code duplication) that will make
@@ -200,7 +200,7 @@
It would also be nice to modify the current loop optimizer to preserve the flow graph and use this information to control the optimizations performed, such as loop unrolling, peeling or strength reduction
-href="#8">[8], [3].
+href="#ref8">[8], [3].

@@ -211,19 +211,19 @@

The basic block reordering algorithm can be considerably improved and extended for code replication, as described in
-href="#5">[5] and [6], to optimize branch
+href="#ref5">[5] and [6], to optimize branch
prediction, cache and instruction fetch performance.

The predicated execution framework can be used for hyperblock
-formation [8] and possible reverse if-conversion [9] on architectures not supporting predicated
+formation [8] and possible reverse if-conversion [9] on architectures not supporting predicated
execution.

There is room from improvement in branch prediction. Patterson describes branch prediction using an improved value range propagation
-pass [4] that has significantly better accuracy. A
+pass [4] that has significantly better accuracy. A
number of other simple heuristics can be added.

@@ -260,45 +260,45 @@

References

-[1]
+[1]
Branch Prediction for Free; Ball and Larus; PLDI '93.
-[2]
+[2]
Static Branch Frequency and Program Profile Analysis; Wu and Larus; MICRO-27.
-[3]
+[3]
Design and Analysis of Profile-Based Optimization in Compaq's Compilation Tools for Alpha; Journal of Instruction-Level Parallelism 3 (2000) 1-25
-[4]
+[4]
Accurate Static Branch Prediction by Value Range Propagation; Jason R. C. Patterson (jasonp@fit.qut.edu.au), 1995
-[5]
+[5]
Near-optimal Intraprocedural Branch Alignment; Cliff Young, David S. Johnson, David R. Karger, Michael D. Smith, ACM 1997
-[6]
+[6]
Software Trace Cache; International Conference on Supercomputing, 1999
-[7]
+[7]
Using Profile Information to Assist Classic Code Optimizations; Pohua P. Chang, Scott A. Mahlke, and Wen-mei W. Hwu, 1991
-[8]
+[8]
Hyperblock Performance Optimizations For ILP Processors; David Isaac August, 1996
-[9]
+[9]
Reverse If-Conversion; Nancy J. Warter, Scott A. Mahlke, Wen-mei W. Hwu, B. Ramakrishna Rau; ACM SIGPLAN Notices, 1993