From patchwork Thu Oct 12 15:49:15 2017
X-Patchwork-Submitter: Jan Hubicka
X-Patchwork-Id: 824921
Date: Thu, 12 Oct 2017 17:49:15 +0200
From: Jan Hubicka
To: gcc-patches@gcc.gnu.org, Venkataramanan.Kumar@amd.com
Subject: Zen tuning part 7: Fix ix86_adjust_cost
Message-ID: <20171012154915.GA45576@kam.mff.cuni.cz>

Hi,
this patch fixes ix86_adjust_cost for Zen support.  In particular, the
original code accounted memory latencies incorrectly (3 for integer, 2 for
the FP unit), while they are 4 for integer and 7 for FP on this CPU.  Using
lower latencies makes the scheduler overly pessimistic about the CPU's
ability to execute sequences involving loads effectively.

I have decided to split the code into a new switch case, even though it is
currently similar to the Athlon-Bulldozer tuning.  The reason is that some
extra special cases will appear here, and Zen is probably a good place to
cut away from sharing the implementation with older AMD designs.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

	* x86-tune-sched.c (ix86_adjust_cost): Fix Zen support.

Index: config/i386/x86-tune-sched.c
===================================================================
--- config/i386/x86-tune-sched.c	(revision 253651)
+++ config/i386/x86-tune-sched.c	(working copy)
@@ -352,7 +352,6 @@ ix86_adjust_cost (rtx_insn *insn, int de
     case PROCESSOR_BDVER2:
     case PROCESSOR_BDVER3:
     case PROCESSOR_BDVER4:
-    case PROCESSOR_ZNVER1:
     case PROCESSOR_BTVER1:
     case PROCESSOR_BTVER2:
     case PROCESSOR_GENERIC:
@@ -387,6 +386,35 @@ ix86_adjust_cost (rtx_insn *insn, int de
 
 	  if (cost >= loadcost)
 	    cost -= loadcost;
+	  else
+	    cost = 0;
+	}
+      break;
+
+    case PROCESSOR_ZNVER1:
+      /* Stack engine allows push and pop instructions to execute in parallel.  */
+      if ((insn_type == TYPE_PUSH || insn_type == TYPE_POP)
+	  && (dep_insn_type == TYPE_PUSH || dep_insn_type == TYPE_POP))
+	return 0;
+
+      memory = get_attr_memory (insn);
+
+      /* Show ability of reorder buffer to hide latency of load by executing
+	 in parallel with previous instruction in case
+	 previous instruction is not needed to compute the address.  */
+      if ((memory == MEMORY_LOAD || memory == MEMORY_BOTH)
+	  && !ix86_agi_dependent (dep_insn, insn))
+	{
+	  enum attr_unit unit = get_attr_unit (insn);
+	  int loadcost;
+
+	  if (unit == UNIT_INTEGER || unit == UNIT_UNKNOWN)
+	    loadcost = 4;
+	  else
+	    loadcost = 7;
+
+	  if (cost >= loadcost)
+	    cost -= loadcost;
 	  else
 	    cost = 0;
 	}
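
For readers who want to see why the smaller load latencies made the
scheduler pessimistic, the clamped subtraction from the Zen case can be
modelled in isolation.  This is only a sketch and not GCC code: the
sketch_* names and the two-value unit enum are hypothetical, and
UNIT_UNKNOWN (which the patch treats like the integer unit) is left out.
The latency values are the ones quoted in the message.

```c
/* Standalone model of the load-latency adjustment; NOT GCC code.
   All sketch_* identifiers are hypothetical and exist only here.  */

enum sketch_unit { SKETCH_UNIT_INTEGER, SKETCH_UNIT_FP };

/* Dependence cost that remains after a load of latency LOADCOST is
   hidden behind the producing instruction.  Mirrors the patch's
     if (cost >= loadcost) cost -= loadcost; else cost = 0;  */
int
sketch_hide_load (int cost, int loadcost)
{
  return cost >= loadcost ? cost - loadcost : 0;
}

/* Load latencies the patch uses for Zen: 4 for integer, 7 for FP.  */
int
sketch_zen_loadcost (enum sketch_unit unit)
{
  return unit == SKETCH_UNIT_INTEGER ? 4 : 7;
}

/* Latencies Zen got from the shared AMD case before the patch,
   as stated in the message: 3 for integer, 2 for FP.  */
int
sketch_old_loadcost (enum sketch_unit unit)
{
  return unit == SKETCH_UNIT_INTEGER ? 3 : 2;
}
```

With a made-up dependence cost of 10 on an FP instruction that also loads
from memory, sketch_hide_load leaves 8 with the old FP latency and 3 with
the Zen latency, so the old accounting kept the consumer further away from
its producer than the hardware appears to require.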