From patchwork Tue Jan 2 09:34:13 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Hubicka
X-Patchwork-Id: 854471
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Date: Tue, 2 Jan 2018 10:34:13 +0100
From: Jan Hubicka
To: gcc-patches@gcc.gnu.org
Subject: Fix div/sqrt costs for generic model
Message-ID: <20180102093413.GA44218@kam.mff.cuni.cz>
Content-Disposition: inline
User-Agent: Mutt/1.5.23 (2014-03-12)

Hi,
the following patch makes the FP sqrt and div costs better match modern chips.
It enables vectorization of inner loops in SPEC2006 cactusADM and leslie3d,
with over 10% speedup on Zen and also on Haswell.

Bootstrapped/regtested x86_64-linux, committed.

Honza

	PR target/81616
	* x86-tune-costs.h (generic_cost): Reduce cost of FDIV 20->17,
	cost of sqrt 20->14, DIVSS 18->13, DIVSD 32->17, SQRTSS 30->14
	and SQRTSD 58->18.  Increase cond_taken_branch_cost 3->4 and
	cond_not_taken_branch_cost 1->2.

Index: config/i386/x86-tune-costs.h
===================================================================
--- config/i386/x86-tune-costs.h	(revision 256065)
+++ config/i386/x86-tune-costs.h	(working copy)
@@ -2293,10 +2293,10 @@ struct processor_costs generic_cost = {
   3,                                    /* Branch cost */
   COSTS_N_INSNS (3),                    /* cost of FADD and FSUB insns. */
   COSTS_N_INSNS (5),                    /* cost of FMUL instruction. */
-  COSTS_N_INSNS (20),                   /* cost of FDIV instruction. */
+  COSTS_N_INSNS (17),                   /* cost of FDIV instruction. */
   COSTS_N_INSNS (1),                    /* cost of FABS instruction. */
   COSTS_N_INSNS (1),                    /* cost of FCHS instruction. */
-  COSTS_N_INSNS (20),                   /* cost of FSQRT instruction. */
+  COSTS_N_INSNS (14),                   /* cost of FSQRT instruction. */
 
   COSTS_N_INSNS (1),                    /* cost of cheap SSE instruction. */
   COSTS_N_INSNS (3),                    /* cost of ADDSS/SD SUBSS/SD insns. */
@@ -2304,15 +2304,15 @@ struct processor_costs generic_cost = {
   COSTS_N_INSNS (5),                    /* cost of MULSD instruction. */
   COSTS_N_INSNS (5),                    /* cost of FMA SS instruction. */
   COSTS_N_INSNS (5),                    /* cost of FMA SD instruction. */
-  COSTS_N_INSNS (18),                   /* cost of DIVSS instruction. */
-  COSTS_N_INSNS (32),                   /* cost of DIVSD instruction. */
-  COSTS_N_INSNS (30),                   /* cost of SQRTSS instruction. */
-  COSTS_N_INSNS (58),                   /* cost of SQRTSD instruction. */
+  COSTS_N_INSNS (13),                   /* cost of DIVSS instruction. */
+  COSTS_N_INSNS (17),                   /* cost of DIVSD instruction. */
+  COSTS_N_INSNS (14),                   /* cost of SQRTSS instruction. */
+  COSTS_N_INSNS (18),                   /* cost of SQRTSD instruction. */
   1, 4, 3, 3,                           /* reassoc int, fp, vec_int, vec_fp. */
   generic_memcpy,
   generic_memset,
-  COSTS_N_INSNS (3),                    /* cond_taken_branch_cost. */
-  COSTS_N_INSNS (1),                    /* cond_not_taken_branch_cost. */
+  COSTS_N_INSNS (4),                    /* cond_taken_branch_cost. */
+  COSTS_N_INSNS (2),                    /* cond_not_taken_branch_cost. */
 };
 
 /* core_cost should produce code tuned for Core familly of CPUs. */
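
For a quick view of how far the div/sqrt entries move, the old and new values from the hunks above can be tabulated with a small standalone C sketch. It is only an illustration and not part of the patch: struct cost_change, generic_changes, cost_in_insns and print_changes are hypothetical names, and the COSTS_N_INSNS definition is copied from memory of GCC's rtl.h, so treat it as an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: same scaling as COSTS_N_INSNS in GCC's rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical helper type (not in GCC): one changed table entry.  */
struct cost_change
{
  const char *name;
  int old_cost;
  int new_cost;
};

/* Old and new generic_cost values, taken from the diff above.  */
static const struct cost_change generic_changes[] = {
  { "FDIV",   COSTS_N_INSNS (20), COSTS_N_INSNS (17) },
  { "FSQRT",  COSTS_N_INSNS (20), COSTS_N_INSNS (14) },
  { "DIVSS",  COSTS_N_INSNS (18), COSTS_N_INSNS (13) },
  { "DIVSD",  COSTS_N_INSNS (32), COSTS_N_INSNS (17) },
  { "SQRTSS", COSTS_N_INSNS (30), COSTS_N_INSNS (14) },
  { "SQRTSD", COSTS_N_INSNS (58), COSTS_N_INSNS (18) },
};

/* Convert a scaled cost back to instruction units.  */
static int
cost_in_insns (int cost)
{
  assert (cost % COSTS_N_INSNS (1) == 0);
  return cost / COSTS_N_INSNS (1);
}

/* Print each entry with its old cost, new cost and the reduction.  */
static void
print_changes (void)
{
  size_t n = sizeof generic_changes / sizeof generic_changes[0];
  for (size_t i = 0; i < n; i++)
    {
      int o = cost_in_insns (generic_changes[i].old_cost);
      int w = cost_in_insns (generic_changes[i].new_cost);
      printf ("%-6s %2d -> %2d (-%d insns)\n",
              generic_changes[i].name, o, w, o - w);
    }
}
```

Calling print_changes () from a main function lists the six entries. The largest moves are DIVSD, SQRTSS and SQRTSD, which fall to roughly half or less of their previous values.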