From patchwork Tue Apr 24 07:42:03 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 903311
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Date: Tue, 24 Apr 2018 09:42:03 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: ubizjak@gmail.com, kirill.yukhin@gmail.com
Subject: [PATCH] Fix PR85491
User-Agent: Alpine 2.20 (LSU 67 2015-01-07)

The following patch restricts the previous fix for PR84037 to the case
of strided loads with non-constant step. This avoids a regression in the
nbench LU decomposition test on Haswell, where the earlier change causes
us to use AVX128 instead of AVX256 in the two critical loops.

Bootstrapped and tested on x86_64-unknown-linux-gnu. SPEC CPU 2006
results are in the noise, as are the SPEC CPU 2000 results (200.sixtrack
seems to be awfully jumpy for me - it goes up and down by almost 50%!).
nbench LU factorization performance is back up.

OK for trunk?

Thanks,
Richard.

2018-04-24  Richard Biener

	PR target/85491
	* config/i386/i386.c (ix86_add_stmt_cost): Restrict strided
	load cost increase to the case of non-constant step.
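For readers unfamiliar with the distinction the patch draws, here is a
minimal sketch of the two kinds of strided load. The function names and
the stride of 4 are made up for illustration and do not come from nbench
or from the PR testcases. In the first loop the step is a compile-time
constant, so the data reference's step would be an INTEGER_CST; in the
second it is only known at run time. Whether the vectorizer actually
chooses an elementwise access for either loop depends on the GCC version
and target flags, so treat this as an assumption rather than a
reproducer.

```c
#include <stddef.h>

/* Hypothetical example: the step (4 elements) is a compile-time
   constant, so DR_STEP would be an INTEGER_CST.  With the patch, the
   increased vec_construct cost is no longer applied to such loads.  */
void
gather_const_step (double *restrict out, const double *restrict a,
		   size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i * 4];
}

/* Hypothetical example: the step is a run-time value, so DR_STEP is
   not an INTEGER_CST.  This is the case the increased cost is now
   restricted to.  */
void
gather_var_step (double *restrict out, const double *restrict a,
		 size_t n, size_t step)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i * step];
}
```

Both functions compute the same result when step == 4; only what the
compiler knows about the step at compile time differs.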
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 259556)
+++ gcc/config/i386/i386.c	(working copy)
@@ -50550,8 +50550,9 @@ ix86_add_stmt_cost (void *data, int coun
      construction cost by the number of elements involved.  */
   if (kind == vec_construct
       && stmt_info
-      && stmt_info->type == load_vec_info_type
-      && stmt_info->memory_access_type == VMAT_ELEMENTWISE)
+      && STMT_VINFO_TYPE (stmt_info) == load_vec_info_type
+      && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_ELEMENTWISE
+      && TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF (stmt_info))) != INTEGER_CST)
     {
       stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
       stmt_cost *= TYPE_VECTOR_SUBPARTS (vectype);