From patchwork Mon Aug 19 12:13:08 2019
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1149221
Date: Mon, 19 Aug 2019 14:13:08 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fix 300.twolf regression caused by STV

Uros noted that STV with !TImode isn't supposed to run before combine.
The following adjusts things accordingly, and the pass now runs twice
for TARGET_64BIT.

I've also adjusted another gpr->xmm move to use the
(vec_merge (vec_duplicate..)) style rather than a subreg. This isn't
strictly necessary to fix the bug, though, and my earlier need to do
this might have been caused by the pass running too early. So - with
or without this consistency part?

Bootstrap / regtest running on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2019-08-19  Richard Biener

	PR target/91154
	* config/i386/i386-features.c (general_scalar_chain::convert_op):
	Use (vec_merge (vec_duplicate..)) style vector from scalar move.
	(convert_scalars_to_vector): Add timode_p parameter and use it
	to guard TImode-only operation.
	(pass_stv::gate): Adjust so STV runs twice for TARGET_64BIT.
	(pass_stv::execute): Pass down timode_p.

	* gcc.target/i386/minmax-7.c: New testcase.

Index: gcc/config/i386/i386-features.c
===================================================================
--- gcc/config/i386/i386-features.c	(revision 274666)
+++ gcc/config/i386/i386-features.c	(working copy)
@@ -910,7 +910,9 @@ general_scalar_chain::convert_op (rtx *o
     {
       rtx tmp = gen_reg_rtx (GET_MODE (*op));
 
-      emit_insn_before (gen_move_insn (tmp, *op), insn);
+      emit_insn_before (gen_rtx_SET (gen_rtx_SUBREG (vmode, tmp, 0),
+                                     gen_gpr_to_xmm_move_src (vmode, *op)),
+                        insn);
       *op = gen_rtx_SUBREG (vmode, tmp, 0);
 
       if (dump_file)
@@ -1664,7 +1666,7 @@ timode_remove_non_convertible_regs (bitm
    instructions into vector mode when profitable.  */
 
 static unsigned int
-convert_scalars_to_vector ()
+convert_scalars_to_vector (bool timode_p)
 {
   basic_block bb;
   int converted_insns = 0;
@@ -1690,7 +1692,7 @@ convert_scalars_to_vector ()
     {
       rtx_insn *insn;
       FOR_BB_INSNS (bb, insn)
-        if (TARGET_64BIT
+        if (timode_p
             && timode_scalar_to_vector_candidate_p (insn))
           {
             if (dump_file)
@@ -1699,7 +1701,7 @@ convert_scalars_to_vector ()
 
             bitmap_set_bit (&candidates[2], INSN_UID (insn));
           }
-        else
+        else if (!timode_p)
           {
             /* Check {SI,DI}mode.  */
             for (unsigned i = 0; i <= 1; ++i)
@@ -1715,7 +1717,7 @@ convert_scalars_to_vector ()
           }
     }
 
-  if (TARGET_64BIT)
+  if (timode_p)
     timode_remove_non_convertible_regs (&candidates[2]);
   for (unsigned i = 0; i <= 1; ++i)
     general_remove_non_convertible_regs (&candidates[i]);
@@ -1875,13 +1877,13 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
     {
-      return (timode_p == !!TARGET_64BIT
+      return ((!timode_p || TARGET_64BIT)
               && TARGET_STV && TARGET_SSE2 && optimize > 1);
     }
 
   virtual unsigned int execute (function *)
     {
-      return convert_scalars_to_vector ();
+      return convert_scalars_to_vector (timode_p);
     }
 
   opt_pass *clone ()
Index: gcc/testsuite/gcc.target/i386/minmax-7.c
===================================================================
--- gcc/testsuite/gcc.target/i386/minmax-7.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/minmax-7.c	(working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=haswell" } */
+
+extern int numBins;
+extern int binOffst;
+extern int binWidth;
+extern int Trybin;
+void foo (int);
+
+void bar (int aleft, int axcenter)
+{
+  int a1LoBin = (((Trybin=((axcenter + aleft)-binOffst)/binWidth)<0)
+                 ? 0 : ((Trybin>numBins) ? numBins : Trybin));
+  foo (a1LoBin);
+}
+
+/* We do not want the RA to spill %esi for it's dual-use but using
+   pminsd is OK.  */
+/* { dg-final { scan-assembler-not "rsp" { target { ! { ia32 } } } } } */
+/* { dg-final { scan-assembler "pminsd" } } */