From patchwork Wed Dec 12 18:32:52 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 205604
Date: Wed, 12 Dec 2012 19:32:52 +0100
Subject: Re: [PATCH,x86] Fix combine for condditional instructions.
From: Uros Bizjak
To: Richard Biener
Cc: Yuri Rumyantsev, gcc-patches, Igor Zamyatin

On Wed, Dec 12, 2012 at 3:45 PM, Richard Biener wrote:

>> I assume that this is not right way for fixing such simple performance
>> anomaly since we need to do redundant work - combine load to
>> conditional and then split it back in peephole2? Does it look
>> reasonable? Why we should produce non-efficient instruction that must
>> be split later?
>
> Well, either don't allow this instruction variant from the start, or allow
> the extra freedom for register allocation this creates. It doesn't make
> sense to just reject it being generated by combine - that doesn't address
> when it materializes in another way.

Please check the attached patch; it implements this limitation in a correct way:
- keeps memory operands for -Os or cold parts of the executable
- doesn't increase register pressure
- handles all situations where a memory operand can propagate into the RTX

Yuri, can you please check if this patch fixes the performance problem for you?

BTW: I would really like to add some TARGET_USE_CMOVE_WITH_MEMOP
target macro and conditionalize the new peephole2 patterns on it.

Uros.
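For illustration only (this example is not part of the original message): a minimal C sketch of the kind of source that can lead to a conditional move with a memory input on x86. The names table_value and pick are hypothetical. Whether a given compiler actually emits a cmov with a memory operand here depends on its version, target and flags. The peephole2 patterns in the patch are meant to split such an instruction, when a scratch register is free and the insn is optimized for speed, into a plain load into the scratch register followed by a register-to-register cmov.

```c
/* Hypothetical example; the names are not taken from the patch. */

/* A global object, so loading it unconditionally cannot trap. */
int table_value = 42;

/* Returns acc when a < b, otherwise the value loaded from memory.
   A compiler targeting x86 with cmov support may implement the select
   as a conditional move whose source is the memory location of
   table_value; that is the form the peephole2 patterns would split. */
int pick(int a, int b, int acc)
{
    return a < b ? acc : table_value;
}
```

Compiling with something like gcc -O2 -S and reading the generated assembly is one way to see which form a particular compiler produces.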
Index: i386.md
===================================================================
--- i386.md	(revision 194451)
+++ i386.md	(working copy)
@@ -16122,6 +16122,31 @@
   operands[3] = gen_lowpart (SImode, operands[3]);
 })
 
+;; Don't do conditional moves with memory inputs
+(define_peephole2
+  [(match_scratch:SWI248 2 "r")
+   (set (match_operand:SWI248 0 "register_operand")
+        (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator"
+                               [(reg FLAGS_REG) (const_int 0)])
+          (match_dup 0)
+          (match_operand:SWI248 3 "memory_operand")))]
+  "TARGET_CMOVE && optimize_insn_for_speed_p ()"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 0)
+        (if_then_else:SWI248 (match_dup 1) (match_dup 0) (match_dup 2)))])
+
+(define_peephole2
+  [(match_scratch:SWI248 2 "r")
+   (set (match_operand:SWI248 0 "register_operand")
+        (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator"
+                               [(reg FLAGS_REG) (const_int 0)])
+          (match_operand:SWI248 3 "memory_operand")
+          (match_dup 0)))]
+  "TARGET_CMOVE && optimize_insn_for_speed_p ()"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 0)
+        (if_then_else:SWI248 (match_dup 1) (match_dup 2) (match_dup 0)))])
+
 (define_expand "movcc"
   [(set (match_operand:X87MODEF 0 "register_operand")
         (if_then_else:X87MODEF
@@ -16209,6 +16234,35 @@
   [(set_attr "type" "fcmov,fcmov,icmov,icmov")
    (set_attr "mode" "SF,SF,SI,SI")])
 
+;; Don't do conditional moves with memory inputs
+(define_peephole2
+  [(match_scratch:MODEF 2 "r")
+   (set (match_operand:MODEF 0 "register_and_not_any_fp_reg_operand")
+        (if_then_else:MODEF (match_operator 1 "fcmov_comparison_operator"
+                              [(reg FLAGS_REG) (const_int 0)])
+          (match_dup 0)
+          (match_operand:MODEF 3 "memory_operand")))]
+  "(<MODE>mode != DFmode || TARGET_64BIT)
+   && TARGET_80387 && TARGET_CMOVE
+   && optimize_insn_for_speed_p ()"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 0)
+        (if_then_else:MODEF (match_dup 1) (match_dup 0) (match_dup 2)))])
+
+(define_peephole2
+  [(match_scratch:MODEF 2 "r")
+   (set (match_operand:MODEF 0 "register_and_not_any_fp_reg_operand")
+        (if_then_else:MODEF (match_operator 1 "fcmov_comparison_operator"
+                              [(reg FLAGS_REG) (const_int 0)])
+          (match_operand:MODEF 3 "memory_operand")
+          (match_dup 0)))]
+  "(<MODE>mode != DFmode || TARGET_64BIT)
+   && TARGET_80387 && TARGET_CMOVE
+   && optimize_insn_for_speed_p ()"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 0)
+        (if_then_else:MODEF (match_dup 1) (match_dup 2) (match_dup 0)))])
+
 ;; All moves in XOP pcmov instructions are 128 bits and hence we restrict
 ;; the scalar versions to have only XMM registers as operands.