From patchwork Thu Feb 21 16:42:40 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 222342
Date: Thu, 21 Feb 2013 17:42:40 +0100
From: Jakub Jelinek
To: Richard
Henderson
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Handle ASHIFTRT with constant shift count >= BITS_PER_WORD
 in subreg lowering (PR rtl-optimization/50339)
Message-ID: <20130221164240.GE1215@tucnak.zalov.cz>

Hi!

This patch teaches the lower-subreg pass to also handle ASHIFTRTs with
constant shift counts from BITS_PER_WORD to 2*BITS_PER_WORD-1, as it
already does for the corresponding LSHIFTRTs.  For LSHIFTRT the upper
half of the result is zeroed; for ASHIFTRT it must instead be set to
upper source half >> (BITS_PER_WORD-1).  For a shift by exactly
2*BITS_PER_WORD-1 both result halves are equal, so that case is
optimized to one shift followed by a copy from the lower to the upper
half.

On the testcase from the PR this removes 3 unnecessary moves, so we end
up one move better than 4.7 (and thus fix the regression); on the other
testcases we generate code of either the same quality or better.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-02-21  Jakub Jelinek

        PR rtl-optimization/50339
        * lower-subreg.h (struct lower_subreg_choices): Add
        splitting_ashiftrt field.
        * lower-subreg.c (compute_splitting_shift): Handle ASHIFTRT.
        (compute_costs): Call compute_splitting_shift also for ASHIFTRT
        into splitting_ashiftrt field.
        (find_decomposable_shift_zext, resolve_shift_zext): Handle also
        ASHIFTRT.
        (dump_choices): Fix up printing LSHIFTRT choices, print ASHIFTRT
        choices.

        Jakub

--- gcc/lower-subreg.h.jj 2013-02-21 14:10:39.033592663 +0100
+++ gcc/lower-subreg.h 2013-02-21 15:26:18.773634801 +0100
@@ -34,6 +34,7 @@ struct lower_subreg_choices {
      should be split.
*/
   bool splitting_ashift[MAX_BITS_PER_WORD];
   bool splitting_lshiftrt[MAX_BITS_PER_WORD];
+  bool splitting_ashiftrt[MAX_BITS_PER_WORD];
 
   /* True if there is at least one mode that is worth splitting.  */
   bool something_to_do;
--- gcc/lower-subreg.c.jj 2013-02-21 14:10:38.975592966 +0100
+++ gcc/lower-subreg.c 2013-02-21 15:27:15.114316148 +0100
@@ -57,9 +57,9 @@ along with GCC; see the file COPYING3.
    to do this.
 
    This pass only splits moves with modes that are wider than
-   word_mode and ASHIFTs, LSHIFTRTs and ZERO_EXTENDs with integer
-   modes that are twice the width of word_mode.  The latter could be
-   generalized if there was a need to do this, but the trend in
+   word_mode and ASHIFTs, LSHIFTRTs, ASHIFTRTs and ZERO_EXTENDs with
+   integer modes that are twice the width of word_mode.  The latter
+   could be generalized if there was a need to do this, but the trend in
    architectures is to not need this.
 
    There are two useful preprocessor defines for use by maintainers:
@@ -152,7 +152,7 @@ compute_splitting_shift (bool speed_p, s
                          bool *splitting, enum rtx_code code,
                          int word_move_zero_cost, int word_move_cost)
 {
-  int wide_cost, narrow_cost, i;
+  int wide_cost, narrow_cost, upper_cost, i;
 
   for (i = 0; i < BITS_PER_WORD; i++)
     {
@@ -163,13 +163,20 @@ compute_splitting_shift (bool speed_p, s
       else
         narrow_cost = shift_cost (speed_p, rtxes, code, word_mode, i);
 
+      if (code != ASHIFTRT)
+        upper_cost = word_move_zero_cost;
+      else if (i == BITS_PER_WORD - 1)
+        upper_cost = word_move_cost;
+      else
+        upper_cost = shift_cost (speed_p, rtxes, code, word_mode,
+                                 BITS_PER_WORD - 1);
+
       if (LOG_COSTS)
         fprintf (stderr, "%s %s by %d: original cost %d, split cost %d + %d\n",
                  GET_MODE_NAME (twice_word_mode), GET_RTX_NAME (code),
-                 i + BITS_PER_WORD, wide_cost, narrow_cost,
-                 word_move_zero_cost);
+                 i + BITS_PER_WORD, wide_cost, narrow_cost, upper_cost);
 
-      if (FORCE_LOWERING || wide_cost >= narrow_cost + word_move_zero_cost)
+      if (FORCE_LOWERING || wide_cost >= narrow_cost + upper_cost)
         splitting[i]
= true;
     }
 }
@@ -248,6 +255,9 @@ compute_costs (bool speed_p, struct cost
       compute_splitting_shift (speed_p, rtxes,
                                choices[speed_p].splitting_lshiftrt, LSHIFTRT,
                                word_move_zero_cost, word_move_cost);
+      compute_splitting_shift (speed_p, rtxes,
+                               choices[speed_p].splitting_ashiftrt, ASHIFTRT,
+                               word_move_zero_cost, word_move_cost);
     }
 }
 
@@ -1153,6 +1163,7 @@ find_decomposable_shift_zext (rtx insn,
   op = SET_SRC (set);
   if (GET_CODE (op) != ASHIFT
       && GET_CODE (op) != LSHIFTRT
+      && GET_CODE (op) != ASHIFTRT
       && GET_CODE (op) != ZERO_EXTEND)
     return false;
 
@@ -1173,6 +1184,8 @@ find_decomposable_shift_zext (rtx insn,
     {
       bool *splitting = (GET_CODE (op) == ASHIFT
                          ? choices[speed_p].splitting_ashift
+                         : GET_CODE (op) == ASHIFTRT
+                         ? choices[speed_p].splitting_ashiftrt
                          : choices[speed_p].splitting_lshiftrt);
       if (!CONST_INT_P (XEXP (op, 1))
           || !IN_RANGE (INTVAL (XEXP (op, 1)), BITS_PER_WORD,
@@ -1200,7 +1213,7 @@ resolve_shift_zext (rtx insn)
   rtx op;
   rtx op_operand;
   rtx insns;
-  rtx src_reg, dest_reg, dest_zero;
+  rtx src_reg, dest_reg, dest_upper;
   int src_reg_num, dest_reg_num, offset1, offset2, src_offset;
 
   set = single_set (insn);
@@ -1210,6 +1223,7 @@ resolve_shift_zext (rtx insn)
   op = SET_SRC (set);
   if (GET_CODE (op) != ASHIFT
       && GET_CODE (op) != LSHIFTRT
+      && GET_CODE (op) != ASHIFTRT
       && GET_CODE (op) != ZERO_EXTEND)
     return NULL_RTX;
 
@@ -1223,7 +1237,8 @@ resolve_shift_zext (rtx insn)
   /* src_reg_num is the number of the word mode register which we
      are operating on.  For a left shift and a zero_extend on little
      endian machines this is register 0.  */
-  src_reg_num = GET_CODE (op) == LSHIFTRT ? 1 : 0;
+  src_reg_num = (GET_CODE (op) == LSHIFTRT || GET_CODE (op) == ASHIFTRT)
+                ?
1 : 0;
 
   if (WORDS_BIG_ENDIAN
       && GET_MODE_SIZE (GET_MODE (op_operand)) > UNITS_PER_WORD)
@@ -1243,12 +1258,20 @@ resolve_shift_zext (rtx insn)
   dest_reg = simplify_gen_subreg_concatn (word_mode, SET_DEST (set),
                                           GET_MODE (SET_DEST (set)),
                                           offset1);
-  dest_zero = simplify_gen_subreg_concatn (word_mode, SET_DEST (set),
-                                           GET_MODE (SET_DEST (set)),
-                                           offset2);
+  dest_upper = simplify_gen_subreg_concatn (word_mode, SET_DEST (set),
+                                            GET_MODE (SET_DEST (set)),
+                                            offset2);
   src_reg = simplify_gen_subreg_concatn (word_mode, op_operand,
                                          GET_MODE (op_operand),
                                          src_offset);
+  if (GET_CODE (op) == ASHIFTRT
+      && INTVAL (XEXP (op, 1)) != 2 * BITS_PER_WORD - 1)
+    {
+      rtx tem = expand_shift (RSHIFT_EXPR, word_mode, copy_rtx (src_reg),
+                              BITS_PER_WORD - 1, dest_upper, 0);
+      if (dest_upper != tem)
+        emit_move_insn (dest_upper, tem);
+    }
   if (GET_CODE (op) != ZERO_EXTEND)
     {
       int shift_count = INTVAL (XEXP (op, 1));
@@ -1257,12 +1280,15 @@ resolve_shift_zext (rtx insn)
                               LSHIFT_EXPR : RSHIFT_EXPR,
                               word_mode, src_reg,
                               shift_count - BITS_PER_WORD,
-                              dest_reg, 1);
+                              dest_reg, GET_CODE (op) != ASHIFTRT);
     }
 
   if (dest_reg != src_reg)
     emit_move_insn (dest_reg, src_reg);
-  emit_move_insn (dest_zero, CONST0_RTX (word_mode));
+  if (GET_CODE (op) != ASHIFTRT)
+    emit_move_insn (dest_upper, CONST0_RTX (word_mode));
+  else if (INTVAL (XEXP (op, 1)) == 2 * BITS_PER_WORD - 1)
+    emit_move_insn (dest_upper, copy_rtx (src_reg));
   insns = get_insns ();
 
   end_sequence ();
@@ -1328,7 +1354,8 @@ dump_choices (bool speed_p, const char *
            GET_MODE_NAME (twice_word_mode));
 
   dump_shift_choices (ASHIFT, choices[speed_p].splitting_ashift);
-  dump_shift_choices (LSHIFTRT, choices[speed_p].splitting_ashift);
+  dump_shift_choices (LSHIFTRT, choices[speed_p].splitting_lshiftrt);
+  dump_shift_choices (ASHIFTRT, choices[speed_p].splitting_ashiftrt);
   fprintf (dump_file, "\n");
 }
 
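
For readers who want to check the arithmetic behind the lowering described at the top of this mail, the following is a minimal standalone C sketch, not GCC code. It assumes 32-bit words standing in for BITS_PER_WORD, and every name in it (WORD_BITS, sar32, sar64, struct words, split_ashiftrt, split_matches_reference) is hypothetical and invented for the illustration. It mirrors the two cases in resolve_shift_zext: for counts below 2*BITS_PER_WORD-1 the upper result word is the upper source word shifted by BITS_PER_WORD-1, and for a count of exactly 2*BITS_PER_WORD-1 one shift is done and its result is copied to the upper word.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for BITS_PER_WORD; 32-bit words are an assumption of this sketch.  */
#define WORD_BITS 32

/* Portable arithmetic right shift of one word, 0 <= n <= WORD_BITS - 1.
   Avoids relying on implementation-defined >> of negative values.  */
int32_t
sar32 (int32_t x, unsigned n)
{
  return x < 0 ? ~(~x >> n) : x >> n;
}

/* Reference: arithmetic right shift of the double-word value,
   0 <= n <= 2 * WORD_BITS - 1.  */
int64_t
sar64 (int64_t x, unsigned n)
{
  return x < 0 ? ~(~x >> n) : x >> n;
}

/* The two result words of the split shift (hypothetical type).  */
struct words
{
  uint32_t lo;
  uint32_t hi;
};

/* Split form of a double-word ASHIFTRT by COUNT, where
   WORD_BITS <= COUNT <= 2 * WORD_BITS - 1.  Only the upper source
   word contributes to the result.  */
struct words
split_ashiftrt (int32_t src_hi, unsigned count)
{
  struct words r;

  if (count == 2 * WORD_BITS - 1)
    {
      /* Both halves are all copies of the sign bit: shift once, then copy.  */
      r.lo = (uint32_t) sar32 (src_hi, WORD_BITS - 1);
      r.hi = r.lo;
    }
  else
    {
      /* Upper half is the sign fill; lower half is the remaining shift.  */
      r.hi = (uint32_t) sar32 (src_hi, WORD_BITS - 1);
      r.lo = (uint32_t) sar32 (src_hi, count - WORD_BITS);
    }
  return r;
}

/* Return 1 if the split form agrees with the double-word reference.  */
int
split_matches_reference (int64_t x, unsigned count)
{
  int32_t src_hi = (int32_t) sar64 (x, WORD_BITS);  /* upper source word */
  struct words got = split_ashiftrt (src_hi, count);
  uint64_t want = (uint64_t) sar64 (x, count);

  return got.lo == (uint32_t) want
         && got.hi == (uint32_t) (want >> WORD_BITS);
}
```

Calling split_matches_reference for every count from WORD_BITS to 2 * WORD_BITS - 1 on a few positive and negative inputs is a quick way to convince yourself that the lower source word never matters in this range, which is why the pass reads only one source word.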