From patchwork Fri Jul 5 15:15:51 2013
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 257188
Date: Fri, 5 Jul 2013 17:15:51 +0200
From: Jakub Jelinek
To: Uros Bizjak, Eric Botcazou
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Improve btc (PR target/57819)
Message-ID: <20130705151551.GX2336@tucnak.redhat.com>

Hi!

Kai has reported that his type demotion patches lead to a regression,
which can also be seen without his patches by doing the type demotion
by hand.  test1 is optimized using the *jcc_bt_mask instruction (the
combiner detects this), but test2 isn't.  In that case the combiner first
merges the and with the shift into a *3_mask insn, and *jcc_bt_mask won't
match, because we end up with
  (zero_extend:SI (subreg:QI (and:SI (const_int 63)) 0))
and we don't simplify that.

So, my first approach was to try to simplify that: because nonzero_bits
on the subreg operand says that no bits outside of QImode may be non-zero,
both the zero_extend and the subreg can be dropped.  That is the
simplify-rtx.c change.
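To illustrate the nonzero_bits argument, here is a minimal C sketch (not
part of the patch; all helper names are made up for illustration).  It
models (zero_extend:SI (subreg:QI X 0)) as a truncation to 8 bits followed
by widening, and checks that this is the identity whenever X has no bits
set outside the low 8 bits, as is the case for X = reg & 63.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models (zero_extend:SI (subreg:QI x 0)),
   i.e. take the low 8 bits of x and zero-extend them to 32 bits.  */
static uint32_t
zext_lowpart_qi (uint32_t x)
{
  return (uint32_t) (uint8_t) x;
}

/* Hypothetical helper: models (and:SI x (const_int 63)).  */
static uint32_t
and_63 (uint32_t x)
{
  return x & 63u;
}

/* Returns 1 if dropping the zero_extend/subreg pair is safe for x,
   i.e. the masked value survives the QImode round trip unchanged.  */
static int
fold_is_identity (uint32_t x)
{
  return zext_lowpart_qi (and_63 (x)) == and_63 (x);
}

/* Spot-check a range of inputs; the masked value never has bits
   outside QImode, so the round trip must always be the identity.  */
static void
check_fold (void)
{
  uint32_t x;

  for (x = 0; x < 100000u; x++)
    assert (fold_is_identity (x));
  assert (fold_is_identity (0xffffffffu));
}
```

Without the mask the round trip is not an identity in general (for
example, 0x1ff becomes 0xff), which is why the fold has to be guarded by
the nonzero_bits check.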
Then I've figured out that combine.c doesn't actually attempt to simplify
this anyway, so that is the combine.c change.  And lastly, an i386 pattern
was needed anyway.

I've also attempted to simplify
  (zero_extend:SI (subreg:QI (and:DI (const_int 63)) 0))
into
  (subreg:SI (and:DI (const_int 63)) 0)
(a very small change in simplify-rtx.c: just drop the requirement that the
zero_extend mode is as wide as or wider than SUBREG_REG's mode, and when it
is <= use gen_lowpart_no_emit instead of just returning the SUBREG_REG),
but that unfortunately regressed the test1 case; we'd need some further
i386.md tweaks.  While in theory this folding looks like a useful
simplification, because of this I'm wondering whether other backends rely
on those actually not being simplified.

So, as an alternative I've also implemented an i386.md-only fix.

Thus, do we want the first patch, or the first patch plus the further
simplify-rtx.c change described above plus some further i386.md tweaks,
or just the second patch instead?

Both have been bootstrapped/regtested on x86_64-linux and i686-linux.

        Jakub

2013-07-05  Jakub Jelinek

        PR target/57819
        * simplify-rtx.c (simplify_unary_operation_1): Simplify
        (zero_extend:SI (subreg:QI (and:SI (reg:SI) (const_int 63)) 0)).
        * combine.c (make_extraction): Create ZERO_EXTEND or SIGN_EXTEND
        using simplify_gen_unary instead of gen_rtx_*_EXTEND.
        * config/i386/i386.md (*jcc_bt_1): New define_insn_and_split.

        * gcc.target/i386/pr57819.c: New test.

2013-07-05  Jakub Jelinek

        PR target/57819
        * config/i386/i386.md (*jcc_bt_mask_1): New define_insn_and_split.

        * gcc.target/i386/pr57819.c: New test.

--- gcc/config/i386/i386.md.jj 2013-06-27 18:47:32.000000000 +0200
+++ gcc/config/i386/i386.md 2013-07-04 16:54:48.789218553 +0200
@@ -10510,6 +10510,45 @@ (define_insn_and_split "*jcc_bt_ma
   PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
 })
 
+;; Like *jcc_bt_mask, but for the case where AND has been previously
+;; combined with a shift.
+(define_insn_and_split "*jcc_bt_mask_1"
+  [(set (pc)
+        (if_then_else (match_operator 0 "bt_comparison_operator"
+                        [(zero_extract:SWI48
+                           (match_operand:SWI48 1 "register_operand" "r")
+                           (const_int 1)
+                           (zero_extend:SI
+                             (subreg:QI
+                               (and:SI
+                                 (match_operand:SI 2 "register_operand" "r")
+                                 (match_operand:SI 3 "const_int_operand" "n"))
+                               0)))])
+                      (label_ref (match_operand 4))
+                      (pc)))
+   (clobber (reg:CC FLAGS_REG))]
+  "(TARGET_USE_BT || optimize_function_for_size_p (cfun))
+   && (INTVAL (operands[3]) & (GET_MODE_BITSIZE (mode)-1))
+      == GET_MODE_BITSIZE (mode)-1"
+  "#"
+  "&& 1"
+  [(set (reg:CCC FLAGS_REG)
+        (compare:CCC
+          (zero_extract:SWI48
+            (match_dup 1)
+            (const_int 1)
+            (match_dup 2))
+          (const_int 0)))
+   (set (pc)
+        (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
+                      (label_ref (match_dup 4))
+                      (pc)))]
+{
+  operands[2] = simplify_gen_subreg (mode, operands[2], SImode, 0);
+
+  PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
+})
+
 (define_insn_and_split "*jcc_btsi_1"
   [(set (pc)
         (if_then_else (match_operator 0 "bt_comparison_operator"
--- gcc/testsuite/gcc.target/i386/pr57819.c.jj 2013-07-04 16:27:46.900877301 +0200
+++ gcc/testsuite/gcc.target/i386/pr57819.c 2013-07-04 16:27:30.000000000 +0200
@@ -0,0 +1,38 @@
+/* PR target/57819 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=core2" } */
+
+void foo (void);
+
+__extension__ typedef __INTPTR_TYPE__ intptr_t;
+
+int
+test1 (intptr_t x, intptr_t n)
+{
+  n &= sizeof (intptr_t) * __CHAR_BIT__ - 1;
+
+  if (x & ((intptr_t) 1 << n))
+    foo ();
+
+  return 0;
+}
+
+int
+test2 (intptr_t x, intptr_t n)
+{
+  if (x & ((intptr_t) 1 << ((int) n & (sizeof (intptr_t) * __CHAR_BIT__ - 1))))
+    foo ();
+
+  return 0;
+}
+
+int
+test3 (intptr_t x, intptr_t n)
+{
+  if (x & ((intptr_t) 1 << ((int) n & ((int) sizeof (intptr_t) * __CHAR_BIT__ - 1))))
+    foo ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "and\[lq\]\[ \t\]" } } */
--- gcc/simplify-rtx.c.jj 2013-06-01 14:47:23.000000000 +0200
+++ gcc/simplify-rtx.c 2013-07-04 16:24:48.654817120 +0200
@@ -1470,6 +1470,29 @@ simplify_unary_operation_1 (enum rtx_cod
             }
         }
 
+      /* (zero_extend:M (subreg:N <X:O>)) is <X:O> (for M == O) or
+         (zero_extend:M <X:O>), if X doesn't have any bits outside of N mode
+         non-zero.  E.g.
+         (zero_extend:SI (subreg:QI (and:SI (reg:SI) (const_int 63)) 0)) is
+         (and:SI (reg:SI) (const_int 63)).  */
+      if (GET_CODE (op) == SUBREG
+          && GET_MODE_PRECISION (GET_MODE (op))
+             < GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op)))
+          && GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op)))
+             <= HOST_BITS_PER_WIDE_INT
+          && GET_MODE_PRECISION (mode)
+             >= GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op)))
+          && subreg_lowpart_p (op)
+          && (nonzero_bits (SUBREG_REG (op), GET_MODE (SUBREG_REG (op)))
+              & ~GET_MODE_MASK (GET_MODE (op))) == 0)
+        {
+          if (GET_MODE_PRECISION (mode)
+              == GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op))))
+            return SUBREG_REG (op);
+          return simplify_gen_unary (ZERO_EXTEND, mode, SUBREG_REG (op),
+                                     GET_MODE (SUBREG_REG (op)));
+        }
+
 #if defined(POINTERS_EXTEND_UNSIGNED) && !defined(HAVE_ptr_extend)
       /* As we do not know which address space the pointer is referring to,
          we can do this only if the target does not support different pointer
--- gcc/combine.c.jj 2013-05-04 14:40:40.000000000 +0200
+++ gcc/combine.c 2013-07-04 15:44:59.409575170 +0200
@@ -7326,7 +7326,8 @@ make_extraction (enum machine_mode mode,
   if (pos_rtx != 0
       && GET_MODE_SIZE (pos_mode) > GET_MODE_SIZE (GET_MODE (pos_rtx)))
     {
-      rtx temp = gen_rtx_ZERO_EXTEND (pos_mode, pos_rtx);
+      rtx temp = simplify_gen_unary (ZERO_EXTEND, pos_mode, pos_rtx,
+                                     GET_MODE (pos_rtx));
 
       /* If we know that no extraneous bits are set, and that the high
          bit is not set, convert extraction to cheaper one - either
@@ -7340,7 +7341,8 @@ make_extraction (enum machine_mode mode,
                        >> 1))
                   == 0)))
         {
-          rtx temp1 = gen_rtx_SIGN_EXTEND (pos_mode, pos_rtx);
+          rtx temp1 = simplify_gen_unary (SIGN_EXTEND, pos_mode, pos_rtx,
+                                          GET_MODE (pos_rtx));
 
           /* Prefer ZERO_EXTENSION, since it gives more information to
              backends.  */
--- gcc/config/i386/i386.md.jj 2013-06-27 18:47:32.000000000 +0200
+++ gcc/config/i386/i386.md 2013-07-04 15:58:24.429243358 +0200
@@ -10474,6 +10474,39 @@ (define_insn_and_split "*jcc_bt"
   PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
 })
 
+;; Like *jcc_bt, but expect a SImode operand 2 instead of QImode
+;; zero extended to SImode.
+(define_insn_and_split "*jcc_bt_1"
+  [(set (pc)
+        (if_then_else (match_operator 0 "bt_comparison_operator"
+                        [(zero_extract:SWI48
+                           (match_operand:SWI48 1 "register_operand" "r")
+                           (const_int 1)
+                           (match_operand:SI 2 "register_operand" "r"))
+                         (const_int 0)])
+                      (label_ref (match_operand 3))
+                      (pc)))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_USE_BT || optimize_function_for_size_p (cfun)"
+  "#"
+  "&& 1"
+  [(set (reg:CCC FLAGS_REG)
+        (compare:CCC
+          (zero_extract:SWI48
+            (match_dup 1)
+            (const_int 1)
+            (match_dup 2))
+          (const_int 0)))
+   (set (pc)
+        (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
+                      (label_ref (match_dup 3))
+                      (pc)))]
+{
+  operands[2] = simplify_gen_subreg (mode, operands[2], SImode, 0);
+
+  PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
+})
+
 ;; Avoid useless masking of bit offset operand.  "and" in SImode is correct
 ;; also for DImode, this is what combine produces.
 (define_insn_and_split "*jcc_bt_mask"
--- gcc/testsuite/gcc.target/i386/pr57819.c.jj 2013-07-04 16:27:46.900877301 +0200
+++ gcc/testsuite/gcc.target/i386/pr57819.c 2013-07-04 16:27:30.000000000 +0200
@@ -0,0 +1,38 @@
+/* PR target/57819 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=core2" } */
+
+void foo (void);
+
+__extension__ typedef __INTPTR_TYPE__ intptr_t;
+
+int
+test1 (intptr_t x, intptr_t n)
+{
+  n &= sizeof (intptr_t) * __CHAR_BIT__ - 1;
+
+  if (x & ((intptr_t) 1 << n))
+    foo ();
+
+  return 0;
+}
+
+int
+test2 (intptr_t x, intptr_t n)
+{
+  if (x & ((intptr_t) 1 << ((int) n & (sizeof (intptr_t) * __CHAR_BIT__ - 1))))
+    foo ();
+
+  return 0;
+}
+
+int
+test3 (intptr_t x, intptr_t n)
+{
+  if (x & ((intptr_t) 1 << ((int) n & ((int) sizeof (intptr_t) * __CHAR_BIT__ - 1))))
+    foo ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "and\[lq\]\[ \t\]" } } */
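
As background for the scan-assembler-not check in the test: with a
register destination operand, the x86 bt instruction takes the bit offset
modulo the operand size, so an explicit and of the offset with 31 or 63
before the bt is redundant.  A minimal C model of the 64-bit case follows
(not part of the patch; the helper names are hypothetical and unsigned
types are used in place of intptr_t to keep the shifts well defined).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of bt with 64-bit register operands: the bit
   offset n is reduced modulo 64 before bit n of x is tested.  */
static int
bt64_model (uint64_t x, uint64_t n)
{
  return (int) ((x >> (n % 64)) & 1);
}

/* Unsigned analogue of what the tests compute in source form: the
   offset is masked explicitly before the shift.  */
static int
masked_bit_test (uint64_t x, uint64_t n)
{
  return (x & ((uint64_t) 1 << (n & 63))) != 0;
}

/* Spot-check that both forms agree, including offsets >= 64.  */
static void
check_agreement (uint64_t x)
{
  uint64_t n;

  for (n = 0; n < 256; n++)
    assert (bt64_model (x, n) == masked_bit_test (x, n));
}
```

Since the two functions agree for every offset, emitting only the bt and
dropping the and preserves the source semantics.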