From patchwork Tue Jan 26 21:26:16 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 573475 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 378F7140BAB for ; Wed, 27 Jan 2016 08:26:34 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=toVyP9Rc; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-transfer-encoding:in-reply-to; q=dns; s= default; b=r+ROjrlgytWyEGPfI73yXze3OiQj+4hxPcPoLPNO1ERp1E8rK2G67 McDi9likZLg1zgr409N2IPdeD/9kTwdFzvPAgvtVrZDYa8SgELUVLhXGPB//vhR/ FooeLoFIeh5b23esY9g2rooden84vPD58YvLpJGgsmwRBAxXod3bFI= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-transfer-encoding:in-reply-to; s=default; bh=PuKec8nI/n7dv1SkajeZckd2K1w=; b=toVyP9Rc4XsH03L/z3qh9SSyVAwm 5aMsus/zoRiitAW8KhrAlGiV1AS+yeJcBP1SoAzSUxaAUYw/42GxeezrBqgiEhrm MLINmObpBGwKmV2NjWXnEmw1MV3OP/VutXFHjXGBfn+0gZ2/HIUd+vxRC718ISLZ BpXOhlUVAqLqnlc= Received: (qmail 55414 invoked by alias); 26 Jan 2016 21:26:27 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 55399 invoked by uid 89); 26 Jan 2016 21:26:26 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=mikestumpcomcastnet, mikestump@comcast.net, truth, HContent-Transfer-Encoding:8bit X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Tue, 26 Jan 2016 21:26:25 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (Postfix) with ESMTPS id 464D7C049D4F; Tue, 26 Jan 2016 21:26:24 +0000 (UTC) Received: from tucnak.zalov.cz ([10.3.113.11]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u0QLQLuV026837 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 26 Jan 2016 16:26:22 -0500 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id u0QLQIK2004245; Tue, 26 Jan 2016 22:26:19 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id u0QLQGAF004067; Tue, 26 Jan 2016 22:26:16 +0100 Date: Tue, 26 Jan 2016 22:26:16 +0100 From: Jakub Jelinek To: Richard Biener Cc: Mike Stump , Richard Sandiford , "H.J. Lu" , GCC Patches Subject: Re: [PATCH] Fix up wi::lrshift (PR c++/69399) Message-ID: <20160126212616.GK3017@tucnak.redhat.com> Reply-To: Jakub Jelinek References: <20160122185533.GA3430@intel.com> <20160126182148.GE3017@tucnak.redhat.com> <4ADD4937-E19F-4CD0-8441-C97935548759@comcast.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-IsSubscribed: yes On Tue, Jan 26, 2016 at 09:45:19PM +0100, Richard Biener wrote: > On January 26, 2016 8:00:52 PM GMT+01:00, Mike Stump wrote: > >On Jan 26, 2016, at 10:21 AM, Jakub Jelinek wrote > >> The question is, shall we do what wi::lshift does and have the fast > >path > >> only for the constant shifts, as the untested patch below does, or > >shall we > >> check shift dynamically (thus use > >> shift < HOST_BITS_PER_WIDE_INT > >> instead of > >> STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT) > >> in the patch), > > > >Hum… I think I prefer the dynamic check. The reasoning is that when > >we fast path, we can tolerate the conditional branch to retain the fast > >path, as most of the time, that condition will usually be true. If the > >compiler had troubles with knowing the usual truth value of the > >expression, seems like we can hint that it will be true and influence > >the static prediction of the branch. This permits us to fast path > >almost all the time in the non-constant, but small enough case. For > >known shifts, there is no code gen difference, so it doesn’t matter. > > The original reasoning was to inline only the fast path if it is known at compile time and otherwise have a call. Exactly to avoid bloating callers with inlined conditionals. I'm now also bootstrapping/regtesting following patch, the previous one passed bootstrap/regtest, and will do cc1plus size comparison afterwards. That said, I have done a quick check, where I believe that unless xi and shift are both compile time constants, for the STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT) case there should be some comparison and the lrshift_large call with the non-STATIC_CONSTANT_P variant, but in the bootstrapped (non-LTO) cc1plus I only see 14 calls to lrshift_large, thus I bet it will likely affect only <= 14 places right now: 0000000000776990 <_ZL11build_new_1PP3vecIP9tree_node5va_gc8vl_embedES1_S1_S6_bi>: 777ca4: e8 17 28 91 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> --- 00000000008b8bc0 <_ZL31adjust_offset_for_component_refP9tree_nodePbPl.part.91>: 8b8cc2: e8 f9 17 7d 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000a7b550 <_ZN2wi7rrotateIPK9tree_node16generic_wide_intI16wide_int_storageEEENS_12unary_traitsIT_E11result_typeERKS8_RKT0_j>: a7baea: e8 d1 e9 60 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> 0000000000a7bd90 <_ZN2wi7lrotateIPK9tree_node16generic_wide_intI16wide_int_storageEEENS_12unary_traitsIT_E11result_typeERKS8_RKT0_j>: a7c457: e8 64 e0 60 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000a7c710 <_ZN2wi7lrshiftIP9tree_nodelEENS_12unary_traitsIT_E11result_typeERKS4_RKT0_>: a7c851: e8 6a dc 60 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000a7d280 <_ZN2wi7lrshiftIPK9tree_node16generic_wide_intI16wide_int_storageEEENS_12unary_traitsIT_E11result_typeERKS8_RKT0_>: a7d3b9: e8 02 d1 60 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000cc1370 <_Z15real_to_integerPK10real_valuePbi>: cc1752: e8 69 8d 3c 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000cc27f0 <_Z17real_from_integerP10real_value13format_helperRK16generic_wide_intI20wide_int_ref_storageILb0EEE6signop>: cc2f05: e8 b6 75 3c 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000d6c420 <_Z31simplify_const_binary_operation8rtx_code12machine_modeP7rtx_defS2_>: d6f5a5: e8 16 af 31 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000d7ca40 <_ZN2wi7lrotateISt4pairIP7rtx_def12machine_modeES5_EENS_12unary_traitsIT_E11result_typeERKS7_RKT0_j>: d7d17a: e8 41 d3 30 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> 0000000000d7d310 <_ZN2wi7rrotateISt4pairIP7rtx_def12machine_modeES5_EENS_12unary_traitsIT_E11result_typeERKS7_RKT0_j>: d7d99d: e8 1e cb 30 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000ea3c40 <_ZN2wi7lrshiftI16generic_wide_intI22fixed_wide_int_storageILi192EEES4_EENS_12unary_traitsIT_E11result_typeERKS6_RKT0_>: ea3ca7: e8 14 68 1e 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 0000000000f435a0 <_ZL27copy_reference_ops_from_refP9tree_nodeP3vecI22vn_reference_op_struct7va_heap6vl_ptrE>: f444b2: e8 09 60 14 00 callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> -- 000000000161c840 <_ZL21restructure_referencePP9tree_nodeS1_P16generic_wide_intI22fixed_wide_int_storageILi192EEES1_.constprop.133>: 161cb51: e8 6a d9 a6 ff callq 108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj> 2016-01-26 Jakub Jelinek PR tree-optimization/69399 * wide-int.h (wi::lrshift): For larger precisions, only use fast path if shift is known to be < HOST_BITS_PER_WIDE_INT. * gcc.dg/torture/pr69399.c: New test. Jakub --- gcc/wide-int.h.jj 2016-01-04 14:55:50.000000000 +0100 +++ gcc/wide-int.h 2016-01-26 19:05:20.715269366 +0100 @@ -2909,7 +2909,9 @@ wi::lrshift (const T1 &x, const T2 &y) For variable-precision integers like wide_int, handle HWI and sub-HWI integers inline. */ if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT) - ? xi.len == 1 && xi.val[0] >= 0 + ? (STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT) + && xi.len == 1 + && xi.val[0] >= 0) : xi.precision <= HOST_BITS_PER_WIDE_INT) { val[0] = xi.to_uhwi () >> shift; --- gcc/testsuite/gcc.dg/torture/pr69399.c.jj 2016-01-26 21:39:45.732927748 +0100 +++ gcc/testsuite/gcc.dg/torture/pr69399.c 2016-01-26 21:41:09.081753051 +0100 @@ -0,0 +1,18 @@ +/* { dg-do run { target int128 } } */ + +static unsigned __attribute__((noinline, noclone)) +foo (unsigned long long u) +{ + unsigned __int128 v = u | 0xffffff81U; + v >>= 64; + return v; +} + +int +main () +{ + unsigned x = foo (27); + if (x != 0) + __builtin_abort (); + return 0; +}