From patchwork Mon Jun 20 19:38:22 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bernd Schmidt
X-Patchwork-Id: 101189
Message-ID: <4DFFA1AE.7070405@codesourcery.com>
Date: Mon, 20 Jun 2011 21:38:22 +0200
From: Bernd Schmidt
To: Richard Henderson
CC: GCC Patches
Subject: Re: Add __builtin_clrsb, similar to clz/ctz
References: <4DF9FA9A.8040505@codesourcery.com> <4DFA2E85.2030601@redhat.com>
In-Reply-To: <4DFA2E85.2030601@redhat.com>

On 06/16/2011 06:25 PM, Richard Henderson wrote:
> On 06/16/2011 05:44 AM, Bernd Schmidt wrote:
>> +@deftypefn {Built-in Function} int __builtin_clrsb (unsigned int x)
>> +Returns the number of leading redundant sign bits in @var{x}, starting
>> +at the most significant bit position.
>> +@end deftypefn
>
> Do we want a signed argument, since we're talking about signs?

Err, yes. It's signed everywhere else (builtins.def etc.).

> It would seem that unlike clz, this function is not undefined for zero.
> What about INT_MIN?  Do all cpus handle those edge cases the same way?

-1 and zero should both produce the same value, 31 (for a 32-bit
integer). I don't see why INT_MIN should be special - the return value
is zero. This is true for C6X and Blackfin; ARM documentation suggests
it's also true for their VCLS instruction. I've not found proper
picochip documentation, but some other documents suggest it's also
implemented this way.

> Do you get smaller code in general from
>
> 	if (x < 0)
> 	  x = ~x;
> 	if (x == 0)
> 	  return W_TYPE_SIZE - 1;
> 	count_leading_zeros(ret, x);
> 	return ret - 1;

Probably.
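
For reference, the edge cases discussed above can be checked against a
bit-by-bit model. This is only a minimal sketch: `ref_clrsb` and
`check_edge_cases` are hypothetical names that are not part of the patch,
and the loop is written for clarity rather than speed. It assumes a
two's complement `int` with no padding bits.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical reference model (not from the patch): count how many
   bits directly below the sign bit are equal to the sign bit.  */
int
ref_clrsb (int x)
{
  unsigned int ux = (unsigned int) x;
  int width = (int) (CHAR_BIT * sizeof (int));
  unsigned int sign = ux >> (width - 1);
  int n = 0;

  /* Walk down from the bit just below the sign bit and stop at the
     first bit that differs from the sign.  */
  for (int i = width - 2; i >= 0; i--)
    {
      if (((ux >> i) & 1u) != sign)
        break;
      n++;
    }
  return n;
}

/* Edge cases from the discussion: 0 and -1 give width - 1, and
   INT_MIN gives 0 because the bit below the sign bit already differs.  */
void
check_edge_cases (void)
{
  int width = (int) (CHAR_BIT * sizeof (int));

  assert (ref_clrsb (0) == width - 1);
  assert (ref_clrsb (-1) == width - 1);
  assert (ref_clrsb (INT_MIN) == 0);
  assert (ref_clrsb (1) == width - 2);
}
```

With a 32-bit `int` the first two checks correspond to the value 31
mentioned above. No input needs special handling, in contrast to clz,
which is undefined for zero.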

>> -(define_insn "signbitssi2"
>> +(define_insn "clrsbsi2"
>>    [(set (match_operand:HI 0 "register_operand" "=d")
>>  	(if_then_else:HI
>>  	 (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
>
> No use of the new rtx code?

D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
showed a problem with some of the existing simplify_const_unop cases:
for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
operand, rather than the mode of the operation. This limits what we can
do in that function, since op_mode is sometimes VOIDmode - we really
should add builtin folders for these at some point.

New patch below. Retested on i686 and bfin.


Bernd

libgcc/
	* Makefile.in (lib2funcs): Add _clrsbsi2 and _clrsbdi2.
	* libgcc-std.ver.in (GCC_4.7.0): New section.

gcc/
	* doc/extend.texi (__builtin_clrsb, __builtin_clrsbl,
	__builtin_clrsbll): Document.
	* doc/rtl.texi (clrsb): New entry.
	* optabs.c (widen_leading): Renamed from widen_clz.  New argument
	UNOPTAB.  All callers changed.  Use UNOPTAB instead of clz_optab.
	(expand_unop): Handle clrsb_optab.
	(init_optabs): Initialize it.
	* optabs.h (enum optab_index): New entry OTI_clrsb.
	(clrsb_optab): Define.
	* genopinit.c (optabs): Add an entry for it.
	* builtins.c (expand_builtin): Handle clrsb builtin functions.
	* builtins.def (BUILT_IN_CLRSB, BUILT_IN_CLRSBIMAX, BUILT_IN_CLRSBL,
	BUILT_IN_CLRSBLL): New.
	* rtl.def (CLRSB): New code.
	* dwarf2out.c (mem_loc_descriptor): Handle it.
	* simplify-rtx.c (simplify_const_unary_operation): Likewise.  Use
	op_mode rather than mode when optimizing ffs, clz, ctz, parity
	and popcount.
	* libgcc2.c (__clrsbSI2, __clrsbDI2): New functions.
	* libgcc2.h (__clrsbSI2, __clrsbDI2): Define and declare.
	(__ctzDI2): Move declaration.
	* config/bfin/bfin.md (clrsbsi2): New expander.
	(signbitssi2): Use the CLRSB rtx.
	(clrsbhi2): Renamed from signbitshi2.  Use the CLRSB rtx.
	* config/bfin/bfin.c (bdesc_1arg): Changed accordingly.

gcc/testsuite/
	* gcc.c-torture/execute/builtin-bitops-1.c (MAKE_FUNS): Make
	my_clrsb test functions.
	(main): Test clrsb.
	* gcc.dg/builtin-protos-1.c (test_s, test_u, test_sl, test_ul,
	test_sll, test_ull): Add clrsb tests.
	* gcc.dg/torture/builtin-attr-1.c: Add tests for clrsb, clrsbl,
	clrsbll.

Index: libgcc/Makefile.in
===================================================================
--- libgcc/Makefile.in	(revision 174339)
+++ libgcc/Makefile.in	(working copy)
@@ -320,7 +320,7 @@ lib2funcs = _muldi3 _negdi2 _lshrdi3 _as
 	    _ctzsi2 _ctzdi2 _popcount_tab _popcountsi2 _popcountdi2	   \
 	    _paritysi2 _paritydi2 _powisf2 _powidf2 _powixf2 _powitf2	   \
 	    _mulsc3 _muldc3 _mulxc3 _multc3 _divsc3 _divdc3 _divxc3	   \
-	    _divtc3 _bswapsi2 _bswapdi2
+	    _divtc3 _bswapsi2 _bswapdi2 _clrsbsi2 _clrsbdi2
 
 # The floating-point conversion routines that involve a single-word integer.
 # XX stands for the integer mode.
Index: libgcc/libgcc-std.ver.in
===================================================================
--- libgcc/libgcc-std.ver.in	(revision 174339)
+++ libgcc/libgcc-std.ver.in	(working copy)
@@ -1920,3 +1920,10 @@ GCC_4.6.0 {
   __morestack_initial_sp
   __splitstack_find
 }
+
+%inherit GCC_4.7.0 GCC_4.6.0
+GCC_4.7.0 {
+  __PFX__clrsbsi2
+  __PFX__clrsbdi2
+  __PFX__clrsbti2
+}
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 174339)
+++ gcc/doc/extend.texi	(working copy)
@@ -7828,6 +7828,12 @@ Returns the number of trailing 0-bits in
 significant bit position.  If @var{x} is 0, the result is undefined.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsb (int x)
+Returns the number of leading redundant sign bits in @var{x}, i.e. the
+number of bits following the most significant bit which are identical
+to it.  There are no special cases for 0 or other values.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcount (unsigned int x)
 Returns the number of 1-bits in @var{x}.
 @end deftypefn
@@ -7852,6 +7858,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbl (long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountl (unsigned long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long}.
@@ -7877,6 +7888,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbll (long long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{long long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountll (unsigned long long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long long}.
Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi	(revision 174339)
+++ gcc/doc/rtl.texi	(working copy)
@@ -2400,6 +2400,14 @@ zero if @var{x} is zero.)  The mode of @
 depending on the target machine, various mode combinations may be
 valid.
 
+@findex clrsb
+@item (clrsb:@var{m} @var{x})
+Represents the number of redundant leading sign bits in @var{x},
+represented as an integer of mode @var{m}, starting at the most
+significant bit position.  This is one less than the number of leading
+sign bits (either 0 or 1), with no special cases.  The mode of @var{x}
+will usually be an integer mode and may differ from @var{m}.
+
 @findex clz
 @item (clz:@var{m} @var{x})
 Represents the number of leading 0-bits in @var{x}, represented as an
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(revision 174339)
+++ gcc/optabs.c	(working copy)
@@ -2317,9 +2317,12 @@ expand_simple_unop (enum machine_mode mo
 /* Try calculating
 	(clz:narrow x)
    as
-	(clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).
*/ + (clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)). + + A similar operation can be used for clrsb. UNOPTAB says which operation + we are trying to expand. */ static rtx -widen_clz (enum machine_mode mode, rtx op0, rtx target) +widen_leading (enum machine_mode mode, rtx op0, rtx target, optab unoptab) { enum mode_class mclass = GET_MODE_CLASS (mode); if (CLASS_HAS_WIDER_MODES_P (mclass)) @@ -2329,7 +2332,7 @@ widen_clz (enum machine_mode mode, rtx o wider_mode != VOIDmode; wider_mode = GET_MODE_WIDER_MODE (wider_mode)) { - if (optab_handler (clz_optab, wider_mode) != CODE_FOR_nothing) + if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing) { rtx xop0, temp, last; @@ -2338,7 +2341,7 @@ widen_clz (enum machine_mode mode, rtx o if (target == 0) target = gen_reg_rtx (mode); xop0 = widen_operand (op0, wider_mode, mode, true, false); - temp = expand_unop (wider_mode, clz_optab, xop0, NULL_RTX, true); + temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX, true); if (temp != 0) temp = expand_binop (wider_mode, sub_optab, temp, GEN_INT (GET_MODE_BITSIZE (wider_mode) @@ -2832,7 +2835,7 @@ expand_unop (enum machine_mode mode, opt /* Widening (or narrowing) clz needs special treatment. */ if (unoptab == clz_optab) { - temp = widen_clz (mode, op0, target); + temp = widen_leading (mode, op0, target, unoptab); if (temp) return temp; @@ -2844,7 +2847,15 @@ expand_unop (enum machine_mode mode, opt return temp; } - goto try_libcall; + goto try_libcall; + } + + if (unoptab == clrsb_optab) + { + temp = widen_leading (mode, op0, target, unoptab); + if (temp) + return temp; + goto try_libcall; } /* Widening (or narrowing) bswap needs special treatment. */ @@ -2999,7 +3010,8 @@ expand_unop (enum machine_mode mode, opt /* All of these functions return small values. Thus we choose to have them return something that isn't a double-word. 
*/ if (unoptab == ffs_optab || unoptab == clz_optab || unoptab == ctz_optab - || unoptab == popcount_optab || unoptab == parity_optab) + || unoptab == clrsb_optab || unoptab == popcount_optab + || unoptab == parity_optab) outmode = GET_MODE (hard_libcall_value (TYPE_MODE (integer_type_node), optab_libfunc (unoptab, mode))); @@ -5943,6 +5955,7 @@ init_optabs (void) init_optab (ffs_optab, FFS); init_optab (clz_optab, CLZ); init_optab (ctz_optab, CTZ); + init_optab (clrsb_optab, CLRSB); init_optab (popcount_optab, POPCOUNT); init_optab (parity_optab, PARITY); init_optab (sqrt_optab, SQRT); @@ -6173,6 +6186,9 @@ init_optabs (void) ctz_optab->libcall_basename = "ctz"; ctz_optab->libcall_suffix = '2'; ctz_optab->libcall_gen = gen_int_libfunc; + clrsb_optab->libcall_basename = "clrsb"; + clrsb_optab->libcall_suffix = '2'; + clrsb_optab->libcall_gen = gen_int_libfunc; popcount_optab->libcall_basename = "popcount"; popcount_optab->libcall_suffix = '2'; popcount_optab->libcall_gen = gen_int_libfunc; Index: gcc/optabs.h =================================================================== --- gcc/optabs.h (revision 174339) +++ gcc/optabs.h (working copy) @@ -220,6 +220,7 @@ enum optab_index OTI_ffs, OTI_clz, OTI_ctz, + OTI_clrsb, OTI_popcount, OTI_parity, /* Square root */ @@ -456,6 +457,7 @@ enum optab_index #define ffs_optab (&optab_table[OTI_ffs]) #define clz_optab (&optab_table[OTI_clz]) #define ctz_optab (&optab_table[OTI_ctz]) +#define clrsb_optab (&optab_table[OTI_clrsb]) #define popcount_optab (&optab_table[OTI_popcount]) #define parity_optab (&optab_table[OTI_parity]) #define sqrt_optab (&optab_table[OTI_sqrt]) Index: gcc/genopinit.c =================================================================== --- gcc/genopinit.c (revision 174339) +++ gcc/genopinit.c (working copy) @@ -199,6 +199,7 @@ static const char * const optabs[] = "set_optab_handler (ffs_optab, $A, CODE_FOR_$(ffs$a2$))", "set_optab_handler (clz_optab, $A, CODE_FOR_$(clz$a2$))", "set_optab_handler 
(ctz_optab, $A, CODE_FOR_$(ctz$a2$))", + "set_optab_handler (clrsb_optab, $A, CODE_FOR_$(clrsb$a2$))", "set_optab_handler (popcount_optab, $A, CODE_FOR_$(popcount$a2$))", "set_optab_handler (parity_optab, $A, CODE_FOR_$(parity$a2$))", "set_optab_handler (mov_optab, $A, CODE_FOR_$(mov$a$))", Index: gcc/builtins.c =================================================================== --- gcc/builtins.c (revision 174339) +++ gcc/builtins.c (working copy) @@ -6068,6 +6068,14 @@ expand_builtin (tree exp, rtx target, rt return target; break; + CASE_INT_FN (BUILT_IN_CLRSB): + case BUILT_IN_CLRSBIMAX: + target = expand_builtin_unop (target_mode, exp, target, + subtarget, clrsb_optab); + if (target) + return target; + break; + CASE_INT_FN (BUILT_IN_POPCOUNT): case BUILT_IN_POPCOUNTIMAX: target = expand_builtin_unop (target_mode, exp, target, Index: gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c (revision 174339) +++ gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c (working copy) @@ -62,6 +62,16 @@ int my_clz##suffix(type x) { \ return i; \ } \ \ +int my_clrsb##suffix(type x) { \ + int i; \ + int leading = (x >> CHAR_BIT * sizeof (type) - 1) & 1; \ + for (i = 1; i < CHAR_BIT * sizeof (type); i++) \ + if (((x >> ((CHAR_BIT * sizeof (type)) - i - 1)) & 1) \ + != leading) \ + break; \ + return i - 1; \ +} \ + \ int my_popcount##suffix(type x) { \ int i; \ int count = 0; \ @@ -176,6 +186,8 @@ main (void) if (ints[i] != 0 && __builtin_ctz (ints[i]) != my_ctz (ints[i])) abort (); + if (__builtin_clrsb (ints[i]) != my_clrsb (ints[i])) + abort (); if (__builtin_popcount (ints[i]) != my_popcount (ints[i])) abort (); if (__builtin_parity (ints[i]) != my_parity (ints[i])) @@ -192,6 +204,8 @@ main (void) if (longs[i] != 0 && __builtin_ctzl (longs[i]) != my_ctzl (longs[i])) abort (); + if (__builtin_clrsbl (longs[i]) != my_clrsbl (longs[i])) + abort 
(); if (__builtin_popcountl (longs[i]) != my_popcountl (longs[i])) abort (); if (__builtin_parityl (longs[i]) != my_parityl (longs[i])) @@ -208,6 +222,8 @@ main (void) if (longlongs[i] != 0 && __builtin_ctzll (longlongs[i]) != my_ctzll (longlongs[i])) abort (); + if (__builtin_clrsbll (longlongs[i]) != my_clrsbll (longlongs[i])) + abort (); if (__builtin_popcountll (longlongs[i]) != my_popcountll (longlongs[i])) abort (); if (__builtin_parityll (longlongs[i]) != my_parityll (longlongs[i])) @@ -223,6 +239,8 @@ main (void) abort (); \ if (x != 0 && __builtin_ctz##suffix (x) != my_ctz##suffix (x)) \ abort (); \ + if (__builtin_clrsb##suffix (x) != my_clrsb##suffix (x)) \ + abort (); \ if (__builtin_popcount##suffix (x) != my_popcount##suffix (x)) \ abort (); \ if (__builtin_parity##suffix (x) != my_parity##suffix (x)) \ Index: gcc/testsuite/gcc.dg/builtin-protos-1.c =================================================================== --- gcc/testsuite/gcc.dg/builtin-protos-1.c (revision 174339) +++ gcc/testsuite/gcc.dg/builtin-protos-1.c (working copy) @@ -7,6 +7,7 @@ test_s (signed int x) return __builtin_abs (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_clz (x) /* { dg-warning "as unsigned due to prototype" } */ + __builtin_ctz (x) /* { dg-warning "as unsigned due to prototype" } */ + + __builtin_clrsb (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_ffs (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_parity (x) /* { dg-warning "as unsigned due to prototype" } */ + __builtin_popcount (x); /* { dg-warning "as unsigned due to prototype" } */ @@ -18,6 +19,7 @@ test_u (unsigned int x) return __builtin_abs (x) /* { dg-warning "as signed due to prototype" } */ + __builtin_clz (x) /* { dg-bogus "as signed due to prototype" } */ + __builtin_ctz (x) /* { dg-bogus "as signed due to prototype" } */ + + __builtin_clrsb (x) /* { dg-warning "as signed due to prototype" } */ + __builtin_ffs (x) /* { dg-warning "as signed due to 
prototype" } */ + __builtin_parity (x) /* { dg-bogus "as signed due to prototype" } */ + __builtin_popcount (x); /* { dg-bogus "as signed due to prototype" } */ @@ -29,6 +31,7 @@ test_sl (signed long x) return __builtin_labs (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_clzl (x) /* { dg-warning "as unsigned due to prototype" } */ + __builtin_ctzl (x) /* { dg-warning "as unsigned due to prototype" } */ + + __builtin_clrsbl (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_ffsl (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_parityl (x) /* { dg-warning "as unsigned due to prototype" } */ + __builtin_popcountl (x); /* { dg-warning "as unsigned due to prototype" } */ @@ -40,6 +43,7 @@ test_ul (unsigned long x) return __builtin_labs (x) /* { dg-warning "as signed due to prototype" } */ + __builtin_clzl (x) /* { dg-bogus "as signed due to prototype" } */ + __builtin_ctzl (x) /* { dg-bogus "as signed due to prototype" } */ + + __builtin_clrsbl (x) /* { dg-warning "as signed due to prototype" } */ + __builtin_ffsl (x) /* { dg-warning "as signed due to prototype" } */ + __builtin_parityl (x) /* { dg-bogus "as signed due to prototype" } */ + __builtin_popcountl (x); /* { dg-bogus "as signed due to prototype" } */ @@ -51,6 +55,7 @@ test_sll (signed long long x) return __builtin_llabs (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_clzll (x) /* { dg-warning "as unsigned due to prototype" } */ + __builtin_ctzll (x) /* { dg-warning "as unsigned due to prototype" } */ + + __builtin_clrsbll (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_ffsll (x) /* { dg-bogus "as unsigned due to prototype" } */ + __builtin_parityll (x) /* { dg-warning "as unsigned due to prototype" } */ + __builtin_popcountll (x); /* { dg-warning "as unsigned due to prototype" } */ @@ -62,6 +67,7 @@ test_ull (unsigned long long x) return __builtin_llabs (x) /* { dg-warning "as signed due to prototype" } */ + __builtin_clzll 
(x) /* { dg-bogus "as signed due to prototype" } */ + __builtin_ctzll (x) /* { dg-bogus "as signed due to prototype" } */ + + __builtin_clrsbll (x) /* { dg-warning "as signed due to prototype" } */ + __builtin_ffsll (x) /* { dg-warning "as signed due to prototype" } */ + __builtin_parityll (x) /* { dg-bogus "as signed due to prototype" } */ + __builtin_popcountll (x); /* { dg-bogus "as signed due to prototype" } */ Index: gcc/testsuite/gcc.dg/torture/builtin-attr-1.c =================================================================== --- gcc/testsuite/gcc.dg/torture/builtin-attr-1.c (revision 174339) +++ gcc/testsuite/gcc.dg/torture/builtin-attr-1.c (working copy) @@ -416,6 +416,9 @@ BUILTIN_TEST1 (clzll, long long) BUILTIN_TEST1 (ctz, int) BUILTIN_TEST1 (ctzl, long) BUILTIN_TEST1 (ctzll, long long) +BUILTIN_TEST1 (clrsb, int) +BUILTIN_TEST1 (clrsbl, long) +BUILTIN_TEST1 (clrsbll, long long) TEST1 (ffs, int, int) TEST1 (ffsl, long, int) TEST1 (ffsll, long long, int) Index: gcc/builtins.def =================================================================== --- gcc/builtins.def (revision 174339) +++ gcc/builtins.def (working copy) @@ -620,6 +620,10 @@ DEF_GCC_BUILTIN (BUILT_IN_CTZ, "c DEF_GCC_BUILTIN (BUILT_IN_CTZIMAX, "ctzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN (BUILT_IN_CTZL, "ctzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN (BUILT_IN_CTZLL, "ctzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CLRSB, "clrsb", BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CLRSBIMAX, "clrsbimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CLRSBL, "clrsbl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CLRSBLL, "clrsbll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_EXT_LIB_BUILTIN (BUILT_IN_DCGETTEXT, "dcgettext", BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2) DEF_EXT_LIB_BUILTIN 
(BUILT_IN_DGETTEXT, "dgettext", BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2) DEF_GCC_BUILTIN (BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL) Index: gcc/rtl.def =================================================================== --- gcc/rtl.def (revision 174339) +++ gcc/rtl.def (working copy) @@ -613,6 +613,10 @@ DEF_RTL_EXPR(BSWAP, "bswap", "e", RTX_UN or 0 if arg is 0. */ DEF_RTL_EXPR(FFS, "ffs", "e", RTX_UNARY) +/* Count number of leading redundant sign bits (number of leading + sign bits minus one). */ +DEF_RTL_EXPR(CLRSB, "clrsb", "e", RTX_UNARY) + /* Count leading zeros. */ DEF_RTL_EXPR(CLZ, "clz", "e", RTX_UNARY) Index: gcc/dwarf2out.c =================================================================== --- gcc/dwarf2out.c (revision 174339) +++ gcc/dwarf2out.c (working copy) @@ -14874,6 +14874,7 @@ mem_loc_descriptor (rtx rtl, enum machin case FFS: case CLZ: case CTZ: + case CLRSB: case POPCOUNT: case PARITY: case ASM_OPERANDS: Index: gcc/libgcc2.c =================================================================== --- gcc/libgcc2.c (revision 174339) +++ gcc/libgcc2.c (working copy) @@ -762,7 +762,50 @@ __ctzDI2 (UDWtype x) return ret + add; } #endif + +#ifdef L_clrsbsi2 +#undef int +int +__clrsbSI2 (Wtype x) +{ + Wtype ret; + if (x < 0) + x = ~x; + if (x == 0) + return W_TYPE_SIZE - 1; + count_leading_zeros (ret, x); + return ret - 1; +} +#endif + +#ifdef L_clrsbdi2 +#undef int +int +__clrsbDI2 (DWtype x) +{ + const DWunion uu = {.ll = x}; + UWtype word; + Wtype ret, add; + + if (uu.s.high == 0) + word = uu.s.low, add = W_TYPE_SIZE; + else if (uu.s.high == -1) + word = ~uu.s.low, add = W_TYPE_SIZE; + else if (uu.s.high >= 0) + word = uu.s.high, add = 0; + else + word = ~uu.s.high, add = 0; + + if (word == 0) + ret = W_TYPE_SIZE; + else + count_leading_zeros (ret, word); + + return ret + add - 1; +} +#endif + #ifdef L_popcount_tab const UQItype __popcount_tab[256] = { Index: gcc/libgcc2.h 
=================================================================== --- gcc/libgcc2.h (revision 174339) +++ gcc/libgcc2.h (working copy) @@ -315,11 +315,13 @@ typedef int shift_count_type __attribute #define __ffsSI2 __NW(ffs,2) #define __clzSI2 __NW(clz,2) #define __ctzSI2 __NW(ctz,2) +#define __clrsbSI2 __NW(clrsb,2) #define __popcountSI2 __NW(popcount,2) #define __paritySI2 __NW(parity,2) #define __ffsDI2 __NDW(ffs,2) #define __clzDI2 __NDW(clz,2) #define __ctzDI2 __NDW(ctz,2) +#define __clrsbDI2 __NDW(clrsb,2) #define __popcountDI2 __NDW(popcount,2) #define __parityDI2 __NDW(parity,2) @@ -508,9 +510,11 @@ extern const UQItype __clz_tab[256]; extern int __clzDI2 (UDWtype); extern int __clzSI2 (UWtype); extern int __ctzSI2 (UWtype); +extern int __ctzDI2 (UDWtype); +extern int __clrsbSI2 (Wtype); +extern int __clrsbDI2 (DWtype); extern int __ffsSI2 (UWtype); extern int __ffsDI2 (DWtype); -extern int __ctzDI2 (UDWtype); extern int __popcountSI2 (UWtype); extern int __popcountDI2 (UDWtype); extern int __paritySI2 (UWtype); Index: gcc/simplify-rtx.c =================================================================== --- gcc/simplify-rtx.c (revision 174339) +++ gcc/simplify-rtx.c (working copy) @@ -1127,6 +1127,7 @@ simplify_const_unary_operation (enum rtx rtx op, enum machine_mode op_mode) { unsigned int width = GET_MODE_BITSIZE (mode); + unsigned int op_width = GET_MODE_BITSIZE (op_mode); if (code == VEC_DUPLICATE) { @@ -1237,7 +1238,8 @@ simplify_const_unary_operation (enum rtx } if (CONST_INT_P (op) - && width <= HOST_BITS_PER_WIDE_INT && width > 0) + && width <= HOST_BITS_PER_WIDE_INT + && op_width <= HOST_BITS_PER_WIDE_INT && op_width > 0) { HOST_WIDE_INT arg0 = INTVAL (op); HOST_WIDE_INT val; @@ -1257,40 +1259,50 @@ simplify_const_unary_operation (enum rtx break; case FFS: - arg0 &= GET_MODE_MASK (mode); + arg0 &= GET_MODE_MASK (op_mode); val = ffs_hwi (arg0); break; case CLZ: - arg0 &= GET_MODE_MASK (mode); - if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, 
val)) + arg0 &= GET_MODE_MASK (op_mode); + if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (op_mode, val)) ; else - val = GET_MODE_BITSIZE (mode) - floor_log2 (arg0) - 1; + val = GET_MODE_BITSIZE (op_mode) - floor_log2 (arg0) - 1; + break; + + case CLRSB: + arg0 &= GET_MODE_MASK (op_mode); + if (arg0 == 0) + val = GET_MODE_BITSIZE (op_mode) - 1; + else if (arg0 >= 0) + val = GET_MODE_BITSIZE (op_mode) - floor_log2 (arg0) - 2; + else if (arg0 < 0) + val = GET_MODE_BITSIZE (op_mode) - floor_log2 (~arg0) - 2; break; case CTZ: - arg0 &= GET_MODE_MASK (mode); + arg0 &= GET_MODE_MASK (op_mode); if (arg0 == 0) { /* Even if the value at zero is undefined, we have to come up with some replacement. Seems good enough. */ - if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val)) - val = GET_MODE_BITSIZE (mode); + if (! CTZ_DEFINED_VALUE_AT_ZERO (op_mode, val)) + val = GET_MODE_BITSIZE (op_mode); } else val = ctz_hwi (arg0); break; case POPCOUNT: - arg0 &= GET_MODE_MASK (mode); + arg0 &= GET_MODE_MASK (op_mode); val = 0; while (arg0) val++, arg0 &= arg0 - 1; break; case PARITY: - arg0 &= GET_MODE_MASK (mode); + arg0 &= GET_MODE_MASK (op_mode); val = 0; while (arg0) val++, arg0 &= arg0 - 1; Index: gcc/config/bfin/bfin.c =================================================================== --- gcc/config/bfin/bfin.c (revision 174339) +++ gcc/config/bfin/bfin.c (working copy) @@ -6254,11 +6254,11 @@ static const struct builtin_description { CODE_FOR_ones, "__builtin_bfin_ones", BFIN_BUILTIN_ONES, 0 }, - { CODE_FOR_signbitshi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 }, + { CODE_FOR_clrsbhi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 }, { CODE_FOR_ssneghi2, "__builtin_bfin_negate_fr1x16", BFIN_BUILTIN_NEG_1X16, 0 }, { CODE_FOR_abshi2, "__builtin_bfin_abs_fr1x16", BFIN_BUILTIN_ABS_1X16, 0 }, - { CODE_FOR_signbitssi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 }, + { CODE_FOR_clrsbsi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 }, { 
CODE_FOR_ssroundsi2, "__builtin_bfin_round_fr1x32", BFIN_BUILTIN_ROUND_1X32, 0 }, { CODE_FOR_ssnegsi2, "__builtin_bfin_negate_fr1x32", BFIN_BUILTIN_NEG_1X32, 0 }, { CODE_FOR_ssabssi2, "__builtin_bfin_abs_fr1x32", BFIN_BUILTIN_ABS_1X32, 0 }, Index: gcc/config/bfin/bfin.md =================================================================== --- gcc/config/bfin/bfin.md (revision 174339) +++ gcc/config/bfin/bfin.md (working copy) @@ -1461,12 +1461,19 @@ (define_insn "one_cmplsi2" "%0 = ~%1;" [(set_attr "type" "alu0")]) +(define_expand "clrsbsi2" + [(set (match_dup 2) + (clrsb:HI (match_operand:SI 1 "register_operand" "d"))) + (set (match_operand:SI 0 "register_operand") + (zero_extend:SI (match_dup 2)))] + "" +{ + operands[2] = gen_reg_rtx (HImode); +}) + (define_insn "signbitssi2" [(set (match_operand:HI 0 "register_operand" "=d") - (if_then_else:HI - (lt (match_operand:SI 1 "register_operand" "d") (const_int 0)) - (clz:HI (not:SI (match_dup 1))) - (clz:HI (match_dup 1))))] + (clrsb:HI (match_operand:SI 1 "register_operand" "d")))] "" "%h0 = signbits %1%!" [(set_attr "type" "dsp32")]) @@ -1518,12 +1525,9 @@ (define_insn "ssneghi2" "%0 = -%1 (V)%!" [(set_attr "type" "dsp32")]) -(define_insn "signbitshi2" +(define_insn "clrsbhi2" [(set (match_operand:HI 0 "register_operand" "=d") - (if_then_else:HI - (lt (match_operand:HI 1 "register_operand" "d") (const_int 0)) - (clz:HI (not:HI (match_dup 1))) - (clz:HI (match_dup 1))))] + (clrsb:HI (match_operand:HI 1 "register_operand" "d")))] "" "%h0 = signbits %h1%!" [(set_attr "type" "dsp32")])