Message ID: 87vbzcph8h.fsf@sandifor-thinkpad.stglab.manchester.uk.ibm.com
State: New
On 28/11/13 17:29, Richard Sandiford wrote:
> The existing ltu_p fast path can handle any pairs of single-HWI inputs,
> even for precision > HOST_BITS_PER_WIDE_INT.  In that case both xl and
> yl are implicitly sign-extended to the larger precision, but with the
> extended values still being compared as unsigned.  The extension doesn't
> change the result in that case.
>
> When compiling a recent fold-const.ii, this reduces the number of
> ltu_p_large calls from 23849 to 697.

Are these sorts of nuggets of information going to be recorded anywhere?

R.
this is fine.

kenny

On 11/28/2013 12:29 PM, Richard Sandiford wrote:
> The existing ltu_p fast path can handle any pairs of single-HWI inputs,
> even for precision > HOST_BITS_PER_WIDE_INT.  In that case both xl and
> yl are implicitly sign-extended to the larger precision, but with the
> extended values still being compared as unsigned.  The extension doesn't
> change the result in that case.
>
> When compiling a recent fold-const.ii, this reduces the number of
> ltu_p_large calls from 23849 to 697.
>
> Tested on x86_64-linux-gnu.  OK to install?
>
> Thanks,
> Richard
>
>
> Index: gcc/alias.c
> ===================================================================
> --- gcc/alias.c 2013-11-20 12:12:49.393055063 +0000
> +++ gcc/alias.c 2013-11-28 12:24:23.307549245 +0000
> @@ -342,7 +342,7 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
>        || (DECL_P (ref->base)
>            && (DECL_SIZE (ref->base) == NULL_TREE
>                || TREE_CODE (DECL_SIZE (ref->base)) != INTEGER_CST
> -              || wi::ltu_p (DECL_SIZE (ref->base),
> +              || wi::ltu_p (wi::to_offset (DECL_SIZE (ref->base)),
>                              ref->offset + ref->size)))))
>      return false;
>
> Index: gcc/wide-int.h
> ===================================================================
> --- gcc/wide-int.h 2013-11-28 11:44:39.041731636 +0000
> +++ gcc/wide-int.h 2013-11-28 12:48:36.200764215 +0000
> @@ -1740,13 +1740,15 @@ wi::ltu_p (const T1 &x, const T2 &y)
>    unsigned int precision = get_binary_precision (x, y);
>    WIDE_INT_REF_FOR (T1) xi (x, precision);
>    WIDE_INT_REF_FOR (T2) yi (y, precision);
> -  /* Optimize comparisons with constants and with sub-HWI unsigned
> -     integers.  */
> +  /* Optimize comparisons with constants.  */
>    if (STATIC_CONSTANT_P (yi.len == 1 && yi.val[0] >= 0))
>      return xi.len == 1 && xi.to_uhwi () < (unsigned HOST_WIDE_INT) yi.val[0];
>    if (STATIC_CONSTANT_P (xi.len == 1 && xi.val[0] >= 0))
>      return yi.len != 1 || yi.to_uhwi () > (unsigned HOST_WIDE_INT) xi.val[0];
> -  if (precision <= HOST_BITS_PER_WIDE_INT)
> +  /* Optimize the case of two HWIs.  The HWIs are implicitly sign-extended
> +     for precisions greater than HOST_BITS_WIDE_INT, but sign-extending both
> +     values does not change the result.  */
> +  if (xi.len + yi.len == 2)
>      {
>        unsigned HOST_WIDE_INT xl = xi.to_uhwi ();
>        unsigned HOST_WIDE_INT yl = yi.to_uhwi ();
Richard Earnshaw <rearnsha@arm.com> writes:
> On 28/11/13 17:29, Richard Sandiford wrote:
>> The existing ltu_p fast path can handle any pairs of single-HWI inputs,
>> even for precision > HOST_BITS_PER_WIDE_INT.  In that case both xl and
>> yl are implicitly sign-extended to the larger precision, but with the
>> extended values still being compared as unsigned.  The extension doesn't
>> change the result in that case.
>>
>> When compiling a recent fold-const.ii, this reduces the number of
>> ltu_p_large calls from 23849 to 697.
>
> Are these sorts of nuggets of information going to be recorded anywhere?

You mean put the fold-const.ii numbers in a comment?  I could if you like,
but it's really just a general principle that checking for two len == 1
integers catches more cases than checking for the precision being <=
HOST_BITS_PER_WIDE_INT.  Every precision <= HOST_BITS_PER_WIDE_INT
will have a length of 1, but many integers with a length of 1 have a
precision > HOST_BITS_PER_WIDE_INT (because of offset_int and widest_int).

Thanks,
Richard
On Fri, Nov 29, 2013 at 11:08 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Earnshaw <rearnsha@arm.com> writes:
>> On 28/11/13 17:29, Richard Sandiford wrote:
>>> The existing ltu_p fast path can handle any pairs of single-HWI inputs,
>>> even for precision > HOST_BITS_PER_WIDE_INT.  In that case both xl and
>>> yl are implicitly sign-extended to the larger precision, but with the
>>> extended values still being compared as unsigned.  The extension doesn't
>>> change the result in that case.
>>>
>>> When compiling a recent fold-const.ii, this reduces the number of
>>> ltu_p_large calls from 23849 to 697.
>>
>> Are these sorts of nuggets of information going to be recorded anywhere?
>
> You mean put the fold-const.ii numbers in a comment?  I could if you like,
> but it's really just a general principle that checking for two len == 1
> integers catches more cases than checking for the precision being <=
> HOST_BITS_PER_WIDE_INT.  Every precision <= HOST_BITS_PER_WIDE_INT
> will have a length of 1, but many integers with a length of 1 have a
> precision > HOST_BITS_PER_WIDE_INT (because of offset_int and widest_int).

Indeed - to be really useful, shortcuts should work with len == 1 instead
of just with precision <= HOST_BITS_PER_WIDE_INT.  Which usually means
handling a result of len == 2.

Richard.

> Thanks,
> Richard
Index: gcc/alias.c
===================================================================
--- gcc/alias.c 2013-11-20 12:12:49.393055063 +0000
+++ gcc/alias.c 2013-11-28 12:24:23.307549245 +0000
@@ -342,7 +342,7 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
       || (DECL_P (ref->base)
           && (DECL_SIZE (ref->base) == NULL_TREE
               || TREE_CODE (DECL_SIZE (ref->base)) != INTEGER_CST
-              || wi::ltu_p (DECL_SIZE (ref->base),
+              || wi::ltu_p (wi::to_offset (DECL_SIZE (ref->base)),
                             ref->offset + ref->size)))))
     return false;

Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h 2013-11-28 11:44:39.041731636 +0000
+++ gcc/wide-int.h 2013-11-28 12:48:36.200764215 +0000
@@ -1740,13 +1740,15 @@ wi::ltu_p (const T1 &x, const T2 &y)
   unsigned int precision = get_binary_precision (x, y);
   WIDE_INT_REF_FOR (T1) xi (x, precision);
   WIDE_INT_REF_FOR (T2) yi (y, precision);
-  /* Optimize comparisons with constants and with sub-HWI unsigned
-     integers.  */
+  /* Optimize comparisons with constants.  */
   if (STATIC_CONSTANT_P (yi.len == 1 && yi.val[0] >= 0))
     return xi.len == 1 && xi.to_uhwi () < (unsigned HOST_WIDE_INT) yi.val[0];
   if (STATIC_CONSTANT_P (xi.len == 1 && xi.val[0] >= 0))
     return yi.len != 1 || yi.to_uhwi () > (unsigned HOST_WIDE_INT) xi.val[0];
-  if (precision <= HOST_BITS_PER_WIDE_INT)
+  /* Optimize the case of two HWIs.  The HWIs are implicitly sign-extended
+     for precisions greater than HOST_BITS_WIDE_INT, but sign-extending both
+     values does not change the result.  */
+  if (xi.len + yi.len == 2)
     {
       unsigned HOST_WIDE_INT xl = xi.to_uhwi ();
       unsigned HOST_WIDE_INT yl = yi.to_uhwi ();