[wide-int] Update main comment

Message ID	87mwlrputg.fsf@talisman.default
State	New

Richard Sandiford Oct. 29, 2013, 10:37 p.m. UTC

This patch tries to update the main wide_int comment to reflect the current
implementation.

- bitsizetype is TImode on x86_64 and others, so I don't think it's
  necessarily true that all offset_ints are signed.  (widest_int are
  though.)

- As discussed in the early threads, I think the first reason for
  using widest_int is bogus.  Extensions should be done in the sign
  of source regardless of which wide_int type you're using.
  Extending directly to another wide_int is fine.

- offset_int now only contains the HWIs that it needs, rather than
  the full MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT.

- offset_int and widest_int no longer store the precision.

- Precision 0 is gone (both as a constant marker and for zero-width
  bitfields -- thanks Richard).

Does it look OK?

Thanks,
Richard
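
The second point above (extend in the sign of the source, whichever wide_int type is the destination) can be illustrated with a minimal sketch. This is not GCC code: `extend_from_source` is a hypothetical helper, and a plain 64-bit integer stands in for the wider type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not a GCC function: take the low SRC_PREC bits of
// VAL and widen them to 64 bits, extending in the sign of the *source*.
// Precondition: 0 < src_prec <= 64.
constexpr int64_t
extend_from_source (uint64_t val, unsigned src_prec, bool src_is_signed)
{
  if (src_prec == 64)
    return static_cast<int64_t> (val);
  const uint64_t mask = (uint64_t{1} << src_prec) - 1;
  const uint64_t low = val & mask;
  const bool top_bit_set = (low >> (src_prec - 1)) & 1;
  // Sign-extend only when the source is signed and its top bit is set;
  // otherwise zero-extend.
  return static_cast<int64_t> (src_is_signed && top_bit_set
                               ? (low | ~mask) : low);
}

// An unsigned 8-bit 200 must stay 200 when widened, even if the
// destination is signed.  Extending in the destination's sign would
// read the same bits (0xC8) as -56.
static_assert (extend_from_source (200, 8, false) == 200,
               "extension by source sign keeps the value");
static_assert (extend_from_source (200, 8, true) == -56,
               "the same bits from a signed source are negative");
```

The destination type does not appear in the helper at all, which is the point being made: once the source sign is known, extending directly into any wider type gives the right value.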

Kenneth Zadeck Oct. 29, 2013, 11:05 p.m. UTC | #1

On 10/29/2013 06:37 PM, Richard Sandiford wrote:
> This patch tries to update the main wide_int comment to reflect the current
> implementation.
>
> - bitsizetype is TImode on x86_64 and others, so I don't think it's
>    necessarily true that all offset_ints are signed.  (widest_int are
>    though.)
I am wondering if this is too conservative an interpretation.  I
believe that they are TImode because that is the next mode after
DImode, and so they wanted to accommodate the 3 extra bits.  Certainly
there is no x86 that is able to address more than 64 bits.

Aside from that, all of this looks OK to me.
>
> - As discussed in the early threads, I think the first reason for
>    using widest_int is bogus.  Extensions should be done in the sign
>    of source regardless of which wide_int type you're using.
>    Extending directly to another wide_int is fine.
>
> - offset_int now only contains the HWIs that it needs, rather than
>    the full MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT.
>
> - offset_int and widest_int no longer store the precision.
>
> - Precision 0 is gone (both as a constant marker and for zero-width
>    bitfields -- thanks Richard).
>
> Does it look OK?
>
> Thanks,
> Richard
>
>
> Index: gcc/wide-int.h
> ===================================================================
> --- gcc/wide-int.h	(revision 204183)
> +++ gcc/wide-int.h	(working copy)
> @@ -70,28 +70,18 @@
>        been no effort by the front ends to convert most addressing
>        arithmetic to canonical types.
>   
> -     In the offset_int, all numbers are represented as signed numbers.
> -     There are enough bits in the internal representation so that no
> -     infomation is lost by representing them this way.
> -
>        3) widest_int.  This representation is an approximation of
>        infinite precision math.  However, it is not really infinite
>        precision math as in the GMP library.  It is really finite
>        precision math where the precision is 4 times the size of the
>        largest integer that the target port can represent.
>   
> -     Like the offset_ints, all numbers are inherently signed.
> +     widest_int is supposed to be wider than any number that it needs to
> +     store, meaning that there is always at least one leading sign bit.
> +     All widest_int values are therefore signed.
>   
>        There are several places in the GCC where this should/must be used:
>   
> -     * Code that does widening conversions.  The canonical way that
> -       this is performed is to sign or zero extend the input value to
> -       the max width based on the sign of the type of the source and
> -       then to truncate that value to the target type.  This is in
> -       preference to using the sign of the target type to extend the
> -       value directly (which gets the wrong value for the conversion
> -       of large unsigned numbers to larger signed types).
> -
>        * Code that does induction variable optimizations.  This code
>          works with induction variables of many different types at the
>          same time.  Because of this, it ends up doing many different
> @@ -122,17 +112,17 @@
>      two, the default is the prefered representation.
>   
>      All three flavors of wide_int are represented as a vector of
> -   HOST_WIDE_INTs.  The vector contains enough elements to hold a
> -   value of MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT which is
> -   a derived for each host/target combination.  The values are stored
> -   in the vector with the least significant HOST_BITS_PER_WIDE_INT
> -   bits of the value stored in element 0.
> +   HOST_WIDE_INTs.  The default and widest_int vectors contain enough elements
> +   to hold a value of MAX_BITSIZE_MODE_ANY_INT bits.  offset_int contains only
> +   enough elements to hold ADDR_MAX_PRECISION bits.  The values are stored
> +   in the vector with the least significant HOST_BITS_PER_WIDE_INT bits
> +   in element 0.
>   
> -   A wide_int contains three fields: the vector (VAL), precision and a
> -   length (LEN).  The length is the number of HWIs needed to
> -   represent the value.  For the widest_int and the offset_int,
> -   the precision is a constant that cannot be changed.  For the
> -   default wide_int, the precision is set from the constructor.
> +   The default wide_int contains three fields: the vector (VAL),
> +   the precision and a length (LEN).  The length is the number of HWIs
> +   needed to represent the value.  widest_int and offset_int have a
> +   constant precision that cannot be changed, so they only store the
> +   VAL and LEN fields.
>   
>      Since most integers used in a compiler are small values, it is
>      generally profitable to use a representation of the value that is
> @@ -143,75 +133,90 @@
>      as long as they can be reconstructed from the top bit that is being
>      represented.
>   
> +   The precision and length of a wide_int are always greater than 0.
> +   Any bits in a wide_int above the precision are sign-extended from the
> +   most significant bit.  For example, a 4-bit value 0x8 is represented as
> +   VAL = { 0xf...fff8 }.  However, as an optimization, we allow other integer
> +   constants to be represented with undefined bits above the precision.
> +   This allows INTEGER_CSTs to be pre-extended according to TYPE_SIGN,
> +   so that the INTEGER_CST representation can be used both in TYPE_PRECISION
> +   and in wider precisions.
> +
>      There are constructors to create the various forms of wide_int from
> -   trees, rtl and constants.  For trees and constants, you can simply say:
> +   trees, rtl and constants.  For trees you can simply say:
>   
> -             tree t = ...;
> +	     tree t = ...;
>   	     wide_int x = t;
> -	     wide_int y = 6;
>   
>      However, a little more syntax is required for rtl constants since
>      they do have an explicit precision.  To make an rtl into a
>      wide_int, you have to pair it with a mode.  The canonical way to do
>      this is with std::make_pair as in:
>   
> -             rtx r = ...
> +	     rtx r = ...
>   	     wide_int x = std::make_pair (r, mode);
>   
> -   Wide ints sometimes have a value with the precision of 0.  These
> -   come from two separate sources:
> +   Similarly, a wide_int can only be constructed from a host value if
> +   the target precision is given explicitly, such as in:
>   
> -   * The front ends do sometimes produce values that really have a
> -     precision of 0.  The only place where these seem to come in are
> -     the MIN and MAX value for types with a precision of 0.  Asside
> -     from the computation of these MIN and MAX values, there appears
> -     to be no other use of true precision 0 numbers so the overloading
> -     of precision 0 does not appear to be an issue.  These appear to
> -     be associated with 0 width bit fields.  They are harmless, but
> -     there are several paths through the wide int code to support this
> -     without having to special case the front ends.
> +	     wide_int x = wi::shwi (c, prec); // sign-extend C if necessary
> +	     wide_int y = wi::uhwi (c, prec); // zero-extend C if necessary
>   
> -   * When a constant that has an integer type is converted to a
> -     wide_int it comes in with precision 0.  For these constants the
> -     top bit does accurately reflect the sign of that constant; this
> -     is an exception to the normal rule that the signedness is not
> -     represented.  When used in a binary operation, the wide_int
> -     implementation properly extends these constants so that they
> -     properly match the other operand of the computation.  This allows
> -     you write:
> +   However, offset_int and widest_int have an inherent precision and so
> +   can be initialized directly from a host value:
>   
> -                tree t = ...
> -                wide_int x = t + 6;
> +	     offset_int x = (int) c;          // sign-extend C
> +	     widest_int x = (unsigned int) c; // zero-extend C
>   
> -     assuming t is a int_cst.
> +   It is also possible to do arithmetic directly on trees, rtxes and
> +   constants.  For example:
>   
> -   Any bits in a wide_int above the precision are sign-extended from the
> -   most significant bit.  For example, a 4-bit value 0x8 is represented as
> -   VAL = { 0xf...fff8 }.  However, as an optimization, we allow other integer
> -   constants to be represented with undefined bits above the precision.
> -   This allows INTEGER_CSTs to be pre-extended according to TYPE_SIGN,
> -   so that the INTEGER_CST representation can be used both in TYPE_PRECISION
> -   and in wider precisions.
> +	     wi::add (t1, t2);	  // add equal-sized INTEGER_CSTs t1 and t2
> +	     wi::add (t1, 1);     // add 1 to INTEGER_CST t1
> +	     wi::add (r1, r2);    // add equal-sized rtx constants r1 and r2
> +	     wi::lshift (1, 100); // 1 << 100 as a widest_int
>   
> -   Precision 0 is allowed for the special case of zero-width bitfields.
> -   They always have a VAL of { 0 } and a LEN of 1.
> +   Many binary operations place restrictions on the combinations of inputs,
> +   using the following rules:
>   
> -   Many binary operations require that the precisions of the two
> -   operands be the same.  However, the API tries to keep this relaxed
> -   as much as possible.  In particular:
> +   - {tree, rtx, wide_int} op {tree, rtx, wide_int} -> wide_int
> +       The inputs must be the same precision.  The result is a wide_int
> +       of the same precision
>   
> -   * shifts do not care about the precision of the second operand.
> +   - {tree, rtx, wide_int} op (un)signed HOST_WIDE_INT -> wide_int
> +     (un)signed HOST_WIDE_INT op {tree, rtx, wide_int} -> wide_int
> +       The HOST_WIDE_INT is extended or truncated to the precision of
> +       the other input.  The result is a wide_int of the same precision
> +       as that input.
>   
> -   * values that come in from gcc source constants or variables are
> -     not checked as long one of the two operands has a precision.
> -     This is allowed because it is always known whether to sign or zero
> -     extend these values.
> +   - (un)signed HOST_WIDE_INT op (un)signed HOST_WIDE_INT -> widest_int
> +       The inputs are extended to widest_int precision and produce a
> +       widest_int result.
>   
> -   * order comparisons do not require that the operands be the same
> -     length.  This allows wide ints to be used in hash tables where
> -     all of the values may not be the same precision.  */
> +   - offset_int op offset_int -> offset_int
> +     offset_int op (un)signed HOST_WIDE_INT -> offset_int
> +     (un)signed HOST_WIDE_INT op offset_int -> offset_int
>   
> +   - widest_int op widest_int -> widest_int
> +     widest_int op (un)signed HOST_WIDE_INT -> widest_int
> +     (un)signed HOST_WIDE_INT op widest_int -> widest_int
>   
> +   Other combinations like:
> +
> +   - widest_int op offset_int and
> +   - wide_int op offset_int
> +
> +   are not allowed.  The inputs should instead be extended or truncated
> +   so that they match.
> +
> +   The inputs to comparison functions like wi::eq_p and wi::lts_p
> +   follow the same compatibility rules, although their return types
> +   are different.  Unary functions on X produce the same result as
> +   a binary operation X + X.  Shift functions X op Y also produce
> +   the same result as X + X; the precision of the shift amount Y
> +   can be arbitrarily different from X.  */
> +
> +
>   #include <utility>
>   #include "system.h"
>   #include "hwint.h"
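
The representation rules in the new comment (bits above the precision are sign-extended, and LEN counts only the blocks that are actually needed) can be sketched as follows. This is a simplified model under stated assumptions, not the wide-int implementation: blocks are assumed to be 64 bits, a value is limited to two blocks, and both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: put the low PREC bits of VAL into canonical form by
// sign-extending them through the rest of a 64-bit block.
// Precondition: 0 < prec <= 64.
constexpr uint64_t
canonical_block (uint64_t val, unsigned prec)
{
  if (prec == 64)
    return val;
  const uint64_t mask = (uint64_t{1} << prec) - 1;
  const uint64_t low = val & mask;
  return ((low >> (prec - 1)) & 1) ? (low | ~mask) : low;
}

// Hypothetical helper: for a value held in two blocks (element 0 is the
// least significant), return how many blocks must be stored.  The upper
// block can be dropped when it is only the sign extension of the lower one.
constexpr unsigned
compressed_len (uint64_t block0, uint64_t block1)
{
  const uint64_t sign_fill = (block0 >> 63) ? ~uint64_t{0} : 0;
  return block1 == sign_fill ? 1 : 2;
}

// The comment's own example: the 4-bit value 0x8 is stored as 0xf...fff8.
static_assert (canonical_block (0x8, 4) == 0xfffffffffffffff8ULL,
               "4-bit 0x8 is sign-extended above the precision");
// An all-ones upper block carries no information, so LEN is 1.
static_assert (compressed_len (0xfffffffffffffff8ULL, ~uint64_t{0}) == 1,
               "redundant sign block is omitted");
// A zero upper block above a set top bit is significant, so LEN is 2.
static_assert (compressed_len (0xfffffffffffffff8ULL, 0) == 2,
               "non-redundant upper block is kept");
```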

Richard Sandiford Oct. 30, 2013, 11:01 a.m. UTC | #2

Kenneth Zadeck <zadeck@naturalbridge.com> writes:
> On 10/29/2013 06:37 PM, Richard Sandiford wrote:
>> This patch tries to update the main wide_int comment to reflect the current
>> implementation.
>>
>> - bitsizetype is TImode on x86_64 and others, so I don't think it's
>>    necessarily true that all offset_ints are signed.  (widest_int are
>>    though.)
> I am wondering if this is too conservative an interpretation.  I
> believe that they are TImode because that is the next mode after
> DImode, and so they wanted to accommodate the 3 extra bits.  Certainly
> there is no x86 that is able to address more than 64 bits.

Right, but my point is that it's a different case from widest_int.
It'd be just as valid to do bitsizetype arithmetic using wide_int
rather than offset_int, and those wide_ints would have precision 128,
just like the offset_ints.  And I wouldn't really say that those wide_ints
were fundamentally signed in any way.  Although the tree layer might "know"
that X upper bits of the bitsizetype are always signs, the tree-wide_int
interface treats them in the same way as any other 128-bit type.

Maybe I'm just being pedantic, but I think offset_int would only be like
widest_int if bitsizetype had precision 67 or whatever.  Then we could
say that both offset_int and widest_int must be wider than any inputs,
meaning that there's at least one leading sign bit.

This is related to the way that we have to assert:

template <int N>
inline wi::extended_tree <N>::extended_tree (const_tree t)
  : m_t (t)
{
  gcc_checking_assert (TYPE_PRECISION (TREE_TYPE (t)) <= N);
}

rather than:

template <int N>
inline wi::extended_tree <N>::extended_tree (const_tree t)
  : m_t (t)
{
  gcc_checking_assert (TYPE_PRECISION (TREE_TYPE (t)) < N);
}

(which would give slightly better offset_int code, because we could then
always use TREE_INT_CST_EXT_NUNITS.)
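
The difference between the two assertions can be seen in a small analogy, assuming a 64-bit container in place of the N-bit one (the helper name is hypothetical, and this is not the extended_tree code). With a precision strictly below the container width there is always a spare leading sign bit; with a precision equal to it, the largest unsigned value reads back as negative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: take the largest unsigned value of PREC bits,
// zero-extend it into a 64-bit container, and read the container back
// as a signed number.  Precondition: 0 < prec <= 64.
constexpr int64_t
unsigned_max_read_as_signed (unsigned prec)
{
  const uint64_t max = prec == 64 ? ~uint64_t{0}
                                  : (uint64_t{1} << prec) - 1;
  return static_cast<int64_t> (max);
}

// precision < N: at least one leading zero survives, so the value is
// still positive when the container is treated as signed.
static_assert (unsigned_max_read_as_signed (63) > 0,
               "a spare sign bit keeps the value positive");
// precision == N: no spare bit, so the same container reads as -1.
static_assert (unsigned_max_read_as_signed (64) == -1,
               "no spare sign bit: the value looks negative");
```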

Thanks,
Richard

Kenneth Zadeck Oct. 30, 2013, 1:18 p.m. UTC | #3

On 10/30/2013 07:01 AM, Richard Sandiford wrote:
> Right, but my point is that it's a different case from widest_int.
> It'd be just as valid to do bitsizetype arithmetic using wide_int
> rather than offset_int, and those wide_ints would have precision 128,
> just like the offset_ints.  And I wouldn't really say that those wide_ints
> were fundamentally signed in any way.  Although the tree layer might "know"
> that X upper bits of the bitsizetype are always signs, the tree-wide_int
> interface treats them in the same way as any other 128-bit type.
>
> Maybe I'm just being pedantic, but I think offset_int would only be like
> widest_int if bitsizetype had precision 67 or whatever.  Then we could
> say that both offset_int and widest_int must be wider than any inputs,
> meaning that there's at least one leading sign bit.
This was of course what Mike and I wanted, but we could not really
figure out how to pull it off.
In particular, we could not find any existing reliable marker in the
targets to say what the width of the widest pointer is on any
implementation.  We actually used the number 68 rather than 67 because
we assumed 64 for the widest pointer on any existing platform, 3 bits
for the bit offset within a byte and 1 bit for the sign.
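
The arithmetic behind 68 can be written out as a short sketch. The constant names are illustrative only, and the 64-bit block size is an assumption standing in for HOST_BITS_PER_WIDE_INT; the sketch also shows why a 68-bit precision still occupies two blocks of storage.

```cpp
#include <cassert>

// Illustrative constants, not GCC macros.
constexpr unsigned widest_pointer_bits = 64; // assumed widest pointer
constexpr unsigned bit_offset_bits = 3;      // log2 (8 bits per byte)
constexpr unsigned sign_bits = 1;            // one leading sign bit
constexpr unsigned offset_precision
  = widest_pointer_bits + bit_offset_bits + sign_bits;

// Assumed block size; GCC uses HOST_BITS_PER_WIDE_INT here.
constexpr unsigned host_block_bits = 64;

// Number of blocks needed to hold PRECISION bits, rounding up.
constexpr unsigned
blocks_needed (unsigned precision, unsigned block_bits)
{
  return (precision + block_bits - 1) / block_bits;
}

static_assert (offset_precision == 68, "64 + 3 + 1");
static_assert (blocks_needed (offset_precision, host_block_bits) == 2,
               "68 bits round up to two 64-bit blocks");
```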
> This is related to the way that we have to assert:
>
> template <int N>
> inline wi::extended_tree <N>::extended_tree (const_tree t)
>    : m_t (t)
> {
>    gcc_checking_assert (TYPE_PRECISION (TREE_TYPE (t)) <= N);
> }
>
> rather than:
>
> template <int N>
> inline wi::extended_tree <N>::extended_tree (const_tree t)
>    : m_t (t)
> {
>    gcc_checking_assert (TYPE_PRECISION (TREE_TYPE (t)) < N);
> }
>
> (which would give slightly better offset_int code, because we could then
> always use TREE_INT_CST_EXT_NUNITS.)
>
> Thanks,
> Richard

Richard Sandiford Oct. 30, 2013, 6:34 p.m. UTC | #4

Kenneth Zadeck <zadeck@naturalbridge.com> writes:
> This was of course what Mike and I wanted, but we could not really
> figure out how to pull it off.
> In particular, we could not find any existing reliable marker in the
> targets to say what the width of the widest pointer is on any
> implementation.  We actually used the number 68 rather than 67 because
> we assumed 64 for the widest pointer on any existing platform, 3 bits
> for the bit offset within a byte and 1 bit for the sign.

Ah yeah, 68 would be better for signed types.

Is the patch OK while we still have 128-bit bitsizetypes though?
I agree the current comment would be right if we ever did switch
to sub-128 bitsizes.

Thanks,
Richard

Kenneth Zadeck Oct. 31, 2013, 11:26 a.m. UTC | #5

On 10/30/2013 02:34 PM, Richard Sandiford wrote:
> Ah yeah, 68 would be better for signed types.
>
> Is the patch OK while we still have 128-bit bitsizetypes though?
> I agree the current comment would be right if we ever did switch
> to sub-128 bitsizes.
>
> Thanks,
> Richard
Yes, this is fine.  Note that 68 is documented at the top of wide-int.h.

kenny

Richard Biener Nov. 4, 2013, 9:01 a.m. UTC | #6

On Wed, 30 Oct 2013, Richard Sandiford wrote:

> Ah yeah, 68 would be better for signed types.
> 
> Is the patch OK while we still have 128-bit bitsizetypes though?
> I agree the current comment would be right if we ever did switch
> to sub-128 bitsizes.

The issue with sub-128bit bitsizetype is code generation quality.
We do generate code for bitsizetype operations (at least from Ada),
so a power-of-two precision is required to avoid a lot of masking
operations.

Richard.
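
The masking cost can be illustrated with a scaled-down analogy, which is not the code GCC generates: a 32-bit container stands in for the 128-bit one and a 20-bit precision stands in for 68. When the precision matches the container, wrap-around comes for free; when it does not, every result needs an extra mask.

```cpp
#include <cassert>
#include <cstdint>

// Precision equal to the container width: unsigned overflow already
// wraps modulo 2^32, so no extra work is needed.
constexpr uint32_t
add_32 (uint32_t a, uint32_t b)
{
  return a + b;
}

// A 20-bit precision inside the same 32-bit container: the result has to
// be masked back into range after each operation.
constexpr uint32_t mask_20 = (uint32_t{1} << 20) - 1;

constexpr uint32_t
add_20 (uint32_t a, uint32_t b)
{
  return (a + b) & mask_20;
}

static_assert (add_32 (0xFFFFFFFFu, 1) == 0, "natural wrap-around");
static_assert (add_20 (0xFFFFFu, 1) == 0, "wrap-around only after masking");
static_assert (add_32 (0xFFFFFu, 1) == 0x100000u,
               "without the mask the 20-bit sum escapes its range");
```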
