
patch to fix constant math

Message ID CAFiYyc0RQvoXZ8OvqctwfcFJtdECD57+c3oZ9RefwSxD36Lhpw@mail.gmail.com
State New

Commit Message

Richard Biener Oct. 5, 2012, 1:49 p.m. UTC
On Fri, Oct 5, 2012 at 2:39 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
>
> Ok, I see where you are going.  Let me look at the patch again.

* The introduction and use of CONST_SCALAR_INT_P could be split out
  (obvious and good)

* DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
  defining that new RTX is orthogonal to providing the wide_int abstraction
  for operating on CONST_INT and CONST_DOUBLE, right?

@@ -144,6 +144,7 @@ static bool
 prefer_and_bit_test (enum machine_mode mode, int bitnum)
 {
   bool speed_p;
+  wide_int mask = wide_int::set_bit_in_zero (bitnum, mode);

set_bit_in_zero ... why use this instead of
wide_int::zero (mode).set_bit (bitnum) that would match the old
double_int_zero.set_bit interface and be more composed of primitives?

   if (and_test == 0)
     {
@@ -164,8 +165,7 @@ prefer_and_bit_test (enum machine_mode m
     }

   /* Fill in the integers.  */
-  XEXP (and_test, 1)
-    = immed_double_int_const (double_int_zero.set_bit (bitnum), mode);
+  XEXP (and_test, 1) = immed_wide_int_const (mask);

I suppose immed_wide_int_const may create a CONST_INT, a CONST_DOUBLE
or a CONST_WIDE_INT?

+#if TARGET_SUPPORTS_WIDE_INT
+/* Returns 1 if OP is an operand that is a CONST_INT or CONST_WIDE_INT.  */
+int
+const_scalar_int_operand (rtx op, enum machine_mode mode)
+{

why is it necessary to condition anything on TARGET_SUPPORTS_WIDE_INT?
It seems testing/compile coverage of this code will be bad ...

Can't you simply map CONST_WIDE_INT to CONST_DOUBLE for targets
not supporting CONST_WIDE_INT (what does it mean to "support"
CONST_WIDE_INT?  Does a target that supports CONST_WIDE_INT no
longer use CONST_DOUBLEs for integers?)

+  if (!CONST_WIDE_INT_P (op))
+    return 0;

hm, that doesn't match the comment (CONST_INT is supposed to return 1 as well).
The comment doesn't really tell what the function does, it seems.

+      int prec = GET_MODE_PRECISION (mode);
+      int bitsize = GET_MODE_BITSIZE (mode);
+
+      if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize)
+       return 0;

its mode argument is not documented.  And this doesn't even try to see if
the constant would fit the mode anyway (like if it's zero).  Not sure what
the function is for.

+       {
+         /* Multiword partial int.  */
+         HOST_WIDE_INT x
+           = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
+         return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1))
+                 == x);

err - so now we have wide_int with a mode but we pass in another mode
and see if they have the same value?  Same value as-in signed value?
Which probably is why you need to rule out different size constants above?
Because you don't know whether to sign or zero extend?


+/* Returns 1 if OP is an operand that is a CONST_WIDE_INT.  */
+int
+const_wide_int_operand (rtx op, enum machine_mode mode)
+{

similar comments, common code should be factored out at least.
Instead of conditioning a whole set of new function on supports-wide-int
please add cases to the existing functions to avoid diverging in pieces
that both kind of targets support.

@@ -2599,7 +2678,7 @@ constrain_operands (int strict)
                break;

              case 'n':
-               if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op))
+               if (CONST_SCALAR_INT_P (op))
                  win = 1;

factoring out these changes would really make the patch less noisy.

skipping to rtl.h bits

+struct GTY((variable_size)) hwivec_def {
+  int num_elem;                /* number of elements */
+  HOST_WIDE_INT elem[1];
+};

num_elem seems redundant and computable from GET_MODE_SIZE.
Or do you want to optimize encoding like for CONST_INT (and unlike
CONST_DOUBLE)?  I doubt the above packs nicely into rtx_def?
A general issue of it though - we waste 32bits on 64bit hosts in
rtx_def between the bits and the union.  Perfect place for num_elem
(if you really need it) and avoids another 4 byte hole.  Similar issue
exists in rtvec_def.

back to where I was

@@ -645,6 +673,10 @@ iterative_hash_rtx (const_rtx x, hashval
       return iterative_hash_object (i, hash);
     case CONST_INT:
       return iterative_hash_object (INTVAL (x), hash);
+    case CONST_WIDE_INT:
+      for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+       hash = iterative_hash_object (CONST_WIDE_INT_ELT (x, i), hash);
+      return hash;

I see CONST_DOUBLE's value is not hashed.  Why hash wide-int's value?

Seeing

+#define HWI_GET_NUM_ELEM(HWIVEC)       ((HWIVEC)->num_elem)
+#define HWI_PUT_NUM_ELEM(HWIVEC, NUM)  ((HWIVEC)->num_elem = (NUM))

skipping to

+#if TARGET_SUPPORTS_WIDE_INT
+  {
+    rtx value = const_wide_int_alloc (len);
+    unsigned int i;
+
+    /* It is so tempting to just put the mode in here.  Must control
+       myself ... */
+    PUT_MODE (value, VOIDmode);
+    HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (value), len);

why is NUM_ELEM not set in const_wide_int_alloc, and only there?

+extern rtx rtx_alloc_stat_v (RTX_CODE MEM_STAT_DECL, int);
+#define rtx_alloc_v(c, SZ) rtx_alloc_stat_v (c MEM_STAT_INFO, SZ)
+#define const_wide_int_alloc(NWORDS)                           \
+  rtx_alloc_v (CONST_WIDE_INT,                                 \
+              (sizeof (struct hwivec_def)                      \
+               + ((NWORDS)-1) * sizeof (HOST_WIDE_INT)))       \

because it's a macro(?).  Ugh.

I see CONST_DOUBLE for ints and CONST_WIDE_INT for ints use
is mutually exclusive.  Good.  What's the issue with converting targets?
Just retain the existing 2 * HWI limit by default.

+#if TARGET_SUPPORTS_WIDE_INT
+
+/* Match CONST_*s that can represent compile-time constant integers.  */
+#define CASE_CONST_SCALAR_INT \
+   case CONST_INT: \
+   case CONST_WIDE_INT
+

I regard this abstraction as existing mostly because the transition is not finished.
Not sure if seeing CASE_CONST_SCALAR_INT, CASE_CONST_UNIQUE (?),
CASE_CONST_ANY adds more to the confusion than spelling out all of them.
Isn't there sth like a tree-code-class for RTXen?  That would catch
CASE_CONST_ANY, no?

@@ -3081,6 +3081,8 @@ commutative_operand_precedence (rtx op)
   /* Constants always come the second operand.  Prefer "nice" constants.  */
   if (code == CONST_INT)
     return -8;
+  if (code == CONST_WIDE_INT)
+    return -8;
   if (code == CONST_DOUBLE)
     return -7;
   if (code == CONST_FIXED)

I think it should be the same as CONST_DOUBLE which it "replaces".
Likewise below, that avoids code generation changes (which there should
be none for a target that is converted, right?).

@@ -5351,6 +5355,20 @@ split_double (rtx value, rtx *first, rtx
            }
        }
     }
+  else if (GET_CODE (value) == CONST_WIDE_INT)
+    {
+      gcc_assert (CONST_WIDE_INT_NUNITS (value) == 2);
+      if (WORDS_BIG_ENDIAN)
+       {
+         *first = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
+         *second = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
+       }
+      else
+       {
+         *first = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
+         *second = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
+       }
+    }

scary ;)

Comments

Kenneth Zadeck Oct. 5, 2012, 4:34 p.m. UTC | #1
Richard S.,
there are two comments that I deferred to you; they are the ones with
the word "richard" in them.

Richi,
thanks, I will start doing this now.

Kenny
On 10/05/2012 09:49 AM, Richard Guenther wrote:
> On Fri, Oct 5, 2012 at 2:39 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> Ok, I see where you are going.  Let me look at the patch again.
> * The introduction and use of CONST_SCALAR_INT_P could be split out
>    (obvious and good)
>
> * DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
>    defining that new RTX is orthogonal to providing the wide_int abstraction
>    for operating on CONST_INT and CONST_DOUBLE, right?
>
> @@ -144,6 +144,7 @@ static bool
>   prefer_and_bit_test (enum machine_mode mode, int bitnum)
>   {
>     bool speed_p;
> +  wide_int mask = wide_int::set_bit_in_zero (bitnum, mode);
>
> set_bit_in_zero ... why use this instead of
> wide_int::zero (mode).set_bit (bitnum) that would match the old
> double_int_zero.set_bit interface and be more composed of primitives?
Adding something like this was just based on usage.  We do this
operation all over the place, so why not make it concise and efficient?
The API is very bottom-up: I looked at what the clients were doing all
over the place and added those functions.

wide-int has both and_not and or_not.  double-int only has and_not, but
there are a lot of places where you do or_nots, so we have or_not also.

>     if (and_test == 0)
>       {
> @@ -164,8 +165,7 @@ prefer_and_bit_test (enum machine_mode m
>       }
>
>     /* Fill in the integers.  */
> -  XEXP (and_test, 1)
> -    = immed_double_int_const (double_int_zero.set_bit (bitnum), mode);
> +  XEXP (and_test, 1) = immed_wide_int_const (mask);
>
> I suppose immed_wide_int_const may create a CONST_INT, a CONST_DOUBLE
> or a CONST_WIDE_INT?
Yes, on converted targets it builds CONST_INTs and CONST_WIDE_INTs, and
on unconverted targets it builds CONST_INTs and CONST_DOUBLEs.  The
reason I did not want to convert the targets is that the code that lives
in the targets generally wants to use the values to create constants,
and this code is very, very target-specific.
>
> +#if TARGET_SUPPORTS_WIDE_INT
> +/* Returns 1 if OP is an operand that is a CONST_INT or CONST_WIDE_INT.  */
> +int
> +const_scalar_int_operand (rtx op, enum machine_mode mode)
> +{
>
> why is it necessary to condition anything on TARGET_SUPPORTS_WIDE_INT?
> It seems testing/compile coverage of this code will be bad ...
>
> Can't you simply map CONST_WIDE_INT to CONST_DOUBLE for targets
The accessors for the two are completely different.  CONST_DOUBLEs
"know" that there are exactly two HWIs.  For CONST_WIDE_INTs, you only
know that the len is greater than 1; anything with len 1 would be a
CONST_INT.

In a modern C++ world, CONST_INT would be a subtype of CONST_WIDE_INT,
but that is a huge patch.

> not supporting CONST_WIDE_INT (what does it mean to "support"
> CONST_WIDE_INT?  Does a target that supports CONST_WIDE_INT no
> longer use CONST_DOUBLEs for integers?)
Yes, that is exactly what it means.
> +  if (!CONST_WIDE_INT_P (op))
> +    return 0;
>
> hm, that doesn't match the comment (CONST_INT is supposed to return 1 as well).
> The comment doesn't really tell what the function does, it seems.
I need some context here to reply.

> +      int prec = GET_MODE_PRECISION (mode);
> +      int bitsize = GET_MODE_BITSIZE (mode);
> +
> +      if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize)
> +       return 0;
>
> its mode argument is not documented.  And this doesn't even try to see if
> the constant would fit the mode anyway (like if it's zero).  Not sure what
> the function is for.
I will upgrade the comments; they were inherited from the old version,
with const_double changed to const_wide_int.

> +       {
> +         /* Multiword partial int.  */
> +         HOST_WIDE_INT x
> +           = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
> +         return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1))
> +                 == x);
>
> err - so now we have wide_int with a mode but we pass in another mode
> and see if they have the same value?  Same value as-in signed value?
> Which probably is why you need to rule out different size constants above?
> Because you don't know whether to sign or zero extend?
No, it is worse than that.  I made the decision, which I think is
correct, that we were not going to carry the mode inside
const_wide_int.  The justification was that we do not do it for
const_int.  There is a comment in the constructor for const_wide_int
that says it would be so easy to just put this in.

But this is an old API that did not change; only the internals changed
to support const_wide_int.

>
> +/* Returns 1 if OP is an operand that is a CONST_WIDE_INT.  */
> +int
> +const_wide_int_operand (rtx op, enum machine_mode mode)
> +{
>
> similar comments, common code should be factored out at least.
> Instead of conditioning a whole set of new function on supports-wide-int
> please add cases to the existing functions to avoid diverging in pieces
> that both kind of targets support.
In some places I do one and in some the other; it really just depends
on how much common code there was.  For these there was almost no
common code.

> @@ -2599,7 +2678,7 @@ constrain_operands (int strict)
>                  break;
>
>                case 'n':
> -               if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op))
> +               if (CONST_SCALAR_INT_P (op))
>                    win = 1;
>
> factoring out these changes would really make the patch less noisy.
I will do that this weekend.
> skipping to rtl.h bits
>
> +struct GTY((variable_size)) hwivec_def {
> +  int num_elem;                /* number of elements */
> +  HOST_WIDE_INT elem[1];
> +};
>
> num_elem seems redundant and computable from GET_MODE_SIZE.
Not at all.  CONST_WIDE_INT and TREE_INT_CST only have enough elements
in the array to hold the actual value; they do not have enough elements
to fill out the full width of the mode or type.
In real life almost all integer constants of any type are very small.
In the RTL world, we almost never actually create any CONST_WIDE_INT
outside of the test cases, because CONST_INT handles those cases, and I
assume that at the tree level, almost every int cst will have a len of 1.


> Or do you want to optimize encoding like for CONST_INT (and unlike
> CONST_DOUBLE)?  I doubt the above packs nicely into rtx_def?
> A general issue of it though - we waste 32bits on 64bit hosts in
> rtx_def between the bits and the union.  Perfect place for num_elem
> (if you really need it) and avoids another 4 byte hole.  Similar issue
> exists in rtvec_def.
I am going to let Richard answer this.

> back to where I was
>
> @@ -645,6 +673,10 @@ iterative_hash_rtx (const_rtx x, hashval
>         return iterative_hash_object (i, hash);
>       case CONST_INT:
>         return iterative_hash_object (INTVAL (x), hash);
> +    case CONST_WIDE_INT:
> +      for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
> +       hash = iterative_hash_object (CONST_WIDE_INT_ELT (x, i), hash);
> +      return hash;
>
> I see CONST_DOUBLE's value is not hashed.  Why hash wide-int's value?
There are a lot of ugly things that one uncovers when one looks at code
this old.  The idea was to bring whatever I touched up to best-practice
standards.

> Seeing
>
> +#define HWI_GET_NUM_ELEM(HWIVEC)       ((HWIVEC)->num_elem)
> +#define HWI_PUT_NUM_ELEM(HWIVEC, NUM)  ((HWIVEC)->num_elem = (NUM))
>
> skipping to
>
> +#if TARGET_SUPPORTS_WIDE_INT
> +  {
> +    rtx value = const_wide_int_alloc (len);
> +    unsigned int i;
> +
> +    /* It is so tempting to just put the mode in here.  Must control
> +       myself ... */
> +    PUT_MODE (value, VOIDmode);
> +    HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (value), len);
>
> why is NUM_ELEM not set in const_wide_int_alloc, and only there?
>
> +extern rtx rtx_alloc_stat_v (RTX_CODE MEM_STAT_DECL, int);
> +#define rtx_alloc_v(c, SZ) rtx_alloc_stat_v (c MEM_STAT_INFO, SZ)
> +#define const_wide_int_alloc(NWORDS)                           \
> +  rtx_alloc_v (CONST_WIDE_INT,                                 \
> +              (sizeof (struct hwivec_def)                      \
> +               + ((NWORDS)-1) * sizeof (HOST_WIDE_INT)))       \
>
> because it's a macro(?).  Ugh.
>
> I see CONST_DOUBLE for ints and CONST_WIDE_INT for ints use
> is mutually exclusive.  Good.  What's the issue with converting targets?
> Just retain the existing 2 * HWI limit by default.
I have answered this elsewhere: the constant-generation code on some
platforms is tricky, and I felt that converting it would require
knowledge of the port that I did not have.  It is not just a bunch of C
code; it also means changing patterns in the md files.

> +#if TARGET_SUPPORTS_WIDE_INT
> +
> +/* Match CONST_*s that can represent compile-time constant integers.  */
> +#define CASE_CONST_SCALAR_INT \
> +   case CONST_INT: \
> +   case CONST_WIDE_INT
> +
>
> I regard this abstraction as existing mostly because the transition is not finished.
> Not sure if seeing CASE_CONST_SCALAR_INT, CASE_CONST_UNIQUE (?),
> CASE_CONST_ANY adds more to the confusion than spelling out all of them.
> Isn't there sth like a tree-code-class for RTXen?  That would catch
> CASE_CONST_ANY, no?
However, those abstractions are already in trunk; they were put in
earlier to make this patch smaller.
> @@ -3081,6 +3081,8 @@ commutative_operand_precedence (rtx op)
>     /* Constants always come the second operand.  Prefer "nice" constants.  */
>     if (code == CONST_INT)
>       return -8;
> +  if (code == CONST_WIDE_INT)
> +    return -8;
>     if (code == CONST_DOUBLE)
>       return -7;
>     if (code == CONST_FIXED)
>
> I think it should be the same as CONST_DOUBLE which it "replaces".
> Likewise below, that avoids code generation changes (which there should
> be none for a target that is converted, right?).
Not clear, because it does not really replace CONST_DOUBLE; remember
that CONST_DOUBLE still holds FP stuff.  Another one for Richard to
comment on.
> @@ -5351,6 +5355,20 @@ split_double (rtx value, rtx *first, rtx
>              }
>          }
>       }
> +  else if (GET_CODE (value) == CONST_WIDE_INT)
> +    {
> +      gcc_assert (CONST_WIDE_INT_NUNITS (value) == 2);
> +      if (WORDS_BIG_ENDIAN)
> +       {
> +         *first = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
> +         *second = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
> +       }
> +      else
> +       {
> +         *first = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
> +         *second = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
> +       }
> +    }
>
> scary ;)
Unbelievably scary.  This is one of those places where I really do not
know what the code is doing, so I just put in something that was
minimally correct.  When we get some code where the assert hits, I will
figure it out.

> Index: gcc/sched-vis.c
> ===================================================================
> --- gcc/sched-vis.c     (revision 191978)
> +++ gcc/sched-vis.c     (working copy)
> @@ -444,14 +444,31 @@ print_value (char *buf, const_rtx x, int
> ...
>       case CONST_DOUBLE:
> -      if (FLOAT_MODE_P (GET_MODE (x)))
> -       real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0, 1);
> -      else
> +     if (TARGET_SUPPORTS_WIDE_INT == 0 && !FLOAT_MODE_P (GET_MODE (x)))
>          sprintf (t,
>                   "<" HOST_WIDE_INT_PRINT_HEX "," HOST_WIDE_INT_PRINT_HEX ">",
>                   (unsigned HOST_WIDE_INT) CONST_DOUBLE_LOW (x),
>                   (unsigned HOST_WIDE_INT) CONST_DOUBLE_HIGH (x));
> +      else
> +       real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0, 1);
>         cur = safe_concat (buf, cur, t);
>         break;
>
> I don't see that this hunk is needed?  In fact it's more confusing this way.
You are likely right.  This is really there to say that this code would
go away if (when) all targets support wide int, but I will get rid of it.
> +/* If the target supports integers that are wider than two
> +   HOST_WIDE_INTs on the host compiler, then the target should define
> +   TARGET_SUPPORTS_WIDE_INT and make the appropriate fixups.
> +   Otherwise the compiler really is not robust.  */
> +#ifndef TARGET_SUPPORTS_WIDE_INT
> +#define TARGET_SUPPORTS_WIDE_INT 0
> +#endif
>
> I wonder what are the appropriate fixups?
It was in the mail of the original patch.  In the next round, I will
transfer a cleaned-up version of that into the documentation for this
target hook.
>
> -/* Return a CONST_INT or CONST_DOUBLE corresponding to target reading
> -   GET_MODE_BITSIZE (MODE) bits from string constant STR.  */
> +/* Return a CONST_INT, CONST_WIDE_INT, or CONST_DOUBLE corresponding
> +   to target reading GET_MODE_BITSIZE (MODE) bits from string constant
> +   STR.  */
>
> maybe being less specific in this kind of comment and just referring to
> RTX integer constants would be better?
Will do; I have done some like that.
> ... skip ...
>
> Index: gcc/hwint.c
> ===================================================================
> --- gcc/hwint.c (revision 191978)
> +++ gcc/hwint.c (working copy)
> @@ -112,6 +112,29 @@ ffs_hwi (unsigned HOST_WIDE_INT x)
>   int
>   popcount_hwi (unsigned HOST_WIDE_INT x)
>   {
> +  /* Compute the popcount of a HWI using the algorithm from
> +     Hacker's Delight.  */
> +#if HOST_BITS_PER_WIDE_INT == 32
>
> separate patch please.  With a rationale.
Fine.
> ...
>
> +/* Constructs tree in type TYPE from with value given by CST.  Signedness
> +   of CST is assumed to be the same as the signedness of TYPE.  */
> +
> +tree
> +wide_int_to_tree (tree type, const wide_int &cst)
> +{
> +  wide_int v;
> +  if (TYPE_UNSIGNED (type))
> +    v = cst.zext (TYPE_PRECISION (type));
> +  else
> +    v = cst.sext (TYPE_PRECISION (type));
> +
> +  return build_int_cst_wide (type, v.elt (0), v.elt (1));
>
> this needs an assert that the wide-int contains two HWIs.
As I said in an earlier mail, this is just transition code for when the
tree-level code goes in, but I see your point and will add the assert.


> +#ifndef GENERATOR_FILE
> +extern tree wide_int_to_tree (tree type, const wide_int &cst);
> +#endif
>
> why's that?  The header inclusion isn't guarded that way.
There was a problem building without it, but I will look into it.
> Stopping here, skipping to wide-int.[ch]
>
> +#define DEBUG_WIDE_INT
> +#ifdef DEBUG_WIDE_INT
> +  /* Debugging routines.  */
> +  static void debug_vw  (const char* name, int r, const wide_int& o0);
> +  static void debug_vwh (const char* name, int r, const wide_int &o0,
> +                        HOST_WIDE_INT o1);
>
> there is DEBUG_FUNCTION.
>
> +/* This is the maximal size of the buffer needed for dump.  */
> +const int MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
> +                + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT + 32);
>
> I'd prefer a target macro for this, not anything based off
> MAX_BITSIZE_MODE_ANY_INT just to allow no-op transition of a
> target to wide-ints (which IMHO should be the default for all targets,
> simplifying the patch).
>
> +wide_int
> +wide_int::from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode
> mode, bool *overflow)
> +{
> ...
> +#ifdef DEBUG_WIDE_INT
> +  if (dump_file)
> +    {
> +      char buf0[MAX];
> +      fprintf (dump_file, "%s: %s = " HOST_WIDE_INT_PRINT_HEX "\n",
> +              "wide_int::from_uhwi", result.dump (buf0), op0);
> +    }
> +#endif
>
> hm, ok ... I suppose the debug stuff (which looks very noisy) is going to be
> removed before this gets committed anywhere?
It is very noisy and I was just planning to turn it off.  But it is
very useful when there is a problem, so I was not planning to actually
remove it immediately.

>
> +wide_int
> +wide_int::from_int_cst (const_tree tcst)
> +{
> +#if 1
> +  wide_int result;
> +  tree type = TREE_TYPE (tcst);
>
> should just call wide_it::from_double_int (tree_to_double_int (tcst))
>
> As I said elsewhere I don't think only allowing precisions that are somewhere
> in any mode is good enough.  We need the actual precision here (everywhere).
I think the case has been made that what wide_int needs to store inside
it is the precision and bitsize, and that passing in the mode/tree type
is just a convenience.  I will make this change.  Then we can have
exposed constructors that just take bitsize and precision.


> Index: gcc/wide-int.h
> ===================================================================
> --- gcc/wide-int.h      (revision 0)
> +++ gcc/wide-int.h      (revision 0)
> @@ -0,0 +1,718 @@
> +/* Operations with very long integers.
> +   Copyright (C) 2012 Free Software Foundation, Inc.
> ...
> +
> +#ifndef GENERATOR_FILE
> +#include "hwint.h"
> +#include "options.h"
> +#include "tm.h"
> +#include "insn-modes.h"
> +#include "machmode.h"
> +#include "double-int.h"
> +#include <gmp.h>
> +#include "insn-modes.h"
>
> ugh.  Compare this with double-int.h which just needs <gmp.h> (and even that
> is a red herring) ...
>
> +#define wide_int_minus_one(MODE) (wide_int::from_shwi (-1, MODE))
> +#define wide_int_zero(MODE)      (wide_int::from_shwi (0, MODE))
> +#define wide_int_one(MODE)       (wide_int::from_shwi (1, MODE))
> +#define wide_int_two(MODE)       (wide_int::from_shwi (2, MODE))
> +#define wide_int_ten(MODE)       (wide_int::from_shwi (10, MODE))
>
> add static functions like
>
>    wide_int wide_int::zero (MODE)
ok
> +class wide_int {
> +  /* Internal representation.  */
> +  HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
> +  unsigned short len;
> +  enum machine_mode mode;
>
> you really need to store the precision here, not the mode.  We do not have
> a distinct mode for each of the tree constant precisions we see.  So what
> might work for RTL will later fail on trees, so better do it correct
> from the start
> (overloads using mode rather than precision may be acceptable - similarly
> I'd consider overloads for tree types).
discussed above.
> len seems redundant unless you want to optimize encoding.
> len == (precision + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT.
That is exactly what we do.  However, we are optimizing time, not space.
The value is always stored in compressed form, i.e. the same
representation as is used inside the CONST_WIDE_INTs and INT_CSTs; this
makes the transformation between them very fast.  And since we do this
a lot, it needs to be fast.  So the len is the number of HWIs needed to
represent the value, which is typically less than what would be implied
by the precision.
> +  enum Op {
> +    NONE,
>
> we don't have sth like this in double-int.h, why was the double-int mechanism
> not viable?
I have chosen to use enums for things rather than passing booleans.
> +  static wide_int from_int_cst (const_tree);
> +  static wide_int from_rtx (const_rtx, enum machine_mode);
>
> from_tree instead of from_int_cst?
ok
>    I miss from_pair.
why do you need this?
> +  HOST_WIDE_INT to_shwi (int prec) const;
>
> isn't the precision encoded?  If it's a different precision this seems to
> combine two primitives - is that really that convenient (if so that hints
> at some oddity IMHO).
This is an overload of the one that does not take the precision, and it
is that convenient.  I think it is well understood that there are
strange things in GCC.
> +  static wide_int max_value (const enum machine_mode mode, Op sgn);
> +  static wide_int max_value (const enum machine_mode mode, int prec, Op sgn);
> +  static wide_int min_value (const enum machine_mode mode, Op sgn);
> +  static wide_int min_value (const enum machine_mode mode, int prec, Op sgn);
Again, this is a style thing; boolean parameters are more likely to get
mangled.
> again prec seems redundant to me.  double-int uses 'bool uns', why is this
> different here?  What happens for NONE, TRUNC?
>
> +  inline unsigned short get_len () const;
> +  inline void set_len (unsigned int);
>
> the length should be an implementation detail, no?  Important is only
> the precision of the number.  So this shouldn' t be exported at least.
See above, where you raised the question about len before.
> +  inline int full_len () const;
>
> what's this?
It is used by dwarf2out; it is a little redundant with precision.
dwarf2out is one of the places in the compiler that really has to know
how many HWIs are used to represent the uncompressed number, because
that is what it is going to be dumping into the debug records.  It is
used in 10 places there, all for pretty much the same reason.
> +  inline enum machine_mode get_mode () const;
> +  inline void set_mode (enum machine_mode m);
>
> not sure if we want machine-mode/precision mutating operations (similar
> to length mutating operations).  I guess better not.
This is one of the ugly parts of abstraction: when you convert from/to
something else into/from wide-int, one of those two abstractions is
going to have to reveal its innards.  That is the use of this.

> +  wide_int copy (enum machine_mode mode) const;
>
> why does this need a mode argument?  Maybe 'copy' isn't a good name?
Perhaps; I guess at the tree level there are a lot of functions with
the word "force" in their name.  However, here you always get a copy.
> +  static inline HOST_WIDE_INT sext (HOST_WIDE_INT src, int prec);
> +  static inline HOST_WIDE_INT zext (HOST_WIDE_INT src, int prec);
>
> doesn't belong to wide_int::
I will move them to hwint.
> +#define wide_int_smin(OP0,OP1)  ((OP0).lts_p (OP1) ? (OP0) : (OP1))
> +#define wide_int_smax(OP0,OP1)  ((OP0).lts_p (OP1) ? (OP1) : (OP0))
> +#define wide_int_umin(OP0,OP1)  ((OP0).ltu_p (OP1) ? (OP0) : (OP1))
> +#define wide_int_umax(OP0,OP1)  ((OP0).ltu_p (OP1) ? (OP1) : (OP0))
>
> use inline functions
ok
> +void
> +wide_int::set_len (unsigned int l)
> +{
> +  gcc_assert (l < MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT);
>
> use gcc_checking_assert everywhere.
ok
>
> Overall it's not too bad.  The mode vs. precision thing needs to be sorted
As I said, I give in on this.
> out and I'd like TARGET_SUPPORTS_WIDE_INT not be introduced,
> even transitional.  Just make it possible to have the default be the
> existing size of CONST_DOUBLE.
I do not see how to do this.  We did what we could to minimize this,
but at some point a target is going to have to indicate that it can
behave in the new way, and in the new way things work differently.

Kenny
> Thanks,
> Richard.
Richard Sandiford Oct. 5, 2012, 5:29 p.m. UTC | #2
Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>> Or do you want to optimize encoding like for CONST_INT (and unlike
>> CONST_DOUBLE)?  I doubt the above packs nicely into rtx_def?
>> A general issue of it though - we waste 32bits on 64bit hosts in
>> rtx_def between the bits and the union.  Perfect place for num_elem
>> (if you really need it) and avoids another 4 byte hole.  Similar issue
>> exists in rtvec_def.
> I am going to let richard answer this.

I agree this would be nice, but (stating the obvious) only on targets
where there is a hole.  I suppose that translates to something like:

#if (HOST_BITS_PER_PTR > HOST_BITS_PER_INT \
     || HOST_BITS_PER_WIDE_INT > HOST_BITS_PER_INT)
#define RTL_LAYOUT 1
#else
#define RTL_LAYOUT 2
#endif

where HOST_BITS_PER_PTR would have to be set by a new configure-time test.
RTL_LAYOUT == 1 would put the CONST_WIDE_INT vector length in rtx_def itself,
probably as part of a new union, while RTL_LAYOUT == 2 would keep
it in its current place.

E.g. between:

  unsigned return_val : 1;

and:

  /* The first element of the operands of this rtx.
     The number of operands and their types are controlled
     by the `code' field, according to rtl.def.  */
  union u {

have:

#if RTL_LAYOUT == 1
  union {
    int num_elem;
  } u2;
#endif

And:

struct GTY((variable_size)) hwivec_def {
#if RTL_LAYOUT == 2
  int num_elem;
#endif
  HOST_WIDE_INT elem[1];
};

>> +#if TARGET_SUPPORTS_WIDE_INT
>> +
>> +/* Match CONST_*s that can represent compile-time constant integers.  */
>> +#define CASE_CONST_SCALAR_INT \
>> +   case CONST_INT: \
>> +   case CONST_WIDE_INT
>> +
>>
>> I regard this abstraction as existing mostly because the transition is not finished.
>> Not sure if seeing CASE_CONST_SCALAR_INT, CASE_CONST_UNIQUE (?),
>> CASE_CONST_ANY adds more to the confusion than spelling out all of them.
>> Isn't there sth like a tree-code-class for RTXen?  That would catch
>> CASE_CONST_ANY, no?

Not as things stand.  All constants are classified as RTX_CONST_OBJ,
which includes run-time constants like SYMBOL_REF as well.  Whether it
makes sense to change the classification is really a separate question
from this patch.  Besides, the other cases in the affected switch
statements aren't necessarily going to divide into classes either,
so we might still need CASE_CONST_ANY.

I'd like to keep these macros even after the conversion.  They make
it more obvious what property is being tested, instead of having various
subsets of the CONST_* codes included by name.  Reviewing the patch that
introduced the macros showed up instances where the existing switch
statements were wrong.

>> @@ -3081,6 +3081,8 @@ commutative_operand_precedence (rtx op)
>>     /* Constants always come the second operand.  Prefer "nice" constants.  */
>>     if (code == CONST_INT)
>>       return -8;
>> +  if (code == CONST_WIDE_INT)
>> +    return -8;
>>     if (code == CONST_DOUBLE)
>>       return -7;
>>     if (code == CONST_FIXED)
>>
>> I think it should be the same as CONST_DOUBLE which it "replaces".
>> Likewise below, that avoids code generation changes (which there should
>> be none for a target that is converted, right?).
> not clear because it does not really replace const double - remember 
> that const double still holds fp stuff.    another one for richard to 
> comment on.

I think it should be the same as CONST_INT.  Whether an integer
fits into a HWI or not shouldn't affect its precedence.

>> +/* This is the maximal size of the buffer needed for dump.  */
>> +const int MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
>> +                + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT + 32);
>>
>> I'd prefer a target macro for this, not anything based off
>> MAX_BITSIZE_MODE_ANY_INT just to allow no-op transition of a
>> target to wide-ints (which IMHO should be the default for all targets,
>> simplifying the patch).

Kenny didn't comment on this, but I disagree.  Part of the process of
"converting" a port is to wean it off any assumptions about the number
of HWIs.

Richard
Kenneth Zadeck Oct. 5, 2012, 5:53 p.m. UTC | #3
On 10/05/2012 01:29 PM, Richard Sandiford wrote:
>>> +/* This is the maximal size of the buffer needed for dump.  */
>>> >>+const int MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
>>> >>+                + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT + 32);
>>> >>
>>> >>I'd prefer a target macro for this, not anything based off
>>> >>MAX_BITSIZE_MODE_ANY_INT just to allow no-op transition of a
>>> >>target to wide-ints (which IMHO should be the default for all targets,
>>> >>simplifying the patch).
> Kenny didn't comment on this, but I disagree.  Part of the process of
> "converting" a port is to wean it off any assumptions about the number
> of HWIs.
i also disagree.    my patch is littered with places like this.
consider:

@@ -5180,13 +4808,11 @@ static rtx
  simplify_immed_subreg (enum machine_mode outermode, rtx op,
                 enum machine_mode innermode, unsigned int byte)
  {
-  /* We support up to 512-bit values (for V8DFmode).  */
    enum {
-    max_bitsize = 512,
      value_bit = 8,
      value_mask = (1 << value_bit) - 1
    };
-  unsigned char value[max_bitsize / value_bit];
+  unsigned char value [MAX_BITSIZE_MODE_ANY_MODE/value_bit];
    int value_start;
    int i;
    int elem;

Would you want to have a target macro for this also?
The code works no matter what; we do not need to be chasing this kind of 
problem all over the place.

kenny
Richard Biener Oct. 7, 2012, 12:47 p.m. UTC | #4
On Fri, Oct 5, 2012 at 6:34 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
> richard s,
> there are two comments that i deferred to you.   that have the word richard
> in them,
>
> richi,
> thank, i will start doing this now.
>
> kenny
>
> On 10/05/2012 09:49 AM, Richard Guenther wrote:
>>
>> On Fri, Oct 5, 2012 at 2:39 PM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>>
>>> Ok, I see where you are going.  Let me look at the patch again.
>>
>> * The introduction and use of CONST_SCALAR_INT_P could be split out
>>    (obvious and good)
>>
>> * DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
>>    defining that new RTX is orthogonal to providing the wide_int
>> abstraction
>>    for operating on CONST_INT and CONST_DOUBLE, right?
>>
>> @@ -144,6 +144,7 @@ static bool
>>   prefer_and_bit_test (enum machine_mode mode, int bitnum)
>>   {
>>     bool speed_p;
>> +  wide_int mask = wide_int::set_bit_in_zero (bitnum, mode);
>>
>> set_bit_in_zero ... why use this instead of
>> wide_int::zero (mode).set_bit (bitnum) that would match the old
>> double_int_zero.set_bit interface and be more composed of primitives?
>
> adding something like this was just based on usage.    We do this operation
> all over the place, why not make it concise and efficient.     The api is
> very bottom up.   I looked at what the clients were doing all over the place
> and added those functions.
>
> wide-int has both and_not and or_not.   double-int only has and_not, but
> there are a lot of places where you do or_nots, so we have or_not also.
>
>
>>     if (and_test == 0)
>>       {
>> @@ -164,8 +165,7 @@ prefer_and_bit_test (enum machine_mode m
>>       }
>>
>>     /* Fill in the integers.  */
>> -  XEXP (and_test, 1)
>> -    = immed_double_int_const (double_int_zero.set_bit (bitnum), mode);
>> +  XEXP (and_test, 1) = immed_wide_int_const (mask);
>>
>> I suppose immed_wide_int_const may create a CONST_INT, a CONST_DOUBLE
>> or a CONST_WIDE_INT?
>
> yes, on converted targets it builds const_ints and const_wide_ints and on
> unconverted targets it builds const_ints and const_doubles.    The reason i
> did not want to convert the targets is that the code that lives in the
> targets generally wants to use the values to create constants and this code
> is very very very target specific.
>
>>
>> +#if TARGET_SUPPORTS_WIDE_INT
>> +/* Returns 1 if OP is an operand that is a CONST_INT or CONST_WIDE_INT.
>> */
>> +int
>> +const_scalar_int_operand (rtx op, enum machine_mode mode)
>> +{
>>
>> why is it necessary to condition anything on TARGET_SUPPORTS_WIDE_INT?
>> It seems testing/compile coverage of this code will be bad ...
>>
>> Can't you simply map CONST_WIDE_INT to CONST_DOUBLE for targets
>
> The accessors for the two are completely different.    const doubles "know"
> that there are exactly two hwi's.   for const_wide_ints, you only know that
> the len is greater than 1.   anything with len 1 would be CONST_INT.
>
> In a modern c++ world, const_int would be a subtype of const_wide_int, but that
> is a huge patch.
>
>
>> not supporting CONST_WIDE_INT (what does it mean to "support"
>> CONST_WIDE_INT?  Does a target that supports CONST_WIDE_INT no
>> longer use CONST_DOUBLEs for integers?)
>
> yes, that is exactly what it means.
>
>> +  if (!CONST_WIDE_INT_P (op))
>> +    return 0;
>>
>> hm, that doesn't match the comment (CONST_INT is supposed to return 1 as
>> well).
>> The comment doesn't really tell what the function does it seems,
>
> I need some context here to reply.
>
>
>> +      int prec = GET_MODE_PRECISION (mode);
>> +      int bitsize = GET_MODE_BITSIZE (mode);
>> +
>> +      if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize)
>> +       return 0;
>>
>> it's mode argument is not documented.  And this doesn't even try to see if
>> the constant would fit the mode anyway (like if it's zero).  Not sure what
>> the function is for.
>
> I will upgrade the comments, they were inherited from the old version with
> the const_double changed to the const_wide_int
>
>
>> +       {
>> +         /* Multiword partial int.  */
>> +         HOST_WIDE_INT x
>> +           = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
>> +         return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1))
>> +                 == x);
>>
>> err - so now we have wide_int with a mode but we pass in another mode
>> and see if they have the same value?  Same value as-in signed value?
>> Which probably is why you need to rule out different size constants above?
>> Because you don't know whether to sign or zero extend?
>
> no it is worse than that.   I made the decision, which i think is correct,
> that we were not going to carry the mode inside const_wide_int.   The
> justification was that we do not do it for const_int.    There is a comment
> in the constructor for const_wide_int that says it would be so easy to just
> put this in.
>
> But, this is an old api that did not change.   only the internals changed to
> support const_wide_int.
>
>
>>
>> +/* Returns 1 if OP is an operand that is a CONST_WIDE_INT.  */
>> +int
>> +const_wide_int_operand (rtx op, enum machine_mode mode)
>> +{
>>
>> similar comments, common code should be factored out at least.
>> Instead of conditioning a whole set of new function on supports-wide-int
>> please add cases to the existing functions to avoid diverging in pieces
>> that both kind of targets support.
>
> in some places i do one and in some i do the other.   it really just depends
> on how much common code there was.   For these there was almost no common
> code.
>
>
>> @@ -2599,7 +2678,7 @@ constrain_operands (int strict)
>>                  break;
>>
>>                case 'n':
>> -               if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op))
>> +               if (CONST_SCALAR_INT_P (op))
>>                    win = 1;
>>
>> factoring out this changes would really make the patch less noisy.
>
> i will this weekend.
>
>> skipping to rtl.h bits
>>
>> +struct GTY((variable_size)) hwivec_def {
>> +  int num_elem;                /* number of elements */
>> +  HOST_WIDE_INT elem[1];
>> +};
>>
>> num_elem seems redundant and computable from GET_MODE_SIZE.
>
> NOT AT ALL.   CONST_WIDE_INT and TREE_INT_CST only have enough elements in
> the array to hold the actual value, they do not have enough elements to hold
> the mode or type.
> in real life almost all integer constants of any type are very small.    in
> the rtl world, we almost never actually create any CONST_WIDE_INT outside of
> the test cases, because CONST_INT handles the cases, and i assume that at
> the tree level, almost every int cst will have an len of 1.
>
>
>
>> Or do you want to optimize encoding like for CONST_INT (and unlike
>> CONST_DOUBLE)?  I doubt the above packs nicely into rtx_def?
>> A general issue of it though - we waste 32bits on 64bit hosts in
>> rtx_def between the bits and the union.  Perfect place for num_elem
>> (if you really need it) and avoids another 4 byte hole.  Similar issue
>> exists in rtvec_def.
>
> I am going to let richard answer this.
>
>
>> back to where I was
>>
>> @@ -645,6 +673,10 @@ iterative_hash_rtx (const_rtx x, hashval
>>         return iterative_hash_object (i, hash);
>>       case CONST_INT:
>>         return iterative_hash_object (INTVAL (x), hash);
>> +    case CONST_WIDE_INT:
>> +      for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
>> +       hash = iterative_hash_object (CONST_WIDE_INT_ELT (x, i), hash);
>> +      return hash;
>>
>> I see CONST_DOUBLEs value is not hashed.  Why hash wide-ints value?
>
> There are a lot of ugly things that one uncovers when one looks at code this
> old.
> the idea was to bring whatever i touched up to best practice standards.
>
>
>> Seeing
>>
>> +#define HWI_GET_NUM_ELEM(HWIVEC)       ((HWIVEC)->num_elem)
>> +#define HWI_PUT_NUM_ELEM(HWIVEC, NUM)  ((HWIVEC)->num_elem = (NUM))
>>
>> skipping to
>>
>> +#if TARGET_SUPPORTS_WIDE_INT
>> +  {
>> +    rtx value = const_wide_int_alloc (len);
>> +    unsigned int i;
>> +
>> +    /* It is so tempting to just put the mode in here.  Must control
>> +       myself ... */
>> +    PUT_MODE (value, VOIDmode);
>> +    HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (value), len);
>>
>> why is NUM_ELEM not set in const_wide_int_alloc, and only there?
>>
>> +extern rtx rtx_alloc_stat_v (RTX_CODE MEM_STAT_DECL, int);
>> +#define rtx_alloc_v(c, SZ) rtx_alloc_stat_v (c MEM_STAT_INFO, SZ)
>> +#define const_wide_int_alloc(NWORDS)                           \
>> +  rtx_alloc_v (CONST_WIDE_INT,                                 \
>> +              (sizeof (struct hwivec_def)                      \
>> +               + ((NWORDS)-1) * sizeof (HOST_WIDE_INT)))       \
>>
>> because it's a macro(?).  Ugh.
>>
>> I see CONST_DOUBLE for ints and CONST_WIDE_INT for ints use
>> is mutually exclusive.  Good.  What's the issue with converting targets?
>> Just retain the existing 2 * HWI limit by default.
>
> I have answered this elsewhere, the constant generation code on some
> platforms is tricky and i felt that would require knowledge of the port that
> i did not have.   it is not just a bunch of c code, it is also changing
> patterns in the md files.
>
>
>> +#if TARGET_SUPPORTS_WIDE_INT
>> +
>> +/* Match CONST_*s that can represent compile-time constant integers.  */
>> +#define CASE_CONST_SCALAR_INT \
>> +   case CONST_INT: \
>> +   case CONST_WIDE_INT
>> +
>>
>> I regard this abstraction as existing mostly because the transition is
>> not finished.
>> Not sure if seeing CASE_CONST_SCALAR_INT, CASE_CONST_UNIQUE (?),
>> CASE_CONST_ANY adds more to the confusion than spelling out all of them.
>> Isn't there sth like a tree-code-class for RTXen?  That would catch
>> CASE_CONST_ANY, no?
>
> however, those abstractions are already in the trunk.   They were put in
> earlier to make this patch smaller.
>
>> @@ -3081,6 +3081,8 @@ commutative_operand_precedence (rtx op)
>>     /* Constants always come the second operand.  Prefer "nice" constants.
>> */
>>     if (code == CONST_INT)
>>       return -8;
>> +  if (code == CONST_WIDE_INT)
>> +    return -8;
>>     if (code == CONST_DOUBLE)
>>       return -7;
>>     if (code == CONST_FIXED)
>>
>> I think it should be the same as CONST_DOUBLE which it "replaces".
>> Likewise below, that avoids code generation changes (which there should
>> be none for a target that is converted, right?).
>
> not clear because it does not really replace const double - remember that
> const double still holds fp stuff.    another one for richard to comment on.
>
>> @@ -5351,6 +5355,20 @@ split_double (rtx value, rtx *first, rtx
>>              }
>>          }
>>       }
>> +  else if (GET_CODE (value) == CONST_WIDE_INT)
>> +    {
>> +      gcc_assert (CONST_WIDE_INT_NUNITS (value) == 2);
>> +      if (WORDS_BIG_ENDIAN)
>> +       {
>> +         *first = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
>> +         *second = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
>> +       }
>> +      else
>> +       {
>> +         *first = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
>> +         *second = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
>> +       }
>> +    }
>>
>> scary ;)
>
> unbelievably scary.    This is one of those places where i really do not
> know what the code is doing and so i just put in something that was
> minimally correct.    when we get some code where the assert hits then i
> will figure it out.
>
>
>> Index: gcc/sched-vis.c
>> ===================================================================
>> --- gcc/sched-vis.c     (revision 191978)
>> +++ gcc/sched-vis.c     (working copy)
>> @@ -444,14 +444,31 @@ print_value (char *buf, const_rtx x, int
>> ...
>>       case CONST_DOUBLE:
>> -      if (FLOAT_MODE_P (GET_MODE (x)))
>> -       real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0,
>> 1);
>> -      else
>> +     if (TARGET_SUPPORTS_WIDE_INT == 0 && !FLOAT_MODE_P (GET_MODE (x)))
>>          sprintf (t,
>>                   "<" HOST_WIDE_INT_PRINT_HEX "," HOST_WIDE_INT_PRINT_HEX
>> ">",
>>                   (unsigned HOST_WIDE_INT) CONST_DOUBLE_LOW (x),
>>                   (unsigned HOST_WIDE_INT) CONST_DOUBLE_HIGH (x));
>> +      else
>> +       real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0,
>> 1);
>>         cur = safe_concat (buf, cur, t);
>>         break;
>>
>> I don't see that this hunk is needed?  In fact it's more confusing this
>> way.
>
> you are likely right.   this is really there to say that this code would go
> away if
> (when) all targets support wide int,  but i will get rid of it.
>
>> +/* If the target supports integers that are wider than two
>> +   HOST_WIDE_INTs on the host compiler, then the target should define
>> +   TARGET_SUPPORTS_WIDE_INT and make the appropriate fixups.
>> +   Otherwise the compiler really is not robust.  */
>> +#ifndef TARGET_SUPPORTS_WIDE_INT
>> +#define TARGET_SUPPORTS_WIDE_INT 0
>> +#endif
>>
>> I wonder what are the appropriate fixups?
>
> it was in the mail of the original patch.    i will in the next round,
> transfer a cleaned up version of that into the doc for this target hook.
>
>>
>> -/* Return a CONST_INT or CONST_DOUBLE corresponding to target reading
>> -   GET_MODE_BITSIZE (MODE) bits from string constant STR.  */
>> +/* Return a CONST_INT, CONST_WIDE_INT, or CONST_DOUBLE corresponding
>> +   to target reading GET_MODE_BITSIZE (MODE) bits from string constant
>> +   STR.  */
>>
>> maybe being less specific in this kind of comments and just refering to
>> RTX integer constants would be better?
>
> will do, i have done some like that.
>
>> ... skip ...
>>
>> Index: gcc/hwint.c
>> ===================================================================
>> --- gcc/hwint.c (revision 191978)
>> +++ gcc/hwint.c (working copy)
>> @@ -112,6 +112,29 @@ ffs_hwi (unsigned HOST_WIDE_INT x)
>>   int
>>   popcount_hwi (unsigned HOST_WIDE_INT x)
>>   {
>> +  /* Compute the popcount of a HWI using the algorithm from
>> +     Hacker's Delight.  */
>> +#if HOST_BITS_PER_WIDE_INT == 32
>>
>> separate patch please.  With a rationale.
>
> fine.
>
>> ...
>>
>> +/* Constructs tree in type TYPE from with value given by CST.  Signedness
>> +   of CST is assumed to be the same as the signedness of TYPE.  */
>> +
>> +tree
>> +wide_int_to_tree (tree type, const wide_int &cst)
>> +{
>> +  wide_int v;
>> +  if (TYPE_UNSIGNED (type))
>> +    v = cst.zext (TYPE_PRECISION (type));
>> +  else
>> +    v = cst.sext (TYPE_PRECISION (type));
>> +
>> +  return build_int_cst_wide (type, v.elt (0), v.elt (1));
>>
>> this needs an assert that the wide-int contains two HWIs.
>
> as i said in an earlier mail, this is just transition code for when the tree
> level code goes in.
> but i see your point and will add the assert.
>
>
>
>> +#ifndef GENERATOR_FILE
>> +extern tree wide_int_to_tree (tree type, const wide_int &cst);
>> +#endif
>>
>> why's that?  The header inclusion isn't guarded that way.
>
> there was a problem building without it.   but i will look into it.
>
>> Stopping here, skipping to wide-int.[ch]
>>
>> +#define DEBUG_WIDE_INT
>> +#ifdef DEBUG_WIDE_INT
>> +  /* Debugging routines.  */
>> +  static void debug_vw  (const char* name, int r, const wide_int& o0);
>> +  static void debug_vwh (const char* name, int r, const wide_int &o0,
>> +                        HOST_WIDE_INT o1);
>>
>> there is DEBUG_FUNCTION.
>>
>> +/* This is the maximal size of the buffer needed for dump.  */
>> +const int MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
>> +                + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT +
>> 32);
>>
>> I'd prefer a target macro for this, not anything based off
>> MAX_BITSIZE_MODE_ANY_INT just to allow no-op transition of a
>> target to wide-ints (which IMHO should be the default for all targets,
>> simplifying the patch).
>>
>> +wide_int
>> +wide_int::from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode
>> mode, bool *overflow)
>> +{
>> ...
>> +#ifdef DEBUG_WIDE_INT
>> +  if (dump_file)
>> +    {
>> +      char buf0[MAX];
>> +      fprintf (dump_file, "%s: %s = " HOST_WIDE_INT_PRINT_HEX "\n",
>> +              "wide_int::from_uhwi", result.dump (buf0), op0);
>> +    }
>> +#endif
>>
>> hm, ok ... I suppose the debug stuff (which looks very noisy) is going to
>> be
>> removed before this gets committed anywhere?
>
> it is very noisy and i was just planning to turn it off.    but it is very
> useful when there is a problem so i was not planning to actually remove it
> immediately.
>
>
>>
>> +wide_int
>> +wide_int::from_int_cst (const_tree tcst)
>> +{
>> +#if 1
>> +  wide_int result;
>> +  tree type = TREE_TYPE (tcst);
>>
>> should just call wide_it::from_double_int (tree_to_double_int (tcst))
>>
>> As I said elsewhere I don't think only allowing precisions that are
>> somewhere
>> in any mode is good enough.  We need the actual precision here
>> (everywhere).
>
> i think that the case has been made that what wide int needs to store inside
> it is the precision and bitsize and that passing in the mode/tree type is
> just a convenience.     I will make this change. Then we can have exposed
> constructors that just take bitsize and precision.
>
>
>
>> Index: gcc/wide-int.h
>> ===================================================================
>> --- gcc/wide-int.h      (revision 0)
>> +++ gcc/wide-int.h      (revision 0)
>> @@ -0,0 +1,718 @@
>> +/* Operations with very long integers.
>> +   Copyright (C) 2012 Free Software Foundation, Inc.
>> ...
>> +
>> +#ifndef GENERATOR_FILE
>> +#include "hwint.h"
>> +#include "options.h"
>> +#include "tm.h"
>> +#include "insn-modes.h"
>> +#include "machmode.h"
>> +#include "double-int.h"
>> +#include <gmp.h>
>> +#include "insn-modes.h"
>>
>> ugh.  Compare this with double-int.h which just needs <gmp.h> (and even
>> that
>> is a red herring) ...
>>
>> +#define wide_int_minus_one(MODE) (wide_int::from_shwi (-1, MODE))
>> +#define wide_int_zero(MODE)      (wide_int::from_shwi (0, MODE))
>> +#define wide_int_one(MODE)       (wide_int::from_shwi (1, MODE))
>> +#define wide_int_two(MODE)       (wide_int::from_shwi (2, MODE))
>> +#define wide_int_ten(MODE)       (wide_int::from_shwi (10, MODE))
>>
>> add static functions like
>>
>>    wide_int wide_int::zero (MODE)
>
> ok
>
>> +class wide_int {
>> +  /* Internal representation.  */
>> +  HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
>> +  unsigned short len;
>> +  enum machine_mode mode;
>>
>> you really need to store the precision here, not the mode.  We do not have
>> a distinct mode for each of the tree constant precisions we see.  So what
>> might work for RTL will later fail on trees, so better do it correct
>> from the start
>> (overloads using mode rather than precision may be acceptable - similarly
>> I'd consider overloads for tree types).
>
> discussed above.
>
>> len seems redundant unless you want to optimize encoding.
>> len == (precision + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT.
>
> that is exactly what we do.   However, we are optimizing time, not space.
> the value is always stored in compressed form, i.e. the same representation
> as is used inside the CONST_WIDE_INTs and INT_CSTs. this makes the
> transformation between them very fast.   And since we do this a lot, it
> needs to be fast.   So the len is the number of HWIs needed to represent the
> value which is typically less than what would be implied by the precision.

But doesn't this require a sign?  Otherwise how do you encode TImode 0xffffffff?
Would len be 1 here?  (and talking about CONST_WIDE_INT vs CONST_INT,
wouldn't CONST_INT be used anyway for all ints we can encode "small"?  Or
is it (hopefully) the goal to replace CONST_INT with CONST_WIDE_INT
everywhere?)  Or do you say wide_int is always "signed', thus TImode 0xffffffff
needs len == 2 and an explicit zero upper HWI word?  Or do you say wide_int
is always "unsigned", thus TImode -1 needs len == 2?  Noting that double-ints
were supposed to be twos-complement (thus, 'unsigned') numbers having
implicit non-zero bits sounds error-prone.

That said - I don't see any value in optimizing storage for short-lived
(as you say) wide-ints (apart from limiting it to the mode size).  For
example mutating operations on wide-ints are not really possible
if you need to extend storage.

>> +  enum Op {
>> +    NONE,
>>
>> we don't have sth like this in double-int.h, why was the double-int
>> mechanism
>> not viable?
>
> i have chosen to use enums for things rather than passing booleans.

But it's bad to use an enum with 4 values for a thing that can only possibly
expect two.  You cannot optimize tests as easily.  Please retain the bool uns
parameters instead.

Richard.
Kenneth Zadeck Oct. 7, 2012, 1:11 p.m. UTC | #5
On 10/07/2012 08:47 AM, Richard Guenther wrote:
>>> len seems redundant unless you want to optimize encoding.
>>> >>len == (precision + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT.
>> >
>> >that is exactly what we do.   However, we are optimizing time, not space.
>> >the value is always stored in compressed form, i.e the same representation
>> >as is used inside the CONST_WIDE_INTs and INT_CSTs. this makes the
>> >transformation between them very fast.   And since we do this a lot, it
>> >needs to be fast.   So the len is the number of HWIs needed to represent the
>> >value which is typically less than what would be implied by the precision.
> But doesn't this require a sign?  Otherwise how do you encode TImode 0xffffffff?
> Would len be 1 here?  (and talking about CONST_WIDE_INT vs CONST_INT,
> wouldn't CONST_INT be used anyway for all ints we can encode "small"?  Or
> is it (hopefully) the goal to replace CONST_INT with CONST_WIDE_INT
> everywhere?)  Or do you say wide_int is always "signed', thus TImode 0xffffffff
> needs len == 2 and an explicit zero upper HWI word?
the compression of this has len == 2 with the explicit upper being 0.

>   Or do you say wide_int
> is always "unsigned", thus TImode -1 needs len == 2?  Noting that double-ints
> were supposed to be twos-complement (thus, 'unsigned') numbers having
> implicit non-zero bits sounds error-prone.
>
> That said - I don't see any value in optimizing storage for short-lived
> (as you say) wide-ints (apart from limiting it to the mode size).  For
> example mutating operations on wide-ints are not really possible
> if you need to extend storage.
the compression used is independent of whether it is signed or unsigned,  
but the compression "looks" signed.   i.e. you do not represent the 
upper hwis if they would just be a sign extension of the top hwi that is 
represented.   However, this does not imply anything about the math that 
was used to set those bits that way, and you can always go back to the 
full rep independent of the sign.

I do not have any long term plans to merge CONST_INT into 
CONST_WIDE_INT.   It would be a huge change in the ports and would be 
fairly space inefficient.  My guess is that in real code, less than 1% 
of real constants will have a length greater than 1 even on a 32 bit 
host.   CONST_INT is very space efficient. This could have been 
mentioned as part of Richard's response to your comment about the way we 
represent the CONST_WIDE_INTs.   In practice, there are almost none of 
them anyway.

In fact, you could argue that the tree level did it wrong (not that i am 
suggesting to change this).   But it makes me think what was going on 
when the decision to make TYPE_PRECISION be an INT_CST rather than just 
a HWI was made.   For that there is an implication that it could never 
take more than a HWI since no place in the code even checks 
TREE_INT_CST_HIGH for these.
>>> >>+  enum Op {
>>> >>+    NONE,
>>> >>
>>> >>we don't have sth like this in double-int.h, why was the double-int
>>> >>mechanism
>>> >>not viable?
>> >
>> >i have chosen to use enums for things rather than passing booleans.
> But it's bad to use an enum with 4 values for a thing that can only possibly
> expect two.  You cannot optimize tests as easily.  Please retain the bool uns
> parameters instead.
I am breaking it into two enums.
Richard Biener Oct. 7, 2012, 1:19 p.m. UTC | #6
On Sun, Oct 7, 2012 at 3:11 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>
> On 10/07/2012 08:47 AM, Richard Guenther wrote:
>>>>
>>>> len seems redundant unless you want to optimize encoding.
>>>> >>len == (precision + HOST_BITS_PER_WIDE_INT - 1) /
>>>> >> HOST_BITS_PER_WIDE_INT.
>>>
>>> >
>>> >that is exactly what we do.   However, we are optimizing time, not
>>> > space.
>>> >the value is always stored in compressed form, i.e the same
>>> > representation
>>> >as is used inside the CONST_WIDE_INTs and INT_CSTs. this makes the
>>> >transformation between them very fast.   And since we do this a lot, it
>>> >needs to be fast.   So the len is the number of HWIs needed to represent
>>> > the
>>> >value which is typically less than what would be implied by the
>>> > precision.
>>
>> But doesn't this require a sign?  Otherwise how do you encode TImode
>> 0xffffffff?
>> Would len be 1 here?  (and talking about CONST_WIDE_INT vs CONST_INT,
>> wouldn't CONST_INT be used anyway for all ints we can encode "small"?  Or
>> is it (hopefully) the goal to replace CONST_INT with CONST_WIDE_INT
>> everywhere?)  Or do you say wide_int is always "signed', thus TImode
>> 0xffffffff
>> needs len == 2 and an explicit zero upper HWI word?
>
> the compression of this has len ==2 with the explicit upper being 0.
>
>
>>   Or do you say wide_int
>> is always "unsigned", thus TImode -1 needs len == 2?  Noting that
>> double-ints
>> were supposed to be twos-complement (thus, 'unsigned') numbers having
>> implicit non-zero bits sounds error-prone.
>>
>> That said - I don't see any value in optimizing storage for short-lived
>> (as you say) wide-ints (apart from limiting it to the mode size).  For
>> example mutating operations on wide-ints are not really possible
>> if you need to extend storage.
>
> the compression used is independent of whether it is sign or unsigned.  but
> the compression "looks" signed.   i.e. you do not represent the upper hwi if
> they would just be a sign extension of the top hwi that is represented.
> However, this does not imply anything about the math that was used to set
> those bits that way, and you can always go back to the full rep independent
> of the sign.
>
> I do not have any long term plans to merge CONST_INT into CONST_WIDE_INT.
> It would be a huge change in the ports and would be fairly space
> inefficient.  My guess is that in real code, less than 1% of real constants
> will have a length greater than 1 even on a 32 bit host.   CONST_INT is very
> space efficient. This could have been mentioned as part of Richard's
> response to your comment about the way we represent the CONST_WIDE_INTs.
> In practice, there are almost none of them anyway.
>
> In fact, you could argue that the tree level did it wrong (not that i am
> suggesting to change this).   But it makes me think what was going on when
> the decision to make TYPE_PRECISION be an INT_CST rather than just a HWI was
> made.   For that there is an implication that it could never take more than
> a HWI since no place in the code even checks TREE_INT_CST_HIGH for these.

Well - on the tree level we now always have two HWIs for all INTEGER_CSTs.  If
we can, based on the size of the underlying mode, reduce that to one
HWI we already
win something.  If we add an explicit length to allow a smaller
encoding for larger modes
(tree_base conveniently has an available 'int' for this ...) then we'd
win in more cases.
Thus, is CONST_INT really necessarily better than optimized CONST_WIDE_INT
storage?

Richard.

>>>> >>+  enum Op {
>>>> >>+    NONE,
>>>> >>
>>>> >>we don't have sth like this in double-int.h, why was the double-int
>>>> >>mechanism
>>>> >>not viable?
>>>
>>> >
>>> >i have chosen to use enums for things rather than passing booleans.
>>
>> But it's bad to use an enum with 4 values for a thing that can only
>> possibly
>> expect two.  You cannot optimize tests as easily.  Please retain the bool
>> uns
>> parameters instead.
>
> I am breaking it into two enums.
Kenneth Zadeck Oct. 7, 2012, 2:58 p.m. UTC | #7
On 10/07/2012 09:19 AM, Richard Guenther wrote:
>> >In fact, you could argue that the tree level did it wrong (not that i am
>> >suggesting to change this).   But it makes me think what was going on when
>> >the decision to make TYPE_PRECISION be an INT_CST rather than just a HWI was
>> >made.   For that there is an implication that it could never take more than
>> >a HWI since no place in the code even checks TREE_INT_CST_HIGH for these.
> Well - on the tree level we now always have two HWIs for all INTEGER_CSTs.  If
> we can, based on the size of the underlying mode, reduce that to one
> HWI we already
> win something.  If we add an explicit length to allow a smaller
> encoding for larger modes
> (tree_base conveniently has an available 'int' for this ...) then we'd
> win in more cases.
> Thus, is CONST_INT really necessarily better than optimized CONST_WIDE_INT
> storage?
i have to admit that looking at these data structures gives me a 
headache.  This all looks like something that Rube Goldberg would have 
invented had he done object-oriented design.  (Richard S did not know who 
Rube Goldberg was when i mentioned the name to him a few years ago, since 
this is an American thing, but the British had their own equivalent and 
I assume the Germans do too.)

i did the first cut of changing the rtl level structure and Richard S 
threw up on it and suggested what is there now, which happily (for me) i 
was able to get mike to implement.

mike also did the tree level version of the data structures for me.   i 
will make sure he used that left over length field.

The bottom line is that you most likely just save the length, but that 
is a big percentage of something this small.  Both of the rtl ints have a 
mode, so if we can make that change later, it will be space neutral.
Richard Biener Oct. 8, 2012, 9:27 a.m. UTC | #8
On Sun, Oct 7, 2012 at 4:58 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>
> On 10/07/2012 09:19 AM, Richard Guenther wrote:
>>>
>>> >In fact, you could argue that the tree level did it wrong (not that i am
>>> >suggesting to change this).   But it makes me think what was going on
>>> > when
>>> >the decision to make TYPE_PRECISION be an INT_CST rather than just a HWI
>>> > was
>>> >made.   For that there is an implication that it could never take more
>>> > than
>>> >a HWI since no place in the code even checks TREE_INT_CST_HIGH for
>>> > these.
>>
>> Well - on the tree level we now always have two HWIs for all INTEGER_CSTs.
>> If
>> we can, based on the size of the underlying mode, reduce that to one
>> HWI we already
>> win something.  If we add an explicit length to allow a smaller
>> encoding for larger modes
>> (tree_base conveniently has an available 'int' for this ...) then we'd
>> win in more cases.
>> Thus, is CONST_INT really necessarily better than optimized CONST_WIDE_INT
>> storage?
>
> i have to admit, that looking at these data structures gives me a headache.
> This all looks like something that Rube Goldberg would have invented had he
> done object oriented design  (Richard S did not know who Rube Goldberg when
> i mentioned this name to him a few years ago since this is an American
> thing, but the british had their own equivalent and I assume the germans do
> too.).
>
> i did the first cut of changing the rtl level structure and Richard S threw
> up on it and suggested what is there now, which happily (for me) i was able
> to get mike to implement.
>
> mike also did the tree level version of the data structures for me.   i will
> make sure he used that left over length field.
>
> The bottom line is that you most likely just save the length, but that is a
> big percent of something this small.  Both of rtl ints have a mode, so if we
> can make that change later, it will be space neutral.

Yes.

Btw, Richard's idea of conditionally placing the length field in
rtx_def looks like overkill to me.  These days we'd merely want to
optimize for 64-bit hosts, thus unconditionally adding a 32-bit
field to rtx_def looks ok to me (you can wrap that inside a union to
allow both descriptive names and eventual different uses - see what
I've done to tree_base).

Richard.
Nathan Froyd Oct. 8, 2012, 3:01 p.m. UTC | #9
----- Original Message -----
> Btw, as for Richards idea of conditionally placing the length field
> in
> rtx_def looks like overkill to me.  These days we'd merely want to
> optimize for 64bit hosts, thus unconditionally adding a 32 bit
> field to rtx_def looks ok to me (you can wrap that inside a union to
> allow both descriptive names and eventual different use - see what
> I've done to tree_base)

IMHO, unconditionally adding that field isn't "optimize for 64-bit
hosts", but "gratuitously make one of the major compiler data
structures bigger on 32-bit hosts".  Not everybody can cross-compile
from a 64-bit host.  And even those people who can don't necessarily
want to.  Please try to consider what's best for all the people who
use GCC, not just the cases you happen to be working with every day.

-Nathan
Robert Dewar Oct. 8, 2012, 3:11 p.m. UTC | #10
On 10/8/2012 11:01 AM, Nathan Froyd wrote:
> ----- Original Message -----
>> Btw, as for Richards idea of conditionally placing the length field
>> in
>> rtx_def looks like overkill to me.  These days we'd merely want to
>> optimize for 64bit hosts, thus unconditionally adding a 32 bit
>> field to rtx_def looks ok to me (you can wrap that inside a union to
>> allow both descriptive names and eventual different use - see what
>> I've done to tree_base)
>
> IMHO, unconditionally adding that field isn't "optimize for 64-bit
> hosts", but "gratuitously make one of the major compiler data
> structures bigger on 32-bit hosts".  Not everybody can cross-compile
> from a 64-bit host.  And even those people who can don't necessarily
> want to.  Please try to consider what's best for all the people who
> use GCC, not just the cases you happen to be working with every day.

I think that's reasonable in general, but as time goes on, and every
$300 laptop is 64-bit capable, one should not go TOO far out of the
way trying to make sure we can compile everything on a 32-bit machine.
After all, we don't try to ensure we can compile on a 16-bit machine
though when I helped write the Realia COBOL compiler, it was a major
consideration that we had to be able to compile arbitrarily large
programs on a 32-bit machine with one megabyte of memory. That was
achieved at the time, but is hardly relevant now!
Richard Biener Oct. 8, 2012, 4:18 p.m. UTC | #11
On Mon, Oct 8, 2012 at 5:01 PM, Nathan Froyd <froydnj@mozilla.com> wrote:
> ----- Original Message -----
>> Btw, as for Richards idea of conditionally placing the length field
>> in
>> rtx_def looks like overkill to me.  These days we'd merely want to
>> optimize for 64bit hosts, thus unconditionally adding a 32 bit
>> field to rtx_def looks ok to me (you can wrap that inside a union to
>> allow both descriptive names and eventual different use - see what
>> I've done to tree_base)
>
> IMHO, unconditionally adding that field isn't "optimize for 64-bit
> hosts", but "gratuitously make one of the major compiler data
> structures bigger on 32-bit hosts".  Not everybody can cross-compile
> from a 64-bit host.  And even those people who can don't necessarily
> want to.  Please try to consider what's best for all the people who
> use GCC, not just the cases you happen to be working with every day.

The challenge would of course be to have the overhead only for a minority
of all RTX codes.  After all, those 32 bits are free to be used by every one
of them.

And I would not consider RTX a 'major compiler data structure' - of course
that makes the whole issue somewhat moot ;)

Richard.

> -Nathan
Richard Sandiford Oct. 8, 2012, 7:55 p.m. UTC | #12
Robert Dewar <dewar@adacore.com> writes:
> On 10/8/2012 11:01 AM, Nathan Froyd wrote:
>> ----- Original Message -----
>>> Btw, as for Richards idea of conditionally placing the length field
>>> in
>>> rtx_def looks like overkill to me.  These days we'd merely want to
>>> optimize for 64bit hosts, thus unconditionally adding a 32 bit
>>> field to rtx_def looks ok to me (you can wrap that inside a union to
>>> allow both descriptive names and eventual different use - see what
>>> I've done to tree_base)
>>
>> IMHO, unconditionally adding that field isn't "optimize for 64-bit
>> hosts", but "gratuitously make one of the major compiler data
>> structures bigger on 32-bit hosts".  Not everybody can cross-compile
>> from a 64-bit host.  And even those people who can don't necessarily
>> want to.  Please try to consider what's best for all the people who
>> use GCC, not just the cases you happen to be working with every day.
>
>> I think that's reasonable in general, but as time goes on, and every
> $300 laptop is 64-bit capable, one should not go TOO far out of the
> way trying to make sure we can compile everything on a 32-bit machine.

It's not 64-bit machine vs. 32-bit machine.  It's an LP64 ABI vs.
an ILP32 ABI.  HJ & co. have put considerable effort into developing
the x32 ABI for x86_64 precisely because ILP32 is still useful for
64-bit machines.  Just as it was for MIPS when SGI invented n32
(which is still useful now).  I believe 64-bit SPARC has a similar
thing, and no doubt other architectures do too.

After all, there shouldn't be much need for more than 2GB of virtual
address space in an AVR cross compiler.  So why pay the cache penalty
of 64-bit pointers and longs (GCC generally tries to avoid using "long"
directly) when a 32-bit pointer will do?

Many years ago, I moved the HOST_WIDE_INT fields out of rtunion
and into the main rtx_def union because it produced a significant
speed-up on n32 IRIX.  That was before tree-level optimisation,
but I don't think we've really pruned that much RTL optimisation
since then, so I'd be surprised if much has changed.

Richard
Richard Biener Oct. 9, 2012, 7:08 a.m. UTC | #13
On Mon, Oct 8, 2012 at 9:55 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Robert Dewar <dewar@adacore.com> writes:
>> On 10/8/2012 11:01 AM, Nathan Froyd wrote:
>>> ----- Original Message -----
>>>> Btw, as for Richards idea of conditionally placing the length field
>>>> in
>>>> rtx_def looks like overkill to me.  These days we'd merely want to
>>>> optimize for 64bit hosts, thus unconditionally adding a 32 bit
>>>> field to rtx_def looks ok to me (you can wrap that inside a union to
>>>> allow both descriptive names and eventual different use - see what
>>>> I've done to tree_base)
>>>
>>> IMHO, unconditionally adding that field isn't "optimize for 64-bit
>>> hosts", but "gratuitously make one of the major compiler data
>>> structures bigger on 32-bit hosts".  Not everybody can cross-compile
>>> from a 64-bit host.  And even those people who can don't necessarily
>>> want to.  Please try to consider what's best for all the people who
>>> use GCC, not just the cases you happen to be working with every day.
>>
>> I think that's reasonable in general, but as time goes on, and every
>> $300 laptop is 64-bit capable, one should not go TOO far out of the
>> way trying to make sure we can compile everything on a 32-bit machine.
>
> It's not 64-bit machine vs. 32-bit machine.  It's an LP64 ABI vs.
> an ILP32 ABI.  HJ & co. have put considerable effort into developing
> the x32 ABI for x86_64 precisely because ILP32 is still useful for
> 64-bit machines.  Just as it was for MIPS when SGI invented n32
> (which is still useful now).  I believe 64-bit SPARC has a similar
> thing, and no doubt other architectures do too.
>
> After all, there shouldn't be much need for more than 2GB of virtual
> address space in an AVR cross compiler.  So why pay the cache penalty
> of 64-bit pointers and longs (GCC generally tries to avoid using "long"
> directly) when a 32-bit pointer will do?
>
> Many years ago, I moved the HOST_WIDE_INT fields out of rtunion
> and into the main rtx_def union because it produced a significant
> speed-up on n32 IRIX.  That was before tree-level optimisation,

But of course that doesn't change the alignment requirement of
that union, which is the issue on lp64 hosts.  Also, as HOST_WIDE_INT
is 64 bits for most ilp32 targets as well, it doesn't necessarily save
ilp32 hosts from having this issue (unless long long isn't aligned
to 8 bytes for them).  So what I said is that arranging for the 32-bit
hole to contain useful data for most RTXen should be possible.

Richard.

> but I don't think we've really pruned that much RTL optimisation
> since then, so I'd be surprised if much has changed.

Patch

Index: gcc/sched-vis.c
===================================================================
--- gcc/sched-vis.c     (revision 191978)
+++ gcc/sched-vis.c     (working copy)
@@ -444,14 +444,31 @@  print_value (char *buf, const_rtx x, int
...
     case CONST_DOUBLE:
-      if (FLOAT_MODE_P (GET_MODE (x)))
-       real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0, 1);
-      else
+     if (TARGET_SUPPORTS_WIDE_INT == 0 && !FLOAT_MODE_P (GET_MODE (x)))
        sprintf (t,
                 "<" HOST_WIDE_INT_PRINT_HEX "," HOST_WIDE_INT_PRINT_HEX ">",
                 (unsigned HOST_WIDE_INT) CONST_DOUBLE_LOW (x),
                 (unsigned HOST_WIDE_INT) CONST_DOUBLE_HIGH (x));
+      else
+       real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0, 1);
       cur = safe_concat (buf, cur, t);
       break;

I don't see that this hunk is needed?  In fact it's more confusing this way.

+/* If the target supports integers that are wider than two
+   HOST_WIDE_INTs on the host compiler, then the target should define
+   TARGET_SUPPORTS_WIDE_INT and make the appropriate fixups.
+   Otherwise the compiler really is not robust.  */
+#ifndef TARGET_SUPPORTS_WIDE_INT
+#define TARGET_SUPPORTS_WIDE_INT 0
+#endif

I wonder what the appropriate fixups are?

-/* Return a CONST_INT or CONST_DOUBLE corresponding to target reading
-   GET_MODE_BITSIZE (MODE) bits from string constant STR.  */
+/* Return a CONST_INT, CONST_WIDE_INT, or CONST_DOUBLE corresponding
+   to target reading GET_MODE_BITSIZE (MODE) bits from string constant
+   STR.  */

maybe being less specific in this kind of comment and just referring to
RTX integer constants would be better?

... skip ...

Index: gcc/hwint.c
===================================================================
--- gcc/hwint.c (revision 191978)
+++ gcc/hwint.c (working copy)
@@ -112,6 +112,29 @@  ffs_hwi (unsigned HOST_WIDE_INT x)
 int
 popcount_hwi (unsigned HOST_WIDE_INT x)
 {
+  /* Compute the popcount of a HWI using the algorithm from
+     Hacker's Delight.  */
+#if HOST_BITS_PER_WIDE_INT == 32

separate patch please.  With a rationale.
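For reference, the Hacker's Delight population count mentioned in that hunk is the classic divide-and-conquer bit-summing trick. A self-contained sketch for a 64-bit word (GCC's hwint.c version is keyed to HOST_BITS_PER_WIDE_INT, as the `#if HOST_BITS_PER_WIDE_INT == 32` above shows, and may differ in detail):

```cpp
#include <cstdint>
#include <cassert>

// Classic Hacker's Delight popcount: sum adjacent bit fields of
// doubling width, then add all bytes with one multiply.
static int popcount64 (uint64_t x)
{
  x = x - ((x >> 1) & 0x5555555555555555ULL);                       // 2-bit sums
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL); // 4-bit sums
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;                       // 8-bit sums
  return (int) ((x * 0x0101010101010101ULL) >> 56);                 // add bytes
}
```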

...

+/* Constructs tree in type TYPE from with value given by CST.  Signedness
+   of CST is assumed to be the same as the signedness of TYPE.  */
+
+tree
+wide_int_to_tree (tree type, const wide_int &cst)
+{
+  wide_int v;
+  if (TYPE_UNSIGNED (type))
+    v = cst.zext (TYPE_PRECISION (type));
+  else
+    v = cst.sext (TYPE_PRECISION (type));
+
+  return build_int_cst_wide (type, v.elt (0), v.elt (1));

this needs an assert that the wide-int contains two HWIs.
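The signedness-dependent extension that wide_int_to_tree performs can be sketched on a single host wide int. This is an illustrative sketch, not GCC's implementation (`HWI` stands in for HOST_WIDE_INT; the arithmetic right shift on a negative value assumes the two's-complement hosts GCC supports):

```cpp
#include <cstdint>
#include <cassert>

typedef int64_t HWI;

// Keep the low PREC bits of SRC and zero-fill the rest.
static HWI zext_hwi (HWI src, int prec)
{
  if (prec == 64)
    return src;  // avoid an out-of-range shift
  return src & (((HWI) 1 << prec) - 1);
}

// Keep the low PREC bits of SRC and sign-fill the rest: shift the
// field's sign bit up to the HWI sign bit, then arithmetic-shift back.
static HWI sext_hwi (HWI src, int prec)
{
  int shift = 64 - prec;
  return (HWI) ((uint64_t) src << shift) >> shift;
}
```

In wide_int_to_tree the same operation is just applied element-wise to TYPE_PRECISION bits, picking zext or sext from TYPE_UNSIGNED.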

+#ifndef GENERATOR_FILE
+extern tree wide_int_to_tree (tree type, const wide_int &cst);
+#endif

why's that?  The header inclusion isn't guarded that way.

Stopping here, skipping to wide-int.[ch]

+#define DEBUG_WIDE_INT
+#ifdef DEBUG_WIDE_INT
+  /* Debugging routines.  */
+  static void debug_vw  (const char* name, int r, const wide_int& o0);
+  static void debug_vwh (const char* name, int r, const wide_int &o0,
+                        HOST_WIDE_INT o1);

there is DEBUG_FUNCTION.

+/* This is the maximal size of the buffer needed for dump.  */
+const int MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
+                + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT + 32);

I'd prefer a target macro for this, not anything based off
MAX_BITSIZE_MODE_ANY_INT just to allow no-op transition of a
target to wide-ints (which IMHO should be the default for all targets,
simplifying the patch).

+wide_int
+wide_int::from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode
mode, bool *overflow)
+{
...
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    {
+      char buf0[MAX];
+      fprintf (dump_file, "%s: %s = " HOST_WIDE_INT_PRINT_HEX "\n",
+              "wide_int::from_uhwi", result.dump (buf0), op0);
+    }
+#endif

hm, ok ... I suppose the debug stuff (which looks very noisy) is going to be
removed before this gets committed anywhere?

+wide_int
+wide_int::from_int_cst (const_tree tcst)
+{
+#if 1
+  wide_int result;
+  tree type = TREE_TYPE (tcst);

should just call wide_int::from_double_int (tree_to_double_int (tcst))

As I said elsewhere I don't think only allowing precisions that occur in
some mode is good enough.  We need the actual precision here (everywhere).

Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h      (revision 0)
+++ gcc/wide-int.h      (revision 0)
@@ -0,0 +1,718 @@ 
+/* Operations with very long integers.
+   Copyright (C) 2012 Free Software Foundation, Inc.
...
+
+#ifndef GENERATOR_FILE
+#include "hwint.h"
+#include "options.h"
+#include "tm.h"
+#include "insn-modes.h"
+#include "machmode.h"
+#include "double-int.h"
+#include <gmp.h>
+#include "insn-modes.h"

ugh.  Compare this with double-int.h which just needs <gmp.h> (and even that
is a red herring) ...

+#define wide_int_minus_one(MODE) (wide_int::from_shwi (-1, MODE))
+#define wide_int_zero(MODE)      (wide_int::from_shwi (0, MODE))
+#define wide_int_one(MODE)       (wide_int::from_shwi (1, MODE))
+#define wide_int_two(MODE)       (wide_int::from_shwi (2, MODE))
+#define wide_int_ten(MODE)       (wide_int::from_shwi (10, MODE))

add static functions like

  wide_int wide_int::zero (MODE)

+class wide_int {
+  /* Internal representation.  */
+  HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
+  unsigned short len;
+  enum machine_mode mode;

you really need to store the precision here, not the mode.  We do not have
a distinct mode for each of the tree constant precisions we see.  So what
might work for RTL will later fail on trees, so better do it correctly
from the start
(overloads using mode rather than precision may be acceptable - similarly
I'd consider overloads for tree types).

len seems redundant unless you want to optimize encoding.
len == (precision + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT.
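The redundancy claim above is just that, given a precision, the element count is a derived quantity. A sketch of the formula (assuming a 64-bit HOST_WIDE_INT host):

```cpp
#include <cassert>

const unsigned HOST_BITS_PER_WIDE_INT = 64;  // assumed host width

// Number of HWI elements needed to hold a value of PRECISION bits:
// the ceiling of precision / HOST_BITS_PER_WIDE_INT.
static unsigned blocks_needed (unsigned precision)
{
  return (precision + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT;
}
```

A separate stored `len` only pays off if it is allowed to be *smaller* than this, i.e. if the encoding is optimized as discussed earlier in the thread.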

+  enum Op {
+    NONE,

we don't have sth like this in double-int.h, why was the double-int mechanism
not viable?

+  static wide_int from_int_cst (const_tree);
+  static wide_int from_rtx (const_rtx, enum machine_mode);

from_tree instead of from_int_cst?  I miss from_pair.

+  HOST_WIDE_INT to_shwi (int prec) const;

isn't the precision encoded?  If it's a different precision this seems to
combine two primitives - is that really that convenient (if so that hints
at some oddity IMHO).

+  static wide_int max_value (const enum machine_mode mode, Op sgn);
+  static wide_int max_value (const enum machine_mode mode, int prec, Op sgn);
+  static wide_int min_value (const enum machine_mode mode, Op sgn);
+  static wide_int min_value (const enum machine_mode mode, int prec, Op sgn);

again prec seems redundant to me.  double-int uses 'bool uns', why is this
different here?  What happens for NONE, TRUNC?

+  inline unsigned short get_len () const;
+  inline void set_len (unsigned int);

the length should be an implementation detail, no?  Important is only
the precision of the number.  So this shouldn't be exported, at least.

+  inline int full_len () const;

what's this?

+  inline enum machine_mode get_mode () const;
+  inline void set_mode (enum machine_mode m);

not sure if we want machine-mode/precision mutating operations (similar
to length mutating operations).  I guess better not.

+  wide_int copy (enum machine_mode mode) const;

why does this need a mode argument?  Maybe 'copy' isn't a good name?

+  static inline HOST_WIDE_INT sext (HOST_WIDE_INT src, int prec);
+  static inline HOST_WIDE_INT zext (HOST_WIDE_INT src, int prec);

doesn't belong to wide_int::

+#define wide_int_smin(OP0,OP1)  ((OP0).lts_p (OP1) ? (OP0) : (OP1))
+#define wide_int_smax(OP0,OP1)  ((OP0).lts_p (OP1) ? (OP1) : (OP0))
+#define wide_int_umin(OP0,OP1)  ((OP0).ltu_p (OP1) ? (OP0) : (OP1))
+#define wide_int_umax(OP0,OP1)  ((OP0).ltu_p (OP1) ? (OP1) : (OP0))

use inline functions
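The point of the suggestion is that inline functions type-check their arguments and evaluate each one exactly once, which the macros above do not. A sketch on a stand-in type (the real wide_int class and its lts_p predicate are assumed, not shown):

```cpp
#include <cassert>

// Stand-in for wide_int, just enough to illustrate the inline form.
struct wi
{
  long v;
  bool lts_p (const wi &o) const { return v < o.v; }  // signed less-than
};

// Inline replacements for the wide_int_smin/wide_int_smax macros.
static inline wi wide_int_smin (const wi &a, const wi &b)
{ return a.lts_p (b) ? a : b; }

static inline wi wide_int_smax (const wi &a, const wi &b)
{ return a.lts_p (b) ? b : a; }
```

Unlike the macro version, `wide_int_smin (f (x), g (y))` cannot evaluate `f` or `g` twice.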

+void
+wide_int::set_len (unsigned int l)
+{
+  gcc_assert (l < MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT);

use gcc_checking_assert everywhere.


Overall it's not too bad.  The mode vs. precision thing needs to be sorted
out, and I'd like TARGET_SUPPORTS_WIDE_INT not to be introduced,
even transitionally.  Just make it possible to have the default be the
existing size of CONST_DOUBLE.

Thanks,
Richard.