
patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

Message ID 516DAF9B.3050008@naturalbridge.com
State New

Commit Message

Kenneth Zadeck April 16, 2013, 8:07 p.m. UTC
Richard,

I made major changes to wide-int along the lines you suggested. Each of 
the binary operations is now a template.
There are 5 possible implementations of those operations, one for each 
of HWI, unsigned HWI, wide-int, rtl, and tree.   Note that this is not 
exactly as you suggested, but it is along the same lines.

The HWI template sign-extends the value to the precision of the first 
operand; the unsigned HWI template is the same except that it does an 
unsigned extension.  The wide-int version is used as before, but is in 
truth rarely used.  The rtl and tree versions "logically" convert the 
value to a wide-int, but in practice do something more efficient: they 
look inside the rtl or the tree and pass a pointer to the data and a 
length to the binary operation.  This is perfectly safe in the position 
of a second operand to the binary operation because the lifetime is 
guaranteed to be very short.  The wide-int implementation was also 
modified to do the same pointer trick, allowing all 5 templates to share 
the same use of the data.
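
For concreteness, here is a rough sketch of the scheme (the shape is
what matters; the worker routine name is invented):

  template <typename T>
  wide_int
  wide_int::operator + (T c) const
  {
    /* Scratch storage for operand kinds that have no HWI array of
       their own (HWIs, small constants).  */
    HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
    int cl;

    /* Reduce the second operand, whatever its kind, to a pointer to
       HWIs plus a length.  For rtl and tree this points directly into
       the constant; nothing is copied.  */
    const HOST_WIDE_INT *s = to_shwi2 (ws, &cl, c);

    /* add_internal stands for the common worker that all five
       operand kinds share.  */
    return add_internal (val, len, s, cl, precision);
  }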

Note that currently the tree code is more crufty than one would like.   
This will clean up nicely when the tree-cst is changed to represent the 
value with an array and a length field.

So now, at least for the second operand of binary operations, the 
storage is never copied.  I do not believe that there is a good 
similar trick for the first operand.  I did not consider something like 
wide_int::add (a, b) to be a viable option; it seems to miss the point 
of using an object-oriented language.  So I think that you really have 
to copy the data into an instance of a wide_int.

However, while all of this avoids ever having to pass a precision into 
the second operand, this patch does preserve the finite math 
implementation of wide-int.    Finite math is really what people expect 
an optimizer to do, because it seamlessly matches what the machine is 
going to do.

I hope that at this point I can get a comprehensive review of these 
patches.  I believe that I have done what is required.

There are two other patches that will be submitted in the next few 
minutes.  The first one is an updated version of the rtl-level patch.  
The only changes from what you have seen before are that the binary 
operations now use the templated binary operations.  The second one is 
the first of the tree-level patches.  It converts builtins.c to use 
wide-int and removes all assumptions that tree-csts are built with two 
HWIs.

Once builtins.c is accepted, I will convert the rest of the middle-end 
patches.  They will all be converted in a similar way.

Kenny
2013-04-16  Kenneth Zadeck <zadeck@naturalbridge.com>

	* Makefile.in (wide-int.c, wide-int.h): New files.
	* wide-int.c: New file containing implementation of wide_int class.
	* wide-int.h: New file containing public spec for wide_int class.

Comments

Richard Biener April 19, 2013, 1:31 p.m. UTC | #1
On Tue, Apr 16, 2013 at 10:07 PM, Kenneth Zadeck
<zadeck@naturalbridge.com> wrote:
> Richard,
>
> I made major changes to wide-int along the lines you suggested. Each of the
> binary operations is now a template.
> There are 5 possible implementations of those operations, one for each of
> HWI, unsigned HWI, wide-int, rtl, and tree.   Note that this is not exactly
> as you suggested, but it is along the same lines.
>
> The HWI template sign-extends the value to the precision of the first
> operand; the unsigned HWI template is the same except that it does an
> unsigned extension.   The wide-int version is used as before, but is in truth
> rarely used.  The rtl and tree versions "logically" convert the value to a
> wide-int, but in practice do something more efficient: they look inside the
> rtl or the tree and pass a pointer to the data and a length to the binary
> operation.  This is perfectly safe in the position of a second operand to the
> binary operation because the lifetime is guaranteed to be very short.  The
> wide-int implementation was also modified to do the same pointer trick,
> allowing all 5 templates to share the same use of the data.
>
> Note that currently the tree code is more crufty than one would like.   This
> will clean up nicely when the tree-cst is changed to represent the value
> with an array and a length field.
>
> So now, at least for the second operand of binary operations, the storage is
> never copied.  I do not believe that there is a good similar trick for the
> first operand.  I did not consider something like wide_int::add (a, b) to be
> a viable option; it seems to miss the point of using an object-oriented
> language.  So I think that you really have to copy the data into an
> instance of a wide_int.
>
> However, while all of this avoids ever having to pass a precision into the
> second operand, this patch does preserve the finite math implementation of
> wide-int.    Finite math is really what people expect an optimizer to do,
> because it seamlessly matches what the machine is going to do.
>
> I hope that at this point I can get a comprehensive review of these patches.
> I believe that I have done what is required.
>
> There are two other patches that will be submitted in the next few minutes.
> The first one is an updated version of the rtl-level patch.   The only
> changes from what you have seen before are that the binary operations now
> use the templated binary operations.  The second one is the first of the
> tree-level patches.   It converts builtins.c to use wide-int and it
> removes all assumptions that tree-csts are built with two HWIs.
>
> Once builtins.c is accepted, I will convert the rest of the middle-end
> patches.   They will all be converted in a similar way.

+   number of elements of the vector that are in use.  When LEN *
+   HOST_BITS_PER_WIDE_INT < the precision, the value has been
+   compressed.  The values of the elements of the vector greater than
+   LEN - 1. are all equal to the highest order bit of LEN.

equal to the highest order bit of element LEN - 1. ?

Especially _not_ equal to the precision - 1 bit of the value, correct?

+   The representation does not contain any information inherant about
+   signedness of the represented value, so it can be used to represent
+   both signed and unsigned numbers.   For operations where the results
+   depend on signedness (division, comparisons), the signedness must
+   be specified separately.  For operations where the signness
+   matters, one of the operands to the operation specifies either
+   wide_int::SIGNED or wide_int::UNSIGNED.

The last sentence is somehow duplicated.

+   The numbers are stored as sign entended numbers as a means of
+   compression.  Leading HOST_WIDE_INTS that contain strings of either
+   -1 or 0 are removed as long as they can be reconstructed from the
+   top bit that is being represented.

I'd put this paragraph before the one that talks about signedness, next
to the one that already talks about encoding.
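(As a worked example of the encoding, with 64-bit HWIs and precision
128 - values illustrative, might be worth spelling out in the comment:

  -1        ->  len = 1, val = { -1 }     upper block is a sign copy
   1        ->  len = 1, val = { 1 }
   2^64 - 1 ->  len = 2, val = { -1, 0 }  the zero block must stay,
                                          or it would read back as -1 )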

+   All constructors for wide_int take either a precision, an enum
+   machine_mode or tree_type.  */

That's probably no longer true (I'll now check).

+class wide_int {
+  /* Internal representation.  */
+
+  /* VAL is set to a size that is capable of computing a full
+     multiplication on the largest mode that is represented on the
+     target.  The full multiplication is use by tree-vrp.  tree-vpn
+     currently does a 2x largest mode by 2x largest mode yielding a 4x
+     largest mode result.  If operations are added that require larger
+     buffers, then VAL needs to be changed.  */
+  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
+  unsigned short len;
+  unsigned int precision;

I wonder if there is a technical reason to stick to HOST_WIDE_INTs?
I'd say for efficiency HOST_WIDEST_FAST_INT would be more appropriate
(to get a 32bit value on 32bit x86 for example).  I of course see that
conversion to/from HOST_WIDE_INT is an important operation
that would get slightly more complicated.

Maybe just quickly checking the code generated on 32bit x86 for
HOST_WIDE_INT vs. HOST_WIDEST_FAST_INT tells us whether
it's worth considering (it would be bad if each add/multiply would
end up calling to libgcc for example - I know that doesn't happen
for x86, but maybe it would happen for an arm hosted gcc
targeting x86_64?)

+  enum ShiftOp {
+    NONE,
+    /* There are two uses for the wide-int shifting functions.  The
+       first use is as an emulation of the target hardware.  The
+       second use is as service routines for other optimizations.  The
+       first case needs to be identified by passing TRUNC as the value
+       of ShiftOp so that shift amount is properly handled according to the
+       SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
+       amount is always truncated by the bytesize of the mode of
+       THIS.  */
+    TRUNC
+  };

double-int simply honors SHIFT_COUNT_TRUNCATED.  Why differ
from that (and thus change behavior in existing code - not sure if you
do that with introducing wide-int)?

+  enum SignOp {
+    /* Many of the math functions produce different results depending
+       on if they are SIGNED or UNSIGNED.  In general, there are two
+       different functions, whose names are prefixed with an 'S" and
+       or an 'U'.  However, for some math functions there is also a
+       routine that does not have the prefix and takes an SignOp
+       parameter of SIGNED or UNSIGNED.  */
+    SIGNED,
+    UNSIGNED
+  };

You seem to insist on that.  It should propagate to the various parts
of the compiler that have settled for the 'uns' integer argument.
Having one piece behave different is just weird.  I suppose I will
find code like

    wi.ext (prec, uns ? UNSIGNED : SIGNED)

in your conversion?

+  static wide_int from_shwi (HOST_WIDE_INT op0,
+                            unsigned int precision, bool *overflow = 0);

+  if (precision < HOST_BITS_PER_WIDE_INT)
+    {
+      HOST_WIDE_INT t = sext_hwi (op0, precision);
+      if (t != op0 && overflow)
+       *overflow = true;
+      op0 = t;
+    }

Hm.  I'd say input values should be properly extended already (we certainly
expect that from RTL or tree constants).  So we might want to assert the
above instead of tracking it via *overflow.
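I.e. a sketch of the asserting variant (illustrative only):

  static wide_int
  from_shwi (HOST_WIDE_INT op0, unsigned int precision)
  {
    wide_int result;
    /* Inputs coming from RTL or tree constants are expected to be
       properly extended already, so trap violations instead of
       reporting them via *overflow.  */
    if (precision < HOST_BITS_PER_WIDE_INT)
      gcc_assert (op0 == sext_hwi (op0, precision));
    result.val[0] = op0;
    result.len = 1;
    result.precision = precision;
    return result;
  }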

+  static wide_int from_array (const HOST_WIDE_INT* op0,
+                             unsigned int len,
+                             unsigned int precision, bool need_canon = true);
+  inline static wide_int from_array (const HOST_WIDE_INT* op0,
+                                    unsigned int len,
+                                    enum machine_mode mode);
+  inline static wide_int from_array (const HOST_WIDE_INT* op0,
+                                    unsigned int len,
+                                    const_tree type);

I still don't like the overloads precision vs. machine_mode vs. tree type.
It's much more explanative what you specify here if you write

  from_array (&x, len, GET_MODE_PRECISION (mode))

instead of

  from_array (&x, len, mode)

and it's not that much more typing either.

+  static wide_int from_double_int (double_int,
+                                  unsigned int precision);
+  inline static wide_int from_double_int (double_int, enum machine_mode);

this one would lack the tree type overload (I of course think it has an
excessive overload for machine_mode).

+  inline HOST_WIDE_INT to_shwi (unsigned int prec = 0) const;
+  inline unsigned HOST_WIDE_INT to_uhwi (unsigned int prec = 0) const;

this is

   wi.sext (prec);
   wi.to_shwi ();

which is less odd than handling prec == 0 specially.

+  static wide_int max_value (unsigned int prec,
+                            unsigned int precision, SignOp sgn);

two precisions - ugh ;)  In some places having to have a precision is ugly :/

+  inline static wide_int minus_one (unsigned int prec);
+  inline static wide_int zero (unsigned int prec);
+  inline static wide_int one (unsigned int prec);
+  inline static wide_int two (unsigned int prec);

here as well.  I see two (1) properly truncates the value to zero ;)
It just comes to my mind that these could have "arbitrary" precision.
"arbitrary" == MAX_SIZE * HOST_BITS_PER_WIDE_INT.  Which
would get us back to mixed precision operations ...


+  /* Printing functions.  */
+
+  void print_dec (char *buf, SignOp sgn) const;
+  void print_dec (FILE *file, SignOp sgn) const;
+  void print_decs (char *buf) const;
+  void print_decs (FILE *file) const;
+  void print_decu (char *buf) const;
+  void print_decu (FILE *file) const;
+  void print_hex (char *buf) const;
+  void print_hex (FILE *file) const;

making those non-member functions would allow getting rid of stdio.h
from wide-int.h (similar making the tree / rtx / mode argument methods
either standalone or making them templates and externalizing the
implementation to a separate template class would get rid of the
rtl / tree dependency)
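
i.e. something like (declarations only, placement illustrative):

  /* In a separate wide-int-print.h, keeping <stdio.h> out of
     wide-int.h.  */
  void print_dec (const wide_int &wi, char *buf, wide_int::SignOp sgn);
  void print_dec (const wide_int &wi, FILE *file, wide_int::SignOp sgn);
  void print_hex (const wide_int &wi, char *buf);
  void print_hex (const wide_int &wi, FILE *file);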

+  inline bool neg_p () const;

how can you test that?  wide-int doesn't have sign information.

+bool
+wide_int::neg_p () const
+{
+  return sign_mask () != 0;

not exactly efficient either ...

+HOST_WIDE_INT
+wide_int::sign_mask () const
+{
+  int i = len - 1;
+  if (precision < HOST_BITS_PER_WIDE_INT)
+    return ((val[0] << (HOST_BITS_PER_WIDE_INT - precision))
+           >> (HOST_BITS_PER_WIDE_INT - 1));
+
+  /* VRP appears to be badly broken and this is a very ugly fix.  */
+  if (i >= 0)
+    return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
+
+  gcc_unreachable ();
+#if 0
+  return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
+#endif
+}

I can't see what's the "fix".  It seems to be equivalent to the commented out
code.  Apart from not handling len == 0 for which you ICE then.

Back to neg_p ... it computes wrong information.
from_uhwi (-1, HOST_BITS_PER_WIDE_INT).neg_p () returns true.

I suppose you want to rename it to msb () (implementing that efficiently
and using it from sign_mask, too)
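
Sketch (relying on the canonical sign-extended encoding):

  bool
  wide_int::msb () const
  {
    /* The top block is sign-extended, so its top bit is the msb of
       the whole value.  */
    return val[len - 1] < 0;
  }

  HOST_WIDE_INT
  wide_int::sign_mask () const
  {
    return msb () ? (HOST_WIDE_INT) -1 : 0;
  }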

+  template <typename T>
+    inline bool gt_p (T c, SignOp sgn) const;
+  template <typename T>
+    inline bool gts_p (T c) const;
+  template <typename T>
+    inline bool gtu_p (T c) const;

it's bad that we can't use the sign information we have available in almost
all cases ... (where precision is not an exact multiple of
HOST_BITS_PER_WIDE_INT
and len == precision / HOST_BITS_PER_WIDE_INT).  It isn't hard to encode
a sign - you just have to possibly waste a word of zeroes for positive
values where at the moment precision is an exact multiple of
HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT.
Which of course means that the encoding can be one word larger than
maximally required by 'precision'.

+  wide_int force_to_size (unsigned int precision, SignOp sgn) const;
+  inline wide_int force_to_size (enum machine_mode mode, SignOp sgn) const;
+  inline wide_int force_to_size (const_tree type) const;
+  inline wide_int force_to_size (const_tree type, SignOp sgn) const;
+
+  inline wide_int sforce_to_size (enum machine_mode mode) const;
+  inline wide_int sforce_to_size (const_tree type) const;
+  inline wide_int zforce_to_size (enum machine_mode mode) const;
+  inline wide_int zforce_to_size (const_tree type) const;

too many overloads again

Looking at

+template <typename T>
+wide_int
+wide_int::operator & (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+

I see

+  template <typename T>
+  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int *l, T x)
+  {
+    s[0] = x;
+    if (~(T)0 < (T)0
...

+  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s
ATTRIBUTE_UNUSED,
+                                       int *l, const wide_int &y)
+  {
+    *l = y.len;
...

I think you want to use template specialization here, not overloading.  Thus,

  template <>
  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int
*l, const wide_int &y)

and adjust the HWI case to look like

  template <typename T>
  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int
*l, const T& x)

you want to move that ~(T)0 < (T)0 check up to template substitution level
to allow SFINAE to disable the template if that's not computable.
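
A sketch of that idea (wi_enable_if is a stand-in, since this predates
having std::enable_if available in the tree):

  /* Minimal enable_if: the generic overload drops out of overload
     resolution (SFINAE) for types wider than a HWI, e.g. wide_int.  */
  template <bool C, typename T = void> struct wi_enable_if { };
  template <typename T> struct wi_enable_if<true, T> { typedef T type; };

  template <typename T>
  static inline typename
    wi_enable_if<(sizeof (T) <= sizeof (HOST_WIDE_INT)),
                 const HOST_WIDE_INT *>::type
  to_shwi2 (HOST_WIDE_INT *s, int *l, T x)
  {
    s[0] = x;
    *l = 1;
    /* Only an unsigned value with its MSB set needs the extra zero
       block to stay properly compressed.  */
    if (!(~(T) 0 < (T) 0) && s[0] < 0)
      {
        s[1] = 0;
        *l = 2;
      }
    return s;
  }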

+    s[0] = x;
+    if (~(T)0 < (T)0
+       || sizeof (T) < sizeof (HOST_WIDE_INT))
+      {
+       *l = 1;
+      }
+    else
+      {
+       s[1] = 0;
+       *l = 2;

that's only required if the MSB is set, otherwise it's not properly compressed

+      }
+    return s;

hmm, looking at from_uhwi I see that you are adding the extra zero words
as I suggested ... which means you _do_ have reliable sign information
available (well, not for the case where len would end up bigger than MAX_LEN).
So - can we simplify some things with that?  I can think of the ordered
comparisons for example.  Also the encoding description should be
clarified that there _is_ sign information available (and that len can be
bigger than precision / HOST_BITS_PER_WIDE_INT).

Returning to to_shwi2 (odd name, btw):

+       /* Copy the result because it does not fit in two HWIs.  */
+       s[0] = TREE_INT_CST_LOW (tcst);
+       s[1] = TREE_INT_CST_HIGH (tcst);
+       s[2] = 0;
+       *l = 3;
+       return s;

ah, this is why you need the extra argument ;)  Btw, what happens
when a proper encoding does not fit in MAX_LEN words?  That is,
we'd need an extra zero word?

+  /* There should logically be an overload for rtl here, but it cannot
+     be here because of circular include issues.  It is in rtl.h.  */
+  static inline const HOST_WIDE_INT* to_shwi2
+    (HOST_WIDE_INT *s ATTRIBUTE_UNUSED, int *l, rtx rcst);

in rtl.c I suppose.  Btw, this is why I'm suggesting to defer implementation
to a template - you can implement that outside of wide-int.h even without
declaring the specialization here.

Is there an accessible branch with the wide-int work?  I may be tempted
to propose some changes in form of patches.  If not then I think the
work is in a state that is suitable to put on a merge-branch.

+wide_int
+wide_int::add (const wide_int &op1, SignOp sgn, bool *overflow) const
+{
...
+  /* Uncompress the rest.  */
+  for (i = len; i < op1.len; i++)
+    {
+      o0 = val[i];
+      o1 = mask1;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      old_carry = carry;
+      carry = x < o0;
+    }
+  for (i = len; i < op1.len; i++)
+    {
+      o0 = mask0;
+      o1 = op1.val[i];
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      old_carry = carry;
+      carry = x < o0;
+    }

Cut & paste error here I think.  At least one should run up to this->len, no?
And the first loop should run to min (len, op1.len) only.  How much
testing coverage does the code have for "interesting" values?
A standalone testing program will probably reveal such issues (due to
the rich-ness of the current API covering everything will be hard :/)
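
The kind of check I mean, as a sketch (harness names invented; a
HWI-wide reference limits it to small precisions, and it assumes the
templated operator == compares values):

  static void
  check_add (HOST_WIDE_INT a, HOST_WIDE_INT b, unsigned int prec)
  {
    wide_int wa = wide_int::from_shwi (a, prec);
    wide_int wb = wide_int::from_shwi (b, prec);

    /* Finite-math reference: add in a HWI (unsigned, to avoid
       undefined overflow) and truncate to the precision.  */
    HOST_WIDE_INT ref
      = sext_hwi ((unsigned HOST_WIDE_INT) a + (unsigned HOST_WIDE_INT) b,
                  prec);

    gcc_assert ((wa + wb) == wide_int::from_shwi (ref, prec));
  }

fed with 0, 1, -1, HWI min/max and values around the precision boundary.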

Looking at

+HOST_WIDE_INT
+wide_int::sign_mask () const
+{
+  int i = len - 1;
+  if (precision < HOST_BITS_PER_WIDE_INT)
+    return ((val[0] << (HOST_BITS_PER_WIDE_INT - precision))
+           >> (HOST_BITS_PER_WIDE_INT - 1));
+
+  /* VRP appears to be badly broken and this is a very ugly fix.  */
+  if (i >= 0)
+    return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
+
+  gcc_unreachable ();
+#if 0
+  return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
+#endif
+}

again - this looks like it does overly complicated work.  Our encoding
of values guarantees that the following is correct:

HOST_WIDE_INT
wide_int::sign_mask () const
{
  if (val[len - 1] < 0)
    return -1;
  else
    return 0;
}

I suppose quite some code can be simplified that calls sign_mask
currently (thanks to the changed encoding).

back to ::add ():

+  small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec == 0)
+    {
+      if (sgn == wide_int::SIGNED)
+       {
+         if (((!(o0 ^ o1)) & (x ^ o0)) >> (HOST_BITS_PER_WIDE_INT - 1))
+           *overflow = true;

shouldn't this share code with the non-overflow add_large and simply
do overflow detection conditional on overflow being non-NULL?
Because the add_large code seems to be correct with respect to
the issues noted above ...

+ ex:
+  result.canonize ();
+  return result;
+}

canonizing is not strictly necessary - in fact it is unlikely to do
something useful most of the time.  I'd say we should only
canonize before building a tree or RTL with that representation.
Note that canonize is wrong, too:

+wide_int::canonize ()
+{
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  int blocks_needed = BLOCKS_NEEDED (precision);
+  HOST_WIDE_INT top;
+  int i;
+
+  if (len > blocks_needed)
+    len = blocks_needed;

that will set a UHWI -1 to have len == 1, ignoring the extra needed
zero word.  Thus, either BLOCKS_NEEDED is broken or its
uses need to be audited.

+
+  /* Clean up the top bits for any mode that is not a multiple of a HWI.  */
+  if (len == blocks_needed && small_prec)
+    val[len - 1] = sext_hwi (val[len - 1], small_prec);

That's not necessary.  By construction all arithmetic operations will
preserve proper extension.  And you unconditionally sign-extend
small precision "large" positive constants here which is even wrong
(-1U, precision == 32, HWI == 64, original rep { 0x00..00fffff }, broken
{ 0xfff...fff } ).

+  if (len == 1)
+    return;

in fact for len == 1 we should do nothing in this function from the start.
Callers that also want to properly sign or zero extend the result value
should do so using ext (), not abuse another implementation inside
canonize () (which should only optimize the encoding).
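
Restricted to optimizing the encoding it could be as small as (sketch):

  void
  wide_int::canonize ()
  {
    /* Drop top blocks that are pure sign copies of the block below
       them; never touch the bits themselves.  */
    while (len > 1
           && val[len - 1] == (val[len - 2] < 0 ? (HOST_WIDE_INT) -1 : 0))
      len--;
  }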

Ok, I'll stop reviewing here.  It's a lot better than before - thanks for
doing the changes.

I still like you to cut down the interface somewhat and _test_ what
is remaining.  Put the code on a branch off trunk so I and others can
produce patches against it.

Thanks,
Richard.
Kenneth Zadeck April 21, 2013, 8:54 p.m. UTC | #2
Richard,

I pulled these two fragments out of your comments because I wanted to get 
some input from you on them while I addressed the other issues you raised.

> +  enum SignOp {
> +    /* Many of the math functions produce different results depending
> +       on if they are SIGNED or UNSIGNED.  In general, there are two
> +       different functions, whose names are prefixed with an 'S" and
> +       or an 'U'.  However, for some math functions there is also a
> +       routine that does not have the prefix and takes an SignOp
> +       parameter of SIGNED or UNSIGNED.  */
> +    SIGNED,
> +    UNSIGNED
> +  };
>
> You seem to insist on that.  It should propagate to the various parts
> of the compiler that have settled for the 'uns' integer argument.
> Having one piece behave different is just weird.  I suppose I will
> find code like
>
>      wi.ext (prec, uns ? UNSIGNED : SIGNED)

There is a lot more flexibility on my part than you perceive with 
respect to this point.  My primary issue is that I do not want an 
interface that has 0 and 1 as a programmer-visible part.  Beyond 
that I am open to suggestions.

The poster children of my hate are the host_integer_p and tree_low_cst 
interfaces.  I did not want the wide-int stuff to look like these.  I see 
several problems with them:

1) Of the 314 places where tree_low_cst is called in the gcc directory 
(not the subdirectories where the front ends live), NONE of the calls 
have a variable second parameter.  There are a handful of places, as 
one expects, in the front ends that do, but NONE in the middle end.
2) There are a small number of places where host_integer_p is called 
with one parameter and is then followed by a call to tree_low_cst 
that passes the opposite signedness.  I am sure these are mistakes, 
but having the 0s and 1s flying around does not make it easy to spot them.
3) tree_low_cst implies that the tree cst has only two HWIs in it.

While I do not want to propagate an interface with 0 and 1 into 
wide-int, I can understand your dislike of having a wide-int-only 
solution for this.

I will point out that for your particular example, uns is almost always 
set by a call to TYPE_UNSIGNED.  There could easily be a different type 
accessor that converts this part of the type to the right thing to pass 
in here.  I think that there is certainly a place for a unified 
SYMBOLIC API that controls the signedness everywhere in the compiler.

I would like to move in this direction, but you have been so 
negative about the places where I have made it convenient to directly 
convert from tree or rtl into or out of wide-int that I have hesitated 
to do something that directly links trees and wide-int.  So I would like 
to ask you: what would you like?



> +  template <typename T>
> +    inline bool gt_p (T c, SignOp sgn) const;
> +  template <typename T>
> +    inline bool gts_p (T c) const;
> +  template <typename T>
> +    inline bool gtu_p (T c) const;
>
> it's bad that we can't use the sign information we have available in almost
> all cases ... (where precision is not an exact multiple of
> HOST_BITS_PER_WIDE_INT
> and len == precision / HOST_BITS_PER_WIDE_INT).  It isn't hard to encode
> a sign - you just have to possibly waste a word of zeroes for positive
> values where at the moment precision is an exact multiple of
> HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT.
> Which of course means that the encoding can be one word larger than
> maximally required by 'precision'.
>
Going back to point 1 above, the front ends structure the middle-end 
code so that (generally) the sign that is used is encoded in the operator 
that one is looking at.  So the majority of uses in the middle end fall 
into the second or third templates, and the first template is there as a 
convenience routine.
The front ends certainly use the first template.

This is how the rtl level has survived so long without a sign bit in the 
modes: the operators tell the whole story.  The truth is that in the 
middle end the story is the same - it is the operators (most of the 
time) that drive the calls being made.

There is an assumption that you are making that I certainly do not 
believe is true in the back ends, and I kind of doubt is true in the 
middle end: that the sign of the compare ALWAYS matches the sign of the 
operands.  Given that I have never seen any code that verifies this in 
the middle end, I am going to assume that it is not true, because in gcc 
anything that we do not explicitly verify generally turns out to be only 
generally true, and you can spend your life tracking down the edge 
cases.  This is a needless complication.
At the rtl level, this is completely doomed by GEN_INT, which takes 
neither a mode nor an indication of sign.  To assume that there is any 
meaningful sign information there is a horror story waiting to be 
written ("sure, what could go wrong if we go into the old house?  what's 
that I hear, it sounds like a chain saw ....").

The API that I have is independent of the rep that I happen to use 
inside of wide-ints.  The fact that it might work is really just an 
accident.  Gimple was designed (and rtl just happened) so that the 
operators carry the information that is needed.  The fact that those 
operators are generally redundant with the types is a byproduct of the 
strong typing of the middle end.  With this respect, I think I have 
the right interface.

Kenny
Richard Biener April 22, 2013, 12:20 p.m. UTC | #3
On Sun, Apr 21, 2013 at 10:54 PM, Kenneth Zadeck
<zadeck@naturalbridge.com> wrote:
> Richard,
>
> i pulled these two frags out of your comments because i wanted to get some
> input from you on it while i addressed the other issues you raised.
>
>
>> +  enum SignOp {
>> +    /* Many of the math functions produce different results depending
>> +       on if they are SIGNED or UNSIGNED.  In general, there are two
>> +       different functions, whose names are prefixed with an 'S" and
>> +       or an 'U'.  However, for some math functions there is also a
>> +       routine that does not have the prefix and takes an SignOp
>> +       parameter of SIGNED or UNSIGNED.  */
>> +    SIGNED,
>> +    UNSIGNED
>> +  };
>>
>> You seem to insist on that.  It should propagate to the various parts
>> of the compiler that have settled for the 'uns' integer argument.
>> Having one piece behave different is just weird.  I suppose I will
>> find code like
>>
>>      wi.ext (prec, uns ? UNSIGNED : SIGNED)
>
>
> There is a lot more flexibility on my part than you perceive with respect to
> this point.  My primary issue is that I do not want an interface
> that has 0 and 1 as a programmer-visible part.  Beyond that I am open to
> suggestions.
>
> The poster children of my hate are the host_integer_p and tree_low_cst
> interfaces.  I did not want the wide-int stuff to look like these.  I see
> several problems with them:
>
> 1) Of the 314 places where tree_low_cst is called in the gcc directory (not
> the subdirectories where the front ends live), NONE of the calls have a
> variable second parameter.  There are a handful of places, as one expects,
> in the front ends that do, but NONE in the middle end.
> 2) There are a small number of places where host_integer_p is called
> with one parameter and is then followed by a call to tree_low_cst that
> passes the opposite signedness.  I am sure these are mistakes, but having
> the 0s and 1s flying around does not make it easy to spot them.
> 3) tree_low_cst implies that the tree cst has only two HWIs in it.
>
> While I do not want to propagate an interface with 0 and 1 into wide-int, I
> can understand your dislike of having a wide-int-only solution for this.
>
> I will point out that for your particular example, uns is almost always set
> by a call to TYPE_UNSIGNED.  There could easily be a different type accessor
> that converts this part of the type to the right thing to pass in here.  I
> think that there is certainly a place for a unified SYMBOLIC
> API that controls the signedness everywhere in the compiler.
>
> I would like to move in this direction, but you have been so negative about
> the places where I have made it convenient to directly convert from tree or
> rtl into or out of wide-int that I have hesitated to do something that
> directly links trees and wide-int.  So I would like to ask you: what would
> you like?

Ideally I'd like the wide-int introduction to _not_ be the introduction of
a unified symbolic way that controls signedness.  We do have two
kinds of interfaces currently - one that uses different API entries,
like build_int_cstu vs. build_int_cst or double_int::from_shwi vs. from_uhwi,
and one that uses the aforementioned integer flag 'uns' with 0 being
signed and 1 being unsigned.

I think the _uhwi vs. _shwi and _cstu variants are perfectly fine
(but only for compile-time constant uses as you say), and the wide-int
interface makes use of this kind, too.

Proposing a better API for the 'uns' flag separately from wide-int would
be a better way to get anybody other than me to chime in (I have the feeling
that the wide-int series scares off every other reviewer besides me...).
I can live with the SIGNED/UNSIGNED enum, but existing APIs should
be changed to use that.

For wide-int I suggest to go the route you don't want to go.  Stick to
existing practice and use the integer 'uns' flag.  It's as good as
SIGNED/UNSIGNED for _variable_ cases (and yes, a lot less descriptive
for constant cases).  For wide-int, always add a static interface
if there is a variable one and convert variable uses to the proper static
interface.

That said, a lot of my pushback is because I feel a little lonesome in this
wide-int review and don't want to decide alone about that (generic)
interface part as well.

>> +  template <typename T>
>> +    inline bool gt_p (T c, SignOp sgn) const;
>> +  template <typename T>
>> +    inline bool gts_p (T c) const;
>> +  template <typename T>
>> +    inline bool gtu_p (T c) const;
>>
>> it's bad that we can't use the sign information we have available in
>> almost
>> all cases ... (where precision is not an exact multiple of
>> HOST_BITS_PER_WIDE_INT
>> and len == precision / HOST_BITS_PER_WIDE_INT).  It isn't hard to encode
>> a sign - you just have to possibly waste a word of zeroes for positive
>> values where at the moment precision is an exact multiple of
>> HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT.
>> Which of course means that the encoding can be one word larger than
>> maximally required by 'precision'.
>>
> Going back to point 1 above, the front ends structure the middle-end code
> so that (generally) the sign that is used is encoded in the operator that
> one is looking at.  So the majority of uses in the middle end fall into
> the second or third templates, and the first template is there as a
> convenience routine.
> The front ends certainly use the first template.
>
> This is how the rtl level has survived so long without a sign bit in the
> modes: the operators tell the whole story.  The truth is that in the middle
> end the story is the same - it is the operators (most of the time) that
> drive the calls being made.

With the difference between RTL and trees being that trees know about
undefined overflow for signed plus while RTL does not (there isn't a
signed plus), and that trees get the operator signedness from the sign
of the operands.

> There is an assumption that you are making that I certainly do not believe
> is true in the back ends, and I kind of doubt is true in the middle end:
> that the sign of the compare ALWAYS matches the sign of the
> operands.  Given that I have never seen any code that verifies this in the
> middle end, I am going to assume that it is not true, because in gcc
> anything that we do not explicitly verify generally turns out to be only
> generally true, and you can spend your life tracking down the edge
> cases.  This is a needless complication.
> At the rtl level, this is completely doomed by GEN_INT, which takes
> neither a mode nor an indication of sign.  To assume that there is any
> meaningful sign information there is a horror story waiting to be written
> ("sure, what could go wrong if we go into the old house?  what's that I
> hear, it sounds like a chain saw ....").

Well .. ;)  If you constructed a wide-int constant A from the signed number
-10465 and a wide-int constant B from the unsigned number 921, then
if you ask whether A is less than B then I expect the answer 'yes'.  If
you ask whether A is unsigned less than B (ltu_p) then I'd be confused
and be tempted to error out.  You should have first converted A to unsigned,
that is, zero-extend it from its precision (which in my wide-int speak
simply results in another signed but positive number).  That is,
having ltu_p interpret a value as a different value sounds wrong.
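
In code, the scenario I mean (lts_p by analogy with the gts_p above;
illustrative):

  wide_int a = wide_int::from_shwi (-10465, 16);
  wide_int b = wide_int::from_uhwi (921, 16);

  a.lts_p (b);   /* true:  -10465 < 921 as signed values.  */
  a.ltu_p (b);   /* false: A's 16 bits reinterpreted are 55071,
                    and 55071 < 921 does not hold.  */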

> The API that I have is independent of the rep that I happen to use inside of
> wide-ints.  The fact that it might work is really just an accident.
> Gimple was designed (and rtl just happened) so that the operators carry the
> information that is needed.  The fact that those operators are generally
> redundant with the types is a byproduct of the strong typing of the middle
> end.  With this respect, I think I have the right interface.

The API may be independent of the rep - but it doesn't make actual API
use independent of the information present in the rep.  If the API allows
for "mismatches", like feeding a negative value to a ltu_p call, then
I view it as inherently fragile and unclean.

That said, all numbers are part of the infinite precision signed set of numbers
and we can compare them.  And the current wide-int representation seems
to be powerful enough to place each wide-int into that number space
unambiguously.  So why not use that property to clean up the API and
make it less error-prone to use?

Richard.

> Kenny
>
Kenneth Zadeck April 22, 2013, 5:59 p.m. UTC | #4
On 04/19/2013 09:31 AM, Richard Biener wrote:
> +   number of elements of the vector that are in use.  When LEN *
> +   HOST_BITS_PER_WIDE_INT < the precision, the value has been
> +   compressed.  The values of the elements of the vector greater than
> +   LEN - 1. are all equal to the highest order bit of LEN.
>
> equal to the highest order bit of element LEN - 1. ?
Fixed, you are correct.

I have gone through the entire wide-int patch to clean this up.  The 
bottom line is that if the precision is not a multiple of the size of a 
HWI, then everything above that precision is assumed to be identical to 
the sign bit.

> Especially _not_ equal to the precision - 1 bit of the value, correct?
I do not understand your question here, because in the case talked about 
above, the bit at precision - 1 would not have been explicitly represented.

Anyway, I went through this top part carefully and made many things clearer.
> +   The representation does not contain any information inherant about
> +   signedness of the represented value, so it can be used to represent
> +   both signed and unsigned numbers.   For operations where the results
> +   depend on signedness (division, comparisons), the signedness must
> +   be specified separately.  For operations where the signness
> +   matters, one of the operands to the operation specifies either
> +   wide_int::SIGNED or wide_int::UNSIGNED.
>
> The last sentence is somehow duplicated.
fixed
>
> +   The numbers are stored as sign entended numbers as a means of
> +   compression.  Leading HOST_WIDE_INTS that contain strings of either
> +   -1 or 0 are removed as long as they can be reconstructed from the
> +   top bit that is being represented.
>
> I'd put this paragraph before the one that talks about signedness, next
> to the one that already talks about encoding.
done
> +   All constructors for wide_int take either a precision, an enum
> +   machine_mode or tree_type.  */
>
> That's probably no longer true (I'll now check).
yes you are correct

> +class wide_int {
> +  /* Internal representation.  */
> +
> +  /* VAL is set to a size that is capable of computing a full
> +     multiplication on the largest mode that is represented on the
> +     target.  The full multiplication is use by tree-vrp.  tree-vpn
> +     currently does a 2x largest mode by 2x largest mode yielding a 4x
> +     largest mode result.  If operations are added that require larger
> +     buffers, then VAL needs to be changed.  */
> +  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
> +  unsigned short len;
> +  unsigned int precision;
>
> I wonder if there is a technical reason to stick to HOST_WIDE_INTs?
> I'd say for efficiency HOST_WIDEST_FAST_INT would be more appropriate
> (to get a 32bit value on 32bit x86 for example).  I of course see that
> conversion to/from HOST_WIDE_INT is an important operation
> that would get slightly more complicated.
>
> Maybe just quickly checking the code generated on 32bit x86 for
> HOST_WIDE_INT vs. HOST_WIDEST_FAST_INT tells us whether
> it's worth considering (it would be bad if each add/multiply would
> end up calling to libgcc for example - I know that doesn't happen
> for x86, but maybe it would happen for an arm hosted gcc
> targeting x86_64?)
This is an interesting point.  My guess is that it is unlikely to be 
worth the work.
Consider add: most machines have add-with-carry, and a well-written 
32-bit port would have used an add-with-carry sequence rather than making 
the libcall.  If I rewrite wide-int in terms of HOST_WIDEST_FAST_INT, 
then I have to write some messy code to compute the carry, which is 
unlikely to translate into the proper carry instructions.  Not to 
mention the cost overhead of converting to and from it, given that gcc 
is written almost entirely using HWIs.

I thought about the possible idea of just converting the mul and div 
functions.  This would be easy because I already reblock them into 
HOST_WIDE_HALF_INTs to do the math; I could just do a different 
reblocking.  However, I think it is unlikely that doing this would 
ever show up in anyone's performance counts.  Either way you do the 
same number of multiply instructions; it is just the subroutine wrapper 
that could possibly go away.

> +  enum ShiftOp {
> +    NONE,
> +    /* There are two uses for the wide-int shifting functions.  The
> +       first use is as an emulation of the target hardware.  The
> +       second use is as service routines for other optimizations.  The
> +       first case needs to be identified by passing TRUNC as the value
> +       of ShiftOp so that shift amount is properly handled according to the
> +       SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
> +       amount is always truncated by the bytesize of the mode of
> +       THIS.  */
> +    TRUNC
> +  };
>
> double-int simply honors SHIFT_COUNT_TRUNCATED.  Why differ
> from that (and thus change behavior in existing code - not sure if you
> do that with introducing wide-int)?
I believe that GCC is supposed to be a little schizophrenic here, at 
least according to the doc.  When it is doing its own math it is 
defined to use SHIFT_COUNT_TRUNCATED, but when it is doing math for the 
program it is supposed to honor what the port does.  The idea is that 
you pass in TRUNC when you want to have the shift op do what the port 
does.  Again, this goes back to an insistence that the optimizations 
are just doing what the machine would do, only earlier.
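
Schematically (call shapes illustrative):

  /* Emulating the target's shifter: the amount is handled exactly as
     SHIFT_COUNT_TRUNCATED says the hardware behaves.  */
  wide_int r1 = x.lshift (cnt, wide_int::TRUNC);

  /* Service routine for an optimization: the amount is reduced in a
     target-independent way.  */
  wide_int r2 = x.lshift (cnt);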

> +  enum SignOp {
> +    /* Many of the math functions produce different results depending
> +       on if they are SIGNED or UNSIGNED.  In general, there are two
> +       different functions, whose names are prefixed with an 'S" and
> +       or an 'U'.  However, for some math functions there is also a
> +       routine that does not have the prefix and takes an SignOp
> +       parameter of SIGNED or UNSIGNED.  */
> +    SIGNED,
> +    UNSIGNED
> +  };
>
> You seem to insist on that.  It should propagate to the various parts
> of the compiler that have settled for the 'uns' integer argument.
> Having one piece behave different is just weird.  I suppose I will
> find code like
>
>      wi.ext (prec, uns ? UNSIGNED : SIGNED)
>
> in your conversion?
This has been resolved on another thread.
> +  static wide_int from_shwi (HOST_WIDE_INT op0,
> +                            unsigned int precision, bool *overflow = 0);
>
> +  if (precision < HOST_BITS_PER_WIDE_INT)
> +    {
> +      HOST_WIDE_INT t = sext_hwi (op0, precision);
> +      if (t != op0 && overflow)
> +       *overflow = true;
> +      op0 = t;
> +    }
>
> Hm.  I'd say input values should be properly extended already (we certainly
> expect that from RTL or tree constants).  So we might want to assert the
> above instead of tracking it via *overflow.
>
> +  static wide_int from_array (const HOST_WIDE_INT* op0,
> +                             unsigned int len,
> +                             unsigned int precision, bool need_canon = true);
> +  inline static wide_int from_array (const HOST_WIDE_INT* op0,
> +                                    unsigned int len,
> +                                    enum machine_mode mode);
> +  inline static wide_int from_array (const HOST_WIDE_INT* op0,
> +                                    unsigned int len,
> +                                    const_tree type);
>
> I still don't like the overloads precision vs. machine_mode vs. tree type.
> It's much more explanative what you specify here if you write
>
>    from_array (&x, len, GET_MODE_PRECISION (mode))
>
> instead of
>
>    from_array (&x, len, mode)
>
> and it's not that much more typing either.
>
> +  static wide_int from_double_int (double_int,
> +                                  unsigned int precision);
> +  inline static wide_int from_double_int (double_int, enum machine_mode);
>
> this one would lack the tree type overload (I of course think it has an
> excessive overload for machine_mode).
>
> +  inline HOST_WIDE_INT to_shwi (unsigned int prec = 0) const;
> +  inline unsigned HOST_WIDE_INT to_uhwi (unsigned int prec = 0) const;
>
> this is
>
>     wi.sext (prec);
>     wi.to_shwi ();
>
> which is less odd than handling prec == 0 specially.
>
> +  static wide_int max_value (unsigned int prec,
> +                            unsigned int precision, SignOp sgn);
>
> two precisions - ugh ;)  In some places having to have a precision is ugly :/
This is really necessary, unfortunately, though perhaps I can 
rearrange the operands so that one of them can be defaulted.  The 
problem is that in vrp you want to know the max value of some small 
precision, but you want that value to be expressed as a number in a 
larger precision.  Outside of tree-vrp, the rest of the compiler only 
uses one precision.
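
For example (illustrative), the vrp-style use is:

  /* The largest signed 8-bit value, materialized at the wider 64-bit
     working precision so it can flow into later arithmetic.  */
  wide_int m = wide_int::max_value (8, 64, wide_int::SIGNED);   /* 127 */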

> +  inline static wide_int minus_one (unsigned int prec);
> +  inline static wide_int zero (unsigned int prec);
> +  inline static wide_int one (unsigned int prec);
> +  inline static wide_int two (unsigned int prec);
>
> here as well.  I see two (1) properly truncates the value to zero ;)
> It just comes to my mind that these could have "arbitrary" precision.
> "arbitrary" == MAX_SIZE * HOST_BITS_PER_WIDE_INT.  Which
> would get us back to mixed precision operations ...
These are now pretty rarely used, since constants almost always appear 
as the second operand of a binary operation, but the first operand has 
to convey the precision.
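That is (illustrative):

  wide_int x = wide_int::from_shwi (42, prec);
  wide_int y = x + 1;                  /* 1 picks up x's precision */
  wide_int z = wide_int::one (prec);   /* needed only when there is no
                                          first operand to supply it */
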
> +  /* Printing functions.  */
> +
> +  void print_dec (char *buf, SignOp sgn) const;
> +  void print_dec (FILE *file, SignOp sgn) const;
> +  void print_decs (char *buf) const;
> +  void print_decs (FILE *file) const;
> +  void print_decu (char *buf) const;
> +  void print_decu (FILE *file) const;
> +  void print_hex (char *buf) const;
> +  void print_hex (FILE *file) const;
>
> making those non-member functions would allow getting rid of stdio.h
> from wide-int.h (similar making the tree / rtx / mode argument methods
> either standalone or making them templates and externalizing the
> implementation to a separate template class would get rid of the
> rtl / tree dependency)
>
> +  inline bool neg_p () const;
>
> how can you test that?  wide-int doesn't have sign information.
>
> +bool
> +wide_int::neg_p () const
> +{
> +  return sign_mask () != 0;
>
> not exactly efficient either ...
>
> +HOST_WIDE_INT
> +wide_int::sign_mask () const
> +{
> +  int i = len - 1;
> +  if (precision < HOST_BITS_PER_WIDE_INT)
> +    return ((val[0] << (HOST_BITS_PER_WIDE_INT - precision))
> +           >> (HOST_BITS_PER_WIDE_INT - 1));
> +
> +  /* VRP appears to be badly broken and this is a very ugly fix.  */
> +  if (i >= 0)
> +    return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
> +
> +  gcc_unreachable ();
> +#if 0
> +  return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
> +#endif
> +}
>
> I can't see what's the "fix".  It seems to be equivalent to the commented out
> code.  Apart from not handling len == 0 for which you ICE then.
>
> Back to neg_p ... it computes wrong information.
> from_uhwi (-1, HOST_BITS_PER_WIDE_INT).neg_p () returns true.
>
> I suppose you want to rename it to msb () (implementing that efficiently
> and using it from sign_mask, too)
>
> +  template <typename T>
> +    inline bool gt_p (T c, SignOp sgn) const;
> +  template <typename T>
> +    inline bool gts_p (T c) const;
> +  template <typename T>
> +    inline bool gtu_p (T c) const;
>
> it's bad that we can't use the sign information we have available in almost
> all cases ... (where precision is not an exact multiple of
> HOST_BITS_PER_WIDE_INT
> and len == precision / HOST_BITS_PER_WIDE_INT).  It isn't hard to encode
> a sign - you just have to possibly waste a word of zeroes for positive
> values where at the moment precision is an exact multiple of
>> HOST_BITS_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT.
> Which of course means that the encoding can be one word larger than
> maximally required by 'precision'.
>
> +  wide_int force_to_size (unsigned int precision, SignOp sgn) const;
> +  inline wide_int force_to_size (enum machine_mode mode, SignOp sgn) const;
> +  inline wide_int force_to_size (const_tree type) const;
> +  inline wide_int force_to_size (const_tree type, SignOp sgn) const;
> +
> +  inline wide_int sforce_to_size (enum machine_mode mode) const;
> +  inline wide_int sforce_to_size (const_tree type) const;
> +  inline wide_int zforce_to_size (enum machine_mode mode) const;
> +  inline wide_int zforce_to_size (const_tree type) const;
>
> too many overloads again
>
> Looking at
>
> +template <typename T>
> +wide_int
> +wide_int::operator & (T c) const
> +{
> +  wide_int result;
> +  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
> +  const HOST_WIDE_INT *s;
> +  int cl;
> +
> +  s = to_shwi2 (ws, &cl, c);
> +
>
> I see
>
> +  template <typename T>
> +  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int *l, T x)
> +  {
> +    s[0] = x;
> +    if (~(T)0 < (T)0
> ...
>
> +  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s
> ATTRIBUTE_UNUSED,
> +                                       int *l, const wide_int &y)
> +  {
> +    *l = y.len;
> ...
>
> I think you want to use template specialization here, not overloading.  Thus,
>
>    template <>
>    static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int
> *l, const wide_int &y)
>
> and adjust the HWI case to look like
>
>    template <typename T>
>    static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int
> *l, const T& x)
>
> you want to move that ~(T)0 < (T)0 check up to template substitution level
> to allow SFINAE to disable the template if that's not computable.
>
> +    s[0] = x;
> +    if (~(T)0 < (T)0
> +       || sizeof (T) < sizeof (HOST_WIDE_INT))
> +      {
> +       *l = 1;
> +      }
> +    else
> +      {
> +       s[1] = 0;
> +       *l = 2;
>
> that's only required if the MSB is set, otherwise it's not properly compressed
>
> +      }
> +    return s;
>
> hmm, looking at from_uhwi I see that you are adding the extra zero words
> as I suggested ... which means you _do_ have reliable sign information
> available (well, not for the case where len would end up bigger than MAX_LEN).
> So - can we simplify some things with that?  I can think of the ordered
> comparisons for example.  Also the encoding description should be
> clarified that there _is_ sign information available (and that len can be
> bigger than precision / HOST_BITS_PER_WIDE_INT).
I did all of this because it makes some things easier and other things 
easier to talk about.  In particular it allows the binary operators to 
work properly with constants (thanks for the trick).

I did not do this because I have any interest in changing the compiler 
so that its math is signed infinite precision.  I strongly do not want 
to go there.  If you want to take my patches later and experiment with 
this, fine.  I do not want to be the person who causes people to switch 
to llvm because they all of a sudden start getting weird (but conforming 
to the C and C++ specs) results out of the optimizer.

In particular, others in the community consider it non-conforming gimple 
to look at a value beyond the precision.  See the posting by iant about 
my question on 11/05/2012.




> Returning to to_shwi2 (odd name, btw):
>
> +       /* Copy the result because it does not fit in two HWIs.  */
> +       s[0] = TREE_INT_CST_LOW (tcst);
> +       s[1] = TREE_INT_CST_HIGH (tcst);
> +       s[2] = 0;
> +       *l = 3;
> +       return s;
>
> ah, this is why you need the extra argument ;)  Btw, what happens
> when a proper encoding does not fit in MAX_LEN words?  That is,
> we'd need an extra zero word?
This is a hack that will die when the tree cst has a variable-length 
array.  When that happens, this code becomes an unconditional pointer 
copy.

> +  /* There should logically be an overload for rtl here, but it cannot
> +     be here because of circular include issues.  It is in rtl.h.  */
> +  static inline const HOST_WIDE_INT* to_shwi2
> +    (HOST_WIDE_INT *s ATTRIBUTE_UNUSED, int *l, rtx rcst);
>
> in rtl.c I suppose.  Btw, this is why I'm suggesting to defer implementation
> to a template - you can implement that outside of wide-int.h even without
> declaring the specialization here.
>
> Is there an accessible branch with the wide-int work?  I may be tempted
> to propose some changes in form of patches.  If not then I think the
> work is in a state that is suitable to put on a merge-branch.
I submitted all of the rtl work in the other patch and the first of the 
tree patches.  What I have is badly rotted, so I was going to wait until 
I had a stable API to clean it up and post everything.
> +wide_int
> +wide_int::add (const wide_int &op1, SignOp sgn, bool *overflow) const
> +{
> ...
> +  /* Uncompress the rest.  */
> +  for (i = len; i < op1.len; i++)
> +    {
> +      o0 = val[i];
> +      o1 = mask1;
> +      x = o0 + o1 + carry;
> +      result.val[i] = x;
> +      old_carry = carry;
> +      carry = x < o0;
> +    }
> +  for (i = len; i < op1.len; i++)
> +    {
> +      o0 = mask0;
> +      o1 = op1.val[i];
> +      x = o0 + o1 + carry;
> +      result.val[i] = x;
> +      old_carry = carry;
> +      carry = x < o0;
> +    }
>
> Cut & paste error here I think.  At least one should run up to this->len, no?
No, if len is 3, the top one you can look at is 0.
> And the first loop should run to min (len, op1.len) only.  How much
> testing coverage does the code have for "interesting" values?
> A standalone testing program will probably reveal such issues (due to
> the rich-ness of the current API covering everything will be hard :/)
>
> Looking at
>
> +HOST_WIDE_INT
> +wide_int::sign_mask () const
> +{
> +  int i = len - 1;
> +  if (precision < HOST_BITS_PER_WIDE_INT)
> +    return ((val[0] << (HOST_BITS_PER_WIDE_INT - precision))
> +           >> (HOST_BITS_PER_WIDE_INT - 1));
> +
> +  /* VRP appears to be badly broken and this is a very ugly fix.  */
> +  if (i >= 0)
> +    return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
> +
> +  gcc_unreachable ();
> +#if 0
> +  return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
> +#endif
> +}
>
> again - this looks like it does overly complicated work.  Our encoding
> of values guarantees that the following is correct:
>
> HOST_WIDE_INT
> wide_int::sign_mask () const
> {
>    if (val[len - 1] < 0)
>      return -1;
>    else
>      return 0;
> }
>
> I suppose quite some code can be simplified that calls sign_mask
> currently (thanks to the changed encoding).
>
> back to ::add ():
>
> +  small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
> +  if (small_prec == 0)
> +    {
> +      if (sgn == wide_int::SIGNED)
> +       {
> +         if (((!(o0 ^ o1)) & (x ^ o0)) >> (HOST_BITS_PER_WIDE_INT - 1))
> +           *overflow = true;
>
> shouldn't this share code with the non-overflow add_large and simply
> do overflow detection conditional on overflow being non-NULL?
> Because the add_large code seems to be correct with respect to
> the issues noted above ...
>
> + ex:
> +  result.canonize ();
> +  return result;
> +}
>
> canonizing is not strictly necessary - in fact it is unlikely to do
> something useful most of the time.  I'd say we should only
> canonize before building a tree or RTL with that representation.
> Note that canonize is wrong, too:
>
> +wide_int::canonize ()
> +{
> +  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
> +  int blocks_needed = BLOCKS_NEEDED (precision);
> +  HOST_WIDE_INT top;
> +  int i;
> +
> +  if (len > blocks_needed)
> +    len = blocks_needed;
>
> that will set a UHWI -1 to have len == 1, ignoring the extra needed
> zero word.  Thus, either BLOCKS_NEEDED is broken or its
> uses need to be audited.
>
> +
> +  /* Clean up the top bits for any mode that is not a multiple of a HWI.  */
> +  if (len == blocks_needed && small_prec)
> +    val[len - 1] = sext_hwi (val[len - 1], small_prec);
>
> That's not necessary.  By construction all arithmetic operations will
> preserve proper extension.  And you unconditionally sign-extend
> small precision "large" positive constants here which is even wrong
> (-1U, precision == 32, HWI == 64, original rep { 0x00..00fffff }, broken
> { 0xfff...fff } ).
>
> +  if (len == 1)
> +    return;
>
> in fact for len == 1 we should do nothing in this function from the start.
> Callers that also want to properly sign or zero extend the result value
> should do so using ext (), not abuse another implementation inside
> canonize () (which should only optimize the encoding).
>
> Ok, I'll stop reviewing here.  It's a lot better than before - thanks for
> doing the changes.
>
> I still like you to cut down the interface somewhat and _test_ what
> is remaining.  Put the code on a branch off trunk so I and others can
> produce patches against it.
>
> Thanks,
> Richard.
Kenneth Zadeck April 22, 2013, 6:02 p.m. UTC | #5
Richard,

I pushed send rather than save; just ignore the last email I sent.

Sorry.


On 04/22/2013 01:59 PM, Kenneth Zadeck wrote:
> On 04/19/2013 09:31 AM, Richard Biener wrote:
>> +   number of elements of the vector that are in use.  When LEN *
>> +   HOST_BITS_PER_WIDE_INT < the precision, the value has been
>> +   compressed.  The values of the elements of the vector greater than
>> +   LEN - 1. are all equal to the highest order bit of LEN.
>>
>> equal to the highest order bit of element LEN - 1. ?
> Fixed, you are correct.
>
> I have gone through the entire wide-int patch to clean this up.  The
> bottom line is that if the precision is not a multiple of the size of
> a HWI, then everything above that precision is assumed to be identical
> to the sign bit.
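>
> As a concrete illustration (assuming 64-bit HWIs and precision 128):
>
>    -1                 is stored as val = { -1 }, len = 1
>    1                  is stored as val = { 1 },  len = 1
>    0x8000000000000000 is stored as val = { 0x8000000000000000, 0 },
>                       len = 2
>
> The last case needs the explicit zero block because the top bit of
> val[0] is set; without it the value would read back as negative.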
>
>> Especially _not_ equal to the precision - 1 bit of the value, correct?
> I do not understand your question here, because in the case talked 
> about above, the bit at precision - 1 would not have been explicitly 
> represented.
>
> Anyway,  i went thru this top part carefully and made many things 
> clearer.
>> +   The representation does not contain any information inherent about
>> +   signedness of the represented value, so it can be used to represent
>> +   both signed and unsigned numbers.   For operations where the results
>> +   depend on signedness (division, comparisons), the signedness must
>> +   be specified separately.  For operations where the signedness
>> +   matters, one of the operands to the operation specifies either
>> +   wide_int::SIGNED or wide_int::UNSIGNED.
>>
>> The last sentence is somehow duplicated.
> fixed
>>
>> +   The numbers are stored as sign extended numbers as a means of
>> +   compression.  Leading HOST_WIDE_INTS that contain strings of either
>> +   -1 or 0 are removed as long as they can be reconstructed from the
>> +   top bit that is being represented.
>>
>> I'd put this paragraph before the one that talks about signedness, next
>> to the one that already talks about encoding.
> done
>> +   All constructors for wide_int take either a precision, an enum
>> +   machine_mode or tree_type.  */
>>
>> That's probably no longer true (I'll now check).
> yes you are correct
>
>> +class wide_int {
>> +  /* Internal representation.  */
>> +
>> +  /* VAL is set to a size that is capable of computing a full
>> +     multiplication on the largest mode that is represented on the
>> +     target.  The full multiplication is use by tree-vrp. tree-vpn
>> +     currently does a 2x largest mode by 2x largest mode yielding a 4x
>> +     largest mode result.  If operations are added that require larger
>> +     buffers, then VAL needs to be changed.  */
>> +  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
>> +  unsigned short len;
>> +  unsigned int precision;
>>
>> I wonder if there is a technical reason to stick to HOST_WIDE_INTs?
>> I'd say for efficiency HOST_WIDEST_FAST_INT would be more appropriate
>> (to get a 32bit value on 32bit x86 for example).  I of course see that
>> conversion to/from HOST_WIDE_INT is an important operation
>> that would get slightly more complicated.
>>
>> Maybe just quickly checking the code generated on 32bit x86 for
>> HOST_WIDE_INT vs. HOST_WIDEST_FAST_INT tells us whether
>> it's worth considering (it would be bad if each add/multiply would
>> end up calling to libgcc for example - I know that doesn't happen
>> for x86, but maybe it would happen for an arm hosted gcc
>> targeting x86_64?)
> This is an interesting point.  My guess is that it is unlikely to be
> worth the work.
> Consider add: most machines have add with carry, and a well written
> 32-bit port would use an add-with-carry sequence rather than making
> the libcall.  If I rewrite wide-int in terms of host_fastest_int,
> then I have to write some messy code to compute the carry, which is
> unlikely to translate into the proper carry instructions.  Not to
> mention the cost overhead of converting to and from HFI, given that
> gcc is written almost entirely using HWIs.
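>
> For the record, the portable block-at-a-time carry computation is a
> sketch like this (unsigned HWIs assumed):
>
>    x = o0 + o1;
>    carry1 = x < o0;                      /* wrapped on the add */
>    x += carry_in;
>    carry_out = carry1 | (x < carry_in);  /* or on the increment */
>
> and whether a compiler turns that back into add-with-carry is exactly
> what is in doubt.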
>
> I thought about the possible idea of just converting the mul and div
> functions.  This would be easy because I already reblock them into
> HOST_WIDE_HALF_INTs to do the math.  I could just do a different
> reblocking.  However, I think that it is unlikely that doing this
> would ever show up on anyone's performance counts.  Either way you do
> the same number of multiply instructions; it is just the subroutine
> wrapper that could possibly go away.
>
>> +  enum ShiftOp {
>> +    NONE,
>> +    /* There are two uses for the wide-int shifting functions. The
>> +       first use is as an emulation of the target hardware. The
>> +       second use is as service routines for other optimizations.  The
>> +       first case needs to be identified by passing TRUNC as the value
>> +       of ShiftOp so that shift amount is properly handled according 
>> to the
>> +       SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
>> +       amount is always truncated by the bytesize of the mode of
>> +       THIS.  */
>> +    TRUNC
>> +  };
>>
>> double-int simply honors SHIFT_COUNT_TRUNCATED.  Why differ
>> from that (and thus change behavior in existing code - not sure if you
>> do that by introducing wide-int)?
> I believe that GCC is supposed to be a little schizophrenic here, at
> least according to the doc.  When it is doing its own math, it is
> defined to use SHIFT_COUNT_TRUNCATED, but when it is doing math for
> the program it is supposed to honor what the port does.  The idea is
> that you pass in TRUNC when you want to have the shift op do what the
> port does.  Again, this goes back to an insistence that the
> optimizations are just doing what the machine would do, only earlier.
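>
> Roughly, as a sketch (CNT the shift count, PREC the precision; the
> names are illustrative only):
>
>    if (shiftop == NONE)
>      cnt = cnt % prec;          /* service routine: always truncate */
>    else if (SHIFT_COUNT_TRUNCATED)
>      cnt = cnt % prec;          /* emulate a port that truncates */
>    /* otherwise leave CNT alone, so an oversized count produces
>       whatever the machine would produce.  */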
>
>> +  enum SignOp {
>> +    /* Many of the math functions produce different results depending
>> +       on if they are SIGNED or UNSIGNED.  In general, there are two
>> +       different functions, whose names are prefixed with an 'S" and
>> +       or an 'U'.  However, for some math functions there is also a
>> +       routine that does not have the prefix and takes an SignOp
>> +       parameter of SIGNED or UNSIGNED.  */
>> +    SIGNED,
>> +    UNSIGNED
>> +  };
>>
>> You seem to insist on that.  It should propagate to the various parts
>> of the compiler that have settled for the 'uns' integer argument.
>> Having one piece behave different is just weird.  I suppose I will
>> find code like
>>
>>      wi.ext (prec, uns ? UNSIGNED : SIGNED)
>>
>> in your conversion?
> this has been resolved on another thread.
>> +  static wide_int from_shwi (HOST_WIDE_INT op0,
>> +                            unsigned int precision, bool *overflow = 
>> 0);
>>
>> +  if (precision < HOST_BITS_PER_WIDE_INT)
>> +    {
>> +      HOST_WIDE_INT t = sext_hwi (op0, precision);
>> +      if (t != op0 && overflow)
>> +       *overflow = true;
>> +      op0 = t;
>> +    }
>>
>> Hm.  I'd say input values should be properly extended already (we 
>> certainly
>> expect that from RTL or tree constants).  So we might want to assert the
>> above instead of tracking it via *overflow.
>>
>> +  static wide_int from_array (const HOST_WIDE_INT* op0,
>> +                             unsigned int len,
>> +                             unsigned int precision, bool need_canon = true);
>> +  inline static wide_int from_array (const HOST_WIDE_INT* op0,
>> +                                    unsigned int len,
>> +                                    enum machine_mode mode);
>> +  inline static wide_int from_array (const HOST_WIDE_INT* op0,
>> +                                    unsigned int len,
>> +                                    const_tree type);
>>
>> I still don't like the overloads precision vs. machine_mode vs. tree 
>> type.
>> It's much more explanative what you specify here if you write
>>
>>    from_array (&x, len, GET_MODE_PRECISION (mode))
>>
>> instead of
>>
>>    from_array (&x, len, mode)
>>
>> and it's not that much more typing either.
>>
>> +  static wide_int from_double_int (double_int,
>> +                                  unsigned int precision);
>> +  inline static wide_int from_double_int (double_int, enum machine_mode);
>>
>> this one would lack the tree type overload (I of course think it has an
>> excessive overload for machine_mode).
>>
>> +  inline HOST_WIDE_INT to_shwi (unsigned int prec = 0) const;
>> +  inline unsigned HOST_WIDE_INT to_uhwi (unsigned int prec = 0) const;
>>
>> this is
>>
>>     wi.sext (prec);
>>     wi.to_shwi ();
>>
>> which is less odd than handling prec == 0 specially.
>>
>> +  static wide_int max_value (unsigned int prec,
>> +                            unsigned int precision, SignOp sgn);
>>
>> two precisions - ugh ;)  In some places having to have a precision is 
>> ugly :/
> This is really necessary, unfortunately, though perhaps I can
> rearrange the operands so that one of them can be defaulted.  The
> problem is that in VRP you want to know the max value of some small
> precision, but you want that value to be expressed as a number in a
> larger precision.  Outside of tree-vrp, the rest of the compiler only
> uses one precision.
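>
> For instance, VRP can ask for the signed max of an 8-bit type
> expressed as a 32-bit wide_int:
>
>    wide_int::max_value (8, 32, wide_int::SIGNED)   /* 127, precision 32 */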
>
>> +  inline static wide_int minus_one (unsigned int prec);
>> +  inline static wide_int zero (unsigned int prec);
>> +  inline static wide_int one (unsigned int prec);
>> +  inline static wide_int two (unsigned int prec);
>>
>> here as well.  I see two (1) properly truncates the value to zero ;)
>> It just comes to my mind that these could have "arbitrary" precision.
>> "arbitrary" == MAX_SIZE * HOST_BITS_PER_WIDE_INT.  Which
>> would get us back to mixed precision operations ...
> These are now pretty rarely used, since constants almost always
> appear as the second operand of a binary operation, but the first
> operand has to convey the precision.
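>
> So with the templated operators the usual pattern is simply, e.g.,
>
>    wide_int masked = x & 0xff;   /* the constant takes x's precision */
>
> rather than materializing a wide_int::one () style constant first.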
>> +  /* Printing functions.  */
>> +
>> +  void print_dec (char *buf, SignOp sgn) const;
>> +  void print_dec (FILE *file, SignOp sgn) const;
>> +  void print_decs (char *buf) const;
>> +  void print_decs (FILE *file) const;
>> +  void print_decu (char *buf) const;
>> +  void print_decu (FILE *file) const;
>> +  void print_hex (char *buf) const;
>> +  void print_hex (FILE *file) const;
>>
>> making those non-member functions would allow getting rid of stdio.h
>> from wide-int.h (similar making the tree / rtx / mode argument methods
>> either standalone or making them templates and externalizing the
>> implementation to a separate template class would get rid of the
>> rtl / tree dependency)
>>
>> +  inline bool neg_p () const;
>>
>> how can you test that?  wide-int doesn't have sign information.
>>
>> +bool
>> +wide_int::neg_p () const
>> +{
>> +  return sign_mask () != 0;
>>
>> not exactly efficient either ...
>>
>> +HOST_WIDE_INT
>> +wide_int::sign_mask () const
>> +{
>> +  int i = len - 1;
>> +  if (precision < HOST_BITS_PER_WIDE_INT)
>> +    return ((val[0] << (HOST_BITS_PER_WIDE_INT - precision))
>> +           >> (HOST_BITS_PER_WIDE_INT - 1));
>> +
>> +  /* VRP appears to be badly broken and this is a very ugly fix.  */
>> +  if (i >= 0)
>> +    return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
>> +
>> +  gcc_unreachable ();
>> +#if 0
>> +  return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
>> +#endif
>> +}
>>
>> I can't see what the "fix" is.  It seems to be equivalent to the
>> commented-out code.  Apart from not handling len == 0, for which you
>> ICE then.
>>
>> Back to neg_p ... it computes wrong information.
>> from_uwhi (-1, HOST_BITS_PER_WIDE_INT).neg_p () returns true.
>>
>> I suppose you want to rename it to msb () (implementing that efficiently
>> and using it from sign_mask, too)
>>
>> +  template <typename T>
>> +    inline bool gt_p (T c, SignOp sgn) const;
>> +  template <typename T>
>> +    inline bool gts_p (T c) const;
>> +  template <typename T>
>> +    inline bool gtu_p (T c) const;
>>
>> it's bad that we can't use the sign information we have available in 
>> almost
>> all cases ... (where precision is not an exact multiple of
>> HOST_BITS_PER_WIDE_INT
>> and len == precision / HOST_BITS_PER_WIDE_INT).  It isn't hard to encode
>> a sign - you just have to possibly waste a word of zeroes for positive
>> values where at the moment precision is an exact multiple of
>> HOST_BIST_PER_WIDE_INT and len == precision / HOST_BITS_PER_WIDE_INT.
>> Which of course means that the encoding can be one word larger than
>> maximally required by 'precision'.
>>
>> +  wide_int force_to_size (unsigned int precision, SignOp sgn) const;
>> +  inline wide_int force_to_size (enum machine_mode mode, SignOp sgn) const;
>> +  inline wide_int force_to_size (const_tree type) const;
>> +  inline wide_int force_to_size (const_tree type, SignOp sgn) const;
>> +
>> +  inline wide_int sforce_to_size (enum machine_mode mode) const;
>> +  inline wide_int sforce_to_size (const_tree type) const;
>> +  inline wide_int zforce_to_size (enum machine_mode mode) const;
>> +  inline wide_int zforce_to_size (const_tree type) const;
>>
>> too many overloads again
>>
>> Looking at
>>
>> +template <typename T>
>> +wide_int
>> +wide_int::operator & (T c) const
>> +{
>> +  wide_int result;
>> +  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
>> +  const HOST_WIDE_INT *s;
>> +  int cl;
>> +
>> +  s = to_shwi2 (ws, &cl, c);
>> +
>>
>> I see
>>
>> +  template <typename T>
>> +  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int *l, T x)
>> +  {
>> +    s[0] = x;
>> +    if (~(T)0 < (T)0
>> ...
>>
>> +  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s ATTRIBUTE_UNUSED,
>> +                                       int *l, const wide_int &y)
>> +  {
>> +    *l = y.len;
>> ...
>>
>> I think you want to use template specialization here, not 
>> overloading.  Thus,
>>
>>    template <>
>>    static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int *l, const wide_int &y)
>>
>> and adjust the HWI case to look like
>>
>>    template <typename T>
>>    static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int *l, const T& x)
>>
>> you want to move that ~(T)0 < (T)0 check up to template substitution 
>> level
>> to allow SFINAE to disable the template if that's not computable.
>>
>> +    s[0] = x;
>> +    if (~(T)0 < (T)0
>> +       || sizeof (T) < sizeof (HOST_WIDE_INT))
>> +      {
>> +       *l = 1;
>> +      }
>> +    else
>> +      {
>> +       s[1] = 0;
>> +       *l = 2;
>>
>> that's only required if the MSB is set, otherwise it's not properly 
>> compressed
>>
>> +      }
>> +    return s;
>>
>> hmm, looking at from_uhwi I see that you are adding the extra zero words
>> as I suggested ... which means you _do_ have reliable sign information
>> available (well, not for the case where len would end up bigger than 
>> MAX_LEN).
>> So - can we simplify some things with that?  I can think of the ordered
>> comparisons for example.  Also the encoding description should be
>> clarified that there _is_ sign information available (and that len 
>> can be
>> bigger than precision / HOST_BITS_PER_WIDE_INT).
> I did all of this because it makes some things easier and other
> things easier to talk about.  In particular, it allows the binary
> operators to work properly with constants (thanks for the trick).
>
> I did not do this because I have any interest in changing the
> compiler so that its math is signed infinite precision.  I strongly
> do not want to go there.  If you want to take my patches later and
> experiment with this, fine.  I do not want to be the person who
> drives people to use llvm because they all of a sudden start getting
> weird (but conforming to the C and C++ specs) results out of the
> optimizer.
>
> In particular, others in the community consider it non-conforming
> gimple to look at a value beyond the precision.  See the posting by
> iant about my question on 11/05/2012.
>
>
>
>
>> Returning to to_shwi2 (odd name, btw):
>>
>> +       /* Copy the result because it does not fit in two HWIs. */
>> +       s[0] = TREE_INT_CST_LOW (tcst);
>> +       s[1] = TREE_INT_CST_HIGH (tcst);
>> +       s[2] = 0;
>> +       *l = 3;
>> +       return s;
>>
>> ah, this is why you need the extra argument ;)  Btw, what happens
>> when a proper encoding does not fit in MAX_LEN words?  That is,
>> we'd need an extra zero word?
> This is a hack that will die when the tree cst has a variable-length
> array.  When that happens, this code becomes an unconditional pointer
> copy.
>
>> +  /* There should logically be an overload for rtl here, but it cannot
>> +     be here because of circular include issues.  It is in rtl.h.  */
>> +  static inline const HOST_WIDE_INT* to_shwi2
>> +    (HOST_WIDE_INT *s ATTRIBUTE_UNUSED, int *l, rtx rcst);
>>
>> in rtl.c I suppose.  Btw, this is why I'm suggesting to defer 
>> implementation
>> to a template - you can implement that outside of wide-int.h even 
>> without
>> declaring the specialization here.
>>
>> Is there an accessible branch with the wide-int work?  I may be tempted
>> to propose some changes in form of patches.  If not then I think the
>> work is in a state that is suitable to put on a merge-branch.
> I submitted all of the rtl work in the other patch, and the first of
> the tree patches.  What I have is badly rotted, so I was going to
> wait until I had a stable API to clean it up and post everything.
>> +wide_int
>> +wide_int::add (const wide_int &op1, SignOp sgn, bool *overflow) const
>> +{
>> ...
>> +  /* Uncompress the rest.  */
>> +  for (i = len; i < op1.len; i++)
>> +    {
>> +      o0 = val[i];
>> +      o1 = mask1;
>> +      x = o0 + o1 + carry;
>> +      result.val[i] = x;
>> +      old_carry = carry;
>> +      carry = x < o0;
>> +    }
>> +  for (i = len; i < op1.len; i++)
>> +    {
>> +      o0 = mask0;
>> +      o1 = op1.val[i];
>> +      x = o0 + o1 + carry;
>> +      result.val[i] = x;
>> +      old_carry = carry;
>> +      carry = x < o0;
>> +    }
>>
>> Cut & paste error here I think.  At least one should run up to
>> this->len, no?
> no, if len is 3, the top one you can look at is 0
>> And the first loop should run to min (len, op1.len) only.  How much
>> testing coverage does the code have for "interesting" values?
>> A standalone testing program will probably reveal such issues (due to
>> the richness of the current API covering everything will be hard :/)
>>
>> Looking at
>>
>> +HOST_WIDE_INT
>> +wide_int::sign_mask () const
>> +{
>> +  int i = len - 1;
>> +  if (precision < HOST_BITS_PER_WIDE_INT)
>> +    return ((val[0] << (HOST_BITS_PER_WIDE_INT - precision))
>> +           >> (HOST_BITS_PER_WIDE_INT - 1));
>> +
>> +  /* VRP appears to be badly broken and this is a very ugly fix.  */
>> +  if (i >= 0)
>> +    return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
>> +
>> +  gcc_unreachable ();
>> +#if 0
>> +  return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
>> +#endif
>> +}
>>
>> again - this looks like it does overly complicated work.  Our encoding
>> of values guarantees that the following is correct:
>>
>> HOST_WIDE_INT
>> wide_int::sign_mask () const
>> {
>>    if (val[len - 1] < 0)
>>      return -1;
>>    else
>>      return 0;
>> }
>>
>> I suppose quite some code can be simplified that calls sign_mask
>> currently (thanks to the changed encoding).
>>
>> back to ::add ():
>>
>> +  small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
>> +  if (small_prec == 0)
>> +    {
>> +      if (sgn == wide_int::SIGNED)
>> +       {
>> +         if (((!(o0 ^ o1)) & (x ^ o0)) >> (HOST_BITS_PER_WIDE_INT - 1))
>> +           *overflow = true;
>>
>> shouldn't this share code with the non-overflow add_large and simply
>> do overflow detection conditional on overflow being non-NULL?
>> Because the add_large code seems to be correct with respect to
>> the issues noted above ...
>>
>> + ex:
>> +  result.canonize ();
>> +  return result;
>> +}
>>
>> canonizing is not strictly necessary - in fact it is unlikely to do
>> something useful most of the time.  I'd say we should only
>> canonize before building a tree or RTL with that representation.
>> Note that canonize is wrong, too:
>>
>> +wide_int::canonize ()
>> +{
>> +  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
>> +  int blocks_needed = BLOCKS_NEEDED (precision);
>> +  HOST_WIDE_INT top;
>> +  int i;
>> +
>> +  if (len > blocks_needed)
>> +    len = blocks_needed;
>>
>> that will set a UHWI -1 to have len == 1, ignoring the extra needed
>> zero word.  Thus, either BLOCKS_NEEDED is broken or its
>> uses need to be audited.
>>
>> +
>> +  /* Clean up the top bits for any mode that is not a multiple of a 
>> HWI.  */
>> +  if (len == blocks_needed && small_prec)
>> +    val[len - 1] = sext_hwi (val[len - 1], small_prec);
>>
>> That's not necessary.  By construction all arithmetic operations will
>> preserve proper extension.  And you unconditionally sign-extend
>> small precision "large" positive constants here which is even wrong
>> (-1U, precision == 32, HWI == 64, original rep { 0x00..00fffff }, broken
>> { 0xfff...fff } ).
>>
>> +  if (len == 1)
>> +    return;
>>
>> in fact for len == 1 we should do nothing in this function from the 
>> start.
>> Callers that also want to properly sign or zero extend the result value
>> should do so using ext (), not abuse another implementation inside
>> canonize () (which should only optimize the encoding).
>>
>> Ok, I'll stop reviewing here.  It's a lot better than before - thanks 
>> for
>> doing the changes.
>>
>> I still like you to cut down the interface somewhat and _test_ what
>> is remaining.  Put the code on a branch off trunk so I and others can
>> produce patches against it.
>>
>> Thanks,
>> Richard.
>
Kenneth Zadeck April 23, 2013, 12:35 a.m. UTC | #6
On 04/22/2013 08:20 AM, Richard Biener wrote:

>
> That said, a lot of my pushback is because I feel a little lonesome in this
> wide-int review and don't want to lone-some decide about that (generic)
> interface part as well.
Yeah, now that sandiford is back from vacation there are two of us to
beat on you about how bad it would be to do infinite precision!

Be careful what you wish for.

Kenny
diff mbox

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 109f865..514fabc 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -857,7 +857,7 @@  RTL_BASE_H = coretypes.h rtl.h rtl.def $(MACHMODE_H) reg-notes.def \
   insn-notes.def $(INPUT_H) $(REAL_H) statistics.h $(VEC_H) \
   $(FIXED_VALUE_H) alias.h $(HASHTAB_H)
 FIXED_VALUE_H = fixed-value.h $(MACHMODE_H) double-int.h
-RTL_H = $(RTL_BASE_H) $(FLAGS_H) genrtl.h
+RTL_H = $(RTL_BASE_H) $(FLAGS_H) genrtl.h wide-int.h
 RTL_ERROR_H = rtl-error.h $(RTL_H) $(DIAGNOSTIC_CORE_H)
 READ_MD_H = $(OBSTACK_H) $(HASHTAB_H) read-md.h
 PARAMS_H = params.h params.def
@@ -945,7 +945,7 @@  TREE_PRETTY_PRINT_H = tree-pretty-print.h $(PRETTY_PRINT_H)
 GIMPLE_PRETTY_PRINT_H = gimple-pretty-print.h $(TREE_PRETTY_PRINT_H)
 DIAGNOSTIC_CORE_H = diagnostic-core.h $(INPUT_H) bversion.h diagnostic.def
 DIAGNOSTIC_H = diagnostic.h $(DIAGNOSTIC_CORE_H) $(PRETTY_PRINT_H)
-DWARF2OUT_H = dwarf2out.h $(DWARF2_H)
+DWARF2OUT_H = dwarf2out.h $(DWARF2_H) wide-int.h
 C_PRETTY_PRINT_H = c-family/c-pretty-print.h $(PRETTY_PRINT_H) \
 	$(C_COMMON_H) $(TREE_H)
 SCEV_H = tree-scalar-evolution.h $(GGC_H) tree-chrec.h $(PARAMS_H)
@@ -1458,6 +1458,7 @@  OBJS = \
 	varpool.o \
 	vmsdbgout.o \
 	web.o \
+	wide-int.o \
 	xcoffout.o \
 	$(out_object_file) \
 	$(EXTRA_OBJS) \
@@ -2687,6 +2688,7 @@  targhooks.o : targhooks.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TREE_H) \
    tree-ssa-alias.h $(TREE_FLOW_H)
 common/common-targhooks.o : common/common-targhooks.c $(CONFIG_H) $(SYSTEM_H) \
    coretypes.h $(INPUT_H) $(TM_H) $(COMMON_TARGET_H) common/common-targhooks.h
+wide-int.o: wide-int.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H)
 
 bversion.h: s-bversion; @true
 s-bversion: BASE-VER
@@ -2853,10 +2855,10 @@  emit-rtl.o : emit-rtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(HASHTAB_H) $(TM_P_H) debug.h langhooks.h $(TREE_PASS_H) gt-emit-rtl.h \
    $(DF_H) $(PARAMS_H) $(TARGET_H)
 real.o : real.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
-   $(DIAGNOSTIC_CORE_H) $(TM_P_H) $(REAL_H) dfp.h realmpfr.h
+   $(DIAGNOSTIC_CORE_H) $(TM_P_H) $(REAL_H) dfp.h realmpfr.h wide-int.h
 realmpfr.o : realmpfr.c realmpfr.h $(CONFIG_H) $(SYSTEM_H) coretypes.h $(REAL_H) $(TREE_H)
 dfp.o : dfp.c dfp.h $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H)	$(TREE_H) \
-   $(TM_P_H) $(REAL_H) $(DECNUM_H)
+   $(TM_P_H) $(REAL_H) $(DECNUM_H) wide-int.h
 fixed-value.o: fixed-value.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(TREE_H) $(REAL_H) $(DIAGNOSTIC_CORE_H)
 jump.o : jump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
@@ -3941,15 +3943,16 @@  CFLAGS-gengtype-parse.o += -DGENERATOR_FILE
 build/gengtype-parse.o: $(BCONFIG_H)
 
 gengtype-state.o build/gengtype-state.o: gengtype-state.c $(SYSTEM_H) \
-  gengtype.h errors.h double-int.h version.h $(HASHTAB_H) $(OBSTACK_H) \
-  $(XREGEX_H)
+  gengtype.h errors.h double-int.h version.h $(HASHTAB_H)    \
+  $(OBSTACK_H) $(XREGEX_H)
 gengtype-state.o: $(CONFIG_H)
 CFLAGS-gengtype-state.o += -DGENERATOR_FILE
 build/gengtype-state.o: $(BCONFIG_H)
-
+wide-int.h: $(GTM_H) $(TREE_H) hwint.h $(OPTIONS_H) $(TM_H)             \
+  $(MACHMODE_H) double-int.h dumpfile.h $(REAL_H)
 gengtype.o build/gengtype.o : gengtype.c $(SYSTEM_H) gengtype.h 	\
-  rtl.def insn-notes.def errors.h double-int.h version.h $(HASHTAB_H) \
-  $(OBSTACK_H) $(XREGEX_H)
+  rtl.def insn-notes.def errors.h double-int.h wide-int.h version.h     \
+  $(HASHTAB_H) $(OBSTACK_H) $(XREGEX_H)
 gengtype.o: $(CONFIG_H)
 CFLAGS-gengtype.o += -DGENERATOR_FILE
 build/gengtype.o: $(BCONFIG_H)
diff --git a/gcc/wide-int.c b/gcc/wide-int.c
new file mode 100644
index 0000000..80f1981
--- /dev/null
+++ b/gcc/wide-int.c
@@ -0,0 +1,3164 @@ 
+/* Operations with very long integers.
+   Copyright (C) 2012-2013 Free Software Foundation, Inc.
+   Contributed by Kenneth Zadeck <zadeck@naturalbridge.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "hwint.h"
+#include "wide-int.h"
+#include "rtl.h"
+#include "tree.h"
+#include "dumpfile.h"
+
+/* This is the maximal size of the buffer needed for dump.  */
+const int MAX_SIZE = 4 * (MAX_BITSIZE_MODE_ANY_INT / 4
+		     + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT + 32);
+
+/*
+ * Internal utilities.
+ */
+
+/* Quantities to deal with values that hold half of a wide int.  Used
+   in multiply and divide.  */
+#define HALF_INT_MASK (((HOST_WIDE_INT)1 << HOST_BITS_PER_HALF_WIDE_INT) - 1)
+
+#define BLOCK_OF(TARGET) ((TARGET) / HOST_BITS_PER_WIDE_INT)
+#define BLOCKS_NEEDED(PREC) \
+  (((PREC) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT)
+#define SIGN_MASK(X) (((HOST_WIDE_INT)X) >> (HOST_BITS_PER_WIDE_INT - 1))
+/*
+ * Conversion routines in and out of wide-int.
+ */
+
+/* Convert OP0 into a wide int of PRECISION.  If the precision is less
+   than HOST_BITS_PER_WIDE_INT, sign extend the value of the word.
+   *OVERFLOW is set if the value did not fit in the precision.  */
+
+wide_int
+wide_int::from_shwi (HOST_WIDE_INT op0,
+		     unsigned int precision, bool *overflow)
+{
+  wide_int result;
+
+  result.precision = precision;
+
+  if (precision < HOST_BITS_PER_WIDE_INT)
+    {
+      HOST_WIDE_INT t = sext_hwi (op0, precision);
+      if (t != op0 && overflow)
+	*overflow = true; 
+      op0 = t;
+    }
+
+  result.val[0] = op0;
+  result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wh ("wide_int::from_shwi %s " HOST_WIDE_INT_PRINT_HEX ")\n",
+	      result, op0);
+#endif
+
+  return result;
+}
+
+/* Convert OP0 into a wide int of PRECISION.  If the precision is less
+   than HOST_BITS_PER_WIDE_INT, zero extend the value of the word.
+   *OVERFLOW is set if the value did not fit in the precision.  */
+
+wide_int
+wide_int::from_uhwi (unsigned HOST_WIDE_INT op0, 
+		     unsigned int precision, bool *overflow)
+{
+  wide_int result;
+
+  result.precision = precision;
+
+  if (precision < HOST_BITS_PER_WIDE_INT)
+    {
+      unsigned HOST_WIDE_INT t = zext_hwi (op0, precision);
+      if (t != op0 && overflow)
+	*overflow = true;
+      op0 = t;
+    }
+
+  result.val[0] = op0;
+
+  /* If the top bit is a 1, we need to add another word of 0s, since
+     otherwise the compressed value would read back as -1; the infinite
+     expansion of any unsigned number must have 0s at the top.  */
+  if ((HOST_WIDE_INT)op0 < 0 && precision > HOST_BITS_PER_WIDE_INT)
+    {
+      result.val[1] = 0;
+      result.len = 2;
+    }
+  else
+    result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wh ("wide_int::from_uhwi %s " HOST_WIDE_INT_PRINT_HEX ")\n",
+	      result, op0);
+#endif
+
+  return result;
+}
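+
+/* For example (64-bit HWIs): from_uhwi (0xffffffffffffffff, 128)
+   yields val = { -1, 0 } with len == 2; the explicit zero block keeps
+   the value from being read back as -1.  */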
+
+/* Create a wide_int from an array OP1 of LEN host_wide_ints.
+   The result has precision PRECISION.  */
+
+wide_int
+wide_int::from_array (const HOST_WIDE_INT *op1, unsigned int len, 
+		      unsigned int precision, bool need_canon)
+{
+  unsigned int i;
+  wide_int result;
+  
+  result.len = len;
+  result.precision = precision;
+
+  for (i=0; i < len; i++)
+    result.val[i] = op1[i];
+
+  if (need_canon)
+    result.canonize ();
+  return result;
+}
+
+/* Convert a double int into a wide int with precision PREC.  */
+
+wide_int
+wide_int::from_double_int (double_int di, unsigned int prec)
+{
+  HOST_WIDE_INT op = di.low;
+  wide_int result;
+
+  result.precision = prec;
+  result.len = (prec <= HOST_BITS_PER_WIDE_INT) ? 1 : 2;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    result.val[0] = sext_hwi (op, prec);
+  else
+    {
+      result.val[0] = op;
+      if (prec > HOST_BITS_PER_WIDE_INT)
+	{
+	  if (prec < HOST_BITS_PER_DOUBLE_INT)
+	    result.val[1] = sext_hwi (di.high, prec - HOST_BITS_PER_WIDE_INT);
+	  else
+	    result.val[1] = di.high;
+	}
+    }
+
+  if (result.len == 2)
+    result.canonize ();
+
+  return result;
+}
+
+/* Convert an integer cst into a wide int.  */
+
+wide_int
+wide_int::from_tree (const_tree tcst)
+{
+  wide_int result;
+  tree type = TREE_TYPE (tcst);
+  unsigned int prec = TYPE_PRECISION (type);
+
+  result.precision = prec;
+
+  /* This will simplify out to just an array copy and a len copy.  */
+  result.val[0] = TREE_INT_CST_LOW (tcst);
+  result.val[1] = TREE_INT_CST_HIGH (tcst);
+  if (prec == HOST_BITS_PER_DOUBLE_INT 
+      && TYPE_UNSIGNED (type)
+      && TREE_INT_CST_HIGH (tcst) < 0)
+    {
+      result.val[2] = 0;
+      result.len = 3;
+    }
+  else if (prec > HOST_BITS_PER_WIDE_INT)
+    result.len = 2;
+  else if (prec == HOST_BITS_PER_WIDE_INT
+	   && TYPE_UNSIGNED (type)
+	   && (HOST_WIDE_INT)TREE_INT_CST_LOW (tcst) < 0)
+    result.len = 2;
+  else
+    result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    {
+      debug_whh ("wide_int:: %s = from_tree ("HOST_WIDE_INT_PRINT_HEX" "HOST_WIDE_INT_PRINT_HEX")\n",
+		 result, TREE_INT_CST_HIGH (tcst), TREE_INT_CST_LOW (tcst));
+    }
+#endif
+
+  return result;
+}
+
+/* Extract a constant integer from X, an rtx constant of mode MODE.
+   The bits of the integer are returned.  */
+
+wide_int
+wide_int::from_rtx (const_rtx x, enum machine_mode mode)
+{
+  wide_int result;
+  unsigned int prec = GET_MODE_PRECISION (mode);
+
+  gcc_assert (mode != VOIDmode);
+
+  result.precision = prec;
+
+  switch (GET_CODE (x))
+    {
+    case CONST_INT:
+      result.val[0] = INTVAL (x);
+      result.len = 1;
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	{
+	  debug_wh ("wide_int:: %s = from_rtx ("HOST_WIDE_INT_PRINT_HEX")\n",
+		    result, INTVAL (x));
+	}
+#endif
+      break;
+
+#if TARGET_SUPPORTS_WIDE_INT
+    case CONST_WIDE_INT:
+      {
+	int i;
+	result.len = CONST_WIDE_INT_NUNITS (x);
+	
+	for (i = 0; i < result.len; i++)
+	  result.val[i] = CONST_WIDE_INT_ELT (x, i);
+      }
+      break;
+#else
+    case CONST_DOUBLE:
+      result.len = 2;
+      result.val[0] = CONST_DOUBLE_LOW (x);
+      result.val[1] = CONST_DOUBLE_HIGH (x);
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	{
+	  debug_whh ("wide_int:: %s = from_rtx ("HOST_WIDE_INT_PRINT_HEX" "HOST_WIDE_INT_PRINT_HEX")\n",
+		     result, CONST_DOUBLE_HIGH (x), CONST_DOUBLE_LOW (x));
+	}
+#endif
+
+      break;
+#endif
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return result;
+}
+
+/* Construct from a buffer of length LEN.  BUFFER will be read according
+   to byte endianness and word endianness.  Only the lower LEN bytes
+   of the result are set; the remaining high bytes are cleared.  */
+
+wide_int
+wide_int::from_buffer (const unsigned char *buffer, int len)
+{
+  wide_int result = wide_int::zero (len * BITS_PER_UNIT);
+  int words = len / UNITS_PER_WORD;
+
+  /* Zero all of the blocks the loop below may OR into, and fix up
+     LEN; zero () alone leaves len == 1.  */
+  result.len = BLOCKS_NEEDED (len * BITS_PER_UNIT);
+  for (int i = 1; i < result.len; i++)
+    result.val[i] = 0;
+
+  for (int byte = 0; byte < len; byte++)
+    {
+      int offset;
+      int index;
+      int bitpos = byte * BITS_PER_UNIT;
+      unsigned HOST_WIDE_INT value;
+
+      if (len > UNITS_PER_WORD)
+	{
+	  int word = byte / UNITS_PER_WORD;
+
+	  if (WORDS_BIG_ENDIAN)
+	    word = (words - 1) - word;
+
+	  offset = word * UNITS_PER_WORD;
+
+	  if (BYTES_BIG_ENDIAN)
+	    offset += (UNITS_PER_WORD - 1) - (byte % UNITS_PER_WORD);
+	  else
+	    offset += byte % UNITS_PER_WORD;
+	}
+      else
+	offset = BYTES_BIG_ENDIAN ? (len - 1) - byte : byte;
+
+      value = (unsigned HOST_WIDE_INT) buffer[offset];
+
+      index = bitpos / HOST_BITS_PER_WIDE_INT;
+      result.val[index] |= value << (bitpos % HOST_BITS_PER_WIDE_INT);
+    }
+
+  result.canonize ();
+  return result;
+}
+
+
+/*
+ * Largest and smallest values in a mode.
+ */
+
+/* Produce the largest SGNed number that is representable in PREC.  The
+   resulting number is placed in a wide_int of precision PRECISION.  */
+
+wide_int
+wide_int::max_value (unsigned int prec, unsigned int precision, 
+		     SignOp sgn)
+{
+  wide_int result;
+  
+  result.precision = precision;
+
+  if (sgn == UNSIGNED)
+    {
+      /* The unsigned max is just all ones, for which the compressed
+	 rep is just a single HWI.  */ 
+      result.len = 1;
+      result.val[0] = (HOST_WIDE_INT)-1;
+    }
+  else
+    {
+      /* The signed max is all ones except the top bit.  This must be
+	 explicitly represented.  */
+      int i;
+      int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+      int shift = (small_prec == 0) 
+	? HOST_BITS_PER_WIDE_INT - 1 : small_prec - 1;
+
+      result.len = BLOCKS_NEEDED (prec);
+      for (i = 0; i < result.len - 1; i++)
+	result.val[i] = (HOST_WIDE_INT)-1;
+
+      result.val[result.len - 1] = ((HOST_WIDE_INT)1 << shift) - 1;
+    }
+  
+  return result;
+}
+
+/* Produce the smallest SGNed number that is representable in PREC.  The
+   resulting number is placed in a wide_int of precision PRECISION.  */
+
+wide_int
+wide_int::min_value (unsigned int prec, unsigned int precision, 
+		     SignOp sgn)
+{
+  if (sgn == UNSIGNED)
+    {
+      /* The unsigned min is just all zeros, for which the compressed
+	 rep is just a single HWI.  */ 
+      wide_int result;
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = 0;
+      return result;
+    }
+  else
+    {
+      /* The signed min is all zeros except the top bit.  This must be
+	 explicitly represented.  */
+      return set_bit_in_zero (prec - 1, precision);
+    }
+}
+
+/*
+ * Public utilities.
+ */
+
+/* Check the upper HOST_WIDE_INTs of THIS to see if the length can be
+   shortened.  An upper HOST_WIDE_INT is unnecessary if it is all ones
+   or zeros and the top bit of the next lower word matches.
+
+   This function may change the representation of THIS, but does not
+   change the value that THIS represents.  It does not sign extend in
+   the case that the precision is less than
+   HOST_BITS_PER_WIDE_INT.  */
+
+void
+wide_int::canonize ()
+{
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  int blocks_needed = BLOCKS_NEEDED (precision);
+  HOST_WIDE_INT top;
+  int i;
+
+  if (len > blocks_needed)
+    len = blocks_needed;
+
+  /* Clean up the top bits for any mode that is not a multiple of a HWI.  */
+  if (len == blocks_needed && small_prec)
+    val[len - 1] = sext_hwi (val[len - 1], small_prec);
+
+  if (len == 1)
+    return;
+
+  top = val[len - 1];
+  if (top != 0 && top != (HOST_WIDE_INT)-1)
+    return;
+
+  /* At this point we know that the top is either 0 or -1.  Find the
+     first block that is not a copy of this.  */
+  for (i = len - 2; i >= 0; i--)
+    {
+      HOST_WIDE_INT x = val[i];
+      if (x != top)
+	{
+	  if (SIGN_MASK (x) == top)
+	    {
+	      len = i + 1;
+	      return;
+	    }
+
+	  /* We need an extra block because the top bit block i does
+	     not match the extension.  */
+	  len = i + 2;
+	  return;
+	}
+    }
+
+  /* The number is 0 or -1.  */
+  len = 1;
+}
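+
+/* Worked examples (64-bit HWIs, precision 192): { -1, -1, -1 }
+   canonizes to { -1 } with len == 1, and { 5, 0, 0 } to { 5 } with
+   len == 1, while { -1, -1, 0x7f } keeps len == 3 because the top
+   block is not a sign copy of the block below it.  */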
+
+
+/* Make a copy of this.  */
+
+wide_int
+wide_int::copy () const
+{
+  wide_int result;
+  int i;
+
+  result.precision = precision;
+  result.len = len;
+
+  for (i = 0; i < result.len; i++)
+    result.val[i] = val[i];
+  return result;
+}
+
+
+/* Copy THIS replacing the precision with PREC.
+   It can do any of truncation, extension or copying.  */
+
+wide_int
+wide_int::force_to_size (unsigned int prec, SignOp sgn) const
+{
+  wide_int result;
+  int blocks_needed = BLOCKS_NEEDED (prec);
+  int i;
+
+  result.precision = prec;
+  result.len = blocks_needed < len ? blocks_needed : len;
+  for (i = 0; i < result.len; i++)
+    result.val[i] = val[i];
+
+  if (prec >= precision) 
+    {
+      /* Expanding */
+      int small_precision = precision & (HOST_BITS_PER_WIDE_INT - 1);
+
+      /* The only case we care about is unsigned because the rep is
+	 inherently signed.  */
+      if (sgn == UNSIGNED)
+	{
+	  /* The top block in the existing rep must be zero extended,
+	     but this is all the work we need to do.  */
+	  if (small_precision 
+	      && (len == BLOCKS_NEEDED (precision)
+		  || len == blocks_needed))
+	    result.val[len-1] = zext_hwi (result.val[len-1], small_precision);
+	  else if (len == BLOCKS_NEEDED (precision) 
+		   && len < blocks_needed
+		   && small_precision == 0
+		   && result.val[result.len - 1] < 0)
+		    /* We need to put the 0 block on top to keep the value
+		       from being sign extended.  */ 
+		    result.val[result.len++] = 0;
+	}
+    }
+  else
+    {
+      /* Truncating.  */
+      int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+      /* The only weird case we need to look at here is when we are
+         truncating within the top block.  We need to make sure that
+         everything in the block above the new precision is sign
+         extended.  Note that this is independent of the SGN.  This is
+         just to stay canonical.  */
+      if (small_prec && (blocks_needed == len))
+	result.val[blocks_needed-1]
+	  = sext_hwi (result.val[blocks_needed-1], small_prec);
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwvs ("wide_int:: %s = force_to_size (%s, prec = %d %s)\n", 
+		 result, *this, prec, sgn==UNSIGNED ? "U" : "S");
+#endif
+
+  return result;
+}
+
+/*
+ * public printing routines.
+ */
+
+/* Try to print the signed self in decimal to BUF if the number fits
+   in a HWI.  Otherwise print in hex.  */
+
+void 
+wide_int::print_decs (char *buf) const
+{
+  if ((precision <= HOST_BITS_PER_WIDE_INT)
+      || (len == 1 && !neg_p ()))
+      sprintf (buf, HOST_WIDE_INT_PRINT_DEC, val[0]);
+  else
+    print_hex (buf);
+}
+
+/* Try to print the signed self in decimal to FILE if the number fits
+   in a HWI.  Otherwise print in hex.  */
+
+void 
+wide_int::print_decs (FILE *file) const
+{
+  char buf[(2 * MAX_BITSIZE_MODE_ANY_INT / BITS_PER_UNIT) + 4];
+  print_decs (buf);
+  fputs (buf, file);
+}
+
+/* Try to print the unsigned self in decimal to BUF if the number fits
+   in a HWI.  Otherwise print in hex.  */
+
+void 
+wide_int::print_decu (char *buf) const
+{
+  if ((precision <= HOST_BITS_PER_WIDE_INT)
+      || (len == 1 && !neg_p ()))
+      sprintf (buf, HOST_WIDE_INT_PRINT_UNSIGNED, val[0]);
+  else
+    print_hex (buf);
+}
+
+/* Try to print the unsigned self in decimal to FILE if the number fits
+   in a HWI.  Otherwise print in hex.  */
+
+void 
+wide_int::print_decu (FILE *file) const
+{
+  char buf[(2 * MAX_BITSIZE_MODE_ANY_INT / BITS_PER_UNIT) + 4];
+  print_decu (buf);
+  fputs (buf, file);
+}
+
+void 
+wide_int::print_hex (char *buf) const
+{
+  int i = len;
+
+  if (zero_p ())
+    sprintf (buf, "0x0");
+  else
+    {
+      if (neg_p ())
+	{
+	  int j;
+	  /* If the number is negative, we may need to pad the value with
+	     0xFFF...  because the leading elements may be missing and
+	     we do not print a '-' with hex.  Print the "0x" prefix here
+	     since the padded format below does not include it.  */
+	  buf += sprintf (buf, "0x");
+	  for (j = BLOCKS_NEEDED (precision); j > i; j--)
+	    buf += sprintf (buf, HOST_WIDE_INT_PRINT_PADDED_HEX, (HOST_WIDE_INT) -1);
+	}
+      else
+	buf += sprintf (buf, HOST_WIDE_INT_PRINT_HEX, val [--i]);
+      while (-- i >= 0)
+	buf += sprintf (buf, HOST_WIDE_INT_PRINT_PADDED_HEX, val [i]);
+    }
+}
+
+/* Print one big hex number to FILE.  Note that some assemblers may not
+   accept this for large modes.  */
+void 
+wide_int::print_hex (FILE *file) const
+{
+  char buf[(2 * MAX_BITSIZE_MODE_ANY_INT / BITS_PER_UNIT) + 4];
+  print_hex (buf);
+  fputs (buf, file);
+}
+
+/*
+ * Comparisons.  Note that only equality is an operator.  The other
+ * comparisons cannot be operators, since they are inherently signed or
+ * unsigned and C++ has no such operators.
+ */
+
+/* Return true if THIS == OP1.  */
+
+bool
+wide_int::eq_p_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  int l0 = len - 1;
+  int l1 = op1len - 1;
+  HOST_WIDE_INT op0mask = sign_mask ();
+  HOST_WIDE_INT op1mask = SIGN_MASK (op1[op1len - 1]);
+
+  while (l0 > l1)
+    if (val[l0--] != op1mask)
+      return false;
+
+  while (l1 > l0)
+    if (op1[l1--] != op0mask)
+      return false;
+
+  while (l0 >= 0)
+    if (val[l0--] != op1[l1--])
+      return false;
+
+  return true;
+}
+
+/* Return true if OP0 < OP1 using signed comparisons.  */
+
+bool
+wide_int::lts_p_large (const HOST_WIDE_INT *op0, unsigned int op0len, 
+		       const HOST_WIDE_INT *op1, unsigned int op1len, 
+		       unsigned int precision)
+{
+  int l;
+  HOST_WIDE_INT s0, s1;
+  unsigned HOST_WIDE_INT u0, u1;
+  unsigned int blocks_needed = BLOCKS_NEEDED (precision);
+  HOST_WIDE_INT op0mask = SIGN_MASK (op0[op0len - 1]);
+  HOST_WIDE_INT op1mask = SIGN_MASK (op1[op1len - 1]);
+
+  /* Only the top block is compared as signed.  The rest are unsigned
+     comparisons.  */
+  s0 = blocks_needed == op0len ? op0[blocks_needed - 1] : op0mask;
+  s1 = blocks_needed == op1len ? op1[blocks_needed - 1] : op1mask;
+  if (s0 < s1)
+    return true;
+  if (s0 > s1)
+    return false;
+
+  l = (signed int)MAX (op0len, op1len);
+  /* We have already checked the top block, so skip down.  */
+  l = (l == (signed int)blocks_needed) ? l - 2 : l - 1;
+
+  while (l >= 0)
+    {
+      u0 = l < (signed int)op0len ? op0[l] : op0mask;
+      u1 = l < (signed int)op1len ? op1[l] : op1mask;
+
+      if (u0 < u1)
+	return true;
+      if (u0 > u1)
+	return false;
+      l--;
+    }
+
+  return false;
+}
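+
+/* For example (precision 128): { -1 } is less than { 1 } when signed;
+   the implicit top blocks (-1 and 0) already differ in the signed
+   comparison of S0 and S1.  */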
+
+/* Returns -1 if THIS < OP1, 0 if THIS == OP1 and 1 if THIS > OP1 using
+   signed compares.  */
+
+int
+wide_int::cmps_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  int l;
+  HOST_WIDE_INT s0, s1;
+  unsigned HOST_WIDE_INT u0, u1;
+  unsigned int blocks_needed = BLOCKS_NEEDED (precision);
+  HOST_WIDE_INT op0mask = sign_mask ();
+  HOST_WIDE_INT op1mask = SIGN_MASK (op1[op1len - 1]);
+
+  /* Only the top block is compared as signed.  The rest are unsigned
+     comparisons.  */
+  s0 = blocks_needed == (unsigned int)len ? val[blocks_needed - 1] : op0mask;
+  s1 = blocks_needed == op1len ? op1[blocks_needed - 1] : op1mask;
+  if (s0 < s1)
+    return -1;
+  if (s0 > s1)
+    return 1;
+
+  l = MAX (len, (signed int)op1len);
+  /* We have already checked the top block, so skip down.  */
+  l = (l == (signed int)blocks_needed) ? l - 2 : l - 1;
+
+  while (l >= 0)
+    {
+      u0 = l < len ? val [l] : op0mask;
+      u1 = l < (signed int)op1len ? op1[l] : op1mask;
+
+      if (u0 < u1)
+	return -1;
+      if (u0 > u1)
+	return 1;
+      l--;
+    }
+
+  return 0;
+}
+
+/* Return true if OP0 < OP1 using unsigned comparisons.  */
+
+bool
+wide_int::ltu_p_large (const HOST_WIDE_INT *op0, unsigned int op0len, 
+		       const HOST_WIDE_INT *op1, unsigned int op1len)
+{
+  unsigned HOST_WIDE_INT x0;
+  unsigned HOST_WIDE_INT x1;
+  int l0 = op0len - 1;
+  int l1 = op1len - 1;
+  HOST_WIDE_INT op0mask = SIGN_MASK (op0[op0len - 1]);
+  HOST_WIDE_INT op1mask = SIGN_MASK (op1[op1len - 1]);
+
+  while (l0 > l1)
+    {
+      x0 = op0[l0--];
+      x1 = op1mask;
+      if (x0 < x1)
+	return true;
+      if (x0 > x1)
+	return false;
+    }
+
+  while (l1 > l0)
+    {
+      x0 = op0mask;
+      x1 = op1[l1--];
+      if (x0 < x1)
+	return true;
+      if (x0 > x1)
+	return false;
+    }
+
+  while (l0 >= 0)
+    {
+      x0 = op0[l0--];
+      x1 = op1[l1--];
+      if (x0 < x1)
+	return true;
+      if (x0 > x1)
+	return false;
+    }
+
+  return false;
+}
+
+/* Returns -1 if THIS < OP1, 0 if THIS == OP1 and 1 if THIS > OP1 using
+   unsigned compares.  */
+
+int
+wide_int::cmpu_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  unsigned HOST_WIDE_INT x0;
+  unsigned HOST_WIDE_INT x1;
+  int l0 = len - 1;
+  int l1 = op1len - 1;
+  HOST_WIDE_INT op0mask = sign_mask ();
+  HOST_WIDE_INT op1mask = SIGN_MASK (op1[op1len - 1]);
+
+  while (l0 > l1)
+    {
+      x0 = val[l0--];
+      x1 = op1mask;
+      if (x0 < x1)
+	return -1;
+      else if (x0 > x1)
+	return 1;
+    }
+
+  while (l1 > l0)
+    {
+      x0 = op0mask;
+      x1 = op1[l1--];
+      if (x0 < x1)
+	return -1;
+      if (x0 > x1)
+	return 1;
+    }
+
+  while (l0 >= 0)
+    {
+      x0 = val[l0--];
+      x1 = op1[l1--];
+      if (x0 < x1)
+	return -1;
+      if (x0 > x1)
+	return 1;
+    }
+
+  return 0;
+}
+
+/* Return true if THIS has the sign bit set to 1 and all other bits are
+   zero.  */
+
+bool
+wide_int::only_sign_bit_p (unsigned int prec) const
+{
+  int i;
+  HOST_WIDE_INT x;
+  int small_prec;
+  bool result;
+
+  if (BLOCKS_NEEDED (prec) != len)
+    {
+      result = false;
+      goto ex;
+    }
+
+  for (i=0; i < len - 1; i++)
+    if (val[i] != 0)
+      {
+	result = false;
+	goto ex;
+      }
+
+  x = val[len - 1];
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec)
+    x = x << (HOST_BITS_PER_WIDE_INT - small_prec);
+
+  result = x == ((HOST_WIDE_INT)1) << (HOST_BITS_PER_WIDE_INT - 1);
+
+ ex:
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int:: %d = only_sign_bit_p (%s)\n", result, *this);
+#endif
+  return result;
+}
+
+/* Returns true if THIS fits into range of TYPE.  Signedness of THIS is
+   assumed to be the same as the signedness of TYPE.  */
+
+bool
+wide_int::fits_to_tree_p (const_tree type) const
+{
+  int type_prec = TYPE_PRECISION (type);
+
+  if (TYPE_UNSIGNED (type))
+    return fits_u_p (type_prec);
+  else
+    return fits_s_p (type_prec);
+}
+
+/* Returns true if THIS fits in the signed range of PREC.  */
+
+bool
+wide_int::fits_s_p (unsigned int prec) const
+{
+  if (len < BLOCKS_NEEDED (prec))
+    return true;
+
+  if (precision <= prec)
+    return true;
+
+  return *this == sext (prec);
+}
+
+
+/* Returns true if THIS fits in the unsigned range of PREC.  */
+
+bool
+wide_int::fits_u_p (unsigned int prec) const
+{
+  if (len < BLOCKS_NEEDED (prec))
+    return true;
+
+  if (precision <= prec)
+    return true;
+
+  return *this == zext (prec);
+}
+
+/*
+ * Extension.
+ */
+
+/* Sign extend THIS starting at OFFSET.  The precision of the result
+   is the same as THIS.  */
+
+wide_int
+wide_int::sext (unsigned int offset) const
+{
+  wide_int result;
+  int off;
+
+  gcc_assert (precision >= offset);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.precision = precision;
+      if (offset < precision)
+	result.val[0] = sext_hwi (val[0], offset);
+      else
+	/* If offset is greater or equal to precision there is nothing
+	   to do since the internal rep is already sign extended.  */
+	result.val[0] = val[0];
+
+      result.len = 1;
+    }
+  else if (precision == offset)
+    result = *this;
+  else
+    {
+      result = decompress (offset, precision);
+      
+      /* Now we can do the real sign extension.  */
+      off = offset & (HOST_BITS_PER_WIDE_INT - 1);
+      if (off)
+	{
+	  int block = BLOCK_OF (offset);
+	  result.val[block] = sext_hwi (result.val[block], off);
+	  result.len = block + 1;
+	}
+      /* We never need an extra element for sign extended values.  */
+    }    
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int:: %s = (%s sext %d)\n", result, *this, offset);
+#endif
+
+  return result;
+}
+
+/* Zero extend THIS starting at OFFSET.  The precision of the result
+   is the same as THIS.  */
+
+wide_int
+wide_int::zext (unsigned int offset) const
+{
+  wide_int result;
+  int off;
+  int block;
+
+  gcc_assert (precision >= offset);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.precision = precision;
+      if (offset < precision)
+	result.val[0] = zext_hwi (val[0], offset);
+      else if (offset == precision)
+	result.val[0] = val[0];
+	/* If OFFSET was greater than the precision, we need to zero
+	   extend from the old precision since the internal rep is
+	   sign extended.  */
+      else
+	result.val[0] = zext_hwi (val[0], precision);
+	
+      result.len = 1;
+    }
+  else if (precision == offset)
+    result = *this;
+  else
+    {
+      result = decompress (offset, precision);
+
+      /* Now we can do the real zero extension.  */
+      off = offset & (HOST_BITS_PER_WIDE_INT - 1);
+      block = BLOCK_OF (offset);
+      if (off)
+	{
+	  result.val[block] = zext_hwi (result.val[block], off);
+	  result.len = block + 1;
+	}
+      else
+	/* See if we need an extra zero element to satisfy the
+	   compression rule.  */
+	if (result.val[block - 1] < 0 && offset < precision)
+	  {
+	    result.val[block] = 0;
+	    result.len += 1;
+	  }
+    }
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int:: %s = (%s zext %d)\n", result, *this, offset);
+#endif
+  return result;
+}
+
+/*
+ * Masking, inserting, shifting, rotating.
+ */
+
+/* Return a value with a one bit inserted in THIS at BITPOS.  */
+
+wide_int
+wide_int::set_bit (unsigned int bitpos) const
+{
+  wide_int result;
+  int i, j;
+
+  if (bitpos >= precision)
+    result = copy ();
+  else
+    {
+      result = decompress (bitpos, precision);
+      j = bitpos / HOST_BITS_PER_WIDE_INT;
+      i = bitpos & (HOST_BITS_PER_WIDE_INT - 1);
+      result.val[j] |= ((HOST_WIDE_INT)1) << i;
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int:: %s = (%s set_bit %d)\n", result, *this, bitpos);
+#endif
+  return result;
+}
+
+/* Insert a 1 bit into 0 at BITPOS, producing a number with
+   precision PREC.  */
+
+wide_int
+wide_int::set_bit_in_zero (unsigned int bitpos, unsigned int prec)
+{
+  wide_int result;
+  int blocks_needed = BLOCKS_NEEDED (bitpos + 1);
+  int i, j;
+
+  result.precision = prec;
+  if (bitpos >= prec)
+    {
+      result.len = 1;
+      result.val[0] = 0;
+    }
+  else
+    {
+      result.len = blocks_needed;
+      for (i = 0; i < blocks_needed; i++)
+	result.val[i] = 0;
+      
+      j = bitpos / HOST_BITS_PER_WIDE_INT;
+      i = bitpos & (HOST_BITS_PER_WIDE_INT - 1);
+      result.val[j] |= ((HOST_WIDE_INT)1) << i;
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wv ("wide_int:: %s = set_bit_in_zero (%d)\n", result, bitpos);
+#endif
+
+  return result;
+}
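+
+/* For example, set_bit_in_zero (7, 32) yields 0x80 with len == 1.  */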
+
+/* Insert WIDTH bits from OP0 into THIS starting at START.  */
+
+wide_int
+wide_int::insert (const wide_int &op0, unsigned int start, 
+		  unsigned int width) const
+{
+  wide_int result;
+  wide_int mask;
+  wide_int tmp;
+
+  if (start + width >= precision) 
+    width = precision - start;
+
+  mask = shifted_mask (start, width, false, precision);
+  tmp = op0.lshift (start, 0, precision, NONE);
+  result = tmp & mask;
+
+  tmp = and_not (mask);
+  result = result | tmp;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwwvv ("wide_int:: %s = (%s insert %s start = %d width = %d)\n", 
+		 result, *this, op0, start, width);
+#endif
+
+  return result;
+}
+
+/* bswap THIS.  */
+
+wide_int
+wide_int::bswap () const
+{
+  wide_int result;
+  int i, s;
+  int end;
+  int len = BLOCKS_NEEDED (precision);
+
+  /* This is not a well defined operation if the precision is not a
+     multiple of 8.  */
+  gcc_assert ((precision & 0x7) == 0);
+
+  result.precision = precision;
+  result.len = len;
+
+  for (i = 0; i < len; i++)
+    result.val[i] = 0;
+
+  /* Only swap the bytes that are not the padding.  */
+  if ((precision & (HOST_BITS_PER_WIDE_INT - 1))
+      && (this->len == len))
+    end = precision;
+  else
+    end = this->len * HOST_BITS_PER_WIDE_INT;
+
+  for (s = 0; s < end; s += 8)
+    {
+      unsigned int d = precision - s - 8;
+      unsigned HOST_WIDE_INT byte;
+
+      int block = s / HOST_BITS_PER_WIDE_INT;
+      int offset = s & (HOST_BITS_PER_WIDE_INT - 1);
+
+      byte = (val[block] >> offset) & 0xff;
+
+      block = d / HOST_BITS_PER_WIDE_INT;
+      offset = d & (HOST_BITS_PER_WIDE_INT - 1);
+
+      result.val[block] |= byte << offset;
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_ww ("wide_int:: %s = bswap (%s)\n", result, *this);
+#endif
+  return result;
+}
+
+/* Return a mask where the lower WIDTH bits are ones and the bits
+   above that, up to the precision, are zeros.  The result is inverted
+   if NEGATE is true.  The result has precision PREC.  */
+
+wide_int
+wide_int::mask (unsigned int width, bool negate, unsigned int prec)
+{
+  wide_int result;
+  unsigned int i = 0;
+  int shift;
+
+  gcc_assert (width < 2 * MAX_BITSIZE_MODE_ANY_INT);
+  gcc_assert (prec <= 2 * MAX_BITSIZE_MODE_ANY_INT);
+
+  if (width == 0)
+    {
+      if (negate)
+	result = wide_int::minus_one (prec);
+      else
+	result = wide_int::zero (prec);
+    }
+  else
+    {
+      result.precision = prec;
+      
+      while (i < width / HOST_BITS_PER_WIDE_INT)
+	result.val[i++] = negate ? 0 : (HOST_WIDE_INT)-1;
+      
+      shift = width & (HOST_BITS_PER_WIDE_INT - 1);
+      if (shift != 0)
+	{
+	  HOST_WIDE_INT last = (((HOST_WIDE_INT)1) << shift) - 1;
+	  result.val[i++] = negate ? ~last : last;
+	}
+      result.len = i;
+    }
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wvv ("wide_int:: %s = mask (%d, negate = %d)\n", result, width, negate);
+#endif
+  return result;
+}
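+
+/* For example, mask (4, false, 32) yields 0xf with len == 1, and
+   mask (4, true, 32) yields ~0xf, i.e. 0xfffffff0 within the 32-bit
+   precision.  */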
+
+/* Return a mask of WIDTH ones starting at START, with the bits above
+   that, up to the precision, zeros.  The result is inverted if NEGATE
+   is true.  */
+wide_int
+wide_int::shifted_mask (unsigned int start, unsigned int width, 
+			bool negate, unsigned int prec)
+{
+  wide_int result;
+  unsigned int i = 0;
+  unsigned int shift;
+  unsigned int end;
+  HOST_WIDE_INT block;
+
+  if (start + width > prec)
+    width = prec - start;
+  /* Compute END only after WIDTH has been clamped to the precision.  */
+  end = start + width;
+
+  if (width == 0)
+    {
+      if (negate)
+	result = wide_int::minus_one (prec);
+      else
+	result = wide_int::zero (prec);
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wvvv 
+	  ("wide_int:: %s = shifted_mask (start = %d width = %d negate = %d)\n", 
+	   result, start, width, negate);
+#endif
+      return result;
+    }
+
+  result.precision = prec;
+
+  while (i < start / HOST_BITS_PER_WIDE_INT)
+    result.val[i++] = negate ? (HOST_WIDE_INT)-1 : 0;
+
+  shift = start & (HOST_BITS_PER_WIDE_INT - 1);
+  if (shift)
+    {
+      block = (((HOST_WIDE_INT)1) << shift) - 1;
+      shift = (end) & (HOST_BITS_PER_WIDE_INT - 1);
+      if (shift)
+	{
+	  /* case 000111000 */
+	  block = (((HOST_WIDE_INT)1) << shift) - block - 1;
+	  result.val[i++] = negate ? ~block : block;
+	  result.len = i;
+
+#ifdef DEBUG_WIDE_INT
+	  if (dump_file)
+	    debug_wvvv 
+	      ("wide_int:: %s = shifted_mask (start = %d width = %d negate = %d)\n", 
+	       result, start, width, negate);
+#endif
+	  return result;
+	}
+      else
+	/* ...111000 */
+	result.val[i++] = negate ? block : ~block;
+    }
+
+  while (i < end / HOST_BITS_PER_WIDE_INT)
+    /* 1111111 */
+    result.val[i++] = negate ? 0 : (HOST_WIDE_INT)-1;
+
+  shift = end & (HOST_BITS_PER_WIDE_INT - 1);
+  if (shift != 0)
+    {
+      /* 000011111 */
+      block = (((HOST_WIDE_INT)1) << shift) - 1;
+      result.val[i++] = negate ? ~block : block;
+    }
+
+  result.len = i;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wvvv 
+      ("wide_int:: %s = shifted_mask (start = %d width = %d negate = %d)\n", 
+       result, start, width, negate);
+#endif
+
+  return result;
+}
+
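+/* For example, with PREC == 32:
+
+     shifted_mask (4, 8, false, 32)  ==>  0x00000ff0
+     shifted_mask (4, 8, true, 32)   ==>  0xfffff00f
+
+   A WIDTH that runs past PREC is clipped at the precision.  */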
+
+/*
+ * logical operations.
+ */
+
+/* Return THIS & OP1.  */
+
+wide_int
+wide_int::and_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1len - 1;
+  bool need_canon = true;
+
+  result.len = MAX (len, op1len);
+  result.precision = precision;
+
+  if (l0 > l1)
+    {
+      if (SIGN_MASK (op1[op1len - 1]) == 0)
+	{
+	  l0 = l1;
+	  result.len = l1 + 1;
+	}
+      else
+	{
+	  need_canon = false;
+	  while (l0 > l1)
+	    {
+	      result.val[l0] = val[l0];
+	      l0--;
+	    }
+	}
+    }
+  else if (l1 > l0)
+    {
+      if (sign_mask () == 0)
+	result.len = l0 + 1;
+      else
+	{
+	  need_canon = false;
+	  while (l1 > l0)
+	    {
+	      result.val[l1] = op1[l1];
+	      l1--;
+	    }
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] & op1[l0];
+      l0--;
+    }
+
+  if (need_canon)
+    result.canonize ();
+  return result;
+}
+
+/* Return THIS & ~OP1.  */
+
+wide_int
+wide_int::and_not_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1len - 1;
+  bool need_canon = true;
+
+  result.len = MAX (len, op1len);
+  result.precision = precision;
+
+  if (l0 > l1)
+    {
+      if (SIGN_MASK (op1[op1len - 1]) != 0)
+	{
+	  l0 = l1;
+	  result.len = l1 + 1;
+	}
+      else
+	{
+	  need_canon = false;
+	  while (l0 > l1)
+	    {
+	      result.val[l0] = val[l0];
+	      l0--;
+	    }
+	}
+    }
+  else if (l1 > l0)
+    {
+      if (sign_mask () == 0)
+	result.len = l0 + 1;
+      else
+	{
+	  need_canon = false;
+	  while (l1 > l0)
+	    {
+	      result.val[l1] = ~op1[l1];
+	      l1--;
+	    }
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] & ~op1[l0];
+      l0--;
+    }
+
+  if (need_canon)
+    result.canonize ();
+
+  return result;
+}
+
+/* Return THIS | OP1.  */
+
+wide_int
+wide_int::or_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1len - 1;
+  bool need_canon = true;
+
+  result.len = MAX (len, op1len);
+  result.precision = precision;
+
+  if (l0 > l1)
+    {
+      if (SIGN_MASK (op1[op1len - 1]) != 0)
+	{
+	  l0 = l1;
+	  result.len = l1 + 1;
+	}
+      else
+	{
+	  need_canon = false;
+	  while (l0 > l1)
+	    {
+	      result.val[l0] = val[l0];
+	      l0--;
+	    }
+	}
+    }
+  else if (l1 > l0)
+    {
+      if (sign_mask () != 0)
+	result.len = l0 + 1;
+      else
+	{
+	  need_canon = false;
+	  while (l1 > l0)
+	    {
+	      result.val[l1] = op1[l1];
+	      l1--;
+	    }
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] | op1[l0];
+      l0--;
+    }
+
+  if (need_canon)
+    result.canonize ();
+
+  return result;
+}
+
+/* Return THIS | ~OP1.  */
+
+wide_int
+wide_int::or_not_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1len - 1;
+  bool need_canon = true;
+
+  result.len = MAX (len, op1len);
+  result.precision = precision;
+
+  if (l0 > l1)
+    {
+      if (SIGN_MASK (op1[op1len - 1]) == 0)
+	{
+	  l0 = l1;
+	  result.len = l1 + 1;
+	}
+      else
+	{
+	  need_canon = false;
+	  while (l0 > l1)
+	    {
+	      result.val[l0] = val[l0];
+	      l0--;
+	    }
+	}
+    }
+  else if (l1 > l0)
+    {
+      if (sign_mask () != 0)
+	result.len = l0 + 1;
+      else
+	{
+	  need_canon = false;
+	  while (l1 > l0)
+	    {
+	      result.val[l1] = ~op1[l1];
+	      l1--;
+	    }
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] | ~op1[l0];
+      l0--;
+    }
+
+  if (need_canon)
+    result.canonize ();
+  return result;
+}
+
+/* Return the exclusive ior (xor) of THIS and OP1.  */
+
+wide_int
+wide_int::xor_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1len - 1;
+
+  result.len = MAX (len, op1len);
+  result.precision = precision;
+
+  while (l0 > l1)
+    {
+      result.val[l0] = val[l0] ^ SIGN_MASK (op1[op1len - 1]);
+      l0--;
+    }
+
+  while (l1 > l0)
+    {
+      result.val[l1] = sign_mask () ^ op1[l1];
+      l1--;
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] ^ op1[l0];
+      l0--;
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s ^ %s)\n", result, *this, op1, op1len);
+#endif
+  return result;
+}
+
+/*
+ * math
+ */
+
+/* Absolute value of THIS.  */
+
+wide_int
+wide_int::abs () const
+{
+  if (sign_mask ())
+    return neg ();
+
+  wide_int result = copy ();
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_ww ("wide_int:: %s = abs (%s)\n", result, *this);
+#endif
+  return result;
+}
+
+/* Return THIS + OP1.  No overflow is detected.  */
+
+wide_int
+wide_int::add_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1;
+  unsigned HOST_WIDE_INT x = 0;
+  unsigned HOST_WIDE_INT carry = 0;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  unsigned int i, small_prec;
+
+  result.precision = precision;
+  result.len = MAX (len, op1len);
+  mask0 = sign_mask ();
+  mask1 = SIGN_MASK (op1[op1len - 1]);
+  /* Add all of the explicitly defined elements.  */
+  for (i = 0; i < result.len; i++)
+    {
+      o0 = i < len ? (unsigned HOST_WIDE_INT)val[i] : mask0;
+      o1 = i < op1len ? (unsigned HOST_WIDE_INT)op1[i] : mask1;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      carry = carry ? x <= o0 : x < o0;
+    }
+
+  if (result.len * HOST_BITS_PER_WIDE_INT < precision)
+    {
+      result.val[result.len] = mask0 + mask1 + carry;
+      result.len++;
+    }
+
+  small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec != 0 && BLOCKS_NEEDED (precision) == result.len)
+    {
+      /* Modes with weird precisions.  */
+      i = result.len - 1;
+      result.val[i] = sext_hwi (result.val[i], small_prec);
+    }
+
+  result.canonize ();
+  return result;
+}
+
+/* Add THIS and OP1 with overflow checking.  If the result overflows
+   within the precision, set OVERFLOW.  OVERFLOW is assumed to be
+   sticky so it should be initialized.  SGN controls if signed or
+   unsigned overflow is checked.  */
+
+wide_int
+wide_int::add (const wide_int &op1, SignOp sgn, bool *overflow) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0 = 0;
+  unsigned HOST_WIDE_INT o1 = 0;
+  unsigned HOST_WIDE_INT x = 0;
+  unsigned HOST_WIDE_INT carry = 0;
+  unsigned HOST_WIDE_INT old_carry = 0;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  int i, small_prec;
+
+  gcc_checking_assert (precision == op1.precision);
+
+  result.precision = precision;
+  result.len = MAX (len, op1.len);
+  mask0 = sign_mask ();
+  mask1 = op1.sign_mask ();
+
+  /* Add the elements that are explicitly defined in both operands.  */
+  for (i = 0; i < MIN (len, op1.len); i++)
+    {
+      o0 = val[i];
+      o1 = op1.val[i];
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      old_carry = carry;
+      carry = carry ? x <= o0 : x < o0;
+    }
+
+  /* Uncompress the rest.  */
+  for (i = len; i < op1.len; i++)
+    {
+      o0 = mask0;
+      o1 = op1.val[i];
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      old_carry = carry;
+      carry = carry ? x <= o0 : x < o0;
+    }
+  for (i = op1.len; i < len; i++)
+    {
+      o0 = val[i];
+      o1 = mask1;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      old_carry = carry;
+      carry = carry ? x <= o0 : x < o0;
+    }
+
+  if (result.len * HOST_BITS_PER_WIDE_INT < precision)
+    {
+      result.val[result.len] = mask0 + mask1 + carry;
+      result.len++;
+      /* If we were short, we could not have overflowed.  */
+      goto ex;
+    }
+
+  small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec == 0)
+    {
+      if (sgn == wide_int::SIGNED)
+	{
+	  if (((~(o0 ^ o1)) & (x ^ o0)) >> (HOST_BITS_PER_WIDE_INT - 1))
+	    *overflow = true;
+	}
+      else if (old_carry)
+	{
+	  if ((~o0) <= o1)
+	    *overflow = true;
+	}
+      else
+	{
+	  if ((~o0) < o1)
+	    *overflow = true;
+	}
+    }
+  else
+    {
+      if (sgn == wide_int::UNSIGNED)
+	{
+	  /* The caveat for unsigned is to get rid of the bits above
+	     the precision before doing the addition.  To check the
+	     overflow, clear these bits and then redo the last
+	     addition.  If there are any non zero bits above the prec,
+	     we overflowed. */
+	  o0 = zext_hwi (o0, small_prec);
+	  o1 = zext_hwi (o1, small_prec);
+	  x = o0 + o1 + old_carry;
+	  if (x >> small_prec)
+	    *overflow = true;
+	}
+      else 
+	{
+	  /* Overflow in this case is easy since we can see bits beyond
+	     the precision.  If the value computed is not the sign
+	     extended value, then we have overflow.  */
+	  unsigned HOST_WIDE_INT y = sext_hwi (x, small_prec);
+	  if (x != y)
+	    *overflow = true;
+	}
+    }
+
+ ex:
+  result.canonize ();
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wvww ("wide_int:: %s %d = (%s +O %s)\n", 
+	       result, *overflow, *this, op1);
+#endif
+
+  return result;
+}
+
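+/* As a concrete illustration of the overflow checks, with a precision
+   of 8: 0x7f + 0x01 overflows when SGN is SIGNED (127 + 1 = 128 is
+   not representable) but not when SGN is UNSIGNED, while 0xff + 0x01
+   overflows only when SGN is UNSIGNED.  */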
+
+/* Count the leading zeros of THIS, looking only at the bits within
+   the precision.  */
+
+wide_int
+wide_int::clz () const
+{
+  int i;
+  int start;
+  int count;
+  HOST_WIDE_INT v;
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+
+  if (zero_p ())
+    {
+      enum machine_mode mode = mode_for_size (precision, MODE_INT, 0);
+      if (mode == BLKmode)
+	mode = mode_for_size (precision, MODE_PARTIAL_INT, 0);
+
+      /* Even if the value at zero is undefined, we have to come up
+	 with some replacement.  Seems good enough.  */
+      if (mode == BLKmode)
+	count = precision;
+      else if (!CLZ_DEFINED_VALUE_AT_ZERO (mode, count))
+	count = precision;
+    }
+  else
+    {
+      /* The high order block is special if it is the last block and the
+	 precision is not an even multiple of HOST_BITS_PER_WIDE_INT.  We
+	 have to clear out any ones above the precision before doing clz
+	 on this block.  */
+      if (BLOCKS_NEEDED (precision) == len && small_prec)
+	{
+	  v = zext_hwi (val[len - 1], small_prec);
+	  count = clz_hwi (v) - (HOST_BITS_PER_WIDE_INT - small_prec);
+	  start = len - 2;
+	  if (v != 0)
+	    {
+#ifdef DEBUG_WIDE_INT
+	      if (dump_file)
+		debug_vw ("wide_int:: %d = clz (%s)\n", count, *this);
+#endif
+	      return from_shwi (count, precision);
+	    }
+	}
+      else
+	{
+	  count = HOST_BITS_PER_WIDE_INT * (BLOCKS_NEEDED (precision) - len);
+	  start = len - 1;
+	}
+      
+      for (i = start; i >= 0; i--)
+	{
+	  v = elt (i);
+	  count += clz_hwi (v);
+	  if (v != 0)
+	    break;
+	}
+      
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int:: %d = clz (%s)\n", count, *this);
+#endif
+  return from_shwi (count, precision);
+}
+
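+/* For example, with a precision of 8, the clz of 0x10 (binary
+   00010000) is 3.  For a zero value, the count falls back to the
+   precision unless CLZ_DEFINED_VALUE_AT_ZERO supplies a value.  */
+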
+/* Count the number of redundant leading sign bits of THIS.  */
+
+wide_int
+wide_int::clrsb () const
+{
+  if (neg_p ())
+    return operator ~ ().clz () - 1;
+
+  return clz () - 1;
+}
+
+/* Count trailing zeros of THIS.   */
+
+wide_int
+wide_int::ctz () const
+{
+  int i;
+  unsigned int count = 0;
+  HOST_WIDE_INT v;
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  int end;
+  bool more_to_do;
+
+  if (zero_p ())
+    {
+      enum machine_mode mode = mode_for_size (precision, MODE_INT, 0);
+      if (mode == BLKmode)
+	mode = mode_for_size (precision, MODE_PARTIAL_INT, 0);
+
+      /* Even if the value at zero is undefined, we have to come up
+	 with some replacement.  Seems good enough.  */
+      if (mode == BLKmode)
+	count = precision;
+      else if (!CTZ_DEFINED_VALUE_AT_ZERO (mode, count))
+	count = precision;
+    }
+  else
+    {
+      /* The high order block is special if it is the last block and the
+	 precision is not an even multiple of HOST_BITS_PER_WIDE_INT.  We
+	 have to clear out any ones above the precision before counting
+	 trailing zeros on this block.  */
+      if (BLOCKS_NEEDED (precision) == len && small_prec)
+	{
+	  end = len - 1;
+	  more_to_do = true;
+	}
+      else
+	{
+	  end = len;
+	  more_to_do = false;
+	}
+      
+      for (i = 0; i < end; i++)
+	{
+	  v = val[i];
+	  count += ctz_hwi (v);
+	  if (v != 0)
+	    {
+#ifdef DEBUG_WIDE_INT
+	      if (dump_file)
+		debug_vw ("wide_int:: %d = ctz (%s)\n", count, *this);
+#endif
+	      return wide_int::from_shwi (count, precision);
+	    }
+	}
+      
+      if (more_to_do)
+	{
+	  v = zext_hwi (val[len - 1], small_prec);
+	  count += ctz_hwi (v);
+	  /* If the top block is also all zeros, the count now includes
+	     zeros above the interesting part, so cut it back to
+	     prec.  */
+	  if (count > precision)
+	    count = precision;
+	}
+      else
+	/* Skip over the blocks that are not represented.  They must be
+	   all zeros at this point.  */
+	count = precision;
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int:: %d = ctz (%s)\n", count, *this);
+#endif
+  return wide_int::from_shwi (count, precision);
+}
+
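+/* For example, with a precision of 8, the ctz of 0x10 (binary
+   00010000) is 4.  As with clz, the count for a zero value falls back
+   to the precision unless CTZ_DEFINED_VALUE_AT_ZERO supplies a
+   value.  */
+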
+/* ffs of THIS.  */
+
+wide_int
+wide_int::ffs () const
+{
+  HOST_WIDE_INT count = ctz ().to_shwi ();
+  if (count == precision)
+    count = 0;
+  else
+    count += 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int:: %d = ffs (%s)\n", count, *this);
+#endif
+  return wide_int::from_shwi (count, precision);
+}
+
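+/* Thus ffs (0) is 0 and, with a precision of 8, ffs (0x10) is 5: the
+   index of the least significant set bit, counting from 1.  */
+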
+/* Subroutines of the multiplication and division operations.  Unpack
+   the first IN_LEN HOST_WIDE_INTs in INPUT into 2 * IN_LEN
+   HOST_HALF_WIDE_INTs of RESULT.  The rest of RESULT is filled by
+   uncompressing the top bit of INPUT[IN_LEN - 1].  */
+
+static void
+wi_unpack (unsigned HOST_HALF_WIDE_INT *result, 
+	   const unsigned HOST_WIDE_INT *input,
+	   int in_len, int out_len)
+{
+  int i;
+  int j = 0;
+  HOST_WIDE_INT mask;
+
+  for (i = 0; i < in_len; i++)
+    {
+      result[j++] = input[i];
+      result[j++] = input[i] >> HOST_BITS_PER_HALF_WIDE_INT;
+    }
+  mask = SIGN_MASK (input[in_len - 1]);
+  mask &= HALF_INT_MASK;
+
+  /* Smear the sign bit.  */
+  while (j < out_len)
+    result[j++] = mask;
+}
+
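+/* As a sketch of the unpacking, on a host with 64 bit HOST_WIDE_INTs
+   the single input block 0x0123456789abcdef becomes the two half
+   blocks 0x89abcdef (low) and 0x01234567 (high); any remaining output
+   blocks are filled with the smeared sign mask.  */
+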
+/* The inverse of wi_unpack.  IN_LEN is the number of input
+   blocks.  The number of output blocks will be half this amount.  */
+
+static void
+wi_pack (unsigned HOST_WIDE_INT *result, 
+	 const unsigned HOST_HALF_WIDE_INT *input, 
+	 int in_len)
+{
+  int i = 0;
+  int j = 0;
+
+  while (i < in_len - 2)
+    {
+      result[j++] = (unsigned HOST_WIDE_INT)input[i] 
+	| ((unsigned HOST_WIDE_INT)input[i + 1] << HOST_BITS_PER_HALF_WIDE_INT);
+      i += 2;
+    }
+
+  /* Handle the case where in_len is odd.  For this we zero extend.  */
+  if (in_len & 1)
+    result[j++] = (unsigned HOST_WIDE_INT)input[i];
+  else
+    result[j++] = (unsigned HOST_WIDE_INT)input[i] 
+      | ((unsigned HOST_WIDE_INT)input[i + 1] << HOST_BITS_PER_HALF_WIDE_INT);
+}
+
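+/* wi_pack is the round trip of the above: on a 64 bit host, packing
+   the half blocks 0x89abcdef (low) and 0x01234567 (high) reconstitutes
+   the block 0x0123456789abcdef.  */
+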
+/* Return an integer that is the exact log2 of THIS.  */
+
+wide_int
+wide_int::exact_log2 () const
+{
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  wide_int count;
+  wide_int result;
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      HOST_WIDE_INT v;
+      if (small_prec)
+	v = zext_hwi (val[0], small_prec);
+      else
+	v = val[0];
+      result = wide_int::from_shwi (::exact_log2 (v), precision);
+    }
+  else
+    {
+      count = ctz ();
+      if (clz () + count + 1 == precision)
+	result = count;
+      else
+	result = wide_int::from_shwi (-1, precision);
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_ww ("wide_int:: %s = exact_log2 (%s)\n", result, *this);
+#endif
+  return result;
+}
+
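+/* For example, the exact_log2 of 8 is 3, while the exact_log2 of 10
+   is -1 because 10 is not a power of 2.  */
+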
+/* Return an integer that is the floor log2 of THIS.  */
+
+wide_int
+wide_int::floor_log2 () const
+{
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  wide_int result;
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      HOST_WIDE_INT v;
+      if (small_prec)
+	v = zext_hwi (val[0], small_prec);
+      else
+	v = val[0];
+      result = wide_int::from_shwi (::floor_log2 (v), precision);
+    }
+  else
+    result = wide_int::from_shwi (precision, precision) - 1 - clz ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_ww ("wide_int:: %s = floor_log2 (%s)\n", result, *this);
+#endif
+  return result;
+}
+
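+/* For example, the floor_log2 of 10 is 3, since 2^3 <= 10 < 2^4.  */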
+
+/* Multiply Op1 by Op2.  If HIGH is set, only the upper half of the
+   result is returned.  If FULL is set, the entire result is returned
+   in a mode that is twice the width of the inputs.  However, that
+   mode needs to exist if the value is to be usable.  Clients that use
+   FULL need to check for this.
+
+   If neither HIGH nor FULL is set, throw away the upper half after
+   the check is made to see if it overflows.  Unfortunately there is
+   no better way to check for overflow than to do this.  OVERFLOW is
+   assumed to be sticky so it should be initialized.  SGN controls the
+   signedness and is used to check overflow or if HIGH or FULL is
+   set.  */
+
+wide_int
+wide_int::mul_internal (bool high, bool full, 
+			const wide_int *op1, 
+			const HOST_WIDE_INT *op2, unsigned int op2len,
+			wide_int::SignOp sgn,  bool *overflow, 
+			bool needs_overflow)
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1, k, t;
+  unsigned int i;
+  unsigned int j;
+  unsigned int prec = op1->get_precision ();
+  unsigned int blocks_needed = BLOCKS_NEEDED (prec);
+  unsigned int half_blocks_needed = blocks_needed * 2;
+  /* The sizes here are scaled to support a 2x largest mode by 2x
+     largest mode yielding a 4x largest mode result.  This is what is
+     needed by tree-vrp.  */
+
+  unsigned HOST_HALF_WIDE_INT 
+    u[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
+  unsigned HOST_HALF_WIDE_INT 
+    v[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
+  /* The '2' in 'R' is because we are internally doing a full
+     multiply.  */
+  unsigned HOST_HALF_WIDE_INT 
+    r[2 * 2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
+  HOST_WIDE_INT mask = ((HOST_WIDE_INT)1 << HOST_BITS_PER_HALF_WIDE_INT) - 1;
+
+  /* If the top level routine did not really pass in an overflow, then
+     just make sure that we never attempt to set it.  */
+  if (overflow == 0)
+    needs_overflow = false;
+  result.precision = op1->precision;
+
+  if (high || full || needs_overflow)
+    {
+      /* If we need to check for overflow, we can only do half wide
+	 multiplies quickly because we need to look at the top bits to
+	 check for the overflow.  */
+      if (prec <= HOST_BITS_PER_HALF_WIDE_INT)
+	{
+	  HOST_WIDE_INT t, r;
+	  result.len = 1;
+	  o0 = op1->elt (0);
+	  o1 = op2[0];
+	  r = o0 * o1;
+	  /* Signed shift down will leave 0 or -1 if there was no
+	     overflow for signed or 0 for unsigned.  */
+	  t = SIGN_MASK (r);
+	  if (needs_overflow)
+	    {
+	      if (sgn == wide_int::SIGNED)
+		{
+		  if (t != (HOST_WIDE_INT)-1 && t != 0)
+		    *overflow = true;
+		}
+	      else
+		{
+		  if (t != 0)
+		    *overflow = true;
+		}
+	    }
+	  if (full)
+	    {
+	      result.val[0] = sext_hwi (r, prec * 2);
+	      result.precision = op1->precision * 2;
+	    }
+	  else if (high)
+	    result.val[0] = r >> prec;
+	  else
+	    result.val[0] = sext_hwi (r, prec);
+#ifdef DEBUG_WIDE_INT
+	  if (dump_file)
+	    debug_wvwa ("wide_int:: %s %d = (%s *O %s)\n", 
+			result, *overflow, *op1, op2, op2len);
+#endif
+	  return result;
+	}
+    }
+
+  wi_unpack (u, (const unsigned HOST_WIDE_INT*)op1->val, op1->len,
+	     half_blocks_needed);
+  wi_unpack (v, (const unsigned HOST_WIDE_INT*)op2, op2len,
+	     half_blocks_needed);
+
+  /* The 2 is for a full mult.  */
+  memset (r, 0, half_blocks_needed * 2 
+	  * HOST_BITS_PER_HALF_WIDE_INT / BITS_PER_UNIT);
+
+  for (j = 0; j < half_blocks_needed; j++)
+    {
+      k = 0;
+      for (i = 0; i < half_blocks_needed; i++)
+	{
+	  t = ((unsigned HOST_WIDE_INT)u[i] * (unsigned HOST_WIDE_INT)v[j]
+	       + r[i + j] + k);
+	  r[i + j] = t & HALF_INT_MASK;
+	  k = t >> HOST_BITS_PER_HALF_WIDE_INT;
+	}
+      r[j + half_blocks_needed] = k;
+    }
+
+  /* We did unsigned math above.  For signed we must adjust the
+     product (assuming we need to see that).  */
+  if (sgn == wide_int::SIGNED && (full || high || needs_overflow))
+    {
+      unsigned HOST_WIDE_INT b;
+      if ((*op1).neg_p ())
+	{
+	  b = 0;
+	  for (i = 0; i < half_blocks_needed; i++)
+	    {
+	      t = (unsigned HOST_WIDE_INT)r[i + half_blocks_needed]
+		- (unsigned HOST_WIDE_INT)v[i] - b;
+	      r[i + half_blocks_needed] = t & HALF_INT_MASK;
+	      b = t >> (HOST_BITS_PER_WIDE_INT - 1);
+	    }
+	}
+      if (op2[op2len-1] < 0)
+	{
+	  b = 0;
+	  for (i = 0; i < half_blocks_needed; i++)
+	    {
+	      t = (unsigned HOST_WIDE_INT)r[i + half_blocks_needed]
+		- (unsigned HOST_WIDE_INT)u[i] - b;
+	      r[i + half_blocks_needed] = t & HALF_INT_MASK;
+	      b = t >> (HOST_BITS_PER_WIDE_INT - 1);
+	    }
+	}
+    }
+
+  if (needs_overflow)
+    {
+      HOST_WIDE_INT top;
+
+      /* For unsigned, overflow is true if any of the top bits are set.
+	 For signed, overflow is true if any of the top bits are not equal
+	 to the sign bit.  */
+      if (sgn == wide_int::UNSIGNED)
+	top = 0;
+      else
+	{
+	  top = r[(half_blocks_needed) - 1];
+	  top = SIGN_MASK (top << (HOST_BITS_PER_WIDE_INT / 2));
+	  top &= mask;
+	}
+      
+      for (i = half_blocks_needed; i < half_blocks_needed * 2; i++)
+	if (((HOST_WIDE_INT)(r[i] & mask)) != top)
+	  *overflow = true; 
+    }
+
+  if (full)
+    {
+      /* compute [2prec] <- [prec] * [prec] */
+      wi_pack ((unsigned HOST_WIDE_INT*)result.val, r, 2 * half_blocks_needed);
+      result.len = blocks_needed * 2;
+      result.precision = op1->precision * 2;
+    }
+  else if (high)
+    {
+      /* compute [prec] <- ([prec] * [prec]) >> [prec] */
+      wi_pack ((unsigned HOST_WIDE_INT*)&result.val [blocks_needed >> 1],
+	       r, half_blocks_needed);
+      result.len = blocks_needed;
+    }
+  else
+    {
+      /* compute [prec] <- ([prec] * [prec]) & ((1 << [prec]) - 1) */
+      wi_pack ((unsigned HOST_WIDE_INT*)result.val, r, half_blocks_needed);
+      result.len = blocks_needed;
+    }
+      
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wvwa ("wide_int:: %s %d = (%s *O %s)\n", 
+		result, *overflow, *op1, op2, op2len);
+#endif
+  return result;
+}
+
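+/* To illustrate the three result shapes, with a precision of 8 and
+   UNSIGNED operands 0xff and 0xff the full 16 bit product is 0xfe01,
+   so:
+
+     neither HIGH nor FULL:  0x01   (low half; overflow is recorded)
+     HIGH:                   0xfe   (upper half)
+     FULL:                   0xfe01 (double-width result)  */
+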
+/* Compute the parity of THIS.  */
+
+wide_int
+wide_int::parity () const
+{
+  wide_int count = popcount ();
+  return count & 1;
+}
+
+/* Compute the population count of THIS.  */
+
+wide_int
+wide_int::popcount () const
+{
+  int i;
+  int start;
+  int count;
+  HOST_WIDE_INT v;
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+
+  /* The high order block is special if it is the last block and the
+     precision is not an even multiple of HOST_BITS_PER_WIDE_INT.  We
+     have to clear out any ones above the precision before doing clz
+     on this block.  */
+  if (BLOCKS_NEEDED (precision) == len && small_prec)
+    {
+      v = zext_hwi (val[len - 1], small_prec);
+      count = popcount_hwi (v);
+      start = len - 2;
+    }
+  else
+    {
+      if (sign_mask ())
+	count = HOST_BITS_PER_WIDE_INT * (BLOCKS_NEEDED (precision) - len);
+      else
+	count = 0;
+      start = len - 1;
+    }
+
+  for (i = start; i >= 0; i--)
+    {
+      v = val[i];
+      count += popcount_hwi (v);
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int:: %d = popcount (%s)\n", count, *this);
+#endif
+  return wide_int::from_shwi (count, precision);
+}
+
+/* Return THIS - OP1.  No overflow is detected.  */
+
+wide_int
+wide_int::sub_large (const HOST_WIDE_INT *op1, unsigned int op1len) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1;
+  unsigned HOST_WIDE_INT x = 0;
+  /* Subtract element by element, propagating the borrow from each
+     element into the next.  */
+  unsigned HOST_WIDE_INT borrow = 0;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  unsigned int i, small_prec;
+
+  result.precision = precision;
+  result.len = MAX (len, op1len);
+  mask0 = sign_mask ();
+  mask1 = SIGN_MASK (op1[op1len - 1]);
+
+  /* Subtract all of the explicitly defined elements.  */
+  for (i = 0; i < result.len; i++)
+    {
+      o0 = i < len ? (unsigned HOST_WIDE_INT)val[i] : mask0;
+      o1 = i < op1len ? (unsigned HOST_WIDE_INT)op1[i] : mask1;
+      x = o0 - o1 - borrow;
+      result.val[i] = x;
+      borrow = borrow ? o0 <= o1 : o0 < o1;
+    }
+
+  if (result.len * HOST_BITS_PER_WIDE_INT < precision)
+    {
+      result.val[result.len] = mask0 - mask1 - borrow;
+      result.len++;
+    }
+
+  small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec != 0 && BLOCKS_NEEDED (precision) == result.len)
+    {
+      /* Modes with weird precisions.  */
+      i = result.len - 1;
+      result.val[i] = sext_hwi (result.val[i], small_prec);
+    }
+
+  result.canonize ();
+  return result;
+}
+
+/* Subtract OP1 from THIS with overflow checking.  If the result
+   overflows within the precision, set OVERFLOW.  OVERFLOW is assumed
+   to be sticky so it should be initialized.  SGN controls if signed or
+   unsigned overflow is checked.  */
+
+wide_int
+wide_int::sub (const wide_int &op1, SignOp sgn, bool *overflow) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0 = 0;
+  unsigned HOST_WIDE_INT o1 = 0;
+  unsigned HOST_WIDE_INT x = 0;
+  unsigned HOST_WIDE_INT borrow = 0;
+  unsigned HOST_WIDE_INT old_borrow = 0;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  int i, small_prec;
+
+  gcc_checking_assert (precision == op1.precision);
+
+  result.precision = precision;
+  result.len = MAX (len, op1.len);
+  mask0 = sign_mask ();
+  mask1 = op1.sign_mask ();
+
+  /* Subtract the elements that are explicitly defined in both
+     operands.  */
+  for (i = 0; i < MIN (len, op1.len); i++)
+    {
+      o0 = val[i];
+      o1 = op1.val[i];
+      x = o0 - o1 - borrow;
+      result.val[i] = x;
+      old_borrow = borrow;
+      borrow = borrow ? o0 <= o1 : o0 < o1;
+    }
+
+  /* Uncompress the rest.  */
+  for (i = len; i < op1.len; i++)
+    {
+      o0 = mask0;
+      o1 = op1.val[i];
+      x = o0 - o1 - borrow;
+      result.val[i] = x;
+      old_borrow = borrow;
+      borrow = borrow ? o0 <= o1 : o0 < o1;
+    }
+
+  for (i = op1.len; i < len; i++)
+    {
+      o0 = val[i];
+      o1 = mask1;
+      x = o0 - o1 - borrow;
+      result.val[i] = x;
+      old_borrow = borrow;
+      borrow = borrow ? o0 <= o1 : o0 < o1;
+    }
+
+  if (result.len * HOST_BITS_PER_WIDE_INT < precision)
+    {
+      result.val[result.len] = mask0 - mask1 - borrow;
+      result.len++;
+      /* If we were short, we could not have overflowed.  */
+    }
+  else
+    {
+      small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+      if (small_prec == 0)
+	{
+	  if (sgn == wide_int::SIGNED)
+	    {
+	      if (((o0 ^ o1) & (x ^ o0)) >> (HOST_BITS_PER_WIDE_INT - 1))
+		*overflow = true;
+	    }
+	  else if (old_borrow)
+	    {
+	      if ((~o0) <= o1)
+		*overflow = true;
+	    }
+	  else
+	    {
+	      if ((~o0) < o1)
+		*overflow = true;
+	    }
+	}
+      else
+	{
+	  if (sgn == wide_int::UNSIGNED)
+	    {
+	      /* The caveat for unsigned is to get rid of the bits above
+		 the precision before doing the subtraction.  To check the
+		 overflow, clear these bits and then redo the last
+		 subtraction.  If there are any non zero bits above the prec,
+		 we overflowed. */
+	      o0 = zext_hwi (o0, small_prec);
+	      o1 = zext_hwi (o1, small_prec);
+	      x = o0 - o1 - old_borrow;
+	      if (x >> small_prec)
+		*overflow = true;
+	    }
+	  else 
+	    {
+	      /* Overflow in this case is easy since we can see bits beyond
+		 the precision.  If the value computed is not the sign
+		 extended value, then we have overflow.  */
+	      unsigned HOST_WIDE_INT y = sext_hwi (x, small_prec);
+	      if (x != y)
+		*overflow = true;
+	    }
+	}
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wvww ("wide_int:: %s %d = (%s -O %s)\n", 
+		result, *overflow, *this, op1);
+#endif
+  return result;
+}
+
+/*
+ * Division and Mod
+ */
+
+/* Compute B_QUOTIENT and B_REMAINDER from B_DIVIDEND/B_DIVISOR.  The
+   algorithm is a small modification of the algorithm in Hacker's
+   Delight by Warren, which itself is a small modification of Knuth's
+   algorithm.  M is the number of significant elements of B_DIVIDEND,
+   though there needs to be at least one extra element of B_DIVIDEND
+   allocated.  N is the number of elements of B_DIVISOR.  */
+
+void
+wide_int::divmod_internal_2 (unsigned HOST_HALF_WIDE_INT *b_quotient, 
+			     unsigned HOST_HALF_WIDE_INT *b_remainder,
+			     unsigned HOST_HALF_WIDE_INT *b_dividend, 
+			     unsigned HOST_HALF_WIDE_INT *b_divisor, 
+			     int m, int n)
+{
+  /* The "digits" are a HOST_HALF_WIDE_INT which the size of half of a
+     HOST_WIDE_INT and stored in the lower bits of each word.  This
+     algorithm should work properly on both 32 and 64 bit
+     machines.  */
+  unsigned HOST_WIDE_INT b
+    = (unsigned HOST_WIDE_INT)1 << HOST_BITS_PER_HALF_WIDE_INT;
+  unsigned HOST_WIDE_INT qhat;   /* Estimate of quotient digit.  */
+  unsigned HOST_WIDE_INT rhat;   /* A remainder.  */
+  unsigned HOST_WIDE_INT p;      /* Product of two digits.  */
+  HOST_WIDE_INT s, i, j, t, k;
+
+  /* Single digit divisor.  */
+  if (n == 1)
+    {
+      k = 0;
+      for (j = m - 1; j >= 0; j--)
+	{
+	  b_quotient[j] = (k * b + b_dividend[j])/b_divisor[0];
+	  k = ((k * b + b_dividend[j])
+	       - ((unsigned HOST_WIDE_INT)b_quotient[j]
+		  * (unsigned HOST_WIDE_INT)b_divisor[0]));
+	}
+      b_remainder[0] = k;
+      return;
+    }
+
+  s = clz_hwi (b_divisor[n-1]) - HOST_BITS_PER_HALF_WIDE_INT; /* CHECK clz */
+
+  /* Normalize B_DIVIDEND and B_DIVISOR.  Unlike the published
+     algorithm, we can overwrite b_dividend and b_divisor, so we do
+     that.  */
+  for (i = n - 1; i > 0; i--)
+    b_divisor[i] = (b_divisor[i] << s)
+      | (b_divisor[i-1] >> (HOST_BITS_PER_HALF_WIDE_INT - s));
+  b_divisor[0] = b_divisor[0] << s;
+
+  b_dividend[m] = b_dividend[m-1] >> (HOST_BITS_PER_HALF_WIDE_INT - s);
+  for (i = m - 1; i > 0; i--)
+    b_dividend[i] = (b_dividend[i] << s)
+      | (b_dividend[i-1] >> (HOST_BITS_PER_HALF_WIDE_INT - s));
+  b_dividend[0] = b_dividend[0] << s;
+
+  /* Main loop.  */
+  for (j = m - n; j >= 0; j--)
+    {
+      qhat = (b_dividend[j+n] * b + b_dividend[j+n-1]) / b_divisor[n-1];
+      rhat = (b_dividend[j+n] * b + b_dividend[j+n-1]) - qhat * b_divisor[n-1];
+    again:
+      if (qhat >= b || qhat * b_divisor[n-2] > b * rhat + b_dividend[j+n-2])
+	{
+	  qhat -= 1;
+	  rhat += b_divisor[n-1];
+	  if (rhat < b)
+	    goto again;
+	}
+
+      /* Multiply and subtract.  */
+      k = 0;
+      for (i = 0; i < n; i++)
+	{
+	  p = qhat * b_divisor[i];
+	  t = b_dividend[i+j] - k - (p & HALF_INT_MASK);
+	  b_dividend[i + j] = t;
+	  k = ((p >> HOST_BITS_PER_HALF_WIDE_INT)
+	       - (t >> HOST_BITS_PER_HALF_WIDE_INT));
+	}
+      t = b_dividend[j+n] - k;
+      b_dividend[j+n] = t;
+
+      b_quotient[j] = qhat;
+      if (t < 0)
+	{
+	  b_quotient[j] -= 1;
+	  k = 0;
+	  for (i = 0; i < n; i++)
+	    {
+	      t = (HOST_WIDE_INT)b_dividend[i+j] + b_divisor[i] + k;
+	      b_dividend[i+j] = t;
+	      k = t >> HOST_BITS_PER_HALF_WIDE_INT;
+	    }
+	  b_dividend[j+n] += k;
+	}
+    }
+  for (i = 0; i < n; i++)
+    b_remainder[i] = (b_dividend[i] >> s) 
+      | (b_dividend[i+1] << (HOST_BITS_PER_HALF_WIDE_INT - s));
+}
+
+
+/* Divide DIVIDEND by DIVISOR, truncating toward zero.  The result is
+   the same size as the operands.  SGN is either wide_int::SIGNED or
+   wide_int::UNSIGNED.  */
+
+wide_int
+wide_int::divmod_internal (bool compute_quotient, 
+			   const wide_int *dividend, 
+			   const HOST_WIDE_INT *divisor,
+			   unsigned int divisorlen,
+			   wide_int::SignOp sgn, wide_int *remainder,
+			   bool compute_remainder, 
+			   bool *oflow)
+{
+  wide_int quotient, u0, u1;
+  unsigned int prec = dividend->get_precision();
+  int blocks_needed = 2 * BLOCKS_NEEDED (prec);
+  /* The '2' in the next 4 vars are because they are built on half
+     sized wide ints.  */
+  unsigned HOST_HALF_WIDE_INT 
+    b_quotient[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
+  unsigned HOST_HALF_WIDE_INT
+    b_remainder[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
+  unsigned HOST_HALF_WIDE_INT
+    b_dividend[(MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT) + 1];
+  unsigned HOST_HALF_WIDE_INT
+    b_divisor[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
+  int m, n;
+  bool dividend_neg = false;
+  bool divisor_neg = false;
+  bool overflow = false;
+
+  if (divisor[0] == 0 && divisorlen == 1)
+    overflow = true;
+
+  /* The smallest signed number / -1 causes overflow.  */
+  if (sgn == wide_int::SIGNED)
+    {
+      wide_int t = wide_int::set_bit_in_zero (prec - 1, prec);
+      if (*dividend == t && divisor[0] == -1 && divisorlen == 1)
+	overflow = true;
+    }
+
+  quotient.precision = prec;
+  remainder->precision = prec;
+
+  /* If overflow is set, just get out.  There will only be grief by
+     continuing.  */
+  if (overflow)
+    {
+      if (compute_remainder)
+	{
+	  remainder->len = 1;
+	  remainder->val[0] = 0;
+	}
+      if (oflow != 0)
+	*oflow = true;
+      return wide_int::zero (prec);
+    }
+
+  /* Do it on the host if you can.  */
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      quotient.len = 1;
+      remainder->len = 1;
+      if (sgn == wide_int::SIGNED)
+	{
+	  quotient.val[0] 
+	    = sext_hwi (dividend->val[0] / divisor[0], prec);
+	  remainder->val[0] 
+	    = sext_hwi (dividend->val[0] % divisor[0], prec);
+	}
+      else
+	{
+	  unsigned HOST_WIDE_INT o0 = dividend->val[0];
+	  unsigned HOST_WIDE_INT o1 = divisor[0];
+
+	  if (prec < HOST_BITS_PER_WIDE_INT)
+	    {
+	      o0 = zext_hwi (o0, prec);
+	      o1 = zext_hwi (o1, prec);
+	    }
+	  quotient.val[0] = sext_hwi (o0 / o1, prec);
+	  remainder->val[0] = sext_hwi (o0 % o1, prec);
+	}
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwwa ("wide_int:: (q = %s) (r = %s) = (%s / %s)\n", 
+		    quotient, *remainder, *dividend, divisor, divisorlen);
+#endif
+      return quotient;
+    }
+
+  /* Make the divisor and dividend positive and remember what we
+     did.  */
+  if (sgn == wide_int::SIGNED)
+    {
+      if (dividend->sign_mask ())
+	{
+	  u0 = dividend->neg ();
+	  dividend = &u0;
+	  dividend_neg = true;
+	}
+      if (divisor[divisorlen - 1] < 0)
+	{
+	  u1 = wide_int::zero (dividend->precision).sub_large (divisor, divisorlen);
+	  divisor = u1.val;
+	  divisorlen = u1.len;
+	  divisor_neg = true;
+	}
+    }
+
+  wi_unpack (b_dividend, (const unsigned HOST_WIDE_INT*)dividend->val,
+	     dividend->len, blocks_needed);
+  wi_unpack (b_divisor, (const unsigned HOST_WIDE_INT*)divisor, 
+	     divisorlen, blocks_needed);
+
+  if (dividend->sign_mask ())
+    m = blocks_needed;
+  else
+    m = 2 * dividend->get_len ();
+
+  if (SIGN_MASK (divisor[divisorlen - 1]))
+    n = blocks_needed;
+  else
+    n = 2 * divisorlen;
+
+  /* It is known that the top input block to the divisor is non zero,
+     but when this block is split into two half blocks, it may be that
+     the top half block is zero.  Skip over this half block.  */
+  if (b_divisor[n - 1] == 0)
+    n--;
+
+  divmod_internal_2 (b_quotient, b_remainder, b_dividend, b_divisor, m, n);
+
+  if (compute_quotient)
+    {
+      wi_pack ((unsigned HOST_WIDE_INT*)quotient.val, b_quotient, m);
+      quotient.len = m / 2;
+      quotient.canonize ();
+      /* The quotient is neg if exactly one of the divisor or dividend is
+	 neg.  */
+      if (dividend_neg != divisor_neg)
+	quotient = quotient.neg ();
+    }
+  else
+    quotient = wide_int::zero (dividend->precision);
+
+  if (compute_remainder)
+    {
+      wi_pack ((unsigned HOST_WIDE_INT*)remainder->val, b_remainder, n);
+      if (n & 1)
+	n++;
+      remainder->len = n / 2;
+      (*remainder).canonize ();
+      /* The remainder is always the same sign as the dividend.  */
+      if (dividend_neg)
+	*remainder = (*remainder).neg ();
+    }
+  else
+    *remainder = wide_int::zero (dividend->precision);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwwa ("wide_int:: (q = %s) (r = %s) = (%s / %s)\n", 
+		quotient, *remainder, *dividend, divisor, divisorlen);
+#endif
+  return quotient;
+}
+
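+/* Since the division truncates toward zero, with a precision of 8 and
+   SGN == wide_int::SIGNED, -7 / 2 yields a quotient of -3 and a
+   remainder of -1; the remainder always takes the sign of the
+   dividend.  */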
+
+/*
+ * Shifting, rotating and extraction.
+ */
+
+/* Extract WIDTH bits from THIS starting at OFFSET.  The result is
+   assumed to fit in a HOST_WIDE_INT.  This function is safe in that
+   it can properly access elements that may not be explicitly
+   represented.  */
+
+HOST_WIDE_INT
+wide_int::extract_to_hwi (int offset, int width) const
+{
+  int start_elt, end_elt, shift;
+  HOST_WIDE_INT x;
+
+  /* Get rid of the easy cases first.   */
+  if (offset >= len * HOST_BITS_PER_WIDE_INT)
+    return sign_mask ();
+  if (offset + width <= 0)
+    return 0;
+
+  shift = offset & (HOST_BITS_PER_WIDE_INT - 1);
+  if (offset < 0)
+    {
+      start_elt = -1;
+      end_elt = 0;
+      x = 0;
+    }
+  else
+    {
+      start_elt = offset / HOST_BITS_PER_WIDE_INT;
+      end_elt = (offset + width - 1) / HOST_BITS_PER_WIDE_INT;
+      x = start_elt >= len 
+	? sign_mask ()
+	: (unsigned HOST_WIDE_INT)val[start_elt] >> shift;
+    }
+
+  if (start_elt != end_elt)
+    {
+      HOST_WIDE_INT y = end_elt == len
+	? sign_mask () : val[end_elt];
+
+      x |= y << (HOST_BITS_PER_WIDE_INT - shift);
+    }
+
+  if (width != HOST_BITS_PER_WIDE_INT)
+    x &= ((HOST_WIDE_INT)1 << width) - 1;
+
+  return x;
+}
+
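+/* For example, on a host with 64 bit HOST_WIDE_INTs, a value with
+   val[0] == 0xf000000000000000 and val[1] == 0x1 gives
+   extract_to_hwi (60, 8) == 0x1f: the top nibble of block 0 stitched
+   to the bottom nibble of block 1.  */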
+
+/* Left shift THIS by CNT bits into a result with precision RES_PREC.
+   Since this is used internally, it has the ability to specify the
+   result precision independently of the input precision.  This is
+   useful when inserting a small value into a larger one.  */
+
+wide_int
+wide_int::lshift_large (unsigned int cnt, unsigned int res_prec) const
+{
+  wide_int result;
+  unsigned int i;
+
+  result.precision = res_prec;
+
+  if (cnt >= res_prec)
+    {
+      result.val[0] = 0;
+      result.len = 1;
+      return result;
+    }
+
+  for (i = 0; i < res_prec; i += HOST_BITS_PER_WIDE_INT)
+    result.val[i / HOST_BITS_PER_WIDE_INT]
+      = extract_to_hwi (i - cnt, HOST_BITS_PER_WIDE_INT);
+
+  result.len = BLOCKS_NEEDED (res_prec);
+  result.canonize ();
+
+  return result;
+}
+
+/* Unsigned (logical) right shift THIS by CNT bits; zeros are shifted
+   in at the top.  */
+
+wide_int
+wide_int::rshiftu_large (unsigned int cnt) const
+{
+  wide_int result;
+  int stop_block, offset, i;
+
+  result.precision = precision;
+
+  if (cnt >= precision)
+    {
+      result.val[0] = 0;
+      result.len = 1;
+      return result;
+    }
+
+  stop_block = BLOCKS_NEEDED (precision - cnt);
+  for (i = 0; i < stop_block; i++)
+    result.val[i]
+      = extract_to_hwi ((i * HOST_BITS_PER_WIDE_INT) + cnt,
+			HOST_BITS_PER_WIDE_INT);
+
+  result.len = stop_block;
+
+  offset = (precision - cnt) & (HOST_BITS_PER_WIDE_INT - 1);
+  if (offset)
+    result.val[stop_block - 1] = zext_hwi (result.val[stop_block - 1], offset);
+  else
+    /* The top block had a 1 at the top position so it will decompress
+       wrong unless a zero block is added.  This only works because we
+       know the shift was greater than 0.  */
+    if (result.val[stop_block - 1] < 0)
+      result.val[result.len++] = 0;
+  return result;
+}
+
+/* Signed (arithmetic) right shift THIS by CNT bits; copies of the
+   sign bit are shifted in at the top.  */
+
+wide_int
+wide_int::rshifts_large (unsigned int cnt) const
+{
+  wide_int result;
+  int stop_block, i;
+
+  result.precision = precision;
+
+  if (cnt >= precision)
+    {
+      HOST_WIDE_INT m = sign_mask ();
+      result.val[0] = m;
+      result.len = 1;
+      return result;
+    }
+
+  stop_block = BLOCKS_NEEDED (precision - cnt);
+  for (i = 0; i < stop_block; i++)
+    result.val[i]
+      = extract_to_hwi ((i * HOST_BITS_PER_WIDE_INT) + cnt,
+			HOST_BITS_PER_WIDE_INT);
+
+  result.len = stop_block;
+
+  /* No need to sign extend the last block, since extract_to_hwi
+     already did that.  */
+  return result;
+}
+
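+/* For example, with a precision of 8, shifting 0xf8 (-8) right by 2
+   gives 0xfe (-2): the sign bit is replicated, where rshiftu_large
+   above would give 0x3e.  */
+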
+/*
+ * Private utilities.
+ */
+/* Decompress THIS for at least TARGET bits into a result with
+   precision PREC.  */
+
+wide_int
+wide_int::decompress (unsigned int target, unsigned int prec) const
+{
+  wide_int result;
+  int blocks_needed = BLOCKS_NEEDED (target);
+  HOST_WIDE_INT mask;
+  int len, i;
+
+  result.precision = prec;
+  result.len = blocks_needed;
+
+  for (i = 0; i < this->len; i++)
+    result.val[i] = val[i];
+
+  len = this->len;
+
+  if (target > result.precision)
+    return result;
+
+  /* The extension that we are doing here is not sign extension, it is
+     decompression.  */
+  mask = sign_mask ();
+  while (len < blocks_needed)
+    result.val[len++] = mask;
+
+  return result;
+}
+
+/*
+ * Private debug printing routines.
+ */
+
+/* The debugging routines print results of wide operations into the
+   dump files of the respective passes in which they were called.  */
+static char *
+dumpa (const HOST_WIDE_INT *val, unsigned int len, char *buf)
+{
+  int i;
+  int l;
+  const char * sep = "";
+
+  l = sprintf (buf, "[(");
+  for (i = len - 1; i >= 0; i--)
+    {
+      l += sprintf (&buf[l], "%s" HOST_WIDE_INT_PRINT_HEX, sep, val[i]);
+      sep = " ";
+    }
+
+  gcc_assert (len != 0);
+
+  l += sprintf (&buf[l], ")]");
+
+  gcc_assert (l < MAX_SIZE);
+  return buf;
+}
+
+/* The debugging routines print results of wide operations into the
+   dump files of the respective passes in which they were called.  */
+char *
+wide_int::dump (char* buf) const
+{
+  int i;
+  int l;
+  const char * sep = "";
+
+  l = sprintf (buf, "[%d (", precision);
+  for (i = len - 1; i >= 0; i--)
+    {
+      l += sprintf (&buf[l], "%s" HOST_WIDE_INT_PRINT_HEX, sep, val[i]);
+      sep = " ";
+    }
+
+  gcc_assert (len != 0);
+
+  l += sprintf (&buf[l], ")]");
+
+  gcc_assert (l < MAX_SIZE);
+  return buf;
+}
+
+#ifdef DEBUG_WIDE_INT
+void
+wide_int::debug_vw (const char* fmt, int r, const wide_int& o0)
+{
+  char buf0[MAX_SIZE];
+  fprintf (dump_file, fmt, r, o0.dump (buf0));
+}
+
+void
+wide_int::debug_vwh (const char* fmt, int r, const wide_int &o0,
+		     HOST_WIDE_INT o1)
+{
+  char buf0[MAX_SIZE];
+  fprintf (dump_file, fmt, r, o0.dump (buf0), o1);
+}
+
+void
+wide_int::debug_vwa (const char* fmt, int r, const wide_int &o0,
+		     const HOST_WIDE_INT *o1, unsigned int l1)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  fprintf (dump_file, fmt, r, o0.dump (buf0), dumpa (o1, l1, buf1));
+}
+
+void
+wide_int::debug_vww (const char* fmt, int r, const wide_int &o0,
+		     const wide_int &o1)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  fprintf (dump_file, fmt, r, o0.dump (buf0), o1.dump (buf1));
+}
+
+void
+wide_int::debug_wh (const char* fmt, const wide_int &r,
+		    HOST_WIDE_INT o1)
+{
+  char buf0[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), o1);
+}
+
+void
+wide_int::debug_whh (const char* fmt, const wide_int &r,
+		     HOST_WIDE_INT o1, HOST_WIDE_INT o2)
+{
+  char buf0[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), o1, o2);
+}
+
+void
+wide_int::debug_wv (const char* fmt, const wide_int &r, int v0)
+{
+  char buf0[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), v0);
+}
+
+void
+wide_int::debug_wvv (const char* fmt, const wide_int &r,
+		     int v0, int v1)
+{
+  char buf0[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), v0, v1);
+}
+
+void
+wide_int::debug_wvvv (const char* fmt, const wide_int &r,
+		      int v0, int v1, int v2)
+{
+  char buf0[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), v0, v1, v2);
+}
+
+void
+wide_int::debug_wvwa (const char* fmt, const wide_int &r, int v0,
+		      const wide_int &o0, const HOST_WIDE_INT *o1,
+		      unsigned int o1len)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  char buf2[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), v0,
+	   o0.dump (buf1), dumpa (o1, o1len, buf2));
+}
+
+void
+wide_int::debug_wvww (const char* fmt, const wide_int &r, int v0,
+		      const wide_int &o0, const wide_int &o1)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  char buf2[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), v0,
+	   o0.dump (buf1), o1.dump (buf2));
+}
+
+void
+wide_int::debug_ww (const char* fmt, const wide_int &r,
+		    const wide_int &o0)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), o0.dump (buf1));
+}
+
+void
+wide_int::debug_wwa (const char* fmt, const wide_int &r,
+		     const wide_int &o0, const HOST_WIDE_INT *o1, 
+		     unsigned int l1)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  char buf2[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), o0.dump (buf1), dumpa (o1, l1, buf2));
+}
+
+void
+wide_int::debug_wwv (const char* fmt, const wide_int &r,
+		     const wide_int &o0, int v0)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), o0.dump (buf1), v0);
+}
+
+void
+wide_int::debug_wwvs (const char* fmt, const wide_int &r, 
+		      const wide_int &o0, int v0,
+		      const char *s)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), o0.dump (buf1), v0, s);
+}
+
+void
+wide_int::debug_wwvvs (const char* fmt, const wide_int &r, 
+		       const wide_int &o0, int v0, int v1,
+		       const char *s)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0), o0.dump (buf1), v0, v1, s);
+}
+
+void
+wide_int::debug_wwwvv (const char* fmt, const wide_int &r,
+		       const wide_int &o0, const wide_int &o1,
+		       int v0, int v1)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  char buf2[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0),
+	   o0.dump (buf1), o1.dump (buf2), v0, v1);
+}
+
+void
+wide_int::debug_www (const char* fmt, const wide_int &r,
+		     const wide_int &o0, const wide_int &o1)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  char buf2[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0),
+	   o0.dump (buf1), o1.dump (buf2));
+}
+
+void
+wide_int::debug_wwwa (const char* fmt, const wide_int &r,
+		      const wide_int &o0, const wide_int &o1,
+		      const HOST_WIDE_INT *o2, unsigned int o2len)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  char buf2[MAX_SIZE];
+  char buf3[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0),
+	   o0.dump (buf1), o1.dump (buf2), dumpa (o2, o2len, buf3));
+}
+
+void
+wide_int::debug_wwww (const char* fmt, const wide_int &r,
+		      const wide_int &o0, const wide_int &o1,
+		      const wide_int &o2)
+{
+  char buf0[MAX_SIZE];
+  char buf1[MAX_SIZE];
+  char buf2[MAX_SIZE];
+  char buf3[MAX_SIZE];
+  fprintf (dump_file, fmt, r.dump (buf0),
+	   o0.dump (buf1), o1.dump (buf2), o2.dump (buf3));
+}
+
+#endif
+
diff --git a/gcc/wide-int.h b/gcc/wide-int.h
new file mode 100644
index 0000000..c12db4a
--- /dev/null
+++ b/gcc/wide-int.h
@@ -0,0 +1,2608 @@ 
+/* Operations with very long integers.
+   Copyright (C) 2012-2013 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef WIDE_INT_H
+#define WIDE_INT_H
+
+/* A wide integer is currently represented as a vector of
+   HOST_WIDE_INTs.  The vector contains enough elements to hold a
+   value of MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT, which
+   is derived for each host/target combination.  The values are stored
+   in the vector with the least significant HOST_BITS_PER_WIDE_INT
+   bits of the value stored in element 0.
+
+   A wide_int contains three fields: the vector (VAL), a precision
+   (PRECISION) and a length (LEN).  The length is the number of HWIs
+   needed to represent the value.
+
+   Since most integers used in a compiler are small values, it is
+   generally profitable to use a representation of the value that is
+   shorter than the mode's precision.  LEN is used to indicate the
+   number of elements of the vector that are in use.  When LEN *
+   HOST_BITS_PER_WIDE_INT < the precision, the value has been
+   compressed.  The elements of the vector at indices greater than
+   LEN - 1 are all implicitly equal to the sign extension of the
+   highest order bit of element LEN - 1.
+
+   The representation does not contain any inherent information about
+   the signedness of the represented value, so it can be used to
+   represent both signed and unsigned numbers.  For operations where
+   the results depend on signedness (division, comparisons), the
+   signedness must be specified separately; one of the operands to the
+   operation specifies either wide_int::SIGNED or wide_int::UNSIGNED.
+
+   The numbers are stored as sign extended numbers as a means of
+   compression.  Leading HOST_WIDE_INTs that contain strings of either
+   -1 or 0 are removed as long as they can be reconstructed from the
+   top bit that is being represented.
+
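+   For example, with a precision of 128 on a host with 64 bit
+   HOST_WIDE_INTs, the value 1 is stored with LEN == 1 and VAL[0] == 1,
+   and the value -1 with LEN == 1 and VAL[0] == -1; only a value whose
+   upper 64 bits cannot be recovered by sign extending VAL[0] needs
+   LEN == 2.
+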
+   All constructors for wide_int take either a precision, an enum
+   machine_mode or tree_type.  */
+
+
+#ifndef GENERATOR_FILE
+#include "tree.h"
+#include "system.h"
+#include "hwint.h"
+#include "options.h"
+#include "tm.h"
+#include "insn-modes.h"
+#include "machmode.h"
+#include "double-int.h"
+#include <gmp.h>
+#include "dumpfile.h"
+#include "real.h"
+
+#define DEBUG_WIDE_INT
+
+#define WIDE_INT_MAX_ELTS \
+  ((4 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT)
+
+class wide_int {
+  /* Internal representation.  */
+  
+  /* VAL is set to a size that is capable of computing a full
+     multiplication on the largest mode that is represented on the
+     target.  The full multiplication is used by tree-vrp, which
+     currently does a 2x largest mode by 2x largest mode yielding a 4x
+     largest mode result.  If operations are added that require larger
+     buffers, then VAL needs to be changed.  */
+  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
+  unsigned short len;
+  unsigned int precision;
+
+ public:
+  enum ShiftOp {
+    NONE,
+    /* There are two uses for the wide-int shifting functions.  The
+       first use is as an emulation of the target hardware.  The
+       second use is as service routines for other optimizations.  The
+       first case needs to be identified by passing TRUNC as the value
+       of ShiftOp so that the shift amount is properly handled
+       according to the SHIFT_COUNT_TRUNCATED flag.  For the second
+       case, the shift amount is always truncated by the bitsize of
+       the mode of THIS.  */
+    TRUNC
+  };
+
+  enum SignOp {
+    /* Many of the math functions produce different results depending
+       on if they are SIGNED or UNSIGNED.  In general, there are two
+       different functions, whose names are prefixed with an 'S' or
+       a 'U'.  However, for some math functions there is also a
+       routine that does not have the prefix and takes a SignOp
+       parameter of SIGNED or UNSIGNED.  */
+    SIGNED,
+    UNSIGNED
+  };
+
+  /* Conversions.  */
+
+  static wide_int from_shwi (HOST_WIDE_INT op0, 
+			     unsigned int precision, bool *overflow = 0);
+  static wide_int from_uhwi (unsigned HOST_WIDE_INT op0, 
+			     unsigned int precision, bool *overflow = 0);
+
+  inline static wide_int from_hwi (HOST_WIDE_INT op0, const_tree type, 
+				   bool *overflow = 0);
+  inline static wide_int from_shwi (HOST_WIDE_INT op0, enum machine_mode mode, 
+				    bool *overflow = 0);
+  inline static wide_int from_uhwi (unsigned HOST_WIDE_INT op0, 
+				    enum machine_mode mode, 
+				    bool *overflow = 0);
+  static wide_int from_array (const HOST_WIDE_INT* op0,
+			      unsigned int len, 
+			      unsigned int precision, bool need_canon = true); 
+  inline static wide_int from_array (const HOST_WIDE_INT* op0,
+				     unsigned int len,
+				     enum machine_mode mode);
+  inline static wide_int from_array (const HOST_WIDE_INT* op0,
+				     unsigned int len,
+				     const_tree type);
+
+  static wide_int from_double_int (double_int, 
+				   unsigned int precision);
+  inline static wide_int from_double_int (double_int, enum machine_mode);
+  static wide_int from_tree (const_tree);
+  static wide_int from_rtx (const_rtx, enum machine_mode);
+  static wide_int from_buffer (const unsigned char*, int);
+
+  inline HOST_WIDE_INT to_shwi (unsigned int prec = 0) const;
+  inline unsigned HOST_WIDE_INT to_uhwi (unsigned int prec = 0) const;
+
+  /* Largest and smallest values that are represented in modes or precisions.  */
+
+  static wide_int max_value (unsigned int prec, 
+			     unsigned int precision, SignOp sgn);
+  inline static wide_int max_value (unsigned int precision, SignOp sgn);
+  inline static wide_int max_value (const_tree type);
+  inline static wide_int max_value (enum machine_mode mode, SignOp sgn);
+  
+  static wide_int min_value (unsigned int prec, 
+			     unsigned int precision, SignOp sgn);
+  inline static wide_int min_value (unsigned int precision, SignOp sgn);
+  inline static wide_int min_value (const_tree type);
+  inline static wide_int min_value (enum machine_mode mode, SignOp sgn);
+  
+  /* Small constants */
+
+  inline static wide_int minus_one (unsigned int prec);
+  inline static wide_int zero (unsigned int prec);
+  inline static wide_int one (unsigned int prec);
+  inline static wide_int two (unsigned int prec);
+
+  /* Accessors.  */
+
+  inline unsigned short get_len () const;
+  inline unsigned int get_precision () const;
+  inline HOST_WIDE_INT elt (unsigned int i) const;
+
+  /* Printing functions.  */
+
+  void print_dec (char *buf, SignOp sgn) const;
+  void print_dec (FILE *file, SignOp sgn) const;
+  void print_decs (char *buf) const;
+  void print_decs (FILE *file) const;
+  void print_decu (char *buf) const;
+  void print_decu (FILE *file) const;
+  void print_hex (char *buf) const;
+  void print_hex (FILE *file) const;
+
+  /* Comparative functions.  */
+
+  inline bool minus_one_p () const;
+  inline bool zero_p () const;
+  inline bool one_p () const;
+  inline bool neg_p () const;
+
+  template <typename T>
+    inline bool operator == (T c) const;
+  
+  template <typename T>
+    inline bool operator != (T c) const;
+  
+  template <typename T>
+    inline bool gt_p (T c, SignOp sgn) const;
+  template <typename T>
+    inline bool gts_p (T c) const;
+  template <typename T>
+    inline bool gtu_p (T c) const;
+  
+  template <typename T>
+    inline bool lt_p (T c, SignOp sgn) const;
+  template <typename T>
+    inline bool lts_p (T c) const;
+  template <typename T>
+    inline bool ltu_p (T c) const;
+  
+  template <typename T>
+    inline int cmp (T c, SignOp sgn) const;
+  template <typename T>
+    inline int cmps (T c) const;
+  template <typename T>
+    inline int cmpu (T c) const;
+  
+  bool only_sign_bit_p (unsigned int prec) const;
+  inline bool only_sign_bit_p () const;
+  inline bool fits_uhwi_p () const;
+  inline bool fits_shwi_p () const;
+  bool fits_to_tree_p (const_tree type) const;
+  bool fits_u_p (unsigned int prec) const;
+  bool fits_s_p (unsigned int prec) const;
+
+  /* Min and max */
+
+  template <typename T>
+    inline wide_int min (T c, SignOp sgn) const;
+    inline wide_int min (const wide_int &op1, SignOp sgn) const;
+  template <typename T>
+    inline wide_int max (T c, SignOp sgn) const;
+    inline wide_int max (const wide_int &op1, SignOp sgn) const;
+  template <typename T>
+    inline wide_int smin (T c) const;
+    inline wide_int smin (const wide_int &op1) const;
+  template <typename T>
+    inline wide_int smax (T c) const;
+    inline wide_int smax (const wide_int &op1) const;
+  template <typename T>
+    inline wide_int umin (T c) const;
+    inline wide_int umin (const wide_int &op1) const;
+  template <typename T>
+    inline wide_int umax (T c) const;
+    inline wide_int umax (const wide_int &op1) const;
+
+  /* Extension; these do not change the precision.  */
+
+  inline wide_int ext (unsigned int offset, SignOp sgn) const;
+  wide_int sext (unsigned int offset) const;
+  wide_int zext (unsigned int offset) const;
+
+  /* Make a fast copy.  */
+
+  wide_int copy () const;
+
+  /* These change the underlying precision.  */
+  
+  wide_int force_to_size (unsigned int precision, SignOp sgn) const;
+  inline wide_int force_to_size (enum machine_mode mode, SignOp sgn) const;
+  inline wide_int force_to_size (const_tree type) const;
+  inline wide_int force_to_size (const_tree type, SignOp sgn) const;
+
+  inline wide_int sforce_to_size (enum machine_mode mode) const;
+  inline wide_int sforce_to_size (const_tree type) const;
+  inline wide_int zforce_to_size (enum machine_mode mode) const;
+  inline wide_int zforce_to_size (const_tree type) const;
+
+  /* Masking and insertion.  */
+
+  wide_int set_bit (unsigned int bitpos) const;
+  static wide_int set_bit_in_zero (unsigned int bitpos, unsigned int prec);
+  inline static wide_int set_bit_in_zero (unsigned int, 
+					  enum machine_mode mode);
+  inline static wide_int set_bit_in_zero (unsigned int, const_tree type);
+  wide_int insert (const wide_int &op0, unsigned int offset,
+		   unsigned int width) const;
+
+  wide_int bswap () const;
+
+  static wide_int mask (unsigned int start, bool negate, 
+			unsigned int prec);
+  inline static wide_int mask (unsigned int start, bool negate, 
+			       enum machine_mode mode);
+  inline static wide_int mask (unsigned int start, bool negate,
+			       const_tree type);
+
+  static wide_int shifted_mask (unsigned int start, unsigned int width,
+				bool negate, unsigned int prec);
+  inline static wide_int shifted_mask (unsigned int start, unsigned int width, 
+				       bool negate, enum machine_mode mode);
+  inline static wide_int shifted_mask (unsigned int start, unsigned int width, 
+				       bool negate, const_tree type);
+
+  inline HOST_WIDE_INT sign_mask () const;
+
+  /* Logicals.  */
+
+  template <typename T>
+    inline wide_int operator & (T c) const;
+  template <typename T>
+    inline wide_int and_not (T c) const;
+  inline wide_int operator ~ () const;
+  template <typename T>
+    inline wide_int operator | (T c) const;
+  template <typename T>
+    inline wide_int or_not (T c) const;
+  template <typename T>
+    inline wide_int operator ^ (T c) const;
+  
+  /* Arithmetic operation functions, alpha sorted.  */
+  
+  wide_int abs () const;
+  template <typename T>
+    inline wide_int operator + (T c) const;
+  wide_int add (const wide_int &x, SignOp sgn, bool *overflow) const;
+  
+  wide_int clz () const;
+  wide_int clrsb () const;
+  wide_int ctz () const;
+  wide_int exact_log2 () const;
+  wide_int floor_log2 () const;
+  wide_int ffs () const;
+  
+  template <typename T>
+    inline wide_int operator * (T c) const;
+  template <typename T>
+    inline wide_int mul (T c, SignOp sgn, bool *overflow) const;
+  template <typename T>
+    inline wide_int smul (T c, bool *overflow) const;
+  template <typename T>
+    inline wide_int umul (T c, bool *overflow) const;
+  template <typename T>
+    inline wide_int mul_full (T c, SignOp sgn) const;
+  template <typename T>
+    inline wide_int smul_full (T c) const;
+  template <typename T>
+    inline wide_int umul_full (T c) const;
+  template <typename T>
+    inline wide_int mul_high (T c, SignOp sgn) const;
+  
+  inline wide_int neg () const;
+  inline wide_int neg (bool *overflow) const;
+  
+  wide_int parity () const;
+  wide_int popcount () const;
+  
+  template <typename T>
+    inline wide_int operator - (T c) const;
+  wide_int sub (const wide_int &x, SignOp sgn, bool *overflow) const;
+  
+  /* Division and modulus.  These are the ones that are actually
+     used, but there are a lot of them.  */
+  
+  template <typename T>
+    inline wide_int div_trunc (T c, SignOp sgn, bool *overflow = 0) const;
+  template <typename T>
+    inline wide_int sdiv_trunc (T c) const;
+  template <typename T>
+    inline wide_int udiv_trunc (T c) const;
+  
+  template <typename T>
+    inline wide_int div_floor (T c, SignOp sgn, bool *overflow = 0) const;
+  template <typename T>
+    inline wide_int udiv_floor (T c) const;
+  template <typename T>
+    inline wide_int sdiv_floor (T c) const;
+  template <typename T>
+    inline wide_int div_ceil (T c, SignOp sgn, bool *overflow = 0) const;
+  template <typename T>
+    inline wide_int div_round (T c, SignOp sgn, bool *overflow = 0) const;
+  
+  template <typename T>
+    inline wide_int divmod_trunc (T c, wide_int *mod, SignOp sgn) const;
+  template <typename T>
+    inline wide_int sdivmod_trunc (T c, wide_int *mod) const;
+  template <typename T>
+    inline wide_int udivmod_trunc (T c, wide_int *mod) const;
+  
+  template <typename T>
+    inline wide_int divmod_floor (T c, wide_int *mod, SignOp sgn) const;
+  template <typename T>
+    inline wide_int sdivmod_floor (T c, wide_int *mod) const;
+  
+  template <typename T>
+    inline wide_int mod_trunc (T c, SignOp sgn, bool *overflow = 0) const;
+  template <typename T>
+    inline wide_int smod_trunc (T c) const;
+  template <typename T>
+    inline wide_int umod_trunc (T c) const;
+  
+  template <typename T>
+    inline wide_int mod_floor (T c, SignOp sgn, bool *overflow = 0) const;
+  template <typename T>
+    inline wide_int umod_floor (T c) const;
+  template <typename T>
+    inline wide_int mod_ceil (T c, SignOp sgn, bool *overflow = 0) const;
+  template <typename T>
+    inline wide_int mod_round (T c, SignOp sgn, bool *overflow = 0) const;
+  
+  /* Shifting, rotating and extracting.  */
+  HOST_WIDE_INT extract_to_hwi (int offset, int width) const;
+  
+  template <typename T>
+    inline wide_int lshift (T c, unsigned int bitsize = 0,
+			    unsigned int precision = 0,
+			    ShiftOp z = NONE) const;
+
+  template <typename T>
+    inline wide_int lrotate (T c) const;
+  inline wide_int lrotate (unsigned HOST_WIDE_INT y) const;
+
+  template <typename T>
+    inline wide_int rshift (T c, SignOp sgn, 
+			    unsigned int bitsize = 0,
+			    ShiftOp z = NONE) const;
+  template <typename T>
+    inline wide_int rshiftu (T c, unsigned int bitsize = 0,
+			     ShiftOp z = NONE) const;
+  template <typename T>
+    inline wide_int rshifts (T c, unsigned int bitsize = 0,
+			     ShiftOp z = NONE) const;
+
+  template <typename T>
+    inline wide_int rrotate (T c) const;
+    inline wide_int rrotate (unsigned HOST_WIDE_INT y) const;
+
+  static const int DUMP_MAX = (2 * (MAX_BITSIZE_MODE_ANY_INT / 4
+			       + MAX_BITSIZE_MODE_ANY_INT 
+				    / HOST_BITS_PER_WIDE_INT + 32));
+  char *dump (char* buf) const;
+ private:
+
+  /* 
+   * Internal versions that do the work if the values do not fit in a
+   * HWI.
+   */ 
+
+  /* Comparisons */
+  bool eq_p_large (const HOST_WIDE_INT *, unsigned int) const;
+  static bool lts_p_large (const HOST_WIDE_INT *, unsigned int, 
+			   const HOST_WIDE_INT *, unsigned int,
+			   unsigned int);
+  int cmps_large (const HOST_WIDE_INT *, unsigned int) const;
+  static bool ltu_p_large (const HOST_WIDE_INT *, unsigned int, 
+			   const HOST_WIDE_INT *, unsigned int);
+  int cmpu_large (const HOST_WIDE_INT *, unsigned int) const;
+
+  /* Logicals.  */
+  wide_int and_large (const HOST_WIDE_INT *, unsigned int) const;
+  wide_int and_not_large (const HOST_WIDE_INT *, unsigned int) const;
+  wide_int or_large (const HOST_WIDE_INT *, unsigned int) const;
+  wide_int or_not_large (const HOST_WIDE_INT *, unsigned int) const;
+  wide_int xor_large (const HOST_WIDE_INT *, unsigned int) const;
+
+  /* Arithmetic */
+  wide_int add_large (const HOST_WIDE_INT *, unsigned int) const;
+  wide_int sub_large (const HOST_WIDE_INT *, unsigned int) const;
+
+  wide_int lshift_large (unsigned int cnt, unsigned int res_prec) const;
+  wide_int rshiftu_large (unsigned int cnt) const;
+  wide_int rshifts_large (unsigned int cnt) const;
+
+  static wide_int
+    mul_internal (bool high, bool full, 
+		  const wide_int *op1, const HOST_WIDE_INT *op2, unsigned int op2len,
+		  wide_int::SignOp sgn,  bool *overflow, bool needs_overflow);
+  static void
+    divmod_internal_2 (unsigned HOST_HALF_WIDE_INT *b_quotient, 
+		       unsigned HOST_HALF_WIDE_INT *b_remainder,
+		       unsigned HOST_HALF_WIDE_INT *b_dividend, 
+		       unsigned HOST_HALF_WIDE_INT *b_divisor, 
+		       int m, int n);
+  static wide_int
+    divmod_internal (bool compute_quotient, 
+		     const wide_int *dividend, const HOST_WIDE_INT *, unsigned int,
+		     wide_int::SignOp sgn, wide_int *remainder,
+		     bool compute_remainder, 
+		     bool *overflow);
+
+
+  /* Private utility routines.  */
+  wide_int decompress (unsigned int target, unsigned int precision) const;
+  void canonize ();
+  static inline int trunc_shift (const HOST_WIDE_INT *cnt, unsigned int len, 
+				 unsigned int bitsize, ShiftOp z);
+
+  /* The following template and its overloads are used for the second
+     operand of binary functions.  They access the value and package
+     it so that the operation can be done.  They are implemented so
+     that a pointer into the representation of the second operand is
+     passed around rather than a copy of the data.  This is safe even
+     for garbage collected objects, since the value is thrown away
+     immediately.
+
+     The first two templates match all integers.  */
+
+  template <typename T>
+  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int *l, T x)
+  { 
+    s[0] = x;
+    if (~(T)0 < (T)0
+	|| sizeof (T) < sizeof (HOST_WIDE_INT))
+      {
+	*l = 1;
+      }
+    else
+      {
+	s[1] = 0;
+	*l = 2;	  
+      }
+    return s;
+  }
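+
+  /* A sketch of what the template above does for two concrete
+     argument types (illustrative only, assuming a 64-bit
+     HOST_WIDE_INT):
+
+       to_shwi2 (s, &l, -1);
+	 signed int: s[0] = -1 and *l = 1.
+       to_shwi2 (s, &l, (unsigned HOST_WIDE_INT) -1);
+	 unsigned and full width: s[0] = all ones, s[1] = 0 and
+	 *l = 2, so the array reads back as 2^64 - 1, not -1.  */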
+  
+  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s ATTRIBUTE_UNUSED,
+					int *l, const wide_int &y)
+  {
+    *l = y.len;
+    return y.val;
+  }
+
+  /* This overload may just return the pointer to the value and set
+     the length.  */
+  static inline const HOST_WIDE_INT* to_shwi2 (HOST_WIDE_INT *s, int *l, const_tree tcst)
+  {
+    /* This is rather complex now; in the future, when the rep for a
+       tree cst is changed, this will be just a pointer and length
+       copy.  */
+    tree type = TREE_TYPE (tcst);
+    unsigned int prec = TYPE_PRECISION (type);
+
+    if (prec == HOST_BITS_PER_DOUBLE_INT 
+	&& TYPE_UNSIGNED (type)
+	&& TREE_INT_CST_HIGH (tcst) < 0)
+      {
+	/* Copy the result because it does not fit in two HWIs.  */
+	s[0] = TREE_INT_CST_LOW (tcst);
+	s[1] = TREE_INT_CST_HIGH (tcst);
+	s[2] = 0;
+	*l = 3;
+	return s;
+      }
+    else
+      {
+	if (prec > HOST_BITS_PER_WIDE_INT)
+	  *l = 2;
+	else
+	  {
+	    if (prec == HOST_BITS_PER_WIDE_INT
+		&& TYPE_UNSIGNED (type)
+		&& (HOST_WIDE_INT)TREE_INT_CST_LOW (tcst) < 0)
+	      *l = 2;
+	    else
+	      *l = 1;
+	  }
+	return (const HOST_WIDE_INT*)&TREE_INT_CST_LOW (tcst);
+      }
+  }
+
+  /* There should logically be an overload for rtl here, but it cannot
+     be here because of circular include issues.  It is in rtl.h.  */
+  static inline const HOST_WIDE_INT* to_shwi2 
+    (HOST_WIDE_INT *s ATTRIBUTE_UNUSED, int *l, rtx rcst);
+
+#ifdef DEBUG_WIDE_INT
+  /* Debugging routines.  */
+  static void debug_vw  (const char* fmt, int r, const wide_int& o0);
+  static void debug_vwh (const char* fmt, int r, const wide_int &o0,
+			 HOST_WIDE_INT o1);
+  static void debug_vwa (const char* fmt, int r, const wide_int &o0,
+			 const HOST_WIDE_INT *, unsigned int v0);
+  static void debug_vww (const char* fmt, int r, const wide_int &o0,
+			 const wide_int &o1);
+  static void debug_wh (const char* fmt, const wide_int &r,
+			 HOST_WIDE_INT o1);
+  static void debug_whh (const char* fmt, const wide_int &r,
+			 HOST_WIDE_INT o1, HOST_WIDE_INT o2);
+  static void debug_wv (const char* fmt, const wide_int &r, int v0);
+  static void debug_wvv (const char* fmt, const wide_int &r, int v0,
+			 int v1);
+  static void debug_wvvv (const char* fmt, const wide_int &r, int v0,
+			  int v1, int v2);
+  static void debug_wvwa (const char* fmt, const wide_int &r, int v0,
+			  const wide_int &o0, const HOST_WIDE_INT *o1, 
+			  unsigned int o1l);
+  static void debug_wvww (const char* fmt, const wide_int &r, int v0,
+			  const wide_int &o0, const wide_int &o1);
+  static void debug_wwa (const char* fmt, const wide_int &r,
+			 const wide_int &o0, const HOST_WIDE_INT *, 
+			 unsigned int v0);
+  static void debug_wwv (const char* fmt, const wide_int &r,
+			 const wide_int &o0, int v0);
+  static void debug_wwvs (const char* fmt, const wide_int &r, 
+			  const wide_int &o0, 
+			  int v0, const char *s);
+  static void debug_wwvvs (const char* fmt, const wide_int &r, 
+			   const wide_int &o0, 
+			   int v0, int v1, const char *s);
+  static void debug_wwwvv (const char* fmt, const wide_int &r,
+			   const wide_int &o0, const wide_int &o1,
+			   int v0, int v1);
+  static void debug_ww (const char* fmt, const wide_int &r,
+			const wide_int &o0);
+  static void debug_www (const char* fmt, const wide_int &r,
+			 const wide_int &o0, const wide_int &o1);
+  static void debug_wwwa (const char* fmt, const wide_int &r,
+			  const wide_int &o0, const wide_int &o1, 
+			  const HOST_WIDE_INT *o2, unsigned int len);
+  static void debug_wwww (const char* fmt, const wide_int &r,
+			  const wide_int &o0, const wide_int &o1, 
+			  const wide_int &o2);
+#endif
+};
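+
+/* An illustrative usage sketch (not part of the interface; the
+   values assume a precision of 32 bits and are easy to check by
+   hand):
+
+     wide_int a = wide_int::from_shwi (7, 32);
+     wide_int b = wide_int::from_shwi (-3, 32);
+     wide_int s = a + b;        // 4
+     bool lt = b.lts_p (a);     // true: -3 < 7 when signed
+     bool ltu = b.ltu_p (a);    // false: 0xfffffffd > 7 when unsigned
+*/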
+
+/* Insert a 1 bit into 0 at BITPOS, producing a number with precision
+   taken from MODE.  */
+
+wide_int
+wide_int::set_bit_in_zero (unsigned int bitpos, enum machine_mode mode)
+{
+  return wide_int::set_bit_in_zero (bitpos, GET_MODE_PRECISION (mode));
+}
+
+/* Insert a 1 bit into 0 at BITPOS, producing a number with precision
+   taken from TYPE.  */
+
+wide_int
+wide_int::set_bit_in_zero (unsigned int bitpos, const_tree type)
+{
+  return wide_int::set_bit_in_zero (bitpos, TYPE_PRECISION (type));
+}
+
+/* Return a result mask where the lower WIDTH bits are ones and the
+   bits above that up to the precision are zeros.  The result is
+   inverted if NEGATE is true.  The result is made with precision
+   taken from MODE.  */
+
+wide_int
+wide_int::mask (unsigned int width, bool negate, enum machine_mode mode)
+{
+  return wide_int::mask (width, negate, GET_MODE_PRECISION (mode));
+}
+
+/* Return a result mask where the lower WIDTH bits are ones and the
+   bits above that up to the precision are zeros.  The result is
+   inverted if NEGATE is true.  The result is made with precision
+   taken from TYPE.  */
+
+wide_int
+wide_int::mask (unsigned int width, bool negate, const_tree type)
+{
+  return wide_int::mask (width, negate, TYPE_PRECISION (type));
+}
+
+/* Return a result mask of WIDTH ones starting at START and the bits
+   above that up to the precision are zeros.  The result is inverted
+   if NEGATE is true.  The result is made with precision taken from
+   MODE.  */
+
+wide_int
+wide_int::shifted_mask (unsigned int start, unsigned int width, 
+			bool negate, enum machine_mode mode)
+{
+  return wide_int::shifted_mask (start, width, negate,
+				 GET_MODE_PRECISION (mode));
+}
+
+/* Return a result mask of WIDTH ones starting at START and the bits
+   above that up to the precision are zeros.  The result is inverted
+   if NEGATE is true.  The result is made with precision taken from
+   TYPE.  */
+
+wide_int
+wide_int::shifted_mask (unsigned int start, unsigned int width, 
+			bool negate, const_tree type)
+{
+  return wide_int::shifted_mask (start, width, negate,
+				 TYPE_PRECISION (type));
+}
+
+/* Produce 0 or -1 that is the smear of the sign bit.  */
+
+HOST_WIDE_INT
+wide_int::sign_mask () const
+{
+  int i = len - 1;
+  if (precision < HOST_BITS_PER_WIDE_INT)
+    return ((val[0] << (HOST_BITS_PER_WIDE_INT - precision))
+	    >> (HOST_BITS_PER_WIDE_INT - 1));
+
+  /* VRP appears to be badly broken and this is a very ugly fix.  */
+  if (i >= 0)
+    return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
+
+  gcc_unreachable ();
+#if 0
+  return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
+#endif
+}
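+
+/* For example (illustrative): at precision 8 the value 0x80 is held
+   sign extended as val[0] == -128, so sign_mask () returns -1, while
+   0x7f yields 0.  */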
+
+/* Conversions */
+
+/* Convert OP0 into a wide_int with parameters taken from TYPE.  If
+   the value does not fit, set OVERFLOW.  */
+
+wide_int
+wide_int::from_hwi (HOST_WIDE_INT op0, const_tree type, 
+		    bool *overflow)
+{
+  unsigned int prec = TYPE_PRECISION (type);
+
+  if (TYPE_UNSIGNED (type))
+    return wide_int::from_uhwi (op0, prec, overflow);
+  else
+    return wide_int::from_shwi (op0, prec, overflow);
+}
+
+/* Convert signed OP0 into a wide_int with parameters taken from
+   MODE. If the value does not fit, set OVERFLOW. */
+
+wide_int
+wide_int::from_shwi (HOST_WIDE_INT op0, enum machine_mode mode, 
+		     bool *overflow)
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+
+  return wide_int::from_shwi (op0, prec, overflow);
+}
+
+/* Convert unsigned OP0 into a wide_int with parameters taken from
+   MODE. If the value does not fit, set OVERFLOW. */
+
+wide_int
+wide_int::from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode mode, 
+		     bool *overflow)
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+
+  return wide_int::from_uhwi (op0, prec, overflow);
+}
+
+/* Return THIS as a signed HOST_WIDE_INT.  If THIS does not fit in
+   PREC, the information is lost. */
+
+HOST_WIDE_INT 
+wide_int::to_shwi (unsigned int prec) const
+{
+  HOST_WIDE_INT result;
+
+  if (prec == 0)
+    prec = precision;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    result = sext_hwi (val[0], prec);
+  else
+    result = val[0];
+
+  return result;
+}
+
+/* Return THIS as an unsigned HOST_WIDE_INT.  If THIS does not fit in
+   PREC, the information is lost. */
+
+unsigned HOST_WIDE_INT 
+wide_int::to_uhwi (unsigned int prec) const
+{
+  unsigned HOST_WIDE_INT result;
+
+  if (prec == 0)
+    prec = precision;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    result = zext_hwi (val[0], prec);
+  else
+    result = val[0];
+
+  return result;
+}
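+
+/* Example (illustrative): a wide_int holding -1 at precision 8
+   returns -1 from to_shwi () but 255 from to_uhwi (); the same low
+   bits are sign or zero extended respectively.  */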
+
+
+/* Convert OP0 of LEN into a wide_int with the precision taken from
+   MODE.  */
+
+wide_int
+wide_int::from_array (const HOST_WIDE_INT* op0, unsigned int len, 
+		      enum machine_mode mode)
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+
+  return wide_int::from_array (op0, len, prec);
+}
+
+/* Convert OP0 of LEN into a wide_int with the precision taken from
+   TYPE.  */
+
+wide_int
+wide_int::from_array (const HOST_WIDE_INT* op0, unsigned int len, 
+		      const_tree type)
+{
+  unsigned int prec = TYPE_PRECISION (type);
+
+  return wide_int::from_array (op0, len, prec);
+}
+
+/* Convert double_int OP0 into a wide_int with parameters taken from
+   MODE.  */
+
+wide_int
+wide_int::from_double_int (double_int op0, enum machine_mode mode)
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+
+  return wide_int::from_double_int (op0, prec);
+}
+
+/* Min and Max value helpers.  */
+
+/* Produce the largest SGNed number that is represented in PRECISION.
+   The result is represented in PRECISION.  SGN must be SIGNED or
+   UNSIGNED.  */
+
+wide_int
+wide_int::max_value (unsigned int precision, SignOp sgn)
+{
+  return max_value (precision, precision, sgn);
+}
+  
+/* Produce the largest number that is represented in MODE.  The
+   precision is taken from MODE.  SGN must be SIGNED or
+   UNSIGNED.  */
+
+wide_int
+wide_int::max_value (enum machine_mode mode, SignOp sgn)
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+  return max_value (prec, prec, sgn);
+}
+
+/* Produce the largest number that is represented in TYPE. The
+   precision and sign are taken from TYPE.  */
+
+wide_int
+wide_int::max_value (const_tree type)
+{
+  unsigned int prec = TYPE_PRECISION (type);
+  return max_value (prec, prec, TYPE_UNSIGNED (type) ? UNSIGNED : SIGNED);
+}
+
+/* Produce the smallest SGNed number that is represented in PRECISION.
+   The result is represented in PRECISION.  SGN must be SIGNED or
+   UNSIGNED.  */
+
+wide_int
+wide_int::min_value (unsigned int precision, SignOp sgn)
+{
+  return min_value (precision, precision, sgn);
+}
+  
+/* Produce the smallest number that is represented in MODE.  The
+   precision is taken from MODE.  SGN must be SIGNED or UNSIGNED.  */
+
+wide_int
+wide_int::min_value (enum machine_mode mode, SignOp sgn)
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+  return min_value (prec, prec, sgn);
+}
+
+/* Produce the smallest number that is represented in TYPE. The
+   precision and sign are taken from TYPE.  */
+
+wide_int
+wide_int::min_value (const_tree type)
+{
+  unsigned int prec = TYPE_PRECISION (type);
+  return min_value (prec, prec, TYPE_UNSIGNED (type) ? UNSIGNED : SIGNED);
+}
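+
+/* Example (illustrative): max_value (8, SIGNED) is 127 and
+   max_value (8, UNSIGNED) is 255; min_value (8, SIGNED) is -128 and
+   min_value (8, UNSIGNED) is 0.  */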
+
+
+/* Small constants.  */
+
+/* Return a wide int of -1 with precision PREC.  */
+
+wide_int
+wide_int::minus_one (unsigned int prec)
+{
+  return wide_int::from_shwi (-1, prec);
+}
+
+/* Return a wide int of 0 with precision PREC.  */
+
+wide_int
+wide_int::zero (unsigned int prec)
+{
+  return wide_int::from_shwi (0, prec);
+}
+
+/* Return a wide int of 1 with precision PREC.  */
+
+wide_int
+wide_int::one (unsigned int prec)
+{
+  return wide_int::from_shwi (1, prec);
+}
+
+/* Return a wide int of 2 with precision PREC.  */
+
+wide_int
+wide_int::two (unsigned int prec)
+{
+  return wide_int::from_shwi (2, prec);
+}
+
+/* Public accessors for the interior of a wide int.  */
+
+/* Get the number of host wide ints actually represented within the
+   wide int.  */
+
+unsigned short
+wide_int::get_len () const
+{
+  return len;
+}
+
+/* Get precision of the value represented within the wide int.  */
+
+unsigned int
+wide_int::get_precision () const
+{
+  return precision;
+}
+
+/* Get a particular element of the wide int.  */
+
+HOST_WIDE_INT
+wide_int::elt (unsigned int i) const
+{
+  return i >= len ? sign_mask () : val[i];
+}
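+
+/* Example (illustrative): a wide_int of precision 128 holding -1 is
+   stored with len == 1, so elt (1) returns the sign_mask () value of
+   -1 rather than reading past the significant block.  */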
+
+/* Return true if THIS is -1.  */
+
+bool
+wide_int::minus_one_p () const
+{
+  return len == 1 && val[0] == (HOST_WIDE_INT)-1;
+}
+
+/* Return true if THIS is 0.  */
+
+bool
+wide_int::zero_p () const
+{
+  return len == 1 && val[0] == 0;
+}
+
+/* Return true if THIS is 1.  */
+
+bool
+wide_int::one_p () const
+{
+  return len == 1 && val[0] == 1;
+}
+
+/* Return true if THIS is negative.  */
+
+bool
+wide_int::neg_p () const
+{
+  return sign_mask () != 0;
+}
+
+/*
+ * Comparisons.  Note that only equality is an operator.  The other
+ * comparisons cannot be operators, since they are inherently signed
+ * or unsigned and C++ has no such operators.
+ */
+
+/* Return true if THIS == OP1.  */
+
+template <typename T>
+bool
+wide_int::operator == (T c) const
+{
+  bool result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision < HOST_BITS_PER_WIDE_INT)
+    {
+      unsigned HOST_WIDE_INT mask = ((HOST_WIDE_INT)1 << precision) - 1;
+      result = (val[0] & mask) == (s[0] & mask);
+    }
+  else if (precision == HOST_BITS_PER_WIDE_INT)
+      result = val[0] == s[0];
+  else
+    result = eq_p_large (s, cl);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwa ("wide_int:: %d = (%s == %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+/* Return true if THIS is not equal to OP1. */ 
+
+template <typename T>
+bool
+wide_int::operator != (T c) const
+{
+  return !(*this == c);
+}  
+
+/* Return true if THIS is less than OP1.  Signedness is indicated by
+   OP.  */
+
+template <typename T>
+bool
+wide_int::lt_p (T c, SignOp op) const
+{
+  if (op == SIGNED)
+    return lts_p (c);
+  else
+    return ltu_p (c);
+}  
+
+/* Return true if THIS < OP1 using signed comparisons.  */
+
+template <typename T>
+bool
+wide_int::lts_p (T c) const
+{
+  bool result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    /* The values are already logically sign extended.  */
+    result = val[0] < s[0];
+  else
+    result = lts_p_large (val, len, s, cl, precision);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwa ("wide_int:: %d = (%s lts_p %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+/* Return true if THIS < OP1 using unsigned comparisons.  */
+
+template <typename T>
+bool
+wide_int::ltu_p (T c) const
+{
+  unsigned HOST_WIDE_INT x0;
+  unsigned HOST_WIDE_INT x1;
+  bool result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      if (precision < HOST_BITS_PER_WIDE_INT)
+	{
+	  x0 = zext_hwi (val[0], precision);
+	  x1 = zext_hwi (s[0], precision);
+	}
+      else
+	{
+	  x0 = val[0];
+	  x1 = s[0];
+	}
+
+      result = x0 < x1;
+    }
+  else
+    result = ltu_p_large (val, len, s, cl);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwa ("wide_int:: %d = (%s ltu_p %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+/* Return true if THIS is greater than OP1.  Signedness is indicated
+   by OP.  */
+
+template <typename T>
+bool
+wide_int::gt_p (T c, SignOp op) const
+{
+  if (op == SIGNED)
+    return gts_p (c);
+  else
+    return gtu_p (c);
+}  
+
+/* Return true if THIS > OP1 using signed comparisons.  */
+
+template <typename T>
+bool
+wide_int::gts_p (T c) const
+{
+  bool result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      /* The values are already logically sign extended.  */
+      result = val[0] > s[0];
+    }
+  else
+    /* Reverse the parms and use lts.  */
+    result = lts_p_large (s, cl, val, len, precision);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwa ("wide_int:: %d = (%s gts_p %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+/* Return true if THIS > OP1 using unsigned comparisons.  */
+
+template <typename T>
+bool
+wide_int::gtu_p (T c) const
+{
+  unsigned HOST_WIDE_INT x0;
+  unsigned HOST_WIDE_INT x1;
+  bool result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      if (precision < HOST_BITS_PER_WIDE_INT)
+	{
+	  x0 = zext_hwi (val[0], precision);
+	  x1 = zext_hwi (s[0], precision);
+	}
+      else
+	{
+	  x0 = val[0];
+	  x1 = s[0];
+	}
+
+      result = x0 > x1;
+    }
+  else 
+    /* Reverse the parms and use ltu.  */
+    result = ltu_p_large (s, cl, val, len);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwa ("wide_int:: %d = (%s gtu_p %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+
+/* Return -1, 0 or 1 depending on how THIS compares with OP1.
+   Signedness is indicated by OP.  */
+
+template <typename T>
+int
+wide_int::cmp (T c, SignOp op) const
+{
+  if (op == SIGNED)
+    return cmps (c);
+  else
+    return cmpu (c);
+}  
+
+
+/* Returns -1 if THIS < OP1, 0 if THIS == OP1 and 1 if THIS > OP1
+   using signed compares.  */
+
+template <typename T>
+int
+wide_int::cmps (T c) const
+{
+  int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      /* The values are already logically sign extended.  */
+      if (val[0] < s[0])
+	result = -1;
+      else if (val[0] > s[0])
+	result = 1;
+      else 
+	result = 0;
+    }
+  else
+    result = cmps_large (s, cl);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwa ("wide_int:: %d = (%s cmps %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+/* Returns -1 if THIS < OP1, 0 if THIS == OP1 and 1 if THIS > OP1
+   using unsigned compares.  */
+
+template <typename T>
+int
+wide_int::cmpu (T c) const
+{
+  unsigned HOST_WIDE_INT x0;
+  unsigned HOST_WIDE_INT x1;
+  int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      if (precision < HOST_BITS_PER_WIDE_INT)
+	{
+	  x0 = zext_hwi (val[0], precision);
+	  x1 = zext_hwi (s[0], precision);
+	}
+      else
+	{
+	  x0 = val[0];
+	  x1 = s[0];
+	}
+
+      if (x0 < x1)
+	result = -1;
+      else if (x0 == x1)
+	result = 0;
+      else
+	result = 1;
+    }
+  else
+    result = cmpu_large (s, cl);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwa ("wide_int:: %d = (%s cmpu %s)\n", result, *this, s, cl);
+#endif
+
+  return result;
+}
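+
+/* Example (illustrative): at precision 8, with A holding 0xff and B
+   holding 1, a.cmps (b) is -1 (signed: -1 < 1) while a.cmpu (b) is 1
+   (unsigned: 255 > 1).  */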
+
+
+/* Return the signed or unsigned min of THIS and OP1. */
+
+template <typename T>
+wide_int
+wide_int::min (T c, SignOp sgn) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (sgn == SIGNED)
+    return lts_p (c) ? (*this) : wide_int::from_array (s, cl, precision, false);
+  else
+    return ltu_p (c) ? (*this) : wide_int::from_array (s, cl, precision, false);
+}  
+
+/* Return the signed or unsigned min of THIS and OP1. */
+
+wide_int
+wide_int::min (const wide_int &op1, SignOp sgn) const
+{
+  if (sgn == SIGNED)
+    return lts_p (op1) ? (*this) : op1;
+  else
+    return ltu_p (op1) ? (*this) : op1;
+}  
+
+/* Return the signed or unsigned max of THIS and OP1. */
+
+template <typename T>
+wide_int
+wide_int::max (T c, SignOp sgn) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (sgn == SIGNED)
+    return gts_p (c) ? (*this) : wide_int::from_array (s, cl, precision, false);
+  else
+    return gtu_p (c) ? (*this) : wide_int::from_array (s, cl, precision, false);
+}  
+
+/* Return the signed or unsigned max of THIS and OP1. */
+
+wide_int
+wide_int::max (const wide_int &op1, SignOp sgn) const
+{
+  if (sgn == SIGNED)
+    return gts_p (op1) ? (*this) : op1;
+  else
+    return gtu_p (op1) ? (*this) : op1;
+}  
+
+/* Return the signed min of THIS and OP1. */
+
+template <typename T>
+wide_int
+wide_int::smin (T c) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return lts_p (c) ? (*this) : wide_int::from_array (s, cl, precision, false);
+}  
+
+/* Return the signed min of THIS and OP1. */
+
+wide_int
+wide_int::smin (const wide_int &op1) const
+{
+  return lts_p (op1) ? (*this) : op1;
+}  
+
+/* Return the signed max of THIS and OP1. */
+
+template <typename T>
+wide_int
+wide_int::smax (T c) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return gts_p (c) ? (*this) : wide_int::from_array (s, cl, precision, false);
+}  
+
+/* Return the signed max of THIS and OP1. */
+
+wide_int
+wide_int::smax (const wide_int &op1) const
+{
+  return gts_p (op1) ? (*this) : op1;
+}  
+
+/* Return the unsigned min of THIS and OP1. */
+
+template <typename T>
+wide_int
+wide_int::umin (T c) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return ltu_p (c) ? (*this) : wide_int::from_array (s, cl, precision, false);
+}  
+
+/* Return the unsigned min of THIS and OP1. */
+
+wide_int
+wide_int::umin (const wide_int &op1) const
+{
+  return ltu_p (op1) ? (*this) : op1;
+}  
+
+/* Return the unsigned max of THIS and OP1. */
+
+template <typename T>
+wide_int
+wide_int::umax (T c) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return gtu_p (c) ? (*this) : wide_int::from_array (s, cl, precision, false);
+}  
+
+/* Return the unsigned max of THIS and OP1. */
+
+wide_int
+wide_int::umax (const wide_int &op1) const
+{
+  return gtu_p (op1) ? (*this) : op1;
+}  
+
+/* Return true if THIS has the sign bit set to 1 and all other bits are
+   zero.  */
+
+bool
+wide_int::only_sign_bit_p () const
+{
+  return only_sign_bit_p (precision);
+}
+
+/* Return true if THIS fits in a HOST_WIDE_INT with no loss of
+   precision.  */
+
+bool
+wide_int::fits_shwi_p () const
+{
+  return len == 1;
+}
+
+/* Return true if THIS fits in an unsigned HOST_WIDE_INT with no loss
+   of precision.  */
+
+bool
+wide_int::fits_uhwi_p () const
+{
+  return len == 1 
+    || (len == 2 && val[1] == 0);
+}
+
+/* Return THIS extended to PREC.  The signedness of the extension is
+   specified by Z.  */
+
+wide_int 
+wide_int::ext (unsigned int prec, SignOp z) const
+{
+  if (z == UNSIGNED)
+    return zext (prec);
+  else
+    return sext (prec);
+}
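+
+/* Example (illustrative): at precision 32, for a value whose low
+   byte is 0x80, ext (8, SIGNED) yields -128 (sign extended from bit
+   7) and ext (8, UNSIGNED) yields 128; both keep the 32-bit
+   precision.  */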
+
+/* Return THIS forced to the precision of MODE.  If this is an
+   extension, the sign is set by SGN.  */
+
+wide_int 
+wide_int::force_to_size (enum machine_mode mode, SignOp sgn) const
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+
+  return force_to_size (prec, sgn);
+}
+
+/* Return THIS forced to the precision of TYPE.  The signedness of
+   any extension is taken from TYPE.  */
+
+wide_int 
+wide_int::force_to_size (const_tree type) const
+{
+  unsigned int prec = TYPE_PRECISION (type);
+  SignOp sgn = TYPE_UNSIGNED (type) ? UNSIGNED : SIGNED;
+
+  return force_to_size (prec, sgn);
+}
+
+/* Return THIS forced to the precision of TYPE.  If this is an
+   extension, the signedness is specified by SGN.  */
+
+wide_int 
+wide_int::force_to_size (const_tree type, SignOp sgn) const
+{
+  unsigned int prec = TYPE_PRECISION (type);
+
+  return force_to_size (prec, sgn);
+}
+
+/* Return THIS forced to the precision of MODE.  If this is an
+   extension, it is signed.  */
+
+wide_int 
+wide_int::sforce_to_size (enum machine_mode mode) const
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+
+  return force_to_size (prec, SIGNED);
+}
+
+/* Return THIS forced to the precision of TYPE.  If this is an
+   extension, it is signed.  */
+
+wide_int 
+wide_int::sforce_to_size (const_tree type) const
+{
+  unsigned int prec = TYPE_PRECISION (type);
+
+  return force_to_size (prec, SIGNED);
+}
+
+/* Return THIS forced to the precision of MODE.  If this is an
+   extension, it is unsigned.  */
+
+wide_int 
+wide_int::zforce_to_size (enum machine_mode mode) const
+{
+  unsigned int prec = GET_MODE_PRECISION (mode);
+
+  return force_to_size (prec, UNSIGNED);
+}
+
+/* Return THIS forced to the precision of TYPE.  If this is an
+   extension, it is unsigned.  */
+
+wide_int 
+wide_int::zforce_to_size (const_tree type) const
+{
+  unsigned int prec = TYPE_PRECISION (type);
+
+  return force_to_size (prec, UNSIGNED);
+}
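+
+/* Example (illustrative): widening the 8-bit value 0xf0,
+   force_to_size (16, SIGNED) yields 0xfff0 while force_to_size
+   (16, UNSIGNED) yields 0x00f0.  */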
+
+/*
+ * Logicals.
+ */
+
+/* Return THIS & OP1.  */
+
+template <typename T>
+wide_int
+wide_int::operator & (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+  
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = val[0] & s[0];
+    }
+  else
+    result = and_large (s, cl);
+  
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s & %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+
+/* Return THIS & ~OP1.  */
+
+template <typename T>
+wide_int
+wide_int::and_not (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = val[0] & ~s[0];
+    }
+  else
+    result = and_not_large (s, cl);
+  
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s &~ %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+
+/* Return the bitwise complement (one's complement) of THIS.  */
+
+wide_int
+wide_int::operator ~ () const
+{
+  wide_int result;
+  int l0 = len - 1;
+
+  result.len = len;
+  result.precision = precision;
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = ~val[l0];
+      l0--;
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_ww ("wide_int:: %s = (~ %s)\n", result, *this);
+#endif
+  return result;
+}
+
+/* Return THIS | OP1.  */
+
+template <typename T>
+wide_int
+wide_int::operator | (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = val[0] | s[0];
+    }
+  else
+    result = or_large (s, cl);
+  
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s | %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+
+/* Return THIS | ~OP1.  */
+
+template <typename T>
+wide_int
+wide_int::or_not (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+  
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = val[0] | ~s[0];
+    }
+  else
+    result = or_not_large (s, cl);
+  
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s |~ %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+
+/* Return THIS ^ OP1.  */
+
+template <typename T>
+wide_int
+wide_int::operator ^ (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+  
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = val[0] ^ s[0];
+    }
+  else
+    result = xor_large (s, cl);
+  
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s ^ %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
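+
+/* Example (illustrative): clearing the low nibble of a wide_int X of
+   precision 8:
+
+     wide_int m = wide_int::mask (4, false, 8);   // 0x0f
+     wide_int r = x.and_not (m);                  // x & ~0x0f
+*/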
+
+/*
+ * Integer arithmetic
+ */
+
+/* Return THIS + OP1.  */
+
+template <typename T>
+wide_int
+wide_int::operator + (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+  
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = val[0] + s[0];
+      if (precision < HOST_BITS_PER_WIDE_INT)
+	result.val[0] = sext_hwi (result.val[0], precision);
+    }
+  else
+    result = add_large (s, cl);
+  
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s + %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
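+
+/* Example (illustrative): the arithmetic wraps at the precision, so
+   at precision 8, wide_int::from_shwi (127, 8) + 1 yields -128.  */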
+
+/* Multiply THIS and OP1.  The result is the same precision as the
+   operands, so there is no reason for signed or unsigned
+   versions.  */
+
+template <typename T>
+wide_int
+wide_int::operator * (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+  bool overflow = false;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = val[0] * s[0];
+      if (precision < HOST_BITS_PER_WIDE_INT)
+	result.val[0] = sext_hwi (result.val[0], precision);
+    }
+  else
+    result = mul_internal (false, false, this, s, cl, UNSIGNED, &overflow, false);
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s * %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+/* Multiply THIS and OP1.  The signedness is specified with SGN.
+   OVERFLOW is set true if the result overflows.  */
+
+template <typename T>
+wide_int 
+wide_int::mul (T c, SignOp sgn, bool *overflow) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return mul_internal (false, false, this, s, cl, sgn, overflow, true);
+}
+
+/* Signed multiply THIS and OP1.  The result is the same precision as
+   the operands.  OVERFLOW is set true if the result overflows.  */
+
+template <typename T>
+wide_int
+wide_int::smul (T c, bool *overflow) const
+{
+  return mul (c, SIGNED, overflow);
+}
+
+/* Unsigned multiply THIS and OP1.  The result is the same precision
+   as the operands.  OVERFLOW is set true if the result overflows.  */
+
+template <typename T>
+wide_int
+wide_int::umul (T c, bool *overflow) const
+{
+  return mul (c, UNSIGNED, overflow);
+}
+
+/* Multiply THIS and OP1.  The result is twice the precision of the
+   operands.  The signedness is specified with SGN.  */
+
+template <typename T>
+wide_int
+wide_int::mul_full (T c, SignOp sgn) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return mul_internal (false, true, this, s, cl, sgn, 0, false);
+}
+
+/* Signed multiply THIS and OP1.  The result is twice the precision as
+   the operands.  */
+
+template <typename T>
+wide_int
+wide_int::smul_full (T c) const
+{
+  return mul_full (c, SIGNED);
+}
+
+/* Unsigned multiply THIS and OP1.  The result is twice the precision
+   as the operands.  */
+
+template <typename T>
+wide_int
+wide_int::umul_full (T c) const
+{
+  return mul_full (c, UNSIGNED);
+}
+
+/* Multiply THIS and OP1 and return the high part of the result.
+   The result is the same precision as the operands.  The signedness
+   is specified with SGN.  */
+
+template <typename T>
+wide_int
+wide_int::mul_high (T c, SignOp sgn) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+  return mul_internal (true, false, this, s, cl, sgn, 0, false);
+}
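+
+/* Example (illustrative): at precision 32, the UNSIGNED mul_high of
+   0x80000000 and 2 is 1, the bit carried out of the low half of the
+   full product 0x100000000.  */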
+
+/* Negate THIS.  */
+
+wide_int
+wide_int::neg () const
+{
+  wide_int z = wide_int::from_shwi (0, precision);
+  return z - *this;
+}
+
+/* Negate THIS.  Set overflow if the value cannot be negated.  */
+
+wide_int
+wide_int::neg (bool *overflow) const
+{
+  wide_int z = wide_int::from_shwi (0, precision);
+  if (only_sign_bit_p ())
+    *overflow = true;
+
+  return z - *this;
+}
+
+/* Return THIS - OP1.  */
+
+template <typename T>
+wide_int
+wide_int::operator - (T c) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+  
+  if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      result.precision = precision;
+      result.val[0] = val[0] - s[0];
+      if (precision < HOST_BITS_PER_WIDE_INT)
+	result.val[0] = sext_hwi (result.val[0], precision);
+    }
+  else
+    result = sub_large (s, cl);
+  
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s - %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The output is truncated.
+   Overflow is set to true if the result overflows, otherwise it is
+   not set.  */
+
+template <typename T>
+wide_int
+wide_int::div_trunc (T c, SignOp sgn, bool *overflow) const
+{
+  wide_int remainder;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+  
+  return divmod_internal (true, this, s, cl, sgn, 
+			  &remainder, false, overflow);
+}
+
+/* Signed divide with truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::sdiv_trunc (T c) const
+{
+  return div_trunc (c, SIGNED);
+}
+
+/* Unsigned divide with truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::udiv_trunc (T c) const
+{
+  return div_trunc (c, UNSIGNED);
+}
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The output is floor
+   truncated.  Overflow is set to true if the result overflows,
+   otherwise it is not set.  */
+
+template <typename T>
+wide_int
+wide_int::div_floor (T c, SignOp sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  quotient = divmod_internal (true, this, s, cl, sgn, 
+			      &remainder, true, overflow);
+  if (sgn == SIGNED && !remainder.zero_p ())
+    {
+      /* Round toward negative infinity when the operands have
+	 opposite signs.  Testing the sign of the truncated quotient
+	 is not reliable, since it is zero whenever the magnitude of
+	 the dividend is smaller than that of the divisor.  */
+      wide_int divisor = wide_int::from_array (s, cl, precision);
+      if (neg_p () != divisor.neg_p ())
+	return quotient - 1;
+    }
+  return quotient;
+}
+
+/* Unsigned divide with floor truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::udiv_floor (T c) const
+{
+  bool overflow;
+
+  return div_floor (c, UNSIGNED, &overflow);
+}
+
+/* Signed divide with floor truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::sdiv_floor (T c) const
+{
+  bool overflow;
+
+  return div_floor (c, SIGNED, &overflow);
+}
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The output is ceil
+   truncated.  Overflow is set to true if the result overflows,
+   otherwise it is not set.  */
+
+template <typename T>
+wide_int
+wide_int::div_ceil (T c, SignOp sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  quotient = divmod_internal (true, this, s, cl, sgn, 
+			      &remainder, true, overflow);
+
+  if (!remainder.zero_p ())
+    {
+      /* Round toward positive infinity: always for unsigned
+	 division, and for signed division when the operands have the
+	 same sign.  */
+      if (sgn == UNSIGNED)
+	return quotient + 1;
+      wide_int divisor = wide_int::from_array (s, cl, precision);
+      if (neg_p () == divisor.neg_p ())
+	return quotient + 1;
+    }
+  return quotient;
+}
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The quotient is rounded
+   to the nearest integer.  Overflow is set to true if the result
+   overflows, otherwise it is not set.  */
+
+template <typename T>
+wide_int
+wide_int::div_round (T c, SignOp sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  quotient = divmod_internal (true, this, s, cl, sgn, 
+			      &remainder, true, overflow);
+  if (!remainder.zero_p ())
+    {
+      wide_int divisor = wide_int::from_array (s, cl, precision);
+      if (sgn == SIGNED)
+	{
+	  wide_int p_remainder = remainder.neg_p () ? remainder.neg () : remainder;
+	  wide_int p_divisor = divisor.neg_p () ? divisor.neg () : divisor;
+	  p_divisor = p_divisor.rshiftu_large (1);
+
+	  /* Round away from zero when the remainder is more than
+	     half of the divisor.  The direction of the adjustment
+	     follows the sign of the infinitely precise quotient,
+	     i.e. whether the operand signs differ.  */
+	  if (p_remainder.gts_p (p_divisor))
+	    {
+	      if (neg_p () != divisor.neg_p ())
+		return quotient - 1;
+	      else
+		return quotient + 1;
+	    }
+	}
+      else
+	{
+	  wide_int p_divisor = divisor.rshiftu_large (1);
+	  if (remainder.gtu_p (p_divisor))
+	    return quotient + 1;
+	}
+    }
+  return quotient;
+}
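+
+/* Example (illustrative): at precision 32, signed 9 div_round 4 is 2
+   (2.25 rounds toward 2) while 11 div_round 4 is 3 (2.75 rounds
+   toward 3).  */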
+
+/* Divide DIVISOR into THIS producing both the quotient and remainder.
+   The result is the same size as the operands.  The sign is specified
+   in SGN.  The output is truncated.  */
+
+template <typename T>
+wide_int
+wide_int::divmod_trunc (T c, wide_int *remainder, SignOp sgn) const
+{
+  bool overflow = false;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return divmod_internal (true, this, s, cl, sgn, 
+			  remainder, true, &overflow);
+}
+
+/* Signed divide/mod with truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::sdivmod_trunc (T c, wide_int *mod) const
+{
+  return divmod_trunc (c, mod, SIGNED);
+}
+
+/* Unsigned divide/mod with truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::udivmod_trunc (T c, wide_int *mod) const
+{
+  return divmod_trunc (c, mod, UNSIGNED);
+}
+
+/* Divide DIVISOR into THIS.  The remainder is also produced in
+   REMAINDER.  The result is the same size as the operands.  The sign
+   is specified in SGN.  The output is floor truncated.  */
+
+template <typename T>
+wide_int
+wide_int::divmod_floor (T c, wide_int *remainder, SignOp sgn) const
+{
+  wide_int quotient;
+  bool overflow = false;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  quotient = divmod_internal (true, this, s, cl, sgn, 
+			      remainder, true, &overflow);
+  if (sgn == SIGNED && !(*remainder).zero_p ())
+    {
+      /* Round toward negative infinity when the operands have
+	 opposite signs; the floor remainder takes the sign of the
+	 divisor, so it is the truncating remainder plus the
+	 divisor.  */
+      wide_int divisor = wide_int::from_array (s, cl, precision);
+      if (neg_p () != divisor.neg_p ())
+	{
+	  *remainder = *remainder + divisor;
+	  return quotient - 1;
+	}
+    }
+  return quotient;
+}
+
+/* Signed divide/mod with floor truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::sdivmod_floor (T c, wide_int *mod) const
+{
+  return divmod_floor (c, mod, SIGNED);
+}
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is
+   the same size as the operands.  The sign is specified in SGN.  The
+   output is truncated.  Overflow is set to true if the result
+   overflows, otherwise it is not set.  */
+
+template <typename T>
+wide_int
+wide_int::mod_trunc (T c, SignOp sgn, bool *overflow) const
+{
+  wide_int remainder;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  divmod_internal (true, this, s, cl, sgn,
+		   &remainder, true, overflow);
+  return remainder;
+}
+
+/* Signed mod with truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::smod_trunc (T c) const
+{
+  return mod_trunc (c, SIGNED);
+}
+
+/* Unsigned mod with truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::umod_trunc (T c) const
+{
+  return mod_trunc (c, UNSIGNED);
+}
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is
+   the same size as the operands.  The sign is specified in SGN.  The
+   output is floor truncated.  Overflow is set to true if the result
+   overflows, otherwise it is not set.  */
+
+template <typename T>
+wide_int
+wide_int::mod_floor (T c, SignOp sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  quotient = divmod_internal (true, this, s, cl, sgn, 
+			      &remainder, true, overflow);
+
+  if (sgn == SIGNED && !remainder.zero_p ())
+    {
+      /* The floor remainder takes the sign of the divisor; adjust
+	 when the operands have opposite signs.  */
+      wide_int divisor = wide_int::from_array (s, cl, precision);
+      if (neg_p () != divisor.neg_p ())
+	return remainder + divisor;
+    }
+  return remainder;
+}
+
+/* Unsigned mod with floor truncation of result.  */
+
+template <typename T>
+wide_int
+wide_int::umod_floor (T c) const
+{
+  bool overflow;
+
+  return mod_floor (c, UNSIGNED, &overflow);
+}
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is the
+   same size as the operands.  The sign is specified in SGN.  The
+   output is ceil truncated.  Overflow is set to true if the result
+   overflows, otherwise it is not set.  */
+
+template <typename T>
+wide_int
+wide_int::mod_ceil (T c, SignOp sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  quotient = divmod_internal (true, this, s, cl, sgn, 
+			      &remainder, true, overflow);
+
+  if (!remainder.zero_p ())
+    {
+      /* The ceiling quotient is one more than the truncating
+	 quotient whenever the division rounds up, so the remainder
+	 is adjusted by the divisor in that case: always for unsigned
+	 division, and for signed division when the operands have the
+	 same sign.  */
+      wide_int divisor = wide_int::from_array (s, cl, precision);
+      if (sgn == UNSIGNED || neg_p () == divisor.neg_p ())
+	return remainder - divisor;
+    }
+  return remainder;
+}
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is
+   the same size as the operands.  The sign is specified in SGN.  The
+   remainder corresponds to the round-to-nearest quotient.  Overflow
+   is set to true if the result overflows, otherwise it is not set.  */
+
+template <typename T>
+wide_int
+wide_int::mod_round (T c, SignOp sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  quotient = divmod_internal (true, this, s, cl, sgn, 
+			      &remainder, true, overflow);
+
+  if (!remainder.zero_p ())
+    {
+      wide_int divisor = wide_int::from_array (s, cl, precision);
+      if (sgn == SIGNED)
+	{
+	  wide_int p_remainder = remainder.neg_p () ? remainder.neg () : remainder;
+	  wide_int p_divisor = divisor.neg_p () ? divisor.neg () : divisor;
+	  p_divisor = p_divisor.rshiftu_large (1);
+
+	  /* Adjust when the remainder is more than half of the
+	     divisor, mirroring div_round; the direction follows the
+	     sign of the infinitely precise quotient.  */
+	  if (p_remainder.gts_p (p_divisor))
+	    {
+	      if (neg_p () != divisor.neg_p ())
+		return remainder + divisor;
+	      else
+		return remainder - divisor;
+	    }
+	}
+      else
+	{
+	  wide_int p_divisor = divisor.rshiftu_large (1);
+	  if (remainder.gtu_p (p_divisor))
+	    return remainder - divisor;
+	}
+    }
+  return remainder;
+}
+
+
+/* If SHIFT_COUNT_TRUNCATED is defined, truncate CNT.   
+
+   At first look, the shift truncation code does not look right.
+   Shifts (and rotates) are done according to the precision of the
+   mode but the shift count is truncated according to the bitsize of
+   the mode.  This is how real hardware works (Knuth's MIX machine is
+   the only known exception to this rule, but it was never real).
+
+   On an ideal machine, like Knuth's MIX machine, a shift count is a
+   word long and all of the bits of that word are examined to compute
+   the shift amount.  But on real hardware, especially on machines
+   with fast (single-cycle) shifts, that takes too long.  On these
+   machines, the amount of time to perform a shift dictates the cycle
+   time of the machine so corners are cut to keep this fast.  A
+   comparison of an entire 64 bit word would take something like 6
+   gate delays before the shifting can even start.
+
+   So real hardware only looks at a small part of the shift amount.
+   On IBM machines, this tends to be 1 more than what is necessary to
+   encode the shift amount.  The rest of the world looks at only the
+   minimum number of bits.  This means that only 3 gate delays are
+   necessary to set up the shifter.
+
+   On the other hand, right shifts and rotates must be according to
+   the precision or the operation does not make any sense. 
+
+   This function is called in two contexts.  If OP == TRUNC, this
+   function provides a count that matches the semantics of the target
+   machine depending on the value of SHIFT_COUNT_TRUNCATED.  Note that
+   if SHIFT_COUNT_TRUNCATED is not defined, this function may produce
+   -1 as a value if the shift amount is greater than the bitsize of
+   the mode.  -1 is a surrogate for a very large amount.
+
+   If OP == NONE, then this function always truncates the shift value
+   to the bitsize because this shifting operation is a function that
+   is internal to GCC.  */
+
+inline int
+wide_int::trunc_shift (const HOST_WIDE_INT *cnt, unsigned int len ATTRIBUTE_UNUSED,
+		       unsigned int bitsize, ShiftOp trunc_op)
+{
+  if (trunc_op == TRUNC)
+    {
+      gcc_checking_assert (bitsize != 0);
+#ifdef SHIFT_COUNT_TRUNCATED
+      return cnt[0] & (bitsize - 1);
+#else
+      if (cnt[0] < bitsize && cnt[0] >= 0 && len == 1)
+	return cnt[0];
+      else 
+	return -1;
+#endif
+    }
+  else if (bitsize == 0)
+    return cnt[0];
+  else
+    return cnt[0] & (bitsize - 1);
+}
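+
+/* Example (illustrative): with OP == TRUNC and a bitsize of 32, a
+   count of 33 truncates to 1 when SHIFT_COUNT_TRUNCATED is defined
+   and becomes -1 otherwise; the shift routines below turn a -1 count
+   into a zero result.  */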
+
+/* Left shift THIS by C.  See the comment above trunc_shift for how
+   to set TRUNC_OP.  Since this is used internally, it has the
+   ability to specify the BITSIZE and RES_PREC independently.  This
+   is useful when inserting a small value into a larger one.  */
+
+template <typename T>
+wide_int
+wide_int::lshift (T c, unsigned int bitsize, 
+		  unsigned int res_prec, ShiftOp trunc_op) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+  HOST_WIDE_INT shift;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  if (res_prec == 0)
+    res_prec = precision;
+
+  shift = trunc_shift (s, cl, bitsize, trunc_op);
+  if (shift == -1)
+    result = wide_int::zero (precision);
+  else if (shift == 0 && res_prec == precision)
+    result = *this;
+  /* Handle the simple case quickly.   */
+  else if (res_prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.precision = res_prec;
+      result.len = 1;
+      result.val[0] = val[0] << shift;
+      /* Keep the representation canonical: the value is stored sign
+	 extended within its HWI.  */
+      if (res_prec < HOST_BITS_PER_WIDE_INT)
+	result.val[0] = sext_hwi (result.val[0], res_prec);
+    }
+  else
+    result = lshift_large (shift, res_prec);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s << %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
+
+/* Rotate THIS left by C within its precision.  */
+
+template <typename T>
+wide_int
+wide_int::lrotate (T c) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return lrotate ((unsigned HOST_WIDE_INT) s[0]);
+}
+
+
+/* Rotate THIS left by CNT within its precision.  */
+
+wide_int
+wide_int::lrotate (unsigned HOST_WIDE_INT cnt) const
+{
+  wide_int left, right, result;
+
+  left = lshift (cnt);
+  right = rshiftu (precision - cnt);
+  result = left | right;
+
+  return result;
+}
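+
+/* The rotate above is the textbook identity
+   rol (x, n) = (x << n) | (x >> (prec - n)), with both shifts done
+   at the wide-int precision.  A standalone 16-bit sketch (the
+   modulo guards are only to make this model total):
+
+     uint16_t
+     rol16 (uint16_t x, unsigned int n)
+     {
+       n %= 16;
+       return (x << n) | (x >> ((16 - n) % 16));
+     }
+
+     rol16 (0x8001, 1) => 0x0003  */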
+
+
+/* Right shift THIS by C.  SGN indicates whether the shift is signed
+   or unsigned.  TRUNC_OP indicates the truncation option.  */
+
+template <typename T>
+wide_int
+wide_int::rshift (T c, SignOp sgn, 
+		  unsigned int bitsize, ShiftOp trunc_op) const
+{
+  if (sgn == UNSIGNED)
+    return rshiftu (c, bitsize, trunc_op);
+  else
+    return rshifts (c, bitsize, trunc_op);
+}
+
+
+/* Logical (unsigned) right shift THIS by C.  See the definition of
+   ShiftOp for how to set TRUNC_OP.  */
+
+template <typename T>
+wide_int
+wide_int::rshiftu (T c, unsigned int bitsize,
+		   ShiftOp trunc_op) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+  HOST_WIDE_INT shift;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  shift = trunc_shift (s, cl, bitsize, trunc_op);
+
+  if (shift == 0)
+    result = copy ();
+  else if (shift == -1)
+    result = wide_int::zero (precision);
+  else if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      /* Handle the simple case quickly.   */
+      unsigned HOST_WIDE_INT x = val[0];
+
+      result.precision = precision;
+      result.len = 1;
+
+      if (precision < HOST_BITS_PER_WIDE_INT)
+	x = zext_hwi (x, precision);
+
+      result.val[0] = x >> shift;
+    }
+  else
+    result = rshiftu_large (shift);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s >>U %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
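+
+/* Why the zext_hwi step above matters: a value narrower than a host
+   word is kept sign-extended in val[0] (as the signed variant below
+   relies on), so a logical shift must first clear the bits above
+   the precision.  For an 8-bit value 0xff, stored as the host word
+   -1 (on a 64-bit host):
+
+     zext_hwi (-1, 8)  => 0xff
+     0xff >> 4         => 0x0f                -- correct logical result
+     ((unsigned HOST_WIDE_INT) -1) >> 4
+                       => 0x0fffffffffffffff  -- without the zext
+  */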
+
+/* Arithmetic (signed) right shift THIS by C.  See the definition of
+   ShiftOp for how to set TRUNC_OP.  */
+
+template <typename T>
+wide_int
+wide_int::rshifts (T c, unsigned int bitsize,
+		   ShiftOp trunc_op) const
+{
+  wide_int result;
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+  HOST_WIDE_INT shift;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  shift = trunc_shift (s, cl, bitsize, trunc_op);
+
+  if (shift == 0)
+    result = copy ();
+  else if (shift == -1)
+    result = wide_int::zero (precision);
+  else if (precision <= HOST_BITS_PER_WIDE_INT)
+    {
+      /* Handle the simple case quickly.   */
+      HOST_WIDE_INT x = val[0];
+
+      result.precision = precision;
+      result.len = 1;
+      result.val[0] = x >> shift;
+    }
+  else
+    result = rshifts_large (shift);
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwa ("wide_int:: %s = (%s >>S %s)\n", result, *this, s, cl);
+#endif
+  return result;
+}
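+
+/* The signed variant needs no zext: the value is already
+   sign-extended in val[0], so an arithmetic host shift fills with
+   copies of the sign bit.  For the 8-bit value -16:
+
+     -16 >> 2  => -4    -- sign bits shift in from the left
+
+   (Right-shifting a negative value is implementation-defined in C,
+   but is arithmetic on the hosts GCC supports.)  */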
+
+/* Rotate THIS right by C within its precision.  */
+
+template <typename T>
+wide_int
+wide_int::rrotate (T c) const
+{
+  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s;
+  int cl;
+
+  s = to_shwi2 (ws, &cl, c);
+
+  return rrotate ((unsigned HOST_WIDE_INT) s[0]);
+}
+
+
+/* Rotate THIS right by CNT within its precision.  */
+
+wide_int
+wide_int::rrotate (unsigned HOST_WIDE_INT cnt) const
+{
+  wide_int left, right, result;
+
+  left = lshift (precision - cnt);
+  right = rshiftu (cnt);
+  result = left | right;
+
+  return result;
+}
+
+/* Tree related routines.  */
+
+extern tree wide_int_to_tree (tree type, const wide_int &cst);
+extern tree wide_int_to_infinite_tree (tree type, const wide_int &cst, 
+				       unsigned int prec);
+extern tree force_fit_type_wide (tree, const wide_int &, int, bool);
+
+/* Real related routines.  */
+extern wide_int real_to_integer (const REAL_VALUE_TYPE *, bool *, int);
+extern wide_int decimal_real_to_integer (const REAL_VALUE_TYPE *, bool *, int);
+
+
+/* Conversion to and from GMP integer representations.  */
+
+void mpz_set_wide_int (mpz_t, wide_int, bool);
+wide_int mpz_get_wide_int (const_tree, mpz_t, bool);
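+
+/* A usage sketch for the GMP conversions above (assumes this patch
+   is applied; the bool argument is assumed here to select an
+   unsigned interpretation, which only the definitions in wide-int.c
+   can confirm):
+
+     mpz_t m;
+     mpz_init (m);
+     mpz_set_wide_int (m, w, true);                     -- wide_int -> mpz
+     wide_int back = mpz_get_wide_int (type, m, true);  -- and back
+     mpz_clear (m);
+  */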
+#endif /* GENERATOR_FILE */
+
+#endif /* WIDE_INT_H */