
patch to fix

Message ID 506C72C7.7090207@naturalbridge.com
State New

Commit Message

Kenneth Zadeck Oct. 3, 2012, 5:15 p.m. UTC
The enclosed patch is the third of at least four patches that fix the
problems associated with supporting integers on the target that are
wider than two HOST_WIDE_INTs.

While GCC claims to support OImode, and we have two public ports that
make minor use of this mode, in practice compilation that uses OImode
commonly gives the wrong result or ICEs.  We have a private port of
GCC for an architecture that is further down the road to needing
comprehensive OImode support, and we have discovered that it is
unusable.  We have decided to fix it in a general way, so that the
fix is most beneficial to the GCC community.  It is our belief that
we are just a little ahead of x86 and NEON, and these patches will
shortly be essential.

The first two of these patches were primarily lexical and have
already been committed.  They transformed the uses of CONST_DOUBLE
so that it is easy to tell what the intended usage is.

The underlying structures in the next two patches are very general:
once they are added to the compiler, the compiler will be able to
support targets with any size of integer from hosts of any size
integer.

The patch enclosed deals with the portable RTL parts of the compiler.
The next patch, which is currently under construction, deals with the
tree level.  However, this patch can be put on the trunk as is, and it
will alleviate many, but not all, of the current limitations in the
rtl parts of the compiler.

Some of the patch is conditional, depending on a port defining the
symbol 'TARGET_SUPPORTS_WIDE_INT' to be nonzero.  Defining this
symbol to be nonzero declares that the port has been converted to
use the new form of integer constants.  However, the patch is
completely backwards compatible, to allow ports that do not need this
immediately to convert at their leisure.  The conversion process is
not difficult, but it does require some knowledge of the port, so we
are not volunteering to do this for all ports.

OVERVIEW OF THE PATCH:

The patch defines a new datatype, a 'wide_int' (defined in
wide-int.[ch]), and this datatype will be used to perform all of the
integer constant math in the compiler.  Externally, wide-int is very
similar to double-int, except that it does not have the limitation
that math must be done on exactly two HOST_WIDE_INTs.

Internally, a wide_int is a structure that contains a fixed sized
array of HOST_WIDE_INTs, a length field and a mode.  The size of the
array is determined at generation time by dividing the number of bits
of the largest integer supported on the target by the number of bits
in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
integer can be supported on any host.
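
For concreteness, here is a minimal sketch of that layout.  The field
names are illustrative rather than the exact ones in wide-int.h, and
MAX_BITSIZE_MODE_ANY_INT stands for the constant that genmodes now
emits for the widest integer mode (see emit_max_int in the ChangeLog
below):

/* Illustrative sketch only; see wide-int.h for the real definition.  */
#define WIDE_INT_MAX_ELTS \
  ((MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) \
   / HOST_BITS_PER_WIDE_INT)

class wide_int
{
  /* Fixed-size array of elements, lowest order element first.  */
  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
  /* Number of elements of VAL actually in use.  */
  unsigned short len;
  /* The mode, which supplies the bitsize and precision.  */
  enum machine_mode mode;
};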

A new rtx type is created, the CONST_WIDE_INT, which contains a
garbage collected array of HOST_WIDE_INTS that is large enough to hold
the constant.  For the targets that define TARGET_SUPPORTS_WIDE_INT to
be non zero, CONST_DOUBLES are only used to hold floating point
values.  If the target leaves TARGET_SUPPORTS_WIDE_INT defined as 0,
CONST_WIDE_INTs are not used and CONST_DOUBLEs are as they were
before.

CONST_INT does not change, except that it is defined to hold all
constants that fit in exactly one HOST_WIDE_INT.  Note that this is
slightly different from the current trunk.  Before this patch, the TImode
constant '5' could either be in a CONST_INT or CONST_DOUBLE depending
on which code path was used to create it.  This patch changes this so
that if the constant fits in a CONST_INT then it is represented in a
CONST_INT no matter how it is created.
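
Sketched in code (the names here are stand-ins; the real entry point
for building the rtx is immed_wide_int_const in emit-rtl.c, and
gen_int_mode is the existing function):

/* Sketch: choose the rtx representation for wide-int value W.
   get_len and lookup_const_wide_int_sketch are illustrative names;
   elt () is the accessor visible elsewhere in this patch.  */
rtx
wide_int_to_rtx_sketch (const wide_int &w, enum machine_mode mode)
{
  /* One significant HOST_WIDE_INT: always a CONST_INT, no matter
     which code path created the value.  */
  if (w.get_len () == 1)
    return gen_int_mode (w.elt (0), mode);
  /* Otherwise build (or reuse) a CONST_WIDE_INT.  */
  return lookup_const_wide_int_sketch (w, mode);
}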

For the array inside a CONST_WIDE_INT, and internally in wide-int, we
use a compressed form for integers that need more than one
HOST_WIDE_INT.  Higher elements of the array are not needed if they
are just a sign extension of the elements below them.  This does not
imply that constants are signed or are sign extended; it is only a
compression technique.
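
A worked example, assuming a host with 64-bit HOST_WIDE_INTs and
elements listed low-order first:

  TImode 5         -> { 5 }       len 1  (the dropped upper element, 0,
                                          is the sign extension of 5)
  TImode -1        -> { -1 }      len 1  (the dropped upper element, -1,
                                          is the sign extension of -1)
  TImode 2^64 - 1  -> { -1, 0 }   len 2  (the upper 0 is not the sign
                                          extension of -1, so it stays)

The rule itself is a short loop; this is a sketch, not the exact code
in wide-int.c:

/* Drop high elements that are just the sign extension of the
   element below them.  */
static unsigned int
canonical_len (const HOST_WIDE_INT *val, unsigned int len)
{
  while (len > 1
	 && val[len - 1] == (val[len - 2] < 0 ? (HOST_WIDE_INT) -1 : 0))
    len--;
  return len;
}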

While it might seem more aesthetically pleasing to have not
introduced the CONST_WIDE_INT and to have instead changed the
representation of the CONST_INT to accommodate larger numbers, this
would have both
used more space and would be a time consuming change for the port
maintainers.  We believe that most ports can be quickly converted with
the current scheme because there is just not a lot of code in the back
ends that cares about large constants.  Furthermore, the CONST_INT is
very space efficient and even in a program that was heavy in large
values, most constants would still fit in a CONST_INT.

All of the parts of the rtl level that deal with CONST_DOUBLE now
conditionally work with CONST_WIDE_INTs, depending on the value of
TARGET_SUPPORTS_WIDE_INT.  We believe that this patch removes all of
the ICEs and wrong-code cases at the portable rtl level.  However,
there are still places in the portable rtl code that refuse to
transform the code unless it is a CONST_INT.  Since these do not cause
failures, they can be handled later.  The patch is already very large.

It should be noted that much of the constant overflow checking in the
constant math disappears with these patches.  The overflow checking
code in the current compiler is really divided into two cases:
overflow on the host and overflow on the target.  The overflow
checking on the host was to make sure that the math did not overflow
when done on two HOST_WIDE_INTs.  All of this code goes away.  These
patches allow the constant math to be done exactly the way it is done
on the target.
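
A sketch of the difference, with invented method names (this is not
the patch's exact API): adding 1 to the largest signed TImode value
overflows in TImode itself, on any host.

  bool overflow;
  wide_int max = wide_int::max_value (TImode);           /* 2^127 - 1 */
  wide_int sum = max.add (wide_int::one (TImode), &overflow);
  /* OVERFLOW is true, and SUM wraps to -2^127 exactly as a 128-bit
     target register would, whether the host has 32-bit or 64-bit
     HOST_WIDE_INTs.  */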

This patch also aids other cleanups that are being considered at the
rtl level:

   1) These patches remove most of the host dependencies on the
   optimizations.  Currently a 32 bit GCC host will produce different
   code for a specific target than a 64 bit host will.  This is because
   many of the transformations only work on constants that can be
   represented with a single HWI or two HWIs.  If the target has larger
   integers than the host, the compilation suffers.

   2) Bernd's need to make GCC correctly support partial ints is made
   easier by the wide-int library.  This library carefully does all
   arithmetic in the precision of the mode included in it.  While there
   are still places at the rtl level that do arithmetic inline,
   we plan to convert those to use the library over time.  This patch
   converts a substantial number of those places.

   3) This patch is one step along the path to add modes to rtl integer
   constants.  There is no longer any checking to see if a CONST_DOUBLE
   has VOIDmode as its mode.  Furthermore, all constructors for various
   wide ints do take a mode and require that it not be VOIDmode. There
   is still a lot of work to do to make this conversion possible.

Richard Sandiford has been over the rtl portions of this patch a few
times.  He has not looked at the wide-int files in any detail.  This
patch has been heavily tested on my private ports and also on x86-64.


CONVERSION PROCESS

Converting a port mostly requires looking for the places where
CONST_DOUBLES are used with VOIDmode and replacing that code with code
that accesses CONST_WIDE_INTs.  "grep -i const_double" at the port
level gets you to 95% of the changes that need to be made.  There are
a few places that require a deeper look.

   1) There is no equivalent to hval and lval for CONST_WIDE_INTs.
   This would be difficult to express in the md language since there
   are a variable number of elements.

   Most ports only check that hval is either 0 or -1 to see if the int
   is small.  As mentioned above, this will no longer be necessary
   since small constants are always CONST_INT.  Of course there are
   still a few exceptions; the alpha's constraint used by the zap
   instruction certainly requires careful examination by C code.
   However, all the current code does is pass the hval and lval to C
   code, so evolving the C code to look at the CONST_WIDE_INT is not
   really a large change (a sketch appears at the end of this section).

   2) Because there is no standard template that ports use to
   materialize constants, there is likely to be some futzing that is
   unique to each port in this code.

   3) The rtx costs may have to be adjusted to properly account for
   larger constants that are represented as CONST_WIDE_INT.

All in all, it has not taken us long to convert ports that we are
familiar with.
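
As an illustration of the kind of C-code change described in item 1,
a check that used to inspect hval and lval can walk the elements with
the accessors this patch adds to rtl.h (the predicate itself is
invented for the example):

/* Return true if every element of the CONST_WIDE_INT X is 0 or -1.
   CONST_WIDE_INT_NUNITS and CONST_WIDE_INT_ELT are the macros this
   patch adds; the predicate is hypothetical.  */
static bool
all_elts_zero_or_minus_one (rtx x)
{
  for (int i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
    {
      HOST_WIDE_INT elt = CONST_WIDE_INT_ELT (x, i);
      if (elt != 0 && elt != -1)
	return false;
    }
  return true;
}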

OTHER COMMENTS

I did find what I believe is one interesting bug in the double-int
code.  I believe that the code that performs divide and mod with round
to nearest is seriously wrong for unsigned integers.  I believe that
it will get the wrong answer for any numbers that are large enough to
look negative if they are considered as signed integers.  Aside from
that, wide-int should perform in a very similar manner to double-int.

I am sorry for the size of this patch.  However, there does not
appear to be a way to change the underlying data structure to support
wider integers without doing something like this.

kenny
2012-10-03  Kenneth Zadeck  <zadeck@naturalbridge.com>

	* reload.c (find_reloads): Use CONST_SCALAR_INT_P.
	* rtl.def (CONST_WIDE_INT): New.
	* ira-costs.c (record_reg_classes, record_address_regs):
	Use CONST_SCALAR_INT_P.
	* dojump.c (prefer_and_bit_test): Use wide int api.
	* recog.c (simplify_while_replacing): Use CONST_SCALAR_INT_P.
 	(const_scalar_int_operand, const_double_operand): New versions
	if target supports wide integers.
	(const_wide_int_operand): New function.
	(asm_operand_ok, constrain_operands): Use CONST_SCALAR_INT_P.
	* rtl.c (DEF_RTL_EXPR): Added CONST_WIDE_INT case.
	(rtx_size): Ditto.
	(rtx_alloc_stat, hwivec_output_hex, hwivec_check_failed_bounds):
	New functions.
	(iterative_hash_rtx): Added CONST_WIDE_INT case.
	* rtl.h (hwivec_def): New function.
	(HWI_GET_NUM_ELEM, HWI_PUT_NUM_ELEM, CONST_WIDE_INT_P,
	CONST_SCALAR_INT_P, XHWIVEC_ELT, HWIVEC_CHECK, CONST_WIDE_INT_VEC,
	CONST_WIDE_INT_NUNITS, CONST_WIDE_INT_ELT, rtx_alloc_v): New macros.
	(chain_next): Added hwiv case.
	(CASE_CONST_SCALAR_INT, CONST_INT, CONST_WIDE_INT):  Added new
	defs if target supports wide ints.
	* rtlanal.c (commutative_operand_precedence): Added CONST_WIDE_INT
	case.
	* Makefile.in (wide-int.c, wide-int.h): New files.
	* sched-vis.c (print_value): Added CONST_WIDE_INT case and
	modified DOUBLE_INT case.
	* gengtype.c (wide-int): New type.
	* alias.c  (rtx_equal_for_memref_p): Fixed comment.
	* sel-sched-ir.c (lhs_and_rhs_separable_p): Ditto.
	* genemit.c (gen_exp): Added CONST_WIDE_INT case.
	* defaults.h (TARGET_SUPPORTS_WIDE_INT): New.
	* builtins.c (c_getstr, c_readstr, expand_builtin_signbit):
	Made to work with any size int.
	* simplify-rtx.c (mode_signbit_p, simplify_unary_operation_1,
	simplify_const_unary_operation, simplify_binary_operation_1,
	simplify_const_binary_operation,
	simplify_relational_operation_1,
	simplify_const_relational_operation, simplify_immed_subreg,
	simplify_subreg): Ditto.
	* gengenrtl.c (excluded_rtx): Added CONST_WIDE_INT case.
	* expmed.c (mask_rtx, lshift_value): Now uses wide-int.
	(expand_mult, expand_smod_pow2): Made to work with any size int.
	(make_tree): Added CONST_WIDE_INT case.
	* cselib.c (entry_and_rtx_equal_p): Use CONST_SCALAR_INT_P.
	(rtx_equal_for_cselib_1, cselib_hash_rtx): Added CONST_WIDE_INT case.
	* explow.c (plus_constant): Now uses wide-int api.
	* varasm.c (const_rtx_hash_1): Added CONST_WIDE_INT case.
	* hwint.c (popcount_hwi): New function.
	* hwint.h (HOST_BITS_PER_HALF_WIDE_INT, HOST_HALF_WIDE_INT,
	HOST_HALF_WIDE_INT_PRINT, HOST_HALF_WIDE_INT_PRINT_C,
	HOST_HALF_WIDE_INT_PRINT_DEC, HOST_HALF_WIDE_INT_PRINT_DEC_C,
	HOST_HALF_WIDE_INT_PRINT_UNSIGNED, HOST_HALF_WIDE_INT_PRINT_HEX,
	HOST_HALF_WIDE_INT_PRINT_HEX_PURE): New symbols.
	* postreload.c (reload_cse_simplify_set):  Now uses wide-int api.
	* var-tracking.c (loc_cmp): Added CONST_WIDE_INT case.
	* tree.c (wide_int_to_tree): New function.
	* tree.h (wide_int_to_tree): Ditto.
	* gensupport.c (const_wide_int_operand,
	const_scalar_int_operand): New.
	* read-rtl.c (validate_const_wide_int): New function.
	(read_rtx_code): Added CONST_WIDE_INT case.
	* cse.c (hash_rtx_cb): Added CONST_WIDE_INT case and
	modified DOUBLE_INT case.
	* dwarf2out.c (dw_val_equal_p, size_of_loc_descr,
	output_loc_operands, print_die, attr_checksum, same_dw_val_p,
	size_of_die, value_format, output_die, mem_loc_descriptor,
	loc_descriptor, extract_int, add_const_value_attribute,
	hash_loc_operands, compare_loc_operands): Add support for wide-ints.
	(add_AT_wide): New function.
	* dwarf2out.h (enum dw_val_class): Added dw_val_class_wide_int.
	* wide-int.c (all): New file.
	* wide-int.h (all): New file.
	* genmodes.c (emit_max_int): New function.
	(emit_insn_modes_h): Add call to emit_max_int.
	* ira-lives.c (single_reg_class):  Use CONST_SCALAR_INT_P.
	* emit-rtl.c (const_wide_int_htab): Add marking.
	(const_wide_int_htab_hash, const_wide_int_htab_eq,
	lookup_const_wide_int, immed_wide_int_const): New functions.
	(const_double_htab_hash, const_double_htab_eq,
	rtx_to_double_int, immed_double_const): Conditionally 
	changed CONST_DOUBLE behavior.
 	(immed_double_const, init_emit_once): Changed to support wide-int.
	* combine.c (try_combine, subst, make_extraction, 
	gen_lowpart_for_combine): Changed to support any size integer.
	* print-rtl.c (print_rtx): Added CONST_WIDE_INT case.
	* genpreds.c (write_one_predicate_function): Fixed comment.
	(add_constraint): Added CONST_WIDE_INT test.
	(write_tm_constrs_h): Do not emit hval or lval if target
	supports wide integers.
	* tree-ssa-address.c (addr_for_mem_ref): Changes to use
	wide-int rather than double-int.
	* ggc-zone.c (ggc_alloc_typed_stat): Added
	gt_ggc_e_10hwivec_def case.
	* final.c (output_addr_const): Added CONST_WIDE_INT case.
	* coretypes.h (hwivec_def, hwivec, const_hwivec): New.
	* expr.c (convert_modes): Added support for any size int.
	(emit_group_load_1): Added todo for place that still does not
	allow large ints.
	(store_expr, expand_constructor): Fixed comments.
	(expand_expr_real_2, expand_expr_real_1,
	reduce_to_bit_field_precision, const_vector_from_tree):
	Converted to use wide-int api.
	* optabs.c (expand_subword_shift, expand_doubleword_shift,
	expand_absneg_bit, expand_absneg_bit, expand_copysign_absneg,
	expand_copysign_bit): Made to work with any size int.  
	* cfgexpand.c (expand_debug_locations):  Use CONST_SCALAR_INT_P.
	* ggc.h (ggc_alloc_hwivec_sized): New.

Comments

Marc Glisse Oct. 3, 2012, 8:47 p.m. UTC | #1
On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

> The patch defines a new datatype, a 'wide_int' (defined in
> wide-int.[ch], and this datatype will be used to perform all of the
> integer constant math in the compiler.  Externally, wide-int is very
> similar to double-int except that it does not have the limitation that
> math must be done on exactly two HOST_WIDE_INTs.
>
> Internally, a wide_int is a structure that contains a fixed sized
> array of HOST_WIDE_INTs, a length field and a mode.  The size of the
> array is determined at generation time by dividing the number of bits
> of the largest integer supported on the target by the number of bits
> in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
> integer can be supported on any host.

Hello,

did you consider making the size of wide_int a template parameter, now 
that we are using C++? All with a convenient typedef or macro so it 
doesn't show. I am asking because in vrp I do some arithmetic that 
requires 2*N+1 bits where N is the size of double_int.
Kenneth Zadeck Oct. 3, 2012, 10:04 p.m. UTC | #2
I have already converted the vrp code, so I have some guess at where
you are talking about (of course, correct me if I am wrong).

The code that computes the range when two variables are multiplied
together needs to do a multiplication that produces a result that is
twice as wide as the inputs.

My library is able to do that with one catch (and this is a big catch):
the target has to have an integer mode that is twice as big as the mode
of the operands.  The issue is that wide-ints actually carry around the
mode of the value in order to get the bitsize and precision of the
operands (it does not have the type, because this code has to work at
both the rtl and tree levels, and I generally do not want the
signedness anyway).

My current code in vrp checks to see if such a mode exists and, if it
does, it produces the product.  If the mode does not exist, it returns
bottom.  What this means is that for most (many or some) targets that
have a TImode, the largest thing that vrp can discover ranges for is a
DImode value.  We could get around this by defining the next larger
mode than what the target really needs, but I wonder how much mileage
you are going to get out of that with really large numbers.
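
Roughly, the gate looks like this; mode_for_size, GET_MODE_BITSIZE and
set_value_range_to_varying are the existing functions, while the
wide-int methods are stand-ins:

  /* Sketch of the vrp check described above.  */
  enum machine_mode wider
    = mode_for_size (2 * GET_MODE_BITSIZE (mode), MODE_INT, 0);
  if (wider == BLKmode)
    /* No double-width integer mode on this target: give up (bottom).  */
    set_value_range_to_varying (vr);
  else
    /* Extend both operands into the wider mode and multiply there.  */
    prod = a.ext (wider).mul (b.ext (wider));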

Of course you could have something else in mind.

kenny

On 10/03/2012 04:47 PM, Marc Glisse wrote:
> [...]
Mike Stump Oct. 3, 2012, 10:55 p.m. UTC | #3
On Oct 3, 2012, at 1:47 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> did you consider making the size of wide_int a template parameter, now that we are using C++? All with a convenient typedef or macro so it doesn't show. I am asking because in vrp I do some arithmetic that requires 2*N+1 bits where N is the size of double_int.

No, not really.  I'd maybe answer it this way: we put in a type (singular) to support all integral constants in all languages on a port.  Since we only needed one, there was little need to templatize it.  By supporting all integral constants in all languages, there is little need for more.  If Ada, say, wanted a 2048 bit integer, then we just have it drop off the size it wants someplace and we would mix that in on a MAX(….) line; net result, the type we use would then directly support the needs of Ada.  If vrp wanted 2x of all existing modes, we could simply change the MAX equation and essentially double it, if people need that.  This comes at a cost: the intermediate wide values are fixed size allocated (not variable), so these all would be larger.  For the longer lived values, no change, as these are variably sized as one would expect.
Richard Biener Oct. 4, 2012, 12:48 p.m. UTC | #4
On Wed, Oct 3, 2012 at 7:15 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
> [...]
>
> Internally, a wide_int is a structure that contains a fixed sized
> array of HOST_WIDE_INTs, a length field and a mode.  The size of the

That it has a mode sounds odd to me and makes it subtly different
from HOST_WIDE_INT and double-int.  Maybe the patch will tell
why this is so.

> array is determined at generation time by dividing the number of bits
> of the largest integer supported on the target by the number of bits
> in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
> integer can be supported on any host.
>
> [...]
>
> I am sorry for the size of this patch.  However, there does not
> appear to be a way to change the underlying data structure to
> support wider integers without doing something like this.

Some pieces can be easily split out, like the introduction and use
of CONST_SCALAR_INT_P.

As my general comment I would like to see double-int and wide-int
unified from an interface perspective.  Which means that double-int
should be a specialization of wide-int which should be a template
(which also means its size is constant).  Thus,

typedef wide_int<2> double_int;

should be the way to expose the double_int type.

The main question remains - why does wide_int have a mode?
That looks redundant, both with information stored in types
and the RTL constant, and with the len field (that would be
always GET_MODE_SIZE () / ...?).  Also when you associate
a mode it's weird you don't associate a signedness.

Thus I'd ask you to rework this to be a template on 'len'
(number of HOST_WIDE_INT words), drop the mode member
and unify double-int and wide-int.  Coincidentally, incrementally
doing this by converting double-int to a typedef of a
wide_int<2> specialization (thus moving double-int implementation
stuff to be wide_int<2> specialized) would be preferred.

Btw,

+/* Constructs tree in type TYPE from with value given by CST.  Signedness
+   of CST is assumed to be the same as the signedness of TYPE.  */
+
+tree
+wide_int_to_tree (tree type, const wide_int &cst)
+{
+  wide_int v;
+  if (TYPE_UNSIGNED (type))
+    v = cst.zext (TYPE_PRECISION (type));
+  else
+    v = cst.sext (TYPE_PRECISION (type));
+
+  return build_int_cst_wide (type, v.elt (0), v.elt (1));
+}

is surely broken.  A wide-int does not fit a double-int.  How are you
going to "fix" this?

Thanks,
Richard.

> kenny
Marc Glisse Oct. 4, 2012, 1:17 p.m. UTC | #5
On Wed, 3 Oct 2012, Mike Stump wrote:
> On Oct 3, 2012, at 1:47 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> did you consider making the size of wide_int a template parameter, now 
>> that we are using C++? All with a convenient typedef or macro so it 
>> doesn't show. I am asking because in vrp I do some arithmetic that 
>> requires 2*N+1 bits where N is the size of double_int.
>
> No, not really.  I'd maybe answer it this way, we put in a type 
> (singular) to support all integral constants in all languages on a port. 
> Since we only needed 1, there was little need to templatize it.  By 
> supporting all integral constants in all languages, there is little need 
> for more.  If Ada say, wanted a 2048 bit integer, then, we just have it 
> drop off the size it wants someplace and we would mix that in on a 
> MAX(….) line, net result, the type we use would then directly support 
> the needs of Ada.  If vrp wanted 2x of all existing modes, we could 
> simply change the MAX equation and essentially double it; if people need 
> that.  This comes as a cost, as the intermediate wide values are fixed 
> size allocated (not variable); so these all would be larger.

And this cost could be eliminated by having a template wide_int_ so only 
the places that need it actually use the extra size ;-)


On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

> I have already converted the vrp code, so I have some guess at where you are 
> talking about (of course, correct me if I am wrong).
>
> The code that computes the range when two variables are multiplied 
> together needs to do a multiplication that produces a result that is 
> twice as wide as the inputs.

Yes, exactly.

> My library is able to do that with one catch (and this is a big catch): the 
> target has to have an integer mode that is twice as big as the mode of the 
> operands.  The issue is that wide-ints actually carry around the mode of the 
> value in order to get the bitsize and precision of the operands (it does not 
> have the type, because this code has to work at both the rtl and tree levels 
> and I generally do not want the signedness anyway).
>
> My current code in vrp checks to see if such a mode exists and if it does, it 
> produces the product.  If the mode does not exist, it returns bottom.  What 
> this means is that for most (many or some) targets that have a TImode, the 
> largest thing that vrp can discover ranges for is a DImode value.  We 
> could get around this by defining the next larger mode than what the target 
> really needs, but I wonder how much mileage you are going to get out of that 
> with really large numbers.

This will be for discussion when you submit that next patch, but currently 
VRP handles integers the same size as double_int. In particular, it 
handles __int128. I would be unhappy if introducing a larger bigint type 
in gcc made us regress there.
Kenneth Zadeck Oct. 4, 2012, 1:55 p.m. UTC | #6
Let me talk about the mode here first.

What this interface/patch provides is a facility where the constant
math that is done in optimizations is done exactly the way that it
would be done on the target machine.  What we have now is a compiler
that only does this if it is convenient to do on the host.  I admit
that I care about this more than others right now, but if Intel adds a
couple more instructions to their vector units, other people will
start to really care about this issue.  If you take an OImode value
with the current compiler and left shift it by 250, the middle end
will say that the result is 0.  This is just wrong!!!

What this means is that the bitsize and precision of the operations
need to be carried along when doing math.  When wide-int checks for
overflow on the multiply or add, it is not checking if the value
overflowed in two HWIs; it is checking if the operation overflowed in
the mode of the types that are represented on the target.  When we do
a shift, we are not doing a shift within two HWIs; we are truncating
the shift value (if this is appropriate) according to the bitsize and
shifting according to the precision.
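
As a sketch of that shift behavior (SHIFT_COUNT_TRUNCATED,
GET_MODE_BITSIZE and GET_MODE_PRECISION are the real macros; the
wide-int methods here are stand-ins, not the patch's exact API):

wide_int
lshift_in_mode (const wide_int &a, unsigned int count,
		enum machine_mode mode)
{
  /* Truncate the shift count according to the bitsize, when the
     target says shift counts are truncated.  */
  if (SHIFT_COUNT_TRUNCATED)
    count %= GET_MODE_BITSIZE (mode);
  /* Do the shift, then wrap the result to the precision of MODE
     rather than to 2 * HOST_BITS_PER_WIDE_INT, so an OImode value
     shifted left by 250 keeps its surviving high bits instead of
     collapsing to 0.  */
  return a.lshift (count).truncate_to (GET_MODE_PRECISION (mode));
}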

I think that an argument could be made that storing the mode should be
changed to an explicit precision and bitsize.  (A possible other
option would be to store a tree type, but this would make the usage at
the rtl level very cumbersome, since types are rare there.)  Aside
from the work, you would not get much push back.

But the signedness is a different argument.  At the rtl level, the
signedness is a matter of context.  (You could argue that this is a
mistake and I would agree, but that is an even bigger change.)  But
more to the point, at the tree level, there are a surprising number of
places where the operation desired does not follow the sign of the
types that were used to construct the constants.  Furthermore, not
carrying the sign is more consistent with the double-int code, which,
as you point out, carries nothing.

As for splitting the patch into smaller pieces, I am all for it.  I
have done this twice already, and I could get the CONST_SCALAR_INT_P
patch out quickly.  But you do not get too far along that path before
you are still left with a big patch.  I could split out wide-int.* and
just commit those files with no clients as a first step.  My guess is
that Richard Sandiford would appreciate that, because while he has
carefully checked the rtl stuff, I think that the code inside wide-int
is not in his comfort zone of things he would approve.

As for your btw: I noticed this last night.  It is an artifact of the
way I produced the patch, and "responsible people have been sacked".
However, it shows that you read the patch carefully, and I really
appreciate that.  I owe you a beer (not that you need another at this
time of year).

Kenny



On 10/04/2012 08:48 AM, Richard Guenther wrote:
> On Wed, Oct 3, 2012 at 7:15 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>> [...]
>>
>> Internally, a wide_int is a structure that contains a fixed sized
>> array of HOST_WIDE_INTs, a length field and a mode.  The size of the
> That it has a mode sounds odd to me and makes it subtly different
> from HOST_WIDE_INT and double-int.  Maybe the patch will tell
> why this is so.
>
>> array is determined at generation time by dividing the number of bits
>> of the largest integer supported on the target by the number of bits
>> in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
>> integer can be supported on any host.
>>
>> [...]
>>
>> I am sorry for the size of this patch.  However, there does not
>> appear to be a way to change the underlying data structure to
>> support wider integers without doing something like this.
> Some pieces can be easily split out, like the introduction and use
> of CONST_SCALAR_INT_P.
>
> As my general comment I would like to see double-int and wide-int
> unified from an interface perspective.  Which means that double-int
> should be a specialization of wide-int which should be a template
> (which also means its size is constant).  Thus,
>
> typedef wide_int<2> double_int;
>
> should be the way to expose the double_int type.
>
> The main question remains - why does wide_int have a mode?
> That looks redundant, both with information stored in types
> and the RTL constant, and with the len field (that would be
> always GET_MODE_SIZE () / ...?).  Also when you associate
> a mode it's weird you don't associate a signedness.
>
> Thus I'd ask you to rework this to be a template on 'len'
> (number of HOST_WIDE_INT words), drop the mode member
> and unify double-int and wide-int.  Coincidentally, incrementally
> doing this by converting double-int to a typedef of a
> wide_int<2> specialization (thus moving double-int implementation
> stuff to be wide_int<2> specialized) would be preferred.
>
> Btw,
>
> +/* Constructs tree in type TYPE from with value given by CST.  Signedness
> +   of CST is assumed to be the same as the signedness of TYPE.  */
> +
> +tree
> +wide_int_to_tree (tree type, const wide_int &cst)
> +{
> +  wide_int v;
> +  if (TYPE_UNSIGNED (type))
> +    v = cst.zext (TYPE_PRECISION (type));
> +  else
> +    v = cst.sext (TYPE_PRECISION (type));
> +
> +  return build_int_cst_wide (type, v.elt (0), v.elt (1));
> +}
>
> is surely broken.  A wide-int does not fit a double-int.  How are you
> going to "fix" this?
>
> Thanks,
> Richard.
>
>> kenny
Kenneth Zadeck Oct. 4, 2012, 3:19 p.m. UTC | #7
On 10/04/2012 09:17 AM, Marc Glisse wrote:
> On Wed, 3 Oct 2012, Mike Stump wrote:
>> On Oct 3, 2012, at 1:47 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>> did you consider making the size of wide_int a template parameter, 
>>> now that we are using C++? All with a convenient typedef or macro so 
>>> it doesn't show. I am asking because in vrp I do some arithmetic 
>>> that requires 2*N+1 bits where N is the size of double_int.
>>
>> No, not really.  I'd maybe answer it this way, we put in a type 
>> (singular) to support all integral constants in all languages on a 
>> port. Since we only needed 1, there was little need to templatize 
>> it.  By supporting all integral constants in all languages, there is 
>> little need for more.  If Ada say, wanted a 2048 bit integer, then, 
>> we just have it drop off the size it wants someplace and we would mix 
>> that in on a MAX(….) line, net result, the type we use would then 
>> directly support the needs of Ada.  If vrp wanted 2x of all existing 
>> modes, we could simply change the MAX equation and essentially double 
>> it; if people need that.  This comes as a cost, as the intermediate 
>> wide values are fixed size allocated (not variable); so these all 
>> would be larger.
>
> And this cost could be eliminated by having a template wide_int_ so 
> only the places that need it actually use the extra size ;-)
>
The space is not really an issue in most places, since wide-ints tend
to be short lived.  I guess vrp is slightly different because it
creates a lot at once, but then they go away.

However, the real question is: what are you going to instantiate the
template on?  What we do is look at the target, determine the largest
type that the target supports, and build a wide-int type that supports
that.  How are you going to do better?  Are you going to instantiate
one for every type you see?  Are these going to be static or dynamic?
The last line of this email seems to imply that you were planning to
"know" that __int128 was the largest integer that any target or front
end could support.

And then what do you do for the parts of the compiler that have
operations that take things of two different types, like shift?  The
shift amount can be, and many times is, of a shorter type than what is
being shifted.  Would these different length integers be represented
with different instances from the same template?  I am not a C++
programmer, and so all of this is a little new to me, but from the
perspective of the rest of the compiler, this does not seem like the
right way to go.


>
> On Wed, 3 Oct 2012, Kenneth Zadeck wrote:
>
>> I have already converted the vrp code, so I have some guess at where 
>> you are talking about (of course, correct me if I am wrong).
>>
>> The code that computes the range when two variables are multiplied 
>> together needs to do a multiplication that produces a result that is 
>> twice as wide as the inputs.
>
> Yes, exactly.
>
>> [...]
>
> This will be for discussion when you submit that next patch, but 
> currently VRP handles integers the same size as double_int. In 
> particular, it handles __int128. I would be unhappy if introducing a 
> larger bigint type in gcc made us regress there.
>
You are only happy now because you do not really understand the world
around you.  This is not what your code does.  What your code does is:
if the host is a 64 bit host, you can handle __int128, and if your
host is a 32 bit host, you can handle __int64.  If you are building a
cross compiler from a 32 bit host to a 64 bit target, your pass is
either going to get the wrong answer, give up, or ICE.  There are
currently parts of gcc that do each of these three "solutions", and my
patch gets rid of them because it does the math as the target does the
math, no matter what the target is.

The goal of my patch is to make gcc produce the same correct results
no matter what types the target or host support.  The last thing that
we need is some optimization "knowing" what the limits of either of
these are and hard coding that in a set of templates that have been
statically instantiated.
Kenneth Zadeck Oct. 4, 2012, 3:39 p.m. UTC | #8
Actually richi, this code is "correct" for some broken definition of 
correct.

If all that is done is to convert the rtl parts of the compiler, then 
this code is the best you can do (of course an assertion that the length 
is not greater than 2 would be a useful addition).

The follow-on patch converts the insides of a tree cst to look like a
CONST_WIDE_INT, i.e. an array of HWIs.  When that happens, this code
will look completely different.  But if you only convert the rtl
level, at some point there is going to be an impedance mismatch, and
it is buried here.

I will point out that this is the fallout of trying to split things
into a bunch of smaller patches that could in theory go in separately.

kenny




>
> +/* Constructs tree in type TYPE from with value given by CST.  Signedness
> +   of CST is assumed to be the same as the signedness of TYPE.  */
> +
> +tree
> +wide_int_to_tree (tree type, const wide_int &cst)
> +{
> +  wide_int v;
> +  if (TYPE_UNSIGNED (type))
> +    v = cst.zext (TYPE_PRECISION (type));
> +  else
> +    v = cst.sext (TYPE_PRECISION (type));
> +
> +  return build_int_cst_wide (type, v.elt (0), v.elt (1));
> +}
>
> is surely broken.  A wide-int does not fit a double-int.  How are you
> going to "fix" this?
>
> Thanks,
> Richard.
>
>> kenny
Marc Glisse Oct. 4, 2012, 4:55 p.m. UTC | #9
On Thu, 4 Oct 2012, Kenneth Zadeck wrote:

> On 10/04/2012 09:17 AM, Marc Glisse wrote:
>> [...]
>> 
>> And this cost could be eliminated by having a template wide_int_ so only 
>> the places that need it actually use the extra size ;-)
>> 
> The space is not really an issue in most places since wide-ints tend to be 
> short lived.

You were the one talking of a cost.

> However, the real question is: what are you going to instantiate the template 
> on?  What we do is look at the target, determine the largest type that 
> the target supports, and build a wide-int type that supports that.  How are 
> you going to do better?

In a single place in tree-vrp.c in the code that evaluates 
multiplications, I would instantiate the template on the double (possibly 
+1) of the value you selected as large enough for all constants. For all 
the rest, your type is fine.

>> This will be for discussion when you submit that next patch, but currently 
>> VRP handles integers the same size as double_int. In particular, it handles 
>> __int128. I would be unhappy if introducing a larger bigint type in gcc 
>> made us regress there.
>> 
> You are only happy now because you do not really understand the world around 
> you.

I did not want to go into details, but let me re-phrase: I do not want to 
regress. Currently, hosts with a 64 bit hwi can handle VRP multiplications 
on __int128. If your patch introducing better big integers breaks that, 
that sounds bad to me, since I would expect s/double_int/wide_int/ to just 
work, and using wide_int<2*MAX> would just be a potential simplification 
of the code for later.


Note that VRP is just the one case I am familiar with. Using templates 
should (I haven't checked) be completely trivial and help the next person 
who needs bigger integers for a specific purpose and doesn't want to 
penalize the whole compiler. If the size of wide_int is completely 
irrelevant and we can make it 10 times larger without thinking, I guess 
some numbers showing it would be great (or maybe that's common 
knowledge, then I guess it is fine).


Now those are only some comments from an occasional contributor, not 
reviewer requirements, it is fine to ignore them.
Richard Biener Oct. 4, 2012, 4:58 p.m. UTC | #10
On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
> Let me talk about the mode here first.
>
> What this interface/patch provides is a facility where the constant math
> that is done in optimizations is done exactly the way that it would be done
> on the target machine.   What we have now is a compiler that only does this
> if it is convenient to do on the host.   I admit that I care about this more
> than others right now, but if Intel adds a couple more instructions to
> their vector units, other people will start to really care about this issue.
> If you take an OImode value with the current compiler and left shift it by
> 250 the middle end will say that the result is 0.   This is just wrong!!!
>
> What this means is that the bitsize and precision of the operations need to
> be carried along when doing math.  When wide-int checks for overflow on a
> multiply or add, it is not checking whether the value overflowed in two HWIs,
> it is checking whether the add overflowed in the mode of the types that are
> represented on the target.   When we do a shift, we are not doing a shift
> within two HWIs, we are truncating the shift amount (if this is appropriate)
> according to the bitsize and shifting according to the precision.
>
> I think that an argument could be made that storing the mode should be
> changed to an explicit precision and bitsize.  (A possible other option
> would be to store a tree type, but this would make the usage at the rtl
> level very cumbersome since types are rare.) Aside from the work, you would
> not get much push back.
>
> But the signedness is a different argument.   At the rtl level, the
> signedness is a matter of context.   (You could argue that this is a mistake
> and I would agree, but that is an even bigger change.)   But more to the
> point, at the tree level, there are a surprising number of places where the
> operation desired does not follow the sign of the types that were used to
> construct the constants.   Furthermore, not carrying the sign is more
> consistent with the double-int code, which, as you point out, carries nothing.

Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.).

double-int tries to present a sign-less twos-complement entity of size
2 * HOST_BITS_PER_WIDE_INT.  I think that is sensible and for
obvious reasons should not change.  Both tree and RTL rely on this.
What we do not want is that up to TImode you get an internal representation
done one way (twos-complement) and on OImode and larger you
suddenly get subtly different behavior.  That's a recipe for disaster.

I'd like to clean up the interface to double-int some more (now with the
nice C++ stuff we have).  double-int should be pure twos-complement,
there should be no operations on double-ints that behave differently
when done signed or unsigned, instead we have signed and unsigned
versions of the operations (similar to how signedness is handled on
the RTL level).  With some trivial C++ fu you could have a
double_sint and double_uint type that would get rid of the bool
sign params we have to some functions (and then you could
write double_sint >> n using operator notation).
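
(A standalone sketch of such a wrapper, shown only for operator>>; the two
views share one signless payload.  Names and layout are illustrative, not
a concrete proposal from this thread beyond what is described above:

    #include <cstdint>

    /* Bare twos-complement payload; no intrinsic sign.  */
    struct double_int_raw { uint64_t low, high; };

    /* Signed view: >> is an arithmetic shift.  Assumes 0 <= n < 128.  */
    struct double_sint
    {
      double_int_raw v;

      double_sint operator>> (unsigned n) const
      {
        double_sint r = *this;
        if (n >= 64)
          {
            r.v.low = (uint64_t) ((int64_t) v.high >> (n - 64));
            r.v.high = (uint64_t) ((int64_t) v.high >> 63);  /* sign fill */
          }
        else if (n > 0)
          {
            r.v.low = (v.low >> n) | (v.high << (64 - n));
            r.v.high = (uint64_t) ((int64_t) v.high >> n);
          }
        return r;
      }
    };

    /* A double_uint view would differ only in using logical shifts;
       no bool sign parameter anywhere.  */

)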

I'd like wide-int (whatever its internal representation is) to behave
exactly like double-ints with respect to precision and signedness
handling.  Ideally all static functions we have that operate on
double-ints would be 1:1 available for wide-ints, so I can change
the type of entities in an algorithm from double-ints to wide-ints
(or vice versa) and do not have to change the code at all.

Thus as a first step I'd like you to go over the double-int stuff,
compare it to the wide-int stuff you introduce and point out
differences (changing double-ints or wide-ints to whatever is
the more general concept).

Now, as for 'modes' - similar to signedness some functions
that operate on double-ints take a precision argument (like
the various extensions).  You can add a similar wrapper
type like double_sint, but this time with a cost - a new precision
member, that can be constructed from a double_int (or wide_int)
that ends up specifying the desired precision (be it in terms
of a mode or a type).

You didn't question my suggestion to have the number of
HOST_WIDE_INTs in a wide-int be compile-time constant - was
that just an oversight on your side?  The consequence is that
code wanting to deal with arbitrary length wide-ints needs to
be a template.

> As for splitting the patch into smaller pieces, I am all for it.   I
> have done this twice already and I could get the const_scalar_int_p patch
> out quickly.    But you do not get very far along that path before you are
> still left with a big patch.   I could split out wide-int.* and just commit
> those files with no clients as a first step.   My guess is that Richard
> Sandiford would appreciate that because, while he has carefully checked the
> rtl stuff, I think that the code inside wide-int is not in his comfort zone
> of things he would approve.
>
> As for your btw - I noticed this last night.   It is an artifact of the way
> I produced the patch, and "responsible people have been sacked".   However,
> it shows that you read the patch carefully, and I really appreciate that.
> I owe you a beer (not that you need another at this time of year).

You also didn't mention the missing tree bits ... was this just a 1/n patch
or is it at all usable for you in this state?  Where do the large integers
magically come from?

Richard.

> Kenny
>
> [rest of quoted message snipped]
Kenneth Zadeck Oct. 4, 2012, 6:08 p.m. UTC | #11
On 10/04/2012 12:58 PM, Richard Guenther wrote:
> On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>> [Kenny's discussion of mode, bitsize, precision, and signedness, quoted
>> in full in the previous message, snipped]
> Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.).
Yes, there is a complete enough set of operations that allows you to 
specify the signedness where this matters.

> double-int tries to present a sign-less twos-complement entity of size
> 2 * HOST_BITS_PER_WIDE_INT.  I think that is sensible and for
> obvious reasons should not change.  Both tree and RTL rely on this.
> What we do not want is that up to TImode you get an internal representation
> done one way (twos-complement) and on OImode and larger you
> suddenly get subtly different behavior.  That's a recipe for disaster.

This is the main difference between double-int and wide-int.    Wide-int 
does the math the way the machine does it, or the way the front end would 
expect it to be done.    There is nothing about the host that is visible 
in the interfaces.

I reiterate: our world is already bigger than 128 bits, and the Intel 
world is likely to be soon.   Double-int is stuck in a 64/128-bit world; 
these patches, which I admit are huge, are a way out of that box.
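
(A standalone illustration of the shift example cited above, modelling an
OImode value as four 64-bit words; this is not code from the patch:

    #include <cstdint>
    #include <cstdio>

    /* Left shift a 256-bit value (4 x 64-bit words, least significant
       first) within its own precision, the way the target would.
       Assumes 0 <= n < 256.  */
    static void
    shl256 (uint64_t r[4], const uint64_t a[4], unsigned n)
    {
      unsigned w = n / 64, b = n % 64;
      for (int i = 3; i >= 0; i--)
        {
          uint64_t x = (i >= (int) w) ? a[i - w] << b : 0;
          if (b && i > (int) w)
            x |= a[i - w - 1] >> (64 - b);
          r[i] = x;
        }
    }

    int
    main ()
    {
      uint64_t one[4] = { 1, 0, 0, 0 }, r[4];
      shl256 (r, one, 250);
      /* Bit 250 lands in word 3: prints 0400000000000000, not 0.  */
      printf ("%016llx\n", (unsigned long long) r[3]);
      return 0;
    }

Host math done in one or two HWIs simply has nowhere to put bit 250, which
is how the current middle end ends up with 0.)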


> I'd like to clean up the interface to double-int some more (now with the
> nice C++ stuff we have).  double-int should be pure twos-complement,
> there should be no operations on double-ints that behave differently
> when done signed or unsigned, instead we have signed and unsigned
> versions of the operations (similar to how signedness is handled on
> the RTL level).  With some trivial C++ fu you could have a
> double_sint and double_uint type that would get rid of the bool
> sign params we have to some functions (and then you could
> write double_sint >> n using operator notation).

The problem is that size does matter.    Wide-int is effectively 
infinite-precision twos-complement.    In practice, we can get by with 
just looking at the bitsize and precision of the types/modes involved, 
and this makes the implementation faster than true infinite precision.

I went down the road of trying to fix all of the places where the compiler 
either ICEd or got the wrong answer.   I showed this to Sandiford and he 
talked me out of it.  He was right; it was a rat hole.  It could have 
been a smaller patch, but there were places where it was clearly going to 
take monumental work just to be able to back out and say that you had 
nothing.    The number of places in the compiler where you compare against 
the largest and smallest representation of an integer is not small, and 
some of them are buried very deep down call chains that were not designed 
to say "I cannot answer that question".

I believe that I have all of the functionality of double-int in wide-int; 
it is just that the calls look different because there are no longer 
interfaces that take two HWIs.    As mentioned before, all of the places 
where overflow was computed for the purpose of asking whether a value 
fits in two HWIs are gone.


> I'd like wide-int (whatever its internal representation is) to behave
> exactly like double-ints with respect to precision and signedness
> handling.  Ideally all static functions we have that operate on
> double-ints would be 1:1 available for wide-ints, so I can change
> the type of entities in an algorithm from double-ints to wide-ints
> (or vice versa) and do not have to change the code at all.
As mentioned above, this is not that easy.    The calls look very 
similar, but there are huge places in both the rtl level and the 
tree level that just go away because you are always guaranteed to be 
doing things the way the target wants.    Look closely at the patch for 
simplify-rtx.   Huge blocks of code are gone AND THE COMPILER IS DOING 
MORE STUFF!!!!!

I will admit that the wide-int API is larger than the double-int 
one.     This means that a lot of things turn out to be very easy to do 
with wide-ints that are cumbersome in double-int; for example, you can 
just return a number with a single bit set, or make a shifted, 
complemented mask, in one call.
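
(Illustrative single-word versions of the kind of helpers meant here; the
real wide-int versions would generalize across the element array, and
these names are not necessarily the patch's:

    #include <cstdint>

    /* A number with exactly one bit set.  Assumes bit < 64.  */
    static inline uint64_t
    set_bit_in_zero (unsigned bit)
    {
      return (uint64_t) 1 << bit;
    }

    /* WIDTH one-bits starting at START, optionally complemented.
       Assumes width >= 1 and start + width <= 64.  */
    static inline uint64_t
    shifted_mask (unsigned start, unsigned width, bool complement)
    {
      uint64_t mask = (width >= 64
                       ? ~(uint64_t) 0
                       : (((uint64_t) 1 << width) - 1) << start);
      return complement ? ~mask : mask;
    }

)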

> Thus as a first step I'd like you to go over the double-int stuff,
> compare it to the wide-int stuff you introduce and point out
> differences (changing double-ints or wide-ints to whatever is
> the more general concept).
My goal here is to get rid of double-int and have wide-int be the 
replacement.   The only real substantive difference between the two 
(aside from the divmod rounding issue I mentioned at the end of my first 
email) is that the clients no longer have to worry about overflow with 
respect to the host representation.  When overflow is returned, you know 
it is because the math overflowed in the type or mode of the integer 
operands.


> Now, as for 'modes' - similar to signedness some functions
> that operate on double-ints take a precision argument (like
> the various extensions).  You can add a similar wrapper
> type like double_sint, but this time with a cost - a new precision
> member, that can be constructed from a double_int (or wide_int)
> that ends up specifying the desired precision (be it in terms
> of a mode or a type).
>
> You didn't question my suggestion to have the number of
> HOST_WIDE_INTs in a wide-int be compile-time constant - was
> that just an oversight on your side?  The consequence is that
> code wanting to deal with arbitrary length wide-ints needs to
> be a template.
It was an oversight.   The number of HWIs is a compile-time constant; it 
is just a compile-time constant that is determined for you by looking at 
the target.   I went the extra mile to have this be automatically 
computed from the target description rather than putting in yet another 
target hook.
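
(A sketch of how that can look.  MAX_BITSIZE_MODE_ANY_INT stands in for
whatever the generator derives from the machine description, so no new
target hook is involved; the field layout mirrors the description earlier
in the thread, and the macro name is illustrative:

    #define WIDE_INT_MAX_ELTS \
      ((MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) \
       / HOST_BITS_PER_WIDE_INT)

    struct wide_int
    {
      HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];  /* fixed-size storage */
      unsigned short len;                    /* words actually in use */
      enum machine_mode mode;                /* supplies bitsize/precision */
    };

)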

There is the issue of vrp needing something two times the size of the 
largest mode.    One way to solve that is just to get rid of the mode 
inside the wide-int and keep an explicit precision and bitsize.  Then we 
could change the wide-int type to just use a 2x buffer.      Doing this 
would not be that expensive, except for vrp (which needed it anyway).   In 
almost all of the rest of the compiler, wide-ints are very short-lived 
objects that just live on the stack.    So their size does not really 
affect anything.


>> As for the splitting out the patch in smaller pieces, i am all for it.   I
>> have done this twice already and i could get the const_scalar_int_p patch
>> out quickly.    But you do not get too far along that before you are still
>> left with a big patch.   I could split out wide-int.* and just commit those
>> files with no clients as a first step.   My guess is that Richard Sandiford
>> would appreciate that because while he has carefully checked the rtl stuff,
>> i think that the code inside wide-int is not in his comfort zone of things
>> he would approve.
>>
>> As far as your btw - noticed this last night.   it is an artifact of the way
>> i produced the patch and "responsible people have been sacked".   However,
>> it shows that you read the patch carefully, and i really appreciate that.
>> i owe you a beer (not that you need another at this time of year).
> You also didn't mention the missing tree bits ... was this just a 1/n patch
> or is it at all usable for you in this state?  Where do the large integers
> magically come from?
>
> Richard.

Yes, the tree bits are missing!!!  I am still working on them.   The 
plan was to have two patches, one for the rtl and one for the tree; the 
idea was that the rtl one could go in first.   Of course, if we can find 
ways to break this into more pieces, I agree that that helps.    I am a 
few days away from getting the tree patch out the door, and unlike the 
rtl patch, Richard Sandiford is not going to give it private comments 
first.

My plan is to submit the tree patch as soon as possible, and in any case 
before stage 1 closes (unless I can get an extension from a friendly 
release manager).


>> Kenny
>>
>> [rest of quoted message snipped]
Richard Sandiford Oct. 4, 2012, 7:27 p.m. UTC | #12
Kenneth Zadeck <zadeck@naturalbridge.com> writes:
> On 10/04/2012 12:58 PM, Richard Guenther wrote:
>> On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>>> [Kenny's discussion of mode, bitsize, precision, and signedness snipped]
>> Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.).
> Yes, there is a complete enough set of operations that allows you to 
> specify the signedness where this matters.
>
>> double-int tries to present a sign-less twos-complement entity of size
>> 2 * HOST_BITS_PER_WIDE_INT.  I think that is sensible and for
>> obvious reasons should not change.  Both tree and RTL rely on this.

I disagree, at least from an RTL perspective.  HOST_BITS_PER_WIDE_INT is
a host property, and as Kenny says, it's not really sensible to tie our
main target-specific IR to something host-specific.  We've only done
that for historical reasons.

On a target that's been converted to wide_int, I don't think a pair
of HWIs (whether or not it's expressed as a double_int) has any
significance at all at the RTL level.

As far as the wide_ints recording a mode or precision goes: we're in
the "lucky" position of having tried both options.  Trees record the
type (and thus precision) of all compile-time integer constants.
RTL doesn't.  And the RTL way is a right pain.  It leads to interfaces
in which rtx arguments often have to be accompanied by a mode.  And it
leads to clunky differences like those between simplify_unary_operation
(which takes two mode arguments) and simplify_binary_operation
(which with our current set of binary operations requires only one).

To put it another way: every wide_int operation needs to know
the precision of its arguments and the precision of its result.
Not storing the precision in the wide_int is putting the onus
on the caller to keep track of the precision separately.
And that just leads to the possibility of mismatches (e.g. from
accidentally passing the precision of a different wide_int instead;
the corresponding rtx mistake is easily made when you have several
variables with "mode" in their name).

Storing the precision in the wide_int means that we can assert for
such mismatches.  And those kinds of assert are much easier to debug,
because they trigger regardless of the integer values involved.
In contrast, RTL-level mismatches between CONST_INTs/CONST_DOUBLEs
and their accompanying mode have tended to trigger ICEs or wrong-code
bugs only for certain relatively uncommon inputs.  I.e. they're much
more likely to be found in the field rather than before release.
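
(A toy standalone model of the point being made: once the precision lives
in the object, every operation can assert agreement regardless of the
values involved.  Nothing below is the patch's actual code:

    #include <cassert>
    #include <cstdint>

    struct toy_wide_int
    {
      uint64_t val;        /* stand-in for the HWI array */
      unsigned precision;  /* bits, 1..64 */

      toy_wide_int operator+ (const toy_wide_int &b) const
      {
        /* A mismatch fires on any input, not just unlucky values.  */
        assert (precision == b.precision);
        toy_wide_int r = { val + b.val, precision };
        if (precision < 64)
          r.val &= ((uint64_t) 1 << precision) - 1;  /* wrap in the mode */
        return r;
      }
    };

)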

But let's say for the sake of argument that we went that way and didn't
record the precision in the wide_int.  We're unlikely to want interfaces
like:

    div (precision_of_result, precision_of_op0, op0,
         precision_of_op1, op1)

because the three precisions are always going to be the same.
We'd have:

    div (precision_of_result, op0, op1)

instead.  So in cases where we "know" we're only going to be calling
wide_int interfaces like these, the callers will probably be lazy and
not keep track of op0 and op1's precision separately.  But the functions
like sign and zero extension, popcount, etc., will definitely need to
know the precision of their operand, and once we start using functions
like those, we will have to make the caller aware of the precision of
existing wide_ints.  It's not always easy to retrofit precision-tracking
code to the caller, and combine.c's handling of ZERO_EXTEND is one
example where we ended up deciding it was easier not to bother and just
avoid optimising ZERO_EXTENDs of constants instead.  See also cselib,
where we have to wrap certain types of rtl in a special CONST wrapper
(with added mode) so that we don't lose track.

And I think all that "need to know the precision" stuff applies to
double_int, except that the precision of a double_int is always
implicitly 2 * HOST_BITS_PER_WIDE_INT.  We don't want to hardcode
a similar precision for wide_int because it's always the precision
of the current operation, not the precision of the target's widest mode
(or whatever) that matters.

We only get away with using double_int (or more generally a pair of HWIs)
for RTL because (a) single HWI arithmetic tends to use a different path
and (b) no-one is making active use of modes with precisions in the range
(HOST_BITS_PER_WIDE_INT, 2 * HOST_BITS_PER_WIDE_INT).  If they were,
a lot of our current RTL code would be wrong, because we often don't
truncate intermediate or final 2-HWI values to the precision of the
result mode.

One of the advantages of wide_int is that it removes this (IMO) artificial
restriction "for free" (or in fact with less RTL code than we have now).
And that's not just academic: I think Kenny said that his target had
modes that are wider than 64 bits but not a power of 2 in width.

Sorry for the rant.  I just don't see any advantage to keeping the
precision and integer separate.

Richard
Marc Glisse Oct. 4, 2012, 9:06 p.m. UTC | #13
On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

> I have already converted the vrp code, so I have some guess at where you are 
> talking about.  (Of course, correct me if I am wrong.)
>
> The code that computes the range when two variables are multiplied 
> together needs to do a multiplication that produces a result that is twice 
> as wide as the inputs.
>
> My library is able to do that with one catch (and this is a big catch): the 
> target has to have an integer mode that is twice as big as the mode of the 
> operands. The issue is that wide-ints actually carry around the mode of the 
> value in order to get the bitsize and precision of the operands (it does not 
> have the type, because this code has to work on both the rtl and tree level 
> and I generally do not want the signedness anyway).

Ah, after reading the whole thread, I now understand that it is because 
wide_int carries a mode that it makes little sense to make it a template 
(sorry that it took me so long when the information was in your first 
answer). I understand that it would be inconvenient (much longer code) to 
have a base_wide_int that does just the arithmetic and a wrapper that 
contains the mode as well.

Your idea below to define dummy extra modes does bring the template idea 
back to the table though ;-)

> my current code in vrp checks to see if such a mode exists and if it does, it 
> produces the product.   if the mode does not exist, it returns bottom.   What 
> this means is that for most (many or some) targets that have a TImode, the 
> largest thing that particular vrp discover ranges for is a DImode value.   We 
> could get around this by defining the next larger mode than what the target 
> really needs but i wonder how much mileage you are going to get out of that 
> with really large numbers.

The current wrapping multiplication code in vrp works with a pair of 
double_int, so it should keep working with a pair of wide_int. I see now 
why wide_int doesn't allow simplifying the code, but it doesn't have to 
break.
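
(For concreteness, a standalone sketch of the double-width multiply under
discussion, on a single 64-bit word; a wide-int mul_full would do the same
schoolbook step per element pair.  This is illustrative, not the patch:

    #include <cstdint>

    /* Unsigned 64 x 64 -> 128 multiply via 32-bit halves; the full
       product is *hi:*lo, so nothing overflows or is lost.  */
    static void
    umul_full (uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
    {
      uint64_t al = a & 0xffffffffu, ah = a >> 32;
      uint64_t bl = b & 0xffffffffu, bh = b >> 32;

      uint64_t p0 = al * bl;
      uint64_t p1 = al * bh;
      uint64_t p2 = ah * bl;
      uint64_t p3 = ah * bh;

      uint64_t mid = p1 + (p0 >> 32);   /* cannot overflow */
      mid += p2;                        /* may carry into bit 96 */
      uint64_t carry = mid < p2 ? (uint64_t) 1 << 32 : 0;

      *lo = (p0 & 0xffffffffu) | (mid << 32);
      *hi = p3 + (mid >> 32) + carry;
    }

)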
Kenneth Zadeck Oct. 4, 2012, 11:02 p.m. UTC | #14
There are a bunch of ways to skin the cat.

1) We can define the extra mode.
2) If we get rid of the mode inside the wide-int and replace it with an 
explicit precision and bitsize, then we can just make the size of the 
buffer twice as big as the analysis of the modes indicates.
3) Or we can leave your code in a form that uses two wide-ints.   My 
current patch (which I have not gotten working yet) changes this to use 
the mul_full call, but it could be changed.   It is much simpler than 
the existing code.

I do not see how templates offer any solution at all.   The wide-int 
code really needs to have something valid to indicate the length of the 
object, and if there is no mode that big, the code will ICE.

My personal feeling is that range analysis is quite useful for small 
integers and not so much as the values get larger.   The only really 
large integer constants that you are likely to find in real code are 
encryption keys, like the DVD decoding keys, and if the key is chosen 
well there should be no useful optimization that you can perform on the 
code.   If this did not work on the largest modes, no one would ever 
notice; i.e., I would bet that you never make a useful transformation 
on any integer that would not fit in an int32.

However, this is your pass, and I understand the principle of "never 
back down".

Kenny

On 10/04/2012 05:06 PM, Marc Glisse wrote:
> [message quoted in full above; snipped]
Marc Glisse Oct. 5, 2012, 7:04 a.m. UTC | #15
On Thu, 4 Oct 2012, Kenneth Zadeck wrote:

> There are a bunch of ways to skin the cat.
>
> 1) We can define the extra mode.
> 2) If we get rid of the mode inside the wide-int and replace it with an 
> explicit precision and bitsize, then we can just make the size of the buffer 
> twice as big as the analysis of the modes indicates.
> 3) Or we can leave your code in a form that uses two wide-ints.   My current 
> patch (which I have not gotten working yet) changes this to use the mul_full 
> call, but it could be changed.   It is much simpler than the existing code.

Thanks, we are exactly on the same page :-)

> I do not see how templates offer any solution at all.   The wide-int code 
> really needs to have something valid to indicate the length of the object, 
> and if there is no mode that big, the code will ICE.

It would be possible to define the regular wide_int taking into account 
only valid modes, and only use the dummy larger modes in very specific 
circumstances, where the template parameter would somehow indicate the 
last mode that may appear in it. This is not a recommendation at all, just 
an explanation of why templates might have had something to do with it.

> My personal feeling is that range analysis is quite useful for small integers 
> and not so much as the values get larger.   The only really large integer 
> constants that you are likely to find in real code are encryption keys,

Note that GCC regularly decides to use unsigned types where the code is 
signed, so a "small" constant like -1 can be huge.
Richard Biener Oct. 5, 2012, 9:26 a.m. UTC | #16
On Thu, Oct 4, 2012 at 9:27 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>> On 10/04/2012 12:58 PM, Richard Guenther wrote:
>>> On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
>>>> [Kenny's discussion of mode, bitsize, precision, and signedness snipped]
>>> Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.).
>> Yes, there is a complete enough set of operations that allows you to
>> specify the signedness where this matters.
>>
>>> double-int tries to present a sign-less twos-complement entity of size
>>> 2 * HOST_BITS_PER_WIDE_INT.  I think that is sensible and for
>>> obvious reasons should not change.  Both tree and RTL rely on this.
>
> I disagree, at least from an RTL perspective.  HOST_BITS_PER_WIDE_INT is
> a host property, and as Kenny says, it's not really sensible to tie our
> main target-specific IR to something host-specific.  We've only done
> that for historical reasons.

Oh, I agree - that HOST_WIDE_INT provides the limit for the largest
integer constants we can encode on a target is a bug!

> On a target that's been converted to wide_int, I don't think a pair
> of HWIs (whether or not it's expressed as a double_int) has any
> significance at all at the RTL level.
>
> As far as the wide_ints recording a mode or precision goes: we're in
> the "lucky" position of having tried both options.  Trees record the
> type (and thus precision) of all compile-time integer constants.
> RTL doesn't.  And the RTL way is a right pain.  It leads to interfaces
> in which rtx arguments often have to be accompanied by a mode.  And it
> leads to clunky differences like those between simplify_unary_operation
> (which takes two mode arguments) and simplify_binary_operation
> (which with our current set of binary operations requires only one).

But the issue here is not that double_int doesn't record a mode
or precision (or a sign).  The issue is that CONST_DOUBLE and CONST_INT
don't!  The _tree_ INTEGER_CST consists of a type and a double-int.
I see double-int (and the proposed wide-int) as a common building-block
used for kind of "arbitrary precision" (as far as the target needs) integer
constants on both tree and RTL.  And having a common worker implementation
requires it to operate on something different than tree types or RTL mode
plus signedness.

> To put it another way: every wide_int operation needs to know
> the precision of its arguments and the precision of its result.

That's not true.  Every tree or RTL operation does, not every
wide_int operation.  double_int's are just twos-complement numbers
of precision 2 * HOST_BITS_PER_WIDE_INT.  wide_int's should
be just twos-complement numbers of precision len *
WHATEVER_HOST_COMPONENT_TYPE_IS_SUITABLE_FOR_A_FAST_IMPLEMENTATION.
Operations on double_int and wide_int are bare-metal;
in nearly all cases you should instead use routines that do
proper sign-/zero-extension to a possibly smaller precision.  When
using the bare metal you are supposed to do that yourself.

Which is why I was suggesting to add double_sint, double_uint,
double_sintp (with precision), etc., wrappers and operations.
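
Purely as an illustration of what I mean (the wrapper name and the
exact double_int member functions used here are assumptions, nothing
that exists today):

struct double_sintp {
  double_int val;           /* twos-complement payload */
  unsigned int precision;   /* number of significant bits */
};

/* Signed PRECISION-bit add: do the full double_int-wide add, then
   sign-extend the result back down to the wrapper's precision.  */
static inline double_sintp
double_sintp_add (double_sintp a, double_sintp b)
{
  double_sintp r;
  r.precision = a.precision;
  r.val = (a.val + b.val).sext (a.precision);
  return r;
}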

> Not storing the precision in the wide_int is putting the onus
> on the caller to keep track of the precision separately.

But that's a matter of providing something high-level on top of
the bare-metal double-int/wide-int (something shared between
RTL and trees).  Duplicating information in the bare-metal
doesn't make sense (and no, INTEGER_CSTs on the tree level
are _not_ short-lived, and how can a double-int make sense on
the tree level when you say it doesn't make sense on the RTL level?)

> And that just leads to the possibility of mismatches (e.g. from
> accidentally passing the precision of a different wide_int instead;
> the corresponding rtx mistake is easily made when you have several
> variables with "mode" in their name).

Well, the mistake was obviously that CONST_INT/CONST_DOUBLE
do not have a mode.  You are passing those RTXen, you are not
passing double-ints!  Don't blame double-ints for something that is
not their fault.

To me it looks like you want to create a CONST_WIDE, not a wide-int
(as you are not touching the tree side at all).  I'm all for that if that
is progress for you.  But make sure to provide an underlying
object type that is suitable for sharing between RTL and trees
(thus do not put in redundant information like a precision).

In the other mail Kenny mentioned that wide-ints are all of the same
compile-time, target-specific size.  That might be suitable for
CONST_WIDE (on RTL you still have CONST_INT for integers that
can be encoded smaller, and we only have one function live in RTL
at a time) - but for trees certainly INTEGER_CSTs can consume
a significant portion of GC memory, so it's not an option to enlarge
it from containing two HWIs to what is required for 2 * OImode.
But again there is enough redundant information in the INTEGER_CST
(its type, which contains its mode) to allow it to be of variable size.

> Storing the precision in the wide_int means that we can assert for
> such mismatches.  And those kinds of assert are much easier to debug,
> because they trigger regardless of the integer values involved.
> In contrast, RTL-level mismatches between CONST_INTs/CONST_DOUBLEs
> and their accompanying mode have tended to trigger ICEs or wrong-code
> bugs only for certain relatively uncommon inputs.  I.e. they're much
> more likely to be found in the field rather than before release.
>
> But let's say for the sake of argument that we went that way and didn't
> record the precision in the wide_int.  We're unlikely to want interfaces
> like:
>
>     div (precision_of_result, precision_of_op0, op0,
>          precision_of_op1, op1)
>
> because the three precisions are always going to be the same.
> We'd have:
>
>     div (precision_of_result, op0, op1)
>
> instead.  So in cases where we "know" we're only going to be calling
> wide_int interfaces like these, the callers will probably be lazy and
> not keep track of op0 and op1's precision separately.  But the functions
> like sign and zero extension, popcount, etc., will definitely need to
> know the precision of their operand, and once we start using functions
> like those, we will have to make the caller aware of the precision of
> existing wide_ints.  It's not always easy to retrofit precision-tracking
> code to the caller, and combine.c's handling of ZERO_EXTEND is one
> example where we ended up deciding it was easier not to bother and just
> avoid optimising ZERO_EXTENDs of constants instead.  See also cselib,
> where we have to wrap certain types of rtl in a special CONST wrapper
> (with added mode) so that we don't lose track.
>
> And I think all that "need to know the precision" stuff applies to
> double_int, except that the precision of a double_int is always
> implicitly 2 * HOST_BITS_PER_WIDE_INT.  We don't want to hardcode
> a similar precision for wide_int because it's always the precision
> of the current operation, not the precision of the target's widest mode
> (or whatever) that matters.
>
> We only get away with using double_int (or more generally a pair of HWIs)
> for RTL because (a) single HWI arithmetic tends to use a different path
> and (b) no-one is making active use of modes with precisions in the range
> (HOST_BITS_PER_WIDE_INT, 2 * HOST_BITS_PER_WIDE_INT).  If they were,
> a lot of our current RTL code would be wrong, because we often don't
> truncate intermediate or final 2-HWI values to the precision of the
> result mode.
>
> One of the advantages of wide_int is that it removes this (IMO) artificial
> restriction "for free" (or in fact with less RTL code than we have now).
> And that's not just academic: I think Kenny said that his target had
> modes that are wider than 64 bits but not a power of 2 in width.
>
> Sorry for the rant.  I just don't see any advantage to keeping the
> precision and integer separate.

Avoiding redundant information (which can get out of sync) for something
that is supposed to be shared between tree and RTL (is it?!).

So, I think you want a CONST_WIDE RTX that has

struct const_wide_rtx {
  wide_int w;
  enum machine_mode mode;
};

which, if I look close enough, we already have (in struct rtx_def we _do_
have a mode field).

So I'm not really sure what your issue is?

Look at RTL users of the double-int routines and provide wrappers
that take RTXen as inputs.  Enforce that all CONSTs have a mode.
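
For illustration, such a wrapper could sit on the existing
rtx_to_double_int and immed_double_int_const helpers (the function
name here is made up; treat the double_int operator as an assumption):

/* Simplify a constant addition entirely at the rtx level.  */
static rtx
simplify_const_plus (rtx a, rtx b, enum machine_mode mode)
{
  double_int r = rtx_to_double_int (a) + rtx_to_double_int (b);
  return immed_double_int_const (r, mode);
}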

Richard.

> Richard
Richard Biener Oct. 5, 2012, 9:29 a.m. UTC | #17
On Fri, Oct 5, 2012 at 11:26 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> Look at RTL users of the double-int routines and provide wrappers
> that take RTXen as inputs.  Enforce that all CONSTs have a mode.

Which would, btw, allow us to "merge" CONST_INT, CONST_DOUBLE
and CONST_WIDE by making the storage size variable and its
length specified by the mode of the RTX (I never liked the distinction
of CONST_INT and CONST_DOUBLE, and you are right, the
CONST_DOUBLE paths are seldom exercised).
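
For illustration, the variable-size storage could use the usual
trailing-array idiom (the names here are made up, not taken from the
patch):

/* Hypothetical unified integer-constant payload: the number of HWIs
   actually allocated follows from the mode, so small constants pay
   for one element only.  */
struct int_vec_def {
  unsigned short num_elem;   /* number of elements that follow */
  HOST_WIDE_INT elem[1];     /* allocated with num_elem entries */
};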

Richard.
Richard Sandiford Oct. 5, 2012, 9:55 a.m. UTC | #18
Richard Guenther <richard.guenther@gmail.com> writes:
>> As far as the wide_ints recording a mode or precision goes: we're in
>> the "lucky" position of having tried both options.  Trees record the
>> type (and thus precision) of all compile-time integer constants.
>> RTL doesn't.  And the RTL way is a right pain.  It leads to interfaces
>> in which rtx arguments often have to be accompanied by a mode.  And it
>> leads to clunky differences like those between simplify_unary_operation
>> (which takes two mode arguments) and simplify_binary_operation
>> (which with our current set of binary operations requires only one).
>
> But the issue here is not that double_int doesn't record a mode
> or precision (or a sign).  The issue is that CONST_DOUBLE and CONST_INT
> don't!  The _tree_ INTEGER_CST consists of a type and a double-int.
> I see double-int (and the proposed wide-int) as a common building-block
> used for kind of "arbitrary precision" (as far as the target needs) integer
> constants on both tree and RTL.  And having a common worker implementation
> requires it to operate on something different than tree types or RTL mode
> plus signedness.
>
>> To put it another way: every wide_int operation needs to know
>> the precision of its arguments and the precision of its result.
>
> That's not true.  Every tree or RTL operation does, not every
> wide_int operation.  double_int's are just twos-complement numbers
> of precision 2 * HOST_BITS_PER_WIDE_INT.  wide_int's should
> be just twos-complement numbers of precision len *
> WHATEVER_HOST_COMPONENT_TYPE_IS_SUITABLE_FOR_A_FAST_IMPLEMENTATION.
> Operations on double_int and wide_int are bare-metal;
> in nearly all cases you should instead use routines that do
> proper sign-/zero-extension to a possibly smaller precision.  When
> using the bare metal you are supposed to do that yourself.
>
> Which is why I was suggesting to add double_sint, double_uint,
> double_sintp (with precision), etc., wrappers and operations.
>
>> Not storing the precision in the wide_int is putting the onus
>> on the caller to keep track of the precision separately.
>
> But that's a matter of providing something high-level on top of
> the bare-metal double-int/wide-int (something shared between
> RTL and trees).  Duplicating information in the bare-metal
> doesn't make sense (and no, INTEGER_CSTs on the tree level
> are _not_ short-lived, and how can a double-int make sense on
> the tree level when you say it doesn't make sense on the RTL level?)

I think we're talking at cross purposes here.  To the extent that
I'm not really sure where to begin. :-)  Just in case this is it:
the idea is that wide_int is the type used to _process_ integers.
It is not supposed to be the type used to store integers in the IRs.
The way trees store integers and the way that RTL stores integers
are up to them.  For RTL the idea is that we continue to store integers
that happen to fit into a HWI as a CONST_INT (regardless of the size of
the CONST_INT's context-determined mode).  Wider integers are stored
as a CONST_DOUBLE (for unconverted targets) or a CONST_WIDE_INT
(for converted targets).  None of the three use the wide_int type;
they use more compact representations instead.  And as Kenny says,
using CONST_INT continues to be an important way of reducing the
IR footprint.

Whenever we want to do something interesting with the integers,
we construct a wide_int for them and use the same processing code
for all three rtx types.  This avoids having the separate single-HWI
and double-HWI paths that we have now.  It also copes naturally with
cases where we start out with single-HWI values but end up with wider
ones.

But the operations that we do on these wide_ints will all be to a mode
or precision.  Shifts of QImode integers are different from shifts of
HImode integers, etc.
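
A self-contained example of the difference, in plain C (illustrative
only):

/* Shift X left by AMOUNT and truncate the result to PREC bits.  */
unsigned int
shift_in_prec (unsigned int x, unsigned int amount, unsigned int prec)
{
  unsigned int mask = prec >= 32 ? ~0U : (1U << prec) - 1;
  return (x << amount) & mask;
}

/* shift_in_prec (0x80, 1, 8) == 0x00 -- the QImode shift wraps,
   shift_in_prec (0x80, 1, 16) == 0x100 -- the HImode shift doesn't.  */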

If you knew all that already (you probably did) and I've completely
missed the point, please say so. :-)

I'm not sure what you mean by "bare metal".

Richard
Richard Biener Oct. 5, 2012, 10:34 a.m. UTC | #19
On Fri, Oct 5, 2012 at 11:55 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Guenther <richard.guenther@gmail.com> writes:
>>> As far as the wide_ints recording a mode or precision goes: we're in
>>> the "lucky" position of having tried both options.  Trees record the
>>> type (and thus precision) of all compile-time integer constants.
>>> RTL doesn't.  And the RTL way is a right pain.  It leads to interfaces
>>> in which rtx arguments often have to be accompanied by a mode.  And it
>>> leads to clunky differences like those between simplify_unary_operation
>>> (which takes two mode arguments) and simplify_binary_operation
>>> (which with our current set of binary operations requires only one).
>>
>> But the issue here is not that double_int doesn't record a mode
>> or precision (or a sign).  The issue is that CONST_DOUBLE and CONST_INT
>> don't!  The _tree_ INTEGER_CST consists of a type and a double-int.
>> I see double-int (and the proposed wide-int) as a common building-block
>> used for kind of "arbitrary precision" (as far as the target needs) integer
>> constants on both tree and RTL.  And having a common worker implementation
>> requires it to operate on something different than tree types or RTL mode
>> plus signedness.
>>
>>> To put it another way: every wide_int operation needs to know
>>> the precision of its arguments and the precision of its result.
>>
>> That's not true.  Every tree or RTL operation does, not every
>> wide_int operation.  double_int's are just twos-complement numbers
>> of precision 2 * HOST_BITS_PER_WIDE_INT.  wide_int's should
>> be just twos-complement numbers of precision len *
>> WHATEVER_HOST_COMPONENT_TYPE_IS_SUITABLE_FOR_A_FAST_IMPLEMENTATION.
>> Operations on double_int and wide_int are bare-metal;
>> in nearly all cases you should instead use routines that do
>> proper sign-/zero-extension to a possibly smaller precision.  When
>> using the bare metal you are supposed to do that yourself.
>>
>> Which is why I was suggesting to add double_sint, double_uint,
>> double_sintp (with precision), etc., wrappers and operations.
>>
>>> Not storing the precision in the wide_int is putting the onus
>>> on the caller to keep track of the precision separately.
>>
>> But that's a matter of providing something high-level on top of
>> the bare-metal double-int/wide-int (something shared between
>> RTL and trees).  Duplicating information in the bare-metal
>> doesn't make sense (and no, INTEGER_CSTs on the tree level
>> are _not_ short-lived, and how can a double-int make sense on
>> the tree level when you say it doesn't make sense on the RTL level?)
>
> I think we're talking at cross purposes here.  To the extent that
> I'm not really sure where to begin. :-)  Just in case this is it:
> the idea is that wide_int is the type used to _process_ integers.
> It is not supposed to be the type used to store integers in the IRs.
> The way trees store integers and the way that RTL stores integers
> are up to them.  For RTL the idea is that we continue to store integers
> that happen to fit into a HWI as a CONST_INT (regardless of the size of
> the CONST_INT's context-determined mode).  Wider integers are stored
> as a CONST_DOUBLE (for unconverted targets) or a CONST_WIDE_INT
> (for converted targets).  None of the three use the wide_int type;
> they use more compact representations instead.  And as Kenny says,
> using CONST_INT continues to be an important way of reducing the
> IR footprint.
>
> Whenever we want to do something interesting with the integers,
> we construct a wide_int for them and use the same processing code
> for all three rtx types.  This avoids having the separate single-HWI
> and double-HWI paths that we have now.  It also copes naturally with
> cases where we start out with single-HWI values but end up with wider
> ones.
>
> But the operations that we do on these wide_ints will all be to a mode
> or precision.  Shifts of QImode integers are different from shifts of
> HImode integers, etc.
>
> If you knew all that already (you probably did) and I've completely
> missed the point, please say so. :-)
>
> I'm not sure what you mean by "bare metal".

The issue is that unlike RTL where we "construct" double-ints from
CONST_INT/CONST_DOUBLE right now, tree has the double-int
_embedded_.  That's why I say that the thing we embed in
a CONST_WIDE_INT or a tree INTEGER_CST needs to be
"bare metal", and that's what I would call wide-int.

I think you have two things mixed together in this patch, which might
add to the confusion (heh).  One is making the thing you work on in RTL
(which used to be double-ints before this patch) be wide-ints
which carry additional information taken from the IL RTX piece
at construction time.  That's all nice and good.  The other thing
is adding a CONST_WIDE_INT to support wider integer constants
than the weird host-dependent limitation we have right now.
Mixing the two together is probably what made it possible to get
Kenny to work on it ... heh ;)

So to me wide-ints provide the higher-level abstraction on top of
double-ints (which would remain the storage entity).  Such a
higher-level abstraction is very useful, also for double-ints and
also on the tree level.  There is no requirement to provide a bigger
double-int (or wide-int) for this.  Let's call this abstraction
wide-int (as opposed to my suggested more piecemeal double_sint,
double_uint).  You can perfectly well model it on top of the existing
double_int storage.
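
Concretely, something like this (purely illustrative):

/* Hypothetical high-level abstraction: a precision plus the existing
   double_int as the unchanged underlying storage.  */
struct wide_int {
  double_int val;
  unsigned int precision;
};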

As of providing larger "double-ints" - there is not much code left
(ok, quite an overstatement ;)) that relies on the implementation
detail of double-int containing exactly two HOST_WIDE_INTs.
The exact purpose of double-ints was to _hide_ this (previously
we only had routines like mul_double_with_sign which take
two HOST_WIDE_INT components).  Most code dealing with
the implementation detail is code interfacing with middle-end
routines that take a HOST_WIDE_INT argument (thus the
double_int_fits_hwi_p predicates) - even wide_int has to support
this kind of interfacing.

So, after introducing wide_int that just wraps double_int and
changing all user code (hopefully all, I guess mostly all), we'd
tackle the issue that the size of double_ints is host-dependent.
A simple solution is to make its size dependent on a target macro
(yes, macro ...), so on a 32bit HWI host targeting a 64bit 'HWI' target
you'd simply have four HWIs in the 'double-int' storage (and
arithmetic needs to be generalized to support this).

[The more complex solution makes double-int* just a pointer to the first
HWI, so it cannot be used without being "wrapped" in a wide-int
which implicitly would provide the number of HWIs pointed to
(yes, I think variable-sized wide-int storage is the ultimate way to go).]

So ... can you possibly split the patch in a way that just does the
first thing (provide the high-level wrapper and make use of it from RTL)?
I don't see the reason to have both CONST_DOUBLE and CONST_WIDE,
in fact they should be the same given choice one for the 2nd step.
And you can of course trivially construct a wide-int from a CONST_INT
as well (given it has a mode).
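
E.g. (reusing the illustrative struct above; INTVAL and
GET_MODE_PRECISION are the existing macros, the function itself is
made up):

static wide_int
wide_int_from_const_int (const_rtx x, enum machine_mode mode)
{
  wide_int w;
  w.precision = GET_MODE_PRECISION (mode);
  w.val = double_int::from_shwi (INTVAL (x));
  return w;
}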

As of using a mode or precision in wide-int - for re-use on the tree level
you need a precision as not all precisions have a distinct mode.  As on
RTL you don't have a sign we probably don't want wide-ints to have a sign
but continue double-int-like to specify the sign as part of the operation
where it matters.

Richard.

> Richard
Richard Sandiford Oct. 5, 2012, 11:24 a.m. UTC | #20
Richard Guenther <richard.guenther@gmail.com> writes:
> On Fri, Oct 5, 2012 at 11:55 AM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Richard Guenther <richard.guenther@gmail.com> writes:
>>>> As far as the wide_ints recording a mode or precision goes: we're in
>>>> the "lucky" position of having tried both options.  Trees record the
>>>> type (and thus precision) of all compile-time integer constants.
>>>> RTL doesn't.  And the RTL way is a right pain.  It leads to interfaces
>>>> in which rtx arguments often have to be accompanied by a mode.  And it
>>>> leads to clunky differences like those between simplify_unary_operation
>>>> (which takes two mode arguments) and simplify_binary_operation
>>>> (which with our current set of binary operations requires only one).
>>>
>>> But the issue here is not that double_int doesn't record a mode
>>> or precision (or a sign).  The issue is that CONST_DOUBLE and CONST_INT
>>> don't!  The _tree_ INTEGER_CST consists of a type and a double-int.
>>> I see double-int (and the proposed wide-int) as a common building-block
>>> used for kind of "arbitrary precision" (as far as the target needs) integer
>>> constants on both tree and RTL.  And having a common worker implementation
>>> requires it to operate on something different than tree types or RTL mode
>>> plus signedness.
>>>
>>>> To put it another way: every wide_int operation needs to know
>>>> the precision of its arguments and the precision of its result.
>>>
>>> That's not true.  Every tree or RTL operation does, not every
>>> wide_int operation.  double_int's are just twos-complement numbers
>>> of precision 2 * HOST_BITS_PER_WIDE_INT.  wide_int's should
>>> be just twos-complement numbers of precision len *
>>> WHATEVER_HOST_COMPONENT_TYPE_IS_SUITABLE_FOR_A_FAST_IMPLEMENTATION.
>>> Operations on double_int and wide_int are bare-metal;
>>> in nearly all cases you should instead use routines that do
>>> proper sign-/zero-extension to a possibly smaller precision.  When
>>> using the bare metal you are supposed to do that yourself.
>>>
>>> Which is why I was suggesting to add double_sint, double_uint,
>>> double_sintp (with precision), etc., wrappers and operations.
>>>
>>>> Not storing the precision in the wide_int is putting the onus
>>>> on the caller to keep track of the precision separately.
>>>
>>> But that's a matter of providing something high-level on top of
>>> the bare-metal double-int/wide-int (something shared between
>>> RTL and trees).  Duplicating information in the bare-metal
>>> doesn't make sense (and no, INTEGER_CSTs on the tree level
>>> are _not_ short-lived, and how can a double-int make sense on
>>> the tree level when you say it doesn't make sense on the RTL level?)
>>
>> I think we're talking at cross purposes here.  To the extent that
>> I'm not really sure where to begin. :-)  Just in case this is it:
>> the idea is that wide_int is the type used to _process_ integers.
>> It is not supposed to be the type used to store integers in the IRs.
>> The way trees store integers and the way that RTL stores integers
>> are up to them.  For RTL the idea is that we continue to store integers
>> that happen to fit into a HWI as a CONST_INT (regardless of the size of
>> the CONST_INT's context-determined mode).  Wider integers are stored
>> as a CONST_DOUBLE (for unconverted targets) or a CONST_WIDE_INT
>> (for converted targets).  None of the three use the wide_int type;
>> they use more compact representations instead.  And as Kenny says,
>> using CONST_INT continues to be an important way of reducing the
>> IR footprint.
>>
>> Whenever we want to do something interesting with the integers,
>> we construct a wide_int for them and use the same processing code
>> for all three rtx types.  This avoids having the separate single-HWI
>> and double-HWI paths that we have now.  It also copes naturally with
>> cases where we start out with single-HWI values but end up with wider
>> ones.
>>
>> But the operations that we do on these wide_ints will all be to a mode
>> or precision.  Shifts of QImode integers are different from shifts of
>> HImode integers, etc.
>>
>> If you knew all that already (you probably did) and I've completely
>> missed the point, please say so. :-)
>>
>> I'm not sure what you mean by "bare metal".
>
> The issue is that unlike RTL where we "construct" double-ints from
> CONST_INT/CONST_DOUBLE right now, tree has the double-int
> _embedded_.  That's why I say that the thing we embed in
> a CONST_WIDE_INT or a tree INTEGER_CST needs to be
> "bare metal", and that's what I would call wide-int.

OK, but that's deliberately not what Kenny's patch calls "wide int".
The whole idea is that the "bare metal" thing we embed in a
CONST_WIDE_INT or tree isn't (and doesn't need to be) the same
as the type that we use to operate on integers.  That bare-metal
thing doesn't even have to have a fixed width.  (CONST_WIDE_INT
doesn't have a fixed width, it's only as big as the integer
needs it to be.)

> So to me wide-ints provide the higher-level abstraction on top of
> double-ints (which would remain the storage entity).
>
> Such a higher-level abstraction is very useful, also for double-ints and
> also on the tree level.  There is no requirement to provide a bigger
> double-int (or wide-int) for this.  Let's call this abstraction
> wide-int (as opposed to my suggested more piecemeal double_sint,
> double_uint).  You can perfectly well model it on top of the existing
> double_int storage.
>
> As of providing larger "double-ints" - there is not much code left
> (ok, quite an overstatement ;)) that relies on the implementation
> detail of double-int containing exactly two HOST_WIDE_INTs.
> The exact purpose of double-ints was to _hide_ this (previously
> we only had routines like mul_double_with_sign which take
> two HOST_WIDE_INT components).  Most code dealing with
> the implementation detail is code interfacing with middle-end
> routines that take a HOST_WIDE_INT argument (thus the
> double_int_fits_hwi_p predicates) - even wide_int has to support
> this kind of interfacing.
>
> So, after introducing wide_int that just wraps double_int and
> changing all user code (hopefully all, I guess mostly all), we'd
> tackle the issue that the size of double_ints is host-dependent.
> A simple solution is to make its size dependent on a target macro
> (yes, macro ...), so on a 32bit HWI host targeting a 64bit 'HWI' target
> you'd simply have four HWIs in the 'double-int' storage (and
> arithmetic needs to be generalized to support this).

I just don't see why this is a good thing.  The constraints
are different when storing integers and when operating on them.
When operating on them, we want something that is easily constructed
on the stack, so we can create temporary structures very cheaply,
and can easily pass and return them.  We happen to know at GCC's
compile time how big the biggest integer will ever be, so it makes
sense for wide_int to be that wide.

But when storing integers we want something compact.  If your
machine supports 256-bit integers, but the code you're compiling
makes heavy use of 128-bit integers, why would you want to waste
128 of 256 bits on every stored integer?  Which is why even
CONST_WIDE_INT doesn't have a fixed width.

You seem to be arguing that the type used to store integers in the IR
has to be the same as the one that we use when performing compile-time
arithmetic, but I think it's better to have an abstraction between
the two.

So if you think a pair of HWIs continues to be a useful way of
storing integers at the tree level, then we can easily continue
to use a pair of HWIs there.  Or if you'd prefer every tree integer
to be the same width as a wide_int, we can do that too.  (I don't
know what Kenny's tree patch does.)  But the whole point of having
wide_int as an abstraction is that most code _operating_ on integers
doesn't care what the representation is.  It becomes much easier to
change that representation to whatever gives the best performance.

Another advantage of abstracting away the storage type is that
we could store things like an overflow flag in the wide_int
(if that ever proves useful) without worrying about the impact
on the tree and RTL footprint.
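
For instance, an illustrative layout (not necessarily the patch's
exact fields; the overflow flag is speculative):

/* Processing-only state is free to grow because wide_int never
   appears in the IRs; only the compact IR constants do.  */
struct wide_int {
  HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
  unsigned short len;        /* number of significant HWIs */
  unsigned int precision;
  bool overflow;             /* scratch flag; costs no IR storage */
};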

> [The more complex solution makes double-int* just a pointer to the first
> HWI, so it cannot be used without being "wrapped" in a wide-int
> which implicitly would provide the number of HWIs pointed to
> (yes, I think variable-sized wide-int storage is the ultimate way to go).]
>
> So ... can you possibly split the patch in a way that just does the
> first thing (provide the high-level wrapper and make use of it from RTL)?
> I don't see the reason to have both CONST_DOUBLE and CONST_WIDE,
> in fact they should be the same given choice one for the 2nd step.
> And you can of course trivially construct a wide-int from a CONST_INT
> as well (given it has a mode).
>
> As of using a mode or precision in wide-int - for re-use on the tree level
> you need a precision as not all precisions have a distinct mode.  As on
> RTL you don't have a sign we probably don't want wide-ints to have a sign
> but continue double-int-like to specify the sign as part of the operation
> where it matters.

Definitely agree that signedness is a property of the operation.
(Saturation too IMO.)  I don't really mind about mode vs. precision.

Richard
Richard Biener Oct. 5, 2012, 11:41 a.m. UTC | #21
On Fri, Oct 5, 2012 at 1:24 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Guenther <richard.guenther@gmail.com> writes:
>> On Fri, Oct 5, 2012 at 11:55 AM, Richard Sandiford
>> <rdsandiford@googlemail.com> wrote:
>>> Richard Guenther <richard.guenther@gmail.com> writes:
>>>>> As far as the wide_ints recording a mode or precision goes: we're in
>>>>> the "lucky" position of having tried both options.  Trees record the
>>>>> type (and thus precision) of all compile-time integer constants.
>>>>> RTL doesn't.  And the RTL way is a right pain.  It leads to interfaces
>>>>> in which rtx arguments often have to be accompanied by a mode.  And it
>>>>> leads to clunky differences like those between simplify_unary_operation
>>>>> (which takes two mode arguments) and simplify_binary_operation
>>>>> (which with our current set of binary operations requires only one).
>>>>
>>>> But the issue here is not that double_int doesn't record a mode
>>>> or precision (or a sign).  The issue is that CONST_DOUBLE and CONST_INT
>>>> don't!  The _tree_ INTEGER_CST consists of a type and a double-int.
>>>> I see double-int (and the proposed wide-int) as a common building-block
>>>> used for kind of "arbitrary precision" (as far as the target needs) integer
>>>> constants on both tree and RTL.  And having a common worker implementation
>>>> requires it to operate on something different than tree types or RTL mode
>>>> plus signedness.
>>>>
>>>>> To put it another way: every wide_int operation needs to know
>>>>> the precision of its arguments and the precision of its result.
>>>>
>>>> That's not true.  Every tree or RTL operation does, not every
>>>> wide_int operation.  double_int's are just twos-complement numbers
>>>> of precision 2 * HOST_BITS_PER_WIDE_INT.  wide_int's should
>>>> be just twos-complement numbers of precision len *
>>>> WHATEVER_HOST_COMPONENT_TYPE_IS_SUITABLE_FOR_A_FAST_IMPLEMENTATION.
>>>> Operations on double_int and wide_int are bare-metal;
>>>> in nearly all cases you should instead use routines that do
>>>> proper sign-/zero-extension to a possibly smaller precision.  When
>>>> using the bare metal you are supposed to do that yourself.
>>>>
>>>> Which is why I was suggesting to add double_sint, double_uint,
>>>> double_sintp (with precision), etc., wrappers and operations.
>>>>
>>>>> Not storing the precision in the wide_int is putting the onus
>>>>> on the caller to keep track of the precision separately.
>>>>
>>>> But that's a matter of providing something high-level on top of
>>>> the bare-metal double-int/wide-int (something shared between
>>>> RTL and trees).  Duplicating information in the bare-metal
>>>> doesn't make sense (and no, INTEGER_CSTs on the tree level
>>>> are _not_ short-lived, and how can a double-int make sense on
>>>> the tree level when you say it doesn't make sense on the RTL level?)
>>>
>>> I think we're talking at cross purposes here.  To the extent that
>>> I'm not really sure where to begin. :-)  Just in case this is it:
>>> the idea is that wide_int is the type used to _process_ integers.
>>> It is not supposed to be the type used to store integers in the IRs.
>>> The way trees store integers and the way that RTL stores integers
>>> are up to them.  For RTL the idea is that we continue to store integers
>>> that happen to fit into a HWI as a CONST_INT (regardless of the size of
>>> the CONST_INT's context-determined mode).  Wider integers are stored
>>> as a CONST_DOUBLE (for unconverted targets) or a CONST_WIDE_INT
>>> (for converted targets).  None of the three use the wide_int type;
>>> they use more compact representations instead.  And as Kenny says,
>>> using CONST_INT continues to be an important way of reducing the
>>> IR footprint.
>>>
>>> Whenever we want to do something interesting with the integers,
>>> we construct a wide_int for them and use the same processing code
>>> for all three rtx types.  This avoids having the separate single-HWI
>>> and double-HWI paths that we have now.  It also copes naturally with
>>> cases where we start out with single-HWI values but end up with wider
>>> ones.
>>>
>>> But the operations that we do on these wide_ints will all be to a mode
>>> or precision.  Shifts of QImode integers are different from shifts of
>>> HImode integers, etc.
>>>
>>> If you knew all that already (you probably did) and I've completely
>>> missed the point, please say so. :-)
>>>
>>> I'm not sure what you mean by "bare metal".
>>
>> The issue is that unlike RTL where we "construct" double-ints from
>> CONST_INT/CONST_DOUBLE right now, tree has the double-int
>> _embedded_.  That's why I say that the thing we embed in
>> a CONST_WIDE_INT or a tree INTEGER_CST needs to be
>> "bare metal", and that's what I would call wide-int.
>
> OK, but that's deliberately not what Kenny's patch calls "wide int".
> The whole idea is that the "bare metal" thing we embed in a
> CONST_WIDE_INT or tree isn't (and doesn't need to be) the same
> as the type that we use to operate on integers.  That bare-metal
> thing doesn't even have to have a fixed width.  (CONST_WIDE_INT
> doesn't have a fixed width, it's only as big as the integer
> needs it to be.)

Ok, let's rephrase it this way then: the "bare metal" thing used
for the storage should ideally be the same in the tree IL and the RTL
IL _and_ the higher-level abstract wide-int.

>> So to me wide-ints provide the higher-level abstraction on top of
>> double-ints (which would remain the storage entity).
>>
>> Such a higher-level abstraction is very useful, also for double-ints and
>> also on the tree level.  There is no requirement to provide a bigger
>> double-int (or wide-int) for this.  Let's call this abstraction
>> wide-int (as opposed to my suggested more piecemeal double_sint,
>> double_uint).  You can perfectly well model it on top of the existing
>> double_int storage.
>>
>> As of providing larger "double-ints" - there is not much code left
>> (ok, quite an overstatement ;)) that relies on the implementation
>> detail of double-int containing exactly two HOST_WIDE_INTs.
>> The exact purpose of double-ints was to _hide_ this (previously
>> we only had routines like mul_double_with_sign which take
>> two HOST_WIDE_INT components).  Most code dealing with
>> the implementation detail is code interfacing with middle-end
>> routines that take a HOST_WIDE_INT argument (thus the
>> double_int_fits_hwi_p predicates) - even wide_int has to support
>> this kind of interfacing.
>>
>> So, after introducing wide_int that just wraps double_int and
>> changing all user code (hopefully all, I guess mostly all), we'd
>> tackle the issue that the size of double_ints is host-dependent.
>> A simple solution is to make its size dependent on a target macro
>> (yes, macro ...), so on a 32bit HWI host targeting a 64bit 'HWI' target
>> you'd simply have four HWIs in the 'double-int' storage (and
>> arithmetic needs to be generalized to support this).
>
> I just don't see why this is a good thing.  The constraints
> are different when storing integers and when operating on them.
> When operating on them, we want something that is easily constructed
> on the stack, so we can create temporary structures very cheaply,
> and can easily pass and return them.  We happen to know at GCC's
> compile time how big the biggest integer will ever be, so it makes
> sense for wide_int to be that wide.

I'm not arguing against this.  I'm just saying that the internal
representation will depend on the host - not the number of total
bits, but the number of pieces.

> But when storing integers we want something compact.  If your
> machine supports 256-bit integers, but the code you're compiling
> makes heavy use of 128-bit integers, why would you want to waste
> 128 of 256 bits on every stored integer?  Which is why even
> CONST_WIDE_INT doesn't have a fixed width.
>
> You seem to be arguing that the type used to store integers in the IR
> has to be the same as the one that we use when performing compile-time
> arithmetic, but I think it's better to have an abstraction between
> the two.

Well, you don't want to pay the cost of dividing 256-bit numbers all
the time when most of your numbers are only 128-bit.  So we don't
really want to perform compile-time arithmetic on the biggest
possible precision either.  Ideally, of course - at the moment
we have double-ints and what precision we internally use
is an implementation detail (once it is sufficient precision).

> So if you think a pair of HWIs continues to be a useful way of
> storing integers at the tree level, then we can easily continue
> to use a pair of HWIs there.

How do you store larger ints there then?  How is CONST_WIDE_INT
variable size?  Why can wide-int not be variable-size?

> Or if you'd prefer every tree integer
> to be the same width as a wide_int, we can do that too.  (I don't
> know what Kenny's tree patch does.)

Kenny's patch truncates wide-ints to two HWIs in wide-int-to-tree
without any checking (I claimed it's a bug, Kenny says it's a feature).

>  But the whole point of having
> wide_int as an abstraction is that most code _operating_ on integers
> doesn't care what the representation is.  It becomes much easier to
> change that representation to whatever gives the best performance.

Sure!  And I agree totally with this!  Just don't mix in enlarging that
thing beyond double_int in the first round of implementation!

> Another advantage of abstracting away the storage type is that
> we could store things like an overflow flag in the wide_int
> (if that ever proves useful) without worrying about the impact
> on the tree and RTL footprint.

Sure.

We seem to be talking in circles.  You don't seem to want (or care)
about a common storage for tree, RTL and wide-int.  You seem to
care about that operate-on-wide-ints thing.  I am saying if you keep
double-ints as-is you create two similar things which should be one thing.

So, make double-int storage only.  Introduce wide-int and use that
everywhere we compute on double-ints.  Whether wide-int is variable-size
or not is an implementation detail (if it's not we simply ICE if you require
too large a wide-int).  Whether IL storage is variable-size or not is an
implementation detail as well.

What I don't see is that the patch just introduces wide-int as a type
to do compile-time math on in RTL.  It does more (the patch is large
and I didn't read it in detail).  It doesn't even try to address the same
on the tree level.  It doesn't address the fundamental issue of
double-int size being host dependent.

Instead it seems to focus on getting even larger constants to "work"
(on the RTL level only).

What I'd like to see is _just_ the wide-int abstraction, suitable to
replace both tree and RTL level compile-time math (without even
trying to convert all uses, that's just too much noise for a thorough
review).

Richard.

>> [The more complex solution makes double-int* just a pointer to the first
>> HWI, so it cannot be used without being "wrapped" in a wide-int
>> which implicitly would provide the number of HWIs pointed to
>> (yes, I think variable-sized wide-int storage is the ultimate way to go).]
>>
>> So ... can you possibly split the patch in a way that just does the
>> first thing (provide the high-level wrapper and make use of it from RTL)?
>> I don't see the reason to have both CONST_DOUBLE and CONST_WIDE,
>> in fact they should be the same given choice one for the 2nd step.
>> And you can of course trivially construct a wide-int from a CONST_INT
>> as well (given it has a mode).
>>
>> As of using a mode or precision in wide-int - for re-use on the tree level
>> you need a precision as not all precisions have a distinct mode.  As on
>> RTL you don't have a sign we probably don't want wide-ints to have a sign
>> but continue double-int-like to specify the sign as part of the operation
>> where it matters.
>
> Definitely agree that signedness is a property of the operation.
> (Saturation too IMO.)  I don't really mind about mode vs. precision.
>
> Richard
Richard Sandiford Oct. 5, 2012, 12:26 p.m. UTC | #22
Richard Guenther <richard.guenther@gmail.com> writes:
> On Fri, Oct 5, 2012 at 1:24 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Richard Guenther <richard.guenther@gmail.com> writes:
>>> The issue is that unlike RTL where we "construct" double-ints from
>>> CONST_INT/CONST_DOUBLE right now, tree has the double-int
>>> _embedded_.  That's why I say that the thing we embed in
>>> a CONST_WIDE_INT or a tree INTEGER_CST needs to be
>>> "bare metal", and that's what I would call wide-int.
>>
>> OK, but that's deliberately not what Kenny's patch calls "wide int".
>> The whole idea is that the "bare metal" thing we embed in a
>> CONST_WIDE_INT or tree isn't (and doesn't need to be) the same
>> as the type that we use to operate on integers.  That bare-metal
>> thing doesn't even have to have a fixed width.  (CONST_WIDE_INT
>> doesn't have a fixed width, it's only as big as the integer
>> needs it to be.)
>
> Ok, let's rephrase it this way then: the "bare metal" thing used
> for the storage should ideally be the same in the tree IL and the RTL
> IL _and_ the higher-level abstract wide-int.

Hmm, OK, that's a straight disagreement then.

>>> So to me wide-ints provide the higher-level abstraction on top of
>>> double-ints (which would remain the storage entity).
>>>
>>> Such a higher-level abstraction is very useful, also for double-ints and
>>> also on the tree level.  There is no requirement to provide a bigger
>>> double-int (or wide-int) for this.  Let's call this abstraction
>>> wide-int (as opposed to my suggested more piecemeal double_sint,
>>> double_uint).  You can perfectly well model it on top of the existing
>>> double_int storage.
>>>
>>> As of providing larger "double-ints" - there is not much code left
>>> (ok, quite an overstatement ;)) that relies on the implementation
>>> detail of double-int containing exactly two HOST_WIDE_INTs.
>>> The exact purpose of double-ints was to _hide_ this (previously
>>> we only had routines like mul_double_with_sign which take
>>> two HOST_WIDE_INT components).  Most code dealing with
>>> the implementation detail is code interfacing with middle-end
>>> routines that take a HOST_WIDE_INT argument (thus the
>>> double_int_fits_hwi_p predicates) - even wide_int has to support
>>> this kind of interfacing.
>>>
>>> So, after introducing wide_int that just wraps double_int and
>>> changing all user code (hopefully all, I guess mostly all), we'd
>>> tackle the issue that the size of double_ints is host-dependent.
>>> A simple solution is to make its size dependent on a target macro
>>> (yes, macro ...), so on a 32bit HWI host targeting a 64bit 'HWI' target
>>> you'd simply have four HWIs in the 'double-int' storage (and
>>> arithmetic needs to be generalized to support this).
>>
>> I just don't see why this is a good thing.  The constraints
>> are different when storing integers and when operating on them.
>> When operating on them, we want something that is easily constructed
>> on the stack, so we can create temporary structures very cheaply,
>> and can easily pass and return them.  We happen to know at GCC's
>> compile time how big the biggest integer will ever be, so it makes
>> sense for wide_int to be that wide.
>
> I'm not arguing against this.  I'm just saying that the internal
> representation will depend on the host - not the number of total
> bits, but the number of pieces.

Sure, but Kenny already has a macro to specify how many bits we need
(MAX_BITSIZE_MODE_ANY_INT).  We can certainly wrap:

  HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];

in a typedef if that's what you prefer.
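
I.e. something like (the typedef name is only illustrative):

typedef struct
{
  HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
} wide_int_storage;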

>> But when storing integers we want something compact.  If your
>> machine supports 256-bit integers, but the code you're compiling
>> makes heavy use of 128-bit integers, why would you want to waste
>> 128 of 256 bits on every stored integer?  Which is why even
>> CONST_WIDE_INT doesn't have a fixed width.
>>
>> You seem to be arguing that the type used to store integers in the IR
>> has to be the same as the one that we use when performing compile-time
>> arithmetic, but I think it's better to have an abstraction between
>> the two.
>
>> Well, you don't want to pay the cost of dividing 256-bit numbers all
>> the time when most of your numbers are only 128-bit.  So we don't
> really want to perform compile-time arithmetic on the biggest
> possible precision either.

wide_int doesn't perform them in the biggest possible precision.
It performs them in the precision of the result.  It just happens
to have enough storage to cope with all possible precisions (because
the actual precision of the result is usually only known at GCC's runtime).

>> So if you think a pair of HWIs continues to be a useful way of
>> storing integers at the tree level, then we can easily continue
>> to use a pair of HWIs there.
>
> How do you store larger ints there then?

You'd need another tree code for wider integers.  I'm not saying that's
a good idea, I just wasn't sure if it's what you wanted.

> How is CONST_WIDE_INT variable size?

It's just the usual trailing variable-length array thing.

> Why can wide-int not be variable-size?

Because variable-length arrays are usually more expensive
than (still fairly small) fixed-length arrays when dealing
with temporaries.

>> Or if you'd prefer every tree integer
>> to be the same width as a wide_int, we can do that too.  (I don't
>> know what Kenny's tree patch does.)
>
>> Kenny's patch truncates wide-ints to two HWIs in wide-int-to-tree
>> without any checking (I claimed it's a bug, Kenny says it's a feature).

Only because he split the rtl and tree parts up.  By "Kenny's tree patch",
I meant the patch that he said he was going to send (part 4 as he called it).

Until then, we're not regressing at the tree level, and I think
the patch is a genuine RTL improvement on its own.

>> Another advantage of abstracting away the storage type is that
>> we could store things like an overflow flag in the wide_int
>> (if that ever proves useful) without worrying about the impact
>> on the tree and RTL footprint.
>
> Sure.
>
> We seem to be talking in circles.  You don't seem to want (or care)
> about a common storage for tree, RTL and wide-int.  You seem to
> care about that operate-on-wide-ints thing.  I am saying if you keep
> double-ints as-is you create two similar things which should be one thing.

The idea is to get rid of double_ints altogether.  They shouldn't have
any use once everything has been converted to wide_ints.

> So, make double-int storage only.

The idea was to treat them as legacy and get rid of them as soon
as we can.

> What I don't see is that the patch just introduces wide-int as a type
> to do compile-time math on in RTL.  It does more (the patch is large
> and I didn't read it in detail).

Yeah, it introduces things like the CONST_SCALAR_INT_P abstraction.
But I actually find the patch easier to review like that, because both
changes are affecting the same kinds of place.

> It doesn't even try to address the same on the tree level.

Because as Kenny's already said, that's a separate patch.

> It doesn't address the fundamental issue of double-int size being host
> dependent.

Because the plan is to get rid of it :-)  As it stands, double_int
has several problems:

* Trivial, but it has the wrong name.

* It has the wrong interface for general-precision arithmetic because
  it doesn't say how wide the value stored (or to be stored) in those
  HOST_WIDE_INTs actually is.  E.g. there's no such thing as an "X-bit
  add followed by an X-bit division".  You have to do a "double_int-wide
  add", followed by a truncation/extension to X bits, followed by a
  "double_int-wide division", followed by another truncation/extension
  to X bits (see the sketch after this list).  Which we don't do in RTL;
  we just assume or (occasionally) assert that we only use double_int
  for modes whose precisions are exactly 2 * HOST_BITS_PER_WIDE_INT.

* Using a fixed-length array of HOST_WIDE_INTs is too inflexible
  for size-conscious IRs, so just bumping its size probably isn't
  a good idea for them.  But having a fixed-length array does
  IMO make sense for temporaries.

* If we made it storage-only, it doesn't need all those operations.

Which is why I agree with Kenny that double_int as it exists today
isn't the right starting point.
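
Here's the sketch promised above for the second point (treat the
exact double_int method names as assumptions):

/* An "X-bit add" via double_int: a full double_int-wide add followed
   by an explicit sign-extension back to PREC bits.  An "X-bit
   division" would need the same .sext (prec) step afterwards.  */
static double_int
xbit_add (double_int a, double_int b, unsigned int prec)
{
  return (a + b).sext (prec);
}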

Richard
Richard Biener Oct. 5, 2012, 12:39 p.m. UTC | #23
On Fri, Oct 5, 2012 at 2:26 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Guenther <richard.guenther@gmail.com> writes:
>> On Fri, Oct 5, 2012 at 1:24 PM, Richard Sandiford
>> <rdsandiford@googlemail.com> wrote:
>>> Richard Guenther <richard.guenther@gmail.com> writes:
>>>> The issue is that unlike RTL where we "construct" double-ints from
>>>> CONST_INT/CONST_DOUBLE right now, tree has the double-int
>>>> _embedded_.  That's why I say that the thing we embed in
>>>> a CONST_WIDE_INT or a tree INTEGER_CST needs to be
>>>> "bare metal", and that's what I would call wide-int.
>>>
>>> OK, but that's deliberately not what Kenny's patch calls "wide int".
>>> The whole idea is that the "bare metal" thing we embed in a
>>> CONST_WIDE_INT or tree isn't (and doesn't need to be) the same
>>> as the type that we use to operate on integers.  That bare-metal
>>> thing doesn't even have to have a fixed width.  (CONST_WIDE_INT
>>> doesn't have a fixed width, it's only as big as the integer
>>> needs it to be.)
>>
>> Ok, let's rephrase it this way then: the "bare metal" thing used
>> for the storage should ideally be the same in the tree IL and the RTL
>> IL _and_ the higher-level abstract wide-int.
>
> Hmm, OK, that's a straight disagreement then.
>
>>>> So to me wide-ints provide the higher-level abstraction on top of
>>>> double-ints (which would remain the storage entity).
>>>>
>>>> Such a higher-level abstraction is very useful, also for double-ints and
>>>> also on the tree level.  There is no requirement to provide a bigger
>>>> double-int (or wide-int) for this.  Let's call this abstraction
>>>> wide-int (as opposed to my suggested more piecemeal double_sint,
>>>> double_uint).  You can perfectly well model it on top of the existing
>>>> double_int storage.
>>>>
>>>> As of providing larger "double-ints" - there is not much code left
>>>> (ok, quite an overstatement ;)) that relies on the implementation
>>>> detail of double-int containing exactly two HOST_WIDE_INTs.
>>>> The exact purpose of double-ints was to _hide_ this (previously
>>>> we only had routines like mul_double_with_sign which take
>>>> two HOST_WIDE_INT components).  Most code dealing with
>>>> the implementation detail is code interfacing with middle-end
>>>> routines that take a HOST_WIDE_INT argument (thus the
>>>> double_int_fits_hwi_p predicates) - even wide_int has to support
>>>> this kind of interfacing.
>>>>
>>>> So, after introducing wide_int that just wraps double_int and
>>>> changing all user code (hopefully all, I guess mostly all), we'd
>>>> tackle the issue that the size of double_ints is host-dependent.
>>>> A simple solution is to make its size dependent on a target macro
>>>> (yes, macro ...), so on a 32bit HWI host targeting a 64bit 'HWI' target
>>>> you'd simply have four HWIs in the 'double-int' storage (and
>>>> arithmetic needs to be generalized to support this).
>>>
>>> I just don't see why this is a good thing.  The constraints
>>> are different when storing integers and when operating on them.
>>> When operating on them, we want something that is easily constructed
>>> on the stack, so we can create temporary structures very cheaply,
>>> and can easily pass and return them.  We happen to know at GCC's
>>> compile time how big the biggest integer will ever be, so it makes
>>> sense for wide_int to be that wide.
>>
>> I'm not arguing against this.  I'm just saying that the internal
>> representation will depend on the host - not the number of total
>> bits, but the number of pieces.
>
> Sure, but Kenny already has a macro to specify how many bits we need
> (MAX_BITSIZE_MODE_ANY_INT).  We can certainly wrap:
>
>   HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
>
> in a typedef if that's what you prefer.

I'd prefer it to be initially double_int, and later "fixed" to double_int
with a member like the above.  Possibly renamed as well.

>>> But when storing integers we want something compact.  If your
>>> machine supports 256-bit integers, but the code you're compiling
>>> makes heavy use of 128-bit integers, why would you want to waste
>>> 128 of 256 bits on every stored integer?  Which is why even
>>> CONST_WIDE_INT doesn't have a fixed width.
>>>
>>> You seem to be arguing that the type used to store integers in the IR
>>> has to be the same as the one that we use when performing compile-time
>>> arithmetic, but I think it's better to have an abstraction between
>>> the two.
>>
>> Well, you don't want to pay the cost of dividing 256-bit numbers all
>> the time when most of your numbers are only 128-bit.  So we don't
>> really want to perform compile-time arithmetic on the biggest
>> possible precision either.
>
> wide_int doesn't perform them in the biggest possible precision.
> It performs them in the precision of the result.  It just happens
> to have enough storage to cope with all possible precisions (because
> the actual precision of the result is usually only known at GCC's runtime).
>
>>> So if you think a pair of HWIs continues to be a useful way of
>>> storing integers at the tree level, then we can easily continue
>>> to use a pair of HWIs there.
>>
>> How do you store larger ints there then?
>
> You'd need another tree code for wider integers.  I'm not saying that's
> a good idea, I just wasn't sure if it's what you wanted.

Uh, ok.

>> How is CONST_WIDE_INT variable size?
>
> It's just the usual trailing variable-length array thing.

Good.  Do you get rid of CONST_DOUBLE (for integers) at the same time?

>> Why can wide-int not be variable-size?
>
> Because variable-length arrays are usually more expensive
> than (still fairly small) fixed-length arrays when dealing
> with temporaries.

which is why I'd have made it a template (but I admit I didn't fully
investigate all issues).  Something like

template <unsigned bits>  (or bytes?)
struct wide_int {
  unsigned short precision;
  HOST_WIDE_INT val[(bits + HOST_BITS_PER_WIDE_INT - 1)
                    / HOST_BITS_PER_WIDE_INT];
};

so it would not be variable-size at compile-time but we'd be able
to constrain its maximum size.  That's important for things like
CCP, which keeps a lattice of ints and maybe does not want to care
about tracking your new 4096-bit integers.
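
E.g. an illustrative instantiation of the template above:

/* A pass could pick its own bound, independent of the largest
   target mode.  */
typedef wide_int<2 * HOST_BITS_PER_WIDE_INT> ccp_lattice_int;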

But maybe we don't really care.

>>> Or if you'd prefer every tree integer
>>> to be the same width as a wide_int, we can do that too.  (I don't
>>> know what Kenny's tree patch does.)
>>
>> Kenny's patch truncates wide-ints to two HWIs in wide-int-to-tree
>> without any checking (I claimed it's a bug, Kenny says it's a feature).
>
> Only because he split the rtl and tree parts up.  By "Kenny's tree patch",
> I meant the patch that he said he was going to send (part 4 as he called it).
>
> Until then, we're not regressing at the tree level, and I think
> the patch is a genuine RTL improvement on its own.
>
>>> Another advantage of abstracting away the storage type is that
>>> we could store things like an overflow flag in the wide_int
>>> (if that ever proves useful) without worrying about the impact
>>> on the tree and RTL footprint.
>>
>> Sure.
>>
>> We seem to be talking in circles.  You don't seem to want (or care)
>> about a common storage for tree, RTL and wide-int.  You seem to
>> care about that operate-on-wide-ints thing.  I am saying if you keep
>> double-ints as-is you create two similar things which should be one thing.
>
> The idea is to get rid of double_ints altogether.  They shouldn't have
> any use once everything has been converted to wide_ints.
>
>> So, make double-int storage only.
>
> The idea was to treat them as legacy and get rid of them as soon
> as we can.
>
>> What I don't see is that the patch just introduces wide-int as a type
>> to do compile-time math on in RTL.  It does more (the patch is large
>> and I didn't read it in detail).
>
> Yeah, it introduces things like the CONST_SCALAR_INT_P abstraction.
> But I actually find the patch easier to review like that, because both
> changes are affecting the same kinds of place.
>
>> It doesn't even try to address the same on the tree level.
>
> Because as Kenny's already said, that's a separate patch.
>
>> It doesn't address the fundamental issue of double-int size being host
>> dependent.
>
> Because the plan is to get rid of it :-)
>
> * Trivial, but it has the wrong name.
>
> * It has the wrong interface for general-precision arithmetic because
>   it doesn't say how wide the value stored (or to be stored) in those
>   HOST_WIDE_INTs actually is.  E.g. there's no such thing as an "X-bit
>   add followed by an X-bit division".  You have to do a "double_int-wide
>   add", followed by a truncation/extension to X bits, followed by a
>   "double_int-wide division", followed by another truncation/extension
>   to X bits.  Which we don't do in RTL; we just assume or (occasionally)
>   assert that we only use double_int for modes whose precisions are
>   exactly 2 * HOST_BITS_PER_WIDE_INT.
>
> * Using a fixed-length array of HOST_WIDE_INTs is too inflexible
>   for size-conscious IRs, so just bumping its size probably isn't
>   a good idea for them.  But having a fixed-length array does
>   IMO make sense for temporaries.
>
> * If we made it storage-only, it doesn't need all those operations.
>
> Which is why I agree with Kenny that double_int as it exists today
> isn't the right starting point.
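
To make the quoted second point concrete, the truncate/extend dance
would look something like this (an editorial sketch; the double_int
method names follow the current double-int.h, but treat the exact
signatures as an assumption):

  /* An "X-bit add followed by an X-bit signed division" done via
     double_int: every full-width operation must be re-truncated.  */
  double_int
  xbit_add_then_sdiv (double_int a, double_int b, double_int c, unsigned x)
  {
    double_int t = (a + b).sext (x);  /* double_int-wide add, trim to X.  */
    t = t.sdiv (c, TRUNC_DIV_EXPR);   /* double_int-wide division.  */
    return t.sext (x);                /* ...and trim again.  */
  }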

Ok, I see where you are going.  Let me look at the patch again.

Richard.

> Richard
Richard Sandiford Oct. 5, 2012, 1:10 p.m. UTC | #24
Richard Guenther <richard.guenther@gmail.com> writes:
> On Fri, Oct 5, 2012 at 2:26 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Richard Guenther <richard.guenther@gmail.com> writes:
>>> On Fri, Oct 5, 2012 at 1:24 PM, Richard Sandiford
>>> <rdsandiford@googlemail.com> wrote:
>>>> Richard Guenther <richard.guenther@gmail.com> writes:
>>>>> The issue is that unlike RTL where we "construct" double-ints from
>>>>> CONST_INT/CONST_DOUBLE right now, tree has the double-int
>>>>> _embedded_.  That's why I say that the thing we embed in
>>>>> a CONST_WIDE_INT or a tree INTEGER_CST needs to be
>>>>> "bare metal", and that's what I would call wide-int.
>>>>
>>>> OK, but that's deliberately not what Kenny's patch calls "wide int".
>>>> The whole idea is that the "bare metal" thing we embed in a
>>>> CONST_WIDE_INT or tree isn't (and doesn't need to be) the same
>>>> as the type that we use to operate on integers.  That bare-metal
>>>> thing doesn't even have to have a fixed width.  (CONST_WIDE_INT
>>>> doesn't have a fixed width, it's only as big as the integer
>>>> needs it to be.)
>>>
>>> Ok, let's rephrase it this way then: the "bare metal" thing used
>>> for the storage should ideally be the same in the tree IL and the RTL
>>> IL _and_ the higher-level abstract wide-int.
>>
>> Hmm, OK, that's a straight disagreement then.
>>
>>>>> So to me wide-ints provide the higher-level abstraction ontop of
>>>>> double-ints (which would remain the storage entity).
>>>>>
>>>>> Such higher-level abstraction is very useful, also for double-ints and
>>>>> also on the tree level.  There is no requirement to provide bigger
>>>>> double-int (or wide-int) for this.  Let's call this abstraction
>>>>> wide-int (as opposed to my suggested more piecemeal double_sint,
>>>>> double_uint).  You can perfectly model it ontop of the existing
>>>>> double_int storage.
>>>>>
>>>>> As of providing larger "double-ints" - there is not much code left
>>>>> (ok, quite an overstatement ;)) that relies on the implementation
>>>>> detail of double-int containing exactly two HOST_WIDE_INTs.
>>>>> The exact purpose of double-ints was to _hide_ this (previously
>>>>> we only had routines like mul_double_with_sign which take
>>>>> two HOST_WIDE_INT components).  Most code dealing with
>>>>> the implementation detail is code interfacing with middle-end
>>>>> routines that take a HOST_WIDE_INT argument (thus the
>>>>> double_int_fits_hwi_p predicates) - even wide_int has to support
>>>>> this kind of interfacing.
>>>>>
>>>>> So, after introducing wide_int that just wraps double_int and
>>>>> changing all user code (hopefully all, I guess mostly all), we'd
>>>>> tackle the issue that the size of double_int's is host dependent.
>>>>> A simple solution is to make its size dependent on a target macro
>>>>> (yes, macro ...), so on a 32bit HWI host targeting a 64bit 'HWI' target
>>>>> you'd simply have four HWIs in the 'double-int' storage (and
>>>>> arithmetic needs to be generalized to support this).
>>>>
>>>> I just don't see why this is a good thing.  The constraints
>>>> are different when storing integers and when operating on them.
>>>> When operating on them, we want something that is easily constructed
>>>> on the stack, so we can create temporary structures very cheaply,
>>>> and can easily pass and return them.  We happen to know at GCC's
>>>> compile time how big the biggest integer will ever be, so it makes
>>>> sense for wide_int to be that wide.
>>>
>>> I'm not arguing against this.  I'm just saying that the internal
>>> representation will depend on the host - not the number of total
>>> bits, but the number of pieces.
>>
>> Sure, but Kenny already has a macro to specify how many bits we need
>> (MAX_BITSIZE_MODE_ANY_INT).  We can certainly wrap:
>>
>>   HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
>>
>> in a typedef if that's what you prefer.
>
> I'd prefer it to be initially double_int, and later "fixed" to double_int
> with a member like the above.  Possibly renamed as well.

I'd like to avoid that.  The current double_int code really isn't useful
to the patch.  double_int doesn't have the right representation because
it uses a low/high pair (of slightly different types) rather than a
parameterised array.  The operators don't have the right interface,
for the reasons I described later.  They don't have the right
implementation because they're specific to HWI pairs rather than
parameterised based on the number of HWIs.  Once you change the name,
the internal representation, the operator interface, and the operator
implementation, there isn't a lot left to keep.
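
For reference, the two layouts being contrasted, side by side (the
double_int one is today's; the wide_int one follows the description in
this thread, with hypothetical member names):

  /* Today's double_int: exactly two pieces, of slightly different
     types.  */
  struct double_int
  {
    unsigned HOST_WIDE_INT low;
    HOST_WIDE_INT high;
  };

  /* wide_int as discussed: a host-parameterised array plus a length
     and the mode that supplies precision and bitsize.  */
  struct wide_int
  {
    HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
    unsigned short len;
    enum machine_mode mode;
  };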

>>> How is CONST_WIDE_INT variable size?
>>
>> It's just the usual trailing variable-length array thing.
>
> Good.  Do you get rid of CONST_DOUBLE (for integers) at the same time?

Yeah.  I initially thought it might be OK to keep them and have
CONST_INT, integer CONST_DOUBLEs and CONST_WIDE_INT alongside
each other.  (The way the patch is structured means that the
choice of whether to keep integer CONST_DOUBLEs can be changed
very easily.)  But Kenny convinced me it was a bad idea.

Richard
Kenneth Zadeck Oct. 5, 2012, 1:11 p.m. UTC | #25
richi,

Let me address a bunch of issues that are randomly spread through the thread.

1) Unlike the double-int's current relationship to the int cst, we do
not currently wrap a wide-int into a CONST_WIDE_INT, nor (in the patch
that you have not seen) do we wrap a wide-int into the int cst.
Wide-ints are designed to be short lived.   With the exception of the
VRP pass, they are almost always allocated on the stack and thrown away
quickly.    That way we can afford to grab a big buffer and play with it.

The rep that lives in the CONST_WIDE_INT and the int cst is a garbage 
collected array of HWIs that is only large enough to hold the actual 
value, and it is almost always an array of length 1.  This saves space 
and works because they are immutable.

This means that there is almost no storage bloat by going with this 
representation no matter how large the widest integer is on the platform.
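
For example, with the macros from the patch below, the TImode constant
1 << 64 (on a host with 64-bit HWIs) takes exactly two elements, no
matter how wide the target's widest mode is.  This is a sketch of the
layout only; real code would go through immed_wide_int_const:

  rtx x = const_wide_int_alloc (2);
  HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (x), 2);
  CONST_WIDE_INT_ELT (x, 0) = 0;  /* least significant HWI */
  CONST_WIDE_INT_ELT (x, 1) = 1;  /* most significant HWI */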

2) Wide-int is really just a generalized version of double-int.  The ABI 
is different because it does not have all of those calls that take two 
HWIs explicitly, and because I was building it all at once, looking both 
at the tree and RTL levels, it has a richer set of calls.   Aside from 
carrying a mode around that it uses for precision and bitsize, it is 
really bare metal.   My problem is that I am still working on getting 
the tree-level stuff working, and phase 1 is closing soon and I wanted 
to get my foot in the door.

3) I think that everyone agrees that having CONST_DOUBLE and CONST_INT 
not carrying around modes was a big mistake.   I toyed with the idea for 
a nanosecond of fixing that along with this, but as you point out the 
patches are already way too large and it would have meant that the 
burden of converting the ports would be larger.   BUT, as the note in 
the original email said, this patch moves significantly in the 
direction of being able to have a mode for the integers at the RTL 
level.    It does this by no longer storing integer values in 
CONST_DOUBLEs, so there is no longer the test of VOIDmode to see what 
the CONST_DOUBLE contains.    Also the patch contains a significant 
number of changes (in the machine-independent RTL code) to the 
constructors that carry the mode, rather than using GEN_INT.

4) Having the mode (or at least the precision and bitsize) is crucial 
for the performance of wide-int.   All of the interesting functions have 
quick-out tests that use the bare-metal operators if the precision fits 
in a HWI.  This way, 99% of the time, we are just doing C operators with 
a simple check.
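
For concreteness, a minimal sketch of such a quick-out (the function
and member names here are hypothetical, not the patch's exact
interface):

  /* The single-HWI fast path: when the precision fits in one
     HOST_WIDE_INT, use the plain C operator and sign-extend back to
     the precision; only wider values take the multiword loop.  */
  wide_int
  wide_int::add (const wide_int &b) const
  {
    wide_int result = *this;
    if (precision <= HOST_BITS_PER_WIDE_INT)
      result.val[0] = sext (val[0] + b.val[0], precision);
    else
      result = add_large (b);  /* hypothetical multiword helper */
    return result;
  }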

kenny


On 10/05/2012 07:41 AM, Richard Guenther wrote:
> On Fri, Oct 5, 2012 at 1:24 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Richard Guenther <richard.guenther@gmail.com> writes:
>>> On Fri, Oct 5, 2012 at 11:55 AM, Richard Sandiford
>>> <rdsandiford@googlemail.com> wrote:
>>>> Richard Guenther <richard.guenther@gmail.com> writes:
>>>>>> As far as the wide_ints recording a mode or precision goes: we're in
>>>>>> the "lucky" position of having tried both options.  Trees record the
>>>>>> type (and thus precision) of all compile-time integer constants.
>>>>>> RTL doesn't.  And the RTL way is a right pain.  It leads to interfaces
>>>>>> in which rtx arguments often have to be accompanied by a mode.  And it
>>>>>> leads to clunky differences like those between simplify_unary_operation
>>>>>> (which takes two mode arguments) and simplify_binary_operation
>>>>>> (which with our current set of binary operations requires only one).
>>>>> But the issue here is not that double_int doesn't record a mode
>>>>> or precision (or a sign).  The issue is that CONST_DOUBLE and CONST_INT
>>>>> don't!  The _tree_ INTEGER_CST consists of a type and a double-int.
>>>>> I see double-int (and the proposed wide-int) as a common building-block
>>>>> used for kind of "arbitrary precision" (as far as the target needs) integer
>>>>> constants on both tree and RTL.  And having a common worker implementation
>>>>> requires it to operate on something different than tree types or RTL mode
>>>>> plus signedness.
>>>>>
>>>>>> To put it another way: every wide_int operation needs to know
>>>>>> the precision of its arguments and the precision of its result.
>>>>> That's not true.  Every tree or RTL operation does, not every
>>>>> wide_int operation.  double_int's are just twos-complement numbers
>>>>> of precision 2 * HOST_BITS_PER_WIDE_INT.  wide_int's should
>>>>> be just twos-complement numbers of precision len *
>>>>> WHATEVER_HOST_COMPONENT_TYPE_IS_SUITABLE_FOR_A_FAST_IMPLEMENTATION.
>>>>> Operations on double_int and wide_int are bare-metal;
>>>>> in nearly all cases you should use routines that do
>>>>> proper sign-/zero-extending to a possibly smaller precision.  When
>>>>> using bare-metal you are supposed to do that yourself.
>>>>>
>>>>> Which is why I was suggesting to add double_sint, double_uint,
>>>>> double_sintp (with precision), etc., wrappers and operations.
>>>>>
>>>>>> Not storing the precision in the wide_int is putting the onus
>>>>>> on the caller to keep track of the precision separately.
>>>>> But that's a matter of providing something high-level ontop of
>>>>> the bare-metal double-int/wide-int (something shared between
>>>>> RTL and trees).  Duplicating information in the bare-metal
>>>>> doesn't make sense (and no, INTEGER_CSTs on the tree level
>>>>> are _not_ short-lived, and how can a double-int make sense on
>>>>> the tree level when you say it doesn't make sense on the RTL level?)
>>>> I think we're talking at cross purposes here.  To the extent that
>>>> I'm not really sure where to begin. :-)  Just in case this is it:
>>>> the idea is that wide_int is the type used to _process_ integers.
>>>> It is not supposed to be the type used to store integers in the IRs.
>>>> The way trees store integers and the way that RTL stores integers
>>>> are up to them.  For RTL the idea is that we continue to store integers
>>>> that happen to fit into a HWI as a CONST_INT (regardless of the size of
>>>> the CONST_INT's context-determined mode).  Wider integers are stored
>>>> as a CONST_DOUBLE (for unconverted targets) or a CONST_WIDE_INT
>>>> (for converted targets).  None of the three use the wide_int type;
>>>> they use more compact representations instead.  And as Kenny says,
>>>> using CONST_INT continues to be an important way of reducing the
>>>> IR footprint.
>>>>
>>>> Whenever we want to do something interesting with the integers,
>>>> we construct a wide_int for them and use the same processing code
>>>> for all three rtx types.  This avoids having the separate single-HWI
>>>> and double-HWI paths that we have now.  It also copes naturally with
>>>> cases where we start out with single-HWI values but end up with wider
>>>> ones.
>>>>
>>>> But the operations that we do on these wide_ints will all be to a mode
>>>> or precision.  Shifts of QImode integers are different from shifts of
>>>> HImode integers, etc.
>>>>
>>>> If you knew all that already (you probably did) and I've completely
>>>> missed the point, please say so. :-)
>>>>
>>>> I'm not sure what you mean by "bare metal".
>>> The issue is that unlike RTL where we "construct" double-ints from
>>> CONST_INT/CONST_DOUBLE right now, tree has the double-int
>>> _embedded_.  That's why I say that the thing we embed in
>>> a CONST_WIDE_INT or a tree INTEGER_CST needs to be
>>> "bare metal", and that's what I would call wide-int.
>> OK, but that's deliberately not what Kenny's patch calls "wide int".
>> The whole idea is that the "bare metal" thing we embed in a
>> CONST_WIDE_INT or tree isn't (and doesn't need to be) the same
>> as the type that we use to operate on integers.  That bare-metal
>> thing doesn't even have to have a fixed width.  (CONST_WIDE_INT
>> doesn't have a fixed width, it's only as big as the integer
>> needs it to be.)
> Ok, let's rephrase it this way then: the "bare metal" thing used
> for the storage should ideally be the same in the tree IL and the RTL
> IL _and_ the higher-level abstract wide-int.
>
>>> So to me wide-ints provide the higher-level abstraction ontop of
>>> double-ints (which would remain the storage entity).
>>>
>>> Such higher-level abstraction is very useful, also for double-ints and
>>> also on the tree level.  There is no requirement to provide bigger
>>> double-int (or wide-int) for this.  Let's call this abstraction
>>> wide-int (as opposed to my suggested more piecemeal double_sint,
>>> double_uint).  You can perfectly model it ontop of the existing
>>> double_int storage.
>>>
>>> As of providing larger "double-ints" - there is not much code left
>>> (ok, quite an overstatement ;)) that relies on the implementation
>>> detail of double-int containing exactly two HOST_WIDE_INTs.
>>> The exact purpose of double-ints was to _hide_ this (previously
>>> we only had routines like mul_double_with_sign which take
>>> two HOST_WIDE_INT components).  Most code dealing with
>>> the implementation detail is code interfacing with middle-end
>>> routines that take a HOST_WIDE_INT argument (thus the
>>> double_int_fits_hwi_p predicates) - even wide_int has to support
>>> this kind of interfacing.
>>>
>>> So, after introducing wide_int that just wraps double_int and
>>> changing all user code (hopefully all, I guess mostly all), we'd
>>> tackle the issue that the size of double_int's is host dependent.
>>> A simple solution is to make its size dependent on a target macro
>>> (yes, macro ...), so on a 32bit HWI host targeting a 64bit 'HWI' target
>>> you'd simply have four HWIs in the 'double-int' storage (and
>>> arithmetic needs to be generalized to support this).
>> I just don't see why this is a good thing.  The constraints
>> are different when storing integers and when operating on them.
>> When operating on them, we want something that is easily constructed
>> on the stack, so we can create temporary structures very cheaply,
>> and can easily pass and return them.  We happen to know at GCC's
>> compile time how big the biggest integer will ever be, so it makes
>> sense for wide_int to be that wide.
> I'm not arguing against this.  I'm just saying that the internal
> representation will depend on the host - not the number of total
> bits, but the number of pieces.
>
>> But when storing integers we want something compact.  If your
>> machine supports 256-bit integers, but the code you're compiling
>> makes heavy use of 128-bit integers, why would you want to waste
>> 128 of 256 bits on every stored integer?  Which is why even
>> CONST_WIDE_INT doesn't have a fixed width.
>>
>> You seem to be arguing that the type used to store integers in the IR
>> has to be same as the one that we use when performing compile-time
>> arithmetic, but I think it's better to have an abstraction between
>> the two.
> Well, you don't want to pay the cost of dividing 256-bit numbers all
> the time when most of your numbers are only 128-bit.  So we don't
> really want to perform compile-time arithmetic on the biggest
> possible precision either.  Ideally, of course - at the moment
> we have double-ints and what precision we internally use
> is an implementation detail (once it is sufficient precision).
>
>> So if you think a pair of HWIs continues to be a useful way of
>> storing integers at the tree level, then we can easily continue
>> to use a pair of HWIs there.
> How do you store larger ints there then?  How is CONST_WIDE_INT
> variable size?  Why can wide-int not be variable-size?
>
>> Or if you'd prefer every tree integer
>> to be the same width as a wide_int, we can do that too.  (I don't
>> know what Kenny's tree patch does.)
> Kenny's patch truncates wide-ints to two HWIs in wide-int-to-tree
> without any checking (I claimed it's a bug, Kenny says it's a feature).
>
>>   But the whole point of having
>> wide_int as an abstraction is that most code _operating_ on integers
>> doesn't care what the representation is.  It becomes much easier to
>> change that representation to whatever gives the best performance.
> Sure!  And I agree totally with this!  Just don't mix enlarging that thing
> from double_int in the first round of implementation!
>
>> Another advantage of abstracting away the storage type is that
>> we could store things like an overflow flag in the wide_int
>> (if that ever proves useful) without worrying about the impact
>> on the tree and RTL footprint.
> Sure.
>
> We seem to be talking in circles.  You don't seem to want (or care)
> about a common storage for tree, RTL and wide-int.  You seem to
> care about that operate-on-wide-ints thing.  I am saying if you keep
> double-ints as-is you create two similar things which should be one thing.
>
> So, make double-int storage only.  Introduce wide-int and use that
> everywhere we compute on double-ints.  Whether wide-int is variable-size
> or not is an implementation detail (if it's not we simply ICE if you require
> a too large wide-int).  Whether IL storage is variable-size or not is an
> implementation detail as well.
>
> What I don't see is that the patch just introduces wide-int as a type
> to do compile-time math on in RTL.  It does more (the patch is large
> and I didn't read it in detail).  It doesn't even try to address the same
> on the tree level.  It doesn't address the fundamental issue of
> double-int size being host dependent.
>
> Instead it seems to focus on getting even larger constants "work"
> (on RTL level only).
>
> What I'd like to see is _just_ the wide-int abstraction, suitable to
> replace both tree and RTL level compile-time math (without even
> trying to convert all uses, that's just too much noise for a thorough
> review).
>
> Richard.
>
>>> [The more complex solution makes double-int* just a pointer to the first
>>> HWI, so it cannot be used without being "wrapped" in a wide-int
>>> which implicitly would provide the number of HWIs pointed to
>>> (yes, I think variable-sized wide-int storage is the ultimate way to go).]
>>>
>>> So ... can you possibly split the patch in a way that just does the
>>> first thing (provide the high-level wrapper and make use of it from RTL)?
>>> I don't see the reason to have both CONST_DOUBLE and CONST_WIDE,
>>> in fact they should be the same given choice one for the 2nd step.
>>> And you can of course trivially construct a wide-int from a CONST_INT
>>> as well (given it has a mode).
>>>
>>> As of using a mode or precision in wide-int - for re-use on the tree level
>>> you need a precision as not all precisions have a distinct mode.  As on
>>> RTL you don't have a sign we probably don't want wide-ints to have a sign
>>> but continue double-int-like to specify the sign as part of the operation
>>> where it matters.
>> Definitely agree that signedness is a property of the operation.
>> (Saturation too IMO.)  I don't really mind about mode vs. precision.
>>
>> Richard
Richard Sandiford Oct. 5, 2012, 1:18 p.m. UTC | #26
Richard Sandiford <rdsandiford@googlemail.com> writes:
>>>> How is CONST_WIDE_INT variable size?
>>>
>>> It's just the usual trailing variable-length array thing.
>>
>> Good.  Do you get rid of CONST_DOUBLE (for integers) at the same time?
>
> Yeah.  I initially thought it might be OK to keep them and have
> CONST_INT, integer CONST_DOUBLEs and CONST_WIDE_INT alongside
> each other.  (The way the patch is structured means that the
> choice of whether to keep integer CONST_DOUBLEs can be changed
> very easily.)  But Kenny convinced me it was a bad idea.

Sorry to follow up on myself, but to clarify: I was talking about
converted targets here.  (As in, I originally thought even converted
targets could continue to use integer CONST_DOUBLEs.)

Unconverted targets continue to use CONST_DOUBLE.

Richard
Richard Biener Oct. 5, 2012, 1:53 p.m. UTC | #27
On Fri, Oct 5, 2012 at 3:18 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Sandiford <rdsandiford@googlemail.com> writes:
>>>>> How is CONST_WIDE_INT variable size?
>>>>
>>>> It's just the usual trailing variable-length array thing.
>>>
>>> Good.  Do you get rid of CONST_DOUBLE (for integers) at the same time?
>>
>> Yeah.  I initially thought it might be OK to keep them and have
>> CONST_INT, integer CONST_DOUBLEs and CONST_WIDE_INT alongside
>> each other.  (The way the patch is structured means that the
>> choice of whether to keep integer CONST_DOUBLEs can be changed
>> very easily.)  But Kenny convinced me it was a bad idea.
>
> Sorry to follow up on myself, but to clarify: I was talking about
> converted targets here.  (As in, I originally thought even converted
> targets could continue to use integer CONST_DOUBLEs.)
>
> Unconverted targets continue to use CONST_DOUBLE.

Why is it that not all targets are "converted"?  What's the difficulty
with that?
I really do not like partially transitioning there.

Richard.

> Richard
Richard Sandiford Oct. 5, 2012, 2:14 p.m. UTC | #28
Richard Guenther <richard.guenther@gmail.com> writes:
> On Fri, Oct 5, 2012 at 3:18 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Richard Sandiford <rdsandiford@googlemail.com> writes:
>>>>>> How is CONST_WIDE_INT variable size?
>>>>>
>>>>> It's just the usual trailing variable-length array thing.
>>>>
>>>> Good.  Do you get rid of CONST_DOUBLE (for integers) at the same time?
>>>
>>> Yeah.  I initially thought it might be OK to keep them and have
>>> CONST_INT, integer CONST_DOUBLEs and CONST_WIDE_INT alongside
>>> each other.  (The way the patch is structured means that the
>>> choice of whether to keep integer CONST_DOUBLEs can be changed
>>> very easily.)  But Kenny convinced me it was a bad idea.
>>
>> Sorry to follow up on myself, but to clarify: I was talking about
>> converted targets here.  (As in, I originally thought even converted
>> targets could continue to use integer CONST_DOUBLEs.)
>>
>> Unconverted targets continue to use CONST_DOUBLE.
>
> Why is it that not all targets are "converted"?  What's the difficulty
> with that?
> I really do not like partially transitioning there.

The problem is that CONST_DOUBLE as it exists today has two meanings:
a floating-point meaning and an integer meaning.  Ports that handle
CONST_DOUBLEs are aware of this and expect the two things to have
the same rtx code.  Whereas in a converted port, the two things
have different rtx codes, and the integers have a different
representation from the current low/high pair.

So to take the first hit in config/alpha/alpha.c,
namely alpha_rtx_costs:

    case CONST_DOUBLE:
      if (x == CONST0_RTX (mode))
	*total = 0;
      else if ((outer_code == PLUS && add_operand (x, VOIDmode))
	       || (outer_code == AND && and_operand (x, VOIDmode)))
	*total = 0;
      else if (add_operand (x, VOIDmode) || and_operand (x, VOIDmode))
	*total = 2;
      else
	*total = COSTS_N_INSNS (2);
      return true;

What could the patch do to make this work without modification?
The middle two cases are for integers, but the first and last
presumably apply to both integers and floats.
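
For illustration, here is roughly what the conversion itself would
involve (an editorial sketch, not part of the patch):

    case CONST_WIDE_INT:
      /* Integers only: the middle two tests move here.  */
      if ((outer_code == PLUS && add_operand (x, VOIDmode))
          || (outer_code == AND && and_operand (x, VOIDmode)))
        *total = 0;
      else if (add_operand (x, VOIDmode) || and_operand (x, VOIDmode))
        *total = 2;
      else
        *total = COSTS_N_INSNS (2);
      return true;

    case CONST_DOUBLE:
      /* Now floating point only.  */
      if (x == CONST0_RTX (mode))
        *total = 0;
      else
        *total = COSTS_N_INSNS (2);
      return true;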

Richard
Richard Biener Oct. 5, 2012, 2:36 p.m. UTC | #29
On Fri, Oct 5, 2012 at 4:14 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Guenther <richard.guenther@gmail.com> writes:
>> On Fri, Oct 5, 2012 at 3:18 PM, Richard Sandiford
>> <rdsandiford@googlemail.com> wrote:
>>> Richard Sandiford <rdsandiford@googlemail.com> writes:
>>>>>>> How is CONST_WIDE_INT variable size?
>>>>>>
>>>>>> It's just the usual trailing variable-length array thing.
>>>>>
>>>>> Good.  Do you get rid of CONST_DOUBLE (for integers) at the same time?
>>>>
>>>> Yeah.  I initially thought it might be OK to keep them and have
>>>> CONST_INT, integer CONST_DOUBLEs and CONST_WIDE_INT alongside
>>>> each other.  (The way the patch is structured means that the
>>>> choice of whether to keep integer CONST_DOUBLEs can be changed
>>>> very easily.)  But Kenny convinced me it was a bad idea.
>>>
>>> Sorry to follow up on myself, but to clarify: I was talking about
>>> converted targets here.  (As in, I originally thought even converted
>>> targets could continue to use integer CONST_DOUBLEs.)
>>>
>>> Unconverted targets continue to use CONST_DOUBLE.
>>
>> Why is it that not all targets are "converted"?  What's the difficulty
>> with that?
>> I really do not like partially transitioning there.
>
> The problem is that CONST_DOUBLE as it exists today has two meanings:
> a floating-point meaning and an integer meaning.  Ports that handle
> CONST_DOUBLEs are aware of this and expect the two things to have
> the same rtx code.  Whereas in a converted port, the two things
> have different rtx codes, and the integers have a different
> representation from the current low/high pair.
>
> So to take the first hit in config/alpha/alpha.c,
> namely alpha_rtx_costs:
>
>     case CONST_DOUBLE:
>       if (x == CONST0_RTX (mode))
>         *total = 0;
>       else if ((outer_code == PLUS && add_operand (x, VOIDmode))
>                || (outer_code == AND && and_operand (x, VOIDmode)))
>         *total = 0;
>       else if (add_operand (x, VOIDmode) || and_operand (x, VOIDmode))
>         *total = 2;
>       else
>         *total = COSTS_N_INSNS (2);
>       return true;
>
> What could the patch do to make this work without modification?
> The middle two cases are for integers, but the first and last
> presumably apply to both integers and floats.

I didn't say it does not require changes, just that the transition should be
finished.  Some ports have few CONST_DOUBLE uses (which is what
I am grepping for), and if the max-wide-int-size matches that of the
current CONST_DOUBLE there is little chance for code generation
differences (in fact there should be none, correct?).

In the current patch no target is converted (maybe that's just going to
be 5/n?), I'd like to see at least one commonly tested target converted
and if we don't convert all targets another commonly tested target
to stay unconverted (just check gcc-testresults for which pair of targets
that would work).  Definitely at the end of stage3 we should have zero
unconverted targets (but I doubt converting them is a huge task - I have
converted the VECTOR_CST representation as well and we've had
the location + BLOCK merge and other changes affecting all targets).

Richard.

> Richard
Kenneth Zadeck Oct. 5, 2012, 2:41 p.m. UTC | #30
On 10/05/2012 10:36 AM, Richard Guenther wrote:
> On Fri, Oct 5, 2012 at 4:14 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Richard Guenther <richard.guenther@gmail.com> writes:
>>> On Fri, Oct 5, 2012 at 3:18 PM, Richard Sandiford
>>> <rdsandiford@googlemail.com> wrote:
>>>> Richard Sandiford <rdsandiford@googlemail.com> writes:
>>>>>>>> How is CONST_WIDE_INT variable size?
>>>>>>> It's just the usual trailing variable-length array thing.
>>>>>> Good.  Do you get rid of CONST_DOUBLE (for integers) at the same time?
>>>>> Yeah.  I initially thought it might be OK to keep them and have
>>>>> CONST_INT, integer CONST_DOUBLEs and CONST_WIDE_INT alongside
>>>>> each other.  (The way the patch is structured means that the
>>>>> choice of whether to keep integer CONST_DOUBLEs can be changed
>>>>> very easily.)  But Kenny convinced me it was a bad idea.
>>>> Sorry to follow up on myself, but to clarify: I was talking about
>>>> converted targets here.  (As in, I originally thought even converted
>>>> targets could continue to use integer CONST_DOUBLEs.)
>>>>
>>>> Unconverted targets continue to use CONST_DOUBLE.
>>> Why is it that not all targets are "converted"?  What's the difficulty
>>> with that?
>>> I really do not like partially transitioning there.
>> The problem is that CONST_DOUBLE as it exists today has two meanings:
>> a floating-point meaning and an integer meaning.  Ports that handle
>> CONST_DOUBLEs are aware of this and expect the two things to have
>> the same rtx code.  Whereas in a converted port, the two things
>> have different rtx codes, and the integers have a different
>> representation from the current low/high pair.
>>
>> So to take the first hit in config/alpha/alpha.c,
>> namely alpha_rtx_costs:
>>
>>      case CONST_DOUBLE:
>>        if (x == CONST0_RTX (mode))
>>          *total = 0;
>>        else if ((outer_code == PLUS && add_operand (x, VOIDmode))
>>                 || (outer_code == AND && and_operand (x, VOIDmode)))
>>          *total = 0;
>>        else if (add_operand (x, VOIDmode) || and_operand (x, VOIDmode))
>>          *total = 2;
>>        else
>>          *total = COSTS_N_INSNS (2);
>>        return true;
>>
>> What could the patch do to make this work without modification?
>> The middle two cases are for integers, but the first and last
>> presumably apply to both integers and floats.
> I didn't say it does not require changes, just that the transition should be
> finished.  Some ports have few CONST_DOUBLE uses (which is what
> I am grepping for), and if the max-wide-int-size matches that of the
> current CONST_DOUBLE there is little chance for code generation
> differences (in fact there should be none, correct?).
>
> In the current patch no target is converted (maybe that's just going to
> be 5/n?), I'd like to see at least one commonly tested target converted
> and if we don't convert all targets another commonly tested target
> to stay unconverted (just check gcc-testresults for which pair of targets
> that would work).  Definitely at the end of stage3 we should have zero
> unconverted targets (but I doubt converting them is a huge task - I have
> converted the VECTOR_CST representation as well and we've had
> the location + BLOCK merge and other changes affecting all targets).
>
> Richard.
I will convert ppc if that is what it takes.   David's office is 4 aisles 
away and Mike has a lot of experience on ppc also.  (This is unless 
Richard is willing to do MIPS one afternoon.)   I have done my two 
private ports, but I understand that that should not count.
>> Richard
Richard Sandiford Oct. 5, 2012, 2:53 p.m. UTC | #31
Kenneth Zadeck <zadeck@naturalbridge.com> writes:
> I will convert ppc if that is what it takes.   David's office is 4 aisles 
> away and Mike has a lot of experience on ppc also.  (This is unless 
> Richard is willing to do MIPS one afternoon.)

'Fraid MIPS is very boring as far as this is concerned.  MIPS sets
need_64bit_hwint to yes and has no 128-bit registers, so CONST_DOUBLE
is already restricted to floating-point values.

It might be interesting to see what happens without need_64bit_hwint,
but only once double_int has been removed.

Richard

Patch

Index: gcc/reload.c
===================================================================
--- gcc/reload.c	(revision 191978)
+++ gcc/reload.c	(working copy)
@@ -3437,7 +3437,7 @@  find_reloads (rtx insn, int replace, int
 		    break;
 
 		  case 's':
-		    if (CONST_INT_P (operand) || CONST_DOUBLE_AS_INT_P (operand))
+		    if (CONST_SCALAR_INT_P (operand))
 		      break;
 		  case 'i':
 		    if (CONSTANT_P (operand)
@@ -3446,7 +3446,7 @@  find_reloads (rtx insn, int replace, int
 		    break;
 
 		  case 'n':
-		    if (CONST_INT_P (operand) || CONST_DOUBLE_AS_INT_P (operand))
+		    if (CONST_SCALAR_INT_P (operand))
 		      win = 1;
 		    break;
 
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def	(revision 191978)
+++ gcc/rtl.def	(working copy)
@@ -319,6 +319,9 @@  DEF_RTL_EXPR(TRAP_IF, "trap_if", "ee", R
 /* numeric integer constant */
 DEF_RTL_EXPR(CONST_INT, "const_int", "w", RTX_CONST_OBJ)
 
+/* numeric integer constant */
+DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
+
 /* fixed-point constant */
 DEF_RTL_EXPR(CONST_FIXED, "const_fixed", "www", RTX_CONST_OBJ)
 
Index: gcc/ira-costs.c
===================================================================
--- gcc/ira-costs.c	(revision 191978)
+++ gcc/ira-costs.c	(working copy)
@@ -667,7 +667,7 @@  record_reg_classes (int n_alts, int n_op
 		  break;
 
 		case 's':
-		  if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op)) 
+		  if (CONST_SCALAR_INT_P (op)) 
 		    break;
 
 		case 'i':
@@ -677,7 +677,7 @@  record_reg_classes (int n_alts, int n_op
 		  break;
 
 		case 'n':
-		  if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op)) 
+		  if (CONST_SCALAR_INT_P (op)) 
 		    win = 1;
 		  break;
 
@@ -1068,7 +1068,7 @@  record_address_regs (enum machine_mode m
 
 	/* If the second operand is a constant integer, it doesn't
 	   change what class the first operand must be.  */
-	else if (code1 == CONST_INT || code1 == CONST_DOUBLE)
+	else if (CONST_SCALAR_INT_P (arg1))
 	  record_address_regs (mode, as, arg0, context, PLUS, code1, scale);
 	/* If the second operand is a symbolic constant, the first
 	   operand must be an index register.  */
Index: gcc/dojump.c
===================================================================
--- gcc/dojump.c	(revision 191978)
+++ gcc/dojump.c	(working copy)
@@ -144,6 +144,7 @@  static bool
 prefer_and_bit_test (enum machine_mode mode, int bitnum)
 {
   bool speed_p;
+  wide_int mask = wide_int::set_bit_in_zero (bitnum, mode);
 
   if (and_test == 0)
     {
@@ -164,8 +165,7 @@  prefer_and_bit_test (enum machine_mode m
     }
 
   /* Fill in the integers.  */
-  XEXP (and_test, 1)
-    = immed_double_int_const (double_int_zero.set_bit (bitnum), mode);
+  XEXP (and_test, 1) = immed_wide_int_const (mask);
   XEXP (XEXP (shift_test, 0), 1) = GEN_INT (bitnum);
 
   speed_p = optimize_insn_for_speed_p ();
Index: gcc/recog.c
===================================================================
--- gcc/recog.c	(revision 191978)
+++ gcc/recog.c	(working copy)
@@ -586,8 +586,7 @@  simplify_while_replacing (rtx *loc, rtx
 			 (PLUS, GET_MODE (x), XEXP (x, 0), XEXP (x, 1)), 1);
       break;
     case MINUS:
-      if (CONST_INT_P (XEXP (x, 1))
-	  || CONST_DOUBLE_AS_INT_P (XEXP (x, 1)))
+      if (CONST_SCALAR_INT_P (XEXP (x, 1)))
 	validate_change (object, loc,
 			 simplify_gen_binary
 			 (PLUS, GET_MODE (x), XEXP (x, 0),
@@ -1146,6 +1145,86 @@  const_int_operand (rtx op, enum machine_
   return 1;
 }
 
+#if TARGET_SUPPORTS_WIDE_INT
+/* Returns 1 if OP is an operand that is a CONST_INT or CONST_WIDE_INT.  */
+int
+const_scalar_int_operand (rtx op, enum machine_mode mode)
+{
+  if (!CONST_WIDE_INT_P (op))
+    return 0;
+
+  if (mode != VOIDmode)
+    {
+      int prec = GET_MODE_PRECISION (mode);
+      int bitsize = GET_MODE_BITSIZE (mode);
+      
+      if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize)
+	return 0;
+      
+      if (prec == bitsize)
+	return 1;
+      else
+	{
+	  /* Multiword partial int.  */
+	  HOST_WIDE_INT x 
+	    = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
+	  return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1))
+		  == x);
+	}
+    }
+  return 1;
+}
+
+/* Returns 1 if OP is an operand that is a CONST_WIDE_INT.  */
+int
+const_wide_int_operand (rtx op, enum machine_mode mode)
+{
+  switch (GET_CODE (op)) 
+    {
+    case CONST_INT:
+      if (mode != VOIDmode
+	  && trunc_int_for_mode (INTVAL (op), mode) != INTVAL (op))
+	return 0;
+      return 1;
+
+    case CONST_WIDE_INT:
+      if (mode != VOIDmode)
+	{
+	  int prec = GET_MODE_PRECISION (mode);
+	  int bitsize = GET_MODE_BITSIZE (mode);
+
+	  if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize)
+	    return 0;
+
+	  if (prec == bitsize)
+	    return 1;
+	  else
+	    {
+	      /* Multiword partial int.  */
+	      HOST_WIDE_INT x
+		= CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
+	      return (wide_int::sext (x,
+				      prec & (HOST_BITS_PER_WIDE_INT - 1))
+		      == x);
+	    }
+	}
+      return 1;
+
+    default:
+      return 0;
+    }
+}
+
+/* Returns 1 if OP is an operand that is a constant integer or constant
+   floating-point number.  */
+
+int
+const_double_operand (rtx op, enum machine_mode mode)
+{
+  return (GET_CODE (op) == CONST_DOUBLE
+	  && (GET_MODE (op) == mode || mode == VOIDmode));
+}
+#else
 /* Returns 1 if OP is an operand that is a constant integer or constant
    floating-point number.  */
 
@@ -1163,7 +1242,7 @@  const_double_operand (rtx op, enum machi
 	  && (mode == VOIDmode || GET_MODE (op) == mode
 	      || GET_MODE (op) == VOIDmode));
 }
-
+#endif
 /* Return 1 if OP is a general operand that is not an immediate operand.  */
 
 int
@@ -1726,7 +1805,7 @@  asm_operand_ok (rtx op, const char *cons
 	  break;
 
 	case 's':
-	  if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op))
+	  if (CONST_SCALAR_INT_P (op))
 	    break;
 	  /* Fall through.  */
 
@@ -1736,7 +1815,7 @@  asm_operand_ok (rtx op, const char *cons
 	  break;
 
 	case 'n':
-	  if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op))
+	  if (CONST_SCALAR_INT_P (op))
 	    result = 1;
 	  break;
 
@@ -2591,7 +2670,7 @@  constrain_operands (int strict)
 		break;
 
 	      case 's':
-		if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op))
+		if (CONST_SCALAR_INT_P (op))
 		  break;
 	      case 'i':
 		if (CONSTANT_P (op))
@@ -2599,7 +2678,7 @@  constrain_operands (int strict)
 		break;
 
 	      case 'n':
-		if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op))
+		if (CONST_SCALAR_INT_P (op))
 		  win = 1;
 		break;
 
Index: gcc/rtl.c
===================================================================
--- gcc/rtl.c	(revision 191978)
+++ gcc/rtl.c	(working copy)
@@ -111,7 +111,7 @@  const enum rtx_class rtx_class[NUM_RTX_C
 const unsigned char rtx_code_size[NUM_RTX_CODE] = {
 #define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS)				\
   (((ENUM) == CONST_INT || (ENUM) == CONST_DOUBLE			\
-    || (ENUM) == CONST_FIXED)						\
+    || (ENUM) == CONST_FIXED || (ENUM) == CONST_WIDE_INT)		\
    ? RTX_HDR_SIZE + (sizeof FORMAT - 1) * sizeof (HOST_WIDE_INT)	\
    : RTX_HDR_SIZE + (sizeof FORMAT - 1) * sizeof (rtunion)),
 
@@ -183,18 +183,24 @@  shallow_copy_rtvec (rtvec vec)
 unsigned int
 rtx_size (const_rtx x)
 {
+  if (GET_CODE (x) == CONST_WIDE_INT)
+    return (RTX_HDR_SIZE
+	    + sizeof (struct hwivec_def)
+	    + ((CONST_WIDE_INT_NUNITS (x) - 1)
+	       * sizeof (HOST_WIDE_INT)));
   if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_HAS_BLOCK_INFO_P (x))
     return RTX_HDR_SIZE + sizeof (struct block_symbol);
   return RTX_CODE_SIZE (GET_CODE (x));
 }
 
-/* Allocate an rtx of code CODE.  The CODE is stored in the rtx;
-   all the rest is initialized to zero.  */
+/* Allocate an rtx of code CODE with EXTRA bytes in it.  The CODE is
+   stored in the rtx; all the rest is initialized to zero.  */
 
 rtx
-rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL)
+rtx_alloc_stat_v (RTX_CODE code MEM_STAT_DECL, int extra)
 {
-  rtx rt = ggc_alloc_zone_rtx_def_stat (&rtl_zone, RTX_CODE_SIZE (code)
+  rtx rt = ggc_alloc_zone_rtx_def_stat (&rtl_zone,
+					RTX_CODE_SIZE (code) + extra
                                         PASS_MEM_STAT);
 
   /* We want to clear everything up to the FLD array.  Normally, this
@@ -213,6 +219,29 @@  rtx_alloc_stat (RTX_CODE code MEM_STAT_D
   return rt;
 }
 
+/* Allocate an rtx of code CODE.  The CODE is stored in the rtx;
+   all the rest is initialized to zero.  */
+
+rtx
+rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL)
+{
+  return rtx_alloc_stat_v (code PASS_MEM_STAT, 0);
+}
+
+/* Write the wide constant OP0 to OUTFILE.  */
+
+void
+hwivec_output_hex (FILE *outfile, const_hwivec op0)
+{
+  int i = HWI_GET_NUM_ELEM (op0);
+  gcc_assert (i > 0);
+  if (XHWIVEC_ELT (op0, i-1) == 0)
+    fprintf (outfile, "0x");
+  fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, XHWIVEC_ELT (op0, --i));
+  while (--i >= 0)
+    fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, XHWIVEC_ELT (op0, i));
+}
+
 
 /* Return true if ORIG is a sharable CONST.  */
 
@@ -427,7 +456,6 @@  rtx_equal_p_cb (const_rtx x, const_rtx y
 	  if (XWINT (x, i) != XWINT (y, i))
 	    return 0;
 	  break;
-
 	case 'n':
 	case 'i':
 	  if (XINT (x, i) != XINT (y, i))
@@ -645,6 +673,10 @@  iterative_hash_rtx (const_rtx x, hashval
       return iterative_hash_object (i, hash);
     case CONST_INT:
       return iterative_hash_object (INTVAL (x), hash);
+    case CONST_WIDE_INT:
+      for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	hash = iterative_hash_object (CONST_WIDE_INT_ELT (x, i), hash);
+      return hash;
     case SYMBOL_REF:
       if (XSTR (x, 0))
 	return iterative_hash (XSTR (x, 0), strlen (XSTR (x, 0)) + 1,
@@ -809,6 +841,16 @@  rtl_check_failed_block_symbol (const cha
 }
 
 /* XXX Maybe print the vector?  */
+void
+hwivec_check_failed_bounds (const_hwivec r, int n, const char *file, int line,
+			    const char *func)
+{
+  internal_error
+    ("RTL check: access of hwi elt %d of vector with last elt %d in %s, at %s:%d",
+     n, GET_NUM_ELEM (r) - 1, func, trim_filename (file), line);
+}
+
+/* XXX Maybe print the vector?  */
 void
 rtvec_check_failed_bounds (const_rtvec r, int n, const char *file, int line,
 			   const char *func)
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	(revision 191978)
+++ gcc/rtl.h	(working copy)
@@ -31,6 +31,7 @@  along with GCC; see the file COPYING3.
 #include "fixed-value.h"
 #include "alias.h"
 #include "hashtab.h"
+#include "wide-int.h"
 #include "flags.h"
 
 /* Value used by some passes to "recognize" noop moves as valid
@@ -252,6 +253,14 @@  struct GTY(()) object_block {
   VEC(rtx,gc) *anchors;
 };
 
+struct GTY((variable_size)) hwivec_def {
+  int num_elem;		/* number of elements */
+  HOST_WIDE_INT elem[1];
+};
+
+#define HWI_GET_NUM_ELEM(HWIVEC)	((HWIVEC)->num_elem)
+#define HWI_PUT_NUM_ELEM(HWIVEC, NUM)	((HWIVEC)->num_elem = (NUM))
+
 /* RTL expression ("rtx").  */
 
 struct GTY((chain_next ("RTX_NEXT (&%h)"),
@@ -344,6 +353,7 @@  struct GTY((chain_next ("RTX_NEXT (&%h)"
     struct block_symbol block_sym;
     struct real_value rv;
     struct fixed_value fv;
+    struct hwivec_def hwiv;
   } GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u;
 };
 
@@ -382,13 +392,13 @@  struct GTY((chain_next ("RTX_NEXT (&%h)"
    for a variable number of things.  The principle use is inside
    PARALLEL expressions.  */
 
+#define NULL_RTVEC (rtvec) 0
+
 struct GTY((variable_size)) rtvec_def {
   int num_elem;		/* number of elements */
   rtx GTY ((length ("%h.num_elem"))) elem[1];
 };
 
-#define NULL_RTVEC (rtvec) 0
-
 #define GET_NUM_ELEM(RTVEC)		((RTVEC)->num_elem)
 #define PUT_NUM_ELEM(RTVEC, NUM)	((RTVEC)->num_elem = (NUM))
 
@@ -398,12 +408,38 @@  struct GTY((variable_size)) rtvec_def {
 /* Predicate yielding nonzero iff X is an rtx for a memory location.  */
 #define MEM_P(X) (GET_CODE (X) == MEM)
 
+#if TARGET_SUPPORTS_WIDE_INT
+
+/* Match CONST_*s that can represent compile-time constant integers.  */
+#define CASE_CONST_SCALAR_INT \
+   case CONST_INT: \
+   case CONST_WIDE_INT
+
+/* Match CONST_*s for which pointer equality corresponds to value 
+   equality.  */
+#define CASE_CONST_UNIQUE \
+   case CONST_INT: \
+   case CONST_WIDE_INT: \
+   case CONST_DOUBLE: \
+   case CONST_FIXED
+
+/* Match all CONST_* rtxes.  */
+#define CASE_CONST_ANY \
+   case CONST_INT: \
+   case CONST_WIDE_INT: \
+   case CONST_DOUBLE: \
+   case CONST_FIXED: \
+   case CONST_VECTOR
+
+#else
+
 /* Match CONST_*s that can represent compile-time constant integers.  */
 #define CASE_CONST_SCALAR_INT \
    case CONST_INT: \
    case CONST_DOUBLE
 
-/* Match CONST_*s for which pointer equality corresponds to value equality.  */
+/* Match CONST_*s for which pointer equality corresponds to value
+   equality.  */
 #define CASE_CONST_UNIQUE \
    case CONST_INT: \
    case CONST_DOUBLE: \
@@ -415,10 +451,17 @@  struct GTY((variable_size)) rtvec_def {
    case CONST_DOUBLE: \
    case CONST_FIXED: \
    case CONST_VECTOR
+#endif
+
+
+
 
 /* Predicate yielding nonzero iff X is an rtx for a constant integer.  */
 #define CONST_INT_P(X) (GET_CODE (X) == CONST_INT)
 
+/* Predicate yielding nonzero iff X is an rtx for a constant integer.  */
+#define CONST_WIDE_INT_P(X) (GET_CODE (X) == CONST_WIDE_INT)
+
 /* Predicate yielding nonzero iff X is an rtx for a constant fixed-point.  */
 #define CONST_FIXED_P(X) (GET_CODE (X) == CONST_FIXED)
 
@@ -430,6 +473,15 @@  struct GTY((variable_size)) rtvec_def {
 #define CONST_DOUBLE_AS_INT_P(X) \
   (GET_CODE (X) == CONST_DOUBLE && GET_MODE (X) == VOIDmode)
 
+/* Predicate yielding true iff X is an rtx for an integer const.  */
+#if TARGET_SUPPORTS_WIDE_INT
+#define CONST_SCALAR_INT_P(X) \
+  (CONST_INT_P (X) || CONST_WIDE_INT_P (X))
+#else
+#define CONST_SCALAR_INT_P(X) \
+  (CONST_INT_P (X) || CONST_DOUBLE_AS_INT_P (X))
+#endif
+
 /* Predicate yielding true iff X is an rtx for a double-int.  */
 #define CONST_DOUBLE_AS_FLOAT_P(X) \
   (GET_CODE (X) == CONST_DOUBLE && GET_MODE (X) != VOIDmode)
@@ -591,6 +643,13 @@  struct GTY((variable_size)) rtvec_def {
 			       __FUNCTION__);				\
      &_rtx->u.hwint[_n]; }))
 
+#define XHWIVEC_ELT(HWIVEC, I) __extension__				\
+(*({ __typeof (HWIVEC) const _hwivec = (HWIVEC); const int _i = (I);	\
+     if (_i < 0 || _i >= HWI_GET_NUM_ELEM (_hwivec))			\
+       hwivec_check_failed_bounds (_hwivec, _i, __FILE__, __LINE__,	\
+				  __FUNCTION__);			\
+     &_hwivec->elem[_i]; }))
+
 #define XCWINT(RTX, N, C) __extension__					\
 (*({ __typeof (RTX) const _rtx = (RTX);					\
      if (GET_CODE (_rtx) != (C))					\
@@ -627,6 +686,11 @@  struct GTY((variable_size)) rtvec_def {
 				    __FUNCTION__);			\
    &_symbol->u.block_sym; })
 
+#define HWIVEC_CHECK(RTX,C) __extension__				\
+({ __typeof (RTX) const _symbol = (RTX);				\
+   RTL_CHECKC1 (_symbol, 0, C);						\
+   &_symbol->u.hwiv; })
+
 extern void rtl_check_failed_bounds (const_rtx, int, const char *, int,
 				     const char *)
     ATTRIBUTE_NORETURN;
@@ -647,6 +711,9 @@  extern void rtl_check_failed_code_mode (
     ATTRIBUTE_NORETURN;
 extern void rtl_check_failed_block_symbol (const char *, int, const char *)
     ATTRIBUTE_NORETURN;
+extern void hwivec_check_failed_bounds (const_rtvec, int, const char *, int,
+					const char *)
+    ATTRIBUTE_NORETURN;
 extern void rtvec_check_failed_bounds (const_rtvec, int, const char *, int,
 				       const char *)
     ATTRIBUTE_NORETURN;
@@ -659,12 +726,14 @@  extern void rtvec_check_failed_bounds (c
 #define RTL_CHECKC2(RTX, N, C1, C2) ((RTX)->u.fld[N])
 #define RTVEC_ELT(RTVEC, I)	    ((RTVEC)->elem[I])
 #define XWINT(RTX, N)		    ((RTX)->u.hwint[N])
+#define XHWIVEC_ELT(HWIVEC, I)	    ((HWIVEC)->elem[I])
 #define XCWINT(RTX, N, C)	    ((RTX)->u.hwint[N])
 #define XCMWINT(RTX, N, C, M)	    ((RTX)->u.hwint[N])
 #define XCNMWINT(RTX, N, C, M)	    ((RTX)->u.hwint[N])
 #define XCNMPRV(RTX, C, M)	    (&(RTX)->u.rv)
 #define XCNMPFV(RTX, C, M)	    (&(RTX)->u.fv)
 #define BLOCK_SYMBOL_CHECK(RTX)	    (&(RTX)->u.block_sym)
+#define HWIVEC_CHECK(RTX,C)	    (&(RTX)->u.hwiv)
 
 #endif
 
@@ -807,8 +876,8 @@  extern void rtl_check_failed_flag (const
 #define XCCFI(RTX, N, C)      (RTL_CHECKC1 (RTX, N, C).rt_cfi)
 #define XCCSELIB(RTX, N, C)   (RTL_CHECKC1 (RTX, N, C).rt_cselib)
 
-#define XCVECEXP(RTX, N, M, C)	RTVEC_ELT (XCVEC (RTX, N, C), M)
-#define XCVECLEN(RTX, N, C)	GET_NUM_ELEM (XCVEC (RTX, N, C))
+#define XCVECEXP(RTX, N, M, C) RTVEC_ELT (XCVEC (RTX, N, C), M)
+#define XCVECLEN(RTX, N, C)    GET_NUM_ELEM (XCVEC (RTX, N, C))
 
 #define XC2EXP(RTX, N, C1, C2)      (RTL_CHECKC2 (RTX, N, C1, C2).rt_rtx)
 
@@ -1150,9 +1219,19 @@  rhs_regno (const_rtx x)
 #define INTVAL(RTX) XCWINT(RTX, 0, CONST_INT)
 #define UINTVAL(RTX) ((unsigned HOST_WIDE_INT) INTVAL (RTX))
 
+/* For a CONST_WIDE_INT, CONST_WIDE_INT_NUNITS is the number of
+   elements actually needed to represent the constant.
+   CONST_WIDE_INT_ELT gets one of the elements.  0 is the least
+   significant HOST_WIDE_INT.  */
+#define CONST_WIDE_INT_VEC(RTX) HWIVEC_CHECK (RTX, CONST_WIDE_INT)
+#define CONST_WIDE_INT_NUNITS(RTX) HWI_GET_NUM_ELEM (CONST_WIDE_INT_VEC (RTX))
+#define CONST_WIDE_INT_ELT(RTX, N) XHWIVEC_ELT (CONST_WIDE_INT_VEC (RTX), N) 
+
 /* For a CONST_DOUBLE:
+#if TARGET_SUPPORTS_WIDE_INT == 0
    For a VOIDmode, there are two integers CONST_DOUBLE_LOW is the
      low-order word and ..._HIGH the high-order.
+#endif
    For a float, there is a REAL_VALUE_TYPE structure, and
      CONST_DOUBLE_REAL_VALUE(r) is a pointer to it.  */
 #define CONST_DOUBLE_LOW(r) XCMWINT (r, 0, CONST_DOUBLE, VOIDmode)
@@ -1678,6 +1757,12 @@  extern rtx plus_constant (enum machine_m
 /* In rtl.c */
 extern rtx rtx_alloc_stat (RTX_CODE MEM_STAT_DECL);
 #define rtx_alloc(c) rtx_alloc_stat (c MEM_STAT_INFO)
+extern rtx rtx_alloc_stat_v (RTX_CODE MEM_STAT_DECL, int);
+#define rtx_alloc_v(c, SZ) rtx_alloc_stat_v (c MEM_STAT_INFO, SZ)
+#define const_wide_int_alloc(NWORDS)				\
+  rtx_alloc_v (CONST_WIDE_INT,					\
+	       (sizeof (struct hwivec_def)			\
+		+ ((NWORDS)-1) * sizeof (HOST_WIDE_INT)))	\
 
 extern rtvec rtvec_alloc (int);
 extern rtvec shallow_copy_rtvec (rtvec);
@@ -1734,10 +1819,17 @@  extern void start_sequence (void);
 extern void push_to_sequence (rtx);
 extern void push_to_sequence2 (rtx, rtx);
 extern void end_sequence (void);
+#if TARGET_SUPPORTS_WIDE_INT == 0
 extern double_int rtx_to_double_int (const_rtx);
-extern rtx immed_double_int_const (double_int, enum machine_mode);
+#endif
+extern void hwivec_output_hex (FILE *, const_hwivec);
+#ifndef GENERATOR_FILE
+extern rtx immed_wide_int_const (const wide_int &cst);
+#endif
+#if TARGET_SUPPORTS_WIDE_INT == 0
 extern rtx immed_double_const (HOST_WIDE_INT, HOST_WIDE_INT,
 			       enum machine_mode);
+#endif
 
 /* In loop-iv.c  */
 
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	(revision 191978)
+++ gcc/rtlanal.c	(working copy)
@@ -3081,6 +3081,8 @@  commutative_operand_precedence (rtx op)
   /* Constants always come the second operand.  Prefer "nice" constants.  */
   if (code == CONST_INT)
     return -8;
+  if (code == CONST_WIDE_INT)
+    return -8;
   if (code == CONST_DOUBLE)
     return -7;
   if (code == CONST_FIXED)
@@ -3093,6 +3095,8 @@  commutative_operand_precedence (rtx op)
     case RTX_CONST_OBJ:
       if (code == CONST_INT)
         return -6;
+      if (code == CONST_WIDE_INT)
+        return -6;
       if (code == CONST_DOUBLE)
         return -5;
       if (code == CONST_FIXED)
@@ -5351,6 +5355,20 @@  split_double (rtx value, rtx *first, rtx
 	    }
 	}
     }
+  else if (GET_CODE (value) == CONST_WIDE_INT)
+    {
+      gcc_assert (CONST_WIDE_INT_NUNITS (value) == 2);
+      if (WORDS_BIG_ENDIAN)
+	{
+	  *first = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
+	  *second = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
+	}
+      else
+	{
+	  *first = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
+	  *second = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
+	}
+    }
   else if (!CONST_DOUBLE_P (value))
     {
       if (WORDS_BIG_ENDIAN)
Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in	(revision 191978)
+++ gcc/Makefile.in	(working copy)
@@ -841,7 +841,7 @@  COMMON_TARGET_DEF_H = common/common-targ
 RTL_BASE_H = coretypes.h rtl.h rtl.def $(MACHMODE_H) reg-notes.def \
   insn-notes.def $(INPUT_H) $(REAL_H) statistics.h $(VEC_H) \
   $(FIXED_VALUE_H) alias.h $(HASHTAB_H)
-FIXED_VALUE_H = fixed-value.h $(MACHMODE_H) double-int.h
+FIXED_VALUE_H = fixed-value.h $(MACHMODE_H) double-int.h wide-int.h
 RTL_H = $(RTL_BASE_H) $(FLAGS_H) genrtl.h vecir.h
 RTL_ERROR_H = rtl-error.h $(RTL_H) $(DIAGNOSTIC_CORE_H)
 READ_MD_H = $(OBSTACK_H) $(HASHTAB_H) read-md.h
@@ -853,7 +853,7 @@  INTERNAL_FN_H = internal-fn.h $(INTERNAL
 TREE_H = coretypes.h tree.h all-tree.def tree.def c-family/c-common.def \
 	$(lang_tree_files) $(MACHMODE_H) tree-check.h $(BUILTINS_DEF) \
 	$(INPUT_H) statistics.h $(VEC_H) treestruct.def $(HASHTAB_H) \
-	double-int.h alias.h $(SYMTAB_H) $(FLAGS_H) vecir.h \
+	double-int.h wide-int.h alias.h $(SYMTAB_H) $(FLAGS_H) vecir.h \
 	$(REAL_H) $(FIXED_VALUE_H)
 REGSET_H = regset.h $(BITMAP_H) hard-reg-set.h
 BASIC_BLOCK_H = basic-block.h $(PREDICT_H) $(VEC_H) $(FUNCTION_H) \
@@ -1432,6 +1432,7 @@  OBJS = \
 	varpool.o \
 	vmsdbgout.o \
 	web.o \
+	wide-int.o \
 	xcoffout.o \
 	$(out_object_file) \
 	$(EXTRA_OBJS) \
@@ -2639,6 +2640,7 @@  targhooks.o : targhooks.c $(CONFIG_H) $(
    tree-ssa-alias.h $(TREE_FLOW_H)
 common/common-targhooks.o : common/common-targhooks.c $(CONFIG_H) $(SYSTEM_H) \
    coretypes.h $(INPUT_H) $(TM_H) $(COMMON_TARGET_H) common/common-targhooks.h
+wide-int.o: wide-int.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H)
 
 bversion.h: s-bversion; @true
 s-bversion: BASE-VER
@@ -3833,15 +3835,16 @@  CFLAGS-gengtype-parse.o += -DGENERATOR_F
 build/gengtype-parse.o: $(BCONFIG_H)
 
 gengtype-state.o build/gengtype-state.o: gengtype-state.c $(SYSTEM_H) \
-  gengtype.h errors.h double-int.h version.h $(HASHTAB_H) $(OBSTACK_H) \
-  $(XREGEX_H)
+  gengtype.h errors.h double-int.h wide-int.h version.h $(HASHTAB_H)    \
+  $(OBSTACK_H) $(XREGEX_H)
 gengtype-state.o: $(CONFIG_H)
 CFLAGS-gengtype-state.o += -DGENERATOR_FILE
 build/gengtype-state.o: $(BCONFIG_H)
+wide-int.h: $(GTM_H) insn-modes.h
 
 gengtype.o build/gengtype.o : gengtype.c $(SYSTEM_H) gengtype.h 	\
-  rtl.def insn-notes.def errors.h double-int.h version.h $(HASHTAB_H) \
-  $(OBSTACK_H) $(XREGEX_H)
+  rtl.def insn-notes.def errors.h double-int.h wide-int.h version.h     \
+  $(HASHTAB_H) $(OBSTACK_H) $(XREGEX_H)
 gengtype.o: $(CONFIG_H)
 CFLAGS-gengtype.o += -DGENERATOR_FILE
 build/gengtype.o: $(BCONFIG_H)
Index: gcc/sched-vis.c
===================================================================
--- gcc/sched-vis.c	(revision 191978)
+++ gcc/sched-vis.c	(working copy)
@@ -444,14 +444,31 @@  print_value (char *buf, const_rtx x, int
 	       (unsigned HOST_WIDE_INT) INTVAL (x));
       cur = safe_concat (buf, cur, t);
       break;
+
+    case CONST_WIDE_INT:
+      {
+	const char *sep = "<";
+	int i;
+	for (i = CONST_WIDE_INT_NUNITS (x) - 1; i >= 0; i--)
+	  {
+	    cur = safe_concat (buf, cur, sep);
+	    sep = ",";
+	    sprintf (t, HOST_WIDE_INT_PRINT_HEX,
+		     (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, i));
+	    cur = safe_concat (buf, cur, t);
+	  }
+	cur = safe_concat (buf, cur, ">");
+      }
+      break;
+
     case CONST_DOUBLE:
-      if (FLOAT_MODE_P (GET_MODE (x)))
-	real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0, 1);
-      else
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && !FLOAT_MODE_P (GET_MODE (x)))
 	sprintf (t,
 		 "<" HOST_WIDE_INT_PRINT_HEX "," HOST_WIDE_INT_PRINT_HEX ">",
 		 (unsigned HOST_WIDE_INT) CONST_DOUBLE_LOW (x),
 		 (unsigned HOST_WIDE_INT) CONST_DOUBLE_HIGH (x));
+      else
+	real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0, 1);
       cur = safe_concat (buf, cur, t);
       break;
     case CONST_FIXED:
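So a three-element constant comes out as, e.g., <0x1,0x0,0xdeadbeef>,
most significant element first.  A standalone sketch of the format
(plain printf stands in for safe_concat; illustrative, not GCC code):

    #include <stdint.h>
    #include <stdio.h>

    /* Print an N-element constant the way print_value now renders
       CONST_WIDE_INT: "<" hi "," ... "," lo ">".  */
    static void
    print_wide (const uint64_t *elt, int nunits)
    {
      const char *sep = "<";
      int i;
      for (i = nunits - 1; i >= 0; i--)
        {
          printf ("%s0x%llx", sep, (unsigned long long) elt[i]);
          sep = ",";
        }
      printf (">\n");
    }

    int
    main (void)
    {
      uint64_t v[3] = { 0xdeadbeef, 0, 1 };
      print_wide (v, 3);   /* prints <0x1,0x0,0xdeadbeef> */
      return 0;
    }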
Index: gcc/gengtype.c
===================================================================
--- gcc/gengtype.c	(revision 191978)
+++ gcc/gengtype.c	(working copy)
@@ -5440,6 +5440,7 @@  main (int argc, char **argv)
       POS_HERE (do_scalar_typedef ("REAL_VALUE_TYPE", &pos));
       POS_HERE (do_scalar_typedef ("FIXED_VALUE_TYPE", &pos));
       POS_HERE (do_scalar_typedef ("double_int", &pos));
+      POS_HERE (do_scalar_typedef ("wide_int", &pos));
       POS_HERE (do_scalar_typedef ("uint64_t", &pos));
       POS_HERE (do_scalar_typedef ("uint8", &pos));
       POS_HERE (do_scalar_typedef ("uintptr_t", &pos));
Index: gcc/alias.c
===================================================================
--- gcc/alias.c	(revision 191978)
+++ gcc/alias.c	(working copy)
@@ -1490,9 +1490,9 @@  rtx_equal_for_memref_p (const_rtx x, con
 
     case VALUE:
     CASE_CONST_UNIQUE:
-      /* There's no need to compare the contents of CONST_DOUBLEs or
-	 CONST_INTs because pointer equality is a good enough
-	 comparison for these nodes.  */
+      /* There's no need to compare the contents of CONST_DOUBLEs,
+	 CONST_INTs or CONST_WIDE_INTs because pointer equality is a
+	 good enough comparison for these nodes.  */
       return 0;
 
     default:
Index: gcc/sel-sched-ir.c
===================================================================
--- gcc/sel-sched-ir.c	(revision 191978)
+++ gcc/sel-sched-ir.c	(working copy)
@@ -1137,10 +1137,10 @@  lhs_and_rhs_separable_p (rtx lhs, rtx rh
   if (lhs == NULL || rhs == NULL)
     return false;
 
-  /* Do not schedule CONST, CONST_INT and CONST_DOUBLE etc as rhs: no point
-     to use reg, if const can be used.  Moreover, scheduling const as rhs may
-     lead to mode mismatch cause consts don't have modes but they could be
-     merged from branches where the same const used in different modes.  */
+  /* Do not schedule constants as rhs: no point in using a reg if a
+     const can be used.  Moreover, scheduling a const as rhs may lead
+     to mode mismatches, because consts don't have modes but could be
+     merged from branches where the same const is used in different modes.  */
   if (CONSTANT_P (rhs))
     return false;
 
Index: gcc/genemit.c
===================================================================
--- gcc/genemit.c	(revision 191978)
+++ gcc/genemit.c	(working copy)
@@ -205,6 +205,7 @@  gen_exp (rtx x, enum rtx_code subroutine
 
     case CONST_DOUBLE:
     case CONST_FIXED:
+    case CONST_WIDE_INT:
       /* These shouldn't be written in MD files.  Instead, the appropriate
 	 routines in varasm.c should be called.  */
       gcc_unreachable ();
Index: gcc/defaults.h
===================================================================
--- gcc/defaults.h	(revision 191978)
+++ gcc/defaults.h	(working copy)
@@ -1402,6 +1402,14 @@  see the files COPYING3 and COPYING.RUNTI
 #define SWITCHABLE_TARGET 0
 #endif
 
+/* If the target supports integers that are wider than two
+   HOST_WIDE_INTs on the host compiler, then the target should define
+   TARGET_SUPPORTS_WIDE_INT to be non-zero and make the appropriate
+   fixups.  Otherwise the compiler is not robust for such targets.  */
+#ifndef TARGET_SUPPORTS_WIDE_INT
+#define TARGET_SUPPORTS_WIDE_INT 0
+#endif
+
 #endif /* GCC_INSN_FLAGS_H  */
 
 #endif  /* ! GCC_DEFAULTS_H */
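A converted port opts in by defining the macro to a non-zero value in
its target header, for example (the port name is made up):

    /* config/foo/foo.h, once the port's handling of integer
       constants has been audited.  */
    #define TARGET_SUPPORTS_WIDE_INT 1

Everything in this patch that is guarded by the macro then switches
from CONST_DOUBLE to CONST_WIDE_INT for integer constants.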
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 191978)
+++ gcc/builtins.c	(working copy)
@@ -671,20 +671,25 @@  c_getstr (tree src)
   return TREE_STRING_POINTER (src) + tree_low_cst (offset_node, 1);
 }
 
-/* Return a CONST_INT or CONST_DOUBLE corresponding to target reading
-   GET_MODE_BITSIZE (MODE) bits from string constant STR.  */
+/* Return a CONST_INT, CONST_WIDE_INT, or CONST_DOUBLE corresponding
+   to target reading GET_MODE_BITSIZE (MODE) bits from string constant
+   STR.  */
 
 static rtx
 c_readstr (const char *str, enum machine_mode mode)
 {
-  HOST_WIDE_INT c[2];
+  wide_int c;
   HOST_WIDE_INT ch;
   unsigned int i, j;
+  c.set_mode (mode);
+  c.set_len ((GET_MODE_PRECISION (mode) + HOST_BITS_PER_WIDE_INT - 1)
+	     / HOST_BITS_PER_WIDE_INT);
+
+  for (i = 0; i < c.get_len (); i++)
+    c.elt_ref (i) = 0;
 
   gcc_assert (GET_MODE_CLASS (mode) == MODE_INT);
 
-  c[0] = 0;
-  c[1] = 0;
   ch = 1;
   for (i = 0; i < GET_MODE_SIZE (mode); i++)
     {
@@ -695,13 +700,14 @@  c_readstr (const char *str, enum machine
 	  && GET_MODE_SIZE (mode) >= UNITS_PER_WORD)
 	j = j + UNITS_PER_WORD - 2 * (j % UNITS_PER_WORD) - 1;
       j *= BITS_PER_UNIT;
-      gcc_assert (j < HOST_BITS_PER_DOUBLE_INT);
 
       if (ch)
 	ch = (unsigned char) str[i];
-      c[j / HOST_BITS_PER_WIDE_INT] |= ch << (j % HOST_BITS_PER_WIDE_INT);
+      c.elt_ref (j / HOST_BITS_PER_WIDE_INT) |= ch << (j % HOST_BITS_PER_WIDE_INT);
     }
-  return immed_double_const (c[0], c[1], mode);
+
+  c.canonize ();
+  return immed_wide_int_const (c);
 }
 
 /* Cast a target constant CST to target CHAR and if that value fits into
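The interesting part of the new c_readstr is the final canonize ()
call.  The elements are filled in by hand, so the wide_int must be put
back into canonical form: top elements that merely repeat the sign of
the element below them carry no information and are dropped from the
length.  A self-contained sketch of the packing plus that assumed
canonization rule (little-endian target, names made up, not the
wide-int implementation):

    #include <stdint.h>
    #include <stdio.h>

    /* Pack NBYTES string bytes into 64-bit elements, least
       significant first, then shorten LEN while the top element is a
       pure sign extension of the one below it.  */
    static int
    pack_and_canonize (const char *str, int nbytes, uint64_t *elt)
    {
      int len = (nbytes * 8 + 63) / 64;
      int i;
      for (i = 0; i < len; i++)
        elt[i] = 0;
      for (i = 0; i < nbytes; i++)
        elt[(i * 8) / 64]
          |= (uint64_t) (unsigned char) str[i] << ((i * 8) % 64);
      while (len > 1
             && ((elt[len - 1] == 0 && (int64_t) elt[len - 2] >= 0)
                 || (elt[len - 1] == ~(uint64_t) 0
                     && (int64_t) elt[len - 2] < 0)))
        len--;
      return len;
    }

    int
    main (void)
    {
      char buf[16] = "ab";   /* remaining bytes are zero */
      uint64_t elt[2];
      int len = pack_and_canonize (buf, 16, elt);
      /* Prints: len = 1, elt[0] = 0x6261 */
      printf ("len = %d, elt[0] = 0x%llx\n", len,
              (unsigned long long) elt[0]);
      return 0;
    }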
@@ -4990,12 +4996,13 @@  expand_builtin_signbit (tree exp, rtx ta
 
   if (bitpos < GET_MODE_BITSIZE (rmode))
     {
-      double_int mask = double_int_zero.set_bit (bitpos);
+      wide_int mask;
+      mask = wide_int::set_bit_in_zero (bitpos, rmode);
 
       if (GET_MODE_SIZE (imode) > GET_MODE_SIZE (rmode))
 	temp = gen_lowpart (rmode, temp);
       temp = expand_binop (rmode, and_optab, temp,
-			   immed_double_int_const (mask, rmode),
+			   immed_wide_int_const (mask),
 			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
     }
   else
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(revision 191978)
+++ gcc/simplify-rtx.c	(working copy)
@@ -88,6 +88,22 @@  mode_signbit_p (enum machine_mode mode,
   if (width <= HOST_BITS_PER_WIDE_INT
       && CONST_INT_P (x))
     val = INTVAL (x);
+#if TARGET_SUPPORTS_WIDE_INT
+  else if (CONST_WIDE_INT_P (x))
+    {
+      unsigned int i;
+      unsigned int elts = CONST_WIDE_INT_NUNITS (x);
+      if (elts != (width + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT)
+	return false;
+      for (i = 0; i < elts - 1; i++)
+	if (CONST_WIDE_INT_ELT (x, i) != 0)
+	  return false;
+      val = CONST_WIDE_INT_ELT (x, elts - 1);
+      width %= HOST_BITS_PER_WIDE_INT;
+      if (width == 0)
+	width = HOST_BITS_PER_WIDE_INT;
+    }
+#else
   else if (width <= HOST_BITS_PER_DOUBLE_INT
 	   && CONST_DOUBLE_AS_INT_P (x)
 	   && CONST_DOUBLE_LOW (x) == 0)
@@ -95,8 +111,9 @@  mode_signbit_p (enum machine_mode mode,
       val = CONST_DOUBLE_HIGH (x);
       width -= HOST_BITS_PER_WIDE_INT;
     }
+#endif
   else
-    /* FIXME: We don't yet have a representation for wider modes.  */
+    /* X is not an integer constant.  */
     return false;
 
   if (width < HOST_BITS_PER_WIDE_INT)
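The CONST_WIDE_INT arm follows the same shape as the CONST_INT arm:
every element below the top one must be zero, and the top element is
then checked against the sign bit of the mode's leftover width.  A
self-contained sketch of the test (illustrative, not GCC code):

    #include <stdint.h>
    #include <stdio.h>

    /* Return nonzero if the NUNITS-element constant ELT has exactly
       the sign bit of a WIDTH-bit mode set.  */
    static int
    is_mode_signbit (const uint64_t *elt, int nunits, int width)
    {
      int i, top;
      if (nunits != (width + 63) / 64)
        return 0;
      for (i = 0; i < nunits - 1; i++)
        if (elt[i] != 0)
          return 0;
      top = width % 64 ? width % 64 : 64;
      return elt[nunits - 1] == (uint64_t) 1 << (top - 1);
    }

    int
    main (void)
    {
      /* The sign bit of a 128-bit mode: element 1 holds bit 127.  */
      uint64_t v[2] = { 0, (uint64_t) 1 << 63 };
      printf ("%d\n", is_mode_signbit (v, 2, 128));   /* prints 1 */
      return 0;
    }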
@@ -729,8 +746,8 @@  simplify_unary_operation_1 (enum rtx_cod
 	  && !HONOR_SIGN_DEPENDENT_ROUNDING (mode))
 	{
 	  /* (neg (plus A C)) is simplified to (minus -C A).  */
-	  if (CONST_INT_P (XEXP (op, 1))
-	      || CONST_DOUBLE_P (XEXP (op, 1)))
+	  if (CONST_SCALAR_INT_P (XEXP (op, 1))
+	      || CONST_DOUBLE_AS_FLOAT_P (XEXP (op, 1)))
 	    {
 	      temp = simplify_unary_operation (NEG, mode, XEXP (op, 1), mode);
 	      if (temp)
@@ -1270,7 +1287,6 @@  simplify_const_unary_operation (enum rtx
 				rtx op, enum machine_mode op_mode)
 {
   unsigned int width = GET_MODE_PRECISION (mode);
-  unsigned int op_width = GET_MODE_PRECISION (op_mode);
 
   if (code == VEC_DUPLICATE)
     {
@@ -1283,7 +1299,7 @@  simplify_const_unary_operation (enum rtx
 	  gcc_assert (GET_MODE_INNER (mode) == GET_MODE_INNER
 						(GET_MODE (op)));
       }
-      if (CONST_INT_P (op) || CONST_DOUBLE_P (op)
+      if (CONST_SCALAR_INT_P (op) || CONST_DOUBLE_AS_FLOAT_P (op)
 	  || GET_CODE (op) == CONST_VECTOR)
 	{
           int elt_size = GET_MODE_SIZE (GET_MODE_INNER (mode));
@@ -1336,7 +1352,7 @@  simplify_const_unary_operation (enum rtx
      check the wrong mode (input vs. output) for a conversion operation,
      such as FIX.  At some point, this should be simplified.  */
 
-  if (code == FLOAT && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op)))
+  if (code == FLOAT && CONST_SCALAR_INT_P (op))
     {
       HOST_WIDE_INT hv, lv;
       REAL_VALUE_TYPE d;
@@ -1344,14 +1360,24 @@  simplify_const_unary_operation (enum rtx
       if (CONST_INT_P (op))
 	lv = INTVAL (op), hv = HWI_SIGN_EXTEND (lv);
       else
+#if TARGET_SUPPORTS_WIDE_INT
+	{
+	  /* The conversion code to floats really wants exactly 2 HWIs.
+	     This needs to be fixed.  For now, if the constant is
+	     really big, just return 0, which is safe.  */
+	  if (CONST_WIDE_INT_NUNITS (op) > 2)
+	    return 0;
+	  lv = CONST_WIDE_INT_ELT (op, 0);
+	  hv = CONST_WIDE_INT_ELT (op, 1);
+	}
+#else
 	lv = CONST_DOUBLE_LOW (op),  hv = CONST_DOUBLE_HIGH (op);
-
+#endif
       REAL_VALUE_FROM_INT (d, lv, hv, mode);
       d = real_value_truncate (mode, d);
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
     }
-  else if (code == UNSIGNED_FLOAT
-	   && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op)))
+  else if (code == UNSIGNED_FLOAT && CONST_SCALAR_INT_P (op))
     {
       HOST_WIDE_INT hv, lv;
       REAL_VALUE_TYPE d;
@@ -1359,8 +1385,19 @@  simplify_const_unary_operation (enum rtx
       if (CONST_INT_P (op))
 	lv = INTVAL (op), hv = HWI_SIGN_EXTEND (lv);
       else
+#if TARGET_SUPPORTS_WIDE_INT
+	{
+	  /* The conversion code to floats really wants exactly 2 HWIs.
+	     This needs to be fixed.  For now, if the constant is
+	     really big, just return 0, which is safe.  */
+	  if (CONST_WIDE_INT_NUNITS (op) > 2)
+	    return 0;
+	  lv = CONST_WIDE_INT_ELT (op, 0);
+	  hv = CONST_WIDE_INT_ELT (op, 1);
+	}
+#else
 	lv = CONST_DOUBLE_LOW (op),  hv = CONST_DOUBLE_HIGH (op);
-
+#endif
       if (op_mode == VOIDmode
 	  || GET_MODE_PRECISION (op_mode) > HOST_BITS_PER_DOUBLE_INT)
 	/* We should never get a negative number.  */
@@ -1373,302 +1410,82 @@  simplify_const_unary_operation (enum rtx
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
     }
 
-  if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+  if (CONST_SCALAR_INT_P (op) && width > 0)
     {
-      HOST_WIDE_INT arg0 = INTVAL (op);
-      HOST_WIDE_INT val;
+      wide_int result;
+      enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode;
+      wide_int op0 = wide_int::from_rtx (op, imode);
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+      /* This assert keeps the simplification from producing a result
+	 that cannot be represented in a CONST_DOUBLE.  A lot of
+	 upstream callers expect that this function never fails to
+	 simplify something, so if this check were added to the test
+	 above, the code would just die later anyway.  If this assert
+	 fires, you just need to make the port support wide int.  */
+      gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT);
+#endif
 
       switch (code)
 	{
 	case NOT:
-	  val = ~ arg0;
+	  result = ~op0;
 	  break;
 
 	case NEG:
-	  val = - arg0;
+	  result = op0.neg ();
 	  break;
 
 	case ABS:
-	  val = (arg0 >= 0 ? arg0 : - arg0);
+	  result = op0.abs ();
 	  break;
 
 	case FFS:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = ffs_hwi (arg0);
+	  result = op0.ffs ();
 	  break;
 
 	case CLZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
-	    ;
-	  else
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1;
+	  result = op0.clz (mode);
 	  break;
 
 	case CLRSB:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    val = GET_MODE_PRECISION (mode) - 1;
-	  else if (arg0 >= 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2;
-	  else if (arg0 < 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2;
+	  result = op0.clrsb (mode);
 	  break;
 
 	case CTZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    {
-	      /* Even if the value at zero is undefined, we have to come
-		 up with some replacement.  Seems good enough.  */
-	      if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
-		val = GET_MODE_PRECISION (mode);
-	    }
-	  else
-	    val = ctz_hwi (arg0);
+	  result = op0.ctz (mode);
 	  break;
 
 	case POPCOUNT:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
+	  result = op0.popcount (mode);
 	  break;
 
 	case PARITY:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
-	  val &= 1;
+	  result = op0.parity (mode);
 	  break;
 
 	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    val = 0;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-		byte = (arg0 >> s) & 0xff;
-		val |= byte << d;
-	      }
-	  }
+	  result = op0.bswap ();
 	  break;
 
 	case TRUNCATE:
-	  val = arg0;
+	  result = op0.truncate (mode);
 	  break;
 
 	case ZERO_EXTEND:
-	  /* When zero-extending a CONST_INT, we need to know its
-             original mode.  */
-	  gcc_assert (op_mode != VOIDmode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT)
-	    val = arg0 & GET_MODE_MASK (op_mode);
-	  else
-	    return 0;
+	  result = op0.zext (mode);
 	  break;
 
 	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode)
-	    op_mode = mode;
-	  op_width = GET_MODE_PRECISION (op_mode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (op_width < HOST_BITS_PER_WIDE_INT)
-	    {
-	      val = arg0 & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, val))
-		val |= ~GET_MODE_MASK (op_mode);
-	    }
-	  else
-	    return 0;
+	  result = op0.sext (mode);
 	  break;
 
 	case SQRT:
-	case FLOAT_EXTEND:
-	case FLOAT_TRUNCATE:
-	case SS_TRUNCATE:
-	case US_TRUNCATE:
-	case SS_NEG:
-	case US_NEG:
-	case SS_ABS:
-	  return 0;
-
-	default:
-	  gcc_unreachable ();
-	}
-
-      return gen_int_mode (val, mode);
-    }
-
-  /* We can do some operations on integer CONST_DOUBLEs.  Also allow
-     for a DImode operation on a CONST_INT.  */
-  else if (width <= HOST_BITS_PER_DOUBLE_INT
-	   && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op)))
-    {
-      double_int first, value;
-
-      if (CONST_DOUBLE_AS_INT_P (op))
-	first = double_int::from_pair (CONST_DOUBLE_HIGH (op),
-				       CONST_DOUBLE_LOW (op));
-      else
-	first = double_int::from_shwi (INTVAL (op));
-
-      switch (code)
-	{
-	case NOT:
-	  value = ~first;
-	  break;
-
-	case NEG:
-	  value = -first;
-	  break;
-
-	case ABS:
-	  if (first.is_negative ())
-	    value = -first;
-	  else
-	    value = first;
-	  break;
-
-	case FFS:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ffs_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ffs_hwi (first.high);
-	  else
-	    value.low = 0;
-	  break;
-
-	case CLZ:
-	  value.high = 0;
-	  if (first.high != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.high) - 1
-	              - HOST_BITS_PER_WIDE_INT;
-	  else if (first.low != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.low) - 1;
-	  else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case CTZ:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ctz_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ctz_hwi (first.high);
-	  else if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case POPCOUNT:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  break;
-
-	case PARITY:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  value.low &= 1;
-	  break;
-
-	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    value = double_int_zero;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-
-		if (s < HOST_BITS_PER_WIDE_INT)
-		  byte = (first.low >> s) & 0xff;
-		else
-		  byte = (first.high >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff;
-
-		if (d < HOST_BITS_PER_WIDE_INT)
-		  value.low |= byte << d;
-		else
-		  value.high |= byte << (d - HOST_BITS_PER_WIDE_INT);
-	      }
-	  }
-	  break;
-
-	case TRUNCATE:
-	  /* This is just a change-of-mode, so do nothing.  */
-	  value = first;
-	  break;
-
-	case ZERO_EXTEND:
-	  gcc_assert (op_mode != VOIDmode);
-
-	  if (op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-
-	  value = double_int::from_uhwi (first.low & GET_MODE_MASK (op_mode));
-	  break;
-
-	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode
-	      || op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-	  else
-	    {
-	      value.low = first.low & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, value.low))
-		value.low |= ~GET_MODE_MASK (op_mode);
-
-	      value.high = HWI_SIGN_EXTEND (value.low);
-	    }
-	  break;
-
-	case SQRT:
-	  return 0;
-
 	default:
 	  return 0;
 	}
 
-      return immed_double_int_const (value, mode);
+      return immed_wide_int_const (result);
     }
 
   else if (CONST_DOUBLE_AS_FLOAT_P (op) 
@@ -1720,7 +1537,6 @@  simplify_const_unary_operation (enum rtx
 	}
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
     }
-
   else if (CONST_DOUBLE_AS_FLOAT_P (op)
 	   && SCALAR_FLOAT_MODE_P (GET_MODE (op))
 	   && GET_MODE_CLASS (mode) == MODE_INT
@@ -1733,9 +1549,13 @@  simplify_const_unary_operation (enum rtx
 
       /* This was formerly used only for non-IEEE float.
 	 eggert@twinsun.com says it is safe for IEEE also.  */
-      HOST_WIDE_INT xh, xl, th, tl;
+      HOST_WIDE_INT th, tl;
       REAL_VALUE_TYPE x, t;
+      wide_int wc;
       REAL_VALUE_FROM_CONST_DOUBLE (x, op);
+      wc.set_mode (mode);
+      wc.set_len (2);
+
       switch (code)
 	{
 	case FIX:
@@ -1757,8 +1577,8 @@  simplify_const_unary_operation (enum rtx
 	  real_from_integer (&t, VOIDmode, tl, th, 0);
 	  if (REAL_VALUES_LESS (t, x))
 	    {
-	      xh = th;
-	      xl = tl;
+	      wc.elt_ref (1) = th;
+	      wc.elt_ref (0) = tl;
 	      break;
 	    }
 
@@ -1777,11 +1597,11 @@  simplify_const_unary_operation (enum rtx
 	  real_from_integer (&t, VOIDmode, tl, th, 0);
 	  if (REAL_VALUES_LESS (x, t))
 	    {
-	      xh = th;
-	      xl = tl;
+	      wc.elt_ref (1) = th;
+	      wc.elt_ref (0) = tl;
 	      break;
 	    }
-	  REAL_VALUE_TO_INT (&xl, &xh, x);
+	  REAL_VALUE_TO_INT (&wc.elt_ref (0), &wc.elt_ref (1), x);
 	  break;
 
 	case UNSIGNED_FIX:
@@ -1808,18 +1628,19 @@  simplify_const_unary_operation (enum rtx
 	  real_from_integer (&t, VOIDmode, tl, th, 1);
 	  if (REAL_VALUES_LESS (t, x))
 	    {
-	      xh = th;
-	      xl = tl;
+	      wc.elt_ref (1) = th;
+	      wc.elt_ref (0) = tl;
 	      break;
 	    }
 
-	  REAL_VALUE_TO_INT (&xl, &xh, x);
+	  REAL_VALUE_TO_INT (&wc.elt_ref (0), &wc.elt_ref (1), x);
 	  break;
 
 	default:
 	  gcc_unreachable ();
 	}
-      return immed_double_const (xl, xh, mode);
+      wc.canonize ();
+      return immed_wide_int_const (wc);
     }
 
   return NULL_RTX;
@@ -1979,49 +1800,50 @@  simplify_binary_operation_1 (enum rtx_co
 
       if (SCALAR_INT_MODE_P (mode))
 	{
-	  double_int coeff0, coeff1;
+	  wide_int coeff0;
+	  wide_int coeff1;
 	  rtx lhs = op0, rhs = op1;
 
-	  coeff0 = double_int_one;
-	  coeff1 = double_int_one;
+	  coeff0 = wide_int_one (mode);
+	  coeff1 = wide_int_one (mode);
 
 	  if (GET_CODE (lhs) == NEG)
 	    {
-	      coeff0 = double_int_minus_one;
+	      coeff0 = wide_int_minus_one (mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 	  else if (GET_CODE (lhs) == MULT
 		   && CONST_INT_P (XEXP (lhs, 1)))
 	    {
-	      coeff0 = double_int::from_shwi (INTVAL (XEXP (lhs, 1)));
+	      coeff0 = wide_int::from_rtx (XEXP (lhs, 1), mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 	  else if (GET_CODE (lhs) == ASHIFT
 		   && CONST_INT_P (XEXP (lhs, 1))
                    && INTVAL (XEXP (lhs, 1)) >= 0
-		   && INTVAL (XEXP (lhs, 1)) < HOST_BITS_PER_WIDE_INT)
+		   && INTVAL (XEXP (lhs, 1)) < GET_MODE_PRECISION (mode))
 	    {
-	      coeff0 = double_int_zero.set_bit (INTVAL (XEXP (lhs, 1)));
+	      coeff0 = wide_int::set_bit_in_zero (INTVAL (XEXP (lhs, 1)), mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 
 	  if (GET_CODE (rhs) == NEG)
 	    {
-	      coeff1 = double_int_minus_one;
+	      coeff1 = wide_int_minus_one (mode);
 	      rhs = XEXP (rhs, 0);
 	    }
 	  else if (GET_CODE (rhs) == MULT
 		   && CONST_INT_P (XEXP (rhs, 1)))
 	    {
-	      coeff1 = double_int::from_shwi (INTVAL (XEXP (rhs, 1)));
+	      coeff1 = wide_int::from_rtx (XEXP (rhs, 1), mode);
 	      rhs = XEXP (rhs, 0);
 	    }
 	  else if (GET_CODE (rhs) == ASHIFT
 		   && CONST_INT_P (XEXP (rhs, 1))
 		   && INTVAL (XEXP (rhs, 1)) >= 0
-		   && INTVAL (XEXP (rhs, 1)) < HOST_BITS_PER_WIDE_INT)
+		   && INTVAL (XEXP (rhs, 1)) < GET_MODE_PRECISION (mode))
 	    {
-	      coeff1 = double_int_zero.set_bit (INTVAL (XEXP (rhs, 1)));
+	      coeff1 = wide_int::set_bit_in_zero (INTVAL (XEXP (rhs, 1)), mode);
 	      rhs = XEXP (rhs, 0);
 	    }
 
@@ -2029,11 +1851,9 @@  simplify_binary_operation_1 (enum rtx_co
 	    {
 	      rtx orig = gen_rtx_PLUS (mode, op0, op1);
 	      rtx coeff;
-	      double_int val;
 	      bool speed = optimize_function_for_speed_p (cfun);
 
-	      val = coeff0 + coeff1;
-	      coeff = immed_double_int_const (val, mode);
+	      coeff = immed_wide_int_const (coeff0 + coeff1);
 
 	      tem = simplify_gen_binary (MULT, mode, lhs, coeff);
 	      return set_src_cost (tem, speed) <= set_src_cost (orig, speed)
@@ -2042,10 +1862,9 @@  simplify_binary_operation_1 (enum rtx_co
 	}
 
       /* (plus (xor X C1) C2) is (xor X (C1^C2)) if C2 is signbit.  */
-      if ((CONST_INT_P (op1) || CONST_DOUBLE_AS_INT_P (op1))
+      if (CONST_SCALAR_INT_P (op1)
 	  && GET_CODE (op0) == XOR
-	  && (CONST_INT_P (XEXP (op0, 1))
-	      || CONST_DOUBLE_AS_INT_P (XEXP (op0, 1)))
+	  && CONST_SCALAR_INT_P (XEXP (op0, 1))
 	  && mode_signbit_p (mode, op1))
 	return simplify_gen_binary (XOR, mode, XEXP (op0, 0),
 				    simplify_gen_binary (XOR, mode, op1,
@@ -2156,50 +1975,52 @@  simplify_binary_operation_1 (enum rtx_co
 
       if (SCALAR_INT_MODE_P (mode))
 	{
-	  double_int coeff0, negcoeff1;
+	  wide_int coeff0;
+	  wide_int negcoeff1;
 	  rtx lhs = op0, rhs = op1;
 
-	  coeff0 = double_int_one;
-	  negcoeff1 = double_int_minus_one;
+	  coeff0 = wide_int_one (mode);
+	  negcoeff1 = wide_int_minus_one (mode);
 
 	  if (GET_CODE (lhs) == NEG)
 	    {
-	      coeff0 = double_int_minus_one;
+	      coeff0 = wide_int_minus_one (mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 	  else if (GET_CODE (lhs) == MULT
-		   && CONST_INT_P (XEXP (lhs, 1)))
+		   && CONST_SCALAR_INT_P (XEXP (lhs, 1)))
 	    {
-	      coeff0 = double_int::from_shwi (INTVAL (XEXP (lhs, 1)));
+	      coeff0 = wide_int::from_rtx (XEXP (lhs, 1), mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 	  else if (GET_CODE (lhs) == ASHIFT
 		   && CONST_INT_P (XEXP (lhs, 1))
 		   && INTVAL (XEXP (lhs, 1)) >= 0
-		   && INTVAL (XEXP (lhs, 1)) < HOST_BITS_PER_WIDE_INT)
+		   && INTVAL (XEXP (lhs, 1)) < GET_MODE_PRECISION (mode))
 	    {
-	      coeff0 = double_int_zero.set_bit (INTVAL (XEXP (lhs, 1)));
+	      coeff0 = wide_int::set_bit_in_zero (INTVAL (XEXP (lhs, 1)), mode);
 	      lhs = XEXP (lhs, 0);
 	    }
 
 	  if (GET_CODE (rhs) == NEG)
 	    {
-	      negcoeff1 = double_int_one;
+	      negcoeff1 = wide_int_one (mode);
 	      rhs = XEXP (rhs, 0);
 	    }
 	  else if (GET_CODE (rhs) == MULT
 		   && CONST_INT_P (XEXP (rhs, 1)))
 	    {
-	      negcoeff1 = double_int::from_shwi (-INTVAL (XEXP (rhs, 1)));
+	      negcoeff1 = wide_int::from_rtx (XEXP (rhs, 1), mode).neg ();
 	      rhs = XEXP (rhs, 0);
 	    }
 	  else if (GET_CODE (rhs) == ASHIFT
 		   && CONST_INT_P (XEXP (rhs, 1))
 		   && INTVAL (XEXP (rhs, 1)) >= 0
-		   && INTVAL (XEXP (rhs, 1)) < HOST_BITS_PER_WIDE_INT)
+		   && INTVAL (XEXP (rhs, 1)) < GET_MODE_PRECISION (mode))
 	    {
-	      negcoeff1 = double_int_zero.set_bit (INTVAL (XEXP (rhs, 1)));
-	      negcoeff1 = -negcoeff1;
+	      negcoeff1 = wide_int::set_bit_in_zero (INTVAL (XEXP (rhs, 1)),
+						    mode);
+	      negcoeff1 = negcoeff1.neg ();
 	      rhs = XEXP (rhs, 0);
 	    }
 
@@ -2207,11 +2028,9 @@  simplify_binary_operation_1 (enum rtx_co
 	    {
 	      rtx orig = gen_rtx_MINUS (mode, op0, op1);
 	      rtx coeff;
-	      double_int val;
 	      bool speed = optimize_function_for_speed_p (cfun);
 
-	      val = coeff0 + negcoeff1;
-	      coeff = immed_double_int_const (val, mode);
+	      coeff = immed_wide_int_const (coeff0 + negcoeff1);
 
 	      tem = simplify_gen_binary (MULT, mode, lhs, coeff);
 	      return set_src_cost (tem, speed) <= set_src_cost (orig, speed)
@@ -2225,7 +2044,7 @@  simplify_binary_operation_1 (enum rtx_co
 
       /* (-x - c) may be simplified as (-c - x).  */
       if (GET_CODE (op0) == NEG
-	  && (CONST_INT_P (op1) || CONST_DOUBLE_P (op1)))
+	  && (CONST_SCALAR_INT_P (op1) || CONST_DOUBLE_AS_FLOAT_P (op1)))
 	{
 	  tem = simplify_unary_operation (NEG, mode, op1, mode);
 	  if (tem)
@@ -2363,8 +2182,21 @@  simplify_binary_operation_1 (enum rtx_co
 	  && trueop1 == CONST1_RTX (mode))
 	return op0;
 
-      /* Convert multiply by constant power of two into shift unless
-	 we are still generating RTL.  This test is a kludge.  */
+      /* Convert multiply by constant power of two into shift.  */
+#if TARGET_SUPPORTS_WIDE_INT
+      if (CONST_SCALAR_INT_P (trueop1))
+	{
+	  val = wide_int::from_rtx (trueop1, mode).exact_log2 ();
+	  if (val > 0)
+	    {
+	      unsigned int bitsize = GET_MODE_BITSIZE (mode);
+	      if (SHIFT_COUNT_TRUNCATED && val >= bitsize)
+		val %= bitsize;
+	      if (val < bitsize)
+		return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
+	    }
+	}
+#else
       if (CONST_INT_P (trueop1)
 	  && (val = exact_log2 (UINTVAL (trueop1))) >= 0
 	  /* If the mode is larger than the host word size, and the
@@ -2383,7 +2215,7 @@  simplify_binary_operation_1 (enum rtx_co
 	      || GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_DOUBLE_INT))
 	return simplify_gen_binary (ASHIFT, mode, op0,
 				    GEN_INT (val + HOST_BITS_PER_WIDE_INT));
-
+#endif
       /* x*2 is x+x and x*(-1) is -x */
       if (CONST_DOUBLE_AS_FLOAT_P (trueop1)
 	  && SCALAR_FLOAT_MODE_P (GET_MODE (trueop1))
@@ -2583,14 +2415,13 @@  simplify_binary_operation_1 (enum rtx_co
 	 return CONST0_RTX (mode);
 
       /* Canonicalize XOR of the most significant bit to PLUS.  */
-      if ((CONST_INT_P (op1) || CONST_DOUBLE_AS_INT_P (op1))
+      if (CONST_SCALAR_INT_P (op1)
 	  && mode_signbit_p (mode, op1))
 	return simplify_gen_binary (PLUS, mode, op0, op1);
       /* (xor (plus X C1) C2) is (xor X (C1^C2)) if C1 is signbit.  */
-      if ((CONST_INT_P (op1) || CONST_DOUBLE_AS_INT_P (op1))
+      if (CONST_SCALAR_INT_P (op1)
 	  && GET_CODE (op0) == PLUS
-	  && (CONST_INT_P (XEXP (op0, 1))
-	      || CONST_DOUBLE_AS_INT_P (XEXP (op0, 1)))
+	  && CONST_SCALAR_INT_P (XEXP (op0, 1))
 	  && mode_signbit_p (mode, XEXP (op0, 1)))
 	return simplify_gen_binary (XOR, mode, XEXP (op0, 0),
 				    simplify_gen_binary (XOR, mode, op1,
@@ -3355,9 +3186,11 @@  simplify_binary_operation_1 (enum rtx_co
 	  gcc_assert (GET_MODE_INNER (mode) == op1_mode);
 
 	if ((GET_CODE (trueop0) == CONST_VECTOR
-	     || CONST_INT_P (trueop0) || CONST_DOUBLE_P (trueop0))
+	     || CONST_SCALAR_INT_P (trueop0) 
+	     || CONST_DOUBLE_AS_FLOAT_P (trueop0))
 	    && (GET_CODE (trueop1) == CONST_VECTOR
-		|| CONST_INT_P (trueop1) || CONST_DOUBLE_P (trueop1)))
+		|| CONST_SCALAR_INT_P (trueop1) 
+		|| CONST_DOUBLE_AS_FLOAT_P (trueop1)))
 	  {
 	    int elt_size = GET_MODE_SIZE (GET_MODE_INNER (mode));
 	    unsigned n_elts = (GET_MODE_SIZE (mode) / elt_size);
@@ -3420,9 +3253,9 @@  rtx
 simplify_const_binary_operation (enum rtx_code code, enum machine_mode mode,
 				 rtx op0, rtx op1)
 {
-  HOST_WIDE_INT arg0, arg1, arg0s, arg1s;
-  HOST_WIDE_INT val;
+#if TARGET_SUPPORTS_WIDE_INT == 0
   unsigned int width = GET_MODE_PRECISION (mode);
+#endif
 
   if (VECTOR_MODE_P (mode)
       && code != VEC_CONCAT
@@ -3454,11 +3287,11 @@  simplify_const_binary_operation (enum rt
 
   if (VECTOR_MODE_P (mode)
       && code == VEC_CONCAT
-      && (CONST_INT_P (op0)
+      && (CONST_SCALAR_INT_P (op0)
 	  || GET_CODE (op0) == CONST_FIXED
-	  || CONST_DOUBLE_P (op0))
-      && (CONST_INT_P (op1)
-	  || CONST_DOUBLE_P (op1)
+	  || CONST_DOUBLE_AS_FLOAT_P (op0))
+      && (CONST_SCALAR_INT_P (op1)
+	  || CONST_DOUBLE_AS_FLOAT_P (op1)
 	  || GET_CODE (op1) == CONST_FIXED))
     {
       unsigned n_elts = GET_MODE_NUNITS (mode);
@@ -3615,299 +3448,128 @@  simplify_const_binary_operation (enum rt
 
   /* We can fold some multi-word operations.  */
   if (GET_MODE_CLASS (mode) == MODE_INT
-      && width == HOST_BITS_PER_DOUBLE_INT
-      && (CONST_DOUBLE_AS_INT_P (op0) || CONST_INT_P (op0))
-      && (CONST_DOUBLE_AS_INT_P (op1) || CONST_INT_P (op1)))
+      && CONST_SCALAR_INT_P (op0)
+      && CONST_SCALAR_INT_P (op1))
     {
-      double_int o0, o1, res, tmp;
-      bool overflow;
-
-      o0 = rtx_to_double_int (op0);
-      o1 = rtx_to_double_int (op1);
-
+      wide_int result;
+      wide_int wop0 = wide_int::from_rtx (op0, mode);
+      wide_int wop1 = wide_int::from_rtx (op1, mode);
+      bool overflow = false;
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+      /* This assert keeps the simplification from producing a result
+	 that cannot be represented in a CONST_DOUBLE.  A lot of
+	 upstream callers expect that this function never fails to
+	 simplify something, so if this check were added to the test
+	 above, the code would just die later anyway.  If this assert
+	 fires, you just need to make the port support wide int.  */
+      gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT);
+#endif
       switch (code)
 	{
 	case MINUS:
-	  /* A - B == A + (-B).  */
-	  o1 = -o1;
-
-	  /* Fall through....  */
+	  result = wop0 - wop1;
+	  break;
 
 	case PLUS:
-	  res = o0 + o1;
+	  result = wop0 + wop1;
 	  break;
 
 	case MULT:
-	  res = o0 * o1;
+	  result = wop0 * wop1;
 	  break;
 
 	case DIV:
-          res = o0.divmod_with_overflow (o1, false, TRUNC_DIV_EXPR,
-					 &tmp, &overflow);
+	  result = wop0.div_trunc (wop1, wide_int::SIGNED, &overflow);
 	  if (overflow)
-	    return 0;
+	    return NULL_RTX;
 	  break;
-
+
 	case MOD:
-          tmp = o0.divmod_with_overflow (o1, false, TRUNC_DIV_EXPR,
-					 &res, &overflow);
+	  result = wop0.mod_trunc (wop1, wide_int::SIGNED, &overflow);
 	  if (overflow)
-	    return 0;
+	    return NULL_RTX;
 	  break;
 
 	case UDIV:
-          res = o0.divmod_with_overflow (o1, true, TRUNC_DIV_EXPR,
-					 &tmp, &overflow);
+	  result = wop0.div_trunc (wop1, wide_int::UNSIGNED, &overflow);
 	  if (overflow)
-	    return 0;
+	    return NULL_RTX;
 	  break;
 
 	case UMOD:
-          tmp = o0.divmod_with_overflow (o1, true, TRUNC_DIV_EXPR,
-					 &res, &overflow);
+	  result = wop0.mod_trunc (wop1, wide_int::UNSIGNED, &overflow);
 	  if (overflow)
-	    return 0;
+	    return NULL_RTX;
 	  break;
 
 	case AND:
-	  res = o0 & o1;
+	  result = wop0 & wop1;
 	  break;
 
 	case IOR:
-	  res = o0 | o1;
+	  result = wop0 | wop1;
 	  break;
 
 	case XOR:
-	  res = o0 ^ o1;
+	  result = wop0 ^ wop1;
 	  break;
 
 	case SMIN:
-	  res = o0.smin (o1);
+	  result = wide_int_smin (wop0, wop1);
 	  break;
 
 	case SMAX:
-	  res = o0.smax (o1);
+	  result = wide_int_smax (wop0, wop1);
 	  break;
 
 	case UMIN:
-	  res = o0.umin (o1);
+	  result = wide_int_umin (wop0, wop1);
 	  break;
 
 	case UMAX:
-	  res = o0.umax (o1);
-	  break;
-
-	case LSHIFTRT:   case ASHIFTRT:
-	case ASHIFT:
-	case ROTATE:     case ROTATERT:
-	  {
-	    unsigned HOST_WIDE_INT cnt;
-
-	    if (SHIFT_COUNT_TRUNCATED)
-	      {
-		o1.high = 0; 
-		o1.low &= GET_MODE_PRECISION (mode) - 1;
-	      }
-
-	    if (!o1.fits_uhwi ()
-	        || o1.to_uhwi () >= GET_MODE_PRECISION (mode))
-	      return 0;
-
-	    cnt = o1.to_uhwi ();
-	    unsigned short prec = GET_MODE_PRECISION (mode);
-
-	    if (code == LSHIFTRT || code == ASHIFTRT)
-	      res = o0.rshift (cnt, prec, code == ASHIFTRT);
-	    else if (code == ASHIFT)
-	      res = o0.alshift (cnt, prec);
-	    else if (code == ROTATE)
-	      res = o0.lrotate (cnt, prec);
-	    else /* code == ROTATERT */
-	      res = o0.rrotate (cnt, prec);
-	  }
-	  break;
-
-	default:
-	  return 0;
-	}
-
-      return immed_double_int_const (res, mode);
-    }
-
-  if (CONST_INT_P (op0) && CONST_INT_P (op1)
-      && width <= HOST_BITS_PER_WIDE_INT && width != 0)
-    {
-      /* Get the integer argument values in two forms:
-         zero-extended in ARG0, ARG1 and sign-extended in ARG0S, ARG1S.  */
-
-      arg0 = INTVAL (op0);
-      arg1 = INTVAL (op1);
-
-      if (width < HOST_BITS_PER_WIDE_INT)
-        {
-          arg0 &= GET_MODE_MASK (mode);
-          arg1 &= GET_MODE_MASK (mode);
-
-          arg0s = arg0;
-	  if (val_signbit_known_set_p (mode, arg0s))
-	    arg0s |= ~GET_MODE_MASK (mode);
-
-          arg1s = arg1;
-	  if (val_signbit_known_set_p (mode, arg1s))
-	    arg1s |= ~GET_MODE_MASK (mode);
-	}
-      else
-	{
-	  arg0s = arg0;
-	  arg1s = arg1;
-	}
-
-      /* Compute the value of the arithmetic.  */
-
-      switch (code)
-	{
-	case PLUS:
-	  val = arg0s + arg1s;
-	  break;
-
-	case MINUS:
-	  val = arg0s - arg1s;
-	  break;
-
-	case MULT:
-	  val = arg0s * arg1s;
-	  break;
-
-	case DIV:
-	  if (arg1s == 0
-	      || ((unsigned HOST_WIDE_INT) arg0s
-		  == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1)
-		  && arg1s == -1))
-	    return 0;
-	  val = arg0s / arg1s;
-	  break;
-
-	case MOD:
-	  if (arg1s == 0
-	      || ((unsigned HOST_WIDE_INT) arg0s
-		  == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1)
-		  && arg1s == -1))
-	    return 0;
-	  val = arg0s % arg1s;
+	  result = wide_int_umax (wop0, wop1);
 	  break;
 
-	case UDIV:
-	  if (arg1 == 0
-	      || ((unsigned HOST_WIDE_INT) arg0s
-		  == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1)
-		  && arg1s == -1))
-	    return 0;
-	  val = (unsigned HOST_WIDE_INT) arg0 / arg1;
-	  break;
-
-	case UMOD:
-	  if (arg1 == 0
-	      || ((unsigned HOST_WIDE_INT) arg0s
-		  == (unsigned HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1)
-		  && arg1s == -1))
-	    return 0;
-	  val = (unsigned HOST_WIDE_INT) arg0 % arg1;
-	  break;
-
-	case AND:
-	  val = arg0 & arg1;
-	  break;
+	case LSHIFTRT:
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	case IOR:
-	  val = arg0 | arg1;
+	  result = wop0.rshiftu (wop1, wide_int::TRUNC);
 	  break;
-
-	case XOR:
-	  val = arg0 ^ arg1;
-	  break;
-
-	case LSHIFTRT:
-	case ASHIFT:
+
 	case ASHIFTRT:
-	  /* Truncate the shift if SHIFT_COUNT_TRUNCATED, otherwise make sure
-	     the value is in range.  We can't return any old value for
-	     out-of-range arguments because either the middle-end (via
-	     shift_truncation_mask) or the back-end might be relying on
-	     target-specific knowledge.  Nor can we rely on
-	     shift_truncation_mask, since the shift might not be part of an
-	     ashlM3, lshrM3 or ashrM3 instruction.  */
-	  if (SHIFT_COUNT_TRUNCATED)
-	    arg1 = (unsigned HOST_WIDE_INT) arg1 % width;
-	  else if (arg1 < 0 || arg1 >= GET_MODE_BITSIZE (mode))
-	    return 0;
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	  val = (code == ASHIFT
-		 ? ((unsigned HOST_WIDE_INT) arg0) << arg1
-		 : ((unsigned HOST_WIDE_INT) arg0) >> arg1);
-
-	  /* Sign-extend the result for arithmetic right shifts.  */
-	  if (code == ASHIFTRT && arg0s < 0 && arg1 > 0)
-	    val |= ((unsigned HOST_WIDE_INT) (-1)) << (width - arg1);
+	  result = wop0.rshifts (wop1, wide_int::TRUNC);
 	  break;
+
+	case ASHIFT:
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	case ROTATERT:
-	  if (arg1 < 0)
-	    return 0;
-
-	  arg1 %= width;
-	  val = ((((unsigned HOST_WIDE_INT) arg0) << (width - arg1))
-		 | (((unsigned HOST_WIDE_INT) arg0) >> arg1));
+	  result = wop0.lshift (wop1, wide_int::TRUNC);
 	  break;
-
+
 	case ROTATE:
-	  if (arg1 < 0)
-	    return 0;
-
-	  arg1 %= width;
-	  val = ((((unsigned HOST_WIDE_INT) arg0) << arg1)
-		 | (((unsigned HOST_WIDE_INT) arg0) >> (width - arg1)));
-	  break;
-
-	case COMPARE:
-	  /* Do nothing here.  */
-	  return 0;
-
-	case SMIN:
-	  val = arg0s <= arg1s ? arg0s : arg1s;
-	  break;
-
-	case UMIN:
-	  val = ((unsigned HOST_WIDE_INT) arg0
-		 <= (unsigned HOST_WIDE_INT) arg1 ? arg0 : arg1);
-	  break;
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	case SMAX:
-	  val = arg0s > arg1s ? arg0s : arg1s;
+	  result = wop0.lrotate (wop1);
 	  break;
+
+	case ROTATERT:
+	  if (wop1.neg_p ())
+	    return NULL_RTX;
 
-	case UMAX:
-	  val = ((unsigned HOST_WIDE_INT) arg0
-		 > (unsigned HOST_WIDE_INT) arg1 ? arg0 : arg1);
+	  result = wop0.rrotate (wop1);
 	  break;
 
-	case SS_PLUS:
-	case US_PLUS:
-	case SS_MINUS:
-	case US_MINUS:
-	case SS_MULT:
-	case US_MULT:
-	case SS_DIV:
-	case US_DIV:
-	case SS_ASHIFT:
-	case US_ASHIFT:
-	  /* ??? There are simplifications that can be done.  */
-	  return 0;
-
 	default:
-	  gcc_unreachable ();
+	  return NULL_RTX;
 	}
-
-      return gen_int_mode (val, mode);
+      return immed_wide_int_const (result);
     }
 
   return NULL_RTX;
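All of the element-wise arithmetic is hidden behind operators like
"wop0 + wop1" now.  For multiword values the operator has to
propagate carries between elements; a self-contained sketch of just
the addition (illustrative, not the wide-int implementation):

    #include <stdint.h>
    #include <stdio.h>

    /* R = A + B for LEN-element values, least significant element
       first, with carry propagation between elements.  */
    static void
    add_wide (const uint64_t *a, const uint64_t *b, uint64_t *r,
              int len)
    {
      unsigned carry = 0;
      int i;
      for (i = 0; i < len; i++)
        {
          uint64_t s = a[i] + carry;
          unsigned c1 = s < carry;   /* overflow of a[i] + carry */
          r[i] = s + b[i];
          carry = c1 | (r[i] < s);   /* overflow of s + b[i] */
        }
    }

    int
    main (void)
    {
      uint64_t a[2] = { ~(uint64_t) 0, 0 };   /* 2^64 - 1 */
      uint64_t b[2] = { 1, 0 };
      uint64_t r[2];
      add_wide (a, b, r, 2);
      /* Prints: 0x0 0x1, i.e. the carry reached element 1.  */
      printf ("0x%llx 0x%llx\n",
              (unsigned long long) r[0], (unsigned long long) r[1]);
      return 0;
    }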
@@ -4482,9 +4144,8 @@  simplify_relational_operation_1 (enum rt
   /* (eq/ne (xor x C1) C2) simplifies to (eq/ne x (C1^C2)).  */
   if ((code == EQ || code == NE)
       && op0code == XOR
-      && (CONST_INT_P (op1) || CONST_DOUBLE_AS_INT_P (op1))
-      && (CONST_INT_P (XEXP (op0, 1))
-	  || CONST_DOUBLE_AS_INT_P (XEXP (op0, 1))))
+      && CONST_SCALAR_INT_P (op1)
+      && CONST_SCALAR_INT_P (XEXP (op0, 1)))
     return simplify_gen_relational (code, mode, cmp_mode, XEXP (op0, 0),
 				    simplify_gen_binary (XOR, cmp_mode,
 							 XEXP (op0, 1), op1));
@@ -4574,10 +4235,11 @@  comparison_result (enum rtx_code code, i
     }
 }
 
-/* Check if the given comparison (done in the given MODE) is actually a
-   tautology or a contradiction.
-   If no simplification is possible, this function returns zero.
-   Otherwise, it returns either const_true_rtx or const0_rtx.  */
+/* Check if the given comparison (done in the given MODE) is actually
+   a tautology or a contradiction.  If the mode is VOIDmode, the
+   comparison is done in "infinite precision".  If no simplification
+   is possible, this function returns zero.  Otherwise, it returns
+   either const_true_rtx or const0_rtx.  */
 
 rtx
 simplify_const_relational_operation (enum rtx_code code,
@@ -4701,59 +4363,25 @@  simplify_const_relational_operation (enu
 
   /* Otherwise, see if the operands are both integers.  */
   if ((GET_MODE_CLASS (mode) == MODE_INT || mode == VOIDmode)
-       && (CONST_DOUBLE_AS_INT_P (trueop0) || CONST_INT_P (trueop0))
-       && (CONST_DOUBLE_AS_INT_P (trueop1) || CONST_INT_P (trueop1)))
+      && CONST_SCALAR_INT_P (trueop0) && CONST_SCALAR_INT_P (trueop1))
     {
-      int width = GET_MODE_PRECISION (mode);
-      HOST_WIDE_INT l0s, h0s, l1s, h1s;
-      unsigned HOST_WIDE_INT l0u, h0u, l1u, h1u;
-
-      /* Get the two words comprising each integer constant.  */
-      if (CONST_DOUBLE_AS_INT_P (trueop0))
-	{
-	  l0u = l0s = CONST_DOUBLE_LOW (trueop0);
-	  h0u = h0s = CONST_DOUBLE_HIGH (trueop0);
-	}
-      else
-	{
-	  l0u = l0s = INTVAL (trueop0);
-	  h0u = h0s = HWI_SIGN_EXTEND (l0s);
-	}
-
-      if (CONST_DOUBLE_AS_INT_P (trueop1))
-	{
-	  l1u = l1s = CONST_DOUBLE_LOW (trueop1);
-	  h1u = h1s = CONST_DOUBLE_HIGH (trueop1);
-	}
-      else
-	{
-	  l1u = l1s = INTVAL (trueop1);
-	  h1u = h1s = HWI_SIGN_EXTEND (l1s);
-	}
-
-      /* If WIDTH is nonzero and smaller than HOST_BITS_PER_WIDE_INT,
-	 we have to sign or zero-extend the values.  */
-      if (width != 0 && width < HOST_BITS_PER_WIDE_INT)
-	{
-	  l0u &= GET_MODE_MASK (mode);
-	  l1u &= GET_MODE_MASK (mode);
-
-	  if (val_signbit_known_set_p (mode, l0s))
-	    l0s |= ~GET_MODE_MASK (mode);
-
-	  if (val_signbit_known_set_p (mode, l1s))
-	    l1s |= ~GET_MODE_MASK (mode);
-	}
-      if (width != 0 && width <= HOST_BITS_PER_WIDE_INT)
-	h0u = h1u = 0, h0s = HWI_SIGN_EXTEND (l0s), h1s = HWI_SIGN_EXTEND (l1s);
-
-      if (h0u == h1u && l0u == l1u)
+      enum machine_mode cmode = mode;
+      wide_int wo0;
+      wide_int wo1;
+
+      /* It would be nice if we really had a mode here.  However, the
+	 largest int representable on the target is as good as
+	 infinite.  */
+      if (mode == VOIDmode)
+	cmode = MAX_MODE_INT;
+      wo0 = wide_int::from_rtx (trueop0, cmode);
+      wo1 = wide_int::from_rtx (trueop1, cmode);
+      if (wo0 == wo1)
 	return comparison_result (code, CMP_EQ);
       else
 	{
-	  int cr;
-	  cr = (h0s < h1s || (h0s == h1s && l0u < l1u)) ? CMP_LT : CMP_GT;
-	  cr |= (h0u < h1u || (h0u == h1u && l0u < l1u)) ? CMP_LTU : CMP_GTU;
+	  int cr = wo0.lts_p (wo1) ? CMP_LT : CMP_GT;
+	  cr |= wo0.ltu_p (wo1) ? CMP_LTU : CMP_GTU;
 	  return comparison_result (code, cr);
 	}
     }
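The single comparison above replaces the old four-variable low/high
dance: lts_p orders the values signed, ltu_p unsigned, and together
they give the full CMP_LT/CMP_GT plus CMP_LTU/CMP_GTU answer.  A
standalone sketch of multiword less-than (it assumes both operands
have already been extended to the same length; not the wide-int code):

    #include <stdint.h>
    #include <stdio.h>

    /* A < B for LEN-element values: the top element decides signed
       versus unsigned, lower elements always compare unsigned.  */
    static int
    lt_wide (const uint64_t *a, const uint64_t *b, int len,
             int is_signed)
    {
      int i;
      for (i = len - 1; i >= 0; i--)
        {
          if (a[i] == b[i])
            continue;
          if (i == len - 1 && is_signed)
            return (int64_t) a[i] < (int64_t) b[i];
          return a[i] < b[i];
        }
      return 0;   /* equal */
    }

    int
    main (void)
    {
      uint64_t minus_one[2] = { ~(uint64_t) 0, ~(uint64_t) 0 };
      uint64_t one[2] = { 1, 0 };
      /* Prints: signed 1, unsigned 0.  */
      printf ("signed %d, unsigned %d\n",
              lt_wide (minus_one, one, 2, 1),
              lt_wide (minus_one, one, 2, 0));
      return 0;
    }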
@@ -5168,9 +4796,9 @@  simplify_ternary_operation (enum rtx_cod
   return 0;
 }
 
-/* Evaluate a SUBREG of a CONST_INT or CONST_DOUBLE or CONST_FIXED
-   or CONST_VECTOR,
-   returning another CONST_INT or CONST_DOUBLE or CONST_FIXED or CONST_VECTOR.
+/* Evaluate a SUBREG of a CONST_INT or CONST_WIDE_INT or CONST_DOUBLE
+   or CONST_FIXED or CONST_VECTOR, returning another CONST_INT or
+   CONST_WIDE_INT or CONST_DOUBLE or CONST_FIXED or CONST_VECTOR.
 
    Works by unpacking OP into a collection of 8-bit values
    represented as a little-endian array of 'unsigned char', selecting by BYTE,
@@ -5180,13 +4808,11 @@  static rtx
 simplify_immed_subreg (enum machine_mode outermode, rtx op,
 		       enum machine_mode innermode, unsigned int byte)
 {
-  /* We support up to 512-bit values (for V8DFmode).  */
   enum {
-    max_bitsize = 512,
     value_bit = 8,
     value_mask = (1 << value_bit) - 1
   };
-  unsigned char value[max_bitsize / value_bit];
+  unsigned char value[MAX_BITSIZE_MODE_ANY_MODE / value_bit];
   int value_start;
   int i;
   int elem;
@@ -5198,6 +4824,7 @@  simplify_immed_subreg (enum machine_mode
   rtvec result_v = NULL;
   enum mode_class outer_class;
   enum machine_mode outer_submode;
+  int max_bitsize;
 
   /* Some ports misuse CCmode.  */
   if (GET_MODE_CLASS (outermode) == MODE_CC && CONST_INT_P (op))
@@ -5207,6 +4834,9 @@  simplify_immed_subreg (enum machine_mode
   if (COMPLEX_MODE_P (outermode))
     return NULL_RTX;
 
+  /* We support any size mode.  */
+  max_bitsize = MAX (GET_MODE_BITSIZE (outermode), GET_MODE_BITSIZE (innermode));
+
   /* Unpack the value.  */
 
   if (GET_CODE (op) == CONST_VECTOR)
@@ -5256,8 +4886,22 @@  simplify_immed_subreg (enum machine_mode
 	    *vp++ = INTVAL (el) < 0 ? -1 : 0;
 	  break;
 
+	case CONST_WIDE_INT:
+	  {
+	    wide_int val = wide_int::from_rtx (el, innermode);
+	    unsigned char extend = val.sign_mask ();
+	    /* Bits past the constant are just sign extension.  */
+	    int avail = CONST_WIDE_INT_NUNITS (el) * HOST_BITS_PER_WIDE_INT;
+
+	    for (i = 0; i < elem_bitsize && i < avail; i += value_bit)
+	      *vp++ = val.extract_to_hwi (i, value_bit);
+	    for (; i < elem_bitsize; i += value_bit)
+	      *vp++ = extend;
+	  }
+	  break;
+
 	case CONST_DOUBLE:
-	  if (GET_MODE (el) == VOIDmode)
+	  if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (el) == VOIDmode)
 	    {
 	      unsigned char extend = 0;
 	      /* If this triggers, someone should have generated a
@@ -5280,7 +4922,8 @@  simplify_immed_subreg (enum machine_mode
 	    }
 	  else
 	    {
-	      long tmp[max_bitsize / 32];
+	      /* This is big enough for anything on the platform.  */
+	      long tmp[MAX_BITSIZE_MODE_ANY_MODE / 32];
 	      int bitsize = GET_MODE_BITSIZE (GET_MODE (el));
 
 	      gcc_assert (SCALAR_FLOAT_MODE_P (GET_MODE (el)));
@@ -5400,24 +5043,27 @@  simplify_immed_subreg (enum machine_mode
 	case MODE_INT:
 	case MODE_PARTIAL_INT:
 	  {
-	    unsigned HOST_WIDE_INT hi = 0, lo = 0;
-
-	    for (i = 0;
-		 i < HOST_BITS_PER_WIDE_INT && i < elem_bitsize;
-		 i += value_bit)
-	      lo |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask) << i;
-	    for (; i < elem_bitsize; i += value_bit)
-	      hi |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask)
-		     << (i - HOST_BITS_PER_WIDE_INT);
+	    int u;
+	    int base = 0;
+	    int units
+	      = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1)
+	      / HOST_BITS_PER_WIDE_INT;
+	    wide_int r;
+	    for (u = 0; u < units; u++) 
+	      {
+		unsigned HOST_WIDE_INT buf = 0;
+		for (i = 0; 
+		     i < HOST_BITS_PER_WIDE_INT && base + i < elem_bitsize; 
+		     i += value_bit)
+		  buf |= (unsigned HOST_WIDE_INT)(*vp++ & value_mask) << i;
 
-	    /* immed_double_const doesn't call trunc_int_for_mode.  I don't
-	       know why.  */
-	    if (elem_bitsize <= HOST_BITS_PER_WIDE_INT)
-	      elems[elem] = gen_int_mode (lo, outer_submode);
-	    else if (elem_bitsize <= HOST_BITS_PER_DOUBLE_INT)
-	      elems[elem] = immed_double_const (lo, hi, outer_submode);
-	    else
-	      return NULL_RTX;
+		r.elt_ref (u) = buf;
+		base += HOST_BITS_PER_WIDE_INT;
+	      }
+	    r.set_len (units);
+	    r.set_mode (outer_submode);
+	    r.canonize ();
+	    elems[elem] = immed_wide_int_const (r);
 	  }
 	  break;
 
@@ -5425,7 +5071,7 @@  simplify_immed_subreg (enum machine_mode
 	case MODE_DECIMAL_FLOAT:
 	  {
 	    REAL_VALUE_TYPE r;
-	    long tmp[max_bitsize / 32];
+	    long tmp[MAX_BITSIZE_MODE_ANY_MODE / 32];
 
 	    /* real_from_target wants its input in words affected by
 	       FLOAT_WORDS_BIG_ENDIAN.  However, we ignore this,
@@ -5501,8 +5147,8 @@  simplify_subreg (enum machine_mode outer
   if (outermode == innermode && !byte)
     return op;
 
-  if (CONST_INT_P (op)
-      || CONST_DOUBLE_P (op)
+  if (CONST_SCALAR_INT_P (op)
+      || CONST_DOUBLE_AS_FLOAT_P (op)
       || GET_CODE (op) == CONST_FIXED
       || GET_CODE (op) == CONST_VECTOR)
     return simplify_immed_subreg (outermode, op, innermode, byte);
Index: gcc/gengenrtl.c
===================================================================
--- gcc/gengenrtl.c	(revision 191978)
+++ gcc/gengenrtl.c	(working copy)
@@ -143,6 +143,7 @@  static int
 excluded_rtx (int idx)
 {
   return ((strcmp (defs[idx].enumname, "CONST_DOUBLE") == 0)
+	  || (strcmp (defs[idx].enumname, "CONST_WIDE_INT") == 0)
 	  || (strcmp (defs[idx].enumname, "CONST_FIXED") == 0));
 }
 
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	(revision 191978)
+++ gcc/expmed.c	(working copy)
@@ -60,7 +60,6 @@  static rtx extract_fixed_bit_field (enum
 				    unsigned HOST_WIDE_INT,
 				    unsigned HOST_WIDE_INT,
 				    unsigned HOST_WIDE_INT, rtx, int, bool);
-static rtx mask_rtx (enum machine_mode, int, int, int);
 static rtx lshift_value (enum machine_mode, rtx, int, int);
 static rtx extract_split_bit_field (rtx, unsigned HOST_WIDE_INT,
 				    unsigned HOST_WIDE_INT, int);
@@ -68,6 +67,19 @@  static void do_cmp_and_jump (rtx, rtx, e
 static rtx expand_smod_pow2 (enum machine_mode, rtx, HOST_WIDE_INT);
 static rtx expand_sdiv_pow2 (enum machine_mode, rtx, HOST_WIDE_INT);
 
+/* Return a constant integer (CONST_INT or CONST_WIDE_INT) mask value
+   of mode MODE with BITSIZE ones followed by BITPOS zeros, or the
+   complement of that if COMPLEMENT.  The mask is truncated if
+   necessary to the width of mode MODE.  The mask is zero-extended if
+   BITSIZE+BITPOS is too small for MODE.  */
+
+static inline rtx 
+mask_rtx (enum machine_mode mode, int bitpos, int bitsize, bool complement)
+{
+  return immed_wide_int_const 
+    (wide_int::shifted_mask (bitpos, bitsize, complement, mode));
+}
+
 /* Test whether a value is zero of a power of two.  */
 #define EXACT_POWER_OF_2_OR_ZERO_P(x) (((x) & ((x) - 1)) == 0)
 
@@ -1961,39 +1973,16 @@  extract_fixed_bit_field (enum machine_mo
   return expand_shift (RSHIFT_EXPR, mode, op0,
 		       GET_MODE_BITSIZE (mode) - bitsize, target, 0);
 }
-
-/* Return a constant integer (CONST_INT or CONST_DOUBLE) mask value
-   of mode MODE with BITSIZE ones followed by BITPOS zeros, or the
-   complement of that if COMPLEMENT.  The mask is truncated if
-   necessary to the width of mode MODE.  The mask is zero-extended if
-   BITSIZE+BITPOS is too small for MODE.  */
-
-static rtx
-mask_rtx (enum machine_mode mode, int bitpos, int bitsize, int complement)
-{
-  double_int mask;
-
-  mask = double_int::mask (bitsize);
-  mask = mask.llshift (bitpos, HOST_BITS_PER_DOUBLE_INT);
-
-  if (complement)
-    mask = ~mask;
-
-  return immed_double_int_const (mask, mode);
-}
-
-/* Return a constant integer (CONST_INT or CONST_DOUBLE) rtx with the value
-   VALUE truncated to BITSIZE bits and then shifted left BITPOS bits.  */
+/* Return a constant integer (CONST_INT or CONST_WIDE_INT) rtx with the value
+   VALUE truncated to BITSIZE bits and then shifted left BITPOS bits.  */
 
 static rtx
 lshift_value (enum machine_mode mode, rtx value, int bitpos, int bitsize)
 {
-  double_int val;
-  
-  val = double_int::from_uhwi (INTVAL (value)).zext (bitsize);
-  val = val.llshift (bitpos, HOST_BITS_PER_DOUBLE_INT);
-
-  return immed_double_int_const (val, mode);
+  return 
+    immed_wide_int_const (wide_int::from_rtx (value, mode)
+			  .zext (bitsize)
+			  .lshift (bitpos, wide_int::NONE));
 }
 
 /* Extract a bit field that is split across two words
@@ -3199,34 +3188,41 @@  expand_mult (enum machine_mode mode, rtx
 	 only if the constant value exactly fits in an `unsigned int' without
 	 any truncation.  This means that multiplying by negative values does
 	 not work; results are off by 2^32 on a 32 bit machine.  */
-
       if (CONST_INT_P (scalar_op1))
 	{
 	  coeff = INTVAL (scalar_op1);
 	  is_neg = coeff < 0;
 	}
+#if TARGET_SUPPORTS_WIDE_INT
+      else if (CONST_WIDE_INT_P (scalar_op1))
+#else
       else if (CONST_DOUBLE_AS_INT_P (scalar_op1))
+#endif
 	{
-	  /* If we are multiplying in DImode, it may still be a win
-	     to try to work with shifts and adds.  */
-	  if (CONST_DOUBLE_HIGH (scalar_op1) == 0
-	      && CONST_DOUBLE_LOW (scalar_op1) > 0)
+	  int p = GET_MODE_PRECISION (mode);
+	  wide_int val = wide_int::from_rtx (scalar_op1, mode);
+	  int shift = val.exact_log2 ();
+	  is_neg = false;
+	  /* Perfect power of 2.  */
+	  if (shift > 0)
 	    {
-	      coeff = CONST_DOUBLE_LOW (scalar_op1);
-	      is_neg = false;
+	      /* Do the shift count truncation against the bitsize, not
+		 the precision.  See the comment above
+		 wide-int.c:trunc_shift for details.  */
+	      if (SHIFT_COUNT_TRUNCATED)
+		shift &= GET_MODE_BITSIZE (mode) - 1;
+	      /* We could consider adding just a move of 0 to target
+		 if shift >= p.  */
+	      if (shift < p)
+		return expand_shift (LSHIFT_EXPR, mode, op0,
+				     shift, target, unsignedp);
+	      /* Any positive number that fits in a word.  */
+	      coeff = CONST_WIDE_INT_ELT (scalar_op1, 0);
 	    }
-	  else if (CONST_DOUBLE_LOW (scalar_op1) == 0)
+	  else if (val.sign_mask () == 0)
 	    {
-	      coeff = CONST_DOUBLE_HIGH (scalar_op1);
-	      if (EXACT_POWER_OF_2_OR_ZERO_P (coeff))
-		{
-		  int shift = floor_log2 (coeff) + HOST_BITS_PER_WIDE_INT;
-		  if (shift < HOST_BITS_PER_DOUBLE_INT - 1
-		      || mode_bitsize <= HOST_BITS_PER_DOUBLE_INT)
-		    return expand_shift (LSHIFT_EXPR, mode, op0,
-					 shift, target, unsignedp);
-		}
-	      goto skip_synth;
+	      /* Any positive number that fits in a word.  */
+	      coeff = CONST_WIDE_INT_ELT (scalar_op1, 0);
 	    }
 	  else
 	    goto skip_synth;
@@ -3716,9 +3712,10 @@  expmed_mult_highpart (enum machine_mode
 static rtx
 expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d)
 {
-  unsigned HOST_WIDE_INT masklow, maskhigh;
   rtx result, temp, shift, label;
   int logd;
+  wide_int mask;
+  int prec = GET_MODE_PRECISION (mode);
 
   logd = floor_log2 (d);
   result = gen_reg_rtx (mode);
@@ -3731,8 +3728,8 @@  expand_smod_pow2 (enum machine_mode mode
 				      mode, 0, -1);
       if (signmask)
 	{
+	  HOST_WIDE_INT masklow = ((HOST_WIDE_INT) 1 << logd) - 1;
 	  signmask = force_reg (mode, signmask);
-	  masklow = ((HOST_WIDE_INT) 1 << logd) - 1;
 	  shift = GEN_INT (GET_MODE_BITSIZE (mode) - logd);
 
 	  /* Use the rtx_cost of a LSHIFTRT instruction to determine
@@ -3777,19 +3774,11 @@  expand_smod_pow2 (enum machine_mode mode
      modulus.  By including the signbit in the operation, many targets
      can avoid an explicit compare operation in the following comparison
      against zero.  */
-
-  masklow = ((HOST_WIDE_INT) 1 << logd) - 1;
-  if (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
-    {
-      masklow |= (HOST_WIDE_INT) -1 << (GET_MODE_BITSIZE (mode) - 1);
-      maskhigh = -1;
-    }
-  else
-    maskhigh = (HOST_WIDE_INT) -1
-		 << (GET_MODE_BITSIZE (mode) - HOST_BITS_PER_WIDE_INT - 1);
+  mask = wide_int::mask (logd, false, mode);
+  mask = mask.set_bit (prec - 1);
 
   temp = expand_binop (mode, and_optab, op0,
-		       immed_double_const (masklow, maskhigh, mode),
+		       immed_wide_int_const (mask),
 		       result, 1, OPTAB_LIB_WIDEN);
   if (temp != result)
     emit_move_insn (result, temp);
@@ -3799,10 +3788,10 @@  expand_smod_pow2 (enum machine_mode mode
 
   temp = expand_binop (mode, sub_optab, result, const1_rtx, result,
 		       0, OPTAB_LIB_WIDEN);
-  masklow = (HOST_WIDE_INT) -1 << logd;
-  maskhigh = -1;
+
+  mask = wide_int::mask (logd, true, mode); 
   temp = expand_binop (mode, ior_optab, temp,
-		       immed_double_const (masklow, maskhigh, mode),
+		       immed_wide_int_const (mask),
 		       result, 1, OPTAB_LIB_WIDEN);
   temp = expand_binop (mode, add_optab, temp, const1_rtx, result,
 		       0, OPTAB_LIB_WIDEN);
@@ -5056,8 +5045,12 @@  make_tree (tree type, rtx x)
 	return t;
       }
 
+    case CONST_WIDE_INT:
+      t = wide_int_to_tree (type, wide_int::from_rtx (x, TYPE_MODE (type)));
+      return t;
+
     case CONST_DOUBLE:
-      if (GET_MODE (x) == VOIDmode)
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode)
 	t = build_int_cst_wide (type,
 				CONST_DOUBLE_LOW (x), CONST_DOUBLE_HIGH (x));
       else
Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c	(revision 191978)
+++ gcc/cselib.c	(working copy)
@@ -534,17 +534,15 @@  entry_and_rtx_equal_p (const void *entry
   rtx x = CONST_CAST_RTX ((const_rtx)x_arg);
   enum machine_mode mode = GET_MODE (x);
 
-  gcc_assert (!CONST_INT_P (x) && GET_CODE (x) != CONST_FIXED
-	      && (mode != VOIDmode || GET_CODE (x) != CONST_DOUBLE));
+  gcc_assert (!CONST_SCALAR_INT_P (x) && GET_CODE (x) != CONST_FIXED);
 
   if (mode != GET_MODE (v->val_rtx))
     return 0;
 
   /* Unwrap X if necessary.  */
   if (GET_CODE (x) == CONST
-      && (CONST_INT_P (XEXP (x, 0))
-	  || GET_CODE (XEXP (x, 0)) == CONST_FIXED
-	  || GET_CODE (XEXP (x, 0)) == CONST_DOUBLE))
+      && (CONST_SCALAR_INT_P (XEXP (x, 0))
+	  || GET_CODE (XEXP (x, 0)) == CONST_FIXED))
     x = XEXP (x, 0);
 
   /* We don't guarantee that distinct rtx's have different hash values,
@@ -911,6 +913,20 @@  rtx_equal_for_cselib_1 (rtx x, rtx y, en
     case DEBUG_EXPR:
       return 0;
 
+    case CONST_WIDE_INT:
+      {
+	int i;
+	/* It would have been nice to have had a mode.  */
+	if (CONST_WIDE_INT_NUNITS (x) != CONST_WIDE_INT_NUNITS (y))
+	  return 0;
+
+	for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	  if (CONST_WIDE_INT_ELT (x, i) != CONST_WIDE_INT_ELT (y, i))
+	    return 0;
+
+	return 1;
+      }
+
     case DEBUG_IMPLICIT_PTR:
       return DEBUG_IMPLICIT_PTR_DECL (x)
 	     == DEBUG_IMPLICIT_PTR_DECL (y);
@@ -1009,9 +1025,7 @@  rtx_equal_for_cselib_1 (rtx x, rtx y, en
 static rtx
 wrap_constant (enum machine_mode mode, rtx x)
 {
-  if (!CONST_INT_P (x) 
-      && GET_CODE (x) != CONST_FIXED
-      && !CONST_DOUBLE_AS_INT_P (x))
+  if (!CONST_SCALAR_INT_P (x) && GET_CODE (x) != CONST_FIXED)
     return x;
   gcc_assert (mode != VOIDmode);
   return gen_rtx_CONST (mode, x);
@@ -1103,15 +1117,23 @@  cselib_hash_rtx (rtx x, int create, enum
       hash += ((unsigned) CONST_INT << 7) + INTVAL (x);
       return hash ? hash : (unsigned int) CONST_INT;
 
+    case CONST_WIDE_INT:
+      {
+	int i;
+	for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	  hash += CONST_WIDE_INT_ELT (x, i);
+      }
+      return hash ? hash : (unsigned int) CONST_WIDE_INT;
+
     case CONST_DOUBLE:
       /* This is like the general case, except that it only counts
 	 the integers representing the constant.  */
       hash += (unsigned) code + (unsigned) GET_MODE (x);
-      if (GET_MODE (x) != VOIDmode)
-	hash += real_hash (CONST_DOUBLE_REAL_VALUE (x));
-      else
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode)
 	hash += ((unsigned) CONST_DOUBLE_LOW (x)
 		 + (unsigned) CONST_DOUBLE_HIGH (x));
+      else
+	hash += real_hash (CONST_DOUBLE_REAL_VALUE (x));
       return hash ? hash : (unsigned int) CONST_DOUBLE;
 
     case CONST_FIXED:
Index: gcc/explow.c
===================================================================
--- gcc/explow.c	(revision 191978)
+++ gcc/explow.c	(working copy)
@@ -97,38 +97,9 @@  plus_constant (enum machine_mode mode, r
 
   switch (code)
     {
-    case CONST_INT:
-      if (GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT)
-	{
-	  double_int di_x = double_int::from_shwi (INTVAL (x));
-	  double_int di_c = double_int::from_shwi (c);
-
-	  bool overflow;
-	  double_int v = di_x.add_with_sign (di_c, false, &overflow);
-	  if (overflow)
-	    gcc_unreachable ();
-
-	  return immed_double_int_const (v, VOIDmode);
-	}
-
-      return GEN_INT (INTVAL (x) + c);
-
-    case CONST_DOUBLE:
-      {
-	double_int di_x = double_int::from_pair (CONST_DOUBLE_HIGH (x),
-						 CONST_DOUBLE_LOW (x));
-	double_int di_c = double_int::from_shwi (c);
-
-	bool overflow;
-	double_int v = di_x.add_with_sign (di_c, false, &overflow);
-	if (overflow)
-	  /* Sorry, we have no way to represent overflows this wide.
-	     To fix, add constant support wider than CONST_DOUBLE.  */
-	  gcc_assert (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_DOUBLE_INT);
-
-	return immed_double_int_const (v, VOIDmode);
-      }
-
+    CASE_CONST_SCALAR_INT:
+      return immed_wide_int_const (wide_int::from_rtx (x, mode)
+				   + wide_int::from_shwi (c, mode));
     case MEM:
       /* If this is a reference to the constant pool, try replacing it with
 	 a reference to a new constant.  If the resulting address isn't
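
With the CASE_CONST_SCALAR_INT change above, plus_constant performs one
wide_int addition at the full precision of MODE, so the old double_int
overflow juggling disappears.  The multiword add that wide_int performs
internally amounts to carry propagation between HWIs; a minimal sketch
for two 64-bit words (plain C, illustration only -- add2 is just an
example name):

  #include <stdio.h>
  #include <stdint.h>

  /* Add two 128-bit values held as two 64-bit words each, least
     significant word first, propagating the carry.  */
  static void
  add2 (uint64_t r[2], const uint64_t a[2], const uint64_t b[2])
  {
    r[0] = a[0] + b[0];
    uint64_t carry = r[0] < a[0];   /* unsigned wraparound => carry */
    r[1] = a[1] + b[1] + carry;
  }

  int
  main (void)
  {
    uint64_t a[2] = { UINT64_MAX, 0 }, b[2] = { 1, 0 }, r[2];
    add2 (r, a, b);
    printf ("0x%016llx%016llx\n", (unsigned long long) r[1],
            (unsigned long long) r[0]);
    /* prints 0x00000000000000010000000000000000 */
    return 0;
  }
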
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	(revision 191978)
+++ gcc/varasm.c	(working copy)
@@ -3358,6 +3358,7 @@  const_rtx_hash_1 (rtx *xp, void *data)
   enum rtx_code code;
   hashval_t h, *hp;
   rtx x;
+  int i;
 
   x = *xp;
   code = GET_CODE (x);
@@ -3368,12 +3369,12 @@  const_rtx_hash_1 (rtx *xp, void *data)
     {
     case CONST_INT:
       hwi = INTVAL (x);
+
     fold_hwi:
       {
 	int shift = sizeof (hashval_t) * CHAR_BIT;
 	const int n = sizeof (HOST_WIDE_INT) / sizeof (hashval_t);
-	int i;
-
+
 	h ^= (hashval_t) hwi;
 	for (i = 1; i < n; ++i)
 	  {
@@ -3383,8 +3384,17 @@  const_rtx_hash_1 (rtx *xp, void *data)
       }
       break;
 
+    case CONST_WIDE_INT:
+      hwi = GET_MODE_PRECISION (mode);
+      for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	hwi ^= CONST_WIDE_INT_ELT (x, i);
+      goto fold_hwi;
+
     case CONST_DOUBLE:
-      if (mode == VOIDmode)
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && mode == VOIDmode)
 	{
 	  hwi = CONST_DOUBLE_LOW (x) ^ CONST_DOUBLE_HIGH (x);
 	  goto fold_hwi;
Index: gcc/hwint.c
===================================================================
--- gcc/hwint.c	(revision 191978)
+++ gcc/hwint.c	(working copy)
@@ -112,6 +112,29 @@  ffs_hwi (unsigned HOST_WIDE_INT x)
 int
 popcount_hwi (unsigned HOST_WIDE_INT x)
 {
+  /* Compute the popcount of a HWI using the algorithm from
+     Hacker's Delight.  */
+#if HOST_BITS_PER_WIDE_INT == 32
+  x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
+  x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
+  x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
+  x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF);
+  x = (x & 0x0000FFFF) + ((x >>16) & 0x0000FFFF);
+#elif HOST_BITS_PER_WIDE_INT == 64
+  x = ((x & HOST_WIDE_INT_C (0x5555555555555555))
+       + ((x >> 1) & HOST_WIDE_INT_C (0x5555555555555555)));
+  x = ((x & HOST_WIDE_INT_C (0x3333333333333333))
+       + ((x >> 2) & HOST_WIDE_INT_C (0x3333333333333333)));
+  x = ((x & HOST_WIDE_INT_C (0x0F0F0F0F0F0F0F0F))
+       + ((x >> 4) & HOST_WIDE_INT_C (0x0F0F0F0F0F0F0F0F)));
+  x = ((x & HOST_WIDE_INT_C (0x00FF00FF00FF00FF))
+       + ((x >> 8) & HOST_WIDE_INT_C (0x00FF00FF00FF00FF)));
+  x = ((x & HOST_WIDE_INT_C (0x0000FFFF0000FFFF))
+       + ((x >> 16) & HOST_WIDE_INT_C (0x0000FFFF0000FFFF)));
+  x = ((x & HOST_WIDE_INT_C (0x00000000FFFFFFFF))
+       + ((x >> 32) & HOST_WIDE_INT_C (0x00000000FFFFFFFF)));
+  return x;
+#else
   int i, ret = 0;
   size_t bits = sizeof (x) * CHAR_BIT;
 
@@ -122,6 +145,7 @@  popcount_hwi (unsigned HOST_WIDE_INT x)
     }
 
   return ret;
+#endif
 }
 
 #endif /* GCC_VERSION < 3004 */
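
The 64-bit branch above is the same Hacker's Delight reduction as the
32-bit one with one more folding step; the parenthesization matters,
since + binds tighter than &.  A self-contained version that can be
sanity-checked on any host (plain C, illustration only):

  #include <stdio.h>
  #include <stdint.h>

  static int
  popcount64 (uint64_t x)
  {
    x = (x & 0x5555555555555555ULL) + ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    x = (x & 0x00FF00FF00FF00FFULL) + ((x >> 8) & 0x00FF00FF00FF00FFULL);
    x = (x & 0x0000FFFF0000FFFFULL) + ((x >> 16) & 0x0000FFFF0000FFFFULL);
    x = (x & 0x00000000FFFFFFFFULL) + ((x >> 32) & 0x00000000FFFFFFFFULL);
    return (int) x;
  }

  int
  main (void)
  {
    printf ("%d %d %d\n", popcount64 (0), popcount64 (0xFFULL),
            popcount64 (UINT64_MAX));   /* prints: 0 8 64 */
    return 0;
  }
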
Index: gcc/hwint.h
===================================================================
--- gcc/hwint.h	(revision 191978)
+++ gcc/hwint.h	(working copy)
@@ -77,6 +77,40 @@  extern char sizeof_long_long_must_be_8[s
 # endif
 #endif
 
+/* Print support for half a host wide int.  */
+#define HOST_BITS_PER_HALF_WIDE_INT (HOST_BITS_PER_WIDE_INT / 2)
+#if HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_LONG
+# define HOST_HALF_WIDE_INT long
+# define HOST_HALF_WIDE_INT_PRINT HOST_LONG_FORMAT
+# define HOST_HALF_WIDE_INT_PRINT_C "L"
+# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
+# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
+# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
+# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
+# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
+#elif HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_INT
+# define HOST_HALF_WIDE_INT int
+# define HOST_HALF_WIDE_INT_PRINT ""
+# define HOST_HALF_WIDE_INT_PRINT_C ""
+# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
+# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
+# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
+# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
+# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
+#elif HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_SHORT
+# define HOST_HALF_WIDE_INT short
+# define HOST_HALF_WIDE_INT_PRINT "h"
+# define HOST_HALF_WIDE_INT_PRINT_C ""
+# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
+# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
+# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
+# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
+# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
+#else
+#error Please add support for HOST_HALF_WIDE_INT
+#endif
+
+
 #define HOST_WIDE_INT_1 HOST_WIDE_INT_C(1)
 
 /* This is a magic identifier which allows GCC to figure out the type
@@ -94,9 +128,13 @@  typedef HOST_WIDE_INT __gcc_host_wide_in
 # if HOST_BITS_PER_WIDE_INT == 64
 #  define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
      "0x%" HOST_LONG_FORMAT "x%016" HOST_LONG_FORMAT "x"
+#  define HOST_WIDE_INT_PRINT_PADDED_HEX \
+     "%016" HOST_LONG_FORMAT "x"
 # else
 #  define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
      "0x%" HOST_LONG_FORMAT "x%08" HOST_LONG_FORMAT "x"
+#  define HOST_WIDE_INT_PRINT_PADDED_HEX \
+     "%08" HOST_LONG_FORMAT "x"
 # endif
 #else
 # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT
@@ -104,6 +142,8 @@  typedef HOST_WIDE_INT __gcc_host_wide_in
   /* We can assume that 'long long' is at least 64 bits.  */
 # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
     "0x%" HOST_LONG_LONG_FORMAT "x%016" HOST_LONG_LONG_FORMAT "x"
+# define HOST_WIDE_INT_PRINT_PADDED_HEX \
+    "%016" HOST_LONG_LONG_FORMAT "x"
 #endif /* HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG */
 
 #define HOST_WIDE_INT_PRINT_DEC "%" HOST_WIDE_INT_PRINT "d"
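
The half-wide-int definitions exist so that wide-int.c can split each
HWI into two halves for multiply and divide, where a half times a half
always fits in a full HWI.  A sketch of the split and one partial
product (plain C, assuming a 64-bit HWI; illustration only):

  #include <stdio.h>
  #include <stdint.h>

  #define HALF_BITS 32
  #define HALF_MASK ((((uint64_t) 1) << HALF_BITS) - 1)

  int
  main (void)
  {
    uint64_t v = 0x123456789ABCDEF0ULL;
    uint32_t lo = v & HALF_MASK;        /* low half  */
    uint32_t hi = v >> HALF_BITS;       /* high half */
    /* A half times a half always fits in a full HWI, which is what
       keeps the schoolbook multiply loop exact.  */
    uint64_t prod = (uint64_t) lo * hi;
    printf ("hi=%08x lo=%08x prod=%016llx\n", hi, lo,
            (unsigned long long) prod);
    return 0;
  }
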
Index: gcc/postreload.c
===================================================================
--- gcc/postreload.c	(revision 191978)
+++ gcc/postreload.c	(working copy)
@@ -286,27 +286,25 @@  reload_cse_simplify_set (rtx set, rtx in
 #ifdef LOAD_EXTEND_OP
 	  if (extend_op != UNKNOWN)
 	    {
-	      HOST_WIDE_INT this_val;
+	      wide_int result;
 
-	      /* ??? I'm lazy and don't wish to handle CONST_DOUBLE.  Other
-		 constants, such as SYMBOL_REF, cannot be extended.  */
-	      if (!CONST_INT_P (this_rtx))
+	      if (!CONST_SCALAR_INT_P (this_rtx))
 		continue;
 
-	      this_val = INTVAL (this_rtx);
 	      switch (extend_op)
 		{
 		case ZERO_EXTEND:
-		  this_val &= GET_MODE_MASK (GET_MODE (src));
+		  result = (wide_int::from_rtx (this_rtx, GET_MODE (src))
+			    .zext (word_mode));
 		  break;
 		case SIGN_EXTEND:
-		  /* ??? In theory we're already extended.  */
-		  if (this_val == trunc_int_for_mode (this_val, GET_MODE (src)))
-		    break;
+		  result = (wide_int::from_rtx (this_rtx, GET_MODE (src))
+			    .sext (word_mode));
+		  break;
 		default:
 		  gcc_unreachable ();
 		}
-	      this_rtx = GEN_INT (this_val);
+	      this_rtx = immed_wide_int_const (result);
 	    }
 #endif
 	  this_cost = set_src_cost (this_rtx, speed);
Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c	(revision 191978)
+++ gcc/var-tracking.c	(working copy)
@@ -3386,6 +3386,23 @@  loc_cmp (rtx x, rtx y)
       default:
 	gcc_unreachable ();
       }
+  if (CONST_WIDE_INT_P (x))
+    {
+      /* Compare the vector length first.  */
+      if (CONST_WIDE_INT_NUNITS (x) > CONST_WIDE_INT_NUNITS (y))
+	return 1;
+      else if (CONST_WIDE_INT_NUNITS (x) < CONST_WIDE_INT_NUNITS (y))
+	return -1;
+
+      /* Compare the vector elements.  */
+      for (j = CONST_WIDE_INT_NUNITS (x) - 1; j >= 0 ; j--)
+	{
+	  if (CONST_WIDE_INT_ELT (x, j) < CONST_WIDE_INT_ELT (y, j))
+	    return -1;
+	  if (CONST_WIDE_INT_ELT (x, j) > CONST_WIDE_INT_ELT (y, j))
+	    return 1;
+	}
+    }
 
   return 0;
 }
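
The ordering this imposes on CONST_WIDE_INTs is: shorter vectors sort
first, and equal-length vectors are compared element by element from
the most significant word down.  The same ordering on bare arrays
(plain C, illustration only -- cmp_hwivec is just an example name):

  #include <stdio.h>

  static int
  cmp_hwivec (int xlen, const long long *x, int ylen, const long long *y)
  {
    int j;
    if (xlen != ylen)                 /* shorter vectors sort first */
      return xlen > ylen ? 1 : -1;
    for (j = xlen - 1; j >= 0; j--)   /* then most significant word down */
      if (x[j] != y[j])
        return x[j] > y[j] ? 1 : -1;
    return 0;
  }

  int
  main (void)
  {
    long long a[2] = { 2, 5 }, b[2] = { 3, 4 };
    printf ("%d\n", cmp_hwivec (2, a, 2, b));   /* prints 1: a > b */
    return 0;
  }
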
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	(revision 191978)
+++ gcc/tree.c	(working copy)
@@ -1067,6 +1067,21 @@  double_int_to_tree (tree type, double_in
   return build_int_cst_wide (type, cst.low, cst.high);
 }
 
+/* Construct a tree of type TYPE with value given by CST.  Signedness
+   of CST is assumed to be the same as the signedness of TYPE.  */
+
+tree
+wide_int_to_tree (tree type, const wide_int &cst)
+{
+  wide_int v;
+  if (TYPE_UNSIGNED (type))
+    v = cst.zext (TYPE_PRECISION (type));
+  else
+    v = cst.sext (TYPE_PRECISION (type));
+
+  return build_int_cst_wide (type, v.elt (0), v.elt (1));
+}
+
 /* Returns true if CST fits into range of TYPE.  Signedness of CST is assumed
    to be the same as the signedness of TYPE.  */
 
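
wide_int_to_tree reduces CST to TYPE_PRECISION by zero or sign
extension before building the tree constant.  The extension itself is
the usual mask-or-replicate operation on a host word; a sketch for
precisions below 64 (plain C, illustration only; sext_hwi and zext_hwi
are just example names):

  #include <stdio.h>
  #include <stdint.h>

  static int64_t
  sext_hwi (int64_t v, int prec)      /* sign extend the low PREC bits */
  {
    uint64_t m = (uint64_t) 1 << (prec - 1);
    uint64_t x = (uint64_t) v & (((uint64_t) 1 << prec) - 1);
    return (int64_t) ((x ^ m) - m);
  }

  static uint64_t
  zext_hwi (uint64_t v, int prec)     /* zero extend the low PREC bits */
  {
    return v & ((((uint64_t) 1) << prec) - 1);
  }

  int
  main (void)
  {
    printf ("%lld %llu\n", (long long) sext_hwi (0xFF, 8),
            (unsigned long long) zext_hwi (0xFF, 8));  /* prints: -1 255 */
    return 0;
  }
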
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	(revision 191978)
+++ gcc/tree.h	(working copy)
@@ -29,6 +29,7 @@  along with GCC; see the file COPYING3.
 #include "vec.h"
 #include "vecir.h"
 #include "double-int.h"
+#include "wide-int.h"
 #include "real.h"
 #include "fixed-value.h"
 #include "alias.h"
@@ -4719,6 +4720,10 @@  tree_to_double_int (const_tree cst)
 }
 
 extern tree double_int_to_tree (tree, double_int);
+#ifndef GENERATOR_FILE
+extern tree wide_int_to_tree (tree type, const wide_int &cst);
+#endif
+
 extern bool double_int_fits_to_tree_p (const_tree, double_int);
 extern tree force_fit_type_double (tree, double_int, int, bool);
 
Index: gcc/gensupport.c
===================================================================
--- gcc/gensupport.c	(revision 191978)
+++ gcc/gensupport.c	(working copy)
@@ -1671,7 +1671,13 @@  static const struct std_pred_table std_p
   {"scratch_operand", false, false, {SCRATCH, REG}},
   {"immediate_operand", false, true, {UNKNOWN}},
   {"const_int_operand", false, false, {CONST_INT}},
+#if TARGET_SUPPORTS_WIDE_INT
+  {"const_wide_int_operand", false, false, {CONST_WIDE_INT}},
+  {"const_scalar_int_operand", false, false, {CONST_INT, CONST_WIDE_INT}},
+  {"const_double_operand", false, false, {CONST_DOUBLE}},
+#else
   {"const_double_operand", false, false, {CONST_INT, CONST_DOUBLE}},
+#endif
   {"nonimmediate_operand", false, false, {SUBREG, REG, MEM}},
   {"nonmemory_operand", false, true, {SUBREG, REG}},
   {"push_operand", false, false, {MEM}},
Index: gcc/read-rtl.c
===================================================================
--- gcc/read-rtl.c	(revision 191978)
+++ gcc/read-rtl.c	(working copy)
@@ -679,6 +679,29 @@  validate_const_int (const char *string)
     fatal_with_file_and_line ("invalid decimal constant \"%s\"\n", string);
 }
 
+static void
+validate_const_wide_int (const char *string)
+{
+  const char *cp;
+  int valid = 1;
+
+  cp = string;
+  while (*cp && ISSPACE (*cp))
+    cp++;
+  /* Skip the leading 0x.  */
+  if (cp[0] == '0' && cp[1] == 'x')
+    cp += 2;
+  else
+    valid = 0;
+  if (*cp == 0)
+    valid = 0;
+  for (; *cp; cp++)
+    if (! ISXDIGIT (*cp))
+      valid = 0;
+  if (!valid)
+    fatal_with_file_and_line ("invalid hex constant \"%s\"\n", string);
+}
+
 /* Record that PTR uses iterator ITERATOR.  */
 
 static void
@@ -1064,6 +1087,56 @@  read_rtx_code (const char *code_name)
 	gcc_unreachable ();
       }
 
+  if (CONST_WIDE_INT_P (return_rtx))
+    {
+      read_name (&name);
+      validate_const_wide_int (name.string);
+      {
+	hwivec hwiv;
+	const char *s = name.string;
+	int len;
+	int index = 0;
+	int gs = HOST_BITS_PER_WIDE_INT / 4;
+	int pos;
+	char * buf = XALLOCAVEC (char, gs + 1);
+	unsigned HOST_WIDE_INT wi;
+	int wlen;
+
+	/* Skip the leading spaces.  */
+	while (*s && ISSPACE (*s))
+	  s++;
+
+	/* Skip the leading 0x.  */
+	gcc_assert (s[0] == '0');
+	gcc_assert (s[1] == 'x');
+	s += 2;
+
+	len = strlen (s);
+	pos = len - gs;
+	wlen = (len + gs - 1) / gs;	/* Number of words needed.  */
+
+	return_rtx = const_wide_int_alloc (wlen);
+
+	hwiv = CONST_WIDE_INT_VEC (return_rtx);
+	while (pos > 0)
+	  {
+#if HOST_BITS_PER_WIDE_INT == 64
+	    sscanf (s + pos, "%16" HOST_WIDE_INT_PRINT "x", &wi);
+#else
+	    sscanf (s + pos, "%8" HOST_WIDE_INT_PRINT "x", &wi);
+#endif
+	    XHWIVEC_ELT (hwiv, index++) = wi;
+	    pos -= gs;
+	  }
+	strncpy (buf, s, gs - pos);
+	buf [gs - pos] = 0;
+	sscanf (buf, "%" HOST_WIDE_INT_PRINT "x", &wi);
+	XHWIVEC_ELT (hwiv, index++) = wi;
+	/* TODO: After reading, do we want to canonicalize with:
+	   value = lookup_const_wide_int (value); ? */
+      }
+    }
+
   c = read_skip_spaces ();
   /* Syntactic sugar for AND and IOR, allowing Lisp-like
      arbitrary number of arguments for them.  */
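
The reader above walks the hex string from the right in groups of
HOST_BITS_PER_WIDE_INT/4 digits, so element 0 of the vector receives
the least significant word.  The same walk on its own, with strtoull
standing in for sscanf (plain C, illustration only):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <stdint.h>

  int
  main (void)
  {
    const char *s = "123456789abcdef0fedcba9876543210"; /* after "0x" */
    int gs = 16;                      /* hex digits per 64-bit word */
    int len = (int) strlen (s);
    int nwords = (len + gs - 1) / gs;
    uint64_t w[4] = { 0 };
    char buf[17];
    int pos = len, i = 0;

    while (pos > 0)                   /* right to left, GS digits each */
      {
        int start = pos > gs ? pos - gs : 0;
        memcpy (buf, s + start, pos - start);
        buf[pos - start] = 0;
        w[i++] = strtoull (buf, NULL, 16);
        pos = start;
      }
    for (i = nwords - 1; i >= 0; i--)
      printf ("%016llx", (unsigned long long) w[i]);  /* round-trips S */
    printf ("\n");
    return 0;
  }
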
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	(revision 191978)
+++ gcc/cse.c	(working copy)
@@ -2335,15 +2335,23 @@  hash_rtx_cb (const_rtx x, enum machine_m
                + (unsigned int) INTVAL (x));
       return hash;
 
+    case CONST_WIDE_INT:
+      {
+	int i;
+	for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
+	  hash += CONST_WIDE_INT_ELT (x, i);
+      }
+      return hash;
+
     case CONST_DOUBLE:
       /* This is like the general case, except that it only counts
 	 the integers representing the constant.  */
       hash += (unsigned int) code + (unsigned int) GET_MODE (x);
-      if (GET_MODE (x) != VOIDmode)
-	hash += real_hash (CONST_DOUBLE_REAL_VALUE (x));
-      else
+      if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode)
 	hash += ((unsigned int) CONST_DOUBLE_LOW (x)
 		 + (unsigned int) CONST_DOUBLE_HIGH (x));
+      else
+	hash += real_hash (CONST_DOUBLE_REAL_VALUE (x));
       return hash;
 
     case CONST_FIXED:
@@ -3759,6 +3767,7 @@  equiv_constant (rtx x)
 
       /* See if we previously assigned a constant value to this SUBREG.  */
       if ((new_rtx = lookup_as_function (x, CONST_INT)) != 0
+	  || (new_rtx = lookup_as_function (x, CONST_WIDE_INT)) != 0
           || (new_rtx = lookup_as_function (x, CONST_DOUBLE)) != 0
           || (new_rtx = lookup_as_function (x, CONST_FIXED)) != 0)
         return new_rtx;
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	(revision 191978)
+++ gcc/dwarf2out.c	(working copy)
@@ -1327,6 +1327,9 @@  dw_val_equal_p (dw_val_node *a, dw_val_n
       return (a->v.val_double.high == b->v.val_double.high
 	      && a->v.val_double.low == b->v.val_double.low);
 
+    case dw_val_class_wide_int:
+      return a->v.val_wide == b->v.val_wide;
+
     case dw_val_class_vec:
       {
 	size_t a_len = a->v.val_vec.elt_size * a->v.val_vec.length;
@@ -1578,6 +1581,10 @@  size_of_loc_descr (dw_loc_descr_ref loc)
 	  case dw_val_class_const_double:
 	    size += HOST_BITS_PER_DOUBLE_INT / BITS_PER_UNIT;
 	    break;
+	  case dw_val_class_wide_int:
+	    size += (loc->dw_loc_oprnd2.v.val_wide.full_len ()
+		     * HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT);
+	    break;
 	  default:
 	    gcc_unreachable ();
 	  }
@@ -1755,6 +1762,20 @@  output_loc_operands (dw_loc_descr_ref lo
 				 second, NULL);
 	  }
 	  break;
+	case dw_val_class_wide_int:
+	  {
+	    int i;
+	    int len = val2->v.val_wide.full_len ();
+	    if (WORDS_BIG_ENDIAN)
+	      for (i = len - 1; i >= 0; --i)
+		dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR,
+				     val2->v.val_wide.elt (i), NULL);
+	    else
+	      for (i = 0; i < len; ++i)
+		dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR,
+				     val2->v.val_wide.elt (i), NULL);
+	  }
+	  break;
 	case dw_val_class_addr:
 	  gcc_assert (val1->v.val_unsigned == DWARF2_ADDR_SIZE);
 	  dw2_asm_output_addr_rtx (DWARF2_ADDR_SIZE, val2->v.val_addr, NULL);
@@ -1957,6 +1978,21 @@  output_loc_operands (dw_loc_descr_ref lo
 	      dw2_asm_output_data (l, second, NULL);
 	    }
 	    break;
+	  case dw_val_class_wide_int:
+	    {
+	      int i;
+	      int len = val2->v.val_wide.full_len ();
+	      l = HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR;
+
+	      dw2_asm_output_data (1, len * l, NULL);
+	      if (WORDS_BIG_ENDIAN)
+		for (i = len - 1; i >= 0; --i)
+		  dw2_asm_output_data (l, val2->v.val_wide.elt (i), NULL);
+	      else
+		for (i = 0; i < len; ++i)
+		  dw2_asm_output_data (l, val2->v.val_wide.elt (i), NULL);
+	    }
+	    break;
 	  default:
 	    gcc_unreachable ();
 	  }
@@ -3060,7 +3096,7 @@  static void add_AT_location_description
 static void add_data_member_location_attribute (dw_die_ref, tree);
 static bool add_const_value_attribute (dw_die_ref, rtx);
 static void insert_int (HOST_WIDE_INT, unsigned, unsigned char *);
-static void insert_double (double_int, unsigned char *);
+static void insert_wide_int (wide_int *, unsigned char *);
 static void insert_float (const_rtx, unsigned char *);
 static rtx rtl_for_decl_location (tree);
 static bool add_location_or_const_value_attribute (dw_die_ref, tree, bool,
@@ -3557,6 +3593,20 @@  AT_unsigned (dw_attr_ref a)
-/* Add an unsigned double integer attribute value to a DIE.  */
+/* Add a wide_int attribute value to a DIE.  */
 
 static inline void
+add_AT_wide (dw_die_ref die, enum dwarf_attribute attr_kind,
+	     wide_int w)
+{
+  dw_attr_node attr;
+
+  attr.dw_attr = attr_kind;
+  attr.dw_attr_val.val_class = dw_val_class_wide_int;
+  attr.dw_attr_val.v.val_wide = w;
+  add_dwarf_attr (die, &attr);
+}
+
+/* Add an unsigned double integer attribute value to a DIE.  */
+
+static inline void
 add_AT_double (dw_die_ref die, enum dwarf_attribute attr_kind,
 	       HOST_WIDE_INT high, unsigned HOST_WIDE_INT low)
 {
@@ -4867,6 +4917,19 @@  print_die (dw_die_ref die, FILE *outfile
 		   a->dw_attr_val.v.val_double.high,
 		   a->dw_attr_val.v.val_double.low);
 	  break;
+	case dw_val_class_wide_int:
+	  {
+	    int i = a->dw_attr_val.v.val_wide.get_len ();
+	    fprintf (outfile, "constant (");
+	    gcc_assert (i > 0);
+	    if (a->dw_attr_val.v.val_wide.elt (i - 1) == 0)
+	      fprintf (outfile, "0x");
+	    fprintf (outfile, HOST_WIDE_INT_PRINT_HEX,
+		     a->dw_attr_val.v.val_wide.elt (--i));
+	    while (--i >= 0)
+	      fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX,
+		       a->dw_attr_val.v.val_wide.elt (i));
+	    fprintf (outfile, ")");
+	    break;
+	  }
 	case dw_val_class_vec:
 	  fprintf (outfile, "floating-point or vector constant");
 	  break;
@@ -5022,6 +5085,9 @@  attr_checksum (dw_attr_ref at, struct md
     case dw_val_class_const_double:
       CHECKSUM (at->dw_attr_val.v.val_double);
       break;
+    case dw_val_class_wide_int:
+      CHECKSUM (at->dw_attr_val.v.val_wide);
+      break;
     case dw_val_class_vec:
       CHECKSUM (at->dw_attr_val.v.val_vec);
       break;
@@ -5292,6 +5358,12 @@  attr_checksum_ordered (enum dwarf_tag ta
       CHECKSUM (at->dw_attr_val.v.val_double);
       break;
 
+    case dw_val_class_wide_int:
+      CHECKSUM_ULEB128 (DW_FORM_block);
+      CHECKSUM_ULEB128 (sizeof (at->dw_attr_val.v.val_wide));
+      CHECKSUM (at->dw_attr_val.v.val_wide);
+      break;
+
     case dw_val_class_vec:
       CHECKSUM_ULEB128 (DW_FORM_block);
       CHECKSUM_ULEB128 (sizeof (at->dw_attr_val.v.val_vec));
@@ -5756,6 +5828,8 @@  same_dw_val_p (const dw_val_node *v1, co
     case dw_val_class_const_double:
       return v1->v.val_double.high == v2->v.val_double.high
 	     && v1->v.val_double.low == v2->v.val_double.low;
+    case dw_val_class_wide_int:
+      return v1->v.val_wide == v2->v.val_wide;
     case dw_val_class_vec:
       if (v1->v.val_vec.length != v2->v.val_vec.length
 	  || v1->v.val_vec.elt_size != v2->v.val_vec.elt_size)
@@ -7207,6 +7281,13 @@  size_of_die (dw_die_ref die)
 	  if (HOST_BITS_PER_WIDE_INT >= 64)
 	    size++; /* block */
 	  break;
+	case dw_val_class_wide_int:
+	  size += (a->dw_attr_val.v.val_wide.full_len ()
+		   * HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR);
+	  if (a->dw_attr_val.v.val_wide.full_len () * HOST_BITS_PER_WIDE_INT
+	      > 64)
+	    size++; /* block */
+	  break;
 	case dw_val_class_vec:
 	  size += constant_size (a->dw_attr_val.v.val_vec.length
 				 * a->dw_attr_val.v.val_vec.elt_size)
@@ -7531,6 +7612,20 @@  value_format (dw_attr_ref a)
 	default:
 	  return DW_FORM_block1;
 	}
+    case dw_val_class_wide_int:
+      switch (a->dw_attr_val.v.val_wide.full_len () * HOST_BITS_PER_WIDE_INT)
+	{
+	case 8:
+	  return DW_FORM_data1;
+	case 16:
+	  return DW_FORM_data2;
+	case 32:
+	  return DW_FORM_data4;
+	case 64:
+	  return DW_FORM_data8;
+	default:
+	  return DW_FORM_block1;
+	}
     case dw_val_class_vec:
       switch (constant_size (a->dw_attr_val.v.val_vec.length
 			     * a->dw_attr_val.v.val_vec.elt_size))
@@ -7878,6 +7973,32 @@  output_die (dw_die_ref die)
 	  }
 	  break;
 
+	case dw_val_class_wide_int:
+	  {
+	    int i;
+	    int len = a->dw_attr_val.v.val_wide.full_len ();
+	    int l = HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR;
+	    if (len * HOST_BITS_PER_WIDE_INT > 64)
+	      dw2_asm_output_data (1, a->dw_attr_val.v.val_wide.full_len () * l,
+				   NULL);
+
+	    if (WORDS_BIG_ENDIAN)
+	      for (i = len - 1; i >= 0; --i)
+		{
+		  dw2_asm_output_data (l, a->dw_attr_val.v.val_wide.elt (i),
+				       name);
+		  name = NULL;
+		}
+	    else
+	      for (i = 0; i < len; ++i)
+		{
+		  dw2_asm_output_data (l, a->dw_attr_val.v.val_wide.elt (i),
+				       name);
+		  name = NULL;
+		}
+	  }
+	  break;
+
 	case dw_val_class_vec:
 	  {
 	    unsigned int elt_size = a->dw_attr_val.v.val_vec.elt_size;
@@ -10846,9 +10967,8 @@  clz_loc_descriptor (rtx rtl, enum machin
     msb = GEN_INT ((unsigned HOST_WIDE_INT) 1
 		   << (GET_MODE_BITSIZE (mode) - 1));
   else
-    msb = immed_double_const (0, (unsigned HOST_WIDE_INT) 1
-				  << (GET_MODE_BITSIZE (mode)
-				      - HOST_BITS_PER_WIDE_INT - 1), mode);
+    msb = immed_wide_int_const 
+      (wide_int::set_bit_in_zero (GET_MODE_PRECISION (mode) - 1, mode));
   if (GET_CODE (msb) == CONST_INT && INTVAL (msb) < 0)
     tmp = new_loc_descr (HOST_BITS_PER_WIDE_INT == 32
 			 ? DW_OP_const4u : HOST_BITS_PER_WIDE_INT == 64
@@ -11784,7 +11904,16 @@  mem_loc_descriptor (rtx rtl, enum machin
 	  mem_loc_result->dw_loc_oprnd1.val_class = dw_val_class_die_ref;
 	  mem_loc_result->dw_loc_oprnd1.v.val_die_ref.die = type_die;
 	  mem_loc_result->dw_loc_oprnd1.v.val_die_ref.external = 0;
-	  if (SCALAR_FLOAT_MODE_P (mode))
+#if TARGET_SUPPORTS_WIDE_INT == 0
+	  if (!SCALAR_FLOAT_MODE_P (mode))
+	    {
+	      mem_loc_result->dw_loc_oprnd2.val_class
+		= dw_val_class_const_double;
+	      mem_loc_result->dw_loc_oprnd2.v.val_double
+		= rtx_to_double_int (rtl);
+	    }
+	  else
+#endif
 	    {
 	      unsigned int length = GET_MODE_SIZE (mode);
 	      unsigned char *array
@@ -11796,13 +11925,26 @@  mem_loc_descriptor (rtx rtl, enum machin
 	      mem_loc_result->dw_loc_oprnd2.v.val_vec.elt_size = 4;
 	      mem_loc_result->dw_loc_oprnd2.v.val_vec.array = array;
 	    }
-	  else
-	    {
-	      mem_loc_result->dw_loc_oprnd2.val_class
-		= dw_val_class_const_double;
-	      mem_loc_result->dw_loc_oprnd2.v.val_double
-		= rtx_to_double_int (rtl);
-	    }
+	}
+      break;
+
+    case CONST_WIDE_INT:
+      if (!dwarf_strict)
+	{
+	  dw_die_ref type_die;
+
+	  type_die = base_type_for_mode (mode,
+					 GET_MODE_CLASS (mode) == MODE_INT);
+	  if (type_die == NULL)
+	    return NULL;
+	  mem_loc_result = new_loc_descr (DW_OP_GNU_const_type, 0, 0);
+	  mem_loc_result->dw_loc_oprnd1.val_class = dw_val_class_die_ref;
+	  mem_loc_result->dw_loc_oprnd1.v.val_die_ref.die = type_die;
+	  mem_loc_result->dw_loc_oprnd1.v.val_die_ref.external = 0;
+	  mem_loc_result->dw_loc_oprnd2.val_class
+	    = dw_val_class_wide_int;
+	  mem_loc_result->dw_loc_oprnd2.v.val_wide
+	    = wide_int::from_rtx (rtl, mode);
 	}
       break;
 
@@ -12273,7 +12415,15 @@  loc_descriptor (rtx rtl, enum machine_mo
 	     adequately represented.  We output CONST_DOUBLEs as blocks.  */
 	  loc_result = new_loc_descr (DW_OP_implicit_value,
 				      GET_MODE_SIZE (mode), 0);
-	  if (SCALAR_FLOAT_MODE_P (mode))
+#if TARGET_SUPPORTS_WIDE_INT == 0
+	  if (!SCALAR_FLOAT_MODE_P (mode))
+	    {
+	      loc_result->dw_loc_oprnd2.val_class = dw_val_class_const_double;
+	      loc_result->dw_loc_oprnd2.v.val_double
+	        = rtx_to_double_int (rtl);
+	    }
+	  else
+#endif
 	    {
 	      unsigned int length = GET_MODE_SIZE (mode);
 	      unsigned char *array
@@ -12285,12 +12435,26 @@  loc_descriptor (rtx rtl, enum machine_mo
 	      loc_result->dw_loc_oprnd2.v.val_vec.elt_size = 4;
 	      loc_result->dw_loc_oprnd2.v.val_vec.array = array;
 	    }
-	  else
-	    {
-	      loc_result->dw_loc_oprnd2.val_class = dw_val_class_const_double;
-	      loc_result->dw_loc_oprnd2.v.val_double
-	        = rtx_to_double_int (rtl);
-	    }
+	}
+      break;
+
+    case CONST_WIDE_INT:
+      if (mode == VOIDmode)
+	mode = GET_MODE (rtl);
+
+      if (mode != VOIDmode && (dwarf_version >= 4 || !dwarf_strict))
+	{
+	  gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl));
+
+	  /* A CONST_WIDE_INT is always an integer constant that needs
+	     more than one word to be adequately represented.  We output
+	     it as a block.  */
+	  loc_result = new_loc_descr (DW_OP_implicit_value,
+				      GET_MODE_SIZE (mode), 0);
+	  loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int;
+	  loc_result->dw_loc_oprnd2.v.val_wide
+	    = wide_int::from_rtx (rtl, mode);
 	}
       break;
 
@@ -12306,6 +12470,7 @@  loc_descriptor (rtx rtl, enum machine_mo
 	    ggc_alloc_atomic (length * elt_size);
 	  unsigned int i;
 	  unsigned char *p;
+	  enum machine_mode imode = GET_MODE_INNER (mode);
 
 	  gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl));
 	  switch (GET_MODE_CLASS (mode))
@@ -12314,15 +12479,8 @@  loc_descriptor (rtx rtl, enum machine_mo
 	      for (i = 0, p = array; i < length; i++, p += elt_size)
 		{
 		  rtx elt = CONST_VECTOR_ELT (rtl, i);
-		  double_int val = rtx_to_double_int (elt);
-
-		  if (elt_size <= sizeof (HOST_WIDE_INT))
-		    insert_int (val.to_shwi (), elt_size, p);
-		  else
-		    {
-		      gcc_assert (elt_size == 2 * sizeof (HOST_WIDE_INT));
-		      insert_double (val, p);
-		    }
+		  wide_int val = wide_int::from_rtx (elt, imode);
+		  insert_wide_int (&val, p);
 		}
 	      break;
 
@@ -12349,8 +12507,8 @@  loc_descriptor (rtx rtl, enum machine_mo
 
     case CONST:
       if (mode == VOIDmode
-	  || GET_CODE (XEXP (rtl, 0)) == CONST_INT
-	  || GET_CODE (XEXP (rtl, 0)) == CONST_DOUBLE
+	  || CONST_SCALAR_INT_P (XEXP (rtl, 0))
+	  || CONST_DOUBLE_AS_FLOAT_P (XEXP (rtl, 0))
 	  || GET_CODE (XEXP (rtl, 0)) == CONST_VECTOR)
 	{
 	  loc_result = loc_descriptor (XEXP (rtl, 0), mode, initialized);
@@ -13960,22 +14118,27 @@  extract_int (const unsigned char *src, u
   return val;
 }
 
-/* Writes double_int values to dw_vec_const array.  */
+/* Writes wide_int values to dw_vec_const array.  */
 
 static void
-insert_double (double_int val, unsigned char *dest)
+insert_wide_int (wide_int *val, unsigned char *dest)
 {
-  unsigned char *p0 = dest;
-  unsigned char *p1 = dest + sizeof (HOST_WIDE_INT);
+  int i;
 
   if (WORDS_BIG_ENDIAN)
-    {
-      p0 = p1;
-      p1 = dest;
-    }
-
-  insert_int ((HOST_WIDE_INT) val.low, sizeof (HOST_WIDE_INT), p0);
-  insert_int ((HOST_WIDE_INT) val.high, sizeof (HOST_WIDE_INT), p1);
+    for (i = val->full_len () - 1; i >= 0; i--)
+      {
+	insert_int ((HOST_WIDE_INT) val->elt (i), 
+		    sizeof (HOST_WIDE_INT), dest);
+	dest += sizeof (HOST_WIDE_INT);
+      }
+  else
+    for (i = 0; i < val->full_len (); i++)
+      {
+	insert_int ((HOST_WIDE_INT) val->elt (i), 
+		    sizeof (HOST_WIDE_INT), dest);
+	dest += sizeof (HOST_WIDE_INT);
+      }
 }
 
 /* Writes floating point values to dw_vec_const array.  */
@@ -14020,6 +14183,11 @@  add_const_value_attribute (dw_die_ref di
       }
       return true;
 
+    case CONST_WIDE_INT:
+      add_AT_wide (die, DW_AT_const_value,
+		   wide_int::from_rtx (rtl, GET_MODE (rtl)));
+      return true;
+
     case CONST_DOUBLE:
       /* Note that a CONST_DOUBLE rtx could represent either an integer or a
 	 floating-point constant.  A CONST_DOUBLE is used whenever the
@@ -14028,7 +14196,10 @@  add_const_value_attribute (dw_die_ref di
       {
 	enum machine_mode mode = GET_MODE (rtl);
 
-	if (SCALAR_FLOAT_MODE_P (mode))
+	if (TARGET_SUPPORTS_WIDE_INT == 0 && !SCALAR_FLOAT_MODE_P (mode))
+	  add_AT_double (die, DW_AT_const_value,
+			 CONST_DOUBLE_HIGH (rtl), CONST_DOUBLE_LOW (rtl));
+	else
 	  {
 	    unsigned int length = GET_MODE_SIZE (mode);
 	    unsigned char *array = (unsigned char *) ggc_alloc_atomic (length);
@@ -14036,9 +14207,6 @@  add_const_value_attribute (dw_die_ref di
 	    insert_float (rtl, array);
 	    add_AT_vec (die, DW_AT_const_value, length / 4, 4, array);
 	  }
-	else
-	  add_AT_double (die, DW_AT_const_value,
-			 CONST_DOUBLE_HIGH (rtl), CONST_DOUBLE_LOW (rtl));
       }
       return true;
 
@@ -14051,6 +14219,7 @@  add_const_value_attribute (dw_die_ref di
 	  (length * elt_size);
 	unsigned int i;
 	unsigned char *p;
+	enum machine_mode imode = GET_MODE_INNER (mode);
 
 	switch (GET_MODE_CLASS (mode))
 	  {
@@ -14058,15 +14227,8 @@  add_const_value_attribute (dw_die_ref di
 	    for (i = 0, p = array; i < length; i++, p += elt_size)
 	      {
 		rtx elt = CONST_VECTOR_ELT (rtl, i);
-		double_int val = rtx_to_double_int (elt);
-
-		if (elt_size <= sizeof (HOST_WIDE_INT))
-		  insert_int (val.to_shwi (), elt_size, p);
-		else
-		  {
-		    gcc_assert (elt_size == 2 * sizeof (HOST_WIDE_INT));
-		    insert_double (val, p);
-		  }
+		wide_int val = wide_int::from_rtx (elt, imode);
+		insert_wide_int (&val, p);
 	      }
 	    break;
 
@@ -21863,6 +22025,9 @@  hash_loc_operands (dw_loc_descr_ref loc,
 	  hash = iterative_hash_object (val2->v.val_double.low, hash);
 	  hash = iterative_hash_object (val2->v.val_double.high, hash);
 	  break;
+	case dw_val_class_wide_int:
+	  hash = iterative_hash_object (val2->v.val_wide, hash);
+	  break;
 	case dw_val_class_addr:
 	  hash = iterative_hash_rtx (val2->v.val_addr, hash);
 	  break;
@@ -21941,6 +22106,9 @@  hash_loc_operands (dw_loc_descr_ref loc,
 	    hash = iterative_hash_object (val2->v.val_double.low, hash);
 	    hash = iterative_hash_object (val2->v.val_double.high, hash);
 	    break;
+	  case dw_val_class_wide_int:
+	    hash = iterative_hash_object (val2->v.val_wide, hash);
+	    break;
 	  default:
 	    gcc_unreachable ();
 	  }
@@ -22086,6 +22254,8 @@  compare_loc_operands (dw_loc_descr_ref x
 	case dw_val_class_const_double:
 	  return valx2->v.val_double.low == valy2->v.val_double.low
 		 && valx2->v.val_double.high == valy2->v.val_double.high;
+	case dw_val_class_wide_int:
+	  return valx2->v.val_wide == valy2->v.val_wide;
 	case dw_val_class_addr:
 	  return rtx_equal_p (valx2->v.val_addr, valy2->v.val_addr);
 	default:
@@ -22122,6 +22292,8 @@  compare_loc_operands (dw_loc_descr_ref x
 	case dw_val_class_const_double:
 	  return valx2->v.val_double.low == valy2->v.val_double.low
 		 && valx2->v.val_double.high == valy2->v.val_double.high;
+	case dw_val_class_wide_int:
+	  return valx2->v.val_wide == valy2->v.val_wide;
 	default:
 	  gcc_unreachable ();
 	}
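
A note on the emission order used in insert_wide_int and output_die
above: the element vector is walked from the most significant word
down on big-endian targets (starting at len - 1) and from element 0 up
otherwise.  The loop shape in isolation (plain C, illustration only --
emit_words is just an example name):

  #include <stdio.h>
  #include <stdint.h>

  static void
  emit_words (const uint64_t *val, int len, int big_endian)
  {
    int i;
    if (big_endian)
      for (i = len - 1; i >= 0; i--)            /* high word first */
        printf ("%016llx\n", (unsigned long long) val[i]);
    else
      for (i = 0; i < len; i++)                 /* low word first */
        printf ("%016llx\n", (unsigned long long) val[i]);
  }

  int
  main (void)
  {
    uint64_t v[2] = { 0x2222222222222222ULL, 0x1111111111111111ULL };
    emit_words (v, 2, 1);
    return 0;
  }
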
Index: gcc/dwarf2out.h
===================================================================
--- gcc/dwarf2out.h	(revision 191978)
+++ gcc/dwarf2out.h	(working copy)
@@ -22,6 +22,7 @@  along with GCC; see the file COPYING3.
 #define GCC_DWARF2OUT_H 1
 
 #include "dwarf2.h"	/* ??? Remove this once only used by dwarf2foo.c.  */
+#include "wide-int.h"
 
 typedef struct die_struct *dw_die_ref;
 typedef const struct die_struct *const_dw_die_ref;
@@ -143,6 +144,7 @@  enum dw_val_class
   dw_val_class_const,
   dw_val_class_unsigned_const,
   dw_val_class_const_double,
+  dw_val_class_wide_int,
   dw_val_class_vec,
   dw_val_class_flag,
   dw_val_class_die_ref,
@@ -181,6 +183,7 @@  typedef struct GTY(()) dw_val_struct {
       HOST_WIDE_INT GTY ((default)) val_int;
       unsigned HOST_WIDE_INT GTY ((tag ("dw_val_class_unsigned_const"))) val_unsigned;
       double_int GTY ((tag ("dw_val_class_const_double"))) val_double;
+      wide_int GTY ((tag ("dw_val_class_wide_int"))) val_wide;
       dw_vec_const GTY ((tag ("dw_val_class_vec"))) val_vec;
       struct dw_val_die_union
 	{
Index: gcc/wide-int.c
===================================================================
--- gcc/wide-int.c	(revision 0)
+++ gcc/wide-int.c	(revision 0)
@@ -0,0 +1,4257 @@ 
+/* Operations with very long integers.
+   Copyright (C) 2012 Free Software Foundation, Inc.
+   Contributed by Kenneth Zadeck <zadeck@naturalbridge.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "hwint.h"
+#include "wide-int.h"
+#include "rtl.h"
+#include "tree.h"
+#include "dumpfile.h"
+
+#define DEBUG_WIDE_INT
+#ifdef DEBUG_WIDE_INT
+  /* Debugging routines.  */
+  static void debug_vw  (const char* name, int r, const wide_int& o0);
+  static void debug_vwh (const char* name, int r, const wide_int &o0,
+			 HOST_WIDE_INT o1);
+  static void debug_vww (const char* name, int r, const wide_int &o0,
+			 const wide_int &o1);
+  static void debug_wv (const char* name, const wide_int &r, int v0);
+  static void debug_wvv (const char* name, const wide_int &r, int v0,
+			 int v1);
+  static void debug_wvvv (const char* name, const wide_int &r, int v0,
+			  int v1, int v2);
+  static void debug_wwv (const char* name, const wide_int &r,
+			 const wide_int &o0, int v0);
+  static void debug_wwwvv (const char* name, const wide_int &r,
+			   const wide_int &o0, const wide_int &o1,
+			   int v0, int v1);
+  static void debug_ww (const char* name, const wide_int &r,
+			const wide_int &o0);
+  static void debug_www (const char* name, const wide_int &r,
+			 const wide_int &o0, const wide_int &o1);
+  static void debug_wwwv (const char* name, const wide_int &r,
+			  const wide_int &o0, const wide_int &o1,
+			  int v0);
+  static void debug_wwww (const char* name, const wide_int &r,
+			  const wide_int &o0, const wide_int &o1, 
+			  const wide_int &o2);
+#endif
+
+/* Debugging routines.  */
+
+/* This is the maximal size of the buffer needed for dump.  */
+const int MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
+		 + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT + 32);
+
+#ifdef DEBUG_WIDE_INT
+
+/*
+ * Internal utilities.
+ */
+
+/* Quantities to deal with values that hold half of a wide int.  Used
+   in multiply and divide.  */
+#define HALF_INT_MASK (((HOST_WIDE_INT)1 << HOST_BITS_PER_HALF_WIDE_INT) - 1)
+
+#define BLOCK_OF(TARGET) ((TARGET) / HOST_BITS_PER_WIDE_INT)
+#define BLOCKS_NEEDED(PREC) \
+  (((PREC) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT)
+
+/*
+ * Conversion routines in and out of wide-int.
+ */
+
+/* Convert OP0 into a wide int.  If the precision is less than
+   HOST_BITS_PER_WIDE_INT, sign extend the value of the word.  */
+
+wide_int
+wide_int::from_shwi (HOST_WIDE_INT op0, enum machine_mode mode)
+{
+  wide_int result;
+  int prec = GET_MODE_PRECISION (mode);
+
+  result.mode = mode;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    op0 = sext (op0, prec);
+
+  result.val[0] = op0;
+  result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    {
+      char buf0[MAX];
+      fprintf (dump_file, "%s: %s = " HOST_WIDE_INT_PRINT_HEX "\n",
+	       "wide_int::from_shwi", result.dump (buf0), op0);
+    }
+#endif
+
+  return result;
+}
+
+/* Convert OP0 into a wide int.  If the precision is less than
+   HOST_BITS_PER_WIDE_INT, sign extend the value of the word.  The
+   overflow flag is set if the value was too large to fit in the
+   mode.  */
+
+wide_int
+wide_int::from_shwi (HOST_WIDE_INT op0, enum machine_mode mode, bool *overflow)
+{
+  wide_int result;
+  int prec = GET_MODE_PRECISION (mode);
+
+  result.mode = mode;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    {
+      HOST_WIDE_INT t = sext (op0, prec);
+      if (t != op0)
+	*overflow = true; 
+      op0 = t;
+    }
+
+  result.val[0] = op0;
+  result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    {
+      char buf0[MAX];
+      fprintf (dump_file, "%s: %s = " HOST_WIDE_INT_PRINT_HEX "\n",
+	       "wide_int::from_shwi", result.dump (buf0), op0);
+    }
+#endif
+
+  return result;
+}
+
+/* Convert OP0 into a wide int.  If the precision is less than
+   HOST_BITS_PER_WIDE_INT, zero extend the value of the word.  */
+
+wide_int
+wide_int::from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode mode)
+{
+  wide_int result;
+  int prec = GET_MODE_PRECISION (mode);
+
+  result.mode = mode;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    op0 = zext (op0, prec);
+
+  result.val[0] = op0;
+
+  /* If the top bit is a 1, we need to add another word of 0s since
+     that would not expand the right value since the infinite
+     expansion of any unsigned number must have 0s at the top.  */
+  if ((HOST_WIDE_INT)op0 < 0 && prec > HOST_BITS_PER_WIDE_INT)
+    {
+      result.val[1] = 0;
+      result.len = 2;
+    }
+  else
+    result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    {
+      char buf0[MAX];
+      fprintf (dump_file, "%s: %s = " HOST_WIDE_INT_PRINT_HEX "\n",
+	       "wide_int::from_uhwi", result.dump (buf0), op0);
+    }
+#endif
+
+  return result;
+}
+
+/* Convert OP0 into a wide int.  If the precision is less than
+   HOST_BITS_PER_WIDE_INT, zero extend the value of the word.  The
+   overflow flag is set if the value was too large to fit in the
+   mode.  */
+
+wide_int
+wide_int::from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode mode, bool *overflow)
+{
+  wide_int result;
+  int prec = GET_MODE_PRECISION (mode);
+
+  result.mode = mode;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    {
+      unsigned HOST_WIDE_INT t = zext (op0, prec);
+      if (t != op0)
+	*overflow = true; 
+      op0 = t;
+    }
+
+  result.val[0] = op0;
+
+  /* If the top bit is a 1, we need to add another word of 0s since
+     that would not expand the right value since the infinite
+     expansion of any unsigned number must have 0s at the top.  */
+  if ((HOST_WIDE_INT)op0 < 0 && prec > HOST_BITS_PER_WIDE_INT)
+    {
+      result.val[1] = 0;
+      result.len = 2;
+    }
+  else
+    result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    {
+      char buf0[MAX];
+      fprintf (dump_file, "%s: %s = " HOST_WIDE_INT_PRINT_HEX "\n",
+	       "wide_int::from_uhwi", result.dump (buf0), op0);
+    }
+#endif
+
+  return result;
+}
+
+/* Convert a double int into a wide int.  */
+
+wide_int
+wide_int::from_double_int (enum machine_mode mode, double_int di)
+{
+  wide_int result;
+  result.mode = mode;
+  result.len = 2;
+  result.val[0] = di.low;
+  result.val[1] = di.high;
+  result.canonize ();
+  return result;
+}
+
+/* Convert an integer cst into a wide int.  */
+
+wide_int
+wide_int::from_int_cst (const_tree tcst)
+{
+#if 1
+  wide_int result;
+  tree type = TREE_TYPE (tcst);
+  int prec = TYPE_PRECISION (type);
+  HOST_WIDE_INT op = TREE_INT_CST_LOW (tcst);
+
+  result.mode = TYPE_MODE (type);
+  result.len = (prec <= HOST_BITS_PER_WIDE_INT) ? 1 : 2;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    {
+      if (TYPE_UNSIGNED (type))
+	result.val[0] = zext (op, prec);
+      else
+	result.val[0] = sext (op, prec);
+    }
+  else
+    {
+      result.val[0] = op;
+      if (prec > HOST_BITS_PER_WIDE_INT)
+	{
+	  if (prec < HOST_BITS_PER_DOUBLE_INT)
+	    {
+	      op = TREE_INT_CST_HIGH (tcst);
+	      if (TYPE_UNSIGNED (type))
+		result.val[1] = zext (op, prec - HOST_BITS_PER_WIDE_INT);
+	      else
+		result.val[1] = sext (op, prec - HOST_BITS_PER_WIDE_INT);
+	    }
+	  else
+	    result.val[1] = TREE_INT_CST_HIGH (tcst);
+	}
+    }
+
+  if (result.len == 2)
+    result.canonize ();
+
+  return result;
+#endif
+  /* This is the code once the tree level is converted.  */
+#if 0
+  wide_int result;
+  int i;
+
+  tree type = TREE_TYPE (tcst);
+
+  result.mode = TYPE_MODE (type);
+  result.len = TREE_INT_CST_LEN (tcst);
+  for (i = 0; i < result.len; i++)
+    result.val[i] = TREE_INT_CST_ELT (tcst, i);
+
+  return result;
+#endif
+}
+
+/* Extract a constant integer from X.  The bits of the integer are
+   returned.  */
+
+wide_int
+wide_int::from_rtx (const_rtx x, enum machine_mode mode)
+{
+  wide_int result;
+  int prec = GET_MODE_PRECISION (mode);
+
+  gcc_assert (mode != VOIDmode);
+
+  result.mode = mode;
+
+  switch (GET_CODE (x))
+    {
+    case CONST_INT:
+      if ((prec & (HOST_BITS_PER_WIDE_INT - 1)) != 0)
+	result.val[0] = sext (INTVAL (x), prec);
+      else
+	result.val[0] = INTVAL (x);
+      result.len = 1;
+      break;
+
+#if TARGET_SUPPORTS_WIDE_INT
+    case CONST_WIDE_INT:
+      {
+	int i;
+	result.len = CONST_WIDE_INT_NUNITS (x);
+	
+	for (i = 0; i < result.len; i++)
+	  result.val[i] = CONST_WIDE_INT_ELT (x, i);
+      }
+      break;
+#else
+    case CONST_DOUBLE:
+      result.len = 2;
+      result.val[0] = CONST_DOUBLE_LOW (x);
+      result.val[1] = CONST_DOUBLE_HIGH (x);
+      result.canonize ();
+      break;
+#endif
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return result;
+}
+
+/* Return THIS as a signed HOST_WIDE_INT.  If THIS does not fit in
+   PREC, the information is lost. */
+
+HOST_WIDE_INT 
+wide_int::to_shwi (int prec) const
+{
+  HOST_WIDE_INT result;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    result = sext (val[0], prec);
+  else
+    result = val[0];
+
+  return result;
+}
+
+/* Return THIS as a signed HOST_WIDE_INT.  If THIS is too large for
+   the mode's precision, the information is lost. */
+
+HOST_WIDE_INT 
+wide_int::to_shwi () const
+{
+  return to_shwi (precision ());
+}
+
+/* Return THIS as an unsigned HOST_WIDE_INT.  If THIS does not fit in
+   PREC, the information is lost. */
+
+unsigned HOST_WIDE_INT 
+wide_int::to_uhwi (int prec) const
+{
+  unsigned HOST_WIDE_INT result;
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    result = zext (val[0], prec);
+  else
+    result = val[0];
+
+  return result;
+}
+
+/* Return THIS as an unsigned HOST_WIDE_INT.  If THIS is too large for
+   the mode's precision, the information is lost. */
+
+unsigned HOST_WIDE_INT 
+wide_int::to_uhwi () const
+{
+  return to_uhwi (precision ());
+}
+
+/*
+ * Largest and smallest values in a mode.
+ */
+
+/* Produce the largest number that is represented in MODE.  The result
+   is a wide_int of MODE.  SGN must be SIGNED or UNSIGNED.  */
+
+wide_int
+wide_int::max_value (const enum machine_mode mode, Op sgn)
+{
+  return max_value (mode, GET_MODE_PRECISION (mode), sgn);
+}
+
+/* Produce the largest number that is represented in PREC.  The result
+   is a wide_int of MODE.   SGN must be SIGNED or UNSIGNED.  */
+
+wide_int
+wide_int::max_value (const enum machine_mode mode, int prec, Op sgn)
+{
+  wide_int result;
+  
+  result.mode = mode;
+
+  if (sgn == UNSIGNED)
+    {
+      /* The unsigned max is just all ones, for which the compressed
+	 rep is just a single HWI.  */ 
+      result.len = 1;
+      result.val[0] = (HOST_WIDE_INT)-1;
+    }
+  else
+    {
+      /* The signed max is all ones except the top bit.  This must be
+	 explicitly represented.  */
+      int i;
+      int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+      int shift = (small_prec == 0) 
+	? HOST_BITS_PER_WIDE_INT - 1 : small_prec - 1;
+
+      result.len = BLOCKS_NEEDED (prec);
+      for (i = 0; i < result.len - 1; i++)
+	result.val[i] = (HOST_WIDE_INT)-1;
+
+      result.val[result.len - 1] = ((HOST_WIDE_INT)1 << shift) - 1;
+    }
+  
+  return result;
+}
+
+/* Produce the smallest number that is represented in MODE. The result
+   is a wide_int of MODE.   SGN must be SIGNED or UNSIGNED.   */
+
+wide_int
+wide_int::min_value (const enum machine_mode mode, Op sgn)
+{
+  return min_value (mode, GET_MODE_PRECISION (mode), sgn);
+}
+
+/* Produce the smallest number that is represented in PREC. The result
+   is a wide_int of MODE.   SGN must be SIGNED or UNSIGNED.   */
+
+wide_int
+wide_int::min_value (const enum machine_mode mode, int prec, Op sgn)
+{
+  if (sgn == UNSIGNED)
+    {
+      /* The unsigned min is just all zeros, for which the compressed
+	 rep is just a single HWI.  */ 
+      wide_int result;
+      result.len = 1;
+      result.mode = mode;
+      result.val[0] = 0;
+      return result;
+    }
+  else
+    {
+      /* The signed min is all zeros except the top bit.  This must be
+	 explicitly represented.  */
+      return set_bit_in_zero (prec - 1, mode);
+    }
+}
+
+/*
+ * Public utilities.
+ */
+
+/* Check the upper HOST_WIDE_INTs of THIS to see if the length can be
+   shortened.  An upper HOST_WIDE_INT is unnecessary if it is all ones
+   or zeros and the top bit of the next lower word matches.
+
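+   For example, with 64-bit HWIs and a 256-bit mode, the value 5 is
+   stored as val = { 5 } with len == 1, -1 as val = { -1 } with
+   len == 1, and 2^64 as val = { 0, 1 } with len == 2, where the high
+   zero block is what marks the value as positive.
+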
+   This function may change the representation of THIS, but does not
+   change the value that THIS represents.  It does not sign extend in
+   the case that the size of the mode is less than
+   HOST_BITS_PER_WIDE_INT.  */
+
+void
+wide_int::canonize ()
+{
+  int prec = precision ();
+  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  int blocks_needed = BLOCKS_NEEDED (prec);
+  HOST_WIDE_INT top;
+  int i;
+
+  if (len > blocks_needed)
+    len = blocks_needed;
+
+  /* Clean up the top bits for any mode that is not a multiple of a HWI.  */
+  if (len == blocks_needed && small_prec)
+    val[len - 1] = sext (val[len - 1], small_prec);
+
+  if (len == 1)
+    return;
+
+  top = val[len - 1];
+  if (top != 0 && top != (HOST_WIDE_INT)-1)
+    return;
+
+  /* At this point we know that the top is either 0 or -1.  Find the
+     first block that is not a copy of this.  */
+  for (i = len - 2; i >= 0; i--)
+    {
+      HOST_WIDE_INT x = val[i];
+      if (x != top)
+	{
+	  if (x >> (HOST_BITS_PER_WIDE_INT - 1) == top)
+	    {
+	      len = i + 1;
+	      return;
+	    }
+
+	  /* We need an extra block because the top bit of block i does
+	     not match the extension.  */
+	  len = i + 2;
+	  return;
+	}
+    }
+
+  /* The number is 0 or -1.  */
+  len = 1;
+}
+
+/* Copy THIS replacing the mode with MODE.  */
+
+wide_int
+wide_int::copy (enum machine_mode mode) const
+{
+  wide_int result;
+  int prec = GET_MODE_PRECISION (mode);
+  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  int blocks_needed = BLOCKS_NEEDED (prec);
+  int i;
+
+  result.mode = mode;
+  result.len = blocks_needed < len ? blocks_needed : len;
+  for (i = 0; i < result.len; i++)
+    result.val[i] = val[i];
+
+  if (small_prec && blocks_needed == len)
+    result.val[blocks_needed-1]
+      = sext (result.val[blocks_needed-1], small_prec);
+  return result;
+}
+
+/*
+ * public printing routines.
+ */
+
+/* Try to print the signed self in decimal to BUF if the number fits
+   in a HWI.  Otherwise print in hex.  */
+
+void 
+wide_int::print_decs (char *buf) const
+{
+  if ((precision () <= HOST_BITS_PER_WIDE_INT)
+      || (len == 1 && !neg_p ()))
+    sprintf (buf, HOST_WIDE_INT_PRINT_DEC, val[0]);
+  else
+    print_hex (buf);
+}
+
+/* Try to print the signed self in decimal to FILE if the number fits
+   in a HWI.  Otherwise print in hex.  */
+
+void 
+wide_int::print_decs (FILE *file) const
+{
+  char buf[(2 * MAX_BITSIZE_MODE_ANY_INT / BITS_PER_UNIT) + 4];
+  print_decs (buf);
+  fputs (buf, file);
+}
+
+/* Try to print the unsigned self in decimal to BUF if the number fits
+   in a HWI.  Otherwise print in hex.  */
+
+void 
+wide_int::print_decu (char *buf) const
+{
+  if ((precision () <= HOST_BITS_PER_WIDE_INT)
+      || (len == 1 && !neg_p ()))
+    sprintf (buf, HOST_WIDE_INT_PRINT_UNSIGNED, val[0]);
+  else
+    print_hex (buf);
+}
+
+/* Try to print the unsigned self in decimal to FILE if the number fits
+   in a HWI.  Otherwise print in hex.  */
+
+void 
+wide_int::print_decu (FILE *file) const
+{
+  char buf[(2 * MAX_BITSIZE_MODE_ANY_INT / BITS_PER_UNIT) + 4];
+  print_decu (buf);
+  fputs (buf, file);
+}
+
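+/* Print THIS in hex to BUF.  Negative values are printed as their
+   full two's complement pattern, padded with leading f's, since no
+   '-' sign is printed in hex.  */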
+void 
+wide_int::print_hex (char *buf) const
+{
+  int i = len;
+
+  if (zero_p ())
+    sprintf (buf, "0x");
+  else
+    {
+      if (neg_p ())
+	{
+	  int j;
+	  /* If the number is negative, we may need to pad value with
+	     0xFFF...  because the leading elements may be missing and
+	     we do not print a '-' with hex.  */
+	  for (j = BLOCKS_NEEDED (precision ()); j > i; j--)
+	    buf += sprintf (buf, HOST_WIDE_INT_PRINT_PADDED_HEX, (HOST_WIDE_INT) -1);
+	    
+	}
+      else
+	buf += sprintf (buf, HOST_WIDE_INT_PRINT_HEX, val [--i]);
+      while (-- i >= 0)
+	buf += sprintf (buf, HOST_WIDE_INT_PRINT_PADDED_HEX, val [i]);
+    }
+}
+
+/* Print one big hex number to FILE.  Note that some assemblers may not
+   accept this for large modes.  */
+void 
+wide_int::print_hex (FILE *file) const
+{
+  char buf[(2 * MAX_BITSIZE_MODE_ANY_INT / BITS_PER_UNIT) + 4];
+  print_hex (buf);
+  fputs (buf, file);
+}
+
+/*
+ * Comparisons.  Note that only equality is an operator.  The other
+ * comparisons cannot be operators since they are inherently signed or
+ * unsigned and C++ has no such operators.
+ */
+
+/* Return true if THIS == OP1.  */
+
+bool
+wide_int::operator == (const wide_int &op1) const
+{
+  int l0 = len - 1;
+  int l1 = op1.len - 1;
+  int prec = precision ();
+  bool result;
+
+  gcc_assert (mode == op1.mode);
+
+  if (this == &op1)
+    {
+      result = true;
+      goto ex;
+    }
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    {
+      unsigned HOST_WIDE_INT mask = ((HOST_WIDE_INT)1 << prec) - 1;
+      result = (val[0] & mask) == (op1.val[0] & mask);
+      goto ex;
+    }
+
+  while (l0 > l1)
+    if (val[l0--] != op1.sign_mask ())
+      {
+	result = false;
+	goto ex;
+      }
+
+  while (l1 > l0)
+    if (op1.val[l1--] != sign_mask ())
+      {
+	result = false;
+	goto ex;
+      }
+
+  while (l0 >= 0)
+    if (val[l0--] != op1.val[l1--])
+      {
+	result = false;
+	goto ex;
+      }
+
+  result = true;
+
+ ex:
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vww ("operator ==", result, *this, op1);
+#endif
+
+  return result;
+}
+
+/* Return true if THIS > OP1 using signed comparisons.  */
+
+bool
+wide_int::gts_p (const HOST_WIDE_INT op1) const
+{
+  int prec = precision ();
+  bool result;
+
+  if (prec <= HOST_BITS_PER_WIDE_INT || len == 1)
+    {
+      /* The values are already logically sign extended.  */
+      result = val[0] > wide_int::sext (op1, prec);
+      goto ex;
+    }
+  
+  result = !neg_p ();
+
+ ex:
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwh ("wide_int::gts_p", result, *this, op1);
+#endif
+
+  return result;
+}
+
+/* Return true if THIS > OP1 using unsigned comparisons.  */
+
+bool
+wide_int::gtu_p (const unsigned HOST_WIDE_INT op1) const
+{
+  unsigned HOST_WIDE_INT x0;
+  unsigned HOST_WIDE_INT x1;
+  int prec = precision ();
+  bool result;
+
+  if (prec < HOST_BITS_PER_WIDE_INT || len == 1)
+    {
+      x0 = zext (val[0], prec);
+      x1 = zext (op1, prec);
+
+      result = x0 > x1;
+    }
+  else
+    result = true;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwh ("wide_int::gtu_p", result, *this, op1);
+#endif
+
+  return result;
+}
+
+/* Return true if THIS < OP1 using signed comparisons.  */
+
+bool
+wide_int::lts_p (const HOST_WIDE_INT op1) const
+{
+  int prec = precision ();
+  bool result;
+
+  if (prec <= HOST_BITS_PER_WIDE_INT || len == 1)
+    {
+      /* The values are already logically sign extended.  */
+      result = val[0] < wide_int::sext (op1, prec);
+      goto ex;
+    }
+  
+  result = neg_p ();
+
+ ex:
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwh ("wide_int::lts_p", result, *this, op1);
+#endif
+
+  return result;
+}
+
+/* Return true if THIS < OP1 using signed comparisons.  */
+
+bool
+wide_int::lts_p (const wide_int &op1) const
+{
+  int l0 = len - 1;
+  int l1 = op1.len - 1;
+  int prec = precision ();
+  bool result;
+
+  gcc_assert (mode == op1.mode);
+
+  if (this == &op1)
+    {
+      result = false;
+      goto ex;
+    }
+
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      /* The values are already logically sign extended.  */
+      result = val[0] < op1.val[0];
+      goto ex;
+    }
+
+  /* If the signs differ, the negative operand is the smaller.  */
+  if (sign_mask () != op1.sign_mask ())
+    {
+      result = sign_mask () < op1.sign_mask ();
+      goto ex;
+    }
+
+  /* The signs are the same; compare the words as unsigned, most
+     significant first.  The first pair that differs decides.  */
+  while (l0 > l1)
+    {
+      if (val[l0] != op1.sign_mask ())
+	{
+	  result = ((unsigned HOST_WIDE_INT) val[l0]
+		    < (unsigned HOST_WIDE_INT) op1.sign_mask ());
+	  goto ex;
+	}
+      l0--;
+    }
+
+  while (l1 > l0)
+    {
+      if (sign_mask () != op1.val[l1])
+	{
+	  result = ((unsigned HOST_WIDE_INT) sign_mask ()
+		    < (unsigned HOST_WIDE_INT) op1.val[l1]);
+	  goto ex;
+	}
+      l1--;
+    }
+
+  while (l0 >= 0)
+    {
+      if (val[l0] != op1.val[l1])
+	{
+	  result = ((unsigned HOST_WIDE_INT) val[l0]
+		    < (unsigned HOST_WIDE_INT) op1.val[l1]);
+	  goto ex;
+	}
+      l0--;
+      l1--;
+    }
+
+  result = false;
+
+ ex:
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vww ("wide_int::lts_p", result, *this, op1);
+#endif
+
+  return result;
+}
+
+/* Return true if THIS < OP1 using unsigned comparisons.  */
+
+bool
+wide_int::ltu_p (const unsigned HOST_WIDE_INT op1) const
+{
+  unsigned HOST_WIDE_INT x0;
+  unsigned HOST_WIDE_INT x1;
+  int prec = precision ();
+  bool result;
+
+  if (prec < HOST_BITS_PER_WIDE_INT || len == 1)
+    {
+      x0 = zext (val[0], prec);
+      x1 = zext (op1, prec);
+
+      result = x0 < x1;
+    }
+  else
+    result = false;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vwh ("wide_int::ltu_p", result, *this, op1);
+#endif
+
+  return result;
+}
+
+/* Return true if THIS < OP1 using unsigned comparisons.  */
+
+bool
+wide_int::ltu_p (const wide_int &op1) const
+{
+  unsigned HOST_WIDE_INT x0;
+  unsigned HOST_WIDE_INT x1;
+  int l0 = len - 1;
+  int l1 = op1.len - 1;
+  int prec = precision ();
+  bool result;
+
+  gcc_assert (mode == op1.mode);
+
+  if (this == &op1)
+    {
+      result = false;
+      goto ex;
+    }
+
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    {
+      x0 = zext (val[0], prec);
+      x1 = zext (op1.val[0], prec);
+
+      result = x0 < x1;
+      goto ex;
+    }
+
+  while (l0 > l1)
+    {
+      x0 = val[l0--];
+      x1 = op1.sign_mask ();
+      if (x0 != x1)
+	{
+	  result = x0 < x1;
+	  goto ex;
+	}
+    }
+
+  while (l1 > l0)
+    {
+      x0 = sign_mask ();
+      x1 = op1.val[l1--];
+      if (x0 != x1)
+	{
+	  result = x0 < x1;
+	  goto ex;
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      x0 = val[l0--];
+      x1 = op1.val[l1--];
+      if (x0 != x1)
+	{
+	  result = x0 < x1;
+	  goto ex;
+	}
+    }
+
+  result = false;
+
+ ex:
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vww ("wide_int::ltu_p", result, *this, op1);
+#endif
+
+  return result;
+}
+
+/* Return true if THIS has the sign bit set to 1 and all other bits are
+   zero.  */
+
+bool
+wide_int::only_sign_bit_p (int prec) const
+{
+  int i;
+  HOST_WIDE_INT x;
+  int small_prec;
+  bool result;
+
+  if (BLOCKS_NEEDED (prec) != len)
+    {
+      result = false;
+      goto ex;
+    }
+
+  for (i = 0; i < len - 1; i++)
+    if (val[i] != 0)
+      {
+	result = false;
+	goto ex;
+      }
+
+  x = val[len - 1];
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec)
+    x = x << (HOST_BITS_PER_WIDE_INT - small_prec);
+
+  result = x == ((HOST_WIDE_INT)1) << (HOST_BITS_PER_WIDE_INT - 1);
+
+ ex:
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int::only_sign_bit_p", result, *this);
+#endif
+
+  return result;
+}
+
+bool
+wide_int::only_sign_bit_p () const
+{
+  int prec = precision ();
+  return only_sign_bit_p (prec);
+}
+
+/* Returns true if THIS fits into the range of TYPE.  The signedness
+   of THIS is assumed to be the same as the signedness of TYPE.  */
+
+bool
+wide_int::fits_to_tree_p (const_tree type) const
+{
+  int type_prec = TYPE_PRECISION (type);
+
+  if (TYPE_UNSIGNED (type))
+    return fits_u_p (type_prec);
+  else
+    return fits_s_p (type_prec);
+}
+
+/* Returns true if THIS fits in the signed range of precision PREC.  */
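+/* For example, with PREC of 8 the value 127 fits (it equals its own
+   sign extension from bit 8), but 128 does not, since sign extending
+   128 from bit 8 yields -128.  */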
+
+bool
+wide_int::fits_s_p (int prec) const
+{
+  if (len < BLOCKS_NEEDED (prec))
+    return true;
+
+  if (precision () <= prec)
+    return true;
+
+  return *this == sext (prec);
+}
+
+
+/* Returns true if THIS fits in the unsigned range of precision PREC.  */
+
+bool
+wide_int::fits_u_p (int prec) const
+{
+  if (len < BLOCKS_NEEDED (prec))
+    return true;
+
+  if (precision () <= prec)
+    return true;
+
+  return *this == zext (prec);
+}
+
+/*
+ * Extension.
+ */
+
+/* Sign extend THIS starting at OFFSET within the precision of the mode.  */
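+/* For example, sign extending the SImode value 0x8000 from OFFSET 16
+   copies bit 15 through all of the higher bits of the precision,
+   yielding -32768.  */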
+
+wide_int
+wide_int::sext (int offset) const
+{
+  wide_int result;
+  int off;
+  int prec = precision ();
+
+  gcc_assert (prec >= offset);
+
+  result.mode = mode;
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    {
+      result.val[0] = sext (val[0], offset);
+      result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::sext", result, *this, offset);
+#endif
+
+      return result;
+    }
+
+  if (prec == offset)
+    {
+      result = copy (mode);
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::sext", result, *this, offset);
+#endif
+      return result;
+    }
+
+  result = decompress (offset, mode);
+
+  /* Now we can do the real sign extension.  */
+  off = offset & (HOST_BITS_PER_WIDE_INT - 1);
+  if (off)
+    {
+      int block = BLOCK_OF (offset);
+      result.val[block] = sext (result.val[block], off);
+      result.len = block + 1;
+    }
+  /* We never need an extra element for sign extended values.  */
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::sext", result, *this, offset);
+#endif
+
+  return result;
+}
+
+/* Sign extend THIS to mode M.  */
+
+wide_int
+wide_int::sext (enum machine_mode m) const
+{
+  if (GET_MODE_PRECISION (m) >= precision ())
+    /* Assuming that M is at least as wide as the mode of THIS, the
+       compressed values of THIS and of the result are the same; the
+       only thing that differs is the mode of the result.  */
+    return copy (m);
+
+  return truncate (m);
+}
+
+/* Zero extend THIS starting at OFFSET within the precision of the mode.  */
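+/* For example, zero extending the TImode value -1 from OFFSET 64
+   must produce the positive value 2^64 - 1.  Since the low block is
+   then all ones, an extra zero block is appended so that the
+   compressed form still reads as sign extended.  */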
+
+wide_int
+wide_int::zext (int offset) const
+{
+  wide_int result;
+  int off;
+  int block;
+  int prec = precision ();
+
+  gcc_assert (prec >= offset);
+
+  result.mode = mode;
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    {
+      result.val[0] = zext (val[0], offset);
+      result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::zext", result, *this, offset);
+#endif
+
+      return result;
+    }
+
+  if (prec == offset)
+    {
+      result = copy (mode);
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::zext", result, *this, offset);
+#endif
+      return result;
+    }
+
+  result = decompress (offset, mode);
+
+  /* Now we can do the real zero extension.  */
+  off = offset & (HOST_BITS_PER_WIDE_INT - 1);
+  block = BLOCK_OF (offset);
+  if (off)
+    {
+      result.val[block] = zext (result.val[block], off);
+      result.len = block + 1;
+    }
+  else
+    /* See if we need an extra zero element to satisfy the compression
+       rule.  */
+    if (result.val[block - 1] < 0 && offset < prec)
+      {
+	result.val[block] = 0;
+	result.len += 1;
+      }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::zext", result, *this, offset);
+#endif
+
+  return result;
+}
+
+/* Zero extend THIS to mode M.  */
+
+wide_int
+wide_int::zext (enum machine_mode m) const
+{
+  wide_int result;
+  int off;
+  int block;
+  int op0_prec = precision ();
+  int res_prec = GET_MODE_PRECISION (m);
+
+  gcc_assert (res_prec >= op0_prec);
+
+  result.mode = m;
+
+  if (res_prec < HOST_BITS_PER_WIDE_INT)
+    {
+      result.val[0] = zext (val[0], op0_prec);
+      result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::zext", result, *this, res_prec);
+#endif
+      return result;
+    }
+
+  result = decompress (op0_prec, m);
+
+  /* Now we can do the real zero extension.  */
+  off = op0_prec & (HOST_BITS_PER_WIDE_INT - 1);
+  block = BLOCK_OF (op0_prec);
+  if (off)
+    {
+      result.val[block] = zext (result.val[block], off);
+      result.len = block + 1;
+    }
+  else
+    /* See if we need an extra zero element to satisfy the compression
+       rule.  */
+    if (result.val[block - 1] < 0 && op0_prec < res_prec)
+      {
+	result.val[block] = 0;
+	result.len += 1;
+      }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::zext", result, *this, res_prec);
+#endif
+
+  return result;
+}
+
+/*
+ * Masking, inserting, shifting, rotating.
+ */
+
+/* Return a value with a one bit inserted in THIS at BITPOS.  */
+
+wide_int
+wide_int::set_bit (int bitpos) const
+{
+  wide_int result;
+  int i, j;
+
+  if (bitpos >= precision ())
+    result = copy (mode);
+  else
+    {
+      result = decompress (bitpos, mode);
+      j = bitpos / HOST_BITS_PER_WIDE_INT;
+      i = bitpos & (HOST_BITS_PER_WIDE_INT - 1);
+      result.val[j] |= ((HOST_WIDE_INT)1) << i;
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::set_bit", result, *this, bitpos);
+#endif
+
+  return result;
+}
+
+/* Insert a 1 bit into 0 at BITPOS, producing a number with mode MODE.  */
+
+wide_int
+wide_int::set_bit_in_zero (int bitpos, enum machine_mode mode)
+{
+  wide_int result;
+  int blocks_needed = BLOCKS_NEEDED (bitpos + 1);
+  int i, j;
+
+  result.mode = mode;
+  if (bitpos >= GET_MODE_PRECISION (mode))
+    {
+      result.len = 1;
+      result.val[0] = 0;
+    }
+  else
+    {
+      result.len = blocks_needed;
+      for (i = 0; i < blocks_needed; i++)
+	result.val[i] = 0;
+      
+      j = bitpos / HOST_BITS_PER_WIDE_INT;
+      i = bitpos & (HOST_BITS_PER_WIDE_INT - 1);
+      result.val[j] |= ((HOST_WIDE_INT)1) << i;
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wv ("wide_int::set_bit_in_zero", result, bitpos);
+#endif
+
+  return result;
+}
+
+/* Insert WIDTH bits from OP0 into THIS starting at START.  */
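+/* For example, with START of 8 and WIDTH of 8, bits 8 through 15 of
+   THIS are replaced by the low 8 bits of OP0 and all other bits of
+   THIS are left unchanged.  */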
+
+wide_int
+wide_int::insert (const wide_int &op0, int start, int width) const
+{
+  wide_int result;
+  wide_int mask;
+  wide_int tmp;
+  int prec = precision ();
+
+  if (start + width >= prec) 
+    {
+      width = prec - start;
+      if (width < 0)
+	{
+	  width = 0;
+	  start = prec;
+	}
+    }
+
+  mask = shifted_mask (start, width, false, mode);
+  tmp = op0.lshift (start, NONE, mode);
+  result = tmp & mask;
+
+  tmp = and_not (mask);
+  result = result | tmp;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwwvv ("wide_int::insert", result, *this, op0, start, width);
+#endif
+
+  return result;
+}
+
+/* bswap THIS.  */
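+/* For example, byte swapping the SImode value 0x11223344 yields
+   0x44332211.  */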
+
+wide_int
+wide_int::bswap () const
+{
+  wide_int result;
+  int i, s;
+  int end;
+  int prec = precision ();
+  int len = BLOCKS_NEEDED (prec);
+  HOST_WIDE_INT mask = sign_mask ();
+
+  /* This is not a well defined operation if the precision is not a
+     multiple of 8.  */
+  gcc_assert ((prec & 0x7) == 0);
+
+  result.mode = mode;
+  result.len = len;
+
+  for (i = 0; i < len; i++)
+    result.val[i] = mask;
+
+  /* Only swap the bytes that are not the padding.  */
+  if ((prec & (HOST_BITS_PER_WIDE_INT - 1))
+      && (this->len == len))
+    end = prec;
+  else
+    end = this->len * HOST_BITS_PER_WIDE_INT;
+
+  for (s = 0; s < end; s += 8)
+    {
+      unsigned int d = prec - s - 8;
+      unsigned HOST_WIDE_INT byte;
+
+      int block = s / HOST_BITS_PER_WIDE_INT;
+      int offset = s & (HOST_BITS_PER_WIDE_INT - 1);
+
+      byte = (val[block] >> offset) & 0xff;
+
+      block = d / HOST_BITS_PER_WIDE_INT;
+      offset = d & (HOST_BITS_PER_WIDE_INT - 1);
+
+      /* Clear the destination byte before ORing in the new value.  */
+      result.val[block] &= ~((unsigned HOST_WIDE_INT) 0xff << offset);
+      result.val[block] |= byte << offset;
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_ww ("wide_int::bswap", result, *this);
+#endif
+
+  return result;
+}
+
+/* Return a result mask where the lower WIDTH bits are ones and the
+   bits above that up to the precision are zeros.  The result is
+   inverted if NEGATE is true.  */
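+/* For example, mask (4, false, SImode) is 0xf and
+   mask (4, true, SImode) is the 32-bit value 0xfffffff0.  */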
+
+wide_int
+wide_int::mask (int width, bool negate, enum machine_mode mode)
+{
+  wide_int result;
+  int i = 0;
+  int shift;
+
+  if (width == 0)
+    {
+      if (negate)
+	result = wide_int_minus_one (mode);
+      else
+	result = wide_int_zero (mode);
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wvv ("wide_int::mask", result, width, negate);
+#endif
+      return result;
+    }
+
+  result.mode = mode;
+
+  while (i < width / HOST_BITS_PER_WIDE_INT)
+    result.val[i++] = negate ? 0 : (HOST_WIDE_INT)-1;
+
+  shift = width & (HOST_BITS_PER_WIDE_INT - 1);
+  if (shift != 0)
+    {
+      HOST_WIDE_INT last = (((HOST_WIDE_INT)1) << shift) - 1;
+      result.val[i++] = negate ? ~last : last;
+    }
+  result.len = i;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wvv ("wide_int::mask", result, width, negate);
+#endif
+
+  return result;
+}
+
+/* Return a mask with WIDTH one bits starting at START; all other
+   bits within the precision are zeros.  The result is inverted if
+   NEGATE is true.  */
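+/* For example, shifted_mask (4, 8, false, SImode) is 0xff0, with
+   ones in bits 4 through 11.  */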
+
+wide_int
+wide_int::shifted_mask (int start, int width, bool negate,
+			enum machine_mode mode)
+{
+  wide_int result;
+  int i = 0;
+  int shift;
+  int end;
+  HOST_WIDE_INT block;
+  int prec = GET_MODE_PRECISION (mode);
+
+  if (start + width > prec)
+    {
+      width = prec - start;
+      if (width < 0)
+	{
+	  width = 0;
+	  start = prec;
+	}
+    }
+
+  end = start + width;
+
+  if (width == 0)
+    {
+      if (negate)
+	result = wide_int_minus_one (mode);
+      else
+	result = wide_int_zero (mode);
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wvv ("wide_int::shifted_mask", result, width, negate);
+#endif
+      return result;
+    }
+
+  result.mode = mode;
+
+  while (i < start / HOST_BITS_PER_WIDE_INT)
+    result.val[i++] = negate ? (HOST_WIDE_INT)-1 : 0;
+
+  shift = start & (HOST_BITS_PER_WIDE_INT - 1);
+  if (shift)
+    {
+      block = (((HOST_WIDE_INT)1) << shift) - 1;
+      shift = end & (HOST_BITS_PER_WIDE_INT - 1);
+      if (shift)
+	{
+	  /* case 000111000 */
+	  block = (((HOST_WIDE_INT)1) << shift) - block - 1;
+	  result.val[i++] = negate ? ~block : block;
+	  result.len = i;
+
+#ifdef DEBUG_WIDE_INT
+	  if (dump_file)
+	    debug_wvvv ("wide_int::shifted_mask", result, start,
+				  width, negate);
+#endif
+	  return result;
+	}
+      else
+	/* ...111000 */
+	result.val[i++] = negate ? block : ~block;
+    }
+
+  while (i < end / HOST_BITS_PER_WIDE_INT)
+    /* 1111111 */
+    result.val[i++] = negate ? 0 : (HOST_WIDE_INT)-1;
+
+  shift = end & (HOST_BITS_PER_WIDE_INT - 1);
+  if (shift != 0)
+    {
+      /* 000011111 */
+      block = (((HOST_WIDE_INT)1) << shift) - 1;
+      result.val[i++] = negate ? ~block : block;
+    }
+
+  result.len = i;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wvvv ("wide_int::shifted_mask", result, start, width,
+			  negate);
+#endif
+
+  return result;
+}
+
+
+/*
+ * logical operations.
+ */
+
+/* Return THIS & OP1.  */
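+/* The operands may be compressed to different lengths; the implicit
+   high blocks of the shorter one are copies of its sign block.  For
+   example, ANDing a 128-bit value with the constant 0xff (stored in
+   a single block) needs only one explicit result block, because the
+   implicit high block of the mask is zero.  */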
+
+wide_int
+wide_int::operator & (const wide_int &op1) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1.len - 1;
+  bool need_canon = true;
+
+  gcc_assert (mode == op1.mode);
+
+  result.len = len > op1.len ? len : op1.len;
+  result.mode = mode;
+
+  if (l0 > l1)
+    {
+      if (op1.sign_mask () == 0)
+	{
+	  l0 = l1;
+	  result.len = l1 + 1;
+	}
+      else
+	{
+	  need_canon = false;
+	  while (l0 > l1)
+	    {
+	      result.val[l0] = val[l0];
+	      l0--;
+	    }
+	}
+    }
+  else if (l1 > l0)
+    {
+      if (sign_mask () == 0)
+	  result.len = l0 + 1;
+      else
+	{
+	  need_canon = false;
+	  while (l1 > l0)
+	    {
+	      result.val[l1] = op1.val[l1];
+	      l1--;
+	    }
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] & op1.val[l0];
+      l0--;
+    }
+
+  if (need_canon)
+    result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::operator &", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Return THIS & ~OP1.  */
+
+wide_int
+wide_int::and_not (const wide_int &op1) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1.len - 1;
+  bool need_canon = true;
+
+  gcc_assert (mode == op1.mode);
+
+  result.len = len > op1.len ? len : op1.len;
+  result.mode = mode;
+
+  if (l0 > l1)
+    {
+      if (op1.sign_mask () != 0)
+	{
+	  l0 = l1;
+	  result.len = l1 + 1;
+	}
+      else
+	{
+	  need_canon = false;
+	  while (l0 > l1)
+	    {
+	      result.val[l0] = val[l0];
+	      l0--;
+	    }
+	}
+    }
+  else if (l1 > l0)
+    {
+      if (sign_mask () == 0)
+	result.len = l0 + 1;
+      else
+	{
+	  need_canon = false;
+	  while (l1 > l0)
+	    {
+	      result.val[l1] = ~op1.val[l1];
+	      l1--;
+	    }
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] & ~op1.val[l0];
+      l0--;
+    }
+
+  if (need_canon)
+    result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::and_not", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Return THIS | OP1.  */
+
+wide_int
+wide_int::operator | (const wide_int &op1) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1.len - 1;
+  bool need_canon = true;
+
+  gcc_assert (mode == op1.mode);
+
+  result.len = len > op1.len ? len : op1.len;
+  result.mode = mode;
+
+  if (l0 > l1)
+    {
+      if (op1.sign_mask () != 0)
+	{
+	  l0 = l1;
+	  result.len = l1 + 1;
+	}
+      else
+	{
+	  need_canon = false;
+	  while (l0 > l1)
+	    {
+	      result.val[l0] = val[l0];
+	      l0--;
+	    }
+	}
+    }
+  else if (l1 > l0)
+    {
+      if (sign_mask () != 0)
+	result.len = l0 + 1;
+      else
+	{
+	  need_canon = false;
+	  while (l1 > l0)
+	    {
+	      result.val[l1] = op1.val[l1];
+	      l1--;
+	    }
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] | op1.val[l0];
+      l0--;
+    }
+
+  if (need_canon)
+    result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::operator |", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Return the bitwise complement of THIS.  */
+
+wide_int
+wide_int::operator ~ () const
+{
+  wide_int result;
+  int l0 = len - 1;
+
+  result.len = len;
+  result.mode = mode;
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = ~val[l0];
+      l0--;
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_ww ("wide_int::operator ~", result, *this);
+#endif
+  return result;
+}
+
+/* Return THIS | ~OP1.  */
+
+wide_int
+wide_int::or_not (const wide_int &op1) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1.len - 1;
+  bool need_canon = true;
+
+  gcc_assert (mode == op1.mode);
+
+  result.len = len > op1.len ? len : op1.len;
+  result.mode = mode;
+
+  if (l0 > l1)
+    {
+      if (op1.sign_mask () == 0)
+	{
+	  l0 = l1;
+	  result.len = l1 + 1;
+	}
+      else
+	{
+	  need_canon = false;
+	  while (l0 > l1)
+	    {
+	      result.val[l0] = val[l0];
+	      l0--;
+	    }
+	}
+    }
+  else if (l1 > l0)
+    {
+      if (sign_mask () != 0)
+	result.len = l0 + 1;
+      else
+	{
+	  need_canon = false;
+	  while (l1 > l0)
+	    {
+	      result.val[l1] = ~op1.val[l1];
+	      l1--;
+	    }
+	}
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] | ~op1.val[l0];
+      l0--;
+    }
+
+  if (need_canon)
+    result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::and_not", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Return the exclusive ior (xor) of THIS and OP1.  */
+
+wide_int
+wide_int::operator ^ (const wide_int &op1) const
+{
+  wide_int result;
+  int l0 = len - 1;
+  int l1 = op1.len - 1;
+
+  gcc_assert (mode == op1.mode);
+
+  result.len = len > op1.len ? len : op1.len;
+  result.mode = mode;
+
+  while (l0 > l1)
+    {
+      result.val[l0] = val[l0] ^ op1.sign_mask ();
+      l0--;
+    }
+
+  while (l1 > l0)
+    {
+      result.val[l1] = sign_mask () ^ op1.val[l1];
+      l1--;
+    }
+
+  while (l0 >= 0)
+    {
+      result.val[l0] = val[l0] ^ op1.val[l0];
+      l0--;
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::operator ^", result, *this, op1);
+#endif
+  return result;
+}
+
+/*
+ * math
+ */
+
+/* Absolute value of THIS.  */
+
+wide_int
+wide_int::abs () const
+{
+  if (sign_mask ())
+    return neg ();
+
+  wide_int result = copy (mode);
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_ww ("wide_int::abs", result, *this);
+#endif
+  return result;
+}
+
+/* Add THIS and OP1.  No overflow is detected.  */
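+/* For example, adding 1 to the TImode value 2^64 - 1 (stored as the
+   two blocks {-1, 0}) overflows the low block; the carry propagates
+   into the high block, giving {0, 1}, which is 2^64.  */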
+
+wide_int
+wide_int::operator + (const wide_int &op1) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1;
+  unsigned HOST_WIDE_INT x = 0;
+  unsigned HOST_WIDE_INT carry = 0;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  int prec = precision ();
+  unsigned int i, small_prec, stop;
+
+  gcc_assert (mode == op1.mode);
+
+  result.mode = mode;
+
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      o0 = val[0];
+      o1 = op1.val[0];
+      result.val[0] = sext (o0 + o1, prec);
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_www ("wide_int::operator +", result, *this, op1);
+#endif
+      return result;
+    }
+
+  stop = len > op1.len ? len : op1.len;
+  /* Need to do one extra block just to handle the special cases.  */
+  if (stop < (unsigned)BLOCKS_NEEDED (prec))
+    stop++;
+
+  result.len = stop;
+  mask0 = sign_mask ();
+  mask1 = op1.sign_mask ();
+  /* Add all of the explicitly defined elements.  */
+  for (i = 0; i < stop; i++)
+    {
+      o0 = i < len ? (unsigned HOST_WIDE_INT)val[i] : mask0;
+      o1 = i < op1.len ? (unsigned HOST_WIDE_INT)op1.val[i] : mask1;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      carry = x < o0;
+    }
+
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec != 0 && BLOCKS_NEEDED (prec) == result.len)
+    {
+      /* Modes with weird precisions.  */
+      i = result.len - 1;
+      result.val[i] = sext (result.val[i], small_prec);
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::operator +", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Add THIS and signed OP1.  No overflow is detected.  */
+
+wide_int
+wide_int::operator + (HOST_WIDE_INT op1) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1;
+  unsigned HOST_WIDE_INT x = 0;
+  unsigned HOST_WIDE_INT carry = 0;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  int prec = precision ();
+  unsigned int i, small_prec, stop;
+
+  result.mode = mode;
+
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      o0 = val[0];
+      result.val[0] = sext (o0 + op1, prec);
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::operator +", result, *this, op1);
+#endif
+      return result;
+    }
+
+  stop = len;
+  /* Need to do one extra block just to handle the special cases.  */
+  if (stop < (unsigned)BLOCKS_NEEDED (prec))
+    stop++;
+
+  result.len = stop;
+  mask0 = sign_mask ();
+  mask1 = op1 >> (HOST_BITS_PER_WIDE_INT - 1);
+  /* Add all of the explicitly defined elements.  */
+  for (i = 0; i < stop; i++)
+    {
+      o0 = i < len ? (unsigned HOST_WIDE_INT)val[i] : mask0;
+      o1 = i < 1 ? (unsigned HOST_WIDE_INT)op1 : mask1;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      carry = x < o0;
+    }
+
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec != 0 && BLOCKS_NEEDED (prec) == result.len)
+    {
+      /* Modes with weird precisions.  */
+      i = result.len - 1;
+      result.val[i] = sext (result.val[i], small_prec);
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::operator +", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Add THIS and unsigned OP1.  No overflow is detected.  */
+
+wide_int
+wide_int::operator + (unsigned HOST_WIDE_INT op1) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1;
+  unsigned HOST_WIDE_INT x = 0;
+  unsigned HOST_WIDE_INT carry = 0;
+  unsigned HOST_WIDE_INT mask0;
+  int prec = precision ();
+  unsigned int i, small_prec, stop;
+
+  result.mode = mode;
+
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      o0 = val[0];
+      result.val[0] = sext (o0 + op1, prec);
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::operator +", result, *this, op1);
+#endif
+      return result;
+    }
+
+  stop = len;
+  /* Need to do one extra block just to handle the special cases.  */
+  if (stop < (unsigned)BLOCKS_NEEDED (prec))
+    stop++;
+
+  result.len = stop;
+  mask0 = sign_mask ();
+  /* Add all of the explicitly defined elements.  */
+  for (i = 0; i < stop; i++)
+    {
+      o0 = i < len ? (unsigned HOST_WIDE_INT)val[i] : mask0;
+      o1 = i < 1 ? (unsigned HOST_WIDE_INT)op1 : 0;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      carry = x < o0;
+    }
+
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec != 0 && BLOCKS_NEEDED (prec) == result.len)
+    {
+      /* Modes with weird precisions.  */
+      i = result.len - 1;
+      result.val[i] = sext (result.val[i], small_prec);
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::operator +", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Add OP0 and OP1 with overflow checking.  If the result overflows
+   within the precision, set OVERFLOW.  OVERFLOW is assumed to be
+   sticky so it should be initialized.  SGN controls if signed or
+   unsigned overflow is checked.  */
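+/* For example, with 8 bit precision and SGN of wide_int::SIGNED,
+   adding 0x7f and 0x01 produces 0x80; since the sign extension of
+   0x80 from bit 8 differs from 0x80 itself, OVERFLOW is set.  */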
+
+wide_int
+wide_int::add_overflow (const wide_int *op0, const wide_int *op1,
+			wide_int::Op sgn, bool *overflow)
+{
+  wide_int result;
+  const wide_int *tmp;
+  unsigned HOST_WIDE_INT o0 = 0;
+  unsigned HOST_WIDE_INT o1 = 0;
+  unsigned HOST_WIDE_INT x = 0;
+  unsigned HOST_WIDE_INT carry = 0;
+  int prec = op0->precision ();
+  int i, small_prec;
+
+  gcc_assert (op0->get_mode () == op1->get_mode ());
+
+  result.set_mode (op0->get_mode ());
+
+  /* Put the longer one first.  */
+  if (op0->get_len () > op1->get_len ())
+    {
+      tmp = op0;
+      op0 = op1;
+      op1 = tmp;
+    }
+
+  /* Add all of the explicitly defined elements.  */
+  for (i = 0; i < op0->get_len (); i++)
+    {
+      o0 = op0->elt (i);
+      o1 = op1->elt (i);
+      x = o0 + o1 + carry;
+      result.elt_ref (i) = x;
+      carry = x < o0;
+    }
+
+  /* Uncompress the rest.  After the swap above, op0 is the shorter
+     operand, so its missing high blocks are copies of its sign
+     mask.  */
+  if (op0->get_len () < op1->get_len ())
+    {
+      unsigned HOST_WIDE_INT mask = op0->sign_mask ();
+      for (i = op0->get_len (); i < op1->get_len (); i++)
+	{
+	  o0 = mask;
+	  o1 = op1->elt (i);
+	  x = o0 + o1 + carry;
+	  result.elt_ref (i) = x;
+	  carry = x < o0;
+	}
+    }
+
+  /* The swap above made op1 the longer operand.  */
+  result.set_len (op1->get_len ());
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec == 0)
+    {
+      if (result.get_len () * HOST_BITS_PER_WIDE_INT < prec)
+	{
+	  /* If the carry is 1, then we need another word.  If the carry
+	     is 0, we only need another word if the top bit is 1.  */
+	  if (carry == 1
+	      || (x >> (HOST_BITS_PER_WIDE_INT - 1)) == 1)
+	    {
+	      result.elt_ref (result.get_len ()) = carry;
+	      result.set_len (result.get_len () + 1);
+	    }
+	  /* Check for signed overflow.  */
+	  if (sgn == wide_int::SIGNED)
+	    {
+	      if (((x ^ o0) & (x ^ o1)) >> (HOST_BITS_PER_WIDE_INT - 1))
+		*overflow = true;
+	    }
+	  else if (carry)
+	    {
+	      if ((~o0) < o1)
+		*overflow = true;
+	    }
+	  else
+	    {
+	      if ((~o0) <= o1)
+		*overflow = true;
+	    }
+	}
+    }
+  else
+    {
+      /* Overflow in this case is easy since we can see bits beyond
+	 the precision.  If the value computed is not the sign
+	 extended value, then we have overflow.  */
+      unsigned HOST_WIDE_INT y;
+
+      if (sgn == wide_int::UNSIGNED)
+	{
+	  /* The caveat for unsigned is to get rid of the bits above
+	     the precision before doing the addition.  To check the
+	     overflow, clear these bits and then redo the last
+	     addition.  Then the rest of the code just works.  */
+	  o0 = wide_int::zext (o0, small_prec);
+	  o1 = wide_int::zext (o1, small_prec);
+	  x = o0 + o1 + carry;
+	}
+      /* Short integers and modes with weird precisions.  */
+      y = wide_int::sext (x, small_prec);
+      result.set_len (op1->get_len ());
+      if (BLOCKS_NEEDED (prec) == result.get_len () && x != y)
+	*overflow = true;
+      /* Then put the sign extended form back because that is the
+	 canonical form.  */
+      result.elt_ref (result.get_len () - 1) = y;
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::add_overflow", result, *op0, *op1);
+#endif
+  return result;
+}
+
+/* Add THIS and X.  If overflow occurs, set OVERFLOW.  */
+
+wide_int
+wide_int::add (const wide_int &x, Op sgn, bool *overflow) const
+{
+  return add_overflow (this, &x, sgn, overflow);
+}
+
+/* Count leading zeros of THIS, returning the result as a wide_int of
+   mode M.  */
+
+wide_int
+wide_int::clz (enum machine_mode m) const
+{
+  return wide_int::from_shwi (clz (), m);
+}
+
+/* Count leading zeros of THIS.  */
+
+HOST_WIDE_INT
+wide_int::clz () const
+{
+  int i;
+  int start;
+  int count;
+  HOST_WIDE_INT v;
+  int prec = precision ();
+  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+
+  if (zero_p ())
+    {
+      /* Even if the value at zero is undefined, we have to come up
+	 with some replacement.  Seems good enough.  */
+      if (!CLZ_DEFINED_VALUE_AT_ZERO (mode, count))
+	count = prec;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_vw ("wide_int::clz", count, *this);
+#endif
+      return count;
+    }
+
+  /* The high order block is special if it is the last block and the
+     precision is not an even multiple of HOST_BITS_PER_WIDE_INT.  We
+     have to clear out any ones above the precision before doing clz
+     on this block.  */
+  if (BLOCKS_NEEDED (prec) == len && small_prec)
+    {
+      v = zext (val[len - 1], small_prec);
+      count = clz_hwi (v) - (HOST_BITS_PER_WIDE_INT - small_prec);
+      start = len - 2;
+      if (v != 0)
+	{
+#ifdef DEBUG_WIDE_INT
+	  if (dump_file)
+	    debug_vw ("wide_int::clz", count, *this);
+#endif
+	  return count;
+	}
+    }
+  else
+    {
+      /* The implicit blocks above LEN - 1 are copies of the sign bit
+	 and so only contribute leading zeros if the value is
+	 nonnegative.  */
+      count = (sign_mask () ? 0
+	       : HOST_BITS_PER_WIDE_INT * (BLOCKS_NEEDED (prec) - len));
+      start = len - 1;
+    }
+
+  for (i = start; i >= 0; i--)
+    {
+      v = elt (i);
+      count += clz_hwi (v);
+      if (v != 0)
+	break;
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int::clz", count, *this);
+#endif
+  return count;
+}
+
+wide_int
+wide_int::clrsb (enum machine_mode m) const
+{
+  return wide_int::from_shwi (clrsb (), m);
+}
+
+/* Count the number of redundant leading bits of THIS.  Return result
+   as a HOST_WIDE_INT.  There is a wrapper to convert this into a
+   wide_int.  */
+
+HOST_WIDE_INT
+wide_int::clrsb () const
+{
+  if (neg_p ())
+    return operator ~ ().clz () - 1;
+
+  return clz () - 1;
+}
+
+wide_int
+wide_int::ctz (enum machine_mode mode) const
+{
+  return wide_int::from_shwi (ctz (), mode);
+}
+
+/* Count trailing zeros of THIS.  Return result as a HOST_WIDE_INT.
+   There is a wrapper to convert this into a wide_int.  */
+
+HOST_WIDE_INT
+wide_int::ctz () const
+{
+  int i;
+  int count = 0;
+  HOST_WIDE_INT v;
+  int prec = precision ();
+  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  int end;
+  bool more_to_do;
+
+  if (zero_p ())
+    {
+      /* Even if the value at zero is undefined, we have to come up
+	 with some replacement.  Seems good enough.  */
+      if (!CTZ_DEFINED_VALUE_AT_ZERO (mode, count))
+	count = prec;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_vw ("wide_int::ctz", count, *this);
+#endif
+      return count;
+    }
+
+  /* The high order block is special if it is the last block and the
+     precision is not an even multiple of HOST_BITS_PER_WIDE_INT.  We
+     have to clear out any ones above the precision before counting
+     the trailing zeros of this block, so it is handled last.  */
+  if (BLOCKS_NEEDED (prec) == len && small_prec)
+    {
+      end = len - 1;
+      more_to_do = true;
+    }
+  else
+    {
+      end = len;
+      more_to_do = false;
+    }
+
+  for (i = 0; i < end; i++)
+    {
+      v = val[i];
+      count += ctz_hwi (v);
+      if (v != 0)
+	{
+#ifdef DEBUG_WIDE_INT
+	  if (dump_file)
+	    debug_vw ("wide_int::ctz", count, *this);
+#endif
+	  return count;
+	}
+    }
+
+  if (more_to_do)
+    {
+      v = zext (val[len - 1], small_prec);
+      count += ctz_hwi (v);
+      /* The top word was all zeros so we have to cut it back to prec,
+	 because we are counting some of the zeros above the
+	 interesting part.  */
+      if (count > prec)
+	count = prec;
+    }
+  else
+    /* Skip over the blocks that are not represented.  They must be
+       all zeros at this point.  */
+    count = prec;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int::ctz", count, *this);
+#endif
+  return count;
+}
+
+/* Compute the ffs (find first set bit) of THIS.  */
+
+wide_int
+wide_int::ffs () const
+{
+  int prec = precision ();
+  HOST_WIDE_INT count = ctz ();
+  if (count == prec)
+    count = 0;
+  else
+    count += 1;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int::ffs", count, *this);
+#endif
+  return wide_int::from_shwi (count, word_mode);
+}
+
+/* Subroutines of the multiplication and division operations.  Unpack
+   the first IN_LEN HOST_WIDE_INTs in INPUT into 2 * IN_LEN
+   HOST_HALF_WIDE_INTs of RESULT.  The rest of RESULT is filled by
+   uncompressing the top bit of INPUT[IN_LEN - 1].  */
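+/* For example, with a 64 bit HOST_WIDE_INT and 32 bit halves,
+   unpacking the single input block 0xffffffff00000001 with an
+   OUT_LEN of 4 produces {0x00000001, 0xffffffff, 0xffffffff,
+   0xffffffff}; the last two halves are the smeared sign bit.  */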
+
+static void
+wi_unpack (unsigned HOST_HALF_WIDE_INT *result, 
+	   const unsigned HOST_WIDE_INT *input,
+	   int in_len, int out_len)
+{
+  int i;
+  int j = 0;
+  HOST_WIDE_INT mask;
+
+  for (i = 0; i < in_len; i++)
+    {
+      result[j++] = input[i];
+      result[j++] = input[i] >> HOST_BITS_PER_HALF_WIDE_INT;
+    }
+  mask = ((HOST_WIDE_INT)input[in_len - 1]) >> (HOST_BITS_PER_WIDE_INT - 1);
+  mask &= HALF_INT_MASK;
+
+  /* Smear the sign bit.  */
+  while (j < out_len)
+    result[j++] = mask;
+}
+
+/* The inverse of wi_unpack.  */
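+/* IN_LEN is assumed to be even; each pair of half-wide input digits
+   is recombined into one HOST_WIDE_INT, so that {0x00000001,
+   0xffffffff} packs back into 0xffffffff00000001.  */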
+
+static void
+wi_pack (unsigned HOST_WIDE_INT *result, 
+	 const unsigned HOST_HALF_WIDE_INT *input, 
+	 int in_len)
+{
+  int i = 0;
+  int j = 0;
+
+  while (i < in_len)
+    {
+      result[j++] = (unsigned HOST_WIDE_INT)input[i] 
+	| ((unsigned HOST_WIDE_INT)input[i + 1] << HOST_BITS_PER_HALF_WIDE_INT);
+      i += 2;
+    }
+}
+
+/* Return an integer that is the exact log2 of THIS.  */
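+/* For example, the exact log2 of 0x100 is 8, while the exact log2 of
+   0x101 is -1 because 0x101 is not a power of 2.  */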
+
+HOST_WIDE_INT
+wide_int::exact_log2 () const
+{
+  int prec = precision ();
+  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  HOST_WIDE_INT count;
+  HOST_WIDE_INT result;
+
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      HOST_WIDE_INT v;
+      if (small_prec)
+	v = sext (val[0], small_prec);
+      else
+	v = val[0];
+      result = ::exact_log2 (v);
+      goto ex;
+    }
+
+  count = ctz ();
+  if (clz () + count + 1 == prec)
+    {
+      result = count;
+      goto ex;
+    }
+
+  result = -1;
+
+ ex:
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int::exact_log2", result, *this);
+#endif
+  return result;
+}
+
+/* Multiply OP1 and OP2.  If HIGH is set, only the upper half of the
+   result is returned.  If FULL is set, the entire result is returned
+   in a mode that is twice the width of the inputs.  However, that
+   mode needs to exist if the value is to be usable.  Clients that use
+   FULL need to check for this.
+
+   If neither HIGH nor FULL is set, throw away the upper half after
+   the check is made to see if it overflows.  Unfortunately there is
+   no better way to check for overflow than to do this.  OVERFLOW is
+   assumed to be sticky so it should be initialized.  SGN controls
+   the signedness and is used to check overflow or if HIGH or FULL
+   is set.  */
+
+wide_int
+wide_int::mul_internal (bool high, bool full, 
+			const wide_int *op1, const wide_int *op2,
+			wide_int::Op sgn, bool *overflow, bool needs_overflow)
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1, k, t;
+  unsigned int i;
+  unsigned int j;
+  int prec = op1->precision ();
+  unsigned int blocks_needed = 2 * BLOCKS_NEEDED (prec);
+  unsigned HOST_HALF_WIDE_INT u[2 * MAX_BITSIZE_MODE_ANY_INT
+			   / HOST_BITS_PER_WIDE_INT];
+  unsigned HOST_HALF_WIDE_INT v[2 * MAX_BITSIZE_MODE_ANY_INT
+			   / HOST_BITS_PER_WIDE_INT];
+  unsigned HOST_HALF_WIDE_INT r[4 * MAX_BITSIZE_MODE_ANY_INT
+			   / HOST_BITS_PER_WIDE_INT];
+  HOST_WIDE_INT mask = ((HOST_WIDE_INT)1 << (HOST_BITS_PER_WIDE_INT / 2)) - 1;
+
+  gcc_assert (op1->get_mode () == op2->get_mode ());
+  result.set_mode (op1->get_mode ());
+
+  if (high || full || needs_overflow)
+    {
+      /* If we need to check for overflow, we can only do half wide
+	 multiplies quickly because we need to look at the top bits to
+	 check for the overflow.  */
+      if (prec <= HOST_BITS_PER_HALF_WIDE_INT)
+	{
+	  HOST_WIDE_INT t;
+	  HOST_WIDE_INT top;
+	  result.set_len (1);
+	  o0 = op1->elt (0);
+	  o1 = op2->elt (0);
+	  t = o0 * o1;
+	  /* A signed shift down of a copy of the product leaves 0 or
+	     -1 if there was no signed overflow, or 0 if there was no
+	     unsigned overflow; T itself must keep the product for the
+	     result below.  */
+	  top = t >> (HOST_BITS_PER_HALF_WIDE_INT - 1);
+	  if (needs_overflow)
+	    {
+	      if (sgn == wide_int::SIGNED)
+		{
+		  if (top != (HOST_WIDE_INT)-1 && top != 0)
+		    *overflow = true;
+		}
+	      else
+		{
+		  if (top != 0)
+		    *overflow = true;
+		}
+	    }
+	  if (full)
+	    {
+	      result.elt_ref (0) = wide_int::sext (t, prec << 1);
+	      result.set_mode (GET_MODE_2XWIDER_MODE (op1->get_mode ()));
+	    }
+	  else if (high)
+	    result.elt_ref (0) = wide_int::sext (t >> prec, prec);
+	  else
+	    result.elt_ref (0) = wide_int::sext (t, prec);
+#ifdef DEBUG_WIDE_INT
+	  if (dump_file)
+	    debug_www ("wide_int::mul_overflow", result, *op1, *op2);
+#endif
+	  return result;
+	}
+    }
+  else
+    {
+      if (prec <= HOST_BITS_PER_WIDE_INT)
+	{
+	  result.set_len (1);
+	  o0 = op1->elt (0);
+	  o1 = op2->elt (0);
+	  result.elt_ref (0) = wide_int::sext (o0 * o1, prec);
+	  
+#ifdef DEBUG_WIDE_INT
+	  if (dump_file)
+	    debug_www ("wide_int::mul_overflow", result, *op1, *op2);
+#endif
+	  return result;
+	}
+    }
+
+  wi_unpack (u, &op1->uelt_ref (0), op1->get_len (), blocks_needed);
+  wi_unpack (v, &op2->uelt_ref (0), op2->get_len (), blocks_needed);
+
+  memset (r, 0, blocks_needed * 2 * HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT);
+
+  for (j = 0; j < blocks_needed; j++)
+    {
+      k = 0;
+      for (i = 0; i < blocks_needed; i++)
+	{
+	  t = ((unsigned HOST_WIDE_INT)u[i] * (unsigned HOST_WIDE_INT)v[j]
+	       + r[i + j] + k);
+	  r[i + j] = t & HALF_INT_MASK;
+	  k = t >> HOST_BITS_PER_HALF_WIDE_INT;
+	}
+      r[j + blocks_needed] = k;
+    }
+
+  if (needs_overflow)
+    {
+      HOST_WIDE_INT top;
+
+      /* For unsigned, overflow is true if any of the top bits are set.
+	 For signed, overflow is true if any of the top bits are not equal
+	 to the sign bit.  */
+      if (sgn == wide_int::UNSIGNED)
+	top = 0;
+      else
+	{
+	  top = r[blocks_needed - 1];
+	  top = ((top << (HOST_BITS_PER_WIDE_INT / 2))
+		 >> (HOST_BITS_PER_WIDE_INT - 1));
+	  top &= mask;
+	}
+      
+      for (i = blocks_needed; i < 2 * blocks_needed; i++)
+	if (((HOST_WIDE_INT)(r[i] & mask)) != top)
+	  *overflow = true; 
+    }
+
+  if (full)
+    {
+      /* compute [2prec] <- [prec] * [prec] */
+      wi_pack (&result.uelt_ref (0), r, blocks_needed);
+      result.set_len (blocks_needed);
+      result.set_mode (GET_MODE_2XWIDER_MODE (op1->get_mode ()));
+    }
+  else if (high)
+    {
+      /* compute [prec] <- ([prec] * [prec]) >> [prec] */
+      wi_pack (&result.uelt_ref (blocks_needed >> 1), r, blocks_needed >> 1);
+      result.set_len (blocks_needed / 2);
+    }
+  else
+    {
+      /* compute [prec] <- ([prec] * [prec]) & ((1 << [prec]) - 1) */
+      wi_pack (&result.uelt_ref (0), r, blocks_needed >> 1);
+      result.set_len (blocks_needed / 2);
+    }
+      
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwwv ("wide_int::mul_overflow", result, *op1, *op2, *overflow);
+#endif
+  return result;
+}
+
+/* Multiply THIS and OP1.  The result is the same precision as the
+   operands, so there is no reason for signed or unsigned
+   versions.  */
+
+wide_int
+wide_int::operator * (const wide_int &op1) const
+{
+  bool overflow;
+
+  return mul_internal (false, false, this, &op1, UNSIGNED, &overflow, false);
+}
+
+/* Multiply THIS and OP1.  The signedness is specified with SGN.
+   OVERFLOW is set true if the result overflows.  */
+
+wide_int 
+wide_int::mul (const wide_int &op1, Op sgn, bool *overflow) const
+{
+  return mul_internal (false, false, this, &op1, sgn, overflow, true);
+}
+
+/* Multiply THIS and OP1.  The signedness is specified with SGN.  The
+   result is twice the precision of the operands.  NOTE THAT A MODE
+   MUST EXIST ON THE TARGET THAT IS TWICE THE WIDTH OF THE MODES OF
+   THE OPERANDS.  */
+
+wide_int
+wide_int::mul_full (const wide_int &op1, Op sgn) const
+{
+  bool overflow;
+
+  return mul_internal (false, true, this, &op1, sgn, &overflow, false);
+}
+
+/* Multiply THIS and OP1 and return the high part of that result.
+   The signedness is specified with SGN.  The result has the same
+   precision and the same mode as the operands.  */
+
+wide_int
+wide_int::mul_high (const wide_int &op1, Op sgn) const
+{
+  bool overflow;
+
+  return mul_internal (true, false, this, &op1, sgn, &overflow, false);
+}
+
+/* Negate THIS.  */
+
+wide_int
+wide_int::neg () const
+{
+  wide_int z = wide_int::from_shwi (0, mode);
+  return z - *this;
+}
+
+static inline wide_int
+parity (const wide_int &x)
+{
+  return x.parity (x.get_mode ());
+}
+
+/* Compute the parity of THIS.  */
+
+wide_int
+wide_int::parity (enum machine_mode mode) const
+{
+  int count = popcount ();
+  return wide_int::from_shwi (count & 1, mode);
+}
+
+static inline int
+popcount (const wide_int &x)
+{
+  return x.popcount ();
+}
+
+wide_int
+wide_int::popcount (enum machine_mode m) const
+{
+  return wide_int::from_shwi (popcount (), m);
+}
+
+/* Compute the population count of THIS.  */
+
+int
+wide_int::popcount () const
+{
+  int i;
+  int start;
+  int count;
+  HOST_WIDE_INT v;
+  int prec = precision ();
+  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+
+  /* The high order block is special if it is the last block and the
+     precision is not an even multiple of HOST_BITS_PER_WIDE_INT.  We
+     have to clear out any ones above the precision before counting
+     the bits in this block.  */
+  if (BLOCKS_NEEDED (prec) == len && small_prec)
+    {
+      v = zext (val[len - 1], small_prec);
+      count = popcount_hwi (v);
+      start = len - 2;
+    }
+  else
+    {
+      if (sign_mask ())
+	count = HOST_BITS_PER_WIDE_INT * (BLOCKS_NEEDED (prec) - len);
+      else
+	count = 0;
+      start = len - 1;
+    }
+
+  for (i = start; i >= 0; i--)
+    {
+      v = val[i];
+      count += popcount_hwi (v);
+    }
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_vw ("wide_int::popcount", count, *this);
+#endif
+  return count;
+}
+
+/* Subtract OP1 from THIS.  No overflow is detected.  */
+
+wide_int
+wide_int::operator - (const wide_int &op1) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1;
+  unsigned HOST_WIDE_INT x = 0;
+  /* We implement subtraction as an in place negate and add.  Negation
+     is just inversion and add 1, so we can do the add of 1 by just
+     starting the carry in of the first element at 1.  */
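+  /* For example, in 8 bit arithmetic 5 - 3 becomes
+     5 + ~3 + 1 = 5 + 0xfc + 1 = 0x102, which truncates to 2.  */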
+  unsigned HOST_WIDE_INT carry = 1;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  int prec = precision ();
+  unsigned int i, small_prec, stop;
+
+  gcc_assert (mode == op1.mode);
+
+  result.mode = mode;
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      o0 = val[0];
+      o1 = op1.val[0];
+      result.val[0] = sext (o0 - o1, prec);
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_www ("wide_int::operator -", result, *this, op1);
+#endif
+      return result;
+    }
+
+  stop = len > op1.len ? len : op1.len;
+  /* Need to do one extra block just to handle the special cases.  */
+  if (stop < (unsigned)BLOCKS_NEEDED (prec))
+    stop++;
+
+  result.len = stop;
+  mask0 = sign_mask ();
+  mask1 = ~op1.sign_mask ();
+
+  /* Subtract all of the explicitly defined elements.  */
+  for (i = 0; i < stop; i++)
+    {
+      o0 = i < len ? (unsigned HOST_WIDE_INT)val[i] : mask0;
+      o1 = i < op1.len ? (unsigned HOST_WIDE_INT)~op1.val[i] : mask1;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      carry = x < o0;
+    }
+
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec != 0 && BLOCKS_NEEDED (prec) == result.len)
+    {
+      /* Modes with weird precisions.  */
+      i = result.len - 1;
+      result.val[i] = sext (result.val[i], small_prec);
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::operator -", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Subtract signed OP1 from THIS.  No overflow is detected.  */
+
+wide_int
+wide_int::operator - (HOST_WIDE_INT op1) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1;
+  unsigned HOST_WIDE_INT x = 0;
+  /* We implement subtraction as an in place negate and add.  Negation
+     is just inversion and add 1, so we can do the add of 1 by just
+     starting the carry in of the first element at 1.  */
+  unsigned HOST_WIDE_INT carry = 1;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  int prec = precision ();
+  unsigned int i, small_prec, stop;
+
+  result.mode = mode;
+
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      o0 = val[0];
+      result.val[0] = sext (o0 - op1, prec);
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::operator -", result, *this, op1);
+#endif
+      return result;
+    }
+
+  stop = len;
+  /* Need to do one extra block just to handle the special cases.  */
+  if (stop < (unsigned)BLOCKS_NEEDED (prec))
+    stop++;
+
+  result.len = stop;
+  mask0 = sign_mask ();
+  mask1 = ~(op1 >> (HOST_BITS_PER_WIDE_INT - 1));
+
+  /* Subtract all of the explicitly defined elements.  */
+  for (i = 0; i < stop; i++)
+    {
+      o0 = i < len ? (unsigned HOST_WIDE_INT)val[i] : mask0;
+      o1 = i < 1 ? (unsigned HOST_WIDE_INT)~op1 : mask1;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      carry = x < o0;
+    }
+
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec != 0 && BLOCKS_NEEDED (prec) == result.len)
+    {
+      /* Modes with weird precisions.  */
+      i = result.len - 1;
+      result.val[i] = sext (result.val[i], small_prec);
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::operator -", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Subtract unsigned OP1 from THIS.  No overflow is detected.  */
+
+wide_int
+wide_int::operator - (unsigned HOST_WIDE_INT op1) const
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0, o1;
+  unsigned HOST_WIDE_INT x = 0;
+  /* We implement subtraction as an in place negate and add.  Negation
+     is just inversion and add 1, so we can do the add of 1 by just
+     starting the carry in of the first element at 1.  */
+  unsigned HOST_WIDE_INT carry = 1;
+  unsigned HOST_WIDE_INT mask0, mask1;
+  int prec = precision ();
+  unsigned int i, small_prec, stop;
+
+  result.mode = mode;
+
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.len = 1;
+      o0 = val[0];
+      result.val[0] = sext (o0 - op1, prec);
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::operator -", result, *this, op1);
+#endif
+      return result;
+    }
+
+  stop = len;
+  /* Need to do one extra block just to handle the special cases.  */
+  if (stop < (unsigned)BLOCKS_NEEDED (prec))
+    stop++;
+
+  result.len = stop;
+  mask0 = sign_mask ();
+  mask1 = (HOST_WIDE_INT) -1;
+
+  /* Subtract all of the explicitly defined elements.  */
+  for (i = 0; i < stop; i++)
+    {
+      o0 = i < len ? (unsigned HOST_WIDE_INT)val[i] : mask0;
+      o1 = i < 1 ? (unsigned HOST_WIDE_INT)~op1 : mask1;
+      x = o0 + o1 + carry;
+      result.val[i] = x;
+      carry = x < o0;
+    }
+
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec != 0 && BLOCKS_NEEDED (prec) == result.len)
+    {
+      /* Modes with weird precisions.  */
+      i = result.len - 1;
+      result.val[i] = sext (result.val[i], small_prec);
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::operator -", result, *this, op1);
+#endif
+  return result;
+}
+
+/* Subtract OP1 from OP0 with overflow checking.  If the result
+   overflows within the precision, set OVERFLOW.  OVERFLOW is assumed
+   to be sticky so it should be initialized.  SGN controls if signed or
+   unsigned overflow is checked.  */
+
+wide_int
+wide_int::sub_overflow (const wide_int *op0, const wide_int *op1, 
+			wide_int::Op sgn, bool *overflow)
+{
+  wide_int result;
+  unsigned HOST_WIDE_INT o0 = 0;
+  unsigned HOST_WIDE_INT o1 = 0;
+  unsigned HOST_WIDE_INT x = 0;
+  /* We implement subtraction as an in place negate and add.  Negation
+     is just inversion and add 1, so we can do the add of 1 by just
+     starting the carry in of the first element at 1.  */
+  unsigned HOST_WIDE_INT carry = 1;
+  int prec = op1->precision ();
+  int i, small_prec;
+
+  gcc_assert (op0->get_mode () == op1->get_mode ());
+
+  result.set_mode (op0->get_mode ());
+
+  /* Subtract all of the explicitly defined elements.  */
+  for (i = 0; i < op0->get_len (); i++)
+    {
+      o0 = op0->elt (i);
+      o1 = ~op1->elt (i);
+      x = o0 + o1 + carry;
+      result.elt_ref (i) = x;
+      carry = x < o0;
+    }
+
+  /* Uncompress the rest.  The missing high blocks of the shorter
+     operand are copies of its sign mask.  */
+  if (op0->get_len () < op1->get_len ())
+    {
+      unsigned HOST_WIDE_INT mask = op0->sign_mask ();
+      for (i = op0->get_len (); i < op1->get_len (); i++)
+	{
+	  o0 = mask;
+	  o1 = ~op1->elt (i);
+	  x = o0 + o1 + carry;
+	  result.elt_ref (i) = x;
+	  carry = x < o0;
+	}
+    }
+  else if (op0->get_len () > op1->get_len ())
+    {
+      unsigned HOST_WIDE_INT mask = op1->sign_mask ();
+      for (i = op1->get_len (); i < op0->get_len (); i++)
+	{
+	  o0 = op0->elt (i);
+	  o1 = ~mask;
+	  x = o0 + o1 + carry;
+	  result.elt_ref (i) = x;
+	  carry = x < o0;
+	}
+    }
+
+  result.set_len (MAX (op0->get_len (), op1->get_len ()));
+  small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec == 0)
+    {
+      if (result.get_len () * HOST_BITS_PER_WIDE_INT < prec)
+	{
+	  /* If the carry is 1, then we need another word.  If the carry
+	     is 0, we only need another word if the top bit is 1.  */
+	  if (carry == 1
+	      || (x >> (HOST_BITS_PER_WIDE_INT - 1)) == 1)
+	    {
+	      result.elt_ref (result.get_len ()) = carry;
+	      result.set_len (result.get_len () + 1);
+	    }
+	  /* Check for signed overflow.  */
+	  if (sgn == wide_int::SIGNED)
+	    {
+	      if (((x ^ o0) & (x ^ o1)) >> (HOST_BITS_PER_WIDE_INT - 1))
+		*overflow = true;
+	    }
+	  else if (carry)
+	    {
+	      if ((~o0) < o1)
+		*overflow = true;
+	    }
+	  else
+	    {
+	      if ((~o0) <= o1)
+		*overflow = true;
+	    }
+	}
+    }
+  else
+    {
+      /* Overflow in this case is easy since we can see bits beyond
+	 the precision.  If the value computed is not the sign
+	 extended value, then we have overflow.  */
+      unsigned HOST_WIDE_INT y;
+
+      if (sgn == wide_int::UNSIGNED)
+	{
+	  /* The caveat for unsigned is to get rid of the bits above
+	     the precision before doing the addition.  To check the
+	     overflow, clear these bits and then redo the last
+	     addition.  Then the rest of the code just works.  */
+	  o0 = wide_int::zext (o0, small_prec);
+	  o1 = wide_int::zext (o1, small_prec);
+	  x = o0 + o1 + carry;
+	}
+      /* Short integers and modes with weird precisions.  */
+      y = wide_int::sext (x, small_prec);
+      result.set_len (MAX (op0->get_len (), op1->get_len ()));
+      if (BLOCKS_NEEDED (prec) == result.get_len () && x != y)
+	*overflow = true;
+      /* Then put the sign extended form back because that is the
+	 canonical form.  */
+      result.elt_ref (result.get_len () - 1) = y;
+    }
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_www ("wide_int::sub_overflow", result, *op0, *op1);
+#endif
+  return result;
+}
+
+/* Subtract X from THIS.  If overflow occurs, set OVERFLOW.  */
+
+wide_int
+wide_int::sub (const wide_int &x, Op sgn, bool *overflow) const
+{
+  return sub_overflow (this, &x, sgn, overflow);
+}
+
+/* Truncate THIS to MODE.  */
+
+wide_int
+wide_int::truncate (enum machine_mode m) const
+{
+  gcc_assert (GET_MODE_PRECISION (m) <= precision ());
+  return copy (m);
+}
+
+/*
+ * Division and Mod
+ */
+
+/* Compute B_QUOTIENT and B_REMAINDER from B_DIVIDEND/B_DIVISOR.  The
+   algorithm is a small modification of the algorithm in Hacker's
+   Delight by Warren, which itself is a small modification of Knuth's
+   algorithm.  M is the number of significant elements of B_DIVIDEND,
+   but there needs to be at least one extra element of B_DIVIDEND
+   allocated; N is the number of elements of B_DIVISOR.  */
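+/* The division works on "digits" in base b = 2^HOST_BITS_PER_HALF_WIDE_INT.
+   Normalizing both operands (shifting left by S so that the top
+   divisor digit has its high bit set) keeps each estimated quotient
+   digit QHAT within 2 of the true digit; see Knuth, The Art of
+   Computer Programming, vol. 2, section 4.3.1.  */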
+
+static void
+divmod_internal_2 (unsigned HOST_HALF_WIDE_INT *b_quotient, 
+		   unsigned HOST_HALF_WIDE_INT *b_remainder,
+		   unsigned HOST_HALF_WIDE_INT *b_dividend, 
+		   unsigned HOST_HALF_WIDE_INT *b_divisor, 
+		   int m, int n)
+{
+  /* The "digits" are a HOST_HALF_WIDE_INT which the size of half of a
+     HOST_WIDE_INT and stored in the lower bits of each word.  This
+     algorithm should work properly on both 32 and 64 bit
+     machines.  */
+  unsigned HOST_WIDE_INT b
+    = (unsigned HOST_WIDE_INT)1 << HOST_BITS_PER_HALF_WIDE_INT;
+  unsigned HOST_WIDE_INT qhat;   /* Estimate of quotient digit.  */
+  unsigned HOST_WIDE_INT rhat;   /* A remainder.  */
+  unsigned HOST_WIDE_INT p;      /* Product of two digits.  */
+  HOST_WIDE_INT s, i, j, t, k;
+
+  /* Single digit divisor.  */
+  if (n == 1)
+    {
+      k = 0;
+      for (j = m - 1; j >= 0; j--)
+	{
+	  b_quotient[j] = (k * b + b_dividend[j])/b_divisor[0];
+	  k = ((k * b + b_dividend[j])
+	       - ((unsigned HOST_WIDE_INT)b_quotient[j]
+		  * (unsigned HOST_WIDE_INT)b_divisor[0]));
+	}
+      b_remainder[0] = k;
+      return;
+    }
+
+  s = clz_hwi (b_divisor[n-1]) - HOST_BITS_PER_HALF_WIDE_INT; /* CHECK clz */
+
+  /* Normalize B_DIVIDEND and B_DIVISOR.  Unlike the published
+     algorithm, we can overwrite b_dividend and b_divisor, so we do
+     that.  */
+  for (i = n - 1; i > 0; i--)
+    b_divisor[i] = (b_divisor[i] << s)
+      | (b_divisor[i-1] >> (HOST_BITS_PER_HALF_WIDE_INT - s));
+  b_divisor[0] = b_divisor[0] << s;
+
+  b_dividend[m] = b_dividend[m-1] >> (HOST_BITS_PER_HALF_WIDE_INT - s);
+  for (i = m - 1; i > 0; i--)
+    b_dividend[i] = (b_dividend[i] << s)
+      | (b_dividend[i-1] >> (HOST_BITS_PER_HALF_WIDE_INT - s));
+  b_dividend[0] = b_dividend[0] << s;
+
+  /* Main loop.  */
+  for (j = m - n; j >= 0; j--)
+    {
+      qhat = (b_dividend[j+n] * b + b_dividend[j+n-1]) / b_divisor[n-1];
+      rhat = (b_dividend[j+n] * b + b_dividend[j+n-1]) - qhat * b_divisor[n-1];
+    again:
+      if (qhat >= b || qhat * b_divisor[n-2] > b * rhat + b_dividend[j+n-2])
+	{
+	  qhat -= 1;
+	  rhat += b_divisor[n-1];
+	  if (rhat < b)
+	    goto again;
+	}
+
+      /* Multiply and subtract.  */
+      k = 0;
+      for (i = 0; i < n; i++)
+	{
+	  p = qhat * b_divisor[i];
+	  t = b_dividend[i+j] - k - (p & HALF_INT_MASK);
+	  b_dividend[i + j] = t;
+	  k = ((p >> HOST_BITS_PER_HALF_WIDE_INT)
+	       - (t >> HOST_BITS_PER_HALF_WIDE_INT));
+	}
+      t = b_dividend[j+n] - k;
+      b_dividend[j+n] = t;
+
+      b_quotient[j] = qhat;
+      if (t < 0)
+	{
+	  b_quotient[j] -= 1;
+	  k = 0;
+	  for (i = 0; i < n; i++)
+	    {
+	      t = (HOST_WIDE_INT)b_dividend[i+j] + b_divisor[i] + k;
+	      b_dividend[i+j] = t;
+	      k = t >> HOST_BITS_PER_HALF_WIDE_INT;
+	    }
+	  b_dividend[j+n] += k;
+	}
+    }
+  for (i = 0; i < n; i++)
+    b_remainder[i] = (b_dividend[i] >> s) 
+      | (b_dividend[i+1] << (HOST_BITS_PER_HALF_WIDE_INT - s));
+}
+
+
+/* Do a truncating divide DIVISOR into DIVIDEND.  The result is the
+   same size as the operands.  SGN is either wide_int::SIGNED or
+   wide_int::UNSIGNED.  */
+
+static wide_int
+divmod_internal (bool compute_quotient, 
+		 const wide_int *dividend, const wide_int *divisor,
+		 wide_int::Op sgn, wide_int *remainder, bool compute_remainder, 
+		 bool *overflow)
+{
+  wide_int quotient, u0, u1;
+  int prec = dividend->precision ();
+  int blocks_needed = 2 * BLOCKS_NEEDED (prec);
+  unsigned HOST_HALF_WIDE_INT b_quotient[2 * MAX_BITSIZE_MODE_ANY_INT
+				/ HOST_BITS_PER_WIDE_INT];
+  unsigned HOST_HALF_WIDE_INT b_remainder[2 * MAX_BITSIZE_MODE_ANY_INT
+				/ HOST_BITS_PER_WIDE_INT];
+  unsigned HOST_HALF_WIDE_INT b_dividend[(2 * MAX_BITSIZE_MODE_ANY_INT
+				 / HOST_BITS_PER_WIDE_INT) + 1];
+  unsigned HOST_HALF_WIDE_INT b_divisor[2 * MAX_BITSIZE_MODE_ANY_INT
+				/ HOST_BITS_PER_WIDE_INT];
+  int m, n;
+  bool dividend_neg = false;
+  bool divisor_neg = false;
+
+  if (divisor->zero_p ())
+    *overflow = true;
+
+  /* The smallest signed number / -1 causes overflow.  */
+  if (sgn == wide_int::SIGNED)
+    {
+      wide_int t = wide_int::set_bit_in_zero (dividend->precision () - 1,
+					      dividend->get_mode ());
+      if (*dividend == t && divisor->minus_one_p ())
+	*overflow = true;
+    }
+
+  gcc_assert (dividend->get_mode () == divisor->get_mode ());
+  quotient.set_mode (dividend->get_mode ());
+  remainder->set_mode (dividend->get_mode ());
+
+  /* If overflow is set, just get out.  There will only be grief by
+     continuing.  */
+  if (*overflow)
+    {
+      if (compute_remainder)
+	{
+	  remainder->set_len (1);
+	  remainder->elt_ref (0) = 0;
+	}
+      return wide_int_zero (quotient.get_mode ());
+    }
+
+  /* Do it on the host if you can.  */
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      quotient.set_len (1);
+      remainder->set_len (1);
+      if (sgn == wide_int::SIGNED)
+	{
+	  quotient.elt_ref (0) = wide_int::sext (dividend->elt (0) / divisor->elt (0), prec);
+	  remainder->elt_ref (0) = wide_int::sext (dividend->elt (0) % divisor->elt (0), prec);
+	}
+      else
+	{
+	  unsigned HOST_WIDE_INT o0 = dividend->elt (0);
+	  unsigned HOST_WIDE_INT o1 = divisor->elt (0);
+
+	  if (prec < HOST_BITS_PER_WIDE_INT)
+	    {
+	      o0 = wide_int::zext (o0, prec);
+	      o1 = wide_int::zext (o1, prec);
+	    }
+	  quotient.elt_ref (0) = wide_int::sext (o0 / o1, prec);
+	  remainder->elt_ref (0) = wide_int::sext (o0 % o1, prec);
+	}
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwww ("wide_int::divmod", quotient, *remainder, *dividend, *divisor);
+#endif
+      return quotient;
+    }
+
+  /* Make the divisor and dividend positive and remember what we
+     did.  */
+  if (sgn == wide_int::SIGNED)
+    {
+      if (dividend->sign_mask ())
+	{
+	  u0 = dividend->neg ();
+	  dividend = &u0;
+	  dividend_neg = true;
+	}
+      if (divisor->sign_mask ())
+	{
+	  u1 = divisor->neg ();
+	  divisor = &u1;
+	  divisor_neg = true;
+	}
+    }
+
+  wi_unpack (b_dividend, &dividend->uelt_ref (0), dividend->get_len (),
+	     blocks_needed);
+  wi_unpack (b_divisor, &divisor->uelt_ref (0), divisor->get_len (),
+	     blocks_needed);
+
+  if (dividend->sign_mask ())
+    m = blocks_needed;
+  else
+    m = 2 * dividend->get_len ();
+
+  if (divisor->sign_mask ())
+    n = blocks_needed;
+  else
+    n = 2 * divisor->get_len ();
+
+  divmod_internal_2 (b_quotient, b_remainder, b_dividend, b_divisor, m, n);
+
+  if (compute_quotient)
+    {
+      wi_pack (&quotient.uelt_ref (0), b_quotient, m);
+      quotient.set_len (m / 2);
+      quotient.canonize ();
+      /* The quotient is neg if exactly one of the divisor or dividend is
+	 neg.  */
+      if (dividend_neg != divisor_neg)
+	quotient = quotient.neg ();
+    }
+
+  if (compute_remainder)
+    {
+      wi_pack (&remainder->uelt_ref (0), b_remainder, n);
+      remainder->set_len (n / 2);
+      (*remainder).canonize ();
+      /* The remainder is always the same sign as the dividend.  */
+      if (dividend_neg)
+	*remainder = (*remainder).neg ();
+    }
+
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwww ("wide_int::divmod", quotient, *remainder, *dividend, *divisor);
+#endif
+  return quotient;
+}
+
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The output is
+   truncated.  */
+
+wide_int
+wide_int::div_trunc (const wide_int &divisor, Op sgn) const
+{
+  wide_int remainder;
+  bool overflow;
+
+  return divmod_internal (true, this, &divisor, sgn, 
+			  &remainder, false, &overflow);
+}
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The output is truncated.
+   Overflow is set to true if the result overflows, otherwise it is
+   not set.  */
+wide_int
+wide_int::div_trunc (const wide_int &divisor, Op sgn, bool *overflow) const
+{
+  wide_int remainder;
+  
+  return divmod_internal (true, this, &divisor, sgn, 
+			  &remainder, false, overflow);
+}
+
+/* Divide DIVISOR into THIS producing both the quotient and remainder.
+   The result is the same size as the operands.  The sign is specified
+   in SGN.  The output is truncated.  */
+
+wide_int
+wide_int::divmod_trunc (const wide_int &divisor, wide_int *remainder, Op sgn) const
+{
+  bool overflow;
+
+  return divmod_internal (true, this, &divisor, sgn, 
+			  remainder, true, &overflow);
+}
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is
+   the same size as the operands.  The sign is specified in SGN.  The
+   output is truncated.  */
+
+wide_int
+wide_int::mod_trunc (const wide_int &divisor, Op sgn) const
+{
+  bool overflow;
+  wide_int remainder;
+
+  divmod_internal (false, this, &divisor, sgn, 
+		   &remainder, true, &overflow);
+  return remainder;
+}
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is
+   the same size as the operands.  The sign is specified in SGN.  The
+   output is truncated.  Overflow is set to true if the result
+   overflows, otherwise it is not set.  */
+
+wide_int
+wide_int::mod_trunc (const wide_int &divisor, Op sgn, bool *overflow) const
+{
+  wide_int remainder;
+
+  divmod_internal (false, this, &divisor, sgn,
+		   &remainder, true, overflow);
+  return remainder;
+}
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The quotient is rounded
+   toward negative infinity (floor).  Overflow is set to true if the
+   result overflows, otherwise it is not set.  */
+
+wide_int
+wide_int::div_floor (const wide_int &divisor, Op sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+
+  quotient = divmod_internal (true, this, &divisor, sgn, 
+			      &remainder, true, overflow);
+  if (sgn == SIGNED && quotient.neg_p () && !remainder.zero_p ())
+    return quotient - (HOST_WIDE_INT)1;
+  return quotient;
+}
+
+
+/* Divide DIVISOR into THIS.  The remainder is also produced in
+   REMAINDER.  The result is the same size as the operands.  The sign
+   is specified in SGN.  The quotient is rounded toward negative
+   infinity (floor).  */
+
+wide_int
+wide_int::divmod_floor (const wide_int &divisor, wide_int *remainder, Op sgn) const
+{
+  wide_int quotient;
+  bool overflow;
+
+  quotient = divmod_internal (true, this, &divisor, sgn, 
+			      remainder, true, &overflow);
+  if (sgn == SIGNED && quotient.neg_p () && !(*remainder).zero_p ())
+    {
+      *remainder = *remainder + divisor;
+      return quotient - (HOST_WIDE_INT)1;
+    }
+  return quotient;
+}
+
+
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is
+   the same size as the operands.  The sign is specified in SGN.  The
+   remainder is that of the floor division.  Overflow is set to true
+   if the result overflows, otherwise it is not set.  */
+
+wide_int
+wide_int::mod_floor (const wide_int &divisor, Op sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+
+  quotient = divmod_internal (true, this, &divisor, sgn, 
+			      &remainder, true, overflow);
+
+  if (sgn == SIGNED && quotient.neg_p () && !remainder.zero_p ())
+    return remainder + divisor;
+  return remainder;
+}
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The quotient is rounded
+   toward positive infinity (ceil).  Overflow is set to true if the
+   result overflows, otherwise it is not set.  */
+
+wide_int
+wide_int::div_ceil (const wide_int &divisor, Op sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+
+  quotient = divmod_internal (true, this, &divisor, sgn, 
+			      &remainder, true, overflow);
+
+  if (!remainder.zero_p ())
+    {
+      if (sgn == SIGNED && quotient.neg_p ())
+	return quotient;
+      else
+	return quotient + (HOST_WIDE_INT)1;
+    }
+  return quotient;
+}
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is
+   the same size as the operands.  The sign is specified in SGN.  The
+   remainder is that of the ceiling division.  Overflow is set to
+   true if the result overflows, otherwise it is not set.  */
+
+wide_int
+wide_int::mod_ceil (const wide_int &divisor, Op sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+
+  quotient = divmod_internal (true, this, &divisor, sgn, 
+			      &remainder, true, overflow);
+
+  if (!remainder.zero_p ())
+    {
+      if (sgn == SIGNED && quotient.neg_p ())
+	return remainder;
+      else
+	return remainder - divisor;
+    }
+  return remainder;
+}
+
+/* Divide DIVISOR into THIS.  The result is the same size as the
+   operands.  The sign is specified in SGN.  The quotient is rounded
+   to the nearest integer.  Overflow is set to true if the result
+   overflows, otherwise it is not set.  */
+
+wide_int
+wide_int::div_round (const wide_int &divisor, Op sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+
+  quotient = divmod_internal (true, this, &divisor, sgn, 
+			      &remainder, true, overflow);
+  if (!remainder.zero_p ())
+    {
+      if (sgn == SIGNED)
+	{
+	  wide_int p_remainder = remainder.neg_p () ? remainder.neg () : remainder;
+	  wide_int p_divisor = divisor.neg_p () ? divisor.neg () : divisor;
+	  p_divisor = p_divisor.rshiftu (1);
+	  
+	  if (p_remainder.gts_p (p_divisor))
+	    {
+	      if (quotient.neg_p ())
+		return quotient - (HOST_WIDE_INT)1;
+	      else 
+		return quotient + (HOST_WIDE_INT)1;
+	    }
+	}
+      else
+	{
+	  wide_int p_divisor = divisor.rshiftu (1);
+	  if (remainder.gtu_p (p_divisor))
+	    return quotient + (unsigned HOST_WIDE_INT)1;
+	}
+    }
+  return quotient;
+}
+
+/* Divide DIVISOR into THIS producing the remainder.  The result is
+   the same size as the operands.  The sign is specified in SGN.  The
+   remainder is that of the round-to-nearest division.  Overflow is
+   set to true if the result overflows, otherwise it is not set.  */
+
+wide_int
+wide_int::mod_round (const wide_int &divisor, Op sgn, bool *overflow) const
+{
+  wide_int remainder;
+  wide_int quotient;
+
+  quotient = divmod_internal (true, this, &divisor, sgn, 
+			      &remainder, true, overflow);
+
+  if (!remainder.zero_p ())
+    {
+      if (sgn == SIGNED)
+	{
+	  wide_int p_remainder = remainder.neg_p () ? remainder.neg () : remainder;
+	  wide_int p_divisor = divisor.neg_p () ? divisor.neg () : divisor;
+	  p_divisor = p_divisor.rshiftu (1);
+	  
+	  if (p_remainder.gts_p (p_divisor))
+	    {
+	      if (quotient.neg_p ())
+		return remainder + divisor;
+	      else 
+		return remainder - divisor;
+	    }
+	}
+      else
+	{
+	  wide_int p_divisor = divisor.rshiftu (1);
+	  if (remainder.gtu_p (p_divisor))
+	    return remainder - divisor;
+	}
+    }
+  return remainder;
+}
+
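+/* As a concrete illustration of the four rounding variants, for
+   SImode operands (illustrative values only):
+
+     dividend -8, divisor 3:
+       div_trunc   -2    mod_trunc   -2
+       div_floor   -3    mod_floor    1
+       div_ceil    -2    mod_ceil    -2
+       div_round   -3    mod_round    1
+
+   Truncation rounds toward zero, floor toward negative infinity,
+   ceil toward positive infinity and round to the nearest integer.
+   In all cases quotient * divisor + remainder == dividend.  */
+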
+/*
+ * Shifting, rotating and extraction.
+ */
+
+/* If SHIFT_COUNT_TRUNCATED is defined, truncate CNT.   
+
+   At first glance, the shift truncation code does not look right.
+   Shifts (and rotates) are done according to the precision of the
+   mode but the shift count is truncated according to the bitsize
+   of the mode.  This is how real hardware works.
+
+   On an ideal machine, like Knuth's MIX machine, a shift count is a
+   word long and all of the bits of that word are examined to compute
+   the shift amount.  But on real hardware, especially on machines
+   with fast (single cycle) shifts, that takes too long.  On these
+   machines, the amount of time to perform a shift dictates the cycle
+   time of the machine, so corners are cut to keep this fast.  A
+   comparison of an entire 64 bit word would take something like 6
+   gate delays before the shifting can even start.
+
+   So real hardware only looks at a small part of the shift amount.
+   On IBM machines, this tends to be 1 more than what is necessary to
+   encode the shift amount.  The rest of the world looks at only the
+   minimum number of bits.  This means that only 3 gate delays are
+   necessary to set up the shifter.
+
+   On the other hand, right shifts and rotates must be according to
+   the precision or the operation does not make any sense.   */
+static inline int
+trunc_shift (const enum machine_mode mode, int cnt)
+{
+#ifdef SHIFT_COUNT_TRUNCATED
+  cnt = cnt & (GET_MODE_BITSIZE (mode) - 1);
+#endif
+  return cnt;
+}
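+
+/* For example, on a 64 bit target that defines SHIFT_COUNT_TRUNCATED,
+   a DImode shift count of 65 is truncated to 65 & 63 == 1, so the
+   shift is performed as a shift by 1; on a target that leaves
+   SHIFT_COUNT_TRUNCATED undefined, the count is returned unchanged.
+   (Illustrative values only.)  */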
+
+/* This function is called in two contexts.  If Z == TRUNC, this
+   function provides a count that matches the semantics of the target
+   machine depending on the value of SHIFT_COUNT_TRUNCATED.  Note that
+   if SHIFT_COUNT_TRUNCATED is not defined, this function may produce
+   -1 as a value if the shift amount is greater than the bitsize of
+   the mode.  -1 is a surrogate for a very large amount.
+
+   If Z == NONE, then this function always truncates the shift value
+   to the bitsize because this shifting operation is a function that
+   is internal to GCC.  */
+
+static inline int
+trunc_shift (const enum machine_mode mode, const wide_int *cnt, wide_int::Op z)
+{
+  int bitsize = GET_MODE_BITSIZE (mode);
+
+  if (z == wide_int::TRUNC)
+    {
+#ifdef SHIFT_COUNT_TRUNCATED
+      return cnt->elt (0) & (bitsize - 1);
+#else
+      if (cnt->ltu_p (bitsize))
+	return cnt->elt (0) & (bitsize - 1);
+      else 
+	return -1;
+#endif
+    }
+  else
+    return cnt->elt (0) & (bitsize - 1);
+}
+
+/* Extract WIDTH bits from THIS starting at OFFSET.  The result is
+   assumed to fit in a HOST_WIDE_INT.  This function is safe in that
+   it can properly access elements that may not be explicitly
+   represented.  */
+
+HOST_WIDE_INT
+wide_int::extract_to_hwi (int offset, int width) const
+{
+  int start_elt, end_elt, shift;
+  HOST_WIDE_INT x;
+
+  /* Get rid of the easy cases first.   */
+  if (offset >= len * HOST_BITS_PER_WIDE_INT)
+    return sign_mask ();
+  if (offset + width <= 0)
+    return 0;
+
+  shift = offset & (HOST_BITS_PER_WIDE_INT - 1);
+  if (offset < 0)
+    {
+      start_elt = -1;
+      end_elt = 0;
+      x = 0;
+    }
+  else
+    {
+      start_elt = offset / HOST_BITS_PER_WIDE_INT;
+      end_elt = (offset + width - 1) / HOST_BITS_PER_WIDE_INT;
+      x = (start_elt >= len
+	   ? sign_mask ()
+	   : (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT) val[start_elt]
+			      >> shift));
+    }
+
+  if (start_elt != end_elt)
+    {
+      HOST_WIDE_INT y = end_elt == len
+	? sign_mask () : val[end_elt];
+
+      x |= y << (HOST_BITS_PER_WIDE_INT - shift);
+    }
+
+  if (width != HOST_BITS_PER_WIDE_INT)
+    x &= ((HOST_WIDE_INT)1 << width) - 1;
+
+  return x;
+}
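+
+/* For example, with 64 bit HOST_WIDE_INTs, LEN == 2 and
+
+     val = { 0xf000000000000000, 0xa },
+
+   extract_to_hwi (60, 8) reads the top four bits of element 0 and
+   the low four bits of element 1, returning 0xaf.  (Illustrative
+   values only.)  */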
+
+
+/* Left shift by an integer Y.  See the definition of Op.TRUNC for how
+   to set Z.  */
+
+wide_int
+wide_int::lshift (int y, Op z) const
+{
+  return lshift (y, z, mode);
+}
+
+/* Left shift by a wide_int shift amount.  See the definition of
+   Op.TRUNC for how to set Z.  */
+
+wide_int
+wide_int::lshift (const wide_int &y, Op z) const
+{
+  if (z == TRUNC)
+    {
+      HOST_WIDE_INT shift = trunc_shift (mode, &y, TRUNC);
+      if (shift == -1)
+	return wide_int_zero (mode);
+      return lshift (shift, NONE, mode);
+    }
+  else
+    return lshift (trunc_shift (mode, &y, NONE), NONE, mode);
+}
+
+/* Left shift THIS by CNT.  See the definition of Op.TRUNC for how to
+   set OP.  Since this is used internally, it has the ability to
+   specify the mode M independently.  This is useful when inserting a
+   small value into a larger one.  */
+
+wide_int
+wide_int::lshift (int cnt, Op op, enum machine_mode m) const
+{
+  wide_int result;
+  int res_prec = GET_MODE_PRECISION (m);
+  int i;
+
+  result.mode = m;
+
+  if (op == TRUNC)
+    cnt = trunc_shift (mode, cnt);
+
+  /* Handle the simple case quickly.   */
+  if (res_prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      result.val[0] = val[0] << cnt;
+      result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::lshift", result, *this, cnt);
+#endif
+
+      return result;
+    }
+
+  if (cnt >= res_prec)
+    {
+      result.val[0] = 0;
+      result.len = 1;
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::lshift", result, *this, cnt);
+#endif
+      return result;
+    }
+
+  for (i = 0; i < res_prec; i += HOST_BITS_PER_WIDE_INT)
+    result.val[i / HOST_BITS_PER_WIDE_INT]
+      = extract_to_hwi (i - cnt, HOST_BITS_PER_WIDE_INT);
+
+  result.len = BLOCKS_NEEDED (res_prec);
+
+  result.canonize ();
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::lshift", result, *this, cnt);
+#endif
+
+  return result;
+}
+
+/* Rotate THIS left by Y within its precision.  */
+
+wide_int
+wide_int::lrotate (const wide_int &y) const
+{
+  return lrotate (y.extract_to_hwi (0, HOST_BITS_PER_WIDE_INT));
+}
+
+/* Rotate THIS left by CNT within its precision.  */
+
+wide_int
+wide_int::lrotate (int cnt) const
+{
+  wide_int left, right, result;
+  int prec = precision ();
+
+  left = lshift (cnt, NONE);
+  right = rshiftu (prec - cnt, NONE);
+  result = left | right;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::lrotate", result, *this, cnt);
+#endif
+  return result;
+}
+
+/* Unsigned right shift by Y.  See the definition of Op.TRUNC for how
+   to set Z.  */
+
+wide_int
+wide_int::rshiftu (const wide_int &y, Op z) const
+{
+  if (z == TRUNC)
+    {
+      HOST_WIDE_INT shift = trunc_shift (mode, &y, TRUNC);
+      if (shift == -1)
+	return wide_int_zero (mode);
+      return rshiftu (shift, NONE);
+    }
+  else
+    return rshiftu (trunc_shift (mode, &y, NONE), NONE);
+}
+
+/* Unsigned right shift THIS by CNT.  See the definition of Op.TRUNC
+   for how to set TRUNC_OP.  */
+
+wide_int
+wide_int::rshiftu (int cnt, Op trunc_op) const
+{
+  wide_int result;
+  int prec = precision ();
+  int stop_block, offset, i;
+
+  result.mode = mode;
+
+  if (trunc_op == TRUNC)
+    cnt = trunc_shift (mode, cnt);
+
+  if (cnt == 0)
+    {
+      result = copy (mode);
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::rshiftu", result, *this, cnt);
+#endif
+      return result;
+    }
+
+  /* Handle the simple case quickly.   */
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      unsigned HOST_WIDE_INT x = val[0];
+
+      if (prec < HOST_BITS_PER_WIDE_INT)
+	x = zext (x, prec);
+
+      result.val[0] = x >> cnt;
+      result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::rshiftu", result, *this, cnt);
+#endif
+      return result;
+    }
+
+  if (cnt >= prec)
+    {
+      result.val[0] = 0;
+      result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::rshiftu", result, *this, cnt);
+#endif
+      return result;
+    }
+
+  stop_block = BLOCKS_NEEDED (prec - cnt);
+  for (i = 0; i < stop_block; i++)
+    result.val[i]
+      = extract_to_hwi ((i * HOST_BITS_PER_WIDE_INT) + cnt,
+			HOST_BITS_PER_WIDE_INT);
+
+  result.len = stop_block;
+
+  offset = (prec - cnt) & (HOST_BITS_PER_WIDE_INT - 1);
+  if (offset)
+    result.val[stop_block - 1] = zext (result.val[stop_block - 1], offset);
+  else
+    /* The top block has its top bit set, so it would decompress as a
+       negative value unless a zero block is added.  This only works
+       because we know the shift was greater than 0.  */
+    if (result.val[stop_block - 1] < 0)
+      result.val[result.len++] = 0;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::rshiftu", result, *this, cnt);
+#endif
+  return result;
+}
+
+/* Signed right shift by Y.  See the definition of Op.TRUNC for how to
+   set Z.  */
+wide_int
+wide_int::rshifts (const wide_int &y, Op z) const
+{
+  if (z == TRUNC)
+    {
+      HOST_WIDE_INT shift = trunc_shift (mode, &y, TRUNC);
+      if (shift == -1)
+	{
+	  /* The value of the shift was larger than the bitsize and this
+	     machine does not truncate the value, so the result is
+	     a smeared sign bit.  */
+	  if (neg_p ())
+	    return wide_int_minus_one (mode);
+	  else
+	    return wide_int_zero (mode);
+	}
+      return rshifts (shift, NONE);
+    }
+  else
+    return rshifts (trunc_shift (mode, &y, NONE), NONE);
+}
+
+/* Signed right shift THIS by CNT.  See the definition of Op.TRUNC for
+   how to set TRUNC_OP.  */
+
+wide_int
+wide_int::rshifts (int cnt, Op trunc_op) const
+{
+  wide_int result;
+  int prec = precision ();
+  int stop_block, i;
+
+  result.mode = mode;
+
+  if (trunc_op == TRUNC)
+    cnt = trunc_shift (result.mode, cnt);
+
+  if (cnt == 0)
+    {
+      result = copy (mode);
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::rshifts", result, *this, cnt);
+#endif
+      return result;
+    }
+  /* Handle the simple case quickly.   */
+  if (prec <= HOST_BITS_PER_WIDE_INT)
+    {
+      HOST_WIDE_INT x = val[0];
+      result.val[0] = x >> cnt;
+      result.len = 1;
+
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::rshifts", result, *this, cnt);
+#endif
+      return result;
+    }
+
+  if (cnt >= prec)
+    {
+      HOST_WIDE_INT m = sign_mask ();
+      result.val[0] = m;
+      result.len = 1;
+#ifdef DEBUG_WIDE_INT
+      if (dump_file)
+	debug_wwv ("wide_int::rshifts", result, *this, cnt);
+#endif
+      return result;
+    }
+
+  stop_block = BLOCKS_NEEDED (prec - cnt);
+  for (i = 0; i < stop_block; i++)
+    result.val[i]
+      = extract_to_hwi ((i * HOST_BITS_PER_WIDE_INT) + cnt,
+			HOST_BITS_PER_WIDE_INT);
+
+  result.len = stop_block;
+
+  /* No need to sign extend the last block, since it extract_to_hwi
+     already did that.  */
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::rshifts", result, *this, cnt);
+#endif
+
+  return result;
+}
+
+/* Rotate THIS right by Y within its precision.  */
+
+wide_int
+wide_int::rrotate (const wide_int &y) const
+{
+  return rrotate (y.extract_to_hwi (0, HOST_BITS_PER_WIDE_INT));
+}
+
+/* Rotate THIS right by CNT within its precision.  */
+
+wide_int
+wide_int::rrotate (int cnt) const
+{
+  wide_int left, right, result;
+  int prec = precision ();
+
+  left = lshift (prec - cnt, NONE);
+  right = rshiftu (cnt, NONE);
+  result = left | right;
+
+#ifdef DEBUG_WIDE_INT
+  if (dump_file)
+    debug_wwv ("wide_int::rrotate", result, *this, cnt);
+#endif
+  return result;
+}
+
+/*
+ * Private utilities.
+ */
+/* Decompress THIS for at least TARGET bits into a result with MODE.  */
+
+wide_int
+wide_int::decompress (int target, enum machine_mode mode) const
+{
+  wide_int result;
+  int blocks_needed = BLOCKS_NEEDED (target);
+  HOST_WIDE_INT mask;
+  int len, i;
+
+  result.mode = mode;
+  result.len = blocks_needed;
+
+  for (i = 0; i < this->len; i++)
+    result.val[i] = val[i];
+
+  len = this->len;
+
+  /* One could argue that this should just ICE.  */
+  if (target > GET_MODE_PRECISION (mode))
+    return result;
+
+  /* The extension that we are doing here is not sign extension, it is
+     decompression.  */
+  mask = sign_mask ();
+  while (len < blocks_needed)
+    result.val[len++] = mask;
+
+  return result;
+}
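+
+/* For example, with a 64 bit HOST_WIDE_INT, the OImode value -1 is
+   stored compressed as LEN == 1 with VAL[0] == -1.  Decompressing it
+   for 256 bits yields LEN == 4 with every element equal to the sign
+   smear -1, while decompressing the OImode value 1 fills the three
+   upper elements with 0.  (Illustrative values only.)  */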
+
+
+/*
+ * Private debug printing routines.
+ */
+
+/* The debugging routines print results of wide operations into the
+   dump files of the respective passes in which they were called.  */
+char *
+wide_int::dump (char* buf) const
+{
+  int i;
+  int l;
+  const char * sep = "";
+
+  l = sprintf (buf, "[%s (", GET_MODE_NAME (mode));
+  for (i = len - 1; i >= 0; i--)
+    {
+      l += sprintf (&buf[l], "%s" HOST_WIDE_INT_PRINT_HEX, sep, val[i]);
+      sep = " ";
+    }
+
+  gcc_assert (len != 0);
+
+  l += sprintf (&buf[l], ")]");
+
+  gcc_assert (l < MAX);
+  return buf;
+}
+
+void
+debug_vw (const char* name, int r, const wide_int& o0)
+{
+  char buf0[MAX];
+  fprintf (dump_file, "%s: %d = %s\n", name, r, o0.dump (buf0));
+}
+
+void
+debug_vwh (const char* name, int r, const wide_int &o0,
+	   HOST_WIDE_INT o1)
+{
+  char buf0[MAX];
+  fprintf (dump_file, "%s: %d = %s " HOST_WIDE_INT_PRINT_HEX "\n", name, r,
+	   o0.dump (buf0), o1);
+}
+
+void
+debug_vww (const char* name, int r, const wide_int &o0,
+	   const wide_int &o1)
+{
+  char buf0[MAX];
+  char buf1[MAX];
+  fprintf (dump_file, "%s: %d = %s OP %s\n", name, r,
+	   o0.dump (buf0), o1.dump (buf1));
+}
+
+void
+debug_wv (const char* name, const wide_int &r, int v0)
+{
+  char buf0[MAX];
+  fprintf (dump_file, "%s: %s = %d\n",
+	   name, r.dump (buf0), v0);
+}
+
+void
+debug_wvv (const char* name, const wide_int &r, int v0, int v1)
+{
+  char buf0[MAX];
+  fprintf (dump_file, "%s: %s = %d %d\n",
+	   name, r.dump (buf0), v0, v1);
+}
+
+void
+debug_wvvv (const char* name, const wide_int &r, int v0,
+	    int v1, int v2)
+{
+  char buf0[MAX];
+  fprintf (dump_file, "%s: %s = %d %d %d\n",
+	   name, r.dump (buf0), v0, v1, v2);
+}
+
+void
+debug_wwv (const char* name, const wide_int &r,
+	   const wide_int &o0, int v0)
+{
+  char buf0[MAX];
+  char buf1[MAX];
+  fprintf (dump_file, "%s: %s = %s %d\n",
+	   name, r.dump (buf0),
+	   o0.dump (buf1), v0);
+}
+
+void
+debug_wwwvv (const char* name, const wide_int &r,
+	     const wide_int &o0, const wide_int &o1, int v0, int v1)
+{
+  char buf0[MAX];
+  char buf1[MAX];
+  char buf2[MAX];
+  fprintf (dump_file, "%s: %s = %s OP %s %d %d\n",
+	   name, r.dump (buf0),
+	   o0.dump (buf1), o1.dump (buf2), v0, v1);
+}
+
+void
+debug_ww (const char* name, const wide_int &r, const wide_int &o0)
+{
+  char buf0[MAX];
+  char buf1[MAX];
+  fprintf (dump_file, "%s: %s = %s\n",
+	   name, r.dump (buf0),
+	   o0.dump (buf1));
+}
+
+void
+debug_www (const char* name, const wide_int &r,
+	   const wide_int &o0, const wide_int &o1)
+{
+  char buf0[MAX];
+  char buf1[MAX];
+  char buf2[MAX];
+  fprintf (dump_file, "%s: %s = %s OP %s\n",
+	   name, r.dump (buf0),
+	   o0.dump (buf1), o1.dump (buf2));
+}
+
+void
+debug_wwwv (const char* name, const wide_int &r,
+	    const wide_int &o0, const wide_int &o1, int v0)
+{
+  char buf0[MAX];
+  char buf1[MAX];
+  char buf2[MAX];
+  fprintf (dump_file, "%s: %s = %s OP %s %d\n",
+	   name, r.dump (buf0),
+	   o0.dump (buf1), o1.dump (buf2), v0);
+}
+
+void
+debug_wwww (const char* name, const wide_int &r,
+	    const wide_int &o0, const wide_int &o1, const wide_int &o2)
+{
+  char buf0[MAX];
+  char buf1[MAX];
+  char buf2[MAX];
+  char buf3[MAX];
+  fprintf (dump_file, "%s: %s = %s OP %s OP %s\n",
+	   name, r.dump (buf0),
+	   o0.dump (buf1), o1.dump (buf2), o2.dump (buf3));
+}
+#endif
+
Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h	(revision 0)
+++ gcc/wide-int.h	(revision 0)
@@ -0,0 +1,718 @@ 
+/* Operations with very long integers.
+   Copyright (C) 2012 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef WIDE_INT_H
+#define WIDE_INT_H
+
+/* A wide integer is currently represented as a vector of
+   HOST_WIDE_INTs.  The vector contains
+   MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT elements, a
+   quantity derived for each host/target combination.  The values are
+   stored in the vector with the least significant
+   HOST_BITS_PER_WIDE_INT bits of the value in element 0.
+
+   A wide_int contains three fields: the vector (VAL), the mode
+   (MODE) and a length (LEN).  The length is the number of HWIs
+   needed to represent the value.
+
+   Since most integers used in a compiler are small values, it is
+   generally profitable to use a representation of the value that is
+   shorter than the mode's precision.  LEN is used to indicate the
+   number of elements of the vector that are in use.  When LEN *
+   HOST_BITS_PER_WIDE_INT < the precision, the value has been
+   compressed.  The elements of the vector at indices greater than
+   LEN - 1 are all implicitly equal to the smear of the highest order
+   bit of element LEN - 1.
+
+   The representation does not contain any information about
+   signedness of the represented value, so it can be used to represent
+   both signed and unsigned numbers.  For operations where the results
+   depend on signedness (division, comparisons), the signedness must
+   be specified separately.  For operations where the signedness
+   matters, one of the operands to the operation specifies either
+   wide_int::SIGNED or wide_int::UNSIGNED.
+
+   All constructors for wide_int take either an enum machine_mode or
+   tree_type.  */
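+
+/* As an illustration, assume a 64 bit HOST_WIDE_INT and a 256 bit
+   OImode.  The OImode constant 5 is then stored as len == 1 with
+   val = { 5 }, and the OImode constant -5 as len == 1 with
+   val = { -5 }; the three unrepresented elements are implicitly
+   equal to the sign smear of val[0] (0 in the first case, -1 in the
+   second).  A value such as 2^64 needs len == 2 with
+   val = { 0, 1 }.  (Illustrative values only.)  */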
+
+
+#ifndef GENERATOR_FILE
+#include "hwint.h"
+#include "options.h"
+#include "tm.h"
+#include "insn-modes.h"
+#include "machmode.h"
+#include "double-int.h"
+#include <gmp.h>
+
+/* Some useful constants.  */
+
+#define wide_int_minus_one(MODE) (wide_int::from_shwi (-1, MODE))
+#define wide_int_zero(MODE)      (wide_int::from_shwi (0, MODE))
+#define wide_int_one(MODE)       (wide_int::from_shwi (1, MODE))
+#define wide_int_two(MODE)       (wide_int::from_shwi (2, MODE))
+#define wide_int_ten(MODE)       (wide_int::from_shwi (10, MODE))
+
+class wide_int {
+  /* Internal representation.  */
+  HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
+  unsigned short len;
+  enum machine_mode mode;
+
+ public:
+  enum Op {
+    NONE,
+    /* There are two uses for the wide-int shifting functions.  The
+       first use is as an emulation of the target hardware.  The
+       second use is as service routines for other optimizations.  The
+       first case needs to be identified by passing TRUNC as the value
+       of Op so that the shift amount is properly handled according
+       to the SHIFT_COUNT_TRUNCATED flag.  For the second case, the
+       shift amount is always truncated to the bitsize of the mode of
+       THIS.  */
+    TRUNC,
+
+    /* Many of the math functions produce different results depending
+       on whether they are SIGNED or UNSIGNED.  In general, there are
+       two different functions, whose names are prefixed with an 's'
+       or a 'u'.  However, for some math functions there is also a
+       routine that does not have the prefix and takes an Op parameter
+       of SIGNED or UNSIGNED.  */
+    SIGNED,
+    UNSIGNED
+  };
+
+  /* Conversions.  */
+
+  static wide_int from_shwi (HOST_WIDE_INT op0, enum machine_mode mode);
+  static wide_int from_shwi (HOST_WIDE_INT op0, enum machine_mode mode, bool *overflow);
+  static wide_int from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode mode);
+  static wide_int from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode mode, bool *overflow);
+
+  static wide_int from_double_int (enum machine_mode, double_int);
+  static wide_int from_int_cst (const_tree);
+  static wide_int from_rtx (const_rtx, enum machine_mode);
+
+  HOST_WIDE_INT to_shwi () const;
+  HOST_WIDE_INT to_shwi (int prec) const;
+  unsigned HOST_WIDE_INT to_uhwi () const;
+  unsigned HOST_WIDE_INT to_uhwi (int prec) const;
+
+  /* Largest and smallest values that can be represented in a mode
+     or precision.  */
+
+  static wide_int max_value (const enum machine_mode mode, Op sgn);
+  static wide_int max_value (const enum machine_mode mode, int prec, Op sgn);
+  static wide_int min_value (const enum machine_mode mode, Op sgn);
+  static wide_int min_value (const enum machine_mode mode, int prec, Op sgn);
+
+  /* Accessors.  */
+
+  inline unsigned short get_len () const;
+  inline void set_len (unsigned int);
+  inline int full_len () const;
+  inline enum machine_mode get_mode () const;
+  inline void set_mode (enum machine_mode m);
+  inline HOST_WIDE_INT& elt_ref (unsigned int i);
+  inline unsigned HOST_WIDE_INT& uelt_ref (unsigned int i);
+  inline const unsigned HOST_WIDE_INT& uelt_ref (unsigned int i) const;
+  inline HOST_WIDE_INT elt (unsigned int i) const;
+  inline int precision () const;
+
+  /* Utility routines.  */
+
+  void canonize ();
+  wide_int copy (enum machine_mode mode) const;
+  static inline HOST_WIDE_INT sext (HOST_WIDE_INT src, int prec);
+  static inline HOST_WIDE_INT zext (HOST_WIDE_INT src, int prec);
+
+  /* Printing functions.  */
+
+  void print_dec (char *buf, Op sgn) const;
+  void print_dec (FILE *file, Op sgn) const;
+  void print_decs (char *buf) const;
+  void print_decs (FILE *file) const;
+  void print_decu (char *buf) const;
+  void print_decu (FILE *file) const;
+  void print_hex (char *buf) const;
+  void print_hex (FILE *file) const;
+
+  /* Comparative functions.  */
+
+  inline bool minus_one_p () const;
+  inline bool zero_p () const;
+  inline bool one_p () const;
+  inline bool neg_p () const;
+  
+  bool operator == (const wide_int &y) const;
+  inline bool operator != (const wide_int &y) const;
+  inline bool gt_p (const HOST_WIDE_INT x, Op sgn) const;
+  inline bool gt_p (const wide_int &x, Op sgn) const;
+  bool gts_p (const HOST_WIDE_INT y) const;
+  inline bool gts_p (const wide_int &y) const;
+  bool gtu_p (const unsigned HOST_WIDE_INT y) const;
+  inline bool gtu_p (const wide_int &y) const;
+
+  inline bool lt_p (const HOST_WIDE_INT x, Op sgn) const;
+  inline bool lt_p (const wide_int &x, Op sgn) const;
+  bool lts_p (const HOST_WIDE_INT y) const;
+  bool lts_p (const wide_int &y) const;
+  bool ltu_p (const unsigned HOST_WIDE_INT y) const;
+  bool ltu_p (const wide_int &y) const;
+
+  bool only_sign_bit_p (int prec) const;
+  bool only_sign_bit_p () const;
+  inline bool fits_uhwi_p () const;
+  inline bool fits_shwi_p () const;
+  bool fits_to_tree_p (const_tree type) const;
+  bool fits_u_p (int prec) const;
+  bool fits_s_p (int prec) const;
+
+  /* Extension  */
+
+  inline wide_int ext (int, Op sgn) const;
+  wide_int sext (int) const;
+  wide_int sext (enum machine_mode mode) const;
+  wide_int zext (int offset) const;
+  wide_int zext (enum machine_mode mode) const;
+
+  /* Masking, and Insertion  */
+
+  wide_int set_bit (int bitpos) const;
+  static wide_int set_bit_in_zero (int, enum machine_mode mode);
+  wide_int insert (const wide_int &op0, int offset, int width) const;
+  static wide_int mask (int, bool, enum machine_mode);
+  wide_int bswap () const;
+  static wide_int shifted_mask (int, int, bool, enum machine_mode);
+  inline HOST_WIDE_INT sign_mask () const;
+
+  /* Logicals */
+
+  wide_int operator & (const wide_int &y) const;
+  wide_int and_not (const wide_int &y) const;
+  wide_int operator ~ () const;
+  wide_int or_not (const wide_int &y) const;
+  wide_int operator | (const wide_int &y) const;
+  wide_int operator ^ (const wide_int &y) const;
+
+  /* Arithmetic operation functions, alpha sorted.  */
+  wide_int abs () const;
+  wide_int operator + (const wide_int &y) const;
+  wide_int operator + (HOST_WIDE_INT y) const;
+  wide_int operator + (unsigned HOST_WIDE_INT y) const;
+  wide_int add (const wide_int &x, Op sgn, bool *overflow) const;
+  wide_int clz (enum machine_mode mode) const;
+  HOST_WIDE_INT clz () const;
+  wide_int clrsb (enum machine_mode mode) const;
+  HOST_WIDE_INT clrsb () const;
+  int cmp (const wide_int &y, Op sgn) const;
+  int cmps (const wide_int &y) const;
+  int cmpu (const wide_int &y) const;
+  wide_int ctz (enum machine_mode mode) const;
+  HOST_WIDE_INT ctz () const;
+  HOST_WIDE_INT exact_log2 () const;
+  HOST_WIDE_INT floor_log2 () const;
+  wide_int ffs () const;
+  wide_int max (const wide_int &y, Op sgn) const;
+  wide_int umax (const wide_int &y) const;
+  wide_int smax (const wide_int &y) const;
+  wide_int min (const wide_int &y, Op sgn) const;
+  wide_int umin (const wide_int &y) const;
+  wide_int smin (const wide_int &y) const;
+  wide_int operator * (const wide_int &y) const;
+  wide_int mul (const wide_int &x, Op sgn, bool *overflow) const;
+  inline wide_int smul (const wide_int &x, bool *overflow) const;
+  inline wide_int umul (const wide_int &x, bool *overflow) const;
+  wide_int mul_full (const wide_int &x, Op sgn) const;
+  inline wide_int umul_full (const wide_int &x) const;
+  inline wide_int smul_full (const wide_int &x) const;
+  wide_int mul_high (const wide_int &x, Op sgn) const;
+  wide_int neg () const;
+  wide_int neg_overflow (bool *z) const;
+  wide_int parity (enum machine_mode m) const;
+  int popcount () const;
+  wide_int popcount (enum machine_mode m) const;
+  wide_int operator - (const wide_int &y) const;
+  wide_int operator - (HOST_WIDE_INT y) const;
+  wide_int operator - (unsigned HOST_WIDE_INT y) const;
+  wide_int sub (const wide_int &x, Op sgn, bool *overflow) const;
+  wide_int truncate (enum machine_mode mode) const;
+
+  /* Division and mod.  These are the ones that are actually used, but
+     there are a lot of them.  */
+
+  wide_int div_trunc (const wide_int &divisor, Op sgn) const;
+  wide_int div_trunc (const wide_int &divisor, Op sgn, bool *overflow) const;
+  inline wide_int sdiv_trunc (const wide_int &divisor) const;
+  inline wide_int udiv_trunc (const wide_int &divisor) const;
+
+  wide_int div_floor (const wide_int &divisor, Op sgn, bool *overflow) const;
+  inline wide_int udiv_floor (const wide_int &divisor) const;
+  inline wide_int sdiv_floor (const wide_int &divisor) const;
+  wide_int div_ceil (const wide_int &divisor, Op sgn, bool *overflow) const;
+  wide_int div_round (const wide_int &divisor, Op sgn, bool *overflow) const;
+
+  wide_int divmod_trunc (const wide_int &divisor, wide_int *mod, Op sgn) const;
+  inline wide_int sdivmod_trunc (const wide_int &divisor, wide_int *mod) const;
+  inline wide_int udivmod_trunc (const wide_int &divisor, wide_int *mod) const;
+
+  wide_int divmod_floor (const wide_int &divisor, wide_int *mod, Op sgn) const;
+  inline wide_int sdivmod_floor (const wide_int &divisor, wide_int *mod) const;
+
+  wide_int mod_trunc (const wide_int &divisor, Op sgn) const;
+  wide_int mod_trunc (const wide_int &divisor, Op sgn, bool *overflow) const;
+  inline wide_int smod_trunc (const wide_int &divisor) const;
+  inline wide_int umod_trunc (const wide_int &divisor) const;
+
+  wide_int mod_floor (const wide_int &divisor, Op sgn, bool *overflow) const;
+  inline wide_int umod_floor (const wide_int &divisor) const;
+  wide_int mod_ceil (const wide_int &divisor, Op sgn, bool *overflow) const;
+  wide_int mod_round (const wide_int &divisor, Op sgn, bool *overflow) const;
+
+
+  /* Shifting, rotating and extracting.  */
+  HOST_WIDE_INT extract_to_hwi (int offset, int width) const;
+
+  wide_int lshift (const wide_int &y, Op z = NONE) const;
+  wide_int lshift (int y, Op z, enum machine_mode m) const;
+  wide_int lshift (int y, Op z = NONE) const;
+
+  wide_int lrotate (const wide_int &y) const;
+  wide_int lrotate (int y) const;
+
+  wide_int rshift (int y, Op sgn) const;
+  inline wide_int rshift (const wide_int &y, Op sgn, Op z = NONE) const;
+  wide_int rshiftu (const wide_int &y, Op z = NONE) const;
+  wide_int rshiftu (int y, Op z = NONE) const;
+  wide_int rshifts (const wide_int &y, Op z = NONE) const;
+  wide_int rshifts (int y, Op z = NONE) const;
+
+  wide_int rrotate (const wide_int &y) const;
+  wide_int rrotate (int y) const;
+
+  static const int DUMP_MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
+			       + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT + 32);
+  char *dump (char* buf) const;
+ private:
+
+  /* Private utility routines.  */
+  wide_int decompress (int target, enum machine_mode mode) const;
+  static wide_int add_overflow (const wide_int *op0, const wide_int *op1,
+				wide_int::Op sgn, bool *overflow);
+  static wide_int sub_overflow (const wide_int *op0, const wide_int *op1, 
+				wide_int::Op sgn, bool *overflow);
+  static wide_int mul_internal (bool high, bool full, 
+				const wide_int *op1, const wide_int *op2, wide_int::Op op, 
+				bool *overflow, bool needs_overflow);
+};
+
+
+/* Produce 0 or -1 that is the smear of the sign bit.  */
+
+HOST_WIDE_INT
+wide_int::sign_mask () const
+{
+  int prec = GET_MODE_PRECISION (mode);
+  int i = len - 1;
+  if (prec < HOST_BITS_PER_WIDE_INT)
+    return ((val[0] << (HOST_BITS_PER_WIDE_INT - prec))
+	    >> (HOST_BITS_PER_WIDE_INT - 1));
+
+  /* VRP appears to be badly broken and this is a very ugly fix.  */
+  if (i >= 0)
+    return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
+
+  gcc_unreachable ();
+}
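+
+/* For example, a QImode wide_int holding -128 (val[0] == -128, the
+   sign bit of the 8 bit precision set) has sign_mask () == -1, while
+   one holding 127 has sign_mask () == 0.  (Illustrative values
+   only.)  */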
+
+#define wide_int_smin(OP0,OP1)  ((OP0).lts_p (OP1) ? (OP0) : (OP1))
+#define wide_int_smax(OP0,OP1)  ((OP0).lts_p (OP1) ? (OP1) : (OP0))
+#define wide_int_umin(OP0,OP1)  ((OP0).ltu_p (OP1) ? (OP0) : (OP1))
+#define wide_int_umax(OP0,OP1)  ((OP0).ltu_p (OP1) ? (OP1) : (OP0))
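+
+/* For example, wide_int_smax (a, b) evaluates to whichever of A and
+   B is larger when both are interpreted as signed values.  Being
+   macros, they may evaluate their arguments more than once.  */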
+
+
+/* Public accessors for the interior of a wide int.  */
+
+/* Get the number of host wide ints actually represented within the
+   wide int.  */
+
+unsigned short
+wide_int::get_len () const
+{
+  return len;
+}
+
+/* Set the number of host wide ints actually represented within the
+   wide int.  */
+
+void
+wide_int::set_len (unsigned int l)
+{
+  gcc_assert (l <= MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT);
+  len = l;
+}
+
+/* Get the mode of the wide int.  */
+
+enum machine_mode
+wide_int::get_mode () const
+{
+  return mode;
+}
+
+/* Set the mode of the wide int.  */
+
+void
+wide_int::set_mode (enum machine_mode m)
+{
+  mode = m;
+}
+
+/* Get the number of host wide ints needed to represent the precision
+   of the number.  */
+
+int
+wide_int::full_len () const
+{
+  return ((GET_MODE_PRECISION (mode) + HOST_BITS_PER_WIDE_INT - 1)
+	  / HOST_BITS_PER_WIDE_INT);
+}
+
+/* Get a reference to a particular element of the wide int.  Does not
+   check I against len as during construction we might want to set len
+   after creating the value.  */
+
+HOST_WIDE_INT&
+wide_int::elt_ref (unsigned int i)
+{
+  /* We check maximal size, not len.  */
+  gcc_assert (i < MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT); 
+
+  return val[i];
+}
+
+/* Get a reference to a particular element of the wide int as an
+   unsigned quantity.  Does not check I against len as during
+   construction we might want to set len after creating the value.  */
+
+unsigned HOST_WIDE_INT&
+wide_int::uelt_ref (unsigned int i)
+{
+  return *(unsigned HOST_WIDE_INT *)&elt_ref (i);
+}
+
+/* Get a reference to a particular element of the wide int as a
+   constant unsigned quantity.  Does not check I against len as during
+   construction we might want to set len after creating the value.  */
+
+const unsigned HOST_WIDE_INT&
+wide_int::uelt_ref (unsigned int i) const
+{
+  /* We check maximal size, not len.  */
+  gcc_assert (i < MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT); 
+
+  return *(const unsigned HOST_WIDE_INT *)&val[i];
+}
+
+/* Get a particular element of the wide int.  */
+
+HOST_WIDE_INT
+wide_int::elt (unsigned int i) const
+{
+  return i >= len ? sign_mask () : val[i];
+}
+
+/* Get the precision of the mode in THIS.  */
+int
+wide_int::precision () const
+{
+  return GET_MODE_PRECISION (mode);
+}
+
+
+
+/* Sign extend SRC from PREC.  */
+
+HOST_WIDE_INT
+wide_int::sext (HOST_WIDE_INT src, int prec)
+{
+  if (prec == HOST_BITS_PER_WIDE_INT)
+    return src;
+  else
+    {
+      int shift = HOST_BITS_PER_WIDE_INT - (prec & (HOST_BITS_PER_WIDE_INT - 1));
+      return (src << shift) >> shift;
+    }
+}
+
+/* Zero extend SRC from PREC.  */
+
+HOST_WIDE_INT
+wide_int::zext (HOST_WIDE_INT src, int prec)
+{
+  if (prec == HOST_BITS_PER_WIDE_INT)
+    return src;
+  else
+    return src & (((HOST_WIDE_INT)1
+		   << (prec & (HOST_BITS_PER_WIDE_INT - 1))) - 1);
+}
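+
+/* For example, with PREC == 8:
+
+     wide_int::sext ((HOST_WIDE_INT) 0xff, 8) == -1
+     wide_int::zext ((HOST_WIDE_INT) 0xff, 8) == 0xff
+
+   sext replicates bit PREC - 1 into the upper host bits while zext
+   clears them.  (Illustrative values only.)  */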
+
+bool
+wide_int::minus_one_p () const
+{
+  return len == 1 && val[0] == (HOST_WIDE_INT)-1;
+}
+
+bool
+wide_int::zero_p () const
+{
+  return len == 1 && val[0] == 0;
+}
+
+bool
+wide_int::one_p () const
+{
+  return len == 1 && val[0] == 1;
+}
+
+bool
+wide_int::neg_p () const
+{
+  return sign_mask () != 0;
+}
+
+bool
+wide_int::operator != (const wide_int &op1) const
+{
+  return !(*this == op1);
+}  
+
+/* Return true if THIS is greater than OP1.  Signedness is indicated
+   by OP.  */
+bool
+wide_int::gt_p (HOST_WIDE_INT op1, Op op) const
+{
+  if (op == SIGNED)
+    return gts_p (op1);
+  else
+    return gtu_p (op1);
+}  
+
+/* Return true if THIS is greater than OP1.  Signedness is indicated
+   by OP.  */
+bool
+wide_int::gt_p (const wide_int &op1, Op op) const
+{
+  if (op == SIGNED)
+    return op1.lts_p (*this);
+  else
+    return op1.ltu_p (*this);
+}  
+
+/* Return true if THIS is signed greater than OP1.  */
+bool
+wide_int::gts_p (const wide_int &op1) const
+{
+  return op1.lts_p (*this);
+}  
+
+/* Return true if THIS is unsigned greater than OP1.  */
+bool
+wide_int::gtu_p (const wide_int &op1) const
+{
+  return op1.ltu_p (*this);
+}  
+
+/* Return true if THIS is less than OP1.  Signedness is indicated
+   by OP.  */
+bool
+wide_int::lt_p (HOST_WIDE_INT op1, Op op) const
+{
+  if (op == SIGNED)
+    return lts_p (op1);
+  else
+    return ltu_p (op1);
+}  
+
+/* Return true if THIS is less than OP1.  Signedness is indicated
+   by OP.  */
+bool
+wide_int::lt_p (const wide_int &op1, Op op) const
+{
+  if (op == SIGNED)
+    return lts_p (op1);
+  else
+    return ltu_p (op1);
+}  
+
+/* Return true if THIS fits in a HOST_WIDE_INT with no loss of
+   precision.  */
+bool
+wide_int::fits_shwi_p () const
+{
+  return len == 1;
+}
+
+/* Return true if THIS fits in an unsigned HOST_WIDE_INT with no loss
+   of precision.  */
+bool
+wide_int::fits_uhwi_p () const
+{
+  return len == 1 
+    || (len == 2 && val[1] == 0);
+}
+
+/* Return THIS extended to PREC.  The signedness of the extension is
+   specified by Z.  */
+wide_int 
+wide_int::ext (int prec, Op z) const
+{
+  if (z == UNSIGNED)
+    return zext (prec);
+  else
+    return sext (prec);
+}
+
+/* Signed multiply THIS and X.  The result is the same precision as
+   the operands.  OVERFLOW is set true if the result overflows.  */
+wide_int
+wide_int::smul (const wide_int &x, bool *overflow) const
+{
+  return mul (x, SIGNED, overflow);
+}
+
+/* Unsigned multiply THIS and X.  The result is the same precision
+   as the operands.  OVERFLOW is set true if the result overflows.  */
+wide_int
+wide_int::umul (const wide_int &x, bool *overflow) const
+{
+  return mul (x, UNSIGNED, overflow);
+}
+
+/* Signed multiply THIS and X.  The result is twice the precision of
+   the operands.  There is an issue with this function if it is called
+   on the widest mode defined on the platform, since there is no mode
+   that is twice as wide.  */
+wide_int
+wide_int::smul_full (const wide_int &x) const
+{
+  return mul_full (x, SIGNED);
+}
+
+/* Unsigned multiply THIS and X.  The result is twice the precision
+   of the operands.  There is an issue with this function if it is
+   called on the widest mode defined on the platform, since there is
+   no mode that is twice as wide.  */
+wide_int
+wide_int::umul_full (const wide_int &x) const
+{
+  return mul_full (x, UNSIGNED);
+}
+
+/* Signed divide with truncation of result.  */
+wide_int
+wide_int::sdiv_trunc (const wide_int &divisor) const
+{
+  return div_trunc (divisor, SIGNED);
+}
+
+/* Unsigned divide with truncation of result.  */
+wide_int
+wide_int::udiv_trunc (const wide_int &divisor) const
+{
+  return div_trunc (divisor, UNSIGNED);
+}
+
+/* Unsigned divide with floor truncation of result.  */
+wide_int
+wide_int::udiv_floor (const wide_int &divisor) const
+{
+  bool overflow;
+
+  return div_floor (divisor, UNSIGNED, &overflow);
+}
+
+/* Signed divide with floor truncation of result.  */
+wide_int
+wide_int::sdiv_floor (const wide_int &divisor) const
+{
+  bool overflow;
+
+  return div_floor (divisor, SIGNED, &overflow);
+}
+
+/* Signed divide/mod with truncation of result.  */
+wide_int
+wide_int::sdivmod_trunc (const wide_int &divisor, wide_int *mod) const
+{
+  return divmod_trunc (divisor, mod, SIGNED);
+}
+
+/* Unsigned divide/mod with truncation of result.  */
+wide_int
+wide_int::udivmod_trunc (const wide_int &divisor, wide_int *mod) const
+{
+  return divmod_trunc (divisor, mod, UNSIGNED);
+}
+
+/* Signed divide/mod with floor truncation of result.  */
+wide_int
+wide_int::sdivmod_floor (const wide_int &divisor, wide_int *mod) const
+{
+  return divmod_floor (divisor, mod, SIGNED);
+}
+
+/* Signed mod with truncation of result.  */
+wide_int
+wide_int::smod_trunc (const wide_int &divisor) const
+{
+  return mod_trunc (divisor, SIGNED);
+}
+
+/* Unsigned mod with truncation of result.  */
+wide_int
+wide_int::umod_trunc (const wide_int &divisor) const
+{
+  return mod_trunc (divisor, UNSIGNED);
+}
+
+/* Unsigned mod with floor truncation of result.  */
+wide_int
+wide_int::umod_floor (const wide_int &divisor) const
+{
+  bool overflow;
+
+  return mod_floor (divisor, UNSIGNED, &overflow);
+}
+
+wide_int
+wide_int::rshift (const wide_int &y, Op sgn, Op z) const
+{
+  if (sgn == UNSIGNED)
+    return rshiftu (y, z);
+  else
+    return rshifts (y, z);
+}
+
+/* Conversion to and from GMP integer representations.  */
+
+void mpz_set_wide_int (mpz_t, wide_int, bool);
+wide_int mpz_get_wide_int (const_tree, mpz_t, bool);
+#endif /* GENERATOR_FILE */
+
+#endif /* WIDE_INT_H */
Index: gcc/genmodes.c
===================================================================
--- gcc/genmodes.c	(revision 191978)
+++ gcc/genmodes.c	(working copy)
@@ -849,6 +849,38 @@  calc_wider_mode (void)
 
 #define print_closer() puts ("};")
 
+/* Compute the max bitsize of some of the classes of integers.  It may
+   be that there are needs for the other integer classes, and this
+   code is easy to extend.  */
+static void
+emit_max_int (void)
+{
+  unsigned int max, mmax;
+  struct mode_data *i;
+  int j;
+
+  puts ("");
+  for (max = 1, i = modes[MODE_INT]; i; i = i->next)
+    if (max < i->bytesize)
+	max = i->bytesize;
+  printf ("#define MAX_BITSIZE_MODE_INT %d*BITS_PER_UNIT\n", max);
+  mmax = max;
+  for (max = 1, i = modes[MODE_PARTIAL_INT]; i; i = i->next)
+    if (max < i->bytesize)
+	max = i->bytesize;
+  printf ("#define MAX_BITSIZE_MODE_PARTIAL_INT %d*BITS_PER_UNIT\n", max);
+  if (max > mmax)
+    mmax = max;
+  printf ("#define MAX_BITSIZE_MODE_ANY_INT %d*BITS_PER_UNIT\n", mmax);
+
+  mmax = 0;
+  for (j = 0; j < MAX_MODE_CLASS; j++)
+    for (i = modes[j]; i; i = i->next)
+      if (mmax < i->bytesize)
+	mmax = i->bytesize;
+  printf ("#define MAX_BITSIZE_MODE_ANY_MODE %d*BITS_PER_UNIT\n", mmax);
+}
+
 static void
 emit_insn_modes_h (void)
 {
@@ -913,6 +945,7 @@  enum machine_mode\n{");
 #endif
   printf ("#define CONST_MODE_IBIT%s\n", adj_ibit ? "" : " const");
   printf ("#define CONST_MODE_FBIT%s\n", adj_fbit ? "" : " const");
+  emit_max_int ();
   puts ("\
 \n\
 #endif /* insn-modes.h */");
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	(revision 191978)
+++ gcc/ira-lives.c	(working copy)
@@ -779,22 +779,16 @@  single_reg_class (const char *constraint
 	  break;
 
 	case 'n':
-	  if (CONST_INT_P (op)
-	      || CONST_DOUBLE_AS_INT_P (op)
-	      || (equiv_const != NULL_RTX
-		  && (CONST_INT_P (equiv_const)
-		      || CONST_DOUBLE_AS_INT_P (equiv_const))))
+	  if (CONST_SCALAR_INT_P (op)
+	      || (equiv_const != NULL_RTX && CONST_SCALAR_INT_P (equiv_const)))
 	    return NO_REGS;
 	  break;
 
 	case 's':
-	  if ((CONSTANT_P (op) 
-	       && !CONST_INT_P (op) 
-	       && !CONST_DOUBLE_AS_INT_P (op))
+	  if ((CONSTANT_P (op) && !CONST_SCALAR_INT_P (op))
 	      || (equiv_const != NULL_RTX
 		  && CONSTANT_P (equiv_const)
-		  && !CONST_INT_P (equiv_const)
-		  && !CONST_DOUBLE_AS_INT_P (equiv_const)))
+		  && !CONST_SCALAR_INT_P (equiv_const)))
 	    return NO_REGS;
 	  break;
 
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c	(revision 191978)
+++ gcc/emit-rtl.c	(working copy)
@@ -128,6 +128,9 @@  rtx cc0_rtx;
 static GTY ((if_marked ("ggc_marked_p"), param_is (struct rtx_def)))
      htab_t const_int_htab;
 
+static GTY ((if_marked ("ggc_marked_p"), param_is (struct rtx_def)))
+     htab_t const_wide_int_htab;
+
 /* A hash table storing memory attribute structures.  */
 static GTY ((if_marked ("ggc_marked_p"), param_is (struct mem_attrs)))
      htab_t mem_attrs_htab;
@@ -153,6 +156,11 @@  static void set_used_decls (tree);
 static void mark_label_nuses (rtx);
 static hashval_t const_int_htab_hash (const void *);
 static int const_int_htab_eq (const void *, const void *);
+#if TARGET_SUPPORTS_WIDE_INT
+static hashval_t const_wide_int_htab_hash (const void *);
+static int const_wide_int_htab_eq (const void *, const void *);
+static rtx lookup_const_wide_int (rtx);
+#endif
 static hashval_t const_double_htab_hash (const void *);
 static int const_double_htab_eq (const void *, const void *);
 static rtx lookup_const_double (rtx);
@@ -189,6 +197,43 @@  const_int_htab_eq (const void *x, const
   return (INTVAL ((const_rtx) x) == *((const HOST_WIDE_INT *) y));
 }
 
+#if TARGET_SUPPORTS_WIDE_INT
+/* Returns a hash code for X (which is really a CONST_WIDE_INT).  */
+
+static hashval_t
+const_wide_int_htab_hash (const void *x)
+{
+  int i;
+  HOST_WIDE_INT hash = 0;
+  const_rtx xr = (const_rtx) x;
+
+  for (i = 0; i < CONST_WIDE_INT_NUNITS (xr); i++)
+    hash += CONST_WIDE_INT_ELT (xr, i);
+
+  return (hashval_t) hash;
+}
+
+/* Returns nonzero if the value represented by X (which is really a
+   CONST_WIDE_INT) is the same as that given by Y (which is really a
+   CONST_WIDE_INT).  */
+
+static int
+const_wide_int_htab_eq (const void *x, const void *y)
+{
+  int i;
+  const_rtx xr = (const_rtx)x;
+  const_rtx yr = (const_rtx)y;
+  if (CONST_WIDE_INT_NUNITS (xr) != CONST_WIDE_INT_NUNITS (yr))
+    return 0;
+
+  for (i = 0; i < CONST_WIDE_INT_NUNITS (xr); i++)
+    if (CONST_WIDE_INT_ELT (xr, i) != CONST_WIDE_INT_ELT (yr, i))
+      return 0;
+  
+  return 1;
+}
+#endif
+
 /* Returns a hash code for X (which is really a CONST_DOUBLE).  */
 static hashval_t
 const_double_htab_hash (const void *x)
@@ -196,7 +241,7 @@  const_double_htab_hash (const void *x)
   const_rtx const value = (const_rtx) x;
   hashval_t h;
 
-  if (GET_MODE (value) == VOIDmode)
+  if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (value) == VOIDmode)
     h = CONST_DOUBLE_LOW (value) ^ CONST_DOUBLE_HIGH (value);
   else
     {
@@ -216,7 +261,7 @@  const_double_htab_eq (const void *x, con
 
   if (GET_MODE (a) != GET_MODE (b))
     return 0;
-  if (GET_MODE (a) == VOIDmode)
+  if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (a) == VOIDmode)
     return (CONST_DOUBLE_LOW (a) == CONST_DOUBLE_LOW (b)
 	    && CONST_DOUBLE_HIGH (a) == CONST_DOUBLE_HIGH (b));
   else
@@ -482,6 +527,7 @@  const_fixed_from_fixed_value (FIXED_VALU
   return lookup_const_fixed (fixed);
 }
 
+#if TARGET_SUPPORTS_WIDE_INT == 0
 /* Constructs double_int from rtx CST.  */
 
 double_int
@@ -501,17 +547,58 @@  rtx_to_double_int (const_rtx cst)
   
   return r;
 }
+#endif
 
+#if TARGET_SUPPORTS_WIDE_INT
+/* Determine whether WINT already exists in the hash table.  If so,
+   return its counterpart; otherwise add WINT to the hash table and
+   return it.  */
 
-/* Return a CONST_DOUBLE or CONST_INT for a value specified as
-   a double_int.  */
-
-rtx
-immed_double_int_const (double_int i, enum machine_mode mode)
+static rtx
+lookup_const_wide_int (rtx wint)
 {
-  return immed_double_const (i.low, i.high, mode);
+  void **slot = htab_find_slot (const_wide_int_htab, wint, INSERT);
+  if (*slot == 0)
+    *slot = wint;
+
+  return (rtx) *slot;
+}
+#endif
+
+/* V contains a wide_int value.  Produce a CONST_INT if it fits in a
+   single HOST_WIDE_INT, otherwise a CONST_WIDE_INT (if
+   TARGET_SUPPORTS_WIDE_INT is nonzero) or a CONST_DOUBLE (if it is
+   not), based on the number of HOST_WIDE_INTs that are necessary to
+   represent the value in compact form.  */
+rtx
+immed_wide_int_const (const wide_int &v)
+{
+  unsigned int len = v.get_len ();
+
+  if (len < 2)
+    return gen_int_mode (v.elt (0), v.get_mode ());
+
+#if TARGET_SUPPORTS_WIDE_INT
+  {
+    rtx value = const_wide_int_alloc (len);
+    unsigned int i;
+
+    /* It is so tempting to just put the mode in here.  Must control
+       myself ... */
+    PUT_MODE (value, VOIDmode);
+    HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (value), len);
+
+    for (i = 0; i < len; i++)
+      CONST_WIDE_INT_ELT (value, i) = v.elt (i);
+
+    return lookup_const_wide_int (value);
+  }
+#else
+  return immed_double_const (v.elt (0), v.elt (1), v.get_mode ());
+#endif
 }
 
+#if TARGET_SUPPORTS_WIDE_INT == 0
 /* Return a CONST_DOUBLE or CONST_INT for a value specified as a pair
    of ints: I0 is the low-order word and I1 is the high-order word.
    For values that are larger than HOST_BITS_PER_DOUBLE_INT, the
@@ -563,6 +650,7 @@  immed_double_const (HOST_WIDE_INT i0, HO
 
   return lookup_const_double (value);
 }
+#endif
 
 rtx
 gen_rtx_REG (enum machine_mode mode, unsigned int regno)
@@ -1244,7 +1332,7 @@  gen_lowpart_common (enum machine_mode mo
     }
   else if (GET_CODE (x) == SUBREG || REG_P (x)
 	   || GET_CODE (x) == CONCAT || GET_CODE (x) == CONST_VECTOR
-	   || CONST_DOUBLE_P (x) || CONST_INT_P (x))
+	   || CONST_DOUBLE_AS_FLOAT_P (x) || CONST_SCALAR_INT_P (x))
     return simplify_gen_subreg (mode, x, innermode, offset);
 
   /* Otherwise, we can't do this.  */
@@ -5575,11 +5663,15 @@  init_emit_once (void)
   enum machine_mode mode;
   enum machine_mode double_mode;
 
-  /* Initialize the CONST_INT, CONST_DOUBLE, CONST_FIXED, and memory attribute
-     hash tables.  */
+  /* Initialize the CONST_INT, CONST_WIDE_INT, CONST_DOUBLE,
+     CONST_FIXED, and memory attribute hash tables.  */
   const_int_htab = htab_create_ggc (37, const_int_htab_hash,
 				    const_int_htab_eq, NULL);
 
+#if TARGET_SUPPORTS_WIDE_INT
+  const_wide_int_htab = htab_create_ggc (37, const_wide_int_htab_hash,
+					 const_wide_int_htab_eq, NULL);
+#endif
   const_double_htab = htab_create_ggc (37, const_double_htab_hash,
 				       const_double_htab_eq, NULL);
 
Index: gcc/combine.c
===================================================================
--- gcc/combine.c	(revision 191978)
+++ gcc/combine.c	(working copy)
@@ -2617,16 +2617,19 @@  try_combine (rtx i3, rtx i2, rtx i1, rtx
      constant.  */
   if (i1 == 0
       && (temp = single_set (i2)) != 0
-      && (CONST_INT_P (SET_SRC (temp))
-	  || CONST_DOUBLE_AS_INT_P (SET_SRC (temp)))
+      && CONST_SCALAR_INT_P (SET_SRC (temp))
       && GET_CODE (PATTERN (i3)) == SET
-      && (CONST_INT_P (SET_SRC (PATTERN (i3)))
-	  || CONST_DOUBLE_AS_INT_P (SET_SRC (PATTERN (i3))))
+      && CONST_SCALAR_INT_P (SET_SRC (PATTERN (i3)))
       && reg_subword_p (SET_DEST (PATTERN (i3)), SET_DEST (temp)))
     {
       rtx dest = SET_DEST (PATTERN (i3));
       int offset = -1;
       int width = 0;
+      
+      /* There are no explicit tests to make sure that this is not a
+	 float, but there is code here that would not be correct if it
+	 were.  */
+      gcc_assert (GET_MODE_CLASS (GET_MODE (SET_SRC (temp))) != MODE_FLOAT);
 
       if (GET_CODE (dest) == ZERO_EXTRACT)
 	{
@@ -2662,23 +2665,15 @@  try_combine (rtx i3, rtx i2, rtx i1, rtx
 	    offset = -1;
 	}
 
-      if (offset >= 0
-	  && (GET_MODE_PRECISION (GET_MODE (SET_DEST (temp)))
-	      <= HOST_BITS_PER_DOUBLE_INT))
+      if (offset >= 0)
 	{
-	  double_int m, o, i;
+	  wide_int o;
 	  rtx inner = SET_SRC (PATTERN (i3));
 	  rtx outer = SET_SRC (temp);
-
-	  o = rtx_to_double_int (outer);
-	  i = rtx_to_double_int (inner);
-
-	  m = double_int::mask (width);
-	  i &= m;
-	  m = m.llshift (offset, HOST_BITS_PER_DOUBLE_INT);
-	  i = i.llshift (offset, HOST_BITS_PER_DOUBLE_INT);
-	  o = o.and_not (m) | i;
-
+	  
+	  o = (wide_int::from_rtx (outer, GET_MODE (SET_DEST (temp)))
+	       .insert (wide_int::from_rtx (inner, GET_MODE (dest)),
+			offset, width));
 	  combine_merges++;
 	  subst_insn = i3;
 	  subst_low_luid = DF_INSN_LUID (i2);
@@ -2689,8 +2684,7 @@  try_combine (rtx i3, rtx i2, rtx i1, rtx
 	  /* Replace the source in I2 with the new constant and make the
 	     resulting insn the new pattern for I3.  Then skip to where we
 	     validate the pattern.  Everything was set up above.  */
-	  SUBST (SET_SRC (temp),
-		 immed_double_int_const (o, GET_MODE (SET_DEST (temp))));
+	  SUBST (SET_SRC (temp), immed_wide_int_const (o));
 
 	  newpat = PATTERN (i2);
 
@@ -5102,8 +5096,7 @@  subst (rtx x, rtx from, rtx to, int in_d
 	      if (GET_CODE (new_rtx) == CLOBBER && XEXP (new_rtx, 0) == const0_rtx)
 		return new_rtx;
 
-	      if (GET_CODE (x) == SUBREG
-		  && (CONST_INT_P (new_rtx) || CONST_DOUBLE_AS_INT_P (new_rtx)))
+	      if (GET_CODE (x) == SUBREG && CONST_SCALAR_INT_P (new_rtx))
 		{
 		  enum machine_mode mode = GET_MODE (x);
 
@@ -5113,7 +5106,7 @@  subst (rtx x, rtx from, rtx to, int in_d
 		  if (! x)
 		    x = gen_rtx_CLOBBER (mode, const0_rtx);
 		}
-	      else if (CONST_INT_P (new_rtx)
+	      else if (CONST_SCALAR_INT_P (new_rtx)
 		       && GET_CODE (x) == ZERO_EXTEND)
 		{
 		  x = simplify_unary_operation (ZERO_EXTEND, GET_MODE (x),
@@ -7133,7 +7126,7 @@  make_extraction (enum machine_mode mode,
       if (mode == tmode)
 	return new_rtx;
 
-      if (CONST_INT_P (new_rtx) || CONST_DOUBLE_AS_INT_P (new_rtx))
+      if (CONST_SCALAR_INT_P (new_rtx))
 	return simplify_unary_operation (unsignedp ? ZERO_EXTEND : SIGN_EXTEND,
 					 mode, new_rtx, tmode);
 
@@ -10672,8 +10665,7 @@  gen_lowpart_for_combine (enum machine_mo
   /* We can only support MODE being wider than a word if X is a
      constant integer or has a mode the same size.  */
   if (GET_MODE_SIZE (omode) > UNITS_PER_WORD
-      && ! ((CONST_INT_P (x) || CONST_DOUBLE_AS_INT_P (x))
-	    || isize == osize))
+      && ! (CONST_SCALAR_INT_P (x) || isize == osize))
     goto fail;
 
   /* X might be a paradoxical (subreg (mem)).  In that case, gen_lowpart
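
The wide_int::insert call in try_combine above is intended to compute
exactly what the deleted double_int sequence computed: replace WIDTH
bits of the outer constant, starting at bit OFFSET, with the low WIDTH
bits of the inner constant.  A worked sketch (not part of the patch),
assuming that semantics:

  /* For outer = 0xffff, inner = 0, offset = 4, width = 8:
     mask = 0xff << 4 = 0x0ff0, so the result is
     (0xffff & ~0x0ff0) | (0 << 4) = 0xf00f.  */
  wide_int o = wide_int::from_shwi (0xffff, SImode);
  wide_int i = wide_int::from_shwi (0, SImode);
  wide_int r = o.insert (i, 4, 8);	/* 0xf00f */
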
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c	(revision 191978)
+++ gcc/print-rtl.c	(working copy)
@@ -634,6 +634,12 @@  print_rtx (const_rtx in_rtx)
 	  fprintf (outfile, " [%s]", s);
 	}
       break;
+
+    case CONST_WIDE_INT:
+      if (! flag_simple)
+	fprintf (outfile, " ");
+      hwivec_output_hex (outfile, CONST_WIDE_INT_VEC (in_rtx));
+      break;
 #endif
 
     case CODE_LABEL:
Index: gcc/genpreds.c
===================================================================
--- gcc/genpreds.c	(revision 191978)
+++ gcc/genpreds.c	(working copy)
@@ -613,7 +613,7 @@  write_one_predicate_function (struct pre
   add_mode_tests (p);
 
   /* A normal predicate can legitimately not look at enum machine_mode
-     if it accepts only CONST_INTs and/or CONST_DOUBLEs.  */
+     if it accepts only CONST_INTs, CONST_WIDE_INTs and/or CONST_DOUBLEs.  */
   printf ("int\n%s (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)\n{\n",
 	  p->name);
   write_predicate_stmts (p->exp);
@@ -810,8 +810,11 @@  add_constraint (const char *name, const
   if (is_const_int || is_const_dbl)
     {
       enum rtx_code appropriate_code
+#if TARGET_SUPPORTS_WIDE_INT
+	= is_const_int ? CONST_INT : CONST_WIDE_INT;
+#else
 	= is_const_int ? CONST_INT : CONST_DOUBLE;
-
+#endif
       /* Consider relaxing this requirement in the future.  */
       if (regclass
 	  || GET_CODE (exp) != AND
@@ -1075,12 +1078,17 @@  write_tm_constrs_h (void)
 	if (needs_ival)
 	  puts ("  if (CONST_INT_P (op))\n"
 		"    ival = INTVAL (op);");
+#if TARGET_SUPPORTS_WIDE_INT
+	if (needs_lval || needs_hval)
+	  error ("you can't use lval or hval");
+#else
 	if (needs_hval)
 	  puts ("  if (GET_CODE (op) == CONST_DOUBLE && mode == VOIDmode)"
 		"    hval = CONST_DOUBLE_HIGH (op);");
 	if (needs_lval)
 	  puts ("  if (GET_CODE (op) == CONST_DOUBLE && mode == VOIDmode)"
 		"    lval = CONST_DOUBLE_LOW (op);");
+#endif
 	if (needs_rval)
 	  puts ("  if (GET_CODE (op) == CONST_DOUBLE && mode != VOIDmode)"
 		"    rval = CONST_DOUBLE_REAL_VALUE (op);");
Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c	(revision 191978)
+++ gcc/tree-ssa-address.c	(working copy)
@@ -192,15 +192,16 @@  addr_for_mem_ref (struct mem_address *ad
   struct mem_addr_template *templ;
 
   if (addr->step && !integer_onep (addr->step))
-    st = immed_double_int_const (tree_to_double_int (addr->step), pointer_mode);
+    st = immed_wide_int_const (wide_int::from_int_cst (addr->step));
   else
     st = NULL_RTX;
 
   if (addr->offset && !integer_zerop (addr->offset))
-    off = immed_double_int_const
-	    (tree_to_double_int (addr->offset)
-	     .sext (TYPE_PRECISION (TREE_TYPE (addr->offset))),
-	     pointer_mode);
+    {
+      wide_int dc = wide_int::from_int_cst (addr->offset);
+      dc = dc.sext (TYPE_PRECISION (TREE_TYPE (addr->offset)));
+      off = immed_wide_int_const (dc);
+    }
   else
     off = NULL_RTX;
 
Index: gcc/ggc-zone.c
===================================================================
--- gcc/ggc-zone.c	(revision 191978)
+++ gcc/ggc-zone.c	(working copy)
@@ -1373,6 +1373,9 @@  ggc_alloc_typed_stat (enum gt_types_enum
     case gt_ggc_e_9rtvec_def:
       return ggc_internal_alloc_zone_pass_stat (size, &rtl_zone);
 
+    case gt_ggc_e_10hwivec_def:
+      return ggc_internal_alloc_zone_pass_stat (size, &rtl_zone);
+
     default:
       return ggc_internal_alloc_zone_pass_stat (size, &main_zone);
     }
Index: gcc/final.c
===================================================================
--- gcc/final.c	(revision 191978)
+++ gcc/final.c	(working copy)
@@ -3728,8 +3728,16 @@  output_addr_const (FILE *file, rtx x)
       output_addr_const (file, XEXP (x, 0));
       break;
 
+    case CONST_WIDE_INT:
+      /* This should be ok for a while.  */
+      gcc_assert (CONST_WIDE_INT_NUNITS (x) == 2);
+      fprintf (file, HOST_WIDE_INT_PRINT_DOUBLE_HEX,
+	       (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, 1),
+	       (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, 0));
+      break;
+
     case CONST_DOUBLE:
-      if (GET_MODE (x) == VOIDmode)
+      if (CONST_DOUBLE_AS_INT_P (x))
 	{
 	  /* We can use %d if the number is one word and positive.  */
 	  if (CONST_DOUBLE_HIGH (x))
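
The NUNITS == 2 assertion above is a stopgap.  A length-independent
version would print from the most-significant element down; roughly
(hypothetical, not in this patch, and assuming the padded-hex format
macro from hwint.h):

  int i;
  fprintf (file, "0x");
  for (i = CONST_WIDE_INT_NUNITS (x) - 1; i >= 0; i--)
    fprintf (file, HOST_WIDE_INT_PRINT_PADDED_HEX,
	     (unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (x, i));

This pads every element to a full word of hex digits, including the
leading one, which is why the two-element case above prints more
compactly.
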
Index: gcc/coretypes.h
===================================================================
--- gcc/coretypes.h	(revision 191978)
+++ gcc/coretypes.h	(working copy)
@@ -56,6 +56,9 @@  typedef const struct rtx_def *const_rtx;
 struct rtvec_def;
 typedef struct rtvec_def *rtvec;
 typedef const struct rtvec_def *const_rtvec;
+struct hwivec_def;
+typedef struct hwivec_def *hwivec;
+typedef const struct hwivec_def *const_hwivec;
 union tree_node;
 typedef union tree_node *tree;
 union gimple_statement_d;
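
For reference, the shape of hwivec_def (its definition is in the rtl.h
part of the patch, not shown in these hunks) can be inferred from the
HWI_PUT_NUM_ELEM call in emit-rtl.c and the "(NELT) - 1" size
arithmetic in ggc.h below; roughly:

  struct hwivec_def {
    int num_elem;		/* Number of elements in use.  */
    HOST_WIDE_INT elem[1];	/* Over-allocated trailing array.  */
  };
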
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	(revision 191978)
+++ gcc/expr.c	(working copy)
@@ -719,23 +719,23 @@  convert_modes (enum machine_mode mode, e
   if (mode == oldmode)
     return x;
 
-  /* There is one case that we must handle specially: If we are converting
-     a CONST_INT into a mode whose size is twice HOST_BITS_PER_WIDE_INT and
-     we are to interpret the constant as unsigned, gen_lowpart will do
-     the wrong if the constant appears negative.  What we want to do is
-     make the high-order word of the constant zero, not all ones.  */
+  /* There is one case that we must handle specially: If we are
+     converting a CONST_INT into a mode whose size is larger than
+     HOST_BITS_PER_WIDE_INT and we are to interpret the constant as
+     unsigned, gen_lowpart will do the wrong thing if the constant
+     appears negative.  What we want to do is make the high-order bits
+     of the constant zero, not all ones.  */
 
   if (unsignedp && GET_MODE_CLASS (mode) == MODE_INT
-      && GET_MODE_BITSIZE (mode) == HOST_BITS_PER_DOUBLE_INT
+      && GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT
       && CONST_INT_P (x) && INTVAL (x) < 0)
     {
-      double_int val = double_int::from_uhwi (INTVAL (x));
-
+      HOST_WIDE_INT val = INTVAL (x);
       /* We need to zero extend VAL.  */
       if (oldmode != VOIDmode)
-	val = val.zext (GET_MODE_BITSIZE (oldmode));
+	val &= GET_MODE_PRECISION (oldmode) - 1;
 
-      return immed_double_int_const (val, mode);
+      return immed_wide_int_const (wide_int::from_uhwi (val, mode));
     }
 
   /* We can do this with a gen_lowpart if both desired and current modes
@@ -747,7 +747,11 @@  convert_modes (enum machine_mode mode, e
        && GET_MODE_PRECISION (mode) <= HOST_BITS_PER_WIDE_INT)
       || (GET_MODE_CLASS (mode) == MODE_INT
 	  && GET_MODE_CLASS (oldmode) == MODE_INT
-	  && (CONST_DOUBLE_AS_INT_P (x) 
+#if TARGET_SUPPORTS_WIDE_INT
+	  && (CONST_WIDE_INT_P (x)
+#else
+	  && (CONST_DOUBLE_AS_INT_P (x)
+#endif
 	      || (GET_MODE_PRECISION (mode) <= GET_MODE_PRECISION (oldmode)
 		  && ((MEM_P (x) && ! MEM_VOLATILE_P (x)
 		       && direct_load[(int) mode])
@@ -1752,6 +1756,7 @@  emit_group_load_1 (rtx *tmps, rtx dst, r
 	    {
 	      rtx first, second;
 
+	      /* TODO: const_wide_int can have sizes other than this...  */
 	      gcc_assert (2 * len == ssize);
 	      split_double (src, &first, &second);
 	      if (i)
@@ -5180,10 +5185,10 @@  store_expr (tree exp, rtx target, int ca
 			       &alt_rtl);
     }
 
-  /* If TEMP is a VOIDmode constant and the mode of the type of EXP is not
-     the same as that of TARGET, adjust the constant.  This is needed, for
-     example, in case it is a CONST_DOUBLE and we want only a word-sized
-     value.  */
+  /* If TEMP is a VOIDmode constant and the mode of the type of EXP is
+     not the same as that of TARGET, adjust the constant.  This is
+     needed, for example, in case it is a CONST_DOUBLE or
+     CONST_WIDE_INT and we want only a word-sized value.  */
   if (CONSTANT_P (temp) && GET_MODE (temp) == VOIDmode
       && TREE_CODE (exp) != ERROR_MARK
       && GET_MODE (target) != TYPE_MODE (TREE_TYPE (exp)))
@@ -7711,11 +7716,12 @@  expand_constructor (tree exp, rtx target
 
   /* All elts simple constants => refer to a constant in memory.  But
      if this is a non-BLKmode mode, let it store a field at a time
-     since that should make a CONST_INT or CONST_DOUBLE when we
-     fold.  Likewise, if we have a target we can use, it is best to
-     store directly into the target unless the type is large enough
-     that memcpy will be used.  If we are making an initializer and
-     all operands are constant, put it in memory as well.
+     since that should make a CONST_INT, CONST_WIDE_INT or
+     CONST_DOUBLE when we fold.  Likewise, if we have a target we can
+     use, it is best to store directly into the target unless the type
+     is large enough that memcpy will be used.  If we are making an
+     initializer and all operands are constant, put it in memory as
+     well.
 
      FIXME: Avoid trying to fill vector constructors piece-meal.
      Output them with output_constant_def below unless we're sure
@@ -8207,17 +8213,18 @@  expand_expr_real_2 (sepops ops, rtx targ
 	      && TREE_CONSTANT (treeop1))
 	    {
 	      rtx constant_part;
+	      HOST_WIDE_INT wc;
+	      enum machine_mode wmode = TYPE_MODE (TREE_TYPE (treeop1));
 
 	      op1 = expand_expr (treeop1, subtarget, VOIDmode,
 				 EXPAND_SUM);
-	      /* Use immed_double_const to ensure that the constant is
+	      /* Use wide_int::from_shwi to ensure that the constant is
 		 truncated according to the mode of OP1, then sign extended
 		 to a HOST_WIDE_INT.  Using the constant directly can result
 		 in non-canonical RTL in a 64x32 cross compile.  */
-	      constant_part
-		= immed_double_const (TREE_INT_CST_LOW (treeop0),
-				      (HOST_WIDE_INT) 0,
-				      TYPE_MODE (TREE_TYPE (treeop1)));
+	      wc = TREE_INT_CST_LOW (treeop0);
+	      constant_part 
+		= immed_wide_int_const (wide_int::from_shwi (wc, wmode));
 	      op1 = plus_constant (mode, op1, INTVAL (constant_part));
 	      if (modifier != EXPAND_SUM && modifier != EXPAND_INITIALIZER)
 		op1 = force_operand (op1, target);
@@ -8229,7 +8236,8 @@  expand_expr_real_2 (sepops ops, rtx targ
 		   && TREE_CONSTANT (treeop0))
 	    {
 	      rtx constant_part;
-
+	      HOST_WIDE_INT wc;
+	      enum machine_mode wmode = TYPE_MODE (TREE_TYPE (treeop0));
 	      op0 = expand_expr (treeop0, subtarget, VOIDmode,
 				 (modifier == EXPAND_INITIALIZER
 				 ? EXPAND_INITIALIZER : EXPAND_SUM));
@@ -8243,14 +8251,13 @@  expand_expr_real_2 (sepops ops, rtx targ
 		    return simplify_gen_binary (PLUS, mode, op0, op1);
 		  goto binop2;
 		}
-	      /* Use immed_double_const to ensure that the constant is
+	      /* Use wide_int::from_shwi to ensure that the constant is
 		 truncated according to the mode of OP1, then sign extended
 		 to a HOST_WIDE_INT.  Using the constant directly can result
 		 in non-canonical RTL in a 64x32 cross compile.  */
-	      constant_part
-		= immed_double_const (TREE_INT_CST_LOW (treeop1),
-				      (HOST_WIDE_INT) 0,
-				      TYPE_MODE (TREE_TYPE (treeop0)));
+	      wc = TREE_INT_CST_LOW (treeop1);
+	      constant_part 
+		= immed_wide_int_const (wide_int::from_shwi (wc, wmode));
 	      op0 = plus_constant (mode, op0, INTVAL (constant_part));
 	      if (modifier != EXPAND_SUM && modifier != EXPAND_INITIALIZER)
 		op0 = force_operand (op0, target);
@@ -8752,10 +8759,13 @@  expand_expr_real_2 (sepops ops, rtx targ
 	 for unsigned bitfield expand this as XOR with a proper constant
 	 instead.  */
       if (reduce_bit_field && TYPE_UNSIGNED (type))
-	temp = expand_binop (mode, xor_optab, op0,
-			     immed_double_int_const
-			       (double_int::mask (TYPE_PRECISION (type)), mode),
-			     target, 1, OPTAB_LIB_WIDEN);
+	{
+	  wide_int mask = wide_int::mask (TYPE_PRECISION (type), false, mode);
+
+	  temp = expand_binop (mode, xor_optab, op0,
+			       immed_wide_int_const (mask),
+			       target, 1, OPTAB_LIB_WIDEN);
+	}
       else
 	temp = expand_unop (mode, one_cmpl_optab, op0, target, 1);
       gcc_assert (temp);
@@ -9335,9 +9345,7 @@  expand_expr_real_1 (tree exp, rtx target
       return decl_rtl;
 
     case INTEGER_CST:
-      temp = immed_double_const (TREE_INT_CST_LOW (exp),
-				 TREE_INT_CST_HIGH (exp), mode);
-
+      temp = immed_wide_int_const (wide_int::from_int_cst (exp));
       return temp;
 
     case VECTOR_CST:
@@ -9568,8 +9576,9 @@  expand_expr_real_1 (tree exp, rtx target
 	op0 = memory_address_addr_space (address_mode, op0, as);
 	if (!integer_zerop (TREE_OPERAND (exp, 1)))
 	  {
-	    rtx off
-	      = immed_double_int_const (mem_ref_offset (exp), address_mode);
+	    wide_int wi = wide_int::from_double_int
+	      (address_mode, mem_ref_offset (exp));
+	    rtx off = immed_wide_int_const (wi);
 	    op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
 	  }
 	op0 = memory_address_addr_space (mode, op0, as);
@@ -10441,8 +10450,8 @@  reduce_to_bit_field_precision (rtx exp,
     }
   else if (TYPE_UNSIGNED (type))
     {
-      rtx mask = immed_double_int_const (double_int::mask (prec),
-					 GET_MODE (exp));
+      rtx mask = immed_wide_int_const 
+	(wide_int::mask (prec, false, GET_MODE (exp)));
       return expand_and (GET_MODE (exp), exp, mask, target);
     }
   else
@@ -11007,8 +11016,8 @@  const_vector_from_tree (tree exp)
 	RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
 							 inner);
       else
-	RTVEC_ELT (v, i) = immed_double_int_const (tree_to_double_int (elt),
-						   inner);
+	RTVEC_ELT (v, i) 
+	  = immed_wide_int_const (wide_int::from_int_cst (elt));
     }
 
   return gen_rtx_CONST_VECTOR (mode, v);
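
As a concrete instance of the convert_modes case fixed above (a
sketch, not part of the patch): converting (const_int -1) from QImode
to unsigned TImode must produce 255, not all ones.

  HOST_WIDE_INT val = -1;		/* INTVAL of (const_int -1).  */
  val &= GET_MODE_MASK (QImode);	/* Zero extend: 0xff.  */
  rtx r = immed_wide_int_const (wide_int::from_uhwi (val, TImode));
  /* r is (const_int 255); the gen_lowpart path could sign extend
     and hand back all ones.  */
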
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(revision 191978)
+++ gcc/optabs.c	(working copy)
@@ -838,7 +838,8 @@  expand_subword_shift (enum machine_mode
   if (CONSTANT_P (op1) || shift_mask >= BITS_PER_WORD)
     {
       carries = outof_input;
-      tmp = immed_double_const (BITS_PER_WORD, 0, op1_mode);
+      tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD,
+						       op1_mode));
       tmp = simplify_expand_binop (op1_mode, sub_optab, tmp, op1,
 				   0, true, methods);
     }
@@ -853,13 +854,14 @@  expand_subword_shift (enum machine_mode
 			      outof_input, const1_rtx, 0, unsignedp, methods);
       if (shift_mask == BITS_PER_WORD - 1)
 	{
-	  tmp = immed_double_const (-1, -1, op1_mode);
+	  tmp = immed_wide_int_const (wide_int_minus_one (op1_mode));
 	  tmp = simplify_expand_binop (op1_mode, xor_optab, op1, tmp,
 				       0, true, methods);
 	}
       else
 	{
-	  tmp = immed_double_const (BITS_PER_WORD - 1, 0, op1_mode);
+	  tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD - 1,
+							   op1_mode));
 	  tmp = simplify_expand_binop (op1_mode, sub_optab, tmp, op1,
 				       0, true, methods);
 	}
@@ -1022,7 +1024,7 @@  expand_doubleword_shift (enum machine_mo
      is true when the effective shift value is less than BITS_PER_WORD.
      Set SUPERWORD_OP1 to the shift count that should be used to shift
      OUTOF_INPUT into INTO_TARGET when the condition is false.  */
-  tmp = immed_double_const (BITS_PER_WORD, 0, op1_mode);
+  tmp = immed_wide_int_const (wide_int::from_shwi (BITS_PER_WORD, op1_mode));
   if (!CONSTANT_P (op1) && shift_mask == BITS_PER_WORD - 1)
     {
       /* Set CMP1 to OP1 & BITS_PER_WORD.  The result is zero iff OP1
@@ -2872,7 +2874,7 @@  expand_absneg_bit (enum rtx_code code, e
   const struct real_format *fmt;
   int bitpos, word, nwords, i;
   enum machine_mode imode;
-  double_int mask;
+  wide_int mask;
   rtx temp, insns;
 
   /* The format has to have a simple sign bit.  */
@@ -2908,7 +2910,7 @@  expand_absneg_bit (enum rtx_code code, e
       nwords = (GET_MODE_BITSIZE (mode) + BITS_PER_WORD - 1) / BITS_PER_WORD;
     }
 
-  mask = double_int_zero.set_bit (bitpos);
+  mask = wide_int::set_bit_in_zero (bitpos, imode);
   if (code == ABS)
     mask = ~mask;
 
@@ -2930,7 +2932,7 @@  expand_absneg_bit (enum rtx_code code, e
 	    {
 	      temp = expand_binop (imode, code == ABS ? and_optab : xor_optab,
 				   op0_piece,
-				   immed_double_int_const (mask, imode),
+				   immed_wide_int_const (mask),
 				   targ_piece, 1, OPTAB_LIB_WIDEN);
 	      if (temp != targ_piece)
 		emit_move_insn (targ_piece, temp);
@@ -2948,7 +2950,7 @@  expand_absneg_bit (enum rtx_code code, e
     {
       temp = expand_binop (imode, code == ABS ? and_optab : xor_optab,
 			   gen_lowpart (imode, op0),
-			   immed_double_int_const (mask, imode),
+			   immed_wide_int_const (mask),
 		           gen_lowpart (imode, target), 1, OPTAB_LIB_WIDEN);
       target = lowpart_subreg_maybe_copy (mode, temp, imode);
 
@@ -3547,7 +3549,7 @@  expand_copysign_absneg (enum machine_mod
     }
   else
     {
-      double_int mask;
+      wide_int mask;
 
       if (GET_MODE_SIZE (mode) <= UNITS_PER_WORD)
 	{
@@ -3569,10 +3571,9 @@  expand_copysign_absneg (enum machine_mod
 	  op1 = operand_subword_force (op1, word, mode);
 	}
 
-      mask = double_int_zero.set_bit (bitpos);
-
+      mask = wide_int::set_bit_in_zero (bitpos, imode);
       sign = expand_binop (imode, and_optab, op1,
-			   immed_double_int_const (mask, imode),
+			   immed_wide_int_const (mask),
 			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
     }
 
@@ -3616,7 +3617,7 @@  expand_copysign_bit (enum machine_mode m
 		     int bitpos, bool op0_is_abs)
 {
   enum machine_mode imode;
-  double_int mask;
+  wide_int mask, nmask;
   int word, nwords, i;
   rtx temp, insns;
 
@@ -3640,7 +3641,7 @@  expand_copysign_bit (enum machine_mode m
       nwords = (GET_MODE_BITSIZE (mode) + BITS_PER_WORD - 1) / BITS_PER_WORD;
     }
 
-  mask = double_int_zero.set_bit (bitpos);
+  mask = wide_int::set_bit_in_zero (bitpos, imode);
 
   if (target == 0
       || target == op0
@@ -3660,14 +3661,16 @@  expand_copysign_bit (enum machine_mode m
 	  if (i == word)
 	    {
 	      if (!op0_is_abs)
-		op0_piece
-		  = expand_binop (imode, and_optab, op0_piece,
-				  immed_double_int_const (~mask, imode),
-				  NULL_RTX, 1, OPTAB_LIB_WIDEN);
-
+		{
+		  nmask = ~mask;
+  		  op0_piece
+		    = expand_binop (imode, and_optab, op0_piece,
+				    immed_wide_int_const (nmask),
+				    NULL_RTX, 1, OPTAB_LIB_WIDEN);
+		}
 	      op1 = expand_binop (imode, and_optab,
 				  operand_subword_force (op1, i, mode),
-				  immed_double_int_const (mask, imode),
+				  immed_wide_int_const (mask),
 				  NULL_RTX, 1, OPTAB_LIB_WIDEN);
 
 	      temp = expand_binop (imode, ior_optab, op0_piece, op1,
@@ -3687,15 +3690,17 @@  expand_copysign_bit (enum machine_mode m
   else
     {
       op1 = expand_binop (imode, and_optab, gen_lowpart (imode, op1),
-		          immed_double_int_const (mask, imode),
+		          immed_wide_int_const (mask),
 		          NULL_RTX, 1, OPTAB_LIB_WIDEN);
 
       op0 = gen_lowpart (imode, op0);
       if (!op0_is_abs)
-	op0 = expand_binop (imode, and_optab, op0,
-			    immed_double_int_const (~mask, imode),
-			    NULL_RTX, 1, OPTAB_LIB_WIDEN);
-
+	{
+	  nmask = ~mask;
+	  op0 = expand_binop (imode, and_optab, op0,
+			      immed_wide_int_const (nmask),
+			      NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	}
       temp = expand_binop (imode, ior_optab, op0, op1,
 			   gen_lowpart (imode, target), 1, OPTAB_LIB_WIDEN);
       target = lowpart_subreg_maybe_copy (mode, temp, imode);
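
The mask construction in the expand_absneg_bit and expand_copysign
hunks deserves one concrete illustration (a sketch, not part of the
patch): for DFmode handled through DImode with the sign in bit 63,

  wide_int mask = wide_int::set_bit_in_zero (63, DImode);
  /* mask  = 0x8000000000000000, the XOR/IOR operand for NEG/copysign.  */
  wide_int nmask = ~mask;
  /* nmask = 0x7fffffffffffffff, the AND operand for ABS.  */
  rtx sign_rtx = immed_wide_int_const (mask);
  rtx abs_rtx = immed_wide_int_const (nmask);
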
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	(revision 191978)
+++ gcc/cfgexpand.c	(working copy)
@@ -3633,9 +3633,8 @@  expand_debug_locations (void)
 
 	    gcc_assert (mode == GET_MODE (val)
 			|| (GET_MODE (val) == VOIDmode
-			    && (CONST_INT_P (val)
+			    && (CONST_SCALAR_INT_P (val)
 				|| GET_CODE (val) == CONST_FIXED
-				|| CONST_DOUBLE_AS_INT_P (val) 
 				|| GET_CODE (val) == LABEL_REF)));
 	  }
 
Index: gcc/ggc.h
===================================================================
--- gcc/ggc.h	(revision 191978)
+++ gcc/ggc.h	(working copy)
@@ -271,6 +271,11 @@  extern struct alloc_zone tree_id_zone;
 			    + ((NELT) - 1) * sizeof (rtx),		\
 			    &rtl_zone)
 
+#define ggc_alloc_hwivec_sized(NELT)                                      \
+  ggc_alloc_zone_hwivec_def (sizeof (struct hwivec_def)			\
+			    + ((NELT) - 1) * sizeof (HOST_WIDE_INT),	\
+			    &rtl_zone)
+
 #if defined (GGC_ZONE) && !defined (GENERATOR_FILE)
 
 /* Allocate an object into the specified allocation zone.  */
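
A small usage sketch for the new macro (hypothetical, assuming a
64-bit host): make room for a four-element vector, i.e. a 256-bit
constant.  The "(NELT) - 1" term accounts for the one element already
declared inside struct hwivec_def.

  hwivec hv = ggc_alloc_hwivec_sized (4);
  HWI_PUT_NUM_ELEM (hv, 4);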