Patchwork : Uncouple size_t and sizetype

Submitter Tristan Gingold
Date March 16, 2012, 10:39 a.m.
Message ID <F686AEA8-F10A-4C2B-B97E-556A74231D90@adacore.com>
Permalink /patch/147181/
State New

Comments

Tristan Gingold - March 16, 2012, 10:39 a.m.
Hi,

currently sizetype precision (cf. stor-layout.c:initialize_sizetypes) is the same as that of size_t.
This is an issue on VMS, where size_t is 'unsigned int', but we'd like to have a 64-bit sizetype
for Ada.  My understanding is that ISO C doesn't require the precision of size_t to match that of
void *.

We can't really lie about size_t because it is exposed in APIs (such as writev).

I don't see any reason (other than historical ones) to have an exact match between sizetype and size_t.
So this patch adds a hook to allow targets to define sizetype.

I initially thought about using the Pmode precision for sizetype, but there are a few machines
(m32c, sh, h8300) where the precisions aren't the same.  I don't know whether this is on purpose or
unintentional.

Manually tested on ia64 and alpha vms.
Not yet regression tested on a more common machine.

Comments are welcome,
Tristan.

2012-03-16  Tristan Gingold  <gingold@adacore.com>

	* target.def (sizetype_cdecl): New hook.
	* stor-layout.c (initialize_sizetypes): Use sizetype_cdecl hook
	to get sizetype name.
	* targhooks.c (default_sizetype_cdecl): New function.
	* targhooks.h (default_sizetype_cdecl): New prototype.
	* doc/tm.texi.in (Type Layout): Add TARGET_SIZETYPE_CDECL hook.
	* doc/tm.texi: Regenerate.
	* config/vms/vms.h (SIZE_TYPE): Always unsigned int.
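
For readers who do not have initialize_sizetypes in front of them, here is a
minimal standalone sketch of the lookup the patch changes: a hook returns a C
type name, and that name is mapped to a precision in bits.  This is not GCC
code.  The SIZE_TYPE value, vms64_sizetype_cdecl, precision_for_cdecl and
sizetype_precision are hypothetical stand-ins, and host sizeof values are
assumed in place of target macros such as INT_TYPE_SIZE.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the SIZE_TYPE target macro (the C size_t).  */
#define SIZE_TYPE "unsigned int"

/* Stand-in for the default hook: sizetype simply follows size_t.  */
const char *
default_sizetype_cdecl (void)
{
  return SIZE_TYPE;
}

/* Hypothetical target override: size_t stays 'unsigned int', but the
   internal sizetype is asked to be wider.  */
const char *
vms64_sizetype_cdecl (void)
{
  return "long long unsigned int";
}

/* Map a C type name to its precision in bits.  Host sizes are used here
   in place of INT_TYPE_SIZE and friends.  Returns -1 for an unknown name
   (the function in the patch calls gcc_unreachable instead).  */
int
precision_for_cdecl (const char *name)
{
  assert (name != NULL);
  if (strcmp (name, "unsigned int") == 0)
    return (int) (sizeof (unsigned int) * CHAR_BIT);
  if (strcmp (name, "long unsigned int") == 0)
    return (int) (sizeof (unsigned long) * CHAR_BIT);
  if (strcmp (name, "long long unsigned int") == 0)
    return (int) (sizeof (unsigned long long) * CHAR_BIT);
  if (strcmp (name, "short unsigned int") == 0)
    return (int) (sizeof (unsigned short) * CHAR_BIT);
  return -1;
}

/* Precision of sizetype as selected by HOOK.  */
int
sizetype_precision (const char *(*hook) (void))
{
  return precision_for_cdecl (hook ());
}
```

With the default hook the result tracks size_t; with the override the two are
decoupled, which is the whole point of the change.
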
Richard Guenther - March 16, 2012, 11:02 a.m.
On Fri, Mar 16, 2012 at 11:39 AM, Tristan Gingold <gingold@adacore.com> wrote:
> Hi,
>
> currently sizetype precision (cf store-layout.c:initialize_sizetypes) is the same as size_t.
> This is an issue on VMS, where size_t is 'unsigned int', but we'd like to have a 64 bit sizetype
> for Ada.  My understanding is that ISO-C doesn't require size_t precision to match the one of
> void *.
>
> We can't really lie about size_t because it is exposed in API (such as writev).
>
> I don't see any reason (other than historic one) to have an exact match between sizetype and size_t.
> So this patch adds an hook to allow targets to define sizetype.

Well, there is at least "common sense" that couples size_t and sizetype.
As you can at most allocate size_t memory via malloc (due to its size_t
use for the size) sizes larger than what fits into size_t do not make much
sense.  Thus, a sizetype larger than size_t does not make much sense.
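
The argument above amounts to a range check: a size can only be passed to
malloc if it is representable in size_t.  A minimal sketch, assuming a
hypothetical helper fits_in_precision that takes the precision as a parameter
so that a 32-bit size_t can be modelled on any host:

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if VALUE is representable in an unsigned type of BITS bits
   (1 <= BITS <= 64), 0 otherwise.  */
int
fits_in_precision (uint64_t value, unsigned int bits)
{
  assert (bits > 0 && bits <= 64);
  if (bits == 64)
    return 1;                     /* every uint64_t value fits */
  return (value >> bits) == 0;    /* no bits set above the precision */
}
```

A 64-bit sizetype can describe 0x100000000 bytes, but a 32-bit size_t cannot
carry that value into malloc.
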

The middle-end of course would not care much what you use for sizetype.
But be warned - if the mode for sizetype is different from ptr_mode things
are going to be interesting for you (yes, ptr_mode, not Pmode).

> I initially thought about using Pmode precision for sizetype precision, but there are a few machines
> (m32c, sh, h8300) where the precisions aren't the same.  I don't know wether this is on purpose or
> unintentional.

At least for m32c it is IIRC because 24-bit computations are so expensive
on that target, so HImode is chosen for sizetype.

So - why do you need a 64bit sizetype again? ;)

Can it be that you don't really need 64bit sizes but you hit issues with
sizetype != ptr_mode size?

Btw, while we are transitioning to target hooks in this case I'd prefer
a target macro alongside the existing SIZE_TYPE, etc. ones.

Richard.

Tristan Gingold - March 16, 2012, 11:33 a.m.
On Mar 16, 2012, at 12:02 PM, Richard Guenther wrote:

> On Fri, Mar 16, 2012 at 11:39 AM, Tristan Gingold <gingold@adacore.com> wrote:
>> Hi,
>> 
>> currently sizetype precision (cf store-layout.c:initialize_sizetypes) is the same as size_t.
>> This is an issue on VMS, where size_t is 'unsigned int', but we'd like to have a 64 bit sizetype
>> for Ada.  My understanding is that ISO-C doesn't require size_t precision to match the one of
>> void *.
>> 
>> We can't really lie about size_t because it is exposed in API (such as writev).
>> 
>> I don't see any reason (other than historic one) to have an exact match between sizetype and size_t.
>> So this patch adds an hook to allow targets to define sizetype.
> 
> Well, there is at least "common sense" that couples size_t and sizetype.
> As you can at most allocate size_t memory via malloc (due to its size_t
> use for the size) sizes larger than what fits into size_t do not make much
> sense.  Thus, a sizetype larger than size_t does not make much sense.

Agreed, but malloc() is not the only way to get memory.  At least on VMS, there are
some syscalls to allocate memory with a 64 bit length argument.

> The middle-end of course would not care much what you use for sizetype.
> But be warned - if the mode for sizetype is different of ptr_mode things
> are going to be interesting for you (yes, ptr_mode, not Pmode).

That's the issue.  POINTER_SIZE is 64 bits (when -mpointer-size=64) but size_t should always be 32 bit.

>> I initially thought about using Pmode precision for sizetype precision, but there are a few machines
>> (m32c, sh, h8300) where the precisions aren't the same.  I don't know wether this is on purpose or
>> unintentional.
> 
> At least for m32c it is IIRC because 24bit computations are soo expensive
> on that target, so HImode is chosen for sizetype.

That's a good reason!

> So - why do you need a 64bit sizetype again? ;)
> 
> Can it be that you don't really need 64bit sizes but you hit issues with
> sizetype != ptr_mode size?

I don't have an urgent need for 64-bit sizes (although it would be nice to have them).

I remember that the first build with sizetype = 32 bits but ptr_mode = DImode was a failure.
Maybe I should first investigate this path, as m32c could use "unsigned int" (16 bits)
for size_type alongside 32 for POINTER_SIZE?

> Btw, while we are transitioning to target hooks in this case I'd prefer
> a target macro alongside the existing SIZE_TYPE, etc. ones.

Ok.

Tristan.
Richard Guenther - March 16, 2012, 11:38 a.m.
On Fri, Mar 16, 2012 at 12:33 PM, Tristan Gingold <gingold@adacore.com> wrote:
>
> On Mar 16, 2012, at 12:02 PM, Richard Guenther wrote:
>
>> On Fri, Mar 16, 2012 at 11:39 AM, Tristan Gingold <gingold@adacore.com> wrote:
>>> Hi,
>>>
>>> currently sizetype precision (cf store-layout.c:initialize_sizetypes) is the same as size_t.
>>> This is an issue on VMS, where size_t is 'unsigned int', but we'd like to have a 64 bit sizetype
>>> for Ada.  My understanding is that ISO-C doesn't require size_t precision to match the one of
>>> void *.
>>>
>>> We can't really lie about size_t because it is exposed in API (such as writev).
>>>
>>> I don't see any reason (other than historic one) to have an exact match between sizetype and size_t.
>>> So this patch adds an hook to allow targets to define sizetype.
>>
>> Well, there is at least "common sense" that couples size_t and sizetype.
>> As you can at most allocate size_t memory via malloc (due to its size_t
>> use for the size) sizes larger than what fits into size_t do not make much
>> sense.  Thus, a sizetype larger than size_t does not make much sense.
>
> Agreed, but malloc() is not the only way to get memory.  At least on VMS, there are
> some syscalls to allocate memory with a 64 bit length argument.
>
>> The middle-end of course would not care much what you use for sizetype.
>> But be warned - if the mode for sizetype is different of ptr_mode things
>> are going to be interesting for you (yes, ptr_mode, not Pmode).
>
> That's the issue.  POINTER_SIZE is 64 bits (when -mpointer-size=64) but size_t should always be 32 bit.

Ok.

>>> I initially thought about using Pmode precision for sizetype precision, but there are a few machines
>>> (m32c, sh, h8300) where the precisions aren't the same.  I don't know wether this is on purpose or
>>> unintentional.
>>
>> At least for m32c it is IIRC because 24bit computations are soo expensive
>> on that target, so HImode is chosen for sizetype.
>
> That's a good reason!
>
>> So - why do you need a 64bit sizetype again? ;)
>>
>> Can it be that you don't really need 64bit sizes but you hit issues with
>> sizetype != ptr_mode size?
>
> I don't have an urgent need for 64bit sizes (although would be nice to have them).
>
> I remember that the first build with sizetype=32 but ptr_mode =DImode was a failure.
> Maybe I should first investigate this path, as m32c could use "unsigned int" (16 bits)
> for size_type alongside 32 for POINTER_SIZE ?

Well, this setup is not well supported by the middle-end (and indeed m32c
has existing issues with that).  So in your case decoupling sizetype from
size_t sounds like the more appropriate solution.

>> Btw, while we are transitioning to target hooks in this case I'd prefer
>> a target macro alongside the existing SIZE_TYPE, etc. ones.
>
> Ok.

I'd choose SIZETYPE (for confusion, heh), defaulting to SIZE_TYPE.
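
A minimal sketch of what such a defaulted macro could look like, written as a
standalone C file rather than as GCC code.  The SIZE_TYPE value and the
sizetype_name function are assumptions made only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical target setting for the C size_t.  */
#define SIZE_TYPE "unsigned int"

/* A target header could define SIZETYPE before this point, for example
   #define SIZETYPE "long long unsigned int"
   Targets that do not care get SIZE_TYPE.  */
#ifndef SIZETYPE
#define SIZETYPE SIZE_TYPE
#endif

/* Name of the type used for the internal sizetype.  */
const char *
sizetype_name (void)
{
  const char *name = SIZETYPE;
  assert (strlen (name) > 0);   /* must name a type */
  return name;
}
```
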

Richard.

> Tristan.
>
Tristan Gingold - March 16, 2012, 11:58 a.m.
On Mar 16, 2012, at 12:38 PM, Richard Guenther wrote:

> On Fri, Mar 16, 2012 at 12:33 PM, Tristan Gingold <gingold@adacore.com> wrote:
>> 
>> On Mar 16, 2012, at 12:02 PM, Richard Guenther wrote:
>> 
>>> On Fri, Mar 16, 2012 at 11:39 AM, Tristan Gingold <gingold@adacore.com> wrote:
>>>> Hi,
>>>> 
>>>> currently sizetype precision (cf store-layout.c:initialize_sizetypes) is the same as size_t.
>>>> This is an issue on VMS, where size_t is 'unsigned int', but we'd like to have a 64 bit sizetype
>>>> for Ada.  My understanding is that ISO-C doesn't require size_t precision to match the one of
>>>> void *.
>>>> 
>>>> We can't really lie about size_t because it is exposed in API (such as writev).
>>>> 
>>>> I don't see any reason (other than historic one) to have an exact match between sizetype and size_t.
>>>> So this patch adds an hook to allow targets to define sizetype.
>>> 
>>> Well, there is at least "common sense" that couples size_t and sizetype.
>>> As you can at most allocate size_t memory via malloc (due to its size_t
>>> use for the size) sizes larger than what fits into size_t do not make much
>>> sense.  Thus, a sizetype larger than size_t does not make much sense.
>> 
>> Agreed, but malloc() is not the only way to get memory.  At least on VMS, there are
>> some syscalls to allocate memory with a 64 bit length argument.
>> 
>>> The middle-end of course would not care much what you use for sizetype.
>>> But be warned - if the mode for sizetype is different of ptr_mode things
>>> are going to be interesting for you (yes, ptr_mode, not Pmode).
>> 
>> That's the issue.  POINTER_SIZE is 64 bits (when -mpointer-size=64) but size_t should always be 32 bit.
> 
> Ok.
> 
>>>> I initially thought about using Pmode precision for sizetype precision, but there are a few machines
>>>> (m32c, sh, h8300) where the precisions aren't the same.  I don't know wether this is on purpose or
>>>> unintentional.
>>> 
>>> At least for m32c it is IIRC because 24bit computations are soo expensive
>>> on that target, so HImode is chosen for sizetype.
>> 
>> That's a good reason!
>> 
>>> So - why do you need a 64bit sizetype again? ;)
>>> 
>>> Can it be that you don't really need 64bit sizes but you hit issues with
>>> sizetype != ptr_mode size?
>> 
>> I don't have an urgent need for 64bit sizes (although would be nice to have them).
>> 
>> I remember that the first build with sizetype=32 but ptr_mode =DImode was a failure.
>> Maybe I should first investigate this path, as m32c could use "unsigned int" (16 bits)
>> for size_type alongside 32 for POINTER_SIZE ?
> 
> Well, this setup is not well supported by the middle-end (and indeed m32c
> has existing issues with that).  So in your case decoupling sizetype from
> size_t sounds like the more appropriate solution.
> 
>>> Btw, while we are transitioning to target hooks in this case I'd prefer
>>> a target macro alongside the existing SIZE_TYPE, etc. ones.
>> 
>> Ok.
> 
> I'd choose SIZETYPE (for confusion, heh), defaulting to SIZE_TYPE.

Ok, thank you for your comments.

Tristan.
Eric Botcazou - March 19, 2012, 8:38 a.m.
> currently sizetype precision (cf store-layout.c:initialize_sizetypes) is
> the same as size_t. This is an issue on VMS, where size_t is 'unsigned
> int', but we'd like to have a 64 bit sizetype for Ada.  My understanding is
> that ISO-C doesn't require size_t precision to match the one of void *.

In fact this is very recent: up to (and including) GCC 4.6, each language could 
set its own sizetype (by means of set_sizetype).

> I initially thought about using Pmode precision for sizetype precision, but
> there are a few machines (m32c, sh, h8300) where the precisions aren't the
> same.  I don't know wether this is on purpose or unintentional.

That's what we used to do in Ada, see gnat_init:

  /* In Ada, we use the unsigned type corresponding to the width of Pmode as
     SIZETYPE.  In most cases when ptr_mode and Pmode differ, C will use the
     width of ptr_mode for SIZETYPE, but we get better code using the width
     of Pmode.  Note that, although we manipulate negative offsets for some
     internal constructs and rely on compile time overflow detection in size
     computations, using unsigned types for SIZETYPEs is fine since they are
     treated specially by the middle-end, in particular sign-extended.  */
  size_type_node = gnat_type_for_mode (Pmode, 1);
  set_sizetype (size_type_node);
  TYPE_NAME (sizetype) = get_identifier ("size_type");
Eric Botcazou - March 19, 2012, 8:46 a.m.
> The middle-end of course would not care much what you use for sizetype.
> But be warned - if the mode for sizetype is different of ptr_mode things
> are going to be interesting for you (yes, ptr_mode, not Pmode).

That worked well up to GCC 4.6 though, at least in Ada.  Of course using the
same setting in all languages would be even better than what we used to have.
Tristan Gingold - March 19, 2012, 8:59 a.m.
On Mar 19, 2012, at 9:46 AM, Eric Botcazou wrote:

>> The middle-end of course would not care much what you use for sizetype.
>> But be warned - if the mode for sizetype is different of ptr_mode things
>> are going to be interesting for you (yes, ptr_mode, not Pmode).
> 
> That worked well up to GCC 4.6 though, at least in Ada.  Of course using the 
> same setting in all languages would be even better than we used to have.

I am lost here.  Which targets (with ptr_mode size != Pmode size != sizetype size) are you referring to?

Tristan.
Eric Botcazou - March 19, 2012, 9:41 a.m.
> I am lost here.  Which targets (with ptr_mode size != Pmode size !=
> sizetype size) are you referring to ?

Targets for which sizetype mode isn't necessarily equal to ptr_mode like VMS.
Up to GCC 4.6, sizetype was Pmode in Ada, but ptr_mode in C.
Richard Guenther - March 19, 2012, 9:48 a.m.
On Mon, 19 Mar 2012, Eric Botcazou wrote:

> > I am lost here.  Which targets (with ptr_mode size != Pmode size !=
> > sizetype size) are you referring to ?
> 
> Targets for which sizetype mode isn't necessarily equal to ptr_mode like VMS.
> Up to GCC 4.6, sizetype was Pmode in Ada, but ptr_mode in C.

It does make sense to give the target control over the mode used for
sizetype.  Of course a global change of the default (for example to
use Pmode as Ada did) will require testing each affected target,
so I think it makes sense to keep the default as-is.

Btw, we still have the issue of which _precision_ we should use for
sizetype -- if we expect modulo semantics for arithmetic using it
(thus basically sign-less arithmetic) then the precision has to match
what the C frontend (and other frontends) assume about how pointer
offsets are handled.  Currently the C frontend does not get this right,
which means negative offsets will not be handled correctly.

Similar issues arise from the mode/precision chosen for the bitsize
types.  We choose a way too wide precision for them, so the
modulo-semantics assumption does not usually hold for bitsize
quantities.

Richard.
Tristan Gingold - March 19, 2012, 10:20 a.m.
On Mar 19, 2012, at 10:41 AM, Eric Botcazou wrote:

>> I am lost here.  Which targets (with ptr_mode size != Pmode size !=
>> sizetype size) are you referring to ?
> 
> Targets for which sizetype mode isn't necessarily equal to ptr_mode like VMS.

VMS was (in gcc < 4.8) configured with POINTER_SIZE = 64, Pmode = DImode and sizetype = unsigned long long int.

> Up to GCC 4.6, sizetype was Pmode in Ada, but ptr_mode in C.

Yes.

Tristan.
Eric Botcazou - March 19, 2012, 10:34 a.m.
> It does make sense to give the target control over the mode used for
> sizetype.  Of course a global change of the default (for example to
> use Pmode as Ada did) will require testing each affected target,
> so I think it makes sense to keep the default as-is.

No disagreement here.

> Btw, we still have the issue on which _precision_ we should use for
> sizetype -- if we expect modulo-semantics of arithmetic using it
> (thus basically sign-less arithmetic) then the precision has to match
> the expectation the C frontend (and other frontends) assume how pointer
> offsets are handled.  Currently the C frontend gets this not correct
> which means negative offsets will be not correctly handled.

Is this theoretical or practical?  Are you talking about GET_MODE_BITSIZE vs
GET_MODE_PRECISION wrt TYPE_PRECISION?

> Similar issues arise from the mode/precision chosen for the bitsize
> types.  We choose a way to wide precision for them, so the
> modulo-semantics assumption does not usually hold for bitsize
> quantities.

Again because of GET_MODE_PRECISION vs GET_MODE_BITSIZE?  Otherwise we round up 
the precision since GCC 4.5 so there should be no more weird precision.
Richard Guenther - March 19, 2012, 11:08 a.m.
On Mon, 19 Mar 2012, Eric Botcazou wrote:

> > It does make sense to give the target control over the mode used for
> > sizetype.  Of course a global change of the default (for example to
> > use Pmode as Ada did) will require testing each affected target,
> > so I think it makes sense to keep the default as-is.
> 
> No disagreement here.
> 
> > Btw, we still have the issue on which _precision_ we should use for
> > sizetype -- if we expect modulo-semantics of arithmetic using it
> > (thus basically sign-less arithmetic) then the precision has to match
> > the expectation the C frontend (and other frontends) assume how pointer
> > offsets are handled.  Currently the C frontend gets this not correct
> > which means negative offsets will be not correctly handled.
> 
> Is this theoritical or practical?  Are you talking about GET_MODE_BITSIZE vs 
> GET_MODE_PRECISION wrt TYPE_PRECISION?

No, about the disagreement of the precision of ptrdiff_t and that
of sizetype.  See c-common.c:pointer_int_sum:

  /* Convert the integer argument to a type the same size as sizetype
     so the multiply won't overflow spuriously.  */
  if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype)
      || TYPE_UNSIGNED (TREE_TYPE (intop)) != TYPE_UNSIGNED (sizetype))
    intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype),
                                             TYPE_UNSIGNED (sizetype)),
                     intop);

and consider what happens for example on m32c - we truncate the
24bit ptrdiff_t to the 16bit sizetype, losing bits.  And we are
performing the index * size multiplication in a maybe artificially
large type, losing information about overflow behavior and possibly
generating slow code for no good reason.

ISTR there was a correctness issue here, too, but maybe I've fixed
that already.
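
The truncation can be reproduced outside the compiler.  The following is a
minimal sketch, not GCC code: it models a 16-bit sizetype and 24-bit pointers
with fixed-width host types, and the function names offset_via_sizetype16 and
offset_via_ptr24 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset computed after converting the index to a 16-bit sizetype,
   with the multiplication also wrapping at 16 bits.  */
uint32_t
offset_via_sizetype16 (int32_t index, uint16_t elt_size)
{
  uint16_t narrow = (uint16_t) index;   /* drops bits 16 and above */
  assert (elt_size > 0);
  return (uint16_t) ((uint32_t) narrow * elt_size);
}

/* Byte offset computed at the 24-bit pointer precision instead.  */
uint32_t
offset_via_ptr24 (int32_t index, uint16_t elt_size)
{
  assert (elt_size > 0);
  return ((uint32_t) index * elt_size) & UINT32_C (0xffffff);
}
```

For an index of 0x12345 and a 2-byte element, the 24-bit computation yields
0x2468a while the 16-bit path yields 0x468a, so the upper bits of the index
are silently lost.
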

> > Similar issues arise from the mode/precision chosen for the bitsize
> > types.  We choose a way to wide precision for them, so the
> > modulo-semantics assumption does not usually hold for bitsize
> > quantities.
> 
> Again because of GET_MODE_PRECISION vs GET_MODE_BITSIZE?  Otherwise we round up 
> the precision since GCC 4.5 so there should be no more weird precision.

Well, because if sizetype is SImode (with -m32) and bitsizetype DImode
(we round up its precision to 64 bits) then a negative byte offset
in the unsigned sizetype is 0xffff for example.  When we then perform
arithmetic on bits, say (bitsizetype)sz * BITS_PER_UNIT + 9, we get
0xffff * 8 == 0x7fff8 (oops) + 9 == 0x80001.  bitsizetype is of too
large precision to be a modulo-arithmetic bit-equivalent to sizetype
(at least for our constant-folding code) for "negative" offsets.
Probably one of the reasons for the weird
sizetype-is-unsigned-but-constants-are-sign-extended rule.
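
The effect can be checked with plain fixed-width arithmetic.  This is a
minimal sketch under stated assumptions, not GCC code: sizetype is modelled as
uint32_t, bitsizetype as a 64-bit type, the full 32-bit pattern 0xffffffff is
used for the offset -1 (the message abbreviates it), and the function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend a 32-bit byte offset to 64 bits, then convert to bits.
   This is what treating the unsigned sizetype value at face value does.  */
uint64_t
bit_offset_zero_extended (uint32_t byte_off, unsigned int bits_per_unit,
                          unsigned int extra_bits)
{
  assert (bits_per_unit > 0);
  return (uint64_t) byte_off * bits_per_unit + extra_bits;
}

/* Interpret the same 32-bit pattern as a signed offset before widening.  */
int64_t
bit_offset_sign_extended (uint32_t byte_off, unsigned int bits_per_unit,
                          unsigned int extra_bits)
{
  int64_t off = (int64_t) byte_off;
  assert (bits_per_unit > 0);
  if (byte_off & UINT32_C (0x80000000))
    off -= (int64_t) 1 << 32;           /* manual sign extension */
  return off * (int64_t) bits_per_unit + (int64_t) extra_bits;
}
```

For a byte offset of -1, 8 bits per unit and 9 extra bits, the sign-extended
result is 1, while the zero-extended one is 0x800000001.  The two agree only
after truncating back to 32 bits, which is why the wider bitsizetype is not a
modulo-arithmetic equivalent of sizetype.
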

Richard.
Eric Botcazou - March 19, 2012, 11:29 a.m.
> No, about the disagreement of the precision of ptrdiff_t and that
> of sizetype.  See c-common.c:pointer_int_sum:
>
>   /* Convert the integer argument to a type the same size as sizetype
>      so the multiply won't overflow spuriously.  */
>   if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype)
>       || TYPE_UNSIGNED (TREE_TYPE (intop)) != TYPE_UNSIGNED (sizetype))
>     intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype),
>                                              TYPE_UNSIGNED (sizetype)),
>                      intop);
>
> and consider what happens for example on m32c - we truncate the
> 24bit ptrdiff_t to the 16bit sizetype, losing bits.  And we are
> performing the index * size multiplication in a maybe artificially
> large type, losing information about overflow behavior and possibly
> generating slow code for no good reason.

That seems to be again the POINTER_PLUS_EXPR issue, not sizetype per se.

> Well, because if sizetype is SImode (with -m32) and bitsizetype DImode
> (we round up its precision to 64bits) then a negative byte-offset
> in the unsigned sizetype is 0xffff for example.  When we then perform
> arithmetic on bits, say (bitsizetype)sz * BITS_PER_UNIT + 9 we get
> 0xffff * 8 == 0x80001 (oops) + 9 == 0x80001.  bitsizetype is of too
> large precision to be a modulo-arithmetic bit-equivalent to sizetype
> (at least for our constant-folding code) for "negative" offsets.

OK.  The definitive fix would be to use ssizetype for offsets and restrict 
sizetype to size calculations.  Changing the precision would be a kludge.
Richard Guenther - March 19, 2012, 11:35 a.m.
On Mon, 19 Mar 2012, Eric Botcazou wrote:

> > No, about the disagreement of the precision of ptrdiff_t and that
> > of sizetype.  See c-common.c:pointer_int_sum:
> >
> >   /* Convert the integer argument to a type the same size as sizetype
> >      so the multiply won't overflow spuriously.  */
> >   if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype)
> >       || TYPE_UNSIGNED (TREE_TYPE (intop)) != TYPE_UNSIGNED (sizetype))
> >     intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype),
> >                                              TYPE_UNSIGNED (sizetype)),
> >                      intop);
> >
> > and consider what happens for example on m32c - we truncate the
> > 24bit ptrdiff_t to the 16bit sizetype, losing bits.  And we are
> > performing the index * size multiplication in a maybe artificially
> > large type, losing information about overflow behavior and possibly
> > generating slow code for no good reason.
> 
> That seems to be again the POINTER_PLUS_EXPR issue, not sizetype per se.

Yes.

> > Well, because if sizetype is SImode (with -m32) and bitsizetype DImode
> > (we round up its precision to 64bits) then a negative byte-offset
> > in the unsigned sizetype is 0xffff for example.  When we then perform
> > arithmetic on bits, say (bitsizetype)sz * BITS_PER_UNIT + 9 we get
> > 0xffff * 8 == 0x80001 (oops) + 9 == 0x80001.  bitsizetype is of too
> > large precision to be a modulo-arithmetic bit-equivalent to sizetype
> > (at least for our constant-folding code) for "negative" offsets.
> 
> OK.  The definitive fix would be to use ssizetype for offsets and restrict 
> sizetype to size calculations.  Changing the precision would be a kludge.

Indeed.

Richard.

Patch

diff --git a/gcc/config/vms/vms.h b/gcc/config/vms/vms.h
index e11b1bf..dc44441 100644
--- a/gcc/config/vms/vms.h
+++ b/gcc/config/vms/vms.h
@@ -58,14 +58,12 @@  extern void vms_c_register_includes (const char *, const char *, int);
 #define POINTER_SIZE (flag_vms_pointer_size == VMS_POINTER_SIZE_NONE ? 32 : 64)
 #define POINTERS_EXTEND_UNSIGNED 0
 
-/* FIXME: It should always be a 32 bit type.  */
+/* Always 32 bits.  */
 #undef SIZE_TYPE
-#define SIZE_TYPE (flag_vms_pointer_size == VMS_POINTER_SIZE_NONE ? \
-		   "unsigned int" : "long long unsigned int")
-/* ???: Defined as a 'int' by dec-c, but obstack.h doesn't like it.  */
+#define SIZE_TYPE  "unsigned int"
 #undef PTRDIFF_TYPE
 #define PTRDIFF_TYPE (flag_vms_pointer_size == VMS_POINTER_SIZE_NONE ? \
                       "int" : "long long int")
 
 #define C_COMMON_OVERRIDE_OPTIONS vms_c_common_override_options ()
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 69f8aba..48d7b60 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1651,6 +1651,12 @@  for the result of subtracting two pointers.  The typedef name
 If you don't define this macro, the default is @code{"long int"}.
 @end defmac
 
+@deftypefn {Target Hook} {const char *} TARGET_SIZETYPE_CDECL (void)
+This hook should return the corresponding C declaration for the internal @code{sizetype} type, from which are also derived @code{bitsizetype} and the signed variant.
+
+If you don't define it, the default is @code{SIZE_TYPE}.
+@end deftypefn
+
 @defmac WCHAR_TYPE
 A C expression for a string describing the name of the data type to use
 for wide characters.  The typedef name @code{wchar_t} is defined using
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index c24cf1e..0028b76 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1639,6 +1639,8 @@  for the result of subtracting two pointers.  The typedef name
 If you don't define this macro, the default is @code{"long int"}.
 @end defmac
 
+@hook TARGET_SIZETYPE_CDECL
+
 @defmac WCHAR_TYPE
 A C expression for a string describing the name of the data type to use
 for wide characters.  The typedef name @code{wchar_t} is defined using
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 264edd7..d77abc2 100644
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 7c7fabc..5ed0f12 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2383,15 +2383,16 @@  void
 initialize_sizetypes (void)
 {
   int precision, bprecision;
+  const char *sizetype_name = targetm.sizetype_cdecl ();
 
   /* Get sizetypes precision from the SIZE_TYPE target macro.  */
-  if (strcmp (SIZE_TYPE, "unsigned int") == 0)
+  if (strcmp (sizetype_name, "unsigned int") == 0)
     precision = INT_TYPE_SIZE;
-  else if (strcmp (SIZE_TYPE, "long unsigned int") == 0)
+  else if (strcmp (sizetype_name, "long unsigned int") == 0)
     precision = LONG_TYPE_SIZE;
-  else if (strcmp (SIZE_TYPE, "long long unsigned int") == 0)
+  else if (strcmp (sizetype_name, "long long unsigned int") == 0)
     precision = LONG_LONG_TYPE_SIZE;
-  else if (strcmp (SIZE_TYPE, "short unsigned int") == 0)
+  else if (strcmp (sizetype_name, "short unsigned int") == 0)
     precision = SHORT_TYPE_SIZE;
   else
     gcc_unreachable ();
diff --git a/gcc/target.def b/gcc/target.def
index d658b11..bde3388 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2674,6 +2674,19 @@  DEFHOOKPOD
  @code{bool} @code{true}.",
  unsigned char, 1)
  
+/* The corresponding C declaration for the internal 'sizetype' type, from which
+   are also derived 'bitsizetype' and the signed variant.  The default is
+   SIZE_TYPE.  */
+DEFHOOK
+(sizetype_cdecl,
+ "This hook should return the corresponding C declaration for the internal \
+@code{sizetype} type, from which are also derived @code{bitsizetype} and the \
+signed variant.\n\
+\n\
+If you don't define it, the default is @code{SIZE_TYPE}.",
+ const char *, (void),
+ default_sizetype_cdecl)
+
 /* Leave the boolean fields at the end.  */
 
 /* True if we can create zeroed data by switching to a BSS section
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 8e3d74e..d490384 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1340,6 +1340,15 @@  default_get_reg_raw_mode(int regno)
   return reg_raw_mode[regno];
 }
 
+/* To be used by almost any targets, except when size_t precision is less than
+   pointers precision.  */
+
+const char *
+default_sizetype_cdecl (void)
+{
+  return SIZE_TYPE;
+}
+
 /* Return true if the state of option OPTION should be stored in PCH files
    and checked by default_pch_valid_p.  Store the option's current state
    in STATE if so.  */
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 8618115..41f44f8 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -47,6 +47,7 @@  extern unsigned HOST_WIDE_INT default_shift_truncation_mask
   (enum machine_mode);
 extern unsigned int default_min_divisions_for_recip_mul (enum machine_mode);
 extern int default_mode_rep_extended (enum machine_mode, enum machine_mode);
+extern const char *default_sizetype_cdecl (void);
 
 extern tree default_stack_protect_guard (void);
 extern tree default_external_stack_protect_fail (void);