Patchwork Emit DW_ATE_UTF for char16_t/char32_t

Submitter Jakub Jelinek
Date June 17, 2010, 7:58 a.m.
Message ID <20100617075812.GR7811@tyan-ft48-01.lab.bos.redhat.com>
Permalink /patch/55981/
State New

Comments

Jakub Jelinek - June 17, 2010, 7:58 a.m.
Hi!

The final DWARF4 version now has DW_ATE_UTF value, so this patch
makes sure it is used for char16_t and char32_t (only in C++ so far,
in C we don't have such a builtin type).

The name based discovery is perhaps ugly, but we don't have enough bits to
waste to add TYPE_UTF_FLAG (like TYPE_STRING_FLAG) and adding a langhook
for that is IMHO overkill.  But if you wish to go that way, it is possible
too.

2010-06-17  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (enum dwarf_type): Add DW_ATE_UTF.

	* dwarf2out.c (base_type_die): Use DW_ATE_UTF for
	C++ char16_t and char32_t.


	Jakub
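
[Editor's sketch] The name-based discovery described above can be modelled outside GCC roughly as follows. This is only an illustration under stated assumptions: struct mock_type, struct mock_options and mock_integer_encoding are hypothetical stand-ins for the tree accessors (TYPE_NAME, DECL_NAME, DECL_SOURCE_LOCATION) and the globals (dwarf_version, dwarf_strict, is_cxx ()) that the real base_type_die consults. The fallback to plain signed/unsigned is a simplification; the real function goes on to check TYPE_STRING_FLAG.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Encoding values as defined by DWARF; only the ones used here.  */
enum dwarf_type
{
  DW_ATE_signed = 0x5,
  DW_ATE_unsigned = 0x7,
  DW_ATE_UTF = 0x10	/* DWARF 4.  */
};

/* Hypothetical stand-in for the parts of a tree type the patch inspects.  */
struct mock_type
{
  const char *name;	/* NULL when the type has no TYPE_DECL name.  */
  bool is_builtin_decl;	/* True when declared at BUILTINS_LOCATION.  */
  bool is_unsigned;
};

/* Hypothetical stand-in for the compiler state the patch consults.  */
struct mock_options
{
  int dwarf_version;
  bool dwarf_strict;
  bool is_cxx;
};

/* Return DW_ATE_UTF for the builtin C++ types char16_t and char32_t when
   DWARF 4 is selected or strict DWARF is off; otherwise fall back to a
   plain integer encoding (simplified relative to base_type_die).  */
static enum dwarf_type
mock_integer_encoding (const struct mock_type *type,
		       const struct mock_options *opts)
{
  if ((opts->dwarf_version >= 4 || !opts->dwarf_strict)
      && opts->is_cxx
      && type->name != NULL
      && type->is_builtin_decl
      && (strcmp (type->name, "char16_t") == 0
	  || strcmp (type->name, "char32_t") == 0))
    return DW_ATE_UTF;

  return type->is_unsigned ? DW_ATE_unsigned : DW_ATE_signed;
}
```

In this model a user typedef that happens to be called char16_t is not treated as UTF, because is_builtin_decl is false for it; that is the role the BUILTINS_LOCATION test plays in the patch.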
Richard Guenther - June 17, 2010, 8:34 a.m.
On Thu, Jun 17, 2010 at 9:58 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> The final DWARF4 version now has DW_ATE_UTF value, so this patch
> makes sure it is used for char16_t and char32_t (only in C++ so far,
> in C we don't have such a builtin type).
>
> The name based discovery is perhaps ugly, but we don't have enough bits to
> waste to add TYPE_UTF_FLAG (like TYPE_STRING_FLAG) and adding a langhook
> for that is IMHO overkill.  But if you wish to go that way, it is possible
> too.
>
> 2010-06-17  Jakub Jelinek  <jakub@redhat.com>
>
>        * dwarf2.h (enum dwarf_type): Add DW_ATE_UTF.
>
>        * dwarf2out.c (base_type_die): Use DW_ATE_UTF for
>        C++ char16_t and char32_t.
>
> --- include/dwarf2.h.jj 2010-06-09 13:42:16.000000000 +0200
> +++ include/dwarf2.h    2010-06-17 08:33:07.000000000 +0200
> @@ -654,6 +654,8 @@ enum dwarf_type
>     DW_ATE_signed_fixed = 0xd,
>     DW_ATE_unsigned_fixed = 0xe,
>     DW_ATE_decimal_float = 0xf,
> +    /* DWARF 4.  */
> +    DW_ATE_UTF = 0x10,
>
>     DW_ATE_lo_user = 0x80,
>     DW_ATE_hi_user = 0xff,
> --- gcc/dwarf2out.c.jj  2010-06-17 08:17:11.000000000 +0200
> +++ gcc/dwarf2out.c     2010-06-17 09:17:12.000000000 +0200
> @@ -12377,6 +12377,21 @@ base_type_die (tree type)
>   switch (TREE_CODE (type))
>     {
>     case INTEGER_TYPE:
> +      if ((dwarf_version >= 4 || !dwarf_strict)
> +         && is_cxx ()

I suppose this also would work without that check as you purely
rely on DECL_NAME and BUILTINS_LOCATION.

> +         && TYPE_NAME (type)
> +         && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
> +         && DECL_SOURCE_LOCATION (TYPE_NAME (type)) == BUILTINS_LOCATION

DECL_IS_BUILTIN

Otherwise this looks good, but I'll let Jason approve it.

Richard.

> +         && DECL_NAME (TYPE_NAME (type)))
> +       {
> +         const char *name = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type)));
> +         if (strcmp (name, "char16_t") == 0
> +             || strcmp (name, "char32_t") == 0)
> +           {
> +             encoding = DW_ATE_UTF;
> +             break;
> +           }
> +       }
>       if (TYPE_STRING_FLAG (type))
>        {
>          if (TYPE_UNSIGNED (type))
>
>        Jakub
>
Jakub Jelinek - June 17, 2010, 8:39 a.m.
On Thu, Jun 17, 2010 at 10:34:54AM +0200, Richard Guenther wrote:
> > --- gcc/dwarf2out.c.jj  2010-06-17 08:17:11.000000000 +0200
> > +++ gcc/dwarf2out.c     2010-06-17 09:17:12.000000000 +0200
> > @@ -12377,6 +12377,21 @@ base_type_die (tree type)
> >   switch (TREE_CODE (type))
> >     {
> >     case INTEGER_TYPE:
> > +      if ((dwarf_version >= 4 || !dwarf_strict)
> > +         && is_cxx ()
> 
> I suppose this also would work without that check as you purely
> rely on DECL_NAME and BUILTINS_LOCATION.

That would assume say Ada or other FE doesn't use char16_t or char32_t
builtin type for something unrelated.

> > +         && TYPE_NAME (type)
> > +         && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
> > +         && DECL_SOURCE_LOCATION (TYPE_NAME (type)) == BUILTINS_LOCATION
> 
> DECL_IS_BUILTIN

DECL_IS_BUILTIN includes also UNKNOWN_LOCATION, so the above check
looked safer to me.  But it is not a big deal for me.

	Jakub
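
[Editor's sketch] The difference between the two checks discussed above can be shown with a toy model. The assumptions are labelled in the code: the numeric values 0 and 1 for the two special locations and the "<=" form of the looser test are assumptions about GCC's definitions at the time, not taken from this thread, and all mock_ names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's location type.  */
typedef unsigned int mock_location;

/* Assumed values for the two special locations.  */
#define MOCK_UNKNOWN_LOCATION  ((mock_location) 0)
#define MOCK_BUILTINS_LOCATION ((mock_location) 1)

/* The exact comparison used in the posted patch: only declarations
   placed at BUILTINS_LOCATION match.  */
static bool
mock_at_builtins_location (mock_location loc)
{
  return loc == MOCK_BUILTINS_LOCATION;
}

/* A looser test modelled on DECL_IS_BUILTIN, assumed here to be a "<="
   comparison: it also accepts declarations with UNKNOWN_LOCATION.  */
static bool
mock_decl_is_builtin (mock_location loc)
{
  return loc <= MOCK_BUILTINS_LOCATION;
}
```

Under these assumptions the two predicates disagree only for a declaration whose location is unknown, which is the case Jakub describes as the reason the exact comparison looked safer.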
Richard Guenther - June 17, 2010, 8:55 a.m.
On Thu, Jun 17, 2010 at 10:39 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Jun 17, 2010 at 10:34:54AM +0200, Richard Guenther wrote:
>> > --- gcc/dwarf2out.c.jj  2010-06-17 08:17:11.000000000 +0200
>> > +++ gcc/dwarf2out.c     2010-06-17 09:17:12.000000000 +0200
>> > @@ -12377,6 +12377,21 @@ base_type_die (tree type)
>> >   switch (TREE_CODE (type))
>> >     {
>> >     case INTEGER_TYPE:
>> > +      if ((dwarf_version >= 4 || !dwarf_strict)
>> > +         && is_cxx ()
>>
>> I suppose this also would work without that check as you purely
>> rely on DECL_NAME and BUILTINS_LOCATION.
>
> That would assume say Ada or other FE doesn't use char16_t or char32_t
> builtin type for something unrelated.

Yes.  As a bonus it would also work for LTO ;)

>> > +         && TYPE_NAME (type)
>> > +         && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
>> > +         && DECL_SOURCE_LOCATION (TYPE_NAME (type)) == BUILTINS_LOCATION
>>
>> DECL_IS_BUILTIN
>
> DECL_IS_BUILTIN includes also UNKNOWN_LOCATION, so the above check
> looked safer to me.  But it is not a big deal for me.

True - I suppose we might want to change DECL_IS_BUILTIN ...

Richard.

>        Jakub
>
Jason Merrill - June 21, 2010, 3:59 p.m.
OK with the changes Richard suggested.

Jason

Patch

--- include/dwarf2.h.jj	2010-06-09 13:42:16.000000000 +0200
+++ include/dwarf2.h	2010-06-17 08:33:07.000000000 +0200
@@ -654,6 +654,8 @@  enum dwarf_type
     DW_ATE_signed_fixed = 0xd,
     DW_ATE_unsigned_fixed = 0xe,
     DW_ATE_decimal_float = 0xf,
+    /* DWARF 4.  */
+    DW_ATE_UTF = 0x10,
 
     DW_ATE_lo_user = 0x80,
     DW_ATE_hi_user = 0xff,
--- gcc/dwarf2out.c.jj	2010-06-17 08:17:11.000000000 +0200
+++ gcc/dwarf2out.c	2010-06-17 09:17:12.000000000 +0200
@@ -12377,6 +12377,21 @@  base_type_die (tree type)
   switch (TREE_CODE (type))
     {
     case INTEGER_TYPE:
+      if ((dwarf_version >= 4 || !dwarf_strict)
+	  && is_cxx ()
+	  && TYPE_NAME (type)
+	  && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
+	  && DECL_SOURCE_LOCATION (TYPE_NAME (type)) == BUILTINS_LOCATION
+	  && DECL_NAME (TYPE_NAME (type)))
+	{
+	  const char *name = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type)));
+	  if (strcmp (name, "char16_t") == 0
+	      || strcmp (name, "char32_t") == 0)
+	    {
+	      encoding = DW_ATE_UTF;
+	      break;
+	    }
+	}
       if (TYPE_STRING_FLAG (type))
 	{
 	  if (TYPE_UNSIGNED (type))