diff mbox series

middle-end/114931 - type_hash_canon and structual equality types

Message ID 20240503121400.C8A4313991@imap1.dmz-prg2.suse.org
State New
Headers show
Series middle-end/114931 - type_hash_canon and structual equality types | expand

Commit Message

Richard Biener May 3, 2024, 12:13 p.m. UTC
TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
to make sure to include that into the type unification done via
type_hash_canon.  This requires the flag to be set before querying
the hash which is the biggest part of the patch.

Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.

As said in the PR this merely makes sure to keep individual types
consistent with themselves.  We still will have a set of types
with TYPE_STRUCTURAL_EQUALITY_P and a set without that might be
otherwise identical.  That could be only avoided with changes in
the frontend.

OK for trunk?

Thanks,
Richard.

	PR middle-end/114931
gcc/
	* tree.cc (type_hash_canon_hash): Hash TYPE_STRUCTURAL_EQUALITY_P.
	(type_cache_hasher::equal): Compare TYPE_STRUCTURAL_EQUALITY_P.
	(build_array_type_1): Set TYPE_STRUCTURAL_EQUALITY_P before
	probing with type_hash_canon.
	(build_function_type): Likewise.
	(build_method_type_directly): Likewise.
	(build_offset_type): Likewise.
	(build_complex_type): Likewise.
	* attribs.cc (build_type_attribute_qual_variant): Likewise.

gcc/c-family/
	* c-common.cc (complete_array_type): Set TYPE_STRUCTURAL_EQUALITY_P
	before probing with type_hash_canon.

gcc/testsuite/
	* gcc.dg/pr114931.c: New testcase.
---
 gcc/attribs.cc                  | 20 +++++-----
 gcc/c-family/c-common.cc        | 11 ++++--
 gcc/testsuite/gcc.dg/pr114931.c | 10 +++++
 gcc/tree.cc                     | 65 +++++++++++++++++++++++----------
 4 files changed, 74 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr114931.c

Comments

Martin Uecker May 3, 2024, 3:32 p.m. UTC | #1
Am Freitag, dem 03.05.2024 um 14:13 +0200 schrieb Richard Biener:
> TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
> to make sure to include that into the type unification done via
> type_hash_canon.  This requires the flag to be set before querying
> the hash which is the biggest part of the patch.

I assume this does not affect structs / unions because they
do not make this mechanism of type unification (each tagged type
is a unique type), but only derived types that end up having
TYPE_STRUCTURAL_EQUALITY_P because they are constructed from
incomplete structs / unions before TYPE_CANONICAL is set.

I do not yet understand why this change is needed. Type
identity should not be affected by setting TYPE_CANONICAL, so
why do we need to keep such types separate?  I understand that we
created some inconsistencies, but I do not see why this change
is needed to fix it.  But I also haven't understood how we ended
up with a TYPE_CANONICAL having TYPE_STRUCTURAL_EQUALITY_P in
PR 114931 ...

Martin


> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.
> 
> As said in the PR this merely makes sure to keep individual types
> consistent with themselves.  We still will have a set of types
> with TYPE_STRUCTURAL_EQUALITY_P and a set without that might be
> otherwise identical.  That could be only avoided with changes in
> the frontend.
> 
> OK for trunk?
> 
> Thanks,
> Richard.
> 
> 	PR middle-end/114931
> gcc/
> 	* tree.cc (type_hash_canon_hash): Hash TYPE_STRUCTURAL_EQUALITY_P.
> 	(type_cache_hasher::equal): Compare TYPE_STRUCTURAL_EQUALITY_P.
> 	(build_array_type_1): Set TYPE_STRUCTURAL_EQUALITY_P before
> 	probing with type_hash_canon.
> 	(build_function_type): Likewise.
> 	(build_method_type_directly): Likewise.
> 	(build_offset_type): Likewise.
> 	(build_complex_type): Likewise.
> 	* attribs.cc (build_type_attribute_qual_variant): Likewise.
> 
> gcc/c-family/
> 	* c-common.cc (complete_array_type): Set TYPE_STRUCTURAL_EQUALITY_P
> 	before probing with type_hash_canon.
> 
> gcc/testsuite/
> 	* gcc.dg/pr114931.c: New testcase.
> ---
>  gcc/attribs.cc                  | 20 +++++-----
>  gcc/c-family/c-common.cc        | 11 ++++--
>  gcc/testsuite/gcc.dg/pr114931.c | 10 +++++
>  gcc/tree.cc                     | 65 +++++++++++++++++++++++----------
>  4 files changed, 74 insertions(+), 32 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr114931.c
> 
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index 12ffc5f170a..3ab0b0fd87a 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -1336,6 +1336,16 @@ build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
>        tree dtype = ntype = build_distinct_type_copy (ttype);
>  
>        TYPE_ATTRIBUTES (ntype) = attribute;
> +      /* If the target-dependent attributes make NTYPE different from
> +	 its canonical type, we will need to use structural equality
> +	 checks for this type.
> +
> +	 We shouldn't get here for stripping attributes from a type;
> +	 the no-attribute type might not need structural comparison.  But
> +	 we can if was discarded from type_hash_table.  */
> +      if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
> +	  || !comp_type_attributes (ntype, ttype))
> +	SET_TYPE_STRUCTURAL_EQUALITY (ntype);
>  
>        hashval_t hash = type_hash_canon_hash (ntype);
>        ntype = type_hash_canon (hash, ntype);
> @@ -1343,16 +1353,6 @@ build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
>        if (ntype != dtype)
>  	/* This variant was already in the hash table, don't mess with
>  	   TYPE_CANONICAL.  */;
> -      else if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
> -	       || !comp_type_attributes (ntype, ttype))
> -	/* If the target-dependent attributes make NTYPE different from
> -	   its canonical type, we will need to use structural equality
> -	   checks for this type.
> -
> -	   We shouldn't get here for stripping attributes from a type;
> -	   the no-attribute type might not need structural comparison.  But
> -	   we can if was discarded from type_hash_table.  */
> -	SET_TYPE_STRUCTURAL_EQUALITY (ntype);
>        else if (TYPE_CANONICAL (ntype) == ntype)
>  	TYPE_CANONICAL (ntype) = TYPE_CANONICAL (ttype);
>  
> diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> index 01e3d247fc2..032dcb4b41d 100644
> --- a/gcc/c-family/c-common.cc
> +++ b/gcc/c-family/c-common.cc
> @@ -7115,6 +7115,13 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
>    TYPE_TYPELESS_STORAGE (main_type) = TYPE_TYPELESS_STORAGE (type);
>    layout_type (main_type);
>  
> +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
> +  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
> +      || TYPE_STRUCTURAL_EQUALITY_P (TYPE_DOMAIN (main_type)))
> +    SET_TYPE_STRUCTURAL_EQUALITY (main_type);
> +  else
> +    TYPE_CANONICAL (main_type) = main_type;
> +
>    /* Make sure we have the canonical MAIN_TYPE. */
>    hashval_t hashcode = type_hash_canon_hash (main_type);
>    main_type = type_hash_canon (hashcode, main_type);
> @@ -7122,7 +7129,7 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
>    /* Fix the canonical type.  */
>    if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
>        || TYPE_STRUCTURAL_EQUALITY_P (TYPE_DOMAIN (main_type)))
> -    SET_TYPE_STRUCTURAL_EQUALITY (main_type);
> +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (main_type));
>    else if (TYPE_CANONICAL (TREE_TYPE (main_type)) != TREE_TYPE (main_type)
>  	   || (TYPE_CANONICAL (TYPE_DOMAIN (main_type))
>  	       != TYPE_DOMAIN (main_type)))
> @@ -7130,8 +7137,6 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
>        = build_array_type (TYPE_CANONICAL (TREE_TYPE (main_type)),
>  			  TYPE_CANONICAL (TYPE_DOMAIN (main_type)),
>  			  TYPE_TYPELESS_STORAGE (main_type));
> -  else
> -    TYPE_CANONICAL (main_type) = main_type;
>  
>    if (quals == 0)
>      type = main_type;
> diff --git a/gcc/testsuite/gcc.dg/pr114931.c b/gcc/testsuite/gcc.dg/pr114931.c
> new file mode 100644
> index 00000000000..d690ed70e52
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr114931.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-std=c23" } */
> +
> +struct Tcl_Obj;
> +void(Tcl_FreeInternalRepProc)(struct Tcl_Obj *);
> +typedef struct Tcl_Obj {
> +} Tcl_Obj;
> +struct {
> +  void (*tclFreeObj)(Tcl_Obj *);
> +} Tcl_InitStubs;
> diff --git a/gcc/tree.cc b/gcc/tree.cc
> index 780662549fe..6564b002dc1 100644
> --- a/gcc/tree.cc
> +++ b/gcc/tree.cc
> @@ -6012,6 +6012,8 @@ type_hash_canon_hash (tree type)
>  
>    hstate.add_int (TREE_CODE (type));
>  
> +  hstate.add_flag (TYPE_STRUCTURAL_EQUALITY_P (type));
> +
>    if (TREE_TYPE (type))
>      hstate.add_object (TYPE_HASH (TREE_TYPE (type)));
>  
> @@ -6109,6 +6111,10 @@ type_cache_hasher::equal (type_hash *a, type_hash *b)
>  	  || TYPE_MODE (a->type) != TYPE_MODE (b->type)))
>      return false;
>  
> +  if (TYPE_STRUCTURAL_EQUALITY_P (a->type)
> +      != TYPE_STRUCTURAL_EQUALITY_P (b->type))
> +    return false;
> +
>    switch (TREE_CODE (a->type))
>      {
>      case VOID_TYPE:
> @@ -7347,6 +7353,14 @@ build_array_type_1 (tree elt_type, tree index_type, bool typeless_storage,
>    TYPE_DOMAIN (t) = index_type;
>    TYPE_ADDR_SPACE (t) = TYPE_ADDR_SPACE (elt_type);
>    TYPE_TYPELESS_STORAGE (t) = typeless_storage;
> +
> +  /* Set TYPE_STRUCTURAL_EQUALITY_P.  */
> +  if (set_canonical
> +      && (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
> +	  || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
> +	  || in_lto_p))
> +    SET_TYPE_STRUCTURAL_EQUALITY (t);
> +
>    layout_type (t);
>  
>    if (shared)
> @@ -7363,7 +7377,7 @@ build_array_type_1 (tree elt_type, tree index_type, bool typeless_storage,
>        if (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
>  	  || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
>  	  || in_lto_p)
> -	SET_TYPE_STRUCTURAL_EQUALITY (t);
> +	gcc_unreachable ();
>        else if (TYPE_CANONICAL (elt_type) != elt_type
>  	       || (index_type && TYPE_CANONICAL (index_type) != index_type))
>  	TYPE_CANONICAL (t)
> @@ -7510,21 +7524,25 @@ build_function_type (tree value_type, tree arg_types,
>        TYPE_NO_NAMED_ARGS_STDARG_P (t) = 1;
>      }
>  
> -  /* If we already have such a type, use the old one.  */
> -  hashval_t hash = type_hash_canon_hash (t);
> -  tree probe_type = t;
> -  t = type_hash_canon (hash, t);
> -  if (t != probe_type)
> -    return t;
> -
>    /* Set up the canonical type. */
>    any_structural_p   = TYPE_STRUCTURAL_EQUALITY_P (value_type);
>    any_noncanonical_p = TYPE_CANONICAL (value_type) != value_type;
>    canon_argtypes = maybe_canonicalize_argtypes (arg_types,
>  						&any_structural_p,
>  						&any_noncanonical_p);
> +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
>    if (any_structural_p)
>      SET_TYPE_STRUCTURAL_EQUALITY (t);
> +
> +  /* If we already have such a type, use the old one.  */
> +  hashval_t hash = type_hash_canon_hash (t);
> +  tree probe_type = t;
> +  t = type_hash_canon (hash, t);
> +  if (t != probe_type)
> +    return t;
> +
> +  if (any_structural_p)
> +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
>    else if (any_noncanonical_p)
>      TYPE_CANONICAL (t) = build_function_type (TYPE_CANONICAL (value_type),
>  					      canon_argtypes);
> @@ -7667,13 +7685,6 @@ build_method_type_directly (tree basetype,
>    argtypes = tree_cons (NULL_TREE, ptype, argtypes);
>    TYPE_ARG_TYPES (t) = argtypes;
>  
> -  /* If we already have such a type, use the old one.  */
> -  hashval_t hash = type_hash_canon_hash (t);
> -  tree probe_type = t;
> -  t = type_hash_canon (hash, t);
> -  if (t != probe_type)
> -    return t;
> -
>    /* Set up the canonical type. */
>    any_structural_p
>      = (TYPE_STRUCTURAL_EQUALITY_P (basetype)
> @@ -7684,8 +7695,20 @@ build_method_type_directly (tree basetype,
>    canon_argtypes = maybe_canonicalize_argtypes (TREE_CHAIN (argtypes),
>  						&any_structural_p,
>  						&any_noncanonical_p);
> +
> +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
>    if (any_structural_p)
>      SET_TYPE_STRUCTURAL_EQUALITY (t);
> +
> +  /* If we already have such a type, use the old one.  */
> +  hashval_t hash = type_hash_canon_hash (t);
> +  tree probe_type = t;
> +  t = type_hash_canon (hash, t);
> +  if (t != probe_type)
> +    return t;
> +
> +  if (any_structural_p)
> +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
>    else if (any_noncanonical_p)
>      TYPE_CANONICAL (t)
>        = build_method_type_directly (TYPE_CANONICAL (basetype),
> @@ -7726,6 +7749,9 @@ build_offset_type (tree basetype, tree type)
>  
>    TYPE_OFFSET_BASETYPE (t) = TYPE_MAIN_VARIANT (basetype);
>    TREE_TYPE (t) = type;
> +  if (TYPE_STRUCTURAL_EQUALITY_P (basetype)
> +      || TYPE_STRUCTURAL_EQUALITY_P (type))
> +    SET_TYPE_STRUCTURAL_EQUALITY (t);
>  
>    /* If we already have such a type, use the old one.  */
>    hashval_t hash = type_hash_canon_hash (t);
> @@ -7741,7 +7767,7 @@ build_offset_type (tree basetype, tree type)
>      {
>        if (TYPE_STRUCTURAL_EQUALITY_P (basetype)
>  	  || TYPE_STRUCTURAL_EQUALITY_P (type))
> -	SET_TYPE_STRUCTURAL_EQUALITY (t);
> +	gcc_unreachable ();
>        else if (TYPE_CANONICAL (TYPE_MAIN_VARIANT (basetype)) != basetype
>  	       || TYPE_CANONICAL (type) != type)
>  	TYPE_CANONICAL (t)
> @@ -7770,6 +7796,8 @@ build_complex_type (tree component_type, bool named)
>    tree probe = make_node (COMPLEX_TYPE);
>  
>    TREE_TYPE (probe) = TYPE_MAIN_VARIANT (component_type);
> +  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (probe)))
> +    SET_TYPE_STRUCTURAL_EQUALITY (probe);
>  
>    /* If we already have such a type, use the old one.  */
>    hashval_t hash = type_hash_canon_hash (probe);
> @@ -7781,11 +7809,10 @@ build_complex_type (tree component_type, bool named)
>  	 out the type.  We need to check the canonicalization and
>  	 maybe set the name.  */
>        gcc_checking_assert (COMPLETE_TYPE_P (t)
> -			   && !TYPE_NAME (t)
> -			   && TYPE_CANONICAL (t) == t);
> +			   && !TYPE_NAME (t));
>  
>        if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (t)))
> -	SET_TYPE_STRUCTURAL_EQUALITY (t);
> +	;
>        else if (TYPE_CANONICAL (TREE_TYPE (t)) != TREE_TYPE (t))
>  	TYPE_CANONICAL (t)
>  	  = build_complex_type (TYPE_CANONICAL (TREE_TYPE (t)), named);
Richard Biener May 3, 2024, 4:23 p.m. UTC | #2
> Am 03.05.2024 um 17:33 schrieb Martin Uecker <uecker@tugraz.at>:
> 
> Am Freitag, dem 03.05.2024 um 14:13 +0200 schrieb Richard Biener:
>> TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
>> to make sure to include that into the type unification done via
>> type_hash_canon.  This requires the flag to be set before querying
>> the hash which is the biggest part of the patch.
> 
> I assume this does not affect structs / unions because they
> do not make this mechanism of type unification (each tagged type
> is a unique type), but only derived types that end up having
> TYPE_STRUCTURAL_EQUALITY_P because they are constructed from
> incomplete structs / unions before TYPE_CANONICAL is set.
> 
> I do not yet understand why this change is needed. Type
> identity should not be affected by setting TYPE_CANONICAL, so
> why do we need to keep such types separate?  I understand that we
> created some inconsistencies, but I do not see why this change
> is needed to fix it.  But I also haven't understood how we ended
> up with a TYPE_CANONICAL having TYPE_STRUCTURAL_EQUALITY_P in
> PR 114931 ...

Because we created the canonical function type before where one of its arguments had TYPE_STEUCTURAL_EQUALITY which makes the function type so.

Richard 

> 
> Martin
> 
> 
>> 
>> Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.
>> 
>> As said in the PR this merely makes sure to keep individual types
>> consistent with themselves.  We still will have a set of types
>> with TYPE_STRUCTURAL_EQUALITY_P and a set without that might be
>> otherwise identical.  That could be only avoided with changes in
>> the frontend.
>> 
>> OK for trunk?
>> 
>> Thanks,
>> Richard.
>> 
>>    PR middle-end/114931
>> gcc/
>>    * tree.cc (type_hash_canon_hash): Hash TYPE_STRUCTURAL_EQUALITY_P.
>>    (type_cache_hasher::equal): Compare TYPE_STRUCTURAL_EQUALITY_P.
>>    (build_array_type_1): Set TYPE_STRUCTURAL_EQUALITY_P before
>>    probing with type_hash_canon.
>>    (build_function_type): Likewise.
>>    (build_method_type_directly): Likewise.
>>    (build_offset_type): Likewise.
>>    (build_complex_type): Likewise.
>>    * attribs.cc (build_type_attribute_qual_variant): Likewise.
>> 
>> gcc/c-family/
>>    * c-common.cc (complete_array_type): Set TYPE_STRUCTURAL_EQUALITY_P
>>    before probing with type_hash_canon.
>> 
>> gcc/testsuite/
>>    * gcc.dg/pr114931.c: New testcase.
>> ---
>> gcc/attribs.cc                  | 20 +++++-----
>> gcc/c-family/c-common.cc        | 11 ++++--
>> gcc/testsuite/gcc.dg/pr114931.c | 10 +++++
>> gcc/tree.cc                     | 65 +++++++++++++++++++++++----------
>> 4 files changed, 74 insertions(+), 32 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/pr114931.c
>> 
>> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
>> index 12ffc5f170a..3ab0b0fd87a 100644
>> --- a/gcc/attribs.cc
>> +++ b/gcc/attribs.cc
>> @@ -1336,6 +1336,16 @@ build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
>>       tree dtype = ntype = build_distinct_type_copy (ttype);
>> 
>>       TYPE_ATTRIBUTES (ntype) = attribute;
>> +      /* If the target-dependent attributes make NTYPE different from
>> +     its canonical type, we will need to use structural equality
>> +     checks for this type.
>> +
>> +     We shouldn't get here for stripping attributes from a type;
>> +     the no-attribute type might not need structural comparison.  But
>> +     we can if was discarded from type_hash_table.  */
>> +      if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
>> +      || !comp_type_attributes (ntype, ttype))
>> +    SET_TYPE_STRUCTURAL_EQUALITY (ntype);
>> 
>>       hashval_t hash = type_hash_canon_hash (ntype);
>>       ntype = type_hash_canon (hash, ntype);
>> @@ -1343,16 +1353,6 @@ build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
>>       if (ntype != dtype)
>>    /* This variant was already in the hash table, don't mess with
>>       TYPE_CANONICAL.  */;
>> -      else if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
>> -           || !comp_type_attributes (ntype, ttype))
>> -    /* If the target-dependent attributes make NTYPE different from
>> -       its canonical type, we will need to use structural equality
>> -       checks for this type.
>> -
>> -       We shouldn't get here for stripping attributes from a type;
>> -       the no-attribute type might not need structural comparison.  But
>> -       we can if was discarded from type_hash_table.  */
>> -    SET_TYPE_STRUCTURAL_EQUALITY (ntype);
>>       else if (TYPE_CANONICAL (ntype) == ntype)
>>    TYPE_CANONICAL (ntype) = TYPE_CANONICAL (ttype);
>> 
>> diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
>> index 01e3d247fc2..032dcb4b41d 100644
>> --- a/gcc/c-family/c-common.cc
>> +++ b/gcc/c-family/c-common.cc
>> @@ -7115,6 +7115,13 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
>>   TYPE_TYPELESS_STORAGE (main_type) = TYPE_TYPELESS_STORAGE (type);
>>   layout_type (main_type);
>> 
>> +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
>> +  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
>> +      || TYPE_STRUCTURAL_EQUALITY_P (TYPE_DOMAIN (main_type)))
>> +    SET_TYPE_STRUCTURAL_EQUALITY (main_type);
>> +  else
>> +    TYPE_CANONICAL (main_type) = main_type;
>> +
>>   /* Make sure we have the canonical MAIN_TYPE. */
>>   hashval_t hashcode = type_hash_canon_hash (main_type);
>>   main_type = type_hash_canon (hashcode, main_type);
>> @@ -7122,7 +7129,7 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
>>   /* Fix the canonical type.  */
>>   if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
>>       || TYPE_STRUCTURAL_EQUALITY_P (TYPE_DOMAIN (main_type)))
>> -    SET_TYPE_STRUCTURAL_EQUALITY (main_type);
>> +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (main_type));
>>   else if (TYPE_CANONICAL (TREE_TYPE (main_type)) != TREE_TYPE (main_type)
>>       || (TYPE_CANONICAL (TYPE_DOMAIN (main_type))
>>           != TYPE_DOMAIN (main_type)))
>> @@ -7130,8 +7137,6 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
>>       = build_array_type (TYPE_CANONICAL (TREE_TYPE (main_type)),
>>              TYPE_CANONICAL (TYPE_DOMAIN (main_type)),
>>              TYPE_TYPELESS_STORAGE (main_type));
>> -  else
>> -    TYPE_CANONICAL (main_type) = main_type;
>> 
>>   if (quals == 0)
>>     type = main_type;
>> diff --git a/gcc/testsuite/gcc.dg/pr114931.c b/gcc/testsuite/gcc.dg/pr114931.c
>> new file mode 100644
>> index 00000000000..d690ed70e52
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr114931.c
>> @@ -0,0 +1,10 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-std=c23" } */
>> +
>> +struct Tcl_Obj;
>> +void(Tcl_FreeInternalRepProc)(struct Tcl_Obj *);
>> +typedef struct Tcl_Obj {
>> +} Tcl_Obj;
>> +struct {
>> +  void (*tclFreeObj)(Tcl_Obj *);
>> +} Tcl_InitStubs;
>> diff --git a/gcc/tree.cc b/gcc/tree.cc
>> index 780662549fe..6564b002dc1 100644
>> --- a/gcc/tree.cc
>> +++ b/gcc/tree.cc
>> @@ -6012,6 +6012,8 @@ type_hash_canon_hash (tree type)
>> 
>>   hstate.add_int (TREE_CODE (type));
>> 
>> +  hstate.add_flag (TYPE_STRUCTURAL_EQUALITY_P (type));
>> +
>>   if (TREE_TYPE (type))
>>     hstate.add_object (TYPE_HASH (TREE_TYPE (type)));
>> 
>> @@ -6109,6 +6111,10 @@ type_cache_hasher::equal (type_hash *a, type_hash *b)
>>      || TYPE_MODE (a->type) != TYPE_MODE (b->type)))
>>     return false;
>> 
>> +  if (TYPE_STRUCTURAL_EQUALITY_P (a->type)
>> +      != TYPE_STRUCTURAL_EQUALITY_P (b->type))
>> +    return false;
>> +
>>   switch (TREE_CODE (a->type))
>>     {
>>     case VOID_TYPE:
>> @@ -7347,6 +7353,14 @@ build_array_type_1 (tree elt_type, tree index_type, bool typeless_storage,
>>   TYPE_DOMAIN (t) = index_type;
>>   TYPE_ADDR_SPACE (t) = TYPE_ADDR_SPACE (elt_type);
>>   TYPE_TYPELESS_STORAGE (t) = typeless_storage;
>> +
>> +  /* Set TYPE_STRUCTURAL_EQUALITY_P.  */
>> +  if (set_canonical
>> +      && (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
>> +      || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
>> +      || in_lto_p))
>> +    SET_TYPE_STRUCTURAL_EQUALITY (t);
>> +
>>   layout_type (t);
>> 
>>   if (shared)
>> @@ -7363,7 +7377,7 @@ build_array_type_1 (tree elt_type, tree index_type, bool typeless_storage,
>>       if (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
>>      || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
>>      || in_lto_p)
>> -    SET_TYPE_STRUCTURAL_EQUALITY (t);
>> +    gcc_unreachable ();
>>       else if (TYPE_CANONICAL (elt_type) != elt_type
>>           || (index_type && TYPE_CANONICAL (index_type) != index_type))
>>    TYPE_CANONICAL (t)
>> @@ -7510,21 +7524,25 @@ build_function_type (tree value_type, tree arg_types,
>>       TYPE_NO_NAMED_ARGS_STDARG_P (t) = 1;
>>     }
>> 
>> -  /* If we already have such a type, use the old one.  */
>> -  hashval_t hash = type_hash_canon_hash (t);
>> -  tree probe_type = t;
>> -  t = type_hash_canon (hash, t);
>> -  if (t != probe_type)
>> -    return t;
>> -
>>   /* Set up the canonical type. */
>>   any_structural_p   = TYPE_STRUCTURAL_EQUALITY_P (value_type);
>>   any_noncanonical_p = TYPE_CANONICAL (value_type) != value_type;
>>   canon_argtypes = maybe_canonicalize_argtypes (arg_types,
>>                        &any_structural_p,
>>                        &any_noncanonical_p);
>> +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
>>   if (any_structural_p)
>>     SET_TYPE_STRUCTURAL_EQUALITY (t);
>> +
>> +  /* If we already have such a type, use the old one.  */
>> +  hashval_t hash = type_hash_canon_hash (t);
>> +  tree probe_type = t;
>> +  t = type_hash_canon (hash, t);
>> +  if (t != probe_type)
>> +    return t;
>> +
>> +  if (any_structural_p)
>> +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
>>   else if (any_noncanonical_p)
>>     TYPE_CANONICAL (t) = build_function_type (TYPE_CANONICAL (value_type),
>>                          canon_argtypes);
>> @@ -7667,13 +7685,6 @@ build_method_type_directly (tree basetype,
>>   argtypes = tree_cons (NULL_TREE, ptype, argtypes);
>>   TYPE_ARG_TYPES (t) = argtypes;
>> 
>> -  /* If we already have such a type, use the old one.  */
>> -  hashval_t hash = type_hash_canon_hash (t);
>> -  tree probe_type = t;
>> -  t = type_hash_canon (hash, t);
>> -  if (t != probe_type)
>> -    return t;
>> -
>>   /* Set up the canonical type. */
>>   any_structural_p
>>     = (TYPE_STRUCTURAL_EQUALITY_P (basetype)
>> @@ -7684,8 +7695,20 @@ build_method_type_directly (tree basetype,
>>   canon_argtypes = maybe_canonicalize_argtypes (TREE_CHAIN (argtypes),
>>                        &any_structural_p,
>>                        &any_noncanonical_p);
>> +
>> +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
>>   if (any_structural_p)
>>     SET_TYPE_STRUCTURAL_EQUALITY (t);
>> +
>> +  /* If we already have such a type, use the old one.  */
>> +  hashval_t hash = type_hash_canon_hash (t);
>> +  tree probe_type = t;
>> +  t = type_hash_canon (hash, t);
>> +  if (t != probe_type)
>> +    return t;
>> +
>> +  if (any_structural_p)
>> +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
>>   else if (any_noncanonical_p)
>>     TYPE_CANONICAL (t)
>>       = build_method_type_directly (TYPE_CANONICAL (basetype),
>> @@ -7726,6 +7749,9 @@ build_offset_type (tree basetype, tree type)
>> 
>>   TYPE_OFFSET_BASETYPE (t) = TYPE_MAIN_VARIANT (basetype);
>>   TREE_TYPE (t) = type;
>> +  if (TYPE_STRUCTURAL_EQUALITY_P (basetype)
>> +      || TYPE_STRUCTURAL_EQUALITY_P (type))
>> +    SET_TYPE_STRUCTURAL_EQUALITY (t);
>> 
>>   /* If we already have such a type, use the old one.  */
>>   hashval_t hash = type_hash_canon_hash (t);
>> @@ -7741,7 +7767,7 @@ build_offset_type (tree basetype, tree type)
>>     {
>>       if (TYPE_STRUCTURAL_EQUALITY_P (basetype)
>>      || TYPE_STRUCTURAL_EQUALITY_P (type))
>> -    SET_TYPE_STRUCTURAL_EQUALITY (t);
>> +    gcc_unreachable ();
>>       else if (TYPE_CANONICAL (TYPE_MAIN_VARIANT (basetype)) != basetype
>>           || TYPE_CANONICAL (type) != type)
>>    TYPE_CANONICAL (t)
>> @@ -7770,6 +7796,8 @@ build_complex_type (tree component_type, bool named)
>>   tree probe = make_node (COMPLEX_TYPE);
>> 
>>   TREE_TYPE (probe) = TYPE_MAIN_VARIANT (component_type);
>> +  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (probe)))
>> +    SET_TYPE_STRUCTURAL_EQUALITY (probe);
>> 
>>   /* If we already have such a type, use the old one.  */
>>   hashval_t hash = type_hash_canon_hash (probe);
>> @@ -7781,11 +7809,10 @@ build_complex_type (tree component_type, bool named)
>>     out the type.  We need to check the canonicalization and
>>     maybe set the name.  */
>>       gcc_checking_assert (COMPLETE_TYPE_P (t)
>> -               && !TYPE_NAME (t)
>> -               && TYPE_CANONICAL (t) == t);
>> +               && !TYPE_NAME (t));
>> 
>>       if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (t)))
>> -    SET_TYPE_STRUCTURAL_EQUALITY (t);
>> +    ;
>>       else if (TYPE_CANONICAL (TREE_TYPE (t)) != TREE_TYPE (t))
>>    TYPE_CANONICAL (t)
>>      = build_complex_type (TYPE_CANONICAL (TREE_TYPE (t)), named);
>
Jakub Jelinek May 3, 2024, 5:30 p.m. UTC | #3
On Fri, May 03, 2024 at 05:32:12PM +0200, Martin Uecker wrote:
> Am Freitag, dem 03.05.2024 um 14:13 +0200 schrieb Richard Biener:
> > TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
> > to make sure to include that into the type unification done via
> > type_hash_canon.  This requires the flag to be set before querying
> > the hash which is the biggest part of the patch.
> 
> I assume this does not affect structs / unions because they
> do not make this mechanism of type unification (each tagged type
> is a unique type), but only derived types that end up having
> TYPE_STRUCTURAL_EQUALITY_P because they are constructed from
> incomplete structs / unions before TYPE_CANONICAL is set.
> 
> I do not yet understand why this change is needed. Type
> identity should not be affected by setting TYPE_CANONICAL, so
> why do we need to keep such types separate?  I understand that we
> created some inconsistencies, but I do not see why this change
> is needed to fix it.  But I also haven't understood how we ended
> up with a TYPE_CANONICAL having TYPE_STRUCTURAL_EQUALITY_P in
> PR 114931 ...

So, the C23 situation before the r14-10045 change (not counting the
r14-9763 change that was later reverted) was that sometimes TYPE_CANONICAL
of a RECORD_TYPE/UNION_TYPE could change from self to a different
RECORD_TYPE/UNION_TYPE and we didn't bother to adjust derived types.
That was really dangerous, I think e.g. type alias set wasn't recomputed.

r14-10045 changed it to the non-ideal, but perhaps less wrong model,
where we start with TYPE_STRUCTURAL_EQUALITY_P on incomplete types in C23
and perhaps later on change them to !TYPE_STRUCTURAL_EQUALITY_P when
the type is completed, and adjust TYPE_CANONICAL of some easily discoverable
derived types but certainly not all.

Still, that change introduces something novel to the type system, namely
that TYPE_CANONICAL can change on a type, even when it is just limited to
the TYPE_STRUCTURAL_EQUALITY_P -> !TYPE_STRUCTURAL_EQUALITY_P kind of
change and we never change one non-NULL TYPE_CANONICAL to a different one
(ok, not counting the short lived TYPE_CANONICAL being initially set to
self during make_node and then quickly adjusted in the callers).

One question is, could we for C23 somehow limit this for the most common
case where people just forward declare some aggregate type and then soon
after define it?  But guess the problematic counterexample there is
struct S; // forward declare
struct S *p; // create some derived types from it
void foo (void)
{
  struct S { int s; };	// define the type in a different scope
			// (perhaps with a forward declaration as well)
  struct S s;
  use (&s);		// create derived types
}
struct S { int s; };	// define the type in the global scope to something
			// that matches previously defined struct S in
			// another scope
So e.g. noting in the hash table that a particular type has been forward
declared so far and using TYPE_STRUCTURAL_EQUALITY_P only if it has been
forward declared in some other scope wouldn't work.

Another question is whether c_update_type_canonical can or should try to
update TYPE_ALIAS_SET if TYPE_ALIAS_SET_KNOWN_P.  Or do we never cache
alias sets for TYPE_STRUCTURAL_EQUALITY_P types?

Anyway, the ICE on the testcase is because alias.cc asserts that
a !TYPE_STRUCTURAL_EQUALITY_P (type) has
!TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)).

The possibilities to resolve that are either at c_update_type_canonical
time try to find all the derived types rather than just some and recompute
their TYPE_CANONICAL.  Guess we could e.g. just traverse the whole
type_hash_table hash table and for each type see if it is in any way related
to the type that is being changed and then recompute them.  Though,
especially FUNCTION_TYPEs make that really ugly and furthermore it needs
to be recomputed in the right order, basically in the derivation order.
Without doing that, we'll have some TYPE_STRUCTURAL_EQUALITY_P derived
types in the type_hash_table hash table; that is conservatively correct,
but can result in worse code generation because of using alias set 0.

Another possibility is what Richi posted, essentially stop reusing
derived types created from the time when the base type was incomplete
when asking for a new derived type.  We'll get the TYPE_STRUCTURAL_EQUALITY_P
derived types if they were created before the base type was completed
when used directly (e.g. when it is a TREE_TYPE of some decl etc.), but
when we ask for a new type we'd disregard the old type and create a new one.
I'm not sure the patch is complete for that, because it doesn't adjust
check_base_type, build_pointer_type_for_mode, build_reference_type_for_mode
which don't use type_hash_canon but walk TYPE_NEXT_VARIANT list or
TYPE_POINTER_TO or TYPE_REFERENCE_TO chains.  Though, maybe it is ok
as is when c_update_type_canonical adjusts the pointer types
and variant types, those will be adjusted and as soon as we hit the first
ARRAY_TYPE/FUNCTION_TYPE or other type_canon_hash using derived types and
they aren't shared but have a new one created, further derived
pointer/qualified types will be based on the new ones, not the old ones.

Another possibility is what I wrote in the PR, ensure the
!TYPE_STRUCTURAL_EQUALITY_P (type) ->
!TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)) implication alias.cc
relies on is held.  Without the complete update of all derived types or
Richi's patch, there is always a chance that some TYPE_CANONICAL (...) =
assignment will get a TYPE_STRUCTURAL_EQUALITY_P type which was created
while the base type was incomplete.  So, we could just check for that
case and use conservative type as well.  I.e. after each TYPE_CANONICAL
(...) = ... (except the TYPE_CANONICAL (t) = t; cases which are always
fine) add
  if (TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)))
    SET_TYPE_STRUCTURAL_EQUALITY (type);
That is conservatively fine.

And yet another option is to change alias.cc to handle the
TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)) case the same
as TYPE_STRUCTURAL_EQUALITY_P (type) instead of asserting it is not
ok.

I think we'd get best generated code if we managed to update all derived
types, Richi's patch is the second and the other changes result in even
worse code.

	Jakub
Martin Uecker May 3, 2024, 5:44 p.m. UTC | #4
Am Freitag, dem 03.05.2024 um 18:23 +0200 schrieb Richard Biener:
> 
> > Am 03.05.2024 um 17:33 schrieb Martin Uecker <uecker@tugraz.at>:
> > 
> > Am Freitag, dem 03.05.2024 um 14:13 +0200 schrieb Richard Biener:
> > > TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
> > > to make sure to include that into the type unification done via
> > > type_hash_canon.  This requires the flag to be set before querying
> > > the hash which is the biggest part of the patch.
> > 
> > I assume this does not affect structs / unions because they
> > do not make this mechanism of type unification (each tagged type
> > is a unique type), but only derived types that end up having
> > TYPE_STRUCTURAL_EQUALITY_P because they are constructed from
> > incomplete structs / unions before TYPE_CANONICAL is set.
> > 
> > I do not yet understand why this change is needed. Type
> > identity should not be affected by setting TYPE_CANONICAL, so
> > why do we need to keep such types separate?  I understand that we
> > created some inconsistencies, but I do not see why this change
> > is needed to fix it.  But I also haven't understood how we ended
> > up with a TYPE_CANONICAL having TYPE_STRUCTURAL_EQUALITY_P in
> > PR 114931 ...
> 
> Because we created the canonical function type before where one
> of its arguments had TYPE_STEUCTURAL_EQUALITY which makes the
> function type so.

So build_function_type when called recursively for creating
a TYPE_CANONICAL found some type with TYPE_STRUCTURAL_EQUALITY.

And the plan is to separate those in the hash table so that this
cannot happen?

Couldn't we instead lazily update TYPE_CANONICAL at this point? 


Martin

> 
> Richard 
> 
> > 
> > Martin
> > 
> > 
> > > 
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.
> > > 
> > > As said in the PR this merely makes sure to keep individual types
> > > consistent with themselves.  We still will have a set of types
> > > with TYPE_STRUCTURAL_EQUALITY_P and a set without that might be
> > > otherwise identical.  That could be only avoided with changes in
> > > the frontend.
> > > 
> > > OK for trunk?
> > > 
> > > Thanks,
> > > Richard.
> > > 
> > >    PR middle-end/114931
> > > gcc/
> > >    * tree.cc (type_hash_canon_hash): Hash TYPE_STRUCTURAL_EQUALITY_P.
> > >    (type_cache_hasher::equal): Compare TYPE_STRUCTURAL_EQUALITY_P.
> > >    (build_array_type_1): Set TYPE_STRUCTURAL_EQUALITY_P before
> > >    probing with type_hash_canon.
> > >    (build_function_type): Likewise.
> > >    (build_method_type_directly): Likewise.
> > >    (build_offset_type): Likewise.
> > >    (build_complex_type): Likewise.
> > >    * attribs.cc (build_type_attribute_qual_variant): Likewise.
> > > 
> > > gcc/c-family/
> > >    * c-common.cc (complete_array_type): Set TYPE_STRUCTURAL_EQUALITY_P
> > >    before probing with type_hash_canon.
> > > 
> > > gcc/testsuite/
> > >    * gcc.dg/pr114931.c: New testcase.
> > > ---
> > > gcc/attribs.cc                  | 20 +++++-----
> > > gcc/c-family/c-common.cc        | 11 ++++--
> > > gcc/testsuite/gcc.dg/pr114931.c | 10 +++++
> > > gcc/tree.cc                     | 65 +++++++++++++++++++++++----------
> > > 4 files changed, 74 insertions(+), 32 deletions(-)
> > > create mode 100644 gcc/testsuite/gcc.dg/pr114931.c
> > > 
> > > diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> > > index 12ffc5f170a..3ab0b0fd87a 100644
> > > --- a/gcc/attribs.cc
> > > +++ b/gcc/attribs.cc
> > > @@ -1336,6 +1336,16 @@ build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
> > >       tree dtype = ntype = build_distinct_type_copy (ttype);
> > > 
> > >       TYPE_ATTRIBUTES (ntype) = attribute;
> > > +      /* If the target-dependent attributes make NTYPE different from
> > > +     its canonical type, we will need to use structural equality
> > > +     checks for this type.
> > > +
> > > +     We shouldn't get here for stripping attributes from a type;
> > > +     the no-attribute type might not need structural comparison.  But
> > > +     we can if was discarded from type_hash_table.  */
> > > +      if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
> > > +      || !comp_type_attributes (ntype, ttype))
> > > +    SET_TYPE_STRUCTURAL_EQUALITY (ntype);
> > > 
> > >       hashval_t hash = type_hash_canon_hash (ntype);
> > >       ntype = type_hash_canon (hash, ntype);
> > > @@ -1343,16 +1353,6 @@ build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
> > >       if (ntype != dtype)
> > >    /* This variant was already in the hash table, don't mess with
> > >       TYPE_CANONICAL.  */;
> > > -      else if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
> > > -           || !comp_type_attributes (ntype, ttype))
> > > -    /* If the target-dependent attributes make NTYPE different from
> > > -       its canonical type, we will need to use structural equality
> > > -       checks for this type.
> > > -
> > > -       We shouldn't get here for stripping attributes from a type;
> > > -       the no-attribute type might not need structural comparison.  But
> > > -       we can if was discarded from type_hash_table.  */
> > > -    SET_TYPE_STRUCTURAL_EQUALITY (ntype);
> > >       else if (TYPE_CANONICAL (ntype) == ntype)
> > >    TYPE_CANONICAL (ntype) = TYPE_CANONICAL (ttype);
> > > 
> > > diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> > > index 01e3d247fc2..032dcb4b41d 100644
> > > --- a/gcc/c-family/c-common.cc
> > > +++ b/gcc/c-family/c-common.cc
> > > @@ -7115,6 +7115,13 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
> > >   TYPE_TYPELESS_STORAGE (main_type) = TYPE_TYPELESS_STORAGE (type);
> > >   layout_type (main_type);
> > > 
> > > +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
> > > +  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
> > > +      || TYPE_STRUCTURAL_EQUALITY_P (TYPE_DOMAIN (main_type)))
> > > +    SET_TYPE_STRUCTURAL_EQUALITY (main_type);
> > > +  else
> > > +    TYPE_CANONICAL (main_type) = main_type;
> > > +
> > >   /* Make sure we have the canonical MAIN_TYPE. */
> > >   hashval_t hashcode = type_hash_canon_hash (main_type);
> > >   main_type = type_hash_canon (hashcode, main_type);
> > > @@ -7122,7 +7129,7 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
> > >   /* Fix the canonical type.  */
> > >   if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
> > >       || TYPE_STRUCTURAL_EQUALITY_P (TYPE_DOMAIN (main_type)))
> > > -    SET_TYPE_STRUCTURAL_EQUALITY (main_type);
> > > +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (main_type));
> > >   else if (TYPE_CANONICAL (TREE_TYPE (main_type)) != TREE_TYPE (main_type)
> > >       || (TYPE_CANONICAL (TYPE_DOMAIN (main_type))
> > >           != TYPE_DOMAIN (main_type)))
> > > @@ -7130,8 +7137,6 @@ complete_array_type (tree *ptype, tree initial_value, bool do_default)
> > >       = build_array_type (TYPE_CANONICAL (TREE_TYPE (main_type)),
> > >              TYPE_CANONICAL (TYPE_DOMAIN (main_type)),
> > >              TYPE_TYPELESS_STORAGE (main_type));
> > > -  else
> > > -    TYPE_CANONICAL (main_type) = main_type;
> > > 
> > >   if (quals == 0)
> > >     type = main_type;
> > > diff --git a/gcc/testsuite/gcc.dg/pr114931.c b/gcc/testsuite/gcc.dg/pr114931.c
> > > new file mode 100644
> > > index 00000000000..d690ed70e52
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/pr114931.c
> > > @@ -0,0 +1,10 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-std=c23" } */
> > > +
> > > +struct Tcl_Obj;
> > > +void(Tcl_FreeInternalRepProc)(struct Tcl_Obj *);
> > > +typedef struct Tcl_Obj {
> > > +} Tcl_Obj;
> > > +struct {
> > > +  void (*tclFreeObj)(Tcl_Obj *);
> > > +} Tcl_InitStubs;
> > > diff --git a/gcc/tree.cc b/gcc/tree.cc
> > > index 780662549fe..6564b002dc1 100644
> > > --- a/gcc/tree.cc
> > > +++ b/gcc/tree.cc
> > > @@ -6012,6 +6012,8 @@ type_hash_canon_hash (tree type)
> > > 
> > >   hstate.add_int (TREE_CODE (type));
> > > 
> > > +  hstate.add_flag (TYPE_STRUCTURAL_EQUALITY_P (type));
> > > +
> > >   if (TREE_TYPE (type))
> > >     hstate.add_object (TYPE_HASH (TREE_TYPE (type)));
> > > 
> > > @@ -6109,6 +6111,10 @@ type_cache_hasher::equal (type_hash *a, type_hash *b)
> > >      || TYPE_MODE (a->type) != TYPE_MODE (b->type)))
> > >     return false;
> > > 
> > > +  if (TYPE_STRUCTURAL_EQUALITY_P (a->type)
> > > +      != TYPE_STRUCTURAL_EQUALITY_P (b->type))
> > > +    return false;
> > > +
> > >   switch (TREE_CODE (a->type))
> > >     {
> > >     case VOID_TYPE:
> > > @@ -7347,6 +7353,14 @@ build_array_type_1 (tree elt_type, tree index_type, bool typeless_storage,
> > >   TYPE_DOMAIN (t) = index_type;
> > >   TYPE_ADDR_SPACE (t) = TYPE_ADDR_SPACE (elt_type);
> > >   TYPE_TYPELESS_STORAGE (t) = typeless_storage;
> > > +
> > > +  /* Set TYPE_STRUCTURAL_EQUALITY_P.  */
> > > +  if (set_canonical
> > > +      && (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
> > > +      || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
> > > +      || in_lto_p))
> > > +    SET_TYPE_STRUCTURAL_EQUALITY (t);
> > > +
> > >   layout_type (t);
> > > 
> > >   if (shared)
> > > @@ -7363,7 +7377,7 @@ build_array_type_1 (tree elt_type, tree index_type, bool typeless_storage,
> > >       if (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
> > >      || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
> > >      || in_lto_p)
> > > -    SET_TYPE_STRUCTURAL_EQUALITY (t);
> > > +    gcc_unreachable ();
> > >       else if (TYPE_CANONICAL (elt_type) != elt_type
> > >           || (index_type && TYPE_CANONICAL (index_type) != index_type))
> > >    TYPE_CANONICAL (t)
> > > @@ -7510,21 +7524,25 @@ build_function_type (tree value_type, tree arg_types,
> > >       TYPE_NO_NAMED_ARGS_STDARG_P (t) = 1;
> > >     }
> > > 
> > > -  /* If we already have such a type, use the old one.  */
> > > -  hashval_t hash = type_hash_canon_hash (t);
> > > -  tree probe_type = t;
> > > -  t = type_hash_canon (hash, t);
> > > -  if (t != probe_type)
> > > -    return t;
> > > -
> > >   /* Set up the canonical type. */
> > >   any_structural_p   = TYPE_STRUCTURAL_EQUALITY_P (value_type);
> > >   any_noncanonical_p = TYPE_CANONICAL (value_type) != value_type;
> > >   canon_argtypes = maybe_canonicalize_argtypes (arg_types,
> > >                        &any_structural_p,
> > >                        &any_noncanonical_p);
> > > +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
> > >   if (any_structural_p)
> > >     SET_TYPE_STRUCTURAL_EQUALITY (t);
> > > +
> > > +  /* If we already have such a type, use the old one.  */
> > > +  hashval_t hash = type_hash_canon_hash (t);
> > > +  tree probe_type = t;
> > > +  t = type_hash_canon (hash, t);
> > > +  if (t != probe_type)
> > > +    return t;
> > > +
> > > +  if (any_structural_p)
> > > +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
> > >   else if (any_noncanonical_p)
> > >     TYPE_CANONICAL (t) = build_function_type (TYPE_CANONICAL (value_type),
> > >                          canon_argtypes);
> > > @@ -7667,13 +7685,6 @@ build_method_type_directly (tree basetype,
> > >   argtypes = tree_cons (NULL_TREE, ptype, argtypes);
> > >   TYPE_ARG_TYPES (t) = argtypes;
> > > 
> > > -  /* If we already have such a type, use the old one.  */
> > > -  hashval_t hash = type_hash_canon_hash (t);
> > > -  tree probe_type = t;
> > > -  t = type_hash_canon (hash, t);
> > > -  if (t != probe_type)
> > > -    return t;
> > > -
> > >   /* Set up the canonical type. */
> > >   any_structural_p
> > >     = (TYPE_STRUCTURAL_EQUALITY_P (basetype)
> > > @@ -7684,8 +7695,20 @@ build_method_type_directly (tree basetype,
> > >   canon_argtypes = maybe_canonicalize_argtypes (TREE_CHAIN (argtypes),
> > >                        &any_structural_p,
> > >                        &any_noncanonical_p);
> > > +
> > > +  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
> > >   if (any_structural_p)
> > >     SET_TYPE_STRUCTURAL_EQUALITY (t);
> > > +
> > > +  /* If we already have such a type, use the old one.  */
> > > +  hashval_t hash = type_hash_canon_hash (t);
> > > +  tree probe_type = t;
> > > +  t = type_hash_canon (hash, t);
> > > +  if (t != probe_type)
> > > +    return t;
> > > +
> > > +  if (any_structural_p)
> > > +    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
> > >   else if (any_noncanonical_p)
> > >     TYPE_CANONICAL (t)
> > >       = build_method_type_directly (TYPE_CANONICAL (basetype),
> > > @@ -7726,6 +7749,9 @@ build_offset_type (tree basetype, tree type)
> > > 
> > >   TYPE_OFFSET_BASETYPE (t) = TYPE_MAIN_VARIANT (basetype);
> > >   TREE_TYPE (t) = type;
> > > +  if (TYPE_STRUCTURAL_EQUALITY_P (basetype)
> > > +      || TYPE_STRUCTURAL_EQUALITY_P (type))
> > > +    SET_TYPE_STRUCTURAL_EQUALITY (t);
> > > 
> > >   /* If we already have such a type, use the old one.  */
> > >   hashval_t hash = type_hash_canon_hash (t);
> > > @@ -7741,7 +7767,7 @@ build_offset_type (tree basetype, tree type)
> > >     {
> > >       if (TYPE_STRUCTURAL_EQUALITY_P (basetype)
> > >      || TYPE_STRUCTURAL_EQUALITY_P (type))
> > > -    SET_TYPE_STRUCTURAL_EQUALITY (t);
> > > +    gcc_unreachable ();
> > >       else if (TYPE_CANONICAL (TYPE_MAIN_VARIANT (basetype)) != basetype
> > >           || TYPE_CANONICAL (type) != type)
> > >    TYPE_CANONICAL (t)
> > > @@ -7770,6 +7796,8 @@ build_complex_type (tree component_type, bool named)
> > >   tree probe = make_node (COMPLEX_TYPE);
> > > 
> > >   TREE_TYPE (probe) = TYPE_MAIN_VARIANT (component_type);
> > > +  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (probe)))
> > > +    SET_TYPE_STRUCTURAL_EQUALITY (probe);
> > > 
> > >   /* If we already have such a type, use the old one.  */
> > >   hashval_t hash = type_hash_canon_hash (probe);
> > > @@ -7781,11 +7809,10 @@ build_complex_type (tree component_type, bool named)
> > >     out the type.  We need to check the canonicalization and
> > >     maybe set the name.  */
> > >       gcc_checking_assert (COMPLETE_TYPE_P (t)
> > > -               && !TYPE_NAME (t)
> > > -               && TYPE_CANONICAL (t) == t);
> > > +               && !TYPE_NAME (t));
> > > 
> > >       if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (t)))
> > > -    SET_TYPE_STRUCTURAL_EQUALITY (t);
> > > +    ;
> > >       else if (TYPE_CANONICAL (TREE_TYPE (t)) != TREE_TYPE (t))
> > >    TYPE_CANONICAL (t)
> > >      = build_complex_type (TYPE_CANONICAL (TREE_TYPE (t)), named);
> >
Martin Uecker May 3, 2024, 6:04 p.m. UTC | #5
Am Freitag, dem 03.05.2024 um 19:30 +0200 schrieb Jakub Jelinek:
> On Fri, May 03, 2024 at 05:32:12PM +0200, Martin Uecker wrote:
> > Am Freitag, dem 03.05.2024 um 14:13 +0200 schrieb Richard Biener:
> > > TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
> > > to make sure to include that into the type unification done via
> > > type_hash_canon.  This requires the flag to be set before querying
> > > the hash which is the biggest part of the patch.
> > 
> > I assume this does not affect structs / unions because they
> > do not make this mechanism of type unification (each tagged type
> > is a unique type), but only derived types that end up having
> > TYPE_STRUCTURAL_EQUALITY_P because they are constructed from
> > incomplete structs / unions before TYPE_CANONICAL is set.
> > 
> > I do not yet understand why this change is needed. Type
> > identity should not be affected by setting TYPE_CANONICAL, so
> > why do we need to keep such types separate?  I understand that we
> > created some inconsistencies, but I do not see why this change
> > is needed to fix it.  But I also haven't understood how we ended
> > up with a TYPE_CANONICAL having TYPE_STRUCTURAL_EQUALITY_P in
> > PR 114931 ...
> 
> So, the C23 situation before the r14-10045 change (not counting the
> r14-9763 change that was later reverted) was that sometimes TYPE_CANONICAL
> of a RECORD_TYPE/UNION_TYPE could change from self to a different
> RECORD_TYPE/UNION_TYPE and we didn't bother to adjust derived types.
> That was really dangerous, I think e.g. type alias set wasn't recomputed.
> 
> r14-10045 changed it to the non-ideal, but perhaps less wrong model,
> where we start with TYPE_STRUCTURAL_EQUALITY_P on incomplete types in C23
> and perhaps later on change them to !TYPE_STRUCTURAL_EQUALITY_P when
> the type is completed, and adjust TYPE_CANONICAL of some easily discoverable
> derived types but certainly not all.
> 
> Still, that change introduces something novel to the type system, namely
> that TYPE_CANONICAL can change on a type, even when it is just limited to
> the TYPE_STRUCTURAL_EQUALITY_P -> !TYPE_STRUCTURAL_EQUALITY_P kind of
> change and we never change one non-NULL TYPE_CANONICAL to a different one
> (ok, not counting the short lived TYPE_CANONICAL being initially set to
> self during make_node and then quickly adjusted in the callers).
> 
> One question is, could we for C23 somehow limit this for the most common
> case where people just forward declare some aggregate type and then soon
> after define it?  But guess the problematic counterexample there is
> struct S; // forward declare
> struct S *p; // create some derived types from it
> void foo (void)
> {
>   struct S { int s; };	// define the type in a different scope
> 			// (perhaps with a forward declaration as well)
>   struct S s;
>   use (&s);		// create derived types
> }
> struct S { int s; };	// define the type in the global scope to something
> 			// that matches previously defined struct S in
> 			// another scope
> So e.g. noting in the hash table that a particular type has been forward
> declared so far and using TYPE_STRUCTURAL_EQUALITY_P only if it has been
> forward declared in some other scope wouldn't work.
> 
> Another question is whether c_update_type_canonical can or should try to
> update TYPE_ALIAS_SET if TYPE_ALIAS_SET_KNOWN_P.  Or do we never cache
> alias sets for TYPE_STRUCTURAL_EQUALITY_P types?
> 
> Anyway, the ICE on the testcase is because alias.cc asserts that
> a !TYPE_STRUCTURAL_EQUALITY_P (type) has
> !TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)).
> 
> The possibilities to resolve that are either at c_update_type_canonical
> time try to find all the derived types rather than just some and recompute
> their TYPE_CANONICAL.  Guess we could e.g. just traverse the whole
> type_hash_table hash table and for each type see if it is in any way related
> to the type that is being changed and then recompute them.  Though,
> especially FUNCTION_TYPEs make that really ugly and furthermore it needs
> to be recomputed in the right order, basically in the derivation order.
> Without doing that, we'll have some TYPE_STRUCTURAL_EQUALITY_P derived
> types in the type_hash_table hash table; that is conservatively correct,
> but can result in worse code generation because of using alias set 0.
> 
> Another possibility is what Richi posted, essentially stop reusing
> derived types created from the time when the base type was incomplete
> when asking for a new derived type.  We'll get the TYPE_STRUCTURAL_EQUALITY_P
> derived types if they were created before the base type was completed
> when used directly (e.g. when it is a TREE_TYPE of some decl etc.), but
> when we ask for a new type we'd disregard the old type and create a new one.
> I'm not sure the patch is complete for that, because it doesn't adjust
> check_base_type, build_pointer_type_for_mode, build_reference_type_for_mode
> which don't use type_hash_canon but walk TYPE_NEXT_VARIANT list or
> TYPE_POINTER_TO or TYPE_REFERENCE_TO chains.  Though, maybe it is ok
> as is when c_update_type_canonical adjusts the pointer types
> and variant types, those will be adjusted and as soon as we hit the first
> ARRAY_TYPE/FUNCTION_TYPE or other type_canon_hash using derived types and
> they aren't shared but have a new one created, further derived
> pointer/qualified types will be based on the new ones, not the old ones.
> 
> Another possibility is what I wrote in the PR, ensure the
> !TYPE_STRUCTURAL_EQUALITY_P (type) ->
> !TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)) implication alias.cc
> relies on is held.  Without the complete update of all derived types or
> Richi's patch, there is always a chance that some TYPE_CANONICAL (...) =
> assignment will get a TYPE_STRUCTURAL_EQUALITY_P type which was created
> while the base type was incomplete.  So, we could just check for that
> case and use conservative type as well.  I.e. after each TYPE_CANONICAL
> (...) = ... (except the TYPE_CANONICAL (t) = t; cases which are always
> fine) add
>   if (TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)))
>     SET_TYPE_STRUCTURAL_EQUALITY (type);
> That is conservatively fine.
> 
> And yet another option is to change alias.cc to handle the
> TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)) case the same
> as TYPE_STRUCTURAL_EQUALITY_P (type) instead of asserting it is not
> ok.
> 
> I think we'd get best generated code if we managed to update all derived
> types, Richi's patch is the second and the other changes result in even
> worse code.


A change that is not optimal but would avoid a lot of trouble is to
only use the tag of the struct for computing a TYPE_CANONICAL, which
could then be set already for incomplete types and never needs to
change again. We would not differentiate between different struct
types with the same tag for aliasing analysis, but in most cases
I would expect different structs to have a different tag.

Martin
Jakub Jelinek May 3, 2024, 6:18 p.m. UTC | #6
On Fri, May 03, 2024 at 08:04:18PM +0200, Martin Uecker wrote:
> A change that is not optimal but would avoid a lot of trouble is to
> only use the tag of the struct for computing a TYPE_CANONICAL, which
> could then be set already for incomplete types and never needs to
> change again. We would not differentiate between different struct
> types with the same tag for aliasing analysis, but in most cases
> I would expect different structs to have a different tag.

Having incompatible types have the same TYPE_CANONICAL would lead to wrong
code IMHO, while for aliasing purposes that might be conservative (though
not sure, the alias set computation is based on what types the element have
etc., so if the alias set is computed for say struct S { int s; }; and
then the same alias set used for struct S { long long a; double b; union {
short c; float d; } c; };, I think nothing good will come out of that),
but middle-end also uses TYPE_CANONICAL to see if types are the same,
say e.g. useless_type_conversion_p says that conversions from one
RECORD_TYPE to a different RECORD_TYPE are useless if they have the
same TYPE_CANONICAL.
  /* For aggregates we rely on TYPE_CANONICAL exclusively and require
     explicit conversions for types involving to be structurally
     compared types.  */
  else if (AGGREGATE_TYPE_P (inner_type)
           && TREE_CODE (inner_type) == TREE_CODE (outer_type))
    return TYPE_CANONICAL (inner_type)
           && TYPE_CANONICAL (inner_type) == TYPE_CANONICAL (outer_type);
So, if you have struct S { int s; } and struct S { short a, b; }; and
VIEW_CONVERT_EXPR between them, that VIEW_CONVERT_EXPR will be removed
as useless, etc.

BTW, the idea of lazily updating TYPE_CANONICAL is basically what I've
described as the option to update all the derived types where it would
pretty much do that for all TYPE_STRUCTURAL_EQUALITY_P types in the
hash table (see if they are derived from the type in question and recompute
the TYPE_CANONICAL after recomputing all the TYPE_CANONICAL of its base
types), except perhaps even more costly (if the trigger would be some
build_array_type/build_function_type/... function is called and found
a cached TYPE_STRUCTURAL_EQUALITY_P type).  Note also that
TYPE_STRUCTURAL_EQUALITY_P isn't the case just for the C23 types which
are marked that way when incomplete and later completed, but by various
other cases for types which will be permanently like that, so doing
expensive checks each time some build*_type* is called that refers
to those would be expensive.

	Jakub
Martin Uecker May 3, 2024, 6:37 p.m. UTC | #7
Am Freitag, dem 03.05.2024 um 20:18 +0200 schrieb Jakub Jelinek:
> On Fri, May 03, 2024 at 08:04:18PM +0200, Martin Uecker wrote:
> > A change that is not optimal but would avoid a lot of trouble is to
> > only use the tag of the struct for computing a TYPE_CANONICAL, which
> > could then be set already for incomplete types and never needs to
> > change again. We would not differentiate between different struct
> > types with the same tag for aliasing analysis, but in most cases
> > I would expect different structs to have a different tag.
> 
> Having incompatible types have the same TYPE_CANONICAL would lead to wrong
> code IMHO, while for aliasing purposes that might be conservative (though
> not sure, the alias set computation is based on what types the element have
> etc., so if the alias set is computed for say struct S { int s; }; and
> then the same alias set used for struct S { long long a; double b; union {
> short c; float d; } c; };, I think nothing good will come out of that),

The C type systems requires us to form equivalence classes though.
For example

int (*r)[1];
int (*q)[];
int (*p)[3];

need to be in the same equivalence class even though r and p are 
not compatible, while at the same time r and q and q and p
are compatible.


> but middle-end also uses TYPE_CANONICAL to see if types are the same,
> say e.g. useless_type_conversion_p says that conversions from one
> RECORD_TYPE to a different RECORD_TYPE are useless if they have the
> same TYPE_CANONICAL.
>   /* For aggregates we rely on TYPE_CANONICAL exclusively and require
>      explicit conversions for types involving to be structurally
>      compared types.  */
>   else if (AGGREGATE_TYPE_P (inner_type)
>            && TREE_CODE (inner_type) == TREE_CODE (outer_type))
>     return TYPE_CANONICAL (inner_type)
>            && TYPE_CANONICAL (inner_type) == TYPE_CANONICAL (outer_type);
> So, if you have struct S { int s; } and struct S { short a, b; }; and
> VIEW_CONVERT_EXPR between them, that VIEW_CONVERT_EXPR will be removed
> as useless, etc.

Maybe we could limit for purposes of computing TYPE_CANONICAL of derived
types, e.g. TYPE_CANONICAL of structs stays the same with the transition
from TYPE_STRUCT_EQUALITY to TYPE_CANONICAL but all the derived types
remain stable.

Martin

> 
> BTW, the idea of lazily updating TYPE_CANONICAL is basically what I've
> described as the option to update all the derived types where it would
> pretty much do that for all TYPE_STRUCTURAL_EQUALITY_P types in the
> hash table (see if they are derived from the type in question and recompute
> the TYPE_CANONICAL after recomputing all the TYPE_CANONICAL of its base
> types), except perhaps even more costly (if the trigger would be some
> build_array_type/build_function_type/... function is called and found
> a cached TYPE_STRUCTURAL_EQUALITY_P type).  Note also that
> TYPE_STRUCTURAL_EQUALITY_P isn't the case just for the C23 types which
> are marked that way when incomplete and later completed, but by various
> other cases for types which will be permanently like that, so doing
> expensive checks each time some build*_type* is called that refers
> to those would be expensive.
> 
> 	Jakub
>
Richard Biener May 3, 2024, 6:48 p.m. UTC | #8
> Am 03.05.2024 um 20:37 schrieb Martin Uecker <uecker@tugraz.at>:
> 
> Am Freitag, dem 03.05.2024 um 20:18 +0200 schrieb Jakub Jelinek:
>>> On Fri, May 03, 2024 at 08:04:18PM +0200, Martin Uecker wrote:
>>> A change that is not optimal but would avoid a lot of trouble is to
>>> only use the tag of the struct for computing a TYPE_CANONICAL, which
>>> could then be set already for incomplete types and never needs to
>>> change again. We would not differentiate between different struct
>>> types with the same tag for aliasing analysis, but in most cases
>>> I would expect different structs to have a different tag.
>> 
>> Having incompatible types have the same TYPE_CANONICAL would lead to wrong
>> code IMHO, while for aliasing purposes that might be conservative (though
>> not sure, the alias set computation is based on what types the element have
>> etc., so if the alias set is computed for say struct S { int s; }; and
>> then the same alias set used for struct S { long long a; double b; union {
>> short c; float d; } c; };, I think nothing good will come out of that),
> 
> The C type systems requires us to form equivalence classes though.
> For example
> 
> int (*r)[1];
> int (*q)[];
> int (*p)[3];
> 
> need to be in the same equivalence class even though r and p are
> not compatible, while at the same time r and q and q and p
> are compatible.

TYPE_CANONICAL as used by the middle-end cannot express this but useless_type_conversion_p is directed and has similar behavior.  Note the dual-use for TBAA and compatibility was convenient but maybe we have to separate both since making the equivalence class for TBAA larger is more conservative while for compatibility it’s the other way around…

Richard 

> 
>> but middle-end also uses TYPE_CANONICAL to see if types are the same,
>> say e.g. useless_type_conversion_p says that conversions from one
>> RECORD_TYPE to a different RECORD_TYPE are useless if they have the
>> same TYPE_CANONICAL.
>>  /* For aggregates we rely on TYPE_CANONICAL exclusively and require
>>     explicit conversions for types involving to be structurally
>>     compared types.  */
>>  else if (AGGREGATE_TYPE_P (inner_type)
>>           && TREE_CODE (inner_type) == TREE_CODE (outer_type))
>>    return TYPE_CANONICAL (inner_type)
>>           && TYPE_CANONICAL (inner_type) == TYPE_CANONICAL (outer_type);
>> So, if you have struct S { int s; } and struct S { short a, b; }; and
>> VIEW_CONVERT_EXPR between them, that VIEW_CONVERT_EXPR will be removed
>> as useless, etc.
> 
> Maybe we could limit for purposes of computing TYPE_CANONICAL of derived
> types, e.g. TYPE_CANONICAL of structs stays the same with the transition
> from TYPE_STRUCT_EQUALITY to TYPE_CANONICAL but all the derived types
> remain stable.
> 
> Martin
> 
>> 
>> BTW, the idea of lazily updating TYPE_CANONICAL is basically what I've
>> described as the option to update all the derived types where it would
>> pretty much do that for all TYPE_STRUCTURAL_EQUALITY_P types in the
>> hash table (see if they are derived from the type in question and recompute
>> the TYPE_CANONICAL after recomputing all the TYPE_CANONICAL of its base
>> types), except perhaps even more costly (if the trigger would be some
>> build_array_type/build_function_type/... function is called and found
>> a cached TYPE_STRUCTURAL_EQUALITY_P type).  Note also that
>> TYPE_STRUCTURAL_EQUALITY_P isn't the case just for the C23 types which
>> are marked that way when incomplete and later completed, but by various
>> other cases for types which will be permanently like that, so doing
>> expensive checks each time some build*_type* is called that refers
>> to those would be expensive.
>> 
>>    Jakub
>> 
>
Martin Uecker May 3, 2024, 7:11 p.m. UTC | #9
Am Freitag, dem 03.05.2024 um 20:48 +0200 schrieb Richard Biener:
> 
> > Am 03.05.2024 um 20:37 schrieb Martin Uecker <uecker@tugraz.at>:
> > 
> > Am Freitag, dem 03.05.2024 um 20:18 +0200 schrieb Jakub Jelinek:
> > > > On Fri, May 03, 2024 at 08:04:18PM +0200, Martin Uecker wrote:
> > > > A change that is not optimal but would avoid a lot of trouble is to
> > > > only use the tag of the struct for computing a TYPE_CANONICAL, which
> > > > could then be set already for incomplete types and never needs to
> > > > change again. We would not differentiate between different struct
> > > > types with the same tag for aliasing analysis, but in most cases
> > > > I would expect different structs to have a different tag.
> > > 
> > > Having incompatible types have the same TYPE_CANONICAL would lead to wrong
> > > code IMHO, while for aliasing purposes that might be conservative (though
> > > not sure, the alias set computation is based on what types the element have
> > > etc., so if the alias set is computed for say struct S { int s; }; and
> > > then the same alias set used for struct S { long long a; double b; union {
> > > short c; float d; } c; };, I think nothing good will come out of that),
> > 
> > The C type systems requires us to form equivalence classes though.
> > For example
> > 
> > int (*r)[1];
> > int (*q)[];
> > int (*p)[3];
> > 
> > need to be in the same equivalence class even though r and p are
> > not compatible, while at the same time r and q and q and p
> > are compatible.
> 
> TYPE_CANONICAL as used by the middle-end cannot express this but

Hm. so how does it work now for arrays?


> useless_type_conversion_p is directed and has similar behavior. 
> Note the dual-use for TBAA and compatibility was convenient but
> maybe we have to separate both since making the equivalence class
> for TBAA larger is more conservative while for compatibility it’s
> the other way around…

Maybe, although I do not understand why the middle end would
need precise knowledge for checking type compatibility?  The
FE has much stricter rules. 

Martin

> 
> Richard 
> 
> > 
> > > but middle-end also uses TYPE_CANONICAL to see if types are the same,
> > > say e.g. useless_type_conversion_p says that conversions from one
> > > RECORD_TYPE to a different RECORD_TYPE are useless if they have the
> > > same TYPE_CANONICAL.
> > >  /* For aggregates we rely on TYPE_CANONICAL exclusively and require
> > >     explicit conversions for types involving to be structurally
> > >     compared types.  */
> > >  else if (AGGREGATE_TYPE_P (inner_type)
> > >           && TREE_CODE (inner_type) == TREE_CODE (outer_type))
> > >    return TYPE_CANONICAL (inner_type)
> > >           && TYPE_CANONICAL (inner_type) == TYPE_CANONICAL (outer_type);
> > > So, if you have struct S { int s; } and struct S { short a, b; }; and
> > > VIEW_CONVERT_EXPR between them, that VIEW_CONVERT_EXPR will be removed
> > > as useless, etc.
> > 
> > Maybe we could limit for purposes of computing TYPE_CANONICAL of derived
> > types, e.g. TYPE_CANONICAL of structs stays the same with the transition
> > from TYPE_STRUCT_EQUALITY to TYPE_CANONICAL but all the derived types
> > remain stable.
> > 
> > Martin
> > 
> > > 
> > > BTW, the idea of lazily updating TYPE_CANONICAL is basically what I've
> > > described as the option to update all the derived types where it would
> > > pretty much do that for all TYPE_STRUCTURAL_EQUALITY_P types in the
> > > hash table (see if they are derived from the type in question and recompute
> > > the TYPE_CANONICAL after recomputing all the TYPE_CANONICAL of its base
> > > types), except perhaps even more costly (if the trigger would be some
> > > build_array_type/build_function_type/... function is called and found
> > > a cached TYPE_STRUCTURAL_EQUALITY_P type).  Note also that
> > > TYPE_STRUCTURAL_EQUALITY_P isn't the case just for the C23 types which
> > > are marked that way when incomplete and later completed, but by various
> > > other cases for types which will be permanently like that, so doing
> > > expensive checks each time some build*_type* is called that refers
> > > to those would be expensive.
> > > 
> > >    Jakub
> > > 
> >
Jakub Jelinek May 3, 2024, 7:16 p.m. UTC | #10
On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > TYPE_CANONICAL as used by the middle-end cannot express this but
> 
> Hm. so how does it work now for arrays?

Do you have a testcase which doesn't work correctly with the arrays?

E.g. same_type_for_tbaa has
  type1 = TYPE_MAIN_VARIANT (type1);
  type2 = TYPE_MAIN_VARIANT (type2);

  /* Handle the most common case first.  */
  if (type1 == type2)
    return 1;

  /* If we would have to do structural comparison bail out.  */
  if (TYPE_STRUCTURAL_EQUALITY_P (type1)
      || TYPE_STRUCTURAL_EQUALITY_P (type2))
    return -1;

  /* Compare the canonical types.  */
  if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
    return 1;

  /* ??? Array types are not properly unified in all cases as we have
     spurious changes in the index types for example.  Removing this
     causes all sorts of problems with the Fortran frontend.  */
  if (TREE_CODE (type1) == ARRAY_TYPE
      && TREE_CODE (type2) == ARRAY_TYPE)
    return -1;
...
and later compares alias sets and the like.
So, even if int[] and int[0] have different TYPE_CANONICAL, they
will be considered maybe the same.  Also, guess get_alias_set
has some ARRAY_TYPE handling...

Anyway, I think we should just go with Richi's patch.

	Jakub
Martin Uecker May 4, 2024, 6:38 a.m. UTC | #11
Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > TYPE_CANONICAL as used by the middle-end cannot express this but
> > > > 
> > > > Hm. so how does it work now for arrays?
> > 
> > Do you have a testcase which doesn't work correctly with the arrays?

I am mostly trying to understand better how this works. But
if I am not mistaken, the following example would indeed
indicate that we do incorrect aliasing decisions for types
derived from arrays:

https://godbolt.org/z/rTsE3PhKc

Martin

> > 
> > E.g. same_type_for_tbaa has
> >   type1 = TYPE_MAIN_VARIANT (type1);
> >   type2 = TYPE_MAIN_VARIANT (type2);
> > 
> >   /* Handle the most common case first.  */
> >   if (type1 == type2)
> >     return 1;
> > 
> >   /* If we would have to do structural comparison bail out.  */
> >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> >       || TYPE_STRUCTURAL_EQUALITY_P (type2))
> >     return -1;
> > 
> >   /* Compare the canonical types.  */
> >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> >     return 1;
> > 
> >   /* ??? Array types are not properly unified in all cases as we have
> >      spurious changes in the index types for example.  Removing this
> >      causes all sorts of problems with the Fortran frontend.  */
> >   if (TREE_CODE (type1) == ARRAY_TYPE
> >       && TREE_CODE (type2) == ARRAY_TYPE)
> >     return -1;
> > ...
> > and later compares alias sets and the like.
> > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > will be considered maybe the same.  Also, guess get_alias_set
> > has some ARRAY_TYPE handling...
> > 
> > Anyway, I think we should just go with Richi's patch.
> > 
> > 	Jakub
> >
Richard Biener May 6, 2024, 7 a.m. UTC | #12
On Sat, 4 May 2024, Martin Uecker wrote:

> Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > TYPE_CANONICAL as used by the middle-end cannot express this but
> > > > > 
> > > > > Hm. so how does it work now for arrays?
> > > 
> > > Do you have a testcase which doesn't work correctly with the arrays?
> 
> I am mostly trying to understand better how this works. But
> if I am not mistaken, the following example would indeed
> indicate that we do incorrect aliasing decisions for types
> derived from arrays:
> 
> https://godbolt.org/z/rTsE3PhKc

This example is about pointer-to-array types, int (*)[2] and
int (*)[1] are supposed to be compatible as in receive the same alias
set.  This is ensured by get_alias_set POINTER_TYPE_P handling,
the alias set is supposed to be the same as that of int *.  It seems
we do restrict the handling a bit, the code does

      /* Unnest all pointers and references.
         We also want to make pointer to array/vector equivalent to 
pointer to
         its element (see the reasoning above). Skip all those types, too.  
*/
      for (p = t; POINTER_TYPE_P (p)
           || (TREE_CODE (p) == ARRAY_TYPE
               && (!TYPE_NONALIASED_COMPONENT (p)
                   || !COMPLETE_TYPE_P (p)
                   || TYPE_STRUCTURAL_EQUALITY_P (p)))
           || TREE_CODE (p) == VECTOR_TYPE;
           p = TREE_TYPE (p))

where the comment doesn't exactly match the code - but C should
never have TYPE_NONALIASED_COMPONENT (p).

But maybe I misread the example or it goes wrong elsewhere.

Richard.

> Martin
> 
> > > 
> > > E.g. same_type_for_tbaa has
> > >   type1 = TYPE_MAIN_VARIANT (type1);
> > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > 
> > >   /* Handle the most common case first.  */
> > >   if (type1 == type2)
> > >     return 1;
> > > 
> > >   /* If we would have to do structural comparison bail out.  */
> > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > >       || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > >     return -1;
> > > 
> > >   /* Compare the canonical types.  */
> > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > >     return 1;
> > > 
> > >   /* ??? Array types are not properly unified in all cases as we have
> > >      spurious changes in the index types for example.  Removing this
> > >      causes all sorts of problems with the Fortran frontend.  */
> > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > >       && TREE_CODE (type2) == ARRAY_TYPE)
> > >     return -1;
> > > ...
> > > and later compares alias sets and the like.
> > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > will be considered maybe the same.  Also, guess get_alias_set
> > > has some ARRAY_TYPE handling...
> > > 
> > > Anyway, I think we should just go with Richi's patch.
> > > 
> > > 	Jakub
> > > 
> 
> 
>
Martin Uecker May 6, 2024, 7:27 a.m. UTC | #13
Am Montag, dem 06.05.2024 um 09:00 +0200 schrieb Richard Biener:
> On Sat, 4 May 2024, Martin Uecker wrote:
> 
> > Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > > TYPE_CANONICAL as used by the middle-end cannot express this but
> > > > > > 
> > > > > > Hm. so how does it work now for arrays?
> > > > 
> > > > Do you have a testcase which doesn't work correctly with the arrays?
> > 
> > I am mostly trying to understand better how this works. But
> > if I am not mistaken, the following example would indeed
> > indicate that we do incorrect aliasing decisions for types
> > derived from arrays:
> > 
> > https://godbolt.org/z/rTsE3PhKc
> 
> This example is about pointer-to-array types, int (*)[2] and
> int (*)[1] are supposed to be compatible as in receive the same alias
> set. 

In C, char (*)[2] and char (*)[1] are not compatible. But with
COMPAT set, the example operates^1 with char (*)[] and char (*)[1]
which are compatible.  If we form equivalence classes, then
all three types would need to be treated as equivalent. 

^1 Actually, pointer to functions returning pointers
to arrays. Probably this example can still be simplified...

>  This is ensured by get_alias_set POINTER_TYPE_P handling,
> the alias set is supposed to be the same as that of int *.  It seems
> we do restrict the handling a bit, the code does
> 
>       /* Unnest all pointers and references.
>          We also want to make pointer to array/vector equivalent to 
> pointer to
>          its element (see the reasoning above). Skip all those types, too.  
> */
>       for (p = t; POINTER_TYPE_P (p)
>            || (TREE_CODE (p) == ARRAY_TYPE
>                && (!TYPE_NONALIASED_COMPONENT (p)
>                    || !COMPLETE_TYPE_P (p)
>                    || TYPE_STRUCTURAL_EQUALITY_P (p)))
>            || TREE_CODE (p) == VECTOR_TYPE;
>            p = TREE_TYPE (p))
> 
> where the comment doesn't exactly match the code - but C should
> never have TYPE_NONALIASED_COMPONENT (p).
> 
> But maybe I misread the example or it goes wrong elsewhere.

If I am not confusing myself too much, the example shows that
aliasing analysis treats the the types as incompatible in
both cases, because it does not reload *a with -O2. 

For char (*)[1] and char (*)[2] this would be correct (but an
implementation exploiting this would need to do structural
comparisons and not equivalence classes) but for 
char (*)[2] and char (*)[] it is not.

Martin


> 
> Richard.
> 
> > Martin
> > 
> > > > 
> > > > E.g. same_type_for_tbaa has
> > > >   type1 = TYPE_MAIN_VARIANT (type1);
> > > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > > 
> > > >   /* Handle the most common case first.  */
> > > >   if (type1 == type2)
> > > >     return 1;
> > > > 
> > > >   /* If we would have to do structural comparison bail out.  */
> > > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > > >       || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > > >     return -1;
> > > > 
> > > >   /* Compare the canonical types.  */
> > > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > >     return 1;
> > > > 
> > > >   /* ??? Array types are not properly unified in all cases as we have
> > > >      spurious changes in the index types for example.  Removing this
> > > >      causes all sorts of problems with the Fortran frontend.  */
> > > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > > >       && TREE_CODE (type2) == ARRAY_TYPE)
> > > >     return -1;
> > > > ...
> > > > and later compares alias sets and the like.
> > > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > > will be considered maybe the same.  Also, guess get_alias_set
> > > > has some ARRAY_TYPE handling...
> > > > 
> > > > Anyway, I think we should just go with Richi's patch.
> > > > 
> > > > 	Jakub
> > > > 
> > 
> > 
> > 
>
Richard Biener May 6, 2024, 9:07 a.m. UTC | #14
On Mon, 6 May 2024, Martin Uecker wrote:

> Am Montag, dem 06.05.2024 um 09:00 +0200 schrieb Richard Biener:
> > On Sat, 4 May 2024, Martin Uecker wrote:
> > 
> > > Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > > > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > > > TYPE_CANONICAL as used by the middle-end cannot express this but
> > > > > > > 
> > > > > > > Hm. so how does it work now for arrays?
> > > > > 
> > > > > Do you have a testcase which doesn't work correctly with the arrays?
> > > 
> > > I am mostly trying to understand better how this works. But
> > > if I am not mistaken, the following example would indeed
> > > indicate that we do incorrect aliasing decisions for types
> > > derived from arrays:
> > > 
> > > https://godbolt.org/z/rTsE3PhKc
> > 
> > This example is about pointer-to-array types, int (*)[2] and
> > int (*)[1] are supposed to be compatible as in receive the same alias
> > set. 
> 
> In C, char (*)[2] and char (*)[1] are not compatible. But with
> COMPAT set, the example operates^1 with char (*)[] and char (*)[1]
> which are compatible.  If we form equivalence classes, then
> all three types would need to be treated as equivalent. 
> 
> ^1 Actually, pointer to functions returning pointers
> to arrays. Probably this example can still be simplified...
> 
> >  This is ensured by get_alias_set POINTER_TYPE_P handling,
> > the alias set is supposed to be the same as that of int *.  It seems
> > we do restrict the handling a bit, the code does
> > 
> >       /* Unnest all pointers and references.
> >          We also want to make pointer to array/vector equivalent to 
> > pointer to
> >          its element (see the reasoning above). Skip all those types, too.  
> > */
> >       for (p = t; POINTER_TYPE_P (p)
> >            || (TREE_CODE (p) == ARRAY_TYPE
> >                && (!TYPE_NONALIASED_COMPONENT (p)
> >                    || !COMPLETE_TYPE_P (p)
> >                    || TYPE_STRUCTURAL_EQUALITY_P (p)))
> >            || TREE_CODE (p) == VECTOR_TYPE;
> >            p = TREE_TYPE (p))
> > 
> > where the comment doesn't exactly match the code - but C should
> > never have TYPE_NONALIASED_COMPONENT (p).
> > 
> > But maybe I misread the example or it goes wrong elsewhere.
> 
> If I am not confusing myself too much, the example shows that
> aliasing analysis treats the the types as incompatible in
> both cases, because it does not reload *a with -O2. 
> 
> For char (*)[1] and char (*)[2] this would be correct (but an
> implementation exploiting this would need to do structural
> comparisons and not equivalence classes) but for 
> char (*)[2] and char (*)[] it is not.

Oh, these are function pointers, so it's about the alias set of
a pointer to FUNCTION_TYPE.  I don't see any particular code
trying to make char[] * (*)() and char[1] *(*)() inter-operate
for TBAA iff the FUNCTION_TYPEs themselves are not having the
same TYPE_CANONICAL.

Can you open a bugreport and please point to the relevant parts
of the C standard that tells how pointer-to FUNCTION_TYPE TBAA
is supposed to work?

Thanks,
Richard.

> Martin
> 
> 
> > 
> > Richard.
> > 
> > > Martin
> > > 
> > > > > 
> > > > > E.g. same_type_for_tbaa has
> > > > >   type1 = TYPE_MAIN_VARIANT (type1);
> > > > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > > > 
> > > > >   /* Handle the most common case first.  */
> > > > >   if (type1 == type2)
> > > > >     return 1;
> > > > > 
> > > > >   /* If we would have to do structural comparison bail out.  */
> > > > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > > > >       || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > > > >     return -1;
> > > > > 
> > > > >   /* Compare the canonical types.  */
> > > > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > > >     return 1;
> > > > > 
> > > > >   /* ??? Array types are not properly unified in all cases as we have
> > > > >      spurious changes in the index types for example.  Removing this
> > > > >      causes all sorts of problems with the Fortran frontend.  */
> > > > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > > > >       && TREE_CODE (type2) == ARRAY_TYPE)
> > > > >     return -1;
> > > > > ...
> > > > > and later compares alias sets and the like.
> > > > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > > > will be considered maybe the same.  Also, guess get_alias_set
> > > > > has some ARRAY_TYPE handling...
> > > > > 
> > > > > Anyway, I think we should just go with Richi's patch.
> > > > > 
> > > > > 	Jakub
> > > > > 
> > > 
> > > 
> > > 
> > 
> 
>
Martin Uecker May 6, 2024, 11:20 a.m. UTC | #15
Am Montag, dem 06.05.2024 um 11:07 +0200 schrieb Richard Biener:
> On Mon, 6 May 2024, Martin Uecker wrote:
> 
> > Am Montag, dem 06.05.2024 um 09:00 +0200 schrieb Richard Biener:
> > > On Sat, 4 May 2024, Martin Uecker wrote:
> > > 
> > > > Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > > > > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > > > > TYPE_CANONICAL as used by the middle-end cannot express this but
> > > > > > > > 
> > > > > > > > Hm. so how does it work now for arrays?
> > > > > > 
> > > > > > Do you have a testcase which doesn't work correctly with the arrays?
> > > > 
> > > > I am mostly trying to understand better how this works. But
> > > > if I am not mistaken, the following example would indeed
> > > > indicate that we do incorrect aliasing decisions for types
> > > > derived from arrays:
> > > > 
> > > > https://godbolt.org/z/rTsE3PhKc
> > > 
> > > This example is about pointer-to-array types, int (*)[2] and
> > > int (*)[1] are supposed to be compatible as in receive the same alias
> > > set. 
> > 
> > In C, char (*)[2] and char (*)[1] are not compatible. But with
> > COMPAT set, the example operates^1 with char (*)[] and char (*)[1]
> > which are compatible.  If we form equivalence classes, then
> > all three types would need to be treated as equivalent. 
> > 
> > ^1 Actually, pointer to functions returning pointers
> > to arrays. Probably this example can still be simplified...
> > 
> > >  This is ensured by get_alias_set POINTER_TYPE_P handling,
> > > the alias set is supposed to be the same as that of int *.  It seems
> > > we do restrict the handling a bit, the code does
> > > 
> > >       /* Unnest all pointers and references.
> > >          We also want to make pointer to array/vector equivalent to 
> > > pointer to
> > >          its element (see the reasoning above). Skip all those types, too.  
> > > */
> > >       for (p = t; POINTER_TYPE_P (p)
> > >            || (TREE_CODE (p) == ARRAY_TYPE
> > >                && (!TYPE_NONALIASED_COMPONENT (p)
> > >                    || !COMPLETE_TYPE_P (p)
> > >                    || TYPE_STRUCTURAL_EQUALITY_P (p)))
> > >            || TREE_CODE (p) == VECTOR_TYPE;
> > >            p = TREE_TYPE (p))
> > > 
> > > where the comment doesn't exactly match the code - but C should
> > > never have TYPE_NONALIASED_COMPONENT (p).
> > > 
> > > But maybe I misread the example or it goes wrong elsewhere.
> > 
> > If I am not confusing myself too much, the example shows that
> > aliasing analysis treats the the types as incompatible in
> > both cases, because it does not reload *a with -O2. 
> > 
> > For char (*)[1] and char (*)[2] this would be correct (but an
> > implementation exploiting this would need to do structural
> > comparisons and not equivalence classes) but for 
> > char (*)[2] and char (*)[] it is not.
> 
> Oh, these are function pointers, so it's about the alias set of
> a pointer to FUNCTION_TYPE.  I don't see any particular code
> trying to make char[] * (*)() and char[1] *(*)() inter-operate
> for TBAA iff the FUNCTION_TYPEs themselves are not having the
> same TYPE_CANONICAL.
> 
> Can you open a bugreport and please point to the relevant parts
> of the C standard that tells how pointer-to FUNCTION_TYPE TBAA
> is supposed to work?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114959

Martin
> 

> Thanks,
> Richard.
> 
> > Martin
> > 
> > 
> > > 
> > > Richard.
> > > 
> > > > Martin
> > > > 
> > > > > > 
> > > > > > E.g. same_type_for_tbaa has
> > > > > >   type1 = TYPE_MAIN_VARIANT (type1);
> > > > > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > > > > 
> > > > > >   /* Handle the most common case first.  */
> > > > > >   if (type1 == type2)
> > > > > >     return 1;
> > > > > > 
> > > > > >   /* If we would have to do structural comparison bail out.  */
> > > > > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > > > > >       || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > > > > >     return -1;
> > > > > > 
> > > > > >   /* Compare the canonical types.  */
> > > > > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > > > >     return 1;
> > > > > > 
> > > > > >   /* ??? Array types are not properly unified in all cases as we have
> > > > > >      spurious changes in the index types for example.  Removing this
> > > > > >      causes all sorts of problems with the Fortran frontend.  */
> > > > > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > > > > >       && TREE_CODE (type2) == ARRAY_TYPE)
> > > > > >     return -1;
> > > > > > ...
> > > > > > and later compares alias sets and the like.
> > > > > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > > > > will be considered maybe the same.  Also, guess get_alias_set
> > > > > > has some ARRAY_TYPE handling...
> > > > > > 
> > > > > > Anyway, I think we should just go with Richi's patch.
> > > > > > 
> > > > > > 	Jakub
> > > > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > 
> > 
>
Richard Biener May 7, 2024, 11:05 a.m. UTC | #16
On Mon, 6 May 2024, Martin Uecker wrote:

> Am Montag, dem 06.05.2024 um 11:07 +0200 schrieb Richard Biener:
> > On Mon, 6 May 2024, Martin Uecker wrote:
> > 
> > > Am Montag, dem 06.05.2024 um 09:00 +0200 schrieb Richard Biener:
> > > > On Sat, 4 May 2024, Martin Uecker wrote:
> > > > 
> > > > > Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > > > > > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > > > > > TYPE_CANONICAL as used by the middle-end cannot express this but
> > > > > > > > > 
> > > > > > > > > Hm. so how does it work now for arrays?
> > > > > > > 
> > > > > > > Do you have a testcase which doesn't work correctly with the arrays?
> > > > > 
> > > > > I am mostly trying to understand better how this works. But
> > > > > if I am not mistaken, the following example would indeed
> > > > > indicate that we do incorrect aliasing decisions for types
> > > > > derived from arrays:
> > > > > 
> > > > > https://godbolt.org/z/rTsE3PhKc
> > > > 
> > > > This example is about pointer-to-array types, int (*)[2] and
> > > > int (*)[1] are supposed to be compatible as in receive the same alias
> > > > set. 
> > > 
> > > In C, char (*)[2] and char (*)[1] are not compatible. But with
> > > COMPAT set, the example operates^1 with char (*)[] and char (*)[1]
> > > which are compatible.  If we form equivalence classes, then
> > > all three types would need to be treated as equivalent. 
> > > 
> > > ^1 Actually, pointer to functions returning pointers
> > > to arrays. Probably this example can still be simplified...
> > > 
> > > >  This is ensured by get_alias_set POINTER_TYPE_P handling,
> > > > the alias set is supposed to be the same as that of int *.  It seems
> > > > we do restrict the handling a bit, the code does
> > > > 
> > > >       /* Unnest all pointers and references.
> > > >          We also want to make pointer to array/vector equivalent to 
> > > > pointer to
> > > >          its element (see the reasoning above). Skip all those types, too.  
> > > > */
> > > >       for (p = t; POINTER_TYPE_P (p)
> > > >            || (TREE_CODE (p) == ARRAY_TYPE
> > > >                && (!TYPE_NONALIASED_COMPONENT (p)
> > > >                    || !COMPLETE_TYPE_P (p)
> > > >                    || TYPE_STRUCTURAL_EQUALITY_P (p)))
> > > >            || TREE_CODE (p) == VECTOR_TYPE;
> > > >            p = TREE_TYPE (p))
> > > > 
> > > > where the comment doesn't exactly match the code - but C should
> > > > never have TYPE_NONALIASED_COMPONENT (p).
> > > > 
> > > > But maybe I misread the example or it goes wrong elsewhere.
> > > 
> > > If I am not confusing myself too much, the example shows that
> > > aliasing analysis treats the the types as incompatible in
> > > both cases, because it does not reload *a with -O2. 
> > > 
> > > For char (*)[1] and char (*)[2] this would be correct (but an
> > > implementation exploiting this would need to do structural
> > > comparisons and not equivalence classes) but for 
> > > char (*)[2] and char (*)[] it is not.
> > 
> > Oh, these are function pointers, so it's about the alias set of
> > a pointer to FUNCTION_TYPE.  I don't see any particular code
> > trying to make char[] * (*)() and char[1] *(*)() inter-operate
> > for TBAA iff the FUNCTION_TYPEs themselves are not having the
> > same TYPE_CANONICAL.
> > 
> > Can you open a bugreport and please point to the relevant parts
> > of the C standard that tells how pointer-to FUNCTION_TYPE TBAA
> > is supposed to work?
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114959

I have now pushed the patch after consulting with Jakub.

Richard.

> Martin
> > 
> 
> > Thanks,
> > Richard.
> > 
> > > Martin
> > > 
> > > 
> > > > 
> > > > Richard.
> > > > 
> > > > > Martin
> > > > > 
> > > > > > > 
> > > > > > > E.g. same_type_for_tbaa has
> > > > > > >   type1 = TYPE_MAIN_VARIANT (type1);
> > > > > > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > > > > > 
> > > > > > >   /* Handle the most common case first.  */
> > > > > > >   if (type1 == type2)
> > > > > > >     return 1;
> > > > > > > 
> > > > > > >   /* If we would have to do structural comparison bail out.  */
> > > > > > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > > > > > >       || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > > > > > >     return -1;
> > > > > > > 
> > > > > > >   /* Compare the canonical types.  */
> > > > > > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > > > > >     return 1;
> > > > > > > 
> > > > > > >   /* ??? Array types are not properly unified in all cases as we have
> > > > > > >      spurious changes in the index types for example.  Removing this
> > > > > > >      causes all sorts of problems with the Fortran frontend.  */
> > > > > > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > > > > > >       && TREE_CODE (type2) == ARRAY_TYPE)
> > > > > > >     return -1;
> > > > > > > ...
> > > > > > > and later compares alias sets and the like.
> > > > > > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > > > > > will be considered maybe the same.  Also, guess get_alias_set
> > > > > > > has some ARRAY_TYPE handling...
> > > > > > > 
> > > > > > > Anyway, I think we should just go with Richi's patch.
> > > > > > > 
> > > > > > > 	Jakub
> > > > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > 
> > > 
> > 
> 
>
diff mbox series

Patch

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 12ffc5f170a..3ab0b0fd87a 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1336,6 +1336,16 @@  build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
       tree dtype = ntype = build_distinct_type_copy (ttype);
 
       TYPE_ATTRIBUTES (ntype) = attribute;
+      /* If the target-dependent attributes make NTYPE different from
+	 its canonical type, we will need to use structural equality
+	 checks for this type.
+
+	 We shouldn't get here for stripping attributes from a type;
+	 the no-attribute type might not need structural comparison.  But
+	 we can if was discarded from type_hash_table.  */
+      if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
+	  || !comp_type_attributes (ntype, ttype))
+	SET_TYPE_STRUCTURAL_EQUALITY (ntype);
 
       hashval_t hash = type_hash_canon_hash (ntype);
       ntype = type_hash_canon (hash, ntype);
@@ -1343,16 +1353,6 @@  build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
       if (ntype != dtype)
 	/* This variant was already in the hash table, don't mess with
 	   TYPE_CANONICAL.  */;
-      else if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
-	       || !comp_type_attributes (ntype, ttype))
-	/* If the target-dependent attributes make NTYPE different from
-	   its canonical type, we will need to use structural equality
-	   checks for this type.
-
-	   We shouldn't get here for stripping attributes from a type;
-	   the no-attribute type might not need structural comparison.  But
-	   we can if was discarded from type_hash_table.  */
-	SET_TYPE_STRUCTURAL_EQUALITY (ntype);
       else if (TYPE_CANONICAL (ntype) == ntype)
 	TYPE_CANONICAL (ntype) = TYPE_CANONICAL (ttype);
 
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 01e3d247fc2..032dcb4b41d 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -7115,6 +7115,13 @@  complete_array_type (tree *ptype, tree initial_value, bool do_default)
   TYPE_TYPELESS_STORAGE (main_type) = TYPE_TYPELESS_STORAGE (type);
   layout_type (main_type);
 
+  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
+  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
+      || TYPE_STRUCTURAL_EQUALITY_P (TYPE_DOMAIN (main_type)))
+    SET_TYPE_STRUCTURAL_EQUALITY (main_type);
+  else
+    TYPE_CANONICAL (main_type) = main_type;
+
   /* Make sure we have the canonical MAIN_TYPE. */
   hashval_t hashcode = type_hash_canon_hash (main_type);
   main_type = type_hash_canon (hashcode, main_type);
@@ -7122,7 +7129,7 @@  complete_array_type (tree *ptype, tree initial_value, bool do_default)
   /* Fix the canonical type.  */
   if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
       || TYPE_STRUCTURAL_EQUALITY_P (TYPE_DOMAIN (main_type)))
-    SET_TYPE_STRUCTURAL_EQUALITY (main_type);
+    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (main_type));
   else if (TYPE_CANONICAL (TREE_TYPE (main_type)) != TREE_TYPE (main_type)
 	   || (TYPE_CANONICAL (TYPE_DOMAIN (main_type))
 	       != TYPE_DOMAIN (main_type)))
@@ -7130,8 +7137,6 @@  complete_array_type (tree *ptype, tree initial_value, bool do_default)
       = build_array_type (TYPE_CANONICAL (TREE_TYPE (main_type)),
 			  TYPE_CANONICAL (TYPE_DOMAIN (main_type)),
 			  TYPE_TYPELESS_STORAGE (main_type));
-  else
-    TYPE_CANONICAL (main_type) = main_type;
 
   if (quals == 0)
     type = main_type;
diff --git a/gcc/testsuite/gcc.dg/pr114931.c b/gcc/testsuite/gcc.dg/pr114931.c
new file mode 100644
index 00000000000..d690ed70e52
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114931.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-std=c23" } */
+
+struct Tcl_Obj;
+void(Tcl_FreeInternalRepProc)(struct Tcl_Obj *);
+typedef struct Tcl_Obj {
+} Tcl_Obj;
+struct {
+  void (*tclFreeObj)(Tcl_Obj *);
+} Tcl_InitStubs;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 780662549fe..6564b002dc1 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -6012,6 +6012,8 @@  type_hash_canon_hash (tree type)
 
   hstate.add_int (TREE_CODE (type));
 
+  hstate.add_flag (TYPE_STRUCTURAL_EQUALITY_P (type));
+
   if (TREE_TYPE (type))
     hstate.add_object (TYPE_HASH (TREE_TYPE (type)));
 
@@ -6109,6 +6111,10 @@  type_cache_hasher::equal (type_hash *a, type_hash *b)
 	  || TYPE_MODE (a->type) != TYPE_MODE (b->type)))
     return false;
 
+  if (TYPE_STRUCTURAL_EQUALITY_P (a->type)
+      != TYPE_STRUCTURAL_EQUALITY_P (b->type))
+    return false;
+
   switch (TREE_CODE (a->type))
     {
     case VOID_TYPE:
@@ -7347,6 +7353,14 @@  build_array_type_1 (tree elt_type, tree index_type, bool typeless_storage,
   TYPE_DOMAIN (t) = index_type;
   TYPE_ADDR_SPACE (t) = TYPE_ADDR_SPACE (elt_type);
   TYPE_TYPELESS_STORAGE (t) = typeless_storage;
+
+  /* Set TYPE_STRUCTURAL_EQUALITY_P.  */
+  if (set_canonical
+      && (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
+	  || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
+	  || in_lto_p))
+    SET_TYPE_STRUCTURAL_EQUALITY (t);
+
   layout_type (t);
 
   if (shared)
@@ -7363,7 +7377,7 @@  build_array_type_1 (tree elt_type, tree index_type, bool typeless_storage,
       if (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
 	  || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
 	  || in_lto_p)
-	SET_TYPE_STRUCTURAL_EQUALITY (t);
+	gcc_unreachable ();
       else if (TYPE_CANONICAL (elt_type) != elt_type
 	       || (index_type && TYPE_CANONICAL (index_type) != index_type))
 	TYPE_CANONICAL (t)
@@ -7510,21 +7524,25 @@  build_function_type (tree value_type, tree arg_types,
       TYPE_NO_NAMED_ARGS_STDARG_P (t) = 1;
     }
 
-  /* If we already have such a type, use the old one.  */
-  hashval_t hash = type_hash_canon_hash (t);
-  tree probe_type = t;
-  t = type_hash_canon (hash, t);
-  if (t != probe_type)
-    return t;
-
   /* Set up the canonical type. */
   any_structural_p   = TYPE_STRUCTURAL_EQUALITY_P (value_type);
   any_noncanonical_p = TYPE_CANONICAL (value_type) != value_type;
   canon_argtypes = maybe_canonicalize_argtypes (arg_types,
 						&any_structural_p,
 						&any_noncanonical_p);
+  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
   if (any_structural_p)
     SET_TYPE_STRUCTURAL_EQUALITY (t);
+
+  /* If we already have such a type, use the old one.  */
+  hashval_t hash = type_hash_canon_hash (t);
+  tree probe_type = t;
+  t = type_hash_canon (hash, t);
+  if (t != probe_type)
+    return t;
+
+  if (any_structural_p)
+    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
   else if (any_noncanonical_p)
     TYPE_CANONICAL (t) = build_function_type (TYPE_CANONICAL (value_type),
 					      canon_argtypes);
@@ -7667,13 +7685,6 @@  build_method_type_directly (tree basetype,
   argtypes = tree_cons (NULL_TREE, ptype, argtypes);
   TYPE_ARG_TYPES (t) = argtypes;
 
-  /* If we already have such a type, use the old one.  */
-  hashval_t hash = type_hash_canon_hash (t);
-  tree probe_type = t;
-  t = type_hash_canon (hash, t);
-  if (t != probe_type)
-    return t;
-
   /* Set up the canonical type. */
   any_structural_p
     = (TYPE_STRUCTURAL_EQUALITY_P (basetype)
@@ -7684,8 +7695,20 @@  build_method_type_directly (tree basetype,
   canon_argtypes = maybe_canonicalize_argtypes (TREE_CHAIN (argtypes),
 						&any_structural_p,
 						&any_noncanonical_p);
+
+  /* Set TYPE_STRUCTURAL_EQUALITY_P early.  */
   if (any_structural_p)
     SET_TYPE_STRUCTURAL_EQUALITY (t);
+
+  /* If we already have such a type, use the old one.  */
+  hashval_t hash = type_hash_canon_hash (t);
+  tree probe_type = t;
+  t = type_hash_canon (hash, t);
+  if (t != probe_type)
+    return t;
+
+  if (any_structural_p)
+    gcc_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
   else if (any_noncanonical_p)
     TYPE_CANONICAL (t)
       = build_method_type_directly (TYPE_CANONICAL (basetype),
@@ -7726,6 +7749,9 @@  build_offset_type (tree basetype, tree type)
 
   TYPE_OFFSET_BASETYPE (t) = TYPE_MAIN_VARIANT (basetype);
   TREE_TYPE (t) = type;
+  if (TYPE_STRUCTURAL_EQUALITY_P (basetype)
+      || TYPE_STRUCTURAL_EQUALITY_P (type))
+    SET_TYPE_STRUCTURAL_EQUALITY (t);
 
   /* If we already have such a type, use the old one.  */
   hashval_t hash = type_hash_canon_hash (t);
@@ -7741,7 +7767,7 @@  build_offset_type (tree basetype, tree type)
     {
       if (TYPE_STRUCTURAL_EQUALITY_P (basetype)
 	  || TYPE_STRUCTURAL_EQUALITY_P (type))
-	SET_TYPE_STRUCTURAL_EQUALITY (t);
+	gcc_unreachable ();
       else if (TYPE_CANONICAL (TYPE_MAIN_VARIANT (basetype)) != basetype
 	       || TYPE_CANONICAL (type) != type)
 	TYPE_CANONICAL (t)
@@ -7770,6 +7796,8 @@  build_complex_type (tree component_type, bool named)
   tree probe = make_node (COMPLEX_TYPE);
 
   TREE_TYPE (probe) = TYPE_MAIN_VARIANT (component_type);
+  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (probe)))
+    SET_TYPE_STRUCTURAL_EQUALITY (probe);
 
   /* If we already have such a type, use the old one.  */
   hashval_t hash = type_hash_canon_hash (probe);
@@ -7781,11 +7809,10 @@  build_complex_type (tree component_type, bool named)
 	 out the type.  We need to check the canonicalization and
 	 maybe set the name.  */
       gcc_checking_assert (COMPLETE_TYPE_P (t)
-			   && !TYPE_NAME (t)
-			   && TYPE_CANONICAL (t) == t);
+			   && !TYPE_NAME (t));
 
       if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (t)))
-	SET_TYPE_STRUCTURAL_EQUALITY (t);
+	;
       else if (TYPE_CANONICAL (TREE_TYPE (t)) != TREE_TYPE (t))
 	TYPE_CANONICAL (t)
 	  = build_complex_type (TYPE_CANONICAL (TREE_TYPE (t)), named);