diff mbox series

[v3,1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

Message ID HYWxUFZaD4PRakvhdIXXUy56MaNNEJSGBr-tlTxvIoxSuRPRedUVVWDoGfttYPI1ySvJYkja6yudismLmGkzyTdJyxG9sGp8AlhXOu5451U=@protonmail.com
State New
Headers show
Series [v3,1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609] | expand

Commit Message

waffl3x Sept. 26, 2023, 1:56 a.m. UTC
> Yes, but I'll warn you that grokdeclarator has resisted refactoring for
> a long time...

That will certainly be what I work on after this is squared off then,
I've been up and down grokdeclarator so I'm confident I'll be able to
do it.

As for the patch, I sure took my sweet time with it, but here it is. I
hope to work on the diagnostics patch tomorrow, but as you've probably
figured out it's best not to take my word on timeframes :^).

On the plus side, I took my time to figure out how to best to pass down
information about whether a param is an xobj param. My initial
impression on what you were suggesting was to push another node on the
front of the list, but I stared at it for a few hours and didn't think
it would work out. However, eventually I realized that the purpose
member if free for xobj params as it is illegal for them to have
default arguments. So I ended up passing it over the TREE_LIST after
all, maybe this is what you meant in the first place anyway too.

I am pretty confident that this version is all good, with only a few
possible issues.

An update on my copyright assignment, I sent an e-mail and haven't
gotten a response yet. From what I saw, I am confident that it's my
preferred option going forward though. Hopefully they get back to me
soon.

Also, just a quick update on my copyright assignment, I have sent an
e-mail to the FSF and haven't gotten a response yet. From what I was
reading, I am confident that it's my preferred option going forward
though. Hopefully they get back to me soon.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

Comments

Hans-Peter Nilsson Sept. 27, 2023, 10:43 p.m. UTC | #1
> Date: Tue, 26 Sep 2023 01:56:55 +0000
> From: waffl3x <waffl3x@protonmail.com>

> Signed-off-by: waffl3x <waffl3x@protonmail.com>

I think I've read that you have to put your actual name in
the DCO; using an alias (presumably) as above would be
wrong.

Ah, it's on https://gcc.gnu.org/dco.html - the *second* DCO
link; under "Signed-off-by", on
https://gcc.gnu.org/contribute.html! "sorry, no pseudonyms
or anonymous contributions".

(Also, from Some Source I Don't Remember: using an alias if
you have FSF papers in place is ok; you can use a pseudonym
if FSF can match it to papers on file that have your actual
name or something to that effect.)

brgds, H-P
waffl3x Sept. 27, 2023, 11:35 p.m. UTC | #2
Not to worry, I'm currently going through that process with the FSF, it
was confirmed that a pseudonym should be just fine. I don't know how
long the process takes but my goal is to get this in for GCC14, and
surely this won't take more than a month. One can only hope anyway.

On 2023-09-27 04:43 p.m., Hans-Peter Nilsson wrote:
>> Date: Tue, 26 Sep 2023 01:56:55 +0000
>> From: waffl3x <waffl3x@protonmail.com>
> 
>> Signed-off-by: waffl3x <waffl3x@protonmail.com>
> 
> I think I've read that you have to put your actual name in
> the DCO; using an alias (presumably) as above would be
> wrong.
> 
> Ah, it's on https://gcc.gnu.org/dco.html - the *second* DCO
> link; under "Signed-off-by", on
> https://gcc.gnu.org/contribute.html! "sorry, no pseudonyms
> or anonymous contributions".
> 
> (Also, from Some Source I Don't Remember: using an alias if
> you have FSF papers in place is ok; you can use a pseudonym
> if FSF can match it to papers on file that have your actual
> name or something to that effect.)
> 
> brgds, H-P
Jason Merrill Oct. 17, 2023, 8:53 p.m. UTC | #3
On 9/25/23 21:56, waffl3x wrote:
> 
> Also, just a quick update on my copyright assignment, I have sent an
> e-mail to the FSF and haven't gotten a response yet. From what I was
> reading, I am confident that it's my preferred option going forward
> though. Hopefully they get back to me soon.

Any progress on this, or do I need to coax the process along?  :)

Jason
Jason Merrill Oct. 17, 2023, 9:11 p.m. UTC | #4
On 9/25/23 21:56, waffl3x wrote:
> 
> On the plus side, I took my time to figure out how to best to pass down
> information about whether a param is an xobj param. My initial
> impression on what you were suggesting was to push another node on the
> front of the list, but I stared at it for a few hours and didn't think
> it would work out.

I was thinking to set a TREE_LANG_FLAG on the TREE_LIST node.

> However, eventually I realized that the purpose
> member if free for xobj params as it is illegal for them to have
> default arguments.

Hmm, is it?  I see that clang thinks so, but I don't know where they get 
that idea from.  The grammar certainly allows it:

> attribute-specifier-seqopt decl-specifier-seq declarator = initializer-clause

and I don't see anything else that prohibits it.

Jason
waffl3x Oct. 18, 2023, 11:46 a.m. UTC | #5
> Any progress on this, or do I need to coax the process along?  :)

Yeah, I've been working on it since the copyright assignment process
has finished, originally I was going to note that on my next update
which I had hoped to finish today or tomorrow. Well, in truth I was
hoping to send one the same day that copyright assignment finished, but
I found a nasty bug so I spent all day adding test cases for all the
relevant overloadable operators. Currently, it crashes when calling a
subscript operator declared with an explicit object parameter in a
dependent context. I haven't looked into the fix yet, but I plan to.

Also, before I forget, what is the process for confirming my copyright
assignment on your end? Do you just need to check in with the FSF to
see if it went through? Let me know if there's anything you need from
me regarding that.

Aside from the bug that's currently present in the first patch, I only
have like 1 or 2 little things I want to change about it. I will make
those few changes to patch 1, finish patch 2 (diagnostics) which will
also include test cases for the new bug I found. After I am done that I
plan on adding the things that are missing, because I suspect that
looking into that will get me close to finding the fix for the crash.

> Hmm, is it? I see that clang thinks so, but I don't know where they get
> that idea from. The grammar certainly allows it:
> 
> > attribute-specifier-seqopt decl-specifier-seq declarator = initializer-clause
> 
> 
> and I don't see anything else that prohibits it.

You would be right for P0847R7, but CWG DR 2653 changed that. You can
find the updated grammar in dcl.fct section 3 (subsection? I'm not
really sure to be honest.)

I've also included a copy of the grammar here for your convenience.

https://eel.is/c++draft/dcl.fct#nt:parameter-declaration
parameter-declaration:
  attribute-specifier-seqopt thisopt decl-specifier-seq declarator
  attribute-specifier-seqopt decl-specifier-seq declarator = initializer-clause
  attribute-specifier-seqopt thisopt decl-specifier-seq abstract-declaratoropt
  attribute-specifier-seqopt decl-specifier-seq abstract-declaratoropt = initializer-clause 


> I was thinking to set a TREE_LANG_FLAG on the TREE_LIST node.

I did figure this is originally what you meant, and I can still change
it to go this route since I'm sure it's likely just as good. But I do
recall something I didn't like in the implementation that nudged me
towards using the purpose member instead. Either way, not a big deal. I
think I just liked not having to mess with a linked list as I am not
used to them as a data structure, it might have been that simple. :^)

I will try to get something done today, but I was struggling with
writing some of the tests, there's also a lot more of them now. I also
wrote a bunch of musings in comments that I would like feedback on.

My most concrete question is, how exactly should I be testing a
pedwarn, I want to test that I get the correct warning and error with
the separate flags, do I have to create two separate tests for each one?

I'm just going to include the little wall I wrote in decl.cc regarding
pedwarn, just in case I can't get this done tonight so I can get some
feedback regarding it. On the other hand, it might just not be very
relevant to this patch in particular as I kind of noted, but maybe
there's some way to do what I was musing about that I've overlooked. It
does end up a bit ranty I guess, hopefully that doesn't make it
confusing.

```
/* I believe we should make a switch for this feature specifically,
   I recall seeing discussion regarding enabling newer language
   features when set to older standards. I would advocate for a
   specific flag for each specific feature. Maybe they should all
   be under an override flag? -foverride-dialect=xobj,ifconstexpr (?)
   I dont think it makes sense to give each feature override it's own
   flag. I don't recall what the consensus was around this discussion
   either though.
   For the time being it's controlled by pedantic. I am concerned that
   tying this to pedantic going forward that one might want to enable
   -pedantic-errors while also enabling select features from newer
   dialects. It didn't look like this use case is supported to me.

   I suppose this will require redesign work to support, so for
   the purposes of this patch, emitting a pedwarn seems correct.
   I just don't like that it can't be suppressed on an individual
   basis.  */
if (xobj_parm && cxx_dialect < cxx23)
  pedwarn(DECL_SOURCE_LOCATION (xobj_parm), OPT_Wpedantic, "");
```

That's all for now, I will try, (but I am very much not promising,) to
have an update by the end of today (6-8 hours for me.) If I manage to
get that out, I will (finally) start moving forward on implementing the
missing and broken features, and the aforementioned bug.

Alex
Jason Merrill Oct. 18, 2023, 2:17 p.m. UTC | #6
On 10/18/23 07:46, waffl3x wrote:
>> Any progress on this, or do I need to coax the process along?  :)
> 
> Yeah, I've been working on it since the copyright assignment process
> has finished, originally I was going to note that on my next update
> which I had hoped to finish today or tomorrow. Well, in truth I was
> hoping to send one the same day that copyright assignment finished, but
> I found a nasty bug so I spent all day adding test cases for all the
> relevant overloadable operators. Currently, it crashes when calling a
> subscript operator declared with an explicit object parameter in a
> dependent context. I haven't looked into the fix yet, but I plan to.
> 
> Also, before I forget, what is the process for confirming my copyright
> assignment on your end? Do you just need to check in with the FSF to
> see if it went through? Let me know if there's anything you need from
> me regarding that.
> 
> Aside from the bug that's currently present in the first patch, I only
> have like 1 or 2 little things I want to change about it. I will make
> those few changes to patch 1, finish patch 2 (diagnostics) which will
> also include test cases for the new bug I found. After I am done that I
> plan on adding the things that are missing, because I suspect that
> looking into that will get me close to finding the fix for the crash.
> 
>> Hmm, is it? I see that clang thinks so, but I don't know where they get
>> that idea from. The grammar certainly allows it:
>>
>>> attribute-specifier-seqopt decl-specifier-seq declarator = initializer-clause
>>
>> and I don't see anything else that prohibits it.
> 
> You would be right for P0847R7, but CWG DR 2653 changed that. You can
> find the updated grammar in dcl.fct section 3 (subsection? I'm not
> really sure to be honest.)
> 
> I've also included a copy of the grammar here for your convenience.
> 
> https://eel.is/c++draft/dcl.fct#nt:parameter-declaration
> parameter-declaration:
>    attribute-specifier-seqopt thisopt decl-specifier-seq declarator
>    attribute-specifier-seqopt decl-specifier-seq declarator = initializer-clause
>    attribute-specifier-seqopt thisopt decl-specifier-seq abstract-declaratoropt
>    attribute-specifier-seqopt decl-specifier-seq abstract-declaratoropt = initializer-clause

Ah, yes, thanks.

>> I was thinking to set a TREE_LANG_FLAG on the TREE_LIST node.
> 
> I did figure this is originally what you meant, and I can still change
> it to go this route since I'm sure it's likely just as good. But I do
> recall something I didn't like in the implementation that nudged me
> towards using the purpose member instead. Either way, not a big deal. I
> think I just liked not having to mess with a linked list as I am not
> used to them as a data structure, it might have been that simple. :^)

I wouldn't expect to need any actual dealing with the linked list, just 
setting a flag in cp_parameter_declaration_list at the same point as the 
existing PARENTHESIZED_LIST_P flag.

But given CWG2653 as you pointed out, your current approach is fine.

> I will try to get something done today, but I was struggling with
> writing some of the tests, there's also a lot more of them now. I also
> wrote a bunch of musings in comments that I would like feedback on.
> 
> My most concrete question is, how exactly should I be testing a
> pedwarn, I want to test that I get the correct warning and error with
> the separate flags, do I have to create two separate tests for each one?

Yes.  I tend to use letter suffixes for tests that vary only in flags 
(and expected results), e.g. feature1a.C, feature1b.C.

> I'm just going to include the little wall I wrote in decl.cc regarding
> pedwarn, just in case I can't get this done tonight so I can get some
> feedback regarding it. On the other hand, it might just not be very
> relevant to this patch in particular as I kind of noted, but maybe
> there's some way to do what I was musing about that I've overlooked. It
> does end up a bit ranty I guess, hopefully that doesn't make it
> confusing.
> 
> ```
> /* I believe we should make a switch for this feature specifically,
>     I recall seeing discussion regarding enabling newer language
>     features when set to older standards. I would advocate for a
>     specific flag for each specific feature. Maybe they should all
>     be under an override flag? -foverride-dialect=xobj,ifconstexpr (?)
>     I dont think it makes sense to give each feature override it's own
>     flag. I don't recall what the consensus was around this discussion
>     either though.
>
>     For the time being it's controlled by pedantic. I am concerned that
>     tying this to pedantic going forward that one might want to enable
>     -pedantic-errors while also enabling select features from newer
>     dialects. It didn't look like this use case is supported to me.
> 
>     I suppose this will require redesign work to support, so for
>     the purposes of this patch, emitting a pedwarn seems correct.
>     I just don't like that it can't be suppressed on an individual
>     basis.  */
> if (xobj_parm && cxx_dialect < cxx23)
>    pedwarn(DECL_SOURCE_LOCATION (xobj_parm), OPT_Wpedantic, "");

Instead of OPT_Wpedantic, this should be controlled by 
-Wc++23-extensions (OPT_Wc__23_extensions)

If you wanted, you could add a more specific warning option for this 
(e.g. -Wc++23-explicit-this) which is also affected by 
-Wc++23-extensions, but I would lean toward just using the existing 
flag.  Up to you.

Jason
waffl3x Oct. 18, 2023, 5:28 p.m. UTC | #7
> > I will try to get something done today, but I was struggling with
> > writing some of the tests, there's also a lot more of them now. I also
> > wrote a bunch of musings in comments that I would like feedback on.
> > 
> > My most concrete question is, how exactly should I be testing a
> > pedwarn, I want to test that I get the correct warning and error with
> > the separate flags, do I have to create two separate tests for each one?
> 
> 
> Yes. I tend to use letter suffixes for tests that vary only in flags
> (and expected results), e.g. feature1a.C, feature1b.C.

Will do.

> Instead of OPT_Wpedantic, this should be controlled by
> -Wc++23-extensions (OPT_Wc__23_extensions)

Yeah, I'll do this.

> If you wanted, you could add a more specific warning option for this
> (e.g. -Wc++23-explicit-this) which is also affected by
> -Wc++23-extensions, but I would lean toward just using the existing
> flag. Up to you.

I brought it up in irc and there was some pushback to my point of view
on it, so I'll just stick with OPT_Wc__23_extensions for now. I do
think a more sophisticated interface would be beneficial but I will
bring discussion around that up again in the future.

I've seen plenty of these G_ or _ macros on strings around like in
grokfndecl for these errors.

G_("static member function %qD cannot have cv-qualifier")
G_("non-member function %qD cannot have cv-qualifier")

G_("static member function %qD cannot have ref-qualifier")
G_("non-member function %qD cannot have ref-qualifier")

I have been able to figure out it relates to translation, but not
exactly what the protocol around them is. I think in my original patch
I had refactored this code a bunch, I figured adding a 3rd case to it
justifies a refactor. I think I forgot to add those changes to the
original patch, either that or I undid it or moved it somewhere else.
Anyway, the point is, coming back to it now to re-add those diagnostics
I realized I probably shouldn't have changed those strings.

I also have been wondering whether I should be putting macros on any
strings I add, it seemed like there might have been a macro for text
that needs translation. Is this something I should be doing?

Alex
Jakub Jelinek Oct. 18, 2023, 5:45 p.m. UTC | #8
On Wed, Oct 18, 2023 at 05:28:10PM +0000, waffl3x wrote:
> I've seen plenty of these G_ or _ macros on strings around like in
> grokfndecl for these errors.
> 
> G_("static member function %qD cannot have cv-qualifier")
> G_("non-member function %qD cannot have cv-qualifier")
> 
> G_("static member function %qD cannot have ref-qualifier")
> G_("non-member function %qD cannot have ref-qualifier")
> 
> I have been able to figure out it relates to translation, but not
> exactly what the protocol around them is. I think in my original patch
> I had refactored this code a bunch, I figured adding a 3rd case to it
> justifies a refactor. I think I forgot to add those changes to the
> original patch, either that or I undid it or moved it somewhere else.
> Anyway, the point is, coming back to it now to re-add those diagnostics
> I realized I probably shouldn't have changed those strings.
> 
> I also have been wondering whether I should be putting macros on any
> strings I add, it seemed like there might have been a macro for text
> that needs translation. Is this something I should be doing?

There are different kinds of format strings in GCC, the most common
are the gcc-internal-format strings.  If you call a function which
is expected to take such translatable format string (in particular
a function which takes a gmsgid named argument like error, error_at,
pedwarn, warning_at, ...) and pass a string literal to that function,
nothing needs to be marked in a special way, both gcc/po/exgettext
is able to collect such literals into gcc/po/gcc.pot for translations
and the function is supposed to use gettext etc. to translate it
- e.g. see diagnostic_set_info using _(gmsgid) for that.
But, if there is e.g. a temporary pointer var which points to format
strings and only that is eventually passed to the diagnostic functions,
gcc/po/exgettext won't be able to collect such literals, which is where
the G_() macro comes into play and one marks the string as
gcc-internal-format with it; the translation is still handled by the
diagnostic function at runtime.  The N_() macro is similar but for c-format
strings instead.  The _() macro both collects for translations if it is
used with string literal, and expands to gettext call to translate it at
runtime, which is something that should be avoided if something translates
it again.

And another i18n rule is that one shouldn't try to construct diagnostic
messages from parts of english sentences, it is fine to fill in with %s/%qs
etc. language keywords etc. but otherwise the format string should contain
the whole diagnostic line, so that translators can reorder the words etc.

	Jakub
Jason Merrill Oct. 18, 2023, 6:12 p.m. UTC | #9
On 10/18/23 13:28, waffl3x wrote:
>>> I will try to get something done today, but I was struggling with
>>> writing some of the tests, there's also a lot more of them now. I also
>>> wrote a bunch of musings in comments that I would like feedback on.
>>>
>>> My most concrete question is, how exactly should I be testing a
>>> pedwarn, I want to test that I get the correct warning and error with
>>> the separate flags, do I have to create two separate tests for each one?
>>
>>
>> Yes. I tend to use letter suffixes for tests that vary only in flags
>> (and expected results), e.g. feature1a.C, feature1b.C.
> 
> Will do.
> 
>> Instead of OPT_Wpedantic, this should be controlled by
>> -Wc++23-extensions (OPT_Wc__23_extensions)
> 
> Yeah, I'll do this.
> 
>> If you wanted, you could add a more specific warning option for this
>> (e.g. -Wc++23-explicit-this) which is also affected by
>> -Wc++23-extensions, but I would lean toward just using the existing
>> flag. Up to you.
> 
> I brought it up in irc and there was some pushback to my point of view
> on it, so I'll just stick with OPT_Wc__23_extensions for now. I do
> think a more sophisticated interface would be beneficial but I will
> bring discussion around that up again in the future.
> 
> I've seen plenty of these G_ or _ macros on strings around like in
> grokfndecl for these errors.
> 
> G_("static member function %qD cannot have cv-qualifier")
> G_("non-member function %qD cannot have cv-qualifier")
> 
> G_("static member function %qD cannot have ref-qualifier")
> G_("non-member function %qD cannot have ref-qualifier")
> 
> I have been able to figure out it relates to translation, but not
> exactly what the protocol around them is.

The protocol is described in gcc/ABOUT-GCC-NLS.  In general, "strings" 
passed directly to a diagnostic function don't need any decoration, but 
if they're assigned to a variable first, they need G_() so they're 
recognized as diagnostic strings to be added to the translation table.

The _() macro is used for strings that are going to be passed to a %s, 
but better to avoid doing that for strings that need translation.  N_() 
is (rarely) used for strings that aren't diagnostic format strings, but 
get passed to another function that passes them to _().

Jason
waffl3x Oct. 19, 2023, 9:05 p.m. UTC | #10
> (Jakub)
> There are different kinds of format strings in GCC, the most common
> are the gcc-internal-format strings. If you call a function which
> is expected to take such translatable format string (in particular
> a function which takes a gmsgid named argument like error, error_at,
> pedwarn, warning_at, ...) and pass a string literal to that function,
> nothing needs to be marked in a special way, both gcc/po/exgettext
> is able to collect such literals into gcc/po/gcc.pot for translations
> and the function is supposed to use gettext etc. to translate it
> - e.g. see diagnostic_set_info using (gmsgid) for that.
> But, if there is e.g. a temporary pointer var which points to format
> strings and only that is eventually passed to the diagnostic functions,
> gcc/po/exgettext won't be able to collect such literals, which is where
> the G() macro comes into play and one marks the string as
> gcc-internal-format with it; the translation is still handled by the
> diagnostic function at runtime. The N_() macro is similar but for c-format
> strings instead. The _() macro both collects for translations if it is
> used with string literal, and expands to gettext call to translate it at
> runtime, which is something that should be avoided if something translates
> it again.

> (Jason)
> The protocol is described in gcc/ABOUT-GCC-NLS. In general, "strings"
> passed directly to a diagnostic function don't need any decoration, but
> if they're assigned to a variable first, they need G_() so they're
> recognized as diagnostic strings to be added to the translation table.

I read this last night and decided to sleep on it hoping it would make
more sense now. I was going to write about how I'm still confused but I
think it just clicked for me. I somehow didn't make the connection that
any kind of of expression (like a conditional) is going to require it's
use. I guess I kept thinking "but they aren't being assigned to a
variable" but I get it now, it's when the strings aren't passed
DIRECTLY into the diagnostic function.

> (Jakub)
> And another i18n rule is that one shouldn't try to construct diagnostic
> messages from parts of english sentences, it is fine to fill in with %s/%qs
> etc. language keywords etc. but otherwise the format string should contain
> the whole diagnostic line, so that translators can reorder the words etc.

> (Jason)
> The () macro is used for strings that are going to be passed to a %s,
> but better to avoid doing that for strings that need translation. N()
> is (rarely) used for strings that aren't diagnostic format strings, but
> get passed to another function that passes them to _().

Okay so taking what you guys are saying here it sounds like it would be
appropriate to refactor the code I was reluctant to refactor. The code
(in grokfndecl) conditionally selects one of the two (now three with my
changes) following strings despite them being mostly identical.

"non-member function %qD cannot have cv-qualifier"
"static member function %qD cannot have cv-qualifier"
"explicit object member function %qD cannot have cv-qualifier"

From what I'm getting from what you two have said is that it would be
reasonable to instead construct a string from it's two parts, just
selecting the differing part dynamically.

"%s function %qD cannot have cv-qualifier"

Just to clarify, should I be marking the "non-member", "static member",
"explicit object member" strings with the _ macro then? I'm just not
sure since it seems like Jason is saying they shouldn't be since the
string they are being formatted into is being translated afterwards.

Also, I'm not sure what %qs is, should I be using that instead of %s
for strings?

On another topic, I have been trying to fix taking pointers to explicit
object member functions today, as I ended up breaking that when I
started setting static_function to false for them. Originally it just
worked so I haven't touched any code related to this, but now that they
aren't being treating like static member functions as much so a few
things just broke. What I'm asking myself now is whether it would be
appropriate to just opt-in to static member function behavior for this
one, and I'm not sure that would be correct.

So I started by checking what it did before I turned off the
static_function flag. It's was being passed into cp_build_addr_expr_1
as a baselink node, while regular member functions are passed in as an
offset_ref node. I then checked what the case is for static member
function, and unsurprisingly those are also handled wrapped in a
baselink node, but this raised some questions for me.

I am now trying to figure out what exactly a baselink is, and why it's
used for static member functions. My current best guess is that
method_type nodes already hold the information that a baselink does,
and that information is needed in general. If that is the case, it
might just be correct to just do the same thing for explicit object
member functions, but I wonder if there is more to it, but maybe there
isn't. It worked just fine before when the static_function was still
being set after all.

Any insight on this is appreciated.

Alex
Jakub Jelinek Oct. 19, 2023, 9:11 p.m. UTC | #11
On Thu, Oct 19, 2023 at 09:05:38PM +0000, waffl3x wrote:
> Okay so taking what you guys are saying here it sounds like it would be
> appropriate to refactor the code I was reluctant to refactor. The code
> (in grokfndecl) conditionally selects one of the two (now three with my
> changes) following strings despite them being mostly identical.
> 
> "non-member function %qD cannot have cv-qualifier"
> "static member function %qD cannot have cv-qualifier"
> "explicit object member function %qD cannot have cv-qualifier"
> 
> >From what I'm getting from what you two have said is that it would be
> reasonable to instead construct a string from it's two parts, just
> selecting the differing part dynamically.
> 
> "%s function %qD cannot have cv-qualifier"

No, that wouldn't be appropriate for translation.
None of non-member, static member and explicit object member are
something that should be printed verbatim untranslated.
"%s function %qD cannot have cv-qualifier", _("non-member")
etc. is still inappropriate, some language might need to reorder
the words in one case and not another one etc.

What is ok if that %s (but in that case better %qs) is say some
language keyword which shouldn't be translated.

"%qs keyword not expected"
with
the arg being "inline", "constexpr", "consteval" for example would
be fine.

	Jakub
waffl3x Oct. 19, 2023, 9:31 p.m. UTC | #12
> No, that wouldn't be appropriate for translation.
> None of non-member, static member and explicit object member are
> something that should be printed verbatim untranslated.
> "%s function %qD cannot have cv-qualifier", _("non-member")
> etc. is still inappropriate, some language might need to reorder
> the words in one case and not another one etc.
>
> What is ok if that %s (but in that case better %qs) is say some
> language keyword which shouldn't be translated.
>
> "%qs keyword not expected"
> with
> the arg being "inline", "constexpr", "consteval" for example would
> be fine.
>
> Jakub

Ah alright, I see what you're saying, I see what the difference is now.
It's a shame we can't have the translated string insert a %s and format
into that :^). Ah well, I guess this code is just doomed to look poor
then, what can you do.

This is pretty much why I've been afraid to touch anything with these
macros, if I don't understand exactly how it works anything slightly
weird solution I use might just not work.

Anyway, I think that clears up everything regarding that now, thanks.

Alex
Jakub Jelinek Oct. 19, 2023, 9:53 p.m. UTC | #13
On Thu, Oct 19, 2023 at 09:31:06PM +0000, waffl3x wrote:
> Ah alright, I see what you're saying, I see what the difference is now.
> It's a shame we can't have the translated string insert a %s and format
> into that :^). Ah well, I guess this code is just doomed to look poor
> then, what can you do.

Consider e.g. de.po:
msgid "Generate code to check exception specifications."
msgstr "Code zur Überprüfung von Exception-Spezifikationen erzeugen."
If you try to construct the above from 2 parts as
"%s to check exception specifications.", _("Generate code")
then the german translator will need to give up, because the verb
needs to go last and noun first, so translating "Generate code"
to "Code erzeugen" and the rest can't be done, you want to put one word
in one place and another at another.  A lot of languages
have quite strict rules on order of words in sentence, unlike say
Latin with its comparatively free word order.
You could be lucky, but without knowing at least all the currently
supported languages it can be hard to prove it is ok.
Furthermore, it could prevent some other language translations from being
added.

	Jakub
Jason Merrill Oct. 19, 2023, 10:18 p.m. UTC | #14
On 10/19/23 17:05, waffl3x wrote:
> Also, I'm not sure what %qs is, should I be using that instead of %s
> for strings?

The q prefix means quoted, with ' or other quotation marks, depending on 
the locale.

> On another topic, I have been trying to fix taking pointers to explicit
> object member functions today, as I ended up breaking that when I
> started setting static_function to false for them. Originally it just
> worked so I haven't touched any code related to this, but now that they
> aren't being treating like static member functions as much so a few
> things just broke. What I'm asking myself now is whether it would be
> appropriate to just opt-in to static member function behavior for this
> one, and I'm not sure that would be correct.
> 
> So I started by checking what it did before I turned off the
> static_function flag. It's was being passed into cp_build_addr_expr_1
> as a baselink node, while regular member functions are passed in as an
> offset_ref node. I then checked what the case is for static member
> function, and unsurprisingly those are also handled wrapped in a
> baselink node, but this raised some questions for me.
> 
> I am now trying to figure out what exactly a baselink is, and why it's
> used for static member functions. My current best guess is that
> method_type nodes already hold the information that a baselink does,
> and that information is needed in general. If that is the case, it
> might just be correct to just do the same thing for explicit object
> member functions, but I wonder if there is more to it, but maybe there
> isn't. It worked just fine before when the static_function was still
> being set after all.
> 
> Any insight on this is appreciated.

A BASELINK expresses the result of name lookup for a member function, 
since we need to pass information about the name lookup context along to 
after overload resolution.

An OFFSET_REF (with PTRMEM_OK_P) is used to express that we saw the 
&A::f syntax, so we could build a pointer to member if it resolves to an 
implicit-object member function.

For an overload set containing only a single static member function, 
build_offset_ref doesn't bother to build an OFFSET_REF, but returns the 
BASELINK itself.

I think we need the OFFSET_REF for an explicit-object member function 
because it expresses that the code satisfies the requirement "If the 
operand names an explicit object member function, the operand shall be a 
qualified-id."

It might simplify things to remove the optimization in build_offset_ref 
so we get an OFFSET_REF even for a single static member function, and 
add support for that to cp_build_addr_expr_1.

Jason
waffl3x Oct. 19, 2023, 10:51 p.m. UTC | #15
> A BASELINK expresses the result of name lookup for a member function,
> since we need to pass information about the name lookup context along to
> after overload resolution.
> 
> An OFFSET_REF (with PTRMEM_OK_P) is used to express that we saw the
> &A::f syntax, so we could build a pointer to member if it resolves to an
> implicit-object member function.
> 
> For an overload set containing only a single static member function,
> build_offset_ref doesn't bother to build an OFFSET_REF, but returns the
> BASELINK itself.
> 
> I think we need the OFFSET_REF for an explicit-object member function
> because it expresses that the code satisfies the requirement "If the
> operand names an explicit object member function, the operand shall be a
> qualified-id."
> 
> It might simplify things to remove the optimization in build_offset_ref
> so we get an OFFSET_REF even for a single static member function, and
> add support for that to cp_build_addr_expr_1.
> 
> Jason

Ah okay I think that sheds a little bit of light on things, and here I
was trying not to involve overloads to make it easier for me to
understand things, it seems it ended up making me miss some things.

At a glance it seems like all I need to do then is disable the
PTRMEM_OK_P flag then. I will try that and see how it goes, provided I
can find where it's all setup. (I think I'm almost to the bottom of it,
but it's tough to unravel so I'm not sure.)

I'm also now realizing it's probably about time I add more tests
involving overloads, because I avoided that early on and I don't think
I ever got around to adding any for that.

Alex
waffl3x Oct. 19, 2023, 11:35 p.m. UTC | #16
> (waffl3x (me))
> At a glance it seems like all I need to do then is disable the
> PTRMEM_OK_P flag then.

I'm just now realizing that I'm almost certainly wrong about this. It
still needs PTRMEM_OK_P set if there are any implicit-object member
functions in the overload set. That is, if OFFSET_REF includes that
information... but it doesn't seem like it does? Reading the
information on OFFSET_REF, particularly build_offset_ref, seems to
indicate that OFFSET_REF (at least historically) was only for things
with a pointer to member type.

> > An OFFSET_REF (with PTRMEM_OK_P) is used to express that we saw the
> > &A::f syntax, so we could build a pointer to member if it resolves to an
> > implicit-object member function.
> > 
> > For an overload set containing only a single static member function,
> > build_offset_ref doesn't bother to build an OFFSET_REF, but returns the
> > BASELINK itself.

Based on what you've said, I assume that OFFSET_REF handles static
member functions that are overloaded. But as I've said this seems to
contradict the comments I'm reading, so I'm not sure that I'm
understanding you correctly.

I suppose that will be pretty easy to test, so I'll just do that as
well.

> > I think we need the OFFSET_REF for an explicit-object member function
> > because it expresses that the code satisfies the requirement "If the
> > operand names an explicit object member function, the operand shall be a
> > qualified-id."

I do agree here, but it does reinforce that OFFSET_REF is no longer
just for members represented by pointer to member type. So that might
be something to take into consideration.

> > It might simplify things to remove the optimization in build_offset_ref
> > so we get an OFFSET_REF even for a single static member function, and
> > add support for that to cp_build_addr_expr_1.

I don't think this should be necessary, the "right thing" should just
be done for explicit-object member functions. With all the stuff going
on here that I missed I'm starting to wonder how function overloads
ever worked at all in my patch. On the other hand though, this
optimization probably could be documented better, but I very well might
have missed it even if it were.

Hell, maybe it needs a greater redesign altogether, it seems strange to
me to bundle overload information in with a construct for a specific
expression. (Assuming that's whats happening of course, I still don't
fully understand it.) It's not like this has rules unique to it for how
overload resolution is decided, right? Initializing a param/variable of
pointer to function type with an overloaded function resolves that with
similar rules, I think? Maybe it is a little different now that I write
it out loud.

I wasn't going to finish my musings about that, but it made me realize
that it might not actually be correct for address of explicit-object
member functions to be wrapped by OFFSET_REF. I mean surely it's fine
because based on what you've said static member functions are also
wrapped by OFFSET_REF, so it's likely fully implemented, especially
considering things worked before. But now that there are 2 different
varieties of class members that the address of them can be taken, it
might make sense to split things up a bit? Then again, why were static
member functions ever handled the same way? Taking the address of other
static members isn't handled in the same way here is it?

I'm probably spending too much time thinking about it when I don't
fully understand how it's all being done, I'll just go back to poking
around trying to figure it all out. Then I'll worry about whether
thing's should be done differently or not.

Alex
Jason Merrill Oct. 20, 2023, 2:39 a.m. UTC | #17
On 10/19/23 19:35, waffl3x wrote:
>> (waffl3x (me))
>> At a glance it seems like all I need to do then is disable the
>> PTRMEM_OK_P flag then.
> 
> I'm just now realizing that I'm almost certainly wrong about this. It
> still needs PTRMEM_OK_P set if there are any implicit-object member
> functions in the overload set. That is, if OFFSET_REF includes that
> information... but it doesn't seem like it does? Reading the
> information on OFFSET_REF, particularly build_offset_ref, seems to
> indicate that OFFSET_REF (at least historically) was only for things
> with a pointer to member type.

Or things that might end up with pointer-to-member type after overload 
resolution.

>>> An OFFSET_REF (with PTRMEM_OK_P) is used to express that we saw the
>>> &A::f syntax, so we could build a pointer to member if it resolves to an
>>> implicit-object member function.
>>>
>>> For an overload set containing only a single static member function,
>>> build_offset_ref doesn't bother to build an OFFSET_REF, but returns the
>>> BASELINK itself.
> 
> Based on what you've said, I assume that OFFSET_REF handles static
> member functions that are overloaded. But as I've said this seems to
> contradict the comments I'm reading, so I'm not sure that I'm
> understanding you correctly.

That's right.  For instance,

struct A {
   static void g();
   static void g(int);
};

void (*p)(int) = &A::g; // cp_build_addr_expr_1 gets an OFFSET_REF

>>> I think we need the OFFSET_REF for an explicit-object member function
>>> because it expresses that the code satisfies the requirement "If the
>>> operand names an explicit object member function, the operand shall be a
>>> qualified-id."
> 
> I do agree here, but it does reinforce that OFFSET_REF is no longer
> just for members represented by pointer to member type. So that might
> be something to take into consideration.

An OFFSET_REF that isn't type_unknown_p, agreed.

>>> It might simplify things to remove the optimization in build_offset_ref
>>> so we get an OFFSET_REF even for a single static member function, and
>>> add support for that to cp_build_addr_expr_1.
> 
> I don't think this should be necessary, the "right thing" should just
> be done for explicit-object member functions. With all the stuff going
> on here that I missed I'm starting to wonder how function overloads
> ever worked at all in my patch. On the other hand though, this
> optimization probably could be documented better, but I very well might
> have missed it even if it were.
> 
> Hell, maybe it needs a greater redesign altogether, it seems strange to
> me to bundle overload information in with a construct for a specific
> expression. (Assuming that's whats happening of course, I still don't
> fully understand it.) It's not like this has rules unique to it for how
> overload resolution is decided, right? Initializing a param/variable of
> pointer to function type with an overloaded function resolves that with
> similar rules, I think? Maybe it is a little different now that I write
> it out loud.
> 
> I wasn't going to finish my musings about that, but it made me realize
> that it might not actually be correct for address of explicit-object
> member functions to be wrapped by OFFSET_REF. I mean surely it's fine
> because based on what you've said static member functions are also
> wrapped by OFFSET_REF, so it's likely fully implemented, especially
> considering things worked before. But now that there are 2 different
> varieties of class members that the address of them can be taken, it
> might make sense to split things up a bit? Then again, why were static
> member functions ever handled the same way? Taking the address of other
> static members isn't handled in the same way here is it?

Functions are different because of overloading; in general we can't 
decide what an expression that names a function actually means until we 
have enough context to decide which function, exactly.  So we represent 
the id-expression largely as lookup+syntax until overload resolution 
turns it into a specific function.  The type_unknown_p check earlier in 
cp_build_addr_expr_1 is for that case.

An id-expression that names a single non-template function 
(!really_overloaded_fn) is handled somewhat differently, as we don't 
need to defer everything.  But that means various special-case code.

Currently build_offset_ref special-cases &A::f for a single static 
member function, but we can't use the same special case for single 
explicit object member functions because we need to distinguish between 
&A::f and &f somehow to check the requirement I quoted above.  So it 
seems to me we'll need to add support for single explicit object member 
functions in the OFFSET_REF handling in cp_build_addr_expr_1.  And I 
thought if we're doing that, perhaps we want to move the single static 
handling over there as well, but that's not necessary.

Jason
waffl3x Oct. 20, 2023, 4:34 a.m. UTC | #18
> > 
> > Based on what you've said, I assume that OFFSET_REF handles static
> > member functions that are overloaded. But as I've said this seems to
> > contradict the comments I'm reading, so I'm not sure that I'm
> > understanding you correctly.
> 
> 
> That's right. For instance,
> 
> struct A {
> static void g();
> static void g(int);
> };
> 
> void (*p)(int) = &A::g; // cp_build_addr_expr_1 gets an OFFSET_REF
> 
> > > > I think we need the OFFSET_REF for an explicit-object member function
> > > > because it expresses that the code satisfies the requirement "If the
> > > > operand names an explicit object member function, the operand shall be a
> > > > qualified-id."
> > 
> > I do agree here, but it does reinforce that OFFSET_REF is no longer
> > just for members represented by pointer to member type. So that might
> > be something to take into consideration.
> 
> 
> An OFFSET_REF that isn't type_unknown_p, agreed.
> 
> > > > It might simplify things to remove the optimization in build_offset_ref
> > > > so we get an OFFSET_REF even for a single static member function, and
> > > > add support for that to cp_build_addr_expr_1.
> > 
> > I don't think this should be necessary, the "right thing" should just
> > be done for explicit-object member functions. With all the stuff going
> > on here that I missed I'm starting to wonder how function overloads
> > ever worked at all in my patch. On the other hand though, this
> > optimization probably could be documented better, but I very well might
> > have missed it even if it were.
> > 
> > Hell, maybe it needs a greater redesign altogether, it seems strange to
> > me to bundle overload information in with a construct for a specific
> > expression. (Assuming that's whats happening of course, I still don't
> > fully understand it.) It's not like this has rules unique to it for how
> > overload resolution is decided, right? Initializing a param/variable of
> > pointer to function type with an overloaded function resolves that with
> > similar rules, I think? Maybe it is a little different now that I write
> > it out loud.
> > 
> > I wasn't going to finish my musings about that, but it made me realize
> > that it might not actually be correct for address of explicit-object
> > member functions to be wrapped by OFFSET_REF. I mean surely it's fine
> > because based on what you've said static member functions are also
> > wrapped by OFFSET_REF, so it's likely fully implemented, especially
> > considering things worked before. But now that there are 2 different
> > varieties of class members that the address of them can be taken, it
> > might make sense to split things up a bit? Then again, why were static
> > member functions ever handled the same way? Taking the address of other
> > static members isn't handled in the same way here is it?
> 
> 
> Functions are different because of overloading; in general we can't
> decide what an expression that names a function actually means until we
> have enough context to decide which function, exactly. So we represent
> the id-expression largely as lookup+syntax until overload resolution
> turns it into a specific function. The type_unknown_p check earlier in
> cp_build_addr_expr_1 is for that case.

Yeah this all makes sense, but that's why I was confused by the
following documentation from cp-tree.def.

```
/* An OFFSET_REF is used in two situations:

   1. An expression of the form `A::m' where `A' is a class and `m' is
      a non-static member.  In this case, operand 0 will be a TYPE
      (corresponding to `A') and operand 1 will be a FIELD_DECL,
      BASELINK, or TEMPLATE_ID_EXPR (corresponding to `m').

      The expression is a pointer-to-member if its address is taken,
      but simply denotes a member of the object if its address is not
      taken.
```

> An OFFSET_REF that isn't type_unknown_p, agreed.
But I suppose that's what this might have been referring to.

So is that the case then? OFFSET_REF might be for a regular address of
member expression unless it's type_unknown_p?

> 
> An id-expression that names a single non-template function
> (!really_overloaded_fn) is handled somewhat differently, as we don't
> need to defer everything. But that means various special-case code.
> 
> Currently build_offset_ref special-cases &A::f for a single static
> member function, but we can't use the same special case for single
> explicit object member functions because we need to distinguish between
> &A::f and &f somehow to check the requirement I quoted above. So it
> seems to me we'll need to add support for single explicit object member
> functions in the OFFSET_REF handling in cp_build_addr_expr_1. And I
> thought if we're doing that, perhaps we want to move the single static
> handling over there as well, but that's not necessary.
> 
> Jason

I'm done for today but it does seem like that special case is what is
causing my crash. I found that it only jumps into the section that it
crashes in when there are no overloads. I'm kinda close to fixing it
probably but I've spent too long on it for today, it's melting together.

> Currently build_offset_ref special-cases &A::f for a single static
> member function, but we can't use the same special case for single
> explicit object member functions because we need to distinguish between
> &A::f and &f somehow to check the requirement I quoted above.

I don't understand what this means exactly, under what circumstances
would &f find the member function. Oh, I guess while in the body of
it's class, I hadn't considered that. Is that what you're referring to?

Well either way, I'm going to pick back up here tomorrow and see if I
can finish figuring out exactly whats causing the problems. I'm pretty
certain it's the special case that causes it, I'm just not sure what
part exactly is causing the difference, and whether there aren't more
hiding issues. But for now I need to get some rest. Thanks for the
detailed responses, they helped a lot, this part of the code has been
particularly difficult to figure out.

Alex
Jason Merrill Oct. 20, 2023, 4:01 p.m. UTC | #19
On 10/20/23 00:34, waffl3x wrote:
>>>
>>> Based on what you've said, I assume that OFFSET_REF handles static
>>> member functions that are overloaded. But as I've said this seems to
>>> contradict the comments I'm reading, so I'm not sure that I'm
>>> understanding you correctly.
>>
>>
>> That's right. For instance,
>>
>> struct A {
>> static void g();
>> static void g(int);
>> };
>>
>> void (*p)(int) = &A::g; // cp_build_addr_expr_1 gets an OFFSET_REF
>>
>>>>> I think we need the OFFSET_REF for an explicit-object member function
>>>>> because it expresses that the code satisfies the requirement "If the
>>>>> operand names an explicit object member function, the operand shall be a
>>>>> qualified-id."
>>>
>>> I do agree here, but it does reinforce that OFFSET_REF is no longer
>>> just for members represented by pointer to member type. So that might
>>> be something to take into consideration.
>>
>>
>> An OFFSET_REF that isn't type_unknown_p, agreed.
>>
>>>>> It might simplify things to remove the optimization in build_offset_ref
>>>>> so we get an OFFSET_REF even for a single static member function, and
>>>>> add support for that to cp_build_addr_expr_1.
>>>
>>> I don't think this should be necessary, the "right thing" should just
>>> be done for explicit-object member functions. With all the stuff going
>>> on here that I missed I'm starting to wonder how function overloads
>>> ever worked at all in my patch. On the other hand though, this
>>> optimization probably could be documented better, but I very well might
>>> have missed it even if it were.
>>>
>>> Hell, maybe it needs a greater redesign altogether, it seems strange to
>>> me to bundle overload information in with a construct for a specific
>>> expression. (Assuming that's whats happening of course, I still don't
>>> fully understand it.) It's not like this has rules unique to it for how
>>> overload resolution is decided, right? Initializing a param/variable of
>>> pointer to function type with an overloaded function resolves that with
>>> similar rules, I think? Maybe it is a little different now that I write
>>> it out loud.
>>>
>>> I wasn't going to finish my musings about that, but it made me realize
>>> that it might not actually be correct for address of explicit-object
>>> member functions to be wrapped by OFFSET_REF. I mean surely it's fine
>>> because based on what you've said static member functions are also
>>> wrapped by OFFSET_REF, so it's likely fully implemented, especially
>>> considering things worked before. But now that there are 2 different
>>> varieties of class members that the address of them can be taken, it
>>> might make sense to split things up a bit? Then again, why were static
>>> member functions ever handled the same way? Taking the address of other
>>> static members isn't handled in the same way here is it?
>>
>>
>> Functions are different because of overloading; in general we can't
>> decide what an expression that names a function actually means until we
>> have enough context to decide which function, exactly. So we represent
>> the id-expression largely as lookup+syntax until overload resolution
>> turns it into a specific function. The type_unknown_p check earlier in
>> cp_build_addr_expr_1 is for that case.
> 
> Yeah this all makes sense, but that's why I was confused by the
> following documentation from cp-tree.def.
> 
> ```
> /* An OFFSET_REF is used in two situations:
> 
>     1. An expression of the form `A::m' where `A' is a class and `m' is
>        a non-static member.  In this case, operand 0 will be a TYPE
>        (corresponding to `A') and operand 1 will be a FIELD_DECL,
>        BASELINK, or TEMPLATE_ID_EXPR (corresponding to `m').
> 
>        The expression is a pointer-to-member if its address is taken,
>        but simply denotes a member of the object if its address is not
>        taken.
> ```

Yeah, I'll adjust that statement to be more conditional.  Is this clearer?

     1. An expression of the form `A::m' where `A' is a class and `m' is 

        a non-static member or an overload set.  In this case, operand 0 

        will be a TYPE (corresponding to `A') and operand 1 will be a 

        FIELD_DECL, BASELINK, or TEMPLATE_ID_EXPR (corresponding to 
`m').
 

        The expression is a pointer-to-member if its address is taken 
(and
        if, after any overload resolution, 'm' does not designate a 

        static or explicit object member function).  It simply denotes a 

        member of the object if its address is not taken. 


>> An OFFSET_REF that isn't type_unknown_p, agreed.
> But I suppose that's what this might have been referring to.
> 
> So is that the case then? OFFSET_REF might be for a regular address of
> member expression unless it's type_unknown_p?

If it's type_unknown_p we don't know which member it is, so we don't 
know whether taking its address gives a PMF or a regular pointer.

If it's not type_unknown_p, we know which member it is, and we currently 
skip building it at all for static members, so taking its address always 
gives a pointer to member.  explicit object member functions will make 
that assumption no longer valid.

>> An id-expression that names a single non-template function
>> (!really_overloaded_fn) is handled somewhat differently, as we don't
>> need to defer everything. But that means various special-case code.
>>
>> Currently build_offset_ref special-cases &A::f for a single static
>> member function, but we can't use the same special case for single
>> explicit object member functions because we need to distinguish between
>> &A::f and &f somehow to check the requirement I quoted above. So it
>> seems to me we'll need to add support for single explicit object member
>> functions in the OFFSET_REF handling in cp_build_addr_expr_1. And I
>> thought if we're doing that, perhaps we want to move the single static
>> handling over there as well, but that's not necessary.
> 
> I'm done for today but it does seem like that special case is what is
> causing my crash. I found that it only jumps into the section that it
> crashes in when there are no overloads. I'm kinda close to fixing it
> probably but I've spent too long on it for today, it's melting together.
> 
>> Currently build_offset_ref special-cases &A::f for a single static
>> member function, but we can't use the same special case for single
>> explicit object member functions because we need to distinguish between
>> &A::f and &f somehow to check the requirement I quoted above.
> 
> I don't understand what this means exactly, under what circumstances
> would &f find the member function. Oh, I guess while in the body of
> it's class, I hadn't considered that. Is that what you're referring to?

Right:

struct A {
   void g(this A&);
   A() {
     &A::g; // ok
     &g; // same error as for an implicit object member function
   }
};

> Well either way, I'm going to pick back up here tomorrow and see if I
> can finish figuring out exactly whats causing the problems. I'm pretty
> certain it's the special case that causes it, I'm just not sure what
> part exactly is causing the difference, and whether there aren't more
> hiding issues. But for now I need to get some rest. Thanks for the
> detailed responses, they helped a lot, this part of the code has been
> particularly difficult to figure out.

You're welcome, thanks for your interest.

Jason
waffl3x Oct. 28, 2023, 4:07 a.m. UTC | #20
I've been under the weather so I took a few days break, I honestly was
also very reluctant to pick it back up. The current problem is what I
like to call "not friendly", but I am back at it now.

> > I don't understand what this means exactly, under what circumstances
> > would &f find the member function. Oh, I guess while in the body of
> > it's class, I hadn't considered that. Is that what you're referring to?
>
>
> Right:
>
> struct A {
> void g(this A&);
> A() {
> &A::g; // ok
> &g; // same error as for an implicit object member function
> }
> };

I fully get this now, I threw together a test for it so this case
doesn't get forgotten about. Unfortunately though, I am concerned that
the approach I was going to take to fix the crash would have the
incorrect behavior for this.

Here is what I added to cp_build_addr_expr_1 with context included.
```
    case OFFSET_REF:
    offset_ref:
      /* Turn a reference to a non-static data member into a
	 pointer-to-member.  */
      {
	tree type;
	tree t;

	gcc_assert (PTRMEM_OK_P (arg));

	t = TREE_OPERAND (arg, 1);
	if (TYPE_REF_P (TREE_TYPE (t)))
	  {
	    if (complain & tf_error)
	      error_at (loc,
			"cannot create pointer to reference member %qD", t);
	    return error_mark_node;
	  }
        /* -- Waffl3x additions start -- */
	/* Exception for non-overloaded explicit object member function.  */
	if (TREE_CODE (TREE_TYPE (t)) == FUNCTION_TYPE)
	  return build1 (ADDR_EXPR, unknown_type_node, arg);
        /* -- Waffl3x additions end -- */

	/* Forming a pointer-to-member is a use of non-pure-virtual fns.  */
	if (TREE_CODE (t) == FUNCTION_DECL
	    && !DECL_PURE_VIRTUAL_P (t)
	    && !mark_used (t, complain) && !(complain & tf_error))
	  return error_mark_node;
```

I had hoped this naive solution would work just fine, but unfortunately
the following code fails to compile with an error.

```
struct S {
    void f(this S&) {}
};
int main() {
    void (*a)(S&) = &S::f;
}
```
normal_static.C: In function ‘int main()’:
normal_static.C:13:25: error: cannot convert ‘S::f’ from type ‘void(S&)’ to type ‘void (*)(S&)’
   13 |     void (*a)(S&) = &S::f;
      |                         ^

So clearly it isn't going to be that easy. I went up and down looking
at how the single static case is handled, and tried to read the code in
build_ptrmem_type and build_ptrmemfunc_type but I had a hard time
figuring it out.

The good news is, this problem was difficult enough that it made me
pick a proper diff tool with regex support instead of using a silly web
browser tool and pasting things into it. Or worse, pasting them into a
tool and doing replacement and then pasting them into the silly web
browser tool. I have been forced to improve my workflow thanks to this
head scratcher. So it's not all for naught.

Back on topic, it's not really the optimization returning a baselink
that is causing the original crash. It's just the assert in
build_ptrmem_type failing when a FUNCTION_TYPE is reaching it. The
optimization did throw me for a loop when I was comparing how my
previous version (that incorrectly set the lang_decl_fn ::
static_function flag) was handling things. Looking back, I think I
explained myself and the methodology I was using to investigate really
poorly, I apologize for the confusion I might have caused :).

To state it plainly, it seems to me that the arg parameter being passed
into cp_build_addr_expr_1 for explicit object member functions is
(mostly) pretty much correct and what we would want.

So the whole thing with the baselink optimization was really just a red
herring that I was following. Now that I have a better understanding of
what's happening leading up to and in cp_build_addr_expr_1 I don't
think it's relevant at all for this problem. With that said, I am
questioning again if the optimization that returns a baselink node is
the right way to do things. So this is something I'm going to put down
into my little notes text file to investigate at a later time, and
forget about it for the moment as it shouldn't be causing any friction
for us here.

Anyway, as I eluded to above, if I can't figure out the right way to
solve this problem in a decent amount of time I think I'm going to
leave it for now. I'll come back to it once other higher priority
things are fixed or finished. And hopefully someone more familiar with
this area of the code will have a better idea of what we need to do to
handle this case in a non-intrusive manner.

That wraps up my current status on this specifically. But while
investigating it I uncovered a few things that I feel are important to
discuss/note.

I wanted to change DECL_NONSTATIC_MEMBER_FUNCTION_P to include explicit
object member functions, but it had some problems when I made the
modification. I also noticed that it's used in cp-objcp-common.cc so
would making changes to it be a bad idea?

-- cp-tree.h
```
/* Nonzero for FUNCTION_DECL means that this decl is a non-static
   member function.  */
#define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \
  (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
```
I didn't want to investigate the problems as I was knee deep in
investigating the addressof bug. So I instead modified
DECL_FUNCTION_MEMBER_P to include explicit object member functions and
moved on.

-- cp-tree.h
```
/* Nonzero for FUNCTION_DECL means that this decl is a member function
   (static or non-static).  */
#define DECL_FUNCTION_MEMBER_P(NODE) \
  (DECL_NONSTATIC_MEMBER_FUNCTION_P (NODE) || DECL_STATIC_FUNCTION_P (NODE) \
  || DECL_IS_XOBJ_MEMBER_FUNC (NODE))
```
I am mostly just mentioning this here in case it becomes more relevant
later. Looking at how much DECL_NONSTATIC_MEMBER_FUNCTION_P is used
throughout the code I now suspect that adding explicit object member
functions to it might cause xobj member functions to be treated as
regular member functions when they should not be.

If this change were to stick it would cause a discrepancy in the
behavior of DECL_NONSTATIC_MEMBER_FUNCTION_P and it's name. If we were
to do this, I think it's important we document the discrepancy and why
it exists, and in the future, it should possibly be refactored. One
option would be to simply rename it to DECL_IOBJ_MEMBER_FUNCTION_P.
After all, I suspect that it's unlikely that the current macro
(DECL_NONSTATIC_MEMBER_FUNCTION_P) is being used in places that concern
explicit object member functions. So just adding explicit object member
functions to it will most likely just result in headaches.

It seems to me that would be the best solution, so when and if it comes
up again, I think that route should be considered.


Secondly, there are some differences in the nodes describing an
explicit object member function from those describing static member
functions and implicit object member functions that I am not sure
should be present.

I did my best to summarize the differences, if you want the logs of
tree_debug that I derived them from I can provide them. Most of my
understanding of the layout of the nodes is from reading print-tree.cc
and looking at debug_tree outputs, so it's possible I made a mistake.

I am opting to use the names of members as they are output by
debug_tree, I recognize this is not always the actual name of the
member in the actual tree_node structures.

Additionally, some of the differences listed are to be expected and are
most likely the correct values for each node. However, I wanted to be
exhaustive when listing them just in case I am mistaken in my opinion
on whether the differences should or should not occur.

The following declarations were used as input to the compiler.
iobj decl:
struct S { void f() {} };
xobj decl:
struct S { void f(this S&) {} };
static decl:
struct S { static void f(S&) {} };

These differences can be observed in the return values of
grokdeclarator for each declaration.

1. function_decl::type::tree_code
iobj: method_type
xobj: function_type
stat: function_type
2. function_decl::type::method basetype
iobj: <record_type 0x7ffff7194c78 S>
xobj: NULL/no output
stat: NULL/no output
3. function_decl::type::arg-types::tree_list[0]::value
iobj: <pointer_type>
xobj: <reference_type>
stat: <reference_type>
4. function_decl::decl_6
iobj: false/no output
xobj: false/no output
stat: true
5. function_decl::align
iobj: 16
xobj: 8
stat: 8
6. function_decl::result::uid
iobj: D.2513
xobj: D.2513
stat: D.2512
7. function_decl::full-name
iobj: "void S::f()"
xobj: "void S::f(this S&)"

Differences 1, 3, and 7 seem obviously correct to me for all 3
declarations, 6 is a little bizarre to me, but since it's just a UID
it's merely an oddity, I doubt it is concerning.

That leaves 2, 4, and 5.

2. I am pretty sure xobj functions should have the struct they are a
part of recorded as the method basetype member. I have already checked
that function_type and method_type are the same node type under the
hood and it does appear to be, so it should be trivial to set it.
However I do have to wonder why static member functions don't set it,
it seems to be that it would be convenient to use the same field. Can
you provide any insight into that?

4. I have no comment here, but it does concern me since I don't
understand it at all.

5. I am pretty sure this is fine for now, but if xobj member functions
ever were to support virtual/override capabilities, then it would be a
problem. Is my understanding correct, or is there some other reason
that iobj member functions have a different value here?

There are also some differences in the arg param in
cp_build_addr_expr_1 that concerns me, but most of them are reflected
in the differences I have already noted. I had wanted to include these
differences as well but I have been spending too much time staring at
it, it's no longer productive. In short, the indirect_ref node for xobj
member functions has reference_to_this set, while iobj member functions
do not. The baselink binfo field has the private flag set for xobj
member functions, iobj member functions does not.

I've spent too much time on this write up, so I am calling it here, it
wasn't all a waste of time because half of what I was doing here are
things I was going to need to do anyway at least. I still feel like I
spent too much time on it. Hopefully it's of some value for me/others
later.

Alex
waffl3x Oct. 28, 2023, 11:21 p.m. UTC | #21
I woke up today and figured it out in about 30 minutes, I don't know if
this solution would be okay but I am running away from
cp_build_addr_expr_1 for now. I think this is another place I will have
the urge to refactor in the future, but if I keep looking at it right
now I am just going to waste all my energy before I finish everything
else.

This line
```
  else if (BASELINK_P (TREE_OPERAND (arg, 1)))
```
near the bottom of the function is almost certainly dead. The switch
case (near the top) for baselink reassigns arg.
```
case BASELINK:
  arg = BASELINK_FUNCTIONS (arg);
```
And I'm pretty sure a baselink's functions can't be a baselink, so that
case near the bottom is almost certainly dead. I've put it in my todo
so I will look at it in the future.

Anyway, here is my new solution, it seems to work, but as indicated in
the comment (for myself) I'm not sure this handles constexpr things
right as near the bottom of this function there is some constexpr
handling. I know I said it took me 30 minutes but I realized I had some
holes I needed to check as I was writing this email, so it was more
like 2 hours.
```
        /* Forming a pointer-to-member is a use of non-pure-virtual fns.  */
        if (TREE_CODE (t) == FUNCTION_DECL
            && !DECL_PURE_VIRTUAL_P (t)
            && !mark_used (t, complain) && !(complain & tf_error))
          return error_mark_node;

        /* Exception for non-overloaded explicit object member function.  */
        /* I am pretty sure this is still not perfect, I think we aren't
           handling some constexpr stuff, but I am leaving it for now. */
        if (TREE_CODE (TREE_TYPE (t)) == FUNCTION_TYPE)
          return build_address (t);

        type = build_ptrmem_type (context_for_name_lookup (t),
                                  TREE_TYPE (t));
        t = make_ptrmem_cst (type, t);
        return t;
      }
```
I'm sharing it because I am still uncertain on whether this is the
ideal solution, this function is kind of a mess in general (sorry) so
it's hard to tell.

There's still a few things in my previous email that I would like
looked at, primarily the differences I observed in returns from
grokdeclarator, but I am glad to finally be moving forward on to other
things. Hopefully I can put together the patches with this one fixed.
(Of course I still have to fix the other bug I found as well.)

I had thought I might try to get lambda support and by value xobj
params working in this patch, but I think that will still be something
I work on once I am finished the initial patch. I still want to get
both of those into GCC14 though so I probably need to speed things up.

Alex

> 
> 
> I've been under the weather so I took a few days break, I honestly was
> also very reluctant to pick it back up. The current problem is what I
> like to call "not friendly", but I am back at it now.
> 
> > > I don't understand what this means exactly, under what circumstances
> > > would &f find the member function. Oh, I guess while in the body of
> > > it's class, I hadn't considered that. Is that what you're referring to?
> > 
> > Right:
> > 
> > struct A {
> > void g(this A&);
> > A() {
> > &A::g; // ok
> > &g; // same error as for an implicit object member function
> > }
> > };
> 
> 
> I fully get this now, I threw together a test for it so this case
> doesn't get forgotten about. Unfortunately though, I am concerned that
> the approach I was going to take to fix the crash would have the
> incorrect behavior for this.
> 
> Here is what I added to cp_build_addr_expr_1 with context included.
> `case OFFSET_REF: offset_ref: /* Turn a reference to a non-static data member into a pointer-to-member. */ { tree type; tree t; gcc_assert (PTRMEM_OK_P (arg)); t = TREE_OPERAND (arg, 1); if (TYPE_REF_P (TREE_TYPE (t))) { if (complain & tf_error) error_at (loc, "cannot create pointer to reference member %qD", t); return error_mark_node; } /* -- Waffl3x additions start -- */ /* Exception for non-overloaded explicit object member function. */ if (TREE_CODE (TREE_TYPE (t)) == FUNCTION_TYPE) return build1 (ADDR_EXPR, unknown_type_node, arg); /* -- Waffl3x additions end -- */ /* Forming a pointer-to-member is a use of non-pure-virtual fns. */ if (TREE_CODE (t) == FUNCTION_DECL && !DECL_PURE_VIRTUAL_P (t) && !mark_used (t, complain) && !(complain & tf_error)) return error_mark_node;`
> 
> I had hoped this naive solution would work just fine, but unfortunately
> the following code fails to compile with an error.
> 
> `struct S { void f(this S&) {} }; int main() { void (*a)(S&) = &S::f; }`
> normal_static.C: In function ‘int main()’:
> normal_static.C:13:25: error: cannot convert ‘S::f’ from type ‘void(S&)’ to type ‘void (*)(S&)’
> 13 | void (*a)(S&) = &S::f;
> | ^
> 
> So clearly it isn't going to be that easy. I went up and down looking
> at how the single static case is handled, and tried to read the code in
> build_ptrmem_type and build_ptrmemfunc_type but I had a hard time
> figuring it out.
> 
> The good news is, this problem was difficult enough that it made me
> pick a proper diff tool with regex support instead of using a silly web
> browser tool and pasting things into it. Or worse, pasting them into a
> tool and doing replacement and then pasting them into the silly web
> browser tool. I have been forced to improve my workflow thanks to this
> head scratcher. So it's not all for naught.
> 
> Back on topic, it's not really the optimization returning a baselink
> that is causing the original crash. It's just the assert in
> build_ptrmem_type failing when a FUNCTION_TYPE is reaching it. The
> optimization did throw me for a loop when I was comparing how my
> previous version (that incorrectly set the lang_decl_fn ::
> static_function flag) was handling things. Looking back, I think I
> explained myself and the methodology I was using to investigate really
> poorly, I apologize for the confusion I might have caused :).
> 
> To state it plainly, it seems to me that the arg parameter being passed
> into cp_build_addr_expr_1 for explicit object member functions is
> (mostly) pretty much correct and what we would want.
> 
> So the whole thing with the baselink optimization was really just a red
> herring that I was following. Now that I have a better understanding of
> what's happening leading up to and in cp_build_addr_expr_1 I don't
> think it's relevant at all for this problem. With that said, I am
> questioning again if the optimization that returns a baselink node is
> the right way to do things. So this is something I'm going to put down
> into my little notes text file to investigate at a later time, and
> forget about it for the moment as it shouldn't be causing any friction
> for us here.
> 
> Anyway, as I eluded to above, if I can't figure out the right way to
> solve this problem in a decent amount of time I think I'm going to
> leave it for now. I'll come back to it once other higher priority
> things are fixed or finished. And hopefully someone more familiar with
> this area of the code will have a better idea of what we need to do to
> handle this case in a non-intrusive manner.
> 
> That wraps up my current status on this specifically. But while
> investigating it I uncovered a few things that I feel are important to
> discuss/note.
> 
> I wanted to change DECL_NONSTATIC_MEMBER_FUNCTION_P to include explicit
> object member functions, but it had some problems when I made the
> modification. I also noticed that it's used in cp-objcp-common.cc so
> would making changes to it be a bad idea?
> 
> -- cp-tree.h
> `/* Nonzero for FUNCTION_DECL means that this decl is a non-static member function. */ #define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \\ (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)`
> I didn't want to investigate the problems as I was knee deep in
> investigating the addressof bug. So I instead modified
> DECL_FUNCTION_MEMBER_P to include explicit object member functions and
> moved on.
> 
> -- cp-tree.h
> `/* Nonzero for FUNCTION_DECL means that this decl is a member function (static or non-static). */ #define DECL_FUNCTION_MEMBER_P(NODE) \\ (DECL_NONSTATIC_MEMBER_FUNCTION_P (NODE) || DECL_STATIC_FUNCTION_P (NODE) \\ || DECL_IS_XOBJ_MEMBER_FUNC (NODE))`
> I am mostly just mentioning this here in case it becomes more relevant
> later. Looking at how much DECL_NONSTATIC_MEMBER_FUNCTION_P is used
> throughout the code I now suspect that adding explicit object member
> functions to it might cause xobj member functions to be treated as
> regular member functions when they should not be.
> 
> If this change were to stick it would cause a discrepancy in the
> behavior of DECL_NONSTATIC_MEMBER_FUNCTION_P and it's name. If we were
> to do this, I think it's important we document the discrepancy and why
> it exists, and in the future, it should possibly be refactored. One
> option would be to simply rename it to DECL_IOBJ_MEMBER_FUNCTION_P.
> After all, I suspect that it's unlikely that the current macro
> (DECL_NONSTATIC_MEMBER_FUNCTION_P) is being used in places that concern
> explicit object member functions. So just adding explicit object member
> functions to it will most likely just result in headaches.
> 
> It seems to me that would be the best solution, so when and if it comes
> up again, I think that route should be considered.
> 
> 
> Secondly, there are some differences in the nodes describing an
> explicit object member function from those describing static member
> functions and implicit object member functions that I am not sure
> should be present.
> 
> I did my best to summarize the differences, if you want the logs of
> tree_debug that I derived them from I can provide them. Most of my
> understanding of the layout of the nodes is from reading print-tree.cc
> and looking at debug_tree outputs, so it's possible I made a mistake.
> 
> I am opting to use the names of members as they are output by
> debug_tree, I recognize this is not always the actual name of the
> member in the actual tree_node structures.
> 
> Additionally, some of the differences listed are to be expected and are
> most likely the correct values for each node. However, I wanted to be
> exhaustive when listing them just in case I am mistaken in my opinion
> on whether the differences should or should not occur.
> 
> The following declarations were used as input to the compiler.
> iobj decl:
> struct S { void f() {} };
> xobj decl:
> struct S { void f(this S&) {} };
> static decl:
> struct S { static void f(S&) {} };
> 
> These differences can be observed in the return values of
> grokdeclarator for each declaration.
> 
> 1. function_decl::type::tree_code
> iobj: method_type
> xobj: function_type
> stat: function_type
> 2. function_decl::type::method basetype
> iobj: <record_type 0x7ffff7194c78 S>
> 
> xobj: NULL/no output
> stat: NULL/no output
> 3. function_decl::type::arg-types::tree_list[0]::value
> iobj: <pointer_type>
> 
> xobj: <reference_type>
> 
> stat: <reference_type>
> 
> 4. function_decl::decl_6
> iobj: false/no output
> xobj: false/no output
> stat: true
> 5. function_decl::align
> iobj: 16
> xobj: 8
> stat: 8
> 6. function_decl::result::uid
> iobj: D.2513
> xobj: D.2513
> stat: D.2512
> 7. function_decl::full-name
> iobj: "void S::f()"
> xobj: "void S::f(this S&)"
> 
> Differences 1, 3, and 7 seem obviously correct to me for all 3
> declarations, 6 is a little bizarre to me, but since it's just a UID
> it's merely an oddity, I doubt it is concerning.
> 
> That leaves 2, 4, and 5.
> 
> 2. I am pretty sure xobj functions should have the struct they are a
> part of recorded as the method basetype member. I have already checked
> that function_type and method_type are the same node type under the
> hood and it does appear to be, so it should be trivial to set it.
> However I do have to wonder why static member functions don't set it,
> it seems to be that it would be convenient to use the same field. Can
> you provide any insight into that?
> 
> 4. I have no comment here, but it does concern me since I don't
> understand it at all.
> 
> 5. I am pretty sure this is fine for now, but if xobj member functions
> ever were to support virtual/override capabilities, then it would be a
> problem. Is my understanding correct, or is there some other reason
> that iobj member functions have a different value here?
> 
> There are also some differences in the arg param in
> cp_build_addr_expr_1 that concerns me, but most of them are reflected
> in the differences I have already noted. I had wanted to include these
> differences as well but I have been spending too much time staring at
> it, it's no longer productive. In short, the indirect_ref node for xobj
> member functions has reference_to_this set, while iobj member functions
> do not. The baselink binfo field has the private flag set for xobj
> member functions, iobj member functions does not.
> 
> I've spent too much time on this write up, so I am calling it here, it
> wasn't all a waste of time because half of what I was doing here are
> things I was going to need to do anyway at least. I still feel like I
> spent too much time on it. Hopefully it's of some value for me/others
> later.
> 
> Alex
waffl3x Nov. 1, 2023, 11:15 p.m. UTC | #22
Just want to quickly check, when is the cutoff for GCC14 exactly? I
want to know how much time I have left to try to tackle this bug with
subscript. I'm going to be crunching out final stuff this week and I'll
try to get a (probably non-final) patch for you to review today.

Alex
waffl3x Nov. 2, 2023, 7:01 a.m. UTC | #23
The problem with operators is fixed, now starts a long period of
testing. It's been so long since I've gotten to this part that I almost
forgot that I have to do it :'). When/if regtests and bootstrap all
pass I will format the patch and submit it.



------- Original Message -------
On Wednesday, November 1st, 2023 at 5:15 PM, waffl3x <waffl3x@protonmail.com> wrote:


> 
> 
> Just want to quickly check, when is the cutoff for GCC14 exactly? I
> want to know how much time I have left to try to tackle this bug with
> subscript. I'm going to be crunching out final stuff this week and I'll
> try to get a (probably non-final) patch for you to review today.
> 
> Alex
Jakub Jelinek Nov. 2, 2023, 7:01 a.m. UTC | #24
On Wed, Nov 01, 2023 at 11:15:56PM +0000, waffl3x wrote:
> Just want to quickly check, when is the cutoff for GCC14 exactly? I

Nov 18th EOD.  But generally it is cutoff for posting new feature patches
that could be accepted, patches posted before the deadline if reviewed
soon after the deadline still can make it in.

	Jakub
Jason Merrill Nov. 3, 2023, 3:04 a.m. UTC | #25
On 10/28/23 00:07, waffl3x wrote:
> I wanted to change DECL_NONSTATIC_MEMBER_FUNCTION_P to include explicit
> object member functions, but it had some problems when I made the
> modification. I also noticed that it's used in cp-objcp-common.cc so
> would making changes to it be a bad idea?
> 
> -- cp-tree.h
> ```
> /* Nonzero for FUNCTION_DECL means that this decl is a non-static
>     member function.  */
> #define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \
>    (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> ```
> I didn't want to investigate the problems as I was knee deep in
> investigating the addressof bug. So I instead modified
> DECL_FUNCTION_MEMBER_P to include explicit object member functions and
> moved on.
> 
> -- cp-tree.h
> ```
> /* Nonzero for FUNCTION_DECL means that this decl is a member function
>     (static or non-static).  */
> #define DECL_FUNCTION_MEMBER_P(NODE) \
>    (DECL_NONSTATIC_MEMBER_FUNCTION_P (NODE) || DECL_STATIC_FUNCTION_P (NODE) \
>    || DECL_IS_XOBJ_MEMBER_FUNC (NODE))
> ```
> I am mostly just mentioning this here in case it becomes more relevant
> later. Looking at how much DECL_NONSTATIC_MEMBER_FUNCTION_P is used
> throughout the code I now suspect that adding explicit object member
> functions to it might cause xobj member functions to be treated as
> regular member functions when they should not be.
> 
> If this change were to stick it would cause a discrepancy in the
> behavior of DECL_NONSTATIC_MEMBER_FUNCTION_P and it's name. If we were
> to do this, I think it's important we document the discrepancy and why
> it exists, and in the future, it should possibly be refactored. One
> option would be to simply rename it to DECL_IOBJ_MEMBER_FUNCTION_P.
> After all, I suspect that it's unlikely that the current macro
> (DECL_NONSTATIC_MEMBER_FUNCTION_P) is being used in places that concern
> explicit object member functions. So just adding explicit object member
> functions to it will most likely just result in headaches.
> 
> It seems to me that would be the best solution, so when and if it comes
> up again, I think that route should be considered.

Agreed, it sounds good to rename the current macro and then add a new 
macro that includes both implicit and explicit, assuming that's a useful 
category.

> Secondly, there are some differences in the nodes describing an
> explicit object member function from those describing static member
> functions and implicit object member functions that I am not sure
> should be present.
> 
> I did my best to summarize the differences, if you want the logs of
> tree_debug that I derived them from I can provide them. Most of my
> understanding of the layout of the nodes is from reading print-tree.cc
> and looking at debug_tree outputs, so it's possible I made a mistake.
> 
> I am opting to use the names of members as they are output by
> debug_tree, I recognize this is not always the actual name of the
> member in the actual tree_node structures.
> 
> Additionally, some of the differences listed are to be expected and are
> most likely the correct values for each node. However, I wanted to be
> exhaustive when listing them just in case I am mistaken in my opinion
> on whether the differences should or should not occur.
> 
> The following declarations were used as input to the compiler.
> iobj decl:
> struct S { void f() {} };
> xobj decl:
> struct S { void f(this S&) {} };
> static decl:
> struct S { static void f(S&) {} };
> 
> These differences can be observed in the return values of
> grokdeclarator for each declaration.
> 
> 1. function_decl::type::tree_code
> iobj: method_type
> xobj: function_type
> stat: function_type
> 2. function_decl::type::method basetype
> iobj: <record_type 0x7ffff7194c78 S>
> xobj: NULL/no output
> stat: NULL/no output
> 3. function_decl::type::arg-types::tree_list[0]::value
> iobj: <pointer_type>
> xobj: <reference_type>
> stat: <reference_type>
> 4. function_decl::decl_6
> iobj: false/no output
> xobj: false/no output
> stat: true
> 5. function_decl::align
> iobj: 16
> xobj: 8
> stat: 8
> 6. function_decl::result::uid
> iobj: D.2513
> xobj: D.2513
> stat: D.2512
> 7. function_decl::full-name
> iobj: "void S::f()"
> xobj: "void S::f(this S&)"
> 
> Differences 1, 3, and 7 seem obviously correct to me for all 3
> declarations, 6 is a little bizarre to me, but since it's just a UID
> it's merely an oddity, I doubt it is concerning.

Agreed.

> That leaves 2, 4, and 5.
> 
> 2. I am pretty sure xobj functions should have the struct they are a
> part of recorded as the method basetype member. I have already checked
> that function_type and method_type are the same node type under the
> hood and it does appear to be, so it should be trivial to set it.
> However I do have to wonder why static member functions don't set it,
> it seems to be that it would be convenient to use the same field. Can
> you provide any insight into that?

method basetype is only for METHOD_TYPE; if you want the containing type 
of the function, that's DECL_CONTEXT.

> 4. I have no comment here, but it does concern me since I don't
> understand it at all.

In the list near the top of cp-tree.h, DECL_LANG_FLAG_6 for a 
FUNCTION_DECL is documented to be DECL_THIS_STATIC, which should only be 
set on the static member.

> 5. I am pretty sure this is fine for now, but if xobj member functions
> ever were to support virtual/override capabilities, then it would be a
> problem. Is my understanding correct, or is there some other reason
> that iobj member functions have a different value here?

It is surprising that an iobj memfn would have a different DECL_ALIGN, 
but it shouldn't be a problem; the vtables only rely on alignment being 
at least 2.

> There are also some differences in the arg param in
> cp_build_addr_expr_1 that concerns me, but most of them are reflected
> in the differences I have already noted. I had wanted to include these
> differences as well but I have been spending too much time staring at
> it, it's no longer productive. In short, the indirect_ref node for xobj
> member functions has reference_to_this set, while iobj member functions
> do not.

That's a result of difference 3.

> The baselink binfo field has the private flag set for xobj
> member functions, iobj member functions does not.

TREE_PRIVATE on a binfo is part of BINFO_ACCESS, which isn't a fixed 
value, but gets updated during member search.  Perhaps the differences 
in consideration of conversion to a base led to it being set or cleared 
differently?  I wouldn't worry too much about it unless you see 
differences in access control.

> I've spent too much time on this write up, so I am calling it here, it
> wasn't all a waste of time because half of what I was doing here are
> things I was going to need to do anyway at least. I still feel like I
> spent too much time on it. Hopefully it's of some value for me/others
> later.

I hope my responses are helpful as well.

Jason
waffl3x Nov. 3, 2023, 4:44 a.m. UTC | #26
> > That leaves 2, 4, and 5.
> > 
> > 2. I am pretty sure xobj functions should have the struct they are a
> > part of recorded as the method basetype member. I have already checked
> > that function_type and method_type are the same node type under the
> > hood and it does appear to be, so it should be trivial to set it.
> > However I do have to wonder why static member functions don't set it,
> > it seems to be that it would be convenient to use the same field. Can
> > you provide any insight into that?
> 
> 
> method basetype is only for METHOD_TYPE; if you want the containing type
> of the function, that's DECL_CONTEXT.

-- gcc/tree.h:530
#define FUNC_OR_METHOD_CHECK(T)	TREE_CHECK2 (T, FUNCTION_TYPE, METHOD_TYPE)
-- gcc/tree.h:2518
#define TYPE_METHOD_BASETYPE(NODE)			\
  (FUNC_OR_METHOD_CHECK (NODE)->type_non_common.maxval)

The code doesn't seem to reflect that, perhaps since the underlying
node type is the same (as far as I can tell, they both inherit from
tree_type_non_common) it wasn't believed to be necessary.

Upon looking at DECL_CONTEXT though I see it seems you were thinking of
FUNCTION_DECL. I haven't observed DECL_CONTEXT being set for
FUNCTION_DECL nodes though, perhaps it should be? Although it's more
likely that it is being set and I just haven't noticed, but if that's
the case I'll have to make a note to make sure it is being set for xobj
member functions.

I was going to say that this seems like a redundancy, but I realized
that the type of a member function pointer is tied to the struct, so it
actually ends up relevant for METHOD_TYPE nodes. I would hazard a guess
that when forming member function pointers the FUNCTION_DECL node was
not as easily accessed. If I remember correctly that is not the case
right now so it might be worthwhile to refactor away from
TYPE_METHOD_BASETYPE and replace uses of it with DECL_CONTEXT.

I'm getting ahead of myself though, I'll stop here and avoid going on
too much of a refactoring tangent. I do want this patch to make it into
GCC14 after all.

> > 4. I have no comment here, but it does concern me since I don't
> > understand it at all.
> 
> 
> In the list near the top of cp-tree.h, DECL_LANG_FLAG_6 for a
> FUNCTION_DECL is documented to be DECL_THIS_STATIC, which should only be
> set on the static member.

Right, I'll try to remember to check this area in the future, but yeah
that tracks because I did remove that flag. Removing that flag just so
happened to be the start of this saga of bug fixes but alas, it had to
be done.

> > 5. I am pretty sure this is fine for now, but if xobj member functions
> > ever were to support virtual/override capabilities, then it would be a
> > problem. Is my understanding correct, or is there some other reason
> > that iobj member functions have a different value here?
> 
> 
> It is surprising that an iobj memfn would have a different DECL_ALIGN,
> but it shouldn't be a problem; the vtables only rely on alignment being
> at least 2.

I'll put a note for myself to look into it in the future, it's an
oddity at minimum and oddities interest me :^).

> > There are also some differences in the arg param in
> > cp_build_addr_expr_1 that concerns me, but most of them are reflected
> > in the differences I have already noted. I had wanted to include these
> > differences as well but I have been spending too much time staring at
> > it, it's no longer productive. In short, the indirect_ref node for xobj
> > member functions has reference_to_this set, while iobj member functions
> > do not.
> 
> 
> That's a result of difference 3.

Okay, makes sense, I'm mildly concerned about any possible side effects
this might have since we have a function_type node suddenly going
through execution paths that only method_type went through before. The
whole "reference_to_this" "pointer_to_this" thing is a little confusing
because I'm pretty sure that doesn't refer to the actual `this` object
argument or parameter since I've seen it all over the place. Hopefully
it's benign.

> > The baselink binfo field has the private flag set for xobj
> > member functions, iobj member functions does not.
> 
> 
> TREE_PRIVATE on a binfo is part of BINFO_ACCESS, which isn't a fixed
> value, but gets updated during member search. Perhaps the differences
> in consideration of conversion to a base led to it being set or cleared
> differently? I wouldn't worry too much about it unless you see
> differences in access control.

Unfortunately I don't have any tests for private/public access yet,
it's one of the area's I've neglected. Unfortunately I probably won't
put too much effort into writing TOO many more right now as it takes up
a lot of my time. I still have to clean up the ones I currently have
and I have a few I wanted to write that are not yet written.

> > I've spent too much time on this write up, so I am calling it here, it
> > wasn't all a waste of time because half of what I was doing here are
> > things I was going to need to do anyway at least. I still feel like I
> > spent too much time on it. Hopefully it's of some value for me/others
> > later.
> 
> 
> I hope my responses are helpful as well.

Very much so, thank you!

An update on the regression testing, I had one test fail comparing
against commit a4d2b108cf234e7893322a32a7956ca24e283b05 (GCC13) and I'm
not sure if I need to be concerned about it.
libgomp.c++/../libgomp.c-c++-common/for-16.c execution test

I'm going to be starting tests for my patch against trunk now, once
that is finished I should be ready to format a patch for review.

That's all for now, thanks again.

Alex
Jason Merrill Nov. 3, 2023, 6:05 p.m. UTC | #27
On 11/3/23 00:44, waffl3x wrote:
>>> That leaves 2, 4, and 5.
>>>
>>> 2. I am pretty sure xobj functions should have the struct they are a
>>> part of recorded as the method basetype member. I have already checked
>>> that function_type and method_type are the same node type under the
>>> hood and it does appear to be, so it should be trivial to set it.
>>> However I do have to wonder why static member functions don't set it,
>>> it seems to be that it would be convenient to use the same field. Can
>>> you provide any insight into that?
>>
>>
>> method basetype is only for METHOD_TYPE; if you want the containing type
>> of the function, that's DECL_CONTEXT.
> 
> -- gcc/tree.h:530
> #define FUNC_OR_METHOD_CHECK(T)	TREE_CHECK2 (T, FUNCTION_TYPE, METHOD_TYPE)
> -- gcc/tree.h:2518
> #define TYPE_METHOD_BASETYPE(NODE)			\
>    (FUNC_OR_METHOD_CHECK (NODE)->type_non_common.maxval)
> 
> The code doesn't seem to reflect that, perhaps since the underlying
> node type is the same (as far as I can tell, they both inherit from
> tree_type_non_common) it wasn't believed to be necessary.

Curious.  It might also be a remnant of how older code dealt with how 
sometimes a member function changes between FUNCTION_TYPE and 
METHOD_TYPE during parsing.

> Upon looking at DECL_CONTEXT though I see it seems you were thinking of
> FUNCTION_DECL. I haven't observed DECL_CONTEXT being set for
> FUNCTION_DECL nodes though, perhaps it should be? Although it's more
> likely that it is being set and I just haven't noticed, but if that's
> the case I'll have to make a note to make sure it is being set for xobj
> member functions.

I would certainly expect it to be getting set already.

> I was going to say that this seems like a redundancy, but I realized
> that the type of a member function pointer is tied to the struct, so it
> actually ends up relevant for METHOD_TYPE nodes. I would hazard a guess
> that when forming member function pointers the FUNCTION_DECL node was
> not as easily accessed. If I remember correctly that is not the case
> right now so it might be worthwhile to refactor away from
> TYPE_METHOD_BASETYPE and replace uses of it with DECL_CONTEXT.

For a pointer-to-member, the class type is part of the type, so trying 
to remove it from the type doesn't sound like an improvement to me. 
Specifically, TYPE_PTRMEM_CLASS_TYPE refers to TYPE_METHOD_BASETYPE for 
a pointer to member function.

> I'm getting ahead of myself though, I'll stop here and avoid going on
> too much of a refactoring tangent. I do want this patch to make it into
> GCC14 after all.

Good plan.  :)

>>> 4. I have no comment here, but it does concern me since I don't
>>> understand it at all.
>>
>> In the list near the top of cp-tree.h, DECL_LANG_FLAG_6 for a
>> FUNCTION_DECL is documented to be DECL_THIS_STATIC, which should only be
>> set on the static member.
> 
> Right, I'll try to remember to check this area in the future, but yeah
> that tracks because I did remove that flag. Removing that flag just so
> happened to be the start of this saga of bug fixes but alas, it had to
> be done.
> 
>>> 5. I am pretty sure this is fine for now, but if xobj member functions
>>> ever were to support virtual/override capabilities, then it would be a
>>> problem. Is my understanding correct, or is there some other reason
>>> that iobj member functions have a different value here?
>>
>> It is surprising that an iobj memfn would have a different DECL_ALIGN,
>> but it shouldn't be a problem; the vtables only rely on alignment being
>> at least 2.
> 
> I'll put a note for myself to look into it in the future, it's an
> oddity at minimum and oddities interest me :^).
> 
>>> There are also some differences in the arg param in
>>> cp_build_addr_expr_1 that concerns me, but most of them are reflected
>>> in the differences I have already noted. I had wanted to include these
>>> differences as well but I have been spending too much time staring at
>>> it, it's no longer productive. In short, the indirect_ref node for xobj
>>> member functions has reference_to_this set, while iobj member functions
>>> do not.
>>
>> That's a result of difference 3.
> 
> Okay, makes sense, I'm mildly concerned about any possible side effects
> this might have since we have a function_type node suddenly going
> through execution paths that only method_type went through before. The
> whole "reference_to_this" "pointer_to_this" thing is a little confusing
> because I'm pretty sure that doesn't refer to the actual `this` object
> argument or parameter since I've seen it all over the place. Hopefully
> it's benign.

Yes, "pointer_to_this" is just a cache of the type that is a pointer to 
the type you're looking at.  You are correct that it has nothing to do 
with the C++ 'this'.

>>> The baselink binfo field has the private flag set for xobj
>>> member functions, iobj member functions does not.
>>
>> TREE_PRIVATE on a binfo is part of BINFO_ACCESS, which isn't a fixed
>> value, but gets updated during member search. Perhaps the differences
>> in consideration of conversion to a base led to it being set or cleared
>> differently? I wouldn't worry too much about it unless you see
>> differences in access control.
> 
> Unfortunately I don't have any tests for private/public access yet,
> it's one of the area's I've neglected. Unfortunately I probably won't
> put too much effort into writing TOO many more right now as it takes up
> a lot of my time. I still have to clean up the ones I currently have
> and I have a few I wanted to write that are not yet written.

Makes sense.  I wouldn't expect access control to need specific changes.

>>> I've spent too much time on this write up, so I am calling it here, it
>>> wasn't all a waste of time because half of what I was doing here are
>>> things I was going to need to do anyway at least. I still feel like I
>>> spent too much time on it. Hopefully it's of some value for me/others
>>> later.
>>
>> I hope my responses are helpful as well.
> 
> Very much so, thank you!
> 
> An update on the regression testing, I had one test fail comparing
> against commit a4d2b108cf234e7893322a32a7956ca24e283b05 (GCC13) and I'm
> not sure if I need to be concerned about it.
> libgomp.c++/../libgomp.c-c++-common/for-16.c execution test

No, that test has been pretty flaky for me, you can ignore it.

> I'm going to be starting tests for my patch against trunk now, once
> that is finished I should be ready to format a patch for review.

Great!

Jason
waffl3x Nov. 4, 2023, 6:40 a.m. UTC | #28
I'm unfortunately going down a rabbit hole again.

--function.h:608
```
/* If pointers to member functions use the least significant bit to
   indicate whether a function is virtual, ensure a pointer
   to this function will have that bit clear.  */
#define MINIMUM_METHOD_BOUNDARY \
  ((TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn)	     \
   ? MAX (FUNCTION_BOUNDARY, 2 * BITS_PER_UNIT) : FUNCTION_BOUNDARY)
```
--decl.cc:10400 grokfndecl
```
  if (TREE_CODE (type) == METHOD_TYPE)
    {
      tree parm = build_this_parm (decl, type, quals);
      DECL_CHAIN (parm) = parms;
      parms = parm;

      /* Allocate space to hold the vptr bit if needed.  */
      SET_DECL_ALIGN (decl, MINIMUM_METHOD_BOUNDARY);
    }
```
The good news is I think I found where the alignment is being set, so
it can be addressed later if desired.

I stumbled upon this while cleaning up the patch, grokfndecl is just so
full of cruft it's crazy hard to reason about. There's more than one
block that I am near certain is completely dead code. I would like to
just ignore them for now but some of them unfortunately pertain to xobj
functions. I just don't feel good about putting in any hacks, but to
really get any modifications in here correct it would need to be
refactored much more than I should be doing in this patch.

Here's another example that I'm not sure how I want to address it.

~decl.cc:10331 grokfndecl
```
  int staticp = ctype && TREE_CODE (type) == FUNCTION_TYPE;
```
~decl.cc:10506 grokfndecl
```
  /* If this decl has namespace scope, set that up.  */
  if (in_namespace)
    set_decl_namespace (decl, in_namespace, friendp);
  else if (ctype)
    DECL_CONTEXT (decl) = ctype;
  else
    DECL_CONTEXT (decl) = FROB_CONTEXT (current_decl_namespace ());
```
And just a few lines down;
~decl.cc:10529
```
  /* Should probably propagate const out from type to decl I bet (mrs).  */
  if (staticp)
    {
      DECL_STATIC_FUNCTION_P (decl) = 1;
      DECL_CONTEXT (decl) = ctype;
    }
```

If staticp is true, ctype must have been non-null, and if ctype is
non-null, the context for decl should have been set in the second
block. So why was the code in the second block added?

commit f3665bd1111c1799c0421490b5e655f977570354
Author: Nathan Sidwell <nathan@acm.org>
Date:   Tue Jul 28 08:57:36 2020 -0700

    c++: Set more DECL_CONTEXTs
    
    I discovered we were not setting DECL_CONTEXT in a few cases, and
    grokfndecl's control flow wasn't making it clear that we were doing it
    in all cases.
    
            gcc/cp/
            * cp-gimplify.c (cp_genericize_r): Set IMPORTED_DECL's context.
            * cp-objcp-common.c (cp_pushdecl): Set decl's context.
            * decl.c (grokfndecl): Make DECL_CONTEXT setting clearer.

According to the commit, it was because it was not clear, which quite
frankly I can agree to, it just wasn't determined that the code below
is redundantly setting the context so it wasn't removed.

This puts me in a dilemma though, do I put another condition in that
code block for the xobj case even though the code is nearly dead? Or do
I give it a minor refactor for it to make a little more sense? If I add
to the code I feel like it's just going to add to the problem, while if
I give it a minor refactor it still won't look great and has a greater
chance of breaking something.

In this case I'm going to risk refactoring it, staticp is only used in
that 1 place so I will just rip it out. I am not concerned with decl's
type spontaneously changing to something that is not FUNCTION_TYPE, and
if it did I think there are bigger problems afoot.

I guess I'll know if I went too far with the refactoring when the patch
reaches you, do let me know about this one specifically though because
it took up a lot of my time trying to decide how to address it.

> > The code doesn't seem to reflect that, perhaps since the underlying
> > node type is the same (as far as I can tell, they both inherit from
> > tree_type_non_common) it wasn't believed to be necessary.
> 
> 
> Curious. It might also be a remnant of how older code dealt with how
> sometimes a member function changes between FUNCTION_TYPE and
> METHOD_TYPE during parsing.

Sounds plausible.

> > Upon looking at DECL_CONTEXT though I see it seems you were thinking of
> > FUNCTION_DECL. I haven't observed DECL_CONTEXT being set for
> > FUNCTION_DECL nodes though, perhaps it should be? Although it's more
> > likely that it is being set and I just haven't noticed, but if that's
> > the case I'll have to make a note to make sure it is being set for xobj
> > member functions.
> 
> 
> I would certainly expect it to be getting set already.

This being on my mind is partially what sent me down the rabbit hole
above, but yeah, it seems like it is.

> > I was going to say that this seems like a redundancy, but I realized
> > that the type of a member function pointer is tied to the struct, so it
> > actually ends up relevant for METHOD_TYPE nodes. I would hazard a guess
> > that when forming member function pointers the FUNCTION_DECL node was
> > not as easily accessed. If I remember correctly that is not the case
> > right now so it might be worthwhile to refactor away from
> > TYPE_METHOD_BASETYPE and replace uses of it with DECL_CONTEXT.
> 
> 
> For a pointer-to-member, the class type is part of the type, so trying
> to remove it from the type doesn't sound like an improvement to me.
> Specifically, TYPE_PTRMEM_CLASS_TYPE refers to TYPE_METHOD_BASETYPE for
> a pointer to member function.

I suppose it's not a good idea given that other decl nodes use the same
field, but redundancies do feel icky to me. It makes stuff way harder
to reason about for me.

> > 
> > Unfortunately I don't have any tests for private/public access yet,
> > it's one of the area's I've neglected. Unfortunately I probably won't
> > put too much effort into writing TOO many more right now as it takes up
> > a lot of my time. I still have to clean up the ones I currently have
> > and I have a few I wanted to write that are not yet written.
> 
> 
> Makes sense. I wouldn't expect access control to need specific changes.

I think I saw something that concerned me regarding this in grokfndecl,
but I agree that it shouldn't need specific changes, so I'm going to
assume I imagined it unless problems start to pop up :).

> No, that test has been pretty flaky for me, you can ignore it.

Noted.

> > I'm going to be starting tests for my patch against trunk now, once
> > that is finished I should be ready to format a patch for review.
> 
> 
> Great!

All tests seemed to pass when applied to GCC14, but the results did
something funny where it said tests disappeared and new tests appeared
and passed. The ones that disappeared and the new ones that appeared
looked like they were identical so I'm not worrying about it. Just
mentioning it in case this is something I do need to look into.

Back to work now, thank you as always.

Alex
Jason Merrill Nov. 9, 2023, 6:44 p.m. UTC | #29
On 11/4/23 02:40, waffl3x wrote:
> I'm unfortunately going down a rabbit hole again.
> 
> --function.h:608
> ```
> /* If pointers to member functions use the least significant bit to
>     indicate whether a function is virtual, ensure a pointer
>     to this function will have that bit clear.  */
> #define MINIMUM_METHOD_BOUNDARY \
>    ((TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn)	     \
>     ? MAX (FUNCTION_BOUNDARY, 2 * BITS_PER_UNIT) : FUNCTION_BOUNDARY)
> ```

So yes, it was for PMFs using the low bit of the pointer to indicate a 
virtual member function.  Since an xob memfn can't be virtual, it's 
correct for them to have the same alignment as a static memfn.

> I stumbled upon this while cleaning up the patch, grokfndecl is just so
> full of cruft it's crazy hard to reason about. There's more than one
> block that I am near certain is completely dead code. I would like to
> just ignore them for now but some of them unfortunately pertain to xobj
> functions. I just don't feel good about putting in any hacks, but to
> really get any modifications in here correct it would need to be
> refactored much more than I should be doing in this patch.
> 
> Here's another example that I'm not sure how I want to address it.
> 
> ~decl.cc:10331 grokfndecl
> ```
>    int staticp = ctype && TREE_CODE (type) == FUNCTION_TYPE;
> ```
> ~decl.cc:10506 grokfndecl
> ```
>    /* If this decl has namespace scope, set that up.  */
>    if (in_namespace)
>      set_decl_namespace (decl, in_namespace, friendp);
>    else if (ctype)
>      DECL_CONTEXT (decl) = ctype;
>    else
>      DECL_CONTEXT (decl) = FROB_CONTEXT (current_decl_namespace ());
> ```
> And just a few lines down;
> ~decl.cc:10529
> ```
>    /* Should probably propagate const out from type to decl I bet (mrs).  */
>    if (staticp)
>      {
>        DECL_STATIC_FUNCTION_P (decl) = 1;
>        DECL_CONTEXT (decl) = ctype;
>      }
> ```
> 
> If staticp is true, ctype must have been non-null, and if ctype is
> non-null, the context for decl should have been set in the second
> block. So why was the code in the second block added?
>
> commit f3665bd1111c1799c0421490b5e655f977570354
> Author: Nathan Sidwell <nathan@acm.org>
> Date:   Tue Jul 28 08:57:36 2020 -0700
> 
>      c++: Set more DECL_CONTEXTs
>      
>      I discovered we were not setting DECL_CONTEXT in a few cases, and
>      grokfndecl's control flow wasn't making it clear that we were doing it
>      in all cases.
>      
>              gcc/cp/
>              * cp-gimplify.c (cp_genericize_r): Set IMPORTED_DECL's context.
>              * cp-objcp-common.c (cp_pushdecl): Set decl's context.
>              * decl.c (grokfndecl): Make DECL_CONTEXT setting clearer.
> 
> According to the commit, it was because it was not clear, which quite
> frankly I can agree to, it just wasn't determined that the code below
> is redundantly setting the context so it wasn't removed.
> 
> This puts me in a dilemma though, do I put another condition in that
> code block for the xobj case even though the code is nearly dead? Or do
> I give it a minor refactor for it to make a little more sense? If I add
> to the code I feel like it's just going to add to the problem, while if
> I give it a minor refactor it still won't look great and has a greater
> chance of breaking something.
> 
> In this case I'm going to risk refactoring it, staticp is only used in
> that 1 place so I will just rip it out. I am not concerned with decl's
> type spontaneously changing to something that is not FUNCTION_TYPE, and
> if it did I think there are bigger problems afoot.
> 
> I guess I'll know if I went too far with the refactoring when the patch
> reaches you, do let me know about this one specifically though because
> it took up a lot of my time trying to decide how to address it.

Removing the redundant DECL_CONTEXT setting seems appropriate, and 
changing how staticp is handled to reflect that xobfns can also have 
FUNCTION_TYPE.

> All tests seemed to pass when applied to GCC14, but the results did
> something funny where it said tests disappeared and new tests appeared
> and passed. The ones that disappeared and the new ones that appeared
> looked like they were identical so I'm not worrying about it. Just
> mentioning it in case this is something I do need to look into.

That doesn't sound like a problem, but I'm curious about the specific 
output you're seeing.

Jason
waffl3x Nov. 10, 2023, 4:24 a.m. UTC | #30
> > I'm unfortunately going down a rabbit hole again.
> > 
> > --function.h:608
> > `/* If pointers to member functions use the least significant bit to indicate whether a function is virtual, ensure a pointer to this function will have that bit clear. */ #define MINIMUM_METHOD_BOUNDARY \\ ((TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn) \\ ? MAX (FUNCTION_BOUNDARY, 2 * BITS_PER_UNIT) : FUNCTION_BOUNDARY)`
> 
> 
> So yes, it was for PMFs using the low bit of the pointer to indicate a
> virtual member function. Since an xob memfn can't be virtual, it's
> correct for them to have the same alignment as a static memfn.

Is it worth considering whether we want to support virtual xobj member
functions in the future? If that were the case would it be better if we
aligned things a little differently here? Or might it be better if we
wanted to support it as an extension to just effectively translate the
declaration back to one that is a METHOD_TYPE? I imagine this would be
the best solution for non-standard support of the syntax. We would
simply have to forbid by-value and conversion semantics and on the
user's side they would get consistent syntax.

However, this flies in the face of the defective/contradictory spec for
virtual function overrides. So I'm not really sure whether we would
want to do this. I just want to raise the question before we lock in
the alignment, if pushing the patch locks it in that is, I'm not really
sure if it needs to be stable or not.

> > I stumbled upon this while cleaning up the patch, grokfndecl is just so
> > full of cruft it's crazy hard to reason about. There's more than one
> > block that I am near certain is completely dead code. I would like to
> > just ignore them for now but some of them unfortunately pertain to xobj
> > functions. I just don't feel good about putting in any hacks, but to
> > really get any modifications in here correct it would need to be
> > refactored much more than I should be doing in this patch.
> > 
> > Here's another example that I'm not sure how I want to address it.
> > 
> > :10331~decl.cc grokfndecl
> > `int staticp = ctype && TREE_CODE (type) == FUNCTION_TYPE;`
> > :10506~decl.cc grokfndecl
> > `/* If this decl has namespace scope, set that up. */ if (in_namespace) set_decl_namespace (decl, in_namespace, friendp); else if (ctype) DECL_CONTEXT (decl) = ctype; else DECL_CONTEXT (decl) = FROB_CONTEXT (current_decl_namespace ());`
> > And just a few lines down;
> > :10529~decl.cc
> > `/* Should probably propagate const out from type to decl I bet (mrs). */ if (staticp) { DECL_STATIC_FUNCTION_P (decl) = 1; DECL_CONTEXT (decl) = ctype; }`
> > 
> > If staticp is true, ctype must have been non-null, and if ctype is
> > non-null, the context for decl should have been set in the second
> > block. So why was the code in the second block added?
> > 
> > commit f3665bd1111c1799c0421490b5e655f977570354
> > Author: Nathan Sidwell nathan@acm.org
> > Date: Tue Jul 28 08:57:36 2020 -0700
> > 
> > c++: Set more DECL_CONTEXTs
> > 
> > I discovered we were not setting DECL_CONTEXT in a few cases, and
> > grokfndecl's control flow wasn't making it clear that we were doing it
> > in all cases.
> > 
> > gcc/cp/
> > * cp-gimplify.c (cp_genericize_r): Set IMPORTED_DECL's context.
> > * cp-objcp-common.c (cp_pushdecl): Set decl's context.
> > * decl.c (grokfndecl): Make DECL_CONTEXT setting clearer.
> > 
> > According to the commit, it was because it was not clear, which quite
> > frankly I can agree to, it just wasn't determined that the code below
> > is redundantly setting the context so it wasn't removed.
> > 
> > This puts me in a dilemma though, do I put another condition in that
> > code block for the xobj case even though the code is nearly dead? Or do
> > I give it a minor refactor for it to make a little more sense? If I add
> > to the code I feel like it's just going to add to the problem, while if
> > I give it a minor refactor it still won't look great and has a greater
> > chance of breaking something.
> > 
> > In this case I'm going to risk refactoring it, staticp is only used in
> > that 1 place so I will just rip it out. I am not concerned with decl's
> > type spontaneously changing to something that is not FUNCTION_TYPE, and
> > if it did I think there are bigger problems afoot.
> > 
> > I guess I'll know if I went too far with the refactoring when the patch
> > reaches you, do let me know about this one specifically though because
> > it took up a lot of my time trying to decide how to address it.
> 
> 
> Removing the redundant DECL_CONTEXT setting seems appropriate, and
> changing how staticp is handled to reflect that xobfns can also have
> FUNCTION_TYPE.

I removed static_p as it was only used in that one case, I'm pretty
happy with the resulting code but I saw you replied on the patch as
well so I'll see if you commented on it in the review and address your
thoughts there.

> > All tests seemed to pass when applied to GCC14, but the results did
> > something funny where it said tests disappeared and new tests appeared
> > and passed. The ones that disappeared and the new ones that appeared
> > looked like they were identical so I'm not worrying about it. Just
> > mentioning it in case this is something I do need to look into.
> 
> 
> That doesn't sound like a problem, but I'm curious about the specific
> output you're seeing.
> 
> Jason

I've attached a few test result comparisons so you can take a look.

Alex
Jason Merrill Nov. 10, 2023, 11:12 p.m. UTC | #31
[combined reply to all three threads]

On 11/9/23 23:24, waffl3x wrote:
> 
>>> I'm unfortunately going down a rabbit hole again.
>>>
>>> --function.h:608
>>> `/* If pointers to member functions use the least significant bit to indicate whether a function is virtual, ensure a pointer to this function will have that bit clear. */ #define MINIMUM_METHOD_BOUNDARY \\ ((TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn) \\ ? MAX (FUNCTION_BOUNDARY, 2 * BITS_PER_UNIT) : FUNCTION_BOUNDARY)`
>>
>>
>> So yes, it was for PMFs using the low bit of the pointer to indicate a
>> virtual member function. Since an xob memfn can't be virtual, it's
>> correct for them to have the same alignment as a static memfn.
> 
> Is it worth considering whether we want to support virtual xobj member
> functions in the future? If that were the case would it be better if we
> aligned things a little differently here? Or might it be better if we
> wanted to support it as an extension to just effectively translate the
> declaration back to one that is a METHOD_TYPE? I imagine this would be
> the best solution for non-standard support of the syntax. We would
> simply have to forbid by-value and conversion semantics and on the
> user's side they would get consistent syntax.
> 
> However, this flies in the face of the defective/contradictory spec for
> virtual function overrides. So I'm not really sure whether we would
> want to do this. I just want to raise the question before we lock in
> the alignment, if pushing the patch locks it in that is, I'm not really
> sure if it needs to be stable or not.

It doesn't need to be stable; we can increase the alignment of decls as 
needed in new code without breaking older code.

>>> All tests seemed to pass when applied to GCC14, but the results did
>>> something funny where it said tests disappeared and new tests appeared
>>> and passed. The ones that disappeared and the new ones that appeared
>>> looked like they were identical so I'm not worrying about it. Just
>>> mentioning it in case this is something I do need to look into.
>>
>> That doesn't sound like a problem, but I'm curious about the specific
>> output you're seeing.
> 
> I've attached a few test result comparisons so you can take a look.

Looks like you're comparing results from different build directories and 
the libitm test wrongly includes the build directory in the test "name". 
  So yeah, just noise.

> Side note, would you prefer I compile the lambda and by-value fixes
> into a new version of this patch? Or as a separate patch? Originally I
> had planned to put it in another patch, but I identified that the code
> I wrote in build_over_call was kind of fundamentally broken and it was
> almost merely coincidence that it worked at all. In light of this and
> your comments (which I've skimmed, I will respond directly below) I
> think I should just revise this patch with everything else.

Agreed.

>>> There are a few known issues still present in this patch. Most importantly,
>>> the implicit object argument fails to convert when passed to by-value xobj
>>> parameters. This occurs both for xobj parameters that match the argument type
>>> and xobj parameters that are unrelated to the object type, but have valid
>>> conversions available. This behavior can be observed in the
>>> explicit-obj-by-value[1-3].C tests. The implicit object argument appears to be
>>> simply reinterpreted instead of any conversion applied. This is elaborated on
>>> in the test cases.
>>
>> Yes, that's because of:
>>
>>> @@ -9949,7 +9951,8 @@ build_over_call (struct z_candidate cand, int flags, tsubst_flags_t complain)
>>> }
>>> }
>>> / Bypass access control for 'this' parameter. */
>>> - else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
>>> + else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
>>> + || DECL_XOBJ_MEMBER_FUNC_P (fn))
>>
>>
>> We don't want to take this path for xob fns. Instead I think we need to
>> change the existing:
>>
>>> gcc_assert (first_arg == NULL_TREE);
>>
>>
>> to assert that if first_arg is non-null, we're dealing with an xob fn,
>> and then go ahead and do the same conversion as the loop body on first_arg.
>>
>>> Despite this, calls where there is no valid conversion
>>> available are correctly rejected, which I find surprising. The
>>> explicit-obj-by-value4.C testcase demonstrates this odd but correct behavior.
>>
>>
>> Yes, because checking for conversions is handled elsewhere.
> 
> Yeah, as I noted above I realized that just handling it the same way as
> iobj member functions is fundamentally broken. I was staring at it last
> night and eventually realized that I could just copy the loop body. I
> ended up asserting in the body handling the implicit object argument
> for xobj member functions that first_arg != NULL_TREE, which I wasn't
> sure of, but it seems to work.

That sounds like it might cause trouble with

struct A {
    void f(this A);
};

int main()
{
   (&A::f) (A());
}

> I tried asking in IRC if there are any circumstances where first_arg
> would be null for a non-static member function and I didn't get an
> answer. The code above seemed to indicate that it could be. It just
> looks like old code that is no longer valid and never got removed.
> Consequently this function has made it on my list of things to refactor
> :^).

Right, first_arg is only actually used for the implicit object argument, 
it's just easier to store it separately from the arguments in ().  I'm 
not sure which code you mean is no longer valid?

>>> Other than this, lambdas are not yet supported,
>>
>>
>> The error I'm seeing with the lambda testcase is "explicit object member
>> function cannot have cv-qualifier". To avoid that, in
>> cp_parser_lambda_declarator_opt you need to set quals to
>> TYPE_UNQUALIFIED around where we do that for mutable lambdas.
> 
> Yeah, this ended up being the case as suspected, and it was exactly in
> cp_parser_lambda_declarator_opt that it needed to be done. There's an
> additional quirk in start_preparsed_function, which I already rambled
> about a little in a previous reply on this chain, that suddenly
> restored setting up the 'this' pointer. Previously, it was being
> skipped because ctype was not being set for xobj member functions.
> However, ctype not being set was causing the scope to not be set up
> correctly for lambdas. In truth I don't remember exactly how the
> problem was presenting but I do know how I fixed it.
> 
> ```
>   tree fntype = TREE_TYPE (decl1);
>   if (TREE_CODE (fntype) == METHOD_TYPE)
>     ctype = TYPE_METHOD_BASETYPE (fntype);
>   else if (DECL_XOBJ_MEMBER_FUNC_P (decl1))
>     ctype = DECL_CONTEXT (decl1);
> ```
> 
> In hindsight though, I have to question if this is the correct way of
> dealing with it. As I mentioned previously, there's an additional quirk
> in start_preparsed_function where it sets up the 'this' param, or at
> least, it kind of looks like it? This is where my rambling was about.
> 
> ```
>   /* We don't need deal with 'this' or vtable for xobj member functions.  */
>   if (ctype && !doing_friend &&
>       !(DECL_STATIC_FUNCTION_P (decl1) || DECL_XOBJ_MEMBER_FUNC_P (decl1)))
> ```
> 
> My solution was to just not enter that block for xobj member functions,
> but I'm realizing now that ctype doesn't seem to be set for static
> member functions so setting ctype for xobj member functions might be
> incorrect. I'll need to investigate it closer, I recall seeing the
> second block above and thinking "it's checking to make sure decl1 is
> not a static member function before entering this block, so that means
> it must be set." It seems that was incorrect.
> 
> At bare minimum, I'll have to look at this again.

Yeah, that could use some refactoring.  IMO ctype should be set iff we 
want to push_nested_class, and the code for setting up 'this' should 
check for METHOD_TYPE instead of referring to ctype.

>>> I had wanted to write about some of my frustrations with trying to
>>> write a test for virtual specifiers and errors/warnings for
>>> shadowing/overloading virtual functions, but I am a bit too tired at
>>> the moment and I don't want to delay getting this up for another night.
>>> In short, the standard does not properly specify the criteria for
>>> overriding functions, which leaves a lot of ambiguity in how exactly we
>>> should be handling these cases.
>>
>> Agreed, this issue came up in the C++ committee meeting today. See
>>
>> https://cplusplus.github.io/CWG/issues/2553.html
>> https://cplusplus.github.io/CWG/issues/2554.html
>>
>> for draft changes to clarify some of these issues.
> 
> Ah I guess I should have read all your responses before sending any
> back instead of going one by one. I ended up writing about this again I
> think.
> 
> struct B {
>     virtual void f() const {}
> };
> 
> struct S0 : B {
>     void f() {}
> };
> 
> struct S1 : B {
>     void f(this S1&) {}
> };
> 
> I had a bit of a debate with a peer, I initially disagreed with the
> standard resolution because it disallows S1's declaration of f, while
> S0's is an overload. I won't bore you with all the details of going
> back and forth about the proposed wording, my feelings are mostly that
> being able to overload something with an iobj member function but not
> being able to with an xobj member function was inconsistent. He argued
> that keeping restrictions high at first and lowering them later is
> better, and overloading virtual functions is already not something you
> should really ever do, so he was in favor of the proposed wording.

Right.  As the author of this proposal said in discussion, "We want very 
liberal overriding for explicit object member functions so that it'll 
blow up."

> In light of our debate, my stance is that we should implement things as
> per the proposed wording. 
> 
> struct B {
>   virtual void foo() const&;
> };
> 
> struct D : B {
>   void foo(this int);
> };
> 
> This would be ill-formed now according to the change in wording. My
> laymans interpretation of the semantics are that, the parameter list is
> the same when the xobj parameter is ignore, so it overrides. And since
> it overrides, and xobj member functions are not allowed to override, it
> is an error.

Yes.

> To be honest, I still don't understand where the wording accounts for
> the qualifiers of B::f, but my friend assured me that it is.

The qualifiers of B::f are part of the object parameter which we're 
ignoring because the derived function is xobj.

> Somewhat related is some warnings I wanted to implement for impossible
> to call by-value xobj member functions. Ones that involve an unrelated
> type get dicey because people could in theory have legitimate use cases
> for that, P0847R7 includes an example of this combining recursive
> lambdas and the overload pattern to create a visitor. However, I think
> it would be reasonable to warn when a by-value xobj member function can
> not be called due to the presence of overloads that take references.
> Off the top of my head I don't recall how many overloads it would take
> to prevent a by-value version from being called, nor have I looked into
> how to implement this.

Hmm, is it possible to make it un-callable?  A function taking A and a 
function taking A&& are ambiguous when called with an rvalue argument, 
the by-value overload isn't hidden.  Similarly, A and A& are ambiguous 
when called with a cv-unqualified lvalue.

> But I do know that redeclarations of xobj member
> functions as iobj member functions (and vice-versa) are not properly
> identified when ref qualifiers are omitted from the corresponding (is
> this the correct term here?) iobj member function.

That would be the code in add_method discussed later in that message.

>>> + /* If */
>>
>> I think this comment doesn't add much. :)
> 
> I wish I remembered what I even meant to put here, oh well, must not
> have been all that important. Oh wait, I remember it now, I was going
> to note that the next element in the chain of declarators being null
> seems to indicate that the declaration is a function type. I will
> probably put that comment in again, it's on the line of commenting
> exactly what the code does but it was cryptic to figure this out, I'm
> not sure if I should lean towards including the comment here or not.

I always lean towards including comments. :)

>>> + /* I can imagine doing a fixit here, suggesting replacing
>>> + this / *this / this-> with &name / name / "name." but it would be
>>> + very difficult to get it perfect and I've been advised against
>>> + making imperfect fixits.
>>> + Perhaps it would be as simple as the replacements listed,
>>> + even if one is move'ing/forward'ing, if the replacement is just
>>> + done in the same place, it will be exactly what the user wants?
>>> + Even if this is true though, there's still a problem of getting the
>>> + context of the expression to find which tokens to replace.
>>> + I would really like for this to be possible though.
>>> + I will decide whether or not to persue this after review. */
>>
>> You could pass the location of 'this' from the parser to
>> finish_this_expr and always replace it with (&name)?
> 
> I've been reluctant to make any changes to parameters, but if you're
> suggesting it I will consider it. Just replacing it with '(&name)'
> seems really mediocre to me though. Yeah, sure, it's always going to be
> syntactically valid but I worry that it wont be semantically correct
> often enough.
> 
> Yeah, considering that I think I will leave it for now until I have
> time to come up with a robust solution. I might still pass in the
> location of the this token though and use error_at instead of error.
> I'll see how I feel once I finish higher priority stuff.

Makes sense.

> To be clear, I should be replacing { dg-options "-Wno-error=pedantic" }
> with { dg-options "" } and you would prefer I just remove { dg-options
> "-pedantic-errors" } yeah?

Yes.

Jason
waffl3x Nov. 11, 2023, 10:43 a.m. UTC | #32
> [combined reply to all three threads]
> 
> On 11/9/23 23:24, waffl3x wrote:
> 
> > > > I'm unfortunately going down a rabbit hole again.
> > > > 
> > > > --function.h:608
> > > > `/* If pointers to member functions use the least significant bit to indicate whether a function is virtual, ensure a pointer to this function will have that bit clear. */ #define MINIMUM_METHOD_BOUNDARY \\\\ ((TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn) \\\\ ? MAX (FUNCTION_BOUNDARY, 2 * BITS_PER_UNIT) : FUNCTION_BOUNDARY)`
> > > 
> > > So yes, it was for PMFs using the low bit of the pointer to indicate a
> > > virtual member function. Since an xob memfn can't be virtual, it's
> > > correct for them to have the same alignment as a static memfn.
> > 
> > Is it worth considering whether we want to support virtual xobj member
> > functions in the future? If that were the case would it be better if we
> > aligned things a little differently here? Or might it be better if we
> > wanted to support it as an extension to just effectively translate the
> > declaration back to one that is a METHOD_TYPE? I imagine this would be
> > the best solution for non-standard support of the syntax. We would
> > simply have to forbid by-value and conversion semantics and on the
> > user's side they would get consistent syntax.
> > 
> > However, this flies in the face of the defective/contradictory spec for
> > virtual function overrides. So I'm not really sure whether we would
> > want to do this. I just want to raise the question before we lock in
> > the alignment, if pushing the patch locks it in that is, I'm not really
> > sure if it needs to be stable or not.
> 
> 
> It doesn't need to be stable; we can increase the alignment of decls as
> needed in new code without breaking older code.

Okay great, good to know, it seems so obvious when you put it that way.

> > > > All tests seemed to pass when applied to GCC14, but the results did
> > > > something funny where it said tests disappeared and new tests appeared
> > > > and passed. The ones that disappeared and the new ones that appeared
> > > > looked like they were identical so I'm not worrying about it. Just
> > > > mentioning it in case this is something I do need to look into.
> > > 
> > > That doesn't sound like a problem, but I'm curious about the specific
> > > output you're seeing.
> > 
> > I've attached a few test result comparisons so you can take a look.
> 
> 
> Looks like you're comparing results from different build directories and
> the libitm test wrongly includes the build directory in the test "name".
> So yeah, just noise.

AH okay that makes sense.

> > Side note, would you prefer I compile the lambda and by-value fixes
> > into a new version of this patch? Or as a separate patch? Originally I
> > had planned to put it in another patch, but I identified that the code
> > I wrote in build_over_call was kind of fundamentally broken and it was
> > almost merely coincidence that it worked at all. In light of this and
> > your comments (which I've skimmed, I will respond directly below) I
> > think I should just revise this patch with everything else.
> 
> 
> Agreed.

Will do then.

> > > > There are a few known issues still present in this patch. Most importantly,
> > > > the implicit object argument fails to convert when passed to by-value xobj
> > > > parameters. This occurs both for xobj parameters that match the argument type
> > > > and xobj parameters that are unrelated to the object type, but have valid
> > > > conversions available. This behavior can be observed in the
> > > > explicit-obj-by-value[1-3].C tests. The implicit object argument appears to be
> > > > simply reinterpreted instead of any conversion applied. This is elaborated on
> > > > in the test cases.
> > > 
> > > Yes, that's because of:
> > > 
> > > > @@ -9949,7 +9951,8 @@ build_over_call (struct z_candidate cand, int flags, tsubst_flags_t complain)
> > > > }
> > > > }
> > > > / Bypass access control for 'this' parameter. */
> > > > - else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
> > > > + else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
> > > > + || DECL_XOBJ_MEMBER_FUNC_P (fn))
> > > 
> > > We don't want to take this path for xob fns. Instead I think we need to
> > > change the existing:
> > > 
> > > > gcc_assert (first_arg == NULL_TREE);
> > > 
> > > to assert that if first_arg is non-null, we're dealing with an xob fn,
> > > and then go ahead and do the same conversion as the loop body on first_arg.
> > > 
> > > > Despite this, calls where there is no valid conversion
> > > > available are correctly rejected, which I find surprising. The
> > > > explicit-obj-by-value4.C testcase demonstrates this odd but correct behavior.
> > > 
> > > Yes, because checking for conversions is handled elsewhere.
> > 
> > Yeah, as I noted above I realized that just handling it the same way as
> > iobj member functions is fundamentally broken. I was staring at it last
> > night and eventually realized that I could just copy the loop body. I
> > ended up asserting in the body handling the implicit object argument
> > for xobj member functions that first_arg != NULL_TREE, which I wasn't
> > sure of, but it seems to work.
> 
> 
> That sounds like it might cause trouble with
> 
> struct A {
> void f(this A);
> };
> 
> int main()
> {
> (&A::f) (A());
> }

I will check to see what the behavior with this is. This sounds related
to the next question I asked as well.

> > I tried asking in IRC if there are any circumstances where first_arg
> > would be null for a non-static member function and I didn't get an
> > answer. The code above seemed to indicate that it could be. It just
> > looks like old code that is no longer valid and never got removed.
> > Consequently this function has made it on my list of things to refactor
> > :^).
> 
> 
> Right, first_arg is only actually used for the implicit object argument,
> it's just easier to store it separately from the arguments in (). I'm
> not sure which code you mean is no longer valid?

Yeah I agree that it's easier to store it separately.

-- call.cc:build_over_call
```
  else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
    {
      tree arg = build_this (first_arg != NULL_TREE
                             ? first_arg
                             : (*args)[arg_index]);
```

The trouble is, the code (shown above) does not assume that this holds
true. It handles the case where the implicit object argument was passed
in with the rest of the arguments. As far as I've observed, it seems
like it's always passed in through the first_arg member of cand, which
is what I was referring to here.

> > ended up asserting in the body handling the implicit object argument
> > for xobj member functions that first_arg != NULL_TREE, which I wasn't
> > sure of, but it seems to work.

Since it wasn't clear what I was referring to, here is the code that I
wrote (copied from the loop really) handling the case. In case it isn't
obvious, I didn't snip the code in the METHOD_TYPE block, it's just
snipped here as it's not code I've modified. I'm hopeful that the case
you mentioned above is not problematic, but like I said I will be sure
to test it.

-- call.cc:build_over_call
```
  else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
    {
      /* SNIP */
    }
  else if (DECL_XOBJ_MEMBER_FUNC_P (fn))
    {
      gcc_assert (cand->first_arg);
      gcc_assert (cand->num_convs > 0);
      tree type = TREE_VALUE (parm);
      tree arg = cand->first_arg;
      bool conversion_warning = true;

      conv = convs[0];

      /* Set user_conv_p on the argument conversions, so rvalue/base handling
         knows not to allow any more UDCs.  This needs to happen after we
         process cand->warnings.  */
      if (flags & LOOKUP_NO_CONVERSION)
        conv->user_conv_p = true;

      tsubst_flags_t arg_complain = complain;
      if (!conversion_warning)
        arg_complain &= ~tf_warning;

      if (arg_complain & tf_warning)
        maybe_warn_pessimizing_move (arg, type, /*return_p*/false);

      val = convert_like_with_context (conv, arg, fn, 0,
                                       arg_complain);
      val = convert_for_arg_passing (type, val, arg_complain);


      if (val == error_mark_node)
        return error_mark_node;
      else
        argarray[j++] = val;
      /* Advance parameter chain, don't advance arg index */
      parm = TREE_CHAIN (parm);
      // ++arg_index;
      ++i;
      is_method = 1;
      first_arg = NULL_TREE;
    }
```

Anyway, I hope we can assume that first_argument is populated for all
non-static member function calls, and if we can't perhaps that is
something we can change. The assumptions we can make are unclear with
how the code handling METHOD_TYPE is right now. If we make it a
guarantee that first_argument is populated for non-static member
function calls that code can be vastly simplified.

> > > > Other than this, lambdas are not yet supported,
> > > 
> > > The error I'm seeing with the lambda testcase is "explicit object member
> > > function cannot have cv-qualifier". To avoid that, in
> > > cp_parser_lambda_declarator_opt you need to set quals to
> > > TYPE_UNQUALIFIED around where we do that for mutable lambdas.
> > 
> > Yeah, this ended up being the case as suspected, and it was exactly in
> > cp_parser_lambda_declarator_opt that it needed to be done. There's an
> > additional quirk in start_preparsed_function, which I already rambled
> > about a little in a previous reply on this chain, that suddenly
> > restored setting up the 'this' pointer. Previously, it was being
> > skipped because ctype was not being set for xobj member functions.
> > However, ctype not being set was causing the scope to not be set up
> > correctly for lambdas. In truth I don't remember exactly how the
> > problem was presenting but I do know how I fixed it.
> > 
> > `tree fntype = TREE_TYPE (decl1); if (TREE_CODE (fntype) == METHOD_TYPE) ctype = TYPE_METHOD_BASETYPE (fntype); else if (DECL_XOBJ_MEMBER_FUNC_P (decl1)) ctype = DECL_CONTEXT (decl1);`
> > 
> > In hindsight though, I have to question if this is the correct way of
> > dealing with it. As I mentioned previously, there's an additional quirk
> > in start_preparsed_function where it sets up the 'this' param, or at
> > least, it kind of looks like it? This is where my rambling was about.
> > 
> > `/* We don't need deal with 'this' or vtable for xobj member functions. */ if (ctype && !doing_friend && !(DECL_STATIC_FUNCTION_P (decl1) || DECL_XOBJ_MEMBER_FUNC_P (decl1)))`
> > 
> > My solution was to just not enter that block for xobj member functions,
> > but I'm realizing now that ctype doesn't seem to be set for static
> > member functions so setting ctype for xobj member functions might be
> > incorrect. I'll need to investigate it closer, I recall seeing the
> > second block above and thinking "it's checking to make sure decl1 is
> > not a static member function before entering this block, so that means
> > it must be set." It seems that was incorrect.
> > 
> > At bare minimum, I'll have to look at this again.
> 
> 
> Yeah, that could use some refactoring. IMO ctype should be set iff we
> want to push_nested_class, and the code for setting up 'this' should
> check for METHOD_TYPE instead of referring to ctype.

I'll put it on the todo list for after this patch is done. I'll note
that it seemed like not doing push_nested_class seemed to work just
fine for xobj member functions, and only caused problems once I was
working on lambdas. For this patch, I think I will change the condition
for setting up 'this' instead. On the other hand, are we sure that
lambdas don't assume 'this' was setup when accessing their captures? I
will investigate this briefly but now that I've thought of that, maybe
I won't mess around with this after all. On the other other hand, I'm
pretty sure captures still work for lambdas even though that block
isn't being handled now... or maybe I haven't tested them.

Actually, I think it doesn't matter if lambdas do use that or not for
their captures, because handling lambdas with an xobj parameter
correctly will still involve changing how lambdas access their
captures. PR102609 has a reply providing a test case when calling a
lambdas call operator where the first argument is a type unrelated to
the lambda's closure type. This might actually be relevant because I
can think of a case where one could conditionally (if constexpr) access
captures based on the type of the xobj parameter. THEN AGAIN, I don't
think it's possible to pass an unrelated implicit object parameter to a
lambda's call operator. (Not without explicitly calling it by function
pointer of course.) There are types that are derived from the lambda,
but those are still related so I assume that captures should be able to
be accessed in such a case.

Man, that raises even more questions for myself. If we are supposed to
implicitly use the explicit object parameter for accessing captures...

([PR102609] Gašper Ažman from comment #17)
> When a lambda has captures, the explicit object parameter is used to get at
> them *silently*, if they are related, otherwise the program is ill-formed:

...then what happens when the type of the explicit object parameter is
a class derived from the lambda, and the derived class has a member
variable with the same name as a capture we are trying to access?

I'm going to sit on this for a while and wait for feedback, I'm not
planning on working on this section today anyway.

> > > > I had wanted to write about some of my frustrations with trying to
> > > > write a test for virtual specifiers and errors/warnings for
> > > > shadowing/overloading virtual functions, but I am a bit too tired at
> > > > the moment and I don't want to delay getting this up for another night.
> > > > In short, the standard does not properly specify the criteria for
> > > > overriding functions, which leaves a lot of ambiguity in how exactly we
> > > > should be handling these cases.
> > > 
> > > Agreed, this issue came up in the C++ committee meeting today. See
> > > 
> > > https://cplusplus.github.io/CWG/issues/2553.html
> > > https://cplusplus.github.io/CWG/issues/2554.html
> > > 
> > > for draft changes to clarify some of these issues.
> > 
> > Ah I guess I should have read all your responses before sending any
> > back instead of going one by one. I ended up writing about this again I
> > think.
> > 
> > struct B {
> > virtual void f() const {}
> > };
> > 
> > struct S0 : B {
> > void f() {}
> > };
> > 
> > struct S1 : B {
> > void f(this S1&) {}
> > };
> > 
> > I had a bit of a debate with a peer, I initially disagreed with the
> > standard resolution because it disallows S1's declaration of f, while
> > S0's is an overload. I won't bore you with all the details of going
> > back and forth about the proposed wording, my feelings are mostly that
> > being able to overload something with an iobj member function but not
> > being able to with an xobj member function was inconsistent. He argued
> > that keeping restrictions high at first and lowering them later is
> > better, and overloading virtual functions is already not something you
> > should really ever do, so he was in favor of the proposed wording.
> 
> 
> Right. As the author of this proposal said in discussion, "We want very
> liberal overriding for explicit object member functions so that it'll
> blow up."

Yeah okay that makes sense, that was definitely what it felt like my
friend was arguing, and I've come to agree with it.

> > In light of our debate, my stance is that we should implement things as
> > per the proposed wording.
> > 
> > struct B {
> > virtual void foo() const&;
> > };
> > 
> > struct D : B {
> > void foo(this int);
> > };
> > 
> > This would be ill-formed now according to the change in wording. My
> > laymans interpretation of the semantics are that, the parameter list is
> > the same when the xobj parameter is ignore, so it overrides. And since
> > it overrides, and xobj member functions are not allowed to override, it
> > is an error.
> 
> 
> Yes.
> 
> > To be honest, I still don't understand where the wording accounts for
> > the qualifiers of B::f, but my friend assured me that it is.
> 
> 
> The qualifiers of B::f are part of the object parameter which we're
> ignoring because the derived function is xobj.

OKAY I get it now. Correspondence of the declarations involves the type
of the object parameter, while whether something overrides or not
doesn't take the object parameter into account, and it only takes the
ref qualifiers of the function into account for iobj member functions.
While for xobj member functions it ignores ref qualifiers. Alright, got
it.

> > Somewhat related is some warnings I wanted to implement for impossible
> > to call by-value xobj member functions. Ones that involve an unrelated
> > type get dicey because people could in theory have legitimate use cases
> > for that, P0847R7 includes an example of this combining recursive
> > lambdas and the overload pattern to create a visitor. However, I think
> > it would be reasonable to warn when a by-value xobj member function can
> > not be called due to the presence of overloads that take references.
> > Off the top of my head I don't recall how many overloads it would take
> > to prevent a by-value version from being called, nor have I looked into
> > how to implement this.
> 
> 
> Hmm, is it possible to make it un-callable? A function taking A and a
> function taking A&& are ambiguous when called with an rvalue argument,
> the by-value overload isn't hidden. Similarly, A and A& are ambiguous
> when called with a cv-unqualified lvalue.

I believe so, consider the following:

struct S {
  void f(this S) {}
  void f(this S const&) {}
  void f(this S&&) {}
};

I'm not 100% what the rules on this are, if I were to hazard a guess I
bet that the reference overloads are taken because they have one less
conversion?

To be clear, I recognize that you could still select the by-value
candidate by assigning it to a pointer to function type. I just don't
really think this is a useful or typical case, while accidentally
preventing a by-value xobj member function from being callable seems
like something that could easily happen, especially if we had the
following.

struct S {
  template<typename Self>
  void f(this Self&&) {}
  /* Our class is small enough, optimize const calls here.  */
  void f(this S) {}
}

I could see this being a typical enough refactor that I would want a
warning to alert me that my function that I'm writing for optimization
purposes won't ever be called in the cases I intend it to be called in.

Granted, I fully admit that I don't fully understand the semantics with
overload resolution here so my concerns might be entirely unfounded.
This is especially true in the second case since I'm not sure if the
deduced case actually will eat up all the calls of the member function
or not.

> > But I do know that redeclarations of xobj member
> > functions as iobj member functions (and vice-versa) are not properly
> > identified when ref qualifiers are omitted from the corresponding (is
> > this the correct term here?) iobj member function.
> 
> 
> That would be the code in add_method discussed later in that message.

Yeah, which I should get to before we hit our deadline.

> > > > + /* If */
> > > 
> > > I think this comment doesn't add much. :)
> > 
> > I wish I remembered what I even meant to put here, oh well, must not
> > have been all that important. Oh wait, I remember it now, I was going
> > to note that the next element in the chain of declarators being null
> > seems to indicate that the declaration is a function type. I will
> > probably put that comment in again, it's on the line of commenting
> > exactly what the code does but it was cryptic to figure this out, I'm
> > not sure if I should lean towards including the comment here or not.
> 
> 
> I always lean towards including comments. :)

I personally prefer to write code that doesn't need comments, but in a
code base like this we don't really have that luxury. It would take
massive amounts of refactoring to communicate assumptions through code
instead of through comments.

While I would like to move towards this, for now I will also lean
towards including comments, and in this particular case I'll make sure
to include what I wrote in the comment.

> > > > + /* I can imagine doing a fixit here, suggesting replacing
> > > > + this / *this / this-> with &name / name / "name." but it would be
> > > > + very difficult to get it perfect and I've been advised against
> > > > + making imperfect fixits.
> > > > + Perhaps it would be as simple as the replacements listed,
> > > > + even if one is move'ing/forward'ing, if the replacement is just
> > > > + done in the same place, it will be exactly what the user wants?
> > > > + Even if this is true though, there's still a problem of getting the
> > > > + context of the expression to find which tokens to replace.
> > > > + I would really like for this to be possible though.
> > > > + I will decide whether or not to persue this after review. */
> > > 
> > > You could pass the location of 'this' from the parser to
> > > finish_this_expr and always replace it with (&name)?
> > 
> > I've been reluctant to make any changes to parameters, but if you're
> > suggesting it I will consider it. Just replacing it with '(&name)'
> > seems really mediocre to me though. Yeah, sure, it's always going to be
> > syntactically valid but I worry that it wont be semantically correct
> > often enough.
> > 
> > Yeah, considering that I think I will leave it for now until I have
> > time to come up with a robust solution. I might still pass in the
> > location of the this token though and use error_at instead of error.
> > I'll see how I feel once I finish higher priority stuff.
> 
> 
> Makes sense.
> 
> > To be clear, I should be replacing { dg-options "-Wno-error=pedantic" }
> > with { dg-options "" } and you would prefer I just remove { dg-options
> > "-pedantic-errors" } yeah?
> 
> 
> Yes.

Sure thing then.

I'll try to get a new version of the patch out ASAP, I'm not sure I'll
include everything we talked about right away but I'll stuff in as much
as I can. I'll focus on the missing semantics before the more
complicated diagnostics changes. I think detection of redeclarations,
the correct overriding behavior, and the additional warnings I have
been discussing, will come later.

I think I'm also going to leave investigation of requires expressions
in non-dependent contexts that I had mentioned for another time as
well. The tests related to those have been fixed and work as expected
right now, and I think diving in and trying to figure out whether I can
do those tests with non-dependent operands would be a bad use of my
time.

Of course I will make sure all the small changes you requested make it
in, that might be the first thing I work on.

Alex
waffl3x Nov. 11, 2023, 11:24 a.m. UTC | #33
Just a quick addition here as I was starting to work on things I
realized where some misunderstandings were coming from. (Please also
see my previous e-mail, it is all still relevant, I just wanted to
clarify this.)

(From the other thread)
> > @@ -9949,7 +9951,8 @@ build_over_call (struct z_candidate cand, int flags, tsubst_flags_t complain)
> > }
> > }
> > / Bypass access control for 'this' parameter. */
> > - else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
> > + else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
> > + || DECL_XOBJ_MEMBER_FUNC_P (fn))
> 
> 
> We don't want to take this path for xob fns. Instead I think we need to
> change the existing:
>
> > gcc_assert (first_arg == NULL_TREE);
> 
> 
> to assert that if first_arg is non-null, we're dealing with an xob fn,
> and then go ahead and do the same conversion as the loop body on first_arg.

(This thread)
> > Yeah, as I noted above I realized that just handling it the same way as
> > iobj member functions is fundamentally broken. I was staring at it last
> > night and eventually realized that I could just copy the loop body. I
> > ended up asserting in the body handling the implicit object argument
> > for xobj member functions that first_arg != NULL_TREE, which I wasn't
> > sure of, but it seems to work.
> 
> 
> That sounds like it might cause trouble with
> 
> struct A {
> void f(this A);
> };
> 
> int main()
> {
> (&A::f) (A());
> }
> 

Upon reviewing your reply in the other thread, I noticed you are
maybe misunderstanding the assertion a little bit.
```
  else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
    {
      /* SNIP */
      if (first_arg != NULL_TREE)
        first_arg = NULL_TREE;
      else
        ++arg_index;
      ++i;
      is_method = 1;
    }
  else if (DECL_XOBJ_MEMBER_FUNC_P (fn))
    {
      gcc_assert (cand->first_arg);
      /* SNIP */
      first_arg = NULL_TREE;
    }
```
I already showed my code here but I didn't think to include the
previous conditional block to demonstrate how the first_arg variable
gets handled in it. Although, maybe you understood they set it to
NULL_TREE by the end of the block and I just poorly communicated that I
would be doing the same.

Perhaps you think it better to handle the implicit object argument
within the loop, but given that it is stored in first_arg I don't think
that would be correct. Granted, as I previously said, maybe there are
some situations where it isn't that I haven't encountered yet. The
other thing is there is some handling in the loop that isn't relevant
to the implicit object argument. Well, I would also just argue this
function needs a fair degree of refactoring, but I've already talked
about that.

Anyway, hopefully that clarifies things, and hopefully things actually
needed to be clarified and I'm not being a fool here. :) That's all I
had to add about this, back to work now.

Alex
Jason Merrill Nov. 14, 2023, 3:48 a.m. UTC | #34
On 11/11/23 05:43, waffl3x wrote:
>> [combined reply to all three threads]
>>
>> On 11/9/23 23:24, waffl3x wrote:
>>
>>>>> There are a few known issues still present in this patch. Most importantly,
>>>>> the implicit object argument fails to convert when passed to by-value xobj
>>>>> parameters. This occurs both for xobj parameters that match the argument type
>>>>> and xobj parameters that are unrelated to the object type, but have valid
>>>>> conversions available. This behavior can be observed in the
>>>>> explicit-obj-by-value[1-3].C tests. The implicit object argument appears to be
>>>>> simply reinterpreted instead of any conversion applied. This is elaborated on
>>>>> in the test cases.
>>>>
>>>> Yes, that's because of:
>>>>
>>>>> @@ -9949,7 +9951,8 @@ build_over_call (struct z_candidate cand, int flags, tsubst_flags_t complain)
>>>>> }
>>>>> }
>>>>> / Bypass access control for 'this' parameter. */
>>>>> - else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
>>>>> + else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
>>>>> + || DECL_XOBJ_MEMBER_FUNC_P (fn))
>>>>
>>>> We don't want to take this path for xob fns. Instead I think we need to
>>>> change the existing:
>>>>
>>>>> gcc_assert (first_arg == NULL_TREE);
>>>>
>>>> to assert that if first_arg is non-null, we're dealing with an xob fn,
>>>> and then go ahead and do the same conversion as the loop body on first_arg.
>>>>
>>>>> Despite this, calls where there is no valid conversion
>>>>> available are correctly rejected, which I find surprising. The
>>>>> explicit-obj-by-value4.C testcase demonstrates this odd but correct behavior.
>>>>
>>>> Yes, because checking for conversions is handled elsewhere.
>>>
>>> Yeah, as I noted above I realized that just handling it the same way as
>>> iobj member functions is fundamentally broken. I was staring at it last
>>> night and eventually realized that I could just copy the loop body. I
>>> ended up asserting in the body handling the implicit object argument
>>> for xobj member functions that first_arg != NULL_TREE, which I wasn't
>>> sure of, but it seems to work.
>>
>>
>> That sounds like it might cause trouble with
>>
>> struct A {
>> void f(this A);
>> };
>>
>> int main()
>> {
>> (&A::f) (A());
>> }
> 
> I will check to see what the behavior with this is. This sounds related
> to the next question I asked as well.
> 
>>> I tried asking in IRC if there are any circumstances where first_arg
>>> would be null for a non-static member function and I didn't get an
>>> answer. The code above seemed to indicate that it could be. It just
>>> looks like old code that is no longer valid and never got removed.
>>> Consequently this function has made it on my list of things to refactor
>>> :^).
>>
>>
>> Right, first_arg is only actually used for the implicit object argument,
>> it's just easier to store it separately from the arguments in (). I'm
>> not sure which code you mean is no longer valid?
> 
> Yeah I agree that it's easier to store it separately.
> 
> -- call.cc:build_over_call
> ```
>    else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
>      {
>        tree arg = build_this (first_arg != NULL_TREE
>                               ? first_arg
>                               : (*args)[arg_index]);
> ```
> 
> The trouble is, the code (shown above) does not assume that this holds
> true. It handles the case where the implicit object argument was passed
> in with the rest of the arguments. As far as I've observed, it seems
> like it's always passed in through the first_arg member of cand, which
> is what I was referring to here.
> 
>>> ended up asserting in the body handling the implicit object argument
>>> for xobj member functions that first_arg != NULL_TREE, which I wasn't
>>> sure of, but it seems to work.
> 
> Since it wasn't clear what I was referring to, here is the code that I
> wrote (copied from the loop really) handling the case. In case it isn't
> obvious, I didn't snip the code in the METHOD_TYPE block, it's just
> snipped here as it's not code I've modified. I'm hopeful that the case
> you mentioned above is not problematic, but like I said I will be sure
> to test it.
> 
> -- call.cc:build_over_call
> ```
>    else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
>      {
>       /* SNIP */
>       if (first_arg != NULL_TREE)
>         first_arg = NULL_TREE;
>       else
>         ++arg_index;
>       ++i;
>       is_method = 1;
>      }
>    else if (DECL_XOBJ_MEMBER_FUNC_P (fn))
>      {
>        gcc_assert (cand->first_arg);
>        gcc_assert (cand->num_convs > 0);
>        tree type = TREE_VALUE (parm);
>        tree arg = cand->first_arg;
>        bool conversion_warning = true;
> 
>        conv = convs[0];
> 
>        /* Set user_conv_p on the argument conversions, so rvalue/base handling
>           knows not to allow any more UDCs.  This needs to happen after we
>           process cand->warnings.  */
>        if (flags & LOOKUP_NO_CONVERSION)
>          conv->user_conv_p = true;
> 
>        tsubst_flags_t arg_complain = complain;
>        if (!conversion_warning)
>          arg_complain &= ~tf_warning;
> 
>        if (arg_complain & tf_warning)
>          maybe_warn_pessimizing_move (arg, type, /*return_p*/false);
> 
>        val = convert_like_with_context (conv, arg, fn, 0,
>                                         arg_complain);
>        val = convert_for_arg_passing (type, val, arg_complain);
> 
> 
>        if (val == error_mark_node)
>          return error_mark_node;
>        else
>          argarray[j++] = val;
>        /* Advance parameter chain, don't advance arg index */
>        parm = TREE_CHAIN (parm);
>        // ++arg_index;
>        ++i;
>        is_method = 1;
>        first_arg = NULL_TREE;
>      }
> ```
> 
> Anyway, I hope we can assume that first_argument is populated for all
> non-static member function calls, and if we can't perhaps that is
> something we can change. The assumptions we can make are unclear with
> how the code handling METHOD_TYPE is right now. If we make it a
> guarantee that first_argument is populated for non-static member
> function calls that code can be vastly simplified.

Somewhat, but the current METHOD_TYPE code works fine, so let's not mess 
with it.  There might be corner cases that end up with the object 
argument in the args vec.

[other reply]

> I already showed my code here but I didn't think to include the
> previous conditional block to demonstrate how the first_arg variable
> gets handled in it. Although, maybe you understood they set it to
> NULL_TREE by the end of the block and I just poorly communicated that I
> would be doing the same.

I meant that I expect the

> gcc_assert (cand->first_arg);

to fail on my example.

> Perhaps you think it better to handle the implicit object argument
> within the loop, but given that it is stored in first_arg I don't think
> that would be correct. Granted, as I previously said, maybe there are
> some situations where it isn't that I haven't encountered yet. The
> other thing is there is some handling in the loop that isn't relevant
> to the implicit object argument. Well, I would also just argue this
> function needs a fair degree of refactoring, but I've already talked
> about that.

Factoring out the loop body (into a lambda?) rather than copying it 
would make sense to me.

>>>>> Other than this, lambdas are not yet supported,
>>>>
>>>> The error I'm seeing with the lambda testcase is "explicit object member
>>>> function cannot have cv-qualifier". To avoid that, in
>>>> cp_parser_lambda_declarator_opt you need to set quals to
>>>> TYPE_UNQUALIFIED around where we do that for mutable lambdas.
>>>
>>> Yeah, this ended up being the case as suspected, and it was exactly in
>>> cp_parser_lambda_declarator_opt that it needed to be done. There's an
>>> additional quirk in start_preparsed_function, which I already rambled
>>> about a little in a previous reply on this chain, that suddenly
>>> restored setting up the 'this' pointer. Previously, it was being
>>> skipped because ctype was not being set for xobj member functions.
>>> However, ctype not being set was causing the scope to not be set up
>>> correctly for lambdas. In truth I don't remember exactly how the
>>> problem was presenting but I do know how I fixed it.
>>>
>>> `tree fntype = TREE_TYPE (decl1); if (TREE_CODE (fntype) == METHOD_TYPE) ctype = TYPE_METHOD_BASETYPE (fntype); else if (DECL_XOBJ_MEMBER_FUNC_P (decl1)) ctype = DECL_CONTEXT (decl1);`
>>>
>>> In hindsight though, I have to question if this is the correct way of
>>> dealing with it. As I mentioned previously, there's an additional quirk
>>> in start_preparsed_function where it sets up the 'this' param, or at
>>> least, it kind of looks like it? This is where my rambling was about.
>>>
>>> `/* We don't need deal with 'this' or vtable for xobj member functions. */ if (ctype && !doing_friend && !(DECL_STATIC_FUNCTION_P (decl1) || DECL_XOBJ_MEMBER_FUNC_P (decl1)))`
>>>
>>> My solution was to just not enter that block for xobj member functions,
>>> but I'm realizing now that ctype doesn't seem to be set for static
>>> member functions so setting ctype for xobj member functions might be
>>> incorrect. I'll need to investigate it closer, I recall seeing the
>>> second block above and thinking "it's checking to make sure decl1 is
>>> not a static member function before entering this block, so that means
>>> it must be set." It seems that was incorrect.
>>>
>>> At bare minimum, I'll have to look at this again.
>>
>>
>> Yeah, that could use some refactoring. IMO ctype should be set iff we
>> want to push_nested_class, and the code for setting up 'this' should
>> check for METHOD_TYPE instead of referring to ctype.
> 
> I'll put it on the todo list for after this patch is done. I'll note
> that it seemed like not doing push_nested_class seemed to work just
> fine for xobj member functions, and only caused problems once I was
> working on lambdas.

Hmm, we want push_nested_class for any member function so that name 
lookup within the function finds class members.

> For this patch, I think I will change the condition
> for setting up 'this' instead. On the other hand, are we sure that
> lambdas don't assume 'this' was setup when accessing their captures? I
> will investigate this briefly but now that I've thought of that, maybe
> I won't mess around with this after all. On the other other hand, I'm
> pretty sure captures still work for lambdas even though that block
> isn't being handled now... or maybe I haven't tested them.

Captures don't rely on being able to name the object parameter, they use 
it directly without name lookup.  And indeed 'this' doesn't name the 
object parameter for the lambda op(), finish_this_expr looks past it.

> Actually, I think it doesn't matter if lambdas do use that or not for
> their captures, because handling lambdas with an xobj parameter
> correctly will still involve changing how lambdas access their
> captures. PR102609 has a reply providing a test case when calling a
> lambdas call operator where the first argument is a type unrelated to
> the lambda's closure type. This might actually be relevant because I
> can think of a case where one could conditionally (if constexpr) access
> captures based on the type of the xobj parameter. THEN AGAIN, I don't
> think it's possible to pass an unrelated implicit object parameter to a
> lambda's call operator. (Not without explicitly calling it by function
> pointer of course.) There are types that are derived from the lambda,
> but those are still related so I assume that captures should be able to
> be accessed in such a case.
> 
> Man, that raises even more questions for myself. If we are supposed to
> implicitly use the explicit object parameter for accessing captures...
> 
> ([PR102609] Gašper Ažman from comment #17)
>> When a lambda has captures, the explicit object parameter is used to get at
>> them *silently*, if they are related, otherwise the program is ill-formed:
> 
> ...then what happens when the type of the explicit object parameter is
> a class derived from the lambda, and the derived class has a member
> variable with the same name as a capture we are trying to access?

That ought to work fine; capture fields don't have the name of the 
variables they capture, they are only accessible through the capture 
proxy (from build_capture_proxy), which in turn refers directly to the 
member by its _DECL, not by name lookup.

>>> Somewhat related is some warnings I wanted to implement for impossible
>>> to call by-value xobj member functions. Ones that involve an unrelated
>>> type get dicey because people could in theory have legitimate use cases
>>> for that, P0847R7 includes an example of this combining recursive
>>> lambdas and the overload pattern to create a visitor. However, I think
>>> it would be reasonable to warn when a by-value xobj member function can
>>> not be called due to the presence of overloads that take references.
>>> Off the top of my head I don't recall how many overloads it would take
>>> to prevent a by-value version from being called, nor have I looked into
>>> how to implement this.
>>
>> Hmm, is it possible to make it un-callable? A function taking A and a
>> function taking A&& are ambiguous when called with an rvalue argument,
>> the by-value overload isn't hidden. Similarly, A and A& are ambiguous
>> when called with a cv-unqualified lvalue.
> 
> I believe so, consider the following:
> 
> struct S {
>    void f(this S) {}
>    void f(this S const&) {}
>    void f(this S&&) {}
> };
> 
> I'm not 100% what the rules on this are, if I were to hazard a guess I
> bet that the reference overloads are taken because they have one less
> conversion?

Initializing a by-value parameter isn't an extra conversion, it's an 
identity conversion, so ambiguous with the reference overloads, just as with

struct A { };
void f (A);
void f (const A&);
void f (A&&);

int main()
{
   A a;
   f (a);   // ambiguous
   f (A()); // ambiguous
}

> To be clear, I recognize that you could still select the by-value
> candidate by assigning it to a pointer to function type. I just don't
> really think this is a useful or typical case, while accidentally
> preventing a by-value xobj member function from being callable seems
> like something that could easily happen, especially if we had the
> following.
> 
> struct S {
>    template<typename Self>
>    void f(this Self&&) {}
>    /* Our class is small enough, optimize const calls here.  */
>    void f(this S) {}
> }
> 
> I could see this being a typical enough refactor that I would want a
> warning to alert me that my function that I'm writing for optimization
> purposes won't ever be called in the cases I intend it to be called in.
> 
> Granted, I fully admit that I don't fully understand the semantics with
> overload resolution here so my concerns might be entirely unfounded.
> This is especially true in the second case since I'm not sure if the
> deduced case actually will eat up all the calls of the member function
> or not.

Rather, in the deduced case the by-value overload always wins because 
overload resolution prefers a non-template if the conversions are 
equally good.

struct A { };
void f (A);
template <class T> void f (T&&);

int main()
{
   A a;
   f (a);   // calls non-template
   f (A()); // calls non-template
}

> I'll try to get a new version of the patch out ASAP, I'm not sure I'll
> include everything we talked about right away but I'll stuff in as much
> as I can. I'll focus on the missing semantics before the more
> complicated diagnostics changes. I think detection of redeclarations,
> the correct overriding behavior, and the additional warnings I have
> been discussing, will come later.

Makes sense.

> I think I'm also going to leave investigation of requires expressions
> in non-dependent contexts that I had mentioned for another time as
> well. The tests related to those have been fixed and work as expected
> right now, and I think diving in and trying to figure out whether I can
> do those tests with non-dependent operands would be a bad use of my
> time.

Agreed.

> Of course I will make sure all the small changes you requested make it
> in, that might be the first thing I work on.

Also note that revisions can slip into Stage 3, though preferably not 
too far.

Jason
waffl3x Nov. 14, 2023, 4:36 a.m. UTC | #35
On Monday, November 13th, 2023 at 8:48 PM, Jason Merrill <jason@redhat.com> wrote:


> 
> 
> On 11/11/23 05:43, waffl3x wrote:
> 
> > > [combined reply to all three threads]
> > > 
> > > On 11/9/23 23:24, waffl3x wrote:
> > > 
> > > > > > There are a few known issues still present in this patch. Most importantly,
> > > > > > the implicit object argument fails to convert when passed to by-value xobj
> > > > > > parameters. This occurs both for xobj parameters that match the argument type
> > > > > > and xobj parameters that are unrelated to the object type, but have valid
> > > > > > conversions available. This behavior can be observed in the
> > > > > > explicit-obj-by-value[1-3].C tests. The implicit object argument appears to be
> > > > > > simply reinterpreted instead of any conversion applied. This is elaborated on
> > > > > > in the test cases.
> > > > > 
> > > > > Yes, that's because of:
> > > > > 
> > > > > > @@ -9949,7 +9951,8 @@ build_over_call (struct z_candidate cand, int flags, tsubst_flags_t complain)
> > > > > > }
> > > > > > }
> > > > > > / Bypass access control for 'this' parameter. */
> > > > > > - else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
> > > > > > + else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
> > > > > > + || DECL_XOBJ_MEMBER_FUNC_P (fn))
> > > > > 
> > > > > We don't want to take this path for xob fns. Instead I think we need to
> > > > > change the existing:
> > > > > 
> > > > > > gcc_assert (first_arg == NULL_TREE);
> > > > > 
> > > > > to assert that if first_arg is non-null, we're dealing with an xob fn,
> > > > > and then go ahead and do the same conversion as the loop body on first_arg.
> > > > > 
> > > > > > Despite this, calls where there is no valid conversion
> > > > > > available are correctly rejected, which I find surprising. The
> > > > > > explicit-obj-by-value4.C testcase demonstrates this odd but correct behavior.
> > > > > 
> > > > > Yes, because checking for conversions is handled elsewhere.
> > > > 
> > > > Yeah, as I noted above I realized that just handling it the same way as
> > > > iobj member functions is fundamentally broken. I was staring at it last
> > > > night and eventually realized that I could just copy the loop body. I
> > > > ended up asserting in the body handling the implicit object argument
> > > > for xobj member functions that first_arg != NULL_TREE, which I wasn't
> > > > sure of, but it seems to work.
> > > 
> > > That sounds like it might cause trouble with
> > > 
> > > struct A {
> > > void f(this A);
> > > };
> > > 
> > > int main()
> > > {
> > > (&A::f) (A());
> > > }
> > 
> > I will check to see what the behavior with this is. This sounds related
> > to the next question I asked as well.
> > 
> > > > I tried asking in IRC if there are any circumstances where first_arg
> > > > would be null for a non-static member function and I didn't get an
> > > > answer. The code above seemed to indicate that it could be. It just
> > > > looks like old code that is no longer valid and never got removed.
> > > > Consequently this function has made it on my list of things to refactor
> > > > :^).
> > > 
> > > Right, first_arg is only actually used for the implicit object argument,
> > > it's just easier to store it separately from the arguments in (). I'm
> > > not sure which code you mean is no longer valid?
> > 
> > Yeah I agree that it's easier to store it separately.
> > 
> > -- call.cc:build_over_call
> > `else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE) { tree arg = build_this (first_arg != NULL_TREE ? first_arg : (*args)[arg_index]);`
> > 
> > The trouble is, the code (shown above) does not assume that this holds
> > true. It handles the case where the implicit object argument was passed
> > in with the rest of the arguments. As far as I've observed, it seems
> > like it's always passed in through the first_arg member of cand, which
> > is what I was referring to here.
> > 
> > > > ended up asserting in the body handling the implicit object argument
> > > > for xobj member functions that first_arg != NULL_TREE, which I wasn't
> > > > sure of, but it seems to work.
> > 
> > Since it wasn't clear what I was referring to, here is the code that I
> > wrote (copied from the loop really) handling the case. In case it isn't
> > obvious, I didn't snip the code in the METHOD_TYPE block, it's just
> > snipped here as it's not code I've modified. I'm hopeful that the case
> > you mentioned above is not problematic, but like I said I will be sure
> > to test it.
> > 
> > -- call.cc:build_over_call
> > ```
> > else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
> > {
> > /* SNIP */
> > if (first_arg != NULL_TREE)
> > first_arg = NULL_TREE;
> > else
> > ++arg_index;
> > ++i;
> > is_method = 1;
> > }
> > else if (DECL_XOBJ_MEMBER_FUNC_P (fn))
> > {
> > gcc_assert (cand->first_arg);
> > gcc_assert (cand->num_convs > 0);
> > tree type = TREE_VALUE (parm);
> > tree arg = cand->first_arg;
> > bool conversion_warning = true;
> > 
> > conv = convs[0];
> > 
> > /* Set user_conv_p on the argument conversions, so rvalue/base handling
> > knows not to allow any more UDCs. This needs to happen after we
> > process cand->warnings. */
> > if (flags & LOOKUP_NO_CONVERSION)
> > conv->user_conv_p = true;
> > 
> > tsubst_flags_t arg_complain = complain;
> > if (!conversion_warning)
> > arg_complain &= ~tf_warning;
> > 
> > if (arg_complain & tf_warning)
> > maybe_warn_pessimizing_move (arg, type, /return_p/false);
> > 
> > val = convert_like_with_context (conv, arg, fn, 0,
> > arg_complain);
> > val = convert_for_arg_passing (type, val, arg_complain);
> > 
> > if (val == error_mark_node)
> > return error_mark_node;
> > else
> > argarray[j++] = val;
> > /* Advance parameter chain, don't advance arg index */
> > parm = TREE_CHAIN (parm);
> > // ++arg_index;
> > ++i;
> > is_method = 1;
> > first_arg = NULL_TREE;
> > }
> > ```
> > 
> > Anyway, I hope we can assume that first_argument is populated for all
> > non-static member function calls, and if we can't perhaps that is
> > something we can change. The assumptions we can make are unclear with
> > how the code handling METHOD_TYPE is right now. If we make it a
> > guarantee that first_argument is populated for non-static member
> > function calls that code can be vastly simplified.
> 
> 
> Somewhat, but the current METHOD_TYPE code works fine, so let's not mess
> with it. There might be corner cases that end up with the object
> argument in the args vec.

Yeah I haven't messed with it, I fully agree that it's not something
that should be a part of this patch. I do want to thoroughly
investigate whether these corner cases exist in the future though.

> [other reply]
> 
> > I already showed my code here but I didn't think to include the
> > previous conditional block to demonstrate how the first_arg variable
> > gets handled in it. Although, maybe you understood they set it to
> > NULL_TREE by the end of the block and I just poorly communicated that I
> > would be doing the same.
> 
> 
> I meant that I expect the
> 
> > gcc_assert (cand->first_arg);
> 
> 
> to fail on my example.
> 
> > Perhaps you think it better to handle the implicit object argument
> > within the loop, but given that it is stored in first_arg I don't think
> > that would be correct. Granted, as I previously said, maybe there are
> > some situations where it isn't that I haven't encountered yet. The
> > other thing is there is some handling in the loop that isn't relevant
> > to the implicit object argument. Well, I would also just argue this
> > function needs a fair degree of refactoring, but I've already talked
> > about that.
> 
> 
> Factoring out the loop body (into a lambda?) rather than copying it
> would make sense to me.

Gladly, the duplication made me uneasy for sure.

> > > > > > Other than this, lambdas are not yet supported,
> > > > > 
> > > > > The error I'm seeing with the lambda testcase is "explicit object member
> > > > > function cannot have cv-qualifier". To avoid that, in
> > > > > cp_parser_lambda_declarator_opt you need to set quals to
> > > > > TYPE_UNQUALIFIED around where we do that for mutable lambdas.
> > > > 
> > > > Yeah, this ended up being the case as suspected, and it was exactly in
> > > > cp_parser_lambda_declarator_opt that it needed to be done. There's an
> > > > additional quirk in start_preparsed_function, which I already rambled
> > > > about a little in a previous reply on this chain, that suddenly
> > > > restored setting up the 'this' pointer. Previously, it was being
> > > > skipped because ctype was not being set for xobj member functions.
> > > > However, ctype not being set was causing the scope to not be set up
> > > > correctly for lambdas. In truth I don't remember exactly how the
> > > > problem was presenting but I do know how I fixed it.
> > > > 
> > > > `tree fntype = TREE_TYPE (decl1); if (TREE_CODE (fntype) == METHOD_TYPE) ctype = TYPE_METHOD_BASETYPE (fntype); else if (DECL_XOBJ_MEMBER_FUNC_P (decl1)) ctype = DECL_CONTEXT (decl1);`
> > > > 
> > > > In hindsight though, I have to question if this is the correct way of
> > > > dealing with it. As I mentioned previously, there's an additional quirk
> > > > in start_preparsed_function where it sets up the 'this' param, or at
> > > > least, it kind of looks like it? This is where my rambling was about.
> > > > 
> > > > `/* We don't need deal with 'this' or vtable for xobj member functions. */ if (ctype && !doing_friend && !(DECL_STATIC_FUNCTION_P (decl1) || DECL_XOBJ_MEMBER_FUNC_P (decl1)))`
> > > > 
> > > > My solution was to just not enter that block for xobj member functions,
> > > > but I'm realizing now that ctype doesn't seem to be set for static
> > > > member functions so setting ctype for xobj member functions might be
> > > > incorrect. I'll need to investigate it closer, I recall seeing the
> > > > second block above and thinking "it's checking to make sure decl1 is
> > > > not a static member function before entering this block, so that means
> > > > it must be set." It seems that was incorrect.
> > > > 
> > > > At bare minimum, I'll have to look at this again.
> > > 
> > > Yeah, that could use some refactoring. IMO ctype should be set iff we
> > > want to push_nested_class, and the code for setting up 'this' should
> > > check for METHOD_TYPE instead of referring to ctype.
> > 
> > I'll put it on the todo list for after this patch is done. I'll note
> > that it seemed like not doing push_nested_class seemed to work just
> > fine for xobj member functions, and only caused problems once I was
> > working on lambdas.
> 
> 
> Hmm, we want push_nested_class for any member function so that name
> lookup within the function finds class members.

Okay, thanks for noting this explicitly, I probably would have run into
problems otherwise. Come to think of it, I might have to be careful to
make sure that I'm handling calling a function of base from a derived
class correctly. I definitely need tests for this case.

> > For this patch, I think I will change the condition
> > for setting up 'this' instead. On the other hand, are we sure that
> > lambdas don't assume 'this' was setup when accessing their captures? I
> > will investigate this briefly but now that I've thought of that, maybe
> > I won't mess around with this after all. On the other other hand, I'm
> > pretty sure captures still work for lambdas even though that block
> > isn't being handled now... or maybe I haven't tested them.
> 
> 
> Captures don't rely on being able to name the object parameter, they use
> it directly without name lookup. And indeed 'this' doesn't name the
> object parameter for the lambda op(), finish_this_expr looks past it.

Wonderful, that should be helpful.

> > Actually, I think it doesn't matter if lambdas do use that or not for
> > their captures, because handling lambdas with an xobj parameter
> > correctly will still involve changing how lambdas access their
> > captures. PR102609 has a reply providing a test case when calling a
> > lambdas call operator where the first argument is a type unrelated to
> > the lambda's closure type. This might actually be relevant because I
> > can think of a case where one could conditionally (if constexpr) access
> > captures based on the type of the xobj parameter. THEN AGAIN, I don't
> > think it's possible to pass an unrelated implicit object parameter to a
> > lambda's call operator. (Not without explicitly calling it by function
> > pointer of course.) There are types that are derived from the lambda,
> > but those are still related so I assume that captures should be able to
> > be accessed in such a case.
> > 
> > Man, that raises even more questions for myself. If we are supposed to
> > implicitly use the explicit object parameter for accessing captures...
> > 
> > ([PR102609] Gašper Ažman from comment #17)
> > 
> > > When a lambda has captures, the explicit object parameter is used to get at
> > > them silently, if they are related, otherwise the program is ill-formed:
> > 
> > ...then what happens when the type of the explicit object parameter is
> > a class derived from the lambda, and the derived class has a member
> > variable with the same name as a capture we are trying to access?
> 
> 
> That ought to work fine; capture fields don't have the name of the
> variables they capture, they are only accessible through the capture
> proxy (from build_capture_proxy), which in turn refers directly to the
> member by its _DECL, not by name lookup.

Okay, so just adding error checking for an unrelated object parameter
type should do the trick for us then.

> > > > Somewhat related is some warnings I wanted to implement for impossible
> > > > to call by-value xobj member functions. Ones that involve an unrelated
> > > > type get dicey because people could in theory have legitimate use cases
> > > > for that, P0847R7 includes an example of this combining recursive
> > > > lambdas and the overload pattern to create a visitor. However, I think
> > > > it would be reasonable to warn when a by-value xobj member function can
> > > > not be called due to the presence of overloads that take references.
> > > > Off the top of my head I don't recall how many overloads it would take
> > > > to prevent a by-value version from being called, nor have I looked into
> > > > how to implement this.
> > > 
> > > Hmm, is it possible to make it un-callable? A function taking A and a
> > > function taking A&& are ambiguous when called with an rvalue argument,
> > > the by-value overload isn't hidden. Similarly, A and A& are ambiguous
> > > when called with a cv-unqualified lvalue.
> > 
> > I believe so, consider the following:
> > 
> > struct S {
> > void f(this S) {}
> > void f(this S const&) {}
> > void f(this S&&) {}
> > };
> > 
> > I'm not 100% what the rules on this are, if I were to hazard a guess I
> > bet that the reference overloads are taken because they have one less
> > conversion?
> 
> 
> Initializing a by-value parameter isn't an extra conversion, it's an
> identity conversion, so ambiguous with the reference overloads, just as with

I will have to check again, but I recall seeing cand had conversion to
rvalue for arguments that initialize by-value parameters. I assume I'm
either mistaken or misunderstanding what I was looking at though.

> struct A { };
> void f (A);
> void f (const A&);
> void f (A&&);
> 
> int main()
> {
> A a;
> f (a); // ambiguous
> f (A()); // ambiguous
> }

Ambiguous is good though, we might not need to issue a warning at all.

> > To be clear, I recognize that you could still select the by-value
> > candidate by assigning it to a pointer to function type. I just don't
> > really think this is a useful or typical case, while accidentally
> > preventing a by-value xobj member function from being callable seems
> > like something that could easily happen, especially if we had the
> > following.
> > 
> > struct S {
> > template<typename Self>
> > void f(this Self&&) {}
> > /* Our class is small enough, optimize const calls here. */
> > void f(this S) {}
> > }
> > 
> > I could see this being a typical enough refactor that I would want a
> > warning to alert me that my function that I'm writing for optimization
> > purposes won't ever be called in the cases I intend it to be called in.
> > 
> > Granted, I fully admit that I don't fully understand the semantics with
> > overload resolution here so my concerns might be entirely unfounded.
> > This is especially true in the second case since I'm not sure if the
> > deduced case actually will eat up all the calls of the member function
> > or not.
> 
> 
> Rather, in the deduced case the by-value overload always wins because
> overload resolution prefers a non-template if the conversions are
> equally good.
> 
> struct A { };
> void f (A);
> template <class T> void f (T&&);
> 
> 
> int main()
> {
> A a;
> f (a); // calls non-template
> f (A()); // calls non-template
> }

That's less good, maybe we want a warning in the other direction now
then? I had expected it to prioritize the template for this. I can
definitely imagine someone getting confused why their object isn't
mutating in a case like this if they were delegating to other functions
in the function call.

Well, I tried to write an example and I'm realizing that maybe my
concerns are imagined. So I'm not going to worry about it for now. I'm
thinking that the cases where someone wants to deduce Self and also
have a by-value overload are not that common. On the other hand, maybe
we still want to warn for it in case people try to do it.

I'll put it in the back of my mind for now, it doesn't seem important
enough to worry about anymore.

> > I'll try to get a new version of the patch out ASAP, I'm not sure I'll
> > include everything we talked about right away but I'll stuff in as much
> > as I can. I'll focus on the missing semantics before the more
> > complicated diagnostics changes. I think detection of redeclarations,
> > the correct overriding behavior, and the additional warnings I have
> > been discussing, will come later.
> 
> 
> Makes sense.
> 
> > I think I'm also going to leave investigation of requires expressions
> > in non-dependent contexts that I had mentioned for another time as
> > well. The tests related to those have been fixed and work as expected
> > right now, and I think diving in and trying to figure out whether I can
> > do those tests with non-dependent operands would be a bad use of my
> > time.
> 
> 
> Agreed.

Unfortunately I was a bit under the weather this weekend so I haven't
gotten much work done, I've been going strong today though. Plan stays
the same though.

> > Of course I will make sure all the small changes you requested make it
> > in, that might be the first thing I work on.
> 
> 
> Also note that revisions can slip into Stage 3, though preferably not
> too far.
> 
> Jason

I'll try not to let it come to that, better I don't consider it an
option anyway. :^)

Alex
waffl3x Nov. 18, 2023, 6:43 a.m. UTC | #36
The patch is coming along, I just have a quick question regarding
style. I make use of IILE's (immediately invoked lambda expression) a
whole lot in my own code. I know that their use is controversial in
general so I would prefer to ask instead of just submitting the patch
using them a bunch suddenly. I wouldn't have bothered either but this
part is really miserable without them.

If that would be okay, I would suggest an additional exception to
bracing style for lambdas.
This:
[](){
  // stuff
};
Instead of this:
[]()
  {
    // stuff
  };

This is especially important for IILE pattern IMO, else it looks really
mediocre. If this isn't okay okay I'll refactor all the IILE's that I
added, or just name them and call them instead. Whatever you think is
most appropriate.

Alex
Jason Merrill Nov. 19, 2023, 6:22 a.m. UTC | #37
On Sat, Nov 18, 2023 at 1:43 AM waffl3x <waffl3x@protonmail.com> wrote:
>
> The patch is coming along, I just have a quick question regarding
> style. I make use of IILE's (immediately invoked lambda expression) a
> whole lot in my own code. I know that their use is controversial in
> general so I would prefer to ask instead of just submitting the patch
> using them a bunch suddenly. I wouldn't have bothered either but this
> part is really miserable without them.

We don't use them much currently, but I'm not opposed to them either,
if they help make the code clearer.

> If that would be okay, I would suggest an additional exception to
> bracing style for lambdas.
> This:
> [](){
>   // stuff
> };
> Instead of this:
> []()
>   {
>     // stuff
>   };

Makes sense.

Jason
waffl3x Nov. 19, 2023, 1:39 p.m. UTC | #38
Funny enough I ended up removing the ones I was thinking about, seems
to always happen when I ask style questions but I'm glad to hear it's
okay going forward.

I'm having trouble fixing this bug, based on what Gasper said in
PR102609 I am pretty sure I know what the semantics should be. Since
the capture is not used in the body of the function, it should be well
formed to call the function with an unrelated type.

I had begun trying to tackle the case that Gasper mentioned and got the
following ICE. I also have another case that ICEs so I've been thinking
I don't get to do little changes to fix this. I've been looking at this
for a few hours now and given we are past the deadline now I figured I
should see what others think.

int main()
{
  int x = 42;
  auto f1 = [x](this auto&& self) {};

  static_cast<int(*)(int&)>(decltype(f1)::operator());
}

explicit-obj-lambdaX3.C: In instantiation of 'main()::<lambda(auto:1&&)> static [with auto:1 = int&]':
explicit-obj-lambdaX3.C:33:53:   required from here
   33 |   static_cast<int(*)(int&)>(decltype(f1)::operator());
      |                                                     ^
explicit-obj-lambdaX3.C:31:33: internal compiler error: tree check: expected record_type or union_type or qual_union_type, have integer_type in finish_non_static_data_member, at cp/semantics.cc:2294
   31 |   auto f1 = [x](this auto&& self) {};
      |                                 ^
0x1c66dda tree_check_failed(tree_node const*, char const*, int, char const*, ...)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/tree.cc:8949
0xb2e125 tree_check3(tree_node*, char const*, int, char const*, tree_code, tree_code, tree_code)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/tree.h:3638
0xedfaf4 finish_non_static_data_member(tree_node*, tree_node*, tree_node*, int)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/semantics.cc:2294
0xe8b9b8 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/pt.cc:20864
0xe6d713 tsubst_decl
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/pt.cc:15387
0xe6fb1b tsubst(tree_node*, tree_node*, int, tree_node*)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/pt.cc:15967
0xe7bd81 tsubst_stmt
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/pt.cc:18299
0xe7df18 tsubst_stmt
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/pt.cc:18554
0xea6982 instantiate_body
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/pt.cc:26743
0xea83e9 instantiate_decl(tree_node*, bool, bool)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/pt.cc:27030
0xb5f9c9 resolve_address_of_overloaded_function
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/class.cc:8802
0xb60be1 instantiate_type(tree_node*, tree_node*, int)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/class.cc:9061
0xaf9992 standard_conversion
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/call.cc:1244
0xafcb57 implicit_conversion
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/call.cc:2081
0xb2a8cb perform_direct_initialization_if_possible(tree_node*, tree_node*, bool, int)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/call.cc:13456
0xf69db8 build_static_cast_1
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/typeck.cc:8356
0xf6af1b build_static_cast(unsigned int, tree_node*, tree_node*, int)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/typeck.cc:8566
0xd9fc02 cp_parser_postfix_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:7531
0xda45af cp_parser_unary_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:9244
0xda5db4 cp_parser_cast_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:10148

As I said, there is also this case that also ICEs in the same region.
It's making me think that some core assumptions are being violated in
the code leading up to finish_non_static_data_member.

int main()
{
  int x = 42;
  auto f1 = [x](this auto self) {};
}

explicit-obj-lambdaX3.C: In lambda function:
explicit-obj-lambdaX3.C:31:31: internal compiler error: Segmentation fault
   31 |   auto f1 = [x](this auto self) {};
      |                               ^
0x1869eaa crash_signal
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/toplev.cc:315
0xb2ea0b strip_array_types(tree_node*)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/tree.h:4955
0xf773e2 cp_type_quals(tree_node const*)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/typeck.cc:11509
0xedf993 finish_non_static_data_member(tree_node*, tree_node*, tree_node*, int)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/semantics.cc:2271
0xcc3e12 build_capture_proxy
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/lambda.cc:410
0xcce99a start_lambda_function(tree_node*, tree_node*)
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/lambda.cc:1684
0xdab196 cp_parser_lambda_body
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:12004
0xda8f64 cp_parser_lambda_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:11293
0xd9c03a cp_parser_primary_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:5921
0xda0b33 cp_parser_postfix_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:7868
0xda45af cp_parser_unary_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:9244
0xda5db4 cp_parser_cast_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:10148
0xda5ecd cp_parser_binary_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:10251
0xda6fd9 cp_parser_assignment_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:10595
0xda77b3 cp_parser_constant_expression
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:10885
0xdc9a36 cp_parser_initializer_clause
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:25844
0xdc981d cp_parser_initializer
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:25783
0xdc40bd cp_parser_init_declarator
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:23332
0xdb46d2 cp_parser_simple_declaration
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:15690
0xdb41e5 cp_parser_block_declaration
	/home/waffl3x/projects/gcc-dev/workspace/src/xobj-next/gcc/cp/parser.cc:15510

Any insight is appreciated here, I've been going through with GDB and I
do feel like I'm making progress, but I'm not much closer to figuring
out what the correct solution (to the first) is.

Right now, my plan is when trying to access captures, it should be
checking to see if the type of the xobj param passed in is related to
the context of the function decl. I just haven't figured out exactly
where to insert it yet. With any luck, now that I've written an e-mail
saying I'm having trouble I'll immediately figure out what to do. It's
happened every previous time so far. :)

Alex

On Saturday, November 18th, 2023 at 11:22 PM, Jason Merrill <jason@redhat.com> wrote:


> 
> 
> On Sat, Nov 18, 2023 at 1:43 AM waffl3x waffl3x@protonmail.com wrote:
> 
> > The patch is coming along, I just have a quick question regarding
> > style. I make use of IILE's (immediately invoked lambda expression) a
> > whole lot in my own code. I know that their use is controversial in
> > general so I would prefer to ask instead of just submitting the patch
> > using them a bunch suddenly. I wouldn't have bothered either but this
> > part is really miserable without them.
> 
> 
> We don't use them much currently, but I'm not opposed to them either,
> if they help make the code clearer.
> 
> > If that would be okay, I would suggest an additional exception to
> > bracing style for lambdas.
> > This:
> > {
> > // stuff
> > };
> > Instead of this:
> > 
> > {
> > // stuff
> > };
> 
> 
> Makes sense.
> 
> Jason
Jason Merrill Nov. 19, 2023, 4:31 p.m. UTC | #39
On 11/19/23 08:39, waffl3x wrote:
> Funny enough I ended up removing the ones I was thinking about, seems
> to always happen when I ask style questions but I'm glad to hear it's
> okay going forward.
> 
> I'm having trouble fixing this bug, based on what Gasper said in
> PR102609 I am pretty sure I know what the semantics should be. Since
> the capture is not used in the body of the function, it should be well
> formed to call the function with an unrelated type.

I don't think so: https://eel.is/c++draft/expr.prim#lambda.closure-5
says the type of the xobj parameter must be related.

> I had begun trying to tackle the case that Gasper mentioned and got the
> following ICE. I also have another case that ICEs so I've been thinking
> I don't get to do little changes to fix this. I've been looking at this
> for a few hours now and given we are past the deadline now I figured I
> should see what others think.
> 
> int main()
> {
>    int x = 42;
>    auto f1 = [x](this auto&& self) {};
> 
>    static_cast<int(*)(int&)>(decltype(f1)::operator());
> }

This should be rejected when we try to instantiate the op() with int.

> As I said, there is also this case that also ICEs in the same region.
> It's making me think that some core assumptions are being violated in
> the code leading up to finish_non_static_data_member.
> 
> int main()
> {
>    int x = 42;
>    auto f1 = [x](this auto self) {};
> }

Here I think the problem is in build_capture_proxy:

>   /* The proxy variable forwards to the capture field.  */
>   object = build_fold_indirect_ref (DECL_ARGUMENTS (fn));
>   object = finish_non_static_data_member (member, object, NULL_TREE);

The call to build_fold_indirect_ref assumes that 'this' is a pointer, 
which it is not here.  I think you can just make that conditional on it 
being a pointer or reference?

Jason
waffl3x Nov. 19, 2023, 6:36 p.m. UTC | #40
On Sunday, November 19th, 2023 at 9:31 AM, Jason Merrill <jason@redhat.com> wrote:


>
>
> On 11/19/23 08:39, waffl3x wrote:
>
> > Funny enough I ended up removing the ones I was thinking about, seems
> > to always happen when I ask style questions but I'm glad to hear it's
> > okay going forward.
> >
> > I'm having trouble fixing this bug, based on what Gasper said in
> > PR102609 I am pretty sure I know what the semantics should be. Since
> > the capture is not used in the body of the function, it should be well
> > formed to call the function with an unrelated type.
>
>
> I don't think so: https://eel.is/c++draft/expr.prim#lambda.closure-5
> says the type of the xobj parameter must be related.

Well, thanks for bringing that to my attention, that makes things
easier but I'm kinda disappointed, I almost wanted an excuse to write
more code. I wonder why Gasper thought this wasn't the case, perhaps it
was a later decision?

Okay, I checked, that paragraph is in the original paper, I thought I
could just take what one of the paper's authors said for granted but I
guess not, I'm not going to make that mistake again. On the other hand,
maybe he just misunderstood my question, what can you do.

Well regardless, that reduces the things I have left to do by a whole
lot.

> > I had begun trying to tackle the case that Gasper mentioned and got the
> > following ICE. I also have another case that ICEs so I've been thinking
> > I don't get to do little changes to fix this. I've been looking at this
> > for a few hours now and given we are past the deadline now I figured I
> > should see what others think.
> >
> > int main()
> > {
> > int x = 42;
> > auto f1 = [x](this auto&& self) {};
> >
> > static_cast<int(*)(int&)>(decltype(f1)::operator());
> > }
>
>
> This should be rejected when we try to instantiate the op() with int.

Yep, absolutely, that is clear now.

Just for the record, clang accepts it, but since I can't think of any
use cases for this I don't think there's any value in supporting it.

>
> > As I said, there is also this case that also ICEs in the same region.
> > It's making me think that some core assumptions are being violated in
> > the code leading up to finish_non_static_data_member.
> >
> > int main()
> > {
> > int x = 42;
> > auto f1 = [x](this auto self) {};
> > }
>
>
> Here I think the problem is in build_capture_proxy:
>
> > /* The proxy variable forwards to the capture field. */
> > object = build_fold_indirect_ref (DECL_ARGUMENTS (fn));
> > object = finish_non_static_data_member (member, object, NULL_TREE);
>
>
> The call to build_fold_indirect_ref assumes that 'this' is a pointer,
> which it is not here. I think you can just make that conditional on it
> being a pointer or reference?
>
> Jason

Thanks, I will take a look at that area.

I'm having trouble fixing the error for this case, the control flow
when the functions are overloaded is much more complex.

struct S {
  void f(this S&) {}
  void f(this S&, int)

  void g() {
    void (*fp)(S&) = &f;
  }
};

This seemed to have fixed the non overloaded case, but I'm also not
very happy with it, it feels kind of icky. Especially since the expr's
location isn't available here, although, it just occurred to me that
the expr's location is probably stored in the node.

typeck.cc:cp_build_addr_expr_1
```
    case BASELINK:
      arg = BASELINK_FUNCTIONS (arg);
      if (DECL_XOBJ_MEMBER_FUNC_P (
        {
          error ("You must qualify taking address of xobj member functions");
	  return error_mark_node;
        }

Anyway, I'm quite tired but I'll to finish off the lambda stuff before
calling it, then I'll run a bootstrap and tests and if all is well I
will submit the patch. I will probably skimp on the changelog and
commit message as that's the part I have the hardest time on,
especially when I'm tired.

Alex
```
Jason Merrill Nov. 19, 2023, 8:34 p.m. UTC | #41
On 11/19/23 13:36, waffl3x wrote:
> I'm having trouble fixing the error for this case, the control flow
> when the functions are overloaded is much more complex.
> 
> struct S {
>    void f(this S&) {}
>    void f(this S&, int)
> 
>    void g() {
>      void (*fp)(S&) = &f;
>    }
> };
> 
> This seemed to have fixed the non overloaded case, but I'm also not
> very happy with it, it feels kind of icky. Especially since the expr's
> location isn't available here, although, it just occurred to me that
> the expr's location is probably stored in the node.
> 
> typeck.cc:cp_build_addr_expr_1
> ```
>      case BASELINK:
>        arg = BASELINK_FUNCTIONS (arg);
>        if (DECL_XOBJ_MEMBER_FUNC_P (
>          {
>            error ("You must qualify taking address of xobj member functions");
> 	  return error_mark_node;
>          }

The loc variable was set earlier in the function, you can use that.

The overloaded case we want to handle here in 
resolve_address_of_overloaded_function:

>   if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn)
>       && !(complain & tf_ptrmem_ok) && !flag_ms_extensions)
>     {
>       static int explained;
> 
>       if (!(complain & tf_error))
>         return error_mark_node;
> 
>       auto_diagnostic_group d;
>       if (permerror (input_location, "assuming pointer to member %qD", fn)
>           && !explained)
>         {
>           inform (input_location, "(a pointer to member can only be "
>                   "formed with %<&%E%>)", fn);
>           explained = 1;
>         }
>     }

Jason
waffl3x Nov. 19, 2023, 9:44 p.m. UTC | #42
On Sunday, November 19th, 2023 at 1:34 PM, Jason Merrill <jason@redhat.com> wrote:


> 
> 
> On 11/19/23 13:36, waffl3x wrote:
> 
> > I'm having trouble fixing the error for this case, the control flow
> > when the functions are overloaded is much more complex.
> > 
> > struct S {
> > void f(this S&) {}
> > void f(this S&, int)
> > 
> > void g() {
> > void (*fp)(S&) = &f;
> > }
> > };
> > 
> > This seemed to have fixed the non overloaded case, but I'm also not
> > very happy with it, it feels kind of icky. Especially since the expr's
> > location isn't available here, although, it just occurred to me that
> > the expr's location is probably stored in the node.
> > 
> > typeck.cc:cp_build_addr_expr_1
> > ```
> > case BASELINK:
> > arg = BASELINK_FUNCTIONS (arg);
> > if (DECL_XOBJ_MEMBER_FUNC_P (
> > {
> > error ("You must qualify taking address of xobj member functions");
> > return error_mark_node;
> > }
> 
> 
> The loc variable was set earlier in the function, you can use that.

Will do.

> The overloaded case we want to handle here in
> resolve_address_of_overloaded_function:
> 
> > if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn)
> > && !(complain & tf_ptrmem_ok) && !flag_ms_extensions)
> > {
> > static int explained;
> > 
> > if (!(complain & tf_error))
> > return error_mark_node;
> > 
> > auto_diagnostic_group d;
> > if (permerror (input_location, "assuming pointer to member %qD", fn)
> > && !explained)
> > {
> > inform (input_location, "(a pointer to member can only be "
> > "formed with %<&%E%>)", fn);
> > explained = 1;
> > }
> > }
> 
> 
> Jason

I'll check that out now, I just mostly finished the first lambda crash.

What is the proper way to error out of instantiate_body? What I have
right now is just not recursing down further if theres a problem. Also,
I'm starting to wonder if I should actually be erroring in
instantiate_decl instead.

I guess it will be better to just finish and you can share your
comments upon review though.

Alex
Jason Merrill Nov. 20, 2023, 2:35 p.m. UTC | #43
On 11/19/23 16:44, waffl3x wrote:
> 
> 
> 
> 
> 
> On Sunday, November 19th, 2023 at 1:34 PM, Jason Merrill <jason@redhat.com> wrote:
> 
> 
>>
>>
>> On 11/19/23 13:36, waffl3x wrote:
>>
>>> I'm having trouble fixing the error for this case, the control flow
>>> when the functions are overloaded is much more complex.
>>>
>>> struct S {
>>> void f(this S&) {}
>>> void f(this S&, int)
>>>
>>> void g() {
>>> void (*fp)(S&) = &f;
>>> }
>>> };
>>>
>>> This seemed to have fixed the non overloaded case, but I'm also not
>>> very happy with it, it feels kind of icky. Especially since the expr's
>>> location isn't available here, although, it just occurred to me that
>>> the expr's location is probably stored in the node.
>>>
>>> typeck.cc:cp_build_addr_expr_1
>>> ```
>>> case BASELINK:
>>> arg = BASELINK_FUNCTIONS (arg);
>>> if (DECL_XOBJ_MEMBER_FUNC_P (
>>> {
>>> error ("You must qualify taking address of xobj member functions");
>>> return error_mark_node;
>>> }
>>
>>
>> The loc variable was set earlier in the function, you can use that.
> 
> Will do.
> 
>> The overloaded case we want to handle here in
>> resolve_address_of_overloaded_function:
>>
>>> if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn)
>>> && !(complain & tf_ptrmem_ok) && !flag_ms_extensions)
>>> {
>>> static int explained;
>>>
>>> if (!(complain & tf_error))
>>> return error_mark_node;
>>>
>>> auto_diagnostic_group d;
>>> if (permerror (input_location, "assuming pointer to member %qD", fn)
>>> && !explained)
>>> {
>>> inform (input_location, "(a pointer to member can only be "
>>> "formed with %<&%E%>)", fn);
>>> explained = 1;
>>> }
>>> }
>>
>>
>> Jason
> 
> I'll check that out now, I just mostly finished the first lambda crash.
> 
> What is the proper way to error out of instantiate_body? What I have
> right now is just not recursing down further if theres a problem. Also,
> I'm starting to wonder if I should actually be erroring in
> instantiate_decl instead.

I think you want to error in start_preparsed_function, to handle 
template and non-template cases in the same place.

Jason
waffl3x Nov. 21, 2023, 10:02 a.m. UTC | #44
On Monday, November 20th, 2023 at 7:35 AM, Jason Merrill <jason@redhat.com> wrote:


> 
> 
> On 11/19/23 16:44, waffl3x wrote:
> 
> > On Sunday, November 19th, 2023 at 1:34 PM, Jason Merrill jason@redhat.com wrote:
> > 
> > > On 11/19/23 13:36, waffl3x wrote:
> > > 
> > > > I'm having trouble fixing the error for this case, the control flow
> > > > when the functions are overloaded is much more complex.
> > > > 
> > > > struct S {
> > > > void f(this S&) {}
> > > > void f(this S&, int)
> > > > 
> > > > void g() {
> > > > void (*fp)(S&) = &f;
> > > > }
> > > > };
> > > > 
> > > > This seemed to have fixed the non overloaded case, but I'm also not
> > > > very happy with it, it feels kind of icky. Especially since the expr's
> > > > location isn't available here, although, it just occurred to me that
> > > > the expr's location is probably stored in the node.
> > > > 
> > > > typeck.cc:cp_build_addr_expr_1
> > > > ```
> > > > case BASELINK:
> > > > arg = BASELINK_FUNCTIONS (arg);
> > > > if (DECL_XOBJ_MEMBER_FUNC_P (
> > > > {
> > > > error ("You must qualify taking address of xobj member functions");
> > > > return error_mark_node;
> > > > }
> > > 
> > > The loc variable was set earlier in the function, you can use that.
> > 
> > Will do.
> > 
> > > The overloaded case we want to handle here in
> > > resolve_address_of_overloaded_function:
> > > 
> > > > if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn)
> > > > && !(complain & tf_ptrmem_ok) && !flag_ms_extensions)
> > > > {
> > > > static int explained;
> > > > 
> > > > if (!(complain & tf_error))
> > > > return error_mark_node;
> > > > 
> > > > auto_diagnostic_group d;
> > > > if (permerror (input_location, "assuming pointer to member %qD", fn)
> > > > && !explained)
> > > > {
> > > > inform (input_location, "(a pointer to member can only be "
> > > > "formed with %<&%E%>)", fn);
> > > > explained = 1;
> > > > }
> > > > }
> > > 
> > > Jason
> > 
> > I'll check that out now, I just mostly finished the first lambda crash.
> > 
> > What is the proper way to error out of instantiate_body? What I have
> > right now is just not recursing down further if theres a problem. Also,
> > I'm starting to wonder if I should actually be erroring in
> > instantiate_decl instead.
> 
> 
> I think you want to error in start_preparsed_function, to handle
> template and non-template cases in the same place.
> 
> Jason

I just started a bootstrap, hopefully everything comes out just fine.
If I don't pass out before the tests finish (and the tests are all
fine) then I'll send it in for review tonight.

I stared at start_preparsed_function for a long while and couldn't
figure out where to start off at. So for now the error handling is
split up between instantiate_body and cp_parser_lambda_declarator_opt.
The latter is super not correct but I've been stuck on this for a long
time now though so I wanted to actually get something that works and
then try to make it better.

This patch is not as final as I would have liked as you can probably
deduce from the previous paragraph. I still have to write tests for a
number of cases, but I'm pretty sure everything works. I was going to
say except for by-value xobj parameters in lambdas with captures, but
that's magically working now too. I was also going to say I don't know
why, but I found where my mistake was (that I fixed without realizing
it was causing the aforementioned problem) so I do know what did it.

So rambling aside, I think everything should work now. To reiterate, I
still have to finish the tests for a few things. There's some
diagnostics I'm super not happy with, and lambda's names still say
static, but I already know how to fix that I think.

I will make an attempt at moving the diagnostics for an unrelated
explicit object parameter in a lambda with captures tomorrow. I just
want to get the almost fully featured patch reviewed ASAP, even if it's
still got some cruft.

As soon as these tests pass I will submit the patch, I'm not going to
split it up today, I'm simply too tired, but I assure you the final
version will properly be split up with a correct commit message and
everything.

Alex
diff mbox series

Patch

From bbfbcc72e8c0868559284352c71731394c98441e Mon Sep 17 00:00:00 2001
From: waffl3x <waffl3x@protonmail.com>
Date: Mon, 25 Sep 2023 16:59:10 -0600
Subject: [PATCH] c++: Initial support for C++23 P0847R7 (Deducing This)
 [PR102609]

This patch implements initial support for P0847R7, without additions to
diagnostics.  Almost everything should work correctly, barring a few
limitations which are listed below.  I attempted to minimize changes to the
existing code, treating explicit object member functions as static functions,
while flagging them to give them extra powers seemed to be the best way of
achieving this.  For this patch, the flag is only utilized in call.cc for
resolving overloads and making the actual function call.

Internally, the difference between a static member function and an implicit
object member function appears to be whether the type node of the decl is a
FUNCTION_TYPE or a METHOD_TYPE.  So to get the desired behavior, it seems to be
sufficient to simply prevent conversion from FUNC_TYPE to METHOD_TYPE in
grokdeclarator when the first parameter is an explicit object parameter.  To
achieve this, explicit object parameters are flagged as such through each the
TREE_LIST's purpose member in declarator->u.function.parameters.  Typically the
purpose member is used for default arguments,  as those are not allowed for
explicit object parameters, we are able to repurpose purpose for our purposes.
The value used as a flag is the "this_identifier" global tree, as it seemed to
be the most fitting of the current global trees.  Even though it is obviously
illegal for any parameter except the first to be an explicit object parameter,
each parameter parsed as an explicit object parameter will be flagged in this
manner.  This will be used for diagnostics in the following patch.  When an
explicit object parameter is encountered in grokdeclarator, the purpose member
is nulled before the list is passed elsewhere to maintain compatibility with
any code that assumes that a non-null purpose member indicates a default
argument.  This patch only checks for and nulls the first parameter however.

As for the previously mentioned limitations, lambdas do not work correctly yet,
but I suspect that a few tweaks are all it will take to have them fully
functional.  User defined conversion functions are not called when an explicit
object member function with an explicit object parameter of an unrelated type
is called.  The following case does not behave correctly because of this.

struct S {
  operator size_t() {
    return 42;
  }
  size_t f(this size_t n) {
    return n;
  }
};

int main()
{
  S s{};
  size_t a = s.f();
}

Currently, it appears that the object argument is simply reinterpreted as
a size_t instead of properly calling the user defined conversion function.
The validity of such a conversion is still considered however, if there is no
way to convert S to a size_t an appropriate compile error will be emitted.
I have an idea of what changes need to be made to fix this, but I did not
persue this for the initial implementation patch.
This bug can be observed in the explicit-object-param4.C test case, while
explicit-object-param3.C demonstrates the non functioning lambdas.

	PR c++/102609

gcc/cp/ChangeLog:
	PR c++/102609
	Initial support for C++23 P0847R7 - Deducing this.
	* call.cc (add_candidates): Check if fn is an xobj member function.
	(build_over_call): Ditto.
	* cp-tree.h (struct lang_decl_fn::xobj_func): New data member.
	(DECL_FUNC_XOBJ_FLAG): Define.
	(DECL_IS_XOBJ_MEMBER_FUNC): Define.
	(enum cp_decl_spec): Add ds_this.
	* decl.cc (grokdeclarator): Clear "this_identifier" from declarator's
	first parameter's purpose member.  Set xobj_func flag on xobj member
	function's decl, set it's type node to FUNCTION_TYPE, not METHOD_TYPE.
	* parser.cc (cp_parser_decl_specifier_seq): Handle this specifier.
	(cp_parser_parameter_declaration): Set default argument to
	"this_identifier" for xobj parameters.
	(set_and_check_decl_spec_loc): Add "this".

gcc/testsuite/ChangeLog:
	PR c++/102609
	Initial support for C++23 P0847R7 - Deducing this.
	* g++.dg/cpp23/explicit-object-param1.C: New test.
	* g++.dg/cpp23/explicit-object-param2.C: New test.
	* g++.dg/cpp23/explicit-object-param3.C: New test.
	* g++.dg/cpp23/explicit-object-param4.C: New test.

Signed-off-by: waffl3x <waffl3x@protonmail.com>
---
 gcc/cp/call.cc                                |   6 +-
 gcc/cp/cp-tree.h                              |  19 ++-
 gcc/cp/decl.cc                                |  23 ++++
 gcc/cp/parser.cc                              |  17 ++-
 .../g++.dg/cpp23/explicit-object-param1.C     | 114 ++++++++++++++++++
 .../g++.dg/cpp23/explicit-object-param2.C     |  28 +++++
 .../g++.dg/cpp23/explicit-object-param3.C     |  15 +++
 .../g++.dg/cpp23/explicit-object-param4.C     |  33 +++++
 8 files changed, 249 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/explicit-object-param1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/explicit-object-param2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/explicit-object-param3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/explicit-object-param4.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 85274b81d7e..d861f5e7d54 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -6555,7 +6555,8 @@  add_candidates (tree fns, tree first_arg, const vec<tree, va_gc> *args,
       tree fn_first_arg = NULL_TREE;
       const vec<tree, va_gc> *fn_args = args;
 
-      if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn))
+      if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn)
+	  || DECL_IS_XOBJ_MEMBER_FUNC (fn))
 	{
 	  /* Figure out where the object arg comes from.  If this
 	     function is a non-static member and we didn't get an
@@ -9975,7 +9976,8 @@  build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 	}
     }
   /* Bypass access control for 'this' parameter.  */
-  else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
+  else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
+	   || DECL_IS_XOBJ_MEMBER_FUNC (fn))
     {
       tree arg = build_this (first_arg != NULL_TREE
 			     ? first_arg
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 8fee4754604..2bef5a4909c 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -2932,8 +2932,9 @@  struct GTY(()) lang_decl_fn {
   unsigned maybe_deleted : 1;
   unsigned coroutine_p : 1;
   unsigned implicit_constexpr : 1;
+  unsigned xobj_func : 1;
 
-  unsigned spare : 9;
+  unsigned spare : 8;
 
   /* 32-bits padding on 64-bit host.  */
 
@@ -4394,6 +4395,15 @@  get_vec_init_expr (tree t)
 #define DECL_DEFAULTED_FN(DECL) \
   (LANG_DECL_FN_CHECK (DECL)->defaulted_p)
 
+/* Simple member access, only valid for FUNCTION_DECL nodes.  */
+#define DECL_FUNC_XOBJ_FLAG(NODE)	\
+  (LANG_DECL_FN_CHECK (NODE)->xobj_func)
+/* Nonzero if NODE is a FUNCTION_DECL that is an xobj member function,
+   safely evaluates to false for all non FUNCTION_DECL decls.  */
+#define DECL_IS_XOBJ_MEMBER_FUNC(NODE)			\
+  (TREE_CODE (STRIP_TEMPLATE (NODE)) == FUNCTION_DECL	\
+   && DECL_FUNC_XOBJ_FLAG (NODE) == 1)
+
 /* Nonzero if DECL is explicitly defaulted in the class body.  */
 #define DECL_DEFAULTED_IN_CLASS_P(DECL)					\
   (DECL_DEFAULTED_FN (DECL) && DECL_INITIALIZED_IN_CLASS_P (DECL))
@@ -6223,11 +6233,13 @@  enum cp_storage_class {
 
 /* An individual decl-specifier.  This is used to index the array of
    locations for the declspecs in struct cp_decl_specifier_seq
-   below.  */
+   below.
+   A subset of these enums also corresponds to elements of
+   cp_parser_set_decl_spec_type:decl_spec_names in parser.cc.  */
 
 enum cp_decl_spec {
   ds_first,
-  ds_signed = ds_first,
+  ds_signed = ds_first, /* Index of first element of decl_spec_names.  */
   ds_unsigned,
   ds_short,
   ds_long,
@@ -6244,6 +6256,7 @@  enum cp_decl_spec {
   ds_complex,
   ds_constinit,
   ds_consteval,
+  ds_this, /* Index of last element of decl_spec_names.  */
   ds_thread,
   ds_type_spec,
   ds_redefined_builtin_type_spec,
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 30d89bba9e8..427fd12d3d8 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -12926,6 +12926,8 @@  grokdeclarator (const cp_declarator *declarator,
   if (attrlist)
     diagnose_misapplied_contracts (*attrlist);
 
+  /* Skip over build_memfn_type when a FUNCTION_DECL is an xobj memfn.  */
+  bool is_xobj_member_function = false;
   /* Determine the type of the entity declared by recurring on the
      declarator.  */
   for (; declarator; declarator = declarator->declarator)
@@ -13021,6 +13023,19 @@  grokdeclarator (const cp_declarator *declarator,
 
 	case cdk_function:
 	  {
+	    auto get_xobj_parm = [](tree parm_list)
+	      {
+		if (!parm_list)
+		  return NULL_TREE;
+		if (TREE_VALUE (parm_list) == void_type_node)
+		  return NULL_TREE;
+		if (TREE_PURPOSE (parm_list) != this_identifier)
+		  return NULL_TREE;
+		/* Non-null 'purpose' usually means the param has a default
+		   argument, we don't want to violate this assumption.  */
+		TREE_PURPOSE (parm_list) = NULL_TREE;
+		return TREE_VALUE (parm_list);
+	      };
 	    tree arg_types;
 	    int funcdecl_p;
 
@@ -13041,6 +13056,10 @@  grokdeclarator (const cp_declarator *declarator,
 	    if (raises == error_mark_node)
 	      raises = NULL_TREE;
 
+	    tree xobj_parm
+	      = get_xobj_parm (declarator->u.function.parameters);
+	    is_xobj_member_function = xobj_parm;
+
 	    if (reqs)
 	      error_at (location_of (reqs), "requires-clause on return type");
 	    reqs = declarator->u.function.requires_clause;
@@ -14105,6 +14124,8 @@  grokdeclarator (const cp_declarator *declarator,
     }
 
   if (ctype && TREE_CODE (type) == FUNCTION_TYPE && staticp < 2
+      /* Don't convert xobj member functions to METHOD_TYPE.  */
+      && !is_xobj_member_function
       && !(unqualified_id
 	   && identifier_p (unqualified_id)
 	   && IDENTIFIER_NEWDEL_OP_P (unqualified_id)))
@@ -14333,6 +14354,7 @@  grokdeclarator (const cp_declarator *declarator,
             decl = set_virt_specifiers (decl, virt_specifiers);
 	    if (decl == NULL_TREE)
 	      return error_mark_node;
+	    DECL_FUNC_XOBJ_FLAG (decl) = is_xobj_member_function;
 #if 0
 	    /* This clobbers the attrs stored in `decl' from `attrlist'.  */
 	    /* The decl and setting of decl_attr is also turned off.  */
@@ -14665,6 +14687,7 @@  grokdeclarator (const cp_declarator *declarator,
 			   id_loc);
 	if (decl == NULL_TREE)
 	  return error_mark_node;
+	DECL_FUNC_XOBJ_FLAG (decl) = is_xobj_member_function;
 
 	if (explicitp == 2)
 	  DECL_NONCONVERTING_P (decl) = 1;
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index b0cd1b463c8..0f3ff5c796d 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -15843,6 +15843,16 @@  cp_parser_decl_specifier_seq (cp_parser* parser,
 	    decl_specs->locations[ds_attribute] = token->location;
 	  continue;
 	}
+      /* Special case for "this" specifier, indicating a parm is an xobj parm.
+	 The "this" specifier must be the first specifier in the declaration,
+	 after any attributes.  */
+      if (token->keyword == RID_THIS)
+	{
+	  cp_lexer_consume_token (parser->lexer);
+	  set_and_check_decl_spec_loc (decl_specs, ds_this, token);
+	  continue;
+	}
+
       /* Assume we will find a decl-specifier keyword.  */
       found_decl_spec = true;
       /* If the next token is an appropriate keyword, we can simply
@@ -25239,6 +25249,10 @@  cp_parser_parameter_declaration (cp_parser *parser,
 
   if (default_argument)
     STRIP_ANY_LOCATION_WRAPPER (default_argument);
+  /* Xobj parameters can not have default arguments, thus
+     we can reuse the default argument field to flag the param as such.  */
+  if (decl_spec_seq_has_spec_p (&decl_specifiers, ds_this))
+    default_argument = this_identifier;
 
   /* Generate a location for the parameter, ranging from the start of the
      initial token to the end of the final token (using input_location for
@@ -33587,7 +33601,8 @@  set_and_check_decl_spec_loc (cp_decl_specifier_seq *decl_specs,
 	    "constexpr",
 	    "__complex",
 	    "constinit",
-	    "consteval"
+	    "consteval",
+	    "this"
 	  };
 	  gcc_rich_location richloc (location);
 	  richloc.add_fixit_remove ();
diff --git a/gcc/testsuite/g++.dg/cpp23/explicit-object-param1.C b/gcc/testsuite/g++.dg/cpp23/explicit-object-param1.C
new file mode 100644
index 00000000000..134182c7741
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/explicit-object-param1.C
@@ -0,0 +1,114 @@ 
+// P0847R7
+// { dg-do compile { target c++23 } }
+
+// basic use cases and calling
+
+// non-trailing return
+// definitions
+struct S0 {
+  void f0(this S0) {}
+  void f1(this S0&) {}
+  void f2(this S0&&) {}
+  void f3(this S0 const&) {}
+  void f4(this S0 const&&) {}
+  template<typename Self>
+  void d0(this Self&&) {}
+  void d1(this auto&&) {}
+};
+// declarations
+struct S1 {
+  void f0(this S1);
+  void f1(this S1&);
+  void f2(this S1&&);
+  void f3(this S1 const&);
+  void f4(this S1 const&&);
+  template<typename Self>
+  void d0(this Self&&);
+  void d1(this auto&&);
+};
+// out of line definitions
+void S1::f0(this S1) {}
+void S1::f1(this S1&) {}
+void S1::f2(this S1&&) {}
+void S1::f3(this S1 const&) {}
+void S1::f4(this S1 const&&) {}
+template<typename Self>
+void S1::d0(this Self&&) {}
+void S1::d1(this auto&&) {}
+
+// trailing return
+// definitions
+struct S2 {
+  auto f0(this S2) -> void {}
+  auto f1(this S2&) -> void {}
+  auto f2(this S2&&) -> void {}
+  auto f3(this S2 const&) -> void {}
+  auto f4(this S2 const&&) -> void {}
+  template<typename Self>
+  auto d0(this Self&&) -> void {}
+
+  auto d1(this auto&&) -> void {}
+};
+// declarations
+struct S3 {
+  auto f0(this S3) -> void;
+  auto f1(this S3&) -> void;
+  auto f2(this S3&&) -> void;
+  auto f3(this S3 const&) -> void;
+  auto f4(this S3 const&&) -> void;
+  template<typename Self>
+  auto d0(this Self&&) -> void;
+  auto d1(this auto&&) -> void;
+};
+// out of line definitions
+auto S3::f0(this S3) -> void {}
+auto S3::f1(this S3&) -> void {}
+auto S3::f2(this S3&&) -> void {}
+auto S3::f3(this S3 const&) -> void {}
+auto S3::f4(this S3 const&&) -> void {}
+template<typename Self>
+auto S3::d0(this Self&&) -> void {}
+auto S3::d1(this auto&&) -> void {}
+
+template<typename T>
+void call_with_qualification()
+{
+  T obj{};
+  // by value should take any qualification (f0)
+  T{}.f0();
+  obj.f0();
+  static_cast<T&&>(obj).f0(); 
+  static_cast<T const&>(obj).f0();
+  static_cast<T const&&>(obj).f0();
+  // specific qualification (f1 - f4)
+  T{}.f2();
+  T{}.f3();
+  T{}.f4();
+  obj.f1();
+  obj.f3();
+  static_cast<T&&>(obj).f2();
+  static_cast<T&&>(obj).f3();
+  static_cast<T&&>(obj).f4();
+  static_cast<T const&>(obj).f3();
+  static_cast<T const&&>(obj).f4();
+  // deduced should (obviously) take any qualification (d0, d1)
+  T{}.d0();
+  obj.d0();
+  static_cast<T&&>(obj).d0();
+  static_cast<T const&>(obj).d0();
+  static_cast<T const&&>(obj).d0();
+  T{}.d1();
+  obj.d1();
+  static_cast<T&&>(obj).d1();
+  static_cast<T const&>(obj).d1();
+  static_cast<T const&&>(obj).d1();
+}
+
+void perform_calls()
+{
+  call_with_qualification<S0>();
+  call_with_qualification<S1>();
+  call_with_qualification<S2>();
+  call_with_qualification<S3>();
+}
+
diff --git a/gcc/testsuite/g++.dg/cpp23/explicit-object-param2.C b/gcc/testsuite/g++.dg/cpp23/explicit-object-param2.C
new file mode 100644
index 00000000000..a3164c2537c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/explicit-object-param2.C
@@ -0,0 +1,28 @@ 
+// P0847R7
+// { dg-do run { target c++23 } }
+
+// explicit object member function pointer type deduction,
+// conversion to function pointer, and
+// calling through pointer to function
+
+struct S {
+  int _n;
+  int f(this S& self) { return self._n; }
+};
+
+using f_type = int(*)(S&);
+
+static_assert (__is_same (f_type, decltype (&S::f)));
+
+int main()
+{
+  auto fp0 = &S::f;
+  f_type fp1 = &S::f;
+  static_assert (__is_same (decltype (fp0), decltype (fp1)));
+  S s{42};
+  if (fp0 (s) != 42)
+    __builtin_abort ();
+  if (fp1 (s) != 42)
+    __builtin_abort ();
+}
+
diff --git a/gcc/testsuite/g++.dg/cpp23/explicit-object-param3.C b/gcc/testsuite/g++.dg/cpp23/explicit-object-param3.C
new file mode 100644
index 00000000000..fa92e4cd440
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/explicit-object-param3.C
@@ -0,0 +1,15 @@ 
+// P0847R7
+// { dg-do compile { target c++23 } }
+
+// recursive lambdas
+
+// { dg-excess-errors "deducing this with lambdas not implemented yet" { xfail *-*-* } }
+
+int main()
+{
+  auto cl0 = [](this auto&& self, int n){ return n ? self(n - 1) : 42 };
+  auto cl1 = [](this auto self, int n){ return n ? self(n - 1) : 42};
+  int a = cl0(5);
+  int b = cl1(5);
+}
+
diff --git a/gcc/testsuite/g++.dg/cpp23/explicit-object-param4.C b/gcc/testsuite/g++.dg/cpp23/explicit-object-param4.C
new file mode 100644
index 00000000000..746f6d99b94
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/explicit-object-param4.C
@@ -0,0 +1,33 @@ 
+// P0847R7
+// { dg-do run { target c++23 } }
+
+// test implicit conversion of the object argument
+// to the explicit object parameter
+
+// we compare &s to ret because early on, the
+// object parameter would not convert, it would just get
+// reinterpreted as the type of the explicit object param
+
+// { dg-xfail-run-if "user defined conversions from an implicit object argument to an explicit object parameter are not supported yet" { *-*-* } }
+
+using uintptr_t = __UINTPTR_TYPE__;
+
+struct S {
+    operator uintptr_t() const {
+	return 42;
+    }
+    uintptr_t f(this uintptr_t n) {
+        return n;
+    }
+};
+
+int main() 
+{
+  S s{};
+  uintptr_t ret = s.f();
+  if (ret == reinterpret_cast<uintptr_t>(&s))
+    __builtin_abort ();
+  if (ret != 42)
+    __builtin_abort ();
+}
+
-- 
2.42.0