[Coroutines] Fix issue with unused corutine function parameters
diff mbox series

Message ID 490e331a-e1f2-bee4-e150-4df87c752bfc@linux.alibaba.com
State New
Headers show
Series
  • [Coroutines] Fix issue with unused corutine function parameters
Related show

Commit Message

JunMa Jan. 20, 2020, 9:25 a.m. UTC
Hi
Accroding to N4835: When a coroutine is invoked, a copy is created
for each coroutine parameter. While in current implementation, we
only collect used parameters. This may lost behavior inside constructor
and destructor of unused parameters.

This patch change the implementation to collect all parameter.

Bootstrap and test on X86_64, is it OK?

Regards
JunMa

gcc/cp
2020-01-20  Jun Ma <JunMa@linux.alibaba.com>

         * coroutines.cc (build_actor_fn): Skip rewrite when there is no
         param references.
         (register_param_uses): Only collect param references here.
         (morph_fn_to_coro): Create coroutine frame field for each
         function params.

gcc/testsuite
2020-01-20  Jun Ma <JunMa@linux.alibaba.com>

         * g++.dg/coroutines/torture/func-params-07.C: New test.
---
 gcc/cp/coroutines.cc                          | 63 +++++++----------
 .../coroutines/torture/func-params-07.C       | 68 +++++++++++++++++++
 2 files changed, 93 insertions(+), 38 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/torture/func-params-07.C

Comments

Iain Sandoe Jan. 20, 2020, 10:07 a.m. UTC | #1
Hi JunMa,

JunMa <JunMa@linux.alibaba.com> wrote:

> Hi
> Accroding to N4835: When a coroutine is invoked, a copy is created
> for each coroutine parameter. While in current implementation, we
> only collect used parameters. This may lost behavior inside constructor
> and destructor of unused parameters.
>
> This patch change the implementation to collect all parameter.

thanks for the patch.

I am not convinced this is the right way to fix this, we do not want to  
increase
the size of the coroutine frame unnecessarily.  I would like to check the  
lifetime
requirements; such copies might only need to be made local to the ramp  
function.

Iain

>
> Bootstrap and test on X86_64, is it OK?
>
> Regards
> JunMa
>
> gcc/cp
> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>
>         * coroutines.cc (build_actor_fn): Skip rewrite when there is no
>         param references.
>         (register_param_uses): Only collect param references here.
>         (morph_fn_to_coro): Create coroutine frame field for each
>         function params.
>
> gcc/testsuite
> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>
>         * g++.dg/coroutines/torture/func-params-07.C: New test.
> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
JunMa Jan. 20, 2020, 10:43 a.m. UTC | #2
在 2020/1/20 下午6:07, Iain Sandoe 写道:
> Hi JunMa,
>
> JunMa <JunMa@linux.alibaba.com> wrote:
>
>> Hi
>> Accroding to N4835: When a coroutine is invoked, a copy is created
>> for each coroutine parameter. While in current implementation, we
>> only collect used parameters. This may lost behavior inside constructor
>> and destructor of unused parameters.
>>
>> This patch change the implementation to collect all parameter.
>
> thanks for the patch.
>
> I am not convinced this is the right way to fix this, we do not want 
> to increase
> the size of the coroutine frame unnecessarily.  I would like to check 
> the lifetime
> requirements; such copies might only need to be made local to the ramp 
> function.
>
> Iain
>
For type with trivial destructor, There is no need to copy parameter if 
it is
never referenced after a reachable suspend 
point(return-to-caller/resume) in the
coroutine. Since we are in front end, that should be inital_suspend 
point. so we
can create field for type with non-trivial destructor first, and skip 
unused parameters
in register_param_uses. I'll update the patch

Regards
JunMa
>>
>> Bootstrap and test on X86_64, is it OK?
>>
>> Regards
>> JunMa
>>
>> gcc/cp
>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>
>>         * coroutines.cc (build_actor_fn): Skip rewrite when there is no
>>         param references.
>>         (register_param_uses): Only collect param references here.
>>         (morph_fn_to_coro): Create coroutine frame field for each
>>         function params.
>>
>> gcc/testsuite
>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>
>>         * g++.dg/coroutines/torture/func-params-07.C: New test.
>> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
>
Iain Sandoe Jan. 20, 2020, 11:55 a.m. UTC | #3
Hi JunMa,

JunMa <JunMa@linux.alibaba.com> wrote:

> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>> Hi JunMa,
>> 
>> JunMa <JunMa@linux.alibaba.com> wrote:
>> 
>>> Hi
>>> Accroding to N4835: When a coroutine is invoked, a copy is created
>>> for each coroutine parameter. While in current implementation, we
>>> only collect used parameters. This may lost behavior inside constructor
>>> and destructor of unused parameters.
>>> 
>>> This patch change the implementation to collect all parameter.
>> 
>> thanks for the patch.
>> 
>> I am not convinced this is the right way to fix this, we do not want to increase
>> the size of the coroutine frame unnecessarily.  I would like to check the lifetime
>> requirements; such copies might only need to be made local to the ramp function.
>> 
>> Iain
> For type with trivial destructor, There is no need to copy parameter if it is
> never referenced after a reachable suspend point(return-to-caller/resume) in the
> coroutine. Since we are in front end, that should be inital_suspend point. so we
> can create field for type with non-trivial destructor first, and skip unused parameters
> in register_param_uses. I'll update the patch

I think that, even if the DTOR is non-trivial, the copy only needs to be in the stack
frame of the ramp, not in the coroutine frame.
This is also what clang does for -O>0 (although it makes a frame copy at O0).

Will clarify this (perhaps the wording could be slightly more specfic).

thanks
Iain

> 
> Regards
> JunMa
>>> Bootstrap and test on X86_64, is it OK?
>>> 
>>> Regards
>>> JunMa
>>> 
>>> gcc/cp
>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>> 
>>>         * coroutines.cc (build_actor_fn): Skip rewrite when there is no
>>>         param references.
>>>         (register_param_uses): Only collect param references here.
>>>         (morph_fn_to_coro): Create coroutine frame field for each
>>>         function params.
>>> 
>>> gcc/testsuite
>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>> 
>>>         * g++.dg/coroutines/torture/func-params-07.C: New test.
>>> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
JunMa Jan. 20, 2020, 12:15 p.m. UTC | #4
在 2020/1/20 下午7:55, Iain Sandoe 写道:
> Hi JunMa,
>
> JunMa <JunMa@linux.alibaba.com> wrote:
>
>> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>>> Hi JunMa,
>>>
>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>
>>>> Hi
>>>> Accroding to N4835: When a coroutine is invoked, a copy is created
>>>> for each coroutine parameter. While in current implementation, we
>>>> only collect used parameters. This may lost behavior inside constructor
>>>> and destructor of unused parameters.
>>>>
>>>> This patch change the implementation to collect all parameter.
>>> thanks for the patch.
>>>
>>> I am not convinced this is the right way to fix this, we do not want to increase
>>> the size of the coroutine frame unnecessarily.  I would like to check the lifetime
>>> requirements; such copies might only need to be made local to the ramp function.
>>>
>>> Iain
>> For type with trivial destructor, There is no need to copy parameter if it is
>> never referenced after a reachable suspend point(return-to-caller/resume) in the
>> coroutine. Since we are in front end, that should be inital_suspend point. so we
>> can create field for type with non-trivial destructor first, and skip unused parameters
>> in register_param_uses. I'll update the patch
> I think that, even if the DTOR is non-trivial, the copy only needs to be in the stack
> frame of the ramp, not in the coroutine frame.
I do think this.  It's just need more work on front end.
> This is also what clang does for -O>0 (although it makes a frame copy at O0).
>
> Will clarify this (perhaps the wording could be slightly more specfic).
No, clang build frame in middle end. In FE, it just generate local 
variable to do the copy, and let the middle end to
do the optimization which can see more opportunity.

Regards
JunMa
> thanks
> Iain
>
>> Regards
>> JunMa
>>>> Bootstrap and test on X86_64, is it OK?
>>>>
>>>> Regards
>>>> JunMa
>>>>
>>>> gcc/cp
>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>
>>>>          * coroutines.cc (build_actor_fn): Skip rewrite when there is no
>>>>          param references.
>>>>          (register_param_uses): Only collect param references here.
>>>>          (morph_fn_to_coro): Create coroutine frame field for each
>>>>          function params.
>>>>
>>>> gcc/testsuite
>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>
>>>>          * g++.dg/coroutines/torture/func-params-07.C: New test.
>>>> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
Iain Sandoe Jan. 20, 2020, 12:21 p.m. UTC | #5
JunMa <JunMa@linux.alibaba.com> wrote:

> 在 2020/1/20 下午7:55, Iain Sandoe 写道:
>> Hi JunMa,
>>
>> JunMa <JunMa@linux.alibaba.com> wrote:
>>
>>> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>>>> Hi JunMa,
>>>>
>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>
>>>>> Hi
>>>>> Accroding to N4835: When a coroutine is invoked, a copy is created
>>>>> for each coroutine parameter. While in current implementation, we
>>>>> only collect used parameters. This may lost behavior inside constructor
>>>>> and destructor of unused parameters.
>>>>>
>>>>> This patch change the implementation to collect all parameter.
>>>> thanks for the patch.
>>>>
>>>> I am not convinced this is the right way to fix this, we do not want  
>>>> to increase
>>>> the size of the coroutine frame unnecessarily.  I would like to check  
>>>> the lifetime
>>>> requirements; such copies might only need to be made local to the ramp  
>>>> function.
>>>>
>>>> Iain
>>> For type with trivial destructor, There is no need to copy parameter if  
>>> it is
>>> never referenced after a reachable suspend  
>>> point(return-to-caller/resume) in the
>>> coroutine. Since we are in front end, that should be inital_suspend  
>>> point. so we
>>> can create field for type with non-trivial destructor first, and skip  
>>> unused parameters
>>> in register_param_uses. I'll update the patch
>> I think that, even if the DTOR is non-trivial, the copy only needs to be  
>> in the stack
>> frame of the ramp, not in the coroutine frame.
> I do think this.  It's just need more work on front end.

I think we already did the work, and know the answer (about param uses in  
the body), right?

>> This is also what clang does for -O>0 (although it makes a frame copy at  
>> O0).
>>
>> Will clarify this (perhaps the wording could be slightly more specfic).
> No, clang build frame in middle end. In FE, it just generate local  
> variable to do the copy, and let the middle end to
> do the optimization which can see more opportunity.

Yes, i understand that clang does this in the ME not in the FE but, so long  
as the principle is correct, it is equivalent for GCC to do this in the FE.

thanks
Iain

>
> Regards
> JunMa
>> thanks
>> Iain
>>
>>> Regards
>>> JunMa
>>>>> Bootstrap and test on X86_64, is it OK?
>>>>>
>>>>> Regards
>>>>> JunMa
>>>>>
>>>>> gcc/cp
>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>
>>>>>         * coroutines.cc (build_actor_fn): Skip rewrite when there is no
>>>>>         param references.
>>>>>         (register_param_uses): Only collect param references here.
>>>>>         (morph_fn_to_coro): Create coroutine frame field for each
>>>>>         function params.
>>>>>
>>>>> gcc/testsuite
>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>
>>>>>         * g++.dg/coroutines/torture/func-params-07.C: New test.
>>>>> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
JunMa Jan. 20, 2020, 12:32 p.m. UTC | #6
在 2020/1/20 下午8:21, Iain Sandoe 写道:
> JunMa <JunMa@linux.alibaba.com> wrote:
>
>> 在 2020/1/20 下午7:55, Iain Sandoe 写道:
>>> Hi JunMa,
>>>
>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>
>>>> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>>>>> Hi JunMa,
>>>>>
>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>
>>>>>> Hi
>>>>>> Accroding to N4835: When a coroutine is invoked, a copy is created
>>>>>> for each coroutine parameter. While in current implementation, we
>>>>>> only collect used parameters. This may lost behavior inside 
>>>>>> constructor
>>>>>> and destructor of unused parameters.
>>>>>>
>>>>>> This patch change the implementation to collect all parameter.
>>>>> thanks for the patch.
>>>>>
>>>>> I am not convinced this is the right way to fix this, we do not 
>>>>> want to increase
>>>>> the size of the coroutine frame unnecessarily.  I would like to 
>>>>> check the lifetime
>>>>> requirements; such copies might only need to be made local to the 
>>>>> ramp function.
>>>>>
>>>>> Iain
>>>> For type with trivial destructor, There is no need to copy 
>>>> parameter if it is
>>>> never referenced after a reachable suspend 
>>>> point(return-to-caller/resume) in the
>>>> coroutine. Since we are in front end, that should be inital_suspend 
>>>> point. so we
>>>> can create field for type with non-trivial destructor first, and 
>>>> skip unused parameters
>>>> in register_param_uses. I'll update the patch
>>> I think that, even if the DTOR is non-trivial, the copy only needs 
>>> to be in the stack
>>> frame of the ramp, not in the coroutine frame.
>> I do think this.  It's just need more work on front end.
>
> I think we already did the work, and know the answer (about param uses 
> in the body), right?
Yes, we may need extra work on copy parameters, I'll do this.
>
>>> This is also what clang does for -O>0 (although it makes a frame 
>>> copy at O0).
>>>
>>> Will clarify this (perhaps the wording could be slightly more specfic).
>> No, clang build frame in middle end. In FE, it just generate local 
>> variable to do the copy, and let the middle end to
>> do the optimization which can see more opportunity.
>
> Yes, i understand that clang does this in the ME not in the FE but, so 
> long as the principle is correct, it is equivalent for GCC to do this 
> in the FE.
>
Yes, just less situation we can handle, that's why we still need 
coroutine frame reduction pass.

Regards
JunMa
> thanks
> Iain
>
>>
>> Regards
>> JunMa
>>> thanks
>>> Iain
>>>
>>>> Regards
>>>> JunMa
>>>>>> Bootstrap and test on X86_64, is it OK?
>>>>>>
>>>>>> Regards
>>>>>> JunMa
>>>>>>
>>>>>> gcc/cp
>>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>>
>>>>>>         * coroutines.cc (build_actor_fn): Skip rewrite when there 
>>>>>> is no
>>>>>>         param references.
>>>>>>         (register_param_uses): Only collect param references here.
>>>>>>         (morph_fn_to_coro): Create coroutine frame field for each
>>>>>>         function params.
>>>>>>
>>>>>> gcc/testsuite
>>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>>
>>>>>>         * g++.dg/coroutines/torture/func-params-07.C: New test.
>>>>>> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
>
Iain Sandoe Jan. 20, 2020, 4:39 p.m. UTC | #7
JunMa <JunMa@linux.alibaba.com> wrote:

> 在 2020/1/20 下午8:21, Iain Sandoe 写道:
>> JunMa <JunMa@linux.alibaba.com> wrote:
>> 
>>> 在 2020/1/20 下午7:55, Iain Sandoe 写道:
>>>> Hi JunMa,
>>>> 
>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>> 
>>>>> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>>>>>> Hi JunMa,
>>>>>> 
>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>> 
>>>>>>> Hi
>>>>>>> Accroding to N4835: When a coroutine is invoked, a copy is created
>>>>>>> for each coroutine parameter. While in current implementation, we
>>>>>>> only collect used parameters. This may lost behavior inside constructor
>>>>>>> and destructor of unused parameters.
>>>>>>> 
>>>>>>> This patch change the implementation to collect all parameter.
>>>>>> thanks for the patch.
>>>>>> 
>>>>>> I am not convinced this is the right way to fix this, we do not want to increase
>>>>>> the size of the coroutine frame unnecessarily.  I would like to check the lifetime
>>>>>> requirements; such copies might only need to be made local to the ramp function.
>>>>>> 
>>>>>> Iain
>>>>> For type with trivial destructor, There is no need to copy parameter if it is
>>>>> never referenced after a reachable suspend point(return-to-caller/resume) in the
>>>>> coroutine. Since we are in front end, that should be inital_suspend point. so we
>>>>> can create field for type with non-trivial destructor first, and skip unused parameters
>>>>> in register_param_uses. I'll update the patch
>>>> I think that, even if the DTOR is non-trivial, the copy only needs to be in the stack
>>>> frame of the ramp, not in the coroutine frame.
>>> I do think this.  It's just need more work on front end.
>> 
>> I think we already did the work, and know the answer (about param uses in the body), right?
> Yes, we may need extra work on copy parameters, I'll do this.

Having discussed this with Nathan (and I’ve also mailed Gor for clarification).  I think it might be
a good idea to wait for those responses before revising (it could be that your original reading of
the wording is correct, and the clang impl. needs to be updated).

thanks
Iain

>>>> This is also what clang does for -O>0 (although it makes a frame copy at O0).
>>>> 
>>>> Will clarify this (perhaps the wording could be slightly more specfic).
>>> No, clang build frame in middle end. In FE, it just generate local variable to do the copy, and let the middle end to
>>> do the optimization which can see more opportunity.
>> 
>> Yes, i understand that clang does this in the ME not in the FE but, so long as the principle is correct, it is equivalent for GCC to do this in the FE.
> Yes, just less situation we can handle, that's why we still need coroutine frame reduction pass.
> 
> Regards
> JunMa
>> thanks
>> Iain
>> 
>>> Regards
>>> JunMa
>>>> thanks
>>>> Iain
>>>> 
>>>>> Regards
>>>>> JunMa
>>>>>>> Bootstrap and test on X86_64, is it OK?
>>>>>>> 
>>>>>>> Regards
>>>>>>> JunMa
>>>>>>> 
>>>>>>> gcc/cp
>>>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>>> 
>>>>>>>         * coroutines.cc (build_actor_fn): Skip rewrite when there is no
>>>>>>>         param references.
>>>>>>>         (register_param_uses): Only collect param references here.
>>>>>>>         (morph_fn_to_coro): Create coroutine frame field for each
>>>>>>>         function params.
>>>>>>> 
>>>>>>> gcc/testsuite
>>>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>>> 
>>>>>>>         * g++.dg/coroutines/torture/func-params-07.C: New test.
>>>>>>> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
JunMa Jan. 21, 2020, 1:34 a.m. UTC | #8
在 2020/1/21 上午12:39, Iain Sandoe 写道:
> JunMa <JunMa@linux.alibaba.com> wrote:
>
>> 在 2020/1/20 下午8:21, Iain Sandoe 写道:
>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>
>>>> 在 2020/1/20 下午7:55, Iain Sandoe 写道:
>>>>> Hi JunMa,
>>>>>
>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>
>>>>>> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>>>>>>> Hi JunMa,
>>>>>>>
>>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>>>
>>>>>>>> Hi
>>>>>>>> Accroding to N4835: When a coroutine is invoked, a copy is created
>>>>>>>> for each coroutine parameter. While in current implementation, we
>>>>>>>> only collect used parameters. This may lost behavior inside constructor
>>>>>>>> and destructor of unused parameters.
>>>>>>>>
>>>>>>>> This patch change the implementation to collect all parameter.
>>>>>>> thanks for the patch.
>>>>>>>
>>>>>>> I am not convinced this is the right way to fix this, we do not want to increase
>>>>>>> the size of the coroutine frame unnecessarily.  I would like to check the lifetime
>>>>>>> requirements; such copies might only need to be made local to the ramp function.
>>>>>>>
>>>>>>> Iain
>>>>>> For type with trivial destructor, There is no need to copy parameter if it is
>>>>>> never referenced after a reachable suspend point(return-to-caller/resume) in the
>>>>>> coroutine. Since we are in front end, that should be inital_suspend point. so we
>>>>>> can create field for type with non-trivial destructor first, and skip unused parameters
>>>>>> in register_param_uses. I'll update the patch
>>>>> I think that, even if the DTOR is non-trivial, the copy only needs to be in the stack
>>>>> frame of the ramp, not in the coroutine frame.
>>>> I do think this.  It's just need more work on front end.
>>> I think we already did the work, and know the answer (about param uses in the body), right?
>> Yes, we may need extra work on copy parameters, I'll do this.
> Having discussed this with Nathan (and I’ve also mailed Gor for clarification).  I think it might be
> a good idea to wait for those responses before revising (it could be that your original reading of
> the wording is correct, and the clang impl. needs to be updated).
ok, thanks.

Regards
JunMa
> thanks
> Iain
>
>>>>> This is also what clang does for -O>0 (although it makes a frame copy at O0).
>>>>>
>>>>> Will clarify this (perhaps the wording could be slightly more specfic).
>>>> No, clang build frame in middle end. In FE, it just generate local variable to do the copy, and let the middle end to
>>>> do the optimization which can see more opportunity.
>>> Yes, i understand that clang does this in the ME not in the FE but, so long as the principle is correct, it is equivalent for GCC to do this in the FE.
>> Yes, just less situation we can handle, that's why we still need coroutine frame reduction pass.
>>
>> Regards
>> JunMa
>>> thanks
>>> Iain
>>>
>>>> Regards
>>>> JunMa
>>>>> thanks
>>>>> Iain
>>>>>
>>>>>> Regards
>>>>>> JunMa
>>>>>>>> Bootstrap and test on X86_64, is it OK?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> JunMa
>>>>>>>>
>>>>>>>> gcc/cp
>>>>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>>>>
>>>>>>>>          * coroutines.cc (build_actor_fn): Skip rewrite when there is no
>>>>>>>>          param references.
>>>>>>>>          (register_param_uses): Only collect param references here.
>>>>>>>>          (morph_fn_to_coro): Create coroutine frame field for each
>>>>>>>>          function params.
>>>>>>>>
>>>>>>>> gcc/testsuite
>>>>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>>>>
>>>>>>>>          * g++.dg/coroutines/torture/func-params-07.C: New test.
>>>>>>>> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
JunMa Jan. 21, 2020, 2:06 a.m. UTC | #9
在 2020/1/21 上午9:34, JunMa 写道:
> 在 2020/1/21 上午12:39, Iain Sandoe 写道:
>> JunMa <JunMa@linux.alibaba.com> wrote:
>>
>>> 在 2020/1/20 下午8:21, Iain Sandoe 写道:
>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>
>>>>> 在 2020/1/20 下午7:55, Iain Sandoe 写道:
>>>>>> Hi JunMa,
>>>>>>
>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>>
>>>>>>> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>>>>>>>> Hi JunMa,
>>>>>>>>
>>>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>> Accroding to N4835: When a coroutine is invoked, a copy is 
>>>>>>>>> created
>>>>>>>>> for each coroutine parameter. While in current implementation, we
>>>>>>>>> only collect used parameters. This may lost behavior inside 
>>>>>>>>> constructor
>>>>>>>>> and destructor of unused parameters.
>>>>>>>>>
>>>>>>>>> This patch change the implementation to collect all parameter.
>>>>>>>> thanks for the patch.
>>>>>>>>
>>>>>>>> I am not convinced this is the right way to fix this, we do not 
>>>>>>>> want to increase
>>>>>>>> the size of the coroutine frame unnecessarily.  I would like to 
>>>>>>>> check the lifetime
>>>>>>>> requirements; such copies might only need to be made local to 
>>>>>>>> the ramp function.
>>>>>>>>
>>>>>>>> Iain
>>>>>>> For type with trivial destructor, There is no need to copy 
>>>>>>> parameter if it is
>>>>>>> never referenced after a reachable suspend 
>>>>>>> point(return-to-caller/resume) in the
>>>>>>> coroutine. Since we are in front end, that should be 
>>>>>>> inital_suspend point. so we
>>>>>>> can create field for type with non-trivial destructor first, and 
>>>>>>> skip unused parameters
>>>>>>> in register_param_uses. I'll update the patch
>>>>>> I think that, even if the DTOR is non-trivial, the copy only 
>>>>>> needs to be in the stack
>>>>>> frame of the ramp, not in the coroutine frame.
>>>>> I do think this.  It's just need more work on front end.
>>>> I think we already did the work, and know the answer (about param 
>>>> uses in the body), right?
>>> Yes, we may need extra work on copy parameters, I'll do this.
>> Having discussed this with Nathan (and I’ve also mailed Gor for 
>> clarification).  I think it might be
>> a good idea to wait for those responses before revising (it could be 
>> that your original reading of
>> the wording is correct, and the clang impl. needs to be updated).
> ok, thanks.
>
The reason why i consider about non-trivial destructors is that if the 
destructors are called in ramp
function, actor function may has different status on something. the 
testcase I attachted is such
example: it changes global variable in destructor.

Regards
JunMa
> Regards
> JunMa
>> thanks
>> Iain
>>
>>>>>> This is also what clang does for -O>0 (although it makes a frame 
>>>>>> copy at O0).
>>>>>>
>>>>>> Will clarify this (perhaps the wording could be slightly more 
>>>>>> specfic).
>>>>> No, clang build frame in middle end. In FE, it just generate local 
>>>>> variable to do the copy, and let the middle end to
>>>>> do the optimization which can see more opportunity.
>>>> Yes, i understand that clang does this in the ME not in the FE but, 
>>>> so long as the principle is correct, it is equivalent for GCC to do 
>>>> this in the FE.
>>> Yes, just less situation we can handle, that's why we still need 
>>> coroutine frame reduction pass.
>>>
>>> Regards
>>> JunMa
>>>> thanks
>>>> Iain
>>>>
>>>>> Regards
>>>>> JunMa
>>>>>> thanks
>>>>>> Iain
>>>>>>
>>>>>>> Regards
>>>>>>> JunMa
>>>>>>>>> Bootstrap and test on X86_64, is it OK?
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> JunMa
>>>>>>>>>
>>>>>>>>> gcc/cp
>>>>>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>>>>>
>>>>>>>>>          * coroutines.cc (build_actor_fn): Skip rewrite when 
>>>>>>>>> there is no
>>>>>>>>>          param references.
>>>>>>>>>          (register_param_uses): Only collect param references 
>>>>>>>>> here.
>>>>>>>>>          (morph_fn_to_coro): Create coroutine frame field for 
>>>>>>>>> each
>>>>>>>>>          function params.
>>>>>>>>>
>>>>>>>>> gcc/testsuite
>>>>>>>>> 2020-01-20  Jun Ma <JunMa@linux.alibaba.com>
>>>>>>>>>
>>>>>>>>>          * g++.dg/coroutines/torture/func-params-07.C: New test.
>>>>>>>>> <0001-Fix-issue-when-copy-function-parameters-into-corouti.patch>
>
Iain Sandoe Jan. 21, 2020, 8:07 a.m. UTC | #10
JunMa <JunMa@linux.alibaba.com> wrote:

> 在 2020/1/21 上午9:34, JunMa 写道:
>> 在 2020/1/21 上午12:39, Iain Sandoe 写道:
>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>
>>>> 在 2020/1/20 下午8:21, Iain Sandoe 写道:
>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>
>>>>>> 在 2020/1/20 下午7:55, Iain Sandoe 写道:
>>>>>>> Hi JunMa,
>>>>>>>
>>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>>>
>>>>>>>> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>>>>>>>>> Hi JunMa,
>>>>>>>>>
>>>>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi
>>>>>>>>>> Accroding to N4835: When a coroutine is invoked, a copy is created
>>>>>>>>>> for each coroutine parameter. While in current implementation, we
>>>>>>>>>> only collect used parameters. This may lost behavior inside  
>>>>>>>>>> constructor
>>>>>>>>>> and destructor of unused parameters.
>>>>>>>>>>
>>>>>>>>>> This patch change the implementation to collect all parameter.
>>>>>>>>> thanks for the patch.
>>>>>>>>>
>>>>>>>>> I am not convinced this is the right way to fix this, we do not  
>>>>>>>>> want to increase
>>>>>>>>> the size of the coroutine frame unnecessarily.  I would like to  
>>>>>>>>> check the lifetime
>>>>>>>>> requirements; such copies might only need to be made local to the  
>>>>>>>>> ramp function.
>>>>>>>>>
>>>>>>>>> Iain
>>>>>>>> For type with trivial destructor, There is no need to copy  
>>>>>>>> parameter if it is
>>>>>>>> never referenced after a reachable suspend  
>>>>>>>> point(return-to-caller/resume) in the
>>>>>>>> coroutine. Since we are in front end, that should be  
>>>>>>>> inital_suspend point. so we
>>>>>>>> can create field for type with non-trivial destructor first, and  
>>>>>>>> skip unused parameters
>>>>>>>> in register_param_uses. I'll update the patch
>>>>>>> I think that, even if the DTOR is non-trivial, the copy only needs  
>>>>>>> to be in the stack
>>>>>>> frame of the ramp, not in the coroutine frame.
>>>>>> I do think this.  It's just need more work on front end.
>>>>> I think we already did the work, and know the answer (about param  
>>>>> uses in the body), right?
>>>> Yes, we may need extra work on copy parameters, I'll do this.
>>> Having discussed this with Nathan (and I’ve also mailed Gor for  
>>> clarification).  I think it might be
>>> a good idea to wait for those responses before revising (it could be  
>>> that your original reading of
>>> the wording is correct, and the clang impl. needs to be updated).
>> ok, thanks.
> The reason why i consider about non-trivial destructors is that if the  
> destructors are called in ramp
> function, actor function may has different status on something. the  
> testcase I attachted is such
> example: it changes global variable in destructor.

Yes we all agree on this :-)

The issue is:
   1. should the copy exist for the duration of the ramp only? (i.e. copied to the ramp() stack frame)
   2. should the copy exist for the duration of the resume() function? (i.e. copied to the coro frame)

At the present, clang appears to do 1. when optimisation is on and 2.  
without optimisation.
Nathan pointed out that the lifetime of an object should not depend on the  
optimisation level.

So - I think we need some clarification on the intent (from the designer)  
and maybe some revision
of the standard wording to make the lifetime clear.

thanks
Iain
JunMa Jan. 21, 2020, 8:30 a.m. UTC | #11
在 2020/1/21 下午4:07, Iain Sandoe 写道:
> JunMa <JunMa@linux.alibaba.com> wrote:
>
>> 在 2020/1/21 上午9:34, JunMa 写道:
>>> 在 2020/1/21 上午12:39, Iain Sandoe 写道:
>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>
>>>>> 在 2020/1/20 下午8:21, Iain Sandoe 写道:
>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>>
>>>>>>> 在 2020/1/20 下午7:55, Iain Sandoe 写道:
>>>>>>>> Hi JunMa,
>>>>>>>>
>>>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>>>>
>>>>>>>>> 在 2020/1/20 下午6:07, Iain Sandoe 写道:
>>>>>>>>>> Hi JunMa,
>>>>>>>>>>
>>>>>>>>>> JunMa <JunMa@linux.alibaba.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi
>>>>>>>>>>> Accroding to N4835: When a coroutine is invoked, a copy is 
>>>>>>>>>>> created
>>>>>>>>>>> for each coroutine parameter. While in current 
>>>>>>>>>>> implementation, we
>>>>>>>>>>> only collect used parameters. This may lost behavior inside 
>>>>>>>>>>> constructor
>>>>>>>>>>> and destructor of unused parameters.
>>>>>>>>>>>
>>>>>>>>>>> This patch change the implementation to collect all parameter.
>>>>>>>>>> thanks for the patch.
>>>>>>>>>>
>>>>>>>>>> I am not convinced this is the right way to fix this, we do 
>>>>>>>>>> not want to increase
>>>>>>>>>> the size of the coroutine frame unnecessarily. I would like 
>>>>>>>>>> to check the lifetime
>>>>>>>>>> requirements; such copies might only need to be made local to 
>>>>>>>>>> the ramp function.
>>>>>>>>>>
>>>>>>>>>> Iain
>>>>>>>>> For type with trivial destructor, There is no need to copy 
>>>>>>>>> parameter if it is
>>>>>>>>> never referenced after a reachable suspend 
>>>>>>>>> point(return-to-caller/resume) in the
>>>>>>>>> coroutine. Since we are in front end, that should be 
>>>>>>>>> inital_suspend point. so we
>>>>>>>>> can create field for type with non-trivial destructor first, 
>>>>>>>>> and skip unused parameters
>>>>>>>>> in register_param_uses. I'll update the patch
>>>>>>>> I think that, even if the DTOR is non-trivial, the copy only 
>>>>>>>> needs to be in the stack
>>>>>>>> frame of the ramp, not in the coroutine frame.
>>>>>>> I do think this.  It's just need more work on front end.
>>>>>> I think we already did the work, and know the answer (about param 
>>>>>> uses in the body), right?
>>>>> Yes, we may need extra work on copy parameters, I'll do this.
>>>> Having discussed this with Nathan (and I’ve also mailed Gor for 
>>>> clarification).  I think it might be
>>>> a good idea to wait for those responses before revising (it could 
>>>> be that your original reading of
>>>> the wording is correct, and the clang impl. needs to be updated).
>>> ok, thanks.
>> The reason why i consider about non-trivial destructors is that if 
>> the destructors are called in ramp
>> function, actor function may has different status on something. the 
>> testcase I attachted is such
>> example: it changes global variable in destructor.
>
> Yes we all agree on this :-)
>
> The issue is:
>   1. should the copy exist for the duration of the ramp only? (i.e. 
> copied to the ramp() stack frame)
>   2. should the copy exist for the duration of the resume() function? 
> (i.e. copied to the coro frame)
>
> At the present, clang appears to do 1. when optimisation is on and 2. 
> without optimisation.
> Nathan pointed out that the lifetime of an object should not depend on 
> the optimisation level.
>
Although I prefer 2, it seems that maybe clang is hard to control 
this(not sure:), maybe new attribute).

> So - I think we need some clarification on the intent (from the 
> designer) and maybe some revision
> of the standard wording to make the lifetime clear.
>
Yeah, let's make this more clear. Thanks.

Regards
JunMa
> thanks
> Iain
>

Patch
diff mbox series

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 49e509f4384..74e0f6d1785 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1877,6 +1877,8 @@  build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
 				     actor_frame, fld_ref, NULL_TREE);
 	  int i;
 	  tree *puse;
+	  if (parm.body_uses == NULL)
+	    continue;
 	  FOR_EACH_VEC_ELT (*parm.body_uses, i, puse)
 	    {
 	      *puse = fld_idx;
@@ -2717,7 +2719,6 @@  struct param_frame_data
   tree *field_list;
   hash_map<tree, param_info> *param_uses;
   location_t loc;
-  bool param_seen;
 };
 
 static tree
@@ -2731,30 +2732,10 @@  register_param_uses (tree *stmt, int *do_subtree ATTRIBUTE_UNUSED, void *d)
   bool existed;
   param_info &parm = data->param_uses->get_or_insert (*stmt, &existed);
   gcc_checking_assert (existed);
-
-  if (parm.field_id == NULL_TREE)
-    {
-      tree actual_type = TREE_TYPE (*stmt);
-
-      if (!COMPLETE_TYPE_P (actual_type))
-	actual_type = complete_type_or_else (actual_type, *stmt);
-
-      if (TREE_CODE (actual_type) == REFERENCE_TYPE)
-	actual_type = build_pointer_type (TREE_TYPE (actual_type));
-
-      parm.frame_type = actual_type;
-      tree pname = DECL_NAME (*stmt);
-      size_t namsize = sizeof ("__parm.") + IDENTIFIER_LENGTH (pname) + 1;
-      char *buf = (char *) alloca (namsize);
-      snprintf (buf, namsize, "__parm.%s", IDENTIFIER_POINTER (pname));
-      parm.field_id
-	= coro_make_frame_entry (data->field_list, buf, actual_type, data->loc);
-      vec_alloc (parm.body_uses, 4);
-      parm.body_uses->quick_push (stmt);
-      data->param_seen = true;
-    }
-  else
-    parm.body_uses->safe_push (stmt);
+  gcc_checking_assert (parm.field_id != NULL_TREE);
+  if (parm.body_uses == NULL)
+    vec_alloc (parm.body_uses, 4);
+  parm.body_uses->safe_push (stmt);
 
   return NULL_TREE;
 }
@@ -3060,6 +3041,8 @@  morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
 	  The second two entries start out empty - and only get populated
 	  when we see uses.  */
       param_uses = new hash_map<tree, param_info>;
+      struct param_frame_data param_data
+	= {&field_list, param_uses, fn_start};
 
       for (tree arg = DECL_ARGUMENTS (orig); arg != NULL;
 	   arg = DECL_CHAIN (arg))
@@ -3067,24 +3050,28 @@  morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
 	  bool existed;
 	  param_info &parm = param_uses->get_or_insert (arg, &existed);
 	  gcc_checking_assert (!existed);
-	  parm.field_id = NULL_TREE;
 	  parm.body_uses = NULL;
+	  tree actual_type = TREE_TYPE (arg);
+
+	  if (!COMPLETE_TYPE_P (actual_type))
+	    actual_type = complete_type_or_else (actual_type, arg);
+
+	  if (TREE_CODE (actual_type) == REFERENCE_TYPE)
+	    actual_type = build_pointer_type (TREE_TYPE (actual_type));
+
+	  parm.frame_type = actual_type;
+	  tree pname = DECL_NAME (arg);
+	  size_t namsize = sizeof ("__parm.") + IDENTIFIER_LENGTH (pname) + 1;
+	  char *buf = (char *) alloca (namsize);
+	  snprintf (buf, namsize, "__parm.%s", IDENTIFIER_POINTER (pname));
+	  parm.field_id
+	    = coro_make_frame_entry (param_data.field_list, buf,
+				     actual_type, param_data.loc);
 	}
 
-      param_frame_data param_data
-	= {&field_list, param_uses, fn_start, false};
       /* We want to record every instance of param's use, so don't include
 	 a 'visited' hash_set.  */
       cp_walk_tree (&fnbody, register_param_uses, &param_data, NULL);
-
-      /* If no uses for any param were seen, act as if there were no
-	 params (it could be that they are only used to construct the
-	 promise).  */
-      if (!param_data.param_seen)
-	{
-	  delete param_uses;
-	  param_uses = NULL;
-	}
     }
 
   /* 4. Now make space for local vars, this is conservative again, and we
@@ -3333,7 +3320,7 @@  morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
       add_stmt (r);
     }
 
-  /* Copy in any of the function params we found to be used.
+  /* Copy in all of the function params.
      Param types with non-trivial dtors will have to be moved into position
      and the dtor run before the frame is freed.  */
   vec<tree, va_gc> *param_dtor_list = NULL;
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/func-params-07.C b/gcc/testsuite/g++.dg/coroutines/torture/func-params-07.C
new file mode 100644
index 00000000000..7559eb36889
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/func-params-07.C
@@ -0,0 +1,68 @@ 
+// { dg-do run }
+
+// template parm in a class
+
+#include "../coro.h"
+
+// boiler-plate for tests of codegen
+#include "../coro1-ret-int-yield-int.h"
+int count=0;
+struct TestAwaiter {
+    int recent_test;
+    TestAwaiter(int test) : recent_test{test} {}
+    ~TestAwaiter() {count++;}
+    bool await_ready() { return true; }
+    void await_suspend(coro::coroutine_handle<>) {}
+    int await_resume() { return recent_test;}
+    void return_value(int x) { recent_test = x;}
+};
+
+coro1 test_lam (struct TestAwaiter t)
+{
+
+	int sum = 0;
+	for (int i = 0; i < 1000; ++i)
+	{
+	  sum += co_await TestAwaiter(1);
+	}
+	co_return sum;
+}
+
+int main ()
+{
+
+  PRINT ("main: create coro1");
+  coro1 x = test_lam (TestAwaiter(1));
+  if (x.handle.done())
+    abort();
+  x.handle.resume();
+  PRINT ("main: after resume (initial suspend)");
+  
+  while(!x.handle.done())
+  x.handle.resume();
+  PRINT ("main: after resume (co_await)");
+
+  /* Now we should have the co_returned value.  */
+  int y = x.handle.promise().get_value();
+  if ( y != 1000 )
+    {
+      PRINTF ("main: wrong result (%d).", y);
+      abort ();
+    }
+
+  if (!x.handle.done())
+    {
+      PRINT ("main: apparently not done...");
+      abort ();
+    }
+
+  x.handle.destroy();
+  x.handle = NULL;
+  if (count != 1002)
+   {
+      PRINT ("main: missed destructor...");
+      abort ();
+   }
+  PRINT ("main: returning");
+  return 0;
+}