diff mbox

[net,1/1] net sched actions: fix refcnt when GETing of action after bind

Message ID CAM_iQpXZqgcSbK2A0_RYmZxmawNgM9RtUsA2Bf0Jf91meu2SWg@mail.gmail.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Cong Wang Jan. 20, 2017, 6:20 a.m. UTC
On Wed, Jan 18, 2017 at 3:33 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 17-01-17 01:17 PM, Cong Wang wrote:
>>
>> Why this check for RTM_GETACTION? It does not make sense
>> at least for the error case, that is, when tcf_action_get_1() fails
>> in the middle of the loop, all the previous ones should be destroyed
>> no matter it's GETACTION or DELACTION...
>>
>> Also, for the non-error case of GETACTION, they should be
>> destroyed too after dumping to user-space, otherwise there is no
>> way to use them since 'actions' is local to that function.
>>
>> I thought in commit aecc5cefc389 you grab that refcnt on purpose.
>>
>
> I did.
> The issue there (after your original patch) was destroy() would
> decrement the refcount to zero and a GET was essentially translated
> to a DEL. Incrementing the refcount earlier protected against that
> assuming destroy was going to decrement it.
> However, when an action is bound the destroy() doesnt decrement
> the refcnt. So the refcnt keeps going up forever (and therefore
> deleting fails in the future). So we cant use destroy() as is.

Hmm, tcf_action_destroy() should not touch the refcnt at all in this case,
right? Since the refcnt here is not for readers in kernel code but for
user-space. We mix the use of this refcnt, which leads to problems.

Your patch is not correct either for DEL, tcf_action_destroy() is not
needed to call again after tcf_del_notify() fails, right? Probably
it is not needed at all:

Comments

Jamal Hadi Salim Jan. 22, 2017, 12:56 p.m. UTC | #1
On 17-01-20 01:20 AM, Cong Wang wrote:
> On Wed, Jan 18, 2017 at 3:33 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> On 17-01-17 01:17 PM, Cong Wang wrote:
>>>

>> I did.
>> The issue there (after your original patch) was destroy() would
>> decrement the refcount to zero and a GET was essentially translated
>> to a DEL. Incrementing the refcount earlier protected against that
>> assuming destroy was going to decrement it.
>> However, when an action is bound the destroy() doesnt decrement
>> the refcnt. So the refcnt keeps going up forever (and therefore
>> deleting fails in the future). So we cant use destroy() as is.
>
> Hmm, tcf_action_destroy() should not touch the refcnt at all in this case,
> right? Since the refcnt here is not for readers in kernel code but for
> user-space. We mix the use of this refcnt, which leads to problems.
>
> Your patch is not correct either for DEL, tcf_action_destroy() is not
> needed to call again after tcf_del_notify() fails, right? Probably
> it is not needed at all:

Cong,
Please proceed to separate del from get. The trickery is biting us.
Also - run those tests i had in my patch. There is a difference
between the bound vs not-bound use cases.

cheers,
jamal
diff mbox

Patch

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index e10456ef6f..82eed18 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -907,13 +907,8 @@  tca_action_gd(struct net *net, struct nlattr
*nla, struct nlmsghdr *n,
                ret = act_get_notify(net, portid, n, &actions, event);
        else { /* delete */
                ret = tcf_del_notify(net, n, &actions, portid);
-               if (ret)
-                       goto err;
-               return ret;
        }
 err:
-       if (event != RTM_GETACTION)
-               tcf_action_destroy(&actions, 0);
        return ret;
 }