diff mbox series

[net-next] net_sched: fix an extack message in tcf_block_find()

Message ID 20180927204219.17846-1-xiyou.wangcong@gmail.com
State Not Applicable, archived
Delegated to: David Miller
Headers show
Series [net-next] net_sched: fix an extack message in tcf_block_find() | expand

Commit Message

Cong Wang Sept. 27, 2018, 8:42 p.m. UTC
It is clearly a copy-n-paste.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/cls_api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Dumazet Sept. 27, 2018, 9:16 p.m. UTC | #1
On 09/27/2018 01:42 PM, Cong Wang wrote:
> It is clearly a copy-n-paste.
> 
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  net/sched/cls_api.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 3de47e99b788..8dd7f8af6d54 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
>  
>  		*q = qdisc_refcount_inc_nz(*q);
>  		if (!*q) {
> -			NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
> +			NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");


I am not sure it was a copy-n-paste.

Qdisc refcount business is kernel internal.
If we can not increase the refcount, this is precisely because this qdisc is about
to be destroyed. Nothing fundamentally different than having this thread delayed a bit
and qdisc_lookup_rcu() returning NULL in the first place.

This also means that using RCU for control path is problematic, as surely the caller
of this interface would prefer something that succeeds, even if this means
waiting a bit in the kernel.

Or are we willing to change ip command and make it restart failed syscalls ?
Cong Wang Sept. 27, 2018, 9:36 p.m. UTC | #2
On Thu, Sep 27, 2018 at 2:16 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 09/27/2018 01:42 PM, Cong Wang wrote:
> > It is clearly a copy-n-paste.
> >
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > ---
> >  net/sched/cls_api.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> > index 3de47e99b788..8dd7f8af6d54 100644
> > --- a/net/sched/cls_api.c
> > +++ b/net/sched/cls_api.c
> > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
> >
> >               *q = qdisc_refcount_inc_nz(*q);
> >               if (!*q) {
> > -                     NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
> > +                     NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
>
>
> I am not sure it was a copy-n-paste.


Make sure you knew there is an exactly same extack message
(with a same English grammar error).


>
> Qdisc refcount business is kernel internal.

Yeah, but the extack message is already there, this patch doesn't add
any new extack. Or you are suggesting we should remove it?



> If we can not increase the refcount, this is precisely because this qdisc is about
> to be destroyed. Nothing fundamentally different than having this thread delayed a bit
> and qdisc_lookup_rcu() returning NULL in the first place.


qdisc_lookup_rcu() is not always called, it could be dev->qdisc.
I am pretty sure parent exists in dev->qdisc.


>
> This also means that using RCU for control path is problematic, as surely the caller
> of this interface would prefer something that succeeds, even if this means
> waiting a bit in the kernel.

I fail to validate this statement, Why it prefers success when refcnt reaches
0?


>
> Or are we willing to change ip command and make it restart failed syscalls ?
>

I don't understand what you mean by changing ip command, you must
mean tc command, but still, I have no idea about how restarting failed
syscall could be related to my patch and why we need to restart anything
here. If the refcnt goes to 0, it will never come back, retrying won't help
anything.

BTW:

If you have any other question beyond my patch's scope, isn't it better
that we start a new thread for discussion?

In case you still misunderstand, my patch never intends to address any
other problem rather than correcting an inaccurate extack message.
Eric Dumazet Sept. 27, 2018, 10:18 p.m. UTC | #3
On 09/27/2018 02:36 PM, Cong Wang wrote:

> I don't understand what you mean by changing ip command, you must
> mean tc command, but still, I have no idea about how restarting failed
> syscall could be related to my patch and why we need to restart anything
> here. If the refcnt goes to 0, it will never come back, retrying won't help
> anything.
>

Yep, tc command it is.

I was not especially commenting your patch (replacing an english message by another does
not seem very big deal), but the fact that the code right there seems to be prepared
for parallel changes.

But using RCU lookups in control path will lead to occasional failures
that most user space tools would not expect.

Lets assume two tasks are launching "tc qdisc replace dev eth0 root XXX" in whatever order/parallelism.

Both should succeed, after/before major RTNL->other_locking_mechanism

Control paths are usually using a mutex or a spinlock so that they never hit a 0-refcount at all.
David Ahern Sept. 28, 2018, 1:51 a.m. UTC | #4
On 9/27/18 3:36 PM, Cong Wang wrote:
> On Thu, Sep 27, 2018 at 2:16 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>
>>
>> On 09/27/2018 01:42 PM, Cong Wang wrote:
>>> It is clearly a copy-n-paste.
>>>
>>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>>> ---
>>>  net/sched/cls_api.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 3de47e99b788..8dd7f8af6d54 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
>>>
>>>               *q = qdisc_refcount_inc_nz(*q);
>>>               if (!*q) {
>>> -                     NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
>>> +                     NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
>>
>>
>> I am not sure it was a copy-n-paste.
> 
> 
> Make sure you knew there is an exactly same extack message
> (with a same English grammar error).
> 
> 
>>
>> Qdisc refcount business is kernel internal.
> 
> Yeah, but the extack message is already there, this patch doesn't add
> any new extack. Or you are suggesting we should remove it?

IMO the message grammar should be fixed, but the content is correct --
ie, parent qdisc does not exist.
Vlad Buslov Sept. 28, 2018, 11:36 a.m. UTC | #5
On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> It is clearly a copy-n-paste.
>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  net/sched/cls_api.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 3de47e99b788..8dd7f8af6d54 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
>  
>  		*q = qdisc_refcount_inc_nz(*q);
>  		if (!*q) {
> -			NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
> +			NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
>  			err = -EINVAL;
>  			goto errout_rcu;
>  		}

Is there a benefit in exposing this info to user?
For all intents and purposes Qdisc is gone at that point.
Cong Wang Sept. 28, 2018, 5:03 p.m. UTC | #6
On Fri, Sep 28, 2018 at 4:36 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>
> On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > It is clearly a copy-n-paste.
> >
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > ---
> >  net/sched/cls_api.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> > index 3de47e99b788..8dd7f8af6d54 100644
> > --- a/net/sched/cls_api.c
> > +++ b/net/sched/cls_api.c
> > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
> >
> >               *q = qdisc_refcount_inc_nz(*q);
> >               if (!*q) {
> > -                     NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
> > +                     NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
> >                       err = -EINVAL;
> >                       goto errout_rcu;
> >               }
>
> Is there a benefit in exposing this info to user?

Depends on what user you mean here. For kernel developers, yes,
this is useful. For others, no.


> For all intents and purposes Qdisc is gone at that point.

I don't want to be a language lawyer, but there is a difference between
"it doesn't exist" and "it exists but being removed". The errno EINVAL
betrays what you said too, it must be ENOENT to mach "Qdisc is gone".

I don't want to waste my time on this any more. Let's just drop it.

I really don't care, do you?
Cong Wang Sept. 28, 2018, 5:04 p.m. UTC | #7
On Thu, Sep 27, 2018 at 1:42 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> It is clearly a copy-n-paste.
>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

I regret again for wasting my time, so:

NAcked-by: Cong Wang <xiyou.wangcong@gmail.com>

I really don't care, do you?
Cong Wang Sept. 28, 2018, 5:19 p.m. UTC | #8
(Changing $subject as the discussion is going to a completely different topic)

On Thu, Sep 27, 2018 at 3:19 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 09/27/2018 02:36 PM, Cong Wang wrote:
>
> > I don't understand what you mean by changing ip command, you must
> > mean tc command, but still, I have no idea about how restarting failed
> > syscall could be related to my patch and why we need to restart anything
> > here. If the refcnt goes to 0, it will never come back, retrying won't help
> > anything.
> >
>
> Yep, tc command it is.
>
> I was not especially commenting your patch (replacing an english message by another does
> not seem very big deal), but the fact that the code right there seems to be prepared
> for parallel changes.
>
> But using RCU lookups in control path will lead to occasional failures
> that most user space tools would not expect.
>

I already discussed this with Vlad in the beginning of his RTNL
removal patches, we both agreed some lock is still needed, it is not
completely lockless. Take a look at tc action code now, two spinlocks
are still needed even after we will remove the RTNL there.


> Lets assume two tasks are launching "tc qdisc replace dev eth0 root XXX" in whatever order/parallelism.
>
> Both should succeed, after/before major RTNL->other_locking_mechanism


Yes, it is never going to be completely lockless.


>
> Control paths are usually using a mutex or a spinlock so that they never hit a 0-refcount at all.


For dev->qdisc, sure, we should already hold a refcnt, it can't be gone.

For qdisc_lookup_rcu(), it could be that refcnt goes to 0 before we
remove it from hashtable, right? call_rcu() is only called after
refcnt==0, so rcu read lock can't help here.

Thanks.
Vlad Buslov Sept. 30, 2018, 2:12 p.m. UTC | #9
On Fri 28 Sep 2018 at 17:03, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Fri, Sep 28, 2018 at 4:36 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>> On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> > It is clearly a copy-n-paste.
>> >
>> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> > ---
>> >  net/sched/cls_api.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> > index 3de47e99b788..8dd7f8af6d54 100644
>> > --- a/net/sched/cls_api.c
>> > +++ b/net/sched/cls_api.c
>> > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
>> >
>> >               *q = qdisc_refcount_inc_nz(*q);
>> >               if (!*q) {
>> > -                     NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
>> > +                     NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
>> >                       err = -EINVAL;
>> >                       goto errout_rcu;
>> >               }
>>
>> Is there a benefit in exposing this info to user?
>
> Depends on what user you mean here. For kernel developers, yes,
> this is useful. For others, no.
>
>
>> For all intents and purposes Qdisc is gone at that point.
>
> I don't want to be a language lawyer, but there is a difference between
> "it doesn't exist" and "it exists but being removed". The errno EINVAL
> betrays what you said too, it must be ENOENT to mach "Qdisc is gone".
>
> I don't want to waste my time on this any more. Let's just drop it.
>
> I really don't care, do you?

I'm asked the question in order to improve error messages in my future
patches, not because I care about this particular string.
diff mbox series

Patch

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 3de47e99b788..8dd7f8af6d54 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -655,7 +655,7 @@  static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q,
 
 		*q = qdisc_refcount_inc_nz(*q);
 		if (!*q) {
-			NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists");
+			NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount");
 			err = -EINVAL;
 			goto errout_rcu;
 		}