Message ID | 20180927204219.17846-1-xiyou.wangcong@gmail.com |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net-next] net_sched: fix an extack message in tcf_block_find() | expand |
On 09/27/2018 01:42 PM, Cong Wang wrote: > It is clearly a copy-n-paste. > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > --- > net/sched/cls_api.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c > index 3de47e99b788..8dd7f8af6d54 100644 > --- a/net/sched/cls_api.c > +++ b/net/sched/cls_api.c > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q, > > *q = qdisc_refcount_inc_nz(*q); > if (!*q) { > - NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists"); > + NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount"); I am not sure it was a copy-n-paste. Qdisc refcount business is kernel internal. If we can not increase the refcount, this is precisely because this qdisc is about to be destroyed. Nothing fundamentally different than having this thread delayed a bit and qdisc_lookup_rcu() returning NULL in the first place. This also means that using RCU for control path is problematic, as surely the caller of this interface would prefer something that succeeds, even if this means waiting a bit in the kernel. Or are we willing to change ip command and make it restart failed syscalls ?
On Thu, Sep 27, 2018 at 2:16 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > On 09/27/2018 01:42 PM, Cong Wang wrote: > > It is clearly a copy-n-paste. > > > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > > --- > > net/sched/cls_api.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c > > index 3de47e99b788..8dd7f8af6d54 100644 > > --- a/net/sched/cls_api.c > > +++ b/net/sched/cls_api.c > > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q, > > > > *q = qdisc_refcount_inc_nz(*q); > > if (!*q) { > > - NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists"); > > + NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount"); > > > I am not sure it was a copy-n-paste. Make sure you knew there is an exactly same extack message (with a same English grammar error). > > Qdisc refcount business is kernel internal. Yeah, but the extack message is already there, this patch doesn't add any new extack. Or you are suggesting we should remove it? > If we can not increase the refcount, this is precisely because this qdisc is about > to be destroyed. Nothing fundamentally different than having this thread delayed a bit > and qdisc_lookup_rcu() returning NULL in the first place. qdisc_lookup_rcu() is not always called, it could be dev->qdisc. I am pretty sure parent exists in dev->qdisc. > > This also means that using RCU for control path is problematic, as surely the caller > of this interface would prefer something that succeeds, even if this means > waiting a bit in the kernel. I fail to validate this statement, Why it prefers success when refcnt reaches 0? > > Or are we willing to change ip command and make it restart failed syscalls ? > I don't understand what you mean by changing ip command, you must mean tc command, but still, I have no idea about how restarting failed syscall could be related to my patch and why we need to restart anything here. If the refcnt goes to 0, it will never come back, retrying won't help anything. BTW: If you have any other question beyond my patch's scope, isn't it better that we start a new thread for discussion? In case you still misunderstand, my patch never intends to address any other problem rather than correcting an inaccurate extack message.
On 09/27/2018 02:36 PM, Cong Wang wrote: > I don't understand what you mean by changing ip command, you must > mean tc command, but still, I have no idea about how restarting failed > syscall could be related to my patch and why we need to restart anything > here. If the refcnt goes to 0, it will never come back, retrying won't help > anything. > Yep, tc command it is. I was not especially commenting your patch (replacing an english message by another does not seem very big deal), but the fact that the code right there seems to be prepared for parallel changes. But using RCU lookups in control path will lead to occasional failures that most user space tools would not expect. Lets assume two tasks are launching "tc qdisc replace dev eth0 root XXX" in whatever order/parallelism. Both should succeed, after/before major RTNL->other_locking_mechanism Control paths are usually using a mutex or a spinlock so that they never hit a 0-refcount at all.
On 9/27/18 3:36 PM, Cong Wang wrote: > On Thu, Sep 27, 2018 at 2:16 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: >> >> >> >> On 09/27/2018 01:42 PM, Cong Wang wrote: >>> It is clearly a copy-n-paste. >>> >>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> >>> --- >>> net/sched/cls_api.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c >>> index 3de47e99b788..8dd7f8af6d54 100644 >>> --- a/net/sched/cls_api.c >>> +++ b/net/sched/cls_api.c >>> @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q, >>> >>> *q = qdisc_refcount_inc_nz(*q); >>> if (!*q) { >>> - NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists"); >>> + NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount"); >> >> >> I am not sure it was a copy-n-paste. > > > Make sure you knew there is an exactly same extack message > (with a same English grammar error). > > >> >> Qdisc refcount business is kernel internal. > > Yeah, but the extack message is already there, this patch doesn't add > any new extack. Or you are suggesting we should remove it? IMO the message grammar should be fixed, but the content is correct -- ie, parent qdisc does not exist.
On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote: > It is clearly a copy-n-paste. > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > --- > net/sched/cls_api.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c > index 3de47e99b788..8dd7f8af6d54 100644 > --- a/net/sched/cls_api.c > +++ b/net/sched/cls_api.c > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q, > > *q = qdisc_refcount_inc_nz(*q); > if (!*q) { > - NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists"); > + NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount"); > err = -EINVAL; > goto errout_rcu; > } Is there a benefit in exposing this info to user? For all intents and purposes Qdisc is gone at that point.
On Fri, Sep 28, 2018 at 4:36 AM Vlad Buslov <vladbu@mellanox.com> wrote: > > On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote: > > It is clearly a copy-n-paste. > > > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > > --- > > net/sched/cls_api.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c > > index 3de47e99b788..8dd7f8af6d54 100644 > > --- a/net/sched/cls_api.c > > +++ b/net/sched/cls_api.c > > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q, > > > > *q = qdisc_refcount_inc_nz(*q); > > if (!*q) { > > - NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists"); > > + NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount"); > > err = -EINVAL; > > goto errout_rcu; > > } > > Is there a benefit in exposing this info to user? Depends on what user you mean here. For kernel developers, yes, this is useful. For others, no. > For all intents and purposes Qdisc is gone at that point. I don't want to be a language lawyer, but there is a difference between "it doesn't exist" and "it exists but being removed". The errno EINVAL betrays what you said too, it must be ENOENT to mach "Qdisc is gone". I don't want to waste my time on this any more. Let's just drop it. I really don't care, do you?
On Thu, Sep 27, 2018 at 1:42 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > It is clearly a copy-n-paste. > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> I regret again for wasting my time, so: NAcked-by: Cong Wang <xiyou.wangcong@gmail.com> I really don't care, do you?
(Changing $subject as the discussion is going to a completely different topic) On Thu, Sep 27, 2018 at 3:19 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > On 09/27/2018 02:36 PM, Cong Wang wrote: > > > I don't understand what you mean by changing ip command, you must > > mean tc command, but still, I have no idea about how restarting failed > > syscall could be related to my patch and why we need to restart anything > > here. If the refcnt goes to 0, it will never come back, retrying won't help > > anything. > > > > Yep, tc command it is. > > I was not especially commenting your patch (replacing an english message by another does > not seem very big deal), but the fact that the code right there seems to be prepared > for parallel changes. > > But using RCU lookups in control path will lead to occasional failures > that most user space tools would not expect. > I already discussed this with Vlad in the beginning of his RTNL removal patches, we both agreed some lock is still needed, it is not completely lockless. Take a look at tc action code now, two spinlocks are still needed even after we will remove the RTNL there. > Lets assume two tasks are launching "tc qdisc replace dev eth0 root XXX" in whatever order/parallelism. > > Both should succeed, after/before major RTNL->other_locking_mechanism Yes, it is never going to be completely lockless. > > Control paths are usually using a mutex or a spinlock so that they never hit a 0-refcount at all. For dev->qdisc, sure, we should already hold a refcnt, it can't be gone. For qdisc_lookup_rcu(), it could be that refcnt goes to 0 before we remove it from hashtable, right? call_rcu() is only called after refcnt==0, so rcu read lock can't help here. Thanks.
On Fri 28 Sep 2018 at 17:03, Cong Wang <xiyou.wangcong@gmail.com> wrote: > On Fri, Sep 28, 2018 at 4:36 AM Vlad Buslov <vladbu@mellanox.com> wrote: >> >> On Thu 27 Sep 2018 at 20:42, Cong Wang <xiyou.wangcong@gmail.com> wrote: >> > It is clearly a copy-n-paste. >> > >> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> >> > --- >> > net/sched/cls_api.c | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c >> > index 3de47e99b788..8dd7f8af6d54 100644 >> > --- a/net/sched/cls_api.c >> > +++ b/net/sched/cls_api.c >> > @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q, >> > >> > *q = qdisc_refcount_inc_nz(*q); >> > if (!*q) { >> > - NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists"); >> > + NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount"); >> > err = -EINVAL; >> > goto errout_rcu; >> > } >> >> Is there a benefit in exposing this info to user? > > Depends on what user you mean here. For kernel developers, yes, > this is useful. For others, no. > > >> For all intents and purposes Qdisc is gone at that point. > > I don't want to be a language lawyer, but there is a difference between > "it doesn't exist" and "it exists but being removed". The errno EINVAL > betrays what you said too, it must be ENOENT to mach "Qdisc is gone". > > I don't want to waste my time on this any more. Let's just drop it. > > I really don't care, do you? I'm asked the question in order to improve error messages in my future patches, not because I care about this particular string.
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 3de47e99b788..8dd7f8af6d54 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -655,7 +655,7 @@ static struct tcf_block *tcf_block_find(struct net *net, struct Qdisc **q, *q = qdisc_refcount_inc_nz(*q); if (!*q) { - NL_SET_ERR_MSG(extack, "Parent Qdisc doesn't exists"); + NL_SET_ERR_MSG(extack, "Can't increase Qdisc refcount"); err = -EINVAL; goto errout_rcu; }
It is clearly a copy-n-paste. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> --- net/sched/cls_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)