diff mbox series

[net] net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()

Message ID 20171221072624.6204-1-xiyou.wangcong@gmail.com
State Accepted, archived
Delegated to: David Miller
Headers show
Series [net] net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap() | expand

Commit Message

Cong Wang Dec. 21, 2017, 7:26 a.m. UTC
The rcu_barrier_bh() in mini_qdisc_pair_swap() is to wait for
flying RCU callback installed by a previous mini_qdisc_pair_swap(),
however we miss it on the tp_head==NULL path, which leads to that
the RCU callback still uses miniq_old->rcu after it is freed together
with qdisc in qdisc_graft(). So just add it on that path too.

Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath ")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/sch_generic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Jiri Pirko Dec. 21, 2017, 9:03 a.m. UTC | #1
Thu, Dec 21, 2017 at 08:26:24AM CET, xiyou.wangcong@gmail.com wrote:
>The rcu_barrier_bh() in mini_qdisc_pair_swap() is to wait for
>flying RCU callback installed by a previous mini_qdisc_pair_swap(),
>however we miss it on the tp_head==NULL path, which leads to that
>the RCU callback still uses miniq_old->rcu after it is freed together
>with qdisc in qdisc_graft(). So just add it on that path too.
>
>Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath ")

This fixes:
752fbcc33405 ("net_sched: no need to free qdisc in RCU callback")

Before that, the issue was not there as the qdisc struct got removed
after a grace period.


>Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>Cc: Jiri Pirko <jiri@mellanox.com>
>Cc: John Fastabend <john.fastabend@gmail.com>
>Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>---
> net/sched/sch_generic.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>index cd1b200acae7..661c7144b53a 100644
>--- a/net/sched/sch_generic.c
>+++ b/net/sched/sch_generic.c
>@@ -1040,6 +1040,8 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
> 
> 	if (!tp_head) {
> 		RCU_INIT_POINTER(*miniqp->p_miniq, NULL);
>+		/* Wait for flying RCU callback before it is freed. */
>+		rcu_barrier_bh();


> 		return;
> 	}
> 
>@@ -1055,7 +1057,7 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
> 	rcu_assign_pointer(*miniqp->p_miniq, miniq);
> 
> 	if (miniq_old)
>-		/* This is counterpart of the rcu barrier above. We need to
>+		/* This is counterpart of the rcu barriers above. We need to

This is incorrect. Here we block in order to not use the same miniq
again in scenario

miniq1 (X)
miniq2
miniq1 (yet there are reader using X)

This call_rcu has 0 relation to the barrier you are adding.


But again, we don't we just free qdisc in call_rcu and avoid the
barrier?
Cong Wang Dec. 21, 2017, 7:01 p.m. UTC | #2
On Thu, Dec 21, 2017 at 1:03 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Thu, Dec 21, 2017 at 08:26:24AM CET, xiyou.wangcong@gmail.com wrote:
>>The rcu_barrier_bh() in mini_qdisc_pair_swap() is to wait for
>>flying RCU callback installed by a previous mini_qdisc_pair_swap(),
>>however we miss it on the tp_head==NULL path, which leads to that
>>the RCU callback still uses miniq_old->rcu after it is freed together
>>with qdisc in qdisc_graft(). So just add it on that path too.
>>
>>Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath ")
>
> This fixes:
> 752fbcc33405 ("net_sched: no need to free qdisc in RCU callback")
>
> Before that, the issue was not there as the qdisc struct got removed
> after a grace period.


This is non-sense. You have to read the stack trace from Jakub again
and tell me why you keep believing any RCU reader involved.

I am pretty sure no one reported any crash between commit
752fbcc33405 and 46209401f8f6.


>
>
>>Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>>Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>>Cc: Jiri Pirko <jiri@mellanox.com>
>>Cc: John Fastabend <john.fastabend@gmail.com>
>>Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>>---
>> net/sched/sch_generic.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>>diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>>index cd1b200acae7..661c7144b53a 100644
>>--- a/net/sched/sch_generic.c
>>+++ b/net/sched/sch_generic.c
>>@@ -1040,6 +1040,8 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
>>
>>       if (!tp_head) {
>>               RCU_INIT_POINTER(*miniqp->p_miniq, NULL);
>>+              /* Wait for flying RCU callback before it is freed. */
>>+              rcu_barrier_bh();
>
>
>>               return;
>>       }
>>
>>@@ -1055,7 +1057,7 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
>>       rcu_assign_pointer(*miniqp->p_miniq, miniq);
>>
>>       if (miniq_old)
>>-              /* This is counterpart of the rcu barrier above. We need to
>>+              /* This is counterpart of the rcu barriers above. We need to
>
> This is incorrect. Here we block in order to not use the same miniq
> again in scenario
>
> miniq1 (X)
> miniq2
> miniq1 (yet there are reader using X)
>
> This call_rcu has 0 relation to the barrier you are adding.


Seriously? It is this call_rcu still flying after we free the qdisc.
Did you seriously look into the stack trace from Jakub?


>
>
> But again, we don't we just free qdisc in call_rcu and avoid the
> barrier?


Non-sense again. Why qdisc code should be adjusted for your
miniq code? It is your own responsibility to take care of this shit.
Don't spread it out of minq.
Cong Wang Dec. 21, 2017, 8:54 p.m. UTC | #3
On Thu, Dec 21, 2017 at 11:01 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, Dec 21, 2017 at 1:03 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>
>>
>> But again, we don't we just free qdisc in call_rcu and avoid the
>> barrier?
>
>
> Non-sense again. Why qdisc code should be adjusted for your
> miniq code? It is your own responsibility to take care of this shit.
> Don't spread it out of minq.

Also, in case you believe call_rcu to free qdisc is queued after
the call_rcu in miniq, you are wrong again:

https://www.kernel.org/pub/linux/kernel/people/paulmck/Answers/RCU/RCUCBordering.html

The rcu callbacks don't guarantee FIFO ordering.
Jiri Pirko Dec. 22, 2017, 9:38 a.m. UTC | #4
Thu, Dec 21, 2017 at 09:54:06PM CET, xiyou.wangcong@gmail.com wrote:
>On Thu, Dec 21, 2017 at 11:01 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Thu, Dec 21, 2017 at 1:03 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>
>>>
>>> But again, we don't we just free qdisc in call_rcu and avoid the
>>> barrier?
>>
>>
>> Non-sense again. Why qdisc code should be adjusted for your
>> miniq code? It is your own responsibility to take care of this shit.
>> Don't spread it out of minq.
>
>Also, in case you believe call_rcu to free qdisc is queued after
>the call_rcu in miniq, you are wrong again:

I certainly don't :) Not sure why you think I do.

>
>https://www.kernel.org/pub/linux/kernel/people/paulmck/Answers/RCU/RCUCBordering.html
>
>The rcu callbacks don't guarantee FIFO ordering.
Jiri Pirko Dec. 22, 2017, 10:23 a.m. UTC | #5
Thu, Dec 21, 2017 at 08:01:35PM CET, xiyou.wangcong@gmail.com wrote:
>On Thu, Dec 21, 2017 at 1:03 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Thu, Dec 21, 2017 at 08:26:24AM CET, xiyou.wangcong@gmail.com wrote:

[...]

>
>
>Seriously? It is this call_rcu still flying after we free the qdisc.
>Did you seriously look into the stack trace from Jakub?

Got it now. You are right. Thank you for politely helping me to get out
of my confusion.
Jiri Pirko Dec. 22, 2017, 10:24 a.m. UTC | #6
Thu, Dec 21, 2017 at 08:26:24AM CET, xiyou.wangcong@gmail.com wrote:
>The rcu_barrier_bh() in mini_qdisc_pair_swap() is to wait for
>flying RCU callback installed by a previous mini_qdisc_pair_swap(),
>however we miss it on the tp_head==NULL path, which leads to that
>the RCU callback still uses miniq_old->rcu after it is freed together
>with qdisc in qdisc_graft(). So just add it on that path too.
>
>Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath ")
>Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>Cc: Jiri Pirko <jiri@mellanox.com>
>Cc: John Fastabend <john.fastabend@gmail.com>
>Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>
David Miller Dec. 26, 2017, 5:30 p.m. UTC | #7
From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Wed, 20 Dec 2017 23:26:24 -0800

> The rcu_barrier_bh() in mini_qdisc_pair_swap() is to wait for
> flying RCU callback installed by a previous mini_qdisc_pair_swap(),
> however we miss it on the tp_head==NULL path, which leads to that
> the RCU callback still uses miniq_old->rcu after it is freed together
> with qdisc in qdisc_graft(). So just add it on that path too.
> 
> Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath ")
> Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Cc: Jiri Pirko <jiri@mellanox.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied.
diff mbox series

Patch

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cd1b200acae7..661c7144b53a 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1040,6 +1040,8 @@  void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
 
 	if (!tp_head) {
 		RCU_INIT_POINTER(*miniqp->p_miniq, NULL);
+		/* Wait for flying RCU callback before it is freed. */
+		rcu_barrier_bh();
 		return;
 	}
 
@@ -1055,7 +1057,7 @@  void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
 	rcu_assign_pointer(*miniqp->p_miniq, miniq);
 
 	if (miniq_old)
-		/* This is counterpart of the rcu barrier above. We need to
+		/* This is counterpart of the rcu barriers above. We need to
 		 * block potential new user of miniq_old until all readers
 		 * are not seeing it.
 		 */