Message ID | 1519292206-6384-2-git-send-email-jasowang@redhat.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Series | [net,v3,1/2] Revert "tuntap: add missing xdp flush" |
On Thu, 22 Feb 2018 17:36:46 +0800
Jason Wang <jasowang@redhat.com> wrote:

> Commit 762c330d670e ("tuntap: add missing xdp flush") tries to fix the
> devmap stall caused by missed xdp flush by counting the pending xdp
> redirected packets and flush when it exceeds NAPI_POLL_WEIGHT or
> MSG_MORE is clear. This may lead to BUG() since xdp_do_flush() was
> called in the process context with preemption enabled. Simply
> disabling preemption may silence the warning but be not enough since
> process may move between different CPUS during a batch which cause
> xdp_do_flush() misses some CPU where the process run
> previously. Consider the fallouts, that commit was reverted. To fix
> the issue correctly, we can simply call xdp_do_flush() immediately
> after xdp_do_redirect(), a side effect is that this removes any
> possibility of batching which could be addressed in the future.
>
> Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
> Fixes: 762c330d670e ("tuntap: add missing xdp flush")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/net/tun.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 2823a4a..a363ea2 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1662,6 +1662,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  			get_page(alloc_frag->page);
>  			alloc_frag->offset += buflen;
>  			err = xdp_do_redirect(tun->dev, &xdp, xdp_prog);
> +			xdp_do_flush_map();
>  			if (err)
>  				goto err_redirect;
>  			rcu_read_unlock();

As you have noticed, the xdp_do_redirect() + xdp_do_flush_map() rely
heavily on being executed in softirq/napi_schedule context.
Particularly the map infra devmap[1]+cpumap depend on the enqueue and
flush operation MUST happen on the same CPU (e.g. stores which
devices needs flushing in a this_cpu_ptr bitmap [1]).

What context is tun_build_skb() invoked under?

Even when you call xdp_do_redirect and xdp_do_flush_map right after
each-other, are we sure we cannot be preempted here?

[1] https://github.com/torvalds/linux/blob/master/kernel/bpf/devmap.c#L209-L215
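To make the same-CPU requirement above concrete, here is a minimal user-space sketch of per-CPU flush bookkeeping. It is not the devmap code: the names sim_enqueue(), sim_flush() and sim_pending(), the array sizes, and the plain arrays standing in for this_cpu_ptr() data are all assumptions made for illustration. It only shows why a flush that runs on a different CPU than the enqueue leaves packets queued.

```c
#include <stdio.h>

#define SIM_NR_CPUS 4   /* hypothetical CPU count for the simulation */
#define SIM_NR_DEVS 8   /* hypothetical number of target devices */

/* One "needs flush" bitmap and one queue counter set per simulated CPU.
 * These stand in for per-CPU data; they are not kernel structures. */
static unsigned int flush_needed[SIM_NR_CPUS];
static int queued[SIM_NR_CPUS][SIM_NR_DEVS];

/* Queue one packet for dev on the CPU we are "running" on and mark the
 * device as needing a flush on that CPU only. */
static void sim_enqueue(int cpu, int dev)
{
	queued[cpu][dev]++;
	flush_needed[cpu] |= 1u << dev;
}

/* Flush only what was marked on this CPU; return the number of packets
 * sent. Work recorded on any other CPU is invisible here. */
static int sim_flush(int cpu)
{
	int sent = 0;

	for (int dev = 0; dev < SIM_NR_DEVS; dev++) {
		if (!(flush_needed[cpu] & (1u << dev)))
			continue;
		sent += queued[cpu][dev];
		queued[cpu][dev] = 0;
	}
	flush_needed[cpu] = 0;
	return sent;
}

/* Total number of packets still sitting in any per-CPU queue. */
static int sim_pending(void)
{
	int total = 0;

	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		for (int dev = 0; dev < SIM_NR_DEVS; dev++)
			total += queued[cpu][dev];
	return total;
}
```

In this model, sim_enqueue(0, 3) followed by sim_flush(0) sends the packet. If the task is moved between the two calls, as in sim_enqueue(1, 3) followed by sim_flush(2), nothing is sent and sim_pending() still reports one packet until something flushes CPU 1. That is the stall a preemption between xdp_do_redirect() and xdp_do_flush_map() could cause.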
On 2018-02-23 01:46, Jesper Dangaard Brouer wrote:
> On Thu, 22 Feb 2018 17:36:46 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> Commit 762c330d670e ("tuntap: add missing xdp flush") tries to fix the
>> devmap stall caused by missed xdp flush by counting the pending xdp
>> redirected packets and flush when it exceeds NAPI_POLL_WEIGHT or
>> MSG_MORE is clear. This may lead to BUG() since xdp_do_flush() was
>> called in the process context with preemption enabled. Simply
>> disabling preemption may silence the warning but be not enough since
>> process may move between different CPUS during a batch which cause
>> xdp_do_flush() misses some CPU where the process run
>> previously. Consider the fallouts, that commit was reverted. To fix
>> the issue correctly, we can simply call xdp_do_flush() immediately
>> after xdp_do_redirect(), a side effect is that this removes any
>> possibility of batching which could be addressed in the future.
>>
>> Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
>> Fixes: 762c330d670e ("tuntap: add missing xdp flush")
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>  drivers/net/tun.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 2823a4a..a363ea2 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1662,6 +1662,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>>  			get_page(alloc_frag->page);
>>  			alloc_frag->offset += buflen;
>>  			err = xdp_do_redirect(tun->dev, &xdp, xdp_prog);
>> +			xdp_do_flush_map();
>>  			if (err)
>>  				goto err_redirect;
>>  			rcu_read_unlock();
> As you have noticed, the xdp_do_redirect() + xdp_do_flush_map() rely
> heavily on being executed in softirq/napi_schedule context.
> Particularly the map infra devmap[1]+cpumap depend on the enqueue and
> flush operation MUST happen on the same CPU (e.g. stores which
> devices needs flushing in a this_cpu_ptr bitmap [1]).
>
> What context is tun_build_skb() invoked under?
>
> Even when you call xdp_do_redirect and xdp_do_flush_map right after
> each-other, are we sure we cannot be preempted here?

OK, I missed the fact that we can be preempted here with preemptible RCU.
Let me disable preemption here and post a V4.

Thanks

>
>
> [1] https://github.com/torvalds/linux/blob/master/kernel/bpf/devmap.c#L209-L215
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 2823a4a..a363ea2 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1662,6 +1662,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 			get_page(alloc_frag->page);
 			alloc_frag->offset += buflen;
 			err = xdp_do_redirect(tun->dev, &xdp, xdp_prog);
+			xdp_do_flush_map();
 			if (err)
 				goto err_redirect;
 			rcu_read_unlock();
Commit 762c330d670e ("tuntap: add missing xdp flush") tries to fix the
devmap stall caused by missed xdp flush by counting the pending xdp
redirected packets and flush when it exceeds NAPI_POLL_WEIGHT or
MSG_MORE is clear. This may lead to BUG() since xdp_do_flush() was
called in the process context with preemption enabled. Simply
disabling preemption may silence the warning but be not enough since
process may move between different CPUS during a batch which cause
xdp_do_flush() misses some CPU where the process run
previously. Consider the fallouts, that commit was reverted. To fix
the issue correctly, we can simply call xdp_do_flush() immediately
after xdp_do_redirect(), a side effect is that this removes any
possibility of batching which could be addressed in the future.

Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Fixes: 762c330d670e ("tuntap: add missing xdp flush")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/tun.c | 1 +
 1 file changed, 1 insertion(+)