diff mbox series

[bpf-next] xsk: proper AF_XDP socket teardown ordering

Message ID 20181005112515.3009-1-bjorn.topel@gmail.com
State Accepted, archived
Delegated to: BPF Maintainers
Headers show
Series [bpf-next] xsk: proper AF_XDP socket teardown ordering | expand

Commit Message

Björn Töpel Oct. 5, 2018, 11:25 a.m. UTC
From: Björn Töpel <bjorn.topel@intel.com>

The AF_XDP socket struct can exist in three different, implicit
states: setup, bound and released. Setup is prior the socket has been
bound to a device. Bound is when the socket is active for receive and
send. Released is when the process/userspace side of the socket is
released, but the sock object is still lingering, e.g. when there is a
reference to the socket in an XSKMAP after process termination.

The Rx fast-path code uses the "dev" member of struct xdp_sock to
check whether a socket is bound or relased, and the Tx code uses the
struct xdp_umem "xsk_list" member in conjunction with "dev" to
determine the state of a socket.

However, the transition from bound to released did not tear the socket
down in correct order.

On the Rx side "dev" was cleared after synchronize_net() making the
synchronization useless. On the Tx side, the internal queues were
destroyed prior removing them from the "xsk_list".

This commit corrects the cleanup order, and by doing so
xdp_del_sk_umem() can be simplified and one synchronize_net() can be
removed.

Fixes: 965a99098443 ("xsk: add support for bind for Rx")
Fixes: ac98d8aab61b ("xsk: wire upp Tx zero-copy functions")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/xdp/xdp_umem.c | 11 +++--------
 net/xdp/xsk.c      | 13 ++++++++-----
 2 files changed, 11 insertions(+), 13 deletions(-)

Comments

Song Liu Oct. 5, 2018, 4:41 p.m. UTC | #1
On Fri, Oct 5, 2018 at 4:28 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> From: Björn Töpel <bjorn.topel@intel.com>
>
> The AF_XDP socket struct can exist in three different, implicit
> states: setup, bound and released. Setup is prior the socket has been
> bound to a device. Bound is when the socket is active for receive and
> send. Released is when the process/userspace side of the socket is
> released, but the sock object is still lingering, e.g. when there is a
> reference to the socket in an XSKMAP after process termination.
>
> The Rx fast-path code uses the "dev" member of struct xdp_sock to
> check whether a socket is bound or relased, and the Tx code uses the
> struct xdp_umem "xsk_list" member in conjunction with "dev" to
> determine the state of a socket.
>
> However, the transition from bound to released did not tear the socket
> down in correct order.
>
> On the Rx side "dev" was cleared after synchronize_net() making the
> synchronization useless. On the Tx side, the internal queues were
> destroyed prior removing them from the "xsk_list".
>
> This commit corrects the cleanup order, and by doing so
> xdp_del_sk_umem() can be simplified and one synchronize_net() can be
> removed.
>
> Fixes: 965a99098443 ("xsk: add support for bind for Rx")
> Fixes: ac98d8aab61b ("xsk: wire upp Tx zero-copy functions")
> Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>



> ---
>  net/xdp/xdp_umem.c | 11 +++--------
>  net/xdp/xsk.c      | 13 ++++++++-----
>  2 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
> index c6007c58231c..a264cf2accd0 100644
> --- a/net/xdp/xdp_umem.c
> +++ b/net/xdp/xdp_umem.c
> @@ -32,14 +32,9 @@ void xdp_del_sk_umem(struct xdp_umem *umem, struct xdp_sock *xs)
>  {
>         unsigned long flags;
>
> -       if (xs->dev) {
> -               spin_lock_irqsave(&umem->xsk_list_lock, flags);
> -               list_del_rcu(&xs->list);
> -               spin_unlock_irqrestore(&umem->xsk_list_lock, flags);
> -
> -               if (umem->zc)
> -                       synchronize_net();
> -       }
> +       spin_lock_irqsave(&umem->xsk_list_lock, flags);
> +       list_del_rcu(&xs->list);
> +       spin_unlock_irqrestore(&umem->xsk_list_lock, flags);
>  }
>
>  /* The umem is stored both in the _rx struct and the _tx struct as we do
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index caeddad15b7c..0577cd49aa72 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -355,12 +355,18 @@ static int xsk_release(struct socket *sock)
>         local_bh_enable();
>
>         if (xs->dev) {
> +               struct net_device *dev = xs->dev;
> +
>                 /* Wait for driver to stop using the xdp socket. */
nit: I guess we should move this comment together with synchronize_net().

> -               synchronize_net();
> -               dev_put(xs->dev);
> +               xdp_del_sk_umem(xs->umem, xs);
>                 xs->dev = NULL;
> +               synchronize_net();
> +               dev_put(dev);
>         }
>
> +       xskq_destroy(xs->rx);
> +       xskq_destroy(xs->tx);
> +
>         sock_orphan(sk);
>         sock->sk = NULL;
>
> @@ -714,9 +720,6 @@ static void xsk_destruct(struct sock *sk)
>         if (!sock_flag(sk, SOCK_DEAD))
>                 return;
>
> -       xskq_destroy(xs->rx);
> -       xskq_destroy(xs->tx);
> -       xdp_del_sk_umem(xs->umem, xs);
>         xdp_put_umem(xs->umem);
>
>         sk_refcnt_debug_dec(sk);
> --
> 2.17.1
>
Daniel Borkmann Oct. 8, 2018, 8:11 a.m. UTC | #2
On 10/05/2018 01:25 PM, Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@intel.com>
> 
> The AF_XDP socket struct can exist in three different, implicit
> states: setup, bound and released. Setup is prior the socket has been
> bound to a device. Bound is when the socket is active for receive and
> send. Released is when the process/userspace side of the socket is
> released, but the sock object is still lingering, e.g. when there is a
> reference to the socket in an XSKMAP after process termination.
> 
> The Rx fast-path code uses the "dev" member of struct xdp_sock to
> check whether a socket is bound or relased, and the Tx code uses the
> struct xdp_umem "xsk_list" member in conjunction with "dev" to
> determine the state of a socket.
> 
> However, the transition from bound to released did not tear the socket
> down in correct order.
> 
> On the Rx side "dev" was cleared after synchronize_net() making the
> synchronization useless. On the Tx side, the internal queues were
> destroyed prior removing them from the "xsk_list".
> 
> This commit corrects the cleanup order, and by doing so
> xdp_del_sk_umem() can be simplified and one synchronize_net() can be
> removed.
> 
> Fixes: 965a99098443 ("xsk: add support for bind for Rx")
> Fixes: ac98d8aab61b ("xsk: wire upp Tx zero-copy functions")
> Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Applied to bpf-next, thanks Björn!
diff mbox series

Patch

diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
index c6007c58231c..a264cf2accd0 100644
--- a/net/xdp/xdp_umem.c
+++ b/net/xdp/xdp_umem.c
@@ -32,14 +32,9 @@  void xdp_del_sk_umem(struct xdp_umem *umem, struct xdp_sock *xs)
 {
 	unsigned long flags;
 
-	if (xs->dev) {
-		spin_lock_irqsave(&umem->xsk_list_lock, flags);
-		list_del_rcu(&xs->list);
-		spin_unlock_irqrestore(&umem->xsk_list_lock, flags);
-
-		if (umem->zc)
-			synchronize_net();
-	}
+	spin_lock_irqsave(&umem->xsk_list_lock, flags);
+	list_del_rcu(&xs->list);
+	spin_unlock_irqrestore(&umem->xsk_list_lock, flags);
 }
 
 /* The umem is stored both in the _rx struct and the _tx struct as we do
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index caeddad15b7c..0577cd49aa72 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -355,12 +355,18 @@  static int xsk_release(struct socket *sock)
 	local_bh_enable();
 
 	if (xs->dev) {
+		struct net_device *dev = xs->dev;
+
 		/* Wait for driver to stop using the xdp socket. */
-		synchronize_net();
-		dev_put(xs->dev);
+		xdp_del_sk_umem(xs->umem, xs);
 		xs->dev = NULL;
+		synchronize_net();
+		dev_put(dev);
 	}
 
+	xskq_destroy(xs->rx);
+	xskq_destroy(xs->tx);
+
 	sock_orphan(sk);
 	sock->sk = NULL;
 
@@ -714,9 +720,6 @@  static void xsk_destruct(struct sock *sk)
 	if (!sock_flag(sk, SOCK_DEAD))
 		return;
 
-	xskq_destroy(xs->rx);
-	xskq_destroy(xs->tx);
-	xdp_del_sk_umem(xs->umem, xs);
 	xdp_put_umem(xs->umem);
 
 	sk_refcnt_debug_dec(sk);