
[bpf-next,v2,11/14] xsk: add shared umem support between devices

Message ID: 1594390602-7635-12-git-send-email-magnus.karlsson@intel.com
State: Changes Requested
Delegated to: BPF Maintainers
Series: xsk: support shared umems between devices and queues

Commit Message

Magnus Karlsson July 10, 2020, 2:16 p.m. UTC
Add support to share a umem between different devices. This mode
can be invoked with the XDP_SHARED_UMEM bind flag. Previously,
sharing was only supported within the same device. Note that when
sharing a umem between devices, just as in the case of sharing a
umem between queue ids, you need to create a fill ring and a
completion ring and tie them to the socket (with two setsockopts,
one for each ring) before you do the bind with the
XDP_SHARED_UMEM flag. This so that the single-producer
single-consumer semantics of the rings can be upheld.
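
For illustration, a minimal userspace sketch of this bind sequence
(error handling, umem registration and ring mmaps omitted; the
function and variable names are examples, not part of this patch):

#include <linux/if_xdp.h>
#include <net/if.h>
#include <string.h>
#include <sys/socket.h>

/* xsk_fd: the new AF_XDP socket, which already has an RX and/or TX
 * ring set up. umem_fd: the fd of the socket that registered the umem.
 */
static int bind_shared(int xsk_fd, int umem_fd, const char *ifname,
		       __u32 queue_id)
{
	int entries = 2048; /* ring size in descriptors, power of two */
	struct sockaddr_xdp sxdp;

	/* Give this socket its own fill and completion rings before
	 * the bind, so that each ring keeps a single producer and a
	 * single consumer.
	 */
	setsockopt(xsk_fd, SOL_XDP, XDP_UMEM_FILL_RING,
		   &entries, sizeof(entries));
	setsockopt(xsk_fd, SOL_XDP, XDP_UMEM_COMPLETION_RING,
		   &entries, sizeof(entries));

	memset(&sxdp, 0, sizeof(sxdp));
	sxdp.sxdp_family = AF_XDP;
	/* With this patch, the ifindex may differ from the device
	 * that the umem owner is bound to.
	 */
	sxdp.sxdp_ifindex = if_nametoindex(ifname);
	sxdp.sxdp_queue_id = queue_id;
	sxdp.sxdp_flags = XDP_SHARED_UMEM;
	sxdp.sxdp_shared_umem_fd = umem_fd;

	return bind(xsk_fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
}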

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

Comments

Maxim Mikityanskiy July 14, 2020, 10:17 a.m. UTC | #1
On 2020-07-10 17:16, Magnus Karlsson wrote:
> Add support to share a umem between different devices. This mode
> can be invoked with the XDP_SHARED_UMEM bind flag. Previously,
> sharing was only supported within the same device. Note that when
> sharing a umem between devices, just as in the case of sharing a
> umem between queue ids, you need to create a fill ring and a
> completion ring and tie them to the socket (with two setsockopts,
> one for each ring) before you do the bind with the
> XDP_SHARED_UMEM flag. This so that the single-producer
> single-consumer semantics of the rings can be upheld.

I'm not sure if you saw my comment under v1 asking about performance. 
Could you share what performance numbers (packet rate) you see when 
doing forwarding with xsk_fwd? I'm interested in:

1. Forwarding between two queues of the same netdev.

2. Forwarding between two netdevs.

3. xdpsock -l as the baseline.

Thanks,
Max

> 
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
>   net/xdp/xsk.c | 11 ++++-------
>   1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 05fadd9..4bf47d3 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -695,14 +695,11 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
>   			sockfd_put(sock);
>   			goto out_unlock;
>   		}
> -		if (umem_xs->dev != dev) {
> -			err = -EINVAL;
> -			sockfd_put(sock);
> -			goto out_unlock;
> -		}
>   
> -		if (umem_xs->queue_id != qid) {
> -			/* Share the umem with another socket on another qid */
> +		if (umem_xs->queue_id != qid || umem_xs->dev != dev) {
> +			/* Share the umem with another socket on another qid
> +			 * and/or device.
> +			 */
>   			xs->pool = xp_create_and_assign_umem(xs,
>   							     umem_xs->umem);
>   			if (!xs->pool) {
>
Magnus Karlsson July 16, 2020, 4:35 a.m. UTC | #2
On Tue, Jul 14, 2020 at 12:18 PM Maxim Mikityanskiy
<maximmi@mellanox.com> wrote:
>
> On 2020-07-10 17:16, Magnus Karlsson wrote:
> > Add support to share a umem between different devices. This mode
> > can be invoked with the XDP_SHARED_UMEM bind flag. Previously,
> > sharing was only supported within the same device. Note that when
> > sharing a umem between devices, just as in the case of sharing a
> > umem between queue ids, you need to create a fill ring and a
> > completion ring and tie them to the socket (with two setsockopts,
> > one for each ring) before you do the bind with the
> > XDP_SHARED_UMEM flag. This so that the single-producer
> > single-consumer semantics of the rings can be upheld.
>
> I'm not sure if you saw my comment under v1 asking about performance.
> Could you share what performance numbers (packet rate) you see when
> doing forwarding with xsk_fwd? I'm interested in:
>
> 1. Forwarding between two queues of the same netdev.
>
> 2. Forwarding between two netdevs.
>
> 3. xdpsock -l as the baseline.

Sorry for the delay, Max; it is all due to vacation. I will provide
you with the numbers once the weather turns sour and/or the family
gets tired of me ;-). From what I can remember, it did not scale
perfectly linearly; instead it hit some other bottleneck, though I
did not examine what it was at the time.

/Magnus

> Thanks,
> Max
>
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > ---
> >   net/xdp/xsk.c | 11 ++++-------
> >   1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index 05fadd9..4bf47d3 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -695,14 +695,11 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
> >                       sockfd_put(sock);
> >                       goto out_unlock;
> >               }
> > -             if (umem_xs->dev != dev) {
> > -                     err = -EINVAL;
> > -                     sockfd_put(sock);
> > -                     goto out_unlock;
> > -             }
> >
> > -             if (umem_xs->queue_id != qid) {
> > -                     /* Share the umem with another socket on another qid */
> > +             if (umem_xs->queue_id != qid || umem_xs->dev != dev) {
> > +                     /* Share the umem with another socket on another qid
> > +                      * and/or device.
> > +                      */
> >                       xs->pool = xp_create_and_assign_umem(xs,
> >                                                            umem_xs->umem);
> >                       if (!xs->pool) {
> >
>
Magnus Karlsson July 20, 2020, 9:33 a.m. UTC | #3
On Thu, Jul 16, 2020 at 6:35 AM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
>
> On Tue, Jul 14, 2020 at 12:18 PM Maxim Mikityanskiy
> <maximmi@mellanox.com> wrote:
> >
> > On 2020-07-10 17:16, Magnus Karlsson wrote:
> > > Add support to share a umem between different devices. This mode
> > > can be invoked with the XDP_SHARED_UMEM bind flag. Previously,
> > > sharing was only supported within the same device. Note that when
> > > sharing a umem between devices, just as in the case of sharing a
> > > umem between queue ids, you need to create a fill ring and a
> > > completion ring and tie them to the socket (with two setsockopts,
> > > one for each ring) before you do the bind with the
> > > XDP_SHARED_UMEM flag. This so that the single-producer
> > > single-consumer semantics of the rings can be upheld.
> >
> > I'm not sure if you saw my comment under v1 asking about performance.
> > Could you share what performance numbers (packet rate) you see when
> > doing forwarding with xsk_fwd? I'm interested in:
> >
> > 1. Forwarding between two queues of the same netdev.
> >
> > 2. Forwarding between two netdevs.
> >
> > 3. xdpsock -l as the baseline.
>
> Sorry for the delay Max, but it is all due to vacation. I will provide
> you with the numbers once the weather turns sour and/or the family
> gets tired of me ;-). From what I can remember, it did not scale
> perfectly linearly, instead it hit some other bottleneck, though I did
> not examine what at that time.

Some quick and dirty numbers from my testing of the v3. All are from
my machine with an i40e NIC and 64-byte packets.

xdpsock -l: 11 Mpps (see the example invocation below)
xsk_fwd with one thread: 12 Mpps
xsk_fwd with two threads and two netdevs (two ports on the same NIC):
9-11 Mpps per thread
xsk_fwd with two threads and one netdev, each using a separate queue:
5 Mpps per thread
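
For reference, the baseline number comes from the in-tree xdpsock
sample running in l2fwd mode, along these lines (the interface and
queue id are placeholders, not the exact values from my setup):

  ./xdpsock -i <iface> -q <qid> -l -z

where -l selects l2fwd mode and -z requests zero-copy, which i40e
supports.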

In summary:

* xsk_fwd delivers better performance than xdpsock performing the
same function.
* Using two netdevs does not scale linearly. One is at 9 Mpps, the
other at 11 Mpps. I have not examined why. There is a lock in the
xsk_fwd code, so I do not expect perfect linearity, but I thought the
two netdevs would at least show the same number.
* Something weird is happening when using two queues on the same
netdev. This is also present without the shared umem patch set (if
you register the umem multiple times). A quick look with perf shows
the application generating a lot of syscalls (see the example below).
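
For reference, one quick way to get a per-syscall summary of a running
process with perf (the process name here is just an example):

  sudo perf trace -s -p $(pidof xsk_fwd)

Stopping it with Ctrl-C prints the summary.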

All in all, after this patch set and my vacation I will examine the
scalability of AF_XDP in this scenario, which will most likely lead
to a number of performance optimization patches to improve it. I
would like the last two numbers to be closer to 12 Mpps on my
machine.

/Magnus

> /Magnus
>
> > Thanks,
> > Max
> >
> > >
> > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > > ---
> > >   net/xdp/xsk.c | 11 ++++-------
> > >   1 file changed, 4 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > > index 05fadd9..4bf47d3 100644
> > > --- a/net/xdp/xsk.c
> > > +++ b/net/xdp/xsk.c
> > > @@ -695,14 +695,11 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
> > >                       sockfd_put(sock);
> > >                       goto out_unlock;
> > >               }
> > > -             if (umem_xs->dev != dev) {
> > > -                     err = -EINVAL;
> > > -                     sockfd_put(sock);
> > > -                     goto out_unlock;
> > > -             }
> > >
> > > -             if (umem_xs->queue_id != qid) {
> > > -                     /* Share the umem with another socket on another qid */
> > > +             if (umem_xs->queue_id != qid || umem_xs->dev != dev) {
> > > +                     /* Share the umem with another socket on another qid
> > > +                      * and/or device.
> > > +                      */
> > >                       xs->pool = xp_create_and_assign_umem(xs,
> > >                                                            umem_xs->umem);
> > >                       if (!xs->pool) {
> > >
> >

Patch

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 05fadd9..4bf47d3 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -695,14 +695,11 @@  static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 			sockfd_put(sock);
 			goto out_unlock;
 		}
-		if (umem_xs->dev != dev) {
-			err = -EINVAL;
-			sockfd_put(sock);
-			goto out_unlock;
-		}
 
-		if (umem_xs->queue_id != qid) {
-			/* Share the umem with another socket on another qid */
+		if (umem_xs->queue_id != qid || umem_xs->dev != dev) {
+			/* Share the umem with another socket on another qid
+			 * and/or device.
+			 */
 			xs->pool = xp_create_and_assign_umem(xs,
 							     umem_xs->umem);
 			if (!xs->pool) {