
[bpf-next,v5,00/16] AF_XDP infrastructure improvements and mlx5e support

Message ID 20190618120024.16788-1-maximmi@mellanox.com

Message

Maxim Mikityanskiy June 18, 2019, noon UTC
This series contains improvements to the AF_XDP kernel infrastructure
and AF_XDP support in mlx5e. The infrastructure improvements are
required for mlx5e, but some of them also benefit all drivers, and
some can be useful for other drivers that want to implement AF_XDP.

The performance testing was performed on a machine with the following
configuration:

- 24 cores of Intel Xeon E5-2620 v3 @ 2.40 GHz
- Mellanox ConnectX-5 Ex with 100 Gbit/s link

The results with retpoline disabled, single stream:

txonly: 33.3 Mpps (21.5 Mpps with queue and app pinned to the same CPU)
rxdrop: 12.2 Mpps
l2fwd: 9.4 Mpps

The results with retpoline enabled, single stream:

txonly: 21.3 Mpps (14.1 Mpps with queue and app pinned to the same CPU)
rxdrop: 9.9 Mpps
l2fwd: 6.8 Mpps

v2 changes:

Added patches for mlx5e and addressed the comments for v1. Rebased for
bpf-next.

v3 changes:

Rebased for the newer bpf-next, resolved conflicts in libbpf. Addressed
Björn's comments for coding style. Fixed a bug in error handling flow in
mlx5e_open_xsk.

v4 changes:

UAPI is not changed; XSK RX queues are exposed to the kernel. The lower
half of the available RX queues are regular queues, and the upper half
are XSK RX queues. The patch "xsk: Extend channels to support combined
XSK/non-XSK traffic" was dropped. The final patch was reworked
accordingly.
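
The v4 queue split can be sketched as follows; this is only an
illustration of the numbering scheme described above, written in Python
with made-up names, not kernel code:

```python
def queue_kind(queue_id, num_channels):
    """Classify a queue id under the v4 scheme: the lower half of the
    available RX queues are regular queues, the upper half are XSK RX
    queues, paired one-to-one with the regular ones."""
    if queue_id < num_channels:
        return ("regular", queue_id)
    if queue_id < 2 * num_channels:
        # XSK RX queue i serves the same channel as regular queue
        # i - num_channels.
        return ("xsk", queue_id - num_channels)
    raise ValueError("queue id out of range")
```

For example, with 8 channels, ids 0-7 are the regular RX queues and
ids 8-15 are the XSK RX queues.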

Added "net/mlx5e: Attach/detach XDP program safely", as the changes
introduced in the XSK patch build on it.

Added "libbpf: Support drivers with non-combined channels", which aligns
the condition in libbpf with the condition in the kernel.

Rebased over the newer bpf-next.

v5 changes:

In v4, ethtool reported the number of channels as 'combined' and the
number of XSK RX queues as 'rx' for mlx5e. This was changed so that 'rx'
is 0 and 'combined' reports double the number of channels when there is
an active UMEM, to make libbpf happy.
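
The v5 reporting scheme can be sketched like this (an illustration with
made-up names, not the driver code; at the time libbpf derived the
number of available queues roughly as the larger of the 'rx' and
'combined' counts):

```python
def reported_channels(num_channels, has_active_umem):
    """ethtool channel counts as mlx5e reports them under the v5
    scheme: 'rx' is always 0, and 'combined' doubles while a UMEM is
    active so that the XSK RX queue ids (the upper half) stay in
    range."""
    combined = 2 * num_channels if has_active_umem else num_channels
    return {"rx": 0, "combined": combined}

def usable_queue_count(channels):
    # Roughly the libbpf-side view: the larger of the two counts
    # bounds the queue ids a socket may bind to.
    return max(channels["rx"], channels["combined"])
```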

The patch for libbpf was dropped. Although it's still useful and fixes
things, it raises some disagreement, so I'm dropping it; it's no longer
needed for mlx5e after the change above.

Maxim Mikityanskiy (16):
  net/mlx5e: Attach/detach XDP program safely
  xsk: Add API to check for available entries in FQ
  xsk: Add getsockopt XDP_OPTIONS
  libbpf: Support getsockopt XDP_OPTIONS
  xsk: Change the default frame size to 4096 and allow controlling it
  xsk: Return the whole xdp_desc from xsk_umem_consume_tx
  net/mlx5e: Replace deprecated PCI_DMA_TODEVICE
  net/mlx5e: Calculate linear RX frag size considering XSK
  net/mlx5e: Allow ICO SQ to be used by multiple RQs
  net/mlx5e: Refactor struct mlx5e_xdp_info
  net/mlx5e: Share the XDP SQ for XDP_TX between RQs
  net/mlx5e: XDP_TX from UMEM support
  net/mlx5e: Consider XSK in XDP MTU limit calculation
  net/mlx5e: Encapsulate open/close queues into a function
  net/mlx5e: Move queue param structs to en/params.h
  net/mlx5e: Add XSK zero-copy support
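
As a rough illustration of the configurable frame (chunk) size from the
fifth patch above: a UMEM is split into fixed-size chunks, and with the
new 4096-byte default the frame addresses fall on page boundaries. The
sketch below is illustrative Python, not kernel code:

```python
XDP_UMEM_DEFAULT_CHUNK_SIZE = 4096  # new default from the patch

def umem_frame_addrs(umem_size, chunk_size=XDP_UMEM_DEFAULT_CHUNK_SIZE):
    """Split a UMEM area of umem_size bytes into fixed-size chunks
    and return the byte offset (frame address) of each chunk."""
    if umem_size % chunk_size:
        raise ValueError("UMEM size must be a multiple of the chunk size")
    return [i * chunk_size for i in range(umem_size // chunk_size)]
```

For example, a 16 KiB UMEM yields four frames at offsets 0, 4096, 8192
and 12288.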

 drivers/net/ethernet/intel/i40e/i40e_xsk.c    |  12 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c  |  15 +-
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h  | 155 +++-
 .../ethernet/mellanox/mlx5/core/en/params.c   | 108 ++-
 .../ethernet/mellanox/mlx5/core/en/params.h   | 118 ++-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 231 ++++--
 .../net/ethernet/mellanox/mlx5/core/en/xdp.h  |  36 +-
 .../mellanox/mlx5/core/en/xsk/Makefile        |   1 +
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.c   | 192 +++++
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.h   |  27 +
 .../mellanox/mlx5/core/en/xsk/setup.c         | 223 ++++++
 .../mellanox/mlx5/core/en/xsk/setup.h         |  25 +
 .../ethernet/mellanox/mlx5/core/en/xsk/tx.c   | 111 +++
 .../ethernet/mellanox/mlx5/core/en/xsk/tx.h   |  15 +
 .../ethernet/mellanox/mlx5/core/en/xsk/umem.c | 267 +++++++
 .../ethernet/mellanox/mlx5/core/en/xsk/umem.h |  31 +
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  |  25 +-
 .../mellanox/mlx5/core/en_fs_ethtool.c        |  18 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 726 ++++++++++++------
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  12 +-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 104 ++-
 .../ethernet/mellanox/mlx5/core/en_stats.c    | 115 ++-
 .../ethernet/mellanox/mlx5/core/en_stats.h    |  30 +
 .../net/ethernet/mellanox/mlx5/core/en_txrx.c |  42 +-
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib.c |  14 +-
 drivers/net/ethernet/mellanox/mlx5/core/wq.h  |   5 -
 include/net/xdp_sock.h                        |  27 +-
 include/uapi/linux/if_xdp.h                   |   8 +
 net/xdp/xsk.c                                 |  36 +-
 net/xdp/xsk_queue.h                           |  14 +
 samples/bpf/xdpsock_user.c                    |  44 +-
 tools/include/uapi/linux/if_xdp.h             |   8 +
 tools/lib/bpf/xsk.c                           |  12 +
 tools/lib/bpf/xsk.h                           |   2 +-
 35 files changed, 2330 insertions(+), 481 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/Makefile
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.h

Comments

Björn Töpel June 20, 2019, 9:13 a.m. UTC | #1
On Tue, 18 Jun 2019 at 14:00, Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> This series contains improvements to the AF_XDP kernel infrastructure
> and AF_XDP support in mlx5e. [...]
>

Just a heads-up: There are some checkpatch warnings (>80 chars/line)
for the mlx5 driver parts, and the series didn't apply cleanly on
bpf-next for me.

I haven't been able to test the mlx5 parts.

Parts of the series are unrelated/orthogonal, and could be submitted
as separate series, e.g. patches {1,7} and patches {3,4}. No blockers
for me, though.

Thanks for the hard work!

For the series:
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Saeed Mahameed June 21, 2019, 7:52 p.m. UTC | #2
On Thu, Jun 20, 2019 at 2:13 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> On Tue, 18 Jun 2019 at 14:00, Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
> >
> > This series contains improvements to the AF_XDP kernel infrastructure
> > and AF_XDP support in mlx5e. [...]
> >
>
> Just a heads-up: There are some checkpatch warnings (>80 chars/line)

Thanks Björn for your comment. In mlx5 we allow up to 95 chars per line;
otherwise the code turns into ugly zigzags.

> for the mlnx5 driver parts, and the series didn't apply cleanly on
> bpf-next for me.
>
> I haven't been able to test the mlnx5 parts.
>
> Parts of the series are unrelated/orthogonal, and could be submitted
> as separate series, e.g. patches {1,7} and patches {3,4}. No blockers
> for me, though.
>
> Thanks for the hard work!
>
> For the series:
> Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tariq Toukan June 23, 2019, 11:53 a.m. UTC | #3
On 6/21/2019 10:52 PM, Saeed Mahameed wrote:
> On Thu, Jun 20, 2019 at 2:13 AM Björn Töpel <bjorn.topel@gmail.com> wrote:
>>
>> On Tue, 18 Jun 2019 at 14:00, Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>>>
>>> This series contains improvements to the AF_XDP kernel infrastructure
>>> and AF_XDP support in mlx5e. The infrastructure improvements are
>>> required for mlx5e, but also some of them benefit to all drivers, and
>>> some can be useful for other drivers that want to implement AF_XDP.
>>>
>>> The performance testing was performed on a machine with the following
>>> configuration:
>>>
>>> - 24 cores of Intel Xeon E5-2620 v3 @ 2.40 GHz
>>> - Mellanox ConnectX-5 Ex with 100 Gbit/s link
>>>
>>> The results with retpoline disabled, single stream:
>>>
>>> txonly: 33.3 Mpps (21.5 Mpps with queue and app pinned to the same CPU)
>>> rxdrop: 12.2 Mpps
>>> l2fwd: 9.4 Mpps
>>>
>>> The results with retpoline enabled, single stream:
>>>
>>> txonly: 21.3 Mpps (14.1 Mpps with queue and app pinned to the same CPU)
>>> rxdrop: 9.9 Mpps
>>> l2fwd: 6.8 Mpps
>>>
>>> v2 changes:
>>>
>>> Added patches for mlx5e and addressed the comments for v1. Rebased for
>>> bpf-next.
>>>
>>> v3 changes:
>>>
>>> Rebased for the newer bpf-next, resolved conflicts in libbpf. Addressed
>>> Björn's comments for coding style. Fixed a bug in error handling flow in
>>> mlx5e_open_xsk.
>>>
>>> v4 changes:
>>>
>>> UAPI is not changed, XSK RX queues are exposed to the kernel. The lower
>>> half of the available amount of RX queues are regular queues, and the
>>> upper half are XSK RX queues. The patch "xsk: Extend channels to support
>>> combined XSK/non-XSK traffic" was dropped. The final patch was reworked
>>> accordingly.
>>>
>>> Added "net/mlx5e: Attach/detach XDP program safely", as the changes
>>> introduced in the XSK patch base on the stuff from this one.
>>>
>>> Added "libbpf: Support drivers with non-combined channels", which aligns
>>> the condition in libbpf with the condition in the kernel.
>>>
>>> Rebased over the newer bpf-next.
>>>
>>> v5 changes:
>>>
>>> In v4, ethtool reports the number of channels as 'combined' and the
>>> number of XSK RX queues as 'rx' for mlx5e. It was changed, so that 'rx'
>>> is 0, and 'combined' reports the double amount of channels if there is
>>> an active UMEM - to make libbpf happy.
>>>
>>> The patch for libbpf was dropped. Although it's still useful and fixes
>>> things, it raises some disagreement, so I'm dropping it - it's no longer
>>> useful for mlx5e anymore after the change above.
>>>
>>
>> Just a heads-up: There are some checkpatch warnings (>80 chars/line)
> 
> Thanks Bjorn for your comment, in mlx5 we allow up to 95 chars per line,
> otherwise it is going to be an ugly zigzags.
> 
>> for the mlnx5 driver parts, and the series didn't apply cleanly on
>> bpf-next for me.
>>
>> I haven't been able to test the mlnx5 parts.
>>
>> Parts of the series are unrelated/orthogonal, and could be submitted
>> as separate series, e.g. patches {1,7} and patches {3,4}. No blockers
>> for me, though.
>>
>> Thanks for the hard work!
>>
>> For the series:
>> Acked-by: Björn Töpel <bjorn.topel@intel.com>

Just wanted to make sure we're on the same page, so we don't miss this
kernel release.

AIUI, currently no action is needed from Maxim's side, as Saeed is fine
with the mlx5 part, and the series is still marked as 'New' in Patchwork
with no requested changes.

Regards,
Tariq
Daniel Borkmann June 24, 2019, 2:48 p.m. UTC | #4
On 06/20/2019 11:13 AM, Björn Töpel wrote:
> On Tue, 18 Jun 2019 at 14:00, Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>>
>> This series contains improvements to the AF_XDP kernel infrastructure
>> and AF_XDP support in mlx5e. [...]
> 
> Just a heads-up: There are some checkpatch warnings (>80 chars/line)
> for the mlnx5 driver parts, and the series didn't apply cleanly on
> bpf-next for me.
> 
> I haven't been able to test the mlnx5 parts.
> 
> Parts of the series are unrelated/orthogonal, and could be submitted
> as separate series, e.g. patches {1,7} and patches {3,4}. No blockers
> for me, though.
> 
> Thanks for the hard work!

+1

> For the series:
> Acked-by: Björn Töpel <bjorn.topel@intel.com>

Looks good to me, but as Björn already indicated, there's one last rebase
needed: the series doesn't apply cleanly, failing at the last patch, which
adds the actual AF_XDP support. Please take a look and rebase:

[...]
Applying: net/mlx5e: Attach/detach XDP program safely
Applying: xsk: Add API to check for available entries in FQ
Applying: xsk: Add getsockopt XDP_OPTIONS
Applying: libbpf: Support getsockopt XDP_OPTIONS
Applying: xsk: Change the default frame size to 4096 and allow controlling it
Applying: xsk: Return the whole xdp_desc from xsk_umem_consume_tx
Using index info to reconstruct a base tree...
M	drivers/net/ethernet/intel/i40e/i40e_xsk.c
M	drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
Auto-merging drivers/net/ethernet/intel/i40e/i40e_xsk.c
Applying: net/mlx5e: Replace deprecated PCI_DMA_TODEVICE
Applying: net/mlx5e: Calculate linear RX frag size considering XSK
Applying: net/mlx5e: Allow ICO SQ to be used by multiple RQs
Applying: net/mlx5e: Refactor struct mlx5e_xdp_info
Applying: net/mlx5e: Share the XDP SQ for XDP_TX between RQs
Applying: net/mlx5e: XDP_TX from UMEM support
Applying: net/mlx5e: Consider XSK in XDP MTU limit calculation
Applying: net/mlx5e: Encapsulate open/close queues into a function
Applying: net/mlx5e: Move queue param structs to en/params.h
Applying: net/mlx5e: Add XSK zero-copy support
fatal: sha1 information is lacking or useless (drivers/net/ethernet/mellanox/mlx5/core/en.h).
error: could not build fake ancestor
Patch failed at 0016 net/mlx5e: Add XSK zero-copy support

Thanks,
Daniel