[pull,request,net-next,00/14] Mellanox, mlx5 RX and XDP improvements

Message ID 20190422223306.31568-1-saeedm@mellanox.com
State Changes Requested
Delegated to: David Miller
Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2019-04-22

Message

Saeed Mahameed April 22, 2019, 10:32 p.m. UTC
Hi Dave,

This series includes updates to the mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome or mitigate HW and PCIe
bottlenecks.

For more information, please see the tag log below.

Please pull and let me know if there is any problem.

Please note that the series starts with a merge of the mlx5-next branch,
to resolve and avoid a dependency on the rdma tree. I also merged
v5.1-rc1 into mlx5-next, since we forgot to reset the branch in the last
merge window. I hope this is OK with you; next time I will avoid such
merges with Linus' tree.

Thanks,
Saeed.

---
The following changes since commit 47eae9c6922daf35559d6ecc4408f573df251b20:

  Merge branch 'mlx5-next-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux (2019-04-22 15:27:29 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2019-04-22

for you to fetch changes up to 9800e88c14d58c45f3d8b8faaa1cbd624267fe70:

  net/mlx5e: Use #define for the WQE wait timeout constant (2019-04-22 15:29:47 -0700)

----------------------------------------------------------------
mlx5-updates-2019-04-22

This series includes updates to the mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome or mitigate HW and PCIe
bottlenecks.

From Tariq:
1) Additional prefetch for small L1_CACHE_BYTES
2) Some enhancements in rq->flags
3) Stabilize RX packet rate (on Striding RQ) with multiple outstanding
   UMR posts

This patch adds support for multiple outstanding UMR posts, to allow
faster gap closure between consuming MPWQEs and reposting them back
into the WQ.

Performance test:
As expected, a huge improvement at large scale (48 cores).

xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

Before: Unstable, 7 to 30 Mpps
After:  Stable,   at 70.5 Mpps

From Shay:
4) XDP, Inline small packets into the TX MPWQE in XDP xmit flow

Under high packet rates with multi-CPU TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, which hurts
transmission rates.
This patch mitigates the problem by moving some of the work to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).

When forwarding packets with XDP, a packet that is smaller than a
certain size (set to ~256 bytes) is sent inline within its TX WQE
descriptor (mem-copied) when the hardware TX queue is congested
beyond a pre-defined watermark.
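
The inlining decision described above can be sketched roughly as follows.
This is only an illustration, not the driver code: the struct, the
function names and the watermark value (128) are hypothetical, and only
the 256-byte size limit comes from the text. It also uses a single
threshold for congestion; the actual patch may track queue state
differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Size limit taken from the description above (~256 bytes). */
#define SKETCH_INLINE_MAX_LEN   256u
/* Hypothetical congestion watermark, in outstanding descriptors. */
#define SKETCH_INLINE_WATERMARK 128u

/* Hypothetical view of a TX queue: free-running producer/consumer counters. */
struct sketch_txq {
	uint32_t producer; /* descriptors posted to the HW */
	uint32_t consumer; /* descriptors completed by the HW */
};

/* Descriptors still in flight; unsigned subtraction tolerates wraparound. */
static uint32_t sketch_txq_outstanding(const struct sketch_txq *q)
{
	return q->producer - q->consumer;
}

/*
 * Inline (mem-copy) the packet into its descriptor only when it is small
 * enough and the queue is congested beyond the watermark. Otherwise the
 * descriptor just points at the packet data, as before.
 */
static bool sketch_should_inline(const struct sketch_txq *q, size_t pkt_len)
{
	return pkt_len <= SKETCH_INLINE_MAX_LEN &&
	       sketch_txq_outstanding(q) >= SKETCH_INLINE_WATERMARK;
}
```

With a lightly loaded queue the check returns false and no copy is made,
which matches the intent of leaving single-ring performance unaffected.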

Performance:
    Tested packet rate for 64-byte UDP multi-stream
    over two dual-port ConnectX-5 100Gbps NICs.
    CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

    * Tested with hyper-threading disabled

    XDP_TX:

    |          | before | after   | change |
    | 24 rings | 51Mpps | 116Mpps | +126%  |
    | 1 ring   | 12Mpps | 12Mpps  | same   |

    XDP_REDIRECT:

    ** Below is the transmit rate, not the redirection rate
    which might be larger, and is not affected by this patch.

    |          | before  | after   | change |
    | 32 rings | 64Mpps  | 92Mpps  | +43%   |
    | 1 ring   | 6.4Mpps | 6.4Mpps | same   |

As we can see, the feature significantly improves scaling without
hurting single-ring performance.

From Maxim:
5) Some trivial refactoring and code improvements prior to a larger series
to support AF_XDP.

-Saeed.

----------------------------------------------------------------
Maxim Mikityanskiy (8):
      net/mlx5e: Remove unused parameter
      net/mlx5e: Report mlx5e_xdp_set errors
      net/mlx5e: Move parameter calculation functions to en/params.c
      net/mlx5e: Add an underflow warning comment
      net/mlx5e: Remove unused parameter
      net/mlx5e: Take HW interrupt trigger into a function
      net/mlx5e: Remove unused rx_page_reuse stat
      net/mlx5e: Use #define for the WQE wait timeout constant

Shay Agroskin (2):
      net/mlx5e: XDP, Add TX MPWQE session counter
      net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow

Tariq Toukan (4):
      net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
      net/mlx5e: RX, Support multiple outstanding UMR posts
      net/mlx5e: XDP, Fix shifted flag index in RQ bitmap
      net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  37 ++++-
 .../net/ethernet/mellanox/mlx5/core/en/params.c    | 104 +++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en/params.h    |  22 +++
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c   |  34 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h   |  57 ++++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 172 ++++++---------------
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 143 +++++++++++------
 .../net/ethernet/mellanox/mlx5/core/en_selftest.c  |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c |  15 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h |   8 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c  |  11 ++
 drivers/net/ethernet/mellanox/mlx5/core/wq.h       |  12 ++
 include/linux/mlx5/qp.h                            |   1 +
 14 files changed, 416 insertions(+), 206 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.h