[bpf-next,v2,0/2] xdp: Introduce bulking for non-map XDP_REDIRECT

Message ID 157893905455.861394.14341695989510022302.stgit@toke.dk

Message

Toke Høiland-Jørgensen Jan. 13, 2020, 6:10 p.m. UTC
Since commit 96360004b862 ("xdp: Make devmap flush_list common for all map
instances"), devmap flushing is a global operation instead of tied to a
particular map. This means that with a bit of refactoring, we can finally fix
the performance delta between the bpf_redirect_map() and bpf_redirect() helper
functions, by introducing bulking for the latter as well.

This series makes this change by moving the data structure used for the bulking
into struct net_device itself, so we can access it even when there is no
devmap. Once this is done, moving the bpf_redirect() helper to use the bulking
mechanism becomes quite trivial, and brings bpf_redirect() up to the same
performance as bpf_redirect_map():

                       Before:   After:
1 CPU:
bpf_redirect_map:      8.4 Mpps  8.4 Mpps  (no change)
bpf_redirect:          5.0 Mpps  8.4 Mpps  (+68%)
2 CPUs:
bpf_redirect_map:     15.9 Mpps  16.1 Mpps  (+1% or ~no change)
bpf_redirect:          9.5 Mpps  15.9 Mpps  (+67%)
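
To illustrate why bulking helps, here is a minimal userspace sketch of a bulk
queue embedded in the device structure. It is not the kernel code: every name
in it (struct device, struct bulk_queue, BULK_SIZE, bq_enqueue, bq_flush,
redirect_burst) is a hypothetical stand-in, and the batch size of 16 is an
assumption chosen for the example. The point is only that frames accumulate in
a queue reachable from the device itself, and the (simulated) driver transmit
is called once per batch rather than once per frame.

```c
#include <assert.h>
#include <stddef.h>

#define BULK_SIZE 16 /* assumed batch size for this sketch */

struct frame { int id; };

/* Hypothetical per-device bulk queue. */
struct bulk_queue {
	struct frame *q[BULK_SIZE];
	unsigned int count;
};

/* Hypothetical stand-in for struct net_device: the queue lives in the
 * device, so it can be reached without going through a map. */
struct device {
	int ifindex;
	struct bulk_queue bq;
	unsigned long xmit_calls;  /* times the simulated driver was invoked */
	unsigned long frames_sent; /* total frames handed to the driver */
};

/* Hand all queued frames to the simulated driver in one call. */
void bq_flush(struct device *dev)
{
	if (dev->bq.count == 0)
		return;
	dev->xmit_calls++;
	dev->frames_sent += dev->bq.count;
	dev->bq.count = 0;
}

/* Queue one frame, flushing first if the queue is already full. */
void bq_enqueue(struct device *dev, struct frame *f)
{
	if (dev->bq.count == BULK_SIZE)
		bq_flush(dev);
	assert(dev->bq.count < BULK_SIZE);
	dev->bq.q[dev->bq.count++] = f;
}

/* Redirect n frames to dev, then flush what is left, as would happen at
 * the end of a poll cycle. Returns the number of driver calls so far. */
unsigned long redirect_burst(struct device *dev, struct frame *frames, size_t n)
{
	for (size_t i = 0; i < n; i++)
		bq_enqueue(dev, &frames[i]);
	bq_flush(dev);
	return dev->xmit_calls;
}
```

With these assumptions, redirecting 40 frames to one device results in three
driver calls (16 + 16 + 8 frames) instead of 40.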

After this patch series, the only semantic difference between the two variants
of the redirect helper (apart from the absence of a map argument, obviously) is
that the _map() variant will return an error if passed an invalid map index,
whereas the bpf_redirect() helper will succeed even if passed an invalid
ifindex, and the packets will then be dropped in xdp_do_redirect(). This is
because the helper has no reference to the calling netdev, so unfortunately we
can't do the ifindex lookup directly in the helper.
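
The difference in where the failure surfaces can be modelled with a small
userspace sketch. This is not the kernel implementation: the names
(redirect_map, redirect, do_redirect, struct redirect_info, dev_by_index, the
ACT_* values) and the return conventions are hypothetical, and the sketch does
not model why the lookup has to be deferred. It only shows the visible
behaviour: the map variant fails when the helper is called, while the ifindex
variant reports success and the frame is dropped at the later redirect step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action codes for this sketch. */
enum action { ACT_ABORTED = 0, ACT_REDIRECT = 1 };

struct device { int ifindex; };

#define MAP_SIZE 4
struct devmap { struct device *slots[MAP_SIZE]; };

/* State recorded by the helper and consumed by do_redirect(). */
struct redirect_info {
	struct device *tgt; /* set by the map variant */
	int ifindex;        /* set by the non-map variant */
};

/* The devices that exist in this toy system. */
static struct device devices[] = { { 1 }, { 2 } };

struct device *dev_by_index(int ifindex)
{
	for (size_t i = 0; i < sizeof(devices) / sizeof(devices[0]); i++)
		if (devices[i].ifindex == ifindex)
			return &devices[i];
	return NULL;
}

/* Map variant: the lookup happens in the helper, so a bad index is
 * reported to the caller immediately. */
enum action redirect_map(struct redirect_info *ri, const struct devmap *map,
			 unsigned int idx)
{
	if (idx >= MAP_SIZE || !map->slots[idx])
		return ACT_ABORTED;
	ri->tgt = map->slots[idx];
	return ACT_REDIRECT;
}

/* Non-map variant: only records the ifindex and always reports success. */
enum action redirect(struct redirect_info *ri, int ifindex)
{
	ri->tgt = NULL;
	ri->ifindex = ifindex;
	return ACT_REDIRECT;
}

/* Later redirect step: returns 0 if the frame can be sent, -1 if it is
 * dropped because the target device does not exist. */
int do_redirect(struct redirect_info *ri)
{
	struct device *dev = ri->tgt ? ri->tgt : dev_by_index(ri->ifindex);

	ri->tgt = NULL;
	return dev ? 0 : -1;
}
```

In this model, redirect(&ri, 99) returns ACT_REDIRECT even though no device
has ifindex 99, and only the following do_redirect(&ri) reports the drop.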

Changelog:

v2:
  - Consolidate code paths and tracepoints for map and non-map redirect variants
    (Björn)
  - Add performance data for 2-CPU test (Jesper)
  - Move fields to avoid shifting cache lines in struct net_device (Eric)

---

Toke Høiland-Jørgensen (2):
      xdp: Move devmap bulk queue into struct net_device
      xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths


 include/linux/bpf.h        |   13 +++++-
 include/linux/netdevice.h  |   11 +++--
 include/trace/events/xdp.h |  104 +++++++++++++++++++-------------------------
 kernel/bpf/devmap.c        |   94 +++++++++++++++++++++-------------------
 net/core/dev.c             |    2 +
 net/core/filter.c          |   86 +++++++-----------------------------
 6 files changed, 132 insertions(+), 178 deletions(-)

Comments

Alexei Starovoitov Jan. 14, 2020, 5:47 p.m. UTC | #1
John, since you commented on v1 please review this v2. Thanks!

John Fastabend Jan. 15, 2020, 5:49 p.m. UTC | #2
Alexei Starovoitov wrote:
> John, since you commented on v1 please review this v2. Thanks!

Hmm, I don't think I had an initial comment, but I will review regardless ;)