mbox series

[bpf-next,v5,0/3] xdp: Allow lookup into devmaps before redirect

Message ID 156125626076.5209.13424524054109901554.stgit@alrua-x1
Headers show
Series xdp: Allow lookup into devmaps before redirect | expand

Message

Toke Høiland-Jørgensen June 23, 2019, 2:17 a.m. UTC
When using the bpf_redirect_map() helper to redirect packets from XDP, the eBPF
program cannot currently know whether the redirect will succeed, which makes it
impossible to gracefully handle errors. To properly fix this will probably
require deeper changes to the way TX resources are allocated, but one thing that
is fairly straight forward to fix is to allow lookups into devmaps, so programs
can at least know when a redirect is *guaranteed* to fail because there is no
entry in the map. Currently, programs work around this by keeping a shadow map
of another type which indicates whether a map index is valid.

This series contains two changes that are complementary ways to fix this issue:

- Moving the map lookup into the bpf_redirect_map() helper (and caching the
  result), so the helper can return an error if no value is found in the map.
  This includes a refactoring of the devmap and cpumap code to not care about
  the index on enqueue.

- Allowing regular lookups into devmaps from eBPF programs, using the read-only
  flag to make sure they don't change the values.

The performance impact of the series is negligible, in the sense that I cannot
measure it because the variance between test runs is higher than the difference
pre/post series.

Changelog:

v5:
  - Rebase on latest bpf-next.
  - Update documentation for bpf_redirect_map() with the new meaning of flags.

v4:
  - Fix a few nits from Andrii
  - Lose the #defines in bpf.h and just compare the flags argument directly to
    XDP_TX in bpf_xdp_redirect_map().

v3:
  - Adopt Jonathan's idea of using the lower two bits of the flag value as the
    return code.
  - Always do the lookup, and cache the result for use in xdp_do_redirect(); to
    achieve this, refactor the devmap and cpumap code to get rid the bitmap for
    selecting which devices to flush.

v2:
  - For patch 1, make it clear that the change works for any map type.
  - For patch 2, just use the new BPF_F_RDONLY_PROG flag to make the return
    value read-only.

---

Toke Høiland-Jørgensen (3):
      devmap/cpumap: Use flush list instead of bitmap
      bpf_xdp_redirect_map: Perform map lookup in eBPF helper
      devmap: Allow map lookups from eBPF


 include/linux/filter.h   |    1 
 include/uapi/linux/bpf.h |    7 ++-
 kernel/bpf/cpumap.c      |  106 ++++++++++++++++++++-----------------------
 kernel/bpf/devmap.c      |  113 ++++++++++++++++++++++------------------------
 kernel/bpf/verifier.c    |    7 +--
 net/core/filter.c        |   29 +++++-------
 6 files changed, 123 insertions(+), 140 deletions(-)

Comments

Andrii Nakryiko June 24, 2019, 4:38 p.m. UTC | #1
On Sat, Jun 22, 2019 at 7:19 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> When using the bpf_redirect_map() helper to redirect packets from XDP, the eBPF
> program cannot currently know whether the redirect will succeed, which makes it
> impossible to gracefully handle errors. To properly fix this will probably
> require deeper changes to the way TX resources are allocated, but one thing that
> is fairly straight forward to fix is to allow lookups into devmaps, so programs
> can at least know when a redirect is *guaranteed* to fail because there is no
> entry in the map. Currently, programs work around this by keeping a shadow map
> of another type which indicates whether a map index is valid.
>
> This series contains two changes that are complementary ways to fix this issue:
>
> - Moving the map lookup into the bpf_redirect_map() helper (and caching the
>   result), so the helper can return an error if no value is found in the map.
>   This includes a refactoring of the devmap and cpumap code to not care about
>   the index on enqueue.
>
> - Allowing regular lookups into devmaps from eBPF programs, using the read-only
>   flag to make sure they don't change the values.
>
> The performance impact of the series is negligible, in the sense that I cannot
> measure it because the variance between test runs is higher than the difference
> pre/post series.
>
> Changelog:
>
> v5:
>   - Rebase on latest bpf-next.
>   - Update documentation for bpf_redirect_map() with the new meaning of flags.
>
> v4:
>   - Fix a few nits from Andrii
>   - Lose the #defines in bpf.h and just compare the flags argument directly to
>     XDP_TX in bpf_xdp_redirect_map().
>
> v3:
>   - Adopt Jonathan's idea of using the lower two bits of the flag value as the
>     return code.
>   - Always do the lookup, and cache the result for use in xdp_do_redirect(); to
>     achieve this, refactor the devmap and cpumap code to get rid the bitmap for
>     selecting which devices to flush.
>
> v2:
>   - For patch 1, make it clear that the change works for any map type.
>   - For patch 2, just use the new BPF_F_RDONLY_PROG flag to make the return
>     value read-only.
>
> ---
>
> Toke Høiland-Jørgensen (3):
>       devmap/cpumap: Use flush list instead of bitmap
>       bpf_xdp_redirect_map: Perform map lookup in eBPF helper
>       devmap: Allow map lookups from eBPF
>
>
>  include/linux/filter.h   |    1
>  include/uapi/linux/bpf.h |    7 ++-
>  kernel/bpf/cpumap.c      |  106 ++++++++++++++++++++-----------------------
>  kernel/bpf/devmap.c      |  113 ++++++++++++++++++++++------------------------
>  kernel/bpf/verifier.c    |    7 +--
>  net/core/filter.c        |   29 +++++-------
>  6 files changed, 123 insertions(+), 140 deletions(-)
>


Looks like you forgot to add my Acked-by's for your patches?
Toke Høiland-Jørgensen June 24, 2019, 7:38 p.m. UTC | #2
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Sat, Jun 22, 2019 at 7:19 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> When using the bpf_redirect_map() helper to redirect packets from XDP, the eBPF
>> program cannot currently know whether the redirect will succeed, which makes it
>> impossible to gracefully handle errors. To properly fix this will probably
>> require deeper changes to the way TX resources are allocated, but one thing that
>> is fairly straight forward to fix is to allow lookups into devmaps, so programs
>> can at least know when a redirect is *guaranteed* to fail because there is no
>> entry in the map. Currently, programs work around this by keeping a shadow map
>> of another type which indicates whether a map index is valid.
>>
>> This series contains two changes that are complementary ways to fix this issue:
>>
>> - Moving the map lookup into the bpf_redirect_map() helper (and caching the
>>   result), so the helper can return an error if no value is found in the map.
>>   This includes a refactoring of the devmap and cpumap code to not care about
>>   the index on enqueue.
>>
>> - Allowing regular lookups into devmaps from eBPF programs, using the read-only
>>   flag to make sure they don't change the values.
>>
>> The performance impact of the series is negligible, in the sense that I cannot
>> measure it because the variance between test runs is higher than the difference
>> pre/post series.
>>
>> Changelog:
>>
>> v5:
>>   - Rebase on latest bpf-next.
>>   - Update documentation for bpf_redirect_map() with the new meaning of flags.
>>
>> v4:
>>   - Fix a few nits from Andrii
>>   - Lose the #defines in bpf.h and just compare the flags argument directly to
>>     XDP_TX in bpf_xdp_redirect_map().
>>
>> v3:
>>   - Adopt Jonathan's idea of using the lower two bits of the flag value as the
>>     return code.
>>   - Always do the lookup, and cache the result for use in xdp_do_redirect(); to
>>     achieve this, refactor the devmap and cpumap code to get rid the bitmap for
>>     selecting which devices to flush.
>>
>> v2:
>>   - For patch 1, make it clear that the change works for any map type.
>>   - For patch 2, just use the new BPF_F_RDONLY_PROG flag to make the return
>>     value read-only.
>>
>> ---
>>
>> Toke Høiland-Jørgensen (3):
>>       devmap/cpumap: Use flush list instead of bitmap
>>       bpf_xdp_redirect_map: Perform map lookup in eBPF helper
>>       devmap: Allow map lookups from eBPF
>>
>>
>>  include/linux/filter.h   |    1
>>  include/uapi/linux/bpf.h |    7 ++-
>>  kernel/bpf/cpumap.c      |  106 ++++++++++++++++++++-----------------------
>>  kernel/bpf/devmap.c      |  113 ++++++++++++++++++++++------------------------
>>  kernel/bpf/verifier.c    |    7 +--
>>  net/core/filter.c        |   29 +++++-------
>>  6 files changed, 123 insertions(+), 140 deletions(-)
>>
>
>
> Looks like you forgot to add my Acked-by's for your patches?

Ah yes, did not carry those forward for the individual patches, my
apologies. Could you perhaps be persuaded to send a new one (I believe a
response to the cover letter acking the whole series would suffice)?
I'll make sure to add the carrying forward of acks into my workflow in
the future :)

-Toke
Andrii Nakryiko June 24, 2019, 8:49 p.m. UTC | #3
On Mon, Jun 24, 2019 at 12:38 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Sat, Jun 22, 2019 at 7:19 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> When using the bpf_redirect_map() helper to redirect packets from XDP, the eBPF
> >> program cannot currently know whether the redirect will succeed, which makes it
> >> impossible to gracefully handle errors. To properly fix this will probably
> >> require deeper changes to the way TX resources are allocated, but one thing that
> >> is fairly straight forward to fix is to allow lookups into devmaps, so programs
> >> can at least know when a redirect is *guaranteed* to fail because there is no
> >> entry in the map. Currently, programs work around this by keeping a shadow map
> >> of another type which indicates whether a map index is valid.
> >>
> >> This series contains two changes that are complementary ways to fix this issue:
> >>
> >> - Moving the map lookup into the bpf_redirect_map() helper (and caching the
> >>   result), so the helper can return an error if no value is found in the map.
> >>   This includes a refactoring of the devmap and cpumap code to not care about
> >>   the index on enqueue.
> >>
> >> - Allowing regular lookups into devmaps from eBPF programs, using the read-only
> >>   flag to make sure they don't change the values.
> >>
> >> The performance impact of the series is negligible, in the sense that I cannot
> >> measure it because the variance between test runs is higher than the difference
> >> pre/post series.
> >>
> >> Changelog:
> >>
> >> v5:
> >>   - Rebase on latest bpf-next.
> >>   - Update documentation for bpf_redirect_map() with the new meaning of flags.
> >>
> >> v4:
> >>   - Fix a few nits from Andrii
> >>   - Lose the #defines in bpf.h and just compare the flags argument directly to
> >>     XDP_TX in bpf_xdp_redirect_map().
> >>
> >> v3:
> >>   - Adopt Jonathan's idea of using the lower two bits of the flag value as the
> >>     return code.
> >>   - Always do the lookup, and cache the result for use in xdp_do_redirect(); to
> >>     achieve this, refactor the devmap and cpumap code to get rid the bitmap for
> >>     selecting which devices to flush.
> >>
> >> v2:
> >>   - For patch 1, make it clear that the change works for any map type.
> >>   - For patch 2, just use the new BPF_F_RDONLY_PROG flag to make the return
> >>     value read-only.
> >>
> >> ---
> >>
> >> Toke Høiland-Jørgensen (3):
> >>       devmap/cpumap: Use flush list instead of bitmap
> >>       bpf_xdp_redirect_map: Perform map lookup in eBPF helper
> >>       devmap: Allow map lookups from eBPF
> >>
> >>
> >>  include/linux/filter.h   |    1
> >>  include/uapi/linux/bpf.h |    7 ++-
> >>  kernel/bpf/cpumap.c      |  106 ++++++++++++++++++++-----------------------
> >>  kernel/bpf/devmap.c      |  113 ++++++++++++++++++++++------------------------
> >>  kernel/bpf/verifier.c    |    7 +--
> >>  net/core/filter.c        |   29 +++++-------
> >>  6 files changed, 123 insertions(+), 140 deletions(-)
> >>
> >
> >
> > Looks like you forgot to add my Acked-by's for your patches?
>
> Ah yes, did not carry those forward for the individual patches, my
> apologies. Could you perhaps be persuaded to send a new one (I believe a
> response to the cover letter acking the whole series would suffice)?
> I'll make sure to add the carrying forward of acks into my workflow in
> the future :)

I don't think patchworks captures ack's from cover letter, but let's
give it a go:

Acked-by: Andrii Nakryiko <andriin@fb.com>

>
> -Toke
Toke Høiland-Jørgensen June 24, 2019, 9:38 p.m. UTC | #4
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Mon, Jun 24, 2019 at 12:38 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>
>> > On Sat, Jun 22, 2019 at 7:19 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> When using the bpf_redirect_map() helper to redirect packets from XDP, the eBPF
>> >> program cannot currently know whether the redirect will succeed, which makes it
>> >> impossible to gracefully handle errors. To properly fix this will probably
>> >> require deeper changes to the way TX resources are allocated, but one thing that
>> >> is fairly straight forward to fix is to allow lookups into devmaps, so programs
>> >> can at least know when a redirect is *guaranteed* to fail because there is no
>> >> entry in the map. Currently, programs work around this by keeping a shadow map
>> >> of another type which indicates whether a map index is valid.
>> >>
>> >> This series contains two changes that are complementary ways to fix this issue:
>> >>
>> >> - Moving the map lookup into the bpf_redirect_map() helper (and caching the
>> >>   result), so the helper can return an error if no value is found in the map.
>> >>   This includes a refactoring of the devmap and cpumap code to not care about
>> >>   the index on enqueue.
>> >>
>> >> - Allowing regular lookups into devmaps from eBPF programs, using the read-only
>> >>   flag to make sure they don't change the values.
>> >>
>> >> The performance impact of the series is negligible, in the sense that I cannot
>> >> measure it because the variance between test runs is higher than the difference
>> >> pre/post series.
>> >>
>> >> Changelog:
>> >>
>> >> v5:
>> >>   - Rebase on latest bpf-next.
>> >>   - Update documentation for bpf_redirect_map() with the new meaning of flags.
>> >>
>> >> v4:
>> >>   - Fix a few nits from Andrii
>> >>   - Lose the #defines in bpf.h and just compare the flags argument directly to
>> >>     XDP_TX in bpf_xdp_redirect_map().
>> >>
>> >> v3:
>> >>   - Adopt Jonathan's idea of using the lower two bits of the flag value as the
>> >>     return code.
>> >>   - Always do the lookup, and cache the result for use in xdp_do_redirect(); to
>> >>     achieve this, refactor the devmap and cpumap code to get rid the bitmap for
>> >>     selecting which devices to flush.
>> >>
>> >> v2:
>> >>   - For patch 1, make it clear that the change works for any map type.
>> >>   - For patch 2, just use the new BPF_F_RDONLY_PROG flag to make the return
>> >>     value read-only.
>> >>
>> >> ---
>> >>
>> >> Toke Høiland-Jørgensen (3):
>> >>       devmap/cpumap: Use flush list instead of bitmap
>> >>       bpf_xdp_redirect_map: Perform map lookup in eBPF helper
>> >>       devmap: Allow map lookups from eBPF
>> >>
>> >>
>> >>  include/linux/filter.h   |    1
>> >>  include/uapi/linux/bpf.h |    7 ++-
>> >>  kernel/bpf/cpumap.c      |  106 ++++++++++++++++++++-----------------------
>> >>  kernel/bpf/devmap.c      |  113 ++++++++++++++++++++++------------------------
>> >>  kernel/bpf/verifier.c    |    7 +--
>> >>  net/core/filter.c        |   29 +++++-------
>> >>  6 files changed, 123 insertions(+), 140 deletions(-)
>> >>
>> >
>> >
>> > Looks like you forgot to add my Acked-by's for your patches?
>>
>> Ah yes, did not carry those forward for the individual patches, my
>> apologies. Could you perhaps be persuaded to send a new one (I believe a
>> response to the cover letter acking the whole series would suffice)?
>> I'll make sure to add the carrying forward of acks into my workflow in
>> the future :)
>
> I don't think patchworks captures ack's from cover letter, but let's
> give it a go:
>
> Acked-by: Andrii Nakryiko <andriin@fb.com>

Thanks!

-Toke