[v2,bpf-next,0/9] bpf: Network Resource Manager (NRM)

Message ID 20190223010703.678070-1-brakmo@fb.com

Message

Lawrence Brakmo Feb. 23, 2019, 1:06 a.m. UTC
Network Resource Manager is a framework for limiting the bandwidth used
by v2 cgroups. It consists of 4 BPF helpers and a sample BPF program to
limit egress bandwidth as well as a sample user program and script to
simplify NRM testing.

The sample NRM BPF program is not meant to be production quality; it is
provided as a proof of concept. A lot more information, including sample
runs in some cases, is provided in the commit messages of the individual
patches.
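
For readers who want a feel for what limiting egress bandwidth per cgroup
involves, here is a minimal user-space sketch of a credit-based limiter. It
is not the code in samples/bpf/nrm_out_kern.c; the struct, the function
names and the thresholds are hypothetical and chosen only for illustration.
Credit is earned over time at the configured rate, each packet spends
credit, and the verdict gets harsher as the debt grows.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-cgroup state; names are illustrative only. */
struct bw_limiter {
	uint64_t rate_bytes_per_sec; /* configured egress limit */
	int64_t  credit;             /* bytes that may still be sent */
	int64_t  max_credit;         /* burst cap, assumed <= rate */
	uint64_t last_ns;            /* time of the previous update */
};

enum bw_verdict { BW_PASS, BW_MARK_CE, BW_DROP };

static void bw_init(struct bw_limiter *l, uint64_t rate, int64_t burst,
		    uint64_t now_ns)
{
	l->rate_bytes_per_sec = rate;
	l->credit = burst;
	l->max_credit = burst;
	l->last_ns = now_ns;
}

/* Account for one packet; now_ns is assumed to come from a monotonic clock. */
static enum bw_verdict bw_account(struct bw_limiter *l, uint32_t pkt_len,
				  uint64_t now_ns)
{
	uint64_t delta_ns = now_ns - l->last_ns;

	/* Clamp the idle time so the multiplication below cannot overflow. */
	if (delta_ns > NSEC_PER_SEC)
		delta_ns = NSEC_PER_SEC;

	/* Earn credit for the elapsed time, up to the burst cap. */
	l->credit += (int64_t)(delta_ns * l->rate_bytes_per_sec / NSEC_PER_SEC);
	if (l->credit > l->max_credit)
		l->credit = l->max_credit;
	l->last_ns = now_ns;

	/* Spend credit for this packet. */
	l->credit -= pkt_len;

	if (l->credit < -l->max_credit) {
		/* Too far in debt: refuse the packet and refund it. */
		l->credit += pkt_len;
		return BW_DROP;
	}
	if (l->credit < 0)
		return BW_MARK_CE; /* mildly over the limit: signal congestion */
	return BW_PASS;
}
```

With a 1 MB/s rate and a 3000 byte burst, back-to-back 1500 byte packets
pass twice, are marked twice and are then dropped until enough time has
elapsed to earn the credit back.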

Two more BPF programs, one to limit ingress and one that limits egress
and uses fq's Earliest Departure Time feature (EDT), will be provided in an
upcoming patchset.
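
As background on the EDT idea: instead of dropping or marking, each packet
is stamped with the earliest time it may leave, and the qdisc holds it
until then. The sketch below shows only the timestamp arithmetic in user
space. It says nothing about how the upcoming program will be written; the
struct and function names are hypothetical.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical pacing state for one cgroup. */
struct edt_state {
	uint64_t next_ns;            /* earliest departure of the next packet */
	uint64_t rate_bytes_per_sec; /* must be non-zero */
};

/*
 * Return the departure time for a packet of len bytes and advance the
 * state by the time that packet occupies at the configured rate.
 */
static uint64_t edt_stamp(struct edt_state *s, uint32_t len, uint64_t now_ns)
{
	/* Never schedule in the past: an idle flow restarts from "now". */
	uint64_t depart = s->next_ns > now_ns ? s->next_ns : now_ns;

	s->next_ns = depart + (uint64_t)len * NSEC_PER_SEC / s->rate_bytes_per_sec;
	return depart;
}
```

At 1 MB/s, two 1000 byte packets submitted at the same instant are stamped
1 ms apart, so the flow is paced rather than policed.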

Changes from v1 to v2:
  * bpf_tcp_enter_cwr can only be called from a cgroup skb egress BPF
    program (otherwise load or attach will fail) where we already hold
    the sk lock. Also only applies for ESTABLISHED state.
  * bpf_skb_ecn_set_ce uses INET_ECN_set_ce()
  * bpf_tcp_check_probe_timer now uses tcp_reset_xmit_timer. Can only be
    used by egress cgroup skb programs.
  * removed load_cg_skb user program. 
  * nrm bpf egress program checks packet header in skb to determine
    ECN value. Now also works for ECN enabled UDP packets.
    Using ECN_ defines instead of integers.
  * NRM script test program now uses bpftool instead of load_cg_skb

Martin KaFai Lau (2):
  bpf: Remove const from get_func_proto
  bpf: Add bpf helper bpf_tcp_enter_cwr

brakmo (7):
  bpf: Test bpf_tcp_enter_cwr in test_verifier
  bpf: add bpf helper bpf_skb_ecn_set_ce
  bpf: Add bpf helper bpf_tcp_check_probe_timer
  bpf: sync bpf.h to tools and update bpf_helpers.h
  bpf: Sample NRM BPF program to limit egress bw
  bpf: User program for testing NRM
  bpf: NRM test script

 drivers/media/rc/bpf-lirc.c                 |   2 +-
 include/linux/bpf.h                         |   3 +-
 include/linux/filter.h                      |   3 +-
 include/uapi/linux/bpf.h                    |  27 +-
 kernel/bpf/cgroup.c                         |   2 +-
 kernel/bpf/syscall.c                        |  12 +
 kernel/bpf/verifier.c                       |   4 +
 kernel/trace/bpf_trace.c                    |  10 +-
 net/core/filter.c                           | 101 ++++-
 samples/bpf/Makefile                        |   5 +
 samples/bpf/do_nrm_test.sh                  | 437 +++++++++++++++++++
 samples/bpf/nrm.c                           | 440 ++++++++++++++++++++
 samples/bpf/nrm.h                           |  31 ++
 samples/bpf/nrm_kern.h                      | 137 ++++++
 samples/bpf/nrm_out_kern.c                  | 190 +++++++++
 tools/include/uapi/linux/bpf.h              |  27 +-
 tools/testing/selftests/bpf/bpf_helpers.h   |   6 +
 tools/testing/selftests/bpf/verifier/sock.c |  33 ++
 18 files changed, 1444 insertions(+), 26 deletions(-)
 create mode 100755 samples/bpf/do_nrm_test.sh
 create mode 100644 samples/bpf/nrm.c
 create mode 100644 samples/bpf/nrm.h
 create mode 100644 samples/bpf/nrm_kern.h
 create mode 100644 samples/bpf/nrm_out_kern.c

Comments

David Ahern Feb. 23, 2019, 3:03 a.m. UTC | #1
On 2/22/19 8:06 PM, brakmo wrote:
> Network Resource Manager is a framework for limiting the bandwidth used
> by v2 cgroups. It consists of 4 BPF helpers and a sample BPF program to
> limit egress bandwidth as well as a sample user program and script to
> simplify NRM testing.

'resource manager' is a really generic name. Since you are referring to
bandwidth, how about renaming to Network Bandwidth Manager?

Eric Dumazet Feb. 23, 2019, 6:39 p.m. UTC | #2
On 02/22/2019 07:03 PM, David Ahern wrote:
> On 2/22/19 8:06 PM, brakmo wrote:
>> Network Resource Manager is a framework for limiting the bandwidth used
>> by v2 cgroups. It consists of 4 BPF helpers and a sample BPF program to
>> limit egress bandwidth as well as a sample user program and script to
>> simplify NRM testing.
> 
> 'resource manager' is a really generic name. Since you are referring to
> bandwidth, how about renaming to Network Bandwidth Manager?
> 

Or just use the normal word for a policer ...

Really this is beyond me that TCP experts can still push policers out there,
they are really a huge pain.

Alexei Starovoitov Feb. 23, 2019, 8:40 p.m. UTC | #3
On Sat, Feb 23, 2019 at 10:39:53AM -0800, Eric Dumazet wrote:
> 
> 
> On 02/22/2019 07:03 PM, David Ahern wrote:
> > On 2/22/19 8:06 PM, brakmo wrote:
> >> Network Resource Manager is a framework for limiting the bandwidth used
> >> by v2 cgroups. It consists of 4 BPF helpers and a sample BPF program to
> >> limit egress bandwidth as well as a sample user program and script to
> >> simplify NRM testing.
> > 
> > 'resource manager' is a really generic name. Since you are referring to
> > bandwidth, how about renaming to Network Bandwidth Manager?
> > 
> 
> Or just use the normal word for a policer ...
> 
> Really this is beyond me that TCP experts can still push policers out there,
> they are really a huge pain.

hmm. please see our NRM presentation at LPC.
It is a networking _resource_ management for cgroups.
Bandwidth enforcement is a particular example.
It's not a policer either.

Eric Dumazet Feb. 23, 2019, 8:43 p.m. UTC | #4
On 02/23/2019 12:40 PM, Alexei Starovoitov wrote:
> On Sat, Feb 23, 2019 at 10:39:53AM -0800, Eric Dumazet wrote:
>>
>>
>> On 02/22/2019 07:03 PM, David Ahern wrote:
>>> On 2/22/19 8:06 PM, brakmo wrote:
>>>> Network Resource Manager is a framework for limiting the bandwidth used
>>>> by v2 cgroups. It consists of 4 BPF helpers and a sample BPF program to
>>>> limit egress bandwidth as well as a sample user program and script to
>>>> simplify NRM testing.
>>>
>>> 'resource manager' is a really generic name. Since you are referring to
>>> bandwidth, how about renaming to Network Bandwidth Manager?
>>>
>>
>> Or just use the normal word for a policer ...
>>
>> Really this is beyond me that TCP experts can still push policers out there,
>> they are really a huge pain.
> 
> hmm. please see our NRM presentation at LPC.
> It is a networking _resource_ management for cgroups.
> Bandwidth enforcement is a particular example.
> It's not a policer either.
> 

Well, this definitely looks a policer to me, sorry if we disagree, this is fine.

Alexei Starovoitov Feb. 23, 2019, 11:25 p.m. UTC | #5
On Sat, Feb 23, 2019 at 12:43:51PM -0800, Eric Dumazet wrote:
> 
> 
> On 02/23/2019 12:40 PM, Alexei Starovoitov wrote:
> > On Sat, Feb 23, 2019 at 10:39:53AM -0800, Eric Dumazet wrote:
> >>
> >>
> >> On 02/22/2019 07:03 PM, David Ahern wrote:
> >>> On 2/22/19 8:06 PM, brakmo wrote:
> >>>> Network Resource Manager is a framework for limiting the bandwidth used
> >>>> by v2 cgroups. It consists of 4 BPF helpers and a sample BPF program to
> >>>> limit egress bandwidth as well as a sample user program and script to
> >>>> simplify NRM testing.
> >>>
> >>> 'resource manager' is a really generic name. Since you are referring to
> >>> bandwidth, how about renaming to Network Bandwidth Manager?
> >>>
> >>
> >> Or just use the normal word for a policer ...
> >>
> >> Really this is beyond me that TCP experts can still push policers out there,
> >> they are really a huge pain.
> > 
> > hmm. please see our NRM presentation at LPC.
> > It is a networking _resource_ management for cgroups.
> > Bandwidth enforcement is a particular example.
> > It's not a policer either.
> > 
> 
> Well, this definitely looks a policer to me, sorry if we disagree, this is fine.

this particular example certainly does look like it. we both agree.
It's overall direction of this work that is aiming to do
network resource management. For example bpf prog may choose
to react on SLA violations in one cgroup by throttling flows
in the other cgroup. Aggregated per-cgroup bandwidth doesn't
need to cross a threshold for bpf prog to take action.
It could do 'work conserving' 'policer'.
I think this set of patches represent a revolutionary approach and existing
networking nomenclature doesn't have precise words to describe it :)
'NRM' describes our goals the best.
Other folks may choose to use it differently, of course.
Note that NRM abbreviation doesn't leak anywhere in uapi.
It's only used in examples. So not sure what we're arguing about.

David Ahern Feb. 24, 2019, 2:58 a.m. UTC | #6
On 2/23/19 6:25 PM, Alexei Starovoitov wrote:
>>> hmm. please see our NRM presentation at LPC.

Reference?

We also gave a talk about a resource manager in November 2017:

https://netdevconf.org/2.2/papers/roulin-hardwareresourcesmgmt-talk.pdf

in this case the context is hardware resources for networking which
aligns with devlink and switchdev.

>>> It is a networking _resource_ management for cgroups.
>>> Bandwidth enforcement is a particular example.
>>> It's not a policer either.
>>>
>>
>> Well, this definitely looks a policer to me, sorry if we disagree, this is fine.
> 
> this particular example certainly does look like it. we both agree.
> It's overall direction of this work that is aiming to do
> network resource management. For example bpf prog may choose
> to react on SLA violations in one cgroup by throttling flows
> in the other cgroup. Aggregated per-cgroup bandwidth doesn't
> need to cross a threshold for bpf prog to take action.
> It could do 'work conserving' 'policer'.
> I think this set of patches represent a revolutionary approach and existing
> networking nomenclature doesn't have precise words to describe it :)
> 'NRM' describes our goals the best.

Are you doing something beyond bandwidth usage? e.g., are you limiting
neighbor entries, fdb entries or FIB entries by cgroup? what about
router interfaces or vlans? I cannot imagine why or how you would manage
that but my point is the meaning of 'network resources'.


> Other folks may choose to use it differently, of course.
> Note that NRM abbreviation doesn't leak anywhere in uapi.
> It's only used in examples. So not sure what we're arguing about.
> 

It was a simple request for a more specific name that better represents
the scope of the project. Everything presented so far has been about
bandwidth.

Alexei Starovoitov Feb. 24, 2019, 4:48 a.m. UTC | #7
On Sat, Feb 23, 2019 at 09:58:57PM -0500, David Ahern wrote:
> On 2/23/19 6:25 PM, Alexei Starovoitov wrote:
> >>> hmm. please see our NRM presentation at LPC.
> 
> Reference?
> 
> We also gave a talk about a resource manager in November 2017:
> 
> https://netdevconf.org/2.2/papers/roulin-hardwareresourcesmgmt-talk.pdf
> 
> in this case the context is hardware resources for networking which
> aligns with devlink and switchdev.
> 
> >>> It is a networking _resource_ management for cgroups.
> >>> Bandwidth enforcement is a particular example.
> >>> It's not a policer either.
> >>>
> >>
> >> Well, this definitely looks a policer to me, sorry if we disagree, this is fine.
> > 
> > this particular example certainly does look like it. we both agree.
> > It's overall direction of this work that is aiming to do
> > network resource management. For example bpf prog may choose
> > to react on SLA violations in one cgroup by throttling flows
> > in the other cgroup. Aggregated per-cgroup bandwidth doesn't
> > need to cross a threshold for bpf prog to take action.
> > It could do 'work conserving' 'policer'.
> > I think this set of patches represent a revolutionary approach and existing
> > networking nomenclature doesn't have precise words to describe it :)
> > 'NRM' describes our goals the best.
> 
> Are you doing something beyond bandwidth usage? e.g., are you limiting
> neighbor entries, fdb entries or FIB entries by cgroup? what about
> router interfaces or vlans? I cannot imagine why or how you would manage
> that but my point is the meaning of 'network resources'.

'network resources' also include back bone and TOR capacity and
this mechanism is going to help address that as well.

David Ahern Feb. 25, 2019, 1:38 a.m. UTC | #8
On 2/23/19 11:48 PM, Alexei Starovoitov wrote:
> 'network resources' also include back bone and TOR capacity and
> this mechanism is going to help address that as well.

This appears to be the talk you are referring to:

http://vger.kernel.org/lpc_net2018_talks/LPC%20NRM.pdf

and from my reading it only references throttling at L4 - i.e.,
bandwidth. Hence my request for a better name than 'network resources'
in the commit logs and code references.