diff mbox series

[ovs-dev,RESEND,ovs,v3,4/4] dpif: Don't set "burst_size" to "rate" if not specified.

Message ID 20210303144658.47858-5-xiangxia.m.yue@gmail.com
State Changes Requested
Headers show
Series dpif-netdev meter fixes bug and improves. | expand

Commit Message

Tonghao Zhang March 3, 2021, 2:46 p.m. UTC
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

If user don't set the meter "burst_size", when creating them. OvS
will set "brust_size" to "rate", and there will be a double "rate"
as burst rate. For example:

$ ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps stats bands=type=drop rate=1000000'

The rate expected is 1Gbps, but burst rate is 2Gbps while
we don't use "burst_size".

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
 lib/dpif-netdev.c    |   5 ---
 lib/dpif-netlink.c   |   2 +-
 tests/dpif-netdev.at | 103 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 104 insertions(+), 6 deletions(-)

Comments

Ilya Maximets March 30, 2021, 9:16 p.m. UTC | #1
On 3/3/21 3:46 PM, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> 
> If user don't set the meter "burst_size", when creating them. OvS
> will set "brust_size" to "rate", and there will be a double "rate"
> as burst rate. For example:
> 
> $ ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps stats bands=type=drop rate=1000000'
> 
> The rate expected is 1Gbps, but burst rate is 2Gbps while
> we don't use "burst_size".


OpenFlow spec is a bit loose in definition of what should
be behavior if burst is not set:
"""
If the flag OFPMF_BURST is not set the burst_size values from meter
bands are ignored, and if the meter implementation uses a burst value,
this burst value must be set to an implementation defined optimal value.
"""

In our case, historically, "implementation defined optimal value" was
value equal to rate.  I have no idea why, but it's hard to argue with
it since the spec gives a great freedom to choose.

Actually, the "burst" itself as a term makes very little sense to me.
It's defined by the spec as:
"""
It defines the granularity of the meter band, for all packet or byte
bursts whose length is greater than burst value, the meter rate will
always be strictly enforced.
"""

But what is the burst?  How the implementation should define which
packets are in the burst and which are from the next one?

Current implementation just assumes that bursts are measured per second.
But the rate is measured per second too.  So, burst and rate is
essentially the same thing and implementations just sums them together
to get the bucket size.  So, I do not understand why "burst" and
"burst_size" exist at all.  Why not just set the rate a bit higher?

Ben, can you shed some light on this?  What was the original idea
behind the meter burst?  Or maybe I'm missing something?

Some review comments inline.

Best regards, Ilya Maximets.

> 
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
>  lib/dpif-netdev.c    |   5 ---
>  lib/dpif-netlink.c   |   2 +-
>  tests/dpif-netdev.at | 103 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 104 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e53ed31b906c..e0a44abebda6 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -6327,11 +6327,6 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id meter_id,
>      for (i = 0; i < config->n_bands; ++i) {
>          uint32_t band_max_delta_t;
>  
> -        /* Set burst size to a workable value if none specified. */
> -        if (config->bands[i].burst_size == 0) {
> -            config->bands[i].burst_size = config->bands[i].rate;
> -        }
> -
>          meter->bands[i].rate = config->bands[i].rate;
>          meter->bands[i].burst_size = config->bands[i].burst_size;

I still think that we need to check for OFPMF13_BURST in flags
and use config->bands[i].burst_size only if it is set.

>          /* Start with a full bucket. */
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index ceb56c6851c6..f3db0c6802b9 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -3761,7 +3761,7 @@ dpif_netlink_meter_set__(struct dpif *dpif_, ofproto_meter_id meter_id,
>          nl_msg_put_u32(&buf, OVS_BAND_ATTR_RATE, band->rate);
>          nl_msg_put_u32(&buf, OVS_BAND_ATTR_BURST,
>                         config->flags & OFPMF13_BURST ?
> -                       band->burst_size : band->rate);
> +                       band->burst_size : 0);
>          nl_msg_end_nested(&buf, band_offset);
>      }
>      nl_msg_end_nested(&buf, bands_offset);
> diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
> index b2ff69a01ee6..b18297983007 100644
> --- a/tests/dpif-netdev.at
> +++ b/tests/dpif-netdev.at
> @@ -372,6 +372,109 @@ recirc_id(0),in_port(8),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), a
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
> +AT_SETUP([dpif-netdev - meters without burst_size])
> +# Create br0 with interfaces p1 and p7
> +#    and br1 with interfaces p2 and p8
> +# with p1 and p2 connected via unix domain socket
> +OVS_VSWITCHD_START(
> +  [add-port br0 p1 -- set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p0.sock ofport_request=1 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
> +   add-br br1 -- \
> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
> +                  fail-mode=secure -- \
> +   add-port br1 p2 -- set interface p2 type=dummy options:stream=unix:$OVS_RUNDIR/p0.sock ofport_request=2 -- \
> +   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> +
> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps stats bands=type=drop rate=1'])
> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=2 kbps stats bands=type=drop rate=1'])
> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 'in_port=1 action=meter:1,7'])
> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 'in_port=7 action=meter:2,1'])
> +AT_CHECK([ovs-ofctl add-flow br1 'in_port=2 action=8'])
> +AT_CHECK([ovs-ofctl add-flow br1 'in_port=8 action=2'])
> +ovs-appctl time/stop
> +
> +AT_CHECK([ovs-ofctl -O OpenFlow13 dump-meters br0], [0], [dnl
> +OFPST_METER_CONFIG reply (OF1.3) (xid=0x2):
> +meter=1 pktps stats bands=
> +type=drop rate=1
> +
> +meter=2 kbps stats bands=
> +type=drop rate=1
> +])
> +
> +ovs-appctl time/warp 5000
> +for i in `seq 1 5`; do
> +    AT_CHECK([ovs-appctl netdev-dummy/receive p7 \
> +            'in_port(7),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
> +    AT_CHECK([ovs-appctl netdev-dummy/receive p8 \
> +            'in_port(8),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
> +done
> +
> +sleep 1  # wait for forwarders process packets
> +
> +# Meter 1 is measuring packets, allowing one packet per second,
> +# so 4 out of 5 packets should hit the drop band.
> +#
> +# Meter 2 is measuring kbps (== 1000 bits). 2 packets
> +# (120 bytes == 960 bits) pass, but the last 3 packets should hit the drop band.
> +AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 | strip_timers], [0], [dnl
> +OFPST_METER reply (OF1.3) (xid=0x2):
> +meter:1 flow_count:1 packet_in_count:5 byte_in_count:300 duration:0.0s bands:
> +0: packet_count:4 byte_count:240
> +
> +meter:2 flow_count:1 packet_in_count:5 byte_in_count:300 duration:0.0s bands:
> +0: packet_count:3 byte_count:180
> +])
> +
> +# Advance time by 1/2 second
> +ovs-appctl time/warp 500
> +
> +for i in `seq 1 5`; do
> +    AT_CHECK([ovs-appctl netdev-dummy/receive p7 \
> +            'in_port(7),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
> +
> +    AT_CHECK([ovs-appctl netdev-dummy/receive p8 \
> +            'in_port(8),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
> +
> +done
> +
> +sleep 1  # wait for forwarders process packets
> +
> +# Meter 1 is measuring packets, allowing one packet per second with
> +# bursts of one packet, so all 5 of the new packets should hit the drop

This comment still talks about bursts.

It might make sense to not copy the whole test, but just add a couple of
new ports and flows and one extra meter to the existing test.
WDYT?

> +# band.
> +# Meter 2 is measuring kbps (== 1000 bits). After 500ms
> +# there should be space for 40 + 500 bits, so one new 60 byte (480 bit) packet
> +# should pass, remaining 4 should hit the drop band.
> +AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 | strip_timers], [0], [dnl
> +OFPST_METER reply (OF1.3) (xid=0x2):
> +meter:1 flow_count:1 packet_in_count:10 byte_in_count:600 duration:0.0s bands:
> +0: packet_count:9 byte_count:540
> +
> +meter:2 flow_count:1 packet_in_count:10 byte_in_count:600 duration:0.0s bands:
> +0: packet_count:7 byte_count:420
> +])
> +
> +ovs-appctl time/warp 5000
> +
> +AT_CHECK([
> +ovs-appctl coverage/read-counter datapath_drop_meter
> +], [0], [dnl
> +16
> +])
> +
> +AT_CHECK([cat ovs-vswitchd.log | filter_flow_install | strip_xout_keep_actions], [0], [dnl
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:meter(0),7
> +recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:8
> +recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:meter(1),1
> +recirc_id(0),in_port(8),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:2
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
>  m4_define([DPIF_NETDEV_FLOW_HW_OFFLOAD],
>    [AT_SETUP([dpif-netdev - partial hw offload - $1])
>     OVS_VSWITCHD_START(
>
Ben Pfaff March 30, 2021, 9:27 p.m. UTC | #2
On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
> On 3/3/21 3:46 PM, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > 
> > If user don't set the meter "burst_size", when creating them. OvS
> > will set "brust_size" to "rate", and there will be a double "rate"
> > as burst rate. For example:
> > 
> > $ ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps stats bands=type=drop rate=1000000'
> > 
> > The rate expected is 1Gbps, but burst rate is 2Gbps while
> > we don't use "burst_size".
> 
> 
> OpenFlow spec is a bit loose in definition of what should
> be behavior if burst is not set:
> """
> If the flag OFPMF_BURST is not set the burst_size values from meter
> bands are ignored, and if the meter implementation uses a burst value,
> this burst value must be set to an implementation defined optimal value.
> """
> 
> In our case, historically, "implementation defined optimal value" was
> value equal to rate.  I have no idea why, but it's hard to argue with
> it since the spec gives a great freedom to choose.
> 
> Actually, the "burst" itself as a term makes very little sense to me.
> It's defined by the spec as:
> """
> It defines the granularity of the meter band, for all packet or byte
> bursts whose length is greater than burst value, the meter rate will
> always be strictly enforced.
> """
> 
> But what is the burst?  How the implementation should define which
> packets are in the burst and which are from the next one?
> 
> Current implementation just assumes that bursts are measured per second.
> But the rate is measured per second too.  So, burst and rate is
> essentially the same thing and implementations just sums them together
> to get the bucket size.  So, I do not understand why "burst" and
> "burst_size" exist at all.  Why not just set the rate a bit higher?
> 
> Ben, can you shed some light on this?  What was the original idea
> behind the meter burst?  Or maybe I'm missing something?

I wasn't really involved in the design of meters.  I saw them as a
feature of hardware switches that was not very relevant to software
switches.  I guess I was wrong.

I think that Jean Tourillhes was the primary architect of meters in
OpenFlow.  I am adding him to this thread.  I hope that he can help.
Jean Tourrilhes March 30, 2021, 10:29 p.m. UTC | #3
On Tue, Mar 30, 2021 at 02:27:11PM -0700, Ben Pfaff wrote:
> On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
> > 
> > OpenFlow spec is a bit loose in definition of what should
> > be behavior if burst is not set:
> > """
> > If the flag OFPMF_BURST is not set the burst_size values from meter
> > bands are ignored, and if the meter implementation uses a burst value,
> > this burst value must be set to an implementation defined optimal value.
> > """
> > 
> > In our case, historically, "implementation defined optimal value" was
> > value equal to rate.  I have no idea why, but it's hard to argue with
> > it since the spec gives a great freedom to choose.
> > 
> > Actually, the "burst" itself as a term makes very little sense to me.
> > It's defined by the spec as:
> > """
> > It defines the granularity of the meter band, for all packet or byte
> > bursts whose length is greater than burst value, the meter rate will
> > always be strictly enforced.
> > """
> > 
> > But what is the burst?  How the implementation should define which
> > packets are in the burst and which are from the next one?
> > 
> > Current implementation just assumes that bursts are measured per second.
> > But the rate is measured per second too.  So, burst and rate is
> > essentially the same thing and implementations just sums them together
> > to get the bucket size.  So, I do not understand why "burst" and
> > "burst_size" exist at all.  Why not just set the rate a bit higher?
> > 
> > Ben, can you shed some light on this?  What was the original idea
> > behind the meter burst?  Or maybe I'm missing something?

	I don't understand how you can confuse a rate and a size. The
OpenFlow spec clearly says it's in kilobits or packets (not per
seconds).
	A basic token bucket has only two parameters, the commited
rate and the burst size (i.e. maximum number of tokens in the
bucket). The spec reflect that in a generic way to avoid mandating an
implementation.
	Burst rate is only defined for more fancy rate limiters, such
as two colors rate limiters. In this case, you also have two burst
size, one for each token bucket. The OpenFlow spec does not support
those extra parameters (as of version 1.5.1).
	For Linux 'police' filter : rate == rate ; burst_size == burst
	For Linux 'htb' qdisc : rate == rate ; burst_size == burst ;
ceil and cburst are not supported.

> I wasn't really involved in the design of meters.  I saw them as a
> feature of hardware switches that was not very relevant to software
> switches.  I guess I was wrong.
> 
> I think that Jean Tourillhes was the primary architect of meters in
> OpenFlow.  I am adding him to this thread.  I hope that he can help.

	Have fun...

	Jean
Ben Pfaff March 30, 2021, 10:33 p.m. UTC | #4
On Tue, Mar 30, 2021 at 03:29:54PM -0700, Jean Tourrilhes wrote:
> 	Have fun...

From that line you can be sure that this is a genuine Jean Tourrilhes
email.

Thanks, Jean!
Tonghao Zhang March 31, 2021, 2:22 a.m. UTC | #5
On Wed, Mar 31, 2021 at 5:16 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 3/3/21 3:46 PM, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > If user don't set the meter "burst_size", when creating them. OvS
> > will set "brust_size" to "rate", and there will be a double "rate"
> > as burst rate. For example:
> >
> > $ ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps stats bands=type=drop rate=1000000'
> >
> > The rate expected is 1Gbps, but burst rate is 2Gbps while
> > we don't use "burst_size".
>
>
> OpenFlow spec is a bit loose in definition of what should
> be behavior if burst is not set:
> """
> If the flag OFPMF_BURST is not set the burst_size values from meter
> bands are ignored, and if the meter implementation uses a burst value,
> this burst value must be set to an implementation defined optimal value.
> """
>
> In our case, historically, "implementation defined optimal value" was
> value equal to rate.  I have no idea why, but it's hard to argue with
> it since the spec gives a great freedom to choose.
>
> Actually, the "burst" itself as a term makes very little sense to me.
> It's defined by the spec as:
> """
> It defines the granularity of the meter band, for all packet or byte
> bursts whose length is greater than burst value, the meter rate will
> always be strictly enforced.
> """
>
> But what is the burst?  How the implementation should define which
> packets are in the burst and which are from the next one?
>
> Current implementation just assumes that bursts are measured per second.
> But the rate is measured per second too.  So, burst and rate is
> essentially the same thing and implementations just sums them together
> to get the bucket size.  So, I do not understand why "burst" and
> "burst_size" exist at all.  Why not just set the rate a bit higher?
As I test if we set the burst_size, for example
$ ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps stats
bands=type=drop rate=1000000'
bandwidth is 2Gbps in the first second, and later 1Gbps. The
burst_size only affects the rate in the first second.

> Ben, can you shed some light on this?  What was the original idea
> behind the meter burst?  Or maybe I'm missing something?
>
> Some review comments inline.
>
> Best regards, Ilya Maximets.
>
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > ---
> >  lib/dpif-netdev.c    |   5 ---
> >  lib/dpif-netlink.c   |   2 +-
> >  tests/dpif-netdev.at | 103 +++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 104 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index e53ed31b906c..e0a44abebda6 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -6327,11 +6327,6 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id meter_id,
> >      for (i = 0; i < config->n_bands; ++i) {
> >          uint32_t band_max_delta_t;
> >
> > -        /* Set burst size to a workable value if none specified. */
> > -        if (config->bands[i].burst_size == 0) {
> > -            config->bands[i].burst_size = config->bands[i].rate;
> > -        }
> > -
> >          meter->bands[i].rate = config->bands[i].rate;
> >          meter->bands[i].burst_size = config->bands[i].burst_size;
>
> I still think that we need to check for OFPMF13_BURST in flags
> and use config->bands[i].burst_size only if it is set.
Hi I guess bust_size and rate is different, in the kernel datapath:
band->rate = nla_get_u32(attr[OVS_BAND_ATTR_RATE]);
band->burst_size = nla_get_u32(attr[OVS_BAND_ATTR_BURST]);

> >          /* Start with a full bucket. */
> > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> > index ceb56c6851c6..f3db0c6802b9 100644
> > --- a/lib/dpif-netlink.c
> > +++ b/lib/dpif-netlink.c
> > @@ -3761,7 +3761,7 @@ dpif_netlink_meter_set__(struct dpif *dpif_, ofproto_meter_id meter_id,
> >          nl_msg_put_u32(&buf, OVS_BAND_ATTR_RATE, band->rate);
> >          nl_msg_put_u32(&buf, OVS_BAND_ATTR_BURST,
> >                         config->flags & OFPMF13_BURST ?
> > -                       band->burst_size : band->rate);
> > +                       band->burst_size : 0);
> >          nl_msg_end_nested(&buf, band_offset);
> >      }
> >      nl_msg_end_nested(&buf, bands_offset);
> > diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
> > index b2ff69a01ee6..b18297983007 100644
> > --- a/tests/dpif-netdev.at
> > +++ b/tests/dpif-netdev.at
> > @@ -372,6 +372,109 @@ recirc_id(0),in_port(8),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), a
> >  OVS_VSWITCHD_STOP
> >  AT_CLEANUP
> >
> > +AT_SETUP([dpif-netdev - meters without burst_size])
> > +# Create br0 with interfaces p1 and p7
> > +#    and br1 with interfaces p2 and p8
> > +# with p1 and p2 connected via unix domain socket
> > +OVS_VSWITCHD_START(
> > +  [add-port br0 p1 -- set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p0.sock ofport_request=1 -- \
> > +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
> > +   add-br br1 -- \
> > +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> > +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
> > +                  fail-mode=secure -- \
> > +   add-port br1 p2 -- set interface p2 type=dummy options:stream=unix:$OVS_RUNDIR/p0.sock ofport_request=2 -- \
> > +   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> > +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> > +
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps stats bands=type=drop rate=1'])
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=2 kbps stats bands=type=drop rate=1'])
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 'in_port=1 action=meter:1,7'])
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 'in_port=7 action=meter:2,1'])
> > +AT_CHECK([ovs-ofctl add-flow br1 'in_port=2 action=8'])
> > +AT_CHECK([ovs-ofctl add-flow br1 'in_port=8 action=2'])
> > +ovs-appctl time/stop
> > +
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 dump-meters br0], [0], [dnl
> > +OFPST_METER_CONFIG reply (OF1.3) (xid=0x2):
> > +meter=1 pktps stats bands=
> > +type=drop rate=1
> > +
> > +meter=2 kbps stats bands=
> > +type=drop rate=1
> > +])
> > +
> > +ovs-appctl time/warp 5000
> > +for i in `seq 1 5`; do
> > +    AT_CHECK([ovs-appctl netdev-dummy/receive p7 \
> > +            'in_port(7),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
> > +    AT_CHECK([ovs-appctl netdev-dummy/receive p8 \
> > +            'in_port(8),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
> > +done
> > +
> > +sleep 1  # wait for forwarders process packets
> > +
> > +# Meter 1 is measuring packets, allowing one packet per second,
> > +# so 4 out of 5 packets should hit the drop band.
> > +#
> > +# Meter 2 is measuring kbps (== 1000 bits). 2 packets
> > +# (120 bytes == 960 bits) pass, but the last 3 packets should hit the drop band.
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 | strip_timers], [0], [dnl
> > +OFPST_METER reply (OF1.3) (xid=0x2):
> > +meter:1 flow_count:1 packet_in_count:5 byte_in_count:300 duration:0.0s bands:
> > +0: packet_count:4 byte_count:240
> > +
> > +meter:2 flow_count:1 packet_in_count:5 byte_in_count:300 duration:0.0s bands:
> > +0: packet_count:3 byte_count:180
> > +])
> > +
> > +# Advance time by 1/2 second
> > +ovs-appctl time/warp 500
> > +
> > +for i in `seq 1 5`; do
> > +    AT_CHECK([ovs-appctl netdev-dummy/receive p7 \
> > +            'in_port(7),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
> > +
> > +    AT_CHECK([ovs-appctl netdev-dummy/receive p8 \
> > +            'in_port(8),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
> > +
> > +done
> > +
> > +sleep 1  # wait for forwarders process packets
> > +
> > +# Meter 1 is measuring packets, allowing one packet per second with
> > +# bursts of one packet, so all 5 of the new packets should hit the drop
>
> This comment still talks about bursts.
>
> It might make sense to not copy the whole test, but just add a couple of
> new ports and flows and one extra meter to the existing test.
> WDYT?
The original meter test case, use  "burst_size", if users don't use
it, OvS should work fine too. Because "burst_size" is important in the
merter function,
I use a new test case for it, that makes the test case clean.
> > +# band.
> > +# Meter 2 is measuring kbps (== 1000 bits). After 500ms
> > +# there should be space for 40 + 500 bits, so one new 60 byte (480 bit) packet
> > +# should pass, remaining 4 should hit the drop band.
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 | strip_timers], [0], [dnl
> > +OFPST_METER reply (OF1.3) (xid=0x2):
> > +meter:1 flow_count:1 packet_in_count:10 byte_in_count:600 duration:0.0s bands:
> > +0: packet_count:9 byte_count:540
> > +
> > +meter:2 flow_count:1 packet_in_count:10 byte_in_count:600 duration:0.0s bands:
> > +0: packet_count:7 byte_count:420
> > +])
> > +
> > +ovs-appctl time/warp 5000
> > +
> > +AT_CHECK([
> > +ovs-appctl coverage/read-counter datapath_drop_meter
> > +], [0], [dnl
> > +16
> > +])
> > +
> > +AT_CHECK([cat ovs-vswitchd.log | filter_flow_install | strip_xout_keep_actions], [0], [dnl
> > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:meter(0),7
> > +recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:8
> > +recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:meter(1),1
> > +recirc_id(0),in_port(8),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:2
> > +])
> > +
> > +OVS_VSWITCHD_STOP
> > +AT_CLEANUP
> > +
> >  m4_define([DPIF_NETDEV_FLOW_HW_OFFLOAD],
> >    [AT_SETUP([dpif-netdev - partial hw offload - $1])
> >     OVS_VSWITCHD_START(
> >
>
Tonghao Zhang March 31, 2021, 2:26 a.m. UTC | #6
On Wed, Mar 31, 2021 at 6:29 AM Jean Tourrilhes <jean.tourrilhes@hpe.com> wrote:
>
> On Tue, Mar 30, 2021 at 02:27:11PM -0700, Ben Pfaff wrote:
> > On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
> > >
> > > OpenFlow spec is a bit loose in definition of what should
> > > be behavior if burst is not set:
> > > """
> > > If the flag OFPMF_BURST is not set the burst_size values from meter
> > > bands are ignored, and if the meter implementation uses a burst value,
> > > this burst value must be set to an implementation defined optimal value.
> > > """
> > >
> > > In our case, historically, "implementation defined optimal value" was
> > > value equal to rate.  I have no idea why, but it's hard to argue with
> > > it since the spec gives a great freedom to choose.
> > >
> > > Actually, the "burst" itself as a term makes very little sense to me.
> > > It's defined by the spec as:
> > > """
> > > It defines the granularity of the meter band, for all packet or byte
> > > bursts whose length is greater than burst value, the meter rate will
> > > always be strictly enforced.
> > > """
> > >
> > > But what is the burst?  How the implementation should define which
> > > packets are in the burst and which are from the next one?
> > >
> > > Current implementation just assumes that bursts are measured per second.
> > > But the rate is measured per second too.  So, burst and rate is
> > > essentially the same thing and implementations just sums them together
> > > to get the bucket size.  So, I do not understand why "burst" and
> > > "burst_size" exist at all.  Why not just set the rate a bit higher?
> > >
> > > Ben, can you shed some light on this?  What was the original idea
> > > behind the meter burst?  Or maybe I'm missing something?
>
>         I don't understand how you can confuse a rate and a size. The
> OpenFlow spec clearly says it's in kilobits or packets (not per
> seconds).
>         A basic token bucket has only two parameters, the commited
> rate and the burst size (i.e. maximum number of tokens in the
> bucket). The spec reflect that in a generic way to avoid mandating an
> implementation.
>         Burst rate is only defined for more fancy rate limiters, such
> as two colors rate limiters. In this case, you also have two burst
> size, one for each token bucket. The OpenFlow spec does not support
> those extra parameters (as of version 1.5.1).
>         For Linux 'police' filter : rate == rate ; burst_size == burst
>         For Linux 'htb' qdisc : rate == rate ; burst_size == burst ;
Yes, we do also this in the kernel datapath, but not userspace datapath.
> ceil and cburst are not supported.
>
> > I wasn't really involved in the design of meters.  I saw them as a
> > feature of hardware switches that was not very relevant to software
> > switches.  I guess I was wrong.
> >
> > I think that Jean Tourillhes was the primary architect of meters in
> > OpenFlow.  I am adding him to this thread.  I hope that he can help.
>
>         Have fun...
>
>         Jean
Tonghao Zhang April 1, 2021, 11:53 a.m. UTC | #7
On Wed, Mar 31, 2021 at 10:26 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Wed, Mar 31, 2021 at 6:29 AM Jean Tourrilhes <jean.tourrilhes@hpe.com> wrote:
> >
> > On Tue, Mar 30, 2021 at 02:27:11PM -0700, Ben Pfaff wrote:
> > > On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
> > > >
> > > > OpenFlow spec is a bit loose in definition of what should
> > > > be behavior if burst is not set:
> > > > """
> > > > If the flag OFPMF_BURST is not set the burst_size values from meter
> > > > bands are ignored, and if the meter implementation uses a burst value,
> > > > this burst value must be set to an implementation defined optimal value.
> > > > """
> > > >
> > > > In our case, historically, "implementation defined optimal value" was
> > > > value equal to rate.  I have no idea why, but it's hard to argue with
> > > > it since the spec gives a great freedom to choose.
> > > >
> > > > Actually, the "burst" itself as a term makes very little sense to me.
> > > > It's defined by the spec as:
> > > > """
> > > > It defines the granularity of the meter band, for all packet or byte
> > > > bursts whose length is greater than burst value, the meter rate will
> > > > always be strictly enforced.
> > > > """
> > > >
> > > > But what is the burst?  How the implementation should define which
> > > > packets are in the burst and which are from the next one?
> > > >
> > > > Current implementation just assumes that bursts are measured per second.
> > > > But the rate is measured per second too.  So, burst and rate is
> > > > essentially the same thing and implementations just sums them together
> > > > to get the bucket size.  So, I do not understand why "burst" and
> > > > "burst_size" exist at all.  Why not just set the rate a bit higher?
> > > >
> > > > Ben, can you shed some light on this?  What was the original idea
> > > > behind the meter burst?  Or maybe I'm missing something?
> >
> >         I don't understand how you can confuse a rate and a size. The
> > OpenFlow spec clearly says it's in kilobits or packets (not per
> > seconds).
> >         A basic token bucket has only two parameters, the commited
> > rate and the burst size (i.e. maximum number of tokens in the
> > bucket). The spec reflect that in a generic way to avoid mandating an
> > implementation.
> >         Burst rate is only defined for more fancy rate limiters, such
> > as two colors rate limiters. In this case, you also have two burst
> > size, one for each token bucket. The OpenFlow spec does not support
> > those extra parameters (as of version 1.5.1).
> >         For Linux 'police' filter : rate == rate ; burst_size == burst
> >         For Linux 'htb' qdisc : rate == rate ; burst_size == burst ;
> Yes, we do also this in the kernel datapath, but not userspace datapath.
> > ceil and cburst are not supported.
> >
> > > I wasn't really involved in the design of meters.  I saw them as a
> > > feature of hardware switches that was not very relevant to software
> > > switches.  I guess I was wrong.
> > >
> > > I think that Jean Tourillhes was the primary architect of meters in
> > > OpenFlow.  I am adding him to this thread.  I hope that he can help.
> >
> >         Have fun...
> >
> >         Jean
Hi Ben, Ilya
Try to explain this patch again. Now OvS has supported the burst_size,
 as one user case,
if users don't use the burst_size feature, we should set burst_size to
rate or 0. This patch set this to 0.

As Ilya said, we should check the OFPMF13_BURST in userspace datapath,
I think it's right.

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 251788b04965..afd698be1a59 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -6349,13 +6349,12 @@ dpif_netdev_meter_set(struct dpif *dpif,
ofproto_meter_id meter_id,
     for (i = 0; i < config->n_bands; ++i) {
         uint32_t band_max_delta_t;

-        /* Set burst size to a workable value if none specified. */
-        if (config->bands[i].burst_size == 0) {
-            config->bands[i].burst_size = config->bands[i].rate;
+        /* Set burst size to a workable value if specified. */
+        if (config->flags & OFPMF13_BURST) {
+            meter->bands[i].burst_size = config->bands[i].burst_size;
         }

         meter->bands[i].rate = config->bands[i].rate;
-        meter->bands[i].burst_size = config->bands[i].burst_size;
         /* Start with a full bucket. */
         meter->bands[i].bucket =
             (meter->bands[i].burst_size + meter->bands[i].rate) * 1000ULL;
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index ceb56c6851c6..f3db0c6802b9 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -3761,7 +3761,7 @@ dpif_netlink_meter_set__(struct dpif *dpif_,
ofproto_meter_id meter_id,
         nl_msg_put_u32(&buf, OVS_BAND_ATTR_RATE, band->rate);
         nl_msg_put_u32(&buf, OVS_BAND_ATTR_BURST,
                        config->flags & OFPMF13_BURST ?
-                       band->burst_size : band->rate);
+                       band->burst_size : 0);
         nl_msg_end_nested(&buf, band_offset);
     }
     nl_msg_end_nested(&buf, bands_offset);


>
>
> --
> Best regards, Tonghao
Jean Tourrilhes April 1, 2021, 7:04 p.m. UTC | #8
On Thu, Apr 01, 2021 at 07:53:25PM +0800, Tonghao Zhang wrote:
> 
> Hi Ben, Ilya
> Try to explain this patch again. Now OvS has supported the burst_size,
>  as one user case,
> if users don't use the burst_size feature, we should set burst_size to
> rate or 0. This patch set this to 0.

	'0' is definitely not an "implementation defined optimal
value" as the spec requires. Actually, most token buckets
implementations do not work or work very poorly with a bucket size of
zero.
	The first hit on a Google search :
https://www.juniper.net/documentation/us/en/software/junos/routing-policy/topics/concept/policer-mx-m120-m320-burstsize-determining.html
	I don't fully agree with their recommendations, 5ms is way too
large for high speed networks, but they expose the problem properly.

> As Ilya said, we should check the OFPMF13_BURST in userspace datapath,
> I think it's right.

	I don't have enough context to comment on the patch and
I don't know where that flag should be tested. It would seem that all
datapaths will suffer from the same issue, you need to figure out an
optimal burst_size for user space *and* for kernel space if none is
given via OpenFlow.
	I have a strong suspicion that the patch does not do a great
job if OFPMF13_BURST is not set, but I may be wrong.

	Regards,

	Jean
Ilya Maximets April 1, 2021, 8:35 p.m. UTC | #9
On 3/31/21 12:29 AM, Jean Tourrilhes wrote:
> On Tue, Mar 30, 2021 at 02:27:11PM -0700, Ben Pfaff wrote:
>> On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
>>>
>>> OpenFlow spec is a bit loose in definition of what should
>>> be behavior if burst is not set:
>>> """
>>> If the flag OFPMF_BURST is not set the burst_size values from meter
>>> bands are ignored, and if the meter implementation uses a burst value,
>>> this burst value must be set to an implementation defined optimal value.
>>> """
>>>
>>> In our case, historically, "implementation defined optimal value" was
>>> value equal to rate.  I have no idea why, but it's hard to argue with
>>> it since the spec gives a great freedom to choose.
>>>
>>> Actually, the "burst" itself as a term makes very little sense to me.
>>> It's defined by the spec as:
>>> """
>>> It defines the granularity of the meter band, for all packet or byte
>>> bursts whose length is greater than burst value, the meter rate will
>>> always be strictly enforced.
>>> """
>>>
>>> But what is the burst?  How the implementation should define which
>>> packets are in the burst and which are from the next one?
>>>
>>> Current implementation just assumes that bursts are measured per second.
>>> But the rate is measured per second too.  So, burst and rate is
>>> essentially the same thing and implementations just sums them together
>>> to get the bucket size.  So, I do not understand why "burst" and
>>> "burst_size" exist at all.  Why not just set the rate a bit higher?
>>>
>>> Ben, can you shed some light on this?  What was the original idea
>>> behind the meter burst?  Or maybe I'm missing something?
> 
> 	I don't understand how you can confuse a rate and a size. The
> OpenFlow spec clearly says it's in kilobits or packets (not per
> seconds).
> 	A basic token bucket has only two parameters, the commited
> rate and the burst size (i.e. maximum number of tokens in the
> bucket). The spec reflect that in a generic way to avoid mandating an
> implementation.

Thanks, Jean.

My problem, actually, was that I started from the implementation in the
kernel datapath and it looked super weird.  Especially, this part:
  https://elixir.bootlin.com/linux/latest/source/net/openvswitch/meter.c#L644
Than I tried to find the truth inside the spec, but it doesn't define
the implementation on purpose.  So the only option I had is to guess
how this suppose to work.  Was it 11pm or something else, but my guessing
engine didn't came up with anything that might make sense. :)

> 	Burst rate is only defined for more fancy rate limiters, such
> as two colors rate limiters. In this case, you also have two burst
> size, one for each token bucket. The OpenFlow spec does not support
> those extra parameters (as of version 1.5.1).
> 	For Linux 'police' filter : rate == rate ; burst_size == burst
> 	For Linux 'htb' qdisc : rate == rate ; burst_size == burst ;
> ceil and cburst are not supported.

This totally makes sense.  OTOH, Implementation inside both datapaths
doesn't.

Thanks for pushing me in right direction.

For the implementations: I think, they needs to be reworked.
At least, we need to get rid of 'rate' in a calculation of a maximum
bucket size.  It should not depend on rate, only on a burst size.
i.e. instead of:
  max_bucket_size = (band->burst_size + band->rate) * 1000;
there should be:
  max_bucket_size = band->burst_size * 1000;

This way implementations will have at least a bit of sense.

Summing burst size with rate is like summing apples with oranges.
And that is what misled me.

About having a value for a burst size being numerically equal to the
configured rate:  this looks like some kind of estimated value, but
it's really hard to argue with it, because research is needed to
define the "good value for most cases".

Anyway, we need to fix calculation of a maximum bucket size first.

Best regards, Ilya Maximets.
Tonghao Zhang April 7, 2021, 11:42 a.m. UTC | #10
On Fri, Apr 2, 2021 at 3:04 AM Jean Tourrilhes <jean.tourrilhes@hpe.com> wrote:
>
> On Thu, Apr 01, 2021 at 07:53:25PM +0800, Tonghao Zhang wrote:
> >
> > Hi Ben, Ilya
> > Try to explain this patch again. Now OvS has supported the burst_size,
> >  as one user case,
> > if users don't use the burst_size feature, we should set burst_size to
> > rate or 0. This patch set this to 0.
>
>         '0' is definitely not an "implementation defined optimal
> value" as the spec requires. Actually, most token buckets
> implementations do not work or work very poorly with a bucket size of
> zero.
Hi
I want to know what tools can we use to measure the meters "rate" and
"bust_size" is optimal value,
I use the vnstat tool, and not find the ovs meter doesn't work with or
without burst_size.

>         The first hit on a Google search :
> https://www.juniper.net/documentation/us/en/software/junos/routing-policy/topics/concept/policer-mx-m120-m320-burstsize-determining.html
>         I don't fully agree with their recommendations, 5ms is way too
> large for high speed networks, but they expose the problem properly.
>
> > As Ilya said, we should check the OFPMF13_BURST in userspace datapath,
> > I think it's right.
>
>         I don't have enough context to comment on the patch and
> I don't know where that flag should be tested. It would seem that all
> datapaths will suffer from the same issue, you need to figure out an
> optimal burst_size for user space *and* for kernel space if none is
> given via OpenFlow.
>         I have a strong suspicion that the patch does not do a great
> job if OFPMF13_BURST is not set, but I may be wrong.
>
>         Regards,
>
>         Jean
Tonghao Zhang April 7, 2021, 11:45 a.m. UTC | #11
On Fri, Apr 2, 2021 at 4:35 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 3/31/21 12:29 AM, Jean Tourrilhes wrote:
> > On Tue, Mar 30, 2021 at 02:27:11PM -0700, Ben Pfaff wrote:
> >> On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
> >>>
> >>> OpenFlow spec is a bit loose in definition of what should
> >>> be behavior if burst is not set:
> >>> """
> >>> If the flag OFPMF_BURST is not set the burst_size values from meter
> >>> bands are ignored, and if the meter implementation uses a burst value,
> >>> this burst value must be set to an implementation defined optimal value.
> >>> """
> >>>
> >>> In our case, historically, "implementation defined optimal value" was
> >>> value equal to rate.  I have no idea why, but it's hard to argue with
> >>> it since the spec gives a great freedom to choose.
> >>>
> >>> Actually, the "burst" itself as a term makes very little sense to me.
> >>> It's defined by the spec as:
> >>> """
> >>> It defines the granularity of the meter band, for all packet or byte
> >>> bursts whose length is greater than burst value, the meter rate will
> >>> always be strictly enforced.
> >>> """
> >>>
> >>> But what is the burst?  How the implementation should define which
> >>> packets are in the burst and which are from the next one?
> >>>
> >>> Current implementation just assumes that bursts are measured per second.
> >>> But the rate is measured per second too.  So, burst and rate is
> >>> essentially the same thing and implementations just sums them together
> >>> to get the bucket size.  So, I do not understand why "burst" and
> >>> "burst_size" exist at all.  Why not just set the rate a bit higher?
> >>>
> >>> Ben, can you shed some light on this?  What was the original idea
> >>> behind the meter burst?  Or maybe I'm missing something?
> >
> >       I don't understand how you can confuse a rate and a size. The
> > OpenFlow spec clearly says it's in kilobits or packets (not per
> > seconds).
> >       A basic token bucket has only two parameters, the commited
> > rate and the burst size (i.e. maximum number of tokens in the
> > bucket). The spec reflect that in a generic way to avoid mandating an
> > implementation.
>
> Thanks, Jean.
>
> My problem, actually, was that I started from the implementation in the
> kernel datapath and it looked super weird.  Especially, this part:
>   https://elixir.bootlin.com/linux/latest/source/net/openvswitch/meter.c#L644
> Than I tried to find the truth inside the spec, but it doesn't define
> the implementation on purpose.  So the only option I had is to guess
> how this suppose to work.  Was it 11pm or something else, but my guessing
> engine didn't came up with anything that might make sense. :)
>
> >       Burst rate is only defined for more fancy rate limiters, such
> > as two colors rate limiters. In this case, you also have two burst
> > size, one for each token bucket. The OpenFlow spec does not support
> > those extra parameters (as of version 1.5.1).
> >       For Linux 'police' filter : rate == rate ; burst_size == burst
> >       For Linux 'htb' qdisc : rate == rate ; burst_size == burst ;
> > ceil and cburst are not supported.
>
> This totally makes sense.  OTOH, Implementation inside both datapaths
> doesn't.
>
> Thanks for pushing me in right direction.
>
> For the implementations: I think, they needs to be reworked.
> At least, we need to get rid of 'rate' in a calculation of a maximum
> bucket size.  It should not depend on rate, only on a burst size.
> i.e. instead of:
>   max_bucket_size = (band->burst_size + band->rate) * 1000;
> there should be:
>   max_bucket_size = band->burst_size * 1000;
>
> This way implementations will have at least a bit of sense.
>
> Summing burst size with rate is like summing apples with oranges.
> And that is what misled me.
>
> About having a value for a burst size being numerically equal to the
> configured rate:  this looks like some kind of estimated value, but
> it's really hard to argue with it, because research is needed to
> define the "good value for most cases".
>
> Anyway, we need to fix calculation of a maximum bucket size first.
Hi Ilya
Would you plan to send a patch to fix it ? I have no idea to fix it.
> Best regards, Ilya Maximets.
Ilya Maximets April 7, 2021, 11:48 a.m. UTC | #12
On 4/7/21 1:45 PM, Tonghao Zhang wrote:
> On Fri, Apr 2, 2021 at 4:35 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> On 3/31/21 12:29 AM, Jean Tourrilhes wrote:
>>> On Tue, Mar 30, 2021 at 02:27:11PM -0700, Ben Pfaff wrote:
>>>> On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
>>>>>
>>>>> OpenFlow spec is a bit loose in definition of what should
>>>>> be behavior if burst is not set:
>>>>> """
>>>>> If the flag OFPMF_BURST is not set the burst_size values from meter
>>>>> bands are ignored, and if the meter implementation uses a burst value,
>>>>> this burst value must be set to an implementation defined optimal value.
>>>>> """
>>>>>
>>>>> In our case, historically, "implementation defined optimal value" was
>>>>> value equal to rate.  I have no idea why, but it's hard to argue with
>>>>> it since the spec gives a great freedom to choose.
>>>>>
>>>>> Actually, the "burst" itself as a term makes very little sense to me.
>>>>> It's defined by the spec as:
>>>>> """
>>>>> It defines the granularity of the meter band, for all packet or byte
>>>>> bursts whose length is greater than burst value, the meter rate will
>>>>> always be strictly enforced.
>>>>> """
>>>>>
>>>>> But what is the burst?  How the implementation should define which
>>>>> packets are in the burst and which are from the next one?
>>>>>
>>>>> Current implementation just assumes that bursts are measured per second.
>>>>> But the rate is measured per second too.  So, burst and rate is
>>>>> essentially the same thing and implementations just sums them together
>>>>> to get the bucket size.  So, I do not understand why "burst" and
>>>>> "burst_size" exist at all.  Why not just set the rate a bit higher?
>>>>>
>>>>> Ben, can you shed some light on this?  What was the original idea
>>>>> behind the meter burst?  Or maybe I'm missing something?
>>>
>>>       I don't understand how you can confuse a rate and a size. The
>>> OpenFlow spec clearly says it's in kilobits or packets (not per
>>> seconds).
>>>       A basic token bucket has only two parameters, the commited
>>> rate and the burst size (i.e. maximum number of tokens in the
>>> bucket). The spec reflect that in a generic way to avoid mandating an
>>> implementation.
>>
>> Thanks, Jean.
>>
>> My problem, actually, was that I started from the implementation in the
>> kernel datapath and it looked super weird.  Especially, this part:
>>   https://elixir.bootlin.com/linux/latest/source/net/openvswitch/meter.c#L644
>> Than I tried to find the truth inside the spec, but it doesn't define
>> the implementation on purpose.  So the only option I had is to guess
>> how this suppose to work.  Was it 11pm or something else, but my guessing
>> engine didn't came up with anything that might make sense. :)
>>
>>>       Burst rate is only defined for more fancy rate limiters, such
>>> as two colors rate limiters. In this case, you also have two burst
>>> size, one for each token bucket. The OpenFlow spec does not support
>>> those extra parameters (as of version 1.5.1).
>>>       For Linux 'police' filter : rate == rate ; burst_size == burst
>>>       For Linux 'htb' qdisc : rate == rate ; burst_size == burst ;
>>> ceil and cburst are not supported.
>>
>> This totally makes sense.  OTOH, Implementation inside both datapaths
>> doesn't.
>>
>> Thanks for pushing me in right direction.
>>
>> For the implementations: I think, they needs to be reworked.
>> At least, we need to get rid of 'rate' in a calculation of a maximum
>> bucket size.  It should not depend on rate, only on a burst size.
>> i.e. instead of:
>>   max_bucket_size = (band->burst_size + band->rate) * 1000;
>> there should be:
>>   max_bucket_size = band->burst_size * 1000;
>>
>> This way implementations will have at least a bit of sense.
>>
>> Summing burst size with rate is like summing apples with oranges.
>> And that is what misled me.
>>
>> About having a value for a burst size being numerically equal to the
>> configured rate:  this looks like some kind of estimated value, but
>> it's really hard to argue with it, because research is needed to
>> define the "good value for most cases".
>>
>> Anyway, we need to fix calculation of a maximum bucket size first.
> Hi Ilya
> Would you plan to send a patch to fix it ? I have no idea to fix it.

OK.  I'll prepare and send a patch later this week.
For both userspace and kernel.
diff mbox series

Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e53ed31b906c..e0a44abebda6 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -6327,11 +6327,6 @@  dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id meter_id,
     for (i = 0; i < config->n_bands; ++i) {
         uint32_t band_max_delta_t;
 
-        /* Set burst size to a workable value if none specified. */
-        if (config->bands[i].burst_size == 0) {
-            config->bands[i].burst_size = config->bands[i].rate;
-        }
-
         meter->bands[i].rate = config->bands[i].rate;
         meter->bands[i].burst_size = config->bands[i].burst_size;
         /* Start with a full bucket. */
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index ceb56c6851c6..f3db0c6802b9 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -3761,7 +3761,7 @@  dpif_netlink_meter_set__(struct dpif *dpif_, ofproto_meter_id meter_id,
         nl_msg_put_u32(&buf, OVS_BAND_ATTR_RATE, band->rate);
         nl_msg_put_u32(&buf, OVS_BAND_ATTR_BURST,
                        config->flags & OFPMF13_BURST ?
-                       band->burst_size : band->rate);
+                       band->burst_size : 0);
         nl_msg_end_nested(&buf, band_offset);
     }
     nl_msg_end_nested(&buf, bands_offset);
diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
index b2ff69a01ee6..b18297983007 100644
--- a/tests/dpif-netdev.at
+++ b/tests/dpif-netdev.at
@@ -372,6 +372,109 @@  recirc_id(0),in_port(8),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), a
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([dpif-netdev - meters without burst_size])
+# Create br0 with interfaces p1 and p7
+#    and br1 with interfaces p2 and p8
+# with p1 and p2 connected via unix domain socket
+OVS_VSWITCHD_START(
+  [add-port br0 p1 -- set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p0.sock ofport_request=1 -- \
+   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
+   add-br br1 -- \
+   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
+   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
+                  fail-mode=secure -- \
+   add-port br1 p2 -- set interface p2 type=dummy options:stream=unix:$OVS_RUNDIR/p0.sock ofport_request=2 -- \
+   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
+AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
+
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps stats bands=type=drop rate=1'])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=2 kbps stats bands=type=drop rate=1'])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 'in_port=1 action=meter:1,7'])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 'in_port=7 action=meter:2,1'])
+AT_CHECK([ovs-ofctl add-flow br1 'in_port=2 action=8'])
+AT_CHECK([ovs-ofctl add-flow br1 'in_port=8 action=2'])
+ovs-appctl time/stop
+
+AT_CHECK([ovs-ofctl -O OpenFlow13 dump-meters br0], [0], [dnl
+OFPST_METER_CONFIG reply (OF1.3) (xid=0x2):
+meter=1 pktps stats bands=
+type=drop rate=1
+
+meter=2 kbps stats bands=
+type=drop rate=1
+])
+
+ovs-appctl time/warp 5000
+for i in `seq 1 5`; do
+    AT_CHECK([ovs-appctl netdev-dummy/receive p7 \
+            'in_port(7),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
+    AT_CHECK([ovs-appctl netdev-dummy/receive p8 \
+            'in_port(8),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
+done
+
+sleep 1  # wait for forwarders process packets
+
+# Meter 1 is measuring packets, allowing one packet per second,
+# so 4 out of 5 packets should hit the drop band.
+#
+# Meter 2 is measuring kbps (== 1000 bits). 2 packets
+# (120 bytes == 960 bits) pass, but the last 3 packets should hit the drop band.
+AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 | strip_timers], [0], [dnl
+OFPST_METER reply (OF1.3) (xid=0x2):
+meter:1 flow_count:1 packet_in_count:5 byte_in_count:300 duration:0.0s bands:
+0: packet_count:4 byte_count:240
+
+meter:2 flow_count:1 packet_in_count:5 byte_in_count:300 duration:0.0s bands:
+0: packet_count:3 byte_count:180
+])
+
+# Advance time by 1/2 second
+ovs-appctl time/warp 500
+
+for i in `seq 1 5`; do
+    AT_CHECK([ovs-appctl netdev-dummy/receive p7 \
+            'in_port(7),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
+
+    AT_CHECK([ovs-appctl netdev-dummy/receive p8 \
+            'in_port(8),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' --len 60])
+
+done
+
+sleep 1  # wait for forwarders process packets
+
+# Meter 1 is measuring packets, allowing one packet per second with
+# bursts of one packet, so all 5 of the new packets should hit the drop
+# band.
+# Meter 2 is measuring kbps (== 1000 bits). After 500ms
+# there should be space for 40 + 500 bits, so one new 60 byte (480 bit) packet
+# should pass, remaining 4 should hit the drop band.
+AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 | strip_timers], [0], [dnl
+OFPST_METER reply (OF1.3) (xid=0x2):
+meter:1 flow_count:1 packet_in_count:10 byte_in_count:600 duration:0.0s bands:
+0: packet_count:9 byte_count:540
+
+meter:2 flow_count:1 packet_in_count:10 byte_in_count:600 duration:0.0s bands:
+0: packet_count:7 byte_count:420
+])
+
+ovs-appctl time/warp 5000
+
+AT_CHECK([
+ovs-appctl coverage/read-counter datapath_drop_meter
+], [0], [dnl
+16
+])
+
+AT_CHECK([cat ovs-vswitchd.log | filter_flow_install | strip_xout_keep_actions], [0], [dnl
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:meter(0),7
+recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:8
+recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:meter(1),1
+recirc_id(0),in_port(8),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:2
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
 m4_define([DPIF_NETDEV_FLOW_HW_OFFLOAD],
   [AT_SETUP([dpif-netdev - partial hw offload - $1])
    OVS_VSWITCHD_START(