diff mbox series

[net] selftests: forwarding: vxlan_bridge_1d: fix tos value

Message ID 20200213094054.27993-1-liuhangbin@gmail.com
State Accepted
Delegated to: David Miller
Headers show
Series [net] selftests: forwarding: vxlan_bridge_1d: fix tos value | expand

Commit Message

Hangbin Liu Feb. 13, 2020, 9:40 a.m. UTC
After commit 71130f29979c ("vxlan: fix tos value before xmit") we start
strict vxlan xmit tos value by RT_TOS(), which limits the tos value less
than 0x1E. With current value 0x40 the test will failed with "v1: Expected
to capture 10 packets, got 0". So let's choose a smaller tos value for
testing.

Fixes: d417ecf533fe ("selftests: forwarding: vxlan_bridge_1d: Add a TOS test")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Petr Machata Feb. 13, 2020, 12:52 p.m. UTC | #1
Hangbin Liu <liuhangbin@gmail.com> writes:

> After commit 71130f29979c ("vxlan: fix tos value before xmit") we start
> strict vxlan xmit tos value by RT_TOS(), which limits the tos value less

I don't understand how it is OK to slice the TOS field like this. It
could contain a DSCP value, which will be mangled.

> than 0x1E. With current value 0x40 the test will failed with "v1: Expected
> to capture 10 packets, got 0". So let's choose a smaller tos value for
> testing.
>
> Fixes: d417ecf533fe ("selftests: forwarding: vxlan_bridge_1d: Add a TOS test")
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
>  tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
> index bb10e33690b2..353613fc1947 100755
> --- a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
> +++ b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
> @@ -516,9 +516,9 @@ test_tos()
>  	RET=0
>
>  	tc filter add dev v1 egress pref 77 prot ip \
> -		flower ip_tos 0x40 action pass
> -	vxlan_ping_test $h1 192.0.2.3 "-Q 0x40" v1 egress 77 10
> -	vxlan_ping_test $h1 192.0.2.3 "-Q 0x30" v1 egress 77 0
> +		flower ip_tos 0x11 action pass
> +	vxlan_ping_test $h1 192.0.2.3 "-Q 0x11" v1 egress 77 10
> +	vxlan_ping_test $h1 192.0.2.3 "-Q 0x12" v1 egress 77 0

0x11 and 0x12 set the ECN bits, I think it would be better to avoid
that. It works just as well with 0x14 and 0x18.

>  	tc filter del dev v1 egress pref 77 prot ip
>
>  	log_test "VXLAN: envelope TOS inheritance"
Hangbin Liu Feb. 14, 2020, 2:53 a.m. UTC | #2
On Thu, Feb 13, 2020 at 01:52:49PM +0100, Petr Machata wrote:
> 
> Hangbin Liu <liuhangbin@gmail.com> writes:
> 
> > After commit 71130f29979c ("vxlan: fix tos value before xmit") we start
> > strict vxlan xmit tos value by RT_TOS(), which limits the tos value less
> 
> I don't understand how it is OK to slice the TOS field like this. It
> could contain a DSCP value, which will be mangled.

Thanks for this remind. I re-checked the tos definition and found a summary
from Peter Dawson[1].

IPv4/6 Header:0 |0 1 2 3 |0 1 2 3 |0 1 2 3 |0 1 2 3 |
RFC2460(IPv6)   |Version | Traffic Class   |        |
RFC2474(IPv6)   |Version | DSCP        |ECN|        |
RFC2474(IPv4)   |Version |  IHL   |    DSCP     |ECN|
RFC1349(IPv4)   |Version |  IHL   | PREC |  TOS   |X|
RFC791 (IPv4)   |Version |  IHL   |      TOS        |

According to this I think our current IPTOS_TOS_MASK should be updated to 0xFC
based on RFC2474. But I'm not sure if there will have compatibility issue.

What do you think?

> >  	tc filter add dev v1 egress pref 77 prot ip \
> > -		flower ip_tos 0x40 action pass
> > -	vxlan_ping_test $h1 192.0.2.3 "-Q 0x40" v1 egress 77 10
> > -	vxlan_ping_test $h1 192.0.2.3 "-Q 0x30" v1 egress 77 0
> > +		flower ip_tos 0x11 action pass
> > +	vxlan_ping_test $h1 192.0.2.3 "-Q 0x11" v1 egress 77 10
> > +	vxlan_ping_test $h1 192.0.2.3 "-Q 0x12" v1 egress 77 0
> 
> 0x11 and 0x12 set the ECN bits, I think it would be better to avoid
> that. It works just as well with 0x14 and 0x18.

Thanks, I will update it.

[1] https://lore.kernel.org/patchwork/patch/799698/#992992

Regards
Hangbin
Petr Machata Feb. 14, 2020, 10:54 a.m. UTC | #3
Hangbin Liu <liuhangbin@gmail.com> writes:

> On Thu, Feb 13, 2020 at 01:52:49PM +0100, Petr Machata wrote:
>> 
>> Hangbin Liu <liuhangbin@gmail.com> writes:
>> 
>> > After commit 71130f29979c ("vxlan: fix tos value before xmit") we start
>> > strict vxlan xmit tos value by RT_TOS(), which limits the tos value less
>> 
>> I don't understand how it is OK to slice the TOS field like this. It
>> could contain a DSCP value, which will be mangled.
>
> Thanks for this remind. I re-checked the tos definition and found a summary
> from Peter Dawson[1].
>
> IPv4/6 Header:0 |0 1 2 3 |0 1 2 3 |0 1 2 3 |0 1 2 3 |
> RFC2460(IPv6)   |Version | Traffic Class   |        |
> RFC2474(IPv6)   |Version | DSCP        |ECN|        |
> RFC2474(IPv4)   |Version |  IHL   |    DSCP     |ECN|
> RFC1349(IPv4)   |Version |  IHL   | PREC |  TOS   |X|
> RFC791 (IPv4)   |Version |  IHL   |      TOS        |
>
> According to this I think our current IPTOS_TOS_MASK should be updated to 0xFC
> based on RFC2474. But I'm not sure if there will have compatibility issue.
> What do you think?

Looking at the various uses of RT_TOS, it looks like they tend to be
used in tunneling and routing code. I think that in both cases it makes
sense to convert to 0xfc. But I'm not ready to vouch for this :)

What is the problem that commit 71130f29979c aims to solve? It's not
clear to me from the commit message. What issues arise if the TOS is
copied as is?

>
>> >  	tc filter add dev v1 egress pref 77 prot ip \
>> > -		flower ip_tos 0x40 action pass
>> > -	vxlan_ping_test $h1 192.0.2.3 "-Q 0x40" v1 egress 77 10
>> > -	vxlan_ping_test $h1 192.0.2.3 "-Q 0x30" v1 egress 77 0
>> > +		flower ip_tos 0x11 action pass
>> > +	vxlan_ping_test $h1 192.0.2.3 "-Q 0x11" v1 egress 77 10
>> > +	vxlan_ping_test $h1 192.0.2.3 "-Q 0x12" v1 egress 77 0
>> 
>> 0x11 and 0x12 set the ECN bits, I think it would be better to avoid
>> that. It works just as well with 0x14 and 0x18.
>
> Thanks, I will update it.
>
> [1] https://lore.kernel.org/patchwork/patch/799698/#992992
>
> Regards
> Hangbin
David Miller Feb. 17, 2020, 2:57 a.m. UTC | #4
From: Hangbin Liu <liuhangbin@gmail.com>
Date: Thu, 13 Feb 2020 17:40:54 +0800

> After commit 71130f29979c ("vxlan: fix tos value before xmit") we start
> strict vxlan xmit tos value by RT_TOS(), which limits the tos value less
> than 0x1E. With current value 0x40 the test will failed with "v1: Expected
> to capture 10 packets, got 0". So let's choose a smaller tos value for
> testing.
> 
> Fixes: d417ecf533fe ("selftests: forwarding: vxlan_bridge_1d: Add a TOS test")
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Applied.
Hangbin Liu Feb. 17, 2020, 2:59 a.m. UTC | #5
On Fri, Feb 14, 2020 at 11:54:09AM +0100, Petr Machata wrote:
> >> > After commit 71130f29979c ("vxlan: fix tos value before xmit") we start
> >> > strict vxlan xmit tos value by RT_TOS(), which limits the tos value less
> >> 
> >> I don't understand how it is OK to slice the TOS field like this. It
> >> could contain a DSCP value, which will be mangled.
> >
> > Thanks for this remind. I re-checked the tos definition and found a summary
> > from Peter Dawson[1].
> >
> > IPv4/6 Header:0 |0 1 2 3 |0 1 2 3 |0 1 2 3 |0 1 2 3 |
> > RFC2460(IPv6)   |Version | Traffic Class   |        |
> > RFC2474(IPv6)   |Version | DSCP        |ECN|        |
> > RFC2474(IPv4)   |Version |  IHL   |    DSCP     |ECN|
> > RFC1349(IPv4)   |Version |  IHL   | PREC |  TOS   |X|
> > RFC791 (IPv4)   |Version |  IHL   |      TOS        |
> >
> > According to this I think our current IPTOS_TOS_MASK should be updated to 0xFC
> > based on RFC2474. But I'm not sure if there will have compatibility issue.
> > What do you think?
> 
> Looking at the various uses of RT_TOS, it looks like they tend to be
> used in tunneling and routing code. I think that in both cases it makes
> sense to convert to 0xfc. But I'm not ready to vouch for this :)

Yes, I also could not... Maybe David or Daniel could help give some comments?

> 
> What is the problem that commit 71130f29979c aims to solve? It's not
> clear to me from the commit message. What issues arise if the TOS is
> copied as is?

As the commit said, we should not use config tos directly. We should remove
the precedence field based on RFC1349 or ENC field based on RFC2474.

Thanks
Hangbin
Petr Machata Feb. 17, 2020, 10:40 a.m. UTC | #6
Hangbin Liu <liuhangbin@gmail.com> writes:

> On Fri, Feb 14, 2020 at 11:54:09AM +0100, Petr Machata wrote:
>> >> > After commit 71130f29979c ("vxlan: fix tos value before xmit") we start
>> >> > strict vxlan xmit tos value by RT_TOS(), which limits the tos value less
>> >>
>> >> I don't understand how it is OK to slice the TOS field like this. It
>> >> could contain a DSCP value, which will be mangled.
>> >
>> > Thanks for this remind. I re-checked the tos definition and found a summary
>> > from Peter Dawson[1].
>> >
>> > IPv4/6 Header:0 |0 1 2 3 |0 1 2 3 |0 1 2 3 |0 1 2 3 |
>> > RFC2460(IPv6)   |Version | Traffic Class   |        |
>> > RFC2474(IPv6)   |Version | DSCP        |ECN|        |
>> > RFC2474(IPv4)   |Version |  IHL   |    DSCP     |ECN|
>> > RFC1349(IPv4)   |Version |  IHL   | PREC |  TOS   |X|
>> > RFC791 (IPv4)   |Version |  IHL   |      TOS        |
>> >
>> > According to this I think our current IPTOS_TOS_MASK should be updated to 0xFC
>> > based on RFC2474. But I'm not sure if there will have compatibility issue.
>> > What do you think?
>>
>> Looking at the various uses of RT_TOS, it looks like they tend to be
>> used in tunneling and routing code. I think that in both cases it makes
>> sense to convert to 0xfc. But I'm not ready to vouch for this :)
>
> Yes, I also could not... Maybe David or Daniel could help give some comments?
>
>>
>> What is the problem that commit 71130f29979c aims to solve? It's not
>> clear to me from the commit message. What issues arise if the TOS is
>> copied as is?
>
> As the commit said, we should not use config tos directly. We should remove
> the precedence field based on RFC1349 or ENC field based on RFC2474.

Well, RFC1349 didn't know about DSCP. I do not think it is possible to
conform to both RFC1349 and RFC2474 at the same time.

RFC2474 states that "DS field [...] is intended to supersede the
existing definitions of the IPv4 TOS octet [RFC791] and the IPv6 Traffic
Class octet [IPv6]". So the field should be assumed to contain DSCP from
that point on. In my opinion, that makes commit 71130f29979c incorrect.

(And other similar uses of RT_TOS in other tunneling devices likewise.)
Hangbin Liu Feb. 18, 2020, 2:05 a.m. UTC | #7
On Mon, Feb 17, 2020 at 11:40:13AM +0100, Petr Machata wrote:
> >> > Thanks for this remind. I re-checked the tos definition and found a summary
> >> > from Peter Dawson[1].
> >> >
> >> > IPv4/6 Header:0 |0 1 2 3 |0 1 2 3 |0 1 2 3 |0 1 2 3 |
> >> > RFC2460(IPv6)   |Version | Traffic Class   |        |
> >> > RFC2474(IPv6)   |Version | DSCP        |ECN|        |
> >> > RFC2474(IPv4)   |Version |  IHL   |    DSCP     |ECN|
> >> > RFC1349(IPv4)   |Version |  IHL   | PREC |  TOS   |X|
> >> > RFC791 (IPv4)   |Version |  IHL   |      TOS        |
> >
> > As the commit said, we should not use config tos directly. We should remove
> > the precedence field based on RFC1349 or ENC field based on RFC2474.
> 
> Well, RFC1349 didn't know about DSCP. I do not think it is possible to
> conform to both RFC1349 and RFC2474 at the same time.

No, we can't. I mean no mater based on RFC1349 or RFC2474, we should not use
the config tos directly.

> 
> RFC2474 states that "DS field [...] is intended to supersede the
> existing definitions of the IPv4 TOS octet [RFC791] and the IPv6 Traffic
> Class octet [IPv6]". So the field should be assumed to contain DSCP from
> that point on. In my opinion, that makes commit 71130f29979c incorrect.
> 
> (And other similar uses of RT_TOS in other tunneling devices likewise.)

Yes, that's also what I mean, should we update RT_TOS to match
RFC2474?

Thanks
Hangbin
David Miller Feb. 18, 2020, 3:01 a.m. UTC | #8
From: Hangbin Liu <liuhangbin@gmail.com>
Date: Tue, 18 Feb 2020 10:05:08 +0800

> On Mon, Feb 17, 2020 at 11:40:13AM +0100, Petr Machata wrote:
>> RFC2474 states that "DS field [...] is intended to supersede the
>> existing definitions of the IPv4 TOS octet [RFC791] and the IPv6 Traffic
>> Class octet [IPv6]". So the field should be assumed to contain DSCP from
>> that point on. In my opinion, that makes commit 71130f29979c incorrect.
>> 
>> (And other similar uses of RT_TOS in other tunneling devices likewise.)
> 
> Yes, that's also what I mean, should we update RT_TOS to match
> RFC2474?

The RT_TOS() value elides the two lowest bits so that we can store other
pieces of binary state into those two lower bits.

So you can't just blindly change the RT_TOS() definition without breaking
a bunch of things.
Hangbin Liu Feb. 18, 2020, 8 a.m. UTC | #9
Hi David,

Thanks for the comments. Please see the reply below.
On Mon, Feb 17, 2020 at 07:01:18PM -0800, David Miller wrote:
> > On Mon, Feb 17, 2020 at 11:40:13AM +0100, Petr Machata wrote:
> >> RFC2474 states that "DS field [...] is intended to supersede the
> >> existing definitions of the IPv4 TOS octet [RFC791] and the IPv6 Traffic
> >> Class octet [IPv6]". So the field should be assumed to contain DSCP from
> >> that point on. In my opinion, that makes commit 71130f29979c incorrect.
> >> 
> >> (And other similar uses of RT_TOS in other tunneling devices likewise.)
> > 
> > Yes, that's also what I mean, should we update RT_TOS to match
> > RFC2474?
> 
> The RT_TOS() value elides the two lowest bits so that we can store other
> pieces of binary state into those two lower bits.


In my understanding, RT_TOS() only omit the lowest bits and first 3 bit, as
it defined like:
#define RT_TOS(tos)     ((tos)&IPTOS_TOS_MASK)
#define IPTOS_TOS_MASK          0x1E

> IPv4/6 Header:0 |0 1 2 3 |0 1 2 3 |0 1 2 3 |0 1 2 3 |
> RFC2474(IPv4)   |Version |  IHL   |    DSCP     |ECN|
> RFC1349(IPv4)   |Version |  IHL   | PREC |  TOS   |X|

This looks that it's based on rfc1349. At the same time, we have function
INET_ECN_encapsulate() to respect of rfc2474, which elides the two lowest
bits to stor ECN.

But this two has some conflicts. RT_TOS() omit config tos with 3 PREC bits
and 1 ECN bit. INET_ECN_encapsulate() then omit 1 more ECN bits and set
new ECN based on inner header.

Petr mentioned(and I agree) that this would make us lost 3 DSCP bit info.
So what I though is we should update IPTOS_TOS_MASK to 0xFC, so we can
catch all DSCP field from config tos and leave the two lowest ECN bit
to INET_ECN_encapsulate().

> 
> So you can't just blindly change the RT_TOS() definition without breaking
> a bunch of things.

Yes, RT_TOS() was used in many places. I'm also afraid it may break
some old configs. But then I found that the current IPTOS_TOS_MASK
is more strict. It only take 4 bits(and 1 bit will be overwrite by ECN bit
later) from tos config. If we update it to 0xFC, it would take more
bits.

So what do you think?

Thanks
Hangbin
diff mbox series

Patch

diff --git a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
index bb10e33690b2..353613fc1947 100755
--- a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
+++ b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
@@ -516,9 +516,9 @@  test_tos()
 	RET=0
 
 	tc filter add dev v1 egress pref 77 prot ip \
-		flower ip_tos 0x40 action pass
-	vxlan_ping_test $h1 192.0.2.3 "-Q 0x40" v1 egress 77 10
-	vxlan_ping_test $h1 192.0.2.3 "-Q 0x30" v1 egress 77 0
+		flower ip_tos 0x11 action pass
+	vxlan_ping_test $h1 192.0.2.3 "-Q 0x11" v1 egress 77 10
+	vxlan_ping_test $h1 192.0.2.3 "-Q 0x12" v1 egress 77 0
 	tc filter del dev v1 egress pref 77 prot ip
 
 	log_test "VXLAN: envelope TOS inheritance"