diff mbox series

[ovs-dev] flow: Allow matches on nw_proto also for IPv6 later frags.

Message ID 164885612346.1396061.15065322158940269369.stgit@fed.void
State Superseded
Headers show
Series [ovs-dev] flow: Allow matches on nw_proto also for IPv6 later frags. | expand

Checks

Context Check Description
ovsrobot/apply-robot fail apply and check: fail
ovsrobot/intel-ovs-compilation fail test: fail

Commit Message

Paolo Valerio April 1, 2022, 11:35 p.m. UTC
The next header contained in the last extension header of the IPv6
later frags still contain the information of the upper layer protocol
number.

Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
frags as well by processing later frags and storing the nw_proto
information.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
---
 lib/flow.c            |    3 ---
 tests/ofproto-dpif.at |   24 ++++++++++++++++++++++++
 2 files changed, 24 insertions(+), 3 deletions(-)

 AT_SETUP([ofproto-dpif - fragment handling - actions])
 OVS_VSWITCHD_START
 add_of_ports br0 1 2 3 4 5 6 90

Comments

Aaron Conole April 4, 2022, 3:58 p.m. UTC | #1
Paolo Valerio <pvalerio@redhat.com> writes:

> The next header contained in the last extension header of the IPv6
> later frags still contain the information of the upper layer protocol
> number.
>
> Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
> frags as well by processing later frags and storing the nw_proto
> information.
>
> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
> ---

Digging in, I could never find a reason why ipv6 was treated differently
in this manner, but I guess there could be cases where the header layout
could cause problems (see the top of the while() block).  I don't know
that it's a practical concern (other values are probably less common),
but it was the only thing I could come up with for why the match might
behave this way.  I haven't done enough mailing list archeology to have
a good idea.
Paolo Valerio April 4, 2022, 9:54 p.m. UTC | #2
Aaron Conole <aconole@redhat.com> writes:

> Paolo Valerio <pvalerio@redhat.com> writes:
>
>> The next header contained in the last extension header of the IPv6
>> later frags still contain the information of the upper layer protocol
>> number.
>>
>> Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
>> frags as well by processing later frags and storing the nw_proto
>> information.
>>
>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>> ---
>
> Digging in, I could never find a reason why ipv6 was treated differently
> in this manner, but I guess there could be cases where the header layout
> could cause problems (see the top of the while() block).  I don't know
> that it's a practical concern (other values are probably less common),
> but it was the only thing I could come up with for why the match might
> behave this way.  I haven't done enough mailing list archeology to have
> a good idea.

Guess this may probably be due to (rfc8200 section-4.5):

"The subsequent fragment packets are composed of:

   [...]

   (2)  A Fragment header containing:

           The Next Header value that identifies the first header
           after the Per-Fragment headers of the original packet.

           [...]"

where the "first header" in the original packet could be another
extension header or an upper-layer header, while the Fragment header is
the last EH for later fragments (according to section 4.5).

If I'm not missing something, it makes sense to drop the patch.
It's probably a good thing if we document the different behavior between
IPv4 and IPv6 for later frags, though.
Eelco Chaudron April 5, 2022, 9:04 a.m. UTC | #3
On 4 Apr 2022, at 23:54, Paolo Valerio wrote:

> Aaron Conole <aconole@redhat.com> writes:
>
>> Paolo Valerio <pvalerio@redhat.com> writes:
>>
>>> The next header contained in the last extension header of the IPv6
>>> later frags still contain the information of the upper layer protocol
>>> number.
>>>
>>> Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
>>> frags as well by processing later frags and storing the nw_proto
>>> information.
>>>
>>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>>> ---
>>
>> Digging in, I could never find a reason why ipv6 was treated differently
>> in this manner, but I guess there could be cases where the header layout
>> could cause problems (see the top of the while() block).  I don't know
>> that it's a practical concern (other values are probably less common),
>> but it was the only thing I could come up with for why the match might
>> behave this way.  I haven't done enough mailing list archeology to have
>> a good idea.
>
> Guess this may probably be due to (rfc8200 section-4.5):
>
> "The subsequent fragment packets are composed of:
>
>    [...]
>
>    (2)  A Fragment header containing:
>
>            The Next Header value that identifies the first header
>            after the Per-Fragment headers of the original packet.
>
>            [...]"
>
> where the "first header" in the original packet could be another
> extension header or an upper-layer header, while the Fragment header is
> the last EH for later fragments (according to section 4.5).
>
> If I'm not missing something, it makes sense to drop the patch.
> It's probably a good thing if we document the different behavior between
> IPv4 and IPv6 for later frags, though.

Alternatively, we could check if the next header is a non-extension header and if so, mark it as such.

The only problem might be the case where packets do have extension headers after the fragmentation header, and they might be dropped (handled differently).

Anyhow, whatever we decide, we should document it in both the documentation and code.

//Eelco
Aaron Conole April 5, 2022, 2:46 p.m. UTC | #4
Eelco Chaudron <echaudro@redhat.com> writes:

> On 4 Apr 2022, at 23:54, Paolo Valerio wrote:
>
>> Aaron Conole <aconole@redhat.com> writes:
>>
>>> Paolo Valerio <pvalerio@redhat.com> writes:
>>>
>>>> The next header contained in the last extension header of the IPv6
>>>> later frags still contain the information of the upper layer protocol
>>>> number.
>>>>
>>>> Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
>>>> frags as well by processing later frags and storing the nw_proto
>>>> information.
>>>>
>>>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>>>> ---
>>>
>>> Digging in, I could never find a reason why ipv6 was treated differently
>>> in this manner, but I guess there could be cases where the header layout
>>> could cause problems (see the top of the while() block).  I don't know
>>> that it's a practical concern (other values are probably less common),
>>> but it was the only thing I could come up with for why the match might
>>> behave this way.  I haven't done enough mailing list archeology to have
>>> a good idea.
>>
>> Guess this may probably be due to (rfc8200 section-4.5):
>>
>> "The subsequent fragment packets are composed of:
>>
>>    [...]
>>
>>    (2)  A Fragment header containing:
>>
>>            The Next Header value that identifies the first header
>>            after the Per-Fragment headers of the original packet.
>>
>>            [...]"
>>
>> where the "first header" in the original packet could be another
>> extension header or an upper-layer header, while the Fragment header is
>> the last EH for later fragments (according to section 4.5).

Thanks for digging in and finding this.

>> If I'm not missing something, it makes sense to drop the patch.
>> It's probably a good thing if we document the different behavior between
>> IPv4 and IPv6 for later frags, though.
>
> Alternatively, we could check if the next header is a non-extension header and if so, mark it as such.
>
> The only problem might be the case where packets do have extension
> headers after the fragmentation header, and they might be dropped
> (handled differently).

I think it might be best to keep the system as-is.  If we add this
"feature," users will still need to do matches like:

  table=X,ip6,ct_state=-trk actions=ct(...)

They can try to optimize the flow table looks by doing individual
selections, but it will be confusing to flow writers (and take an extra
bit of time to debug) when they get missed packets.  Even if we try to
make things convenient, there can be edge cases we miss - and today
systems like OVN already accommodate the nuance of v4 vs v6.

I'm in favor of a documentation patch instead that spells it out for
flow writers to clearly understand.  That way we don't have a custom
state machine to try and work around this limitation.

> Anyhow, whatever we decide, we should document it in both the documentation and code.
>
> //Eelco
Paolo Valerio April 5, 2022, 3:13 p.m. UTC | #5
Aaron Conole <aconole@redhat.com> writes:

> Eelco Chaudron <echaudro@redhat.com> writes:
>
>> On 4 Apr 2022, at 23:54, Paolo Valerio wrote:
>>
>>> Aaron Conole <aconole@redhat.com> writes:
>>>
>>>> Paolo Valerio <pvalerio@redhat.com> writes:
>>>>
>>>>> The next header contained in the last extension header of the IPv6
>>>>> later frags still contain the information of the upper layer protocol
>>>>> number.
>>>>>
>>>>> Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
>>>>> frags as well by processing later frags and storing the nw_proto
>>>>> information.
>>>>>
>>>>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>>>>> ---
>>>>
>>>> Digging in, I could never find a reason why ipv6 was treated differently
>>>> in this manner, but I guess there could be cases where the header layout
>>>> could cause problems (see the top of the while() block).  I don't know
>>>> that it's a practical concern (other values are probably less common),
>>>> but it was the only thing I could come up with for why the match might
>>>> behave this way.  I haven't done enough mailing list archeology to have
>>>> a good idea.
>>>
>>> Guess this may probably be due to (rfc8200 section-4.5):
>>>
>>> "The subsequent fragment packets are composed of:
>>>
>>>    [...]
>>>
>>>    (2)  A Fragment header containing:
>>>
>>>            The Next Header value that identifies the first header
>>>            after the Per-Fragment headers of the original packet.
>>>
>>>            [...]"
>>>
>>> where the "first header" in the original packet could be another
>>> extension header or an upper-layer header, while the Fragment header is
>>> the last EH for later fragments (according to section 4.5).
>
> Thanks for digging in and finding this.
>
>>> If I'm not missing something, it makes sense to drop the patch.
>>> It's probably a good thing if we document the different behavior between
>>> IPv4 and IPv6 for later frags, though.
>>
>> Alternatively, we could check if the next header is a non-extension header and if so, mark it as such.
>>
>> The only problem might be the case where packets do have extension
>> headers after the fragmentation header, and they might be dropped
>> (handled differently).
>
> I think it might be best to keep the system as-is.  If we add this
> "feature," users will still need to do matches like:
>
>   table=X,ip6,ct_state=-trk actions=ct(...)
>
> They can try to optimize the flow table looks by doing individual
> selections, but it will be confusing to flow writers (and take an extra
> bit of time to debug) when they get missed packets.  Even if we try to
> make things convenient, there can be edge cases we miss - and today
> systems like OVN already accommodate the nuance of v4 vs v6.
>

I agree. Although the proposal is reasonable and covers the majority of
the cases, it's probably better to stick with the current behavior.

> I'm in favor of a documentation patch instead that spells it out for
> flow writers to clearly understand.  That way we don't have a custom
> state machine to try and work around this limitation.
>

If we all agree, I'd proceed with that.

>> Anyhow, whatever we decide, we should document it in both the documentation and code.
>>
>> //Eelco
Eelco Chaudron April 6, 2022, 8:31 a.m. UTC | #6
On 5 Apr 2022, at 17:13, Paolo Valerio wrote:

> Aaron Conole <aconole@redhat.com> writes:
>
>> Eelco Chaudron <echaudro@redhat.com> writes:
>>
>>> On 4 Apr 2022, at 23:54, Paolo Valerio wrote:
>>>
>>>> Aaron Conole <aconole@redhat.com> writes:
>>>>
>>>>> Paolo Valerio <pvalerio@redhat.com> writes:
>>>>>
>>>>>> The next header contained in the last extension header of the IPv6
>>>>>> later frags still contain the information of the upper layer protocol
>>>>>> number.
>>>>>>
>>>>>> Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
>>>>>> frags as well by processing later frags and storing the nw_proto
>>>>>> information.
>>>>>>
>>>>>> Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
>>>>>> ---
>>>>>
>>>>> Digging in, I could never find a reason why ipv6 was treated differently
>>>>> in this manner, but I guess there could be cases where the header layout
>>>>> could cause problems (see the top of the while() block).  I don't know
>>>>> that it's a practical concern (other values are probably less common),
>>>>> but it was the only thing I could come up with for why the match might
>>>>> behave this way.  I haven't done enough mailing list archeology to have
>>>>> a good idea.
>>>>
>>>> Guess this may probably be due to (rfc8200 section-4.5):
>>>>
>>>> "The subsequent fragment packets are composed of:
>>>>
>>>>    [...]
>>>>
>>>>    (2)  A Fragment header containing:
>>>>
>>>>            The Next Header value that identifies the first header
>>>>            after the Per-Fragment headers of the original packet.
>>>>
>>>>            [...]"
>>>>
>>>> where the "first header" in the original packet could be another
>>>> extension header or an upper-layer header, while the Fragment header is
>>>> the last EH for later fragments (according to section 4.5).
>>
>> Thanks for digging in and finding this.
>>
>>>> If I'm not missing something, it makes sense to drop the patch.
>>>> It's probably a good thing if we document the different behavior between
>>>> IPv4 and IPv6 for later frags, though.
>>>
>>> Alternatively, we could check if the next header is a non-extension header and if so, mark it as such.
>>>
>>> The only problem might be the case where packets do have extension
>>> headers after the fragmentation header, and they might be dropped
>>> (handled differently).
>>
>> I think it might be best to keep the system as-is.  If we add this
>> "feature," users will still need to do matches like:
>>
>>   table=X,ip6,ct_state=-trk actions=ct(...)
>>
>> They can try to optimize the flow table looks by doing individual
>> selections, but it will be confusing to flow writers (and take an extra
>> bit of time to debug) when they get missed packets.  Even if we try to
>> make things convenient, there can be edge cases we miss - and today
>> systems like OVN already accommodate the nuance of v4 vs v6.
>>
>
> I agree. Although the proposal is reasonable and covers the majority of
> the cases, it's probably better to stick with the current behavior.
>
>> I'm in favor of a documentation patch instead that spells it out for
>> flow writers to clearly understand.  That way we don't have a custom
>> state machine to try and work around this limitation.
>>
>
> If we all agree, I'd proceed with that.

Sound good to me, see below ;)

>>> Anyhow, whatever we decide, we should document it in both the documentation and code.
>>> //Eelco
diff mbox series

Patch

diff --git a/lib/flow.c b/lib/flow.c
index dd523c889..0a7301ccf 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -534,13 +534,10 @@  parse_ipv6_ext_hdrs__(const void **datap, size_t *sizep, uint8_t *nw_proto,
                 return false;
             }
 
-            /* We only process the first fragment. */
             if ((*frag_hdr)->ip6f_offlg != htons(0)) {
                 *nw_frag = FLOW_NW_FRAG_ANY;
                 if (((*frag_hdr)->ip6f_offlg & IP6F_OFF_MASK) != htons(0)) {
                     *nw_frag |= FLOW_NW_FRAG_LATER;
-                    *nw_proto = IPPROTO_FRAGMENT;
-                    return true;
                 }
             }
         }
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 439761216..9a97797b7 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -4757,6 +4757,30 @@  recirc_id(0),in_port(90),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,fr
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([ofproto-dpif - fragment handling - IPv6 with match on L4 proto])
+AT_SKIP_IF([test $HAVE_IPV6 = no])
+OVS_VSWITCHD_START
+add_of_ports br0 1 2
+
+AT_DATA([flows.txt], [dnl
+priority=10,in_port=1,udp6,action=output:2
+priority=0,action=drop
+])
+AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
+
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 "f2ff00000002f2ff0000000186dd6007c0aa05b02c40fc000000000000000000000000000001fc0000000000000000000000000000021100000161f9332fa33b1f900608390e0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
 000000000000000000000000000000000000000000000000000000000000000000000000000"])
+
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 "f2ff00000002f2ff0000000186dd6007c0aa00682c40fc000000000000000000000000000001fc000000000000000000000000000002110005a861f9332f000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"])
+
+AT_CHECK([ovs-appctl dpctl/dump-flows filter="in_port=1"], [0], [dnl
+flow-dump from the main thread:
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x86dd),ipv6(proto=17,frag=first), packets:0, bytes:0, used:never, actions:2
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x86dd),ipv6(proto=17,frag=later), packets:0, bytes:0, used:never, actions:2
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+