diff mbox series

[ovs-dev,ovs-dev,v4] dpif-netdev: fix dpif_netdev_flow_put

Message ID 20230809030231.347159-1-hepeng.0320@bytedance.com
State Changes Requested
Headers show
Series [ovs-dev,ovs-dev,v4] dpif-netdev: fix dpif_netdev_flow_put | expand

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Peng He Aug. 9, 2023, 3:02 a.m. UTC
OVS allows overlapping megaflows, as long as the actions of these
megaflows are equal. However, the current implementation of action
modification relies on flow_lookup instead of ufid, this could result
in looking up a wrong megaflow and make the ukeys and megaflows inconsistent

Just like the test case in the patch, at first we have a rule with the
prefix:

10.1.2.0/24

and we will get a megaflow with prefixes 10.1.2.2/24 when a packet with IP
10.1.2.2 is received.

Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep the
10.1.2.2/24 megaflow and just changes its action instead of extending
the prefix into 10.1.2.2/16.

then suppose we have a 10.1.0.2 packet, since it misses the megaflow,
this time, we will have an overlapping megaflow with the right prefix:
10.1.0.2/16

now we have two megaflows:
10.1.2.2/24
10.1.0.2/16

last, suppose we have changed the ruleset again. The revalidator this
time still decides to change the actions of both megaflows instead of
deleting them.

The dpif_netdev_flow_put will search the megaflow to modify with unmasked
keys, however it might lookup the wrong megaflow as the key 10.1.2.2 matches
both 10.1.2.2/24 and 10.1.0.2/16!

This patch changes the megaflow lookup code in modification path into
relying the ufid to find the correct megaflow instead of key lookup.

Signed-off-by: Peng He <hepeng.0320@bytedance.com>
---
 lib/dpif-netdev.c | 45 ++++++++++++++++++++++++++++++---------------
 tests/pmd.at      | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+), 15 deletions(-)

Comments

0-day Robot Aug. 9, 2023, 3:19 a.m. UTC | #1
Bleep bloop.  Greetings Peng He, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Author Peng He <xnhp0320@gmail.com> needs to sign off.
WARNING: Unexpected sign-offs from developers who are not authors or co-authors or committers: Peng He <hepeng.0320@bytedance.com>
Lines checked: 181, Warnings: 1, Errors: 1


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Peng He Aug. 9, 2023, 6:13 a.m. UTC | #2
again, I forget the fixes tag ...


0-day Robot <robot@bytheb.org> 于2023年8月9日周三 11:19写道:

> Bleep bloop.  Greetings Peng He, I am a robot and I have tried out your
> patch.
> Thanks for your contribution.
>
> I encountered some error that I wasn't expecting.  See the details below.
>
>
> checkpatch:
> ERROR: Author Peng He <xnhp0320@gmail.com> needs to sign off.
> WARNING: Unexpected sign-offs from developers who are not authors or
> co-authors or committers: Peng He <hepeng.0320@bytedance.com>
> Lines checked: 181, Warnings: 1, Errors: 1
>
>
> Please check this out.  If you feel there has been an error, please email
> aconole@redhat.com
>
> Thanks,
> 0-day Robot
>
Ilya Maximets Aug. 11, 2023, 5:37 p.m. UTC | #3
On 8/9/23 05:02, Peng He wrote:
> OVS allows overlapping megaflows, as long as the actions of these
> megaflows are equal. However, the current implementation of action
> modification relies on flow_lookup instead of ufid, this could result
> in looking up a wrong megaflow and make the ukeys and megaflows inconsistent
> 
> Just like the test case in the patch, at first we have a rule with the
> prefix:
> 
> 10.1.2.0/24
> 
> and we will get a megaflow with prefixes 10.1.2.2/24 when a packet with IP
> 10.1.2.2 is received.
> 
> Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep the
> 10.1.2.2/24 megaflow and just changes its action instead of extending
> the prefix into 10.1.2.2/16.
> 
> then suppose we have a 10.1.0.2 packet, since it misses the megaflow,
> this time, we will have an overlapping megaflow with the right prefix:
> 10.1.0.2/16
> 
> now we have two megaflows:
> 10.1.2.2/24
> 10.1.0.2/16
> 
> last, suppose we have changed the ruleset again. The revalidator this
> time still decides to change the actions of both megaflows instead of
> deleting them.
> 
> The dpif_netdev_flow_put will search the megaflow to modify with unmasked
> keys, however it might lookup the wrong megaflow as the key 10.1.2.2 matches
> both 10.1.2.2/24 and 10.1.0.2/16!
> 
> This patch changes the megaflow lookup code in modification path into
> relying the ufid to find the correct megaflow instead of key lookup.
> 
> Signed-off-by: Peng He <hepeng.0320@bytedance.com>
> ---
>  lib/dpif-netdev.c | 45 ++++++++++++++++++++++++++++++---------------
>  tests/pmd.at      | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 77 insertions(+), 15 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 70b953ae6..8479630b8 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
>                  const struct dpif_flow_put *put,
>                  struct dpif_flow_stats *stats)
>  {
> -    struct dp_netdev_flow *netdev_flow;
> +    struct dp_netdev_flow *netdev_flow = NULL;
>      int error = 0;
>  
>      if (stats) {
> @@ -4199,16 +4199,35 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
>      }
>  
>      ovs_mutex_lock(&pmd->flow_mutex);
> -    netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
> -    if (!netdev_flow) {
> -        if (put->flags & DPIF_FP_CREATE) {
> -            dp_netdev_flow_add(pmd, match, ufid, put->actions,
> -                               put->actions_len, ODPP_NONE);
> +    if (put->ufid) {
> +        netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid,
> +                                              put->key, put->key_len);
> +    } else {
> +        /* Use key instead of the locally generated ufid
> +         * to search netdev_flow. */
> +        netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
> +    }
> +
> +    if (put->flags & DPIF_FP_CREATE) {
> +        if (!netdev_flow) {
> +            dp_netdev_flow_add(pmd, match, ufid,
> +                               put->actions, put->actions_len, ODPP_NONE);
>          } else {
> -            error = ENOENT;
> +            error = EEXIST;
>          }
> -    } else {
> -        if (put->flags & DPIF_FP_MODIFY) {
> +        goto exit;
> +    }
> +
> +    if (put->flags & DPIF_FP_MODIFY) {
> +        if (!netdev_flow) {
> +            error = ENOENT;
> +        } else {
> +            if (!put->ufid && !flow_equal(&match->flow, &netdev_flow->flow)) {
> +                /* Overlapping flow. */
> +                error = EINVAL;
> +                goto exit;
> +            }
> +
>              struct dp_netdev_actions *new_actions;
>              struct dp_netdev_actions *old_actions;
>  
> @@ -4239,15 +4258,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
>                   *   counter, and subtracting it before outputting the stats */
>                  error = EOPNOTSUPP;
>              }
> -
>              ovsrcu_postpone(dp_netdev_actions_free, old_actions);
> -        } else if (put->flags & DPIF_FP_CREATE) {
> -            error = EEXIST;
> -        } else {
> -            /* Overlapping flow. */
> -            error = EINVAL;
>          }
>      }
> +
> +exit:
>      ovs_mutex_unlock(&pmd->flow_mutex);
>      return error;
>  }
> diff --git a/tests/pmd.at b/tests/pmd.at
> index 48f3d432d..b03cd831e 100644
> --- a/tests/pmd.at
> +++ b/tests/pmd.at
> @@ -1300,3 +1300,50 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD load based sleeps
>  
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows])
> +
> +OVS_VSWITCHD_START(
> +[add-port br0 p1 \
> +   -- set bridge br0 datapath-type=dummy \
> +   -- set interface p1 type=dummy-pmd \
> +   -- add-port br0 p2 \
> +   -- set interface p2 type=dummy-pmd
> +])
> +
> +dnl Add one openflow rule and generate a megaflow.
> +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst=10.1.2.0/24,actions=p2'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
> +
> +OVS_WAIT_UNTIL_EQUAL([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], [
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2])

For some reason this check keeps failing on FreeBSD.  Investigating...

Best regards, Ilya Maximets.

> +
> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
> +dnl Replace openflow rules, trigger the revalidation.
> +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=ct(commit)' | dnl
> +          ovs-ofctl --bundle replace-flows br0 -])
> +AT_CHECK([ovs-appctl revalidator/wait])
> +
> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
> +OVS_WAIT_UNTIL_EQUAL([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never, actions:ct(commit)
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:ct(commit)])
> +
> +dnl Hold the prefix 10.1.2.2/24 by another 10s.
> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
> +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 tuple in the pvector of subtables.
> +for i in $(seq 0 256); do
> +    AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
> +done
> +
> +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2' | dnl
> +          ovs-ofctl --bundle replace-flows br0 -])
> +
> +AT_CHECK([ovs-appctl revalidator/wait])
> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
Ilya Maximets Aug. 11, 2023, 6:19 p.m. UTC | #4
On 8/11/23 19:37, Ilya Maximets wrote:
> On 8/9/23 05:02, Peng He wrote:
>> OVS allows overlapping megaflows, as long as the actions of these
>> megaflows are equal. However, the current implementation of action
>> modification relies on flow_lookup instead of ufid, this could result
>> in looking up a wrong megaflow and make the ukeys and megaflows inconsistent
>>
>> Just like the test case in the patch, at first we have a rule with the
>> prefix:
>>
>> 10.1.2.0/24
>>
>> and we will get a megaflow with prefixes 10.1.2.2/24 when a packet with IP
>> 10.1.2.2 is received.
>>
>> Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep the
>> 10.1.2.2/24 megaflow and just changes its action instead of extending
>> the prefix into 10.1.2.2/16.
>>
>> then suppose we have a 10.1.0.2 packet, since it misses the megaflow,
>> this time, we will have an overlapping megaflow with the right prefix:
>> 10.1.0.2/16
>>
>> now we have two megaflows:
>> 10.1.2.2/24
>> 10.1.0.2/16
>>
>> last, suppose we have changed the ruleset again. The revalidator this
>> time still decides to change the actions of both megaflows instead of
>> deleting them.
>>
>> The dpif_netdev_flow_put will search the megaflow to modify with unmasked
>> keys, however it might lookup the wrong megaflow as the key 10.1.2.2 matches
>> both 10.1.2.2/24 and 10.1.0.2/16!
>>
>> This patch changes the megaflow lookup code in modification path into
>> relying the ufid to find the correct megaflow instead of key lookup.
>>
>> Signed-off-by: Peng He <hepeng.0320@bytedance.com>
>> ---
>>  lib/dpif-netdev.c | 45 ++++++++++++++++++++++++++++++---------------
>>  tests/pmd.at      | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 77 insertions(+), 15 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 70b953ae6..8479630b8 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
>>                  const struct dpif_flow_put *put,
>>                  struct dpif_flow_stats *stats)
>>  {
>> -    struct dp_netdev_flow *netdev_flow;
>> +    struct dp_netdev_flow *netdev_flow = NULL;
>>      int error = 0;
>>  
>>      if (stats) {
>> @@ -4199,16 +4199,35 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
>>      }
>>  
>>      ovs_mutex_lock(&pmd->flow_mutex);
>> -    netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
>> -    if (!netdev_flow) {
>> -        if (put->flags & DPIF_FP_CREATE) {
>> -            dp_netdev_flow_add(pmd, match, ufid, put->actions,
>> -                               put->actions_len, ODPP_NONE);
>> +    if (put->ufid) {
>> +        netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid,
>> +                                              put->key, put->key_len);
>> +    } else {
>> +        /* Use key instead of the locally generated ufid
>> +         * to search netdev_flow. */
>> +        netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
>> +    }
>> +
>> +    if (put->flags & DPIF_FP_CREATE) {
>> +        if (!netdev_flow) {
>> +            dp_netdev_flow_add(pmd, match, ufid,
>> +                               put->actions, put->actions_len, ODPP_NONE);
>>          } else {
>> -            error = ENOENT;
>> +            error = EEXIST;
>>          }
>> -    } else {
>> -        if (put->flags & DPIF_FP_MODIFY) {
>> +        goto exit;
>> +    }
>> +
>> +    if (put->flags & DPIF_FP_MODIFY) {
>> +        if (!netdev_flow) {
>> +            error = ENOENT;
>> +        } else {
>> +            if (!put->ufid && !flow_equal(&match->flow, &netdev_flow->flow)) {
>> +                /* Overlapping flow. */
>> +                error = EINVAL;
>> +                goto exit;
>> +            }
>> +
>>              struct dp_netdev_actions *new_actions;
>>              struct dp_netdev_actions *old_actions;
>>  
>> @@ -4239,15 +4258,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
>>                   *   counter, and subtracting it before outputting the stats */
>>                  error = EOPNOTSUPP;
>>              }
>> -
>>              ovsrcu_postpone(dp_netdev_actions_free, old_actions);
>> -        } else if (put->flags & DPIF_FP_CREATE) {
>> -            error = EEXIST;
>> -        } else {
>> -            /* Overlapping flow. */
>> -            error = EINVAL;
>>          }
>>      }
>> +
>> +exit:
>>      ovs_mutex_unlock(&pmd->flow_mutex);
>>      return error;
>>  }
>> diff --git a/tests/pmd.at b/tests/pmd.at
>> index 48f3d432d..b03cd831e 100644
>> --- a/tests/pmd.at
>> +++ b/tests/pmd.at
>> @@ -1300,3 +1300,50 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD load based sleeps
>>  
>>  OVS_VSWITCHD_STOP
>>  AT_CLEANUP
>> +
>> +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows])
>> +
>> +OVS_VSWITCHD_START(
>> +[add-port br0 p1 \
>> +   -- set bridge br0 datapath-type=dummy \
>> +   -- set interface p1 type=dummy-pmd \
>> +   -- add-port br0 p2 \
>> +   -- set interface p2 type=dummy-pmd
>> +])

OK.  I found the reason why the test is failing on FreeSBD.
OVS doesn't implement CPU/NUMA discovery on non-linux systems,
so dummy NUMA configuration has to be provided in order for
OVS to actually create PMD threads.  Without dummy NUMA config
no threads are started and so no traffic is forwarded.

See other tests in the fie on how dummy NUMA is provided in
the OVS_VSWITCHD_START macro.

>> +
>> +dnl Add one openflow rule and generate a megaflow.
>> +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst=10.1.2.0/24,actions=p2'])
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])

Not very important, but this packet is invalid.
You're no specifying important fields of ipv4 header, for
example the ipv4.proto is not set and hence equals zero.
In this case the tcp configuration is considered invalid,
because it's not a tcp packet.

It's not an OpenFlow formatting and OVS doesn't fill out any
missing fields.  Everything that is missing will be zero.
It's better to provide a correct packet here and below in
the test to avoid potential issues in the future.

Best regards, Ilya Maximets.

>> +
>> +OVS_WAIT_UNTIL_EQUAL([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], [
>> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2])
> 
> For some reason this check keeps failing on FreeBSD.  Investigating...
> 
> Best regards, Ilya Maximets.
> 
>> +
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
>> +dnl Replace openflow rules, trigger the revalidation.
>> +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=ct(commit)' | dnl
>> +          ovs-ofctl --bundle replace-flows br0 -])
>> +AT_CHECK([ovs-appctl revalidator/wait])
>> +
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
>> +OVS_WAIT_UNTIL_EQUAL([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [
>> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never, actions:ct(commit)
>> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:ct(commit)])
>> +
>> +dnl Hold the prefix 10.1.2.2/24 by another 10s.
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
>> +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 tuple in the pvector of subtables.
>> +for i in $(seq 0 256); do
>> +    AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
>> +done
>> +
>> +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2' | dnl
>> +          ovs-ofctl --bundle replace-flows br0 -])
>> +
>> +AT_CHECK([ovs-appctl revalidator/wait])
>> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [
>> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2
>> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2
>> +])
>> +
>> +OVS_VSWITCHD_STOP
>> +AT_CLEANUP
>
Peng He Aug. 12, 2023, 1:21 p.m. UTC | #5
Ilya Maximets <i.maximets@ovn.org> 于2023年8月12日周六 02:18写道:

> On 8/11/23 19:37, Ilya Maximets wrote:
> > On 8/9/23 05:02, Peng He wrote:
> >> OVS allows overlapping megaflows, as long as the actions of these
> >> megaflows are equal. However, the current implementation of action
> >> modification relies on flow_lookup instead of ufid, this could result
> >> in looking up a wrong megaflow and make the ukeys and megaflows
> inconsistent
> >>
> >> Just like the test case in the patch, at first we have a rule with the
> >> prefix:
> >>
> >> 10.1.2.0/24
> >>
> >> and we will get a megaflow with prefixes 10.1.2.2/24 when a packet
> with IP
> >> 10.1.2.2 is received.
> >>
> >> Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep
> the
> >> 10.1.2.2/24 megaflow and just changes its action instead of extending
> >> the prefix into 10.1.2.2/16.
> >>
> >> then suppose we have a 10.1.0.2 packet, since it misses the megaflow,
> >> this time, we will have an overlapping megaflow with the right prefix:
> >> 10.1.0.2/16
> >>
> >> now we have two megaflows:
> >> 10.1.2.2/24
> >> 10.1.0.2/16
> >>
> >> last, suppose we have changed the ruleset again. The revalidator this
> >> time still decides to change the actions of both megaflows instead of
> >> deleting them.
> >>
> >> The dpif_netdev_flow_put will search the megaflow to modify with
> unmasked
> >> keys, however it might lookup the wrong megaflow as the key 10.1.2.2
> matches
> >> both 10.1.2.2/24 and 10.1.0.2/16!
> >>
> >> This patch changes the megaflow lookup code in modification path into
> >> relying the ufid to find the correct megaflow instead of key lookup.
> >>
> >> Signed-off-by: Peng He <hepeng.0320@bytedance.com>
> >> ---
> >>  lib/dpif-netdev.c | 45 ++++++++++++++++++++++++++++++---------------
> >>  tests/pmd.at      | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> >>  2 files changed, 77 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> >> index 70b953ae6..8479630b8 100644
> >> --- a/lib/dpif-netdev.c
> >> +++ b/lib/dpif-netdev.c
> >> @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
> >>                  const struct dpif_flow_put *put,
> >>                  struct dpif_flow_stats *stats)
> >>  {
> >> -    struct dp_netdev_flow *netdev_flow;
> >> +    struct dp_netdev_flow *netdev_flow = NULL;
> >>      int error = 0;
> >>
> >>      if (stats) {
> >> @@ -4199,16 +4199,35 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread
> *pmd,
> >>      }
> >>
> >>      ovs_mutex_lock(&pmd->flow_mutex);
> >> -    netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
> >> -    if (!netdev_flow) {
> >> -        if (put->flags & DPIF_FP_CREATE) {
> >> -            dp_netdev_flow_add(pmd, match, ufid, put->actions,
> >> -                               put->actions_len, ODPP_NONE);
> >> +    if (put->ufid) {
> >> +        netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid,
> >> +                                              put->key, put->key_len);
> >> +    } else {
> >> +        /* Use key instead of the locally generated ufid
> >> +         * to search netdev_flow. */
> >> +        netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
> >> +    }
> >> +
> >> +    if (put->flags & DPIF_FP_CREATE) {
> >> +        if (!netdev_flow) {
> >> +            dp_netdev_flow_add(pmd, match, ufid,
> >> +                               put->actions, put->actions_len,
> ODPP_NONE);
> >>          } else {
> >> -            error = ENOENT;
> >> +            error = EEXIST;
> >>          }
> >> -    } else {
> >> -        if (put->flags & DPIF_FP_MODIFY) {
> >> +        goto exit;
> >> +    }
> >> +
> >> +    if (put->flags & DPIF_FP_MODIFY) {
> >> +        if (!netdev_flow) {
> >> +            error = ENOENT;
> >> +        } else {
> >> +            if (!put->ufid && !flow_equal(&match->flow,
> &netdev_flow->flow)) {
> >> +                /* Overlapping flow. */
> >> +                error = EINVAL;
> >> +                goto exit;
> >> +            }
> >> +
> >>              struct dp_netdev_actions *new_actions;
> >>              struct dp_netdev_actions *old_actions;
> >>
> >> @@ -4239,15 +4258,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread
> *pmd,
> >>                   *   counter, and subtracting it before outputting the
> stats */
> >>                  error = EOPNOTSUPP;
> >>              }
> >> -
> >>              ovsrcu_postpone(dp_netdev_actions_free, old_actions);
> >> -        } else if (put->flags & DPIF_FP_CREATE) {
> >> -            error = EEXIST;
> >> -        } else {
> >> -            /* Overlapping flow. */
> >> -            error = EINVAL;
> >>          }
> >>      }
> >> +
> >> +exit:
> >>      ovs_mutex_unlock(&pmd->flow_mutex);
> >>      return error;
> >>  }
> >> diff --git a/tests/pmd.at b/tests/pmd.at
> >> index 48f3d432d..b03cd831e 100644
> >> --- a/tests/pmd.at
> >> +++ b/tests/pmd.at
> >> @@ -1300,3 +1300,50 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM
> ovs-vswitchd.log | grep "PMD load based sleeps
> >>
> >>  OVS_VSWITCHD_STOP
> >>  AT_CLEANUP
> >> +
> >> +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows])
> >> +
> >> +OVS_VSWITCHD_START(
> >> +[add-port br0 p1 \
> >> +   -- set bridge br0 datapath-type=dummy \
> >> +   -- set interface p1 type=dummy-pmd \
> >> +   -- add-port br0 p2 \
> >> +   -- set interface p2 type=dummy-pmd
> >> +])
>
> OK.  I found the reason why the test is failing on FreeSBD.
> OVS doesn't implement CPU/NUMA discovery on non-linux systems,
> so dummy NUMA configuration has to be provided in order for
> OVS to actually create PMD threads.  Without dummy NUMA config
> no threads are started and so no traffic is forwarded.
>

I see. will fix it in the next version.

>
> See other tests in the fie on how dummy NUMA is provided in
> the OVS_VSWITCHD_START macro.
>
> >> +
> >> +dnl Add one openflow rule and generate a megaflow.
> >> +AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst=
> 10.1.2.0/24,actions=p2'])
> >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1
> 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
>
> Not very important, but this packet is invalid.
> You're no specifying important fields of ipv4 header, for
> example the ipv4.proto is not set and hence equals zero.
> In this case the tcp configuration is considered invalid,
> because it's not a tcp packet.
>

Ok, will fix it in the next version.

>
> It's not an OpenFlow formatting and OVS doesn't fill out any
> missing fields.  Everything that is missing will be zero.
> It's better to provide a correct packet here and below in
> the test to avoid potential issues in the future.
>
> Best regards, Ilya Maximets.
>
> >> +
> >> +OVS_WAIT_UNTIL_EQUAL([ovs-appctl dpctl/dump-flows | sed 's/.*core:
> [[0-9]]*//'], [
> >>
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=
> 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never,
> actions:2])
> >
> > For some reason this check keeps failing on FreeBSD.  Investigating...
> >
> > Best regards, Ilya Maximets.
> >
> >> +
> >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1
> 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
> >> +dnl Replace openflow rules, trigger the revalidation.
> >> +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16
> actions=ct(commit)' | dnl
> >> +          ovs-ofctl --bundle replace-flows br0 -])
> >> +AT_CHECK([ovs-appctl revalidator/wait])
> >> +
> >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1
> 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
> >> +OVS_WAIT_UNTIL_EQUAL([ovs-appctl dpctl/dump-flows | sed 's/.*core:
> [[0-9]]*//' | strip_xout_keep_actions], [
> >>
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=
> 10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never,
> actions:ct(commit)
> >>
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=
> 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s,
> actions:ct(commit)])
> >> +
> >> +dnl Hold the prefix 10.1.2.2/24 by another 10s.
> >> +AT_CHECK([ovs-appctl netdev-dummy/receive p1
> 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
> >> +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24
> tuple in the pvector of subtables.
> >> +for i in $(seq 0 256); do
> >> +    AT_CHECK([ovs-appctl netdev-dummy/receive p1
> 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
> >> +done
> >> +
> >> +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2'
> | dnl
> >> +          ovs-ofctl --bundle replace-flows br0 -])
> >> +
> >> +AT_CHECK([ovs-appctl revalidator/wait])
> >> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' |
> strip_xout_keep_actions], [0], [
> >>
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=
> 10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2
> >>
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=
> 10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2
> >> +])
> >> +
> >> +OVS_VSWITCHD_STOP
> >> +AT_CLEANUP
> >
>
>
diff mbox series

Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 70b953ae6..8479630b8 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4191,7 +4191,7 @@  flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
                 const struct dpif_flow_put *put,
                 struct dpif_flow_stats *stats)
 {
-    struct dp_netdev_flow *netdev_flow;
+    struct dp_netdev_flow *netdev_flow = NULL;
     int error = 0;
 
     if (stats) {
@@ -4199,16 +4199,35 @@  flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
     }
 
     ovs_mutex_lock(&pmd->flow_mutex);
-    netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
-    if (!netdev_flow) {
-        if (put->flags & DPIF_FP_CREATE) {
-            dp_netdev_flow_add(pmd, match, ufid, put->actions,
-                               put->actions_len, ODPP_NONE);
+    if (put->ufid) {
+        netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid,
+                                              put->key, put->key_len);
+    } else {
+        /* Use key instead of the locally generated ufid
+         * to search netdev_flow. */
+        netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
+    }
+
+    if (put->flags & DPIF_FP_CREATE) {
+        if (!netdev_flow) {
+            dp_netdev_flow_add(pmd, match, ufid,
+                               put->actions, put->actions_len, ODPP_NONE);
         } else {
-            error = ENOENT;
+            error = EEXIST;
         }
-    } else {
-        if (put->flags & DPIF_FP_MODIFY) {
+        goto exit;
+    }
+
+    if (put->flags & DPIF_FP_MODIFY) {
+        if (!netdev_flow) {
+            error = ENOENT;
+        } else {
+            if (!put->ufid && !flow_equal(&match->flow, &netdev_flow->flow)) {
+                /* Overlapping flow. */
+                error = EINVAL;
+                goto exit;
+            }
+
             struct dp_netdev_actions *new_actions;
             struct dp_netdev_actions *old_actions;
 
@@ -4239,15 +4258,11 @@  flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
                  *   counter, and subtracting it before outputting the stats */
                 error = EOPNOTSUPP;
             }
-
             ovsrcu_postpone(dp_netdev_actions_free, old_actions);
-        } else if (put->flags & DPIF_FP_CREATE) {
-            error = EEXIST;
-        } else {
-            /* Overlapping flow. */
-            error = EINVAL;
         }
     }
+
+exit:
     ovs_mutex_unlock(&pmd->flow_mutex);
     return error;
 }
diff --git a/tests/pmd.at b/tests/pmd.at
index 48f3d432d..b03cd831e 100644
--- a/tests/pmd.at
+++ b/tests/pmd.at
@@ -1300,3 +1300,50 @@  OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD load based sleeps
 
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([PMD - revalidator wrongly modify userspace megaflows])
+
+OVS_VSWITCHD_START(
+[add-port br0 p1 \
+   -- set bridge br0 datapath-type=dummy \
+   -- set interface p1 type=dummy-pmd \
+   -- add-port br0 p2 \
+   -- set interface p2 type=dummy-pmd
+])
+
+dnl Add one openflow rule and generate a megaflow.
+AT_CHECK([ovs-ofctl add-flow br0 'table=0,in_port=p1,ip,nw_dst=10.1.2.0/24,actions=p2'])
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
+
+OVS_WAIT_UNTIL_EQUAL([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], [
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:never, actions:2])
+
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
+dnl Replace openflow rules, trigger the revalidation.
+AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=ct(commit)' | dnl
+          ovs-ofctl --bundle replace-flows br0 -])
+AT_CHECK([ovs-appctl revalidator/wait])
+
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
+OVS_WAIT_UNTIL_EQUAL([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:never, actions:ct(commit)
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:ct(commit)])
+
+dnl Hold the prefix 10.1.2.2/24 by another 10s.
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
+dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 tuple in the pvector of subtables.
+for i in $(seq 0 256); do
+    AT_CHECK([ovs-appctl netdev-dummy/receive p1 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
+done
+
+AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 actions=p2' | dnl
+          ovs-ofctl --bundle replace-flows br0 -])
+
+AT_CHECK([ovs-appctl revalidator/wait])
+AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | strip_xout_keep_actions], [0], [
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no), packets:0, bytes:0, used:0.0s, actions:2
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP