diff mbox

[ovs-dev] conntrack: Fix for force/commit bug

Message ID 1499882720-21050-1-git-send-email-gvrose8192@gmail.com
State Changes Requested
Delegated to: Joe Stringer
Headers show

Commit Message

Gregory Rose July 12, 2017, 6:05 p.m. UTC
When the direction is being forced key->ct_state may not have been
set.  Check for this condition and take action to set the state
correctly so that the force direction occurs.

Co-authored-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
---
 datapath/conntrack.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

Comments

Joe Stringer July 12, 2017, 9:38 p.m. UTC | #1
On 12 July 2017 at 11:05, Greg Rose <gvrose8192@gmail.com> wrote:
> When the direction is being forced key->ct_state may not have been
> set.  Check for this condition and take action to set the state
> correctly so that the force direction occurs.
>
> Co-authored-by: Joe Stringer <joe@ovn.org>
> Signed-off-by: Joe Stringer <joe@ovn.org>
> Signed-off-by: Greg Rose <gvrose8192@gmail.com>
> ---

Hi Greg, a few bits of high level feedback before I get into the patch itself:

Usually, patches to datapath/* just use the "datapath: " prefix in the
subject. Typically, the "conntrack:" prefix in the OVS tree indicates
changes to lib/conntrack.[ch], which is used only for the userspace
datapath.

Patches to datapath/*.[ch], ie the Linux kernel module code, need to
be accepted into upstream Linux trees before applying to the OVS tree.
For bugfixes, this means the 'net' tree; for features, 'net-next'.
That said, if you're looking for feedback from the OVS community
before posting upstream then it's still quite reasonable to post the
patch here on ovs-dev first. For what it's worth, there's some
documentation about this policy here:

http://docs.openvswitch.org/en/latest/internals/contributing/backporting-patches/

I happen to have some of the context here, so it's a little easier for
me to understand what's going on but the patch description is a little
hard to follow. Given that the upstream Linux openvswitch maintainer
will be the primary person to provide feedback on the patch, the
clearer we can explain the better :-) I've given an attempt at
providing more context for the patch description below, but feel free
to take or leave any/all of it if it helps to describe the solution:

"When there is an established connection in direction A->B, it is
possible to receive a packet on port B which then executes
ct(commit,force) without first performing ct() - ie, a lookup. In this
case, we would expect that this packet can delete the existing entry
so that we can commit a connection with direction B->A. However,
currently we only perform a check in skb_nfct_cached() for whether
OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a
lookup previously occurred. In the above scenario, a lookup has not
occurred but we should still be able to statelessly look up the
existing entry and potentially delete the entry if it is in the
opposite direction.

This patch extends the check to also hint that if the action has the
force flag set, then we will lookup the existing entry so that the
force check at the end of skb_nfct_cached has the ability to delete
the connection.
"

I ran this on my tester box across a variety of kernels and platforms,
and everything was green.

I think that you've been using a patch to tests/system-traffic.at to
validate this behaviour, would you be able to send that to the ovs-dev
list as a separate patch as well? When we get this into the OVS tree,
it would be nice to have the test to validate that behaviour, to
ensure that the userspace datapath has the same behaviour, and to help
to stop regressions in future.

>  datapath/conntrack.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index bf28fc0..2da0321 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -665,7 +665,7 @@ ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone,
>
>  /* Determine whether skb->_nfct is equal to the result of conntrack lookup. */
>  static bool skb_nfct_cached(struct net *net,
> -                           const struct sw_flow_key *key,
> +                           struct sw_flow_key *key,
>                             const struct ovs_conntrack_info *info,
>                             struct sk_buff *skb)
>  {
> @@ -675,12 +675,22 @@ static bool skb_nfct_cached(struct net *net,
>         ct = nf_ct_get(skb, &ctinfo);
>         /* If no ct, check if we have evidence that an existing conntrack entry
>          * might be found for this skb.  This happens when we lose a skb->_nfct
> -        * due to an upcall.  If the connection was not confirmed, it is not
> -        * cached and needs to be run through conntrack again.
> +        * due to an upcall, or if the direction is being forced.  If the
> +        * connection was not confirmed, it is not cached and needs to be run
> +        * through conntrack again.
>          */
> -       if (!ct && key->ct_state & OVS_CS_F_TRACKED &&
> +       if ((!ct && (key->ct_state & OVS_CS_F_TRACKED &&
>             !(key->ct_state & OVS_CS_F_INVALID) &&
> -           key->ct_zone == info->zone.id) {
> +           key->ct_zone == info->zone.id)) ||
> +           (!key->ct_state && info->force)) {
> +               if (!key->ct_state && info->force && !info->ct) {
> +                       int result = nf_conntrack_in(net, info->family,
> +                                                    NF_INET_PRE_ROUTING, skb);
> +                       if (result != NF_ACCEPT)
> +                               return false;

I suspect that in this scenario, the extra nf_conntrack_in() here is
not strictly necessary; skb_nfct_cached() is actually trying to avoid
doing nf_conntrack_in() in __ovs_ct_lookup(), so to put an
nf_conntrack_in() call here is a bit difficult to fully reason about.
This function is always hit for every packet that executes ct()
action, but it's primarily here to deal with losing state upon upcall.
Can you try removing it and just keeping the if condition changes?

Cheers,
Joe
Gregory Rose July 12, 2017, 9:49 p.m. UTC | #2
On 07/12/2017 02:38 PM, Joe Stringer wrote:
> On 12 July 2017 at 11:05, Greg Rose <gvrose8192@gmail.com> wrote:
>> When the direction is being forced key->ct_state may not have been
>> set.  Check for this condition and take action to set the state
>> correctly so that the force direction occurs.
>>
>> Co-authored-by: Joe Stringer <joe@ovn.org>
>> Signed-off-by: Joe Stringer <joe@ovn.org>
>> Signed-off-by: Greg Rose <gvrose8192@gmail.com>
>> ---
> 
> Hi Greg, a few bits of high level feedback before I get into the patch itself:
> 
> Usually, patches to datapath/* just use the "datapath: " prefix in the
> subject. Typically, the "conntrack:" prefix in the OVS tree indicates
> changes to lib/conntrack.[ch], which is used only for the userspace
> datapath.
> 
> Patches to datapath/*.[ch], ie the Linux kernel module code, need to
> be accepted into upstream Linux trees before applying to the OVS tree.
> For bugfixes, this means the 'net' tree; for features, 'net-next'.
> That said, if you're looking for feedback from the OVS community
> before posting upstream then it's still quite reasonable to post the
> patch here on ovs-dev first. For what it's worth, there's some
> documentation about this policy here:
> 
> http://docs.openvswitch.org/en/latest/internals/contributing/backporting-patches/

Thanks for the direction!  Much appreciated.

> 
> I happen to have some of the context here, so it's a little easier for
> me to understand what's going on but the patch description is a little
> hard to follow. Given that the upstream Linux openvswitch maintainer
> will be the primary person to provide feedback on the patch, the
> clearer we can explain the better :-) I've given an attempt at
> providing more context for the patch description below, but feel free
> to take or leave any/all of it if it helps to describe the solution:
> 
> "When there is an established connection in direction A->B, it is
> possible to receive a packet on port B which then executes
> ct(commit,force) without first performing ct() - ie, a lookup.

That is what is happening with our test but I continue to ponder if
it would occur in the real world.  In any case it can certainly happen
using our packet sending interface so it needs to be fixed.  There's
a high priority bug as well.

  In this
> case, we would expect that this packet can delete the existing entry
> so that we can commit a connection with direction B->A. However,
> currently we only perform a check in skb_nfct_cached() for whether
> OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a
> lookup previously occurred. In the above scenario, a lookup has not
> occurred but we should still be able to statelessly look up the
> existing entry and potentially delete the entry if it is in the
> opposite direction.
> 
> This patch extends the check to also hint that if the action has the
> force flag set, then we will lookup the existing entry so that the
> force check at the end of skb_nfct_cached has the ability to delete
> the connection.
> "
> 

That looks fine to me - since you went to the trouble of writing it up
let's just use it. 8-)

> I ran this on my tester box across a variety of kernels and platforms,
> and everything was green.
> 
> I think that you've been using a patch to tests/system-traffic.at to
> validate this behaviour, would you be able to send that to the ovs-dev
> list as a separate patch as well? When we get this into the OVS tree,
> it would be nice to have the test to validate that behaviour, to
> ensure that the userspace datapath has the same behaviour, and to help
> to stop regressions in future.

Sure, you sent that patch to me so I'll add you as co author on that as well
and add your signature if that's OK.

> 
>>   datapath/conntrack.c | 20 +++++++++++++++-----
>>   1 file changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
>> index bf28fc0..2da0321 100644
>> --- a/datapath/conntrack.c
>> +++ b/datapath/conntrack.c
>> @@ -665,7 +665,7 @@ ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone,
>>
>>   /* Determine whether skb->_nfct is equal to the result of conntrack lookup. */
>>   static bool skb_nfct_cached(struct net *net,
>> -                           const struct sw_flow_key *key,
>> +                           struct sw_flow_key *key,
>>                              const struct ovs_conntrack_info *info,
>>                              struct sk_buff *skb)
>>   {
>> @@ -675,12 +675,22 @@ static bool skb_nfct_cached(struct net *net,
>>          ct = nf_ct_get(skb, &ctinfo);
>>          /* If no ct, check if we have evidence that an existing conntrack entry
>>           * might be found for this skb.  This happens when we lose a skb->_nfct
>> -        * due to an upcall.  If the connection was not confirmed, it is not
>> -        * cached and needs to be run through conntrack again.
>> +        * due to an upcall, or if the direction is being forced.  If the
>> +        * connection was not confirmed, it is not cached and needs to be run
>> +        * through conntrack again.
>>           */
>> -       if (!ct && key->ct_state & OVS_CS_F_TRACKED &&
>> +       if ((!ct && (key->ct_state & OVS_CS_F_TRACKED &&
>>              !(key->ct_state & OVS_CS_F_INVALID) &&
>> -           key->ct_zone == info->zone.id) {
>> +           key->ct_zone == info->zone.id)) ||
>> +           (!key->ct_state && info->force)) {
>> +               if (!key->ct_state && info->force && !info->ct) {
>> +                       int result = nf_conntrack_in(net, info->family,
>> +                                                    NF_INET_PRE_ROUTING, skb);
>> +                       if (result != NF_ACCEPT)
>> +                               return false;
> 
> I suspect that in this scenario, the extra nf_conntrack_in() here is
> not strictly necessary; skb_nfct_cached() is actually trying to avoid
> doing nf_conntrack_in() in __ovs_ct_lookup(), so to put an
> nf_conntrack_in() call here is a bit difficult to fully reason about.
> This function is always hit for every packet that executes ct()
> action, but it's primarily here to deal with losing state upon upcall.
> Can you try removing it and just keeping the if condition changes?

Done.  As per our conversation over the phone and on slack it works just fine.
I'll whip up a new patch and send it to netdev and then cc ovsdev so it can
get some more eyes here as well.

Stay tuned...

Thanks,

- Greg

> 
> Cheers,
> Joe
>
Joe Stringer July 12, 2017, 10:55 p.m. UTC | #3
On 12 July 2017 at 14:49, Greg Rose <gvrose8192@gmail.com> wrote:
> On 07/12/2017 02:38 PM, Joe Stringer wrote:
>>
>> On 12 July 2017 at 11:05, Greg Rose <gvrose8192@gmail.com> wrote:
>>>
>>> When the direction is being forced key->ct_state may not have been
>>> set.  Check for this condition and take action to set the state
>>> correctly so that the force direction occurs.
>>>
>>> Co-authored-by: Joe Stringer <joe@ovn.org>
>>> Signed-off-by: Joe Stringer <joe@ovn.org>
>>> Signed-off-by: Greg Rose <gvrose8192@gmail.com>
>>> ---
>>
>>
>> Hi Greg, a few bits of high level feedback before I get into the patch
>> itself:
>>
>> Usually, patches to datapath/* just use the "datapath: " prefix in the
>> subject. Typically, the "conntrack:" prefix in the OVS tree indicates
>> changes to lib/conntrack.[ch], which is used only for the userspace
>> datapath.
>>
>> Patches to datapath/*.[ch], ie the Linux kernel module code, need to
>> be accepted into upstream Linux trees before applying to the OVS tree.
>> For bugfixes, this means the 'net' tree; for features, 'net-next'.
>> That said, if you're looking for feedback from the OVS community
>> before posting upstream then it's still quite reasonable to post the
>> patch here on ovs-dev first. For what it's worth, there's some
>> documentation about this policy here:
>>
>>
>> http://docs.openvswitch.org/en/latest/internals/contributing/backporting-patches/
>
>
> Thanks for the direction!  Much appreciated.
>
>>
>> I happen to have some of the context here, so it's a little easier for
>> me to understand what's going on but the patch description is a little
>> hard to follow. Given that the upstream Linux openvswitch maintainer
>> will be the primary person to provide feedback on the patch, the
>> clearer we can explain the better :-) I've given an attempt at
>> providing more context for the patch description below, but feel free
>> to take or leave any/all of it if it helps to describe the solution:
>>
>> "When there is an established connection in direction A->B, it is
>> possible to receive a packet on port B which then executes
>> ct(commit,force) without first performing ct() - ie, a lookup.
>
>
> That is what is happening with our test but I continue to ponder if
> it would occur in the real world.  In any case it can certainly happen
> using our packet sending interface so it needs to be fixed.  There's
> a high priority bug as well.

Yeah, ultimately it depends on your particular flow table so some
applications may configure a pipeline like this, while others may not.
Still, we're not quite satisfying the API that we have described so we
should probably fix it up. I think that there may still be subsequent
work to see if there are further similar situations where the
connections are not properly updated.

>  In this
>>
>> case, we would expect that this packet can delete the existing entry
>> so that we can commit a connection with direction B->A. However,
>> currently we only perform a check in skb_nfct_cached() for whether
>> OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a
>> lookup previously occurred. In the above scenario, a lookup has not
>> occurred but we should still be able to statelessly look up the
>> existing entry and potentially delete the entry if it is in the
>> opposite direction.
>>
>> This patch extends the check to also hint that if the action has the
>> force flag set, then we will lookup the existing entry so that the
>> force check at the end of skb_nfct_cached has the ability to delete
>> the connection.
>> "
>>
>
> That looks fine to me - since you went to the trouble of writing it up
> let's just use it. 8-)
>
>> I ran this on my tester box across a variety of kernels and platforms,
>> and everything was green.
>>
>> I think that you've been using a patch to tests/system-traffic.at to
>> validate this behaviour, would you be able to send that to the ovs-dev
>> list as a separate patch as well? When we get this into the OVS tree,
>> it would be nice to have the test to validate that behaviour, to
>> ensure that the userspace datapath has the same behaviour, and to help
>> to stop regressions in future.
>
>
> Sure, you sent that patch to me so I'll add you as co author on that as well
> and add your signature if that's OK.

OK, no problem.

>>
>>>   datapath/conntrack.c | 20 +++++++++++++++-----
>>>   1 file changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
>>> index bf28fc0..2da0321 100644
>>> --- a/datapath/conntrack.c
>>> +++ b/datapath/conntrack.c
>>> @@ -665,7 +665,7 @@ ovs_ct_find_existing(struct net *net, const struct
>>> nf_conntrack_zone *zone,
>>>
>>>   /* Determine whether skb->_nfct is equal to the result of conntrack
>>> lookup. */
>>>   static bool skb_nfct_cached(struct net *net,
>>> -                           const struct sw_flow_key *key,
>>> +                           struct sw_flow_key *key,
>>>                              const struct ovs_conntrack_info *info,
>>>                              struct sk_buff *skb)
>>>   {
>>> @@ -675,12 +675,22 @@ static bool skb_nfct_cached(struct net *net,
>>>          ct = nf_ct_get(skb, &ctinfo);
>>>          /* If no ct, check if we have evidence that an existing
>>> conntrack entry
>>>           * might be found for this skb.  This happens when we lose a
>>> skb->_nfct
>>> -        * due to an upcall.  If the connection was not confirmed, it is
>>> not
>>> -        * cached and needs to be run through conntrack again.
>>> +        * due to an upcall, or if the direction is being forced.  If the
>>> +        * connection was not confirmed, it is not cached and needs to be
>>> run
>>> +        * through conntrack again.
>>>           */
>>> -       if (!ct && key->ct_state & OVS_CS_F_TRACKED &&
>>> +       if ((!ct && (key->ct_state & OVS_CS_F_TRACKED &&
>>>              !(key->ct_state & OVS_CS_F_INVALID) &&
>>> -           key->ct_zone == info->zone.id) {
>>> +           key->ct_zone == info->zone.id)) ||
>>> +           (!key->ct_state && info->force)) {
>>> +               if (!key->ct_state && info->force && !info->ct) {
>>> +                       int result = nf_conntrack_in(net, info->family,
>>> +                                                    NF_INET_PRE_ROUTING,
>>> skb);
>>> +                       if (result != NF_ACCEPT)
>>> +                               return false;
>>
>>
>> I suspect that in this scenario, the extra nf_conntrack_in() here is
>> not strictly necessary; skb_nfct_cached() is actually trying to avoid
>> doing nf_conntrack_in() in __ovs_ct_lookup(), so to put an
>> nf_conntrack_in() call here is a bit difficult to fully reason about.
>> This function is always hit for every packet that executes ct()
>> action, but it's primarily here to deal with losing state upon upcall.
>> Can you try removing it and just keeping the if condition changes?
>
>
> Done.  As per our conversation over the phone and on slack it works just
> fine.
> I'll whip up a new patch and send it to netdev and then cc ovsdev so it can
> get some more eyes here as well.

Great, thanks!
diff mbox

Patch

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index bf28fc0..2da0321 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -665,7 +665,7 @@  ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone,
 
 /* Determine whether skb->_nfct is equal to the result of conntrack lookup. */
 static bool skb_nfct_cached(struct net *net,
-			    const struct sw_flow_key *key,
+			    struct sw_flow_key *key,
 			    const struct ovs_conntrack_info *info,
 			    struct sk_buff *skb)
 {
@@ -675,12 +675,22 @@  static bool skb_nfct_cached(struct net *net,
 	ct = nf_ct_get(skb, &ctinfo);
 	/* If no ct, check if we have evidence that an existing conntrack entry
 	 * might be found for this skb.  This happens when we lose a skb->_nfct
-	 * due to an upcall.  If the connection was not confirmed, it is not
-	 * cached and needs to be run through conntrack again.
+	 * due to an upcall, or if the direction is being forced.  If the
+	 * connection was not confirmed, it is not cached and needs to be run
+	 * through conntrack again.
 	 */
-	if (!ct && key->ct_state & OVS_CS_F_TRACKED &&
+	if ((!ct && (key->ct_state & OVS_CS_F_TRACKED &&
 	    !(key->ct_state & OVS_CS_F_INVALID) &&
-	    key->ct_zone == info->zone.id) {
+	    key->ct_zone == info->zone.id)) ||
+	    (!key->ct_state && info->force)) {
+		if (!key->ct_state && info->force && !info->ct) {
+			int result = nf_conntrack_in(net, info->family,
+						     NF_INET_PRE_ROUTING, skb);
+			if (result != NF_ACCEPT)
+				return false;
+			/* Update the key, but keep the NAT flags. */
+			ovs_ct_update_key(skb, info, key, true, true);
+		}
 		ct = ovs_ct_find_existing(net, &info->zone, info->family, skb,
 					  !!(key->ct_state
 					     & OVS_CS_F_NAT_MASK));