diff mbox series

[ovs-dev,net] net: openvswitch: fix TTL decrement action netlink message format

Message ID 160577663600.7755.4779460826621858224.stgit@wsfd-netdev64.ntdv.lab.eng.bos.redhat.com
State Awaiting Upstream
Headers show
Series [ovs-dev,net] net: openvswitch: fix TTL decrement action netlink message format | expand

Commit Message

Eelco Chaudron Nov. 19, 2020, 9:04 a.m. UTC
Currently, the openvswitch module is not accepting the correctly formated
netlink message for the TTL decrement action. For both setting and getting
the dec_ttl action, the actions should be nested in the
OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h uapi.

Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
 include/uapi/linux/openvswitch.h |    2 +
 net/openvswitch/actions.c        |    7 ++--
 net/openvswitch/flow_netlink.c   |   74 ++++++++++++++++++++++++++++----------
 3 files changed, 60 insertions(+), 23 deletions(-)

Comments

Jakub Kicinski Nov. 20, 2020, 9:12 p.m. UTC | #1
On Thu, 19 Nov 2020 04:04:04 -0500 Eelco Chaudron wrote:
> Currently, the openvswitch module is not accepting the correctly formated
> netlink message for the TTL decrement action. For both setting and getting
> the dec_ttl action, the actions should be nested in the
> OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h uapi.

IOW this change will not break any known user space, correct?

But existing OvS user space already expects it to work like you 
make it work now?

What's the harm in leaving it as is?

> Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>

Can we get a review from OvS folks? Matteo looks good to you (as the
original author)?

> -	err = __ovs_nla_copy_actions(net, attr, key, sfa, eth_type,
> +	err = __ovs_nla_copy_actions(net, actions, key, sfa, eth_type,
>  				     vlan_tci, mpls_label_count, log);
>  	if (err)
>  		return err;

You're not canceling any nests on error, I assume this is normal.

> +	add_nested_action_end(*sfa, action_start);
>  	add_nested_action_end(*sfa, start);
>  	return 0;
>  }
Pravin Shelar Nov. 20, 2020, 10:16 p.m. UTC | #2
On Thu, Nov 19, 2020 at 1:04 AM Eelco Chaudron <echaudro@redhat.com> wrote:
>
> Currently, the openvswitch module is not accepting the correctly formated
> netlink message for the TTL decrement action. For both setting and getting
> the dec_ttl action, the actions should be nested in the
> OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h uapi.
>
> Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Thanks for working on this. can you share OVS kmod unit test for this action?
Matteo Croce Nov. 23, 2020, 7:36 p.m. UTC | #3
On Fri, Nov 20, 2020 at 10:12 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 19 Nov 2020 04:04:04 -0500 Eelco Chaudron wrote:
> > Currently, the openvswitch module is not accepting the correctly formated
> > netlink message for the TTL decrement action. For both setting and getting
> > the dec_ttl action, the actions should be nested in the
> > OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h uapi.
>
> IOW this change will not break any known user space, correct?
>
> But existing OvS user space already expects it to work like you
> make it work now?
>
> What's the harm in leaving it as is?
>
> > Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
> > Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>
> Can we get a review from OvS folks? Matteo looks good to you (as the
> original author)?
>

Hi,

I think that the userspace still has to implement the dec_ttl action;
by now dec_ttl is implemented with set_ttl().
So there is no breakage yet.

Eelco, with this fix we will encode the netlink attribute in the same
way for the kernel and netdev datapath?

If so, go for it.


> > -     err = __ovs_nla_copy_actions(net, attr, key, sfa, eth_type,
> > +     err = __ovs_nla_copy_actions(net, actions, key, sfa, eth_type,
> >                                    vlan_tci, mpls_label_count, log);
> >       if (err)
> >               return err;
>
> You're not canceling any nests on error, I assume this is normal.
>
> > +     add_nested_action_end(*sfa, action_start);
> >       add_nested_action_end(*sfa, start);
> >       return 0;
> >  }
>
Jakub Kicinski Nov. 24, 2020, 1:57 a.m. UTC | #4
On Mon, 23 Nov 2020 20:36:39 +0100 Matteo Croce wrote:
> On Fri, Nov 20, 2020 at 10:12 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Thu, 19 Nov 2020 04:04:04 -0500 Eelco Chaudron wrote:  
> > > Currently, the openvswitch module is not accepting the correctly formated
> > > netlink message for the TTL decrement action. For both setting and getting
> > > the dec_ttl action, the actions should be nested in the
> > > OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h uapi.  
> >
> > IOW this change will not break any known user space, correct?
> >
> > But existing OvS user space already expects it to work like you
> > make it work now?
> >
> > What's the harm in leaving it as is?
> >  
> > > Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
> > > Signed-off-by: Eelco Chaudron <echaudro@redhat.com>  
> >
> > Can we get a review from OvS folks? Matteo looks good to you (as the
> > original author)?
> 
> I think that the userspace still has to implement the dec_ttl action;
> by now dec_ttl is implemented with set_ttl().
> So there is no breakage yet.
> 
> Eelco, with this fix we will encode the netlink attribute in the same
> way for the kernel and netdev datapath?

We don't allow breaking uAPI. Sounds like the user space never
implemented this and perhaps the nesting is just inconvenient 
but not necessarily broken? If it is broken and unusable that 
has to be clearly explained in the commit message. I'm dropping 
v1 from patchwork.
Eelco Chaudron Nov. 24, 2020, 10:51 a.m. UTC | #5
On 20 Nov 2020, at 23:16, Pravin Shelar wrote:

> On Thu, Nov 19, 2020 at 1:04 AM Eelco Chaudron <echaudro@redhat.com> 
> wrote:
>>
>> Currently, the openvswitch module is not accepting the correctly 
>> formated
>> netlink message for the TTL decrement action. For both setting and 
>> getting
>> the dec_ttl action, the actions should be nested in the
>> OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h 
>> uapi.
>>
>> Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> Thanks for working on this. can you share OVS kmod unit test for this 
> action?

Hi Pravin,

I did add a self-test, however, my previous plan was to send out the 
updated OVS patch after this change got accepted. But due to all the 
comments, I sent it out anyway, so here it is with a datapath test:

https://mail.openvswitch.org/pipermail/ovs-dev/2020-November/377795.html

//Eelco
Eelco Chaudron Nov. 24, 2020, 11:18 a.m. UTC | #6
On 20 Nov 2020, at 22:12, Jakub Kicinski wrote:

> On Thu, 19 Nov 2020 04:04:04 -0500 Eelco Chaudron wrote:
>> Currently, the openvswitch module is not accepting the correctly 
>> formated
>> netlink message for the TTL decrement action. For both setting and 
>> getting
>> the dec_ttl action, the actions should be nested in the
>> OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h 
>> uapi.
>
> IOW this change will not break any known user space, correct?

It will not as there isn’t any yet. Unfortunately, the original patch 
was sent out without a userspace part. It was internally tested by the 
original authors and not properly reviewed to bring forward the issue. 
They did add some weird code to work around it.

> But existing OvS user space already expects it to work like you
> make it work now?
>
> What's the harm in leaving it as is?

Without this change, the different Datapaths in OVS behave differently, 
making the code to be datapath agnostic having to do all kinds of weird 
tricks to work around it.

But even worse, the patch in the current format could interpret 
additional options/attributes as actions, due to the actions not being 
encapsulated/nested within the actual attribute.

>> Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>
> Can we get a review from OvS folks? Matteo looks good to you (as the
> original author)?

See Matteo’s reply, looks like he is ok with this change.

>> -	err = __ovs_nla_copy_actions(net, attr, key, sfa, eth_type,
>> +	err = __ovs_nla_copy_actions(net, actions, key, sfa, eth_type,
>>  				     vlan_tci, mpls_label_count, log);
>>  	if (err)
>>  		return err;
>
> You're not canceling any nests on error, I assume this is normal.

Yes, on error the sfa actions are not used.

>> +	add_nested_action_end(*sfa, action_start);
>>  	add_nested_action_end(*sfa, start);
>>  	return 0;
>>  }
Eelco Chaudron Nov. 24, 2020, 11:19 a.m. UTC | #7
On 23 Nov 2020, at 20:36, Matteo Croce wrote:

> On Fri, Nov 20, 2020 at 10:12 PM Jakub Kicinski <kuba@kernel.org> 
> wrote:
>>
>> On Thu, 19 Nov 2020 04:04:04 -0500 Eelco Chaudron wrote:
>>> Currently, the openvswitch module is not accepting the correctly 
>>> formated
>>> netlink message for the TTL decrement action. For both setting and 
>>> getting
>>> the dec_ttl action, the actions should be nested in the
>>> OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h 
>>> uapi.
>>
>> IOW this change will not break any known user space, correct?
>>
>> But existing OvS user space already expects it to work like you
>> make it work now?
>>
>> What's the harm in leaving it as is?
>>
>>> Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>
>> Can we get a review from OvS folks? Matteo looks good to you (as the
>> original author)?
>>
>
> Hi,
>
> I think that the userspace still has to implement the dec_ttl action;
> by now dec_ttl is implemented with set_ttl().
> So there is no breakage yet.

Yes, see reply to Jakub’s email.

> Eelco, with this fix we will encode the netlink attribute in the same
> way for the kernel and netdev datapath?

Yes, this should make both implementations the same. No more weird code 
in the data-plane agnostic code :)

> If so, go for it.
>
>
>>> -     err = __ovs_nla_copy_actions(net, attr, key, sfa, eth_type,
>>> +     err = __ovs_nla_copy_actions(net, actions, key, sfa, eth_type,
>>>                                    vlan_tci, mpls_label_count, log);
>>>       if (err)
>>>               return err;
>>
>> You're not canceling any nests on error, I assume this is normal.
>>
>>> +     add_nested_action_end(*sfa, action_start);
>>>       add_nested_action_end(*sfa, start);
>>>       return 0;
>>>  }
>>
>
>
> -- 
> per aspera ad upstream
Eelco Chaudron Nov. 24, 2020, 11:20 a.m. UTC | #8
On 24 Nov 2020, at 2:57, Jakub Kicinski wrote:

> On Mon, 23 Nov 2020 20:36:39 +0100 Matteo Croce wrote:
>> On Fri, Nov 20, 2020 at 10:12 PM Jakub Kicinski <kuba@kernel.org> 
>> wrote:
>>> On Thu, 19 Nov 2020 04:04:04 -0500 Eelco Chaudron wrote:
>>>> Currently, the openvswitch module is not accepting the correctly 
>>>> formated
>>>> netlink message for the TTL decrement action. For both setting and 
>>>> getting
>>>> the dec_ttl action, the actions should be nested in the
>>>> OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h 
>>>> uapi.
>>>
>>> IOW this change will not break any known user space, correct?
>>>
>>> But existing OvS user space already expects it to work like you
>>> make it work now?
>>>
>>> What's the harm in leaving it as is?
>>>
>>>> Fixes: 744676e77720 ("openvswitch: add TTL decrement action")
>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>
>>> Can we get a review from OvS folks? Matteo looks good to you (as the
>>> original author)?
>>
>> I think that the userspace still has to implement the dec_ttl action;
>> by now dec_ttl is implemented with set_ttl().
>> So there is no breakage yet.
>>
>> Eelco, with this fix we will encode the netlink attribute in the same
>> way for the kernel and netdev datapath?
>
> We don't allow breaking uAPI. Sounds like the user space never
> implemented this and perhaps the nesting is just inconvenient
> but not necessarily broken? If it is broken and unusable that
> has to be clearly explained in the commit message. I'm dropping
> v1 from patchwork.

Thanks, I will add some explaining comments to the V2, and sent it out.
diff mbox series

Patch

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 8300cc29dec8..8d16744edc31 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -1058,4 +1058,6 @@  enum ovs_dec_ttl_attr {
 	__OVS_DEC_TTL_ATTR_MAX
 };
 
+#define OVS_DEC_TTL_ATTR_MAX (__OVS_DEC_TTL_ATTR_MAX - 1)
+
 #endif /* _LINUX_OPENVSWITCH_H */
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index b87bfc82f44f..5829a020b81c 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -958,14 +958,13 @@  static int dec_ttl_exception_handler(struct datapath *dp, struct sk_buff *skb,
 {
 	/* The first action is always 'OVS_DEC_TTL_ATTR_ARG'. */
 	struct nlattr *dec_ttl_arg = nla_data(attr);
-	int rem = nla_len(attr);
 
 	if (nla_len(dec_ttl_arg)) {
-		struct nlattr *actions = nla_next(dec_ttl_arg, &rem);
+		struct nlattr *actions = nla_data(dec_ttl_arg);
 
 		if (actions)
-			return clone_execute(dp, skb, key, 0, actions, rem,
-					     last, false);
+			return clone_execute(dp, skb, key, 0, nla_data(actions),
+					     nla_len(actions), last, false);
 	}
 	consume_skb(skb);
 	return 0;
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 9d3e50c4d29f..ec0689ddc635 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -2503,28 +2503,42 @@  static int validate_and_copy_dec_ttl(struct net *net,
 				     __be16 eth_type, __be16 vlan_tci,
 				     u32 mpls_label_count, bool log)
 {
-	int start, err;
-	u32 nested = true;
+	const struct nlattr *attrs[OVS_DEC_TTL_ATTR_MAX + 1];
+	int start, action_start, err, rem;
+	const struct nlattr *a, *actions;
+
+	memset(attrs, 0, sizeof(attrs));
+	nla_for_each_nested(a, attr, rem) {
+		int type = nla_type(a);
 
-	if (!nla_len(attr))
-		return ovs_nla_add_action(sfa, OVS_ACTION_ATTR_DEC_TTL,
-					  NULL, 0, log);
+		/* Ignore unknown attributes to be future proof. */
+		if (type > OVS_DEC_TTL_ATTR_MAX)
+			continue;
+
+		if (!type || attrs[type])
+			return -EINVAL;
+
+		attrs[type] = a;
+	}
+
+	actions = attrs[OVS_DEC_TTL_ATTR_ACTION];
+	if (rem || !actions || (nla_len(actions) && nla_len(actions) < NLA_HDRLEN))
+		return -EINVAL;
 
 	start = add_nested_action_start(sfa, OVS_ACTION_ATTR_DEC_TTL, log);
 	if (start < 0)
 		return start;
 
-	err = ovs_nla_add_action(sfa, OVS_DEC_TTL_ATTR_ACTION, &nested,
-				 sizeof(nested), log);
-
-	if (err)
-		return err;
+	action_start = add_nested_action_start(sfa, OVS_DEC_TTL_ATTR_ACTION, log);
+	if (action_start < 0)
+		return start;
 
-	err = __ovs_nla_copy_actions(net, attr, key, sfa, eth_type,
+	err = __ovs_nla_copy_actions(net, actions, key, sfa, eth_type,
 				     vlan_tci, mpls_label_count, log);
 	if (err)
 		return err;
 
+	add_nested_action_end(*sfa, action_start);
 	add_nested_action_end(*sfa, start);
 	return 0;
 }
@@ -3487,20 +3501,42 @@  static int check_pkt_len_action_to_attr(const struct nlattr *attr,
 static int dec_ttl_action_to_attr(const struct nlattr *attr,
 				  struct sk_buff *skb)
 {
-	int err = 0, rem = nla_len(attr);
-	struct nlattr *start;
+	struct nlattr *start, *action_start;
+	const struct nlattr *a;
+	int err = 0, rem;
 
 	start = nla_nest_start_noflag(skb, OVS_ACTION_ATTR_DEC_TTL);
-
 	if (!start)
 		return -EMSGSIZE;
 
-	err = ovs_nla_put_actions(nla_data(attr), rem, skb);
-	if (err)
-		nla_nest_cancel(skb, start);
-	else
-		nla_nest_end(skb, start);
+	nla_for_each_attr(a, nla_data(attr), nla_len(attr), rem) {
+		switch (nla_type(a)) {
+		case OVS_DEC_TTL_ATTR_ACTION:
+
+			action_start = nla_nest_start_noflag(skb, OVS_DEC_TTL_ATTR_ACTION);
+			if (!action_start) {
+				err = -EMSGSIZE;
+				goto out;
+			}
+
+			err = ovs_nla_put_actions(nla_data(a), nla_len(a), skb);
+			if (err)
+				goto out;
+
+			nla_nest_end(skb, action_start);
+			break;
 
+		default:
+			/* Ignore all other option to be future compatible */
+			break;
+		}
+	}
+
+	nla_nest_end(skb, start);
+	return 0;
+
+out:
+	nla_nest_cancel(skb, start);
 	return err;
 }