[ovs-dev,net-next] openvswitch: report features supported by the kernel datapath
diff mbox

Message ID f035dbc85f4fb7e8c8691801849a9db286154bde.1444312374.git.jbenc@redhat.com
State Not Applicable
Headers show

Commit Message

Jiri Benc Oct. 8, 2015, 1:53 p.m. UTC
Allow the user space to query what features are supported by the openvswitch
module. This will be used to allow or disallow certain configurations and/or
switch between newer and older APIs depending on what the kernel supports.

Two features are reported as supported by this patch: lwtunnel and IPv6
tunneling support. Theoretically, we could merge these two, as any of them
implies the other with this patch applied, but it's better to keep them
separate: kernel 4.3 supports lwtunnels but not IPv6 for ovs, and the
separation of the two flags allows us to backport a version of this patch
to 4.3 should the need arise.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
This is intentionally submitted to net-next. Kernel 4.3 will be fine without
this feature (but despite that, I'd like to keep the two flags separate).
---
 include/uapi/linux/openvswitch.h | 12 ++++++++++++
 net/openvswitch/datapath.c       | 21 ++++++++++++++++++---
 2 files changed, 30 insertions(+), 3 deletions(-)

Comments

Jesse Gross Oct. 8, 2015, 10:40 p.m. UTC | #1
On Thu, Oct 8, 2015 at 6:53 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Allow the user space to query what features are supported by the openvswitch
> module. This will be used to allow or disallow certain configurations and/or
> switch between newer and older APIs depending on what the kernel supports.
>
> Two features are reported as supported by this patch: lwtunnel and IPv6
> tunneling support. Theoretically, we could merge these two, as any of them
> implies the other with this patch applied, but it's better to keep them
> separate: kernel 4.3 supports lwtunnels but not IPv6 for ovs, and the
> separation of the two flags allows us to backport a version of this patch
> to 4.3 should the need arise.
>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>

I have similar concerns as were expressed in the other thread. The
features listed here aren't OVS components and I don't think that it
makes sense for OVS to try to cover everything that is related - the
goal that we've been working towards is to have OVS be less monolithic
and more integrated. So to the extent that it is necessary to have
capabilities be exposed (and I would like to avoid this where
possible), it should be in the individual component, not in OVS.
Thomas Graf Oct. 9, 2015, 9:24 a.m. UTC | #2
On 10/08/15 at 03:40pm, Jesse Gross wrote:
> On Thu, Oct 8, 2015 at 6:53 AM, Jiri Benc <jbenc@redhat.com> wrote:
> > Allow the user space to query what features are supported by the openvswitch
> > module. This will be used to allow or disallow certain configurations and/or
> > switch between newer and older APIs depending on what the kernel supports.
> >
> > Two features are reported as supported by this patch: lwtunnel and IPv6
> > tunneling support. Theoretically, we could merge these two, as any of them
> > implies the other with this patch applied, but it's better to keep them
> > separate: kernel 4.3 supports lwtunnels but not IPv6 for ovs, and the
> > separation of the two flags allows us to backport a version of this patch
> > to 4.3 should the need arise.
> >
> > Signed-off-by: Jiri Benc <jbenc@redhat.com>
> 
> I have similar concerns as were expressed in the other thread. The
> features listed here aren't OVS components and I don't think that it
> makes sense for OVS to try to cover everything that is related - the
> goal that we've been working towards is to have OVS be less monolithic
> and more integrated. So to the extent that it is necessary to have
> capabilities be exposed (and I would like to avoid this where
> possible), it should be in the individual component, not in OVS.

I'm fine with that as well. However, I do dislike the idea of creating
net_devices with a set of parameters just to figure if the parameters
are supported or not. This works OK for the first step of evolution
where we have support or not but it gets absolutely messy when we
have: no support, multiple levels of partial support and finally full
support.

We have been thinking about a more generic capabilities Netlink
interface for a while and this looks like a good justification for
finally doing that work.
Jiri Benc Oct. 9, 2015, 9:46 a.m. UTC | #3
On Fri, 9 Oct 2015 11:24:53 +0200, Thomas Graf wrote:
> On 10/08/15 at 03:40pm, Jesse Gross wrote:
> > I have similar concerns as were expressed in the other thread. The
> > features listed here aren't OVS components and I don't think that it
> > makes sense for OVS to try to cover everything that is related - the
> > goal that we've been working towards is to have OVS be less monolithic
> > and more integrated. So to the extent that it is necessary to have
> > capabilities be exposed (and I would like to avoid this where
> > possible), it should be in the individual component, not in OVS.

Fair enough. Note that the IPv6 flag really belongs to ovs, though -
it's about the existence of OVS_TUNNEL_KEY_ATTR_IPV6_SRC and
OVS_TUNNEL_KEY_ATTR_IPV6_DST netlink attributes. For the lwtunnel flag
(which is just another way to tell whether vxlan/geneve/etc. has
COLLECT_METADATA support) I can agree that it does not belong to ovs.

> I'm fine with that as well. However, I do dislike the idea of creating
> net_devices with a set of parameters just to figure if the parameters
> are supported or not. This works OK for the first step of evolution
> where we have support or not but it gets absolutely messy when we
> have: no support, multiple levels of partial support and finally full
> support.

100% agreed.

> We have been thinking about a more generic capabilities Netlink
> interface for a while and this looks like a good justification for
> finally doing that work.

I've been looking into this since morning and everything I've been able
to come up with seems to be quite intrusive. Before investing time to
create a long patchset that might be potentially rejected, I'd like to
get some opinions.

My thoughts are introducing either RTM_VALIDATELINK or
RTM_NEWLINK_STRICT. In the first case, it would just check whether the
passed attributes are okay for "strict" creation of the link; in the
second case, it would either reject the request, or create the link
(similarly to what RTM_NEWLINK does but with "strict" attributes
checking).

The "strict" checking would mean:

- Rejecting attributes with type <= 0 and > maxtype (i.e. changing
  nla_parse, nlmsg_parse, etc. to do optional strict checking based on
  a passed bool parameter).

- Adding the bool parameter for strict checking to rtnl_link_ops
  validate and slave_validate callbacks.

It would mean refactoring of rtnl_newlink.

Or do you have something more generic in mind? Like adding a new
NLM_F_REQUEST_STRICT flag to nlmsghdr to be used instead of
NLM_F_REQUEST?

Thanks,

 Jiri
Jesse Gross Oct. 9, 2015, 10:06 p.m. UTC | #4
On Fri, Oct 9, 2015 at 2:46 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Fri, 9 Oct 2015 11:24:53 +0200, Thomas Graf wrote:
>> On 10/08/15 at 03:40pm, Jesse Gross wrote:
>> > I have similar concerns as were expressed in the other thread. The
>> > features listed here aren't OVS components and I don't think that it
>> > makes sense for OVS to try to cover everything that is related - the
>> > goal that we've been working towards is to have OVS be less monolithic
>> > and more integrated. So to the extent that it is necessary to have
>> > capabilities be exposed (and I would like to avoid this where
>> > possible), it should be in the individual component, not in OVS.
>
> Fair enough. Note that the IPv6 flag really belongs to ovs, though -
> it's about the existence of OVS_TUNNEL_KEY_ATTR_IPV6_SRC and
> OVS_TUNNEL_KEY_ATTR_IPV6_DST netlink attributes. For the lwtunnel flag
> (which is just another way to tell whether vxlan/geneve/etc. has
> COLLECT_METADATA support) I can agree that it does not belong to ovs.

We actually already have a mechanism for handling compatibility as new
keys are added without requiring capabilities bits - see
Documentation/networking/openvswitch.txt. We've used this quite a few
times, including the addition of baseline tunnel capabilities.

That doesn't really cover IPv6 capability for individual tunneling
protocols though (which could vary on a per-protocol basis).

Patch
diff mbox

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 32e07d8cbaf4..3d8d00b6e55d 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -73,6 +73,10 @@  enum ovs_datapath_cmd {
  * datapath.  Always present in notifications.
  * @OVS_DP_ATTR_MEGAFLOW_STATS: Statistics about mega flow masks usage for the
  * datapath. Always present in notifications.
+ * @OVS_DP_ATTR_KERNEL_FEATURES: Bitmask of features that the datapath
+ * supports.  When creating a datapath, this is a minimal feature set
+ * requested, the request will fail if it cannot be honored.  On get
+ * operations, this contains all features supported by the datapath.
  *
  * These attributes follow the &struct ovs_header within the Generic Netlink
  * payload for %OVS_DP_* commands.
@@ -84,6 +88,7 @@  enum ovs_datapath_attr {
 	OVS_DP_ATTR_STATS,		/* struct ovs_dp_stats */
 	OVS_DP_ATTR_MEGAFLOW_STATS,	/* struct ovs_dp_megaflow_stats */
 	OVS_DP_ATTR_USER_FEATURES,	/* OVS_DP_F_*  */
+	OVS_DP_ATTR_KERNEL_FEATURES,	/* OVS_DP_KF_* */
 	__OVS_DP_ATTR_MAX
 };
 
@@ -121,6 +126,13 @@  struct ovs_vport_stats {
 /* Allow datapath to associate multiple Netlink PIDs to each vport */
 #define OVS_DP_F_VPORT_PIDS	(1 << 1)
 
+/* Datapath features */
+#define OVS_DP_KF_LWTUNNEL	(1ULL << 0)  /* lwtunnel interface supported */
+#define OVS_DP_KF_IPV6_TUNNEL	(1ULL << 1)  /* IPv6 tunneling supported */
+
+#define OVS_DP_KF_ALL		(OVS_DP_KF_LWTUNNEL | \
+				 OVS_DP_KF_IPV6_TUNNEL)
+
 /* Fixed logical ports. */
 #define OVSP_LOCAL      ((__u32)0)
 
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index a75828091e21..896435779fc8 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1439,6 +1439,7 @@  static size_t ovs_dp_cmd_msg_size(void)
 	msgsize += nla_total_size(sizeof(struct ovs_dp_stats));
 	msgsize += nla_total_size(sizeof(struct ovs_dp_megaflow_stats));
 	msgsize += nla_total_size(sizeof(u32)); /* OVS_DP_ATTR_USER_FEATURES */
+	msgsize += nla_total_size(sizeof(u64)); /* OVS_DP_ATTR_KERNEL_FEATURES */
 
 	return msgsize;
 }
@@ -1476,6 +1477,9 @@  static int ovs_dp_cmd_fill_info(struct datapath *dp, struct sk_buff *skb,
 	if (nla_put_u32(skb, OVS_DP_ATTR_USER_FEATURES, dp->user_features))
 		goto nla_put_failure;
 
+	if (nla_put_u64(skb, OVS_DP_ATTR_KERNEL_FEATURES, OVS_DP_KF_ALL))
+		goto nla_put_failure;
+
 	genlmsg_end(skb, ovs_header);
 	return 0;
 
@@ -1526,6 +1530,13 @@  static void ovs_dp_change(struct datapath *dp, struct nlattr *a[])
 		dp->user_features = nla_get_u32(a[OVS_DP_ATTR_USER_FEATURES]);
 }
 
+static bool ovs_dp_kernel_features_ok(struct nlattr *a[])
+{
+	if (!a[OVS_DP_ATTR_KERNEL_FEATURES])
+		return true;
+	return !(nla_get_u64(a[OVS_DP_ATTR_KERNEL_FEATURES]) & ~OVS_DP_KF_ALL);
+}
+
 static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 {
 	struct nlattr **a = info->attrs;
@@ -1536,9 +1547,10 @@  static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	struct ovs_net *ovs_net;
 	int err, i;
 
-	err = -EINVAL;
 	if (!a[OVS_DP_ATTR_NAME] || !a[OVS_DP_ATTR_UPCALL_PID])
-		goto err;
+		return -EINVAL;
+	if (!ovs_dp_kernel_features_ok(a))
+		return -EOPNOTSUPP;
 
 	reply = ovs_dp_cmd_alloc_info(info);
 	if (!reply)
@@ -1626,7 +1638,6 @@  err_free_dp:
 	kfree(dp);
 err_free_reply:
 	kfree_skb(reply);
-err:
 	return err;
 }
 
@@ -1694,6 +1705,9 @@  static int ovs_dp_cmd_set(struct sk_buff *skb, struct genl_info *info)
 	struct datapath *dp;
 	int err;
 
+	if (!ovs_dp_kernel_features_ok(info->attrs))
+		return -EOPNOTSUPP;
+
 	reply = ovs_dp_cmd_alloc_info(info);
 	if (!reply)
 		return -ENOMEM;
@@ -1777,6 +1791,7 @@  static const struct nla_policy datapath_policy[OVS_DP_ATTR_MAX + 1] = {
 	[OVS_DP_ATTR_NAME] = { .type = NLA_NUL_STRING, .len = IFNAMSIZ - 1 },
 	[OVS_DP_ATTR_UPCALL_PID] = { .type = NLA_U32 },
 	[OVS_DP_ATTR_USER_FEATURES] = { .type = NLA_U32 },
+	[OVS_DP_ATTR_KERNEL_FEATURES] = { .type = NLA_U64 },
 };
 
 static const struct genl_ops dp_datapath_genl_ops[] = {