[ovs-dev,net-next,1/2] openvswitch: add tunnel protocol to sw_flow_key
diff mbox

Message ID 428cee4cb7e04cfea5bef90b37d257562ea78a83.1443548447.git.jbenc@redhat.com
State Not Applicable
Headers show

Commit Message

Jiri Benc Sept. 29, 2015, 5:52 p.m. UTC
Store tunnel protocol (AF_INET or AF_INET6) in sw_flow_key. This field now
also acts as an indicator whether the flow contains tunnel data (this was
previously indicated by tun_key.u.ipv4.dst being set but with IPv6 addresses
in an union with IPv4 ones this won't work anymore).

The new field was added to a hole in sw_flow_key.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 net/openvswitch/flow.c         | 4 ++--
 net/openvswitch/flow.h         | 1 +
 net/openvswitch/flow_netlink.c | 7 +++++--
 net/openvswitch/flow_table.c   | 2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

Comments

Pravin B Shelar Sept. 29, 2015, 8:41 p.m. UTC | #1
On Tue, Sep 29, 2015 at 10:52 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Store tunnel protocol (AF_INET or AF_INET6) in sw_flow_key. This field now
> also acts as an indicator whether the flow contains tunnel data (this was
> previously indicated by tun_key.u.ipv4.dst being set but with IPv6 addresses
> in an union with IPv4 ones this won't work anymore).
>
> The new field was added to a hole in sw_flow_key.
>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
>  net/openvswitch/flow.c         | 4 ++--
>  net/openvswitch/flow.h         | 1 +
>  net/openvswitch/flow_netlink.c | 7 +++++--
>  net/openvswitch/flow_table.c   | 2 +-
>  4 files changed, 9 insertions(+), 5 deletions(-)
>
...

> diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
> index fe527d2dd4b7..5688e33e2de6 100644
> --- a/net/openvswitch/flow.h
> +++ b/net/openvswitch/flow.h
> @@ -63,6 +63,7 @@ struct sw_flow_key {
>                 u32     skb_mark;       /* SKB mark. */
>                 u16     in_port;        /* Input switch port (or DP_MAX_PORTS). */
>         } __packed phy; /* Safe when right after 'tun_key'. */
> +       u8 tun_proto;                   /* Protocol of encapsulating tunnel. */

We can add rather add TUNNEL_IPV6 flag to distinguish IPv4 and IPv6
tunnel keys. This can be stored in ip_tunnel_key.tun_flags. That also
saves space in flow key.
Jesse Gross Sept. 30, 2015, 2:08 a.m. UTC | #2
On Tue, Sep 29, 2015 at 10:52 AM, Jiri Benc <jbenc@redhat.com> wrote:
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index 5c030a4d7338..03ba070c3256 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -643,6 +643,7 @@ static int ipv4_tun_from_nlattr(const struct nlattr *attr,
>         }
>
>         SW_FLOW_KEY_PUT(match, tun_key.tun_flags, tun_flags, is_mask);
> +       SW_FLOW_KEY_PUT(match, tun_proto, AF_INET, is_mask);

I don't think this is right in the case of the mask. It will cause the
the mask to be the value AF_INET - instead you want to set the mask to
be 0xff.
Jiri Benc Sept. 30, 2015, 7:09 a.m. UTC | #3
On Tue, 29 Sep 2015 13:41:34 -0700, Pravin Shelar wrote:
> We can add rather add TUNNEL_IPV6 flag to distinguish IPv4 and IPv6
> tunnel keys. This can be stored in ip_tunnel_key.tun_flags.

Not really. This was my original approach, too, but openvswitch is not
the only user of struct ip_tunnel_key, and in the lwtunnel core,
tun_flags are handled in the way that makes this impractical. Most
importantly, the tun_flags value is directly taken from/stored to
LWTUNNEL_IP_FLAGS/LWTUNNEL_IP6_FLAGS netlink attributes in
net/ipv4/ip_tunnel_core.c. This would mean complicated masking, etc.

> That also saves space in flow key.

The field was added to a 2 byte hole in the struct sw_flow_key (leaving
still 1 byte free), thus there's no additional space used.

 Jiri
Jiri Benc Sept. 30, 2015, 7:14 a.m. UTC | #4
On Tue, 29 Sep 2015 19:08:44 -0700, Jesse Gross wrote:
> On Tue, Sep 29, 2015 at 10:52 AM, Jiri Benc <jbenc@redhat.com> wrote:
> > diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> > index 5c030a4d7338..03ba070c3256 100644
> > --- a/net/openvswitch/flow_netlink.c
> > +++ b/net/openvswitch/flow_netlink.c
> > @@ -643,6 +643,7 @@ static int ipv4_tun_from_nlattr(const struct nlattr *attr,
> >         }
> >
> >         SW_FLOW_KEY_PUT(match, tun_key.tun_flags, tun_flags, is_mask);
> > +       SW_FLOW_KEY_PUT(match, tun_proto, AF_INET, is_mask);
> 
> I don't think this is right in the case of the mask. It will cause the
> the mask to be the value AF_INET - instead you want to set the mask to
> be 0xff.

I think you're right, this is a special case. I'll fix it.

Thanks,

 Jiri
Pravin B Shelar Sept. 30, 2015, 8:13 p.m. UTC | #5
On Wed, Sep 30, 2015 at 12:09 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Tue, 29 Sep 2015 13:41:34 -0700, Pravin Shelar wrote:
>> We can add rather add TUNNEL_IPV6 flag to distinguish IPv4 and IPv6
>> tunnel keys. This can be stored in ip_tunnel_key.tun_flags.
>
> Not really. This was my original approach, too, but openvswitch is not
> the only user of struct ip_tunnel_key, and in the lwtunnel core,
> tun_flags are handled in the way that makes this impractical. Most
> importantly, the tun_flags value is directly taken from/stored to
> LWTUNNEL_IP_FLAGS/LWTUNNEL_IP6_FLAGS netlink attributes in
> net/ipv4/ip_tunnel_core.c. This would mean complicated masking, etc.
>
How is it impractical ? Userspace can set flag for IPv6 tunnel info.
That should be easy.

IPv6 bit can not be masked anyways so I do not see problem with
masking this flag due to the new bit.

Since this field is exposed to userspace. TUNNEL_* flags needs to be
moved to uapi header.


>> That also saves space in flow key.
>
> The field was added to a 2 byte hole in the struct sw_flow_key (leaving
> still 1 byte free), thus there's no additional space used.
>
>  Jiri
>
> --
> Jiri Benc
Jesse Gross Sept. 30, 2015, 8:25 p.m. UTC | #6
On Wed, Sep 30, 2015 at 1:13 PM, Pravin Shelar <pshelar@nicira.com> wrote:
> On Wed, Sep 30, 2015 at 12:09 AM, Jiri Benc <jbenc@redhat.com> wrote:
>> On Tue, 29 Sep 2015 13:41:34 -0700, Pravin Shelar wrote:
>>> We can add rather add TUNNEL_IPV6 flag to distinguish IPv4 and IPv6
>>> tunnel keys. This can be stored in ip_tunnel_key.tun_flags.
>>
>> Not really. This was my original approach, too, but openvswitch is not
>> the only user of struct ip_tunnel_key, and in the lwtunnel core,
>> tun_flags are handled in the way that makes this impractical. Most
>> importantly, the tun_flags value is directly taken from/stored to
>> LWTUNNEL_IP_FLAGS/LWTUNNEL_IP6_FLAGS netlink attributes in
>> net/ipv4/ip_tunnel_core.c. This would mean complicated masking, etc.
>>
> How is it impractical ? Userspace can set flag for IPv6 tunnel info.
> That should be easy.
>
> IPv6 bit can not be masked anyways so I do not see problem with
> masking this flag due to the new bit.

I think he meant for non-OVS users.

> Since this field is exposed to userspace. TUNNEL_* flags needs to be
> moved to uapi header.

This doesn't really seem all that desirable to me. It's nice to be
able to change these as necessary and in the particular case of IPv6,
it seems like something that the kernel can manage by itself (as is
done in this patch and I think the same strategy would apply
regardless of the particular representation).
Jiri Benc Sept. 30, 2015, 8:52 p.m. UTC | #7
On Wed, 30 Sep 2015 13:25:12 -0700, Jesse Gross wrote:
> On Wed, Sep 30, 2015 at 1:13 PM, Pravin Shelar <pshelar@nicira.com> wrote:
> > On Wed, Sep 30, 2015 at 12:09 AM, Jiri Benc <jbenc@redhat.com> wrote:
> >> On Tue, 29 Sep 2015 13:41:34 -0700, Pravin Shelar wrote:
> >>> We can add rather add TUNNEL_IPV6 flag to distinguish IPv4 and IPv6
> >>> tunnel keys. This can be stored in ip_tunnel_key.tun_flags.
> >>
> >> Not really. This was my original approach, too, but openvswitch is not
> >> the only user of struct ip_tunnel_key, and in the lwtunnel core,
> >> tun_flags are handled in the way that makes this impractical. Most
> >> importantly, the tun_flags value is directly taken from/stored to
> >> LWTUNNEL_IP_FLAGS/LWTUNNEL_IP6_FLAGS netlink attributes in
> >> net/ipv4/ip_tunnel_core.c. This would mean complicated masking, etc.
> >>
> > How is it impractical ? Userspace can set flag for IPv6 tunnel info.
> > That should be easy.
> >
> > IPv6 bit can not be masked anyways so I do not see problem with
> > masking this flag due to the new bit.
> 
> I think he meant for non-OVS users.

Yes, I didn't mean masking in ovs, I meant that we'd need to hide the
bit from other users, for example in net/ipv4/ip_tunnel_core.c.
Currently, the information about ip_tunnel_key protocol is stored
outside the structure. Changing this would mean quite big changes in
the lwtunnel code (or, rather, IP users of lwtunnel) which doesn't seem
worth it just because of ovs. Especially when ovs can store the
information just fine without impact on memory footprint.

I don't see any real advantage in storing the protocol inside
ip_tunnel_key, this looks like it would be just a change for the change.

> > Since this field is exposed to userspace. TUNNEL_* flags needs to be
> > moved to uapi header.
> 
> This doesn't really seem all that desirable to me. It's nice to be
> able to change these as necessary and in the particular case of IPv6,
> it seems like something that the kernel can manage by itself (as is
> done in this patch and I think the same strategy would apply
> regardless of the particular representation).

User space can set and get those bits in LWTUNNEL_IP_FLAGS netlink
attribute when using lwtunnel+routing rules. It would make sense to
move them to uapi but that's for a different patch(set).

 Jiri

Patch
diff mbox

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index c8db44ab2ee7..0ea128eeeab2 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -698,8 +698,7 @@  int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 {
 	/* Extract metadata from packet. */
 	if (tun_info) {
-		if (ip_tunnel_info_af(tun_info) != AF_INET)
-			return -EINVAL;
+		key->tun_proto = ip_tunnel_info_af(tun_info);
 		memcpy(&key->tun_key, &tun_info->key, sizeof(key->tun_key));
 
 		if (tun_info->options_len) {
@@ -714,6 +713,7 @@  int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 			key->tun_opts_len = 0;
 		}
 	} else  {
+		key->tun_proto = 0;
 		key->tun_opts_len = 0;
 		memset(&key->tun_key, 0, sizeof(key->tun_key));
 	}
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index fe527d2dd4b7..5688e33e2de6 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -63,6 +63,7 @@  struct sw_flow_key {
 		u32	skb_mark;	/* SKB mark. */
 		u16	in_port;	/* Input switch port (or DP_MAX_PORTS). */
 	} __packed phy; /* Safe when right after 'tun_key'. */
+	u8 tun_proto;			/* Protocol of encapsulating tunnel. */
 	u32 ovs_flow_hash;		/* Datapath computed hash value.  */
 	u32 recirc_id;			/* Recirculation ID.  */
 	struct {
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 5c030a4d7338..03ba070c3256 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -643,6 +643,7 @@  static int ipv4_tun_from_nlattr(const struct nlattr *attr,
 	}
 
 	SW_FLOW_KEY_PUT(match, tun_key.tun_flags, tun_flags, is_mask);
+	SW_FLOW_KEY_PUT(match, tun_proto, AF_INET, is_mask);
 
 	if (rem > 0) {
 		OVS_NLERR(log, "IPv4 tunnel attribute has %d unknown bytes.",
@@ -1194,7 +1195,7 @@  int ovs_nla_get_match(struct net *net, struct sw_flow_match *match,
 			/* The userspace does not send tunnel attributes that
 			 * are 0, but we should not wildcard them nonetheless.
 			 */
-			if (match->key->tun_key.u.ipv4.dst)
+			if (match->key->tun_proto)
 				SW_FLOW_KEY_MEMSET_FIELD(match, tun_key,
 							 0xff, true);
 
@@ -1367,7 +1368,7 @@  static int __ovs_nla_put_key(const struct sw_flow_key *swkey,
 	if (nla_put_u32(skb, OVS_KEY_ATTR_PRIORITY, output->phy.priority))
 		goto nla_put_failure;
 
-	if ((swkey->tun_key.u.ipv4.dst || is_mask)) {
+	if ((swkey->tun_proto || is_mask)) {
 		const void *opts = NULL;
 
 		if (output->tun_key.tun_flags & TUNNEL_OPTIONS_PRESENT)
@@ -1913,6 +1914,8 @@  static int validate_and_copy_set_tun(const struct nlattr *attr,
 
 	tun_info = &tun_dst->u.tun_info;
 	tun_info->mode = IP_TUNNEL_INFO_TX;
+	if (key.tun_proto == AF_INET6)
+		tun_info->mode |= IP_TUNNEL_INFO_IPV6;
 	tun_info->key = key.tun_key;
 
 	/* We need to store the options in the action itself since
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index f2ea83ba4763..95dbcedf0bd4 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -427,7 +427,7 @@  static u32 flow_hash(const struct sw_flow_key *key,
 
 static int flow_key_start(const struct sw_flow_key *key)
 {
-	if (key->tun_key.u.ipv4.dst)
+	if (key->tun_proto)
 		return 0;
 	else
 		return rounddown(offsetof(struct sw_flow_key, phy),