diff mbox

[ovs-dev,v3,1/5] datapath: Fixups for MPLS GSO

Message ID 1493243891-92162-1-git-send-email-yihung.wei@gmail.com
State Superseded
Headers show

Commit Message

Yi-Hung Wei April 26, 2017, 9:58 p.m. UTC
This patch backports the following two upstream commits to fix MPLS GSO in
ovs datapath. Starting from upstream commit 48d2ab609b6b ("net: mpls: Fixups
for GSO"), the mpls_gso kernel module relies on the fact that
skb_network_header() points to the mpls header and skb_inner_network_header()
points to the L3 header so that it can derive the length of mpls header
correctly, and the upstream commit updates how ovs datapath marks the skb
header when push and pop mpls. However, the old mpls_gso kernel module
assumes that the skb_network_header() points to the L3 header, and the old
mpls_gso kernel module will misbehave if the ovs datapath marks the
skb_network_header() in the new way since it will treat mpls header as the L3
header.

Because of the functional signature of mpls_gso_segment() does not change,
this backport patch uses the new mpls_hdr() to determine if the kernel that
ovs datapath is compiled with has the new or legacy mpls_gso kernel module.
It has been tested on kernel 4.4 and 4.9.

Upstream commit:
    commit 48d2ab609b6bbecb7698487c8579bc40de9d6dfa
    Author: David Ahern <dsa@cumulusnetworks.com>
    Date:   Wed Aug 24 20:10:44 2016 -0700

    net: mpls: Fixups for GSO

    As reported by Lennert the MPLS GSO code is failing to properly segment
    large packets. There are a couple of problems:

    1. the inner protocol is not set so the gso segment functions for inner
       protocol layers are not getting run, and

    2  MPLS labels for packets that use the "native" (non-OVS) MPLS code
       are not properly accounted for in mpls_gso_segment.

    The MPLS GSO code was added for OVS. It is re-using skb_mac_gso_segment
    to call the gso segment functions for the higher layer protocols. That
    means skb_mac_gso_segment is called twice -- once with the network
    protocol set to MPLS and again with the network protocol set to the
    inner protocol.

    This patch sets the inner skb protocol addressing item 1 above and sets
    the network_header and inner_network_header to mark where the MPLS labels
    start and end. The MPLS code in OVS is also updated to set the two
    network markers.

    >From there the MPLS GSO code uses the difference between the network
    header and the inner network header to know the size of the MPLS header
    that was pushed. It then pulls the MPLS header, resets the mac_len and
    protocol for the inner protocol and then calls skb_mac_gso_segment
    to segment the skb.

    Afterward the inner protocol segmentation is done the skb protocol
    is set to mpls for each segment and the network and mac headers
    restored.

    Reported-by: Lennert Buytenhek <buytenh@wantstofly.org>
    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Upstream commit:
    commit 85de4a2101acb85c3b1dde465e84596ccca99f2c
    Author: Jiri Benc <jbenc@redhat.com>
    Date:   Fri Sep 30 19:08:07 2016 +0200

    openvswitch: use mpls_hdr

    skb_mpls_header is equivalent to mpls_hdr now. Use the existing helper
    instead.

    Signed-off-by: Jiri Benc <jbenc@redhat.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
---
 acinclude.m4                             |  2 ++
 datapath/actions.c                       | 33 +++++++++++++++++++-------------
 datapath/linux/compat/include/net/mpls.h | 27 ++++++++++++++++++++++++--
 3 files changed, 47 insertions(+), 15 deletions(-)

Comments

Simon Horman May 1, 2017, 10:42 a.m. UTC | #1
Hi Yi-Hung,

I'm having a little trouble following the status of this patchset.
Would it be possible to repost v3 (or post v4 if that is pending) with:

* A cover letter including:
  - An overview of the patchset
  - An overview of the changes between v2 and v3
    (and v4 is there is one). Changes between v1 and v2 would also be
    nice but I think I have a handle on those. Alternatively the details
    of changes between revisions could be after changelog and a '---'
    in each patch
* All (5) patches.
Yi-Hung Wei May 1, 2017, 5:24 p.m. UTC | #2
Sure, I will rebase the patch set and post the v4.

Thanks,

-Yi-Hung

On Mon, May 1, 2017 at 3:42 AM, Simon Horman <simon.horman@netronome.com> wrote:
> Hi Yi-Hung,
>
> I'm having a little trouble following the status of this patchset.
> Would it be possible to repost v3 (or post v4 if that is pending) with:
>
> * A cover letter including:
>   - An overview of the patchset
>   - An overview of the changes between v2 and v3
>     (and v4 is there is one). Changes between v1 and v2 would also be
>     nice but I think I have a handle on those. Alternatively the details
>     of changes between revisions could be after changelog and a '---'
>     in each patch
> * All (5) patches.
diff mbox

Patch

diff --git a/acinclude.m4 b/acinclude.m4
index 9f8e30d9b47a..fef4872a4cd7 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -479,6 +479,8 @@  AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
                         [OVS_GREP_IFELSE([$KSRC/include/net/dst_cache.h], [dst_cache],
                                          [OVS_DEFINE([USE_UPSTREAM_TUNNEL])])])])
 
+  OVS_GREP_IFELSE([$KSRC/include/net/mpls.h], [mpls_hdr],
+                  [OVS_DEFINE([MPLS_HEADER_IS_L3])])
   OVS_GREP_IFELSE([$KSRC/include/linux/net.h], [sock_create_kern.*net],
                   [OVS_DEFINE([HAVE_SOCK_CREATE_KERN_NET])])
   OVS_GREP_IFELSE([$KSRC/include/linux/netdevice.h], [ndo_fill_metadata_dst])
diff --git a/datapath/actions.c b/datapath/actions.c
index 75f17709aec0..9ac02bc8ee2f 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -186,7 +186,7 @@  static void update_ethertype(struct sk_buff *skb, struct ethhdr *hdr,
 static int push_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 		     const struct ovs_action_push_mpls *mpls)
 {
-	__be32 *new_mpls_lse;
+	struct mpls_shim_hdr *new_mpls_lse;
 
 	/* Networking stack do not allow simultaneous Tunnel and MPLS GSO. */
 	if (skb->encapsulation)
@@ -195,20 +195,26 @@  static int push_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 	if (skb_cow_head(skb, MPLS_HLEN) < 0)
 		return -ENOMEM;
 
+	if (!ovs_skb_get_inner_protocol(skb)) {
+		skb_set_inner_network_header(skb, skb->mac_len);
+		ovs_skb_set_inner_protocol(skb, skb->protocol);
+	}
+
 	skb_push(skb, MPLS_HLEN);
 	memmove(skb_mac_header(skb) - MPLS_HLEN, skb_mac_header(skb),
 		skb->mac_len);
 	skb_reset_mac_header(skb);
+#ifdef MPLS_HEADER_IS_L3
+	skb_set_network_header(skb, skb->mac_len);
+#endif
 
-	new_mpls_lse = (__be32 *)skb_mpls_header(skb);
-	*new_mpls_lse = mpls->mpls_lse;
+	new_mpls_lse = mpls_hdr(skb);
+	new_mpls_lse->label_stack_entry = mpls->mpls_lse;
 
 	skb_postpush_rcsum(skb, new_mpls_lse, MPLS_HLEN);
 
 	if (ovs_key_mac_proto(key) == MAC_PROTO_ETHERNET)
 		update_ethertype(skb, eth_hdr(skb), mpls->mpls_ethertype);
-	if (!ovs_skb_get_inner_protocol(skb))
-		ovs_skb_set_inner_protocol(skb, skb->protocol);
 	skb->protocol = mpls->mpls_ethertype;
 
 	invalidate_flow_key(key);
@@ -224,21 +230,22 @@  static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 	if (unlikely(err))
 		return err;
 
-	skb_postpull_rcsum(skb, skb_mpls_header(skb), MPLS_HLEN);
+	skb_postpull_rcsum(skb, mpls_hdr(skb), MPLS_HLEN);
 
 	memmove(skb_mac_header(skb) + MPLS_HLEN, skb_mac_header(skb),
 		skb->mac_len);
 
 	__skb_pull(skb, MPLS_HLEN);
 	skb_reset_mac_header(skb);
+	skb_set_network_header(skb, skb->mac_len);
 
 	if (ovs_key_mac_proto(key) == MAC_PROTO_ETHERNET) {
 		struct ethhdr *hdr;
 
-		/* skb_mpls_header() is used to locate the ethertype
+		/* mpls_hdr() is used to locate the ethertype
 		 * field correctly in the presence of VLAN tags.
 		 */
-		hdr = (struct ethhdr *)(skb_mpls_header(skb) - ETH_HLEN);
+		hdr = (struct ethhdr *)((void*)mpls_hdr(skb) - ETH_HLEN);
 		update_ethertype(skb, hdr, ethertype);
         }
 	if (eth_p_mpls(skb->protocol))
@@ -251,7 +258,7 @@  static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 static int set_mpls(struct sk_buff *skb, struct sw_flow_key *flow_key,
 		    const __be32 *mpls_lse, const __be32 *mask)
 {
-	__be32 *stack;
+	struct mpls_shim_hdr *stack;
 	__be32 lse;
 	int err;
 
@@ -259,16 +266,16 @@  static int set_mpls(struct sk_buff *skb, struct sw_flow_key *flow_key,
 	if (unlikely(err))
 		return err;
 
-	stack = (__be32 *)skb_mpls_header(skb);
-	lse = OVS_MASKED(*stack, *mpls_lse, *mask);
+	stack = mpls_hdr(skb);
+	lse = OVS_MASKED(stack->label_stack_entry, *mpls_lse, *mask);
 	if (skb->ip_summed == CHECKSUM_COMPLETE) {
-		__be32 diff[] = { ~(*stack), lse };
+		__be32 diff[] = { ~(stack->label_stack_entry), lse };
 
 		skb->csum = ~csum_partial((char *)diff, sizeof(diff),
 					  ~skb->csum);
 	}
 
-	*stack = lse;
+	stack->label_stack_entry = lse;
 	flow_key->mpls.top_lse = lse;
 	return 0;
 }
diff --git a/datapath/linux/compat/include/net/mpls.h b/datapath/linux/compat/include/net/mpls.h
index 73e48e37eb55..302377309967 100644
--- a/datapath/linux/compat/include/net/mpls.h
+++ b/datapath/linux/compat/include/net/mpls.h
@@ -19,12 +19,33 @@ 
 
 #define MPLS_HLEN 4
 
+struct mpls_shim_hdr {
+	__be32 label_stack_entry;
+};
+
 static inline bool eth_p_mpls(__be16 eth_type)
 {
 	return eth_type == htons(ETH_P_MPLS_UC) ||
 		eth_type == htons(ETH_P_MPLS_MC);
 }
 
+/* Starting from kernel 4.9, commit 48d2ab609b6b ("net: mpls: Fixups for GSO")
+ * and commit 85de4a2101ac ("openvswitch: use mpls_hdr") introduced
+ * behavioural changes to mpls_gso kernel module. It now assumes that
+ * skb_network_header() points to the mpls header and
+ * skb_inner_network_header() points to the L3 header. However, the old
+ * mpls_gso kernel module assumes that the skb_network_header() points
+ * to the L3 header. We shall backport the following function to ensure
+ * mpls_gso works properly for kernels older than the one which contains
+ * these commits.
+ */
+#ifdef MPLS_HEADER_IS_L3
+static inline struct mpls_shim_hdr *mpls_hdr(const struct sk_buff *skb)
+{
+    return (struct mpls_shim_hdr *)skb_network_header(skb);
+}
+#else
+#define mpls_hdr rpl_mpls_hdr
 /*
  * For non-MPLS skbs this will correspond to the network header.
  * For MPLS skbs it will be before the network_header as the MPLS
@@ -32,8 +53,10 @@  static inline bool eth_p_mpls(__be16 eth_type)
  * header. That is, for MPLS skbs the end of the mac header
  * is the top of the MPLS label stack.
  */
-static inline unsigned char *skb_mpls_header(struct sk_buff *skb)
+static inline struct mpls_shim_hdr *rpl_mpls_hdr(const struct sk_buff *skb)
 {
-	return skb_mac_header(skb) + skb->mac_len;
+	return (struct mpls_shim_hdr *) (skb_mac_header(skb) + skb->mac_len);
 }
+#endif
+
 #endif /* _NET_MPLS_WRAPPER_H */