diff mbox

[ovs-dev,2/7] userspace: Support for push_eth and pop_eth actions

Message ID CFF8EF42F1132E4CBE2BF0AB6C21C58D662CC6BB@ESESSMB205.ericsson.se
State Changes Requested
Headers show

Commit Message

Jan Scheurich Feb. 3, 2017, 10:39 a.m. UTC
Add support for actions push_eth and pop_eth to the netdev datapath and
the supporting libraries. This patch relies on the support for these actions
in the kernel datapath to be present.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
---
 datapath/linux/compat/include/linux/openvswitch.h | 17 ++++++
 lib/dpif-netdev.c                                 |  2 +
 lib/dpif.c                                        |  2 +
 lib/odp-execute.c                                 | 17 ++++++
 lib/odp-util.c                                    | 72 ++++++++++++++++++++++-
 lib/odp-util.h                                    |  7 +++
 lib/packets.c                                     | 41 +++++++++++++
 lib/packets.h                                     |  4 ++
 ofproto/ofproto-dpif-sflow.c                      |  7 +++
 9 files changed, 167 insertions(+), 2 deletions(-)

Comments

Ben Pfaff March 3, 2017, 6:22 p.m. UTC | #1
On Fri, Feb 03, 2017 at 10:39:48AM +0000, Jan Scheurich wrote:
> Add support for actions push_eth and pop_eth to the netdev datapath and
> the supporting libraries. This patch relies on the support for these actions
> in the kernel datapath to be present.
> 
> Signed-off-by: Lorand Jakab <lojakab@cisco.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> Signed-off-by: Yi Yang <yi.y.yang@intel.com>
> Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
> Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>

This sign-off chain is baffling.  Please explain it.

Clang reports:

    ../lib/packets.c:260:22: error: cast from 'char *' to 'ovs_be16 *' (aka 'unsigned short *') increases required alignment from 1 to 2 [-Werror,-Wcast-align]
    ../lib/packets.c:263:22: error: cast from 'char *' to 'ovs_be16 *' (aka 'unsigned short *') increases required alignment from 1 to 2 [-Werror,-Wcast-align]

Thanks,

Ben.
Jan Scheurich March 6, 2017, midnight UTC | #2
> > Signed-off-by: Lorand Jakab <lojakab@cisco.com>
> > Signed-off-by: Simon Horman <simon.horman@netronome.com>
> > Signed-off-by: Jiri Benc <jbenc@redhat.com>
> > Signed-off-by: Yi Yang <yi.y.yang@intel.com>
> > Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
> > Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
> > Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
> 
> This sign-off chain is baffling.  Please explain it.

Support for L3 tunneling has been dragging along for several years. We are just the last ones in a long chain of authors who have submitted earlier versions of patches for this. We thought it was appropriate to keep the earlier Signed-off-by tags to reflect that. If there is a better way to express this, please let us know.

> Clang reports:
> 
>     ../lib/packets.c:260:22: error: cast from 'char *' to 'ovs_be16 *' (aka 'unsigned short *') increases required alignment from 1 to 2 [-
> Werror,-Wcast-align]
>     ../lib/packets.c:263:22: error: cast from 'char *' to 'ovs_be16 *' (aka 'unsigned short *') increases required alignment from 1 to 2 [-
> Werror,-Wcast-align]

I think these warnings are artifacts caused by the fact that char* does not express the actual alignment of the l2.5 or l3 headers. The packet data is word-aligned and the Ethernet header occupies 14 bytes. Any number of potentially present VLAN tags cannot change that. Any suggestion how to fix this?

Thanks , Jan
Ben Pfaff March 7, 2017, 7:15 p.m. UTC | #3
On Mon, Mar 06, 2017 at 12:00:59AM +0000, Jan Scheurich wrote:
> > > Signed-off-by: Lorand Jakab <lojakab@cisco.com>
> > > Signed-off-by: Simon Horman <simon.horman@netronome.com>
> > > Signed-off-by: Jiri Benc <jbenc@redhat.com>
> > > Signed-off-by: Yi Yang <yi.y.yang@intel.com>
> > > Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
> > > Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
> > > Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
> > 
> > This sign-off chain is baffling.  Please explain it.
> 
> Support for L3 tunneling has been dragging along for several years. We
> are just the last ones in a long chain of authors who have submitted
> earlier versions of patches for this. We thought it was appropriate to
> keep the earlier Signed-off-by tags to reflect that. If there is a
> better way to express this, please let us know.

OK, it really was not clear to me that the code had really passed
through that many hands.  If it really has, then it's fine.

> > Clang reports:
> > 
> >     ../lib/packets.c:260:22: error: cast from 'char *' to 'ovs_be16 *' (aka 'unsigned short *') increases required alignment from 1 to 2 [-
> > Werror,-Wcast-align]
> >     ../lib/packets.c:263:22: error: cast from 'char *' to 'ovs_be16 *' (aka 'unsigned short *') increases required alignment from 1 to 2 [-
> > Werror,-Wcast-align]
> 
> I think these warnings are artifacts caused by the fact that char* does not express the actual alignment of the l2.5 or l3 headers. The packet data is word-aligned and the Ethernet header occupies 14 bytes. Any number of potentially present VLAN tags cannot change that. Any suggestion how to fix this?

When these warnings come about and the code is actually correct, then
the ALIGNED_CAST macro is the right way to suppress them.
diff mbox

Patch

diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
index 425d3a4..d78d8a8 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -749,6 +749,18 @@  enum ovs_nat_attr {
 #define OVS_NAT_ATTR_MAX (__OVS_NAT_ATTR_MAX - 1)
 
 /**
+ * struct ovs_action_push_eth - %OVS_ACTION_ATTR_PUSH_ETH action argument.
+ * @addresses: Source and destination MAC addresses.
+ * @eth_type: Ethernet type
+ */
+struct ovs_action_push_eth {
+	struct  ovs_key_ethernet addresses;
+#ifndef __KERNEL__
+	__be16  eth_type;
+#endif
+};
+
+/**
  * enum ovs_action_attr - Action types.
  *
  * @OVS_ACTION_ATTR_OUTPUT: Output packet to port.
@@ -782,6 +794,9 @@  enum ovs_nat_attr {
  * is no MPLS label stack, as determined by ethertype, no action is taken.
  * @OVS_ACTION_ATTR_CT: Track the connection. Populate the conntrack-related
  * entries in the flow key.
+ * @OVS_ACTION_ATTR_PUSH_ETH: Push a new outermost Ethernet header onto the
+ * packet.
+ * @OVS_ACTION_ATTR_POP_ETH: Pop the outermost Ethernet header off the packet.
  *
  * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
  * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -814,6 +829,8 @@  enum ovs_action_attr {
 				       * bits. */
 	OVS_ACTION_ATTR_CT,           /* Nested OVS_CT_ATTR_* . */
 	OVS_ACTION_ATTR_TRUNC,        /* u32 struct ovs_action_trunc. */
+	OVS_ACTION_ATTR_PUSH_ETH,     /* struct ovs_action_push_eth. */
+	OVS_ACTION_ATTR_POP_ETH,      /* No argument. */
 
 #ifndef __KERNEL__
 	OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b630824..973814c 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4731,6 +4731,8 @@  dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
     case OVS_ACTION_ATTR_UNSPEC:
     case OVS_ACTION_ATTR_TRUNC:
     case OVS_ACTION_ATTR_CLONE:
+    case OVS_ACTION_ATTR_PUSH_ETH:
+    case OVS_ACTION_ATTR_POP_ETH:
     case __OVS_ACTION_ATTR_MAX:
         OVS_NOT_REACHED();
     }
diff --git a/lib/dpif.c b/lib/dpif.c
index d9816e5..3f82d80 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -1183,6 +1183,8 @@  dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
     case OVS_ACTION_ATTR_SAMPLE:
     case OVS_ACTION_ATTR_TRUNC:
     case OVS_ACTION_ATTR_CLONE:
+    case OVS_ACTION_ATTR_PUSH_ETH:
+    case OVS_ACTION_ATTR_POP_ETH:
     case OVS_ACTION_ATTR_UNSPEC:
     case __OVS_ACTION_ATTR_MAX:
         OVS_NOT_REACHED();
diff --git a/lib/odp-execute.c b/lib/odp-execute.c
index 9612ef5..4ed4475 100644
--- a/lib/odp-execute.c
+++ b/lib/odp-execute.c
@@ -573,6 +573,8 @@  requires_datapath_assistance(const struct nlattr *a)
     case OVS_ACTION_ATTR_POP_MPLS:
     case OVS_ACTION_ATTR_TRUNC:
     case OVS_ACTION_ATTR_CLONE:
+    case OVS_ACTION_ATTR_POP_ETH:
+    case OVS_ACTION_ATTR_PUSH_ETH:
         return false;
 
     case OVS_ACTION_ATTR_UNSPEC:
@@ -715,6 +717,21 @@  odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
                 return;
             }
             break;
+        case OVS_ACTION_ATTR_PUSH_ETH: {
+            const struct ovs_action_push_eth *eth = nl_attr_get(a);
+
+            DP_PACKET_BATCH_FOR_EACH (packet, batch) {
+                push_eth(packet, &eth->addresses.eth_dst,
+                         &eth->addresses.eth_src, eth->eth_type);
+            }
+            break;
+        }
+
+        case OVS_ACTION_ATTR_POP_ETH:
+            DP_PACKET_BATCH_FOR_EACH (packet, batch) {
+                pop_eth(packet);
+            }
+            break;
 
         case OVS_ACTION_ATTR_OUTPUT:
         case OVS_ACTION_ATTR_TUNNEL_PUSH:
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 4106738..f97c4eb 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -125,6 +125,8 @@  odp_action_len(uint16_t type)
     case OVS_ACTION_ATTR_SAMPLE: return ATTR_LEN_VARIABLE;
     case OVS_ACTION_ATTR_CT: return ATTR_LEN_VARIABLE;
     case OVS_ACTION_ATTR_CLONE: return ATTR_LEN_VARIABLE;
+    case OVS_ACTION_ATTR_PUSH_ETH: return sizeof(struct ovs_action_push_eth);
+    case OVS_ACTION_ATTR_POP_ETH: return 0;
 
     case OVS_ACTION_ATTR_UNSPEC:
     case __OVS_ACTION_ATTR_MAX:
@@ -850,6 +852,18 @@  format_odp_action(struct ds *ds, const struct nlattr *a)
         format_odp_key_attr(nl_attr_get(a), NULL, NULL, ds, true);
         ds_put_cstr(ds, ")");
         break;
+    case OVS_ACTION_ATTR_PUSH_ETH: {
+        const struct ovs_action_push_eth *eth = nl_attr_get(a);
+        ds_put_format(ds, "push_eth(src="ETH_ADDR_FMT",dst="ETH_ADDR_FMT
+                      ",type=0x%04"PRIx16")",
+                      ETH_ADDR_ARGS(eth->addresses.eth_src),
+                      ETH_ADDR_ARGS(eth->addresses.eth_dst),
+                      ntohs(eth->eth_type));
+        break;
+    }
+    case OVS_ACTION_ATTR_POP_ETH:
+        ds_put_cstr(ds, "pop_eth");
+        break;
     case OVS_ACTION_ATTR_PUSH_VLAN: {
         const struct ovs_action_push_vlan *vlan = nl_attr_get(a);
         ds_put_cstr(ds, "push_vlan(");
@@ -1055,14 +1069,41 @@  parse_odp_userspace_action(const char *s, struct ofpbuf *actions)
             odp_put_userspace_action(pid, user_data, user_data_size,
                                      tunnel_out_port, include_actions, actions);
             res = n + n1;
+            goto out;
         } else if (s[n] == ')') {
             odp_put_userspace_action(pid, user_data, user_data_size,
                                      ODPP_NONE, include_actions, actions);
             res = n + 1;
-        } else {
-            res = -EINVAL;
+            goto out;
         }
     }
+
+    {
+        struct ovs_action_push_eth push;
+        int eth_type = 0;
+        int n1 = -1;
+
+        if (ovs_scan(&s[n], "push_eth(src="ETH_ADDR_SCAN_FMT","
+                     "dst="ETH_ADDR_SCAN_FMT",type=%i)%n",
+                     ETH_ADDR_SCAN_ARGS(push.addresses.eth_src),
+                     ETH_ADDR_SCAN_ARGS(push.addresses.eth_dst),
+                     &eth_type, &n1)) {
+
+            nl_msg_put_unspec(actions, OVS_ACTION_ATTR_PUSH_ETH,
+                              &push, sizeof push);
+
+            res = n + n1;
+            goto out;
+        }
+    }
+
+    if (!strncmp(&s[n], "pop_eth", 7)) {
+        nl_msg_put_flag(actions, OVS_ACTION_ATTR_POP_ETH);
+        res = 7;
+        goto out;
+    }
+
+    res = -EINVAL;
 out:
     ofpbuf_uninit(&buf);
     return res;
@@ -5361,6 +5402,33 @@  odp_put_userspace_action(uint32_t pid,
 }
 
 void
+odp_put_pop_eth_action(struct ofpbuf *odp_actions)
+{
+    nl_msg_put_flag(odp_actions, OVS_ACTION_ATTR_POP_ETH);
+}
+
+void
+odp_put_push_eth_action(struct ofpbuf *odp_actions,
+                        const struct eth_addr *eth_src,
+                        const struct eth_addr *eth_dst,
+                        const ovs_be16 eth_type)
+{
+    struct ovs_action_push_eth eth;
+
+    memset(&eth, 0, sizeof eth);
+    if (eth_src) {
+        eth.addresses.eth_src = *eth_src;
+    }
+    if (eth_dst) {
+        eth.addresses.eth_dst = *eth_dst;
+    }
+    eth.eth_type = eth_type;
+
+    nl_msg_put_unspec(odp_actions, OVS_ACTION_ATTR_PUSH_ETH,
+                      &eth, sizeof eth);
+}
+
+void
 odp_put_tunnel_action(const struct flow_tnl *tunnel,
                       struct ofpbuf *odp_actions)
 {
diff --git a/lib/odp-util.h b/lib/odp-util.h
index f391e2a..c4cb509 100644
--- a/lib/odp-util.h
+++ b/lib/odp-util.h
@@ -308,4 +308,11 @@  void odp_put_tunnel_action(const struct flow_tnl *tunnel,
 
 void odp_put_tnl_push_action(struct ofpbuf *odp_actions,
                              struct ovs_action_push_tnl *data);
+
+void odp_put_pop_eth_action(struct ofpbuf *odp_actions);
+void odp_put_push_eth_action(struct ofpbuf *odp_actions,
+                             const struct eth_addr *eth_src,
+                             const struct eth_addr *eth_dst,
+                             const ovs_be16 eth_type);
+
 #endif /* odp-util.h */
diff --git a/lib/packets.c b/lib/packets.c
index dec6513..429083b 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -226,6 +226,47 @@  eth_pop_vlan(struct dp_packet *packet)
     }
 }
 
+/* Push Ethernet header onto 'packet' assuming it is layer 3 */
+void
+push_eth(struct dp_packet *packet, const struct eth_addr *dst,
+         const struct eth_addr *src, ovs_be16 type)
+{
+    struct eth_header *eh;
+
+    ovs_assert(packet->packet_type != PT_ETH);
+    eh = dp_packet_resize_l2(packet, ETH_HEADER_LEN);
+    eh->eth_dst = *dst;
+    eh->eth_src = *src;
+    eh->eth_type = type;
+    packet->packet_type = PT_ETH;
+}
+
+/* Removes Ethernet header, including VLAN header, from 'packet'.
+ *
+ * Previous to calling this function, 'ofpbuf_l3(packet)' must not be NULL */
+void
+pop_eth(struct dp_packet *packet)
+{
+    char *l2_5 = dp_packet_l2_5(packet);
+    char *l3 = dp_packet_l3(packet);
+    ovs_be16 ethertype;
+    int increment;
+
+    ovs_assert(packet->packet_type == PT_ETH);
+    ovs_assert(l3 != NULL);
+
+    if (l2_5) {
+        increment = packet->l2_5_ofs;
+        ethertype = *(ovs_be16 *)(l2_5 - 2);
+    } else {
+        increment = packet->l3_ofs;
+        ethertype = *(ovs_be16 *)(l3 - 2);
+    }
+
+    dp_packet_resize_l2(packet, -increment);
+    packet->packet_type = PACKET_TYPE(OFPHTN_ETHERTYPE, ethertype);
+}
+
 /* Set ethertype of the packet. */
 static void
 set_ethertype(struct dp_packet *packet, ovs_be16 eth_type)
diff --git a/lib/packets.h b/lib/packets.h
index c4d3799..a1ce5ea 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -366,6 +366,10 @@  struct eth_header {
 });
 BUILD_ASSERT_DECL(ETH_HEADER_LEN == sizeof(struct eth_header));
 
+void push_eth(struct dp_packet *packet, const struct eth_addr *dst,
+              const struct eth_addr *src, ovs_be16 type);
+void pop_eth(struct dp_packet *packet);
+
 #define LLC_DSAP_SNAP 0xaa
 #define LLC_SSAP_SNAP 0xaa
 #define LLC_CNTL_SNAP 3
diff --git a/ofproto/ofproto-dpif-sflow.c b/ofproto/ofproto-dpif-sflow.c
index 520b8dd..beb4b1f 100644
--- a/ofproto/ofproto-dpif-sflow.c
+++ b/ofproto/ofproto-dpif-sflow.c
@@ -1164,6 +1164,13 @@  dpif_sflow_read_actions(const struct flow *flow,
 	    dpif_sflow_pop_mpls_lse(sflow_actions);
 	    break;
 	}
+	case OVS_ACTION_ATTR_PUSH_ETH:
+	case OVS_ACTION_ATTR_POP_ETH:
+	    /* TODO: SFlow does not currently define a MAC-in-MAC
+	     * encapsulation structure.  We could use an extension
+	     * structure to report this.
+	     */
+	    break;
 	case OVS_ACTION_ATTR_SAMPLE:
 	case OVS_ACTION_ATTR_CLONE:
 	case OVS_ACTION_ATTR_UNSPEC: