Message ID | 1593437507-4710-1-git-send-email-martinvarghesenokia@gmail.com |
---|---|
State | Superseded |
Series | [ovs-dev,v6] Bareudp Tunnel Support |
On 6/29/2020 6:31 AM, Martin Varghese wrote:
> From: Martin Varghese <martin.varghese@nokia.com>
>
> There are various L3 encapsulation standards using UDP being discussed to
> leverage the UDP based load balancing capability of different networks.
> MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.
>
> The Bareudp tunnel provides a generic L3 encapsulation support for
> tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
> tunnel.
>
> An example to create bareudp device to tunnel MPLS traffic is given
>
> $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>   type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
>   options:payload_type=0x8847 options:dst_port=6635 \
>   options:packet_type="legacy_l3" \
>   ofport_request=$bareudp_egress_port
>
> The bareudp device supports special handling for MPLS & IP as
> they can have multiple ethertypes. MPLS protocol can have ethertypes
> ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
> ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).
>
> The bareudp device to tunnel L3 traffic with multiple ethertypes
> (MPLS & IP) can be created by passing the L3 protocol name as string in
> the field payload_type. An example to create bareudp device to tunnel
> MPLS unicast & multicast traffic is given below.
>
> $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>   type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
>   options:payload_type=mpls options:dst_port=6635 \
>   options:packet_type="legacy_l3"
>
> Signed-off-by: Martin Varghese <martin.varghese@nokia.com>

Thanks for your work on this, Martin. I think it's good to go now.
Passes check-kernel with no regressions.

Acked-By: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>

> ---
> Changes in v2:
> - Removed vport-bareudp module.
>
> Changes in v3:
> - Added net-next upstream commit id and message to commit message.
>
> Changes in v4:
> - Removed kernel datapath changes.
>
> Changes in v5:
> - Fixed release notes errors.
> - Fixed coding errors in dpif-netlink-rtnl.c.
>
> Changes in v6:
> - Added code to enable rx metadata collection in the kernel device.
> - Added version history.
>
>  Documentation/automake.mk                         |  1 +
>  Documentation/faq/bareudp.rst                     | 62 +++++++++++++++++++++++
>  Documentation/faq/index.rst                       |  1 +
>  Documentation/faq/releases.rst                    |  1 +
>  NEWS                                              |  4 ++
>  datapath/linux/compat/include/linux/openvswitch.h | 10 ++++
>  lib/dpif-netlink-rtnl.c                           | 55 ++++++++++++++++++++
>  lib/dpif-netlink.c                                |  5 ++
>  lib/netdev-vport.c                                | 27 +++++++++-
>  lib/netdev.h                                      |  1 +
>  ofproto/ofproto-dpif-xlate.c                      |  1 +
>  tests/system-layer3-tunnels.at                    | 47 +++++++++++++++++
>  12 files changed, 213 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/faq/bareudp.rst
>
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index f85c432..ea3475f 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -88,6 +88,7 @@ DOC_SOURCE = \
>      Documentation/faq/terminology.rst \
>      Documentation/faq/vlan.rst \
>      Documentation/faq/vxlan.rst \
> +    Documentation/faq/bareudp.rst \
>      Documentation/internals/index.rst \
>      Documentation/internals/authors.rst \
>      Documentation/internals/bugs.rst \
> diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst
> new file mode 100644
> index 0000000..9266daa
> --- /dev/null
> +++ b/Documentation/faq/bareudp.rst
> @@ -0,0 +1,62 @@
> +..
> +      Licensed under the Apache License, Version 2.0 (the "License"); you may
> +      not use this file except in compliance with the License. You may obtain
> +      a copy of the License at
> +
> +          http://www.apache.org/licenses/LICENSE-2.0
> +
> +      Unless required by applicable law or agreed to in writing, software
> +      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
> +      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
> +      License for the specific language governing permissions and limitations
> +      under the License.
> +
> +      Convention for heading levels in Open vSwitch documentation:
> +
> +      =======  Heading 0 (reserved for the title in a document)
> +      -------  Heading 1
> +      ~~~~~~~  Heading 2
> +      +++++++  Heading 3
> +      '''''''  Heading 4
> +
> +      Avoid deeper levels because they do not render well.
> +
> +=======
> +Bareudp
> +=======
> +
> +Q: What is Bareudp?
> +
> +    A: There are various L3 encapsulation standards using UDP being discussed
> +    to leverage the UDP based load balancing capability of different
> +    networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among
> +    them.
> +
> +    The Bareudp tunnel provides a generic L3 encapsulation support for
> +    tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
> +    tunnel.
> +
> +    An example to create bareudp device to tunnel MPLS traffic is given
> +    below.::
> +
> +       $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
> +         type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
> +         options:payload_type=0x8847 options:dst_port=6635 \
> +         options:packet_type="legacy_l3" \
> +         ofport_request=$bareudp_egress_port
> +
> +    The bareudp device supports special handling for MPLS & IP as they can
> +    have multiple ethertypes.
> +    MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) &
> +    ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4)
> +    & ETH_P_IPV6 (v6).
> +
> +    The bareudp device to tunnel L3 traffic with multiple ethertypes
> +    (MPLS & IP) can be created by passing the L3 protocol name as string in
> +    the field payload_type. An example to create bareudp device to tunnel
> +    MPLS unicast & multicast traffic is given below.::
> +
> +       $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
> +         type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
> +         options:payload_type=mpls options:dst_port=6635 \
> +         options:packet_type="legacy_l3"
> diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst
> index 334b828..1dd2998 100644
> --- a/Documentation/faq/index.rst
> +++ b/Documentation/faq/index.rst
> @@ -30,6 +30,7 @@ Open vSwitch FAQ
>  .. toctree::
>     :maxdepth: 2
>
> +   bareudp
>     configuration
>     contributing
>     design
> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
> index e5cef39..9915839 100644
> --- a/Documentation/faq/releases.rst
> +++ b/Documentation/faq/releases.rst
> @@ -136,6 +136,7 @@ Q: Are all features available with all datapaths?
>      Tunnel - ERSPAN           4.18           2.10         2.10      NO
>      Tunnel - ERSPAN-IPv6      4.18           2.10         2.10      NO
>      Tunnel - GTP-U            NO             NO           2.14      NO
> +    Tunnel - Bareudp          5.7            NO           2.14      NO
>      QoS - Policing            YES            1.1          2.6       NO
>      QoS - Shaping             YES            1.1          NO        NO
>      sFlow                     YES            1.0          1.0       NO
> diff --git a/NEWS b/NEWS
> index 0116b3e..f5aa840 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -23,6 +23,10 @@ Post-v2.13.0
>     - Tunnels: TC Flower offload
>       * Tunnel Local endpoint address masked match are supported.
>       * Tunnel Romte endpoint address masked match are supported.
> +   - Bareudp Tunnel
> +     * Bareudp device support is present in linux kernel from version 5.7
> +     * Kernel bareudp device is not backported to ovs tree.
> +     * Userspace datapath support is not added
>
>
>  v2.13.0 - 14 Feb 2020
> diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
> index cc41bbe..3073faa 100644
> --- a/datapath/linux/compat/include/linux/openvswitch.h
> +++ b/datapath/linux/compat/include/linux/openvswitch.h
> @@ -240,6 +240,7 @@ enum ovs_vport_type {
>      OVS_VPORT_TYPE_GRE, /* GRE tunnel. */
>      OVS_VPORT_TYPE_VXLAN, /* VXLAN tunnel. */
>      OVS_VPORT_TYPE_GENEVE, /* Geneve tunnel. */
> +    OVS_VPORT_TYPE_BAREUDP, /* Bareudp tunnel. */
>      OVS_VPORT_TYPE_LISP = 105, /* LISP tunnel */
>      OVS_VPORT_TYPE_STT = 106, /* STT tunnel */
>      OVS_VPORT_TYPE_ERSPAN = 107, /* ERSPAN tunnel. */
> @@ -308,6 +309,15 @@ enum {
>
>  #define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1)
>
> +enum {
> +    OVS_BAREUDP_EXT_UNSPEC,
> +    OVS_BAREUDP_EXT_MULTIPROTO_MODE,
> +    /* place new values here to fill gap. */
> +    __OVS_BAREUDP_EXT_MAX,
> +};
> +
> +#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1)
> +
>  /* OVS_VPORT_ATTR_OPTIONS attributes for tunnels.
>   */
>  enum {
> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
> index fd157ce..3f6842a 100644
> --- a/lib/dpif-netlink-rtnl.c
> +++ b/lib/dpif-netlink-rtnl.c
> @@ -58,6 +58,19 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl);
>  #define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10
>  #endif
>
> +#ifndef __IFLA_BAREUDP_MAX
> +#define IFLA_BAREUDP_MAX 0
> +#endif
> +#if IFLA_BAREUDP_MAX < 4
> +#define IFLA_BAREUDP_PORT 1
> +#define IFLA_BAREUDP_ETHERTYPE 2
> +#define IFLA_BAREUDP_SRCPORT_MIN 3
> +#define IFLA_BAREUDP_MULTIPROTO_MODE 4
> +#define IFLA_BAREUDP_RX_COLLECT_METADATA 5
> +#endif
> +
> +#define BAREUDP_MPLS_SRCPORT_MIN 49153
> +
>  static const struct nl_policy rtlink_policy[] = {
>      [IFLA_LINKINFO] = { .type = NL_A_NESTED },
>  };
> @@ -81,6 +94,10 @@ static const struct nl_policy geneve_policy[] = {
>      [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 },
>      [IFLA_GENEVE_PORT] = { .type = NL_A_U16 },
>  };
> +static const struct nl_policy bareudp_policy[] = {
> +    [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 },
> +    [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 },
> +};
>
>  static const char *
>  vport_type_to_kind(enum ovs_vport_type type,
> @@ -113,6 +130,8 @@ vport_type_to_kind(enum ovs_vport_type type,
>      }
>      case OVS_VPORT_TYPE_GTPU:
>          return NULL;
> +    case OVS_VPORT_TYPE_BAREUDP:
> +        return "bareudp";
>      case OVS_VPORT_TYPE_NETDEV:
>      case OVS_VPORT_TYPE_INTERNAL:
>      case OVS_VPORT_TYPE_LISP:
> @@ -243,6 +262,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg,
>
>      return err;
>  }
> +static int
> +dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg,
> +                                 const char *kind, struct ofpbuf *reply)
> +{
> +    struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)];
> +    int err;
> +
> +    err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp,
> +                            ARRAY_SIZE(bareudp_policy));
> +    if (!err) {
> +        if ((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT]))
> +            || (tnl_cfg->payload_ethertype
> +                != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) {
> +            err = EINVAL;
> +        }
> +    }
> +    return err;
> +}
>
>  static int
>  dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
> @@ -275,6 +312,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
>      case OVS_VPORT_TYPE_GENEVE:
>          err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply);
>          break;
> +    case OVS_VPORT_TYPE_BAREUDP:
> +        err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply);
> +        break;
>      case OVS_VPORT_TYPE_NETDEV:
>      case OVS_VPORT_TYPE_INTERNAL:
>      case OVS_VPORT_TYPE_LISP:
> @@ -357,6 +397,20 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg,
>          nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1);
>          nl_msg_put_be16(&request, IFLA_GENEVE_PORT, tnl_cfg->dst_port);
>          break;
> +    case OVS_VPORT_TYPE_BAREUDP:
> +        nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE,
> +                        tnl_cfg->payload_ethertype);
> +        if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) ||
> +            (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) {
> +            nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN,
> +                           BAREUDP_MPLS_SRCPORT_MIN);
> +        }
> +        nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port);
> +        if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) {
> +            nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE);
> +        }
> +        nl_msg_put_flag(&request, IFLA_BAREUDP_RX_COLLECT_METADATA);
> +        break;
>      case OVS_VPORT_TYPE_NETDEV:
>      case OVS_VPORT_TYPE_INTERNAL:
>      case OVS_VPORT_TYPE_LISP:
> @@ -470,6 +524,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type)
>      case OVS_VPORT_TYPE_ERSPAN:
>      case OVS_VPORT_TYPE_IP6ERSPAN:
>      case OVS_VPORT_TYPE_IP6GRE:
> +    case OVS_VPORT_TYPE_BAREUDP:
>          return dpif_netlink_rtnl_destroy(name);
>      case OVS_VPORT_TYPE_NETDEV:
>      case OVS_VPORT_TYPE_INTERNAL:
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 18322e8..2ad0e64 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport *vport)
>      case OVS_VPORT_TYPE_GTPU:
>          return "gtpu";
>
> +    case OVS_VPORT_TYPE_BAREUDP:
> +        return "bareudp";
> +
>      case OVS_VPORT_TYPE_UNSPEC:
>      case __OVS_VPORT_TYPE_MAX:
>          break;
> @@ -784,6 +787,8 @@ netdev_to_ovs_vport_type(const char *type)
>          return OVS_VPORT_TYPE_GRE;
>      } else if (!strcmp(type, "gtpu")) {
>          return OVS_VPORT_TYPE_GTPU;
> +    } else if (!strcmp(type, "bareudp")) {
> +        return OVS_VPORT_TYPE_BAREUDP;
>      } else {
>          return OVS_VPORT_TYPE_UNSPEC;
>      }
> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> index 0252b61..c86d420 100644
> --- a/lib/netdev-vport.c
> +++ b/lib/netdev-vport.c
> @@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev)
>      return (class->get_config == get_tunnel_config &&
>              (!strcmp("geneve", type) || !strcmp("vxlan", type) ||
>               !strcmp("lisp", type) || !strcmp("stt", type) ||
> -             !strcmp("gtpu", type)));
> +             !strcmp("gtpu", type) || !strcmp("bareudp",type)));
>  }
>
>  const char *
> @@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_)
>          dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT);
>      } else if (!strcmp(type, "gtpu")) {
>          dev->tnl_cfg.dst_port = port ? htons(port) : htons(GTPU_DST_PORT);
> +    } else if (!strcmp(type, "bareudp")) {
> +        dev->tnl_cfg.dst_port = htons(port);
>      }
>
>      dev->tnl_cfg.dont_fragment = true;
> @@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type,
>          return TNL_L2 | TNL_L3;
>      } else if (!strcmp(type, "gtpu")) {
>          return TNL_L3;
> +    } else if (!strcmp(type, "bareudp")) {
> +        return TNL_L3;
>      } else {
>          return TNL_L2;
>      }
> @@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
>                      goto out;
>                  }
>              }
> +        } else if (!strcmp(node->key, "payload_type")) {
> +            if (strcmp(node->key, "mpls")) {
> +                tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS);
> +                tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE);
> +            } else if ((strcmp(node->key, "ip"))) {
> +                tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP);
> +                tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE);
> +            } else {
> +                tnl_cfg.payload_ethertype = htons(atoi(node->value));
> +            }
>          } else {
>              ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name,
>                            type, node->key);
> @@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct smap *args)
>              (!strcmp("vxlan", type) && dst_port != VXLAN_DST_PORT) ||
>              (!strcmp("lisp", type) && dst_port != LISP_DST_PORT) ||
>              (!strcmp("stt", type) && dst_port != STT_DST_PORT) ||
> -            (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) {
> +            (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) ||
> +            !strcmp("bareudp", type)) {
>              smap_add_format(args, "dst_port", "%d", dst_port);
>          }
>      }
> @@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void)
>            },
>            {{NULL, NULL, 0, 0}}
>          },
> +        { "udp_sys",
> +          {
> +              TUNNEL_FUNCTIONS_COMMON,
> +              .type = "bareudp",
> +              .get_ifindex = NETDEV_VPORT_GET_IFINDEX,
> +          },
> +          {{NULL, NULL, 0, 0}}
> +        },
>
>      };
>      static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
> diff --git a/lib/netdev.h b/lib/netdev.h
> index fdbe0e1..f15bca5 100644
> --- a/lib/netdev.h
> +++ b/lib/netdev.h
> @@ -107,6 +107,7 @@ struct netdev_tunnel_config {
>      bool out_key_flow;
>      ovs_be64 out_key;
>
> +    ovs_be16 payload_ethertype;
>      ovs_be16 dst_port;
>
>      bool ip_src_flow;
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index e0ede2c..6e07960 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx, struct eth_addr dmac,
>      case OVS_VPORT_TYPE_VXLAN:
>      case OVS_VPORT_TYPE_GENEVE:
>      case OVS_VPORT_TYPE_GTPU:
> +    case OVS_VPORT_TYPE_BAREUDP:
>          nw_proto = IPPROTO_UDP;
>          break;
>      case OVS_VPORT_TYPE_LISP:
> diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at
> index 1232964..5d9ea93 100644
> --- a/tests/system-layer3-tunnels.at
> +++ b/tests/system-layer3-tunnels.at
> @@ -152,3 +152,50 @@ AT_CHECK([tail -1 stdout], [0],
>
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([layer3 - ping over MPLS Bareudp])
> +OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
> +ADD_NAMESPACES(at_ns0, at_ns1)
> +
> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01")
> +ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02")
> +
> +ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3], [8.1.1.2/24],
> +               [ options:local_ip=8.1.1.2 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635])
> +
> +ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2], [8.1.1.3/24],
> +               [options:local_ip=8.1.1.3 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635])
> +
> +AT_DATA([flows0.txt], [dnl
> +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0
> +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0
> +table=0,priority=10 actions=normal
> +])
> +
> +AT_DATA([flows1.txt], [dnl
> +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1
> +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1
> +table=0,priority=10 actions=normal
> +])
> +
> +AT_CHECK([ip link add patch0 type veth peer name patch1])
> +on_exit 'ip link del patch0'
> +
> +AT_CHECK([ip link set dev patch0 up])
> +AT_CHECK([ip link set dev patch1 up])
> +AT_CHECK([ovs-vsctl add-port br0 patch0])
> +AT_CHECK([ovs-vsctl add-port br1 patch1])
> +
> +
> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt])
> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt])
> +
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +OVS_TRAFFIC_VSWITCHD_STOP
> +AT_CLEANUP
>
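
The payload_type handling described in the commit message (a protocol name such as mpls or ip selects multi-protocol mode, anything else is taken as a numeric ethertype) can be sketched in standalone C as follows. This is only an illustration under stated assumptions, not the code from the patch: `struct payload_type`, `parse_payload_type()` and the `SKETCH_ETH_TYPE_*` macros are hypothetical names introduced here. The sketch compares the option's value with a negated `strcmp()` and parses numbers with `strtoul()` in base 0 so that hexadecimal forms such as 0x8847 are accepted; `atoi()` would stop at the `x`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Ethertype values for IPv4 and MPLS unicast (hypothetical macro names). */
#define SKETCH_ETH_TYPE_IP   0x0800
#define SKETCH_ETH_TYPE_MPLS 0x8847

/* Hypothetical result type; it does not exist in the patch. */
struct payload_type {
    uint16_t ethertype;   /* Host byte order. */
    bool multiproto;      /* True when a protocol name was given. */
};

/* Parses 'value' as a protocol name ("mpls", "ip") or a numeric ethertype
 * in decimal, octal or hexadecimal.  Returns 0 on success and EINVAL on
 * malformed or out-of-range input. */
static int
parse_payload_type(const char *value, struct payload_type *out)
{
    char *end;
    unsigned long n;

    if (!strcmp(value, "mpls")) {
        out->ethertype = SKETCH_ETH_TYPE_MPLS;
        out->multiproto = true;
        return 0;
    }
    if (!strcmp(value, "ip")) {
        out->ethertype = SKETCH_ETH_TYPE_IP;
        out->multiproto = true;
        return 0;
    }

    /* Base 0 lets strtoul() accept a "0x" prefix. */
    errno = 0;
    n = strtoul(value, &end, 0);
    if (errno || end == value || *end != '\0' || n > UINT16_MAX) {
        return EINVAL;
    }
    out->ethertype = (uint16_t) n;
    out->multiproto = false;
    return 0;
}
```

With this sketch, `parse_payload_type("0x8847", &pt)` yields ethertype 0x8847 without multi-protocol mode, while `"mpls"` yields the same ethertype with the flag set.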
On 6/29/20 3:31 PM, Martin Varghese wrote:
> From: Martin Varghese <martin.varghese@nokia.com>
>
> There are various L3 encapsulation standards using UDP being discussed to
> leverage the UDP based load balancing capability of different networks.
> MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.
>
> The Bareudp tunnel provides a generic L3 encapsulation support for
> tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
> tunnel.
>
> An example to create bareudp device to tunnel MPLS traffic is given
>
> $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>   type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
>   options:payload_type=0x8847 options:dst_port=6635 \
>   options:packet_type="legacy_l3" \
>   ofport_request=$bareudp_egress_port
>
> The bareudp device supports special handling for MPLS & IP as
> they can have multiple ethertypes. MPLS protocol can have ethertypes
> ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
> ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).
>
> The bareudp device to tunnel L3 traffic with multiple ethertypes
> (MPLS & IP) can be created by passing the L3 protocol name as string in
> the field payload_type. An example to create bareudp device to tunnel
> MPLS unicast & multicast traffic is given below.
>
> $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>   type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
>   options:payload_type=mpls options:dst_port=6635 \
>   options:packet_type="legacy_l3"
>
> Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
> ---
> Changes in v2:
> - Removed vport-bareudp module.
>
> Changes in v3:
> - Added net-next upstream commit id and message to commit message.
>
> Changes in v4:
> - Removed kernel datapath changes.
>
> Changes in v5:
> - Fixed release notes errors.
> - Fixed coding errors in dpif-netlink-rtnl.c.
>
> Changes in v6:
> - Added code to enable rx metadata collection in the kernel device.
> - Added version history.
>
>  Documentation/automake.mk                         |  1 +
>  Documentation/faq/bareudp.rst                     | 62 +++++++++++++++++++++++
>  Documentation/faq/index.rst                       |  1 +
>  Documentation/faq/releases.rst                    |  1 +
>  NEWS                                              |  4 ++
>  datapath/linux/compat/include/linux/openvswitch.h | 10 ++++
>  lib/dpif-netlink-rtnl.c                           | 55 ++++++++++++++++++++
>  lib/dpif-netlink.c                                |  5 ++
>  lib/netdev-vport.c                                | 27 +++++++++-
>  lib/netdev.h                                      |  1 +
>  ofproto/ofproto-dpif-xlate.c                      |  1 +
>  tests/system-layer3-tunnels.at                    | 47 +++++++++++++++++
>  12 files changed, 213 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/faq/bareudp.rst
>
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index f85c432..ea3475f 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -88,6 +88,7 @@ DOC_SOURCE = \
>      Documentation/faq/terminology.rst \
>      Documentation/faq/vlan.rst \
>      Documentation/faq/vxlan.rst \
> +    Documentation/faq/bareudp.rst \
>      Documentation/internals/index.rst \
>      Documentation/internals/authors.rst \
>      Documentation/internals/bugs.rst \
> diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst
> new file mode 100644
> index 0000000..9266daa
> --- /dev/null
> +++ b/Documentation/faq/bareudp.rst
> @@ -0,0 +1,62 @@
> +..
> +      Licensed under the Apache License, Version 2.0 (the "License"); you may
> +      not use this file except in compliance with the License. You may obtain
> +      a copy of the License at
> +
> +          http://www.apache.org/licenses/LICENSE-2.0
> +
> +      Unless required by applicable law or agreed to in writing, software
> +      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
> +      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
> +      License for the specific language governing permissions and limitations
> +      under the License.
> +
> +      Convention for heading levels in Open vSwitch documentation:
> +
> +      =======  Heading 0 (reserved for the title in a document)
> +      -------  Heading 1
> +      ~~~~~~~  Heading 2
> +      +++++++  Heading 3
> +      '''''''  Heading 4
> +
> +      Avoid deeper levels because they do not render well.
> +
> +=======
> +Bareudp
> +=======
> +
> +Q: What is Bareudp?
> +
> +    A: There are various L3 encapsulation standards using UDP being discussed
> +    to leverage the UDP based load balancing capability of different
> +    networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among
> +    them.
> +
> +    The Bareudp tunnel provides a generic L3 encapsulation support for
> +    tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
> +    tunnel.
> +
> +    An example to create bareudp device to tunnel MPLS traffic is given
> +    below.::
> +
> +       $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
> +         type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
> +         options:payload_type=0x8847 options:dst_port=6635 \
> +         options:packet_type="legacy_l3" \
> +         ofport_request=$bareudp_egress_port
> +
> +    The bareudp device supports special handling for MPLS & IP as they can
> +    have multiple ethertypes.
> +    MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) &
> +    ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4)
> +    & ETH_P_IPV6 (v6).
> +
> +    The bareudp device to tunnel L3 traffic with multiple ethertypes
> +    (MPLS & IP) can be created by passing the L3 protocol name as string in
> +    the field payload_type. An example to create bareudp device to tunnel
> +    MPLS unicast & multicast traffic is given below.::
> +
> +       $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
> +         type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
> +         options:payload_type=mpls options:dst_port=6635 \
> +         options:packet_type="legacy_l3"
> diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst
> index 334b828..1dd2998 100644
> --- a/Documentation/faq/index.rst
> +++ b/Documentation/faq/index.rst
> @@ -30,6 +30,7 @@ Open vSwitch FAQ
>  .. toctree::
>     :maxdepth: 2
>
> +   bareudp
>     configuration
>     contributing
>     design
> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
> index e5cef39..9915839 100644
> --- a/Documentation/faq/releases.rst
> +++ b/Documentation/faq/releases.rst
> @@ -136,6 +136,7 @@ Q: Are all features available with all datapaths?
>      Tunnel - ERSPAN           4.18           2.10         2.10      NO
>      Tunnel - ERSPAN-IPv6      4.18           2.10         2.10      NO
>      Tunnel - GTP-U            NO             NO           2.14      NO
> +    Tunnel - Bareudp          5.7            NO           2.14      NO

There should be NO instead of 2.14, since you're not adding userspace
datapath support.

>      QoS - Policing            YES            1.1          2.6       NO
>      QoS - Shaping             YES            1.1          NO        NO
>      sFlow                     YES            1.0          1.0       NO
> diff --git a/NEWS b/NEWS
> index 0116b3e..f5aa840 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -23,6 +23,10 @@ Post-v2.13.0
>     - Tunnels: TC Flower offload
>       * Tunnel Local endpoint address masked match are supported.
>       * Tunnel Romte endpoint address masked match are supported.
> +   - Bareudp Tunnel
> +     * Bareudp device support is present in linux kernel from version 5.7
> +     * Kernel bareudp device is not backported to ovs tree.
> +     * Userspace datapath support is not added
>
>
>  v2.13.0 - 14 Feb 2020
> diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
> index cc41bbe..3073faa 100644
> --- a/datapath/linux/compat/include/linux/openvswitch.h
> +++ b/datapath/linux/compat/include/linux/openvswitch.h
> @@ -240,6 +240,7 @@ enum ovs_vport_type {
>      OVS_VPORT_TYPE_GRE, /* GRE tunnel. */
>      OVS_VPORT_TYPE_VXLAN, /* VXLAN tunnel. */
>      OVS_VPORT_TYPE_GENEVE, /* Geneve tunnel. */
> +    OVS_VPORT_TYPE_BAREUDP, /* Bareudp tunnel. */

Since this is not defined in the upstream kernel, we should probably make
it '= 111' to avoid possible future collisions.

>      OVS_VPORT_TYPE_LISP = 105, /* LISP tunnel */
>      OVS_VPORT_TYPE_STT = 106, /* STT tunnel */
>      OVS_VPORT_TYPE_ERSPAN = 107, /* ERSPAN tunnel. */
> @@ -308,6 +309,15 @@ enum {
>
>  #define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1)
>
> +enum {
> +    OVS_BAREUDP_EXT_UNSPEC,
> +    OVS_BAREUDP_EXT_MULTIPROTO_MODE,
> +    /* place new values here to fill gap. */

There is no gap here.

> +    __OVS_BAREUDP_EXT_MAX,
> +};
> +
> +#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1)
> +
>  /* OVS_VPORT_ATTR_OPTIONS attributes for tunnels.
>   */
>  enum {
> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
> index fd157ce..3f6842a 100644
> --- a/lib/dpif-netlink-rtnl.c
> +++ b/lib/dpif-netlink-rtnl.c
> @@ -58,6 +58,19 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl);
>  #define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10
>  #endif
>
> +#ifndef __IFLA_BAREUDP_MAX
> +#define IFLA_BAREUDP_MAX 0
> +#endif
> +#if IFLA_BAREUDP_MAX < 4
> +#define IFLA_BAREUDP_PORT 1
> +#define IFLA_BAREUDP_ETHERTYPE 2
> +#define IFLA_BAREUDP_SRCPORT_MIN 3
> +#define IFLA_BAREUDP_MULTIPROTO_MODE 4
> +#define IFLA_BAREUDP_RX_COLLECT_METADATA 5
> +#endif
> +
> +#define BAREUDP_MPLS_SRCPORT_MIN 49153
> +
>  static const struct nl_policy rtlink_policy[] = {
>      [IFLA_LINKINFO] = { .type = NL_A_NESTED },
>  };
> @@ -81,6 +94,10 @@ static const struct nl_policy geneve_policy[] = {
>      [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 },
>      [IFLA_GENEVE_PORT] = { .type = NL_A_U16 },
>  };
> +static const struct nl_policy bareudp_policy[] = {
> +    [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 },
> +    [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 },
> +};
>
>  static const char *
>  vport_type_to_kind(enum ovs_vport_type type,
> @@ -113,6 +130,8 @@ vport_type_to_kind(enum ovs_vport_type type,
>      }
>      case OVS_VPORT_TYPE_GTPU:
>          return NULL;
> +    case OVS_VPORT_TYPE_BAREUDP:
> +        return "bareudp";
>      case OVS_VPORT_TYPE_NETDEV:
>      case OVS_VPORT_TYPE_INTERNAL:
>      case OVS_VPORT_TYPE_LISP:
> @@ -243,6 +262,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg,
>
>      return err;
>  }
> +static int
> +dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg,
> +                                 const char *kind, struct ofpbuf *reply)
> +{
> +    struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)];
> +    int err;
> +
> +    err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp,
> +                            ARRAY_SIZE(bareudp_policy));
> +    if (!err) {
> +        if ((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT]))
> +            || (tnl_cfg->payload_ethertype
> +                != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) {
> +            err = EINVAL;
> +        }
> +    }
> +    return err;
> +}
>
>  static int
>  dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
> @@ -275,6 +312,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
>      case OVS_VPORT_TYPE_GENEVE:
>          err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply);
>          break;
> +    case OVS_VPORT_TYPE_BAREUDP:
> +        err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply);
> +        break;
>      case OVS_VPORT_TYPE_NETDEV:
>      case OVS_VPORT_TYPE_INTERNAL:
>      case OVS_VPORT_TYPE_LISP:
> @@ -357,6 +397,20 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg,
>          nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1);
>          nl_msg_put_be16(&request, IFLA_GENEVE_PORT, tnl_cfg->dst_port);
>          break;
> +    case OVS_VPORT_TYPE_BAREUDP:
> +        nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE,
> +                        tnl_cfg->payload_ethertype);
> +        if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) ||
> +            (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) {
> +            nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN,
> +                           BAREUDP_MPLS_SRCPORT_MIN);
> +        }
> +        nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port);
> +        if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) {
> +            nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE);
> +        }
> +        nl_msg_put_flag(&request, IFLA_BAREUDP_RX_COLLECT_METADATA);
> +        break;
>      case OVS_VPORT_TYPE_NETDEV:
>      case OVS_VPORT_TYPE_INTERNAL:
>      case OVS_VPORT_TYPE_LISP:
> @@ -470,6 +524,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type)
>      case OVS_VPORT_TYPE_ERSPAN:
>      case OVS_VPORT_TYPE_IP6ERSPAN:
>      case OVS_VPORT_TYPE_IP6GRE:
> +    case OVS_VPORT_TYPE_BAREUDP:
>          return dpif_netlink_rtnl_destroy(name);
>      case OVS_VPORT_TYPE_NETDEV:
>      case OVS_VPORT_TYPE_INTERNAL:
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 18322e8..2ad0e64 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport *vport)
>      case OVS_VPORT_TYPE_GTPU:
>          return "gtpu";
>
> +    case OVS_VPORT_TYPE_BAREUDP:
> +        return "bareudp";
> +
>      case OVS_VPORT_TYPE_UNSPEC:
>      case __OVS_VPORT_TYPE_MAX:
>          break;
> @@ -784,6 +787,8 @@ netdev_to_ovs_vport_type(const char *type)
>          return OVS_VPORT_TYPE_GRE;
>      } else if (!strcmp(type, "gtpu")) {
>          return OVS_VPORT_TYPE_GTPU;
> +    } else if (!strcmp(type, "bareudp")) {
> +        return OVS_VPORT_TYPE_BAREUDP;
>      } else {
>          return OVS_VPORT_TYPE_UNSPEC;
>      }
> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> index 0252b61..c86d420 100644
> --- a/lib/netdev-vport.c
> +++ b/lib/netdev-vport.c
> @@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev)
>      return (class->get_config == get_tunnel_config &&
>              (!strcmp("geneve", type) || !strcmp("vxlan", type) ||
>               !strcmp("lisp", type) || !strcmp("stt", type) ||
> -             !strcmp("gtpu", type)));
> +             !strcmp("gtpu", type) || !strcmp("bareudp",type)));
>  }
>
>  const char *
> @@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_)
>          dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT);
>      } else if (!strcmp(type, "gtpu")) {
>          dev->tnl_cfg.dst_port = port ? htons(port) : htons(GTPU_DST_PORT);
> +    } else if (!strcmp(type, "bareudp")) {
> +        dev->tnl_cfg.dst_port = htons(port);
>      }
>
>      dev->tnl_cfg.dont_fragment = true;
> @@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type,
>          return TNL_L2 | TNL_L3;
>      } else if (!strcmp(type, "gtpu")) {
>          return TNL_L3;
> +    } else if (!strcmp(type, "bareudp")) {
> +        return TNL_L3;
>      } else {
>          return TNL_L2;
>      }
> @@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
>                      goto out;
>                  }
>              }
> +        } else if (!strcmp(node->key, "payload_type")) {
> +            if (strcmp(node->key, "mpls")) {
> +                tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS);
> +                tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE);
> +            } else if ((strcmp(node->key, "ip"))) {
> +                tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP);
> +                tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE);
> +            } else {
> +                tnl_cfg.payload_ethertype = htons(atoi(node->value));
> +            }
>          } else {
>              ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name,
>                            type, node->key);
> @@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct smap *args)
>              (!strcmp("vxlan", type) && dst_port != VXLAN_DST_PORT) ||
>              (!strcmp("lisp", type) && dst_port != LISP_DST_PORT) ||
>              (!strcmp("stt", type) && dst_port != STT_DST_PORT) ||
> -            (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) {
> +            (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) ||
> +            !strcmp("bareudp", type)) {
>              smap_add_format(args, "dst_port", "%d", dst_port);
>          }
>      }
> @@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void)
>            },
>            {{NULL, NULL, 0, 0}}
>          },
> +        { "udp_sys",
> +          {
> +              TUNNEL_FUNCTIONS_COMMON,
> +              .type = "bareudp",
> +              .get_ifindex = NETDEV_VPORT_GET_IFINDEX,
> +          },
> +          {{NULL, NULL, 0, 0}}
> +        },
>
>      };
>      static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
> diff --git a/lib/netdev.h b/lib/netdev.h
> index fdbe0e1..f15bca5 100644
> --- a/lib/netdev.h
> +++ b/lib/netdev.h
> @@ -107,6 +107,7 @@ struct netdev_tunnel_config {
>      bool out_key_flow;
>      ovs_be64 out_key;
>
> +    ovs_be16 payload_ethertype;
>      ovs_be16 dst_port;
>
>      bool ip_src_flow;
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index e0ede2c..6e07960 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx, struct eth_addr dmac,
>      case OVS_VPORT_TYPE_VXLAN:
>      case OVS_VPORT_TYPE_GENEVE:
>      case OVS_VPORT_TYPE_GTPU:
> +    case OVS_VPORT_TYPE_BAREUDP:
>          nw_proto = IPPROTO_UDP;
>          break;
>      case OVS_VPORT_TYPE_LISP:
> diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at
> index 1232964..5d9ea93 100644
> --- a/tests/system-layer3-tunnels.at
> +++ b/tests/system-layer3-tunnels.at
> @@ -152,3 +152,50 @@ AT_CHECK([tail -1 stdout], [0],
>
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([layer3 - ping over MPLS Bareudp])
> +OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
> +ADD_NAMESPACES(at_ns0, at_ns1)
> +
> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01")
> +ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02")
> +
> +ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3], [8.1.1.2/24],
> +               [ options:local_ip=8.1.1.2 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635])
> +
> +ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2], [8.1.1.3/24],
> +               [options:local_ip=8.1.1.3 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635])
> +
> +AT_DATA([flows0.txt], [dnl
> +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0
> +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0
> +table=0,priority=10 actions=normal
> +])
> +
> +AT_DATA([flows1.txt], [dnl
> +table=0,priority=100,dl_type=0x0800
actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1 > +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1 > +table=0,priority=10 actions=normal > +]) > + > +AT_CHECK([ip link add patch0 type veth peer name patch1]) > +on_exit 'ip link del patch0' > + > +AT_CHECK([ip link set dev patch0 up]) > +AT_CHECK([ip link set dev patch1 up]) > +AT_CHECK([ovs-vsctl add-port br0 patch0]) > +AT_CHECK([ovs-vsctl add-port br1 patch1]) > + > + > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt]) > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt]) > + > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > +]) > + > +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > +]) > +OVS_TRAFFIC_VSWITCHD_STOP > +AT_CLEANUP >
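One note on the payload_type branch of set_tunnel_config() in the netdev-vport.c hunk quoted above. It calls strcmp() on node->key rather than node->value and does not negate the result, so as posted the "mpls" branch appears to be taken for every payload_type value. The fallback also uses atoi(), which does not parse hexadecimal, so the documented value 0x8847 would become 0. Below is a minimal standalone sketch of what the mapping seems intended to do. parse_payload_type() and the SKETCH_ constants are hypothetical names used only for illustration, not part of the patch or of OVS, and strtoul() with base 0 is just one possible way to accept both hex and decimal input.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins for the OVS ETH_TYPE_* constants (standard ethertypes). */
#define SKETCH_ETH_TYPE_IP   0x0800
#define SKETCH_ETH_TYPE_MPLS 0x8847

/* Hypothetical helper, not an OVS function.  Maps the value of the
 * payload_type option to a host-order ethertype and reports whether
 * multiproto mode should be requested.  Returns 0 on success, or EINVAL
 * if the value is neither a known protocol name nor a 16-bit number. */
static int
parse_payload_type(const char *value, uint16_t *ethertype, bool *multiproto)
{
    unsigned long n;
    char *end;

    *multiproto = false;

    /* Compare the option *value*; strcmp() returns 0 on a match. */
    if (!strcmp(value, "mpls")) {
        *ethertype = SKETCH_ETH_TYPE_MPLS;
        *multiproto = true;
        return 0;
    }
    if (!strcmp(value, "ip")) {
        *ethertype = SKETCH_ETH_TYPE_IP;
        *multiproto = true;
        return 0;
    }

    /* Base 0 accepts "0x8847" as well as "34887"; atoi() would stop at
     * the 'x' and return 0. */
    errno = 0;
    n = strtoul(value, &end, 0);
    if (errno || end == value || *end != '\0' || n > UINT16_MAX) {
        return EINVAL;
    }
    *ethertype = (uint16_t) n;
    return 0;
}
```

The caller would still convert the result with htons() before storing it in tnl_cfg.payload_ethertype, as the patch already does.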
On 7/10/20 7:20 PM, Ilya Maximets wrote: > On 6/29/20 3:31 PM, Martin Varghese wrote: >> From: Martin Varghese <martin.varghese@nokia.com> >> >> There are various L3 encapsulation standards using UDP being discussed to >> leverage the UDP based load balancing capability of different networks. >> MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them. >> >> The Bareudp tunnel provides a generic L3 encapsulation support for >> tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP >> tunnel. >> >> An example to create bareudp device to tunnel MPLS traffic is >> given >> >> $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ >> type=bareudp options:remote_ip=2.1.1.3 >> options:local_ip=2.1.1.2 \ >> options:payload_type=0x8847 options:dst_port=6635 \ >> options:packet_type="legacy_l3" \ >> ofport_request=$bareudp_egress_port >> >> The bareudp device supports special handling for MPLS & IP as >> they can have multiple ethertypes. MPLS protocol can have ethertypes >> ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have >> ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6). >> >> The bareudp device to tunnel L3 traffic with multiple ethertypes >> (MPLS & IP) can be created by passing the L3 protocol name as string in >> the field payload_type. An example to create bareudp device to tunnel >> MPLS unicast & multicast traffic is given below. >> >> $ ovs-vsctl add-port br_mpls udp_port -- set interface >> udp_port \ >> type=bareudp options:remote_ip=2.1.1.3 >> options:local_ip=2.1.1.2 \ >> options:payload_type=mpls options:dst_port=6635 \ >> options:packet_type="legacy_l3" >> >> Signed-off-by: Martin Varghese <martin.varghese@nokia.com> >> --- >> Changes in v2: >> - Removed vport-bareudp module. >> >> Changes in v3: >> - Added net-next upstream commit id and message to commit message. >> >> Changes in v4: >> - Removed kernel datapath changes. >> >> Changes in v5: >> - Fixed release notes errors. 
>> - Fixed coding errors in dpif-netlink-rtnl.c. >> >> Changes in v6: >> - Added code to enable rx metadata collection in the kernel device. >> - Added version history. >> >> Documentation/automake.mk | 1 + >> Documentation/faq/bareudp.rst | 62 +++++++++++++++++++++++ >> Documentation/faq/index.rst | 1 + >> Documentation/faq/releases.rst | 1 + >> NEWS | 4 ++ >> datapath/linux/compat/include/linux/openvswitch.h | 10 ++++ >> lib/dpif-netlink-rtnl.c | 55 ++++++++++++++++++++ >> lib/dpif-netlink.c | 5 ++ >> lib/netdev-vport.c | 27 +++++++++- >> lib/netdev.h | 1 + >> ofproto/ofproto-dpif-xlate.c | 1 + >> tests/system-layer3-tunnels.at | 47 +++++++++++++++++ >> 12 files changed, 213 insertions(+), 2 deletions(-) >> create mode 100644 Documentation/faq/bareudp.rst >> >> diff --git a/Documentation/automake.mk b/Documentation/automake.mk >> index f85c432..ea3475f 100644 >> --- a/Documentation/automake.mk >> +++ b/Documentation/automake.mk >> @@ -88,6 +88,7 @@ DOC_SOURCE = \ >> Documentation/faq/terminology.rst \ >> Documentation/faq/vlan.rst \ >> Documentation/faq/vxlan.rst \ >> + Documentation/faq/bareudp.rst \ >> Documentation/internals/index.rst \ >> Documentation/internals/authors.rst \ >> Documentation/internals/bugs.rst \ >> diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst >> new file mode 100644 >> index 0000000..9266daa >> --- /dev/null >> +++ b/Documentation/faq/bareudp.rst >> @@ -0,0 +1,62 @@ >> +.. >> + Licensed under the Apache License, Version 2.0 (the "License"); you may >> + not use this file except in compliance with the License. You may obtain >> + a copy of the License at >> + >> + http://www.apache.org/licenses/LICENSE-2.0 >> + >> + Unless required by applicable law or agreed to in writing, software >> + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT >> + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
See the >> + License for the specific language governing permissions and limitations >> + under the License. >> + >> + Convention for heading levels in Open vSwitch documentation: >> + >> + ======= Heading 0 (reserved for the title in a document) >> + ------- Heading 1 >> + ~~~~~~~ Heading 2 >> + +++++++ Heading 3 >> + ''''''' Heading 4 >> + >> + Avoid deeper levels because they do not render well. >> + >> +======= >> +Bareudp >> +======= >> + >> +Q: What is Bareudp? >> + >> + A: There are various L3 encapsulation standards using UDP being discussed >> + to leverage the UDP based load balancing capability of different >> + networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among >> + them. >> + >> + The Bareudp tunnel provides a generic L3 encapsulation support for >> + tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP >> + tunnel. >> + >> + An example to create bareudp device to tunnel MPLS traffic is given >> + below.:: >> + >> + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ >> + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ >> + options:payload_type=0x8847 options:dst_port=6635 \ >> + options:packet_type="legacy_l3" \ >> + ofport_request=$bareudp_egress_port >> + >> + The bareudp device supports special handling for MPLS & IP as they can >> + have multiple ethertypes. >> + MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) & >> + ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4) >> + & ETH_P_IPV6 (v6). >> + >> + The bareudp device to tunnel L3 traffic with multiple ethertypes >> + (MPLS & IP) can be created by passing the L3 protocol name as string in >> + the field payload_type. 
An example to create bareudp device to tunnel >> + MPLS unicast & multicast traffic is given below.:: >> + >> + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ >> + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ >> + options:payload_type=mpls options:dst_port=6635 \ >> + options:packet_type="legacy_l3" >> diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst >> index 334b828..1dd2998 100644 >> --- a/Documentation/faq/index.rst >> +++ b/Documentation/faq/index.rst >> @@ -30,6 +30,7 @@ Open vSwitch FAQ >> .. toctree:: >> :maxdepth: 2 >> >> + bareudp >> configuration >> contributing >> design >> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst >> index e5cef39..9915839 100644 >> --- a/Documentation/faq/releases.rst >> +++ b/Documentation/faq/releases.rst >> @@ -136,6 +136,7 @@ Q: Are all features available with all datapaths? >> Tunnel - ERSPAN 4.18 2.10 2.10 NO >> Tunnel - ERSPAN-IPv6 4.18 2.10 2.10 NO >> Tunnel - GTP-U NO NO 2.14 NO >> + Tunnel - Bareudp 5.7 NO 2.14 NO > > There should be NO instead of 2.14, since you're not adding userspace datapath > support. > >> QoS - Policing YES 1.1 2.6 NO >> QoS - Shaping YES 1.1 NO NO >> sFlow YES 1.0 1.0 NO >> diff --git a/NEWS b/NEWS >> index 0116b3e..f5aa840 100644 >> --- a/NEWS >> +++ b/NEWS >> @@ -23,6 +23,10 @@ Post-v2.13.0 >> - Tunnels: TC Flower offload >> * Tunnel Local endpoint address masked match are supported. >> * Tunnel Romte endpoint address masked match are supported. >> + - Bareudp Tunnel >> + * Bareudp device support is present in linux kernel from version 5.7 >> + * Kernel bareudp device is not backported to ovs tree. 
>> + * Userspace datapath support is not added >> >> >> v2.13.0 - 14 Feb 2020 >> diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h >> index cc41bbe..3073faa 100644 >> --- a/datapath/linux/compat/include/linux/openvswitch.h >> +++ b/datapath/linux/compat/include/linux/openvswitch.h >> @@ -240,6 +240,7 @@ enum ovs_vport_type { >> OVS_VPORT_TYPE_GRE, /* GRE tunnel. */ >> OVS_VPORT_TYPE_VXLAN, /* VXLAN tunnel. */ >> OVS_VPORT_TYPE_GENEVE, /* Geneve tunnel. */ >> + OVS_VPORT_TYPE_BAREUDP, /* Bareudp tunnel. */ > > Since this is not defined in the upstream kernel, we should probably > make it '= 111' in order to avoid possible future collisions. > >> OVS_VPORT_TYPE_LISP = 105, /* LISP tunnel */ >> OVS_VPORT_TYPE_STT = 106, /* STT tunnel */ >> OVS_VPORT_TYPE_ERSPAN = 107, /* ERSPAN tunnel. */ >> @@ -308,6 +309,15 @@ enum { >> >> #define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1) >> >> +enum { >> + OVS_BAREUDP_EXT_UNSPEC, >> + OVS_BAREUDP_EXT_MULTIPROTO_MODE, >> + /* place new values here to fill gap. */ > > There is no gap here. > >> + __OVS_BAREUDP_EXT_MAX, >> +}; >> + >> +#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1) >> + >> /* OVS_VPORT_ATTR_OPTIONS attributes for tunnels. 
>> */ >> enum { >> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c >> index fd157ce..3f6842a 100644 >> --- a/lib/dpif-netlink-rtnl.c >> +++ b/lib/dpif-netlink-rtnl.c >> @@ -58,6 +58,19 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl); >> #define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10 >> #endif >> >> +#ifndef __IFLA_BAREUDP_MAX >> +#define IFLA_BAREUDP_MAX 0 >> +#endif >> +#if IFLA_BAREUDP_MAX < 4 >> +#define IFLA_BAREUDP_PORT 1 >> +#define IFLA_BAREUDP_ETHERTYPE 2 >> +#define IFLA_BAREUDP_SRCPORT_MIN 3 >> +#define IFLA_BAREUDP_MULTIPROTO_MODE 4 >> +#define IFLA_BAREUDP_RX_COLLECT_METADATA 5 >> +#endif >> + >> +#define BAREUDP_MPLS_SRCPORT_MIN 49153 >> + >> static const struct nl_policy rtlink_policy[] = { >> [IFLA_LINKINFO] = { .type = NL_A_NESTED }, >> }; >> @@ -81,6 +94,10 @@ static const struct nl_policy geneve_policy[] = { >> [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 }, >> [IFLA_GENEVE_PORT] = { .type = NL_A_U16 }, >> }; >> +static const struct nl_policy bareudp_policy[] = { >> + [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 }, >> + [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 }, >> +}; >> >> static const char * >> vport_type_to_kind(enum ovs_vport_type type, >> @@ -113,6 +130,8 @@ vport_type_to_kind(enum ovs_vport_type type, >> } >> case OVS_VPORT_TYPE_GTPU: >> return NULL; >> + case OVS_VPORT_TYPE_BAREUDP: >> + return "bareudp"; >> case OVS_VPORT_TYPE_NETDEV: >> case OVS_VPORT_TYPE_INTERNAL: >> case OVS_VPORT_TYPE_LISP: >> @@ -243,6 +262,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg, >> >> return err; >> } >> +static int >> +dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg, >> + const char *kind, struct ofpbuf *reply) >> +{ >> + struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)]; >> + int err; >> + >> + err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp, >> + ARRAY_SIZE(bareudp_policy)); >> + if (!err) { >> + if ((tnl_cfg->dst_port != 
nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT])) >> + || (tnl_cfg->payload_ethertype >> + != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) { >> + err = EINVAL; >> + } >> + } >> + return err; >> +} >> >> static int >> dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg, >> @@ -275,6 +312,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg, >> case OVS_VPORT_TYPE_GENEVE: >> err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply); >> break; >> + case OVS_VPORT_TYPE_BAREUDP: >> + err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply); >> + break; >> case OVS_VPORT_TYPE_NETDEV: >> case OVS_VPORT_TYPE_INTERNAL: >> case OVS_VPORT_TYPE_LISP: >> @@ -357,6 +397,20 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, >> nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1); >> nl_msg_put_be16(&request, IFLA_GENEVE_PORT, tnl_cfg->dst_port); >> break; >> + case OVS_VPORT_TYPE_BAREUDP: >> + nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE, >> + tnl_cfg->payload_ethertype); >> + if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) || >> + (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) { >> + nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN, >> + BAREUDP_MPLS_SRCPORT_MIN); >> + } >> + nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port); >> + if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) { >> + nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE); >> + } >> + nl_msg_put_flag(&request, IFLA_BAREUDP_RX_COLLECT_METADATA); >> + break; >> case OVS_VPORT_TYPE_NETDEV: >> case OVS_VPORT_TYPE_INTERNAL: >> case OVS_VPORT_TYPE_LISP: >> @@ -470,6 +524,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type) >> case OVS_VPORT_TYPE_ERSPAN: >> case OVS_VPORT_TYPE_IP6ERSPAN: >> case OVS_VPORT_TYPE_IP6GRE: >> + case OVS_VPORT_TYPE_BAREUDP: >> return dpif_netlink_rtnl_destroy(name); >> case OVS_VPORT_TYPE_NETDEV: >> case OVS_VPORT_TYPE_INTERNAL: >> diff --git 
a/lib/dpif-netlink.c b/lib/dpif-netlink.c >> index 18322e8..2ad0e64 100644 >> --- a/lib/dpif-netlink.c >> +++ b/lib/dpif-netlink.c >> @@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport *vport) >> case OVS_VPORT_TYPE_GTPU: >> return "gtpu"; >> >> + case OVS_VPORT_TYPE_BAREUDP: >> + return "bareudp"; >> + >> case OVS_VPORT_TYPE_UNSPEC: >> case __OVS_VPORT_TYPE_MAX: >> break; >> @@ -784,6 +787,8 @@ netdev_to_ovs_vport_type(const char *type) >> return OVS_VPORT_TYPE_GRE; >> } else if (!strcmp(type, "gtpu")) { >> return OVS_VPORT_TYPE_GTPU; >> + } else if (!strcmp(type, "bareudp")) { >> + return OVS_VPORT_TYPE_BAREUDP; >> } else { >> return OVS_VPORT_TYPE_UNSPEC; >> } >> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c >> index 0252b61..c86d420 100644 >> --- a/lib/netdev-vport.c >> +++ b/lib/netdev-vport.c >> @@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev) >> return (class->get_config == get_tunnel_config && >> (!strcmp("geneve", type) || !strcmp("vxlan", type) || >> !strcmp("lisp", type) || !strcmp("stt", type) || >> - !strcmp("gtpu", type))); >> + !strcmp("gtpu", type) || !strcmp("bareudp",type))); >> } >> >> const char * >> @@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_) >> dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT); >> } else if (!strcmp(type, "gtpu")) { >> dev->tnl_cfg.dst_port = port ? 
htons(port) : htons(GTPU_DST_PORT); >> + } else if (!strcmp(type, "bareudp")) { >> + dev->tnl_cfg.dst_port = htons(port); >> } >> >> dev->tnl_cfg.dont_fragment = true; >> @@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type, >> return TNL_L2 | TNL_L3; >> } else if (!strcmp(type, "gtpu")) { >> return TNL_L3; >> + } else if (!strcmp(type, "bareudp")) { >> + return TNL_L3; >> } else { >> return TNL_L2; >> } >> @@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp) >> goto out; >> } >> } >> + } else if (!strcmp(node->key, "payload_type")) { >> + if (strcmp(node->key, "mpls")) { >> + tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS); >> + tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE); >> + } else if ((strcmp(node->key, "ip"))) { >> + tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP); >> + tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE); >> + } else { >> + tnl_cfg.payload_ethertype = htons(atoi(node->value)); >> + } >> } else { >> ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name, >> type, node->key); >> @@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct smap *args) >> (!strcmp("vxlan", type) && dst_port != VXLAN_DST_PORT) || >> (!strcmp("lisp", type) && dst_port != LISP_DST_PORT) || >> (!strcmp("stt", type) && dst_port != STT_DST_PORT) || >> - (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) { >> + (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) || >> + !strcmp("bareudp", type)) { >> smap_add_format(args, "dst_port", "%d", dst_port); >> } >> } >> @@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void) >> }, >> {{NULL, NULL, 0, 0}} >> }, >> + { "udp_sys", >> + { >> + TUNNEL_FUNCTIONS_COMMON, >> + .type = "bareudp", >> + .get_ifindex = NETDEV_VPORT_GET_IFINDEX, >> + }, >> + {{NULL, NULL, 0, 0}} >> + }, >> >> }; >> static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; >> diff --git a/lib/netdev.h b/lib/netdev.h >> index fdbe0e1..f15bca5 100644 >> --- 
a/lib/netdev.h >> +++ b/lib/netdev.h >> @@ -107,6 +107,7 @@ struct netdev_tunnel_config { >> bool out_key_flow; >> ovs_be64 out_key; >> >> + ovs_be16 payload_ethertype; >> ovs_be16 dst_port; >> >> bool ip_src_flow; >> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c >> index e0ede2c..6e07960 100644 >> --- a/ofproto/ofproto-dpif-xlate.c >> +++ b/ofproto/ofproto-dpif-xlate.c >> @@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx, struct eth_addr dmac, >> case OVS_VPORT_TYPE_VXLAN: >> case OVS_VPORT_TYPE_GENEVE: >> case OVS_VPORT_TYPE_GTPU: >> + case OVS_VPORT_TYPE_BAREUDP: >> nw_proto = IPPROTO_UDP; >> break; >> case OVS_VPORT_TYPE_LISP: >> diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at >> index 1232964..5d9ea93 100644 >> --- a/tests/system-layer3-tunnels.at >> +++ b/tests/system-layer3-tunnels.at >> @@ -152,3 +152,50 @@ AT_CHECK([tail -1 stdout], [0], >> >> OVS_VSWITCHD_STOP >> AT_CLEANUP >> + >> +AT_SETUP([layer3 - ping over MPLS Bareudp]) I do not see any check that bareudp is supported by the kernel, so I suspect the test will simply fail on any system with an older kernel. We should skip the test in that case instead of failing. 
>> +OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])]) >> +ADD_NAMESPACES(at_ns0, at_ns1) >> + >> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01") >> +ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02") >> + >> +ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3], [8.1.1.2/24], >> + [ options:local_ip=8.1.1.2 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635]) >> + >> +ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2], [8.1.1.3/24], >> + [options:local_ip=8.1.1.3 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635]) >> + >> +AT_DATA([flows0.txt], [dnl >> +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0 >> +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0 >> +table=0,priority=10 actions=normal >> +]) >> + >> +AT_DATA([flows1.txt], [dnl >> +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1 >> +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1 >> +table=0,priority=10 actions=normal >> +]) >> + >> +AT_CHECK([ip link add patch0 type veth peer name patch1]) >> +on_exit 'ip link del patch0' >> + >> +AT_CHECK([ip link set dev patch0 up]) >> +AT_CHECK([ip link set dev patch1 up]) >> +AT_CHECK([ovs-vsctl add-port br0 patch0]) >> +AT_CHECK([ovs-vsctl add-port br1 patch1]) >> + >> + >> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt]) >> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt]) >> + >> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl >> +3 packets transmitted, 3 received, 0% packet loss, time 0ms >> +]) >> + >> +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl 
>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms >> +]) >> +OVS_TRAFFIC_VSWITCHD_STOP >> +AT_CLEANUP >> >
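To illustrate the '= 111' comment quoted above: an enumerator added without an explicit value takes the next implicit number after its predecessor, which is exactly the slot a future upstream kernel type would also take. The sketch below is a shortened, hypothetical copy of the enum for illustration only. The SKETCH_ names are not OVS identifiers, and the value 5 for GENEVE is an assumption about the upstream numbering, not something stated in this thread. The second enum mirrors the patch's extension list to show which bit the (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE) test actually checks.

```c
#include <stdbool.h>

/* Hypothetical, shortened copy of enum ovs_vport_type. */
enum sketch_vport_type {
    SKETCH_VPORT_TYPE_GENEVE = 5,      /* Assumed upstream value. */
    SKETCH_VPORT_TYPE_IMPLICIT,        /* No explicit value: becomes 6. */
    SKETCH_VPORT_TYPE_LISP = 105,      /* Out-of-tree types start here. */
    SKETCH_VPORT_TYPE_STT = 106,
    SKETCH_VPORT_TYPE_ERSPAN = 107,
    SKETCH_VPORT_TYPE_BAREUDP = 111,   /* Explicit value, as suggested. */
};

/* Mirrors the layout of the extension enum added by the patch. */
enum {
    SKETCH_BAREUDP_EXT_UNSPEC,
    SKETCH_BAREUDP_EXT_MULTIPROTO_MODE,
    SKETCH_BAREUDP_EXT_MAX_PLUS_ONE,
};
#define SKETCH_BAREUDP_EXT_MAX (SKETCH_BAREUDP_EXT_MAX_PLUS_ONE - 1)

/* MULTIPROTO_MODE is enumerator 1, so the flag lives in bit 1 (mask 0x2)
 * of the exts bitmask. */
static bool
sketch_multiproto_enabled(unsigned int exts)
{
    return (exts & (1u << SKETCH_BAREUDP_EXT_MULTIPROTO_MODE)) != 0;
}
```

With the explicit value, a later upstream addition after GENEVE can take the implicit slot without colliding with the bareudp type.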
On Fri, Jul 10, 2020 at 07:20:02PM +0200, Ilya Maximets wrote: > On 6/29/20 3:31 PM, Martin Varghese wrote: > > From: Martin Varghese <martin.varghese@nokia.com> > > > > There are various L3 encapsulation standards using UDP being discussed to > > leverage the UDP based load balancing capability of different networks. > > MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them. > > > > The Bareudp tunnel provides a generic L3 encapsulation support for > > tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP > > tunnel. > > > > An example to create bareudp device to tunnel MPLS traffic is > > given > > > > $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ > > type=bareudp options:remote_ip=2.1.1.3 > > options:local_ip=2.1.1.2 \ > > options:payload_type=0x8847 options:dst_port=6635 \ > > options:packet_type="legacy_l3" \ > > ofport_request=$bareudp_egress_port > > > > The bareudp device supports special handling for MPLS & IP as > > they can have multiple ethertypes. MPLS protocol can have ethertypes > > ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have > > ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6). > > > > The bareudp device to tunnel L3 traffic with multiple ethertypes > > (MPLS & IP) can be created by passing the L3 protocol name as string in > > the field payload_type. An example to create bareudp device to tunnel > > MPLS unicast & multicast traffic is given below. > > > > $ ovs-vsctl add-port br_mpls udp_port -- set interface > > udp_port \ > > type=bareudp options:remote_ip=2.1.1.3 > > options:local_ip=2.1.1.2 \ > > options:payload_type=mpls options:dst_port=6635 \ > > options:packet_type="legacy_l3" > > > > Signed-off-by: Martin Varghese <martin.varghese@nokia.com> > > --- > > Changes in v2: > > - Removed vport-bareudp module. > > > > Changes in v3: > > - Added net-next upstream commit id and message to commit message. > > > > Changes in v4: > > - Removed kernel datapath changes. 
> > > > Changes in v5: > > - Fixed release notes errors. > > - Fixed coding errors in dpif-netlink-rtnl.c. > > > > Changes in v6: > > - Added code to enable rx metadata collection in the kernel device. > > - Added version history. > > > > Documentation/automake.mk | 1 + > > Documentation/faq/bareudp.rst | 62 +++++++++++++++++++++++ > > Documentation/faq/index.rst | 1 + > > Documentation/faq/releases.rst | 1 + > > NEWS | 4 ++ > > datapath/linux/compat/include/linux/openvswitch.h | 10 ++++ > > lib/dpif-netlink-rtnl.c | 55 ++++++++++++++++++++ > > lib/dpif-netlink.c | 5 ++ > > lib/netdev-vport.c | 27 +++++++++- > > lib/netdev.h | 1 + > > ofproto/ofproto-dpif-xlate.c | 1 + > > tests/system-layer3-tunnels.at | 47 +++++++++++++++++ > > 12 files changed, 213 insertions(+), 2 deletions(-) > > create mode 100644 Documentation/faq/bareudp.rst > > > > diff --git a/Documentation/automake.mk b/Documentation/automake.mk > > index f85c432..ea3475f 100644 > > --- a/Documentation/automake.mk > > +++ b/Documentation/automake.mk > > @@ -88,6 +88,7 @@ DOC_SOURCE = \ > > Documentation/faq/terminology.rst \ > > Documentation/faq/vlan.rst \ > > Documentation/faq/vxlan.rst \ > > + Documentation/faq/bareudp.rst \ > > Documentation/internals/index.rst \ > > Documentation/internals/authors.rst \ > > Documentation/internals/bugs.rst \ > > diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst > > new file mode 100644 > > index 0000000..9266daa > > --- /dev/null > > +++ b/Documentation/faq/bareudp.rst > > @@ -0,0 +1,62 @@ > > +.. > > + Licensed under the Apache License, Version 2.0 (the "License"); you may > > + not use this file except in compliance with the License. 
You may obtain > > + a copy of the License at > > + > > + http://www.apache.org/licenses/LICENSE-2.0 > > + > > + Unless required by applicable law or agreed to in writing, software > > + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT > > + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the > > + License for the specific language governing permissions and limitations > > + under the License. > > + > > + Convention for heading levels in Open vSwitch documentation: > > + > > + ======= Heading 0 (reserved for the title in a document) > > + ------- Heading 1 > > + ~~~~~~~ Heading 2 > > + +++++++ Heading 3 > > + ''''''' Heading 4 > > + > > + Avoid deeper levels because they do not render well. > > + > > +======= > > +Bareudp > > +======= > > + > > +Q: What is Bareudp? > > + > > + A: There are various L3 encapsulation standards using UDP being discussed > > + to leverage the UDP based load balancing capability of different > > + networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among > > + them. > > + > > + The Bareudp tunnel provides a generic L3 encapsulation support for > > + tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP > > + tunnel. > > + > > + An example to create bareudp device to tunnel MPLS traffic is given > > + below.:: > > + > > + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ > > + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ > > + options:payload_type=0x8847 options:dst_port=6635 \ > > + options:packet_type="legacy_l3" \ > > + ofport_request=$bareudp_egress_port > > + > > + The bareudp device supports special handling for MPLS & IP as they can > > + have multiple ethertypes. > > + MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) & > > + ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4) > > + & ETH_P_IPV6 (v6). 
> > + > > + The bareudp device to tunnel L3 traffic with multiple ethertypes > > + (MPLS & IP) can be created by passing the L3 protocol name as string in > > + the field payload_type. An example to create bareudp device to tunnel > > + MPLS unicast & multicast traffic is given below.:: > > + > > + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ > > + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ > > + options:payload_type=mpls options:dst_port=6635 \ > > + options:packet_type="legacy_l3" > > diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst > > index 334b828..1dd2998 100644 > > --- a/Documentation/faq/index.rst > > +++ b/Documentation/faq/index.rst > > @@ -30,6 +30,7 @@ Open vSwitch FAQ > > .. toctree:: > > :maxdepth: 2 > > > > + bareudp > > configuration > > contributing > > design > > diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst > > index e5cef39..9915839 100644 > > --- a/Documentation/faq/releases.rst > > +++ b/Documentation/faq/releases.rst > > @@ -136,6 +136,7 @@ Q: Are all features available with all datapaths? > > Tunnel - ERSPAN 4.18 2.10 2.10 NO > > Tunnel - ERSPAN-IPv6 4.18 2.10 2.10 NO > > Tunnel - GTP-U NO NO 2.14 NO > > + Tunnel - Bareudp 5.7 NO 2.14 NO > > There should be NO instead of 2.14, since you're not adding userspace datapath > support. > Noted. > > QoS - Policing YES 1.1 2.6 NO > > QoS - Shaping YES 1.1 NO NO > > sFlow YES 1.0 1.0 NO > > diff --git a/NEWS b/NEWS > > index 0116b3e..f5aa840 100644 > > --- a/NEWS > > +++ b/NEWS > > @@ -23,6 +23,10 @@ Post-v2.13.0 > > - Tunnels: TC Flower offload > > * Tunnel Local endpoint address masked match are supported. > > * Tunnel Romte endpoint address masked match are supported. > > + - Bareudp Tunnel > > + * Bareudp device support is present in linux kernel from version 5.7 > > + * Kernel bareudp device is not backported to ovs tree. 
> > + * Userspace datapath support is not added > > > > > > v2.13.0 - 14 Feb 2020 > > diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h > > index cc41bbe..3073faa 100644 > > --- a/datapath/linux/compat/include/linux/openvswitch.h > > +++ b/datapath/linux/compat/include/linux/openvswitch.h > > @@ -240,6 +240,7 @@ enum ovs_vport_type { > > OVS_VPORT_TYPE_GRE, /* GRE tunnel. */ > > OVS_VPORT_TYPE_VXLAN, /* VXLAN tunnel. */ > > OVS_VPORT_TYPE_GENEVE, /* Geneve tunnel. */ > > + OVS_VPORT_TYPE_BAREUDP, /* Bareudp tunnel. */ > > Since this is not defined in the upstream kernel, we should probably > make it '= 111' in order to avoid possible future collisions. > We could make it 111. > > OVS_VPORT_TYPE_LISP = 105, /* LISP tunnel */ > > OVS_VPORT_TYPE_STT = 106, /* STT tunnel */ > > OVS_VPORT_TYPE_ERSPAN = 107, /* ERSPAN tunnel. */ > > @@ -308,6 +309,15 @@ enum { > > > > #define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1) > > > > +enum { > > + OVS_BAREUDP_EXT_UNSPEC, > > + OVS_BAREUDP_EXT_MULTIPROTO_MODE, > > + /* place new values here to fill gap. */ > > There is no gap here. > Noted. > > + __OVS_BAREUDP_EXT_MAX, > > +}; > > + > > +#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1) > > + > > /* OVS_VPORT_ATTR_OPTIONS attributes for tunnels. 
> > */ > > enum { > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > index fd157ce..3f6842a 100644 > > --- a/lib/dpif-netlink-rtnl.c > > +++ b/lib/dpif-netlink-rtnl.c > > @@ -58,6 +58,19 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl); > > #define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10 > > #endif > > > > +#ifndef __IFLA_BAREUDP_MAX > > +#define IFLA_BAREUDP_MAX 0 > > +#endif > > +#if IFLA_BAREUDP_MAX < 4 > > +#define IFLA_BAREUDP_PORT 1 > > +#define IFLA_BAREUDP_ETHERTYPE 2 > > +#define IFLA_BAREUDP_SRCPORT_MIN 3 > > +#define IFLA_BAREUDP_MULTIPROTO_MODE 4 > > +#define IFLA_BAREUDP_RX_COLLECT_METADATA 5 > > +#endif > > + > > +#define BAREUDP_MPLS_SRCPORT_MIN 49153 > > + > > static const struct nl_policy rtlink_policy[] = { > > [IFLA_LINKINFO] = { .type = NL_A_NESTED }, > > }; > > @@ -81,6 +94,10 @@ static const struct nl_policy geneve_policy[] = { > > [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 }, > > [IFLA_GENEVE_PORT] = { .type = NL_A_U16 }, > > }; > > +static const struct nl_policy bareudp_policy[] = { > > + [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 }, > > + [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 }, > > +}; > > > > static const char * > > vport_type_to_kind(enum ovs_vport_type type, > > @@ -113,6 +130,8 @@ vport_type_to_kind(enum ovs_vport_type type, > > } > > case OVS_VPORT_TYPE_GTPU: > > return NULL; > > + case OVS_VPORT_TYPE_BAREUDP: > > + return "bareudp"; > > case OVS_VPORT_TYPE_NETDEV: > > case OVS_VPORT_TYPE_INTERNAL: > > case OVS_VPORT_TYPE_LISP: > > @@ -243,6 +262,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg, > > > > return err; > > } > > +static int > > +dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg, > > + const char *kind, struct ofpbuf *reply) > > +{ > > + struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)]; > > + int err; > > + > > + err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp, > > + ARRAY_SIZE(bareudp_policy)); > > + if (!err) { > > + if 
((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT])) > > + || (tnl_cfg->payload_ethertype > > + != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) { > > + err = EINVAL; > > + } > > + } > > + return err; > > +} > > > > static int > > dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg, > > @@ -275,6 +312,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg, > > case OVS_VPORT_TYPE_GENEVE: > > err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply); > > break; > > + case OVS_VPORT_TYPE_BAREUDP: > > + err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply); > > + break; > > case OVS_VPORT_TYPE_NETDEV: > > case OVS_VPORT_TYPE_INTERNAL: > > case OVS_VPORT_TYPE_LISP: > > @@ -357,6 +397,20 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > > nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1); > > nl_msg_put_be16(&request, IFLA_GENEVE_PORT, tnl_cfg->dst_port); > > break; > > + case OVS_VPORT_TYPE_BAREUDP: > > + nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE, > > + tnl_cfg->payload_ethertype); > > + if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) || > > + (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) { > > + nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN, > > + BAREUDP_MPLS_SRCPORT_MIN); > > + } > > + nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port); > > + if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) { > > + nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE); > > + } > > + nl_msg_put_flag(&request, IFLA_BAREUDP_RX_COLLECT_METADATA); > > + break; > > case OVS_VPORT_TYPE_NETDEV: > > case OVS_VPORT_TYPE_INTERNAL: > > case OVS_VPORT_TYPE_LISP: > > @@ -470,6 +524,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type) > > case OVS_VPORT_TYPE_ERSPAN: > > case OVS_VPORT_TYPE_IP6ERSPAN: > > case OVS_VPORT_TYPE_IP6GRE: > > + case OVS_VPORT_TYPE_BAREUDP: > > return dpif_netlink_rtnl_destroy(name); > > case 
OVS_VPORT_TYPE_NETDEV: > > case OVS_VPORT_TYPE_INTERNAL: > > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c > > index 18322e8..2ad0e64 100644 > > --- a/lib/dpif-netlink.c > > +++ b/lib/dpif-netlink.c > > @@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport *vport) > > case OVS_VPORT_TYPE_GTPU: > > return "gtpu"; > > > > + case OVS_VPORT_TYPE_BAREUDP: > > + return "bareudp"; > > + > > case OVS_VPORT_TYPE_UNSPEC: > > case __OVS_VPORT_TYPE_MAX: > > break; > > @@ -784,6 +787,8 @@ netdev_to_ovs_vport_type(const char *type) > > return OVS_VPORT_TYPE_GRE; > > } else if (!strcmp(type, "gtpu")) { > > return OVS_VPORT_TYPE_GTPU; > > + } else if (!strcmp(type, "bareudp")) { > > + return OVS_VPORT_TYPE_BAREUDP; > > } else { > > return OVS_VPORT_TYPE_UNSPEC; > > } > > diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c > > index 0252b61..c86d420 100644 > > --- a/lib/netdev-vport.c > > +++ b/lib/netdev-vport.c > > @@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev) > > return (class->get_config == get_tunnel_config && > > (!strcmp("geneve", type) || !strcmp("vxlan", type) || > > !strcmp("lisp", type) || !strcmp("stt", type) || > > - !strcmp("gtpu", type))); > > + !strcmp("gtpu", type) || !strcmp("bareudp",type))); > > } > > > > const char * > > @@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_) > > dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT); > > } else if (!strcmp(type, "gtpu")) { > > dev->tnl_cfg.dst_port = port ? 
htons(port) : htons(GTPU_DST_PORT); > > + } else if (!strcmp(type, "bareudp")) { > > + dev->tnl_cfg.dst_port = htons(port); > > } > > > > dev->tnl_cfg.dont_fragment = true; > > @@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type, > > return TNL_L2 | TNL_L3; > > } else if (!strcmp(type, "gtpu")) { > > return TNL_L3; > > + } else if (!strcmp(type, "bareudp")) { > > + return TNL_L3; > > } else { > > return TNL_L2; > > } > > @@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp) > > goto out; > > } > > } > > + } else if (!strcmp(node->key, "payload_type")) { > > + if (strcmp(node->key, "mpls")) { > > + tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS); > > + tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE); > > + } else if ((strcmp(node->key, "ip"))) { > > + tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP); > > + tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE); > > + } else { > > + tnl_cfg.payload_ethertype = htons(atoi(node->value)); > > + } > > } else { > > ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name, > > type, node->key); > > @@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct smap *args) > > (!strcmp("vxlan", type) && dst_port != VXLAN_DST_PORT) || > > (!strcmp("lisp", type) && dst_port != LISP_DST_PORT) || > > (!strcmp("stt", type) && dst_port != STT_DST_PORT) || > > - (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) { > > + (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) || > > + !strcmp("bareudp", type)) { > > smap_add_format(args, "dst_port", "%d", dst_port); > > } > > } > > @@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void) > > }, > > {{NULL, NULL, 0, 0}} > > }, > > + { "udp_sys", > > + { > > + TUNNEL_FUNCTIONS_COMMON, > > + .type = "bareudp", > > + .get_ifindex = NETDEV_VPORT_GET_IFINDEX, > > + }, > > + {{NULL, NULL, 0, 0}} > > + }, > > > > }; > > static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; > > diff --git a/lib/netdev.h 
b/lib/netdev.h > > index fdbe0e1..f15bca5 100644 > > --- a/lib/netdev.h > > +++ b/lib/netdev.h > > @@ -107,6 +107,7 @@ struct netdev_tunnel_config { > > bool out_key_flow; > > ovs_be64 out_key; > > > > + ovs_be16 payload_ethertype; > > ovs_be16 dst_port; > > > > bool ip_src_flow; > > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > > index e0ede2c..6e07960 100644 > > --- a/ofproto/ofproto-dpif-xlate.c > > +++ b/ofproto/ofproto-dpif-xlate.c > > @@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx, struct eth_addr dmac, > > case OVS_VPORT_TYPE_VXLAN: > > case OVS_VPORT_TYPE_GENEVE: > > case OVS_VPORT_TYPE_GTPU: > > + case OVS_VPORT_TYPE_BAREUDP: > > nw_proto = IPPROTO_UDP; > > break; > > case OVS_VPORT_TYPE_LISP: > > diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at > > index 1232964..5d9ea93 100644 > > --- a/tests/system-layer3-tunnels.at > > +++ b/tests/system-layer3-tunnels.at > > @@ -152,3 +152,50 @@ AT_CHECK([tail -1 stdout], [0], > > > > OVS_VSWITCHD_STOP > > AT_CLEANUP > > + > > +AT_SETUP([layer3 - ping over MPLS Bareudp]) > > +OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])]) > > +ADD_NAMESPACES(at_ns0, at_ns1) > > + > > +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01") > > +ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02") > > + > > +ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3], [8.1.1.2/24], > > + [ options:local_ip=8.1.1.2 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635]) > > + > > +ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2], [8.1.1.3/24], > > + [options:local_ip=8.1.1.3 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635]) > > + > > +AT_DATA([flows0.txt], [dnl > > +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0 > > +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0 
actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0 > > +table=0,priority=10 actions=normal > > +]) > > + > > +AT_DATA([flows1.txt], [dnl > > +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1 > > +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1 > > +table=0,priority=10 actions=normal > > +]) > > + > > +AT_CHECK([ip link add patch0 type veth peer name patch1]) > > +on_exit 'ip link del patch0' > > + > > +AT_CHECK([ip link set dev patch0 up]) > > +AT_CHECK([ip link set dev patch1 up]) > > +AT_CHECK([ovs-vsctl add-port br0 patch0]) > > +AT_CHECK([ovs-vsctl add-port br1 patch1]) > > + > > + > > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt]) > > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt]) > > + > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > > Thanks for reviewing Regards, Martin
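For reference, a minimal sketch of the suggestion in the review above to give OVS_VPORT_TYPE_BAREUDP an explicit value of 111. The `sketch_`/`SKETCH_` names are hypothetical stand-ins, not the contents of the real openvswitch.h, and the "future upstream" enumerator is an assumption used only to show the collision being avoided: a type added upstream after Geneve takes the next sequential value, which is the value Bareudp would have taken had it been left unnumbered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for enum ovs_vport_type (illustrative names only). */
enum sketch_vport_type {
    SKETCH_VPORT_TYPE_UNSPEC,
    SKETCH_VPORT_TYPE_NETDEV,
    SKETCH_VPORT_TYPE_INTERNAL,
    SKETCH_VPORT_TYPE_GRE,
    SKETCH_VPORT_TYPE_VXLAN,
    SKETCH_VPORT_TYPE_GENEVE,
    /* Assumption: a type the upstream kernel might add later.  It gets the
     * next sequential value after Geneve. */
    SKETCH_VPORT_TYPE_FUTURE_UPSTREAM,
    /* Types not defined upstream carry explicit values, so they never
     * shift when the sequential range above grows. */
    SKETCH_VPORT_TYPE_LISP = 105,
    SKETCH_VPORT_TYPE_STT = 106,
    SKETCH_VPORT_TYPE_ERSPAN = 107,
    SKETCH_VPORT_TYPE_BAREUDP = 111,
};

/* Compile-time check that the pinned value stays clear of the
 * sequentially numbered range. */
static_assert(SKETCH_VPORT_TYPE_BAREUDP > SKETCH_VPORT_TYPE_FUTURE_UPSTREAM,
              "pinned vport type must not overlap the sequential range");

/* Maps a sketch vport type to a printable name; NULL if unknown. */
const char *
sketch_vport_type_name(enum sketch_vport_type type)
{
    switch (type) {
    case SKETCH_VPORT_TYPE_GENEVE:
        return "geneve";
    case SKETCH_VPORT_TYPE_BAREUDP:
        return "bareudp";
    default:
        return NULL;
    }
}
```

With the explicit value, inserting another enumerator in the sequential part of the enum leaves the Bareudp number unchanged.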
diff --git a/Documentation/automake.mk b/Documentation/automake.mk index f85c432..ea3475f 100644 --- a/Documentation/automake.mk +++ b/Documentation/automake.mk @@ -88,6 +88,7 @@ DOC_SOURCE = \ Documentation/faq/terminology.rst \ Documentation/faq/vlan.rst \ Documentation/faq/vxlan.rst \ + Documentation/faq/bareudp.rst \ Documentation/internals/index.rst \ Documentation/internals/authors.rst \ Documentation/internals/bugs.rst \ diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst new file mode 100644 index 0000000..9266daa --- /dev/null +++ b/Documentation/faq/bareudp.rst @@ -0,0 +1,62 @@ +.. + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + License for the specific language governing permissions and limitations + under the License. + + Convention for heading levels in Open vSwitch documentation: + + ======= Heading 0 (reserved for the title in a document) + ------- Heading 1 + ~~~~~~~ Heading 2 + +++++++ Heading 3 + ''''''' Heading 4 + + Avoid deeper levels because they do not render well. + +======= +Bareudp +======= + +Q: What is Bareudp? + + A: There are various L3 encapsulation standards using UDP being discussed + to leverage the UDP based load balancing capability of different + networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among + them. + + The Bareudp tunnel provides a generic L3 encapsulation support for + tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP + tunnel. 
+ + An example to create bareudp device to tunnel MPLS traffic is given + below:: + + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ + options:payload_type=0x8847 options:dst_port=6635 \ + options:packet_type="legacy_l3" \ + ofport_request=$bareudp_egress_port + + The bareudp device supports special handling for MPLS & IP as they can + have multiple ethertypes. + The MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) & + ETH_P_MPLS_MC (multicast). The IP protocol can have ethertypes ETH_P_IP (v4) + & ETH_P_IPV6 (v6). + + The bareudp device to tunnel L3 traffic with multiple ethertypes + (MPLS & IP) can be created by passing the L3 protocol name as a string in + the field payload_type. An example to create bareudp device to tunnel + MPLS unicast & multicast traffic is given below:: + + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ + options:payload_type=mpls options:dst_port=6635 \ + options:packet_type="legacy_l3" diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst index 334b828..1dd2998 100644 --- a/Documentation/faq/index.rst +++ b/Documentation/faq/index.rst @@ -30,6 +30,7 @@ Open vSwitch FAQ .. toctree:: :maxdepth: 2 + bareudp configuration contributing design diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index e5cef39..9915839 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -136,6 +136,7 @@ Q: Are all features available with all datapaths? 
Tunnel - ERSPAN 4.18 2.10 2.10 NO Tunnel - ERSPAN-IPv6 4.18 2.10 2.10 NO Tunnel - GTP-U NO NO 2.14 NO + Tunnel - Bareudp 5.7 NO 2.14 NO QoS - Policing YES 1.1 2.6 NO QoS - Shaping YES 1.1 NO NO sFlow YES 1.0 1.0 NO diff --git a/NEWS b/NEWS index 0116b3e..f5aa840 100644 --- a/NEWS +++ b/NEWS @@ -23,6 +23,10 @@ Post-v2.13.0 - Tunnels: TC Flower offload * Tunnel Local endpoint address masked match are supported. * Tunnel Romte endpoint address masked match are supported. + - Bareudp Tunnel + * Bareudp device support is present in linux kernel from version 5.7 + * Kernel bareudp device is not backported to ovs tree. + * Userspace datapath support is not added v2.13.0 - 14 Feb 2020 diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h index cc41bbe..3073faa 100644 --- a/datapath/linux/compat/include/linux/openvswitch.h +++ b/datapath/linux/compat/include/linux/openvswitch.h @@ -240,6 +240,7 @@ enum ovs_vport_type { OVS_VPORT_TYPE_GRE, /* GRE tunnel. */ OVS_VPORT_TYPE_VXLAN, /* VXLAN tunnel. */ OVS_VPORT_TYPE_GENEVE, /* Geneve tunnel. */ + OVS_VPORT_TYPE_BAREUDP, /* Bareudp tunnel. */ OVS_VPORT_TYPE_LISP = 105, /* LISP tunnel */ OVS_VPORT_TYPE_STT = 106, /* STT tunnel */ OVS_VPORT_TYPE_ERSPAN = 107, /* ERSPAN tunnel. */ @@ -308,6 +309,15 @@ enum { #define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1) +enum { + OVS_BAREUDP_EXT_UNSPEC, + OVS_BAREUDP_EXT_MULTIPROTO_MODE, + /* place new values here to fill gap. */ + __OVS_BAREUDP_EXT_MAX, +}; + +#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1) + /* OVS_VPORT_ATTR_OPTIONS attributes for tunnels. 
*/ enum { diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c index fd157ce..3f6842a 100644 --- a/lib/dpif-netlink-rtnl.c +++ b/lib/dpif-netlink-rtnl.c @@ -58,6 +58,19 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl); #define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10 #endif +#ifndef __IFLA_BAREUDP_MAX +#define IFLA_BAREUDP_MAX 0 +#endif +#if IFLA_BAREUDP_MAX < 4 +#define IFLA_BAREUDP_PORT 1 +#define IFLA_BAREUDP_ETHERTYPE 2 +#define IFLA_BAREUDP_SRCPORT_MIN 3 +#define IFLA_BAREUDP_MULTIPROTO_MODE 4 +#define IFLA_BAREUDP_RX_COLLECT_METADATA 5 +#endif + +#define BAREUDP_MPLS_SRCPORT_MIN 49153 + static const struct nl_policy rtlink_policy[] = { [IFLA_LINKINFO] = { .type = NL_A_NESTED }, }; @@ -81,6 +94,10 @@ static const struct nl_policy geneve_policy[] = { [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 }, [IFLA_GENEVE_PORT] = { .type = NL_A_U16 }, }; +static const struct nl_policy bareudp_policy[] = { + [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 }, + [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 }, +}; static const char * vport_type_to_kind(enum ovs_vport_type type, @@ -113,6 +130,8 @@ vport_type_to_kind(enum ovs_vport_type type, } case OVS_VPORT_TYPE_GTPU: return NULL; + case OVS_VPORT_TYPE_BAREUDP: + return "bareudp"; case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: case OVS_VPORT_TYPE_LISP: @@ -243,6 +262,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg, return err; } +static int +dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg, + const char *kind, struct ofpbuf *reply) +{ + struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)]; + int err; + + err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp, + ARRAY_SIZE(bareudp_policy)); + if (!err) { + if ((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT])) + || (tnl_cfg->payload_ethertype + != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) { + err = EINVAL; + } + } + return err; +} static int dpif_netlink_rtnl_verify(const 
struct netdev_tunnel_config *tnl_cfg, @@ -275,6 +312,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg, case OVS_VPORT_TYPE_GENEVE: err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply); break; + case OVS_VPORT_TYPE_BAREUDP: + err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply); + break; case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: case OVS_VPORT_TYPE_LISP: @@ -357,6 +397,20 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1); nl_msg_put_be16(&request, IFLA_GENEVE_PORT, tnl_cfg->dst_port); break; + case OVS_VPORT_TYPE_BAREUDP: + nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE, + tnl_cfg->payload_ethertype); + if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) || + (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) { + nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN, + BAREUDP_MPLS_SRCPORT_MIN); + } + nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port); + if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) { + nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE); + } + nl_msg_put_flag(&request, IFLA_BAREUDP_RX_COLLECT_METADATA); + break; case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: case OVS_VPORT_TYPE_LISP: @@ -470,6 +524,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type) case OVS_VPORT_TYPE_ERSPAN: case OVS_VPORT_TYPE_IP6ERSPAN: case OVS_VPORT_TYPE_IP6GRE: + case OVS_VPORT_TYPE_BAREUDP: return dpif_netlink_rtnl_destroy(name); case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 18322e8..2ad0e64 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport *vport) case OVS_VPORT_TYPE_GTPU: return "gtpu"; + case OVS_VPORT_TYPE_BAREUDP: + return "bareudp"; + case OVS_VPORT_TYPE_UNSPEC: case __OVS_VPORT_TYPE_MAX: break; @@ -784,6 +787,8 @@ 
netdev_to_ovs_vport_type(const char *type) return OVS_VPORT_TYPE_GRE; } else if (!strcmp(type, "gtpu")) { return OVS_VPORT_TYPE_GTPU; + } else if (!strcmp(type, "bareudp")) { + return OVS_VPORT_TYPE_BAREUDP; } else { return OVS_VPORT_TYPE_UNSPEC; } diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index 0252b61..c86d420 100644 --- a/lib/netdev-vport.c +++ b/lib/netdev-vport.c @@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev) return (class->get_config == get_tunnel_config && (!strcmp("geneve", type) || !strcmp("vxlan", type) || !strcmp("lisp", type) || !strcmp("stt", type) || - !strcmp("gtpu", type))); + !strcmp("gtpu", type) || !strcmp("bareudp", type))); } const char * @@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_) dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT); } else if (!strcmp(type, "gtpu")) { dev->tnl_cfg.dst_port = port ? htons(port) : htons(GTPU_DST_PORT); + } else if (!strcmp(type, "bareudp")) { + dev->tnl_cfg.dst_port = htons(port); } dev->tnl_cfg.dont_fragment = true; @@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type, return TNL_L2 | TNL_L3; } else if (!strcmp(type, "gtpu")) { return TNL_L3; + } else if (!strcmp(type, "bareudp")) { + return TNL_L3; } else { return TNL_L2; } @@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp) goto out; } } + } else if (!strcmp(node->key, "payload_type")) { + if (!strcmp(node->value, "mpls")) { + tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS); + tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE); + } else if (!strcmp(node->value, "ip")) { + tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP); + tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE); + } else { + tnl_cfg.payload_ethertype = htons(strtoul(node->value, NULL, 0)); + } } else { ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name, type, node->key); @@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct smap *args) (!strcmp("vxlan", 
type) && dst_port != VXLAN_DST_PORT) || (!strcmp("lisp", type) && dst_port != LISP_DST_PORT) || (!strcmp("stt", type) && dst_port != STT_DST_PORT) || - (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) { + (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) || + !strcmp("bareudp", type)) { smap_add_format(args, "dst_port", "%d", dst_port); } } @@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void) }, {{NULL, NULL, 0, 0}} }, + { "udp_sys", + { + TUNNEL_FUNCTIONS_COMMON, + .type = "bareudp", + .get_ifindex = NETDEV_VPORT_GET_IFINDEX, + }, + {{NULL, NULL, 0, 0}} + }, }; static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; diff --git a/lib/netdev.h b/lib/netdev.h index fdbe0e1..f15bca5 100644 --- a/lib/netdev.h +++ b/lib/netdev.h @@ -107,6 +107,7 @@ struct netdev_tunnel_config { bool out_key_flow; ovs_be64 out_key; + ovs_be16 payload_ethertype; ovs_be16 dst_port; bool ip_src_flow; diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index e0ede2c..6e07960 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx, struct eth_addr dmac, case OVS_VPORT_TYPE_VXLAN: case OVS_VPORT_TYPE_GENEVE: case OVS_VPORT_TYPE_GTPU: + case OVS_VPORT_TYPE_BAREUDP: nw_proto = IPPROTO_UDP; break; case OVS_VPORT_TYPE_LISP: diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at index 1232964..5d9ea93 100644 --- a/tests/system-layer3-tunnels.at +++ b/tests/system-layer3-tunnels.at @@ -152,3 +152,50 @@ AT_CHECK([tail -1 stdout], [0], OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([layer3 - ping over MPLS Bareudp]) +OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])]) +ADD_NAMESPACES(at_ns0, at_ns1) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01") +ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02") + +ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3], [8.1.1.2/24], + [ options:local_ip=8.1.1.2 
options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635]) + +ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2], [8.1.1.3/24], + [options:local_ip=8.1.1.3 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635]) + +AT_DATA([flows0.txt], [dnl +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0 +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0 +table=0,priority=10 actions=normal +]) + +AT_DATA([flows1.txt], [dnl +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1 +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1 +table=0,priority=10 actions=normal +]) + +AT_CHECK([ip link add patch0 type veth peer name patch1]) +on_exit 'ip link del patch0' + +AT_CHECK([ip link set dev patch0 up]) +AT_CHECK([ip link set dev patch1 up]) +AT_CHECK([ovs-vsctl add-port br0 patch0]) +AT_CHECK([ovs-vsctl add-port br1 patch1]) + + +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt]) +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt]) + +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP
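A minimal sketch of how the payload_type option described in bareudp.rst could be parsed, offered as an illustration rather than as the patch's implementation. It assumes the option value (not the key) selects the protocol, that "mpls" and "ip" turn on multiproto mode, and that numeric ethertypes may be written in hex as in the 0x8847 example. All `sketch_`/`SKETCH_` names are hypothetical. One detail worth noting: atoi() stops at the 'x' in "0x8847" and returns 0, so the sketch uses strtoul() with base 0, which accepts decimal and 0x-prefixed hex.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ETH_TYPE_IP   0x0800
#define SKETCH_ETH_TYPE_MPLS 0x8847

/* Hypothetical result type: ethertype in host byte order plus the
 * multiproto flag. */
struct sketch_payload {
    uint16_t ethertype;
    bool multiproto;
};

/* Parses a payload_type value.  Returns 0 on success, EINVAL if the value
 * is neither a known protocol name nor a nonzero 16-bit number. */
int
sketch_parse_payload_type(const char *value, struct sketch_payload *out)
{
    assert(value != NULL && out != NULL);

    out->multiproto = false;
    if (!strcmp(value, "mpls")) {
        /* strcmp() returns 0 on a match, hence the negation. */
        out->ethertype = SKETCH_ETH_TYPE_MPLS;
        out->multiproto = true;
    } else if (!strcmp(value, "ip")) {
        out->ethertype = SKETCH_ETH_TYPE_IP;
        out->multiproto = true;
    } else {
        char *end;
        unsigned long n;

        errno = 0;
        n = strtoul(value, &end, 0);   /* base 0: decimal or 0x-hex */
        if (errno || end == value || *end != '\0'
            || n == 0 || n > UINT16_MAX) {
            return EINVAL;
        }
        out->ethertype = (uint16_t) n;
    }
    return 0;
}
```

Rejecting 0 and trailing garbage is a design choice of this sketch: it turns a mistyped protocol name into a configuration error instead of a tunnel with ethertype 0. A caller would apply htons() to the result before storing it in a big-endian field.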