[ovs-dev,3/3] netdev-offload-tc: Add VxLAN encap support.

Message ID 1594224636-42337-4-git-send-email-u9012063@gmail.com
State Changes Requested
Series Add VxLAN encap support for tc offload.

Commit Message

William Tu July 8, 2020, 4:10 p.m. UTC
The patch adds VxLAN encap tc-offload support.  The flow format of the userspace
datapath, dpif-netdev, differs from that of the kernel datapath in the case of tunnel
encap.  Unlike the kernel datapath, dpif-netdev does not use set and output actions,
but a single clone action with all the tunnel info nested inside.  An example is below:
actions:clone(tnl_push(tnl_port(5),
  header(size=50,type=4,eth(dst=06:1d:6e:a3:f1:61,src=26:df:25:f6:7b:4f,dl_type=0x0800),
    ipv4(src=172.31.1.100,dst=172.31.1.1,proto=17,tos=0,ttl=64,frag=0x4000),
    udp(src=0,dst=4789,csum=0x0),
    vxlan(flags=0x8000000,vni=0x0)),out_port(2)
  ), 3)

The patch parses the above tunnel encap format and passes the result to
tc to offload the VxLAN tunnel.
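
To illustrate the parsing step, here is a minimal standalone sketch, not the
patch's code.  It assumes an outer IPv4 header without options, and every name
in it (struct encap_info, parse_vxlan_encap, build_sample_header) is
hypothetical; the patch itself reads OVS's struct ovs_action_push_tnl and fills
a struct tc_action.  The sample values mirror the flow shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header sizes; the IPv4 header is assumed to carry no options. */
enum {
    ETH_LEN = 14,
    IPV4_LEN = 20,
    UDP_LEN = 8,
    VXLAN_LEN = 8,
    ENCAP_LEN = ETH_LEN + IPV4_LEN + UDP_LEN + VXLAN_LEN   /* 50 bytes. */
};

/* Hypothetical result type, used by this sketch only. */
struct encap_info {
    uint8_t ipv4_src[4];    /* Outer source address, network order. */
    uint8_t ipv4_dst[4];    /* Outer destination address, network order. */
    uint8_t ttl;            /* Outer TTL. */
    uint16_t udp_dst;       /* Outer UDP destination port, host order. */
    uint32_t vni;           /* 24-bit VXLAN network identifier, host order. */
};

/* Reads a big-endian 16-bit value from possibly unaligned memory. */
static uint16_t
load_be16(const uint8_t *p)
{
    return (uint16_t) ((p[0] << 8) | p[1]);
}

/* Parses a prebuilt Ethernet/IPv4/UDP/VXLAN header of 'len' bytes.
 * Returns 0 on success, -1 if it is too short or not IPv4 carrying UDP. */
static int
parse_vxlan_encap(const uint8_t *hdr, size_t len, struct encap_info *out)
{
    const uint8_t *ip, *udp, *vxlan;

    assert(hdr && out);
    if (len < ENCAP_LEN) {
        return -1;
    }
    ip = hdr + ETH_LEN;
    udp = ip + IPV4_LEN;
    vxlan = udp + UDP_LEN;

    /* EtherType 0x0800, IP version 4 with IHL 5, protocol 17 (UDP). */
    if (load_be16(hdr + 12) != 0x0800 || ip[0] != 0x45 || ip[9] != 17) {
        return -1;
    }

    out->ttl = ip[8];
    memcpy(out->ipv4_src, ip + 12, 4);
    memcpy(out->ipv4_dst, ip + 16, 4);
    out->udp_dst = load_be16(udp + 2);
    /* The VNI is the upper 24 bits of the second 32-bit VXLAN word. */
    out->vni = ((uint32_t) vxlan[4] << 16) | ((uint32_t) vxlan[5] << 8)
               | vxlan[6];
    return 0;
}

/* Builds a header like the one in the example flow, with a chosen VNI. */
static void
build_sample_header(uint8_t *hdr, uint32_t vni)
{
    static const uint8_t src[4] = { 172, 31, 1, 100 };
    static const uint8_t dst[4] = { 172, 31, 1, 1 };
    uint8_t *ip = hdr + ETH_LEN;
    uint8_t *udp = ip + IPV4_LEN;
    uint8_t *vxlan = udp + UDP_LEN;

    memset(hdr, 0, ENCAP_LEN);
    hdr[12] = 0x08;                 /* dl_type=0x0800 */
    hdr[13] = 0x00;
    ip[0] = 0x45;                   /* IPv4, 20-byte header. */
    ip[8] = 64;                     /* ttl=64 */
    ip[9] = 17;                     /* proto=17 */
    memcpy(ip + 12, src, 4);
    memcpy(ip + 16, dst, 4);
    udp[2] = 4789 >> 8;             /* dst=4789 */
    udp[3] = 4789 & 0xff;
    vxlan[0] = 0x08;                /* flags=0x8000000 */
    vxlan[4] = (uint8_t) (vni >> 16);
    vxlan[5] = (uint8_t) (vni >> 8);
    vxlan[6] = (uint8_t) vni;
}
```

A caller would fill a 50-byte buffer with build_sample_header() and pass it to
parse_vxlan_encap().  Reading the VNI byte by byte keeps the sketch independent
of host endianness.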

Example of tc format:
$ tc -s filter show dev ovs-p1 ingress
filter protocol ip pref 3 flower chain 0
filter protocol ip pref 3 flower chain 0 handle 0x1
  dst_mac 56:2a:1f:3c:bb:f2
  src_mac 96:0c:a7:b0:60:a4
  eth_type ipv4
  ip_tos 0/0x3
  ip_flags nofrag
  skip_hw
  not_in_hw
	action order 1: tunnel_key  set
	src_ip 172.31.1.100
	dst_ip 172.31.1.1
	key_id 0
	dst_port 4789
	nocsum
	ttl 64 pipe
	 index 2 ref 1 bind 1 installed 0 sec used 0 sec
	Action statistics:
	Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0
	no_percpu

	action order 2: mirred (Egress Redirect to device ovs-p0) stolen
	index 2 ref 1 bind 1 installed 0 sec used 0 sec
	Action statistics:
	Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0
	cookie b46e99079448ce581d0fe7a9853c0bb5
	no_percpu

Signed-off-by: William Tu <u9012063@gmail.com>
---
 lib/netdev-offload-tc.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

Comments

Simon Horman June 26, 2023, 2:53 p.m. UTC | #1
Hi William, all,

I'm a little unclear on the history of this patchset [1].
But it seems to me that while patches 1/3 and 2/3 were applied as:

* 48c1ab5d74ec netdev: Allow storing dpif type into netdev structure.
* 8842fdf1b318 netdev-offload: Use dpif type instead of class.

this patch was not. As we are now approaching its third birthday,
I'm going to declare it stale and mark it as Changes Requested
in patchwork.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2020-July/372699.html
William Tu July 6, 2023, 6:46 p.m. UTC | #2
Hi Simon,
This patch is obsolete, and VXLAN offload is already supported, so feel free
to declare it stale.
Thanks,
William

Patch

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 2c9c6f4cae8b..a1deeb2b3040 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1114,6 +1114,118 @@  parse_put_flow_ct_action(struct tc_flower *flower,
 }
 
 static int
+parse_put_tnl_header(struct tc_flower *flower OVS_UNUSED,
+                     struct tc_action *action,
+                     const struct ovs_action_push_tnl *data)
+{
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+    const struct eth_header *eth;
+    const struct udp_header *udp;
+    const void *l3;
+    const void *l4;
+    struct ds ds;
+
+    ds_init(&ds);
+    eth = (const struct eth_header *)data->header;
+    l3 = eth + 1;
+
+    if (eth->eth_type == htons(ETH_TYPE_IP)) {
+        const struct ip_header *ip = l3;
+        action->encap.ipv4.ipv4_src = get_16aligned_be32(&ip->ip_src);
+        action->encap.ipv4.ipv4_dst = get_16aligned_be32(&ip->ip_dst);
+        action->encap.ttl = ip->ip_ttl;
+        l4 = (ip + 1);
+    } else {
+        const struct ovs_16aligned_ip6_hdr *ip6 = l3;
+        memcpy(&action->encap.ipv6.ipv6_src, &ip6->ip6_src,
+               sizeof ip6->ip6_src);
+        memcpy(&action->encap.ipv6.ipv6_dst, &ip6->ip6_dst,
+               sizeof ip6->ip6_dst);
+        l4 = (ip6 + 1);
+    }
+
+    udp = (const struct udp_header *) l4;
+
+    if (data->tnl_type == OVS_VPORT_TYPE_VXLAN) {
+        const struct vxlanhdr *vxh;
+
+        vxh = (const struct vxlanhdr *)(udp + 1);
+        action->encap.tp_src = udp->udp_src;
+        action->encap.tp_dst = udp->udp_dst;
+        action->encap.id_present = true;
+        action->encap.no_csum = true;
+        action->encap.id = htonll(ntohl(get_16aligned_be32(&vxh->vx_vni)) >> 8);
+
+        ds_put_format(&ds, "vxlan(flags=0x%"PRIx32",vni=0x%"PRIx32")",
+                      ntohl(get_16aligned_be32(&vxh->vx_flags)),
+                      ntohl(get_16aligned_be32(&vxh->vx_vni)) >> 8);
+        VLOG_DBG_RL(&rl, "%s", ds_cstr(&ds));
+    } else {
+        VLOG_DBG_RL(&rl, "unsupported tunnel type: %d", data->tnl_type);
+        return EOPNOTSUPP;
+    }
+
+    ds_destroy(&ds);
+    return 0;
+}
+
+static int
+parse_put_flow_clone_action(struct tc_flower *flower,
+                            const struct netdev *netdev,
+                            const struct nlattr *clone,
+                            size_t clone_len)
+{
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+    struct tc_action *action;
+    const struct nlattr *ca;
+    size_t ca_left;
+    int err;
+
+    NL_ATTR_FOR_EACH_UNSAFE (ca, ca_left, clone, clone_len) {
+        action = &flower->actions[flower->action_count];
+        switch (nl_attr_type(ca)) {
+        case OVS_ACTION_ATTR_TUNNEL_PUSH: {
+            const struct ovs_action_push_tnl *tnl_push = nl_attr_get(ca);
+
+            err = parse_put_tnl_header(flower, action, tnl_push);
+            if (err) {
+                return err;
+            }
+            action->type = TC_ACT_ENCAP;
+            flower->action_count++;
+        }
+        break;
+        case OVS_ACTION_ATTR_OUTPUT: {
+            struct netdev *outdev;
+            const char *outdev_type;
+
+            odp_port_t port = nl_attr_get_odp_port(ca);
+            outdev = netdev_ports_get(port, netdev_get_dpif_type(netdev));
+            if (!outdev) {
+                VLOG_DBG_RL(&rl, "Can't find netdev for output port "
+                                 "%d inside clone().", port);
+                return ENODEV;
+            }
+            outdev_type = netdev_get_type(outdev);
+            action->out.ifindex_out = netdev_get_ifindex(outdev);
+            action->out.ingress = is_internal_port(outdev_type);
+            netdev_close(outdev);
+
+            action->type = TC_ACT_OUTPUT;
+            flower->action_count++;
+        }
+        break;
+        default:
+            VLOG_WARN_RL(&rl, "unsupported action %d inside clone()",
+                         nl_attr_type(ca));
+            return EOPNOTSUPP;
+        break;
+        }
+    }
+    return 0;
+}
+
+static int
 parse_put_flow_set_masked_action(struct tc_flower *flower,
                                  struct tc_action *action,
                                  const struct nlattr *set,
@@ -1789,6 +1901,15 @@  netdev_tc_flow_put(struct netdev *netdev, struct match *match,
             action->chain = nl_attr_get_u32(nla);
             flower.action_count++;
             recirc_act = true;
+        } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_CLONE) {
+            const struct nlattr *clone = nl_attr_get(nla);
+            const size_t clone_len = nl_attr_get_size(nla);
+
+            err = parse_put_flow_clone_action(&flower, netdev, clone,
+                                              clone_len);
+            if (err) {
+                return err;
+            }
         } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_DROP) {
             action->type = TC_ACT_GOTO;
             action->chain = 0;  /* 0 is reserved and not used by recirc. */