
[net] openvswitch: compute needed headroom for internal vports

Message ID 0d85bf752132423483b8c5708bb9188805a99ff9.1452289759.git.pabeni@redhat.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Paolo Abeni Jan. 8, 2016, 9:50 p.m. UTC
Currently the ovs internal vports always use a default needed_headroom.
This leads to an skb head copy when transmitting on an ovs switch via a
vport that adds some kind of encapsulation (gre, geneve, etc.).

This patch adds book-keeping for the maximum needed_headroom used by
the non-internal vports in any dp, updating it on vport creation and
deletion.

Said value is then used as the needed_headroom for internal vports,
avoiding the above copy.

With ~1000 byte long frames, this gives about a 6% xmit performance
improvement in the case of vxlan tunnels and about 8% when using geneve
tunnels.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
---
 net/openvswitch/datapath.c           | 39 ++++++++++++++++++++++++++++++++++++
 net/openvswitch/datapath.h           |  4 ++++
 net/openvswitch/vport-internal_dev.c |  1 +
 3 files changed, 44 insertions(+)

Comments

Pravin Shelar Jan. 8, 2016, 10:53 p.m. UTC | #1
On Fri, Jan 8, 2016 at 1:50 PM, Paolo Abeni <pabeni@redhat.com> wrote:
Why is this done only for internal devices? In the most common traffic
case, the packet enters OVS from a tap or other netdev-type vport
device.

> With ~1000 bytes long frames, this give about a 6% xmit performance
> improvement in case of vxlan tunnels and about 8% when using geneve
> tunnels.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> Acked-by: Flavio Leitner <fbl@sysclose.org>
> --
Jesse Gross Jan. 9, 2016, 12:42 a.m. UTC | #2
On Fri, Jan 8, 2016 at 2:53 PM, pravin shelar <pshelar@ovn.org> wrote:
> Why is this done only for internal devices? In most common case of
> traffic the packet enters OVS from tap or other netdev type vport
> device.

How would you influence the allocation for non-internal devices?
Pravin Shelar Jan. 9, 2016, 2:44 a.m. UTC | #3
On Fri, Jan 8, 2016 at 4:42 PM, Jesse Gross <jesse@kernel.org> wrote:
> On Fri, Jan 8, 2016 at 2:53 PM, pravin shelar <pshelar@ovn.org> wrote:
>> Why is this done only for internal devices? In most common case of
>> traffic the packet enters OVS from tap or other netdev type vport
>> device.
>
> How would you influence the allocation for non-internal devices?

Today there is no way of influencing this. But we could add a new
skb-headroom parameter to the netdev for packets that are received on
the device. This new parameter could be controlled from master devices
like OVS, bridge, etc. To set this value we need a new ndo operation,
so that it can work on devices like tap, where it would just set this
new value; in the case of an ovs-internal or veth device, it can also
update needed_headroom.
Paolo Abeni Jan. 11, 2016, 8:43 a.m. UTC | #4
On Fri, 2016-01-08 at 18:44 -0800, pravin shelar wrote:
> Today there is no way of influencing this. But we could add new
> skb-headroom parameter to netdev for packets that are received on the
> device. This new parameter could be controlled from master devices
> like OVS, Bridge, etc. To set this value we need new ndo operation. So
> that it can work on devices like tap where it would just set this new
> value and in case of ovs-internal or veth device, it can also update
> needed_headroom.

My idea was to continue working along these lines.

However, I thought to get there incrementally, i.e. handling internal
vports only first. Would that be OK with you?

Cheers,

Paolo
Pravin Shelar Jan. 12, 2016, 12:34 a.m. UTC | #5
On Mon, Jan 11, 2016 at 12:43 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> My idea was to continue working along this lines.
>
> However I thought to get there incrementally, i.e. handle internal
> vports only first. Can this be ok for you?
>

If the final implementation is going to change a lot, then I do not see
much value in this change going in first.
Jesse Gross Jan. 12, 2016, 7:20 p.m. UTC | #6
On Mon, Jan 11, 2016 at 4:34 PM, pravin shelar <pshelar@ovn.org> wrote:
> If the final implementation is going to change alot, then I do not see
> much value in this change going in first.

Even if the code will change in the future, it seems like an
incremental improvement that will help in some cases, so I don't see
much reason not to do this part now.
Pravin Shelar Jan. 12, 2016, 8:44 p.m. UTC | #7
On Tue, Jan 12, 2016 at 11:20 AM, Jesse Gross <jesse@kernel.org> wrote:
> Even if the code will change in the future, it seems like an
> incremental improvement that will help in some cases so I don't see
> much reason to not do this part now.

I am not sure which cases it helps. Can you tell me the use cases for
internal ports in production?
Jesse Gross Jan. 13, 2016, 1:08 a.m. UTC | #8
On Tue, Jan 12, 2016 at 12:44 PM, pravin shelar <pshelar@ovn.org> wrote:
> I am not sure which cases it help. Can you tell me use cases for
> internal port in production?

Any traffic coming from the hypervisor itself (as well as tunnels,
although unless you are doing double encapsulation this patch doesn't
matter in that case).

Since we're in the merge window now, maybe it makes sense to just go
for the full version in the next cycle in any case.
Pravin Shelar Jan. 13, 2016, 5:27 a.m. UTC | #9
On Tue, Jan 12, 2016 at 5:08 PM, Jesse Gross <jesse@kernel.org> wrote:
> On Tue, Jan 12, 2016 at 12:44 PM, pravin shelar <pshelar@ovn.org> wrote:
>> I am not sure which cases it help. Can you tell me use cases for
>> internal port in production?
>
> Any traffic coming from the hypervisor itself (as well as tunnels
> although unless you are doing double encapsulation then this patch
> doesn't matter in that case).
>

OK, but the majority of traffic comes from VMs, so IMO this patch alone
would not make much difference.

> Since we're in the merge window now, maybe it makes sense to just go
> for the full version in the next cycle in any case.

ok.
Paolo Abeni Jan. 13, 2016, 5:30 p.m. UTC | #10
On Tue, 2016-01-12 at 21:27 -0800, pravin shelar wrote:
> On Tue, Jan 12, 2016 at 5:08 PM, Jesse Gross <jesse@kernel.org> wrote:
> > Any traffic coming from the hypervisor itself (as well as tunnels
> > although unless you are doing double encapsulation then this patch
> > doesn't matter in that case).
> >
> 
> ok, But majority of traffic comes from VM. So IMO only this patch
> would not make much difference.
> 
> > Since we're in the merge window now, maybe it makes sense to just go
> > for the full version in the next cycle in any case.
> 
> ok.

Thank you for the feedback.

I'll try to implement a more general solution and get some performance
numbers for the next cycle,

Paolo

Patch

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 91a8b00..c2c48b5 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1915,6 +1915,28 @@  static struct vport *lookup_vport(struct net *net,
 		return ERR_PTR(-EINVAL);
 }
 
+/* Called with ovs_mutex */
+static void update_headroom(struct datapath *dp)
+{
+	int i;
+	struct vport *vport;
+	unsigned max_headroom = 0;
+
+	for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
+		hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node)
+			if (vport->ops->type != OVS_VPORT_TYPE_INTERNAL &&
+			    vport->dev->needed_headroom > max_headroom)
+				max_headroom = vport->dev->needed_headroom;
+	}
+
+	dp->max_headroom = max_headroom;
+	for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
+		hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node)
+			if (vport->ops->type == OVS_VPORT_TYPE_INTERNAL)
+				vport->dev->needed_headroom = max_headroom;
+	}
+}
+
 static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info)
 {
 	struct nlattr **a = info->attrs;
@@ -1980,6 +2002,10 @@  restart:
 
 	err = ovs_vport_cmd_fill_info(vport, reply, info->snd_portid,
 				      info->snd_seq, 0, OVS_VPORT_CMD_NEW);
+
+	if (vport->ops->type != OVS_VPORT_TYPE_INTERNAL &&
+	    vport->dev->needed_headroom > dp->max_headroom)
+		update_headroom(dp);
 	BUG_ON(err < 0);
 	ovs_unlock();
 
@@ -2050,6 +2076,8 @@  static int ovs_vport_cmd_del(struct sk_buff *skb, struct genl_info *info)
 	struct sk_buff *reply;
 	struct vport *vport;
 	int err;
+	struct datapath *dp;
+	bool must_update_headroom = false;
 
 	reply = ovs_vport_cmd_alloc_info();
 	if (!reply)
@@ -2069,7 +2097,18 @@  static int ovs_vport_cmd_del(struct sk_buff *skb, struct genl_info *info)
 	err = ovs_vport_cmd_fill_info(vport, reply, info->snd_portid,
 				      info->snd_seq, 0, OVS_VPORT_CMD_DEL);
 	BUG_ON(err < 0);
+
+	/* check if the deletion of this port may change the dp max_headroom
+	 * before deleting the vport
+	 */
+	dp = vport->dp;
+	if (vport->ops->type != OVS_VPORT_TYPE_INTERNAL &&
+	    vport->dev->needed_headroom == dp->max_headroom)
+		must_update_headroom = true;
 	ovs_dp_detach_port(vport);
+
+	if (must_update_headroom)
+		update_headroom(dp);
 	ovs_unlock();
 
 	ovs_notify(&dp_vport_genl_family, reply, info);
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index 67bdecd..427e39a 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -68,6 +68,8 @@  struct dp_stats_percpu {
  * ovs_mutex and RCU.
  * @stats_percpu: Per-CPU datapath statistics.
  * @net: Reference to net namespace.
+ * @max_headroom: the maximum headroom of all vports in this datapath; it will
+ * be used by all the internal vports in this dp.
  *
  * Context: See the comment on locking at the top of datapath.c for additional
  * locking information.
@@ -89,6 +91,8 @@  struct datapath {
 	possible_net_t net;
 
 	u32 user_features;
+
+	u32 max_headroom;
 };
 
 /**
diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c
index ec76398..3d0a55a 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -199,6 +199,7 @@  static struct vport *internal_dev_create(const struct vport_parms *parms)
 		err = -ENOMEM;
 		goto error_free_netdev;
 	}
+	vport->dev->needed_headroom = vport->dp->max_headroom;
 
 	dev_net_set(vport->dev, ovs_dp_get_net(vport->dp));
 	internal_dev = internal_dev_priv(vport->dev);