diff mbox

[ovs-dev,V2] netdev-dpdk: fix ifindex assignment for DPDK ports

Message ID 20161208121622.36200-1-przemyslawx.lal@intel.com
State Changes Requested
Delegated to: Daniele Di Proietto
Headers show

Commit Message

Przemyslaw Lal Dec. 8, 2016, 12:16 p.m. UTC
In current implementation port_id is used as an ifindex for all netdev-dpdk
interfaces.

For physical DPDK interfaces using port_id as ifindex causes that '0' is set as
ifindex for 'dpdk0' interface, '1' for 'dpdk1' and so on. For the DPDK vHost
interfaces ifindexes are not even assigned (0 is used by default) due to the
fact that vHost ports don't use port_id field from the DPDK library.

This causes multiple negative side-effects. First of all 0 is an invalid
ifindex value. The other issue is possible overlapping of 'dpdkX' interfaces
ifindex values with the infindexes of kernel space interfaces which may cause
problems in any external tools that use those values. Neither 'dpdk0', nor any
DPDK vHost interfaces are visible in sFlow collector tools, as all interfaces
with ifindexes smaller than 1 are ignored.

Proposed solution to these issues is to calculate a hash of interface's name
and use calculated value as an ifindex. This way interfaces keep their
ifindexes during OVS-DPDK restarts, ports re-initialization events, etc., show
up in sFlow collectors and meet RFC2863 specification regarding re-using
ifindex values by the same virtual interfaces.

Signed-off-by: Przemyslaw Lal <przemyslawx.lal@intel.com>
---
 lib/netdev-dpdk.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Stokes, Ian Jan. 15, 2017, 6:34 p.m. UTC | #1
> In current implementation port_id is used as an ifindex for all netdev-
> dpdk interfaces.
> 
> For physical DPDK interfaces using port_id as ifindex causes that '0' is
> set as ifindex for 'dpdk0' interface, '1' for 'dpdk1' and so on. For the
> DPDK vHost interfaces ifindexes are not even assigned (0 is used by
> default) due to the fact that vHost ports don't use port_id field from the
> DPDK library.
> 
> This causes multiple negative side-effects. First of all 0 is an invalid
> ifindex value. The other issue is possible overlapping of 'dpdkX'
> interfaces ifindex values with the infindexes of kernel space interfaces
> which may cause problems in any external tools that use those values.
> Neither 'dpdk0', nor any DPDK vHost interfaces are visible in sFlow
> collector tools, as all interfaces with ifindexes smaller than 1 are
> ignored.
> 
> Proposed solution to these issues is to calculate a hash of interface's
> name and use calculated value as an ifindex. This way interfaces keep
> their ifindexes during OVS-DPDK restarts, ports re-initialization events,
> etc., show up in sFlow collectors and meet RFC2863 specification regarding
> re-using ifindex values by the same virtual interfaces.
> 
> Signed-off-by: Przemyslaw Lal <przemyslawx.lal@intel.com>
> ---

Thanks for the patch.

I'm not the most experienced with sflow but after looking into it its clear the need for a change to ifindex is required for DPDK devices.

>  lib/netdev-dpdk.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index de78ddd..ef99eb3
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2075,7 +2075,13 @@ netdev_dpdk_get_ifindex(const struct netdev
> *netdev)
>      int ifindex;
> 
>      ovs_mutex_lock(&dev->mutex);
> -    ifindex = dev->port_id;
Using a hash for the ifindex of dpdk devices will essentially make them non sequential.
More of a question to the community, but is it ok if ifindexes are non-sequential? Wasn't sure of this myself.

> +    /* Calculate hash from the netdev name using hash_bytes() function.
> +     * Because ifindex is declared as signed int in the kernel sources
> and
> +     * OVS follows this implementation right shift is needed to set sign
> bit
> +     * to 0 and then XOR to slightly improve collision rate.
> +     */
> +    uint32_t h = hash_bytes(netdev->name, strlen(netdev->name), 0);
> +    ifindex = (int)((h >> 1) ^ (h & 0x0FFFFFFF));

In the (unlikely) event of a hash collision, what would be the outcome?
From what I can see the main use is for dpif_sflow_add_port().
I'm just wondering is there a need to ensure that no 2 DPDK ports have the same ifindex?
From What I can see dpif_sflow_add_port() would still execute for the port.

>      ovs_mutex_unlock(&dev->mutex);
> 
>      return ifindex;
> --
> 1.9.1
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Przemyslaw Lal Jan. 19, 2017, 10:46 a.m. UTC | #2
>> In current implementation port_id is used as an ifindex for all netdev-
>> dpdk interfaces.
>>
>> For physical DPDK interfaces using port_id as ifindex causes that '0' is
>> set as ifindex for 'dpdk0' interface, '1' for 'dpdk1' and so on. For the
>> DPDK vHost interfaces ifindexes are not even assigned (0 is used by
>> default) due to the fact that vHost ports don't use port_id field from the
>> DPDK library.
>>
>> This causes multiple negative side-effects. First of all 0 is an invalid
>> ifindex value. The other issue is possible overlapping of 'dpdkX'
>> interfaces ifindex values with the infindexes of kernel space interfaces
>> which may cause problems in any external tools that use those values.
>> Neither 'dpdk0', nor any DPDK vHost interfaces are visible in sFlow
>> collector tools, as all interfaces with ifindexes smaller than 1 are
>> ignored.
>>
>> Proposed solution to these issues is to calculate a hash of interface's
>> name and use calculated value as an ifindex. This way interfaces keep
>> their ifindexes during OVS-DPDK restarts, ports re-initialization events,
>> etc., show up in sFlow collectors and meet RFC2863 specification regarding
>> re-using ifindex values by the same virtual interfaces.
>>
>> Signed-off-by: Przemyslaw Lal <przemyslawx.lal@intel.com>
>> ---
> Thanks for the patch.
>
> I'm not the most experienced with sflow but after looking into it its clear the need for a change to ifindex is required for DPDK devices.
>
>>   lib/netdev-dpdk.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index de78ddd..ef99eb3
>> 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -2075,7 +2075,13 @@ netdev_dpdk_get_ifindex(const struct netdev
>> *netdev)
>>       int ifindex;
>>
>>       ovs_mutex_lock(&dev->mutex);
>> -    ifindex = dev->port_id;
> Using a hash for the ifindex of dpdk devices will essentially make them non sequential.
> More of a question to the community, but is it ok if ifindexes are non-sequential? Wasn't sure of this myself.
There's no requirement for ifIndexes to be sequential, even more - 
calculating hash using devices' names ensures that the same devices will 
have the same ifIndex after event of port re-initialization and OVS 
restart. Such behavior is also recommended by IETF RFC 2863 standard. 
Using hash was also suggested by the community during patch V1 review.
>
>> +    /* Calculate hash from the netdev name using hash_bytes() function.
>> +     * Because ifindex is declared as signed int in the kernel sources
>> and
>> +     * OVS follows this implementation right shift is needed to set sign
>> bit
>> +     * to 0 and then XOR to slightly improve collision rate.
>> +     */
>> +    uint32_t h = hash_bytes(netdev->name, strlen(netdev->name), 0);
>> +    ifindex = (int)((h >> 1) ^ (h & 0x0FFFFFFF));
> In the (unlikely) event of a hash collision, what would be the outcome?
>  From what I can see the main use is for dpif_sflow_add_port().
> I'm just wondering is there a need to ensure that no 2 DPDK ports have the same ifindex?
>  From What I can see dpif_sflow_add_port() would still execute for the port.
Only one of them will be accessible in the sFlow. However, as you 
mentioned, this is very unlikely - during my tests I've generated random 
100 000 vHost-user port names using"vhuXXXXXXXX-XX" naming scheme where 
Xs are the first characters of UUID - I used Openstack port naming 
convention as an inspiration - and maximum number of collisions I 
observed was 5, but in the most of runs it was 0.
Anyway, at the end of the day it shouldn't be a big issue since DPDK 
vHost and DPDK physical interfaces names can be changed by user.
>
>>       ovs_mutex_unlock(&dev->mutex);
>>
>>       return ifindex;
>> --
>> 1.9.1
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ben Pfaff Jan. 31, 2017, 9:52 p.m. UTC | #3
On Thu, Dec 08, 2016 at 01:16:22PM +0100, Przemyslaw Lal wrote:
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index de78ddd..ef99eb3 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2075,7 +2075,13 @@ netdev_dpdk_get_ifindex(const struct netdev *netdev)
>      int ifindex;
>  
>      ovs_mutex_lock(&dev->mutex);
> -    ifindex = dev->port_id;
> +    /* Calculate hash from the netdev name using hash_bytes() function.
> +     * Because ifindex is declared as signed int in the kernel sources and
> +     * OVS follows this implementation right shift is needed to set sign bit
> +     * to 0 and then XOR to slightly improve collision rate.
> +     */
> +    uint32_t h = hash_bytes(netdev->name, strlen(netdev->name), 0);
> +    ifindex = (int)((h >> 1) ^ (h & 0x0FFFFFFF));

To hash a string, please use hash_string().

Daniele, are you planning to review this?
Daniele Di Proietto Feb. 1, 2017, 2:38 a.m. UTC | #4
On 31/01/2017 13:52, "Ben Pfaff" <blp@ovn.org> wrote:

>On Thu, Dec 08, 2016 at 01:16:22PM +0100, Przemyslaw Lal wrote:
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index de78ddd..ef99eb3 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -2075,7 +2075,13 @@ netdev_dpdk_get_ifindex(const struct netdev *netdev)
>>      int ifindex;
>>  
>>      ovs_mutex_lock(&dev->mutex);
>> -    ifindex = dev->port_id;
>> +    /* Calculate hash from the netdev name using hash_bytes() function.
>> +     * Because ifindex is declared as signed int in the kernel sources and
>> +     * OVS follows this implementation right shift is needed to set sign bit
>> +     * to 0 and then XOR to slightly improve collision rate.
>> +     */
>> +    uint32_t h = hash_bytes(netdev->name, strlen(netdev->name), 0);
>> +    ifindex = (int)((h >> 1) ^ (h & 0x0FFFFFFF));
>
>To hash a string, please use hash_string().
>
>Daniele, are you planning to review this?

Sorry for the delay.

At some point, with vhost-pmd we will have port_ids also for vhost interfaces.  Maybe we can revisit this approach when that becomes available.

I wish there was a better way to avoid collisions with the linux kernel, but I can't think of anything generic.

Unless someone else has an objection I'm fine with the approach.  Two minor comments:

Could you please use hash_string(), as suggested by Ben?

I guess the result after hashing and XOR could still be zero.  Could you maybe add a check for that case and set it to something else?

Thanks,

Daniele
diff mbox

Patch

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index de78ddd..ef99eb3 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2075,7 +2075,13 @@  netdev_dpdk_get_ifindex(const struct netdev *netdev)
     int ifindex;
 
     ovs_mutex_lock(&dev->mutex);
-    ifindex = dev->port_id;
+    /* Calculate hash from the netdev name using hash_bytes() function.
+     * Because ifindex is declared as signed int in the kernel sources and
+     * OVS follows this implementation right shift is needed to set sign bit
+     * to 0 and then XOR to slightly improve collision rate.
+     */
+    uint32_t h = hash_bytes(netdev->name, strlen(netdev->name), 0);
+    ifindex = (int)((h >> 1) ^ (h & 0x0FFFFFFF));
     ovs_mutex_unlock(&dev->mutex);
 
     return ifindex;