diff mbox series

[ovs-dev,RFC,v2] netdev-dpdk: Allow specification of index for PCI devices

Message ID 1508237335-21941-1-git-send-email-ciara.loftus@intel.com
State Changes Requested
Headers show
Series [ovs-dev,RFC,v2] netdev-dpdk: Allow specification of index for PCI devices | expand

Commit Message

Ciara Loftus Oct. 17, 2017, 10:48 a.m. UTC
Some NICs have only one PCI address associated with multiple ports. This
patch extends the dpdk-devargs option's format to cater for such
devices. Whereas before only one of N ports associated with one PCI
address could be added, now N can.

This patch allows for the following use of the dpdk-devargs option:

ovs-vsctl set Interface myport options:dpdk-devargs=0000:06:00.0,X

Where X is an unsigned integer representing one of multiple ports
associated with the PCI address provided.

This patch has not been tested.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
v2:
* Simplify function to find port ID of indexed device
* Hotplug compatibility
* Detach compatibility
* Documentation
* Use strtol instead of atoi

 Documentation/howto/dpdk.rst         |  9 +++++
 Documentation/intro/install/dpdk.rst |  9 +++++
 NEWS                                 |  2 ++
 lib/netdev-dpdk.c                    | 67 ++++++++++++++++++++++++++++++------
 vswitchd/vswitch.xml                 | 11 ++++--
 5 files changed, 85 insertions(+), 13 deletions(-)

Comments

Kevin Traynor Oct. 19, 2017, 2:16 p.m. UTC | #1
On 10/17/2017 11:48 AM, Ciara Loftus wrote:
> Some NICs have only one PCI address associated with multiple ports. This
> patch extends the dpdk-devargs option's format to cater for such
> devices. Whereas before only one of N ports associated with one PCI
> address could be added, now N can.
> 
> This patch allows for the following use of the dpdk-devargs option:
> 
> ovs-vsctl set Interface myport options:dpdk-devargs=0000:06:00.0,X
> 
> Where X is an unsigned integer representing one of multiple ports
> associated with the PCI address provided.
> 
> This patch has not been tested.
> 
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
> v2:
> * Simplify function to find port ID of indexed device
> * Hotplug compatibility
> * Detach compatibility
> * Documentation
> * Use strtol instead of atoi
> 
>  Documentation/howto/dpdk.rst         |  9 +++++
>  Documentation/intro/install/dpdk.rst |  9 +++++
>  NEWS                                 |  2 ++
>  lib/netdev-dpdk.c                    | 67 ++++++++++++++++++++++++++++++------
>  vswitchd/vswitch.xml                 | 11 ++++--
>  5 files changed, 85 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
> index d123819..9447b71 100644
> --- a/Documentation/howto/dpdk.rst
> +++ b/Documentation/howto/dpdk.rst
> @@ -48,6 +48,15 @@ number of dpdk devices found in the log file::
>      $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
>          options:dpdk-devargs=0000:01:00.1
>  
> +If your PCI device has multiple ports under the same PCI ID, you can use the
> +following notation to indicate the specific device you wish to add::
> +
> +    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
> +        options:dpdk-devargs=0000:01:00.0,0
> +
> +The above would attempt to use the first device (0) associated with that PCI
> +ID. Use ,1 ,2 etc. to access the next.
> +
>  After the DPDK ports get added to switch, a polling thread continuously polls
>  DPDK devices and consumes 100% of the core, as can be checked from ``top`` and
>  ``ps`` commands::
> diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
> index bb69ae5..d0e87f5 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -271,6 +271,15 @@ ports. For example, to create a userspace bridge named ``br0`` and add two
>  DPDK devices will not be available for use until a valid dpdk-devargs is
>  specified.
>  
> +If your PCI device has multiple ports under the same PCI ID, you can use the
> +following notation to indicate the specific device you wish to add::
> +
> +    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
> +        options:dpdk-devargs=0000:01:00.0,0
> +
> +The above would attempt to use the first device (0) associated with that PCI
> +ID. Use ,1 ,2 etc. to access the next.
> +
>  Refer to ovs-vsctl(8) and :doc:`/howto/dpdk` for more details.
>  
>  Performance Tuning
> diff --git a/NEWS b/NEWS
> index 75f5fa5..ca885a6 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -5,6 +5,8 @@ Post-v2.8.0
>         chassis "hostname" in addition to a chassis "name".
>     - Linux kernel 4.13
>       * Add support for compiling OVS with the latest Linux 4.13 kernel
> +   - DPDK:
> +     * Support for adding devices with multiple DPDK ports under one PCI ID.
>  
>  v2.8.0 - 31 Aug 2017
>     - ovs-ofctl:
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index c60f46f..35e15da 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1214,16 +1214,40 @@ netdev_dpdk_lookup_by_port_id(dpdk_port_t port_id)
>  }
>  
>  static dpdk_port_t
> +dpdk_get_port_by_name_with_index(char *name, int idx, int base_id)
> +{
> +    struct rte_eth_dev_info dev_info;
> +    char pci_addr[PCI_PRI_STR_SIZE];
> +    dpdk_port_t offset_port_id = base_id + idx;
> +
> +    if (rte_eth_dev_is_valid_port(offset_port_id)) {
> +        rte_eth_dev_info_get(offset_port_id, &dev_info);
> +        rte_pci_device_name(&dev_info.pci_dev->addr, pci_addr,
> +                            sizeof(pci_addr));
> +        if (!strcmp(pci_addr, name)) {
> +            return offset_port_id;
> +        }
> +    }
> +
> +    VLOG_INFO("Invalid PCI offset %i for device %s", idx, name);
> +
> +    return DPDK_ETH_PORT_ID_INVALID;
> +}
> +
> +static dpdk_port_t
>  netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
>                              const char *devargs, char **errp)
>  {
> -    /* Get the name up to the first comma. */
> -    char *name = xmemdup0(devargs, strcspn(devargs, ","));
> +    char *devargs_copy = xmemdup0((devargs), strlen(devargs));
> +    char *name, *idx_str;
> +    unsigned idx;
>      dpdk_port_t new_port_id = DPDK_ETH_PORT_ID_INVALID;
>  
> +    name = strtok_r(devargs_copy, ",", &devargs_copy);
> +    idx_str = strtok_r(devargs_copy, ",", &devargs_copy);
> +
>      if (!rte_eth_dev_count()
> -            || rte_eth_dev_get_port_by_name(name, &new_port_id)
> -            || !rte_eth_dev_is_valid_port(new_port_id)) {
> +        || rte_eth_dev_get_port_by_name(name, &new_port_id)) {
>          /* Device not found in DPDK, attempt to attach it */
>          if (!rte_eth_dev_attach(devargs, &new_port_id)) {

I think you should strip out the ",X" for attach. Although looking at
the DPDK code it would probably ignore the ",X" once the pci address is
ok, so maybe it was intentional.

>              /* Attach successful */
> @@ -1232,12 +1256,21 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
>          } else {
>              /* Attach unsuccessful */
>              new_port_id = DPDK_ETH_PORT_ID_INVALID;
> -            VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
> -                          devargs);
> +            goto out;
>          }
>      }
>  
> -    free(name);
> +    if (idx_str) {
> +        idx = strtol(idx_str, NULL, 10);
> +        new_port_id = dpdk_get_port_by_name_with_index(name, idx, new_port_id);
> +    }
> +
> +out:
> +    if (new_port_id == DPDK_ETH_PORT_ID_INVALID) {
> +        VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
> +                      devargs);
> +    }
> +

you need to free devargs_copy here?

>      return new_port_id;
>  }
>  
> @@ -2549,14 +2582,28 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc OVS_UNUSED,
>  {
>      int ret;
>      char *response;
> -    uint8_t port_id;
>      char devname[RTE_ETH_NAME_MAX_LEN];
>      struct netdev_dpdk *dev;
> +    char *devname_copy = xmemdup0((argv[1]), strlen(argv[1]));
> +    char *name, *idx_str;
> +    int idx;
> +    dpdk_port_t port_id;
>  
>      ovs_mutex_lock(&dpdk_mutex);
>  
> -    if (!rte_eth_dev_count() || rte_eth_dev_get_port_by_name(argv[1],
> -                                                             &port_id)) {
> +    name = strtok_r(devname_copy, ",", &devname_copy);
> +    idx_str = strtok_r(devname_copy, ",", &devname_copy);
> +
> +    if (rte_eth_dev_count()) {
> +        if (rte_eth_dev_get_port_by_name(argv[1], &port_id)) {
> +            port_id = DPDK_ETH_PORT_ID_INVALID;
> +        } else if (idx_str) {
> +            idx = strtol(idx_str, NULL, 10);
> +            port_id = dpdk_get_port_by_name_with_index(name, idx, port_id);
> +        }
> +    }
> +
> +    if (port_id == DPDK_ETH_PORT_ID_INVALID) {
>          response = xasprintf("Device '%s' not found in DPDK", argv[1]);
>          goto error;
>      }
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 04c771f..7f503fb 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -2608,12 +2608,17 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>          <p>
>            Specifies the PCI address associated with the port for physical
>            devices, or the virtual driver to be used for the port when a virtual
> -          PMD is intended to be used. For the latter, the argument string
> -          typically takes the form of
> +          PMD is intended to be used.
> +          For physical devices, you may specify a single PCI ID like
> +          <code>0000:06:00.0</code> or an ID followed by a comma and an index
> +          like <code>0000:06:00.0,1</code> for cases where multiple ports are
> +          associated with a single PCI ID.
> +          For virtual devices, the argument string typically takes the form of
>            <code>eth_<var>driver_name</var><var>x</var></code>, where
>            <var>driver_name</var> is a valid virtual DPDK PMD driver name and
>            <var>x</var> is a unique identifier of your choice for the given
> -          port.  Only supported by the dpdk port type.
> +          port.
> +          Only supported by the dpdk port type.

Spurious

>          </p>
>        </column>
>  
>
Ciara Loftus Oct. 19, 2017, 6:27 p.m. UTC | #2
> 
> On 10/17/2017 11:48 AM, Ciara Loftus wrote:
> > Some NICs have only one PCI address associated with multiple ports. This
> > patch extends the dpdk-devargs option's format to cater for such
> > devices. Whereas before only one of N ports associated with one PCI
> > address could be added, now N can.
> >
> > This patch allows for the following use of the dpdk-devargs option:
> >
> > ovs-vsctl set Interface myport options:dpdk-devargs=0000:06:00.0,X
> >
> > Where X is an unsigned integer representing one of multiple ports
> > associated with the PCI address provided.
> >
> > This patch has not been tested.
> >
> > Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> > ---
> > v2:
> > * Simplify function to find port ID of indexed device
> > * Hotplug compatibility
> > * Detach compatibility
> > * Documentation
> > * Use strtol instead of atoi
> >
> >  Documentation/howto/dpdk.rst         |  9 +++++
> >  Documentation/intro/install/dpdk.rst |  9 +++++
> >  NEWS                                 |  2 ++
> >  lib/netdev-dpdk.c                    | 67 ++++++++++++++++++++++++++++++---
> ---
> >  vswitchd/vswitch.xml                 | 11 ++++--
> >  5 files changed, 85 insertions(+), 13 deletions(-)
> >
> > diff --git a/Documentation/howto/dpdk.rst
> b/Documentation/howto/dpdk.rst
> > index d123819..9447b71 100644
> > --- a/Documentation/howto/dpdk.rst
> > +++ b/Documentation/howto/dpdk.rst
> > @@ -48,6 +48,15 @@ number of dpdk devices found in the log file::
> >      $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
> >          options:dpdk-devargs=0000:01:00.1
> >
> > +If your PCI device has multiple ports under the same PCI ID, you can use
> the
> > +following notation to indicate the specific device you wish to add::
> > +
> > +    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
> > +        options:dpdk-devargs=0000:01:00.0,0
> > +
> > +The above would attempt to use the first device (0) associated with that
> PCI
> > +ID. Use ,1 ,2 etc. to access the next.
> > +
> >  After the DPDK ports get added to switch, a polling thread continuously
> polls
> >  DPDK devices and consumes 100% of the core, as can be checked from
> ``top`` and
> >  ``ps`` commands::
> > diff --git a/Documentation/intro/install/dpdk.rst
> b/Documentation/intro/install/dpdk.rst
> > index bb69ae5..d0e87f5 100644
> > --- a/Documentation/intro/install/dpdk.rst
> > +++ b/Documentation/intro/install/dpdk.rst
> > @@ -271,6 +271,15 @@ ports. For example, to create a userspace bridge
> named ``br0`` and add two
> >  DPDK devices will not be available for use until a valid dpdk-devargs is
> >  specified.
> >
> > +If your PCI device has multiple ports under the same PCI ID, you can use
> the
> > +following notation to indicate the specific device you wish to add::
> > +
> > +    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
> > +        options:dpdk-devargs=0000:01:00.0,0
> > +
> > +The above would attempt to use the first device (0) associated with that
> PCI
> > +ID. Use ,1 ,2 etc. to access the next.
> > +
> >  Refer to ovs-vsctl(8) and :doc:`/howto/dpdk` for more details.
> >
> >  Performance Tuning
> > diff --git a/NEWS b/NEWS
> > index 75f5fa5..ca885a6 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -5,6 +5,8 @@ Post-v2.8.0
> >         chassis "hostname" in addition to a chassis "name".
> >     - Linux kernel 4.13
> >       * Add support for compiling OVS with the latest Linux 4.13 kernel
> > +   - DPDK:
> > +     * Support for adding devices with multiple DPDK ports under one PCI
> ID.
> >
> >  v2.8.0 - 31 Aug 2017
> >     - ovs-ofctl:
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index c60f46f..35e15da 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -1214,16 +1214,40 @@
> netdev_dpdk_lookup_by_port_id(dpdk_port_t port_id)
> >  }
> >
> >  static dpdk_port_t
> > +dpdk_get_port_by_name_with_index(char *name, int idx, int base_id)
> > +{
> > +    struct rte_eth_dev_info dev_info;
> > +    char pci_addr[PCI_PRI_STR_SIZE];
> > +    dpdk_port_t offset_port_id = base_id + idx;
> > +
> > +    if (rte_eth_dev_is_valid_port(offset_port_id)) {
> > +        rte_eth_dev_info_get(offset_port_id, &dev_info);
> > +        rte_pci_device_name(&dev_info.pci_dev->addr, pci_addr,
> > +                            sizeof(pci_addr));
> > +        if (!strcmp(pci_addr, name)) {
> > +            return offset_port_id;
> > +        }
> > +    }
> > +
> > +    VLOG_INFO("Invalid PCI offset %i for device %s", idx, name);
> > +
> > +    return DPDK_ETH_PORT_ID_INVALID;
> > +}
> > +
> > +static dpdk_port_t
> >  netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
> >                              const char *devargs, char **errp)
> >  {
> > -    /* Get the name up to the first comma. */
> > -    char *name = xmemdup0(devargs, strcspn(devargs, ","));
> > +    char *devargs_copy = xmemdup0((devargs), strlen(devargs));
> > +    char *name, *idx_str;
> > +    unsigned idx;
> >      dpdk_port_t new_port_id = DPDK_ETH_PORT_ID_INVALID;
> >
> > +    name = strtok_r(devargs_copy, ",", &devargs_copy);
> > +    idx_str = strtok_r(devargs_copy, ",", &devargs_copy);
> > +
> >      if (!rte_eth_dev_count()
> > -            || rte_eth_dev_get_port_by_name(name, &new_port_id)
> > -            || !rte_eth_dev_is_valid_port(new_port_id)) {
> > +        || rte_eth_dev_get_port_by_name(name, &new_port_id)) {
> >          /* Device not found in DPDK, attempt to attach it */
> >          if (!rte_eth_dev_attach(devargs, &new_port_id)) {
> 
> I think you should strip out the ",X" for attach. Although looking at
> the DPDK code it would probably ignore the ",X" once the pci address is
> ok, so maybe it was intentional.

I left this in intentionally. Vdev devargs often include commas and the information following them is important. Eg.
net_pcap0,tx_pcap=/tmp/file_tx.pcap

But actually this will cause a failure later when we assume those arguments are an integer index and try to process them. That needs to be updated in v3.

> 
> >              /* Attach successful */
> > @@ -1232,12 +1256,21 @@ netdev_dpdk_process_devargs(struct
> netdev_dpdk *dev,
> >          } else {
> >              /* Attach unsuccessful */
> >              new_port_id = DPDK_ETH_PORT_ID_INVALID;
> > -            VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
> > -                          devargs);
> > +            goto out;
> >          }
> >      }
> >
> > -    free(name);
> > +    if (idx_str) {
> > +        idx = strtol(idx_str, NULL, 10);
> > +        new_port_id = dpdk_get_port_by_name_with_index(name, idx,
> new_port_id);
> > +    }
> > +
> > +out:
> > +    if (new_port_id == DPDK_ETH_PORT_ID_INVALID) {
> > +        VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
> > +                      devargs);
> > +    }
> > +
> 
> you need to free devargs_copy here?

Yes. Will put this in v3.

> 
> >      return new_port_id;
> >  }
> >
> > @@ -2549,14 +2582,28 @@ netdev_dpdk_detach(struct unixctl_conn
> *conn, int argc OVS_UNUSED,
> >  {
> >      int ret;
> >      char *response;
> > -    uint8_t port_id;
> >      char devname[RTE_ETH_NAME_MAX_LEN];
> >      struct netdev_dpdk *dev;
> > +    char *devname_copy = xmemdup0((argv[1]), strlen(argv[1]));
> > +    char *name, *idx_str;
> > +    int idx;
> > +    dpdk_port_t port_id;
> >
> >      ovs_mutex_lock(&dpdk_mutex);
> >
> > -    if (!rte_eth_dev_count() || rte_eth_dev_get_port_by_name(argv[1],
> > -                                                             &port_id)) {
> > +    name = strtok_r(devname_copy, ",", &devname_copy);
> > +    idx_str = strtok_r(devname_copy, ",", &devname_copy);
> > +
> > +    if (rte_eth_dev_count()) {
> > +        if (rte_eth_dev_get_port_by_name(argv[1], &port_id)) {
> > +            port_id = DPDK_ETH_PORT_ID_INVALID;
> > +        } else if (idx_str) {
> > +            idx = strtol(idx_str, NULL, 10);
> > +            port_id = dpdk_get_port_by_name_with_index(name, idx,
> port_id);
> > +        }
> > +    }
> > +
> > +    if (port_id == DPDK_ETH_PORT_ID_INVALID) {
> >          response = xasprintf("Device '%s' not found in DPDK", argv[1]);
> >          goto error;
> >      }
> > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> > index 04c771f..7f503fb 100644
> > --- a/vswitchd/vswitch.xml
> > +++ b/vswitchd/vswitch.xml
> > @@ -2608,12 +2608,17 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
> type=patch options:peer=p1 \
> >          <p>
> >            Specifies the PCI address associated with the port for physical
> >            devices, or the virtual driver to be used for the port when a virtual
> > -          PMD is intended to be used. For the latter, the argument string
> > -          typically takes the form of
> > +          PMD is intended to be used.
> > +          For physical devices, you may specify a single PCI ID like
> > +          <code>0000:06:00.0</code> or an ID followed by a comma and an
> index
> > +          like <code>0000:06:00.0,1</code> for cases where multiple ports are
> > +          associated with a single PCI ID.
> > +          For virtual devices, the argument string typically takes the form of
> >            <code>eth_<var>driver_name</var><var>x</var></code>, where
> >            <var>driver_name</var> is a valid virtual DPDK PMD driver name and
> >            <var>x</var> is a unique identifier of your choice for the given
> > -          port.  Only supported by the dpdk port type.
> > +          port.
> > +          Only supported by the dpdk port type.
> 
> Spurious

Will remove in v3.

Thanks,
Ciara

> 
> >          </p>
> >        </column>
> >
> >
Kevin Traynor Oct. 20, 2017, 9:01 a.m. UTC | #3
On 10/19/2017 07:27 PM, Loftus, Ciara wrote:
>>
>> On 10/17/2017 11:48 AM, Ciara Loftus wrote:
>>> Some NICs have only one PCI address associated with multiple ports. This
>>> patch extends the dpdk-devargs option's format to cater for such
>>> devices. Whereas before only one of N ports associated with one PCI
>>> address could be added, now N can.
>>>
>>> This patch allows for the following use of the dpdk-devargs option:
>>>
>>> ovs-vsctl set Interface myport options:dpdk-devargs=0000:06:00.0,X
>>>
>>> Where X is an unsigned integer representing one of multiple ports
>>> associated with the PCI address provided.
>>>
>>> This patch has not been tested.
>>>
>>> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>>> ---
>>> v2:
>>> * Simplify function to find port ID of indexed device
>>> * Hotplug compatibility
>>> * Detach compatibility
>>> * Documentation
>>> * Use strtol instead of atoi
>>>
>>>  Documentation/howto/dpdk.rst         |  9 +++++
>>>  Documentation/intro/install/dpdk.rst |  9 +++++
>>>  NEWS                                 |  2 ++
>>>  lib/netdev-dpdk.c                    | 67 ++++++++++++++++++++++++++++++---
>> ---
>>>  vswitchd/vswitch.xml                 | 11 ++++--
>>>  5 files changed, 85 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/Documentation/howto/dpdk.rst
>> b/Documentation/howto/dpdk.rst
>>> index d123819..9447b71 100644
>>> --- a/Documentation/howto/dpdk.rst
>>> +++ b/Documentation/howto/dpdk.rst
>>> @@ -48,6 +48,15 @@ number of dpdk devices found in the log file::
>>>      $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
>>>          options:dpdk-devargs=0000:01:00.1
>>>
>>> +If your PCI device has multiple ports under the same PCI ID, you can use
>> the
>>> +following notation to indicate the specific device you wish to add::
>>> +
>>> +    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
>>> +        options:dpdk-devargs=0000:01:00.0,0
>>> +
>>> +The above would attempt to use the first device (0) associated with that
>> PCI
>>> +ID. Use ,1 ,2 etc. to access the next.
>>> +
>>>  After the DPDK ports get added to switch, a polling thread continuously
>> polls
>>>  DPDK devices and consumes 100% of the core, as can be checked from
>> ``top`` and
>>>  ``ps`` commands::
>>> diff --git a/Documentation/intro/install/dpdk.rst
>> b/Documentation/intro/install/dpdk.rst
>>> index bb69ae5..d0e87f5 100644
>>> --- a/Documentation/intro/install/dpdk.rst
>>> +++ b/Documentation/intro/install/dpdk.rst
>>> @@ -271,6 +271,15 @@ ports. For example, to create a userspace bridge
>> named ``br0`` and add two
>>>  DPDK devices will not be available for use until a valid dpdk-devargs is
>>>  specified.
>>>
>>> +If your PCI device has multiple ports under the same PCI ID, you can use
>> the
>>> +following notation to indicate the specific device you wish to add::
>>> +
>>> +    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
>>> +        options:dpdk-devargs=0000:01:00.0,0
>>> +
>>> +The above would attempt to use the first device (0) associated with that
>> PCI
>>> +ID. Use ,1 ,2 etc. to access the next.
>>> +
>>>  Refer to ovs-vsctl(8) and :doc:`/howto/dpdk` for more details.
>>>
>>>  Performance Tuning
>>> diff --git a/NEWS b/NEWS
>>> index 75f5fa5..ca885a6 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -5,6 +5,8 @@ Post-v2.8.0
>>>         chassis "hostname" in addition to a chassis "name".
>>>     - Linux kernel 4.13
>>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>> +   - DPDK:
>>> +     * Support for adding devices with multiple DPDK ports under one PCI
>> ID.
>>>
>>>  v2.8.0 - 31 Aug 2017
>>>     - ovs-ofctl:
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index c60f46f..35e15da 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -1214,16 +1214,40 @@
>> netdev_dpdk_lookup_by_port_id(dpdk_port_t port_id)
>>>  }
>>>
>>>  static dpdk_port_t
>>> +dpdk_get_port_by_name_with_index(char *name, int idx, int base_id)
>>> +{
>>> +    struct rte_eth_dev_info dev_info;
>>> +    char pci_addr[PCI_PRI_STR_SIZE];
>>> +    dpdk_port_t offset_port_id = base_id + idx;
>>> +
>>> +    if (rte_eth_dev_is_valid_port(offset_port_id)) {
>>> +        rte_eth_dev_info_get(offset_port_id, &dev_info);
>>> +        rte_pci_device_name(&dev_info.pci_dev->addr, pci_addr,
>>> +                            sizeof(pci_addr));
>>> +        if (!strcmp(pci_addr, name)) {
>>> +            return offset_port_id;
>>> +        }
>>> +    }
>>> +
>>> +    VLOG_INFO("Invalid PCI offset %i for device %s", idx, name);
>>> +
>>> +    return DPDK_ETH_PORT_ID_INVALID;
>>> +}
>>> +
>>> +static dpdk_port_t
>>>  netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
>>>                              const char *devargs, char **errp)
>>>  {
>>> -    /* Get the name up to the first comma. */
>>> -    char *name = xmemdup0(devargs, strcspn(devargs, ","));
>>> +    char *devargs_copy = xmemdup0((devargs), strlen(devargs));
>>> +    char *name, *idx_str;
>>> +    unsigned idx;
>>>      dpdk_port_t new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>>
>>> +    name = strtok_r(devargs_copy, ",", &devargs_copy);
>>> +    idx_str = strtok_r(devargs_copy, ",", &devargs_copy);
>>> +
>>>      if (!rte_eth_dev_count()
>>> -            || rte_eth_dev_get_port_by_name(name, &new_port_id)
>>> -            || !rte_eth_dev_is_valid_port(new_port_id)) {
>>> +        || rte_eth_dev_get_port_by_name(name, &new_port_id)) {
>>>          /* Device not found in DPDK, attempt to attach it */
>>>          if (!rte_eth_dev_attach(devargs, &new_port_id)) {
>>
>> I think you should strip out the ",X" for attach. Although looking at
>> the DPDK code it would probably ignore the ",X" once the pci address is
>> ok, so maybe it was intentional.
> 
> I left this in intentionally. Vdev devargs often include commas and the information following them is important. Eg.
> net_pcap0,tx_pcap=/tmp/file_tx.pcap
> 

True for vdev, but I was thinking of the pci case. If you are
distinguishing between pci and vdev because of the "<pci>,<int>" vs
"<vdev>,<string>" issue below then maybe you could also strip out the
",<int>" for pci.

> But actually this will cause a failure later when we assume those arguments are an integer index and try to process them. That needs to be updated in v3.
> 
>>
>>>              /* Attach successful */
>>> @@ -1232,12 +1256,21 @@ netdev_dpdk_process_devargs(struct
>> netdev_dpdk *dev,
>>>          } else {
>>>              /* Attach unsuccessful */
>>>              new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>> -            VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
>>> -                          devargs);
>>> +            goto out;
>>>          }
>>>      }
>>>
>>> -    free(name);
>>> +    if (idx_str) {
>>> +        idx = strtol(idx_str, NULL, 10);
>>> +        new_port_id = dpdk_get_port_by_name_with_index(name, idx,
>> new_port_id);
>>> +    }
>>> +
>>> +out:
>>> +    if (new_port_id == DPDK_ETH_PORT_ID_INVALID) {
>>> +        VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
>>> +                      devargs);
>>> +    }
>>> +
>>
>> you need to free devargs_copy here?
> 
> Yes. Will put this in v3.
> 
>>
>>>      return new_port_id;
>>>  }
>>>
>>> @@ -2549,14 +2582,28 @@ netdev_dpdk_detach(struct unixctl_conn
>> *conn, int argc OVS_UNUSED,
>>>  {
>>>      int ret;
>>>      char *response;
>>> -    uint8_t port_id;
>>>      char devname[RTE_ETH_NAME_MAX_LEN];
>>>      struct netdev_dpdk *dev;
>>> +    char *devname_copy = xmemdup0((argv[1]), strlen(argv[1]));
>>> +    char *name, *idx_str;
>>> +    int idx;
>>> +    dpdk_port_t port_id;
>>>
>>>      ovs_mutex_lock(&dpdk_mutex);
>>>
>>> -    if (!rte_eth_dev_count() || rte_eth_dev_get_port_by_name(argv[1],
>>> -                                                             &port_id)) {
>>> +    name = strtok_r(devname_copy, ",", &devname_copy);
>>> +    idx_str = strtok_r(devname_copy, ",", &devname_copy);
>>> +
>>> +    if (rte_eth_dev_count()) {
>>> +        if (rte_eth_dev_get_port_by_name(argv[1], &port_id)) {
>>> +            port_id = DPDK_ETH_PORT_ID_INVALID;
>>> +        } else if (idx_str) {
>>> +            idx = strtol(idx_str, NULL, 10);
>>> +            port_id = dpdk_get_port_by_name_with_index(name, idx,
>> port_id);
>>> +        }
>>> +    }
>>> +
>>> +    if (port_id == DPDK_ETH_PORT_ID_INVALID) {
>>>          response = xasprintf("Device '%s' not found in DPDK", argv[1]);
>>>          goto error;
>>>      }
>>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>>> index 04c771f..7f503fb 100644
>>> --- a/vswitchd/vswitch.xml
>>> +++ b/vswitchd/vswitch.xml
>>> @@ -2608,12 +2608,17 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
>> type=patch options:peer=p1 \
>>>          <p>
>>>            Specifies the PCI address associated with the port for physical
>>>            devices, or the virtual driver to be used for the port when a virtual
>>> -          PMD is intended to be used. For the latter, the argument string
>>> -          typically takes the form of
>>> +          PMD is intended to be used.
>>> +          For physical devices, you may specify a single PCI ID like
>>> +          <code>0000:06:00.0</code> or an ID followed by a comma and an
>> index
>>> +          like <code>0000:06:00.0,1</code> for cases where multiple ports are
>>> +          associated with a single PCI ID.
>>> +          For virtual devices, the argument string typically takes the form of
>>>            <code>eth_<var>driver_name</var><var>x</var></code>, where
>>>            <var>driver_name</var> is a valid virtual DPDK PMD driver name and
>>>            <var>x</var> is a unique identifier of your choice for the given
>>> -          port.  Only supported by the dpdk port type.
>>> +          port.
>>> +          Only supported by the dpdk port type.
>>
>> Spurious
> 
> Will remove in v3.
> 
> Thanks,
> Ciara
> 
>>
>>>          </p>
>>>        </column>
>>>
>>>
>
diff mbox series

Patch

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index d123819..9447b71 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -48,6 +48,15 @@  number of dpdk devices found in the log file::
     $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
         options:dpdk-devargs=0000:01:00.1
 
+If your PCI device has multiple ports under the same PCI ID, you can use the
+following notation to indicate the specific device you wish to add::
+
+    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
+        options:dpdk-devargs=0000:01:00.0,0
+
+The above would attempt to use the first device (0) associated with that PCI
+ID. Use ,1 ,2 etc. to access the next.
+
 After the DPDK ports get added to switch, a polling thread continuously polls
 DPDK devices and consumes 100% of the core, as can be checked from ``top`` and
 ``ps`` commands::
diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
index bb69ae5..d0e87f5 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -271,6 +271,15 @@  ports. For example, to create a userspace bridge named ``br0`` and add two
 DPDK devices will not be available for use until a valid dpdk-devargs is
 specified.
 
+If your PCI device has multiple ports under the same PCI ID, you can use the
+following notation to indicate the specific device you wish to add::
+
+    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
+        options:dpdk-devargs=0000:01:00.0,0
+
+The above would attempt to use the first device (0) associated with that PCI
+ID. Use ,1 ,2 etc. to access the next.
+
 Refer to ovs-vsctl(8) and :doc:`/howto/dpdk` for more details.
 
 Performance Tuning
diff --git a/NEWS b/NEWS
index 75f5fa5..ca885a6 100644
--- a/NEWS
+++ b/NEWS
@@ -5,6 +5,8 @@  Post-v2.8.0
        chassis "hostname" in addition to a chassis "name".
    - Linux kernel 4.13
      * Add support for compiling OVS with the latest Linux 4.13 kernel
+   - DPDK:
+     * Support for adding devices with multiple DPDK ports under one PCI ID.
 
 v2.8.0 - 31 Aug 2017
    - ovs-ofctl:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index c60f46f..35e15da 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1214,16 +1214,40 @@  netdev_dpdk_lookup_by_port_id(dpdk_port_t port_id)
 }
 
 static dpdk_port_t
+dpdk_get_port_by_name_with_index(char *name, int idx, int base_id)
+{
+    struct rte_eth_dev_info dev_info;
+    char pci_addr[PCI_PRI_STR_SIZE];
+    dpdk_port_t offset_port_id = base_id + idx;
+
+    if (rte_eth_dev_is_valid_port(offset_port_id)) {
+        rte_eth_dev_info_get(offset_port_id, &dev_info);
+        rte_pci_device_name(&dev_info.pci_dev->addr, pci_addr,
+                            sizeof(pci_addr));
+        if (!strcmp(pci_addr, name)) {
+            return offset_port_id;
+        }
+    }
+
+    VLOG_INFO("Invalid PCI offset %i for device %s", idx, name);
+
+    return DPDK_ETH_PORT_ID_INVALID;
+}
+
+static dpdk_port_t
 netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
                             const char *devargs, char **errp)
 {
-    /* Get the name up to the first comma. */
-    char *name = xmemdup0(devargs, strcspn(devargs, ","));
+    char *devargs_copy = xmemdup0((devargs), strlen(devargs));
+    char *name, *idx_str;
+    unsigned idx;
     dpdk_port_t new_port_id = DPDK_ETH_PORT_ID_INVALID;
 
+    name = strtok_r(devargs_copy, ",", &devargs_copy);
+    idx_str = strtok_r(devargs_copy, ",", &devargs_copy);
+
     if (!rte_eth_dev_count()
-            || rte_eth_dev_get_port_by_name(name, &new_port_id)
-            || !rte_eth_dev_is_valid_port(new_port_id)) {
+        || rte_eth_dev_get_port_by_name(name, &new_port_id)) {
         /* Device not found in DPDK, attempt to attach it */
         if (!rte_eth_dev_attach(devargs, &new_port_id)) {
             /* Attach successful */
@@ -1232,12 +1256,21 @@  netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
         } else {
             /* Attach unsuccessful */
             new_port_id = DPDK_ETH_PORT_ID_INVALID;
-            VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
-                          devargs);
+            goto out;
         }
     }
 
-    free(name);
+    if (idx_str) {
+        idx = strtol(idx_str, NULL, 10);
+        new_port_id = dpdk_get_port_by_name_with_index(name, idx, new_port_id);
+    }
+
+out:
+    if (new_port_id == DPDK_ETH_PORT_ID_INVALID) {
+        VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
+                      devargs);
+    }
+
     return new_port_id;
 }
 
@@ -2549,14 +2582,28 @@  netdev_dpdk_detach(struct unixctl_conn *conn, int argc OVS_UNUSED,
 {
     int ret;
     char *response;
-    uint8_t port_id;
     char devname[RTE_ETH_NAME_MAX_LEN];
     struct netdev_dpdk *dev;
+    char *devname_copy = xmemdup0((argv[1]), strlen(argv[1]));
+    char *name, *idx_str;
+    int idx;
+    dpdk_port_t port_id;
 
     ovs_mutex_lock(&dpdk_mutex);
 
-    if (!rte_eth_dev_count() || rte_eth_dev_get_port_by_name(argv[1],
-                                                             &port_id)) {
+    name = strtok_r(devname_copy, ",", &devname_copy);
+    idx_str = strtok_r(devname_copy, ",", &devname_copy);
+
+    if (rte_eth_dev_count()) {
+        if (rte_eth_dev_get_port_by_name(argv[1], &port_id)) {
+            port_id = DPDK_ETH_PORT_ID_INVALID;
+        } else if (idx_str) {
+            idx = strtol(idx_str, NULL, 10);
+            port_id = dpdk_get_port_by_name_with_index(name, idx, port_id);
+        }
+    }
+
+    if (port_id == DPDK_ETH_PORT_ID_INVALID) {
         response = xasprintf("Device '%s' not found in DPDK", argv[1]);
         goto error;
     }
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 04c771f..7f503fb 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -2608,12 +2608,17 @@  ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
         <p>
           Specifies the PCI address associated with the port for physical
           devices, or the virtual driver to be used for the port when a virtual
-          PMD is intended to be used. For the latter, the argument string
-          typically takes the form of
+          PMD is intended to be used.
+          For physical devices, you may specify a single PCI ID like
+          <code>0000:06:00.0</code> or an ID followed by a comma and an index
+          like <code>0000:06:00.0,1</code> for cases where multiple ports are
+          associated with a single PCI ID.
+          For virtual devices, the argument string typically takes the form of
           <code>eth_<var>driver_name</var><var>x</var></code>, where
           <var>driver_name</var> is a valid virtual DPDK PMD driver name and
           <var>x</var> is a unique identifier of your choice for the given
-          port.  Only supported by the dpdk port type.
+          port.
+          Only supported by the dpdk port type.
         </p>
       </column>