[ovs-dev,v3] netdev-dpdk: add support for the RTE_ETH_EVENT_INTR_RESET event

Message ID 20190911131907.180147.7202.stgit@netdev64
State New

Commit Message

Eelco Chaudron Sept. 11, 2019, 1:20 p.m. UTC
Currently, OVS does not register for, and therefore does not handle, the
interface reset event from the DPDK framework. This causes a problem
when a VF is used as an interface and its configuration changes.

For example, without this patch applied, the MAC change in the following
scenario is not detected or acted upon until OVS is restarted:

  $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
  $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
            set Interface dpdk0 type=dpdk -- \
            set Interface dpdk0 options:dpdk-devargs=0000:05:0a.0

  $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
v2 -> v3:
v1 -> v2:
  Fixed Ilya's comments

 lib/netdev-dpdk.c |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 51 insertions(+), 2 deletions(-)

Comments

Ilya Maximets Sept. 12, 2019, 8:39 a.m. UTC | #1
On 11.09.2019 16:20, Eelco Chaudron wrote:
> Currently, OVS does not register and therefore not handle the
> interface reset event from the DPDK framework. This would cause a
> problem in cases where a VF is used as an interface, and its
> configuration changes.
> 
> As an example in the following scenario the MAC change is not
> detected/acted upon until OVS is restarted without the patch applied:
> 
>   $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
>   $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
>             set Interface dpdk0 type=dpdk -- \
>             set Interface dpdk0 options:dpdk-devargs=0000:05:0a.0
> 
>   $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33
> 
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---

<snip>

> @@ -4180,13 +4223,19 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>          && dev->rxq_size == dev->requested_rxq_size
>          && dev->txq_size == dev->requested_txq_size
>          && dev->socket_id == dev->requested_socket_id
> -        && dev->started) {
> +        && dev->started && !dev->reset_needed) {
>          /* Reconfiguration is unnecessary */
>  
>          goto out;
>      }
>  
> -    rte_eth_dev_stop(dev->port_id);
> +    if (dev->reset_needed) {
> +        rte_eth_dev_reset(dev->port_id);

Thinking more about the change and looking at the flow control configuration,
it seems that on reset we'll lose the configuration done in set_config().
A device reset should return the device to its initial state, i.e. all default
settings, but set_config() will not be called after that.
I know that VFs commonly don't support flow control, but if we add something
like real MAC address configuration, it will be lost too.

What do you think?
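
A minimal sketch of the concern above, as a toy model. All toy_* names are
hypothetical stand-ins and not OVS or DPDK APIs, and the "reset returns to
defaults" behaviour is the assumption stated above, not something verified
against a particular driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a port; not the real struct netdev_dpdk. */
struct toy_port {
    bool flow_ctrl;   /* Runtime option applied through set_config(). */
    bool started;
};

/* Hypothetical analogue of set_config(): applies a user option. */
static void
toy_set_config(struct toy_port *port, bool flow_ctrl)
{
    port->flow_ctrl = flow_ctrl;
}

/* Hypothetical analogue of the patched reconfigure path. */
static void
toy_reconfigure(struct toy_port *port, bool reset_needed)
{
    if (reset_needed) {
        /* Assumed reset semantics: back to default settings. */
        port->flow_ctrl = false;
    }
    /* Stop and restart; nothing here reapplies set_config() options. */
    port->started = false;
    port->started = true;
}

/* Walks through the scenario: configure, restart, then reset. */
static void
toy_demo_reset_loses_config(void)
{
    struct toy_port port = { false, false };

    toy_set_config(&port, true);
    toy_reconfigure(&port, false);
    assert(port.flow_ctrl);        /* A plain restart keeps the option. */

    toy_reconfigure(&port, true);
    assert(!port.flow_ctrl);       /* A reset drops it. */
    assert(port.started);          /* The port still comes back up. */
}
```

Calling toy_demo_reset_loses_config() exercises both paths. After a reset the
port is running again but without the option, unless something calls
toy_set_config() a second time.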

BTW, there is a discussion about flow control configuration:
https://patchwork.ozlabs.org/patch/1159689/

> +        dev->reset_needed = false;
> +    } else {
> +        rte_eth_dev_stop(dev->port_id);
> +    }
> +
>      dev->started = false;
>  
>      err = netdev_dpdk_mempool_configure(dev);
Eelco Chaudron Sept. 12, 2019, 10:07 a.m. UTC | #2
On 12 Sep 2019, at 10:39, Ilya Maximets wrote:

> On 11.09.2019 16:20, Eelco Chaudron wrote:
>> Currently, OVS does not register and therefore not handle the
>> interface reset event from the DPDK framework. This would cause a
>> problem in cases where a VF is used as an interface, and its
>> configuration changes.
>>
>> As an example in the following scenario the MAC change is not
>> detected/acted upon until OVS is restarted without the patch applied:
>>
>>   $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
>>   $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
>>             set Interface dpdk0 type=dpdk -- \
>>             set Interface dpdk0 options:dpdk-devargs=0000:05:0a.0
>>
>>   $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33
>>
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>> ---
>
> <snip>
>
>> @@ -4180,13 +4223,19 @@ netdev_dpdk_reconfigure(struct netdev 
>> *netdev)
>>          && dev->rxq_size == dev->requested_rxq_size
>>          && dev->txq_size == dev->requested_txq_size
>>          && dev->socket_id == dev->requested_socket_id
>> -        && dev->started) {
>> +        && dev->started && !dev->reset_needed) {
>>          /* Reconfiguration is unnecessary */
>>
>>          goto out;
>>      }
>>
>> -    rte_eth_dev_stop(dev->port_id);
>> +    if (dev->reset_needed) {
>> +        rte_eth_dev_reset(dev->port_id);
>
> Thinking more about the change and looking at flow control 
> configuration,
> it seems that on reset we'll lost configurations done in set_config().
> Device reset should return it to initial state, i.e. all the default 
> settings,
> but set_config() will not be called after that.
> I know, that VFs commonly doesn't support flow control, but if we'll 
> add like
> real configuration of MAC address, it will be lost too.
>
> What do you think?

Doesn't the full bridge run sequence take care of this?

So in the callback we call netdev_request_reconfigure(), which triggers the
following sequence:

bridge_run__()
  ...
    dpif_netdev_run()
      ...
        reconfigure_datapath()
          ...
            netdev_dpdk_reconfigure()

But then, in the next run, it will also call:

bridge_run()
  ...
    bridge_delete_or_reconfigure_ports()
      ...
         netdev_set_config()
           netdev_dpdk_set_config()


Or am I missing something?

>
> BTW, there is a discussion about flow control configuration:
> https://patchwork.ozlabs.org/patch/1159689/
>
>> +        dev->reset_needed = false;
>> +    } else {
>> +        rte_eth_dev_stop(dev->port_id);
>> +    }
>> +
>>      dev->started = false;
>>
>>      err = netdev_dpdk_mempool_configure(dev);
Ilya Maximets Sept. 12, 2019, 10:19 a.m. UTC | #3
On 12.09.2019 13:07, Eelco Chaudron wrote:
> 
> 
> On 12 Sep 2019, at 10:39, Ilya Maximets wrote:
> 
>> On 11.09.2019 16:20, Eelco Chaudron wrote:
>>> Currently, OVS does not register and therefore not handle the
>>> interface reset event from the DPDK framework. This would cause a
>>> problem in cases where a VF is used as an interface, and its
>>> configuration changes.
>>>
>>> As an example in the following scenario the MAC change is not
>>> detected/acted upon until OVS is restarted without the patch applied:
>>>
>>>   $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
>>>   $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
>>>             set Interface dpdk0 type=dpdk -- \
>>>             set Interface dpdk0 options:dpdk-devargs=0000:05:0a.0
>>>
>>>   $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33
>>>
>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>> ---
>>
>> <snip>
>>
>>> @@ -4180,13 +4223,19 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>>>          && dev->rxq_size == dev->requested_rxq_size
>>>          && dev->txq_size == dev->requested_txq_size
>>>          && dev->socket_id == dev->requested_socket_id
>>> -        && dev->started) {
>>> +        && dev->started && !dev->reset_needed) {
>>>          /* Reconfiguration is unnecessary */
>>>
>>>          goto out;
>>>      }
>>>
>>> -    rte_eth_dev_stop(dev->port_id);
>>> +    if (dev->reset_needed) {
>>> +        rte_eth_dev_reset(dev->port_id);
>>
>> Thinking more about the change and looking at flow control configuration,
>> it seems that on reset we'll lost configurations done in set_config().
>> Device reset should return it to initial state, i.e. all the default settings,
>> but set_config() will not be called after that.
>> I know, that VFs commonly doesn't support flow control, but if we'll add like
>> real configuration of MAC address, it will be lost too.
>>
>> What do you think?
> 
> Doesn’t the full bridge run sequence take care of this?
> 
> So in callback we do netdev_request_reconfigure() which triggers the following sequence…
> 
> bridge_run__()
>  ...
>    dpif_netdev_run()
>      ...
>        reconfigure_datapath()
>          ...
>            netdev_dpdk_reconfigure()
> 
> But than also it will call in the next run
> 
> bridge_run()
>  ...
>    bridge_delete_or_reconfigure_ports()
>      ...
>         netdev_set_config()
>           netdev_dpdk_set_config()
> 
> 
> Or do I miss something?

dpif_netdev_run() is called from ofproto_type_run(), which is called
unconditionally from bridge_run().
But bridge_delete_or_reconfigure_ports() is only called from
bridge_reconfigure(), which is guarded by the following condition:

    if (ovsdb_idl_get_seqno(idl) != idl_seqno ||
        if_notifier_changed(ifnotifier)) {

i.e. if there are no DB updates and no netlink notifications,
set_config() will not be called. I understand that in your scenario there
might be a netlink notification for the interface change, since the PF is
controlled by the kernel, but that might not be the case if the PF is attached
as a DPDK port to OVS or to some other application.
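
The difference between the two paths can be sketched with a toy model of one
main-loop iteration. The struct and the toy_* names are hypothetical, and the
order of the two steps inside the function is illustrative only:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the state one main-loop iteration looks at. */
struct toy_bridge {
    unsigned int db_seqno;       /* Current database sequence number. */
    unsigned int seen_seqno;     /* Sequence number already handled. */
    bool netlink_changed;        /* Stand-in for if_notifier_changed(). */
    bool reconfigure_requested;  /* Set by a reconfigure request. */
    int n_reconfigure;           /* Times the port reconfigure ran. */
    int n_set_config;            /* Times set_config() ran. */
};

/* One iteration: an unconditional datapath run, then the guarded part. */
static void
toy_bridge_run(struct toy_bridge *br)
{
    /* Unconditional: pending reconfigure requests are always served. */
    if (br->reconfigure_requested) {
        br->reconfigure_requested = false;
        br->n_reconfigure++;
    }

    /* Guarded: mirrors the quoted 'if' condition. */
    if (br->db_seqno != br->seen_seqno || br->netlink_changed) {
        br->seen_seqno = br->db_seqno;
        br->netlink_changed = false;
        br->n_set_config++;
    }
}

/* A reset request alone reconfigures the port but skips set_config(). */
static void
toy_demo_gate(void)
{
    struct toy_bridge br = { 0 };

    br.reconfigure_requested = true;
    toy_bridge_run(&br);
    assert(br.n_reconfigure == 1);
    assert(br.n_set_config == 0);

    br.netlink_changed = true;   /* E.g. a kernel-managed PF changed. */
    toy_bridge_run(&br);
    assert(br.n_set_config == 1);
}
```

In this model a reset request on its own never reaches the set_config() step;
only a database or netlink change does.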

> 
>>
>> BTW, there is a discussion about flow control configuration:
>> https://patchwork.ozlabs.org/patch/1159689/
>>
>>> +        dev->reset_needed = false;
>>> +    } else {
>>> +        rte_eth_dev_stop(dev->port_id);
>>> +    }
>>> +
>>>      dev->started = false;
>>>
>>>      err = netdev_dpdk_mempool_configure(dev);
> 
>
Ilya Maximets Sept. 12, 2019, 10:24 a.m. UTC | #4
On 12.09.2019 13:19, Ilya Maximets wrote:
> On 12.09.2019 13:07, Eelco Chaudron wrote:
>>
>>
>> On 12 Sep 2019, at 10:39, Ilya Maximets wrote:
>>
>>> On 11.09.2019 16:20, Eelco Chaudron wrote:
>>>> Currently, OVS does not register and therefore not handle the
>>>> interface reset event from the DPDK framework. This would cause a
>>>> problem in cases where a VF is used as an interface, and its
>>>> configuration changes.
>>>>
>>>> As an example in the following scenario the MAC change is not
>>>> detected/acted upon until OVS is restarted without the patch applied:
>>>>
>>>>   $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
>>>>   $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
>>>>             set Interface dpdk0 type=dpdk -- \
>>>>             set Interface dpdk0 options:dpdk-devargs=0000:05:0a.0
>>>>
>>>>   $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33
>>>>
>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>> ---
>>>
>>> <snip>
>>>
>>>> @@ -4180,13 +4223,19 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>>>>          && dev->rxq_size == dev->requested_rxq_size
>>>>          && dev->txq_size == dev->requested_txq_size
>>>>          && dev->socket_id == dev->requested_socket_id
>>>> -        && dev->started) {
>>>> +        && dev->started && !dev->reset_needed) {
>>>>          /* Reconfiguration is unnecessary */
>>>>
>>>>          goto out;
>>>>      }
>>>>
>>>> -    rte_eth_dev_stop(dev->port_id);
>>>> +    if (dev->reset_needed) {
>>>> +        rte_eth_dev_reset(dev->port_id);
>>>
>>> Thinking more about the change and looking at flow control configuration,
>>> it seems that on reset we'll lost configurations done in set_config().
>>> Device reset should return it to initial state, i.e. all the default settings,
>>> but set_config() will not be called after that.
>>> I know, that VFs commonly doesn't support flow control, but if we'll add like
>>> real configuration of MAC address, it will be lost too.
>>>
>>> What do you think?
>>
>> Doesn’t the full bridge run sequence take care of this?
>>
>> So in callback we do netdev_request_reconfigure() which triggers the following sequence…
>>
>> bridge_run__()
>>  ...
>>    dpif_netdev_run()
>>      ...
>>        reconfigure_datapath()
>>          ...
>>            netdev_dpdk_reconfigure()
>>
>> But than also it will call in the next run
>>
>> bridge_run()
>>  ...
>>    bridge_delete_or_reconfigure_ports()
>>      ...
>>         netdev_set_config()
>>           netdev_dpdk_set_config()
>>
>>
>> Or do I miss something?
> 
> dpif_netdev_run() is called from ofproto_type_run() which is called
> unconditionally from the bridge_run().
> But bridge_delete_or_reconfigure_ports() only called from bridge_reconfigure(),
> which is protected by the following condition:
> 
>     if (ovsdb_idl_get_seqno(idl) != idl_seqno ||
>         if_notifier_changed(ifnotifier)) {
> 
> i.e. if there will be no DB updates or there will be no netlink notifications,
> set_config() will not be called. I understand that in your scenario there
> might be netlink notification for interface changes since PF is controlled
> by the kernel, but it might be not the case if PF attached as a DPDK port
> to OVS or some other application.

Hmm. Basically, dpdk_eth_event_callback() is an analogue of the
if_notifier, but for DPDK applications.

Maybe we can add it as another trigger for the 'if' condition above?
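
One possible shape for such a trigger, sketched with C11 atomics. Everything
here is hypothetical (the flag, the toy_* helpers, and the test-and-clear
semantics); it only illustrates the idea and is not a proposal for the actual
OVS code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag raised when a device event arrives. */
static atomic_bool toy_dpdk_event_pending;

/* Would be called from the event callback, possibly on another thread. */
static void
toy_dpdk_event_notify(void)
{
    atomic_store(&toy_dpdk_event_pending, true);
}

/* Returns true once per raised event and clears the flag. */
static bool
toy_dpdk_event_changed(void)
{
    return atomic_exchange(&toy_dpdk_event_pending, false);
}

/* Extended guard: database change, netlink change, or device event. */
static bool
toy_should_reconfigure(bool db_changed, bool netlink_changed)
{
    /* Read the flag first so that it is consumed on every call. */
    bool event_changed = toy_dpdk_event_changed();

    return db_changed || netlink_changed || event_changed;
}

/* A device event alone now passes the guard, exactly once. */
static void
toy_demo_event_trigger(void)
{
    assert(!toy_should_reconfigure(false, false));

    toy_dpdk_event_notify();
    assert(toy_should_reconfigure(false, false));
    assert(!toy_should_reconfigure(false, false));
}
```

The atomic exchange is used so that an event raised between the read and the
clear is not lost, which a separate load followed by a store could miss.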

> 
>>
>>>
>>> BTW, there is a discussion about flow control configuration:
>>> https://patchwork.ozlabs.org/patch/1159689/
>>>
>>>> +        dev->reset_needed = false;
>>>> +    } else {
>>>> +        rte_eth_dev_stop(dev->port_id);
>>>> +    }
>>>> +
>>>>      dev->started = false;
>>>>
>>>>      err = netdev_dpdk_mempool_configure(dev);
>>
>>
> 
>

Patch

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 4805783..9747def 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -396,6 +396,8 @@  struct netdev_dpdk {
         bool attached;
         /* If true, rte_eth_dev_start() was successfully called */
         bool started;
+        bool reset_needed;
+        /* 1 pad byte here. */
         struct eth_addr hwaddr;
         int mtu;
         int socket_id;
@@ -1173,6 +1175,8 @@  common_construct(struct netdev *netdev, dpdk_port_t port_no,
     ovsrcu_index_init(&dev->vid, -1);
     dev->vhost_reconfigured = false;
     dev->attached = false;
+    dev->started = false;
+    dev->reset_needed = false;
 
     ovsrcu_init(&dev->qos_conf, NULL);
 
@@ -1769,6 +1773,34 @@  netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
     return new_port_id;
 }
 
+static int
+dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type,
+                        void *param OVS_UNUSED, void *ret_param OVS_UNUSED)
+{
+    struct netdev_dpdk *dev;
+
+    switch ((int) type) {
+    case RTE_ETH_EVENT_INTR_RESET:
+        ovs_mutex_lock(&dpdk_mutex);
+        dev = netdev_dpdk_lookup_by_port_id(port_id);
+        if (dev) {
+            ovs_mutex_lock(&dev->mutex);
+            dev->reset_needed = true;
+            netdev_request_reconfigure(&dev->up);
+            VLOG_DBG_RL(&rl, "%s: Device reset requested.",
+                        netdev_get_name(&dev->up));
+            ovs_mutex_unlock(&dev->mutex);
+        }
+        ovs_mutex_unlock(&dpdk_mutex);
+        break;
+
+    default:
+        /* Ignore all other types. */
+        break;
+    }
+    return 0;
+}
+
 static void
 dpdk_set_rxq_config(struct netdev_dpdk *dev, const struct smap *args)
     OVS_REQUIRES(dev->mutex)
@@ -3807,6 +3839,8 @@  netdev_dpdk_class_init(void)
     /* This function can be called for different classes.  The initialization
      * needs to be done only once */
     if (ovsthread_once_start(&once)) {
+        int ret;
+
         ovs_thread_create("dpdk_watchdog", dpdk_watchdog, NULL);
         unixctl_command_register("netdev-dpdk/set-admin-state",
                                  "[netdev] up|down", 1, 2,
@@ -3820,6 +3854,15 @@  netdev_dpdk_class_init(void)
                                  "[netdev]", 0, 1,
                                  netdev_dpdk_get_mempool_info, NULL);
 
+        ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
+                                            RTE_ETH_EVENT_INTR_RESET,
+                                            dpdk_eth_event_callback, NULL);
+
+        if (ret != 0) {
+            VLOG_ERR("Ethernet device callback register error: %s",
+                     rte_strerror(-ret));
+        }
+
         ovsthread_once_done(&once);
     }
 
@@ -4180,13 +4223,19 @@  netdev_dpdk_reconfigure(struct netdev *netdev)
         && dev->rxq_size == dev->requested_rxq_size
         && dev->txq_size == dev->requested_txq_size
         && dev->socket_id == dev->requested_socket_id
-        && dev->started) {
+        && dev->started && !dev->reset_needed) {
         /* Reconfiguration is unnecessary */
 
         goto out;
     }
 
-    rte_eth_dev_stop(dev->port_id);
+    if (dev->reset_needed) {
+        rte_eth_dev_reset(dev->port_id);
+        dev->reset_needed = false;
+    } else {
+        rte_eth_dev_stop(dev->port_id);
+    }
+
     dev->started = false;
 
     err = netdev_dpdk_mempool_configure(dev);