[ovs-dev] Openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down"

Submitted by Ilya Maximets on Aug. 1, 2017, 2:19 p.m.

Details

Message ID 17653a73-8033-3b42-7e15-d11398ee8e21@samsung.com
State New
Headers show

Commit Message

Ilya Maximets Aug. 1, 2017, 2:19 p.m.
On 01.08.2017 15:46, Keshav Gupta wrote:
> Hi Ben/Ilya
>         Mainly patch do the following
> 1) Call the dpdk rx api(rte_eth_rx_burst) only when ovs netdev state is UP
> 2)  Now the rte_eth_dev_stop/ rte_eth_dev_start calls are removed in updating the flags   
> 
> 
> This  fixes the core dump issue but I see below side effect with this patch:
> With this change now only ports admin state(ovs netdev level) is changed (dpdk/nic level state is not changed) so NIC will continue to receive the packet until its Rxq ring is full
> Later if somebody bring up the port then he will get the burst of old packets which are there in the Rxq (typically 4k packet (dpdk Rxq size)).
> 
> 
> Thanks
> Keshav
>    

Hi Keshav,

Yes, you're right. I think it's the same situation right now can be observed on
current master. We can try to fix this for the HW NICs in the similar way as we
handle destroy_device event for vhost-user ports. I prepared quick fix (not even
compile tested) for the reference how this can be fixed. Please check the code
below. One thing is that we actually can't fix such behaviour for the vhost-user
because there is no such command to stop the vhost device. But we may try to
fix it for HW NICs if it's important.

--8<--------------------------------------------------------------------->8--
--8<--------------------------------------------------------------------->8--

Also, the diff above made on top of current master. So, you possibly will
need to port it a little.

Best regards, Ilya Maximets.

> 
> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org] 
> Sent: Monday, July 31, 2017 9:26 PM
> To: Ilya Maximets
> Cc: ovs-dev@openvswitch.org; Keshav Gupta
> Subject: Re: [ovs-dev] Openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down"
> 
> Thanks.
> 
> It doesn't cherry-pick cleanly to branch-2.6, so I think I'll leave it for now since Darrell made the reasonable point that DPDK was experimental in 2.6.
> 
> On Mon, Jul 31, 2017 at 06:20:28PM +0300, Ilya Maximets wrote:
>> On 31.07.2017 16:05, Ben Pfaff wrote:
>>> Ilya, should we apply this patch to branch-2.6?  Are there other 
>>> patches that should be backported?
>>
>> It's definitely a bug and should be backported if someone wants to use 
>> barnch-2.6 with DPDK datapath. I traced only this exact case, so it's 
>> hard to say if something else should be backported, but this patch 
>> should fix described issue without  any additional changes.
>>  
>>> On Wed, Jul 26, 2017 at 03:28:12PM +0300, Ilya Maximets wrote:
>>>> Hi.
>>>>
>>>> You need to backport at least following patch:
>>>>
>>>> commit 3b1fb0779b87788968c1a6a9ff295a9883547485
>>>> Author: Daniele Di Proietto <diproiettod@vmware.com>
>>>> Date:   Tue Nov 15 15:40:49 2016 -0800
>>>>
>>>>     netdev-dpdk: Don't call rte_dev_stop() in update_flags().
>>>>     
>>>>     Calling rte_eth_dev_stop() while the device is running causes a crash.
>>>>     
>>>>     We could use rte_eth_dev_set_link_down(), but not every PMD implements
>>>>     that, and I found one NIC where that has no effect.
>>>>     
>>>>     Instead, this commit checks if the device has the NETDEV_UP flag when
>>>>     transmitting or receiving (similarly to what we do for vhostuser). I
>>>>     didn't notice any performance difference with this check in case the
>>>>     device is up.
>>>>     
>>>>     An alternative would be to remove the device queues from the pmd threads
>>>>     tx and receive cache, but that requires reconfiguration and I'd prefer
>>>>     to avoid it, because the change can come from OpenFlow.
>>>>     
>>>>     Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
>>>>     Acked-by: Ilya Maximets <i.maximets@samsung.com>
>>>>
>>>> This should fix your issue.
>>>> In general, I'm suggesting to use stable 2.7 OVS, there was too 
>>>> many DPDK related changes including stability fixes since 2.6.
>>>>
>>>> Best regards, Ilya Maximets.
>>>>
>>>>> Hi
>>>>>   We are experiencing a openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down".
>>>>>
>>>>> backtrace of core is like below. Is there any issue reported earlier  for this type of crash in openvswitch community.
>>>>>
>>>>> (gdb) bt
>>>>> #0  ixgbe_rxq_rearm (rxq=0x7fa45061f800) at 
>>>>> /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net
>>>>> /ixgbe/ixgbe_rxtx_vec_sse.c:98
>>>>> #1  _recv_raw_pkts_vec (split_packet=0x0, nb_pkts=32, rx_pkts=<optimized out>, rxq=0x7fa45061f800)
>>>>>     at 
>>>>> /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net
>>>>> /ixgbe/ixgbe_rxtx_vec_sse.c:290
>>>>> #2  ixgbe_recv_pkts_vec (rx_queue=0x7fa45061f800, rx_pkts=<optimized out>, nb_pkts=<optimized out>)
>>>>>     at 
>>>>> /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net
>>>>> /ixgbe/ixgbe_rxtx_vec_sse.c:474
>>>>> #3  0x000000e5000000e4 in ?? ()
>>>>> #4  0x00000046000000e6 in ?? ()
>>>>> #5  0x0000006a00000069 in ?? ()
>>>>> #6  0x0000006c0000006b in ?? ()
>>>>> #7  0x000000ec0000006d in ?? ()
>>>>> #8  0x000000ee000000ed in ?? ()
>>>>> #9  0x00000001537f5780 in ?? ()
>>>>> #10 0x0000000000000000 in ?? ()
>>>>> (gdb)
>>>>>
>>>>>
>>>>> I have analyzed the core and it seems it is a result of device 
>>>>> stop and packet receive from the port happening at same time by 
>>>>> two thread OVS main thread(device stop) and PMD thread(pkt receive). More precisely main thread cleaning the packet buffer from rxq sw_ring to avoid the packet buffer leak while in parallel PMD thread is filling the packet buffer in sw_ring/descriptor ring as part of ixgbe_recv_pkts_vec.
>>>>>
>>>>> version used is: openvswitch (2.6.1) with dpdk (16.11).
>>>>>
>>>>> This crash is not every time reproducible but frequency seems to be high.
>>>>>
>>>>> I am new to openvswitch community and this is first time I am posting a query. let me know if anything you require from my side.
>>>>>
>>>>> Thanks
>>>>> Keshav
>>>>
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev@openvswitch.org
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>>
>>>

Patch hide | download patch | download mbox

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ea17b97..202678f 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2300,20 +2300,48 @@  netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
                            enum netdev_flags *old_flagsp)
     OVS_REQUIRES(dev->mutex)
 {
+    enum netdev_flags new_flags = dev->flags;
+
     if ((off | on) & ~(NETDEV_UP | NETDEV_PROMISC)) {
         return EINVAL;
     }
 
     *old_flagsp = dev->flags;
-    dev->flags |= on;
-    dev->flags &= ~off;
+    new_flags |= on;
+    new_flags &= ~off;
 
-    if (dev->flags == *old_flagsp) {
+    if (new_flags == dev->flags) {
         return 0;
     }
 
     if (dev->type == DPDK_DEV_ETH) {
-        if (dev->flags & NETDEV_PROMISC) {
+        if (NETDEV_UP & ((*old_flagsp ^ on) | (*old_flagsp ^ off))) {
+            if (NETDEV_UP & off) {
+                bool quiescent = ovsrcu_is_quiescent();
+
+                dev->flags = new_flags;
+                /* Wait for other threads to quiesce after turning off the
+                 * 'NETDEV_UP' before actual stopping the device to be sure
+                 * that all threads finished their rx/tx operations. */
+                ovsrcu_synchronize();
+                if (quiescent) {
+                    /* As call to ovsrcu_synchronize() will end the quiescent
+                     * state, put thread back into quiescent state. */
+                    ovsrcu_quiesce_start();
+                }
+                rte_eth_dev_stop(dev->port_id);
+            }  else {
+                int retval = rte_eth_dev_start(dev->port_id);
+
+                if (retval) {
+                    VLOG_ERR("Interface %s start error: %s", dev->up.name,
+                             rte_strerror(-diag));
+                    return -retval;
+                }
+            }
+        }
+
+        if (new_flags & NETDEV_PROMISC) {
             rte_eth_promiscuous_enable(dev->port_id);
         }
 
@@ -2336,6 +2364,7 @@  netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
         }
     }
 
+    dev->flags = new_flags;
     return 0;
 }