Message ID: 1512734667-23202-1-git-send-email-bhanuprakash.bodireddy@intel.com
Series: Add OVS DPDK keep-alive functionality.
Hi Bhanu,

I've tested v6 and LGTM, works as expected. To check the behavior I blocked one or more of the PMDs with the trick of attaching cgdb. So with utilities/ovs-appctl keepalive/pmd-health-show I could see, for example, 2 of the 3 PMDs - those I intentionally blocked - correctly reported as DEAD first, then GONE:

    Keepalive status
    keepalive status   : Enabled
    keepalive interval : 1000 ms
    keepalive init time: 15 Dec 2017 13:20:54
    PMD threads        : 3

    PMD     CORE   STATE   LAST SEEN TIMESTAMP(UTC)
    pmd62    5     GONE    15 Dec 2017 13:22:39
    pmd65    7     ALIVE   15 Dec 2017 13:22:50
    pmd63    4     GONE    15 Dec 2017 13:22:14

Thanks,
Antonio

> -----Original Message-----
> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-bounces@openvswitch.org]
> On Behalf Of Bhanuprakash Bodireddy
> Sent: Friday, December 8, 2017 12:04 PM
> To: dev@openvswitch.org
> Subject: [ovs-dev] [PATCH v6 0/8] Add OVS DPDK keep-alive functionality.
>
> The keepalive feature is aimed at achieving Fastpath Service Assurance
> in OVS-DPDK deployments. It adds support for monitoring the packet
> processing threads by dispatching heartbeats at regular intervals.
>
> The keepalive feature can be enabled through the OVSDB settings below.
>
> enable-keepalive=true
>   - The keepalive feature is disabled by default and should be enabled
>     at startup, before the ovs-vswitchd daemon is started.
>
> keepalive-interval="5000"
>   - Timer interval in milliseconds for monitoring the packet
>     processing cores.
>
> TESTING:
> The testing of keepalive is done using the stress cmd (simulating stalls).
>   - pmd-cpu-mask=0xf       [MQ enabled on DPDK ports]
>   - stress -c 1 &          [tid is usually the __tid + 1 of the output]
>   - chrt -r -p 99 <tid>    [set realtime priority for the stress thread]
>   - taskset -p 0x8 <tid>   [pin the stress thread to the core the PMD is
>                             running on]
>   - The PMD thread will be descheduled due to its normal priority and
>     yields the core to the stress thread.
>   - ovs-appctl keepalive/pmd-health-show      [display that the thread is GONE]
>   - ./ovsdb/ovsdb-client monitor Open_vSwitch [should update the status]
>   - taskset -p 0x10 <tid>  [this brings the PMD thread back to life, as the
>                             stress thread is moved to an idle core]
>
> (Watch out for stress threads, and carefully pin them to cores so as not
> to hang your DUTs during testing.)
>
> v5 -> v6
> * Remove 2 patches from the series:
>   - xnanosleep was applied to master as part of high resolution timeout
>     support.
>   - The extended get_process_info() API was also applied to master earlier.
> * Remove KA_STATE_DOZING as it was initially meant to handle core C-states;
>   not needed for now.
> * Fixed ka_destroy(), to fix unit test cases 536 and 537.
> * A minor performance degradation (0.5%) is observed with keepalive enabled.
>   [Tested with a loopback case using 1000 IXIA streams/64-byte UDP pkts and
>   1 PMD thread (9.239 vs 9.177 Mpps) at a 10 ms ka-interval timeout.]
> * Verified with sparse and MSVC compilers (appveyor).
>
> v4 -> v5
> * Add 3 more patches to the series:
>   - xnanosleep()
>   - Documentation
>   - Update to NEWS
> * Remove all references to core_id and instead implement thread-based
>   tracking.
> * Addressed most of the comments in v4.
>
> v3 -> v4
> * Split the functionality into 2 parts. This patch series only updates
>   PMD status to OVSDB. The incremental patch series handles false
>   positives, false negatives, and more checking and stats.
> * Remove code from the netdev layer and the dependency on the
>   rte_keepalive lib.
> * Merged a few patches and simplified the patch series.
> * Timestamp in human-readable form.
>
> v2 -> v3
> * Rebase.
> * Verified with the dpdk-stable-17.05.1 release.
> * Fixed build issues with MSVC and cross-checked with appveyor.
>
> v1 -> v2
> * Rebase.
> * Drop patch 01/20 "Consolidate process related APIs" of v1 as it
>   was already applied as a separate patch.
>
> RFCv3 -> v1
> * Made changes to fix failures in some unit test cases.
> * Some more code cleanup w.r.t. process-related APIs.
>
> RFCv2 -> RFCv3
> * Remove the POSIX shared memory block implementation (suggested by Aaron).
> * Rework the logic to register and track threads instead of cores. This way,
>   in the future any thread can be registered to the KA framework. For now
>   only PMD threads are tracked (suggested by Aaron).
> * Refactor a few APIs and further clean up the code.
>
> RFCv1 -> RFCv2
> * Merged the xml and schema commits into the later commit where the actual
>   implementation is done (suggested by Ben).
> * Fix the ovs-appctl keepalive/* hang issue when KA is disabled.
> * Fixed memory leaks with the appctl commands keepalive/pmd-health-show
>   and keepalive/pmd-xstats-show.
> * Refactor code and fix APIs dealing with PMD health monitoring.
>
> Bhanuprakash Bodireddy (8):
>   Keepalive: Add initial keepalive configuration.
>   dpif-netdev: Register packet processing cores to KA framework.
>   dpif-netdev: Enable heartbeats for DPDK datapath.
>   keepalive: Retrieve PMD status periodically.
>   bridge: Update keepalive status in OVSDB.
>   keepalive: Add support to query keepalive status and statistics.
>   Documentation: Update DPDK doc with Keepalive feature.
>   NEWS: Add keepalive support information in NEWS.
>
>  Documentation/howto/dpdk.rst | 112 +++++++++
>  NEWS                         |   2 +
>  lib/automake.mk              |   2 +
>  lib/dpif-netdev.c            |  92 ++++++++
>  lib/keepalive.c              | 552 +++++++++++++++++++++++++++++++++++++++++++
>  lib/keepalive.h              | 109 +++++++++
>  lib/ovs-thread.c             |   6 +
>  lib/ovs-thread.h             |   1 +
>  lib/util.c                   |  22 ++
>  lib/util.h                   |   1 +
>  vswitchd/bridge.c            |  29 +++
>  vswitchd/vswitch.ovsschema   |   8 +-
>  vswitchd/vswitch.xml         |  49 ++++
>  13 files changed, 983 insertions(+), 2 deletions(-)
>  create mode 100644 lib/keepalive.c
>  create mode 100644 lib/keepalive.h
>
> --
> 2.4.11
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
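[Editor's note: the cover letter above describes the mechanism only in prose - PMD threads dispatch heartbeats at keepalive-interval, and a monitor derives ALIVE/DEAD/GONE states from how stale each heartbeat is. A minimal, self-contained sketch of that scheme follows; every name and threshold here is hypothetical for illustration and is not the patch's actual API.]

```python
import threading
import time

KA_INTERVAL = 0.1  # seconds; stands in for keepalive-interval (default 5000 ms)

class Keepalive:
    """Toy heartbeat registry: each registered thread stamps a timestamp on
    every iteration of its busy loop; a monitor thread classifies threads by
    the age of their last stamp (thresholds here are illustrative)."""

    def __init__(self):
        self._last_seen = {}
        self._lock = threading.Lock()

    def register(self, name):
        # Called once when a PMD-like thread joins the framework.
        with self._lock:
            self._last_seen[name] = time.monotonic()

    def heartbeat(self, name):
        # Called from the thread's main loop; cheap, no syscalls needed.
        with self._lock:
            self._last_seen[name] = time.monotonic()

    def states(self):
        # Called periodically by the monitor thread.
        now = time.monotonic()
        with self._lock:
            result = {}
            for name, seen in self._last_seen.items():
                age = now - seen
                if age < KA_INTERVAL:
                    result[name] = "ALIVE"
                elif age < 4 * KA_INTERVAL:
                    result[name] = "DEAD"   # missed a few heartbeats
                else:
                    result[name] = "GONE"   # presumed stuck or descheduled
            return result
```

A blocked thread simply stops calling `heartbeat()`, so its entry ages through DEAD into GONE, matching the progression Antonio observed with `keepalive/pmd-health-show`.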
Hi,

Sorry to jump in on this at v6 only, but I skimmed over the code and I am struggling to understand what problem you're trying to solve. Yes, I realize you want some sort of feedback about the PMD processing, but it's not clear to me what exactly you want from it.

This last patchset uses a separate thread just to monitor the PMD threads, which update their status in the core busy loop. I guess it tells you whether the PMD thread is stuck or not, but not really whether it's processing packets. That is, again, my question above.

If you need to know whether the thread is running, I think any OS can provide you the process stats, which should be more reliable and doesn't depend on OVS at all.

I would appreciate it if you could elaborate more on the use case.

Thanks,
fbl

On Fri, Dec 08, 2017 at 12:04:19PM +0000, Bhanuprakash Bodireddy wrote:
> The keepalive feature is aimed at achieving Fastpath Service Assurance
> in OVS-DPDK deployments. It adds support for monitoring the packet
> processing threads by dispatching heartbeats at regular intervals.
> [... rest of cover letter quoted in full, trimmed ...]
> Hi,
>
> Sorry to jump in on this at v6 only, but I skimmed over the code and I am
> struggling to understand what problem you're trying to solve. Yes, I realize
> you want some sort of feedback about the PMD processing, but it's not clear
> to me what exactly you want from it.
>
> This last patchset uses a separate thread just to monitor the PMD threads
> which can update their status in the core busy loop. I guess it tells you
> if the PMD thread is stuck or not, but not really if it's processing
> packets. That's, again, my question above.
>
> If you need to know if the thread is running, I think any OS can provide
> you the process stats which should be more reliable and doesn't depend on
> OVS at all.
>
> I would appreciate it if you could elaborate more on the use case.

The Intel SA team has been working on an SA Framework for NFV environments and has defined interfaces for the base platform (aligned with ETSI GS NFV 002), which includes compute, storage, network, virtual switch, OS and hypervisor. The core idea here is to monitor and detect service-impacting faults on the base platform. Both reactive and proactive fault detection techniques are employed, and faults are reported to higher-level layers for corrective actions. The corrective actions here can be, for example, migrating the workloads or marking the compute node offline, based on the policies enforced at the higher layers.

One aspect of the larger SA framework is monitoring virtual switch health. Some of the events of interest here are link status, OVSDB connection status, packet statistics (drops/errors), and PMD health.

This patch series has only implemented the *PMD health* monitoring and reporting mechanism, and the details are already in the patch. The other interesting virtual switch events are already implemented as part of a collectd plugin.

On your questions:

> I guess it tells you if the PMD thread is stuck or not, but not really
> if it's processing packets. That's, again, my question above.

The functionality to check whether the PMD is processing packets was implemented back in v3:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/336789.html

For easier review, the patch series was split up in v4 to get the basic functionality in first. This is mentioned in the version change log:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337702.html

> If you need to know if the thread is running, I think any OS can provide
> you the process stats which should be more reliable and doesn't depend on
> OVS at all.

There is a problem here, and I did simulate a case showing that the stats reported by the OS aren't accurate, in the thread below:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338388.html

Check the details on /proc/[pid]/stat. Though the PMD thread is stalled, the OS reports the thread in the *Running (R)* state.

- Bhanuprakash.

> On Fri, Dec 08, 2017 at 12:04:19PM +0000, Bhanuprakash Bodireddy wrote:
>> The keepalive feature is aimed at achieving Fastpath Service Assurance in
>> OVS-DPDK deployments. It adds support for monitoring the packet
>> processing threads by dispatching heartbeats at regular intervals.
>> [... rest of cover letter quoted in full, trimmed ...]
>
> --
> Flavio
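[Editor's note: Bhanuprakash's point above is that the one-character state field of /proc/[pid]/stat is not a liveness signal - a thread preempted by a higher-priority RT task is still *runnable*, so it reports 'R' while making no forward progress. A small self-contained reader of that field, for illustration only:]

```python
import os

def thread_state(pid, tid):
    """Return the state character (3rd field) of /proc/<pid>/task/<tid>/stat.

    'R' means runnable/on the run queue -- it does NOT mean the thread is
    actually getting CPU time or making progress."""
    with open(f"/proc/{pid}/task/{tid}/stat") as f:
        data = f.read()
    # Parse after the closing ')' so spaces in the comm field can't shift
    # the field positions.
    return data[data.rindex(")") + 2:].split()[0]

# A PMD thread starved by an RT 'stress' thread pinned to its core would
# still report 'R' here -- which is exactly why the state field alone
# cannot detect the stalls this patch series targets.
```

This is why the keepalive series relies on heartbeats from inside the busy loop rather than on the scheduler's view of the thread. (Linux-specific; requires procfs.)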
On 12/08/2017 12:04 PM, Bhanuprakash Bodireddy wrote:
> The keepalive feature is aimed at achieving Fastpath Service Assurance
> in OVS-DPDK deployments. It adds support for monitoring the packet
> processing threads by dispatching heartbeats at regular intervals.

Hi Bhanu,

I guess there are comments elsewhere about the need for this feature, but looking at this solution I had a few comments on it. Not a full review.

This creates a dependency on the internal OVS threading model from collectd/ceilometer. That seems like something that could cause problems in the future. What if someone wanted to put multiple PMDs on a core, or put a PMD into interrupt mode? Would that be blocked in OVS because it would break collectd/ceilometer? Or how would the projects coordinate?

I see there is a sleep state, so maybe that could be used for things like an interrupt mode etc., but please consider that OVS may not always spin the PMDs in the manner it does now, and at least allow for a "don't monitor this PMD" type option, so that OVS is not restricted by trying to please collectd/ceilometer.

It also seems strange that collectd/ceilometer would be monitoring PMD health but be OK with PMDs appearing and disappearing due to internal OVS rxq assignment.

> [... rest of cover letter quoted in full, trimmed ...]
"Bodireddy, Bhanuprakash" <bhanuprakash.bodireddy@intel.com> writes: >>Hi, >> >>Sorry to jump on this at v6 only, but I skimmed over the code and I am >>struggling to understand what problem you're trying to solve. Yes, I realize >>you want some sort of feedback about the PMD processing, but it's not clear >>to me what exactly you want from it. >> >>This last patchset uses a separate thread just to monitor the PMD threads >>which can update their status in the core busy loop. I guess it tells you if the >>PMD thread is stuck or not, but not really if it's processing packets. That's >>again, my question above. >> >>If you need to know if the thread is running, I think any OVS can provide you >>the process stats which should be more reliable and doesn't depend on OVS >>at all. >> >>I appreciate if you could elaborate more on the use-case. > > Intel SA team has been working on SA Framework for NFV environment and > has defined interfaces > for the base platform(aligned with ETSI GS NFV 002) which includes > compute, storage, NW, virtual switch, OS and hypervisor. > The core idea here is to monitor and detect the service impacting > faults on the Base platform. > Both reactive and pro-active fault detection techniques are employed > and faults are reported to > higher level layers for corrective actions. The corrective actions for > example here can be migrating > the workloads, marking the compute offline and is based on the > policies enforced at higher layers. > > One aspect of larger SA framework is monitoring virtual switch > health. Some of the events of interest here > are link status, OvS DB connection status, packet > statistics(drops/errors), PMD health. > > This patch series has only implemented *PMD health* monitoring and > reporting mechanism and the details are > already in the patch. The other interesting events of virtual switch > are already implemented as part of collectd plugin. 
> > On your questions: > >> I guess it tells you if the PMD thread is stuck or not, but not >> really if it's processing packets. That's >>again, my question above. > > The functionality to check if the PMD is processing the packets was > implemented way back in v3. > https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/336789.html > > For easier review, the patch series was split up in v4 to get the > basic functionality in. This is mentioned in version change log below. > https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337702.html > >>If you need to know if the thread is running, I think any OVS can provide you >>the process stats which should be more reliable and doesn't depend on OVS >>at all. > > There is a problem here and I did simulate the case to show that the > stats reported by OS aren't accurate in the below thread. > https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338388.html > > Check the details on /proc/[pid]/stats. Though the PMD thread is > stalled, OS reports the thread as *Running (R)* state. Just responding to this point, and I assumed you would follow up to my statement when I wrote: That said, I suspect that when those times were increasing, they really were getting some system / user time. Was your worker thread occasionally making system calls (or even doing sched_yield()?) In those cases, it likely was letting the pmd thread move on. You are observing that the thread is still running, and still getting time. Your experiment merely shows that the scheduler works as intended. If you clocked the percentage (as I just did), you'll see that it balances. ex, when I run two RT threads on the same core as a PMD thread (and use a simple 'top' to show the data): 5905 root 10 -10 1996516 141900 11128 R 60.1 0.9 3:09.08 pmd20 6361 aconole 20 0 7896 104 0 R 19.9 0.0 1:03.46 stress 6362 aconole 20 0 7896 104 0 R 19.9 0.0 1:06.95 stress Note the CPU utilization percentage. 
When I stop these threads: 5905 root 10 -10 1996516 141900 11128 R 99.7 0.9 4:22.77 pmd20 That means, if you simply calculate the time taken by the thread as a percentage of the time available to take, you'll detect stalls by seeing if the percentage drops below, say, 90%. This can still be done using procfs. I reject the assertion that the kernel doesn't account for process time properly. That would be a scheduler bug or an accounting bug. I disagree with your statement: "The other fields related to time in stats file can't be completely relied due to below test case." Did I misunderstand your assertion and test? > - Bhanuprakash. > >> >> >>On Fri, Dec 08, 2017 at 12:04:19PM +0000, Bhanuprakash Bodireddy wrote: >>> Keepalive feature is aimed at achieving Fastpath Service Assurance in >>> OVS-DPDK deployments. It adds support for monitoring the packet >>> processing threads by dispatching heartbeats at regular intervals. >>> >>> keepalive feature can be enabled through below OVSDB settings. >>> >>> enable-keepalive=true >>> - Keepalive feature is disabled by default and should be enabled >>> at startup before ovs-vswitchd daemon is started. >>> >>> keepalive-interval="5000" >>> - Timer interval in milliseconds for monitoring the packet >>> processing cores. >>> >>> TESTING: >>> The testing of keepalive is done using stress cmd (simulating the stalls). >>> - pmd-cpu-mask=0xf [MQ enabled on DPDK ports] >>> - stress -c 1 & [tid is usually the __tid + 1 of the output] >>> - chrt -r -p 99 <tid> [set realtime priority for stress thread] >>> - taskset -p 0x8 <tid> [Pin the stress thread to the core PMD is running] >>> - PMD thread will be descheduled due to its normal priority and yields >>> core to stress thread. 
>>> >>> - ovs-appctl keepalive/pmd-health-show [Display that the thread is >>GONE] >>> - ./ovsdb/ovsdb-client monitor Open_vSwitch [Should update the >>> status] >>> >>> - taskset -p 0x10 <tid> [This brings back pmd thread to life as stress >>thread >>> is moved to idle core] >>> >>> (watch out for stress threads, and carefully pin them to core not to hang >>your DUTs >>> during tesing). >>> >>> v5 -> v6 >>> * Remove 2 patches from series >>> - xnanosleep was applied to master as part of high resolution timeout >>support. >>> - Extend get_process_info() API was also applied to master earlier. >>> * Remove KA_STATE_DOZING as it was initially meant to handle Core C >>states, not needed >>> for now. >>> * Fixed ka_destroy(), to fix unit test cases 536, 537. >>> * A minor performance degradation(0.5%) is observed with Keepalive >>enabled. >>> [Tested with loopback case using 1000 IXIA streams/64 byte udp pkts and >>> 1 PMD thread(9.239 vs 9.177Mpps) at 10ms ka-interval timeout] >>> * Verified with sparse, MSVC compilers(appveyor). >>> >>> v4 -> v5 >>> * Add 3 more patches to the series >>> - xnanosleep() >>> - Documentation >>> - Update to NEWS >>> * Remove all references to core_id and instead implemented thread based >>tracking. >>> * Addressed most of the comments in v4. >>> >>> v3 -> v4 >>> * Split the functionality in to 2 parts. This patch series only updates >>> PMD status to OVSDB. The incremental patch series to handle false >>positives, >>> negatives and more checking and stats. >>> * Remove code from netdev layer and dependency on rte_keepalive lib. >>> * Merged few patches and simplified the patch series. >>> * Timestamp in human readable form. >>> >>> v2 -> v3 >>> * Rebase. >>> * Verified with dpdk-stable-17.05.1 release. >>> * Fixed build issues with MSVC and cross checked with appveyor. >>> >>> v1 -> v2 >>> * Rebase >>> * Drop 01/20 Patch "Consolidate process related APIs" of V1 as it >>> is already applied as separate patch. 
>>> >>> RFCv3 -> v1 >>> * Made changes to fix failures in some unit test cases. >>> * some more code cleanup w.r.t process related APIs. >>> >>> RFCv2 -> RFCv3 >>> * Remove POSIX shared memory block implementation (suggested by >>Aaron). >>> * Rework the logic to register and track threads instead of cores. This way >>> in the future any thread can be registered to KA framework. For now only >>PMD >>> threads are tracked (suggested by Aaron). >>> * Refactor few APIs and further clean up the code. >>> >>> RFCv1 -> RFCv2 >>> * Merged the xml and schema commits to later commit where the actual >>> implementation is done(suggested by Ben). >>> * Fix ovs-appctl keepalive/* hang issue when KA disabled. >>> * Fixed memory leaks with appctl commands for keepalive/pmd-health- >>show, >>> pmd-xstats-show. >>> * Refactor code and fixed APIs dealing with PMD health monitoring. >>> >>> >>> Bhanuprakash Bodireddy (8): >>> Keepalive: Add initial keepalive configuration. >>> dpif-netdev: Register packet processing cores to KA framework. >>> dpif-netdev: Enable heartbeats for DPDK datapath. >>> keepalive: Retrieve PMD status periodically. >>> bridge: Update keepalive status in OVSDB. >>> keepalive: Add support to query keepalive status and statistics. >>> Documentation: Update DPDK doc with Keepalive feature. >>> NEWS: Add keepalive support information in NEWS. 
>>> >>> Documentation/howto/dpdk.rst | 112 +++++++++ >>> NEWS | 2 + >>> lib/automake.mk | 2 + >>> lib/dpif-netdev.c | 92 ++++++++ >>> lib/keepalive.c | 552 >>+++++++++++++++++++++++++++++++++++++++++++ >>> lib/keepalive.h | 109 +++++++++ >>> lib/ovs-thread.c | 6 + >>> lib/ovs-thread.h | 1 + >>> lib/util.c | 22 ++ >>> lib/util.h | 1 + >>> vswitchd/bridge.c | 29 +++ >>> vswitchd/vswitch.ovsschema | 8 +- >>> vswitchd/vswitch.xml | 49 ++++ >>> 13 files changed, 983 insertions(+), 2 deletions(-) create mode >>> 100644 lib/keepalive.c create mode 100644 lib/keepalive.h >>> >>> -- >>> 2.4.11 >>> >>> _______________________________________________ >>> dev mailing list >>> dev@openvswitch.org >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >>-- >>Flavio > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
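[Editor's note: Aaron's procfs-based stall check - sample a thread's cumulative utime + stime twice and express the delta as a fraction of wall-clock time - can be sketched concretely. This is an illustrative sketch, not code from the patch series; the 90% threshold is the example figure Aaron gives.]

```python
import os
import time

CLK_TCK = os.sysconf("SC_CLK_TCK")  # clock ticks per second (usually 100)

def thread_cpu_ticks(pid, tid):
    """Cumulative utime + stime (in clock ticks) for one thread, from
    fields 14 and 15 of /proc/<pid>/task/<tid>/stat."""
    with open(f"/proc/{pid}/task/{tid}/stat") as f:
        data = f.read()
    # Split after the closing ')' so spaces in the comm field can't shift
    # positions; field N overall lands at index N - 3 of this split.
    fields = data[data.rindex(")") + 2:].split()
    utime, stime = int(fields[11]), int(fields[12])
    return utime + stime

def cpu_utilization(pid, tid, window=1.0):
    """Fraction of wall-clock time the thread spent on-CPU over `window`
    seconds. A healthy busy-polling PMD thread should sit near 1.0; a
    preempted/stalled one drops well below."""
    t0, c0 = time.monotonic(), thread_cpu_ticks(pid, tid)
    time.sleep(window)
    t1, c1 = time.monotonic(), thread_cpu_ticks(pid, tid)
    return (c1 - c0) / CLK_TCK / (t1 - t0)

def is_stalled(pid, tid, threshold=0.9, window=1.0):
    # Aaron's suggested check: flag a stall when utilization drops below
    # ~90%, even though the state field still reads 'R'.
    return cpu_utilization(pid, tid, window) < threshold
```

This is the external-monitoring alternative Aaron argues for: it needs no hooks inside OVS, only procfs, though it cannot distinguish "on-CPU but spinning uselessly" from "on-CPU and forwarding packets" - the gap the heartbeat approach (and the deferred v3 packet-processing checks) aim to close. (Linux-specific.)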