diff mbox series

[ovs-dev,v6,7/8] Documentation: Update DPDK doc with Keepalive feature.

Message ID 1512734667-23202-8-git-send-email-bhanuprakash.bodireddy@intel.com
State Deferred
Headers show
Series Add OVS DPDK keep-alive functionality. | expand

Commit Message

Bodireddy, Bhanuprakash Dec. 8, 2017, 12:04 p.m. UTC
Keepalive feature is aimed at achieving Fastpath Service Assurance
in OVS-DPDK deployments. It adds support for monitoring the packet
processing threads by dispatching heartbeats at regular intervals.

The implementation uses OvSDB for reporting the health of the PMD threads.
Any external monitoring application can query the OvSDB for status
at regular intervals (or) subscribe to OvSDB updates.

keepalive feature can be enabled through below OVSDB settings.

    enable-keepalive=true
      - Keepalive feature is disabled by default and should be enabled
        at startup before ovs-vswitchd daemon is started.

    keepalive-interval="5000"
      - Timer interval in milliseconds for monitoring the packet
        processing cores.

When KA is enabled, 'ovs-keepalive' thread shall be spawned that wakes
up at regular intervals to update the timestamp and status of pmd
threads in process map. This information shall be read by vswitchd thread
and written in to 'keepalive' column of Open_vSwitch table in OVSDB.

An external monitoring framework like collectd with ovs events support
can read (or) subscribe to the datapath status changes in ovsdb. When the state
is updated, the collectd shall be notified and will eventually relay the status
to ceilometer service running in the controller. Below is the high level
overview of deployment model.

        Compute Node            Controller            Compute Node

        Collectd  <----------> Ceilometer <-------->   Collectd

        OvS DPDK                                       OvS DPDK

        +-----+
        | VM  |
        +--+--+
       \---+---/
           |
        +--+---+       +------------+----------+     +------+-------+
        | OVS  |-----> |   ovsevents plugin    | --> |   collectd   |
        +--+---+       +------------+----------+     +------+-------+

        +------+-----+     +---------------+------------+     |
        | Ceilometer | <-- | collectd ceilometer plugin |  <---
        +------+-----+     +---------------+------------+

Performance impact:
  No noticeable performance or latency impact is observed with
  KA feature enabled.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
---
 Documentation/howto/dpdk.rst | 112 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 112 insertions(+)

Comments

Fischetti, Antonio Dec. 15, 2017, 2:26 p.m. UTC | #1
LGTM,

Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>


> -----Original Message-----
> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> bounces@openvswitch.org] On Behalf Of Bhanuprakash Bodireddy
> Sent: Friday, December 8, 2017 12:04 PM
> To: dev@openvswitch.org
> Subject: [ovs-dev] [PATCH v6 7/8] Documentation: Update DPDK doc with
> Keepalive feature.
> 
> Keepalive feature is aimed at achieving Fastpath Service Assurance
> in OVS-DPDK deployments. It adds support for monitoring the packet
> processing threads by dispatching heartbeats at regular intervals.
> 
> The implementation uses OvSDB for reporting the health of the PMD
> threads.
> Any external monitoring application can query the OvSDB for status
> at regular intervals (or) subscribe to OvSDB updates.
> 
> keepalive feature can be enabled through below OVSDB settings.
> 
>     enable-keepalive=true
>       - Keepalive feature is disabled by default and should be enabled
>         at startup before ovs-vswitchd daemon is started.
> 
>     keepalive-interval="5000"
>       - Timer interval in milliseconds for monitoring the packet
>         processing cores.
> 
> When KA is enabled, 'ovs-keepalive' thread shall be spawned that wakes
> up at regular intervals to update the timestamp and status of pmd
> threads in process map. This information shall be read by vswitchd
> thread
> and written in to 'keepalive' column of Open_vSwitch table in OVSDB.
> 
> An external monitoring framework like collectd with ovs events support
> can read (or) subscribe to the datapath status changes in ovsdb. When
> the state
> is updated, the collectd shall be notified and will eventually relay the
> status
> to ceilometer service running in the controller. Below is the high level
> overview of deployment model.
> 
>         Compute Node            Controller            Compute Node
> 
>         Collectd  <----------> Ceilometer <-------->   Collectd
> 
>         OvS DPDK                                       OvS DPDK
> 
>         +-----+
>         | VM  |
>         +--+--+
>        \---+---/
>            |
>         +--+---+       +------------+----------+     +------+-------+
>         | OVS  |-----> |   ovsevents plugin    | --> |   collectd   |
>         +--+---+       +------------+----------+     +------+-------+
> 
>         +------+-----+     +---------------+------------+     |
>         | Ceilometer | <-- | collectd ceilometer plugin |  <---
>         +------+-----+     +---------------+------------+
> 
> Performance impact:
>   No noticeable performance or latency impact is observed with
>   KA feature enabled.
> 
> Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
> ---
>  Documentation/howto/dpdk.rst | 112
> +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 112 insertions(+)
> 
> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
> index d123819..e7a2b27 100644
> --- a/Documentation/howto/dpdk.rst
> +++ b/Documentation/howto/dpdk.rst
> @@ -439,6 +439,118 @@ For certain traffic profiles with many parallel
> flows, it's recommended to set
> 
>  For more information on the EMC refer to :doc:`/intro/install/dpdk` .
> 
> +.. _dpdk_keepalive:
> +
> +Keepalive
> +---------
> +
> +OvS Keepalive(KA) feature is disabled by default. To enable KA
> feature::
> +
> +    $ ovs-vsctl --no-wait set Open_vSwitch . other_config:enable-
> keepalive=true
> +
> +The KA feature can't be enabled at run time and should be done at
> startup
> +before ovs-vswitchd daemon is started.
> +
> +The default timer interval for monitoring packet processing threads is
> 1000ms.
> +To set a different timer value, run::
> +
> +    $ ovs-vsctl --no-wait set Open_vSwitch . \
> +        other_config:keepalive-interval="5000"
> +
> +The events comprise of thread states and the last seen timestamps. The
> events
> +are written in to process map periodically by keepalive thread.
> +
> +The events in the process map are retrieved by main(vswitchd) thread
> and
> +updated in to keepalive column of Open_vSwitch table in OVSDB. Any
> external
> +monitoring application can read the status from OVSDB at intervals or
> subscribe
> +to the updates so that they get notified when the changes happen on
> OvSDB.
> +
> +To monitor the datapath status using ovsdb-client, run::
> +
> +    $ ovsdb-client monitor Open_vSwitch
> +    $ ovsdb-client monitor Open_vSwitch Open_vSwitch keepalive
> +
> +The datapath thread states are explained below::
> +
> +      KA_STATE_UNUSED  - Not registered to KA framework.
> +      KA_STATE_ALIVE   - Thread alive.
> +      KA_STATE_MISSING - Thread missed first heartbeat.
> +      KA_STATE_DEAD    - Thread missed two heartbeats.
> +      KA_STATE_GONE    - Thread missed two or more heartbeats and
> burried.
> +      KA_STATE_SLEEP   - Thread is sleeping.
> +
> +To query the datapath status, run::
> +
> +    $ ovs-appctl keepalive/pmd-health-show
> +
> +`collectd <https://collectd.org/>`__ has built-in support for DPDK and
> provides
> +a `ovs_events` and `ovs_stats` plugin that can be enabled to relay the
> datapath
> +status and the PMD status to OpenStack service `Ceilometer
> +<https://docs.openstack.org/developer/ceilometer/>`__.
> +
> +To install and configure `collectd`, run::
> +
> +    # Clone collectd from Git repository
> +    $ git clone https://github.com/collectd/collectd.git
> +
> +    # configure and install collectd
> +    $ cd collectd
> +    $ ./build.sh
> +    $ ./configure --enable-syslog --enable-logfile --with-libdpdk=/usr
> +    $ make
> +    $ make install
> +
> +`collectd` is installed in ``/opt/collectd`` by default. Edit the
> configuration
> +file in ``/opt/collectd/etc/collectd.conf`` to enable logfile,
> dpdkevents
> +and csv plugin::
> +
> +   LoadPlugin logfile
> +   <Plugin logfile>
> +       LogLevel debug
> +       File "/var/log/collectd/collectd.log"
> +       Timestamp true
> +       PrintSeverity false
> +   </Plugin>
> +
> +   <Plugin syslog>
> +       LogLevel info
> +   </Plugin>
> +
> +Enable `ovs_events` plugin and update the plugindetails as below::
> +
> +   LoadPlugin ovs_events
> +
> +   <Plugin ovs_events>
> +     Port "6640"
> +     Address "127.0.0.1"
> +     Socket "/usr/local/var/run/openvswitch/db.sock"
> +     SendNotification true
> +     DispatchValues false
> +   </Plugin>
> +
> +Enable `ovs_stats` plugin and update the plugindetails as below::
> +
> +   LoadPlugin ovs_stats
> +
> +   <Plugin ovs_stats>
> +     Port "6640"
> +     Address "127.0.0.1"
> +     Socket "/usr/local/var/run/openvswitch/db.sock"
> +     Bridges "br0"
> +   </Plugin>
> +
> +Enable ``csv`` plugin as below::
> +
> +   LoadPlugin csv
> +
> +   <Plugin csv>
> +       DataDir "/var/log/collectd/csv"
> +       StoreRates false
> +   </Plugin>
> +
> +With csv plugin enabled, *meter* or *gauge* file is created and
> timestamp
> +and thread status gets updated which are sent to ceilometer service.
> +
>  .. _dpdk-ovs-in-guest:
> 
>  OVS with DPDK Inside VMs
> --
> 2.4.11
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Aaron Conole Jan. 24, 2018, 8:36 p.m. UTC | #2
Hi Bhanu,

This time around I just reviewed the documentation.  I'll spend some
time on the other patches as I can, but I think at least the
documentation needs more work.

Comments inline.

Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> writes:

> Keepalive feature is aimed at achieving Fastpath Service Assurance
> in OVS-DPDK deployments. It adds support for monitoring the packet
> processing threads by dispatching heartbeats at regular intervals.

This belongs in the document.

> The implementation uses OvSDB for reporting the health of the PMD threads.

s/implementation/keepalive framework/

> Any external monitoring application can query the OvSDB for status
> at regular intervals (or) subscribe to OvSDB updates.

                                              ^ for

>
> keepalive feature can be enabled through below OVSDB settings.

 ^ The

>
>     enable-keepalive=true
>       - Keepalive feature is disabled by default and should be enabled
>         at startup before ovs-vswitchd daemon is started.
>
>     keepalive-interval="5000"
>       - Timer interval in milliseconds for monitoring the packet
>         processing cores.
>
> When KA is enabled, 'ovs-keepalive' thread shall be spawned that wakes
> up at regular intervals to update the timestamp and status of pmd
> threads in process map. This information shall be read by vswitchd thread

s/shall/will/

> and written in to 'keepalive' column of Open_vSwitch table in OVSDB.
>
> An external monitoring framework like collectd with ovs events support

s/like collectd/(such as collectd)/

> can read (or) subscribe to the datapath status changes in ovsdb. When the state

s/(or)/or/

Everything after 'When the state is updated' is specific to
collectd/ceilometer.  Should it go into a separate document?

> is updated, the collectd shall be notified and will eventually relay the status
> to ceilometer service running in the controller. Below is the high level
> overview of deployment model.
>
>         Compute Node            Controller            Compute Node
>
>         Collectd  <----------> Ceilometer <-------->   Collectd
>
>         OvS DPDK                                       OvS DPDK
>
>         +-----+
>         | VM  |
>         +--+--+
>        \---+---/
>            |
>         +--+---+       +------------+----------+     +------+-------+
>         | OVS  |-----> |   ovsevents plugin    | --> |   collectd   |
>         +--+---+       +------------+----------+     +------+-------+
>
>         +------+-----+     +---------------+------------+     |
>         | Ceilometer | <-- | collectd ceilometer plugin |  <---
>         +------+-----+     +---------------+------------+
>
> Performance impact:
>   No noticeable performance or latency impact is observed with
>   KA feature enabled.
>
> Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
> ---

A good chunk of this commit message should be in the documentation.
Either as a duplicate, or just moved there.  While commit messages are
meant to describe the 'why' of a change, this commit merely adds
documentation for the keepalive feature.  The actual description of the
feature should live on in the documentation, so that when a user is
trying to figure out what it is, they can read it.

I hope this feature isn't exclusively for ceilometer / collectd.

>  Documentation/howto/dpdk.rst | 112 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 112 insertions(+)

Is the dpdk documentation the correct place for this?  Will this
framework ever be applied to more than just PMDs?  Isn't that a goal of
monitoring the health of various tasks?

> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
> index d123819..e7a2b27 100644
> --- a/Documentation/howto/dpdk.rst
> +++ b/Documentation/howto/dpdk.rst
> @@ -439,6 +439,118 @@ For certain traffic profiles with many parallel flows, it's recommended to set
>  
>  For more information on the EMC refer to :doc:`/intro/install/dpdk` .
>  
> +.. _dpdk_keepalive:
> +
> +Keepalive
> +---------
> +

This needs a bit more description.  I read this section, and I don't
know what the Keepalive(KA) feature is - is it merely a reporting
framework for the last time threads 'checked in'?  Does it take any
actions?  What is a user meant to do with this data?  Are there cases
where this framework gives false positives / negatives?  Downsides to
enabling it?  Alternatives?

Some of that is explained in the commit message - but it really belongs
here.

> +OvS Keepalive(KA) feature is disabled by default. To enable KA feature::
> +
> +    $ ovs-vsctl --no-wait set Open_vSwitch . other_config:enable-keepalive=true
> +
> +The KA feature can't be enabled at run time and should be done at startup
> +before ovs-vswitchd daemon is started.
> +
> +The default timer interval for monitoring packet processing threads is 1000ms.
> +To set a different timer value, run::
> +
> +    $ ovs-vsctl --no-wait set Open_vSwitch . \
> +        other_config:keepalive-interval="5000"
> +

Can this be changed at run-time?  Such information should be included.

> +The events comprise of thread states and the last seen timestamps. The events
> +are written in to process map periodically by keepalive thread.
> +
> +The events in the process map are retrieved by main(vswitchd) thread and
> +updated in to keepalive column of Open_vSwitch table in OVSDB. Any external
> +monitoring application can read the status from OVSDB at intervals or subscribe
> +to the updates so that they get notified when the changes happen on OvSDB.
> +
> +To monitor the datapath status using ovsdb-client, run::

Will this monitor revalidator threads?  RCU related threads?  Is it only
ever datapath status?  Maybe should read 'To monitor the keepalive status'.

> +    $ ovsdb-client monitor Open_vSwitch
> +    $ ovsdb-client monitor Open_vSwitch Open_vSwitch keepalive
> +
> +The datapath thread states are explained below::

s/datapath/keepalive/

> +
> +      KA_STATE_UNUSED  - Not registered to KA framework.

Maybe "KA_STATE_NONE"?  Or better yet - why should we have an enum for
this?  After all... the thread isn't tracked?

> +      KA_STATE_ALIVE   - Thread alive.

--
> +      KA_STATE_MISSING - Thread missed first heartbeat.
> +      KA_STATE_DEAD    - Thread missed two heartbeats.
> +      KA_STATE_GONE    - Thread missed two or more heartbeats and burried.
--

Is it useful to distinguish between 'missed one', 'missed two', and
'missed two or more?'  Further, what does 'burried' mean?  I hope we
don't have to dig a thread out from the under a vast field of
permafrost.

Maybe it's better to have a state for MISSED, and show the missed
counter.

> +      KA_STATE_SLEEP   - Thread is sleeping.
> +
> +To query the datapath status, run::
> +
> +    $ ovs-appctl keepalive/pmd-health-show
> +

The following section seems to be more about installation and
configuration of collectd.  Perhaps that should be split into a
different document?

> +`collectd <https://collectd.org/>`__ has built-in support for DPDK and provides
> +a `ovs_events` and `ovs_stats` plugin that can be enabled to relay the datapath
> +status and the PMD status to OpenStack service `Ceilometer
> +<https://docs.openstack.org/developer/ceilometer/>`__.
> +
> +To install and configure `collectd`, run::
> +
> +    # Clone collectd from Git repository
> +    $ git clone https://github.com/collectd/collectd.git
> +
> +    # configure and install collectd
> +    $ cd collectd
> +    $ ./build.sh
> +    $ ./configure --enable-syslog --enable-logfile --with-libdpdk=/usr
> +    $ make
> +    $ make install
> +
> +`collectd` is installed in ``/opt/collectd`` by default. Edit the configuration
> +file in ``/opt/collectd/etc/collectd.conf`` to enable logfile, dpdkevents
> +and csv plugin::
> +
> +   LoadPlugin logfile
> +   <Plugin logfile>
> +       LogLevel debug
> +       File "/var/log/collectd/collectd.log"
> +       Timestamp true
> +       PrintSeverity false
> +   </Plugin>
> +
> +   <Plugin syslog>
> +       LogLevel info
> +   </Plugin>
> +
> +Enable `ovs_events` plugin and update the plugindetails as below::
> +
> +   LoadPlugin ovs_events
> +
> +   <Plugin ovs_events>
> +     Port "6640"
> +     Address "127.0.0.1"
> +     Socket "/usr/local/var/run/openvswitch/db.sock"
> +     SendNotification true
> +     DispatchValues false
> +   </Plugin>
> +
> +Enable `ovs_stats` plugin and update the plugindetails as below::
> +
> +   LoadPlugin ovs_stats
> +
> +   <Plugin ovs_stats>
> +     Port "6640"
> +     Address "127.0.0.1"
> +     Socket "/usr/local/var/run/openvswitch/db.sock"
> +     Bridges "br0"
> +   </Plugin>
> +
> +Enable ``csv`` plugin as below::
> +
> +   LoadPlugin csv
> +
> +   <Plugin csv>
> +       DataDir "/var/log/collectd/csv"
> +       StoreRates false
> +   </Plugin>
> +
> +With csv plugin enabled, *meter* or *gauge* file is created and timestamp
> +and thread status gets updated which are sent to ceilometer service.
> +
 .. _dpdk-ovs-in-guest:
 
 OVS with DPDK Inside VMs
diff mbox series

Patch

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index d123819..e7a2b27 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -439,6 +439,118 @@  For certain traffic profiles with many parallel flows, it's recommended to set
 
 For more information on the EMC refer to :doc:`/intro/install/dpdk` .
 
+.. _dpdk_keepalive:
+
+Keepalive
+---------
+
+OvS Keepalive(KA) feature is disabled by default. To enable KA feature::
+
+    $ ovs-vsctl --no-wait set Open_vSwitch . other_config:enable-keepalive=true
+
+The KA feature can't be enabled at run time and should be done at startup
+before ovs-vswitchd daemon is started.
+
+The default timer interval for monitoring packet processing threads is 1000ms.
+To set a different timer value, run::
+
+    $ ovs-vsctl --no-wait set Open_vSwitch . \
+        other_config:keepalive-interval="5000"
+
+The events comprise of thread states and the last seen timestamps. The events
+are written in to process map periodically by keepalive thread.
+
+The events in the process map are retrieved by main(vswitchd) thread and
+updated in to keepalive column of Open_vSwitch table in OVSDB. Any external
+monitoring application can read the status from OVSDB at intervals or subscribe
+to the updates so that they get notified when the changes happen on OvSDB.
+
+To monitor the datapath status using ovsdb-client, run::
+
+    $ ovsdb-client monitor Open_vSwitch
+    $ ovsdb-client monitor Open_vSwitch Open_vSwitch keepalive
+
+The datapath thread states are explained below::
+
+      KA_STATE_UNUSED  - Not registered to KA framework.
+      KA_STATE_ALIVE   - Thread alive.
+      KA_STATE_MISSING - Thread missed first heartbeat.
+      KA_STATE_DEAD    - Thread missed two heartbeats.
+      KA_STATE_GONE    - Thread missed two or more heartbeats and burried.
+      KA_STATE_SLEEP   - Thread is sleeping.
+
+To query the datapath status, run::
+
+    $ ovs-appctl keepalive/pmd-health-show
+
+`collectd <https://collectd.org/>`__ has built-in support for DPDK and provides
+a `ovs_events` and `ovs_stats` plugin that can be enabled to relay the datapath
+status and the PMD status to OpenStack service `Ceilometer
+<https://docs.openstack.org/developer/ceilometer/>`__.
+
+To install and configure `collectd`, run::
+
+    # Clone collectd from Git repository
+    $ git clone https://github.com/collectd/collectd.git
+
+    # configure and install collectd
+    $ cd collectd
+    $ ./build.sh
+    $ ./configure --enable-syslog --enable-logfile --with-libdpdk=/usr
+    $ make
+    $ make install
+
+`collectd` is installed in ``/opt/collectd`` by default. Edit the configuration
+file in ``/opt/collectd/etc/collectd.conf`` to enable logfile, dpdkevents
+and csv plugin::
+
+   LoadPlugin logfile
+   <Plugin logfile>
+       LogLevel debug
+       File "/var/log/collectd/collectd.log"
+       Timestamp true
+       PrintSeverity false
+   </Plugin>
+
+   <Plugin syslog>
+       LogLevel info
+   </Plugin>
+
+Enable `ovs_events` plugin and update the plugindetails as below::
+
+   LoadPlugin ovs_events
+
+   <Plugin ovs_events>
+     Port "6640"
+     Address "127.0.0.1"
+     Socket "/usr/local/var/run/openvswitch/db.sock"
+     SendNotification true
+     DispatchValues false
+   </Plugin>
+
+Enable `ovs_stats` plugin and update the plugindetails as below::
+
+   LoadPlugin ovs_stats
+
+   <Plugin ovs_stats>
+     Port "6640"
+     Address "127.0.0.1"
+     Socket "/usr/local/var/run/openvswitch/db.sock"
+     Bridges "br0"
+   </Plugin>
+
+Enable ``csv`` plugin as below::
+
+   LoadPlugin csv
+
+   <Plugin csv>
+       DataDir "/var/log/collectd/csv"
+       StoreRates false
+   </Plugin>
+
+With csv plugin enabled, *meter* or *gauge* file is created and timestamp
+and thread status gets updated which are sent to ceilometer service.
+
 .. _dpdk-ovs-in-guest:
 
 OVS with DPDK Inside VMs