diff mbox series

[ovs-dev,3/5] Documentation: add USDT documentation and bpftrace example

Message ID 163949555563.1639090.16031273910422675537.stgit@ebuild
State Superseded
Headers show
Series adding USDT points to OVS | expand

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot-_Build_and_Test success github build: passed

Commit Message

Eelco Chaudron Dec. 14, 2021, 3:26 p.m. UTC
Add the USDT documentation and a bpftrace example using the
bridge run USDT probes.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
 Documentation/automake.mk             |    1 
 Documentation/topics/index.rst        |    1 
 Documentation/topics/usdt-probes.rst  |  267 +++++++++++++++++++++++++++++++++
 utilities/automake.mk                 |   13 +-
 utilities/usdt_scripts/bridge_loop.bt |  120 +++++++++++++++
 5 files changed, 396 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/topics/usdt-probes.rst
 create mode 100755 utilities/usdt_scripts/bridge_loop.bt

Comments

0-day Robot Dec. 14, 2021, 3:42 p.m. UTC | #1
Bleep bloop.  Greetings Eelco Chaudron, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 108 characters long (recommended limit is 79)
#310 FILE: Documentation/topics/usdt-probes.rst:264:
.. _perf : https://developers.redhat.com/blog/2020/05/29/debugging-vhost-user-tx-contention-in-open-vswitch#

Lines checked: 466, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Paolo Valerio Dec. 19, 2021, 11:45 p.m. UTC | #2
Eelco Chaudron <echaudro@redhat.com> writes:

> Add the USDT documentation and a bpftrace example using the
> bridge run USDT probes.
>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
>  Documentation/automake.mk             |    1 
>  Documentation/topics/index.rst        |    1 
>  Documentation/topics/usdt-probes.rst  |  267 +++++++++++++++++++++++++++++++++
>  utilities/automake.mk                 |   13 +-
>  utilities/usdt_scripts/bridge_loop.bt |  120 +++++++++++++++
>  5 files changed, 396 insertions(+), 6 deletions(-)
>  create mode 100644 Documentation/topics/usdt-probes.rst
>  create mode 100755 utilities/usdt_scripts/bridge_loop.bt
>
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index 137cc57c5..9b9c72cff 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -58,6 +58,7 @@ DOC_SOURCE = \
>  	Documentation/topics/record-replay.rst \
>  	Documentation/topics/tracing.rst \
>  	Documentation/topics/userspace-tso.rst \
> +	Documentation/topics/usdt-probes.rst \
>  	Documentation/topics/windows.rst \
>  	Documentation/howto/index.rst \
>  	Documentation/howto/dpdk.rst \
> diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst
> index d8ccbd757..7ca364340 100644
> --- a/Documentation/topics/index.rst
> +++ b/Documentation/topics/index.rst
> @@ -55,3 +55,4 @@ OVS
>     userspace-tso
>     idl-compound-indexes
>     ovs-extensions
> +   usdt-probes
> diff --git a/Documentation/topics/usdt-probes.rst b/Documentation/topics/usdt-probes.rst
> new file mode 100644
> index 000000000..38acf7d02
> --- /dev/null
> +++ b/Documentation/topics/usdt-probes.rst
> @@ -0,0 +1,267 @@
> +..
> +      Licensed under the Apache License, Version 2.0 (the "License"); you may
> +      not use this file except in compliance with the License. You may obtain
> +      a copy of the License at
> +
> +          http://www.apache.org/licenses/LICENSE-2.0
> +
> +      Unless required by applicable law or agreed to in writing, software
> +      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
> +      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
> +      License for the specific language governing permissions and limitations
> +      under the License.
> +
> +      Convention for heading levels in Open vSwitch documentation:
> +
> +      =======  Heading 0 (reserved for the title in a document)
> +      -------  Heading 1
> +      ~~~~~~~  Heading 2
> +      +++++++  Heading 3
> +      '''''''  Heading 4
> +
> +      Avoid deeper levels because they do not render well.
> +
> +=============================================
> +User Statically-Defined Tracing (USDT) probes
> +=============================================
> +
> +Sometimes it's desired to troubleshoot one of OVS's components in the field.
> +One of the techniques used is to add dynamic tracepoints, for example using
> +perf_. However, the desired dynamic tracepoint and/or the desired variable,
> +might not be available due to compiler optimizations.
> +
> +In this case, a well-thought-off, static tracepoint could be permanently added,
> +so it's always available. For OVS we use the DTrace probe macro's, which have
> +little to no overhead when disabled. Various tools exist to enable them. See
> +some examples below.
> +
> +
> +Compiling with USDT probes enabled
> +----------------------------------
> +
> +Since USDT probes are compiled out by default, a compile-time option is
> +available to include them. To add the probes to the generated code, use the
> +following configure option ::
> +
> +    $ ./configure --enable-usdt-probes
> +
> +The following line should be seen in the configure output when the above option
> +is used ::
> +
> +    checking whether USDT probes are enabled... yes
> +
> +
> +Listing available probes
> +------------------------
> +
> +There are various ways to display USDT probes available in a userspace
> +application. Here we show three examples. All assuming ovs-vswitchd is in the
> +search path with USDT probes enabled:
> +
> +You can use the **perf** tool as follows ::
> +
> +    $ perf buildid-cache --add $(which ovs-vswitchd)
> +    $ perf list | grep sdt_
> +      sdt_main:poll_block                                [SDT event]
> +      sdt_main:run_start                                 [SDT event]
> +
> +You can use the bpftrace_ tool ::
> +
> +    # bpftrace -l "usdt:$(which ovs-vswitchd):*"
> +    usdt:/usr/sbin/ovs-vswitchd:main:poll_block
> +    usdt:/usr/sbin/ovs-vswitchd:main:run_start
> +
> +NOTE: If you execute this on a running process,
> +``bpftrace -lp $(pidof ovs-vswitchd) "usdt:*"`` , it will list all USDT events,
> +i.e., also the ones available in the used shared libraries.
> +
> +Finally, you can use the **tplist** tool which is part of the bcc_ framework ::
> +
> +    $ /usr/share/bcc/tools/tplist -vv -l $(which ovs-vswitchd)
> +    b'main':b'poll_block' [sema 0x0]
> +      location #1 b'/usr/sbin/ovs-vswitchd' 0x407fdc
> +    b'main':b'run_start' [sema 0x0]
> +      location #1 b'/usr/sbin/ovs-vswitchd' 0x407ff6
> +
> +
> +Using probes
> +------------
> +
> +We will use the OVS sandbox environment in combination with the probes shown
> +above to try out some of the available trace tools. To start up the virtual
> +environment use the ``make sandbox`` command. In addition we have to create
> +a bridge to kick of the main bridge loop ::
> +
> +    $ ovs-vsctl add-br test_bridge
> +    $ ovs-vsctl show
> +    055acdca-2f0c-4f6e-b542-f4b6d2c44e08
> +        Bridge test_bridge
> +            Port test_bridge
> +                Interface test_bridge
> +                    type: internal
> +
> +perf
> +~~~~
> +
> +Perf is using Linux uprobe based event tracing to for capturing the events.
> +To enable the main:\* probes as displayed above and take an actual trace, you
> +need to execute the following sequence of perf commands ::
> +
> +    # perf buildid-cache --add $(which ovs-vswitchd)
> +
> +    # perf list | grep sdt_
> +      sdt_main:poll_block                                [SDT event]
> +      sdt_main:run_start                                 [SDT event]
> +
> +    # perf probe --add=sdt_main:poll_block --add=sdt_main:run_start
> +    Added new events:
> +      sdt_main:poll_block  (on %poll_block in /usr/sbin/ovs-vswitchd)
> +      sdt_main:run_start   (on %run_start in /usr/sbin/ovs-vswitchd)
> +
> +    You can now use it in all perf tools, such as:
> +
> +      perf record -e sdt_main:run_start -aR sleep 1
> +
> +    # perf record -e sdt_main:run_start -e sdt_main:poll_block \
> +        -p $(pidof ovs-vswitchd) sleep 30
> +    [ perf record: Woken up 1 times to write data ]
> +    [ perf record: Captured and wrote 0.039 MB perf.data (132 samples) ]
> +
> +    # perf script
> +        ovs-vswitchd  8576 [011] 21031.340433:  sdt_main:run_start: (407ff6)
> +        ovs-vswitchd  8576 [011] 21031.340516: sdt_main:poll_block: (407fdc)
> +        ovs-vswitchd  8576 [011] 21031.841726:  sdt_main:run_start: (407ff6)
> +        ovs-vswitchd  8576 [011] 21031.842088: sdt_main:poll_block: (407fdc)
> +    ...
> +
> +Note that the above examples works with the sandbox environment, so make sure
> +you execute the above command while in the sandbox shell!
> +
> +There are a lot more options available with perf, for example, the
> +``--call-graph dwarf`` option, which would give you a call graph in the
> +``perf script`` output. See the perf documentation for more information.
> +
> +One other interesting feature is that the perf data can be converted for use
> +by the trace visualizer `Trace Compass`_. This can be done using the
> +``--all --to-ctf`` option to the ``perf data convert`` tool.
> +
> +
> +bpftrace
> +~~~~~~~~
> +
> +bpftrace is a high-level tracing language based on eBPF, which can be used to
> +script USDT probes. Here we will show a simple one-liner to display the
> +USDT probes being hit. However, the script section below reference some more
> +advanced bpftrace scripts.
> +
> +This is a simple bpftrace one-liner to show all ``main:*`` USDT probes ::
> +
> +    # bpftrace -p $(pidof ovs-vswitchd) -e \
> +        'usdt::main:* { printf("%s %u [%u] %u %s\n",
> +          comm, pid, cpu, elapsed, probe); }'
> +    Attaching 2 probes...
> +    ovs-vswitchd 8576 [11] 203833199 usdt:main:run_start
> +    ovs-vswitchd 8576 [11] 204086854 usdt:main:poll_block
> +    ovs-vswitchd 8576 [11] 221611985 usdt:main:run_start
> +    ovs-vswitchd 8576 [11] 221892019 usdt:main:poll_block
> +
> +
> +bcc
> +~~~
> +
> +The BPF Compiler Collection (BCC) is a set of tools and scripts that also use
> +eBPF for tracing. The example below uses the ``trace`` tool to show the events
> +while they are being generated ::
> +
> +    # /usr/share/bcc/tools/trace -T -p $(pidof ovs-vswitchd) \
> +        'u::main:run_start' 'u::main:poll_block'
> +    TIME     PID     TID     COMM            FUNC
> +    15:49:06 8576    8576    ovs-vswitchd    main:run_start
> +    15:49:06 8576    8576    ovs-vswitchd    main:poll_block
> +    15:49:06 8576    8576    ovs-vswitchd    main:run_start
> +    15:49:06 8576    8576    ovs-vswitchd    main:poll_block
> +    ^C
> +
> +
> +Scripts
> +-------
> +To not have to re-invent the wheel when trying to debug complex OVS issues, a
> +set of scripts are provided in the source repository. They are located in the
> +``/utilities/usdt_scripts/`` directory, and each script contains detailed

nit: shouldn't this be ``utilities/usdt_scripts/``? The same applies to
all the other paths in the patch.

That aside,

Acked-by: Paolo Valerio <pvalerio@redhat.com>

> +information on how they should be used, and what information they provide.
> +
> +
> +Available probes
> +----------------
> +The next sections describes all the available probes, their use case, and if
> +used in any script, which one. Any new probes being added to OVS should get
> +their own section. See the below "Adding your own probes" section for the
> +used naming convention.
> +
> +Available probes in ``ovs_vswitchd``:
> +
> +- main:poll_block
> +- main:run_start
> +
> +
> +probe main:run_start
> +~~~~~~~~~~~~~~~~~~~~
> +
> +**Description**:
> +The ovs-vswitchd's main process contains a loop that runs every time some work
> +needs to be done. This probe gets triggered every time the loop starts from the
> +beginning. See also the ``main:poll_block`` probe below.
> +
> +**Arguments**:
> +
> +*None*
> +
> +**Script references**:
> +
> +- ``/utilities/usdt_scripts/bridge_loop.bt``
> +
> +
> +probe main:poll_block
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +**Description**:
> +The ovs-vswitchd's main process contains a loop that runs every time some work
> +needs to be done. This probe gets triggered every time the loop is done, and
> +it's about to wait for being re-started by a poll_block() call returning.
> +See also the ``main:run_start`` probe above.
> +
> +**Arguments**:
> +
> +*None*
> +
> +**Script references**:
> +
> +- ``/utilities/usdt_scripts/bridge_loop.bt``
> +
> +
> +Adding your own probes
> +----------------------
> +
> +Adding your own probes is as simple as adding the ``OVS_USDT_PROBE()`` macro
> +to the code. It's similar to the ``DTRACE_PROBExx`` macro's with the difference
> +that it does automatically determine the number of optional arguments.
> +
> +The macro requires at least two arguments. The first one being the *provider*,
> +and the second one being the *name*. To keep some consistency with the probe
> +naming, please use the following convention. The *provider* should be the
> +function name, and the *name* should be the name of the tracepoint. If you do
> +function entry and exit like probes, please use ``entry`` and ``exit``.
> +
> +If, for some reason, you do not like to use the function name as a *provider*,
> +please prefix it with ``__``, so we know it's not a function name.
> +
> +The remaining parameters, up to 10, can be variables, pointers, etc., that
> +might be of interest to capture at this point in the code. Note that the
> +provided variables can cause the compiler to be less effective in optimizing.
> +
> +
> +
> +.. _perf : https://developers.redhat.com/blog/2020/05/29/debugging-vhost-user-tx-contention-in-open-vswitch#
> +.. _bpftrace : https://github.com/iovisor/bpftrace
> +.. _bcc : https://github.com/iovisor/bcc
> +.. _Trace Compass : https://www.eclipse.org/tracecompass/
> diff --git a/utilities/automake.mk b/utilities/automake.mk
> index e2e22c39a..55c7b0022 100644
> --- a/utilities/automake.mk
> +++ b/utilities/automake.mk
> @@ -56,12 +56,13 @@ EXTRA_DIST += \
>  	utilities/ovs-vlan-test.in \
>  	utilities/ovs-vsctl-bashcomp.bash \
>  	utilities/checkpatch.py \
> -        utilities/docker/Makefile \
> -        utilities/docker/ovs-override.conf \
> -        utilities/docker/start-ovs \
> -        utilities/docker/create_ovs_db.sh \
> -        utilities/docker/debian/Dockerfile \
> -        utilities/docker/debian/build-kernel-modules.sh
> +	utilities/docker/Makefile \
> +	utilities/docker/ovs-override.conf \
> +	utilities/docker/start-ovs \
> +	utilities/docker/create_ovs_db.sh \
> +	utilities/docker/debian/Dockerfile \
> +	utilities/docker/debian/build-kernel-modules.sh \
> +	utilities/usdt_scripts/bridge_loop.bt
>  MAN_ROOTS += \
>  	utilities/ovs-testcontroller.8.in \
>  	utilities/ovs-dpctl.8.in \
> diff --git a/utilities/usdt_scripts/bridge_loop.bt b/utilities/usdt_scripts/bridge_loop.bt
> new file mode 100755
> index 000000000..011626947
> --- /dev/null
> +++ b/utilities/usdt_scripts/bridge_loop.bt
> @@ -0,0 +1,120 @@
> +#!/usr/bin/env bpftrace
> +/*
> + * Copyright (c) 2021 Red Hat, Inc.
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at:
> + *
> + *     http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + *
> + *
> + * Script information:
> + * -------------------
> + * bridge_loop.bt uses the USDTs in the ovs-vswitchd's main bridge loop
> + * to report how long it spends running a single loop, as well as the time
> + * it spends waiting in the poll_loop(). Once done, it will also print a
> + * histogram for the time spent.
> + *
> + * The following is an example of how to use the script on the running
> + * ovs-vswitchd process:
> + *
> + *   $ bridge_loop.bt -p `pgrep -n ovs-vswitchd`
> + *     Attaching 4 probes...
> + *     --------------------------------------------------------------
> + *     Tracing ovs-vswitchd's main() loop... Hit Ctrl-C to end.
> + *     --------------------------------------------------------------
> + *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000230706
> + *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.501854292
> + *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000266445
> + *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.499750288
> + *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000254856
> + *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.499944280
> + *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000267390
> + *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.093566288
> + *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000316774
> + *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.406697754
> + *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000264505
> + *     ^C
> + *     --------------------------------------------------------------
> + *     Showing run time histograms in micro seconds:
> + *     --------------------------------------------------------------
> + *
> + *     @bridge_run_time:
> + *     [0, 1000)              6 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> + *
> + *
> + *     @poll_block_wait_time:
> + *     [90000, 120000)        1 |@@@@@@@@@@@@@@@@@                         |
> + *     [120000, 150000)       0 |                                          |
> + *     [150000, 180000)       0 |                                          |
> + *     [180000, 210000)       0 |                                          |
> + *     [210000, 240000)       0 |                                          |
> + *     [240000, 270000)       0 |                                          |
> + *     [270000, 300000)       0 |                                          |
> + *     [300000, 330000)       0 |                                          |
> + *     [330000, 360000)       0 |                                          |
> + *     [360000, 390000)       0 |                                          |
> + *     [390000, 420000)       1 |@@@@@@@@@@@@@@@@@                         |
> + *     [420000, 450000)       0 |                                          |
> + *     [450000, 480000)       0 |                                          |
> + *     [480000, 510000)       3 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> + */
> +
> +
> +BEGIN
> +{
> +    printf("--------------------------------------------------------------\n");
> +    printf("Tracing ovs-vswitchd's main() loop... Hit Ctrl-C to end.\n");
> +    printf("--------------------------------------------------------------\n");
> +}
> +
> +
> +usdt::main:poll_block
> +{
> +    @pb_start[tid] = nsecs;
> +    if (@rs_start[tid] != 0) {
> +        $delta = nsecs - @rs_start[tid];
> +        printf("- [%d@%s] bridge run loop time  : %u:%2.2u:%2.2u.%9.9u\n",
> +               tid, comm,
> +               $delta / 3600 / 1000000000,
> +               $delta / 60   / 1000000000 % 60,
> +               $delta / 1000000000 % 60,
> +               $delta % 1000000000);
> +        @bridge_run_time = lhist($delta / 1000, 0, 1000000, 1000);
> +    }
> +}
> +
> +
> +usdt::main:run_start
> +{
> +    @rs_start[tid] = nsecs;
> +    if (@pb_start[tid] != 0) {
> +        $delta = nsecs - @pb_start[tid];
> +        printf("- [%d@%s] poll_block() wait time: %u:%2.2u:%2.2u.%9.9u\n",
> +               tid, comm,
> +               $delta / 3600 / 1000000000,
> +               $delta / 60   / 1000000000 % 60,
> +               $delta / 1000000000 % 60,
> +               $delta % 1000000000);
> +        @poll_block_wait_time = lhist($delta / 1000, 0, 30000000, 30000);
> +    }
> +}
> +
> +
> +END
> +{
> +    clear(@rs_start);
> +    clear(@pb_start);
> +
> +    printf("\n");
> +    printf("--------------------------------------------------------------\n");
> +    printf("Showing run time histograms in micro seconds:\n");
> +    printf("--------------------------------------------------------------");
> +}
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Eelco Chaudron Dec. 20, 2021, 8:39 a.m. UTC | #3
On 20 Dec 2021, at 0:45, Paolo Valerio wrote:

> Eelco Chaudron <echaudro@redhat.com> writes:
>
>> Add the USDT documentation and a bpftrace example using the
>> bridge run USDT probes.
>>
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>

<SNIP>
>> +Scripts
>> +-------
>> +To not have to re-invent the wheel when trying to debug complex OVS issues, a
>> +set of scripts are provided in the source repository. They are located in the
>> +``/utilities/usdt_scripts/`` directory, and each script contains detailed
>
> nit: shouldn't this be ``utilities/usdt_scripts/``? The same applies to
> all the other paths in the patch.

Thanks, will fix this in the next revision!

> That aside,
>
> Acked-by: Paolo Valerio <pvalerio@redhat.com>
diff mbox series

Patch

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index 137cc57c5..9b9c72cff 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -58,6 +58,7 @@  DOC_SOURCE = \
 	Documentation/topics/record-replay.rst \
 	Documentation/topics/tracing.rst \
 	Documentation/topics/userspace-tso.rst \
+	Documentation/topics/usdt-probes.rst \
 	Documentation/topics/windows.rst \
 	Documentation/howto/index.rst \
 	Documentation/howto/dpdk.rst \
diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst
index d8ccbd757..7ca364340 100644
--- a/Documentation/topics/index.rst
+++ b/Documentation/topics/index.rst
@@ -55,3 +55,4 @@  OVS
    userspace-tso
    idl-compound-indexes
    ovs-extensions
+   usdt-probes
diff --git a/Documentation/topics/usdt-probes.rst b/Documentation/topics/usdt-probes.rst
new file mode 100644
index 000000000..38acf7d02
--- /dev/null
+++ b/Documentation/topics/usdt-probes.rst
@@ -0,0 +1,267 @@ 
+..
+      Licensed under the Apache License, Version 2.0 (the "License"); you may
+      not use this file except in compliance with the License. You may obtain
+      a copy of the License at
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      =======  Heading 0 (reserved for the title in a document)
+      -------  Heading 1
+      ~~~~~~~  Heading 2
+      +++++++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+=============================================
+User Statically-Defined Tracing (USDT) probes
+=============================================
+
+Sometimes it's desired to troubleshoot one of OVS's components in the field.
+One of the techniques used is to add dynamic tracepoints, for example using
+perf_. However, the desired dynamic tracepoint and/or the desired variable,
+might not be available due to compiler optimizations.
+
+In this case, a well-thought-off, static tracepoint could be permanently added,
+so it's always available. For OVS we use the DTrace probe macro's, which have
+little to no overhead when disabled. Various tools exist to enable them. See
+some examples below.
+
+
+Compiling with USDT probes enabled
+----------------------------------
+
+Since USDT probes are compiled out by default, a compile-time option is
+available to include them. To add the probes to the generated code, use the
+following configure option ::
+
+    $ ./configure --enable-usdt-probes
+
+The following line should be seen in the configure output when the above option
+is used ::
+
+    checking whether USDT probes are enabled... yes
+
+
+Listing available probes
+------------------------
+
+There are various ways to display USDT probes available in a userspace
+application. Here we show three examples. All assuming ovs-vswitchd is in the
+search path with USDT probes enabled:
+
+You can use the **perf** tool as follows ::
+
+    $ perf buildid-cache --add $(which ovs-vswitchd)
+    $ perf list | grep sdt_
+      sdt_main:poll_block                                [SDT event]
+      sdt_main:run_start                                 [SDT event]
+
+You can use the bpftrace_ tool ::
+
+    # bpftrace -l "usdt:$(which ovs-vswitchd):*"
+    usdt:/usr/sbin/ovs-vswitchd:main:poll_block
+    usdt:/usr/sbin/ovs-vswitchd:main:run_start
+
+NOTE: If you execute this on a running process,
+``bpftrace -lp $(pidof ovs-vswitchd) "usdt:*"`` , it will list all USDT events,
+i.e., also the ones available in the used shared libraries.
+
+Finally, you can use the **tplist** tool which is part of the bcc_ framework ::
+
+    $ /usr/share/bcc/tools/tplist -vv -l $(which ovs-vswitchd)
+    b'main':b'poll_block' [sema 0x0]
+      location #1 b'/usr/sbin/ovs-vswitchd' 0x407fdc
+    b'main':b'run_start' [sema 0x0]
+      location #1 b'/usr/sbin/ovs-vswitchd' 0x407ff6
+
+
+Using probes
+------------
+
+We will use the OVS sandbox environment in combination with the probes shown
+above to try out some of the available trace tools. To start up the virtual
+environment use the ``make sandbox`` command. In addition we have to create
+a bridge to kick of the main bridge loop ::
+
+    $ ovs-vsctl add-br test_bridge
+    $ ovs-vsctl show
+    055acdca-2f0c-4f6e-b542-f4b6d2c44e08
+        Bridge test_bridge
+            Port test_bridge
+                Interface test_bridge
+                    type: internal
+
+perf
+~~~~
+
+Perf is using Linux uprobe based event tracing to for capturing the events.
+To enable the main:\* probes as displayed above and take an actual trace, you
+need to execute the following sequence of perf commands ::
+
+    # perf buildid-cache --add $(which ovs-vswitchd)
+
+    # perf list | grep sdt_
+      sdt_main:poll_block                                [SDT event]
+      sdt_main:run_start                                 [SDT event]
+
+    # perf probe --add=sdt_main:poll_block --add=sdt_main:run_start
+    Added new events:
+      sdt_main:poll_block  (on %poll_block in /usr/sbin/ovs-vswitchd)
+      sdt_main:run_start   (on %run_start in /usr/sbin/ovs-vswitchd)
+
+    You can now use it in all perf tools, such as:
+
+      perf record -e sdt_main:run_start -aR sleep 1
+
+    # perf record -e sdt_main:run_start -e sdt_main:poll_block \
+        -p $(pidof ovs-vswitchd) sleep 30
+    [ perf record: Woken up 1 times to write data ]
+    [ perf record: Captured and wrote 0.039 MB perf.data (132 samples) ]
+
+    # perf script
+        ovs-vswitchd  8576 [011] 21031.340433:  sdt_main:run_start: (407ff6)
+        ovs-vswitchd  8576 [011] 21031.340516: sdt_main:poll_block: (407fdc)
+        ovs-vswitchd  8576 [011] 21031.841726:  sdt_main:run_start: (407ff6)
+        ovs-vswitchd  8576 [011] 21031.842088: sdt_main:poll_block: (407fdc)
+    ...
+
+Note that the above examples works with the sandbox environment, so make sure
+you execute the above command while in the sandbox shell!
+
+There are a lot more options available with perf, for example, the
+``--call-graph dwarf`` option, which would give you a call graph in the
+``perf script`` output. See the perf documentation for more information.
+
+One other interesting feature is that the perf data can be converted for use
+by the trace visualizer `Trace Compass`_. This can be done using the
+``--all --to-ctf`` option to the ``perf data convert`` tool.
+
+
+bpftrace
+~~~~~~~~
+
+bpftrace is a high-level tracing language based on eBPF, which can be used to
+script USDT probes. Here we will show a simple one-liner to display the
+USDT probes being hit. However, the script section below reference some more
+advanced bpftrace scripts.
+
+This is a simple bpftrace one-liner to show all ``main:*`` USDT probes ::
+
+    # bpftrace -p $(pidof ovs-vswitchd) -e \
+        'usdt::main:* { printf("%s %u [%u] %u %s\n",
+          comm, pid, cpu, elapsed, probe); }'
+    Attaching 2 probes...
+    ovs-vswitchd 8576 [11] 203833199 usdt:main:run_start
+    ovs-vswitchd 8576 [11] 204086854 usdt:main:poll_block
+    ovs-vswitchd 8576 [11] 221611985 usdt:main:run_start
+    ovs-vswitchd 8576 [11] 221892019 usdt:main:poll_block
+
+
+bcc
+~~~
+
+The BPF Compiler Collection (BCC) is a set of tools and scripts that also use
+eBPF for tracing. The example below uses the ``trace`` tool to show the events
+while they are being generated ::
+
+    # /usr/share/bcc/tools/trace -T -p $(pidof ovs-vswitchd) \
+        'u::main:run_start' 'u::main:poll_block'
+    TIME     PID     TID     COMM            FUNC
+    15:49:06 8576    8576    ovs-vswitchd    main:run_start
+    15:49:06 8576    8576    ovs-vswitchd    main:poll_block
+    15:49:06 8576    8576    ovs-vswitchd    main:run_start
+    15:49:06 8576    8576    ovs-vswitchd    main:poll_block
+    ^C
+
+
+Scripts
+-------
+To not have to re-invent the wheel when trying to debug complex OVS issues, a
+set of scripts are provided in the source repository. They are located in the
+``/utilities/usdt_scripts/`` directory, and each script contains detailed
+information on how they should be used, and what information they provide.
+
+
+Available probes
+----------------
+The next sections describes all the available probes, their use case, and if
+used in any script, which one. Any new probes being added to OVS should get
+their own section. See the below "Adding your own probes" section for the
+used naming convention.
+
+Available probes in ``ovs_vswitchd``:
+
+- main:poll_block
+- main:run_start
+
+
+probe main:run_start
+~~~~~~~~~~~~~~~~~~~~
+
+**Description**:
+The ovs-vswitchd's main process contains a loop that runs every time some work
+needs to be done. This probe gets triggered every time the loop starts from the
+beginning. See also the ``main:poll_block`` probe below.
+
+**Arguments**:
+
+*None*
+
+**Script references**:
+
+- ``/utilities/usdt_scripts/bridge_loop.bt``
+
+
+probe main:poll_block
+~~~~~~~~~~~~~~~~~~~~~
+
+**Description**:
+The ovs-vswitchd's main process contains a loop that runs every time some work
+needs to be done. This probe gets triggered every time the loop is done, and
+it's about to wait for being re-started by a poll_block() call returning.
+See also the ``main:run_start`` probe above.
+
+**Arguments**:
+
+*None*
+
+**Script references**:
+
+- ``/utilities/usdt_scripts/bridge_loop.bt``
+
+
+Adding your own probes
+----------------------
+
+Adding your own probes is as simple as adding the ``OVS_USDT_PROBE()`` macro
+to the code. It's similar to the ``DTRACE_PROBExx`` macro's with the difference
+that it does automatically determine the number of optional arguments.
+
+The macro requires at least two arguments. The first one being the *provider*,
+and the second one being the *name*. To keep some consistency with the probe
+naming, please use the following convention. The *provider* should be the
+function name, and the *name* should be the name of the tracepoint. If you do
+function entry and exit like probes, please use ``entry`` and ``exit``.
+
+If, for some reason, you do not like to use the function name as a *provider*,
+please prefix it with ``__``, so we know it's not a function name.
+
+The remaining parameters, up to 10, can be variables, pointers, etc., that
+might be of interest to capture at this point in the code. Note that the
+provided variables can cause the compiler to be less effective in optimizing.
+
+
+
+.. _perf : https://developers.redhat.com/blog/2020/05/29/debugging-vhost-user-tx-contention-in-open-vswitch#
+.. _bpftrace : https://github.com/iovisor/bpftrace
+.. _bcc : https://github.com/iovisor/bcc
+.. _Trace Compass : https://www.eclipse.org/tracecompass/
diff --git a/utilities/automake.mk b/utilities/automake.mk
index e2e22c39a..55c7b0022 100644
--- a/utilities/automake.mk
+++ b/utilities/automake.mk
@@ -56,12 +56,13 @@  EXTRA_DIST += \
 	utilities/ovs-vlan-test.in \
 	utilities/ovs-vsctl-bashcomp.bash \
 	utilities/checkpatch.py \
-        utilities/docker/Makefile \
-        utilities/docker/ovs-override.conf \
-        utilities/docker/start-ovs \
-        utilities/docker/create_ovs_db.sh \
-        utilities/docker/debian/Dockerfile \
-        utilities/docker/debian/build-kernel-modules.sh
+	utilities/docker/Makefile \
+	utilities/docker/ovs-override.conf \
+	utilities/docker/start-ovs \
+	utilities/docker/create_ovs_db.sh \
+	utilities/docker/debian/Dockerfile \
+	utilities/docker/debian/build-kernel-modules.sh \
+	utilities/usdt_scripts/bridge_loop.bt
 MAN_ROOTS += \
 	utilities/ovs-testcontroller.8.in \
 	utilities/ovs-dpctl.8.in \
diff --git a/utilities/usdt_scripts/bridge_loop.bt b/utilities/usdt_scripts/bridge_loop.bt
new file mode 100755
index 000000000..011626947
--- /dev/null
+++ b/utilities/usdt_scripts/bridge_loop.bt
@@ -0,0 +1,120 @@ 
+#!/usr/bin/env bpftrace
+/*
+ * Copyright (c) 2021 Red Hat, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ *
+ * Script information:
+ * -------------------
+ * bridge_loop.bt uses the USDTs in the ovs-vswitchd's main bridge loop
+ * to report how long it spends running a single loop, as well as the time
+ * it spends waiting in the poll_loop(). Once done, it will also print a
+ * histogram for the time spent.
+ *
+ * The following is an example of how to use the script on the running
+ * ovs-vswitchd process:
+ *
+ *   $ bridge_loop.bt -p `pgrep -n ovs-vswitchd`
+ *     Attaching 4 probes...
+ *     --------------------------------------------------------------
+ *     Tracing ovs-vswitchd's main() loop... Hit Ctrl-C to end.
+ *     --------------------------------------------------------------
+ *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000230706
+ *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.501854292
+ *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000266445
+ *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.499750288
+ *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000254856
+ *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.499944280
+ *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000267390
+ *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.093566288
+ *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000316774
+ *     - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.406697754
+ *     - [886467@ovs-vswitchd] bridge run loop time  : 0:00:00.000264505
+ *     ^C
+ *     --------------------------------------------------------------
+ *     Showing run time histograms in micro seconds:
+ *     --------------------------------------------------------------
+ *
+ *     @bridge_run_time:
+ *     [0, 1000)              6 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
+ *
+ *
+ *     @poll_block_wait_time:
+ *     [90000, 120000)        1 |@@@@@@@@@@@@@@@@@                         |
+ *     [120000, 150000)       0 |                                          |
+ *     [150000, 180000)       0 |                                          |
+ *     [180000, 210000)       0 |                                          |
+ *     [210000, 240000)       0 |                                          |
+ *     [240000, 270000)       0 |                                          |
+ *     [270000, 300000)       0 |                                          |
+ *     [300000, 330000)       0 |                                          |
+ *     [330000, 360000)       0 |                                          |
+ *     [360000, 390000)       0 |                                          |
+ *     [390000, 420000)       1 |@@@@@@@@@@@@@@@@@                         |
+ *     [420000, 450000)       0 |                                          |
+ *     [450000, 480000)       0 |                                          |
+ *     [480000, 510000)       3 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
+ */
+
+
+BEGIN
+{
+    printf("--------------------------------------------------------------\n");
+    printf("Tracing ovs-vswitchd's main() loop... Hit Ctrl-C to end.\n");
+    printf("--------------------------------------------------------------\n");
+}
+
+
+usdt::main:poll_block
+{
+    @pb_start[tid] = nsecs;
+    if (@rs_start[tid] != 0) {
+        $delta = nsecs - @rs_start[tid];
+        printf("- [%d@%s] bridge run loop time  : %u:%2.2u:%2.2u.%9.9u\n",
+               tid, comm,
+               $delta / 3600 / 1000000000,
+               $delta / 60   / 1000000000 % 60,
+               $delta / 1000000000 % 60,
+               $delta % 1000000000);
+        @bridge_run_time = lhist($delta / 1000, 0, 1000000, 1000);
+    }
+}
+
+
+usdt::main:run_start
+{
+    @rs_start[tid] = nsecs;
+    if (@pb_start[tid] != 0) {
+        $delta = nsecs - @pb_start[tid];
+        printf("- [%d@%s] poll_block() wait time: %u:%2.2u:%2.2u.%9.9u\n",
+               tid, comm,
+               $delta / 3600 / 1000000000,
+               $delta / 60   / 1000000000 % 60,
+               $delta / 1000000000 % 60,
+               $delta % 1000000000);
+        @poll_block_wait_time = lhist($delta / 1000, 0, 30000000, 30000);
+    }
+}
+
+
+END
+{
+    clear(@rs_start);
+    clear(@pb_start);
+
+    printf("\n");
+    printf("--------------------------------------------------------------\n");
+    printf("Showing run time histograms in micro seconds:\n");
+    printf("--------------------------------------------------------------");
+}