Message ID | 163949555563.1639090.16031273910422675537.stgit@ebuild |
---|---|
State | Superseded |
Headers | show |
Series | adding USDT points to OVS | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | warning | apply and check: warning |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
Bleep bloop. Greetings Eelco Chaudron, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Line is 108 characters long (recommended limit is 79) #310 FILE: Documentation/topics/usdt-probes.rst:264: .. _perf : https://developers.redhat.com/blog/2020/05/29/debugging-vhost-user-tx-contention-in-open-vswitch# Lines checked: 466, Warnings: 1, Errors: 0 Please check this out. If you feel there has been an error, please email aconole@redhat.com Thanks, 0-day Robot
Eelco Chaudron <echaudro@redhat.com> writes: > Add the USDT documentation and a bpftrace example using the > bridge run USDT probes. > > Signed-off-by: Eelco Chaudron <echaudro@redhat.com> > --- > Documentation/automake.mk | 1 > Documentation/topics/index.rst | 1 > Documentation/topics/usdt-probes.rst | 267 +++++++++++++++++++++++++++++++++ > utilities/automake.mk | 13 +- > utilities/usdt_scripts/bridge_loop.bt | 120 +++++++++++++++ > 5 files changed, 396 insertions(+), 6 deletions(-) > create mode 100644 Documentation/topics/usdt-probes.rst > create mode 100755 utilities/usdt_scripts/bridge_loop.bt > > diff --git a/Documentation/automake.mk b/Documentation/automake.mk > index 137cc57c5..9b9c72cff 100644 > --- a/Documentation/automake.mk > +++ b/Documentation/automake.mk > @@ -58,6 +58,7 @@ DOC_SOURCE = \ > Documentation/topics/record-replay.rst \ > Documentation/topics/tracing.rst \ > Documentation/topics/userspace-tso.rst \ > + Documentation/topics/usdt-probes.rst \ > Documentation/topics/windows.rst \ > Documentation/howto/index.rst \ > Documentation/howto/dpdk.rst \ > diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst > index d8ccbd757..7ca364340 100644 > --- a/Documentation/topics/index.rst > +++ b/Documentation/topics/index.rst > @@ -55,3 +55,4 @@ OVS > userspace-tso > idl-compound-indexes > ovs-extensions > + usdt-probes > diff --git a/Documentation/topics/usdt-probes.rst b/Documentation/topics/usdt-probes.rst > new file mode 100644 > index 000000000..38acf7d02 > --- /dev/null > +++ b/Documentation/topics/usdt-probes.rst > @@ -0,0 +1,267 @@ > +.. > + Licensed under the Apache License, Version 2.0 (the "License"); you may > + not use this file except in compliance with the License. You may obtain > + a copy of the License at > + > + http://www.apache.org/licenses/LICENSE-2.0 > + > + Unless required by applicable law or agreed to in writing, software > + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT > + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the > + License for the specific language governing permissions and limitations > + under the License. > + > + Convention for heading levels in Open vSwitch documentation: > + > + ======= Heading 0 (reserved for the title in a document) > + ------- Heading 1 > + ~~~~~~~ Heading 2 > + +++++++ Heading 3 > + ''''''' Heading 4 > + > + Avoid deeper levels because they do not render well. > + > +============================================= > +User Statically-Defined Tracing (USDT) probes > +============================================= > + > +Sometimes it's desired to troubleshoot one of OVS's components in the field. > +One of the techniques used is to add dynamic tracepoints, for example using > +perf_. However, the desired dynamic tracepoint and/or the desired variable, > +might not be available due to compiler optimizations. > + > +In this case, a well-thought-off, static tracepoint could be permanently added, > +so it's always available. For OVS we use the DTrace probe macro's, which have > +little to no overhead when disabled. Various tools exist to enable them. See > +some examples below. > + > + > +Compiling with USDT probes enabled > +---------------------------------- > + > +Since USDT probes are compiled out by default, a compile-time option is > +available to include them. To add the probes to the generated code, use the > +following configure option :: > + > + $ ./configure --enable-usdt-probes > + > +The following line should be seen in the configure output when the above option > +is used :: > + > + checking whether USDT probes are enabled... yes > + > + > +Listing available probes > +------------------------ > + > +There are various ways to display USDT probes available in a userspace > +application. Here we show three examples. All assuming ovs-vswitchd is in the > +search path with USDT probes enabled: > + > +You can use the **perf** tool as follows :: > + > + $ perf buildid-cache --add $(which ovs-vswitchd) > + $ perf list | grep sdt_ > + sdt_main:poll_block [SDT event] > + sdt_main:run_start [SDT event] > + > +You can use the bpftrace_ tool :: > + > + # bpftrace -l "usdt:$(which ovs-vswitchd):*" > + usdt:/usr/sbin/ovs-vswitchd:main:poll_block > + usdt:/usr/sbin/ovs-vswitchd:main:run_start > + > +NOTE: If you execute this on a running process, > +``bpftrace -lp $(pidof ovs-vswitchd) "usdt:*"`` , it will list all USDT events, > +i.e., also the ones available in the used shared libraries. > + > +Finally, you can use the **tplist** tool which is part of the bcc_ framework :: > + > + $ /usr/share/bcc/tools/tplist -vv -l $(which ovs-vswitchd) > + b'main':b'poll_block' [sema 0x0] > + location #1 b'/usr/sbin/ovs-vswitchd' 0x407fdc > + b'main':b'run_start' [sema 0x0] > + location #1 b'/usr/sbin/ovs-vswitchd' 0x407ff6 > + > + > +Using probes > +------------ > + > +We will use the OVS sandbox environment in combination with the probes shown > +above to try out some of the available trace tools. To start up the virtual > +environment use the ``make sandbox`` command. In addition we have to create > +a bridge to kick of the main bridge loop :: > + > + $ ovs-vsctl add-br test_bridge > + $ ovs-vsctl show > + 055acdca-2f0c-4f6e-b542-f4b6d2c44e08 > + Bridge test_bridge > + Port test_bridge > + Interface test_bridge > + type: internal > + > +perf > +~~~~ > + > +Perf is using Linux uprobe based event tracing to for capturing the events. > +To enable the main:\* probes as displayed above and take an actual trace, you > +need to execute the following sequence of perf commands :: > + > + # perf buildid-cache --add $(which ovs-vswitchd) > + > + # perf list | grep sdt_ > + sdt_main:poll_block [SDT event] > + sdt_main:run_start [SDT event] > + > + # perf probe --add=sdt_main:poll_block --add=sdt_main:run_start > + Added new events: > + sdt_main:poll_block (on %poll_block in /usr/sbin/ovs-vswitchd) > + sdt_main:run_start (on %run_start in /usr/sbin/ovs-vswitchd) > + > + You can now use it in all perf tools, such as: > + > + perf record -e sdt_main:run_start -aR sleep 1 > + > + # perf record -e sdt_main:run_start -e sdt_main:poll_block \ > + -p $(pidof ovs-vswitchd) sleep 30 > + [ perf record: Woken up 1 times to write data ] > + [ perf record: Captured and wrote 0.039 MB perf.data (132 samples) ] > + > + # perf script > + ovs-vswitchd 8576 [011] 21031.340433: sdt_main:run_start: (407ff6) > + ovs-vswitchd 8576 [011] 21031.340516: sdt_main:poll_block: (407fdc) > + ovs-vswitchd 8576 [011] 21031.841726: sdt_main:run_start: (407ff6) > + ovs-vswitchd 8576 [011] 21031.842088: sdt_main:poll_block: (407fdc) > + ... > + > +Note that the above examples works with the sandbox environment, so make sure > +you execute the above command while in the sandbox shell! > + > +There are a lot more options available with perf, for example, the > +``--call-graph dwarf`` option, which would give you a call graph in the > +``perf script`` output. See the perf documentation for more information. > + > +One other interesting feature is that the perf data can be converted for use > +by the trace visualizer `Trace Compass`_. This can be done using the > +``--all --to-ctf`` option to the ``perf data convert`` tool. > + > + > +bpftrace > +~~~~~~~~ > + > +bpftrace is a high-level tracing language based on eBPF, which can be used to > +script USDT probes. Here we will show a simple one-liner to display the > +USDT probes being hit. However, the script section below reference some more > +advanced bpftrace scripts. > + > +This is a simple bpftrace one-liner to show all ``main:*`` USDT probes :: > + > + # bpftrace -p $(pidof ovs-vswitchd) -e \ > + 'usdt::main:* { printf("%s %u [%u] %u %s\n", > + comm, pid, cpu, elapsed, probe); }' > + Attaching 2 probes... > + ovs-vswitchd 8576 [11] 203833199 usdt:main:run_start > + ovs-vswitchd 8576 [11] 204086854 usdt:main:poll_block > + ovs-vswitchd 8576 [11] 221611985 usdt:main:run_start > + ovs-vswitchd 8576 [11] 221892019 usdt:main:poll_block > + > + > +bcc > +~~~ > + > +The BPF Compiler Collection (BCC) is a set of tools and scripts that also use > +eBPF for tracing. The example below uses the ``trace`` tool to show the events > +while they are being generated :: > + > + # /usr/share/bcc/tools/trace -T -p $(pidof ovs-vswitchd) \ > + 'u::main:run_start' 'u::main:poll_block' > + TIME PID TID COMM FUNC > + 15:49:06 8576 8576 ovs-vswitchd main:run_start > + 15:49:06 8576 8576 ovs-vswitchd main:poll_block > + 15:49:06 8576 8576 ovs-vswitchd main:run_start > + 15:49:06 8576 8576 ovs-vswitchd main:poll_block > + ^C > + > + > +Scripts > +------- > +To not have to re-invent the wheel when trying to debug complex OVS issues, a > +set of scripts are provided in the source repository. They are located in the > +``/utilities/usdt_scripts/`` directory, and each script contains detailed nit: shouldn't this be ``utilities/usdt_scripts/``? The same applies to all the other paths in the patch. That aside, Acked-by: Paolo Valerio <pvalerio@redhat.com> > +information on how they should be used, and what information they provide. > + > + > +Available probes > +---------------- > +The next sections describes all the available probes, their use case, and if > +used in any script, which one. Any new probes being added to OVS should get > +their own section. See the below "Adding your own probes" section for the > +used naming convention. > + > +Available probes in ``ovs_vswitchd``: > + > +- main:poll_block > +- main:run_start > + > + > +probe main:run_start > +~~~~~~~~~~~~~~~~~~~~ > + > +**Description**: > +The ovs-vswitchd's main process contains a loop that runs every time some work > +needs to be done. This probe gets triggered every time the loop starts from the > +beginning. See also the ``main:poll_block`` probe below. > + > +**Arguments**: > + > +*None* > + > +**Script references**: > + > +- ``/utilities/usdt_scripts/bridge_loop.bt`` > + > + > +probe main:poll_block > +~~~~~~~~~~~~~~~~~~~~~ > + > +**Description**: > +The ovs-vswitchd's main process contains a loop that runs every time some work > +needs to be done. This probe gets triggered every time the loop is done, and > +it's about to wait for being re-started by a poll_block() call returning. > +See also the ``main:run_start`` probe above. > + > +**Arguments**: > + > +*None* > + > +**Script references**: > + > +- ``/utilities/usdt_scripts/bridge_loop.bt`` > + > + > +Adding your own probes > +---------------------- > + > +Adding your own probes is as simple as adding the ``OVS_USDT_PROBE()`` macro > +to the code. It's similar to the ``DTRACE_PROBExx`` macro's with the difference > +that it does automatically determine the number of optional arguments. > + > +The macro requires at least two arguments. The first one being the *provider*, > +and the second one being the *name*. To keep some consistency with the probe > +naming, please use the following convention. The *provider* should be the > +function name, and the *name* should be the name of the tracepoint. If you do > +function entry and exit like probes, please use ``entry`` and ``exit``. > + > +If, for some reason, you do not like to use the function name as a *provider*, > +please prefix it with ``__``, so we know it's not a function name. > + > +The remaining parameters, up to 10, can be variables, pointers, etc., that > +might be of interest to capture at this point in the code. Note that the > +provided variables can cause the compiler to be less effective in optimizing. > + > + > + > +.. _perf : https://developers.redhat.com/blog/2020/05/29/debugging-vhost-user-tx-contention-in-open-vswitch# > +.. _bpftrace : https://github.com/iovisor/bpftrace > +.. _bcc : https://github.com/iovisor/bcc > +.. _Trace Compass : https://www.eclipse.org/tracecompass/ > diff --git a/utilities/automake.mk b/utilities/automake.mk > index e2e22c39a..55c7b0022 100644 > --- a/utilities/automake.mk > +++ b/utilities/automake.mk > @@ -56,12 +56,13 @@ EXTRA_DIST += \ > utilities/ovs-vlan-test.in \ > utilities/ovs-vsctl-bashcomp.bash \ > utilities/checkpatch.py \ > - utilities/docker/Makefile \ > - utilities/docker/ovs-override.conf \ > - utilities/docker/start-ovs \ > - utilities/docker/create_ovs_db.sh \ > - utilities/docker/debian/Dockerfile \ > - utilities/docker/debian/build-kernel-modules.sh > + utilities/docker/Makefile \ > + utilities/docker/ovs-override.conf \ > + utilities/docker/start-ovs \ > + utilities/docker/create_ovs_db.sh \ > + utilities/docker/debian/Dockerfile \ > + utilities/docker/debian/build-kernel-modules.sh \ > + utilities/usdt_scripts/bridge_loop.bt > MAN_ROOTS += \ > utilities/ovs-testcontroller.8.in \ > utilities/ovs-dpctl.8.in \ > diff --git a/utilities/usdt_scripts/bridge_loop.bt b/utilities/usdt_scripts/bridge_loop.bt > new file mode 100755 > index 000000000..011626947 > --- /dev/null > +++ b/utilities/usdt_scripts/bridge_loop.bt > @@ -0,0 +1,120 @@ > +#!/usr/bin/env bpftrace > +/* > + * Copyright (c) 2021 Red Hat, Inc. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. > + * > + * > + * Script information: > + * ------------------- > + * bridge_loop.bt uses the USDTs in the ovs-vswitchd's main bridge loop > + * to report how long it spends running a single loop, as well as the time > + * it spends waiting in the poll_loop(). Once done, it will also print a > + * histogram for the time spent. > + * > + * The following is an example of how to use the script on the running > + * ovs-vswitchd process: > + * > + * $ bridge_loop.bt -p `pgrep -n ovs-vswitchd` > + * Attaching 4 probes... > + * -------------------------------------------------------------- > + * Tracing ovs-vswitchd's main() loop... Hit Ctrl-C to end. > + * -------------------------------------------------------------- > + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000230706 > + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.501854292 > + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000266445 > + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.499750288 > + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000254856 > + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.499944280 > + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000267390 > + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.093566288 > + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000316774 > + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.406697754 > + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000264505 > + * ^C > + * -------------------------------------------------------------- > + * Showing run time histograms in micro seconds: > + * -------------------------------------------------------------- > + * > + * @bridge_run_time: > + * [0, 1000) 6 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > + * > + * > + * @poll_block_wait_time: > + * [90000, 120000) 1 |@@@@@@@@@@@@@@@@@ | > + * [120000, 150000) 0 | | > + * [150000, 180000) 0 | | > + * [180000, 210000) 0 | | > + * [210000, 240000) 0 | | > + * [240000, 270000) 0 | | > + * [270000, 300000) 0 | | > + * [300000, 330000) 0 | | > + * [330000, 360000) 0 | | > + * [360000, 390000) 0 | | > + * [390000, 420000) 1 |@@@@@@@@@@@@@@@@@ | > + * [420000, 450000) 0 | | > + * [450000, 480000) 0 | | > + * [480000, 510000) 3 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > + */ > + > + > +BEGIN > +{ > + printf("--------------------------------------------------------------\n"); > + printf("Tracing ovs-vswitchd's main() loop... Hit Ctrl-C to end.\n"); > + printf("--------------------------------------------------------------\n"); > +} > + > + > +usdt::main:poll_block > +{ > + @pb_start[tid] = nsecs; > + if (@rs_start[tid] != 0) { > + $delta = nsecs - @rs_start[tid]; > + printf("- [%d@%s] bridge run loop time : %u:%2.2u:%2.2u.%9.9u\n", > + tid, comm, > + $delta / 3600 / 1000000000, > + $delta / 60 / 1000000000 % 60, > + $delta / 1000000000 % 60, > + $delta % 1000000000); > + @bridge_run_time = lhist($delta / 1000, 0, 1000000, 1000); > + } > +} > + > + > +usdt::main:run_start > +{ > + @rs_start[tid] = nsecs; > + if (@pb_start[tid] != 0) { > + $delta = nsecs - @pb_start[tid]; > + printf("- [%d@%s] poll_block() wait time: %u:%2.2u:%2.2u.%9.9u\n", > + tid, comm, > + $delta / 3600 / 1000000000, > + $delta / 60 / 1000000000 % 60, > + $delta / 1000000000 % 60, > + $delta % 1000000000); > + @poll_block_wait_time = lhist($delta / 1000, 0, 30000000, 30000); > + } > +} > + > + > +END > +{ > + clear(@rs_start); > + clear(@pb_start); > + > + printf("\n"); > + printf("--------------------------------------------------------------\n"); > + printf("Showing run time histograms in micro seconds:\n"); > + printf("--------------------------------------------------------------"); > +} > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On 20 Dec 2021, at 0:45, Paolo Valerio wrote: > Eelco Chaudron <echaudro@redhat.com> writes: > >> Add the USDT documentation and a bpftrace example using the >> bridge run USDT probes. >> >> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> <SNIP> >> +Scripts >> +------- >> +To not have to re-invent the wheel when trying to debug complex OVS issues, a >> +set of scripts are provided in the source repository. They are located in the >> +``/utilities/usdt_scripts/`` directory, and each script contains detailed > > nit: shouldn't this be ``utilities/usdt_scripts/``? The same applies to > all the other paths in the patch. Thanks, will fix this in the next revision! > That aside, > > Acked-by: Paolo Valerio <pvalerio@redhat.com>
diff --git a/Documentation/automake.mk b/Documentation/automake.mk index 137cc57c5..9b9c72cff 100644 --- a/Documentation/automake.mk +++ b/Documentation/automake.mk @@ -58,6 +58,7 @@ DOC_SOURCE = \ Documentation/topics/record-replay.rst \ Documentation/topics/tracing.rst \ Documentation/topics/userspace-tso.rst \ + Documentation/topics/usdt-probes.rst \ Documentation/topics/windows.rst \ Documentation/howto/index.rst \ Documentation/howto/dpdk.rst \ diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst index d8ccbd757..7ca364340 100644 --- a/Documentation/topics/index.rst +++ b/Documentation/topics/index.rst @@ -55,3 +55,4 @@ OVS userspace-tso idl-compound-indexes ovs-extensions + usdt-probes diff --git a/Documentation/topics/usdt-probes.rst b/Documentation/topics/usdt-probes.rst new file mode 100644 index 000000000..38acf7d02 --- /dev/null +++ b/Documentation/topics/usdt-probes.rst @@ -0,0 +1,267 @@ +.. + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + License for the specific language governing permissions and limitations + under the License. + + Convention for heading levels in Open vSwitch documentation: + + ======= Heading 0 (reserved for the title in a document) + ------- Heading 1 + ~~~~~~~ Heading 2 + +++++++ Heading 3 + ''''''' Heading 4 + + Avoid deeper levels because they do not render well. + +============================================= +User Statically-Defined Tracing (USDT) probes +============================================= + +Sometimes it's desired to troubleshoot one of OVS's components in the field. +One of the techniques used is to add dynamic tracepoints, for example using +perf_. However, the desired dynamic tracepoint and/or the desired variable, +might not be available due to compiler optimizations. + +In this case, a well-thought-off, static tracepoint could be permanently added, +so it's always available. For OVS we use the DTrace probe macro's, which have +little to no overhead when disabled. Various tools exist to enable them. See +some examples below. + + +Compiling with USDT probes enabled +---------------------------------- + +Since USDT probes are compiled out by default, a compile-time option is +available to include them. To add the probes to the generated code, use the +following configure option :: + + $ ./configure --enable-usdt-probes + +The following line should be seen in the configure output when the above option +is used :: + + checking whether USDT probes are enabled... yes + + +Listing available probes +------------------------ + +There are various ways to display USDT probes available in a userspace +application. Here we show three examples. All assuming ovs-vswitchd is in the +search path with USDT probes enabled: + +You can use the **perf** tool as follows :: + + $ perf buildid-cache --add $(which ovs-vswitchd) + $ perf list | grep sdt_ + sdt_main:poll_block [SDT event] + sdt_main:run_start [SDT event] + +You can use the bpftrace_ tool :: + + # bpftrace -l "usdt:$(which ovs-vswitchd):*" + usdt:/usr/sbin/ovs-vswitchd:main:poll_block + usdt:/usr/sbin/ovs-vswitchd:main:run_start + +NOTE: If you execute this on a running process, +``bpftrace -lp $(pidof ovs-vswitchd) "usdt:*"`` , it will list all USDT events, +i.e., also the ones available in the used shared libraries. + +Finally, you can use the **tplist** tool which is part of the bcc_ framework :: + + $ /usr/share/bcc/tools/tplist -vv -l $(which ovs-vswitchd) + b'main':b'poll_block' [sema 0x0] + location #1 b'/usr/sbin/ovs-vswitchd' 0x407fdc + b'main':b'run_start' [sema 0x0] + location #1 b'/usr/sbin/ovs-vswitchd' 0x407ff6 + + +Using probes +------------ + +We will use the OVS sandbox environment in combination with the probes shown +above to try out some of the available trace tools. To start up the virtual +environment use the ``make sandbox`` command. In addition we have to create +a bridge to kick of the main bridge loop :: + + $ ovs-vsctl add-br test_bridge + $ ovs-vsctl show + 055acdca-2f0c-4f6e-b542-f4b6d2c44e08 + Bridge test_bridge + Port test_bridge + Interface test_bridge + type: internal + +perf +~~~~ + +Perf is using Linux uprobe based event tracing to for capturing the events. +To enable the main:\* probes as displayed above and take an actual trace, you +need to execute the following sequence of perf commands :: + + # perf buildid-cache --add $(which ovs-vswitchd) + + # perf list | grep sdt_ + sdt_main:poll_block [SDT event] + sdt_main:run_start [SDT event] + + # perf probe --add=sdt_main:poll_block --add=sdt_main:run_start + Added new events: + sdt_main:poll_block (on %poll_block in /usr/sbin/ovs-vswitchd) + sdt_main:run_start (on %run_start in /usr/sbin/ovs-vswitchd) + + You can now use it in all perf tools, such as: + + perf record -e sdt_main:run_start -aR sleep 1 + + # perf record -e sdt_main:run_start -e sdt_main:poll_block \ + -p $(pidof ovs-vswitchd) sleep 30 + [ perf record: Woken up 1 times to write data ] + [ perf record: Captured and wrote 0.039 MB perf.data (132 samples) ] + + # perf script + ovs-vswitchd 8576 [011] 21031.340433: sdt_main:run_start: (407ff6) + ovs-vswitchd 8576 [011] 21031.340516: sdt_main:poll_block: (407fdc) + ovs-vswitchd 8576 [011] 21031.841726: sdt_main:run_start: (407ff6) + ovs-vswitchd 8576 [011] 21031.842088: sdt_main:poll_block: (407fdc) + ... + +Note that the above examples works with the sandbox environment, so make sure +you execute the above command while in the sandbox shell! + +There are a lot more options available with perf, for example, the +``--call-graph dwarf`` option, which would give you a call graph in the +``perf script`` output. See the perf documentation for more information. + +One other interesting feature is that the perf data can be converted for use +by the trace visualizer `Trace Compass`_. This can be done using the +``--all --to-ctf`` option to the ``perf data convert`` tool. + + +bpftrace +~~~~~~~~ + +bpftrace is a high-level tracing language based on eBPF, which can be used to +script USDT probes. Here we will show a simple one-liner to display the +USDT probes being hit. However, the script section below reference some more +advanced bpftrace scripts. + +This is a simple bpftrace one-liner to show all ``main:*`` USDT probes :: + + # bpftrace -p $(pidof ovs-vswitchd) -e \ + 'usdt::main:* { printf("%s %u [%u] %u %s\n", + comm, pid, cpu, elapsed, probe); }' + Attaching 2 probes... + ovs-vswitchd 8576 [11] 203833199 usdt:main:run_start + ovs-vswitchd 8576 [11] 204086854 usdt:main:poll_block + ovs-vswitchd 8576 [11] 221611985 usdt:main:run_start + ovs-vswitchd 8576 [11] 221892019 usdt:main:poll_block + + +bcc +~~~ + +The BPF Compiler Collection (BCC) is a set of tools and scripts that also use +eBPF for tracing. The example below uses the ``trace`` tool to show the events +while they are being generated :: + + # /usr/share/bcc/tools/trace -T -p $(pidof ovs-vswitchd) \ + 'u::main:run_start' 'u::main:poll_block' + TIME PID TID COMM FUNC + 15:49:06 8576 8576 ovs-vswitchd main:run_start + 15:49:06 8576 8576 ovs-vswitchd main:poll_block + 15:49:06 8576 8576 ovs-vswitchd main:run_start + 15:49:06 8576 8576 ovs-vswitchd main:poll_block + ^C + + +Scripts +------- +To not have to re-invent the wheel when trying to debug complex OVS issues, a +set of scripts are provided in the source repository. They are located in the +``/utilities/usdt_scripts/`` directory, and each script contains detailed +information on how they should be used, and what information they provide. + + +Available probes +---------------- +The next sections describes all the available probes, their use case, and if +used in any script, which one. Any new probes being added to OVS should get +their own section. See the below "Adding your own probes" section for the +used naming convention. + +Available probes in ``ovs_vswitchd``: + +- main:poll_block +- main:run_start + + +probe main:run_start +~~~~~~~~~~~~~~~~~~~~ + +**Description**: +The ovs-vswitchd's main process contains a loop that runs every time some work +needs to be done. This probe gets triggered every time the loop starts from the +beginning. See also the ``main:poll_block`` probe below. + +**Arguments**: + +*None* + +**Script references**: + +- ``/utilities/usdt_scripts/bridge_loop.bt`` + + +probe main:poll_block +~~~~~~~~~~~~~~~~~~~~~ + +**Description**: +The ovs-vswitchd's main process contains a loop that runs every time some work +needs to be done. This probe gets triggered every time the loop is done, and +it's about to wait for being re-started by a poll_block() call returning. +See also the ``main:run_start`` probe above. + +**Arguments**: + +*None* + +**Script references**: + +- ``/utilities/usdt_scripts/bridge_loop.bt`` + + +Adding your own probes +---------------------- + +Adding your own probes is as simple as adding the ``OVS_USDT_PROBE()`` macro +to the code. It's similar to the ``DTRACE_PROBExx`` macro's with the difference +that it does automatically determine the number of optional arguments. + +The macro requires at least two arguments. The first one being the *provider*, +and the second one being the *name*. To keep some consistency with the probe +naming, please use the following convention. The *provider* should be the +function name, and the *name* should be the name of the tracepoint. If you do +function entry and exit like probes, please use ``entry`` and ``exit``. + +If, for some reason, you do not like to use the function name as a *provider*, +please prefix it with ``__``, so we know it's not a function name. + +The remaining parameters, up to 10, can be variables, pointers, etc., that +might be of interest to capture at this point in the code. Note that the +provided variables can cause the compiler to be less effective in optimizing. + + + +.. _perf : https://developers.redhat.com/blog/2020/05/29/debugging-vhost-user-tx-contention-in-open-vswitch# +.. _bpftrace : https://github.com/iovisor/bpftrace +.. _bcc : https://github.com/iovisor/bcc +.. _Trace Compass : https://www.eclipse.org/tracecompass/ diff --git a/utilities/automake.mk b/utilities/automake.mk index e2e22c39a..55c7b0022 100644 --- a/utilities/automake.mk +++ b/utilities/automake.mk @@ -56,12 +56,13 @@ EXTRA_DIST += \ utilities/ovs-vlan-test.in \ utilities/ovs-vsctl-bashcomp.bash \ utilities/checkpatch.py \ - utilities/docker/Makefile \ - utilities/docker/ovs-override.conf \ - utilities/docker/start-ovs \ - utilities/docker/create_ovs_db.sh \ - utilities/docker/debian/Dockerfile \ - utilities/docker/debian/build-kernel-modules.sh + utilities/docker/Makefile \ + utilities/docker/ovs-override.conf \ + utilities/docker/start-ovs \ + utilities/docker/create_ovs_db.sh \ + utilities/docker/debian/Dockerfile \ + utilities/docker/debian/build-kernel-modules.sh \ + utilities/usdt_scripts/bridge_loop.bt MAN_ROOTS += \ utilities/ovs-testcontroller.8.in \ utilities/ovs-dpctl.8.in \ diff --git a/utilities/usdt_scripts/bridge_loop.bt b/utilities/usdt_scripts/bridge_loop.bt new file mode 100755 index 000000000..011626947 --- /dev/null +++ b/utilities/usdt_scripts/bridge_loop.bt @@ -0,0 +1,120 @@ +#!/usr/bin/env bpftrace +/* + * Copyright (c) 2021 Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * + * + * Script information: + * ------------------- + * bridge_loop.bt uses the USDTs in the ovs-vswitchd's main bridge loop + * to report how long it spends running a single loop, as well as the time + * it spends waiting in the poll_loop(). Once done, it will also print a + * histogram for the time spent. + * + * The following is an example of how to use the script on the running + * ovs-vswitchd process: + * + * $ bridge_loop.bt -p `pgrep -n ovs-vswitchd` + * Attaching 4 probes... + * -------------------------------------------------------------- + * Tracing ovs-vswitchd's main() loop... Hit Ctrl-C to end. + * -------------------------------------------------------------- + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000230706 + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.501854292 + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000266445 + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.499750288 + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000254856 + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.499944280 + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000267390 + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.093566288 + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000316774 + * - [886467@ovs-vswitchd] poll_block() wait time: 0:00:00.406697754 + * - [886467@ovs-vswitchd] bridge run loop time : 0:00:00.000264505 + * ^C + * -------------------------------------------------------------- + * Showing run time histograms in micro seconds: + * -------------------------------------------------------------- + * + * @bridge_run_time: + * [0, 1000) 6 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| + * + * + * @poll_block_wait_time: + * [90000, 120000) 1 |@@@@@@@@@@@@@@@@@ | + * [120000, 150000) 0 | | + * [150000, 180000) 0 | | + * [180000, 210000) 0 | | + * [210000, 240000) 0 | | + * [240000, 270000) 0 | | + * [270000, 300000) 0 | | + * [300000, 330000) 0 | | + * [330000, 360000) 0 | | + * [360000, 390000) 0 | | + * [390000, 420000) 1 |@@@@@@@@@@@@@@@@@ | + * [420000, 450000) 0 | | + * [450000, 480000) 0 | | + * [480000, 510000) 3 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| + */ + + +BEGIN +{ + printf("--------------------------------------------------------------\n"); + printf("Tracing ovs-vswitchd's main() loop... Hit Ctrl-C to end.\n"); + printf("--------------------------------------------------------------\n"); +} + + +usdt::main:poll_block +{ + @pb_start[tid] = nsecs; + if (@rs_start[tid] != 0) { + $delta = nsecs - @rs_start[tid]; + printf("- [%d@%s] bridge run loop time : %u:%2.2u:%2.2u.%9.9u\n", + tid, comm, + $delta / 3600 / 1000000000, + $delta / 60 / 1000000000 % 60, + $delta / 1000000000 % 60, + $delta % 1000000000); + @bridge_run_time = lhist($delta / 1000, 0, 1000000, 1000); + } +} + + +usdt::main:run_start +{ + @rs_start[tid] = nsecs; + if (@pb_start[tid] != 0) { + $delta = nsecs - @pb_start[tid]; + printf("- [%d@%s] poll_block() wait time: %u:%2.2u:%2.2u.%9.9u\n", + tid, comm, + $delta / 3600 / 1000000000, + $delta / 60 / 1000000000 % 60, + $delta / 1000000000 % 60, + $delta % 1000000000); + @poll_block_wait_time = lhist($delta / 1000, 0, 30000000, 30000); + } +} + + +END +{ + clear(@rs_start); + clear(@pb_start); + + printf("\n"); + printf("--------------------------------------------------------------\n"); + printf("Showing run time histograms in micro seconds:\n"); + printf("--------------------------------------------------------------"); +}
Add the USDT documentation and a bpftrace example using the bridge run USDT probes. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> --- Documentation/automake.mk | 1 Documentation/topics/index.rst | 1 Documentation/topics/usdt-probes.rst | 267 +++++++++++++++++++++++++++++++++ utilities/automake.mk | 13 +- utilities/usdt_scripts/bridge_loop.bt | 120 +++++++++++++++ 5 files changed, 396 insertions(+), 6 deletions(-) create mode 100644 Documentation/topics/usdt-probes.rst create mode 100755 utilities/usdt_scripts/bridge_loop.bt