[ovs-dev] dpdk: Deprecate pdump support.

Message ID 20191111185256.25690-1-i.maximets@ovn.org
State Accepted

Commit Message

Ilya Maximets Nov. 11, 2019, 6:52 p.m. UTC
The conventional way to dump packets in OVS is ovs-tcpdump, which
works via traffic mirroring.  DPDK pdump could probably be used
for some lower-level debugging, but it is not commonly used for
various reasons.

There are many limitations to using this functionality in practice.
Most of them are connected with running a secondary pdump process and
with memory layout issues, such as the requirement to disable ASLR in
the kernel.  More details are available in the DPDK guide:
https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
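
The ASLR requirement can be checked before starting a secondary
process.  A minimal sketch, assuming a Linux host that exposes the
setting in /proc/sys/kernel/randomize_va_space (0 means disabled);
the helper name describe_aslr is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: translate a randomize_va_space value into words.
describe_aslr() {
    case "$1" in
        0) echo "disabled" ;;
        1|2) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}

# Read the live setting if the file is readable; otherwise report "unknown".
aslr_file=/proc/sys/kernel/randomize_va_space
if [ -r "$aslr_file" ]; then
    describe_aslr "$(cat "$aslr_file")"
else
    describe_aslr ""
fi
```

This only reports the current state; whether disabling ASLR is
acceptable on a given host is a separate decision.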

Besides the functional limitations, it is also hard to use this
functionality correctly.  The user must be sure that OVS and the pdump
utility are running on different CPU cores, which is hard because
non-PMD threads can float across the available CPU cores.  This or any
other misconfiguration will likely lead to a crash of the pdump utility
and/or OVS.
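
For cores that are pinned through masks, the separation can at least be
sanity-checked.  A minimal sketch with a made-up helper name and
example masks; it cannot account for the floating non-PMD threads
described above, so a "disjoint" result is necessary but not
sufficient:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if two hex CPU masks share a core.
masks_overlap() {
    [ $(( $1 & $2 )) -ne 0 ]
}

# Example values only: 0x1 is core 0, 0x6 is cores 1 and 2.
if masks_overlap 0x1 0x6; then
    echo "overlap"
else
    echo "disjoint"
fi
```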

Another problem is that the user must actually have this special pdump
utility on the system, and it might not be available in distributions.

This change disables pdump support by default and introduces a special
configuration option, '--enable-dpdk-pdump'.  Deprecation warnings will
be shown to users at configuration time and at runtime.
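
Opting in to the option could look like the sketch below.  The helper
name and the DPDK build path are placeholders, not part of this patch;
'--with-dpdk' is the existing OVS configure option, and the command is
printed rather than executed:

```shell
#!/bin/sh
# Hypothetical helper: build the argument list for the OVS ./configure.
# $1: path to the DPDK build directory (placeholder in this sketch)
# $2: "yes" to opt in to the deprecated pdump support
ovs_configure_args() {
    args="--with-dpdk=$1"
    if [ "$2" = "yes" ]; then
        args="$args --enable-dpdk-pdump"
    fi
    echo "$args"
}

# Print the command instead of running it.
echo "./configure $(ovs_configure_args /path/to/dpdk/build yes)"
```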

The plan is to completely remove this functionality from OVS in one
of the next releases.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
---

Version 1:
  * No changes since RFC.
  * Added ACK from Aaron.

 .travis/linux-build.sh              |  4 +++-
 Documentation/topics/dpdk/pdump.rst |  8 +++++++-
 NEWS                                |  4 ++++
 acinclude.m4                        | 24 ++++++++++++++++++------
 lib/dpdk.c                          |  2 ++
 5 files changed, 34 insertions(+), 8 deletions(-)

Comments

Flavio Leitner Nov. 13, 2019, 3:30 p.m. UTC | #1
On Mon, 11 Nov 2019 19:52:56 +0100
Ilya Maximets <i.maximets@ovn.org> wrote:

> The conventional way for packet dumping in OVS is to use ovs-tcpdump
> that works via traffic mirroring.  DPDK pdump could probably be used
> for some lower level debugging, but it is not commonly used for
> various reasons.
> 
> There are lots of limitations for using this functionality in
> practice. Most of them connected with running secondary pdump process
> and memory layout issues like requirement to disable ASLR in kernel.
> More details are available in DPDK guide:
> https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
> 
> Beside the functional limitations it's also hard to use this
> functionality correctly.  User must be sure that OVS and pdump utility
> are running on different CPU cores, which is hard because non-PMD
> threads could float over available CPU cores.  This or any other
> misconfiguration will likely lead to crash of the pdump utility
> or/and OVS.
> 
> Another problem is that the user must actually have this special pdump
> utility in a system and it might be not available in distributions.
> 
> This change disables pdump support by default introducing special
> configuration option '--enable-dpdk-pdump'.  Deprecation warnings will
> be shown to users on configuration and in runtime.
> 
> Claiming to completely remove this functionality from OVS in one
> of the next releases.
> 
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> Acked-by: Aaron Conole <aconole@redhat.com>
> ---
> 
> Version 1:
>   * No changes since RFC.
>   * Added ACK from Aaron.
> 
>  .travis/linux-build.sh              |  4 +++-
>  Documentation/topics/dpdk/pdump.rst |  8 +++++++-
>  NEWS                                |  4 ++++
>  acinclude.m4                        | 24 ++++++++++++++++++------
>  lib/dpdk.c                          |  2 ++
>  5 files changed, 34 insertions(+), 8 deletions(-)

New option noted in NEWS, warning for those users enabling the
option, documentation updated.
LGTM
Acked-by: Flavio Leitner <fbl@sysclose.org>
David Marchand Nov. 14, 2019, 4:41 p.m. UTC | #2
Hello Reshma,

Has pdump been tested (recently) with OVS?


On Mon, Nov 11, 2019 at 7:53 PM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> The conventional way for packet dumping in OVS is to use ovs-tcpdump
> that works via traffic mirroring.  DPDK pdump could probably be used
> for some lower level debugging, but it is not commonly used for
> various reasons.
>
> There are lots of limitations for using this functionality in practice.
> Most of them connected with running secondary pdump process and
> memory layout issues like requirement to disable ASLR in kernel.
> More details are available in DPDK guide:
> https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
>
> Beside the functional limitations it's also hard to use this
> functionality correctly.  User must be sure that OVS and pdump utility
> are running on different CPU cores, which is hard because non-PMD
> threads could float over available CPU cores.  This or any other
> misconfiguration will likely lead to crash of the pdump utility
> or/and OVS.
>
> Another problem is that the user must actually have this special pdump
> utility in a system and it might be not available in distributions.
>
> This change disables pdump support by default introducing special
> configuration option '--enable-dpdk-pdump'.  Deprecation warnings will
> be shown to users on configuration and in runtime.
>
> Claiming to completely remove this functionality from OVS in one
> of the next releases.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> Acked-by: Aaron Conole <aconole@redhat.com>

- Recompiled from scratch, on OVS master (before this patch) with dpdk 18.11.2.

other_config        : {dpdk-init="true", pmd-cpu-mask="0x00008002"}

2 physical ports, 2 vhost ports.

2019-11-14T14:13:09.596Z|00018|dpdk|INFO|Using DPDK 18.11.2
2019-11-14T14:13:09.596Z|00019|dpdk|INFO|DPDK Enabled - initializing...
2019-11-14T14:13:09.596Z|00020|dpdk|INFO|No vhost-sock-dir provided - defaulting to //var/run/openvswitch
2019-11-14T14:13:09.596Z|00021|dpdk|INFO|IOMMU support for vhost-user-client disabled.
2019-11-14T14:13:09.596Z|00022|dpdk|INFO|POSTCOPY support for vhost-user-client disabled.
2019-11-14T14:13:09.596Z|00023|dpdk|INFO|Per port memory for DPDK devices disabled.
2019-11-14T14:13:09.596Z|00024|dpdk|INFO|EAL ARGS: ovs-vswitchd --socket-mem 1024 --socket-limit 1024 -l 0.
2019-11-14T14:13:09.600Z|00025|dpdk|INFO|EAL: Detected 28 lcore(s)
2019-11-14T14:13:09.600Z|00026|dpdk|INFO|EAL: Detected 1 NUMA nodes
2019-11-14T14:13:09.602Z|00027|dpdk|INFO|EAL: Multi-process socket /var/run/openvswitch/dpdk/rte/mp_socket
2019-11-14T14:13:09.618Z|00028|dpdk|INFO|EAL: Probing VFIO support...
2019-11-14T14:13:09.618Z|00029|dpdk|INFO|EAL: VFIO support initialized
2019-11-14T14:13:14.612Z|00030|dpdk|INFO|EAL: PCI device 0000:01:00.0 on NUMA socket 0
2019-11-14T14:13:14.612Z|00031|dpdk|INFO|EAL:   probe driver: 8086:10fb net_ixgbe
2019-11-14T14:13:14.613Z|00032|dpdk|INFO|EAL:   using IOMMU type 1 (Type 1)
2019-11-14T14:13:14.744Z|00033|dpdk|INFO|EAL: Ignore mapping IO port bar(2)
2019-11-14T14:13:15.090Z|00034|dpdk|INFO|EAL: PCI device 0000:01:00.1 on NUMA socket 0
2019-11-14T14:13:15.090Z|00035|dpdk|INFO|EAL:   probe driver: 8086:10fb net_ixgbe
2019-11-14T14:13:15.199Z|00036|dpdk|INFO|EAL: Ignore mapping IO port bar(2)
2019-11-14T14:13:15.530Z|00037|dpdk|INFO|EAL: PCI device 0000:07:00.0 on NUMA socket 0
2019-11-14T14:13:15.530Z|00038|dpdk|INFO|EAL:   probe driver: 8086:1521 net_e1000_igb
2019-11-14T14:13:15.530Z|00039|dpdk|INFO|EAL: PCI device 0000:07:00.1 on NUMA socket 0
2019-11-14T14:13:15.530Z|00040|dpdk|INFO|EAL:   probe driver: 8086:1521 net_e1000_igb
...
2019-11-14T14:13:15.802Z|00042|dpdk|INFO|DPDK pdump packet capture enabled
2019-11-14T14:13:15.803Z|00043|dpdk|INFO|DPDK Enabled - initialized

- Attached a gdb to ovs-vswitchd.

- Started pdump:
# sudo -u openvswitch XDG_RUNTIME_DIR=/var/run/openvswitch ./v18.11.2/app/dpdk-pdump -- --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap'
EAL: Detected 28 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/openvswitch/dpdk/rte/mp_socket_83791_549cdfd05e328e
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL:   probe driver: 8086:10fb net_ixgbe
EAL:   using IOMMU type 1 (Type 1)
EAL: PCI device 0000:01:00.1 on NUMA socket 0
EAL:   probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:07:00.0 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:07:00.1 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
Port 3 MAC: 02 70 63 61 70 00

- Sent one packet to the first physical port from my tgen

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f4840659700 (LWP 84336)]
bucket_dequeue_orphans (n_orphans=251, obj_table=0x14fe5af50, bd=0x14fdad880) at /root/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:190
190            objptr = bucket_stack_pop(bd->buckets[rte_lcore_id()]);
(gdb) bt
#0  bucket_dequeue_orphans (n_orphans=251, obj_table=0x14fe5af50, bd=0x14fdad880) at /root/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:190
#1  bucket_dequeue (mp=<optimized out>, obj_table=0x14fe5af50, n=<optimized out>) at /root/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:288
#2  0x00000000004eeeef in rte_mempool_ops_dequeue_bulk (n=251, obj_table=0x14fe5af50, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:657
#3  __mempool_generic_get (cache=0x14fe5af40, n=1, obj_table=0x7f4840656fe0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:1363
#4  rte_mempool_generic_get (cache=0x14fe5af40, n=1, obj_table=0x7f4840656fe0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:1426
#5  rte_mempool_get_bulk (n=1, obj_table=0x7f4840656fe0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:1459
#6  rte_mempool_get (obj_p=0x7f4840656fe0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:1485
#7  rte_mbuf_raw_alloc (mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mbuf.h:1078
#8  rte_pktmbuf_alloc (mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mbuf.h:1331
#9  pdump_pktmbuf_copy (mp=0x14fe2dac0, m=0x1509ea100) at /root/dpdk/lib/librte_pdump/rte_pdump.c:99
#10 pdump_copy (pkts=<optimized out>, nb_pkts=<optimized out>, user_params=<optimized out>) at /root/dpdk/lib/librte_pdump/rte_pdump.c:151
#11 0x00000000004eff31 in pdump_rx (port=<optimized out>, qidx=<optimized out>, pkts=<optimized out>, nb_pkts=<optimized out>, max_pkts=<optimized out>, user_params=<optimized out>)
    at /root/dpdk/lib/librte_pdump/rte_pdump.c:172
#12 0x00000000009f25fa in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7f4840657110, queue_id=0, port_id=0) at /usr/local/include/dpdk/rte_ethdev.h:3888
#13 netdev_dpdk_rxq_recv (rxq=0x1501bd940, batch=0x7f4840657100, qfill=0x0) at ../lib/netdev-dpdk.c:2287
#14 0x000000000093dab1 in netdev_rxq_recv (rx=<optimized out>, batch=batch@entry=0x7f4840657100, qfill=<optimized out>) at ../lib/netdev.c:724
#15 0x0000000000911694 in dp_netdev_process_rxq_port (pmd=pmd@entry=0x7f484065a010, rxq=0x2711ae0, port_no=3) at ../lib/dpif-netdev.c:4268
#16 0x0000000000911af9 in pmd_thread_main (f_=<optimized out>) at ../lib/dpif-netdev.c:5526
#17 0x000000000099355d in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:383
#18 0x00007f4875402dd5 in start_thread (arg=0x7f4840659700) at pthread_create.c:307
#19 0x00007f4874920ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) p bd->buckets[rte_lcore_id()]
$2 = (struct bucket_stack *) 0x0


The backtrace shows a "bucket" mempool, while OVS uses the default
"ring_mp_mc" mempool.
So maybe something is mismatched here.


On the OVS side, we can see that the pdump code expects this mempool to
have some resource initialised for lcore 15.
I am not sure which process is responsible for this part: the primary
or the secondary?
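
For reference, the pmd-cpu-mask from the configuration above can be
decoded into core numbers with a few lines of shell.  A minimal
sketch with a made-up helper name; it assumes that PMD lcore ids match
the CPU core numbers selected by the mask, which is consistent with the
lcore 15 seen here:

```shell
#!/bin/sh
# Hypothetical helper: print the core indices set in a hex CPU mask.
mask_to_cores() {
    mask=$(( $1 ))   # shell arithmetic accepts 0x-prefixed hex
    core=0
    cores=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            cores="${cores:+$cores }$core"
        fi
        mask=$(( mask >> 1 ))
        core=$(( core + 1 ))
    done
    echo "$cores"
}

mask_to_cores 0x00008002   # prints: 1 15
```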

The pdump application in 18.11.2 has a hardwired core mask of 0x1.
If I remove this restriction (by backporting the commit that did so)
and start pdump on lcores 0, 1 and 15 (to mimic OVS running on master
core 0 plus lcores 1 and 15), I get a segfault a little later, but
again with an uninitialised resource on the OVS side.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ff795324700 (LWP 176548)]
bucket_dequeue_orphans (n_orphans=251, obj_table=0x14fe30bd0, bd=0x14fdad880) at /root/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:194
194                rc = rte_ring_dequeue(bd->shared_bucket_ring,
(gdb) bt
#0  bucket_dequeue_orphans (n_orphans=251, obj_table=0x14fe30bd0, bd=0x14fdad880) at /root/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:194
#1  bucket_dequeue (mp=<optimized out>, obj_table=0x14fe30bd0, n=<optimized out>) at /root/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:288
#2  0x00000000004eeeef in rte_mempool_ops_dequeue_bulk (n=251, obj_table=0x14fe30bd0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:657
#3  __mempool_generic_get (cache=0x14fe30bc0, n=1, obj_table=0x7ff795321fe0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:1363
#4  rte_mempool_generic_get (cache=0x14fe30bc0, n=1, obj_table=0x7ff795321fe0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:1426
#5  rte_mempool_get_bulk (n=1, obj_table=0x7ff795321fe0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:1459
#6  rte_mempool_get (obj_p=0x7ff795321fe0, mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mempool.h:1485
#7  rte_mbuf_raw_alloc (mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mbuf.h:1078
#8  rte_pktmbuf_alloc (mp=0x14fe2dac0) at /root/dpdk/v18.11.2/include/rte_mbuf.h:1331
#9  pdump_pktmbuf_copy (mp=0x14fe2dac0, m=0x1509ea100) at /root/dpdk/lib/librte_pdump/rte_pdump.c:99
#10 pdump_copy (pkts=<optimized out>, nb_pkts=<optimized out>, user_params=<optimized out>) at /root/dpdk/lib/librte_pdump/rte_pdump.c:151
#11 0x00000000004eff31 in pdump_rx (port=<optimized out>, qidx=<optimized out>, pkts=<optimized out>, nb_pkts=<optimized out>, max_pkts=<optimized out>, user_params=<optimized out>)
    at /root/dpdk/lib/librte_pdump/rte_pdump.c:172
#12 0x00000000009f25fa in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7ff795322110, queue_id=0, port_id=0) at /usr/local/include/dpdk/rte_ethdev.h:3888
#13 netdev_dpdk_rxq_recv (rxq=0x1501bd940, batch=0x7ff795322100, qfill=0x0) at ../lib/netdev-dpdk.c:2287
#14 0x000000000093dab1 in netdev_rxq_recv (rx=<optimized out>, batch=batch@entry=0x7ff795322100, qfill=<optimized out>) at ../lib/netdev.c:724
#15 0x0000000000911694 in dp_netdev_process_rxq_port (pmd=pmd@entry=0x7ff795325010, rxq=0x2376b30, port_no=3) at ../lib/dpif-netdev.c:4268
#16 0x0000000000911af9 in pmd_thread_main (f_=<optimized out>) at ../lib/dpif-netdev.c:5526
#17 0x000000000099355d in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:383
#18 0x00007ff7cd81ddd5 in start_thread (arg=0x7ff795324700) at pthread_create.c:307
#19 0x00007ff7ccd3bead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) p bd->shared_bucket_ring
$1 = (struct rte_ring *) 0x0



--
David Marchand
Pattan, Reshma Nov. 14, 2019, 4:48 p.m. UTC | #3
Hi,

Sorry, I don’t work on OVS, so I have never tested pdump with OVS.
I remember Ciara testing earlier versions of pdump with OVS in the past.

Thanks,
Reshma



David Marchand Nov. 19, 2019, 12:45 p.m. UTC | #4
On Mon, Nov 11, 2019 at 7:53 PM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> The conventional way for packet dumping in OVS is to use ovs-tcpdump
> that works via traffic mirroring.  DPDK pdump could probably be used
> for some lower level debugging, but it is not commonly used for
> various reasons.
>
> There are lots of limitations for using this functionality in practice.
> Most of them connected with running secondary pdump process and
> memory layout issues like requirement to disable ASLR in kernel.
> More details are available in DPDK guide:
> https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
>
> Beside the functional limitations it's also hard to use this
> functionality correctly.  User must be sure that OVS and pdump utility
> are running on different CPU cores, which is hard because non-PMD
> threads could float over available CPU cores.  This or any other
> misconfiguration will likely lead to crash of the pdump utility
> or/and OVS.
>
> Another problem is that the user must actually have this special pdump
> utility in a system and it might be not available in distributions.
>
> This change disables pdump support by default introducing special
> configuration option '--enable-dpdk-pdump'.  Deprecation warnings will
> be shown to users on configuration and in runtime.
>
> Claiming to completely remove this functionality from OVS in one
> of the next releases.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> Acked-by: Aaron Conole <aconole@redhat.com>

Acked-by: David Marchand <david.marchand@redhat.com>
Stokes, Ian Nov. 19, 2019, 9:27 p.m. UTC | #5
On 11/19/2019 12:45 PM, David Marchand wrote:
> On Mon, Nov 11, 2019 at 7:53 PM Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> The conventional way for packet dumping in OVS is to use ovs-tcpdump
>> that works via traffic mirroring.  DPDK pdump could probably be used
>> for some lower level debugging, but it is not commonly used for
>> various reasons.
>>
>> There are lots of limitations for using this functionality in practice.
>> Most of them connected with running secondary pdump process and
>> memory layout issues like requirement to disable ASLR in kernel.
>> More details are available in DPDK guide:
>> https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
>>
>> Beside the functional limitations it's also hard to use this
>> functionality correctly.  User must be sure that OVS and pdump utility
>> are running on different CPU cores, which is hard because non-PMD
>> threads could float over available CPU cores.  This or any other
>> misconfiguration will likely lead to crash of the pdump utility
>> or/and OVS.
>>
>> Another problem is that the user must actually have this special pdump
>> utility in a system and it might be not available in distributions.
>>
>> This change disables pdump support by default introducing special
>> configuration option '--enable-dpdk-pdump'.  Deprecation warnings will
>> be shown to users on configuration and in runtime.
>>
>> Claiming to completely remove this functionality from OVS in one
>> of the next releases.
>>
>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>> Acked-by: Aaron Conole <aconole@redhat.com>
> 
> Acked-by: David Marchand <david.marchand@redhat.com>
> 
> 
Thanks all, applied and pushed to master.

Regards
Ian

Patch

diff --git a/.travis/linux-build.sh b/.travis/linux-build.sh
index 69260181b..4e74973a3 100755
--- a/.travis/linux-build.sh
+++ b/.travis/linux-build.sh
@@ -124,7 +124,7 @@  function install_dpdk()
     sed -i '/CONFIG_RTE_EAL_IGB_UIO=y/s/=y/=n/' build/.config
     sed -i '/CONFIG_RTE_KNI_KMOD=y/s/=y/=n/' build/.config
 
-    # Enable pdump.  This will enable building of the relevant OVS code.
+    # Enable pdump support in DPDK.
     sed -i '/CONFIG_RTE_LIBRTE_PMD_PCAP=n/s/=n/=y/' build/.config
     sed -i '/CONFIG_RTE_LIBRTE_PDUMP=n/s/=n/=y/' build/.config
 
@@ -168,6 +168,8 @@  if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then
         DPDK_VER="18.11.2"
     fi
     install_dpdk $DPDK_VER
+    # Enable pdump support in OVS.
+    EXTRA_OPTS="${EXTRA_OPTS} --enable-dpdk-pdump"
     if [ "$CC" = "clang" ]; then
         # Disregard cast alignment errors until DPDK is fixed
         CFLAGS_FOR_OVS="${CFLAGS_FOR_OVS} -Wno-cast-align"
diff --git a/Documentation/topics/dpdk/pdump.rst b/Documentation/topics/dpdk/pdump.rst
index 7bd1d3e9f..b4d8aa8e9 100644
--- a/Documentation/topics/dpdk/pdump.rst
+++ b/Documentation/topics/dpdk/pdump.rst
@@ -27,10 +27,16 @@  pdump
 
 .. versionadded:: 2.6.0
 
+.. warning::
+
+   DPDK pdump support is deprecated in OVS and will be removed in next
+   releases.
+
 pdump allows you to listen on DPDK ports and view the traffic that is passing
 on them. To use this utility, one must have libpcap installed on the system.
 Furthermore, DPDK must be built with ``CONFIG_RTE_LIBRTE_PDUMP=y`` and
-``CONFIG_RTE_LIBRTE_PMD_PCAP=y``.
+``CONFIG_RTE_LIBRTE_PMD_PCAP=y``. OVS should be built with
+``--enable-dpdk-pdump`` configuration option.
 
 .. warning::
 
diff --git a/NEWS b/NEWS
index 88b818948..0d65d5a7f 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,10 @@  Post-v2.12.0
        if supported by libbpf.
      * Add option to enable, disable and query TCP sequence checking in
        conntrack.
+   - DPDK:
+     * DPDK pdump packet capture support disabled by default. New configure
+       option '--enable-dpdk-pdump' to enable it.
+     * DPDK pdump support is deprecated and will be removed in next releases.
 
 v2.12.0 - 03 Sep 2019
 ---------------------
diff --git a/acinclude.m4 b/acinclude.m4
index fc6157ac8..542637ac8 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -357,12 +357,24 @@  AC_DEFUN([OVS_CHECK_DPDK], [
       AC_DEFINE([VHOST_NUMA], [1], [NUMA Aware vHost support detected in DPDK.])
     ], [], [[#include <rte_config.h>]])
 
-    AC_CHECK_DECL([RTE_LIBRTE_PMD_PCAP], [
-      OVS_FIND_DEPENDENCY([pcap_dump], [pcap], [libpcap])
-      AC_CHECK_DECL([RTE_LIBRTE_PDUMP], [
-        AC_DEFINE([DPDK_PDUMP], [1], [DPDK pdump enabled in OVS.])
-      ], [], [[#include <rte_config.h>]])
-    ], [], [[#include <rte_config.h>]])
+   AC_MSG_CHECKING([whether DPDK pdump support is enabled])
+   AC_ARG_ENABLE(
+     [dpdk-pdump],
+     [AC_HELP_STRING([--enable-dpdk-pdump],
+                     [Enable DPDK pdump packet capture support])],
+     [AC_MSG_RESULT([yes])
+      AC_MSG_WARN([DPDK pdump is deprecated, consider using ovs-tcpdump instead])
+      AC_CHECK_DECL([RTE_LIBRTE_PMD_PCAP], [
+        OVS_FIND_DEPENDENCY([pcap_dump], [pcap], [libpcap])
+        AC_CHECK_DECL([RTE_LIBRTE_PDUMP], [
+          AC_DEFINE([DPDK_PDUMP], [1], [DPDK pdump enabled in OVS.])
+        ], [
+          AC_MSG_ERROR([RTE_LIBRTE_PDUMP is not defined in rte_config.h])
+        ], [[#include <rte_config.h>]])
+      ], [
+        AC_MSG_ERROR([RTE_LIBRTE_PMD_PCAP is not defined in rte_config.h])
+      ], [[#include <rte_config.h>]])],
+      [AC_MSG_RESULT([no])])
 
     AC_CHECK_DECL([RTE_LIBRTE_MLX5_PMD], [dnl found
       OVS_FIND_DEPENDENCY([mnl_attr_put], [mnl], [libmnl])
diff --git a/lib/dpdk.c b/lib/dpdk.c
index f90cda75a..21dd47e80 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -434,6 +434,8 @@  dpdk_init__(const struct smap *ovs_other_config)
 
 #ifdef DPDK_PDUMP
     VLOG_INFO("DPDK pdump packet capture enabled");
+    VLOG_WARN("DPDK pdump support is deprecated and "
+              "will be removed in next OVS releases.");
     err = rte_pdump_init(ovs_rundir());
     if (err) {
         VLOG_INFO("Error initialising DPDK pdump");