[ovs-dev,v3] dpdk: Migrate to the new pdump API.
diff mbox series

Message ID 20191111150129.22847-1-i.maximets@ovn.org
State Accepted
Headers show
Series
  • [ovs-dev,v3] dpdk: Migrate to the new pdump API.
Related show

Commit Message

Ilya Maximets Nov. 11, 2019, 3:01 p.m. UTC
DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
switched pdump to use generic DPDK IPC instead of sockets.
Old API was deprecated and removed.  Updating OVS code accordingly.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---

Version 3:
  * Added note about XDG_RUNTIME_DIR.

Version 2:
  * Removed unneeded deinitialization on error.
  * Docs updated.

 Documentation/topics/dpdk/pdump.rst | 13 +++++++------
 lib/dpdk.c                          | 11 +----------
 2 files changed, 8 insertions(+), 16 deletions(-)

Comments

Ilya Maximets Nov. 11, 2019, 5:25 p.m. UTC | #1
Sorry, I missed the 'dpdk-latest' tag.

On 11.11.2019 16:01, Ilya Maximets wrote:
> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
> switched pdump to use generic DPDK IPC instead of sockets.
> Old API was deprecated and removed.  Updating OVS code accordingly.
> 
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
> 
> Version 3:
>   * Added note about XDG_RUNTIME_DIR.
> 
> Version 2:
>   * Removed unneeded deinitialization on error.
>   * Docs updated.
> 
>  Documentation/topics/dpdk/pdump.rst | 13 +++++++------
>  lib/dpdk.c                          | 11 +----------
>  2 files changed, 8 insertions(+), 16 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/pdump.rst b/Documentation/topics/dpdk/pdump.rst
> index 7bd1d3e9f..affd98371 100644
> --- a/Documentation/topics/dpdk/pdump.rst
> +++ b/Documentation/topics/dpdk/pdump.rst
> @@ -41,8 +41,7 @@ To use pdump, simply launch OVS as usual, then navigate to the ``app/pdump``
>  directory in DPDK, ``make`` the application and run like so::
>  
>      $ sudo ./build/app/dpdk-pdump -- \
> -        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap \
> -        --server-socket-path=/usr/local/var/run/openvswitch
> +        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap
>  
>  The above command captures traffic received on queue 0 of port 0 and stores it
>  in ``/tmp/pkts.pcap``. Other combinations of port numbers, queues numbers and
> @@ -50,11 +49,13 @@ pcap locations are of course also available to use. For example, to capture all
>  packets that traverse port 0 in a single pcap file::
>  
>      $ sudo ./build/app/dpdk-pdump -- \
> -        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap' \
> -        --server-socket-path=/usr/local/var/run/openvswitch
> +        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap'
>  
> -``server-socket-path`` must be set to the value of ``ovs_rundir()`` which
> -typically resolves to ``/usr/local/var/run/openvswitch``.
> +.. note::
> +
> +   ``XDG_RUNTIME_DIR`` environment variable might need to be adjusted to
> +   OVS runtime directory (``/var/run/openvswitch`` in most cases) for
> +   ``dpdk-pdump`` utility if OVS started by non-root user.
>  
>  Many tools are available to view the contents of the pcap file. Once example is
>  tcpdump. Issue the following command to view the contents of ``pkts.pcap``::
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index f90cda75a..748f63d31 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -27,7 +27,6 @@
>  #include <rte_memzone.h>
>  #include <rte_version.h>
>  #ifdef DPDK_PDUMP
> -#include <rte_mempool.h>
>  #include <rte_pdump.h>
>  #endif
>  
> @@ -434,17 +433,9 @@ dpdk_init__(const struct smap *ovs_other_config)
>  
>  #ifdef DPDK_PDUMP
>      VLOG_INFO("DPDK pdump packet capture enabled");
> -    err = rte_pdump_init(ovs_rundir());
> +    err = rte_pdump_init();
>      if (err) {
>          VLOG_INFO("Error initialising DPDK pdump");
> -        rte_pdump_uninit();
> -    } else {
> -        char *server_socket_path;
> -
> -        server_socket_path = xasprintf("%s/%s", ovs_rundir(),
> -                                       "pdump_server_socket");
> -        fatal_signal_add_file_to_unlink(server_socket_path);
> -        free(server_socket_path);
>      }
>  #endif
>  
>
Stokes, Ian Nov. 12, 2019, 5:07 p.m. UTC | #2
On 11/11/2019 3:01 PM, Ilya Maximets wrote:
> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
> switched pdump to use generic DPDK IPC instead of sockets.
> Old API was deprecated and removed.  Updating OVS code accordingly.
> 
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Thanks for the patch Ilya.

I see compilation passing now on dpdk-latest with this applied.

https://travis-ci.org/istokes/ovs/builds/610915636

I still had issues with running PDUMP, but those issues are specific to 
PDUMP setup in my environment. A separate issue we can discuss further 
on the deprecation thread as it seems unrelated to this patch.

@David, are you happy to ack the patch (I see some of the changes are 
from your side).

Thanks
Ian
> ---
> 
> Version 3:
>    * Added note about XDG_RUNTIME_DIR.
> 
> Version 2:
>    * Removed unneeded deinitialization on error.
>    * Docs updated.
> 
>   Documentation/topics/dpdk/pdump.rst | 13 +++++++------
>   lib/dpdk.c                          | 11 +----------
>   2 files changed, 8 insertions(+), 16 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/pdump.rst b/Documentation/topics/dpdk/pdump.rst
> index 7bd1d3e9f..affd98371 100644
> --- a/Documentation/topics/dpdk/pdump.rst
> +++ b/Documentation/topics/dpdk/pdump.rst
> @@ -41,8 +41,7 @@ To use pdump, simply launch OVS as usual, then navigate to the ``app/pdump``
>   directory in DPDK, ``make`` the application and run like so::
>   
>       $ sudo ./build/app/dpdk-pdump -- \
> -        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap \
> -        --server-socket-path=/usr/local/var/run/openvswitch
> +        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap
>   
>   The above command captures traffic received on queue 0 of port 0 and stores it
>   in ``/tmp/pkts.pcap``. Other combinations of port numbers, queues numbers and
> @@ -50,11 +49,13 @@ pcap locations are of course also available to use. For example, to capture all
>   packets that traverse port 0 in a single pcap file::
>   
>       $ sudo ./build/app/dpdk-pdump -- \
> -        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap' \
> -        --server-socket-path=/usr/local/var/run/openvswitch
> +        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap'
>   
> -``server-socket-path`` must be set to the value of ``ovs_rundir()`` which
> -typically resolves to ``/usr/local/var/run/openvswitch``.
> +.. note::
> +
> +   ``XDG_RUNTIME_DIR`` environment variable might need to be adjusted to
> +   OVS runtime directory (``/var/run/openvswitch`` in most cases) for
> +   ``dpdk-pdump`` utility if OVS started by non-root user.
>   
>   Many tools are available to view the contents of the pcap file. Once example is
>   tcpdump. Issue the following command to view the contents of ``pkts.pcap``::
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index f90cda75a..748f63d31 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -27,7 +27,6 @@
>   #include <rte_memzone.h>
>   #include <rte_version.h>
>   #ifdef DPDK_PDUMP
> -#include <rte_mempool.h>
>   #include <rte_pdump.h>
>   #endif
>   
> @@ -434,17 +433,9 @@ dpdk_init__(const struct smap *ovs_other_config)
>   
>   #ifdef DPDK_PDUMP
>       VLOG_INFO("DPDK pdump packet capture enabled");
> -    err = rte_pdump_init(ovs_rundir());
> +    err = rte_pdump_init();
>       if (err) {
>           VLOG_INFO("Error initialising DPDK pdump");
> -        rte_pdump_uninit();
> -    } else {
> -        char *server_socket_path;
> -
> -        server_socket_path = xasprintf("%s/%s", ovs_rundir(),
> -                                       "pdump_server_socket");
> -        fatal_signal_add_file_to_unlink(server_socket_path);
> -        free(server_socket_path);
>       }
>   #endif
>   
>
David Marchand Nov. 12, 2019, 5:15 p.m. UTC | #3
On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian <ian.stokes@intel.com> wrote:
> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
> > DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
> > switched pdump to use generic DPDK IPC instead of sockets.
> > Old API was deprecated and removed.  Updating OVS code accordingly.
> >
> > Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>
> Thanks for the patch Ilya.
>
> I see compilation passing now on dpdk-latest with this applied.
>
> https://travis-ci.org/istokes/ovs/builds/610915636
>
> I still had issues with running PDUMP, but those issues are specific to
> PDUMP setup in my environment. A separate issue we can discuss further
> on the deprecation thread as it seems unrelated to this patch.
>
> @David, are you happy to ack the patch (I see some of the changes are
> from your side).

I did not work on the crash I saw, but it was most likely a problem on my side.
This looks good to me.

Acked-by: David Marchand <david.marchand@redhat.com>


--
David Marchand
Stokes, Ian Nov. 12, 2019, 6:51 p.m. UTC | #4
On 11/12/2019 5:15 PM, David Marchand wrote:
> On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian <ian.stokes@intel.com> wrote:
>> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
>>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
>>> switched pdump to use generic DPDK IPC instead of sockets.
>>> Old API was deprecated and removed.  Updating OVS code accordingly.
>>>
>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>
>> Thanks for the patch Ilya.
>>
>> I see compilation passing now on dpdk-latest with this applied.
>>
>> https://travis-ci.org/istokes/ovs/builds/610915636
>>
>> I still had issues with running PDUMP, but those issues are specific to
>> PDUMP setup in my environment. A separate issue we can discuss further
>> on the deprecation thread as it seems unrelated to this patch.
>>
>> @David, are you happy to ack the patch (I see some of the changes are
>> from your side).
> 
> I did not work on the crash I saw, but it was most likely a problem on my side.
> This looks good to me.

 From a some further testing on my side I'm also seeing a crash, 
specifically OVS crashes out once packets are received. PDUMP is still 
running but complains of being unable to communicate with the primary 
process and then exits. Is this similar to what you saw?

@Ilya, by chance did you see anything like this? I'll investigate 
further myself tomorrow. I was going to hold off on the merge in the 
meantime. Thoughts?

Regards
Ian
Ilya Maximets Nov. 12, 2019, 7:40 p.m. UTC | #5
On 12.11.2019 19:51, Stokes, Ian wrote:
> 
> 
> On 11/12/2019 5:15 PM, David Marchand wrote:
>> On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian <ian.stokes@intel.com> wrote:
>>> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
>>>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
>>>> switched pdump to use generic DPDK IPC instead of sockets.
>>>> Old API was deprecated and removed.  Updating OVS code accordingly.
>>>>
>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>
>>> Thanks for the patch Ilya.
>>>
>>> I see compilation passing now on dpdk-latest with this applied.
>>>
>>> https://travis-ci.org/istokes/ovs/builds/610915636
>>>
>>> I still had issues with running PDUMP, but those issues are specific to
>>> PDUMP setup in my environment. A separate issue we can discuss further
>>> on the deprecation thread as it seems unrelated to this patch.
>>>
>>> @David, are you happy to ack the patch (I see some of the changes are
>>> from your side).
>>
>> I did not work on the crash I saw, but it was most likely a problem on my side.
>> This looks good to me.
> 
> From a some further testing on my side I'm also seeing a crash, specifically OVS crashes out once packets are received. PDUMP is still running but complains of being unable to communicate with the primary process and then exits. Is this similar to what you saw?
> 
> @Ilya, by chance did you see anything like this?


Honestly, I never tried to use pdump, and I don't really want to try
preparing the setup for it (building ASLR disabled kernel and stuff).


> I'll investigate further myself tomorrow. I was going to hold off on the merge in the meantime. Thoughts?

There are 2 options here:
1. Apply this patch and hope that DPDK will be fixed someday.
   + Optionally apply deprecation patch.
2. Completely remove pdump support now without prior deprecation
   because it just doesn't work.

Best regards, Ilya Maximets.
David Marchand Nov. 12, 2019, 8:13 p.m. UTC | #6
On Tue, Nov 12, 2019 at 8:41 PM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 12.11.2019 19:51, Stokes, Ian wrote:
> >
> >
> > On 11/12/2019 5:15 PM, David Marchand wrote:
> >> On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian <ian.stokes@intel.com> wrote:
> >>> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
> >>>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
> >>>> switched pdump to use generic DPDK IPC instead of sockets.
> >>>> Old API was deprecated and removed.  Updating OVS code accordingly.
> >>>>
> >>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> >>>
> >>> Thanks for the patch Ilya.
> >>>
> >>> I see compilation passing now on dpdk-latest with this applied.
> >>>
> >>> https://travis-ci.org/istokes/ovs/builds/610915636
> >>>
> >>> I still had issues with running PDUMP, but those issues are specific to
> >>> PDUMP setup in my environment. A separate issue we can discuss further
> >>> on the deprecation thread as it seems unrelated to this patch.
> >>>
> >>> @David, are you happy to ack the patch (I see some of the changes are
> >>> from your side).
> >>
> >> I did not work on the crash I saw, but it was most likely a problem on my side.
> >> This looks good to me.
> >
> > From a some further testing on my side I'm also seeing a crash, specifically OVS crashes out once packets are received. PDUMP is still running but complains of being unable to communicate with the primary process and then exits. Is this similar to what you saw?
> >
> > @Ilya, by chance did you see anything like this?
>
>
> Honestly, I never tried to use pdump, and I don't really want to try
> preparing the setup for it (building ASLR disabled kernel and stuff).
>
>
> > I'll investigate further myself tomorrow. I was going to hold off on the merge in the meantime. Thoughts?
>
> There are 2 options here:
> 1. Apply this patch and hope that DPDK will be fixed someday.
>    + Optionally apply deprecation patch.

This is pure speculation, but when I saw the crash before, I thought
that the problem was in the way ovs creates its thread without the
dpdk being aware of it.
dpdk pdump component expects that it's running on a EAL thread, with a
known lcore, and *boom* when it dereferences some uninitialized
structures/resources.

I did not really investigate, I just fear we have this class of
issues, since dpdk (and its sub components) is not instructed by ovs
how it placed its threads.
ovs has been doing this for some time, without people hitting bugs, so
I might just be paranoid.


> 2. Completely remove pdump support now without prior deprecation
>    because it just doesn't work.

That is an alternative too.



--
David Marchand
Stokes, Ian Nov. 13, 2019, 2:28 p.m. UTC | #7
On 11/12/2019 7:40 PM, Ilya Maximets wrote:
> On 12.11.2019 19:51, Stokes, Ian wrote:
>>
>>
>> On 11/12/2019 5:15 PM, David Marchand wrote:
>>> On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian <ian.stokes@intel.com> wrote:
>>>> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
>>>>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
>>>>> switched pdump to use generic DPDK IPC instead of sockets.
>>>>> Old API was deprecated and removed.  Updating OVS code accordingly.
>>>>>
>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>>
>>>> Thanks for the patch Ilya.
>>>>
>>>> I see compilation passing now on dpdk-latest with this applied.
>>>>
>>>> https://travis-ci.org/istokes/ovs/builds/610915636
>>>>
>>>> I still had issues with running PDUMP, but those issues are specific to
>>>> PDUMP setup in my environment. A separate issue we can discuss further
>>>> on the deprecation thread as it seems unrelated to this patch.
>>>>
>>>> @David, are you happy to ack the patch (I see some of the changes are
>>>> from your side).
>>>
>>> I did not work on the crash I saw, but it was most likely a problem on my side.
>>> This looks good to me.
>>
>>  From a some further testing on my side I'm also seeing a crash, specifically OVS crashes out once packets are received. PDUMP is still running but complains of being unable to communicate with the primary process and then exits. Is this similar to what you saw?
>>
>> @Ilya, by chance did you see anything like this?
> 
> 
> Honestly, I never tried to use pdump, and I don't really want to try
> preparing the setup for it (building ASLR disabled kernel and stuff).
> 

I haven't run it a while myself either. Typically use ovs-tcpdump.


> 
>> I'll investigate further myself tomorrow. I was going to hold off on the merge in the meantime. Thoughts?
> 
> There are 2 options here:
> 1. Apply this patch and hope that DPDK will be fixed someday.
>     + Optionally apply deprecation patch.
> 2. Completely remove pdump support now without prior deprecation
>     because it just doesn't work.
I'd like option 1, apply the patches & deprecate on dpdk-latest. This at 
least gets the dpdk-latest passing with rc2 for the moment (I'd do a 
rebase for ovs master with this also).

Before removing pdump support completely I'd like to investigate 
further, if it can be confirmed to be a DPDK/OVS compatibility issue WRT 
the PMDs as David has suggested, then thats fine,  once confirmed I feel 
it would be fair to remove it completely then as it just doesn't work.

Thoughts?

> 
> Best regards, Ilya Maximets.
>
Ilya Maximets Nov. 13, 2019, 3:26 p.m. UTC | #8
On 13.11.2019 15:28, Stokes, Ian wrote:
> 
> 
> On 11/12/2019 7:40 PM, Ilya Maximets wrote:
>> On 12.11.2019 19:51, Stokes, Ian wrote:
>>>
>>>
>>> On 11/12/2019 5:15 PM, David Marchand wrote:
>>>> On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian <ian.stokes@intel.com> wrote:
>>>>> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
>>>>>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
>>>>>> switched pdump to use generic DPDK IPC instead of sockets.
>>>>>> Old API was deprecated and removed.  Updating OVS code accordingly.
>>>>>>
>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>>>
>>>>> Thanks for the patch Ilya.
>>>>>
>>>>> I see compilation passing now on dpdk-latest with this applied.
>>>>>
>>>>> https://travis-ci.org/istokes/ovs/builds/610915636
>>>>>
>>>>> I still had issues with running PDUMP, but those issues are specific to
>>>>> PDUMP setup in my environment. A separate issue we can discuss further
>>>>> on the deprecation thread as it seems unrelated to this patch.
>>>>>
>>>>> @David, are you happy to ack the patch (I see some of the changes are
>>>>> from your side).
>>>>
>>>> I did not work on the crash I saw, but it was most likely a problem on my side.
>>>> This looks good to me.
>>>
>>>  From a some further testing on my side I'm also seeing a crash, specifically OVS crashes out once packets are received. PDUMP is still running but complains of being unable to communicate with the primary process and then exits. Is this similar to what you saw?
>>>
>>> @Ilya, by chance did you see anything like this?
>>
>>
>> Honestly, I never tried to use pdump, and I don't really want to try
>> preparing the setup for it (building ASLR disabled kernel and stuff).
>>
> 
> I haven't run it a while myself either. Typically use ovs-tcpdump.
> 
> 
>>
>>> I'll investigate further myself tomorrow. I was going to hold off on the merge in the meantime. Thoughts?
>>
>> There are 2 options here:
>> 1. Apply this patch and hope that DPDK will be fixed someday.
>>     + Optionally apply deprecation patch.
>> 2. Completely remove pdump support now without prior deprecation
>>     because it just doesn't work.
> I'd like option 1, apply the patches & deprecate on dpdk-latest. This at least gets the dpdk-latest passing with rc2 for the moment (I'd do a rebase for ovs master with this also).
> 
> Before removing pdump support completely I'd like to investigate further, if it can be confirmed to be a DPDK/OVS compatibility issue WRT the PMDs as David has suggested, then thats fine,  once confirmed I feel it would be fair to remove it completely then as it just doesn't work.
> 
> Thoughts?

OK for me.

Best regards, Ilya Maximets.
Ilya Maximets Nov. 13, 2019, 3:48 p.m. UTC | #9
On 12.11.2019 21:13, David Marchand wrote:
> On Tue, Nov 12, 2019 at 8:41 PM Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> On 12.11.2019 19:51, Stokes, Ian wrote:
>>>
>>>
>>> On 11/12/2019 5:15 PM, David Marchand wrote:
>>>> On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian <ian.stokes@intel.com> wrote:
>>>>> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
>>>>>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
>>>>>> switched pdump to use generic DPDK IPC instead of sockets.
>>>>>> Old API was deprecated and removed.  Updating OVS code accordingly.
>>>>>>
>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>>>
>>>>> Thanks for the patch Ilya.
>>>>>
>>>>> I see compilation passing now on dpdk-latest with this applied.
>>>>>
>>>>> https://travis-ci.org/istokes/ovs/builds/610915636
>>>>>
>>>>> I still had issues with running PDUMP, but those issues are specific to
>>>>> PDUMP setup in my environment. A separate issue we can discuss further
>>>>> on the deprecation thread as it seems unrelated to this patch.
>>>>>
>>>>> @David, are you happy to ack the patch (I see some of the changes are
>>>>> from your side).
>>>>
>>>> I did not work on the crash I saw, but it was most likely a problem on my side.
>>>> This looks good to me.
>>>
>>> From a some further testing on my side I'm also seeing a crash, specifically OVS crashes out once packets are received. PDUMP is still running but complains of being unable to communicate with the primary process and then exits. Is this similar to what you saw?
>>>
>>> @Ilya, by chance did you see anything like this?
>>
>>
>> Honestly, I never tried to use pdump, and I don't really want to try
>> preparing the setup for it (building ASLR disabled kernel and stuff).
>>
>>
>>> I'll investigate further myself tomorrow. I was going to hold off on the merge in the meantime. Thoughts?
>>
>> There are 2 options here:
>> 1. Apply this patch and hope that DPDK will be fixed someday.
>>    + Optionally apply deprecation patch.
> 
> This is pure speculation, but when I saw the crash before, I thought
> that the problem was in the way ovs creates its thread without the
> dpdk being aware of it.

At least, lcore_ids are set.

> dpdk pdump component expects that it's running on a EAL thread, with a
> known lcore, and *boom* when it dereferences some uninitialized
> structures/resources.
> 
> I did not really investigate, I just fear we have this class of
> issues, since dpdk (and its sub components) is not instructed by ovs
> how it placed its threads.
> ovs has been doing this for some time, without people hitting bugs, so
> I might just be paranoid.

BTW, it seems for me that all this "EAL threads" concept is an "RTE"
legacy.  I understand that automatic support for dynamically created
threads in the outer application sounds like a Sci-Fi fantasy, but as
far as DPDK tries to be a library, I think, it should at least allow
users to register their own threads.
Large applications that wants to use DPDK, but also wants to be usable
without it will likely not use EAL-threads because it's not flexible
and not re-usable.  Otherwise they will need to create layers of proxy
libraries to abstract their thread management.

Best regards, Ilya Maximets.
David Marchand Nov. 13, 2019, 4:28 p.m. UTC | #10
On Wed, Nov 13, 2019 at 4:48 PM Ilya Maximets <i.maximets@ovn.org> wrote:
> On 12.11.2019 21:13, David Marchand wrote:
> > This is pure speculation, but when I saw the crash before, I thought
> > that the problem was in the way ovs creates its thread without the
> > dpdk being aware of it.
>
> At least, lcore_ids are set.

It works, but this is a hack.

>
> > dpdk pdump component expects that it's running on a EAL thread, with a
> > known lcore, and *boom* when it dereferences some uninitialized
> > structures/resources.
> >
> > I did not really investigate, I just fear we have this class of
> > issues, since dpdk (and its sub components) is not instructed by ovs
> > how it placed its threads.
> > ovs has been doing this for some time, without people hitting bugs, so
> > I might just be paranoid.
>
> BTW, it seems for me that all this "EAL threads" concept is an "RTE"
> legacy.  I understand that automatic support for dynamically created
> threads in the outer application sounds like a Sci-Fi fantasy, but as
> far as DPDK tries to be a library, I think, it should at least allow
> users to register their own threads.
> Large applications that wants to use DPDK, but also wants to be usable
> without it will likely not use EAL-threads because it's not flexible
> and not re-usable.  Otherwise they will need to create layers of proxy
> libraries to abstract their thread management.

Not entering the debate on what is legacy.

I agree that DPDK should provide a way to register threads.
I don't remember discussion with OVS people when/if DPDK integration
has been done.
If it happened, pointers are welcome.


Anyway, it seems worth looking at, but not necessarily easy.
It would make dpdk a little more like a library.
I will put this in my dpdk todolist (for the next release(s)).


--
David Marchand
Stokes, Ian Nov. 19, 2019, 9:44 p.m. UTC | #11
On 11/13/2019 3:26 PM, Ilya Maximets wrote:
> On 13.11.2019 15:28, Stokes, Ian wrote:
>>
>>
>> On 11/12/2019 7:40 PM, Ilya Maximets wrote:
>>> On 12.11.2019 19:51, Stokes, Ian wrote:
>>>>
>>>>
>>>> On 11/12/2019 5:15 PM, David Marchand wrote:
>>>>> On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian <ian.stokes@intel.com> wrote:
>>>>>> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
>>>>>>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
>>>>>>> switched pdump to use generic DPDK IPC instead of sockets.
>>>>>>> Old API was deprecated and removed.  Updating OVS code accordingly.
>>>>>>>
>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>>>>
>>>>>> Thanks for the patch Ilya.
>>>>>>
>>>>>> I see compilation passing now on dpdk-latest with this applied.
>>>>>>
>>>>>> https://travis-ci.org/istokes/ovs/builds/610915636
>>>>>>
>>>>>> I still had issues with running PDUMP, but those issues are specific to
>>>>>> PDUMP setup in my environment. A separate issue we can discuss further
>>>>>> on the deprecation thread as it seems unrelated to this patch.
>>>>>>
>>>>>> @David, are you happy to ack the patch (I see some of the changes are
>>>>>> from your side).
>>>>>
>>>>> I did not work on the crash I saw, but it was most likely a problem on my side.
>>>>> This looks good to me.
>>>>
>>>>   From a some further testing on my side I'm also seeing a crash, specifically OVS crashes out once packets are received. PDUMP is still running but complains of being unable to communicate with the primary process and then exits. Is this similar to what you saw?
>>>>
>>>> @Ilya, by chance did you see anything like this?
>>>
>>>
>>> Honestly, I never tried to use pdump, and I don't really want to try
>>> preparing the setup for it (building ASLR disabled kernel and stuff).
>>>
>>
>> I haven't run it a while myself either. Typically use ovs-tcpdump.
>>
>>
>>>
>>>> I'll investigate further myself tomorrow. I was going to hold off on the merge in the meantime. Thoughts?
>>>
>>> There are 2 options here:
>>> 1. Apply this patch and hope that DPDK will be fixed someday.
>>>      + Optionally apply deprecation patch.
>>> 2. Completely remove pdump support now without prior deprecation
>>>      because it just doesn't work.
>> I'd like option 1, apply the patches & deprecate on dpdk-latest. This at least gets the dpdk-latest passing with rc2 for the moment (I'd do a rebase for ovs master with this also).
>>
>> Before removing pdump support completely I'd like to investigate further, if it can be confirmed to be a DPDK/OVS compatibility issue WRT the PMDs as David has suggested, then thats fine,  once confirmed I feel it would be fair to remove it completely then as it just doesn't work.
>>
>> Thoughts?
> 
> OK for me.
> 
Thanks, I've pushed this to dpdk-latest, I've also rebased dpdk-latest 
with ovs master (I had applied the pdump deprecation notice on OVS 
master already as there was no need to wait for 19.11 for that).

Thanks
Ian

Patch
diff mbox series

diff --git a/Documentation/topics/dpdk/pdump.rst b/Documentation/topics/dpdk/pdump.rst
index 7bd1d3e9f..affd98371 100644
--- a/Documentation/topics/dpdk/pdump.rst
+++ b/Documentation/topics/dpdk/pdump.rst
@@ -41,8 +41,7 @@  To use pdump, simply launch OVS as usual, then navigate to the ``app/pdump``
 directory in DPDK, ``make`` the application and run like so::
 
     $ sudo ./build/app/dpdk-pdump -- \
-        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap \
-        --server-socket-path=/usr/local/var/run/openvswitch
+        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap
 
 The above command captures traffic received on queue 0 of port 0 and stores it
 in ``/tmp/pkts.pcap``. Other combinations of port numbers, queues numbers and
@@ -50,11 +49,13 @@  pcap locations are of course also available to use. For example, to capture all
 packets that traverse port 0 in a single pcap file::
 
     $ sudo ./build/app/dpdk-pdump -- \
-        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap' \
-        --server-socket-path=/usr/local/var/run/openvswitch
+        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap'
 
-``server-socket-path`` must be set to the value of ``ovs_rundir()`` which
-typically resolves to ``/usr/local/var/run/openvswitch``.
+.. note::
+
+   ``XDG_RUNTIME_DIR`` environment variable might need to be adjusted to
+   OVS runtime directory (``/var/run/openvswitch`` in most cases) for
+   ``dpdk-pdump`` utility if OVS started by non-root user.
 
 Many tools are available to view the contents of the pcap file. Once example is
 tcpdump. Issue the following command to view the contents of ``pkts.pcap``::
diff --git a/lib/dpdk.c b/lib/dpdk.c
index f90cda75a..748f63d31 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -27,7 +27,6 @@ 
 #include <rte_memzone.h>
 #include <rte_version.h>
 #ifdef DPDK_PDUMP
-#include <rte_mempool.h>
 #include <rte_pdump.h>
 #endif
 
@@ -434,17 +433,9 @@  dpdk_init__(const struct smap *ovs_other_config)
 
 #ifdef DPDK_PDUMP
     VLOG_INFO("DPDK pdump packet capture enabled");
-    err = rte_pdump_init(ovs_rundir());
+    err = rte_pdump_init();
     if (err) {
         VLOG_INFO("Error initialising DPDK pdump");
-        rte_pdump_uninit();
-    } else {
-        char *server_socket_path;
-
-        server_socket_path = xasprintf("%s/%s", ovs_rundir(),
-                                       "pdump_server_socket");
-        fatal_signal_add_file_to_unlink(server_socket_path);
-        free(server_socket_path);
     }
 #endif