diff mbox series

[ovs-dev] dpdk: Use --in-memory by default.

Message ID 20210806124432.43042-1-roriorde@redhat.com
State Superseded
Headers show
Series [ovs-dev] dpdk: Use --in-memory by default. | expand

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot success github build: passed

Commit Message

Rosemarie O'Riorden Aug. 6, 2021, 12:44 p.m. UTC
If anonymous memory mapping is supported by the kernel, it's better
to run OVS entirely in memory rather than creating shared data
structures. OVS doesn't work in multi-process mode, so there is no need
to litter a filesystem and experience random crashes due to old memory
chunks stored in re-opened files.

When OVS is not running in memory and crashes, it never reaches the
clean up scripts that delete the new files it has created, resulting in
"dirty" memory. OVS will partially overwrite this memory on the next
start-up, but will fail to run again because its filesystem is full of
old memory.

Here is an example of these crashes:
  https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/

Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949849
Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com>
---
 NEWS         | 1 +
 acinclude.m4 | 6 ++++++
 lib/dpdk.c   | 7 +++++++
 3 files changed, 14 insertions(+)

Comments

0-day Robot Aug. 6, 2021, 12:58 p.m. UTC | #1
Bleep bloop.  Greetings Rosemarie O'Riorden, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line has trailing whitespace
#67 FILE: lib/dpdk.c:409:
    if (!args_contains(&args, "--in-memory") && 

Lines checked: 78, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Ilya Maximets Aug. 11, 2021, 11:51 a.m. UTC | #2
On 8/6/21 2:44 PM, Rosemarie O'Riorden wrote:
> If anonymous memory mapping is supported by the kernel, it's better
> to run OVS entirely in memory rather than creating shared data
> structures. OVS doesn't work in multi-process mode, so there is no need
> to litter a filesystem and experience random crashes due to old memory
> chunks stored in re-opened files.
> 
> When OVS is not running in memory and crashes, it never reaches the
> clean up scripts that delete the new files it has created, resulting in
> "dirty" memory. OVS will partially overwrite this memory on the next
> start-up, but will fail to run again because its filesystem is full of
> old memory.
> 
> Here is an example of these crashes:
>   https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/
> 
> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949849
> Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com>
> ---

Thanks for the patch!

Anonymous hugepage mapping is supported starting from 3.8 kernel, and
the oldest longterm upstream kernel currently is 4.4, so it should be
supported in any modern system.  It's nice to have a compile time check
though.  So we're good here.

We also removed from OVS all the features that required support for
multi-process mode (pdump and ring ports), so it should be safe to
use --in-memory.

All in all, beside the trailing whitespace warning, the patch looks
good to me.

Ian, what do you think?

Best regards, Ilya Maximets.

>  NEWS         | 1 +
>  acinclude.m4 | 6 ++++++
>  lib/dpdk.c   | 7 +++++++
>  3 files changed, 14 insertions(+)
> 
> diff --git a/NEWS b/NEWS
> index 26920e215..515d3cddd 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -8,6 +8,7 @@ Post-v2.16.0
>         by default.  'other_config:dpdk-socket-limit' can be set equal to
>         the 'other_config:dpdk-socket-mem' to preserve the legacy memory
>         limiting behavior.
> +     * EAL argument --in-memory is applied by default if supported.
>  
>  
>  v2.16.0 - xx xxx xxxx
> diff --git a/acinclude.m4 b/acinclude.m4
> index 5a48f0335..7720f2f1b 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -448,6 +448,12 @@ AC_DEFUN([OVS_CHECK_DPDK], [
>        ], [[#include <rte_config.h>]])
>      ], [], [[#include <rte_config.h>]])
>  
> +    AC_CHECK_DECL([MAP_HUGE_SHIFT], [
> +      AC_DEFINE([DPDK_IN_MEMORY_SUPPORTED], [1], [If MAP_HUGE_SHIFT is
> +                 defined, anonomous memory mapping is supported by the
> +                 kernel, and --in-memory can be used.])
> +    ], [], [[#include <sys/mman.h>]])
> +
>      # DPDK uses dlopen to load plugins.
>      OVS_FIND_DEPENDENCY([dlopen], [dl], [libdl])
>  
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index b2ef31cd2..97c902fab 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -405,6 +405,13 @@ dpdk_init__(const struct smap *ovs_other_config)
>      svec_add(&args, ovs_get_program_name());
>      construct_dpdk_args(ovs_other_config, &args);
>  
> +#ifdef DPDK_IN_MEMORY_SUPPORTED
> +    if (!args_contains(&args, "--in-memory") && 
> +            !args_contains(&args, "--legacy-mem")) {
> +        svec_add(&args, "--in-memory");
> +    }
> +#endif
> +
>      if (args_contains(&args, "-c") || args_contains(&args, "-l")) {
>          auto_determine = false;
>      }
>
David Marchand Sept. 9, 2021, 12:49 p.m. UTC | #3
On Fri, Aug 6, 2021 at 2:44 PM Rosemarie O'Riorden <roriorde@redhat.com> wrote:
>
> If anonymous memory mapping is supported by the kernel, it's better
> to run OVS entirely in memory rather than creating shared data
> structures. OVS doesn't work in multi-process mode, so there is no need
> to litter a filesystem and experience random crashes due to old memory
> chunks stored in re-opened files.
>
> When OVS is not running in memory and crashes, it never reaches the
> clean up scripts that delete the new files it has created, resulting in
> "dirty" memory. OVS will partially overwrite this memory on the next
> start-up, but will fail to run again because its filesystem is full of
> old memory.
>
> Here is an example of these crashes:
>   https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/

Overall, I am fine with this change.


- In the commitlog, I would focus on the simplication aspect and that
we only use what we need.
Mention of this bug can be removed.
The reason is that this is not as black/white as described: reusing
dirty memory only happened in a special case with containers.
But this does not affect OVS (I don't think we run OVS in containers).


- As mentionned by Ilya, MAP_HUGE_SHIFT is most likely supported on
systems running current OVS.
Can we remove the permissions tweak on /dev/hugepages in
rhel/usr_lib_systemd_system_ovs-vswitchd.service.in ?
(Cc: Timothy who might have an opinion on this).


- Somehow related:
* as a followup, we can deprecate --dpdk-hugepage-dir,
* I have been playing with --in-memory and I noticed some bug with the
telemetry socket.
The last started dpdk process takes control of the socket regardless
of other processes, so this happens when running multiple dpdk
applications with none or same XDG_RUNTIME_DIR set in env.
This is more of a fyi: it should not be a problem for people running
ovs with a dedicated user, but it could be a problem for those vilains
that run all their applications as root (/me whistles).


>
> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949849
> Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com>


Thanks,
Ilya Maximets Dec. 4, 2021, 1:22 a.m. UTC | #4
On 9/9/21 14:49, David Marchand wrote:
> On Fri, Aug 6, 2021 at 2:44 PM Rosemarie O'Riorden <roriorde@redhat.com> wrote:
>>
>> If anonymous memory mapping is supported by the kernel, it's better
>> to run OVS entirely in memory rather than creating shared data
>> structures. OVS doesn't work in multi-process mode, so there is no need
>> to litter a filesystem and experience random crashes due to old memory
>> chunks stored in re-opened files.
>>
>> When OVS is not running in memory and crashes, it never reaches the
>> clean up scripts that delete the new files it has created, resulting in
>> "dirty" memory. OVS will partially overwrite this memory on the next
>> start-up, but will fail to run again because its filesystem is full of
>> old memory.
>>
>> Here is an example of these crashes:
>>   https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/
> 
> Overall, I am fine with this change.
> 
> 
> - In the commitlog, I would focus on the simplication aspect and that
> we only use what we need.
> Mention of this bug can be removed.
> The reason is that this is not as black/white as described: reusing
> dirty memory only happened in a special case with containers.
> But this does not affect OVS (I don't think we run OVS in containers).

OK.  I think, we can just put a period after the 'to litter a filesystem'
and remove everything else.  That's enough for justification.  WDYT?

> 
> 
> - As mentionned by Ilya, MAP_HUGE_SHIFT is most likely supported on
> systems running current OVS.
> Can we remove the permissions tweak on /dev/hugepages in
> rhel/usr_lib_systemd_system_ovs-vswitchd.service.in ?
> (Cc: Timothy who might have an opinion on this).

Since users could still put '--legcy-mem' into dpdk-extra, they might
still need filesystem access.  So, I guess, we still need the tweaks.
We could, probably, forbid running with legacy-mem at some point in the
future though.

Are there plans in DPDK to deprecate the legacy memory model someday?

> 
> 
> - Somehow related:
> * as a followup, we can deprecate --dpdk-hugepage-dir,
> * I have been playing with --in-memory and I noticed some bug with the
> telemetry socket.
> The last started dpdk process takes control of the socket regardless
> of other processes, so this happens when running multiple dpdk
> applications with none or same XDG_RUNTIME_DIR set in env.
> This is more of a fyi: it should not be a problem for people running
> ovs with a dedicated user, but it could be a problem for those vilains
> that run all their applications as root (/me whistles).
> 
> 
>>
>> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949849
>> Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com>
> 
> 
> Thanks,
>
David Marchand Dec. 6, 2021, 4:34 p.m. UTC | #5
On Sat, Dec 4, 2021 at 2:22 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 9/9/21 14:49, David Marchand wrote:
> > On Fri, Aug 6, 2021 at 2:44 PM Rosemarie O'Riorden <roriorde@redhat.com> wrote:
> >>
> >> If anonymous memory mapping is supported by the kernel, it's better
> >> to run OVS entirely in memory rather than creating shared data
> >> structures. OVS doesn't work in multi-process mode, so there is no need
> >> to litter a filesystem and experience random crashes due to old memory
> >> chunks stored in re-opened files.
> >>
> >> When OVS is not running in memory and crashes, it never reaches the
> >> clean up scripts that delete the new files it has created, resulting in
> >> "dirty" memory. OVS will partially overwrite this memory on the next
> >> start-up, but will fail to run again because its filesystem is full of
> >> old memory.
> >>
> >> Here is an example of these crashes:
> >>   https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/
> >
> > Overall, I am fine with this change.
> >
> >
> > - In the commitlog, I would focus on the simplication aspect and that
> > we only use what we need.
> > Mention of this bug can be removed.
> > The reason is that this is not as black/white as described: reusing
> > dirty memory only happened in a special case with containers.
> > But this does not affect OVS (I don't think we run OVS in containers).
>
> OK.  I think, we can just put a period after the 'to litter a filesystem'
> and remove everything else.  That's enough for justification.  WDYT?

+1


>
> >
> >
> > - As mentionned by Ilya, MAP_HUGE_SHIFT is most likely supported on
> > systems running current OVS.
> > Can we remove the permissions tweak on /dev/hugepages in
> > rhel/usr_lib_systemd_system_ovs-vswitchd.service.in ?
> > (Cc: Timothy who might have an opinion on this).
>
> Since users could still put '--legcy-mem' into dpdk-extra, they might
> still need filesystem access.  So, I guess, we still need the tweaks.

True.


> We could, probably, forbid running with legacy-mem at some point in the
> future though.
>
> Are there plans in DPDK to deprecate the legacy memory model someday?

Hard to remove this mode in DPDK atm: FreeBSD does not implement the new mode.
And the DPDK memory allocator beast has very few contributors, so I
can't promise anything.
Dmitry Kozlyuk Dec. 6, 2021, 4:57 p.m. UTC | #6
2021-12-06 17:34 (UTC+0100), David Marchand:
[...]
> > Are there plans in DPDK to deprecate the legacy memory model someday?  
> 
> Hard to remove this mode in DPDK atm: FreeBSD does not implement the new mode.
> And the DPDK memory allocator beast has very few contributors, so I
> can't promise anything.

Windows only supports dynamic --in-memory mode,
it was never intended to support --legacy-mem
(also no concrete plan to support multi-process).
diff mbox series

Patch

diff --git a/NEWS b/NEWS
index 26920e215..515d3cddd 100644
--- a/NEWS
+++ b/NEWS
@@ -8,6 +8,7 @@  Post-v2.16.0
        by default.  'other_config:dpdk-socket-limit' can be set equal to
        the 'other_config:dpdk-socket-mem' to preserve the legacy memory
        limiting behavior.
+     * EAL argument --in-memory is applied by default if supported.
 
 
 v2.16.0 - xx xxx xxxx
diff --git a/acinclude.m4 b/acinclude.m4
index 5a48f0335..7720f2f1b 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -448,6 +448,12 @@  AC_DEFUN([OVS_CHECK_DPDK], [
       ], [[#include <rte_config.h>]])
     ], [], [[#include <rte_config.h>]])
 
+    AC_CHECK_DECL([MAP_HUGE_SHIFT], [
+      AC_DEFINE([DPDK_IN_MEMORY_SUPPORTED], [1], [If MAP_HUGE_SHIFT is
+                 defined, anonomous memory mapping is supported by the
+                 kernel, and --in-memory can be used.])
+    ], [], [[#include <sys/mman.h>]])
+
     # DPDK uses dlopen to load plugins.
     OVS_FIND_DEPENDENCY([dlopen], [dl], [libdl])
 
diff --git a/lib/dpdk.c b/lib/dpdk.c
index b2ef31cd2..97c902fab 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -405,6 +405,13 @@  dpdk_init__(const struct smap *ovs_other_config)
     svec_add(&args, ovs_get_program_name());
     construct_dpdk_args(ovs_other_config, &args);
 
+#ifdef DPDK_IN_MEMORY_SUPPORTED
+    if (!args_contains(&args, "--in-memory") && 
+            !args_contains(&args, "--legacy-mem")) {
+        svec_add(&args, "--in-memory");
+    }
+#endif
+
     if (args_contains(&args, "-c") || args_contains(&args, "-l")) {
         auto_determine = false;
     }