Message ID | 20210806124432.43042-1-roriorde@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev] dpdk: Use --in-memory by default. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | warning | apply and check: warning |
ovsrobot/github-robot | success | github build: passed |
Bleep bloop. Greetings Rosemarie O'Riorden, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Line has trailing whitespace #67 FILE: lib/dpdk.c:409: if (!args_contains(&args, "--in-memory") && Lines checked: 78, Warnings: 1, Errors: 0 Please check this out. If you feel there has been an error, please email aconole@redhat.com Thanks, 0-day Robot
On 8/6/21 2:44 PM, Rosemarie O'Riorden wrote: > If anonymous memory mapping is supported by the kernel, it's better > to run OVS entirely in memory rather than creating shared data > structures. OVS doesn't work in multi-process mode, so there is no need > to litter a filesystem and experience random crashes due to old memory > chunks stored in re-opened files. > > When OVS is not running in memory and crashes, it never reaches the > clean up scripts that delete the new files it has created, resulting in > "dirty" memory. OVS will partially overwrite this memory on the next > start-up, but will fail to run again because its filesystem is full of > old memory. > > Here is an example of these crashes: > https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/ > > Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949849 > Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com> > --- Thanks for the patch! Anonymous hugepage mapping is supported starting from 3.8 kernel, and the oldest longterm upstream kernel currently is 4.4, so it should be supported in any modern system. It's nice to have a compile time check though. So we're good here. We also removed from OVS all the features that required support for multi-process mode (pdump and ring ports), so it should be safe to use --in-memory. All in all, beside the trailing whitespace warning, the patch looks good to me. Ian, what do you think? Best regards, Ilya Maximets. > NEWS | 1 + > acinclude.m4 | 6 ++++++ > lib/dpdk.c | 7 +++++++ > 3 files changed, 14 insertions(+) > > diff --git a/NEWS b/NEWS > index 26920e215..515d3cddd 100644 > --- a/NEWS > +++ b/NEWS > @@ -8,6 +8,7 @@ Post-v2.16.0 > by default. 'other_config:dpdk-socket-limit' can be set equal to > the 'other_config:dpdk-socket-mem' to preserve the legacy memory > limiting behavior. > + * EAL argument --in-memory is applied by default if supported. > > > v2.16.0 - xx xxx xxxx > diff --git a/acinclude.m4 b/acinclude.m4 > index 5a48f0335..7720f2f1b 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -448,6 +448,12 @@ AC_DEFUN([OVS_CHECK_DPDK], [ > ], [[#include <rte_config.h>]]) > ], [], [[#include <rte_config.h>]]) > > + AC_CHECK_DECL([MAP_HUGE_SHIFT], [ > + AC_DEFINE([DPDK_IN_MEMORY_SUPPORTED], [1], [If MAP_HUGE_SHIFT is > + defined, anonomous memory mapping is supported by the > + kernel, and --in-memory can be used.]) > + ], [], [[#include <sys/mman.h>]]) > + > # DPDK uses dlopen to load plugins. > OVS_FIND_DEPENDENCY([dlopen], [dl], [libdl]) > > diff --git a/lib/dpdk.c b/lib/dpdk.c > index b2ef31cd2..97c902fab 100644 > --- a/lib/dpdk.c > +++ b/lib/dpdk.c > @@ -405,6 +405,13 @@ dpdk_init__(const struct smap *ovs_other_config) > svec_add(&args, ovs_get_program_name()); > construct_dpdk_args(ovs_other_config, &args); > > +#ifdef DPDK_IN_MEMORY_SUPPORTED > + if (!args_contains(&args, "--in-memory") && > + !args_contains(&args, "--legacy-mem")) { > + svec_add(&args, "--in-memory"); > + } > +#endif > + > if (args_contains(&args, "-c") || args_contains(&args, "-l")) { > auto_determine = false; > } >
On Fri, Aug 6, 2021 at 2:44 PM Rosemarie O'Riorden <roriorde@redhat.com> wrote: > > If anonymous memory mapping is supported by the kernel, it's better > to run OVS entirely in memory rather than creating shared data > structures. OVS doesn't work in multi-process mode, so there is no need > to litter a filesystem and experience random crashes due to old memory > chunks stored in re-opened files. > > When OVS is not running in memory and crashes, it never reaches the > clean up scripts that delete the new files it has created, resulting in > "dirty" memory. OVS will partially overwrite this memory on the next > start-up, but will fail to run again because its filesystem is full of > old memory. > > Here is an example of these crashes: > https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/ Overall, I am fine with this change. - In the commitlog, I would focus on the simplication aspect and that we only use what we need. Mention of this bug can be removed. The reason is that this is not as black/white as described: reusing dirty memory only happened in a special case with containers. But this does not affect OVS (I don't think we run OVS in containers). - As mentionned by Ilya, MAP_HUGE_SHIFT is most likely supported on systems running current OVS. Can we remove the permissions tweak on /dev/hugepages in rhel/usr_lib_systemd_system_ovs-vswitchd.service.in ? (Cc: Timothy who might have an opinion on this). - Somehow related: * as a followup, we can deprecate --dpdk-hugepage-dir, * I have been playing with --in-memory and I noticed some bug with the telemetry socket. The last started dpdk process takes control of the socket regardless of other processes, so this happens when running multiple dpdk applications with none or same XDG_RUNTIME_DIR set in env. This is more of a fyi: it should not be a problem for people running ovs with a dedicated user, but it could be a problem for those vilains that run all their applications as root (/me whistles). > > Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949849 > Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com> Thanks,
On 9/9/21 14:49, David Marchand wrote: > On Fri, Aug 6, 2021 at 2:44 PM Rosemarie O'Riorden <roriorde@redhat.com> wrote: >> >> If anonymous memory mapping is supported by the kernel, it's better >> to run OVS entirely in memory rather than creating shared data >> structures. OVS doesn't work in multi-process mode, so there is no need >> to litter a filesystem and experience random crashes due to old memory >> chunks stored in re-opened files. >> >> When OVS is not running in memory and crashes, it never reaches the >> clean up scripts that delete the new files it has created, resulting in >> "dirty" memory. OVS will partially overwrite this memory on the next >> start-up, but will fail to run again because its filesystem is full of >> old memory. >> >> Here is an example of these crashes: >> https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/ > > Overall, I am fine with this change. > > > - In the commitlog, I would focus on the simplication aspect and that > we only use what we need. > Mention of this bug can be removed. > The reason is that this is not as black/white as described: reusing > dirty memory only happened in a special case with containers. > But this does not affect OVS (I don't think we run OVS in containers). OK. I think, we can just put a period after the 'to litter a filesystem' and remove everything else. That's enough for justification. WDYT? > > > - As mentionned by Ilya, MAP_HUGE_SHIFT is most likely supported on > systems running current OVS. > Can we remove the permissions tweak on /dev/hugepages in > rhel/usr_lib_systemd_system_ovs-vswitchd.service.in ? > (Cc: Timothy who might have an opinion on this). Since users could still put '--legcy-mem' into dpdk-extra, they might still need filesystem access. So, I guess, we still need the tweaks. We could, probably, forbid running with legacy-mem at some point in the future though. Are there plans in DPDK to deprecate the legacy memory model someday? > > > - Somehow related: > * as a followup, we can deprecate --dpdk-hugepage-dir, > * I have been playing with --in-memory and I noticed some bug with the > telemetry socket. > The last started dpdk process takes control of the socket regardless > of other processes, so this happens when running multiple dpdk > applications with none or same XDG_RUNTIME_DIR set in env. > This is more of a fyi: it should not be a problem for people running > ovs with a dedicated user, but it could be a problem for those vilains > that run all their applications as root (/me whistles). > > >> >> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949849 >> Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com> > > > Thanks, >
On Sat, Dec 4, 2021 at 2:22 AM Ilya Maximets <i.maximets@ovn.org> wrote: > > On 9/9/21 14:49, David Marchand wrote: > > On Fri, Aug 6, 2021 at 2:44 PM Rosemarie O'Riorden <roriorde@redhat.com> wrote: > >> > >> If anonymous memory mapping is supported by the kernel, it's better > >> to run OVS entirely in memory rather than creating shared data > >> structures. OVS doesn't work in multi-process mode, so there is no need > >> to litter a filesystem and experience random crashes due to old memory > >> chunks stored in re-opened files. > >> > >> When OVS is not running in memory and crashes, it never reaches the > >> clean up scripts that delete the new files it has created, resulting in > >> "dirty" memory. OVS will partially overwrite this memory on the next > >> start-up, but will fail to run again because its filesystem is full of > >> old memory. > >> > >> Here is an example of these crashes: > >> https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/ > > > > Overall, I am fine with this change. > > > > > > - In the commitlog, I would focus on the simplication aspect and that > > we only use what we need. > > Mention of this bug can be removed. > > The reason is that this is not as black/white as described: reusing > > dirty memory only happened in a special case with containers. > > But this does not affect OVS (I don't think we run OVS in containers). > > OK. I think, we can just put a period after the 'to litter a filesystem' > and remove everything else. That's enough for justification. WDYT? +1 > > > > > > > - As mentionned by Ilya, MAP_HUGE_SHIFT is most likely supported on > > systems running current OVS. > > Can we remove the permissions tweak on /dev/hugepages in > > rhel/usr_lib_systemd_system_ovs-vswitchd.service.in ? > > (Cc: Timothy who might have an opinion on this). > > Since users could still put '--legcy-mem' into dpdk-extra, they might > still need filesystem access. So, I guess, we still need the tweaks. True. > We could, probably, forbid running with legacy-mem at some point in the > future though. > > Are there plans in DPDK to deprecate the legacy memory model someday? Hard to remove this mode in DPDK atm: FreeBSD does not implement the new mode. And the DPDK memory allocator beast has very few contributors, so I can't promise anything.
2021-12-06 17:34 (UTC+0100), David Marchand: [...] > > Are there plans in DPDK to deprecate the legacy memory model someday? > > Hard to remove this mode in DPDK atm: FreeBSD does not implement the new mode. > And the DPDK memory allocator beast has very few contributors, so I > can't promise anything. Windows only supports dynamic --in-memory mode, it was never intended to support --legacy-mem (also no concrete plan to support multi-process).
diff --git a/NEWS b/NEWS index 26920e215..515d3cddd 100644 --- a/NEWS +++ b/NEWS @@ -8,6 +8,7 @@ Post-v2.16.0 by default. 'other_config:dpdk-socket-limit' can be set equal to the 'other_config:dpdk-socket-mem' to preserve the legacy memory limiting behavior. + * EAL argument --in-memory is applied by default if supported. v2.16.0 - xx xxx xxxx diff --git a/acinclude.m4 b/acinclude.m4 index 5a48f0335..7720f2f1b 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -448,6 +448,12 @@ AC_DEFUN([OVS_CHECK_DPDK], [ ], [[#include <rte_config.h>]]) ], [], [[#include <rte_config.h>]]) + AC_CHECK_DECL([MAP_HUGE_SHIFT], [ + AC_DEFINE([DPDK_IN_MEMORY_SUPPORTED], [1], [If MAP_HUGE_SHIFT is + defined, anonomous memory mapping is supported by the + kernel, and --in-memory can be used.]) + ], [], [[#include <sys/mman.h>]]) + # DPDK uses dlopen to load plugins. OVS_FIND_DEPENDENCY([dlopen], [dl], [libdl]) diff --git a/lib/dpdk.c b/lib/dpdk.c index b2ef31cd2..97c902fab 100644 --- a/lib/dpdk.c +++ b/lib/dpdk.c @@ -405,6 +405,13 @@ dpdk_init__(const struct smap *ovs_other_config) svec_add(&args, ovs_get_program_name()); construct_dpdk_args(ovs_other_config, &args); +#ifdef DPDK_IN_MEMORY_SUPPORTED + if (!args_contains(&args, "--in-memory") && + !args_contains(&args, "--legacy-mem")) { + svec_add(&args, "--in-memory"); + } +#endif + if (args_contains(&args, "-c") || args_contains(&args, "-l")) { auto_determine = false; }
If anonymous memory mapping is supported by the kernel, it's better to run OVS entirely in memory rather than creating shared data structures. OVS doesn't work in multi-process mode, so there is no need to litter a filesystem and experience random crashes due to old memory chunks stored in re-opened files. When OVS is not running in memory and crashes, it never reaches the clean up scripts that delete the new files it has created, resulting in "dirty" memory. OVS will partially overwrite this memory on the next start-up, but will fail to run again because its filesystem is full of old memory. Here is an example of these crashes: https://inbox.dpdk.org/dev/20200910162407.12669-1-david.marchand@redhat.com/ Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949849 Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com> --- NEWS | 1 + acinclude.m4 | 6 ++++++ lib/dpdk.c | 7 +++++++ 3 files changed, 14 insertions(+)