[ovs-dev,RFC,1/3] dpdk: Unify passing of DPDK EAL arguments.

Message ID 20190201131114.29799-2-i.maximets@samsung.com
State New
Delegated to: Ian Stokes
Headers show
Series
  • dpdk: Unification/deprecation of DPDK EAL config knobs.
Related show

Commit Message

Ilya Maximets Feb. 1, 2019, 1:11 p.m.
This patch deprecates the usage of current configuration knobs for
DPDK EAL arguments:

  - dpdk-alloc-mem
  - dpdk-socket-mem
  - dpdk-socket-limit
  - dpdk-lcore-mask
  - dpdk-hugepage-dir
  - dpdk-extra

All above configuration options replaced with single 'dpdk-options'
which has same format as old 'dpdk-extra', i.e. just a string with
all the DPDK EAL arguments.

All the documentation about old configuration knobs removed. Users
could still use an old interface, but the deprecation warning will be
printed. In case 'dpdk-options' provided, values of old options will
be ignored. New users will start using new 'dpdk-options' as this is
the only documented way providing EAL arguments.

Rationale:

From release to release DPDK becomes more complex. New EAL arguments
appears, old arguments becomes deprecated. Some changes their meaning
and behaviour. It's not easy task to manage all this and current
code in OVS that tries to manage DPDK options is not flexible/extendable
enough even to track all the dependencies between options in DPDK 18.11.
For example, '--socket-mem' changed its meaning, '--legacy-mem' and
'--socket-limit' appeared. But '--legacy-mem' and '--socket-limit'
could not be used at the same time and, also, we want to provide
good default value for '--socket-limit' to keep the consistent
behaviour of OVS memory usage. And this will be worse in the future.

Another point is that even now we have mutually exclusive database
configs in OVS and we have to handle them. i.e we're providing
'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally,
user allowed to configure same options in 'dpdk-extra'. So, we have
similar/equal options in a three places in ovsdb and we need to
choose which one to use. It's pretty easy for user to get into
inconsistent database configuration, because users could update
one field without removing others, and OVS has to resolve all the
conflicts somehow constructing the EAL args. But we do not know in
practice, if the resulted arguments are the arguments that user wanted
to see or just forget to remove the higher priority knob.

Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not
so user-friendly as '-l', but we have no option for it. Will we create
additional 'dpdk-lcore-list' ? I guess, no, because it's not really
important thing. But users will stuck with this not so user-friendly
masks.

Another thing is that OVS is not really need to check the consistency
of the EAL arguments because DPDK could check it instead. DPDK will
always check the args and it will do it better. There is no real
need to duplicate this functionality.

Keeping all the options in a single string 'dpdk-options' allows:

  * To have only one place for all the configs, i.e. it will be harder
    for user to forget to remove something from the single string while
    updating it.
  * Not to check the consistency between different configs, DPDK could
    check the cmdline itself.
  * Not to document every DPDK EAL argument in OVS. We can just
    describe few of them and point to DPDK docs for more information.
  * OVS still able to provide some minimal default arguments.
    Thanks to DPDK 18.11 we only need to specify an lcore. All other
    arguments are not necessary. (We still also providing default
    --socket-limit in case --socket-mem passed without it and without
    --legacy-mem)

Usage example:

  ovs-vsctl set Open_vSwitch . \
      other_config:dpdk-options="-l 1 -n 4 --socket-mem 1024,1024"

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
 Documentation/intro/install/dpdk.rst |  40 +++++---
 NEWS                                 |   7 +-
 lib/dpdk.c                           |  49 +++++++++-
 utilities/ovs-dev.py                 |   2 +-
 vswitchd/vswitch.xml                 | 132 ++++++---------------------
 5 files changed, 108 insertions(+), 122 deletions(-)

Comments

Flavio Leitner Feb. 5, 2019, 12:38 p.m. | #1
Hi Ilya,

Thanks for the patch and I think we knew about that pain when we
exposed the very first parameter. I still remember Aaron struggling
to find the best path forward. Time flies and here we are.

The problem is that this change is not friendly from the community
perspective to its users. That is like an exposed API which we should
avoid break as much as possible.

For instance, there are users (OpenStack) with support life cycle of
5+ years that cannot keep up with this kind of change but they expect
to be able to update to newer OVS.

One idea is to freeze whatever we have now and update the documentation
to recommend the new way. We give like a couple OVS releases and only
then ignore (or remove?) those parameters. 

IMO in the end we are moving the problem from one place to another
because even with a single string, OVS users will be caught off guard
when DPDK changes. Yes, less pain to OVS community in the sense that
we don't have to add/remove/deprecate stuff, but it is still a bad
user experience regardless, which is not what OVS is known for.

Thanks,
fbl


On Fri, Feb 01, 2019 at 04:11:12PM +0300, Ilya Maximets wrote:
> This patch deprecates the usage of current configuration knobs for
> DPDK EAL arguments:
> 
>   - dpdk-alloc-mem
>   - dpdk-socket-mem
>   - dpdk-socket-limit
>   - dpdk-lcore-mask
>   - dpdk-hugepage-dir
>   - dpdk-extra
> 
> All above configuration options replaced with single 'dpdk-options'
> which has same format as old 'dpdk-extra', i.e. just a string with
> all the DPDK EAL arguments.
> 
> All the documentation about old configuration knobs removed. Users
> could still use an old interface, but the deprecation warning will be
> printed. In case 'dpdk-options' provided, values of old options will
> be ignored. New users will start using new 'dpdk-options' as this is
> the only documented way providing EAL arguments.
> 
> Rationale:
> 
> From release to release DPDK becomes more complex. New EAL arguments
> appears, old arguments becomes deprecated. Some changes their meaning
> and behaviour. It's not easy task to manage all this and current
> code in OVS that tries to manage DPDK options is not flexible/extendable
> enough even to track all the dependencies between options in DPDK 18.11.
> For example, '--socket-mem' changed its meaning, '--legacy-mem' and
> '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit'
> could not be used at the same time and, also, we want to provide
> good default value for '--socket-limit' to keep the consistent
> behaviour of OVS memory usage. And this will be worse in the future.
> 
> Another point is that even now we have mutually exclusive database
> configs in OVS and we have to handle them. i.e we're providing
> 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally,
> user allowed to configure same options in 'dpdk-extra'. So, we have
> similar/equal options in a three places in ovsdb and we need to
> choose which one to use. It's pretty easy for user to get into
> inconsistent database configuration, because users could update
> one field without removing others, and OVS has to resolve all the
> conflicts somehow constructing the EAL args. But we do not know in
> practice, if the resulted arguments are the arguments that user wanted
> to see or just forget to remove the higher priority knob.
> 
> Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not
> so user-friendly as '-l', but we have no option for it. Will we create
> additional 'dpdk-lcore-list' ? I guess, no, because it's not really
> important thing. But users will stuck with this not so user-friendly
> masks.
> 
> Another thing is that OVS is not really need to check the consistency
> of the EAL arguments because DPDK could check it instead. DPDK will
> always check the args and it will do it better. There is no real
> need to duplicate this functionality.
> 
> Keeping all the options in a single string 'dpdk-options' allows:
> 
>   * To have only one place for all the configs, i.e. it will be harder
>     for user to forget to remove something from the single string while
>     updating it.
>   * Not to check the consistency between different configs, DPDK could
>     check the cmdline itself.
>   * Not to document every DPDK EAL argument in OVS. We can just
>     describe few of them and point to DPDK docs for more information.
>   * OVS still able to provide some minimal default arguments.
>     Thanks to DPDK 18.11 we only need to specify an lcore. All other
>     arguments are not necessary. (We still also providing default
>     --socket-limit in case --socket-mem passed without it and without
>     --legacy-mem)
> 
> Usage example:
> 
>   ovs-vsctl set Open_vSwitch . \
>       other_config:dpdk-options="-l 1 -n 4 --socket-mem 1024,1024"
> 
> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> ---
>  Documentation/intro/install/dpdk.rst |  40 +++++---
>  NEWS                                 |   7 +-
>  lib/dpdk.c                           |  49 +++++++++-
>  utilities/ovs-dev.py                 |   2 +-
>  vswitchd/vswitch.xml                 | 132 ++++++---------------------
>  5 files changed, 108 insertions(+), 122 deletions(-)
> 
> diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
> index 344d2b3a6..2243fde53 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -235,32 +235,44 @@ listed below. Defaults will be provided for all values not explicitly set.
>    A value of ``try`` will imply that the ovs-vswitchd process should
>    continue running even if the EAL initialization fails.
>  
> -``dpdk-lcore-mask``
> -  Specifies the CPU cores on which dpdk lcore threads should be spawned and
> -  expects hex string (eg '0x123').
> +``dpdk-options``
> +  Specifies the string of EAL command line arguments for DPDK.
> +  For example, ``other_config:dpdk-options="-l 0 --socket-mem 1024,1024"``.
> +  Important arguments that could be passed there are:
>  
> -``dpdk-socket-mem``
> -  Comma separated list of memory to pre-allocate from hugepages on specific
> -  sockets. If not specified, 1024 MB will be set for each numa node by
> -  default.
> +  - ``-c`` or ``-l``
>  
> -``dpdk-hugepage-dir``
> -  Directory where hugetlbfs is mounted
> +    Specifies the CPU cores on which dpdk lcore threads should be spawned.
> +    ``-c`` expects hex string (eg ``0x123``) while ``-l`` expects core list
> +    (eg ``0-5,8``).
> +
> +  - ``--socket-mem``
> +
> +    Comma separated list of memory to pre-allocate from hugepages on specific
> +    sockets.
> +
> +  - ``--huge-dir``
> +
> +    Directory where hugetlbfs is mounted
> +
> +  - See DPDK documentation for the full list:
> +
> +    https://doc.dpdk.org/guides-18.11/linux_gsg/linux_eal_parameters.html
>  
>  ``vhost-sock-dir``
>    Option to set the path to the vhost-user unix socket files.
>  
> -If allocating more than one GB hugepage, you can configure the
> -amount of memory used from any given NUMA nodes. For example, to use 1GB from
> -NUMA node 0 and 0GB for all other NUMA nodes, run::
> +You can configure the amount of memory preallocated from any given NUMA nodes.
> +For example, to preallocate ``1GB`` from NUMA node ``0`` and ``0GB`` for all
> +other NUMA nodes, run::
>  
>      $ ovs-vsctl --no-wait set Open_vSwitch . \
> -        other_config:dpdk-socket-mem="1024,0"
> +        other_config:dpdk-options="<...> --socket-mem 1024,0"
>  
>  or::
>  
>      $ ovs-vsctl --no-wait set Open_vSwitch . \
> -        other_config:dpdk-socket-mem="1024"
> +        other_config:dpdk-options="<...> --socket-mem 1024"
>  
>  .. note::
>    Changing any of these options requires restarting the ovs-vswitchd
> diff --git a/NEWS b/NEWS
> index a64b9d94d..1c09e9d57 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -1,8 +1,11 @@
>  Post-v2.11.0
>  ---------------------
>     - DPDK:
> -     * New option 'other_config:dpdk-socket-limit' to limit amount of
> -       hugepage memory that can be used by DPDK.
> +     * New option 'other_config:dpdk-options' to pass all the
> +       DPDK EAL argumants in a single string.
> +     * Following config options deprecated:
> +       'dpdk-alloc-mem', 'dpdk-socket-mem', 'dpdk-socket-limit',
> +       'dpdk-lcore-mask', 'dpdk-hugepage-dir' and 'dpdk-extra'.
>  
>  
>  v2.11.0 - xx xxx xxxx
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 53b74fba4..0c37f231d 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -91,6 +91,41 @@ args_contains(const struct svec *args, const char *value)
>      return false;
>  }
>  
> +static bool
> +report_deprecated_configs(const struct smap *ovs_other_config, bool ignore)
> +{
> +    struct option {
> +        const char *opt;
> +        const char *replacement;
> +    } options[] = {
> +        { "dpdk-alloc-mem",    "-m"             },
> +        { "dpdk-socket-mem",   "--socket-mem"   },
> +        { "dpdk-socket-limit", "--socket-limit" },
> +        { "dpdk-lcore-mask",   "-c"             },
> +        { "dpdk-hugepage-dir", "--huge-dir"     },
> +        { "dpdk-extra",        ""               }
> +    };
> +    int i;
> +    bool found = false;
> +
> +    for (i = 0; i < ARRAY_SIZE(options); i++) {
> +        const char *value = smap_get(ovs_other_config, options[i].opt);
> +
> +        if (value) {
> +            VLOG_WARN("Detected deprecated '%s' config. Use '%s %s'"
> +                      " in 'dpdk-options' instead.%s",
> +                      options[i].opt, options[i].replacement, value,
> +                      ignore ? " Value ignored." : "");
> +            found = true;
> +        }
> +    }
> +    if (found) {
> +        VLOG_WARN("Deprecated options will be "
> +                  "silently ignored in the future.");
> +    }
> +    return found;
> +}
> +
>  static void
>  construct_dpdk_options(const struct smap *ovs_other_config, struct svec *args)
>  {
> @@ -270,8 +305,10 @@ dpdk_init__(const struct smap *ovs_other_config)
>      char **argv = NULL;
>      int result;
>      bool auto_determine = true;
> +    bool deprecated_found;
>      int err = 0;
>      cpu_set_t cpuset;
> +    const char *dpdk_options;
>      struct svec args = SVEC_EMPTY_INITIALIZER;
>  
>      log_stream = fopencookie(NULL, "w+", dpdk_log_func);
> @@ -317,7 +354,15 @@ dpdk_init__(const struct smap *ovs_other_config)
>                per_port_memory ? "enabled" : "disabled");
>  
>      svec_add(&args, ovs_get_program_name());
> -    construct_dpdk_args(ovs_other_config, &args);
> +    dpdk_options = smap_get(ovs_other_config, "dpdk-options");
> +
> +    deprecated_found = report_deprecated_configs(ovs_other_config,
> +                                                 dpdk_options ? true : false);
> +    if (dpdk_options) {
> +        svec_parse_words(&args, dpdk_options);
> +    } else if (deprecated_found) {
> +        construct_dpdk_args(ovs_other_config, &args);
> +    }
>  
>      if (!args_contains(&args, "--legacy-mem")
>          && !args_contains(&args, "--socket-limit")) {
> @@ -357,7 +402,7 @@ dpdk_init__(const struct smap *ovs_other_config)
>                  }
>              }
>          } else {
> -            /* User did not set dpdk-lcore-mask and unable to get current
> +            /* User did not set lcore mask and unable to get current
>               * thread affintity - default to core #0 */
>              VLOG_ERR("Thread getaffinity error %d. Using core #0", err);
>          }
> diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
> index 248d22ab9..d10b40c87 100755
> --- a/utilities/ovs-dev.py
> +++ b/utilities/ovs-dev.py
> @@ -283,7 +283,7 @@ def run():
>          _sh("ovs-vsctl --no-wait set Open_vSwitch %s "
>              "other_config:dpdk-init=true" % root_uuid)
>          _sh("ovs-vsctl --no-wait set Open_vSwitch %s other_config:"
> -            "dpdk-extra=\"%s\"" % (root_uuid, ' '.join(options.dpdk)))
> +            "dpdk-options=\"%s\"" % (root_uuid, ' '.join(options.dpdk)))
>      else:
>          _sh("ovs-vsctl --no-wait set Open_vSwitch %s "
>              "other_config:dpdk-init=false" % root_uuid)
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 1db4e8694..c30265d9e 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -234,56 +234,6 @@
>          </p>
>        </column>
>  
> -      <column name="other_config" key="dpdk-init"
> -              type='{"type": "string",
> -                     "enum": ["set", ["false", "true", "try"]]}'>
> -        <p>
> -          Set this value to <code>true</code> or <code>try</code> to enable
> -          runtime support for DPDK ports. The vswitch must have compile-time
> -          support for DPDK as well.
> -        </p>
> -        <p>
> -          A value of <code>true</code> will cause the ovs-vswitchd process to
> -          abort if DPDK cannot be initialized. A value of <code>try</code>
> -          will allow the ovs-vswitchd process to continue running even if DPDK
> -          cannot be initialized.
> -        </p>
> -        <p>
> -          The default value is <code>false</code>. Changing this value requires
> -          restarting the daemon
> -        </p>
> -        <p>
> -          If this value is <code>false</code> at startup, any dpdk ports which
> -          are configured in the bridge will fail due to memory errors.
> -        </p>
> -      </column>
> -
> -      <column name="other_config" key="dpdk-lcore-mask"
> -              type='{"type": "integer", "minInteger": 1}'>
> -        <p>
> -          Specifies the CPU cores where dpdk lcore threads should be spawned.
> -          The DPDK lcore threads are used for DPDK library tasks, such as
> -          library internal message processing, logging, etc. Value should be in
> -          the form of a hex string (so '0x123') similar to the 'taskset' mask
> -          input.
> -        </p>
> -        <p>
> -          The lowest order bit corresponds to the first CPU core.  A set bit
> -          means the corresponding core is available and an lcore thread will be
> -          created and pinned to it.  If the input does not cover all cores,
> -          those uncovered cores are considered not set.
> -        </p>
> -        <p>
> -          For performance reasons, it is best to set this to a single core on
> -          the system, rather than allow lcore threads to float.
> -        </p>
> -        <p>
> -          If not specified, the value will be determined by choosing the lowest
> -          CPU core from initial cpu affinity list. Otherwise, the value will be
> -          passed directly to the DPDK library.
> -        </p>
> -      </column>
> -
>        <column name="other_config" key="pmd-cpu-mask">
>          <p>
>            Specifies CPU mask for setting the cpu affinity of PMD (Poll
> @@ -303,78 +253,54 @@
>          </p>
>        </column>
>  
> -      <column name="other_config" key="dpdk-alloc-mem"
> -              type='{"type": "integer", "minInteger": 0}'>
> +      <column name="other_config" key="dpdk-init"
> +              type='{"type": "string",
> +                     "enum": ["set", ["false", "true", "try"]]}'>
>          <p>
> -          Specifies the amount of memory to preallocate from the hugepage pool,
> -          regardless of socket. It is recommended that dpdk-socket-mem is used
> -          instead.
> +          Set this value to <code>true</code> or <code>try</code> to enable
> +          runtime support for DPDK ports. The vswitch must have compile-time
> +          support for DPDK as well.
>          </p>
> -      </column>
> -
> -      <column name="other_config" key="dpdk-socket-mem"
> -              type='{"type": "string"}'>
>          <p>
> -          Specifies the amount of memory to preallocate from the hugepage pool,
> -          on a per-socket basis.
> +          A value of <code>true</code> will cause the ovs-vswitchd process to
> +          abort if DPDK cannot be initialized. A value of <code>try</code>
> +          will allow the ovs-vswitchd process to continue running even if DPDK
> +          cannot be initialized.
>          </p>
>          <p>
> -          The specifier is a comma-separated string, in ascending order of CPU
> -          socket. E.g. On a four socket system 1024,0,2048 would set socket 0
> -          to preallocate 1024MB, socket 1 to preallocate 0MB, socket 2 to
> -          preallocate 2048MB and socket 3 (no value given) to preallocate 0MB.
> +          The default value is <code>false</code>. Changing this value requires
> +          restarting the daemon
>          </p>
>          <p>
> -          If dpdk-socket-mem and dpdk-alloc-mem are not specified, dpdk-socket-mem
> -          will be used and the default value is 1024 for each numa node. If
> -          dpdk-socket-mem and dpdk-alloc-mem are specified at same time,
> -          dpdk-socket-mem will be used as default. Changing this value
> -          requires restarting the daemon.
> +          If this value is <code>false</code> at startup, any dpdk ports which
> +          are configured in the bridge will fail due to memory errors.
>          </p>
>        </column>
>  
> -      <column name="other_config" key="dpdk-socket-limit"
> +      <column name="other_config" key="dpdk-options"
>                type='{"type": "string"}'>
>          <p>
> -          Limits the maximum amount of memory that can be used from the
> -          hugepage pool, on a per-socket basis.
> +          Specifies EAL command line arguments for DPDK. For example,
> +          <code>"-l 1 --socket-mem 1024,1024 -n 4"</code>
>          </p>
>          <p>
> -          The specifier is a comma-separated list of memory limits per socket.
> -          <code>0</code> will disable the limit for a particular socket.
> -        </p>
> -        <p>
> -          If not specified, OVS will configure limits equal to the amount of
> -          preallocated memory specified by <ref column="other_config"
> -          key="dpdk-socket-mem"/> or <code>--socket-mem</code> in
> -          <ref column="other_config" key="dpdk-extra"/>. If none of the above
> -          options specified or <code>--legacy-mem</code> provided in
> -          <ref column="other_config" key="dpdk-extra"/>, limits will not be
> -          applied.
>            Changing this value requires restarting the daemon.
>          </p>
> -      </column>
> -
> -      <column name="other_config" key="dpdk-hugepage-dir"
> -              type='{"type": "string"}'>
> -        <p>
> -          Specifies the path to the hugetlbfs mount point.
> -        </p>
> -        <p>
> -          If not specified, this will be guessed by the DPDK library (default
> -          is /dev/hugepages). Changing this value requires restarting the
> -          daemon.
> -        </p>
> -      </column>
> -
> -      <column name="other_config" key="dpdk-extra"
> -              type='{"type": "string"}'>
>          <p>
> -          Specifies additional eal command line arguments for DPDK.
> +          If <code>-l</code> and <code>-c</code> options are not specified,
> +          the default value will be provided by choosing the lowest CPU core
> +          from initial cpu affinity list.
> +          For performance reasons, it is best to set lcore set/list to a single
> +          core on the system, rather than allow lcore threads to float.
> +          CPU cores proveded by <code>-l</code> and <code>-c</code> options
> +          only used by DPDK internal threads. To manage affinity of PMD threads
> +          use <ref column="other_config" key="pmd-cpu-mask"/>.
>          </p>
>          <p>
> -          The default is empty. Changing this value requires restarting the
> -          daemon
> +          if <code>--socket-mem</code> option provided without
> +          <code>--legacy-mem</code> and <code>--socket-limit</code>, OVS will
> +          provide <code>--socket-limit</code> equal to the amount of
> +          preallocated memory specified by <code>--socket-mem</code>.
>          </p>
>        </column>
>  
> -- 
> 2.17.1
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ilya Maximets Feb. 5, 2019, 3:43 p.m. | #2
Hi Flavio.
Thanks for taking a look.

On 05.02.2019 15:38, Flavio Leitner wrote:
> 
> Hi Ilya,
> 
> Thanks for the patch and I think we knew about that pain when we
> exposed the very first parameter. I still remember Aaron struggling
> to find the best path forward. Time flies and here we are.
> 
> The problem is that this change is not friendly from the community
> perspective to its users. That is like an exposed API which we should
> avoid break as much as possible.
> 
> For instance, there are users (OpenStack) with support life cycle of
> 5+ years that cannot keep up with this kind of change but they expect
> to be able to update to newer OVS.

Sure, there are users that wants stable API that will never change.
But this is really impossible in practice. I'm working with OpenStack
too and will have to fixup deployment tools with these changes. BTW, from
the whole OpenStack only few deployment projects (like TripleO) will need
to make changes and these changes are not very big.

> 
> One idea is to freeze whatever we have now and update the documentation
> to recommend the new way. We give like a couple OVS releases and only
> then ignore (or remove?) those parameters.

Yes, In cover letter I proposed these patches to be applied one per release.
And current (first) patch does not remove the functionality, only docs.
Users still will be able to use old interface, but will have warnings
in the log. In the next release cycle we'll start ignore the values
while still printing the warnings. This should give enough time for adaptation.
If you feel that we need more time, we could apply the second patch to 2.14
(or whatever number will be in 2 releases from now).


> 
> IMO in the end we are moving the problem from one place to another
> because even with a single string, OVS users will be caught off guard
> when DPDK changes. Yes, less pain to OVS community in the sense that
> we don't have to add/remove/deprecate stuff, but it is still a bad
> user experience regardless, which is not what OVS is known for.

Unfortunately, DPDK was never user-friendly enough. It tries, but still not.
We're keeping few sane defaults like providing default lcore and setting the
socket-limit if needed. And we'll try to do that in the future. The thing
this patch tries to eliminate is the dependency tracking between different
EAL arguments, i.e. duplicating the work with rte_eal_init() and having too
many configuration knobs with similar meanings what does not work at the
same time.

Right now there are no critical arguments that user must provide.
So, IMHO, having special configs for them is really unnecessary.

> 
> Thanks,
> fbl
> 
> 
> On Fri, Feb 01, 2019 at 04:11:12PM +0300, Ilya Maximets wrote:
>> This patch deprecates the usage of current configuration knobs for
>> DPDK EAL arguments:
>>
>>   - dpdk-alloc-mem
>>   - dpdk-socket-mem
>>   - dpdk-socket-limit
>>   - dpdk-lcore-mask
>>   - dpdk-hugepage-dir
>>   - dpdk-extra
>>
>> All above configuration options replaced with single 'dpdk-options'
>> which has same format as old 'dpdk-extra', i.e. just a string with
>> all the DPDK EAL arguments.
>>
>> All the documentation about old configuration knobs removed. Users
>> could still use an old interface, but the deprecation warning will be
>> printed. In case 'dpdk-options' provided, values of old options will
>> be ignored. New users will start using new 'dpdk-options' as this is
>> the only documented way providing EAL arguments.
>>
>> Rationale:
>>
>> From release to release DPDK becomes more complex. New EAL arguments
>> appears, old arguments becomes deprecated. Some changes their meaning
>> and behaviour. It's not easy task to manage all this and current
>> code in OVS that tries to manage DPDK options is not flexible/extendable
>> enough even to track all the dependencies between options in DPDK 18.11.
>> For example, '--socket-mem' changed its meaning, '--legacy-mem' and
>> '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit'
>> could not be used at the same time and, also, we want to provide
>> good default value for '--socket-limit' to keep the consistent
>> behaviour of OVS memory usage. And this will be worse in the future.
>>
>> Another point is that even now we have mutually exclusive database
>> configs in OVS and we have to handle them. i.e we're providing
>> 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally,
>> user allowed to configure same options in 'dpdk-extra'. So, we have
>> similar/equal options in a three places in ovsdb and we need to
>> choose which one to use. It's pretty easy for user to get into
>> inconsistent database configuration, because users could update
>> one field without removing others, and OVS has to resolve all the
>> conflicts somehow constructing the EAL args. But we do not know in
>> practice, if the resulted arguments are the arguments that user wanted
>> to see or just forget to remove the higher priority knob.
>>
>> Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not
>> so user-friendly as '-l', but we have no option for it. Will we create
>> additional 'dpdk-lcore-list' ? I guess, no, because it's not really
>> important thing. But users will stuck with this not so user-friendly
>> masks.
>>
>> Another thing is that OVS is not really need to check the consistency
>> of the EAL arguments because DPDK could check it instead. DPDK will
>> always check the args and it will do it better. There is no real
>> need to duplicate this functionality.
>>
>> Keeping all the options in a single string 'dpdk-options' allows:
>>
>>   * To have only one place for all the configs, i.e. it will be harder
>>     for user to forget to remove something from the single string while
>>     updating it.
>>   * Not to check the consistency between different configs, DPDK could
>>     check the cmdline itself.
>>   * Not to document every DPDK EAL argument in OVS. We can just
>>     describe few of them and point to DPDK docs for more information.
>>   * OVS still able to provide some minimal default arguments.
>>     Thanks to DPDK 18.11 we only need to specify an lcore. All other
>>     arguments are not necessary. (We still also providing default
>>     --socket-limit in case --socket-mem passed without it and without
>>     --legacy-mem)
>>
>> Usage example:
>>
>>   ovs-vsctl set Open_vSwitch . \
>>       other_config:dpdk-options="-l 1 -n 4 --socket-mem 1024,1024"
>>
>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>> ---
>>  Documentation/intro/install/dpdk.rst |  40 +++++---
>>  NEWS                                 |   7 +-
>>  lib/dpdk.c                           |  49 +++++++++-
>>  utilities/ovs-dev.py                 |   2 +-
>>  vswitchd/vswitch.xml                 | 132 ++++++---------------------
>>  5 files changed, 108 insertions(+), 122 deletions(-)
>>
>> diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
>> index 344d2b3a6..2243fde53 100644
>> --- a/Documentation/intro/install/dpdk.rst
>> +++ b/Documentation/intro/install/dpdk.rst
>> @@ -235,32 +235,44 @@ listed below. Defaults will be provided for all values not explicitly set.
>>    A value of ``try`` will imply that the ovs-vswitchd process should
>>    continue running even if the EAL initialization fails.
>>  
>> -``dpdk-lcore-mask``
>> -  Specifies the CPU cores on which dpdk lcore threads should be spawned and
>> -  expects hex string (eg '0x123').
>> +``dpdk-options``
>> +  Specifies the string of EAL command line arguments for DPDK.
>> +  For example, ``other_config:dpdk-options="-l 0 --socket-mem 1024,1024"``.
>> +  Important arguments that could be passed there are:
>>  
>> -``dpdk-socket-mem``
>> -  Comma separated list of memory to pre-allocate from hugepages on specific
>> -  sockets. If not specified, 1024 MB will be set for each numa node by
>> -  default.
>> +  - ``-c`` or ``-l``
>>  
>> -``dpdk-hugepage-dir``
>> -  Directory where hugetlbfs is mounted
>> +    Specifies the CPU cores on which dpdk lcore threads should be spawned.
>> +    ``-c`` expects hex string (eg ``0x123``) while ``-l`` expects core list
>> +    (eg ``0-5,8``).
>> +
>> +  - ``--socket-mem``
>> +
>> +    Comma separated list of memory to pre-allocate from hugepages on specific
>> +    sockets.
>> +
>> +  - ``--huge-dir``
>> +
>> +    Directory where hugetlbfs is mounted
>> +
>> +  - See DPDK documentation for the full list:
>> +
>> +    https://doc.dpdk.org/guides-18.11/linux_gsg/linux_eal_parameters.html
>>  
>>  ``vhost-sock-dir``
>>    Option to set the path to the vhost-user unix socket files.
>>  
>> -If allocating more than one GB hugepage, you can configure the
>> -amount of memory used from any given NUMA nodes. For example, to use 1GB from
>> -NUMA node 0 and 0GB for all other NUMA nodes, run::
>> +You can configure the amount of memory preallocated from any given NUMA nodes.
>> +For example, to preallocate ``1GB`` from NUMA node ``0`` and ``0GB`` for all
>> +other NUMA nodes, run::
>>  
>>      $ ovs-vsctl --no-wait set Open_vSwitch . \
>> -        other_config:dpdk-socket-mem="1024,0"
>> +        other_config:dpdk-options="<...> --socket-mem 1024,0"
>>  
>>  or::
>>  
>>      $ ovs-vsctl --no-wait set Open_vSwitch . \
>> -        other_config:dpdk-socket-mem="1024"
>> +        other_config:dpdk-options="<...> --socket-mem 1024"
>>  
>>  .. note::
>>    Changing any of these options requires restarting the ovs-vswitchd
>> diff --git a/NEWS b/NEWS
>> index a64b9d94d..1c09e9d57 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -1,8 +1,11 @@
>>  Post-v2.11.0
>>  ---------------------
>>     - DPDK:
>> -     * New option 'other_config:dpdk-socket-limit' to limit amount of
>> -       hugepage memory that can be used by DPDK.
>> +     * New option 'other_config:dpdk-options' to pass all the
>> +       DPDK EAL argumants in a single string.
>> +     * Following config options deprecated:
>> +       'dpdk-alloc-mem', 'dpdk-socket-mem', 'dpdk-socket-limit',
>> +       'dpdk-lcore-mask', 'dpdk-hugepage-dir' and 'dpdk-extra'.
>>  
>>  
>>  v2.11.0 - xx xxx xxxx
>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>> index 53b74fba4..0c37f231d 100644
>> --- a/lib/dpdk.c
>> +++ b/lib/dpdk.c
>> @@ -91,6 +91,41 @@ args_contains(const struct svec *args, const char *value)
>>      return false;
>>  }
>>  
>> +static bool
>> +report_deprecated_configs(const struct smap *ovs_other_config, bool ignore)
>> +{
>> +    struct option {
>> +        const char *opt;
>> +        const char *replacement;
>> +    } options[] = {
>> +        { "dpdk-alloc-mem",    "-m"             },
>> +        { "dpdk-socket-mem",   "--socket-mem"   },
>> +        { "dpdk-socket-limit", "--socket-limit" },
>> +        { "dpdk-lcore-mask",   "-c"             },
>> +        { "dpdk-hugepage-dir", "--huge-dir"     },
>> +        { "dpdk-extra",        ""               }
>> +    };
>> +    int i;
>> +    bool found = false;
>> +
>> +    for (i = 0; i < ARRAY_SIZE(options); i++) {
>> +        const char *value = smap_get(ovs_other_config, options[i].opt);
>> +
>> +        if (value) {
>> +            VLOG_WARN("Detected deprecated '%s' config. Use '%s %s'"
>> +                      " in 'dpdk-options' instead.%s",
>> +                      options[i].opt, options[i].replacement, value,
>> +                      ignore ? " Value ignored." : "");
>> +            found = true;
>> +        }
>> +    }
>> +    if (found) {
>> +        VLOG_WARN("Deprecated options will be "
>> +                  "silently ignored in the future.");
>> +    }
>> +    return found;
>> +}
>> +
>>  static void
>>  construct_dpdk_options(const struct smap *ovs_other_config, struct svec *args)
>>  {
>> @@ -270,8 +305,10 @@ dpdk_init__(const struct smap *ovs_other_config)
>>      char **argv = NULL;
>>      int result;
>>      bool auto_determine = true;
>> +    bool deprecated_found;
>>      int err = 0;
>>      cpu_set_t cpuset;
>> +    const char *dpdk_options;
>>      struct svec args = SVEC_EMPTY_INITIALIZER;
>>  
>>      log_stream = fopencookie(NULL, "w+", dpdk_log_func);
>> @@ -317,7 +354,15 @@ dpdk_init__(const struct smap *ovs_other_config)
>>                per_port_memory ? "enabled" : "disabled");
>>  
>>      svec_add(&args, ovs_get_program_name());
>> -    construct_dpdk_args(ovs_other_config, &args);
>> +    dpdk_options = smap_get(ovs_other_config, "dpdk-options");
>> +
>> +    deprecated_found = report_deprecated_configs(ovs_other_config,
>> +                                                 dpdk_options ? true : false);
>> +    if (dpdk_options) {
>> +        svec_parse_words(&args, dpdk_options);
>> +    } else if (deprecated_found) {
>> +        construct_dpdk_args(ovs_other_config, &args);
>> +    }
>>  
>>      if (!args_contains(&args, "--legacy-mem")
>>          && !args_contains(&args, "--socket-limit")) {
>> @@ -357,7 +402,7 @@ dpdk_init__(const struct smap *ovs_other_config)
>>                  }
>>              }
>>          } else {
>> -            /* User did not set dpdk-lcore-mask and unable to get current
>> +            /* User did not set lcore mask and unable to get current
>>               * thread affintity - default to core #0 */
>>              VLOG_ERR("Thread getaffinity error %d. Using core #0", err);
>>          }
>> diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
>> index 248d22ab9..d10b40c87 100755
>> --- a/utilities/ovs-dev.py
>> +++ b/utilities/ovs-dev.py
>> @@ -283,7 +283,7 @@ def run():
>>          _sh("ovs-vsctl --no-wait set Open_vSwitch %s "
>>              "other_config:dpdk-init=true" % root_uuid)
>>          _sh("ovs-vsctl --no-wait set Open_vSwitch %s other_config:"
>> -            "dpdk-extra=\"%s\"" % (root_uuid, ' '.join(options.dpdk)))
>> +            "dpdk-options=\"%s\"" % (root_uuid, ' '.join(options.dpdk)))
>>      else:
>>          _sh("ovs-vsctl --no-wait set Open_vSwitch %s "
>>              "other_config:dpdk-init=false" % root_uuid)
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index 1db4e8694..c30265d9e 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -234,56 +234,6 @@
>>          </p>
>>        </column>
>>  
>> -      <column name="other_config" key="dpdk-init"
>> -              type='{"type": "string",
>> -                     "enum": ["set", ["false", "true", "try"]]}'>
>> -        <p>
>> -          Set this value to <code>true</code> or <code>try</code> to enable
>> -          runtime support for DPDK ports. The vswitch must have compile-time
>> -          support for DPDK as well.
>> -        </p>
>> -        <p>
>> -          A value of <code>true</code> will cause the ovs-vswitchd process to
>> -          abort if DPDK cannot be initialized. A value of <code>try</code>
>> -          will allow the ovs-vswitchd process to continue running even if DPDK
>> -          cannot be initialized.
>> -        </p>
>> -        <p>
>> -          The default value is <code>false</code>. Changing this value requires
>> -          restarting the daemon
>> -        </p>
>> -        <p>
>> -          If this value is <code>false</code> at startup, any dpdk ports which
>> -          are configured in the bridge will fail due to memory errors.
>> -        </p>
>> -      </column>
>> -
>> -      <column name="other_config" key="dpdk-lcore-mask"
>> -              type='{"type": "integer", "minInteger": 1}'>
>> -        <p>
>> -          Specifies the CPU cores where dpdk lcore threads should be spawned.
>> -          The DPDK lcore threads are used for DPDK library tasks, such as
>> -          library internal message processing, logging, etc. Value should be in
>> -          the form of a hex string (so '0x123') similar to the 'taskset' mask
>> -          input.
>> -        </p>
>> -        <p>
>> -          The lowest order bit corresponds to the first CPU core.  A set bit
>> -          means the corresponding core is available and an lcore thread will be
>> -          created and pinned to it.  If the input does not cover all cores,
>> -          those uncovered cores are considered not set.
>> -        </p>
>> -        <p>
>> -          For performance reasons, it is best to set this to a single core on
>> -          the system, rather than allow lcore threads to float.
>> -        </p>
>> -        <p>
>> -          If not specified, the value will be determined by choosing the lowest
>> -          CPU core from initial cpu affinity list. Otherwise, the value will be
>> -          passed directly to the DPDK library.
>> -        </p>
>> -      </column>
>> -
>>        <column name="other_config" key="pmd-cpu-mask">
>>          <p>
>>            Specifies CPU mask for setting the cpu affinity of PMD (Poll
>> @@ -303,78 +253,54 @@
>>          </p>
>>        </column>
>>  
>> -      <column name="other_config" key="dpdk-alloc-mem"
>> -              type='{"type": "integer", "minInteger": 0}'>
>> +      <column name="other_config" key="dpdk-init"
>> +              type='{"type": "string",
>> +                     "enum": ["set", ["false", "true", "try"]]}'>
>>          <p>
>> -          Specifies the amount of memory to preallocate from the hugepage pool,
>> -          regardless of socket. It is recommended that dpdk-socket-mem is used
>> -          instead.
>> +          Set this value to <code>true</code> or <code>try</code> to enable
>> +          runtime support for DPDK ports. The vswitch must have compile-time
>> +          support for DPDK as well.
>>          </p>
>> -      </column>
>> -
>> -      <column name="other_config" key="dpdk-socket-mem"
>> -              type='{"type": "string"}'>
>>          <p>
>> -          Specifies the amount of memory to preallocate from the hugepage pool,
>> -          on a per-socket basis.
>> +          A value of <code>true</code> will cause the ovs-vswitchd process to
>> +          abort if DPDK cannot be initialized. A value of <code>try</code>
>> +          will allow the ovs-vswitchd process to continue running even if DPDK
>> +          cannot be initialized.
>>          </p>
>>          <p>
>> -          The specifier is a comma-separated string, in ascending order of CPU
>> -          socket. E.g. On a four socket system 1024,0,2048 would set socket 0
>> -          to preallocate 1024MB, socket 1 to preallocate 0MB, socket 2 to
>> -          preallocate 2048MB and socket 3 (no value given) to preallocate 0MB.
>> +          The default value is <code>false</code>. Changing this value requires
>> +          restarting the daemon
>>          </p>
>>          <p>
>> -          If dpdk-socket-mem and dpdk-alloc-mem are not specified, dpdk-socket-mem
>> -          will be used and the default value is 1024 for each numa node. If
>> -          dpdk-socket-mem and dpdk-alloc-mem are specified at same time,
>> -          dpdk-socket-mem will be used as default. Changing this value
>> -          requires restarting the daemon.
>> +          If this value is <code>false</code> at startup, any dpdk ports which
>> +          are configured in the bridge will fail due to memory errors.
>>          </p>
>>        </column>
>>  
>> -      <column name="other_config" key="dpdk-socket-limit"
>> +      <column name="other_config" key="dpdk-options"
>>                type='{"type": "string"}'>
>>          <p>
>> -          Limits the maximum amount of memory that can be used from the
>> -          hugepage pool, on a per-socket basis.
>> +          Specifies EAL command line arguments for DPDK. For example,
>> +          <code>"-l 1 --socket-mem 1024,1024 -n 4"</code>
>>          </p>
>>          <p>
>> -          The specifier is a comma-separated list of memory limits per socket.
>> -          <code>0</code> will disable the limit for a particular socket.
>> -        </p>
>> -        <p>
>> -          If not specified, OVS will configure limits equal to the amount of
>> -          preallocated memory specified by <ref column="other_config"
>> -          key="dpdk-socket-mem"/> or <code>--socket-mem</code> in
>> -          <ref column="other_config" key="dpdk-extra"/>. If none of the above
>> -          options specified or <code>--legacy-mem</code> provided in
>> -          <ref column="other_config" key="dpdk-extra"/>, limits will not be
>> -          applied.
>>            Changing this value requires restarting the daemon.
>>          </p>
>> -      </column>
>> -
>> -      <column name="other_config" key="dpdk-hugepage-dir"
>> -              type='{"type": "string"}'>
>> -        <p>
>> -          Specifies the path to the hugetlbfs mount point.
>> -        </p>
>> -        <p>
>> -          If not specified, this will be guessed by the DPDK library (default
>> -          is /dev/hugepages). Changing this value requires restarting the
>> -          daemon.
>> -        </p>
>> -      </column>
>> -
>> -      <column name="other_config" key="dpdk-extra"
>> -              type='{"type": "string"}'>
>>          <p>
>> -          Specifies additional eal command line arguments for DPDK.
>> +          If <code>-l</code> and <code>-c</code> options are not specified,
>> +          the default value will be provided by choosing the lowest CPU core
>> +          from initial cpu affinity list.
>> +          For performance reasons, it is best to set lcore set/list to a single
>> +          core on the system, rather than allow lcore threads to float.
>> +          CPU cores proveded by <code>-l</code> and <code>-c</code> options
>> +          only used by DPDK internal threads. To manage affinity of PMD threads
>> +          use <ref column="other_config" key="pmd-cpu-mask"/>.
>>          </p>
>>          <p>
>> -          The default is empty. Changing this value requires restarting the
>> -          daemon
>> +          if <code>--socket-mem</code> option provided without
>> +          <code>--legacy-mem</code> and <code>--socket-limit</code>, OVS will
>> +          provide <code>--socket-limit</code> equal to the amount of
>> +          preallocated memory specified by <code>--socket-mem</code>.
>>          </p>
>>        </column>
>>  
>> -- 
>> 2.17.1
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> 
>
Flavio Leitner Feb. 5, 2019, 8:19 p.m. | #3
On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote:
> Hi Flavio.
> Thanks for taking a look.
> 
> On 05.02.2019 15:38, Flavio Leitner wrote:
> > 
> > Hi Ilya,
> > 
> > Thanks for the patch and I think we knew about that pain when we
> > exposed the very first parameter. I still remember Aaron struggling
> > to find the best path forward. Time flies and here we are.
> > 
> > The problem is that this change is not friendly from the community
> > perspective to its users. That is like an exposed API which we should
> > avoid break as much as possible.
> > 
> > For instance, there are users (OpenStack) with support life cycle of
> > 5+ years that cannot keep up with this kind of change but they expect
> > to be able to update to newer OVS.
> 
> Sure, there are users that wants stable API that will never change.
> But this is really impossible in practice. I'm working with OpenStack
> too and will have to fixup deployment tools with these changes. BTW, from
> the whole OpenStack only few deployment projects (like TripleO) will need
> to make changes and these changes are not very big.

That's only part of the work. There will be work on QA, documentation
and even migration path from one to another. And we can't change the
past for existing deployments.

> > One idea is to freeze whatever we have now and update the documentation
> > to recommend the new way. We give like a couple OVS releases and only
> > then ignore (or remove?) those parameters.
> 
> Yes, In cover letter I proposed these patches to be applied one per release.
> And current (first) patch does not remove the functionality, only docs.
> Users still will be able to use old interface, but will have warnings
> in the log. In the next release cycle we'll start ignore the values
> while still printing the warnings. This should give enough time for adaptation.
> If you feel that we need more time, we could apply the second patch to 2.14
> (or whatever number will be in 2 releases from now).

I don't think we should remove the docs if the parameters are there as
a first step. I mean, assume an existing deployment, there is a parameter
which might be in use but there is no documentation available. That
doesn't sound like a good user experience to me.

On another hand, you could introduce the new interface and update the
docs to recommend using the new one because the old one will be removed
in the future. Warning messages come next, and then finally its removal.


> > IMO in the end we are moving the problem from one place to another
> > because even with a single string, OVS users will be caught off guard
> > when DPDK changes. Yes, less pain to OVS community in the sense that
> > we don't have to add/remove/deprecate stuff, but it is still a bad
> > user experience regardless, which is not what OVS is known for.
> 
> Unfortunately, DPDK was never user-friendly enough. It tries, but still not.

Agreed.

> We're keeping few sane defaults like providing default lcore and setting the
> socket-limit if needed. And we'll try to do that in the future. The thing
> this patch tries to eliminate is the dependency tracking between different
> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too
> many configuration knobs with similar meanings what does not work at the
> same time.
> 
> Right now there are no critical arguments that user must provide.
> So, IMHO, having special configs for them is really unnecessary.

Do you think the defaults work for the majority of DPDK users?

If we expose only the necessary things to consume DPDK it means we
have tight control (less combinations to worry about), otherwise if
the user can pass anything, a can of worms is opened and we have no
control of all things the user can do.

fbl
Ilya Maximets Feb. 6, 2019, 8:33 a.m. | #4
On 05.02.2019 23:19, Flavio Leitner wrote:
> On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote:
>> Hi Flavio.
>> Thanks for taking a look.
>>
>> On 05.02.2019 15:38, Flavio Leitner wrote:
>>>
>>> Hi Ilya,
>>>
>>> Thanks for the patch and I think we knew about that pain when we
>>> exposed the very first parameter. I still remember Aaron struggling
>>> to find the best path forward. Time flies and here we are.
>>>
>>> The problem is that this change is not friendly from the community
>>> perspective to its users. That is like an exposed API which we should
>>> avoid break as much as possible.
>>>
>>> For instance, there are users (OpenStack) with support life cycle of
>>> 5+ years that cannot keep up with this kind of change but they expect
>>> to be able to update to newer OVS.
>>
>> Sure, there are users that wants stable API that will never change.
>> But this is really impossible in practice. I'm working with OpenStack
>> too and will have to fixup deployment tools with these changes. BTW, from
>> the whole OpenStack only few deployment projects (like TripleO) will need
>> to make changes and these changes are not very big.
> 
> That's only part of the work. There will be work on QA, documentation
> and even migration path from one to another. And we can't change the
> past for existing deployments.

Sure. But incompatible API changes are almost unavoidable for a young
projects that wants to be better.

> 
>>> One idea is to freeze whatever we have now and update the documentation
>>> to recommend the new way. We give like a couple OVS releases and only
>>> then ignore (or remove?) those parameters.
>>
>> Yes, In cover letter I proposed these patches to be applied one per release.
>> And current (first) patch does not remove the functionality, only docs.
>> Users still will be able to use old interface, but will have warnings
>> in the log. In the next release cycle we'll start ignore the values
>> while still printing the warnings. This should give enough time for adaptation.
>> If you feel that we need more time, we could apply the second patch to 2.14
>> (or whatever number will be in 2 releases from now).
> 
> I don't think we should remove the docs if the parameters are there as
> a first step. I mean, assume an existing deployment, there is a parameter
> which might be in use but there is no documentation available. That
> doesn't sound like a good user experience to me.

Maybe we could save a man pages while removing the guides. There is no much
information in Documentation/intro/install/dpdk.rst anyway.

> 
> On another hand, you could introduce the new interface and update the
> docs to recommend using the new one because the old one will be removed
> in the future. Warning messages come next, and then finally its removal.

I'd prefer to have warning messages to be there right from the start to
push users to migrate to the new interfaces as early as possible.

How about this:

First stage (apply now, release in 2.12):
  - Introduce new interface 'dpdk-options'.
  - Rewrite installation guide with new interface fully removing the old one.
  - Add new interface to man pages (vswitch.xml) and mark all the old options
    as deprecated by adding something like: "Deprecated. 'dpdk-options' should
    be used instead. Will be ignored in the future."
  - Add a runtime deprecation warning if old interface is in use.
  - Ignore values of old knobs if 'dpdk-options' provided.

Second stage (release in 2.13 or 2.14, could wait longer if required):
  - Remove old interfaces wile keeping the warnings. (i.e. values always ignored)
  - Remove old knobs from the man pages.

Third stage (optional):
  - Remove warnings.

So the main difference from the current patches is delaying removal of the man
pages to the second stage.

> 
> 
>>> IMO in the end we are moving the problem from one place to another
>>> because even with a single string, OVS users will be caught off guard
>>> when DPDK changes. Yes, less pain to OVS community in the sense that
>>> we don't have to add/remove/deprecate stuff, but it is still a bad
>>> user experience regardless, which is not what OVS is known for.
>>
>> Unfortunately, DPDK was never user-friendly enough. It tries, but still not.
> 
> Agreed.
> 
>> We're keeping few sane defaults like providing default lcore and setting the
>> socket-limit if needed. And we'll try to do that in the future. The thing
>> this patch tries to eliminate is the dependency tracking between different
>> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too
>> many configuration knobs with similar meanings what does not work at the
>> same time.
>>
>> Right now there are no critical arguments that user must provide.
>> So, IMHO, having special configs for them is really unnecessary.
> 
> Do you think the defaults work for the majority of DPDK users?

My personal experience is opposite. Most of my deployments and deployments that
I had to work with had massive 'dpdk-extra' arguments. These are mostly memory
tuning options, pci whitelists and various virtual devices. And these 'dpdk-extra'
strings only grows over time with '--socket-limit', '--legacy-mem',
'--single-file-segments', '--in-memory', '--file-prefix' and so on.

Defaults are OK if your application is the only application you want to see on
a system. Otherwise, you need to fight with DPDK for resources/devices and your
(almost) only weapon is EAL argument string.

> 
> If we expose only the necessary things to consume DPDK it means we
> have tight control (less combinations to worry about), otherwise if
> the user can pass anything, a can of worms is opened and we have no
> control of all things the user can do.

We have only illusion of control. User can pass anything with 'dpdk-extra'.
Also, the only necessary thing with dpdk 18.11 is an lcore list.

IMHO, non-proficent users are happy with defaults. If user wants to tune,
'dpdk-extra' will be used anyway.

Best regards, Ilya Maximets.
Aaron Conole Feb. 6, 2019, 3:05 p.m. | #5
Ilya Maximets <i.maximets@samsung.com> writes:

> On 05.02.2019 23:19, Flavio Leitner wrote:
>> On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote:
>>> Hi Flavio.
>>> Thanks for taking a look.
>>>
>>> On 05.02.2019 15:38, Flavio Leitner wrote:
>>>>
>>>> Hi Ilya,
>>>>
>>>> Thanks for the patch and I think we knew about that pain when we
>>>> exposed the very first parameter. I still remember Aaron struggling
>>>> to find the best path forward. Time flies and here we are.
>>>>
>>>> The problem is that this change is not friendly from the community
>>>> perspective to its users. That is like an exposed API which we should
>>>> avoid break as much as possible.
>>>>
>>>> For instance, there are users (OpenStack) with support life cycle of
>>>> 5+ years that cannot keep up with this kind of change but they expect
>>>> to be able to update to newer OVS.
>>>
>>> Sure, there are users that wants stable API that will never change.
>>> But this is really impossible in practice. I'm working with OpenStack
>>> too and will have to fixup deployment tools with these changes. BTW, from
>>> the whole OpenStack only few deployment projects (like TripleO) will need
>>> to make changes and these changes are not very big.
>> 
>> That's only part of the work. There will be work on QA, documentation
>> and even migration path from one to another. And we can't change the
>> past for existing deployments.
>
> Sure. But incompatible API changes are almost unavoidable for a young
> projects that wants to be better.
>
>> 
>>>> One idea is to freeze whatever we have now and update the documentation
>>>> to recommend the new way. We give like a couple OVS releases and only
>>>> then ignore (or remove?) those parameters.
>>>
>>> Yes, In cover letter I proposed these patches to be applied one per release.
>>> And current (first) patch does not remove the functionality, only docs.
>>> Users still will be able to use old interface, but will have warnings
>>> in the log. In the next release cycle we'll start ignore the values
>>> while still printing the warnings. This should give enough time for adaptation.
>>> If you feel that we need more time, we could apply the second patch to 2.14
>>> (or whatever number will be in 2 releases from now).
>> 
>> I don't think we should remove the docs if the parameters are there as
>> a first step. I mean, assume an existing deployment, there is a parameter
>> which might be in use but there is no documentation available. That
>> doesn't sound like a good user experience to me.
>
> Maybe we could save a man pages while removing the guides. There is no much
> information in Documentation/intro/install/dpdk.rst anyway.
>
>> 
>> On another hand, you could introduce the new interface and update the
>> docs to recommend using the new one because the old one will be removed
>> in the future. Warning messages come next, and then finally its removal.
>
> I'd prefer to have warning messages to be there right from the start to
> push users to migrate to the new interfaces as early as possible.
>
> How about this:
>
> First stage (apply now, release in 2.12):
>   - Introduce new interface 'dpdk-options'.
>   - Rewrite installation guide with new interface fully removing the old one.
>   - Add new interface to man pages (vswitch.xml) and mark all the old options
>     as deprecated by adding something like: "Deprecated. 'dpdk-options' should
>     be used instead. Will be ignored in the future."
>   - Add a runtime deprecation warning if old interface is in use.
>   - Ignore values of old knobs if 'dpdk-options' provided.
>
> Second stage (release in 2.13 or 2.14, could wait longer if required):
>   - Remove old interfaces wile keeping the warnings. (i.e. values always ignored)
>   - Remove old knobs from the man pages.
>
> Third stage (optional):
>   - Remove warnings.
>
> So the main difference from the current patches is delaying removal of the man
> pages to the second stage.
>
>> 
>> 
>>>> IMO in the end we are moving the problem from one place to another
>>>> because even with a single string, OVS users will be caught off guard
>>>> when DPDK changes. Yes, less pain to OVS community in the sense that
>>>> we don't have to add/remove/deprecate stuff, but it is still a bad
>>>> user experience regardless, which is not what OVS is known for.
>>>
>>> Unfortunately, DPDK was never user-friendly enough. It tries, but still not.
>> 
>> Agreed.
>> 
>>> We're keeping few sane defaults like providing default lcore and setting the
>>> socket-limit if needed. And we'll try to do that in the future. The thing
>>> this patch tries to eliminate is the dependency tracking between different
>>> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too
>>> many configuration knobs with similar meanings what does not work at the
>>> same time.
>>>
>>> Right now there are no critical arguments that user must provide.
>>> So, IMHO, having special configs for them is really unnecessary.
>> 
>> Do you think the defaults work for the majority of DPDK users?
>
> My personal experience is opposite. Most of my deployments and deployments that
> I had to work with had massive 'dpdk-extra' arguments. These are mostly memory
> tuning options, pci whitelists and various virtual devices. And these 'dpdk-extra'
> strings only grows over time with '--socket-limit', '--legacy-mem',
> '--single-file-segments', '--in-memory', '--file-prefix' and so on.
>
> Defaults are OK if your application is the only application you want to see on
> a system. Otherwise, you need to fight with DPDK for resources/devices and your
> (almost) only weapon is EAL argument string.
>
>> 
>> If we expose only the necessary things to consume DPDK it means we
>> have tight control (less combinations to worry about), otherwise if
>> the user can pass anything, a can of worms is opened and we have no
>> control of all things the user can do.
>
> We have only illusion of control. User can pass anything with 'dpdk-extra'.
> Also, the only necessary thing with dpdk 18.11 is an lcore list.
>
> IMHO, non-proficent users are happy with defaults. If user wants to tune,
> 'dpdk-extra' will be used anyway.

A few questions about this - doesn't this imply that the dpdk-extra knob
already does everything you want this new dpdk-options to do?  If so,
what is the reason to rewrite the arguments?  It was really painful the
first time we went through this exercise moving from the CLI to the DB.
It's really difficult to make a change like this since no matter how you
try to stage it, it *will* break some deployment.  And in this case, we
have a knob that satisfies the requirement, I think.  Did I
misunderstand it?

> Best regards, Ilya Maximets.
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ilya Maximets Feb. 6, 2019, 4:07 p.m. | #6
On 06.02.2019 18:05, Aaron Conole wrote:
> Ilya Maximets <i.maximets@samsung.com> writes:
> 
>> On 05.02.2019 23:19, Flavio Leitner wrote:
>>> On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote:
>>>> Hi Flavio.
>>>> Thanks for taking a look.
>>>>
>>>> On 05.02.2019 15:38, Flavio Leitner wrote:
>>>>>
>>>>> Hi Ilya,
>>>>>
>>>>> Thanks for the patch and I think we knew about that pain when we
>>>>> exposed the very first parameter. I still remember Aaron struggling
>>>>> to find the best path forward. Time flies and here we are.
>>>>>
>>>>> The problem is that this change is not friendly from the community
>>>>> perspective to its users. That is like an exposed API which we should
>>>>> avoid break as much as possible.
>>>>>
>>>>> For instance, there are users (OpenStack) with support life cycle of
>>>>> 5+ years that cannot keep up with this kind of change but they expect
>>>>> to be able to update to newer OVS.
>>>>
>>>> Sure, there are users that wants stable API that will never change.
>>>> But this is really impossible in practice. I'm working with OpenStack
>>>> too and will have to fixup deployment tools with these changes. BTW, from
>>>> the whole OpenStack only few deployment projects (like TripleO) will need
>>>> to make changes and these changes are not very big.
>>>
>>> That's only part of the work. There will be work on QA, documentation
>>> and even migration path from one to another. And we can't change the
>>> past for existing deployments.
>>
>> Sure. But incompatible API changes are almost unavoidable for a young
>> projects that wants to be better.
>>
>>>
>>>>> One idea is to freeze whatever we have now and update the documentation
>>>>> to recommend the new way. We give like a couple OVS releases and only
>>>>> then ignore (or remove?) those parameters.
>>>>
>>>> Yes, In cover letter I proposed these patches to be applied one per release.
>>>> And current (first) patch does not remove the functionality, only docs.
>>>> Users still will be able to use old interface, but will have warnings
>>>> in the log. In the next release cycle we'll start ignore the values
>>>> while still printing the warnings. This should give enough time for adaptation.
>>>> If you feel that we need more time, we could apply the second patch to 2.14
>>>> (or whatever number will be in 2 releases from now).
>>>
>>> I don't think we should remove the docs if the parameters are there as
>>> a first step. I mean, assume an existing deployment, there is a parameter
>>> which might be in use but there is no documentation available. That
>>> doesn't sound like a good user experience to me.
>>
>> Maybe we could save a man pages while removing the guides. There is no much
>> information in Documentation/intro/install/dpdk.rst anyway.
>>
>>>
>>> On another hand, you could introduce the new interface and update the
>>> docs to recommend using the new one because the old one will be removed
>>> in the future. Warning messages come next, and then finally its removal.
>>
>> I'd prefer to have warning messages to be there right from the start to
>> push users to migrate to the new interfaces as early as possible.
>>
>> How about this:
>>
>> First stage (apply now, release in 2.12):
>>   - Introduce new interface 'dpdk-options'.
>>   - Rewrite installation guide with new interface fully removing the old one.
>>   - Add new interface to man pages (vswitch.xml) and mark all the old options
>>     as deprecated by adding something like: "Deprecated. 'dpdk-options' should
>>     be used instead. Will be ignored in the future."
>>   - Add a runtime deprecation warning if old interface is in use.
>>   - Ignore values of old knobs if 'dpdk-options' provided.
>>
>> Second stage (release in 2.13 or 2.14, could wait longer if required):
>>   - Remove old interfaces wile keeping the warnings. (i.e. values always ignored)
>>   - Remove old knobs from the man pages.
>>
>> Third stage (optional):
>>   - Remove warnings.
>>
>> So the main difference from the current patches is delaying removal of the man
>> pages to the second stage.
>>
>>>
>>>
>>>>> IMO in the end we are moving the problem from one place to another
>>>>> because even with a single string, OVS users will be caught off guard
>>>>> when DPDK changes. Yes, less pain to OVS community in the sense that
>>>>> we don't have to add/remove/deprecate stuff, but it is still a bad
>>>>> user experience regardless, which is not what OVS is known for.
>>>>
>>>> Unfortunately, DPDK was never user-friendly enough. It tries, but still not.
>>>
>>> Agreed.
>>>
>>>> We're keeping few sane defaults like providing default lcore and setting the
>>>> socket-limit if needed. And we'll try to do that in the future. The thing
>>>> this patch tries to eliminate is the dependency tracking between different
>>>> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too
>>>> many configuration knobs with similar meanings what does not work at the
>>>> same time.
>>>>
>>>> Right now there are no critical arguments that user must provide.
>>>> So, IMHO, having special configs for them is really unnecessary.
>>>
>>> Do you think the defaults work for the majority of DPDK users?
>>
>> My personal experience is opposite. Most of my deployments and deployments that
>> I had to work with had massive 'dpdk-extra' arguments. These are mostly memory
>> tuning options, pci whitelists and various virtual devices. And these 'dpdk-extra'
>> strings only grows over time with '--socket-limit', '--legacy-mem',
>> '--single-file-segments', '--in-memory', '--file-prefix' and so on.
>>
>> Defaults are OK if your application is the only application you want to see on
>> a system. Otherwise, you need to fight with DPDK for resources/devices and your
>> (almost) only weapon is EAL argument string.
>>
>>>
>>> If we expose only the necessary things to consume DPDK it means we
>>> have tight control (less combinations to worry about), otherwise if
>>> the user can pass anything, a can of worms is opened and we have no
>>> control of all things the user can do.
>>
>> We have only illusion of control. User can pass anything with 'dpdk-extra'.
>> Also, the only necessary thing with dpdk 18.11 is an lcore list.
>>
>> IMHO, non-proficent users are happy with defaults. If user wants to tune,
>> 'dpdk-extra' will be used anyway.
> 
> A few questions about this - doesn't this imply that the dpdk-extra knob
> already does everything you want this new dpdk-options to do?  If so,
> what is the reason to rewrite the arguments?  It was really painful the
> first time we went through this exercise moving from the CLI to the DB.
> It's really difficult to make a change like this since no matter how you
> try to stage it, it *will* break some deployment.  And in this case, we
> have a knob that satisfies the requirement, I think.  Did I
> misunderstand it?

Yes, 'dpdk-extra' could be used to provide same functionality in current
deployments. Few points why I made another knob to rule them all:

 * I wanted to ignore all the values of old parameters by some rule.
   This rule now is that 'dpdk-options' specified.
 * 'dpdk-extra' is not the right name if we have only one knob.
 * We need to stop supporting old options because we can't support them
   properly and this requires big efforts to track all the dependencies
   which will be tracked by DPDK anyway.

> 
>> Best regards, Ilya Maximets.
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
>
Flavio Leitner Feb. 7, 2019, 4:56 p.m. | #7
On Wed, Feb 06, 2019 at 11:33:04AM +0300, Ilya Maximets wrote:
> On 05.02.2019 23:19, Flavio Leitner wrote:
> > On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote:
> >> Hi Flavio.
> >> Thanks for taking a look.
> >>
> >> On 05.02.2019 15:38, Flavio Leitner wrote:
> >>>
> >>> Hi Ilya,
> >>>
> >>> Thanks for the patch and I think we knew about that pain when we
> >>> exposed the very first parameter. I still remember Aaron struggling
> >>> to find the best path forward. Time flies and here we are.
> >>>
> >>> The problem is that this change is not friendly from the community
> >>> perspective to its users. That is like an exposed API which we should
> >>> avoid break as much as possible.
> >>>
> >>> For instance, there are users (OpenStack) with support life cycle of
> >>> 5+ years that cannot keep up with this kind of change but they expect
> >>> to be able to update to newer OVS.
> >>
> >> Sure, there are users that wants stable API that will never change.
> >> But this is really impossible in practice. I'm working with OpenStack
> >> too and will have to fixup deployment tools with these changes. BTW, from
> >> the whole OpenStack only few deployment projects (like TripleO) will need
> >> to make changes and these changes are not very big.
> > 
> > That's only part of the work. There will be work on QA, documentation
> > and even migration path from one to another. And we can't change the
> > past for existing deployments.
> 
> Sure. But incompatible API changes are almost unavoidable for a young
> projects that wants to be better.

My point here is that the associated impact and cost of this change is
far from trivial.

> > I don't think we should remove the docs if the parameters are there as
> > a first step. I mean, assume an existing deployment, there is a parameter
> > which might be in use but there is no documentation available. That
> > doesn't sound like a good user experience to me.
> 
> Maybe we could save a man pages while removing the guides. There is no much
> information in Documentation/intro/install/dpdk.rst anyway.

Agreed.

> > On another hand, you could introduce the new interface and update the
> > docs to recommend using the new one because the old one will be removed
> > in the future. Warning messages come next, and then finally its removal.
> 
> I'd prefer to have warning messages to be there right from the start to
> push users to migrate to the new interfaces as early as possible.

If it's a WARN level message then I can tell you it will break
environments over here. If that is a INFO level message, then it might
be okay.  Though I still think there could be a period of time where
both could co-exist before we announce/warn about the deprecation of
the current interface.

> How about this:
> 
> First stage (apply now, release in 2.12):
>   - Introduce new interface 'dpdk-options'.
>   - Rewrite installation guide with new interface fully removing the old one.
>   - Add new interface to man pages (vswitch.xml) and mark all the old options
>     as deprecated by adding something like: "Deprecated. 'dpdk-options' should
>     be used instead. Will be ignored in the future."
>   - Add a runtime deprecation warning if old interface is in use.

^^^ this is bad as an initial step as explained above.

>   - Ignore values of old knobs if 'dpdk-options' provided.

Sounds like a good transition barrier.


> Second stage (release in 2.13 or 2.14, could wait longer if required):
>   - Remove old interfaces wile keeping the warnings. (i.e. values always ignored)
>   - Remove old knobs from the man pages.
> 
> Third stage (optional):
>   - Remove warnings.
> 
> So the main difference from the current patches is delaying removal of the man
> pages to the second stage.
[...] 
> >> We're keeping few sane defaults like providing default lcore and setting the
> >> socket-limit if needed. And we'll try to do that in the future. The thing
> >> this patch tries to eliminate is the dependency tracking between different
> >> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too
> >> many configuration knobs with similar meanings what does not work at the
> >> same time.
> >>
> >> Right now there are no critical arguments that user must provide.
> >> So, IMHO, having special configs for them is really unnecessary.
> > 
> > Do you think the defaults work for the majority of DPDK users?
> 
> My personal experience is opposite. Most of my deployments and deployments that
> I had to work with had massive 'dpdk-extra' arguments. These are mostly memory
> tuning options, pci whitelists and various virtual devices. And these 'dpdk-extra'
> strings only grows over time with '--socket-limit', '--legacy-mem',
> '--single-file-segments', '--in-memory', '--file-prefix' and so on.

That is indeed a problem, but moving to 'dpdk-options' will not solve it.

> Defaults are OK if your application is the only application you want to see on
> a system. Otherwise, you need to fight with DPDK for resources/devices and your
> (almost) only weapon is EAL argument string.

The above looks like to be the real motivation behind the change, which is
fine, but somehow I missed that while reading the patchset.

> > If we expose only the necessary things to consume DPDK it means we
> > have tight control (less combinations to worry about), otherwise if
> > the user can pass anything, a can of worms is opened and we have no
> > control of all things the user can do.
> 
> We have only illusion of control. User can pass anything with 'dpdk-extra'.
> Also, the only necessary thing with dpdk 18.11 is an lcore list.

My statement was generic which remains true, but you're right that we
already exposed dpdk-extra and maybe that was already a mistake.

> IMHO, non-proficent users are happy with defaults. If user wants to tune,
> 'dpdk-extra' will be used anyway.

My point is that this is going to an unhealthy direction. We will be
telling that users are in full control of all possible options and
combinations. The user may be or not proficient, doesn't matter because
in the end, users do stupid things and OVS is the one at the front to
blame.

The only reason that '--socket-mem' meaning change was brought up was
because we exposed that to users. That would not have happened if we had
only --dpdk-options and then I wonder how we will manage OVS upgrades
in the future when we upgrade DPDK.

fbl
Kevin Traynor Feb. 8, 2019, 10:36 a.m. | #8
Hi Ilya,

Thanks for raising this and explaining in detail so it can be discussed.
Comments below,

On 02/01/2019 01:11 PM, Ilya Maximets wrote:
> This patch deprecates the usage of current configuration knobs for
> DPDK EAL arguments:
> 
>   - dpdk-alloc-mem
>   - dpdk-socket-mem
>   - dpdk-socket-limit
>   - dpdk-lcore-mask
>   - dpdk-hugepage-dir
>   - dpdk-extra
> 
> All above configuration options replaced with single 'dpdk-options'
> which has same format as old 'dpdk-extra', i.e. just a string with
> all the DPDK EAL arguments.
> 
> All the documentation about old configuration knobs removed. Users
> could still use an old interface, but the deprecation warning will be
> printed. In case 'dpdk-options' provided, values of old options will
> be ignored. New users will start using new 'dpdk-options' as this is
> the only documented way providing EAL arguments.
> 
> Rationale:
> 
> From release to release DPDK becomes more complex. New EAL arguments
> appears, old arguments becomes deprecated. Some changes their meaning
> and behaviour. It's not easy task to manage all this and current
> code in OVS that tries to manage DPDK options is not flexible/extendable
> enough even to track all the dependencies between options in DPDK 18.11.
> For example, '--socket-mem' changed its meaning, '--legacy-mem' and
> '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit'
> could not be used at the same time and, also, we want to provide
> good default value for '--socket-limit' to keep the consistent
> behaviour of OVS memory usage. And this will be worse in the future.
> 

I think it is an exception that shows it is quite stable. There's
probably been something like 10 DPDK releases (didn't check exactly)
since these params were introduced and this seems to be first time that
there was an issue with an argument.

It was also a very rare change in DPDK where the memory mgmt was
re-designed and re-written from the ground up. That meant an extra flag
was needed to keep the same behavior. Perhaps if the existing flag was
not reused, there wouldn't have been an issue at all (or at least it
would have been much clearer).

> Another point is that even now we have mutually exclusive database
> configs in OVS and we have to handle them. i.e we're providing
> 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally,
> user allowed to configure same options in 'dpdk-extra'. So, we have
> similar/equal options in a three places in ovsdb and we need to
> choose which one to use. It's pretty easy for user to get into
> inconsistent database configuration, because users could update
> one field without removing others, and OVS has to resolve all the
> conflicts somehow constructing the EAL args. But we do not know in
> practice, if the resulted arguments are the arguments that user wanted
> to see or just forget to remove the higher priority knob.
> 

That's true about dpdk-alloc-mem and dpdk-socket-mem. I'm not sure if
the docs give some direction on that. IIRC, it is documented and checked
about precedence when using dpdk-extra's.

> Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not
> so user-friendly as '-l', but we have no option for it. Will we create
> additional 'dpdk-lcore-list' ? I guess, no, because it's not really
> important thing. But users will stuck with this not so user-friendly
> masks.
> 

I'm not sure which is more user-friendly but I agree adding
dpdk-lcore-list is not needed, and probably could add confusion
(especially as dpdk-lcore-mask doesn't need '0x' prefix).

> Another thing is that OVS is not really need to check the consistency
> of the EAL arguments because DPDK could check it instead. DPDK will
> always check the args and it will do it better. There is no real
> need to duplicate this functionality.
> 
> Keeping all the options in a single string 'dpdk-options' allows:
> 
>   * To have only one place for all the configs, i.e. it will be harder
>     for user to forget to remove something from the single string while
>     updating it.
>   * Not to check the consistency between different configs, DPDK could
>     check the cmdline itself.
>   * Not to document every DPDK EAL argument in OVS. We can just
>     describe few of them and point to DPDK docs for more information.
>   * OVS still able to provide some minimal default arguments.
>     Thanks to DPDK 18.11 we only need to specify an lcore. All other
>     arguments are not necessary. (We still also providing default
>     --socket-limit in case --socket-mem passed without it and without
>     --legacy-mem)
> 

It would seem wrong to tell the OVS user to fill in the DPDK EAL
arguments in a type of passthrough string to DPDK but then also insert
defaults to that.

> Usage example:
> 
>   ovs-vsctl set Open_vSwitch . \
>       other_config:dpdk-options="-l 1 -n 4 --socket-mem 1024,1024"
> 

It is possible to do that now.

> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>


What I like about the current scheme is that it caters for users who are
very familiar with DPDK and will customise their EAL arguments, but it
also provides an abstraction and defaults for the bare minimum items for
OVS users who aren't as familiar with DPDK.

For example, with the socket-mem change you've highlighted, if this
wasn't handled in OVS (and thanks for that), then every OVS user would
have to find this very subtle change themselves and adjust their
arguments. So I believe that there is value in OVS creating a little
abstraction and defaults for some of the most commonly used args.

Sorry, but overall I don't think there's a big problem that needs to be
addressed and there would be a heavy price in disruption for users.

thanks,
Kevin.
Ilya Maximets Feb. 11, 2019, 3:38 p.m. | #9
On 07.02.2019 19:56, Flavio Leitner wrote:
> On Wed, Feb 06, 2019 at 11:33:04AM +0300, Ilya Maximets wrote:
>> On 05.02.2019 23:19, Flavio Leitner wrote:
>>> On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote:
>>>> Hi Flavio.
>>>> Thanks for taking a look.
>>>>
>>>> On 05.02.2019 15:38, Flavio Leitner wrote:
>>>>>
>>>>> Hi Ilya,
>>>>>
>>>>> Thanks for the patch and I think we knew about that pain when we
>>>>> exposed the very first parameter. I still remember Aaron struggling
>>>>> to find the best path forward. Time flies and here we are.
>>>>>
>>>>> The problem is that this change is not friendly from the community
>>>>> perspective to its users. That is like an exposed API which we should
>>>>> avoid break as much as possible.
>>>>>
>>>>> For instance, there are users (OpenStack) with support life cycle of
>>>>> 5+ years that cannot keep up with this kind of change but they expect
>>>>> to be able to update to newer OVS.
>>>>
>>>> Sure, there are users that wants stable API that will never change.
>>>> But this is really impossible in practice. I'm working with OpenStack
>>>> too and will have to fixup deployment tools with these changes. BTW, from
>>>> the whole OpenStack only few deployment projects (like TripleO) will need
>>>> to make changes and these changes are not very big.
>>>
>>> That's only part of the work. There will be work on QA, documentation
>>> and even migration path from one to another. And we can't change the
>>> past for existing deployments.
>>
>> Sure. But incompatible API changes are almost unavoidable for a young
>> projects that wants to be better.
> 
> My point here is that the associated impact and cost of this change is
> far from trivial.

I agree.

> 
>>> I don't think we should remove the docs if the parameters are there as
>>> a first step. I mean, assume an existing deployment, there is a parameter
>>> which might be in use but there is no documentation available. That
>>> doesn't sound like a good user experience to me.
>>
>> Maybe we could save a man pages while removing the guides. There is no much
>> information in Documentation/intro/install/dpdk.rst anyway.
> 
> Agreed.
> 
>>> On another hand, you could introduce the new interface and update the
>>> docs to recommend using the new one because the old one will be removed
>>> in the future. Warning messages come next, and then finally its removal.
>>
>> I'd prefer to have warning messages to be there right from the start to
>> push users to migrate to the new interfaces as early as possible.
> 
> If it's a WARN level message then I can tell you it will break
> environments over here. If that is a INFO level message, then it might
> be okay.  Though I still think there could be a period of time where
> both could co-exist before we announce/warn about the deprecation of
> the current interface.

Yeah, most testing tools wants to have no warning at all. INFO level messages
could be fine for the period when we're still supporting old knobs.
I'm not sure about co-existence without any messages at all. I think that we
need something that will push users to migrate to the new interface.

> 
>> How about this:
>>
>> First stage (apply now, release in 2.12):
>>   - Introduce new interface 'dpdk-options'.
>>   - Rewrite installation guide with new interface fully removing the old one.
>>   - Add new interface to man pages (vswitch.xml) and mark all the old options
>>     as deprecated by adding something like: "Deprecated. 'dpdk-options' should
>>     be used instead. Will be ignored in the future."
>>   - Add a runtime deprecation warning if old interface is in use.
> 
> ^^^ this is bad as an initial step as explained above.

Sure. Could be changed to INFO.

> 
>>   - Ignore values of old knobs if 'dpdk-options' provided.
> 
> Sounds like a good transition barrier.
> 
> 
>> Second stage (release in 2.13 or 2.14, could wait longer if required):
>>   - Remove old interfaces wile keeping the warnings. (i.e. values always ignored)
>>   - Remove old knobs from the man pages.
>>
>> Third stage (optional):
>>   - Remove warnings.
>>
>> So the main difference from the current patches is delaying removal of the man
>> pages to the second stage.
> [...] 
>>>> We're keeping few sane defaults like providing default lcore and setting the
>>>> socket-limit if needed. And we'll try to do that in the future. The thing
>>>> this patch tries to eliminate is the dependency tracking between different
>>>> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too
>>>> many configuration knobs with similar meanings what does not work at the
>>>> same time.
>>>>
>>>> Right now there are no critical arguments that user must provide.
>>>> So, IMHO, having special configs for them is really unnecessary.
>>>
>>> Do you think the defaults work for the majority of DPDK users?
>>
>> My personal experience is opposite. Most of my deployments and deployments that
>> I had to work with had massive 'dpdk-extra' arguments. These are mostly memory
>> tuning options, pci whitelists and various virtual devices. And these 'dpdk-extra'
>> strings only grows over time with '--socket-limit', '--legacy-mem',
>> '--single-file-segments', '--in-memory', '--file-prefix' and so on.
> 
> That is indeed a problem, but moving to 'dpdk-options' will not solve it.

Yes. But that is not the problem I'm trying to solve.
More configuration knobs is not the answer too.

> 
>> Defaults are OK if your application is the only application you want to see on
>> a system. Otherwise, you need to fight with DPDK for resources/devices and your
>> (almost) only weapon is EAL argument string.
> 
> The above looks like to be the real motivation behind the change, which is
> fine, but somehow I missed that while reading the patchset.

The real motivation from my side is following:
When I started working on 'socket-limit' patch-set I also wanted to implement
something to enable 'legacy-mem' mode. However, I figured out that to do so
I'll need to re-write all the parsing code from scratch and it will mostly
duplicate the same code from DPDK. Our current EAL args parsing/constructing
infrastructure just can't handle complex dependencies between arguments.
So, we have 2 choices. First is to re-implement same checks as inside the DPDK
and keep updating them to be consistent with DPDK internal checks. The second
is to drop attempts to handle dependencies in OVS. The second seems more sane
for me from the developers' point of view.
Another option is to keep the knobs, but stop checking them. i.e. put them
right to the EAL args and allow 'rte_eal_init()' to complain about inconsistency.
Maybe only checking for duplicated options in OVS, not the dependencies.

> 
>>> If we expose only the necessary things to consume DPDK it means we
>>> have tight control (less combinations to worry about), otherwise if
>>> the user can pass anything, a can of worms is opened and we have no
>>> control of all things the user can do.
>>
>> We have only illusion of control. User can pass anything with 'dpdk-extra'.
>> Also, the only necessary thing with dpdk 18.11 is an lcore list.
> 
> My statement was generic which remains true, but you're right that we
> already exposed dpdk-extra and maybe that was already a mistake.

Unfortunately, we can't do any tuning without 'dpdk-extra'.

> 
>> IMHO, non-proficent users are happy with defaults. If user wants to tune,
>> 'dpdk-extra' will be used anyway.
> 
> My point is that this is going to an unhealthy direction. We will be
> telling that users are in full control of all possible options and
> combinations. The user may be or not proficient, doesn't matter because
> in the end, users do stupid things and OVS is the one at the front to
> blame.

Users and developers are able to do stupid things with almost any interface.
Especially, with so complex.

> 
> The only reason that '--socket-mem' meaning change was brought up was
> because we exposed that to users. That would not have happened if we had
> only --dpdk-options

Yes. And that's the key point of this change.

> and then I wonder how we will manage OVS upgrades
> in the future when we upgrade DPDK> 
> fbl
Ilya Maximets Feb. 11, 2019, 5:08 p.m. | #10
On 08.02.2019 13:36, Kevin Traynor wrote:
> Hi Ilya,
> 
> Thanks for raising this and explaining in detail so it can be discussed.

Thanks for sharing your thoughts.

> Comments below,
> 
> On 02/01/2019 01:11 PM, Ilya Maximets wrote:
>> This patch deprecates the usage of current configuration knobs for
>> DPDK EAL arguments:
>>
>>   - dpdk-alloc-mem
>>   - dpdk-socket-mem
>>   - dpdk-socket-limit
>>   - dpdk-lcore-mask
>>   - dpdk-hugepage-dir
>>   - dpdk-extra
>>
>> All above configuration options replaced with single 'dpdk-options'
>> which has same format as old 'dpdk-extra', i.e. just a string with
>> all the DPDK EAL arguments.
>>
>> All the documentation about old configuration knobs removed. Users
>> could still use an old interface, but the deprecation warning will be
>> printed. In case 'dpdk-options' provided, values of old options will
>> be ignored. New users will start using new 'dpdk-options' as this is
>> the only documented way providing EAL arguments.
>>
>> Rationale:
>>
>> From release to release DPDK becomes more complex. New EAL arguments
>> appears, old arguments becomes deprecated. Some changes their meaning
>> and behaviour. It's not easy task to manage all this and current
>> code in OVS that tries to manage DPDK options is not flexible/extendable
>> enough even to track all the dependencies between options in DPDK 18.11.
>> For example, '--socket-mem' changed its meaning, '--legacy-mem' and
>> '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit'
>> could not be used at the same time and, also, we want to provide
>> good default value for '--socket-limit' to keep the consistent
>> behaviour of OVS memory usage. And this will be worse in the future.
>>
> 
> I think it is an exception that shows it is quite stable. There's
> probably been something like 10 DPDK releases (didn't check exactly)
> since these params were introduced and this seems to be first time that
> there was an issue with an argument.
> 
> It was also a very rare change in DPDK where the memory mgmt was
> re-designed and re-written from the ground up. That meant an extra flag
> was needed to keep the same behavior. Perhaps if the existing flag was
> not reused, there wouldn't have been an issue at all (or at least it
> would have been much clearer).

I want to clarify that the 'socket-mem' related issue is not fully solved
in current OVS. It's still possible for user to specify the 'dpdk-socket-limit'
knob and use '--legacy-mem' in 'dpdk-extra'. OVS is not able to track this
in current parsing code without introducing nasty workarounds.

> 
>> Another point is that even now we have mutually exclusive database
>> configs in OVS and we have to handle them. i.e we're providing
>> 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally,
>> user allowed to configure same options in 'dpdk-extra'. So, we have
>> similar/equal options in a three places in ovsdb and we need to
>> choose which one to use. It's pretty easy for user to get into
>> inconsistent database configuration, because users could update
>> one field without removing others, and OVS has to resolve all the
>> conflicts somehow constructing the EAL args. But we do not know in
>> practice, if the resulted arguments are the arguments that user wanted
>> to see or just forget to remove the higher priority knob.
>>
> 
> That's true about dpdk-alloc-mem and dpdk-socket-mem. I'm not sure if
> the docs give some direction on that. IIRC, it is documented and checked
> about precedence when using dpdk-extra's.
> 
>> Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not
>> so user-friendly as '-l', but we have no option for it. Will we create
>> additional 'dpdk-lcore-list' ? I guess, no, because it's not really
>> important thing. But users will stuck with this not so user-friendly
>> masks.
>>
> 
> I'm not sure which is more user-friendly but I agree adding
> dpdk-lcore-list is not needed, and probably could add confusion
> (especially as dpdk-lcore-mask doesn't need '0x' prefix).

IMHO, numbers are more visual than masks. But yes, this will add more
confusion.

> 
>> Another thing is that OVS is not really need to check the consistency
>> of the EAL arguments because DPDK could check it instead. DPDK will
>> always check the args and it will do it better. There is no real
>> need to duplicate this functionality.
>>
>> Keeping all the options in a single string 'dpdk-options' allows:
>>
>>   * To have only one place for all the configs, i.e. it will be harder
>>     for user to forget to remove something from the single string while
>>     updating it.
>>   * Not to check the consistency between different configs, DPDK could
>>     check the cmdline itself.
>>   * Not to document every DPDK EAL argument in OVS. We can just
>>     describe few of them and point to DPDK docs for more information.
>>   * OVS still able to provide some minimal default arguments.
>>     Thanks to DPDK 18.11 we only need to specify an lcore. All other
>>     arguments are not necessary. (We still also providing default
>>     --socket-limit in case --socket-mem passed without it and without
>>     --legacy-mem)
>>
> 
> It would seem wrong to tell the OVS user to fill in the DPDK EAL
> arguments in a type of passthrough string to DPDK but then also insert
> defaults to that.

That is the good point. Maybe we could drop the defaults except the
lcore list (which is mandatory in dpdk) if 'dpdk-options' provided.

> 
>> Usage example:
>>
>>   ovs-vsctl set Open_vSwitch . \
>>       other_config:dpdk-options="-l 1 -n 4 --socket-mem 1024,1024"
>>
> 
> It is possible to do that now.

Sure, this could be done with 'dpdk-extra'. But as I said, 'dpdk-extra'
is not the right name for the only configuration option.

Anyway, the main target for this patch is not to change the user API,
but to simplify the code. i.e. there is an option to keep the old API
if we'll stop checking it for consistency in OVS code.

> 
>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> 
> 
> What I like about the current scheme is that it caters for users who are
> very familiar with DPDK and will customise their EAL arguments, but it
> also provides an abstraction and defaults for the bare minimum items for
> OVS users who aren't as familiar with DPDK.
> 
> For example, with the socket-mem change you've highlighted, if this
> wasn't handled in OVS (and thanks for that), then every OVS user would
> have to find this very subtle change themselves and adjust their
> arguments. So I believe that there is value in OVS creating a little
> abstraction and defaults for some of the most commonly used args.

With DPDK 18.11 we only need to provide default value for the lcore mask.
And we're providing it. So, users may not think about EAL args. You don't
need to have any DPDK knob configured.
We may highlight some incompatible EAL ARGS changes in NEWS while upgrading
to new DPDK release. This should be user-visible, because NEWS should be
the first thing you read before upgrading your OVS version. No real need to
track the changes and try to preserve previous behaviour in code.

OTOH, it might be better for some users if we'll not force the 'socket-limit'.
i.e. they will have better memory model out-of-the-box without need to
disable limits. But we're forcing them to look at the release notes of OVS
and DPDK to find out that there is a better solution for them.

In fact, we're pushing many users to the past, protecting them from the
better life. And they could never know about that because we're hiding
changes making everything work same as before.

> 
> Sorry, but overall I don't think there's a big problem that needs to be
> addressed and there would be a heavy price in disruption for users.

I see your point. That's what an RFCs are made for, to gather all the
opinions and make a good decision.

> 
> thanks,
> Kevin.

Patch

diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
index 344d2b3a6..2243fde53 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -235,32 +235,44 @@  listed below. Defaults will be provided for all values not explicitly set.
   A value of ``try`` will imply that the ovs-vswitchd process should
   continue running even if the EAL initialization fails.
 
-``dpdk-lcore-mask``
-  Specifies the CPU cores on which dpdk lcore threads should be spawned and
-  expects hex string (eg '0x123').
+``dpdk-options``
+  Specifies the string of EAL command line arguments for DPDK.
+  For example, ``other_config:dpdk-options="-l 0 --socket-mem 1024,1024"``.
+  Important arguments that could be passed there are:
 
-``dpdk-socket-mem``
-  Comma separated list of memory to pre-allocate from hugepages on specific
-  sockets. If not specified, 1024 MB will be set for each numa node by
-  default.
+  - ``-c`` or ``-l``
 
-``dpdk-hugepage-dir``
-  Directory where hugetlbfs is mounted
+    Specifies the CPU cores on which dpdk lcore threads should be spawned.
+    ``-c`` expects hex string (eg ``0x123``) while ``-l`` expects core list
+    (eg ``0-5,8``).
+
+  - ``--socket-mem``
+
+    Comma separated list of memory to pre-allocate from hugepages on specific
+    sockets.
+
+  - ``--huge-dir``
+
+    Directory where hugetlbfs is mounted
+
+  - See DPDK documentation for the full list:
+
+    https://doc.dpdk.org/guides-18.11/linux_gsg/linux_eal_parameters.html
 
 ``vhost-sock-dir``
   Option to set the path to the vhost-user unix socket files.
 
-If allocating more than one GB hugepage, you can configure the
-amount of memory used from any given NUMA nodes. For example, to use 1GB from
-NUMA node 0 and 0GB for all other NUMA nodes, run::
+You can configure the amount of memory preallocated from any given NUMA nodes.
+For example, to preallocate ``1GB`` from NUMA node ``0`` and ``0GB`` for all
+other NUMA nodes, run::
 
     $ ovs-vsctl --no-wait set Open_vSwitch . \
-        other_config:dpdk-socket-mem="1024,0"
+        other_config:dpdk-options="<...> --socket-mem 1024,0"
 
 or::
 
     $ ovs-vsctl --no-wait set Open_vSwitch . \
-        other_config:dpdk-socket-mem="1024"
+        other_config:dpdk-options="<...> --socket-mem 1024"
 
 .. note::
   Changing any of these options requires restarting the ovs-vswitchd
diff --git a/NEWS b/NEWS
index a64b9d94d..1c09e9d57 100644
--- a/NEWS
+++ b/NEWS
@@ -1,8 +1,11 @@ 
 Post-v2.11.0
 ---------------------
    - DPDK:
-     * New option 'other_config:dpdk-socket-limit' to limit amount of
-       hugepage memory that can be used by DPDK.
+     * New option 'other_config:dpdk-options' to pass all the
+       DPDK EAL argumants in a single string.
+     * Following config options deprecated:
+       'dpdk-alloc-mem', 'dpdk-socket-mem', 'dpdk-socket-limit',
+       'dpdk-lcore-mask', 'dpdk-hugepage-dir' and 'dpdk-extra'.
 
 
 v2.11.0 - xx xxx xxxx
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 53b74fba4..0c37f231d 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -91,6 +91,41 @@  args_contains(const struct svec *args, const char *value)
     return false;
 }
 
+static bool
+report_deprecated_configs(const struct smap *ovs_other_config, bool ignore)
+{
+    struct option {
+        const char *opt;
+        const char *replacement;
+    } options[] = {
+        { "dpdk-alloc-mem",    "-m"             },
+        { "dpdk-socket-mem",   "--socket-mem"   },
+        { "dpdk-socket-limit", "--socket-limit" },
+        { "dpdk-lcore-mask",   "-c"             },
+        { "dpdk-hugepage-dir", "--huge-dir"     },
+        { "dpdk-extra",        ""               }
+    };
+    int i;
+    bool found = false;
+
+    for (i = 0; i < ARRAY_SIZE(options); i++) {
+        const char *value = smap_get(ovs_other_config, options[i].opt);
+
+        if (value) {
+            VLOG_WARN("Detected deprecated '%s' config. Use '%s %s'"
+                      " in 'dpdk-options' instead.%s",
+                      options[i].opt, options[i].replacement, value,
+                      ignore ? " Value ignored." : "");
+            found = true;
+        }
+    }
+    if (found) {
+        VLOG_WARN("Deprecated options will be "
+                  "silently ignored in the future.");
+    }
+    return found;
+}
+
 static void
 construct_dpdk_options(const struct smap *ovs_other_config, struct svec *args)
 {
@@ -270,8 +305,10 @@  dpdk_init__(const struct smap *ovs_other_config)
     char **argv = NULL;
     int result;
     bool auto_determine = true;
+    bool deprecated_found;
     int err = 0;
     cpu_set_t cpuset;
+    const char *dpdk_options;
     struct svec args = SVEC_EMPTY_INITIALIZER;
 
     log_stream = fopencookie(NULL, "w+", dpdk_log_func);
@@ -317,7 +354,15 @@  dpdk_init__(const struct smap *ovs_other_config)
               per_port_memory ? "enabled" : "disabled");
 
     svec_add(&args, ovs_get_program_name());
-    construct_dpdk_args(ovs_other_config, &args);
+    dpdk_options = smap_get(ovs_other_config, "dpdk-options");
+
+    deprecated_found = report_deprecated_configs(ovs_other_config,
+                                                 dpdk_options ? true : false);
+    if (dpdk_options) {
+        svec_parse_words(&args, dpdk_options);
+    } else if (deprecated_found) {
+        construct_dpdk_args(ovs_other_config, &args);
+    }
 
     if (!args_contains(&args, "--legacy-mem")
         && !args_contains(&args, "--socket-limit")) {
@@ -357,7 +402,7 @@  dpdk_init__(const struct smap *ovs_other_config)
                 }
             }
         } else {
-            /* User did not set dpdk-lcore-mask and unable to get current
+            /* User did not set lcore mask and unable to get current
              * thread affintity - default to core #0 */
             VLOG_ERR("Thread getaffinity error %d. Using core #0", err);
         }
diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
index 248d22ab9..d10b40c87 100755
--- a/utilities/ovs-dev.py
+++ b/utilities/ovs-dev.py
@@ -283,7 +283,7 @@  def run():
         _sh("ovs-vsctl --no-wait set Open_vSwitch %s "
             "other_config:dpdk-init=true" % root_uuid)
         _sh("ovs-vsctl --no-wait set Open_vSwitch %s other_config:"
-            "dpdk-extra=\"%s\"" % (root_uuid, ' '.join(options.dpdk)))
+            "dpdk-options=\"%s\"" % (root_uuid, ' '.join(options.dpdk)))
     else:
         _sh("ovs-vsctl --no-wait set Open_vSwitch %s "
             "other_config:dpdk-init=false" % root_uuid)
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 1db4e8694..c30265d9e 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -234,56 +234,6 @@ 
         </p>
       </column>
 
-      <column name="other_config" key="dpdk-init"
-              type='{"type": "string",
-                     "enum": ["set", ["false", "true", "try"]]}'>
-        <p>
-          Set this value to <code>true</code> or <code>try</code> to enable
-          runtime support for DPDK ports. The vswitch must have compile-time
-          support for DPDK as well.
-        </p>
-        <p>
-          A value of <code>true</code> will cause the ovs-vswitchd process to
-          abort if DPDK cannot be initialized. A value of <code>try</code>
-          will allow the ovs-vswitchd process to continue running even if DPDK
-          cannot be initialized.
-        </p>
-        <p>
-          The default value is <code>false</code>. Changing this value requires
-          restarting the daemon
-        </p>
-        <p>
-          If this value is <code>false</code> at startup, any dpdk ports which
-          are configured in the bridge will fail due to memory errors.
-        </p>
-      </column>
-
-      <column name="other_config" key="dpdk-lcore-mask"
-              type='{"type": "integer", "minInteger": 1}'>
-        <p>
-          Specifies the CPU cores where dpdk lcore threads should be spawned.
-          The DPDK lcore threads are used for DPDK library tasks, such as
-          library internal message processing, logging, etc. Value should be in
-          the form of a hex string (so '0x123') similar to the 'taskset' mask
-          input.
-        </p>
-        <p>
-          The lowest order bit corresponds to the first CPU core.  A set bit
-          means the corresponding core is available and an lcore thread will be
-          created and pinned to it.  If the input does not cover all cores,
-          those uncovered cores are considered not set.
-        </p>
-        <p>
-          For performance reasons, it is best to set this to a single core on
-          the system, rather than allow lcore threads to float.
-        </p>
-        <p>
-          If not specified, the value will be determined by choosing the lowest
-          CPU core from initial cpu affinity list. Otherwise, the value will be
-          passed directly to the DPDK library.
-        </p>
-      </column>
-
       <column name="other_config" key="pmd-cpu-mask">
         <p>
           Specifies CPU mask for setting the cpu affinity of PMD (Poll
@@ -303,78 +253,54 @@ 
         </p>
       </column>
 
-      <column name="other_config" key="dpdk-alloc-mem"
-              type='{"type": "integer", "minInteger": 0}'>
+      <column name="other_config" key="dpdk-init"
+              type='{"type": "string",
+                     "enum": ["set", ["false", "true", "try"]]}'>
         <p>
-          Specifies the amount of memory to preallocate from the hugepage pool,
-          regardless of socket. It is recommended that dpdk-socket-mem is used
-          instead.
+          Set this value to <code>true</code> or <code>try</code> to enable
+          runtime support for DPDK ports. The vswitch must have compile-time
+          support for DPDK as well.
         </p>
-      </column>
-
-      <column name="other_config" key="dpdk-socket-mem"
-              type='{"type": "string"}'>
         <p>
-          Specifies the amount of memory to preallocate from the hugepage pool,
-          on a per-socket basis.
+          A value of <code>true</code> will cause the ovs-vswitchd process to
+          abort if DPDK cannot be initialized. A value of <code>try</code>
+          will allow the ovs-vswitchd process to continue running even if DPDK
+          cannot be initialized.
         </p>
         <p>
-          The specifier is a comma-separated string, in ascending order of CPU
-          socket. E.g. On a four socket system 1024,0,2048 would set socket 0
-          to preallocate 1024MB, socket 1 to preallocate 0MB, socket 2 to
-          preallocate 2048MB and socket 3 (no value given) to preallocate 0MB.
+          The default value is <code>false</code>. Changing this value requires
+          restarting the daemon
         </p>
         <p>
-          If dpdk-socket-mem and dpdk-alloc-mem are not specified, dpdk-socket-mem
-          will be used and the default value is 1024 for each numa node. If
-          dpdk-socket-mem and dpdk-alloc-mem are specified at same time,
-          dpdk-socket-mem will be used as default. Changing this value
-          requires restarting the daemon.
+          If this value is <code>false</code> at startup, any dpdk ports which
+          are configured in the bridge will fail due to memory errors.
         </p>
       </column>
 
-      <column name="other_config" key="dpdk-socket-limit"
+      <column name="other_config" key="dpdk-options"
               type='{"type": "string"}'>
         <p>
-          Limits the maximum amount of memory that can be used from the
-          hugepage pool, on a per-socket basis.
+          Specifies EAL command line arguments for DPDK. For example,
+          <code>"-l 1 --socket-mem 1024,1024 -n 4"</code>
         </p>
         <p>
-          The specifier is a comma-separated list of memory limits per socket.
-          <code>0</code> will disable the limit for a particular socket.
-        </p>
-        <p>
-          If not specified, OVS will configure limits equal to the amount of
-          preallocated memory specified by <ref column="other_config"
-          key="dpdk-socket-mem"/> or <code>--socket-mem</code> in
-          <ref column="other_config" key="dpdk-extra"/>. If none of the above
-          options specified or <code>--legacy-mem</code> provided in
-          <ref column="other_config" key="dpdk-extra"/>, limits will not be
-          applied.
           Changing this value requires restarting the daemon.
         </p>
-      </column>
-
-      <column name="other_config" key="dpdk-hugepage-dir"
-              type='{"type": "string"}'>
-        <p>
-          Specifies the path to the hugetlbfs mount point.
-        </p>
-        <p>
-          If not specified, this will be guessed by the DPDK library (default
-          is /dev/hugepages). Changing this value requires restarting the
-          daemon.
-        </p>
-      </column>
-
-      <column name="other_config" key="dpdk-extra"
-              type='{"type": "string"}'>
         <p>
-          Specifies additional eal command line arguments for DPDK.
+          If <code>-l</code> and <code>-c</code> options are not specified,
+          the default value will be provided by choosing the lowest CPU core
+          from initial cpu affinity list.
+          For performance reasons, it is best to set lcore set/list to a single
+          core on the system, rather than allow lcore threads to float.
+          CPU cores proveded by <code>-l</code> and <code>-c</code> options
+          only used by DPDK internal threads. To manage affinity of PMD threads
+          use <ref column="other_config" key="pmd-cpu-mask"/>.
         </p>
         <p>
-          The default is empty. Changing this value requires restarting the
-          daemon
+          if <code>--socket-mem</code> option provided without
+          <code>--legacy-mem</code> and <code>--socket-limit</code>, OVS will
+          provide <code>--socket-limit</code> equal to the amount of
+          preallocated memory specified by <code>--socket-mem</code>.
         </p>
       </column>