[ovs-dev] dpdk: Fix main lcore on systems with many cores.

Message ID 20250417113052.879025-1-david.marchand@redhat.com
State Accepted
Commit fe53b478f86e3b35668479279826fb2df26d2a5b
Delegated to: aaron conole

Commit Message

David Marchand April 17, 2025, 11:30 a.m. UTC
If OVS is started with a cpu affinity which starts at a core >= 128,
EAL won't be able to run since the -l option is limited to RTE_MAX_LCORE
(which defaults to 128 on x86_64).

Instead map the first discovered cpu to lcore 0.

Reported-at: https://issues.redhat.com/browse/FDP-1312
Fixes: 88964e6428dc ("netdev-dpdk: Autofill lcore coremask if absent")
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 lib/dpdk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Eelco Chaudron April 17, 2025, 1:40 p.m. UTC | #1
On 17 Apr 2025, at 13:30, David Marchand via dev wrote:

> If OVS is started with a cpu affinity which starts at a core >= 128,
> EAL won't be able to run since the -l option is limited to RTE_MAX_LCORE
> (which defaults to 128 on x86_64).
>
> Instead map the first discovered cpu to lcore 0.
>
> Reported-at: https://issues.redhat.com/browse/FDP-1312
> Fixes: 88964e6428dc ("netdev-dpdk: Autofill lcore coremask if absent")
> Signed-off-by: David Marchand <david.marchand@redhat.com>

The change looks good to me. I do not have a 128+ system to test, but playing with this on a dual-socket system did not show any problems.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
David Marchand April 17, 2025, 1:45 p.m. UTC | #2
On Thu, Apr 17, 2025 at 3:40 PM Eelco Chaudron <echaudro@redhat.com> wrote:
> On 17 Apr 2025, at 13:30, David Marchand via dev wrote:
>
> > If OVS is started with a cpu affinity which starts at a core >= 128,
> > EAL won't be able to run since the -l option is limited to RTE_MAX_LCORE
> > (which defaults to 128 on x86_64).
> >
> > Instead map the first discovered cpu to lcore 0.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-1312
> > Fixes: 88964e6428dc ("netdev-dpdk: Autofill lcore coremask if absent")
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
>
> The change looks good to me. I do not have a 128+ system to test, but playing with this on a dual-socket system did not show any problems.

You could artificially reproduce the issue by recompiling DPDK with
-Dmax_lcores=8 for example.

Thanks for the review.
Aaron Conole April 18, 2025, 4:43 p.m. UTC | #3
David Marchand via dev <ovs-dev@openvswitch.org> writes:

> If OVS is started with a cpu affinity which starts at a core >= 128,
> EAL won't be able to run since the -l option is limited to RTE_MAX_LCORE
> (which defaults to 128 on x86_64).
>
> Instead map the first discovered cpu to lcore 0.
>
> Reported-at: https://issues.redhat.com/browse/FDP-1312
> Fixes: 88964e6428dc ("netdev-dpdk: Autofill lcore coremask if absent")
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---

LGTM. It could be noted that the workaround for this is to specify a
core set in the EAL args list (via the dpdk-extra option) or to set the
dpdk-lcore-mask db option.

Acked-by: Aaron Conole <aconole@redhat.com>
Kevin Traynor April 23, 2025, 3:32 p.m. UTC | #4
On 17/04/2025 12:30, David Marchand via dev wrote:
> If OVS is started with a cpu affinity which starts at a core >= 128,
> EAL won't be able to run since the -l option is limited to RTE_MAX_LCORE
> (which defaults to 128 on x86_64).
> 
> Instead map the first discovered cpu to lcore 0.
> 
> Reported-at: https://issues.redhat.com/browse/FDP-1312
> Fixes: 88964e6428dc ("netdev-dpdk: Autofill lcore coremask if absent")
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
>  lib/dpdk.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 2d22e2b8dd..1f4f2bf083 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -364,8 +364,8 @@ dpdk_init__(const struct smap *ovs_other_config)
>               * thread affintity - default to core #0 */
>              VLOG_ERR("Thread getaffinity failed. Using core #0");
>          }
> -        svec_add(&args, "-l");
> -        svec_add_nocopy(&args, xasprintf("%d", cpu));
> +        svec_add(&args, "--lcores");
> +        svec_add_nocopy(&args, xasprintf("0@%d", cpu));
>      }
>  
>      svec_terminate(&args);

Acked-by: Kevin Traynor <ktraynor@redhat.com>
Aaron Conole April 24, 2025, 1:23 p.m. UTC | #5
David Marchand via dev <ovs-dev@openvswitch.org> writes:

> If OVS is started with a cpu affinity which starts at a core >= 128,
> EAL won't be able to run since the -l option is limited to RTE_MAX_LCORE
> (which defaults to 128 on x86_64).
>
> Instead map the first discovered cpu to lcore 0.
>
> Reported-at: https://issues.redhat.com/browse/FDP-1312
> Fixes: 88964e6428dc ("netdev-dpdk: Autofill lcore coremask if absent")
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---

Thanks David, Eelco, and Kevin.  Applied and backported down to 3.2.

Patch

diff --git a/lib/dpdk.c b/lib/dpdk.c
index 2d22e2b8dd..1f4f2bf083 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -364,8 +364,8 @@  dpdk_init__(const struct smap *ovs_other_config)
              * thread affintity - default to core #0 */
             VLOG_ERR("Thread getaffinity failed. Using core #0");
         }
-        svec_add(&args, "-l");
-        svec_add_nocopy(&args, xasprintf("%d", cpu));
+        svec_add(&args, "--lcores");
+        svec_add_nocopy(&args, xasprintf("0@%d", cpu));
     }
 
     svec_terminate(&args);
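
The effect of the change can be sketched outside OVS as follows. This is a
minimal illustration, not the code in lib/dpdk.c: the helper names, the
SKETCH_MAX_LCORE constant and the plain bool array standing in for the thread
affinity mask are all assumptions made for the example. dash_l_accepts() only
models the limit described in the commit message (a CPU id passed to -l must
be below the maximum lcore count); it does not reproduce EAL's real parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for DPDK's compile-time lcore limit
 * (128 on x86_64 according to the commit message). */
#define SKETCH_MAX_LCORE 128

/* Return the lowest CPU id marked true in 'allowed', or 0 if none is set,
 * mirroring the "default to core #0" fallback in the patch context.
 * 'allowed' is an assumed stand-in for the thread affinity mask. */
static int
first_allowed_cpu(const bool *allowed, size_t n_cpus)
{
    for (size_t cpu = 0; cpu < n_cpus; cpu++) {
        if (allowed[cpu]) {
            return (int) cpu;
        }
    }
    return 0;
}

/* Old form, "-l <cpu>": the CPU id doubles as the lcore id, so it is
 * only usable while the id stays below the lcore limit. */
static bool
dash_l_accepts(int cpu, int max_lcore)
{
    return cpu >= 0 && cpu < max_lcore;
}

/* New form, "--lcores 0@<cpu>": lcore 0 is mapped onto the physical CPU,
 * so the CPU id itself no longer has to fit under the lcore limit. */
static void
format_lcores_arg(int cpu, char *buf, size_t len)
{
    int n = snprintf(buf, len, "0@%d", cpu);

    assert(n > 0 && (size_t) n < len);
    (void) n;
}
```

With an affinity covering only CPUs 130 and 131, first_allowed_cpu() yields
130, dash_l_accepts(130, SKETCH_MAX_LCORE) is false, and format_lcores_arg()
produces "0@130". Lowering the limit to 8 makes CPU 8 hit the same condition,
which is the idea behind rebuilding DPDK with -Dmax_lcores=8 to reproduce the
issue on a smaller machine.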