Message ID | 20220527171817.662264-2-msantana@redhat.com
---|---
State | Superseded
Series | [ovs-dev,v4,1/2] handlers: Create additional handler threads when using CPU isolation
Context | Check | Description
---|---|---
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
On 5/27/22 13:18, Michael Santana wrote:
> The handler and CPU mapping in upcalls are incorrect, and this is
> especially noticeable on systems with CPU isolation enabled.
>
> Say we have a 12 core system where only every even-numbered CPU is
> enabled: C0, C2, C4, C6, C8, C10
>
> This means we will create an array of size 6 that will be sent to the
> kernel and that is populated with sockets [S0, S1, S2, S3, S4, S5]
>
> The problem is that when the kernel does an upcall it checks the socket
> array via the index of the CPU, effectively adding additional load on
> some CPUs while leaving no work on other CPUs.
>
> e.g.
>
> C0 indexes to S0
> C2 indexes to S2 (should be S1)
> C4 indexes to S4 (should be S2)
>
> Modulo of 6 (size of socket array) is applied, so we wrap back to S0
> C6 indexes to S0 (should be S3)
> C8 indexes to S2 (should be S4)
> C10 indexes to S4 (should be S5)
>
> Effectively sockets S0, S2, S4 get overloaded while sockets S1, S3, S5
> get no work assigned to them.
>
> This leads the kernel to throw the following message:
> "openvswitch: cpu_id mismatch with handler threads"
>
> Instead we will send the kernel a corrected array of sockets the size
> of all CPUs in the system. In the above example we would create a
> corrected array in a round-robin (assuming prime bias) fashion as
> follows:
> [S0, S1, S2, S3, S4, S5, S6, S0, S1, S2, S3, S4]
>
> Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.")
>
> Co-authored-by: Aaron Conole <aconole@redhat.com>
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> Signed-off-by: Michael Santana <msantana@redhat.com>
> ---
>  lib/dpif-netlink.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 53de16e12..1321ac396 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -803,11 +803,11 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids,
>      struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
>      struct dpif_netlink_dp request, reply;
>      struct ofpbuf *bufp;
> -    int error;
> -    int n_cores;
>
> -    n_cores = count_cpu_cores();
> -    ovs_assert(n_cores == n_upcall_pids);
> +    uint32_t *corrected;
> +    int error, i, n_cores;
> +
> +    n_cores = count_total_cores();
>      VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores);
>
>      dpif_netlink_dp_init(&request);
> @@ -817,7 +817,12 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids,
>      request.user_features = dpif->user_features |
>                              OVS_DP_F_DISPATCH_UPCALL_PER_CPU;
>
> -    request.upcall_pids = upcall_pids;
> +    corrected = xcalloc(n_cores, sizeof(uint32_t));
> +
> +    for (i = 0; i < n_cores; i++) {
> +        corrected[i] = upcall_pids[i % n_upcall_pids];
> +    }
> +    request.upcall_pids = corrected;
>      request.n_upcall_pids = n_cores;
>
>      error = dpif_netlink_dp_transact(&request, &reply, &bufp);
> @@ -825,9 +830,10 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids,
>          dpif->user_features = reply.user_features;
>          ofpbuf_delete(bufp);
>          if (!dpif_netlink_upcall_per_cpu(dpif)) {
> -            return -EOPNOTSUPP;
> +            error = -EOPNOTSUPP;
>          }
>      }
> +    free(corrected);
>      return error;
>  }
>
> @@ -2545,7 +2551,7 @@ dpif_netlink_refresh_handlers_cpu_dispatch(struct dpif_netlink *dpif)
>      uint32_t n_handlers;
>      uint32_t *upcall_pids;
>
> -    n_handlers = count_cpu_cores();
> +    n_handlers = dpif_netlink_calculate_handlers_num();

Also, for the purposes of patch 1/2 and 2/2, I think this line should be
moved down to patch 1/2. Without this line, patch 1/2 will create the
additional handler threads, but there won't be additional sockets created
at this step:

>      if (dpif->n_handlers != n_handlers) {
>          VLOG_DBG("Dispatch mode(per-cpu): initializing %d handlers",
>                   n_handlers);
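To make the arithmetic in the commit message concrete, here is a small standalone C sketch. It is purely illustrative and not OVS code: the 12 possible CPUs with only the even-numbered ones online come straight from the example above, and the seven handler sockets S0..S6 are the count implied by the corrected array shown there (patch 1/2's "prime bias"). The kernel selects the socket by indexing the PID array with the CPU id, so the array must have one entry per possible CPU, filled round-robin exactly as the patch's loop does.

```c
#include <stdio.h>

#define N_CORES 12      /* what count_total_cores() would report in the example */
#define N_HANDLERS 7    /* handler sockets S0..S6 assumed from the commit message */

int main(void)
{
    int corrected[N_CORES];

    /* Same fill as the patch: corrected[i] = upcall_pids[i % n_upcall_pids]. */
    for (int i = 0; i < N_CORES; i++) {
        corrected[i] = i % N_HANDLERS;
    }

    /* Only the even-numbered CPUs are online; each one now indexes a
     * distinct socket (S0, S2, S4, S6, S1, S3) instead of S0, S2 and S4
     * taking all of the upcall load. */
    for (int cpu = 0; cpu < N_CORES; cpu += 2) {
        printf("C%-2d -> S%d\n", cpu, corrected[cpu]);
    }
    return 0;
}
```

The printed mapping matches the corrected array [S0, S1, S2, S3, S4, S5, S6, S0, S1, S2, S3, S4] from the commit message when read at the even indexes.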
```diff
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 53de16e12..1321ac396 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -803,11 +803,11 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids,
     struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
     struct dpif_netlink_dp request, reply;
     struct ofpbuf *bufp;
-    int error;
-    int n_cores;

-    n_cores = count_cpu_cores();
-    ovs_assert(n_cores == n_upcall_pids);
+    uint32_t *corrected;
+    int error, i, n_cores;
+
+    n_cores = count_total_cores();
     VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores);

     dpif_netlink_dp_init(&request);
@@ -817,7 +817,12 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids,
     request.user_features = dpif->user_features |
                             OVS_DP_F_DISPATCH_UPCALL_PER_CPU;

-    request.upcall_pids = upcall_pids;
+    corrected = xcalloc(n_cores, sizeof(uint32_t));
+
+    for (i = 0; i < n_cores; i++) {
+        corrected[i] = upcall_pids[i % n_upcall_pids];
+    }
+    request.upcall_pids = corrected;
     request.n_upcall_pids = n_cores;

     error = dpif_netlink_dp_transact(&request, &reply, &bufp);
@@ -825,9 +830,10 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids,
         dpif->user_features = reply.user_features;
         ofpbuf_delete(bufp);
         if (!dpif_netlink_upcall_per_cpu(dpif)) {
-            return -EOPNOTSUPP;
+            error = -EOPNOTSUPP;
         }
     }
+    free(corrected);
     return error;
 }

@@ -2545,7 +2551,7 @@ dpif_netlink_refresh_handlers_cpu_dispatch(struct dpif_netlink *dpif)
     uint32_t n_handlers;
     uint32_t *upcall_pids;

-    n_handlers = count_cpu_cores();
+    n_handlers = dpif_netlink_calculate_handlers_num();
     if (dpif->n_handlers != n_handlers) {
         VLOG_DBG("Dispatch mode(per-cpu): initializing %d handlers",
                  n_handlers);
```
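The first hunk's switch from count_cpu_cores() to count_total_cores() is the core of the fix: the kernel indexes the PID array by CPU id, so the array has to cover every core present in the system, not just the cores OVS may run on. The snippet below is only a rough Linux analogy for that distinction (it is not the OVS implementation, whose helpers may have different semantics): the process's affinity mask versus the number of configured processors.

```c
/* Rough analogy: "cores this process may be scheduled on" versus "cores
 * present in the system".  The per-CPU PID array handed to the kernel is
 * sized by the latter, because the kernel indexes it by CPU id. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long total = sysconf(_SC_NPROCESSORS_CONF);  /* all configured cores */
    long usable = -1;
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof set, &set) == 0) {
        usable = CPU_COUNT(&set);                /* cores in our affinity mask */
    }

    printf("usable cores: %ld, total cores: %ld\n", usable, total);
    return 0;
}
```

On the commit message's example machine, the two counts would be 6 and 12 respectively, which is exactly the mismatch that previously left the PID array too small for the kernel's per-CPU lookup.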