Message ID | 1566492810-32212-1-git-send-email-nitin.katiyar@ericsson.com |
---|---|
State | Accepted |
Delegated to: | Ilya Maximets |
Headers | show |
Series | [ovs-dev,V2] dpif-netdev: Do RCU synchronization at fixed interval in PMD main loop. | expand |
Hi, Please provide your feedback. Regards, Nitin > -----Original Message----- > From: Nitin Katiyar > Sent: Thursday, August 22, 2019 10:24 PM > To: ovs-dev@openvswitch.org > Cc: Nitin Katiyar <nitin.katiyar@ericsson.com>; Anju Thomas > <anju.thomas@ericsson.com> > Subject: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed interval in > PMD main loop. > > Each PMD updates the global sequence number for RCU synchronization > purpose with other OVS threads. This is done at every 1025th iteration in > PMD main loop. > > If the PMD thread is responsible for polling large number of queues that are > carrying traffic, it spends a lot of time processing packets and this results in > significant delay in performing the housekeeping activities. > > If the OVS main thread is waiting to synchronize with the PMD threads and if > those threads delay performing housekeeping activities for more than 3 sec > then LACP processing will be impacted and it will lead to LACP flaps. Similarly, > other controls protocols run by OVS main thread are impacted. > > For e.g. a PMD thread polling 200 ports/queues with average of 1600 > processing cycles per packet with batch size of 32 may take 10240000 > (200 * 1600 * 32) cycles per iteration. In system with 2.0 GHz CPU it means > more than 5 ms per iteration. So, for 1024 iterations to complete it would be > more than 5 seconds. > > This gets worse when there are PMD threads which are less loaded. > It reduces possibility of getting mutex lock in ovsrcu_try_quiesce() by heavily > loaded PMD and next attempt to quiesce would be after 1024 iterations. > > With this patch, PMD RCU synchronization will be performed after fixed > interval instead after a fixed number of iterations. This will ensure that even if > the packet processing load is high the RCU synchronization will not be delayed > long. > > Co-authored-by: Anju Thomas <anju.thomas@ericsson.com> > Signed-off-by: Anju Thomas <anju.thomas@ericsson.com> > Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com> > --- > lib/dpif-netdev.c | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index d0a1c58..63b6cb9 > 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -228,6 +228,9 @@ struct dfc_cache { > * and used during rxq to pmd assignment. */ #define > PMD_RXQ_INTERVAL_MAX 6 > > +/* Time in microseconds to try RCU quiescing. */ #define > +PMD_RCU_QUIESCE_INTERVAL 10000LL > + > struct dpcls { > struct cmap_node node; /* Within dp_netdev_pmd_thread.classifiers > */ > odp_port_t in_port; > @@ -751,6 +754,9 @@ struct dp_netdev_pmd_thread { > > /* Set to true if the pmd thread needs to be reloaded. */ > bool need_reload; > + > + /* Next time when PMD should try RCU quiescing. */ > + long long next_rcu_quiesce; > }; > > /* Interface to netdev-based datapath. */ @@ -5486,6 +5492,8 @@ reload: > pmd->intrvl_tsc_prev = 0; > atomic_store_relaxed(&pmd->intrvl_cycles, 0); > cycles_counter_update(s); > + > + pmd->next_rcu_quiesce = pmd->ctx.now + > PMD_RCU_QUIESCE_INTERVAL; > /* Protect pmd stats from external clearing while polling. */ > ovs_mutex_lock(&pmd->perf_stats.stats_mutex); > for (;;) { > @@ -5493,6 +5501,17 @@ reload: > > pmd_perf_start_iteration(s); > > + /* Do RCU synchronization at fixed interval instead of doing it > + * at fixed number of iterations. This ensures that synchronization > + * would not be delayed long even at high load of packet > + * processing. */ > + if (pmd->ctx.now > pmd->next_rcu_quiesce) { > + if (!ovsrcu_try_quiesce()) { > + pmd->next_rcu_quiesce = > + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; > + } > + } > + > for (i = 0; i < poll_cnt; i++) { > > if (!poll_list[i].rxq_enabled) { @@ -5527,6 +5546,8 @@ reload: > dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt); > if (!ovsrcu_try_quiesce()) { > emc_cache_slow_sweep(&((pmd->flow_cache).emc_cache)); > + pmd->next_rcu_quiesce = > + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; > } > > for (i = 0; i < poll_cnt; i++) { @@ -5986,6 +6007,7 @@ > dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct > dp_netdev *dp, > pmd->ctx.last_rxq = NULL; > pmd_thread_ctx_time_update(pmd); > pmd->next_optimization = pmd->ctx.now + > DPCLS_OPTIMIZATION_INTERVAL; > + pmd->next_rcu_quiesce = pmd->ctx.now + > PMD_RCU_QUIESCE_INTERVAL; > pmd->rxq_next_cycle_store = pmd->ctx.now + PMD_RXQ_INTERVAL_LEN; > hmap_init(&pmd->poll_list); > hmap_init(&pmd->tx_ports); > -- > 1.9.1
Hi, Can you please review the following patch and provide the feedback? Regards, Nitin > -----Original Message----- > From: Nitin Katiyar > Sent: Wednesday, August 28, 2019 2:00 PM > To: ovs-dev@openvswitch.org > Cc: Anju Thomas <anju.thomas@ericsson.com>; Ilya Maximets > <i.maximets@samsung.com> > Subject: RE: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed interval > in PMD main loop. > > Hi, > Please provide your feedback. > > Regards, > Nitin > > > -----Original Message----- > > From: Nitin Katiyar > > Sent: Thursday, August 22, 2019 10:24 PM > > To: ovs-dev@openvswitch.org > > Cc: Nitin Katiyar <nitin.katiyar@ericsson.com>; Anju Thomas > > <anju.thomas@ericsson.com> > > Subject: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed > > interval in PMD main loop. > > > > Each PMD updates the global sequence number for RCU synchronization > > purpose with other OVS threads. This is done at every 1025th iteration > > in PMD main loop. > > > > If the PMD thread is responsible for polling large number of queues > > that are carrying traffic, it spends a lot of time processing packets > > and this results in significant delay in performing the housekeeping > activities. > > > > If the OVS main thread is waiting to synchronize with the PMD threads > > and if those threads delay performing housekeeping activities for more > > than 3 sec then LACP processing will be impacted and it will lead to > > LACP flaps. Similarly, other controls protocols run by OVS main thread are > impacted. > > > > For e.g. a PMD thread polling 200 ports/queues with average of 1600 > > processing cycles per packet with batch size of 32 may take 10240000 > > (200 * 1600 * 32) cycles per iteration. In system with 2.0 GHz CPU it > > means more than 5 ms per iteration. So, for 1024 iterations to > > complete it would be more than 5 seconds. > > > > This gets worse when there are PMD threads which are less loaded. > > It reduces possibility of getting mutex lock in ovsrcu_try_quiesce() > > by heavily loaded PMD and next attempt to quiesce would be after 1024 > iterations. > > > > With this patch, PMD RCU synchronization will be performed after fixed > > interval instead after a fixed number of iterations. This will ensure > > that even if the packet processing load is high the RCU > > synchronization will not be delayed long. > > > > Co-authored-by: Anju Thomas <anju.thomas@ericsson.com> > > Signed-off-by: Anju Thomas <anju.thomas@ericsson.com> > > Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com> > > --- > > lib/dpif-netdev.c | 22 ++++++++++++++++++++++ > > 1 file changed, 22 insertions(+) > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index > > d0a1c58..63b6cb9 > > 100644 > > --- a/lib/dpif-netdev.c > > +++ b/lib/dpif-netdev.c > > @@ -228,6 +228,9 @@ struct dfc_cache { > > * and used during rxq to pmd assignment. */ #define > > PMD_RXQ_INTERVAL_MAX 6 > > > > +/* Time in microseconds to try RCU quiescing. */ #define > > +PMD_RCU_QUIESCE_INTERVAL 10000LL > > + > > struct dpcls { > > struct cmap_node node; /* Within dp_netdev_pmd_thread.classifiers > > */ > > odp_port_t in_port; > > @@ -751,6 +754,9 @@ struct dp_netdev_pmd_thread { > > > > /* Set to true if the pmd thread needs to be reloaded. */ > > bool need_reload; > > + > > + /* Next time when PMD should try RCU quiescing. */ > > + long long next_rcu_quiesce; > > }; > > > > /* Interface to netdev-based datapath. */ @@ -5486,6 +5492,8 @@ > reload: > > pmd->intrvl_tsc_prev = 0; > > atomic_store_relaxed(&pmd->intrvl_cycles, 0); > > cycles_counter_update(s); > > + > > + pmd->next_rcu_quiesce = pmd->ctx.now + > > PMD_RCU_QUIESCE_INTERVAL; > > /* Protect pmd stats from external clearing while polling. */ > > ovs_mutex_lock(&pmd->perf_stats.stats_mutex); > > for (;;) { > > @@ -5493,6 +5501,17 @@ reload: > > > > pmd_perf_start_iteration(s); > > > > + /* Do RCU synchronization at fixed interval instead of doing it > > + * at fixed number of iterations. This ensures that synchronization > > + * would not be delayed long even at high load of packet > > + * processing. */ > > + if (pmd->ctx.now > pmd->next_rcu_quiesce) { > > + if (!ovsrcu_try_quiesce()) { > > + pmd->next_rcu_quiesce = > > + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; > > + } > > + } > > + > > for (i = 0; i < poll_cnt; i++) { > > > > if (!poll_list[i].rxq_enabled) { @@ -5527,6 +5546,8 @@ reload: > > dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt); > > if (!ovsrcu_try_quiesce()) { > > emc_cache_slow_sweep(&((pmd->flow_cache).emc_cache)); > > + pmd->next_rcu_quiesce = > > + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; > > } > > > > for (i = 0; i < poll_cnt; i++) { @@ -5986,6 +6007,7 @@ > > dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct > > dp_netdev *dp, > > pmd->ctx.last_rxq = NULL; > > pmd_thread_ctx_time_update(pmd); > > pmd->next_optimization = pmd->ctx.now + > > DPCLS_OPTIMIZATION_INTERVAL; > > + pmd->next_rcu_quiesce = pmd->ctx.now + > > PMD_RCU_QUIESCE_INTERVAL; > > pmd->rxq_next_cycle_store = pmd->ctx.now + > PMD_RXQ_INTERVAL_LEN; > > hmap_init(&pmd->poll_list); > > hmap_init(&pmd->tx_ports); > > -- > > 1.9.1
Regarding my own part, usually I do not review dpif-netdev patches. On Fri, Oct 04, 2019 at 05:35:58AM +0000, Nitin Katiyar wrote: > Hi, > Can you please review the following patch and provide the feedback? > > Regards, > Nitin > > > -----Original Message----- > > From: Nitin Katiyar > > Sent: Wednesday, August 28, 2019 2:00 PM > > To: ovs-dev@openvswitch.org > > Cc: Anju Thomas <anju.thomas@ericsson.com>; Ilya Maximets > > <i.maximets@samsung.com> > > Subject: RE: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed interval > > in PMD main loop. > > > > Hi, > > Please provide your feedback. > > > > Regards, > > Nitin > > > > > -----Original Message----- > > > From: Nitin Katiyar > > > Sent: Thursday, August 22, 2019 10:24 PM > > > To: ovs-dev@openvswitch.org > > > Cc: Nitin Katiyar <nitin.katiyar@ericsson.com>; Anju Thomas > > > <anju.thomas@ericsson.com> > > > Subject: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed > > > interval in PMD main loop. > > > > > > Each PMD updates the global sequence number for RCU synchronization > > > purpose with other OVS threads. This is done at every 1025th iteration > > > in PMD main loop. > > > > > > If the PMD thread is responsible for polling large number of queues > > > that are carrying traffic, it spends a lot of time processing packets > > > and this results in significant delay in performing the housekeeping > > activities. > > > > > > If the OVS main thread is waiting to synchronize with the PMD threads > > > and if those threads delay performing housekeeping activities for more > > > than 3 sec then LACP processing will be impacted and it will lead to > > > LACP flaps. Similarly, other controls protocols run by OVS main thread are > > impacted. > > > > > > For e.g. a PMD thread polling 200 ports/queues with average of 1600 > > > processing cycles per packet with batch size of 32 may take 10240000 > > > (200 * 1600 * 32) cycles per iteration. In system with 2.0 GHz CPU it > > > means more than 5 ms per iteration. So, for 1024 iterations to > > > complete it would be more than 5 seconds. > > > > > > This gets worse when there are PMD threads which are less loaded. > > > It reduces possibility of getting mutex lock in ovsrcu_try_quiesce() > > > by heavily loaded PMD and next attempt to quiesce would be after 1024 > > iterations. > > > > > > With this patch, PMD RCU synchronization will be performed after fixed > > > interval instead after a fixed number of iterations. This will ensure > > > that even if the packet processing load is high the RCU > > > synchronization will not be delayed long. > > > > > > Co-authored-by: Anju Thomas <anju.thomas@ericsson.com> > > > Signed-off-by: Anju Thomas <anju.thomas@ericsson.com> > > > Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com> > > > --- > > > lib/dpif-netdev.c | 22 ++++++++++++++++++++++ > > > 1 file changed, 22 insertions(+) > > > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index > > > d0a1c58..63b6cb9 > > > 100644 > > > --- a/lib/dpif-netdev.c > > > +++ b/lib/dpif-netdev.c > > > @@ -228,6 +228,9 @@ struct dfc_cache { > > > * and used during rxq to pmd assignment. */ #define > > > PMD_RXQ_INTERVAL_MAX 6 > > > > > > +/* Time in microseconds to try RCU quiescing. */ #define > > > +PMD_RCU_QUIESCE_INTERVAL 10000LL > > > + > > > struct dpcls { > > > struct cmap_node node; /* Within dp_netdev_pmd_thread.classifiers > > > */ > > > odp_port_t in_port; > > > @@ -751,6 +754,9 @@ struct dp_netdev_pmd_thread { > > > > > > /* Set to true if the pmd thread needs to be reloaded. */ > > > bool need_reload; > > > + > > > + /* Next time when PMD should try RCU quiescing. */ > > > + long long next_rcu_quiesce; > > > }; > > > > > > /* Interface to netdev-based datapath. */ @@ -5486,6 +5492,8 @@ > > reload: > > > pmd->intrvl_tsc_prev = 0; > > > atomic_store_relaxed(&pmd->intrvl_cycles, 0); > > > cycles_counter_update(s); > > > + > > > + pmd->next_rcu_quiesce = pmd->ctx.now + > > > PMD_RCU_QUIESCE_INTERVAL; > > > /* Protect pmd stats from external clearing while polling. */ > > > ovs_mutex_lock(&pmd->perf_stats.stats_mutex); > > > for (;;) { > > > @@ -5493,6 +5501,17 @@ reload: > > > > > > pmd_perf_start_iteration(s); > > > > > > + /* Do RCU synchronization at fixed interval instead of doing it > > > + * at fixed number of iterations. This ensures that synchronization > > > + * would not be delayed long even at high load of packet > > > + * processing. */ > > > + if (pmd->ctx.now > pmd->next_rcu_quiesce) { > > > + if (!ovsrcu_try_quiesce()) { > > > + pmd->next_rcu_quiesce = > > > + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; > > > + } > > > + } > > > + > > > for (i = 0; i < poll_cnt; i++) { > > > > > > if (!poll_list[i].rxq_enabled) { @@ -5527,6 +5546,8 @@ reload: > > > dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt); > > > if (!ovsrcu_try_quiesce()) { > > > emc_cache_slow_sweep(&((pmd->flow_cache).emc_cache)); > > > + pmd->next_rcu_quiesce = > > > + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; > > > } > > > > > > for (i = 0; i < poll_cnt; i++) { @@ -5986,6 +6007,7 @@ > > > dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct > > > dp_netdev *dp, > > > pmd->ctx.last_rxq = NULL; > > > pmd_thread_ctx_time_update(pmd); > > > pmd->next_optimization = pmd->ctx.now + > > > DPCLS_OPTIMIZATION_INTERVAL; > > > + pmd->next_rcu_quiesce = pmd->ctx.now + > > > PMD_RCU_QUIESCE_INTERVAL; > > > pmd->rxq_next_cycle_store = pmd->ctx.now + > > PMD_RXQ_INTERVAL_LEN; > > > hmap_init(&pmd->poll_list); > > > hmap_init(&pmd->tx_ports); > > > -- > > > 1.9.1 >
Hi Guys, Can you please review the patch? It has been in mailing list for long time. Regards, Nitin > -----Original Message----- > From: Nitin Katiyar > Sent: Friday, October 04, 2019 11:06 AM > To: 'ovs-dev@openvswitch.org' <ovs-dev@openvswitch.org> > Cc: Anju Thomas <anju.thomas@ericsson.com>; 'Ilya Maximets' > <i.maximets@samsung.com>; Ben Pfaff <blp@ovn.org> > Subject: RE: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed interval > in PMD main loop. > > Hi, > Can you please review the following patch and provide the feedback? > > Regards, > Nitin > > > -----Original Message----- > > From: Nitin Katiyar > > Sent: Wednesday, August 28, 2019 2:00 PM > > To: ovs-dev@openvswitch.org > > Cc: Anju Thomas <anju.thomas@ericsson.com>; Ilya Maximets > > <i.maximets@samsung.com> > > Subject: RE: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed > > interval in PMD main loop. > > > > Hi, > > Please provide your feedback. > > > > Regards, > > Nitin > > > > > -----Original Message----- > > > From: Nitin Katiyar > > > Sent: Thursday, August 22, 2019 10:24 PM > > > To: ovs-dev@openvswitch.org > > > Cc: Nitin Katiyar <nitin.katiyar@ericsson.com>; Anju Thomas > > > <anju.thomas@ericsson.com> > > > Subject: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed > > > interval in PMD main loop. > > > > > > Each PMD updates the global sequence number for RCU synchronization > > > purpose with other OVS threads. This is done at every 1025th > > > iteration in PMD main loop. > > > > > > If the PMD thread is responsible for polling large number of queues > > > that are carrying traffic, it spends a lot of time processing > > > packets and this results in significant delay in performing the > > > housekeeping > > activities. > > > > > > If the OVS main thread is waiting to synchronize with the PMD > > > threads and if those threads delay performing housekeeping > > > activities for more than 3 sec then LACP processing will be impacted > > > and it will lead to LACP flaps. Similarly, other controls protocols > > > run by OVS main thread are > > impacted. > > > > > > For e.g. a PMD thread polling 200 ports/queues with average of 1600 > > > processing cycles per packet with batch size of 32 may take 10240000 > > > (200 * 1600 * 32) cycles per iteration. In system with 2.0 GHz CPU > > > it means more than 5 ms per iteration. So, for 1024 iterations to > > > complete it would be more than 5 seconds. > > > > > > This gets worse when there are PMD threads which are less loaded. > > > It reduces possibility of getting mutex lock in ovsrcu_try_quiesce() > > > by heavily loaded PMD and next attempt to quiesce would be after > > > 1024 > > iterations. > > > > > > With this patch, PMD RCU synchronization will be performed after > > > fixed interval instead after a fixed number of iterations. This will > > > ensure that even if the packet processing load is high the RCU > > > synchronization will not be delayed long. > > > > > > Co-authored-by: Anju Thomas <anju.thomas@ericsson.com> > > > Signed-off-by: Anju Thomas <anju.thomas@ericsson.com> > > > Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com> > > > --- > > > lib/dpif-netdev.c | 22 ++++++++++++++++++++++ > > > 1 file changed, 22 insertions(+) > > > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index > > > d0a1c58..63b6cb9 > > > 100644 > > > --- a/lib/dpif-netdev.c > > > +++ b/lib/dpif-netdev.c > > > @@ -228,6 +228,9 @@ struct dfc_cache { > > > * and used during rxq to pmd assignment. */ #define > > > PMD_RXQ_INTERVAL_MAX 6 > > > > > > +/* Time in microseconds to try RCU quiescing. */ #define > > > +PMD_RCU_QUIESCE_INTERVAL 10000LL > > > + > > > struct dpcls { > > > struct cmap_node node; /* Within > dp_netdev_pmd_thread.classifiers > > > */ > > > odp_port_t in_port; > > > @@ -751,6 +754,9 @@ struct dp_netdev_pmd_thread { > > > > > > /* Set to true if the pmd thread needs to be reloaded. */ > > > bool need_reload; > > > + > > > + /* Next time when PMD should try RCU quiescing. */ > > > + long long next_rcu_quiesce; > > > }; > > > > > > /* Interface to netdev-based datapath. */ @@ -5486,6 +5492,8 @@ > > reload: > > > pmd->intrvl_tsc_prev = 0; > > > atomic_store_relaxed(&pmd->intrvl_cycles, 0); > > > cycles_counter_update(s); > > > + > > > + pmd->next_rcu_quiesce = pmd->ctx.now + > > > PMD_RCU_QUIESCE_INTERVAL; > > > /* Protect pmd stats from external clearing while polling. */ > > > ovs_mutex_lock(&pmd->perf_stats.stats_mutex); > > > for (;;) { > > > @@ -5493,6 +5501,17 @@ reload: > > > > > > pmd_perf_start_iteration(s); > > > > > > + /* Do RCU synchronization at fixed interval instead of doing it > > > + * at fixed number of iterations. This ensures that synchronization > > > + * would not be delayed long even at high load of packet > > > + * processing. */ > > > + if (pmd->ctx.now > pmd->next_rcu_quiesce) { > > > + if (!ovsrcu_try_quiesce()) { > > > + pmd->next_rcu_quiesce = > > > + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; > > > + } > > > + } > > > + > > > for (i = 0; i < poll_cnt; i++) { > > > > > > if (!poll_list[i].rxq_enabled) { @@ -5527,6 +5546,8 @@ reload: > > > dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt); > > > if (!ovsrcu_try_quiesce()) { > > > > > > emc_cache_slow_sweep(&((pmd->flow_cache).emc_cache)); > > > + pmd->next_rcu_quiesce = > > > + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; > > > } > > > > > > for (i = 0; i < poll_cnt; i++) { @@ -5986,6 +6007,7 @@ > > > dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct > > > dp_netdev *dp, > > > pmd->ctx.last_rxq = NULL; > > > pmd_thread_ctx_time_update(pmd); > > > pmd->next_optimization = pmd->ctx.now + > > > DPCLS_OPTIMIZATION_INTERVAL; > > > + pmd->next_rcu_quiesce = pmd->ctx.now + > > > PMD_RCU_QUIESCE_INTERVAL; > > > pmd->rxq_next_cycle_store = pmd->ctx.now + > > PMD_RXQ_INTERVAL_LEN; > > > hmap_init(&pmd->poll_list); > > > hmap_init(&pmd->tx_ports); > > > -- > > > 1.9.1
On 11/12/19 7:38 AM, Nitin Katiyar wrote: > Hi Guys, > Can you please review the patch? It has been in mailing list for long time. Hi. Really sorry for so long delay. I lost track of this change somehow during my relocation and job changes. The patch looks good to me in general. I'll take another look and check if it still applicable. And it might be merged, I think, for the current release cycle. Best regards, Ilya Maximets. > > Regards, > Nitin > >> -----Original Message----- >> From: Nitin Katiyar >> Sent: Friday, October 04, 2019 11:06 AM >> To: 'ovs-dev@openvswitch.org' <ovs-dev@openvswitch.org> >> Cc: Anju Thomas <anju.thomas@ericsson.com>; 'Ilya Maximets' >> <i.maximets@samsung.com>; Ben Pfaff <blp@ovn.org> >> Subject: RE: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed interval >> in PMD main loop. >> >> Hi, >> Can you please review the following patch and provide the feedback? >> >> Regards, >> Nitin >> >>> -----Original Message----- >>> From: Nitin Katiyar >>> Sent: Wednesday, August 28, 2019 2:00 PM >>> To: ovs-dev@openvswitch.org >>> Cc: Anju Thomas <anju.thomas@ericsson.com>; Ilya Maximets >>> <i.maximets@samsung.com> >>> Subject: RE: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed >>> interval in PMD main loop. >>> >>> Hi, >>> Please provide your feedback. >>> >>> Regards, >>> Nitin >>> >>>> -----Original Message----- >>>> From: Nitin Katiyar >>>> Sent: Thursday, August 22, 2019 10:24 PM >>>> To: ovs-dev@openvswitch.org >>>> Cc: Nitin Katiyar <nitin.katiyar@ericsson.com>; Anju Thomas >>>> <anju.thomas@ericsson.com> >>>> Subject: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed >>>> interval in PMD main loop. >>>> >>>> Each PMD updates the global sequence number for RCU synchronization >>>> purpose with other OVS threads. This is done at every 1025th >>>> iteration in PMD main loop. >>>> >>>> If the PMD thread is responsible for polling large number of queues >>>> that are carrying traffic, it spends a lot of time processing >>>> packets and this results in significant delay in performing the >>>> housekeeping >>> activities. >>>> >>>> If the OVS main thread is waiting to synchronize with the PMD >>>> threads and if those threads delay performing housekeeping >>>> activities for more than 3 sec then LACP processing will be impacted >>>> and it will lead to LACP flaps. Similarly, other controls protocols >>>> run by OVS main thread are >>> impacted. >>>> >>>> For e.g. a PMD thread polling 200 ports/queues with average of 1600 >>>> processing cycles per packet with batch size of 32 may take 10240000 >>>> (200 * 1600 * 32) cycles per iteration. In system with 2.0 GHz CPU >>>> it means more than 5 ms per iteration. So, for 1024 iterations to >>>> complete it would be more than 5 seconds. >>>> >>>> This gets worse when there are PMD threads which are less loaded. >>>> It reduces possibility of getting mutex lock in ovsrcu_try_quiesce() >>>> by heavily loaded PMD and next attempt to quiesce would be after >>>> 1024 >>> iterations. >>>> >>>> With this patch, PMD RCU synchronization will be performed after >>>> fixed interval instead after a fixed number of iterations. This will >>>> ensure that even if the packet processing load is high the RCU >>>> synchronization will not be delayed long. >>>> >>>> Co-authored-by: Anju Thomas <anju.thomas@ericsson.com> >>>> Signed-off-by: Anju Thomas <anju.thomas@ericsson.com> >>>> Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com> >>>> --- >>>> lib/dpif-netdev.c | 22 ++++++++++++++++++++++ >>>> 1 file changed, 22 insertions(+) >>>> >>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index >>>> d0a1c58..63b6cb9 >>>> 100644 >>>> --- a/lib/dpif-netdev.c >>>> +++ b/lib/dpif-netdev.c >>>> @@ -228,6 +228,9 @@ struct dfc_cache { >>>> * and used during rxq to pmd assignment. */ #define >>>> PMD_RXQ_INTERVAL_MAX 6 >>>> >>>> +/* Time in microseconds to try RCU quiescing. */ #define >>>> +PMD_RCU_QUIESCE_INTERVAL 10000LL >>>> + >>>> struct dpcls { >>>> struct cmap_node node; /* Within >> dp_netdev_pmd_thread.classifiers >>>> */ >>>> odp_port_t in_port; >>>> @@ -751,6 +754,9 @@ struct dp_netdev_pmd_thread { >>>> >>>> /* Set to true if the pmd thread needs to be reloaded. */ >>>> bool need_reload; >>>> + >>>> + /* Next time when PMD should try RCU quiescing. */ >>>> + long long next_rcu_quiesce; >>>> }; >>>> >>>> /* Interface to netdev-based datapath. */ @@ -5486,6 +5492,8 @@ >>> reload: >>>> pmd->intrvl_tsc_prev = 0; >>>> atomic_store_relaxed(&pmd->intrvl_cycles, 0); >>>> cycles_counter_update(s); >>>> + >>>> + pmd->next_rcu_quiesce = pmd->ctx.now + >>>> PMD_RCU_QUIESCE_INTERVAL; >>>> /* Protect pmd stats from external clearing while polling. */ >>>> ovs_mutex_lock(&pmd->perf_stats.stats_mutex); >>>> for (;;) { >>>> @@ -5493,6 +5501,17 @@ reload: >>>> >>>> pmd_perf_start_iteration(s); >>>> >>>> + /* Do RCU synchronization at fixed interval instead of doing it >>>> + * at fixed number of iterations. This ensures that synchronization >>>> + * would not be delayed long even at high load of packet >>>> + * processing. */ >>>> + if (pmd->ctx.now > pmd->next_rcu_quiesce) { >>>> + if (!ovsrcu_try_quiesce()) { >>>> + pmd->next_rcu_quiesce = >>>> + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; >>>> + } >>>> + } >>>> + >>>> for (i = 0; i < poll_cnt; i++) { >>>> >>>> if (!poll_list[i].rxq_enabled) { @@ -5527,6 +5546,8 @@ reload: >>>> dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt); >>>> if (!ovsrcu_try_quiesce()) { >>>> >>>> emc_cache_slow_sweep(&((pmd->flow_cache).emc_cache)); >>>> + pmd->next_rcu_quiesce = >>>> + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; >>>> } >>>> >>>> for (i = 0; i < poll_cnt; i++) { @@ -5986,6 +6007,7 @@ >>>> dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct >>>> dp_netdev *dp, >>>> pmd->ctx.last_rxq = NULL; >>>> pmd_thread_ctx_time_update(pmd); >>>> pmd->next_optimization = pmd->ctx.now + >>>> DPCLS_OPTIMIZATION_INTERVAL; >>>> + pmd->next_rcu_quiesce = pmd->ctx.now + >>>> PMD_RCU_QUIESCE_INTERVAL; >>>> pmd->rxq_next_cycle_store = pmd->ctx.now + >>> PMD_RXQ_INTERVAL_LEN; >>>> hmap_init(&pmd->poll_list); >>>> hmap_init(&pmd->tx_ports); >>>> -- >>>> 1.9.1 >
On 7/2/20 2:36 PM, Ilya Maximets wrote: > On 11/12/19 7:38 AM, Nitin Katiyar wrote: >> Hi Guys, >> Can you please review the patch? It has been in mailing list for long time. > > Hi. > > Really sorry for so long delay. I lost track of this change somehow during > my relocation and job changes. > > The patch looks good to me in general. I'll take another look and > check if it still applicable. And it might be merged, I think, for > the current release cycle. Thanks for the patch! I moved the time checking a bit lower to the place where time context is more accurate and applied to master. Best regards, Ilya Maximets.
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index d0a1c58..63b6cb9 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -228,6 +228,9 @@ struct dfc_cache { * and used during rxq to pmd assignment. */ #define PMD_RXQ_INTERVAL_MAX 6 +/* Time in microseconds to try RCU quiescing. */ +#define PMD_RCU_QUIESCE_INTERVAL 10000LL + struct dpcls { struct cmap_node node; /* Within dp_netdev_pmd_thread.classifiers */ odp_port_t in_port; @@ -751,6 +754,9 @@ struct dp_netdev_pmd_thread { /* Set to true if the pmd thread needs to be reloaded. */ bool need_reload; + + /* Next time when PMD should try RCU quiescing. */ + long long next_rcu_quiesce; }; /* Interface to netdev-based datapath. */ @@ -5486,6 +5492,8 @@ reload: pmd->intrvl_tsc_prev = 0; atomic_store_relaxed(&pmd->intrvl_cycles, 0); cycles_counter_update(s); + + pmd->next_rcu_quiesce = pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; /* Protect pmd stats from external clearing while polling. */ ovs_mutex_lock(&pmd->perf_stats.stats_mutex); for (;;) { @@ -5493,6 +5501,17 @@ reload: pmd_perf_start_iteration(s); + /* Do RCU synchronization at fixed interval instead of doing it + * at fixed number of iterations. This ensures that synchronization + * would not be delayed long even at high load of packet + * processing. */ + if (pmd->ctx.now > pmd->next_rcu_quiesce) { + if (!ovsrcu_try_quiesce()) { + pmd->next_rcu_quiesce = + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; + } + } + for (i = 0; i < poll_cnt; i++) { if (!poll_list[i].rxq_enabled) { @@ -5527,6 +5546,8 @@ reload: dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt); if (!ovsrcu_try_quiesce()) { emc_cache_slow_sweep(&((pmd->flow_cache).emc_cache)); + pmd->next_rcu_quiesce = + pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; } for (i = 0; i < poll_cnt; i++) { @@ -5986,6 +6007,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp, pmd->ctx.last_rxq = NULL; pmd_thread_ctx_time_update(pmd); pmd->next_optimization = pmd->ctx.now + DPCLS_OPTIMIZATION_INTERVAL; + pmd->next_rcu_quiesce = pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL; pmd->rxq_next_cycle_store = pmd->ctx.now + PMD_RXQ_INTERVAL_LEN; hmap_init(&pmd->poll_list); hmap_init(&pmd->tx_ports);