
[ovs-dev,v2] netdev-dpdk: Set pmd thread priority

Message ID 1467749157-71520-1-git-send-email-bhanuprakash.bodireddy@intel.com
State Changes Requested
Delegated to: Daniele Di Proietto
Headers show

Commit Message

Bodireddy, Bhanuprakash July 5, 2016, 8:05 p.m. UTC
Set the DPDK pmd thread scheduling policy to SCHED_RR and its static
priority to the highest priority value of the policy. This deals with
the pmd thread starvation case where another CPU-hogging process gets
scheduled/affinitized onto the same core the pmd thread is running on,
thereby significantly impacting the datapath performance.

Setting the realtime scheduling policy for the pmd threads is one step
towards Fastpath Service Assurance in OVS DPDK.

The realtime scheduling policy is applied only when a CPU mask is passed
to 'pmd-cpu-mask'. The exception to this is 'pmd-cpu-mask=1', where the
policy and priority are not applied to the pmd thread spawned on core 0.
For example:

    * In the absence of pmd-cpu-mask, or if pmd-cpu-mask=1, one pmd
      thread shall be created and affinitized to core 0 with the default
      scheduling policy and priority applied.

    * If pmd-cpu-mask is specified with a CPU mask > 1, one or more pmd
      threads shall be spawned on the corresponding core(s) in the mask,
      and the realtime scheduling policy SCHED_RR with the highest static
      priority is applied to the pmd thread(s).

To reproduce the issue, use the following commands:

ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6
taskset 0x2 cat /dev/zero > /dev/null &

Also, OVS control threads should not be affinitized to the pmd cores;
for example, 'dpdk-lcore-mask' and 'pmd-cpu-mask' should be exclusive.

v1->v2:
* Removed #ifdef and introduced a dummy function "pmd_thread_setpriority"
  in netdev-dpdk.h
* Rebase

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
---
 lib/dpif-netdev.c |  8 ++++++++
 lib/netdev-dpdk.c | 14 ++++++++++++++
 lib/netdev-dpdk.h |  7 +++++++
 3 files changed, 29 insertions(+)

Comments

Daniele Di Proietto July 15, 2016, 1:18 a.m. UTC | #1
Thanks for the patch.

Is there any reason why core 0 is treated specially?

I think we should put pmd_thread_setpriority in lib/ovs-numa.c (adding
an ovs_numa prefix), and do nothing if dummy_numa is false.  Or perhaps
integrate the pthread_setschedparam call into
ovs_numa_thread_setaffinity_core().
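
A minimal sketch of what that could look like, purely illustrative: the
ovs_numa_ name is an assumption, and the dummy_numa handling is left out:

    /* lib/ovs-numa.c (hypothetical placement; the body below is lifted
     * from this patch's pmd_thread_setpriority()). */
    void
    ovs_numa_thread_setpriority(int policy)
    {
        struct sched_param threadparam;
        int err;

        memset(&threadparam, 0, sizeof(threadparam));
        threadparam.sched_priority = sched_get_priority_max(policy);
        err = pthread_setschedparam(pthread_self(), policy, &threadparam);
        if (err) {
            VLOG_WARN("Failed to set thread priority: %s", ovs_strerror(err));
        }
    }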

I've noticed that processes with the same affinity as a PMD thread will
become totally unresponsive after this patch.  Is this expected? Will this
have a negative impact on the overall stability of the system?

Bodireddy, Bhanuprakash July 15, 2016, 2:52 p.m. UTC | #2
>Thanks for the patch.

Hello Daniele,
Thanks for looking into this patch.

>Is there any reason why core 0 is treated specially?

It's very uncommon to see core 0 isolated or HPC threads pinned to core 0.
On multicore systems, IRQs are often explicitly pinned to core 0 to improve
application performance and keep interrupts off the remaining cores. In a
few more cases, core 0 is treated as a management/control core that is used
to launch applications on the other cores. For these reasons I treat core 0
as special.

>I think we should put pmd_thread_setpriority in lib/ovs-numa.c (adding
>an ovs_numa prefix), and do nothing if dummy_numa is false.

Agree.

>Or perhaps integrate the pthread_setschedparam call into
>ovs_numa_thread_setaffinity_core().

>I've noticed that processes with the same affinity as a PMD thread will
>become totally unresponsive after this patch.  Is this expected? Will this
>have a negative impact on the overall stability of the system?

There are two sides to this problem.
(i) Out-of-the-box deployment (not specifying dpdk-lcore-mask, pmd-cpu-mask):
      As it is now, when OVS DPDK is run out of the box, one pmd thread is created and pinned to core 0. In this case the pmd thread runs with the default scheduling policy and priority, with no impact on the stability of the system.

(ii) High performance deployment with SA (explicitly specifying dpdk-lcore-mask, pmd-cpu-mask):
      In this case the user wants optimum fastpath performance + SA and explicitly pins the control thread and pmd threads to cores. Only in this case is the realtime scheduling policy applied to the pmd threads, as any disruption to the threads would impact the fastpath performance.

I have come across cases in multi-VM deployments with HT enabled where, due to wrong pinning of QEMU threads to the pmd cores, pmd thread starvation was observed, eventually destabilizing the system.

Regards,
Bhanuprakash.

Daniele Di Proietto July 15, 2016, 5:58 p.m. UTC | #3
Thanks for the explanation.

I still think it's weird to hardcode an exception for core 0.

If no pmd-cpu-mask is specified, other cores might be used, depending on
the NUMA affinity.

Perhaps we can call set_priority only if pmd-cpu-mask is specified?  That
seems more consistent.
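
A rough sketch of that idea; the condition below is assumed for
illustration, and the real check would depend on how dpif-netdev stores
the configured mask:

    /* In pmd_thread_main(): apply the RT policy only when the user
     * explicitly set pmd-cpu-mask.  'pmd->dp->pmd_cmask' is a guess,
     * not code from this series. */
    if (pmd->dp->pmd_cmask) {
        pmd_thread_setpriority(SCHED_RR);
    }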

Thanks,

Daniele

Bodireddy, Bhanuprakash July 15, 2016, 6:54 p.m. UTC | #4
>Thanks for the explanation.
>
>I still think it's weird to hardcode an exception for core 0.
>
>If no pmd-cpu-mask is specified, other cores might be used, depending on
>the NUMA affinity.
>Perhaps we can call set_priority only if pmd-cpu-mask is specified?  That
>seems more consistent.

I agree with this and it looks good to me. I will send out a v3 as discussed
here.

Regards,
Bhanu Prakash.


Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 37c2631..6ff81d6 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2849,6 +2849,14 @@  pmd_thread_main(void *f_)
     ovs_numa_thread_setaffinity_core(pmd->core_id);
     dpdk_set_lcore_id(pmd->core_id);
     poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list);
+
+    /* Set the pmd thread's scheduling policy to SCHED_RR and its priority
+     * to the highest priority of the SCHED_RR policy.  In the absence of
+     * pmd-cpu-mask, or with pmd-cpu-mask=1, the default scheduling policy
+     * and priority apply to the pmd thread. */
+    if (pmd->core_id) {
+        pmd_thread_setpriority(SCHED_RR);
+    }
 reload:
     emc_cache_init(&pmd->flow_cache);
 
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 02e2c58..ce1683b 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -3541,3 +3541,17 @@  dpdk_thread_is_pmd(void)
 {
     return rte_lcore_id() != NON_PMD_CORE_ID;
 }
+
+void
+pmd_thread_setpriority(int policy)
+{
+    struct sched_param threadparam;
+    int err;
+
+    memset(&threadparam, 0, sizeof(threadparam));
+    threadparam.sched_priority = sched_get_priority_max(policy);
+    err = pthread_setschedparam(pthread_self(), policy, &threadparam);
+    if (err) {
+        VLOG_WARN("Thread priority error %d", err);
+    }
+}
diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index 80bb834..1890ae4 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -26,6 +26,7 @@  struct smap;
 void netdev_dpdk_register(void);
 void free_dpdk_buf(struct dp_packet *);
 void dpdk_set_lcore_id(unsigned cpu);
+void pmd_thread_setpriority(int policy);
 
 #else
 
@@ -51,6 +52,12 @@  dpdk_set_lcore_id(unsigned cpu OVS_UNUSED)
     /* Nothing */
 }
 
+static inline void
+pmd_thread_setpriority(int policy OVS_UNUSED)
+{
+    /* Nothing */
+}
+
 #endif /* DPDK_NETDEV */
 
 void dpdk_init(const struct smap *ovs_other_config);