
[ovs-dev,v3] ovs-thread: Detect changes in number of cpus

Message ID 20221104220230.781507-1-amorenoz@redhat.com
State Changes Requested
Series [ovs-dev,v3] ovs-thread: Detect changes in number of cpus

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation fail test: fail

Commit Message

Adrian Moreno Nov. 4, 2022, 10:02 p.m. UTC
Currently, things like the number of handler and revalidator threads are
calculated based on the number of available CPUs. However, this number
is considered static and only calculated once, hence ignoring events
such as CPUs being hotplugged, switched on/off, or the affinity mask
changing.

On the other hand, checking the number of available CPUs multiple times
per second seems like overkill.
Affinity should not change that often and, even if it does, the impact
of destroying and recreating all the threads so often is probably too
high a price to pay.

This patch makes the number of CPUs be recalculated every 5 seconds,
which seems like a reasonable middle ground.
It has an impact of <1% on the main loop duration and a worst-case
impact of <5% on throughput [1].

As a result of these changes (assuming the patch is backported):
- >=2.16: a change in the CPU affinity is reflected in the number of
  threads within (at most) 5 seconds.
- < 2.16: a change in the CPU affinity will be reflected in
  the number of threads the next time bridge_reconfigure() is called
  (e.g., on the next DB change), provided 5 seconds have passed.

The difference in behavior is because on older versions the thread
number calculation was done on bridge reconfiguration, while newer
versions moved this logic down to the dpif layer, where it runs at the
dpif->run() stage.
Considering it has not been a huge problem until today and that the
CPU change would be reflected sooner or later (e.g., the user could
force a recalculation with a simple ovs-vsctl command), I think it
might be OK to leave it like that.

[1] Tested in the worst-case scenario: disabling the kernel cache
(other_config:flow-size=0), modifying ovs-vswitchd's affinity so that
the number of handlers goes up and down every 5 seconds, and measuring
the difference in netperf's ops/sec.

Fixes: be15ec48d766 ("lib: Use a more accurate value for CPU count (sched_getaffinity).")
Cc: david.marchand@redhat.com
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 lib/ovs-thread.c | 61 +++++++++++++++++++++++++++++-------------------
 1 file changed, 37 insertions(+), 24 deletions(-)

Comments

Mike Pattrick Nov. 7, 2022, 6:48 p.m. UTC | #1
On Fri, Nov 4, 2022 at 6:02 PM Adrian Moreno <amorenoz@redhat.com> wrote:
>
> Currently, things like the number of handler and revalidator threads are
> calculated based on the number of available CPUs. However, this number
> is considered static and only calculated once, hence ignoring events
> such as CPUs being hotplugged, switched on/off, or the affinity mask
> changing.
>
> On the other hand, checking the number of available CPUs multiple times
> per second seems like overkill.
> Affinity should not change that often and, even if it does, the impact
> of destroying and recreating all the threads so often is probably too
> high a price to pay.
>
> This patch makes the number of CPUs be recalculated every 5 seconds,
> which seems like a reasonable middle ground.
> It has an impact of <1% on the main loop duration and a worst-case
> impact of <5% on throughput [1].
>
> As a result of these changes (assuming the patch is backported):
> - >=2.16: a change in the CPU affinity is reflected in the number of
>   threads within (at most) 5 seconds.
> - < 2.16: a change in the CPU affinity will be reflected in
>   the number of threads the next time bridge_reconfigure() is called
>   (e.g., on the next DB change), provided 5 seconds have passed.
>
> The difference in behavior is because on older versions the thread
> number calculation was done on bridge reconfiguration, while newer
> versions moved this logic down to the dpif layer, where it runs at the
> dpif->run() stage.
> Considering it has not been a huge problem until today and that the
> CPU change would be reflected sooner or later (e.g., the user could
> force a recalculation with a simple ovs-vsctl command), I think it
> might be OK to leave it like that.
>
> [1] Tested in the worst-case scenario: disabling the kernel cache
> (other_config:flow-size=0), modifying ovs-vswitchd's affinity so that
> the number of handlers goes up and down every 5 seconds, and measuring
> the difference in netperf's ops/sec.
>
> Fixes: be15ec48d766 ("lib: Use a more accurate value for CPU count (sched_getaffinity).")
> Cc: david.marchand@redhat.com
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Acked-by: Mike Pattrick <mkp@redhat.com>
Ilya Maximets Dec. 6, 2022, 1:40 p.m. UTC | #2
On 11/4/22 23:02, Adrian Moreno wrote:
> Currently, things like the number of handler and revalidator threads are
> calculated based on the number of available CPUs. However, this number
> is considered static and only calculated once, hence ignoring events
> such as CPUs being hotplugged, switched on/off, or the affinity mask
> changing.
>
> On the other hand, checking the number of available CPUs multiple times
> per second seems like overkill.
> Affinity should not change that often and, even if it does, the impact
> of destroying and recreating all the threads so often is probably too
> high a price to pay.
>
> This patch makes the number of CPUs be recalculated every 5 seconds,
> which seems like a reasonable middle ground.
> It has an impact of <1% on the main loop duration and a worst-case
> impact of <5% on throughput [1].

Thanks, Adrian!  See some comments below.

> 
> As a result of these changes (assuming the patch is backported):
> - >=2.16: a change in the CPU affinity is reflected in the number of
>   threads within (at most) 5 seconds.
> - < 2.16: a change in the CPU affinity will be reflected in
>   the number of threads the next time bridge_reconfigure() is called
>   (e.g., on the next DB change), provided 5 seconds have passed.
>
> The difference in behavior is because on older versions the thread
> number calculation was done on bridge reconfiguration, while newer
> versions moved this logic down to the dpif layer, where it runs at the
> dpif->run() stage.
> Considering it has not been a huge problem until today and that the
> CPU change would be reflected sooner or later (e.g., the user could
> force a recalculation with a simple ovs-vsctl command), I think it
> might be OK to leave it like that.

I don't think we should actually backport that change, as it changes
the behavior of ovs-vswitchd in a way that may not be expected by
the user.  It's more like a new feature.  So, the above paragraphs can
be dropped from the commit message.

And since it's a new behavior change, it should have a record in the
NEWS file.  Something like this, maybe:

  - ovs-vswitchd now detects changes in CPU affinity and adjusts the number
    of handler and revalidator threads if necessary.

> 
> [1] Tested in the worst-case scenario: disabling the kernel cache
> (other_config:flow-size=0), modifying ovs-vswitchd's affinity so that
> the number of handlers goes up and down every 5 seconds, and measuring
> the difference in netperf's ops/sec.
> 
> Fixes: be15ec48d766 ("lib: Use a more accurate value for CPU count (sched_getaffinity).")

And this tag is not really needed.

> Cc: david.marchand@redhat.com
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  lib/ovs-thread.c | 61 +++++++++++++++++++++++++++++-------------------
>  1 file changed, 37 insertions(+), 24 deletions(-)
> 
> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
> index 78ed3e970..d1deb9c52 100644
> --- a/lib/ovs-thread.c
> +++ b/lib/ovs-thread.c
> @@ -31,6 +31,7 @@
>  #include "openvswitch/poll-loop.h"
>  #include "seq.h"
>  #include "socket-util.h"
> +#include "timeval.h"
>  #include "util.h"
>  
>  #ifdef __CHECKER__
> @@ -627,42 +628,54 @@ ovs_thread_stats_next_bucket(const struct ovsthread_stats *stats, size_t i)
>  }
>  
>  
> -/* Returns the total number of cores available to this process, or 0 if the
> - * number cannot be determined. */
> -int
> -count_cpu_cores(void)
> +static int
> +count_cpu_cores__(void)
>  {
> -    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
> -    static long int n_cores;
> +    long int n_cores;
>  
> -    if (ovsthread_once_start(&once)) {
>  #ifndef _WIN32
> -        n_cores = sysconf(_SC_NPROCESSORS_ONLN);
> +    n_cores = sysconf(_SC_NPROCESSORS_ONLN);
> +#else
> +    SYSTEM_INFO sysinfo;
> +    GetSystemInfo(&sysinfo);
> +    n_cores = sysinfo.dwNumberOfProcessors;
> +#endif
>  #ifdef __linux__
> -        if (n_cores > 0) {
> -            cpu_set_t *set = CPU_ALLOC(n_cores);
> +    if (n_cores > 0) {
> +        cpu_set_t *set = CPU_ALLOC(n_cores);
>  
> -            if (set) {
> -                size_t size = CPU_ALLOC_SIZE(n_cores);
> +        if (set) {
> +            size_t size = CPU_ALLOC_SIZE(n_cores);
>  
> -                if (!sched_getaffinity(0, size, set)) {
> -                    n_cores = CPU_COUNT_S(size, set);
> -                }
> -                CPU_FREE(set);
> +            if (!sched_getaffinity(0, size, set)) {
> +                n_cores = CPU_COUNT_S(size, set);
>              }
> +            CPU_FREE(set);
>          }
> -#endif
> -#else
> -        SYSTEM_INFO sysinfo;
> -        GetSystemInfo(&sysinfo);
> -        n_cores = sysinfo.dwNumberOfProcessors;
> -#endif
> -        ovsthread_once_done(&once);
>      }
> -
> +#endif
>      return n_cores > 0 ? n_cores : 0;
>  }
>  
> +/* It's unlikely that the available cpus change several times per second and
> + * even if it does, it's not needed (or desired) to react to such changes so
> + * quickly.*/
> +#define COUNT_CPU_UPDATE_TIME_MS 5000
> +/* Returns the current total number of cores available to this process, or 0
> + * if the number cannot be determined.
> + * It is assumed that this function is only called from the main thread.*/

There is a missing space at the end.  But, more importantly, this
assumption is not correct.  The function is called from the system
stats thread, i.e. system_stats_thread_func().  So, updates inside
should be guarded by some form of a mutex.
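
For illustration, one way to address this (and the brace-style note
further down) could look like the following sketch.  It assumes the
ovs_mutex wrappers from ovs-thread.h and is not the code that was
actually merged:

    static struct ovs_mutex cpu_cores_mutex = OVS_MUTEX_INITIALIZER;

    int
    count_cpu_cores(void)
    {
        static long long int last_updated = 0;
        long long int now = time_msec();
        static int cpu_cores;
        int cores;

        /* Serialize updates: this function can be called from the main
         * thread and from the system stats thread. */
        ovs_mutex_lock(&cpu_cores_mutex);
        if (now - last_updated >= COUNT_CPU_UPDATE_TIME_MS) {
            last_updated = now;
            cpu_cores = count_cpu_cores__();
        }
        cores = cpu_cores;
        ovs_mutex_unlock(&cpu_cores_mutex);

        return cores;
    }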

> +int count_cpu_cores(void) {

Braces should be on a new line.

> +    static long long int last_updated = 0;
> +    long long int now = time_msec();
> +    static int cpu_cores;
> +
> +    if (now - last_updated >= COUNT_CPU_UPDATE_TIME_MS) {
> +        last_updated = now;
> +        cpu_cores = count_cpu_cores__();
> +    }
> +    return cpu_cores;
> +}
> +
>  /* Returns the total number of cores on the system, or 0 if the
>   * number cannot be determined. */
>  int

Best regards, Ilya Maximets.
Adrian Moreno Dec. 7, 2022, 8:17 a.m. UTC | #3
On 12/6/22 14:40, Ilya Maximets wrote:
> On 11/4/22 23:02, Adrian Moreno wrote:
>> Currently, things like the number of handler and revalidator threads are
>> calculated based on the number of available CPUs. However, this number
>> is considered static and only calculated once, hence ignoring events
>> such as CPUs being hotplugged, switched on/off, or the affinity mask
>> changing.
>>
>> On the other hand, checking the number of available CPUs multiple times
>> per second seems like overkill.
>> Affinity should not change that often and, even if it does, the impact
>> of destroying and recreating all the threads so often is probably too
>> high a price to pay.
>>
>> This patch makes the number of CPUs be recalculated every 5 seconds,
>> which seems like a reasonable middle ground.
>> It has an impact of <1% on the main loop duration and a worst-case
>> impact of <5% on throughput [1].
> 
> Thanks, Adrian!  See some comments below.
> 
>>
>> As a result of these changes (assuming the patch is backported):
>> - >=2.16: a change in the CPU affinity is reflected in the number of
>>    threads within (at most) 5 seconds.
>> - < 2.16: a change in the CPU affinity will be reflected in
>>    the number of threads the next time bridge_reconfigure() is called
>>    (e.g., on the next DB change), provided 5 seconds have passed.
>>
>> The difference in behavior is because on older versions the thread
>> number calculation was done on bridge reconfiguration, while newer
>> versions moved this logic down to the dpif layer, where it runs at the
>> dpif->run() stage.
>> Considering it has not been a huge problem until today and that the
>> CPU change would be reflected sooner or later (e.g., the user could
>> force a recalculation with a simple ovs-vsctl command), I think it
>> might be OK to leave it like that.
> 
> I don't think we should actually backport that change, as it changes
> the behavior of ovs-vswitchd in a way that may not be expected by
> the user.  It's more like a new feature.  So, the above paragraphs can
> be dropped from the commit message.
> 
> And since it's a new behavior change, it should have a record in the
> NEWS file.  Something like this, maybe:
> 
>    - ovs-vswitchd now detects changes in CPU affinity and adjusts the number
>      of handler and revalidator threads if necessary.
> 

Given how n_revalidator_threads depends on the affinity, I don't know whether 
the user's "expected behavior" is for this value to stay fixed when the 
affinity changes, but it's true that requiring an OVSDB transaction is an 
added nuance that's definitely not expected on old versions.

If we're not going to backport it, do you think we should add a config knob to 
modify the CPU affinity check interval (currently 5s)? I've restrained myself 
from adding yet another config value with some potential for damage and little 
benefit, but I'd like to know your opinion on this. For concreteness, such a 
knob might look roughly like the sketch below.
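
This is only a hypothetical sketch; the other_config key name
"cpu-update-interval" is made up here, and nothing like this was
implemented.  It assumes the interval would be read via smap_get_int()
during bridge reconfiguration:

    static int cpu_update_interval_ms = COUNT_CPU_UPDATE_TIME_MS;

    /* Hypothetical knob, read during bridge reconfiguration.  Falls
     * back to the built-in default when the key is absent. */
    static void
    read_cpu_update_interval(const struct smap *other_config)
    {
        cpu_update_interval_ms = smap_get_int(other_config,
                                              "cpu-update-interval",
                                              COUNT_CPU_UPDATE_TIME_MS);
    }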

>>
>> [1] Tested in the worst-case scenario: disabling the kernel cache
>> (other_config:flow-size=0), modifying ovs-vswitchd's affinity so that
>> the number of handlers goes up and down every 5 seconds, and measuring
>> the difference in netperf's ops/sec.
>>
>> Fixes: be15ec48d766 ("lib: Use a more accurate value for CPU count (sched_getaffinity).")
> 
> And this tag is not really needed.
> 
>> Cc: david.marchand@redhat.com
>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>> ---
>>   lib/ovs-thread.c | 61 +++++++++++++++++++++++++++++-------------------
>>   1 file changed, 37 insertions(+), 24 deletions(-)
>>
>> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
>> index 78ed3e970..d1deb9c52 100644
>> --- a/lib/ovs-thread.c
>> +++ b/lib/ovs-thread.c
>> @@ -31,6 +31,7 @@
>>   #include "openvswitch/poll-loop.h"
>>   #include "seq.h"
>>   #include "socket-util.h"
>> +#include "timeval.h"
>>   #include "util.h"
>>   
>>   #ifdef __CHECKER__
>> @@ -627,42 +628,54 @@ ovs_thread_stats_next_bucket(const struct ovsthread_stats *stats, size_t i)
>>   }
>>   
>>   
>> -/* Returns the total number of cores available to this process, or 0 if the
>> - * number cannot be determined. */
>> -int
>> -count_cpu_cores(void)
>> +static int
>> +count_cpu_cores__(void)
>>   {
>> -    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>> -    static long int n_cores;
>> +    long int n_cores;
>>   
>> -    if (ovsthread_once_start(&once)) {
>>   #ifndef _WIN32
>> -        n_cores = sysconf(_SC_NPROCESSORS_ONLN);
>> +    n_cores = sysconf(_SC_NPROCESSORS_ONLN);
>> +#else
>> +    SYSTEM_INFO sysinfo;
>> +    GetSystemInfo(&sysinfo);
>> +    n_cores = sysinfo.dwNumberOfProcessors;
>> +#endif
>>   #ifdef __linux__
>> -        if (n_cores > 0) {
>> -            cpu_set_t *set = CPU_ALLOC(n_cores);
>> +    if (n_cores > 0) {
>> +        cpu_set_t *set = CPU_ALLOC(n_cores);
>>   
>> -            if (set) {
>> -                size_t size = CPU_ALLOC_SIZE(n_cores);
>> +        if (set) {
>> +            size_t size = CPU_ALLOC_SIZE(n_cores);
>>   
>> -                if (!sched_getaffinity(0, size, set)) {
>> -                    n_cores = CPU_COUNT_S(size, set);
>> -                }
>> -                CPU_FREE(set);
>> +            if (!sched_getaffinity(0, size, set)) {
>> +                n_cores = CPU_COUNT_S(size, set);
>>               }
>> +            CPU_FREE(set);
>>           }
>> -#endif
>> -#else
>> -        SYSTEM_INFO sysinfo;
>> -        GetSystemInfo(&sysinfo);
>> -        n_cores = sysinfo.dwNumberOfProcessors;
>> -#endif
>> -        ovsthread_once_done(&once);
>>       }
>> -
>> +#endif
>>       return n_cores > 0 ? n_cores : 0;
>>   }
>>   
>> +/* It's unlikely that the available cpus change several times per second and
>> + * even if it does, it's not needed (or desired) to react to such changes so
>> + * quickly.*/
>> +#define COUNT_CPU_UPDATE_TIME_MS 5000
>> +/* Returns the current total number of cores available to this process, or 0
>> + * if the number cannot be determined.
>> + * It is assumed that this function is only called from the main thread.*/
> 
> There is a missing space at the end.  But, more importantly, this
> assumption is not correct.  The function is called from the system
> stats thread, i.e. system_stats_thread_func().  So, updates inside
> should be guarded by some form of a mutex.
> 
>> +int count_cpu_cores(void) {
> 
> Braces should be on a new line.
> 
>> +    static long long int last_updated = 0;
>> +    long long int now = time_msec();
>> +    static int cpu_cores;
>> +
>> +    if (now - last_updated >= COUNT_CPU_UPDATE_TIME_MS) {
>> +        last_updated = now;
>> +        cpu_cores = count_cpu_cores__();
>> +    }
>> +    return cpu_cores;
>> +}
>> +
>>   /* Returns the total number of cores on the system, or 0 if the
>>    * number cannot be determined. */
>>   int
> 
> Best regards, Ilya Maximets.
>
Ilya Maximets Dec. 7, 2022, 12:33 p.m. UTC | #4
On 12/7/22 09:17, Adrian Moreno wrote:
> 
> 
> On 12/6/22 14:40, Ilya Maximets wrote:
>> On 11/4/22 23:02, Adrian Moreno wrote:
>>> Currently, things like the number of handler and revalidator threads are
>>> calculated based on the number of available CPUs. However, this number
>>> is considered static and only calculated once, hence ignoring events
>>> such as CPUs being hotplugged, switched on/off, or the affinity mask
>>> changing.
>>>
>>> On the other hand, checking the number of available CPUs multiple times
>>> per second seems like overkill.
>>> Affinity should not change that often and, even if it does, the impact
>>> of destroying and recreating all the threads so often is probably too
>>> high a price to pay.
>>>
>>> This patch makes the number of CPUs be recalculated every 5 seconds,
>>> which seems like a reasonable middle ground.
>>> It has an impact of <1% on the main loop duration and a worst-case
>>> impact of <5% on throughput [1].
>>
>> Thanks, Adrian!  See some comments below.
>>
>>>
>>> As a result of these changes (assuming the patch is backported):
>>> - >=2.16: a change in the CPU affinity is reflected in the number of
>>>    threads within (at most) 5 seconds.
>>> - < 2.16: a change in the CPU affinity will be reflected in
>>>    the number of threads the next time bridge_reconfigure() is called
>>>    (e.g., on the next DB change), provided 5 seconds have passed.
>>>
>>> The difference in behavior is because on older versions the thread
>>> number calculation was done on bridge reconfiguration, while newer
>>> versions moved this logic down to the dpif layer, where it runs at the
>>> dpif->run() stage.
>>> Considering it has not been a huge problem until today and that the
>>> CPU change would be reflected sooner or later (e.g., the user could
>>> force a recalculation with a simple ovs-vsctl command), I think it
>>> might be OK to leave it like that.
>>
>> I don't think we should actually backport that change, as it changes
>> the behavior of ovs-vswitchd in a way that may not be expected by
>> the user.  It's more like a new feature.  So, the above paragraphs can
>> be dropped from the commit message.
>>
>> And since it's a new behavior change, it should have a record in the
>> NEWS file.  Something like this, maybe:
>>
>>    - ovs-vswitchd now detects changes in CPU affinity and adjusts the number
>>      of handler and revalidator threads if necessary.
>>
> 
> Given how n_revalidator_threads depends on the affinity, I don't know whether the user's "expected behavior" is for this value to stay fixed when the affinity changes, but it's true that requiring an OVSDB transaction is an added nuance that's definitely not expected on old versions.
> 
> If we're not going to backport it, do you think we should add a config knob to modify the CPU affinity check interval (currently 5s)? I've restrained myself from adding yet another config value with some potential for damage and little benefit, but I'd like to know your opinion on this.

I'd say we don't need a knob at the moment.  We can add it later, if required.
Maybe bump the timeout to 10 seconds though?  (Just a random number that
seems a bit more comfortable to me, possibly because it matches the default
flow timeout.)

> 
>>>
>>> [1] Tested in the worst-case scenario: disabling the kernel cache
>>> (other_config:flow-size=0), modifying ovs-vswitchd's affinity so that
>>> the number of handlers goes up and down every 5 seconds, and measuring
>>> the difference in netperf's ops/sec.
>>>
>>> Fixes: be15ec48d766 ("lib: Use a more accurate value for CPU count (sched_getaffinity).")
>>
>> And this tag is not really needed.
>>
>>> Cc: david.marchand@redhat.com
>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>> ---
>>>   lib/ovs-thread.c | 61 +++++++++++++++++++++++++++++-------------------
>>>   1 file changed, 37 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
>>> index 78ed3e970..d1deb9c52 100644
>>> --- a/lib/ovs-thread.c
>>> +++ b/lib/ovs-thread.c
>>> @@ -31,6 +31,7 @@
>>>   #include "openvswitch/poll-loop.h"
>>>   #include "seq.h"
>>>   #include "socket-util.h"
>>> +#include "timeval.h"
>>>   #include "util.h"
>>>     #ifdef __CHECKER__
>>> @@ -627,42 +628,54 @@ ovs_thread_stats_next_bucket(const struct ovsthread_stats *stats, size_t i)
>>>   }
>>>     
>>> -/* Returns the total number of cores available to this process, or 0 if the
>>> - * number cannot be determined. */
>>> -int
>>> -count_cpu_cores(void)
>>> +static int
>>> +count_cpu_cores__(void)
>>>   {
>>> -    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>>> -    static long int n_cores;
>>> +    long int n_cores;
>>>   -    if (ovsthread_once_start(&once)) {
>>>   #ifndef _WIN32
>>> -        n_cores = sysconf(_SC_NPROCESSORS_ONLN);
>>> +    n_cores = sysconf(_SC_NPROCESSORS_ONLN);
>>> +#else
>>> +    SYSTEM_INFO sysinfo;
>>> +    GetSystemInfo(&sysinfo);
>>> +    n_cores = sysinfo.dwNumberOfProcessors;
>>> +#endif
>>>   #ifdef __linux__
>>> -        if (n_cores > 0) {
>>> -            cpu_set_t *set = CPU_ALLOC(n_cores);
>>> +    if (n_cores > 0) {
>>> +        cpu_set_t *set = CPU_ALLOC(n_cores);
>>>   -            if (set) {
>>> -                size_t size = CPU_ALLOC_SIZE(n_cores);
>>> +        if (set) {
>>> +            size_t size = CPU_ALLOC_SIZE(n_cores);
>>>   -                if (!sched_getaffinity(0, size, set)) {
>>> -                    n_cores = CPU_COUNT_S(size, set);
>>> -                }
>>> -                CPU_FREE(set);
>>> +            if (!sched_getaffinity(0, size, set)) {
>>> +                n_cores = CPU_COUNT_S(size, set);
>>>               }
>>> +            CPU_FREE(set);
>>>           }
>>> -#endif
>>> -#else
>>> -        SYSTEM_INFO sysinfo;
>>> -        GetSystemInfo(&sysinfo);
>>> -        n_cores = sysinfo.dwNumberOfProcessors;
>>> -#endif
>>> -        ovsthread_once_done(&once);
>>>       }
>>> -
>>> +#endif
>>>       return n_cores > 0 ? n_cores : 0;
>>>   }
>>>   +/* It's unlikely that the available cpus change several times per second and
>>> + * even if it does, it's not needed (or desired) to react to such changes so
>>> + * quickly.*/
>>> +#define COUNT_CPU_UPDATE_TIME_MS 5000
>>> +/* Returns the current total number of cores available to this process, or 0
>>> + * if the number cannot be determined.
>>> + * It is assumed that this function is only called from the main thread.*/
>>
>> There is a missing space at the end.  But, more importantly, this
>> assumption is not correct.  The function is called from the system
>> stats thread, i.e. system_stats_thread_func().  So, updates inside
>> should be guarded by some form of a mutex.
>>
>>> +int count_cpu_cores(void) {
>>
>> Braces should be on a new line.
>>
>>> +    static long long int last_updated = 0;
>>> +    long long int now = time_msec();
>>> +    static int cpu_cores;
>>> +
>>> +    if (now - last_updated >= COUNT_CPU_UPDATE_TIME_MS) {
>>> +        last_updated = now;
>>> +        cpu_cores = count_cpu_cores__();
>>> +    }
>>> +    return cpu_cores;
>>> +}
>>> +
>>>   /* Returns the total number of cores on the system, or 0 if the
>>>    * number cannot be determined. */
>>>   int
>>
>> Best regards, Ilya Maximets.
>>
>

Patch

diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
index 78ed3e970..d1deb9c52 100644
--- a/lib/ovs-thread.c
+++ b/lib/ovs-thread.c
@@ -31,6 +31,7 @@ 
 #include "openvswitch/poll-loop.h"
 #include "seq.h"
 #include "socket-util.h"
+#include "timeval.h"
 #include "util.h"
 
 #ifdef __CHECKER__
@@ -627,42 +628,54 @@  ovs_thread_stats_next_bucket(const struct ovsthread_stats *stats, size_t i)
 }
 
 
-/* Returns the total number of cores available to this process, or 0 if the
- * number cannot be determined. */
-int
-count_cpu_cores(void)
+static int
+count_cpu_cores__(void)
 {
-    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
-    static long int n_cores;
+    long int n_cores;
 
-    if (ovsthread_once_start(&once)) {
 #ifndef _WIN32
-        n_cores = sysconf(_SC_NPROCESSORS_ONLN);
+    n_cores = sysconf(_SC_NPROCESSORS_ONLN);
+#else
+    SYSTEM_INFO sysinfo;
+    GetSystemInfo(&sysinfo);
+    n_cores = sysinfo.dwNumberOfProcessors;
+#endif
 #ifdef __linux__
-        if (n_cores > 0) {
-            cpu_set_t *set = CPU_ALLOC(n_cores);
+    if (n_cores > 0) {
+        cpu_set_t *set = CPU_ALLOC(n_cores);
 
-            if (set) {
-                size_t size = CPU_ALLOC_SIZE(n_cores);
+        if (set) {
+            size_t size = CPU_ALLOC_SIZE(n_cores);
 
-                if (!sched_getaffinity(0, size, set)) {
-                    n_cores = CPU_COUNT_S(size, set);
-                }
-                CPU_FREE(set);
+            if (!sched_getaffinity(0, size, set)) {
+                n_cores = CPU_COUNT_S(size, set);
             }
+            CPU_FREE(set);
         }
-#endif
-#else
-        SYSTEM_INFO sysinfo;
-        GetSystemInfo(&sysinfo);
-        n_cores = sysinfo.dwNumberOfProcessors;
-#endif
-        ovsthread_once_done(&once);
     }
-
+#endif
     return n_cores > 0 ? n_cores : 0;
 }
 
+/* It's unlikely that the available cpus change several times per second and
+ * even if it does, it's not needed (or desired) to react to such changes so
+ * quickly.*/
+#define COUNT_CPU_UPDATE_TIME_MS 5000
+/* Returns the current total number of cores available to this process, or 0
+ * if the number cannot be determined.
+ * It is assumed that this function is only called from the main thread.*/
+int count_cpu_cores(void) {
+    static long long int last_updated = 0;
+    long long int now = time_msec();
+    static int cpu_cores;
+
+    if (now - last_updated >= COUNT_CPU_UPDATE_TIME_MS) {
+        last_updated = now;
+        cpu_cores = count_cpu_cores__();
+    }
+    return cpu_cores;
+}
+
 /* Returns the total number of cores on the system, or 0 if the
  * number cannot be determined. */
 int
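
With the patch applied, count_cpu_cores() becomes cheap enough to call
from a run loop, since it only re-reads the affinity every 5 seconds.
A minimal sketch of how a periodic caller might consume it (purely
illustrative; the real consumers are the handler/revalidator sizing
paths in the dpif layer, and the names below are made up):

    static int example_prev_cores;

    static void
    example_run(void)
    {
        int cores = count_cpu_cores();  /* Cached; refreshed every 5 s. */

        /* React only when the usable core count actually changes. */
        if (cores && cores != example_prev_cores) {
            example_prev_cores = cores;
            /* Resize handler/revalidator thread pools based on 'cores'. */
        }
    }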