diff mbox series

[RESEND,v9] Mutex: Add pthread mutex tunables

Message ID 1542851105-16956-1-git-send-email-kemi.wang@intel.com
State New
Headers show
Series [RESEND,v9] Mutex: Add pthread mutex tunables | expand

Commit Message

kemi Nov. 22, 2018, 1:45 a.m. UTC
This patch does not have any functionality change, we only provide a spin
count tunes for pthread adaptive spin mutex. The tunable
glibc.pthread.mutex_spin_count tunes can be used by system administrator to
squeeze system performance according to different hardware capabilities and
workload characteristics.

The maximum value of spin count is limited to 30000 to avoid the overflow
of mutex->__data.__spins variable with the possible type of short in
pthread_mutex_lock ().

The default value of spin count is set to 100 with the reference to the
previous number of times of spinning via trylock. This value would be
architecture-specific and can be tuned with kinds of benchmarks to fit most
cases in future.

   * sysdeps/nptl/dl-tunables.list: Add glibc.pthread.mutex_spin_count entry.
   * manual/tunables.texi: Add glibc.pthread.mutex_spin_count description.
   * nptl/pthread_mutex_conf.h: New file.
   * nptl/pthread_mutex_conf.c: New file.
   * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
   * nptl/nptl-init.c: Put pthread mutex tunable initialization in pthread
     initialization.
   * nptl/pthreadP.h: Refine MAX_ADAPTIVE_COUNT macro definition

I would extend my appreciation sincerely to H.J.Lu for his help to refine
this patch series.

ChangeLog:
    V8->V9:
	a) Add the "unistd.h" header file back, or it will cause build
	regression on 32 bits system.
	b) Rebase on the latest master branch
	c) Tested on x86_64 with build-many-glibcs.py

    V7->V8:
	a) Refine the pthread tunables description in manual/tunables.texi
	accordingly to Carlos O'Donell and Rical Jason.

    V6->V7:
    a) Patch is refined by H.J.Lu

    V5->V6:
    a) Missing "pthread mutex tunables" entry in the menu of tunables.texi,
    add it.

    V4->V5
    a) Put mutex tunable (glibc.mutex.spin_count) initialization as part of
    overall pthread initialization, that would avoid the extra relocation,
    as suggested by Florian Weimer. Thanks for pointing it out!
    b) Move the READ_ONLY_SPIN macro definition from the third patch to
    this patch

    V3->V4
    a) Add comments in elf/dl-tunables.list

    V2->V3
    a) Polish the description of glibc.mutex.spin_count tunable with the
    help from Rical Jasan.
    b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
    pthread_mutex_conf.c, as suggested by Florian Weimer.
    c) Adjust the default value of spin count to 100 with the reference of
    the previous spinning way via trylock.

    V1->V2
    a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
    b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
    c) Change the Makefile to compile pthread_mutex_conf.c
    d) Modify the copyright "2013-2018" -> "2018" for new added files
    e) Fix the indentation issue (tab -> double space) in
    elf/dl-tunables.list
    f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
    g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
    description in manual/tunables.texi.
    h) Fix the indentation issue in nptl/pthread_mutex_conf.c
    i) Fix the indentation issue for nested preprocessor (add one space for
    each level)

Suggested-by: Andi Kleen <andi.kleen@intel.com>
Signed-off-by: Kemi.wang <kemi.wang@intel.com>
---
 manual/tunables.texi          | 27 +++++++++++++++++++++++
 nptl/Makefile                 |  2 +-
 nptl/nptl-init.c              |  5 +++++
 nptl/pthreadP.h               | 24 +++++++++++++++-----
 nptl/pthread_mutex_conf.c     | 27 +++++++++++++++++++++++
 nptl/pthread_mutex_conf.h     | 45 ++++++++++++++++++++++++++++++++++++++
 sysdeps/nptl/dl-tunables.list | 51 +++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 174 insertions(+), 7 deletions(-)
 create mode 100644 nptl/pthread_mutex_conf.c
 create mode 100644 nptl/pthread_mutex_conf.h
 create mode 100644 sysdeps/nptl/dl-tunables.list

Comments

kemi Nov. 27, 2018, 2:43 a.m. UTC | #1
PING~

On 2018/11/22 上午9:45, Kemi Wang wrote:
> This patch does not have any functionality change, we only provide a spin
> count tunes for pthread adaptive spin mutex. The tunable
> glibc.pthread.mutex_spin_count tunes can be used by system administrator to
> squeeze system performance according to different hardware capabilities and
> workload characteristics.
> 
> The maximum value of spin count is limited to 30000 to avoid the overflow
> of mutex->__data.__spins variable with the possible type of short in
> pthread_mutex_lock ().
> 
> The default value of spin count is set to 100 with the reference to the
> previous number of times of spinning via trylock. This value would be
> architecture-specific and can be tuned with kinds of benchmarks to fit most
> cases in future.
> 
>    * sysdeps/nptl/dl-tunables.list: Add glibc.pthread.mutex_spin_count entry.
>    * manual/tunables.texi: Add glibc.pthread.mutex_spin_count description.
>    * nptl/pthread_mutex_conf.h: New file.
>    * nptl/pthread_mutex_conf.c: New file.
>    * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
>    * nptl/nptl-init.c: Put pthread mutex tunable initialization in pthread
>      initialization.
>    * nptl/pthreadP.h: Refine MAX_ADAPTIVE_COUNT macro definition
> 
> I would extend my appreciation sincerely to H.J.Lu for his help to refine
> this patch series.
> 
> ChangeLog:
>     V8->V9:
> 	a) Add the "unistd.h" header file back, or it will cause build
> 	regression on 32 bits system.
> 	b) Rebase on the latest master branch
> 	c) Tested on x86_64 with build-many-glibcs.py
> 
>     V7->V8:
> 	a) Refine the pthread tunables description in manual/tunables.texi
> 	accordingly to Carlos O'Donell and Rical Jason.
> 
>     V6->V7:
>     a) Patch is refined by H.J.Lu
> 
>     V5->V6:
>     a) Missing "pthread mutex tunables" entry in the menu of tunables.texi,
>     add it.
> 
>     V4->V5
>     a) Put mutex tunable (glibc.mutex.spin_count) initialization as part of
>     overall pthread initialization, that would avoid the extra relocation,
>     as suggested by Florian Weimer. Thanks for pointing it out!
>     b) Move the READ_ONLY_SPIN macro definition from the third patch to
>     this patch
> 
>     V3->V4
>     a) Add comments in elf/dl-tunables.list
> 
>     V2->V3
>     a) Polish the description of glibc.mutex.spin_count tunable with the
>     help from Rical Jasan.
>     b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
>     pthread_mutex_conf.c, as suggested by Florian Weimer.
>     c) Adjust the default value of spin count to 100 with the reference of
>     the previous spinning way via trylock.
> 
>     V1->V2
>     a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
>     b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
>     c) Change the Makefile to compile pthread_mutex_conf.c
>     d) Modify the copyright "2013-2018" -> "2018" for new added files
>     e) Fix the indentation issue (tab -> double space) in
>     elf/dl-tunables.list
>     f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
>     g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
>     description in manual/tunables.texi.
>     h) Fix the indentation issue in nptl/pthread_mutex_conf.c
>     i) Fix the indentation issue for nested preprocessor (add one space for
>     each level)
> 
> Suggested-by: Andi Kleen <andi.kleen@intel.com>
> Signed-off-by: Kemi.wang <kemi.wang@intel.com>
> ---
>  manual/tunables.texi          | 27 +++++++++++++++++++++++
>  nptl/Makefile                 |  2 +-
>  nptl/nptl-init.c              |  5 +++++
>  nptl/pthreadP.h               | 24 +++++++++++++++-----
>  nptl/pthread_mutex_conf.c     | 27 +++++++++++++++++++++++
>  nptl/pthread_mutex_conf.h     | 45 ++++++++++++++++++++++++++++++++++++++
>  sysdeps/nptl/dl-tunables.list | 51 +++++++++++++++++++++++++++++++++++++++++++
>  7 files changed, 174 insertions(+), 7 deletions(-)
>  create mode 100644 nptl/pthread_mutex_conf.c
>  create mode 100644 nptl/pthread_mutex_conf.h
>  create mode 100644 sysdeps/nptl/dl-tunables.list
> 
> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index 3345a23..33fbe1e 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -32,6 +32,7 @@ their own namespace.
>  * Tunable names::  The structure of a tunable name
>  * Memory Allocation Tunables::  Tunables in the memory allocation subsystem
>  * Elision Tunables::  Tunables in elision subsystem
> +* POSIX Thread Tunables:: Tunables in the POSIX thread subsystem
>  * Hardware Capability Tunables::  Tunables that modify the hardware
>  				  capabilities seen by @theglibc{}
>  @end menu
> @@ -281,6 +282,32 @@ of try lock attempts.
>  The default value of this tunable is @samp{3}.
>  @end deftp
>  
> +@node POSIX Thread Tunables
> +@section POSIX Thread Tunables
> +@cindex pthread mutex tunables
> +@cindex thread mutex tunables
> +@cindex mutex tunables
> +@cindex tunables thread mutex
> +
> +@deftp {Tunable namespace} glibc.pthread
> +The behavior of POSIX thread can be tuned to gain performance improvements
> +according to specific hardware capabilities and workload characteristics by
> +setting the following tunables in the @code{pthread} namespace:
> +@end deftp
> +
> +@deftp Tunable glibc.pthread.mutex_spin_count
> +The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
> +a thread should spin on the lock before calling into the kernel to block.
> +Adaptive spin is used for mutexes initialized with the
> +@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension.  It affects both
> +@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.
> +
> +The spinning is done until either the maximum spin count is reached or
> +the lock is acquired.
> +
> +The default value of this tunable is @samp{100}.
> +@end deftp
> +
>  @node Hardware Capability Tunables
>  @section Hardware Capability Tunables
>  @cindex hardware capability tunables
> diff --git a/nptl/Makefile b/nptl/Makefile
> index 49b6faa..284f815 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -145,7 +145,7 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
>  		      mtx_destroy mtx_init mtx_lock mtx_timedlock \
>  		      mtx_trylock mtx_unlock call_once cnd_broadcast \
>  		      cnd_destroy cnd_init cnd_signal cnd_timedwait cnd_wait \
> -		      tss_create tss_delete tss_get tss_set
> +		      tss_create tss_delete tss_get tss_set pthread_mutex_conf
>  #		      pthread_setuid pthread_seteuid pthread_setreuid \
>  #		      pthread_setresuid \
>  #		      pthread_setgid pthread_setegid pthread_setregid \
> diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
> index 907411d..adf99f1 100644
> --- a/nptl/nptl-init.c
> +++ b/nptl/nptl-init.c
> @@ -38,6 +38,7 @@
>  #include <kernel-features.h>
>  #include <libc-pointer-arith.h>
>  #include <pthread-pids.h>
> +#include <pthread_mutex_conf.h>
>  
>  #ifndef TLS_MULTIPLE_THREADS_IN_TCB
>  /* Pointer to the corresponding variable in libc.  */
> @@ -431,6 +432,10 @@ __pthread_initialize_minimal_internal (void)
>  
>    /* Determine whether the machine is SMP or not.  */
>    __is_smp = is_smp_system ();
> +
> +#if HAVE_TUNABLES
> +  pthread_tunables_init ();
> +#endif
>  }
>  strong_alias (__pthread_initialize_minimal_internal,
>  	      __pthread_initialize_minimal)
> diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
> index 19efe1e..9bae78e 100644
> --- a/nptl/pthreadP.h
> +++ b/nptl/pthreadP.h
> @@ -47,12 +47,6 @@
>  #endif
>  
>  
> -/* Adaptive mutex definitions.  */
> -#ifndef MAX_ADAPTIVE_COUNT
> -# define MAX_ADAPTIVE_COUNT 100
> -#endif
> -
> -
>  /* Magic cookie representing robust mutex with dead owner.  */
>  #define PTHREAD_MUTEX_INCONSISTENT	INT_MAX
>  /* Magic cookie representing not recoverable robust mutex.  */
> @@ -188,6 +182,24 @@ enum
>  #define __PTHREAD_COND_SHARED_MASK 1
>  
>  
> +#if HAVE_TUNABLES
> +struct mutex_config
> +{
> +  int spin_count;
> +};
> +
> +extern struct mutex_config __mutex_aconf;
> +#endif
> +
> +/* Adaptive mutex definitions.  */
> +#ifndef MAX_ADAPTIVE_COUNT
> +# if HAVE_TUNABLES
> +#  define MAX_ADAPTIVE_COUNT __mutex_aconf.spin_count
> +# else
> +#  define MAX_ADAPTIVE_COUNT 100
> +# endif
> +#endif
> +
>  /* Internal variables.  */
>  
>  
> diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
> new file mode 100644
> index 0000000..0da4afe
> --- /dev/null
> +++ b/nptl/pthread_mutex_conf.c
> @@ -0,0 +1,27 @@
> +/* pthread_mutex_conf.c: Pthread mutex tunable parameters.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#if HAVE_TUNABLES
> +#include <pthreadP.h>
> +struct mutex_config __mutex_aconf =
> +{
> +  /* The maximum number of times a thread should spin on the lock before
> +  calling into kernel to block.  */
> +  .spin_count = 100,
> +};
> +#endif
> diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h
> new file mode 100644
> index 0000000..97ff98e
> --- /dev/null
> +++ b/nptl/pthread_mutex_conf.h
> @@ -0,0 +1,45 @@
> +/* pthread_mutex_conf.h: Pthread mutex tunable parameters.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +#ifndef _PTHREAD_MUTEX_CONF_H
> +#define _PTHREAD_MUTEX_CONF_H 1
> +
> +#if HAVE_TUNABLES
> +# define TUNABLE_NAMESPACE pthread
> +# include <stdbool.h>
> +# include <stdint.h>
> +# include <pthreadP.h>
> +# include <pthread_mutex_conf.h>
> +# include <unistd.h>  /* Get STDOUT_FILENO for _dl_printf.  */
> +# include <elf/dl-tunables.h>
> +
> +extern struct mutex_config __mutex_aconf attribute_hidden;
> +
> +static void
> +TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
> +{
> +  __mutex_aconf.spin_count = (int32_t) (valp)->numval;
> +}
> +
> +static void pthread_tunables_init (void)
> +{
> +  TUNABLE_GET (mutex_spin_count, int32_t,
> +               TUNABLE_CALLBACK (set_mutex_spin_count));
> +}
> +#endif
> +
> +#endif
> diff --git a/sysdeps/nptl/dl-tunables.list b/sysdeps/nptl/dl-tunables.list
> new file mode 100644
> index 0000000..1cb52e8
> --- /dev/null
> +++ b/sysdeps/nptl/dl-tunables.list
> @@ -0,0 +1,51 @@
> +# Copyright (C) 2018 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <http://www.gnu.org/licenses/>.
> +
> +# Allowed attributes for tunables:
> +#
> +# type: Defaults to STRING
> +# minval: Optional minimum acceptable value
> +# maxval: Optional maximum acceptable value
> +# env_alias: An alias environment variable
> +# security_level: Specify security level of the tunable.  Valid values are:
> +#
> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
> +# 	     		 removed so that child processes can't read it.
> +# 	     SXID_IGNORE: Don't read for AT_SECURE binaries, but retained for
> +# 	     		  non-AT_SECURE subprocesses.
> +# 	     NONE: Read all the time.
> +
> +
> +# The maximum value of spin count is limited to 30000 to avoid the overflow
> +# of mutex->__data.__spins variable with the possible type of short in
> +# pthread_mutex_lock ().
> +#
> +# The default value of spin count is set to 100 with the reference to the
> +# previous number of times of spinning via trylock. This value would be
> +# architecture-specific and can be tuned with kinds of benchmarks to fit
> +# most cases in future.
> +
> +glibc {
> +  pthread {
> +    mutex_spin_count {
> +      type: INT_32
> +      minval: 0
> +      maxval: 30000
> +      default: 100
> +    }
> +  }
> +}
>
kemi Nov. 27, 2018, 5:01 a.m. UTC | #2
+Carlos

On 2018/11/27 上午10:43, kemi wrote:
> PING~
> 
> On 2018/11/22 上午9:45, Kemi Wang wrote:
>> This patch does not have any functionality change, we only provide a spin
>> count tunes for pthread adaptive spin mutex. The tunable
>> glibc.pthread.mutex_spin_count tunes can be used by system administrator to
>> squeeze system performance according to different hardware capabilities and
>> workload characteristics.
>>
>> The maximum value of spin count is limited to 30000 to avoid the overflow
>> of mutex->__data.__spins variable with the possible type of short in
>> pthread_mutex_lock ().
>>
>> The default value of spin count is set to 100 with the reference to the
>> previous number of times of spinning via trylock. This value would be
>> architecture-specific and can be tuned with kinds of benchmarks to fit most
>> cases in future.
>>
>>    * sysdeps/nptl/dl-tunables.list: Add glibc.pthread.mutex_spin_count entry.
>>    * manual/tunables.texi: Add glibc.pthread.mutex_spin_count description.
>>    * nptl/pthread_mutex_conf.h: New file.
>>    * nptl/pthread_mutex_conf.c: New file.
>>    * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
>>    * nptl/nptl-init.c: Put pthread mutex tunable initialization in pthread
>>      initialization.
>>    * nptl/pthreadP.h: Refine MAX_ADAPTIVE_COUNT macro definition
>>
>> I would extend my appreciation sincerely to H.J.Lu for his help to refine
>> this patch series.
>>
>> ChangeLog:
>>     V8->V9:
>> 	a) Add the "unistd.h" header file back, or it will cause build
>> 	regression on 32 bits system.
>> 	b) Rebase on the latest master branch
>> 	c) Tested on x86_64 with build-many-glibcs.py
>>
>>     V7->V8:
>> 	a) Refine the pthread tunables description in manual/tunables.texi
>> 	accordingly to Carlos O'Donell and Rical Jason.
>>
>>     V6->V7:
>>     a) Patch is refined by H.J.Lu
>>
>>     V5->V6:
>>     a) Missing "pthread mutex tunables" entry in the menu of tunables.texi,
>>     add it.
>>
>>     V4->V5
>>     a) Put mutex tunable (glibc.mutex.spin_count) initialization as part of
>>     overall pthread initialization, that would avoid the extra relocation,
>>     as suggested by Florian Weimer. Thanks for pointing it out!
>>     b) Move the READ_ONLY_SPIN macro definition from the third patch to
>>     this patch
>>
>>     V3->V4
>>     a) Add comments in elf/dl-tunables.list
>>
>>     V2->V3
>>     a) Polish the description of glibc.mutex.spin_count tunable with the
>>     help from Rical Jasan.
>>     b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
>>     pthread_mutex_conf.c, as suggested by Florian Weimer.
>>     c) Adjust the default value of spin count to 100 with the reference of
>>     the previous spinning way via trylock.
>>
>>     V1->V2
>>     a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
>>     b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
>>     c) Change the Makefile to compile pthread_mutex_conf.c
>>     d) Modify the copyright "2013-2018" -> "2018" for new added files
>>     e) Fix the indentation issue (tab -> double space) in
>>     elf/dl-tunables.list
>>     f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
>>     g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
>>     description in manual/tunables.texi.
>>     h) Fix the indentation issue in nptl/pthread_mutex_conf.c
>>     i) Fix the indentation issue for nested preprocessor (add one space for
>>     each level)
>>
>> Suggested-by: Andi Kleen <andi.kleen@intel.com>
>> Signed-off-by: Kemi.wang <kemi.wang@intel.com>
>> ---
>>  manual/tunables.texi          | 27 +++++++++++++++++++++++
>>  nptl/Makefile                 |  2 +-
>>  nptl/nptl-init.c              |  5 +++++
>>  nptl/pthreadP.h               | 24 +++++++++++++++-----
>>  nptl/pthread_mutex_conf.c     | 27 +++++++++++++++++++++++
>>  nptl/pthread_mutex_conf.h     | 45 ++++++++++++++++++++++++++++++++++++++
>>  sysdeps/nptl/dl-tunables.list | 51 +++++++++++++++++++++++++++++++++++++++++++
>>  7 files changed, 174 insertions(+), 7 deletions(-)
>>  create mode 100644 nptl/pthread_mutex_conf.c
>>  create mode 100644 nptl/pthread_mutex_conf.h
>>  create mode 100644 sysdeps/nptl/dl-tunables.list
>>
>> diff --git a/manual/tunables.texi b/manual/tunables.texi
>> index 3345a23..33fbe1e 100644
>> --- a/manual/tunables.texi
>> +++ b/manual/tunables.texi
>> @@ -32,6 +32,7 @@ their own namespace.
>>  * Tunable names::  The structure of a tunable name
>>  * Memory Allocation Tunables::  Tunables in the memory allocation subsystem
>>  * Elision Tunables::  Tunables in elision subsystem
>> +* POSIX Thread Tunables:: Tunables in the POSIX thread subsystem
>>  * Hardware Capability Tunables::  Tunables that modify the hardware
>>  				  capabilities seen by @theglibc{}
>>  @end menu
>> @@ -281,6 +282,32 @@ of try lock attempts.
>>  The default value of this tunable is @samp{3}.
>>  @end deftp
>>  
>> +@node POSIX Thread Tunables
>> +@section POSIX Thread Tunables
>> +@cindex pthread mutex tunables
>> +@cindex thread mutex tunables
>> +@cindex mutex tunables
>> +@cindex tunables thread mutex
>> +
>> +@deftp {Tunable namespace} glibc.pthread
>> +The behavior of POSIX thread can be tuned to gain performance improvements
>> +according to specific hardware capabilities and workload characteristics by
>> +setting the following tunables in the @code{pthread} namespace:
>> +@end deftp
>> +
>> +@deftp Tunable glibc.pthread.mutex_spin_count
>> +The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
>> +a thread should spin on the lock before calling into the kernel to block.
>> +Adaptive spin is used for mutexes initialized with the
>> +@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension.  It affects both
>> +@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.
>> +
>> +The spinning is done until either the maximum spin count is reached or
>> +the lock is acquired.
>> +
>> +The default value of this tunable is @samp{100}.
>> +@end deftp
>> +
>>  @node Hardware Capability Tunables
>>  @section Hardware Capability Tunables
>>  @cindex hardware capability tunables
>> diff --git a/nptl/Makefile b/nptl/Makefile
>> index 49b6faa..284f815 100644
>> --- a/nptl/Makefile
>> +++ b/nptl/Makefile
>> @@ -145,7 +145,7 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
>>  		      mtx_destroy mtx_init mtx_lock mtx_timedlock \
>>  		      mtx_trylock mtx_unlock call_once cnd_broadcast \
>>  		      cnd_destroy cnd_init cnd_signal cnd_timedwait cnd_wait \
>> -		      tss_create tss_delete tss_get tss_set
>> +		      tss_create tss_delete tss_get tss_set pthread_mutex_conf
>>  #		      pthread_setuid pthread_seteuid pthread_setreuid \
>>  #		      pthread_setresuid \
>>  #		      pthread_setgid pthread_setegid pthread_setregid \
>> diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
>> index 907411d..adf99f1 100644
>> --- a/nptl/nptl-init.c
>> +++ b/nptl/nptl-init.c
>> @@ -38,6 +38,7 @@
>>  #include <kernel-features.h>
>>  #include <libc-pointer-arith.h>
>>  #include <pthread-pids.h>
>> +#include <pthread_mutex_conf.h>
>>  
>>  #ifndef TLS_MULTIPLE_THREADS_IN_TCB
>>  /* Pointer to the corresponding variable in libc.  */
>> @@ -431,6 +432,10 @@ __pthread_initialize_minimal_internal (void)
>>  
>>    /* Determine whether the machine is SMP or not.  */
>>    __is_smp = is_smp_system ();
>> +
>> +#if HAVE_TUNABLES
>> +  pthread_tunables_init ();
>> +#endif
>>  }
>>  strong_alias (__pthread_initialize_minimal_internal,
>>  	      __pthread_initialize_minimal)
>> diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
>> index 19efe1e..9bae78e 100644
>> --- a/nptl/pthreadP.h
>> +++ b/nptl/pthreadP.h
>> @@ -47,12 +47,6 @@
>>  #endif
>>  
>>  
>> -/* Adaptive mutex definitions.  */
>> -#ifndef MAX_ADAPTIVE_COUNT
>> -# define MAX_ADAPTIVE_COUNT 100
>> -#endif
>> -
>> -
>>  /* Magic cookie representing robust mutex with dead owner.  */
>>  #define PTHREAD_MUTEX_INCONSISTENT	INT_MAX
>>  /* Magic cookie representing not recoverable robust mutex.  */
>> @@ -188,6 +182,24 @@ enum
>>  #define __PTHREAD_COND_SHARED_MASK 1
>>  
>>  
>> +#if HAVE_TUNABLES
>> +struct mutex_config
>> +{
>> +  int spin_count;
>> +};
>> +
>> +extern struct mutex_config __mutex_aconf;
>> +#endif
>> +
>> +/* Adaptive mutex definitions.  */
>> +#ifndef MAX_ADAPTIVE_COUNT
>> +# if HAVE_TUNABLES
>> +#  define MAX_ADAPTIVE_COUNT __mutex_aconf.spin_count
>> +# else
>> +#  define MAX_ADAPTIVE_COUNT 100
>> +# endif
>> +#endif
>> +
>>  /* Internal variables.  */
>>  
>>  
>> diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
>> new file mode 100644
>> index 0000000..0da4afe
>> --- /dev/null
>> +++ b/nptl/pthread_mutex_conf.c
>> @@ -0,0 +1,27 @@
>> +/* pthread_mutex_conf.c: Pthread mutex tunable parameters.
>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#if HAVE_TUNABLES
>> +#include <pthreadP.h>
>> +struct mutex_config __mutex_aconf =
>> +{
>> +  /* The maximum number of times a thread should spin on the lock before
>> +  calling into kernel to block.  */
>> +  .spin_count = 100,
>> +};
>> +#endif
>> diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h
>> new file mode 100644
>> index 0000000..97ff98e
>> --- /dev/null
>> +++ b/nptl/pthread_mutex_conf.h
>> @@ -0,0 +1,45 @@
>> +/* pthread_mutex_conf.h: Pthread mutex tunable parameters.
>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +#ifndef _PTHREAD_MUTEX_CONF_H
>> +#define _PTHREAD_MUTEX_CONF_H 1
>> +
>> +#if HAVE_TUNABLES
>> +# define TUNABLE_NAMESPACE pthread
>> +# include <stdbool.h>
>> +# include <stdint.h>
>> +# include <pthreadP.h>
>> +# include <pthread_mutex_conf.h>
>> +# include <unistd.h>  /* Get STDOUT_FILENO for _dl_printf.  */
>> +# include <elf/dl-tunables.h>
>> +
>> +extern struct mutex_config __mutex_aconf attribute_hidden;
>> +
>> +static void
>> +TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
>> +{
>> +  __mutex_aconf.spin_count = (int32_t) (valp)->numval;
>> +}
>> +
>> +static void pthread_tunables_init (void)
>> +{
>> +  TUNABLE_GET (mutex_spin_count, int32_t,
>> +               TUNABLE_CALLBACK (set_mutex_spin_count));
>> +}
>> +#endif
>> +
>> +#endif
>> diff --git a/sysdeps/nptl/dl-tunables.list b/sysdeps/nptl/dl-tunables.list
>> new file mode 100644
>> index 0000000..1cb52e8
>> --- /dev/null
>> +++ b/sysdeps/nptl/dl-tunables.list
>> @@ -0,0 +1,51 @@
>> +# Copyright (C) 2018 Free Software Foundation, Inc.
>> +# This file is part of the GNU C Library.
>> +
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +# Lesser General Public License for more details.
>> +
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# <http://www.gnu.org/licenses/>.
>> +
>> +# Allowed attributes for tunables:
>> +#
>> +# type: Defaults to STRING
>> +# minval: Optional minimum acceptable value
>> +# maxval: Optional maximum acceptable value
>> +# env_alias: An alias environment variable
>> +# security_level: Specify security level of the tunable.  Valid values are:
>> +#
>> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
>> +# 	     		 removed so that child processes can't read it.
>> +# 	     SXID_IGNORE: Don't read for AT_SECURE binaries, but retained for
>> +# 	     		  non-AT_SECURE subprocesses.
>> +# 	     NONE: Read all the time.
>> +
>> +
>> +# The maximum value of spin count is limited to 30000 to avoid the overflow
>> +# of mutex->__data.__spins variable with the possible type of short in
>> +# pthread_mutex_lock ().
>> +#
>> +# The default value of spin count is set to 100 with the reference to the
>> +# previous number of times of spinning via trylock. This value would be
>> +# architecture-specific and can be tuned with kinds of benchmarks to fit
>> +# most cases in future.
>> +
>> +glibc {
>> +  pthread {
>> +    mutex_spin_count {
>> +      type: INT_32
>> +      minval: 0
>> +      maxval: 30000
>> +      default: 100
>> +    }
>> +  }
>> +}
>>
Adhemerval Zanella Nov. 27, 2018, 8:14 p.m. UTC | #3
On 21/11/2018 23:45, Kemi Wang wrote:
> This patch does not have any functionality change, we only provide a spin
> count tunes for pthread adaptive spin mutex. The tunable
> glibc.pthread.mutex_spin_count tunes can be used by system administrator to
> squeeze system performance according to different hardware capabilities and
> workload characteristics.
> 
> The maximum value of spin count is limited to 30000 to avoid the overflow
> of mutex->__data.__spins variable with the possible type of short in
> pthread_mutex_lock ().

With current algorithm nptl/pthread_mutex_{timed}lock.c, I think we can use 
the full short range (32767) since assuming maximum spin_nop usage with 'cnt'
being 'max_cnt - 1' (the fast atomic path) it will converge to 32719.

> 
> The default value of spin count is set to 100 with the reference to the
> previous number of times of spinning via trylock. This value would be
> architecture-specific and can be tuned with kinds of benchmarks to fit most
> cases in future.
> 
>    * sysdeps/nptl/dl-tunables.list: Add glibc.pthread.mutex_spin_count entry.
>    * manual/tunables.texi: Add glibc.pthread.mutex_spin_count description.
>    * nptl/pthread_mutex_conf.h: New file.
>    * nptl/pthread_mutex_conf.c: New file.
>    * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
>    * nptl/nptl-init.c: Put pthread mutex tunable initialization in pthread
>      initialization.
>    * nptl/pthreadP.h: Refine MAX_ADAPTIVE_COUNT macro definition
> 
> I would extend my appreciation sincerely to H.J.Lu for his help to refine
> this patch series.
> 
> ChangeLog:
>     V8->V9:
> 	a) Add the "unistd.h" header file back, or it will cause build
> 	regression on 32 bits system.
> 	b) Rebase on the latest master branch
> 	c) Tested on x86_64 with build-many-glibcs.py
> 
>     V7->V8:
> 	a) Refine the pthread tunables description in manual/tunables.texi
> 	accordingly to Carlos O'Donell and Rical Jason.
> 
>     V6->V7:
>     a) Patch is refined by H.J.Lu
> 
>     V5->V6:
>     a) Missing "pthread mutex tunables" entry in the menu of tunables.texi,
>     add it.
> 
>     V4->V5
>     a) Put mutex tunable (glibc.mutex.spin_count) initialization as part of
>     overall pthread initialization, that would avoid the extra relocation,
>     as suggested by Florian Weimer. Thanks for pointing it out!
>     b) Move the READ_ONLY_SPIN macro definition from the third patch to
>     this patch
> 
>     V3->V4
>     a) Add comments in elf/dl-tunables.list
> 
>     V2->V3
>     a) Polish the description of glibc.mutex.spin_count tunable with the
>     help from Rical Jasan.
>     b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
>     pthread_mutex_conf.c, as suggested by Florian Weimer.
>     c) Adjust the default value of spin count to 100 with the reference of
>     the previous spinning way via trylock.
> 
>     V1->V2
>     a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
>     b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
>     c) Change the Makefile to compile pthread_mutex_conf.c
>     d) Modify the copyright "2013-2018" -> "2018" for new added files
>     e) Fix the indentation issue (tab -> double space) in
>     elf/dl-tunables.list
>     f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
>     g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
>     description in manual/tunables.texi.
>     h) Fix the indentation issue in nptl/pthread_mutex_conf.c
>     i) Fix the indentation issue for nested preprocessor (add one space for
>     each level)
> 
> Suggested-by: Andi Kleen <andi.kleen@intel.com>
> Signed-off-by: Kemi.wang <kemi.wang@intel.com>
> ---
>  manual/tunables.texi          | 27 +++++++++++++++++++++++
>  nptl/Makefile                 |  2 +-
>  nptl/nptl-init.c              |  5 +++++
>  nptl/pthreadP.h               | 24 +++++++++++++++-----
>  nptl/pthread_mutex_conf.c     | 27 +++++++++++++++++++++++
>  nptl/pthread_mutex_conf.h     | 45 ++++++++++++++++++++++++++++++++++++++
>  sysdeps/nptl/dl-tunables.list | 51 +++++++++++++++++++++++++++++++++++++++++++
>  7 files changed, 174 insertions(+), 7 deletions(-)
>  create mode 100644 nptl/pthread_mutex_conf.c
>  create mode 100644 nptl/pthread_mutex_conf.h
>  create mode 100644 sysdeps/nptl/dl-tunables.list
> 
> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index 3345a23..33fbe1e 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -32,6 +32,7 @@ their own namespace.
>  * Tunable names::  The structure of a tunable name
>  * Memory Allocation Tunables::  Tunables in the memory allocation subsystem
>  * Elision Tunables::  Tunables in elision subsystem
> +* POSIX Thread Tunables:: Tunables in the POSIX thread subsystem
>  * Hardware Capability Tunables::  Tunables that modify the hardware
>  				  capabilities seen by @theglibc{}
>  @end menu
> @@ -281,6 +282,32 @@ of try lock attempts.
>  The default value of this tunable is @samp{3}.
>  @end deftp
>  
> +@node POSIX Thread Tunables
> +@section POSIX Thread Tunables
> +@cindex pthread mutex tunables
> +@cindex thread mutex tunables
> +@cindex mutex tunables
> +@cindex tunables thread mutex
> +
> +@deftp {Tunable namespace} glibc.pthread
> +The behavior of POSIX thread can be tuned to gain performance improvements
> +according to specific hardware capabilities and workload characteristics by
> +setting the following tunables in the @code{pthread} namespace:
> +@end deftp
> +
> +@deftp Tunable glibc.pthread.mutex_spin_count
> +The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
> +a thread should spin on the lock before calling into the kernel to block.
> +Adaptive spin is used for mutexes initialized with the
> +@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension.  It affects both
> +@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.
> +
> +The spinning is done until either the maximum spin count is reached or
> +the lock is acquired.
> +
> +The default value of this tunable is @samp{100}.
> +@end deftp
> +
>  @node Hardware Capability Tunables
>  @section Hardware Capability Tunables
>  @cindex hardware capability tunables

Ok.

> diff --git a/nptl/Makefile b/nptl/Makefile
> index 49b6faa..284f815 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -145,7 +145,7 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
>  		      mtx_destroy mtx_init mtx_lock mtx_timedlock \
>  		      mtx_trylock mtx_unlock call_once cnd_broadcast \
>  		      cnd_destroy cnd_init cnd_signal cnd_timedwait cnd_wait \
> -		      tss_create tss_delete tss_get tss_set
> +		      tss_create tss_delete tss_get tss_set pthread_mutex_conf
>  #		      pthread_setuid pthread_seteuid pthread_setreuid \
>  #		      pthread_setresuid \
>  #		      pthread_setgid pthread_setegid pthread_setregid \

Ok.

> diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
> index 907411d..adf99f1 100644
> --- a/nptl/nptl-init.c
> +++ b/nptl/nptl-init.c
> @@ -38,6 +38,7 @@
>  #include <kernel-features.h>
>  #include <libc-pointer-arith.h>
>  #include <pthread-pids.h>
> +#include <pthread_mutex_conf.h>
>  
>  #ifndef TLS_MULTIPLE_THREADS_IN_TCB
>  /* Pointer to the corresponding variable in libc.  */
> @@ -431,6 +432,10 @@ __pthread_initialize_minimal_internal (void)
>  
>    /* Determine whether the machine is SMP or not.  */
>    __is_smp = is_smp_system ();
> +
> +#if HAVE_TUNABLES
> +  pthread_tunables_init ();
> +#endif
>  }
>  strong_alias (__pthread_initialize_minimal_internal,
>  	      __pthread_initialize_minimal)

Ok. 

> diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
> index 19efe1e..9bae78e 100644
> --- a/nptl/pthreadP.h
> +++ b/nptl/pthreadP.h
> @@ -47,12 +47,6 @@
>  #endif
>  
>  
> -/* Adaptive mutex definitions.  */
> -#ifndef MAX_ADAPTIVE_COUNT
> -# define MAX_ADAPTIVE_COUNT 100
> -#endif
> -
> -
>  /* Magic cookie representing robust mutex with dead owner.  */
>  #define PTHREAD_MUTEX_INCONSISTENT	INT_MAX
>  /* Magic cookie representing not recoverable robust mutex.  */
> @@ -188,6 +182,24 @@ enum
>  #define __PTHREAD_COND_SHARED_MASK 1
>  
>  
> +#if HAVE_TUNABLES
> +struct mutex_config
> +{
> +  int spin_count;
> +};
> +
> +extern struct mutex_config __mutex_aconf;
> +#endif
> +
> +/* Adaptive mutex definitions.  */
> +#ifndef MAX_ADAPTIVE_COUNT
> +# if HAVE_TUNABLES
> +#  define MAX_ADAPTIVE_COUNT __mutex_aconf.spin_count
> +# else
> +#  define MAX_ADAPTIVE_COUNT 100
> +# endif
> +#endif
> +
>  /* Internal variables.  */
>  

We try to avoid the logic to check if a define is already define to set a
default value.  If the idea is on future patches to define an arch-specific
value, the preferred strategy is to define the generic and an arch-specific
value in their own headers.

>  
> diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
> new file mode 100644
> index 0000000..0da4afe
> --- /dev/null
> +++ b/nptl/pthread_mutex_conf.c
> @@ -0,0 +1,27 @@
> +/* pthread_mutex_conf.c: Pthread mutex tunable parameters.

It is not a rule, but there is no need to duplicate the file name on its
description.

> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#if HAVE_TUNABLES
> +#include <pthreadP.h>
> +struct mutex_config __mutex_aconf =
> +{
> +  /* The maximum number of times a thread should spin on the lock before
> +  calling into kernel to block.  */
> +  .spin_count = 100,
> +};
> +#endif

If the idea is to keep both non-tunable version and tunable version in sync
when default value change, I would suggest to use an extra define to hold
the '100' value.

Another option, which I think it better, is to avoid make MAX_ADAPTIVE_COUNT
have different meaning depending if tunables is enabled or not and just
create a inline function:

nptl/pthreadP.h:

#define MAX_DEFAULT_ADAPTIVE_COUNT 100

static inline short max_adaptive_count (void)
{
#if HAVE_TUNABLES
  return __mutex_aconf.spin_count;
#else
  return MAX_DEFAULT_ADAPTIVE_COUNT;
#endif
}

And adjust nptl/pthread_mutex_conf.c, nptl/pthread_mutex_lock.c, and
nptl/pthread_mutex_timedlock.c accordingly.

Also, if the idea is to provide a different generic value depending
of the architecture it is a matter to move the MAX_DEFAULT_ADAPTIVE_COUNT
to a generic header and override the header by the architecture.

> diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h
> new file mode 100644
> index 0000000..97ff98e
> --- /dev/null
> +++ b/nptl/pthread_mutex_conf.h
> @@ -0,0 +1,45 @@
> +/* pthread_mutex_conf.h: Pthread mutex tunable parameters.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +#ifndef _PTHREAD_MUTEX_CONF_H
> +#define _PTHREAD_MUTEX_CONF_H 1
> +
> +#if HAVE_TUNABLES
> +# define TUNABLE_NAMESPACE pthread
> +# include <stdbool.h>
> +# include <stdint.h>
> +# include <pthreadP.h>
> +# include <pthread_mutex_conf.h>
> +# include <unistd.h>  /* Get STDOUT_FILENO for _dl_printf.  */
> +# include <elf/dl-tunables.h>

Mostly of the includes seems superfluous, I think only pthread_mutex_conf.h
and elf/dl-tunables.h are really required.

> +
> +extern struct mutex_config __mutex_aconf attribute_hidden;
> +
> +static void
> +TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
> +{
> +  __mutex_aconf.spin_count = (int32_t) (valp)->numval;
> +}
> +
> +static void pthread_tunables_init (void)
> +{
> +  TUNABLE_GET (mutex_spin_count, int32_t,
> +               TUNABLE_CALLBACK (set_mutex_spin_count));
> +}
> +#endif
> +
> +#endif
> diff --git a/sysdeps/nptl/dl-tunables.list b/sysdeps/nptl/dl-tunables.list
> new file mode 100644
> index 0000000..1cb52e8
> --- /dev/null
> +++ b/sysdeps/nptl/dl-tunables.list
> @@ -0,0 +1,51 @@
> +# Copyright (C) 2018 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <http://www.gnu.org/licenses/>.
> +
> +# Allowed attributes for tunables:
> +#
> +# type: Defaults to STRING
> +# minval: Optional minimum acceptable value
> +# maxval: Optional maximum acceptable value
> +# env_alias: An alias environment variable
> +# security_level: Specify security level of the tunable.  Valid values are:
> +#
> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
> +# 	     		 removed so that child processes can't read it.
> +# 	     SXID_IGNORE: Don't read for AT_SECURE binaries, but retained for
> +# 	     		  non-AT_SECURE subprocesses.
> +# 	     NONE: Read all the time.

Do we need to duplicate the same comments from 'elf/dl-tunables.list'?

> +
> +
> +# The maximum value of spin count is limited to 30000 to avoid the overflow
> +# of mutex->__data.__spins variable with the possible type of short in
> +# pthread_mutex_lock ().
> +#
> +# The default value of spin count is set to 100 with the reference to the
> +# previous number of times of spinning via trylock. This value would be
> +# architecture-specific and can be tuned with kinds of benchmarks to fit
> +# most cases in future.
> +
> +glibc {
> +  pthread {
> +    mutex_spin_count {
> +      type: INT_32
> +      minval: 0
> +      maxval: 30000
> +      default: 100
> +    }
> +  }
> +}
Carlos O'Donell Nov. 27, 2018, 8:51 p.m. UTC | #4
On 11/27/18 12:01 AM, kemi wrote:
> +Carlos

Are there any outstanding comments regarding this code? You've seen a lot
of reviews and I have not double checked that outstanding comments do not
remain from v1->v9.

My own comments are below. I think if you address those we're ready to
commit these changes. I would like to see us move on more of these kinds
of tunables which expose previously uncontrollable options for specific
workloads. Defaults should still be useful, and I'd like to hear your input
on wether you think 100 is a useful defaul.

Please post v10. Thank you!

> On 2018/11/27 上午10:43, kemi wrote:
>> PING~
>>
>> On 2018/11/22 上午9:45, Kemi Wang wrote:
>>> This patch does not have any functionality change, we only provide a spin
>>> count tunes for pthread adaptive spin mutex. The tunable
>>> glibc.pthread.mutex_spin_count tunes can be used by system administrator to
>>> squeeze system performance according to different hardware capabilities and
>>> workload characteristics.
>>>
>>> The maximum value of spin count is limited to 30000 to avoid the overflow
>>> of mutex->__data.__spins variable with the possible type of short in
>>> pthread_mutex_lock ().
>>>
>>> The default value of spin count is set to 100 with the reference to the
>>> previous number of times of spinning via trylock. This value would be
>>> architecture-specific and can be tuned with kinds of benchmarks to fit most
>>> cases in future.
>>>
>>>    * sysdeps/nptl/dl-tunables.list: Add glibc.pthread.mutex_spin_count entry.
>>>    * manual/tunables.texi: Add glibc.pthread.mutex_spin_count description.
>>>    * nptl/pthread_mutex_conf.h: New file.
>>>    * nptl/pthread_mutex_conf.c: New file.
>>>    * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
>>>    * nptl/nptl-init.c: Put pthread mutex tunable initialization in pthread
>>>      initialization.
>>>    * nptl/pthreadP.h: Refine MAX_ADAPTIVE_COUNT macro definition
>>>
>>> I would extend my appreciation sincerely to H.J.Lu for his help to refine
>>> this patch series.
>>>
>>> ChangeLog:
>>>     V8->V9:
>>> 	a) Add the "unistd.h" header file back, or it will cause build
>>> 	regression on 32 bits system.
>>> 	b) Rebase on the latest master branch
>>> 	c) Tested on x86_64 with build-many-glibcs.py
>>>
>>>     V7->V8:
>>> 	a) Refine the pthread tunables description in manual/tunables.texi
>>> 	accordingly to Carlos O'Donell and Rical Jason.
>>>
>>>     V6->V7:
>>>     a) Patch is refined by H.J.Lu
>>>
>>>     V5->V6:
>>>     a) Missing "pthread mutex tunables" entry in the menu of tunables.texi,
>>>     add it.
>>>
>>>     V4->V5
>>>     a) Put mutex tunable (glibc.mutex.spin_count) initialization as part of
>>>     overall pthread initialization, that would avoid the extra relocation,
>>>     as suggested by Florian Weimer. Thanks for pointing it out!
>>>     b) Move the READ_ONLY_SPIN macro definition from the third patch to
>>>     this patch
>>>
>>>     V3->V4
>>>     a) Add comments in elf/dl-tunables.list
>>>
>>>     V2->V3
>>>     a) Polish the description of glibc.mutex.spin_count tunable with the
>>>     help from Rical Jasan.
>>>     b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
>>>     pthread_mutex_conf.c, as suggested by Florian Weimer.
>>>     c) Adjust the default value of spin count to 100 with the reference of
>>>     the previous spinning way via trylock.
>>>
>>>     V1->V2
>>>     a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
>>>     b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
>>>     c) Change the Makefile to compile pthread_mutex_conf.c
>>>     d) Modify the copyright "2013-2018" -> "2018" for new added files
>>>     e) Fix the indentation issue (tab -> double space) in
>>>     elf/dl-tunables.list
>>>     f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
>>>     g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
>>>     description in manual/tunables.texi.
>>>     h) Fix the indentation issue in nptl/pthread_mutex_conf.c
>>>     i) Fix the indentation issue for nested preprocessor (add one space for
>>>     each level)
>>>
>>> Suggested-by: Andi Kleen <andi.kleen@intel.com>
>>> Signed-off-by: Kemi.wang <kemi.wang@intel.com>
>>> ---
>>>  manual/tunables.texi          | 27 +++++++++++++++++++++++
>>>  nptl/Makefile                 |  2 +-
>>>  nptl/nptl-init.c              |  5 +++++
>>>  nptl/pthreadP.h               | 24 +++++++++++++++-----
>>>  nptl/pthread_mutex_conf.c     | 27 +++++++++++++++++++++++
>>>  nptl/pthread_mutex_conf.h     | 45 ++++++++++++++++++++++++++++++++++++++
>>>  sysdeps/nptl/dl-tunables.list | 51 +++++++++++++++++++++++++++++++++++++++++++
>>>  7 files changed, 174 insertions(+), 7 deletions(-)
>>>  create mode 100644 nptl/pthread_mutex_conf.c
>>>  create mode 100644 nptl/pthread_mutex_conf.h
>>>  create mode 100644 sysdeps/nptl/dl-tunables.list
>>>
>>> diff --git a/manual/tunables.texi b/manual/tunables.texi
>>> index 3345a23..33fbe1e 100644
>>> --- a/manual/tunables.texi
>>> +++ b/manual/tunables.texi
>>> @@ -32,6 +32,7 @@ their own namespace.
>>>  * Tunable names::  The structure of a tunable name
>>>  * Memory Allocation Tunables::  Tunables in the memory allocation subsystem
>>>  * Elision Tunables::  Tunables in elision subsystem
>>> +* POSIX Thread Tunables:: Tunables in the POSIX thread subsystem

OK.

>>>  * Hardware Capability Tunables::  Tunables that modify the hardware
>>>  				  capabilities seen by @theglibc{}
>>>  @end menu
>>> @@ -281,6 +282,32 @@ of try lock attempts.
>>>  The default value of this tunable is @samp{3}.
>>>  @end deftp
>>>  
>>> +@node POSIX Thread Tunables
>>> +@section POSIX Thread Tunables
>>> +@cindex pthread mutex tunables
>>> +@cindex thread mutex tunables
>>> +@cindex mutex tunables
>>> +@cindex tunables thread mutex
>>> +
>>> +@deftp {Tunable namespace} glibc.pthread
>>> +The behavior of POSIX thread can be tuned to gain performance improvements

s/of POSIX thread/of POSIX threads/g

>>> +according to specific hardware capabilities and workload characteristics by
>>> +setting the following tunables in the @code{pthread} namespace:
>>> +@end deftp
>>> +
>>> +@deftp Tunable glibc.pthread.mutex_spin_count
>>> +The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
>>> +a thread should spin on the lock before calling into the kernel to block.
>>> +Adaptive spin is used for mutexes initialized with the
>>> +@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension.  It affects both
>>> +@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.

OK.

>>> +
>>> +The spinning is done until either the maximum spin count is reached or

s/The spinning is done/The thread spins/g

>>> +the lock is acquired.

OK.

>>> +
>>> +The default value of this tunable is @samp{100}.
>>> +@end deftp
>>> +
>>>  @node Hardware Capability Tunables
>>>  @section Hardware Capability Tunables
>>>  @cindex hardware capability tunables
>>> diff --git a/nptl/Makefile b/nptl/Makefile
>>> index 49b6faa..284f815 100644
>>> --- a/nptl/Makefile
>>> +++ b/nptl/Makefile
>>> @@ -145,7 +145,7 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
>>>  		      mtx_destroy mtx_init mtx_lock mtx_timedlock \
>>>  		      mtx_trylock mtx_unlock call_once cnd_broadcast \
>>>  		      cnd_destroy cnd_init cnd_signal cnd_timedwait cnd_wait \
>>> -		      tss_create tss_delete tss_get tss_set
>>> +		      tss_create tss_delete tss_get tss_set pthread_mutex_conf

OK.

>>>  #		      pthread_setuid pthread_seteuid pthread_setreuid \
>>>  #		      pthread_setresuid \
>>>  #		      pthread_setgid pthread_setegid pthread_setregid \
>>> diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
>>> index 907411d..adf99f1 100644
>>> --- a/nptl/nptl-init.c
>>> +++ b/nptl/nptl-init.c
>>> @@ -38,6 +38,7 @@
>>>  #include <kernel-features.h>
>>>  #include <libc-pointer-arith.h>
>>>  #include <pthread-pids.h>
>>> +#include <pthread_mutex_conf.h>

OK.

>>>  
>>>  #ifndef TLS_MULTIPLE_THREADS_IN_TCB
>>>  /* Pointer to the corresponding variable in libc.  */
>>> @@ -431,6 +432,10 @@ __pthread_initialize_minimal_internal (void)
>>>  
>>>    /* Determine whether the machine is SMP or not.  */
>>>    __is_smp = is_smp_system ();
>>> +
>>> +#if HAVE_TUNABLES
>>> +  pthread_tunables_init ();
>>> +#endif

OK.

>>>  }
>>>  strong_alias (__pthread_initialize_minimal_internal,
>>>  	      __pthread_initialize_minimal)
>>> diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
>>> index 19efe1e..9bae78e 100644
>>> --- a/nptl/pthreadP.h
>>> +++ b/nptl/pthreadP.h
>>> @@ -47,12 +47,6 @@
>>>  #endif
>>>  
>>>  
>>> -/* Adaptive mutex definitions.  */
>>> -#ifndef MAX_ADAPTIVE_COUNT
>>> -# define MAX_ADAPTIVE_COUNT 100
>>> -#endif
>>> -
>>> -
>>>  /* Magic cookie representing robust mutex with dead owner.  */
>>>  #define PTHREAD_MUTEX_INCONSISTENT	INT_MAX
>>>  /* Magic cookie representing not recoverable robust mutex.  */
>>> @@ -188,6 +182,24 @@ enum
>>>  #define __PTHREAD_COND_SHARED_MASK 1
>>>  
>>>  
>>> +#if HAVE_TUNABLES
>>> +struct mutex_config
>>> +{
>>> +  int spin_count;
>>> +};

OK.

>>> +
>>> +extern struct mutex_config __mutex_aconf;

OK.

>>> +#endif
>>> +
>>> +/* Adaptive mutex definitions.  */
>>> +#ifndef MAX_ADAPTIVE_COUNT
>>> +# if HAVE_TUNABLES
>>> +#  define MAX_ADAPTIVE_COUNT __mutex_aconf.spin_count
>>> +# else
>>> +#  define MAX_ADAPTIVE_COUNT 100

Please define a DEFAULT_ADAPTIVE_COUNT to 100 and use it later
in pthread_mutex_conf.c.

Please add a comment to DEFAULT_ADAPTIVE_COUNT, since you
touched the code we should comment it a bit more:

/* The choice of 100 spins for the default spin count for
   an adaptive spin is a completely arbitrary choice that
   has not been evaluated thoroughly using modern hardware.  */

>>> +# endif
>>> +#endif
>>> +
>>>  /* Internal variables.  */
>>>  
>>>  
>>> diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
>>> new file mode 100644
>>> index 0000000..0da4afe
>>> --- /dev/null
>>> +++ b/nptl/pthread_mutex_conf.c
>>> @@ -0,0 +1,27 @@
>>> +/* pthread_mutex_conf.c: Pthread mutex tunable parameters.

Use just "pthread mutex tunable parameters."

>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +#if HAVE_TUNABLES
>>> +#include <pthreadP.h>
>>> +struct mutex_config __mutex_aconf =
>>> +{
>>> +  /* The maximum number of times a thread should spin on the lock before
>>> +  calling into kernel to block.  */
>>> +  .spin_count = 100,

Should use DEFAULT_ADAPTIVE_COUNT instead of repeating the 100.

>>> +};
>>> +#endif
>>> diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h
>>> new file mode 100644
>>> index 0000000..97ff98e
>>> --- /dev/null
>>> +++ b/nptl/pthread_mutex_conf.h
>>> @@ -0,0 +1,45 @@
>>> +/* pthread_mutex_conf.h: Pthread mutex tunable parameters.

Use just "pthread mutex tunable parameters."

>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +#ifndef _PTHREAD_MUTEX_CONF_H
>>> +#define _PTHREAD_MUTEX_CONF_H 1
>>> +
>>> +#if HAVE_TUNABLES
>>> +# define TUNABLE_NAMESPACE pthread
>>> +# include <stdbool.h>
>>> +# include <stdint.h>
>>> +# include <pthreadP.h>
>>> +# include <pthread_mutex_conf.h>
>>> +# include <unistd.h>  /* Get STDOUT_FILENO for _dl_printf.  */
>>> +# include <elf/dl-tunables.h>
>>> +
>>> +extern struct mutex_config __mutex_aconf attribute_hidden;
>>> +
>>> +static void
>>> +TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
>>> +{
>>> +  __mutex_aconf.spin_count = (int32_t) (valp)->numval;
>>> +}
>>> +
>>> +static void pthread_tunables_init (void)
>>> +{
>>> +  TUNABLE_GET (mutex_spin_count, int32_t,
>>> +               TUNABLE_CALLBACK (set_mutex_spin_count));
>>> +}
>>> +#endif
>>> +
>>> +#endif

OK.

>>> diff --git a/sysdeps/nptl/dl-tunables.list b/sysdeps/nptl/dl-tunables.list
>>> new file mode 100644
>>> index 0000000..1cb52e8
>>> --- /dev/null
>>> +++ b/sysdeps/nptl/dl-tunables.list
>>> @@ -0,0 +1,51 @@
>>> +# Copyright (C) 2018 Free Software Foundation, Inc.
>>> +# This file is part of the GNU C Library.
>>> +
>>> +# The GNU C Library is free software; you can redistribute it and/or
>>> +# modify it under the terms of the GNU Lesser General Public
>>> +# License as published by the Free Software Foundation; either
>>> +# version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +# The GNU C Library is distributed in the hope that it will be useful,
>>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +# Lesser General Public License for more details.
>>> +
>>> +# You should have received a copy of the GNU Lesser General Public
>>> +# License along with the GNU C Library; if not, see
>>> +# <http://www.gnu.org/licenses/>.
>>> +
>>> +# Allowed attributes for tunables:
>>> +#
>>> +# type: Defaults to STRING
>>> +# minval: Optional minimum acceptable value
>>> +# maxval: Optional maximum acceptable value
>>> +# env_alias: An alias environment variable
>>> +# security_level: Specify security level of the tunable.  Valid values are:
>>> +#
>>> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
>>> +# 	     		 removed so that child processes can't read it.
>>> +# 	     SXID_IGNORE: Don't read for AT_SECURE binaries, but retained for
>>> +# 	     		  non-AT_SECURE subprocesses.
>>> +# 	     NONE: Read all the time.
>>> +
>>> +
>>> +# The maximum value of spin count is limited to 30000 to avoid the overflow
>>> +# of mutex->__data.__spins variable with the possible type of short in
>>> +# pthread_mutex_lock ().
>>> +#
>>> +# The default value of spin count is set to 100 with the reference to the
>>> +# previous number of times of spinning via trylock. This value would be
>>> +# architecture-specific and can be tuned with kinds of benchmarks to fit
>>> +# most cases in future.
>>> +
>>> +glibc {
>>> +  pthread {
>>> +    mutex_spin_count {
>>> +      type: INT_32
>>> +      minval: 0
>>> +      maxval: 30000
>>> +      default: 100
>>> +    }
>>> +  }
>>> +}
>>>

I don't like the duplication of DEFAULT_ADAPTIVE_MAX here,
similarly all the other duplicated parameters in glibc.elision
namespace, but I don't think we can do better without a lot
more work to allow macro expansion.

OK.
kemi Nov. 28, 2018, 5:21 a.m. UTC | #5
Carlos, thanks for your comments:)

On 2018/11/28 上午4:51, Carlos O'Donell wrote:
> On 11/27/18 12:01 AM, kemi wrote:
>> +Carlos
> 
> Are there any outstanding comments regarding this code? You've seen a lot
> of reviews and I have not double checked that outstanding comments do not
> remain from v1->v9.
> 

I believe I have tried my best to address the comments from community.
And for each reviewer, I have answered their questions in the emails.

> My own comments are below. I think if you address those we're ready to
> commit these changes. I would like to see us move on more of these kinds
> of tunables which expose previously uncontrollable options for specific
> workloads. Defaults should still be useful, and I'd like to hear your input
> on wether you think 100 is a useful defaul.
> 

The best spin count number should be architecture-specific. E.g. the pause in Skylake
is 10x expensive than before. 

In this patch, we don't have any functionality change, using previous default number 100,
though may not be the best, it will not trigger performance regression at least. 

we have discussed this issue before:
https://sourceware.org/ml/libc-alpha/2018-05/msg00545.html

> Please post v10. Thank you!
> 

sure

>> On 2018/11/27 上午10:43, kemi wrote:
>>> PING~
>>>
>>> On 2018/11/22 上午9:45, Kemi Wang wrote:
>>>> This patch does not have any functionality change, we only provide a spin
>>>> count tunes for pthread adaptive spin mutex. The tunable
>>>> glibc.pthread.mutex_spin_count tunes can be used by system administrator to
>>>> squeeze system performance according to different hardware capabilities and
>>>> workload characteristics.
>>>>
>>>> The maximum value of spin count is limited to 30000 to avoid the overflow
>>>> of mutex->__data.__spins variable with the possible type of short in
>>>> pthread_mutex_lock ().
>>>>
>>>> The default value of spin count is set to 100 with the reference to the
>>>> previous number of times of spinning via trylock. This value would be
>>>> architecture-specific and can be tuned with kinds of benchmarks to fit most
>>>> cases in future.
>>>>
>>>>    * sysdeps/nptl/dl-tunables.list: Add glibc.pthread.mutex_spin_count entry.
>>>>    * manual/tunables.texi: Add glibc.pthread.mutex_spin_count description.
>>>>    * nptl/pthread_mutex_conf.h: New file.
>>>>    * nptl/pthread_mutex_conf.c: New file.
>>>>    * nptl/Makefile: Add pthread_mutex_conf.c for compilation.
>>>>    * nptl/nptl-init.c: Put pthread mutex tunable initialization in pthread
>>>>      initialization.
>>>>    * nptl/pthreadP.h: Refine MAX_ADAPTIVE_COUNT macro definition
>>>>
>>>> I would extend my appreciation sincerely to H.J.Lu for his help to refine
>>>> this patch series.
>>>>
>>>> ChangeLog:
>>>>     V8->V9:
>>>> 	a) Add the "unistd.h" header file back, or it will cause build
>>>> 	regression on 32 bits system.
>>>> 	b) Rebase on the latest master branch
>>>> 	c) Tested on x86_64 with build-many-glibcs.py
>>>>
>>>>     V7->V8:
>>>> 	a) Refine the pthread tunables description in manual/tunables.texi
>>>> 	accordingly to Carlos O'Donell and Rical Jason.
>>>>
>>>>     V6->V7:
>>>>     a) Patch is refined by H.J.Lu
>>>>
>>>>     V5->V6:
>>>>     a) Missing "pthread mutex tunables" entry in the menu of tunables.texi,
>>>>     add it.
>>>>
>>>>     V4->V5
>>>>     a) Put mutex tunable (glibc.mutex.spin_count) initialization as part of
>>>>     overall pthread initialization, that would avoid the extra relocation,
>>>>     as suggested by Florian Weimer. Thanks for pointing it out!
>>>>     b) Move the READ_ONLY_SPIN macro definition from the third patch to
>>>>     this patch
>>>>
>>>>     V3->V4
>>>>     a) Add comments in elf/dl-tunables.list
>>>>
>>>>     V2->V3
>>>>     a) Polish the description of glibc.mutex.spin_count tunable with the
>>>>     help from Rical Jasan.
>>>>     b) Get rid of the TUNABLE_CALLBACK_FNDECL macros in
>>>>     pthread_mutex_conf.c, as suggested by Florian Weimer.
>>>>     c) Adjust the default value of spin count to 100 with the reference of
>>>>     the previous spinning way via trylock.
>>>>
>>>>     V1->V2
>>>>     a) Renamed nptl/mutex-conf.h -> nptl/pthread_mutex_conf.h
>>>>     b) Renamed nptl/mutex-conf.c -> nptl/pthread_mutex_conf.c
>>>>     c) Change the Makefile to compile pthread_mutex_conf.c
>>>>     d) Modify the copyright "2013-2018" -> "2018" for new added files
>>>>     e) Fix the indentation issue (tab -> double space) in
>>>>     elf/dl-tunables.list
>>>>     f) Remove the env alias LD_SPIN_COUNT in elf/dl-tunables.list
>>>>     g) Fix the typo errors and refresh glibc.mutex.spin_count tunable
>>>>     description in manual/tunables.texi.
>>>>     h) Fix the indentation issue in nptl/pthread_mutex_conf.c
>>>>     i) Fix the indentation issue for nested preprocessor (add one space for
>>>>     each level)
>>>>
>>>> Suggested-by: Andi Kleen <andi.kleen@intel.com>
>>>> Signed-off-by: Kemi.wang <kemi.wang@intel.com>
>>>> ---
>>>>  manual/tunables.texi          | 27 +++++++++++++++++++++++
>>>>  nptl/Makefile                 |  2 +-
>>>>  nptl/nptl-init.c              |  5 +++++
>>>>  nptl/pthreadP.h               | 24 +++++++++++++++-----
>>>>  nptl/pthread_mutex_conf.c     | 27 +++++++++++++++++++++++
>>>>  nptl/pthread_mutex_conf.h     | 45 ++++++++++++++++++++++++++++++++++++++
>>>>  sysdeps/nptl/dl-tunables.list | 51 +++++++++++++++++++++++++++++++++++++++++++
>>>>  7 files changed, 174 insertions(+), 7 deletions(-)
>>>>  create mode 100644 nptl/pthread_mutex_conf.c
>>>>  create mode 100644 nptl/pthread_mutex_conf.h
>>>>  create mode 100644 sysdeps/nptl/dl-tunables.list
>>>>
>>>> diff --git a/manual/tunables.texi b/manual/tunables.texi
>>>> index 3345a23..33fbe1e 100644
>>>> --- a/manual/tunables.texi
>>>> +++ b/manual/tunables.texi
>>>> @@ -32,6 +32,7 @@ their own namespace.
>>>>  * Tunable names::  The structure of a tunable name
>>>>  * Memory Allocation Tunables::  Tunables in the memory allocation subsystem
>>>>  * Elision Tunables::  Tunables in elision subsystem
>>>> +* POSIX Thread Tunables:: Tunables in the POSIX thread subsystem
> 
> OK.
> 
>>>>  * Hardware Capability Tunables::  Tunables that modify the hardware
>>>>  				  capabilities seen by @theglibc{}
>>>>  @end menu
>>>> @@ -281,6 +282,32 @@ of try lock attempts.
>>>>  The default value of this tunable is @samp{3}.
>>>>  @end deftp
>>>>  
>>>> +@node POSIX Thread Tunables
>>>> +@section POSIX Thread Tunables
>>>> +@cindex pthread mutex tunables
>>>> +@cindex thread mutex tunables
>>>> +@cindex mutex tunables
>>>> +@cindex tunables thread mutex
>>>> +
>>>> +@deftp {Tunable namespace} glibc.pthread
>>>> +The behavior of POSIX thread can be tuned to gain performance improvements
> 
> s/of POSIX thread/of POSIX threads/g
> 

OK

>>>> +according to specific hardware capabilities and workload characteristics by
>>>> +setting the following tunables in the @code{pthread} namespace:
>>>> +@end deftp
>>>> +
>>>> +@deftp Tunable glibc.pthread.mutex_spin_count
>>>> +The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
>>>> +a thread should spin on the lock before calling into the kernel to block.
>>>> +Adaptive spin is used for mutexes initialized with the
>>>> +@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension.  It affects both
>>>> +@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.
> 
> OK.
> 
>>>> +
>>>> +The spinning is done until either the maximum spin count is reached or
> 
> s/The spinning is done/The thread spins/g
> 

OK

>>>> +the lock is acquired.
> 
> OK.
> 
>>>> +
>>>> +The default value of this tunable is @samp{100}.
>>>> +@end deftp
>>>> +
>>>>  @node Hardware Capability Tunables
>>>>  @section Hardware Capability Tunables
>>>>  @cindex hardware capability tunables
>>>> diff --git a/nptl/Makefile b/nptl/Makefile
>>>> index 49b6faa..284f815 100644
>>>> --- a/nptl/Makefile
>>>> +++ b/nptl/Makefile
>>>> @@ -145,7 +145,7 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
>>>>  		      mtx_destroy mtx_init mtx_lock mtx_timedlock \
>>>>  		      mtx_trylock mtx_unlock call_once cnd_broadcast \
>>>>  		      cnd_destroy cnd_init cnd_signal cnd_timedwait cnd_wait \
>>>> -		      tss_create tss_delete tss_get tss_set
>>>> +		      tss_create tss_delete tss_get tss_set pthread_mutex_conf
> 
> OK.
> 
>>>>  #		      pthread_setuid pthread_seteuid pthread_setreuid \
>>>>  #		      pthread_setresuid \
>>>>  #		      pthread_setgid pthread_setegid pthread_setregid \
>>>> diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
>>>> index 907411d..adf99f1 100644
>>>> --- a/nptl/nptl-init.c
>>>> +++ b/nptl/nptl-init.c
>>>> @@ -38,6 +38,7 @@
>>>>  #include <kernel-features.h>
>>>>  #include <libc-pointer-arith.h>
>>>>  #include <pthread-pids.h>
>>>> +#include <pthread_mutex_conf.h>
> 
> OK.
> 
>>>>  
>>>>  #ifndef TLS_MULTIPLE_THREADS_IN_TCB
>>>>  /* Pointer to the corresponding variable in libc.  */
>>>> @@ -431,6 +432,10 @@ __pthread_initialize_minimal_internal (void)
>>>>  
>>>>    /* Determine whether the machine is SMP or not.  */
>>>>    __is_smp = is_smp_system ();
>>>> +
>>>> +#if HAVE_TUNABLES
>>>> +  pthread_tunables_init ();
>>>> +#endif
> 
> OK.
> 
>>>>  }
>>>>  strong_alias (__pthread_initialize_minimal_internal,
>>>>  	      __pthread_initialize_minimal)
>>>> diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
>>>> index 19efe1e..9bae78e 100644
>>>> --- a/nptl/pthreadP.h
>>>> +++ b/nptl/pthreadP.h
>>>> @@ -47,12 +47,6 @@
>>>>  #endif
>>>>  
>>>>  
>>>> -/* Adaptive mutex definitions.  */
>>>> -#ifndef MAX_ADAPTIVE_COUNT
>>>> -# define MAX_ADAPTIVE_COUNT 100
>>>> -#endif
>>>> -
>>>> -
>>>>  /* Magic cookie representing robust mutex with dead owner.  */
>>>>  #define PTHREAD_MUTEX_INCONSISTENT	INT_MAX
>>>>  /* Magic cookie representing not recoverable robust mutex.  */
>>>> @@ -188,6 +182,24 @@ enum
>>>>  #define __PTHREAD_COND_SHARED_MASK 1
>>>>  
>>>>  
>>>> +#if HAVE_TUNABLES
>>>> +struct mutex_config
>>>> +{
>>>> +  int spin_count;
>>>> +};
> 
> OK.
> 
>>>> +
>>>> +extern struct mutex_config __mutex_aconf;
> 
> OK.
> 
>>>> +#endif
>>>> +
>>>> +/* Adaptive mutex definitions.  */
>>>> +#ifndef MAX_ADAPTIVE_COUNT
>>>> +# if HAVE_TUNABLES
>>>> +#  define MAX_ADAPTIVE_COUNT __mutex_aconf.spin_count
>>>> +# else
>>>> +#  define MAX_ADAPTIVE_COUNT 100
> 
> Please define a DEFAULT_ADAPTIVE_COUNT to 100 and use it later
> in pthread_mutex_conf.c.
> 
> Please add a comment to DEFAULT_ADAPTIVE_COUNT, since you
> touched the code we should comment it a bit more:
> 
> /* The choice of 100 spins for the default spin count for
>    an adaptive spin is a completely arbitrary choice that
>    has not been evaluated thoroughly using modern hardware.  */
> 

Will do it.

>>>> +# endif
>>>> +#endif
>>>> +
>>>>  /* Internal variables.  */
>>>>  
>>>>  
>>>> diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
>>>> new file mode 100644
>>>> index 0000000..0da4afe
>>>> --- /dev/null
>>>> +++ b/nptl/pthread_mutex_conf.c
>>>> @@ -0,0 +1,27 @@
>>>> +/* pthread_mutex_conf.c: Pthread mutex tunable parameters.
> 
> Use just "pthread mutex tunable parameters."
> 

OK

>>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>>> +   This file is part of the GNU C Library.
>>>> +
>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>> +   modify it under the terms of the GNU Lesser General Public
>>>> +   License as published by the Free Software Foundation; either
>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +   Lesser General Public License for more details.
>>>> +
>>>> +   You should have received a copy of the GNU Lesser General Public
>>>> +   License along with the GNU C Library; if not, see
>>>> +   <http://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#if HAVE_TUNABLES
>>>> +#include <pthreadP.h>
>>>> +struct mutex_config __mutex_aconf =
>>>> +{
>>>> +  /* The maximum number of times a thread should spin on the lock before
>>>> +  calling into kernel to block.  */
>>>> +  .spin_count = 100,
> 
> Should use DEFAULT_ADAPTIVE_COUNT instead of repeating the 100.
> 

OK

>>>> +};
>>>> +#endif
>>>> diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h
>>>> new file mode 100644
>>>> index 0000000..97ff98e
>>>> --- /dev/null
>>>> +++ b/nptl/pthread_mutex_conf.h
>>>> @@ -0,0 +1,45 @@
>>>> +/* pthread_mutex_conf.h: Pthread mutex tunable parameters.
> 
> Use just "pthread mutex tunable parameters."
> 

OK

>>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>>> +   This file is part of the GNU C Library.
>>>> +
>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>> +   modify it under the terms of the GNU Lesser General Public
>>>> +   License as published by the Free Software Foundation; either
>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +   Lesser General Public License for more details.
>>>> +
>>>> +   You should have received a copy of the GNU Lesser General Public
>>>> +   License along with the GNU C Library; if not, see
>>>> +   <http://www.gnu.org/licenses/>.  */
>>>> +#ifndef _PTHREAD_MUTEX_CONF_H
>>>> +#define _PTHREAD_MUTEX_CONF_H 1
>>>> +
>>>> +#if HAVE_TUNABLES
>>>> +# define TUNABLE_NAMESPACE pthread
>>>> +# include <stdbool.h>
>>>> +# include <stdint.h>
>>>> +# include <pthreadP.h>
>>>> +# include <pthread_mutex_conf.h>
>>>> +# include <unistd.h>  /* Get STDOUT_FILENO for _dl_printf.  */
>>>> +# include <elf/dl-tunables.h>
>>>> +
>>>> +extern struct mutex_config __mutex_aconf attribute_hidden;
>>>> +
>>>> +static void
>>>> +TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
>>>> +{
>>>> +  __mutex_aconf.spin_count = (int32_t) (valp)->numval;
>>>> +}
>>>> +
>>>> +static void pthread_tunables_init (void)
>>>> +{
>>>> +  TUNABLE_GET (mutex_spin_count, int32_t,
>>>> +               TUNABLE_CALLBACK (set_mutex_spin_count));
>>>> +}
>>>> +#endif
>>>> +
>>>> +#endif
> 
> OK.
> 
>>>> diff --git a/sysdeps/nptl/dl-tunables.list b/sysdeps/nptl/dl-tunables.list
>>>> new file mode 100644
>>>> index 0000000..1cb52e8
>>>> --- /dev/null
>>>> +++ b/sysdeps/nptl/dl-tunables.list
>>>> @@ -0,0 +1,51 @@
>>>> +# Copyright (C) 2018 Free Software Foundation, Inc.
>>>> +# This file is part of the GNU C Library.
>>>> +
>>>> +# The GNU C Library is free software; you can redistribute it and/or
>>>> +# modify it under the terms of the GNU Lesser General Public
>>>> +# License as published by the Free Software Foundation; either
>>>> +# version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +# The GNU C Library is distributed in the hope that it will be useful,
>>>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +# Lesser General Public License for more details.
>>>> +
>>>> +# You should have received a copy of the GNU Lesser General Public
>>>> +# License along with the GNU C Library; if not, see
>>>> +# <http://www.gnu.org/licenses/>.
>>>> +
>>>> +# Allowed attributes for tunables:
>>>> +#
>>>> +# type: Defaults to STRING
>>>> +# minval: Optional minimum acceptable value
>>>> +# maxval: Optional maximum acceptable value
>>>> +# env_alias: An alias environment variable
>>>> +# security_level: Specify security level of the tunable.  Valid values are:
>>>> +#
>>>> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
>>>> +# 	     		 removed so that child processes can't read it.
>>>> +# 	     SXID_IGNORE: Don't read for AT_SECURE binaries, but retained for
>>>> +# 	     		  non-AT_SECURE subprocesses.
>>>> +# 	     NONE: Read all the time.
>>>> +
>>>> +
>>>> +# The maximum value of spin count is limited to 30000 to avoid the overflow
>>>> +# of mutex->__data.__spins variable with the possible type of short in
>>>> +# pthread_mutex_lock ().
>>>> +#
>>>> +# The default value of spin count is set to 100 with the reference to the
>>>> +# previous number of times of spinning via trylock. This value would be
>>>> +# architecture-specific and can be tuned with kinds of benchmarks to fit
>>>> +# most cases in future.
>>>> +
>>>> +glibc {
>>>> +  pthread {
>>>> +    mutex_spin_count {
>>>> +      type: INT_32
>>>> +      minval: 0
>>>> +      maxval: 30000
>>>> +      default: 100
>>>> +    }
>>>> +  }
>>>> +}
>>>>
> 
> I don't like the duplication of DEFAULT_ADAPTIVE_MAX here,
> similarly all the other duplicated parameters in glibc.elision
> namespace, but I don't think we can do better without a lot
> more work to allow macro expansion.
> 
> OK.
>  let's just keep the magic number 100 here?
kemi Nov. 29, 2018, 9:13 a.m. UTC | #6
On 2018/11/28 上午4:14, Adhemerval Zanella wrote:
> 
> 
> On 21/11/2018 23:45, Kemi Wang wrote:
>> This patch does not have any functionality change, we only provide a spin
>> count tunes for pthread adaptive spin mutex. The tunable
>> glibc.pthread.mutex_spin_count tunes can be used by system administrator to
>> squeeze system performance according to different hardware capabilities and
>> workload characteristics.
>>
>> The maximum value of spin count is limited to 30000 to avoid the overflow
>> of mutex->__data.__spins variable with the possible type of short in
>> pthread_mutex_lock ().
> 
> With current algorithm nptl/pthread_mutex_{timed}lock.c, I think we can use 
> the full short range (32767) since assuming maximum spin_nop usage with 'cnt'
> being 'max_cnt - 1' (the fast atomic path) it will converge to 32719.
> 

yes, it's fine to limited the maximum value of spin count to 32767.
BTW, the maximum possible value of spin_nop 'cnt' would be 'max_cnt + 1'.

>>
>> The default value of spin count is set to 100 with the reference to the
>> previous number of times of spinning via trylock. This value would be
>> architecture-specific and can be tuned with kinds of benchmarks to fit most
>> cases in future.
>>
>>  
>> +#if HAVE_TUNABLES
>> +struct mutex_config
>> +{
>> +  int spin_count;
>> +};
>> +
>> +extern struct mutex_config __mutex_aconf;
>> +#endif
>> +
>> +/* Adaptive mutex definitions.  */
>> +#ifndef MAX_ADAPTIVE_COUNT
>> +# if HAVE_TUNABLES
>> +#  define MAX_ADAPTIVE_COUNT __mutex_aconf.spin_count
>> +# else
>> +#  define MAX_ADAPTIVE_COUNT 100
>> +# endif
>> +#endif
>> +
>>  /* Internal variables.  */
>>  
> 
> We try to avoid the logic to check if a define is already define to set a
> default value.  If the idea is on future patches to define an arch-specific
> value, the preferred strategy is to define the generic and an arch-specific
> value in their own headers.
> 

Agree!

>>  
>> diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
>> new file mode 100644
>> index 0000000..0da4afe
>> --- /dev/null
>> +++ b/nptl/pthread_mutex_conf.c
>> @@ -0,0 +1,27 @@
>> +/* pthread_mutex_conf.c: Pthread mutex tunable parameters.
> 
> It is not a rule, but there is no need to duplicate the file name on its
> description.
> 

OK. Will remove the file name

>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#if HAVE_TUNABLES
>> +#include <pthreadP.h>
>> +struct mutex_config __mutex_aconf =
>> +{
>> +  /* The maximum number of times a thread should spin on the lock before
>> +  calling into kernel to block.  */
>> +  .spin_count = 100,
>> +};
>> +#endif
> 
> If the idea is to keep both non-tunable version and tunable version in sync
> when default value change, I would suggest to use an extra define to hold
> the '100' value.
> 
> Another option, which I think it better, is to avoid make MAX_ADAPTIVE_COUNT
> have different meaning depending if tunables is enabled or not and just
> create a inline function:
> 
> nptl/pthreadP.h:
> 
> #define MAX_DEFAULT_ADAPTIVE_COUNT 100
> 
> static inline short max_adaptive_count (void)
> {
> #if HAVE_TUNABLES
>   return __mutex_aconf.spin_count;
> #else
>   return MAX_DEFAULT_ADAPTIVE_COUNT;
> #endif
> }
> 
> And adjust nptl/pthread_mutex_conf.c, nptl/pthread_mutex_lock.c, and
> nptl/pthread_mutex_timedlock.c accordingly.
> 

Sure

> Also, if the idea is to provide a different generic value depending
> of the architecture it is a matter to move the MAX_DEFAULT_ADAPTIVE_COUNT
> to a generic header and override the header by the architecture.
> 

thanks!
The v10 will be like this:
the DEFAULT_ADAPTIVE_COUNT would be architecture-specific, defined in a generic
file sysdeps/generic/adaptive_spin_count.h, in future, different architectures
(or even different platforms) will have its individual version.
E.g x86 may use /sysdeps/unix/sysv/linux/x86/adaptive_spin_count.h to define this
default value.

I am not familiar with toolchains, is it a right way to define a generic
file and a architecture-specific file like this?

If HAVE_TUNABLES is defined as true, DEFAULT_ADAPTIVE_COUNT will be ignored.

>> +#if HAVE_TUNABLES
>> +# define TUNABLE_NAMESPACE pthread
>> +# include <stdbool.h>
>> +# include <stdint.h>
>> +# include <pthreadP.h>
>> +# include <pthread_mutex_conf.h>
>> +# include <unistd.h>  /* Get STDOUT_FILENO for _dl_printf.  */
>> +# include <elf/dl-tunables.h>
> 
> Mostly of the includes seems superfluous, I think only pthread_mutex_conf.h
> and elf/dl-tunables.h are really required.
> 

maybe not.

For pthreadP.h, I defines struct mutex_config in pthreadP.h, so we have to include header pthreadP.h here
Or, if you don't want header pthreadP.h included, one alternative way is to move this structure definition back
to pthread_mutex_conf.h. But, we will have to include header pthread_mutex_conf.h in pthreadP.h (required by ) 
I think both are OK for me, your idea?

For stdint.h, I tried to remove it, the system complains
In file included from ../elf/dl-tunables.h:34:0,
                       from pthread_mutex_conf.c:23:

../elf/dl-tunable-types.h:35:3: error: unknown type name ‘int64_t’
   int64_t min;
   ^

Then, I added the header stdint.h back, then, the system complains
../elf/dl-tunables.h:42:3: error: unknown type name ‘bool’
  bool initialized;   /* Flag to indicate that the tunable is
  ^
Then, I added the header stdbool.h back, and continue

For unistd.h, it's OK to remove on 64-bits system, but it will fail to compile in 32-bits system.
The comment added after this header explains it: "Get STDOUT_FILENO for _dl_printf"

So, all of the header files listed above are required in current implementation.
Hmm, pthreadP.h should not be needed if moving struct mutex_config back to pthread_mutex_conf.h.

Anyway, I will post v10 for all of those changes.
Thanks for your review and suggestion.

>> +
>> +extern struct mutex_config __mutex_aconf attribute_hidden;
>> +
>> +static void
>> +TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
>> +{
>> +  __mutex_aconf.spin_count = (int32_t) (valp)->numval;
>> +}
>> +
>> +static void pthread_tunables_init (void)
>> +{
>> +  TUNABLE_GET (mutex_spin_count, int32_t,
>> +               TUNABLE_CALLBACK (set_mutex_spin_count));
>> +}
>> +#endif
>> +
>> +#endif
>> diff --git a/sysdeps/nptl/dl-tunables.list b/sysdeps/nptl/dl-tunables.list
>> new file mode 100644
>> index 0000000..1cb52e8
>> --- /dev/null
>> +++ b/sysdeps/nptl/dl-tunables.list
>> @@ -0,0 +1,51 @@
>> +# Copyright (C) 2018 Free Software Foundation, Inc.
>> +# This file is part of the GNU C Library.
>> +
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +# Lesser General Public License for more details.
>> +
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# <http://www.gnu.org/licenses/>.
>> +
>> +# Allowed attributes for tunables:
>> +#
>> +# type: Defaults to STRING
>> +# minval: Optional minimum acceptable value
>> +# maxval: Optional maximum acceptable value
>> +# env_alias: An alias environment variable
>> +# security_level: Specify security level of the tunable.  Valid values are:
>> +#
>> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
>> +# 	     		 removed so that child processes can't read it.
>> +# 	     SXID_IGNORE: Don't read for AT_SECURE binaries, but retained for
>> +# 	     		  non-AT_SECURE subprocesses.
>> +# 	     NONE: Read all the time.
> 
> Do we need to duplicate the same comments from 'elf/dl-tunables.list'?
> 

Will remove them in v10.
Florian Weimer Nov. 29, 2018, 2:41 p.m. UTC | #7
* Kemi Wang:

> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
> +# 	     		 removed so that child processes can't read it.

I think this will not work as intended for libpthread tunables because
if libpthread is not linked in, no code in nptl/nptl-init.c ever runs.

I think we need to move the tunable to elf/dl-tunables.list, parse it
there, and communicate the result to libpthread, possibly via
_rtld_global_ro.

Thanks,
Florian
Carlos O'Donell Nov. 29, 2018, 6:46 p.m. UTC | #8
On 11/28/18 12:21 AM, kemi wrote:
>>>> +
>>>> +glibc {
>>>> +  pthread {
>>>> +    mutex_spin_count {
>>>> +      type: INT_32
>>>> +      minval: 0
>>>> +      maxval: 30000
>>>> +      default: 100
>>>> +    }
>>>> +  }
>>>> +}
>>>>
> I don't like the duplication of DEFAULT_ADAPTIVE_MAX here,
> similarly all the other duplicated parameters in glibc.elision
> namespace, but I don't think we can do better without a lot
> more work to allow macro expansion.
> 
> OK.
>  let's just keep the magic number 100 here?

Yes. That's fine. I started another thread for this:
https://www.sourceware.org/ml/libc-alpha/2018-11/msg00723.html
kemi Nov. 30, 2018, 2:42 a.m. UTC | #9
On 2018/11/29 下午10:41, Florian Weimer wrote:
> * Kemi Wang:
> 
>> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
>> +# 	     		 removed so that child processes can't read it.
> 
> I think this will not work as intended for libpthread tunables because
> if libpthread is not linked in, no code in nptl/nptl-init.c ever runs.
> 

this tunable is for pthread mutex. If the libpthread is not linked in,
it seems reasonable that we can't use this tunables.

Why do we need to move to elf/dl-tunables.list and do some tricks?

> I think we need to move the tunable to elf/dl-tunables.list, parse it
> there, and communicate the result to libpthread, possibly via
> _rtld_global_ro.
> 
> Thanks,
> Florian
>
Carlos O'Donell Nov. 30, 2018, 2:44 a.m. UTC | #10
On 11/29/18 9:41 AM, Florian Weimer wrote:
> * Kemi Wang:
> 
>> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
>> +# 	     		 removed so that child processes can't read it.
> 
> I think this will not work as intended for libpthread tunables because
> if libpthread is not linked in, no code in nptl/nptl-init.c ever runs.
> 
> I think we need to move the tunable to elf/dl-tunables.list, parse it
> there, and communicate the result to libpthread, possibly via
> _rtld_global_ro.

This is a fairly arcane issue with tunables, are we certain this is a
problem? I'd rather not have new contributors need to handle this, but
that we fix it ourselves and then have Kemi follow established practice
(like putting it in elf/dl-tunables.list and passing the value to
libpthread, or just using sysdeps/nptl/dl-tunables.list).
kemi Nov. 30, 2018, 3:02 a.m. UTC | #11
On 2018/11/30 上午10:42, kemi wrote:
> 
> 
> On 2018/11/29 下午10:41, Florian Weimer wrote:
>> * Kemi Wang:
>>
>>> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
>>> +# 	     		 removed so that child processes can't read it.
>>
>> I think this will not work as intended for libpthread tunables because
>> if libpthread is not linked in, no code in nptl/nptl-init.c ever runs.
>>
> 
> this tunable is for pthread mutex. If the libpthread is not linked in,
> it seems reasonable that we can't use this tunables.
> 
> Why do we need to move to elf/dl-tunables.list and do some tricks?
> 

Or you are talking something on pthread mutex shared within multiple processes?
(pthread_mutexattr_setpshared(mattr,pshared))

In this case, libpthread may not need, and we still hope pthread mutex tunable
works in this case?

>> I think we need to move the tunable to elf/dl-tunables.list, parse it
>> there, and communicate the result to libpthread, possibly via
>> _rtld_global_ro.
>>
>> Thanks,
>> Florian
>>
H.J. Lu Nov. 30, 2018, 4:10 a.m. UTC | #12
On Thu, Nov 29, 2018 at 7:07 PM kemi <kemi.wang@intel.com> wrote:
>
>
>
> On 2018/11/30 上午10:42, kemi wrote:
> >
> >
> > On 2018/11/29 下午10:41, Florian Weimer wrote:
> >> * Kemi Wang:
> >>
> >>> +#       SXID_ERASE: (default) Don't read for AT_SECURE binaries and
> >>> +#                   removed so that child processes can't read it.
> >>
> >> I think this will not work as intended for libpthread tunables because
> >> if libpthread is not linked in, no code in nptl/nptl-init.c ever runs.
> >>
> >
> > this tunable is for pthread mutex. If the libpthread is not linked in,
> > it seems reasonable that we can't use this tunables.
> >
> > Why do we need to move to elf/dl-tunables.list and do some tricks?
> >
>
> Or you are talking something on pthread mutex shared within multiple processes?
> (pthread_mutexattr_setpshared(mattr,pshared))
>
> In this case, libpthread may not need, and we still hope pthread mutex tunable
> works in this case?
>
> >> I think we need to move the tunable to elf/dl-tunables.list, parse it
> >> there, and communicate the result to libpthread, possibly via
> >> _rtld_global_ro.

What is the usage for that?  Sure, if libpread isn't linked in, these
tunables will
have the default values, which aren't used by anyone.
Florian Weimer Dec. 11, 2018, 11:13 a.m. UTC | #13
* Carlos O'Donell:

> On 11/29/18 9:41 AM, Florian Weimer wrote:
>> * Kemi Wang:
>> 
>>> +# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
>>> +# 	     		 removed so that child processes can't read it.
>> 
>> I think this will not work as intended for libpthread tunables because
>> if libpthread is not linked in, no code in nptl/nptl-init.c ever runs.
>> 
>> I think we need to move the tunable to elf/dl-tunables.list, parse it
>> there, and communicate the result to libpthread, possibly via
>> _rtld_global_ro.
>
> This is a fairly arcane issue with tunables, are we certain this is a
> problem?

My concern is about the quoted part, from a new file added in this
patch.  The security controls will not work reliably in this context
because libpthread may never be loaded and the promised removal never
happens.

Thanks,
Florian
diff mbox series

Patch

diff --git a/manual/tunables.texi b/manual/tunables.texi
index 3345a23..33fbe1e 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -32,6 +32,7 @@  their own namespace.
 * Tunable names::  The structure of a tunable name
 * Memory Allocation Tunables::  Tunables in the memory allocation subsystem
 * Elision Tunables::  Tunables in elision subsystem
+* POSIX Thread Tunables:: Tunables in the POSIX thread subsystem
 * Hardware Capability Tunables::  Tunables that modify the hardware
 				  capabilities seen by @theglibc{}
 @end menu
@@ -281,6 +282,32 @@  of try lock attempts.
 The default value of this tunable is @samp{3}.
 @end deftp
 
+@node POSIX Thread Tunables
+@section POSIX Thread Tunables
+@cindex pthread mutex tunables
+@cindex thread mutex tunables
+@cindex mutex tunables
+@cindex tunables thread mutex
+
+@deftp {Tunable namespace} glibc.pthread
+The behavior of POSIX thread can be tuned to gain performance improvements
+according to specific hardware capabilities and workload characteristics by
+setting the following tunables in the @code{pthread} namespace:
+@end deftp
+
+@deftp Tunable glibc.pthread.mutex_spin_count
+The @code{glibc.pthread.mutex_spin_count} tunable sets the maximum number of times
+a thread should spin on the lock before calling into the kernel to block.
+Adaptive spin is used for mutexes initialized with the
+@code{PTHREAD_MUTEX_ADAPTIVE_NP} GNU extension.  It affects both
+@code{pthread_mutex_lock} and @code{pthread_mutex_timedlock}.
+
+The spinning is done until either the maximum spin count is reached or
+the lock is acquired.
+
+The default value of this tunable is @samp{100}.
+@end deftp
+
 @node Hardware Capability Tunables
 @section Hardware Capability Tunables
 @cindex hardware capability tunables
diff --git a/nptl/Makefile b/nptl/Makefile
index 49b6faa..284f815 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -145,7 +145,7 @@  libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      mtx_destroy mtx_init mtx_lock mtx_timedlock \
 		      mtx_trylock mtx_unlock call_once cnd_broadcast \
 		      cnd_destroy cnd_init cnd_signal cnd_timedwait cnd_wait \
-		      tss_create tss_delete tss_get tss_set
+		      tss_create tss_delete tss_get tss_set pthread_mutex_conf
 #		      pthread_setuid pthread_seteuid pthread_setreuid \
 #		      pthread_setresuid \
 #		      pthread_setgid pthread_setegid pthread_setregid \
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index 907411d..adf99f1 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -38,6 +38,7 @@ 
 #include <kernel-features.h>
 #include <libc-pointer-arith.h>
 #include <pthread-pids.h>
+#include <pthread_mutex_conf.h>
 
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
 /* Pointer to the corresponding variable in libc.  */
@@ -431,6 +432,10 @@  __pthread_initialize_minimal_internal (void)
 
   /* Determine whether the machine is SMP or not.  */
   __is_smp = is_smp_system ();
+
+#if HAVE_TUNABLES
+  pthread_tunables_init ();
+#endif
 }
 strong_alias (__pthread_initialize_minimal_internal,
 	      __pthread_initialize_minimal)
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index 19efe1e..9bae78e 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -47,12 +47,6 @@ 
 #endif
 
 
-/* Adaptive mutex definitions.  */
-#ifndef MAX_ADAPTIVE_COUNT
-# define MAX_ADAPTIVE_COUNT 100
-#endif
-
-
 /* Magic cookie representing robust mutex with dead owner.  */
 #define PTHREAD_MUTEX_INCONSISTENT	INT_MAX
 /* Magic cookie representing not recoverable robust mutex.  */
@@ -188,6 +182,24 @@  enum
 #define __PTHREAD_COND_SHARED_MASK 1
 
 
+#if HAVE_TUNABLES
+struct mutex_config
+{
+  int spin_count;
+};
+
+extern struct mutex_config __mutex_aconf;
+#endif
+
+/* Adaptive mutex definitions.  */
+#ifndef MAX_ADAPTIVE_COUNT
+# if HAVE_TUNABLES
+#  define MAX_ADAPTIVE_COUNT __mutex_aconf.spin_count
+# else
+#  define MAX_ADAPTIVE_COUNT 100
+# endif
+#endif
+
 /* Internal variables.  */
 
 
diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
new file mode 100644
index 0000000..0da4afe
--- /dev/null
+++ b/nptl/pthread_mutex_conf.c
@@ -0,0 +1,27 @@ 
+/* pthread_mutex_conf.c: Pthread mutex tunable parameters.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if HAVE_TUNABLES
+#include <pthreadP.h>
+struct mutex_config __mutex_aconf =
+{
+  /* The maximum number of times a thread should spin on the lock before
+  calling into kernel to block.  */
+  .spin_count = 100,
+};
+#endif
diff --git a/nptl/pthread_mutex_conf.h b/nptl/pthread_mutex_conf.h
new file mode 100644
index 0000000..97ff98e
--- /dev/null
+++ b/nptl/pthread_mutex_conf.h
@@ -0,0 +1,45 @@ 
+/* pthread_mutex_conf.h: Pthread mutex tunable parameters.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+#ifndef _PTHREAD_MUTEX_CONF_H
+#define _PTHREAD_MUTEX_CONF_H 1
+
+#if HAVE_TUNABLES
+# define TUNABLE_NAMESPACE pthread
+# include <stdbool.h>
+# include <stdint.h>
+# include <pthreadP.h>
+# include <pthread_mutex_conf.h>
+# include <unistd.h>  /* Get STDOUT_FILENO for _dl_printf.  */
+# include <elf/dl-tunables.h>
+
+extern struct mutex_config __mutex_aconf attribute_hidden;
+
+static void
+TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
+{
+  __mutex_aconf.spin_count = (int32_t) (valp)->numval;
+}
+
+static void pthread_tunables_init (void)
+{
+  TUNABLE_GET (mutex_spin_count, int32_t,
+               TUNABLE_CALLBACK (set_mutex_spin_count));
+}
+#endif
+
+#endif
diff --git a/sysdeps/nptl/dl-tunables.list b/sysdeps/nptl/dl-tunables.list
new file mode 100644
index 0000000..1cb52e8
--- /dev/null
+++ b/sysdeps/nptl/dl-tunables.list
@@ -0,0 +1,51 @@ 
+# Copyright (C) 2018 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# <http://www.gnu.org/licenses/>.
+
+# Allowed attributes for tunables:
+#
+# type: Defaults to STRING
+# minval: Optional minimum acceptable value
+# maxval: Optional maximum acceptable value
+# env_alias: An alias environment variable
+# security_level: Specify security level of the tunable.  Valid values are:
+#
+# 	     SXID_ERASE: (default) Don't read for AT_SECURE binaries and
+# 	     		 removed so that child processes can't read it.
+# 	     SXID_IGNORE: Don't read for AT_SECURE binaries, but retained for
+# 	     		  non-AT_SECURE subprocesses.
+# 	     NONE: Read all the time.
+
+
+# The maximum value of spin count is limited to 30000 to avoid the overflow
+# of mutex->__data.__spins variable with the possible type of short in
+# pthread_mutex_lock ().
+#
+# The default value of spin count is set to 100 with the reference to the
+# previous number of times of spinning via trylock. This value would be
+# architecture-specific and can be tuned with kinds of benchmarks to fit
+# most cases in future.
+
+glibc {
+  pthread {
+    mutex_spin_count {
+      type: INT_32
+      minval: 0
+      maxval: 30000
+      default: 100
+    }
+  }
+}