Patchwork [2/7] store thread-specific env information

Submitter Glauber Costa
Date Nov. 26, 2009, 5:24 p.m.
Message ID <1259256300-23937-3-git-send-email-glommer@redhat.com>
Permalink /patch/39574/
State New

Comments

Glauber Costa - Nov. 26, 2009, 5:24 p.m.
Since we'll have multiple cpu threads, at least for kvm, we need a way to store
and retrieve the CPUState associated with the current execution thread.
For the I/O thread, this will be NULL.

I am using pthread functions for that, for portability, but we could just as
well use the __thread keyword.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 vl.c |   21 +++++++++++++++++++++
 1 files changed, 21 insertions(+), 0 deletions(-)
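
The behaviour described above (each CPU thread sees its own pointer, while a thread that never set one sees NULL) can be illustrated with a small standalone sketch. It is not the patch itself: `CPUState` here is a hypothetical stand-in struct, and the function names are illustrative rather than QEMU's.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's CPUState; only the lookup matters here. */
typedef struct CPUState { int cpu_index; } CPUState;

static pthread_key_t current_env;

static void init_current_env(void)
{
    /* No destructor: the env is owned elsewhere, the key only points at it. */
    int rc = pthread_key_create(&current_env, NULL);
    assert(rc == 0);
    (void)rc;
}

static CPUState *get_current_env(void)
{
    /* Returns NULL in any thread that never called set_current_env(). */
    return pthread_getspecific(current_env);
}

static void set_current_env(CPUState *env)
{
    int rc = pthread_setspecific(current_env, env);
    assert(rc == 0);
    (void)rc;
}

/* Thread body: publish this thread's env, then read it back. */
static void *cpu_thread_fn(void *arg)
{
    set_current_env(arg);
    return get_current_env();
}

/* Returns 1 if the worker saw its own env while the caller still sees NULL. */
static int demo_per_thread_env(void)
{
    CPUState cpu0 = { 0 };
    pthread_t tid;
    void *seen = NULL;

    init_current_env();
    if (pthread_create(&tid, NULL, cpu_thread_fn, &cpu0) != 0)
        return 0;
    pthread_join(tid, &seen);
    return seen == &cpu0 && get_current_env() == NULL;
}
```

The calling thread plays the role of the I/O thread here: it never sets a value, so its lookup yields NULL.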
Avi Kivity - Nov. 29, 2009, 3:29 p.m.
On 11/26/2009 07:24 PM, Glauber Costa wrote:
> Since we'll have multiple cpu threads, at least for kvm, we need a way to store
> and retrieve the CPUState associated with the current execution thread.
> For the I/O thread, this will be NULL.
>
> I am using pthread functions for that, for portability, but we could as well
> use __thread keyword.
>
>    

Where is __thread not supported?

It's likely a bit faster than pthread_getspecific().
Andreas Färber - Nov. 29, 2009, 3:38 p.m.
Am 29.11.2009 um 16:29 schrieb Avi Kivity:

> On 11/26/2009 07:24 PM, Glauber Costa wrote:
>> Since we'll have multiple cpu threads, at least for kvm, we need a  
>> way to store
>> and retrieve the CPUState associated with the current execution  
>> thread.
>> For the I/O thread, this will be NULL.
>>
>> I am using pthread functions for that, for portability, but we  
>> could as well
>> use __thread keyword.
>>
>>
>
> Where is __thread not supported?

Apple, Sun.

Andreas
Avi Kivity - Nov. 29, 2009, 3:42 p.m.
On 11/29/2009 05:38 PM, Andreas Färber wrote:
>
> Am 29.11.2009 um 16:29 schrieb Avi Kivity:
>
>> On 11/26/2009 07:24 PM, Glauber Costa wrote:
>>> Since we'll have multiple cpu threads, at least for kvm, we need a 
>>> way to store
>>> and retrieve the CPUState associated with the current execution thread.
>>> For the I/O thread, this will be NULL.
>>>
>>> I am using pthread functions for that, for portability, but we could 
>>> as well
>>> use __thread keyword.
>>>
>>>
>>
>> Where is __thread not supported?
>
> Apple, Sun.

Well, pthread_getspecific is around 130 bytes of code, whereas __thread 
is just one instruction.  Maybe we should support both.
Andreas Färber - Nov. 29, 2009, 4 p.m.
Am 29.11.2009 um 16:42 schrieb Avi Kivity:

> On 11/29/2009 05:38 PM, Andreas Färber wrote:
>>
>> Am 29.11.2009 um 16:29 schrieb Avi Kivity:
>>
>>> On 11/26/2009 07:24 PM, Glauber Costa wrote:
>>>> Since we'll have multiple cpu threads, at least for kvm, we need  
>>>> a way to store
>>>> and retrieve the CPUState associated with the current execution  
>>>> thread.
>>>> For the I/O thread, this will be NULL.
>>>>
>>>> I am using pthread functions for that, for portability, but we  
>>>> could as well
>>>> use __thread keyword.
>>>>
>>>>
>>>
>>> Where is __thread not supported?
>>
>> Apple, Sun.
>
> Well, pthread_getspecific is around 130 bytes of code, whereas  
> __thread is just one instruction.  Maybe we should support both.

Maybe. Mono does so: they have an autoconf-based check plus a
configure switch --with-tls={pthread|__thread} to override it.

Andreas
Jamie Lokier - Nov. 29, 2009, 10:29 p.m.
> On 11/29/2009 05:38 PM, Andreas Färber wrote:
>> Am 29.11.2009 um 16:29 schrieb Avi Kivity:
>>> Where is __thread not supported?
>> Apple, Sun.

Some flavours of uClinux :-)

Avi Kivity wrote:
> Well, pthread_getspecific is around 130 bytes of code, whereas __thread 
> is just one instruction.  Maybe we should support both.

It's easy enough, they are quite similar.  Except that
pthread_key_create lets you provide a destructor which is called as
each thread is destroyed (unfortunately no constructor for new
threads; and you can use both methods if you need a destructor and
speed together).
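
A minimal sketch of the destructor behaviour, assuming a hypothetical per-thread scratch buffer (the names `scratch_key`, `scratch_destroy` and the counter are illustrative only). Since there is no constructor hook, the thread allocates its own value; the destructor runs when the thread exits with a non-NULL value set.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t scratch_key;  /* hypothetical per-thread buffer key */
static int destructor_calls;       /* written at thread exit, read after join */

/* Called by the pthread library for each exiting thread whose value is non-NULL. */
static void scratch_destroy(void *ptr)
{
    free(ptr);
    destructor_calls++;
}

static void *worker(void *arg)
{
    (void)arg;
    /* No constructor exists, so each thread sets up its own value. */
    void *buf = malloc(64);
    if (buf != NULL && pthread_setspecific(scratch_key, buf) != 0)
        free(buf);
    return NULL;
}

/* Returns the number of destructor invocations, or -1 on setup failure. */
static int run_destructor_demo(void)
{
    pthread_t tid;
    int rc;

    destructor_calls = 0;
    rc = pthread_key_create(&scratch_key, scratch_destroy);
    assert(rc == 0);
    (void)rc;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    pthread_join(tid, NULL);   /* the destructor has run by the time this returns */
    pthread_key_delete(scratch_key);
    return destructor_calls;
}
```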

It's not always one instruction - it's more complicated in shared
libraries, but it's always close to that.

Anyway, I decided to measure them both as I wondered about this for
another program.

On my 2.0GHz Core Duo (32-bit), tight unrolled loop, everything in cache:

     Read void *__thread variable        ~ 0.6 ns
     Call pthread_getspecific(key)       ~ 8.8 ns

__thread is preferable but it's not much overhead to call pthread_getspecific().

Imho, it's not worth making code less portable or more complicated to
handle both, but it's a nice touch.

However, I did notice that the compiler optimises away references to
__thread variables much better, such as hoisting from inside loops.

In my programs I have taken to wrapping everything inside a
thread_specific(var) macro, similar to the one in the kernel, which
expands to call pthread_getspecific() or use __thread[*]. That keeps the
complexity in one place, which is where the macro is defined.

( [*] - Windows has __thread, but it sometimes crashes when used in a
DLL, so I use the Windows equivalent of pthread_getspecific() in the
same wrapper macro, which is fine. )

-- Jamie
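
A wrapper along the lines described above might look like the following sketch. It is not Jamie's actual macro: `HAVE___THREAD` is a hypothetical symbol that a configure check would define, and the `thread_ptr_*` names are invented for illustration. Callers use the same four macros whichever backend is selected.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* HAVE___THREAD is a hypothetical macro that a configure check would define. */
#ifdef HAVE___THREAD

/* Compiler TLS: a plain per-thread variable, no setup required. */
#define DEFINE_THREAD_PTR(name)    static __thread void *name##_tls
#define thread_ptr_init(name)      ((void)0)
#define thread_ptr_get(name)       (name##_tls)
#define thread_ptr_set(name, val)  ((void)(name##_tls = (val)))

#else

/* Portable fallback: a pthread key created once at startup. */
#define DEFINE_THREAD_PTR(name)    static pthread_key_t name##_key
#define thread_ptr_init(name)      ((void)pthread_key_create(&name##_key, NULL))
#define thread_ptr_get(name)       pthread_getspecific(name##_key)
#define thread_ptr_set(name, val)  ((void)pthread_setspecific(name##_key, (val)))

#endif

DEFINE_THREAD_PTR(current_env);

/* Thread body: store the argument per-thread, then read it back. */
static void *thread_fn(void *arg)
{
    thread_ptr_set(current_env, arg);
    return thread_ptr_get(current_env);
}

/* Returns 1 if the worker saw its own value and the caller still sees NULL. */
static int demo_wrapper(void)
{
    int marker = 0;
    pthread_t tid;
    void *seen = NULL;
    int rc;

    thread_ptr_init(current_env);
    rc = pthread_create(&tid, NULL, thread_fn, &marker);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, &seen);
    return seen == &marker && thread_ptr_get(current_env) == NULL;
}
```

Keeping the choice behind macros means the rest of the code never mentions either mechanism directly, so a third backend (for example a Windows one) would only touch the definitions.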
Paolo Bonzini - Nov. 30, 2009, 11:36 a.m.
On 11/29/2009 04:38 PM, Andreas Färber wrote:
>
> Am 29.11.2009 um 16:29 schrieb Avi Kivity:
>
>> On 11/26/2009 07:24 PM, Glauber Costa wrote:
>>> Since we'll have multiple cpu threads, at least for kvm, we need a
>>> way to store
>>> and retrieve the CPUState associated with the current execution thread.
>>> For the I/O thread, this will be NULL.
>>>
>>> I am using pthread functions for that, for portability, but we could
>>> as well
>>> use __thread keyword.
>>>
>>>
>>
>> Where is __thread not supported?
>
> Apple, Sun.

Not sure about Sun.

In any case, on Windows neither __thread nor pthread_getspecific is
supported, so some configury is needed anyway.

Paolo
Glauber Costa - Nov. 30, 2009, 11:41 a.m.
> Anyway on Windows neither __thread nor pthread_getspecific is supported, so
> some configury is needed anyway.
>
> Paolo

For the record, I am a big fan of __thread. The only reason I used the
pthread library was portability. I can surely put in some configure knobs
to use __thread where available.
Paolo Bonzini - Nov. 30, 2009, 11:49 a.m.
On 11/30/2009 12:41 PM, Glauber Costa wrote:
> For the record, I am a big fan of __thread. The only reason I used
> the pthread library was portability. I can surely put in some
> configure knobs to use __thread where available

Plus, do you really need to support SMP when __thread is not available?...

Paolo
Avi Kivity - Nov. 30, 2009, 12:07 p.m.
On 11/30/2009 01:49 PM, Paolo Bonzini wrote:
> On 11/30/2009 12:41 PM, Glauber Costa wrote:
>> For the record, I am a big fan of __thread. The only reason I used
>> the pthread library was portability. I can surely put in some
>> configure knobs to use __thread where available
>
> Plus, do you really need to support SMP when __thread is not 
> available?...

Good point.

Patch

diff --git a/vl.c b/vl.c
index ee43808..9afe4b6 100644
--- a/vl.c
+++ b/vl.c
@@ -3436,6 +3436,24 @@  static void block_io_signals(void);
 static void unblock_io_signals(void);
 static int tcg_has_work(void);
 
+static pthread_key_t current_env;
+
+CPUState *qemu_get_current_env(void);
+CPUState *qemu_get_current_env(void)
+{
+    return pthread_getspecific(current_env);
+}
+
+static void qemu_set_current_env(CPUState *env)
+{
+    pthread_setspecific(current_env, env);
+}
+
+static void qemu_init_current_env(void)
+{
+    pthread_key_create(&current_env, NULL);
+}
+
 static int qemu_init_main_loop(void)
 {
     int ret;
@@ -3448,6 +3466,7 @@  static int qemu_init_main_loop(void)
     qemu_mutex_init(&qemu_fair_mutex);
     qemu_mutex_init(&qemu_global_mutex);
     qemu_mutex_lock(&qemu_global_mutex);
+    qemu_init_current_env();
 
     unblock_io_signals();
     qemu_thread_self(&io_thread);
@@ -3486,6 +3505,8 @@  static void *kvm_cpu_thread_fn(void *arg)
 
     block_io_signals();
     qemu_thread_self(env->thread);
+    qemu_set_current_env(env);
+
     if (kvm_enabled())
         kvm_init_vcpu(env);