diff mbox

[05/10] tcg: Prepare TB invalidation for lockless TB lookup

Message ID 1468917141-8155-6-git-send-email-pbonzini@redhat.com
State New
Headers show

Commit Message

Paolo Bonzini July 19, 2016, 8:32 a.m. UTC
When invalidating a translation block, set an invalid flag into the
TranslationBlock structure first.  It is also necessary to check whether
the target TB is still valid after acquiring 'tb_lock' but before calling
tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
future. Note that we don't have to check 'last_tb'; an already invalidated
TB will not be executed anyway and it is thus safe to patch it.

Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c              | 5 +++--
 include/exec/exec-all.h | 2 ++
 translate-all.c         | 3 +++
 3 files changed, 8 insertions(+), 2 deletions(-)

Comments

Sergey Fedorov July 19, 2016, 7:56 p.m. UTC | #1
On 19/07/16 11:32, Paolo Bonzini wrote:
>

It looks much better now :)

> When invalidating a translation block, set an invalid flag into the
> TranslationBlock structure first.  It is also necessary to check whether
> the target TB is still valid after acquiring 'tb_lock' but before calling
> tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
> future. Note that we don't have to check 'last_tb'; an already invalidated
> TB will not be executed anyway and it is thus safe to patch it.
>
> Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cpu-exec.c              | 5 +++--
>  include/exec/exec-all.h | 2 ++
>  translate-all.c         | 3 +++
>  3 files changed, 8 insertions(+), 2 deletions(-)
(snip)
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index acda7b6..bc0bcc5 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -213,6 +213,8 @@ struct TranslationBlock {
>  #define CF_USE_ICOUNT  0x20000
>  #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */
>  
> +    uint16_t invalid;

Why not "int"?

> +
>      void *tc_ptr;    /* pointer to the translated code */
>      uint8_t *tc_search;  /* pointer to search data */
>      /* original tb when cflags has CF_NOCACHE */
>

Thanks,
Sergey
Paolo Bonzini July 19, 2016, 10:27 p.m. UTC | #2
----- Original Message -----
> From: "Sergey Fedorov" <serge.fdrv@gmail.com>
> To: "Paolo Bonzini" <pbonzini@redhat.com>, qemu-devel@nongnu.org
> Cc: "sergey fedorov" <sergey.fedorov@linaro.org>, "alex bennee" <alex.bennee@linaro.org>
> Sent: Tuesday, July 19, 2016 9:56:49 PM
> Subject: Re: [PATCH 05/10] tcg: Prepare TB invalidation for lockless TB lookup
> 
> On 19/07/16 11:32, Paolo Bonzini wrote:
> >
> 
> It looks much better now :)
> 
> > When invalidating a translation block, set an invalid flag into the
> > TranslationBlock structure first.  It is also necessary to check whether
> > the target TB is still valid after acquiring 'tb_lock' but before calling
> > tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
> > future. Note that we don't have to check 'last_tb'; an already invalidated
> > TB will not be executed anyway and it is thus safe to patch it.
> >
> > Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  cpu-exec.c              | 5 +++--
> >  include/exec/exec-all.h | 2 ++
> >  translate-all.c         | 3 +++
> >  3 files changed, 8 insertions(+), 2 deletions(-)
> (snip)
> > diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> > index acda7b6..bc0bcc5 100644
> > --- a/include/exec/exec-all.h
> > +++ b/include/exec/exec-all.h
> > @@ -213,6 +213,8 @@ struct TranslationBlock {
> >  #define CF_USE_ICOUNT  0x20000
> >  #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */
> >  
> > +    uint16_t invalid;
> 
> Why not "int"?

There's a hole there, we may want to move something else so I
used a smaller data type.  Even uint8_t would do.

Paolo
> 
> > +
> >      void *tc_ptr;    /* pointer to the translated code */
> >      uint8_t *tc_search;  /* pointer to search data */
> >      /* original tb when cflags has CF_NOCACHE */
> >
> 
> Thanks,
> Sergey
>
Sergey Fedorov July 21, 2016, 8:36 a.m. UTC | #3
On 20/07/16 01:27, Paolo Bonzini wrote:
>
> ----- Original Message -----
>> From: "Sergey Fedorov" <serge.fdrv@gmail.com>
>> To: "Paolo Bonzini" <pbonzini@redhat.com>, qemu-devel@nongnu.org
>> Cc: "sergey fedorov" <sergey.fedorov@linaro.org>, "alex bennee" <alex.bennee@linaro.org>
>> Sent: Tuesday, July 19, 2016 9:56:49 PM
>> Subject: Re: [PATCH 05/10] tcg: Prepare TB invalidation for lockless TB lookup
>>
>> On 19/07/16 11:32, Paolo Bonzini wrote:
>> It looks much better now :)
>>
>>> When invalidating a translation block, set an invalid flag into the
>>> TranslationBlock structure first.  It is also necessary to check whether
>>> the target TB is still valid after acquiring 'tb_lock' but before calling
>>> tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
>>> future. Note that we don't have to check 'last_tb'; an already invalidated
>>> TB will not be executed anyway and it is thus safe to patch it.
>>>
>>> Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>  cpu-exec.c              | 5 +++--
>>>  include/exec/exec-all.h | 2 ++
>>>  translate-all.c         | 3 +++
>>>  3 files changed, 8 insertions(+), 2 deletions(-)
>> (snip)
>>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>>> index acda7b6..bc0bcc5 100644
>>> --- a/include/exec/exec-all.h
>>> +++ b/include/exec/exec-all.h
>>> @@ -213,6 +213,8 @@ struct TranslationBlock {
>>>  #define CF_USE_ICOUNT  0x20000
>>>  #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */
>>>  
>>> +    uint16_t invalid;
>> Why not "int"?
> There's a hole there, we may want to move something else so I
> used a smaller data type.  Even uint8_t would do.

But could simple "bool" work as well here?

>
> Paolo
>>> +
>>>      void *tc_ptr;    /* pointer to the translated code */
>>>      uint8_t *tc_search;  /* pointer to search data */

Are you sure that the hole is over there, not here?

Kind regards,
Sergey

>>>      /* original tb when cflags has CF_NOCACHE */
>>>
>> Thanks,
>> Sergey
>>
Paolo Bonzini July 21, 2016, 9:04 a.m. UTC | #4
On 21/07/2016 10:36, Sergey Fedorov wrote:
> On 20/07/16 01:27, Paolo Bonzini wrote:
>>
>> ----- Original Message -----
>>> From: "Sergey Fedorov" <serge.fdrv@gmail.com>
>>> To: "Paolo Bonzini" <pbonzini@redhat.com>, qemu-devel@nongnu.org
>>> Cc: "sergey fedorov" <sergey.fedorov@linaro.org>, "alex bennee" <alex.bennee@linaro.org>
>>> Sent: Tuesday, July 19, 2016 9:56:49 PM
>>> Subject: Re: [PATCH 05/10] tcg: Prepare TB invalidation for lockless TB lookup
>>>
>>> On 19/07/16 11:32, Paolo Bonzini wrote:
>>> It looks much better now :)
>>>
>>>> When invalidating a translation block, set an invalid flag into the
>>>> TranslationBlock structure first.  It is also necessary to check whether
>>>> the target TB is still valid after acquiring 'tb_lock' but before calling
>>>> tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
>>>> future. Note that we don't have to check 'last_tb'; an already invalidated
>>>> TB will not be executed anyway and it is thus safe to patch it.
>>>>
>>>> Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>> ---
>>>>  cpu-exec.c              | 5 +++--
>>>>  include/exec/exec-all.h | 2 ++
>>>>  translate-all.c         | 3 +++
>>>>  3 files changed, 8 insertions(+), 2 deletions(-)
>>> (snip)
>>>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>>>> index acda7b6..bc0bcc5 100644
>>>> --- a/include/exec/exec-all.h
>>>> +++ b/include/exec/exec-all.h
>>>> @@ -213,6 +213,8 @@ struct TranslationBlock {
>>>>  #define CF_USE_ICOUNT  0x20000
>>>>  #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */
>>>>  
>>>> +    uint16_t invalid;
>>> Why not "int"?
>> There's a hole there, we may want to move something else so I
>> used a smaller data type.  Even uint8_t would do.
> 
> But could simple "bool" work as well here?

sizeof(bool) is sometimes 1 sometimes 4.  Since in the future we might
want to pack TranslationBlock for better locality, I thought it was
better to stick with uint*_t.

Paolo

> 
>>
>> Paolo
>>>> +
>>>>      void *tc_ptr;    /* pointer to the translated code */
>>>>      uint8_t *tc_search;  /* pointer to search data */
> 
> Are you sure that the hole is over there, not here?
> 
> Kind regards,
> Sergey
> 
>>>>      /* original tb when cflags has CF_NOCACHE */
>>>>
>>> Thanks,
>>> Sergey
>>>
> 
> 
>
Paolo Bonzini July 21, 2016, 11:25 a.m. UTC | #5
----- Original Message -----
> From: "Sergey Fedorov" <serge.fdrv@gmail.com>
> To: "Paolo Bonzini" <pbonzini@redhat.com>
> Cc: qemu-devel@nongnu.org, "sergey fedorov" <sergey.fedorov@linaro.org>, "alex bennee" <alex.bennee@linaro.org>
> Sent: Thursday, July 21, 2016 10:36:35 AM
> Subject: Re: [PATCH 05/10] tcg: Prepare TB invalidation for lockless TB lookup
> 
> On 20/07/16 01:27, Paolo Bonzini wrote:
> >
> > ----- Original Message -----
> >> From: "Sergey Fedorov" <serge.fdrv@gmail.com>
> >> To: "Paolo Bonzini" <pbonzini@redhat.com>, qemu-devel@nongnu.org
> >> Cc: "sergey fedorov" <sergey.fedorov@linaro.org>, "alex bennee"
> >> <alex.bennee@linaro.org>
> >> Sent: Tuesday, July 19, 2016 9:56:49 PM
> >> Subject: Re: [PATCH 05/10] tcg: Prepare TB invalidation for lockless TB
> >> lookup
> >>
> >> On 19/07/16 11:32, Paolo Bonzini wrote:
> >> It looks much better now :)
> >>
> >>> When invalidating a translation block, set an invalid flag into the
> >>> TranslationBlock structure first.  It is also necessary to check whether
> >>> the target TB is still valid after acquiring 'tb_lock' but before calling
> >>> tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
> >>> future. Note that we don't have to check 'last_tb'; an already
> >>> invalidated
> >>> TB will not be executed anyway and it is thus safe to patch it.
> >>>
> >>> Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
> >>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >>> ---
> >>>  cpu-exec.c              | 5 +++--
> >>>  include/exec/exec-all.h | 2 ++
> >>>  translate-all.c         | 3 +++
> >>>  3 files changed, 8 insertions(+), 2 deletions(-)
> >> (snip)
> >>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> >>> index acda7b6..bc0bcc5 100644
> >>> --- a/include/exec/exec-all.h
> >>> +++ b/include/exec/exec-all.h
> >>> @@ -213,6 +213,8 @@ struct TranslationBlock {
> >>>  #define CF_USE_ICOUNT  0x20000
> >>>  #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */
> >>>  
> >>> +    uint16_t invalid;
> >> Why not "int"?
> > There's a hole there, we may want to move something else so I
> > used a smaller data type.  Even uint8_t would do.
> 
> But could simple "bool" work as well here?
> 
> >
> > Paolo
> >>> +
> >>>      void *tc_ptr;    /* pointer to the translated code */
> >>>      uint8_t *tc_search;  /* pointer to search data */
> 
> Are you sure that the hole is over there, not here?

Yes, all pointers have the same size.  For 32-bit hosts, my
patch introduces a 2-byte hole.  For 64-bit hosts, it reduces
a 4-byte hole to 2-byte.

Before:

    target_ulong pc;      /* 0 */ 
    target_ulong cs_base; /* 4 */ 
    uint32_t flags;       /* 8  */
    uint16_t size;        /* 12 */
    uint16_t icount;      /* 14 */
    uint32_t cflags;      /* 16 */
    /* 4 byte padding     ** 20 on 64-bit systems */
    void *tc_ptr;         /* 24 on 64-bit systems, 20 on 32-bit */

After:

    target_ulong pc;      /* 0 */ 
    target_ulong cs_base; /* 4 */ 
    uint32_t flags;       /* 8  */
    uint16_t size;        /* 12 */
    uint16_t icount;      /* 14 */
    uint32_t cflags;      /* 16 */
    uint16_t invalid;     /* 20 */
    /* 2 byte padding     ** 22 */
    void *tc_ptr;         /* 24 */

BTW, another reason to use uint16_t is that I suspect tb->icount can
be made redundant with cflags, so we might move tb->invalid up if
tb->icount is removed.

Thanks,

Paolo
Sergey Fedorov July 21, 2016, 5:42 p.m. UTC | #6
On 21/07/16 14:25, Paolo Bonzini wrote:
>
> ----- Original Message -----
>> From: "Sergey Fedorov" <serge.fdrv@gmail.com>
>> To: "Paolo Bonzini" <pbonzini@redhat.com>
>> Cc: qemu-devel@nongnu.org, "sergey fedorov" <sergey.fedorov@linaro.org>, "alex bennee" <alex.bennee@linaro.org>
>> Sent: Thursday, July 21, 2016 10:36:35 AM
>> Subject: Re: [PATCH 05/10] tcg: Prepare TB invalidation for lockless TB lookup
>>
>> On 20/07/16 01:27, Paolo Bonzini wrote:
>>> ----- Original Message -----
>>>> From: "Sergey Fedorov" <serge.fdrv@gmail.com>
>>>> To: "Paolo Bonzini" <pbonzini@redhat.com>, qemu-devel@nongnu.org
>>>> Cc: "sergey fedorov" <sergey.fedorov@linaro.org>, "alex bennee"
>>>> <alex.bennee@linaro.org>
>>>> Sent: Tuesday, July 19, 2016 9:56:49 PM
>>>> Subject: Re: [PATCH 05/10] tcg: Prepare TB invalidation for lockless TB
>>>> lookup
>>>>
>>>> On 19/07/16 11:32, Paolo Bonzini wrote:
>>>> It looks much better now :)
>>>>
>>>>> When invalidating a translation block, set an invalid flag into the
>>>>> TranslationBlock structure first.  It is also necessary to check whether
>>>>> the target TB is still valid after acquiring 'tb_lock' but before calling
>>>>> tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
>>>>> future. Note that we don't have to check 'last_tb'; an already
>>>>> invalidated
>>>>> TB will not be executed anyway and it is thus safe to patch it.
>>>>>
>>>>> Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
>>>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>>> ---
>>>>>  cpu-exec.c              | 5 +++--
>>>>>  include/exec/exec-all.h | 2 ++
>>>>>  translate-all.c         | 3 +++
>>>>>  3 files changed, 8 insertions(+), 2 deletions(-)
>>>> (snip)
>>>>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>>>>> index acda7b6..bc0bcc5 100644
>>>>> --- a/include/exec/exec-all.h
>>>>> +++ b/include/exec/exec-all.h
>>>>> @@ -213,6 +213,8 @@ struct TranslationBlock {
>>>>>  #define CF_USE_ICOUNT  0x20000
>>>>>  #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */
>>>>>  
>>>>> +    uint16_t invalid;
>>>> Why not "int"?
>>> There's a hole there, we may want to move something else so I
>>> used a smaller data type.  Even uint8_t would do.
>> But could simple "bool" work as well here?
>>
>>> Paolo
>>>>> +
>>>>>      void *tc_ptr;    /* pointer to the translated code */
>>>>>      uint8_t *tc_search;  /* pointer to search data */
>> Are you sure that the hole is over there, not here?
> Yes, all pointers have the same size.  For 32-bit hosts, my
> patch introduces a 2-byte hole.  For 64-bit hosts, it reduces
> a 4-byte hole to 2-byte.
>
> Before:
>
>     target_ulong pc;      /* 0 */ 
>     target_ulong cs_base; /* 4 */ 
>     uint32_t flags;       /* 8  */
>     uint16_t size;        /* 12 */
>     uint16_t icount;      /* 14 */
>     uint32_t cflags;      /* 16 */
>     /* 4 byte padding     ** 20 on 64-bit systems */
>     void *tc_ptr;         /* 24 on 64-bit systems, 20 on 32-bit */
>
> After:
>
>     target_ulong pc;      /* 0 */ 
>     target_ulong cs_base; /* 4 */ 
>     uint32_t flags;       /* 8  */
>     uint16_t size;        /* 12 */
>     uint16_t icount;      /* 14 */
>     uint32_t cflags;      /* 16 */
>     uint16_t invalid;     /* 20 */
>     /* 2 byte padding     ** 22 */
>     void *tc_ptr;         /* 24 */
>
> BTW, another reason to use uint16_t is that I suspect tb->icount can
> be made redundant with cflags, so we might move tb->invalid up if
> tb->icount is removed.

Fair enough. I think it might make sense to use uint8_t to make a hint
for possible future compaction of TranslationBlock structure.

Thanks,
Sergey
diff mbox

Patch

diff --git a/cpu-exec.c b/cpu-exec.c
index 877ff8e..cdaab1d 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -241,7 +241,8 @@  static bool tb_cmp(const void *p, const void *d)
     if (tb->pc == desc->pc &&
         tb->page_addr[0] == desc->phys_page1 &&
         tb->cs_base == desc->cs_base &&
-        tb->flags == desc->flags) {
+        tb->flags == desc->flags &&
+        !atomic_read(&tb->invalid)) {
         /* check next page if needed */
         if (tb->page_addr[1] == -1) {
             return true;
@@ -352,7 +353,7 @@  static inline TranslationBlock *tb_find_fast(CPUState *cpu,
         /* Check if translation buffer has been flushed */
         if (cpu->tb_flushed) {
             cpu->tb_flushed = false;
-        } else {
+        } else if (!tb->invalid) {
             tb_add_jump(last_tb, tb_exit, tb);
         }
     }
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index acda7b6..bc0bcc5 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -213,6 +213,8 @@  struct TranslationBlock {
 #define CF_USE_ICOUNT  0x20000
 #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */
 
+    uint16_t invalid;
+
     void *tc_ptr;    /* pointer to the translated code */
     uint8_t *tc_search;  /* pointer to search data */
     /* original tb when cflags has CF_NOCACHE */
diff --git a/translate-all.c b/translate-all.c
index 788fed1..eaa1232 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -773,6 +773,7 @@  static TranslationBlock *tb_alloc(target_ulong pc)
     tb = &tcg_ctx.tb_ctx.tbs[tcg_ctx.tb_ctx.nb_tbs++];
     tb->pc = pc;
     tb->cflags = 0;
+    tb->invalid = false;
     return tb;
 }
 
@@ -991,6 +992,8 @@  void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
     uint32_t h;
     tb_page_addr_t phys_pc;
 
+    atomic_set(&tb->invalid, true);
+
     /* remove the TB from the hash list */
     phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
     h = tb_hash_func(phys_pc, tb->pc, tb->flags);