Patchwork Re: Regression: segfault on ARM host

login
register
mail settings
Submitter Paolo Bonzini
Date March 1, 2010, 10:33 p.m.
Message ID <4B8C40B3.6030008@redhat.com>
Download mbox | patch
Permalink /patch/46615/
State New
Headers show

Comments

Paolo Bonzini - March 1, 2010, 10:33 p.m.
On 03/01/2010 10:33 PM, Aurelien Jarno wrote:
> While trying to implement setcond on TCG ARM, I have discovered it does
> not work anymore. I have bisected this regression to:
>
> commit 6113d6d3169393c323ac4c82d756a850145a5e7a
> Author: Paolo Bonzini<pbonzini@redhat.com>
> Date:   Fri Jan 15 09:42:09 2010 +0100
>
>      change while to if
>
>      The while loop will be executed exactly 0 or 1 times, depending on
>      env->exit_request.
>
>      Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>      Signed-off-by: Anthony Liguori<aliguori@us.ibm.com>
>
> The assertion is actually triggered. When the next patch removing the
> assertion is also applied it segfaults instead.

Looks like a race.  The only piece of logic that is changed by that 
commit is reverted in the attached patch, can you try it?  If it passes, 
I can resubmit with S-o-b.

If it doesn't pass, I wonder whether the while loop was there to trick 
the compiler into not optimizing something.  Seems a bit too clever though.

Paolo
Aurelien Jarno - March 2, 2010, 12:05 a.m.
On Mon, Mar 01, 2010 at 11:33:23PM +0100, Paolo Bonzini wrote:
> On 03/01/2010 10:33 PM, Aurelien Jarno wrote:
>> While trying to implement setcond on TCG ARM, I have discovered it does
>> not work anymore. I have bisected this regression to:
>>
>> commit 6113d6d3169393c323ac4c82d756a850145a5e7a
>> Author: Paolo Bonzini<pbonzini@redhat.com>
>> Date:   Fri Jan 15 09:42:09 2010 +0100
>>
>>      change while to if
>>
>>      The while loop will be executed exactly 0 or 1 times, depending on
>>      env->exit_request.
>>
>>      Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>>      Signed-off-by: Anthony Liguori<aliguori@us.ibm.com>
>>
>> The assertion is actually triggered. When the next patch removing the
>> assertion is also applied it segfaults instead.
>
> Looks like a race.  The only piece of logic that is changed by that  
> commit is reverted in the attached patch, can you try it?  If it passes,  
> I can resubmit with S-o-b.

Unfortunately it doesn't work.

> If it doesn't pass, I wonder whether the while loop was there to trick  
> the compiler into not optimizing something.  Seems a bit too clever 
> though.
>

It looks like it is the case. Just replacing the if by a while in your
patch make it working again. But I do wonder what this trick is actually
preventing, as there is probably a better way to prevent that.
Aurelien Jarno - March 5, 2010, 8:03 p.m.
On Tue, Mar 02, 2010 at 01:05:29AM +0100, Aurelien Jarno wrote:
> On Mon, Mar 01, 2010 at 11:33:23PM +0100, Paolo Bonzini wrote:
> > On 03/01/2010 10:33 PM, Aurelien Jarno wrote:
> >> While trying to implement setcond on TCG ARM, I have discovered it does
> >> not work anymore. I have bisected this regression to:
> >>
> >> commit 6113d6d3169393c323ac4c82d756a850145a5e7a
> >> Author: Paolo Bonzini<pbonzini@redhat.com>
> >> Date:   Fri Jan 15 09:42:09 2010 +0100
> >>
> >>      change while to if
> >>
> >>      The while loop will be executed exactly 0 or 1 times, depending on
> >>      env->exit_request.
> >>
> >>      Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
> >>      Signed-off-by: Anthony Liguori<aliguori@us.ibm.com>
> >>
> >> The assertion is actually triggered. When the next patch removing the
> >> assertion is also applied it segfaults instead.
> >
> > Looks like a race.  The only piece of logic that is changed by that  
> > commit is reverted in the attached patch, can you try it?  If it passes,  
> > I can resubmit with S-o-b.
> 
> Unfortunately it doesn't work.
> 

The bug is actually in the ARM backend, I have just send a patch on the
mailing list to fix it.

Patch

diff --git a/cpu-exec.c b/cpu-exec.c
index 5d6dd51..61b1c59 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -602,9 +602,15 @@  int cpu_exec(CPUState *env1)
                 /* cpu_interrupt might be called while translating the
                    TB, but before it is linked into a potentially
                    infinite loop and becomes env->current_tb. Avoid
-                   starting execution if there is a pending interrupt. */
-                if (!unlikely (env->exit_request)) {
-                    env->current_tb = tb;
+                   starting execution if there is a pending interrupt. 
+		   Doing it this way is necessary to avoid races with
+		   cpu_unlink_tb (called by cpu_exit).  */
+                env->current_tb = tb;
+                if (unlikely (env->exit_request)) {
+                    env->current_tb = NULL;
+		}
+
+		if (likely (env->current_tb)) {
                     tc_ptr = tb->tc_ptr;
                 /* execute the generated code */
 #if defined(__sparc__) && !defined(CONFIG_SOLARIS)