Patchwork [2/3] powerpc/kvm: fix rare but potential deadlock scene

Submitter Liu Ping Fan
Date Nov. 5, 2013, 7:42 a.m.
Message ID <1383637364-14691-2-git-send-email-pingfank@linux.vnet.ibm.com>
Permalink /patch/288400/
State New

Comments

Liu Ping Fan - Nov. 5, 2013, 7:42 a.m.
Since kvmppc_hv_find_lock_hpte() is called from both virtmode and
realmode, it can trigger a deadlock.

Suppose the following scenario:

Two physical CPUs, cpuM and cpuN, and two VM instances, A and B, each with a
group of vcpus.

If on cpuM, vcpu_A_1 holds bitlock X (HPTE_V_HVLOCK) and is then switched out,
and on cpuN, vcpu_A_2 tries to lock X in realmode, then cpuN will be stuck in
realmode for a long time.

Things get even worse if the following happens:
  On cpuM, bitlock X is held; on cpuN, bitlock Y is held.
  vcpu_B_2 tries to lock Y on cpuM in realmode
  vcpu_A_2 tries to lock X on cpuN in realmode

Oops! A deadlock happens.

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 ++
 1 file changed, 2 insertions(+)
Paul Mackerras - Nov. 6, 2013, 5:04 a.m.
On Tue, Nov 05, 2013 at 03:42:43PM +0800, Liu Ping Fan wrote:
> Since kvmppc_hv_find_lock_hpte() is called from both virtmode and
> realmode, so it can trigger the deadlock.

Good catch, we should have preemption disabled while ever we have a
HPTE locked.

> @@ -474,8 +474,10 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
>  	}
>  
>  	/* Find the HPTE in the hash table */
> +	preempt_disable();
>  	index = kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
>  					 HPTE_V_VALID | HPTE_V_ABSENT);
> +	preempt_enable();

Which means we need to add the preempt_enable after unlocking the
HPTE, not here.

Regards,
Paul.
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Ping Fan - Nov. 6, 2013, 6:02 a.m.
On Wed, Nov 6, 2013 at 1:04 PM, Paul Mackerras <paulus@samba.org> wrote:
> On Tue, Nov 05, 2013 at 03:42:43PM +0800, Liu Ping Fan wrote:
>> Since kvmppc_hv_find_lock_hpte() is called from both virtmode and
>> realmode, so it can trigger the deadlock.
>
> Good catch, we should have preemption disabled while ever we have a
> HPTE locked.
>
>> @@ -474,8 +474,10 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
>>       }
>>
>>       /* Find the HPTE in the hash table */
>> +     preempt_disable();
>>       index = kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
>>                                        HPTE_V_VALID | HPTE_V_ABSENT);
>> +     preempt_enable();
>
> Which means we need to add the preempt_enable after unlocking the
> HPTE, not here.
>
Yes. Sorry, but I am not sure whether we can call
preempt_disable/enable() in realmode. I think that since thread_info is
allocated at a linear address, we can use preempt_disable/enable()
inside kvmppc_hv_find_lock_hpte(), right?

Thanks and regards,
Pingfan

> Regards,
> Paul.
Paul Mackerras - Nov. 6, 2013, 11:18 a.m.
On Wed, Nov 06, 2013 at 02:02:07PM +0800, Liu ping fan wrote:
> On Wed, Nov 6, 2013 at 1:04 PM, Paul Mackerras <paulus@samba.org> wrote:
> > On Tue, Nov 05, 2013 at 03:42:43PM +0800, Liu Ping Fan wrote:
> >> Since kvmppc_hv_find_lock_hpte() is called from both virtmode and
> >> realmode, so it can trigger the deadlock.
> >
> > Good catch, we should have preemption disabled while ever we have a
> > HPTE locked.
> >
> >> @@ -474,8 +474,10 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
> >>       }
> >>
> >>       /* Find the HPTE in the hash table */
> >> +     preempt_disable();
> >>       index = kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
> >>                                        HPTE_V_VALID | HPTE_V_ABSENT);
> >> +     preempt_enable();
> >
> > Which means we need to add the preempt_enable after unlocking the
> > HPTE, not here.
> >
> Yes. Sorry, but I am not sure about whether we can call
> preempt_disable/enable() in realmode. I think since thread_info is
> allocated with linear address, so we can use preempt_disable/enable()
> inside kvmppc_hv_find_lock_hpte(), right?

Your analysis correctly pointed out that we can get a deadlock if we
can be preempted while holding a lock on a HPTE.  That means that we
have to disable preemption before taking an HPTE lock and keep it
disabled until after we unlock the HPTE.  Since the point of
kvmppc_hv_find_lock_hpte() is to lock the HPTE and return with it
locked, we can't have the preempt_enable() inside it.  The
preempt_enable() has to come after we have unlocked the HPTE.  That is
also why we can't have the preempt_enable() where your patch put it;
it needs to be about 9 lines further down, after the statement
hptep[0] = v.  (We also need to make sure to re-enable preemption in
the index < 0 case.)
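
A rough userspace sketch of that ordering, not the actual kernel change:
preempt_disable()/preempt_enable() are stubbed with a plain counter,
find_lock_hpte() and xlate_model() are hypothetical stand-ins for
kvmppc_hv_find_lock_hpte() and its caller, and the flag values, the
two-word hpte[] array, and the -1 error return are assumptions made for
the model.

```c
#include <assert.h>

/* Flag values assumed for this model. */
#define HPTE_V_VALID  0x1UL
#define HPTE_V_HVLOCK 0x40UL

/* Stub: a plain counter stands in for the kernel's preemption count. */
static int preempt_count;
static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { preempt_count--; }

/* One model HPTE: word 0 carries the flags and the lock bit. */
static unsigned long hpte[2] = { HPTE_V_VALID, 0 };

/*
 * Hypothetical stand-in for kvmppc_hv_find_lock_hpte(): on a hit it
 * returns index 0 with HPTE_V_HVLOCK set, on a miss it returns -1 and
 * leaves nothing locked.
 */
static long find_lock_hpte(int present)
{
	/* The lock must only ever be taken with preemption disabled. */
	assert(preempt_count > 0);
	if (!present)
		return -1;
	hpte[0] |= HPTE_V_HVLOCK;
	return 0;
}

/* Model of the caller: preemption stays off for the whole locked region. */
int xlate_model(int present, unsigned long *vp)
{
	unsigned long v;
	long index;

	preempt_disable();
	index = find_lock_hpte(present);
	if (index < 0) {
		preempt_enable();	/* miss: nothing locked, re-enable and bail */
		return -1;
	}
	v = hpte[0] & ~HPTE_V_HVLOCK;
	hpte[0] = v;			/* this store is what unlocks the HPTE */
	preempt_enable();		/* only now is it safe to be preempted */
	*vp = v;
	return 0;
}
```

In this model, both the hit path and the miss path of xlate_model() leave
preempt_count balanced at 0, and the lock bit is never set while the
count is 0.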

Regards,
Paul.
Liu Ping Fan - Nov. 7, 2013, 2:36 a.m.
On Wed, Nov 6, 2013 at 7:18 PM, Paul Mackerras <paulus@samba.org> wrote:
> On Wed, Nov 06, 2013 at 02:02:07PM +0800, Liu ping fan wrote:
>> On Wed, Nov 6, 2013 at 1:04 PM, Paul Mackerras <paulus@samba.org> wrote:
>> > On Tue, Nov 05, 2013 at 03:42:43PM +0800, Liu Ping Fan wrote:
>> >> Since kvmppc_hv_find_lock_hpte() is called from both virtmode and
>> >> realmode, so it can trigger the deadlock.
>> >
>> > Good catch, we should have preemption disabled while ever we have a
>> > HPTE locked.
>> >
>> >> @@ -474,8 +474,10 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
>> >>       }
>> >>
>> >>       /* Find the HPTE in the hash table */
>> >> +     preempt_disable();
>> >>       index = kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
>> >>                                        HPTE_V_VALID | HPTE_V_ABSENT);
>> >> +     preempt_enable();
>> >
>> > Which means we need to add the preempt_enable after unlocking the
>> > HPTE, not here.
>> >
>> Yes. Sorry, but I am not sure about whether we can call
>> preempt_disable/enable() in realmode. I think since thread_info is
>> allocated with linear address, so we can use preempt_disable/enable()
>> inside kvmppc_hv_find_lock_hpte(), right?
>
> Your analysis correctly pointed out that we can get a deadlock if we
> can be preempted while holding a lock on a HPTE.  That means that we
> have to disable preemption before taking an HPTE lock and keep it
> disabled until after we unlock the HPTE.  Since the point of
> kvmppc_hv_find_lock_hpte() is to lock the HPTE and return with it
> locked, we can't have the preempt_enable() inside it.  The
> preempt_enable() has to come after we have unlocked the HPTE.  That is
> also why we can't have the preempt_enable() where your patch put it;
> it needs to be about 9 lines further down, after the statement
> hptep[0] = v.  (We also need to make sure to re-enable preemption in
> the index < 0 case.)
>
Oh, yes, I will fix it as you said. My attention was drawn to the
trick of calling a kernel function in realmode, and I missed the exact
point where the lock is released.

Thanks and regards,
Pingfan

Patch

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 043eec8..28160ac 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -474,8 +474,10 @@  static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
 	}
 
 	/* Find the HPTE in the hash table */
+	preempt_disable();
 	index = kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
 					 HPTE_V_VALID | HPTE_V_ABSENT);
+	preempt_enable();
 	if (index < 0)
 		return -ENOENT;
 	hptep = (unsigned long *)(kvm->arch.hpt_virt + (index << 4));