Patchwork ahci: crash after duplicate bh registration

login
register
mail settings
Submitter Jan Kiszka
Date May 8, 2011, 7:10 p.m.
Message ID <4DC6EAB5.4040607@web.de>
Download mbox | patch
Permalink /patch/94601/
State New
Headers show

Comments

Jan Kiszka - May 8, 2011, 7:10 p.m.
Hi Alex,

I've seen crashes caused by ahci_check_cmd_bh unregistering a NULL bh.
It looks like ahci_dma_set_inactive can a called while there is already
a bh hanging around. Patch below cures the issue, but I have no clue if
such an invocation order is valid at all.

Jan

---
Alexander Graf - May 9, 2011, 2:12 p.m.
On 05/08/2011 09:10 PM, Jan Kiszka wrote:
> Hi Alex,
>
> I've seen crashes caused by ahci_check_cmd_bh unregistering a NULL bh.
> It looks like ahci_dma_set_inactive can a called while there is already
> a bh hanging around. Patch below cures the issue, but I have no clue if
> such an invocation order is valid at all.

It's certainly guest triggerable, so yes, let's check here.

Acked-by: Alexander Graf <agraf@suse.de>


Alex

> Jan
>
> ---
>
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index e2ed2ad..7870030 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -1066,9 +1066,11 @@ static int ahci_dma_set_inactive(IDEDMA *dma)
>
>       ad->dma_cb = NULL;
>
> -    /* maybe we still have something to process, check later */
> -    ad->check_bh = qemu_bh_new(ahci_check_cmd_bh, ad);
> -    qemu_bh_schedule(ad->check_bh);
> +    if (!ad->check_bh) {
> +        /* maybe we still have something to process, check later */
> +        ad->check_bh = qemu_bh_new(ahci_check_cmd_bh, ad);
> +        qemu_bh_schedule(ad->check_bh);
> +    }
>
>       return 0;
>   }
>
Kevin Wolf - May 9, 2011, 2:26 p.m.
Am 09.05.2011 16:12, schrieb Alexander Graf:
> On 05/08/2011 09:10 PM, Jan Kiszka wrote:
>> Hi Alex,
>>
>> I've seen crashes caused by ahci_check_cmd_bh unregistering a NULL bh.
>> It looks like ahci_dma_set_inactive can a called while there is already
>> a bh hanging around. Patch below cures the issue, but I have no clue if
>> such an invocation order is valid at all.
> 
> It's certainly guest triggerable, so yes, let's check here.
> 
> Acked-by: Alexander Graf <agraf@suse.de>

Yes, the change makes sense to me. Please resend this as a proper patch,
Jan.

However, I still think Jan's question is valid: Is the AHCI emulation
supposed to run multiple DMA requests at once using the core.c
functions? I'd find it surprising if this actually worked well.

Kevin
Alexander Graf - May 9, 2011, 2:31 p.m.
On 05/09/2011 04:26 PM, Kevin Wolf wrote:
> Am 09.05.2011 16:12, schrieb Alexander Graf:
>> On 05/08/2011 09:10 PM, Jan Kiszka wrote:
>>> Hi Alex,
>>>
>>> I've seen crashes caused by ahci_check_cmd_bh unregistering a NULL bh.
>>> It looks like ahci_dma_set_inactive can a called while there is already
>>> a bh hanging around. Patch below cures the issue, but I have no clue if
>>> such an invocation order is valid at all.
>> It's certainly guest triggerable, so yes, let's check here.
>>
>> Acked-by: Alexander Graf<agraf@suse.de>
> Yes, the change makes sense to me. Please resend this as a proper patch,
> Jan.
>
> However, I still think Jan's question is valid: Is the AHCI emulation
> supposed to run multiple DMA requests at once using the core.c
> functions? I'd find it surprising if this actually worked well.

Not through the IDE core, no. There it can process a queue of IDE 
commands after each other or do NCQ, but that goes a different code 
patch, can do multiple requests at once though.

I'm not sure how this got triggered.


Alex
Jan Kiszka - May 9, 2011, 2:39 p.m.
On 2011-05-09 16:31, Alexander Graf wrote:
> On 05/09/2011 04:26 PM, Kevin Wolf wrote:
>> Am 09.05.2011 16:12, schrieb Alexander Graf:
>>> On 05/08/2011 09:10 PM, Jan Kiszka wrote:
>>>> Hi Alex,
>>>>
>>>> I've seen crashes caused by ahci_check_cmd_bh unregistering a NULL bh.
>>>> It looks like ahci_dma_set_inactive can a called while there is already
>>>> a bh hanging around. Patch below cures the issue, but I have no clue if
>>>> such an invocation order is valid at all.
>>> It's certainly guest triggerable, so yes, let's check here.
>>>
>>> Acked-by: Alexander Graf<agraf@suse.de>
>> Yes, the change makes sense to me. Please resend this as a proper patch,
>> Jan.

Will do.

>>
>> However, I still think Jan's question is valid: Is the AHCI emulation
>> supposed to run multiple DMA requests at once using the core.c
>> functions? I'd find it surprising if this actually worked well.
> 
> Not through the IDE core, no. There it can process a queue of IDE
> commands after each other or do NCQ, but that goes a different code
> patch, can do multiple requests at once though.
> 
> I'm not sure how this got triggered.

Forgot to mention: With a hacked-up q35 series. I may have broken
something there, or it was already broken (there are definitely bugs in
that series), so upstream might not expose the problem at all.

Jan

Patch

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index e2ed2ad..7870030 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1066,9 +1066,11 @@  static int ahci_dma_set_inactive(IDEDMA *dma)
 
     ad->dma_cb = NULL;
 
-    /* maybe we still have something to process, check later */
-    ad->check_bh = qemu_bh_new(ahci_check_cmd_bh, ad);
-    qemu_bh_schedule(ad->check_bh);
+    if (!ad->check_bh) {
+        /* maybe we still have something to process, check later */
+        ad->check_bh = qemu_bh_new(ahci_check_cmd_bh, ad);
+        qemu_bh_schedule(ad->check_bh);
+    }
 
     return 0;
 }