diff mbox

[v2,19/24] block: Allow wait_serialising_requests() at any point

Message ID 1386940979-3824-20-git-send-email-kwolf@redhat.com
State New
Headers show

Commit Message

Kevin Wolf Dec. 13, 2013, 1:22 p.m. UTC
We can only have a single wait_serialising_requests() call per request
because otherwise we can run into deadlocks where requests are waiting
for each other. The same is true when wait_serialising_requests() is not
at the very beginning of a request, so that other requests can be issued
between the start of the tracking and wait_serialising_requests().

Fix this by changing wait_serialising_requests() to ignore requests that
are already (directly or indirectly) waiting for the calling request.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c                   | 26 +++++++++++++++++++++++---
 include/block/block_int.h |  2 ++
 2 files changed, 25 insertions(+), 3 deletions(-)

Comments

Wayne Xia Dec. 27, 2013, 4:17 a.m. UTC | #1
于 2013/12/13 21:22, Kevin Wolf 写道:
> We can only have a single wait_serialising_requests() call per request
> because otherwise we can run into deadlocks where requests are waiting
> for each other.
  do you mean:
mark_request_serialising(req)
...
wait_serialising_requests(req);
...
wait_serialising_requests(req);

 will have deadlock? I thought it is already resolved by patch 15?
Maybe here is another deadlock reason?

 The same is true when wait_serialising_requests() is not
> at the very beginning of a request, so that other requests can be issued
> between the start of the tracking and wait_serialising_requests().
Max Reitz Jan. 11, 2014, 10:29 p.m. UTC | #2
On 13.12.2013 14:22, Kevin Wolf wrote:
> We can only have a single wait_serialising_requests() call per request
> because otherwise we can run into deadlocks where requests are waiting
> for each other. The same is true when wait_serialising_requests() is not
> at the very beginning of a request, so that other requests can be issued
> between the start of the tracking and wait_serialising_requests().
>
> Fix this by changing wait_serialising_requests() to ignore requests that
> are already (directly or indirectly) waiting for the calling request.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block.c                   | 26 +++++++++++++++++++++++---
>   include/block/block_int.h |  2 ++
>   2 files changed, 25 insertions(+), 3 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>
Kevin Wolf Jan. 13, 2014, 11:29 a.m. UTC | #3
Am 27.12.2013 um 05:17 hat Wenchao Xia geschrieben:
> 于 2013/12/13 21:22, Kevin Wolf 写道:
> > We can only have a single wait_serialising_requests() call per request
> > because otherwise we can run into deadlocks where requests are waiting
> > for each other.
>   do you mean:
> mark_request_serialising(req)
> ...
> wait_serialising_requests(req);
> ...
> wait_serialising_requests(req);
> 
>  will have deadlock?

Yes, it can deadlock (it doesn't have to, it depends on whether another
overlapping request is started concurrently). More precisely, the
problematic pattern is:

    mark_request_serialising(req);
    ...
    qemu_coroutine_yield(); /* Other requests may be issued now */
    ...
    wait_serialising_requests(req);

What you mentioned above is a special case of this.

>  I thought it is already resolved by patch 15?
> Maybe here is another deadlock reason?

The problematic pattern in patch 15 was:

    mark_request_serialising(req);
    ... /* no yield here */
    wait_serialising_requests(req);

As opposed to the originally used:

    wait_serialising_requests(req);
    ... /* no yield here */
    mark_request_serialising(req);

Kevin
Wayne Xia Jan. 14, 2014, 3:13 a.m. UTC | #4
于 2014/1/13 19:29, Kevin Wolf 写道:
> Am 27.12.2013 um 05:17 hat Wenchao Xia geschrieben:
>> 于 2013/12/13 21:22, Kevin Wolf 写道:
>>> We can only have a single wait_serialising_requests() call per request
>>> because otherwise we can run into deadlocks where requests are waiting
>>> for each other.
>>    do you mean:
>> mark_request_serialising(req)
>> ...
>> wait_serialising_requests(req);
>> ...
>> wait_serialising_requests(req);
>>
>>   will have deadlock?
>
> Yes, it can deadlock (it doesn't have to, it depends on whether another
> overlapping request is started concurrently). More precisely, the
> problematic pattern is:
>
>      mark_request_serialising(req);
>      ...
>      qemu_coroutine_yield(); /* Other requests may be issued now */
>      ...
>      wait_serialising_requests(req);
>
> What you mentioned above is a special case of this.
>

   It seems when two overlapping request exist in the qeueue, and
both are waiting, problem happens. Thanks for explanation, I am fine
with this patch.


>>   I thought it is already resolved by patch 15?
>> Maybe here is another deadlock reason?
>
> The problematic pattern in patch 15 was:
>
>      mark_request_serialising(req);
>      ... /* no yield here */
>      wait_serialising_requests(req);
>
> As opposed to the originally used:
>
>      wait_serialising_requests(req);
>      ... /* no yield here */
>      mark_request_serialising(req);
>
> Kevin
>
diff mbox

Patch

diff --git a/block.c b/block.c
index 73bba47..8a0225d 100644
--- a/block.c
+++ b/block.c
@@ -2123,6 +2123,20 @@  static bool tracked_request_overlaps(BdrvTrackedRequest *req,
     return true;
 }
 
+static bool coroutine_fn is_waiting_for(BdrvTrackedRequest *waiting,
+                                        BdrvTrackedRequest *waited_for)
+{
+    BdrvTrackedRequest *req;
+
+    for (req = waiting; req != NULL; req = req->waiting_for) {
+        if (req == waited_for) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
 static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
 {
     BlockDriverState *bs = self->bs;
@@ -2148,9 +2162,15 @@  static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
                  */
                 assert(qemu_coroutine_self() != req->co);
 
-                qemu_co_queue_wait(&req->wait_queue);
-                retry = true;
-                break;
+                /* If the request is already (indircetly) waiting for us, then
+                 * just go on instead of producing a deadlock. */
+                if (!is_waiting_for(req, self)) {
+                    self->waiting_for = req;
+                    qemu_co_queue_wait(&req->wait_queue);
+                    self->waiting_for = NULL;
+                    retry = true;
+                    break;
+                }
             }
         }
     } while (retry);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1aee02b..0f7fcef 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -68,6 +68,8 @@  typedef struct BdrvTrackedRequest {
     QLIST_ENTRY(BdrvTrackedRequest) list;
     Coroutine *co; /* owner, used for deadlock detection */
     CoQueue wait_queue; /* coroutines blocked on this request */
+
+    struct BdrvTrackedRequest *waiting_for;
 } BdrvTrackedRequest;
 
 struct BlockDriver {