diff mbox

[v3,21/29] block: Assert serialisation assumptions in pwritev

Message ID 1389968119-24771-22-git-send-email-kwolf@redhat.com
State New
Headers show

Commit Message

Kevin Wolf Jan. 17, 2014, 2:15 p.m. UTC
If a request calls wait_serialising_requests() and actually has to wait
in this function (i.e. a coroutine yield), other requests can run and
previously read data (like the head or tail buffer) could become
outdated. In this case, we would have to restart from the beginning to
read in the updated data.

However, we're lucky and don't actually need to do that: A request can
only wait in the first call of wait_serialising_requests() because we
mark it as serialising before that call, so any later requests would
wait. So as we don't wait in practice, we don't have to reload the data.

This is an important assumption that may not be broken or data
corruption will happen. Document it with some assertions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

Comments

Max Reitz Jan. 17, 2014, 11:42 p.m. UTC | #1
On 17.01.2014 15:15, Kevin Wolf wrote:
> If a request calls wait_serialising_requests() and actually has to wait
> in this function (i.e. a coroutine yield), other requests can run and
> previously read data (like the head or tail buffer) could become
> outdated. In this case, we would have to restart from the beginning to
> read in the updated data.
>
> However, we're lucky and don't actually need to do that: A request can
> only wait in the first call of wait_serialising_requests() because we
> mark it as serialising before that call, so any later requests would
> wait. So as we don't wait in practice, we don't have to reload the data.
>
> This is an important assumption that may not be broken or data
> corruption will happen. Document it with some assertions.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block.c | 16 ++++++++++++----
>   1 file changed, 12 insertions(+), 4 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>
Benoît Canet Jan. 24, 2014, 4:09 p.m. UTC | #2
Le Friday 17 Jan 2014 à 15:15:11 (+0100), Kevin Wolf a écrit :
> If a request calls wait_serialising_requests() and actually has to wait
> in this function (i.e. a coroutine yield), other requests can run and
> previously read data (like the head or tail buffer) could become
> outdated. In this case, we would have to restart from the beginning to
> read in the updated data.
> 
> However, we're lucky and don't actually need to do that: A request can
> only wait in the first call of wait_serialising_requests() because we
> mark it as serialising before that call, so any later requests would
> wait. So as we don't wait in practice, we don't have to reload the data.
> 
> This is an important assumption that may not be broken or data
> corruption will happen. Document it with some assertions.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 859e1aa..53d9bd5 100644
> --- a/block.c
> +++ b/block.c
> @@ -2123,14 +2123,15 @@ static bool tracked_request_overlaps(BdrvTrackedRequest *req,
>      return true;
>  }
>  
> -static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
> +static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
>  {
>      BlockDriverState *bs = self->bs;
>      BdrvTrackedRequest *req;
>      bool retry;
> +    bool waited = false;
>  
>      if (!bs->serialising_in_flight) {
> -        return;
> +        return false;
>      }
>  
>      do {
> @@ -2156,11 +2157,14 @@ static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
>                      qemu_co_queue_wait(&req->wait_queue);
>                      self->waiting_for = NULL;
>                      retry = true;
> +                    waited = true;
>                      break;
>                  }
>              }
>          }
>      } while (retry);
> +
> +    return waited;
>  }
>  
>  /*
> @@ -3011,6 +3015,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
>      QEMUIOVector *qiov, int flags)
>  {
>      BlockDriver *drv = bs->drv;
> +    bool waited;
>      int ret;
>  
>      int64_t sector_num = offset >> BDRV_SECTOR_BITS;
> @@ -3019,7 +3024,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
>      assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
>      assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
>  
> -    wait_serialising_requests(req);
> +    waited = wait_serialising_requests(req);
> +    assert(!waited || !req->serialising);

I we apply De Morgan's law to the expression we have:

    assert(!(waited && req->serialising));

Is it really the condition we want ?

Best regards

Benoît

>  
>      ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);
>  
> @@ -3119,9 +3125,11 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
>          QEMUIOVector tail_qiov;
>          struct iovec tail_iov;
>          size_t tail_bytes;
> +        bool waited;
>  
>          mark_request_serialising(&req, align);
> -        wait_serialising_requests(&req);
> +        waited = wait_serialising_requests(&req);
> +        assert(!waited || !use_local_qiov);
>  
>          tail_buf = qemu_blockalign(bs, align);
>          tail_iov = (struct iovec) {
> -- 
> 1.8.1.4
> 
>
Kevin Wolf Jan. 24, 2014, 4:18 p.m. UTC | #3
Am 24.01.2014 um 17:09 hat Benoît Canet geschrieben:
> Le Friday 17 Jan 2014 à 15:15:11 (+0100), Kevin Wolf a écrit :
> > If a request calls wait_serialising_requests() and actually has to wait
> > in this function (i.e. a coroutine yield), other requests can run and
> > previously read data (like the head or tail buffer) could become
> > outdated. In this case, we would have to restart from the beginning to
> > read in the updated data.
> > 
> > However, we're lucky and don't actually need to do that: A request can
> > only wait in the first call of wait_serialising_requests() because we
> > mark it as serialising before that call, so any later requests would
> > wait. So as we don't wait in practice, we don't have to reload the data.
> > 
> > This is an important assumption that may not be broken or data
> > corruption will happen. Document it with some assertions.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block.c | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> > 
> > diff --git a/block.c b/block.c
> > index 859e1aa..53d9bd5 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -2123,14 +2123,15 @@ static bool tracked_request_overlaps(BdrvTrackedRequest *req,
> >      return true;
> >  }
> >  
> > -static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
> > +static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
> >  {
> >      BlockDriverState *bs = self->bs;
> >      BdrvTrackedRequest *req;
> >      bool retry;
> > +    bool waited = false;
> >  
> >      if (!bs->serialising_in_flight) {
> > -        return;
> > +        return false;
> >      }
> >  
> >      do {
> > @@ -2156,11 +2157,14 @@ static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
> >                      qemu_co_queue_wait(&req->wait_queue);
> >                      self->waiting_for = NULL;
> >                      retry = true;
> > +                    waited = true;
> >                      break;
> >                  }
> >              }
> >          }
> >      } while (retry);
> > +
> > +    return waited;
> >  }
> >  
> >  /*
> > @@ -3011,6 +3015,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
> >      QEMUIOVector *qiov, int flags)
> >  {
> >      BlockDriver *drv = bs->drv;
> > +    bool waited;
> >      int ret;
> >  
> >      int64_t sector_num = offset >> BDRV_SECTOR_BITS;
> > @@ -3019,7 +3024,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
> >      assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> >      assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> >  
> > -    wait_serialising_requests(req);
> > +    waited = wait_serialising_requests(req);
> > +    assert(!waited || !req->serialising);
> 
> I we apply De Morgan's law to the expression we have:
> 
>     assert(!(waited && req->serialising));
> 
> Is it really the condition we want ?

Yes. If req->serialising is true here, it's an RMW case and we already
had a mark_request_serialising() + wait_serialising_requests() pair. So
we have already waited for earlier requests and newer requests must be
waiting for us. Therefore, it can't happen that a serialising request
has to wait here.

Kevin
diff mbox

Patch

diff --git a/block.c b/block.c
index 859e1aa..53d9bd5 100644
--- a/block.c
+++ b/block.c
@@ -2123,14 +2123,15 @@  static bool tracked_request_overlaps(BdrvTrackedRequest *req,
     return true;
 }
 
-static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
+static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
 {
     BlockDriverState *bs = self->bs;
     BdrvTrackedRequest *req;
     bool retry;
+    bool waited = false;
 
     if (!bs->serialising_in_flight) {
-        return;
+        return false;
     }
 
     do {
@@ -2156,11 +2157,14 @@  static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
                     qemu_co_queue_wait(&req->wait_queue);
                     self->waiting_for = NULL;
                     retry = true;
+                    waited = true;
                     break;
                 }
             }
         }
     } while (retry);
+
+    return waited;
 }
 
 /*
@@ -3011,6 +3015,7 @@  static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     QEMUIOVector *qiov, int flags)
 {
     BlockDriver *drv = bs->drv;
+    bool waited;
     int ret;
 
     int64_t sector_num = offset >> BDRV_SECTOR_BITS;
@@ -3019,7 +3024,8 @@  static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
     assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
 
-    wait_serialising_requests(req);
+    waited = wait_serialising_requests(req);
+    assert(!waited || !req->serialising);
 
     ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);
 
@@ -3119,9 +3125,11 @@  static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
         QEMUIOVector tail_qiov;
         struct iovec tail_iov;
         size_t tail_bytes;
+        bool waited;
 
         mark_request_serialising(&req, align);
-        wait_serialising_requests(&req);
+        waited = wait_serialising_requests(&req);
+        assert(!waited || !use_local_qiov);
 
         tail_buf = qemu_blockalign(bs, align);
         tail_iov = (struct iovec) {