diff mbox

[2/2] sheepdog: refine discard support

Message ID 1441076590-8015-3-git-send-email-mitake.hitoshi@lab.ntt.co.jp
State New
Headers show

Commit Message

Hitoshi Mitake Sept. 1, 2015, 3:03 a.m. UTC
This patch refines discard support of the sheepdog driver. The
existing discard mechanism was implemented on SD_OP_DISCARD_OBJ, which
was introduced before fine grained reference counting on newer
sheepdog. It doesn't care about relations of snapshots and clones and
discards objects unconditionally.

With this patch, the driver just updates an inode object for updating
reference. Removing the object is done in sheep process side.

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
---
 block/sheepdog.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

Comments

Vasiliy Tolstov Sept. 2, 2015, 12:36 p.m. UTC | #1
2015-09-01 6:03 GMT+03:00 Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>:
> This patch refines discard support of the sheepdog driver. The
> existing discard mechanism was implemented on SD_OP_DISCARD_OBJ, which
> was introduced before fine grained reference counting on newer
> sheepdog. It doesn't care about relations of snapshots and clones and
> discards objects unconditionally.
>
> With this patch, the driver just updates an inode object for updating
> reference. Removing the object is done in sheep process side.
>
> Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
> Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
> Cc: Jeff Cody <jcody@redhat.com>
> Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>


I'm test this patch and now discard working properly and no errors in
sheepdog log file.

Tested-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Hitoshi Mitake Sept. 4, 2015, 8:51 a.m. UTC | #2
On Wed, Sep 2, 2015 at 9:36 PM, Vasiliy Tolstov <v.tolstov@selfip.ru> wrote:
> 2015-09-01 6:03 GMT+03:00 Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>:
>> This patch refines discard support of the sheepdog driver. The
>> existing discard mechanism was implemented on SD_OP_DISCARD_OBJ, which
>> was introduced before fine grained reference counting on newer
>> sheepdog. It doesn't care about relations of snapshots and clones and
>> discards objects unconditionally.
>>
>> With this patch, the driver just updates an inode object for updating
>> reference. Removing the object is done in sheep process side.
>>
>> Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
>> Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
>> Cc: Jeff Cody <jcody@redhat.com>
>> Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
>
>
> I'm test this patch and now discard working properly and no errors in
> sheepdog log file.
>
> Tested-by: Vasiliy Tolstov <v.tolstov@selfip.ru>

On the second thought, this patch has a problem of handling snapshot.
Please drop this one (1st patch is ok to apply).

I'll solve the problem in sheepdog side.

Thanks,
Hitoshi

>
> --
> Vasiliy Tolstov,
> e-mail: v.tolstov@selfip.ru
> --
> sheepdog mailing list
> sheepdog@lists.wpkg.org
> https://lists.wpkg.org/mailman/listinfo/sheepdog
Hitoshi Mitake Sept. 4, 2015, 9:57 a.m. UTC | #3
On Fri, Sep 4, 2015 at 5:51 PM, Hitoshi Mitake <mitake.hitoshi@gmail.com> wrote:
> On Wed, Sep 2, 2015 at 9:36 PM, Vasiliy Tolstov <v.tolstov@selfip.ru> wrote:
>> 2015-09-01 6:03 GMT+03:00 Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>:
>>> This patch refines discard support of the sheepdog driver. The
>>> existing discard mechanism was implemented on SD_OP_DISCARD_OBJ, which
>>> was introduced before fine grained reference counting on newer
>>> sheepdog. It doesn't care about relations of snapshots and clones and
>>> discards objects unconditionally.
>>>
>>> With this patch, the driver just updates an inode object for updating
>>> reference. Removing the object is done in sheep process side.
>>>
>>> Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
>>> Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
>>> Cc: Jeff Cody <jcody@redhat.com>
>>> Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
>>
>>
>> I'm test this patch and now discard working properly and no errors in
>> sheepdog log file.
>>
>> Tested-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
>
> On the second thought, this patch has a problem of handling snapshot.
> Please drop this one (1st patch is ok to apply).
>
> I'll solve the problem in sheepdog side.

On the third thought, this patch can work well ;) Please pick this patch, Jeff.

I considered about a case of interleaving of snapshotting and
discarding requests like below:
1. user invokes dog vdi snapshot, a command for making current VDI
snapshot (updating inode objects)
2. discard request from VM before the request of 1 completes
3. discard completes
4. request of 1 completes

In this case, some data_vdi_id of original inode can be overwritten
before completion of snapshotting. However, this behavior is valid
because dog vdi snapshot doesn't return ack to the user.

Thanks,
Hitoshi

>
> Thanks,
> Hitoshi
>
>>
>> --
>> Vasiliy Tolstov,
>> e-mail: v.tolstov@selfip.ru
>> --
>> sheepdog mailing list
>> sheepdog@lists.wpkg.org
>> https://lists.wpkg.org/mailman/listinfo/sheepdog
Vasiliy Tolstov Sept. 4, 2015, 10:32 a.m. UTC | #4
04 сент. 2015 г. 12:57 пользователь "Hitoshi Mitake" <
mitake.hitoshi@gmail.com> написал:
>
> On Fri, Sep 4, 2015 at 5:51 PM, Hitoshi Mitake <mitake.hitoshi@gmail.com>
wrote:
> > On Wed, Sep 2, 2015 at 9:36 PM, Vasiliy Tolstov <v.tolstov@selfip.ru>
wrote:
> >> 2015-09-01 6:03 GMT+03:00 Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp
>:
> >>> This patch refines discard support of the sheepdog driver. The
> >>> existing discard mechanism was implemented on SD_OP_DISCARD_OBJ, which
> >>> was introduced before fine grained reference counting on newer
> >>> sheepdog. It doesn't care about relations of snapshots and clones and
> >>> discards objects unconditionally.
> >>>
> >>> With this patch, the driver just updates an inode object for updating
> >>> reference. Removing the object is done in sheep process side.
> >>>
> >>> Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
> >>> Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
> >>> Cc: Jeff Cody <jcody@redhat.com>
> >>> Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
> >>
> >>
> >> I'm test this patch and now discard working properly and no errors in
> >> sheepdog log file.
> >>
> >> Tested-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
> >
> > On the second thought, this patch has a problem of handling snapshot.
> > Please drop this one (1st patch is ok to apply).
> >
> > I'll solve the problem in sheepdog side.
>
> On the third thought, this patch can work well ;) Please pick this patch,
Jeff.
>
> I considered about a case of interleaving of snapshotting and
> discarding requests like below:
> 1. user invokes dog vdi snapshot, a command for making current VDI
> snapshot (updating inode objects)
> 2. discard request from VM before the request of 1 completes
> 3. discard completes
> 4. request of 1 completes
>
> In this case, some data_vdi_id of original inode can be overwritten
> before completion of snapshotting. However, this behavior is valid
> because dog vdi snapshot doesn't return ack to the user.
>

This is normal for snapshots, to get consistent data you need to cooperate
with guest system to get freeze it fs before snapshot.
Quemu-ga already have this via sending freeze to fs beforw and thaw after
snapshot .

> Thanks,
> Hitoshi
>
> >
> > Thanks,
> > Hitoshi
> >
> >>
> >> --
> >> Vasiliy Tolstov,
> >> e-mail: v.tolstov@selfip.ru
> >> --
> >> sheepdog mailing list
> >> sheepdog@lists.wpkg.org
> >> https://lists.wpkg.org/mailman/listinfo/sheepdog
diff mbox

Patch

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 24341ea..e3c5cb5 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -28,7 +28,6 @@ 
 #define SD_OP_READ_OBJ       0x02
 #define SD_OP_WRITE_OBJ      0x03
 /* 0x04 is used internally by Sheepdog */
-#define SD_OP_DISCARD_OBJ    0x05
 
 #define SD_OP_NEW_VDI        0x11
 #define SD_OP_LOCK_VDI       0x12
@@ -856,10 +855,6 @@  static void coroutine_fn aio_read_response(void *opaque)
             rsp.result = SD_RES_SUCCESS;
             s->discard_supported = false;
             break;
-        case SD_RES_SUCCESS:
-            idx = data_oid_to_idx(aio_req->oid);
-            s->inode.data_vdi_id[idx] = 0;
-            break;
         default:
             break;
         }
@@ -1174,7 +1169,13 @@  static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
         hdr.flags = SD_FLAG_CMD_WRITE | flags;
         break;
     case AIOCB_DISCARD_OBJ:
-        hdr.opcode = SD_OP_DISCARD_OBJ;
+        hdr.opcode = SD_OP_WRITE_OBJ;
+        hdr.flags = SD_FLAG_CMD_WRITE | flags;
+        s->inode.data_vdi_id[data_oid_to_idx(oid)] = 0;
+        offset = offsetof(SheepdogInode,
+                          data_vdi_id[data_oid_to_idx(oid)]);
+        oid = vid_to_vdi_oid(s->inode.vdi_id);
+        wlen = datalen = sizeof(uint32_t);
         break;
     }
 
@@ -2148,7 +2149,9 @@  static int coroutine_fn sd_co_rw_vector(void *p)
         }
 
         aio_req = alloc_aio_req(s, acb, oid, len, offset, flags, create,
-                                old_oid, done);
+                                old_oid,
+                                acb->aiocb_type == AIOCB_DISCARD_OBJ ?
+                                0 : done);
         QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
 
         add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
@@ -2584,15 +2587,23 @@  static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
                                       int nb_sectors)
 {
     SheepdogAIOCB *acb;
-    QEMUIOVector dummy;
     BDRVSheepdogState *s = bs->opaque;
     int ret;
+    QEMUIOVector discard_iov;
+    struct iovec iov;
+    uint32_t zero = 0;
 
     if (!s->discard_supported) {
             return 0;
     }
 
-    acb = sd_aio_setup(bs, &dummy, sector_num, nb_sectors);
+    memset(&discard_iov, 0, sizeof(discard_iov));
+    memset(&iov, 0, sizeof(iov));
+    iov.iov_base = &zero;
+    iov.iov_len = sizeof(zero);
+    discard_iov.iov = &iov;
+    discard_iov.niov = 1;
+    acb = sd_aio_setup(bs, &discard_iov, sector_num, nb_sectors);
     acb->aiocb_type = AIOCB_DISCARD_OBJ;
     acb->aio_done_func = sd_finish_aiocb;