Message ID | 1366042504-18354-1-git-send-email-namei.unix@gmail.com |
---|---|
State | New |
Headers | show |
At Tue, 16 Apr 2013 00:15:04 +0800, Liu Yuan wrote: > > From: Liu Yuan <tailai.ly@taobao.com> > > The 'TRIM' command from VM that is to release underlying data storage for > better thin-provision is already supported by the Sheepdog. > > This patch adds the TRIM support at QEMU part. > > For older Sheepdog that doesn't support it, we return 0(success) to upper layer. > > Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> > Cc: Kevin Wolf <kwolf@redhat.com> > Cc: Stefan Hajnoczi <stefanha@redhat.com> > Cc: Paolo Bonzini <pbonzini@redhat.com> > Signed-off-by: Liu Yuan <tailai.ly@taobao.com> > --- > v5: > - adjust macro numbering > v4: > - adjust discard macro > - return success when operation is not supported by sheep > - add coroutine_fn marker > v3: > - fix a silly accidental deletion of 'default' in switch clause. > v2: > - skip the object when it is not allocated > > block/sheepdog.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 55 insertions(+), 1 deletion(-) Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
On Tue, Apr 16, 2013 at 12:15:04AM +0800, Liu Yuan wrote: > @@ -727,6 +730,20 @@ static void coroutine_fn aio_read_response(void *opaque) > rsp.result = SD_RES_SUCCESS; > } > break; > + case AIOCB_DISCARD_OBJ: > + switch (rsp.result) { > + case SD_RES_INVALID_PARMS: > + error_report("you are running the old sheep that doesn't support " > + "discard command.\n"); error_report() does not need '\n'. The recently added ssh block driver has a similar case when the server does not support fsync. It does the following: 1. Print the error message once only per volume, avoid filling up logs on the host. 2. Include details of the volume/server in case the users is connected to multiple volumes/servers. This allows them to figure out which server is outdated. This makes the error messages safe from denial-of-service and includes more useful information.
Am 16.04.2013 um 10:18 hat Stefan Hajnoczi geschrieben: > On Tue, Apr 16, 2013 at 12:15:04AM +0800, Liu Yuan wrote: > > @@ -727,6 +730,20 @@ static void coroutine_fn aio_read_response(void *opaque) > > rsp.result = SD_RES_SUCCESS; > > } > > break; > > + case AIOCB_DISCARD_OBJ: > > + switch (rsp.result) { > > + case SD_RES_INVALID_PARMS: > > + error_report("you are running the old sheep that doesn't support " > > + "discard command.\n"); > > error_report() does not need '\n'. > > The recently added ssh block driver has a similar case when the server > does not support fsync. It does the following: > > 1. Print the error message once only per volume, avoid filling up logs > on the host. > 2. Include details of the volume/server in case the users is connected > to multiple volumes/servers. This allows them to figure out which > server is outdated. > > This makes the error messages safe from denial-of-service and includes > more useful information. Or if we can check whether discard works during bdrv_open(), we could already fail there for discard=on. Kevin
On 04/16/2013 04:47 PM, Kevin Wolf wrote: > Am 16.04.2013 um 10:18 hat Stefan Hajnoczi geschrieben: >> > On Tue, Apr 16, 2013 at 12:15:04AM +0800, Liu Yuan wrote: >>> > > @@ -727,6 +730,20 @@ static void coroutine_fn aio_read_response(void *opaque) >>> > > rsp.result = SD_RES_SUCCESS; >>> > > } >>> > > break; >>> > > + case AIOCB_DISCARD_OBJ: >>> > > + switch (rsp.result) { >>> > > + case SD_RES_INVALID_PARMS: >>> > > + error_report("you are running the old sheep that doesn't support " >>> > > + "discard command.\n"); >> > >> > error_report() does not need '\n'. >> > >> > The recently added ssh block driver has a similar case when the server >> > does not support fsync. It does the following: >> > >> > 1. Print the error message once only per volume, avoid filling up logs >> > on the host. All the request for the volumes are firstly handled by the sheep daemon this QEMU connects to, so we can say that if one discard request for any volume return SD_RES_INVALID_PARMS, then all the volumes attatched to this VM can't support discard operation. >> > 2. Include details of the volume/server in case the users is connected >> > to multiple volumes/servers. This allows them to figure out which >> > server is outdated. >> > Multi-connections aren't supported yet (though planned), so this doesn't apply for current SD. >> > This makes the error messages safe from denial-of-service and includes >> > more useful information. > Or if we can check whether discard works during bdrv_open(), we could > already fail there for discard=on. Hmm, SD doesn't support a feature negotiation request. The most simple way I can come up is add a s->enable_discard flag that set false when it is reported discard operation isn't supported by the server connected. What do you think? Thanks, Yuan
diff --git a/block/sheepdog.c b/block/sheepdog.c index 987018e..3b64690 100644 --- a/block/sheepdog.c +++ b/block/sheepdog.c @@ -27,6 +27,8 @@ #define SD_OP_CREATE_AND_WRITE_OBJ 0x01 #define SD_OP_READ_OBJ 0x02 #define SD_OP_WRITE_OBJ 0x03 +/* 0x04 is used internally by Sheepdog */ +#define SD_OP_DISCARD_OBJ 0x05 #define SD_OP_NEW_VDI 0x11 #define SD_OP_LOCK_VDI 0x12 @@ -269,6 +271,7 @@ enum AIOCBState { AIOCB_WRITE_UDATA, AIOCB_READ_UDATA, AIOCB_FLUSH_CACHE, + AIOCB_DISCARD_OBJ, }; struct SheepdogAIOCB { @@ -656,7 +659,7 @@ static void coroutine_fn aio_read_response(void *opaque) int ret; AIOReq *aio_req = NULL; SheepdogAIOCB *acb; - unsigned long idx; + uint64_t idx; if (QLIST_EMPTY(&s->inflight_aio_head)) { goto out; @@ -727,6 +730,20 @@ static void coroutine_fn aio_read_response(void *opaque) rsp.result = SD_RES_SUCCESS; } break; + case AIOCB_DISCARD_OBJ: + switch (rsp.result) { + case SD_RES_INVALID_PARMS: + error_report("you are running the old sheep that doesn't support " + "discard command.\n"); + rsp.result = SD_RES_SUCCESS; + break; + case SD_RES_SUCCESS: + idx = data_oid_to_idx(aio_req->oid); + s->inode.data_vdi_id[idx] = 0; + break; + default: + break; + } } if (rsp.result != SD_RES_SUCCESS) { @@ -1016,6 +1033,9 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req, wlen = datalen; hdr.flags = SD_FLAG_CMD_WRITE | flags; break; + case AIOCB_DISCARD_OBJ: + hdr.opcode = SD_OP_DISCARD_OBJ; + break; } if (s->cache_flags) { @@ -1633,6 +1653,15 @@ static int coroutine_fn sd_co_rw_vector(void *p) flags = SD_FLAG_CMD_COW; } break; + case AIOCB_DISCARD_OBJ: + /* + * We discard the object only when the whole object is + * 1) allocated 2) trimmed. Otherwise, simply skip it. + */ + if (len != SD_DATA_OBJ_SIZE || inode->data_vdi_id[idx] == 0) { + goto done; + } + break; default: break; } @@ -2071,6 +2100,28 @@ static int sd_load_vmstate(BlockDriverState *bs, uint8_t *data, } +static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num, + int nb_sectors) +{ + SheepdogAIOCB *acb; + QEMUIOVector dummy; + int ret; + + acb = sd_aio_setup(bs, &dummy, sector_num, nb_sectors); + acb->aiocb_type = AIOCB_DISCARD_OBJ; + acb->aio_done_func = sd_finish_aiocb; + + ret = sd_co_rw_vector(acb); + if (ret <= 0) { + qemu_aio_release(acb); + return ret; + } + + qemu_coroutine_yield(); + + return acb->ret; +} + static QEMUOptionParameter sd_create_options[] = { { .name = BLOCK_OPT_SIZE, @@ -2103,6 +2154,7 @@ static BlockDriver bdrv_sheepdog = { .bdrv_co_readv = sd_co_readv, .bdrv_co_writev = sd_co_writev, .bdrv_co_flush_to_disk = sd_co_flush_to_disk, + .bdrv_co_discard = sd_co_discard, .bdrv_snapshot_create = sd_snapshot_create, .bdrv_snapshot_goto = sd_snapshot_goto, @@ -2128,6 +2180,7 @@ static BlockDriver bdrv_sheepdog_tcp = { .bdrv_co_readv = sd_co_readv, .bdrv_co_writev = sd_co_writev, .bdrv_co_flush_to_disk = sd_co_flush_to_disk, + .bdrv_co_discard = sd_co_discard, .bdrv_snapshot_create = sd_snapshot_create, .bdrv_snapshot_goto = sd_snapshot_goto, @@ -2153,6 +2206,7 @@ static BlockDriver bdrv_sheepdog_unix = { .bdrv_co_readv = sd_co_readv, .bdrv_co_writev = sd_co_writev, .bdrv_co_flush_to_disk = sd_co_flush_to_disk, + .bdrv_co_discard = sd_co_discard, .bdrv_snapshot_create = sd_snapshot_create, .bdrv_snapshot_goto = sd_snapshot_goto,