[v2,3/8] nbd: BLOCK_STATUS for standard get_block_status function: server part

Message ID 20180312152126.286890-4-vsementsov@virtuozzo.com
State New
Series nbd block status base:allocation

Commit Message

Vladimir Sementsov-Ogievskiy March 12, 2018, 3:21 p.m. UTC
Minimal realization: only one extent in server answer is supported.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

v2: - constants and type defs were split out by Eric, except for
    NBD_META_ID_BASE_ALLOCATION
    - add nbd_opt_skip, to skip meta query remainder, if we are already sure,
    that the query selects nothing
    - check meta export name in OPT_EXPORT_NAME and OPT_GO
    - always set context_id = 0 for NBD_OPT_LIST_META_CONTEXT
    - negotiation rewritten to avoid wasting time and memory on reading long,
    obviously invalid meta queries
    - fixed ERR_INVALID->ERR_UNKNOWN if export not found in nbd_negotiate_meta_queries
    - check client->export_meta.valid in "case NBD_CMD_BLOCK_STATUS"


 include/block/nbd.h |   2 +
 nbd/server.c        | 310 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 312 insertions(+)

Comments

Eric Blake March 13, 2018, 1:47 p.m. UTC | #1
On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:
> Minimal realization: only one extent in server answer is supported.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> 
> v2: - constants and type defs were split out by Eric, except for
>      NBD_META_ID_BASE_ALLOCATION

The constant for NBD_META_ID_BASE_ALLOCATION was intentionally not split
out; it is the only constant that is relevant only to the server side ;)
In fact,...

>      - add nbd_opt_skip, to skip meta query remainder, if we are already sure,
>      that the query selects nothing
>      - check meta export name in OPT_EXPORT_NAME and OPT_GO
>      - always set context_id = 0 for NBD_OPT_LIST_META_CONTEXT
>      - negotiation rewritten to avoid wasting time and memory on reading long,
>      obviously invalid meta queries
>      - fixed ERR_INVALID->ERR_UNKNOWN if export not found in nbd_negotiate_meta_queries
>      - check client->export_meta.valid in "case NBD_CMD_BLOCK_STATUS"
> 
> 
>   include/block/nbd.h |   2 +
>   nbd/server.c        | 310 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 312 insertions(+)
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 2285637e67..9f2be18186 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -188,6 +188,8 @@ typedef struct NBDExtent {
>   #define NBD_CMD_FLAG_REQ_ONE    (1 << 3) /* only one extent in BLOCK_STATUS
>                                             * reply chunk */
>   
> +#define NBD_META_ID_BASE_ALLOCATION 0
> +

...I will be squashing in a change to move it out of the .h and into the .c.

>   /* Supported request types */
>   enum {
>       NBD_CMD_READ = 0,
> diff --git a/nbd/server.c b/nbd/server.c
> index 085e14afbf..16d7388085 100644
> --- a/nbd/server.c

> @@ -371,6 +396,12 @@ static int nbd_negotiate_handle_list(NBDClient *client, Error **errp)
>       return nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
>   }
>   
> +static void nbd_check_meta_export_name(NBDClient *client)
> +{
> +    client->export_meta.valid = client->export_meta.valid &&
> +        strcmp(client->exp->name, client->export_meta.export_name) == 0;

The indentation makes this harder for me to parse (at first glance, I 
thought you had (a) && (b), and were either missing a side effect or 
return statement).  It's a lot more obvious what you are doing with:

client->export_meta.valid &= !strcmp(client->exp->name,
                                      client->export_meta.export_name);
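
A minimal sketch comparing the two spellings, assuming a hypothetical
struct sketch_check in place of the real NBDClient fields.  Both forms
give the same result; the only difference is that the &= form always
evaluates strcmp(), while && skips it once valid is already false.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the two NBDClient fields involved. */
struct sketch_check {
    bool valid;                  /* SET_META_CONTEXT finished without error */
    const char *negotiated_name; /* export name given to SET_META_CONTEXT */
};

/* Form used in the patch: explicit && with a comparison against 0. */
bool sketch_check_long(struct sketch_check m, const char *export_name)
{
    m.valid = m.valid &&
        strcmp(export_name, m.negotiated_name) == 0;
    return m.valid;
}

/* Form suggested in review: compound assignment with !strcmp(). */
bool sketch_check_short(struct sketch_check m, const char *export_name)
{
    m.valid &= !strcmp(export_name, m.negotiated_name);
    return m.valid;
}
```

Either helper clears valid as soon as the export that was finally
selected differs from the one named during meta context negotiation.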


> +/* nbd_meta_base_query
> + *
> + * Handle query to 'base' namespace. For now, only base:allocation context is
> + * available in it.
> + *
> + * Return -errno on I/O error, 0 if option was completely handled by
> + * sending a reply about inconsistent lengths, or 1 on success. */
> +static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
> +                               uint32_t len, Error **errp)

The comments don't describe what 'len' represents; I had to read the
call site before I could understand this function.  If I understand
correctly, this function is called at the point that we have parsed
"base:" out of the longer overall name given to LIST or SET, and len is
the remaining length of the overall name that still needs parsing.
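
To make the meaning of 'len' concrete, here is a rough sketch that works
on an in-memory buffer instead of the client socket.  The helper name
sketch_base_remainder is made up for illustration and is not part of the
patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Given one meta context query of total_len bytes (not NUL-terminated on
 * the wire), return how many bytes remain after the "base:" prefix, or
 * -1 if the query is not in the 'base' namespace. */
int64_t sketch_base_remainder(const char *query, uint32_t total_len)
{
    static const char prefix[] = "base:";
    const uint32_t prefix_len = sizeof(prefix) - 1;

    if (total_len < prefix_len ||
        memcmp(query, prefix, prefix_len) != 0) {
        return -1;
    }
    /* This is the value the callee would receive as 'len'. */
    return (int64_t)total_len - prefix_len;
}
```

For the 15-byte query "base:allocation" the remainder is 10, and for a
bare "base:" it is 0, which is the len == 0 case handled first in the
function body.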

> +{
> +    int ret;
> +    char query[sizeof("allocation") - 1];

Why discard the trailing NUL from the size?  It doesn't hurt to leave it 
in, unless...

> +    size_t alen = strlen("allocation");

...Better than strlen() would be sizeof(query), as long as the trailing 
NUL is not changing the size of the array.
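
A small sketch of the tradeoff, using a hypothetical helper that takes
the wire bytes from memory rather than from nbd_opt_read().  The array
has no room for a trailing NUL, so sizeof(query) equals the string
length, and the comparison has to be bounded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_CTX "allocation"

/* Return true if the len bytes at wire spell "allocation" exactly.
 * wire is not required to be NUL-terminated. */
bool sketch_is_allocation(const char *wire, size_t len)
{
    char query[sizeof(SKETCH_CTX) - 1]; /* 10 bytes, no trailing NUL */

    if (len != sizeof(query)) {
        return false;                   /* wrong length, cannot match */
    }
    memcpy(query, wire, len);

    /* strcmp() would read past the array; bound the comparison instead. */
    return strncmp(query, SKETCH_CTX, sizeof(query)) == 0;
}
```

With the array sized this way, sizeof(query) can replace a separate
strlen() call, at the cost of needing strncmp() (or memcmp()) instead of
strcmp().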

> +
> +    if (len == 0) {
> +        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
> +            meta->base_allocation = true;
> +        }
> +        return 1;
> +    }

Okay, so here, the user requested "base:"; on list we return all 
contexts that we know (base:allocation); on set we fall through.

> +
> +    if (len != alen) {
> +        return nbd_opt_skip(client, len, errp);
> +    }

Here, the user requested "base:garbage", where the garbage (including
empty string on set) is a different length than "base:allocation".  It
may be a valid string for a future NBD version, but for us, we know
right away it is not something we recognize, so we gracefully skip it.

Checking myself: if nbd_opt_skip returned -1, we have detected 
communication problems with the client; it does not matter if there is 
any unparsed data remaining in the current option.  It can only return 0 
if nbd_opt_invalid has already finished parsing the entire option (we 
are ready to parse the next NBD_OPT command, no further queries in the 
current option matter).  It can only return 1 if we finished parsing the 
current query, and are positioned ready to parse the next query. [1]
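
The three-way return convention can be sketched as follows.  This is a
simulation under stated assumptions: struct sketch_opt_state and both
helpers are invented for illustration, and a flag stands in for real
socket I/O.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-option state kept in NBDClient. */
struct sketch_opt_state {
    size_t optlen;       /* unparsed bytes left in the current option */
    int io_fails;        /* nonzero simulates a broken connection */
    int replied_invalid; /* set once an error reply has been sent */
};

/* Returns -EIO on I/O failure, 0 if the whole option was consumed by an
 * error reply about inconsistent lengths, 1 if size bytes were skipped. */
int sketch_opt_skip(struct sketch_opt_state *s, size_t size)
{
    if (size > s->optlen) {
        s->optlen = 0;          /* drop the rest of the option */
        s->replied_invalid = 1; /* the client has been told */
        return s->io_fails ? -EIO : 0;
    }
    s->optlen -= size;
    return s->io_fails ? -EIO : 1;
}

/* Caller loop: stop on anything other than 1. */
int sketch_skip_all(struct sketch_opt_state *s, const size_t *sizes,
                    size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int ret = sketch_opt_skip(s, sizes[i]);
        if (ret <= 0) {
            return ret;
        }
    }
    return 1;
}
```

A result of 1 leaves the stream positioned at the next query, 0 leaves
it at the next option, and a negative value means the position no longer
matters.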

> +
> +    ret = nbd_opt_read(client, query, len, errp);
> +    if (ret <= 0) {
> +        return ret;
> +    }
> +
> +    if (strncmp(query, "allocation", alen) == 0) {

Here, you HAD to use strncmp because you didn't leave room for the 
trailing NUL in the array above.  Tradeoffs.  So I guess your approach 
is okay.

> +        meta->base_allocation = true;
> +    }
> +
> +    return 1;

And if we get here, the user requested exactly "base:allocation", so we 
enabled exactly that context.

> +}
> +
> +/* nbd_negotiate_meta_query
> + *
> + * Parse namespace name and call corresponding function to parse body of the
> + * query.
> + *
> + * The only supported namespace now is 'base'.
> + *
> + * The function aims not wasting time and memory to read long unknown namespace
> + * names.
> + *
> + * Return -errno on I/O error, 0 if option was completely handled by
> + * sending a reply about inconsistent lengths, or 1 on success. */
> +static int nbd_negotiate_meta_query(NBDClient *client,
> +                                    NBDExportMetaContexts *meta, Error **errp)
> +{
> +    int ret;
> +    char query[sizeof("base:") - 1];
> +    size_t baselen = strlen("base:");

And since this matches the approach in the previous function, we'll keep 
it consistent.

> +    uint32_t len;
> +
> +    ret = nbd_opt_read(client, &len, sizeof(len), errp);
> +    if (ret <= 0) {
> +        return ret;
> +    }
> +    cpu_to_be32s(&len);
> +
> +    /* The only supported namespace for now is 'base'. So query should start
> +     * with 'base:'. Otherwise, we can ignore it and skip the remainder. */
> +    if (len < baselen) {
> +        return nbd_opt_skip(client, len, errp);
> +    }
> +
> +    len -= baselen;
> +    ret = nbd_opt_read(client, query, baselen, errp);
> +    if (ret <= 0) {
> +        return ret;
> +    }
> +    if (strncmp(query, "base:", baselen) != 0) {

Again, strncmp is a bit awkward compared to strcmp, but it works.

> +        return nbd_opt_skip(client, len, errp);
> +    }
> +
> +    return nbd_meta_base_query(client, meta, len, errp);
> +}
> +
> +/* nbd_negotiate_meta_queries
> + * Handle NBD_OPT_LIST_META_CONTEXT and NBD_OPT_SET_META_CONTEXT
> + *
> + * @meta may be NULL, if caller isn't interested in selected contexts (for
> + *     NBD_OPT_LIST_META_CONTEXT)
> + *
> + * Return -errno on I/O error, 0 if option was completely handled by
> + * sending a reply about inconsistent lengths, or 1 on success. */

Comment is wrong - this function never returns 1 (nor should it, as 
nbd_negotiate_options() expects a return of 1 only from NBD_OPT_ABORT).

> +static int nbd_negotiate_meta_queries(NBDClient *client,
> +                                      NBDExportMetaContexts *meta, Error **errp)
> +{
> +    int ret;
> +    NBDExport *exp;
> +    NBDExportMetaContexts local_meta;
> +    uint32_t nb_queries;
> +    int i;
> +
> +    assert(client->structured_reply);

Perhaps worth a comment that this is safe because we already filtered it 
out at the caller.

> +
> +    if (!meta) {
> +        meta = &local_meta;
> +    }
> +
> +    memset(meta, 0, sizeof(*meta));
> +
> +    ret = nbd_opt_read_name(client, meta->export_name, NULL, errp);
> +    if (ret <= 0) {
> +        return ret;
> +    }
> +
> +    exp = nbd_export_find(meta->export_name);
> +    if (exp == NULL) {
> +        return nbd_opt_drop(client, NBD_REP_ERR_UNKNOWN, errp,
> +                            "export '%s' not present", meta->export_name);
> +    }
> +

It's nice to see my review comments from v1 fixed here ;)

> +    ret = nbd_opt_read(client, &nb_queries, sizeof(nb_queries), errp);
> +    if (ret <= 0) {
> +        return ret;
> +    }
> +    cpu_to_be32s(&nb_queries);
> +
> +    for (i = 0; i < nb_queries; ++i) {
> +        ret = nbd_negotiate_meta_query(client, meta, errp);
> +        if (ret <= 0) {
> +            return ret;

[1] Okay, I've convinced myself we are good.  We can only early return 
from this loop if we encountered a disconnect (result -1, either read or 
write to client failed, no further communication is worth trying, so it 
doesn't matter if we are left mid-option) or if we encountered an 
inconsistent length and already replied successfully to the client about 
their bogus request (result 0, we've already skipped to the end of the 
current option, ready to parse the next NBD_OPT).

> +        }
> +    }

Missing: On LIST, if nb_queries is 0 before the loop, then we must reply 
with ALL supported contexts, rather than none (the behavior for SET is 
correct, though).
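
One possible shape for that missing check, written as a standalone
sketch.  The enum, struct and function names are placeholders rather
than the real QEMU definitions, and the actual fix may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder option kinds; the real values come from the NBD headers. */
enum sketch_opt { SKETCH_OPT_LIST, SKETCH_OPT_SET };

/* Placeholder for the part of NBDExportMetaContexts that matters here. */
struct sketch_meta {
    bool base_allocation;
};

/* LIST with zero queries means "tell me everything you support";
 * SET with zero queries selects nothing. */
void sketch_apply_empty_list_rule(struct sketch_meta *meta,
                                  enum sketch_opt opt,
                                  uint32_t nb_queries)
{
    if (opt == SKETCH_OPT_LIST && nb_queries == 0) {
        meta->base_allocation = true;
    }
}
```

Calling this before the query loop would leave the loop itself
unchanged, since the loop body never runs when nb_queries is 0.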

> +
> +    if (meta->base_allocation) {
> +        ret = nbd_negotiate_send_meta_context(client, "base:allocation",
> +                                              NBD_META_ID_BASE_ALLOCATION,
> +                                              errp);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +    }
> +
> +    ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
> +    if (ret == 0) {
> +        meta->valid = true;
> +    }
> +
> +    return ret;

Code is correct - all early returns and this final return are negative 
or 0, where 0 means we parsed the entire NBD_OPT, gave a reply, and the 
connection is ready for the next NBD_OPT.

> +}
> +
>   /* nbd_negotiate_options
>    * Process all NBD_OPT_* client option commands, during fixed newstyle
>    * negotiation.
> @@ -856,6 +1064,22 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
>                   }
>                   break;
>   
> +            case NBD_OPT_LIST_META_CONTEXT:
> +            case NBD_OPT_SET_META_CONTEXT:
> +                if (!client->structured_reply) {
> +                    ret = nbd_opt_invalid(
> +                            client, errp,
> +                            "request option '%s' when structured reply "
> +                            "is not negotiated", nbd_opt_lookup(option));
> +                } else if (option == NBD_OPT_LIST_META_CONTEXT) {
> +                    ret = nbd_negotiate_meta_queries(client, NULL, errp);
> +                } else {
> +                    ret = nbd_negotiate_meta_queries(client,
> +                                                     &client->export_meta,
> +                                                     errp);
> +                }

Looks good.

If we WANTED to split this patch into two, then part 1 would be NBD_OPT
handling (where we just always return 0 contexts in reply to LIST or
SET), and part 2 would be NBD_CMD_BLOCK_STATUS handling plus enabling
base:allocation advertisement during NBD_OPT handling.  But I'm not
going to ask for a split now.

> +                break;
> +
>               default:
>                   ret = nbd_opt_drop(client, NBD_REP_ERR_UNSUP, errp,
>                                      "Unsupported option %" PRIu32 " (%s)",
> @@ -1485,6 +1709,79 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
>       return ret;
>   }
>   
> +static int blockstatus_to_extent_be(BlockDriverState *bs, uint64_t offset,
> +                                    uint64_t bytes, NBDExtent *extent)
> +{
> +    uint64_t remaining_bytes = bytes;
> +
> +    while (remaining_bytes) {
> +        uint32_t flags;
> +        int64_t num;
> +        int ret = bdrv_block_status_above(bs, NULL, offset, remaining_bytes,
> +                                          &num, NULL, NULL);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +
> +        flags = (ret & BDRV_BLOCK_ALLOCATED ? 0 : NBD_STATE_HOLE) |
> +                (ret & BDRV_BLOCK_ZERO      ? NBD_STATE_ZERO : 0);

I still need to fix what block_status_above returns for protocol drivers
per Kevin's review of my byte-based status patches, but that will be
during soft freeze (as it is in the bug fix category); it may have a
minor impact on this code.  But it shouldn't hold up this series.

> @@ -1562,6 +1859,8 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
>           valid_flags |= NBD_CMD_FLAG_DF;
>       } else if (request->type == NBD_CMD_WRITE_ZEROES) {
>           valid_flags |= NBD_CMD_FLAG_NO_HOLE;
> +    } else if (request->type == NBD_CMD_BLOCK_STATUS) {
> +        valid_flags |= NBD_CMD_FLAG_REQ_ONE;
>       }
>       if (request->flags & ~valid_flags) {
>           error_setg(errp, "unsupported flags for command %s (got 0x%x)",
> @@ -1690,6 +1989,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>   
>           return nbd_send_generic_reply(client, request->handle, ret,
>                                         "discard failed", errp);
> +    case NBD_CMD_BLOCK_STATUS:
> +        if (client->export_meta.valid && client->export_meta.base_allocation) {
> +            return nbd_co_send_block_status(client, request->handle,
> +                                            blk_bs(exp->blk), request->from,
> +                                            request->len,
> +                                            NBD_META_ID_BASE_ALLOCATION, errp);

Will obviously be expanded as we add more namespaces (for dirty bitmap 
queries), but works for your first cut of just reporting block status.

> +        } else {
> +            return nbd_send_generic_reply(client, request->handle, -EINVAL,
> +                                          "CMD_BLOCK_STATUS not negotiated",
> +                                          errp);
> +        }
>       default:
>           msg = g_strdup_printf("invalid request type (%" PRIu32 ") received",
>                                 request->type);
> 

I'm making tweaks as mentioned above, but this is close enough to get
into soft freeze.

Reviewed-by: Eric Blake <eblake@redhat.com>

Eric Blake March 13, 2018, 1:56 p.m. UTC | #2
On 03/13/2018 08:47 AM, Eric Blake wrote:
> On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Minimal realization: only one extent in server answer is supported.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>

>> +/* nbd_negotiate_meta_queries
>> + * Handle NBD_OPT_LIST_META_CONTEXT and NBD_OPT_SET_META_CONTEXT
>> + *
>> + * @meta may be NULL, if caller isn't interested in selected contexts (for
>> + *     NBD_OPT_LIST_META_CONTEXT)
>> + *
>> + * Return -errno on I/O error, 0 if option was completely handled by
>> + * sending a reply about inconsistent lengths, or 1 on success. */
> 
> Comment is wrong - this function never returns 1 (nor should it, as 
> nbd_negotiate_options() expects a return of 1 only from NBD_OPT_ABORT).
> 
>> +static int nbd_negotiate_meta_queries(NBDClient *client,
>> +                                      NBDExportMetaContexts *meta, Error **errp)
>> +{
>> +    int ret;
>> +    NBDExport *exp;
>> +    NBDExportMetaContexts local_meta;
>> +    uint32_t nb_queries;
>> +    int i;
>> +
>> +    assert(client->structured_reply);
> 
> Perhaps worth a comment that this is safe because we already filtered it 
> out at the caller.
> 
>> +
>> +    if (!meta) {
>> +        meta = &local_meta;
>> +    }

Or, we could check here, and even base our decision on whether to change 
'meta' due to client->opt...


>> @@ -856,6 +1064,22 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
>>                   }
>>                   break;
>> +            case NBD_OPT_LIST_META_CONTEXT:
>> +            case NBD_OPT_SET_META_CONTEXT:
>> +                if (!client->structured_reply) {
>> +                    ret = nbd_opt_invalid(
>> +                            client, errp,
>> +                            "request option '%s' when structured reply "
>> +                            "is not negotiated", nbd_opt_lookup(option));
>> +                } else if (option == NBD_OPT_LIST_META_CONTEXT) {
>> +                    ret = nbd_negotiate_meta_queries(client, NULL, errp);
>> +                } else {
>> +                    ret = nbd_negotiate_meta_queries(client,
>> +                                                     &client->export_meta,
>> +                                                     errp);
>> +                }

Then here, we just do a single ret = nbd_negotiate_meta_queries().
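
A rough sketch of that alternative, with the choice of storage moved
into a helper keyed on the current option.  All names here are
hypothetical stand-ins for NBDClient and NBDExportMetaContexts.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder option kinds; the real values come from the NBD headers. */
enum sketch_opt { SKETCH_OPT_LIST, SKETCH_OPT_SET };

struct sketch_meta {
    bool valid;
    bool base_allocation;
};

struct sketch_client {
    enum sketch_opt opt;            /* option currently being negotiated */
    struct sketch_meta export_meta; /* persistent SET_META_CONTEXT result */
};

/* LIST must not disturb the persistent selection, so it works on scratch
 * storage; SET writes straight into the client state. */
struct sketch_meta *sketch_pick_meta(struct sketch_client *client,
                                     struct sketch_meta *scratch)
{
    if (client->opt == SKETCH_OPT_LIST) {
        return scratch;
    }
    return &client->export_meta;
}
```

With a helper like this inside the negotiation function, the call site
would no longer need separate branches for LIST and SET.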

Patch

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2285637e67..9f2be18186 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -188,6 +188,8 @@  typedef struct NBDExtent {
 #define NBD_CMD_FLAG_REQ_ONE    (1 << 3) /* only one extent in BLOCK_STATUS
                                           * reply chunk */
 
+#define NBD_META_ID_BASE_ALLOCATION 0
+
 /* Supported request types */
 enum {
     NBD_CMD_READ = 0,
diff --git a/nbd/server.c b/nbd/server.c
index 085e14afbf..16d7388085 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -82,6 +82,16 @@  struct NBDExport {
 
 static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);
 
+/* NBDExportMetaContexts represents a list of contexts to be exported,
+ * as selected by NBD_OPT_SET_META_CONTEXT. Also used for
+ * NBD_OPT_LIST_META_CONTEXT. */
+typedef struct NBDExportMetaContexts {
+    char export_name[NBD_MAX_NAME_SIZE + 1];
+    bool valid; /* means that negotiation of the option finished without
+                   errors */
+    bool base_allocation; /* export base:allocation context (block status) */
+} NBDExportMetaContexts;
+
 struct NBDClient {
     int refcount;
     void (*close_fn)(NBDClient *client, bool negotiated);
@@ -102,6 +112,7 @@  struct NBDClient {
     bool closing;
 
     bool structured_reply;
+    NBDExportMetaContexts export_meta;
 
     uint32_t opt; /* Current option being negotiated */
     uint32_t optlen; /* remaining length of data in ioc for the option being
@@ -273,6 +284,20 @@  static int nbd_opt_read(NBDClient *client, void *buffer, size_t size,
     return qio_channel_read_all(client->ioc, buffer, size, errp) < 0 ? -EIO : 1;
 }
 
+/* Drop size bytes from the unparsed payload of the current option.
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_opt_skip(NBDClient *client, size_t size, Error **errp)
+{
+    if (size > client->optlen) {
+        return nbd_opt_invalid(client, errp,
+                               "Inconsistent lengths in option %s",
+                               nbd_opt_lookup(client->opt));
+    }
+    client->optlen -= size;
+    return nbd_drop(client->ioc, size, errp) < 0 ? -EIO : 1;
+}
+
 /* nbd_opt_read_name
  *
  * Read string in format:
@@ -371,6 +396,12 @@  static int nbd_negotiate_handle_list(NBDClient *client, Error **errp)
     return nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
 }
 
+static void nbd_check_meta_export_name(NBDClient *client)
+{
+    client->export_meta.valid = client->export_meta.valid &&
+        strcmp(client->exp->name, client->export_meta.export_name) == 0;
+}
+
 /* Send a reply to NBD_OPT_EXPORT_NAME.
  * Return -errno on error, 0 on success. */
 static int nbd_negotiate_handle_export_name(NBDClient *client,
@@ -422,6 +453,7 @@  static int nbd_negotiate_handle_export_name(NBDClient *client,
 
     QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
     nbd_export_get(client->exp);
+    nbd_check_meta_export_name(client);
 
     return 0;
 }
@@ -612,6 +644,7 @@  static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
         client->exp = exp;
         QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
         nbd_export_get(client->exp);
+        nbd_check_meta_export_name(client);
         rc = 1;
     }
     return rc;
@@ -666,6 +699,181 @@  static QIOChannel *nbd_negotiate_handle_starttls(NBDClient *client,
     return QIO_CHANNEL(tioc);
 }
 
+/* nbd_negotiate_send_meta_context
+ *
+ * Send one chunk of reply to NBD_OPT_{LIST,SET}_META_CONTEXT
+ *
+ * For NBD_OPT_LIST_META_CONTEXT @context_id is ignored, 0 is used instead.
+ */
+static int nbd_negotiate_send_meta_context(NBDClient *client,
+                                           const char *context,
+                                           uint32_t context_id,
+                                           Error **errp)
+{
+    NBDOptionReplyMetaContext opt;
+    struct iovec iov[] = {
+        {.iov_base = &opt, .iov_len = sizeof(opt)},
+        {.iov_base = (void *)context, .iov_len = strlen(context)}
+    };
+
+    if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
+        context_id = 0;
+    }
+
+    set_be_option_rep(&opt.h, client->opt, NBD_REP_META_CONTEXT,
+                      sizeof(opt) - sizeof(opt.h) + iov[1].iov_len);
+    stl_be_p(&opt.context_id, context_id);
+
+    return qio_channel_writev_all(client->ioc, iov, 2, errp) < 0 ? -EIO : 0;
+}
+
+/* nbd_meta_base_query
+ *
+ * Handle query to 'base' namespace. For now, only base:allocation context is
+ * available in it.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
+                               uint32_t len, Error **errp)
+{
+    int ret;
+    char query[sizeof("allocation") - 1];
+    size_t alen = strlen("allocation");
+
+    if (len == 0) {
+        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
+            meta->base_allocation = true;
+        }
+        return 1;
+    }
+
+    if (len != alen) {
+        return nbd_opt_skip(client, len, errp);
+    }
+
+    ret = nbd_opt_read(client, query, len, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+
+    if (strncmp(query, "allocation", alen) == 0) {
+        meta->base_allocation = true;
+    }
+
+    return 1;
+}
+
+/* nbd_negotiate_meta_query
+ *
+ * Parse namespace name and call corresponding function to parse body of the
+ * query.
+ *
+ * The only supported namespace now is 'base'.
+ *
+ * The function aims not wasting time and memory to read long unknown namespace
+ * names.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_negotiate_meta_query(NBDClient *client,
+                                    NBDExportMetaContexts *meta, Error **errp)
+{
+    int ret;
+    char query[sizeof("base:") - 1];
+    size_t baselen = strlen("base:");
+    uint32_t len;
+
+    ret = nbd_opt_read(client, &len, sizeof(len), errp);
+    if (ret <= 0) {
+        return ret;
+    }
+    cpu_to_be32s(&len);
+
+    /* The only supported namespace for now is 'base'. So query should start
+     * with 'base:'. Otherwise, we can ignore it and skip the remainder. */
+    if (len < baselen) {
+        return nbd_opt_skip(client, len, errp);
+    }
+
+    len -= baselen;
+    ret = nbd_opt_read(client, query, baselen, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+    if (strncmp(query, "base:", baselen) != 0) {
+        return nbd_opt_skip(client, len, errp);
+    }
+
+    return nbd_meta_base_query(client, meta, len, errp);
+}
+
+/* nbd_negotiate_meta_queries
+ * Handle NBD_OPT_LIST_META_CONTEXT and NBD_OPT_SET_META_CONTEXT
+ *
+ * @meta may be NULL, if caller isn't interested in selected contexts (for
+ *     NBD_OPT_LIST_META_CONTEXT)
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_negotiate_meta_queries(NBDClient *client,
+                                      NBDExportMetaContexts *meta, Error **errp)
+{
+    int ret;
+    NBDExport *exp;
+    NBDExportMetaContexts local_meta;
+    uint32_t nb_queries;
+    int i;
+
+    assert(client->structured_reply);
+
+    if (!meta) {
+        meta = &local_meta;
+    }
+
+    memset(meta, 0, sizeof(*meta));
+
+    ret = nbd_opt_read_name(client, meta->export_name, NULL, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+
+    exp = nbd_export_find(meta->export_name);
+    if (exp == NULL) {
+        return nbd_opt_drop(client, NBD_REP_ERR_UNKNOWN, errp,
+                            "export '%s' not present", meta->export_name);
+    }
+
+    ret = nbd_opt_read(client, &nb_queries, sizeof(nb_queries), errp);
+    if (ret <= 0) {
+        return ret;
+    }
+    cpu_to_be32s(&nb_queries);
+
+    for (i = 0; i < nb_queries; ++i) {
+        ret = nbd_negotiate_meta_query(client, meta, errp);
+        if (ret <= 0) {
+            return ret;
+        }
+    }
+
+    if (meta->base_allocation) {
+        ret = nbd_negotiate_send_meta_context(client, "base:allocation",
+                                              NBD_META_ID_BASE_ALLOCATION,
+                                              errp);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
+    if (ret == 0) {
+        meta->valid = true;
+    }
+
+    return ret;
+}
+
 /* nbd_negotiate_options
  * Process all NBD_OPT_* client option commands, during fixed newstyle
  * negotiation.
@@ -856,6 +1064,22 @@  static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
                 }
                 break;
 
+            case NBD_OPT_LIST_META_CONTEXT:
+            case NBD_OPT_SET_META_CONTEXT:
+                if (!client->structured_reply) {
+                    ret = nbd_opt_invalid(
+                            client, errp,
+                            "request option '%s' when structured reply "
+                            "is not negotiated", nbd_opt_lookup(option));
+                } else if (option == NBD_OPT_LIST_META_CONTEXT) {
+                    ret = nbd_negotiate_meta_queries(client, NULL, errp);
+                } else {
+                    ret = nbd_negotiate_meta_queries(client,
+                                                     &client->export_meta,
+                                                     errp);
+                }
+                break;
+
             default:
                 ret = nbd_opt_drop(client, NBD_REP_ERR_UNSUP, errp,
                                    "Unsupported option %" PRIu32 " (%s)",
@@ -1485,6 +1709,79 @@  static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
     return ret;
 }
 
+static int blockstatus_to_extent_be(BlockDriverState *bs, uint64_t offset,
+                                    uint64_t bytes, NBDExtent *extent)
+{
+    uint64_t remaining_bytes = bytes;
+
+    while (remaining_bytes) {
+        uint32_t flags;
+        int64_t num;
+        int ret = bdrv_block_status_above(bs, NULL, offset, remaining_bytes,
+                                          &num, NULL, NULL);
+        if (ret < 0) {
+            return ret;
+        }
+
+        flags = (ret & BDRV_BLOCK_ALLOCATED ? 0 : NBD_STATE_HOLE) |
+                (ret & BDRV_BLOCK_ZERO      ? NBD_STATE_ZERO : 0);
+
+        if (remaining_bytes == bytes) {
+            extent->flags = flags;
+        }
+
+        if (flags != extent->flags) {
+            break;
+        }
+
+        offset += num;
+        remaining_bytes -= num;
+    }
+
+    cpu_to_be32s(&extent->flags);
+    extent->length = cpu_to_be32(bytes - remaining_bytes);
+
+    return 0;
+}
+
+/* nbd_co_send_extents
+ * @extents should be in big-endian */
+static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
+                               NBDExtent *extents, unsigned nb_extents,
+                               uint32_t context_id, Error **errp)
+{
+    NBDStructuredMeta chunk;
+
+    struct iovec iov[] = {
+        {.iov_base = &chunk, .iov_len = sizeof(chunk)},
+        {.iov_base = extents, .iov_len = nb_extents * sizeof(extents[0])}
+    };
+
+    set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_BLOCK_STATUS,
+                 handle, sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
+    stl_be_p(&chunk.context_id, context_id);
+
+    return nbd_co_send_iov(client, iov, 2, errp);
+}
+
+/* Get block status from the exported device and send it to the client */
+static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
+                                    BlockDriverState *bs, uint64_t offset,
+                                    uint64_t length, uint32_t context_id,
+                                    Error **errp)
+{
+    int ret;
+    NBDExtent extent;
+
+    ret = blockstatus_to_extent_be(bs, offset, length, &extent);
+    if (ret < 0) {
+        return nbd_co_send_structured_error(
+                client, handle, -ret, "can't get block status", errp);
+    }
+
+    return nbd_co_send_extents(client, handle, &extent, 1, context_id, errp);
+}
+
 /* nbd_co_receive_request
  * Collect a client request. Return 0 if request looks valid, -EIO to drop
  * connection right away, and any other negative value to report an error to
@@ -1562,6 +1859,8 @@  static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
         valid_flags |= NBD_CMD_FLAG_DF;
     } else if (request->type == NBD_CMD_WRITE_ZEROES) {
         valid_flags |= NBD_CMD_FLAG_NO_HOLE;
+    } else if (request->type == NBD_CMD_BLOCK_STATUS) {
+        valid_flags |= NBD_CMD_FLAG_REQ_ONE;
     }
     if (request->flags & ~valid_flags) {
         error_setg(errp, "unsupported flags for command %s (got 0x%x)",
@@ -1690,6 +1989,17 @@  static coroutine_fn int nbd_handle_request(NBDClient *client,
 
         return nbd_send_generic_reply(client, request->handle, ret,
                                       "discard failed", errp);
+    case NBD_CMD_BLOCK_STATUS:
+        if (client->export_meta.valid && client->export_meta.base_allocation) {
+            return nbd_co_send_block_status(client, request->handle,
+                                            blk_bs(exp->blk), request->from,
+                                            request->len,
+                                            NBD_META_ID_BASE_ALLOCATION, errp);
+        } else {
+            return nbd_send_generic_reply(client, request->handle, -EINVAL,
+                                          "CMD_BLOCK_STATUS not negotiated",
+                                          errp);
+        }
     default:
         msg = g_strdup_printf("invalid request type (%" PRIu32 ") received",
                               request->type);