diff mbox series

[v3,18/19] nbd/client: Work around 3.0 bug for listing meta contexts

Message ID 20190112175812.27068-19-eblake@redhat.com
State New
Headers show
Series nbd: add qemu-nbd --list | expand

Commit Message

Eric Blake Jan. 12, 2019, 5:58 p.m. UTC
Commit 3d068aff forgot to advertise available qemu: contexts
when the client requests a list with 0 queries. Furthermore,
3.0 shipped with a qemu-img hack of x-dirty-bitmap (commit
216ee365) that _silently_ acts as though the entire image is
clean if a requested bitmap is not present.  Both bugs have
been recently fixed, so that a modern qemu server gives full
context output right away, and the client refuses a
connection if a requested x-dirty-bitmap was not found.

Still, it is likely that there will be users that have to
work with a mix of old and new qemu versions, depending on
which features get backported where, at which point being
able to rely on 'qemu-img --list' output to know for sure
whether a given NBD export has the desired dirty bitmap is
much nicer than blindly connecting and risking that the
entire image may appear clean.  We can make our --list code
smart enough to work around buggy servers by tracking
whether we've seen any qemu: replies in the original 0-query
list; if not, repeat with a single query on "qemu:" (which
may still have no replies, but then we know for sure we
didn't trip up on the server bug).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181215135324.152629-22-eblake@redhat.com>
---
 nbd/client.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Vladimir Sementsov-Ogievskiy Jan. 16, 2019, 3:43 p.m. UTC | #1
12.01.2019 20:58, Eric Blake wrote:
> Commit 3d068aff forgot to advertise available qemu: contexts
> when the client requests a list with 0 queries. Furthermore,
> 3.0 shipped with a qemu-img hack of x-dirty-bitmap (commit
> 216ee365) that _silently_ acts as though the entire image is
> clean if a requested bitmap is not present.  Both bugs have
> been recently fixed, so that a modern qemu server gives full
> context output right away, and the client refuses a
> connection if a requested x-dirty-bitmap was not found.
> 
> Still, it is likely that there will be users that have to
> work with a mix of old and new qemu versions, depending on
> which features get backported where, at which point being
> able to rely on 'qemu-img --list' output to know for sure
> whether a given NBD export has the desired dirty bitmap is
> much nicer than blindly connecting and risking that the
> entire image may appear clean.  We can make our --list code
> smart enough to work around buggy servers by tracking
> whether we've seen any qemu: replies in the original 0-query
> list; if not, repeat with a single query on "qemu:" (which
> may still have no replies, but then we know for sure we
> didn't trip up on the server bug).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Message-Id: <20181215135324.152629-22-eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   nbd/client.c | 19 +++++++++++++++++++
>   1 file changed, 19 insertions(+)
> 
> diff --git a/nbd/client.c b/nbd/client.c
> index 2001e6e8160..64f3e45edd4 100644
> --- a/nbd/client.c
> +++ b/nbd/client.c
> @@ -21,6 +21,7 @@
>   #include "qapi/error.h"
>   #include "trace.h"
>   #include "nbd-internal.h"
> +#include "qemu/cutils.h"
> 
>   /* Definitions for opaque data types */
> 
> @@ -828,6 +829,8 @@ static int nbd_list_meta_contexts(QIOChannel *ioc,
>                                     Error **errp)
>   {
>       int ret;
> +    int seen_any = false;
> +    int seen_qemu = false;
> 
>       if (nbd_send_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
>                                     info->name, NULL, errp) < 0) {
> @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc,
> 
>           ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
>                                              &context, NULL, errp);
> +        if (ret == 0 && seen_any && !seen_qemu) {
> +            /*
> +             * Work around qemu 3.0 bug: the server forgot to send
> +             * "qemu:" replies to 0 queries. If we saw at least one
> +             * reply (probably base:allocation), but none of them were

if we are saying about 3.0, it is base:allocation for sure, isn't it?

> +             * qemu:, then run a more specific query to make sure.
> +             */
> +            seen_qemu = true;
> +            if (nbd_send_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
> +                                          info->name, "qemu:", errp) < 0) {
> +                return -1;
> +            }
> +            continue;
> +        }
>           if (ret <= 0) {
>               return ret;
>           }
> +        seen_any = true;
> +        seen_qemu |= strstart(context, "qemu:", NULL);
>           info->contexts = g_renew(char *, info->contexts, ++info->n_contexts);
>           info->contexts[info->n_contexts - 1] = context;
>       }
>
Eric Blake Jan. 17, 2019, 3:21 a.m. UTC | #2
On 1/16/19 9:43 AM, Vladimir Sementsov-Ogievskiy wrote:

>> @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc,
>>
>>           ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
>>                                              &context, NULL, errp);
>> +        if (ret == 0 && seen_any && !seen_qemu) {
>> +            /*
>> +             * Work around qemu 3.0 bug: the server forgot to send
>> +             * "qemu:" replies to 0 queries. If we saw at least one
>> +             * reply (probably base:allocation), but none of them were
> 
> if we are saying about 3.0, it is base:allocation for sure, isn't it?
> 
>> +             * qemu:, then run a more specific query to make sure.

If the server is qemu 3.0, then yes, it is base:allocation. But it could
be some other server that has its own custom return without implementing
base:allocation.
Vladimir Sementsov-Ogievskiy Jan. 17, 2019, 8:07 a.m. UTC | #3
17.01.2019 6:21, Eric Blake wrote:
> On 1/16/19 9:43 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
>>> @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc,
>>>
>>>            ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
>>>                                               &context, NULL, errp);
>>> +        if (ret == 0 && seen_any && !seen_qemu) {
>>> +            /*
>>> +             * Work around qemu 3.0 bug: the server forgot to send
>>> +             * "qemu:" replies to 0 queries. If we saw at least one
>>> +             * reply (probably base:allocation), but none of them were
>>
>> if we are saying about 3.0, it is base:allocation for sure, isn't it?
>>
>>> +             * qemu:, then run a more specific query to make sure.
> 
> If the server is qemu 3.0, then yes, it is base:allocation. But it could
> be some other server that has its own custom return without implementing
> base:allocation.

Indeed) And in this context, heuristic about that server should have at least one
context listed with no query seems not generic. Why not query 'qemu:' even if empty
query returns nothing? So, at least, "probably" is imbalanced with this not described
in comment heuristic which seems bound to 3.0.

>
Eric Blake Jan. 17, 2019, 2:20 p.m. UTC | #4
On 1/17/19 2:07 AM, Vladimir Sementsov-Ogievskiy wrote:
> 17.01.2019 6:21, Eric Blake wrote:
>> On 1/16/19 9:43 AM, Vladimir Sementsov-Ogievskiy wrote:
>>
>>>> @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc,
>>>>
>>>>            ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
>>>>                                               &context, NULL, errp);
>>>> +        if (ret == 0 && seen_any && !seen_qemu) {
>>>> +            /*
>>>> +             * Work around qemu 3.0 bug: the server forgot to send
>>>> +             * "qemu:" replies to 0 queries. If we saw at least one
>>>> +             * reply (probably base:allocation), but none of them were
>>>
>>> if we are saying about 3.0, it is base:allocation for sure, isn't it?
>>>
>>>> +             * qemu:, then run a more specific query to make sure.
>>
>> If the server is qemu 3.0, then yes, it is base:allocation. But it could
>> be some other server that has its own custom return without implementing
>> base:allocation.
> 
> Indeed) And in this context, heuristic about that server should have at least one
> context listed with no query seems not generic. Why not query 'qemu:' even if empty
> query returns nothing?

Because it is highly unlikely that we will ever encounter a server that
knows how to serve "qemu:" contexts but not "base:allocation" (qemu is
not such a server, and why would any other server bother with qemu:
specific information?).

> So, at least, "probably" is imbalanced with this not described
> in comment heuristic which seems bound to 3.0.

qemu 3.0 is the only server where the heuristic will make a difference,
but not the only server where the heuristic may trigger.
diff mbox series

Patch

diff --git a/nbd/client.c b/nbd/client.c
index 2001e6e8160..64f3e45edd4 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -21,6 +21,7 @@ 
 #include "qapi/error.h"
 #include "trace.h"
 #include "nbd-internal.h"
+#include "qemu/cutils.h"

 /* Definitions for opaque data types */

@@ -828,6 +829,8 @@  static int nbd_list_meta_contexts(QIOChannel *ioc,
                                   Error **errp)
 {
     int ret;
+    int seen_any = false;
+    int seen_qemu = false;

     if (nbd_send_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
                                   info->name, NULL, errp) < 0) {
@@ -839,9 +842,25 @@  static int nbd_list_meta_contexts(QIOChannel *ioc,

         ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
                                            &context, NULL, errp);
+        if (ret == 0 && seen_any && !seen_qemu) {
+            /*
+             * Work around qemu 3.0 bug: the server forgot to send
+             * "qemu:" replies to 0 queries. If we saw at least one
+             * reply (probably base:allocation), but none of them were
+             * qemu:, then run a more specific query to make sure.
+             */
+            seen_qemu = true;
+            if (nbd_send_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT,
+                                          info->name, "qemu:", errp) < 0) {
+                return -1;
+            }
+            continue;
+        }
         if (ret <= 0) {
             return ret;
         }
+        seen_any = true;
+        seen_qemu |= strstart(context, "qemu:", NULL);
         info->contexts = g_renew(char *, info->contexts, ++info->n_contexts);
         info->contexts[info->n_contexts - 1] = context;
     }