Message ID | 20190112175812.27068-19-eblake@redhat.com |
---|---|
State | New |
Headers | show |
Series | nbd: add qemu-nbd --list | expand |
12.01.2019 20:58, Eric Blake wrote: > Commit 3d068aff forgot to advertise available qemu: contexts > when the client requests a list with 0 queries. Furthermore, > 3.0 shipped with a qemu-img hack of x-dirty-bitmap (commit > 216ee365) that _silently_ acts as though the entire image is > clean if a requested bitmap is not present. Both bugs have > been recently fixed, so that a modern qemu server gives full > context output right away, and the client refuses a > connection if a requested x-dirty-bitmap was not found. > > Still, it is likely that there will be users that have to > work with a mix of old and new qemu versions, depending on > which features get backported where, at which point being > able to rely on 'qemu-img --list' output to know for sure > whether a given NBD export has the desired dirty bitmap is > much nicer than blindly connecting and risking that the > entire image may appear clean. We can make our --list code > smart enough to work around buggy servers by tracking > whether we've seen any qemu: replies in the original 0-query > list; if not, repeat with a single query on "qemu:" (which > may still have no replies, but then we know for sure we > didn't trip up on the server bug). > > Signed-off-by: Eric Blake <eblake@redhat.com> > Message-Id: <20181215135324.152629-22-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > --- > nbd/client.c | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/nbd/client.c b/nbd/client.c > index 2001e6e8160..64f3e45edd4 100644 > --- a/nbd/client.c > +++ b/nbd/client.c > @@ -21,6 +21,7 @@ > #include "qapi/error.h" > #include "trace.h" > #include "nbd-internal.h" > +#include "qemu/cutils.h" > > /* Definitions for opaque data types */ > > @@ -828,6 +829,8 @@ static int nbd_list_meta_contexts(QIOChannel *ioc, > Error **errp) > { > int ret; > + int seen_any = false; > + int seen_qemu = false; > > if (nbd_send_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, > info->name, NULL, errp) < 0) { > @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc, > > ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, > &context, NULL, errp); > + if (ret == 0 && seen_any && !seen_qemu) { > + /* > + * Work around qemu 3.0 bug: the server forgot to send > + * "qemu:" replies to 0 queries. If we saw at least one > + * reply (probably base:allocation), but none of them were if we are saying about 3.0, it is base:allocation for sure, isn't it? > + * qemu:, then run a more specific query to make sure. > + */ > + seen_qemu = true; > + if (nbd_send_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, > + info->name, "qemu:", errp) < 0) { > + return -1; > + } > + continue; > + } > if (ret <= 0) { > return ret; > } > + seen_any = true; > + seen_qemu |= strstart(context, "qemu:", NULL); > info->contexts = g_renew(char *, info->contexts, ++info->n_contexts); > info->contexts[info->n_contexts - 1] = context; > } >
On 1/16/19 9:43 AM, Vladimir Sementsov-Ogievskiy wrote: >> @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc, >> >> ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, >> &context, NULL, errp); >> + if (ret == 0 && seen_any && !seen_qemu) { >> + /* >> + * Work around qemu 3.0 bug: the server forgot to send >> + * "qemu:" replies to 0 queries. If we saw at least one >> + * reply (probably base:allocation), but none of them were > > if we are saying about 3.0, it is base:allocation for sure, isn't it? > >> + * qemu:, then run a more specific query to make sure. If the server is qemu 3.0, then yes, it is base:allocation. But it could be some other server that has its own custom return without implementing base:allocation.
17.01.2019 6:21, Eric Blake wrote: > On 1/16/19 9:43 AM, Vladimir Sementsov-Ogievskiy wrote: > >>> @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc, >>> >>> ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, >>> &context, NULL, errp); >>> + if (ret == 0 && seen_any && !seen_qemu) { >>> + /* >>> + * Work around qemu 3.0 bug: the server forgot to send >>> + * "qemu:" replies to 0 queries. If we saw at least one >>> + * reply (probably base:allocation), but none of them were >> >> if we are saying about 3.0, it is base:allocation for sure, isn't it? >> >>> + * qemu:, then run a more specific query to make sure. > > If the server is qemu 3.0, then yes, it is base:allocation. But it could > be some other server that has its own custom return without implementing > base:allocation. Indeed) And in this context, heuristic about that server should have at least one context listed with no query seems not generic. Why not query 'qemu:' even if empty query returns nothing? So, at least, "probably" is imbalanced with this not described in comment heuristic which seems bound to 3.0. >
On 1/17/19 2:07 AM, Vladimir Sementsov-Ogievskiy wrote: > 17.01.2019 6:21, Eric Blake wrote: >> On 1/16/19 9:43 AM, Vladimir Sementsov-Ogievskiy wrote: >> >>>> @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc, >>>> >>>> ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, >>>> &context, NULL, errp); >>>> + if (ret == 0 && seen_any && !seen_qemu) { >>>> + /* >>>> + * Work around qemu 3.0 bug: the server forgot to send >>>> + * "qemu:" replies to 0 queries. If we saw at least one >>>> + * reply (probably base:allocation), but none of them were >>> >>> if we are saying about 3.0, it is base:allocation for sure, isn't it? >>> >>>> + * qemu:, then run a more specific query to make sure. >> >> If the server is qemu 3.0, then yes, it is base:allocation. But it could >> be some other server that has its own custom return without implementing >> base:allocation. > > Indeed) And in this context, heuristic about that server should have at least one > context listed with no query seems not generic. Why not query 'qemu:' even if empty > query returns nothing? Because it is highly unlikely that we will ever encounter a server that knows how to serve "qemu:" contexts but not "base:allocation" (qemu is not such a server, and why would any other server bother with qemu: specific information?). > So, at least, "probably" is imbalanced with this not described > in comment heuristic which seems bound to 3.0. qemu 3.0 is the only server where the heuristic will make a difference, but not the only server where the heuristic may trigger.
diff --git a/nbd/client.c b/nbd/client.c index 2001e6e8160..64f3e45edd4 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -21,6 +21,7 @@ #include "qapi/error.h" #include "trace.h" #include "nbd-internal.h" +#include "qemu/cutils.h" /* Definitions for opaque data types */ @@ -828,6 +829,8 @@ static int nbd_list_meta_contexts(QIOChannel *ioc, Error **errp) { int ret; + int seen_any = false; + int seen_qemu = false; if (nbd_send_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, info->name, NULL, errp) < 0) { @@ -839,9 +842,25 @@ static int nbd_list_meta_contexts(QIOChannel *ioc, ret = nbd_receive_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, &context, NULL, errp); + if (ret == 0 && seen_any && !seen_qemu) { + /* + * Work around qemu 3.0 bug: the server forgot to send + * "qemu:" replies to 0 queries. If we saw at least one + * reply (probably base:allocation), but none of them were + * qemu:, then run a more specific query to make sure. + */ + seen_qemu = true; + if (nbd_send_one_meta_context(ioc, NBD_OPT_LIST_META_CONTEXT, + info->name, "qemu:", errp) < 0) { + return -1; + } + continue; + } if (ret <= 0) { return ret; } + seen_any = true; + seen_qemu |= strstart(context, "qemu:", NULL); info->contexts = g_renew(char *, info->contexts, ++info->n_contexts); info->contexts[info->n_contexts - 1] = context; }
Commit 3d068aff forgot to advertise available qemu: contexts when the client requests a list with 0 queries. Furthermore, 3.0 shipped with a qemu-img hack of x-dirty-bitmap (commit 216ee365) that _silently_ acts as though the entire image is clean if a requested bitmap is not present. Both bugs have been recently fixed, so that a modern qemu server gives full context output right away, and the client refuses a connection if a requested x-dirty-bitmap was not found. Still, it is likely that there will be users that have to work with a mix of old and new qemu versions, depending on which features get backported where, at which point being able to rely on 'qemu-img --list' output to know for sure whether a given NBD export has the desired dirty bitmap is much nicer than blindly connecting and risking that the entire image may appear clean. We can make our --list code smart enough to work around buggy servers by tracking whether we've seen any qemu: replies in the original 0-query list; if not, repeat with a single query on "qemu:" (which may still have no replies, but then we know for sure we didn't trip up on the server bug). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20181215135324.152629-22-eblake@redhat.com> --- nbd/client.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)