Message ID | 1442907862-21376-4-git-send-email-wency@cn.fujitsu.com |
---|---|
State | New |
Headers | show |
On Tue 22 Sep 2015 09:44:21 AM CEST, Wen Congyang wrote: > The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. > It justs for adding/removing quorum's child now, and don't support all > kinds of children, nor all block drivers. So it is experimental now. Better "So it is experimental for now". Otherwise it suggests that it was not experimental before but now it is. > +void qmp_x_blockdev_child_add(const char *parent, const char *child, > + Error **errp) > +{ > + BlockDriverState *parent_bs, *child_bs; > + Error *local_err = NULL; > + > + parent_bs = bdrv_lookup_bs(parent, parent, &local_err); > + if (!parent_bs) { > + error_propagate(errp, local_err); > + return; > + } I think this is fine as it is now, but since you are checking the value of parent_bs to see if bdrv_lookup_bs() succeeded you can use errp directly instead and skip the error_propagate() call. > +void qmp_x_blockdev_child_del(const char *parent, const char *child, > + Error **errp) > +{ > + BlockDriverState *parent_bs, *child_bs; > + Error *local_err = NULL; > + > + parent_bs = bdrv_lookup_bs(parent, parent, &local_err); > + if (!parent_bs) { > + error_propagate(errp, local_err); > + return; > + } Same here. > +## > +# @x-blockdev-child-add > +# > +# Add a new child to the parent BDS. Currently only the Quorum driver > +# implements this feature. This is useful to fix a broken quorum > child. Again, I would not use 'BDS' here. "Add a new child to an existing block device", or something like that. > +# @parent: graph node name or id which the child will be added to. > +# > +# @child: graph node name that will be added. > +# > +# Note: this command is experimental, and not a stable API. "[..] and does not have a stable API". > +x-blockdev-child-add > +------------ > + > +Add a child to a quorum node. > + > +Arguments: > + > +- "parent": the quorum's id or node name > +- "child": the child node-name which will be added > + I would use the same kind of description than in the json file. "Add a child to an existing block device. Currently only Quorum supports this feature". Same for the 'x-blockdev-child-del' documentation. Berto
On 22.09.2015 09:44, Wen Congyang wrote: > The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. > It justs for adding/removing quorum's child now, and don't support all > kinds of children, It does support all kinds of children for quorum, doesn't it? > nor all block drivers. So it is experimental now. Well, that is not really a reason why we would have to make it experimental. For instance, blockdev-add (although some might argue it actually is experimental...) doesn't support all block drivers either. The reason I am hesitant of adding an experimental QMP interface that is actually visible to the user (compare x-image in blkverify and blkdebug, which are not documented and not to be used by the user) is twofold: (1) At some point we have to say "OK, this is good enough now" and make it stable. What would that point be? Who can guarantee that we wouldn't want to make any interface changes after that point? Would we actually remember to revisit this function once in a while and consider making it stable? (2) While marking things experimental *should* keep people from using it in their tools, nobody can guarantee that it *does* keep them from doing so. So we may find ourselves in the situation of having to keep a compatibility interface for an experimental feature... For the second point, you should also consider how useful this feature is to management tools. Just being able to remove and attach children from a quorum node seems very useful on its own. I don't see why we should wait for having support for other block drivers; also, for most block drivers there is no meaningful way of adding or removing children as nicely as that is possible for quorum. E.g. you may have a block filter in the future where you want to exchange its child BDS. This exchange should be an atomic operation, so we cannot use this interface there anyway. For quorum, such an exchange does not need to be atomic, since you can just add the new child first and remove the old one afterwards. So maybe in the future we get some block driver other than quorum for which adding and removing children (as opposed to atomically exchanging them) makes sense, but for now I can only see quorum. Therefore, that this works for quorum only is in my opinion not a reason to make it experimental. I think we actually want to keep it that way. So the question would then be: What ways can you imagine to change this interface, which would necessitate an incompatible change, therefore calling for an experimental interface? (My point is that with such an experimental interface, management tools cannot use it, even though it'd be a very nice functionality to have) > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> > Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> > Signed-off-by: Gonglei <arei.gonglei@huawei.com> > --- > blockdev.c | 48 +++++++++++++++++++++++++++++++++++++++++ > qapi/block-core.json | 34 +++++++++++++++++++++++++++++ > qmp-commands.hx | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 143 insertions(+) > > diff --git a/blockdev.c b/blockdev.c > index 32b04b4..8da0ffb 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -3086,6 +3086,54 @@ fail: > qmp_output_visitor_cleanup(ov); > } > > +void qmp_x_blockdev_child_add(const char *parent, const char *child, > + Error **errp) > +{ > + BlockDriverState *parent_bs, *child_bs; > + Error *local_err = NULL; > + > + parent_bs = bdrv_lookup_bs(parent, parent, &local_err); > + if (!parent_bs) { > + error_propagate(errp, local_err); > + return; > + } > + > + child_bs = bdrv_find_node(child); > + if (!child_bs) { > + error_setg(errp, "Node '%s' not found", child); > + return; > + } > + > + bdrv_add_child(parent_bs, child_bs, &local_err); > + if (local_err) { > + error_propagate(errp, local_err); > + } You can just pass errp to bdrv_add_child(). > +} > + > +void qmp_x_blockdev_child_del(const char *parent, const char *child, > + Error **errp) > +{ > + BlockDriverState *parent_bs, *child_bs; > + Error *local_err = NULL; > + > + parent_bs = bdrv_lookup_bs(parent, parent, &local_err); > + if (!parent_bs) { > + error_propagate(errp, local_err); > + return; > + } > + > + child_bs = bdrv_find_node(child); > + if (!child_bs) { > + error_setg(errp, "Node '%s' not found", child); > + return; > + } > + > + bdrv_del_child(parent_bs, child_bs, &local_err); > + if (local_err) { > + error_propagate(errp, local_err); > + } Same here. Max > +} > + > BlockJobInfoList *qmp_query_block_jobs(Error **errp) > { > BlockJobInfoList *head = NULL, **p_next = &head; > diff --git a/qapi/block-core.json b/qapi/block-core.json > index bb2189e..9418f05 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -2114,3 +2114,37 @@ > ## > { 'command': 'block-set-write-threshold', > 'data': { 'node-name': 'str', 'write-threshold': 'uint64' } } > + > +## > +# @x-blockdev-child-add > +# > +# Add a new child to the parent BDS. Currently only the Quorum driver > +# implements this feature. This is useful to fix a broken quorum child. > +# > +# @parent: graph node name or id which the child will be added to. > +# > +# @child: graph node name that will be added. > +# > +# Note: this command is experimental, and not a stable API. > +# > +# Since: 2.5 > +## > +{ 'command': 'x-blockdev-child-add', > + 'data' : { 'parent': 'str', 'child': 'str' } } > + > +## > +# @x-blockdev-child-del > +# > +# Remove a child from the parent BDS. Currently only the Quorum driver > +# implements this feature. This is useful to fix a broken quorum child. > +# Note, you can't remove a child if it would bring the quorum below its > +# threshold. > +# > +# @parent: graph node name or id from which the child will removed. > +# > +# @child: graph node name that will be removed. > +# > +# Since: 2.5 > +## > +{ 'command': 'x-blockdev-child-del', > + 'data' : { 'parent': 'str', 'child': 'str' } } > diff --git a/qmp-commands.hx b/qmp-commands.hx > index 66f0300..36e75b1 100644 > --- a/qmp-commands.hx > +++ b/qmp-commands.hx > @@ -3916,6 +3916,67 @@ Example (2): > EQMP > > { > + .name = "x-blockdev-child-add", > + .args_type = "parent:B,child:B", > + .mhandler.cmd_new = qmp_marshal_x_blockdev_child_add, > + }, > + > +SQMP > +x-blockdev-child-add > +------------ > + > +Add a child to a quorum node. > + > +Arguments: > + > +- "parent": the quorum's id or node name > +- "child": the child node-name which will be added > + > +Note: this command is experimental, and not a stable API. It doesn't > +support all kinds of children, nor all block drivers. > + > +Example: > + > +-> { "execute": blockdev-add", > + "arguments": { "options": { "driver": "raw", > + "node-name": "new_node", > + "id": "test_new_node", > + "file": { "driver": "file", > + "filename": "test.raw" } } } } > +<- { "return": {} } > +-> { "execute": "x-blockdev-child-add", > + "arguments": { "parent": "disk1", "child": "new_node" } } > +<- { "return": {} } > + > +EQMP > + > + { > + .name = "x-blockdev-child-del", > + .args_type = "parent:B,child:B", > + .mhandler.cmd_new = qmp_marshal_x_blockdev_child_del, > + }, > + > +SQMP > +x-blockdev-child-del > +------------ > + > +Delete a child from a quorum node. It can be used to remove a broken > +quorum child. > + > +Arguments: > + > +- "parent": the quorum's id or node name > +- "child": the child node-name which will be removed > + > +Example: > + > +-> { "execute": "x-blockdev-child-del", > + "arguments": { "parent": "disk1", "child": "new_node" } } > +<- { "return": {} } > + > +EQMP > + > + { > .name = "query-named-block-nodes", > .args_type = "", > .mhandler.cmd_new = qmp_marshal_query_named_block_nodes, >
Max Reitz <mreitz@redhat.com> writes: > On 22.09.2015 09:44, Wen Congyang wrote: >> The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. >> It justs for adding/removing quorum's child now, and don't support all >> kinds of children, > > It does support all kinds of children for quorum, doesn't it? > >> nor all block drivers. So it is experimental now. > > Well, that is not really a reason why we would have to make it > experimental. For instance, blockdev-add (although some might argue it > actually is experimental...) doesn't support all block drivers either. Yup, and not calling it x-blockdev-add until it's done was a mistake. People tried using it, then found its current limitations the painful way. Not nice. > The reason I am hesitant of adding an experimental QMP interface that is > actually visible to the user (compare x-image in blkverify and blkdebug, > which are not documented and not to be used by the user) is twofold: > > (1) At some point we have to say "OK, this is good enough now" and make > it stable. What would that point be? Who can guarantee that we > wouldn't want to make any interface changes after that point? Nobody can, just like for any other interface. So? The x- prefix enables work spanning multiple releases. Until the feature is complete, we have a hard time seeing the whole picture, and therefore the risk of interface mistakes is higher than normal. Once it's complete, we drop the x-. > Would > we actually remember to revisit this function once in a while and > consider making it stable? Has that been a problem in the past? If the feature is of any use, there's always been mounting pressure to finish the job and drop the x-. If it's of no use, not dropping the x- would do no harm. The opposite, actually. > (2) While marking things experimental *should* keep people from using it > in their tools, nobody can guarantee that it *does* keep them from > doing so. So we may find ourselves in the situation of having to > keep a compatibility interface for an experimental feature... You can't force people not to do stupid things, but you *can* improve their chances at avoiding them. Clearly marking experimental stuff improves chances. > For the second point, you should also consider how useful this feature > is to management tools. Just being able to remove and attach children > from a quorum node seems very useful on its own. I don't see why we > should wait for having support for other block drivers; also, for most > block drivers there is no meaningful way of adding or removing children > as nicely as that is possible for quorum. Okay, this is an argument I might be able to buy. > E.g. you may have a block filter in the future where you want to > exchange its child BDS. This exchange should be an atomic operation, so > we cannot use this interface there anyway. For quorum, such an exchange > does not need to be atomic, since you can just add the new child first > and remove the old one afterwards. > > So maybe in the future we get some block driver other than quorum for > which adding and removing children (as opposed to atomically exchanging > them) makes sense, but for now I can only see quorum. Therefore, that > this works for quorum only is in my opinion not a reason to make it > experimental. I think we actually want to keep it that way. Are you telling us the existing interface is insufficiently general? That the general interface neeeds to support atomic replacement? If yes, why isn't the general interface is what we should do for quorum? Delete is atomic replacement by nothing, add is atomic replacement of nothing. > So the question would then be: What ways can you imagine to change this > interface, which would necessitate an incompatible change, therefore > calling for an experimental interface? Yes, that's an important question. Another important question is whether this is the interface we want. A secondary question is whether the incompleteness of the implementation demands an x- to warn users. > (My point is that with such an experimental interface, management tools > cannot use it, even though it'd be a very nice functionality to have) How much work is it to finish the job? Unless it's a substantial chunk, debating x- is much ado about nothing: just finish the job already :)
On Thu 08 Oct 2015 08:15:25 AM CEST, Markus Armbruster wrote: >> For the second point, you should also consider how useful this >> feature is to management tools. Just being able to remove and attach >> children from a quorum node seems very useful on its own. I don't see >> why we should wait for having support for other block drivers; also, >> for most block drivers there is no meaningful way of adding or >> removing children as nicely as that is possible for quorum. > > Okay, this is an argument I might be able to buy. Note that if we want to make this interface stable there's one use case missing: there's currently no way to change the vote threshold. This is maybe not so important for the COLO use case, but for the general case of adding and removing children from a quorum node having the possibility to change the threshold makes a lot of sense. That would probably require a its own API ('quorum-set-threshold' or something like that) so I don't think it has an effect on these child-add and child-del commands, but I wanted to mention it here anyway in case someone sees something that I'm overlooking. Berto
Am 08.10.2015 um 10:29 hat Alberto Garcia geschrieben: > On Thu 08 Oct 2015 08:15:25 AM CEST, Markus Armbruster wrote: > >> For the second point, you should also consider how useful this > >> feature is to management tools. Just being able to remove and attach > >> children from a quorum node seems very useful on its own. I don't see > >> why we should wait for having support for other block drivers; also, > >> for most block drivers there is no meaningful way of adding or > >> removing children as nicely as that is possible for quorum. > > > > Okay, this is an argument I might be able to buy. > > Note that if we want to make this interface stable there's one use case > missing: there's currently no way to change the vote threshold. > > This is maybe not so important for the COLO use case, but for the > general case of adding and removing children from a quorum node having > the possibility to change the threshold makes a lot of sense. > > That would probably require a its own API ('quorum-set-threshold' or > something like that) so I don't think it has an effect on these > child-add and child-del commands, but I wanted to mention it here anyway > in case someone sees something that I'm overlooking. The right way to change the voting threshold is bdrv_reopen(). Adding support to quorum for changing the threshold this way should be trivial. The hard part is how to expose it in QMP, which is why I've only added it to qemu-io so far (which happens to be accessible from HMP). If you need this functionality, let's start a new email thread and discuss the right QMP interface for blockdev-reopen (or blockdev- set-options or whatever you would call it). Kevin
On Thu 08 Oct 2015 12:03:45 PM CEST, Kevin Wolf wrote: >> Note that if we want to make this interface stable there's one use >> case missing: there's currently no way to change the vote threshold. >> > The right way to change the voting threshold is bdrv_reopen(). Adding > support to quorum for changing the threshold this way should be > trivial. Ah, and that would mean implementing quorum_reopen_*. Works for me. > The hard part is how to expose it in QMP, which is why I've only added > it to qemu-io so far (which happens to be accessible from HMP). > > If you need this functionality, let's start a new email thread and > discuss the right QMP interface for blockdev-reopen (or blockdev- > set-options or whatever you would call it). Ok, sounds good. Thanks! Berto
Am 08.10.2015 um 08:15 hat Markus Armbruster geschrieben: > Max Reitz <mreitz@redhat.com> writes: > > E.g. you may have a block filter in the future where you want to > > exchange its child BDS. This exchange should be an atomic operation, so > > we cannot use this interface there anyway. For quorum, such an exchange > > does not need to be atomic, since you can just add the new child first > > and remove the old one afterwards. > > > > So maybe in the future we get some block driver other than quorum for > > which adding and removing children (as opposed to atomically exchanging > > them) makes sense, but for now I can only see quorum. Therefore, that > > this works for quorum only is in my opinion not a reason to make it > > experimental. I think we actually want to keep it that way. > > Are you telling us the existing interface is insufficiently general? > That the general interface neeeds to support atomic replacement? > > If yes, why isn't the general interface is what we should do for quorum? > Delete is atomic replacement by nothing, add is atomic replacement of > nothing. The general thing is what we used to call "dynamic reconfiguration". If we want a single command to enable it (which would be great if we could), we need to some more design work first. Atomic replacement might be the operation we're looking for, but we have to be sure. So far we haven't thought about dynamic reconfiguation enough that we would know the right solution, but enough that we know it's hard. That would be an argument for me that makes adding an x-* command now acceptable. On the other hand, the fact that we want a single command in the end makes me want to keep it experimental. What types of dynamic reconfiguration do we need to support? I'll start with a small list, feel free to extend it: * Replace a child node with another node. This works pretty much everywhere in the tree - including the root, i.e. BlockBackend! Working just on BDSes doesn't seem to be enough. * Adding a child to a node that can take additional children (e.g. quorum can take an arbitrary number; but also COW image formats have an option child, so you could add a backing file to a node originally opened with BDRV_O_NO_BACKING) Same as atomically replacing nothing by a node. * Removing an optional child from a node that remains valid with that child removed. The same examples apply. Same as atomically replacing a child by nothing. * Add a new node between two existing nodes. An example is taking a live snapshot, which inserts a new node between the BlockBackend and the first BDS. Or it could be used to insert a filter somewhere in the graph. Same as creating the new node pointing to node B (or dynamically adding it) and then atomically replacing the reference of node A that pointed to B with a reference to the new node. * Add a new node between multiple existing nodes. This is one of the examples we always used to discuss with dynamic reconfiguration: base <- sn1 <- sn2 <--+-- virtio-blk | +-- NBD server Adding a filter could result in this: base <- sn1 <- sn2 <- throttling <--+-- virtio-blk | +-- NBD server Or this: base <- sn1 <- sn2 <--+-- throttling <- virtio-blk | +-- NBD server Or this: base <- sn1 <- sn2 <--+-- virtio-blk | +-- throttling <- NBD server All of these are different kinds of "adding a filter", and all of them are valid operations that a user could want to perform. Case 2 and 3 are really just "add a new node between two existing nodes", as explained above. Case 1 is new: We still create the throttling node so that it already points to sn2, but now we have to atomically change the children of two things (the BlockBackends of virtio-blk and the NBD server). Not a problem per se because we can just do that, but it raises the question whether the atomic replacement operation needs to be transactionable. * Remove a node between two (or more) other nodes. Same as atomically replacing a child by a grandchild. For more than two involved nodes, again a transactional version might be needed. So at the first sight, this operation seems to work as the general interface for dynamic reconfiguration. One problem we discussed earlier that I'm not sure whether it's related is filter nodes inserted automatically once we change e.g. the I/O throttling QMP commands to add a throttling filter BDS to the graph. If the user creates nodes A and B, but sets throttling options, they might end up with a chain like this: A <- throttling <- B Now imagine that they want to add another filter between A and B, let's say blkdebug. They would need to know that they have to insert the new node between A and throttling or B and throttling, but not between A and B. If they tried to insert it between A and B, the algorithm above says that they would let blkdebug point to A, and replace B's child with blkdebug, the resulting tree wouldn't be throttled any more! A <- blkdebug <- B throttling (ref = 0 -> delete) Even if they knew that they have to consider the throttling node, it currently wouldn't have a node-name, and with Jeff's autogenerated names it wouldn't be predictable. Maybe the dynamic reconfiguration interface does need to be a bit cleverer. Anyway, after writing all of this, I'm almost convinced now that an experimental interface is the right thing to do in this patch series. Kevin
Am 08.10.2015 um 13:02 hat Kevin Wolf geschrieben: > Am 08.10.2015 um 08:15 hat Markus Armbruster geschrieben: > > Max Reitz <mreitz@redhat.com> writes: > > > E.g. you may have a block filter in the future where you want to > > > exchange its child BDS. This exchange should be an atomic operation, so > > > we cannot use this interface there anyway. For quorum, such an exchange > > > does not need to be atomic, since you can just add the new child first > > > and remove the old one afterwards. > > > > > > So maybe in the future we get some block driver other than quorum for > > > which adding and removing children (as opposed to atomically exchanging > > > them) makes sense, but for now I can only see quorum. Therefore, that > > > this works for quorum only is in my opinion not a reason to make it > > > experimental. I think we actually want to keep it that way. > > > > Are you telling us the existing interface is insufficiently general? > > That the general interface neeeds to support atomic replacement? > > > > If yes, why isn't the general interface is what we should do for quorum? > > Delete is atomic replacement by nothing, add is atomic replacement of > > nothing. > > The general thing is what we used to call "dynamic reconfiguration". If > we want a single command to enable it (which would be great if we > could), we need to some more design work first. Atomic replacement might > be the operation we're looking for, but we have to be sure. > > So far we haven't thought about dynamic reconfiguation enough that we > would know the right solution, but enough that we know it's hard. That > would be an argument for me that makes adding an x-* command now > acceptable. On the other hand, the fact that we want a single command in > the end makes me want to keep it experimental. > > > What types of dynamic reconfiguration do we need to support? I'll start > with a small list, feel free to extend it: > > * Replace a child node with another node. This works pretty much > everywhere in the tree - including the root, i.e. BlockBackend! > Working just on BDSes doesn't seem to be enough. > > * Adding a child to a node that can take additional children (e.g. > quorum can take an arbitrary number; but also COW image formats have > an option child, so you could add a backing file to a node originally > opened with BDRV_O_NO_BACKING) > > Same as atomically replacing nothing by a node. > > * Removing an optional child from a node that remains valid with that > child removed. The same examples apply. > > Same as atomically replacing a child by nothing. > > * Add a new node between two existing nodes. An example is taking a live > snapshot, which inserts a new node between the BlockBackend and the > first BDS. Or it could be used to insert a filter somewhere in the > graph. > > Same as creating the new node pointing to node B (or dynamically > adding it) and then atomically replacing the reference of node A that > pointed to B with a reference to the new node. > > * Add a new node between multiple existing nodes. This is one of the > examples we always used to discuss with dynamic reconfiguration: > > base <- sn1 <- sn2 <--+-- virtio-blk > | > +-- NBD server > > Adding a filter could result in this: > > base <- sn1 <- sn2 <- throttling <--+-- virtio-blk > | > +-- NBD server > > Or this: > > base <- sn1 <- sn2 <--+-- throttling <- virtio-blk > | > +-- NBD server > > Or this: > > base <- sn1 <- sn2 <--+-- virtio-blk > | > +-- throttling <- NBD server > > All of these are different kinds of "adding a filter", and all of > them are valid operations that a user could want to perform. > > Case 2 and 3 are really just "add a new node between two existing > nodes", as explained above. > > Case 1 is new: We still create the throttling node so that it already > points to sn2, but now we have to atomically change the children of > two things (the BlockBackends of virtio-blk and the NBD server). Not a > problem per se because we can just do that, but it raises the question > whether the atomic replacement operation needs to be transactionable. > > * Remove a node between two (or more) other nodes. > > Same as atomically replacing a child by a grandchild. For more than > two involved nodes, again a transactional version might be needed. > > So at the first sight, this operation seems to work as the general > interface for dynamic reconfiguration. And actually, after re-reading the other branch of this email thread where I replied to Berto that he wants to use bdrv_reopen() to change the voting threshold for quorum, it occurred to me that bdrv_reopen() should possibly also be this atomatic replacement function. After all, the references to other nodes are blockdev-add options, and if we implement bdrv_reopen() fully so that it can change any options that can be given to blockdev-add, that means that we must be able to replace children. Kevin > One problem we discussed earlier that I'm not sure whether it's related > is filter nodes inserted automatically once we change e.g. the I/O > throttling QMP commands to add a throttling filter BDS to the graph. > > If the user creates nodes A and B, but sets throttling options, they > might end up with a chain like this: > > A <- throttling <- B > > Now imagine that they want to add another filter between A and B, let's > say blkdebug. They would need to know that they have to insert the new > node between A and throttling or B and throttling, but not between A and > B. If they tried to insert it between A and B, the algorithm above says > that they would let blkdebug point to A, and replace B's child with > blkdebug, the resulting tree wouldn't be throttled any more! > > A <- blkdebug <- B > > throttling (ref = 0 -> delete) > > Even if they knew that they have to consider the throttling node, it > currently wouldn't have a node-name, and with Jeff's autogenerated names > it wouldn't be predictable. > > Maybe the dynamic reconfiguration interface does need to be a bit > cleverer. > > > Anyway, after writing all of this, I'm almost convinced now that an > experimental interface is the right thing to do in this patch series. > > Kevin >
On 08.10.2015 08:15, Markus Armbruster wrote: > Max Reitz <mreitz@redhat.com> writes: > >> On 22.09.2015 09:44, Wen Congyang wrote: >>> The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. >>> It justs for adding/removing quorum's child now, and don't support all >>> kinds of children, >> >> It does support all kinds of children for quorum, doesn't it? >> >>> nor all block drivers. So it is experimental now. >> >> Well, that is not really a reason why we would have to make it >> experimental. For instance, blockdev-add (although some might argue it >> actually is experimental...) doesn't support all block drivers either. > > Yup, and not calling it x-blockdev-add until it's done was a mistake. > People tried using it, then found its current limitations the painful > way. Not nice. I knew I should have written s/some might/Markus does/. ;-) >> The reason I am hesitant of adding an experimental QMP interface that is >> actually visible to the user (compare x-image in blkverify and blkdebug, >> which are not documented and not to be used by the user) is twofold: >> >> (1) At some point we have to say "OK, this is good enough now" and make >> it stable. What would that point be? Who can guarantee that we >> wouldn't want to make any interface changes after that point? > > Nobody can, just like for any other interface. So? The main question is "what would that point be". As I can see you're arguing that that point would be "once people want to use it", but I'm arguing that people want to use it today or we wouldn't need this interface at all. I'm against adding external experimental interface because having external interface indicates that someone wants to use them, but making them experimental indicates that nobody should use them. This interface is added for the COLO series. The documentation added in patch 5 there explains usage of COLO with x-child-add. I don't think that should be there, because it's experimental. But why have an external interface if nobody should use it anyway? > The x- prefix enables work spanning multiple releases. Until the > feature is complete, we have a hard time seeing the whole picture, and > therefore the risk of interface mistakes is higher than normal. Once > it's complete, we drop the x-. I'm arguing the feature is complete as far as what it's supposed to do goes. >> Would >> we actually remember to revisit this function once in a while and >> consider making it stable? > > Has that been a problem in the past? I don't know, because I never witnessed an external experimental interface, but I haven't been closely involved with qemu for too long. > If the feature is of any use, there's always been mounting pressure to > finish the job and drop the x-. If it's of no use, not dropping the x- > would do no harm. The opposite, actually. This is of use, however. I do see your point, but I'm still arguing that I don't see why we need an external interface then. But COLO needs an external interface (which management tools should be able to use!) so there's that. >> (2) While marking things experimental *should* keep people from using it >> in their tools, nobody can guarantee that it *does* keep them from >> doing so. So we may find ourselves in the situation of having to >> keep a compatibility interface for an experimental feature... > > You can't force people not to do stupid things, but you *can* improve > their chances at avoiding them. Clearly marking experimental stuff > improves chances. OK, right. >> For the second point, you should also consider how useful this feature >> is to management tools. Just being able to remove and attach children >> from a quorum node seems very useful on its own. I don't see why we >> should wait for having support for other block drivers; also, for most >> block drivers there is no meaningful way of adding or removing children >> as nicely as that is possible for quorum. > > Okay, this is an argument I might be able to buy. > >> E.g. you may have a block filter in the future where you want to >> exchange its child BDS. This exchange should be an atomic operation, so >> we cannot use this interface there anyway. For quorum, such an exchange >> does not need to be atomic, since you can just add the new child first >> and remove the old one afterwards. >> >> So maybe in the future we get some block driver other than quorum for >> which adding and removing children (as opposed to atomically exchanging >> them) makes sense, but for now I can only see quorum. Therefore, that >> this works for quorum only is in my opinion not a reason to make it >> experimental. I think we actually want to keep it that way. > > Are you telling us the existing interface is insufficiently general? > That the general interface neeeds to support atomic replacement? The general interface for all block drivers and situations (as described by Kevin), yes. The interface for adding/removing children from quorum in regards to what COLO needs, no. > If yes, why isn't the general interface is what we should do for quorum? > Delete is atomic replacement by nothing, add is atomic replacement of > nothing. Yes, but I personally don't have a problem with macro functions. I don't see the harm in implementing blockdev-child-{add,del} now the way they are and later replacing them by calls to the general function. But why should we do that? Some people really want COLO in better sooner than later. Telling them to wait for until the block layer is as nice as we want it to be is not really an option I'd be willing to accept. I don't see why we should delay the COLO work just so that we don't have these two (macro) functions. The alternative would be to keep it experimental and then tell every COLO user to use the general function once it's available. But since COLO users are probably mostly management tools, that seems much more difficult to me than to just keep these two macro functions as legacy in qemu. >> So the question would then be: What ways can you imagine to change this >> interface, which would necessitate an incompatible change, therefore >> calling for an experimental interface? > > Yes, that's an important question. > > Another important question is whether this is the interface we want. I can see why this is an important question to you, but to me it is not so much. As I argued above, I'm not opposed to adding interface that are actually not what we want, but that are what users want, and that are easy to implement with the interface that we want. It's what I call a “macro function”. > A secondary question is whether the incompleteness of the implementation > demands an x- to warn users. We don't want to shoehorn generality into these two functions, but add a new one anyway. We might want to add optional parameters later on (e.g. changing the threshold), but that would be a compatible change. >> (My point is that with such an experimental interface, management tools >> cannot use it, even though it'd be a very nice functionality to have) > > How much work is it to finish the job? Unless it's a substantial chunk, > debating x- is much ado about nothing: just finish the job already :) We have proven in the past to need a significant amount of time for even for non-substantial chunks. Often, non-substantial chunks turn out to be substantial when actually tackling them. It basically comes down to whether the COLO authors are willing to wait for us to do the job. And frankly, if I were them, I wouldn't be. Max
On 08.10.2015 10:29, Alberto Garcia wrote: > On Thu 08 Oct 2015 08:15:25 AM CEST, Markus Armbruster wrote: >>> For the second point, you should also consider how useful this >>> feature is to management tools. Just being able to remove and attach >>> children from a quorum node seems very useful on its own. I don't see >>> why we should wait for having support for other block drivers; also, >>> for most block drivers there is no meaningful way of adding or >>> removing children as nicely as that is possible for quorum. >> >> Okay, this is an argument I might be able to buy. > > Note that if we want to make this interface stable there's one use case > missing: there's currently no way to change the vote threshold. Besides what Kevin said: If you add a new function, that would be independent from these two functions. If want to add it as an optional parameter to blockdev-add-child so the change is done atomically, that wouldn't be an incompatible interface change either. Max > This is maybe not so important for the COLO use case, but for the > general case of adding and removing children from a quorum node having > the possibility to change the threshold makes a lot of sense. > > That would probably require a its own API ('quorum-set-threshold' or > something like that) so I don't think it has an effect on these > child-add and child-del commands, but I wanted to mention it here anyway > in case someone sees something that I'm overlooking. > > Berto >
* Max Reitz (mreitz@redhat.com) wrote: > On 08.10.2015 08:15, Markus Armbruster wrote: > > Max Reitz <mreitz@redhat.com> writes: > > > >> On 22.09.2015 09:44, Wen Congyang wrote: > >>> The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. > >>> It justs for adding/removing quorum's child now, and don't support all > >>> kinds of children, > >> > >> It does support all kinds of children for quorum, doesn't it? > >> > >>> nor all block drivers. So it is experimental now. > >> > >> Well, that is not really a reason why we would have to make it > >> experimental. For instance, blockdev-add (although some might argue it > >> actually is experimental...) doesn't support all block drivers either. > > > > Yup, and not calling it x-blockdev-add until it's done was a mistake. > > People tried using it, then found its current limitations the painful > > way. Not nice. > > I knew I should have written s/some might/Markus does/. ;-) > > >> The reason I am hesitant of adding an experimental QMP interface that is > >> actually visible to the user (compare x-image in blkverify and blkdebug, > >> which are not documented and not to be used by the user) is twofold: > >> > >> (1) At some point we have to say "OK, this is good enough now" and make > >> it stable. What would that point be? Who can guarantee that we > >> wouldn't want to make any interface changes after that point? > > > > Nobody can, just like for any other interface. So? > > The main question is "what would that point be". As I can see you're > arguing that that point would be "once people want to use it", but I'm > arguing that people want to use it today or we wouldn't need this > interface at all. > > I'm against adding external experimental interface because having > external interface indicates that someone wants to use them, but making > them experimental indicates that nobody should use them. > > This interface is added for the COLO series. The documentation added in > patch 5 there explains usage of COLO with x-child-add. I don't think > that should be there, because it's experimental. But why have an > external interface if nobody should use it anyway? Because it lets people move forward; the COLO series is pretty huge, there already seem to be side discussions spawning off about dynamic reconfiguration of stuff, who knows how long those will take to pan out. Adding the experimental stuff makes it easier for people to try and get some feedback on. If everyone turns out to love it then it only takes a trivial patch to promote it; if people actually realise there is a better interface then it's no problem to change it either - x- doesn't stop any one using it, but it does remove their right to moan if it changes. Dave > > > The x- prefix enables work spanning multiple releases. Until the > > feature is complete, we have a hard time seeing the whole picture, and > > therefore the risk of interface mistakes is higher than normal. Once > > it's complete, we drop the x-. > > I'm arguing the feature is complete as far as what it's supposed to do goes. > > >> Would > >> we actually remember to revisit this function once in a while and > >> consider making it stable? > > > > Has that been a problem in the past? > > I don't know, because I never witnessed an external experimental > interface, but I haven't been closely involved with qemu for too long. > > > If the feature is of any use, there's always been mounting pressure to > > finish the job and drop the x-. If it's of no use, not dropping the x- > > would do no harm. The opposite, actually. > > This is of use, however. > > I do see your point, but I'm still arguing that I don't see why we need > an external interface then. But COLO needs an external interface (which > management tools should be able to use!) so there's that. > > >> (2) While marking things experimental *should* keep people from using it > >> in their tools, nobody can guarantee that it *does* keep them from > >> doing so. So we may find ourselves in the situation of having to > >> keep a compatibility interface for an experimental feature... > > > > You can't force people not to do stupid things, but you *can* improve > > their chances at avoiding them. Clearly marking experimental stuff > > improves chances. > > OK, right. > > >> For the second point, you should also consider how useful this feature > >> is to management tools. Just being able to remove and attach children > >> from a quorum node seems very useful on its own. I don't see why we > >> should wait for having support for other block drivers; also, for most > >> block drivers there is no meaningful way of adding or removing children > >> as nicely as that is possible for quorum. > > > > Okay, this is an argument I might be able to buy. > > > >> E.g. you may have a block filter in the future where you want to > >> exchange its child BDS. This exchange should be an atomic operation, so > >> we cannot use this interface there anyway. For quorum, such an exchange > >> does not need to be atomic, since you can just add the new child first > >> and remove the old one afterwards. > >> > >> So maybe in the future we get some block driver other than quorum for > >> which adding and removing children (as opposed to atomically exchanging > >> them) makes sense, but for now I can only see quorum. Therefore, that > >> this works for quorum only is in my opinion not a reason to make it > >> experimental. I think we actually want to keep it that way. > > > > Are you telling us the existing interface is insufficiently general? > > That the general interface neeeds to support atomic replacement? > > The general interface for all block drivers and situations (as described > by Kevin), yes. The interface for adding/removing children from quorum > in regards to what COLO needs, no. > > > If yes, why isn't the general interface is what we should do for quorum? > > Delete is atomic replacement by nothing, add is atomic replacement of > > nothing. > > Yes, but I personally don't have a problem with macro functions. I don't > see the harm in implementing blockdev-child-{add,del} now the way they > are and later replacing them by calls to the general function. > > But why should we do that? > > Some people really want COLO in better sooner than later. Telling them > to wait for until the block layer is as nice as we want it to be is not > really an option I'd be willing to accept. I don't see why we should > delay the COLO work just so that we don't have these two (macro) functions. > > The alternative would be to keep it experimental and then tell every > COLO user to use the general function once it's available. But since > COLO users are probably mostly management tools, that seems much more > difficult to me than to just keep these two macro functions as legacy in > qemu. > > >> So the question would then be: What ways can you imagine to change this > >> interface, which would necessitate an incompatible change, therefore > >> calling for an experimental interface? > > > > Yes, that's an important question. > > > > Another important question is whether this is the interface we want. > > I can see why this is an important question to you, but to me it is not > so much. As I argued above, I'm not opposed to adding interface that are > actually not what we want, but that are what users want, and that are > easy to implement with the interface that we want. It's what I call a > “macro function”. > > > A secondary question is whether the incompleteness of the implementation > > demands an x- to warn users. > > We don't want to shoehorn generality into these two functions, but add a > new one anyway. We might want to add optional parameters later on (e.g. > changing the threshold), but that would be a compatible change. > > >> (My point is that with such an experimental interface, management tools > >> cannot use it, even though it'd be a very nice functionality to have) > > > > How much work is it to finish the job? Unless it's a substantial chunk, > > debating x- is much ado about nothing: just finish the job already :) > > We have proven in the past to need a significant amount of time for even > for non-substantial chunks. Often, non-substantial chunks turn out to be > substantial when actually tackling them. > > It basically comes down to whether the COLO authors are willing to wait > for us to do the job. And frankly, if I were them, I wouldn't be. > > Max > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 09.10.2015 18:42, Dr. David Alan Gilbert wrote: > * Max Reitz (mreitz@redhat.com) wrote: >> On 08.10.2015 08:15, Markus Armbruster wrote: >>> Max Reitz <mreitz@redhat.com> writes: >>> >>>> On 22.09.2015 09:44, Wen Congyang wrote: >>>>> The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. >>>>> It justs for adding/removing quorum's child now, and don't support all >>>>> kinds of children, >>>> >>>> It does support all kinds of children for quorum, doesn't it? >>>> >>>>> nor all block drivers. So it is experimental now. >>>> >>>> Well, that is not really a reason why we would have to make it >>>> experimental. For instance, blockdev-add (although some might argue it >>>> actually is experimental...) doesn't support all block drivers either. >>> >>> Yup, and not calling it x-blockdev-add until it's done was a mistake. >>> People tried using it, then found its current limitations the painful >>> way. Not nice. >> >> I knew I should have written s/some might/Markus does/. ;-) >> >>>> The reason I am hesitant of adding an experimental QMP interface that is >>>> actually visible to the user (compare x-image in blkverify and blkdebug, >>>> which are not documented and not to be used by the user) is twofold: >>>> >>>> (1) At some point we have to say "OK, this is good enough now" and make >>>> it stable. What would that point be? Who can guarantee that we >>>> wouldn't want to make any interface changes after that point? >>> >>> Nobody can, just like for any other interface. So? >> >> The main question is "what would that point be". As I can see you're >> arguing that that point would be "once people want to use it", but I'm >> arguing that people want to use it today or we wouldn't need this >> interface at all. >> >> I'm against adding external experimental interface because having >> external interface indicates that someone wants to use them, but making >> them experimental indicates that nobody should use them. >> >> This interface is added for the COLO series. The documentation added in >> patch 5 there explains usage of COLO with x-child-add. I don't think >> that should be there, because it's experimental. But why have an >> external interface if nobody should use it anyway? > > Because it lets people move forward; the COLO series is pretty huge, there > already seem to be side discussions spawning off about dynamic reconfiguration > of stuff, who knows how long those will take to pan out. Yes, and my point is that with these functions (blockdev-child-{add,del}) the result of that side discussion doesn't matter. > Adding the experimental stuff makes it easier for people to try and > get some feedback on. The thing is, I cannot imagine any feedback that would necessitate an incompatible change. “I want to change quorum's options while adding/removing children” can easily be accomplished with an additional optional parameter. But I do know that we want to keep things experimental exactly because there can be feedback which I cannot imagine right now. > If everyone turns out to love it then it only takes a trivial patch to promote > it; if people actually realise there is a better interface then it's > no problem to change it either - x- doesn't stop any one using it, But it should, shouldn't it? No management tool should be using an x- command, as far as I know. And these are functions which are clearly designed for management tools. If management tools are indeed free to use x- functions, then I'm completely fine with making these experimental for now. It's just that it looks to me like “Hey, look, we have these two new functions you can use!” and then, two versions later we remove them because we have a general reconfiguration option, and we'll say “It's your own fault for using experimental functions” if someone complains. That sounds hypocritical to me, but I'm probably being to “legal” here. (i.e. it's more like “Hey, look, two new cool functions! But don't use them.” which sounds like a contradiction to me, whereas it actually means “Feel free to use them but don't blame us”) tl;dr: May management tools use x- functions? And is it actually conceivable for them to do so? If so, my whole argument becomes moot, so let's make these functions x-. Mainly I'd like to know about some example where we had an x- function in the past. Markus seemed to imply that was the case. Max > but it > does remove their right to moan if it changes. > > Dave
Max Reitz <mreitz@redhat.com> writes: > On 08.10.2015 08:15, Markus Armbruster wrote: >> Max Reitz <mreitz@redhat.com> writes: >> >>> On 22.09.2015 09:44, Wen Congyang wrote: >>>> The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. >>>> It justs for adding/removing quorum's child now, and don't support all >>>> kinds of children, >>> >>> It does support all kinds of children for quorum, doesn't it? >>> >>>> nor all block drivers. So it is experimental now. >>> >>> Well, that is not really a reason why we would have to make it >>> experimental. For instance, blockdev-add (although some might argue it >>> actually is experimental...) doesn't support all block drivers either. >> >> Yup, and not calling it x-blockdev-add until it's done was a mistake. >> People tried using it, then found its current limitations the painful >> way. Not nice. > > I knew I should have written s/some might/Markus does/. ;-) :) >>> The reason I am hesitant of adding an experimental QMP interface that is >>> actually visible to the user (compare x-image in blkverify and blkdebug, >>> which are not documented and not to be used by the user) is twofold: >>> >>> (1) At some point we have to say "OK, this is good enough now" and make >>> it stable. What would that point be? Who can guarantee that we >>> wouldn't want to make any interface changes after that point? >> >> Nobody can, just like for any other interface. So? > > The main question is "what would that point be". As I can see you're > arguing that that point would be "once people want to use it", but I'm > arguing that people want to use it today or we wouldn't need this > interface at all. > > I'm against adding external experimental interface because having > external interface indicates that someone wants to use them, but making > them experimental indicates that nobody should use them. Make that "nobody should use them in anger just yet." They can and should be used to develop stuff. Developing non-trivial interfaces without actual users is risky. Sometimes, you can't see shortcomings in an interface until you try to use it. Successful actual use can build confidence the experimental interface is in fact ready to be cast in stone. > This interface is added for the COLO series. The documentation added in > patch 5 there explains usage of COLO with x-child-add. I don't think > that should be there, because it's experimental. But why have an > external interface if nobody should use it anyway? > >> The x- prefix enables work spanning multiple releases. Until the >> feature is complete, we have a hard time seeing the whole picture, and >> therefore the risk of interface mistakes is higher than normal. Once >> it's complete, we drop the x-. > > I'm arguing the feature is complete as far as what it's supposed to do goes. When you say "the feature is complete", you're arguing this specific interface is ready. When you say you're "against adding external experimental interface", you're arguing proper use of x-. Let's try to keep the discussion of principles separate from the discussion of the specific instance. On the former: maybe the interface is ready, but I can't judge offhand. All I can do is ask questions. On the latter: I emphatically disagree with the idea that experimental interfaces are to be avoided because "someone wants to use them". >>> Would >>> we actually remember to revisit this function once in a while and >>> consider making it stable? >> >> Has that been a problem in the past? > > I don't know, because I never witnessed an external experimental > interface, but I haven't been closely involved with qemu for too long. QMP itself started experimental, and was declared stable after fairly heated discussion. I think we've been dropping x- prefixes pretty routinely. A quick, superficial search finds commit 41310c6 (x-rdma) and commit 467b3f3 (x-iothread). [...]
"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes: > * Max Reitz (mreitz@redhat.com) wrote: >> On 08.10.2015 08:15, Markus Armbruster wrote: >> > Max Reitz <mreitz@redhat.com> writes: >> > >> >> On 22.09.2015 09:44, Wen Congyang wrote: >> >>> The new QMP command name is x-blockdev-child-add, and >> >>> x-blockdev-child-del. >> >>> It justs for adding/removing quorum's child now, and don't support all >> >>> kinds of children, >> >> >> >> It does support all kinds of children for quorum, doesn't it? >> >> >> >>> nor all block drivers. So it is experimental now. >> >> >> >> Well, that is not really a reason why we would have to make it >> >> experimental. For instance, blockdev-add (although some might argue it >> >> actually is experimental...) doesn't support all block drivers either. >> > >> > Yup, and not calling it x-blockdev-add until it's done was a mistake. >> > People tried using it, then found its current limitations the painful >> > way. Not nice. >> >> I knew I should have written s/some might/Markus does/. ;-) >> >> >> The reason I am hesitant of adding an experimental QMP interface that is >> >> actually visible to the user (compare x-image in blkverify and blkdebug, >> >> which are not documented and not to be used by the user) is twofold: >> >> >> >> (1) At some point we have to say "OK, this is good enough now" and make >> >> it stable. What would that point be? Who can guarantee that we >> >> wouldn't want to make any interface changes after that point? >> > >> > Nobody can, just like for any other interface. So? >> >> The main question is "what would that point be". As I can see you're >> arguing that that point would be "once people want to use it", but I'm >> arguing that people want to use it today or we wouldn't need this >> interface at all. >> >> I'm against adding external experimental interface because having >> external interface indicates that someone wants to use them, but making >> them experimental indicates that nobody should use them. >> >> This interface is added for the COLO series. The documentation added in >> patch 5 there explains usage of COLO with x-child-add. I don't think >> that should be there, because it's experimental. But why have an >> external interface if nobody should use it anyway? > > Because it lets people move forward; the COLO series is pretty huge, there > already seem to be side discussions spawning off about dynamic reconfiguration > of stuff, who knows how long those will take to pan out. > Adding the experimental stuff makes it easier for people to try and > get some feedback on. > If everyone turns out to love it then it only takes a trivial patch to promote > it; if people actually realise there is a better interface then it's > no problem to change it either - x- doesn't stop any one using it, but it > does remove their right to moan if it changes. Exactly.
* Max Reitz (mreitz@redhat.com) wrote: > On 09.10.2015 18:42, Dr. David Alan Gilbert wrote: > > * Max Reitz (mreitz@redhat.com) wrote: > >> On 08.10.2015 08:15, Markus Armbruster wrote: > >>> Max Reitz <mreitz@redhat.com> writes: > >>> > >>>> On 22.09.2015 09:44, Wen Congyang wrote: > >>>>> The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. > >>>>> It justs for adding/removing quorum's child now, and don't support all > >>>>> kinds of children, > >>>> > >>>> It does support all kinds of children for quorum, doesn't it? > >>>> > >>>>> nor all block drivers. So it is experimental now. > >>>> > >>>> Well, that is not really a reason why we would have to make it > >>>> experimental. For instance, blockdev-add (although some might argue it > >>>> actually is experimental...) doesn't support all block drivers either. > >>> > >>> Yup, and not calling it x-blockdev-add until it's done was a mistake. > >>> People tried using it, then found its current limitations the painful > >>> way. Not nice. > >> > >> I knew I should have written s/some might/Markus does/. ;-) > >> > >>>> The reason I am hesitant of adding an experimental QMP interface that is > >>>> actually visible to the user (compare x-image in blkverify and blkdebug, > >>>> which are not documented and not to be used by the user) is twofold: > >>>> > >>>> (1) At some point we have to say "OK, this is good enough now" and make > >>>> it stable. What would that point be? Who can guarantee that we > >>>> wouldn't want to make any interface changes after that point? > >>> > >>> Nobody can, just like for any other interface. So? > >> > >> The main question is "what would that point be". As I can see you're > >> arguing that that point would be "once people want to use it", but I'm > >> arguing that people want to use it today or we wouldn't need this > >> interface at all. > >> > >> I'm against adding external experimental interface because having > >> external interface indicates that someone wants to use them, but making > >> them experimental indicates that nobody should use them. > >> > >> This interface is added for the COLO series. The documentation added in > >> patch 5 there explains usage of COLO with x-child-add. I don't think > >> that should be there, because it's experimental. But why have an > >> external interface if nobody should use it anyway? > > > > Because it lets people move forward; the COLO series is pretty huge, there > > already seem to be side discussions spawning off about dynamic reconfiguration > > of stuff, who knows how long those will take to pan out. > > Yes, and my point is that with these functions > (blockdev-child-{add,del}) the result of that side discussion doesn't > matter. > > > Adding the experimental stuff makes it easier for people to try and > > get some feedback on. > > The thing is, I cannot imagine any feedback that would necessitate an > incompatible change. “I want to change quorum's options while > adding/removing children” can easily be accomplished with an additional > optional parameter. > > But I do know that we want to keep things experimental exactly because > there can be feedback which I cannot imagine right now. > > > If everyone turns out to love it then it only takes a trivial patch to promote > > it; if people actually realise there is a better interface then it's > > no problem to change it either - x- doesn't stop any one using it, > > But it should, shouldn't it? No management tool should be using an x- > command, as far as I know. And these are functions which are clearly > designed for management tools. > > If management tools are indeed free to use x- functions, then I'm > completely fine with making these experimental for now. It's just that > it looks to me like “Hey, look, we have these two new functions you can > use!” and then, two versions later we remove them because we have a > general reconfiguration option, and we'll say “It's your own fault for > using experimental functions” if someone complains. That sounds > hypocritical to me, but I'm probably being to “legal” here. > > (i.e. it's more like “Hey, look, two new cool functions! But don't use > them.” which sounds like a contradiction to me, whereas it actually > means “Feel free to use them but don't blame us”) > > tl;dr: May management tools use x- functions? And is it actually > conceivable for them to do so? If so, my whole argument becomes moot, so > let's make these functions x-. My guess is the libvirt guys wont take the code to drive the x- methods; but it still makes it easier if someone wants to try this stuff out, they wont need to apply 2/3 sets of COLO code and then any management tools. > Mainly I'd like to know about some example where we had an x- function > in the past. Markus seemed to imply that was the case. The RDMA code used to have x- for migration protocol and some of the capabilities; we've recently added Jason Herne's cpu throttling with similar x- flags (1626fee3bdbb295d5e8aff800f7621357bb376d6), and input-send-event got moved into the x- world (df5b2adb7398d71016ee469f71e52075ed95e04e) which is much worse than it starting out there. Dave > > Max > > > but it > > does remove their right to moan if it changes. > > > > Dave > > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Am 09.10.2015 um 20:24 hat Max Reitz geschrieben: > On 09.10.2015 18:42, Dr. David Alan Gilbert wrote: > > * Max Reitz (mreitz@redhat.com) wrote: > >> On 08.10.2015 08:15, Markus Armbruster wrote: > >>> Max Reitz <mreitz@redhat.com> writes: > >>> > >>>> On 22.09.2015 09:44, Wen Congyang wrote: > >>>>> The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. > >>>>> It justs for adding/removing quorum's child now, and don't support all > >>>>> kinds of children, > >>>> > >>>> It does support all kinds of children for quorum, doesn't it? > >>>> > >>>>> nor all block drivers. So it is experimental now. > >>>> > >>>> Well, that is not really a reason why we would have to make it > >>>> experimental. For instance, blockdev-add (although some might argue it > >>>> actually is experimental...) doesn't support all block drivers either. > >>> > >>> Yup, and not calling it x-blockdev-add until it's done was a mistake. > >>> People tried using it, then found its current limitations the painful > >>> way. Not nice. > >> > >> I knew I should have written s/some might/Markus does/. ;-) > >> > >>>> The reason I am hesitant of adding an experimental QMP interface that is > >>>> actually visible to the user (compare x-image in blkverify and blkdebug, > >>>> which are not documented and not to be used by the user) is twofold: > >>>> > >>>> (1) At some point we have to say "OK, this is good enough now" and make > >>>> it stable. What would that point be? Who can guarantee that we > >>>> wouldn't want to make any interface changes after that point? > >>> > >>> Nobody can, just like for any other interface. So? > >> > >> The main question is "what would that point be". As I can see you're > >> arguing that that point would be "once people want to use it", but I'm > >> arguing that people want to use it today or we wouldn't need this > >> interface at all. > >> > >> I'm against adding external experimental interface because having > >> external interface indicates that someone wants to use them, but making > >> them experimental indicates that nobody should use them. > >> > >> This interface is added for the COLO series. The documentation added in > >> patch 5 there explains usage of COLO with x-child-add. I don't think > >> that should be there, because it's experimental. But why have an > >> external interface if nobody should use it anyway? > > > > Because it lets people move forward; the COLO series is pretty huge, there > > already seem to be side discussions spawning off about dynamic reconfiguration > > of stuff, who knows how long those will take to pan out. > > Yes, and my point is that with these functions > (blockdev-child-{add,del}) the result of that side discussion doesn't > matter. > > > Adding the experimental stuff makes it easier for people to try and > > get some feedback on. > > The thing is, I cannot imagine any feedback that would necessitate an > incompatible change. “I want to change quorum's options while > adding/removing children” can easily be accomplished with an additional > optional parameter. > > But I do know that we want to keep things experimental exactly because > there can be feedback which I cannot imagine right now. > > > If everyone turns out to love it then it only takes a trivial patch to promote > > it; if people actually realise there is a better interface then it's > > no problem to change it either - x- doesn't stop any one using it, > > But it should, shouldn't it? No management tool should be using an x- > command, as far as I know. And these are functions which are clearly > designed for management tools. It should stop people from using it in production, but it shouldn't stop them from using it for development and testing. We know that child-add/del is probably not the interface that we want to have in the end (and I would like to avoid accumulating tons of compatibility commands once we have what we want). If the COLO people say that they need an experimental command in order to make progress, that's fine with me. I think we'll all agree that while 'blockdev-add' can't reasonably be used in production yet, without it we couldn't have made much of the progress in the block layer that we made in the past year. If COLO people are in the same situation, let's give them what they need, without setting an unwanted interface in stone. > If management tools are indeed free to use x- functions, then I'm > completely fine with making these experimental for now. It's just that > it looks to me like “Hey, look, we have these two new functions you can > use!” and then, two versions later we remove them because we have a > general reconfiguration option, and we'll say “It's your own fault for > using experimental functions” if someone complains. That sounds > hypocritical to me, but I'm probably being to “legal” here. Experimental features in management tools (e.g. in some feature branch) can use them, they just can't rely on it keeping working. > (i.e. it's more like “Hey, look, two new cool functions! But don't use > them.” which sounds like a contradiction to me, whereas it actually > means “Feel free to use them but don't blame us”) "Hey, look, two new functions for an incomplete feature! Use them if you want to help with development, but otherwise stay clear of them." Kevin
On 07.10.2015 21:42, Max Reitz wrote: > On 22.09.2015 09:44, Wen Congyang wrote: >> The new QMP command name is x-blockdev-child-add, and x-blockdev-child-del. >> It justs for adding/removing quorum's child now, and don't support all >> kinds of children, > > It does support all kinds of children for quorum, doesn't it? > >> nor all block drivers. So it is experimental now. > > Well, that is not really a reason why we would have to make it > experimental. For instance, blockdev-add (although some might argue it > actually is experimental...) doesn't support all block drivers either. OK, after a rather long discussion, my opinion has changed. Adding them as experimental interfaces is good (although the reason noted here is not exactly what I feel is the reason that came out in the discussion). Thanks to everyone who argued with me! I took a good chunk of your time, and I'll have you know that I'm grateful for it. Max
Sorry for my slow reply. Kevin Wolf <kwolf@redhat.com> writes: > Am 08.10.2015 um 13:02 hat Kevin Wolf geschrieben: >> Am 08.10.2015 um 08:15 hat Markus Armbruster geschrieben: >> > Max Reitz <mreitz@redhat.com> writes: >> > > E.g. you may have a block filter in the future where you want to >> > > exchange its child BDS. This exchange should be an atomic operation, so >> > > we cannot use this interface there anyway. For quorum, such an exchange >> > > does not need to be atomic, since you can just add the new child first >> > > and remove the old one afterwards. >> > > >> > > So maybe in the future we get some block driver other than quorum for >> > > which adding and removing children (as opposed to atomically exchanging >> > > them) makes sense, but for now I can only see quorum. Therefore, that >> > > this works for quorum only is in my opinion not a reason to make it >> > > experimental. I think we actually want to keep it that way. >> > >> > Are you telling us the existing interface is insufficiently general? >> > That the general interface neeeds to support atomic replacement? >> > >> > If yes, why isn't the general interface is what we should do for quorum? >> > Delete is atomic replacement by nothing, add is atomic replacement of >> > nothing. >> >> The general thing is what we used to call "dynamic reconfiguration". If >> we want a single command to enable it (which would be great if we >> could), we need to some more design work first. Atomic replacement might >> be the operation we're looking for, but we have to be sure. >> >> So far we haven't thought about dynamic reconfiguation enough that we >> would know the right solution, but enough that we know it's hard. Yes. >> That >> would be an argument for me that makes adding an x-* command now >> acceptable. On the other hand, the fact that we want a single command in >> the end makes me want to keep it experimental. We have a bit more time to think until the release calcifies interfaces. >> What types of dynamic reconfiguration do we need to support? I'll start >> with a small list, feel free to extend it: >> >> * Replace a child node with another node. This works pretty much >> everywhere in the tree - including the root, i.e. BlockBackend! >> Working just on BDSes doesn't seem to be enough. Two ways to replace a node: 1. Operation on a node, i.e. replace node O by node N *everywhere*. 2. Operation on an edge, i.e. replace an edge to node O by an edge to node N. Less abstract: 1. Node replacement takes a BDS *O. It either replaces in place (something we've worked hard to get rid of), or needs to find all variables containing the value O. 2. Edge replacement takes a BDS **O. Naturally, whoever contains that thing may need to be notified, so the actual interface may well be a bit more involved. >> * Adding a child to a node that can take additional children (e.g. >> quorum can take an arbitrary number; but also COW image formats have >> an option child, so you could add a backing file to a node originally >> opened with BDRV_O_NO_BACKING) >> >> Same as atomically replacing nothing by a node. Yes. >> * Removing an optional child from a node that remains valid with that >> child removed. The same examples apply. >> >> Same as atomically replacing a child by nothing. Yes. >> * Add a new node between two existing nodes. An example is taking a live >> snapshot, which inserts a new node between the BlockBackend and the >> first BDS. Or it could be used to insert a filter somewhere in the >> graph. >> >> Same as creating the new node pointing to node B (or dynamically >> adding it) and then atomically replacing the reference of node A that >> pointed to B with a reference to the new node. Footnote: between creation of the new node and the atomic replace, the old node has another parent. Requires support for a BDS to have multiple parents, but that's a given, I think. >> * Add a new node between multiple existing nodes. This is one of the >> examples we always used to discuss with dynamic reconfiguration: >> >> base <- sn1 <- sn2 <--+-- virtio-blk >> | >> +-- NBD server >> >> Adding a filter could result in this: >> >> base <- sn1 <- sn2 <- throttling <--+-- virtio-blk >> | >> +-- NBD server >> >> Or this: >> >> base <- sn1 <- sn2 <--+-- throttling <- virtio-blk >> | >> +-- NBD server >> >> Or this: >> >> base <- sn1 <- sn2 <--+-- virtio-blk >> | >> +-- throttling <- NBD server >> >> All of these are different kinds of "adding a filter", and all of >> them are valid operations that a user could want to perform. >> >> Case 2 and 3 are really just "add a new node between two existing >> nodes", as explained above. Yes. >> Case 1 is new: We still create the throttling node so that it already >> points to sn2, but now we have to atomically change the children of >> two things (the BlockBackends of virtio-blk and the NBD server). Not a >> problem per se because we can just do that, but it raises the question >> whether the atomic replacement operation needs to be transactionable. Where "transactionable" means you can bind it together with other transactionable operations, and the resulting transaction either succeeds or fails completely. Right? In increasing flexibility order: a. atomic single replacement, not transactionable b. atomic multiple replacements, not transactionable c. single replacement, transactionable In your throttling example, having to insert the throttle in two separate operations (either of which can fail) seems tolerable. Can we think of an example where a similar multi-replacement should either succeed or fail completely? I.e. one that needs at least b. Can we think of an example where we need c? >> * Remove a node between two (or more) other nodes. >> >> Same as atomically replacing a child by a grandchild. For more than >> two involved nodes, again a transactional version might be needed. >> >> So at the first sight, this operation seems to work as the general >> interface for dynamic reconfiguration. Yes, we seem to be onto something here. Finally! > And actually, after re-reading the other branch of this email thread > where I replied to Berto that he wants to use bdrv_reopen() to change > the voting threshold for quorum, it occurred to me that bdrv_reopen() > should possibly also be this atomatic replacement function. > > After all, the references to other nodes are blockdev-add options, and > if we implement bdrv_reopen() fully so that it can change any options > that can be given to blockdev-add, that means that we must be able to > replace children. You got a point. Do you think implementing it will be hard? > > Kevin > >> One problem we discussed earlier that I'm not sure whether it's related >> is filter nodes inserted automatically once we change e.g. the I/O >> throttling QMP commands to add a throttling filter BDS to the graph. >> >> If the user creates nodes A and B, but sets throttling options, they >> might end up with a chain like this: >> >> A <- throttling <- B Is "sets throttling options" == "uses the (then) legacy interface for throttling instead of the more powerful 'insert throttling node' interface"? >> Now imagine that they want to add another filter between A and B, let's >> say blkdebug. They would need to know that they have to insert the new >> node between A and throttling or B and throttling, but not between A and >> B. If they tried to insert it between A and B, the algorithm above says >> that they would let blkdebug point to A, and replace B's child with >> blkdebug, the resulting tree wouldn't be throttled any more! >> >> A <- blkdebug <- B >> >> throttling (ref = 0 -> delete) Can we phrase the operation differently? Instead of "insert between A and B (silently replacing everything that is now between A and B)", say one of 1a. Replace node A by A <- blkdebug 1b. Replace node B by blkdebug <- B 2a. Replace edge A <- B by <- blkdebug <- Impossible in the current state, because there is no such edge. 2b. Replace edge A <- by <- blkdebug <- 2c. Replace edge <- B by <- blkdebug <- See also node vs. edge replacement above. >> Even if they knew that they have to consider the throttling node, it >> currently wouldn't have a node-name, and with Jeff's autogenerated names >> it wouldn't be predictable. >> >> Maybe the dynamic reconfiguration interface does need to be a bit >> cleverer. >> >> >> Anyway, after writing all of this, I'm almost convinced now that an >> experimental interface is the right thing to do in this patch series. Let's do experimental now, and reconsider before the release.
On 10/21/2015 04:27 PM, Markus Armbruster wrote: > Sorry for my slow reply. > > Kevin Wolf <kwolf@redhat.com> writes: > >> Am 08.10.2015 um 13:02 hat Kevin Wolf geschrieben: >>> Am 08.10.2015 um 08:15 hat Markus Armbruster geschrieben: >>>> Max Reitz <mreitz@redhat.com> writes: >>>>> E.g. you may have a block filter in the future where you want to >>>>> exchange its child BDS. This exchange should be an atomic operation, so >>>>> we cannot use this interface there anyway. For quorum, such an exchange >>>>> does not need to be atomic, since you can just add the new child first >>>>> and remove the old one afterwards. >>>>> >>>>> So maybe in the future we get some block driver other than quorum for >>>>> which adding and removing children (as opposed to atomically exchanging >>>>> them) makes sense, but for now I can only see quorum. Therefore, that >>>>> this works for quorum only is in my opinion not a reason to make it >>>>> experimental. I think we actually want to keep it that way. >>>> >>>> Are you telling us the existing interface is insufficiently general? >>>> That the general interface neeeds to support atomic replacement? >>>> >>>> If yes, why isn't the general interface is what we should do for quorum? >>>> Delete is atomic replacement by nothing, add is atomic replacement of >>>> nothing. >>> >>> The general thing is what we used to call "dynamic reconfiguration". If >>> we want a single command to enable it (which would be great if we >>> could), we need to some more design work first. Atomic replacement might >>> be the operation we're looking for, but we have to be sure. >>> >>> So far we haven't thought about dynamic reconfiguation enough that we >>> would know the right solution, but enough that we know it's hard. > > Yes. > >>> That >>> would be an argument for me that makes adding an x-* command now >>> acceptable. On the other hand, the fact that we want a single command in >>> the end makes me want to keep it experimental. > > We have a bit more time to think until the release calcifies interfaces. > >>> What types of dynamic reconfiguration do we need to support? I'll start >>> with a small list, feel free to extend it: >>> >>> * Replace a child node with another node. This works pretty much >>> everywhere in the tree - including the root, i.e. BlockBackend! >>> Working just on BDSes doesn't seem to be enough. > > Two ways to replace a node: > > 1. Operation on a node, i.e. replace node O by node N *everywhere*. > > 2. Operation on an edge, i.e. replace an edge to node O by an edge to > node N. > > Less abstract: > > 1. Node replacement takes a BDS *O. It either replaces in place > (something we've worked hard to get rid of), or needs to find all > variables containing the value O. > > 2. Edge replacement takes a BDS **O. Naturally, whoever contains that > thing may need to be notified, so the actual interface may well be a > bit more involved. > >>> * Adding a child to a node that can take additional children (e.g. >>> quorum can take an arbitrary number; but also COW image formats have >>> an option child, so you could add a backing file to a node originally >>> opened with BDRV_O_NO_BACKING) >>> >>> Same as atomically replacing nothing by a node. > > Yes. > >>> * Removing an optional child from a node that remains valid with that >>> child removed. The same examples apply. >>> >>> Same as atomically replacing a child by nothing. > > Yes. > >>> * Add a new node between two existing nodes. An example is taking a live >>> snapshot, which inserts a new node between the BlockBackend and the >>> first BDS. Or it could be used to insert a filter somewhere in the >>> graph. >>> >>> Same as creating the new node pointing to node B (or dynamically >>> adding it) and then atomically replacing the reference of node A that >>> pointed to B with a reference to the new node. > > Footnote: between creation of the new node and the atomic replace, the > old node has another parent. Requires support for a BDS to have > multiple parents, but that's a given, I think. > >>> * Add a new node between multiple existing nodes. This is one of the >>> examples we always used to discuss with dynamic reconfiguration: >>> >>> base <- sn1 <- sn2 <--+-- virtio-blk >>> | >>> +-- NBD server >>> >>> Adding a filter could result in this: >>> >>> base <- sn1 <- sn2 <- throttling <--+-- virtio-blk >>> | >>> +-- NBD server >>> >>> Or this: >>> >>> base <- sn1 <- sn2 <--+-- throttling <- virtio-blk >>> | >>> +-- NBD server >>> >>> Or this: >>> >>> base <- sn1 <- sn2 <--+-- virtio-blk >>> | >>> +-- throttling <- NBD server >>> >>> All of these are different kinds of "adding a filter", and all of >>> them are valid operations that a user could want to perform. >>> >>> Case 2 and 3 are really just "add a new node between two existing >>> nodes", as explained above. > > Yes. > >>> Case 1 is new: We still create the throttling node so that it already >>> points to sn2, but now we have to atomically change the children of >>> two things (the BlockBackends of virtio-blk and the NBD server). Not a >>> problem per se because we can just do that, but it raises the question >>> whether the atomic replacement operation needs to be transactionable. > > Where "transactionable" means you can bind it together with other > transactionable operations, and the resulting transaction either > succeeds or fails completely. Right? > > In increasing flexibility order: > > a. atomic single replacement, not transactionable > > b. atomic multiple replacements, not transactionable > > c. single replacement, transactionable > > In your throttling example, having to insert the throttle in two > separate operations (either of which can fail) seems tolerable. > > Can we think of an example where a similar multi-replacement should > either succeed or fail completely? I.e. one that needs at least b. > > Can we think of an example where we need c? > >>> * Remove a node between two (or more) other nodes. >>> >>> Same as atomically replacing a child by a grandchild. For more than >>> two involved nodes, again a transactional version might be needed. >>> >>> So at the first sight, this operation seems to work as the general >>> interface for dynamic reconfiguration. > > Yes, we seem to be onto something here. Finally! > >> And actually, after re-reading the other branch of this email thread >> where I replied to Berto that he wants to use bdrv_reopen() to change >> the voting threshold for quorum, it occurred to me that bdrv_reopen() >> should possibly also be this atomatic replacement function. >> >> After all, the references to other nodes are blockdev-add options, and >> if we implement bdrv_reopen() fully so that it can change any options >> that can be given to blockdev-add, that means that we must be able to >> replace children. > > You got a point. Do you think implementing it will be hard? > >> >> Kevin >> >>> One problem we discussed earlier that I'm not sure whether it's related >>> is filter nodes inserted automatically once we change e.g. the I/O >>> throttling QMP commands to add a throttling filter BDS to the graph. >>> >>> If the user creates nodes A and B, but sets throttling options, they >>> might end up with a chain like this: >>> >>> A <- throttling <- B > > Is "sets throttling options" == "uses the (then) legacy interface for > throttling instead of the more powerful 'insert throttling node' > interface"? > >>> Now imagine that they want to add another filter between A and B, let's >>> say blkdebug. They would need to know that they have to insert the new >>> node between A and throttling or B and throttling, but not between A and >>> B. If they tried to insert it between A and B, the algorithm above says >>> that they would let blkdebug point to A, and replace B's child with >>> blkdebug, the resulting tree wouldn't be throttled any more! >>> >>> A <- blkdebug <- B >>> >>> throttling (ref = 0 -> delete) > > Can we phrase the operation differently? Instead of "insert between A > and B (silently replacing everything that is now between A and B)", > say one of > > 1a. Replace node A by A <- blkdebug > > 1b. Replace node B by blkdebug <- B > > 2a. Replace edge A <- B by <- blkdebug <- > Impossible in the current state, because there is no such edge. What does 'edge' mean? Thanks Wen Congyang > > 2b. Replace edge A <- by <- blkdebug <- > > 2c. Replace edge <- B by <- blkdebug <- > > See also node vs. edge replacement above. > >>> Even if they knew that they have to consider the throttling node, it >>> currently wouldn't have a node-name, and with Jeff's autogenerated names >>> it wouldn't be predictable. >>> >>> Maybe the dynamic reconfiguration interface does need to be a bit >>> cleverer. >>> >>> >>> Anyway, after writing all of this, I'm almost convinced now that an >>> experimental interface is the right thing to do in this patch series. > > Let's do experimental now, and reconsider before the release. > > . >
Wen Congyang <wency@cn.fujitsu.com> writes: > On 10/21/2015 04:27 PM, Markus Armbruster wrote: [...] >> Can we phrase the operation differently? Instead of "insert between A >> and B (silently replacing everything that is now between A and B)", >> say one of >> >> 1a. Replace node A by A <- blkdebug >> >> 1b. Replace node B by blkdebug <- B >> >> 2a. Replace edge A <- B by <- blkdebug <- >> Impossible in the current state, because there is no such edge. > > What does 'edge' mean? It's graph terminology: the BB and BDS serve as the graph's nodes (a.k.a. vertices), and the pointers connecting them serve as the graph's edges. [...]
On 10/26/2015 03:24 PM, Markus Armbruster wrote: > Wen Congyang <wency@cn.fujitsu.com> writes: > >> On 10/21/2015 04:27 PM, Markus Armbruster wrote: > [...] >>> Can we phrase the operation differently? Instead of "insert between A >>> and B (silently replacing everything that is now between A and B)", >>> say one of >>> >>> 1a. Replace node A by A <- blkdebug >>> >>> 1b. Replace node B by blkdebug <- B >>> >>> 2a. Replace edge A <- B by <- blkdebug <- >>> Impossible in the current state, because there is no such edge. >> >> What does 'edge' mean? > > It's graph terminology: the BB and BDS serve as the graph's nodes > (a.k.a. vertices), and the pointers connecting them serve as the graph's > edges. Thanks Wen Congyang > > [...] > . >
diff --git a/blockdev.c b/blockdev.c index 32b04b4..8da0ffb 100644 --- a/blockdev.c +++ b/blockdev.c @@ -3086,6 +3086,54 @@ fail: qmp_output_visitor_cleanup(ov); } +void qmp_x_blockdev_child_add(const char *parent, const char *child, + Error **errp) +{ + BlockDriverState *parent_bs, *child_bs; + Error *local_err = NULL; + + parent_bs = bdrv_lookup_bs(parent, parent, &local_err); + if (!parent_bs) { + error_propagate(errp, local_err); + return; + } + + child_bs = bdrv_find_node(child); + if (!child_bs) { + error_setg(errp, "Node '%s' not found", child); + return; + } + + bdrv_add_child(parent_bs, child_bs, &local_err); + if (local_err) { + error_propagate(errp, local_err); + } +} + +void qmp_x_blockdev_child_del(const char *parent, const char *child, + Error **errp) +{ + BlockDriverState *parent_bs, *child_bs; + Error *local_err = NULL; + + parent_bs = bdrv_lookup_bs(parent, parent, &local_err); + if (!parent_bs) { + error_propagate(errp, local_err); + return; + } + + child_bs = bdrv_find_node(child); + if (!child_bs) { + error_setg(errp, "Node '%s' not found", child); + return; + } + + bdrv_del_child(parent_bs, child_bs, &local_err); + if (local_err) { + error_propagate(errp, local_err); + } +} + BlockJobInfoList *qmp_query_block_jobs(Error **errp) { BlockJobInfoList *head = NULL, **p_next = &head; diff --git a/qapi/block-core.json b/qapi/block-core.json index bb2189e..9418f05 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2114,3 +2114,37 @@ ## { 'command': 'block-set-write-threshold', 'data': { 'node-name': 'str', 'write-threshold': 'uint64' } } + +## +# @x-blockdev-child-add +# +# Add a new child to the parent BDS. Currently only the Quorum driver +# implements this feature. This is useful to fix a broken quorum child. +# +# @parent: graph node name or id which the child will be added to. +# +# @child: graph node name that will be added. +# +# Note: this command is experimental, and not a stable API. +# +# Since: 2.5 +## +{ 'command': 'x-blockdev-child-add', + 'data' : { 'parent': 'str', 'child': 'str' } } + +## +# @x-blockdev-child-del +# +# Remove a child from the parent BDS. Currently only the Quorum driver +# implements this feature. This is useful to fix a broken quorum child. +# Note, you can't remove a child if it would bring the quorum below its +# threshold. +# +# @parent: graph node name or id from which the child will removed. +# +# @child: graph node name that will be removed. +# +# Since: 2.5 +## +{ 'command': 'x-blockdev-child-del', + 'data' : { 'parent': 'str', 'child': 'str' } } diff --git a/qmp-commands.hx b/qmp-commands.hx index 66f0300..36e75b1 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3916,6 +3916,67 @@ Example (2): EQMP { + .name = "x-blockdev-child-add", + .args_type = "parent:B,child:B", + .mhandler.cmd_new = qmp_marshal_x_blockdev_child_add, + }, + +SQMP +x-blockdev-child-add +------------ + +Add a child to a quorum node. + +Arguments: + +- "parent": the quorum's id or node name +- "child": the child node-name which will be added + +Note: this command is experimental, and not a stable API. It doesn't +support all kinds of children, nor all block drivers. + +Example: + +-> { "execute": blockdev-add", + "arguments": { "options": { "driver": "raw", + "node-name": "new_node", + "id": "test_new_node", + "file": { "driver": "file", + "filename": "test.raw" } } } } +<- { "return": {} } +-> { "execute": "x-blockdev-child-add", + "arguments": { "parent": "disk1", "child": "new_node" } } +<- { "return": {} } + +EQMP + + { + .name = "x-blockdev-child-del", + .args_type = "parent:B,child:B", + .mhandler.cmd_new = qmp_marshal_x_blockdev_child_del, + }, + +SQMP +x-blockdev-child-del +------------ + +Delete a child from a quorum node. It can be used to remove a broken +quorum child. + +Arguments: + +- "parent": the quorum's id or node name +- "child": the child node-name which will be removed + +Example: + +-> { "execute": "x-blockdev-child-del", + "arguments": { "parent": "disk1", "child": "new_node" } } +<- { "return": {} } + +EQMP + + { .name = "query-named-block-nodes", .args_type = "", .mhandler.cmd_new = qmp_marshal_query_named_block_nodes,