diff mbox

[COLO-BLOCK,v7,02/17] quorum: implement block driver interfaces add/delete a BDS's child

Message ID 1435635285-5804-3-git-send-email-wency@cn.fujitsu.com
State New
Headers show

Commit Message

Wen Congyang June 30, 2015, 3:34 a.m. UTC
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Alberto Garcia <berto@igalia.com>
---
 block/quorum.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 70 insertions(+), 2 deletions(-)

Comments

Alberto Garcia July 2, 2015, 2:02 p.m. UTC | #1
On Tue 30 Jun 2015 05:34:30 AM CEST, Wen Congyang wrote:

> +static void quorum_add_child(BlockDriverState *bs, QDict *options, Error **errp)
> +{
> +    BDRVQuorumState *s = bs->opaque;
> +    int ret;
> +    Error *local_err = NULL;
> +
> +    bdrv_drain(bs);
> +
> +    if (s->num_children == s->max_children) {
> +        if (s->max_children >= INT_MAX / 2) {
> +            error_setg(errp, "Too many children");
> +            return;
> +        }
> +
> +        s->bs = g_renew(BlockDriverState *, s->bs, s->max_children * 2);
> +        memset(&s->bs[s->max_children], 0, s->max_children * sizeof(void *));
> +        s->max_children *= 2;
> +    }
> +
> +    ret = bdrv_open_image(&s->bs[s->num_children], NULL, options, "child", bs,
> +                          &child_format, false, &local_err);
> +    if (ret < 0) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +    s->num_children++;
> +
> +    /* TODO: Update vote_threshold */
> +}

A few comments:

1) Is there any reason why you grow the array exponentially instead of
   adding a fixed number of elements when the limit is reached? I guess
   in practice the number of children is never going to be too high
   anyway...

2) Did you think of any API to update vote_threshold? Currently it
   cannot be higher than num_children, so it's effectively limited by
   the number of children that are attached when the quorum device is
   created.

3) I don't think it's necessary to set to NULL the pointers in s->bs[i]
   when i >= num_children. There's no way to access those pointers
   anyway. Same for the ' s->bs[s->num_children] = NULL; ' bit in
   quorum_del_child(). I also think that using memset() for setting NULL
   pointers is not portable, although QEMU is already doing this in a
   few places.

Otherwise the code looks good to me. Thanks!

Berto
Wen Congyang July 2, 2015, 2:30 p.m. UTC | #2
At 2015/7/2 22:02, Alberto Garcia Wrote:
> On Tue 30 Jun 2015 05:34:30 AM CEST, Wen Congyang wrote:
>
>> +static void quorum_add_child(BlockDriverState *bs, QDict *options, Error **errp)
>> +{
>> +    BDRVQuorumState *s = bs->opaque;
>> +    int ret;
>> +    Error *local_err = NULL;
>> +
>> +    bdrv_drain(bs);
>> +
>> +    if (s->num_children == s->max_children) {
>> +        if (s->max_children >= INT_MAX / 2) {
>> +            error_setg(errp, "Too many children");
>> +            return;
>> +        }
>> +
>> +        s->bs = g_renew(BlockDriverState *, s->bs, s->max_children * 2);
>> +        memset(&s->bs[s->max_children], 0, s->max_children * sizeof(void *));
>> +        s->max_children *= 2;
>> +    }
>> +
>> +    ret = bdrv_open_image(&s->bs[s->num_children], NULL, options, "child", bs,
>> +                          &child_format, false, &local_err);
>> +    if (ret < 0) {
>> +        error_propagate(errp, local_err);
>> +        return;
>> +    }
>> +    s->num_children++;
>> +
>> +    /* TODO: Update vote_threshold */
>> +}
>
> A few comments:
>
> 1) Is there any reason why you grow the array exponentially instead of
>     adding a fixed number of elements when the limit is reached? I guess
>     in practice the number of children is never going to be too high
>     anyway...

Yes, will do it in the next version.

>
> 2) Did you think of any API to update vote_threshold? Currently it
>     cannot be higher than num_children, so it's effectively limited by
>     the number of children that are attached when the quorum device is
>     created.

The things I think about it is: if vote_threshold-- when deleting a 
child, vote_threshold
will be less than 0. If we don't do vote_threshold-- when it is 1, the 
vote_threshold will
be changed when all children are added again. Which behavior is better?

>
> 3) I don't think it's necessary to set to NULL the pointers in s->bs[i]
>     when i >= num_children. There's no way to access those pointers
>     anyway. Same for the ' s->bs[s->num_children] = NULL; ' bit in
>     quorum_del_child(). I also think that using memset() for setting NULL
>     pointers is not portable, although QEMU is already doing this in a
>     few places.

OK, will remove it in the next version. Just a question: why is using 
memset()
for setting NULL pointers is not prtable?

Thanks
Wen Congyang

>
> Otherwise the code looks good to me. Thanks!
>
> Berto
>
>
Alberto Garcia July 2, 2015, 3:21 p.m. UTC | #3
On Thu 02 Jul 2015 04:30:50 PM CEST, Wen Congyang wrote:

>> 2) Did you think of any API to update vote_threshold? Currently it
>>     cannot be higher than num_children, so it's effectively limited by
>>     the number of children that are attached when the quorum device is
>>     created.
>
> The things I think about it is: if vote_threshold-- when deleting a
> child, vote_threshold will be less than 0. If we don't do
> vote_threshold-- when it is 1, the vote_threshold will be changed when
> all children are added again. Which behavior is better?

Adding/removing a child and changing vote_threshold are in principle two
unrelated operations. One user could want to modify one but not the
other, so linking them together does not seem like a good idea. A
specific API to change vote_threshold could be added when the use case
appears (currently no one seems to have missed it).

So I think I would keep these things separate, I would also remove the
"TODO" comments that mention it.

With the current patches (that do not touch vote_threshold), you can
remove as many children as you want as long as there's at least one
left. However if you end up with num_chilren < vote_threshold then you
will get read errors. I see two alternatives:

  - Allow that and expect that the user will add the necessary children
    afterwards.

  - Forbid that completely and return an error.

I think I prefer the second option.

>> 3) I don't think it's necessary to set to NULL the pointers in
>> s->bs[i] when i >= num_children. There's no way to access those
>> pointers anyway. Same for the ' s->bs[s->num_children] = NULL; ' bit
>> in quorum_del_child(). I also think that using memset() for setting
>> NULL pointers is not portable, although QEMU is already doing this in
>> a few places.
>
> OK, will remove it in the next version. Just a question: why is using
> memset() for setting NULL pointers is not prtable?

The standard allows for null pointers to be internally represented by
nonzero bit patterns. However I'm not aware of any system that we
support that does that.

   http://c-faq.com/null/confusion4.html
   http://c-faq.com/null/machexamp.html

Berto
Wen Congyang July 3, 2015, 1:10 a.m. UTC | #4
On 07/02/2015 11:21 PM, Alberto Garcia wrote:
> On Thu 02 Jul 2015 04:30:50 PM CEST, Wen Congyang wrote:
> 
>>> 2) Did you think of any API to update vote_threshold? Currently it
>>>     cannot be higher than num_children, so it's effectively limited by
>>>     the number of children that are attached when the quorum device is
>>>     created.
>>
>> The things I think about it is: if vote_threshold-- when deleting a
>> child, vote_threshold will be less than 0. If we don't do
>> vote_threshold-- when it is 1, the vote_threshold will be changed when
>> all children are added again. Which behavior is better?
> 
> Adding/removing a child and changing vote_threshold are in principle two
> unrelated operations. One user could want to modify one but not the
> other, so linking them together does not seem like a good idea. A
> specific API to change vote_threshold could be added when the use case
> appears (currently no one seems to have missed it).
> 
> So I think I would keep these things separate, I would also remove the
> "TODO" comments that mention it.
> 
> With the current patches (that do not touch vote_threshold), you can
> remove as many children as you want as long as there's at least one
> left. However if you end up with num_chilren < vote_threshold then you
> will get read errors. I see two alternatives:
> 
>   - Allow that and expect that the user will add the necessary children
>     afterwards.
> 
>   - Forbid that completely and return an error.
> 
> I think I prefer the second option.

OK, I will do it.

> 
>>> 3) I don't think it's necessary to set to NULL the pointers in
>>> s->bs[i] when i >= num_children. There's no way to access those
>>> pointers anyway. Same for the ' s->bs[s->num_children] = NULL; ' bit
>>> in quorum_del_child(). I also think that using memset() for setting
>>> NULL pointers is not portable, although QEMU is already doing this in
>>> a few places.
>>
>> OK, will remove it in the next version. Just a question: why is using
>> memset() for setting NULL pointers is not prtable?
> 
> The standard allows for null pointers to be internally represented by
> nonzero bit patterns. However I'm not aware of any system that we
> support that does that.
> 
>    http://c-faq.com/null/confusion4.html
>    http://c-faq.com/null/machexamp.html

Thanks for your explantion.

Wen Congyang

> 
> Berto
> .
>
Eric Blake July 27, 2015, 2:45 p.m. UTC | #5
On 07/02/2015 09:21 AM, Alberto Garcia wrote:

> 
>>> 3) I don't think it's necessary to set to NULL the pointers in
>>> s->bs[i] when i >= num_children. There's no way to access those
>>> pointers anyway. Same for the ' s->bs[s->num_children] = NULL; ' bit
>>> in quorum_del_child(). I also think that using memset() for setting
>>> NULL pointers is not portable, although QEMU is already doing this in
>>> a few places.
>>
>> OK, will remove it in the next version. Just a question: why is using
>> memset() for setting NULL pointers is not prtable?
> 
> The standard allows for null pointers to be internally represented by
> nonzero bit patterns. However I'm not aware of any system that we
> support that does that.
> 
>    http://c-faq.com/null/confusion4.html
>    http://c-faq.com/null/machexamp.html

What's more, POSIX has very recently taken the stance that memset() to 0
will work on all POSIX systems, even if someone is insane enough to use
a non-zero bit pattern for NULL on modern hardware:

http://austingroupbugs.net/view.php?id=940
diff mbox

Patch

diff --git a/block/quorum.c b/block/quorum.c
index a7df17c..4a15493 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -67,6 +67,9 @@  typedef struct QuorumVotes {
 typedef struct BDRVQuorumState {
     BlockDriverState **bs; /* children BlockDriverStates */
     int num_children;      /* children count */
+    int max_children;      /* The maximum children count, we need to reallocate
+                            * bs if num_children will larger than maximum.
+                            */
     int threshold;         /* if less than threshold children reads gave the
                             * same result a quorum error occurs.
                             */
@@ -879,9 +882,9 @@  static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
         ret = -EINVAL;
         goto exit;
     }
-    if (s->num_children < 2) {
+    if (s->num_children < 1) {
         error_setg(&local_err,
-                   "Number of provided children must be greater than 1");
+                   "Number of provided children must be 1 or more");
         ret = -EINVAL;
         goto exit;
     }
@@ -930,6 +933,7 @@  static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
     /* allocate the children BlockDriverState array */
     s->bs = g_new0(BlockDriverState *, s->num_children);
     opened = g_new0(bool, s->num_children);
+    s->max_children = s->num_children;
 
     for (i = 0; i < s->num_children; i++) {
         char indexstr[32];
@@ -1000,6 +1004,67 @@  static void quorum_attach_aio_context(BlockDriverState *bs,
     }
 }
 
+static void quorum_add_child(BlockDriverState *bs, QDict *options, Error **errp)
+{
+    BDRVQuorumState *s = bs->opaque;
+    int ret;
+    Error *local_err = NULL;
+
+    bdrv_drain(bs);
+
+    if (s->num_children == s->max_children) {
+        if (s->max_children >= INT_MAX / 2) {
+            error_setg(errp, "Too many children");
+            return;
+        }
+
+        s->bs = g_renew(BlockDriverState *, s->bs, s->max_children * 2);
+        memset(&s->bs[s->max_children], 0, s->max_children * sizeof(void *));
+        s->max_children *= 2;
+    }
+
+    ret = bdrv_open_image(&s->bs[s->num_children], NULL, options, "child", bs,
+                          &child_format, false, &local_err);
+    if (ret < 0) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    s->num_children++;
+
+    /* TODO: Update vote_threshold */
+}
+
+static void quorum_del_child(BlockDriverState *bs, BlockDriverState *child_bs,
+                             Error **errp)
+{
+    BDRVQuorumState *s = bs->opaque;
+    int i;
+
+    for (i = 0; i < s->num_children; i++) {
+        if (s->bs[i] == child_bs) {
+            break;
+        }
+    }
+
+    if (i == s->num_children) {
+        error_setg(errp, "Invalid child");
+        return;
+    }
+
+    if (s->num_children == 1) {
+        error_setg(errp, "Cannot remove the last child");
+        return;
+    }
+
+    bdrv_drain(bs);
+    /* We can safe remove this child now */
+    memmove(&s->bs[i], &s->bs[i+1], (s->num_children - i - 1) * sizeof(void *));
+    s->num_children--;
+    s->bs[s->num_children] = NULL;
+
+    /* TODO: update vote_threshold */
+}
+
 static void quorum_refresh_filename(BlockDriverState *bs)
 {
     BDRVQuorumState *s = bs->opaque;
@@ -1054,6 +1119,9 @@  static BlockDriver bdrv_quorum = {
     .bdrv_detach_aio_context            = quorum_detach_aio_context,
     .bdrv_attach_aio_context            = quorum_attach_aio_context,
 
+    .bdrv_add_child                     = quorum_add_child,
+    .bdrv_del_child                     = quorum_del_child,
+
     .is_filter                          = true,
     .bdrv_recurse_is_first_non_filter   = quorum_recurse_is_first_non_filter,
 };