diff mbox

[RFC,03/14] quorum: ignore 0-length child

Message ID 1423710438-14377-4-git-send-email-wency@cn.fujitsu.com
State New
Headers show

Commit Message

Wen Congyang Feb. 12, 2015, 3:07 a.m. UTC
We connect to NBD server when starting block replication, so
the length is 0 before starting block replication.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block/quorum.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Max Reitz Feb. 23, 2015, 8:43 p.m. UTC | #1
On 2015-02-11 at 22:07, Wen Congyang wrote:
> We connect to NBD server when starting block replication, so
> the length is 0 before starting block replication.
>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>   block/quorum.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/block/quorum.c b/block/quorum.c
> index 5ed1ff8..e6aff5f 100644
> --- a/block/quorum.c
> +++ b/block/quorum.c
> @@ -734,6 +734,11 @@ static int64_t quorum_getlength(BlockDriverState *bs)
>           if (value < 0) {
>               return value;
>           }
> +
> +        if (!value) {
> +            continue;
> +        }
> +
>           if (value != result) {
>               return -EIO;
>           }

Hm, what do you think about some specific error value returned by your 
delayed NBD implementation? Like -ENOTCONN or something like that? Then 
we'd be able to discern a real 0-length block device from a 
not-yet-connected NBD server.

Also, while you did write that one shouldn't be using the NBD client as 
the first quorum child, I think we should try to support that case 
anyway. For this patch, that means accepting that 
bdrv_getlength(s->bs[0]) may be off.

Max
Wen Congyang Feb. 24, 2015, 2:33 a.m. UTC | #2
On 02/24/2015 04:43 AM, Max Reitz wrote:
> On 2015-02-11 at 22:07, Wen Congyang wrote:
>> We connect to NBD server when starting block replication, so
>> the length is 0 before starting block replication.
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>>   block/quorum.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/block/quorum.c b/block/quorum.c
>> index 5ed1ff8..e6aff5f 100644
>> --- a/block/quorum.c
>> +++ b/block/quorum.c
>> @@ -734,6 +734,11 @@ static int64_t quorum_getlength(BlockDriverState *bs)
>>           if (value < 0) {
>>               return value;
>>           }
>> +
>> +        if (!value) {
>> +            continue;
>> +        }
>> +
>>           if (value != result) {
>>               return -EIO;
>>           }
> 
> Hm, what do you think about some specific error value returned by your delayed NBD implementation? Like -ENOTCONN or something like that? Then we'd be able to discern a real 0-length block device from a not-yet-connected NBD server.
> 
> Also, while you did write that one shouldn't be using the NBD client as the first quorum child, I think we should try to support that case anyway. For this patch, that means accepting that bdrv_getlength(s->bs[0]) may be off.

Good idea. I will try it.

Thanks
Wen Congyang

> 
> Max
> .
>
Wen Congyang March 18, 2015, 5:29 a.m. UTC | #3
On 02/24/2015 04:43 AM, Max Reitz wrote:
> On 2015-02-11 at 22:07, Wen Congyang wrote:
>> We connect to NBD server when starting block replication, so
>> the length is 0 before starting block replication.
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>>   block/quorum.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/block/quorum.c b/block/quorum.c
>> index 5ed1ff8..e6aff5f 100644
>> --- a/block/quorum.c
>> +++ b/block/quorum.c
>> @@ -734,6 +734,11 @@ static int64_t quorum_getlength(BlockDriverState *bs)
>>           if (value < 0) {
>>               return value;
>>           }
>> +
>> +        if (!value) {
>> +            continue;
>> +        }
>> +
>>           if (value != result) {
>>               return -EIO;
>>           }
> 
> Hm, what do you think about some specific error value returned by your delayed NBD implementation? Like -ENOTCONN or something like that? Then we'd be able to discern a real 0-length block device from a not-yet-connected NBD server.

In my newest test, it cannot return -ENOTCONN, otherwise bdrv_open() will fail when we open the child.
Here is the backtrace:
(gdb) bt
#0  bdrv_open_common (bs=0x5555563b2230, file=0x0, options=0x5555563b55a0, flags=57410, drv=0x555555e90c00, errp=0x7fffffffd460) at block.c:1070
#1  0x000055555595ea72 in bdrv_open (pbs=0x7fffffffd5f8, filename=0x0, reference=0x0, options=0x5555563b55a0, flags=57410, drv=0x555555e90c00, errp=0x7fffffffd5e0) at block.c:1677
#2  0x000055555595e3a9 in bdrv_open_image (pbs=0x7fffffffd5f8, filename=0x0, options=0x5555563a6730, bdref_key=0x555555a86b4c "file", flags=57410, allow_none=true, errp=0x7fffffffd5e0) at block.c:1481
#3  0x000055555595e9ae in bdrv_open (pbs=0x555556388008, filename=0x0, reference=0x0, options=0x5555563a6730, flags=8258, drv=0x555555e8b800, errp=0x7fffffffd6b8) at block.c:1655
#4  0x00005555559b0058 in quorum_open (bs=0x55555639bd90, options=0x55555639f100, flags=8258, errp=0x7fffffffd758) at block/quorum.c:1000
#5  0x000055555595d0b8 in bdrv_open_common (bs=0x55555639bd90, file=0x0, options=0x55555639f100, flags=8258, drv=0x555555e8e5c0, errp=0x7fffffffd840) at block.c:1045
#6  0x000055555595ea72 in bdrv_open (pbs=0x55555639bd50, filename=0x0, reference=0x0, options=0x55555639f100, flags=8258, drv=0x555555e8e5c0, errp=0x7fffffffdb70) at block.c:1677
#7  0x00005555559b3bd3 in blk_new_open (name=0x55555639bc60 "virtio0", filename=0x0, reference=0x0, options=0x55555639a420, flags=66, errp=0x7fffffffdb70) at block/block-backend.c:129
#8  0x0000555555754f78 in blockdev_init (file=0x0, bs_opts=0x55555639a420, errp=0x7fffffffdb70) at blockdev.c:536
#9  0x0000555555755d90 in drive_new (all_opts=0x5555563777e0, block_default_type=IF_IDE) at blockdev.c:971
#10 0x000055555576b1f0 in drive_init_func (opts=0x5555563777e0, opaque=0x555556372b48) at vl.c:1104
#11 0x0000555555a1b019 in qemu_opts_foreach (list=0x555555e44060, func=0x55555576b1ba <drive_init_func>, opaque=0x555556372b48, abort_on_failure=1) at util/qemu-option.c:1059
#12 0x00005555557743dd in main (argc=25, argv=0x7fffffffe0c8, envp=0x7fffffffe198) at vl.c:4191

refresh_total_sectors() will fail if we return -ENOTCONN.

Thanks
Wen Congyang

> 
> Also, while you did write that one shouldn't be using the NBD client as the first quorum child, I think we should try to support that case anyway. For this patch, that means accepting that bdrv_getlength(s->bs[0]) may be off.
> 
> Max
> .
>
Max Reitz March 18, 2015, 12:57 p.m. UTC | #4
On 2015-03-18 at 01:29, Wen Congyang wrote:
> On 02/24/2015 04:43 AM, Max Reitz wrote:
>> On 2015-02-11 at 22:07, Wen Congyang wrote:
>>> We connect to NBD server when starting block replication, so
>>> the length is 0 before starting block replication.
>>>
>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>> ---
>>>    block/quorum.c | 5 +++++
>>>    1 file changed, 5 insertions(+)
>>>
>>> diff --git a/block/quorum.c b/block/quorum.c
>>> index 5ed1ff8..e6aff5f 100644
>>> --- a/block/quorum.c
>>> +++ b/block/quorum.c
>>> @@ -734,6 +734,11 @@ static int64_t quorum_getlength(BlockDriverState *bs)
>>>            if (value < 0) {
>>>                return value;
>>>            }
>>> +
>>> +        if (!value) {
>>> +            continue;
>>> +        }
>>> +
>>>            if (value != result) {
>>>                return -EIO;
>>>            }
>> Hm, what do you think about some specific error value returned by your delayed NBD implementation? Like -ENOTCONN or something like that? Then we'd be able to discern a real 0-length block device from a not-yet-connected NBD server.
> In my newest test, it cannot return -ENOTCONN, otherwise bdrv_open() will fail when we open the child.
> Here is the backtrace:
> (gdb) bt
> #0  bdrv_open_common (bs=0x5555563b2230, file=0x0, options=0x5555563b55a0, flags=57410, drv=0x555555e90c00, errp=0x7fffffffd460) at block.c:1070
> #1  0x000055555595ea72 in bdrv_open (pbs=0x7fffffffd5f8, filename=0x0, reference=0x0, options=0x5555563b55a0, flags=57410, drv=0x555555e90c00, errp=0x7fffffffd5e0) at block.c:1677
> #2  0x000055555595e3a9 in bdrv_open_image (pbs=0x7fffffffd5f8, filename=0x0, options=0x5555563a6730, bdref_key=0x555555a86b4c "file", flags=57410, allow_none=true, errp=0x7fffffffd5e0) at block.c:1481
> #3  0x000055555595e9ae in bdrv_open (pbs=0x555556388008, filename=0x0, reference=0x0, options=0x5555563a6730, flags=8258, drv=0x555555e8b800, errp=0x7fffffffd6b8) at block.c:1655
> #4  0x00005555559b0058 in quorum_open (bs=0x55555639bd90, options=0x55555639f100, flags=8258, errp=0x7fffffffd758) at block/quorum.c:1000
> #5  0x000055555595d0b8 in bdrv_open_common (bs=0x55555639bd90, file=0x0, options=0x55555639f100, flags=8258, drv=0x555555e8e5c0, errp=0x7fffffffd840) at block.c:1045
> #6  0x000055555595ea72 in bdrv_open (pbs=0x55555639bd50, filename=0x0, reference=0x0, options=0x55555639f100, flags=8258, drv=0x555555e8e5c0, errp=0x7fffffffdb70) at block.c:1677
> #7  0x00005555559b3bd3 in blk_new_open (name=0x55555639bc60 "virtio0", filename=0x0, reference=0x0, options=0x55555639a420, flags=66, errp=0x7fffffffdb70) at block/block-backend.c:129
> #8  0x0000555555754f78 in blockdev_init (file=0x0, bs_opts=0x55555639a420, errp=0x7fffffffdb70) at blockdev.c:536
> #9  0x0000555555755d90 in drive_new (all_opts=0x5555563777e0, block_default_type=IF_IDE) at blockdev.c:971
> #10 0x000055555576b1f0 in drive_init_func (opts=0x5555563777e0, opaque=0x555556372b48) at vl.c:1104
> #11 0x0000555555a1b019 in qemu_opts_foreach (list=0x555555e44060, func=0x55555576b1ba <drive_init_func>, opaque=0x555556372b48, abort_on_failure=1) at util/qemu-option.c:1059
> #12 0x00005555557743dd in main (argc=25, argv=0x7fffffffe0c8, envp=0x7fffffffe198) at vl.c:4191
>
> refresh_total_sectors() will fail if we return -ENOTCONN.

Okay, then 0 will be fine, too.

Max
diff mbox

Patch

diff --git a/block/quorum.c b/block/quorum.c
index 5ed1ff8..e6aff5f 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -734,6 +734,11 @@  static int64_t quorum_getlength(BlockDriverState *bs)
         if (value < 0) {
             return value;
         }
+
+        if (!value) {
+            continue;
+        }
+
         if (value != result) {
             return -EIO;
         }