[v3,00/18] fleecing-hook driver for backup

Message ID 20181001102928.20533-1-vsementsov@virtuozzo.com

Message

Vladimir Sementsov-Ogievskiy Oct. 1, 2018, 10:29 a.m. UTC
v2 was "[RFC v2] new, node-graph-based fleecing and backup"

Hi all!

This series introduces the fleecing-hook driver. It is a filter node
that performs the copy-before-write operation. Mirror already uses a
filter node for handling guest writes; let's move backup from
write-notifiers to a filter node too (patch 18).

The proposed filter driver is complete and self-contained: it can be
used standalone as a fleecing provider instead of backup(sync=none)
(old-style fleecing based on backup(sync=none) is still supported);
see patch 16.
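
For illustration, a rough iotests-style sketch of the standalone fleecing
usage could look like the following. The fleecing-hook option names
('file', 'target') and the x-drop-fleecing parameter are my guesses here,
not necessarily the exact interface of the patches:

  import iotests

  disk = iotests.file_path('disk.qcow2')
  temp = iotests.file_path('temp.qcow2')
  sock = iotests.file_path('nbd.sock')
  iotests.qemu_img('create', '-f', 'qcow2', disk, '64M')
  iotests.qemu_img('create', '-f', 'qcow2', temp, '64M')

  vm = iotests.VM().add_drive(disk, 'node-name=source', format='qcow2')
  vm.launch()

  # temporary qcow2 node that receives copy-before-write data
  vm.qmp('blockdev-add', node_name='temp', driver='qcow2',
         file={'driver': 'file', 'filename': temp})

  # insert the fleecing-hook filter above the active disk
  # (child option names are assumed, see patches 13-16 for the real interface)
  vm.qmp('blockdev-add', node_name='fl-hook', driver='fleecing-hook',
         file='source', target='temp')

  # export the temp node, so an external client can read a point-in-time
  # image of the disk while the guest keeps writing
  vm.qmp('nbd-server-start', addr={'type': 'unix', 'data': {'path': sock}})
  vm.qmp('nbd-server-add', device='temp')

  # ... external backup client reads from the NBD export ...

  vm.qmp('nbd-server-stop')
  vm.qmp('x-drop-fleecing', node_name='fl-hook')   # parameter name assumed
  vm.shutdown()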

There are a lot of other ideas and improvements that can be built on top
of this series and that were discussed in the v2 thread; to begin with, I
want to concentrate on the following:

Done in this series:
 1. use filter node instead of write notifiers in backup
 2. filter-based fleecing scheme, without a job

Nearly done (a test and maybe a tiny adjustment are needed):
 3. a push-backup-with-fleecing scheme, so that the guest is not
    disturbed by long handling of its writes (when the target is a
    distant remote NBD server): just start fleecing to a local qcow2
    temp node and start a backup job (or mirror, why not?) from the
    temp node to the remote target. This series makes it possible to
    share a dirty bitmap between the fleecing-hook and the backup job
    to build efficient scenarios. (A rough sketch of this setup is
    given right below, after the list.)
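
A rough sketch of that scheme (QMP via an iotests-style VM object; the
fleecing-hook options, the remote-target definition and the way the
copy-bitmap would be shared are assumptions, not the final interface):

  # 1. local qcow2 temp node that receives copy-before-write data
  vm.qmp('blockdev-add', node_name='temp', driver='qcow2',
         file={'driver': 'file', 'filename': 'temp.qcow2'})

  # 2. fleecing-hook filter over the active disk; it copies old data to
  #    'temp' before each guest write and marks copied areas in a named
  #    copy-bitmap (patches 5-7), which a backup job could reuse
  vm.qmp('blockdev-add', node_name='fl-hook', driver='fleecing-hook',
         file='source', target='temp')

  # 3. remote target, e.g. an NBD export on the backup host
  vm.qmp('blockdev-add', node_name='remote', driver='nbd',
         server={'type': 'inet', 'host': 'backup-host', 'port': '10809'},
         export='backup0')

  # 4. push the point-in-time image from the local temp node to the
  #    remote target; guest writes only wait for the local copy to 'temp'
  vm.qmp('blockdev-backup', job_id='push', device='temp',
         target='remote', sync='full')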

This series is based on
 [PATCH v4 0/8] dirty-bitmap: rewrite bdrv_dirty_iter_next_area
and
 [PATCH 0/2] replication: drop extra sync

Based-on: <20180919124343.28206-1-vsementsov@virtuozzo.com>
Based-on: <20180917145732.48590-1-vsementsov@virtuozzo.com>

Vladimir Sementsov-Ogievskiy (18):
  block/dirty-bitmap: allow set/reset bits in disabled bitmaps
  block/io: allow BDRV_REQ_SERIALISING for read
  block/backup: simplify backup_incremental_init_copy_bitmap
  block/backup: move from HBitmap to BdrvDirtyBitmap
  util/id: add block-bitmap subsystem
  block/backup: give a name to copy-bitmap
  block/backup: allow use existent copy-bitmap
  block: allow serialized reads to intersect
  block: improve should_update_child
  iotests: handle -f argument correctly for qemu_io_silent
  iotests: allow resume_drive by node name
  iotests: prepare 055 to graph changes during backup job
  block: introduce new filter driver: fleecing-hook
  block/fleecing-hook: internal api
  qapi: add x-drop-fleecing qmp command
  iotests: test new fleecing-hook driver in context of 222 iotest
  block/backup: tiny refactor backup_job_create
  block/backup: use fleecing-hook instead of write notifiers

 qapi/block-core.json          |  57 +++++-
 include/block/block.h         |   9 +
 include/block/block_int.h     |   2 +-
 include/qemu/id.h             |   1 +
 block.c                       |  32 +++-
 block/backup.c                | 320 ++++++++++++++-----------------
 block/dirty-bitmap.c          |   2 -
 block/fleecing-hook.c         | 349 ++++++++++++++++++++++++++++++++++
 block/io.c                    |  16 +-
 block/replication.c           |   2 +-
 blockdev.c                    |  49 ++++-
 util/id.c                     |   1 +
 block/Makefile.objs           |   2 +
 tests/qemu-iotests/055        |  23 ++-
 tests/qemu-iotests/222        |  59 ++++--
 tests/qemu-iotests/222.out    |  66 +++++++
 tests/qemu-iotests/iotests.py |  16 +-
 17 files changed, 778 insertions(+), 228 deletions(-)
 create mode 100644 block/fleecing-hook.c

Comments

Eric Blake Oct. 2, 2018, 8:19 p.m. UTC | #1
On 10/1/18 5:29 AM, Vladimir Sementsov-Ogievskiy wrote:
> v2 was "[RFC v2] new, node-graph-based fleecing and backup"
> 
> Hi all!
> 
> This series introduces the fleecing-hook driver. It is a filter node
> that performs the copy-before-write operation. Mirror already uses a
> filter node for handling guest writes; let's move backup from
> write-notifiers to a filter node too (patch 18).
> 
> The proposed filter driver is complete and self-contained: it can be
> used standalone as a fleecing provider instead of backup(sync=none)
> (old-style fleecing based on backup(sync=none) is still supported);
> see patch 16.

I haven't had time to look at this series in any sort of depth yet, but 
it reminds me of a question I just ran into with my libvirt code:

What happens if we want to have two parallel clients both reading off 
different backup/fleece nodes at once?  Right now, 'nbd-server-start' is 
hard-coded to at most one NBD server, and 'nbd-server-add' is hardcoded 
to adding an export to the one-and-only NBD server.  But it would be a 
lot nicer if you could pick different ports for different clients (or 
even mix TCP and Unix sockets), so that independent backup jobs can both 
operate in parallel via different NBD servers both under control of the 
same qemu process, instead of the second client having to wait for the 
first client to disconnect so that the first NBD server can stop.  In 
the meantime, you can be somewhat careful by controlling which export 
names are exposed over NBD, but even with nbd-server-start using 
"tls-creds", all clients can see one another's exports via NBD_OPT_LIST, 
and you are relying on the clients being well-behaved, vs. the nicer 
ability to spawn multiple NBD servers, then control which exports are 
exposed over which servers, and where distinct servers could even have 
different tls-creds.

To get to that point, we'd need to enhance nbd-server-start to return a 
server id, and allow nbd-server-add and friends to take an optional 
parameter of a server id (for back-compat, if the server id is not 
provided, it operates on the first one).
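
In other words, something along these lines (entirely hypothetical; none
of this exists today, and the names are made up for illustration):

  ret = vm.qmp('nbd-server-start',
               addr={'type': 'unix', 'data': {'path': '/tmp/backup-a.sock'}})
  server_a = ret['return']['server-id']          # new: id returned by qemu

  ret = vm.qmp('nbd-server-start',
               addr={'type': 'inet',
                     'data': {'host': '0.0.0.0', 'port': '10810'}},
               tls_creds='creds1')
  server_b = ret['return']['server-id']

  # new optional argument selects which server gets the export; when it
  # is omitted, the first server is used, for backwards compatibility
  vm.qmp('nbd-server-add', device='fleece-a', server_id=server_a)
  vm.qmp('nbd-server-add', device='fleece-b', server_id=server_b)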
Vladimir Sementsov-Ogievskiy Oct. 3, 2018, 9:55 a.m. UTC | #2
02.10.2018 23:19, Eric Blake wrote:
> On 10/1/18 5:29 AM, Vladimir Sementsov-Ogievskiy wrote:
>> v2 was "[RFC v2] new, node-graph-based fleecing and backup"
>>
>> Hi all!
>>
>> This series introduces the fleecing-hook driver. It is a filter node
>> that performs the copy-before-write operation. Mirror already uses a
>> filter node for handling guest writes; let's move backup from
>> write-notifiers to a filter node too (patch 18).
>>
>> The proposed filter driver is complete and self-contained: it can be
>> used standalone as a fleecing provider instead of backup(sync=none)
>> (old-style fleecing based on backup(sync=none) is still supported);
>> see patch 16.
>
> I haven't had time to look at this series in any sort of depth yet, 
> but it reminds me of a question I just ran into with my libvirt code:
>
> What happens if we want to have two parallel clients both reading off 
> different backup/fleece nodes at once?  Right now, 'nbd-server-start' 
> is hard-coded to at most one NBD server, and 'nbd-server-add' is 
> hardcoded to adding an export to the one-and-only NBD server.  But it 
> would be a lot nicer if you could pick different ports for different 
> clients (or even mix TCP and Unix sockets), so that independent backup 
> jobs can both operate in parallel via different NBD servers both under 
> control of the same qemu process, instead of the second client having 
> to wait for the first client to disconnect so that the first NBD 
> server can stop. In the meantime, you can be somewhat careful by 
> controlling which export names are exposed over NBD, but even with 
> nbd-server-start using "tls-creds", all clients can see one another's 
> exports via NBD_OPT_LIST, and you are relying on the clients being 
> well-behaved, vs. the nicer ability to spawn multiple NBD servers, 
> then control which exports are exposed over which servers, and where 
> distinct servers could even have different tls-creds.
>
> To get to that point, we'd need to enhance nbd-server-start to return 
> a server id, and allow nbd-server-add and friends to take an optional 
> parameter of a server id (for back-compat, if the server id is not 
> provided, it operates on the first one).
>

Sounds good.
I don't see any problems from the block layer side: if we want to export
the same fleecing node through several different servers, they can all
share that one node.

However, with the new approach we could even set up several fleecing
nodes for one active disk. Any benefits? For example, we could start a
second external backup of the same disk while the first one is still in
progress. I'm not sure that is a real use case, though.
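
E.g. (reusing the hypothetical per-server id from your example above;
none of this exists yet), both servers could simply export the same
fleecing node:

  vm.qmp('nbd-server-add', device='temp', name='backup-a', server_id=server_a)
  vm.qmp('nbd-server-add', device='temp', name='backup-b', server_id=server_b)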
Vladimir Sementsov-Ogievskiy Oct. 3, 2018, 3:36 p.m. UTC | #3
02.10.2018 23:19, Eric Blake wrote:
> On 10/1/18 5:29 AM, Vladimir Sementsov-Ogievskiy wrote:
>> v2 was "[RFC v2] new, node-graph-based fleecing and backup"
>>
>> Hi all!
>>
>> This series introduces the fleecing-hook driver. It is a filter node
>> that performs the copy-before-write operation. Mirror already uses a
>> filter node for handling guest writes; let's move backup from
>> write-notifiers to a filter node too (patch 18).
>>
>> The proposed filter driver is complete and self-contained: it can be
>> used standalone as a fleecing provider instead of backup(sync=none)
>> (old-style fleecing based on backup(sync=none) is still supported);
>> see patch 16.
>
> I haven't had time to look at this series in any sort of depth yet, 
> but it reminds me of a question I just ran into with my libvirt code:
>
> What happens if we want to have two parallel clients both reading off 
> different backup/fleece nodes at once?  Right now, 'nbd-server-start' 
> is hard-coded to at most one NBD server, and 'nbd-server-add' is 
> hardcoded to adding an export to the one-and-only NBD server.  But it 
> would be a lot nicer if you could pick different ports for different 
> clients (or even mix TCP and Unix sockets), so that independent backup 
> jobs can both operate in parallel via different NBD servers both under 
> control of the same qemu process, instead of the second client having 
> to wait for the first client to disconnect so that the first NBD 
> server can stop. In the meantime, you can be somewhat careful by 
> controlling which export names are exposed over NBD, but even with 
> nbd-server-start using "tls-creds", all clients can see one another's 
> exports via NBD_OPT_LIST, and you are relying on the clients being 
> well-behaved, vs. the nicer ability to spawn multiple NBD servers, 
> then control which exports are exposed over which servers, and where 
> distinct servers could even have different tls-creds.
>
> To get to that point, we'd need to enhance nbd-server-start to return 
> a server id, and allow nbd-server-add and friends to take an optional 
> parameter of a server id (for back-compat, if the server id is not 
> provided, it operates on the first one).
>

Hmm, about different ports: funnily enough, the NBD spec directly advises
against different ports for new-style negotiation:

  "A client who wants to use the new style negotiation SHOULD connect on
  the IANA-reserved port for NBD, 10809. The server MAY listen on other
  ports as well, but it SHOULD use the old style handshake on those."

Also, the next sentence is strange:

  "The server SHOULD refuse to allow oldstyle negotiations on the newstyle
  port."

Refuse? Refuse to whom? It is the server that chooses the negotiation
type. Or does this mean the server should refuse to start at all, i.e.
refuse the server admin rather than the NBD client? That sounds strange.
Should it rather say "The server SHOULD NOT use the old style on port
10809"?