Message ID | 1374054136-28741-11-git-send-email-famz@redhat.com |
---|---|
State | New |
Headers | show |
Il 17/07/2013 11:42, Fam Zheng ha scritto: > This option allows overriding backing hd of drive. If the target drive > exists, it's referenced as the backing file and refcount incremented. > > Example: > qemu-system-x86_64 -drive \ > file.filename=foo.qcow2,if=none,id=foo \ > -drive file=bar.qcow2,backing=foo I guess this is where we need the soft reference. This has a _lot_ of potential for misuse, I think Kevin bashed me and Federico very heavily when we tried to do something similar. block/backup.c is the right place where we can override the backing hd of the drive. Perhaps we can add a way to open a file with BDRV_O_NO_BACKING from the command line. Paolo > Signed-off-by: Fam Zheng <famz@redhat.com> > --- > block.c | 30 ++++++++++++++++++++++++------ > 1 file changed, 24 insertions(+), 6 deletions(-) > > diff --git a/block.c b/block.c > index 147a448..31fab07 100644 > --- a/block.c > +++ b/block.c > @@ -1083,12 +1083,30 @@ int bdrv_open(BlockDriverState *bs, const char *filename, QDict *options, > > /* If there is a backing file, use it */ > if ((flags & BDRV_O_NO_BACKING) == 0) { > - QDict *backing_options; > - > - extract_subqdict(options, &backing_options, "backing."); > - ret = bdrv_open_backing_file(bs, backing_options); > - if (ret < 0) { > - goto close_and_fail; > + const char *backing_drive; > + backing_drive = qdict_get_try_str(options, "backing"); > + if (backing_drive) { > + bs->backing_hd = bdrv_find(backing_drive); > + if (bs->backing_hd) { > + bdrv_ref(bs->backing_hd, false); > + pstrcpy(bs->backing_file, sizeof(bs->backing_file), > + bs->backing_hd->filename); > + pstrcpy(bs->backing_format, sizeof(bs->backing_format), > + bs->backing_hd->drv->format_name); > + } else { > + qerror_report(ERROR_CLASS_DEVICE_NOT_FOUND, > + "Can't find drive with name %s\n", backing_drive); > + ret = -EINVAL; > + goto close_and_fail; > + } > + qdict_del(options, "backing"); > + } else { > + QDict *backing_options; > + extract_subqdict(options, &backing_options, "backing."); > + ret = bdrv_open_backing_file(bs, backing_options); > + if (ret < 0) { > + goto close_and_fail; > + } > } > } > >
Am 17.07.2013 um 14:36 hat Paolo Bonzini geschrieben: > Il 17/07/2013 11:42, Fam Zheng ha scritto: > > This option allows overriding backing hd of drive. If the target drive > > exists, it's referenced as the backing file and refcount incremented. > > > > Example: > > qemu-system-x86_64 -drive \ > > file.filename=foo.qcow2,if=none,id=foo \ > > -drive file=bar.qcow2,backing=foo > > I guess this is where we need the soft reference. > > This has a _lot_ of potential for misuse, I think Kevin bashed me and > Federico very heavily when we tried to do something similar. Not sure what exactly I "bashed" you for, but I think this is really the right thing to do, and it's the general direction we're headed for with blockdev-add. The user manages his BlockDriverStates and connects them as he sees fit. The defaults of qemu are only used when the user doesn't override them (and libvirt is going to override most to use fd passing). My expectation is, admittedly, that it's hard to get right from where we stand today, because of the coupling of BlockDriverStates with guest devices, but with the refcounting patches (which I haven't reviewed yet), maybe one of the biggest concerns is addressed. This is basically restarting the discussion where I suggested to give the targets of a block job names so that they can be reused. It's about the same kind of misuse that becomes possible and that we need to protect against. Kevin
Il 17/07/2013 14:58, Kevin Wolf ha scritto: > Am 17.07.2013 um 14:36 hat Paolo Bonzini geschrieben: >> Il 17/07/2013 11:42, Fam Zheng ha scritto: >>> This option allows overriding backing hd of drive. If the target drive >>> exists, it's referenced as the backing file and refcount incremented. >>> >>> Example: >>> qemu-system-x86_64 -drive \ >>> file.filename=foo.qcow2,if=none,id=foo \ >>> -drive file=bar.qcow2,backing=foo >> >> I guess this is where we need the soft reference. >> >> This has a _lot_ of potential for misuse, I think Kevin bashed me and >> Federico very heavily when we tried to do something similar. > > Not sure what exactly I "bashed" you for Doing strange things with bs->backing_hd (blkmirror comes to mind). > This is basically restarting the discussion where I suggested to give > the targets of a block job names so that they can be reused. It's about > the same kind of misuse that becomes possible and that we need to > protect against. Yes. But then I'm not sure why we need to rush in blockdev-backup now. Instead we can simply make drive-backup optionally give a name to the target. I understand this is the right thing to do long term, but pre-opening of the target is not really needed for fleecing. Paolo
Am 17.07.2013 um 15:13 hat Paolo Bonzini geschrieben: > Il 17/07/2013 14:58, Kevin Wolf ha scritto: > > Am 17.07.2013 um 14:36 hat Paolo Bonzini geschrieben: > >> Il 17/07/2013 11:42, Fam Zheng ha scritto: > >>> This option allows overriding backing hd of drive. If the target drive > >>> exists, it's referenced as the backing file and refcount incremented. > >>> > >>> Example: > >>> qemu-system-x86_64 -drive \ > >>> file.filename=foo.qcow2,if=none,id=foo \ > >>> -drive file=bar.qcow2,backing=foo > >> > >> I guess this is where we need the soft reference. > >> > >> This has a _lot_ of potential for misuse, I think Kevin bashed me and > >> Federico very heavily when we tried to do something similar. > > > > Not sure what exactly I "bashed" you for > > Doing strange things with bs->backing_hd (blkmirror comes to mind). > > > This is basically restarting the discussion where I suggested to give > > the targets of a block job names so that they can be reused. It's about > > the same kind of misuse that becomes possible and that we need to > > protect against. > > Yes. But then I'm not sure why we need to rush in blockdev-backup now. > Instead we can simply make drive-backup optionally give a name to the > target. > > I understand this is the right thing to do long term, but pre-opening of > the target is not really needed for fleecing. So for how much longer should we plan to procrastinate? (I know, not an entirely fair question, but we have to make the step at some point) I guess we can give a name to the target, and we can make drive-backup automatically connect the target with the original as its backing file (still needs the refcounting, by the way). But is giving a name to the target not enough to allow "interesting" things to be done? I don't remember the details from the mirroring discussion, but it seems it were enough that you didn't want to do it. And we'll want to reference existing BDSes as backing/protocol files in blockdev-add soon anyway, so if we decide against it here, it's just moving from Fam's to-do list to mine... So no, I'm not totally comfortable with allowing it, but not allowing it isn't really an option either. Kevin
Il 17/07/2013 15:48, Kevin Wolf ha scritto: >> I understand this is the right thing to do long term, but pre-opening of >> the target is not really needed for fleecing. > > So for how much longer should we plan to procrastinate? (I know, not an > entirely fair question, but we have to make the step at some point) If we bring it up during soft freeze, we will procrastinate it a lot. :) If we bring it up at the beginning of a release cycle, the wait could be as short as 1 month... > I guess we can give a name to the target, and we can make drive-backup > automatically connect the target with the original as its backing file > (still needs the refcounting, by the way). No, it doesn't need the refcounting (see my reply to the cover letter). In his next submission of drive-backup sync modes, Ian is already going to handle the automatic connection of the target with the original. > But is giving a name to the > target not enough to allow "interesting" things to be done? I don't > remember the details from the mirroring discussion, but it seems it were > enough that you didn't want to do it. Yes, but this time we have to bite the bullet on that one at least, because we have no other choice (we want to do at least one "interesting" thing, namely connect to it with the NBD server). > And we'll want to reference existing BDSes as backing/protocol files in > blockdev-add soon anyway, so if we decide against it here, it's just > moving from Fam's to-do list to mine... We can reconsider these very patches in 1 month. It's just the timing, combined with the fact that this is not necessary for fleecing, that I'm uncomfortable with. Paolo > So no, I'm not totally comfortable with allowing it, but not allowing it > isn't really an option either. > > Kevin >
Am 17.07.2013 um 16:16 hat Paolo Bonzini geschrieben: > Il 17/07/2013 15:48, Kevin Wolf ha scritto: > >> I understand this is the right thing to do long term, but pre-opening of > >> the target is not really needed for fleecing. > > > > So for how much longer should we plan to procrastinate? (I know, not an > > entirely fair question, but we have to make the step at some point) > > If we bring it up during soft freeze, we will procrastinate it a lot. :) > If we bring it up at the beginning of a release cycle, the wait could > be as short as 1 month... > > > And we'll want to reference existing BDSes as backing/protocol files in > > blockdev-add soon anyway, so if we decide against it here, it's just > > moving from Fam's to-do list to mine... > > We can reconsider these very patches in 1 month. It's just the timing, > combined with the fact that this is not necessary for fleecing, that I'm > uncomfortable with. Okay, I see where this is going. Let me reinforce one fundamental policy that you may not like. I've rejected patches based on it before and I'll likely have to do it again in the future. This is it: I'm not going to compromise on upstream APIs to accommodate downstream schedules. If doing things properly means moving fleecing to 1.7, so be it. There's no pressure to have it in upstream 1.6. It might mean downstream patches for some distro, but that's not a valid point in upstream design discussions. Now is the right time to discuss design changes for 1.7, so that as soon as block-next starts to come to life (i.e. beginning of August), we can start with implementing this at the earliest possible point in the 1.7 release cycle. > > I guess we can give a name to the target, and we can make drive-backup > > automatically connect the target with the original as its backing file > > (still needs the refcounting, by the way). > > No, it doesn't need the refcounting (see my reply to the cover letter). > In his next submission of drive-backup sync modes, Ian is already going > to handle the automatic connection of the target with the original. Okay, I'll have a look, but I can't imagine how it work without refcounting. As soon as it has a name, you can attach the target to a guest, nbd server, start block jobs and do all kinds of fun with it so that taking it away when the source goes away becomes problematic. But I'll have a look at the patches and hope that the commit messages explain the details. > > But is giving a name to the > > target not enough to allow "interesting" things to be done? I don't > > remember the details from the mirroring discussion, but it seems it were > > enough that you didn't want to do it. > > Yes, but this time we have to bite the bullet on that one at least, > because we have no other choice (we want to do at least one > "interesting" thing, namely connect to it with the NBD server). Yes, like I said, we might not feel comfortable with enabling these cases, but not enabling them isn't an option either. So now is the time to do the real thing. Kevin
Il 17/07/2013 17:09, Kevin Wolf ha scritto: > Am 17.07.2013 um 16:16 hat Paolo Bonzini geschrieben: >> Il 17/07/2013 15:48, Kevin Wolf ha scritto: >>>> I understand this is the right thing to do long term, but pre-opening of >>>> the target is not really needed for fleecing. >>> >>> So for how much longer should we plan to procrastinate? (I know, not an >>> entirely fair question, but we have to make the step at some point) >> >> If we bring it up during soft freeze, we will procrastinate it a lot. :) >> If we bring it up at the beginning of a release cycle, the wait could >> be as short as 1 month... >> >>> And we'll want to reference existing BDSes as backing/protocol files in >>> blockdev-add soon anyway, so if we decide against it here, it's just >>> moving from Fam's to-do list to mine... >> >> We can reconsider these very patches in 1 month. It's just the timing, >> combined with the fact that this is not necessary for fleecing, that I'm >> uncomfortable with. > > Okay, I see where this is going. Let me reinforce one fundamental policy > that you may not like. No, I like it actually. In no way I want to force fleecing in 1.6. And this is by no means about downstream schedules, I don't care about them (not even sure which downstream you're referring to :)). However, this API seems (a) overengineered for the purposes of fleecing; (b) ignoring what Ian is doing to support sync modes in block/backup.c. >>> I guess we can give a name to the target, and we can make drive-backup >>> automatically connect the target with the original as its backing file >>> (still needs the refcounting, by the way). >> >> No, it doesn't need the refcounting (see my reply to the cover letter). >> In his next submission of drive-backup sync modes, Ian is already going >> to handle the automatic connection of the target with the original. > > Okay, I'll have a look, but I can't imagine how it work without > refcounting. As soon as it has a name, you can attach the target to a > guest, nbd server, start block jobs and do all kinds of fun with it so > that taking it away when the source goes away becomes problematic. - Attach the target to a guest: yes, though I wouldn't do it because the backup job can bdrv_close it under the guest's feet. Deletion would be protected by DriveInfo's refcount. - NBD server: handles closing just fine - start block jobs: job will be cancelled as soon as the backup job bdrv_close's the target - all kinds of fun with it so that taking it away when the source goes away becomes problematic: taking away the source takes away the target too, so it is in the same ballpark as attaching the target to a guest (i.e. be prepared to have I/O that starts to fail, and remember to use drive_get/put_ref). >> Yes, but this time we have to bite the bullet on that one at least, >> because we have no other choice (we want to do at least one >> "interesting" thing, namely connect to it with the NBD server). > > Yes, like I said, we might not feel comfortable with enabling these > cases, but not enabling them isn't an option either. So now is the time > to do the real thing. I agree and understand. At the same time, we already have an (ugly) mechanism for "soft" reference counts, namely DriveInfo. So I don't care about the order between "move refcnt from DriveInfo to BDS" and "support fleecing with the backup job", but they should be two completely separate series. Also, the right API shouldn't be influenced by which part goes in first. Paolo
On Wed, 07/17 14:36, Paolo Bonzini wrote: > Il 17/07/2013 11:42, Fam Zheng ha scritto: > > This option allows overriding backing hd of drive. If the target drive > > exists, it's referenced as the backing file and refcount incremented. > > > > Example: > > qemu-system-x86_64 -drive \ > > file.filename=foo.qcow2,if=none,id=foo \ > > -drive file=bar.qcow2,backing=foo > > I guess this is where we need the soft reference. > > This has a _lot_ of potential for misuse, I think Kevin bashed me and > Federico very heavily when we tried to do something similar. > > block/backup.c is the right place where we can override the backing hd > of the drive. Perhaps we can add a way to open a file with > BDRV_O_NO_BACKING from the command line. OK. If we get the override in block/backup.c, the only thing we need is naming the target so we can add it to nbd server. If refcounting, overriding backing with option and blockdev-backup are still good to have, I can split them and targeting to future releases.
On Mon, Jul 22, 2013 at 02:07:15PM +0800, Fam Zheng wrote: > On Wed, 07/17 14:36, Paolo Bonzini wrote: > > Il 17/07/2013 11:42, Fam Zheng ha scritto: > > > This option allows overriding backing hd of drive. If the target drive > > > exists, it's referenced as the backing file and refcount incremented. > > > > > > Example: > > > qemu-system-x86_64 -drive \ > > > file.filename=foo.qcow2,if=none,id=foo \ > > > -drive file=bar.qcow2,backing=foo > > > > I guess this is where we need the soft reference. > > > > This has a _lot_ of potential for misuse, I think Kevin bashed me and > > Federico very heavily when we tried to do something similar. > > > > block/backup.c is the right place where we can override the backing hd > > of the drive. Perhaps we can add a way to open a file with > > BDRV_O_NO_BACKING from the command line. > > OK. If we get the override in block/backup.c, the only thing we need is > naming the target so we can add it to nbd server. If refcounting, > overriding backing with option and blockdev-backup are still good to > have, I can split them and targeting to future releases. In my latest patchset I have attempted to override the backing hd of the target. It is a little over my head however so any input there would be appreciated. It seems to me like applying that patch would get us a working baseline and then you can add the more fancy stuff on top of that? That sound reasonable? Ian
On Wed, Jul 17, 2013 at 05:09:05PM +0200, Kevin Wolf wrote: > Am 17.07.2013 um 16:16 hat Paolo Bonzini geschrieben: > > Il 17/07/2013 15:48, Kevin Wolf ha scritto: > > >> I understand this is the right thing to do long term, but pre-opening of > > >> the target is not really needed for fleecing. > > > > > > So for how much longer should we plan to procrastinate? (I know, not an > > > entirely fair question, but we have to make the step at some point) > > > > If we bring it up during soft freeze, we will procrastinate it a lot. :) > > If we bring it up at the beginning of a release cycle, the wait could > > be as short as 1 month... > > > > > And we'll want to reference existing BDSes as backing/protocol files in > > > blockdev-add soon anyway, so if we decide against it here, it's just > > > moving from Fam's to-do list to mine... > > > > We can reconsider these very patches in 1 month. It's just the timing, > > combined with the fact that this is not necessary for fleecing, that I'm > > uncomfortable with. > > Okay, I see where this is going. Let me reinforce one fundamental policy > that you may not like. I've rejected patches based on it before and I'll > likely have to do it again in the future. This is it: > > I'm not going to compromise on upstream APIs to accommodate > downstream schedules. > > If doing things properly means moving fleecing to 1.7, so be it. There's > no pressure to have it in upstream 1.6. It might mean downstream patches > for some distro, but that's not a valid point in upstream design > discussions. I'll just throw out there that *I'm* feeling pressure for 1.6, but at the same time I understand your concerns. Some feel that independent of distro concerns this is a feature gap for KVM/qemu. It seems to me a practical approach of taking what is needed to get this working while also putting out patches for doing it right would work out well in the end. Anyway, just my 2c. I am just helping out and I know you guys have to deal with the long term effects so I respect your decisions. Ian > Now is the right time to discuss design changes for 1.7, so that as soon > as block-next starts to come to life (i.e. beginning of August), we can > start with implementing this at the earliest possible point in the 1.7 > release cycle. > > > > I guess we can give a name to the target, and we can make drive-backup > > > automatically connect the target with the original as its backing file > > > (still needs the refcounting, by the way). > > > > No, it doesn't need the refcounting (see my reply to the cover letter). > > In his next submission of drive-backup sync modes, Ian is already going > > to handle the automatic connection of the target with the original. > > Okay, I'll have a look, but I can't imagine how it work without > refcounting. As soon as it has a name, you can attach the target to a > guest, nbd server, start block jobs and do all kinds of fun with it so > that taking it away when the source goes away becomes problematic. But > I'll have a look at the patches and hope that the commit messages > explain the details. > > > > But is giving a name to the > > > target not enough to allow "interesting" things to be done? I don't > > > remember the details from the mirroring discussion, but it seems it were > > > enough that you didn't want to do it. > > > > Yes, but this time we have to bite the bullet on that one at least, > > because we have no other choice (we want to do at least one > > "interesting" thing, namely connect to it with the NBD server). > > Yes, like I said, we might not feel comfortable with enabling these > cases, but not enabling them isn't an option either. So now is the time > to do the real thing. > > Kevin
diff --git a/block.c b/block.c index 147a448..31fab07 100644 --- a/block.c +++ b/block.c @@ -1083,12 +1083,30 @@ int bdrv_open(BlockDriverState *bs, const char *filename, QDict *options, /* If there is a backing file, use it */ if ((flags & BDRV_O_NO_BACKING) == 0) { - QDict *backing_options; - - extract_subqdict(options, &backing_options, "backing."); - ret = bdrv_open_backing_file(bs, backing_options); - if (ret < 0) { - goto close_and_fail; + const char *backing_drive; + backing_drive = qdict_get_try_str(options, "backing"); + if (backing_drive) { + bs->backing_hd = bdrv_find(backing_drive); + if (bs->backing_hd) { + bdrv_ref(bs->backing_hd, false); + pstrcpy(bs->backing_file, sizeof(bs->backing_file), + bs->backing_hd->filename); + pstrcpy(bs->backing_format, sizeof(bs->backing_format), + bs->backing_hd->drv->format_name); + } else { + qerror_report(ERROR_CLASS_DEVICE_NOT_FOUND, + "Can't find drive with name %s\n", backing_drive); + ret = -EINVAL; + goto close_and_fail; + } + qdict_del(options, "backing"); + } else { + QDict *backing_options; + extract_subqdict(options, &backing_options, "backing."); + ret = bdrv_open_backing_file(bs, backing_options); + if (ret < 0) { + goto close_and_fail; + } } }
This option allows overriding backing hd of drive. If the target drive exists, it's referenced as the backing file and refcount incremented. Example: qemu-system-x86_64 -drive \ file.filename=foo.qcow2,if=none,id=foo \ -drive file=bar.qcow2,backing=foo Signed-off-by: Fam Zheng <famz@redhat.com> --- block.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-)