Message ID | 20170301081504.23595-1-famz@redhat.com |
---|---|
State | New |
Headers | show |
Am 01.03.2017 um 09:15 hat Fam Zheng geschrieben: > In an ideal world, read-write access to an image is inherently > exclusive, because we cannot guarantee other readers and writers a > consistency view of the whole image at all point. That's what the > current permission system does, and it is okay as long as it is entirely > internal. But that would change with the coming image locking. In > practice, both end users and our test cases use tools like qemu-img and > qemu-io to peek at images while guest is running. > > Relax a bit and accept other writers in this case. > > Signed-off-by: Fam Zheng <famz@redhat.com> Hm. On the one hand I think you're right that people are using things like this, on other hand it's also true that the result might not be consistent and therefore image locking is right about forbidding these actions. I think your patch is too permissive, it allows even launching a second long-running VM on the image, which will definitely see corrupted data sooner or later. Maybe what we can do is allow shared writers for read-only images if CONSISTENT_READ isn't requested. It's still not 100% correct because we can get inconsistent metadata and cause unexpected failure, but this is probably tolerable. We would then have to change the allowed actions to not request this permission. In qemu-img, we have these read-only users: * qemu-img check (without -r): Let's keep this blocked, it will only report lots of leaks and leads to invalid bug reports. I've had my share of them. * qemu-img compare: Not sure if this makes sense with an image that is in active use? * qemu-img convert source: Similarly to qemu-img compare, this doesn't make a whole lot of sense, with one exception: People are using -s to extract internal snapshots while the VM is still running, and this usually works because the snapshot doesn't change. However, I don't want to know what happens if you delete the snapshort from the qemu process while qemu-img convert is running... Doing 'qemu-img snapshot -s ...' with a running VM is more tolerated abuse than a supported feature. * qemu-img info: This one makes perfect sense even with a running VM * qemu-img map: Results are potentially meaningless with concurrent I/O, but there may be cases where it makes sense. * qemu-img snapshot -l: Somewhat similar to qemu-img convert -s, except that it's very quick and doesn't access actual data (could even be BDRV_O_NOIO, I think). Allowing this makes sense, I guess. * qemu-img rebase, safe mode, backing files: If we allowed concurrent writes, it wouldn't be safe. * qemu-img bench: Well... No. You don't need this on an image with an active VM. * qemu-img dd: Same as convert, except that there is no -s. So the list of subcommands where we want to support it is rather short. We can change blk_new_open() to clear CONSISTENT_READ for BDRV_O_NOIO, which could cover 'info' and 'snapshot -l'. That leaves us with qemu-io, 'convert -s' and 'map', all of which can be imagined to be useful even with a running VM, but all of which can also easily produce wrong results in this case. An explicit command line option to disable CONSISTENT_READ might be the right tool here. What do you think? Kevin
On Wed, 03/01 10:49, Kevin Wolf wrote: > Am 01.03.2017 um 09:15 hat Fam Zheng geschrieben: > > In an ideal world, read-write access to an image is inherently > > exclusive, because we cannot guarantee other readers and writers a > > consistency view of the whole image at all point. That's what the > > current permission system does, and it is okay as long as it is entirely > > internal. But that would change with the coming image locking. In > > practice, both end users and our test cases use tools like qemu-img and > > qemu-io to peek at images while guest is running. > > > > Relax a bit and accept other writers in this case. > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > Hm. On the one hand I think you're right that people are using things > like this, on other hand it's also true that the result might not be > consistent and therefore image locking is right about forbidding these > actions. > > I think your patch is too permissive, it allows even launching a second > long-running VM on the image, which will definitely see corrupted data > sooner or later. > > Maybe what we can do is allow shared writers for read-only images if > CONSISTENT_READ isn't requested. It's still not 100% correct because we > can get inconsistent metadata and cause unexpected failure, but this is > probably tolerable. We would then have to change the allowed actions to > not request this permission. > > In qemu-img, we have these read-only users: > > * qemu-img check (without -r): Let's keep this blocked, it will only > report lots of leaks and leads to invalid bug reports. I've had my > share of them. > > * qemu-img compare: Not sure if this makes sense with an image that is > in active use? > > * qemu-img convert source: Similarly to qemu-img compare, this doesn't > make a whole lot of sense, with one exception: People are using -s to > extract internal snapshots while the VM is still running, and this > usually works because the snapshot doesn't change. However, I don't > want to know what happens if you delete the snapshort from the qemu > process while qemu-img convert is running... Doing 'qemu-img snapshot > -s ...' with a running VM is more tolerated abuse than a supported > feature. > > * qemu-img info: This one makes perfect sense even with a running VM > > * qemu-img map: Results are potentially meaningless with concurrent I/O, > but there may be cases where it makes sense. > > * qemu-img snapshot -l: Somewhat similar to qemu-img convert -s, except > that it's very quick and doesn't access actual data (could even be > BDRV_O_NOIO, I think). Allowing this makes sense, I guess. > > * qemu-img rebase, safe mode, backing files: If we allowed concurrent > writes, it wouldn't be safe. > > * qemu-img bench: Well... No. You don't need this on an image with an > active VM. > > * qemu-img dd: Same as convert, except that there is no -s. > > So the list of subcommands where we want to support it is rather short. > We can change blk_new_open() to clear CONSISTENT_READ for BDRV_O_NOIO, > which could cover 'info' and 'snapshot -l'. > > That leaves us with qemu-io, 'convert -s' and 'map', all of which can be > imagined to be useful even with a running VM, but all of which can also > easily produce wrong results in this case. An explicit command line > option to disable CONSISTENT_READ might be the right tool here. I'm not sure about this because: 1) this is intrusive from a user PoV, many scripts and upper layer tools will stop working; 2) CONSISTENT_READ is enforced by qcow2 in its .bdrv_child_perm implementation even if blk_new_open() doesn't ask for it, therefore such an option has to impact the whole graph; 3) this isn't only about asking for "persistent read" perm, but more about granting "write" in shared_perm, so it feels messy. A perhaps more contained way is to add a BDRV_O_RELAXED_LOCK flag and use it in those subcommands, then in the image locking code, the "no other writer" byte is not locked if it's set. This has a simpler semantic and a more manageable scope. Fam
Am 01.03.2017 um 13:39 hat Fam Zheng geschrieben: > On Wed, 03/01 10:49, Kevin Wolf wrote: > > Am 01.03.2017 um 09:15 hat Fam Zheng geschrieben: > > > In an ideal world, read-write access to an image is inherently > > > exclusive, because we cannot guarantee other readers and writers a > > > consistency view of the whole image at all point. That's what the > > > current permission system does, and it is okay as long as it is entirely > > > internal. But that would change with the coming image locking. In > > > practice, both end users and our test cases use tools like qemu-img and > > > qemu-io to peek at images while guest is running. > > > > > > Relax a bit and accept other writers in this case. > > > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > > > Hm. On the one hand I think you're right that people are using things > > like this, on other hand it's also true that the result might not be > > consistent and therefore image locking is right about forbidding these > > actions. > > > > I think your patch is too permissive, it allows even launching a second > > long-running VM on the image, which will definitely see corrupted data > > sooner or later. > > > > Maybe what we can do is allow shared writers for read-only images if > > CONSISTENT_READ isn't requested. It's still not 100% correct because we > > can get inconsistent metadata and cause unexpected failure, but this is > > probably tolerable. We would then have to change the allowed actions to > > not request this permission. > > > > In qemu-img, we have these read-only users: > > [...] > > > > So the list of subcommands where we want to support it is rather short. > > We can change blk_new_open() to clear CONSISTENT_READ for BDRV_O_NOIO, > > which could cover 'info' and 'snapshot -l'. > > > > That leaves us with qemu-io, 'convert -s' and 'map', all of which can be > > imagined to be useful even with a running VM, but all of which can also > > easily produce wrong results in this case. An explicit command line > > option to disable CONSISTENT_READ might be the right tool here. > > I'm not sure about this because: 1) this is intrusive from a user PoV, many > scripts and upper layer tools will stop working; Are you sure? I don't expect user scripts or even proper management tools to use qemu-io on running VMs. I do expect that some users are using 'convert -s' with running VMs despite our recommendation against it. If they are aware that they're doing something that works only in the right circumstances, then breaking it isn't nice. But my gut feeling is that most of them don't know about the implications of accessing a live image. In this case breaking their use case and making them think about whether they want to add something like a --force option sounds like a good thing because they aren't caught by surprise when things go wrong eventually. > 2) CONSISTENT_READ is enforced by qcow2 in its .bdrv_child_perm > implementation even if blk_new_open() doesn't ask for it, therefore > such an option has to impact the whole graph; Yes, the perm |= CONSISTENT_READ statement would become conditional as well so that it isn't set if all parents are read-only and don't need CONSISTENT_READ themselves. This is strictly speaking wrong, but after all the whole point of the change would be to allow things that are strictly speaking wrong. > 3) this isn't only about asking for "persistent read" perm, but more > about granting "write" in shared_perm, so it feels messy. Yes, exceptions make everything messy. > A perhaps more contained way is to add a BDRV_O_RELAXED_LOCK flag and > use it in those subcommands, then in the image locking code, the "no > other writer" byte is not locked if it's set. This has a simpler > semantic and a more manageable scope. No new flags, please (at least none that are used in places deeper than blk_new_open()). I'm also not sure whether applying different rules within qemu and externally is a really good idea. If an external process can open an image read-only, why wouldn't it be possible internally? Kevin
On Wed, 03/01 16:16, Kevin Wolf wrote: > > I'm not sure about this because: 1) this is intrusive from a user PoV, many > > scripts and upper layer tools will stop working; > > Are you sure? I don't expect user scripts or even proper management > tools to use qemu-io on running VMs. I do expect that some users are > using 'convert -s' with running VMs despite our recommendation against > it. > > If they are aware that they're doing something that works only in the > right circumstances, then breaking it isn't nice. But my gut feeling is > that most of them don't know about the implications of accessing a live > image. In this case breaking their use case and making them think about > whether they want to add something like a --force option sounds like a > good thing because they aren't caught by surprise when things go wrong > eventually. Yes, the use case is poor for qcow2, and your points stand there. But image locking will be at the posix level, which has a wider range of users. I cannot draw the same conclusion on raw images as easily.
Am 01.03.2017 um 17:10 hat Fam Zheng geschrieben: > On Wed, 03/01 16:16, Kevin Wolf wrote: > > > I'm not sure about this because: 1) this is intrusive from a user PoV, many > > > scripts and upper layer tools will stop working; > > > > Are you sure? I don't expect user scripts or even proper management > > tools to use qemu-io on running VMs. I do expect that some users are > > using 'convert -s' with running VMs despite our recommendation against > > it. > > > > If they are aware that they're doing something that works only in the > > right circumstances, then breaking it isn't nice. But my gut feeling is > > that most of them don't know about the implications of accessing a live > > image. In this case breaking their use case and making them think about > > whether they want to add something like a --force option sounds like a > > good thing because they aren't caught by surprise when things go wrong > > eventually. > > Yes, the use case is poor for qcow2, and your points stand there. But image > locking will be at the posix level, which has a wider range of users. I cannot > draw the same conclusion on raw images as easily. Well, with raw, I'm even less concerned about breaking the commands related to internal snapshots. :-) Kevin
On Wed, 03/01 17:22, Kevin Wolf wrote: > Am 01.03.2017 um 17:10 hat Fam Zheng geschrieben: > > On Wed, 03/01 16:16, Kevin Wolf wrote: > > > > I'm not sure about this because: 1) this is intrusive from a user PoV, many > > > > scripts and upper layer tools will stop working; > > > > > > Are you sure? I don't expect user scripts or even proper management > > > tools to use qemu-io on running VMs. I do expect that some users are > > > using 'convert -s' with running VMs despite our recommendation against > > > it. > > > > > > If they are aware that they're doing something that works only in the > > > right circumstances, then breaking it isn't nice. But my gut feeling is > > > that most of them don't know about the implications of accessing a live > > > image. In this case breaking their use case and making them think about > > > whether they want to add something like a --force option sounds like a > > > good thing because they aren't caught by surprise when things go wrong > > > eventually. > > > > Yes, the use case is poor for qcow2, and your points stand there. But image > > locking will be at the posix level, which has a wider range of users. I cannot > > draw the same conclusion on raw images as easily. > > Well, with raw, I'm even less concerned about breaking the commands > related to internal snapshots. :-) Yes, I'm agree with a --force there. For qemu-img map and qemu-io, personally I think it's better to keep the default working. qemu-io is a expert mode tool, whoever using it at all should already know what he's doing, --force doesn't add much protection for the innocent. Fam
Am 02.03.2017 um 12:21 hat Fam Zheng geschrieben: > On Wed, 03/01 17:22, Kevin Wolf wrote: > > Am 01.03.2017 um 17:10 hat Fam Zheng geschrieben: > > > On Wed, 03/01 16:16, Kevin Wolf wrote: > > > > > I'm not sure about this because: 1) this is intrusive from a user PoV, many > > > > > scripts and upper layer tools will stop working; > > > > > > > > Are you sure? I don't expect user scripts or even proper management > > > > tools to use qemu-io on running VMs. I do expect that some users are > > > > using 'convert -s' with running VMs despite our recommendation against > > > > it. > > > > > > > > If they are aware that they're doing something that works only in the > > > > right circumstances, then breaking it isn't nice. But my gut feeling is > > > > that most of them don't know about the implications of accessing a live > > > > image. In this case breaking their use case and making them think about > > > > whether they want to add something like a --force option sounds like a > > > > good thing because they aren't caught by surprise when things go wrong > > > > eventually. > > > > > > Yes, the use case is poor for qcow2, and your points stand there. But image > > > locking will be at the posix level, which has a wider range of users. I cannot > > > draw the same conclusion on raw images as easily. > > > > Well, with raw, I'm even less concerned about breaking the commands > > related to internal snapshots. :-) > > Yes, I'm agree with a --force there. For qemu-img map and qemu-io, personally I > think it's better to keep the default working. qemu-io is a expert mode tool, > whoever using it at all should already know what he's doing, --force doesn't add > much protection for the innocent. Being an expert doesn't protect you from stupid mistakes like forgetting that a VM is still running. --force doesn't prevent you from performing your evil action, but it prevents accidents where you didn't even intend to be evil for a change. I think qemu-io and qemu-img map are tools that only human users should be using while a VM is running, so breaking command line syntax compatibility by requiring a --force option wouldn't hurt too much. Kevin
On Thu, 03/02 15:23, Kevin Wolf wrote: > Am 02.03.2017 um 12:21 hat Fam Zheng geschrieben: > > On Wed, 03/01 17:22, Kevin Wolf wrote: > > > Am 01.03.2017 um 17:10 hat Fam Zheng geschrieben: > > > > On Wed, 03/01 16:16, Kevin Wolf wrote: > > > > > > I'm not sure about this because: 1) this is intrusive from a user PoV, many > > > > > > scripts and upper layer tools will stop working; > > > > > > > > > > Are you sure? I don't expect user scripts or even proper management > > > > > tools to use qemu-io on running VMs. I do expect that some users are > > > > > using 'convert -s' with running VMs despite our recommendation against > > > > > it. > > > > > > > > > > If they are aware that they're doing something that works only in the > > > > > right circumstances, then breaking it isn't nice. But my gut feeling is > > > > > that most of them don't know about the implications of accessing a live > > > > > image. In this case breaking their use case and making them think about > > > > > whether they want to add something like a --force option sounds like a > > > > > good thing because they aren't caught by surprise when things go wrong > > > > > eventually. > > > > > > > > Yes, the use case is poor for qcow2, and your points stand there. But image > > > > locking will be at the posix level, which has a wider range of users. I cannot > > > > draw the same conclusion on raw images as easily. > > > > > > Well, with raw, I'm even less concerned about breaking the commands > > > related to internal snapshots. :-) > > > > Yes, I'm agree with a --force there. For qemu-img map and qemu-io, personally I > > think it's better to keep the default working. qemu-io is a expert mode tool, > > whoever using it at all should already know what he's doing, --force doesn't add > > much protection for the innocent. > > Being an expert doesn't protect you from stupid mistakes like forgetting > that a VM is still running. --force doesn't prevent you from performing > your evil action, but it prevents accidents where you didn't even intend > to be evil for a change. > > I think qemu-io and qemu-img map are tools that only human users should > be using while a VM is running, so breaking command line syntax > compatibility by requiring a --force option wouldn't hurt too much. OK, that sounds fair. I don't know how to implement the option, though, would you like to take care of it? :) Fam
diff --git a/block.c b/block.c index f293ccb..9bdd77c 100644 --- a/block.c +++ b/block.c @@ -1685,12 +1685,12 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c, /* Format drivers may touch metadata even if the guest doesn't write */ if (!bdrv_is_read_only(bs)) { perm |= BLK_PERM_WRITE | BLK_PERM_RESIZE; + shared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE); } /* bs->file always needs to be consistent because of the metadata. We * can never allow other users to resize or write to it. */ perm |= BLK_PERM_CONSISTENT_READ; - shared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE); } else { /* We want consistent read from backing files if the parent needs it. * No other operations are performed on backing files. */
In an ideal world, read-write access to an image is inherently exclusive, because we cannot guarantee other readers and writers a consistency view of the whole image at all point. That's what the current permission system does, and it is okay as long as it is entirely internal. But that would change with the coming image locking. In practice, both end users and our test cases use tools like qemu-img and qemu-io to peek at images while guest is running. Relax a bit and accept other writers in this case. Signed-off-by: Fam Zheng <famz@redhat.com> --- block.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)