Message ID | 20200218124242.584644-10-mreitz@redhat.com |
---|---|
State | New |
Headers | show |
Series | block: Introduce real BdrvChildRole | expand |
Am 18.02.2020 um 13:42 hat Max Reitz geschrieben: > After the series this patch belongs to, we want to have a common > BdrvChildClass that encompasses all of child_file, child_format, and > child_backing. Such a single class needs a single .inherit_options() > implementation, and this patch introduces it. > > The next patch will show how the existing implementations can fall back > to it just by passing appropriate BdrvChildRole and parent_is_format > values. > > Signed-off-by: Max Reitz <mreitz@redhat.com> > Reviewed-by: Eric Blake <eblake@redhat.com> > --- > block.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 84 insertions(+) > > diff --git a/block.c b/block.c > index c33f0e9b42..9179b9b604 100644 > --- a/block.c > +++ b/block.c > @@ -998,6 +998,90 @@ static void bdrv_temp_snapshot_options(int *child_flags, QDict *child_options, > *child_flags &= ~BDRV_O_NATIVE_AIO; > } > > +/* > + * Returns the options and flags that a generic child of a BDS should > + * get, based on the given options and flags for the parent BDS. > + */ > +static void __attribute__((unused)) > + bdrv_inherited_options(BdrvChildRole role, bool parent_is_format, > + int *child_flags, QDict *child_options, > + int parent_flags, QDict *parent_options) > +{ > + int flags = parent_flags; > + > + /* > + * First, decide whether to set, clear, or leave BDRV_O_PROTOCOL. > + * Generally, the question to answer is: Should this child be > + * format-probed by default? > + */ > + > + /* > + * Pure and non-filtered data children of non-format nodes should > + * be probed by default (even when the node itself has BDRV_O_PROTOCOL > + * set). This only affects a very limited set of drivers (namely > + * quorum and blkverify when this comment was written). > + * Force-clear BDRV_O_PROTOCOL then. > + */ > + if (!parent_is_format && > + (role & (BDRV_CHILD_DATA | BDRV_CHILD_METADATA | > + BDRV_CHILD_FILTERED)) == > + BDRV_CHILD_DATA) You could avoid the odd indentation (I can't decide whether or not it should be one space more to align correctly) and probably also make the expression more readable if you split it into: (role & BDRV_CHILD_DATA) && !(role & (BDRV_CHILD_METADATA | BDRV_CHILD_FILTERED)) > + { > + flags &= ~BDRV_O_PROTOCOL; > + } > + > + /* > + * All children of format nodes (except for COW children) and all > + * metadata children in general should never be format-probed. > + * Force-set BDRV_O_PROTOCOL then. > + */ > + if ((parent_is_format && !(role & BDRV_CHILD_COW)) || > + (role & BDRV_CHILD_METADATA)) > + { > + flags |= BDRV_O_PROTOCOL; > + } > + > + /* > + * If the cache mode isn't explicitly set, inherit direct and no-flush from > + * the parent. > + */ > + qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_DIRECT); > + qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_NO_FLUSH); > + qdict_copy_default(child_options, parent_options, BDRV_OPT_FORCE_SHARE); > + > + if (role & BDRV_CHILD_COW) { > + /* backing files are always opened read-only */ Not "always", just by default. > + qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "on"); > + qdict_set_default_str(child_options, BDRV_OPT_AUTO_READ_ONLY, "off"); > + } else { > + /* Inherit the read-only option from the parent if it's not set */ > + qdict_copy_default(child_options, parent_options, BDRV_OPT_READ_ONLY); > + qdict_copy_default(child_options, parent_options, > + BDRV_OPT_AUTO_READ_ONLY); > + } > + > + if (parent_is_format && !(role & BDRV_CHILD_COW)) { > + /* > + * Our format drivers take care to send flushes and respect > + * unmap policy, so we can default to enable both on lower > + * layers regardless of the corresponding parent options. > + */ > + qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap"); > + } Why the restriction to format here? Don't we break "unmap" propagation through filters with this? It would probably also be a good question why we don't propagate it to the backing file, but this is preexisting. > + /* Clear flags that only apply to the top layer */ > + flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_COPY_ON_READ); > + > + if (role & BDRV_CHILD_METADATA) { > + flags &= ~BDRV_O_NO_IO; > + } Hm... This is interesting, but I guess it makes sense. It just never was a hard rule that a format driver must not access data even internally with NO_IO, but I think it holds true. > + if (role & BDRV_CHILD_COW) { > + flags &= ~BDRV_O_TEMPORARY; > + } We could probably have a long discussion about whether this is right in theory, but in practice BDRV_O_TEMPORARY only exists for snapshot=on, for which we know that it's always qcow2 with a file and a backing child. And there is no doubt that we make the right distinction in this case. > + *child_flags = flags; > +} > + > /* > * Returns the options and flags that bs->file should get if a protocol driver > * is expected, based on the given options and flags for the parent BDS > -- > 2.24.1 Kevin
Am 06.05.2020 um 12:37 hat Kevin Wolf geschrieben: > Am 18.02.2020 um 13:42 hat Max Reitz geschrieben: > > After the series this patch belongs to, we want to have a common > > BdrvChildClass that encompasses all of child_file, child_format, and > > child_backing. Such a single class needs a single .inherit_options() > > implementation, and this patch introduces it. > > > > The next patch will show how the existing implementations can fall back > > to it just by passing appropriate BdrvChildRole and parent_is_format > > values. > > > > Signed-off-by: Max Reitz <mreitz@redhat.com> > > Reviewed-by: Eric Blake <eblake@redhat.com> > > --- > > block.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 84 insertions(+) > > > > diff --git a/block.c b/block.c > > index c33f0e9b42..9179b9b604 100644 > > --- a/block.c > > +++ b/block.c > > @@ -998,6 +998,90 @@ static void bdrv_temp_snapshot_options(int *child_flags, QDict *child_options, > > *child_flags &= ~BDRV_O_NATIVE_AIO; > > } > > > > +/* > > + * Returns the options and flags that a generic child of a BDS should > > + * get, based on the given options and flags for the parent BDS. > > + */ > > +static void __attribute__((unused)) > > + bdrv_inherited_options(BdrvChildRole role, bool parent_is_format, > > + int *child_flags, QDict *child_options, > > + int parent_flags, QDict *parent_options) > > +{ > > + int flags = parent_flags; > > + > > + /* > > + * First, decide whether to set, clear, or leave BDRV_O_PROTOCOL. > > + * Generally, the question to answer is: Should this child be > > + * format-probed by default? > > + */ Just for clarity: Do you know a good reason to ever leave it (i.e. inherit it from the parent), except that that's what we have always been doing for backing files? Though of course, only formats have backing files, so the flag would never be set in practice in this case. > > + /* > > + * Pure and non-filtered data children of non-format nodes should > > + * be probed by default (even when the node itself has BDRV_O_PROTOCOL > > + * set). This only affects a very limited set of drivers (namely > > + * quorum and blkverify when this comment was written). > > + * Force-clear BDRV_O_PROTOCOL then. > > + */ > > + if (!parent_is_format && > > + (role & (BDRV_CHILD_DATA | BDRV_CHILD_METADATA | > > + BDRV_CHILD_FILTERED)) == > > + BDRV_CHILD_DATA) > > You could avoid the odd indentation (I can't decide whether or not it > should be one space more to align correctly) and probably also make the > expression more readable if you split it into: > > (role & BDRV_CHILD_DATA) && > !(role & (BDRV_CHILD_METADATA | BDRV_CHILD_FILTERED)) > > > + { > > + flags &= ~BDRV_O_PROTOCOL; > > + } > > + > > + /* > > + * All children of format nodes (except for COW children) and all > > + * metadata children in general should never be format-probed. > > + * Force-set BDRV_O_PROTOCOL then. > > + */ > > + if ((parent_is_format && !(role & BDRV_CHILD_COW)) || > > + (role & BDRV_CHILD_METADATA)) > > + { > > + flags |= BDRV_O_PROTOCOL; > > + } > > + > > + /* > > + * If the cache mode isn't explicitly set, inherit direct and no-flush from > > + * the parent. > > + */ > > + qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_DIRECT); > > + qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_NO_FLUSH); > > + qdict_copy_default(child_options, parent_options, BDRV_OPT_FORCE_SHARE); > > + > > + if (role & BDRV_CHILD_COW) { > > + /* backing files are always opened read-only */ > > Not "always", just by default. > > > + qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "on"); > > + qdict_set_default_str(child_options, BDRV_OPT_AUTO_READ_ONLY, "off"); > > + } else { > > + /* Inherit the read-only option from the parent if it's not set */ > > + qdict_copy_default(child_options, parent_options, BDRV_OPT_READ_ONLY); > > + qdict_copy_default(child_options, parent_options, > > + BDRV_OPT_AUTO_READ_ONLY); > > + } > > + > > + if (parent_is_format && !(role & BDRV_CHILD_COW)) { > > + /* > > + * Our format drivers take care to send flushes and respect > > + * unmap policy, so we can default to enable both on lower > > + * layers regardless of the corresponding parent options. > > + */ > > + qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap"); > > + } > > Why the restriction to format here? Don't we break "unmap" propagation > through filters with this? > > It would probably also be a good question why we don't propagate it to > the backing file, but this is preexisting. Some patches later, I think the fix is an else branch that copies the flag from parent_options. Kevin
On 06.05.20 12:37, Kevin Wolf wrote: > Am 18.02.2020 um 13:42 hat Max Reitz geschrieben: >> After the series this patch belongs to, we want to have a common >> BdrvChildClass that encompasses all of child_file, child_format, and >> child_backing. Such a single class needs a single .inherit_options() >> implementation, and this patch introduces it. >> >> The next patch will show how the existing implementations can fall back >> to it just by passing appropriate BdrvChildRole and parent_is_format >> values. >> >> Signed-off-by: Max Reitz <mreitz@redhat.com> >> Reviewed-by: Eric Blake <eblake@redhat.com> >> --- >> block.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 84 insertions(+) >> >> diff --git a/block.c b/block.c >> index c33f0e9b42..9179b9b604 100644 >> --- a/block.c >> +++ b/block.c >> @@ -998,6 +998,90 @@ static void bdrv_temp_snapshot_options(int *child_flags, QDict *child_options, >> *child_flags &= ~BDRV_O_NATIVE_AIO; >> } >> >> +/* >> + * Returns the options and flags that a generic child of a BDS should >> + * get, based on the given options and flags for the parent BDS. >> + */ >> +static void __attribute__((unused)) >> + bdrv_inherited_options(BdrvChildRole role, bool parent_is_format, >> + int *child_flags, QDict *child_options, >> + int parent_flags, QDict *parent_options) >> +{ >> + int flags = parent_flags; >> + >> + /* >> + * First, decide whether to set, clear, or leave BDRV_O_PROTOCOL. >> + * Generally, the question to answer is: Should this child be >> + * format-probed by default? >> + */ >> + >> + /* >> + * Pure and non-filtered data children of non-format nodes should >> + * be probed by default (even when the node itself has BDRV_O_PROTOCOL >> + * set). This only affects a very limited set of drivers (namely >> + * quorum and blkverify when this comment was written). >> + * Force-clear BDRV_O_PROTOCOL then. >> + */ >> + if (!parent_is_format && >> + (role & (BDRV_CHILD_DATA | BDRV_CHILD_METADATA | >> + BDRV_CHILD_FILTERED)) == >> + BDRV_CHILD_DATA) > > You could avoid the odd indentation (I can't decide whether or not it > should be one space more to align correctly) and probably also make the > expression more readable if you split it into: > > (role & BDRV_CHILD_DATA) && > !(role & (BDRV_CHILD_METADATA | BDRV_CHILD_FILTERED)) Yes, looks good. >> + { >> + flags &= ~BDRV_O_PROTOCOL; >> + } >> + >> + /* >> + * All children of format nodes (except for COW children) and all >> + * metadata children in general should never be format-probed. >> + * Force-set BDRV_O_PROTOCOL then. >> + */ >> + if ((parent_is_format && !(role & BDRV_CHILD_COW)) || >> + (role & BDRV_CHILD_METADATA)) >> + { >> + flags |= BDRV_O_PROTOCOL; >> + } >> + >> + /* >> + * If the cache mode isn't explicitly set, inherit direct and no-flush from >> + * the parent. >> + */ >> + qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_DIRECT); >> + qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_NO_FLUSH); >> + qdict_copy_default(child_options, parent_options, BDRV_OPT_FORCE_SHARE); >> + >> + if (role & BDRV_CHILD_COW) { >> + /* backing files are always opened read-only */ > > Not "always", just by default. OK. I just copied the comment from bdrv_backing_options(). >> + qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "on"); >> + qdict_set_default_str(child_options, BDRV_OPT_AUTO_READ_ONLY, "off"); >> + } else { >> + /* Inherit the read-only option from the parent if it's not set */ >> + qdict_copy_default(child_options, parent_options, BDRV_OPT_READ_ONLY); >> + qdict_copy_default(child_options, parent_options, >> + BDRV_OPT_AUTO_READ_ONLY); >> + } >> + >> + if (parent_is_format && !(role & BDRV_CHILD_COW)) { >> + /* >> + * Our format drivers take care to send flushes and respect >> + * unmap policy, so we can default to enable both on lower >> + * layers regardless of the corresponding parent options. >> + */ >> + qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap"); >> + } > > Why the restriction to format here? Don't we break "unmap" propagation > through filters with this? Right now (before this series), the behavior seems ambiguous, in that for filters that use bs->file, it is set, but for those that use bs->backing, it isn’t. But I suspect the main reason for what I did is the way I interpreted the comment (which before this series only mentions block drivers in general, not specifically format drivers): It sounded to me as if the block driver needed to respect the unmap policy, and I didn’t think filters did that. So it was my understanding that filter drivers would just propagate discards and thus we couldn’t default-enable unmap on their children. But I was wrong, the block driver doesn’t need to respect anything, because bdrv_co_pdiscard() already does. So I suppose it should indeed be enabled for all children, with the comment changed to express that it isn’t any block driver that respects unmap policy, but bdrv_co_pdiscard(), e.g.: bdrv_co_pdiscard() respects unmap policy for the parent, so we can default to enable it on lower layers regardless of the parent option. > It would probably also be a good question why we don't propagate it to > the backing file, but this is preexisting. I suppose we should, although it’s irrelevant, so. I suppose I’ll just drop the parent_is_format, adjust the comment and that should be fine for this series. Max
On 06.05.20 15:11, Kevin Wolf wrote: > Am 06.05.2020 um 12:37 hat Kevin Wolf geschrieben: >> Am 18.02.2020 um 13:42 hat Max Reitz geschrieben: >>> After the series this patch belongs to, we want to have a common >>> BdrvChildClass that encompasses all of child_file, child_format, and >>> child_backing. Such a single class needs a single .inherit_options() >>> implementation, and this patch introduces it. >>> >>> The next patch will show how the existing implementations can fall back >>> to it just by passing appropriate BdrvChildRole and parent_is_format >>> values. >>> >>> Signed-off-by: Max Reitz <mreitz@redhat.com> >>> Reviewed-by: Eric Blake <eblake@redhat.com> >>> --- >>> block.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>> 1 file changed, 84 insertions(+) >>> >>> diff --git a/block.c b/block.c >>> index c33f0e9b42..9179b9b604 100644 >>> --- a/block.c >>> +++ b/block.c >>> @@ -998,6 +998,90 @@ static void bdrv_temp_snapshot_options(int *child_flags, QDict *child_options, >>> *child_flags &= ~BDRV_O_NATIVE_AIO; >>> } >>> >>> +/* >>> + * Returns the options and flags that a generic child of a BDS should >>> + * get, based on the given options and flags for the parent BDS. >>> + */ >>> +static void __attribute__((unused)) >>> + bdrv_inherited_options(BdrvChildRole role, bool parent_is_format, >>> + int *child_flags, QDict *child_options, >>> + int parent_flags, QDict *parent_options) >>> +{ >>> + int flags = parent_flags; >>> + >>> + /* >>> + * First, decide whether to set, clear, or leave BDRV_O_PROTOCOL. >>> + * Generally, the question to answer is: Should this child be >>> + * format-probed by default? >>> + */ > > Just for clarity: Do you know a good reason to ever leave it (i.e. > inherit it from the parent), except that that's what we have always been > doing for backing files? Though of course, only formats have backing > files, so the flag would never be set in practice in this case. It seems correct for filters. [...] >>> + if (parent_is_format && !(role & BDRV_CHILD_COW)) { >>> + /* >>> + * Our format drivers take care to send flushes and respect >>> + * unmap policy, so we can default to enable both on lower >>> + * layers regardless of the corresponding parent options. >>> + */ >>> + qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap"); >>> + } >> >> Why the restriction to format here? Don't we break "unmap" propagation >> through filters with this? >> >> It would probably also be a good question why we don't propagate it to >> the backing file, but this is preexisting. > > Some patches later, I think the fix is an else branch that copies the > flag from parent_options. I thought about the same thing, but is that really necessary if bdrv_co_pdiscard() will already suppress discards on the parent if unmap is false? Max
Am 07.05.2020 um 10:49 hat Max Reitz geschrieben: > On 06.05.20 12:37, Kevin Wolf wrote: > > Am 18.02.2020 um 13:42 hat Max Reitz geschrieben: > >> + if (role & BDRV_CHILD_COW) { > >> + /* backing files are always opened read-only */ > > > > Not "always", just by default. > > OK. I just copied the comment from bdrv_backing_options(). Yes, sorry, I noticed this only later (it's how review goes when a move is split into a copy in one patch and a removal later). I don't insist on a change if you prefer to have a clean copy. > >> + qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "on"); > >> + qdict_set_default_str(child_options, BDRV_OPT_AUTO_READ_ONLY, "off"); > >> + } else { > >> + /* Inherit the read-only option from the parent if it's not set */ > >> + qdict_copy_default(child_options, parent_options, BDRV_OPT_READ_ONLY); > >> + qdict_copy_default(child_options, parent_options, > >> + BDRV_OPT_AUTO_READ_ONLY); > >> + } > >> + > >> + if (parent_is_format && !(role & BDRV_CHILD_COW)) { > >> + /* > >> + * Our format drivers take care to send flushes and respect > >> + * unmap policy, so we can default to enable both on lower > >> + * layers regardless of the corresponding parent options. > >> + */ > >> + qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap"); > >> + } > > > > Why the restriction to format here? Don't we break "unmap" propagation > > through filters with this? > > Right now (before this series), the behavior seems ambiguous, in that > for filters that use bs->file, it is set, but for those that use > bs->backing, it isn’t. It's probably easy to agree that this is a bug. > But I suspect the main reason for what I did is the way I interpreted > the comment (which before this series only mentions block drivers in > general, not specifically format drivers): It sounded to me as if the > block driver needed to respect the unmap policy, and I didn’t think > filters did that. So it was my understanding that filter drivers would > just propagate discards and thus we couldn’t default-enable unmap on > their children. This was actually my thought as well. And in order to propagate, we have to copy the option from parent_options, no? Currently it will always be disabled (unless specified explicitly for the node) because that's the default setting for unmap. > But I was wrong, the block driver doesn’t need to respect anything, > because bdrv_co_pdiscard() already does. > > So I suppose it should indeed be enabled for all children, with the > comment changed to express that it isn’t any block driver that respects > unmap policy, but bdrv_co_pdiscard(), e.g.: > > bdrv_co_pdiscard() respects unmap policy for the parent, so we can > default to enable it on lower layers regardless of the parent option. This would restore the behaviour before this series for child_file and child_format. It would be different in that it also enables "unmap" for backing files, which should be okay. > > It would probably also be a good question why we don't propagate it to > > the backing file, but this is preexisting. > > I suppose we should, although it’s irrelevant, so. I suppose I’ll just > drop the parent_is_format, adjust the comment and that should be fine > for this series. Isn't it relevant for zero writes during active commit? (The "normal" intermediate commit job doesn't even try to optimise zero blocks...) The job will have its own BdrvChild to access the node, but option inheritance happens only from the parent that "owns" the backing file, so if a qcow2 image doesn't set "unmap" on its COW child, unmap will be disabled for the active commit job, too. (Oops. Turns out that it's not the case because the 'unmap' option for the job exists only for drive-mirror. blockdev-mirror passes true unconditionally and block-commit passes false unconditionally. I'm always amazed how consistently we fail to be consistent. But I think using zero writes with MAY_UNMAP in a commit job is an obvious extension, so considering it now can't hurt anyway.) Kevin
On 07.05.20 13:19, Kevin Wolf wrote: > Am 07.05.2020 um 10:49 hat Max Reitz geschrieben: >> On 06.05.20 12:37, Kevin Wolf wrote: >>> Am 18.02.2020 um 13:42 hat Max Reitz geschrieben: >>>> + if (role & BDRV_CHILD_COW) { >>>> + /* backing files are always opened read-only */ >>> >>> Not "always", just by default. >> >> OK. I just copied the comment from bdrv_backing_options(). > > Yes, sorry, I noticed this only later (it's how review goes when a move > is split into a copy in one patch and a removal later). I don't insist > on a change if you prefer to have a clean copy. > >>>> + qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "on"); >>>> + qdict_set_default_str(child_options, BDRV_OPT_AUTO_READ_ONLY, "off"); >>>> + } else { >>>> + /* Inherit the read-only option from the parent if it's not set */ >>>> + qdict_copy_default(child_options, parent_options, BDRV_OPT_READ_ONLY); >>>> + qdict_copy_default(child_options, parent_options, >>>> + BDRV_OPT_AUTO_READ_ONLY); >>>> + } >>>> + >>>> + if (parent_is_format && !(role & BDRV_CHILD_COW)) { >>>> + /* >>>> + * Our format drivers take care to send flushes and respect >>>> + * unmap policy, so we can default to enable both on lower >>>> + * layers regardless of the corresponding parent options. >>>> + */ >>>> + qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap"); >>>> + } >>> >>> Why the restriction to format here? Don't we break "unmap" propagation >>> through filters with this? >> >> Right now (before this series), the behavior seems ambiguous, in that >> for filters that use bs->file, it is set, but for those that use >> bs->backing, it isn’t. > > It's probably easy to agree that this is a bug. > >> But I suspect the main reason for what I did is the way I interpreted >> the comment (which before this series only mentions block drivers in >> general, not specifically format drivers): It sounded to me as if the >> block driver needed to respect the unmap policy, and I didn’t think >> filters did that. So it was my understanding that filter drivers would >> just propagate discards and thus we couldn’t default-enable unmap on >> their children. > > This was actually my thought as well. And in order to propagate, we have > to copy the option from parent_options, no? Currently it will always be > disabled (unless specified explicitly for the node) because that's the > default setting for unmap. > >> But I was wrong, the block driver doesn’t need to respect anything, >> because bdrv_co_pdiscard() already does. >> >> So I suppose it should indeed be enabled for all children, with the >> comment changed to express that it isn’t any block driver that respects >> unmap policy, but bdrv_co_pdiscard(), e.g.: >> >> bdrv_co_pdiscard() respects unmap policy for the parent, so we can >> default to enable it on lower layers regardless of the parent option. > > This would restore the behaviour before this series for child_file and > child_format. It would be different in that it also enables "unmap" for > backing files, which should be okay. > >>> It would probably also be a good question why we don't propagate it to >>> the backing file, but this is preexisting. >> >> I suppose we should, although it’s irrelevant, so. I suppose I’ll just >> drop the parent_is_format, adjust the comment and that should be fine >> for this series. > > Isn't it relevant for zero writes during active commit? (The "normal" > intermediate commit job doesn't even try to optimise zero blocks...) > > The job will have its own BdrvChild to access the node, but option > inheritance happens only from the parent that "owns" the backing file, > so if a qcow2 image doesn't set "unmap" on its COW child, unmap will be > disabled for the active commit job, too. > > (Oops. Turns out that it's not the case because the 'unmap' option for > the job exists only for drive-mirror. blockdev-mirror passes true > unconditionally and block-commit passes false unconditionally. I'm > always amazed how consistently we fail to be consistent. Alles kaputt :) > But I think using zero writes with MAY_UNMAP in a commit job is an > obvious extension, so considering it now can't hurt anyway.) OK. So I’ll just make unmap=on the default always. Max
diff --git a/block.c b/block.c index c33f0e9b42..9179b9b604 100644 --- a/block.c +++ b/block.c @@ -998,6 +998,90 @@ static void bdrv_temp_snapshot_options(int *child_flags, QDict *child_options, *child_flags &= ~BDRV_O_NATIVE_AIO; } +/* + * Returns the options and flags that a generic child of a BDS should + * get, based on the given options and flags for the parent BDS. + */ +static void __attribute__((unused)) + bdrv_inherited_options(BdrvChildRole role, bool parent_is_format, + int *child_flags, QDict *child_options, + int parent_flags, QDict *parent_options) +{ + int flags = parent_flags; + + /* + * First, decide whether to set, clear, or leave BDRV_O_PROTOCOL. + * Generally, the question to answer is: Should this child be + * format-probed by default? + */ + + /* + * Pure and non-filtered data children of non-format nodes should + * be probed by default (even when the node itself has BDRV_O_PROTOCOL + * set). This only affects a very limited set of drivers (namely + * quorum and blkverify when this comment was written). + * Force-clear BDRV_O_PROTOCOL then. + */ + if (!parent_is_format && + (role & (BDRV_CHILD_DATA | BDRV_CHILD_METADATA | + BDRV_CHILD_FILTERED)) == + BDRV_CHILD_DATA) + { + flags &= ~BDRV_O_PROTOCOL; + } + + /* + * All children of format nodes (except for COW children) and all + * metadata children in general should never be format-probed. + * Force-set BDRV_O_PROTOCOL then. + */ + if ((parent_is_format && !(role & BDRV_CHILD_COW)) || + (role & BDRV_CHILD_METADATA)) + { + flags |= BDRV_O_PROTOCOL; + } + + /* + * If the cache mode isn't explicitly set, inherit direct and no-flush from + * the parent. + */ + qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_DIRECT); + qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_NO_FLUSH); + qdict_copy_default(child_options, parent_options, BDRV_OPT_FORCE_SHARE); + + if (role & BDRV_CHILD_COW) { + /* backing files are always opened read-only */ + qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "on"); + qdict_set_default_str(child_options, BDRV_OPT_AUTO_READ_ONLY, "off"); + } else { + /* Inherit the read-only option from the parent if it's not set */ + qdict_copy_default(child_options, parent_options, BDRV_OPT_READ_ONLY); + qdict_copy_default(child_options, parent_options, + BDRV_OPT_AUTO_READ_ONLY); + } + + if (parent_is_format && !(role & BDRV_CHILD_COW)) { + /* + * Our format drivers take care to send flushes and respect + * unmap policy, so we can default to enable both on lower + * layers regardless of the corresponding parent options. + */ + qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap"); + } + + /* Clear flags that only apply to the top layer */ + flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_COPY_ON_READ); + + if (role & BDRV_CHILD_METADATA) { + flags &= ~BDRV_O_NO_IO; + } + if (role & BDRV_CHILD_COW) { + flags &= ~BDRV_O_TEMPORARY; + } + + *child_flags = flags; +} + /* * Returns the options and flags that bs->file should get if a protocol driver * is expected, based on the given options and flags for the parent BDS