diff mbox series

[v2,13/18] monitor: fdset: Match against O_DIRECT

Message ID 20240523190548.23977-14-farosas@suse.de
State New
Headers show
Series migration/mapped-ram: Add direct-io support | expand

Commit Message

Fabiano Rosas May 23, 2024, 7:05 p.m. UTC
We're about to enable the use of O_DIRECT in the migration code and
due to the alignment restrictions imposed by filesystems we need to
make sure the flag is only used when doing aligned IO.

The migration will do parallel IO to different regions of a file, so
we need to use more than one file descriptor. Those cannot be obtained
by duplicating (dup()) since duplicated file descriptors share the
file status flags, including O_DIRECT. If one migration channel does
unaligned IO while another sets O_DIRECT to do aligned IO, the
filesystem would fail the unaligned operation.

The add-fd QMP command along with the fdset code are specifically
designed to allow the user to pass a set of file descriptors with
different access flags into QEMU to be later fetched by code that
needs to alternate between those flags when doing IO.

Extend the fdset matching to behave the same with the O_DIRECT flag.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 monitor/fds.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Peter Xu May 30, 2024, 9:41 p.m. UTC | #1
On Thu, May 23, 2024 at 04:05:43PM -0300, Fabiano Rosas wrote:
> We're about to enable the use of O_DIRECT in the migration code and
> due to the alignment restrictions imposed by filesystems we need to
> make sure the flag is only used when doing aligned IO.
> 
> The migration will do parallel IO to different regions of a file, so
> we need to use more than one file descriptor. Those cannot be obtained
> by duplicating (dup()) since duplicated file descriptors share the
> file status flags, including O_DIRECT. If one migration channel does
> unaligned IO while another sets O_DIRECT to do aligned IO, the
> filesystem would fail the unaligned operation.
> 
> The add-fd QMP command along with the fdset code are specifically
> designed to allow the user to pass a set of file descriptors with
> different access flags into QEMU to be later fetched by code that
> needs to alternate between those flags when doing IO.
> 
> Extend the fdset matching to behave the same with the O_DIRECT flag.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

One thing I am confused but totally irrelevant to this specific change.

I wonder why do we need dupfds at all, and why client needs to remove-fd at
all.

It's about what would go wrong if qmp client only add-fd, then if it's
consumed by anything, it gets moved from "fds" list to "dupfds" list.  The
thing is I don't expect the client should pass over any fd that will not be
consumed.  Then if it's always consumed, why bother dup() at all, and why
bother an explicit remove-fd.
Fabiano Rosas May 31, 2024, 3:42 p.m. UTC | #2
Peter Xu <peterx@redhat.com> writes:

> On Thu, May 23, 2024 at 04:05:43PM -0300, Fabiano Rosas wrote:
>> We're about to enable the use of O_DIRECT in the migration code and
>> due to the alignment restrictions imposed by filesystems we need to
>> make sure the flag is only used when doing aligned IO.
>> 
>> The migration will do parallel IO to different regions of a file, so
>> we need to use more than one file descriptor. Those cannot be obtained
>> by duplicating (dup()) since duplicated file descriptors share the
>> file status flags, including O_DIRECT. If one migration channel does
>> unaligned IO while another sets O_DIRECT to do aligned IO, the
>> filesystem would fail the unaligned operation.
>> 
>> The add-fd QMP command along with the fdset code are specifically
>> designed to allow the user to pass a set of file descriptors with
>> different access flags into QEMU to be later fetched by code that
>> needs to alternate between those flags when doing IO.
>> 
>> Extend the fdset matching to behave the same with the O_DIRECT flag.
>> 
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> One thing I am confused but totally irrelevant to this specific change.
>
> I wonder why do we need dupfds at all, and why client needs to remove-fd at
> all.

This answer was lost to time.

>
> It's about what would go wrong if qmp client only add-fd, then if it's
> consumed by anything, it gets moved from "fds" list to "dupfds" list.  The
> thing is I don't expect the client should pass over any fd that will not be
> consumed.  Then if it's always consumed, why bother dup() at all, and why
> bother an explicit remove-fd.

With the lack of documentation, I can only imagine the code was
initially written to be more flexible, but we ended up never needing the
extra flexibility. Maybe we could change that transparently in the
future and deprecate qmp_remove_fd(). I'm thinking, even for the
mapped-ram work, we might not need libvirt to call that function ever.
Peter Xu May 31, 2024, 3:58 p.m. UTC | #3
On Fri, May 31, 2024 at 12:42:05PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, May 23, 2024 at 04:05:43PM -0300, Fabiano Rosas wrote:
> >> We're about to enable the use of O_DIRECT in the migration code and
> >> due to the alignment restrictions imposed by filesystems we need to
> >> make sure the flag is only used when doing aligned IO.
> >> 
> >> The migration will do parallel IO to different regions of a file, so
> >> we need to use more than one file descriptor. Those cannot be obtained
> >> by duplicating (dup()) since duplicated file descriptors share the
> >> file status flags, including O_DIRECT. If one migration channel does
> >> unaligned IO while another sets O_DIRECT to do aligned IO, the
> >> filesystem would fail the unaligned operation.
> >> 
> >> The add-fd QMP command along with the fdset code are specifically
> >> designed to allow the user to pass a set of file descriptors with
> >> different access flags into QEMU to be later fetched by code that
> >> needs to alternate between those flags when doing IO.
> >> 
> >> Extend the fdset matching to behave the same with the O_DIRECT flag.
> >> 
> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >
> > Reviewed-by: Peter Xu <peterx@redhat.com>
> >
> > One thing I am confused but totally irrelevant to this specific change.
> >
> > I wonder why do we need dupfds at all, and why client needs to remove-fd at
> > all.
> 
> This answer was lost to time.
> 
> >
> > It's about what would go wrong if qmp client only add-fd, then if it's
> > consumed by anything, it gets moved from "fds" list to "dupfds" list.  The
> > thing is I don't expect the client should pass over any fd that will not be
> > consumed.  Then if it's always consumed, why bother dup() at all, and why
> > bother an explicit remove-fd.
> 
> With the lack of documentation, I can only imagine the code was
> initially written to be more flexible, but we ended up never needing the
> extra flexibility. Maybe we could change that transparently in the
> future and deprecate qmp_remove_fd(). I'm thinking, even for the
> mapped-ram work, we might not need libvirt to call that function ever.

Good to know I'm not the only one thinking that.

It's okay - after your cleanup it's much better at least to me.  The only
thing to avoid these, AFAIU, is more careful design level reviews in the
future, and more documents state the facts and keep in history (and
obviously why remove-fd was needed was not documented).  Now we carry them.
diff mbox series

Patch

diff --git a/monitor/fds.c b/monitor/fds.c
index d93d2e695b..a6580e215e 100644
--- a/monitor/fds.c
+++ b/monitor/fds.c
@@ -419,6 +419,11 @@  int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags, Error **errp)
         int fd = -1;
         int dup_fd;
         int mon_fd_flags;
+        int mask = O_ACCMODE;
+
+#ifdef O_DIRECT
+        mask |= O_DIRECT;
+#endif
 
         if (mon_fdset->id != fdset_id) {
             continue;
@@ -432,7 +437,7 @@  int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags, Error **errp)
                 return -1;
             }
 
-            if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
+            if ((flags & mask) == (mon_fd_flags & mask)) {
                 fd = mon_fdset_fd->fd;
                 break;
             }