block: char devices on FreeBSD are not behind a pager

Message ID 1413823167-26619-1-git-send-email-roger.pau@citrix.com
State New

Commit Message

Roger Pau Monné Oct. 20, 2014, 4:39 p.m. UTC
Acknowledge this and forcefully set BDRV_O_NOCACHE and O_DIRECT in order to
force QEMU to use aligned buffers.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/raw-posix.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
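
To illustrate the alignment requirement from the commit message, here is a minimal sketch (not QEMU code) of how an arbitrary byte range can be widened to sector boundaries so that the device only sees aligned requests. The names `sketch_span` and `sketch_widen_to_sectors` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a byte range on the device. */
struct sketch_span {
    uint64_t offset;
    uint64_t len;
};

/*
 * Widen [offset, offset + len) so that both ends fall on a multiple of
 * sector_size. Illustration only; overflow near UINT64_MAX is not handled.
 */
static struct sketch_span sketch_widen_to_sectors(uint64_t offset,
                                                  uint64_t len,
                                                  uint64_t sector_size)
{
    struct sketch_span out;
    uint64_t end;

    assert(sector_size > 0);

    end = offset + len;
    /* Round the start down to the previous sector boundary. */
    out.offset = offset - offset % sector_size;
    /* Round the end up to the next sector boundary. */
    end = (end + sector_size - 1) / sector_size * sector_size;
    out.len = end - out.offset;
    return out;
}
```

With 512-byte sectors, a 100-byte request at offset 700 becomes a 512-byte request at offset 512, while a request that is already aligned is returned unchanged.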

Comments

Kevin Wolf Oct. 20, 2014, 5:22 p.m. UTC | #1
On 20.10.2014 at 18:39, Roger Pau Monne wrote:
> Acknowledge this and forcefully set BDRV_O_NOCACHE and O_DIRECT in order to
> force QEMU to use aligned buffers.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/raw-posix.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index 86ce4f2..63841dd 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -472,6 +472,18 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>          }
>  #endif
>      }
> +#ifdef __FreeBSD__
> +    if (S_ISCHR(st.st_mode)) {
> +        /*
> +         * The file is a char device (disk), which on FreeBSD isn't behind
> +         * a pager, so set BDRV_O_NOCACHE unconditionally. This is needed
> +         * so Qemu makes sure all IO operations on the device are aligned
> +         * to sector size, or else FreeBSD will reject them with EINVAL.
> +         */
> +        bs->open_flags |= BDRV_O_NOCACHE;
> +        s->open_flags |= O_DIRECT;
> +    }
> +#endif

No, this doesn't look right. Block drivers must not modify the options
that they get. (Yes, the Linux AIO case is broken in this respect.
Hopefully we'll be able to fix it soon.)

Depending on what the real requirements are, I can see two options:

1. Require cache.direct=on (i.e. O_DIRECT) for char devices on FreeBSD.
   If the user didn't set the option, print a nice error message telling
   them what option to set.

2. If O_DIRECT isn't actually required to open the file, but you only
   need to make sure to use the right alignment, modify
   raw_probe_alignment() so that it returns an alignment > 1 even for
   non-O_DIRECT files on FreeBSD if they are character devices.

I don't know FreeBSD well enough, but if it fulfills the requirements,
option 2 is certainly the more elegant one.

Kevin
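
The two options from the review can be contrasted in a minimal sketch. This is not the real raw_open_common() or raw_probe_alignment(); the helper names, the boolean parameters, and the error string are all assumptions made for illustration. A caller would presumably derive `is_freebsd_chr_dev` from `S_ISCHR(st.st_mode)` inside an `#ifdef __FreeBSD__` block, as the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Option 1 (hypothetical helper): refuse to open a FreeBSD char device
 * unless the user asked for cache.direct=on. Returns 0 on success or
 * -EINVAL, and sets *errmsg to a hint for the user on failure.
 */
static int sketch_option1_check(bool is_freebsd_chr_dev, bool cache_direct,
                                const char **errmsg)
{
    assert(errmsg != NULL);

    if (is_freebsd_chr_dev && !cache_direct) {
        *errmsg = "character devices on FreeBSD require cache.direct=on";
        return -EINVAL;
    }
    *errmsg = NULL;
    return 0;
}

/*
 * Option 2 (hypothetical helper): leave the user's options untouched and
 * only report an alignment greater than 1 when the device needs it.
 */
static size_t sketch_option2_alignment(bool is_freebsd_chr_dev,
                                       bool cache_direct,
                                       size_t sector_size)
{
    assert(sector_size > 0);

    if (cache_direct || is_freebsd_chr_dev) {
        return sector_size;
    }
    return 1;
}
```

Option 2 never touches the open flags, which is the property the review asks for: the block driver only reports what alignment it needs.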

Roger Pau Monné Oct. 21, 2014, 8:14 a.m. UTC | #2
On 20/10/14 at 19.22, Kevin Wolf wrote:
> On 20.10.2014 at 18:39, Roger Pau Monne wrote:
>> Acknowledge this and forcefully set BDRV_O_NOCACHE and O_DIRECT in order to
>> force QEMU to use aligned buffers.
>>
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>> Cc: Kevin Wolf <kwolf@redhat.com>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> ---
>>  block/raw-posix.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>> index 86ce4f2..63841dd 100644
>> --- a/block/raw-posix.c
>> +++ b/block/raw-posix.c
>> @@ -472,6 +472,18 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>>          }
>>  #endif
>>      }
>> +#ifdef __FreeBSD__
>> +    if (S_ISCHR(st.st_mode)) {
>> +        /*
>> +         * The file is a char device (disk), which on FreeBSD isn't behind
>> +         * a pager, so set BDRV_O_NOCACHE unconditionally. This is needed
>> +         * so Qemu makes sure all IO operations on the device are aligned
>> +         * to sector size, or else FreeBSD will reject them with EINVAL.
>> +         */
>> +        bs->open_flags |= BDRV_O_NOCACHE;
>> +        s->open_flags |= O_DIRECT;
>> +    }
>> +#endif
> 
> No, this doesn't look right. Block drivers must not modify the options
> that they get. (Yes, the Linux AIO case is broken in this respect.
> Hopefully we'll be able to fix it soon.)
> 
> Depending on what the real requirements are, I can see two options:
> 
> 1. Require cache.direct=on (i.e. O_DIRECT) for char devices on FreeBSD.
>    If the user didn't set the option, print a nice error message telling
>    them what option to set.
> 
> 2. If O_DIRECT isn't actually required to open the file, but you only
>    need to make sure to use the right alignment, modify
>    raw_probe_alignment() so that it returns an alignment > 1 even for
>    non-O_DIRECT files on FreeBSD if they are character devices.
> 
> I don't know FreeBSD well enough, but if it fulfills the requirements,
> option 2 is certainly the more elegant one.

Thanks for the review. O_DIRECT is not required to open the file, so
option 2 seems sensible.

I've added a new flag to BDRVRawState that indicates whether the underlying
device needs requests to be aligned. The flag is set if BDRV_O_NOCACHE is
used, or if the OS is FreeBSD and the underlying device is a char device.
It replaces the O_DIRECT and BDRV_O_NOCACHE checks previously used in
raw_probe_alignment and raw_aio_submit. Does this sound OK?

Roger.
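
The flag described above could look roughly like the following sketch. It is not the follow-up patch: `sketch_raw_state`, its fields, and both helpers are hypothetical stand-ins for BDRVRawState and the checks in raw_probe_alignment and raw_aio_submit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of BDRVRawState. */
struct sketch_raw_state {
    bool needs_alignment;   /* set once at open time */
    size_t align;           /* required alignment in bytes, 1 if none */
};

/*
 * Decide once, at open time, whether requests must be aligned: either
 * the user asked for BDRV_O_NOCACHE, or this is a FreeBSD char device.
 */
static void sketch_raw_init(struct sketch_raw_state *s, bool nocache,
                            bool freebsd_chr_dev, size_t sector_size)
{
    assert(s != NULL && sector_size > 0);

    s->needs_alignment = nocache || freebsd_chr_dev;
    s->align = s->needs_alignment ? sector_size : 1;
}

/*
 * Later code consults only the flag instead of re-checking O_DIRECT or
 * BDRV_O_NOCACHE. Buffer address, offset and length must all be aligned.
 */
static bool sketch_request_is_aligned(const struct sketch_raw_state *s,
                                      const void *buf, uint64_t offset,
                                      size_t len)
{
    assert(s != NULL);

    if (!s->needs_alignment) {
        return true;
    }
    return (uintptr_t)buf % s->align == 0 &&
           offset % s->align == 0 &&
           len % s->align == 0;
}
```

Keeping the decision in a single flag means the open flags passed in by the user are never modified, and every later check agrees on one answer.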

Kevin Wolf Oct. 21, 2014, 9:36 a.m. UTC | #3
On 21.10.2014 at 10:14, Roger Pau Monné wrote:
> On 20/10/14 at 19.22, Kevin Wolf wrote:
> > On 20.10.2014 at 18:39, Roger Pau Monne wrote:
> >> Acknowledge this and forcefully set BDRV_O_NOCACHE and O_DIRECT in order to
> >> force QEMU to use aligned buffers.
> >>
> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >> Cc: Kevin Wolf <kwolf@redhat.com>
> >> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >> ---
> >>  block/raw-posix.c | 12 ++++++++++++
> >>  1 file changed, 12 insertions(+)
> >>
> >> diff --git a/block/raw-posix.c b/block/raw-posix.c
> >> index 86ce4f2..63841dd 100644
> >> --- a/block/raw-posix.c
> >> +++ b/block/raw-posix.c
> >> @@ -472,6 +472,18 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
> >>          }
> >>  #endif
> >>      }
> >> +#ifdef __FreeBSD__
> >> +    if (S_ISCHR(st.st_mode)) {
> >> +        /*
> >> +         * The file is a char device (disk), which on FreeBSD isn't behind
> >> +         * a pager, so set BDRV_O_NOCACHE unconditionally. This is needed
> >> +         * so Qemu makes sure all IO operations on the device are aligned
> >> +         * to sector size, or else FreeBSD will reject them with EINVAL.
> >> +         */
> >> +        bs->open_flags |= BDRV_O_NOCACHE;
> >> +        s->open_flags |= O_DIRECT;
> >> +    }
> >> +#endif
> > 
> > No, this doesn't look right. Block drivers must not modify the options
> > that they get. (Yes, the Linux AIO case is broken in this respect.
> > Hopefully we'll be able to fix it soon.)
> > 
> > Depending on what the real requirements are, I can see two options:
> > 
> > 1. Require cache.direct=on (i.e. O_DIRECT) for char devices on FreeBSD.
> >    If the user didn't set the option, print a nice error message telling
> >    them what option to set.
> > 
> > 2. If O_DIRECT isn't actually required to open the file, but you only
> >    need to make sure to use the right alignment, modify
> >    raw_probe_alignment() so that it returns an alignment > 1 even for
> >    non-O_DIRECT files on FreeBSD if they are character devices.
> > 
> > I don't know FreeBSD well enough, but if it fulfills the requirements,
> > option 2 is certainly the more elegant one.
> 
> Thanks for the review. O_DIRECT is not required to open the file, so
> option 2 seems sensible.
> 
> I've added a new flag to BDRVRawState that indicates whether the underlying
> device needs requests to be aligned. The flag is set if BDRV_O_NOCACHE is
> used, or if the OS is FreeBSD and the underlying device is a char device.
> It replaces the O_DIRECT and BDRV_O_NOCACHE checks previously used in
> raw_probe_alignment and raw_aio_submit. Does this sound OK?

Yes, this sounds reasonable to me.

Kevin

Patch

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 86ce4f2..63841dd 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -472,6 +472,18 @@  static int raw_open_common(BlockDriverState *bs, QDict *options,
         }
 #endif
     }
+#ifdef __FreeBSD__
+    if (S_ISCHR(st.st_mode)) {
+        /*
+         * The file is a char device (disk), which on FreeBSD isn't behind
+         * a pager, so set BDRV_O_NOCACHE unconditionally. This is needed
+         * so Qemu makes sure all IO operations on the device are aligned
+         * to sector size, or else FreeBSD will reject them with EINVAL.
+         */
+        bs->open_flags |= BDRV_O_NOCACHE;
+        s->open_flags |= O_DIRECT;
+    }
+#endif
 
 #ifdef CONFIG_XFS
     if (platform_test_xfs_fd(s->fd)) {