Message ID | 20190719155047.5885-2-christian.brauner@ubuntu.com |
---|---|
State | New |
Headers | show |
Series | shiftfs: bugfix and O_DIRECT support | expand |
On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner wrote: > BugLink: https://bugs.launchpad.net/bugs/1837223 > > This enabled O_DIRECT support for shiftfs if the underlay supports it. > > Currently shiftfs does not handle O_DIRECT if the underlay supports it. > This is blocking dqlite - an essential part of LXD - from profiting from > the performance benefits of O_DIRECT on suitable filesystems when used > with async io such as aio or io_uring. > Overlayfs cannot support this directly since the upper filesystem in > overlay can be any filesystem. So if the upper filesystem does not > support O_DIRECT but the lower filesystem does you're out of luck. > Shiftfs does not suffer from the same problem since there is not concept > of an upper filesystem in the same way that overlayfs has it. > Essentially, shiftfs is a transparent shim relaying everything to the > underlay while overlayfs' upper layer is not (completely). I get concerned any time shiftfs just copies up some non-trivial data from the lower filesystem, that shiftfs is going to get bypassed and some important metadata will not get propoerly updated in shiftfs. The question that immediately comes to mind in this case is whether i_size in the shiftfs indoe gets updated following an O_DIRECT write, and I suspect tha there are other similar cases to worry about. Have you considered this possibility and confirmed that no such issues exist? Thanks, Seth
On Fri, Jul 19, 2019 at 03:02:21PM -0500, Seth Forshee wrote: > On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner wrote: > > BugLink: https://bugs.launchpad.net/bugs/1837223 > > > > This enabled O_DIRECT support for shiftfs if the underlay supports it. > > > > Currently shiftfs does not handle O_DIRECT if the underlay supports it. > > This is blocking dqlite - an essential part of LXD - from profiting from > > the performance benefits of O_DIRECT on suitable filesystems when used > > with async io such as aio or io_uring. > > Overlayfs cannot support this directly since the upper filesystem in > > overlay can be any filesystem. So if the upper filesystem does not > > support O_DIRECT but the lower filesystem does you're out of luck. > > Shiftfs does not suffer from the same problem since there is not concept > > of an upper filesystem in the same way that overlayfs has it. > > Essentially, shiftfs is a transparent shim relaying everything to the > > underlay while overlayfs' upper layer is not (completely). > > I get concerned any time shiftfs just copies up some non-trivial data > from the lower filesystem, that shiftfs is going to get bypassed and > some important metadata will not get propoerly updated in shiftfs. The > question that immediately comes to mind in this case is whether i_size > in the shiftfs indoe gets updated following an O_DIRECT write, and I > suspect tha there are other similar cases to worry about. Have you I'm not following, what are "other similar cases to worry about" and are those already exposed? Re: i_size for O_DIRECT I have not seen prior art for filesystems that use iomap for O_DIRECT that would fiddle with i_size so I didn't want to get into that game. That is to say, shiftfs only supports O_DIRECT if the underlying filesystem is iomap based. This was the same approach that overlayfs wanted to use. Christian
On Mon, Jul 22, 2019 at 05:05:08PM +0200, Christian Brauner wrote: > On Fri, Jul 19, 2019 at 03:02:21PM -0500, Seth Forshee wrote: > > On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner wrote: > > > BugLink: https://bugs.launchpad.net/bugs/1837223 > > > > > > This enabled O_DIRECT support for shiftfs if the underlay supports it. > > > > > > Currently shiftfs does not handle O_DIRECT if the underlay supports it. > > > This is blocking dqlite - an essential part of LXD - from profiting from > > > the performance benefits of O_DIRECT on suitable filesystems when used > > > with async io such as aio or io_uring. > > > Overlayfs cannot support this directly since the upper filesystem in > > > overlay can be any filesystem. So if the upper filesystem does not > > > support O_DIRECT but the lower filesystem does you're out of luck. > > > Shiftfs does not suffer from the same problem since there is not concept > > > of an upper filesystem in the same way that overlayfs has it. > > > Essentially, shiftfs is a transparent shim relaying everything to the > > > underlay while overlayfs' upper layer is not (completely). > > > > I get concerned any time shiftfs just copies up some non-trivial data > > from the lower filesystem, that shiftfs is going to get bypassed and > > some important metadata will not get propoerly updated in shiftfs. The > > question that immediately comes to mind in this case is whether i_size > > in the shiftfs indoe gets updated following an O_DIRECT write, and I > > suspect tha there are other similar cases to worry about. Have you > > I'm not following, what are "other similar cases to worry about" and are > those already exposed? I just mean anything else which might change in the underlay as a result of I/O, e.g. timestamps. > Re: i_size for O_DIRECT > I have not seen prior art for filesystems that use iomap for O_DIRECT > that would fiddle with i_size so I didn't want to get into that game. > That is to say, shiftfs only supports O_DIRECT if the underlying > filesystem is iomap based. This was the same approach that overlayfs > wanted to use. Are you saying that an O_DIRECT write that extends a file does not result in i_size being updated? That doesn't sound right ...
On Thu, Jul 25, 2019 at 03:26:01PM -0500, Seth Forshee wrote: > On Mon, Jul 22, 2019 at 05:05:08PM +0200, Christian Brauner wrote: > > On Fri, Jul 19, 2019 at 03:02:21PM -0500, Seth Forshee wrote: > > > On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner wrote: > > > > BugLink: https://bugs.launchpad.net/bugs/1837223 > > > > > > > > This enabled O_DIRECT support for shiftfs if the underlay supports it. > > > > > > > > Currently shiftfs does not handle O_DIRECT if the underlay supports it. > > > > This is blocking dqlite - an essential part of LXD - from profiting from > > > > the performance benefits of O_DIRECT on suitable filesystems when used > > > > with async io such as aio or io_uring. > > > > Overlayfs cannot support this directly since the upper filesystem in > > > > overlay can be any filesystem. So if the upper filesystem does not > > > > support O_DIRECT but the lower filesystem does you're out of luck. > > > > Shiftfs does not suffer from the same problem since there is not concept > > > > of an upper filesystem in the same way that overlayfs has it. > > > > Essentially, shiftfs is a transparent shim relaying everything to the > > > > underlay while overlayfs' upper layer is not (completely). > > > > > > I get concerned any time shiftfs just copies up some non-trivial data > > > from the lower filesystem, that shiftfs is going to get bypassed and > > > some important metadata will not get propoerly updated in shiftfs. The > > > question that immediately comes to mind in this case is whether i_size > > > in the shiftfs indoe gets updated following an O_DIRECT write, and I > > > suspect tha there are other similar cases to worry about. Have you > > > > I'm not following, what are "other similar cases to worry about" and are > > those already exposed? > > I just mean anything else which might change in the underlay as a result > of I/O, e.g. timestamps. > > > Re: i_size for O_DIRECT > > I have not seen prior art for filesystems that use iomap for O_DIRECT > > that would fiddle with i_size so I didn't want to get into that game. > > That is to say, shiftfs only supports O_DIRECT if the underlying > > filesystem is iomap based. This was the same approach that overlayfs > > wanted to use. > > Are you saying that an O_DIRECT write that extends a file does not > result in i_size being updated? That doesn't sound right ... For example, when overlayfs intended to introduce support for this (which they didn't because of their version of upper and lower) they did not mess with i_size. That's one of the reasons I didn't. Do you prefer to copy up attributes to make sure we're not missing anything? Christian
On Fri, Jul 26, 2019 at 10:29:38AM +0200, Christian Brauner wrote: > On Thu, Jul 25, 2019 at 03:26:01PM -0500, Seth Forshee wrote: > > On Mon, Jul 22, 2019 at 05:05:08PM +0200, Christian Brauner wrote: > > > On Fri, Jul 19, 2019 at 03:02:21PM -0500, Seth Forshee wrote: > > > > On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner wrote: > > > > > BugLink: https://bugs.launchpad.net/bugs/1837223 > > > > > > > > > > This enabled O_DIRECT support for shiftfs if the underlay supports it. > > > > > > > > > > Currently shiftfs does not handle O_DIRECT if the underlay supports it. > > > > > This is blocking dqlite - an essential part of LXD - from profiting from > > > > > the performance benefits of O_DIRECT on suitable filesystems when used > > > > > with async io such as aio or io_uring. > > > > > Overlayfs cannot support this directly since the upper filesystem in > > > > > overlay can be any filesystem. So if the upper filesystem does not > > > > > support O_DIRECT but the lower filesystem does you're out of luck. > > > > > Shiftfs does not suffer from the same problem since there is not concept > > > > > of an upper filesystem in the same way that overlayfs has it. > > > > > Essentially, shiftfs is a transparent shim relaying everything to the > > > > > underlay while overlayfs' upper layer is not (completely). > > > > > > > > I get concerned any time shiftfs just copies up some non-trivial data > > > > from the lower filesystem, that shiftfs is going to get bypassed and > > > > some important metadata will not get propoerly updated in shiftfs. The > > > > question that immediately comes to mind in this case is whether i_size > > > > in the shiftfs indoe gets updated following an O_DIRECT write, and I > > > > suspect tha there are other similar cases to worry about. Have you > > > > > > I'm not following, what are "other similar cases to worry about" and are > > > those already exposed? > > > > I just mean anything else which might change in the underlay as a result > > of I/O, e.g. timestamps. > > > > > Re: i_size for O_DIRECT > > > I have not seen prior art for filesystems that use iomap for O_DIRECT > > > that would fiddle with i_size so I didn't want to get into that game. > > > That is to say, shiftfs only supports O_DIRECT if the underlying > > > filesystem is iomap based. This was the same approach that overlayfs > > > wanted to use. > > > > Are you saying that an O_DIRECT write that extends a file does not > > result in i_size being updated? That doesn't sound right ... > > For example, when overlayfs intended to introduce support for this > (which they didn't because of their version of upper and lower) they did > not mess with i_size. That's one of the reasons I didn't. Do you prefer > to copy up attributes to make sure we're not missing anything? I think that something like this: fd = open(path, O_RDWR|O_DIRECT); lseek(fd, 0, SEEK_END); write(fd, data, size); fstat(fd, &stat); printf("%ld\n", (long)stat.st_size); Should print out the correct size (and similarly correct data for other attributes). Maybe it will with this patch, I'm just not certain that it will, so really I'm just asking whether you've tested this sort of thing. If it doesn't work, then it seems like we probably do need to copy up the attributes. Seth
On 26.07.19 14:59, Seth Forshee wrote: > On Fri, Jul 26, 2019 at 10:29:38AM +0200, Christian Brauner wrote: >> On Thu, Jul 25, 2019 at 03:26:01PM -0500, Seth Forshee wrote: >>> On Mon, Jul 22, 2019 at 05:05:08PM +0200, Christian Brauner wrote: >>>> On Fri, Jul 19, 2019 at 03:02:21PM -0500, Seth Forshee wrote: >>>>> On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner wrote: >>>>>> BugLink: https://bugs.launchpad.net/bugs/1837223 >>>>>> >>>>>> This enabled O_DIRECT support for shiftfs if the underlay supports it. >>>>>> >>>>>> Currently shiftfs does not handle O_DIRECT if the underlay supports it. >>>>>> This is blocking dqlite - an essential part of LXD - from profiting from >>>>>> the performance benefits of O_DIRECT on suitable filesystems when used >>>>>> with async io such as aio or io_uring. >>>>>> Overlayfs cannot support this directly since the upper filesystem in >>>>>> overlay can be any filesystem. So if the upper filesystem does not >>>>>> support O_DIRECT but the lower filesystem does you're out of luck. >>>>>> Shiftfs does not suffer from the same problem since there is not concept >>>>>> of an upper filesystem in the same way that overlayfs has it. >>>>>> Essentially, shiftfs is a transparent shim relaying everything to the >>>>>> underlay while overlayfs' upper layer is not (completely). >>>>> >>>>> I get concerned any time shiftfs just copies up some non-trivial data >>>>> from the lower filesystem, that shiftfs is going to get bypassed and >>>>> some important metadata will not get propoerly updated in shiftfs. The >>>>> question that immediately comes to mind in this case is whether i_size >>>>> in the shiftfs indoe gets updated following an O_DIRECT write, and I >>>>> suspect tha there are other similar cases to worry about. Have you >>>> >>>> I'm not following, what are "other similar cases to worry about" and are >>>> those already exposed? >>> >>> I just mean anything else which might change in the underlay as a result >>> of I/O, e.g. timestamps. >>> >>>> Re: i_size for O_DIRECT >>>> I have not seen prior art for filesystems that use iomap for O_DIRECT >>>> that would fiddle with i_size so I didn't want to get into that game. >>>> That is to say, shiftfs only supports O_DIRECT if the underlying >>>> filesystem is iomap based. This was the same approach that overlayfs >>>> wanted to use. >>> >>> Are you saying that an O_DIRECT write that extends a file does not >>> result in i_size being updated? That doesn't sound right ... >> >> For example, when overlayfs intended to introduce support for this >> (which they didn't because of their version of upper and lower) they did >> not mess with i_size. That's one of the reasons I didn't. Do you prefer >> to copy up attributes to make sure we're not missing anything? > > I think that something like this: > > fd = open(path, O_RDWR|O_DIRECT); > lseek(fd, 0, SEEK_END); > write(fd, data, size); > fstat(fd, &stat); > printf("%ld\n", (long)stat.st_size); > > Should print out the correct size (and similarly correct data for other > attributes). Maybe it will with this patch, I'm just not certain that it > will, so really I'm just asking whether you've tested this sort of > thing. If it doesn't work, then it seems like we probably do need to > copy up the attributes. > > Seth > With the long discussion seemingly stopping unresolved I am not sure this patchset is ready for SRU or not. -Stefan
On Mon, Aug 12, 2019 at 02:31:38PM +0200, Stefan Bader wrote: > On 26.07.19 14:59, Seth Forshee wrote: > > On Fri, Jul 26, 2019 at 10:29:38AM +0200, Christian Brauner wrote: > >> On Thu, Jul 25, 2019 at 03:26:01PM -0500, Seth Forshee wrote: > >>> On Mon, Jul 22, 2019 at 05:05:08PM +0200, Christian Brauner wrote: > >>>> On Fri, Jul 19, 2019 at 03:02:21PM -0500, Seth Forshee wrote: > >>>>> On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner wrote: > >>>>>> BugLink: https://bugs.launchpad.net/bugs/1837223 > >>>>>> > >>>>>> This enabled O_DIRECT support for shiftfs if the underlay supports it. > >>>>>> > >>>>>> Currently shiftfs does not handle O_DIRECT if the underlay supports it. > >>>>>> This is blocking dqlite - an essential part of LXD - from profiting from > >>>>>> the performance benefits of O_DIRECT on suitable filesystems when used > >>>>>> with async io such as aio or io_uring. > >>>>>> Overlayfs cannot support this directly since the upper filesystem in > >>>>>> overlay can be any filesystem. So if the upper filesystem does not > >>>>>> support O_DIRECT but the lower filesystem does you're out of luck. > >>>>>> Shiftfs does not suffer from the same problem since there is not concept > >>>>>> of an upper filesystem in the same way that overlayfs has it. > >>>>>> Essentially, shiftfs is a transparent shim relaying everything to the > >>>>>> underlay while overlayfs' upper layer is not (completely). > >>>>> > >>>>> I get concerned any time shiftfs just copies up some non-trivial data > >>>>> from the lower filesystem, that shiftfs is going to get bypassed and > >>>>> some important metadata will not get propoerly updated in shiftfs. The > >>>>> question that immediately comes to mind in this case is whether i_size > >>>>> in the shiftfs indoe gets updated following an O_DIRECT write, and I > >>>>> suspect tha there are other similar cases to worry about. Have you > >>>> > >>>> I'm not following, what are "other similar cases to worry about" and are > >>>> those already exposed? > >>> > >>> I just mean anything else which might change in the underlay as a result > >>> of I/O, e.g. timestamps. > >>> > >>>> Re: i_size for O_DIRECT > >>>> I have not seen prior art for filesystems that use iomap for O_DIRECT > >>>> that would fiddle with i_size so I didn't want to get into that game. > >>>> That is to say, shiftfs only supports O_DIRECT if the underlying > >>>> filesystem is iomap based. This was the same approach that overlayfs > >>>> wanted to use. > >>> > >>> Are you saying that an O_DIRECT write that extends a file does not > >>> result in i_size being updated? That doesn't sound right ... > >> > >> For example, when overlayfs intended to introduce support for this > >> (which they didn't because of their version of upper and lower) they did > >> not mess with i_size. That's one of the reasons I didn't. Do you prefer > >> to copy up attributes to make sure we're not missing anything? > > > > I think that something like this: > > > > fd = open(path, O_RDWR|O_DIRECT); > > lseek(fd, 0, SEEK_END); > > write(fd, data, size); > > fstat(fd, &stat); > > printf("%ld\n", (long)stat.st_size); > > > > Should print out the correct size (and similarly correct data for other > > attributes). Maybe it will with this patch, I'm just not certain that it > > will, so really I'm just asking whether you've tested this sort of > > thing. If it doesn't work, then it seems like we probably do need to > > copy up the attributes. > > > > Seth > > > With the long discussion seemingly stopping unresolved I am not sure this > patchset is ready for SRU or not. I think we are still waiting for Christian to get back with some test results here, so these patches should wait.
After the sprint I was on vacation. I'll try to get around to it soon. Christian On August 12, 2019 4:45:19 PM GMT+02:00, Seth Forshee <seth.forshee@canonical.com> wrote: >On Mon, Aug 12, 2019 at 02:31:38PM +0200, Stefan Bader wrote: >> On 26.07.19 14:59, Seth Forshee wrote: >> > On Fri, Jul 26, 2019 at 10:29:38AM +0200, Christian Brauner wrote: >> >> On Thu, Jul 25, 2019 at 03:26:01PM -0500, Seth Forshee wrote: >> >>> On Mon, Jul 22, 2019 at 05:05:08PM +0200, Christian Brauner >wrote: >> >>>> On Fri, Jul 19, 2019 at 03:02:21PM -0500, Seth Forshee wrote: >> >>>>> On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner >wrote: >> >>>>>> BugLink: https://bugs.launchpad.net/bugs/1837223 >> >>>>>> >> >>>>>> This enabled O_DIRECT support for shiftfs if the underlay >supports it. >> >>>>>> >> >>>>>> Currently shiftfs does not handle O_DIRECT if the underlay >supports it. >> >>>>>> This is blocking dqlite - an essential part of LXD - from >profiting from >> >>>>>> the performance benefits of O_DIRECT on suitable filesystems >when used >> >>>>>> with async io such as aio or io_uring. >> >>>>>> Overlayfs cannot support this directly since the upper >filesystem in >> >>>>>> overlay can be any filesystem. So if the upper filesystem does >not >> >>>>>> support O_DIRECT but the lower filesystem does you're out of >luck. >> >>>>>> Shiftfs does not suffer from the same problem since there is >not concept >> >>>>>> of an upper filesystem in the same way that overlayfs has it. >> >>>>>> Essentially, shiftfs is a transparent shim relaying everything >to the >> >>>>>> underlay while overlayfs' upper layer is not (completely). >> >>>>> >> >>>>> I get concerned any time shiftfs just copies up some >non-trivial data >> >>>>> from the lower filesystem, that shiftfs is going to get >bypassed and >> >>>>> some important metadata will not get propoerly updated in >shiftfs. The >> >>>>> question that immediately comes to mind in this case is whether >i_size >> >>>>> in the shiftfs indoe gets updated following an O_DIRECT write, >and I >> >>>>> suspect tha there are other similar cases to worry about. Have >you >> >>>> >> >>>> I'm not following, what are "other similar cases to worry about" >and are >> >>>> those already exposed? >> >>> >> >>> I just mean anything else which might change in the underlay as a >result >> >>> of I/O, e.g. timestamps. >> >>> >> >>>> Re: i_size for O_DIRECT >> >>>> I have not seen prior art for filesystems that use iomap for >O_DIRECT >> >>>> that would fiddle with i_size so I didn't want to get into that >game. >> >>>> That is to say, shiftfs only supports O_DIRECT if the underlying >> >>>> filesystem is iomap based. This was the same approach that >overlayfs >> >>>> wanted to use. >> >>> >> >>> Are you saying that an O_DIRECT write that extends a file does >not >> >>> result in i_size being updated? That doesn't sound right ... >> >> >> >> For example, when overlayfs intended to introduce support for this >> >> (which they didn't because of their version of upper and lower) >they did >> >> not mess with i_size. That's one of the reasons I didn't. Do you >prefer >> >> to copy up attributes to make sure we're not missing anything? >> > >> > I think that something like this: >> > >> > fd = open(path, O_RDWR|O_DIRECT); >> > lseek(fd, 0, SEEK_END); >> > write(fd, data, size); >> > fstat(fd, &stat); >> > printf("%ld\n", (long)stat.st_size); >> > >> > Should print out the correct size (and similarly correct data for >other >> > attributes). Maybe it will with this patch, I'm just not certain >that it >> > will, so really I'm just asking whether you've tested this sort of >> > thing. If it doesn't work, then it seems like we probably do need >to >> > copy up the attributes. >> > >> > Seth >> > >> With the long discussion seemingly stopping unresolved I am not sure >this >> patchset is ready for SRU or not. > >I think we are still waiting for Christian to get back with some test >results here, so these patches should wait.
On Mon, Aug 12, 2019 at 09:45:19AM -0500, Seth Forshee wrote: > On Mon, Aug 12, 2019 at 02:31:38PM +0200, Stefan Bader wrote: > > On 26.07.19 14:59, Seth Forshee wrote: > > > On Fri, Jul 26, 2019 at 10:29:38AM +0200, Christian Brauner wrote: > > >> On Thu, Jul 25, 2019 at 03:26:01PM -0500, Seth Forshee wrote: > > >>> On Mon, Jul 22, 2019 at 05:05:08PM +0200, Christian Brauner wrote: > > >>>> On Fri, Jul 19, 2019 at 03:02:21PM -0500, Seth Forshee wrote: > > >>>>> On Fri, Jul 19, 2019 at 05:50:46PM +0200, Christian Brauner wrote: > > >>>>>> BugLink: https://bugs.launchpad.net/bugs/1837223 > > >>>>>> > > >>>>>> This enabled O_DIRECT support for shiftfs if the underlay supports it. > > >>>>>> > > >>>>>> Currently shiftfs does not handle O_DIRECT if the underlay supports it. > > >>>>>> This is blocking dqlite - an essential part of LXD - from profiting from > > >>>>>> the performance benefits of O_DIRECT on suitable filesystems when used > > >>>>>> with async io such as aio or io_uring. > > >>>>>> Overlayfs cannot support this directly since the upper filesystem in > > >>>>>> overlay can be any filesystem. So if the upper filesystem does not > > >>>>>> support O_DIRECT but the lower filesystem does you're out of luck. > > >>>>>> Shiftfs does not suffer from the same problem since there is not concept > > >>>>>> of an upper filesystem in the same way that overlayfs has it. > > >>>>>> Essentially, shiftfs is a transparent shim relaying everything to the > > >>>>>> underlay while overlayfs' upper layer is not (completely). > > >>>>> > > >>>>> I get concerned any time shiftfs just copies up some non-trivial data > > >>>>> from the lower filesystem, that shiftfs is going to get bypassed and > > >>>>> some important metadata will not get propoerly updated in shiftfs. The > > >>>>> question that immediately comes to mind in this case is whether i_size > > >>>>> in the shiftfs indoe gets updated following an O_DIRECT write, and I > > >>>>> suspect tha there are other similar cases to worry about. Have you > > >>>> > > >>>> I'm not following, what are "other similar cases to worry about" and are > > >>>> those already exposed? > > >>> > > >>> I just mean anything else which might change in the underlay as a result > > >>> of I/O, e.g. timestamps. > > >>> > > >>>> Re: i_size for O_DIRECT > > >>>> I have not seen prior art for filesystems that use iomap for O_DIRECT > > >>>> that would fiddle with i_size so I didn't want to get into that game. > > >>>> That is to say, shiftfs only supports O_DIRECT if the underlying > > >>>> filesystem is iomap based. This was the same approach that overlayfs > > >>>> wanted to use. > > >>> > > >>> Are you saying that an O_DIRECT write that extends a file does not > > >>> result in i_size being updated? That doesn't sound right ... > > >> > > >> For example, when overlayfs intended to introduce support for this > > >> (which they didn't because of their version of upper and lower) they did > > >> not mess with i_size. That's one of the reasons I didn't. Do you prefer > > >> to copy up attributes to make sure we're not missing anything? > > > > > > I think that something like this: > > > > > > fd = open(path, O_RDWR|O_DIRECT); > > > lseek(fd, 0, SEEK_END); > > > write(fd, data, size); > > > fstat(fd, &stat); > > > printf("%ld\n", (long)stat.st_size); > > > > > > Should print out the correct size (and similarly correct data for other > > > attributes). Maybe it will with this patch, I'm just not certain that it > > > will, so really I'm just asking whether you've tested this sort of > > > thing. If it doesn't work, then it seems like we probably do need to > > > copy up the attributes. > > > > > > Seth > > > > > With the long discussion seemingly stopping unresolved I am not sure this > > patchset is ready for SRU or not. > > I think we are still waiting for Christian to get back with some test > results here, so these patches should wait. The test outlined above shows that the correct file size is reported. Christian
diff --git a/fs/shiftfs.c b/fs/shiftfs.c index 49f6714e9f95..addaa6e21e57 100644 --- a/fs/shiftfs.c +++ b/fs/shiftfs.c @@ -1126,6 +1126,9 @@ static int shiftfs_open(struct inode *inode, struct file *file) } file->private_data = file_info; + /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO. */ + file->f_mapping = realfile->f_mapping; + file_info->realfile = realfile; return 0; }
BugLink: https://bugs.launchpad.net/bugs/1837223 This enabled O_DIRECT support for shiftfs if the underlay supports it. Currently shiftfs does not handle O_DIRECT if the underlay supports it. This is blocking dqlite - an essential part of LXD - from profiting from the performance benefits of O_DIRECT on suitable filesystems when used with async io such as aio or io_uring. Overlayfs cannot support this directly since the upper filesystem in overlay can be any filesystem. So if the upper filesystem does not support O_DIRECT but the lower filesystem does you're out of luck. Shiftfs does not suffer from the same problem since there is not concept of an upper filesystem in the same way that overlayfs has it. Essentially, shiftfs is a transparent shim relaying everything to the underlay while overlayfs' upper layer is not (completely). Cc: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> --- fs/shiftfs.c | 3 +++ 1 file changed, 3 insertions(+)