Message ID | dd5e4979-233b-76ea-806f-75422571fd6a@redhat.com |
---|---|
State | New |
Headers | show |
On 05/19/2016 04:57 PM, Florian Weimer wrote: > On 03/31/2016 12:14 PM, Florian Weimer wrote: >> On 03/18/2016 11:52 PM, Roland McGrath wrote: >> >>> Whatever the results, they would not IMHO be relevant here. >>> >>> POSIX specifies that st_blksize is the "preferred I/O block size for >>> this >>> object". It's the kernel's responsibility to give userland good advice >>> through this channel. If there are common buggy kernels that give bad >>> advice, that is a reason to apply upper and lower limits to the >>> advice from >>> the kernel. But the expectation should be that the kernel gets fixed to >>> give good advice, and the optimal thing to do with a good kernel is to >>> follow its advice. >>> >>> Since the recommended use of st_blksize in this way is a standard user >>> feature and not just what stdio's implementation happens to do, there >>> is an >>> argument to be made that the limiting of the value should be done in the >>> *stat functions reported st_blksize values rather than in stdio's use of >>> them. (I'm ambivalent about this point.) >> >> That's a good point. I'll try to get feedback from kernel file system >> developers on this matter. > > I wasn't able to get any feedback. Based on Rich's point about random > I/O and Roland's earlier suggestion, I'm just capping the reported > buffer size to 8192 in the attached patch. Ping? <https://sourceware.org/ml/libc-alpha/2016-05/msg00428.html> Thanks, Florian
On 06/24/2016 05:35 PM, Florian Weimer wrote: > On 05/19/2016 04:57 PM, Florian Weimer wrote: >> On 03/31/2016 12:14 PM, Florian Weimer wrote: >>> On 03/18/2016 11:52 PM, Roland McGrath wrote: >>> >>>> Whatever the results, they would not IMHO be relevant here. >>>> >>>> POSIX specifies that st_blksize is the "preferred I/O block size for >>>> this >>>> object". It's the kernel's responsibility to give userland good advice >>>> through this channel. If there are common buggy kernels that give bad >>>> advice, that is a reason to apply upper and lower limits to the >>>> advice from >>>> the kernel. But the expectation should be that the kernel gets >>>> fixed to >>>> give good advice, and the optimal thing to do with a good kernel is to >>>> follow its advice. >>>> >>>> Since the recommended use of st_blksize in this way is a standard user >>>> feature and not just what stdio's implementation happens to do, there >>>> is an >>>> argument to be made that the limiting of the value should be done in >>>> the >>>> *stat functions reported st_blksize values rather than in stdio's >>>> use of >>>> them. (I'm ambivalent about this point.) >>> >>> That's a good point. I'll try to get feedback from kernel file system >>> developers on this matter. >> >> I wasn't able to get any feedback. Based on Rich's point about random >> I/O and Roland's earlier suggestion, I'm just capping the reported >> buffer size to 8192 in the attached patch. > > Ping? > > <https://sourceware.org/ml/libc-alpha/2016-05/msg00428.html> Ping? Thanks, Florian
On 11/29/2016 11:50 AM, Florian Weimer wrote: > On 06/24/2016 05:35 PM, Florian Weimer wrote: >> On 05/19/2016 04:57 PM, Florian Weimer wrote: >>> On 03/31/2016 12:14 PM, Florian Weimer wrote: >>>> On 03/18/2016 11:52 PM, Roland McGrath wrote: >>>> >>>>> Whatever the results, they would not IMHO be relevant here. >>>>> >>>>> POSIX specifies that st_blksize is the "preferred I/O block size for >>>>> this >>>>> object". It's the kernel's responsibility to give userland good advice >>>>> through this channel. If there are common buggy kernels that give bad >>>>> advice, that is a reason to apply upper and lower limits to the >>>>> advice from >>>>> the kernel. But the expectation should be that the kernel gets >>>>> fixed to >>>>> give good advice, and the optimal thing to do with a good kernel is to >>>>> follow its advice. >>>>> >>>>> Since the recommended use of st_blksize in this way is a standard user >>>>> feature and not just what stdio's implementation happens to do, there >>>>> is an >>>>> argument to be made that the limiting of the value should be done in >>>>> the >>>>> *stat functions reported st_blksize values rather than in stdio's >>>>> use of >>>>> them. (I'm ambivalent about this point.) >>>> >>>> That's a good point. I'll try to get feedback from kernel file system >>>> developers on this matter. >>> >>> I wasn't able to get any feedback. Based on Rich's point about random >>> I/O and Roland's earlier suggestion, I'm just capping the reported >>> buffer size to 8192 in the attached patch. >> >> Ping? >> >> <https://sourceware.org/ml/libc-alpha/2016-05/msg00428.html> > > Ping? Just to confirm the user can still call stat themselves and adjust the buffer size up if they want? If that's the case then this looks good to me and users have a way to re-enable the old behaviour. This kind of tuning can be changed at will from release to release or patched by the distributions if they have problems with the change. I don't see any real issue here, the kernel is returning "optimal" values but this conflicts with what the user themselves want to be doing. Thus providing a fixed cap for the default is fine (as long as the user can undo the choice).
On 11/30/2016 02:15 PM, Carlos O'Donell wrote: > On 11/29/2016 11:50 AM, Florian Weimer wrote: >> On 06/24/2016 05:35 PM, Florian Weimer wrote: >>> On 05/19/2016 04:57 PM, Florian Weimer wrote: >>>> On 03/31/2016 12:14 PM, Florian Weimer wrote: >>>>> On 03/18/2016 11:52 PM, Roland McGrath wrote: >>>>> >>>>>> Whatever the results, they would not IMHO be relevant here. >>>>>> >>>>>> POSIX specifies that st_blksize is the "preferred I/O block size for >>>>>> this >>>>>> object". It's the kernel's responsibility to give userland good advice >>>>>> through this channel. If there are common buggy kernels that give bad >>>>>> advice, that is a reason to apply upper and lower limits to the >>>>>> advice from >>>>>> the kernel. But the expectation should be that the kernel gets >>>>>> fixed to >>>>>> give good advice, and the optimal thing to do with a good kernel is to >>>>>> follow its advice. >>>>>> >>>>>> Since the recommended use of st_blksize in this way is a standard user >>>>>> feature and not just what stdio's implementation happens to do, there >>>>>> is an >>>>>> argument to be made that the limiting of the value should be done in >>>>>> the >>>>>> *stat functions reported st_blksize values rather than in stdio's >>>>>> use of >>>>>> them. (I'm ambivalent about this point.) >>>>> >>>>> That's a good point. I'll try to get feedback from kernel file system >>>>> developers on this matter. >>>> >>>> I wasn't able to get any feedback. Based on Rich's point about random >>>> I/O and Roland's earlier suggestion, I'm just capping the reported >>>> buffer size to 8192 in the attached patch. >>> >>> Ping? >>> >>> <https://sourceware.org/ml/libc-alpha/2016-05/msg00428.html> >> >> Ping? > > Just to confirm the user can still call stat themselves and adjust the > buffer size up if they want? They can still install their own buffer, yes. > If that's the case then this looks good to me and users have a way > to re-enable the old behaviour. Thanks, Florian
On 11/30/2016 08:17 AM, Florian Weimer wrote:
> They can still install their own buffer, yes.
The only tradeoff I see is that you're making wasteful
write syscalls while the kernel waits for enough data
to optimally send to the target device.
You really have to know a lot about your device to have
enough information to do this optimally.
xstat()
https://lwn.net/Articles/686106/
~~~
Howells noted that Dave Chinner wanted more I/O
parameters (e.g. preferred read and write sizes,
erase block size). There were five to seven
different numbers that Chinner wanted, but those
could always be added later, he said.
~~~
And the sysfs information is useless for my SATA disks...
[carlos@athas ~]$ cat /sys/block/sda/queue/optimal_io_size
0
So we aren't yet at any kind of utopia where there is
enough information to make the right choice. Even then
the "right" choice depends on what you're trying to do
e.g. small write, large write, throughput, low-latency etc.
libio: Limit buffer size to 8192 bytes [BZ #4099] This avoids overly large buffers with network file systems which report very large block sizes. 2016-05-19 Florian Weimer <fweimer@redhat.com> [BZ #4099] * libio/filedoalloc.c (_IO_file_doallocate): Limit buffer size to _IO_BUFSIZ (8192). diff --git a/NEWS b/NEWS index b3fd3cc..dec9757 100644 --- a/NEWS +++ b/NEWS @@ -33,6 +33,12 @@ Version 2.24 group: files [SUCCESS=merge] nis Implemented by Stephen Gallagher (Red Hat). +* The buffer size for byte-oriented stdio streams is now limited to 8192 + bytes by default. Previously, on Linux, the default buffer size on most + file systems was 4096 bytes (and thus remains unchanged), except on + network file systems, where the buffer size was unpredictable and could be + as large as several megabytes. + Security related changes: * An unnecessary stack copy in _nss_dns_getnetbyname_r was removed. It diff --git a/libio/filedoalloc.c b/libio/filedoalloc.c index 4f9d738..ded0725 100644 --- a/libio/filedoalloc.c +++ b/libio/filedoalloc.c @@ -94,7 +94,7 @@ _IO_file_doallocate (_IO_FILE *fp) fp->_flags |= _IO_LINE_BUF; } #if _IO_HAVE_ST_BLKSIZE - if (st.st_blksize > 0) + if (st.st_blksize > 0 && st.st_blksize < _IO_BUFSIZ) size = st.st_blksize; #endif }