Message ID | 20150424134516.6795441F484D0@oldenburg.str.redhat.com |
---|---|
State | New |
On 04/24/2015 02:53 PM, Florian Weimer wrote:
> The previous implementation could result in silent data corruption,
> and this has been observed to happen with application code.

I'd appreciate some comment on this patch. Do you agree that this is
the right approach?

On 05/05/2015 08:37 AM, Florian Weimer wrote:
> On 04/24/2015 02:53 PM, Florian Weimer wrote:
>> The previous implementation could result in silent data corruption,
>> and this has been observed to happen with application code.
>
> I'd appreciate some comment on this patch. Do you agree that this is
> the right approach?
>

I just now read the bug report and patch and agree that the patch is the
right way to go.

On 04/24/2015 08:53 AM, Florian Weimer wrote:
> The previous implementation could result in silent data corruption,
> and this has been observed to happen with application code.

In principle I agree with the removal of all of the fallback fallocate
code, it simply can't work reliably, and a reliable solution is ridiculously
expensive (see Rich's comments in the BZ about CAS over all the mmap'd pages).

The bug with O_APPEND files is real, and yet another reason to remove the
fallback code.

My opinion is that some of the failure modes talked about in the bugzilla
are invalid, for example having another thread or process calling truncate
is already a race condition, you don't need the fallocate fallback to
expose a race that corrupts data. The other thread might truncate after you
had written data to the file, resulting in the loss of data since there was
no synchronization. If there was synchronization then there would be no
problem since the thread calling truncate would wait for posix_fallocate to
complete before truncating.

The other side of the coin is that POSIX goes on further to say in
"2.9.7 Thread Interactions with Regular File Operations" that threads
should never see interleaving sets of file operations, but it is insane
to do anything like that because it kills performance, so you don't get
those guarantees in Linux.

What worries me though is that this change could break existing systems
that relied on this emulation to do something sensible for filesystems
that don't support fallocate. These binaries could easily be single threaded
systems with no other process touching their files and writing to filesystems
that don't support fallocate. If that is a sensible class of users, then we
need to version the interface, with the old version continuing to call the
fallback code and the new version not calling the fallback code.

In summary: OK to checkin as long as you version the interface to prevent
breaking existing applications. Unless you can show that all filesystems
a sensible person might care about support fallocate, making versioning
a waste of time.

Thoughts?

Cheers,
Carlos.

On Tue, May 05, 2015 at 04:28:41PM -0400, Carlos O'Donell wrote:
> The other side of the coin is that POSIX goes on further to say in
> "2.9.7 Thread Interactions with Regular File Operations" that threads
> should never see interleaving sets of file operations, but it is insane
> to do anything like that because it kills performance, so you don't get
> those guarantees in Linux.

Which specific guarantees do you see violated with a sane filesystem like
XFS?

On 05/05/2015 10:28 PM, Carlos O'Donell wrote:
> On 04/24/2015 08:53 AM, Florian Weimer wrote:
>> The previous implementation could result in silent data corruption,
>> and this has been observed to happen with application code.
>
> In principle I agree with the removal of all of the fallback fallocate
> code, it simply can't work reliably, and a reliable solution is ridiculously
> expensive (see Rich's comments in the BZ about CAS over all the mmap'd pages).

It's also not covered by the memory model, I think.

> The bug with O_APPEND files is real, and yet another reason to remove the
> fallback code.

We should handle that better at the very least.

We could clear O_APPEND, but only in single-threaded mode; I don't think
it's worth the effort. Re-opening the descriptor through /proc/self/fd
does not work because closing that descriptor would release POSIX
advisory locks.

> What worries me though is that this change could break existing systems
> that relied on this emulation to do something sensible for filesystems
> that don't support fallocate. These binaries could easily be single threaded
> systems with no other process touching their files and writing to filesystems
> that don't support fallocate. If that is a sensible class of users, then we
> need to version the interface, with the old version continuing to call the
> fallback code and the new version not calling the fallback code.

After sleeping over your comments, I actually did my homework. The gist
is that we cannot remove fallback, I think not even with the
compatibility symbol.

Various file systems do not support fallocate. This includes NFS, where
even the most recent version makes it optional to implement in the server.

SQLite ignores the posix_fallocate return value, but MariaDB does not.
A recompiled MariaDB would suddenly start to fail, and the DBA would
have to disable pre-allocation in the configuration. If I read the
source correctly, systemd-journald will stop logging, and there is no
knob to turn off fallocate. Same for libvirt, it will fail to create
backing files for storage devices.

Both MariaDB and libvirt are often run on NFS storage, so a glibc change
which removes fallback would actually affect them. For the code we
ship, we can move the fallback to the applications, but there is no good
way to make sure that happens with third-party applications. I do not
believe the compatibility symbol mechanism is a good alternative because
the breakage will be file-system-dependent and may not be noticed during
testing. (I'm generally skeptical of using compatibility symbols this way.)

Maybe we could remove the write loop and perform only an ftruncate call
which (hopefully) increases the file size. This would take care of the
O_APPEND issue and remove most of the races. Using posix_fallocate to
avoid ENOSPC later would not work, but with thin provisioning,
deduplicating storage and compression going around these days, I don't
think writing zero blocks has that effect in practice anyway
(particularly not on NFS). I'll ask around.

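The O_APPEND problem discussed above can be reproduced outside glibc. The sketch below is only an illustration: the function name `size_after_append_pwrite` and the temporary path are made up for this example and are not part of the patch. It relies on the behavior documented in the BUGS section of the Linux pwrite(2) manual page, namely that on a descriptor with O_APPEND set, pwrite appends at the end of the file regardless of the offset argument. The one-byte pwrite is the same kind of write the emulation loop issues.

```c
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo helper, not glibc code.  Writes 10 bytes to a
   temporary file, switches the descriptor to O_APPEND, then issues the
   same kind of one-byte pwrite at offset 0 that the emulation loop
   uses.  Returns the resulting file size, or -1 on error.  */
off_t
size_after_append_pwrite (void)
{
  char path[] = "/tmp/append-demo-XXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);

  off_t size = -1;
  struct stat st;
  int flags = fcntl (fd, F_GETFL);
  if (flags != -1
      && write (fd, "0123456789", 10) == 10
      && fcntl (fd, F_SETFL, flags | O_APPEND) == 0
      && pwrite (fd, "", 1, 0) == 1   /* Offset 0 is requested here.  */
      && fstat (fd, &st) == 0)
    {
      /* Ten bytes were written, so the file cannot be smaller.  */
      assert (st.st_size >= 10);
      size = st.st_size;
    }
  close (fd);
  return size;
}
```

On a system that honors the offset, the file stays at 10 bytes and its first byte is overwritten. On Linux the byte is appended instead, so the file grows to 11 bytes: the emulation would lengthen the file with stray zero bytes rather than touch the blocks it meant to allocate.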
On 05/05/2015 04:48 PM, Christoph Hellwig wrote:
> On Tue, May 05, 2015 at 04:28:41PM -0400, Carlos O'Donell wrote:
>> The other side of the coin is that POSIX goes on further to say in
>> "2.9.7 Thread Interactions with Regular File Operations" that threads
>> should never see interleaving sets of file operations, but it is insane
>> to do anything like that because it kills performance, so you don't get
>> those guarantees in Linux.
>
> Which specific guarantees do you see violated with a sane filesystem like
> XFS?

I have not verified that XFS behaves as is expected by POSIX, but I was
going by Linus's comments when this issue was discussed and then fixed
in 3.14.

In particular:
http://article.gmane.org/gmane.linux.kernel/398249

With the original thread here:
http://thread.gmane.org/gmane.linux.kernel/397980

Would an fstat on XFS show the in-progress IO being done by a call to
write? If it does, then it violates POSIX, which requires that none
or all of the write show up in the fstat call.

The standard statement in question is:
~~~
2.9.7 Thread Interactions with Regular File Operations
All of the functions chmod( ), close( ), fchmod( ), fcntl( ), fstat( ),
ftruncate( ), lseek( ), open( ), read( ), readlink( ), stat( ), symlink( ),
and write( ) shall be atomic with respect to each other in the effects
specified in IEEE Std 1003.1-2001 when they operate on regular files. If two
threads each call one of these functions, each call shall either see all of
the specified effects of the other call, or none of them.
~~~

Cheers,
Carlos.

Florian Weimer wrote:
> Maybe we could remove the write loop and perform only an ftruncate call
> which (hopefully) increases the file size. This would take care of the
> O_APPEND issue and remove most of the races.

I like this idea.

> Using posix_fallocate to
> avoid ENOSPC later would not work, but with thin provisioning,
> deduplicating storage and compression going around these days, I don't
> think writing zero blocks has that effect in practice anyway

That's right.

> (particularly not on NFS).

It's in draft NFS v4.2 as the ALLOCATE operation; see:

https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-38

This is pretty much bleeding-edge of course.

On Wed, May 06, 2015 at 04:58:56PM -0400, Carlos O'Donell wrote:
> On 05/05/2015 04:48 PM, Christoph Hellwig wrote:
> > On Tue, May 05, 2015 at 04:28:41PM -0400, Carlos O'Donell wrote:
> >> The other side of the coin is that POSIX goes on further to say in
> >> "2.9.7 Thread Interactions with Regular File Operations" that threads
> >> should never see interleaving sets of file operations, but it is insane
> >> to do anything like that because it kills performance, so you don't get
> >> those guarantees in Linux.
> >
> > Which specific guarantees do you see violated with a sane filesystem like
> > XFS?
>
> I have not verified that XFS behaves as is expected by POSIX, but I was
> going by Linus's comments when this issue was discussed and then fixed
> in 3.14.
>
> In particular:
> http://article.gmane.org/gmane.linux.kernel/398249
>
> With the original thread here:
> http://thread.gmane.org/gmane.linux.kernel/397980
>
> Would an fstat on XFS show the in-progress IO being done by a call to
> write? If it does, then it violates POSIX, which requires that none
> or all of the write show up in the fstat call.
>
> The standard statement in question is:
> ~~~
> 2.9.7 Thread Interactions with Regular File Operations
> All of the functions chmod( ), close( ), fchmod( ), fcntl( ), fstat( ),
> ftruncate( ), lseek( ), open( ), read( ), readlink( ), stat( ), symlink( ),
> and write( ) shall be atomic with respect to each other in the effects
> specified in IEEE Std 1003.1-2001 when they operate on regular files. If two
> threads each call one of these functions, each call shall either see all of
> the specified effects of the other call, or none of them.
> ~~~

I'm pretty sure Linux has a lot of bugs in this regard. Unless the
standard is to be relaxed, I think the right solution is either for
the kernel to simulate atomicity or to break out of the long write and
return a short write when another operation tries to access the file
state while it's in progress. Sadly there does not seem to be anything
userspace can do to work around the kernel bugs, though.

Rich

On Wed, May 06, 2015 at 03:48:38PM -0700, Paul Eggert wrote:
> Florian Weimer wrote:
> >Maybe we could remove the write loop and perform only an ftruncate call
> >which (hopefully) increases the file size. This would take care of the
> >O_APPEND issue and remove most of the races.
>
> I like this idea.

If I'm not mistaken ftruncate could still reduce the file size if it
races with another operation that would extend the file. This is also
a data loss bug.

Rich

> If I'm not mistaken ftruncate could still reduce the file size if it
> races with another operation that would extend the file. This is also
> a data loss bug.

I concur.

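The race described here comes from a check-then-truncate pattern, the same shape as the `len == 0` branch of the emulation quoted in the patch. A minimal sketch follows, assuming hypothetical helper names (`extend_to`, `demo_extend`) that do not appear in glibc; it is not the proposed implementation, only an illustration of where the window sits.

```c
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: grow FD to at least SIZE bytes.  Returns 0 on
   success or an error number, mirroring posix_fallocate's convention.  */
int
extend_to (int fd, off_t size)
{
  struct stat st;

  assert (size >= 0);
  if (fstat (fd, &st) != 0)
    return errno;
  if (st.st_size >= size)
    return 0;                   /* Already large enough.  */

  /* Race window: if another thread or process extends the file past
     SIZE between the fstat above and the ftruncate below, ftruncate
     shrinks the file back to SIZE and the newly written data is lost.  */
  if (ftruncate (fd, size) != 0)
    return errno;
  return 0;
}

/* Hypothetical single-threaded demo: grow a temporary file to 4096
   bytes, then ask for 100 bytes and confirm the file is not shrunk.
   Returns the final size, or -1 on error.  */
off_t
demo_extend (void)
{
  char path[] = "/tmp/extend-demo-XXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);

  off_t size = -1;
  struct stat st;
  if (extend_to (fd, 4096) == 0
      && extend_to (fd, 100) == 0
      && fstat (fd, &st) == 0)
    size = st.st_size;
  close (fd);
  return size;
}
```

With a single thread the helper never shrinks the file, which is why the problem does not show up in simple testing. The size check and the truncation are two separate system calls, so user space has no way to make them atomic.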
On 05/06/2015 03:19 AM, Florian Weimer wrote:
> On 05/05/2015 10:28 PM, Carlos O'Donell wrote:
>> On 04/24/2015 08:53 AM, Florian Weimer wrote:
>>> The previous implementation could result in silent data corruption,
>>> and this has been observed to happen with application code.
>>
>> In principle I agree with the removal of all of the fallback fallocate
>> code, it simply can't work reliably, and a reliable solution is ridiculously
>> expensive (see Rich's comments in the BZ about CAS over all the mmap'd pages).
>
> It's also not covered by the memory model, I think.
>
>> The bug with O_APPEND files is real, and yet another reason to remove the
>> fallback code.
>
> We should handle that better at the very least.
>
> We could clear O_APPEND, but only in single-threaded mode; I don't think
> it's worth the effort. Re-opening the descriptor through /proc/self/fd
> does not work because closing that descriptor would release POSIX
> advisory locks.

I do not think we need to do that, and I agree with some of your comments
below. Keep in mind that we need only assure that subsequent writes succeed
and that the file is the right length on the filesystem. This in my mind
means we need only call `ftruncate` successfully.

>> What worries me though is that this change could break existing systems
>> that relied on this emulation to do something sensible for filesystems
>> that don't support fallocate. These binaries could easily be single threaded
>> systems with no other process touching their files and writing to filesystems
>> that don't support fallocate. If that is a sensible class of users, then we
>> need to version the interface, with the old version continuing to call the
>> fallback code and the new version not calling the fallback code.
>
> After sleeping over your comments, I actually did my homework. The gist
> is that we cannot remove fallback, I think not even with the
> compatibility symbol.
>
> Various file systems do not support fallocate. This includes NFS, where
> even the most recent version makes it optional to implement in the server.

OK.

> SQLite ignores the posix_fallocate return value, but MariaDB does not.
> A recompiled MariaDB would suddenly start to fail, and the DBA would
> have to disable pre-allocation in the configuration. If I read the
> source correctly, systemd-journald will stop logging, and there is no
> knob to turn off fallocate. Same for libvirt, it will fail to create
> backing files for storage devices.

OK.

> Both MariaDB and libvirt are often run on NFS storage, so a glibc change
> which removes fallback would actually affect them. For the code we
> ship, we can move the fallback to the applications, but there is no good
> way to make sure that happens with third-party applications. I do not
> believe the compatibility symbol mechanism is a good alternative because
> the breakage will be file-system-dependent and may not be noticed during
> testing. (I'm generally skeptical of using compatibility symbols this way.)

That is a difference of opinion, but I buy your analysis, despite our best
efforts with compatibility symbols the NFS use case would remain and users
would see failures everywhere after a recompilation. It would not be prudent
of us to do this, and it is exactly what I worried about.

> Maybe we could remove the write loop and perform only an ftruncate call
> which (hopefully) increases the file size. This would take care of the
> O_APPEND issue and remove most of the races. Using posix_fallocate to
> avoid ENOSPC later would not work, but with thin provisioning,
> deduplicating storage and compression going around these days, I don't
> think writing zero blocks has that effect in practice anyway
> (particularly not on NFS). I'll ask around.

I agree. I was thinking exactly the same thing when I saw the write loop.

Unfortunately only fallocate at the kernel fs layer is going to guarantee
you never see ENOSPC in all reasonable situations.

Cheers,
Carlos.

On 05/07/2015 08:19 PM, Roland McGrath wrote:
>> If I'm not mistaken ftruncate could still reduce the file size if it
>> races with another operation that would extend the file. This is also
>> a data loss bug.
>
> I concur.

It happens with length == 0. We could error out with EINVAL instead of
calling ftruncate.

Daniel Berrange pointed me to these bugs:

https://sourceware.org/bugzilla/show_bug.cgi?id=17322
https://bugzilla.redhat.com/show_bug.cgi?id=1140250
https://bugzilla.redhat.com/show_bug.cgi?id=1077068

This suggests that people actually rely on the current allocation
behavior. Combined with my previous analysis that applications will
start to fail if we remove the fallback and return EINVAL, I now think
we need to keep the allocation loop.

I don't like this situation. It's a strong argument against providing
approximate user-space emulation (setxid is another example, I'm sure
there are others). These experiences may be relevant to the getrandom
debate.

I'm working on a patch with a few minor fixes to posix_fallocate and an
update to the manual. I don't think we can do better at present,
unfortunately.

diff --git a/ChangeLog b/ChangeLog
index b927022..9219d8b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,23 @@
 2015-04-24 Florian Weimer <fweimer@redhat.com>
 
+	[BZ#15661]
+	* sysdeps/posix/posix_fallocate.c: Remove.
+	* sysdeps/posix/posix_fallocate64.c: Likewise.
+	* sysdeps/unix/sysv/linux/posix_fallocate.c (posix_fallocate):
+	Remove internal_fallocate function and fallback.
+	* sysdeps/unix/sysv/linux/posix_fallocate64.c
+	(__posix_fallocate64_l64): Likewise. Establish aliases previously
+	defined in sysdeps/posix/posix_fallocate64.c.
+	* sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate.c
+	(posix_fallocate): Remove internal_fallocate function and
+	fallback.
+	* sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate64.c
+	(__posix_fallocate64_l64): Likewise.
+	* sysdeps/unix/sysv/linux/wordsize-64/posix_fallocate.c
+	(posix_fallocate): Likewise.
+
+2015-04-24 Florian Weimer <fweimer@redhat.com>
+
 	* sysdeps/unix/sysv/linux/posix_fallocate.c (posix_fallocate):
 	Assume __ASSUME_FALLOCATE is always true.
 	* sysdeps/unix/sysv/linux/posix_fallocate64.c
diff --git a/NEWS b/NEWS
index ccc4d13..016629f 100644
--- a/NEWS
+++ b/NEWS
@@ -9,14 +9,14 @@ Version 2.22
 
 * The following bugs are resolved with this release:
 
-  4719, 6792, 13064, 14094, 14841, 14906, 15319, 15467, 15790, 15969, 16351,
-  16512, 16560, 16783, 16850, 17090, 17195, 17269, 17523, 17542, 17569,
-  17588, 17596, 17620, 17621, 17628, 17631, 17711, 17776, 17779, 17792,
-  17836, 17912, 17916, 17930, 17932, 17944, 17949, 17964, 17965, 17967,
-  17969, 17978, 17987, 17991, 17996, 17998, 17999, 18019, 18020, 18029,
-  18030, 18032, 18036, 18038, 18039, 18042, 18043, 18046, 18047, 18068,
-  18080, 18093, 18100, 18104, 18110, 18111, 18128, 18138, 18185, 18197,
-  18206, 18210, 18211, 18247, 18287.
+  4719, 6792, 13064, 14094, 14841, 14906, 15319, 15467, 15661, 15790, 15969,
+  16351, 16512, 16560, 16783, 16850, 17090, 17195, 17269, 17523, 17542,
+  17569, 17588, 17596, 17620, 17621, 17628, 17631, 17711, 17776, 17779,
+  17792, 17836, 17912, 17916, 17930, 17932, 17944, 17949, 17964, 17965,
+  17967, 17969, 17978, 17987, 17991, 17996, 17998, 17999, 18019, 18020,
+  18029, 18030, 18032, 18036, 18038, 18039, 18042, 18043, 18046, 18047,
+  18068, 18080, 18093, 18100, 18104, 18110, 18111, 18128, 18138, 18185,
+  18197, 18206, 18210, 18211, 18247, 18287.
 
 * A buffer overflow in gethostbyname_r and related functions performing DNS
   requests has been fixed. If the NSS functions were called with a
@@ -25,6 +25,12 @@ Version 2.22
   potentially arbitrary code execution, using crafted, but syntactically
   valid DNS responses. (CVE-2015-1781)
 
+* The fallback emulation of posix_fallocate and posix_fallocate64 was
+  removed because it could result in silent data corruption on file systems
+  which do not implement fallocate support in the kernel. posix_fallocate
+  and posix_fallocate64 will now fail and return ENOTSUP if the file system
+  does not support fallocate operations.
+
 * A powerpc and powerpc64 optimization for TLS, similar to TLS descriptors
   for LD and GD on x86 and x86-64, has been implemented. You will need
   binutils-2.24 or later to enable this optimization.
diff --git a/sysdeps/posix/posix_fallocate.c b/sysdeps/posix/posix_fallocate.c
deleted file mode 100644
index d15d603..0000000
--- a/sysdeps/posix/posix_fallocate.c
+++ /dev/null
@@ -1,93 +0,0 @@
-/* Copyright (C) 2000-2015 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>. */
-
-#include <errno.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include <sys/stat.h>
-#include <sys/statfs.h>
-
-/* Reserve storage for the data of the file associated with FD. */
-
-int
-posix_fallocate (int fd, __off_t offset, __off_t len)
-{
-  struct stat64 st;
-  struct statfs f;
-
-  /* `off_t' is a signed type. Therefore we can determine whether
-     OFFSET + LEN is too large if it is a negative value. */
-  if (offset < 0 || len < 0)
-    return EINVAL;
-  if (offset + len < 0)
-    return EFBIG;
-
-  /* First thing we have to make sure is that this is really a regular
-     file. */
-  if (__fxstat64 (_STAT_VER, fd, &st) != 0)
-    return EBADF;
-  if (S_ISFIFO (st.st_mode))
-    return ESPIPE;
-  if (! S_ISREG (st.st_mode))
-    return ENODEV;
-
-  if (len == 0)
-    {
-      if (st.st_size < offset)
-        {
-          int ret = __ftruncate (fd, offset);
-
-          if (ret != 0)
-            ret = errno;
-          return ret;
-        }
-      return 0;
-    }
-
-  /* We have to know the block size of the filesystem to get at least some
-     sort of performance. */
-  if (__fstatfs (fd, &f) != 0)
-    return errno;
-
-  /* Try to play safe. */
-  if (f.f_bsize == 0)
-    f.f_bsize = 512;
-
-  /* Write something to every block. */
-  for (offset += (len - 1) % f.f_bsize; len > 0; offset += f.f_bsize)
-    {
-      len -= f.f_bsize;
-
-      if (offset < st.st_size)
-        {
-          unsigned char c;
-          ssize_t rsize = __pread (fd, &c, 1, offset);
-
-          if (rsize < 0)
-            return errno;
-          /* If there is a non-zero byte, the block must have been
-             allocated already. */
-          else if (rsize == 1 && c != 0)
-            continue;
-        }
-
-      if (__pwrite (fd, "", 1, offset) != 1)
-        return errno;
-    }
-
-  return 0;
-}
diff --git a/sysdeps/posix/posix_fallocate64.c b/sysdeps/posix/posix_fallocate64.c
deleted file mode 100644
index b845df7..0000000
--- a/sysdeps/posix/posix_fallocate64.c
+++ /dev/null
@@ -1,113 +0,0 @@
-/* Copyright (C) 2000-2015 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>. */
-
-#include <errno.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include <sys/stat.h>
-#include <sys/statfs.h>
-
-/* Reserve storage for the data of the file associated with FD. */
-
-int
-__posix_fallocate64_l64 (int fd, __off64_t offset, __off64_t len)
-{
-  struct stat64 st;
-  struct statfs64 f;
-
-  /* `off64_t' is a signed type. Therefore we can determine whether
-     OFFSET + LEN is too large if it is a negative value. */
-  if (offset < 0 || len < 0)
-    return EINVAL;
-  if (offset + len < 0)
-    return EFBIG;
-
-  /* First thing we have to make sure is that this is really a regular
-     file. */
-  if (__fxstat64 (_STAT_VER, fd, &st) != 0)
-    return EBADF;
-  if (S_ISFIFO (st.st_mode))
-    return ESPIPE;
-  if (! S_ISREG (st.st_mode))
-    return ENODEV;
-
-  if (len == 0)
-    {
-      if (st.st_size < offset)
-        {
-          int ret = __ftruncate64 (fd, offset);
-
-          if (ret != 0)
-            ret = errno;
-          return ret;
-        }
-      return 0;
-    }
-
-  /* We have to know the block size of the filesystem to get at least some
-     sort of performance. */
-  if (__fstatfs64 (fd, &f) != 0)
-    return errno;
-
-  /* Try to play safe. */
-  if (f.f_bsize == 0)
-    f.f_bsize = 512;
-
-  /* Write something to every block. */
-  for (offset += (len - 1) % f.f_bsize; len > 0; offset += f.f_bsize)
-    {
-      len -= f.f_bsize;
-
-      if (offset < st.st_size)
-        {
-          unsigned char c;
-          ssize_t rsize = __libc_pread64 (fd, &c, 1, offset);
-
-          if (rsize < 0)
-            return errno;
-          /* If there is a non-zero byte, the block must have been
-             allocated already. */
-          else if (rsize == 1 && c != 0)
-            continue;
-        }
-
-      if (__libc_pwrite64 (fd, "", 1, offset) != 1)
-        return errno;
-    }
-
-  return 0;
-}
-
-#undef __posix_fallocate64_l64
-#include <shlib-compat.h>
-#include <bits/wordsize.h>
-
-#if __WORDSIZE == 32 && SHLIB_COMPAT(libc, GLIBC_2_2, GLIBC_2_3_3)
-
-int
-attribute_compat_text_section
-__posix_fallocate64_l32 (int fd, off64_t offset, size_t len)
-{
-  return __posix_fallocate64_l64 (fd, offset, len);
-}
-
-versioned_symbol (libc, __posix_fallocate64_l64, posix_fallocate64,
-                  GLIBC_2_3_3);
-compat_symbol (libc, __posix_fallocate64_l32, posix_fallocate64, GLIBC_2_2);
-#else
-strong_alias (__posix_fallocate64_l64, posix_fallocate64);
-#endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate.c b/sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate.c
index a9c8d73..5d926f5 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate.c
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate.c
@@ -18,10 +18,6 @@
 #include <fcntl.h>
 #include <sysdep.h>
 
-#define posix_fallocate static internal_fallocate
-#include <sysdeps/posix/posix_fallocate.c>
-#undef posix_fallocate
-
 /* Reserve storage for the data of the file associated with FD. */
 int
 posix_fallocate (int fd, __off_t offset, __off_t len)
@@ -31,7 +27,5 @@ posix_fallocate (int fd, __off_t offset, __off_t len)
   if (! INTERNAL_SYSCALL_ERROR_P (res, err))
     return 0;
 
-  if (INTERNAL_SYSCALL_ERRNO (res, err) != EOPNOTSUPP)
-    return INTERNAL_SYSCALL_ERRNO (res, err);
-  return internal_fallocate (fd, offset, len);
+  return INTERNAL_SYSCALL_ERRNO (res, err);
 }
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate64.c b/sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate64.c
index 503e918..5d3a636 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate64.c
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/posix_fallocate64.c
@@ -18,11 +18,6 @@
 #include <fcntl.h>
 #include <sysdep.h>
 
-extern int __posix_fallocate64_l64 (int fd, __off64_t offset, __off64_t len);
-#define __posix_fallocate64_l64 static internal_fallocate64
-#include <sysdeps/posix/posix_fallocate64.c>
-#undef __posix_fallocate64_l64
-
 /* Reserve storage for the data of the file associated with FD. */
 int
 __posix_fallocate64_l64 (int fd, __off64_t offset, __off64_t len)
@@ -32,7 +27,5 @@ __posix_fallocate64_l64 (int fd, __off64_t offset, __off64_t len)
   if (! INTERNAL_SYSCALL_ERROR_P (res, err))
     return 0;
 
-  if (INTERNAL_SYSCALL_ERRNO (res, err) != EOPNOTSUPP)
-    return INTERNAL_SYSCALL_ERRNO (res, err);
-  return internal_fallocate64 (fd, offset, len);
+  return INTERNAL_SYSCALL_ERRNO (res, err);
 }
diff --git a/sysdeps/unix/sysv/linux/posix_fallocate.c b/sysdeps/unix/sysv/linux/posix_fallocate.c
index 4587029..b6124db 100644
--- a/sysdeps/unix/sysv/linux/posix_fallocate.c
+++ b/sysdeps/unix/sysv/linux/posix_fallocate.c
@@ -18,10 +18,6 @@
 #include <fcntl.h>
 #include <sysdep.h>
 
-#define posix_fallocate static internal_fallocate
-#include <sysdeps/posix/posix_fallocate.c>
-#undef posix_fallocate
-
 /* Reserve storage for the data of the file associated with FD. */
 int
 posix_fallocate (int fd, __off_t offset, __off_t len)
@@ -33,7 +29,5 @@ posix_fallocate (int fd, __off_t offset, __off_t len)
   if (! INTERNAL_SYSCALL_ERROR_P (res, err))
     return 0;
 
-  if (INTERNAL_SYSCALL_ERRNO (res, err) != EOPNOTSUPP)
-    return INTERNAL_SYSCALL_ERRNO (res, err);
-  return internal_fallocate (fd, offset, len);
+  return INTERNAL_SYSCALL_ERRNO (res, err);
 }
diff --git a/sysdeps/unix/sysv/linux/posix_fallocate64.c b/sysdeps/unix/sysv/linux/posix_fallocate64.c
index 771e59c..97c5a57 100644
--- a/sysdeps/unix/sysv/linux/posix_fallocate64.c
+++ b/sysdeps/unix/sysv/linux/posix_fallocate64.c
@@ -15,14 +15,11 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>. */
 
+#include <bits/wordsize.h>
 #include <fcntl.h>
+#include <shlib-compat.h>
 #include <sysdep.h>
 
-extern int __posix_fallocate64_l64 (int fd, __off64_t offset, __off64_t len);
-#define __posix_fallocate64_l64 static internal_fallocate64
-#include <sysdeps/posix/posix_fallocate64.c>
-#undef __posix_fallocate64_l64
-
 /* Reserve storage for the data of the file associated with FD. */
 int
 __posix_fallocate64_l64 (int fd, __off64_t offset, __off64_t len)
@@ -36,7 +33,20 @@ __posix_fallocate64_l64 (int fd, __off64_t offset, __off64_t len)
   if (! INTERNAL_SYSCALL_ERROR_P (res, err))
     return 0;
 
-  if (INTERNAL_SYSCALL_ERRNO (res, err) != EOPNOTSUPP)
-    return INTERNAL_SYSCALL_ERRNO (res, err);
-  return internal_fallocate64 (fd, offset, len);
+  return INTERNAL_SYSCALL_ERRNO (res, err);
+}
+
+#if __WORDSIZE == 32 && SHLIB_COMPAT(libc, GLIBC_2_2, GLIBC_2_3_3)
+int
+attribute_compat_text_section
+__posix_fallocate64_l32 (int fd, off64_t offset, size_t len)
+{
+  return __posix_fallocate64_l64 (fd, offset, len);
 }
+
+versioned_symbol (libc, __posix_fallocate64_l64, posix_fallocate64,
+                  GLIBC_2_3_3);
+compat_symbol (libc, __posix_fallocate64_l32, posix_fallocate64, GLIBC_2_2);
+#else
+strong_alias (__posix_fallocate64_l64, posix_fallocate64);
+#endif
diff --git a/sysdeps/unix/sysv/linux/wordsize-64/posix_fallocate.c b/sysdeps/unix/sysv/linux/wordsize-64/posix_fallocate.c
index 8ae8a29..992d8cb 100644
--- a/sysdeps/unix/sysv/linux/wordsize-64/posix_fallocate.c
+++ b/sysdeps/unix/sysv/linux/wordsize-64/posix_fallocate.c
@@ -16,13 +16,10 @@
    <http://www.gnu.org/licenses/>. */
 
 #include <fcntl.h>
+#include <errno.h>
 #include <kernel-features.h>
 #include <sysdep.h>
 
-#define posix_fallocate static internal_fallocate
-#include <sysdeps/posix/posix_fallocate.c>
-#undef posix_fallocate
-
 /* The alpha architecture introduced the fallocate system call in
    2.6.33-rc1, so we still need the fallback code. */
 #if !defined __ASSUME_FALLOCATE && defined __NR_fallocate
@@ -56,11 +53,10 @@ posix_fallocate (int fd, __off_t offset, __off_t len)
         __have_fallocate = -1;
       else
 # endif
-        if (INTERNAL_SYSCALL_ERRNO (res, err) != EOPNOTSUPP)
-          return INTERNAL_SYSCALL_ERRNO (res, err);
+        return INTERNAL_SYSCALL_ERRNO (res, err);
     }
 #endif
 
-  return internal_fallocate (fd, offset, len);
+  return ENOSYS;
 }
 weak_alias (posix_fallocate, posix_fallocate64)
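
With the fallback removed as in this patch, a caller on a file system without fallocate support would get an error number back instead of an emulated allocation. The sketch below shows one way an application might tell "not supported" apart from a real failure. The names `try_reserve` and `demo_reserve` and the result enum are made up for this example. The only library fact it depends on is that posix_fallocate returns the error number directly rather than setting errno. The error values checked are the ones visible in the patch: EOPNOTSUPP from the system call, and ENOSYS from the wordsize-64 variant.

```c
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result type for this example only.  */
enum reserve_result { RESERVED, UNSUPPORTED, FAILED };

/* Hypothetical wrapper: try to reserve LEN bytes at OFFSET and classify
   the outcome.  The raw error number is stored in *ERR.  */
enum reserve_result
try_reserve (int fd, off_t offset, off_t len, int *err)
{
  assert (err != NULL);

  /* posix_fallocate reports errors through its return value.  */
  int ret = posix_fallocate (fd, offset, len);
  *err = ret;
  if (ret == 0)
    return RESERVED;
  if (ret == EOPNOTSUPP || ret == ENOSYS)
    return UNSUPPORTED;         /* Caller decides how to proceed.  */
  return FAILED;                /* ENOSPC, EBADF, EINVAL, ...  */
}

/* Hypothetical demo on a temporary file.  Returns 1 if space was
   reserved and the file is at least 4096 bytes, 0 if the file system
   does not support the operation, and -1 on any other failure.  */
int
demo_reserve (void)
{
  char path[] = "/tmp/reserve-demo-XXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);

  int err = 0;
  int result;
  struct stat st;
  switch (try_reserve (fd, 0, 4096, &err))
    {
    case RESERVED:
      result = (fstat (fd, &st) == 0 && st.st_size >= 4096) ? 1 : -1;
      break;
    case UNSUPPORTED:
      result = 0;
      break;
    default:
      result = -1;
      break;
    }
  close (fd);
  return result;
}
```

What to do in the UNSUPPORTED case is an application decision: a program that only needs the file length could extend the file itself, while one that needs a guarantee against ENOSPC has no portable substitute, as noted earlier in the thread.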