Message ID | 20160711192653.GA4457@intel.com |
---|---|
State | New |
Headers | show |
On 7/11/2016 3:26 PM, H.J. Lu wrote: > SYSCALL_LL/SYSCALL_LL64 should be used to pass 64-bit value to system > calls. What problem are you trying to solve? SYSCALL_LL and LO_HI_LONG are different on big-endian systems. In this case, LO_HI_LONG is correct, since the kernel API is "unsigned long pos_l, unsigned long pos_h". SYSCALL_LL won't do the right thing. __ALIGNMENT_ARG introduces an extra dummy arguments to be inserted before a 64-bit value split into two 32-bit registers, if required by the platform. Since preadv/pwritev explicitly use split arguments to construct a 64-bit loff_t, __ALIGNMENT_ARG will just add a random inappropriate dummy arg to an API that doesn't need one. I reviewed the casting for LO_HI_LONG and it looks OK; I initially was wondering whether losing the "val >> 31, val" semantics from SYSCALL_LL() might have been problematic, but it seems like LO_HI_LONG should generate the same results.
On Mon, Jul 11, 2016 at 1:36 PM, Chris Metcalf <cmetcalf@mellanox.com> wrote: > On 7/11/2016 3:26 PM, H.J. Lu wrote: >> >> SYSCALL_LL/SYSCALL_LL64 should be used to pass 64-bit value to system >> calls. > > > What problem are you trying to solve? > > SYSCALL_LL and LO_HI_LONG are different on big-endian systems. In this > case, LO_HI_LONG is correct, since the kernel API is "unsigned long pos_l, > unsigned long pos_h". SYSCALL_LL won't do the right thing. > > __ALIGNMENT_ARG introduces an extra dummy arguments to be inserted before > a 64-bit value split into two 32-bit registers, if required by the platform. > Since preadv/pwritev explicitly use split arguments to construct a 64-bit > loff_t, __ALIGNMENT_ARG will just add a random inappropriate dummy arg > to an API that doesn't need one. > > I reviewed the casting for LO_HI_LONG and it looks OK; I initially was > wondering whether losing the "val >> 31, val" semantics from SYSCALL_LL() > might have been problematic, but it seems like LO_HI_LONG should generate > the same results. > /* Provide a macro to pass the off{64}_t argument on p{readv,writev}{64}. */ #define LO_HI_LONG(val) \ (long) (val), \ (long) (((uint64_t) (val)) >> 32) is wrong for __ASSUME_WORDSIZE64_ILP32 platform.
On 7/11/2016 5:04 PM, H.J. Lu wrote: > On Mon, Jul 11, 2016 at 1:36 PM, Chris Metcalf <cmetcalf@mellanox.com> wrote: >> On 7/11/2016 3:26 PM, H.J. Lu wrote: >>> SYSCALL_LL/SYSCALL_LL64 should be used to pass 64-bit value to system >>> calls. >> >> What problem are you trying to solve? >> >> SYSCALL_LL and LO_HI_LONG are different on big-endian systems. In this >> case, LO_HI_LONG is correct, since the kernel API is "unsigned long pos_l, >> unsigned long pos_h". SYSCALL_LL won't do the right thing. >> >> __ALIGNMENT_ARG introduces an extra dummy arguments to be inserted before >> a 64-bit value split into two 32-bit registers, if required by the platform. >> Since preadv/pwritev explicitly use split arguments to construct a 64-bit >> loff_t, __ALIGNMENT_ARG will just add a random inappropriate dummy arg >> to an API that doesn't need one. >> >> I reviewed the casting for LO_HI_LONG and it looks OK; I initially was >> wondering whether losing the "val >> 31, val" semantics from SYSCALL_LL() >> might have been problematic, but it seems like LO_HI_LONG should generate >> the same results. >> > /* Provide a macro to pass the off{64}_t argument on p{readv,writev}{64}. */ > #define LO_HI_LONG(val) \ > (long) (val), \ > (long) (((uint64_t) (val)) >> 32) > > is wrong for __ASSUME_WORDSIZE64_ILP32 platform. On 64-bit platforms, the default definition ends up generating a garbage 32-bit shifted "hi" arg that the kernel then discards. So I think we should just use a simple #ifdef _LP64 to avoid the shift, which would be helpful for all the 64-bit platforms. But while we're at it, I suspect we should actually pass a constant zero as the "hi" value. The kernel discards it, but it's not immediately obvious, and specifying a zero here seems like good conservative programming. In addition, if at some point we want to use the preadv2 syscall ABI, I can pretty much guarantee we will have to re-discover all this again because someone will just write "LO_HI_LONG(val), flags", and be baffled as to why the flags word is ignored on some platforms. (And honestly, something called LO_HI_LONG *not* producing a lo and a hi value is inherently odd.) So I think this makes the most sense: #ifdef _LP64 # define LO_HI_LONG(val) (val), 0 #else # define LO_HI_LONG(val) \ (long) (val), \ (long) (((uint64_t) (val)) >> 32) #endif And then x32-like platforms that want to pass a single 64-bit loff_t register value can override, as HJ has already done for x86_64. It's reasonable to consider this as the exceptional case since the kernel has the standard version of the syscall as taking two arguments, and x32 is the only kernel architecture using the compat_sys_preadv64 implementation that only takes one loff_t argument. This does point out that the __ASSUME_WORDSIZE64_ILP32 symbol really isn't as general as one might like, since it really depends on which 32-bit ABI compat functions the kernel is using for each syscall.
On 13/07/2016 21:46, Chris Metcalf wrote: > On 7/11/2016 5:04 PM, H.J. Lu wrote: >> On Mon, Jul 11, 2016 at 1:36 PM, Chris Metcalf <cmetcalf@mellanox.com> wrote: >>> On 7/11/2016 3:26 PM, H.J. Lu wrote: >>>> SYSCALL_LL/SYSCALL_LL64 should be used to pass 64-bit value to system >>>> calls. >>> >>> What problem are you trying to solve? >>> >>> SYSCALL_LL and LO_HI_LONG are different on big-endian systems. In this >>> case, LO_HI_LONG is correct, since the kernel API is "unsigned long pos_l, >>> unsigned long pos_h". SYSCALL_LL won't do the right thing. >>> >>> __ALIGNMENT_ARG introduces an extra dummy arguments to be inserted before >>> a 64-bit value split into two 32-bit registers, if required by the platform. >>> Since preadv/pwritev explicitly use split arguments to construct a 64-bit >>> loff_t, __ALIGNMENT_ARG will just add a random inappropriate dummy arg >>> to an API that doesn't need one. >>> >>> I reviewed the casting for LO_HI_LONG and it looks OK; I initially was >>> wondering whether losing the "val >> 31, val" semantics from SYSCALL_LL() >>> might have been problematic, but it seems like LO_HI_LONG should generate >>> the same results. >>> >> /* Provide a macro to pass the off{64}_t argument on p{readv,writev}{64}. */ >> #define LO_HI_LONG(val) \ >> (long) (val), \ >> (long) (((uint64_t) (val)) >> 32) >> >> is wrong for __ASSUME_WORDSIZE64_ILP32 platform. > > On 64-bit platforms, the default definition ends up generating a garbage > 32-bit shifted "hi" arg that the kernel then discards. So I think we > should just use a simple #ifdef _LP64 to avoid the shift, which would > be helpful for all the 64-bit platforms. This also seems the right thing to do imho. > > But while we're at it, I suspect we should actually pass a constant zero as > the "hi" value. The kernel discards it, but it's not immediately obvious, > and specifying a zero here seems like good conservative programming. In > addition, if at some point we want to use the preadv2 syscall ABI, I can > pretty much guarantee we will have to re-discover all this again because > someone will just write "LO_HI_LONG(val), flags", and be baffled as to why > the flags word is ignored on some platforms. (And honestly, something > called LO_HI_LONG *not* producing a lo and a hi value is inherently odd.) > > So I think this makes the most sense: > > #ifdef _LP64 > # define LO_HI_LONG(val) (val), 0 > #else > # define LO_HI_LONG(val) \ > (long) (val), \ > (long) (((uint64_t) (val)) >> 32) > #endif Yes, this is what I am thinking to do to fix the Linux implementation. > > And then x32-like platforms that want to pass a single 64-bit loff_t > register value can override, as HJ has already done for x86_64. It's > reasonable to consider this as the exceptional case since the kernel has > the standard version of the syscall as taking two arguments, and x32 > is the only kernel architecture using the compat_sys_preadv64 > implementation that only takes one loff_t argument. > > This does point out that the __ASSUME_WORDSIZE64_ILP32 symbol really > isn't as general as one might like, since it really depends on which > 32-bit ABI compat functions the kernel is using for each syscall. > At the time I suggested the __ASSUME_WORDSIZE64_ILP32 I was not area of all variant usages for supported ports. I think your LP64 suggestion does make sense and although my initial approach would to avoid arch specific overrides, x32 does make sense in this case. I think that if possible futures ILP32 architecture also decide to follow x32 argument passing for 64-bit loff_t, we can make it generic by a arch specific define.
On Thu, Jul 14, 2016 at 3:50 AM, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote: > > > On 13/07/2016 21:46, Chris Metcalf wrote: >> On 7/11/2016 5:04 PM, H.J. Lu wrote: >>> On Mon, Jul 11, 2016 at 1:36 PM, Chris Metcalf <cmetcalf@mellanox.com> wrote: >>>> On 7/11/2016 3:26 PM, H.J. Lu wrote: >>>>> SYSCALL_LL/SYSCALL_LL64 should be used to pass 64-bit value to system >>>>> calls. >>>> >>>> What problem are you trying to solve? >>>> >>>> SYSCALL_LL and LO_HI_LONG are different on big-endian systems. In this >>>> case, LO_HI_LONG is correct, since the kernel API is "unsigned long pos_l, >>>> unsigned long pos_h". SYSCALL_LL won't do the right thing. >>>> >>>> __ALIGNMENT_ARG introduces an extra dummy arguments to be inserted before >>>> a 64-bit value split into two 32-bit registers, if required by the platform. >>>> Since preadv/pwritev explicitly use split arguments to construct a 64-bit >>>> loff_t, __ALIGNMENT_ARG will just add a random inappropriate dummy arg >>>> to an API that doesn't need one. >>>> >>>> I reviewed the casting for LO_HI_LONG and it looks OK; I initially was >>>> wondering whether losing the "val >> 31, val" semantics from SYSCALL_LL() >>>> might have been problematic, but it seems like LO_HI_LONG should generate >>>> the same results. >>>> >>> /* Provide a macro to pass the off{64}_t argument on p{readv,writev}{64}. */ >>> #define LO_HI_LONG(val) \ >>> (long) (val), \ >>> (long) (((uint64_t) (val)) >> 32) >>> >>> is wrong for __ASSUME_WORDSIZE64_ILP32 platform. >> >> On 64-bit platforms, the default definition ends up generating a garbage >> 32-bit shifted "hi" arg that the kernel then discards. So I think we >> should just use a simple #ifdef _LP64 to avoid the shift, which would >> be helpful for all the 64-bit platforms. > > This also seems the right thing to do imho. > >> >> But while we're at it, I suspect we should actually pass a constant zero as >> the "hi" value. The kernel discards it, but it's not immediately obvious, >> and specifying a zero here seems like good conservative programming. In >> addition, if at some point we want to use the preadv2 syscall ABI, I can >> pretty much guarantee we will have to re-discover all this again because >> someone will just write "LO_HI_LONG(val), flags", and be baffled as to why >> the flags word is ignored on some platforms. (And honestly, something >> called LO_HI_LONG *not* producing a lo and a hi value is inherently odd.) >> >> So I think this makes the most sense: >> >> #ifdef _LP64 >> # define LO_HI_LONG(val) (val), 0 >> #else >> # define LO_HI_LONG(val) \ >> (long) (val), \ >> (long) (((uint64_t) (val)) >> 32) >> #endif > > Yes, this is what I am thinking to do to fix the Linux implementation. I think it is a mistake to put LO_HI_LONG in sysdep.h. It is specific to preadv/pwritev and it should be defined in a common file for preadv/pwritev. >> And then x32-like platforms that want to pass a single 64-bit loff_t >> register value can override, as HJ has already done for x86_64. It's >> reasonable to consider this as the exceptional case since the kernel has >> the standard version of the syscall as taking two arguments, and x32 >> is the only kernel architecture using the compat_sys_preadv64 >> implementation that only takes one loff_t argument. >> >> This does point out that the __ASSUME_WORDSIZE64_ILP32 symbol really >> isn't as general as one might like, since it really depends on which >> 32-bit ABI compat functions the kernel is using for each syscall. >> > > At the time I suggested the __ASSUME_WORDSIZE64_ILP32 I was not area of > all variant usages for supported ports. I think your LP64 suggestion > does make sense and although my initial approach would to avoid arch > specific overrides, x32 does make sense in this case. I think that > if possible futures ILP32 architecture also decide to follow x32 > argument passing for 64-bit loff_t, we can make it generic by a > arch specific define. It is similar to llseek vs lseek. ILP32 architecture can choose llseek over lseek. X32 always passes 64-bit offset in a 640-bit register.I
diff --git a/sysdeps/unix/sysv/linux/preadv.c b/sysdeps/unix/sysv/linux/preadv.c index 107cb81..3db6912 100644 --- a/sysdeps/unix/sysv/linux/preadv.c +++ b/sysdeps/unix/sysv/linux/preadv.c @@ -29,7 +29,8 @@ ssize_t preadv (int fd, const struct iovec *vector, int count, off_t offset) { - return SYSCALL_CANCEL (preadv, fd, vector, count, LO_HI_LONG (offset)); + return SYSCALL_CANCEL (preadv, fd, vector, count, + __ALIGNMENT_ARG SYSCALL_LL (offset)); } # else static ssize_t __atomic_preadv_replacement (int, const struct iovec *, @@ -39,7 +40,7 @@ preadv (int fd, const struct iovec *vector, int count, off_t offset) { # ifdef __NR_preadv ssize_t result = SYSCALL_CANCEL (preadv, fd, vector, count, - LO_HI_LONG (offset)); + __ALIGNMENT_ARG SYSCALL_LL (offset)); if (result >= 0 || errno != ENOSYS) return result; # endif diff --git a/sysdeps/unix/sysv/linux/preadv64.c b/sysdeps/unix/sysv/linux/preadv64.c index 66afd4c..27cd04f 100644 --- a/sysdeps/unix/sysv/linux/preadv64.c +++ b/sysdeps/unix/sysv/linux/preadv64.c @@ -27,7 +27,8 @@ ssize_t preadv64 (int fd, const struct iovec *vector, int count, off64_t offset) { - return SYSCALL_CANCEL (preadv64, fd, vector, count, LO_HI_LONG (offset)); + return SYSCALL_CANCEL (preadv64, fd, vector, count, + __ALIGNMENT_ARG SYSCALL_LL64 (offset)); } #else static ssize_t __atomic_preadv64_replacement (int, const struct iovec *, @@ -37,7 +38,7 @@ preadv64 (int fd, const struct iovec *vector, int count, off64_t offset) { #ifdef __NR_preadv64 ssize_t result = SYSCALL_CANCEL (preadv64, fd, vector, count, - LO_HI_LONG (offset)); + __ALIGNMENT_ARG SYSCALL_LL64 (offset)); if (result >= 0 || errno != ENOSYS) return result; #endif diff --git a/sysdeps/unix/sysv/linux/pwritev.c b/sysdeps/unix/sysv/linux/pwritev.c index 6747f42..7c98bfd 100644 --- a/sysdeps/unix/sysv/linux/pwritev.c +++ b/sysdeps/unix/sysv/linux/pwritev.c @@ -29,7 +29,8 @@ ssize_t pwritev (int fd, const struct iovec *vector, int count, off_t offset) { - return SYSCALL_CANCEL (pwritev, fd, vector, count, LO_HI_LONG (offset)); + return SYSCALL_CANCEL (pwritev, fd, vector, count, + __ALIGNMENT_ARG SYSCALL_LL (offset)); } # else static ssize_t __atomic_pwritev_replacement (int, const struct iovec *, @@ -39,7 +40,7 @@ pwritev (int fd, const struct iovec *vector, int count, off_t offset) { # ifdef __NR_pwritev ssize_t result = SYSCALL_CANCEL (pwritev, fd, vector, count, - LO_HI_LONG (offset)); + __ALIGNMENT_ARG SYSCALL_LL (offset)); if (result >= 0 || errno != ENOSYS) return result; # endif diff --git a/sysdeps/unix/sysv/linux/pwritev64.c b/sysdeps/unix/sysv/linux/pwritev64.c index e162948..5d5d43f 100644 --- a/sysdeps/unix/sysv/linux/pwritev64.c +++ b/sysdeps/unix/sysv/linux/pwritev64.c @@ -27,7 +27,8 @@ ssize_t pwritev64 (int fd, const struct iovec *vector, int count, off64_t offset) { - return SYSCALL_CANCEL (pwritev64, fd, vector, count, LO_HI_LONG (offset)); + return SYSCALL_CANCEL (pwritev64, fd, vector, count, + __ALIGNMENT_ARG SYSCALL_LL64 (offset)); } #else static ssize_t __atomic_pwritev64_replacement (int, const struct iovec *, @@ -37,7 +38,7 @@ pwritev64 (int fd, const struct iovec *vector, int count, off64_t offset) { #ifdef __NR_pwritev64 ssize_t result = SYSCALL_CANCEL (pwritev64, fd, vector, count, - LO_HI_LONG (offset)); + __ALIGNMENT_ARG SYSCALL_LL64 (offset)); if (result >= 0 || errno != ENOSYS) return result; #endif diff --git a/sysdeps/unix/sysv/linux/sysdep.h b/sysdeps/unix/sysv/linux/sysdep.h index a469f57..f2d7e05 100644 --- a/sysdeps/unix/sysv/linux/sysdep.h +++ b/sysdeps/unix/sysv/linux/sysdep.h @@ -47,8 +47,3 @@ #define SYSCALL_LL64(val) \ __LONG_LONG_PAIR ((long) ((val) >> 32), (long) ((val) & 0xffffffff)) #endif - -/* Provide a macro to pass the off{64}_t argument on p{readv,writev}{64}. */ -#define LO_HI_LONG(val) \ - (long) (val), \ - (long) (((uint64_t) (val)) >> 32)