diff mbox series

[3/6] y2038: linux: Provide __clock_settime64 implementation

Message ID 20190414220841.20243-4-lukma@denx.de
State New
Headers show
Series y2038: Linux: Provide __clock_* functions supporting 64 bit time | expand

Commit Message

Lukasz Majewski April 14, 2019, 10:08 p.m. UTC
This patch provides new __clock_settime64 explicit 64 bit function for
setting the time. Moreover, a 32 bit version - __clock_settime has been
refactored to internally use __clock_settime64.

The __clock_settime is now supposed to be used on 32 bit systems -
hence the necessary checks and conversion to 64 bit type. After this
change it is intrinsically Y2038 safe.

The new 64 bit syscall (clock_settime64) available from Linux
5.1+ has been used when applicable on 32 bit systems.
The execution path on 64 bit systems has not been changed or affected in
any way.

Tests:
- The code has been tested with x86_64/x86 (native compilation):
make PARALLELMFLAGS="-j8" && make xcheck PARALLELMFLAGS="-j8"

- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038
and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
on kernels with and without 64 bit time support.

No regressions were observed.

* include/time.h (__clock_settime64):
  Add __clock_settime alias according to __TIMESIZE define
* sysdeps/unix/sysv/linux/clock_settime.c (__clock_settime):
  Refactor this function to be used only on 32 bit machines as a wrapper
  on __clock_settime64.
* sysdeps/unix/sysv/linux/clock_settime.c (__clock_settime64): Add
* sysdeps/unix/sysv/linux/clock_settime.c (__clock_settime64):
  Use clock_settime64 kernel syscall (available from 5.1-rc1+ Linux) by
  32 bit Y2038 safe systems
---
 include/time.h                          |  8 ++++++++
 sysdeps/unix/sysv/linux/clock_settime.c | 34 ++++++++++++++++++++++++++++++---
 2 files changed, 39 insertions(+), 3 deletions(-)

Comments

Joseph Myers April 17, 2019, 10:08 p.m. UTC | #1
On Mon, 15 Apr 2019, Lukasz Majewski wrote:

> The new 64 bit syscall (clock_settime64) available from Linux
> 5.1+ has been used when applicable on 32 bit systems.
> The execution path on 64 bit systems has not been changed or affected in
> any way.

Is this unchanged specifically because of __TIMESIZE conditionals in the 
code (so that it doesn't matter whether the 64-bit systems define any new 
syscall names)?

Also, I don't see any __ASSUME_* / ENOSYS handling in this patch.  As 
usual, for any new syscall that might be used in code that could also use 
an older syscall:

* If the appropriate __ASSUME_* is defined, the new syscall can simply be 
used unconditionally.

* If the appropriate __ASSUME_* is not defined, there needs to be runtime 
fallback to the old syscall if the new one returns an ENOSYS error (of 
course, that runtime fallback needs to check for the time overflowing the 
32-bit range and give an appropriate error in that case, before calling 
the 32-bit syscall - and it's necessary to figure out what the priority 
should be of errors for invalid nanoseconds values versus overflowing 
seconds).  It is normal and expected for the kernel headers used to build 
glibc to be (much) newer than the oldest kernel version supported by the 
resulting glibc binaries at runtime.

This __ASSUME_* macro definition in kernel-features.h would need a 
multi-paragraph comment discussing exactly what the semantics of such 
__ASSUME_* macros are in the context of the sets of syscall names and 
numbers present on different kinds of Linux kernel architectures.

> +#if __TIMESIZE != 64
> +int
> +__clock_settime (clockid_t clock_id, const struct timespec *tp)
> +{
> +  struct __timespec64 ts64;
> +
> +  if (! in_time_t_range (tp->tv_sec))
> +    {
> +      __set_errno (EOVERFLOW);
> +      return -1;
> +    }

I don't see how an overflow check makes sense in this context (converting 
a 32-bit timespec to a 64-bit one).
Lukasz Majewski April 18, 2019, 6:46 a.m. UTC | #2
Hi Joseph,

> On Mon, 15 Apr 2019, Lukasz Majewski wrote:
> 
> > The new 64 bit syscall (clock_settime64) available from Linux
> > 5.1+ has been used when applicable on 32 bit systems.
> > The execution path on 64 bit systems has not been changed or
> > affected in any way.  
> 
> Is this unchanged specifically because of __TIMESIZE conditionals in
> the code (so that it doesn't matter whether the 64-bit systems define
> any new syscall names)?

I think yes.

When __TIMESIZE == 64 the clock_settime syscall is supporting 64 bit
time. The goal is to not change execution path for e.g. x86_64.
IMHO this shall be kept separate from Y2038 and 32 bit execution path.


This patch tries to follow 64 bit conversion for __clock_* functions as
presented in:
https://sourceware.org/glibc/wiki/Y2038ProofnessDesign#clock_gettime.28.29

__TIMESIZE is necessary to distinct the 32/64 bit case as
__clock_settime64 is used for both cases.

In this function I also need to check if __TIMESIZE is defined
https://github.com/lmajewski/y2038_glibc/commit/80f3b6220d80a8579e88b63fb90ba8ea7b771c5c#diff-725737619c471831ff796099fbab5c0aR33

as __clock_settime64 is aliased to clock_settime on 64 bit systems and
then aliased/linked to librt (for backward compatibility).

> 
> Also, I don't see any __ASSUME_* / ENOSYS handling in this patch.  As 
> usual, for any new syscall that might be used in code that could also
> use an older syscall:

I've followed some code pattern from glibc sources; for example:
sysdeps/unix/sysv/linux/lseek.c
Here the new syscall is only guarded by: # ifdef __NR__llseek
There was no __ASSUME_* for it.


However, the statx for example uses __ASSUME_STATX, which is defined
when kernel is newer than specified version. I will rewrite the code to
be similar to statx.

> 
> * If the appropriate __ASSUME_* is defined, the new syscall can
> simply be used unconditionally.
> 
> * If the appropriate __ASSUME_* is not defined, there needs to be
> runtime fallback to the old syscall if the new one returns an ENOSYS
> error (of course, that runtime fallback needs to check for the time
> overflowing the 32-bit range and give an appropriate error in that
> case, before calling the 32-bit syscall - and it's necessary to
> figure out what the priority should be of errors for invalid
> nanoseconds values versus overflowing seconds).  It is normal and
> expected for the kernel headers used to build glibc to be (much)
> newer than the oldest kernel version supported by the resulting glibc
> binaries at runtime.

Ach... I see your point. The __ASSUME_ prevents from nonfunctional glibc
when one uses new headers to compile glibc (which have
__NR__clock_settime64 defined) afterwards installed in a system with
old kernel (so __clock_settime64 returns -ENOSYS). I do agree that with
current patch there is no fallback to old syscall in that case.

I will fix it in v2.

> 
> This __ASSUME_* macro definition in kernel-features.h would need a 
> multi-paragraph comment discussing exactly what the semantics of such 
> __ASSUME_* macros are in the context of the sets of syscall names and 
> numbers present on different kinds of Linux kernel architectures.

Ok. I will add it to v2.

> 
> > +#if __TIMESIZE != 64
> > +int
> > +__clock_settime (clockid_t clock_id, const struct timespec *tp)
> > +{
> > +  struct __timespec64 ts64;
> > +
> > +  if (! in_time_t_range (tp->tv_sec))
> > +    {
> > +      __set_errno (EOVERFLOW);
> > +      return -1;
> > +    }  
> 
> I don't see how an overflow check makes sense in this context
> (converting a 32-bit timespec to a 64-bit one).

The above code is for the case when _TIME_BITS is not set (on 32
bit system) and we are after Y2038. The data provided by *tp has wrong
value and shall not be passed to the kernel (no matter if it supports
Y2038 time or not).

> 




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Stepan Golosunov April 20, 2019, 12:20 a.m. UTC | #3
15.04.2019 в 00:08:38 +0200 Lukasz Majewski написал:
> +# if defined __NR_clock_settime64
> +  /* Make sure that passed __timespec64 struct pad is 0.  */
> +  struct __timespec64 ts = *tp;
> +  ts.tv_pad = 0;
> +  return INLINE_SYSCALL_CALL (clock_settime64, clock_id, &ts);

Isn't kernel supposed to zero out padding on its own?
At least comment in kernel's get_timespec64 says so:

	/* Zero out the padding for 32 bit systems or in compat mode */
	if (IS_ENABLED(CONFIG_64BIT_TIME) && in_compat_syscall())
		kts.tv_nsec &= 0xFFFFFFFFUL;

The code looks buggy though. It fails to zero out the padding in
32-bit kernels. That part is probably broken since
98f76206b3350 ("compat: Cleanup in_compat_syscall() callers").

And, hmm, is CONFIG_64BIT_TIME enabled anywhere?
Lukasz Majewski April 20, 2019, 11:21 a.m. UTC | #4
Hi Stepan,

> 15.04.2019 в 00:08:38 +0200 Lukasz Majewski написал:
> > +# if defined __NR_clock_settime64
> > +  /* Make sure that passed __timespec64 struct pad is 0.  */
> > +  struct __timespec64 ts = *tp;
> > +  ts.tv_pad = 0;
> > +  return INLINE_SYSCALL_CALL (clock_settime64, clock_id, &ts);  
> 
> Isn't kernel supposed to zero out padding on its own?
> At least comment in kernel's get_timespec64 says so:
> 
> 	/* Zero out the padding for 32 bit systems or in compat mode
> */ if (IS_ENABLED(CONFIG_64BIT_TIME) && in_compat_syscall())
> 		kts.tv_nsec &= 0xFFFFFFFFUL;
> 

For ARM (and x86) 32 bit machines I do use following syscalls (like
clock_settime64):
https://elixir.bootlin.com/linux/v5.1-rc4/source/arch/arm/tools/syscall.tbl#L420

which are providing 64 bit time support on 32 bit systems.

Yes. In those systems the upper part (32 bits) of tv_nsec is cleared up
with mask in the kernel. However, I would prefer not to pass random data
to the kernel, and hence I do clear it up explicitly in glibc.

> The code looks buggy though. It fails to zero out the padding in
> 32-bit kernels.

For the 32 bit systems without Y2038 support enabled in glibc - the
clock_settime would be used, which corresponds to sys_clock_settime32()
in the kernel.

> That part is probably broken since
> 98f76206b3350 ("compat: Cleanup in_compat_syscall() callers").
> 
> And, hmm, is CONFIG_64BIT_TIME enabled anywhere?

When I do use clock_settime64 on the glibc side (with _TIME_BITS=64), I
do not need to enable such config in the kernel. 

If the kernel supports this call (5.1+), then use it, otherwise
fallback to clock_settime().

For 64 bit systems, I do not change the execution path.

If you are interested, please look on the following repo (which has
some more commits than those posted to the mailing list):
https://github.com/lmajewski/y2038_glibc/commits/Y2038-2.29-glibc-__clock-internal-struct-timespec-v1

And meta layer for testing.

https://github.com/lmajewski/meta-y2038

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Stepan Golosunov April 22, 2019, 9:07 a.m. UTC | #5
20.04.2019 в 13:21:12 +0200 Lukasz Majewski написал:
> Hi Stepan,
> 
> > 15.04.2019 в 00:08:38 +0200 Lukasz Majewski написал:
> > > +# if defined __NR_clock_settime64
> > > +  /* Make sure that passed __timespec64 struct pad is 0.  */
> > > +  struct __timespec64 ts = *tp;
> > > +  ts.tv_pad = 0;
> > > +  return INLINE_SYSCALL_CALL (clock_settime64, clock_id, &ts);  
> > 
> > Isn't kernel supposed to zero out padding on its own?
> > At least comment in kernel's get_timespec64 says so:
> > 
> > 	/* Zero out the padding for 32 bit systems or in compat mode
> > */ if (IS_ENABLED(CONFIG_64BIT_TIME) && in_compat_syscall())
> > 		kts.tv_nsec &= 0xFFFFFFFFUL;
> > 
> 
> For ARM (and x86) 32 bit machines I do use following syscalls (like
> clock_settime64):
> https://elixir.bootlin.com/linux/v5.1-rc4/source/arch/arm/tools/syscall.tbl#L420
> 
> which are providing 64 bit time support on 32 bit systems.
> 
> Yes. In those systems the upper part (32 bits) of tv_nsec is cleared up
> with mask in the kernel.

Is it? The kernel (5.1-rc6) code looks to me like

	/* Zero out the padding for 32 bit systems or in compat mode */
	if (false && false)
		kts.tv_nsec &= 0xFFFFFFFFUL;

in 32-bit kernels. And like

	if (false && true)
		kts.tv_nsec &= 0xFFFFFFFFUL;

for COMPAT syscalls in 64-bit kernels.

It should probably be changed into

	if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall())
		kts.tv_nsec &= 0xFFFFFFFFUL;

(Or into something like

	if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall() && !COMPAT_USE_64BIT_TIME)
		kts.tv_nsec &= 0xFFFFFFFFUL;

if x32 should retain 64-bit tv_nsec.)

> However, I would prefer not to pass random data
> to the kernel, and hence I do clear it up explicitly in glibc.

If the kernel does not ignore padding on its own, then zeroing it out
is required everywhere timespec is passed to kernel, including via
code not known to glibc. (Does anyone promise that there won't be any
ioctls that accept timespec, for example?) That seems to be
error-prone (and might requre copying larger structes).

On the other hand, if kernel 5.1+ ignores padding as intended there is
no need to create additional copy of structs in glibc code that calls
into clock_settime64 (or into timer_settime64 that accepts larger
struct, for example).

> > The code looks buggy though. It fails to zero out the padding in
> > 32-bit kernels.
> 
> For the 32 bit systems without Y2038 support enabled in glibc - the
> clock_settime would be used, which corresponds to sys_clock_settime32()
> in the kernel.

I am talking about kernels with Y2038 support.

> > That part is probably broken since
> > 98f76206b3350 ("compat: Cleanup in_compat_syscall() callers").
> > 
> > And, hmm, is CONFIG_64BIT_TIME enabled anywhere?

I guess that the remaining CONFIG_64BIT_TIME in kernel should be
replaced with CONFIG_COMPAT_32BIT_TIME or removed.
Arnd Bergmann April 22, 2019, 9:45 p.m. UTC | #6
On Mon, Apr 22, 2019 at 11:07 AM Stepan Golosunov
<stepan@golosunov.pp.ru> wrote:
> 20.04.2019 в 13:21:12 +0200 Lukasz Majewski написал:
> Is it? The kernel (5.1-rc6) code looks to me like
>
>         /* Zero out the padding for 32 bit systems or in compat mode */
>         if (false && false)
>                 kts.tv_nsec &= 0xFFFFFFFFUL;
>
> in 32-bit kernels. And like
>
>         if (false && true)
>                 kts.tv_nsec &= 0xFFFFFFFFUL;
>
> for COMPAT syscalls in 64-bit kernels.
>
> It should probably be changed into
>
>         if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall())
>                 kts.tv_nsec &= 0xFFFFFFFFUL;
>
> (Or into something like
>
>         if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall() && !COMPAT_USE_64BIT_TIME)
>                 kts.tv_nsec &= 0xFFFFFFFFUL;
>
> if x32 should retain 64-bit tv_nsec.)

I think the problem is that at some point CONFIG_64BIT_TIME was
meant to be enabled on both 32-bit and 64-bit kernels, but the
definition got changed along  the way.

We probably just want

        if (in_compat_syscall() )
               kts.tv_nsec &= 0xFFFFFFFFUL;

here, which would then truncate the nanoseconds for all compat
mode including x32. For native mode, we don't need to truncate
it, since timespec64 has a 32-bit 'tv_nsec' field in the kernel.

> > However, I would prefer not to pass random data
> > to the kernel, and hence I do clear it up explicitly in glibc.
>
> If the kernel does not ignore padding on its own, then zeroing it out
> is required everywhere timespec is passed to kernel, including via
> code not known to glibc. (Does anyone promise that there won't be any
> ioctls that accept timespec, for example?) That seems to be
> error-prone (and might requre copying larger structes).
>
> On the other hand, if kernel 5.1+ ignores padding as intended there is
> no need to create additional copy of structs in glibc code that calls
> into clock_settime64 (or into timer_settime64 that accepts larger
> struct, for example).

The intention is that the kernel ignores the padding. If you find
another place in the kernel that forget that, we should fix it.

> > > And, hmm, is CONFIG_64BIT_TIME enabled anywhere?
>
> I guess that the remaining CONFIG_64BIT_TIME in kernel should be
> replaced with CONFIG_COMPAT_32BIT_TIME or removed.

We should remove CONFIG_64BIT_TIME. CONFIG_COMPAT_32BIT_TIME
is still needed to identify architectures that don't have it, in
particular riscv32.

       Arnd
Lukasz Majewski April 23, 2019, 3:45 p.m. UTC | #7
Hi Arnd and Stepan,

> On Mon, Apr 22, 2019 at 11:07 AM Stepan Golosunov
> <stepan@golosunov.pp.ru> wrote:
> > 20.04.2019 в 13:21:12 +0200 Lukasz Majewski написал:
> > Is it? The kernel (5.1-rc6) code looks to me like
> >
> >         /* Zero out the padding for 32 bit systems or in compat
> > mode */ if (false && false)
> >                 kts.tv_nsec &= 0xFFFFFFFFUL;
> >
> > in 32-bit kernels. And like
> >
> >         if (false && true)
> >                 kts.tv_nsec &= 0xFFFFFFFFUL;
> >
> > for COMPAT syscalls in 64-bit kernels.
> >
> > It should probably be changed into
> >
> >         if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall())
> >                 kts.tv_nsec &= 0xFFFFFFFFUL;
> >
> > (Or into something like
> >
> >         if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall()
> > && !COMPAT_USE_64BIT_TIME) kts.tv_nsec &= 0xFFFFFFFFUL;
> >
> > if x32 should retain 64-bit tv_nsec.)  
> 
> I think the problem is that at some point CONFIG_64BIT_TIME was
> meant to be enabled on both 32-bit and 64-bit kernels, but the
> definition got changed along  the way.
> 
> We probably just want
> 
>         if (in_compat_syscall() )
>                kts.tv_nsec &= 0xFFFFFFFFUL;
> 
> here, which would then truncate the nanoseconds for all compat
> mode including x32. For native mode, we don't need to truncate
> it, since timespec64 has a 32-bit 'tv_nsec' field in the kernel.
> 
> > > However, I would prefer not to pass random data
> > > to the kernel, and hence I do clear it up explicitly in glibc.  
> >
> > If the kernel does not ignore padding on its own, then zeroing it
> > out is required everywhere timespec is passed to kernel, including
> > via code not known to glibc. (Does anyone promise that there won't
> > be any ioctls that accept timespec, for example?) That seems to be
> > error-prone (and might requre copying larger structes).
> >
> > On the other hand, if kernel 5.1+ ignores padding as intended there
> > is no need to create additional copy of structs in glibc code that
> > calls into clock_settime64 (or into timer_settime64 that accepts
> > larger struct, for example).  

Ok, I think I see your point:

- As kernel is ignoring padding, there is no need to copy the structure
  and set the padding to 0.

However, in patch:
[PATCH 1/6] y2038: Introduce internal for glibc struct __timespec64

The internal (for glibc) structure has been introduced - it has 32 bit
tv_nsec and 32 bit padding. As it is passed to the kernel - the padding
can have random values and hence shall be zeroed before passing to the
kernel.

The rationale for 32 bit tv_nsec is to be as close as possible to what
is exported by glibc (64 bit tv_sec and 32 bit tv_nsec) for Y2038.

I'm now wondering if it would be better to have glibc internal struct
__timespec64 having both fields 64 bit (as it would be easier to pass
it to Linux).


> 
> The intention is that the kernel ignores the padding. If you find
> another place in the kernel that forget that, we should fix it.
> 

Thanks Arnd for clarification.

> > > > And, hmm, is CONFIG_64BIT_TIME enabled anywhere?  
> >
> > I guess that the remaining CONFIG_64BIT_TIME in kernel should be
> > replaced with CONFIG_COMPAT_32BIT_TIME or removed.  
> 
> We should remove CONFIG_64BIT_TIME. CONFIG_COMPAT_32BIT_TIME
> is still needed to identify architectures that don't have it, in
> particular riscv32.
> 
>        Arnd




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
diff mbox series

Patch

diff --git a/include/time.h b/include/time.h
index 9eed500cf1..33eee8eb9a 100644
--- a/include/time.h
+++ b/include/time.h
@@ -124,6 +124,14 @@  extern __time64_t __timegm64 (struct tm *__tp) __THROW;
 libc_hidden_proto (__timegm64)
 #endif
 
+#if __TIMESIZE == 64
+# define __clock_settime64 __clock_settime
+#else
+extern int __clock_settime64 (clockid_t clock_id,
+                              const struct __timespec64 *tp);
+libc_hidden_proto (__clock_settime64)
+#endif
+
 /* Compute the `struct tm' representation of T,
    offset OFFSET seconds east of UTC,
    and store year, yday, mon, mday, wday, hour, min, sec into *TP.
diff --git a/sysdeps/unix/sysv/linux/clock_settime.c b/sysdeps/unix/sysv/linux/clock_settime.c
index d837e3019c..a1ea1cac28 100644
--- a/sysdeps/unix/sysv/linux/clock_settime.c
+++ b/sysdeps/unix/sysv/linux/clock_settime.c
@@ -19,11 +19,9 @@ 
 #include <sysdep.h>
 #include <time.h>
 
-#include "kernel-posix-cpu-timers.h"
-
 /* Set CLOCK to value TP.  */
 int
-__clock_settime (clockid_t clock_id, const struct timespec *tp)
+__clock_settime64 (clockid_t clock_id, const struct __timespec64 *tp)
 {
   /* Make sure the time cvalue is OK.  */
   if (tp->tv_nsec < 0 || tp->tv_nsec >= 1000000000)
@@ -32,6 +30,36 @@  __clock_settime (clockid_t clock_id, const struct timespec *tp)
       return -1;
     }
 
+#if defined (__TIMESIZE) && __TIMESIZE != 64
+# if defined __NR_clock_settime64
+  /* Make sure that passed __timespec64 struct pad is 0.  */
+  struct __timespec64 ts = *tp;
+  ts.tv_pad = 0;
+  return INLINE_SYSCALL_CALL (clock_settime64, clock_id, &ts);
+# else
+  struct timespec ts32;
+  valid_timespec64_to_timespec(tp, &ts32);
+  return INLINE_SYSCALL_CALL (clock_settime, clock_id, &ts32);
+# endif
+#else
   return INLINE_SYSCALL_CALL (clock_settime, clock_id, tp);
+#endif
 }
 weak_alias (__clock_settime, clock_settime)
+
+#if __TIMESIZE != 64
+int
+__clock_settime (clockid_t clock_id, const struct timespec *tp)
+{
+  struct __timespec64 ts64;
+
+  if (! in_time_t_range (tp->tv_sec))
+    {
+      __set_errno (EOVERFLOW);
+      return -1;
+    }
+
+  valid_timespec_to_timespec64 (tp, &ts64);
+  return __clock_settime64 (clock_id, &ts64);
+}
+#endif