Patchwork [1/1] Switch to jiffies for native_sched_clock() when TSC warps

login
register
mail settings
Submitter Chase Douglas
Date March 10, 2010, 7:30 p.m.
Message ID <1268249459-30306-2-git-send-email-chase.douglas@canonical.com>
Download mbox | patch
Permalink /patch/47278/
State RFC
Delegated to: Andy Whitcroft
Headers show

Comments

Chase Douglas - March 10, 2010, 7:30 p.m.
Some newer x86 processors (seen on core 2 duo, arrandale) warp their TSC
registers after a suspend. It is believed that a microcode fix may be a
solution, but for now we should work around the issue.

This change adds an upper bound on the difference between TSC readings. It
should be very generous (multiple year difference) so as to only catch TSC
warping which appears to generate timestamps many years in the future.
When a warp is found, usage of the TSC for timing is disabled. The
kernel falls back to using the jiffies counter, which is not as precise
but should be accurate.

Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
---
 arch/x86/kernel/tsc.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)
Tim Gardner - March 10, 2010, 8:38 p.m.
On 03/10/2010 12:30 PM, Chase Douglas wrote:
> Some newer x86 processors (seen on core 2 duo, arrandale) warp their TSC
> registers after a suspend. It is believed that a microcode fix may be a
> solution, but for now we should work around the issue.
>
> This change adds an upper bound on the difference between TSC readings. It
> should be very generous (multiple year difference) so as to only catch TSC
> warping which appears to generate timestamps many years in the future.
> When a warp is found, usage of the TSC for timing is disabled. The
> kernel falls back to using the jiffies counter, which is not as precise
> but should be accurate.
>
> Signed-off-by: Chase Douglas<chase.douglas@canonical.com>
> ---
>   arch/x86/kernel/tsc.c |   15 +++++++++++++++
>   1 files changed, 15 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 597683a..3e2921d 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -43,7 +43,9 @@ static int tsc_clocksource_reliable;
>   u64 native_sched_clock(void)
>   {
>   	u64 this_offset;
> +	static u64 prev_offset;
>
> +jiffies:
>   	/*
>   	 * Fall back to jiffies if there's no TSC available:
>   	 * ( But note that we still use it if the TSC is marked
> @@ -60,6 +62,19 @@ u64 native_sched_clock(void)
>   	/* read the Time Stamp Counter: */
>   	rdtscll(this_offset);
>
> +	/*
> +	 * if new time stamp is many years later, assume warping and disable
> +	 * TSC usage:
> +	 */
> +	if (__cycles_2_ns(this_offset - prev_offset)>  0x100000000000000
> +	&&  prev_offset) {
> +		printk(KERN_WARNING "TSC warped, using jiffies\n");
> +		tsc_disabled = 1;
> +		goto jiffies;
> +	}
> +
> +	prev_offset = this_offset;
> +
>   	/* return the value in ns */
>   	return __cycles_2_ns(this_offset);
>   }

I think this is not SMP safe. Surely this function is called by more 
then one CPU, therefore prev_offset must be a percpu variable at the 
very least.

I am also not in favor of changing runtime behavior. Why not simply 
advise the user to boot with 'notsc' using the WARN_ON_ONCE() macro?

Is this the best place to detect warping? It _is_ a fairly high 
performance path. How about stashing the TSC just before suspend and 
checking it upon return?

rtg
Chase Douglas - March 10, 2010, 9:04 p.m.
On Wed, Mar 10, 2010 at 3:38 PM, Tim Gardner <tim.gardner@canonical.com> wrote:
> On 03/10/2010 12:30 PM, Chase Douglas wrote:
>>
>> Some newer x86 processors (seen on core 2 duo, arrandale) warp their TSC
>> registers after a suspend. It is believed that a microcode fix may be a
>> solution, but for now we should work around the issue.
>>
>> This change adds an upper bound on the difference between TSC readings. It
>> should be very generous (multiple year difference) so as to only catch TSC
>> warping which appears to generate timestamps many years in the future.
>> When a warp is found, usage of the TSC for timing is disabled. The
>> kernel falls back to using the jiffies counter, which is not as precise
>> but should be accurate.
>>
>> Signed-off-by: Chase Douglas<chase.douglas@canonical.com>
>> ---
>>  arch/x86/kernel/tsc.c |   15 +++++++++++++++
>>  1 files changed, 15 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>> index 597683a..3e2921d 100644
>> --- a/arch/x86/kernel/tsc.c
>> +++ b/arch/x86/kernel/tsc.c
>> @@ -43,7 +43,9 @@ static int tsc_clocksource_reliable;
>>  u64 native_sched_clock(void)
>>  {
>>        u64 this_offset;
>> +       static u64 prev_offset;
>>
>> +jiffies:
>>        /*
>>         * Fall back to jiffies if there's no TSC available:
>>         * ( But note that we still use it if the TSC is marked
>> @@ -60,6 +62,19 @@ u64 native_sched_clock(void)
>>        /* read the Time Stamp Counter: */
>>        rdtscll(this_offset);
>>
>> +       /*
>> +        * if new time stamp is many years later, assume warping and
>> disable
>> +        * TSC usage:
>> +        */
>> +       if (__cycles_2_ns(this_offset - prev_offset)>  0x100000000000000
>> +       &&  prev_offset) {
>> +               printk(KERN_WARNING "TSC warped, using jiffies\n");
>> +               tsc_disabled = 1;
>> +               goto jiffies;
>> +       }
>> +
>> +       prev_offset = this_offset;
>> +
>>        /* return the value in ns */
>>        return __cycles_2_ns(this_offset);
>>  }
>
> I think this is not SMP safe. Surely this function is called by more then
> one CPU, therefore prev_offset must be a percpu variable at the very least.

Agreed. If it were implemented this way a percpu var would be needed.

> I am also not in favor of changing runtime behavior. Why not simply advise
> the user to boot with 'notsc' using the WARN_ON_ONCE() macro?

This is possible, but is it the best approach? I'm thinking that in
the majority of cases people aren't going to care that their timer is
lower resolution after a resume. The warning message during the switch
could mention that notsc should be used from now on. Also, thinking of
Ubuntu, using a WARN_ON_ONCE macro will create an apport message,
which would likely scare a user since it mentions that your machine
may be unsafe from now on blah blah, when in reality you should be
safe if you make it past that point.

> Is this the best place to detect warping? It _is_ a fairly high performance
> path. How about stashing the TSC just before suspend and checking it upon
> return?

I was going for a more general approach in case there are other times
where the TSC can warp. But yes, given the nature of this path it
probably makes sense to err on the safe side and check only where we
know there may be a warp until we see evidence otherwise. I'll see
what can be done for checking the TSC before and after a resume.

Thanks,
Chase

Patch

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 597683a..3e2921d 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -43,7 +43,9 @@  static int tsc_clocksource_reliable;
 u64 native_sched_clock(void)
 {
 	u64 this_offset;
+	static u64 prev_offset;
 
+jiffies:
 	/*
 	 * Fall back to jiffies if there's no TSC available:
 	 * ( But note that we still use it if the TSC is marked
@@ -60,6 +62,19 @@  u64 native_sched_clock(void)
 	/* read the Time Stamp Counter: */
 	rdtscll(this_offset);
 
+	/*
+	 * if new time stamp is many years later, assume warping and disable
+	 * TSC usage:
+	 */
+	if (__cycles_2_ns(this_offset - prev_offset) > 0x100000000000000
+	    && prev_offset) {
+		printk(KERN_WARNING "TSC warped, using jiffies\n");
+		tsc_disabled = 1;
+		goto jiffies;
+	}
+
+	prev_offset = this_offset;
+
 	/* return the value in ns */
 	return __cycles_2_ns(this_offset);
 }