diff mbox series

[RFC,1/5] x86: tsc: add tsc to art helpers

Message ID 20190716072038.8408-2-felipe.balbi@linux.intel.com
State RFC
Delegated to: David Miller
Headers show
Series PTP: add support for Intel's TGPIO controller | expand

Commit Message

Felipe Balbi July 16, 2019, 7:20 a.m. UTC
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
---
 arch/x86/include/asm/tsc.h |  2 ++
 arch/x86/kernel/tsc.c      | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

Comments

Thomas Gleixner July 16, 2019, 7:57 a.m. UTC | #1
Felipe,

On Tue, 16 Jul 2019, Felipe Balbi wrote:

-ENOCHANGELOG

As you said in the cover letter:

>  (3) The change in arch/x86/kernel/tsc.c needs to be reviewed at length
>      before going in.

So some information what those interfaces are used for and why they are
needed would be really helpful.

> +void get_tsc_ns(struct system_counterval_t *tsc_counterval, u64 *tsc_ns)
> +{
> +	u64 tmp, res, rem;
> +	u64 cycles;
> +
> +	tsc_counterval->cycles = clocksource_tsc.read(NULL);
> +	cycles = tsc_counterval->cycles;
> +	tsc_counterval->cs = art_related_clocksource;
> +
> +	rem = do_div(cycles, tsc_khz);
> +
> +	res = cycles * USEC_PER_SEC;
> +	tmp = rem * USEC_PER_SEC;
> +
> +	do_div(tmp, tsc_khz);
> +	res += tmp;
> +
> +	*tsc_ns = res;
> +}
> +EXPORT_SYMBOL(get_tsc_ns);
> +
> +u64 get_art_ns_now(void)
> +{
> +	struct system_counterval_t tsc_cycles;
> +	u64 tsc_ns;
> +
> +	get_tsc_ns(&tsc_cycles, &tsc_ns);
> +
> +	return tsc_ns;
> +}
> +EXPORT_SYMBOL(get_art_ns_now);

While the changes look innocuous I'm missing the big picture why this needs
to emulate ART instead of simply using TSC directly.

Thanks,

	tglx
Felipe Balbi Aug. 15, 2019, 5:57 a.m. UTC | #2
Hi,


Thomas Gleixner <tglx@linutronix.de> writes:

> Felipe,
>
> On Tue, 16 Jul 2019, Felipe Balbi wrote:
>
> -ENOCHANGELOG
>
> As you said in the cover letter:
>
>>  (3) The change in arch/x86/kernel/tsc.c needs to be reviewed at length
>>      before going in.
>
> So some information what those interfaces are used for and why they are
> needed would be really helpful.

Okay, I have some more details about this. The TGPIO device itself uses
ART since TSC is not directly available to anything other than the
CPU. The 'problem' here is that reading ART incurs extra latency which
we would like to avoid. Therefore, we use TSC and scale it to
nanoseconds which, would be the same as ART to ns.

>> +void get_tsc_ns(struct system_counterval_t *tsc_counterval, u64 *tsc_ns)
>> +{
>> +	u64 tmp, res, rem;
>> +	u64 cycles;
>> +
>> +	tsc_counterval->cycles = clocksource_tsc.read(NULL);
>> +	cycles = tsc_counterval->cycles;
>> +	tsc_counterval->cs = art_related_clocksource;
>> +
>> +	rem = do_div(cycles, tsc_khz);
>> +
>> +	res = cycles * USEC_PER_SEC;
>> +	tmp = rem * USEC_PER_SEC;
>> +
>> +	do_div(tmp, tsc_khz);
>> +	res += tmp;
>> +
>> +	*tsc_ns = res;
>> +}
>> +EXPORT_SYMBOL(get_tsc_ns);
>> +
>> +u64 get_art_ns_now(void)
>> +{
>> +	struct system_counterval_t tsc_cycles;
>> +	u64 tsc_ns;
>> +
>> +	get_tsc_ns(&tsc_cycles, &tsc_ns);
>> +
>> +	return tsc_ns;
>> +}
>> +EXPORT_SYMBOL(get_art_ns_now);
>
> While the changes look innocuous I'm missing the big picture why this needs
> to emulate ART instead of simply using TSC directly.

i don't think we're emulating ART here (other than the name in the
function). We're just reading TSC and converting to nanoseconds, right?

Cheers
Thomas Gleixner Aug. 15, 2019, 2:16 p.m. UTC | #3
Felipe,

On Thu, 15 Aug 2019, Felipe Balbi wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
> > On Tue, 16 Jul 2019, Felipe Balbi wrote:
> >
> > So some information what those interfaces are used for and why they are
> > needed would be really helpful.
> 
> Okay, I have some more details about this. The TGPIO device itself uses
> ART since TSC is not directly available to anything other than the
> CPU. The 'problem' here is that reading ART incurs extra latency which
> we would like to avoid. Therefore, we use TSC and scale it to
> nanoseconds which, would be the same as ART to ns.

Fine. But that's not really correct:

      TSC = art_to_tsc_offset + ART * scale;
 
> >> +void get_tsc_ns(struct system_counterval_t *tsc_counterval, u64 *tsc_ns)

Why is this not returning the result instead of having that pointer
indirection?

> >> +{
> >> +	u64 tmp, res, rem;
> >> +	u64 cycles;
> >> +
> >> +	tsc_counterval->cycles = clocksource_tsc.read(NULL);
> >> +	cycles = tsc_counterval->cycles;
> >> +	tsc_counterval->cs = art_related_clocksource;

So this does more than returning the TSC time converted to nanoseconds. The
function name should reflect this. Plus both functions want kernel-doc
explaining what they do.

> >> +	rem = do_div(cycles, tsc_khz);
> >> +
> >> +	res = cycles * USEC_PER_SEC;
> >> +	tmp = rem * USEC_PER_SEC;
> >> +
> >> +	do_div(tmp, tsc_khz);
> >> +	res += tmp;
> >> +
> >> +	*tsc_ns = res;
> >> +}
> >> +EXPORT_SYMBOL(get_tsc_ns);
> >> +
> >> +u64 get_art_ns_now(void)
> >> +{
> >> +	struct system_counterval_t tsc_cycles;
> >> +	u64 tsc_ns;
> >> +
> >> +	get_tsc_ns(&tsc_cycles, &tsc_ns);
> >> +
> >> +	return tsc_ns;
> >> +}
> >> +EXPORT_SYMBOL(get_art_ns_now);
> >
> > While the changes look innocuous I'm missing the big picture why this needs
> > to emulate ART instead of simply using TSC directly.
> 
> i don't think we're emulating ART here (other than the name in the
> function). We're just reading TSC and converting to nanoseconds, right?

Well, the function name says clearly: get_art_ns_now(). But you are not
using ART, you use the TSC and derive the ART value from it (incorrectly).

Thanks,

	tglx
Felipe Balbi Oct. 1, 2019, 10:24 a.m. UTC | #4
Hi,

(sorry for the long delay, got caught up in other tasks)

Thomas Gleixner <tglx@linutronix.de> writes:
> On Thu, 15 Aug 2019, Felipe Balbi wrote:
>> Thomas Gleixner <tglx@linutronix.de> writes:
>> > On Tue, 16 Jul 2019, Felipe Balbi wrote:
>> >
>> > So some information what those interfaces are used for and why they are
>> > needed would be really helpful.
>> 
>> Okay, I have some more details about this. The TGPIO device itself uses
>> ART since TSC is not directly available to anything other than the
>> CPU. The 'problem' here is that reading ART incurs extra latency which
>> we would like to avoid. Therefore, we use TSC and scale it to
>> nanoseconds which, would be the same as ART to ns.
>
> Fine. But that's not really correct:
>
>       TSC = art_to_tsc_offset + ART * scale;

From silicon folks I got the equation:

ART = ECX * EBX / EAX;

If I'm reading this correctly, that's basically what
native_calibrate_tsc() does (together with some error checking the safe
defaults). Couldn't we, instead, just have a single function like below?

u64 convert_tsc_to_art_ns()
{
	return x86_platform.calibrate_tsc();
}

Another way would be extract the important parts from
native_calibrate_tsc() into a separate helper. This would safe another
call to cpuid(0x15,...);

>> >> +void get_tsc_ns(struct system_counterval_t *tsc_counterval, u64 *tsc_ns)
>
> Why is this not returning the result instead of having that pointer
> indirection?

That can be changed easily, no worries.

>> >> +{
>> >> +	u64 tmp, res, rem;
>> >> +	u64 cycles;
>> >> +
>> >> +	tsc_counterval->cycles = clocksource_tsc.read(NULL);
>> >> +	cycles = tsc_counterval->cycles;
>> >> +	tsc_counterval->cs = art_related_clocksource;
>
> So this does more than returning the TSC time converted to nanoseconds. The
> function name should reflect this. Plus both functions want kernel-doc
> explaining what they do.

convert_tsc_to_art_ns()? That would be analogous to convert_art_to_tsc()
and convert_art_ns_to_tsc().

cheers
Thomas Gleixner Oct. 17, 2019, 11:15 a.m. UTC | #5
Hi,

On Tue, 1 Oct 2019, Felipe Balbi wrote:
> (sorry for the long delay, got caught up in other tasks)

Delayed by vacation :)

> Thomas Gleixner <tglx@linutronix.de> writes:
> > On Thu, 15 Aug 2019, Felipe Balbi wrote:
> >> Thomas Gleixner <tglx@linutronix.de> writes:
> >> > On Tue, 16 Jul 2019, Felipe Balbi wrote:
> >> >
> >> > So some information what those interfaces are used for and why they are
> >> > needed would be really helpful.
> >> 
> >> Okay, I have some more details about this. The TGPIO device itself uses
> >> ART since TSC is not directly available to anything other than the
> >> CPU. The 'problem' here is that reading ART incurs extra latency which
> >> we would like to avoid. Therefore, we use TSC and scale it to
> >> nanoseconds which, would be the same as ART to ns.
> >
> > Fine. But that's not really correct:
> >
> >       TSC = art_to_tsc_offset + ART * scale;
> 
> From silicon folks I got the equation:
> 
> ART = ECX * EBX / EAX;

What is the content of ECX/EBX/EAX and where is it coming from?
 
> If I'm reading this correctly, that's basically what
> native_calibrate_tsc() does (together with some error checking the safe
> defaults). Couldn't we, instead, just have a single function like below?
> 
> u64 convert_tsc_to_art_ns()
> {
> 	return x86_platform.calibrate_tsc();
> }

Huch? How is that supposed to work? calibrate_tsc() returns the TSC
frequency.

> Another way would be extract the important parts from
> native_calibrate_tsc() into a separate helper. This would safe another
> call to cpuid(0x15,...);

What for?

The relation between TSC and ART is already established via detect_art()
which reads all relevant data out of CPUID(ART_CPUID_LEAF).

We use exactly that information for convert_art_to_tsc() so the obvious
solution for calculating ART from TSC is to do the reverse operation.

convert_art_to_tsc()
{
        rem = do_div(art, art_to_tsc_denominator);

        res = art * art_to_tsc_numerator;
        tmp = rem * art_to_tsc_numerator;

        do_div(tmp, art_to_tsc_denominator);
        res += tmp + art_to_tsc_offset;
}

which is translated into math:

      TSC = ART * SCALE + OFFSET

where

      SCALE = N / D

and

      N = CPUID(ART_CPUID_LEAF).EAX
      D = CPUID(ART_CPUID_LEAF).EBX

So the obvious reverse operation is:

     ART = (TSC - OFFSET) / SCALE;

Translating that into code should not be rocket science.

Thanks,

	tglx
Felipe Balbi Oct. 17, 2019, 12:01 p.m. UTC | #6
Hi,

Thomas Gleixner <tglx@linutronix.de> writes:
>> Thomas Gleixner <tglx@linutronix.de> writes:
>> > On Thu, 15 Aug 2019, Felipe Balbi wrote:
>> >> Thomas Gleixner <tglx@linutronix.de> writes:
>> >> > On Tue, 16 Jul 2019, Felipe Balbi wrote:
>> >> >
>> >> > So some information what those interfaces are used for and why they are
>> >> > needed would be really helpful.
>> >> 
>> >> Okay, I have some more details about this. The TGPIO device itself uses
>> >> ART since TSC is not directly available to anything other than the
>> >> CPU. The 'problem' here is that reading ART incurs extra latency which
>> >> we would like to avoid. Therefore, we use TSC and scale it to
>> >> nanoseconds which, would be the same as ART to ns.
>> >
>> > Fine. But that's not really correct:
>> >
>> >       TSC = art_to_tsc_offset + ART * scale;
>> 
>> From silicon folks I got the equation:
>> 
>> ART = ECX * EBX / EAX;
>
> What is the content of ECX/EBX/EAX and where is it coming from?

Since last email, I got a bit of extra information about how all of
these should work.

ECX contains crystal rate of TSC, EBX and EAX contain constants for
converting between TSC and ART.

So, ART = tsc_cycles * EBX/EAX, this will give me ART cycles. Also, the
time gpio IP needs ART cycles to be written to its comparator
registers, but the PTP framework wants time in nanoseconds.

Therefore we need two new conversion functions:
convert_tsc_to_art_cycles() and something which gives us current TSC in
nanoseconds.

>> If I'm reading this correctly, that's basically what
>> native_calibrate_tsc() does (together with some error checking the safe
>> defaults). Couldn't we, instead, just have a single function like below?
>> 
>> u64 convert_tsc_to_art_ns()
>> {
>> 	return x86_platform.calibrate_tsc();
>> }
>
> Huch? How is that supposed to work? calibrate_tsc() returns the TSC
> frequency.

Yup, that was a total brain fart.

>> Another way would be extract the important parts from
>> native_calibrate_tsc() into a separate helper. This would safe another
>> call to cpuid(0x15,...);
>
> What for?
>
> The relation between TSC and ART is already established via detect_art()
> which reads all relevant data out of CPUID(ART_CPUID_LEAF).
>
> We use exactly that information for convert_art_to_tsc() so the obvious
> solution for calculating ART from TSC is to do the reverse operation.
>
> convert_art_to_tsc()
> {
>         rem = do_div(art, art_to_tsc_denominator);
>
>         res = art * art_to_tsc_numerator;
>         tmp = rem * art_to_tsc_numerator;
>
>         do_div(tmp, art_to_tsc_denominator);
>         res += tmp + art_to_tsc_offset;
> }
>
> which is translated into math:
>
>       TSC = ART * SCALE + OFFSET
>
> where
>
>       SCALE = N / D
>
> and
>
>       N = CPUID(ART_CPUID_LEAF).EAX
>       D = CPUID(ART_CPUID_LEAF).EBX
>
> So the obvious reverse operation is:
>
>      ART = (TSC - OFFSET) / SCALE;
>
> Translating that into code should not be rocket science.

Right, that's where we got after talking to other folks.

Do you have any preferences for the function names? Or does
convert_tsc_to_art() sound ok?
diff mbox series

Patch

diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 8a0c25c6bf09..b7a9f4385a82 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -32,6 +32,8 @@  static inline cycles_t get_cycles(void)
 
 extern struct system_counterval_t convert_art_to_tsc(u64 art);
 extern struct system_counterval_t convert_art_ns_to_tsc(u64 art_ns);
+extern void get_tsc_ns(struct system_counterval_t *tsc_counterval, u64 *tsc_ns);
+extern u64 get_art_ns_now(void);
 
 extern void tsc_early_init(void);
 extern void tsc_init(void);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 0b29e58f288e..333fffc1db7c 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1215,6 +1215,38 @@  struct system_counterval_t convert_art_to_tsc(u64 art)
 }
 EXPORT_SYMBOL(convert_art_to_tsc);
 
+void get_tsc_ns(struct system_counterval_t *tsc_counterval, u64 *tsc_ns)
+{
+	u64 tmp, res, rem;
+	u64 cycles;
+
+	tsc_counterval->cycles = clocksource_tsc.read(NULL);
+	cycles = tsc_counterval->cycles;
+	tsc_counterval->cs = art_related_clocksource;
+
+	rem = do_div(cycles, tsc_khz);
+
+	res = cycles * USEC_PER_SEC;
+	tmp = rem * USEC_PER_SEC;
+
+	do_div(tmp, tsc_khz);
+	res += tmp;
+
+	*tsc_ns = res;
+}
+EXPORT_SYMBOL(get_tsc_ns);
+
+u64 get_art_ns_now(void)
+{
+	struct system_counterval_t tsc_cycles;
+	u64 tsc_ns;
+
+	get_tsc_ns(&tsc_cycles, &tsc_ns);
+
+	return tsc_ns;
+}
+EXPORT_SYMBOL(get_art_ns_now);
+
 /**
  * convert_art_ns_to_tsc() - Convert ART in nanoseconds to TSC.
  * @art_ns: ART (Always Running Timer) in unit of nanoseconds