diff mbox

[v4,1/4] Produce system time from correlated clocksource

Message ID 1444675522-4198-2-git-send-email-christopher.s.hall@intel.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Christopher S. Hall Oct. 12, 2015, 6:45 p.m. UTC
From: Thomas Gleixner <tglx@linutronix.de>

Modern Intel hardware provides the so called Always Running Timer
(ART). The TSC which is usually used for timekeeping is derived from
ART and runs with a fixed frequency ratio to it. ART is routed to
devices and allows to take atomic timestamp samples from the device
clock and the ART. One use case is PTP timestamps on network cards. We
want to utilize this feature as it allows us to better correlate the
PTP timestamp to the system time.

In order to gather precise timestamps we need to make sure that the
conversion from ART to TSC and the following conversion from TSC to
clock realtime happens synchronized with the ongoing timekeeping
updates. Otherwise we might convert an ART timestamp from point A in
time with the conversion factors of point B in time. These conversion
factors can differ due to NTP/PTP frequency adjustments and therefor
the resulting clock realtime timestamp would be slightly off, which is
contrary to the whole purpose of synchronized hardware timestamps.

Provide data structures which describe the correlation between two
clocksources and a function to gather correlated and convert
timestamps from a device. The function is as any other timekeeping
function protected against current timekeeper updates via the
timekeeper sequence lock. It calls the device function to gather the
hardware timestamps and converts them to clock real time and clock
monotonic raw.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Another representative use case of time sync and the correlated
clocksource (in addition to PTP noted above) is PTP synchronized
audio.

In a streaming application, as an example, samples will be sent
and/or received by multiple devices with a presentation time that is
in terms of the PTP master clock. Synchronizing the audio output on
these devices requires correlating the audio clock with the PTP
master clock. The more precise this correlation is, the better the
audio quality (i.e. out of sync audio sounds bad).

From an application standpoint, to correlate the PTP master clock
with the audio device clock, the system clock is used as a
intermediate timebase. The transforms such an application would
perform are:

System Clock <-> Audio clock
System Clock <-> Network Device Clock [<-> PTP Master Clock]

Such audio applications make use of some existing ALSA library
calls that provide audio/system cross-timestamps (e.g.
snd_pcm_status_get_htstamp()). Previous driver implementations
capture these cross by reading the system clock (raw/mono/real)
and the device clock atomically in software.

Modern Intel platforms can perform a more accurate cross-
timestamp in hardware (ART,audio device clock).  The audio driver
requires ART->system time transforms -- the same as required for
the network driver. These platforms offload audio processing
(including cross-timestamps) to a DSP which to ensure
uninterrupted audio processing, communicates and response to the
host only once every millsecond. As a result is takes up to a
millisecond for the DSP to receive a request, the request is
processed by the DSP, the audio output hardware is polled for
completion, the result is copied into shared memory, and the
host is notified. All of these operation occur on a millisecond
cadence.  This transaction requires about 2 ms, but under
heavier workloads it may take up to 4 ms.

If update_wall_time() is called while waiting for a
response within get_correlated_ts() (from original patch), a retry
is attempted. This will occur if the cycle_interval(determined by
CONFIG_HZ and mult/shift values) cycles elapse.

The modification to the original patch accomodates these
slow devices by adding the option of providing an ART value outside
of the retry loop and adding a history which can consulted in the
case of an out of date counter value. The history is kept by
making the shadow_timekeeper an array. Each write to the
timekeeper rotates through the array, preserving a
history of updates.

With these changes, if get_correlated_timestamp() detects a counter
value previous to cycle_now, it consults the history in
shadow_timekeeper and translates the timestamp to the system time
value. If the timestamp value is too old, an error is returned

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 include/linux/clocksource.h |  33 +++++++
 include/linux/timekeeping.h |   4 +
 kernel/time/timekeeping.c   | 203 ++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 235 insertions(+), 5 deletions(-)

Comments

Richard Cochran Oct. 13, 2015, 4:58 a.m. UTC | #1
On Mon, Oct 12, 2015 at 11:45:19AM -0700, Christopher S. Hall wrote:
> Another representative use case of time sync and the correlated
> clocksource (in addition to PTP noted above) is PTP synchronized
> audio.

The added explanations of the audio use case do help.  However, you
did not address my point in the last series in any way.
 
> In a streaming application, as an example, samples will be sent
> and/or received by multiple devices with a presentation time that is
> in terms of the PTP master clock. Synchronizing the audio output on
> these devices requires correlating the audio clock with the PTP
> master clock. The more precise this correlation is, the better the
> audio quality (i.e. out of sync audio sounds bad).

^^^^
This is mega important.  You want to convert PTP time into audio clock
time.  There is no need for the system time at all.
 
> From an application standpoint, to correlate the PTP master clock
> with the audio device clock, the system clock is used as a
> intermediate timebase.

But why involve the system time base?

> The transforms such an application would
> perform are:
> 
> System Clock <-> Audio clock
> System Clock <-> Network Device Clock [<-> PTP Master Clock]

This is extra work with no benefit.  In fact, this hurts you
because of the need to take avoid update_wall_time AND because of the
NTP frequency adjustments.  Cascaded servos are prone to gain peaking,
and this can easily avoided in this case.
 
> Modern Intel platforms can perform a more accurate cross-
> timestamp in hardware (ART,audio device clock).  The audio driver
> requires ART->system time transforms -- the same as required for
> the network driver.

No, it doesn't need the system time.  It only needs the PTP time.

> The modification to the original patch accomodates these
> slow devices by adding the option of providing an ART value outside
> of the retry loop and adding a history which can consulted in the
> case of an out of date counter value. The history is kept by
> making the shadow_timekeeper an array. Each write to the
> timekeeper rotates through the array, preserving a
> history of updates.

This is all wrong.  All you need to provide the DSP with (ART, PTP)
pairs.  This can be done in a multiple of the DSP period, like every
1, 10, or 100 milliseconds.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 13, 2015, 5:26 a.m. UTC | #2
On Mon, Oct 12, 2015 at 11:45:19AM -0700, Christopher S. Hall wrote:
> +int get_correlated_timestamp(struct correlated_ts *crt,
> +			     struct correlated_cs *crs)
> +{
> +	struct timekeeper *tk = &tk_core.timekeeper;
> +	unsigned long seq;
> +	cycles_t cycles, cycles_now, cycles_last;
> +	ktime_t base;
> +	s64 nsecs;
> +	int ret;
> +
> +	do {
> +		seq = read_seqcount_begin(&tk_core.seq);
> +		/*
> +		 * Verify that the correlated clocksoure is related to
> +		 * the currently installed timekeeper clocksoure
> +		 */
> +		if (tk->tkr_mono.clock != crs->related_cs)
> +			return -ENODEV;
> +
> +		/*
> +		 * Get a timestamp from the device if get_ts is non-NULL
> +		 */
> +		if( crt->get_ts ) {

CodingStyle.

> +			ret = crt->get_ts(crt);
> +			if (ret)
> +				return ret;
> +		}
> +
> +		/*
> +		 * Convert the timestamp to timekeeper clock cycles
> +		 */
> +		cycles = crs->convert(crs, crt->system_ts);
> +
> +		/*
> +		 * If we have get_ts is valid, we know the cycles value
> +		 * value is up to date and we can just do the conversion
> +		 */
> +		if( crt->get_ts )

Ditto.

> +			goto do_convert;
> +

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 13, 2015, 7:51 a.m. UTC | #3
On Tue, 13 Oct 2015, Richard Cochran wrote:
> On Mon, Oct 12, 2015 at 11:45:19AM -0700, Christopher S. Hall wrote:
> > The transforms such an application would
> > perform are:
> > 
> > System Clock <-> Audio clock
> > System Clock <-> Network Device Clock [<-> PTP Master Clock]
> 
> This is extra work with no benefit.  In fact, this hurts you
> because of the need to take avoid update_wall_time AND because of the
> NTP frequency adjustments.  Cascaded servos are prone to gain peaking,
> and this can easily avoided in this case.

In such a system the frequency updates are coming from PTP, so you
have a strong correlation.
  
> > Modern Intel platforms can perform a more accurate cross-
> > timestamp in hardware (ART,audio device clock).  The audio driver
> > requires ART->system time transforms -- the same as required for
> > the network driver.
> 
> No, it doesn't need the system time.  It only needs the PTP time.
> 
> > The modification to the original patch accomodates these
> > slow devices by adding the option of providing an ART value outside
> > of the retry loop and adding a history which can consulted in the
> > case of an out of date counter value. The history is kept by
> > making the shadow_timekeeper an array. Each write to the
> > timekeeper rotates through the array, preserving a
> > history of updates.
> 
> This is all wrong.  All you need to provide the DSP with (ART, PTP)
> pairs.  This can be done in a multiple of the DSP period, like every
> 1, 10, or 100 milliseconds.

You are restricting the problem space to this particular use
case. There are other use cases where PTP is not available or not the
relevant reference, but you still want to correlate time domains to
ART.

Thanks,

	tglx




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 13, 2015, 8:31 a.m. UTC | #4
On Tue, Oct 13, 2015 at 09:51:02AM +0200, Thomas Gleixner wrote:
> 
> You are restricting the problem space to this particular use
> case. There are other use cases where PTP is not available or not the
> relevant reference, but you still want to correlate time domains to
> ART.

They may well be other use cases, but they have not been identified
here.  The PTP to media clock problem has a very simple solution.  You
do not need a history of system time stamps to solve it.

Even if you wanted to correlate the system time's UTC with the media
clock, still you don't need any shadow history for that.  Just feed
(ART, UTC) pairs into the DSP at a regular rate, and let the DSP do
the math.  This does not need to be part of the central time keeping
code at all.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 13, 2015, 1:50 p.m. UTC | #5
Now that I am starting to understand what this code is trying to
achieve...

On Mon, Oct 12, 2015 at 11:45:19AM -0700, Christopher S. Hall wrote:
> The modification to the original patch accomodates these
> slow devices by adding the option of providing an ART value outside
> of the retry loop and adding a history which can consulted in the
> case of an out of date counter value. The history is kept by
> making the shadow_timekeeper an array. Each write to the
> timekeeper rotates through the array, preserving a
> history of updates.

You did not provide an example of how this function is called, in the
special case where correlated_ts.get_ts = NULL and the caller provides
system_ts and device_ts.  Until that user appears, we do not need the
interface at all.

> +/* This needs to be 3 or greater for backtracking to be useful */
> +#define SHADOW_HISTORY_DEPTH 7

Why then have you put 7?

This comment should really explain what is needed and why.

> +/**
> + * get_correlated_timestamp - Get a correlated timestamp

This function is tremendously confusing.  The overloading with special
semenatics when the callback is NULL is not nice.  There is hardly
much shared logic, and so nothing speaks against having two methods.
I would call them "get_correlated_timestamp" and "correlate_timestamp"
or similar.  The "get_" prefix in the name is totally misleading in
your special case.

> + * @crs: conversion between correlated clock and system clock
> + * @crt: callback to get simultaneous device and correlated clock value *or*
> + *	contains a valid correlated clock value and NULL callback

This description stinks.  What is crs?  It is *not* a conversion.  It is:

	struct correlated_ts - Descriptor for taking a correlated time stamp

What is crt?  It is *not* a callback.  It is:

	struct correlated_cs - Descriptor for a clocksource correlated to another

For the crt, the kerneldoc should explain which of the fields

	int			(*get_ts)(struct correlated_ts *ts);
	u64			system_ts;
	u64			device_ts;
	ktime_t			system_real;
	ktime_t			system_raw;
	void			*private;

are provided by the caller and which are set by the function.  The
fact that fields are set either by the caller or by the method is a
funny code smell.

> + *
> + * Reads a timestamp from a device and correlates it to system time.  This
> + * function can be used in two ways.  If a non-NULL get_ts function pointer is
> + * supplied in @crt, this function is called within the retry loop to
> + * read the current correlated clock value and associated device time.
> + * Otherwise (get_ts is NULL) a correlated clock value is supplied and
> + * the history in shadow_timekeeper is consulted if necessary.
> + */
> +int get_correlated_timestamp(struct correlated_ts *crt,
> +			     struct correlated_cs *crs)
> +{
...
> +	do {

This code is only used in the special case:

> +		/*
> +		 * Since the cycles value is supplied outside of the loop,
> +		 * there is no guarantee that it represents a time *after*
> +		 * cycle_last do some checks to figure out whether it's
> +		 * represents the past or the future taking rollover
> +		 * into account. If the value is in the past, try to backtrack
> +		 */

BTW, isn't there an easier way to deal with time stamps in the past?
(Compare with timecounter_cyc2time.)

> +		cycles_now = tk->tkr_mono.read(tk->tkr_mono.clock);
> +		cycles_last = tk->tkr_mono.cycle_last;
> +		if ((cycles >= cycles_last && cycles_now < cycles) ||
> +		    (cycles < cycles_last && cycles_now >= cycles_last)) {
> +			/* cycles is in the past try to backtrack */
> +			int backtrack_index = shadow_index;
> +
> +			while (get_prev_shadow_index(&backtrack_index)) {
> +				tk = shadow_timekeeper+backtrack_index;
> +				if (cycle_between(cycles_last, cycles,
> +						  tk->tkr_mono.cycle_last))
> +					goto do_convert;
> +				cycles_last = tk->tkr_mono.cycle_last;
> +			}
> +			return -EAGAIN;
> +		}

And this is the shared stuff:

> +do_convert:
> +		/* Convert to clock realtime */
> +		base = ktime_add(tk->tkr_mono.base,
> +				 tk_core.timekeeper.offs_real);
> +		nsecs = timekeeping_convert_to_ns(&tk->tkr_mono, cycles);
> +		crt->system_real = ktime_add_ns(base, nsecs);
> +
> +		/* Convert to clock raw monotonic */
> +		base = tk->tkr_raw.base;
> +		nsecs = timekeeping_convert_to_ns(&tk->tkr_raw, cycles);
> +		crt->system_raw = ktime_add_ns(base, nsecs);
> +
> +	} while (read_seqcount_retry(&tk_core.seq, seq));
> +	return 0;
> +}

I suggest moving the shared stuff into a subroutine and providing two
different functions.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 13, 2015, 7:15 p.m. UTC | #6
On Tue, 13 Oct 2015, Richard Cochran wrote:

> On Tue, Oct 13, 2015 at 09:51:02AM +0200, Thomas Gleixner wrote:
> > 
> > You are restricting the problem space to this particular use
> > case. There are other use cases where PTP is not available or not the
> > relevant reference, but you still want to correlate time domains to
> > ART.
> 
> They may well be other use cases, but they have not been identified
> here.  The PTP to media clock problem has a very simple solution.  You
> do not need a history of system time stamps to solve it.

Well, these use cases are not in the focus of Christopher, but I have
a few in my head. Think industrial fieldbusses.
 
> Even if you wanted to correlate the system time's UTC with the media
> clock, still you don't need any shadow history for that.  Just feed
> (ART, UTC) pairs into the DSP at a regular rate, and let the DSP do
> the math.  This does not need to be part of the central time keeping
> code at all.

That's not working. The firmware is not going to change, no matter
what.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 13, 2015, 7:42 p.m. UTC | #7
On Mon, 12 Oct 2015, Christopher S. Hall wrote:
> Another representative use case of time sync and the correlated
> clocksource (in addition to PTP noted above) is PTP synchronized
> audio.

This wants to be a seperate patch, really.
 
> +/* This needs to be 3 or greater for backtracking to be useful */

Why?

> +#define SHADOW_HISTORY_DEPTH 7

And that number is 7 because?

>  static DEFINE_RAW_SPINLOCK(timekeeper_lock);
> -static struct timekeeper shadow_timekeeper;
> +static struct timekeeper shadow_timekeeper[SHADOW_HISTORY_DEPTH];
> +static int shadow_index = -1; /* incremented to zero in timekeeping_init() */

What's the point of this? Aside of that, please do not use tail comments.

> +static bool shadow_timekeeper_full;

That's silly. Make DEPTH a power of 2 and do:

       idx = (idx + 1) & (DEPTH - 1);

> +/*
> + * Modifies shadow index argument to point to the next array element
> + * Returns bool indicating shadow array fullness after the update
> + */
> +static bool get_next_shadow_index(int *shadow_index_out)
> +{
> +	*shadow_index_out = (shadow_index + 1) % SHADOW_HISTORY_DEPTH;
> +	/*
> +	 * If shadow timekeeper is full it stays full, otherwise compute
> +	 * the next value based on whether the index rolls over
> +	 */
> +	return shadow_timekeeper_full ?
> +		true : *shadow_index_out < shadow_index;

All this can go away.

> +	if (action & TK_MIRROR) {
> +		int next_shadow_index;
> +		bool next_shadow_full =
> +			get_next_shadow_index(&next_shadow_index);
> +		memcpy(shadow_timekeeper+next_shadow_index,
> +		       &tk_core.timekeeper, sizeof(tk_core.timekeeper));
> +		shadow_index = next_shadow_index;
> +		shadow_timekeeper_full = next_shadow_full;

Ditto.

> +	}
>  }
>  
>  /**
> @@ -884,6 +923,142 @@ EXPORT_SYMBOL(getnstime_raw_and_real);
>  
>  #endif /* CONFIG_NTP_PPS */
>  
> +/*
> + * Iterator-like function which can be called multiple times to return the
> + * previous shadow_index
> + * Returns false when finding previous is not possible because:
> + * - The array is not full
> + * - The previous shadow_index refers to an entry that may be in-flight
> + */
> +static bool get_prev_shadow_index(int *shadow_index_io)
> +{
> +	int guard_index;
> +	int ret = (*shadow_index_io - 1) % SHADOW_HISTORY_DEPTH;
> +
> +	ret += ret < 0 ? SHADOW_HISTORY_DEPTH : 0;
> +	/*
> +	 * guard_index references the next shadow entry, assume that this
> +	 * isn't valid since its not protected by sequence lock
> +	 */
> +	get_next_shadow_index(&guard_index);
> +	/* if the array isn't full and index references top (invalid) entry */
> +	if (!shadow_timekeeper_full && ret > *shadow_index_io)
> +		return false;
> +	/* the next entry may be in-flight and may be invalid  */
> +	if (ret == guard_index)
> +		return false;
> +	/* Also make sure that entry is valid based on current shadow_index */
> +	*shadow_index_io = ret;
> +	return true;

You surely try hard to do stuff in the most unreadable way. 

> +/**
> + * get_correlated_timestamp - Get a correlated timestamp
> + * @crs: conversion between correlated clock and system clock
> + * @crt: callback to get simultaneous device and correlated clock value *or*
> + *	contains a valid correlated clock value and NULL callback
> + *
> + * Reads a timestamp from a device and correlates it to system time.  This
> + * function can be used in two ways.  If a non-NULL get_ts function pointer is
> + * supplied in @crt, this function is called within the retry loop to
> + * read the current correlated clock value and associated device time.
> + * Otherwise (get_ts is NULL) a correlated clock value is supplied and
> + * the history in shadow_timekeeper is consulted if necessary.
> + */
> +int get_correlated_timestamp(struct correlated_ts *crt,
> +			     struct correlated_cs *crs)
> +{
> +	struct timekeeper *tk = &tk_core.timekeeper;
> +	unsigned long seq;
> +	cycles_t cycles, cycles_now, cycles_last;
> +	ktime_t base;
> +	s64 nsecs;
> +	int ret;
> +
> +	do {
> +		seq = read_seqcount_begin(&tk_core.seq);
> +		/*
> +		 * Verify that the correlated clocksoure is related to
> +		 * the currently installed timekeeper clocksoure
> +		 */
> +		if (tk->tkr_mono.clock != crs->related_cs)
> +			return -ENODEV;
> +
> +		/*
> +		 * Get a timestamp from the device if get_ts is non-NULL
> +		 */
> +		if( crt->get_ts ) {
> +			ret = crt->get_ts(crt);
> +			if (ret)
> +				return ret;
> +		}

What's the point of this? Why are you not making the few lines which
you can actually reuse a helper function and leave the PTP code alone?

> -- 
> 2.1.4

So I reached enf of patch and did not find anything in
timekeeping_init() which tells that the index is incremented to 0. It
really would need a comment, but why do you want to do that at all. It
does not matter whether the first entry is at 0 or 1. You need a
validity check for the entries anyway.

Thanks,

	tglx



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 13, 2015, 9:12 p.m. UTC | #8
On Tue, Oct 13, 2015 at 09:15:51PM +0200, Thomas Gleixner wrote:
> That's not working. The firmware is not going to change, no matter
> what.

Can we at least have a explanation of how the firmware operates?  How
are (ART,sys) pairs are generated, and how they are supposed to get
into the DSP?

Thanks,
Richard



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 14, 2015, 7:21 a.m. UTC | #9
On Tue, 13 Oct 2015, Richard Cochran wrote:
> On Tue, Oct 13, 2015 at 09:15:51PM +0200, Thomas Gleixner wrote:
> > That's not working. The firmware is not going to change, no matter
> > what.
> 
> Can we at least have a explanation of how the firmware operates?  How
> are (ART,sys) pairs are generated, and how they are supposed to get
> into the DSP?

The firmware gives you an ART/audio timestamp pair. The firmware does
neither know about system time nor about PTP time.

So we have to do correlation in software.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 14, 2015, 9:29 a.m. UTC | #10
On Wed, Oct 14, 2015 at 09:21:42AM +0200, Thomas Gleixner wrote:
> The firmware gives you an ART/audio timestamp pair. The firmware does
> neither know about system time nor about PTP time.
> 
> So we have to do correlation in software.

Ok, so give me the ART/audio and two recent ART/ptp times* spaced one
second apart, then I'll tell you audio/ptp with error well under 1 PPM.
No need to change the firmware.

Who cares about the system time?  If the DSP isn't using it, then just
leave it out of the equation.

Thanks,
Richard

* If the audio clock isn't locked to the ART, then I'll need two samples.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 14, 2015, 2:22 p.m. UTC | #11
On Wed, 14 Oct 2015, Richard Cochran wrote:

> On Wed, Oct 14, 2015 at 09:21:42AM +0200, Thomas Gleixner wrote:
> > The firmware gives you an ART/audio timestamp pair. The firmware does
> > neither know about system time nor about PTP time.
> > 
> > So we have to do correlation in software.
> 
> Ok, so give me the ART/audio and two recent ART/ptp times* spaced one
> second apart, then I'll tell you audio/ptp with error well under 1 PPM.
> No need to change the firmware.
> 
> Who cares about the system time?  If the DSP isn't using it, then just
> leave it out of the equation.

It's not only the DSP. There is a bunch of usre space software
involved as well. Care to think about the full picture and not just
about some randomly chosen single problem out of the full problem
space.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 14, 2015, 4:18 p.m. UTC | #12
On Wed, Oct 14, 2015 at 04:22:03PM +0200, Thomas Gleixner wrote:
> It's not only the DSP. There is a bunch of usre space software
> involved as well. Care to think about the full picture and not just
> about some randomly chosen single problem out of the full problem
> space.

I would surely like to think about the big picture, if I only knew
what software you are refering to.  Gstreamer maybe?  Jack?

Clueless,
Richard


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher S. Hall Oct. 15, 2015, 1:57 a.m. UTC | #13
Thomas,

On Tue, 13 Oct 2015 12:42:52 -0700, Thomas Gleixner <tglx@linutronix.de>  
wrote:
> On Mon, 12 Oct 2015, Christopher S. Hall wrote:
>> audio.
>
> This wants to be a seperate patch, really.

OK. This makes sense, I'll do this the next time.

>> +/* This needs to be 3 or greater for backtracking to be useful */
>
> Why?

The current index points to a copy and the next may be being changed by  
update_wall_time(). Leaving n-2 entries available with useful history in  
them. I'll add more descriptive comments here.

>
>> +#define SHADOW_HISTORY_DEPTH 7
>
> And that number is 7 because?

Due to power of 2 it will be 8 instead. As above the useful history is  
8-2*1 ms (1 ms is the minimum jiffy length).  Array size 4 would not be  
enough history for the DSP which requires 4 ms of history, in the worst  
case.

>> +static int shadow_index = -1; /* incremented to zero in
>
> What's the point of this? Aside of that, please do not use tail comments.

It's removed.  A check for validity is added below and this isn't  
necessary.

> That's silly. Make DEPTH a power of 2 and do:
>
>        idx = (idx + 1) & (DEPTH - 1);

This is changed.

>> +		true : *shadow_index_out < shadow_index;
>
> All this can go away.

Yes.

>> +	/* Also make sure that entry is valid based on current shadow_index */
>> +	*shadow_index_io = ret;
>> +	return true;
>
> You surely try hard to do stuff in the most unreadable way.

Is like this easier to follow?

+static struct timekeeper *search_shadow_history(cycles_t cycles,
+                                               struct clocksource *cs)
+{
+       struct timekeeper *tk = &tk_core.timekeeper;
+       int srchidx = shadow_index;
+       cycles_t cycles_start, cycles_end;
+
+       cycles_start = tk->tkr_mono.cycle_last;
+       do {
+               srchidx = !srchidx-- ? srchidx+SHADOW_HISTORY_DEPTH :  
srchidx;
+               tk = shadow_timekeeper + srchidx;
+
+               /* The next shadow entry may be in flight, don't use it */
+               if (srchidx == ((shadow_index+1) &  
(SHADOW_HISTORY_DEPTH-1)))
+                       return NULL;
+
+               /* Make sure timekeeper is related to clock on this  
interval */
+               if (tk->tkr_mono.clock != cs)
+                       return NULL;
+
+               cycles_end = cycles_start;
+               cycles_start = tk->tkr_mono.cycle_last;
+       } while (!cycle_between(cycles_start, cycles, cycles_end));
+
+       return tk;
+}
A check for validity is added here using the clocksource pointer.

and inside of get_correlated_timestamp():

+                * into account. If the value is in the past, try to  
backtrack
+                */
+               cycles_end = tk->tkr_mono.read(tk->tkr_mono.clock);
+               cycles_start = tk->tkr_mono.cycle_last;
+               if (!cycle_between(cycles_start, cycles, cycles_end)) {
+                       tk = search_shadow_history(cycles,  
crs->related_cs);
+                       if (!tk)
+                               return -EAGAIN;
+               }



>> +		/*
>> +		 * Get a timestamp from the device if get_ts is non-NULL
>> +		 */
>> +		if( crt->get_ts ) {
>> +			ret = crt->get_ts(crt);
>> +			if (ret)
>> +				return ret;
>> +		}
>
> What's the point of this? Why are you not making the few lines which
> you can actually reuse a helper function and leave the PTP code alone?

The audio driver is structured in such a way that it's simpler to provide  
a value rather than a callback.  I changed this to allow the audio  
developers to provide an ART value as input.  If a callback is provided,  
the resulting counter value is guaranteed to be later than cycle_last and  
there is no need to do extra checking (the goto skips that check).  Is  
this an answer to your question?

> So I reached enf of patch and did not find anything in
> timekeeping_init() which tells that the index is incremented to 0. It
> really would need a comment, but why do you want to do that at all. It
> does not matter whether the first entry is at 0 or 1. You need a
> validity check for the entries anyway.

I think this should be resolved.  There's no sensitivity with regard to  
the start index with an added validity check.

Thanks,
Chris
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher S. Hall Oct. 15, 2015, 2:34 a.m. UTC | #14
Richard,

On Tue, 13 Oct 2015 14:12:24 -0700, Richard Cochran  
<richardcochran@gmail.com> wrote:
> On Tue, Oct 13, 2015 at 09:15:51PM +0200, Thomas Gleixner wrote:

> Can we at least have a explanation of how the firmware operates?  How
> are (ART,sys) pairs are generated, and how they are supposed to get
> into the DSP?

I'll give it a try. The audio controller has a set of registers almost  
exactly like those on the network device. The e1000e patch adds the  
e1000e_phc_get_ts() function. It writes a register to start the  
cross-timestamp process and some time later the hardware sets a bit  
indicating that it's finished.

In the case of the network, the host polls for this bit to be set,  
indicating the cross-timestamp registers have valid data.  In the audio  
DSP case, it is the DSP that's doing the polling and it can only poll once  
per millisecond.

The transfers look like:

Host -PCI (write request) -> DSP

[Transaction started from host]

DSP -PCI (write to initiate)-> Audio controller

[Transaction started from DSP]

DSP <-PCI (read to poll status)- Audio Controller

[Transaction Complete from DSP perspective]

DSP <-PCI (read (ART,device) pair)- Audio Controller

DSP -PCI (write notification) -> Host

[Transaction complete from Host perspective]

Host <-PCI read (ART,device) pair- DSP


I hope this is helpful. Thanks.

Chris
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 15, 2015, 5:41 a.m. UTC | #15
On Wed, Oct 14, 2015 at 07:34:03PM -0700, Christopher Hall wrote:
> I hope this is helpful. Thanks.

So the DSP does not produce or consume system time stamps. Fine.
Still I fail to understand why you need the system time.

Thomas seems to say that there are *other* applications that will want
to transform device time into system time, but why does your audio
application use the system time, when the audio-to-ptp time is
directly available, without any man in the middle?

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 15, 2015, 5:57 a.m. UTC | #16
On Wed, Oct 14, 2015 at 06:57:33PM -0700, Christopher Hall wrote:
> >>+#define SHADOW_HISTORY_DEPTH 7
> >
> >And that number is 7 because?
> 
> Due to power of 2 it will be 8 instead. As above the useful history is 8-2*1
> ms (1 ms is the minimum jiffy length).  Array size 4 would not be enough
> history for the DSP which requires 4 ms of history, in the worst case.

Just as I suspected, the magic number 7 is based on the needs of one
particular user.  What about the next user who comes along needing 10
milliseconds?  That will not do.  Any new interface should be generic
enough to support a wide range of users.

So I think this approach is all wrong.  Here is an idea for you to
consider.  Instead of mucking with the TK, let the user code (possibly
in-kernel) sample ART/sys pairs and interpolate the ART/dev time
stamps.  That way, the user can choose the range and resolution that
he needs.

> The audio driver is structured in such a way that it's simpler to provide a
> value rather than a callback.

Can you please provide a link to the audio driver that uses this new
interface?

Thanks,
Richard

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 15, 2015, 8:13 a.m. UTC | #17
On Thu, 15 Oct 2015, Richard Cochran wrote:
> Thomas seems to say that there are *other* applications that will want
> to transform device time into system time, but why does your audio
> application use the system time, when the audio-to-ptp time is
> directly available, without any man in the middle?

PTP time is slow to access. Having a correlation to system time makes
a log of things simpler; nanosleep is the most obvious example.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 15, 2015, 8:15 a.m. UTC | #18
On Wed, 14 Oct 2015, Christopher Hall wrote:
> On Tue, 13 Oct 2015 12:42:52 -0700, Thomas Gleixner <tglx@linutronix.de>
> wrote:
> > On Mon, 12 Oct 2015, Christopher S. Hall wrote:
> > > audio.
> > 
> > This wants to be a seperate patch, really.
> 
> OK. This makes sense, I'll do this the next time.
> 
> > > +/* This needs to be 3 or greater for backtracking to be useful */
> > 
> > Why?
> 
> The current index points to a copy and the next may be being changed by
> update_wall_time(). Leaving n-2 entries available with useful history in them.
> I'll add more descriptive comments here.
> 
> > 
> > > +#define SHADOW_HISTORY_DEPTH 7
> > 
> > And that number is 7 because?
> 
> Due to power of 2 it will be 8 instead. As above the useful history is 8-2*1
> ms (1 ms is the minimum jiffy length).  Array size 4 would not be enough
> history for the DSP which requires 4 ms of history, in the worst case.

And how exactly becomes 7 magically 8?
 
> > 
> > What's the point of this? Why are you not making the few lines which
> > you can actually reuse a helper function and leave the PTP code alone?
> 
> The audio driver is structured in such a way that it's simpler to provide a
> value rather than a callback.  I changed this to allow the audio developers to
> provide an ART value as input.  If a callback is provided, the resulting
> counter value is guaranteed to be later than cycle_last and there is no need
> to do extra checking (the goto skips that check).  Is this an answer to your
> question?

Make it a seperate function which can hand in the information and
leave the PTP specific sample/conversion function alone.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christopher S. Hall Oct. 20, 2015, 12:18 a.m. UTC | #19
Thomas,

On Thu, 15 Oct 2015 01:15:57 -0700, Thomas Gleixner <tglx@linutronix.de>  
wrote:
>> >
>> > > +#define SHADOW_HISTORY_DEPTH 7
>> >
>> > And that number is 7 because?
>>
>> Due to power of 2 it will be 8 instead. As above the useful history is  
>> 8-2*1
>> ms (1 ms is the minimum jiffy length).  Array size 4 would not be enough
>> history for the DSP which requires 4 ms of history, in the worst case.
>
> And how exactly becomes 7 magically 8?

I'm making the array size a power of two, per your suggestion.

In my view, the candidate array sizes are 4 and 8.  But the DSP driver  
code requires that it be at least 8.  Here is my reasoning:

The number of shadow_timekeeper array elements that contain useful history  
is n-2 where n is the size of the shadow_timekeeper array.  This is true  
because shadow_timekeeper[shadow_index] is a copy of tk_core.timekeeper  
(this isn't history).  The next entry of the shadow_timekeeper array may  
be in-flight and contain invalid information, because update_wall_time()  
makes changes to the next entry of shadow timekeeper outside of the  
sequence lock.  If that occurs, get_correlated_timestamp() would not be  
notified of this change through a change in sequence number.

Combining these two requirements, the candidate array sizes and their  
effective history sizes is:

Array Size   Effective History (in elements)
==========   ===============================
4            2				
8            6
16           14

update_wall_clock() checks that the number of elapsed cycles is >=  
tk->cycle_interval before performing any updates to shadow_timekeeper.   
The minimum cycle count (and minimum time) that is contained in each  
history element is equal to cycle_interval which represents a time period  
of 1/HZ seconds.  The smallest configurable period for cycle_interval is 1  
ms (in cycles).  Each history element, therefore, is guaranteed to contain  
1 ms of history for all kernel configurations.  For correct operation of  
the DSP using the proposed ART->system time API we need at least 4 ms of  
history.  To guarantee this much history is available, the array size  
needs to be 8.

Array Size   Minimum Effective History (in ms)
==========   =================================
4            2				
8            6    <----
16           14

Does this make sense for the choice of array size 8?

> Make it a seperate function which can hand in the information and
> leave the PTP specific sample/conversion function alone.

OK.  The audio driver code will conform to the original correlated  
clocksource API.

FYI, I will be away from work and email until 10/28.  Any responses will  
be delayed until then.

Thanks,
Chris
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Stultz Oct. 20, 2015, 12:36 a.m. UTC | #20
On Mon, Oct 19, 2015 at 5:18 PM, Christopher Hall
<christopher.s.hall@intel.com> wrote:
> On Thu, 15 Oct 2015 01:15:57 -0700, Thomas Gleixner <tglx@linutronix.de>
> wrote:
>>>
>>> >
>>> > > +#define SHADOW_HISTORY_DEPTH 7
>>> >
>>> > And that number is 7 because?
>>>
>>> Due to power of 2 it will be 8 instead. As above the useful history is
>>> 8-2*1
>>> ms (1 ms is the minimum jiffy length).  Array size 4 would not be enough
>>> history for the DSP which requires 4 ms of history, in the worst case.
>>
>>
>> And how exactly becomes 7 magically 8?
>
>
> I'm making the array size a power of two, per your suggestion.
>
> In my view, the candidate array sizes are 4 and 8.  But the DSP driver code
> requires that it be at least 8.  Here is my reasoning:
>
> The number of shadow_timekeeper array elements that contain useful history
> is n-2 where n is the size of the shadow_timekeeper array.  This is true
> because shadow_timekeeper[shadow_index] is a copy of tk_core.timekeeper
> (this isn't history).  The next entry of the shadow_timekeeper array may be
> in-flight and contain invalid information, because update_wall_time() makes
> changes to the next entry of shadow timekeeper outside of the sequence lock.
> If that occurs, get_correlated_timestamp() would not be notified of this
> change through a change in sequence number.

Sorry for not commenting here earlier. But I've sort of loosely been
following this thread.

I'm still very very concerned about the complexity of adding any sort
of array of timekeeper structures for historical purposes, and like
Richard, I feel like the rational for all of this has not been made
very clear.

Thomas seems to be a helpful advocate, but I suspect the use case has
been explained to him in detail off list.

If we're only tracking 4ms of history, how does this solution
measurably improve the error over using the timestamps to generate
MONOTONIC_RAW clock deltas (which doesn't require keeping any history)
and using getnstime_raw_and_real to take an anchor point to calculate
the delta from?  Why is adding complexity necessary?

thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 20, 2015, 8:54 a.m. UTC | #21
On Mon, Oct 19, 2015 at 05:36:56PM -0700, John Stultz wrote:
> If we're only tracking 4ms of history, how does this solution
> measurably improve the error over using the timestamps to generate
> MONOTONIC_RAW clock deltas (which doesn't require keeping any history)
> and using getnstime_raw_and_real to take an anchor point to calculate
> the delta from?  Why is adding complexity necessary?

This idea is variant of what I suggested in another reply in this
thread.  To my understanding, there is no need at all to keep a
history arbitrarily 4 ms long.  Instead, the DSP driver (or whoever
else may need such a thing) can simply sample the system time at the
rate needed for that particular application.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 20, 2015, 10:48 a.m. UTC | #22
On Tue, 20 Oct 2015, Richard Cochran wrote:

> On Mon, Oct 19, 2015 at 05:36:56PM -0700, John Stultz wrote:
> > If we're only tracking 4ms of history, how does this solution
> > measurably improve the error over using the timestamps to generate
> > MONOTONIC_RAW clock deltas (which doesn't require keeping any history)
> > and using getnstime_raw_and_real to take an anchor point to calculate
> > the delta from?  Why is adding complexity necessary?
> 
> This idea is variant of what I suggested in another reply in this
> thread.  To my understanding, there is no need at all to keep a
> history arbitrarily 4 ms long.  Instead, the DSP driver (or whoever
> else may need such a thing) can simply sample the system time at the
> rate needed for that particular application.

That's complete nonsense. The whole point is to have a proper
correlation from ART/audio timestamps to system time. Sampling system
time does not help in any way,

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 20, 2015, 11:51 a.m. UTC | #23
On Tue, Oct 20, 2015 at 12:48:03PM +0200, Thomas Gleixner wrote:
> On Tue, 20 Oct 2015, Richard Cochran wrote:
> 
> > On Mon, Oct 19, 2015 at 05:36:56PM -0700, John Stultz wrote:
> > > If we're only tracking 4ms of history, how does this solution
> > > measurably improve the error over using the timestamps to generate
> > > MONOTONIC_RAW clock deltas (which doesn't require keeping any history)
> > > and using getnstime_raw_and_real to take an anchor point to calculate
> > > the delta from?  Why is adding complexity necessary?
> > 
> > This idea is variant of what I suggested in another reply in this
> > thread.  To my understanding, there is no need at all to keep a
> > history arbitrarily 4 ms long.  Instead, the DSP driver (or whoever
> > else may need such a thing) can simply sample the system time at the
> > rate needed for that particular application.
> 
> That's complete nonsense. The whole point is to have a proper
> correlation from ART/audio timestamps to system time. Sampling system
> time does not help in any way,

You can, in fact, achieve "proper" correlation by sampling.  As John
said, the question is whether the method in the patch set "measurably
improves the error" over using another, simpler method.

Thanks,
Richard

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 20, 2015, 2:55 p.m. UTC | #24
On Tue, Oct 20, 2015 at 01:51:13PM +0200, Richard Cochran wrote:
> You can, in fact, achieve "proper" correlation by sampling.  As John
> said, the question is whether the method in the patch set "measurably
> improves the error" over using another, simpler method.

Here is a short example to put some numbers on the expected error.
Let the driver sample at an interval of 1 ms.  If the system time's
frequency hasn't changed between two samples, A and B, then the driver
may interpolate without introducing any error.

If the frequency is changed between the sample times, then the
interpolated value will have some error.  Because 1 ms is smallest HZ
value, the frequency can change at most once during the sample.  If
the frequency changes near point A or B, then the error is minimal.
The worst case occurs when the frequency is changed half way between A
and B.

Suppose the frequency is changed by 10 PPM, at point C, half way
between A and B.  This change results in a 5 nanosecond time
difference at B (10 PPM over C -> B).  The driver will interpolate
using line A-B with slope increased by 5 PPM, and the worst case
error, found at point C, is then 2.5 nanoseconds.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 20, 2015, 7:11 p.m. UTC | #25
On Tue, 20 Oct 2015, Richard Cochran wrote:
> On Tue, Oct 20, 2015 at 01:51:13PM +0200, Richard Cochran wrote:
> > You can, in fact, achieve "proper" correlation by sampling.  As John
> > said, the question is whether the method in the patch set "measurably
> > improves the error" over using another, simpler method.
> 
> Here is a short example to put some numbers on the expected error.
> Let the driver sample at an interval of 1 ms.  If the system time's
> frequency hasn't changed between two samples, A and B, then the driver
> may interpolate without introducing any error.

Darn, we don't want to have that kind of sampling in every driver
which has this kind of problem even if it looks like the simpler
choice for this particular use case. This is going to be something
which next generation chips will have on more than just the audio
interface and we realy want to have a generic solution for this.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Oct. 20, 2015, 7:36 p.m. UTC | #26
On Tue, Oct 20, 2015 at 09:11:21PM +0200, Thomas Gleixner wrote:
> Darn, we don't want to have that kind of sampling in every driver
> which has this kind of problem even if it looks like the simpler
> choice for this particular use case. This is going to be something
> which next generation chips will have on more than just the audio
> interface and we realy want to have a generic solution for this.

Right, having multiple drivers sampling is bad.

Just thinking out loud: how about a service layer that can handle
multiple drivers?  The layer samples at the maximum requested rate,
and buffers the history for the maximum requested backlog.  The
non-max rate users simply get a higher resolution than they need.

A generic solution would handle any history length for old time
stamps, within reason.  I think hard coding 4 ms (or 8 ms or 800 ms)
is clunky.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Stultz Oct. 20, 2015, 8:16 p.m. UTC | #27
On Tue, Oct 20, 2015 at 12:11 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Tue, 20 Oct 2015, Richard Cochran wrote:
>> On Tue, Oct 20, 2015 at 01:51:13PM +0200, Richard Cochran wrote:
>> > You can, in fact, achieve "proper" correlation by sampling.  As John
>> > said, the question is whether the method in the patch set "measurably
>> > improves the error" over using another, simpler method.
>>
>> Here is a short example to put some numbers on the expected error.
>> Let the driver sample at an interval of 1 ms.  If the system time's
>> frequency hasn't changed between two samples, A and B, then the driver
>> may interpolate without introducing any error.
>
> Darn, we don't want to have that kind of sampling in every driver
> which has this kind of problem even if it looks like the simpler
> choice for this particular use case. This is going to be something
> which next generation chips will have on more than just the audio
> interface and we realy want to have a generic solution for this.

I sort of agree with Richard that the timekeeper history approach
doesn't seem like a generic solution here.

And again, you seem to be speaking with a bigger picture in mind that
at least I don't yet share (apologies for being thick headed here).
Being able to have various hardware sharing a time base is quite
useful, and methods for correlating timestamps together are useful.
But I don't yet really understand why its important that we can
translate a hardware timestamp from some time in the past to the
correct system time in the past without error.

thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 21, 2015, 7:44 a.m. UTC | #28
On Tue, 20 Oct 2015, John Stultz wrote:

> On Tue, Oct 20, 2015 at 12:11 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Tue, 20 Oct 2015, Richard Cochran wrote:
> >> On Tue, Oct 20, 2015 at 01:51:13PM +0200, Richard Cochran wrote:
> >> > You can, in fact, achieve "proper" correlation by sampling.  As John
> >> > said, the question is whether the method in the patch set "measurably
> >> > improves the error" over using another, simpler method.
> >>
> >> Here is a short example to put some numbers on the expected error.
> >> Let the driver sample at an interval of 1 ms.  If the system time's
> >> frequency hasn't changed between two samples, A and B, then the driver
> >> may interpolate without introducing any error.
> >
> > Darn, we don't want to have that kind of sampling in every driver
> > which has this kind of problem even if it looks like the simpler
> > choice for this particular use case. This is going to be something
> > which next generation chips will have on more than just the audio
> > interface and we realy want to have a generic solution for this.
> 
> I sort of agree with Richard that the timekeeper history approach
> doesn't seem like a generic solution here.
 
I'm not pushing that approach. I just want a generic facility of some
sort to solve that.

> And again, you seem to be speaking with a bigger picture in mind that
> at least I don't yet share (apologies for being thick headed here).
> Being able to have various hardware sharing a time base is quite
> useful, and methods for correlating timestamps together are useful.
> But I don't yet really understand why its important that we can
> translate a hardware timestamp from some time in the past to the
> correct system time in the past without error.

If your device can only provide timestamps from the past, then having
access to the history is important if you want to have precise
correlation.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stanton, Kevin B Nov. 3, 2015, 7:18 p.m. UTC | #29
On Wed, 21 Oct 2015, Thomas Gleixner wrote:
> On Tue, 20 Oct 2015, John Stultz wrote:
>> Being able to have various hardware sharing a time base is quite 
>> useful, and methods for correlating timestamps together are useful.
>> But I don't yet really understand why its important that we can 
>> translate a hardware timestamp from some time in the past to the 
>> correct system time in the past without error.

>If your device can only provide timestamps from the past, then
>having access to the history is important if you want to have
>precise correlation.

I hope this can be solved in timekeeping. But first, a quick
recap...

The timestamp pair (including the ART snapshot, as described
previously) is captured simultaneously by the hardware
resulting, effectively, in a (PTP,TSC) pair, or
(AudioPosition,TSC) pair. The in-the-past-TSC value needs to be
converted to system time so that it can be used by applications,
without exposing the underlying ART or TSC.

Note: ART is architectural, defined as part of Invariant
Timekeeping in the current SDM, so this isn't a one-off
capability.

To convert a past TSC timestamp to system time 'correctly' (in a
mathematical sense), a history of monotonic rate adjustments
since that time in the past must be maintained.

Regarding the amount of history, as Chris mentioned (and
in the context of new Intel hardware) LAN timestamp pairs are
a few microseconds in the past (no history would be required),
but for timestamps captured by the audio DSP, unfortunately,
they can be a small number of *milliseconds* in the past by the
time they're available to the audio driver (some history
required to convert accurately). I'm told that 4ms of adjustment
history accommodates known hardware.

Getting this 'correct' in one place (timekeeping) seems a lot
better than unnecessarily introducing inaccuracy via software
sampling (and extrapolation) or leaving it to each driver to do
it themselves, and to do it differently (and/or do it wrongly).

Best Regards,
Kevin

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Stultz Nov. 9, 2015, 9:17 p.m. UTC | #30
On Tue, Nov 3, 2015 at 11:18 AM, Stanton, Kevin B
<kevin.b.stanton@intel.com> wrote:
> On Wed, 21 Oct 2015, Thomas Gleixner wrote:
>> On Tue, 20 Oct 2015, John Stultz wrote:
>>> Being able to have various hardware sharing a time base is quite
>>> useful, and methods for correlating timestamps together are useful.
>>> But I don't yet really understand why its important that we can
>>> translate a hardware timestamp from some time in the past to the
>>> correct system time in the past without error.
>
>>If your device can only provide timestamps from the past, then
>>having access to the history is important if you want to have
>>precise correlation.
>
> I hope this can be solved in timekeeping. But first, a quick
> recap...
>
> The timestamp pair (including the ART snapshot, as described
> previously) is captured simultaneously by the hardware
> resulting, effectively, in a (PTP,TSC) pair, or
> (AudioPosition,TSC) pair. The in-the-past-TSC value needs to be
> converted to system time so that it can be used by applications,
> without exposing the underlying ART or TSC.
>
> Note: ART is architectural, defined as part of Invariant
> Timekeeping in the current SDM, so this isn't a one-off
> capability.
>
> To convert a past TSC timestamp to system time 'correctly' (in a
> mathematical sense), a history of monotonic rate adjustments
> since that time in the past must be maintained.

But again, my main problem is that I'm not totally understanding the rational.
Here you're providing the *what*, not really the *why*.

As I mentioned earlier, there are possibly simpler (at least for the
kernel) ways to generate similar data using CLOCK_MONOTONIC_RAW, which
has potentially a 500ppm error. *Why* is it important to add more
complexity to the timekeeping core in order to avoid that error?

> Regarding the amount of history, as Chris mentioned (and
> in the context of new Intel hardware) LAN timestamp pairs are
> a few microseconds in the past (no history would be required),
> but for timestamps captured by the audio DSP, unfortunately,
> they can be a small number of *milliseconds* in the past by the
> time they're available to the audio driver (some history
> required to convert accurately). I'm told that 4ms of adjustment
> history accommodates known hardware.

For a timestamp recorded 4ms ago, 500ppm of error is 2us. Why is 2us
problematic for audio? That seems quite below the human threshold to
notice.

> Getting this 'correct' in one place (timekeeping) seems a lot
> better than unnecessarily introducing inaccuracy via software
> sampling (and extrapolation) or leaving it to each driver to do
> it themselves, and to do it differently (and/or do it wrongly).

Having a common infrastructure for extrapolating the data isn't
something I'm objecting to.  Its just that the shadow-timekeeper has
been problematic enough in recent times bug wise, so I'm just hesitant
to expand the complexity there. That said, I'm open to ideas, but
would really like a better understanding of why other solutions would
be insufficient, and why this one is the best solution.

thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 278dd27..4bedadb 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -258,4 +258,37 @@  void acpi_generic_timer_init(void);
 static inline void acpi_generic_timer_init(void) { }
 #endif
 
+/*
+ * struct correlated_cs - Descriptor for a clocksource correlated to another
+ *	clocksource
+ * @related_cs:		Pointer to the related timekeeping clocksource
+ * @convert:		Conversion function to convert a timestamp from
+ *			the correlated clocksource to cycles of the related
+ *			timekeeping clocksource
+ */
+struct correlated_cs {
+	struct clocksource	*related_cs;
+	u64			(*convert)(struct correlated_cs *cs,
+					   u64 cycles);
+};
+
+struct correlated_ts;
+
+/**
+ * struct correlated_ts - Descriptor for taking a correlated time stamp
+ * @get_ts:		Function to read out a synced system and device
+ *			timestamp
+ * @system_ts:		The raw system clock timestamp
+ * @device_ts:		The raw device timestamp
+ * @system_real:	@system_ts converted to CLOCK_REALTIME
+ * @system_raw:		@system_ts converted to CLOCK_MONOTONIC_RAW
+ */
+struct correlated_ts {
+	int			(*get_ts)(struct correlated_ts *ts);
+	u64			system_ts;
+	u64			device_ts;
+	ktime_t			system_real;
+	ktime_t			system_raw;
+	void			*private;
+};
 #endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index ba0ae09..79c46d4 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -265,6 +265,10 @@  extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);
  */
 extern void getnstime_raw_and_real(struct timespec *ts_raw,
 				   struct timespec *ts_real);
+struct correlated_ts;
+struct correlated_cs;
+extern int get_correlated_timestamp(struct correlated_ts *crt,
+				    struct correlated_cs *crs);
 
 /*
  * Persistent clock related interfaces
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 3739ac6..1a0860c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -41,8 +41,13 @@  static struct {
 	struct timekeeper	timekeeper;
 } tk_core ____cacheline_aligned;
 
+/* This needs to be 3 or greater for backtracking to be useful */
+#define SHADOW_HISTORY_DEPTH 7
+
 static DEFINE_RAW_SPINLOCK(timekeeper_lock);
-static struct timekeeper shadow_timekeeper;
+static struct timekeeper shadow_timekeeper[SHADOW_HISTORY_DEPTH];
+static int shadow_index = -1; /* incremented to zero in timekeeping_init() */
+static bool shadow_timekeeper_full;
 
 /**
  * struct tk_fast - NMI safe timekeeper
@@ -312,6 +317,19 @@  static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 	return nsec + arch_gettimeoffset();
 }
 
+static inline s64 timekeeping_convert_to_ns(struct tk_read_base *tkr,
+					    cycle_t cycles)
+{
+	cycle_t delta;
+	s64 nsec;
+
+	/* calculate the delta since the last update_wall_time */
+	delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
+
+	nsec = delta * tkr->mult + tkr->xtime_nsec;
+	return nsec >> tkr->shift;
+}
+
 /**
  * update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
  * @tkr: Timekeeping readout base from which we take the update
@@ -558,6 +576,21 @@  static inline void tk_update_ktime_data(struct timekeeper *tk)
 	tk->ktime_sec = seconds;
 }
 
+/*
+ * Modifies shadow index argument to point to the next array element
+ * Returns bool indicating shadow array fullness after the update
+ */
+static bool get_next_shadow_index(int *shadow_index_out)
+{
+	*shadow_index_out = (shadow_index + 1) % SHADOW_HISTORY_DEPTH;
+	/*
+	 * If shadow timekeeper is full it stays full, otherwise compute
+	 * the next value based on whether the index rolls over
+	 */
+	return shadow_timekeeper_full ?
+		true : *shadow_index_out < shadow_index;
+}
+
 /* must hold timekeeper_lock */
 static void timekeeping_update(struct timekeeper *tk, unsigned int action)
 {
@@ -582,9 +615,15 @@  static void timekeeping_update(struct timekeeper *tk, unsigned int action)
 	 * to happen last here to ensure we don't over-write the
 	 * timekeeper structure on the next update with stale data
 	 */
-	if (action & TK_MIRROR)
-		memcpy(&shadow_timekeeper, &tk_core.timekeeper,
-		       sizeof(tk_core.timekeeper));
+	if (action & TK_MIRROR) {
+		int next_shadow_index;
+		bool next_shadow_full =
+			get_next_shadow_index(&next_shadow_index);
+		memcpy(shadow_timekeeper+next_shadow_index,
+		       &tk_core.timekeeper, sizeof(tk_core.timekeeper));
+		shadow_index = next_shadow_index;
+		shadow_timekeeper_full = next_shadow_full;
+	}
 }
 
 /**
@@ -884,6 +923,142 @@  EXPORT_SYMBOL(getnstime_raw_and_real);
 
 #endif /* CONFIG_NTP_PPS */
 
+/*
+ * Iterator-like function which can be called multiple times to return the
+ * previous shadow_index
+ * Returns false when finding previous is not possible because:
+ * - The array is not full
+ * - The previous shadow_index refers to an entry that may be in-flight
+ */
+static bool get_prev_shadow_index(int *shadow_index_io)
+{
+	int guard_index;
+	int ret = (*shadow_index_io - 1) % SHADOW_HISTORY_DEPTH;
+
+	ret += ret < 0 ? SHADOW_HISTORY_DEPTH : 0;
+	/*
+	 * guard_index references the next shadow entry, assume that this
+	 * isn't valid since its not protected by sequence lock
+	 */
+	get_next_shadow_index(&guard_index);
+	/* if the array isn't full and index references top (invalid) entry */
+	if (!shadow_timekeeper_full && ret > *shadow_index_io)
+		return false;
+	/* the next entry may be in-flight and may be invalid  */
+	if (ret == guard_index)
+		return false;
+	/* Also make sure that entry is valid based on current shadow_index */
+	*shadow_index_io = ret;
+	return true;
+}
+
+/*
+ * cycle_between - true if test occurs chronologically between before and after
+ */
+
+static bool cycle_between(cycles_t after, cycles_t test, cycles_t before)
+{
+	if (test < before && before > after)
+		return true;
+	if (test > before && test < after)
+		return true;
+	return false;
+}
+
+/**
+ * get_correlated_timestamp - Get a correlated timestamp
+ * @crs: conversion between correlated clock and system clock
+ * @crt: callback to get simultaneous device and correlated clock value *or*
+ *	contains a valid correlated clock value and NULL callback
+ *
+ * Reads a timestamp from a device and correlates it to system time.  This
+ * function can be used in two ways.  If a non-NULL get_ts function pointer is
+ * supplied in @crt, this function is called within the retry loop to
+ * read the current correlated clock value and associated device time.
+ * Otherwise (get_ts is NULL) a correlated clock value is supplied and
+ * the history in shadow_timekeeper is consulted if necessary.
+ */
+int get_correlated_timestamp(struct correlated_ts *crt,
+			     struct correlated_cs *crs)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	unsigned long seq;
+	cycles_t cycles, cycles_now, cycles_last;
+	ktime_t base;
+	s64 nsecs;
+	int ret;
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+		/*
+		 * Verify that the correlated clocksoure is related to
+		 * the currently installed timekeeper clocksoure
+		 */
+		if (tk->tkr_mono.clock != crs->related_cs)
+			return -ENODEV;
+
+		/*
+		 * Get a timestamp from the device if get_ts is non-NULL
+		 */
+		if( crt->get_ts ) {
+			ret = crt->get_ts(crt);
+			if (ret)
+				return ret;
+		}
+
+		/*
+		 * Convert the timestamp to timekeeper clock cycles
+		 */
+		cycles = crs->convert(crs, crt->system_ts);
+
+		/*
+		 * If we have get_ts is valid, we know the cycles value
+		 * value is up to date and we can just do the conversion
+		 */
+		if( crt->get_ts )
+			goto do_convert;
+
+		/*
+		 * Since the cycles value is supplied outside of the loop,
+		 * there is no guarantee that it represents a time *after*
+		 * cycle_last do some checks to figure out whether it's
+		 * represents the past or the future taking rollover
+		 * into account. If the value is in the past, try to backtrack
+		 */
+		cycles_now = tk->tkr_mono.read(tk->tkr_mono.clock);
+		cycles_last = tk->tkr_mono.cycle_last;
+		if ((cycles >= cycles_last && cycles_now < cycles) ||
+		    (cycles < cycles_last && cycles_now >= cycles_last)) {
+			/* cycles is in the past try to backtrack */
+			int backtrack_index = shadow_index;
+
+			while (get_prev_shadow_index(&backtrack_index)) {
+				tk = shadow_timekeeper+backtrack_index;
+				if (cycle_between(cycles_last, cycles,
+						  tk->tkr_mono.cycle_last))
+					goto do_convert;
+				cycles_last = tk->tkr_mono.cycle_last;
+			}
+			return -EAGAIN;
+		}
+
+do_convert:
+		/* Convert to clock realtime */
+		base = ktime_add(tk->tkr_mono.base,
+				 tk_core.timekeeper.offs_real);
+		nsecs = timekeeping_convert_to_ns(&tk->tkr_mono, cycles);
+		crt->system_real = ktime_add_ns(base, nsecs);
+
+		/* Convert to clock raw monotonic */
+		base = tk->tkr_raw.base;
+		nsecs = timekeeping_convert_to_ns(&tk->tkr_raw, cycles);
+		crt->system_raw = ktime_add_ns(base, nsecs);
+
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+	return 0;
+}
+EXPORT_SYMBOL_GPL(get_correlated_timestamp);
+
 /**
  * do_gettimeofday - Returns the time of day in a timeval
  * @tv:		pointer to the timeval to be set
@@ -1763,7 +1938,9 @@  static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
 void update_wall_time(void)
 {
 	struct timekeeper *real_tk = &tk_core.timekeeper;
-	struct timekeeper *tk = &shadow_timekeeper;
+	struct timekeeper *tk;
+	int next_shadow_index;
+	bool next_shadow_full;
 	cycle_t offset;
 	int shift = 0, maxshift;
 	unsigned int clock_set = 0;
@@ -1775,6 +1952,9 @@  void update_wall_time(void)
 	if (unlikely(timekeeping_suspended))
 		goto out;
 
+	/* Make sure we're inside the lock */
+	tk = shadow_timekeeper+shadow_index;
+
 #ifdef CONFIG_ARCH_USES_GETTIMEOFFSET
 	offset = real_tk->cycle_interval;
 #else
@@ -1786,6 +1966,13 @@  void update_wall_time(void)
 	if (offset < real_tk->cycle_interval)
 		goto out;
 
+	/* Copy the current shadow timekeeper to the 'next' and point to it */
+	next_shadow_index = shadow_index;
+	next_shadow_full = get_next_shadow_index(&next_shadow_index);
+	memcpy(shadow_timekeeper+next_shadow_index,
+	       shadow_timekeeper+shadow_index, sizeof(*shadow_timekeeper));
+	tk = shadow_timekeeper+next_shadow_index;
+
 	/* Do some additional sanity checking */
 	timekeeping_check_update(real_tk, offset);
 
@@ -1834,8 +2021,14 @@  void update_wall_time(void)
 	 * spinlocked/seqcount protected sections. And we trade this
 	 * memcpy under the tk_core.seq against one before we start
 	 * updating.
+	 *
+	 * Update the shadow index inside here forcing any backtracking
+	 * operations inside get_correlated_timestamp() to restart with
+	 * valid values
 	 */
 	timekeeping_update(tk, clock_set);
+	shadow_index = next_shadow_index;
+	shadow_timekeeper_full = next_shadow_full;
 	memcpy(real_tk, tk, sizeof(*tk));
 	/* The memcpy must come last. Do not put anything here! */
 	write_seqcount_end(&tk_core.seq);