[RFC] e1000e: allow non-monotonic SYSTIM readings

Message ID 20181017161621.6081-1-mlichvar@redhat.com
State RFC

Commit Message

Miroslav Lichvar Oct. 17, 2018, 4:16 p.m. UTC
It seems that on some NICs supported by the e1000e driver a SYSTIM
reading may occasionally be a few microseconds earlier than the previous
reading. Such a reading can also pass e1000e_sanitize_systim(), when it
is enabled, without reaching the maximum number of rereads, even if the
function is modified to check three consecutive readings (i.e. it
doesn't look like a double read error). This causes an underflow in the
timecounter and the PHC time jumps hours ahead.
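
As a rough illustration of the underflow described above, the following
user-space sketch models a forward-only timecounter update. It is not the
kernel code: struct model_tc and model_tc_read() are hypothetical names,
fractional nanoseconds are ignored, and the mult/shift values used with it
are illustrative rather than the ones e1000e programs for any given NIC.

```c
#include <stdint.h>

/* Simplified model of a timecounter (assumption: not the kernel struct). */
struct model_tc {
	uint64_t cycle_last;	/* last raw counter value seen */
	uint64_t nsec;		/* accumulated time in nanoseconds */
	uint64_t mask;		/* bit mask for the counter width */
	uint32_t mult;		/* cycles -> ns multiplier */
	uint32_t shift;		/* cycles -> ns right shift */
};

/*
 * Forward-only update: every new reading is assumed to be at or after
 * cycle_last. If cycle_now is slightly *before* cycle_last, the masked
 * subtraction wraps around and delta becomes close to the full range.
 */
static uint64_t model_tc_read(struct model_tc *tc, uint64_t cycle_now)
{
	uint64_t delta = (cycle_now - tc->cycle_last) & tc->mask;

	tc->cycle_last = cycle_now;
	tc->nsec += (delta * tc->mult) >> tc->shift;
	return tc->nsec;
}
```

With a 64-bit mask, mult = 1 and shift = 20, a reading only 5 cycles
behind cycle_last gives a delta of 2^64 - 5 and advances the modelled
time by 2^44 - 1 ns, which is roughly 4.9 hours.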

This was observed on 82574, I217 and I219. The fastest way to reproduce
it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
on the PHC.
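
A reproducer along these lines could look like the sketch below. It is
not the program used for the observations above: count_phc_anomalies()
is a hypothetical helper, and the 1-second threshold for a "forward jump"
is an arbitrary assumption. It relies only on the PTP_SYS_OFFSET ioctl
and struct ptp_sys_offset from <linux/ptp_clock.h>.

```c
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/ptp_clock.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Call PTP_SYS_OFFSET repeatedly and count consecutive PHC readings that
 * either step backwards or jump forward by more than one second.
 * Returns the number of such anomalies, or -1 on open/ioctl failure.
 */
static long count_phc_anomalies(const char *dev, long iterations)
{
	struct ptp_sys_offset req;
	int64_t prev = 0, now;
	long anomalies = 0;
	int have_prev = 0;
	unsigned int i;
	int fd = open(dev, O_RDONLY);

	if (fd < 0)
		return -1;

	while (iterations-- > 0) {
		memset(&req, 0, sizeof(req));
		req.n_samples = PTP_MAX_SAMPLES;
		if (ioctl(fd, PTP_SYS_OFFSET, &req) < 0) {
			close(fd);
			return -1;
		}
		/* ts[] alternates system, PHC, system, ...; PHC is at odd indices. */
		for (i = 0; i < req.n_samples; i++) {
			const struct ptp_clock_time *t = &req.ts[2 * i + 1];

			now = (int64_t)t->sec * NSEC_PER_SEC + t->nsec;
			if (have_prev &&
			    (now < prev || now - prev > NSEC_PER_SEC))
				anomalies++;
			prev = now;
			have_prev = 1;
		}
	}
	close(fd);
	return anomalies;
}
```

A call such as count_phc_anomalies("/dev/ptp0", 100000) would exercise
the PHC; the device path is an assumption and depends on the system.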

Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
timecounter_read() in order to allow non-monotonic SYSTIM readings and
prevent the PHC from jumping.
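
To show why a cyc2time-style conversion tolerates a reading that is
slightly in the past, here is a simplified user-space model. It follows
the general idea of timecounter_cyc2time() (a delta larger than half the
counter range is treated as an older timestamp), but struct sim_tc and
sim_cyc2time() are hypothetical names and fractional nanoseconds are
ignored, so it is a sketch rather than the kernel implementation.

```c
#include <stdint.h>

/* Simplified model of a timecounter (assumption: not the kernel struct). */
struct sim_tc {
	uint64_t cycle_last;	/* raw counter value at the last update */
	uint64_t nsec;		/* time in ns corresponding to cycle_last */
	uint64_t mask;		/* bit mask for the counter width */
	uint32_t mult;		/* cycles -> ns multiplier */
	uint32_t shift;		/* cycles -> ns right shift */
};

/*
 * Convert a raw counter value to nanoseconds without updating the state.
 * A masked delta above half the range is interpreted as a timestamp that
 * lies before cycle_last, so the result is nsec minus a small offset
 * instead of nsec plus a huge one.
 */
static uint64_t sim_cyc2time(const struct sim_tc *tc, uint64_t cycles)
{
	uint64_t delta = (cycles - tc->cycle_last) & tc->mask;

	if (delta > tc->mask / 2) {
		delta = (tc->cycle_last - cycles) & tc->mask;
		return tc->nsec - ((delta * tc->mult) >> tc->shift);
	}
	return tc->nsec + ((delta * tc->mult) >> tc->shift);
}
```

Because this conversion leaves the state untouched, cycle_last still has
to be advanced periodically by a forward-only update, which is what the
timecounter_read() call added to e1000e_systim_overflow_work() does.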

Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
 drivers/net/ethernet/intel/e1000e/ptp.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Comments

Jacob Keller Oct. 19, 2018, 4:56 p.m. UTC | #1
Hi Miroslav

> -----Original Message-----
> From: Miroslav Lichvar [mailto:mlichvar@redhat.com]
> Sent: Wednesday, October 17, 2018 9:16 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Miroslav Lichvar <mlichvar@redhat.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Richard Cochran <richardcochran@gmail.com>
> Subject: [RFC PATCH] e1000e: allow non-monotonic SYSTIM readings
> 
> It seems with some NICs supported by the e1000e driver a SYSTIM reading
> may occasionally be few microseconds before the previous reading and if
> enabled also pass e1000e_sanitize_systim() without reaching the maximum
> number of rereads, even if the function is modified to check three
> consecutive readings (i.e. it doesn't look like a double read error).
> This causes an underflow in the timecounter and the PHC time jumps hours
> ahead.
> 
> This was observed on 82574, I217 and I219. The fastest way to reproduce
> it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
> on the PHC.
> 

Ouch. Good catch.

> Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
> timecounter_read() in order to allow non-monotonic SYSTIM readings and
> prevent the PHC from jumping.
> 

Right. I think this makes sense. It does mean we will report the clock jumping back, but that *is* what the hardware reports here so I think it makes sense.

> Cc: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
> ---
>  drivers/net/ethernet/intel/e1000e/ptp.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c
> b/drivers/net/ethernet/intel/e1000e/ptp.c
> index 37c76945ad9b..ec6477c88a4f 100644
> --- a/drivers/net/ethernet/intel/e1000e/ptp.c
> +++ b/drivers/net/ethernet/intel/e1000e/ptp.c
> @@ -173,10 +173,14 @@ static int e1000e_phc_gettime(struct ptp_clock_info *ptp,
> struct timespec64 *ts)
>  	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
>  						     ptp_clock_info);
>  	unsigned long flags;
> -	u64 ns;
> +	u64 cycles, ns;
> 
>  	spin_lock_irqsave(&adapter->systim_lock, flags);
> -	ns = timecounter_read(&adapter->tc);
> +
> +	/* Use timecounter_cyc2time() to allow non-monotonic SYSTIM readings */
> +	cycles = adapter->cc.read(&adapter->cc);
> +	ns = timecounter_cyc2time(&adapter->tc, cycles);
> +
>  	spin_unlock_irqrestore(&adapter->systim_lock, flags);
> 
>  	*ts = ns_to_timespec64(ns);
> @@ -233,6 +237,9 @@ static void e1000e_systim_overflow_work(struct
> work_struct *work)
>  	struct e1000_hw *hw = &adapter->hw;
>  	struct timespec64 ts;
> 
> +	/* Update the timecounter */
> +	timecounter_read(&adapter->tc);
> +

Yeah, we previously depended on gettime64 to call timecounter_read() implicitly.

>  	adapter->ptp_clock_info.gettime64(&adapter->ptp_clock_info, &ts);

Can't we drop this line now?

> 
>  	e_dbg("SYSTIM overflow check at %lld.%09lu\n",
> --
> 2.17.1
Miroslav Lichvar Oct. 23, 2018, 11:31 a.m. UTC | #2
On Fri, Oct 19, 2018 at 04:56:02PM +0000, Keller, Jacob E wrote:
> > @@ -233,6 +237,9 @@ static void e1000e_systim_overflow_work(struct
> > work_struct *work)
> >  	struct e1000_hw *hw = &adapter->hw;
> >  	struct timespec64 ts;
> > 
> > +	/* Update the timecounter */
> > +	timecounter_read(&adapter->tc);
> > +
> 
> yea, we previously depended on the gettime64 to do timecounter_read implicitly.
> 
> >  	adapter->ptp_clock_info.gettime64(&adapter->ptp_clock_info, &ts);
> 
> Can't we drop this line now?

Yes. We can save the value returned by the timecounter_read call and
convert it to timespec for the debug message below. I'll send a new
patch.

> > 
> >  	e_dbg("SYSTIM overflow check at %lld.%09lu\n",

Thanks,
Brown, Aaron F Nov. 3, 2018, 1:58 a.m. UTC | #3
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Miroslav Lichvar
> Sent: Wednesday, October 17, 2018 9:16 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Richard Cochran <richardcochran@gmail.com>
> Subject: [Intel-wired-lan] [RFC PATCH] e1000e: allow non-monotonic SYSTIM
> readings
> 
> It seems with some NICs supported by the e1000e driver a SYSTIM reading
> may occasionally be few microseconds before the previous reading and if
> enabled also pass e1000e_sanitize_systim() without reaching the maximum
> number of rereads, even if the function is modified to check three
> consecutive readings (i.e. it doesn't look like a double read error).
> This causes an underflow in the timecounter and the PHC time jumps hours
> ahead.
> 
> This was observed on 82574, I217 and I219. The fastest way to reproduce
> it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
> on the PHC.
> 
> Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
> timecounter_read() in order to allow non-monotonic SYSTIM readings and
> prevent the PHC from jumping.
> 
> Cc: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
> ---
>  drivers/net/ethernet/intel/e1000e/ptp.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 

Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Patch

diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index 37c76945ad9b..ec6477c88a4f 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -173,10 +173,14 @@  static int e1000e_phc_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
 	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
 						     ptp_clock_info);
 	unsigned long flags;
-	u64 ns;
+	u64 cycles, ns;
 
 	spin_lock_irqsave(&adapter->systim_lock, flags);
-	ns = timecounter_read(&adapter->tc);
+
+	/* Use timecounter_cyc2time() to allow non-monotonic SYSTIM readings */
+	cycles = adapter->cc.read(&adapter->cc);
+	ns = timecounter_cyc2time(&adapter->tc, cycles);
+
 	spin_unlock_irqrestore(&adapter->systim_lock, flags);
 
 	*ts = ns_to_timespec64(ns);
@@ -233,6 +237,9 @@  static void e1000e_systim_overflow_work(struct work_struct *work)
 	struct e1000_hw *hw = &adapter->hw;
 	struct timespec64 ts;
 
+	/* Update the timecounter */
+	timecounter_read(&adapter->tc);
+
 	adapter->ptp_clock_info.gettime64(&adapter->ptp_clock_info, &ts);
 
 	e_dbg("SYSTIM overflow check at %lld.%09lu\n",