diff mbox

[tpmdd-devel,2/5] tpm_tis: add a short wait in IRQ probing routine

Message ID 1448026354-6807-3-git-send-email-martin.wilck@ts.fujitsu.com
State New
Headers show

Commit Message

Martin Wilck Nov. 20, 2015, 1:32 p.m. UTC
From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>

There is a possible race condition in the IRQ probing code,
as tpm2_gen_interrupt may return before the IRQ handler is actually called.
This would lead to a false negative (IRQ is found not to work although
it actually does). Wait another ms for the IRQ to arrive, similar to
the IRQ testing code in tpm_tis_send().

v2: split this commit from the previous one ("tpm_tis: calculate command
durations before irq probing").

Signed-off-by: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
---
 drivers/char/tpm/tpm_tis.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jarkko Sakkinen Nov. 21, 2015, 1:16 p.m. UTC | #1
On Fri, Nov 20, 2015 at 02:32:31PM +0100, martin.wilck@ts.fujitsu.com wrote:
> From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
> 
> There is a possible race condition in the IRQ probing code,
> as tpm2_gen_interrupt may return before the IRQ handler is actually called.
> This would lead to a false negative (IRQ is found not to work although
> it actually does). Wait another ms for the IRQ to arrive, similar to
> the IRQ testing code in tpm_tis_send().
> 
> v2: split this commit from the previous one ("tpm_tis: calculate command
> durations before irq probing").

Before chip->vendor.irq is set, tpm_transmit() is in polling mode and it
is a synchronous function. By the tpm_get_interrupt() returns, response
is fully received and therefore interrupt has been generated.

> Signed-off-by: Martin Wilck <Martin.Wilck@ts.fujitsu.com>

/Jarkko

> ---
>  drivers/char/tpm/tpm_tis.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
> index f7f9039..f417b40 100644
> --- a/drivers/char/tpm/tpm_tis.c
> +++ b/drivers/char/tpm/tpm_tis.c
> @@ -811,6 +811,9 @@ static int tpm_tis_init(struct device *dev, struct tpm_info *tpm_info,
>  			else
>  				tpm_gen_interrupt(chip);
>  
> +			if (!chip->vendor.probed_irq)
> +				msleep(1);
> +
>  			chip->vendor.irq = chip->vendor.probed_irq;
>  
>  			/* free_irq will call into tis_int_probe;
> -- 
> 1.8.3.1
> 
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> tpmdd-devel mailing list
> tpmdd-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/tpmdd-devel

------------------------------------------------------------------------------
Jason Gunthorpe Nov. 23, 2015, 11:23 p.m. UTC | #2
On Sat, Nov 21, 2015 at 03:16:07PM +0200, Jarkko Sakkinen wrote:
> On Fri, Nov 20, 2015 at 02:32:31PM +0100, martin.wilck@ts.fujitsu.com wrote:
> > From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
> > 
> > There is a possible race condition in the IRQ probing code,
> > as tpm2_gen_interrupt may return before the IRQ handler is actually called.
> > This would lead to a false negative (IRQ is found not to work although
> > it actually does). Wait another ms for the IRQ to arrive, similar to
> > the IRQ testing code in tpm_tis_send().
> > 
> > v2: split this commit from the previous one ("tpm_tis: calculate command
> > durations before irq probing").
> 
> Before chip->vendor.irq is set, tpm_transmit() is in polling mode and it
> is a synchronous function. By the tpm_get_interrupt() returns, response
> is fully received and therefore interrupt has been generated.

The interrupt may have been generated at the chip, but delivery and
completion at the CPU is not guarenteed. Martin is right, this is a
race.

The correct way to probe is to trigger an IRQ and then sleep with
timeout to see if it arrives. I actually want to say the kernel has
common code for this already someplace.

The msleep is an ugly solution.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 24, 2015, 5:56 a.m. UTC | #3
On Mon, Nov 23, 2015 at 04:23:56PM -0700, Jason Gunthorpe wrote:
> On Sat, Nov 21, 2015 at 03:16:07PM +0200, Jarkko Sakkinen wrote:
> > On Fri, Nov 20, 2015 at 02:32:31PM +0100, martin.wilck@ts.fujitsu.com wrote:
> > > From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
> > > 
> > > There is a possible race condition in the IRQ probing code,
> > > as tpm2_gen_interrupt may return before the IRQ handler is actually called.
> > > This would lead to a false negative (IRQ is found not to work although
> > > it actually does). Wait another ms for the IRQ to arrive, similar to
> > > the IRQ testing code in tpm_tis_send().
> > > 
> > > v2: split this commit from the previous one ("tpm_tis: calculate command
> > > durations before irq probing").
> > 
> > Before chip->vendor.irq is set, tpm_transmit() is in polling mode and it
> > is a synchronous function. By the tpm_get_interrupt() returns, response
> > is fully received and therefore interrupt has been generated.
> 
> The interrupt may have been generated at the chip, but delivery and
> completion at the CPU is not guarenteed. Martin is right, this is a
> race.

Please correct me if I'm wrong but the delivery of the response *is*
guaranteed when it returns from tpm_gen_interrupt(). That guarantee
does not rely on interrupts but polling the status registers.

So this race would be in-between response fully received and when the
CPU traps the interrupt?

> The correct way to probe is to trigger an IRQ and then sleep with
> timeout to see if it arrives. I actually want to say the kernel has
> common code for this already someplace.
> 
> The msleep is an ugly solution.
> 
> Jason

/Jarkko

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jason Gunthorpe Nov. 24, 2015, 7:05 a.m. UTC | #4
On Tue, Nov 24, 2015 at 07:56:55AM +0200, Jarkko Sakkinen wrote:

> So this race would be in-between response fully received and when the
> CPU traps the interrupt?

Yes. Delivery of the response via MMIO reads and delivery of the
associated IRQ caused by the TPM while creating the response are two
completely independent parallel processes in modern hardware. As such
they naturally race.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 24, 2015, 7:36 a.m. UTC | #5
On Tue, Nov 24, 2015 at 12:05:09AM -0700, Jason Gunthorpe wrote:
> On Tue, Nov 24, 2015 at 07:56:55AM +0200, Jarkko Sakkinen wrote:
> 
> > So this race would be in-between response fully received and when the
> > CPU traps the interrupt?
> 
> Yes. Delivery of the response via MMIO reads and delivery of the
> associated IRQ caused by the TPM while creating the response are two
> completely independent parallel processes in modern hardware. As such
> they naturally race.

Given that even now this is a rare race condition even now without
msleep() that in the *worst case* causes TPM to be initialized in
polling mode why wouldn't a short msleep() be sufficient?

I think this would be good enough.

> Jason

/Jarkko

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jason Gunthorpe Nov. 24, 2015, 6:21 p.m. UTC | #6
On Tue, Nov 24, 2015 at 09:36:12AM +0200, Jarkko Sakkinen wrote:
> On Tue, Nov 24, 2015 at 12:05:09AM -0700, Jason Gunthorpe wrote:
> > On Tue, Nov 24, 2015 at 07:56:55AM +0200, Jarkko Sakkinen wrote:
> > 
> > > So this race would be in-between response fully received and when the
> > > CPU traps the interrupt?
> > 
> > Yes. Delivery of the response via MMIO reads and delivery of the
> > associated IRQ caused by the TPM while creating the response are two
> > completely independent parallel processes in modern hardware. As such
> > they naturally race.
> 
> Given that even now this is a rare race condition even now without
> msleep() that in the *worst case* causes TPM to be initialized in
> polling mode why wouldn't a short msleep() be sufficient?

It is sufficient but inelegant.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 24, 2015, 7:49 p.m. UTC | #7
On Tue, Nov 24, 2015 at 11:21:15AM -0700, Jason Gunthorpe wrote:
> On Tue, Nov 24, 2015 at 09:36:12AM +0200, Jarkko Sakkinen wrote:
> > On Tue, Nov 24, 2015 at 12:05:09AM -0700, Jason Gunthorpe wrote:
> > > On Tue, Nov 24, 2015 at 07:56:55AM +0200, Jarkko Sakkinen wrote:
> > > 
> > > > So this race would be in-between response fully received and when the
> > > > CPU traps the interrupt?
> > > 
> > > Yes. Delivery of the response via MMIO reads and delivery of the
> > > associated IRQ caused by the TPM while creating the response are two
> > > completely independent parallel processes in modern hardware. As such
> > > they naturally race.
> > 
> > Given that even now this is a rare race condition even now without
> > msleep() that in the *worst case* causes TPM to be initialized in
> > polling mode why wouldn't a short msleep() be sufficient?
> 
> It is sufficient but inelegant.

I'm still going to apply this. Using waitqueue would feel bit of an
overkill.

> Jason

/Jarkko

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
diff mbox

Patch

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index f7f9039..f417b40 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -811,6 +811,9 @@  static int tpm_tis_init(struct device *dev, struct tpm_info *tpm_info,
 			else
 				tpm_gen_interrupt(chip);
 
+			if (!chip->vendor.probed_irq)
+				msleep(1);
+
 			chip->vendor.irq = chip->vendor.probed_irq;
 
 			/* free_irq will call into tis_int_probe;