[tpmdd-devel,5/5] tpm_tis: don't use IRQF_SHARED by default when probing IRQ
diff mbox

Message ID 1448026354-6807-6-git-send-email-martin.wilck@ts.fujitsu.com
State New
Headers show

Commit Message

Martin Wilck Nov. 20, 2015, 1:32 p.m. UTC
From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>

When probing IRQs, IRQF_SHARED may cause IRQs to be falsely
detected if other devices generate IRQ events while tpm_tis is
probing or testing. I have seen this on my test machine repeatedly.

The sequence of events in the errror case is as follows:

 1 TPM IRQ generation is enabled for IRQ X and test command sent
 2 TPM finishes command and sets data ready / IRQ flags, but the IRQ
doesn't arrive because it's not configured in the system. Normally this
would cause a timeout and the IRQ would be found not to work, but...
 3 The other device triggers an IRQ X, causing tis_int_probe() to get
called and find the IRQ flags set. Now we conclude that the IRQ seems to
work, with is wrong.

It is impossible to distinguish this condition from a case where
the TPM IRQ actually worked.

Therefore, refrain from probing IRQs that are already used by
other devices by default. Use "interrupts=2" module parameter
to obtain the previous behavior.

v2: Add better explanation in commit message.
Signed-off-by: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
---
 drivers/char/tpm/tpm_tis.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Jarkko Sakkinen Nov. 21, 2015, 1:30 p.m. UTC | #1
On Fri, Nov 20, 2015 at 02:32:34PM +0100, martin.wilck@ts.fujitsu.com wrote:
> From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
> 
> When probing IRQs, IRQF_SHARED may cause IRQs to be falsely
> detected if other devices generate IRQ events while tpm_tis is
> probing or testing. I have seen this on my test machine repeatedly.
> 
> The sequence of events in the errror case is as follows:
> 
>  1 TPM IRQ generation is enabled for IRQ X and test command sent
>  2 TPM finishes command and sets data ready / IRQ flags, but the IRQ
> doesn't arrive because it's not configured in the system. Normally this
> would cause a timeout and the IRQ would be found not to work, but...
>  3 The other device triggers an IRQ X, causing tis_int_probe() to get
> called and find the IRQ flags set. Now we conclude that the IRQ seems to
> work, with is wrong.
> 
> It is impossible to distinguish this condition from a case where
> the TPM IRQ actually worked.
> 
> Therefore, refrain from probing IRQs that are already used by
> other devices by default. Use "interrupts=2" module parameter
> to obtain the previous behavior.

It's handled by not setting chip->vendor.irq in the tis_int_probe but
setting chp->vendor.probed_irq there, not chip>-vendor.irq directly.
Also in polling mode tpm_transmit() is synchronous.

I'm not concluding anything here that would make me take this change.

> v2: Add better explanation in commit message.
> Signed-off-by: Martin Wilck <Martin.Wilck@ts.fujitsu.com>

/Jarkko

> ---
>  drivers/char/tpm/tpm_tis.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
> index 360bccc..54d9080 100644
> --- a/drivers/char/tpm/tpm_tis.c
> +++ b/drivers/char/tpm/tpm_tis.c
> @@ -622,9 +622,9 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
>  	return IRQ_HANDLED;
>  }
>  
> -static bool interrupts = true;
> -module_param(interrupts, bool, 0444);
> -MODULE_PARM_DESC(interrupts, "Enable interrupts");
> +static int interrupts = 1;
> +module_param(interrupts, int, 0444);
> +MODULE_PARM_DESC(interrupts, "1 - enable interrupts (default), 0 - disable interrupts, 2 - use shared interrupts when probing");
>  
>  static void tpm_tis_remove(struct tpm_chip *chip)
>  {
> @@ -784,7 +784,8 @@ static int tpm_tis_init(struct device *dev, struct tpm_info *tpm_info,
>  			iowrite8(i, chip->vendor.iobase +
>  				 TPM_INT_VECTOR(chip->vendor.locality));
>  			if (devm_request_irq
> -			    (dev, i, tis_int_probe, IRQF_SHARED,
> +			    (dev, i, tis_int_probe,
> +					interrupts == 2 ? IRQF_SHARED : 0,
>  			     chip->devname, chip) != 0) {
>  				dev_info(chip->pdev,
>  					 "Unable to request irq: %d for probe (this is non-fatal)\n",
> -- 
> 1.8.3.1
> 
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> tpmdd-devel mailing list
> tpmdd-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/tpmdd-devel

------------------------------------------------------------------------------
Martin Wilck Nov. 23, 2015, 1:24 p.m. UTC | #2
On Sa, 2015-11-21 at 15:30 +0200, Jarkko Sakkinen wrote:
> On Fri, Nov 20, 2015 at 02:32:34PM +0100, martin.wilck@ts.fujitsu.com wrote:
> > From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
> > 
> > When probing IRQs, IRQF_SHARED may cause IRQs to be falsely
> > detected if other devices generate IRQ events while tpm_tis is
> > probing or testing. I have seen this on my test machine repeatedly.
> > 
> > The sequence of events in the errror case is as follows:
> > 
> >  1 TPM IRQ generation is enabled for IRQ X and test command sent
> >  2 TPM finishes command and sets data ready / IRQ flags, but the IRQ
> > doesn't arrive because it's not configured in the system. Normally this
> > would cause a timeout and the IRQ would be found not to work, but...
> >  3 The other device triggers an IRQ X, causing tis_int_probe() to get
> > called and find the IRQ flags set. Now we conclude that the IRQ seems to
> > work, with is wrong.
> > 
> > It is impossible to distinguish this condition from a case where
> > the TPM IRQ actually worked.
> > 
> > Therefore, refrain from probing IRQs that are already used by
> > other devices by default. Use "interrupts=2" module parameter
> > to obtain the previous behavior.
> 
> It's handled by not setting chip->vendor.irq in the tis_int_probe but
> setting chp->vendor.probed_irq there, not chip>-vendor.irq directly.
> Also in polling mode tpm_transmit() is synchronous.
 
I still don't quite get why that prevents the problem I was describing.
chip->vendor.probed_irq is set when tpm2_gen_interrupt returns, causing
the probing routine to conclude erroneously that the IRQ worked. That's
the problem I observed.

*HOWEVER*, I may have been mislead by some of my own changes which I
reverted in the meantime, and which may have inflated the likelyhood of
false positives at this point.

> I'm not concluding anything here that would make me take this change.

I just retried with your latest master branch (including those of my
patches that you accepted) and couldn't reproduce the problem any more.
I got not a single false positive in 100 rmmod/insmod cycles.

Thus it's probably safe to drop this one. Thanks for your thorough
review.

Regards
Martin


------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 23, 2015, 4:59 p.m. UTC | #3
On Mon, Nov 23, 2015 at 02:24:34PM +0100, Wilck, Martin wrote:
> On Sa, 2015-11-21 at 15:30 +0200, Jarkko Sakkinen wrote:
> > On Fri, Nov 20, 2015 at 02:32:34PM +0100, martin.wilck@ts.fujitsu.com wrote:
> > > From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
> > > 
> > > When probing IRQs, IRQF_SHARED may cause IRQs to be falsely
> > > detected if other devices generate IRQ events while tpm_tis is
> > > probing or testing. I have seen this on my test machine repeatedly.
> > > 
> > > The sequence of events in the errror case is as follows:
> > > 
> > >  1 TPM IRQ generation is enabled for IRQ X and test command sent
> > >  2 TPM finishes command and sets data ready / IRQ flags, but the IRQ
> > > doesn't arrive because it's not configured in the system. Normally this
> > > would cause a timeout and the IRQ would be found not to work, but...
> > >  3 The other device triggers an IRQ X, causing tis_int_probe() to get
> > > called and find the IRQ flags set. Now we conclude that the IRQ seems to
> > > work, with is wrong.
> > > 
> > > It is impossible to distinguish this condition from a case where
> > > the TPM IRQ actually worked.
> > > 
> > > Therefore, refrain from probing IRQs that are already used by
> > > other devices by default. Use "interrupts=2" module parameter
> > > to obtain the previous behavior.
> > 
> > It's handled by not setting chip->vendor.irq in the tis_int_probe but
> > setting chp->vendor.probed_irq there, not chip>-vendor.irq directly.
> > Also in polling mode tpm_transmit() is synchronous.
>  
> I still don't quite get why that prevents the problem I was describing.
> chip->vendor.probed_irq is set when tpm2_gen_interrupt returns, causing
> the probing routine to conclude erroneously that the IRQ worked. That's
> the problem I observed.
> 
> *HOWEVER*, I may have been mislead by some of my own changes which I
> reverted in the meantime, and which may have inflated the likelyhood of
> false positives at this point.
> 
> > I'm not concluding anything here that would make me take this change.
> 
> I just retried with your latest master branch (including those of my
> patches that you accepted) and couldn't reproduce the problem any more.
> I got not a single false positive in 100 rmmod/insmod cycles.
> 
> Thus it's probably safe to drop this one. Thanks for your thorough
> review.

My suggestion here is that we end this discussion for now. There is too
much speculation. If there is ever reproducible test case that shows a
regression, then this can be reconsidered.

> Regards
> Martin

/Jarkko

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jason Gunthorpe Nov. 23, 2015, 11:30 p.m. UTC | #4
On Sat, Nov 21, 2015 at 03:30:20PM +0200, Jarkko Sakkinen wrote:
> On Fri, Nov 20, 2015 at 02:32:34PM +0100, martin.wilck@ts.fujitsu.com wrote:
> > From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
> > 
> > When probing IRQs, IRQF_SHARED may cause IRQs to be falsely
> > detected if other devices generate IRQ events while tpm_tis is
> > probing or testing. I have seen this on my test machine repeatedly.
> > 
> > The sequence of events in the errror case is as follows:
> > 
> >  1 TPM IRQ generation is enabled for IRQ X and test command sent
> >  2 TPM finishes command and sets data ready / IRQ flags, but the IRQ
> > doesn't arrive because it's not configured in the system. Normally this
> > would cause a timeout and the IRQ would be found not to work, but...
> >  3 The other device triggers an IRQ X, causing tis_int_probe() to get
> > called and find the IRQ flags set. Now we conclude that the IRQ seems to
> > work, with is wrong.
> > 
> > It is impossible to distinguish this condition from a case where
> > the TPM IRQ actually worked.
> > 
> > Therefore, refrain from probing IRQs that are already used by
> > other devices by default. Use "interrupts=2" module parameter
> > to obtain the previous behavior.
> 
> It's handled by not setting chip->vendor.irq in the tis_int_probe but
> setting chp->vendor.probed_irq there, not chip>-vendor.irq directly.
> Also in polling mode tpm_transmit() is synchronous.

The scenario Martin describes is absolutely possible, and it is
certainly not correct to try to probe an IRQ line that is shared by
another driver.

It is obvious it was a bug for the probing patch to use IRQF_SHARED in
the first place, so this patch should go with a proper Fixes line. I
would drop the interrupts=2 stuff too, there is no reason ever to
probe on a shared interrupt.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 24, 2015, 8:56 a.m. UTC | #5
On Mon, Nov 23, 2015 at 04:30:55PM -0700, Jason Gunthorpe wrote:
> On Sat, Nov 21, 2015 at 03:30:20PM +0200, Jarkko Sakkinen wrote:
> > On Fri, Nov 20, 2015 at 02:32:34PM +0100, martin.wilck@ts.fujitsu.com wrote:
> > > From: Martin Wilck <Martin.Wilck@ts.fujitsu.com>
> > > 
> > > When probing IRQs, IRQF_SHARED may cause IRQs to be falsely
> > > detected if other devices generate IRQ events while tpm_tis is
> > > probing or testing. I have seen this on my test machine repeatedly.
> > > 
> > > The sequence of events in the errror case is as follows:
> > > 
> > >  1 TPM IRQ generation is enabled for IRQ X and test command sent
> > >  2 TPM finishes command and sets data ready / IRQ flags, but the IRQ
> > > doesn't arrive because it's not configured in the system. Normally this
> > > would cause a timeout and the IRQ would be found not to work, but...
> > >  3 The other device triggers an IRQ X, causing tis_int_probe() to get
> > > called and find the IRQ flags set. Now we conclude that the IRQ seems to
> > > work, with is wrong.
> > > 
> > > It is impossible to distinguish this condition from a case where
> > > the TPM IRQ actually worked.
> > > 
> > > Therefore, refrain from probing IRQs that are already used by
> > > other devices by default. Use "interrupts=2" module parameter
> > > to obtain the previous behavior.
> > 
> > It's handled by not setting chip->vendor.irq in the tis_int_probe but
> > setting chp->vendor.probed_irq there, not chip>-vendor.irq directly.
> > Also in polling mode tpm_transmit() is synchronous.
> 
> The scenario Martin describes is absolutely possible, and it is
> certainly not correct to try to probe an IRQ line that is shared by
> another driver.

It is fine to call request_irq() with IRQF_SHARED flag as long as the
'dev' parameter is unique.


> It is obvious it was a bug for the probing patch to use IRQF_SHARED in
> the first place, so this patch should go with a proper Fixes line. I
> would drop the interrupts=2 stuff too, there is no reason ever to
> probe on a shared interrupt.

If it is obvious, then it should be easy to describe a sequence of
actions even if reproducing would be nearly impossible.

tis_int_probe() might get invoked after this line:

	chip->vendor.irq = chip->vendor.probed_irq;

This would cause working IRQ to be skipped because of race. Worst
case scenarion would be driver using the polling mode.

> Jason

/Jarkko

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jason Gunthorpe Nov. 24, 2015, 6:20 p.m. UTC | #6
On Tue, Nov 24, 2015 at 10:56:39AM +0200, Jarkko Sakkinen wrote:
> > The scenario Martin describes is absolutely possible, and it is
> > certainly not correct to try to probe an IRQ line that is shared by
> > another driver.
> 
> It is fine to call request_irq() with IRQF_SHARED flag as long as the
> 'dev' parameter is unique.

Eh? What does that have to do with anything?

IRQF_SHARED means other drivers can be listening on the IRQ. It means
other devices can be causing interrupts.

Probing requires that the only possible interrupt source be the device
we are probing. It is fundemantally incompatible with probing to have
another active driver on the same IRQ.

> If it is obvious, then it should be easy to describe a sequence of
> actions even if reproducing would be nearly impossible.

Martin's description made sense to me.

> This would cause working IRQ to be skipped because of race. Worst
> case scenarion would be driver using the polling mode.

No, worst case scenario is the driver falsely detected an interrupt
for the TPM (generated by the shared device, not the TPM) and tries to
use interrupt mode when it is, in fact, not using the correct
interrupt. ie the tpm is broken.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jason Gunthorpe Nov. 24, 2015, 6:44 p.m. UTC | #7
On Tue, Nov 24, 2015 at 10:56:39AM +0200, Jarkko Sakkinen wrote:
> > The scenario Martin describes is absolutely possible, and it is
> > certainly not correct to try to probe an IRQ line that is shared by
> > another driver.
> 
> It is fine to call request_irq() with IRQF_SHARED flag as long as the
> 'dev' parameter is unique.

I just want to clarify that we are talking about the same
devm_request_irq.

The first one:
                for (i = irq_s; i <= irq_e && chip->vendor.irq == 0; i++) {
                        iowrite8(i, chip->vendor.iobase +
                                 TPM_INT_VECTOR(chip->vendor.locality));
                        if (devm_request_irq
                            (dev, i, tis_int_probe, IRQF_SHARED,

Is trying to guess an IRQ without it being specified. This process
should never use IRQF_SHARED, it is simply not reliable if the IRQ has
another driver attached.

The second one for a directly specified irq:

        if (chip->vendor.irq) {
                if (devm_request_irq
                    (dev, chip->vendor.irq, tis_int_handler, IRQF_SHARED,
                     chip->devname, chip) != 0) {

Must stay as IRQF_SHARED.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 24, 2015, 7:40 p.m. UTC | #8
On Tue, Nov 24, 2015 at 11:20:14AM -0700, Jason Gunthorpe wrote:
> On Tue, Nov 24, 2015 at 10:56:39AM +0200, Jarkko Sakkinen wrote:
> > > The scenario Martin describes is absolutely possible, and it is
> > > certainly not correct to try to probe an IRQ line that is shared by
> > > another driver.
> > 
> > It is fine to call request_irq() with IRQF_SHARED flag as long as the
> > 'dev' parameter is unique.
> 
> Eh? What does that have to do with anything?
> 
> IRQF_SHARED means other drivers can be listening on the IRQ. It means
> other devices can be causing interrupts.
> 
> Probing requires that the only possible interrupt source be the device
> we are probing. It is fundemantally incompatible with probing to have
> another active driver on the same IRQ.
> 
> > If it is obvious, then it should be easy to describe a sequence of
> > actions even if reproducing would be nearly impossible.
> 
> Martin's description made sense to me.
> 
> > This would cause working IRQ to be skipped because of race. Worst
> > case scenarion would be driver using the polling mode.
> 
> No, worst case scenario is the driver falsely detected an interrupt
> for the TPM (generated by the shared device, not the TPM) and tries to
> use interrupt mode when it is, in fact, not using the correct
> interrupt. ie the tpm is broken.

Sorry but I'm still not following. I'll try to explain.

tis_int_probe() checks int_status register and only if it is cleared it
will set chip->vendor.probed_irq. What is the scenario to pass this
guard without TPM not having generated an interrupt?

> Jason

/Jarkko

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jason Gunthorpe Nov. 24, 2015, 7:56 p.m. UTC | #9
On Tue, Nov 24, 2015 at 09:40:24PM +0200, Jarkko Sakkinen wrote:
 
> tis_int_probe() checks int_status register and only if it is cleared it
> will set chip->vendor.probed_irq. What is the scenario to pass this
> guard without TPM not having generated an interrupt?

No, it isn't the TPM generating an interrupt we care about. The TPM
always generates an interrupt if configured. The status register is
not relevant.

We are looking to find out where the CPU receives the interrupt.

When IRQF_SHARED is used the handlers become chained. Any irq from any
device on the share will cause all handlers to run. From your comments
I'm not sure you realize this.

If multiple devices can cause tis_int_probe to run then we can no
longer be certain that it is running because of the TPM device. It may
be running due to another device on the share and the TPM interrupt is
actually going someplace else.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 24, 2015, 9:23 p.m. UTC | #10
On Tue, Nov 24, 2015 at 12:56:09PM -0700, Jason Gunthorpe wrote:
> On Tue, Nov 24, 2015 at 09:40:24PM +0200, Jarkko Sakkinen wrote:
>  
> > tis_int_probe() checks int_status register and only if it is cleared it
> > will set chip->vendor.probed_irq. What is the scenario to pass this
> > guard without TPM not having generated an interrupt?
> 
> No, it isn't the TPM generating an interrupt we care about. The TPM
> always generates an interrupt if configured. The status register is
> not relevant.
> 
> We are looking to find out where the CPU receives the interrupt.
> 
> When IRQF_SHARED is used the handlers become chained. Any irq from any
> device on the share will cause all handlers to run. From your comments
> I'm not sure you realize this.

Yes, I'm fully aware of this. I guess I'm turning a blind eye to
something else since this discussion is still ongoing...

> If multiple devices can cause tis_int_probe to run then we can no
> longer be certain that it is running because of the TPM device. It may
> be running due to another device on the share and the TPM interrupt is
> actually going someplace else.

That's why the handler checks the int_status register in tis_int_probe
like it does in tis_int_handler.

> Jason

/Jarkko

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jason Gunthorpe Nov. 24, 2015, 9:30 p.m. UTC | #11
On Tue, Nov 24, 2015 at 11:23:59PM +0200, Jarkko Sakkinen wrote:
> > If multiple devices can cause tis_int_probe to run then we can no
> > longer be certain that it is running because of the TPM device. It may
> > be running due to another device on the share and the TPM interrupt is
> > actually going someplace else.
> 
> That's why the handler checks the int_status register in tis_int_probe
> like it does in tis_int_handler.

No, int_status doesn't do that!

int_status only means the interrupt was *SENT* from the TPM, it says
nothing about how it was received by the CPU.

It will *always* be set after a command runs, testing it proves nothing.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 25, 2015, 6:37 a.m. UTC | #12
On Tue, Nov 24, 2015 at 02:30:05PM -0700, Jason Gunthorpe wrote:
> On Tue, Nov 24, 2015 at 11:23:59PM +0200, Jarkko Sakkinen wrote:
> > > If multiple devices can cause tis_int_probe to run then we can no
> > > longer be certain that it is running because of the TPM device. It may
> > > be running due to another device on the share and the TPM interrupt is
> > > actually going someplace else.
> > 
> > That's why the handler checks the int_status register in tis_int_probe
> > like it does in tis_int_handler.
> 
> No, int_status doesn't do that!
> 
> int_status only means the interrupt was *SENT* from the TPM, it says
> nothing about how it was received by the CPU.
> 
> It will *always* be set after a command runs, testing it proves nothing.

Isn't that the information that you want to know that interrupts was
sent from the TPM? There is only one TIS driver. What else is there to
prove? The exact same guard is also in tis_int_handler.

I don't think there are possible races between iterations because flags
are cleared after each iteration.

> Jason

/Jarkko

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jason Gunthorpe Nov. 25, 2015, 6:51 a.m. UTC | #13
On Wed, Nov 25, 2015 at 08:37:16AM +0200, Jarkko Sakkinen wrote:
> > 
> > It will *always* be set after a command runs, testing it proves nothing.
> 
> Isn't that the information that you want to know that interrupts was
> sent from the TPM?

I'm not sure how to explain this more clearly.

The goal of irq probing is to learn where/if the interrupt *from* the
TPM is *received* by the CPU. I am testing the *CPU*, not the TPM.

I *know* the tpm_tis driver is causing the TPM to raise an interrupt.

I *know* that the TPM will report it sent an interrupt in its status
register. This happens independently of what the CPU does.

I *don't know* where that interrupt will land at the CPU, or even if
it reaches the CPU at all.

The test is to make the CPU listen for a specific IRQ then make the
TPM send an IRQ.

If the CPU listens for a specific IRQ *and something else* sends an
IRQ then it will run the tpm ISR, see the set status register, and
assume the TPM sent it, even though it didn't.

It is critical that during probing *nothing* else raises the IRQ.

IRQF_SHARED means other things can raise the IRQ. That is what it
does.

Jason

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 25, 2015, 9:58 a.m. UTC | #14
On Tue, Nov 24, 2015 at 11:51:43PM -0700, Jason Gunthorpe wrote:
> On Wed, Nov 25, 2015 at 08:37:16AM +0200, Jarkko Sakkinen wrote:
> > > 
> > > It will *always* be set after a command runs, testing it proves nothing.
> > 
> > Isn't that the information that you want to know that interrupts was
> > sent from the TPM?
> 
> I'm not sure how to explain this more clearly.
> 
> The goal of irq probing is to learn where/if the interrupt *from* the
> TPM is *received* by the CPU. I am testing the *CPU*, not the TPM.
> 
> I *know* the tpm_tis driver is causing the TPM to raise an interrupt.
> 
> I *know* that the TPM will report it sent an interrupt in its status
> register. This happens independently of what the CPU does.
> 
> I *don't know* where that interrupt will land at the CPU, or even if
> it reaches the CPU at all.
> 
> The test is to make the CPU listen for a specific IRQ then make the
> TPM send an IRQ.
> 
> If the CPU listens for a specific IRQ *and something else* sends an
> IRQ then it will run the tpm ISR, see the set status register, and
> assume the TPM sent it, even though it didn't.

The int_status is cleared before enabling the interrupts in the line 735
[1]. TPM interrupt is enabled after that. If something else sends an
IRQ, are you arguing that it would change in int_status? I doubt it.

> It is critical that during probing *nothing* else raises the IRQ.
> 
> IRQF_SHARED means other things can raise the IRQ. That is what it
> does.
> 
> Jason

/Jarkko

[1] http://lxr.free-electrons.com/source/drivers/char/tpm/tpm_tis.c

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Martin Wilck Nov. 25, 2015, 11:38 a.m. UTC | #15
> > I'm not sure how to explain this more clearly.
> > 
> > The goal of irq probing is to learn where/if the interrupt *from* the
> > TPM is *received* by the CPU. I am testing the *CPU*, not the TPM.
> > 
> > I *know* the tpm_tis driver is causing the TPM to raise an interrupt.
> > 
> > I *know* that the TPM will report it sent an interrupt in its status
> > register. This happens independently of what the CPU does.
> > 
> > I *don't know* where that interrupt will land at the CPU, or even if
> > it reaches the CPU at all.
> > 
> > The test is to make the CPU listen for a specific IRQ then make the
> > TPM send an IRQ.
> > 
> > If the CPU listens for a specific IRQ *and something else* sends an
> > IRQ then it will run the tpm ISR, see the set status register, and
> > assume the TPM sent it, even though it didn't.
> 
> The int_status is cleared before enabling the interrupts in the line 735
> [1]. TPM interrupt is enabled after that. If something else sends an
> IRQ, are you arguing that it would change in int_status? I doubt it.

Of course not. The TPM does trigger an IRQ when it finishes processing
the test command, and sets the int_status flags in that process. This
IRQ just never arrives at the CPU. Instead, an IRQ from the other device
arrives.

 a) The TPM completes command, sets INT_STATUS and generates an IRQ X.
 b) The CPU receives IRQ X and finds INT_STATUS bits set in the TPM.
------------
 c) The CPU has received the IRQ X generated by the TPM.

We know a) and b) are TRUE. But if another device is sharing IRQ X, we
still can't be sure that c) holds, too.

Martin

------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741551&iu=/4140
Jarkko Sakkinen Nov. 28, 2015, 11:57 a.m. UTC | #16
On Wed, Nov 25, 2015 at 12:38:31PM +0100, Wilck, Martin wrote:
> > > I'm not sure how to explain this more clearly.
> > > 
> > > The goal of irq probing is to learn where/if the interrupt *from* the
> > > TPM is *received* by the CPU. I am testing the *CPU*, not the TPM.
> > > 
> > > I *know* the tpm_tis driver is causing the TPM to raise an interrupt.
> > > 
> > > I *know* that the TPM will report it sent an interrupt in its status
> > > register. This happens independently of what the CPU does.
> > > 
> > > I *don't know* where that interrupt will land at the CPU, or even if
> > > it reaches the CPU at all.
> > > 
> > > The test is to make the CPU listen for a specific IRQ then make the
> > > TPM send an IRQ.
> > > 
> > > If the CPU listens for a specific IRQ *and something else* sends an
> > > IRQ then it will run the tpm ISR, see the set status register, and
> > > assume the TPM sent it, even though it didn't.
> > 
> > The int_status is cleared before enabling the interrupts in the line 735
> > [1]. TPM interrupt is enabled after that. If something else sends an
> > IRQ, are you arguing that it would change in int_status? I doubt it.
> 
> Of course not. The TPM does trigger an IRQ when it finishes processing
> the test command, and sets the int_status flags in that process. This
> IRQ just never arrives at the CPU. Instead, an IRQ from the other device
> arrives.
> 
>  a) The TPM completes command, sets INT_STATUS and generates an IRQ X.
>  b) The CPU receives IRQ X and finds INT_STATUS bits set in the TPM.
> ------------
>  c) The CPU has received the IRQ X generated by the TPM.
> 
> We know a) and b) are TRUE. But if another device is sharing IRQ X, we
> still can't be sure that c) holds, too.

Got it! Thanks. I understand now (finally) the root cause. I was
travelling in the latter part of week and busy with other stuff. That's
why there's been some delay. I'll try to go through your and Jasons
patches ASAP (there is a load of them).

> Martin

/Jarkko

------------------------------------------------------------------------------

Patch
diff mbox

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 360bccc..54d9080 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -622,9 +622,9 @@  static irqreturn_t tis_int_handler(int dummy, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static bool interrupts = true;
-module_param(interrupts, bool, 0444);
-MODULE_PARM_DESC(interrupts, "Enable interrupts");
+static int interrupts = 1;
+module_param(interrupts, int, 0444);
+MODULE_PARM_DESC(interrupts, "1 - enable interrupts (default), 0 - disable interrupts, 2 - use shared interrupts when probing");
 
 static void tpm_tis_remove(struct tpm_chip *chip)
 {
@@ -784,7 +784,8 @@  static int tpm_tis_init(struct device *dev, struct tpm_info *tpm_info,
 			iowrite8(i, chip->vendor.iobase +
 				 TPM_INT_VECTOR(chip->vendor.locality));
 			if (devm_request_irq
-			    (dev, i, tis_int_probe, IRQF_SHARED,
+			    (dev, i, tis_int_probe,
+					interrupts == 2 ? IRQF_SHARED : 0,
 			     chip->devname, chip) != 0) {
 				dev_info(chip->pdev,
 					 "Unable to request irq: %d for probe (this is non-fatal)\n",