[tpmdd-devel] char/tpm: Check return code of wait_for_tpm_stat

Message ID 1476187261-29027-1-git-send-email-jarkko.sakkinen@linux.intel.com
State New

Commit Message

Jarkko Sakkinen Oct. 11, 2016, 12:01 p.m. UTC
From: Peter Huewe <peterhuewe@gmx.de>

In some weird cases it might be possible that the TPM does not set
STS.VALID within the given timeout (or ever) but sets STS.EXPECT
(STS=0x0C). In this case the driver gets stuck in the while loop of
tpm_tis_send_data and loops endlessly.

Checking the return value of wait_for_tpm_stat fixes this and the driver
bails out correctly. While at it, fix all other users: if the TPM does
not manage to set STS.VALID within a reasonable timeframe, something is
definitely wrong and the driver should react correctly.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
---
 drivers/char/tpm/tpm_tis_core.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)
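
The failure mode described in the commit message can be sketched in user space. This is a minimal model, not the driver code: struct fake_tpm, fake_wait_for_stat() and fake_send_step() are hypothetical names, and the poll counter stands in for the real timeout handling. Only the bit values (0x80 for STS.VALID, 0x08 for STS.EXPECT) and the -ETIME/-EIO results follow the driver and the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Status register bits, with the values the Linux TIS driver uses. */
#define STS_VALID       0x80
#define STS_DATA_EXPECT 0x08

/* Hypothetical stand-in for a chip whose status register never changes. */
struct fake_tpm {
	uint8_t sts;
};

/*
 * Simplified model of wait_for_tpm_stat(): poll until every bit in mask
 * is set, or give up after max_polls attempts. The real function waits
 * on a timeout (and optionally an interrupt), not on a poll counter.
 */
static int fake_wait_for_stat(const struct fake_tpm *tpm, uint8_t mask,
			      unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if ((tpm->sts & mask) == mask)
			return 0;
	}
	return -ETIME;
}

/*
 * One step of a send loop in the style of the patched driver: the wait
 * result is checked first, so a chip that never reports STS.VALID ends
 * the transfer instead of being polled again.
 */
static int fake_send_step(const struct fake_tpm *tpm)
{
	if (fake_wait_for_stat(tpm, STS_VALID, 100) < 0)
		return -ETIME;
	if ((tpm->sts & STS_DATA_EXPECT) == 0)
		return -EIO;
	return 0;
}

/* Usage: STS=0x0C has STS.EXPECT set but STS.VALID clear. */
static void demo_stuck_chip(void)
{
	struct fake_tpm stuck = { .sts = 0x0C };

	assert(fake_send_step(&stuck) == -ETIME);
}
```

With the wait result ignored, the model would go straight to the STS.EXPECT test, which passes for 0x0C, so nothing in the step itself reports the missing STS.VALID.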

Comments

Jason Gunthorpe Oct. 11, 2016, 5:13 p.m. UTC | #1
On Tue, Oct 11, 2016 at 03:01:01PM +0300, Jarkko Sakkinen wrote:
> From: Peter Huewe <peterhuewe@gmx.de>
> 
> In some weird cases it might be possible that the TPM does not set
> STS.VALID within the given timeout time (or ever) but sets STS.EXPECT
> (STS=0x0C) In this case the driver gets stuck in the while loop of
> tpm_tis_send_data and loops endlessly.

Doesn't that exchange mean the TPM has lost synchronization with the
driver? Or maybe it crashed executing a command or something..

Please indicate what hardware is broken like this.. Or how did you get
it to do this?

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Peter Hüwe Oct. 11, 2016, 6:01 p.m. UTC | #2
Hi
On October 11, 2016 19:13:13 CEST, Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote:
>On Tue, Oct 11, 2016 at 03:01:01PM +0300, Jarkko Sakkinen wrote:
>> From: Peter Huewe <peterhuewe@gmx.de>
>> 
>> In some weird cases it might be possible that the TPM does not set
>> STS.VALID within the given timeout time (or ever) but sets STS.EXPECT
>> (STS=0x0C) In this case the driver gets stuck in the while loop of
>> tpm_tis_send_data and loops endlessly.
>
>Doesn't that exchange mean the TPM has lost synchronization with the
>driver? Or maybe it crashed executing a command or something..

I saw that in the field on quite a few (similar) systems with our LPC TPMs, so it affects end users.
Yes, it is caused by some desynchronization or something similar.

If you manually send a commandReady by mmap()ing the memory region, you can unstick the driver, and the situation was never seen again on that system.

The exact cause is still unknown, but the driver should definitely not get stuck in an endless loop in that case (which also turns the application into a zombie); it should bail out as defined in the TIS protocol. The next access sends the commandReady, which cures the desynchronization.

Peter
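
The manual recovery mentioned above amounts to a single register write. The sketch below is an illustration under assumptions, not a tested recovery tool: tis_write_command_ready() is a hypothetical helper, and it expects a pointer to the start of the locality 0 register window. The offset 0x18 (TPM_STS) and the value 0x40 (commandReady) match the definitions in the Linux TIS driver. How the window gets mapped is left out; on many PC platforms it sits at physical address 0xFED40000, but treat that address, and the ability to mmap() it from user space at all, as assumptions to verify on the target system.

```c
#include <assert.h>
#include <stdint.h>

#define TIS_STS_OFFSET        0x18 /* TPM_STS register, locality 0 */
#define TIS_STS_COMMAND_READY 0x40 /* commandReady bit */

/*
 * Hypothetical helper: write commandReady to the status register.
 * locality_base must point at the mapped locality 0 register window.
 */
static void tis_write_command_ready(volatile uint8_t *locality_base)
{
	locality_base[TIS_STS_OFFSET] = TIS_STS_COMMAND_READY;
}

/*
 * Usage against a plain buffer instead of real hardware. A buffer only
 * stores the written value; a real TPM would react to the write and
 * update its status on its own.
 */
static void demo_command_ready(void)
{
	uint8_t fake_regs[0x20] = { 0 };

	fake_regs[TIS_STS_OFFSET] = 0x0C; /* the stuck status from the report */
	tis_write_command_ready(fake_regs);
	assert(fake_regs[TIS_STS_OFFSET] == TIS_STS_COMMAND_READY);
}
```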
Jarkko Sakkinen Oct. 12, 2016, 12:16 p.m. UTC | #3
On Tue, Oct 11, 2016 at 08:01:09PM +0200, Peter Huewe wrote:
> 
> 
> Hi
> On October 11, 2016 19:13:13 CEST, Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote:
> >On Tue, Oct 11, 2016 at 03:01:01PM +0300, Jarkko Sakkinen wrote:
> >> From: Peter Huewe <peterhuewe@gmx.de>
> >> 
> >> In some weird cases it might be possible that the TPM does not set
> >> STS.VALID within the given timeout time (or ever) but sets STS.EXPECT
> >> (STS=0x0C) In this case the driver gets stuck in the while loop of
> >> tpm_tis_send_data and loops endlessly.
> >
> >Doesn't that exchange mean the TPM has lost synchronization with the
> >driver? Or maybe it crashed executing a command or something..
> 
> I saw that in the field on quite a few (similar) systems with our lpc tpms - so it affects end users.
> Yes it is caused by some desynchronization or something similar.
> 
> If you manually send a commandReady by mmaping the memory region you can un-stuck the driver and the situation was never seen again on that system.
> 
> The exact reason how this happens is yet unknown, but the driver should definitely not be stuck in an endless loop (which zombies the application too) in that case but bail out as defined in the TIS protocol. The next access sends the cr which cures the unsynchronization.

Even just as a sanity check, return codes should be checked, so in
any case I lean towards applying this patch. It makes the
driver more robust.

/Jarkko

Jarkko Sakkinen Oct. 21, 2016, 3:35 p.m. UTC | #4
On Wed, Oct 12, 2016 at 03:16:06PM +0300, Jarkko Sakkinen wrote:
> On Tue, Oct 11, 2016 at 08:01:09PM +0200, Peter Huewe wrote:
> > 
> > 
> > Hi
> > On October 11, 2016 19:13:13 CEST, Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote:
> > >On Tue, Oct 11, 2016 at 03:01:01PM +0300, Jarkko Sakkinen wrote:
> > >> From: Peter Huewe <peterhuewe@gmx.de>
> > >> 
> > >> In some weird cases it might be possible that the TPM does not set
> > >> STS.VALID within the given timeout time (or ever) but sets STS.EXPECT
> > >> (STS=0x0C) In this case the driver gets stuck in the while loop of
> > >> tpm_tis_send_data and loops endlessly.
> > >
> > >Doesn't that exchange mean the TPM has lost synchronization with the
> > >driver? Or maybe it crashed executing a command or something..
> > 
> > I saw that in the field on quite a few (similar) systems with our lpc tpms - so it affects end users.
> > Yes it is caused by some desynchronization or something similar.
> > 
> > If you manually send a commandReady by mmaping the memory region you can un-stuck the driver and the situation was never seen again on that system.
> > 
> > The exact reason how this happens is yet unknown, but the driver should definitely not be stuck in an endless loop (which zombies the application too) in that case but bail out as defined in the TIS protocol. The next access sends the cr which cures the unsynchronization.
> 
> Even as a sanity check return codes should be checked so in
> any case I leaned towards applying this patch. It makes the
> driver more robust.

I applied this.

/Jarkko


Patch

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index e3bf31b..73f4c4b 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -180,11 +180,13 @@  static int recv_data(struct tpm_chip *chip, u8 *buf, size_t count)
 	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
 	int size = 0, burstcnt, rc;
 
-	while (size < count &&
-	       wait_for_tpm_stat(chip,
+	while (size < count) {
+		rc = wait_for_tpm_stat(chip,
 				 TPM_STS_DATA_AVAIL | TPM_STS_VALID,
 				 chip->timeout_c,
-				 &priv->read_queue, true) == 0) {
+				 &priv->read_queue, true);
+		if (rc < 0)
+			return rc;
 		burstcnt = min_t(int, get_burstcount(chip), count - size);
 
 		rc = tpm_tis_read_bytes(priv, TPM_DATA_FIFO(priv->locality),
@@ -229,8 +231,11 @@  static int tpm_tis_recv(struct tpm_chip *chip, u8 *buf, size_t count)
 		goto out;
 	}
 
-	wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
-			  &priv->int_queue, false);
+	if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
+				&priv->int_queue, false) < 0) {
+		size = -ETIME;
+		goto out;
+	}
 	status = tpm_tis_status(chip);
 	if (status & TPM_STS_DATA_AVAIL) {	/* retry? */
 		dev_err(&chip->dev, "Error left over data\n");
@@ -279,8 +284,11 @@  static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
 
 		count += burstcnt;
 
-		wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
-				  &priv->int_queue, false);
+		if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
+					&priv->int_queue, false) < 0) {
+			rc = -ETIME;
+			goto out_err;
+		}
 		status = tpm_tis_status(chip);
 		if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
 			rc = -EIO;
@@ -293,8 +301,11 @@  static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
 	if (rc < 0)
 		goto out_err;
 
-	wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
-			  &priv->int_queue, false);
+	if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
+				&priv->int_queue, false) < 0) {
+		rc = -ETIME;
+		goto out_err;
+	}
 	status = tpm_tis_status(chip);
 	if (!itpm && (status & TPM_STS_DATA_EXPECT) != 0) {
 		rc = -EIO;