diff mbox

[tpmdd-devel] tpm: vtpm_proxy: Do not run tpm2_shutdown

Message ID 1495717956-14252-1-git-send-email-stefanb@linux.vnet.ibm.com
State New
Headers show

Commit Message

Stefan Berger May 25, 2017, 1:12 p.m. UTC
The tpm2_shutdown does not work with the VTPM proxy driver since the
function only gets called when the backend file descriptor is already
closed and at this point no data can be sent anymore. A proper shutdown
would have to be initated by a user space application, such as a container
management stack, that sends the command via the character device before
terminating the TPM emulator.

To avoid the tpm2_shutdown we introduce a TPM_CHIP_FLAG_NO_SHUTDOWN flag
that only the VTPM proxy driver sets. This also avoids misleading kernel
log messages.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
---
 drivers/char/tpm/tpm.h            | 1 +
 drivers/char/tpm/tpm2-cmd.c       | 3 +++
 drivers/char/tpm/tpm_vtpm_proxy.c | 3 ++-
 3 files changed, 6 insertions(+), 1 deletion(-)

Comments

Jason Gunthorpe May 25, 2017, 3:50 p.m. UTC | #1
On Thu, May 25, 2017 at 09:12:36AM -0400, Stefan Berger wrote:
> The tpm2_shutdown does not work with the VTPM proxy driver since the
> function only gets called when the backend file descriptor is already
> closed and at this point no data can be sent anymore. A proper shutdown
> would have to be initated by a user space application, such as a container
> management stack, that sends the command via the character device before
> terminating the TPM emulator.
> 
> To avoid the tpm2_shutdown we introduce a TPM_CHIP_FLAG_NO_SHUTDOWN flag
> that only the VTPM proxy driver sets. This also avoids misleading kernel
> log messages.

This seems strange to me..

Why isn't ops null if the fd has gone away?

What is the call flow that hits this?

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Stefan Berger May 25, 2017, 8:04 p.m. UTC | #2
On 05/25/2017 11:50 AM, Jason Gunthorpe wrote:
> On Thu, May 25, 2017 at 09:12:36AM -0400, Stefan Berger wrote:
>> The tpm2_shutdown does not work with the VTPM proxy driver since the
>> function only gets called when the backend file descriptor is already
>> closed and at this point no data can be sent anymore. A proper shutdown
>> would have to be initated by a user space application, such as a container
>> management stack, that sends the command via the character device before
>> terminating the TPM emulator.
>>
>> To avoid the tpm2_shutdown we introduce a TPM_CHIP_FLAG_NO_SHUTDOWN flag
>> that only the VTPM proxy driver sets. This also avoids misleading kernel
>> log messages.
> This seems strange to me..
>
> Why isn't ops null if the fd has gone away?
>
> What is the call flow that hits this?

In this function here.

static void tpm_del_char_device(struct tpm_chip *chip)
{
     cdev_device_del(&chip->cdev, &chip->dev);

     /* Make the chip unavailable. */
     mutex_lock(&idr_lock);
     idr_replace(&dev_nums_idr, NULL, chip->dev_num);
     mutex_unlock(&idr_lock);

     /* Make the driver uncallable. */
     down_write(&chip->ops_sem);
     if (chip->flags & TPM_CHIP_FLAG_TPM2)
         tpm2_shutdown(chip, TPM2_SU_CLEAR);
     chip->ops = NULL;
     up_write(&chip->ops_sem);
}

The request cannot be deliver because the anonymous fd has been closed 
already.

     Stefan


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Jason Gunthorpe May 25, 2017, 8:09 p.m. UTC | #3
On Thu, May 25, 2017 at 04:04:24PM -0400, Stefan Berger wrote:
> On 05/25/2017 11:50 AM, Jason Gunthorpe wrote:
> >On Thu, May 25, 2017 at 09:12:36AM -0400, Stefan Berger wrote:
> >>The tpm2_shutdown does not work with the VTPM proxy driver since the
> >>function only gets called when the backend file descriptor is already
> >>closed and at this point no data can be sent anymore. A proper shutdown
> >>would have to be initated by a user space application, such as a container
> >>management stack, that sends the command via the character device before
> >>terminating the TPM emulator.
> >>
> >>To avoid the tpm2_shutdown we introduce a TPM_CHIP_FLAG_NO_SHUTDOWN flag
> >>that only the VTPM proxy driver sets. This also avoids misleading kernel
> >>log messages.
> >This seems strange to me..
> >
> >Why isn't ops null if the fd has gone away?
> >
> >What is the call flow that hits this?
> 
> In this function here.
> 
> static void tpm_del_char_device(struct tpm_chip *chip)
> {
>     cdev_device_del(&chip->cdev, &chip->dev);
> 
>     /* Make the chip unavailable. */
>     mutex_lock(&idr_lock);
>     idr_replace(&dev_nums_idr, NULL, chip->dev_num);
>     mutex_unlock(&idr_lock);
> 
>     /* Make the driver uncallable. */
>     down_write(&chip->ops_sem);
>     if (chip->flags & TPM_CHIP_FLAG_TPM2)
>         tpm2_shutdown(chip, TPM2_SU_CLEAR);
>     chip->ops = NULL;
>     up_write(&chip->ops_sem);
> }
> 
> The request cannot be deliver because the anonymous fd has been closed
> already.

The driver must always be able to process requests until
tpm_del_char_device completes, so this is triggering an existing bug
in vtpm. This change in core behvior is not going to fix the bug.

eg a request from sysfs/etc could come in between vtpm fd closure and
tpm_del_char_device, and it still must be handled properly.

I guess you need to have transmit command fail fast once the fd is
closed.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Stefan Berger May 25, 2017, 8:32 p.m. UTC | #4
On 05/25/2017 04:09 PM, Jason Gunthorpe wrote:
> On Thu, May 25, 2017 at 04:04:24PM -0400, Stefan Berger wrote:
>> On 05/25/2017 11:50 AM, Jason Gunthorpe wrote:
>>> On Thu, May 25, 2017 at 09:12:36AM -0400, Stefan Berger wrote:
>>>> The tpm2_shutdown does not work with the VTPM proxy driver since the
>>>> function only gets called when the backend file descriptor is already
>>>> closed and at this point no data can be sent anymore. A proper shutdown
>>>> would have to be initated by a user space application, such as a container
>>>> management stack, that sends the command via the character device before
>>>> terminating the TPM emulator.
>>>>
>>>> To avoid the tpm2_shutdown we introduce a TPM_CHIP_FLAG_NO_SHUTDOWN flag
>>>> that only the VTPM proxy driver sets. This also avoids misleading kernel
>>>> log messages.
>>> This seems strange to me..
>>>
>>> Why isn't ops null if the fd has gone away?
>>>
>>> What is the call flow that hits this?
>> In this function here.
>>
>> static void tpm_del_char_device(struct tpm_chip *chip)
>> {
>>      cdev_device_del(&chip->cdev, &chip->dev);
>>
>>      /* Make the chip unavailable. */
>>      mutex_lock(&idr_lock);
>>      idr_replace(&dev_nums_idr, NULL, chip->dev_num);
>>      mutex_unlock(&idr_lock);
>>
>>      /* Make the driver uncallable. */
>>      down_write(&chip->ops_sem);
>>      if (chip->flags & TPM_CHIP_FLAG_TPM2)
>>          tpm2_shutdown(chip, TPM2_SU_CLEAR);
>>      chip->ops = NULL;
>>      up_write(&chip->ops_sem);
>> }
>>
>> The request cannot be deliver because the anonymous fd has been closed
>> already.
> The driver must always be able to process requests until
> tpm_del_char_device completes, so this is triggering an existing bug
> in vtpm. This change in core behvior is not going to fix the bug.
>
> eg a request from sysfs/etc could come in between vtpm fd closure and
> tpm_del_char_device, and it still must be handled properly.
>
> I guess you need to have transmit command fail fast once the fd is
> closed.

It doesn't hang. Everything is torn down immediately. What is primarily 
annoying are these two log messages:
tpm tpm0: tpm_transmit: tpm_send: error -32
tpm tpm0: transmit returned -32 while stopping the TPM


     Stefan


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Jason Gunthorpe May 25, 2017, 8:44 p.m. UTC | #5
On Thu, May 25, 2017 at 04:32:50PM -0400, Stefan Berger wrote:

> It doesn't hang. Everything is torn down immediately. What is primarily
> annoying are these two log messages:
> tpm tpm0: tpm_transmit: tpm_send: error -32
> tpm tpm0: transmit returned -32 while stopping the TPM

I think it would be better to change the core to suppress that logging
if the FD is closed.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Stefan Berger May 25, 2017, 8:54 p.m. UTC | #6
On 05/25/2017 04:44 PM, Jason Gunthorpe wrote:
> On Thu, May 25, 2017 at 04:32:50PM -0400, Stefan Berger wrote:
>
>> It doesn't hang. Everything is torn down immediately. What is primarily
>> annoying are these two log messages:
>> tpm tpm0: tpm_transmit: tpm_send: error -32
>> tpm tpm0: transmit returned -32 while stopping the TPM
> I think it would be better to change the core to suppress that logging
> if the FD is closed.

This particular command will never reach anyone listening on the proxy's 
file descriptor since the tear-down only begins when the front- and 
backend are closed.
The logging happens somewhere else than where the error occurs. What is 
the best way to suppress the logging? Remove it entirely  -- probably 
not. Return a special error code that doesn't get logged? Return a 2nd 
parameter that indicates this condition? It's not clear to me. Why not 
just prevent the command from being sent if it will never reach its 
intended destination ?

   Stefan

>
> Jason
>


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Jason Gunthorpe May 25, 2017, 9 p.m. UTC | #7
On Thu, May 25, 2017 at 04:54:01PM -0400, Stefan Berger wrote:

> This particular command will never reach anyone listening on the proxy's
> file descriptor since the tear-down only begins when the front- and backend
> are closed.

> The logging happens somewhere else than where the error occurs. What is the
> best way to suppress the logging? Remove it entirely  -- probably not.
> Return a special error code that doesn't get logged?

Error code would be my choice.

> that indicates this condition? It's not clear to me. Why not just
> prevent the command from being sent if it will never reach its
> intended destination

There are many cases where the vtpm shutdown can race with something
else, if the logging for this is bothersome it should be fixed
directly.

Adding strange special case flags is confusing as to the purpose - eg
your commit message didn't even say this is only about fixing some
noisy logging.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Jarkko Sakkinen May 25, 2017, 10:33 p.m. UTC | #8
On Thu, May 25, 2017 at 04:32:50PM -0400, Stefan Berger wrote:
> On 05/25/2017 04:09 PM, Jason Gunthorpe wrote:
> > On Thu, May 25, 2017 at 04:04:24PM -0400, Stefan Berger wrote:
> > > On 05/25/2017 11:50 AM, Jason Gunthorpe wrote:
> > > > On Thu, May 25, 2017 at 09:12:36AM -0400, Stefan Berger wrote:
> > > > > The tpm2_shutdown does not work with the VTPM proxy driver since the
> > > > > function only gets called when the backend file descriptor is already
> > > > > closed and at this point no data can be sent anymore. A proper shutdown
> > > > > would have to be initated by a user space application, such as a container
> > > > > management stack, that sends the command via the character device before
> > > > > terminating the TPM emulator.
> > > > > 
> > > > > To avoid the tpm2_shutdown we introduce a TPM_CHIP_FLAG_NO_SHUTDOWN flag
> > > > > that only the VTPM proxy driver sets. This also avoids misleading kernel
> > > > > log messages.
> > > > This seems strange to me..
> > > > 
> > > > Why isn't ops null if the fd has gone away?
> > > > 
> > > > What is the call flow that hits this?
> > > In this function here.
> > > 
> > > static void tpm_del_char_device(struct tpm_chip *chip)
> > > {
> > >      cdev_device_del(&chip->cdev, &chip->dev);
> > > 
> > >      /* Make the chip unavailable. */
> > >      mutex_lock(&idr_lock);
> > >      idr_replace(&dev_nums_idr, NULL, chip->dev_num);
> > >      mutex_unlock(&idr_lock);
> > > 
> > >      /* Make the driver uncallable. */
> > >      down_write(&chip->ops_sem);
> > >      if (chip->flags & TPM_CHIP_FLAG_TPM2)
> > >          tpm2_shutdown(chip, TPM2_SU_CLEAR);
> > >      chip->ops = NULL;
> > >      up_write(&chip->ops_sem);
> > > }
> > > 
> > > The request cannot be deliver because the anonymous fd has been closed
> > > already.
> > The driver must always be able to process requests until
> > tpm_del_char_device completes, so this is triggering an existing bug
> > in vtpm. This change in core behvior is not going to fix the bug.
> > 
> > eg a request from sysfs/etc could come in between vtpm fd closure and
> > tpm_del_char_device, and it still must be handled properly.
> > 
> > I guess you need to have transmit command fail fast once the fd is
> > closed.
> 
> It doesn't hang. Everything is torn down immediately. What is primarily
> annoying are these two log messages:
> tpm tpm0: tpm_transmit: tpm_send: error -32
> tpm tpm0: transmit returned -32 while stopping the TPM
> 
> 
>     Stefan

It's been a while since I've looked into vtpm code. Why was fd closed
before the above? I can go this through myself once I'm back in Finland
next week. Just have forgotten this detail and do not have time to study
this right now.

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Stefan Berger May 25, 2017, 11:34 p.m. UTC | #9
On 05/25/2017 06:33 PM, Jarkko Sakkinen wrote:
> On Thu, May 25, 2017 at 04:32:50PM -0400, Stefan Berger wrote:
>> On 05/25/2017 04:09 PM, Jason Gunthorpe wrote:
>>> On Thu, May 25, 2017 at 04:04:24PM -0400, Stefan Berger wrote:
>>>> On 05/25/2017 11:50 AM, Jason Gunthorpe wrote:
>>>>> On Thu, May 25, 2017 at 09:12:36AM -0400, Stefan Berger wrote:
>>>>>> The tpm2_shutdown does not work with the VTPM proxy driver since the
>>>>>> function only gets called when the backend file descriptor is already
>>>>>> closed and at this point no data can be sent anymore. A proper shutdown
>>>>>> would have to be initated by a user space application, such as a container
>>>>>> management stack, that sends the command via the character device before
>>>>>> terminating the TPM emulator.
>>>>>>
>>>>>> To avoid the tpm2_shutdown we introduce a TPM_CHIP_FLAG_NO_SHUTDOWN flag
>>>>>> that only the VTPM proxy driver sets. This also avoids misleading kernel
>>>>>> log messages.
>>>>> This seems strange to me..
>>>>>
>>>>> Why isn't ops null if the fd has gone away?
>>>>>
>>>>> What is the call flow that hits this?
>>>> In this function here.
>>>>
>>>> static void tpm_del_char_device(struct tpm_chip *chip)
>>>> {
>>>>       cdev_device_del(&chip->cdev, &chip->dev);
>>>>
>>>>       /* Make the chip unavailable. */
>>>>       mutex_lock(&idr_lock);
>>>>       idr_replace(&dev_nums_idr, NULL, chip->dev_num);
>>>>       mutex_unlock(&idr_lock);
>>>>
>>>>       /* Make the driver uncallable. */
>>>>       down_write(&chip->ops_sem);
>>>>       if (chip->flags & TPM_CHIP_FLAG_TPM2)
>>>>           tpm2_shutdown(chip, TPM2_SU_CLEAR);
>>>>       chip->ops = NULL;
>>>>       up_write(&chip->ops_sem);
>>>> }
>>>>
>>>> The request cannot be deliver because the anonymous fd has been closed
>>>> already.
>>> The driver must always be able to process requests until
>>> tpm_del_char_device completes, so this is triggering an existing bug
>>> in vtpm. This change in core behvior is not going to fix the bug.
>>>
>>> eg a request from sysfs/etc could come in between vtpm fd closure and
>>> tpm_del_char_device, and it still must be handled properly.
>>>
>>> I guess you need to have transmit command fail fast once the fd is
>>> closed.
>> It doesn't hang. Everything is torn down immediately. What is primarily
>> annoying are these two log messages:
>> tpm tpm0: tpm_transmit: tpm_send: error -32
>> tpm tpm0: transmit returned -32 while stopping the TPM
>>
>>
>>      Stefan
> It's been a while since I've looked into vtpm code. Why was fd closed
> before the above? I can go this through myself once I'm back in Finland

The tear-down only starts when the last file descriptor on /dev/tpm%d 
and the (anonymous) file descriptor on the backend was closed. The 
backend (TPM emulator) can also close before the frontend closes. So it 
can happen that there is no backend anymore that would process commands 
and that would cause error messages being logged for commands sent to 
it. For sure there is no more listener for the TPM2 Shutdown command.


> next week. Just have forgotten this detail and do not have time to study
> this right now.
>
> /Jarkko
>


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
diff mbox

Patch

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 25d9858..23b656f 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -170,6 +170,7 @@  enum tpm_chip_flags {
 	TPM_CHIP_FLAG_IRQ		= BIT(2),
 	TPM_CHIP_FLAG_VIRTUAL		= BIT(3),
 	TPM_CHIP_FLAG_HAVE_TIMEOUTS	= BIT(4),
+	TPM_CHIP_FLAG_NO_SHUTDOWN	= BIT(5),
 };
 
 struct tpm_bios_log {
diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index 3ee6883..495d316 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -831,6 +831,9 @@  void tpm2_shutdown(struct tpm_chip *chip, u16 shutdown_type)
 	struct tpm2_cmd cmd;
 	int rc;
 
+	if (chip->flags & TPM_CHIP_FLAG_NO_SHUTDOWN)
+		return;
+
 	cmd.header.in = tpm2_shutdown_header;
 	cmd.params.startup_in.startup_type = cpu_to_be16(shutdown_type);
 
diff --git a/drivers/char/tpm/tpm_vtpm_proxy.c b/drivers/char/tpm/tpm_vtpm_proxy.c
index 1d877cc..d439ce7 100644
--- a/drivers/char/tpm/tpm_vtpm_proxy.c
+++ b/drivers/char/tpm/tpm_vtpm_proxy.c
@@ -573,7 +573,8 @@  static struct file *vtpm_proxy_create_device(
 	vtpm_proxy_fops_open(file);
 
 	if (proxy_dev->flags & VTPM_PROXY_FLAG_TPM2)
-		proxy_dev->chip->flags |= TPM_CHIP_FLAG_TPM2;
+		proxy_dev->chip->flags |= TPM_CHIP_FLAG_TPM2 |
+					  TPM_CHIP_FLAG_NO_SHUTDOWN;
 
 	vtpm_proxy_work_start(proxy_dev);