PCI/switchtec: Fix init_completion race condition with poll_wait()

Message ID 20200313183608.2646-1-logang@deltatee.com
State New
Series PCI/switchtec: Fix init_completion race condition with poll_wait()

Commit Message

Logan Gunthorpe March 13, 2020, 6:36 p.m. UTC
The call to init_completion() in mrpc_queue_cmd() can theoretically
race with the call to poll_wait() in switchtec_dev_poll().

  poll()                        write()
    switchtec_dev_poll()          switchtec_dev_write()
      poll_wait(&s->comp.wait);     mrpc_queue_cmd()
                                      init_completion(&s->comp)
                                        init_waitqueue_head(&s->comp.wait)

To my knowledge, no one has hit this bug, but we should fix it for
correctness.

Fix this by using reinit_completion() instead of init_completion() in
mrpc_queue_cmd().

Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/pci/switch/switchtec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
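
The difference between the two helpers can be illustrated with a small
userspace model. This is only a sketch, not kernel code: toy_completion,
toy_poll_wait() and the other toy_*/list_* names are hypothetical stand-ins.
It assumes, as the call chain in the commit message indicates, that
init_completion() re-initializes the wait queue head, while
reinit_completion() only resets the done counter. A waiter registered by
poll() before the reset is dropped in the first case and kept in the second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular list head, loosely modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Count the entries reachable from the head. */
static size_t list_count(const struct list_node *head)
{
	size_t n = 0;

	for (const struct list_node *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Toy completion (hypothetical): a done counter plus a list of waiters. */
struct toy_completion {
	unsigned int done;
	struct list_node waiters;
};

/* Models init_completion(): resets the counter AND the waiter list. */
static void toy_init_completion(struct toy_completion *c)
{
	c->done = 0;
	list_init(&c->waiters);
}

/* Models reinit_completion(): resets only the counter. */
static void toy_reinit_completion(struct toy_completion *c)
{
	c->done = 0;
}

/* Models poll_wait(): registers a waiter on the completion's list. */
static void toy_poll_wait(struct toy_completion *c, struct list_node *waiter)
{
	list_add_tail(waiter, &c->waiters);
}

/*
 * Replays the interleaving from the commit message sequentially:
 * the poller registers first, then the writer resets the completion.
 * Returns how many waiters are still reachable afterwards.
 */
size_t waiters_after_reset(bool use_reinit)
{
	struct toy_completion comp;
	struct list_node poller;

	toy_init_completion(&comp);		/* one-time setup */
	toy_poll_wait(&comp, &poller);		/* poll() side */
	assert(list_count(&comp.waiters) == 1);

	if (use_reinit)				/* write() side */
		toy_reinit_completion(&comp);
	else
		toy_init_completion(&comp);

	return list_count(&comp.waiters);
}
```

Under these assumptions, waiters_after_reset(false) returns 0 because the
poller's entry is no longer reachable from the list head, while
waiters_after_reset(true) returns 1.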

Comments

Thomas Gleixner March 17, 2020, 12:56 a.m. UTC | #1
Logan,

Logan Gunthorpe <logang@deltatee.com> writes:

> The call to init_completion() in mrpc_queue_cmd() can theoretically
> race with the call to poll_wait() in switchtec_dev_poll().
>
>   poll()                        write()
>     switchtec_dev_poll()          switchtec_dev_write()
>       poll_wait(&s->comp.wait);     mrpc_queue_cmd()
>                                       init_completion(&s->comp)
>                                         init_waitqueue_head(&s->comp.wait)

just a nitpick. As you took the liberty to copy the description of the
race, which was btw. discovered by me, verbatim from a changelog written
by someone else w/o providing the courtesy of linking to that original
analysis, allow me the liberty to add the missing link:

Link: https://lore.kernel.org/lkml/20200313174701.148376-4-bigeasy@linutronix.de

> To my knowledge, no one has hit this bug, but we should fix it for
> correctness.

s/,but we should fix/.Fix/ ?

> Fix this by using reinit_completion() instead of init_completion() in
> mrpc_queue_cmd().
>
> Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver")
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>

Acked-by: Thomas Gleixner <tglx@linutronix.de>

@Bjorn: Can you please hold off on this for a few days until we have sorted
        out the remaining issues to avoid potential merge conflicts
        vs. the completion series?

Thanks,

        tglx
Logan Gunthorpe March 17, 2020, 1:25 a.m. UTC | #2
On 2020-03-16 6:56 p.m., Thomas Gleixner wrote:
> Logan,
> 
> Logan Gunthorpe <logang@deltatee.com> writes:
> 
>> The call to init_completion() in mrpc_queue_cmd() can theoretically
>> race with the call to poll_wait() in switchtec_dev_poll().
>>
>>   poll()                        write()
>>     switchtec_dev_poll()          switchtec_dev_write()
>>       poll_wait(&s->comp.wait);     mrpc_queue_cmd()
>>                                       init_completion(&s->comp)
>>                                         init_waitqueue_head(&s->comp.wait)
> 
> just a nitpick. As you took the liberty to copy the description of the
> race, which was btw. discovered by me, verbatim from a changelog written
> by someone else w/o providing the courtesy of linking to that original
> analysis, allow me the liberty to add the missing link:
>
> Link: https://lore.kernel.org/lkml/20200313174701.148376-4-bigeasy@linutronix.de

Well, I just copied the call chain. I had no way to know you were the
one who discovered the bug given the way it was presented to me. And the
original patch didn't include much in the way of analysis of the bug,
just "It's Racy".

I didn't deliberately omit the link; it just never occurred to me to add
it. In retrospect, I should have included it, sorry about that.

>> To my knowledge, no one has hit this bug, but we should fix it for
>> correctness.
> 
> s/,but we should fix/.Fix/ ?

Yes, that's an improvement.

>> Fix this by using reinit_completion() instead of init_completion() in
>> mrpc_queue_cmd().
>>
>> Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver")
>> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> 
> Acked-by: Thomas Gleixner <tglx@linutronix.de>

Thanks.

> @Bjorn: Can you please hold off on this for a few days until we have sorted
>         out the remaining issues to avoid potential merge conflicts
>         vs. the completion series?

I'd suggest simply rebasing the completion patch on this patch, or a
patch like it. Then we'll have the proper bug fix commit and there won't
be a conflict.

Logan
Thomas Gleixner March 17, 2020, 9:05 a.m. UTC | #3
Logan Gunthorpe <logang@deltatee.com> writes:
> On 2020-03-16 6:56 p.m., Thomas Gleixner wrote:
>> @Bjorn: Can you please hold off on this for a few days until we have sorted
>>         out the remaining issues to avoid potential merge conflicts
>>         vs. the completion series?
>
> I'd suggest simply rebasing the completion patch on this patch, or a
> patch like it. Then we'll have the proper bug fix commit and there won't
> be a conflict.

The conflict is not a question of rebasing or not. The conflict arises
when this patch is routed through PCI simply because then the rest of
the completion work is stuck until this hits mainline.

Thanks,

        tglx

Patch

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index a823b4b8ef8a..81dc7ac01381 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -175,7 +175,7 @@  static int mrpc_queue_cmd(struct switchtec_user *stuser)
 	kref_get(&stuser->kref);
 	stuser->read_len = sizeof(stuser->data);
 	stuser_set_state(stuser, MRPC_QUEUED);
-	init_completion(&stuser->comp);
+	reinit_completion(&stuser->comp);
 	list_add_tail(&stuser->list, &stdev->mrpc_queue);
 
 	mrpc_cmd_submit(stdev);