diff mbox series

PCI: altera-msi: Remove irq handler and data in one go

Message ID 20201108191140.23227-1-martin@kaiser.cx
State New
Headers show
Series PCI: altera-msi: Remove irq handler and data in one go | expand

Commit Message

Martin Kaiser Nov. 8, 2020, 7:11 p.m. UTC
Replace the two separate calls for removing the irq handler and data with a
single irq_set_chained_handler_and_data() call.

Signed-off-by: Martin Kaiser <martin@kaiser.cx>
---
 drivers/pci/controller/pcie-altera-msi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Bjorn Helgaas Nov. 10, 2020, 9:21 p.m. UTC | #1
[+cc Nicolas, Jingoo, Gustavo, Toan]

On Sun, Nov 08, 2020 at 08:11:40PM +0100, Martin Kaiser wrote:
> Replace the two separate calls for removing the irq handler and data with a
> single irq_set_chained_handler_and_data() call.

This is similar to these:

  36f024ed8fc9 ("PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove")
  5168a73ce32d ("PCI/keystone: Consolidate chained IRQ handler install/remove")
  2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ handler")

and it seems potentially important that this removes the IRQ handler
and data *atomically*, i.e., both are done while holding
irq_get_desc_buslock().  

So I would use this:

  PCI: altera-msi: Fix race in installing chained IRQ handler

  Fix a race where a pending interrupt could be received and the handler
  called before the handler's data has been setup by converting to
  irq_set_chained_handler_and_data().

  See also 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained
  IRQ handler").

to make it clear that this is actually a bug fix, not just a cleanup.

Looks like this should also be done in dw_pcie_free_msi() and
xgene_msi_hwirq_alloc() at the same time?

> Signed-off-by: Martin Kaiser <martin@kaiser.cx>
> ---
>  drivers/pci/controller/pcie-altera-msi.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-altera-msi.c b/drivers/pci/controller/pcie-altera-msi.c
> index e1636f7714ca..42691dd8ebef 100644
> --- a/drivers/pci/controller/pcie-altera-msi.c
> +++ b/drivers/pci/controller/pcie-altera-msi.c
> @@ -204,8 +204,7 @@ static int altera_msi_remove(struct platform_device *pdev)
>  	struct altera_msi *msi = platform_get_drvdata(pdev);
>  
>  	msi_writel(msi, 0, MSI_INTMASK);
> -	irq_set_chained_handler(msi->irq, NULL);
> -	irq_set_handler_data(msi->irq, NULL);
> +	irq_set_chained_handler_and_data(msi->irq, NULL, NULL);
>  
>  	altera_free_domains(msi);
>  
> -- 
> 2.20.1
>
Bjorn Helgaas Nov. 10, 2020, 9:45 p.m. UTC | #2
[+cc Florian, sorry, I hadn't seen your ack when I sent the below]

On Tue, Nov 10, 2020 at 03:21:36PM -0600, Bjorn Helgaas wrote:
> [+cc Nicolas, Jingoo, Gustavo, Toan]
> 
> On Sun, Nov 08, 2020 at 08:11:40PM +0100, Martin Kaiser wrote:
> > Replace the two separate calls for removing the irq handler and data with a
> > single irq_set_chained_handler_and_data() call.
> 
> This is similar to these:
> 
>   36f024ed8fc9 ("PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove")
>   5168a73ce32d ("PCI/keystone: Consolidate chained IRQ handler install/remove")
>   2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ handler")
> 
> and it seems potentially important that this removes the IRQ handler
> and data *atomically*, i.e., both are done while holding
> irq_get_desc_buslock().  
> 
> So I would use this:
> 
>   PCI: altera-msi: Fix race in installing chained IRQ handler
> 
>   Fix a race where a pending interrupt could be received and the handler
>   called before the handler's data has been setup by converting to
>   irq_set_chained_handler_and_data().
> 
>   See also 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained
>   IRQ handler").
> 
> to make it clear that this is actually a bug fix, not just a cleanup.
> 
> Looks like this should also be done in dw_pcie_free_msi() and
> xgene_msi_hwirq_alloc() at the same time?
> 
> > Signed-off-by: Martin Kaiser <martin@kaiser.cx>
> > ---
> >  drivers/pci/controller/pcie-altera-msi.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/controller/pcie-altera-msi.c b/drivers/pci/controller/pcie-altera-msi.c
> > index e1636f7714ca..42691dd8ebef 100644
> > --- a/drivers/pci/controller/pcie-altera-msi.c
> > +++ b/drivers/pci/controller/pcie-altera-msi.c
> > @@ -204,8 +204,7 @@ static int altera_msi_remove(struct platform_device *pdev)
> >  	struct altera_msi *msi = platform_get_drvdata(pdev);
> >  
> >  	msi_writel(msi, 0, MSI_INTMASK);
> > -	irq_set_chained_handler(msi->irq, NULL);
> > -	irq_set_handler_data(msi->irq, NULL);
> > +	irq_set_chained_handler_and_data(msi->irq, NULL, NULL);
> >  
> >  	altera_free_domains(msi);
> >  
> > -- 
> > 2.20.1
> >
Martin Kaiser Nov. 11, 2020, 9:43 p.m. UTC | #3
Thus wrote Bjorn Helgaas (helgaas@kernel.org):

> [+cc Florian, sorry, I hadn't seen your ack when I sent the below]

> On Tue, Nov 10, 2020 at 03:21:36PM -0600, Bjorn Helgaas wrote:
> > [+cc Nicolas, Jingoo, Gustavo, Toan]

> > On Sun, Nov 08, 2020 at 08:11:40PM +0100, Martin Kaiser wrote:
> > > Replace the two separate calls for removing the irq handler and data with a
> > > single irq_set_chained_handler_and_data() call.

> > This is similar to these:

> >   36f024ed8fc9 ("PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove")
> >   5168a73ce32d ("PCI/keystone: Consolidate chained IRQ handler install/remove")
> >   2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ handler")

> > and it seems potentially important that this removes the IRQ handler
> > and data *atomically*, i.e., both are done while holding
> > irq_get_desc_buslock().  

Ok, understood.

> > So I would use this:

> >   PCI: altera-msi: Fix race in installing chained IRQ handler

> >   Fix a race where a pending interrupt could be received and the handler
> >   called before the handler's data has been setup by converting to
> >   irq_set_chained_handler_and_data().

> >   See also 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained
> >   IRQ handler").

> > to make it clear that this is actually a bug fix, not just a cleanup.

Thomas' commit 2cf5a03cb29d fixed a case where the handler was installed.
We're removing the handler here so his commit message doesn't really fit.
Anyway, I'll rewrite the commit message to clarify that this fixes a
race condition.

> > Looks like this should also be done in dw_pcie_free_msi() and

I'll send a patch for this.

> > xgene_msi_hwirq_alloc() at the same time?

This function uses the error status from irq_set_handler_data().
irq_set_chained_handler_and_data() returns no such error status. Is it
ok to drop the error handling?

Thanks,
Martin
Bjorn Helgaas Nov. 11, 2020, 10:16 p.m. UTC | #4
[+cc Thomas for handler/data order question at end]

On Wed, Nov 11, 2020 at 10:43:55PM +0100, Martin Kaiser wrote:
> Thus wrote Bjorn Helgaas (helgaas@kernel.org):
> > On Tue, Nov 10, 2020 at 03:21:36PM -0600, Bjorn Helgaas wrote:
> > > On Sun, Nov 08, 2020 at 08:11:40PM +0100, Martin Kaiser wrote:
> > > > Replace the two separate calls for removing the irq handler and data with a
> > > > single irq_set_chained_handler_and_data() call.
> 
> > > This is similar to these:
> 
> > >   36f024ed8fc9 ("PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove")
> > >   5168a73ce32d ("PCI/keystone: Consolidate chained IRQ handler install/remove")
> > >   2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ handler")
> 
> > > and it seems potentially important that this removes the IRQ handler
> > > and data *atomically*, i.e., both are done while holding
> > > irq_get_desc_buslock().  
> 
> Ok, understood.
> 
> > > So I would use this:
> 
> > >   PCI: altera-msi: Fix race in installing chained IRQ handler
> 
> > >   Fix a race where a pending interrupt could be received and the handler
> > >   called before the handler's data has been setup by converting to
> > >   irq_set_chained_handler_and_data().
> 
> > >   See also 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained
> > >   IRQ handler").
> 
> > > to make it clear that this is actually a bug fix, not just a cleanup.
> 
> Thomas' commit 2cf5a03cb29d fixed a case where the handler was installed.
> We're removing the handler here so his commit message doesn't really fit.
> Anyway, I'll rewrite the commit message to clarify that this fixes a
> race condition.

Oh, right, of course, I wasn't paying attention.  The altera case is
removing and doing it in the correct order (removing handler, then
data), so there shouldn't be a race even with the current code.

> > > Looks like this should also be done in dw_pcie_free_msi() and
> 
> I'll send a patch for this.
> 
> > > xgene_msi_hwirq_alloc() at the same time?
> 
> This function uses the error status from irq_set_handler_data().
> irq_set_chained_handler_and_data() returns no such error status. Is it
> ok to drop the error handling?

I'm not an IRQ expert, but I'd say it's OK to drop it.  Of the 40 or
so callers, the only other caller that looks at the error status is
ingenic_intc_of_init().

Thomas, it looks like irq_domain_set_info() and msi_domain_ops_init()
set the handler itself before setting the handler data:

  irq_domain_set_info
    irq_set_chip_and_handler_name(virq, chip, handler, ...)
    irq_set_handler_data(virq, handler_data)

  msi_domain_ops_init
    __irq_set_handler(virq, info->handler, ...)
    if (info->handler_data)
      irq_set_handler_data(virq, info->handler_data)

That looks at least superficially similar to the race you fixed with
2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ
handler").

Should irq_domain_set_info() and msi_domain_ops_init() swap the order,
too?
Thomas Gleixner Nov. 12, 2020, 11:28 a.m. UTC | #5
On Wed, Nov 11 2020 at 16:16, Bjorn Helgaas wrote:
> On Wed, Nov 11, 2020 at 10:43:55PM +0100, Martin Kaiser wrote:
>> This function uses the error status from irq_set_handler_data().
>> irq_set_chained_handler_and_data() returns no such error status. Is it
>> ok to drop the error handling?
>
> I'm not an IRQ expert, but I'd say it's OK to drop it.  Of the 40 or
> so callers, the only other caller that looks at the error status is
> ingenic_intc_of_init().

Don't know why irq_set_chained_handler_and_data() does not return an
error, but the call site must really do something stupid if it fails to
hand in the proper interrupt number.

> Thomas, it looks like irq_domain_set_info() and msi_domain_ops_init()
> set the handler itself before setting the handler data:
>
>   irq_domain_set_info
>     irq_set_chip_and_handler_name(virq, chip, handler, ...)
>     irq_set_handler_data(virq, handler_data)
>
>   msi_domain_ops_init
>     __irq_set_handler(virq, info->handler, ...)
>     if (info->handler_data)
>       irq_set_handler_data(virq, info->handler_data)
>
> That looks at least superficially similar to the race you fixed with
> 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ
> handler").
>
> Should irq_domain_set_info() and msi_domain_ops_init() swap the order,
> too?

In theory yes. Practically it should not matter because that happens
during the allocation way before the interrupt can actually fire.  I'll
have a deeper look nevertheless.

Thanks,

        tglx
Thomas Gleixner Nov. 12, 2020, 1:50 p.m. UTC | #6
On Thu, Nov 12 2020 at 12:28, Thomas Gleixner wrote:
> On Wed, Nov 11 2020 at 16:16, Bjorn Helgaas wrote:
>> On Wed, Nov 11, 2020 at 10:43:55PM +0100, Martin Kaiser wrote:
>> Thomas, it looks like irq_domain_set_info() and msi_domain_ops_init()
>> set the handler itself before setting the handler data:
>>
>>   irq_domain_set_info
>>     irq_set_chip_and_handler_name(virq, chip, handler, ...)
>>     irq_set_handler_data(virq, handler_data)
>>
>>   msi_domain_ops_init
>>     __irq_set_handler(virq, info->handler, ...)
>>     if (info->handler_data)
>>       irq_set_handler_data(virq, info->handler_data)
>>
>> That looks at least superficially similar to the race you fixed with
>> 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ
>> handler").
>>
>> Should irq_domain_set_info() and msi_domain_ops_init() swap the order,
>> too?
>
> In theory yes. Practically it should not matter because that happens
> during the allocation way before the interrupt can actually fire.  I'll
> have a deeper look nevertheless.

So I had a closer look and the reason why it only matters for the
chained handler case is that

   __irq_set_handler(..., is_chained = true, ...)

starts up the interrupt immediately. So the order for this _must_ be:

    set_handler_data() -> set_handler()

For regular interrupts it's really the mapping and allocation code which
does this long before the interrupt is started up. So the ordering does
not matter because the handler can't be reached before the full
setup is finished and the interrupt is actually started up.

Thanks,

        tglx
Bjorn Helgaas Nov. 12, 2020, 2:26 p.m. UTC | #7
On Thu, Nov 12, 2020 at 02:50:42PM +0100, Thomas Gleixner wrote:
> On Thu, Nov 12 2020 at 12:28, Thomas Gleixner wrote:
> > On Wed, Nov 11 2020 at 16:16, Bjorn Helgaas wrote:
> >> On Wed, Nov 11, 2020 at 10:43:55PM +0100, Martin Kaiser wrote:
> >> Thomas, it looks like irq_domain_set_info() and msi_domain_ops_init()
> >> set the handler itself before setting the handler data:
> >>
> >>   irq_domain_set_info
> >>     irq_set_chip_and_handler_name(virq, chip, handler, ...)
> >>     irq_set_handler_data(virq, handler_data)
> >>
> >>   msi_domain_ops_init
> >>     __irq_set_handler(virq, info->handler, ...)
> >>     if (info->handler_data)
> >>       irq_set_handler_data(virq, info->handler_data)
> >>
> >> That looks at least superficially similar to the race you fixed with
> >> 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ
> >> handler").
> >>
> >> Should irq_domain_set_info() and msi_domain_ops_init() swap the order,
> >> too?
> >
> > In theory yes. Practically it should not matter because that happens
> > during the allocation way before the interrupt can actually fire.  I'll
> > have a deeper look nevertheless.
> 
> So I had a closer look and the reason why it only matters for the
> chained handler case is that
> 
>    __irq_set_handler(..., is_chained = true, ...)
> 
> starts up the interrupt immediately. So the order for this _must_ be:
> 
>     set_handler_data() -> set_handler()
> 
> For regular interrupts it's really the mapping and allocation code which
> does this long before the interrupt is started up. So the ordering does
> not matter because the handler can't be reached before the full
> setup is finished and the interrupt is actually started up.

If the order truly doesn't matter here, maybe it's worth changing it
to "set data, set handler" to avoid the need for a closer look to
verify correctness and to make it harder to copy and paste to a place
where it *does* matter?

Bjorn
Thomas Gleixner Nov. 12, 2020, 6:19 p.m. UTC | #8
On Thu, Nov 12 2020 at 08:26, Bjorn Helgaas wrote:
> On Thu, Nov 12, 2020 at 02:50:42PM +0100, Thomas Gleixner wrote:
>> So I had a closer look and the reason why it only matters for the
>> chained handler case is that
>> 
>>    __irq_set_handler(..., is_chained = true, ...)
>> 
>> starts up the interrupt immediately. So the order for this _must_ be:
>> 
>>     set_handler_data() -> set_handler()
>> 
>> For regular interrupts it's really the mapping and allocation code which
>> does this long before the interrupt is started up. So the ordering does
>> not matter because the handler can't be reached before the full
>> setup is finished and the interrupt is actually started up.
>
> If the order truly doesn't matter here, maybe it's worth changing it
> to "set data, set handler" to avoid the need for a closer look to
> verify correctness and to make it harder to copy and paste to a place
> where it *does* matter?

Makes sense.
diff mbox series

Patch

diff --git a/drivers/pci/controller/pcie-altera-msi.c b/drivers/pci/controller/pcie-altera-msi.c
index e1636f7714ca..42691dd8ebef 100644
--- a/drivers/pci/controller/pcie-altera-msi.c
+++ b/drivers/pci/controller/pcie-altera-msi.c
@@ -204,8 +204,7 @@  static int altera_msi_remove(struct platform_device *pdev)
 	struct altera_msi *msi = platform_get_drvdata(pdev);
 
 	msi_writel(msi, 0, MSI_INTMASK);
-	irq_set_chained_handler(msi->irq, NULL);
-	irq_set_handler_data(msi->irq, NULL);
+	irq_set_chained_handler_and_data(msi->irq, NULL, NULL);
 
 	altera_free_domains(msi);