diff mbox

[2/3,v3] Refactor MSI restore call-chain to drop unnecessary argument

Message ID 521EB76B.8090903@oracle.com
State Superseded
Headers show

Commit Message

Zhenzhong Duan Aug. 29, 2013, 2:52 a.m. UTC
Driver init call graph under baremetal:
driver_init->
    msix_capability_init->
        msix_program_entries->
            msix_mask_irq->
                entry->masked = 1
    request_irq->
        __setup_irq->
            irq_startup->
                unmask_msi_irq->
                    msix_mask_irq->
                        entry->masked = 0

So entry->masked is always updated with newest value and its value could be used
to restore to mask register in device.

But in initial domain (aka priviliged guest), it's different.
Driver init call graph under initial domain:
driver_init->
    msix_capability_init->
        msix_program_entries->
            msix_mask_irq->
                entry->masked = 1
    request_irq->
        __setup_irq->
            irq_startup->
                __startup_pirq->
                    EVTCHNOP_bind_pirq hypercall    (trap into Xen)
[Xen:]
pirq_guest_bind->
    startup_msi_irq->
        unmask_msi_irq->
            msi_set_mask_bit->
                entry->msi_attrib.masked = 0

So entry->msi_attrib.masked in xen side always has newest value. entry->masked
in initial domain is untouched and is 1 after msix_capability_init.

Based on above, it's Xen's duty to restore entry->msi_attrib.masked to device,
but with current code, entry->masked is used and MSI-x interrupt is masked.

Before patch, restore call graph under initial domain:
pci_reset_function->
    pci_restore_state->
        __pci_restore_msix_state->
            arch_restore_msi_irqs->
                xen_initdom_restore_msi_irqs->
                    PHYSDEVOP_restore_msi hypercall (first mask restore)
            msix_mask_irq(entry, entry->masked)     (second mask restore)

So msix_mask_irq call in initial domain call graph needs to be removed.

Based on this we can move the restore of the mask in default_restore_msi_irqs
which would avoid restoring the invalid mask under Xen. Furthermore this
simplifies the API by making everything related to restoring an MSI be in the
platform specific APIs instead of just parts of it.

After patch, restore call graph under initial domain:
pci_reset_function->
    pci_restore_state->
        __pci_restore_msix_state->
            arch_restore_msi_irqs->
                xen_initdom_restore_msi_irqs->
                    PHYSDEVOP_restore_msi hypercall (first mask restore)

Logic for baremetal is not changed.
Before patch, restore call graph under baremetal:
pci_reset_function->
    pci_restore_state->
        __pci_restore_msix_state->
            arch_restore_msi_irqs->
                default_restore_msi_irqs->
            msix_mask_irq(entry, entry->masked)     (first mask restore)

After patch, restore call graph under baremetal:
pci_reset_function->
    pci_restore_state->
        __pci_restore_msix_state->
            arch_restore_msi_irqs->
                default_restore_msi_irqs->
                    msix_mask_irq(entry, entry->masked) (first mask restore)

The process for MSI code is similiar.

-v3: Update patch description per Konrad suggestion, thanks.

Tested-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
---
 drivers/pci/msi.c |   17 ++++++++++++++---
 1 files changed, 14 insertions(+), 3 deletions(-)

Comments

Jan Beulich Aug. 29, 2013, 7:23 a.m. UTC | #1
>>> On 29.08.13 at 04:52, Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:
> But in initial domain (aka priviliged guest), it's different.
> Driver init call graph under initial domain:
> driver_init->
>     msix_capability_init->
>         msix_program_entries->
>             msix_mask_irq->
>                 entry->masked = 1
>     request_irq->
>         __setup_irq->
>             irq_startup->
>                 __startup_pirq->
>                     EVTCHNOP_bind_pirq hypercall    (trap into Xen)
> [Xen:]
> pirq_guest_bind->
>     startup_msi_irq->
>         unmask_msi_irq->
>             msi_set_mask_bit->
>                 entry->msi_attrib.masked = 0
> 
> So entry->msi_attrib.masked in xen side always has newest value. entry->masked
> in initial domain is untouched and is 1 after msix_capability_init.

And as said several times before - Linux shouldn't be touching
the MSI-X table at all during initial setup or resume (it should in
particular not rely on such accesses to not fault, as being a
privilege violation); all it needs to do is update its software state.

Hence fiddling with default_restore_msi_irqs() seems the wrong
approach towards solving the problem.

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhenzhong Duan Aug. 29, 2013, 9:50 a.m. UTC | #2
On 2013-08-29 15:23, Jan Beulich wrote:
>>>> On 29.08.13 at 04:52, Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:
>> But in initial domain (aka priviliged guest), it's different.
>> Driver init call graph under initial domain:
>> driver_init->
>>      msix_capability_init->
>>          msix_program_entries->
>>              msix_mask_irq->
>>                  entry->masked = 1
>>      request_irq->
>>          __setup_irq->
>>              irq_startup->
>>                  __startup_pirq->
>>                      EVTCHNOP_bind_pirq hypercall    (trap into Xen)
>> [Xen:]
>> pirq_guest_bind->
>>      startup_msi_irq->
>>          unmask_msi_irq->
>>              msi_set_mask_bit->
>>                  entry->msi_attrib.masked = 0
>>
>> So entry->msi_attrib.masked in xen side always has newest value. entry->masked
>> in initial domain is untouched and is 1 after msix_capability_init.
> And as said several times before - Linux shouldn't be touching
> the MSI-X table at all during initial setup or resume (it should in
> particular not rely on such accesses to not fault, as being a
> privilege violation); all it needs to do is update its software state.
My patch just remove access to msix mask register in dom0. Anything 
wrong with that?
>
> Hence fiddling with default_restore_msi_irqs() seems the wrong
> approach towards solving the problem.
dom0 uses xen_initdom_restore_msi_irqs, default_restore_msi_irqs is for 
baremetal.

zduan
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Beulich Aug. 29, 2013, 10:47 a.m. UTC | #3
>>> On 29.08.13 at 11:50, Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:

> On 2013-08-29 15:23, Jan Beulich wrote:
>>>>> On 29.08.13 at 04:52, Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:
>>> But in initial domain (aka priviliged guest), it's different.
>>> Driver init call graph under initial domain:
>>> driver_init->
>>>      msix_capability_init->
>>>          msix_program_entries->
>>>              msix_mask_irq->
>>>                  entry->masked = 1
>>>      request_irq->
>>>          __setup_irq->
>>>              irq_startup->
>>>                  __startup_pirq->
>>>                      EVTCHNOP_bind_pirq hypercall    (trap into Xen)
>>> [Xen:]
>>> pirq_guest_bind->
>>>      startup_msi_irq->
>>>          unmask_msi_irq->
>>>              msi_set_mask_bit->
>>>                  entry->msi_attrib.masked = 0
>>>
>>> So entry->msi_attrib.masked in xen side always has newest value. entry->masked
>>> in initial domain is untouched and is 1 after msix_capability_init.
>> And as said several times before - Linux shouldn't be touching
>> the MSI-X table at all during initial setup or resume (it should in
>> particular not rely on such accesses to not fault, as being a
>> privilege violation); all it needs to do is update its software state.
> My patch just remove access to msix mask register in dom0. Anything 
> wrong with that?

Oh, okay, I mis-read the change then - you're moving the mask
bit changing from __pci_restore_msi{,x}_state() to
default_restore_msi_irqs(). Which of course is fine if Xen (as you
say) doesn't use the latter.

Sorry for the noise then,
Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Konrad Rzeszutek Wilk Aug. 29, 2013, 1:31 p.m. UTC | #4
Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:
>Driver init call graph under baremetal:
>driver_init->
>    msix_capability_init->
>        msix_program_entries->
>            msix_mask_irq->
>                entry->masked = 1
>    request_irq->
>        __setup_irq->
>            irq_startup->
>                unmask_msi_irq->
>                    msix_mask_irq->
>                        entry->masked = 0
>
>So entry->masked is always updated with newest value and its value
>could be used
>to restore to mask register in device.
>
>But in initial domain (aka priviliged guest), it's different.
>Driver init call graph under initial domain:
>driver_init->
>    msix_capability_init->
>        msix_program_entries->
>            msix_mask_irq->
>                entry->masked = 1
>    request_irq->
>        __setup_irq->
>            irq_startup->
>                __startup_pirq->
>                    EVTCHNOP_bind_pirq hypercall    (trap into Xen)
>[Xen:]
>pirq_guest_bind->
>    startup_msi_irq->
>        unmask_msi_irq->
>            msi_set_mask_bit->
>                entry->msi_attrib.masked = 0
>
>So entry->msi_attrib.masked in xen side always has newest value.
>entry->masked
>in initial domain is untouched and is 1 after msix_capability_init.
>
>Based on above, it's Xen's duty to restore entry->msi_attrib.masked to
>device,
>but with current code, entry->masked is used and MSI-x interrupt is
>masked.
>
>Before patch, restore call graph under initial domain:
>pci_reset_function->
>    pci_restore_state->
>        __pci_restore_msix_state->
>            arch_restore_msi_irqs->
>                xen_initdom_restore_msi_irqs->
>                   PHYSDEVOP_restore_msi hypercall (first mask restore)
>          msix_mask_irq(entry, entry->masked)     (second mask restore)
>
>So msix_mask_irq call in initial domain call graph needs to be removed.
>
>Based on this we can move the restore of the mask in
>default_restore_msi_irqs
>which would avoid restoring the invalid mask under Xen. Furthermore
>this
>simplifies the API by making everything related to restoring an MSI be
>in the
>platform specific APIs instead of just parts of it.
>
>After patch, restore call graph under initial domain:
>pci_reset_function->
>    pci_restore_state->
>        __pci_restore_msix_state->
>            arch_restore_msi_irqs->
>                xen_initdom_restore_msi_irqs->
>                   PHYSDEVOP_restore_msi hypercall (first mask restore)
>
>Logic for baremetal is not changed.
>Before patch, restore call graph under baremetal:
>pci_reset_function->
>    pci_restore_state->
>        __pci_restore_msix_state->
>            arch_restore_msi_irqs->
>                default_restore_msi_irqs->
>           msix_mask_irq(entry, entry->masked)     (first mask restore)
>
>After patch, restore call graph under baremetal:
>pci_reset_function->
>    pci_restore_state->
>        __pci_restore_msix_state->
>            arch_restore_msi_irqs->
>                default_restore_msi_irqs->
>               msix_mask_irq(entry, entry->masked) (first mask restore)
>
>The process for MSI code is similiar.
>
>-v3: Update patch description per Konrad suggestion, thanks.
>
>Tested-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
>Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>

Bjorn, 
If this is not too late to add to your queue please add it with my Acked-By. 

Thanks. 

>---
> drivers/pci/msi.c |   17 ++++++++++++++---
> 1 files changed, 14 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>index 87223ae..922fb49 100644
>--- a/drivers/pci/msi.c
>+++ b/drivers/pci/msi.c
>@@ -216,6 +216,8 @@ void unmask_msi_irq(struct irq_data *data)
> #ifdef HAVE_DEFAULT_MSI_RESTORE_IRQS
> void default_restore_msi_irqs(struct pci_dev *dev, int irq)
> {
>+	int pos;
>+	u16 control;
> 	struct msi_desc *entry;
> 
> 	entry = NULL;
>@@ -228,8 +230,19 @@ void default_restore_msi_irqs(struct pci_dev *dev,
>int irq)
> 		entry = irq_get_msi_desc(irq);
> 	}
> 
>-	if (entry)
>+	if (entry) {
> 		write_msi_msg(irq, &entry->msg);
>+		if (dev->msix_enabled) {
>+			msix_mask_irq(entry, entry->masked);
>+			readl(entry->mask_base);
>+		} else {
>+			pos = entry->msi_attrib.pos;
>+			pci_read_config_word(dev, pos + PCI_MSI_FLAGS,
>+					     &control);
>+			msi_mask_irq(entry, msi_capable_mask(control),
>+				     entry->masked);
>+		}
>+	}
> }
> #endif
> 
>@@ -406,7 +419,6 @@ static void __pci_restore_msi_state(struct pci_dev
>*dev)
> 	arch_restore_msi_irqs(dev, dev->irq);
> 
> 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
>-	msi_mask_irq(entry, msi_capable_mask(control), entry->masked);
> 	control &= ~PCI_MSI_FLAGS_QSIZE;
> 	control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
> 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
>@@ -430,7 +442,6 @@ static void __pci_restore_msix_state(struct pci_dev
>*dev)
> 
> 	list_for_each_entry(entry, &dev->msi_list, list) {
> 		arch_restore_msi_irqs(dev, entry->irq);
>-		msix_mask_irq(entry, entry->masked);
> 	}
> 
> 	control &= ~PCI_MSIX_FLAGS_MASKALL;


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Sept. 25, 2013, 10:22 p.m. UTC | #5
On Wed, Aug 28, 2013 at 8:52 PM, Zhenzhong Duan
<zhenzhong.duan@oracle.com> wrote:
> ...
> -v3: Update patch description per Konrad suggestion, thanks.
>
> Tested-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>

Zhenzhong, would you mind reposting the entire series as a v4 with all
updates and acks?  I'm not sure I can collect all the right pieces.

Thanks,
  Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhenzhong Duan Sept. 26, 2013, 1:19 a.m. UTC | #6
于 2013-09-26 06:22, Bjorn Helgaas 写道:
> On Wed, Aug 28, 2013 at 8:52 PM, Zhenzhong Duan
> <zhenzhong.duan@oracle.com> wrote:
>> ...
>> -v3: Update patch description per Konrad suggestion, thanks.
>>
>> Tested-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
> Zhenzhong, would you mind reposting the entire series as a v4 with all
> updates and acks?  I'm not sure I can collect all the right pieces.
>
> Thanks,
>    Bjorn
There are some updates in upstream that conflict with mine since I wrote 
my patches.
So I rewrte the patches based on upstream and send to Konrad.
He will buy a qlogic card and help a test. I'll repost those after that.

thanks
zduan
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 87223ae..922fb49 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -216,6 +216,8 @@  void unmask_msi_irq(struct irq_data *data)
 #ifdef HAVE_DEFAULT_MSI_RESTORE_IRQS
 void default_restore_msi_irqs(struct pci_dev *dev, int irq)
 {
+	int pos;
+	u16 control;
 	struct msi_desc *entry;
 
 	entry = NULL;
@@ -228,8 +230,19 @@  void default_restore_msi_irqs(struct pci_dev *dev, int irq)
 		entry = irq_get_msi_desc(irq);
 	}
 
-	if (entry)
+	if (entry) {
 		write_msi_msg(irq, &entry->msg);
+		if (dev->msix_enabled) {
+			msix_mask_irq(entry, entry->masked);
+			readl(entry->mask_base);
+		} else {
+			pos = entry->msi_attrib.pos;
+			pci_read_config_word(dev, pos + PCI_MSI_FLAGS,
+					     &control);
+			msi_mask_irq(entry, msi_capable_mask(control),
+				     entry->masked);
+		}
+	}
 }
 #endif
 
@@ -406,7 +419,6 @@  static void __pci_restore_msi_state(struct pci_dev *dev)
 	arch_restore_msi_irqs(dev, dev->irq);
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
-	msi_mask_irq(entry, msi_capable_mask(control), entry->masked);
 	control &= ~PCI_MSI_FLAGS_QSIZE;
 	control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
@@ -430,7 +442,6 @@  static void __pci_restore_msix_state(struct pci_dev *dev)
 
 	list_for_each_entry(entry, &dev->msi_list, list) {
 		arch_restore_msi_irqs(dev, entry->irq);
-		msix_mask_irq(entry, entry->masked);
 	}
 
 	control &= ~PCI_MSIX_FLAGS_MASKALL;