4.7 regression: ACPI: No IRQ available for PCI Interrupt Link [LNKD]. Try pci=noacpi or acpi=off
diff mbox

Message ID 34355440-197d-44c7-b27d-535267d1161c@codeaurora.org
State Not Applicable
Headers show

Commit Message

Sinan Kaya Sept. 29, 2016, 10:39 p.m. UTC
On 9/29/2016 2:00 PM, Ondrej Zary wrote:
>> The previous two patches were in the right direction.
>> >
>> > Can we also get the same output from 4.6 kernel with the attached patch for
>> > the same machine you sent these?
> Here it is.
> 
>> > Something about SCI still doesn't feel right.
>> >
>> > The IRQ assignment fails if the penalty is greater than
>> > PIRQ_PENALTY_ISA_ALWAYS. This will happen if BIOS tells us to use an IRQ
>> > and same IRQ is in use by the SCI.

Thanks, I reverted penalize_sci function and dropped patch #1. Can you try this again?

Comments

Ondrej Zary Sept. 30, 2016, 6:44 a.m. UTC | #1
On Friday 30 September 2016, Sinan Kaya wrote:
> On 9/29/2016 2:00 PM, Ondrej Zary wrote:
> >> The previous two patches were in the right direction.
> >>
> >> > Can we also get the same output from 4.6 kernel with the attached
> >> > patch for the same machine you sent these?
> >
> > Here it is.
> >
> >> > Something about SCI still doesn't feel right.
> >> >
> >> > The IRQ assignment fails if the penalty is greater than
> >> > PIRQ_PENALTY_ISA_ALWAYS. This will happen if BIOS tells us to use an
> >> > IRQ and same IRQ is in use by the SCI.
>
> Thanks, I reverted penalize_sci function and dropped patch #1. Can you try
> this again?

It seems to work, at least on one machine.
Sinan Kaya Sept. 30, 2016, 1:14 p.m. UTC | #2
On 2016-09-30 02:44, Ondrej Zary wrote:
> On Friday 30 September 2016, Sinan Kaya wrote:
>> On 9/29/2016 2:00 PM, Ondrej Zary wrote:
>> >> The previous two patches were in the right direction.
>> >>
>> >> > Can we also get the same output from 4.6 kernel with the attached
>> >> > patch for the same machine you sent these?
>> >
>> > Here it is.
>> >
>> >> > Something about SCI still doesn't feel right.
>> >> >
>> >> > The IRQ assignment fails if the penalty is greater than
>> >> > PIRQ_PENALTY_ISA_ALWAYS. This will happen if BIOS tells us to use an
>> >> > IRQ and same IRQ is in use by the SCI.
>> 
>> Thanks, I reverted penalize_sci function and dropped patch #1. Can you 
>> try
>> this again?
> 
> It seems to work, at least on one machine.

Ok, that comfirms my suspicion. We are  having trouble detecting sci 
interrupt  type and we end up penalizing the wrong value.

Can you try your other machines too?

I need to do some research now.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ondrej Zary Sept. 30, 2016, 3:56 p.m. UTC | #3
On Friday 30 September 2016 15:14:42 okaya@codeaurora.org wrote:
> On 2016-09-30 02:44, Ondrej Zary wrote:
> > On Friday 30 September 2016, Sinan Kaya wrote:
> >> On 9/29/2016 2:00 PM, Ondrej Zary wrote:
> >> >> The previous two patches were in the right direction.
> >> >>
> >> >> > Can we also get the same output from 4.6 kernel with the attached
> >> >> > patch for the same machine you sent these?
> >> >
> >> > Here it is.
> >> >
> >> >> > Something about SCI still doesn't feel right.
> >> >> >
> >> >> > The IRQ assignment fails if the penalty is greater than
> >> >> > PIRQ_PENALTY_ISA_ALWAYS. This will happen if BIOS tells us to use
> >> >> > an IRQ and same IRQ is in use by the SCI.
> >>
> >> Thanks, I reverted penalize_sci function and dropped patch #1. Can you
> >> try
> >> this again?
> >
> > It seems to work, at least on one machine.
>
> Ok, that comfirms my suspicion. We are  having trouble detecting sci
> interrupt  type and we end up penalizing the wrong value.
>
> Can you try your other machines too?

Works on the 2nd one too.

> I need to do some research now.
Sinan Kaya Sept. 30, 2016, 7:30 p.m. UTC | #4
Rafael, Bjorn;

On 9/30/2016 11:56 AM, Ondrej Zary wrote:
>>> It seems to work, at least on one machine.
>> >
>> > Ok, that comfirms my suspicion. We are  having trouble detecting sci
>> > interrupt  type and we end up penalizing the wrong value.
>> >
>> > Can you try your other machines too?
> Works on the 2nd one too.
> 
>> > I need to do some research now.

Thanks for the confirmation. I need to gather some opinion from Rafael and Bjorn.

The patch provided only handles SCI in the ISA IRQ range (<16)

I looked at the ACPI spec. SCI interrupt is a two byte field:

"SCI_INT byte 2 offset 46 

System vector the SCI interrupt is wired to in 8259 mode. On
systems that do not contain the 8259, this field contains the
Global System interrupt number of the SCI interrupt. OSPM is
required to treat the ACPI SCI interrupt as a sharable, level,
active low interrupt."

Since we have a race condition between the time the IRQ is registered
via the ISA API and the time it gets to the interrupt driver
since irq_get_type function is not matching what SCI API indicated.

how do we feel about increasing the ISA IRQ range to 256 so that
we are safe for all SCI interrupts? 

Do you have any other recommendation?
Rafael J. Wysocki Sept. 30, 2016, 7:39 p.m. UTC | #5
On Fri, Sep 30, 2016 at 9:30 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
> Rafael, Bjorn;
>
> On 9/30/2016 11:56 AM, Ondrej Zary wrote:
>>>> It seems to work, at least on one machine.
>>> >
>>> > Ok, that comfirms my suspicion. We are  having trouble detecting sci
>>> > interrupt  type and we end up penalizing the wrong value.
>>> >
>>> > Can you try your other machines too?
>> Works on the 2nd one too.
>>
>>> > I need to do some research now.
>
> Thanks for the confirmation. I need to gather some opinion from Rafael and Bjorn.
>
> The patch provided only handles SCI in the ISA IRQ range (<16)
>
> I looked at the ACPI spec. SCI interrupt is a two byte field:
>
> "SCI_INT byte 2 offset 46
>
> System vector the SCI interrupt is wired to in 8259 mode. On
> systems that do not contain the 8259, this field contains the
> Global System interrupt number of the SCI interrupt. OSPM is
> required to treat the ACPI SCI interrupt as a sharable, level,
> active low interrupt."
>
> Since we have a race condition between the time the IRQ is registered
> via the ISA API and the time it gets to the interrupt driver
> since irq_get_type function is not matching what SCI API indicated.
>
> how do we feel about increasing the ISA IRQ range to 256 so that
> we are safe for all SCI interrupts?

I'm not sure how this is related to the problem at hand.  Care to elaborate?

And why exactly was that race condition not present before your changes?

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sinan Kaya Sept. 30, 2016, 8:24 p.m. UTC | #6
On 9/30/2016 3:39 PM, Rafael J. Wysocki wrote:
>> how do we feel about increasing the ISA IRQ range to 256 so that
>> > we are safe for all SCI interrupts?
> I'm not sure how this is related to the problem at hand.  Care to elaborate?
> 

Sure, let me explain. 

The code maintains a static array per IRQ number (acpi_irq_penalty) to assign 
penalties for each IRQ so that during the Link Object allocation, the code 
tries to choose an IRQ with less penalty that has not been used by another 
PCI/IRQ if possible.

In the kernels prior to my change, the type of the SCI
interrupt is given to the ACPI PCI Link driver via this function.

+void acpi_penalize_sci_irq(int irq, int trigger, int polarity)
+{
+	if (irq >= 0 && irq < ARRAY_SIZE(acpi_irq_penalty)) {
+		if (trigger != ACPI_MADT_TRIGGER_LEVEL ||
+		    polarity != ACPI_MADT_POLARITY_ACTIVE_LOW)
+			acpi_irq_penalty[irq] += PIRQ_PENALTY_ISA_ALWAYS;
+		else
+			acpi_irq_penalty[irq] += PIRQ_PENALTY_PCI_USING;
+	}
+}

This function says given the trigger and polarity, if it is ACTIVE_LOW and LEVEL
then increment the penalty by PCI_USING. It also does some ACPI parameter 
checking and assigns ISA_ALWAYS if the parameters are not compatible so that
this IRQ is never used for PCI. ISA_AKWAYS is the maximum penalty that can 
be assigned.

> And why exactly was that race condition not present before your changes?
The size of acpi_irq_penalty array used to be 256. This number puts an 
artificial limit on the IRQ numbers that can be assigned to PCI. ACPI spec does
not put any limits on the PCI numbers and extended IRQ resources can be used
for this purpose. 

During my patch, three things happened:
1. acpi_irq_penalty got renamed to acpi_isa_irq_penalty and the array size was 
reduced to 16 from 256.
2. Since we have no idea about the range of PCI interrupts that can be assigned
through ACPI, we tried to refactor the code so that PCI interrupts are calculated
by the time API is called by walking the PCI Link list here.

static int acpi_irq_pci_sharing_penalty(int irq)

We don't have to preallocate a fixed size array this way and can support any PCI
numbers that we want.

While refactoring the code, we said we could take the next step and get rid of the
acpi_penalize_sci_irq function and calculate the penalty dynamically similar to
acpi_irq_pci_sharing_penalty function. 

The new way to determine if SCI interrupt parameters are compatible is as follows:

	/*
	* Penalize IRQ used by ACPI SCI. If ACPI SCI pin attributes conflict
	* with PCI IRQ attributes, mark ACPI SCI as ISA_ALWAYS so it won't be
	* use for PCI IRQs.
	*/
	if (irq == acpi_gbl_FADT.sci_interrupt) {
		u32 type = irq_get_trigger_type(irq) & IRQ_TYPE_SENSE_MASK;

		if (type != IRQ_TYPE_LEVEL_LOW)
			penalty += PIRQ_PENALTY_ISA_ALWAYS;
		else
			penalty += PIRQ_PENALTY_PCI_USING;
	}
  
This relies on irq_get_trigger_type function to return the correct type. 

I now remember that Bjorn indicated the race condition possibility in this thread
here.

https://lkml.org/lkml/2016/3/8/640

> > > This looks racy, because we test irq_get_trigger_type() without any
> > > kind of locking, and later acpi_register_gsi() calls
> > > irq_create_fwspec_mapping(), which looks like it sets the new trigger
> > > type.  But I don't know how to fix that.
> > 
> > Right, if that pci link allocation code can be executed concurrent, then you
> > might end up with problem, but isn't that a problem even without
> > irq_get_trigger_type()?

> Yes.  It's not a new problem, I just noticed it since we're thinking
> more about the details of what's happening here.

and this is exactly what we have hit in this thread. The value
irq_get_trigger_type(irq) returns does not match what was given with
acpi_penalize_sci_irq and we end up penalizing the PCI IRQ for no reason.

Since the issue is specific to SCI interrupts and SCI interrupts cannot be 
greater than 256, my proposal is 
1. restore the code for the SCI penalty like in my last email 
2. increase the array size back to 256
3. use the acpi_irq_pci_sharing_penalty only if IRQ number is greater than 256
and we know that IRQ numbers are not bounded and can go all the way up to
65535.

I hope it makes sense now. I tend to skip details sometimes. Feel free to
send more questions.
Rafael J. Wysocki Sept. 30, 2016, 9:04 p.m. UTC | #7
On Fri, Sep 30, 2016 at 10:24 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
> On 9/30/2016 3:39 PM, Rafael J. Wysocki wrote:
>>> how do we feel about increasing the ISA IRQ range to 256 so that
>>> > we are safe for all SCI interrupts?
>> I'm not sure how this is related to the problem at hand.  Care to elaborate?
>>
>
> Sure, let me explain.
>

[cut]

>
> I hope it makes sense now. I tend to skip details sometimes. Feel free to
> send more questions.

Thanks for the information!

IIUC, basically, what you are proposing would be to restore the old
penalizing method for IRQs in the 0-255 range and use the new approach
for the rest, right?

What's the drawback, if any?

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sinan Kaya Sept. 30, 2016, 9:14 p.m. UTC | #8
On 9/30/2016 5:04 PM, Rafael J. Wysocki wrote:
>> >
>> > I hope it makes sense now. I tend to skip details sometimes. Feel free to
>> > send more questions.
> Thanks for the information!
> 
> IIUC, basically, what you are proposing would be to restore the old
> penalizing method for IRQs in the 0-255 range and use the new approach
> for the rest, right?

Correct. 

> 
> What's the drawback, if any?

I don't see any drawback to be honest. The reason why I got rid of these ISA
API functions was to remove some of the x86ism from ACPI code. This was Bjorn's
request to clean it up and we failed in two places so far. 

1. acpi_irq_penalty_init
2. acpi_penalize_sci_irq

and we ended up reverting both of these changes. The reverts are all because of the
fact that these APIs are called asynchronously without any coordination with the
PCI Link object or the interrupt controller driver about when it is a good time to be
called.

During this debug, I learnt that acpi_penalize_isa_irq gets called before even ACPI
gets to initialize and relies on static IRQ array for keeping the penalties.

> 
> Thanks,
> Rafael
>
Rafael J. Wysocki Sept. 30, 2016, 9:27 p.m. UTC | #9
On Fri, Sep 30, 2016 at 11:14 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
> On 9/30/2016 5:04 PM, Rafael J. Wysocki wrote:
>>> >
>>> > I hope it makes sense now. I tend to skip details sometimes. Feel free to
>>> > send more questions.
>> Thanks for the information!
>>
>> IIUC, basically, what you are proposing would be to restore the old
>> penalizing method for IRQs in the 0-255 range and use the new approach
>> for the rest, right?
>
> Correct.
>
>>
>> What's the drawback, if any?
>
> I don't see any drawback to be honest.

I'd go for it then, if Bjorn doesn't hate it.

> The reason why I got rid of these ISA
> API functions was to remove some of the x86ism from ACPI code. This was Bjorn's
> request to clean it up and we failed in two places so far.
>
> 1. acpi_irq_penalty_init
> 2. acpi_penalize_sci_irq
>
> and we ended up reverting both of these changes. The reverts are all because of the
> fact that these APIs are called asynchronously without any coordination with the
> PCI Link object or the interrupt controller driver about when it is a good time to be
> called.
>
> During this debug, I learnt that acpi_penalize_isa_irq gets called before even ACPI
> gets to initialize and relies on static IRQ array for keeping the penalties.

So maybe add some comments to that code to explain why things are
arranged the way they are.  We may need/want to revisit it at one
point and such comments will be very useful then.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sinan Kaya Sept. 30, 2016, 9:33 p.m. UTC | #10
On 9/30/2016 5:27 PM, Rafael J. Wysocki wrote:
> On Fri, Sep 30, 2016 at 11:14 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
>> On 9/30/2016 5:04 PM, Rafael J. Wysocki wrote:
>>>>>
>>>>> I hope it makes sense now. I tend to skip details sometimes. Feel free to
>>>>> send more questions.
>>> Thanks for the information!
>>>
>>> IIUC, basically, what you are proposing would be to restore the old
>>> penalizing method for IRQs in the 0-255 range and use the new approach
>>> for the rest, right?
>>
>> Correct.
>>
>>>
>>> What's the drawback, if any?
>>
>> I don't see any drawback to be honest.
> 
> I'd go for it then, if Bjorn doesn't hate it.

OK. I'll prep something.
Bjorn?

> 
>> The reason why I got rid of these ISA
>> API functions was to remove some of the x86ism from ACPI code. This was Bjorn's
>> request to clean it up and we failed in two places so far.
>>
>> 1. acpi_irq_penalty_init
>> 2. acpi_penalize_sci_irq
>>
>> and we ended up reverting both of these changes. The reverts are all because of the
>> fact that these APIs are called asynchronously without any coordination with the
>> PCI Link object or the interrupt controller driver about when it is a good time to be
>> called.
>>
>> During this debug, I learnt that acpi_penalize_isa_irq gets called before even ACPI
>> gets to initialize and relies on static IRQ array for keeping the penalties.
> 
> So maybe add some comments to that code to explain why things are
> arranged the way they are.  We may need/want to revisit it at one
> point and such comments will be very useful then.

Yeah, I think this code definitely needs some documentation. It ended up being
one of the most fragile places in the kernel. This is my 3rd attempt to rearrange
the code.

> 
> Thanks,
> Rafael
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Sinan Kaya Oct. 1, 2016, 5:49 p.m. UTC | #11
On 9/30/2016 5:27 PM, Rafael J. Wysocki wrote:
>>> >> What's the drawback, if any?
>> >
>> > I don't see any drawback to be honest.
> I'd go for it then, if Bjorn doesn't hate it.
> 

I posted a follow up patch a minute ago. 

[PATCH 1/3] Revert "ACPI,PCI,IRQ: reduce static IRQ array size to 16"
[PATCH 2/3] ACPI, PCI IRQ: add PCI_USING penalty for ISA interrupts
[PATCH 3/3] Revert "ACPI,PCI,IRQ: remove SCI penalize function"

Can we have some testing coverage? and eventually have tested-by?
Ondrej Zary Oct. 2, 2016, 4:53 p.m. UTC | #12
On Saturday 01 October 2016 19:49:17 Sinan Kaya wrote:
> On 9/30/2016 5:27 PM, Rafael J. Wysocki wrote:
> >>> >> What's the drawback, if any?
> >> >
> >> > I don't see any drawback to be honest.
> >
> > I'd go for it then, if Bjorn doesn't hate it.
>
> I posted a follow up patch a minute ago.
>
> [PATCH 1/3] Revert "ACPI,PCI,IRQ: reduce static IRQ array size to 16"
> [PATCH 2/3] ACPI, PCI IRQ: add PCI_USING penalty for ISA interrupts
> [PATCH 3/3] Revert "ACPI,PCI,IRQ: remove SCI penalize function"
>
> Can we have some testing coverage? and eventually have tested-by?

Works on two affected machines. More tests tomorrow.
Sinan Kaya Oct. 3, 2016, 1:05 a.m. UTC | #13
On 10/2/2016 12:53 PM, Ondrej Zary wrote:
>> Can we have some testing coverage? and eventually have tested-by?
> Works on two affected machines. More tests tomorrow.

Thanks, appreciate the feedback. Looking forward to hear your other test results.
Ondrej Zary Oct. 3, 2016, 7:25 a.m. UTC | #14
On Monday 03 October 2016, Sinan Kaya wrote:
> On 10/2/2016 12:53 PM, Ondrej Zary wrote:
> >> Can we have some testing coverage? and eventually have tested-by?
> >
> > Works on two affected machines. More tests tomorrow.
>
> Thanks, appreciate the feedback. Looking forward to hear your other test
> results.

The other two affected machines are working too.
Sinan Kaya Oct. 4, 2016, 2:30 p.m. UTC | #15
Hi Ondrej,

On 10/3/2016 3:25 AM, Ondrej Zary wrote:
> On Monday 03 October 2016, Sinan Kaya wrote:
>> On 10/2/2016 12:53 PM, Ondrej Zary wrote:
>>>> Can we have some testing coverage? and eventually have tested-by?
>>>
>>> Works on two affected machines. More tests tomorrow.
>>
>> Thanks, appreciate the feedback. Looking forward to hear your other test
>> results.
> 
> The other two affected machines are working too.
> 

Can I have your "tested by" before I repost the new version with updated
commit message?

Sinan
Ondrej Zary Oct. 4, 2016, 5:54 p.m. UTC | #16
On Saturday 01 October 2016 19:49:17 Sinan Kaya wrote:
> On 9/30/2016 5:27 PM, Rafael J. Wysocki wrote:
> >>> >> What's the drawback, if any?
> >> >
> >> > I don't see any drawback to be honest.
> >
> > I'd go for it then, if Bjorn doesn't hate it.
>
> I posted a follow up patch a minute ago.
>
> [PATCH 1/3] Revert "ACPI,PCI,IRQ: reduce static IRQ array size to 16"
> [PATCH 2/3] ACPI, PCI IRQ: add PCI_USING penalty for ISA interrupts
> [PATCH 3/3] Revert "ACPI,PCI,IRQ: remove SCI penalize function"
>
> Can we have some testing coverage? and eventually have tested-by?

Fixes the problem on all 4 machines.
These patches also need to go into 4.7-stable and 4.8-stable.

Tested-by: Ondrej Zary <linux@rainbow-software.org>

Patch
diff mbox

diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c
index c983bf7..a212709 100644
--- a/drivers/acpi/pci_link.c
+++ b/drivers/acpi/pci_link.c
@@ -619,6 +619,10 @@  static int acpi_pci_link_allocate(struct acpi_pci_link *link)
 			    acpi_device_bid(link->device));
 		return -ENODEV;
 	} else {
+		if (link->irq.active < ACPI_MAX_ISA_IRQS)
+			acpi_isa_irq_penalty[link->irq.active] +=
+				PIRQ_PENALTY_PCI_USING;
+
 		printk(KERN_WARNING PREFIX "%s [%s] enabled at IRQ %d\n",
 		       acpi_device_name(link->device),
 		       acpi_device_bid(link->device), link->irq.active);