diff mbox series

[V3,28/35] PCI/MSI: Simplify pci_irq_get_affinity()

Message ID 20211210221814.900929381@linutronix.de (mailing list archive)
State Handled Elsewhere
Headers show
Series genirq/msi, PCI/MSI: Spring cleaning - Part 2 | expand

Commit Message

Thomas Gleixner Dec. 10, 2021, 10:19 p.m. UTC
From: Thomas Gleixner <tglx@linutronix.de>

Replace open coded MSI descriptor chasing and use the proper accessor
functions instead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/pci/msi/msi.c |   26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

Comments

Nathan Chancellor Dec. 17, 2021, 10:30 p.m. UTC | #1
Hi Thomas,

On Fri, Dec 10, 2021 at 11:19:26PM +0100, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Replace open coded MSI descriptor chasing and use the proper accessor
> functions instead.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Apologies if this has already been reported somewhere else or already
fixed, I did a search of all of lore and did not see anything similar to
it and I did not see any new commits in -tip around this.

I just bisected a boot failure on my AMD test desktop to this patch as
commit f48235900182 ("PCI/MSI: Simplify pci_irq_get_affinity()") in
-next. It looks like there is a problem with the NVMe drive after this
change according to the logs. Given that the hard drive is not getting
mounted for journald to write logs to, I am not really sure how to get
them from the machine so I have at least taken a picture of what I see
on my screen; open to ideas on that front!

https://github.com/nathanchance/bug-files/blob/0d25d78b5bc1d5e9c15192b3bc80676364de8287/f48235900182/crash.jpg

Please let me know what information I can provide to make debugging this
easier and I am more than happy to apply and test patches as needed.

Cheers,
Nathan
Thomas Gleixner Dec. 18, 2021, 10:25 a.m. UTC | #2
On Fri, Dec 17 2021 at 15:30, Nathan Chancellor wrote:
> On Fri, Dec 10, 2021 at 11:19:26PM +0100, Thomas Gleixner wrote:
> I just bisected a boot failure on my AMD test desktop to this patch as
> commit f48235900182 ("PCI/MSI: Simplify pci_irq_get_affinity()") in
> -next. It looks like there is a problem with the NVMe drive after this
> change according to the logs. Given that the hard drive is not getting
> mounted for journald to write logs to, I am not really sure how to get
> them from the machine so I have at least taken a picture of what I see
> on my screen; open to ideas on that front!

Bah. Fix below.

Thanks,

        tglx
---
diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 71802410e2ab..9b4910befeda 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -1100,7 +1100,7 @@ EXPORT_SYMBOL(pci_irq_vector);
  */
 const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
 {
-	int irq = pci_irq_vector(dev, nr);
+	int idx, irq = pci_irq_vector(dev, nr);
 	struct msi_desc *desc;
 
 	if (WARN_ON_ONCE(irq <= 0))
@@ -1113,7 +1113,10 @@ const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
 
 	if (WARN_ON_ONCE(!desc->affinity))
 		return NULL;
-	return &desc->affinity[nr].mask;
+
+	/* MSI has a mask array in the descriptor. */
+	idx = dev->msi_enabled ? nr : 0;
+	return &desc->affinity[idx].mask;
 }
 EXPORT_SYMBOL(pci_irq_get_affinity);
Nathan Chancellor Dec. 18, 2021, 7:04 p.m. UTC | #3
On Sat, Dec 18, 2021 at 11:25:14AM +0100, Thomas Gleixner wrote:
> On Fri, Dec 17 2021 at 15:30, Nathan Chancellor wrote:
> > On Fri, Dec 10, 2021 at 11:19:26PM +0100, Thomas Gleixner wrote:
> > I just bisected a boot failure on my AMD test desktop to this patch as
> > commit f48235900182 ("PCI/MSI: Simplify pci_irq_get_affinity()") in
> > -next. It looks like there is a problem with the NVMe drive after this
> > change according to the logs. Given that the hard drive is not getting
> > mounted for journald to write logs to, I am not really sure how to get
> > them from the machine so I have at least taken a picture of what I see
> > on my screen; open to ideas on that front!
> 
> Bah. Fix below.

Tested-by: Nathan Chancellor <nathan@kernel.org>

> Thanks,
> 
>         tglx
> ---
> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> index 71802410e2ab..9b4910befeda 100644
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -1100,7 +1100,7 @@ EXPORT_SYMBOL(pci_irq_vector);
>   */
>  const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
>  {
> -	int irq = pci_irq_vector(dev, nr);
> +	int idx, irq = pci_irq_vector(dev, nr);
>  	struct msi_desc *desc;
>  
>  	if (WARN_ON_ONCE(irq <= 0))
> @@ -1113,7 +1113,10 @@ const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
>  
>  	if (WARN_ON_ONCE(!desc->affinity))
>  		return NULL;
> -	return &desc->affinity[nr].mask;
> +
> +	/* MSI has a mask array in the descriptor. */
> +	idx = dev->msi_enabled ? nr : 0;
> +	return &desc->affinity[idx].mask;
>  }
>  EXPORT_SYMBOL(pci_irq_get_affinity);
>  
>
Cédric Le Goater Dec. 18, 2021, 8:25 p.m. UTC | #4
On 12/18/21 11:25, Thomas Gleixner wrote:
> On Fri, Dec 17 2021 at 15:30, Nathan Chancellor wrote:
>> On Fri, Dec 10, 2021 at 11:19:26PM +0100, Thomas Gleixner wrote:
>> I just bisected a boot failure on my AMD test desktop to this patch as
>> commit f48235900182 ("PCI/MSI: Simplify pci_irq_get_affinity()") in
>> -next. It looks like there is a problem with the NVMe drive after this
>> change according to the logs. Given that the hard drive is not getting
>> mounted for journald to write logs to, I am not really sure how to get
>> them from the machine so I have at least taken a picture of what I see
>> on my screen; open to ideas on that front!
> 
> Bah. Fix below.

That's a fix for the issue I was seeing on pseries with NVMe.

Tested-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.


> Thanks,
> 
>          tglx
> ---
> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> index 71802410e2ab..9b4910befeda 100644
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -1100,7 +1100,7 @@ EXPORT_SYMBOL(pci_irq_vector);
>    */
>   const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
>   {
> -	int irq = pci_irq_vector(dev, nr);
> +	int idx, irq = pci_irq_vector(dev, nr);
>   	struct msi_desc *desc;
>   
>   	if (WARN_ON_ONCE(irq <= 0))
> @@ -1113,7 +1113,10 @@ const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
>   
>   	if (WARN_ON_ONCE(!desc->affinity))
>   		return NULL;
> -	return &desc->affinity[nr].mask;
> +
> +	/* MSI has a mask array in the descriptor. */
> +	idx = dev->msi_enabled ? nr : 0;
> +	return &desc->affinity[idx].mask;
>   }
>   EXPORT_SYMBOL(pci_irq_get_affinity);
>   
>
Thomas Gleixner Dec. 20, 2021, 11:55 a.m. UTC | #5
On Sat, Dec 18 2021 at 21:25, Cédric Le Goater wrote:

> On 12/18/21 11:25, Thomas Gleixner wrote:
>> On Fri, Dec 17 2021 at 15:30, Nathan Chancellor wrote:
>>> On Fri, Dec 10, 2021 at 11:19:26PM +0100, Thomas Gleixner wrote:
>>> I just bisected a boot failure on my AMD test desktop to this patch as
>>> commit f48235900182 ("PCI/MSI: Simplify pci_irq_get_affinity()") in
>>> -next. It looks like there is a problem with the NVMe drive after this
>>> change according to the logs. Given that the hard drive is not getting
>>> mounted for journald to write logs to, I am not really sure how to get
>>> them from the machine so I have at least taken a picture of what I see
>>> on my screen; open to ideas on that front!
>> 
>> Bah. Fix below.
>
> That's a fix for the issue I was seeing on pseries with NVMe.
>
> Tested-by: Cédric Le Goater <clg@kaod.org>

I had a faint memory that I've seen that issue before, but couldn't find
the mail in those massive threads.

Thanks for confirming!

       tglx
Guenter Roeck Jan. 30, 2022, 5:12 p.m. UTC | #6
On Fri, Dec 10, 2021 at 11:19:26PM +0100, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Replace open coded MSI descriptor chasing and use the proper accessor
> functions instead.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

This patch results in the following runtime warning when booting x86
(32 bit) nosmp images from NVME in qemu.

[   14.825482] nvme nvme0: 1/0/0 default/read/poll queues
ILLOPC: ca7c6d10: 0f 0b
[   14.826188] ------------[ cut here ]------------
[   14.826307] WARNING: CPU: 0 PID: 7 at drivers/pci/msi/msi.c:1114 pci_irq_get_affinity+0x80/0x90
[   14.826455] Modules linked in:
[   14.826640] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.17.0-rc1-00419-g1d2d8baaf053 #1
[   14.826797] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[   14.827132] Workqueue: nvme-reset-wq nvme_reset_work
[   14.827336] EIP: pci_irq_get_affinity+0x80/0x90
[   14.827452] Code: e8 d5 30 af ff 85 c0 75 bd 90 0f 0b 31 c0 5b 5e 5d c3 8d b4 26 00 00 00 00 90 5b b8 24 32 7e cb 5e 5d c3 8d b4 26 00 00 00 00 <0f> 0b eb e0 8d b4 26 00 00 00 00 8d 74 26 00 90 55 89 e5 57 56 53
[   14.827717] EAX: 00000000 EBX: c18ba000 ECX: 00000000 EDX: c297c210
[   14.827816] ESI: 00000001 EDI: c18ba000 EBP: c1247e24 ESP: c1247e1c
[   14.827924] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00000246
[   14.828110] CR0: 80050033 CR2: ffda9000 CR3: 0b8ad000 CR4: 000006d0
[   14.828268] Call Trace:
[   14.828554]  blk_mq_pci_map_queues+0x26/0x70
[   14.828710]  nvme_pci_map_queues+0x75/0xc0
[   14.828808]  blk_mq_update_queue_map+0x86/0xa0
[   14.828891]  blk_mq_alloc_tag_set+0xf3/0x390
[   14.828965]  ? nvme_wait_freeze+0x3d/0x50
[   14.829137]  nvme_reset_work+0xd02/0x1120
[   14.829269]  ? lock_acquire+0xc3/0x290
[   14.829435]  process_one_work+0x1ed/0x490
[   14.829569]  worker_thread+0x15e/0x3c0
[   14.829665]  kthread+0xd3/0x100
[   14.829729]  ? process_one_work+0x490/0x490
[   14.829799]  ? kthread_complete_and_exit+0x20/0x20
[   14.829890]  ret_from_fork+0x1c/0x28

Bisect results below.

#regzbot introduced: f48235900182d6

Guenter

---
# bad: [e783362eb54cd99b2cac8b3a9aeac942e6f6ac07] Linux 5.17-rc1
# good: [df0cc57e057f18e44dac8e6c18aba47ab53202f9] Linux 5.16
git bisect start 'v5.17-rc1' 'v5.16'
# good: [fef8dfaea9d6c444b6c2174b3a2b0fca4d226c5e] Merge tag 'regulator-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
git bisect good fef8dfaea9d6c444b6c2174b3a2b0fca4d226c5e
# bad: [3ceff4ea07410763d5d4cccd60349bf7691e7e61] Merge tag 'sound-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad 3ceff4ea07410763d5d4cccd60349bf7691e7e61
# good: [57ea81971b7296b42fc77424af44c5915d3d4ae2] Merge tag 'usb-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
git bisect good 57ea81971b7296b42fc77424af44c5915d3d4ae2
# bad: [feb7a43de5ef625ad74097d8fd3481d5dbc06a59] Merge tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad feb7a43de5ef625ad74097d8fd3481d5dbc06a59
# good: [ce990f1de0bc6ff3de43d385e0985efa980fba24] Merge tag 'for-linus-5.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
git bisect good ce990f1de0bc6ff3de43d385e0985efa980fba24
# good: [4afd2a9355a9deb16ea42b896820dacf49843a8f] Merge branches 'clk-ingenic' and 'clk-mediatek' into clk-next
git bisect good 4afd2a9355a9deb16ea42b896820dacf49843a8f
# good: [455e73a07f6e288b0061dfcf4fcf54fa9fe06458] Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
git bisect good 455e73a07f6e288b0061dfcf4fcf54fa9fe06458
# bad: [f2948df5f87a722591499da60ab91c611422f755] x86/pci/xen: Use msi_for_each_desc()
git bisect bad f2948df5f87a722591499da60ab91c611422f755
# good: [93296cd1325d1d9afede60202d8833011c9001f2] PCI/MSI: Allocate MSI device data on first use
git bisect good 93296cd1325d1d9afede60202d8833011c9001f2
# good: [82ff8e6b78fc4587a4255301f0a283506daf11b6] PCI/MSI: Use msi_get_virq() in pci_get_vector()
git bisect good 82ff8e6b78fc4587a4255301f0a283506daf11b6
# bad: [125282cd4f33ecd53a24ae4807409da0e5e90fd4] genirq/msi: Move descriptor list to struct msi_device_data
git bisect bad 125282cd4f33ecd53a24ae4807409da0e5e90fd4
# bad: [065afdc9c521f05c53f226dabe5dda2d30294d65] iommu/arm-smmu-v3: Use msi_get_virq()
git bisect bad 065afdc9c521f05c53f226dabe5dda2d30294d65
# bad: [f6632bb2c1454b857adcd131320379ec16fd8666] dmaengine: mv_xor_v2: Get rid of msi_desc abuse
git bisect bad f6632bb2c1454b857adcd131320379ec16fd8666
# bad: [f48235900182d64537c6e8f8dc0932b57a1a0638] PCI/MSI: Simplify pci_irq_get_affinity()
git bisect bad f48235900182d64537c6e8f8dc0932b57a1a0638
# first bad commit: [f48235900182d64537c6e8f8dc0932b57a1a0638] PCI/MSI: Simplify pci_irq_get_affinity()
Thomas Gleixner Jan. 31, 2022, 11:27 a.m. UTC | #7
On Sun, Jan 30 2022 at 09:12, Guenter Roeck wrote:
> On Fri, Dec 10, 2021 at 11:19:26PM +0100, Thomas Gleixner wrote:
> This patch results in the following runtime warning when booting x86
> (32 bit) nosmp images from NVME in qemu.
>
> [   14.825482] nvme nvme0: 1/0/0 default/read/poll queues
> ILLOPC: ca7c6d10: 0f 0b
> [   14.826188] ------------[ cut here ]------------
> [   14.826307] WARNING: CPU: 0 PID: 7 at drivers/pci/msi/msi.c:1114 pci_irq_get_affinity+0x80/0x90

This complains about msi_desc->affinity being NULL.

> git bisect bad f48235900182d64537c6e8f8dc0932b57a1a0638
> # first bad commit: [f48235900182d64537c6e8f8dc0932b57a1a0638] PCI/MSI: Simplify pci_irq_get_affinity()

Hrm. Can you please provide dmesg and /proc/interrupts from a
kernel before that commit?

Thanks,

        tglx
Guenter Roeck Jan. 31, 2022, 3:21 p.m. UTC | #8
On 1/31/22 03:27, Thomas Gleixner wrote:
> On Sun, Jan 30 2022 at 09:12, Guenter Roeck wrote:
>> On Fri, Dec 10, 2021 at 11:19:26PM +0100, Thomas Gleixner wrote:
>> This patch results in the following runtime warning when booting x86
>> (32 bit) nosmp images from NVME in qemu.
>>
>> [   14.825482] nvme nvme0: 1/0/0 default/read/poll queues
>> ILLOPC: ca7c6d10: 0f 0b
>> [   14.826188] ------------[ cut here ]------------
>> [   14.826307] WARNING: CPU: 0 PID: 7 at drivers/pci/msi/msi.c:1114 pci_irq_get_affinity+0x80/0x90
> 
> This complains about msi_desc->affinity being NULL.
> 
>> git bisect bad f48235900182d64537c6e8f8dc0932b57a1a0638
>> # first bad commit: [f48235900182d64537c6e8f8dc0932b57a1a0638] PCI/MSI: Simplify pci_irq_get_affinity()
> 
> Hrm. Can you please provide dmesg and /proc/interrupts from a
> kernel before that commit?
> 

Sure. Please see http://server.roeck-us.net/qemu/x86/.
The logs are generated with with v5.16.4.

Guenter
Thomas Gleixner Jan. 31, 2022, 9:16 p.m. UTC | #9
Guenter,

On Mon, Jan 31 2022 at 07:21, Guenter Roeck wrote:
> Sure. Please see http://server.roeck-us.net/qemu/x86/.
> The logs are generated with with v5.16.4.

thanks for providing the data. It definitely helped me to leave the
state of not seeing the wood for the trees. Fix below.

Thanks,

        tglx
---
Subject: PCI/MSI: Remove bogus warning in pci_irq_get_affinity()
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 31 Jan 2022 22:02:46 +0100

The recent overhaul of pci_irq_get_affinity() introduced a regression when
pci_irq_get_affinity() is called for an MSI-X interrupt which was not
allocated with affinity descriptor information.

The original code just returned a NULL pointer in that case, but the rework
added a WARN_ON() under the assumption that the corresponding WARN_ON() in
the MSI case can be applied to MSI-X as well.

In fact the MSI warning in the original code does not make sense either
because it's legitimate to invoke pci_irq_get_affinity() for a MSI
interrupt which was not allocated with affinity descriptor information.

Remove it and just return NULL as the original code did.

Fixes: f48235900182 ("PCI/MSI: Simplify pci_irq_get_affinity()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/pci/msi/msi.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -1111,7 +1111,8 @@ const struct cpumask *pci_irq_get_affini
 	if (!desc)
 		return cpu_possible_mask;
 
-	if (WARN_ON_ONCE(!desc->affinity))
+	/* MSI[X] interrupts can be allocated without affinity descriptor */
+	if (!desc->affinity)
 		return NULL;
 
 	/*
diff mbox series

Patch

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -1061,26 +1061,20 @@  EXPORT_SYMBOL(pci_irq_vector);
  */
 const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
 {
-	if (dev->msix_enabled) {
-		struct msi_desc *entry;
+	int irq = pci_irq_vector(dev, nr);
+	struct msi_desc *desc;
 
-		for_each_pci_msi_entry(entry, dev) {
-			if (entry->msi_index == nr)
-				return &entry->affinity->mask;
-		}
-		WARN_ON_ONCE(1);
+	if (WARN_ON_ONCE(irq <= 0))
 		return NULL;
-	} else if (dev->msi_enabled) {
-		struct msi_desc *entry = first_pci_msi_entry(dev);
 
-		if (WARN_ON_ONCE(!entry || !entry->affinity ||
-				 nr >= entry->nvec_used))
-			return NULL;
-
-		return &entry->affinity[nr].mask;
-	} else {
+	desc = irq_get_msi_desc(irq);
+	/* Non-MSI does not have the information handy */
+	if (!desc)
 		return cpu_possible_mask;
-	}
+
+	if (WARN_ON_ONCE(!desc->affinity))
+		return NULL;
+	return &desc->affinity[nr].mask;
 }
 EXPORT_SYMBOL(pci_irq_get_affinity);