Message ID | 20170628232204.15227-1-sthemmin@microsoft.com |
---|---|
State | Not Applicable |
> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, June 28, 2017 4:22 PM
> To: KY Srinivasan <kys@microsoft.com>; bhelgaas@google.com
> Cc: linux-pci@vger.kernel.org; devel@linuxdriverproject.org; Stephen
> Hemminger <sthemmin@microsoft.com>
> Subject: [PATCH] hv: fix msi affinity when device requests all possible CPU's
>
> When Intel 10G (ixgbevf) is passed to a Hyper-V guest with SR-IOV, the driver
> requests affinity with all possible CPU's (0-239) even those CPU's are not online
> (and will never be). Because of this the device is unable to correctly get MSI
> interrupt's setup.
>
> This was caused by the change in 4.12 that converted this affinity into all
> possible CPU's (0-31) but then host reports an error since this is larger than the
> number of online cpu's.
>
> Previously, this worked (up to 4.12-rc1) because only online cpu's would be put
> in mask passed to the host.
>
> This patch applies only to 4.12.
> The driver in linux-next needs a a different fix because of the changes to PCI
> host protocol version.

The vPCI patch in linux-next has the issue fixed already.

Regards,
Jork
Patch still needed for 4.12

-----Original Message-----
From: Jork Loeser
Sent: Thursday, June 29, 2017 3:08 PM
To: stephen@networkplumber.org; KY Srinivasan <kys@microsoft.com>; bhelgaas@google.com
Cc: linux-pci@vger.kernel.org; devel@linuxdriverproject.org; Stephen Hemminger <sthemmin@microsoft.com>
Subject: RE: [PATCH] hv: fix msi affinity when device requests all possible CPU's

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, June 28, 2017 4:22 PM
> To: KY Srinivasan <kys@microsoft.com>; bhelgaas@google.com
> Cc: linux-pci@vger.kernel.org; devel@linuxdriverproject.org; Stephen
> Hemminger <sthemmin@microsoft.com>
> Subject: [PATCH] hv: fix msi affinity when device requests all possible CPU's
>
> When Intel 10G (ixgbevf) is passed to a Hyper-V guest with SR-IOV, the driver
> requests affinity with all possible CPU's (0-239) even those CPU's are not online
> (and will never be). Because of this the device is unable to correctly get MSI
> interrupt's setup.
>
> This was caused by the change in 4.12 that converted this affinity into all
> possible CPU's (0-31) but then host reports an error since this is larger than the
> number of online cpu's.
>
> Previously, this worked (up to 4.12-rc1) because only online cpu's would be put
> in mask passed to the host.
>
> This patch applies only to 4.12.
> The driver in linux-next needs a a different fix because of the changes to PCI
> host protocol version.

The vPCI patch in linux-next has the issue fixed already.

Regards,
Jork
On Wed, Jun 28, 2017 at 04:22:04PM -0700, Stephen Hemminger wrote:
> When Intel 10G (ixgbevf) is passed to a Hyper-V guest with SR-IOV,
> the driver requests affinity with all possible CPU's (0-239) even
> those CPU's are not online (and will never be). Because of this the device
> is unable to correctly get MSI interrupt's setup.
>
> This was caused by the change in 4.12 that converted this affinity
> into all possible CPU's (0-31) but then host reports
> an error since this is larger than the number of online cpu's.
>
> Previously, this worked (up to 4.12-rc1) because only online cpu's
> would be put in mask passed to the host.
>
> This patch applies only to 4.12.
> The driver in linux-next needs a a different fix because of the changes
> to PCI host protocol version.

If Linus decides to postpone v4.12 a week, I can ask him to pull this. But
I suspect he will release v4.12 today. In that case, I don't know what to
do with this other than maybe send it to Greg for a -stable release.

> Fixes: 433fcf6b7b31 ("PCI: hv: Specify CPU_AFFINITY_ALL for MSI affinity when >= 32 CPUs")
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
>  drivers/pci/host/pci-hyperv.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> index 84936383e269..3cadfcca3ae9 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -900,10 +900,12 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
>  	 * processors because Hyper-V only supports 64 in a guest.
>  	 */
>  	affinity = irq_data_get_affinity_mask(data);
> +	cpumask_and(affinity, affinity, cpu_online_mask);
> +
>  	if (cpumask_weight(affinity) >= 32) {
>  		int_pkt->int_desc.cpu_mask = CPU_AFFINITY_ALL;
>  	} else {
> -		for_each_cpu_and(cpu, affinity, cpu_online_mask) {
> +		for_each_cpu(cpu, affinity) {
>  			int_pkt->int_desc.cpu_mask |=
>  				(1ULL << vmbus_cpu_number_to_vp_number(cpu));
>  		}
> --
> 2.11.0
>
On Sun, 2 Jul 2017 16:38:19 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Wed, Jun 28, 2017 at 04:22:04PM -0700, Stephen Hemminger wrote:
> > When Intel 10G (ixgbevf) is passed to a Hyper-V guest with SR-IOV,
> > the driver requests affinity with all possible CPU's (0-239) even
> > those CPU's are not online (and will never be). Because of this the device
> > is unable to correctly get MSI interrupt's setup.
> >
> > This was caused by the change in 4.12 that converted this affinity
> > into all possible CPU's (0-31) but then host reports
> > an error since this is larger than the number of online cpu's.
> >
> > Previously, this worked (up to 4.12-rc1) because only online cpu's
> > would be put in mask passed to the host.
> >
> > This patch applies only to 4.12.
> > The driver in linux-next needs a a different fix because of the changes
> > to PCI host protocol version.
>
> If Linus decides to postpone v4.12 a week, I can ask him to pull this. But
> I suspect he will release v4.12 today. In that case, I don't know what to
> do with this other than maybe send it to Greg for a -stable release.

Looks like this will have to be queued for 4.12 stable.
On Tue, Jul 04, 2017 at 02:59:42PM -0700, Stephen Hemminger wrote:
> On Sun, 2 Jul 2017 16:38:19 -0500
> Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> > On Wed, Jun 28, 2017 at 04:22:04PM -0700, Stephen Hemminger wrote:
> > > When Intel 10G (ixgbevf) is passed to a Hyper-V guest with SR-IOV,
> > > the driver requests affinity with all possible CPU's (0-239) even
> > > those CPU's are not online (and will never be). Because of this the device
> > > is unable to correctly get MSI interrupt's setup.
> > >
> > > This was caused by the change in 4.12 that converted this affinity
> > > into all possible CPU's (0-31) but then host reports
> > > an error since this is larger than the number of online cpu's.
> > >
> > > Previously, this worked (up to 4.12-rc1) because only online cpu's
> > > would be put in mask passed to the host.
> > >
> > > This patch applies only to 4.12.
> > > The driver in linux-next needs a a different fix because of the changes
> > > to PCI host protocol version.
> >
> > If Linus decides to postpone v4.12 a week, I can ask him to pull this. But
> > I suspect he will release v4.12 today. In that case, I don't know what to
> > do with this other than maybe send it to Greg for a -stable release.
>
> Looks like this will have to be queued for 4.12 stable.

I assume you'll take care of this, right? It sounds like there's nothing
to do for upstream because it needs a different fix.

Bjorn
On Wed, 5 Jul 2017 14:49:33 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Tue, Jul 04, 2017 at 02:59:42PM -0700, Stephen Hemminger wrote:
> > On Sun, 2 Jul 2017 16:38:19 -0500
> > Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > > On Wed, Jun 28, 2017 at 04:22:04PM -0700, Stephen Hemminger wrote:
> > > > When Intel 10G (ixgbevf) is passed to a Hyper-V guest with SR-IOV,
> > > > the driver requests affinity with all possible CPU's (0-239) even
> > > > those CPU's are not online (and will never be). Because of this the device
> > > > is unable to correctly get MSI interrupt's setup.
> > > >
> > > > This was caused by the change in 4.12 that converted this affinity
> > > > into all possible CPU's (0-31) but then host reports
> > > > an error since this is larger than the number of online cpu's.
> > > >
> > > > Previously, this worked (up to 4.12-rc1) because only online cpu's
> > > > would be put in mask passed to the host.
> > > >
> > > > This patch applies only to 4.12.
> > > > The driver in linux-next needs a a different fix because of the changes
> > > > to PCI host protocol version.
> > >
> > > If Linus decides to postpone v4.12 a week, I can ask him to pull this. But
> > > I suspect he will release v4.12 today. In that case, I don't know what to
> > > do with this other than maybe send it to Greg for a -stable release.
> >
> > Looks like this will have to be queued for 4.12 stable.
>
> I assume you'll take care of this, right? It sounds like there's nothing
> to do for upstream because it needs a different fix.
>
> Bjorn

Already fixed in Linux-next. The code is different for PCI 1.2 version
and never had the bug.
diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 84936383e269..3cadfcca3ae9 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -900,10 +900,12 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 	 * processors because Hyper-V only supports 64 in a guest.
 	 */
 	affinity = irq_data_get_affinity_mask(data);
+	cpumask_and(affinity, affinity, cpu_online_mask);
+
 	if (cpumask_weight(affinity) >= 32) {
 		int_pkt->int_desc.cpu_mask = CPU_AFFINITY_ALL;
 	} else {
-		for_each_cpu_and(cpu, affinity, cpu_online_mask) {
+		for_each_cpu(cpu, affinity) {
 			int_pkt->int_desc.cpu_mask |=
 				(1ULL << vmbus_cpu_number_to_vp_number(cpu));
 		}
When Intel 10G (ixgbevf) is passed to a Hyper-V guest with SR-IOV,
the driver requests affinity with all possible CPUs (0-239), even
though those CPUs are not online (and never will be). Because of this,
the device is unable to get its MSI interrupts set up correctly.

This was caused by the change in 4.12 that converted this affinity
into all possible CPUs (0-31), but the host then reports
an error since this is larger than the number of online CPUs.

Previously, this worked (up to 4.12-rc1) because only online CPUs
would be put in the mask passed to the host.

This patch applies only to 4.12.
The driver in linux-next needs a different fix because of the changes
to the PCI host protocol version.

Fixes: 433fcf6b7b31 ("PCI: hv: Specify CPU_AFFINITY_ALL for MSI affinity when >= 32 CPUs")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/pci/host/pci-hyperv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
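
For readers who want to see why the order of the two operations matters, here is a small user-space sketch of the mask computation before and after the patch. It is not kernel code and makes several assumptions: a CPU mask is modelled as a single 64-bit word, `cpu_to_vp()` is a hypothetical identity stand-in for `vmbus_cpu_number_to_vp_number()`, `SKETCH_CPU_AFFINITY_ALL` is an illustrative value, and the masks are passed by value, whereas the patch ANDs the IRQ's affinity mask in place.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one 64-bit word models a cpumask, bit n == CPU n. */
typedef uint64_t sketch_mask_t;

/* Illustrative "all CPUs" marker; stands in for CPU_AFFINITY_ALL. */
#define SKETCH_CPU_AFFINITY_ALL (~0ULL)

/* Count set bits, playing the role of cpumask_weight(). */
static int sketch_weight(sketch_mask_t m)
{
	int n = 0;

	while (m) {
		m &= m - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Hypothetical identity mapping from CPU number to VP number. */
static unsigned int cpu_to_vp(unsigned int cpu)
{
	assert(cpu < 64);
	return cpu;
}

/* Pre-patch order: weight test on the requested mask, online filter only in the loop. */
static sketch_mask_t compose_before(sketch_mask_t affinity, sketch_mask_t online)
{
	sketch_mask_t out = 0;
	unsigned int cpu;

	if (sketch_weight(affinity) >= 32)
		return SKETCH_CPU_AFFINITY_ALL;

	for (cpu = 0; cpu < 64; cpu++)
		if ((affinity & online) & (1ULL << cpu))
			out |= 1ULL << cpu_to_vp(cpu);
	return out;
}

/* Post-patch order: restrict to online CPUs first, then apply the weight test. */
static sketch_mask_t compose_after(sketch_mask_t affinity, sketch_mask_t online)
{
	sketch_mask_t out = 0;
	unsigned int cpu;

	affinity &= online;

	if (sketch_weight(affinity) >= 32)
		return SKETCH_CPU_AFFINITY_ALL;

	for (cpu = 0; cpu < 64; cpu++)
		if (affinity & (1ULL << cpu))
			out |= 1ULL << cpu_to_vp(cpu);
	return out;
}
```

With a request for every CPU (`~0ULL`) and only four CPUs online (`0xF`), `compose_before()` takes the "all CPUs" branch, which is the case the commit message says the host rejects, while `compose_after()` yields just the four online CPUs.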