[Bionic,v3,00/21] blk-mq scheduler fixes

Message ID 20180515130324.23815-1-joserz@linux.vnet.ibm.com

Message

Jose Ricardo Ziviani May 15, 2018, 1:03 p.m. UTC
From: Jose Ricardo Ziviani <joserz@linux.ibm.com>

Hello team!

Weeks ago I sent a patchset with:
 * genirq/affinity: Spread irq vectors among present CPUs as far as possible
 * blk-mq: simplify queue mapping & schedule with each possisble CPU

Unfortunately they broke some cases, in particular the hpsa driver, so they
had to be reverted. However, the underlying blk-mq bugs make KVM/QEMU
virtual machines very unstable whenever CPU hotplug/unplug or SMT changes
are performed, and they also impact live migration.
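
At the core of the blk-mq side of the series is the idea of mapping all
possible CPUs to the hardware queues up front and then, at dispatch time,
picking a currently online CPU from the queue's mask. A minimal sketch of
that selection helper, following the shape of the "blk-mq: introduce
blk_mq_hw_queue_first_cpu() to figure out first cpu" patch listed below
(simplified here, not the verbatim commit):

static int blk_mq_hw_queue_first_cpu(struct blk_mq_hw_ctx *hctx)
{
	/* Prefer a CPU that is both mapped to this hctx and currently online. */
	int cpu = cpumask_first_and(hctx->cpumask, cpu_online_mask);

	/* Every mapped CPU is offline: fall back to the first possible CPU. */
	if (cpu >= nr_cpu_ids)
		cpu = cpumask_first(hctx->cpumask);
	return cpu;
}

With the mapping covering every possible CPU, hotplug and SMT changes only
shift which CPU the helper returns; no hctx is ever left without a valid
mapping.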

So this is a new attempt to have the patches included. This version carries
all of the related fixes currently available upstream.

It's based on the Ubuntu-4.15.0-21.22 tag.

V3:
 - Added Buglink line in each commit

V2:
 - Included all the fixes necessary for blk_mq

Thank you!

Jose Ricardo Ziviani

Bart Van Assche (1):
  blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended
    delays

Christoph Hellwig (2):
  genirq/affinity: assign vectors to all possible CPUs
  blk-mq: simplify queue mapping & schedule with each possisble CPU

Jens Axboe (1):
  blk-mq: fix discard merge with scheduler attached

Ming Lei (16):
  genirq/affinity: Rename *node_to_possible_cpumask as *node_to_cpumask
  genirq/affinity: Move actual irq vector spreading into a helper
    function
  genirq/affinity: Allow irq spreading from a given starting point
  genirq/affinity: Spread irq vectors among present CPUs as far as
    possible
  blk-mq: make sure hctx->next_cpu is set correctly
  blk-mq: make sure that correct hctx->next_cpu is set
  blk-mq: don't keep offline CPUs mapped to hctx 0
  blk-mq: avoid to write intermediate result to hctx->next_cpu
  blk-mq: introduce blk_mq_hw_queue_first_cpu() to figure out first cpu
  blk-mq: don't check queue mapped in __blk_mq_delay_run_hw_queue()
  nvme: pci: pass max vectors as num_possible_cpus() to
    pci_alloc_irq_vectors
  scsi: hpsa: fix selection of reply queue
  scsi: megaraid_sas: fix selection of reply queue
  scsi: core: introduce force_blk_mq
  scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity
  scsi: virtio_scsi: unify scsi_host_template

Thomas Gleixner (1):
  genirq/affinity: Don't return with empty affinity masks on error
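
The two "fix selection of reply queue" patches in the list above share one
pattern: instead of deriving the reply queue from the raw submitting CPU
number, the driver builds a per-CPU map from the PCI irq affinity masks at
probe time. A hedged sketch of that pattern, using hpsa-style names
(simplified from the upstream fix; treat the details as illustrative):

static void hpsa_setup_reply_map(struct ctlr_info *h)
{
	const struct cpumask *mask;
	unsigned int queue, cpu;

	for (queue = 0; queue < h->msix_vectors; queue++) {
		mask = pci_irq_get_affinity(h->pdev, queue);
		if (!mask)
			goto fallback;
		/* Every CPU in this vector's affinity mask replies on 'queue'. */
		for_each_cpu(cpu, mask)
			h->reply_map[cpu] = queue;
	}
	return;

fallback:
	/* No affinity info available: route everything through reply queue 0. */
	for_each_possible_cpu(cpu)
		h->reply_map[cpu] = 0;
}

Submission paths then look up h->reply_map[raw_smp_processor_id()] rather
than computing a queue index that might name a vector with no online CPU
behind it.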

 block/blk-core.c                            |   2 +
 block/blk-merge.c                           |  29 +++-
 block/blk-mq-cpumap.c                       |   5 -
 block/blk-mq.c                              |  65 +++++---
 drivers/nvme/host/pci.c                     |   2 +-
 drivers/scsi/hosts.c                        |   1 +
 drivers/scsi/hpsa.c                         |  73 ++++++---
 drivers/scsi/hpsa.h                         |   1 +
 drivers/scsi/megaraid/megaraid_sas.h        |   1 +
 drivers/scsi/megaraid/megaraid_sas_base.c   |  39 ++++-
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  12 +-
 drivers/scsi/virtio_scsi.c                  | 129 ++-------------
 include/scsi/scsi_host.h                    |   3 +
 kernel/irq/affinity.c                       | 166 +++++++++++++-------
 14 files changed, 296 insertions(+), 232 deletions(-)

Comments

Stefan Bader June 7, 2018, 8:05 p.m. UTC | #1
On 15.05.2018 06:03, Jose Ricardo Ziviani wrote:
> [cover letter and diffstat quoted in full; trimmed]

I went over the proposed changes and verified that upstream currently has no
further fixes referencing the patches submitted here. Two of them have by
now already been applied (via stable):

* blk-mq: fix discard merge with scheduler attached
* blk-mq: don't keep offline CPUs mapped to hctx 0

The remaining changes are all either directly related to fixing the issue at
hand or follow-ups addressing problems the earlier patches introduced.

Acked-by: Stefan Bader <stefan.bader@canonical.com>
Kleber Sacilotto de Souza June 7, 2018, 9:15 p.m. UTC | #2
On 05/15/18 06:03, Jose Ricardo Ziviani wrote:
> [cover letter and diffstat quoted in full; trimmed]

The changes don't seem as intrusive as I first thought: they introduce few
new features and are for the most part follow-up fixes. As Stefan mentioned,
the patch set appears to carry all of the fixes sent upstream so far.

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Khalid Elmously June 8, 2018, 9:50 p.m. UTC | #3
Applied to Bionic

On 2018-05-15 10:03:03, Jose Ricardo Ziviani wrote:
> [cover letter, diffstat, and mailing list footer quoted in full; trimmed]