
[10/15] hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtime

Message ID 20211007162406.1920374-11-lukasz.maniak@linux.intel.com
State New
Series hw/nvme: SR-IOV with Virtualization Enhancements

Commit Message

Lukasz Maniak Oct. 7, 2021, 4:24 p.m. UTC
From: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>

The NVMe device defines two properties, max_ioqpairs and msix_qsize. Having
them as constants is problematic for SR-IOV support.

The SR-IOV feature introduces virtual resources (queues, interrupts)
that can be assigned to PF and its dependent VFs. Each device, following
a reset, should work with the configured number of queues. A single
constant is no longer sufficient to hold the whole state.

This patch tries to solve the problem by introducing additional
variables in NvmeCtrl’s state. The variables for, e.g., managing queues
are therefore organized as:

 - n->params.max_ioqpairs – no changes, constant set by the user.

 - n->max_ioqpairs - (new) value derived from n->params.* in realize();
                     constant through device’s lifetime.

 - n->(mutable_state) – (not a part of this patch) user-configurable,
                        specifies number of queues available _after_
                        reset.

 - n->conf_ioqpairs - (new) used in all the places instead of the ‘old’
                      n->params.max_ioqpairs; initialized in realize()
                      and updated during reset() to reflect user’s
                      changes to the mutable state.

Since the number of available I/O queues and interrupts can change at
runtime, the buffers for SQs/CQs and the MSI-X-related structures are
allocated large enough to handle the upper limits, which avoids
complicated reallocation. A helper function (nvme_update_msixcap_ts)
updates the corresponding capability register to signal configuration
changes.

Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
---
 hw/nvme/ctrl.c | 62 +++++++++++++++++++++++++++++++++-----------------
 hw/nvme/nvme.h |  4 ++++
 2 files changed, 45 insertions(+), 21 deletions(-)

Comments

Philippe Mathieu-Daudé Oct. 18, 2021, 10:06 a.m. UTC | #1
On 10/7/21 18:24, Lukasz Maniak wrote:

> @@ -6322,11 +6334,17 @@ static void nvme_init_state(NvmeCtrl *n)
>      NvmeSecCtrlEntry *sctrl;
>      int i;
>  
> +    n->max_ioqpairs = n->params.max_ioqpairs;
> +    n->conf_ioqpairs = n->max_ioqpairs;
> +
> +    n->max_msix_qsize = n->params.msix_qsize;
> +    n->conf_msix_qsize = n->max_msix_qsize;

From a developer's perspective, the API becomes confusing.
Most fields from NvmeParams are exposed via QMP, such as max_ioqpairs.

I'm not sure we need 2 distinct fields. Maybe simply reorganize
to not reset these values in the DeviceReset handler?

Also, with this series we should consider implementing the migration
state (nvme_vmstate).
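
For reference, a minimal sketch of what such a migration description could
look like (hypothetical, not part of this series; only the field names come
from this patch):

static const VMStateDescription nvme_vmstate = {
    .name = "nvme",
    .version_id = 1,
    .minimum_version_id = 1,
    .fields = (VMStateField[]) {
        VMSTATE_PCI_DEVICE(parent_obj, NvmeCtrl),
        /* runtime-configured limits that would have to survive migration */
        VMSTATE_UINT32(conf_ioqpairs, NvmeCtrl),
        VMSTATE_UINT32(conf_msix_qsize, NvmeCtrl),
        VMSTATE_END_OF_LIST()
    },
};

A complete implementation would of course also have to cover the queues and
the rest of the controller state.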

> diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
> index 9fbb0a70b5..65383e495c 100644
> --- a/hw/nvme/nvme.h
> +++ b/hw/nvme/nvme.h
> @@ -420,6 +420,10 @@ typedef struct NvmeCtrl {
>      uint64_t    starttime_ms;
>      uint16_t    temperature;
>      uint8_t     smart_critical_warning;
> +    uint32_t    max_msix_qsize;                 /* Derived from params.msix.qsize */
> +    uint32_t    conf_msix_qsize;                /* Configured limit */
> +    uint32_t    max_ioqpairs;                   /* Derived from params.max_ioqpairs */
> +    uint32_t    conf_ioqpairs;                  /* Configured limit */
>
Łukasz Gieryk Oct. 18, 2021, 3:53 p.m. UTC | #2
On Mon, Oct 18, 2021 at 12:06:22PM +0200, Philippe Mathieu-Daudé wrote:
> On 10/7/21 18:24, Lukasz Maniak wrote:
> 
> > @@ -6322,11 +6334,17 @@ static void nvme_init_state(NvmeCtrl *n)
> >      NvmeSecCtrlEntry *sctrl;
> >      int i;
> >  
> > +    n->max_ioqpairs = n->params.max_ioqpairs;
> > +    n->conf_ioqpairs = n->max_ioqpairs;
> > +
> > +    n->max_msix_qsize = n->params.msix_qsize;
> > +    n->conf_msix_qsize = n->max_msix_qsize;
> 
> From a developer's perspective, the API becomes confusing.
> Most fields from NvmeParams are exposed via QMP, such as max_ioqpairs.

Hi Philippe,

I’m not sure I understand your concern. The NvmeParams stays as it was,
so the interaction with QMP stays unchanged. Sure, if QMP allows
updating NvmeParams at runtime (I’m guessing, as I’m not really
familiar with the feature), then the NVMe device will no longer
respond to those changes. But n->conf_ioqpairs is not meant to be
altered via QEMU’s interfaces, but rather through the NVMe protocol, by
the guest OS kernel/user.

Could you explain how the changes are going to break (or make more
confusing) the interaction with QMP?

> I'm not sure we need 2 distinct fields. Maybe simply reorganize
> to not reset these values in the DeviceReset handler?

The idea was to calculate the max value once and use it in multiple
places later. The actual calculations are in the following 12/15 patch
(I’m also including the code below), so indeed, the intended use case
is not so obvious.

if (pci_is_vf(&n->parent_obj)) {
    n->max_ioqpairs = n->params.sriov_max_vq_per_vf - 1;
} else {
    n->max_ioqpairs = n->params.max_ioqpairs +
                      n->params.sriov_max_vfs * n->params.sriov_max_vq_per_vf;
}

But as I think more about the problem, indeed, the max_* fields may
not be necessary. I could calculate max_msix_qsize in the only place
it’s used and turn the above snippet for max_ioqpairs into a
function. The downside is that the code for calculating the maximums is
no longer grouped together.
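
For instance, a minimal sketch of such a helper, based on the snippet above
(hypothetical; not code from the posted series):

static uint32_t nvme_max_ioqpairs(NvmeCtrl *n)
{
    /* A VF draws its queues from the flexible resources; one of them is
     * consumed by the admin queue. */
    if (pci_is_vf(&n->parent_obj)) {
        return n->params.sriov_max_vq_per_vf - 1;
    }

    /* Upper bound for the PF: its private queues plus all flexible
     * resources that could ever be assigned back to it. */
    return n->params.max_ioqpairs +
           n->params.sriov_max_vfs * n->params.sriov_max_vq_per_vf;
}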

> Also, with this series we should consider implementing the migration
> state (nvme_vmstate).

I wasn’t aware of this feature. I’ll have to do some reading first.

> > diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
> > index 9fbb0a70b5..65383e495c 100644
> > --- a/hw/nvme/nvme.h
> > +++ b/hw/nvme/nvme.h
> > @@ -420,6 +420,10 @@ typedef struct NvmeCtrl {
> >      uint64_t    starttime_ms;
> >      uint16_t    temperature;
> >      uint8_t     smart_critical_warning;
> > +    uint32_t    max_msix_qsize;                 /* Derived from params.msix.qsize */
> > +    uint32_t    conf_msix_qsize;                /* Configured limit */
> > +    uint32_t    max_ioqpairs;                   /* Derived from params.max_ioqpairs */
> > +    uint32_t    conf_ioqpairs;                  /* Configured limit */
> >  
>
Klaus Jensen Oct. 20, 2021, 7:06 p.m. UTC | #3
On Oct  7 18:24, Lukasz Maniak wrote:
Instead of this, how about adding new parameters, say, sriov_vi_private
and sriov_vq_private. Then, max_ioqpairs and msix_qsize are still the
"physical" limits and the new parameters just reserve some for the
primary controller, the rest being available for flexible resources.
Klaus Jensen Oct. 20, 2021, 7:26 p.m. UTC | #4
On Oct  7 18:24, Lukasz Maniak wrote:
> +static void nvme_update_msixcap_ts(PCIDevice *pci_dev, uint32_t table_size)
> +{
> +    uint8_t *config;
> +
> +    assert(pci_dev->msix_cap);

Not all platforms support MSI-X, so an assert() is not right.
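
One way to address that (a sketch of a possible fix, not code from this
series) would be to return early when MSI-X is absent instead of asserting:

static void nvme_update_msixcap_ts(PCIDevice *pci_dev, uint32_t table_size)
{
    uint8_t *config;

    if (!msix_present(pci_dev)) {
        /* MSI-X is unsupported or was not initialized; nothing to update */
        return;
    }

    assert(table_size <= pci_dev->msix_entries_nr);

    config = pci_dev->config + pci_dev->msix_cap;
    pci_set_word_by_mask(config + PCI_MSIX_FLAGS, PCI_MSIX_FLAGS_QSIZE,
                         table_size - 1);
}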
Łukasz Gieryk Oct. 21, 2021, 1:40 p.m. UTC | #5
On Wed, Oct 20, 2021 at 09:06:06PM +0200, Klaus Jensen wrote:
> On Oct  7 18:24, Lukasz Maniak wrote:
> 
> Instead of this, how about adding new parameters, say, sriov_vi_private
> and sriov_vq_private. Then, max_ioqpairs and msix_qsize are still the
> "physical" limits and the new parameters just reserve some for the
> primary controller, the rest being available for flexible resources.

Compare your configuration:

    max_ioqpairs     = 26
    sriov_max_vfs    = 4
    sriov_vq_private = 10

with mine:

    max_ioqpairs        = 10
    sriov_max_vfs       = 4
    sriov_max_vq_per_vf = 4

In your version, if I wanted to change max_vfs but keep the same number
of flexible resources per VF, then I would have to do some math and
update max_ioqpairs. And then I would also have to adjust the other
interrupt-related parameter, as it's also affected. In my opinion
it's quite inconvenient.
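
For illustration (assuming, as the numbers above imply, 4 flexible queues per
VF): going from 4 to 8 VFs in the quoted scheme means recomputing

    max_ioqpairs = sriov_vq_private + sriov_max_vfs * 4 = 10 + 8 * 4 = 42

(plus a matching msix_qsize update), whereas here only sriov_max_vfs changes.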
 
Now, even if I changed the semantics of the params, I would still need most
of this patch. (Let’s keep the discussion regarding whether the max_* fields
are necessary in the other thread.)

Without virtualization, the maximum number of queues is constant. The user
(i.e., the nvme kernel driver) can only query this value (e.g., 10) and
needs to stay within this limit.

With virtualization, the flexible resources kick in. Let's continue with
the sample numbers defined earlier (10 private + 16 flexible resources).

1) The device boots; all 16 flexible queues are assigned to the primary
   controller.
2) The nvme kernel driver queries the limit (10+16=26) and can create/use
   up to this many queues.
3) The user, via the Virtualization Management command, unbinds some (let's
   say 2) of the flexible queues from the primary controller and assigns
   them to a secondary controller.
4) After reset, the Physical Function device reports a different limit
   (24), and when the Virtual Function device shows up, it will report 1
   (the admin queue consumed the other resource).

So I need an additional variable in the state to store the intermediate
limit (24 or 1), as none of the existing params holds the correct value,
and all the places that validate limits must work on that value.
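
A rough sketch of what step 4 could look like in code (hypothetical;
nvme_assigned_flexible_vqs() is an assumed helper, not something from this
series):

static void nvme_refresh_conf_ioqpairs(NvmeCtrl *n)
{
    /* flexible VQ resources currently bound to this controller (assumed helper) */
    uint32_t assigned = nvme_assigned_flexible_vqs(n);

    if (pci_is_vf(&n->parent_obj)) {
        /* VF: one assigned resource is consumed by the admin queue */
        n->conf_ioqpairs = assigned ? assigned - 1 : 0;
    } else {
        /* PF: private queues plus the flexible resources it still owns */
        n->conf_ioqpairs = n->params.max_ioqpairs + assigned;
    }
}

Called from reset(), this would make the reported limit track the
virtualization management changes described above.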
Klaus Jensen Nov. 3, 2021, 12:11 p.m. UTC | #6
On Oct 21 15:40, Łukasz Gieryk wrote:
> On Wed, Oct 20, 2021 at 09:06:06PM +0200, Klaus Jensen wrote:
> > On Oct  7 18:24, Lukasz Maniak wrote:
> > 
> > Instead of this, how about adding new parameters, say, sriov_vi_private
> > and sriov_vq_private. Then, max_ioqpairs and msix_qsize are still the
> > "physical" limits and the new parameters just reserve some for the
> > primary controller, the rest being available for flexible resources.
> 
> Compare your configuration:
> 
>     max_ioqpairs     = 26
>     sriov_max_vfs    = 4
>     sriov_vq_private = 10
> 
> with mine:
> 
>     max_ioqpairs        = 10
>     sriov_max_vfs       = 4
>     sriov_max_vq_per_vf = 4
> 
> In your version, if I wanted to change max_vfs but keep the same number
> of flexible resources per VF, then I would have to do some math and
> update max_ioqpairs. And then I would also have to adjust the other
> interrupt-related parameter, as it's also affected. In my opinion
> it's quite inconvenient.

True, that is probably inconvenient, but we have tools to do this math
for us. I very much prefer to be explicit in these parameters.

Also, see my comment on patch 12. If we keep this meaning of
max_ioqpairs, then we have reasonable defaults for the number of private
resources (if no flexible resources are required) and I think we can
control all parameters in the capabilities structures (with a little
math).

>  
> Now, even if I changed the semantics of the params, I would still need most
> of this patch. (Let’s keep the discussion regarding whether the max_* fields
> are necessary in the other thread.)
> 
> Without virtualization, the maximum number of queues is constant. The user
> (i.e., the nvme kernel driver) can only query this value (e.g., 10) and
> needs to stay within this limit.
> 
> With virtualization, the flexible resources kick in. Let's continue with
> the sample numbers defined earlier (10 private + 16 flexible resources).
> 
> 1) The device boots; all 16 flexible queues are assigned to the primary
>    controller.
> 2) The nvme kernel driver queries the limit (10+16=26) and can create/use
>    up to this many queues.
> 3) The user, via the Virtualization Management command, unbinds some (let's
>    say 2) of the flexible queues from the primary controller and assigns
>    them to a secondary controller.
> 4) After reset, the Physical Function device reports a different limit
>    (24), and when the Virtual Function device shows up, it will report 1
>    (the admin queue consumed the other resource).
> 
> So I need an additional variable in the state to store the intermediate
> limit (24 or 1), as none of the existing params holds the correct value,
> and all the places that validate limits must work on that value.
> 

I do not contest that you need additional state to keep track of
assigned resources. That seems totally reasonable.

Patch

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index b04cf5eae9..5d9166d66f 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -416,12 +416,12 @@  static bool nvme_nsid_valid(NvmeCtrl *n, uint32_t nsid)
 
 static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid)
 {
-    return sqid < n->params.max_ioqpairs + 1 && n->sq[sqid] != NULL ? 0 : -1;
+    return sqid < n->conf_ioqpairs + 1 && n->sq[sqid] != NULL ? 0 : -1;
 }
 
 static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid)
 {
-    return cqid < n->params.max_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
+    return cqid < n->conf_ioqpairs + 1 && n->cq[cqid] != NULL ? 0 : -1;
 }
 
 static void nvme_inc_cq_tail(NvmeCQueue *cq)
@@ -4034,8 +4034,7 @@  static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeRequest *req)
         trace_pci_nvme_err_invalid_create_sq_cqid(cqid);
         return NVME_INVALID_CQID | NVME_DNR;
     }
-    if (unlikely(!sqid || sqid > n->params.max_ioqpairs ||
-        n->sq[sqid] != NULL)) {
+    if (unlikely(!sqid || sqid > n->conf_ioqpairs || n->sq[sqid] != NULL)) {
         trace_pci_nvme_err_invalid_create_sq_sqid(sqid);
         return NVME_INVALID_QID | NVME_DNR;
     }
@@ -4382,8 +4381,7 @@  static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
     trace_pci_nvme_create_cq(prp1, cqid, vector, qsize, qflags,
                              NVME_CQ_FLAGS_IEN(qflags) != 0);
 
-    if (unlikely(!cqid || cqid > n->params.max_ioqpairs ||
-        n->cq[cqid] != NULL)) {
+    if (unlikely(!cqid || cqid > n->conf_ioqpairs || n->cq[cqid] != NULL)) {
         trace_pci_nvme_err_invalid_create_cq_cqid(cqid);
         return NVME_INVALID_QID | NVME_DNR;
     }
@@ -4399,7 +4397,7 @@  static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
         trace_pci_nvme_err_invalid_create_cq_vector(vector);
         return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
     }
-    if (unlikely(vector >= n->params.msix_qsize)) {
+    if (unlikely(vector >= n->conf_msix_qsize)) {
         trace_pci_nvme_err_invalid_create_cq_vector(vector);
         return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
     }
@@ -4980,13 +4978,12 @@  defaults:
 
         break;
     case NVME_NUMBER_OF_QUEUES:
-        result = (n->params.max_ioqpairs - 1) |
-            ((n->params.max_ioqpairs - 1) << 16);
+        result = (n->conf_ioqpairs - 1) | ((n->conf_ioqpairs - 1) << 16);
         trace_pci_nvme_getfeat_numq(result);
         break;
     case NVME_INTERRUPT_VECTOR_CONF:
         iv = dw11 & 0xffff;
-        if (iv >= n->params.max_ioqpairs + 1) {
+        if (iv >= n->conf_ioqpairs + 1) {
             return NVME_INVALID_FIELD | NVME_DNR;
         }
 
@@ -5141,10 +5138,10 @@  static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
 
         trace_pci_nvme_setfeat_numq((dw11 & 0xffff) + 1,
                                     ((dw11 >> 16) & 0xffff) + 1,
-                                    n->params.max_ioqpairs,
-                                    n->params.max_ioqpairs);
-        req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) |
-                                      ((n->params.max_ioqpairs - 1) << 16));
+                                    n->conf_ioqpairs,
+                                    n->conf_ioqpairs);
+        req->cqe.result = cpu_to_le32((n->conf_ioqpairs - 1) |
+                                      ((n->conf_ioqpairs - 1) << 16));
         break;
     case NVME_ASYNCHRONOUS_EVENT_CONF:
         n->features.async_config = dw11;
@@ -5582,8 +5579,21 @@  static void nvme_process_sq(void *opaque)
     }
 }
 
+static void nvme_update_msixcap_ts(PCIDevice *pci_dev, uint32_t table_size)
+{
+    uint8_t *config;
+
+    assert(pci_dev->msix_cap);
+    assert(table_size <= pci_dev->msix_entries_nr);
+
+    config = pci_dev->config + pci_dev->msix_cap;
+    pci_set_word_by_mask(config + PCI_MSIX_FLAGS, PCI_MSIX_FLAGS_QSIZE,
+                         table_size - 1);
+}
+
 static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst)
 {
+    PCIDevice *pci_dev = &n->parent_obj;
     NvmeNamespace *ns;
     int i;
 
@@ -5596,12 +5606,12 @@  static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst)
         nvme_ns_drain(ns);
     }
 
-    for (i = 0; i < n->params.max_ioqpairs + 1; i++) {
+    for (i = 0; i < n->max_ioqpairs + 1; i++) {
         if (n->sq[i] != NULL) {
             nvme_free_sq(n->sq[i], n);
         }
     }
-    for (i = 0; i < n->params.max_ioqpairs + 1; i++) {
+    for (i = 0; i < n->max_ioqpairs + 1; i++) {
         if (n->cq[i] != NULL) {
             nvme_free_cq(n->cq[i], n);
         }
@@ -5613,15 +5623,17 @@  static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst)
         g_free(event);
     }
 
-    if (!pci_is_vf(&n->parent_obj) && n->params.sriov_max_vfs) {
+    if (!pci_is_vf(pci_dev) && n->params.sriov_max_vfs) {
         if (rst != NVME_RESET_CONTROLLER) {
-            pcie_sriov_pf_disable_vfs(&n->parent_obj);
+            pcie_sriov_pf_disable_vfs(pci_dev);
         }
     }
 
     n->aer_queued = 0;
     n->outstanding_aers = 0;
     n->qs_created = false;
+
+    nvme_update_msixcap_ts(pci_dev, n->conf_msix_qsize);
 }
 
 static void nvme_ctrl_shutdown(NvmeCtrl *n)
@@ -6322,11 +6334,17 @@  static void nvme_init_state(NvmeCtrl *n)
     NvmeSecCtrlEntry *sctrl;
     int i;
 
+    n->max_ioqpairs = n->params.max_ioqpairs;
+    n->conf_ioqpairs = n->max_ioqpairs;
+
+    n->max_msix_qsize = n->params.msix_qsize;
+    n->conf_msix_qsize = n->max_msix_qsize;
+
     /* add one to max_ioqpairs to account for the admin queue pair */
     n->reg_size = pow2ceil(sizeof(NvmeBar) +
                            2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE);
-    n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
-    n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
+    n->sq = g_new0(NvmeSQueue *, n->max_ioqpairs + 1);
+    n->cq = g_new0(NvmeCQueue *, n->max_ioqpairs + 1);
     n->temperature = NVME_TEMPERATURE;
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
@@ -6491,7 +6509,7 @@  static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
         pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
                          PCI_BASE_ADDRESS_MEM_TYPE_64, &n->bar0);
     }
-    ret = msix_init(pci_dev, n->params.msix_qsize,
+    ret = msix_init(pci_dev, n->max_msix_qsize,
                     &n->bar0, 0, msix_table_offset,
                     &n->bar0, 0, msix_pba_offset, 0, &err);
     if (ret < 0) {
@@ -6503,6 +6521,8 @@  static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
         }
     }
 
+    nvme_update_msixcap_ts(pci_dev, n->conf_msix_qsize);
+
     if (n->params.cmb_size_mb) {
         nvme_init_cmb(n, pci_dev);
     }
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 9fbb0a70b5..65383e495c 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -420,6 +420,10 @@  typedef struct NvmeCtrl {
     uint64_t    starttime_ms;
     uint16_t    temperature;
     uint8_t     smart_critical_warning;
+    uint32_t    max_msix_qsize;                 /* Derived from params.msix.qsize */
+    uint32_t    conf_msix_qsize;                /* Configured limit */
+    uint32_t    max_ioqpairs;                   /* Derived from params.max_ioqpairs */
+    uint32_t    conf_ioqpairs;                  /* Configured limit */
 
     struct {
         MemoryRegion mem;