Message ID | 151863726311.3003.8227524786940828598.stgit@bahia.lan |
---|---|
State | New |
Headers | show |
Series | spapr: fix VCPU ids miscalculation | expand |
On Wed, Feb 14, 2018 at 08:41:03PM +0100, Greg Kurz wrote: > XICS needs to know the highest VCPU id that may be presented to the > guest plus 1. Commit f303f117fec3 "spapr: ensure we have at least one > XICS server" changed how the maximum is computed from: > > smp_cpus * kvmppc_smt_threads() / smp_threads > > to: > > DIV_ROUND_UP(smp_cpus * kvmppc_smt_threads(), smp_threads) > > This was done because at the time we could pass broken CPU topologies > to the -smp command line options, such as threads=9,cpus=1. On a POWER8 > host this would give: > > 1 * 8 / 9 == 0 servers > > and cause QEMU to crash later during XICS setup. > > The formulat evolved a bit to accomodate CPU hot-plug and VSMT, but > most important, stricter checks are performed on the CPU topology. > > With -smp threads=9,cpus=1: > > qemu-system-ppc64: > cpu topology: sockets (1) * cores (1) * threads (9) > maxcpus (1) > > With -smp threads=9,maxcpus=1: > > qemu-system-ppc64: maxcpus must be equal to or greater than smp > > More generally, machine types with hotplug support (2.7 and up), no > longer allow to set maxcpus or smp_cpus to a value that isnt't a > multiple of smp_threads. > > With -smp threads=4,cpus=6: > > qemu-system-ppc64: smp_cpus (6) must be multiple of threads (4) > > With -smp threads=4,maxcpus=6: > > qemu-system-ppc64: max_cpus (6) must be multiple of threads (4) > > This means that the division is perfect and we don't need DIV_ROUND_UP(), > and we could do a regular division: > > max_cpus * spapr->vsmt / smp_threads > > So this patch changes xics_max_server_number() to use the spapr_vcpu_id(), > which works too since max_cpus is a multiple of smp_threads: > > (max_cpus / smp_threads ) * spapr->vsmt + max_cpus % smp_threads > > It breaks migration of pre-2.7 machine types with unusual CPU topologies, > but I guess this is an acceptable trade-off. No, not really. Weird topologies are still allowed on old machine types for backwards compatibility, and we shouldn't break that. I like the idea of consolidating this calculation, but we can't do it by just breaking the older machines (at least not until they're formally deprecated).
On Thu, 15 Feb 2018 15:08:18 +1100 David Gibson <david@gibson.dropbear.id.au> wrote: > On Wed, Feb 14, 2018 at 08:41:03PM +0100, Greg Kurz wrote: <snip> > > > > It breaks migration of pre-2.7 machine types with unusual CPU topologies, > > but I guess this is an acceptable trade-off. > > No, not really. Weird topologies are still allowed on old machine > types for backwards compatibility, and we shouldn't break that. I > like the idea of consolidating this calculation, but we can't do it by > just breaking the older machines (at least not until they're formally > deprecated). > Heh, I had put this patch at the end because I was expecting you might nack it :) Per curiosity, when/how do we decide that an older machine type may be formally deprecated ?
On Thu, Feb 15, 2018 at 05:08:57PM +0100, Greg Kurz wrote: > On Thu, 15 Feb 2018 15:08:18 +1100 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Wed, Feb 14, 2018 at 08:41:03PM +0100, Greg Kurz wrote: > > <snip> > > > > > > > It breaks migration of pre-2.7 machine types with unusual CPU topologies, > > > but I guess this is an acceptable trade-off. > > > > No, not really. Weird topologies are still allowed on old machine > > types for backwards compatibility, and we shouldn't break that. I > > like the idea of consolidating this calculation, but we can't do it by > > just breaking the older machines (at least not until they're formally > > deprecated). > > > > Heh, I had put this patch at the end because I was expecting you might > nack it :) > > Per curiosity, when/how do we decide that an older machine type may be > formally deprecated ? For versioned machine types we decided that we'd keep them around upstream for as long as they were needed by a downstream vendor, *provided* that downstream vendor is contributing to QEMU in order to mitigate the maint burden it would entail. Regards, Daniel
On Thu, 15 Feb 2018 16:54:18 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote: > On Thu, Feb 15, 2018 at 05:08:57PM +0100, Greg Kurz wrote: > > On Thu, 15 Feb 2018 15:08:18 +1100 > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > On Wed, Feb 14, 2018 at 08:41:03PM +0100, Greg Kurz wrote: > > > > <snip> > > > > > > > > > > It breaks migration of pre-2.7 machine types with unusual CPU topologies, > > > > but I guess this is an acceptable trade-off. > > > > > > No, not really. Weird topologies are still allowed on old machine > > > types for backwards compatibility, and we shouldn't break that. I > > > like the idea of consolidating this calculation, but we can't do it by > > > just breaking the older machines (at least not until they're formally > > > deprecated). > > > > > > > Heh, I had put this patch at the end because I was expecting you might > > nack it :) > > > > Per curiosity, when/how do we decide that an older machine type may be > > formally deprecated ? > > For versioned machine types we decided that we'd keep them around upstream > for as long as they were needed by a downstream vendor, *provided* that > downstream vendor is contributing to QEMU in order to mitigate the maint > burden it would entail. > Indeed I now remember having heard something like that in the past. Thanks for the details anyway. :) And, this is probably a dumb question, but do we have an up-to-date list of QEMU versions still needed by a contributing vendor ? > Regards, > Daniel
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 800d3f001253..f1722214cc74 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -176,7 +176,7 @@ static void pre_2_10_vmstate_unregister_dummy_icp(int i) static int xics_max_server_number(sPAPRMachineState *spapr) { - return DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads); + return spapr_vcpu_id(spapr, max_cpus) } static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
XICS needs to know the highest VCPU id that may be presented to the guest plus 1. Commit f303f117fec3 "spapr: ensure we have at least one XICS server" changed how the maximum is computed from: smp_cpus * kvmppc_smt_threads() / smp_threads to: DIV_ROUND_UP(smp_cpus * kvmppc_smt_threads(), smp_threads) This was done because at the time we could pass broken CPU topologies to the -smp command line options, such as threads=9,cpus=1. On a POWER8 host this would give: 1 * 8 / 9 == 0 servers and cause QEMU to crash later during XICS setup. The formulat evolved a bit to accomodate CPU hot-plug and VSMT, but most important, stricter checks are performed on the CPU topology. With -smp threads=9,cpus=1: qemu-system-ppc64: cpu topology: sockets (1) * cores (1) * threads (9) > maxcpus (1) With -smp threads=9,maxcpus=1: qemu-system-ppc64: maxcpus must be equal to or greater than smp More generally, machine types with hotplug support (2.7 and up), no longer allow to set maxcpus or smp_cpus to a value that isnt't a multiple of smp_threads. With -smp threads=4,cpus=6: qemu-system-ppc64: smp_cpus (6) must be multiple of threads (4) With -smp threads=4,maxcpus=6: qemu-system-ppc64: max_cpus (6) must be multiple of threads (4) This means that the division is perfect and we don't need DIV_ROUND_UP(), and we could do a regular division: max_cpus * spapr->vsmt / smp_threads So this patch changes xics_max_server_number() to use the spapr_vcpu_id(), which works too since max_cpus is a multiple of smp_threads: (max_cpus / smp_threads ) * spapr->vsmt + max_cpus % smp_threads It breaks migration of pre-2.7 machine types with unusual CPU topologies, but I guess this is an acceptable trade-off. Signed-off-by: Greg Kurz <groug@kaod.org> --- hw/ppc/spapr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)