Message ID | 1535035877-219196-1-git-send-email-imammedo@redhat.com |
---|---|
State | New |
Headers | show |
Series | vl.c: make sure maxcpus matches topology to prevent migration failure | expand |
On 23/08/2018 16:51, Igor Mammedov wrote: > Topology (threads*cores*sockets) must match maxcpus to be valid, > otherwise we could start QEMU with invalid topology that throws > a error on migration destination side, that should not be reachable: > Source: > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > // hotplug cpus upto maxcpus > Destination: > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) The destination should have sockets=8, shouldn't it? It seems to me that, at startup, you should have cpus = s*t*c and cpus <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't make much sense. Paolo > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > --- > vl.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/vl.c b/vl.c > index 16b913f..2b35e0c 100644 > --- a/vl.c > +++ b/vl.c > @@ -1238,10 +1238,10 @@ static void smp_parse(QemuOpts *opts) > exit(1); > } > > - if (sockets * cores * threads > max_cpus) { > + if (sockets * cores * threads != max_cpus) { > error_report("cpu topology: " > - "sockets (%u) * cores (%u) * threads (%u) > " > - "maxcpus (%u)", > + "sockets (%u) * cores (%u) * threads (%u) not equal to" > + " maxcpus (%u)", > sockets, cores, threads, max_cpus); > exit(1); > } >
On Thu, Aug 23, 2018 at 06:32:41PM +0200, Paolo Bonzini wrote: > On 23/08/2018 16:51, Igor Mammedov wrote: > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > otherwise we could start QEMU with invalid topology that throws > > a error on migration destination side, that should not be reachable: > > Source: > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > // hotplug cpus upto maxcpus > > Destination: > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) > > The destination should have sockets=8, shouldn't it? > > It seems to me that, at startup, you should have cpus = s*t*c and cpus > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > make much sense. Most of the incompleteness of input validation at smp_parse() can be explained by our fear of breaking existing configurations and making existing running VMs not runnable. But now we have a deprecation policy. If we're still afraid of breaking peoples' existing configurations, we should at least deprecate those configurations as soon as possible (and make QEMU at least emit a warning).
On Thu, 23 Aug 2018 18:32:41 +0200 Paolo Bonzini <pbonzini@redhat.com> wrote: > On 23/08/2018 16:51, Igor Mammedov wrote: > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > otherwise we could start QEMU with invalid topology that throws > > a error on migration destination side, that should not be reachable: > > Source: > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > // hotplug cpus upto maxcpus > > Destination: > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) This destination CLI aren't exactly correct as well since it should've been exactly the same -smp as on source + a bunch of -device cpufoo... so we can always say go fix your CLI so it won't trigger error. > The destination should have sockets=8, shouldn't it? either that or cores=8 or cores=4,sockets=2 ... > It seems to me that, at startup, you should have cpus = s*t*c and cpus > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > make much sense. I think that s*t*c should describe topology of whole machine including not yet plugged vcpus. "cpus = s*t*c" probably won't work for partially filled package case: -smp 1,cores=1,threads=8,sockets=1 cores/threads should reflect full package configuration for guest to see an expected topology. > Paolo > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > > --- > > vl.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/vl.c b/vl.c > > index 16b913f..2b35e0c 100644 > > --- a/vl.c > > +++ b/vl.c > > @@ -1238,10 +1238,10 @@ static void smp_parse(QemuOpts *opts) > > exit(1); > > } > > > > - if (sockets * cores * threads > max_cpus) { > > + if (sockets * cores * threads != max_cpus) { > > error_report("cpu topology: " > > - "sockets (%u) * cores (%u) * threads (%u) > " > > - "maxcpus (%u)", > > + "sockets (%u) * cores (%u) * threads (%u) not equal to" > > + " maxcpus (%u)", > > sockets, cores, threads, max_cpus); > > exit(1); > > } > > >
On Thu, 23 Aug 2018 15:03:07 -0300 Eduardo Habkost <ehabkost@redhat.com> wrote: > On Thu, Aug 23, 2018 at 06:32:41PM +0200, Paolo Bonzini wrote: > > On 23/08/2018 16:51, Igor Mammedov wrote: > > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > > otherwise we could start QEMU with invalid topology that throws > > > a error on migration destination side, that should not be reachable: > > > Source: > > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > > // hotplug cpus upto maxcpus > > > Destination: > > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) > > > > The destination should have sockets=8, shouldn't it? > > > > It seems to me that, at startup, you should have cpus = s*t*c and cpus > > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > > make much sense. > > Most of the incompleteness of input validation at smp_parse() can > be explained by our fear of breaking existing configurations and > making existing running VMs not runnable. > > But now we have a deprecation policy. If we're still afraid of > breaking peoples' existing configurations, we should at least > deprecate those configurations as soon as possible (and make QEMU > at least emit a warning). Sure, I can send a deprecation patch first.
On Fri, Aug 24, 2018 at 11:13:50AM +0200, Igor Mammedov wrote: > On Thu, 23 Aug 2018 18:32:41 +0200 > Paolo Bonzini <pbonzini@redhat.com> wrote: > > > On 23/08/2018 16:51, Igor Mammedov wrote: > > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > > otherwise we could start QEMU with invalid topology that throws > > > a error on migration destination side, that should not be reachable: > > > Source: > > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > > // hotplug cpus upto maxcpus > > > Destination: > > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) > This destination CLI aren't exactly correct as well since > it should've been exactly the same -smp as on source + a bunch of -device cpufoo... > so we can always say go fix your CLI so it won't trigger error. > > > The destination should have sockets=8, shouldn't it? > either that or cores=8 or cores=4,sockets=2 ... > > > It seems to me that, at startup, you should have cpus = s*t*c and cpus > > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > > make much sense. > I think that s*t*c should describe topology of whole machine > including not yet plugged vcpus. "cpus = s*t*c" probably won't work > for partially filled package case: > -smp 1,cores=1,threads=8,sockets=1 > cores/threads should reflect full package configuration > for guest to see an expected topology. Oh, now I remember: that's the reason we don't enforce s*t*c == smp_cpus nor s*t*c == max_cpus. Both "-smp 4,maxcpus=8,cores=2,threads=2,sockets=1" and "-smp 4,maxcpus=8,cores=2,threads=2,sockets=2" worked since maxcpus was introduced, making the semantics of "sockets" unclear and hard to change without breaking existing configs.
On Fri, 24 Aug 2018 08:11:48 -0300 Eduardo Habkost <ehabkost@redhat.com> wrote: > On Fri, Aug 24, 2018 at 11:13:50AM +0200, Igor Mammedov wrote: > > On Thu, 23 Aug 2018 18:32:41 +0200 > > Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > > On 23/08/2018 16:51, Igor Mammedov wrote: > > > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > > > otherwise we could start QEMU with invalid topology that throws > > > > a error on migration destination side, that should not be reachable: > > > > Source: > > > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > > > // hotplug cpus upto maxcpus > > > > Destination: > > > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) > > This destination CLI aren't exactly correct as well since > > it should've been exactly the same -smp as on source + a bunch of -device cpufoo... > > so we can always say go fix your CLI so it won't trigger error. > > > > > The destination should have sockets=8, shouldn't it? > > either that or cores=8 or cores=4,sockets=2 ... > > > > > It seems to me that, at startup, you should have cpus = s*t*c and cpus > > > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > > > make much sense. > > I think that s*t*c should describe topology of whole machine > > including not yet plugged vcpus. "cpus = s*t*c" probably won't work > > for partially filled package case: > > -smp 1,cores=1,threads=8,sockets=1 > > cores/threads should reflect full package configuration > > for guest to see an expected topology. > > Oh, now I remember: that's the reason we don't enforce > s*t*c == smp_cpus nor s*t*c == max_cpus. > > Both "-smp 4,maxcpus=8,cores=2,threads=2,sockets=1" and > "-smp 4,maxcpus=8,cores=2,threads=2,sockets=2" > worked since maxcpus was introduced, making the semantics of > "sockets" unclear and hard to change without breaking existing > configs. Should we go with deprication thingy then, so we could make it clear in the future?
On Fri, Aug 24, 2018 at 01:26:54PM +0200, Igor Mammedov wrote: > On Fri, 24 Aug 2018 08:11:48 -0300 > Eduardo Habkost <ehabkost@redhat.com> wrote: > > > On Fri, Aug 24, 2018 at 11:13:50AM +0200, Igor Mammedov wrote: > > > On Thu, 23 Aug 2018 18:32:41 +0200 > > > Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > > > > On 23/08/2018 16:51, Igor Mammedov wrote: > > > > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > > > > otherwise we could start QEMU with invalid topology that throws > > > > > a error on migration destination side, that should not be reachable: > > > > > Source: > > > > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > > > > // hotplug cpus upto maxcpus > > > > > Destination: > > > > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > > > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) > > > This destination CLI aren't exactly correct as well since > > > it should've been exactly the same -smp as on source + a bunch of -device cpufoo... > > > so we can always say go fix your CLI so it won't trigger error. > > > > > > > The destination should have sockets=8, shouldn't it? > > > either that or cores=8 or cores=4,sockets=2 ... > > > > > > > It seems to me that, at startup, you should have cpus = s*t*c and cpus > > > > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > > > > make much sense. > > > I think that s*t*c should describe topology of whole machine > > > including not yet plugged vcpus. "cpus = s*t*c" probably won't work > > > for partially filled package case: > > > -smp 1,cores=1,threads=8,sockets=1 > > > cores/threads should reflect full package configuration > > > for guest to see an expected topology. > > > > Oh, now I remember: that's the reason we don't enforce > > s*t*c == smp_cpus nor s*t*c == max_cpus. > > > > Both "-smp 4,maxcpus=8,cores=2,threads=2,sockets=1" and > > "-smp 4,maxcpus=8,cores=2,threads=2,sockets=2" > > worked since maxcpus was introduced, making the semantics of > > "sockets" unclear and hard to change without breaking existing > > configs. > Should we go with deprication thingy then, > so we could make it clear in the future? Yes, but I'm not sure which option we should adopt (s*t*c == smp_cpus or s*t*c == max_cpus). Does anybody know what's the semantics expected by libvirt today?
On Fri, Aug 24, 2018 at 10:53:32AM -0300, Eduardo Habkost wrote: > On Fri, Aug 24, 2018 at 01:26:54PM +0200, Igor Mammedov wrote: > > On Fri, 24 Aug 2018 08:11:48 -0300 > > Eduardo Habkost <ehabkost@redhat.com> wrote: > > > > > On Fri, Aug 24, 2018 at 11:13:50AM +0200, Igor Mammedov wrote: > > > > On Thu, 23 Aug 2018 18:32:41 +0200 > > > > Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > > > > > > On 23/08/2018 16:51, Igor Mammedov wrote: > > > > > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > > > > > otherwise we could start QEMU with invalid topology that throws > > > > > > a error on migration destination side, that should not be reachable: > > > > > > Source: > > > > > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > > > > > // hotplug cpus upto maxcpus > > > > > > Destination: > > > > > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > > > > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) > > > > This destination CLI aren't exactly correct as well since > > > > it should've been exactly the same -smp as on source + a bunch of -device cpufoo... > > > > so we can always say go fix your CLI so it won't trigger error. > > > > > > > > > The destination should have sockets=8, shouldn't it? > > > > either that or cores=8 or cores=4,sockets=2 ... > > > > > > > > > It seems to me that, at startup, you should have cpus = s*t*c and cpus > > > > > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > > > > > make much sense. > > > > I think that s*t*c should describe topology of whole machine > > > > including not yet plugged vcpus. "cpus = s*t*c" probably won't work > > > > for partially filled package case: > > > > -smp 1,cores=1,threads=8,sockets=1 > > > > cores/threads should reflect full package configuration > > > > for guest to see an expected topology. > > > > > > Oh, now I remember: that's the reason we don't enforce > > > s*t*c == smp_cpus nor s*t*c == max_cpus. > > > > > > Both "-smp 4,maxcpus=8,cores=2,threads=2,sockets=1" and > > > "-smp 4,maxcpus=8,cores=2,threads=2,sockets=2" > > > worked since maxcpus was introduced, making the semantics of > > > "sockets" unclear and hard to change without breaking existing > > > configs. > > Should we go with deprication thingy then, > > so we could make it clear in the future? > > Yes, but I'm not sure which option we should adopt > (s*t*c == smp_cpus or s*t*c == max_cpus). > > Does anybody know what's the semantics expected by libvirt today? Libvirt requires s*c*t to equal the total number of possible CPUs, *not* the currently plugged number. ie Valid: <vcpu placement='static' current='16'>32</vcpu> <cpu> <topology sockets='4' cores='4' threads='2'/> </cpu> Invalid: <vcpu placement='static' current='32'>64</vcpu> <cpu> <topology sockets='4' cores='4' threads='2'/> </cpu> Test with: $ virsh edit QEMUGuest1 error: unsupported configuration: CPU topology doesn't match maximum vcpu count Failed. Try again? [y,n,i,f,?]: Regards, Daniel
On Fri, 24 Aug 2018 10:53:32 -0300 Eduardo Habkost <ehabkost@redhat.com> wrote: > On Fri, Aug 24, 2018 at 01:26:54PM +0200, Igor Mammedov wrote: > > On Fri, 24 Aug 2018 08:11:48 -0300 > > Eduardo Habkost <ehabkost@redhat.com> wrote: > > > > > On Fri, Aug 24, 2018 at 11:13:50AM +0200, Igor Mammedov wrote: > > > > On Thu, 23 Aug 2018 18:32:41 +0200 > > > > Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > > > > > > On 23/08/2018 16:51, Igor Mammedov wrote: > > > > > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > > > > > otherwise we could start QEMU with invalid topology that throws > > > > > > a error on migration destination side, that should not be reachable: > > > > > > Source: > > > > > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > > > > > // hotplug cpus upto maxcpus > > > > > > Destination: > > > > > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > > > > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) > > > > This destination CLI aren't exactly correct as well since > > > > it should've been exactly the same -smp as on source + a bunch of -device cpufoo... > > > > so we can always say go fix your CLI so it won't trigger error. > > > > > > > > > The destination should have sockets=8, shouldn't it? > > > > either that or cores=8 or cores=4,sockets=2 ... > > > > > > > > > It seems to me that, at startup, you should have cpus = s*t*c and cpus > > > > > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > > > > > make much sense. > > > > I think that s*t*c should describe topology of whole machine > > > > including not yet plugged vcpus. "cpus = s*t*c" probably won't work > > > > for partially filled package case: > > > > -smp 1,cores=1,threads=8,sockets=1 > > > > cores/threads should reflect full package configuration > > > > for guest to see an expected topology. > > > > > > Oh, now I remember: that's the reason we don't enforce > > > s*t*c == smp_cpus nor s*t*c == max_cpus. > > > > > > Both "-smp 4,maxcpus=8,cores=2,threads=2,sockets=1" and > > > "-smp 4,maxcpus=8,cores=2,threads=2,sockets=2" > > > worked since maxcpus was introduced, making the semantics of > > > "sockets" unclear and hard to change without breaking existing > > > configs. > > Should we go with deprication thingy then, > > so we could make it clear in the future? > > Yes, but I'm not sure which option we should adopt > (s*t*c == smp_cpus or s*t*c == max_cpus). s*t*c == smp_cpus is wrong as one won't be able to start QEMU with partial package and then hotplug the rest when needed. [...]
On Thu, Aug 23, 2018 at 03:03:07PM -0300, Eduardo Habkost wrote: > On Thu, Aug 23, 2018 at 06:32:41PM +0200, Paolo Bonzini wrote: > > On 23/08/2018 16:51, Igor Mammedov wrote: > > > Topology (threads*cores*sockets) must match maxcpus to be valid, > > > otherwise we could start QEMU with invalid topology that throws > > > a error on migration destination side, that should not be reachable: > > > Source: > > > -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 > > > // hotplug cpus upto maxcpus > > > Destination: > > > -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 > > > qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) > > > > The destination should have sockets=8, shouldn't it? > > > > It seems to me that, at startup, you should have cpus = s*t*c and cpus > > <= maxcpus. Currently we check cpus <= s*t*c <= maxcpus, which doesn't > > make much sense. > > Most of the incompleteness of input validation at smp_parse() can > be explained by our fear of breaking existing configurations and > making existing running VMs not runnable. > > But now we have a deprecation policy. If we're still afraid of > breaking peoples' existing configurations, we should at least > deprecate those configurations as soon as possible (and make QEMU > at least emit a warning). > A million years ago (well > 2 anyway) when I was thinking about doing some '-smp' improvements I tried to address this without breaking existing configs. Here's how I approached it https://lists.gnu.org/archive/html/qemu-ppc/2016-06/msg00317.html Also, there's another similar fix needed in smbios generation. See https://lists.gnu.org/archive/html/qemu-ppc/2016-06/msg00322.html Thanks, drew
diff --git a/vl.c b/vl.c index 16b913f..2b35e0c 100644 --- a/vl.c +++ b/vl.c @@ -1238,10 +1238,10 @@ static void smp_parse(QemuOpts *opts) exit(1); } - if (sockets * cores * threads > max_cpus) { + if (sockets * cores * threads != max_cpus) { error_report("cpu topology: " - "sockets (%u) * cores (%u) * threads (%u) > " - "maxcpus (%u)", + "sockets (%u) * cores (%u) * threads (%u) not equal to" + " maxcpus (%u)", sockets, cores, threads, max_cpus); exit(1); }
Topology (threads*cores*sockets) must match maxcpus to be valid, otherwise we could start QEMU with invalid topology that throws a error on migration destination side, that should not be reachable: Source: -smp 8,maxcpus=64,cores=1,threads=8,sockets=1 // hotplug cpus upto maxcpus Destination: -smp 64,maxcpus=64,cores=1,threads=8,sockets=1 qemu: cpu topology: sockets (1) * cores (1) * threads (8) < smp_cpus (64) Signed-off-by: Igor Mammedov <imammedo@redhat.com> --- vl.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)