diff mbox

[v5,01/10] vl: Don't allow CPU toplogies with partially filled cores

Message ID 1448024079-20808-2-git-send-email-bharata@linux.vnet.ibm.com
State New
Headers show

Commit Message

Bharata B Rao Nov. 20, 2015, 12:54 p.m. UTC
Prevent guests from booting with CPU topologies that have partially
filled CPU cores or can result in partially filled CPU cores after
CPU hotplug like

-smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
-smp 15,sockets=1,cores=4,threads=4,maxcpus=17.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 vl.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

David Gibson Dec. 1, 2015, 12:37 a.m. UTC | #1
On Fri, Nov 20, 2015 at 06:24:30PM +0530, Bharata B Rao wrote:
> Prevent guests from booting with CPU topologies that have partially
> filled CPU cores or can result in partially filled CPU cores after
> CPU hotplug like
> 
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

I may have missed a bit of the discussion leading up to this.  What
was the rationale for still allowing partially filled sockets (and
otherwise doing things at core rather than socket level?)

> ---
>  vl.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/vl.c b/vl.c
> index 7d993a5..23a1a1e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1248,6 +1248,15 @@ static void smp_parse(QemuOpts *opts)
>              exit(1);
>          }
>  
> +        if (cpus % threads || max_cpus % threads) {
> +            error_report("cpu topology: "
> +                         "sockets (%u) cores (%u) threads (%u) with "
> +                         "smp_cpus (%u) maxcpus (%u) "
> +                         "will result in partially filled cores",
> +                         sockets, cores, threads, cpus, max_cpus);
> +            exit(1);
> +        }
> +
>          smp_cpus = cpus;
>          smp_cores = cores > 0 ? cores : 1;
>          smp_threads = threads > 0 ? threads : 1;
Bharata B Rao Dec. 2, 2015, 3:04 a.m. UTC | #2
On Tue, Dec 01, 2015 at 11:37:03AM +1100, David Gibson wrote:
> On Fri, Nov 20, 2015 at 06:24:30PM +0530, Bharata B Rao wrote:
> > Prevent guests from booting with CPU topologies that have partially
> > filled CPU cores or can result in partially filled CPU cores after
> > CPU hotplug like
> > 
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> 
> I may have missed a bit of the discussion leading up to this.  What
> was the rationale for still allowing partially filled sockets (and
> otherwise doing things at core rather than socket level?)

This is generic parsing code. I don't remember an explicit discussion
around allowing or dis-allowing paritialy filled sockets. I thought
disallowing partially filled cores from generic smp parsing code would
be an acceptable first step for all archs.

Regards,
Bharata.
Igor Mammedov Dec. 2, 2015, 1:52 p.m. UTC | #3
On Fri, 20 Nov 2015 18:24:30 +0530
Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:

> Prevent guests from booting with CPU topologies that have partially
> filled CPU cores or can result in partially filled CPU cores after
> CPU hotplug like
> 
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.

CCing Eduardo who might tell if it's ok wrt x86

> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  vl.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/vl.c b/vl.c
> index 7d993a5..23a1a1e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1248,6 +1248,15 @@ static void smp_parse(QemuOpts *opts)
>              exit(1);
>          }
>  
> +        if (cpus % threads || max_cpus % threads) {
> +            error_report("cpu topology: "
> +                         "sockets (%u) cores (%u) threads (%u) with "
> +                         "smp_cpus (%u) maxcpus (%u) "
> +                         "will result in partially filled cores",
> +                         sockets, cores, threads, cpus, max_cpus);
> +            exit(1);
> +        }
> +
>          smp_cpus = cpus;
>          smp_cores = cores > 0 ? cores : 1;
>          smp_threads = threads > 0 ? threads : 1;
Eduardo Habkost Dec. 2, 2015, 2:14 p.m. UTC | #4
On Wed, Dec 02, 2015 at 02:52:39PM +0100, Igor Mammedov wrote:
> On Fri, 20 Nov 2015 18:24:30 +0530
> Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> 
> > Prevent guests from booting with CPU topologies that have partially
> > filled CPU cores or can result in partially filled CPU cores after
> > CPU hotplug like
> > 
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> 
> CCing Eduardo who might tell if it's ok wrt x86

I am always afraid of introducing fatal errors for configurations
that are obviously wrong but may already exist in production (so
it would prevent users from migrating existing VMs). But I am
becoming more inclined to be a bit more aggressive and stop
allowing these broken configurations.

But I would rewrite the error message, and just say that "cpus"
and "maxcpus" must be multiples of "threads". It's easier to
understand and explain than what "partially filled cores" mean.
You don't need to mention "sockets" and "cores" in the error
message, either.

Also, please print different error messages to indicate if
"maxcpus" or "cpus" is the culprit.

What about (cpus % (cores * threads) != 0)? It would result in
sockets with different numbers of cores. Should we allow that?

The patch makes QEMU crash with the following command-line:

  $ qemu-system-x86_64 -smp 2,cores=2,sockets=2
  Floating point exception (core dumped)
  $

It can probably be solved by moving the check after
smp_{cpus,cores,threads} are already set.

> 
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  vl.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/vl.c b/vl.c
> > index 7d993a5..23a1a1e 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -1248,6 +1248,15 @@ static void smp_parse(QemuOpts *opts)
> >              exit(1);
> >          }
> >  
> > +        if (cpus % threads || max_cpus % threads) {
> > +            error_report("cpu topology: "
> > +                         "sockets (%u) cores (%u) threads (%u) with "
> > +                         "smp_cpus (%u) maxcpus (%u) "
> > +                         "will result in partially filled cores",
> > +                         sockets, cores, threads, cpus, max_cpus);
> > +            exit(1);
> > +        }
> > +
> >          smp_cpus = cpus;
> >          smp_cores = cores > 0 ? cores : 1;
> >          smp_threads = threads > 0 ? threads : 1;
>
Bharata B Rao Dec. 2, 2015, 2:38 p.m. UTC | #5
On Wed, Dec 02, 2015 at 12:14:59PM -0200, Eduardo Habkost wrote:
> On Wed, Dec 02, 2015 at 02:52:39PM +0100, Igor Mammedov wrote:
> > On Fri, 20 Nov 2015 18:24:30 +0530
> > Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> > 
> > > Prevent guests from booting with CPU topologies that have partially
> > > filled CPU cores or can result in partially filled CPU cores after
> > > CPU hotplug like
> > > 
> > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > 
> > CCing Eduardo who might tell if it's ok wrt x86
> 
> I am always afraid of introducing fatal errors for configurations
> that are obviously wrong but may already exist in production (so
> it would prevent users from migrating existing VMs). But I am
> becoming more inclined to be a bit more aggressive and stop
> allowing these broken configurations.
> 
> But I would rewrite the error message, and just say that "cpus"
> and "maxcpus" must be multiples of "threads". It's easier to
> understand and explain than what "partially filled cores" mean.
> You don't need to mention "sockets" and "cores" in the error
> message, either.
> 
> Also, please print different error messages to indicate if
> "maxcpus" or "cpus" is the culprit.

Sure, makes it easier for the user.

> 
> What about (cpus % (cores * threads) != 0)? It would result in
> sockets with different numbers of cores. Should we allow that?

I had planned to prevent that and also

(max_cpus % (cores * threads) != 0) in the next iteration. Would be
good to know if enforcing such a condition would be acceptable.

> 
> The patch makes QEMU crash with the following command-line:
> 
>   $ qemu-system-x86_64 -smp 2,cores=2,sockets=2
>   Floating point exception (core dumped)
>   $
> 
> It can probably be solved by moving the check after
> smp_{cpus,cores,threads} are already set.

Oh yes, will fix in the next iteration.

Regards,
Bharata.
Eduardo Habkost Dec. 2, 2015, 3:19 p.m. UTC | #6
On Wed, Dec 02, 2015 at 08:08:23PM +0530, Bharata B Rao wrote:
> On Wed, Dec 02, 2015 at 12:14:59PM -0200, Eduardo Habkost wrote:
> > On Wed, Dec 02, 2015 at 02:52:39PM +0100, Igor Mammedov wrote:
> > > On Fri, 20 Nov 2015 18:24:30 +0530
> > > Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> > > 
> > > > Prevent guests from booting with CPU topologies that have partially
> > > > filled CPU cores or can result in partially filled CPU cores after
> > > > CPU hotplug like
> > > > 
> > > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > > 
> > > CCing Eduardo who might tell if it's ok wrt x86
> > 
> > I am always afraid of introducing fatal errors for configurations
> > that are obviously wrong but may already exist in production (so
> > it would prevent users from migrating existing VMs). But I am
> > becoming more inclined to be a bit more aggressive and stop
> > allowing these broken configurations.
> > 
> > But I would rewrite the error message, and just say that "cpus"
> > and "maxcpus" must be multiples of "threads". It's easier to
> > understand and explain than what "partially filled cores" mean.
> > You don't need to mention "sockets" and "cores" in the error
> > message, either.
> > 
> > Also, please print different error messages to indicate if
> > "maxcpus" or "cpus" is the culprit.
> 
> Sure, makes it easier for the user.
> 
> > 
> > What about (cpus % (cores * threads) != 0)? It would result in
> > sockets with different numbers of cores. Should we allow that?
> 
> I had planned to prevent that and also
> 
> (max_cpus % (cores * threads) != 0) in the next iteration. Would be
> good to know if enforcing such a condition would be acceptable.

I'm all for being strict about core count, too, and only make the
requirements less strict if really necessary.

After all, if we really wanted to support sockets with different
numbers of cores each, or cores with different numbers of threads
each, the functionality shouldn't be restricted to the last
socket/core, so it would require a completely different
command-line interface.
diff mbox

Patch

diff --git a/vl.c b/vl.c
index 7d993a5..23a1a1e 100644
--- a/vl.c
+++ b/vl.c
@@ -1248,6 +1248,15 @@  static void smp_parse(QemuOpts *opts)
             exit(1);
         }
 
+        if (cpus % threads || max_cpus % threads) {
+            error_report("cpu topology: "
+                         "sockets (%u) cores (%u) threads (%u) with "
+                         "smp_cpus (%u) maxcpus (%u) "
+                         "will result in partially filled cores",
+                         sockets, cores, threads, cpus, max_cpus);
+            exit(1);
+        }
+
         smp_cpus = cpus;
         smp_cores = cores > 0 ? cores : 1;
         smp_threads = threads > 0 ? threads : 1;