
[qemu] sysemu: support up to 1024 vCPUs

Message ID: 20170224045531.7026-1-aik@ozlabs.ru
State: New

Commit Message

Alexey Kardashevskiy Feb. 24, 2017, 4:55 a.m. UTC
From: Greg Kurz <gkurz@linux.vnet.ibm.com>

Some systems can already provide more than 255 hardware threads.

Bumping the QEMU limit to 1024 seems reasonable:
- it has no visible overhead in top;
- the limit itself has no effect on hot paths.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---

With ulimit -u/-n bumped (nproc and nofile), I was able to boot a guest
with 1024 CPUs, both with threads=1 and threads=8.

It takes time though - 3:15 to get to the guest shell but it is probably
expected on a 160-thread machine.

---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
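
For context, the new limit is only consulted once during startup, when
QEMU validates the requested -smp value against the machine class; no
hot path reads it afterwards. A minimal sketch of that check, modelled
on QEMU's generic startup code of this era (the exact error wording may
differ):

    /* Runs once during machine setup, before any vCPU threads are
     * created, so raising mc->max_cpus costs nothing at run time. */
    if (max_cpus > machine_class->max_cpus) {
        error_report("Invalid SMP CPUs %d. The max CPUs "
                     "supported by machine '%s' is %d",
                     max_cpus, machine_class->name,
                     machine_class->max_cpus);
        exit(1);
    }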

Comments

David Gibson Feb. 24, 2017, 6:16 a.m. UTC | #1
On Fri, Feb 24, 2017 at 03:55:31PM +1100, Alexey Kardashevskiy wrote:
> From: Greg Kurz <gkurz@linux.vnet.ibm.com>
> 
> Some systems can already provide more than 255 hardware threads.
> 
> Bumping the QEMU limit to 1024 seems reasonable:
> - it has no visible overhead in top;
> - the limit itself has no effect on hot paths.
> 
> Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> With ulimit -u/-n bumped (nproc and nofile), I was able to boot a guest
> with 1024 CPUs, both with threads=1 and threads=8.
> 
> It takes time though - 3:15 to get to the guest shell but it is probably
> expected on a 160-thread machine.

Applied, thanks.

> 
> ---
>  hw/ppc/spapr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e465d7ac98..46b81a625d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2712,7 +2712,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->init = ppc_spapr_init;
>      mc->reset = ppc_spapr_reset;
>      mc->block_default_type = IF_SCSI;
> -    mc->max_cpus = 255;
> +    mc->max_cpus = 1024;
>      mc->no_parallel = 1;
>      mc->default_boot_order = "";
>      mc->default_ram_size = 512 * M_BYTE;
Greg Kurz Feb. 24, 2017, 9:13 a.m. UTC | #2
On Fri, 24 Feb 2017 15:55:31 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> From: Greg Kurz <gkurz@linux.vnet.ibm.com>
> 
> Some systems can already provide more than 255 hardware threads.
> 
> Bumping the QEMU limit to 1024 seems reasonable:
> - it has no visible overhead in top;
> - the limit itself has no effect on hot paths.
> 
> Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> With ulimit -u/-n bumped (nproc and nofile), I was able to boot a guest
> with 1024 CPUs, both with threads=1 and threads=8.
> 
> It takes time though - 3:15 to get to the guest shell but it is probably
> expected on a 160-thread machine.
> 

I remember something similar at the time... also I had to give more
RAM to the guest to be able to run 1024 CPUs (something like 6 gigs
versus 512 megs for 1 CPU). With the same amount of guest RAM, each
extra CPU would cause the memory used by QEMU to grow by about 8 megs.

> ---
>  hw/ppc/spapr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e465d7ac98..46b81a625d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2712,7 +2712,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->init = ppc_spapr_init;
>      mc->reset = ppc_spapr_reset;
>      mc->block_default_type = IF_SCSI;
> -    mc->max_cpus = 255;
> +    mc->max_cpus = 1024;
>      mc->no_parallel = 1;
>      mc->default_boot_order = "";
>      mc->default_ram_size = 512 * M_BYTE;
David Gibson Feb. 27, 2017, 1:09 a.m. UTC | #3
On Fri, Feb 24, 2017 at 10:13:50AM +0100, Greg Kurz wrote:
> On Fri, 24 Feb 2017 15:55:31 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
> > From: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > 
> > Some systems can already provide more than 255 hardware threads.
> > 
> > Bumping the QEMU limit to 1024 seems reasonable:
> > - it has no visible overhead in top;
> > - the limit itself has no effect on hot paths.
> > 
> > Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > ---
> > 
> > With ulimit -u/-n bumped (nproc and nofile), I was able to boot a guest
> > with 1024 CPUs, both with threads=1 and threads=8.
> > 
> > It takes time though - 3:15 to get to the guest shell but it is probably
> > expected on a 160-thread machine.

Yes, I'd expect so, that's a lot of overcommit.  Plus, switching from
one vcpu to another on the same host thread will, IIRC, require two
full partition switches, which are pretty slow on Power.

> I remember something similar at the time... also I had to give more
> RAM to the guest to be able to run 1024 CPUs (something like 6 gigs
> versus 512 megs for 1 CPU). With the same amount of guest RAM, each
> extra CPU would cause the memory used by QEMU to grow by about 8 megs.

Hm... that seems like rather a lot.  Any idea why?

> 
> > ---
> >  hw/ppc/spapr.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index e465d7ac98..46b81a625d 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2712,7 +2712,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      mc->init = ppc_spapr_init;
> >      mc->reset = ppc_spapr_reset;
> >      mc->block_default_type = IF_SCSI;
> > -    mc->max_cpus = 255;
> > +    mc->max_cpus = 1024;
> >      mc->no_parallel = 1;
> >      mc->default_boot_order = "";
> >      mc->default_ram_size = 512 * M_BYTE;
> 
>
Greg Kurz Feb. 27, 2017, 10:13 p.m. UTC | #4
On Mon, 27 Feb 2017 12:09:53 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Feb 24, 2017 at 10:13:50AM +0100, Greg Kurz wrote:
> > On Fri, 24 Feb 2017 15:55:31 +1100
> > Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> >   
> > > From: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > 
> > > Some systems can already provide more than 255 hardware threads.
> > > 
> > > Bumping the QEMU limit to 1024 seems reasonable:
> > > - it has no visible overhead in top;
> > > - the limit itself has no effect on hot paths.
> > > 
> > > Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > ---
> > > 
> > > With ulimit -u/-n bumped (nproc and nofile), I was able to boot a guest
> > > with 1024 CPUs, both with threads=1 and threads=8.
> > > 
> > > It takes time though - 3:15 to get to the guest shell but it is probably
> > > expected on a 160-thread machine.
> 
> Yes, I'd expect so, that's a lot of overcommit.  Plus, switching from
> one vcpu to another on the same host thread will, IIRC, require two
> full partition switches, which are pretty slow on Power.
> 
> > I remember something similar at the time... also I had to give more
> > RAM to the guest to be able to run 1024 CPUs (something like 6 gigs
> > versus 512 megs for 1 CPU). With the same amount of guest RAM, each
> > extra CPU would cause the memory used by QEMU to grow by about 8 megs.
> 
> Hm... that seems like rather a lot.  Any idea why?
> 

No, but I'll try again with the current code and have a closer look.

> >   
> > > ---
> > >  hw/ppc/spapr.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index e465d7ac98..46b81a625d 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -2712,7 +2712,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> > >      mc->init = ppc_spapr_init;
> > >      mc->reset = ppc_spapr_reset;
> > >      mc->block_default_type = IF_SCSI;
> > > -    mc->max_cpus = 255;
> > > +    mc->max_cpus = 1024;
> > >      mc->no_parallel = 1;
> > >      mc->default_boot_order = "";
> > >      mc->default_ram_size = 512 * M_BYTE;  
> > 
> >   
> 
> 
>

Patch

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e465d7ac98..46b81a625d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2712,7 +2712,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->init = ppc_spapr_init;
     mc->reset = ppc_spapr_reset;
     mc->block_default_type = IF_SCSI;
-    mc->max_cpus = 255;
+    mc->max_cpus = 1024;
     mc->no_parallel = 1;
     mc->default_boot_order = "";
     mc->default_ram_size = 512 * M_BYTE;