
[for 3.1] spapr: Fix ibm,max-associativity-domains property number of nodes

Message ID 1542632978-65604-1-git-send-email-spopovyc@redhat.com
State New

Commit Message

Serhii Popovych Nov. 19, 2018, 1:09 p.m. UTC
Laurent Vivier reported an off-by-one: the maximum number of NUMA
nodes provided by qemu-kvm is one less than required by the
description of the "ibm,max-associativity-domains" property in
LoPAPR.

It appears that I misread the LoPAPR description of this property,
taking it to provide the last valid domain (a NUMA node here) rather
than the maximum number of domains.
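
For context, this is how the property is built in spapr_dt_rtas()
(a condensed sketch: the first cell and the fdt_setprop() call are
recalled from the surrounding code rather than part of the diff
below, and the call actually happens later in the function, after
the rtas node is created):

    uint32_t maxdomains[] = {
        cpu_to_be32(4),         /* number of associativity levels */
        cpu_to_be32(0),
        cpu_to_be32(0),
        cpu_to_be32(0),
        /* NUMA-node level: LoPAPR wants the maximum NUMBER of
         * domains here, not the last valid domain index, so the
         * "- 1" below is the bug being fixed */
        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
    };

    _FDT(fdt_setprop(fdt, rtas, "ibm,max-associativity-domains",
                     maxdomains, sizeof(maxdomains)));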

  ### Before hot-add

  (qemu) info numa
  3 nodes
  node 0 cpus: 0
  node 0 size: 0 MB
  node 0 plugged: 0 MB
  node 1 cpus:
  node 1 size: 1024 MB
  node 1 plugged: 0 MB
  node 2 cpus:
  node 2 size: 0 MB
  node 2 plugged: 0 MB

  $ numactl -H
  available: 2 nodes (0-1)
  node 0 cpus: 0
  node 0 size: 0 MB
  node 0 free: 0 MB
  node 1 cpus:
  node 1 size: 999 MB
  node 1 free: 658 MB
  node distances:
  node   0   1
    0:  10  40
    1:  40  10

  ### Hot-add

  (qemu) object_add memory-backend-ram,id=mem0,size=1G
  (qemu) device_add pc-dimm,id=dimm1,memdev=mem0,node=2
  (qemu) [   87.704898] pseries-hotplug-mem: Attempting to hot-add 4 ...
  <there is no "Initmem setup node 2 [mem 0xHEX-0xHEX]">
  [   87.705128] lpar: Attempting to resize HPT to shift 21
  ... <HPT resize messages>

  ### After hot-add

  (qemu) info numa
  3 nodes
  node 0 cpus: 0
  node 0 size: 0 MB
  node 0 plugged: 0 MB
  node 1 cpus:
  node 1 size: 1024 MB
  node 1 plugged: 0 MB
  node 2 cpus:
  node 2 size: 1024 MB
  node 2 plugged: 1024 MB

  $ numactl -H
  available: 2 nodes (0-1)
  ^^^^^^^^^^^^^^^^^^^^^^^^
             Still only two nodes (and memory hot-added to node 0 below)
  node 0 cpus: 0
  node 0 size: 1024 MB
  node 0 free: 1021 MB
  node 1 cpus:
  node 1 size: 999 MB
  node 1 free: 658 MB
  node distances:
  node   0   1
    0:  10  40
    1:  40  10

After the fix is applied, numactl(8) reports 3 nodes available and
memory plugged into node 2, as expected.
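
The fixed value can also be checked from inside the guest; the
output below is illustrative (assuming the 5-cell layout sketched
above and a 3-node config), not captured from a run:

  $ hexdump -C /proc/device-tree/rtas/ibm,max-associativity-domains
  00000000  00 00 00 04 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  00000010  00 00 00 03                                       |....|
  00000014

The last cell now reads 3 (the number of nodes) rather than 2.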

Fixes: da9f80fbad21 ("spapr: Add ibm,max-associativity-domains property")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Greg Kurz Nov. 19, 2018, 1:27 p.m. UTC | #1
On Mon, 19 Nov 2018 08:09:38 -0500
Serhii Popovych <spopovyc@redhat.com> wrote:

> [...]
> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
> -        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
> +        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),

Maybe simply cpu_to_be32(nb_numa_nodes) ?

Apart from that,

Reviewed-by: Greg Kurz <groug@kaod.org>

Laurent Vivier Nov. 19, 2018, 1:30 p.m. UTC | #2
On 19/11/2018 14:27, Greg Kurz wrote:
> On Mon, 19 Nov 2018 08:09:38 -0500
> Serhii Popovych <spopovyc@redhat.com> wrote:
> 
>> [...]
>> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
>> -        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
>> +        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),
> 
> Maybe simply cpu_to_be32(nb_numa_nodes) ?

I agree the "? : " is not needed.

With "cpu_to_be32(nb_numa_nodes)":

Reviewed-by: Laurent Vivier <lvivier@redhat.com>

Thanks,
Laurent
Laurent Vivier Nov. 19, 2018, 1:48 p.m. UTC | #3
On 19/11/2018 14:27, Greg Kurz wrote:
> On Mon, 19 Nov 2018 08:09:38 -0500
> Serhii Popovych <spopovyc@redhat.com> wrote:
> 
>> [...]
>> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
>> -        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
>> +        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),
> 
> Maybe simply cpu_to_be32(nb_numa_nodes) ?

Or "cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1)" ?

In spapr_populate_drconf_memory() we have this logic.
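
Roughly this, paraphrasing rather than quoting that function:

    /* non-NUMA configs are treated as having a single node 0 */
    int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;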

Thanks,
Laurent
Serhii Popovych Nov. 19, 2018, 4:18 p.m. UTC | #4
Laurent Vivier wrote:
> On 19/11/2018 14:27, Greg Kurz wrote:
>> On Mon, 19 Nov 2018 08:09:38 -0500
>> Serhii Popovych <spopovyc@redhat.com> wrote:
>>
>>> [...]
>>> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
>>> -        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
>>> +        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),
>>
>> Maybe simply cpu_to_be32(nb_numa_nodes) ?
> 
> I agree the "? : " is not needed.
> 
> With "cpu_to_be32(nb_numa_nodes)":
> 

Agree, the ?: was only relevant to catch the -1 case when running a
guest w/o a NUMA config. Will send v2. Thanks for the quick review.

> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Greg Kurz Nov. 19, 2018, 4:59 p.m. UTC | #5
On Mon, 19 Nov 2018 14:48:34 +0100
Laurent Vivier <lvivier@redhat.com> wrote:

> On 19/11/2018 14:27, Greg Kurz wrote:
> > On Mon, 19 Nov 2018 08:09:38 -0500
> > Serhii Popovych <spopovyc@redhat.com> wrote:
> >   
> >> [...]
> >> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
> >> -        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
> >> +        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),  
> > 
> > Maybe simply cpu_to_be32(nb_numa_nodes) ?  
> 
> Or "cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1)" ?
> 
> In spapr_populate_drconf_memory() we have this logic.
> 

Hmm... maybe you're right, it seems that the code assumes
non-NUMA configs have at least one node. A similar assumption is
also present in pc_dimm_realize():

    if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) ||
        (!nb_numa_nodes && dimm->node)) {
        error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
                   PRIu32 "' which exceeds the number of numa nodes: %d",
                   dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
        return;
    }

This is a bit confusing...

Serhii Popovych Nov. 20, 2018, 6:58 p.m. UTC | #6
Greg Kurz wrote:
> On Mon, 19 Nov 2018 14:48:34 +0100
> Laurent Vivier <lvivier@redhat.com> wrote:
> 
>> On 19/11/2018 14:27, Greg Kurz wrote:
>>> On Mon, 19 Nov 2018 08:09:38 -0500
>>> Serhii Popovych <spopovyc@redhat.com> wrote:
>>>   
>>>> [...]
>>>> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
>>>> -        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
>>>> +        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),  
>>>
>>> Maybe simply cpu_to_be32(nb_numa_nodes) ?  
>>
>> Or "cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1)" ?

Linux handles zero correctly, but nb_numa_nodes ?: 1 looks better.

I tested both plain cpu_to_be32(nb_numa_nodes) and
cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1); Linux works
correctly in both cases:

(guest)# numactl -H
available: 1 nodes (0)
node 0 cpus: 0
node 0 size: 487 MB
node 0 free: 148 MB
node distances:
node   0
  0:  10

(qemu) info numa
0 nodes

>>
>> In spapr_populate_drconf_memory() we have this logic.
>>
> 
> Hmm... maybe you're right, it seems that the code assumes
> non-NUMA configs have at one node. Similar assumption is
> also present in pc_dimm_realize():
> 
>     if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) ||
>         (!nb_numa_nodes && dimm->node)) 
According to this, nb_numa_nodes can be zero

>         error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
>                    PRIu32 "' which exceeds the number of numa nodes: %d",
>                    dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
and this just handles that case to show a proper error message.

>         return;
>     }

Greg Kurz Nov. 21, 2018, 1:58 p.m. UTC | #7
On Tue, 20 Nov 2018 20:58:45 +0200
Serhii Popovych <spopovyc@redhat.com> wrote:

> Greg Kurz wrote:
> > On Mon, 19 Nov 2018 14:48:34 +0100
> > Laurent Vivier <lvivier@redhat.com> wrote:
> >   
> >> On 19/11/2018 14:27, Greg Kurz wrote:  
> >>> On Mon, 19 Nov 2018 08:09:38 -0500
> >>> Serhii Popovych <spopovyc@redhat.com> wrote:
> >>>     
> >>>> [...]
> >>>> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
> >>>> -        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
> >>>> +        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),    
> >>>
> >>> Maybe simply cpu_to_be32(nb_numa_nodes) ?    
> >>
> >> Or "cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1)" ?  
> 
> Linux handles zero correctly, but nb_numa_nodes ?: 1 looks better.
> 
> I tested both plain cpu_to_be32(nb_numa_nodes) and
> cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1); Linux works
> correctly in both cases:
> 
> (guest)# numactl -H
> available: 1 nodes (0)
> node 0 cpus: 0
> node 0 size: 487 MB
> node 0 free: 148 MB
> node distances:
> node   0
>   0:  10
> 
> (qemu) info numa
> 0 nodes
> 
> >>
> >> In spapr_populate_drconf_memory() we have this logic.
> >>  
> > 
> > Hmm... maybe you're right, it seems that the code assumes
> > non-NUMA configs have at one node. Similar assumption is
> > also present in pc_dimm_realize():
> > 
> >     if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) ||
> >         (!nb_numa_nodes && dimm->node))   
> According to this, nb_numa_nodes can be zero
> 
> >         error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
> >                    PRIu32 "' which exceeds the number of numa nodes: %d",
> >                    dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);  
> and this just handles that case to show a proper error message.
> 

Indeed, but it doesn't really explain why we're doing this...

> >         return;
> >     }  
> 
> > 
> > This is a bit confusing...

... fortunately, these commits shed some light:

commit 7db8a127e373e468d1f61e46e01e50d1aa33e827
Author: Alexey Kardashevskiy <aik@ozlabs.ru>
Date:   Thu Jul 3 13:10:04 2014 +1000

    spapr: Refactor spapr_populate_memory() to allow memoryless nodes
    
    Current QEMU does not support memoryless NUMA nodes, however
    actual hardware may have them so it makes sense to have a way
    to emulate them in QEMU. This prepares SPAPR for that.
    
    This moves 2 calls of spapr_populate_memory_node() into
    the existing loop over numa nodes so first several nodes may
    have no memory and this still will work.
    
    If there is no numa configuration, the code assumes there is just
    a single node at 0 and it has all the guest memory.
    
    Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    Signed-off-by: Alexander Graf <agraf@suse.de>


commit 6663864e950d40c467ae4ab81c4dac64d7a8d9e6
Author: Bharata B Rao <bharata@linux.vnet.ibm.com>
Date:   Mon Aug 3 11:05:40 2015 +0530

    spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA

    When NUMA isn't configured explicitly, assume node 0 is present for
    the purpose of creating ibm,associativity-lookup-arrays property
    under ibm,dynamic-reconfiguration-memory DT node. This ensures that
    the associativity index property is correctly updated in ibm,dynamic-memory
    for the LMB that is hotplugged.
    
    Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
    Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
    Signed-off-by: David Gibson <david@gibson.dropbear.id.au>


So I guess ?: 1 is consistent with the longstanding assumption in spapr 
that the machine always has a "node 0", even for non-NUMA setups.

Maybe this logic should be consolidated in some helper for better
clarity.
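
Something along these lines, say (helper name invented here, just a
sketch against the nb_numa_nodes global):

    /* spapr assumes non-NUMA configs still have a single node 0;
     * make that explicit in one place instead of open-coding ?: 1 */
    static inline uint32_t spapr_node_count(void)
    {
        return nb_numa_nodes ? nb_numa_nodes : 1;
    }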


Patch

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7afd1a1..843ae6c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
         cpu_to_be32(0),
         cpu_to_be32(0),
         cpu_to_be32(0),
-        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
+        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),
     };
 
     _FDT(rtas = fdt_add_subnode(fdt, 0, "rtas"));