Possible LMB hot unplug bug in 4.13+ kernels

Message ID 94618f16-b47f-b714-9cb5-4bbbf7fdccdf@linux.vnet.ibm.com
State New
Headers show
Series
  • Possible LMB hot unplug bug in 4.13+ kernels
Related show

Commit Message

Daniel Henrique Barboza Oct. 4, 2017, 8:21 p.m.
Hi,

I've stumbled in a LMB hot unplug problem when running a guest with 
4.13+ kernel using QEMU 2.10. When trying to hot unplug a recently 
hotplugged LMB this is what I got, using an upstream kernel:

---------------
QEMU cmd line: sudo ./qemu-system-ppc64 -name migrate_qemu -boot 
strict=on -device nec-usb-xhci,id=usb,bus=pci.0,addr=0xf -device 
spapr-vscsi,id=scsi0,reg=0x2000 -smp 
32,maxcpus=32,sockets=32,cores=1,threads=1 --machine 
pseries,accel=kvm,kvm-type=HV,usb=off,dump-guest-core=off -m 
4G,slots=32,maxmem=32G -drive 
file=/home/danielhb/vm_imgs/f26.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none 
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 
-nographic


Last login: Wed Oct  4 12:28:25 on hvc0
[danielhb@localhost ~]$ grep Mem /proc/meminfo
MemTotal:        4161728 kB
MemFree:         3204352 kB
MemAvailable:    3558336 kB
[danielhb@localhost ~]$ QEMU 2.10.50 monitor - type 'help' for more 
information
(qemu)
(qemu) object_add memory-backend-ram,id=ram0,size=1G
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0
(qemu)

[danielhb@localhost ~]$ grep Mem /proc/meminfo
MemTotal:        5210304 kB
MemFree:         4254656 kB
MemAvailable:    4598144 kB
[danielhb@localhost ~]$ (qemu)
(qemu) device_del dimm0
(qemu) [  136.058727] pseries-hotplug-mem: Memory indexed-count-remove 
failed, adding any removed LMBs

(qemu)
[danielhb@localhost ~]$ grep Mem /proc/meminfo
MemTotal:        5210304 kB
MemFree:         4253696 kB
MemAvailable:    4597184 kB
[danielhb@localhost ~]$

-------------

The LMBs are about to be unplugged, them something happens and they 
ended up being hotplugged back.

This isn't reproducible with 4.11 guests. I can reliably reproduce it in 
4.13+. Haven't tried 4.12.

Changing QEMU snapshots or even the hypervisor kernel/OS didn't affect 
the result.


In an attempt to better understand the issue I did the following changes 
in upstream kernel:

         scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
@@ -442,8 +444,10 @@ static bool lmb_is_removable(struct of_drconf_cell 
*lmb)

  #ifdef CONFIG_FA_DUMP
         /* Don't hot-remove memory that falls in fadump boot memory area */
-       if (is_fadump_boot_memory_area(phys_addr, block_sz))
+       if (is_fadump_boot_memory_area(phys_addr, block_sz)) {
+               pr_err("lmb belongs to fadump boot memory area\n");
                 return false;
+       }
  #endif

         for (i = 0; i < scns_per_block; i++) {
@@ -454,7 +458,9 @@ static bool lmb_is_removable(struct of_drconf_cell *lmb)
                 rc &= is_mem_section_removable(pfn, PAGES_PER_SECTION);
                 phys_addr += MIN_MEMORY_BLOCK_SIZE;
         }
-
+       if (!rc) {
+               pr_err("is_mem_section_removable returned false \n");
+       }
         return rc ? true : false;
  }

@@ -465,12 +471,16 @@ static int dlpar_remove_lmb(struct of_drconf_cell 
*lmb)
         unsigned long block_sz;
         int nid, rc;

-       if (!lmb_is_removable(lmb))
+       if (!lmb_is_removable(lmb)) {
+               pr_err("dlpar_remove_lmb: lmb is not removable! \n");
                 return -EINVAL;
+       }

         rc = dlpar_offline_lmb(lmb);
-       if (rc)
+       if (rc) {
+               pr_err("dlpar_remove_lmb: offline_lmb returned not zero 
\n");
                 return rc;
+       }

         block_sz = pseries_memory_block_size();
         nid = memory_add_physaddr_to_nid(lmb->base_addr);

And this is the output:

---------

[danielhb@localhost ~]$ QEMU 2.10.50 monitor - type 'help' for more 
information
(qemu)
(qemu) object_add memory-backend-ram,id=ram0,size=1G
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0
(qemu)

[danielhb@localhost ~]$ grep Mem /proc/meminfo
MemTotal:        5210304 kB
MemFree:         4254656 kB
MemAvailable:    4598144 kB
[danielhb@localhost ~]$ (qemu)
(qemu) device_del dimm0
(qemu) [  136.058473] pseries-hotplug-mem: is_mem_section_removable 
returned false
[  136.058607] pseries-hotplug-mem: dlpar_remove_lmb: lmb is not removable!
[  136.058727] pseries-hotplug-mem: Memory indexed-count-remove failed, 
adding any removed LMBs

(qemu)

[danielhb@localhost ~]$ grep Mem /proc/meminfo
MemTotal:        5210304 kB
MemFree:         4253696 kB
MemAvailable:    4597184 kB
[danielhb@localhost ~]$

---------------

It appears that the hot unplug is failing because lmb_is_removable(lmb) 
is returning
false inside dlpar_remove_lmb, triggering the hotplug of the LMBs again:

         if (rc) {
                 pr_err("Memory indexed-count-remove failed, adding any 
removed LMBs\n");

                 for (i = start_index; i < end_index; i++) {
                         if (!lmbs[i].reserved)
                                 continue;

                         rc = dlpar_add_lmb(&lmbs[i]);
                         if (rc)
                                 pr_err("Failed to add LMB, drc index %x\n",
be32_to_cpu(lmbs[i].drc_index));

                         lmbs[i].reserved = 0;
                 }

I am not aware of anything that I can do from QEMU side to fix this. Can 
anyone take a look or provide
guidance? Am I missing something in my tests?


Thanks,


Daniel

Comments

Nathan Fontenot Oct. 5, 2017, 3:43 p.m. | #1
On 10/04/2017 03:21 PM, Daniel Henrique Barboza wrote:
> Hi,
> 
> I've stumbled in a LMB hot unplug problem when running a guest with 4.13+ kernel using QEMU 2.10. When trying to hot unplug a recently hotplugged LMB this is what I got, using an upstream kernel:
> 
> ---------------
> QEMU cmd line: sudo ./qemu-system-ppc64 -name migrate_qemu -boot strict=on -device nec-usb-xhci,id=usb,bus=pci.0,addr=0xf -device spapr-vscsi,id=scsi0,reg=0x2000 -smp 32,maxcpus=32,sockets=32,cores=1,threads=1 --machine pseries,accel=kvm,kvm-type=HV,usb=off,dump-guest-core=off -m 4G,slots=32,maxmem=32G -drive file=/home/danielhb/vm_imgs/f26.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -nographic
> 
> 
> Last login: Wed Oct  4 12:28:25 on hvc0
> [danielhb@localhost ~]$ grep Mem /proc/meminfo
> MemTotal:        4161728 kB
> MemFree:         3204352 kB
> MemAvailable:    3558336 kB
> [danielhb@localhost ~]$ QEMU 2.10.50 monitor - type 'help' for more information
> (qemu)
> (qemu) object_add memory-backend-ram,id=ram0,size=1G
> (qemu) device_add pc-dimm,id=dimm0,memdev=ram0
> (qemu)
> 
> [danielhb@localhost ~]$ grep Mem /proc/meminfo
> MemTotal:        5210304 kB
> MemFree:         4254656 kB
> MemAvailable:    4598144 kB
> [danielhb@localhost ~]$ (qemu)
> (qemu) device_del dimm0
> (qemu) [  136.058727] pseries-hotplug-mem: Memory indexed-count-remove failed, adding any removed LMBs
> 
> (qemu)
> [danielhb@localhost ~]$ grep Mem /proc/meminfo
> MemTotal:        5210304 kB
> MemFree:         4253696 kB
> MemAvailable:    4597184 kB
> [danielhb@localhost ~]$
> 
> -------------
> 
> The LMBs are about to be unplugged, them something happens and they ended up being hotplugged back.
> 
> This isn't reproducible with 4.11 guests. I can reliably reproduce it in 4.13+. Haven't tried 4.12.
> 
> Changing QEMU snapshots or even the hypervisor kernel/OS didn't affect the result.
> 
> 
> In an attempt to better understand the issue I did the following changes in upstream kernel:
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 1d48ab424bd9..37550833cdb0 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -433,8 +433,10 @@ static bool lmb_is_removable(struct of_drconf_cell *lmb)
>         unsigned long pfn, block_sz;
>         u64 phys_addr;
> 
> -       if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> +       if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) {
> +               pr_err("lmb is not assigned \n");
>                 return false;
> +       }
> 
>         block_sz = memory_block_size_bytes();
>         scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
> @@ -442,8 +444,10 @@ static bool lmb_is_removable(struct of_drconf_cell *lmb)
> 
>  #ifdef CONFIG_FA_DUMP
>         /* Don't hot-remove memory that falls in fadump boot memory area */
> -       if (is_fadump_boot_memory_area(phys_addr, block_sz))
> +       if (is_fadump_boot_memory_area(phys_addr, block_sz)) {
> +               pr_err("lmb belongs to fadump boot memory area\n");
>                 return false;
> +       }
>  #endif
> 
>         for (i = 0; i < scns_per_block; i++) {
> @@ -454,7 +458,9 @@ static bool lmb_is_removable(struct of_drconf_cell *lmb)
>                 rc &= is_mem_section_removable(pfn, PAGES_PER_SECTION);
>                 phys_addr += MIN_MEMORY_BLOCK_SIZE;
>         }
> -
> +       if (!rc) {
> +               pr_err("is_mem_section_removable returned false \n");
> +       }
>         return rc ? true : false;
>  }
> 
> @@ -465,12 +471,16 @@ static int dlpar_remove_lmb(struct of_drconf_cell *lmb)
>         unsigned long block_sz;
>         int nid, rc;
> 
> -       if (!lmb_is_removable(lmb))
> +       if (!lmb_is_removable(lmb)) {
> +               pr_err("dlpar_remove_lmb: lmb is not removable! \n");
>                 return -EINVAL;
> +       }
> 
>         rc = dlpar_offline_lmb(lmb);
> -       if (rc)
> +       if (rc) {
> +               pr_err("dlpar_remove_lmb: offline_lmb returned not zero \n");
>                 return rc;
> +       }
> 
>         block_sz = pseries_memory_block_size();
>         nid = memory_add_physaddr_to_nid(lmb->base_addr);
> 
> And this is the output:
> 
> ---------
> 
> [danielhb@localhost ~]$ QEMU 2.10.50 monitor - type 'help' for more information
> (qemu)
> (qemu) object_add memory-backend-ram,id=ram0,size=1G
> (qemu) device_add pc-dimm,id=dimm0,memdev=ram0
> (qemu)
> 
> [danielhb@localhost ~]$ grep Mem /proc/meminfo
> MemTotal:        5210304 kB
> MemFree:         4254656 kB
> MemAvailable:    4598144 kB
> [danielhb@localhost ~]$ (qemu)
> (qemu) device_del dimm0
> (qemu) [  136.058473] pseries-hotplug-mem: is_mem_section_removable returned false
> [  136.058607] pseries-hotplug-mem: dlpar_remove_lmb: lmb is not removable!
> [  136.058727] pseries-hotplug-mem: Memory indexed-count-remove failed, adding any removed LMBs
> 
> (qemu)
> 
> [danielhb@localhost ~]$ grep Mem /proc/meminfo
> MemTotal:        5210304 kB
> MemFree:         4253696 kB
> MemAvailable:    4597184 kB
> [danielhb@localhost ~]$
> 
> ---------------
> 
> It appears that the hot unplug is failing because lmb_is_removable(lmb) is returning
> false inside dlpar_remove_lmb, triggering the hotplug of the LMBs again:
> 
>         if (rc) {
>                 pr_err("Memory indexed-count-remove failed, adding any removed LMBs\n");
> 
>                 for (i = start_index; i < end_index; i++) {
>                         if (!lmbs[i].reserved)
>                                 continue;
> 
>                         rc = dlpar_add_lmb(&lmbs[i]);
>                         if (rc)
>                                 pr_err("Failed to add LMB, drc index %x\n",
> be32_to_cpu(lmbs[i].drc_index));
> 
>                         lmbs[i].reserved = 0;
>                 }
> 
> I am not aware of anything that I can do from QEMU side to fix this. Can anyone take a look or provide
> guidance? Am I missing something in my tests?
> 

I don't think you are missing anything in your tests and am not sure that this is a bug.
DLPAR memory remove operations are not guaranteed, memory remove requests fail quite often
because the LMBs we are trying to remove are not removable.

From the added printks you put in this appears to be what is happening. When trying to
remove the range of LMBs, one of the memory sections in one of the LMBs is not removbable,
this is seen when lmb_is_removable() returns false.

The behavior you then describe, that all LMBs are added back is the correct behavior for
DLPAR operations, they are an all or nothing operation. If any part of the DLPAR request
fails (during add or remove) we revert any changed LMBs back to the stat they were in
prior to handling the request.

-Nathan


> 
> Thanks,
> 
> 
> Daniel
>
Michael Ellerman Oct. 6, 2017, 4:46 a.m. | #2
Nathan Fontenot <nfont@linux.vnet.ibm.com> writes:
> On 10/04/2017 03:21 PM, Daniel Henrique Barboza wrote:
...
>> It appears that the hot unplug is failing because lmb_is_removable(lmb) is returning
>> false inside dlpar_remove_lmb, triggering the hotplug of the LMBs again:
...
>> 
>> I am not aware of anything that I can do from QEMU side to fix this. Can anyone take a look or provide
>> guidance? Am I missing something in my tests?
>> 
>
> I don't think you are missing anything in your tests and am not sure that this is a bug.
> DLPAR memory remove operations are not guaranteed, memory remove requests fail quite often
> because the LMBs we are trying to remove are not removable.
>
> From the added printks you put in this appears to be what is happening. When trying to
> remove the range of LMBs, one of the memory sections in one of the LMBs is not removbable,
> this is seen when lmb_is_removable() returns false.

Yeah I agree.

Unless you (Daniel) think there's some reason lmb_is_removable() is
incorrectly returning false. But most likely it's correct and there's
just an unmovable allocation in that range.

cheers
Daniel Henrique Barboza Oct. 6, 2017, 11:04 a.m. | #3
> Unless you (Daniel) think there's some reason lmb_is_removable() is
> incorrectly returning false. But most likely it's correct and there's
> just an unmovable allocation in that range.

I am not educated enough to say that the current behavior is wrong. What I
can say is that in 4.11 and older kernels that supports LMB hot 
plug/unplug I
didn't see this kernel "refusal" to remove a LMB that was just hotplugged.

Assuming that the kernel is behaving as intended, a QEMU guest started with
4Gb of RAM that receives an extra 1Gb of RAM will not unplug this same 1Gb.
It seems off from the user perspective that a recently added memory is being
considered not removable, thus QEMU will need to keep this limitation in 
mind when
dealing with future LMB bugs in 4.13+ kernels.


Thanks,


Daniel

Patch

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 1d48ab424bd9..37550833cdb0 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -433,8 +433,10 @@  static bool lmb_is_removable(struct of_drconf_cell 
*lmb)
         unsigned long pfn, block_sz;
         u64 phys_addr;

-       if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
+       if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) {
+               pr_err("lmb is not assigned \n");
                 return false;
+       }

         block_sz = memory_block_size_bytes();