[0/2] memory: Fix up coalesced_io_del not working for KVM

Message ID 20190817093237.27967-1-peterx@redhat.com
State New

Commit Message

Peter Xu Aug. 17, 2019, 9:32 a.m. UTC
I can easily crash QEMU when KVM is used with e1000: after rebooting
the guest many times, I hit this error and QEMU aborts [1]:

  kvm_mem_ioeventfd_add: error adding ioeventfd: No space left on device (28)

To reproduce this issue without rebooting so many times, you can also
simply dump dev_count from KVM with this patch applied to the
kernel:

Just watch it increase with reboots...

After some digging, it seems to be the coalesced mmio device that
overflowed the kvm io device count.
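
To illustrate the failure mode, here is a toy model (not KVM code) of a
bus with a fixed device cap, where every reboot registers some zones and
the matching unregistration may or may not take effect.  The names
toy_bus, TOY_BUS_MAX_DEVS and toy_reboots_until_full are hypothetical,
and the cap of 1000 is an assumption made for the sketch only.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical cap, standing in for the per-bus device limit in KVM. */
#define TOY_BUS_MAX_DEVS 1000

struct toy_bus {
    int dev_count;
};

/* Register one device; fails with -ENOSPC once the bus is full. */
static int toy_bus_register(struct toy_bus *bus)
{
    if (bus->dev_count >= TOY_BUS_MAX_DEVS)
        return -ENOSPC;
    bus->dev_count++;
    return 0;
}

/* Unregister one device, if any is registered. */
static void toy_bus_unregister(struct toy_bus *bus)
{
    if (bus->dev_count > 0)
        bus->dev_count--;
}

/*
 * Simulate guest reboots.  Each reboot registers zones_per_reboot zones.
 * When del_works is 0, the unregistration step is skipped, which models
 * a deletion that silently does nothing.  Returns the index of the
 * reboot on which registration first fails with -ENOSPC, or max_reboots
 * if it never fails.
 */
static int toy_reboots_until_full(int zones_per_reboot, int del_works,
                                  int max_reboots)
{
    struct toy_bus bus = { 0 };
    int reboot, z;

    assert(zones_per_reboot > 0);

    for (reboot = 0; reboot < max_reboots; reboot++) {
        for (z = 0; z < zones_per_reboot; z++) {
            if (toy_bus_register(&bus) == -ENOSPC)
                return reboot;
        }
        if (del_works) {
            for (z = 0; z < zones_per_reboot; z++)
                toy_bus_unregister(&bus);
        }
    }
    return max_reboots;
}
```

With 100 zones per reboot and a broken delete,
toy_reboots_until_full(100, 0, 50) returns 10: the bus fills up after
ten reboots.  With a working delete the count never grows and the
function returns max_reboots.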

I suspect it has not worked since the very beginning, when the
coalesced interfaces were introduced...  We had a fix for the addition
part previously, but it seems that the deletion part is still broken.
This patchset tries to fix the two problems related to the deletion
part.

IMHO the 2nd patch is a workaround for KVM, in that KVM should allow
KVM_UNREGISTER_COALESCED_MMIO to work even if the user specifies a
very large zone that covers multiple mmio devices.  I have a KVM patch
for that, but I didn't send it because it would slightly change the
syscall behavior (although it should not break existing users in most
cases).  Please let me know if anyone thinks I should post it anyway.
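
A rough sketch of the idea behind splitting the zone on deletion, as a
toy model rather than the actual patch.  It assumes, as the paragraph
above implies, that an unregister request only matches a registered
zone when the requested range fits inside that zone, so one large
request covering several zones removes nothing.  All toy_* names are
hypothetical, and the registry here doubles as userspace's own record
of what it registered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ZONES 16

struct toy_zone {
    uint64_t addr;
    uint64_t size;
    int in_use;
};

struct toy_registry {
    struct toy_zone zones[TOY_MAX_ZONES];
};

/* Record a zone in the first free slot; returns 0, or -1 if full. */
static int toy_register(struct toy_registry *r, uint64_t addr, uint64_t size)
{
    for (size_t i = 0; i < TOY_MAX_ZONES; i++) {
        if (!r->zones[i].in_use) {
            r->zones[i] = (struct toy_zone){ addr, size, 1 };
            return 0;
        }
    }
    return -1;
}

/*
 * Assumed kernel-side behaviour: a registered zone is removed only if
 * the requested range lies entirely inside it.  Returns zones removed.
 */
static int toy_unregister(struct toy_registry *r, uint64_t addr, uint64_t size)
{
    int removed = 0;

    for (size_t i = 0; i < TOY_MAX_ZONES; i++) {
        struct toy_zone *z = &r->zones[i];

        if (z->in_use && addr >= z->addr &&
            addr + size <= z->addr + z->size) {
            z->in_use = 0;
            removed++;
        }
    }
    return removed;
}

/*
 * Userspace-side workaround: instead of one request for the whole range,
 * issue one request per registered zone that the range covers.
 */
static int toy_unregister_split(struct toy_registry *r, uint64_t addr,
                                uint64_t size)
{
    int removed = 0;

    for (size_t i = 0; i < TOY_MAX_ZONES; i++) {
        struct toy_zone z = r->zones[i];   /* copy: the call below mutates */

        if (z.in_use && z.addr >= addr &&
            z.addr + z.size <= addr + size)
            removed += toy_unregister(r, z.addr, z.size);
    }
    return removed;
}
```

With three small zones registered, a single toy_unregister() call over
a range covering all of them removes none, while toy_unregister_split()
over the same range removes all three.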

I _think_ this is needed by the stable branches as well, because e1000
is still widely used (is it?) and triggering the bug is not that hard
(simply reboot enough times; it should be even worse with more MMIO
devices, e.g., multiple e1000-like devices).  I'll leave it to the
maintainers to judge.

Please have a look, thanks.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1741863

Peter Xu (2):
  memory: Replace has_coalesced_range with add/del flags
  memory: Split zones when do coalesced_io_del()

 memory.c | 51 +++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 43 insertions(+), 8 deletions(-)

Comments

Peter Xu Aug. 19, 2019, 9:32 a.m. UTC | #1
On Sat, Aug 17, 2019 at 05:32:35PM +0800, Peter Xu wrote:
> I can easily crash QEMU as long as KVM is used with e1000 and reboot
> many times, then I hit this and QEMU aborts [1]:
> 
>   kvm_mem_ioeventfd_add: error adding ioeventfd: No space left on device (28)

Reproducer:

bin=x86_64-softmmu/qemu-system-x86_64
$bin -M q35,accel=kvm,kernel-irqchip=on -smp 8 -m 2G -cpu host \
     -device e1000,netdev=net0 \
     -netdev user,id=net0,hostfwd=tcp::5555-:22 \
     -device e1000,netdev=net1 \
     -netdev user,id=net1 \
     -device e1000,netdev=net2 \
     -netdev user,id=net2 \
     -device e1000,netdev=net3 \
     -netdev user,id=net3 \
     -drive file=/images/default.qcow2,if=none,cache=none,id=drive0 \
     -device virtio-blk-pci,drive=drive0

This should crash within 2-3 reboots.  The more e1000 devices, the
faster.

Regards,

Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c6a91b044d8d..c4f1e8a5a93c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3841,6 +3841,7 @@  int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,

        memcpy(new_bus, bus, sizeof(*bus) + i * sizeof(struct kvm_io_range));
        new_bus->dev_count++;
+       pr_info("%s: dev_count++ (%d)\n", __func__, new_bus->dev_count);
        new_bus->range[i] = range;
        memcpy(new_bus->range + i + 1, bus->range + i,
                (bus->dev_count - i) * sizeof(struct kvm_io_range));
@@ -3879,6 +3880,7 @@  void kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,

        memcpy(new_bus, bus, sizeof(*bus) + i * sizeof(struct kvm_io_range));
        new_bus->dev_count--;
+       pr_info("%s: dev_count-- (%d)\n", __func__, new_bus->dev_count);
        memcpy(new_bus->range + i, bus->range + i + 1,
               (new_bus->dev_count - i) * sizeof(struct kvm_io_range));