Message ID | 20210921102434.24273-1-david@redhat.com |
---|---|
State | New |
Headers | show |
Series | [v1] monitor: Consider "id" when rate-limiting MEMORY_DEVICE_SIZE_CHANGE qapi events | expand |
David Hildenbrand <david@redhat.com> writes: > We have to consider the device id, otherwise we'll lose some events for > unrelated devices. If the device does not have a device id (very unlikely), > the target of the notifications has to update the size of all devices > manually either way. > > This was noticed by starting a VM with two virtio-mem devices that each > have a requested size > 0. The Linux guest will initialize both devices > in parallel, resulting in losing MEMORY_DEVICE_SIZE_CHANGE events for > one of the devices. Fascinating. Event rate limiting works as follows. An event is rate-limited when monitor_qapi_event_conf[event].rate != 0. When such an event arrives, it is held in a bucket until a timer associated with the bucket expires. Putting an event in an empty bucket starts its timer. Putting an event in a non-empty bucket replaces its old contents. The bucket to use for an event depends on its event type, and for some events also on certain event arguments. This patch solves the "MEMORY_DEVICE_SIZE_CHANGE events from different devices eat each other" by splitting the event's bucket. The split is imperfect: each device with a qdev ID gets its own bucket, all devices without ID have to share a bucket. This is actually a flaw in the event's design: you can't distinguish events from different devices without IDs. To fix that flaw, add the QOM path to the event. You can then get a perfect bucket split: use the QOM path instead of the qdev ID. Related: [PATCH v8 5/7] qapi/qdev.json: add DEVICE_UNPLUG_GUEST_ERROR QAPI event Message-Id: <20210907004755.424931-6-danielhb413@gmail.com> This deprecates MEM_UNPLUG_ERROR, which only provides a qdev ID in favour of DEVICE_UNPLUG_GUEST_ERROR, which has a wider scope, and also provides a QOM path. Different tack: perhaps the rate limiting is too simplistic and overly aggressive. Its purpose is to avoid a flood. Two events are not a flood, even when one follows the other quickly. Heck, even a dozen aren't. > > Fixes: 722a3c783ef4 ("virtio-pci: Send qapi events when the virtio-mem size changes") > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> (maintainer:Human Monitor (HMP)) > Cc: Markus Armbruster <armbru@redhat.com> (supporter:QMP) > Cc: Michael S. Tsirkin <mst@redhat.com> > Cc: Eric Blake <eblake@redhat.com> > Cc: Igor Mammedov <imammedo@redhat.com> > Cc: Michal Privoznik <mprivozn@redhat.com> > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > monitor/monitor.c | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/monitor/monitor.c b/monitor/monitor.c > index 46a171bca6..05c0b32b67 100644 > --- a/monitor/monitor.c > +++ b/monitor/monitor.c > @@ -474,6 +474,11 @@ static unsigned int qapi_event_throttle_hash(const void *key) > hash += g_str_hash(qdict_get_str(evstate->data, "node-name")); > } > > + if (evstate->event == QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE && > + qdict_get(evstate->data, "id")) { > + hash += g_str_hash(qdict_get_str(evstate->data, "id")); > + } > + > return hash; > } > > @@ -496,6 +501,20 @@ static gboolean qapi_event_throttle_equal(const void *a, const void *b) > qdict_get_str(evb->data, "node-name")); > } > > + if (eva->event == QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE) { > + const bool id_a = qdict_get(eva->data, "id"); > + const bool id_b = qdict_get(evb->data, "id"); > + > + if (!id_a && !id_b) { > + return TRUE; > + } else if (id_a ^ id_b) { > + return FALSE; > + } > + > + return !strcmp(qdict_get_str(eva->data, "id"), > + qdict_get_str(evb->data, "id")); > + } > + > return TRUE; > } Patch looks sane, but I recommend to first add the QOM path to the event, then use it instead of the qdev ID.
On 22.09.21 14:11, Markus Armbruster wrote: > David Hildenbrand <david@redhat.com> writes: > >> We have to consider the device id, otherwise we'll lose some events for >> unrelated devices. If the device does not have a device id (very unlikely), >> the target of the notifications has to update the size of all devices >> manually either way. >> >> This was noticed by starting a VM with two virtio-mem devices that each >> have a requested size > 0. The Linux guest will initialize both devices >> in parallel, resulting in losing MEMORY_DEVICE_SIZE_CHANGE events for >> one of the devices. > > Fascinating. > > Event rate limiting works as follows. > > An event is rate-limited when monitor_qapi_event_conf[event].rate != 0. > > When such an event arrives, it is held in a bucket until a timer > associated with the bucket expires. Putting an event in an empty bucket > starts its timer. Putting an event in a non-empty bucket replaces its > old contents. > > The bucket to use for an event depends on its event type, and for some > events also on certain event arguments. > > This patch solves the "MEMORY_DEVICE_SIZE_CHANGE events from different > devices eat each other" by splitting the event's bucket. Right, that's how it's getting used in libvirt where we noticed it. > > The split is imperfect: each device with a qdev ID gets its own bucket, > all devices without ID have to share a bucket. Yes, it's far from perfect. Fortunately upper layers (libvirt) barely do that. > > This is actually a flaw in the event's design: you can't distinguish > events from different devices without IDs. > > To fix that flaw, add the QOM path to the event. So the idea would be to extend the event by an optional QOM path (because it's an existing event), but always setting it internally? Thanks! > > You can then get a perfect bucket split: use the QOM path instead of > the qdev ID. > > Related: [PATCH v8 5/7] qapi/qdev.json: add DEVICE_UNPLUG_GUEST_ERROR > QAPI event > Message-Id: <20210907004755.424931-6-danielhb413@gmail.com> > This deprecates MEM_UNPLUG_ERROR, which only provides a qdev ID in > favour of DEVICE_UNPLUG_GUEST_ERROR, which has a wider scope, and also > provides a QOM path. > > Different tack: perhaps the rate limiting is too simplistic and overly > aggressive. Its purpose is to avoid a flood. Two events are not a > flood, even when one follows the other quickly. Heck, even a dozen > aren't. > >> >> Fixes: 722a3c783ef4 ("virtio-pci: Send qapi events when the virtio-mem size changes") >> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> (maintainer:Human Monitor (HMP)) >> Cc: Markus Armbruster <armbru@redhat.com> (supporter:QMP) >> Cc: Michael S. Tsirkin <mst@redhat.com> >> Cc: Eric Blake <eblake@redhat.com> >> Cc: Igor Mammedov <imammedo@redhat.com> >> Cc: Michal Privoznik <mprivozn@redhat.com> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> monitor/monitor.c | 19 +++++++++++++++++++ >> 1 file changed, 19 insertions(+) >> >> diff --git a/monitor/monitor.c b/monitor/monitor.c >> index 46a171bca6..05c0b32b67 100644 >> --- a/monitor/monitor.c >> +++ b/monitor/monitor.c >> @@ -474,6 +474,11 @@ static unsigned int qapi_event_throttle_hash(const void *key) >> hash += g_str_hash(qdict_get_str(evstate->data, "node-name")); >> } >> >> + if (evstate->event == QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE && >> + qdict_get(evstate->data, "id")) { >> + hash += g_str_hash(qdict_get_str(evstate->data, "id")); >> + } >> + >> return hash; >> } >> >> @@ -496,6 +501,20 @@ static gboolean qapi_event_throttle_equal(const void *a, const void *b) >> qdict_get_str(evb->data, "node-name")); >> } >> >> + if (eva->event == QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE) { >> + const bool id_a = qdict_get(eva->data, "id"); >> + const bool id_b = qdict_get(evb->data, "id"); >> + >> + if (!id_a && !id_b) { >> + return TRUE; >> + } else if (id_a ^ id_b) { >> + return FALSE; >> + } >> + >> + return !strcmp(qdict_get_str(eva->data, "id"), >> + qdict_get_str(evb->data, "id")); >> + } >> + >> return TRUE; >> } > > Patch looks sane, but I recommend to first add the QOM path to the > event, then use it instead of the qdev ID. Yes, let me take a look how easy that will be :)
On 22.09.21 14:20, David Hildenbrand wrote: > On 22.09.21 14:11, Markus Armbruster wrote: >> David Hildenbrand <david@redhat.com> writes: >> >>> We have to consider the device id, otherwise we'll lose some events for >>> unrelated devices. If the device does not have a device id (very unlikely), >>> the target of the notifications has to update the size of all devices >>> manually either way. >>> >>> This was noticed by starting a VM with two virtio-mem devices that each >>> have a requested size > 0. The Linux guest will initialize both devices >>> in parallel, resulting in losing MEMORY_DEVICE_SIZE_CHANGE events for >>> one of the devices. >> >> Fascinating. >> >> Event rate limiting works as follows. >> >> An event is rate-limited when monitor_qapi_event_conf[event].rate != 0. >> >> When such an event arrives, it is held in a bucket until a timer >> associated with the bucket expires. Putting an event in an empty bucket >> starts its timer. Putting an event in a non-empty bucket replaces its >> old contents. >> >> The bucket to use for an event depends on its event type, and for some >> events also on certain event arguments. >> >> This patch solves the "MEMORY_DEVICE_SIZE_CHANGE events from different >> devices eat each other" by splitting the event's bucket. > > Right, that's how it's getting used in libvirt where we noticed it. > >> >> The split is imperfect: each device with a qdev ID gets its own bucket, >> all devices without ID have to share a bucket. > > Yes, it's far from perfect. Fortunately upper layers (libvirt) barely do > that. > >> >> This is actually a flaw in the event's design: you can't distinguish >> events from different devices without IDs. >> >> To fix that flaw, add the QOM path to the event. > > So the idea would be to extend the event by an optional QOM path > (because it's an existing event), but always setting it internally? > Thinking about it, not optional, because we always have a QOM path and extending an event with something not optional should just work AFAIKT. Just implemented it, seems to work just fine. Will send a v2 -- thanks!
diff --git a/monitor/monitor.c b/monitor/monitor.c index 46a171bca6..05c0b32b67 100644 --- a/monitor/monitor.c +++ b/monitor/monitor.c @@ -474,6 +474,11 @@ static unsigned int qapi_event_throttle_hash(const void *key) hash += g_str_hash(qdict_get_str(evstate->data, "node-name")); } + if (evstate->event == QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE && + qdict_get(evstate->data, "id")) { + hash += g_str_hash(qdict_get_str(evstate->data, "id")); + } + return hash; } @@ -496,6 +501,20 @@ static gboolean qapi_event_throttle_equal(const void *a, const void *b) qdict_get_str(evb->data, "node-name")); } + if (eva->event == QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE) { + const bool id_a = qdict_get(eva->data, "id"); + const bool id_b = qdict_get(evb->data, "id"); + + if (!id_a && !id_b) { + return TRUE; + } else if (id_a ^ id_b) { + return FALSE; + } + + return !strcmp(qdict_get_str(eva->data, "id"), + qdict_get_str(evb->data, "id")); + } + return TRUE; }
We have to consider the device id, otherwise we'll lose some events for unrelated devices. If the device does not have a device id (very unlikely), the target of the notifications has to update the size of all devices manually either way. This was noticed by starting a VM with two virtio-mem devices that each have a requested size > 0. The Linux guest will initialize both devices in parallel, resulting in losing MEMORY_DEVICE_SIZE_CHANGE events for one of the devices. Fixes: 722a3c783ef4 ("virtio-pci: Send qapi events when the virtio-mem size changes") Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> (maintainer:Human Monitor (HMP)) Cc: Markus Armbruster <armbru@redhat.com> (supporter:QMP) Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> --- monitor/monitor.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)