From patchwork Mon Jan 16 21:58:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Woodhouse X-Patchwork-Id: 1727304 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=infradead.org header.i=@infradead.org header.a=rsa-sha256 header.s=desiato.20200630 header.b=TVhhmSky; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NwmNh5DX2z23gx for ; Tue, 17 Jan 2023 09:07:48 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHXX5-0002hL-Dl; Mon, 16 Jan 2023 17:00:49 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHXV0-0006sa-EN for qemu-devel@nongnu.org; Mon, 16 Jan 2023 16:58:38 -0500 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHXUt-0003eB-T1 for qemu-devel@nongnu.org; Mon, 16 Jan 2023 16:58:38 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=dwKaI7J+FpIKmlL6BtwPJUMK+R1k9zp6jZEXx3mnsZQ=; b=TVhhmSky98dUJHA3vWmOSqYBnV li8VUTJxnv79afb18Em0y9Qt/wl+jV20vYLm5Ec4ZHqxPCX8UQ/+TF50myDQ2y3XAwCvOlMFS8H6T YXowicTlHj97+ZXh57Pjr7QESrrm/Ej4SK1FWzJGc6gHQOWTrf50rQqOQr0NQNK9sAekjzYLtrr1U J+Dn2WOrFoL88LmxAauwJ1MyV0EftW08ldZ8VwoVRCXxfTZv939n/C07XcRv+8SPTa9ulsR7pOTsL KCe+mEQ3KXdGaEyYpHJuftvKgICbHnIRDd+eqvRGlC1CTGdQaUM5qz/SZiTJTurCtu3blMC4NOYC+ aXUPMc1A==; Received: from i7.infradead.org ([2001:8b0:10b:1:21e:67ff:fecb:7a92]) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pHXUP-005jRv-0A; Mon, 16 Jan 2023 21:58:04 +0000 Received: from dwoodhou by i7.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1pHXUY-004iQw-0J; Mon, 16 Jan 2023 21:58:10 +0000 From: David Woodhouse To: qemu-devel@nongnu.org Cc: Paolo Bonzini , Paul Durrant , Joao Martins , Ankur Arora , =?utf-8?q?Philippe_Mathieu-Daud?= =?utf-8?q?=C3=A9?= , Thomas Huth , =?utf-8?q?Alex_Benn=C3=A9e?= , Juan Quintela , "Dr . David Alan Gilbert" , Claudio Fontana , Julien Grall , "Michael S. Tsirkin" , Marcel Apfelbaum , armbru@redhat.com Subject: [PATCH v7 49/51] hw/xen: Add backend implementation of interdomain event channel support Date: Mon, 16 Jan 2023 21:58:03 +0000 Message-Id: <20230116215805.1123514-50-dwmw2@infradead.org> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230116215805.1123514-1-dwmw2@infradead.org> References: <20230116215805.1123514-1-dwmw2@infradead.org> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by desiato.infradead.org. See http://www.infradead.org/rpr.html Received-SPF: none client-ip=2001:8b0:10b:1:d65d:64ff:fe57:4e05; envelope-from=BATV+491b11caf3ce55304f6a+7085+infradead.org+dwmw2@desiato.srs.infradead.org; helo=desiato.infradead.org X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: David Woodhouse The provides the QEMU side of interdomain event channels, allowing events to be sent to/from the guest. The API mirrors libxenevtchn, and in time both this and the real Xen one will be available through ops structures so that the PV backend drivers can use the correct one as appropriate. For now, this implementation can be used directly by our XenStore which will be for emulated mode only. Signed-off-by: David Woodhouse --- hw/i386/kvm/xen_evtchn.c | 340 ++++++++++++++++++++++++++++++++++++++- hw/i386/kvm/xen_evtchn.h | 19 +++ 2 files changed, 352 insertions(+), 7 deletions(-) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 3340cbb109..dc58843cf2 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -37,6 +37,7 @@ #include "sysemu/kvm.h" #include "sysemu/kvm_xen.h" #include +#include #include "standard-headers/xen/memory.h" #include "standard-headers/xen/hvm/params.h" @@ -87,6 +88,13 @@ struct compat_shared_info { #define COMPAT_EVTCHN_2L_NR_CHANNELS 1024 +/* Local private implementation of struct xenevtchn_handle */ +struct xenevtchn_handle { + evtchn_port_t be_port; + evtchn_port_t guest_port; /* Or zero for unbound */ + int fd; +}; + /* * For unbound/interdomain ports there are only two possible remote * domains; self and QEMU. Use a single high bit in type_val for that, @@ -110,6 +118,8 @@ struct XenEvtchnState { uint32_t nr_ports; XenEvtchnPort port_table[EVTCHN_2L_NR_CHANNELS]; qemu_irq gsis[GSI_NUM_PINS]; + + struct xenevtchn_handle *be_handles[EVTCHN_2L_NR_CHANNELS]; }; struct XenEvtchnState *xen_evtchn_singleton; @@ -117,6 +127,18 @@ struct XenEvtchnState *xen_evtchn_singleton; /* Top bits of callback_param are the type (HVM_PARAM_CALLBACK_TYPE_xxx) */ #define CALLBACK_VIA_TYPE_SHIFT 56 +static void unbind_backend_ports(XenEvtchnState *s); + +static int xen_evtchn_pre_load(void *opaque) +{ + XenEvtchnState *s = opaque; + + /* Unbind all the backend-side ports; they need to rebind */ + unbind_backend_ports(s); + + return 0; +} + static int xen_evtchn_post_load(void *opaque, int version_id) { XenEvtchnState *s = opaque; @@ -150,6 +172,7 @@ static const VMStateDescription xen_evtchn_vmstate = { .version_id = 1, .minimum_version_id = 1, .needed = xen_evtchn_is_needed, + .pre_load = xen_evtchn_pre_load, .post_load = xen_evtchn_post_load, .fields = (VMStateField[]) { VMSTATE_UINT64(callback_param, XenEvtchnState), @@ -416,6 +439,20 @@ static int assign_kernel_port(uint16_t type, evtchn_port_t port, return kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &ha); } +static int assign_kernel_eventfd(uint16_t type, evtchn_port_t port, int fd) +{ + struct kvm_xen_hvm_attr ha; + + ha.type = KVM_XEN_ATTR_TYPE_EVTCHN; + ha.u.evtchn.send_port = port; + ha.u.evtchn.type = type; + ha.u.evtchn.flags = 0; + ha.u.evtchn.deliver.eventfd.port = 0; + ha.u.evtchn.deliver.eventfd.fd = fd; + + return kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &ha); +} + static bool valid_port(evtchn_port_t port) { if (!port) { @@ -434,6 +471,32 @@ static bool valid_vcpu(uint32_t vcpu) return !!qemu_get_cpu(vcpu); } +static void unbind_backend_ports(XenEvtchnState *s) +{ + XenEvtchnPort *p; + int i; + + for (i = 1; i < s->nr_ports; i++) { + p = &s->port_table[i]; + if (p->type == EVTCHNSTAT_interdomain && + (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) ) { + evtchn_port_t be_port = p->type_val & PORT_INFO_TYPEVAL_REMOTE_PORT_MASK; + + if (s->be_handles[be_port]) { + /* This part will be overwritten on the load anyway. */ + p->type = EVTCHNSTAT_unbound; + p->type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU; + + /* Leave the backend port open and unbound too. */ + if (kvm_xen_has_cap(EVTCHN_SEND)) { + deassign_kernel_port(i); + } + s->be_handles[be_port]->guest_port = 0; + } + } + } +} + int xen_evtchn_status_op(struct evtchn_status *status) { XenEvtchnState *s = xen_evtchn_singleton; @@ -869,7 +932,14 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port) case EVTCHNSTAT_interdomain: if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) { - /* Not yet implemented. This can't happen! */ + uint16_t be_port = p->type_val & ~PORT_INFO_TYPEVAL_REMOTE_QEMU; + struct xenevtchn_handle *xc = s->be_handles[be_port]; + if (xc) { + if (kvm_xen_has_cap(EVTCHN_SEND)) { + deassign_kernel_port(port); + } + xc->guest_port = 0; + } } else { /* Loopback interdomain */ XenEvtchnPort *rp = &s->port_table[p->type_val]; @@ -1101,8 +1171,27 @@ int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain) } if (interdomain->remote_dom == DOMID_QEMU) { - /* We haven't hooked up QEMU's PV drivers to this yet */ - ret = -ENOSYS; + struct xenevtchn_handle *xc = s->be_handles[interdomain->remote_port]; + XenEvtchnPort *lp = &s->port_table[interdomain->local_port]; + + if (!xc) { + ret = -ENOENT; + goto out_free_port; + } + + if (xc->guest_port) { + ret = -EBUSY; + goto out_free_port; + } + + assert(xc->be_port == interdomain->remote_port); + xc->guest_port = interdomain->local_port; + if (kvm_xen_has_cap(EVTCHN_SEND)) { + assign_kernel_eventfd(lp->type, xc->guest_port, xc->fd); + } + lp->type = EVTCHNSTAT_interdomain; + lp->type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU | interdomain->remote_port; + ret = 0; } else { /* Loopback */ XenEvtchnPort *rp = &s->port_table[interdomain->remote_port]; @@ -1120,6 +1209,7 @@ int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain) } } + out_free_port: if (ret) { free_port(s, interdomain->local_port); } @@ -1184,11 +1274,16 @@ int xen_evtchn_send_op(struct evtchn_send *send) if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) { /* * This is an event from the guest to qemu itself, which is - * serving as the driver domain. Not yet implemented; it will - * be hooked up to the qemu implementation of xenstore, - * console, PV net/block drivers etc. + * serving as the driver domain. */ - ret = -ENOSYS; + uint16_t be_port = p->type_val & ~PORT_INFO_TYPEVAL_REMOTE_QEMU; + struct xenevtchn_handle *xc = s->be_handles[be_port]; + if (xc) { + eventfd_write(xc->fd, 1); + ret = 0; + } else { + ret = -ENOENT; + } } else { /* Loopback interdomain ports; just a complex IPI */ set_port_pending(s, p->type_val); @@ -1244,6 +1339,237 @@ int xen_evtchn_set_port(uint16_t port) return ret; } +struct xenevtchn_handle *xen_be_evtchn_open(void) +{ + struct xenevtchn_handle *xc = g_new0(struct xenevtchn_handle, 1); + + xc->fd = eventfd(0, EFD_CLOEXEC); + if (xc->fd < 0) { + free(xc); + return NULL; + } + + return xc; +} + +static int find_be_port(XenEvtchnState *s, struct xenevtchn_handle *xc) +{ + int i; + + for (i = 1; i < EVTCHN_2L_NR_CHANNELS; i++) { + if (!s->be_handles[i]) { + s->be_handles[i] = xc; + xc->be_port = i; + return i; + } + } + return 0; +} + +int xen_be_evtchn_bind_interdomain(struct xenevtchn_handle *xc, uint32_t domid, + evtchn_port_t guest_port) +{ + XenEvtchnState *s = xen_evtchn_singleton; + XenEvtchnPort *gp; + uint16_t be_port = 0; + int ret; + + if (!s) { + return -ENOTSUP; + } + + if (!xc) { + return -EFAULT; + } + + if (domid != xen_domid) { + return -ESRCH; + } + + if (!valid_port(guest_port)) { + return -EINVAL; + } + + qemu_mutex_lock(&s->port_lock); + + /* The guest has to have an unbound port waiting for us to bind */ + gp = &s->port_table[guest_port]; + + switch (gp->type) { + case EVTCHNSTAT_interdomain: + /* Allow rebinding after migration, preserve port # if possible */ + be_port = gp->type_val & ~PORT_INFO_TYPEVAL_REMOTE_QEMU; + assert(be_port != 0); + if (!s->be_handles[be_port]) { + s->be_handles[be_port] = xc; + xc->guest_port = guest_port; + ret = xc->be_port = be_port; + if (kvm_xen_has_cap(EVTCHN_SEND)) { + assign_kernel_eventfd(gp->type, guest_port, xc->fd); + } + break; + } + /* fall through */ + + case EVTCHNSTAT_unbound: + be_port = find_be_port(s, xc); + if (!be_port) { + ret = -ENOSPC; + goto out; + } + + gp->type = EVTCHNSTAT_interdomain; + gp->type_val = be_port | PORT_INFO_TYPEVAL_REMOTE_QEMU; + xc->guest_port = guest_port; + if (kvm_xen_has_cap(EVTCHN_SEND)) { + assign_kernel_eventfd(gp->type, guest_port, xc->fd); + } + ret = be_port; + break; + + default: + ret = -EINVAL; + break; + } + + out: + qemu_mutex_unlock(&s->port_lock); + + return ret; +} + +int xen_be_evtchn_unbind(struct xenevtchn_handle *xc, evtchn_port_t port) +{ + XenEvtchnState *s = xen_evtchn_singleton; + int ret; + + if (!s) { + return -ENOTSUP; + } + + if (!xc) { + return -EFAULT; + } + + qemu_mutex_lock(&s->port_lock); + + if (port && port != xc->be_port) { + ret = -EINVAL; + goto out; + } + + if (xc->guest_port) { + XenEvtchnPort *gp = &s->port_table[xc->guest_port]; + + /* This should never *not* be true */ + if (gp->type == EVTCHNSTAT_interdomain) { + gp->type = EVTCHNSTAT_unbound; + gp->type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU; + } + + if (kvm_xen_has_cap(EVTCHN_SEND)) { + deassign_kernel_port(xc->guest_port); + } + xc->guest_port = 0; + } + + s->be_handles[xc->be_port] = NULL; + xc->be_port = 0; + ret = 0; + out: + qemu_mutex_unlock(&s->port_lock); + return ret; +} + +int xen_be_evtchn_close(struct xenevtchn_handle *xc) +{ + if (!xc) { + return -EFAULT; + } + + xen_be_evtchn_unbind(xc, 0); + + close(xc->fd); + free(xc); + return 0; +} + +int xen_be_evtchn_fd(struct xenevtchn_handle *xc) +{ + if (!xc) { + return -1; + } + return xc->fd; +} + +int xen_be_evtchn_notify(struct xenevtchn_handle *xc, evtchn_port_t port) +{ + XenEvtchnState *s = xen_evtchn_singleton; + int ret; + + if (!s) { + return -ENOTSUP; + } + + if (!xc) { + return -EFAULT; + } + + qemu_mutex_lock(&s->port_lock); + + if (xc->guest_port) { + set_port_pending(s, xc->guest_port); + ret = 0; + } else { + ret = -ENOTCONN; + } + + qemu_mutex_unlock(&s->port_lock); + + return ret; +} + +int xen_be_evtchn_pending(struct xenevtchn_handle *xc) +{ + uint64_t val; + + if (!xc) { + return -EFAULT; + } + + if (!xc->be_port) { + return 0; + } + + if (eventfd_read(xc->fd, &val)) { + return -errno; + } + + return val ? xc->be_port : 0; +} + +int xen_be_evtchn_unmask(struct xenevtchn_handle *xc, evtchn_port_t port) +{ + if (!xc) { + return -EFAULT; + } + + if (xc->be_port != port) { + return -EINVAL; + } + + /* + * We don't actually do anything to unmask it; the event was already + * consumed in xen_be_evtchn_pending(). + */ + return 0; +} + +int xen_be_evtchn_get_guest_port(struct xenevtchn_handle *xc) +{ + return xc->guest_port; +} + static const char *type_names[] = { "closed", "unbound", diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index d85b45067b..b7b6f4e592 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -14,6 +14,8 @@ #include "hw/sysbus.h" +typedef uint32_t evtchn_port_t; + void xen_evtchn_create(void); int xen_evtchn_soft_reset(void); int xen_evtchn_set_callback_param(uint64_t param); @@ -22,6 +24,23 @@ void xen_evtchn_set_callback_level(int level); int xen_evtchn_set_port(uint16_t port); +/* + * These functions mirror the libxenevtchn library API, providing the QEMU + * backend side of "interdomain" event channels. + */ +struct xenevtchn_handle; +struct xenevtchn_handle *xen_be_evtchn_open(void); +int xen_be_evtchn_bind_interdomain(struct xenevtchn_handle *xc, uint32_t domid, + evtchn_port_t guest_port); +int xen_be_evtchn_unbind(struct xenevtchn_handle *xc, evtchn_port_t port); +int xen_be_evtchn_close(struct xenevtchn_handle *xc); +int xen_be_evtchn_fd(struct xenevtchn_handle *xc); +int xen_be_evtchn_notify(struct xenevtchn_handle *xc, evtchn_port_t port); +int xen_be_evtchn_unmask(struct xenevtchn_handle *xc, evtchn_port_t port); +int xen_be_evtchn_pending(struct xenevtchn_handle *xc); +/* Apart from this which is a local addition */ +int xen_be_evtchn_get_guest_port(struct xenevtchn_handle *xc); + void hmp_xen_event_inject(Monitor *mon, const QDict *qdict); void hmp_xen_event_list(Monitor *mon, const QDict *qdict);