Patchwork [XEN,RFC,03/15] hvm-pci: Handle PCI config space in Xen

login
register
mail settings
Submitter Julien Grall
Date March 22, 2012, 3:59 p.m.
Message ID <869b10d6ef10def49aa8a94524c2949e6a6039f0.1332430810.git.julien.grall@citrix.com>
Download mbox | patch
Permalink /patch/148322/
State New
Headers show

Comments

Julien Grall - March 22, 2012, 3:59 p.m.
Add function to register a bdf with a server.
To handle cf8 -> cff we add an handler with register_portio_handle.
When Xen reveice a pio for cf8, it's store the value inside the current
vcpu until it receives a pio for cfc -> cff.
In this case, it checks if the bdf is registered and forge the ioreq
that will be forward to server later.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
---
 xen/arch/x86/hvm/Makefile   |    1 +
 xen/arch/x86/hvm/pci_emul.c |  147 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 148 insertions(+), 0 deletions(-)
 create mode 100644 xen/arch/x86/hvm/pci_emul.c
Jan Beulich - March 23, 2012, 8:29 a.m.
>>> On 22.03.12 at 16:59, Julien Grall <julien.grall@citrix.com> wrote:
> --- /dev/null
> +++ b/xen/arch/x86/hvm/pci_emul.c
> @@ -0,0 +1,147 @@
> +#include <asm/hvm/support.h>
> +#include <xen/hvm/pci_emul.h>
> +#include <xen/pci.h>
> +#include <xen/sched.h>
> +#include <xen/xmalloc.h>
> +
> +#define PCI_DEBUGSTR "%x:%x.%x"
> +#define PCI_DEBUG(bdf) ((bdf) >> 16) & 0xff, ((bdf) >> 11) & 0x1f, ((bdf) >> 8) & 
> 0x7
> +
> +static int handle_config_space(int dir, uint32_t port, uint32_t bytes,
> +                               uint32_t *val)
> +{
> +    uint32_t pci_cf8;
> +    struct pci_device_emul *pci;
> +    ioreq_t *p = get_ioreq(current);
> +    int rc = X86EMUL_UNHANDLEABLE;
> +    struct vcpu *v = current;
> +
> +    spin_lock(&v->domain->arch.hvm_domain.pci_root.pci_lock);
> +
> +    if (port == 0xcf8)

You need to be considerably more careful here: This code should
handle only 32-bit wide aligned accesses, and you need to make
sure that everything else still gets properly forwarded to qemu
(so that e.g. port CF9 could still be properly emulated if desired).

> +    {
> +        rc = X86EMUL_OKAY;
> +        v->arch.hvm_vcpu.pci_cf8 = *val;
> +        goto end_handle;
> +    }
> +
> +    pci_cf8 = v->arch.hvm_vcpu.pci_cf8;
> +
> +    /* Retrieve PCI */
> +    pci = v->domain->arch.hvm_domain.pci_root.pci;
> +
> +    while (pci && !PCI_CMP_BDF(pci, pci_cf8))
> +        pci = pci->next;

Is there a reasonably low enforced boundary on the number
of devices? Otherwise, a linear lookup would seem overly
simple to me.

Further, with how PCI_CMP_BDF() is defined, you're doing the
wrong thing here anyway - bit 31 is required to be set for the
port CFC access to be a config space one. Plus there's an AMD
extension to this interface, so I think other than shifting out
the low 8 bits and checking that the high bit is set, you shouldn't
do any other masking here.

Jan

> +
> +    /* We just fill the ioreq, hvm_send_assist_req will send the request */
> +    if (unlikely(pci == NULL))
> +    {
> +        *val = ~0;
> +        rc = X86EMUL_OKAY;
> +        goto end_handle;
> +    }
> +
> +    p->type = IOREQ_TYPE_PCI_CONFIG;
> +    p->addr = (pci_cf8 & ~3) + (p->addr & 3);
> +
> +    set_ioreq(v, &pci->server->ioreq, p);
> +
> +end_handle:
> +    spin_unlock(&v->domain->arch.hvm_domain.pci_root.pci_lock);
> +    return rc;
> +}
Julien Grall - March 26, 2012, 12:20 p.m.
On 03/23/2012 08:29 AM, Jan Beulich wrote:
> Is there a reasonably low enforced boundary on the number
> of devices? Otherwise, a linear lookup would seem overly
> simple to me.
>    
The maximum of bdf is 2^16 => 65536.
Which kind of structure do you advice ? Array ? Hash Table ?

> Further, with how PCI_CMP_BDF() is defined, you're doing the
> wrong thing here anyway - bit 31 is required to be set for the
> port CFC access to be a config space one. Plus there's an AMD
> extension to this interface, so I think other than shifting out
> the low 8 bits and checking that the high bit is set, you shouldn't
> do any other masking here.
>    
Actually in config address register the 24-30 bits are reserved.
So, do I need to mask it ?

Moreover what is the AMD extension ?

> Jan
>
>    
>> +
>> +    /* We just fill the ioreq, hvm_send_assist_req will send the request */
>> +    if (unlikely(pci == NULL))
>> +    {
>> +        *val = ~0;
>> +        rc = X86EMUL_OKAY;
>> +        goto end_handle;
>> +    }
>> +
>> +    p->type = IOREQ_TYPE_PCI_CONFIG;
>> +    p->addr = (pci_cf8&  ~3) + (p->addr&  3);
>> +
>> +    set_ioreq(v,&pci->server->ioreq, p);
>> +
>> +end_handle:
>> +    spin_unlock(&v->domain->arch.hvm_domain.pci_root.pci_lock);
>> +    return rc;
>> +}
>>      
>
>
Jan Beulich - March 26, 2012, 12:52 p.m.
>>> On 26.03.12 at 14:20, Julien Grall <julien.grall@citrix.com> wrote:
> On 03/23/2012 08:29 AM, Jan Beulich wrote:
>> Is there a reasonably low enforced boundary on the number
>> of devices? Otherwise, a linear lookup would seem overly
>> simple to me.
>>    
> The maximum of bdf is 2^16 => 65536.
> Which kind of structure do you advice ? Array ? Hash Table ?

Radix tree, especially if you fold in the segment number.

>> Further, with how PCI_CMP_BDF() is defined, you're doing the
>> wrong thing here anyway - bit 31 is required to be set for the
>> port CFC access to be a config space one. Plus there's an AMD
>> extension to this interface, so I think other than shifting out
>> the low 8 bits and checking that the high bit is set, you shouldn't
>> do any other masking here.
>>    
> Actually in config address register the 24-30 bits are reserved.
> So, do I need to mask it ?

Not necessarily - I'd suggest considering the part of the address
(which should generally result in a mismatch on any comparison. This
so that this ...

> Moreover what is the AMD extension ?

... can work without additional code. For an implementation, please
have a look at current Linux'es arch/x86/pci/direct.c - bits 24...27
are used for extended config space register accesses (which will be
needed for advanced PCIe or PCI-X functionality, and which may
be particularly important as long as we don't emulate MMCFG - at
least I don't think we do).

Jan

Patch

diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index eea5555..585e9c9 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -12,6 +12,7 @@  obj-y += irq.o
 obj-y += mtrr.o
 obj-y += nestedhvm.o
 obj-y += pmtimer.o
+obj-y += pci_emul.o
 obj-y += quirks.o
 obj-y += rtc.o
 obj-y += save.o
diff --git a/xen/arch/x86/hvm/pci_emul.c b/xen/arch/x86/hvm/pci_emul.c
new file mode 100644
index 0000000..b390e73
--- /dev/null
+++ b/xen/arch/x86/hvm/pci_emul.c
@@ -0,0 +1,147 @@ 
+#include <asm/hvm/support.h>
+#include <xen/hvm/pci_emul.h>
+#include <xen/pci.h>
+#include <xen/sched.h>
+#include <xen/xmalloc.h>
+
+#define PCI_DEBUGSTR "%x:%x.%x"
+#define PCI_DEBUG(bdf) ((bdf) >> 16) & 0xff, ((bdf) >> 11) & 0x1f, ((bdf) >> 8) & 0x7
+
+static int handle_config_space(int dir, uint32_t port, uint32_t bytes,
+                               uint32_t *val)
+{
+    uint32_t pci_cf8;
+    struct pci_device_emul *pci;
+    ioreq_t *p = get_ioreq(current);
+    int rc = X86EMUL_UNHANDLEABLE;
+    struct vcpu *v = current;
+
+    spin_lock(&v->domain->arch.hvm_domain.pci_root.pci_lock);
+
+    if (port == 0xcf8)
+    {
+        rc = X86EMUL_OKAY;
+        v->arch.hvm_vcpu.pci_cf8 = *val;
+        goto end_handle;
+    }
+
+    pci_cf8 = v->arch.hvm_vcpu.pci_cf8;
+
+    /* Retrieve PCI */
+    pci = v->domain->arch.hvm_domain.pci_root.pci;
+
+    while (pci && !PCI_CMP_BDF(pci, pci_cf8))
+        pci = pci->next;
+
+    /* We just fill the ioreq, hvm_send_assist_req will send the request */
+    if (unlikely(pci == NULL))
+    {
+        *val = ~0;
+        rc = X86EMUL_OKAY;
+        goto end_handle;
+    }
+
+    p->type = IOREQ_TYPE_PCI_CONFIG;
+    p->addr = (pci_cf8 & ~3) + (p->addr & 3);
+
+    set_ioreq(v, &pci->server->ioreq, p);
+
+end_handle:
+    spin_unlock(&v->domain->arch.hvm_domain.pci_root.pci_lock);
+    return rc;
+}
+
+int hvm_register_pcidev(domid_t domid, unsigned int id, u16 bdf)
+{
+    struct domain *d;
+    struct hvm_ioreq_server *s;
+    struct pci_device_emul *x;
+    int rc = 0;
+
+    rc = rcu_lock_target_domain_by_id(domid, &d);
+
+    if (rc != 0)
+        return rc;
+
+    if (!is_hvm_domain(d))
+    {
+        rcu_unlock_domain(d);
+        return -EINVAL;
+    }
+
+    /* Search server */
+    spin_lock(&d->arch.hvm_domain.ioreq_server_lock);
+    s = d->arch.hvm_domain.ioreq_server_list;
+    while ((s != NULL) && (s->id != id))
+        s = s->next;
+
+    if (s == NULL)
+    {
+        dprintk(XENLOG_DEBUG, "Cannot find server\n");
+        rc = -ENOENT;
+        goto create_end;
+    }
+
+    spin_unlock(&d->arch.hvm_domain.ioreq_server_lock);
+
+    spin_lock(&d->arch.hvm_domain.pci_root.pci_lock);
+    x = xmalloc(struct pci_device_emul);
+
+    if (!x)
+    {
+        dprintk(XENLOG_DEBUG, "Cannot allocate pci\n");
+        rc = -ENOMEM;
+        goto create_end;
+    }
+
+    x->bdf = PCI_MASK_BDF(bdf);
+    x->server = s;
+    x->next = d->arch.hvm_domain.pci_root.pci;
+    d->arch.hvm_domain.pci_root.pci = x;
+
+create_end:
+    spin_unlock(&d->arch.hvm_domain.pci_root.pci_lock);
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+int hvm_init_pci_emul(struct domain *d)
+{
+    struct pci_root_emul *root = &d->arch.hvm_domain.pci_root;
+
+    spin_lock_init(&root->pci_lock);
+
+    root->pci = NULL;
+
+    /* Register the config space handler */
+    register_portio_handler(d, 0xcf8, 8, handle_config_space);
+
+    return 0;
+}
+
+void hvm_destroy_pci_emul(struct domain *d)
+{
+    struct pci_root_emul *root = &d->arch.hvm_domain.pci_root;
+    struct pci_device_emul *p;
+
+    spin_lock(&root->pci_lock);
+
+    while ( (p = root->pci) != NULL )
+    {
+        root->pci = p->next;
+        xfree(p);
+    }
+
+    spin_unlock(&root->pci_lock);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */