Message ID | eb931c94b4cd00da7e1e74896b2f9531b56ea357.1332430811.git.julien.grall@citrix.com |
---|---|
State | New |
Headers | show |
Julien Grall writes ("[Xen-devel] [XEN][RFC PATCH 14/15] xl-parsing: Parse the new option device_models"): > For the support of multiple ioreq server, we add a new option "device_models". > It's an array of device model, for each device model, we need to specify > which pci, IO range (MMIO, PIO) will be allow. I don't think this is really a suitable interface. The PCI space in the guest is controlled by the device models(s) and the user should surely specify which devices should be provided by which dms, in terms of devices not in terms of PCI space. Ian.
On Mon, 2 Apr 2012, Ian Jackson wrote: > Julien Grall writes ("[Xen-devel] [XEN][RFC PATCH 14/15] xl-parsing: Parse the new option device_models"): > > For the support of multiple ioreq server, we add a new option "device_models". > > It's an array of device model, for each device model, we need to specify > > which pci, IO range (MMIO, PIO) will be allow. > > I don't think this is really a suitable interface. The PCI space in > the guest is controlled by the device models(s) and the user should > surely specify which devices should be provided by which dms, in terms > of devices not in terms of PCI space. Julien added a name parameter to select the device, maybe we need something clearer? Specifying the PCI address is important, because we have to make sure the PCI addresses of the devices remain the same in a given VM across multiple boots. Thus we could make it optional from the user POV, but in that case we need a clear, well defined and stable algorithm in xl to figure out a PCI address from a given config file.
Stefano Stabellini writes ("Re: [Xen-devel] [XEN][RFC PATCH 14/15] xl-parsing: Parse the new option device_models"): > On Mon, 2 Apr 2012, Ian Jackson wrote: > > I don't think this is really a suitable interface. The PCI space in > > the guest is controlled by the device models(s) and the user should > > surely specify which devices should be provided by which dms, in terms > > of devices not in terms of PCI space. > > Julien added a name parameter to select the device, maybe we need > something clearer? > Specifying the PCI address is important, because we have to make sure > the PCI addresses of the devices remain the same in a given VM across > multiple boots. Are the PCI addresses not assigned in a deterministic fashion by code in qemu-dm, in this case in the qemu-dm which is emulating the pci bridge ? If not then that needs to be fixed, surely ? Ian.
On 04/03/2012 02:31 PM, Ian Jackson wrote: > Stefano Stabellini writes ("Re: [Xen-devel] [XEN][RFC PATCH 14/15] xl-parsing: Parse the new option device_models"): > >> On Mon, 2 Apr 2012, Ian Jackson wrote: >> >>> I don't think this is really a suitable interface. The PCI space in >>> the guest is controlled by the device models(s) and the user should >>> surely specify which devices should be provided by which dms, in terms >>> of devices not in terms of PCI space. >>> >> >> Julien added a name parameter to select the device, maybe we need >> something clearer? >> Specifying the PCI address is important, because we have to make sure >> the PCI addresses of the devices remain the same in a given VM across >> multiple boots. >> > Are the PCI addresses not assigned in a deterministic fashion by code > in qemu-dm, in this case in the qemu-dm which is emulating the pci > bridge ? If not then that needs to be fixed, surely ? > Indeed but each QEMU emulate a subset of the hardware. So how QEMU can know the available PCI addresses ? I think that toolstack must allocate the BDF, otherwise we need to have communication between each qemu-dm.
Julien Grall writes ("Re: [Xen-devel] [XEN][RFC PATCH 14/15] xl-parsing: Parse the new option device_models"): > On 04/03/2012 02:31 PM, Ian Jackson wrote: > > Are the PCI addresses not assigned in a deterministic fashion by code > > in qemu-dm, in this case in the qemu-dm which is emulating the pci > > bridge ? If not then that needs to be fixed, surely ? > > Indeed but each QEMU emulate a subset of the hardware. > So how QEMU can know the available PCI addresses ? > I think that toolstack must allocate the BDF, otherwise we need to have > communication between each qemu-dm. Currently the bdfs are allocated by the single qemu-dm, right ? Why cannot that functionality stay there, with the pci bridge emulation ? Ian.
On Tue, 3 Apr 2012, Ian Jackson wrote: > Julien Grall writes ("Re: [Xen-devel] [XEN][RFC PATCH 14/15] xl-parsing: Parse the new option device_models"): > > On 04/03/2012 02:31 PM, Ian Jackson wrote: > > > Are the PCI addresses not assigned in a deterministic fashion by code > > > in qemu-dm, in this case in the qemu-dm which is emulating the pci > > > bridge ? If not then that needs to be fixed, surely ? > > > > Indeed but each QEMU emulate a subset of the hardware. > > So how QEMU can know the available PCI addresses ? > > I think that toolstack must allocate the BDF, otherwise we need to have > > communication between each qemu-dm. > > Currently the bdfs are allocated by the single qemu-dm, right ? Why > cannot that functionality stay there, with the pci bridge emulation ? Because the allocation of most BDFs in QEMU is done ad-hoc in a first come first served basis. If the first QEMU is not going to emulate these devices then it is not going to allocate the BDF for them either.
Stefano Stabellini writes ("Re: [Xen-devel] [XEN][RFC PATCH 14/15] xl-parsing: Parse the new option device_models"): > On Tue, 3 Apr 2012, Ian Jackson wrote: > > Currently the bdfs are allocated by the single qemu-dm, right ? Why > > cannot that functionality stay there, with the pci bridge emulation ? > > Because the allocation of most BDFs in QEMU is done ad-hoc in a first > come first served basis. If the first QEMU is not going to emulate these > devices then it is not going to allocate the BDF for them either. Do we want the bdf allocation to be stable when switching between the combined and disaggregated qemu-dms ? If we do then it still needs to be done by that qemu-dm using the same algorithm, presumably with additional ipc. If we don't then it would be ok for the bdfs to be allocated by new code in the toolstack somewhere, eg libxl, but it still shouldn't be up to the user to configure. Ian.
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index e44fcfa..e35d382 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -57,7 +57,7 @@ $(LIBXL_OBJS): CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenguest) $(CFLAGS_lib AUTOINCS= libxlu_cfg_y.h libxlu_cfg_l.h _libxl_list.h AUTOSRCS= libxlu_cfg_y.c libxlu_cfg_l.c LIBXLU_OBJS = libxlu_cfg_y.o libxlu_cfg_l.o libxlu_cfg.o \ - libxlu_disk_l.o libxlu_disk.o + libxlu_disk_l.o libxlu_disk.o libxlu_dm.o $(LIBXLU_OBJS): CFLAGS += $(CFLAGS_libxenctrl) # For xentoollog.h CLIENTS = xl testidl diff --git a/tools/libxl/libxlu_dm.c b/tools/libxl/libxlu_dm.c new file mode 100644 index 0000000..928c7d2 --- /dev/null +++ b/tools/libxl/libxlu_dm.c @@ -0,0 +1,202 @@ +#include "libxl_osdeps.h" /* must come before any other headers */ +#include <stdlib.h> +#include "libxlu_internal.h" +#include "libxlu_cfg_i.h" + +static void split_string_into_string_list(const char *str, + const char *delim, + libxl_string_list *psl) +{ + char *s, *saveptr; + const char *p; + libxl_string_list sl; + + int i = 0, nr = 0; + + s = strdup(str); + if (s == NULL) { + fprintf(stderr, "xlu_dm: unable to allocate memory\n"); + exit(-1); + } + + /* Count number of entries */ + p = strtok_r(s, delim, &saveptr); + do { + nr++; + } while ((p = strtok_r(NULL, delim, &saveptr))); + + free(s); + + s = strdup(str); + + sl = malloc((nr+1) * sizeof (char *)); + if (sl == NULL) { + fprintf(stderr, "xlu_dm: unable to allocate memory\n"); + exit(-1); + } + + p = strtok_r(s, delim, &saveptr); + do { + assert(i < nr); + // Skip blank + while (*p == ' ') + p++; + sl[i] = strdup(p); + i++; + } while ((p = strtok_r(NULL, delim, &saveptr))); + sl[i] = NULL; + + *psl = sl; + + free(s); +} + +static int xlu_dm_check_pci(const char *pci) +{ + unsigned long bus; + unsigned long function; + unsigned long device; + char *buf; + + while (*pci == ' ') + pci++; + + bus = strtol(pci, &buf, 16); + if (pci == buf || *buf != ':') + return 1; + + pci = buf + 1; + device = strtol(pci, &buf, 16); + if (pci == buf || *buf != '.') + return 1; + + pci = buf + 1; + function = strtol(pci, &buf, 16); + if (pci == buf) + return 1; + + pci = buf; + + while (*pci == ' ') + pci++; + + if (*pci != '\0') + return 1; + + buf[0] = '\0'; + + if (bus > 0xff || device > 0x1f || function > 0x7 + || (bus == 0xff && device == 0x1f && function == 0x7)) + return 1; + + return 0; +} + +static int xlu_dm_check_range(const char *range) +{ + unsigned long begin; + unsigned long end; + char *buf; + + begin = strtol(range, &buf, 0); + if (buf == range) + return 0; + + if (*buf == '-') + { + range = buf + 1; + end = strtol(range, &buf, 0); + if (buf == range) + return 1; + } + else + end = begin; + + range = buf; + + while (*range == ' ') + range++; + + if (*range != '\0') + return 1; + + buf[0] = '\0'; + + if (begin > end) + return 1; + + return 0; +} + +int xlu_dm_parse(XLU_Config *cfg, const char *spec, + libxl_dm *dm) +{ + char *buf = strdup(spec); + char *p, *p2; + int i = 0; + int rc = 0; + + p = strtok (buf, ","); + if (!p) + goto skip_dm; + do { + while (*p == ' ') + p++; + if ((p2 = strchr (p, '=')) == NULL) + break; + *p2 = '\0'; + if (!strcmp (p, "name")) + dm->name = strdup (p2 + 1); + else if (!strcmp (p, "path")) + dm->path = strdup (p2 + 1); + else if (!strcmp (p, "pci")) + { + split_string_into_string_list(p2 + 1, ";", &dm->pcis); + for (i = 0; dm->pcis[i]; i++) + { + if (xlu_dm_check_pci(dm->pcis[i])) + { + fprintf(stderr, "xlu_dm: invalid pci \"%s\"\n", + dm->pcis[i]); + rc = 1; + } + } + } + else if (!strcmp (p, "mmio")) + { + split_string_into_string_list(p2 + 1, ";", &dm->mmios); + for (i = 0; dm->mmios[i]; i++) + { + if (xlu_dm_check_range(dm->mmios[i])) + { + fprintf(stderr, "xlu_dm: invalid mmio range \"%s\"\n", + dm->mmios[i]); + rc = 1; + } + } + } + else if (!strcmp (p, "pio")) + { + split_string_into_string_list(p2 + 1, ";", &dm->pios); + for (i = 0; dm->pios[i]; i++) + { + if (xlu_dm_check_range(dm->pios[i])) + { + fprintf(stderr, "xlu_dm: invalid pio range \"%s\"\n", + dm->pios[i]); + rc = 1; + } + } + } + } while ((p = strtok (NULL, ",")) != NULL); + + if (!dm->name && dm->path) + { + fprintf (stderr, "xl: Unable to parse device_deamon\n"); + exit (-ERROR_FAIL); + } +skip_dm: + free(buf); + + return rc; +} diff --git a/tools/libxl/libxlutil.h b/tools/libxl/libxlutil.h index 620b9db..8eb7e16 100644 --- a/tools/libxl/libxlutil.h +++ b/tools/libxl/libxlutil.h @@ -88,6 +88,11 @@ int xlu_disk_parse(XLU_Config *cfg, int nspecs, const char *const *specs, * resulting disk struct is used with libxl. */ +/* + * Daemon specification parsing. + */ +int xlu_dm_parse(XLU_Config *cfg, const char *spec, + libxl_dm *dm); #endif /* LIBXLUTIL_H */ diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index 1d59b89..5473faf 100644 --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -515,7 +515,7 @@ static void parse_config_data(const char *configfile_filename_report, const char *buf; long l; XLU_Config *config; - XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids; + XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *dms; int pci_power_mgmt = 0; int pci_msitranslate = 1; int e; @@ -1119,6 +1119,9 @@ skip_vfb: } else if (!strcmp(buf, "qemu-xen")) { b_info->device_model_version = LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN; + } else if (!strcmp(buf, "multiple-qemu-xen") && c_info->type == LIBXL_DOMAIN_TYPE_HVM) { + b_info->device_model_version + = LIBXL_DEVICE_MODEL_VERSION_MULTIPLE_QEMU_XEN; } else { fprintf(stderr, "Unknown device_model_version \"%s\" specified\n", buf); @@ -1144,6 +1147,29 @@ skip_vfb: } } } + + if (b_info->device_model_version + == LIBXL_DEVICE_MODEL_VERSION_MULTIPLE_QEMU_XEN) { + if (!xlu_cfg_get_list (config, "device_models", &dms, 0, 0)) { + d_config->num_dms = 0; + d_config->dms = NULL; + while ((buf = xlu_cfg_get_listitem (dms, d_config->num_dms)) + != NULL) { + libxl_dm *dm; + + d_config->dms = (libxl_dm *) realloc (d_config->dms, sizeof (libxl_dm) * (d_config->num_dms + 1)); + dm = d_config->dms + d_config->num_dms; + libxl_dm_init (dm); + dm->id = d_config->num_dms + 1; + if (xlu_dm_parse(config, buf, dm)) + exit(-ERROR_FAIL); + + d_config->num_dms++; + } + b_info->u.hvm.max_servers = d_config->num_dms; + } + } + #define parse_extra_args(type) \ e = xlu_cfg_get_list_as_string_list(config, "device_model_args"#type, \ &b_info->extra##type, 0); \
For the support of multiple ioreq server, we add a new option "device_models". It's an array of device model, for each device model, we need to specify which pci, IO range (MMIO, PIO) will be allow. For instance, if we want a QEMU which handle a specify PCI: name=net, path=/path/to/qemu-wrapper, pci=00:4.0 Signed-off-by: Julien Grall <julien.grall@citrix.com> --- tools/libxl/Makefile | 2 +- tools/libxl/libxlu_dm.c | 202 ++++++++++++++++++++++++++++++++++++++++++++++ tools/libxl/libxlutil.h | 5 + tools/libxl/xl_cmdimpl.c | 28 ++++++- 4 files changed, 235 insertions(+), 2 deletions(-) create mode 100644 tools/libxl/libxlu_dm.c