Message ID | 1415900529-10629-1-git-send-email-roger.pau@citrix.com |
---|---|
State | New |
Headers | show |
On 11/13/2014 05:42 PM, Roger Pau Monne wrote: > This patch fixes two issues with persistent grants and the disk PV backend > (Qdisk): > > - Keep track of memory regions where persistent grants have been mapped > since we need to unmap them as a whole. It is not possible to unmap a > single grant if it has been batch-mapped. A new check has also been added > to make sure persistent grants are only used if the whole mapped region > can be persistently mapped in the batch_maps case. > - Unmap persistent grants before switching to the closed state, so the > frontend can also free them. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > Reported-by: George Dunlap <george.dunlap@eu.citrix.com> Tested-by: George Dunlap <george.dunlap@eu.citrix.com>
On Thu, Nov 13, 2014 at 06:42:09PM +0100, Roger Pau Monne wrote: > This patch fixes two issues with persistent grants and the disk PV backend > (Qdisk): > > - Keep track of memory regions where persistent grants have been mapped > since we need to unmap them as a whole. It is not possible to unmap a > single grant if it has been batch-mapped. A new check has also been added > to make sure persistent grants are only used if the whole mapped region > can be persistently mapped in the batch_maps case. > - Unmap persistent grants before switching to the closed state, so the > frontend can also free them. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > Reported-by: George Dunlap <george.dunlap@eu.citrix.com> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > Cc: Kevin Wolf <kwolf@redhat.com> > Cc: Stefan Hajnoczi <stefanha@redhat.com> > Cc: George Dunlap <george.dunlap@eu.citrix.com> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > Xen release exception: this is obviously an important bug-fix that should be > included in 4.5. Without this fix, trying to do a block-detach of a Qdisk > from a running guest can lead to guest crash depending on the blkfront > implementation. Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> on the spirit of having it in Xen 4.5 - but I am going to let Stefano decide if he would like to take this version or another one (if he spots something else that needs fixing). Thank you! > --- > Changea since v2: > - Expand the commit message to mention the newly added check. > - Don't use persistent_regions in the !batch_maps case, we can use the > previous implementation which works fine in that case. > > Changes since v1: > - Instead of disabling batch mappings when using persistent grants, keep > track of the persistently mapped regions and unmap them on disconnection. > This prevents unmapping single persistent grants, but the current > implementation doesn't require it. > > This new version is slightly faster than the previous one, the following > test results are in IOPS from 20 runs of a block-attach, fio run, > block-detach test loop: > > x batch > + no_batch > +----------------------------------------------------------------------+ > | + + x x + + + x xx x | > |+ + x x+ *+++ * +x* +++x* x xx x *| > | |_____________A_____M______|| > | |________________A_____M___________| | > +----------------------------------------------------------------------+ > N Min Max Median Avg Stddev > x 20 52941 63822 62396 61180.15 2673.6497 > + 20 49967 63849 60322 59116.9 3456.3455 > Difference at 95.0% confidence > -2063.25 +/- 1977.66 > -3.37242% +/- 3.23252% > (Student's t, pooled s = 3089.88) > --- > hw/block/xen_disk.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++----- > 1 file changed, 66 insertions(+), 6 deletions(-) > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index 231e9a7..21842a0 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -59,6 +59,13 @@ struct PersistentGrant { > > typedef struct PersistentGrant PersistentGrant; > > +struct PersistentRegion { > + void *addr; > + int num; > +}; > + > +typedef struct PersistentRegion PersistentRegion; > + > struct ioreq { > blkif_request_t req; > int16_t status; > @@ -118,6 +125,7 @@ struct XenBlkDev { > gboolean feature_discard; > gboolean feature_persistent; > GTree *persistent_gnts; > + GSList *persistent_regions; > unsigned int persistent_gnt_count; > unsigned int max_grants; > > @@ -177,6 +185,23 @@ static void destroy_grant(gpointer pgnt) > g_free(grant); > } > > +static void remove_persistent_region(gpointer data, gpointer dev) > +{ > + PersistentRegion *region = data; > + struct XenBlkDev *blkdev = dev; > + XenGnttab gnt = blkdev->xendev.gnttabdev; > + > + if (xc_gnttab_munmap(gnt, region->addr, region->num) != 0) { > + xen_be_printf(&blkdev->xendev, 0, > + "xc_gnttab_munmap region %p failed: %s\n", > + region->addr, strerror(errno)); > + } > + xen_be_printf(&blkdev->xendev, 3, > + "unmapped grant region %p with %d pages\n", > + region->addr, region->num); > + g_free(region); > +} > + > static struct ioreq *ioreq_start(struct XenBlkDev *blkdev) > { > struct ioreq *ioreq = NULL; > @@ -343,6 +368,7 @@ static int ioreq_map(struct ioreq *ioreq) > void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > int i, j, new_maps = 0; > PersistentGrant *grant; > + PersistentRegion *region; > /* domids and refs variables will contain the information necessary > * to map the grants that are needed to fulfill this request. > * > @@ -421,7 +447,22 @@ static int ioreq_map(struct ioreq *ioreq) > } > } > } > - if (ioreq->blkdev->feature_persistent) { > + if (ioreq->blkdev->feature_persistent && new_maps != 0 && > + (!batch_maps || (ioreq->blkdev->persistent_gnt_count + new_maps <= > + ioreq->blkdev->max_grants))) { > + /* > + * If we are using persistent grants and batch mappings only > + * add the new maps to the list of persistent grants if the whole > + * area can be persistently mapped. > + */ > + if (batch_maps) { > + region = g_malloc0(sizeof(*region)); > + region->addr = ioreq->pages; > + region->num = new_maps; > + ioreq->blkdev->persistent_regions = g_slist_append( > + ioreq->blkdev->persistent_regions, > + region); > + } > while ((ioreq->blkdev->persistent_gnt_count < ioreq->blkdev->max_grants) > && new_maps) { > /* Go through the list of newly mapped grants and add as many > @@ -447,6 +488,7 @@ static int ioreq_map(struct ioreq *ioreq) > grant); > ioreq->blkdev->persistent_gnt_count++; > } > + assert(!batch_maps || new_maps == 0); > } > for (i = 0; i < ioreq->v.niov; i++) { > ioreq->v.iov[i].iov_base += (uintptr_t)page[i]; > @@ -971,7 +1013,10 @@ static int blk_connect(struct XenDevice *xendev) > blkdev->max_grants = max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST; > blkdev->persistent_gnts = g_tree_new_full((GCompareDataFunc)int_cmp, > NULL, NULL, > + batch_maps ? > + (GDestroyNotify)g_free : > (GDestroyNotify)destroy_grant); > + blkdev->persistent_regions = NULL; > blkdev->persistent_gnt_count = 0; > } > > @@ -1000,6 +1045,26 @@ static void blk_disconnect(struct XenDevice *xendev) > blkdev->cnt_map--; > blkdev->sring = NULL; > } > + > + /* > + * Unmap persistent grants before switching to the closed state > + * so the frontend can free them. > + * > + * In the !batch_maps case g_tree_destroy will take care of unmapping > + * the grant, but in the batch_maps case we need to iterate over every > + * region in persistent_regions and unmap it. > + */ > + if (blkdev->feature_persistent) { > + g_tree_destroy(blkdev->persistent_gnts); > + assert(batch_maps || blkdev->persistent_gnt_count == 0); > + if (batch_maps) { > + blkdev->persistent_gnt_count = 0; > + g_slist_foreach(blkdev->persistent_regions, > + (GFunc)remove_persistent_region, blkdev); > + g_slist_free(blkdev->persistent_regions); > + } > + blkdev->feature_persistent = false; > + } > } > > static int blk_free(struct XenDevice *xendev) > @@ -1011,11 +1076,6 @@ static int blk_free(struct XenDevice *xendev) > blk_disconnect(xendev); > } > > - /* Free persistent grants */ > - if (blkdev->feature_persistent) { > - g_tree_destroy(blkdev->persistent_gnts); > - } > - > while (!QLIST_EMPTY(&blkdev->freelist)) { > ioreq = QLIST_FIRST(&blkdev->freelist); > QLIST_REMOVE(ioreq, list); > -- > 1.9.3 (Apple Git-50) >
On Thu, 13 Nov 2014, Roger Pau Monne wrote: > This patch fixes two issues with persistent grants and the disk PV backend > (Qdisk): > > - Keep track of memory regions where persistent grants have been mapped > since we need to unmap them as a whole. It is not possible to unmap a > single grant if it has been batch-mapped. A new check has also been added > to make sure persistent grants are only used if the whole mapped region > can be persistently mapped in the batch_maps case. > - Unmap persistent grants before switching to the closed state, so the > frontend can also free them. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > Reported-by: George Dunlap <george.dunlap@eu.citrix.com> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > Cc: Kevin Wolf <kwolf@redhat.com> > Cc: Stefan Hajnoczi <stefanha@redhat.com> > Cc: George Dunlap <george.dunlap@eu.citrix.com> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> I'll send a pull request and backport to qemu-xen. > Xen release exception: this is obviously an important bug-fix that should be > included in 4.5. Without this fix, trying to do a block-detach of a Qdisk > from a running guest can lead to guest crash depending on the blkfront > implementation. > --- > Changea since v2: > - Expand the commit message to mention the newly added check. > - Don't use persistent_regions in the !batch_maps case, we can use the > previous implementation which works fine in that case. > > Changes since v1: > - Instead of disabling batch mappings when using persistent grants, keep > track of the persistently mapped regions and unmap them on disconnection. > This prevents unmapping single persistent grants, but the current > implementation doesn't require it. > > This new version is slightly faster than the previous one, the following > test results are in IOPS from 20 runs of a block-attach, fio run, > block-detach test loop: > > x batch > + no_batch > +----------------------------------------------------------------------+ > | + + x x + + + x xx x | > |+ + x x+ *+++ * +x* +++x* x xx x *| > | |_____________A_____M______|| > | |________________A_____M___________| | > +----------------------------------------------------------------------+ > N Min Max Median Avg Stddev > x 20 52941 63822 62396 61180.15 2673.6497 > + 20 49967 63849 60322 59116.9 3456.3455 > Difference at 95.0% confidence > -2063.25 +/- 1977.66 > -3.37242% +/- 3.23252% > (Student's t, pooled s = 3089.88) > --- > hw/block/xen_disk.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++----- > 1 file changed, 66 insertions(+), 6 deletions(-) > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index 231e9a7..21842a0 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -59,6 +59,13 @@ struct PersistentGrant { > > typedef struct PersistentGrant PersistentGrant; > > +struct PersistentRegion { > + void *addr; > + int num; > +}; > + > +typedef struct PersistentRegion PersistentRegion; > + > struct ioreq { > blkif_request_t req; > int16_t status; > @@ -118,6 +125,7 @@ struct XenBlkDev { > gboolean feature_discard; > gboolean feature_persistent; > GTree *persistent_gnts; > + GSList *persistent_regions; > unsigned int persistent_gnt_count; > unsigned int max_grants; > > @@ -177,6 +185,23 @@ static void destroy_grant(gpointer pgnt) > g_free(grant); > } > > +static void remove_persistent_region(gpointer data, gpointer dev) > +{ > + PersistentRegion *region = data; > + struct XenBlkDev *blkdev = dev; > + XenGnttab gnt = blkdev->xendev.gnttabdev; > + > + if (xc_gnttab_munmap(gnt, region->addr, region->num) != 0) { > + xen_be_printf(&blkdev->xendev, 0, > + "xc_gnttab_munmap region %p failed: %s\n", > + region->addr, strerror(errno)); > + } > + xen_be_printf(&blkdev->xendev, 3, > + "unmapped grant region %p with %d pages\n", > + region->addr, region->num); > + g_free(region); > +} > + > static struct ioreq *ioreq_start(struct XenBlkDev *blkdev) > { > struct ioreq *ioreq = NULL; > @@ -343,6 +368,7 @@ static int ioreq_map(struct ioreq *ioreq) > void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > int i, j, new_maps = 0; > PersistentGrant *grant; > + PersistentRegion *region; > /* domids and refs variables will contain the information necessary > * to map the grants that are needed to fulfill this request. > * > @@ -421,7 +447,22 @@ static int ioreq_map(struct ioreq *ioreq) > } > } > } > - if (ioreq->blkdev->feature_persistent) { > + if (ioreq->blkdev->feature_persistent && new_maps != 0 && > + (!batch_maps || (ioreq->blkdev->persistent_gnt_count + new_maps <= > + ioreq->blkdev->max_grants))) { > + /* > + * If we are using persistent grants and batch mappings only > + * add the new maps to the list of persistent grants if the whole > + * area can be persistently mapped. > + */ > + if (batch_maps) { > + region = g_malloc0(sizeof(*region)); > + region->addr = ioreq->pages; > + region->num = new_maps; > + ioreq->blkdev->persistent_regions = g_slist_append( > + ioreq->blkdev->persistent_regions, > + region); > + } > while ((ioreq->blkdev->persistent_gnt_count < ioreq->blkdev->max_grants) > && new_maps) { > /* Go through the list of newly mapped grants and add as many > @@ -447,6 +488,7 @@ static int ioreq_map(struct ioreq *ioreq) > grant); > ioreq->blkdev->persistent_gnt_count++; > } > + assert(!batch_maps || new_maps == 0); > } > for (i = 0; i < ioreq->v.niov; i++) { > ioreq->v.iov[i].iov_base += (uintptr_t)page[i]; > @@ -971,7 +1013,10 @@ static int blk_connect(struct XenDevice *xendev) > blkdev->max_grants = max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST; > blkdev->persistent_gnts = g_tree_new_full((GCompareDataFunc)int_cmp, > NULL, NULL, > + batch_maps ? > + (GDestroyNotify)g_free : > (GDestroyNotify)destroy_grant); > + blkdev->persistent_regions = NULL; > blkdev->persistent_gnt_count = 0; > } > > @@ -1000,6 +1045,26 @@ static void blk_disconnect(struct XenDevice *xendev) > blkdev->cnt_map--; > blkdev->sring = NULL; > } > + > + /* > + * Unmap persistent grants before switching to the closed state > + * so the frontend can free them. > + * > + * In the !batch_maps case g_tree_destroy will take care of unmapping > + * the grant, but in the batch_maps case we need to iterate over every > + * region in persistent_regions and unmap it. > + */ > + if (blkdev->feature_persistent) { > + g_tree_destroy(blkdev->persistent_gnts); > + assert(batch_maps || blkdev->persistent_gnt_count == 0); > + if (batch_maps) { > + blkdev->persistent_gnt_count = 0; > + g_slist_foreach(blkdev->persistent_regions, > + (GFunc)remove_persistent_region, blkdev); > + g_slist_free(blkdev->persistent_regions); > + } > + blkdev->feature_persistent = false; > + } > } > > static int blk_free(struct XenDevice *xendev) > @@ -1011,11 +1076,6 @@ static int blk_free(struct XenDevice *xendev) > blk_disconnect(xendev); > } > > - /* Free persistent grants */ > - if (blkdev->feature_persistent) { > - g_tree_destroy(blkdev->persistent_gnts); > - } > - > while (!QLIST_EMPTY(&blkdev->freelist)) { > ioreq = QLIST_FIRST(&blkdev->freelist); > QLIST_REMOVE(ioreq, list); > -- > 1.9.3 (Apple Git-50) >
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 231e9a7..21842a0 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -59,6 +59,13 @@ struct PersistentGrant { typedef struct PersistentGrant PersistentGrant; +struct PersistentRegion { + void *addr; + int num; +}; + +typedef struct PersistentRegion PersistentRegion; + struct ioreq { blkif_request_t req; int16_t status; @@ -118,6 +125,7 @@ struct XenBlkDev { gboolean feature_discard; gboolean feature_persistent; GTree *persistent_gnts; + GSList *persistent_regions; unsigned int persistent_gnt_count; unsigned int max_grants; @@ -177,6 +185,23 @@ static void destroy_grant(gpointer pgnt) g_free(grant); } +static void remove_persistent_region(gpointer data, gpointer dev) +{ + PersistentRegion *region = data; + struct XenBlkDev *blkdev = dev; + XenGnttab gnt = blkdev->xendev.gnttabdev; + + if (xc_gnttab_munmap(gnt, region->addr, region->num) != 0) { + xen_be_printf(&blkdev->xendev, 0, + "xc_gnttab_munmap region %p failed: %s\n", + region->addr, strerror(errno)); + } + xen_be_printf(&blkdev->xendev, 3, + "unmapped grant region %p with %d pages\n", + region->addr, region->num); + g_free(region); +} + static struct ioreq *ioreq_start(struct XenBlkDev *blkdev) { struct ioreq *ioreq = NULL; @@ -343,6 +368,7 @@ static int ioreq_map(struct ioreq *ioreq) void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST]; int i, j, new_maps = 0; PersistentGrant *grant; + PersistentRegion *region; /* domids and refs variables will contain the information necessary * to map the grants that are needed to fulfill this request. * @@ -421,7 +447,22 @@ static int ioreq_map(struct ioreq *ioreq) } } } - if (ioreq->blkdev->feature_persistent) { + if (ioreq->blkdev->feature_persistent && new_maps != 0 && + (!batch_maps || (ioreq->blkdev->persistent_gnt_count + new_maps <= + ioreq->blkdev->max_grants))) { + /* + * If we are using persistent grants and batch mappings only + * add the new maps to the list of persistent grants if the whole + * area can be persistently mapped. + */ + if (batch_maps) { + region = g_malloc0(sizeof(*region)); + region->addr = ioreq->pages; + region->num = new_maps; + ioreq->blkdev->persistent_regions = g_slist_append( + ioreq->blkdev->persistent_regions, + region); + } while ((ioreq->blkdev->persistent_gnt_count < ioreq->blkdev->max_grants) && new_maps) { /* Go through the list of newly mapped grants and add as many @@ -447,6 +488,7 @@ static int ioreq_map(struct ioreq *ioreq) grant); ioreq->blkdev->persistent_gnt_count++; } + assert(!batch_maps || new_maps == 0); } for (i = 0; i < ioreq->v.niov; i++) { ioreq->v.iov[i].iov_base += (uintptr_t)page[i]; @@ -971,7 +1013,10 @@ static int blk_connect(struct XenDevice *xendev) blkdev->max_grants = max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST; blkdev->persistent_gnts = g_tree_new_full((GCompareDataFunc)int_cmp, NULL, NULL, + batch_maps ? + (GDestroyNotify)g_free : (GDestroyNotify)destroy_grant); + blkdev->persistent_regions = NULL; blkdev->persistent_gnt_count = 0; } @@ -1000,6 +1045,26 @@ static void blk_disconnect(struct XenDevice *xendev) blkdev->cnt_map--; blkdev->sring = NULL; } + + /* + * Unmap persistent grants before switching to the closed state + * so the frontend can free them. + * + * In the !batch_maps case g_tree_destroy will take care of unmapping + * the grant, but in the batch_maps case we need to iterate over every + * region in persistent_regions and unmap it. + */ + if (blkdev->feature_persistent) { + g_tree_destroy(blkdev->persistent_gnts); + assert(batch_maps || blkdev->persistent_gnt_count == 0); + if (batch_maps) { + blkdev->persistent_gnt_count = 0; + g_slist_foreach(blkdev->persistent_regions, + (GFunc)remove_persistent_region, blkdev); + g_slist_free(blkdev->persistent_regions); + } + blkdev->feature_persistent = false; + } } static int blk_free(struct XenDevice *xendev) @@ -1011,11 +1076,6 @@ static int blk_free(struct XenDevice *xendev) blk_disconnect(xendev); } - /* Free persistent grants */ - if (blkdev->feature_persistent) { - g_tree_destroy(blkdev->persistent_gnts); - } - while (!QLIST_EMPTY(&blkdev->freelist)) { ioreq = QLIST_FIRST(&blkdev->freelist); QLIST_REMOVE(ioreq, list);
This patch fixes two issues with persistent grants and the disk PV backend (Qdisk): - Keep track of memory regions where persistent grants have been mapped since we need to unmap them as a whole. It is not possible to unmap a single grant if it has been batch-mapped. A new check has also been added to make sure persistent grants are only used if the whole mapped region can be persistently mapped in the batch_maps case. - Unmap persistent grants before switching to the closed state, so the frontend can also free them. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: George Dunlap <george.dunlap@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: George Dunlap <george.dunlap@eu.citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- Xen release exception: this is obviously an important bug-fix that should be included in 4.5. Without this fix, trying to do a block-detach of a Qdisk from a running guest can lead to guest crash depending on the blkfront implementation. --- Changea since v2: - Expand the commit message to mention the newly added check. - Don't use persistent_regions in the !batch_maps case, we can use the previous implementation which works fine in that case. Changes since v1: - Instead of disabling batch mappings when using persistent grants, keep track of the persistently mapped regions and unmap them on disconnection. This prevents unmapping single persistent grants, but the current implementation doesn't require it. This new version is slightly faster than the previous one, the following test results are in IOPS from 20 runs of a block-attach, fio run, block-detach test loop: x batch + no_batch +----------------------------------------------------------------------+ | + + x x + + + x xx x | |+ + x x+ *+++ * +x* +++x* x xx x *| | |_____________A_____M______|| | |________________A_____M___________| | +----------------------------------------------------------------------+ N Min Max Median Avg Stddev x 20 52941 63822 62396 61180.15 2673.6497 + 20 49967 63849 60322 59116.9 3456.3455 Difference at 95.0% confidence -2063.25 +/- 1977.66 -3.37242% +/- 3.23252% (Student's t, pooled s = 3089.88) --- hw/block/xen_disk.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 66 insertions(+), 6 deletions(-)