Message ID | 20210120200438.480141-1-frank.heimes@canonical.com |
---|---|
Headers | show |
Series | vfio: pass DMA availability information to userspace (LP: 1907421) | expand |
On Wed, Jan 20, 2021 at 09:04:36PM +0100, frank.heimes@canonical.com wrote: > BugLink: https://bugs.launchpad.net/bugs/1907421 > > SRU Justification: > > [Impact] > > * In case a vfio-pci device on s390x is under I/O load, vfio-pci device may end up in error state. > > * The commit 492855939bdb added a limit to the number of concurrent DMA requests for a vfio container. > > * However, lazy unmapping in s390x can in fact cause quite a large number of outstanding DMA requests > to build up prior to being purged - potentially the entire guest DMA space. > > * This results in unexpected errors seen in qemu such as 'VFIO_MAP_DMA failed: No space left on device'. > > * The solution requires a change to both kernel and qemu. > > * The kernel side of things is addressed by this SRU. > > * The fix adds the ability to provide the number of allowable DMA requests via VFIO_IOMMU_GET_INFO ioctl. > > * The actual fix comes with commit 7d6e1329652e, but another fix ccd59dce1a21 is needed to get it cleanly applied. > > [Fix] > > * ccd59dce1a21f473518bf273bdf5b182bab955b3 ccd59dce1a21 "vfio/type1: Refactor vfio_iommu_type1_ioctl()" > > * 7d6e1329652ed971d1b6e0e7bea66fba5044e271 7d6e1329652e "vfio iommu: Add dma available capability" > > [Test Case] > > * IBM Z or LinuxONE hardware with Ubuntu Server 20.10 installed. > > * PCIe adapters in place that provide vfio, like RoCE Express 2. > > * A KVM host needs to be setup and a KVM guest (use again 20.10) that uses vfio. > > * Generate I/O that flows through the vf and watch out for error like 'VFIO_MAP_DMA failed: No space left on device' in the log. > > [Regression Potential] > > * The first patch ccd59dce1a21 modifies the common code file drivers/vfio/vfio_iommu_type1.c quity significantly. > > * But the reason is not that it introduces a lot of new things, it's a refactoring patch. > > * Nevertheless if done in a bad way it can significantly harm the IO memory management of virtual function adapters. > > * In worst case it may break them entirely, instead of 'just' exeeding the entire DMA space. > > * Things could also go wrong while doing the mapping and unmapping of DMA, that may even have an impact beyond vf adapters - harming other DMA devices. > > * The handling of dirty pages is also touched and the ioctl itself - which is important to keep the control of the devices. > > * But as said before, it re-factoring work, it's upstream accepted since 5.9 and the provenance shows that many engineers had an eye on these changes. > > * The second patch 7d6e1329652e - that inclides the needed fix - comes with far less modifications. > > * It also tounches drivers/vfio/vfio_iommu_type1.c but in a way that it mainly > adds one new function (vfio_iommu_dma_avail_build_caps) and an add. if statement in vfio_iommu_type1_get_info. > > * That allows to provide the current allowable number of DMA mappings to the userspace via the IOMMU info chain, > so that the userspace can take appropriate mitigation. > > * Potential problems can be that the current allowable number of DMA mappings are wrong > and in best case just mappings are wasted > and in worst case there are more reported than available in reality, which could have a severe impact. > > * What happens in such a case is a bit depending on the userspace. > > * This patch got upstream accepted with kernel 5.10 and pre-tested by IBM. > > * In addition a PPA was created with a patched groovy kernel that was shared for further testing. > > [Other] > > * The patch got upstream accepted with kernel v5.10, hence it will land in Hirsute once the target kernel 5.10 is in. > > * For 5.4 this will come as upstream stable update 5.4.90/91. > > * For 5.8 (and 5.9) upstream stable support already ended, hence this SRU is only needed for groovy. > > Liu Yi L (1): > vfio/type1: Refactor vfio_iommu_type1_ioctl() > > Matthew Rosato (1): > vfio iommu: Add dma available capability > > drivers/vfio/vfio_iommu_type1.c | 409 ++++++++++++++++++-------------- > include/uapi/linux/vfio.h | 15 ++ > 2 files changed, 244 insertions(+), 180 deletions(-) > > -- > 2.25.1 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team Acked-by: William Breathitt Gray <william.gray@canonical.com>
Patches look good, apply clean and build tested Acked-by: Ian May <ian.may@canonical.com> On 2021-01-20 21:04:36 , frank.heimes@canonical.com wrote: > BugLink: https://bugs.launchpad.net/bugs/1907421 > > SRU Justification: > > [Impact] > > * In case a vfio-pci device on s390x is under I/O load, vfio-pci device may end up in error state. > > * The commit 492855939bdb added a limit to the number of concurrent DMA requests for a vfio container. > > * However, lazy unmapping in s390x can in fact cause quite a large number of outstanding DMA requests > to build up prior to being purged - potentially the entire guest DMA space. > > * This results in unexpected errors seen in qemu such as 'VFIO_MAP_DMA failed: No space left on device'. > > * The solution requires a change to both kernel and qemu. > > * The kernel side of things is addressed by this SRU. > > * The fix adds the ability to provide the number of allowable DMA requests via VFIO_IOMMU_GET_INFO ioctl. > > * The actual fix comes with commit 7d6e1329652e, but another fix ccd59dce1a21 is needed to get it cleanly applied. > > [Fix] > > * ccd59dce1a21f473518bf273bdf5b182bab955b3 ccd59dce1a21 "vfio/type1: Refactor vfio_iommu_type1_ioctl()" > > * 7d6e1329652ed971d1b6e0e7bea66fba5044e271 7d6e1329652e "vfio iommu: Add dma available capability" > > [Test Case] > > * IBM Z or LinuxONE hardware with Ubuntu Server 20.10 installed. > > * PCIe adapters in place that provide vfio, like RoCE Express 2. > > * A KVM host needs to be setup and a KVM guest (use again 20.10) that uses vfio. > > * Generate I/O that flows through the vf and watch out for error like 'VFIO_MAP_DMA failed: No space left on device' in the log. > > [Regression Potential] > > * The first patch ccd59dce1a21 modifies the common code file drivers/vfio/vfio_iommu_type1.c quity significantly. > > * But the reason is not that it introduces a lot of new things, it's a refactoring patch. > > * Nevertheless if done in a bad way it can significantly harm the IO memory management of virtual function adapters. > > * In worst case it may break them entirely, instead of 'just' exeeding the entire DMA space. > > * Things could also go wrong while doing the mapping and unmapping of DMA, that may even have an impact beyond vf adapters - harming other DMA devices. > > * The handling of dirty pages is also touched and the ioctl itself - which is important to keep the control of the devices. > > * But as said before, it re-factoring work, it's upstream accepted since 5.9 and the provenance shows that many engineers had an eye on these changes. > > * The second patch 7d6e1329652e - that inclides the needed fix - comes with far less modifications. > > * It also tounches drivers/vfio/vfio_iommu_type1.c but in a way that it mainly > adds one new function (vfio_iommu_dma_avail_build_caps) and an add. if statement in vfio_iommu_type1_get_info. > > * That allows to provide the current allowable number of DMA mappings to the userspace via the IOMMU info chain, > so that the userspace can take appropriate mitigation. > > * Potential problems can be that the current allowable number of DMA mappings are wrong > and in best case just mappings are wasted > and in worst case there are more reported than available in reality, which could have a severe impact. > > * What happens in such a case is a bit depending on the userspace. > > * This patch got upstream accepted with kernel 5.10 and pre-tested by IBM. > > * In addition a PPA was created with a patched groovy kernel that was shared for further testing. > > [Other] > > * The patch got upstream accepted with kernel v5.10, hence it will land in Hirsute once the target kernel 5.10 is in. > > * For 5.4 this will come as upstream stable update 5.4.90/91. > > * For 5.8 (and 5.9) upstream stable support already ended, hence this SRU is only needed for groovy. > > Liu Yi L (1): > vfio/type1: Refactor vfio_iommu_type1_ioctl() > > Matthew Rosato (1): > vfio iommu: Add dma available capability > > drivers/vfio/vfio_iommu_type1.c | 409 ++++++++++++++++++-------------- > include/uapi/linux/vfio.h | 15 ++ > 2 files changed, 244 insertions(+), 180 deletions(-) > > -- > 2.25.1 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
Applied to Groovy/master-next. thank you! -Kelsey On 2021-01-20 21:04:36 , frank.heimes@canonical.com wrote: > BugLink: https://bugs.launchpad.net/bugs/1907421 > > SRU Justification: > > [Impact] > > * In case a vfio-pci device on s390x is under I/O load, vfio-pci device may end up in error state. > > * The commit 492855939bdb added a limit to the number of concurrent DMA requests for a vfio container. > > * However, lazy unmapping in s390x can in fact cause quite a large number of outstanding DMA requests > to build up prior to being purged - potentially the entire guest DMA space. > > * This results in unexpected errors seen in qemu such as 'VFIO_MAP_DMA failed: No space left on device'. > > * The solution requires a change to both kernel and qemu. > > * The kernel side of things is addressed by this SRU. > > * The fix adds the ability to provide the number of allowable DMA requests via VFIO_IOMMU_GET_INFO ioctl. > > * The actual fix comes with commit 7d6e1329652e, but another fix ccd59dce1a21 is needed to get it cleanly applied. > > [Fix] > > * ccd59dce1a21f473518bf273bdf5b182bab955b3 ccd59dce1a21 "vfio/type1: Refactor vfio_iommu_type1_ioctl()" > > * 7d6e1329652ed971d1b6e0e7bea66fba5044e271 7d6e1329652e "vfio iommu: Add dma available capability" > > [Test Case] > > * IBM Z or LinuxONE hardware with Ubuntu Server 20.10 installed. > > * PCIe adapters in place that provide vfio, like RoCE Express 2. > > * A KVM host needs to be setup and a KVM guest (use again 20.10) that uses vfio. > > * Generate I/O that flows through the vf and watch out for error like 'VFIO_MAP_DMA failed: No space left on device' in the log. > > [Regression Potential] > > * The first patch ccd59dce1a21 modifies the common code file drivers/vfio/vfio_iommu_type1.c quity significantly. > > * But the reason is not that it introduces a lot of new things, it's a refactoring patch. > > * Nevertheless if done in a bad way it can significantly harm the IO memory management of virtual function adapters. > > * In worst case it may break them entirely, instead of 'just' exeeding the entire DMA space. > > * Things could also go wrong while doing the mapping and unmapping of DMA, that may even have an impact beyond vf adapters - harming other DMA devices. > > * The handling of dirty pages is also touched and the ioctl itself - which is important to keep the control of the devices. > > * But as said before, it re-factoring work, it's upstream accepted since 5.9 and the provenance shows that many engineers had an eye on these changes. > > * The second patch 7d6e1329652e - that inclides the needed fix - comes with far less modifications. > > * It also tounches drivers/vfio/vfio_iommu_type1.c but in a way that it mainly > adds one new function (vfio_iommu_dma_avail_build_caps) and an add. if statement in vfio_iommu_type1_get_info. > > * That allows to provide the current allowable number of DMA mappings to the userspace via the IOMMU info chain, > so that the userspace can take appropriate mitigation. > > * Potential problems can be that the current allowable number of DMA mappings are wrong > and in best case just mappings are wasted > and in worst case there are more reported than available in reality, which could have a severe impact. > > * What happens in such a case is a bit depending on the userspace. > > * This patch got upstream accepted with kernel 5.10 and pre-tested by IBM. > > * In addition a PPA was created with a patched groovy kernel that was shared for further testing. > > [Other] > > * The patch got upstream accepted with kernel v5.10, hence it will land in Hirsute once the target kernel 5.10 is in. > > * For 5.4 this will come as upstream stable update 5.4.90/91. > > * For 5.8 (and 5.9) upstream stable support already ended, hence this SRU is only needed for groovy. > > Liu Yi L (1): > vfio/type1: Refactor vfio_iommu_type1_ioctl() > > Matthew Rosato (1): > vfio iommu: Add dma available capability > > drivers/vfio/vfio_iommu_type1.c | 409 ++++++++++++++++++-------------- > include/uapi/linux/vfio.h | 15 ++ > 2 files changed, 244 insertions(+), 180 deletions(-) > > -- > 2.25.1 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team