
[V2] vhost: correctly turn on VIRTIO_F_IOMMU_PLATFORM

Message ID 20200226094357.25061-1-jasowang@redhat.com
State New
Series [V2] vhost: correctly turn on VIRTIO_F_IOMMU_PLATFORM

Commit Message

Jason Wang Feb. 26, 2020, 9:43 a.m. UTC
We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally, even
on platforms without IOMMU support. This can lead to unnecessary IOTLB
transactions which hurt performance.

Fix this by checking whether the device is backed by an IOMMU and
disabling the device IOTLB otherwise.

Reported-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
Changes from V1:
- do not check acked_features
- reuse vhost_dev_has_iommu()
---
 hw/virtio/vhost.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
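
For reference, the substance of the change, condensed from the diff quoted
in the replies below (context lines omitted):

    static int vhost_dev_has_iommu(struct vhost_dev *dev)
    {
        VirtIODevice *vdev = dev->vdev;

        /*
         * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend supports the
         * incremental memory-mapping (IOTLB) API. On a platform without an
         * IOMMU there is no need to enable it; it only adds IOTLB
         * miss/update transactions.
         */
        return vdev->dma_as != &address_space_memory &&
               virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
    }

    /* ...and in vhost_dev_set_features(), mask the bit off when unused: */
        if (!vhost_dev_has_iommu(dev)) {
            features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
        }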

Comments

Michael S. Tsirkin Feb. 26, 2020, 9:53 a.m. UTC | #1
On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote:
> We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> platform without IOMMU support. This can lead unnecessary IOTLB
> transactions which will damage the performance.
> 
> Fixing this by check whether the device is backed by IOMMU and disable
> device IOTLB.
> 
> Reported-by: Halil Pasic <pasic@linux.ibm.com>
> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")

Well it's just an optimization, isn't it?
I don't think it's justified to push this to everyone using
vhost with IOTLB, is it?
If you disagree, could you comment a bit on which configurations were tested?

> Cc: qemu-stable@nongnu.org
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Halil could you test this pls? Does this help your performance issue?

> ---
> Changes from V1:
> - do not check acked_features
> - reuse vhost_dev_has_iommu()
> ---
>  hw/virtio/vhost.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 9edfadc81d..9182a00495 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev)
>  {
>      VirtIODevice *vdev = dev->vdev;
>  
> -    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> +    /*
> +     * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
> +     * incremental memory mapping API via IOTLB API. For platform that
> +     * does not have IOMMU, there's no need to enable this feature
> +     * which may cause unnecessary IOTLB miss/update trnasactions.
> +     */
> +    return vdev->dma_as != &address_space_memory &&
> +           virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>  }
>  
>  static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
> @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
>      if (enable_log) {
>          features |= 0x1ULL << VHOST_F_LOG_ALL;
>      }
> +    if (!vhost_dev_has_iommu(dev)) {
> +        features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
> +    }
>      r = dev->vhost_ops->vhost_set_features(dev, features);
>      if (r < 0) {
>          VHOST_OPS_DEBUG("vhost_set_features failed");
> -- 
> 2.19.1
Jason Wang Feb. 26, 2020, 10:17 a.m. UTC | #2
On 2020/2/26 5:53 PM, Michael S. Tsirkin wrote:
> On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote:
>> We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
>> platform without IOMMU support. This can lead unnecessary IOTLB
>> transactions which will damage the performance.
>>
>> Fixing this by check whether the device is backed by IOMMU and disable
>> device IOTLB.
>>
>> Reported-by: Halil Pasic <pasic@linux.ibm.com>
>> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> Well it's just an optimization, isn't it?


Kind of, or a fix for a performance problem.


> I don't think it's justified to push this to everyone using
> vhost with IOTLB, is it?


My understanding is that the function should be equivalent to IOTLB in
this case, since no IOMMU is used and the device only sees GPAs.

Another possible direction is for qemu to update the memory table via the
device IOTLB API, but that seems less straightforward.


> If you disagree, could you comment a bit on which configurations where tested?


I just did a dirty shortcut to emulate a platform without get_dma_as with
IOMMU_PLATFORM set.

Thanks


>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> Halil could you test this pls? Does this help your performance issue?
>
>> ---
>> Changes from V1:
>> - do not check acked_features
>> - reuse vhost_dev_has_iommu()
>> ---
>>   hw/virtio/vhost.c | 12 +++++++++++-
>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index 9edfadc81d..9182a00495 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev)
>>   {
>>       VirtIODevice *vdev = dev->vdev;
>>   
>> -    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>> +    /*
>> +     * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
>> +     * incremental memory mapping API via IOTLB API. For platform that
>> +     * does not have IOMMU, there's no need to enable this feature
>> +     * which may cause unnecessary IOTLB miss/update trnasactions.
>> +     */
>> +    return vdev->dma_as != &address_space_memory &&
>> +           virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>   }
>>   
>>   static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
>> @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
>>       if (enable_log) {
>>           features |= 0x1ULL << VHOST_F_LOG_ALL;
>>       }
>> +    if (!vhost_dev_has_iommu(dev)) {
>> +        features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
>> +    }
>>       r = dev->vhost_ops->vhost_set_features(dev, features);
>>       if (r < 0) {
>>           VHOST_OPS_DEBUG("vhost_set_features failed");
>> -- 
>> 2.19.1
Michael S. Tsirkin Feb. 26, 2020, 11:44 a.m. UTC | #3
On Wed, Feb 26, 2020 at 06:17:34PM +0800, Jason Wang wrote:
> 
> On 2020/2/26 下午5:53, Michael S. Tsirkin wrote:
> > On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote:
> > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > > platform without IOMMU support. This can lead unnecessary IOTLB
> > > transactions which will damage the performance.
> > > 
> > > Fixing this by check whether the device is backed by IOMMU and disable
> > > device IOTLB.
> > > 
> > > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > Well it's just an optimization, isn't it?
> 
> 
> Kind of, or a fix for the performance.
> 

Given this wasn't a regression, it's a valuable enhancement,
but Fixes: seems too aggressive.

> > I don't think it's justified to push this to everyone using
> > vhost with IOTLB, is it?
> 
> 
> My understanding is that the function should be equivalent to IOTLB in this
> case.
> 
> Since no IOMMU is used and device may only see GPA.
> 
> Another possible direction is to qemu to update memory table via device
> IOTLB API, this seems less straightforward.
> 
> 
> > If you disagree, could you comment a bit on which configurations where tested?
> 
> 
> I just do dirty shortcut to emulate the platform without get_dma_as wth
> IOMMU_PLATFORM set.
> 
> Thanks

If you want a Fixes tag with the expectation that everyone backports it,
then this also needs to be tested on a platform with a vIOMMU,
such as dpdk within the guest.

> 
> > 
> > > Cc: qemu-stable@nongnu.org
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > Halil could you test this pls? Does this help your performance issue?
> > 
> > > ---
> > > Changes from V1:
> > > - do not check acked_features
> > > - reuse vhost_dev_has_iommu()
> > > ---
> > >   hw/virtio/vhost.c | 12 +++++++++++-
> > >   1 file changed, 11 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index 9edfadc81d..9182a00495 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > > @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev)
> > >   {
> > >       VirtIODevice *vdev = dev->vdev;
> > > -    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > > +    /*
> > > +     * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
> > > +     * incremental memory mapping API via IOTLB API. For platform that
> > > +     * does not have IOMMU, there's no need to enable this feature
> > > +     * which may cause unnecessary IOTLB miss/update trnasactions.
> > > +     */
> > > +    return vdev->dma_as != &address_space_memory &&
> > > +           virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > >   }
> > >   static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
> > > @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
> > >       if (enable_log) {
> > >           features |= 0x1ULL << VHOST_F_LOG_ALL;
> > >       }
> > > +    if (!vhost_dev_has_iommu(dev)) {
> > > +        features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
> > > +    }
> > >       r = dev->vhost_ops->vhost_set_features(dev, features);
> > >       if (r < 0) {
> > >           VHOST_OPS_DEBUG("vhost_set_features failed");
> > > -- 
> > > 2.19.1
Halil Pasic Feb. 26, 2020, 12:50 p.m. UTC | #4
On Wed, 26 Feb 2020 06:44:25 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Feb 26, 2020 at 06:17:34PM +0800, Jason Wang wrote:
> > 
> > On 2020/2/26 下午5:53, Michael S. Tsirkin wrote:
> > > On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote:
> > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > > > platform without IOMMU support. This can lead unnecessary IOTLB
> > > > transactions which will damage the performance.
> > > > 
> > > > Fixing this by check whether the device is backed by IOMMU and disable
> > > > device IOTLB.
> > > > 
> > > > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > Well it's just an optimization, isn't it?
> > 
> > 
> > Kind of, or a fix for the performance.
> > 
> 
> Given this wasn't a regression, it's a valuable enhancement
> but Fixes: seems to agressive.

IMHO Fixes is appropriate. Telling vhost-net F_ACCESS_PLATFORM
when vdev->dma_as != &address_space_memory results in a severe
performance degradation.

Regards,
Halil

[..]
Halil Pasic Feb. 26, 2020, 12:55 p.m. UTC | #5
On Wed, 26 Feb 2020 04:53:33 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote:
> > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > platform without IOMMU support. This can lead unnecessary IOTLB
> > transactions which will damage the performance.
> > 
> > Fixing this by check whether the device is backed by IOMMU and disable
> > device IOTLB.
> > 
> > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> 
> Well it's just an optimization, isn't it?
> I don't think it's justified to push this to everyone using
> vhost with IOTLB, is it?

IMHO we need this for everyone using vhost! For instance vhost-vsock
currently does not work with iommu_platform=on, because unlike vhost-net,
vhost-vsock does not offer F_ACCESS_PLATFORM, so the set-features ioctl fails.

> If you disagree, could you comment a bit on which configurations where tested?
> 
> > Cc: qemu-stable@nongnu.org
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> 
> Halil could you test this pls? Does this help your performance issue?
> 

I'm pretty sure it does, but I will re-test. The previous version, where
it was done in virtio-net, certainly did.

Regards,
Halil
Halil Pasic Feb. 26, 2020, 1:28 p.m. UTC | #6
On Wed, 26 Feb 2020 17:43:57 +0800
Jason Wang <jasowang@redhat.com> wrote:

> We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> platform without IOMMU support. This can lead unnecessary IOTLB
> transactions which will damage the performance.
> 
> Fixing this by check whether the device is backed by IOMMU and disable
> device IOTLB.
> 
> Reported-by: Halil Pasic <pasic@linux.ibm.com>
> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Tested-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>

Thank you very much for fixing this! BTW as I mentioned before it
fixes vhost-vsock with iommu_platform=on as well.

Regards,
Halil

> ---
> Changes from V1:
> - do not check acked_features
> - reuse vhost_dev_has_iommu()
> ---
>  hw/virtio/vhost.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 9edfadc81d..9182a00495 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev)
>  {
>      VirtIODevice *vdev = dev->vdev;
>  
> -    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> +    /*
> +     * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
> +     * incremental memory mapping API via IOTLB API. For platform that
> +     * does not have IOMMU, there's no need to enable this feature
> +     * which may cause unnecessary IOTLB miss/update trnasactions.
> +     */
> +    return vdev->dma_as != &address_space_memory &&
> +           virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>  }
>  
>  static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
> @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
>      if (enable_log) {
>          features |= 0x1ULL << VHOST_F_LOG_ALL;
>      }
> +    if (!vhost_dev_has_iommu(dev)) {
> +        features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
> +    }
>      r = dev->vhost_ops->vhost_set_features(dev, features);
>      if (r < 0) {
>          VHOST_OPS_DEBUG("vhost_set_features failed");
Michael S. Tsirkin Feb. 26, 2020, 1:37 p.m. UTC | #7
On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote:
> On Wed, 26 Feb 2020 17:43:57 +0800
> Jason Wang <jasowang@redhat.com> wrote:
> 
> > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > platform without IOMMU support. This can lead unnecessary IOTLB
> > transactions which will damage the performance.
> > 
> > Fixing this by check whether the device is backed by IOMMU and disable
> > device IOTLB.
> > 
> > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > Cc: qemu-stable@nongnu.org
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> 
> Tested-by: Halil Pasic <pasic@linux.ibm.com>
> Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
> 
> Thank you very much for fixing this! BTW as I mentioned before it
> fixes vhost-vsock with iommu_platform=on as well.

Fixes as in improves performance?

> Regards,
> Halil
> 
> > ---
> > Changes from V1:
> > - do not check acked_features
> > - reuse vhost_dev_has_iommu()
> > ---
> >  hw/virtio/vhost.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 9edfadc81d..9182a00495 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev)
> >  {
> >      VirtIODevice *vdev = dev->vdev;
> >  
> > -    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > +    /*
> > +     * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
> > +     * incremental memory mapping API via IOTLB API. For platform that
> > +     * does not have IOMMU, there's no need to enable this feature
> > +     * which may cause unnecessary IOTLB miss/update trnasactions.
> > +     */
> > +    return vdev->dma_as != &address_space_memory &&
> > +           virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> >  }
> >  
> >  static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
> > @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
> >      if (enable_log) {
> >          features |= 0x1ULL << VHOST_F_LOG_ALL;
> >      }
> > +    if (!vhost_dev_has_iommu(dev)) {
> > +        features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
> > +    }
> >      r = dev->vhost_ops->vhost_set_features(dev, features);
> >      if (r < 0) {
> >          VHOST_OPS_DEBUG("vhost_set_features failed");
Michael S. Tsirkin Feb. 26, 2020, 1:45 p.m. UTC | #8
On Wed, Feb 26, 2020 at 01:55:39PM +0100, Halil Pasic wrote:
> On Wed, 26 Feb 2020 04:53:33 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote:
> > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > > platform without IOMMU support. This can lead unnecessary IOTLB
> > > transactions which will damage the performance.
> > > 
> > > Fixing this by check whether the device is backed by IOMMU and disable
> > > device IOTLB.
> > > 
> > > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > 
> > Well it's just an optimization, isn't it?
> > I don't think it's justified to push this to everyone using
> > vhost with IOTLB, is it?
> 
> IMHO we need this for everyone using vhost! For instance vhost-vsock
> currently does not work with iommu_platform=on, because unlike vhost-net
> vhost does not offer F_ACCESS_PLATFORM, so set features IOCTL fails. 

You mean vsock does not offer it? OK but that's still not a bugfix.
Making new configs work is great, but that's a feature almost
by definition.

> > If you disagree, could you comment a bit on which configurations where tested?
> > 
> > > Cc: qemu-stable@nongnu.org
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > 
> > Halil could you test this pls? Does this help your performance issue?
> > 
> 
> I'm pretty sure it does, but I will re-test. The previous version where
> it was done virtio-net certainly did.
> 
> Regards,
> Halil
> 
>
Halil Pasic Feb. 26, 2020, 3:36 p.m. UTC | #9
On Wed, 26 Feb 2020 08:37:13 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote:
> > On Wed, 26 Feb 2020 17:43:57 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> > 
> > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > > platform without IOMMU support. This can lead unnecessary IOTLB
> > > transactions which will damage the performance.
> > > 
> > > Fixing this by check whether the device is backed by IOMMU and disable
> > > device IOTLB.
> > > 
> > > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > Cc: qemu-stable@nongnu.org
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > 
> > Tested-by: Halil Pasic <pasic@linux.ibm.com>
> > Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
> > 
> > Thank you very much for fixing this! BTW as I mentioned before it
> > fixes vhost-vsock with iommu_platform=on as well.
> 
> Fixes as in improves performance?

No, fixes as in one no longer gets:
qemu-system-s390x: vhost_set_features failed: Operation not supported (95)
qemu-system-s390x: Error starting vhost: 95
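
(For context, the usual vhost backend pattern is to reject feature bits it
does not advertise. A rough sketch of the kernel-side check that produces
the EOPNOTSUPP/95 above; simplified from memory, and details may differ
across kernel versions:)

    static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
    {
            /* vhost-vsock's advertised feature set does not include
             * VIRTIO_F_IOMMU_PLATFORM, so a SET_FEATURES carrying that
             * bit is refused, which QEMU reports as error 95 above. */
            if (features & ~VHOST_VSOCK_FEATURES)
                    return -EOPNOTSUPP;
            /* ... */
            return 0;
    }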

Regards,
Halil

[..]
Michael S. Tsirkin Feb. 26, 2020, 4:52 p.m. UTC | #10
On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote:
> On Wed, 26 Feb 2020 08:37:13 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote:
> > > On Wed, 26 Feb 2020 17:43:57 +0800
> > > Jason Wang <jasowang@redhat.com> wrote:
> > > 
> > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > > > platform without IOMMU support. This can lead unnecessary IOTLB
> > > > transactions which will damage the performance.
> > > > 
> > > > Fixing this by check whether the device is backed by IOMMU and disable
> > > > device IOTLB.
> > > > 
> > > > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > > Cc: qemu-stable@nongnu.org
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > 
> > > Tested-by: Halil Pasic <pasic@linux.ibm.com>
> > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
> > > 
> > > Thank you very much for fixing this! BTW as I mentioned before it
> > > fixes vhost-vsock with iommu_platform=on as well.
> > 
> > Fixes as in improves performance?
> 
> No, fixes like one does not get something like:
> qemu-system-s390x: vhost_set_features failed: Operation not supported (95)
> qemu-system-s390x: Error starting vhost: 95
> any more.
> 
> Regards,
> Halil
> 
> [..]

But can commit c471ad0e9bd46 actually boot a secure guest
where iommu_platform=on is required?
Halil Pasic Feb. 27, 2020, 1:02 p.m. UTC | #11
On Wed, 26 Feb 2020 11:52:26 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote:
> > On Wed, 26 Feb 2020 08:37:13 -0500
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > 
> > > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote:
> > > > On Wed, 26 Feb 2020 17:43:57 +0800
> > > > Jason Wang <jasowang@redhat.com> wrote:
> > > > 
> > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > > > > platform without IOMMU support. This can lead unnecessary IOTLB
> > > > > transactions which will damage the performance.
> > > > > 
> > > > > Fixing this by check whether the device is backed by IOMMU and disable
> > > > > device IOTLB.
> > > > > 
> > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > > > Cc: qemu-stable@nongnu.org
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > 
> > > > Tested-by: Halil Pasic <pasic@linux.ibm.com>
> > > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
> > > > 
> > > > Thank you very much for fixing this! BTW as I mentioned before it
> > > > fixes vhost-vsock with iommu_platform=on as well.
> > > 
> > > Fixes as in improves performance?
> > 
> > No, fixes like one does not get something like:
> > qemu-system-s390x: vhost_set_features failed: Operation not supported (95)
> > qemu-system-s390x: Error starting vhost: 95
> > any more.
> > 
> > Regards,
> > Halil
> > 
> > [..]
> 
> But can commit c471ad0e9bd46 actually boot a secure guest
> where iommu_platform=on is required?
> 

No, of course it cannot. But I'm not sure about AMD SEV. AFAIU without
Jason's patch it does not work for AMD SEV. Tom already stated that with
SEV they don't need the IOVA translation aspect of ACCESS_PLATFORM, but
I have no idea if the condition vdev->dma_as == &address_space_memory
catches them as well or not. They probably have !=.

CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?

Also, one can specify iommu_platform=on on a device that ain't a part of
a secure-capable VM, just for the fun of it. And that breaks
vhost-vsock. Or is setting iommu_platform=on only valid if
qemu-system-s390x is protected virtualization capable?

BTW, I don't have a strong opinion on the fixes tag. We currently do not
recommend setting iommu_platform, and thus I don't think we care too
much about past qemus having problems with it.

Regards,
Halil
Michael S. Tsirkin Feb. 27, 2020, 3:47 p.m. UTC | #12
On Thu, Feb 27, 2020 at 02:02:15PM +0100, Halil Pasic wrote:
> On Wed, 26 Feb 2020 11:52:26 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote:
> > > On Wed, 26 Feb 2020 08:37:13 -0500
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > 
> > > > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote:
> > > > > On Wed, 26 Feb 2020 17:43:57 +0800
> > > > > Jason Wang <jasowang@redhat.com> wrote:
> > > > > 
> > > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > > > > > platform without IOMMU support. This can lead unnecessary IOTLB
> > > > > > transactions which will damage the performance.
> > > > > > 
> > > > > > Fixing this by check whether the device is backed by IOMMU and disable
> > > > > > device IOTLB.
> > > > > > 
> > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > > > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > > > > Cc: qemu-stable@nongnu.org
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > 
> > > > > Tested-by: Halil Pasic <pasic@linux.ibm.com>
> > > > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
> > > > > 
> > > > > Thank you very much for fixing this! BTW as I mentioned before it
> > > > > fixes vhost-vsock with iommu_platform=on as well.
> > > > 
> > > > Fixes as in improves performance?
> > > 
> > > No, fixes like one does not get something like:
> > > qemu-system-s390x: vhost_set_features failed: Operation not supported (95)
> > > qemu-system-s390x: Error starting vhost: 95
> > > any more.
> > > 
> > > Regards,
> > > Halil
> > > 
> > > [..]
> > 
> > But can commit c471ad0e9bd46 actually boot a secure guest
> > where iommu_platform=on is required?
> > 
> 
> No, of course it can not. But I'm not sure about AMD SEV. AFAIU without
> Jason's patch it does not work for AMD SEV. Tom already stated that with
> SEV they don't need the IOVA translation aspect of ACCESS_PLATFORM, but
> I have no idea if the condition vdev->dma_as == &address_space_memory
> catches them as well or not. They probably have !=.
> 
> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> 
> Also, one can specify iommu_platform=on on a device that ain't a part of
> a secure-capable VM, just for the fun of it. And that breaks
> vhost-vsock. Or is setting iommu_platform=on only valid if
> qemu-system-s390x is protected virtualization capable?
> 
> BTW, I don't have a strong opinion on the fixes tag. We currently do not
> recommend setting iommu_platform, and thus I don't think we care too
> much about past qemus having problems with it.
> 
> Regards,
> Halil


Let's just say if we do have a Fixes: tag we want to set it correctly to
the commit that needs this fix.
Tom Lendacky Feb. 27, 2020, 8:36 p.m. UTC | #13
On 2/27/20 7:02 AM, Halil Pasic wrote:
> On Wed, 26 Feb 2020 11:52:26 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
>> On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote:
>>> On Wed, 26 Feb 2020 08:37:13 -0500
>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>
>>>> On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote:
>>>>> On Wed, 26 Feb 2020 17:43:57 +0800
>>>>> Jason Wang <jasowang@redhat.com> wrote:
>>>>>
>>>>>> We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
>>>>>> platform without IOMMU support. This can lead unnecessary IOTLB
>>>>>> transactions which will damage the performance.
>>>>>>
>>>>>> Fixing this by check whether the device is backed by IOMMU and disable
>>>>>> device IOTLB.
>>>>>>
>>>>>> Reported-by: Halil Pasic <pasic@linux.ibm.com>
>>>>>> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
>>>>>> Cc: qemu-stable@nongnu.org
>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>
>>>>> Tested-by: Halil Pasic <pasic@linux.ibm.com>
>>>>> Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
>>>>>
>>>>> Thank you very much for fixing this! BTW as I mentioned before it
>>>>> fixes vhost-vsock with iommu_platform=on as well.
>>>>
>>>> Fixes as in improves performance?
>>>
>>> No, fixes like one does not get something like:
>>> qemu-system-s390x: vhost_set_features failed: Operation not supported (95)
>>> qemu-system-s390x: Error starting vhost: 95
>>> any more.
>>>
>>> Regards,
>>> Halil
>>>
>>> [..]
>>
>> But can commit c471ad0e9bd46 actually boot a secure guest
>> where iommu_platform=on is required?
>>
> 
> No, of course it can not. But I'm not sure about AMD SEV. AFAIU without
> Jason's patch it does not work for AMD SEV. Tom already stated that with
> SEV they don't need the IOVA translation aspect of ACCESS_PLATFORM, but
> I have no idea if the condition vdev->dma_as == &address_space_memory
> catches them as well or not. They probably have !=.
> 
> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?

Adding Brijesh for this, too.

> 
> Also, one can specify iommu_platform=on on a device that ain't a part of
> a secure-capable VM, just for the fun of it. And that breaks
> vhost-vsock. Or is setting iommu_platform=on only valid if
> qemu-system-s390x is protected virtualization capable?
> 
> BTW, I don't have a strong opinion on the fixes tag. We currently do not
> recommend setting iommu_platform, and thus I don't think we care too
> much about past qemus having problems with it.
> 
> Regards,
> Halil
>
Halil Pasic March 3, 2020, 2:44 p.m. UTC | #14
On Thu, 27 Feb 2020 10:47:22 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Feb 27, 2020 at 02:02:15PM +0100, Halil Pasic wrote:
> > On Wed, 26 Feb 2020 11:52:26 -0500
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > 
> > > On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote:
> > > > On Wed, 26 Feb 2020 08:37:13 -0500
> > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > 
> > > > > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote:
> > > > > > On Wed, 26 Feb 2020 17:43:57 +0800
> > > > > > Jason Wang <jasowang@redhat.com> wrote:
> > > > > > 
> > > > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
> > > > > > > platform without IOMMU support. This can lead unnecessary IOTLB
> > > > > > > transactions which will damage the performance.
> > > > > > > 
> > > > > > > Fixing this by check whether the device is backed by IOMMU and disable
> > > > > > > device IOTLB.
> > > > > > > 
> > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com>
> > > > > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > > > > > Cc: qemu-stable@nongnu.org
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > 
> > > > > > Tested-by: Halil Pasic <pasic@linux.ibm.com>
> > > > > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
> > > > > > 
> > > > > > Thank you very much for fixing this! BTW as I mentioned before it
> > > > > > fixes vhost-vsock with iommu_platform=on as well.
> > > > > 
> > > > > Fixes as in improves performance?
> > > > 
> > > > No, fixes like one does not get something like:
> > > > qemu-system-s390x: vhost_set_features failed: Operation not supported (95)
> > > > qemu-system-s390x: Error starting vhost: 95
> > > > any more.
> > > > 
> > > > Regards,
> > > > Halil
> > > > 
> > > > [..]
> > > 
> > > But can commit c471ad0e9bd46 actually boot a secure guest
> > > where iommu_platform=on is required?
> > > 
> > 
> > No, of course it can not. But I'm not sure about AMD SEV. AFAIU without
> > Jason's patch it does not work for AMD SEV. Tom already stated that with
> > SEV they don't need the IOVA translation aspect of ACCESS_PLATFORM, but
> > I have no idea if the condition vdev->dma_as == &address_space_memory
> > catches them as well or not. They probably have !=.
> > 
> > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > 
> > Also, one can specify iommu_platform=on on a device that ain't a part of
> > a secure-capable VM, just for the fun of it. And that breaks
> > vhost-vsock. Or is setting iommu_platform=on only valid if
> > qemu-system-s390x is protected virtualization capable?
> > 
> > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > recommend setting iommu_platform, and thus I don't think we care too
> > much about past qemus having problems with it.
> > 
> > Regards,
> > Halil
> 
> 
> Let's just say if we do have a Fixes: tag we want to set it correctly to
> the commit that needs this fix.
> 

You are absolutely correct. And c471ad0e9bd46 has nothing to do with
vsock. On s390x the situation with virtio-net + vhost + iommu_platform=on
seems rather complex. I did some digging, but I have no conclusive
results yet. And I don't think we care all that much about iommu_platform=on
for old qemus.

Regards,
Halil
Halil Pasic March 13, 2020, 12:44 p.m. UTC | #15
[..]
> > 
> > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > 
> > Also, one can specify iommu_platform=on on a device that ain't a part of
> > a secure-capable VM, just for the fun of it. And that breaks
> > vhost-vsock. Or is setting iommu_platform=on only valid if
> > qemu-system-s390x is protected virtualization capable?
> > 
> > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > recommend setting iommu_platform, and thus I don't think we care too
> > much about past qemus having problems with it.
> > 
> > Regards,
> > Halil
> 
> 
> Let's just say if we do have a Fixes: tag we want to set it correctly to
> the commit that needs this fix.
> 

I finally did some digging regarding the performance degradation. For
s390x the performance degradation on vhost-net was introduced by commit
076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
calculated as the rest of the memory regions size (from address), and
covered most of the guest address space. That is we didn't have a whole
lot of IOTLB API overhead.

With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
076a93d797 and 076a93d797~1.
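
(For context, QEMU's IOMMUTLBEntry looks roughly like the sketch below;
with addr_mask == 0xfff every entry handed back to vhost covers only a
4 KiB page, so the backend goes through an IOTLB miss/update round-trip
per page instead of resolving most of guest memory in one transaction.
Field names quoted from memory, for illustration only.)

    struct IOMMUTLBEntry {
        AddressSpace    *target_as;
        hwaddr           iova;
        hwaddr           translated_addr;
        hwaddr           addr_mask;   /* 0xfff == 4 KiB translation */
        IOMMUAccessFlags perm;
    };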

Regarding vhost-vsock: it has not worked with iommu_platform=on since the
very beginning (i.e. 8607f5c307 ("virtio: convert to use DMA api")). Not
sure if that is a good or a bad thing. (If the vhost driver in the kernel
actually had to do the IOTLB translation, then failing in the case
where it does not support it would seem sane. The problem is that
ACCESS_PLATFORM is used for more than one thing: the need for translation,
and restricted access to memory.)

I don't think I've heard back from AMD whether vsock works with SEV or
not... I don't have access to HW to test it myself.

We (s390) don't require this to be backported to the stable qemus,
because for us iommu_platform=on becomes relevant with protected
virtualization, and those qemu versions don't support it.

Cheers,
Halil
Michael S. Tsirkin March 13, 2020, 3:29 p.m. UTC | #16
On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> [..]
> > > 
> > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > 
> > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > a secure-capable VM, just for the fun of it. And that breaks
> > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > qemu-system-s390x is protected virtualization capable?
> > > 
> > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > recommend setting iommu_platform, and thus I don't think we care too
> > > much about past qemus having problems with it.
> > > 
> > > Regards,
> > > Halil
> > 
> > 
> > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > the commit that needs this fix.
> > 
> 
> I finally did some digging regarding the performance degradation. For
> s390x the performance degradation on vhost-net was introduced by commit
> 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> calculated as the rest of the memory regions size (from address), and
> covered most of the guest address space. That is we didn't have a whole
> lot of IOTLB API overhead.
> 
> With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> 076a93d797 and 076a93d797~1.

Peter, what's your take on this one?


> Regarding vhost-vsock. It does not work with iommu_platform=on since the
> very beginning (i.e. 8607f5c307 ("virtio: convert to use DMA api")). Not
> sure if that is a good or a bad thing. (If the vhost driver in the kernel
> would actually have to do the IOTLB translation, then failing in case
> where it does not support it seems sane. The problem is that
> ACCESS_PLATFORM is used for more than one thing (needs translation, and
> restricted memory access).)
> 
> I don't think I've heard back from AMD whether vsock works with SEV or
> not... I don't have access to HW to test it myself.
> 
> We (s390) don't require this being backported to the stable qemus,
> because for us iommu_platform=on becomes relevant with protected
> virtualization, and those qemu versions don't support it.
> 
> Cheers,
> Halil
Peter Xu March 13, 2020, 4:31 p.m. UTC | #17
On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > [..]
> > > > 
> > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > 
> > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > qemu-system-s390x is protected virtualization capable?
> > > > 
> > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > much about past qemus having problems with it.
> > > > 
> > > > Regards,
> > > > Halil
> > > 
> > > 
> > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > the commit that needs this fix.
> > > 
> > 
> > I finally did some digging regarding the performance degradation. For
> > s390x the performance degradation on vhost-net was introduced by commit
> > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > calculated as the rest of the memory regions size (from address), and
> > covered most of the guest address space. That is we didn't have a whole
> > lot of IOTLB API overhead.
> > 
> > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > 076a93d797 and 076a93d797~1.
> 
> Peter, what's your take on this one?

Commit 076a93d797 was part of a patchset where we wanted to provide
sensible IOTLB entries that would also start to work with huge
pages.  Frankly speaking, after a few years I have forgotten the original
motivation for the whole thing, but IIRC there was a patch trying to
speed things up especially for vhost, which I noticed was never merged:

https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html

Regarding the current patch, I'm not sure I understand it
correctly, but does the performance issue only happen when (1) there's
no intel-iommu device, and (2) iommu_platform=on is specified
for the vhost backend?

If so, I'd confess I am not too surprised that this fails the boot with
vhost-vsock, because after all we specified iommu_platform=on
explicitly on the cmdline, so if we want it to work we can simply
remove that iommu_platform=on while vhost-vsock doesn't support it
yet...  I thought iommu_platform=on was added for that case - when we
want to force the IOMMU to be enabled from the host side, and it should
always be used with a vIOMMU device.

However, I also agree that from a performance POV this patch helps for
this quite special case.

Thanks,
Brijesh Singh March 13, 2020, 8:27 p.m. UTC | #18
On 3/13/20 7:44 AM, Halil Pasic wrote:
> [..]
>>> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
>>>
>>> Also, one can specify iommu_platform=on on a device that ain't a part of
>>> a secure-capable VM, just for the fun of it. And that breaks
>>> vhost-vsock. Or is setting iommu_platform=on only valid if
>>> qemu-system-s390x is protected virtualization capable?
>>>
>>> BTW, I don't have a strong opinion on the fixes tag. We currently do not
>>> recommend setting iommu_platform, and thus I don't think we care too
>>> much about past qemus having problems with it.
>>>
>>> Regards,
>>> Halil
>>
>> Let's just say if we do have a Fixes: tag we want to set it correctly to
>> the commit that needs this fix.
>>
> I finally did some digging regarding the performance degradation. For
> s390x the performance degradation on vhost-net was introduced by commit
> 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> calculated as the rest of the memory regions size (from address), and
> covered most of the guest address space. That is we didn't have a whole
> lot of IOTLB API overhead.
>
> With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> 076a93d797 and 076a93d797~1.
>
> Regarding vhost-vsock. It does not work with iommu_platform=on since the
> very beginning (i.e. 8607f5c307 ("virtio: convert to use DMA api")). Not
> sure if that is a good or a bad thing. (If the vhost driver in the kernel
> would actually have to do the IOTLB translation, then failing in case
> where it does not support it seems sane. The problem is that
> ACCESS_PLATFORM is used for more than one thing (needs translation, and
> restricted memory access).)
>
> I don't think I've heard back from AMD whether vsock works with SEV or
> not... I don't have access to HW to test it myself.


I just tried vhost-vsock on an AMD SEV machine and it does not work. I am
using FC31 (qemu 4.1.1.1.fc31).


> We (s390) don't require this being backported to the stable qemus,
> because for us iommu_platform=on becomes relevant with protected
> virtualization, and those qemu versions don't support it.
>
> Cheers,
> Halil
>
Michael S. Tsirkin March 14, 2020, 6:22 p.m. UTC | #19
On Fri, Mar 13, 2020 at 03:27:59PM -0500, Brijesh Singh wrote:
> 
> On 3/13/20 7:44 AM, Halil Pasic wrote:
> > [..]
> >>> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> >>>
> >>> Also, one can specify iommu_platform=on on a device that ain't a part of
> >>> a secure-capable VM, just for the fun of it. And that breaks
> >>> vhost-vsock. Or is setting iommu_platform=on only valid if
> >>> qemu-system-s390x is protected virtualization capable?
> >>>
> >>> BTW, I don't have a strong opinion on the fixes tag. We currently do not
> >>> recommend setting iommu_platform, and thus I don't think we care too
> >>> much about past qemus having problems with it.
> >>>
> >>> Regards,
> >>> Halil
> >>
> >> Let's just say if we do have a Fixes: tag we want to set it correctly to
> >> the commit that needs this fix.
> >>
> > I finally did some digging regarding the performance degradation. For
> > s390x the performance degradation on vhost-net was introduced by commit
> > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > calculated as the rest of the memory regions size (from address), and
> > covered most of the guest address space. That is we didn't have a whole
> > lot of IOTLB API overhead.
> >
> > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > 076a93d797 and 076a93d797~1.
> >
> > Regarding vhost-vsock. It does not work with iommu_platform=on since the
> > very beginning (i.e. 8607f5c307 ("virtio: convert to use DMA api")). Not
> > sure if that is a good or a bad thing. (If the vhost driver in the kernel
> > would actually have to do the IOTLB translation, then failing in case
> > where it does not support it seems sane. The problem is that
> > ACCESS_PLATFORM is used for more than one thing (needs translation, and
> > restricted memory access).)
> >
> > I don't think I've heard back from AMD whether vsock works with SEV or
> > not... I don't have access to HW to test it myself.
> 
> 
> I just tried vhost-vsock on AMD SEV machine and it does not work. I am
> using FC31 (qemu 4.1.1.1.fc31).

Neither does vhost scsi - no ACCESS_PLATFORM support. But with Jason's
patch I think both should work. Pls give it a try.

> 
> > We (s390) don't require this being backported to the stable qemus,
> > because for us iommu_platform=on becomes relevant with protected
> > virtualization, and those qemu versions don't support it.
> >
> > Cheers,
> > Halil
> >
Halil Pasic March 16, 2020, 4:57 p.m. UTC | #20
On Fri, 13 Mar 2020 12:31:22 -0400
Peter Xu <peterx@redhat.com> wrote:

> On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > [..]
> > > > > 
> > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > 
> > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > qemu-system-s390x is protected virtualization capable?
> > > > > 
> > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > much about past qemus having problems with it.
> > > > > 
> > > > > Regards,
> > > > > Halil
> > > > 
> > > > 
> > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > the commit that needs this fix.
> > > > 
> > > 
> > > I finally did some digging regarding the performance degradation. For
> > > s390x the performance degradation on vhost-net was introduced by commit
> > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > calculated as the rest of the memory regions size (from address), and
> > > covered most of the guest address space. That is we didn't have a whole
> > > lot of IOTLB API overhead.
> > > 
> > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > 076a93d797 and 076a93d797~1.
> > 
> > Peter, what's your take on this one?
> 
> Commit 076a93d797 was one of the patchset where we want to provide
> sensible IOTLB entries and also that should start to work with huge
> pages.  Frankly speaking after a few years I forgot the original
> motivation of that whole thing, but IIRC there's a patch that was
> trying to speedup especially for vhost but I noticed it's not merged:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
> 

From the looks of it, I don't think we would have seen that big a
performance degradation had this patch been included. I can give
it a spin if you like. Shall I?

> Regarding to the current patch, I'm not sure I understand it
> correctly, but is that performance issue only happens when (1) there's
> no intel-iommu device, and (2) there is iommu_platform=on specified
> for the vhost backend?
> 

I can confirm that your description covers my scenario. I didn't
investigate what happens when we have an intel-iommu, because s390 does
not do intel-iommu. I can also confirm that no performance degradation
is observed when the virtio-net has iommu_platform=off. The property
iommu_platform is a virtio device (and not a backend) level property.
 

> If so, I'd confess I am not too surprised if this fails the boot with
> vhost-vsock because after all we speicified iommu_platform=on
> explicitly in the cmdline, so if we want it to work we can simply
> remove that iommu_platform=on when vhost-vsock doesn't support it
> yet...  I thougth iommu_platform=on was added for that case - when we
> want to force IOMMU to be enabled from host side, and it should always
> be used with a vIOMMU device.
> 

The problem is that the virtio feature bit F_ACCESS_PLATFORM, which is
directly controlled by the iommu_platform property, stands for two things:
1) the need to do IOVA translation
2) the device's access to the guest's RAM is restricted.

There are cases where 2) does apply and 1) does not. We need to specify
iommu_platform=on to make the virtio implementation in the guest use
the DMA API, because we need to grant access to memory as required. But
we don't need translation and we don't have a vIOMMU.
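
(To make point 2) concrete: in a Linux guest the decision to route virtio
buffers through the DMA API hinges on this one feature bit, regardless of
whether a vIOMMU is present. A simplified sketch of the guest-side check,
from memory, with details varying across kernel versions:)

    /* drivers/virtio/virtio_ring.c, heavily simplified */
    static bool vring_use_dma_api(struct virtio_device *vdev)
    {
            /* F_ACCESS_PLATFORM/F_IOMMU_PLATFORM negotiated -> use the
             * DMA API, whether that means IOVA translation, bounce
             * buffering, or sharing pages with the hypervisor. */
            if (!virtio_has_iommu_quirk(vdev))
                    return true;

            return false;
    }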

Regards,
Halil


> However I also agree that from performance POV this patch helps for
> this quite special case.
> 
> Thanks,
>
Michael S. Tsirkin March 16, 2020, 5:19 p.m. UTC | #21
On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
> On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > [..]
> > > > > 
> > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > 
> > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > qemu-system-s390x is protected virtualization capable?
> > > > > 
> > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > much about past qemus having problems with it.
> > > > > 
> > > > > Regards,
> > > > > Halil
> > > > 
> > > > 
> > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > the commit that needs this fix.
> > > > 
> > > 
> > > I finally did some digging regarding the performance degradation. For
> > > s390x the performance degradation on vhost-net was introduced by commit
> > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > calculated as the rest of the memory regions size (from address), and
> > > covered most of the guest address space. That is we didn't have a whole
> > > lot of IOTLB API overhead.
> > > 
> > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > 076a93d797 and 076a93d797~1.
> > 
> > Peter, what's your take on this one?
> 
> Commit 076a93d797 was one of the patchset where we want to provide
> sensible IOTLB entries and also that should start to work with huge
> pages.

So the issue fundamentally is that it
never produces entries larger than page size.

Wasteful even just with huge pages, all the more
so with passthrough, which could have gigabyte
entries.

Want to try fixing that?


>  Frankly speaking after a few years I forgot the original
> motivation of that whole thing, but IIRC there's a patch that was
> trying to speedup especially for vhost but I noticed it's not merged:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
> 
> Regarding to the current patch, I'm not sure I understand it
> correctly, but is that performance issue only happens when (1) there's
> no intel-iommu device, and (2) there is iommu_platform=on specified
> for the vhost backend?
> 
> If so, I'd confess I am not too surprised if this fails the boot with
> vhost-vsock because after all we speicified iommu_platform=on
> explicitly in the cmdline, so if we want it to work we can simply
> remove that iommu_platform=on when vhost-vsock doesn't support it
> yet...  I thougth iommu_platform=on was added for that case - when we
> want to force IOMMU to be enabled from host side, and it should always
> be used with a vIOMMU device.
> 
> However I also agree that from performance POV this patch helps for
> this quite special case.
> 
> Thanks,
> 
> -- 
> Peter Xu
Peter Xu March 16, 2020, 5:31 p.m. UTC | #22
On Mon, Mar 16, 2020 at 05:57:37PM +0100, Halil Pasic wrote:
> On Fri, 13 Mar 2020 12:31:22 -0400
> Peter Xu <peterx@redhat.com> wrote:
> 
> > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > > [..]
> > > > > > 
> > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > > 
> > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > > qemu-system-s390x is protected virtualization capable?
> > > > > > 
> > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > > much about past qemus having problems with it.
> > > > > > 
> > > > > > Regards,
> > > > > > Halil
> > > > > 
> > > > > 
> > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > > the commit that needs this fix.
> > > > > 
> > > > 
> > > > I finally did some digging regarding the performance degradation. For
> > > > s390x the performance degradation on vhost-net was introduced by commit
> > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > > calculated as the rest of the memory regions size (from address), and
> > > > covered most of the guest address space. That is we didn't have a whole
> > > > lot of IOTLB API overhead.
> > > > 
> > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > > 076a93d797 and 076a93d797~1.
> > > 
> > > Peter, what's your take on this one?
> > 
> > Commit 076a93d797 was one of the patchset where we want to provide
> > sensible IOTLB entries and also that should start to work with huge
> > pages.  Frankly speaking after a few years I forgot the original
> > motivation of that whole thing, but IIRC there's a patch that was
> > trying to speedup especially for vhost but I noticed it's not merged:
> > 
> > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html

[1]

> > 
> 
> From the looks of it, I don't think we would have seen that big
> performance degradation had this patch been included. I can give
> it a spin if you like. Shall I?
> 
> > Regarding to the current patch, I'm not sure I understand it
> > correctly, but is that performance issue only happens when (1) there's
> > no intel-iommu device, and (2) there is iommu_platform=on specified
> > for the vhost backend?
> > 
> 
> I can confirm, that your description covers my scenario. I didn't
> investigate what happens when we have an intel-iommu, because s390 does
> not do intel-iommu. I can also confirm that no performance degradation
> is observed when the virtio-net has iommu_platform=off. The property
> iommu_platform is a virtio device (and not a backend) level property.
>  
> 
> > If so, I'd confess I am not too surprised if this fails the boot with
> > vhost-vsock because after all we speicified iommu_platform=on
> > explicitly in the cmdline, so if we want it to work we can simply
> > remove that iommu_platform=on when vhost-vsock doesn't support it
> > yet...  I thougth iommu_platform=on was added for that case - when we
> > want to force IOMMU to be enabled from host side, and it should always
> > be used with a vIOMMU device.
> > 
> 
> The problem is that the virtio feature bit F_ACCESS_PLATFORM, which is
> directly controlled by the iommu_platform proprerty stands for two things
> 1) need to do IOVA translation
> 2) the access of the device to the guests RAM is restricted.
> 
> There are cases where 2) does apply and 1) does not. We need to specify
> iommu_platform=on to make the virtio implementation in the guest use
> the dma api, because we need to grant access to memory as required. But
> we don't need translation and we don't have a vIOMMU.

I see the point of this patch now.  I'm still unclear on how s390
handles DMA protection, but it seems totally different from the
IOMMU model on x86/arm.  Considering this, please ignore the above
patch [1], since playing with the IOTLB caches like that is hackish in
any case; the current patch should be much better (and easier) IMHO.

Thanks,
Peter Xu March 16, 2020, 6:14 p.m. UTC | #23
On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
> On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
> > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > > [..]
> > > > > > 
> > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > > 
> > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > > qemu-system-s390x is protected virtualization capable?
> > > > > > 
> > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > > much about past qemus having problems with it.
> > > > > > 
> > > > > > Regards,
> > > > > > Halil
> > > > > 
> > > > > 
> > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > > the commit that needs this fix.
> > > > > 
> > > > 
> > > > I finally did some digging regarding the performance degradation. For
> > > > s390x the performance degradation on vhost-net was introduced by commit
> > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > > calculated as the rest of the memory regions size (from address), and
> > > > covered most of the guest address space. That is we didn't have a whole
> > > > lot of IOTLB API overhead.
> > > > 
> > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > > 076a93d797 and 076a93d797~1.
> > > 
> > > Peter, what's your take on this one?
> > 
> > Commit 076a93d797 was one of the patchset where we want to provide
> > sensible IOTLB entries and also that should start to work with huge
> > pages.
> 
> So the issue bundamentally is that it
> never produces entries larger than page size.
> 
> Wasteful even just with huge pages, all the more
> so which passthrough which could have giga-byte
> entries.
> 
> Want to try fixing that?

Yes, we can fix that, but I'm still not sure whether changing the
interface of address_space_get_iotlb_entry() to cover ad hoc regions is
a good idea: it is still a memory core API, and IMHO it would still be
good for the returned IOTLBs to match what the hardware will be using
(always page-aligned IOTLBs).  It would also still not be ideal,
because the vhost backend would still need to send the MISSING
messages and block for each of the contiguous guest memory ranges
registered, so there would still be a mysterious delay.  Not to
mention that logically all the caches can be invalidated too, so in
that sense I think it's as hacky as the vhost speedup patch mentioned
below.

Ideally I think vhost should be able to know when PT is enabled or
disabled for the device, so that the vhost backend (kernel or
userspace) can use GPA directly for DMA.  That might need some new
vhost interface.
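
Purely to illustrate the cost being discussed, here is a small,
self-contained C sketch (not QEMU or vhost code; the buffer size and
masks are made up for the example) counting how many blocking
miss/update round trips a backend needs to cover one buffer for a
given addr_mask, e.g. the 0xfff (~TARGET_PAGE_MASK) case mentioned
above versus a mask spanning a 1 GiB region:

/* Hypothetical illustration only: one IOTLB entry covers
 * addr_mask + 1 bytes, and each missing entry costs one blocking
 * miss/update round trip between the backend and QEMU. */
#include <stdio.h>
#include <stdint.h>

static uint64_t round_trips(uint64_t buf_len, uint64_t addr_mask)
{
    uint64_t entry_size = addr_mask + 1;            /* bytes per entry */
    return (buf_len + entry_size - 1) / entry_size; /* one round trip each */
}

int main(void)
{
    uint64_t buf = 1ULL << 20;                      /* a 1 MiB guest buffer */
    printf("addr_mask 0xfff      -> %llu round trips\n",
           (unsigned long long)round_trips(buf, 0xfffULL));
    printf("addr_mask 0x3fffffff -> %llu round trips\n",
           (unsigned long long)round_trips(buf, 0x3fffffffULL));
    return 0;
}

So a single 1 MiB buffer costs 256 round trips with 4 KiB entries and
one with a 1 GiB entry, which is the per-page overhead described above.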

For the s390-specific issue, I would consider Jason's patch a simple
and ideal solution already.

Thanks,

> 
> 
> >  Frankly speaking after a few years I forgot the original
> > motivation of that whole thing, but IIRC there's a patch that was
> > trying to speedup especially for vhost but I noticed it's not merged:
> > 
> > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
> > 
> > Regarding to the current patch, I'm not sure I understand it
> > correctly, but is that performance issue only happens when (1) there's
> > no intel-iommu device, and (2) there is iommu_platform=on specified
> > for the vhost backend?
> > 
> > If so, I'd confess I am not too surprised if this fails the boot with
> > vhost-vsock because after all we speicified iommu_platform=on
> > explicitly in the cmdline, so if we want it to work we can simply
> > remove that iommu_platform=on when vhost-vsock doesn't support it
> > yet...  I thougth iommu_platform=on was added for that case - when we
> > want to force IOMMU to be enabled from host side, and it should always
> > be used with a vIOMMU device.
> > 
> > However I also agree that from performance POV this patch helps for
> > this quite special case.
> > 
> > Thanks,
> > 
> > -- 
> > Peter Xu
>
Jason Wang March 17, 2020, 3:04 a.m. UTC | #24
On 2020/3/17 2:14 AM, Peter Xu wrote:
> On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
>> On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
>>> On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
>>>> On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
>>>>> [..]
>>>>>>> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
>>>>>>>
>>>>>>> Also, one can specify iommu_platform=on on a device that ain't a part of
>>>>>>> a secure-capable VM, just for the fun of it. And that breaks
>>>>>>> vhost-vsock. Or is setting iommu_platform=on only valid if
>>>>>>> qemu-system-s390x is protected virtualization capable?
>>>>>>>
>>>>>>> BTW, I don't have a strong opinion on the fixes tag. We currently do not
>>>>>>> recommend setting iommu_platform, and thus I don't think we care too
>>>>>>> much about past qemus having problems with it.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Halil
>>>>>> Let's just say if we do have a Fixes: tag we want to set it correctly to
>>>>>> the commit that needs this fix.
>>>>>>
>>>>> I finally did some digging regarding the performance degradation. For
>>>>> s390x the performance degradation on vhost-net was introduced by commit
>>>>> 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
>>>>> IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
>>>>> calculated as the rest of the memory regions size (from address), and
>>>>> covered most of the guest address space. That is we didn't have a whole
>>>>> lot of IOTLB API overhead.
>>>>>
>>>>> With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
>>>>> as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
>>>>> properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
>>>>> 076a93d797 and 076a93d797~1.
>>>> Peter, what's your take on this one?
>>> Commit 076a93d797 was one of the patchset where we want to provide
>>> sensible IOTLB entries and also that should start to work with huge
>>> pages.
>> So the issue bundamentally is that it
>> never produces entries larger than page size.
>>
>> Wasteful even just with huge pages, all the more
>> so which passthrough which could have giga-byte
>> entries.
>>
>> Want to try fixing that?
> Yes we can fix that, but I'm still not sure whether changing the
> interface of address_space_get_iotlb_entry() to cover adhoc regions is
> a good idea, because I think it's still a memory core API and imho it
> would still be good to have IOTLBs returned to be what the hardware
> will be using (always page aligned IOTLBs).  Also it would still be
> not ideal because vhost backend will still need to send the MISSING
> messages and block for each of the continuous guest memory ranges
> registered, so there will still be misterious delay.  Not to say
> logically all the caches can be invalidated too so in that sense I
> think it's as hacky as the vhost speedup patch mentioned below..
>
> Ideally I think vhost should be able to know when PT is enabled or
> disabled for the device, so the vhost backend (kernel or userspace)
> should be able to directly use GPA for DMA.  That might need some new
> vhost interface.


Yes, but I think we don't need another API, since we can send GPA->HVA
mappings via the device IOTLB API when we find there's no DMA
translation at all (either PT or no vIOMMU).

Vhost doesn't need to know whether an address is an IOVA (vIOMMU), a
GPA (no vIOMMU), or even an HVA (dpdk virtio-user).
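
As a rough, hypothetical sketch of that idea (the iotlb_update and
ram_block structures below are made-up stand-ins, not the real vhost
uAPI or QEMU types, and the addresses are invented): when there is no
translation, QEMU could pre-seed the backend with one update per RAM
block, using the GPA as the IOVA and the block's HVA as the target, so
no runtime misses are needed.

#include <stdio.h>
#include <stdint.h>

/* Made-up stand-ins for illustration only. */
struct iotlb_update { uint64_t iova, size, uaddr; };
struct ram_block    { uint64_t gpa, size; void *host; };

static void send_update(const struct iotlb_update *u)
{
    /* A real backend would receive this via the vhost device IOTLB
     * channel; here we just print it. */
    printf("update: iova=0x%llx size=0x%llx uaddr=0x%llx\n",
           (unsigned long long)u->iova, (unsigned long long)u->size,
           (unsigned long long)u->uaddr);
}

/* No vIOMMU / passthrough: IOVA == GPA, so one entry per RAM block
 * covers the whole block up front. */
static void preseed(const struct ram_block *blocks, int n)
{
    for (int i = 0; i < n; i++) {
        struct iotlb_update u = {
            .iova  = blocks[i].gpa,
            .size  = blocks[i].size,
            .uaddr = (uint64_t)(uintptr_t)blocks[i].host,
        };
        send_update(&u);
    }
}

int main(void)
{
    static char backing[2][4096];
    struct ram_block blocks[2] = {
        { 0x00000000ULL, sizeof(backing[0]), backing[0] },
        { 0x80000000ULL, sizeof(backing[1]), backing[1] },
    };
    preseed(blocks, 2);
    return 0;
}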

Thanks


>
> For the s390's specific issue, I would think Jason's patch an simple
> and ideal solution already.
>
> Thanks,
>
Michael S. Tsirkin March 17, 2020, 6:28 a.m. UTC | #25
On Mon, Mar 16, 2020 at 02:14:05PM -0400, Peter Xu wrote:
> On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
> > On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
> > > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > > > [..]
> > > > > > > 
> > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > > > 
> > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > > > qemu-system-s390x is protected virtualization capable?
> > > > > > > 
> > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > > > much about past qemus having problems with it.
> > > > > > > 
> > > > > > > Regards,
> > > > > > > Halil
> > > > > > 
> > > > > > 
> > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > > > the commit that needs this fix.
> > > > > > 
> > > > > 
> > > > > I finally did some digging regarding the performance degradation. For
> > > > > s390x the performance degradation on vhost-net was introduced by commit
> > > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > > > calculated as the rest of the memory regions size (from address), and
> > > > > covered most of the guest address space. That is we didn't have a whole
> > > > > lot of IOTLB API overhead.
> > > > > 
> > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > > > 076a93d797 and 076a93d797~1.
> > > > 
> > > > Peter, what's your take on this one?
> > > 
> > > Commit 076a93d797 was one of the patchset where we want to provide
> > > sensible IOTLB entries and also that should start to work with huge
> > > pages.
> > 
> > So the issue bundamentally is that it
> > never produces entries larger than page size.
> > 
> > Wasteful even just with huge pages, all the more
> > so which passthrough which could have giga-byte
> > entries.
> > 
> > Want to try fixing that?
> 
> Yes we can fix that, but I'm still not sure whether changing the
> interface of address_space_get_iotlb_entry() to cover adhoc regions is
> a good idea, because I think it's still a memory core API and imho it
> would still be good to have IOTLBs returned to be what the hardware
> will be using (always page aligned IOTLBs).

E.g. with virtio-iommu, there's no hardware in sight.
Even with e.g. VT-d, page aligned does not mean TARGET_PAGE;
it can be much bigger.

>  Also it would still be
> not ideal because vhost backend will still need to send the MISSING
> messages and block for each of the continuous guest memory ranges
> registered, so there will still be misterious delay.  Not to say
> logically all the caches can be invalidated too so in that sense I
> think it's as hacky as the vhost speedup patch mentioned below..
> 
> Ideally I think vhost should be able to know when PT is enabled or
> disabled for the device, so the vhost backend (kernel or userspace)
> should be able to directly use GPA for DMA.  That might need some new
> vhost interface.
> 
> For the s390's specific issue, I would think Jason's patch an simple
> and ideal solution already.
> 
> Thanks,
> 
> > 
> > 
> > >  Frankly speaking after a few years I forgot the original
> > > motivation of that whole thing, but IIRC there's a patch that was
> > > trying to speedup especially for vhost but I noticed it's not merged:
> > > 
> > > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
> > > 
> > > Regarding to the current patch, I'm not sure I understand it
> > > correctly, but is that performance issue only happens when (1) there's
> > > no intel-iommu device, and (2) there is iommu_platform=on specified
> > > for the vhost backend?
> > > 
> > > If so, I'd confess I am not too surprised if this fails the boot with
> > > vhost-vsock because after all we speicified iommu_platform=on
> > > explicitly in the cmdline, so if we want it to work we can simply
> > > remove that iommu_platform=on when vhost-vsock doesn't support it
> > > yet...  I thougth iommu_platform=on was added for that case - when we
> > > want to force IOMMU to be enabled from host side, and it should always
> > > be used with a vIOMMU device.
> > > 
> > > However I also agree that from performance POV this patch helps for
> > > this quite special case.
> > > 
> > > Thanks,
> > > 
> > > -- 
> > > Peter Xu
> > 
> 
> -- 
> Peter Xu
Peter Xu March 17, 2020, 2:13 p.m. UTC | #26
On Tue, Mar 17, 2020 at 11:04:26AM +0800, Jason Wang wrote:
> 
> On 2020/3/17 上午2:14, Peter Xu wrote:
> > On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
> > > > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > > > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > > > > [..]
> > > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > > > > 
> > > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > > > > qemu-system-s390x is protected virtualization capable?
> > > > > > > > 
> > > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > > > > much about past qemus having problems with it.
> > > > > > > > 
> > > > > > > > Regards,
> > > > > > > > Halil
> > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > > > > the commit that needs this fix.
> > > > > > > 
> > > > > > I finally did some digging regarding the performance degradation. For
> > > > > > s390x the performance degradation on vhost-net was introduced by commit
> > > > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > > > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > > > > calculated as the rest of the memory regions size (from address), and
> > > > > > covered most of the guest address space. That is we didn't have a whole
> > > > > > lot of IOTLB API overhead.
> > > > > > 
> > > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > > > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > > > > 076a93d797 and 076a93d797~1.
> > > > > Peter, what's your take on this one?
> > > > Commit 076a93d797 was one of the patchset where we want to provide
> > > > sensible IOTLB entries and also that should start to work with huge
> > > > pages.
> > > So the issue bundamentally is that it
> > > never produces entries larger than page size.
> > > 
> > > Wasteful even just with huge pages, all the more
> > > so which passthrough which could have giga-byte
> > > entries.
> > > 
> > > Want to try fixing that?
> > Yes we can fix that, but I'm still not sure whether changing the
> > interface of address_space_get_iotlb_entry() to cover adhoc regions is
> > a good idea, because I think it's still a memory core API and imho it
> > would still be good to have IOTLBs returned to be what the hardware
> > will be using (always page aligned IOTLBs).  Also it would still be
> > not ideal because vhost backend will still need to send the MISSING
> > messages and block for each of the continuous guest memory ranges
> > registered, so there will still be misterious delay.  Not to say
> > logically all the caches can be invalidated too so in that sense I
> > think it's as hacky as the vhost speedup patch mentioned below..
> > 
> > Ideally I think vhost should be able to know when PT is enabled or
> > disabled for the device, so the vhost backend (kernel or userspace)
> > should be able to directly use GPA for DMA.  That might need some new
> > vhost interface.
> 
> 
> Yes but I think we don't need another API since we can send GPA->HVA mapping
> via device IOTLB API when we find there's no DMA translation at all (either
> PT or no vIOMMU).

Jason,

Do you mean what we've worked on before?

https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html

(I just read the previous discussion on that patch, it seems to be
 exactly what we've discussed again...)

Thanks,

> 
> Vhost doesn't need to know whether an address is an IOVA (vIOMMU) , GPA (no
> vIOMMU), or even HVA (dpdk virtio-user).
> 
> Thanks
> 
> 
> > 
> > For the s390's specific issue, I would think Jason's patch an simple
> > and ideal solution already.
> > 
> > Thanks,
> > 
>
Peter Xu March 17, 2020, 2:39 p.m. UTC | #27
On Tue, Mar 17, 2020 at 02:28:42AM -0400, Michael S. Tsirkin wrote:
> On Mon, Mar 16, 2020 at 02:14:05PM -0400, Peter Xu wrote:
> > On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
> > > > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > > > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > > > > [..]
> > > > > > > > 
> > > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > > > > 
> > > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > > > > qemu-system-s390x is protected virtualization capable?
> > > > > > > > 
> > > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > > > > much about past qemus having problems with it.
> > > > > > > > 
> > > > > > > > Regards,
> > > > > > > > Halil
> > > > > > > 
> > > > > > > 
> > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > > > > the commit that needs this fix.
> > > > > > > 
> > > > > > 
> > > > > > I finally did some digging regarding the performance degradation. For
> > > > > > s390x the performance degradation on vhost-net was introduced by commit
> > > > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > > > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > > > > calculated as the rest of the memory regions size (from address), and
> > > > > > covered most of the guest address space. That is we didn't have a whole
> > > > > > lot of IOTLB API overhead.
> > > > > > 
> > > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > > > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > > > > 076a93d797 and 076a93d797~1.
> > > > > 
> > > > > Peter, what's your take on this one?
> > > > 
> > > > Commit 076a93d797 was one of the patchset where we want to provide
> > > > sensible IOTLB entries and also that should start to work with huge
> > > > pages.
> > > 
> > > So the issue bundamentally is that it
> > > never produces entries larger than page size.
> > > 
> > > Wasteful even just with huge pages, all the more
> > > so which passthrough which could have giga-byte
> > > entries.
> > > 
> > > Want to try fixing that?
> > 
> > Yes we can fix that, but I'm still not sure whether changing the
> > interface of address_space_get_iotlb_entry() to cover adhoc regions is
> > a good idea, because I think it's still a memory core API and imho it
> > would still be good to have IOTLBs returned to be what the hardware
> > will be using (always page aligned IOTLBs).
> 
> E.g. with virtio-iommu, there's no hardware in sight.
> Even with e.g. VTD page aligned does not mean TARGET_PAGE,
> can be much bigger.

Right. Sorry to be unclear, but I meant that the emulated device (in
this case, for x86, VT-d) should follow the hardware.  Here the page
mask is decided by VT-d in vtd_iommu_translate(), and for PT mode it
is 4K only.  For another example, the ARM SMMU does a similar thing
(returning PAGE_SIZE when PT is enabled; see smmuv3_translate()).
That actually makes sense to me.  On the other hand, I'm not sure
whether there are side effects if we change this to cover the whole
address space for PT.
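
To make that concrete, here is a hypothetical, self-contained sketch
(the struct below is a stand-in whose fields merely mirror the
IOMMUTLBEntry fields discussed in this thread; it is not QEMU code)
contrasting a passthrough translation returned one page at a time with
a single identity entry spanning a whole region:

#include <stdio.h>
#include <stdint.h>

typedef uint64_t hwaddr;

/* Stand-in for the entry shape discussed above. */
typedef struct {
    hwaddr iova;
    hwaddr translated_addr;
    hwaddr addr_mask;    /* the entry covers addr_mask + 1 bytes */
} TlbEntrySketch;

/* PT today: identity translation, but only one 4K page at a time. */
static TlbEntrySketch pt_per_page(hwaddr addr)
{
    TlbEntrySketch e = { addr & ~0xfffULL, addr & ~0xfffULL, 0xfffULL };
    return e;
}

/* Hypothetical alternative: one identity entry for a whole
 * power-of-two sized region. */
static TlbEntrySketch pt_whole_region(hwaddr base, hwaddr size)
{
    TlbEntrySketch e = { base, base, size - 1 };
    return e;
}

int main(void)
{
    TlbEntrySketch a = pt_per_page(0x12345678ULL);
    TlbEntrySketch b = pt_whole_region(0, 1ULL << 30);  /* 1 GiB */
    printf("per-page entry covers     %llu bytes\n",
           (unsigned long long)(a.addr_mask + 1));
    printf("whole-region entry covers %llu bytes\n",
           (unsigned long long)(b.addr_mask + 1));
    return 0;
}

The side-effect concern above is essentially whether callers of the
translate callback can cope with entries of the second kind instead of
page-sized ones.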

Thanks,

> 
> >  Also it would still be
> > not ideal because vhost backend will still need to send the MISSING
> > messages and block for each of the continuous guest memory ranges
> > registered, so there will still be misterious delay.  Not to say
> > logically all the caches can be invalidated too so in that sense I
> > think it's as hacky as the vhost speedup patch mentioned below..
> > 
> > Ideally I think vhost should be able to know when PT is enabled or
> > disabled for the device, so the vhost backend (kernel or userspace)
> > should be able to directly use GPA for DMA.  That might need some new
> > vhost interface.
> > 
> > For the s390's specific issue, I would think Jason's patch an simple
> > and ideal solution already.
> > 
> > Thanks,
> > 
> > > 
> > > 
> > > >  Frankly speaking after a few years I forgot the original
> > > > motivation of that whole thing, but IIRC there's a patch that was
> > > > trying to speedup especially for vhost but I noticed it's not merged:
> > > > 
> > > > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
> > > > 
> > > > Regarding to the current patch, I'm not sure I understand it
> > > > correctly, but is that performance issue only happens when (1) there's
> > > > no intel-iommu device, and (2) there is iommu_platform=on specified
> > > > for the vhost backend?
> > > > 
> > > > If so, I'd confess I am not too surprised if this fails the boot with
> > > > vhost-vsock because after all we speicified iommu_platform=on
> > > > explicitly in the cmdline, so if we want it to work we can simply
> > > > remove that iommu_platform=on when vhost-vsock doesn't support it
> > > > yet...  I thougth iommu_platform=on was added for that case - when we
> > > > want to force IOMMU to be enabled from host side, and it should always
> > > > be used with a vIOMMU device.
> > > > 
> > > > However I also agree that from performance POV this patch helps for
> > > > this quite special case.
> > > > 
> > > > Thanks,
> > > > 
> > > > -- 
> > > > Peter Xu
> > > 
> > 
> > -- 
> > Peter Xu
>
Michael S. Tsirkin March 17, 2020, 2:55 p.m. UTC | #28
On Tue, Mar 17, 2020 at 10:39:04AM -0400, Peter Xu wrote:
> On Tue, Mar 17, 2020 at 02:28:42AM -0400, Michael S. Tsirkin wrote:
> > On Mon, Mar 16, 2020 at 02:14:05PM -0400, Peter Xu wrote:
> > > On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
> > > > > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > > > > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > > > > > [..]
> > > > > > > > > 
> > > > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > > > > > 
> > > > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > > > > > qemu-system-s390x is protected virtualization capable?
> > > > > > > > > 
> > > > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > > > > > much about past qemus having problems with it.
> > > > > > > > > 
> > > > > > > > > Regards,
> > > > > > > > > Halil
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > > > > > the commit that needs this fix.
> > > > > > > > 
> > > > > > > 
> > > > > > > I finally did some digging regarding the performance degradation. For
> > > > > > > s390x the performance degradation on vhost-net was introduced by commit
> > > > > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
> > > > > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > > > > > calculated as the rest of the memory regions size (from address), and
> > > > > > > covered most of the guest address space. That is we didn't have a whole
> > > > > > > lot of IOTLB API overhead.
> > > > > > > 
> > > > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
> > > > > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > > > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > > > > > 076a93d797 and 076a93d797~1.
> > > > > > 
> > > > > > Peter, what's your take on this one?
> > > > > 
> > > > > Commit 076a93d797 was one of the patchset where we want to provide
> > > > > sensible IOTLB entries and also that should start to work with huge
> > > > > pages.
> > > > 
> > > > So the issue bundamentally is that it
> > > > never produces entries larger than page size.
> > > > 
> > > > Wasteful even just with huge pages, all the more
> > > > so which passthrough which could have giga-byte
> > > > entries.
> > > > 
> > > > Want to try fixing that?
> > > 
> > > Yes we can fix that, but I'm still not sure whether changing the
> > > interface of address_space_get_iotlb_entry() to cover adhoc regions is
> > > a good idea, because I think it's still a memory core API and imho it
> > > would still be good to have IOTLBs returned to be what the hardware
> > > will be using (always page aligned IOTLBs).
> > 
> > E.g. with virtio-iommu, there's no hardware in sight.
> > Even with e.g. VTD page aligned does not mean TARGET_PAGE,
> > can be much bigger.
> 
> Right. Sorry to be unclear, but I meant the emulated device (in this
> case for x86 it's VT-d) should follow the hardware.  Here the page
> mask is decided by VT-d in vtd_iommu_translate() for PT mode which is
> 4K only.  For another example, ARM SMMU is doing similar thing (return
> PAGE_SIZE when PT enabled, smmuv3_translate()).  That actually makes
> sense to me.  On the other hand, I'm not sure whether there's side
> effect if we change this to cover the whole address space for PT.
> 
> Thanks,

Well, we can translate a batch of entries in a loop and, as long as
the VA/PA mappings are consistent, treat the batch as one.

This is a classical batching approach, and not doing it is a classical
reason for bad performance.
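
For illustration, a tiny self-contained sketch of that batching idea
(made-up types and addresses, not vhost or QEMU code): walk page-sized
translations and merge runs where both the IOVA and the HVA advance
consistently, emitting one large range per run.

#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL

struct xlat { uint64_t iova, hva; };   /* one page-sized translation */

/* Coalesce consecutive page translations whose mappings stay
 * consistent into a single large range. */
static void coalesce(const struct xlat *x, int n)
{
    int i = 0;
    while (i < n) {
        uint64_t start_iova = x[i].iova, start_hva = x[i].hva;
        uint64_t len = PAGE_SIZE;
        int j = i + 1;
        while (j < n &&
               x[j].iova == start_iova + len &&
               x[j].hva  == start_hva  + len) {
            len += PAGE_SIZE;
            j++;
        }
        printf("entry: iova=0x%llx len=0x%llx hva=0x%llx\n",
               (unsigned long long)start_iova, (unsigned long long)len,
               (unsigned long long)start_hva);
        i = j;
    }
}

int main(void)
{
    /* Three contiguous pages followed by one discontiguous page:
     * the first three collapse into a single 12 KiB entry. */
    struct xlat x[] = {
        { 0x1000, 0x7f0000001000ULL },
        { 0x2000, 0x7f0000002000ULL },
        { 0x3000, 0x7f0000003000ULL },
        { 0x9000, 0x7f0000010000ULL },
    };
    coalesce(x, 4);
    return 0;
}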


> > 
> > >  Also it would still be
> > > not ideal because vhost backend will still need to send the MISSING
> > > messages and block for each of the continuous guest memory ranges
> > > registered, so there will still be misterious delay.  Not to say
> > > logically all the caches can be invalidated too so in that sense I
> > > think it's as hacky as the vhost speedup patch mentioned below..
> > > 
> > > Ideally I think vhost should be able to know when PT is enabled or
> > > disabled for the device, so the vhost backend (kernel or userspace)
> > > should be able to directly use GPA for DMA.  That might need some new
> > > vhost interface.
> > > 
> > > For the s390's specific issue, I would think Jason's patch an simple
> > > and ideal solution already.
> > > 
> > > Thanks,
> > > 
> > > > 
> > > > 
> > > > >  Frankly speaking after a few years I forgot the original
> > > > > motivation of that whole thing, but IIRC there's a patch that was
> > > > > trying to speedup especially for vhost but I noticed it's not merged:
> > > > > 
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
> > > > > 
> > > > > Regarding to the current patch, I'm not sure I understand it
> > > > > correctly, but is that performance issue only happens when (1) there's
> > > > > no intel-iommu device, and (2) there is iommu_platform=on specified
> > > > > for the vhost backend?
> > > > > 
> > > > > If so, I'd confess I am not too surprised if this fails the boot with
> > > > > vhost-vsock because after all we speicified iommu_platform=on
> > > > > explicitly in the cmdline, so if we want it to work we can simply
> > > > > remove that iommu_platform=on when vhost-vsock doesn't support it
> > > > > yet...  I thougth iommu_platform=on was added for that case - when we
> > > > > want to force IOMMU to be enabled from host side, and it should always
> > > > > be used with a vIOMMU device.
> > > > > 
> > > > > However I also agree that from performance POV this patch helps for
> > > > > this quite special case.
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > -- 
> > > > > Peter Xu
> > > > 
> > > 
> > > -- 
> > > Peter Xu
> > 
> 
> -- 
> Peter Xu
Jason Wang March 18, 2020, 2:06 a.m. UTC | #29
On 2020/3/17 10:13 PM, Peter Xu wrote:
> On Tue, Mar 17, 2020 at 11:04:26AM +0800, Jason Wang wrote:
>> On 2020/3/17 上午2:14, Peter Xu wrote:
>>> On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
>>>> On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
>>>>> On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
>>>>>> On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
>>>>>>> [..]
>>>>>>>>> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
>>>>>>>>>
>>>>>>>>> Also, one can specify iommu_platform=on on a device that ain't a part of
>>>>>>>>> a secure-capable VM, just for the fun of it. And that breaks
>>>>>>>>> vhost-vsock. Or is setting iommu_platform=on only valid if
>>>>>>>>> qemu-system-s390x is protected virtualization capable?
>>>>>>>>>
>>>>>>>>> BTW, I don't have a strong opinion on the fixes tag. We currently do not
>>>>>>>>> recommend setting iommu_platform, and thus I don't think we care too
>>>>>>>>> much about past qemus having problems with it.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Halil
>>>>>>>> Let's just say if we do have a Fixes: tag we want to set it correctly to
>>>>>>>> the commit that needs this fix.
>>>>>>>>
>>>>>>> I finally did some digging regarding the performance degradation. For
>>>>>>> s390x the performance degradation on vhost-net was introduced by commit
>>>>>>> 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before
>>>>>>> IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
>>>>>>> calculated as the rest of the memory regions size (from address), and
>>>>>>> covered most of the guest address space. That is we didn't have a whole
>>>>>>> lot of IOTLB API overhead.
>>>>>>>
>>>>>>> With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes
>>>>>>> as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
>>>>>>> properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
>>>>>>> 076a93d797 and 076a93d797~1.
>>>>>> Peter, what's your take on this one?
>>>>> Commit 076a93d797 was one of the patchset where we want to provide
>>>>> sensible IOTLB entries and also that should start to work with huge
>>>>> pages.
>>>> So the issue bundamentally is that it
>>>> never produces entries larger than page size.
>>>>
>>>> Wasteful even just with huge pages, all the more
>>>> so which passthrough which could have giga-byte
>>>> entries.
>>>>
>>>> Want to try fixing that?
>>> Yes we can fix that, but I'm still not sure whether changing the
>>> interface of address_space_get_iotlb_entry() to cover adhoc regions is
>>> a good idea, because I think it's still a memory core API and imho it
>>> would still be good to have IOTLBs returned to be what the hardware
>>> will be using (always page aligned IOTLBs).  Also it would still be
>>> not ideal because vhost backend will still need to send the MISSING
>>> messages and block for each of the continuous guest memory ranges
>>> registered, so there will still be misterious delay.  Not to say
>>> logically all the caches can be invalidated too so in that sense I
>>> think it's as hacky as the vhost speedup patch mentioned below..
>>>
>>> Ideally I think vhost should be able to know when PT is enabled or
>>> disabled for the device, so the vhost backend (kernel or userspace)
>>> should be able to directly use GPA for DMA.  That might need some new
>>> vhost interface.
>> Yes but I think we don't need another API since we can send GPA->HVA mapping
>> via device IOTLB API when we find there's no DMA translation at all (either
>> PT or no vIOMMU).
> Jason,
>
> Do you mean what we've worked on before?
>
> https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
>
> (I just read the previous discussion on that patch, it seems to be
>   exactly what we've discussed again...)
>
> Thanks,


Right, something like that. But it's not urgent now, considering this
patch has been merged.

Thanks


>
diff mbox series

Patch

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 9edfadc81d..9182a00495 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -290,7 +290,14 @@  static int vhost_dev_has_iommu(struct vhost_dev *dev)
 {
     VirtIODevice *vdev = dev->vdev;
 
-    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
+    /*
+     * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend supports
+     * the incremental memory mapping API via the IOTLB API. For platforms
+     * that do not have an IOMMU, there's no need to enable this feature,
+     * which may cause unnecessary IOTLB miss/update transactions.
+     */
+    return vdev->dma_as != &address_space_memory &&
+           virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
 }
 
 static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
@@ -765,6 +772,9 @@  static int vhost_dev_set_features(struct vhost_dev *dev,
     if (enable_log) {
         features |= 0x1ULL << VHOST_F_LOG_ALL;
     }
+    if (!vhost_dev_has_iommu(dev)) {
+        features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
+    }
     r = dev->vhost_ops->vhost_set_features(dev, features);
     if (r < 0) {
         VHOST_OPS_DEBUG("vhost_set_features failed");