[v2,0/5] s390x: vfio-ap: guest dedicated crypto adapters

Message ID	1519746259-27710-1-git-send-email-akrowiak@linux.vnet.ibm.com
From: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
To: qemu-devel@nongnu.org
Cc: mjrosato@linux.vnet.ibm.com, peter.maydell@linaro.org,
    pasic@linux.vnet.ibm.com, alifm@linux.vnet.ibm.com, eskultet@redhat.com,
    david@redhat.com, pmorel@linux.vnet.ibm.com, cohuck@redhat.com,
    heiko.carstens@de.ibm.com, alex.williamson@redhat.com, agraf@suse.de,
    borntraeger@de.ibm.com, qemu-s390x@nongnu.org,
    Tony Krowiak <akrowiak@linux.vnet.ibm.com>, jjherne@linux.vnet.ibm.com,
    schwidefsky@de.ibm.com, pbonzini@redhat.com, bjsdjshi@linux.vnet.ibm.com,
    eric.auger@redhat.com, rth@twiddle.net
Date: Tue, 27 Feb 2018 10:44:14 -0500
Subject: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters

Tony Krowiak Feb. 27, 2018, 3:44 p.m. UTC

This patch series is the QEMU counterpart to the KVM/kernel support for 
guest dedicated crypto adapters. The KVM/kernel model is built on the 
VFIO mediated device framework and provides the infrastructure for 
granting exclusive guest access to crypto devices installed on the linux 
host. This patch series introduces a new QEMU command line option, QEMU 
object model and CPU model features to exploit the KVM/kernel model.

See the detailed specifications for AP virtualization provided by this 
patch set in docs/vfio-ap.txt for a more complete discussion of the 
design introduced by this patch series.

v1 -> v2 Change log:
===================
* Removed unnecessary S390APMatrixDevice, S390APMatrixDeviceClass 
* Removed ioctl to configure the AP matrix for the guest: letting the
  vfio_ap device driver's 'open' callback configure the AP matrix
  for the guest
* Removed masks from object model: Unnecessary at this point because they 
  are not currently used 
* Renamed:
  * VFIOAPMatrixDevice to VFIOAPDevice
  * VFIOAPMatrixDeviceClass to VFIOAPDeviceClass
  * APMatrixDevice to APDevice
  * APMatrixDeviceClass to APDeviceClass
  * ap-matrix.c to ap.c (in hw/vfio)
  * ap-matrix-device.c to ap-device.c (in hw/s390x)
  * ap-matrix-device.h to ap-device.h (in include/hw/s390x)
* Added CPU model feature for AP facilities installed on guest and 
  facilities features for QCI Instructions Available (STFLE.12) and AP 
  Facilities Test facility installed (STFLE.15).
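
For readers unfamiliar with facility bits, the following is a minimal
sketch of how the two STFLE bits named above could be tested in a stored
facility list. It is not code from this series: test_facility(), FAC_QCI
and FAC_APFT are hypothetical names, and the only assumption carried over
from the architecture is that facility bits are numbered from the most
significant bit of byte 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Facility numbers named in the change log (hypothetical macro names). */
#define FAC_QCI   12   /* Query AP Configuration */
#define FAC_APFT  15   /* AP Facilities Test */

/*
 * Test one bit in a facility list as stored by STFLE.
 * Bits are numbered from the most significant bit of byte 0, so
 * facility nr lives in byte nr / 8 under the mask 0x80 >> (nr % 8).
 */
static bool test_facility(const uint8_t *fac_list, size_t len, unsigned int nr)
{
    assert(fac_list != NULL);
    if (nr / 8 >= len) {
        return false;   /* beyond the stored list: treat as not installed */
    }
    return (fac_list[nr / 8] & (0x80u >> (nr % 8))) != 0;
}
```

With a two-byte list, test_facility(list, 2, FAC_QCI) inspects mask 0x08
of byte 1 and test_facility(list, 2, FAC_APFT) inspects mask 0x01 of the
same byte.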

Tony Krowiak (5):
  s390x/ap: base Adjunct Processor (AP) object
  s390x/vfio: ap: VFIO: linux header updates
  s390x/vfio: ap: Introduce VFIO AP device
  s390x/cpumodel: Set up CPU model for AP device support
  s390: doc: detailed specifications for AP virtualization

 default-configs/s390x-softmmu.mak |    1 +
 docs/vfio-ap.txt                  |  540 +++++++++++++++++++++++++++++++++++++
 hw/s390x/Makefile.objs            |    1 +
 hw/s390x/ap-device.c              |   38 +++
 hw/vfio/Makefile.objs             |    1 +
 hw/vfio/ap.c                      |  176 ++++++++++++
 include/hw/s390x/ap-device.h      |   38 +++
 include/hw/vfio/vfio-common.h     |    1 +
 linux-headers/asm-s390/kvm.h      |    1 +
 linux-headers/linux/vfio.h        |    2 +
 target/s390x/cpu_features.c       |    3 +
 target/s390x/cpu_features_def.h   |    3 +
 target/s390x/cpu_models.c         |   12 +
 target/s390x/gen-features.c       |    3 +
 target/s390x/kvm.c                |    6 +
 15 files changed, 826 insertions(+), 0 deletions(-)
 create mode 100644 docs/vfio-ap.txt
 create mode 100644 hw/s390x/ap-device.c
 create mode 100644 hw/vfio/ap.c
 create mode 100644 include/hw/s390x/ap-device.h

no-reply@patchew.org Feb. 27, 2018, 3:54 p.m. UTC | #1

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 1519746259-27710-1-git-send-email-akrowiak@linux.vnet.ibm.com
Subject: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/1519746259-27710-1-git-send-email-akrowiak@linux.vnet.ibm.com -> patchew/1519746259-27710-1-git-send-email-akrowiak@linux.vnet.ibm.com
Switched to a new branch 'test'
e9e1d68b87 s390x/cpumodel: Set up CPU model for AP device support
2fabd0f576 s390x/vfio: ap: Introduce VFIO AP device
bb505ee5d6 s390x/vfio: ap: VFIO: linux header updates
4ea89ebf38 s390x/ap: base Adjunct Processor (AP) object
4fc31e63ea s390: doc: detailed specifications for AP virtualization

=== OUTPUT BEGIN ===
Checking PATCH 1/5: s390: doc: detailed specifications for AP virtualization...
Checking PATCH 2/5: s390x/ap: base Adjunct Processor (AP) object...
Checking PATCH 3/5: s390x/vfio: ap: VFIO: linux header updates...
Checking PATCH 4/5: s390x/vfio: ap: Introduce VFIO AP device...
Checking PATCH 5/5: s390x/cpumodel: Set up CPU model for AP device support...
WARNING: line over 80 characters
#86: FILE: target/s390x/cpu_features.c:39:
+    FEAT_INIT("qci", S390_FEAT_TYPE_STFL, 12, "Query AP Configuration facility"),

ERROR: line over 90 characters
#89: FILE: target/s390x/cpu_features.c:42:
+    FEAT_INIT("apft", S390_FEAT_TYPE_STFL, 15, "Adjunct Processor Facilities Test facility"),

total: 1 errors, 1 warnings, 113 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1
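
One possible way to address the two reports above is to wrap the
description string onto a continuation line. The fragment below is only a
sketch: FeatDef, S390FeatType and this FEAT_INIT definition are stand-ins
so that it compiles on its own, and are assumptions rather than the
definitions in target/s390x/cpu_features.c.

```c
/* Stand-in definitions (assumed, not QEMU's) so the fragment compiles alone. */
typedef enum { S390_FEAT_TYPE_STFL } S390FeatType;

typedef struct {
    const char *name;
    S390FeatType type;
    int bit;
    const char *desc;
} FeatDef;

#define FEAT_INIT(_name, _type, _bit, _desc) \
    { .name = _name, .type = _type, .bit = _bit, .desc = _desc }

/* The two flagged entries, wrapped so every line stays under 80 columns. */
static const FeatDef ap_feats[] = {
    FEAT_INIT("qci", S390_FEAT_TYPE_STFL, 12,
              "Query AP Configuration facility"),
    FEAT_INIT("apft", S390_FEAT_TYPE_STFL, 15,
              "Adjunct Processor Facilities Test facility"),
};
```

Whether wrapping fits the style of the surrounding table is for the
maintainers to decide.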


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org

David Hildenbrand March 6, 2018, 10:01 a.m. UTC | #2

On 27.02.2018 16:44, Tony Krowiak wrote:
> This patch series is the QEMU counterpart to the KVM/kernel support for 
> guest dedicated crypto adapters. The KVM/kernel model is built on the 
> VFIO mediated device framework and provides the infrastructure for 
> granting exclusive guest access to crypto devices installed on the linux 
> host. This patch series introduces a new QEMU command line option, QEMU 
> object model and CPU model features to exploit the KVM/kernel model.
> 
> See the detailed specifications for AP virtualization provided by this 
> patch set in docs/vfio-ap.txt for a more complete discussion of the 
> design introduced by this patch series.
> 
> v1 -> v2 Change log:
> ===================
> * Removed unnecessary S390APMatrixDevice, S390APMatrixDeviceClass 
> * Removed ioctl to configure the AP matrix for the guest: letting the
>   vfio_ap device driver's 'open' callback configure the AP matrix
>   for the guest
> * Removed masks from object model: Unnecessary at this point because they 
>   are not currently used 
> * Renamed:
>   * VFIOAPMatrixDevice to VFIOAPDevice
>   * VFIOAPMatrixDeviceClass to VFIOAPDeviceClass
>   * APMatrixDevice to APDevice
>   * APMatrixDeviceClass to APDeviceClass
>   * ap-matrix.c to ap.c (in hw/vfio)
>   * ap-matrix-device.c to ap-device.c (in hw/s390x)
>   * ap-matrix-device.h to ap-device.h (in include/hw/s390x)
> * Added CPU model feature for AP facilities installed on guest and 
>   facilities features for QCI Instructions Available (STFLE.12) and AP 
>   Facilities Test facility installed (STFLE.15).
> 
> Tony Krowiak (5):
>   s390x/ap: base Adjunct Processor (AP) object
>   s390x/vfio: ap: VFIO: linux header updates
>   s390x/vfio: ap: Introduce VFIO AP device
>   s390x/cpumodel: Set up CPU model for AP device support
>   s390: doc: detailed specifications for AP virtualization
> 

I'm going to highlight an issue that stems from bad HW design: the lack
of an AP interpretation facility (indication). For example, we have such
an indication for zPCI (and, as far as I remember, for all other I/O
besides AP).

Let's assume L1 provides AP to L2.
Let's assume L2 provides AP to L3.

L2 can blindly forward APs to L3 because it sees the AP facility. This
requires AP vSIE support. We have no separate way of indicating that
support, it comes with the AP feature. So let's assume L2 does not
emulate devices but uses interpretation for L3.

Everything is fine as long as L1 does not emulate AP
devices/instructions for L2. All instructions are interpreted by HW.

But what happens if L1 emulates AP devices for L2? Interpretation is
disabled and QEMU handles it.

However L2 can simply forward AP devices to L3. At this point, we must
also intercept and emulate AP instructions issued by L3 in _L1_.

This is what we call the nightmare of nested virtualization (see x86),
because we have to emulate L3 instructions in L1 - but even worse, not
even in L1 kernel space but in L1 user space.

Long story short:

Making this scenario work would require a _huge_ effort (going to user
space with nested guest state - or communicating with the user space
part using some other mechanism).

So we could never provide the AP feature reliably with the SIE feature.
We want to avoid interdependence between CPU features. (because
everything else makes CPU feature detection ugly - CMMA is a good
example and the only exception so far)

Long story even shorter:

No emulated AP devices with KVM.

Pierre Morel March 6, 2018, 4:53 p.m. UTC | #3

On 06/03/2018 11:01, David Hildenbrand wrote:
> On 27.02.2018 16:44, Tony Krowiak wrote:
>> This patch series is the QEMU counterpart to the KVM/kernel support for
>> guest dedicated crypto adapters. The KVM/kernel model is built on the
>> VFIO mediated device framework and provides the infrastructure for
>> granting exclusive guest access to crypto devices installed on the linux
>> host. This patch series introduces a new QEMU command line option, QEMU
>> object model and CPU model features to exploit the KVM/kernel model.
>>
>> See the detailed specifications for AP virtualization provided by this
>> patch set in docs/vfio-ap.txt for a more complete discussion of the
>> design introduced by this patch series.
>>
>> v1 -> v2 Change log:
>> ===================
>> * Removed unnecessary S390APMatrixDevice, S390APMatrixDeviceClass
>> * Removed ioctl to configure the AP matrix for the guest: letting the
>>    vfio_ap device driver's 'open' callback configure the AP matrix
>>    for the guest
>> * Removed masks from object model: Unnecessary at this point because they
>>    are not currently used
>> * Renamed:
>>    * VFIOAPMatrixDevice to VFIOAPDevice
>>    * VFIOAPMatrixDeviceClass to VFIOAPDeviceClass
>>    * APMatrixDevice to APDevice
>>    * APMatrixDeviceClass to APDeviceClass
>>    * ap-matrix.c to ap.c (in hw/vfio)
>>    * ap-matrix-device.c to ap-device.c (in hw/s390x)
>>    * ap-matrix-device.h to ap-device.h (in include/hw/s390x)
>> * Added CPU model feature for AP facilities installed on guest and
>>    facilities features for QCI Instructions Available (STFLE.12) and AP
>>    Facilities Test facility installed (STFLE.15).
>>
>> Tony Krowiak (5):
>>    s390x/ap: base Adjunct Processor (AP) object
>>    s390x/vfio: ap: VFIO: linux header updates
>>    s390x/vfio: ap: Introduce VFIO AP device
>>    s390x/cpumodel: Set up CPU model for AP device support
>>    s390: doc: detailed specifications for AP virtualization
>>
> I'm going to highlight an issue that stems from bad HW design: The lack
> of an AP interpretation facility (indication). We e.g. have something
> like that for zPCI (and all other I/O besides AP as far as I remember).
>
> Let's assume L1 provides AP to L2.
> Let's assume L2 provides AP to L3.
>
> L2 can blindly forward APs to L3 because it sees the AP facility. This
> requires AP vSIE support. We have no separate way of indicating that
> support, it comes with the AP feature. So let's assume L2 does not
> emulate devices but uses interpretation for L3.
>
> Everything is fine as long as L1 does not emulate AP
> devices/instructions for L2. All instructions are interpreted by HW.

If L1 emulates AP, it has no need to set any bit in the L2 SIE CRYCB.
In fact, we had better not set any bit in the CRYCB.

>
> But what happens if L1 emulates AP devices for L2? Interpretation is
> disabled. QEMU handles it.
>
> However L2 can simply forward AP devices to L3. At this point, we must
> also intercept and emulate AP instructions issued by L3 in _L1_.

If L2 forwards devices to L3 through SIE ECA.28 but no bit is set in
the CRYCB of L2, L3 will not see any device.
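
A minimal sketch of that point, under the assumption that the mask L3
effectively gets is the intersection of what each level above it grants.
It uses one 64-bit adapter mask per level for brevity (the real CRYCB
masks are larger), and effective_apm() and l3_sees_adapter() are
hypothetical names, not functions from the kernel or QEMU:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model: one 64-bit adapter mask per level. The real CRYCB
 * masks are larger; the names here are hypothetical.
 */
static uint64_t effective_apm(uint64_t l1_apm_for_l2, uint64_t l2_apm_for_l3)
{
    /* L3 can only reach adapters that every level above it has granted. */
    return l1_apm_for_l2 & l2_apm_for_l3;
}

/* Adapter 0 is the most significant bit, following s390 bit numbering. */
static bool l3_sees_adapter(uint64_t l1_apm_for_l2, uint64_t l2_apm_for_l3,
                            unsigned int ap)
{
    if (ap >= 64) {
        return false;
    }
    return (effective_apm(l1_apm_for_l2, l2_apm_for_l3) &
            (UINT64_C(1) << (63 - ap))) != 0;
}
```

With l1_apm_for_l2 == 0, l3_sees_adapter() is false for every adapter,
whatever L2 puts in its own mask for L3.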

>
> This is what we call the nightmare of nested virtualization (see x86),
> because we have to emulate L3 instructions in L1 - but even worse, not
> even in L1 kernel space but in L1 user space.

As soon as one level begins to virtualize, all levels under it
must virtualize too, so that L3 instructions will be handled in L2,
which will issue instructions that will be handled in L1.

>
>
> Long story short:
>
> Making this scenario work would require a _huge_ effort (going to user
> space with nested guest state - or communicating with the user space
> part using some other mechanism).

A funny game with a big overhead, but the same virtualization whatever
the level is.

>
> So we could never provide the AP feature reliably with the SIE feature.

I think we should change this sentence a little, to:
We cannot provide SIE interpretation to a guest for which
any guest level N-1 does not use SIE interpretation.

Nothing bad will occur for the host, the hardware or other guests,
but the guest will just not get any device.

> We want to avoid interdependence between CPU features. (because
> everything else makes CPU feature detection ugly - CMMA is a good
> example and the only exception so far)
>
>
> Long story even shorter:
>
> No emulated AP devices with KVM.
>
I agree with: KVM should never set bits in CRYCB for emulated devices.

David Hildenbrand March 6, 2018, 5:10 p.m. UTC | #4

> If L2 forward devices to L3 through SIE ECA.28 but no bit is set is in 
> the CRYCB of L2,
> L3 will not see any device.

Exactly, and this is the problem: how should L2 know that these devices
are special and cannot be forwarded?

> 
>>
>> This is what we call the nightmare of nested virtualization (see x86),
>> because we have to emulate L3 instructions in L1 - but even worse, not
>> even in L1 kernel space but in L1 user space.
> 
> As soon as one level begin to virtualize, all levels under it
> must virtualize too so that L3 instructions will be handled in L2
> which will issue instructions that will be handled in L1.

By virtualize I assume you mean emulate? If so, yes.

>>
>> So we could never provide the AP feature reliably with the SIE feature.
> 
> I think we should change a little this sentence to:
> We can not provide SIE interpretation to a guest from which
> any guest level N-1 does not use SIE interpretation.

Exactly, and as said, there is no way to tell a guest that it has AP but
cannot use AP interpretation and instead has to intercept and handle the
instructions manually.

> 
> Nothing bad will occur for the host, the hardware or other guests,
> but the guest will just not get any device.
> 
>> We want to avoid interdependence between CPU features. (because
>> everything else makes CPU feature detection ugly - CMMA is a good
>> example and the only exception so far)
>>
>>
>> Long story even shorter:
>>
>> No emulated AP devices with KVM.
>>
> I agree with: KVM should never set bits in CRYCB for emulated devices.

I think this is stronger: emulated AP devices should not be used with
KVM because it can potentially lead to architectural (v)SIE conflicts.

But the details are buried in some AP documentation not accessible to me.

Anyhow, if the scenario I described cannot be worked around via:

a) telling a guest that AP virtualization cannot be used - which doesn't
seem to be possible
b) provoking for selected devices a SIE exit when an AP instruction is
executed on these devices - and this is totally fine with the documented
AP architecture

I assume we would have to live with !emulated AP devices.

Pierre Morel March 7, 2018, 10:22 a.m. UTC | #5

On 06/03/2018 18:10, David Hildenbrand wrote:
>> If L2 forward devices to L3 through SIE ECA.28 but no bit is set is in
>> the CRYCB of L2,
>> L3 will not see any device.
> Exactly and this is the problem: How should L2 know that these devices
> are special and cannot be forwarded.
>
>>> This is what we call the nightmare of nested virtualization (see x86),
>>> because we have to emulate L3 instructions in L1 - but even worse, not
>>> even in L1 kernel space but in L1 user space.
>> As soon as one level begin to virtualize, all levels under it
>> must virtualize too so that L3 instructions will be handled in L2
>> which will issue instructions that will be handled in L1.
> By virtualize I assume you mean emulate? If so, yes.
>
>>> So we could never provide the AP feature reliably with the SIE feature.
>> I think we should change a little this sentence to:
>> We can not provide SIE interpretation to a guest from which
>> any guest level N-1 does not use SIE interpretation.
> Exactly, and as said, there is no way to tell a guest that it has AP but
> cannot use AP interpretation but has to intercept and handle manually.


vSIE must clear ECA28 while running the guest if the host itself does
not have ECA28 set.
Since ECA28 being set for the host means AP instructions are available to
the host, we can sum it up as: vSIE should never set ECA28 in the shadow
SIE if no AP instructions are available.
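
As a rough sketch of that rule (not actual vSIE code): shadow_eca() and
host_has_ap are hypothetical names, and the value of ECA_APIE assumes
that ECA.28 is bit 28 of a 32-bit field in IBM bit numbering, i.e.
1 << (31 - 28).

```c
#include <stdbool.h>
#include <stdint.h>

/* ECA.28 in IBM bit numbering of a 32-bit field (assumed): 1 << (31 - 28). */
#define ECA_APIE 0x00000008u

/*
 * Build the ECA value for the shadow SIE block from what the guest
 * hypervisor requested. ECA.28 is only passed through when AP
 * instructions are available to the host; otherwise it is cleared.
 */
static uint32_t shadow_eca(uint32_t guest_eca, bool host_has_ap)
{
    if (!host_has_ap) {
        guest_eca &= ~ECA_APIE;
    }
    return guest_eca;
}
```

All other ECA bits are passed through unchanged in this sketch; real code
would filter them as well.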

Pierre


>
>> Nothing bad will occur for the host, the hardware or other guests,
>> but the guest will just not get any device.
>>
>>> We want to avoid interdependence between CPU features. (because
>>> everything else makes CPU feature detection ugly - CMMA is a good
>>> example and the only exception so far)
>>>
>>>
>>> Long story even shorter:
>>>
>>> No emulated AP devices with KVM.
>>>
>> I agree with: KVM should never set bits in CRYCB for emulated devices.
> I think this is stronger: emulated AP devices should not be used with
> KVM because it can potentially lead to architectural (v)SIE conflicts.
>
> But the details are buried in some AP documentation not accessible to me.
>
> Anyhow, if the scenario I described cannot be worked around via:
>
> a) telling a guest that AP virtualization cannot be used - which doesn't
> seem to be possible
> b) provoking for selected devices a SIE exit when an AP instruction is
> executed on these devices - and this is totally fine with the documented
> AP architecture
>
> I assume we would have to live with !emulated AP devices.
>

Christian Borntraeger March 7, 2018, 2:27 p.m. UTC | #6

On 03/07/2018 11:22 AM, Pierre Morel wrote:
> On 06/03/2018 18:10, David Hildenbrand wrote:
>>> If L2 forward devices to L3 through SIE ECA.28 but no bit is set is in
>>> the CRYCB of L2,
>>> L3 will not see any device.
>> Exactly and this is the problem: How should L2 know that these devices
>> are special and cannot be forwarded.
>>
>>>> This is what we call the nightmare of nested virtualization (see x86),
>>>> because we have to emulate L3 instructions in L1 - but even worse, not
>>>> even in L1 kernel space but in L1 user space.
>>> As soon as one level begin to virtualize, all levels under it
>>> must virtualize too so that L3 instructions will be handled in L2
>>> which will issue instructions that will be handled in L1.
>> By virtualize I assume you mean emulate? If so, yes.
>>
>>>> So we could never provide the AP feature reliably with the SIE feature.
>>> I think we should change a little this sentence to:
>>> We can not provide SIE interpretation to a guest from which
>>> any guest level N-1 does not use SIE interpretation.
>> Exactly, and as said, there is no way to tell a guest that it has AP but
>> cannot use AP interpretation but has to intercept and handle manually.
> 
> 
> vSIE must clear ECA28 during running of the guest if the host itself do not have ECA28 set.
> Since ECA28 set for the host means AP instructions available for the host
> then we can sum it up by: vSIE should never set ECA28 in the shadow SIE
> if no AP instructions available.

To put it differently: architecturally, ECA28 is an effective control, so
we might put the burden on guest2 by saying that even if you set eca.28
you might still get exits for NQAP, PQAP and DQAP and must handle them
appropriately.
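
A sketch of what that burden could look like in a guest2 hypervisor's
intercept path, assuming the usual opcodes (0xb2ad for NQAP, 0xb2ae for
DQAP, 0xb2af for PQAP; please check them against the architecture
documentation). classify_exit() and enum ap_exit are hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

/* AP instruction opcodes (assumed values, see the note above). */
enum {
    OP_NQAP = 0xb2ad,
    OP_DQAP = 0xb2ae,
    OP_PQAP = 0xb2af,
};

enum ap_exit {
    AP_EXIT_NONE,     /* not an AP instruction */
    AP_EXIT_HANDLE,   /* AP instruction: the hypervisor must handle it */
};

/*
 * Classify an intercepted instruction. Whether the hypervisor asked for
 * eca.28 is deliberately ignored: since the control is only effective,
 * an exit for an AP instruction can still arrive and needs a handler.
 */
static enum ap_exit classify_exit(uint16_t opcode, bool eca28_requested)
{
    (void)eca28_requested;

    switch (opcode) {
    case OP_NQAP:
    case OP_DQAP:
    case OP_PQAP:
        return AP_EXIT_HANDLE;
    default:
        return AP_EXIT_NONE;
    }
}
```

What "handle" means (emulate, forward, or inject an exception) is left
open here, as it is in the discussion.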


> 
> Pierre
> 
> 
>>
>>> Nothing bad will occur for the host, the hardware or other guests,
>>> but the guest will just not get any device.
>>>
>>>> We want to avoid interdependence between CPU features. (because
>>>> everything else makes CPU feature detection ugly - CMMA is a good
>>>> example and the only exception so far)
>>>>
>>>>
>>>> Long story even shorter:
>>>>
>>>> No emulated AP devices with KVM.
>>>>
>>> I agree with: KVM should never set bits in CRYCB for emulated devices.
>> I think this is stronger: emulated AP devices should not be used with
>> KVM because it can potentially lead to architectural (v)SIE conflicts.
>>
>> But the details are buried in some AP documentation not accessible to me.
>>
>> Anyhow, if the scenario I described cannot be worked around via:
>>
>> a) telling a guest that AP virtualization cannot be used - which doesn't
>> seem to be possible
>> b) provoking for selected devices a SIE exit when an AP instruction is
>> executed on these devices - and this is totally fine with the documented
>> AP architecture
>>
>> I assume we would have to live with !emulated AP devices.
>>
>
