Patchwork [2/2] target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab()

Submitter David Gibson
Date Sept. 20, 2012, 7:08 a.m.
Message ID <1348124922-24263-3-git-send-email-david@gibson.dropbear.id.au>
Permalink /patch/185349/
State New

Comments

David Gibson - Sept. 20, 2012, 7:08 a.m.
The kvmppc_reset_htab() function invokes the KVM_PPC_ALLOCATE_HTAB vm ioctl
to request KVM to allocate and reset a hash page table for the guest - it
returns the size of hash table allocated, or 0 to indicate that qemu needs
to allocate the hash table itself.  In practice qemu needs to allocate the
htab for full emulation and with Book3sPR KVM, but the kernel has to
allocate it for Book3sHV KVM (the hash table needs to be physically
contiguous in that case).
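
Judging from the diff further down, the "size" returned here is a shift (log2 of the table size in bytes) rather than a byte count: the function returns the `shift` value filled in by the ioctl, and returns 24 for the fixed 16MB table. A minimal sketch of that arithmetic follows; `htab_size_from_shift` is a hypothetical helper for illustration only and is not part of QEMU.

```c
#include <stdint.h>

/* Hypothetical helper, not part of QEMU: convert a hash table shift
 * (log2 of the size in bytes) into a size in bytes.  A shift of 0 is
 * treated as "no kernel-allocated table", mirroring the 0 return value
 * that tells the caller to allocate the htab itself. */
static uint64_t htab_size_from_shift(uint32_t shift)
{
    if (shift == 0 || shift >= 64) {
        /* 0 means "caller allocates"; >= 64 would overflow the shift */
        return 0;
    }
    return UINT64_C(1) << shift;
}
```

With this convention a shift of 24 corresponds to 1 << 24 = 16777216 bytes, i.e. the 16MB table mentioned for older HV kernels.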

Unfortunately, the logic in this function is incorrect for some existing
kernels.  Specifically:
  * at least some PR KVM versions advertise the relevant capability but
don't actually implement the ioctl(), returning ENOTTY.
  * For old kernels which don't have the capability, we currently return 0.
This is correct for PR KVM, where we need to allocate the htab, but not for
HV KVM - kernels of this era always allocate a 16MB hash table per guest.

This patch corrects both of these edge cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 target-ppc/kvm.c |   30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)
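
The cases described above can be summarised as a small decision function. The following is only a sketch of the patched logic as a standalone model, not the QEMU code itself: `struct fake_kvm` and `reset_htab_model` are hypothetical names, and the struct fields stand in for the results of kvm_enabled(), the two kvm_check_extension() calls and the KVM_PPC_ALLOCATE_HTAB ioctl.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for what the host reports; these fields do not
 * exist in QEMU and only model the calls made by kvmppc_reset_htab(). */
struct fake_kvm {
    bool enabled;         /* models kvm_enabled() */
    bool cap_alloc_htab;  /* models KVM_CAP_PPC_ALLOC_HTAB being advertised */
    bool cap_get_pvinfo;  /* models KVM_CAP_PPC_GET_PVINFO being advertised */
    int ioctl_ret;        /* models the KVM_PPC_ALLOCATE_HTAB return value */
    uint32_t ioctl_shift; /* shift the kernel writes back on success */
};

/* Returns 0 if the caller must allocate the htab itself, a positive
 * shift if the kernel allocated it, or a negative errno on failure. */
static int reset_htab_model(const struct fake_kvm *k)
{
    if (!k->enabled) {
        return 0;               /* full emulation */
    }
    if (k->cap_alloc_htab) {
        if (k->ioctl_ret == -ENOTTY) {
            return 0;           /* capability advertised, ioctl missing: PR */
        }
        if (k->ioctl_ret < 0) {
            return k->ioctl_ret; /* genuine error */
        }
        return (int)k->ioctl_shift;
    }
    /* Old kernel: GET_PVINFO is taken to mean PR, otherwise HV with 16MB */
    return k->cap_get_pvinfo ? 0 : 24;
}
```

Keeping the ENOTTY case separate from other negative results is the point of the first fix: a missing ioctl falls back to a qemu-allocated table, while any other error is still passed up to the caller.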
Alexander Graf - Sept. 20, 2012, 8:58 a.m.
On 20.09.2012, at 09:08, David Gibson wrote:

> The kvmppc_reset_htab() function invokes the KVM_PPC_ALLOCATE_HTAB vm ioctl
> to request KVM to allocate and reset a hash page table for the guest - it
> returns the size of hash table allocated, or 0 to indicate that qemu needs
> to allocate the hash table itself.  In practice qemu needs to allocate the
> htab for full emulation and with Book3sPR KVM, but the kernel has to
> allocate it for Book3sHV KVM (the hash table needs to be physically
> contiguous in that case).
> 
> Unfortunately, the logic in this function is incorrect for some existing
> kernels.  Specifically:
>  * at least some PR KVM versions advertise the relevant capability but
> don't actually implement the ioctl(), returning ENOTTY.
>  * For old kernels which don't have the capability, we currently return 0.
> This is correct for PR KVM, where we need to allocate the htab, but not for
> HV KVM - kernels of this era always allocate a 16MB hash table per guest.
> 
> This patch corrects both of these edge cases.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Thanks, applied to ppc-next.


Alex

Patch

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 78a47fb..ecc7719 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1128,18 +1128,38 @@  int kvmppc_reset_htab(int shift_hint)
 {
     uint32_t shift = shift_hint;
 
-    if (kvm_enabled() &&
-        kvm_check_extension(kvm_state, KVM_CAP_PPC_ALLOC_HTAB)) {
+    if (!kvm_enabled()) {
+        /* Full emulation, tell caller to allocate htab itself */
+        return 0;
+    }
+    if (kvm_check_extension(kvm_state, KVM_CAP_PPC_ALLOC_HTAB)) {
         int ret;
         ret = kvm_vm_ioctl(kvm_state, KVM_PPC_ALLOCATE_HTAB, &shift);
-        if (ret < 0) {
+        if (ret == -ENOTTY) {
+            /* At least some versions of PR KVM advertise the
+             * capability, but don't implement the ioctl().  Oops.
+             * Return 0 so that we allocate the htab in qemu, as is
+             * correct for PR. */
+            return 0;
+        } else if (ret < 0) {
             return ret;
         }
         return shift;
     }
 
-    /* For now.. */
-    return 0;
+    /* We have a kernel that predates the htab reset calls.  For PR
+     * KVM, we need to allocate the htab ourselves, for an HV KVM of
+     * this era, it has allocated a 16MB fixed size hash table
+     * already.  Kernels of this era have the GET_PVINFO capability
+     * only on PR, so we use this hack to determine the right
+     * answer */
+    if (kvm_check_extension(kvm_state, KVM_CAP_PPC_GET_PVINFO)) {
+        /* PR - tell caller to allocate htab */
+        return 0;
+    } else {
+        /* HV - assume 16MB kernel allocated htab */
+        return 24;
+    }
 }
 
 static inline uint32_t mfpvr(void)