diff mbox series

[v2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

Message ID 20210427080948.439432-1-like.xu@linux.intel.com
State New
Headers show
Series [v2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR | expand

Commit Message

Like Xu April 27, 2021, 8:09 a.m. UTC
The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. The QEMU
could configure whether it's enabled or not for each guest via CLI.

The LBR feature would be enabled on the guest if:
- the KVM is enabled and the PMU is enabled and,
- the msr-based-feature IA32_PERF_CAPABILITIES is supporterd on KVM and,
- the supported returned value for lbr_fmt from this msr is not zero and,
- the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
- the configured lbr-fmt value is the same as the host lbr_fmt value
  OR use the QEMU option "-cpu host,migratable=no".

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 target/i386/cpu.c     | 34 ++++++++++++++++++++++++++++++++++
 target/i386/cpu.h     | 10 ++++++++++
 target/i386/kvm/kvm.c | 10 ++++++++--
 3 files changed, 52 insertions(+), 2 deletions(-)

Comments

Eduardo Habkost April 28, 2021, 9:19 p.m. UTC | #1
On Tue, Apr 27, 2021 at 04:09:48PM +0800, Like Xu wrote:
> The last branch recording (LBR) is a performance monitor unit (PMU)
> feature on Intel processors that records a running trace of the most
> recent branches taken by the processor in the LBR stack. The QEMU
> could configure whether it's enabled or not for each guest via CLI.
> 
> The LBR feature would be enabled on the guest if:
> - the KVM is enabled and the PMU is enabled and,
> - the msr-based-feature IA32_PERF_CAPABILITIES is supporterd on KVM and,
> - the supported returned value for lbr_fmt from this msr is not zero and,
> - the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
> - the configured lbr-fmt value is the same as the host lbr_fmt value
>   OR use the QEMU option "-cpu host,migratable=no".

I don't understand why "migratable" matters here.  "migratable"
is just a convenience property to get better defaults when using
"-cpu host".  I don't know why it would change the lbr-fmt
validation rules.

> 
> Signed-off-by: Like Xu <like.xu@linux.intel.com>
> ---

A changelog explaining what you changed since v1 would have been
useful here.

>  target/i386/cpu.c     | 34 ++++++++++++++++++++++++++++++++++
>  target/i386/cpu.h     | 10 ++++++++++
>  target/i386/kvm/kvm.c | 10 ++++++++--
>  3 files changed, 52 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index ad99cad0e7..9c8e54aa6f 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6623,6 +6623,10 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>      }
>  
>      for (w = 0; w < FEATURE_WORDS; w++) {
> +        if (w == FEAT_PERF_CAPABILITIES) {
> +            continue;
> +        }
> +

Why exactly is this necessary?  I expected to be completely OK to
call mark_unavailable_features() multiple times for the same
FeatureWord.

If there's a reason why this is necessary, I suggest adding a
comment explaining why.

>          uint64_t host_feat =
>              x86_cpu_get_supported_feature_word(w, false);
>          uint64_t requested_features = env->features[w];
> @@ -6630,6 +6634,27 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>          mark_unavailable_features(cpu, w, unavailable_features, prefix);
>      }
>  
> +    uint64_t host_perf_cap =
> +        x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
> +    if (!cpu->lbr_fmt && !cpu->migratable) {
> +        cpu->lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;

"migratable=no" is not a request to override explicit user
settings.  This is why we have the ~env->user_features masking
inside x86_cpu_expand_features() when initializing
env->features[].

In either case, I don't understand why you need the lines above.
"migratable=no" should already trigger the x86_cpu_get_supported_feature_word()
loop inside x86_cpu_expand_features(), and it should initialize
env->features[FEAT_PERF_CAPABILITIES] with the host value.  Isn't
that code working for you?


> +        if (cpu->lbr_fmt) {
> +            info_report("vPMU: The value of lbr-fmt has been adjusted "
> +                        "to 0x%lx and guest LBR is enabled.",
> +                        host_perf_cap & PERF_CAP_LBR_FMT);


From your other message:

(I'm assuming your examples are for a lbr-fmt=5 host)

> "-cpu host,migratable=no" --> "Enable guest LBR and show warning"

Enabling guest LBR in this case is 100% OK, isn't it?  I don't
think you need to show a warning.


> "-cpu host,migratable=no,lbr-fmt=0" --> "Enable guest LBR and show warning"

Why?  In this case, we should do what the user asked for whenever
possible, and the user is explicitly asking lbr-fmt to be 0.

> "-cpu host,migratable=no,lbr-fmt=5" --> "Enable guest LBR"

Looks OK.

> "-cpu host,migratable=no,lbr-fmt=6" --> "Disable guest LBR and show warning"

Makes sense to me[1].


> +        }
> +    } else {
> +        uint64_t requested_lbr_fmt = cpu->lbr_fmt & PERF_CAP_LBR_FMT;
> +        if (requested_lbr_fmt && kvm_enabled()) {


From your other message:

> "-cpu host,lbr-fmt=0" --> "Disable guest LBR"

Makes sense to me.  I understand this as a confirmation that it's
OK to have a guest/host mismatch if guest LBR=0.

> "-cpu host,lbr-fmt=5" --> "Enable guest LBR"

Makes sense to me.

> "-cpu host,lbr-fmt=6" --> "Disable guest LBR and show warning"

Makes sense to me[1].


[1] As long as "show warning" becomes "fatal error" if enforce=1.
    mark_unavailable_features() should make sure this happens.

    Or maybe we should make this an error?  It would be even
    better.  The example code below makes it an error.


> +            if (requested_lbr_fmt != (host_perf_cap & PERF_CAP_LBR_FMT)) {
> +                cpu->lbr_fmt = 0;
> +                warn_report("vPMU: The supported lbr-fmt value on the host "
> +                            "is 0x%lx and guest LBR is disabled.",
> +                            host_perf_cap & PERF_CAP_LBR_FMT);
> +            }
> +        }
> +    }
> +
>      if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
>          kvm_enabled()) {
>          KVMState *s = CPU(cpu)->kvm_state;
> @@ -6734,6 +6759,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>          }
>      }
>  
> +    if (cpu->lbr_fmt) {
> +        if (!cpu->enable_pmu) {
> +            error_setg(errp, "LBR is unsupported since guest PMU is disabled.");
> +            return;
> +        }
> +        env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;

This doesn't seem right, as we should still do what the user
asked for if "lbr-fmt=0" is used.

You need to differentiate "lbr-fmt=0" from "lbr-fmt not set"
somehow.  I suggest initializing lbr_fmt to 0xFF by default,
so you can override env->features[FEAT_PERF_CAPABILITIES]
when lbr_fmt != 0xFF (even if lbr_fmt=0).

Something like this:

  #define LBR_FMT_UNSET 0xff
  ...
  DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, LBR_FMT_UNSET)
  ...

  void x86_cpu_realizefn(...)
  {
      ...
      if (cpu->lbr_fmt != LBR_FMT_UNSET) {
          if ((cpu->lbr_fmt & LBR_FMT_FMT) != cpu->lbr_fmt) {
              error_setg(errp, "invalid lbr-fmt" ...);
              return;
          }
          env->features[FEAT_PERF_CAPABILITIES] &= ~PERF_CAP_LBR_FMT;
          env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
      }
      /* If lbr_fmt == LBR_FMT_UNSET, the default value of env->features[]
       * will be used as is (and it may depend on the "migratable" flag)
       */
      ...
      /*
       * We can always validate env->features[FEAT_PERF_CAPABILITIES],
       * no matter how it was initialized:
       */
      uint64_t requested_lbr_fmt =
          env->features[FEAT_PERF_CAPABILITIES] & PERF_CAP_LBR_FMT;
      if (requested_lbr_fmt && kvm_enabled()) {
          /* Maybe this code will work out of the box for all
           * accelerators, but checking kvm_enabled() is safer.
           */
          uint64_t host_perf_cap =
              x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
          uint64_t host_lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;
          if (!cpu->enable_pmu) {
              error_setg(errp, "LBR is unsupported without pmu=on");
              return;
          }
          if (requested_lbr_fmt != host_lbr_fmt)) {
              /* An error is better than a warning */
              error_setg(errp, "lbr-fmt mismatch" ...);
              /* probably a good idea to include requested_lbr_fmt
               * and host_lbr_fmt in the error message */
              return;
          }
      }
      ...
  }



> +    }
> +
>      /* mwait extended info: needed for Core compatibility */
>      /* We always wake on interrupt even if host does not have the capability */
>      cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
> @@ -7300,6 +7333,7 @@ static Property x86_cpu_properties[] = {
>  #endif
>      DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
>      DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
> +    DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),
>  
>      DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
>                         HYPERV_SPINLOCK_NEVER_NOTIFY),
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 570f916878..b12c879fc4 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -354,6 +354,7 @@ typedef enum X86Seg {
>  #define ARCH_CAP_TSX_CTRL_MSR		(1<<7)
>  
>  #define MSR_IA32_PERF_CAPABILITIES      0x345
> +#define PERF_CAP_LBR_FMT      0x3f
>  
>  #define MSR_IA32_TSX_CTRL		0x122
>  #define MSR_IA32_TSCDEADLINE            0x6e0
> @@ -1726,6 +1727,15 @@ struct X86CPU {
>       */
>      bool enable_pmu;
>  
> +    /*
> +     * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR.
> +     * This can't be enabled by default yet because it doesn't have
> +     * ABI stability guarantees, as it is only allowed to pass all
> +     * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature()
> +     * (that depends on host CPU and kernel capabilities) to the guest.
> +     */
> +    uint8_t lbr_fmt;
> +
>      /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
>       * disabled by default to avoid breaking migration between QEMU with
>       * different LMCE configurations.
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 7fe9f52710..aa926984ae 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2732,8 +2732,14 @@ static void kvm_msr_entry_add_perf(X86CPU *cpu, FeatureWordArray f)
>                                             MSR_IA32_PERF_CAPABILITIES);
>  
>      if (kvm_perf_cap) {
> -        kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
> -                        kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
> +        kvm_perf_cap = (cpu->migratable) ?
> +            (kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]) : kvm_perf_cap;

I don't understand why you are checking cpu->migratable here.
The CPU code should ensure f[FEAT_PERF_CAPABILITIES] is
initialized correctly before calling kvm_arch_init_vcpu().

> +
> +        if (!cpu->lbr_fmt) {
> +            kvm_perf_cap &= ~PERF_CAP_LBR_FMT;
> +        }

Likewise: this should be done by the CPU initialization code
before kvm_arch_init_vcpu() gets called.

The existing code looks weird here: what's the purpose of the
kvm_arch_get_supported_msr_feature() call in this function?

env->features[] is supposed to reflect what the guest actually
sees.  x86_cpu_realizefn()/x86_cpu_filter_features() is supposed
to ensure that before calling kvm_arch_init_vcpu().  If there's a
mismatch between env->features and what the guest sees, we have a
problem.

If you want to be 100% sure, maybe you can add an assert() here.
But if the function is receiving invalid input it's too late to
fix the value.

> +
> +        kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES, kvm_perf_cap);
>      }
>  }
>  
> -- 
> 2.30.2
>
Like Xu April 30, 2021, 3:20 a.m. UTC | #2
Hi Eduardo,

Thanks for your detailed comments.

On 2021/4/29 5:19, Eduardo Habkost wrote:
> On Tue, Apr 27, 2021 at 04:09:48PM +0800, Like Xu wrote:
>> The last branch recording (LBR) is a performance monitor unit (PMU)
>> feature on Intel processors that records a running trace of the most
>> recent branches taken by the processor in the LBR stack. The QEMU
>> could configure whether it's enabled or not for each guest via CLI.
>>
>> The LBR feature would be enabled on the guest if:
>> - the KVM is enabled and the PMU is enabled and,
>> - the msr-based-feature IA32_PERF_CAPABILITIES is supporterd on KVM and,
>> - the supported returned value for lbr_fmt from this msr is not zero and,
>> - the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
>> - the configured lbr-fmt value is the same as the host lbr_fmt value
>>    OR use the QEMU option "-cpu host,migratable=no".
> 
> I don't understand why "migratable" matters here.  "migratable"
> is just a convenience property to get better defaults when using
> "-cpu host".  I don't know why it would change the lbr-fmt
> validation rules.

Your comments bevlow help me understand why we introduced "migratable"
and I'll fllow it.

> 
>>
>> Signed-off-by: Like Xu <like.xu@linux.intel.com>
>> ---
> 
> A changelog explaining what you changed since v1 would have been
> useful here.

Sorry for inconvenience.

> 
>>   target/i386/cpu.c     | 34 ++++++++++++++++++++++++++++++++++
>>   target/i386/cpu.h     | 10 ++++++++++
>>   target/i386/kvm/kvm.c | 10 ++++++++--
>>   3 files changed, 52 insertions(+), 2 deletions(-)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index ad99cad0e7..9c8e54aa6f 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -6623,6 +6623,10 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>>       }
>>   
>>       for (w = 0; w < FEATURE_WORDS; w++) {
>> +        if (w == FEAT_PERF_CAPABILITIES) {
>> +            continue;
>> +        }
>> +
> 
> Why exactly is this necessary?  I expected to be completely OK to
> call mark_unavailable_features() multiple times for the same
> FeatureWord.
> 

OK.

> If there's a reason why this is necessary, I suggest adding a
> comment explaining why.
> 
>>           uint64_t host_feat =
>>               x86_cpu_get_supported_feature_word(w, false);
>>           uint64_t requested_features = env->features[w];
>> @@ -6630,6 +6634,27 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>>           mark_unavailable_features(cpu, w, unavailable_features, prefix);
>>       }
>>   
>> +    uint64_t host_perf_cap =
>> +        x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
>> +    if (!cpu->lbr_fmt && !cpu->migratable) {
>> +        cpu->lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;
> 
> "migratable=no" is not a request to override explicit user
> settings.  This is why we have the ~env->user_features masking
> inside x86_cpu_expand_features() when initializing
> env->features[].
> 
> In either case, I don't understand why you need the lines above.
> "migratable=no" should already trigger the x86_cpu_get_supported_feature_word()
> loop inside x86_cpu_expand_features(), and it should initialize
> env->features[FEAT_PERF_CAPABILITIES] with the host value.  Isn't
> that code working for you?
> 
> 
>> +        if (cpu->lbr_fmt) {
>> +            info_report("vPMU: The value of lbr-fmt has been adjusted "
>> +                        "to 0x%lx and guest LBR is enabled.",
>> +                        host_perf_cap & PERF_CAP_LBR_FMT);
> 
> 
>>From your other message:
> 
> (I'm assuming your examples are for a lbr-fmt=5 host)
> 
>> "-cpu host,migratable=no" --> "Enable guest LBR and show warning"
> 
> Enabling guest LBR in this case is 100% OK, isn't it?  I don't
> think you need to show a warning.
> 
> 
>> "-cpu host,migratable=no,lbr-fmt=0" --> "Enable guest LBR and show warning"
> 
> Why?  In this case, we should do what the user asked for whenever
> possible, and the user is explicitly asking lbr-fmt to be 0.
> 
>> "-cpu host,migratable=no,lbr-fmt=5" --> "Enable guest LBR"
> 
> Looks OK.
> 
>> "-cpu host,migratable=no,lbr-fmt=6" --> "Disable guest LBR and show warning"
> 
> Makes sense to me[1].
> 
> 
>> +        }
>> +    } else {
>> +        uint64_t requested_lbr_fmt = cpu->lbr_fmt & PERF_CAP_LBR_FMT;
>> +        if (requested_lbr_fmt && kvm_enabled()) {
> 
> 
>>From your other message:
> 
>> "-cpu host,lbr-fmt=0" --> "Disable guest LBR"
> 
> Makes sense to me.  I understand this as a confirmation that it's
> OK to have a guest/host mismatch if guest LBR=0.
> 
>> "-cpu host,lbr-fmt=5" --> "Enable guest LBR"
> 
> Makes sense to me.
> 
>> "-cpu host,lbr-fmt=6" --> "Disable guest LBR and show warning"
> 
> Makes sense to me[1].
> 
> 
> [1] As long as "show warning" becomes "fatal error" if enforce=1.
>      mark_unavailable_features() should make sure this happens.
> 
>      Or maybe we should make this an error?  It would be even
>      better.  The example code below makes it an error.
> 
> 
>> +            if (requested_lbr_fmt != (host_perf_cap & PERF_CAP_LBR_FMT)) {
>> +                cpu->lbr_fmt = 0;
>> +                warn_report("vPMU: The supported lbr-fmt value on the host "
>> +                            "is 0x%lx and guest LBR is disabled.",
>> +                            host_perf_cap & PERF_CAP_LBR_FMT);
>> +            }
>> +        }
>> +    }
>> +
>>       if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
>>           kvm_enabled()) {
>>           KVMState *s = CPU(cpu)->kvm_state;
>> @@ -6734,6 +6759,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>>           }
>>       }
>>   
>> +    if (cpu->lbr_fmt) {
>> +        if (!cpu->enable_pmu) {
>> +            error_setg(errp, "LBR is unsupported since guest PMU is disabled.");
>> +            return;
>> +        }
>> +        env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
> 
> This doesn't seem right, as we should still do what the user
> asked for if "lbr-fmt=0" is used.
> 
> You need to differentiate "lbr-fmt=0" from "lbr-fmt not set"
> somehow.  I suggest initializing lbr_fmt to 0xFF by default,
> so you can override env->features[FEAT_PERF_CAPABILITIES]
> when lbr_fmt != 0xFF (even if lbr_fmt=0).


> 
> Something like this:
> 
>    #define LBR_FMT_UNSET 0xff
>    ...
>    DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, LBR_FMT_UNSET)
>    ...
> 
>    void x86_cpu_realizefn(...)
>    {
>        ...
>        if (cpu->lbr_fmt != LBR_FMT_UNSET) {
>            if ((cpu->lbr_fmt & LBR_FMT_FMT) != cpu->lbr_fmt) {
>                error_setg(errp, "invalid lbr-fmt" ...);
>                return;
>            }
>            env->features[FEAT_PERF_CAPABILITIES] &= ~PERF_CAP_LBR_FMT;
>            env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
>        }
>        /* If lbr_fmt == LBR_FMT_UNSET, the default value of env->features[]
>         * will be used as is (and it may depend on the "migratable" flag)
>         */

How about the user use "-cpu host,lbr-fmt=0xff" ?

The proposed code does nothing about warning or error,
but implicitly uses the the default value of env->features[]

Users may have an illusion that the "lbr-fmt=0xff" is a "valid" lbr-fmt
and it may enable guest LBR (depend on the "migratable" flag).

>        ...
>        /*
>         * We can always validate env->features[FEAT_PERF_CAPABILITIES],
>         * no matter how it was initialized:
>         */
>        uint64_t requested_lbr_fmt =
>            env->features[FEAT_PERF_CAPABILITIES] & PERF_CAP_LBR_FMT;
>        if (requested_lbr_fmt && kvm_enabled()) {
>            /* Maybe this code will work out of the box for all
>             * accelerators, but checking kvm_enabled() is safer.
>             */
>            uint64_t host_perf_cap =
>                x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
>            uint64_t host_lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;
>            if (!cpu->enable_pmu) {
>                error_setg(errp, "LBR is unsupported without pmu=on");
>                return;
>            }
>            if (requested_lbr_fmt != host_lbr_fmt)) {
>                /* An error is better than a warning */

OK.

>                error_setg(errp, "lbr-fmt mismatch" ...);
>                /* probably a good idea to include requested_lbr_fmt
>                 * and host_lbr_fmt in the error message */
>                return;
>            }
>        }
>        ...
>    }
> 
> 
> 
>> +    }
>> +
>>       /* mwait extended info: needed for Core compatibility */
>>       /* We always wake on interrupt even if host does not have the capability */
>>       cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
>> @@ -7300,6 +7333,7 @@ static Property x86_cpu_properties[] = {
>>   #endif
>>       DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
>>       DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
>> +    DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),

>>   
>>       DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
>>                          HYPERV_SPINLOCK_NEVER_NOTIFY),
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index 570f916878..b12c879fc4 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -354,6 +354,7 @@ typedef enum X86Seg {
>>   #define ARCH_CAP_TSX_CTRL_MSR		(1<<7)
>>   
>>   #define MSR_IA32_PERF_CAPABILITIES      0x345
>> +#define PERF_CAP_LBR_FMT      0x3f
>>   
>>   #define MSR_IA32_TSX_CTRL		0x122
>>   #define MSR_IA32_TSCDEADLINE            0x6e0
>> @@ -1726,6 +1727,15 @@ struct X86CPU {
>>        */
>>       bool enable_pmu;
>>   
>> +    /*
>> +     * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR.
>> +     * This can't be enabled by default yet because it doesn't have
>> +     * ABI stability guarantees, as it is only allowed to pass all
>> +     * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature()
>> +     * (that depends on host CPU and kernel capabilities) to the guest.
>> +     */
>> +    uint8_t lbr_fmt;
>> +
>>       /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
>>        * disabled by default to avoid breaking migration between QEMU with
>>        * different LMCE configurations.
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index 7fe9f52710..aa926984ae 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -2732,8 +2732,14 @@ static void kvm_msr_entry_add_perf(X86CPU *cpu, FeatureWordArray f)
>>                                              MSR_IA32_PERF_CAPABILITIES);
>>   
>>       if (kvm_perf_cap) {
>> -        kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
>> -                        kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
>> +        kvm_perf_cap = (cpu->migratable) ?
>> +            (kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]) : kvm_perf_cap;
> 
> I don't understand why you are checking cpu->migratable here.
> The CPU code should ensure f[FEAT_PERF_CAPABILITIES] is
> initialized correctly before calling kvm_arch_init_vcpu().

OK.

> 
>> +
>> +        if (!cpu->lbr_fmt) {
>> +            kvm_perf_cap &= ~PERF_CAP_LBR_FMT;
>> +        }
> 
> Likewise: this should be done by the CPU initialization code
> before kvm_arch_init_vcpu() gets called.
> 
> The existing code looks weird here: what's the purpose of the
> kvm_arch_get_supported_msr_feature() call in this function?
> 
> env->features[] is supposed to reflect what the guest actually
> sees.  x86_cpu_realizefn()/x86_cpu_filter_features() is supposed
> to ensure that before calling kvm_arch_init_vcpu().  If there's a
> mismatch between env->features and what the guest sees, we have a
> problem.

Thanks for your clarification and I'll follow it.

> 
> If you want to be 100% sure, maybe you can add an assert() here.
> But if the function is receiving invalid input it's too late to
> fix the value.
> 
>> +
>> +        kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES, kvm_perf_cap);
>>       }
>>   }
>>   
>> -- 
>> 2.30.2
>>
>
Eduardo Habkost April 30, 2021, 10:27 p.m. UTC | #3
On Fri, Apr 30, 2021 at 11:20:08AM +0800, Like Xu wrote:
[...]
> > > +    if (cpu->lbr_fmt) {
> > > +        if (!cpu->enable_pmu) {
> > > +            error_setg(errp, "LBR is unsupported since guest PMU is disabled.");
> > > +            return;
> > > +        }
> > > +        env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
> > 
> > This doesn't seem right, as we should still do what the user
> > asked for if "lbr-fmt=0" is used.
> > 
> > You need to differentiate "lbr-fmt=0" from "lbr-fmt not set"
> > somehow.  I suggest initializing lbr_fmt to 0xFF by default,
> > so you can override env->features[FEAT_PERF_CAPABILITIES]
> > when lbr_fmt != 0xFF (even if lbr_fmt=0).
> 
> 
> > 
> > Something like this:
> > 
> >    #define LBR_FMT_UNSET 0xff
> >    ...
> >    DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, LBR_FMT_UNSET)
> >    ...
> > 
> >    void x86_cpu_realizefn(...)
> >    {
> >        ...
> >        if (cpu->lbr_fmt != LBR_FMT_UNSET) {
> >            if ((cpu->lbr_fmt & LBR_FMT_FMT) != cpu->lbr_fmt) {
> >                error_setg(errp, "invalid lbr-fmt" ...);
> >                return;
> >            }
> >            env->features[FEAT_PERF_CAPABILITIES] &= ~PERF_CAP_LBR_FMT;
> >            env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
> >        }
> >        /* If lbr_fmt == LBR_FMT_UNSET, the default value of env->features[]
> >         * will be used as is (and it may depend on the "migratable" flag)
> >         */
> 
> How about the user use "-cpu host,lbr-fmt=0xff" ?
> 
> The proposed code does nothing about warning or error,
> but implicitly uses the the default value of env->features[]
> 
> Users may have an illusion that the "lbr-fmt=0xff" is a "valid" lbr-fmt
> and it may enable guest LBR (depend on the "migratable" flag).

You are correct, lbr-fmt=0xff will be synonymous to "use default
lbr-fmt", but this won't be obvious.

You can avoid this by validating the user-provided value in a
property setter.  Or you could just document that 0xff has a
special meaning, to avoid a custom setter.  I believe custom
setters are more likely to cause us problems in the future than
users fiddling with obviously an invalid lbr-fmt value.
diff mbox series

Patch

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad99cad0e7..9c8e54aa6f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6623,6 +6623,10 @@  static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
     }
 
     for (w = 0; w < FEATURE_WORDS; w++) {
+        if (w == FEAT_PERF_CAPABILITIES) {
+            continue;
+        }
+
         uint64_t host_feat =
             x86_cpu_get_supported_feature_word(w, false);
         uint64_t requested_features = env->features[w];
@@ -6630,6 +6634,27 @@  static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
         mark_unavailable_features(cpu, w, unavailable_features, prefix);
     }
 
+    uint64_t host_perf_cap =
+        x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
+    if (!cpu->lbr_fmt && !cpu->migratable) {
+        cpu->lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;
+        if (cpu->lbr_fmt) {
+            info_report("vPMU: The value of lbr-fmt has been adjusted "
+                        "to 0x%lx and guest LBR is enabled.",
+                        host_perf_cap & PERF_CAP_LBR_FMT);
+        }
+    } else {
+        uint64_t requested_lbr_fmt = cpu->lbr_fmt & PERF_CAP_LBR_FMT;
+        if (requested_lbr_fmt && kvm_enabled()) {
+            if (requested_lbr_fmt != (host_perf_cap & PERF_CAP_LBR_FMT)) {
+                cpu->lbr_fmt = 0;
+                warn_report("vPMU: The supported lbr-fmt value on the host "
+                            "is 0x%lx and guest LBR is disabled.",
+                            host_perf_cap & PERF_CAP_LBR_FMT);
+            }
+        }
+    }
+
     if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
         kvm_enabled()) {
         KVMState *s = CPU(cpu)->kvm_state;
@@ -6734,6 +6759,14 @@  static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
         }
     }
 
+    if (cpu->lbr_fmt) {
+        if (!cpu->enable_pmu) {
+            error_setg(errp, "LBR is unsupported since guest PMU is disabled.");
+            return;
+        }
+        env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
+    }
+
     /* mwait extended info: needed for Core compatibility */
     /* We always wake on interrupt even if host does not have the capability */
     cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
@@ -7300,6 +7333,7 @@  static Property x86_cpu_properties[] = {
 #endif
     DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
     DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+    DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),
 
     DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
                        HYPERV_SPINLOCK_NEVER_NOTIFY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 570f916878..b12c879fc4 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -354,6 +354,7 @@  typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR		(1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES      0x345
+#define PERF_CAP_LBR_FMT      0x3f
 
 #define MSR_IA32_TSX_CTRL		0x122
 #define MSR_IA32_TSCDEADLINE            0x6e0
@@ -1726,6 +1727,15 @@  struct X86CPU {
      */
     bool enable_pmu;
 
+    /*
+     * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR.
+     * This can't be enabled by default yet because it doesn't have
+     * ABI stability guarantees, as it is only allowed to pass all
+     * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature()
+     * (that depends on host CPU and kernel capabilities) to the guest.
+     */
+    uint8_t lbr_fmt;
+
     /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
      * disabled by default to avoid breaking migration between QEMU with
      * different LMCE configurations.
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7fe9f52710..aa926984ae 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2732,8 +2732,14 @@  static void kvm_msr_entry_add_perf(X86CPU *cpu, FeatureWordArray f)
                                            MSR_IA32_PERF_CAPABILITIES);
 
     if (kvm_perf_cap) {
-        kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
-                        kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
+        kvm_perf_cap = (cpu->migratable) ?
+            (kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]) : kvm_perf_cap;
+
+        if (!cpu->lbr_fmt) {
+            kvm_perf_cap &= ~PERF_CAP_LBR_FMT;
+        }
+
+        kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES, kvm_perf_cap);
     }
 }