diff mbox

oops/warning report for the week of November 26, 2008

Message ID 20081127122800.45fc0b1a@linux.intel.com
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Arjan van de Ven Nov. 27, 2008, 8:28 p.m. UTC
On Thu, 27 Nov 2008 21:18:36 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Arjan van de Ven <arjan@linux.intel.com> wrote:
> 
> > Ingo Molnar wrote:
> >> * Arjan van de Ven <arjan@linux.intel.com> wrote:
> >>
> >>> Rank 8: mtrr_trim_uncached_memory (warning)
> >>> 	Reported 227 times (619 total reports)
> >>> 	There is a high number of machines where our MTRR checks
> >>> 	trigger. I suspect we are too picky in accepting the MTRR
> >>> 	configuration.
> >>
> >> the warning here means: "the BIOS messed up but we fixed it up for
> >> you just fine".
> >
> > I don't believe that right now. we see so many of these, including 
> > many "there's no MTRRs at all", that I am seriously suspecting that 
> > our code is just incorrect somehow and triggering too much.
> 
> well we looked at existing reports and Linux was right to fix them
> up. Show us one that is incorrect, then we can fix it up.
> 
> the "no MTRR's" are vmware/(also qemu?) guests not implementing a
> full CPU emulation.

... and it's still our fault in part, since we don't even check to see
if a cpu claims to support MTRR before complaining about it...

easy to fix though:

From 7e987ae541c41ce908b414fee9d8e2fd2099a083 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Thu, 27 Nov 2008 12:25:47 -0800
Subject: [PATCH] x86: make sure the CPU advertizes MTRR support before complaining about the lack thereoff...

We complain loudly if a CPU does not have MTRR support... but we don't check if the CPU
exposes MTRR support in the CPUID flags first. While this might not fix all of the
broken virtualization systems out there, it will at least fix those that properly don't
advertize things they don't support.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
 arch/x86/kernel/cpu/mtrr/main.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

Comments

Ingo Molnar Nov. 27, 2008, 8:47 p.m. UTC | #1
* Arjan van de Ven <arjan@linux.intel.com> wrote:

> On Thu, 27 Nov 2008 21:18:36 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Arjan van de Ven <arjan@linux.intel.com> wrote:
> > 
> > > Ingo Molnar wrote:
> > >> * Arjan van de Ven <arjan@linux.intel.com> wrote:
> > >>
> > >>> Rank 8: mtrr_trim_uncached_memory (warning)
> > >>> 	Reported 227 times (619 total reports)
> > >>> 	There is a high number of machines where our MTRR checks
> > >>> 	trigger. I suspect we are too picky in accepting the MTRR
> > >>> 	configuration.
> > >>
> > >> the warning here means: "the BIOS messed up but we fixed it up for
> > >> you just fine".
> > >
> > > I don't believe that right now. we see so many of these, including 
> > > many "there's no MTRRs at all", that I am seriously suspecting that 
> > > our code is just incorrect somehow and triggering too much.
> > 
> > well we looked at existing reports and Linux was right to fix them
> > up. Show us one that is incorrect, then we can fix it up.
> > 
> > the "no MTRR's" are vmware/(also qemu?) guests not implementing a
> > full CPU emulation.
> 
> ... and it's still our fault in part, since we don't even check to 
> see if a cpu claims to support MTRR before complaining about it...
> 
> easy to fix though:

IIRC the problem is that vmware _does_ claim that it supports MTRRs. 

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arjan van de Ven Nov. 27, 2008, 8:53 p.m. UTC | #2
On Thu, 27 Nov 2008 21:47:14 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> IIRC the problem is that vmware _does_ claim that it supports MTRRs. 

it might.
but even if they would fix that, we would still WARN (
at least we should do our side correctly...
H. Peter Anvin Nov. 27, 2008, 9:18 p.m. UTC | #3
Arjan van de Ven wrote:
> +	if (!cpu_has_mtrr)
> +		return 0;
>  	if (!is_cpu(INTEL) || disable_mtrr_trim)
>  		return 0;
>  	rdmsr(MTRRdefType_MSR, def, dummy);

cpu_has_mtrr there should presumably replace is_cpu(INTEL).  I'm not 
sure if this can be replaced by use_intel(); in particular use_intel() 
relies on mtrr_if having been initialized.

Looking...

	-hpa (out of town for Thanksgiving)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Nov. 27, 2008, 9:18 p.m. UTC | #4
Arjan van de Ven wrote:
> On Thu, 27 Nov 2008 21:18:36 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
> 
>> * Arjan van de Ven <arjan@linux.intel.com> wrote:
>>
>>> Ingo Molnar wrote:
>>>> * Arjan van de Ven <arjan@linux.intel.com> wrote:
>>>>
>>>>> Rank 8: mtrr_trim_uncached_memory (warning)
>>>>> 	Reported 227 times (619 total reports)
>>>>> 	There is a high number of machines where our MTRR checks
>>>>> 	trigger. I suspect we are too picky in accepting the MTRR
>>>>> 	configuration.
>>>> the warning here means: "the BIOS messed up but we fixed it up for
>>>> you just fine".
>>> I don't believe that right now. we see so many of these, including 
>>> many "there's no MTRRs at all", that I am seriously suspecting that 
>>> our code is just incorrect somehow and triggering too much.
>> well we looked at existing reports and Linux was right to fix them
>> up. Show us one that is incorrect, then we can fix it up.
>>
>> the "no MTRR's" are vmware/(also qemu?) guests not implementing a
>> full CPU emulation.
> 
> ... and it's still our fault in part, since we don't even check to see
> if a cpu claims to support MTRR before complaining about it...
> 
> easy to fix though:
> 
> From 7e987ae541c41ce908b414fee9d8e2fd2099a083 Mon Sep 17 00:00:00 2001
> From: Arjan van de Ven <arjan@linux.intel.com>
> Date: Thu, 27 Nov 2008 12:25:47 -0800
> Subject: [PATCH] x86: make sure the CPU advertizes MTRR support before complaining about the lack thereoff...
> 
> We complain loudly if a CPU does not have MTRR support... but we don't check if the CPU
> exposes MTRR support in the CPUID flags first. While this might not fix all of the
> broken virtualization systems out there, it will at least fix those that properly don't
> advertize things they don't support.
> 
> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/mtrr/main.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> index 1159e26..0044e61 100644
> --- a/arch/x86/kernel/cpu/mtrr/main.c
> +++ b/arch/x86/kernel/cpu/mtrr/main.c
> @@ -1567,6 +1567,8 @@ int __init mtrr_trim_uncached_memory(unsigned long end_pfn)
>  	 * Make sure we only trim uncachable memory on machines that
>  	 * support the Intel MTRR architecture:
>  	 */
> +	if (!cpu_has_mtrr)
> +		return 0;

that is not needed, we already check that in mtrr_bp_init before this function is called, and it will assign mtrr_if

and
#define is_cpu(vnd)     (mtrr_if && mtrr_if->vendor == X86_VENDOR_##vnd)

will make it sure mtrr is there.

ps: here INTEL mean any cpu has same interface like intel cpu's

YH

>  	if (!is_cpu(INTEL) || disable_mtrr_trim)
>  		return 0;
>  	rdmsr(MTRRdefType_MSR, def, dummy);

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
H. Peter Anvin Nov. 27, 2008, 9:42 p.m. UTC | #5
Arjan van de Ven wrote:
> 
> diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> index 1159e26..0044e61 100644
> --- a/arch/x86/kernel/cpu/mtrr/main.c
> +++ b/arch/x86/kernel/cpu/mtrr/main.c
> @@ -1567,6 +1567,8 @@ int __init mtrr_trim_uncached_memory(unsigned long end_pfn)
>  	 * Make sure we only trim uncachable memory on machines that
>  	 * support the Intel MTRR architecture:
>  	 */
> +	if (!cpu_has_mtrr)
> +		return 0;
>  	if (!is_cpu(INTEL) || disable_mtrr_trim)
>  		return 0;
>  	rdmsr(MTRRdefType_MSR, def, dummy);

Okay... is_cpu() here is defined as:

#define is_cpu(vnd)      (mtrr_if && mtrr_if->vendor == X86_VENDOR_##vnd)

... so an MTRR interface has been identified.  Therefore testing 
cpu_has_mtrr is redundant.

As far as use_intel() versus is_cpu(INTEL), it looks to me as though the 
two are identical in the current code -- mtrr_if->vendor is never set in 
the generic code, and so defaults to 0 - meaning X86_VENDOR_INTEL.

All in all, it looks like the vendor ID stuff is a bad case of "works by 
accident" in the MTRR code, however, *given the current code* I conclude 
that is_cpu(INTEL) == use_intel() and that neither can be true without 
MTRRs enabled.

	-hpa
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ingo Molnar Nov. 28, 2008, 8:34 a.m. UTC | #6
* Arjan van de Ven <arjan@linux.intel.com> wrote:

> On Thu, 27 Nov 2008 21:47:14 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > IIRC the problem is that vmware _does_ claim that it supports MTRRs. 
> 
> it might.
> but even if they would fix that, we would still WARN (
> at least we should do our side correctly...

As pointed out in other parts of the thread, that is not the case.

Anyway, as i said it in the onset, if you think we should remove the 
warning altogether, or tweak it, we can do that - it is important to 
have relevant warnings show up in kerneloops.org.

To sum it up: the only remaining MTRR warnings we know of are either:

 1) apparently genuine BIOS bugs that do cause problems if the (new) 
    kernel does not fix them up.

    The MTRR warning is relevant and correct in those cases.

or:

 2) sucky virtualization solutions that cheat the guest OS by faking 
    "MTRR support" in the CPUID info, but not actually showing any 
    MTRRs. These virtualization solutions do not even properly 
    identify themselves to the kernel.

    The MTRR warning is unnecessary in this case.

So what we did in the x86 tree was remove the warning in the second 
case - is to properly identify vmware (and in general, virtualization) 
guests.

It was not a simple oneliner:

 earth4:~/tip> gll linus..x86/detect-hyper

 4e42ebd: x86: hypervisor - fix sparse warnings
 c450d78: x86: vmware - fix sparse warnings
 fd8cd7e: x86: vmware: look for DMI string in the product serial key
 6bdbfe9: x86: VMware: Fix vmware_get_tsc code
 395628e: x86: Skip verification by the watchdog for TSC clocksource.
 eca0cd0: x86: Add a synthetic TSC_RELIABLE feature bit.
 88b094f: x86: Hypervisor detection and get tsc_freq from hypervisor
 49ab56a: x86: add X86_FEATURE_HYPERVISOR feature bit
 b2bcc7b: x86: add a synthetic TSC_RELIABLE feature bit

and it will benefit vmware guests in many more areas than just a 
sharper MTRR warning message. That code is queued up for v2.6.29.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 1159e26..0044e61 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -1567,6 +1567,8 @@  int __init mtrr_trim_uncached_memory(unsigned long end_pfn)
 	 * Make sure we only trim uncachable memory on machines that
 	 * support the Intel MTRR architecture:
 	 */
+	if (!cpu_has_mtrr)
+		return 0;
 	if (!is_cpu(INTEL) || disable_mtrr_trim)
 		return 0;
 	rdmsr(MTRRdefType_MSR, def, dummy);