[1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode

Message ID 20180405175631.31381-2-npiggin@gmail.com
State New
Series
  • KVM powerpc tlbie scalability improvement

Commit Message

Nicholas Piggin April 5, 2018, 5:56 p.m.
This crashes with a "Bad real address for load" attempting to load
from the vmalloc region in realmode (faulting address is in DAR).

  Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
  NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
  REGS: c000000fff76dd80 TRAP: 0200   Not tainted  (4.16.0-01530-g43d1859f0994)
  MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
  CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
  NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
  LR [c0000000000c2430] do_tlbies+0x230/0x2f0

I suspect the reason is the per-cpu data is not in the linear chunk.
This could be restored if that was able to be fixed, but for now,
just remove the tracepoints.

Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Balbir Singh April 8, 2018, 10:17 a.m. | #1
On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote:
> This crashes with a "Bad real address for load" attempting to load
> from the vmalloc region in realmode (faulting address is in DAR).
>
>   Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
>   LE SMP NR_CPUS=2048 NUMA PowerNV
>   CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
>   NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
>   REGS: c000000fff76dd80 TRAP: 0200   Not tainted  (4.16.0-01530-g43d1859f0994)
>   MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
>   CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
>   NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
>   LR [c0000000000c2430] do_tlbies+0x230/0x2f0
>
> I suspect the reason is the per-cpu data is not in the linear chunk.
> This could be restored if that was able to be fixed, but for now,
> just remove the tracepoints.

Could you share the stack trace as well? I've not observed this in my testing.
Maybe I don't have as many CPUs. I presume you're talking about the per-cpu
data offsets for per-cpu trace data?

Balbir Singh.
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas Piggin April 8, 2018, 1:41 p.m. | #2
On Sun, 8 Apr 2018 20:17:47 +1000
Balbir Singh <bsingharora@gmail.com> wrote:

> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote:
> > This crashes with a "Bad real address for load" attempting to load
> > from the vmalloc region in realmode (faulting address is in DAR).
> >
> >   Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
> >   LE SMP NR_CPUS=2048 NUMA PowerNV
> >   CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
> >   NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
> >   REGS: c000000fff76dd80 TRAP: 0200   Not tainted  (4.16.0-01530-g43d1859f0994)
> >   MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
> >   CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
> >   NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
> >   LR [c0000000000c2430] do_tlbies+0x230/0x2f0
> >
> > I suspect the reason is the per-cpu data is not in the linear chunk.
> > This could be restored if that was able to be fixed, but for now,
> > just remove the tracepoints.  
> 
> Could you share the stack trace as well? I've not observed this in my testing.

I can't seem to find it; I can try to reproduce it tomorrow. It was coming
from the h_remove hcall from the guest. The machine has 176 logical CPUs.

> Maybe I don't have as many CPUs. I presume you're talking about the per-cpu
> data offsets for per-cpu trace data?

It looked like it was dereferencing virtually mapped per-cpu data, yes.
Probably the perf_events deref.

Thanks,
Nick

Michael Ellerman April 10, 2018, 3:21 a.m. | #3
Nicholas Piggin <npiggin@gmail.com> writes:

> On Sun, 8 Apr 2018 20:17:47 +1000
> Balbir Singh <bsingharora@gmail.com> wrote:
>
>> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote:
>> > This crashes with a "Bad real address for load" attempting to load
>> > from the vmalloc region in realmode (faulting address is in DAR).
>> >
>> >   Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
>> >   LE SMP NR_CPUS=2048 NUMA PowerNV
>> >   CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
>> >   NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
>> >   REGS: c000000fff76dd80 TRAP: 0200   Not tainted  (4.16.0-01530-g43d1859f0994)
>> >   MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
>> >   CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
>> >   NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
>> >   LR [c0000000000c2430] do_tlbies+0x230/0x2f0
>> >
>> > I suspect the reason is the per-cpu data is not in the linear chunk.
>> > This could be restored if that was able to be fixed, but for now,
>> > just remove the tracepoints.  
>> 
>> Could you share the stack trace as well? I've not observed this in my testing.
>
> I can't seem to find it; I can try to reproduce it tomorrow. It was coming
> from the h_remove hcall from the guest. The machine has 176 logical CPUs.
>
>> Maybe I don't have as many CPUs. I presume you're talking about the per-cpu
>> data offsets for per-cpu trace data?
>
> It looked like it was dereferencing virtually mapped per-cpu data, yes.
> Probably the perf_events deref.

Naveen has posted a series to (hopefully) fix this, which just missed
the merge window:

  https://patchwork.ozlabs.org/patch/894757/


cheers
Naveen N. Rao April 10, 2018, 5:55 a.m. | #4
Michael Ellerman wrote:
> Nicholas Piggin <npiggin@gmail.com> writes:
> 
>> On Sun, 8 Apr 2018 20:17:47 +1000
>> Balbir Singh <bsingharora@gmail.com> wrote:
>>
>>> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote:
>>> > This crashes with a "Bad real address for load" attempting to load
>>> > from the vmalloc region in realmode (faulting address is in DAR).
>>> >
>>> >   Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
>>> >   LE SMP NR_CPUS=2048 NUMA PowerNV
>>> >   CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
>>> >   NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
>>> >   REGS: c000000fff76dd80 TRAP: 0200   Not tainted  (4.16.0-01530-g43d1859f0994)
>>> >   MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
>>> >   CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
>>> >   NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
>>> >   LR [c0000000000c2430] do_tlbies+0x230/0x2f0
>>> >
>>> > I suspect the reason is the per-cpu data is not in the linear chunk.
>>> > This could be restored if that was able to be fixed, but for now,
>>> > just remove the tracepoints.  
>>> 
>>> Could you share the stack trace as well? I've not observed this in my testing.
>>
>> I can't seem to find it; I can try to reproduce it tomorrow. It was coming
>> from the h_remove hcall from the guest. The machine has 176 logical CPUs.
>>
>>> Maybe I don't have as many CPUs. I presume you're talking about the per-cpu
>>> data offsets for per-cpu trace data?
>>
>> It looked like it was dereferencing virtually mapped per-cpu data, yes.
>> Probably the perf_events deref.
> 
> Naveen has posted a series to (hopefully) fix this, which just missed
> the merge window:
> 
>   https://patchwork.ozlabs.org/patch/894757/

I'm afraid that won't actually help here :(
That series is specific to the function tracer, while this is using 
static tracepoints.

We could convert trace_tlbie() to a TRACE_EVENT_CONDITION() and guard it 
within a check for paca->ftrace_enabled, but that would only be useful 
if the below callsites can ever be hit outside of KVM guest mode.

- Naveen
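
The guarded-tracepoint idea can be sketched as a small user-space model. This is not kernel code: struct model_paca, emit_tlbie_record(), trace_tlbie_cond() and model_do_tlbies() are hypothetical names invented for illustration, and the flag merely stands in for paca->ftrace_enabled. In the kernel the condition would instead be expressed through TRACE_EVENT_CONDITION().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU paca; only the flag matters here. */
struct model_paca {
	bool ftrace_enabled;
};

/* Number of trace records emitted so far. */
static unsigned long trace_records;

/* Stand-in for the tracepoint body, which would touch per-cpu trace data. */
static void emit_tlbie_record(unsigned long rb)
{
	(void)rb;
	trace_records++;
}

/* Conditional tracepoint: test the flag first, emit only when allowed. */
static void trace_tlbie_cond(const struct model_paca *paca, unsigned long rb)
{
	if (!paca->ftrace_enabled)
		return;
	emit_tlbie_record(rb);
}

/* Model of the invalidation loop: one guarded trace call per page. */
static void model_do_tlbies(const struct model_paca *paca,
			    const unsigned long *rbvalues, long npages)
{
	for (long i = 0; i < npages; ++i)
		trace_tlbie_cond(paca, rbvalues[i]);
}
```

With the flag cleared, the loop emits no records at all, which is the behaviour wanted while running in real mode. With the flag set, every page produces one record.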


Nicholas Piggin April 10, 2018, 6:10 a.m. | #5
On Tue, 10 Apr 2018 11:25:02 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> Michael Ellerman wrote:
> > Nicholas Piggin <npiggin@gmail.com> writes:
> >   
> >> On Sun, 8 Apr 2018 20:17:47 +1000
> >> Balbir Singh <bsingharora@gmail.com> wrote:
> >>  
> >>> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote:  
> >>> > This crashes with a "Bad real address for load" attempting to load
> >>> > from the vmalloc region in realmode (faulting address is in DAR).
> >>> >
> >>> >   Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
> >>> >   LE SMP NR_CPUS=2048 NUMA PowerNV
> >>> >   CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
> >>> >   NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
> >>> >   REGS: c000000fff76dd80 TRAP: 0200   Not tainted  (4.16.0-01530-g43d1859f0994)
> >>> >   MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
> >>> >   CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
> >>> >   NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
> >>> >   LR [c0000000000c2430] do_tlbies+0x230/0x2f0
> >>> >
> >>> > I suspect the reason is the per-cpu data is not in the linear chunk.
> >>> > This could be restored if that was able to be fixed, but for now,
> >>> > just remove the tracepoints.    
> >>> 
> >>> Could you share the stack trace as well? I've not observed this in my testing.  
> >>
> >> I can't seem to find it; I can try to reproduce it tomorrow. It was coming
> >> from the h_remove hcall from the guest. The machine has 176 logical CPUs.
> >>  
> >>> Maybe I don't have as many CPUs. I presume you're talking about the per-cpu
> >>> data offsets for per-cpu trace data?
> >>
> >> It looked like it was dereferencing virtually mapped per-cpu data, yes.
> >> Probably the perf_events deref.  
> > 
> > Naveen has posted a series to (hopefully) fix this, which just missed
> > the merge window:
> > 
> >   https://patchwork.ozlabs.org/patch/894757/  
> 
> I'm afraid that won't actually help here :(
> That series is specific to the function tracer, while this is using 
> static tracepoints.
> 
> We could convert trace_tlbie() to a TRACE_EVENT_CONDITION() and guard it 
> within a check for paca->ftrace_enabled, but that would only be useful 
> if the below callsites can ever be hit outside of KVM guest mode.

Right, removing the trace points is the right thing to do here.

Doing tracing in real mode would be a whole effort in itself, I'd expect.
The alternative would be disabling realmode handling of HPT hcalls while
tracepoints are active.

Thanks,
Nick
Michael Ellerman April 11, 2018, 2:49 p.m. | #6
On Thu, 2018-04-05 at 17:56:30 UTC, Nicholas Piggin wrote:
> This crashes with a "Bad real address for load" attempting to load
> from the vmalloc region in realmode (faulting address is in DAR).
> 
>   Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
>   LE SMP NR_CPUS=2048 NUMA PowerNV
>   CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
>   NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
>   REGS: c000000fff76dd80 TRAP: 0200   Not tainted  (4.16.0-01530-g43d1859f0994)
>   MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
>   CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
>   NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
>   LR [c0000000000c2430] do_tlbies+0x230/0x2f0
> 
> I suspect the reason is the per-cpu data is not in the linear chunk.
> This could be restored if that was able to be fixed, but for now,
> just remove the tracepoints.
> 
> Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/19ce7909ed11c49f7eddf59e7f49cd

cheers

Patch

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index e1c083fbe434..78e6a392330f 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -470,8 +470,6 @@  static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
 		for (i = 0; i < npages; ++i) {
 			asm volatile(PPC_TLBIE_5(%0,%1,0,0,0) : :
 				     "r" (rbvalues[i]), "r" (kvm->arch.lpid));
-			trace_tlbie(kvm->arch.lpid, 0, rbvalues[i],
-				kvm->arch.lpid, 0, 0, 0);
 		}
 
 		if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) {
@@ -492,8 +490,6 @@  static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
 		for (i = 0; i < npages; ++i) {
 			asm volatile(PPC_TLBIEL(%0,%1,0,0,0) : :
 				     "r" (rbvalues[i]), "r" (0));
-			trace_tlbie(kvm->arch.lpid, 1, rbvalues[i],
-				0, 0, 0, 0);
 		}
 		asm volatile("ptesync" : : : "memory");
 	}