[RFC] arch/powerpc: Turn off irqs in switch_mm()

Message ID 20170419063826.1678-1-david@gibson.dropbear.id.au
State Accepted
Commit 9765ad134a00a01cbcc69c78ff6defbfad209bc5

Commit Message

David Gibson April 19, 2017, 6:38 a.m.
There seems to be a mismatch in expectations between the powerpc arch code
and the generic (and x86) code in terms of the irq state when switch_mm()
is called.

powerpc expects irqs to already be (soft) disabled when switch_mm() is
called, as made clear in the commit message of 9c1e105 "powerpc: Allow
perf_counters to access user memory at interrupt time".

That seems to be true when it's called from the scheduler, but not for
use_mm().  This becomes clear when looking at the x86 code paths for
switch_mm().  There, switch_mm() itself disables irqs, with a
switch_mm_irqs_off() variant which expects that to be already done.

This patch addresses the problem, making the powerpc code mirror the x86
code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/mmu_context.h | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

RH-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1437794

It seems that some more recent changes in vhost have made it more
likely to hit this problem, triggering a WARN.
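
For readers who want to see the calling convention in isolation, here is a
minimal user-space sketch of the wrapper pattern described above. It is not
kernel code: every mock_* name is a hypothetical stand-in invented for this
illustration, and a single boolean stands in for the CPU's interrupt state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the CPU's irq-enable state. */
static bool mock_irqs_enabled = true;

/* Mock of local_irq_save(): remember the old state, then disable. */
static unsigned long mock_irq_save(void)
{
	unsigned long flags = mock_irqs_enabled;

	mock_irqs_enabled = false;
	return flags;
}

/* Mock of local_irq_restore(): put back whatever state was saved. */
static void mock_irq_restore(unsigned long flags)
{
	mock_irqs_enabled = flags != 0;
}

/* Records whether irqs were enabled at the moment of the last switch. */
static bool irqs_on_during_switch;

/* Variant that relies on the caller having disabled irqs already. */
static void mock_switch_mm_irqs_off(void)
{
	irqs_on_during_switch = mock_irqs_enabled;
}

/* Callable from any context: disables irqs itself, then restores. */
static void mock_switch_mm(void)
{
	unsigned long flags = mock_irq_save();

	mock_switch_mm_irqs_off();
	mock_irq_restore(flags);
}
```

A caller like use_mm() would go through mock_switch_mm() and get the irq
handling for free, while a caller that already runs with irqs off can call
mock_switch_mm_irqs_off() directly and skip the save/restore.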

Comments

Michael Ellerman April 24, 2017, 10:47 p.m. | #1
On Wed, 2017-04-19 at 06:38:26 UTC, David Gibson wrote:
> There seems to be a mismatch in expectations between the powerpc arch code
> and the generic (and x86) code in terms of the irq state when switch_mm()
> is called.
> 
> powerpc expects irqs to already be (soft) disabled when switch_mm() is
> called, as made clear in the commit message of 9c1e105 "powerpc: Allow
> perf_counters to access user memory at interrupt time".
> 
> That seems to be true when it's called from the scheduler, but not for
> use_mm().  This becomes clear when looking at the x86 code paths for
> switch_mm().  There, switch_mm() itself disables irqs, with a
> switch_mm_irqs_off() variant which expects that to be already done.
> 
> This patch addresses the problem, making the powerpc code mirror the x86
> code.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9765ad134a00a01cbcc69c78ff6def

cheers

Patch

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index b9e3f0a..0012f03 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -70,8 +70,9 @@  extern void drop_cop(unsigned long acop, struct mm_struct *mm);
  * switch_mm is the entry point called from the architecture independent
  * code in kernel/sched/core.c
  */
-static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
-			     struct task_struct *tsk)
+static inline void switch_mm_irqs_off(struct mm_struct *prev,
+				      struct mm_struct *next,
+				      struct task_struct *tsk)
 {
 	/* Mark this context has been used on the new CPU */
 	if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next)))
@@ -110,6 +111,18 @@  static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	switch_mmu_context(prev, next, tsk);
 }
 
+static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
+			     struct task_struct *tsk)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	switch_mm_irqs_off(prev, next, tsk);
+	local_irq_restore(flags);
+}
+#define switch_mm_irqs_off switch_mm_irqs_off
+
+
 #define deactivate_mm(tsk,mm)	do { } while (0)
 
 /*
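
A note on the self-referential "#define switch_mm_irqs_off switch_mm_irqs_off"
line in the patch: as far as I understand it, this lets generic code detect
with #ifndef that the architecture supplies its own irqs-off variant, and fall
back to plain switch_mm() otherwise. The sketch below shows that idiom in
stand-alone C. The demo_* names and the call counters are hypothetical and
exist only for this illustration; the "arch" and "generic" parts would live in
separate headers in a real tree.

```c
#include <assert.h>

/* Call counters, only here so the behaviour can be observed. */
static int calls_irqs_off;
static int calls_plain;

/* "Arch header" part: provide both entry points... */
static void demo_switch_mm_irqs_off(void)
{
	calls_irqs_off++;
}

static void demo_switch_mm(void)
{
	calls_plain++;
	demo_switch_mm_irqs_off();
}
/* ...and advertise the override; the macro simply expands to itself. */
#define demo_switch_mm_irqs_off demo_switch_mm_irqs_off

/* "Generic header" part: fall back only when no override was advertised. */
#ifndef demo_switch_mm_irqs_off
# define demo_switch_mm_irqs_off demo_switch_mm
#endif

/* A caller that already runs with irqs disabled, like the scheduler. */
static void demo_context_switch(void)
{
	demo_switch_mm_irqs_off();
}
```

With the #define in place, demo_context_switch() reaches the irqs-off variant
directly. Without it, the fallback would route the same call through
demo_switch_mm() and pay for a redundant save/restore.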