diff mbox series

[RFC,2/2] powerpc: nop trap instruction after WARN_ONCE fires

Message ID 20220923154143.1115645-2-npiggin@gmail.com (mailing list archive)
State Changes Requested
Headers show
Series [RFC,1/2] powerpc: Don't use extable for WARN | expand

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 10 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 6 jobs.
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 10 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 23 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.

Commit Message

Nicholas Piggin Sept. 23, 2022, 3:41 p.m. UTC
WARN_ONCE and similar are often used in frequently executed code, and
should not crash the system. The program check interrupt caused by
WARN_ON_ONCE can be a significant overhead even when nothing is being
printed. This can cause performance to become unacceptable, having the
same effective impact to the user as a BUG_ON().

Avoid this overhead by patching the trap with a nop instruction after a
"once" trap fires. Conditional warnings that return a result must have
equivalent compare and branch instructions after the trap, so when it is
nopped the statement will behave the same way. It's possible the asm
goto should be removed entirely and this comparison just done in C now.

XXX: possibly this should schedule the patching to run in a different
context than the program check.

XXX: patching process could allocate memory for patched bugs rather
than keeping a word in the bug entry for all of them. Very few will
ever fire.

XXX: is concurrency okay?

XXX: could avoid clobbering cc in some cases, possibly optimise some
variants.

---
 arch/powerpc/include/asm/bug.h | 14 ++++++++++--
 arch/powerpc/kernel/traps.c    | 42 +++++++++++++++++++++++++++++++++-
 include/asm-generic/bug.h      |  9 ++++++++
 include/linux/bug.h            |  1 +
 lib/bug.c                      | 22 +++++++++---------
 5 files changed, 74 insertions(+), 14 deletions(-)

Comments

Christophe Leroy Sept. 23, 2022, 4:47 p.m. UTC | #1
Le 23/09/2022 à 17:41, Nicholas Piggin a écrit :
> WARN_ONCE and similar are often used in frequently executed code, and
> should not crash the system. The program check interrupt caused by
> WARN_ON_ONCE can be a significant overhead even when nothing is being
> printed. This can cause performance to become unacceptable, having the
> same effective impact to the user as a BUG_ON().
> 
> Avoid this overhead by patching the trap with a nop instruction after a
> "once" trap fires. Conditional warnings that return a result must have
> equivalent compare and branch instructions after the trap, so when it is
> nopped the statement will behave the same way. It's possible the asm
> goto should be removed entirely and this comparison just done in C now.

You mean, just like PPC32 ? (Since db87a7199229 ("powerpc/bug: Remove 
specific powerpc BUG_ON() and WARN_ON() on PPC32"))

But I'm having hard time with your change.

You change only WARN_ON()
But WARN_ON_ONCE() calls __WARN_FLAGS()
And WARN_ONCE() calls WARN() via DO_ONCE_LITE_IF()

So I don't see any ..._ONCE something going with WARN_ON().

Am I missing something ?

Otherwise, what you want to do is to patch the 'twi' in __WARN_FLAGS() 
by a non conditional branch to __label_warn_on .



Christophe
David Laight Sept. 26, 2022, 8:56 a.m. UTC | #2
From: Nicholas Piggin
> Sent: 23 September 2022 16:42
>
> WARN_ONCE and similar are often used in frequently executed code, and
> should not crash the system. The program check interrupt caused by
> WARN_ON_ONCE can be a significant overhead even when nothing is being
> printed. This can cause performance to become unacceptable, having the
> same effective impact to the user as a BUG_ON().
> 
> Avoid this overhead by patching the trap with a nop instruction after a
> "once" trap fires. Conditional warnings that return a result must have
> equivalent compare and branch instructions after the trap, so when it is
> nopped the statement will behave the same way. It's possible the asm
> goto should be removed entirely and this comparison just done in C now.
> 
> XXX: possibly this should schedule the patching to run in a different
> context than the program check.

I'm pretty sure WARN_ON_ONCE() is valid everywhere printk() is allowed.
In many cases this means you can't call mutex_enter().

So you need a different scheme.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Nicholas Piggin Oct. 4, 2022, 4:31 a.m. UTC | #3
On Sat Sep 24, 2022 at 2:47 AM AEST, Christophe Leroy wrote:
>
>
> Le 23/09/2022 à 17:41, Nicholas Piggin a écrit :
> > WARN_ONCE and similar are often used in frequently executed code, and
> > should not crash the system. The program check interrupt caused by
> > WARN_ON_ONCE can be a significant overhead even when nothing is being
> > printed. This can cause performance to become unacceptable, having the
> > same effective impact to the user as a BUG_ON().
> > 
> > Avoid this overhead by patching the trap with a nop instruction after a
> > "once" trap fires. Conditional warnings that return a result must have
> > equivalent compare and branch instructions after the trap, so when it is
> > nopped the statement will behave the same way. It's possible the asm
> > goto should be removed entirely and this comparison just done in C now.
>
> You mean, just like PPC32 ? (Since db87a7199229 ("powerpc/bug: Remove 
> specific powerpc BUG_ON() and WARN_ON() on PPC32"))
>
> But I'm having hard time with your change.
>
> You change only WARN_ON()
> But WARN_ON_ONCE() calls __WARN_FLAGS()
> And WARN_ONCE() calls WARN() via DO_ONCE_LITE_IF()
>
> So I don't see any ..._ONCE something going with WARN_ON().
>
> Am I missing something ?

Hmm, no I must have missed something. I guess it is the EMIT_WARN_ENTRY
in asm which is the main problem I've seen. Although we could remove the
DO_ONCE_LITE_IF code generation from our WARN_ON_ONCE as well if we did
this patching.

Thanks,
Nick
Christophe Leroy Oct. 4, 2022, 8:14 a.m. UTC | #4
Le 04/10/2022 à 06:31, Nicholas Piggin a écrit :
> On Sat Sep 24, 2022 at 2:47 AM AEST, Christophe Leroy wrote:
>>
>>
>> Le 23/09/2022 à 17:41, Nicholas Piggin a écrit :
>>> WARN_ONCE and similar are often used in frequently executed code, and
>>> should not crash the system. The program check interrupt caused by
>>> WARN_ON_ONCE can be a significant overhead even when nothing is being
>>> printed. This can cause performance to become unacceptable, having the
>>> same effective impact to the user as a BUG_ON().
>>>
>>> Avoid this overhead by patching the trap with a nop instruction after a
>>> "once" trap fires. Conditional warnings that return a result must have
>>> equivalent compare and branch instructions after the trap, so when it is
>>> nopped the statement will behave the same way. It's possible the asm
>>> goto should be removed entirely and this comparison just done in C now.
>>
>> You mean, just like PPC32 ? (Since db87a7199229 ("powerpc/bug: Remove
>> specific powerpc BUG_ON() and WARN_ON() on PPC32"))
>>
>> But I'm having hard time with your change.
>>
>> You change only WARN_ON()
>> But WARN_ON_ONCE() calls __WARN_FLAGS()
>> And WARN_ONCE() calls WARN() via DO_ONCE_LITE_IF()
>>
>> So I don't see any ..._ONCE something going with WARN_ON().
>>
>> Am I missing something ?
> 
> Hmm, no I must have missed something. I guess it is the EMIT_WARN_ENTRY
> in asm which is the main problem I've seen. Although we could remove the
> DO_ONCE_LITE_IF code generation from our WARN_ON_ONCE as well if we did
> this patching.
> 

Yes, I guess having now the recovery address in the bug table instead of 
the extable is rather more efficient.

Maybe DO_ONCE_LITE could be replaced by DO_ONCE which uses jump_label ? 
Not sure it is worth a specific patching implementation, is it ?

Christophe
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index dec73137fea0..0ea7b9fe0305 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -14,6 +14,7 @@ 
 .macro __EMIT_BUG_ENTRY addr,file,line,flags
 	 .section __bug_table,"aw"
 5001:	 .4byte \addr - .
+	 .4byte 0
 	 .4byte 0
 	 .4byte 5002f - .
 	 .short \line, \flags
@@ -27,6 +28,7 @@ 
 .macro __EMIT_BUG_ENTRY addr,file,line,flags
 	 .section __bug_table,"aw"
 5001:	 .4byte \addr - .
+	 .4byte 0
 	 .4byte 0
 	 .short \flags
 	 .org 5001b+BUG_ENTRY_SIZE
@@ -48,6 +50,7 @@ 
 #else /* !__ASSEMBLY__ */
 struct arch_bug_entry {
 	signed int recovery_addr_disp;
+	unsigned int insn;
 };
 
 /* _EMIT_BUG_ENTRY expects args %0,%1,%2,%3 to be FILE, LINE, flags and
@@ -57,6 +60,7 @@  struct arch_bug_entry {
 	".section __bug_table,\"aw\"\n"		\
 	"2:	.4byte 1b - .\n"		\
 	"	.4byte 0\n"			\
+	"	.4byte 0\n"			\
 	"	.4byte %0 - .\n"		\
 	"	.short %1, %2\n"		\
 	".org 2b+%3\n"				\
@@ -66,6 +70,7 @@  struct arch_bug_entry {
 	".section __bug_table,\"aw\"\n"		\
 	"2:	.4byte 1b - .\n"		\
 	"	.4byte 0\n"			\
+	"	.4byte 0\n"			\
 	"	.short %2\n"			\
 	".org 2b+%3\n"				\
 	".previous\n"
@@ -76,6 +81,7 @@  struct arch_bug_entry {
 	".section __bug_table,\"aw\"\n"		\
 	"2:	.4byte 1b - .\n"		\
 	"	.4byte 1b - %l[" #label "]\n"	\
+	"	.4byte 0\n"			\
 	"	.4byte %0 - .\n"		\
 	"	.short %1, %2\n"		\
 	".org 2b+%3\n"				\
@@ -85,6 +91,7 @@  struct arch_bug_entry {
 	".section __bug_table,\"aw\"\n"		\
 	"2:	.4byte 1b - .\n"		\
 	"	.4byte 1b - %l[" #label "]\n"	\
+	"	.4byte 0\n"			\
 	"	.short %2\n"			\
 	".org 2b+%3\n"				\
 	".previous\n"
@@ -106,7 +113,7 @@  struct arch_bug_entry {
 		: : "i" (__FILE__), "i" (__LINE__),	\
 		  "i" (flags),				\
 		  "i" (sizeof(struct bug_entry)),	\
-		  ##__VA_ARGS__ : : label)
+		  ##__VA_ARGS__ : "cr0" : label)
 
 /*
  * BUG_ON() and WARN_ON() do their best to cooperate with compile-time
@@ -151,7 +158,7 @@  __label_warn_on:						\
 		} else {					\
 			__label__ __label_warn_on;		\
 								\
-			WARN_ENTRY(PPC_TLNEI " %4, 0",		\
+			WARN_ENTRY(PPC_TLNEI " %4, 0 ; cmpdi %4, 0 ; bne %l[__label_warn_on]",		\
 				   BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN),	\
 				   __label_warn_on,		\
 				   "r" ((__force long)(x)));	\
@@ -178,6 +185,9 @@  __label_warn_on:						\
 #define _EMIT_BUG_ENTRY
 #define _EMIT_WARN_ENTRY
 #endif
+#define arch_generic_bug_entry_clear_once
+void arch_generic_bug_entry_clear_once(struct bug_entry *bug);
+
 #endif /* CONFIG_BUG */
 
 #include <asm-generic/bug.h>
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 1e521a432bd0..14137c17b9cd 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1458,6 +1458,44 @@  static int emulate_math(struct pt_regs *regs)
 static inline int emulate_math(struct pt_regs *regs) { return -1; }
 #endif
 
+static DEFINE_MUTEX(bug_lock);
+
+void arch_generic_bug_entry_clear_once(struct bug_entry *bug)
+{
+	unsigned long bugaddr = bug_addr(bug);
+
+	BUG_ON(!bug->arch.insn);
+
+	mutex_lock(&bug_lock);
+	if (bug->arch.insn == 0) {
+		mutex_unlock(&bug_lock);
+		return;
+	}
+	patch_instruction((u32 *)bugaddr, ppc_inst(bug->arch.insn));
+	bug->arch.insn = 0;
+	mutex_unlock(&bug_lock);
+}
+
+static void bug_entry_done_once(struct bug_entry *bug)
+{
+	unsigned long bugaddr = bug_addr(bug);
+	unsigned int insn;
+
+	if (!mutex_trylock(&bug_lock))
+		return;
+
+	if (bug->arch.insn != 0)
+		goto out;
+
+	if (__get_user(insn, (unsigned int __user *)bugaddr))
+		goto out;
+
+	patch_instruction((u32 *)bugaddr, ppc_inst(PPC_RAW_NOP()));
+
+out:
+	mutex_unlock(&bug_lock);
+}
+
 static void do_program_check(struct pt_regs *regs)
 {
 	unsigned int reason = get_reason(regs);
@@ -1494,7 +1532,7 @@  static void do_program_check(struct pt_regs *regs)
 
 		if (!(regs->msr & MSR_PR) &&  /* not user-mode */
 		    report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) {
-			const struct bug_entry *bug;
+			struct bug_entry *bug;
 			unsigned long recov;
 
 			bug = find_bug(bugaddr);
@@ -1502,6 +1540,8 @@  static void do_program_check(struct pt_regs *regs)
 				recov = regs->nip + 4;
 			else
 				recov = bugaddr - bug->arch.recovery_addr_disp;
+			if (bug && (bug->flags & BUGFLAG_ONCE))
+				bug_entry_done_once(bug);
 
 			regs_set_return_ip(regs, recov);
 			return;
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 1ce8431bdf02..a0cb717998c1 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -49,6 +49,15 @@  struct bug_entry {
 #endif
 	unsigned short	flags;
 };
+
+static inline unsigned long bug_addr(const struct bug_entry *bug)
+{
+#ifdef CONFIG_GENERIC_BUG_RELATIVE_POINTERS
+	return (unsigned long)&bug->bug_addr_disp + bug->bug_addr_disp;
+#else
+	return bug->bug_addr;
+#endif
+}
 #endif	/* CONFIG_GENERIC_BUG */
 
 /*
diff --git a/include/linux/bug.h b/include/linux/bug.h
index 348acf2558f3..a207445e5b5f 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -46,6 +46,7 @@  enum bug_trap_type report_bug(unsigned long bug_addr, struct pt_regs *regs);
 /* These are defined by the architecture */
 int is_valid_bugaddr(unsigned long addr);
 
+void __generic_bug_clear_once(void);
 void generic_bug_clear_once(void);
 
 #else	/* !CONFIG_GENERIC_BUG */
diff --git a/lib/bug.c b/lib/bug.c
index c223a2575b72..d1cc4f763679 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -50,15 +50,6 @@ 
 
 extern struct bug_entry __start___bug_table[], __stop___bug_table[];
 
-static inline unsigned long bug_addr(const struct bug_entry *bug)
-{
-#ifdef CONFIG_GENERIC_BUG_RELATIVE_POINTERS
-	return (unsigned long)&bug->bug_addr_disp + bug->bug_addr_disp;
-#else
-	return bug->bug_addr;
-#endif
-}
-
 #ifdef CONFIG_MODULES
 /* Updates are protected by module mutex */
 static LIST_HEAD(module_bug_list);
@@ -209,12 +200,21 @@  enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 	return BUG_TRAP_TYPE_BUG;
 }
 
+#ifndef arch_generic_bug_entry_clear_once
+#define arch_generic_bug_entry_clear_once(bug)
+#endif
+
 static void clear_once_table(struct bug_entry *start, struct bug_entry *end)
 {
 	struct bug_entry *bug;
 
-	for (bug = start; bug < end; bug++)
-		bug->flags &= ~BUGFLAG_DONE;
+	for (bug = start; bug < end; bug++) {
+		bool triggered = bug->flags & BUGFLAG_DONE;
+		if (triggered) {
+			bug->flags &= ~BUGFLAG_DONE;
+			arch_generic_bug_entry_clear_once(bug);
+		}
+	}
 }
 
 void generic_bug_clear_once(void)