arch: configuration, deleting 'CONFIG_BUG' since always need it.

Message ID 201305231139.38233.arnd@arndb.de (mailing list archive)
State Not Applicable

Commit Message

Arnd Bergmann May 23, 2013, 9:39 a.m. UTC
On Thursday 23 May 2013, Geert Uytterhoeven wrote:
> > The problem is: trying to fix that will mean the result is a larger
> > kernel than if you just do the usual arch-implemented thing of placing
> > a defined faulting instruction at the BUG() site - which defeats the
> > purpose of turning off CONFIG_BUG.
> 
> Is __builtin_unreachable() working well these days?
> 

Hmm, I just tried the trivial patch below, which seemed to do the right thing.
Needs a little more investigation, but that might actually be the correct
solution. I thought that at some point __builtin_unreachable() was the same
as "do {} while (1)", but this is not the case with the gcc I was using --
it just tells gcc that we don't expect to ever get here.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Comments

Russell King - ARM Linux May 23, 2013, 10:04 a.m. UTC | #1
On Thu, May 23, 2013 at 11:39:37AM +0200, Arnd Bergmann wrote:
> On Thursday 23 May 2013, Geert Uytterhoeven wrote:
> > > The problem is: trying to fix that will mean the result is a larger
> > > kernel than if you just do the usual arch-implemented thing of placing
> > > a defined faulting instruction at the BUG() site - which defeats the
> > > purpose of turning off CONFIG_BUG.
> > 
> > Is __builtin_unreachable() working well these days?
> > 
> 
> Hmm, I just tried the trivial patch below, which seemed to do the right thing.
> Needs a little more investigation, but that might actually be the correct
> solution. I thought that at some point __builtin_unreachable() was the same
> as "do {} while (1)", but this is not the case with the gcc I was using --
> it just tells gcc that we don't expect to ever get here.

All this is doing is hiding the warning, nothing more.

What the compiler does is this:

	beq	1f
	... some asm code ...
	__builtin_unreachable() point
	maybe a literal table
1:	... some asm code doing some other part of the function ...

and what will happen is that the first block of asm will fall through the
(possibly present) literal table into the following asm code.  So, as
specified in the gcc manual, if you ever hit a __builtin_unreachable()
point, your program is undefined (as in, the behaviour of it can no longer
be known.)
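
To make the distinction concrete, here is a minimal sketch in C. The macro names BUG_NOOP and BUG_UNREACHABLE and the next_slot_* functions are hypothetical stand-ins, not kernel code; they only contrast the existing no-op definition with the proposed one. With the no-op, a bad argument is noticed and then ignored, and execution continues as the source describes. With __builtin_unreachable(), the compiler is allowed to assume the bad case never happens, so nothing is promised if it does.

```c
#include <assert.h>

/* Hypothetical stand-ins for two possible !CONFIG_BUG definitions of BUG(). */
#define BUG_NOOP()        do { } while (0)
#define BUG_UNREACHABLE() __builtin_unreachable()

/* No-op BUG(): the bad case is detected and then ignored.  Control flow
 * still follows the C source, so a negative index yields idx + 1. */
int next_slot_noop(int idx)
{
	if (idx < 0)
		BUG_NOOP();
	return idx + 1;
}

/* Unreachable BUG(): the compiler may assume idx >= 0 and drop the test.
 * Calling this with idx < 0 is undefined behaviour; nothing is promised. */
int next_slot_unreachable(int idx)
{
	if (idx < 0)
		BUG_UNREACHABLE();
	return idx + 1;
}

/* Usage: only the no-op variant may be exercised with a bad argument. */
void next_slot_demo(void)
{
	assert(next_slot_noop(3) == 4);
	assert(next_slot_noop(-1) == 0);
	assert(next_slot_unreachable(3) == 4);
}
```

The sketch deliberately never calls next_slot_unreachable() with a negative value, because there is no defined result to check for.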

We can't make that guarantee with BUG() - because sometimes they do fire
and sometimes in the most unlikely scenarios, particularly if you're not
looking, or at the most inconvenient time.

So, if you want to use this, then you should update the CONFIG_BUG text
to include a warning to this effect:

     Warning: if CONFIG_BUG is turned off, and control flow reaches
     a BUG(), the system behaviour will be undefined.

so that people can make an informed choice about this, because at the
moment:

          Disabling this option eliminates support for BUG and WARN, reducing
          the size of your kernel image and potentially quietly ignoring
          numerous fatal conditions. You should only consider disabling this
          option for embedded systems with no facilities for reporting errors.
          Just say Y.

will become completely misleading.  Turning this option off will _not_
result in "quietly ignoring numerous fatal conditions".

And I come back to one of my previous arguments - is it not better to
panic() if we hit one of these conditions so that the system can try to
do a panic-reboot rather than continue blindly into the unknown?
Russell King - ARM Linux May 23, 2013, 10:29 a.m. UTC | #2
On Thu, May 23, 2013 at 03:09:50AM -0700, Eric W. Biederman wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
> 
> > On Thursday 23 May 2013, Geert Uytterhoeven wrote:
> >> > The problem is: trying to fix that will mean the result is a larger
> >> > kernel than if you just do the usual arch-implemented thing of placing
> >> > a defined faulting instruction at the BUG() site - which defeats the
> >> > purpose of turning off CONFIG_BUG.
> >> 
> >> Is __builtin_unreachable() working well these days?
> >> 
> >
> > Hmm, I just tried the trivial patch below, which seemed to do the right thing.
> > Needs a little more investigation, but that might actually be the correct
> > solution. I thought that at some point __builtin_unreachable() was the same
> > as "do {} while (1)", but this is not the case with the gcc I was using --
> > it just tells gcc that we don't expect to ever get here.
> 
> Yes.
> 
> We already have this abstracted in compiler.h as the macro unreachable,
> so the slight modification of your patch below should handle this case.
> 
> For compilers without __builtin_unreachable() unreachable() expands to
> do {} while(1) but an infinite loop seems reasonable and preserves the
> semantics of the code, unlike the current noop that is do {} while(0).

Semantics of the code really don't come in to it if you use unreachable().
unreachable() is effectively a do { } while (0) to the compiler.  It just
doesn't warn about it anymore.  It's actually worse than that - it's
permission to the compiler to just stop considering flow control at that
point and do anything it likes with the following instruction slot.

What __builtin_unreachable() means to the compiler is "we will *never*
get here".  That isn't the case for BUG() - BUG() means "we hope that
we will never get here, but we might, and if we do your data is in
grave danger."

We should either have something at that point (like a call to a function
which panics) or remove the ability to turn off CONFIG_BUG and anyone who
cares about kernel size needs to come up with a single trapping
instruction BUG() implementation.
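
The two alternatives named above can be sketched in user space as follows. This is only an illustration under stated assumptions: BUG_TRAP, BUG_PANIC, fake_panic() and dies_by_signal() are hypothetical names, abort() stands in for the kernel's panic(), and the check runs each variant in a forked child so the parent can observe that control never continues past the BUG() site.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variant 1: a compiler-generated trap at the BUG() site. */
#define BUG_TRAP()  __builtin_trap()

/* Hypothetical variant 2: a call to a function that never returns.
 * abort() is a user-space stand-in for the kernel's panic(). */
__attribute__((noreturn)) void fake_panic(void)
{
	abort();
}
#define BUG_PANIC() fake_panic()

void hit_trap(void)    { BUG_TRAP(); }
void hit_panic(void)   { BUG_PANIC(); }
void hit_nothing(void) { }

/* Run fn in a child process and report how the child ended:
 * 1 if a signal killed it, 0 if it exited normally, -1 on fork/wait failure. */
int dies_by_signal(void (*fn)(void))
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		fn();
		_exit(0);	/* reached only if fn() returned */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? 1 : 0;
}

/* Usage: both BUG() variants stop the child; the empty function does not. */
void bug_demo(void)
{
	assert(dies_by_signal(hit_trap) == 1);
	assert(dies_by_signal(hit_panic) == 1);
	assert(dies_by_signal(hit_nothing) == 0);
}
```

Which signal the trap raises depends on the target, so the sketch only checks that the child was killed by some signal.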
Chen Gang May 23, 2013, 10:41 a.m. UTC | #3
On 05/23/2013 06:04 PM, Russell King - ARM Linux wrote:
> So, if you want to use this, then you should update the CONFIG_BUG text
> to include a warning to this effect:
> 
>      Warning: if CONFIG_BUG is turned off, and control flow reaches
>      a BUG(), the system behaviour will be undefined.
> 
> so that people can make an informed choice about this, because at the
> moment:
> 
>           Disabling this option eliminates support for BUG and WARN, reducing
>           the size of your kernel image and potentially quietly ignoring
>           numerous fatal conditions. You should only consider disabling this
>           option for embedded systems with no facilities for reporting errors.
>           Just say Y.
> 
> will become completely misleading.  Turning this option off will _not_
> result in "quietly ignoring numerous fatal conditions".
> 
> And I come back to one of my previous arguments - is it not better to
> panic() if we hit one of these conditions so that the system can try to
> do a panic-reboot rather than continue blindly into the unknown?

But I still suggest deleting CONFIG_BUG from the common kernel code.

Disabling 'CONFIG_BUG' is currently not a common feature (most
architectures always enable it); it is specific to a few architectures
(perhaps some embedded systems).

So 'CONFIG_BUG' should not stay in "asm-generic/bug.h", which is meant
only for common features.

Each architecture can already customize its own BUG(); if an architecture
wants to disable this option, let it define its own BUG().

Then most architectures need not consider this issue again.


Thanks.
Arnd Bergmann May 23, 2013, 10:59 a.m. UTC | #4
On Thursday 23 May 2013, Russell King - ARM Linux wrote:
> So, if you want to use this, then you should update the CONFIG_BUG text
> to include a warning to this effect:
> 
>      Warning: if CONFIG_BUG is turned off, and control flow reaches
>      a BUG(), the system behaviour will be undefined.
> 
> so that people can make an informed choice about this, because at the
> moment:
> 
>           Disabling this option eliminates support for BUG and WARN, reducing
>           the size of your kernel image and potentially quietly ignoring
>           numerous fatal conditions. You should only consider disabling this
>           option for embedded systems with no facilities for reporting errors.
>           Just say Y.
> 
> will become completely misleading.  Turning this option off will not
> result in "quietly ignoring numerous fatal conditions".

I must be missing something, to me the two descriptions mean the same thing.

> And I come back to one of my previous arguments - is it not better to
> panic() if we hit one of these conditions so that the system can try to
> do a panic-reboot rather than continue blindly into the unknown?

I think this all comes from the 'linux-tiny' project that tried to squeeze
out the last bits of kernel object code size at some point. The idea was
that if you have code like

	BUG_ON(something_unexpected_happened());

or

	switch (my_enum) {
	case FOO:
		return f1();
	case BAR:
		return f2();
	default:
		BUG();
	}

We don't just want to avoid the code for printing the bug message and
the invalid instruction; we also want the compiler to not emit the
function call or check the enum for unexpected values. The meaning of
BUG() is really that the person writing that statement was sure it cannot
happen unless there is a bug in the kernel, which has likely already
corrupted data. Printing a diagnostic at this point is nice if someone
is there to look at it, but letting the kernel do further actions that
may be undefined is not going to make things worse.
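
A minimal sketch of the switch case above, assuming hypothetical helper names (BUG_NOOP, BUG_UNREACHABLE, dispatch_noop, dispatch_unreachable, and trivial f1/f2 bodies), shows what each definition asks of the compiler. With the no-op, the default branch falls out of the switch, so a fallback return is needed. With __builtin_unreachable(), the compiler is told the default branch cannot happen, so it needs neither a range check on the enum nor a fallback return.

```c
#include <assert.h>

enum kind { FOO, BAR };

/* Hypothetical stand-ins for two possible !CONFIG_BUG definitions of BUG(). */
#define BUG_NOOP()        do { } while (0)
#define BUG_UNREACHABLE() __builtin_unreachable()

static int f1(void) { return 10; }
static int f2(void) { return 20; }

/* No-op BUG(): an unexpected value leaves the switch and hits the
 * fallback return, which has to exist for the function to be well formed. */
int dispatch_noop(enum kind k)
{
	switch (k) {
	case FOO:
		return f1();
	case BAR:
		return f2();
	default:
		BUG_NOOP();
	}
	return -1;
}

/* Unreachable BUG(): no fallback return is required, and the compiler
 * may skip checking k for unexpected values.  Passing any other value
 * is undefined behaviour. */
int dispatch_unreachable(enum kind k)
{
	switch (k) {
	case FOO:
		return f1();
	case BAR:
		return f2();
	default:
		BUG_UNREACHABLE();
	}
}

/* Usage: only the no-op variant is called with an out-of-range value. */
void dispatch_demo(void)
{
	assert(dispatch_noop(FOO) == 10);
	assert(dispatch_noop((enum kind)5) == -1);
	assert(dispatch_unreachable(BAR) == 20);
}
```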

	Arnd
Chen Gang May 23, 2013, 11:19 a.m. UTC | #5
On 05/23/2013 06:59 PM, Arnd Bergmann wrote:
> We don't just want to avoid the code for printing the bug message and
> the invalid instruction; we also want the compiler to not emit the
> function call or check the enum for unexpected values. The meaning of
> BUG() is really that the person writing that statement was sure it cannot
> happen unless there is a bug in the kernel, which has likely already
> corrupted data. Printing a diagnostic at this point is nice if someone
> is there to look at it, but letting the kernel do further actions that
> may be undefined is not going to make things worse.

So I think neither unreachable() nor panic() is suitable for this
case.

I guess 'CONFIG_BUG' is not a common feature now (nor will it be in the
future), so it should not remain in "asm-generic/bug.h"; it needs to be
removed first.

Then the specific architectures can implement their own BUG() if they
want special behaviour.

So most arches can skip this issue.


Thanks.
Russell King - ARM Linux May 23, 2013, 11:24 a.m. UTC | #6
On Thu, May 23, 2013 at 12:59:43PM +0200, Arnd Bergmann wrote:
> On Thursday 23 May 2013, Russell King - ARM Linux wrote:
> > So, if you want to use this, then you should update the CONFIG_BUG text
> > to include a warning to this effect:
> > 
> >      Warning: if CONFIG_BUG is turned off, and control flow reaches
> >      a BUG(), the system behaviour will be undefined.
> > 
> > so that people can make an informed choice about this, because at the
> > moment:
> > 
> >           Disabling this option eliminates support for BUG and WARN, reducing
> >           the size of your kernel image and potentially quietly ignoring
> >           numerous fatal conditions. You should only consider disabling this
> >           option for embedded systems with no facilities for reporting errors.
> >           Just say Y.
> > 
> > will become completely misleading.  Turning this option off will not
> > result in "quietly ignoring numerous fatal conditions".
> 
> I must be missing something, to me the two descriptions mean the same thing.

To me, the current text suggests that we still detect the fatal condition
but the code continues to execute in a manner controlled by the program.

The latter is uncontrolled code (or data) execution in ways unspecified
by the program.

> We don't just want to avoid the code for printing the bug message and
> the invalid instruction; we also want the compiler to not emit the
> function call or check the enum for unexpected values. The meaning of
> BUG() is really that the person writing that statement was sure it cannot
> happen unless there is a bug in the kernel, which has likely already
> corrupted data. Printing a diagnostic at this point is nice if someone
> is there to look at it, but letting the kernel do further actions that
> may be undefined is not going to make things worse.

I'm not talking about printing a diagnostic.  I'm talking about the CPU
remaining under the control of the program it is running - that being
the Linux kernel.

With CONFIG_BUG unset, turning on things like reboot-on-panic is
worthless.  Arguably, so is having a hardware watchdog - because even if
you hit one of these BUG() conditions and the CPU goes off and does its
own thing, enough of the system might keep running that the watchdog
still gets serviced.

This is the problem you guys are missing - unreachable() means "we lose
control of the CPU at this point".

If you have an embedded system and you've taken out all the printk()
stuff, you most certainly want the system to do _something_ if you hit
an unexpected condition.
Arnd Bergmann May 23, 2013, 12:09 p.m. UTC | #7
On Thursday 23 May 2013, Russell King - ARM Linux wrote:
> This is the problem you guys are missing - unreachable() means "we lose
> control of the CPU at this point".

I'm absolutely aware of this. Again, the current behaviour of doing nothing
at all isn't very different from the undefined behavior you get when you
reach the end of a function returning a pointer without a "return" statement,
or when you return from a function that has determined that it is not safe
to continue.

> If you have an embedded system and you've taken out all the printk()
> stuff, you most certainly want the system to do something if you hit
> an unexpected condition.

I did not claim that it was a good idea to disable BUG(), all I said is
that "random stuff may happen" is probably what Matt Mackall had in mind when
he introduced the option.

	Arnd
Russell King - ARM Linux May 23, 2013, 12:50 p.m. UTC | #8
On Thu, May 23, 2013 at 02:09:02PM +0200, Arnd Bergmann wrote:
> On Thursday 23 May 2013, Russell King - ARM Linux wrote:
> > This is the problem you guys are missing - unreachable() means "we lose
> > control of the CPU at this point".
> 
> I'm absolutely aware of this. Again, the current behaviour of doing nothing
> at all isn't very different from the undefined behavior you get when you
> reach the end of a function returning a pointer without a "return" statement,
> or when you return from a function that has determined that it is not safe
> to continue.

Running off the end of a function like that is a different kettle of fish.
The execution path is still as the compiler intends - what isn't is that
the data returned is likely to be random trash.

That's _quite_ different from the CPU starting to execute the contents
of a literal data pool.
Geert Uytterhoeven May 23, 2013, 2:10 p.m. UTC | #9
On Thu, May 23, 2013 at 2:50 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, May 23, 2013 at 02:09:02PM +0200, Arnd Bergmann wrote:
>> On Thursday 23 May 2013, Russell King - ARM Linux wrote:
>> > This is the problem you guys are missing - unreachable() means "we lose
>> > control of the CPU at this point".
>>
>> I'm absolutely aware of this. Again, the current behaviour of doing nothing
>> at all isn't very different from the undefined behavior you get when you
>> reach the end of a function returning a pointer without a "return" statement,
>> or when you return from a function that has determined that it is not safe
>> to continue.
>
> Running off the end of a function like that is a different kettle of fish.
> The execution path is still as the compiler intends - what isn't is that
> the data returned is likely to be random trash.
>
> That's _quite_ different from the CPU starting to execute the contents
> of a literal data pool.

I agree it's best to e.g. trap and reboot.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Chen Gang May 24, 2013, 2:13 a.m. UTC | #10
On 05/23/2013 10:10 PM, Geert Uytterhoeven wrote:
> On Thu, May 23, 2013 at 2:50 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> > On Thu, May 23, 2013 at 02:09:02PM +0200, Arnd Bergmann wrote:
>>> >> On Thursday 23 May 2013, Russell King - ARM Linux wrote:
>>>> >> > This is the problem you guys are missing - unreachable() means "we lose
>>>> >> > control of the CPU at this point".
>>> >>
>>> >> I'm absolutely aware of this. Again, the current behaviour of doing nothing
>>> >> at all isn't very different from the undefined behavior you get when you
>>> >> reach the end of a function returning a pointer without a "return" statement,
>>> >> or when you return from a function that has determined that it is not safe
>>> >> to continue.
>> >
>> > Running off the end of a function like that is a different kettle of fish.
>> > The execution path is still as the compiler intends - what isn't is that
>> > the data returned is likely to be random trash.
>> >
>> > That's _quite_ different from the CPU starting to execute the contents
>> > of a literal data pool.
> I agree it's best to e.g. trap and reboot.

After reading arch/*/include/asm/bug.h:

It seems panic() is not suitable for NOMMU platforms (only m68k uses it,
and only with CONFIG_BUG and CONFIG_SUN3 enabled).

unreachable() needs to be paired with an inline asm instruction (arm,
x86, powerpc, mips...).

__builtin_trap() is documented as "the mechanism used may vary from
release to release so should not rely on any particular implementation"
(see "http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html"; used by
m68k, sparc, ia64).

I cannot find any *trap*() or *unreachable*() in "include/asm-generic/".

I cannot find any implementation that is 'generic' enough to add to
"include/asm-generic/" (in fact, CONFIG_BUG itself is not 'generic'
enough to be in "include/asm-generic/").


Finally, I still suggest deleting CONFIG_BUG, so that most architectures
can skip this issue for now.

The specific architectures then also get 3 benefits:

a. the related maintainers can implement it as they wish (without
needing to discuss it with other platform maintainers again);

b. the related maintainers can freely use platform-specific features
(which cannot be used in "include/asm-generic/");

c. the related maintainers are more familiar with their own
architecture's demands and requirements.



----------- arch/m68k/include/asm/bug.h --------------------------------

#ifndef _M68K_BUG_H
#define _M68K_BUG_H

#ifdef CONFIG_MMU
#ifdef CONFIG_BUG
#ifdef CONFIG_DEBUG_BUGVERBOSE
#ifndef CONFIG_SUN3
#define BUG() do { \
        printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \
        __builtin_trap(); \
} while (0)
#else
#define BUG() do { \
        printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \
        panic("BUG!"); \
} while (0)
#endif
#else
#define BUG() do { \
        __builtin_trap(); \
} while (0)
#endif

#define HAVE_ARCH_BUG
#endif
#endif /* CONFIG_MMU */

#include <asm-generic/bug.h>

#endif




Thanks.
Chen Gang May 24, 2013, 4:17 a.m. UTC | #11
On 05/24/2013 10:13 AM, Chen Gang wrote:
> On 05/23/2013 10:10 PM, Geert Uytterhoeven wrote:
>> On Thu, May 23, 2013 at 2:50 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>>> On Thu, May 23, 2013 at 02:09:02PM +0200, Arnd Bergmann wrote:
>>>>>> On Thursday 23 May 2013, Russell King - ARM Linux wrote:
>>>>>>>> This is the problem you guys are missing - unreachable() means "we lose
>>>>>>>> control of the CPU at this point".
>>>>>>
>>>>>> I'm absolutely aware of this. Again, the current behaviour of doing nothing
>>>>>> at all isn't very different from the undefined behavior you get when you
>>>>>> reach the end of a function returning a pointer without a "return" statement,
>>>>>> or when you return from a function that has determined that it is not safe
>>>>>> to continue.
>>>>
>>>> Running off the end of a function like that is a different kettle of fish.
>>>> The execution path is still as the compiler intends - what isn't is that
>>>> the data returned is likely to be random trash.
>>>>
>>>> That's _quite_ different from the CPU starting to execute the contents
>>>> of a literal data pool.
>> I agree it's best to e.g. trap and reboot.
> 

In fact, if CONFIG_BUG is enabled but HAVE_ARCH_BUG is not defined, the
default implementation is:

#ifndef HAVE_ARCH_BUG
#define BUG() do { \
        printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \
        panic("BUG!"); \
} while (0)
#endif

So if we delete CONFIG_BUG, the default implementation will be almost
the same as panic(), and panic() itself also calls printk()!

So...

:-)



> After reading arch/*/include/asm/bug.h:
> 
> It seems panic() is not suitable for NOMMU platforms (only m68k uses it,
> and only with CONFIG_BUG and CONFIG_SUN3 enabled).
> 
> unreachable() needs to be paired with an inline asm instruction (arm,
> x86, powerpc, mips...).
> 
> __builtin_trap() is documented as "the mechanism used may vary from
> release to release so should not rely on any particular implementation"
> (see "http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html"; used by
> m68k, sparc, ia64).
> 
> I cannot find any *trap*() or *unreachable*() in "include/asm-generic/".
> 
> I cannot find any implementation that is 'generic' enough to add to
> "include/asm-generic/" (in fact, CONFIG_BUG itself is not 'generic'
> enough to be in "include/asm-generic/").
> 
> 
> Finally, I still suggest deleting CONFIG_BUG, so that most architectures
> can skip this issue for now.
> 
> The specific architectures then also get 3 benefits:
> 
> a. the related maintainers can implement it as they wish (without
> needing to discuss it with other platform maintainers again);
> 
> b. the related maintainers can freely use platform-specific features
> (which cannot be used in "include/asm-generic/");
> 
> c. the related maintainers are more familiar with their own
> architecture's demands and requirements.
> 
> 
> 
> ----------- arch/m68k/include/asm/bug.h --------------------------------
> 
> #ifndef _M68K_BUG_H
> #define _M68K_BUG_H
>
> #ifdef CONFIG_MMU
> #ifdef CONFIG_BUG
> #ifdef CONFIG_DEBUG_BUGVERBOSE
> #ifndef CONFIG_SUN3
> #define BUG() do { \
>         printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \
>         __builtin_trap(); \
> } while (0)
> #else
> #define BUG() do { \
>         printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \
>         panic("BUG!"); \
> } while (0)
> #endif
> #else
> #define BUG() do { \
>         __builtin_trap(); \
> } while (0)
> #endif
>
> #define HAVE_ARCH_BUG
> #endif
> #endif /* CONFIG_MMU */
>
> #include <asm-generic/bug.h>
>
> #endif
> 
> 
> 
> 
> Thanks.
>
Ingo Molnar May 28, 2013, 8:19 a.m. UTC | #12
* Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:

> So, if you want to use this, then you should update the CONFIG_BUG text 
> to include a warning to this effect:
> 
>      Warning: if CONFIG_BUG is turned off, and control flow reaches
>      a BUG(), the system behaviour will be undefined.
> 
> so that people can make an informed choice about this, because at the
> moment:
> 
>           Disabling this option eliminates support for BUG and WARN, reducing
>           the size of your kernel image and potentially quietly ignoring
>           numerous fatal conditions. You should only consider disabling this
>           option for embedded systems with no facilities for reporting errors.
>           Just say Y.
> 
> will become completely misleading.  Turning this option off will _not_ 
> result in "quietly ignoring numerous fatal conditions".

I'm fine with adding your text as a clarification - but I think 'quietly 
ignoring fatal conditions' very much implies an undefined outcome if that 
unexpected condition does occur: the code might crash, it might corrupt 
memory or it might do some other unexpected thing.

There are many other places that do a BUG_ON() of a NULL pointer or so, or 
of a zero refcount, or a not held lock - and turning the BUG_ON() off 
makes the code unpredictable _anyway_ - even if the compiler does not 
notice an uninitialized variable.

So pretty much any weakening of BUG_ON() _will_ make the kernel more 
unpredictable.

> And I come back to one of my previous arguments - is it not better to 
> panic() if we hit one of these conditions so that the system can try to 
> do a panic-reboot rather than continue blindly into the unknown?

It will often continue blindly into the unknown even if the compiler is 
happy ...

The only difference is that it's "unpredictable" in a way not visible from 
the C code: the code won't necessarily fall through the BUG() when hitting 
that condition - although in practice it probably will.

So I think the same principle applies to it as to any other debugging 
code: it's fine to be able to turn debugging off. It's a performance 
versus kernel robustness/determinism trade-off.

Thanks,

	Ingo
Chen Gang May 28, 2013, 10:25 a.m. UTC | #13
On 05/28/2013 04:19 PM, Ingo Molnar wrote:
>> > And I come back to one of my previous arguments - is it not better to 
>> > panic() if we hit one of these conditions so that the system can try to 
>> > do a panic-reboot rather than continue blindly into the unknown?
> It will often continue blindly into the unknown even if the compiler is 
> happy ...
> 

For servers, it is necessary to always enable CONFIG_BUG and call panic().

When analyzing a core dump or KDB trap, we have to assume that the kernel
has already "continued blindly", but with luck we still get the core dump
or KDB trap in the end (sometimes we cannot even get that).


For PCs, it is useless to disable CONFIG_BUG.

PC memory is already large enough that the minimal-size optimization is
not needed, and PCs are fast enough that the speed gained from it does
not matter.


For embedded systems, some architectures may disable CONFIG_BUG.

Embedded systems are widely used in many areas, so each architecture's
requirements for BUG() may be different:

  some may need to reboot as quickly as possible for urgent processing;
  some may need to loop forever in BUG() to avoid surprising the user
    (after an automatic reboot, users feel out of control and do not know
    what happened);
  some may still need panic(), just like servers;
  others may not care, and just implement it at minimal size.



> The only difference is that it's "unpredictable" in a way not visible from
> the C code: the code won't necessarily fall through the BUG() when hitting 
> that condition - although in practice it probably will.
> 
> So I think the same principle applies to it as to any other debugging 
> code: it's fine to be able to turn debugging off. It's a performance 
> versus kernel robustness/determinism trade-off.

'Minimal size' for BUG() is a feature specific to some embedded systems;
it is not 'generic' enough to be in "include/asm-generic/".

If we still offer "disable CONFIG_BUG", some 'crazy users' (e.g.
randconfig) may create 'noise' for most architectures.

So we need not provide a "disable CONFIG_BUG" feature for all platforms:
most architectures need not support it, and an architecture that really
needs minimal size can implement it itself as an architecture-specific
feature.



Thanks.
H. Peter Anvin May 28, 2013, 2:55 p.m. UTC | #14
On 05/28/2013 01:19 AM, Ingo Molnar wrote:
> 
> So I think the same principle applies to it as to any other debugging 
> code: it's fine to be able to turn debugging off. It's a performance 
> versus kernel robustness/determinism trade-off.
> 

I suspect, rather, that BUG() should turn into a trap (or jump to a
death routine) under any circumstances.  The one thing that can be
omitted for small configurations are the annotations, which only serve
to output a more human-readable error message.

	-hpa
Arnd Bergmann May 28, 2013, 3:43 p.m. UTC | #15
On Tuesday 28 May 2013, H. Peter Anvin wrote:
> On 05/28/2013 01:19 AM, Ingo Molnar wrote:
> > 
> > So I think the same principle applies to it as to any other debugging 
> > code: it's fine to be able to turn debugging off. It's a performance 
> > versus kernel robustness/determinism trade-off.
> > 
> 
> I suspect, rather, that BUG() should turn into a trap (or jump to a
> death routine) under any circumstances.  The one thing that can be
> omitted for small configurations are the annotations, which only serve
> to output a more human-readable error message.

Right, that is what the patch I just posted does.

On a related note, I found that WARN_ON() can no longer be compiled
out since there is already code that relies on the side-effects of
the condition. I assume that was an intentional change I missed,
since it used to be defined so that you could remove it completely.

	Arnd
H. Peter Anvin May 28, 2013, 4:06 p.m. UTC | #16
On 05/28/2013 08:43 AM, Arnd Bergmann wrote:
> 
> Right, that is what the patch I just posted does.
> 
> On a related note, I found that WARN_ON() can no longer be compiled
> out since there is already code that relies on the side-effects of
> the condition. I assume that was an intentional change I missed,
> since it used to be defined so that you could remove it completely.
> 

It is possible to define WARN_ON() as:

#define WARN_ON(x) ((void)(x))

... which preserves side effects.

	-hpa
Arnd Bergmann May 28, 2013, 5:20 p.m. UTC | #17
On Tuesday 28 May 2013, H. Peter Anvin wrote:
> On 05/28/2013 08:43 AM, Arnd Bergmann wrote:
> > 
> > Right, that is what the patch I just posted does.
> > 
> > On a related note, I found that WARN_ON() can no longer be compiled
> > out since there is already code that relies on the side-effects of
> > the condition. I assume that was an intentional change I missed,
> > since it used to be defined so that you could remove it completely.
> > 
> 
> It is possible to define WARN_ON() as:
> 
> #define WARN_ON(x) ((void)(x))
> 
> ... which preserves side effects.

Yes, actually the return value has to be maintained as well.
The current (!CONFIG_BUG) default implementation is

#define WARN_ON(condition) ({                                           \
        int __ret_warn_on = !!(condition);                              \
        unlikely(__ret_warn_on);                                        \
})

which seems fine.

#define WARN_ON(condition) unlikely(!!(condition))

is probably just as good.
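
The difference between the two shapes of WARN_ON() discussed here can be checked with a small user-space sketch. The names WARN_ON_VOID, WARN_ON_VALUE, probe() and warn_on_demo() are hypothetical, and unlikely() is defined locally via __builtin_expect() as an assumption about its usual form. Both macros evaluate the condition exactly once, but only the second can be used inside an if ().

```c
#include <assert.h>

/* Local stand-in for the kernel's branch hint. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Variant 1: keeps side effects, but yields no value. */
#define WARN_ON_VOID(x) ((void)(x))

/* Variant 2: keeps side effects and yields 0 or 1.  Uses a GNU
 * statement expression, as the !CONFIG_BUG default quoted above does. */
#define WARN_ON_VALUE(condition) ({			\
	int warn_ret = !!(condition);			\
	unlikely(warn_ret);				\
})

static int probe_calls;

/* Helper with a visible side effect: counts how often it is evaluated. */
static int probe(int v)
{
	probe_calls++;
	return v;
}

/* Returns the number of probe() evaluations, or -1 if the value was lost. */
int warn_on_demo(void)
{
	probe_calls = 0;

	WARN_ON_VOID(probe(1));		/* evaluated once, result discarded */
	assert(probe_calls == 1);

	if (WARN_ON_VALUE(probe(5)))	/* evaluated once, usable as a condition */
		return probe_calls;
	return -1;
}
```

Writing if (WARN_ON_VOID(...)) would not compile, since a void expression has no value to test; that is why the return value has to be kept for callers that branch on WARN_ON().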

	Arnd

Patch

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 7d10f96..9afff7d 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -108,11 +108,11 @@  extern void warn_slowpath_null(const char *file, const int line);
 
 #else /* !CONFIG_BUG */
 #ifndef HAVE_ARCH_BUG
-#define BUG() do {} while(0)
+#define BUG() __builtin_unreachable ()
 #endif
 
 #ifndef HAVE_ARCH_BUG_ON
-#define BUG_ON(condition) do { if (condition) ; } while(0)
+#define BUG_ON(condition) do { if (condition) __builtin_unreachable(); } while(0)
 #endif
 
 #ifndef HAVE_ARCH_WARN_ON