Tip regression - bus error (bisected)

Submitted by Colin King on July 21, 2017, 2:12 p.m.

Details

Message ID 57d64408-6ef3-407e-3a77-819c060f2158@canonical.com
State New

Commit Message

On 21/07/17 03:18, Jeffrey Hugo wrote:
> I noticed a consistent bus error when running tip on an ARM64 platform -
> 
> ubuntu@ubuntu:~$ sudo fwts
> Running 71 tests, results appended to results.log
> Test: Gather kernel system information.
>   Gather kernel signature.                                1 skipped, 1 info only
>   Gather kernel system information.                       1 info only
>   Gather kernel boot command line.                        1 info only
>   Gather ACPI driver version.                             1 info only
> Test: OPAL Processor Power Management DT Validation Tests
>   Test skipped, missing features: devicetree
> Test: OPAL Reserved memory DT Validation Test
>   Test skipped, missing features: devicetree
> Test: OPAL Processor Recovery Diagnostics Info
>   Test skipped, missing features: devicetree
> Test: Scan kernel log for Oopses.
>   Kernel log oops check.                                  2 passed
> Test: Run OLOG scan and analysis checks.
>  Test skipped.
> Test: Scan kernel log for errors and warnings.
>   Kernel log error check.                                 1 passed
> Test: BMC Info
>   BMC Info                                                1 passed
> Test: General ACPI information test.
>   Determine Kernel ACPI version.                          1 info only
>   Determine machine's ACPI version.                                  : 11.7% |
> Caught SIGNAL 7 (Bus error), aborting.
> Backtrace:
> 0x0000ffff7eae09e4 /usr/local/lib/fwts/libfwts.so.1(+0x109e4)
> 
> I bisected the issue down to this commit -
> 
> commit cc3ea59404ef2bb89e40556bce8a8d803b39d3ce
> Author: Colin Ian King <colin.king@canonical.com>
> Date:   Fri Jul 14 09:35:10 2017 +0100
> 
>     lib: fwts_safe_mem: remove need to copy into a buffer
> 
>     While fwts_safe_memread() works fine as it is, it is copying data
>     to the stack and we don't guard how big that copy can be, so we
>     potentially could get a segfault if we run out of stack. Instead
>     just read the data. Force gcc not to optimize out the reads by
>     using volatile.
> 
>     Signed-off-by: Colin Ian King <colin.king@canonical.com>
>     Acked-by: Alex Hung <alex.hung@canonical.com>
>     Acked-by: Ivan Hu <ivan.hu@canonical.com>
> 
> I haven't really looked further into the issue, but is there additional
> information that would be useful to root cause and fix?
> 

Can you apply the debug patch below and run fwts with:

sudo fwts - >& fwts.log

and send that fwts.log to me.

Thanks!

Patch

diff --git a/src/lib/src/fwts_safe_mem.c b/src/lib/src/fwts_safe_mem.c
index 216ad8e2..08a79350 100644
--- a/src/lib/src/fwts_safe_mem.c
+++ b/src/lib/src/fwts_safe_mem.c
@@ -71,8 +71,12 @@  int fwts_safe_memread(const void *src, const size_t n)
        const volatile uint8_t *ptr = src;
        const volatile uint8_t *end = ptr + n;

-       if (sigsetjmp(jmpbuf, 1) != 0)
+       printf("fwts_safe_memread: %p..%p\n", src, (const void *)end);
+
+       if (sigsetjmp(jmpbuf, 1) != 0) {
+               printf("  Region cannot be read safely\n");
                return FWTS_ERROR;
+       }

        fwts_sig_handler_set(SIGSEGV, sig_handler, &old_segv_action);
        fwts_sig_handler_set(SIGBUS, sig_handler, &old_bus_action);
@@ -83,5 +87,7 @@  int fwts_safe_memread(const void *src, const size_t n)
        fwts_sig_handler_restore(SIGSEGV, &old_segv_action);
        fwts_sig_handler_restore(SIGBUS, &old_bus_action);

+       printf("  :Region can be read safely\n");
+
        return FWTS_OK;
 }
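
For readers unfamiliar with the technique the quoted commit message describes (touching each byte through a volatile pointer while SIGSEGV and SIGBUS are trapped), here is a minimal standalone sketch. It is not the fwts code: the names safe_memread_sketch(), fault_handler() and probe_unreadable_page() are made up for illustration, it calls sigaction() directly rather than the fwts_sig_handler_set()/fwts_sig_handler_restore() helpers, and it returns 0/-1 instead of FWTS_OK/FWTS_ERROR. It also installs the handlers before calling sigsetjmp() so that they are restored on the fault path too, which is a choice made for this sketch only.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Jump target used by the fault handler (hypothetical name). */
static sigjmp_buf jmpbuf;

/* On SIGSEGV/SIGBUS, unwind back to the sigsetjmp() call below. */
static void fault_handler(int signum)
{
	(void)signum;
	siglongjmp(jmpbuf, 1);
}

/*
 * Try to read n bytes starting at src without copying them anywhere.
 * Returns 0 if every byte could be read, -1 if a read faulted.
 */
int safe_memread_sketch(const void *src, size_t n)
{
	/* volatile stops the compiler from optimizing the reads away */
	const volatile uint8_t *ptr = src;
	const volatile uint8_t *end = ptr + n;
	struct sigaction sa, old_segv, old_bus;
	int ret = 0;

	sa.sa_handler = fault_handler;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);

	/* savemask = 1 so the signal mask is restored after siglongjmp() */
	if (sigsetjmp(jmpbuf, 1) != 0) {
		ret = -1;	/* we got here from fault_handler() */
	} else {
		while (ptr < end)
			(void)*ptr++;
	}

	/* Put the previous handlers back on both paths. */
	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	return ret;
}

/* Demo helper: a PROT_NONE page should be reported as unreadable. */
int probe_unreadable_page(void)
{
	size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
	void *page = mmap(NULL, pagesz, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int ret;

	assert(page != MAP_FAILED);
	ret = safe_memread_sketch(page, pagesz);
	munmap(page, pagesz);
	return ret;
}
```

Calling safe_memread_sketch() on an ordinary buffer should return 0, while probe_unreadable_page() should return -1. Note that this only demonstrates the general trap-and-recover pattern; it says nothing about why the read in the report above was not caught.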