diff mbox series

[libgpiod] gpiosim: fix data race that corrupts heap

Message ID 20230626-datarace-v1-1-b78e2abb493b@linaro.org
State New
Headers show
Series [libgpiod] gpiosim: fix data race that corrupts heap | expand

Commit Message

Erik Schilling June 26, 2023, 1:14 p.m. UTC
Hit this while seeing some heap corruptions when running cargo test on
the Rust bindings.

Took a bit to track down since I first used address sanitizers, but with
the thread sanitizer it becomes obvious immediately (output simplified):

    ==================
    WARNING: ThreadSanitizer: data race (pid=288119)
      Write of size 8 at 0x0000018f1e78 by thread T6:
        #0 id_free /libgpiod/tests/gpiosim/gpiosim.c:141:17
        #1 dev_release /libgpiod/tests/gpiosim/gpiosim.c:600:2
        #2 refcount_dec /libgpiod/tests/gpiosim/gpiosim.c:176:3
        #3 gpiosim_dev_unref /libgpiod/tests/gpiosim/gpiosim.c:671:2
        #4 bank_release /libgpiod/tests/gpiosim/gpiosim.c:873:2
        #5 refcount_dec /libgpiod/tests/gpiosim/gpiosim.c:176:3
        #6 gpiosim_bank_unref /libgpiod/tests/gpiosim/gpiosim.c:941:2
        [...]

      Previous write of size 8 at 0x0000018f1e78 by thread T1:
        #0 id_free /libgpiod/tests/gpiosim/gpiosim.c:141:17
        #1 bank_release /libgpiod/tests/gpiosim/gpiosim.c:878:2
        #2 refcount_dec /libgpiod/tests/gpiosim/gpiosim.c:176:3
        #3 gpiosim_bank_unref /libgpiod/tests/gpiosim/gpiosim.c:941:2
        [...]

      Location is global 'id_del_ctx' of size 16 at 0x0000018f1e70

      Thread T6 'chip::verify::f' (tid=288126, running) created by main thread at:
        #7 test::run_tests::hd53a07a011bd771f /.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/test/src/lib.rs:407:21
        [...]

      Thread T1 'chip::open::gpi' (tid=288121, finished) created by main thread at:
        #7 test::run_tests::hd53a07a011bd771f /.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/test/src/lib.rs:407:21
        [...]

    SUMMARY: ThreadSanitizer: data race /libgpiod/tests/gpiosim/gpiosim.c:141:17 in id_free
    ==================

This eventually can either lead to leaks or double free's that corrupt
the heap and lead to crashes.

The issue got introduced when a previously local variable that did not
require protection was turned into a global variable.

Fixes: 5e111df2fca56d57193a1825e45e78dd8b76c0f1
Signed-off-by: Erik Schilling <erik.schilling@linaro.org>
---
To: Linux-GPIO <linux-gpio@vger.kernel.org>
Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 tests/gpiosim/gpiosim.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


---
base-commit: d04639ddd11ed7d02c630e693bf07d97f53e17d3
change-id: 20230626-datarace-e62e9bcfa3ee

Best regards,

Comments

Bartosz Golaszewski June 26, 2023, 2:36 p.m. UTC | #1
On Mon, Jun 26, 2023 at 3:15 PM Erik Schilling
<erik.schilling@linaro.org> wrote:
>
> Hit this while seeing some heap corruptions when running cargo test on
> the Rust bindings.
>
> Took a bit to track down since I first used address sanitizers, but with
> the thread sanitizer it becomes obvious immediately (output simplified):
>
>     ==================
>     WARNING: ThreadSanitizer: data race (pid=288119)
>       Write of size 8 at 0x0000018f1e78 by thread T6:
>         #0 id_free /libgpiod/tests/gpiosim/gpiosim.c:141:17
>         #1 dev_release /libgpiod/tests/gpiosim/gpiosim.c:600:2
>         #2 refcount_dec /libgpiod/tests/gpiosim/gpiosim.c:176:3
>         #3 gpiosim_dev_unref /libgpiod/tests/gpiosim/gpiosim.c:671:2
>         #4 bank_release /libgpiod/tests/gpiosim/gpiosim.c:873:2
>         #5 refcount_dec /libgpiod/tests/gpiosim/gpiosim.c:176:3
>         #6 gpiosim_bank_unref /libgpiod/tests/gpiosim/gpiosim.c:941:2
>         [...]
>
>       Previous write of size 8 at 0x0000018f1e78 by thread T1:
>         #0 id_free /libgpiod/tests/gpiosim/gpiosim.c:141:17
>         #1 bank_release /libgpiod/tests/gpiosim/gpiosim.c:878:2
>         #2 refcount_dec /libgpiod/tests/gpiosim/gpiosim.c:176:3
>         #3 gpiosim_bank_unref /libgpiod/tests/gpiosim/gpiosim.c:941:2
>         [...]
>
>       Location is global 'id_del_ctx' of size 16 at 0x0000018f1e70
>
>       Thread T6 'chip::verify::f' (tid=288126, running) created by main thread at:
>         #7 test::run_tests::hd53a07a011bd771f /.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/test/src/lib.rs:407:21
>         [...]
>
>       Thread T1 'chip::open::gpi' (tid=288121, finished) created by main thread at:
>         #7 test::run_tests::hd53a07a011bd771f /.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/test/src/lib.rs:407:21
>         [...]
>
>     SUMMARY: ThreadSanitizer: data race /libgpiod/tests/gpiosim/gpiosim.c:141:17 in id_free
>     ==================
>
> This eventually can either lead to leaks or double free's that corrupt
> the heap and lead to crashes.
>
> The issue got introduced when a previously local variable that did not
> require protection was turned into a global variable.
>
> Fixes: 5e111df2fca56d57193a1825e45e78dd8b76c0f1
> Signed-off-by: Erik Schilling <erik.schilling@linaro.org>
> ---
> To: Linux-GPIO <linux-gpio@vger.kernel.org>
> Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
>  tests/gpiosim/gpiosim.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tests/gpiosim/gpiosim.c b/tests/gpiosim/gpiosim.c
> index b49a61a..fca6b7f 100644
> --- a/tests/gpiosim/gpiosim.c
> +++ b/tests/gpiosim/gpiosim.c
> @@ -137,11 +137,11 @@ static int id_alloc(void)
>
>  static void id_free(int id)
>  {
> +       pthread_mutex_lock(&id_lock);
> +
>         id_del_ctx.id = id;
>         id_del_ctx.idp = NULL;
>
> -       pthread_mutex_lock(&id_lock);
> -
>         twalk(id_root, id_del);
>         if (id_del_ctx.idp) {
>                 tdelete(id_del_ctx.idp, &id_root, id_compare);
>
> ---
> base-commit: d04639ddd11ed7d02c630e693bf07d97f53e17d3
> change-id: 20230626-datarace-e62e9bcfa3ee
>
> Best regards,
> --
> Erik Schilling <erik.schilling@linaro.org>
>

Nice catch, thanks! I applied it and just changed the Fixes tag to
conform to the kernel style.

I actually saw some issues when running rust tests some time ago but
couldn't reproduce reliably, I suspect this was it.

Bart
Erik Schilling June 26, 2023, 4:41 p.m. UTC | #2
On Mon Jun 26, 2023 at 4:36 PM CEST, Bartosz Golaszewski wrote:
> Nice catch, thanks!
> I applied it and just changed the Fixes tag to
> conform to the kernel style.

Thanks! I updated my gitconfig with the correct pretty format.

> I actually saw some issues when running rust tests some time ago but
> couldn't reproduce reliably, I suspect this was it.

Yeah, it was a bit nasty since this mixed up the double-free backtraces
really bad. I was starting at the ref counters and Rust abstractions
for quite a while and could not make sense of it. And once one added
too many debug prints the chance of reproducing it got reduced pretty
severly on my machine.

- Erik
diff mbox series

Patch

diff --git a/tests/gpiosim/gpiosim.c b/tests/gpiosim/gpiosim.c
index b49a61a..fca6b7f 100644
--- a/tests/gpiosim/gpiosim.c
+++ b/tests/gpiosim/gpiosim.c
@@ -137,11 +137,11 @@  static int id_alloc(void)
 
 static void id_free(int id)
 {
+	pthread_mutex_lock(&id_lock);
+
 	id_del_ctx.id = id;
 	id_del_ctx.idp = NULL;
 
-	pthread_mutex_lock(&id_lock);
-
 	twalk(id_root, id_del);
 	if (id_del_ctx.idp) {
 		tdelete(id_del_ctx.idp, &id_root, id_compare);