[GIT,PULL] gpio: fixes for v5.18-rc2

Message ID 20220409205134.13070-1-brgl@bgdev.pl
State New

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git tags/gpio-fixes-for-v5.18-rc2

Message

Bartosz Golaszewski April 9, 2022, 8:51 p.m. UTC
Linus,

Here's a single fix for a race condition between the GPIO core and consumers of
GPIO IRQ chips.

Please pull,
Bartosz Golaszewski

The following changes since commit 3123109284176b1532874591f7c81f3837bbdc17:

  Linux 5.18-rc1 (2022-04-03 14:08:21 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git tags/gpio-fixes-for-v5.18-rc2

for you to fetch changes up to 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320:

  gpio: Restrict usage of GPIO chip irq members before initialization (2022-04-04 14:41:34 +0200)

----------------------------------------------------------------
gpio fixes for v5.18-rc2

- fix a race condition with consumers accessing the fields of GPIO IRQ chips
  before they're fully initialized

----------------------------------------------------------------
Shreeya Patel (1):
      gpio: Restrict usage of GPIO chip irq members before initialization

 drivers/gpio/gpiolib.c      | 19 +++++++++++++++++++
 include/linux/gpio/driver.h |  9 +++++++++
 2 files changed, 28 insertions(+)

Comments

Linus Torvalds April 10, 2022, 4:26 a.m. UTC | #1
On Sat, Apr 9, 2022 at 10:51 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>
> Here's a single fix for a race condition between the GPIO core and consumers of
> GPIO IRQ chips.

I've pulled this, but it's horribly broken.

You can't just use a compiler barrier to make sure the compiler orders
the data at initialization time.

That doesn't take care of CPU re-ordering, but it also doesn't take
care of re-ordering reads on the other side of the equation.

Every write barrier needs to pair with a read barrier.

And "barrier()" is only a barrier on that CPU, since it is only a
barrier for code generation, not for data.

There are multiple ways to do proper hand-off of data, but the best
one is likely

 - on the initialization side, do

        .. initialize all the data, then do ..
        smp_store_release(&initialized, 1);

 - on the reading side, do

        if (!smp_load_acquire(&initialized))
                 return -EAGAIN;

        .. you can now rely on all the data having been initialized ..
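
(Illustration, not part of the original message.) The hand-off described above can be sketched in user space with C11 atomics, which offer release/acquire semantics comparable to the kernel's smp_store_release() and smp_load_acquire(). The struct and function names here are hypothetical and are not taken from gpiolib. The sketch assumes the data is written once and only read afterwards.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for data that is initialized once, then only read. */
struct irq_state {
	int first_irq;          /* plain data, written before publication */
	int num_irqs;           /* plain data, written before publication */
	atomic_int initialized; /* publication flag */
};

/* Initialization side: write all the data, then publish with a release store. */
static void irq_state_publish(struct irq_state *s, int first, int count)
{
	s->first_irq = first;
	s->num_irqs = count;
	/* Release: the plain stores above may not be reordered after this store. */
	atomic_store_explicit(&s->initialized, 1, memory_order_release);
}

/* Reading side: the acquire load pairs with the release store above. */
static int irq_state_last_irq(struct irq_state *s, int *last)
{
	if (!atomic_load_explicit(&s->initialized, memory_order_acquire))
		return -EAGAIN;
	/* The fields written before publication are guaranteed visible here. */
	*last = s->first_irq + s->num_irqs - 1;
	return 0;
}
```

A reader that gets -EAGAIN is expected to retry later. A reader that sees the flag set can use the fields without further ordering, provided nothing modifies them after publication.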

But honestly, the fact that you got this race condition so wrong makes
me suggest you use proper locks. Because the above gives you proper
ordering between the two sequences, but the sequences in question
still have to have a *lot* of guarantees about the accesses actually
then being valid in a lock-free environment (the only obviously safe
situation is a "initialize things once, everything afterwards is only
a read" - otherwise you need to make sure all the *updates* are
safely done too).

With locking, all these issues go away. The lock will take care of
ordering, but also data consistency at updates.

Without locking, you need to do the above kinds of careful things for
_all_ the accesses that can race, not just that "initialized" flag.
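
(Illustration, not part of the original message.) For comparison, a minimal user-space sketch of the locked variant, using a pthread mutex in place of a kernel lock. All names are hypothetical. Because every access takes the same lock, later updates are also safe, not only the first initialization.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: every field is protected by 'lock'. */
struct locked_irq_state {
	pthread_mutex_t lock;
	bool initialized;
	int first_irq;
	int num_irqs;
};

/* Must run before the structure is shared with any other thread. */
static void locked_irq_state_setup(struct locked_irq_state *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->initialized = false;
	s->first_irq = 0;
	s->num_irqs = 0;
}

/* Writer: may be called repeatedly; the lock keeps each update consistent. */
static void locked_irq_state_update(struct locked_irq_state *s,
				    int first, int count)
{
	pthread_mutex_lock(&s->lock);
	s->first_irq = first;
	s->num_irqs = count;
	s->initialized = true;
	pthread_mutex_unlock(&s->lock);
}

/* Reader: sees either no data (-EAGAIN) or one complete update. */
static int locked_irq_state_last_irq(struct locked_irq_state *s, int *last)
{
	int ret = 0;

	pthread_mutex_lock(&s->lock);
	if (!s->initialized)
		ret = -EAGAIN;
	else
		*last = s->first_irq + s->num_irqs - 1;
	pthread_mutex_unlock(&s->lock);
	return ret;
}
```

The cost is a lock acquisition on every read. In exchange, no per-field reasoning about memory ordering is needed.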

                 Linus
pr-tracker-bot@kernel.org April 10, 2022, 4:53 a.m. UTC | #2
The pull request you sent on Sat,  9 Apr 2022 22:51:34 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git tags/gpio-fixes-for-v5.18-rc2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/fa3b895da8e06d6e3dcf3e6941a3fd428343e3d7

Thank you!
Bartosz Golaszewski April 11, 2022, 10:49 a.m. UTC | #3
On Sun, Apr 10, 2022 at 6:27 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sat, Apr 9, 2022 at 10:51 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
> >
> > Here's a single fix for a race condition between the GPIO core and consumers of
> > GPIO IRQ chips.
>
> I've pulled this, but it's horribly broken.
>
> You can't just use a compiler barrier to make sure the compiler orders
> the data at initialization time.
>
> That doesn't take care of CPU re-ordering, but it also doesn't take
> care of re-ordering reads on the other side of the equation.
>
> Every write barrier needs to pair with a read barrier.
>
> And "barrier()" is only a barrier on that CPU, since it is only a
> barrier for code generation, not for data.
>
> There are multiple ways to do proper hand-off of data, but the best
> one is likely
>
>  - on the initialization side, do
>
>         .. initialize all the data, then do ..
>         smp_store_release(&initialized, 1);
>
>  - on the reading side, do
>
>         if (!smp_load_acquire(&initialized))
>                  return -EAGAIN;
>
>         .. you can now rely on all the data having been initialized ..
>
> But honestly, the fact that you got this race condition so wrong makes
> me suggest you use proper locks. Because the above gives you proper
> ordering between the two sequences, but the sequences in question
> still have to have a *lot* of guarantees about the accesses actually
> then being valid in a lock-free environment (the only obviously safe
> situation is a "initialize things once, everything afterwards is only
> a read" - otherwise you need to make sure all the *updates* are
> safely done too).
>
> With locking, all these issues go away. The lock will take care of
> ordering, but also data consistency at updates.
>
> Without locking, you need to do the above kinds of careful things for
> _all_ the accesses that can race, not just that "initialized" flag.
>
>                  Linus

Cc'ing Shreeya

Thanks, we'll see about a follow-up with a better solution.

Bart