mbox series

[v1,0/2] gpio: Restrict usage of GPIO chip irq members before initialization

Message ID cover.1676564505.git.asmaa@nvidia.com
Headers show
Series gpio: Restrict usage of GPIO chip irq members before initialization | expand

Message

Asmaa Mnebhi Feb. 16, 2023, 4:24 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2007581

SRU Justification:

[Impact]

GPIO chip irq members are exposed before they could be completely
initialized and this leads to race conditions.

One such issue was observed for the gc->irq.domain variable which
was accessed through the pwr-mlxbf.c driver in gpiochip_to_irq() before
it could be initialized by gpiochip_add_irqchip(). This resulted in
Kernel NULL pointer dereference. This is a well known issue in the linux community
and was fixed via 2 commits:
5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320
and
06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 (since the previous commit caused a regression)

This race condition is intermittent and hard to reproduce.

[Fix]

* Backport: 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320 to fix the bug at stake
* Backport: 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 to fix a regression introduced by the previous commit

[Test Case]

* Check that the gpio-mlxbf2.c driver is loaded with no kernel panic
* check that all drivers dependent on gpio-mlxbf2.c driver are loaded (mlxbf-gige and pwr-mlxbf)
* do 5000 reboots to make sure this race condition no longer happens

[Regression Potential]

This could cause some regression with the use of gpio interrupts so it is important to test the dependent
drivers mlxbf-gige and pwr-mlxbf. Trigger power reset interrupt to test pwr-mlxbf and bring down/up the
oob_net0 interface to test mlxbf-gige.

Mario Limonciello (1):
  gpio: Request interrupts after IRQ is initialized

Shreeya Patel (1):
  gpio: Restrict usage of GPIO chip irq members before initialization

 drivers/gpio/gpiolib.c      | 19 +++++++++++++++++++
 include/linux/gpio/driver.h |  9 +++++++++
 2 files changed, 28 insertions(+)

Comments

Stefan Bader Feb. 16, 2023, 4:47 p.m. UTC | #1
On 16.02.23 17:24, Asmaa Mnebhi wrote:
> BugLink: https://bugs.launchpad.net/bugs/2007581
> 
> SRU Justification:
> 
> [Impact]
> 
> GPIO chip irq members are exposed before they could be completely
> initialized and this leads to race conditions.
> 
> One such issue was observed for the gc->irq.domain variable which
> was accessed through the pwr-mlxbf.c driver in gpiochip_to_irq() before
> it could be initialized by gpiochip_add_irqchip(). This resulted in
> Kernel NULL pointer dereference. This is a well known issue in the linux community
> and was fixed via 2 commits:
> 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320
> and
> 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 (since the previous commit caused a regression)
> 
> This race condition is intermittent and hard to reproduce.
> 
> [Fix]
> 
> * Backport: 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320 to fix the bug at stake
> * Backport: 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 to fix a regression introduced by the previous commit
> 
> [Test Case]
> 
> * Check that the gpio-mlxbf2.c driver is loaded with no kernel panic
> * check that all drivers dependent on gpio-mlxbf2.c driver are loaded (mlxbf-gige and pwr-mlxbf)
> * do 5000 reboots to make sure this race condition no longer happens
> 
> [Regression Potential]
> 
> This could cause some regression with the use of gpio interrupts so it is important to test the dependent
> drivers mlxbf-gige and pwr-mlxbf. Trigger power reset interrupt to test pwr-mlxbf and bring down/up the
> oob_net0 interface to test mlxbf-gige.
> 
> Mario Limonciello (1):
>    gpio: Request interrupts after IRQ is initialized
> 
> Shreeya Patel (1):
>    gpio: Restrict usage of GPIO chip irq members before initialization
> 
>   drivers/gpio/gpiolib.c      | 19 +++++++++++++++++++
>   include/linux/gpio/driver.h |  9 +++++++++
>   2 files changed, 28 insertions(+)
> 

Note that patches show up duplicated because I did not check they got 
there already before moderating them out of the queue.

-Stefan
Stefan Bader Feb. 17, 2023, 9:12 a.m. UTC | #2
On 16.02.23 17:24, Asmaa Mnebhi wrote:
> BugLink: https://bugs.launchpad.net/bugs/2007581
> 
> SRU Justification:
> 
> [Impact]
> 
> GPIO chip irq members are exposed before they could be completely
> initialized and this leads to race conditions.
> 
> One such issue was observed for the gc->irq.domain variable which
> was accessed through the pwr-mlxbf.c driver in gpiochip_to_irq() before
> it could be initialized by gpiochip_add_irqchip(). This resulted in
> Kernel NULL pointer dereference. This is a well known issue in the linux community
> and was fixed via 2 commits:
> 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320
> and
> 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 (since the previous commit caused a regression)
> 
> This race condition is intermittent and hard to reproduce.
> 
> [Fix]
> 
> * Backport: 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320 to fix the bug at stake
> * Backport: 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 to fix a regression introduced by the previous commit
> 
> [Test Case]
> 
> * Check that the gpio-mlxbf2.c driver is loaded with no kernel panic
> * check that all drivers dependent on gpio-mlxbf2.c driver are loaded (mlxbf-gige and pwr-mlxbf)
> * do 5000 reboots to make sure this race condition no longer happens
> 
> [Regression Potential]
> 
> This could cause some regression with the use of gpio interrupts so it is important to test the dependent
> drivers mlxbf-gige and pwr-mlxbf. Trigger power reset interrupt to test pwr-mlxbf and bring down/up the
> oob_net0 interface to test mlxbf-gige.
> 
> Mario Limonciello (1):
>    gpio: Request interrupts after IRQ is initialized
> 
> Shreeya Patel (1):
>    gpio: Restrict usage of GPIO chip irq members before initialization
> 
>   drivers/gpio/gpiolib.c      | 19 +++++++++++++++++++
>   include/linux/gpio/driver.h |  9 +++++++++
>   2 files changed, 28 insertions(+)
> 

Subject is not including which series/kernel this set targeting. F/J? 
linux-bluefield.

-Stefan
Asmaa Mnebhi Feb. 17, 2023, 3:35 p.m. UTC | #3
HI Stefan,

Question for you. When I cherry-picked these patches, the following was added to my patches commit messages:
> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
> Cc: stable@vger.kernel.org
> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> (backported from 
> commit 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320)
> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>

Which resulted in spamming all the people above. I will manually remove these "reviewerd-bu and cc" in the future as it is very confusing to them

-----Original Message-----
From: Stefan Bader <stefan.bader@canonical.com> 
Sent: Friday, February 17, 2023 4:12 AM
To: Asmaa Mnebhi <asmaa@nvidia.com>; kernel-team@lists.ubuntu.com
Cc: Vladimir Sokolovsky <vlad@nvidia.com>; Meriton Tuli <meriton@nvidia.com>; Khoa Vo <khoav@nvidia.com>
Subject: NACK/Cmnt: [PATCH v1 0/2] gpio: Restrict usage of GPIO chip irq members before initialization

On 16.02.23 17:24, Asmaa Mnebhi wrote:
> BugLink: https://bugs.launchpad.net/bugs/2007581
> 
> SRU Justification:
> 
> [Impact]
> 
> GPIO chip irq members are exposed before they could be completely 
> initialized and this leads to race conditions.
> 
> One such issue was observed for the gc->irq.domain variable which was 
> accessed through the pwr-mlxbf.c driver in gpiochip_to_irq() before it 
> could be initialized by gpiochip_add_irqchip(). This resulted in 
> Kernel NULL pointer dereference. This is a well known issue in the 
> linux community and was fixed via 2 commits:
> 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320
> and
> 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 (since the previous commit 
> caused a regression)
> 
> This race condition is intermittent and hard to reproduce.
> 
> [Fix]
> 
> * Backport: 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320 to fix the bug at 
> stake
> * Backport: 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 to fix a 
> regression introduced by the previous commit
> 
> [Test Case]
> 
> * Check that the gpio-mlxbf2.c driver is loaded with no kernel panic
> * check that all drivers dependent on gpio-mlxbf2.c driver are loaded 
> (mlxbf-gige and pwr-mlxbf)
> * do 5000 reboots to make sure this race condition no longer happens
> 
> [Regression Potential]
> 
> This could cause some regression with the use of gpio interrupts so it 
> is important to test the dependent drivers mlxbf-gige and pwr-mlxbf. 
> Trigger power reset interrupt to test pwr-mlxbf and bring down/up the
> oob_net0 interface to test mlxbf-gige.
> 
> Mario Limonciello (1):
>    gpio: Request interrupts after IRQ is initialized
> 
> Shreeya Patel (1):
>    gpio: Restrict usage of GPIO chip irq members before initialization
> 
>   drivers/gpio/gpiolib.c      | 19 +++++++++++++++++++
>   include/linux/gpio/driver.h |  9 +++++++++
>   2 files changed, 28 insertions(+)
> 

Subject is not including which series/kernel this set targeting. F/J? 
linux-bluefield.

-Stefan

--
- Stefan
Bartlomiej Zolnierkiewicz Feb. 20, 2023, 3:40 p.m. UTC | #4
Hi Asmaa,

On Fri, Feb 17, 2023 at 4:36 PM Asmaa Mnebhi <asmaa@nvidia.com> wrote:
>
> HI Stefan,
>
> Question for you. When I cherry-picked these patches, the following was added to my patches commit messages:
> > Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
> > Cc: stable@vger.kernel.org
> > Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
> > Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
> > Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> (backported from
> > commit 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320)
> > Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
>
> Which resulted in spamming all the people above. I will manually remove these "reviewerd-bu and cc" in the future as it is very confusing to them

These tags are an integral part of the commit description and
shouldn't be removed. Please try using the "--suppress-cc=all"
parameter for 'git send-email' command instead to prevent this from
happening. You may also use the "--dry-run" parameter to make the
command do everything except actually sending the emails to safely
check who will get them.

--
Best regards,
Bartlomiej

> -----Original Message-----
> From: Stefan Bader <stefan.bader@canonical.com>
> Sent: Friday, February 17, 2023 4:12 AM
> To: Asmaa Mnebhi <asmaa@nvidia.com>; kernel-team@lists.ubuntu.com
> Cc: Vladimir Sokolovsky <vlad@nvidia.com>; Meriton Tuli <meriton@nvidia.com>; Khoa Vo <khoav@nvidia.com>
> Subject: NACK/Cmnt: [PATCH v1 0/2] gpio: Restrict usage of GPIO chip irq members before initialization
>
> On 16.02.23 17:24, Asmaa Mnebhi wrote:
> > BugLink: https://bugs.launchpad.net/bugs/2007581
> >
> > SRU Justification:
> >
> > [Impact]
> >
> > GPIO chip irq members are exposed before they could be completely
> > initialized and this leads to race conditions.
> >
> > One such issue was observed for the gc->irq.domain variable which was
> > accessed through the pwr-mlxbf.c driver in gpiochip_to_irq() before it
> > could be initialized by gpiochip_add_irqchip(). This resulted in
> > Kernel NULL pointer dereference. This is a well known issue in the
> > linux community and was fixed via 2 commits:
> > 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320
> > and
> > 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 (since the previous commit
> > caused a regression)
> >
> > This race condition is intermittent and hard to reproduce.
> >
> > [Fix]
> >
> > * Backport: 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320 to fix the bug at
> > stake
> > * Backport: 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 to fix a
> > regression introduced by the previous commit
> >
> > [Test Case]
> >
> > * Check that the gpio-mlxbf2.c driver is loaded with no kernel panic
> > * check that all drivers dependent on gpio-mlxbf2.c driver are loaded
> > (mlxbf-gige and pwr-mlxbf)
> > * do 5000 reboots to make sure this race condition no longer happens
> >
> > [Regression Potential]
> >
> > This could cause some regression with the use of gpio interrupts so it
> > is important to test the dependent drivers mlxbf-gige and pwr-mlxbf.
> > Trigger power reset interrupt to test pwr-mlxbf and bring down/up the
> > oob_net0 interface to test mlxbf-gige.
> >
> > Mario Limonciello (1):
> >    gpio: Request interrupts after IRQ is initialized
> >
> > Shreeya Patel (1):
> >    gpio: Restrict usage of GPIO chip irq members before initialization
> >
> >   drivers/gpio/gpiolib.c      | 19 +++++++++++++++++++
> >   include/linux/gpio/driver.h |  9 +++++++++
> >   2 files changed, 28 insertions(+)
> >
>
> Subject is not including which series/kernel this set targeting. F/J?
> linux-bluefield.
>
> -Stefan
>
> --
> - Stefan
>