[RESEND] fix lock error when BT IRQ preempt BT timer

Message ID 20210113100251.79250-1-hegdevasant@linux.vnet.ibm.com
State Accepted

Commit Message

Vasant Hegde Jan. 13, 2021, 10:02 a.m. UTC
From: lixg <lixgemail@gmail.com>

The BT IRQ may preempt the BT timer if the BMC responds to the host just as a BT message times out.
When the BT IRQ preempts the BT timer, inflight_bt_msg is not properly protected by bt.lock.

And we will see the following log:
[29006114.163785853,3] BT: seq 0x81 netfn 0x0a cmd 0x23: Timeout sending message
[29006114.288029290,3] BT: seq 0x81 netfn 0x0b cmd 0x23: Timeout sending message
[29006114.288917798,3] IPMI: Incorrect netfn 0x0b in response

This may cause a CPU hard lockup, memory being freed twice, a kernel crash, or other failures.

Signed-off-by: lixg <867314078@qq.com>
---
Lixg,
  Your patch was filtered out and didn't reach the mailing list, hence
  I am resending it.
  I have fixed the mailing list issue. All your patches and responses
  will now reach the mailing list.

-Vasant

 hw/bt.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
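
The idea behind the fix can be illustrated with a small model. This is only a sketch under stated assumptions, not skiboot code: `struct msg`, `struct queue`, and the `q_*` helpers are hypothetical stand-ins for `struct bt_msg`, the `bt` state, and the functions in hw/bt.c, and the lock is modelled as a plain flag so the example stays single-threaded and deterministic. It shows the one property the patch relies on: the in-flight pointer is cleared inside the delete routine, before the lock is released, so nothing that takes the lock afterwards can see it pointing at a removed message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bt_msg. */
struct msg {
	struct msg *next;
};

/* Hypothetical stand-in for the bt state plus inflight_bt_msg. */
struct queue {
	bool locked;          /* models bt.lock as a simple flag */
	struct msg *head;     /* pending messages, oldest first */
	int len;              /* models bt.queue_len */
	struct msg *inflight; /* models inflight_bt_msg */
};

static void q_lock(struct queue *q)
{
	assert(!q->locked);
	q->locked = true;
}

static void q_unlock(struct queue *q)
{
	assert(q->locked);
	q->locked = false;
}

/* Append a message to the tail of the queue. Caller holds the lock. */
static void q_add(struct queue *q, struct msg *m)
{
	struct msg **pp = &q->head;

	assert(q->locked);
	while (*pp)
		pp = &(*pp)->next;
	m->next = NULL;
	*pp = m;
	q->len++;
}

/*
 * Remove a message. Caller holds the lock. The in-flight pointer is
 * cleared before the lock is dropped, which is the point of the patch.
 * Simplification: this sketch returns with the lock released and does
 * not model the completion callback that runs after the unlock.
 */
static void q_del(struct queue *q, struct msg *m)
{
	struct msg **pp;

	assert(q->locked);
	for (pp = &q->head; *pp; pp = &(*pp)->next) {
		if (*pp == m) {
			*pp = m->next;
			q->len--;
			break;
		}
	}
	if (m == q->inflight)
		q->inflight = NULL;
	q_unlock(q);
}

/* Mark the oldest message as in flight if nothing else is. */
static bool q_send_next(struct queue *q)
{
	bool started = false;

	q_lock(q);
	if (!q->inflight && q->head) {
		q->inflight = q->head;
		started = true;
	}
	q_unlock(q);
	return started;
}
```

Before the patch, the equivalent of `q->inflight = NULL` sat in the caller, after the delete routine had already released the lock, which left a window in which another context could act on a stale in-flight pointer.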

Comments

DD Aric li Jan. 13, 2021, 11:47 a.m. UTC | #1
thanks

Vasant Hegde Jan. 19, 2021, 8:54 a.m. UTC | #2
On 1/13/21 3:32 PM, Vasant Hegde wrote:
> From: lixg <lixgemail@gmail.com>
> 

Merged to master as 46d7eafbd.

Thanks
-Vasant
DD Aric li Jan. 19, 2021, 9:19 a.m. UTC | #3
Awesome, thanks!

Patch

diff --git a/hw/bt.c b/hw/bt.c
index cf967f899..5016feab6 100644
--- a/hw/bt.c
+++ b/hw/bt.c
@@ -211,6 +211,11 @@  static void bt_msg_del(struct bt_msg *bt_msg)
 {
 	list_del(&bt_msg->link);
 	bt.queue_len--;
+
+	/* Once inflight_bt_msg is off the list, it must be cleared */
+	if (bt_msg == inflight_bt_msg)
+		inflight_bt_msg = NULL;
+
 	unlock(&bt.lock);
 	ipmi_cmd_done(bt_msg->ipmi_msg.cmd,
 		      IPMI_NETFN_RETURN_CODE(bt_msg->ipmi_msg.netfn),
@@ -393,9 +398,6 @@  static void bt_expire_old_msg(uint64_t tb)
 			BT_Q_ERR(bt_msg, "Timeout sending message");
 			bt_msg_del(bt_msg);
 
-			/* Ready to send next message */
-			inflight_bt_msg = NULL;
-
 			/*
 			 * Timing out a message is inherently racy as the BMC
 			 * may start writing just as we decide to kill the