diff mbox series

[linux-stable-4.14] tcp: reset sk_send_head in tcp_write_queue_purge

Message ID 20180315160913.180918-1-soheil.kdev@gmail.com
State Not Applicable, archived
Delegated to: David Miller
Headers show
Series [linux-stable-4.14] tcp: reset sk_send_head in tcp_write_queue_purge | expand

Commit Message

Soheil Hassas Yeganeh March 15, 2018, 4:09 p.m. UTC
From: Soheil Hassas Yeganeh <soheil@google.com>

tcp_write_queue_purge clears all the SKBs in the write queue
but does not reset the sk_send_head. As a result, we can have
a NULL pointer dereference anywhere that we use tcp_send_head
instead of the tcp_write_queue_tail.

For example, after 27fid7a8ed38 (tcp: purge write queue upon RST),
we can purge the write queue on RST. Prior to
75c119afe14f (tcp: implement rb-tree based retransmit queue),
tcp_push will only check tcp_send_head and then accesses
tcp_write_queue_tail to send the actual SKB. As a result, it will
dereference a NULL pointer.

This has been reported twice for 4.14 where we don't have
75c119afe14f:

By Timofey Titovets:

[  422.081094] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000038
[  422.081254] IP: tcp_push+0x42/0x110
[  422.081314] PGD 0 P4D 0
[  422.081364] Oops: 0002 [#1] SMP PTI

By Yongjian Xu:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: tcp_push+0x48/0x120
PGD 80000007ff77b067 P4D 80000007ff77b067 PUD 7fd989067 PMD 0
Oops: 0002 [#18] SMP PTI
Modules linked in: tcp_diag inet_diag tcp_bbr sch_fq iTCO_wdt
iTCO_vendor_support pcspkr ixgbe mdio i2c_i801 lpc_ich joydev input_leds shpchp
e1000e igb dca ptp pps_core hwmon mei_me mei ipmi_si ipmi_msghandler sg ses
scsi_transport_sas enclosure ext4 jbd2 mbcache sd_mod ahci libahci megaraid_sas
wmi ast ttm dm_mirror dm_region_hash dm_log dm_mod dax
CPU: 6 PID: 14156 Comm: [ET_NET 6] Tainted: G D 4.14.26-1.el6.x86_64 #1
Hardware name: LENOVO ThinkServer RD440 /ThinkServer RD440, BIOS A0TS80A
09/22/2014
task: ffff8807d78d8140 task.stack: ffffc9000e944000
RIP: 0010:tcp_push+0x48/0x120
RSP: 0018:ffffc9000e947a88 EFLAGS: 00010246
RAX: 00000000000005b4 RBX: ffff880f7cce9c00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff8807d00f5000
RBP: ffffc9000e947aa8 R08: 0000000000001c84 R09: 0000000000000000
R10: ffff8807d00f5158 R11: 0000000000000000 R12: ffff8807d00f5000
R13: 0000000000000020 R14: 00000000000256d4 R15: 0000000000000000
FS: 00007f5916de9700(0000) GS:ffff88107fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 00000007f8226004 CR4: 00000000001606e0
Call Trace:
tcp_sendmsg_locked+0x33d/0xe50
tcp_sendmsg+0x37/0x60
inet_sendmsg+0x39/0xc0
sock_sendmsg+0x49/0x60
sock_write_iter+0xb6/0x100
do_iter_readv_writev+0xec/0x130
? rw_verify_area+0x49/0xb0
do_iter_write+0x97/0xd0
vfs_writev+0x7e/0xe0
? __wake_up_common_lock+0x80/0xa0
? __fget_light+0x2c/0x70
? __do_page_fault+0x1e7/0x530
do_writev+0x60/0xf0
? inet_shutdown+0xac/0x110
SyS_writev+0x10/0x20
do_syscall_64+0x6f/0x140
? prepare_exit_to_usermode+0x8b/0xa0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x3135ce0c57
RSP: 002b:00007f5916de4b00 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000003135ce0c57
RDX: 0000000000000002 RSI: 00007f5916de4b90 RDI: 000000000000606f
RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f5916de8c38
R10: 0000000000000000 R11: 0000000000000293 R12: 00000000000464cc
R13: 00007f5916de8c30 R14: 00007f58d8bef080 R15: 0000000000000002
Code: 48 8b 97 60 01 00 00 4c 8d 97 58 01 00 00 41 b9 00 00 00 00 41 89 f3 4c 39
d2 49 0f 44 d1 41 81 e3 00 80 00 00 0f 85 b0 00 00 00 <80> 4a 38 08 44 8b 8f 74
06 00 00 44 89 8f 7c 06 00 00 83 e6 01
RIP: tcp_push+0x48/0x120 RSP: ffffc9000e947a88
CR2: 0000000000000038
---[ end trace 8d545c2e93515549 ]---

Fixes: a27fid7a8ed38 (tcp: purge write queue upon RST)
Reported-by: Timofey Titovets <nefelim4ag@gmail.com>
Reported-by: Yongjian Xu <yongjianchn@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 include/net/tcp.h | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

yongjian xu March 16, 2018, 2:26 p.m. UTC | #1
Tested-by: Yongjian Xu <yongjianchn@gmail.com>

2018-03-16 0:09 GMT+08:00 Soheil Hassas Yeganeh <soheil.kdev@gmail.com>:
> From: Soheil Hassas Yeganeh <soheil@google.com>
>
> tcp_write_queue_purge clears all the SKBs in the write queue
> but does not reset the sk_send_head. As a result, we can have
> a NULL pointer dereference anywhere that we use tcp_send_head
> instead of the tcp_write_queue_tail.
>
> For example, after 27fid7a8ed38 (tcp: purge write queue upon RST),
> we can purge the write queue on RST. Prior to
> 75c119afe14f (tcp: implement rb-tree based retransmit queue),
> tcp_push will only check tcp_send_head and then accesses
> tcp_write_queue_tail to send the actual SKB. As a result, it will
> dereference a NULL pointer.
>
> This has been reported twice for 4.14 where we don't have
> 75c119afe14f:
>
> By Timofey Titovets:
>
> [  422.081094] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000038
> [  422.081254] IP: tcp_push+0x42/0x110
> [  422.081314] PGD 0 P4D 0
> [  422.081364] Oops: 0002 [#1] SMP PTI
>
> By Yongjian Xu:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> IP: tcp_push+0x48/0x120
> PGD 80000007ff77b067 P4D 80000007ff77b067 PUD 7fd989067 PMD 0
> Oops: 0002 [#18] SMP PTI
> Modules linked in: tcp_diag inet_diag tcp_bbr sch_fq iTCO_wdt
> iTCO_vendor_support pcspkr ixgbe mdio i2c_i801 lpc_ich joydev input_leds shpchp
> e1000e igb dca ptp pps_core hwmon mei_me mei ipmi_si ipmi_msghandler sg ses
> scsi_transport_sas enclosure ext4 jbd2 mbcache sd_mod ahci libahci megaraid_sas
> wmi ast ttm dm_mirror dm_region_hash dm_log dm_mod dax
> CPU: 6 PID: 14156 Comm: [ET_NET 6] Tainted: G D 4.14.26-1.el6.x86_64 #1
> Hardware name: LENOVO ThinkServer RD440 /ThinkServer RD440, BIOS A0TS80A
> 09/22/2014
> task: ffff8807d78d8140 task.stack: ffffc9000e944000
> RIP: 0010:tcp_push+0x48/0x120
> RSP: 0018:ffffc9000e947a88 EFLAGS: 00010246
> RAX: 00000000000005b4 RBX: ffff880f7cce9c00 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff8807d00f5000
> RBP: ffffc9000e947aa8 R08: 0000000000001c84 R09: 0000000000000000
> R10: ffff8807d00f5158 R11: 0000000000000000 R12: ffff8807d00f5000
> R13: 0000000000000020 R14: 00000000000256d4 R15: 0000000000000000
> FS: 00007f5916de9700(0000) GS:ffff88107fd00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000038 CR3: 00000007f8226004 CR4: 00000000001606e0
> Call Trace:
> tcp_sendmsg_locked+0x33d/0xe50
> tcp_sendmsg+0x37/0x60
> inet_sendmsg+0x39/0xc0
> sock_sendmsg+0x49/0x60
> sock_write_iter+0xb6/0x100
> do_iter_readv_writev+0xec/0x130
> ? rw_verify_area+0x49/0xb0
> do_iter_write+0x97/0xd0
> vfs_writev+0x7e/0xe0
> ? __wake_up_common_lock+0x80/0xa0
> ? __fget_light+0x2c/0x70
> ? __do_page_fault+0x1e7/0x530
> do_writev+0x60/0xf0
> ? inet_shutdown+0xac/0x110
> SyS_writev+0x10/0x20
> do_syscall_64+0x6f/0x140
> ? prepare_exit_to_usermode+0x8b/0xa0
> entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> RIP: 0033:0x3135ce0c57
> RSP: 002b:00007f5916de4b00 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000003135ce0c57
> RDX: 0000000000000002 RSI: 00007f5916de4b90 RDI: 000000000000606f
> RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f5916de8c38
> R10: 0000000000000000 R11: 0000000000000293 R12: 00000000000464cc
> R13: 00007f5916de8c30 R14: 00007f58d8bef080 R15: 0000000000000002
> Code: 48 8b 97 60 01 00 00 4c 8d 97 58 01 00 00 41 b9 00 00 00 00 41 89 f3 4c 39
> d2 49 0f 44 d1 41 81 e3 00 80 00 00 0f 85 b0 00 00 00 <80> 4a 38 08 44 8b 8f 74
> 06 00 00 44 89 8f 7c 06 00 00 83 e6 01
> RIP: tcp_push+0x48/0x120 RSP: ffffc9000e947a88
> CR2: 0000000000000038
> ---[ end trace 8d545c2e93515549 ]---
>
> Fixes: a27fid7a8ed38 (tcp: purge write queue upon RST)
> Reported-by: Timofey Titovets <nefelim4ag@gmail.com>
> Reported-by: Yongjian Xu <yongjianchn@gmail.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> ---
>  include/net/tcp.h | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 0a13574134b8b..d323d4fa742ca 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1600,6 +1600,11 @@ enum tcp_chrono {
>  void tcp_chrono_start(struct sock *sk, const enum tcp_chrono type);
>  void tcp_chrono_stop(struct sock *sk, const enum tcp_chrono type);
>
> +static inline void tcp_init_send_head(struct sock *sk)
> +{
> +       sk->sk_send_head = NULL;
> +}
> +
>  /* write queue abstraction */
>  static inline void tcp_write_queue_purge(struct sock *sk)
>  {
> @@ -1610,6 +1615,7 @@ static inline void tcp_write_queue_purge(struct sock *sk)
>                 sk_wmem_free_skb(sk, skb);
>         sk_mem_reclaim(sk);
>         tcp_clear_all_retrans_hints(tcp_sk(sk));
> +       tcp_init_send_head(sk);
>  }
>
>  static inline struct sk_buff *tcp_write_queue_head(const struct sock *sk)
> @@ -1672,11 +1678,6 @@ static inline void tcp_check_send_head(struct sock *sk, struct sk_buff *skb_unli
>                 tcp_sk(sk)->highest_sack = NULL;
>  }
>
> -static inline void tcp_init_send_head(struct sock *sk)
> -{
> -       sk->sk_send_head = NULL;
> -}
> -
>  static inline void __tcp_add_write_queue_tail(struct sock *sk, struct sk_buff *skb)
>  {
>         __skb_queue_tail(&sk->sk_write_queue, skb);
> --
> 2.16.2.804.g6dcf76e118-goog
>
Willy Tarreau March 20, 2018, 4:13 p.m. UTC | #2
Hi David,

regarding the patch below, I'm not certain whether you planned to take
it since it's marked "not applicable" on patchwork, but I suspect it's
only because it doesn't apply to mainline.

However, please note that there are two typos in commit IDs referenced in
the commit message that might be nice to fix before merging :

> > For example, after 27fid7a8ed38 (tcp: purge write queue upon RST),

- here it's "a27fd7a8ed38" (missing 'a' and extra 'i' in the middle)

- and lower in the fixes tag there's the extra 'i' as well :

> > Fixes: a27fid7a8ed38 (tcp: purge write queue upon RST)

There was another report today on the stable list for a similar crash
on 4.14.28 and I suspect it's the one I saw this week-end on my firewall
after upgrading from 4.14.10 to 4.14.27 though I didn't have the symbols
to confirm.

Thanks,
Willy
Timofey Titovets March 29, 2018, 10 a.m. UTC | #3
Hi,
any progress here?

I didn't find that patch applied for any 4.14.27-4.14.31

Patch rejected?

2018-03-20 19:13 GMT+03:00 Willy Tarreau <w@1wt.eu>:
> Hi David,
>
> regarding the patch below, I'm not certain whether you planned to take
> it since it's marked "not applicable" on patchwork, but I suspect it's
> only because it doesn't apply to mainline.
>
> However, please note that there are two typos in commit IDs referenced in
> the commit message that might be nice to fix before merging :
>
>> > For example, after 27fid7a8ed38 (tcp: purge write queue upon RST),
>
> - here it's "a27fd7a8ed38" (missing 'a' and extra 'i' in the middle)
>
> - and lower in the fixes tag there's the extra 'i' as well :
>
>> > Fixes: a27fid7a8ed38 (tcp: purge write queue upon RST)
>
> There was another report today on the stable list for a similar crash
> on 4.14.28 and I suspect it's the one I saw this week-end on my firewall
> after upgrading from 4.14.10 to 4.14.27 though I didn't have the symbols
> to confirm.
>
> Thanks,
> Willy
Jack Wang March 29, 2018, 10:09 a.m. UTC | #4
2018-03-29 12:00 GMT+02:00 Timofey Titovets <nefelim4ag@gmail.com>:
> Hi,
> any progress here?
>
> I didn't find that patch applied for any 4.14.27-4.14.31
>
> Patch rejected?
It's in queue for 4.14.32
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.14/tcp-reset-sk_send_head-in-tcp_write_queue_purge.patch?id=c78a2769647faebad83a2e35a14cde382be3ff7d
>
> 2018-03-20 19:13 GMT+03:00 Willy Tarreau <w@1wt.eu>:
>> Hi David,
>>
>> regarding the patch below, I'm not certain whether you planned to take
>> it since it's marked "not applicable" on patchwork, but I suspect it's
>> only because it doesn't apply to mainline.
>>
>> However, please note that there are two typos in commit IDs referenced in
>> the commit message that might be nice to fix before merging :
>>
>>> > For example, after 27fid7a8ed38 (tcp: purge write queue upon RST),
>>
>> - here it's "a27fd7a8ed38" (missing 'a' and extra 'i' in the middle)
>>
>> - and lower in the fixes tag there's the extra 'i' as well :
>>
>>> > Fixes: a27fid7a8ed38 (tcp: purge write queue upon RST)
>>
>> There was another report today on the stable list for a similar crash
>> on 4.14.28 and I suspect it's the one I saw this week-end on my firewall
>> after upgrading from 4.14.10 to 4.14.27 though I didn't have the symbols
>> to confirm.
>>
>> Thanks,
>> Willy
>
>
>
> --
> Have a nice day,
> Timofey.
diff mbox series

Patch

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 0a13574134b8b..d323d4fa742ca 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1600,6 +1600,11 @@  enum tcp_chrono {
 void tcp_chrono_start(struct sock *sk, const enum tcp_chrono type);
 void tcp_chrono_stop(struct sock *sk, const enum tcp_chrono type);
 
+static inline void tcp_init_send_head(struct sock *sk)
+{
+	sk->sk_send_head = NULL;
+}
+
 /* write queue abstraction */
 static inline void tcp_write_queue_purge(struct sock *sk)
 {
@@ -1610,6 +1615,7 @@  static inline void tcp_write_queue_purge(struct sock *sk)
 		sk_wmem_free_skb(sk, skb);
 	sk_mem_reclaim(sk);
 	tcp_clear_all_retrans_hints(tcp_sk(sk));
+	tcp_init_send_head(sk);
 }
 
 static inline struct sk_buff *tcp_write_queue_head(const struct sock *sk)
@@ -1672,11 +1678,6 @@  static inline void tcp_check_send_head(struct sock *sk, struct sk_buff *skb_unli
 		tcp_sk(sk)->highest_sack = NULL;
 }
 
-static inline void tcp_init_send_head(struct sock *sk)
-{
-	sk->sk_send_head = NULL;
-}
-
 static inline void __tcp_add_write_queue_tail(struct sock *sk, struct sk_buff *skb)
 {
 	__skb_queue_tail(&sk->sk_write_queue, skb);