[net] net: diag: Don't double-free TCP_NEW_SYN_RECV sockets in tcp_abort

Message ID 20180707073140.202004-1-lorenzo@google.com
State Accepted
Delegated to: David Miller
Headers show
Series
  • [net] net: diag: Don't double-free TCP_NEW_SYN_RECV sockets in tcp_abort
Related show

Commit Message

Lorenzo Colitti July 7, 2018, 7:31 a.m.
When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first
frees it by calling inet_csk_reqsk_queue_drop_and_and_put in
tcp_abort, and then frees it again by calling sock_gen_put.

Since tcp_abort only has one caller, and all the other codepaths
in tcp_abort don't free the socket, just remove the free in that
function.

Cc: David Ahern <dsa@cumulusnetworks.com>
Tested: passes Android sock_diag_test.py, which exercises this codepath
Fixes: d7226c7a4dd1 ("net: diag: Fix refcnt leak in error path destroying socket")
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 net/ipv4/tcp.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

David Miller July 7, 2018, 1:11 p.m. | #1
From: Lorenzo Colitti <lorenzo@google.com>
Date: Sat,  7 Jul 2018 16:31:40 +0900

> Tested: passes Android sock_diag_test.py, which exercises this codepath

If this Android test case exercises this path, why didn't it trigger
the double free and thus cause this bug to be found much sooner?

Just curious.
Eric Dumazet July 7, 2018, 1:28 p.m. | #2
On 07/07/2018 12:31 AM, Lorenzo Colitti wrote:
> When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first
> frees it by calling inet_csk_reqsk_queue_drop_and_and_put in
> tcp_abort, and then frees it again by calling sock_gen_put.
> 
> Since tcp_abort only has one caller, and all the other codepaths
> in tcp_abort don't free the socket, just remove the free in that
> function.
> 
> Cc: David Ahern <dsa@cumulusnetworks.com>
> Tested: passes Android sock_diag_test.py, which exercises this codepath
> Fixes: d7226c7a4dd1 ("net: diag: Fix refcnt leak in error path destroying socket")
> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
> ---

Signed-off-by: Eric Dumazet <edumazet@google.com>

Thanks !
Eric Dumazet July 7, 2018, 1:29 p.m. | #3
On 07/07/2018 06:11 AM, David Miller wrote:
> From: Lorenzo Colitti <lorenzo@google.com>
> Date: Sat,  7 Jul 2018 16:31:40 +0900
> 
>> Tested: passes Android sock_diag_test.py, which exercises this codepath
> 
> If this Android test case exercises this path, why didn't it trigger
> the double free and thus cause this bug to be found much sooner?
> 
> Just curious.
> 

Presumably android has not backported yet the refcount_t stuff in their kernels.

That is a guess though...
David Ahern July 7, 2018, 1:33 p.m. | #4
On 7/7/18 7:11 AM, David Miller wrote:
> From: Lorenzo Colitti <lorenzo@google.com>
> Date: Sat,  7 Jul 2018 16:31:40 +0900
> 
>> Tested: passes Android sock_diag_test.py, which exercises this codepath
> 
> If this Android test case exercises this path, why didn't it trigger
> the double free and thus cause this bug to be found much sooner?
> 

wondering the same. How can I get access to sock_diag_test.py?
Eric Dumazet July 7, 2018, 1:45 p.m. | #5
On 07/07/2018 06:33 AM, David Ahern wrote:
> On 7/7/18 7:11 AM, David Miller wrote:
>> From: Lorenzo Colitti <lorenzo@google.com>
>> Date: Sat,  7 Jul 2018 16:31:40 +0900
>>
>>> Tested: passes Android sock_diag_test.py, which exercises this codepath
>>
>> If this Android test case exercises this path, why didn't it trigger
>> the double free and thus cause this bug to be found much sooner?
>>
> 
> wondering the same. How can I get access to sock_diag_test.py?
> 

I would simply use ss -tKa src :443 command on a live web server ;)

Note to readers : Do not try that unless you want to kill your server.
Eric Dumazet July 7, 2018, 1:51 p.m. | #6
On 07/07/2018 06:45 AM, Eric Dumazet wrote:
> 
> 
> On 07/07/2018 06:33 AM, David Ahern wrote:
>> On 7/7/18 7:11 AM, David Miller wrote:
>>> From: Lorenzo Colitti <lorenzo@google.com>
>>> Date: Sat,  7 Jul 2018 16:31:40 +0900
>>>
>>>> Tested: passes Android sock_diag_test.py, which exercises this codepath
>>>
>>> If this Android test case exercises this path, why didn't it trigger
>>> the double free and thus cause this bug to be found much sooner?
>>>
>>
>> wondering the same. How can I get access to sock_diag_test.py?
>>
> 
> I would simply use ss -tKa src :443 command on a live web server ;)
> 
> Note to readers : Do not try that unless you want to kill your server.
> 
>

Here is a packetdrill test :

// Test SOCK_DESTROY on SYN_RECV request sockets
// We use the "ss" socket statistics tool, which uses inet_diag sockets.

// ss -K can be slow
--tolerance_usecs=15000


    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

   +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 2>
   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>

// ss -K is scary ! Do not mess with the filter or risk killing a lot of flows
   +0 `ss -t -K -n state SYN-RECV src :8080 >/dev/null`

  +.1 < . 1:1(0) ack 1 win 32890
   +0 > R 1:1(0)

// The listener was not killed, but has no available child -> -1 EAGAIN
   +0 accept(3, ..., ...) = -1 EAGAIN (Resource temporarily unavailable)
David Ahern July 7, 2018, 1:56 p.m. | #7
On 7/7/18 7:51 AM, Eric Dumazet wrote:
> 
> 
> On 07/07/2018 06:45 AM, Eric Dumazet wrote:
>>
>>
>> On 07/07/2018 06:33 AM, David Ahern wrote:
>>> On 7/7/18 7:11 AM, David Miller wrote:
>>>> From: Lorenzo Colitti <lorenzo@google.com>
>>>> Date: Sat,  7 Jul 2018 16:31:40 +0900
>>>>
>>>>> Tested: passes Android sock_diag_test.py, which exercises this codepath
>>>>
>>>> If this Android test case exercises this path, why didn't it trigger
>>>> the double free and thus cause this bug to be found much sooner?
>>>>
>>>
>>> wondering the same. How can I get access to sock_diag_test.py?
>>>
>>
>> I would simply use ss -tKa src :443 command on a live web server ;)
>>
>> Note to readers : Do not try that unless you want to kill your server.
>>
>>
> 
> Here is a packetdrill test :

So I have to either learn how to use packetdrill or install a web server
and put load on it. If the Android tests are not publicly available then
the reference should be removed from the commit log.


> 
> // Test SOCK_DESTROY on SYN_RECV request sockets
> // We use the "ss" socket statistics tool, which uses inet_diag sockets.
> 
> // ss -K can be slow
> --tolerance_usecs=15000
> 
> 
>     0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
>    +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>    +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>    +0 bind(3, ..., ...) = 0
>    +0 listen(3, 1) = 0
> 
>    +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 2>
>    +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
> 
> // ss -K is scary ! Do not mess with the filter or risk killing a lot of flows
>    +0 `ss -t -K -n state SYN-RECV src :8080 >/dev/null`
> 
>   +.1 < . 1:1(0) ack 1 win 32890
>    +0 > R 1:1(0)
> 
> // The listener was not killed, but has no available child -> -1 EAGAIN
>    +0 accept(3, ..., ...) = -1 EAGAIN (Resource temporarily unavailable)
> 

I'll give this a try later. Thanks,
Eric Dumazet July 7, 2018, 1:58 p.m. | #8
On 07/07/2018 06:29 AM, Eric Dumazet wrote:
> 
> 
> On 07/07/2018 06:11 AM, David Miller wrote:
>> From: Lorenzo Colitti <lorenzo@google.com>
>> Date: Sat,  7 Jul 2018 16:31:40 +0900
>>
>>> Tested: passes Android sock_diag_test.py, which exercises this codepath
>>
>> If this Android test case exercises this path, why didn't it trigger
>> the double free and thus cause this bug to be found much sooner?
>>
>> Just curious.
>>
> 
> Presumably android has not backported yet the refcount_t stuff in their kernels.
> 
> That is a guess though...
> 


Also maybe the confusion comes from the fact that it is not a double free,
but a use-after-free, which might cause a bug on kernels without
refcount_t saturation.
Eric Dumazet July 7, 2018, 2:16 p.m. | #9
On 07/07/2018 06:56 AM, David Ahern wrote:

> 
> So I have to either learn how to use packetdrill or install a web server
> and put load on it. If the Android tests are not publicly available then
> the reference should be removed from the commit log.

We see many Change-Id tags in changelogs, and other tags that help various
teams, even if it makes little sense for others.

Now back to week-end, I am not really impressed by your gratitude today.
Eric Dumazet July 7, 2018, 2:19 p.m. | #10
On 07/07/2018 07:16 AM, Eric Dumazet wrote:
> 
> 
> On 07/07/2018 06:56 AM, David Ahern wrote:
> 
>>
>> So I have to either learn how to use packetdrill or install a web server
>> and put load on it. If the Android tests are not publicly available then
>> the reference should be removed from the commit log.
> 



( A google search for sock_diag_test.py show plenty  of git repos btw)
David Ahern July 7, 2018, 10:07 p.m. | #11
On 7/7/18 1:31 AM, Lorenzo Colitti wrote:
> When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first
> frees it by calling inet_csk_reqsk_queue_drop_and_and_put in
> tcp_abort, and then frees it again by calling sock_gen_put.
> 
> Since tcp_abort only has one caller, and all the other codepaths
> in tcp_abort don't free the socket, just remove the free in that
> function.
> 
> Cc: David Ahern <dsa@cumulusnetworks.com>
> Tested: passes Android sock_diag_test.py, which exercises this codepath
> Fixes: d7226c7a4dd1 ("net: diag: Fix refcnt leak in error path destroying socket")
> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
> ---
>  net/ipv4/tcp.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 

Thanks for the bugfix.

Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
David Miller July 8, 2018, 1:57 a.m. | #12
From: Lorenzo Colitti <lorenzo@google.com>
Date: Sat,  7 Jul 2018 16:31:40 +0900

> When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first
> frees it by calling inet_csk_reqsk_queue_drop_and_and_put in
> tcp_abort, and then frees it again by calling sock_gen_put.
> 
> Since tcp_abort only has one caller, and all the other codepaths
> in tcp_abort don't free the socket, just remove the free in that
> function.
> 
> Cc: David Ahern <dsa@cumulusnetworks.com>
> Tested: passes Android sock_diag_test.py, which exercises this codepath
> Fixes: d7226c7a4dd1 ("net: diag: Fix refcnt leak in error path destroying socket")
> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>

Applied and queued up for -stable, thanks!
Lorenzo Colitti July 9, 2018, 4:53 a.m. | #13
On Sat, Jul 7, 2018 at 10:29 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> Tested: passes Android sock_diag_test.py, which exercises this codepath
> >
> > If this Android test case exercises this path, why didn't it trigger
> > the double free and thus cause this bug to be found much sooner?
> >
> > Just curious.
>
> Presumably android has not backported yet the refcount_t stuff in their kernels.

That's correct. We only started seeing this on 4.14, which is not yet
in the field.

Also, I think this failure does not appear in our continuous test runs
(even on 4.14) because those uses ARCH=um, which is single-threaded.
We're working on making it possible to run the tests on qemu in order
to catch these threading issues, but we're not there yet.
Lorenzo Colitti July 9, 2018, 5:24 a.m. | #14
On Sat, Jul 7, 2018 at 10:56 PM David Ahern <dsa@cumulusnetworks.com> wrote:
> > Here is a packetdrill test :
>
> So I have to either learn how to use packetdrill or install a web server
> and put load on it. If the Android tests are not publicly available then
> the reference should be removed from the commit log.

It is publicly available. The test is at
https://android.googlesource.com/kernel/tests/+/master/net/test/sock_diag_test.py
. Documentation is at
https://source.android.com/devices/architecture/kernel/network_tests .

Note that not all the tests pass on upstream kernels. We do try to
keep them working, but because they are also run as conformance tests
for Android devices they must test out-of-tree behaviour, such as the
fact that unlike upstream, Android kernels use RFC-compliant SHA256
truncation when the PF_KEY API is used.

Again, I don't think you'll see this particular issue on the ARCH=um
tests that run by default. But if you run sock_diag_test.py on a VM or
physical machine as root, you should.
David Ahern July 9, 2018, 2:59 p.m. | #15
On 7/8/18 10:53 PM, Lorenzo Colitti wrote:
> On Sat, Jul 7, 2018 at 10:29 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>> Tested: passes Android sock_diag_test.py, which exercises this codepath
>>>
>>> If this Android test case exercises this path, why didn't it trigger
>>> the double free and thus cause this bug to be found much sooner?
>>>
>>> Just curious.
>>
>> Presumably android has not backported yet the refcount_t stuff in their kernels.
> 
> That's correct. We only started seeing this on 4.14, which is not yet
> in the field.

Prior to 4.12 syn-recv sockets did not show up in the list, so it could
not be closed through the diag API. I did not chase it to a specific
commit between 4.11 (no syn-recv in 'ss -ta') and 4.12 (syn-recv shows
up in 'ss -ta').
David Ahern July 9, 2018, 3 p.m. | #16
On 7/8/18 11:24 PM, Lorenzo Colitti wrote:
> On Sat, Jul 7, 2018 at 10:56 PM David Ahern <dsa@cumulusnetworks.com> wrote:
>>> Here is a packetdrill test :
>>
>> So I have to either learn how to use packetdrill or install a web server
>> and put load on it. If the Android tests are not publicly available then
>> the reference should be removed from the commit log.
> 
> It is publicly available. The test is at
> https://android.googlesource.com/kernel/tests/+/master/net/test/sock_diag_test.py
> . Documentation is at
> https://source.android.com/devices/architecture/kernel/network_tests .
> 
> Note that not all the tests pass on upstream kernels. We do try to
> keep them working, but because they are also run as conformance tests
> for Android devices they must test out-of-tree behaviour, such as the
> fact that unlike upstream, Android kernels use RFC-compliant SHA256
> truncation when the PF_KEY API is used.
> 
> Again, I don't think you'll see this particular issue on the ARCH=um
> tests that run by default. But if you run sock_diag_test.py on a VM or
> physical machine as root, you should.
> 

ok. thanks for reference.

I general I am always looking for more easy to run tests to catch
problems like this.
Eric Dumazet July 9, 2018, 3:17 p.m. | #17
On 07/09/2018 07:59 AM, David Ahern wrote:
> On 7/8/18 10:53 PM, Lorenzo Colitti wrote:
>> On Sat, Jul 7, 2018 at 10:29 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>>> Tested: passes Android sock_diag_test.py, which exercises this codepath
>>>>
>>>> If this Android test case exercises this path, why didn't it trigger
>>>> the double free and thus cause this bug to be found much sooner?
>>>>
>>>> Just curious.
>>>
>>> Presumably android has not backported yet the refcount_t stuff in their kernels.
>>
>> That's correct. We only started seeing this on 4.14, which is not yet
>> in the field.
> 
> Prior to 4.12 syn-recv sockets did not show up in the list, so it could
> not be closed through the diag API. I did not chase it to a specific
> commit between 4.11 (no syn-recv in 'ss -ta') and 4.12 (syn-recv shows
> up in 'ss -ta').
> 

I suggest you look more closely to tcp_abort(), the function you changed
in commit d7226c7a4dd1  (linux-4.8)

Then you will see it had support for syn-recv since linux-4.5

So I do not understand your 4.11/4.12 mentions at all.

"ss -ta" always had dumped syn-recv request sockets.
David Ahern July 9, 2018, 4:14 p.m. | #18
On 7/9/18 9:17 AM, Eric Dumazet wrote:
> 
> 
> On 07/09/2018 07:59 AM, David Ahern wrote:
>> On 7/8/18 10:53 PM, Lorenzo Colitti wrote:
>>> On Sat, Jul 7, 2018 at 10:29 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>>>> Tested: passes Android sock_diag_test.py, which exercises this codepath
>>>>>
>>>>> If this Android test case exercises this path, why didn't it trigger
>>>>> the double free and thus cause this bug to be found much sooner?
>>>>>
>>>>> Just curious.
>>>>
>>>> Presumably android has not backported yet the refcount_t stuff in their kernels.
>>>
>>> That's correct. We only started seeing this on 4.14, which is not yet
>>> in the field.
>>
>> Prior to 4.12 syn-recv sockets did not show up in the list, so it could
>> not be closed through the diag API. I did not chase it to a specific
>> commit between 4.11 (no syn-recv in 'ss -ta') and 4.12 (syn-recv shows
>> up in 'ss -ta').
>>
> 
> I suggest you look more closely to tcp_abort(), the function you changed
> in commit d7226c7a4dd1  (linux-4.8)
> 
> Then you will see it had support for syn-recv since linux-4.5
> 
> So I do not understand your 4.11/4.12 mentions at all.
> 
> "ss -ta" always had dumped syn-recv request sockets.
> 
> 

Perhaps it is something with my config, settings, ss version or test
program, but I do not see it on 4.11:

# ip addr sh dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
group default qlen 1000
    link/ether 02:e0:f9:1c:00:37 brd ff:ff:ff:ff:ff:ff
    inet 10.100.1.4/24 scope global eth1
       valid_lft forever preferred_lft forever
...

# uname -r
4.11.0

# iptables -A OUTPUT -p tcp --sport 12345  --tcp-flags SYN,ACK SYN,ACK
-j DROP
# vrf-test -s &
[1] 823

--> connection attempt into this node from 10.100.1.254

# ss -ta | grep 1234
LISTEN     0      5          *:12345                    *:*

No, SYN-RECV in output. A local connection does show up (but I was not
looking at that earlier):

# vrf-test -r 10.100.1.4 &
[2] 827

# ss -ta | grep 1234
LISTEN     0      5          *:12345                    *:*
SYN-RECV   0      0      10.100.1.4:12345                10.100.1.254:54276
SYN-RECV   0      0      10.100.1.4:12345                10.100.1.4:52316
SYN-SENT   0      1      10.100.1.4:52316                10.100.1.4:12345


Same commands on 4.12:

# uname -r
4.12.0

# iptables -A OUTPUT -p tcp --sport 12345  --tcp-flags SYN,ACK SYN,ACK
-j DROP
# vrf-test -s &
[1] 815

--> connection attempt into this node from 10.100.1.254

# ss -ta | grep 1234
LISTEN     0      5          *:12345                    *:*
SYN-RECV   0      0      10.100.1.4:12345                10.100.1.254:54272

So, SYN-RECV is there.

However, in writing this response and repeating the steps over and over
I have noticed inconsistencies even with 4.12. For example, I reboot
4.12 and try the steps again:

# uname -r
4.12.0

# iptables -A OUTPUT -p tcp --sport 12345  --tcp-flags SYN,ACK SYN,ACK
-j DROP
# vrf-test -s &
[1] 821

--> connection attempt into this node from 10.100.1.254

# ss -ta | grep 1234
LISTEN     0      5          *:12345                    *:*

no SYN-RECV. Local test:

# vrf-test -r 10.100.1.4 &
[2] 824
# ss -ta | grep 1234
LISTEN     0      5          *:12345                    *:*
SYN-SENT   0      1      10.100.1.4:59922                10.100.1.4:12345

No SYN-RECV.

# ss --version
ss utility, iproute2-ss150831
Eric Dumazet July 9, 2018, 4:21 p.m. | #19
On 07/09/2018 09:14 AM, David Ahern wrote:

> Perhaps it is something with my config, settings, ss version or test
> program, but I do not see it on 4.11:
>

Maybe some vrf issue, I dunno.

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e7b53d2a97..c959bb6ea4 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3720,8 +3720,7 @@  int tcp_abort(struct sock *sk, int err)
 			struct request_sock *req = inet_reqsk(sk);
 
 			local_bh_disable();
-			inet_csk_reqsk_queue_drop_and_put(req->rsk_listener,
-							  req);
+			inet_csk_reqsk_queue_drop(req->rsk_listener, req);
 			local_bh_enable();
 			return 0;
 		}