Message ID | 20180707073140.202004-1-lorenzo@google.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net] net: diag: Don't double-free TCP_NEW_SYN_RECV sockets in tcp_abort | expand |
From: Lorenzo Colitti <lorenzo@google.com> Date: Sat, 7 Jul 2018 16:31:40 +0900 > Tested: passes Android sock_diag_test.py, which exercises this codepath If this Android test case exercises this path, why didn't it trigger the double free and thus cause this bug to be found much sooner? Just curious.
On 07/07/2018 12:31 AM, Lorenzo Colitti wrote: > When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first > frees it by calling inet_csk_reqsk_queue_drop_and_and_put in > tcp_abort, and then frees it again by calling sock_gen_put. > > Since tcp_abort only has one caller, and all the other codepaths > in tcp_abort don't free the socket, just remove the free in that > function. > > Cc: David Ahern <dsa@cumulusnetworks.com> > Tested: passes Android sock_diag_test.py, which exercises this codepath > Fixes: d7226c7a4dd1 ("net: diag: Fix refcnt leak in error path destroying socket") > Signed-off-by: Lorenzo Colitti <lorenzo@google.com> > --- Signed-off-by: Eric Dumazet <edumazet@google.com> Thanks !
On 07/07/2018 06:11 AM, David Miller wrote: > From: Lorenzo Colitti <lorenzo@google.com> > Date: Sat, 7 Jul 2018 16:31:40 +0900 > >> Tested: passes Android sock_diag_test.py, which exercises this codepath > > If this Android test case exercises this path, why didn't it trigger > the double free and thus cause this bug to be found much sooner? > > Just curious. > Presumably android has not backported yet the refcount_t stuff in their kernels. That is a guess though...
On 7/7/18 7:11 AM, David Miller wrote: > From: Lorenzo Colitti <lorenzo@google.com> > Date: Sat, 7 Jul 2018 16:31:40 +0900 > >> Tested: passes Android sock_diag_test.py, which exercises this codepath > > If this Android test case exercises this path, why didn't it trigger > the double free and thus cause this bug to be found much sooner? > wondering the same. How can I get access to sock_diag_test.py?
On 07/07/2018 06:33 AM, David Ahern wrote: > On 7/7/18 7:11 AM, David Miller wrote: >> From: Lorenzo Colitti <lorenzo@google.com> >> Date: Sat, 7 Jul 2018 16:31:40 +0900 >> >>> Tested: passes Android sock_diag_test.py, which exercises this codepath >> >> If this Android test case exercises this path, why didn't it trigger >> the double free and thus cause this bug to be found much sooner? >> > > wondering the same. How can I get access to sock_diag_test.py? > I would simply use ss -tKa src :443 command on a live web server ;) Note to readers : Do not try that unless you want to kill your server.
On 07/07/2018 06:45 AM, Eric Dumazet wrote: > > > On 07/07/2018 06:33 AM, David Ahern wrote: >> On 7/7/18 7:11 AM, David Miller wrote: >>> From: Lorenzo Colitti <lorenzo@google.com> >>> Date: Sat, 7 Jul 2018 16:31:40 +0900 >>> >>>> Tested: passes Android sock_diag_test.py, which exercises this codepath >>> >>> If this Android test case exercises this path, why didn't it trigger >>> the double free and thus cause this bug to be found much sooner? >>> >> >> wondering the same. How can I get access to sock_diag_test.py? >> > > I would simply use ss -tKa src :443 command on a live web server ;) > > Note to readers : Do not try that unless you want to kill your server. > > Here is a packetdrill test : // Test SOCK_DESTROY on SYN_RECV request sockets // We use the "ss" socket statistics tool, which uses inet_diag sockets. // ss -K can be slow --tolerance_usecs=15000 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 2> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8> // ss -K is scary ! Do not mess with the filter or risk killing a lot of flows +0 `ss -t -K -n state SYN-RECV src :8080 >/dev/null` +.1 < . 1:1(0) ack 1 win 32890 +0 > R 1:1(0) // The listener was not killed, but has no available child -> -1 EAGAIN +0 accept(3, ..., ...) = -1 EAGAIN (Resource temporarily unavailable)
On 7/7/18 7:51 AM, Eric Dumazet wrote: > > > On 07/07/2018 06:45 AM, Eric Dumazet wrote: >> >> >> On 07/07/2018 06:33 AM, David Ahern wrote: >>> On 7/7/18 7:11 AM, David Miller wrote: >>>> From: Lorenzo Colitti <lorenzo@google.com> >>>> Date: Sat, 7 Jul 2018 16:31:40 +0900 >>>> >>>>> Tested: passes Android sock_diag_test.py, which exercises this codepath >>>> >>>> If this Android test case exercises this path, why didn't it trigger >>>> the double free and thus cause this bug to be found much sooner? >>>> >>> >>> wondering the same. How can I get access to sock_diag_test.py? >>> >> >> I would simply use ss -tKa src :443 command on a live web server ;) >> >> Note to readers : Do not try that unless you want to kill your server. >> >> > > Here is a packetdrill test : So I have to either learn how to use packetdrill or install a web server and put load on it. If the Android tests are not publicly available then the reference should be removed from the commit log. > > // Test SOCK_DESTROY on SYN_RECV request sockets > // We use the "ss" socket statistics tool, which uses inet_diag sockets. > > // ss -K can be slow > --tolerance_usecs=15000 > > > 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 > +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 > +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > +0 bind(3, ..., ...) = 0 > +0 listen(3, 1) = 0 > > +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 2> > +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8> > > // ss -K is scary ! Do not mess with the filter or risk killing a lot of flows > +0 `ss -t -K -n state SYN-RECV src :8080 >/dev/null` > > +.1 < . 1:1(0) ack 1 win 32890 > +0 > R 1:1(0) > > // The listener was not killed, but has no available child -> -1 EAGAIN > +0 accept(3, ..., ...) = -1 EAGAIN (Resource temporarily unavailable) > I'll give this a try later. Thanks,
On 07/07/2018 06:29 AM, Eric Dumazet wrote: > > > On 07/07/2018 06:11 AM, David Miller wrote: >> From: Lorenzo Colitti <lorenzo@google.com> >> Date: Sat, 7 Jul 2018 16:31:40 +0900 >> >>> Tested: passes Android sock_diag_test.py, which exercises this codepath >> >> If this Android test case exercises this path, why didn't it trigger >> the double free and thus cause this bug to be found much sooner? >> >> Just curious. >> > > Presumably android has not backported yet the refcount_t stuff in their kernels. > > That is a guess though... > Also maybe the confusion comes from the fact that it is not a double free, but a use-after-free, which might cause a bug on kernels without refcount_t saturation.
On 07/07/2018 06:56 AM, David Ahern wrote: > > So I have to either learn how to use packetdrill or install a web server > and put load on it. If the Android tests are not publicly available then > the reference should be removed from the commit log. We see many Change-Id tags in changelogs, and other tags that help various teams, even if it makes little sense for others. Now back to week-end, I am not really impressed by your gratitude today.
On 07/07/2018 07:16 AM, Eric Dumazet wrote: > > > On 07/07/2018 06:56 AM, David Ahern wrote: > >> >> So I have to either learn how to use packetdrill or install a web server >> and put load on it. If the Android tests are not publicly available then >> the reference should be removed from the commit log. > ( A google search for sock_diag_test.py show plenty of git repos btw)
On 7/7/18 1:31 AM, Lorenzo Colitti wrote: > When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first > frees it by calling inet_csk_reqsk_queue_drop_and_and_put in > tcp_abort, and then frees it again by calling sock_gen_put. > > Since tcp_abort only has one caller, and all the other codepaths > in tcp_abort don't free the socket, just remove the free in that > function. > > Cc: David Ahern <dsa@cumulusnetworks.com> > Tested: passes Android sock_diag_test.py, which exercises this codepath > Fixes: d7226c7a4dd1 ("net: diag: Fix refcnt leak in error path destroying socket") > Signed-off-by: Lorenzo Colitti <lorenzo@google.com> > --- > net/ipv4/tcp.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > Thanks for the bugfix. Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com>
From: Lorenzo Colitti <lorenzo@google.com> Date: Sat, 7 Jul 2018 16:31:40 +0900 > When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first > frees it by calling inet_csk_reqsk_queue_drop_and_and_put in > tcp_abort, and then frees it again by calling sock_gen_put. > > Since tcp_abort only has one caller, and all the other codepaths > in tcp_abort don't free the socket, just remove the free in that > function. > > Cc: David Ahern <dsa@cumulusnetworks.com> > Tested: passes Android sock_diag_test.py, which exercises this codepath > Fixes: d7226c7a4dd1 ("net: diag: Fix refcnt leak in error path destroying socket") > Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Applied and queued up for -stable, thanks!
On Sat, Jul 7, 2018 at 10:29 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > >> Tested: passes Android sock_diag_test.py, which exercises this codepath > > > > If this Android test case exercises this path, why didn't it trigger > > the double free and thus cause this bug to be found much sooner? > > > > Just curious. > > Presumably android has not backported yet the refcount_t stuff in their kernels. That's correct. We only started seeing this on 4.14, which is not yet in the field. Also, I think this failure does not appear in our continuous test runs (even on 4.14) because those uses ARCH=um, which is single-threaded. We're working on making it possible to run the tests on qemu in order to catch these threading issues, but we're not there yet.
On Sat, Jul 7, 2018 at 10:56 PM David Ahern <dsa@cumulusnetworks.com> wrote: > > Here is a packetdrill test : > > So I have to either learn how to use packetdrill or install a web server > and put load on it. If the Android tests are not publicly available then > the reference should be removed from the commit log. It is publicly available. The test is at https://android.googlesource.com/kernel/tests/+/master/net/test/sock_diag_test.py . Documentation is at https://source.android.com/devices/architecture/kernel/network_tests . Note that not all the tests pass on upstream kernels. We do try to keep them working, but because they are also run as conformance tests for Android devices they must test out-of-tree behaviour, such as the fact that unlike upstream, Android kernels use RFC-compliant SHA256 truncation when the PF_KEY API is used. Again, I don't think you'll see this particular issue on the ARCH=um tests that run by default. But if you run sock_diag_test.py on a VM or physical machine as root, you should.
On 7/8/18 10:53 PM, Lorenzo Colitti wrote: > On Sat, Jul 7, 2018 at 10:29 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: >>>> Tested: passes Android sock_diag_test.py, which exercises this codepath >>> >>> If this Android test case exercises this path, why didn't it trigger >>> the double free and thus cause this bug to be found much sooner? >>> >>> Just curious. >> >> Presumably android has not backported yet the refcount_t stuff in their kernels. > > That's correct. We only started seeing this on 4.14, which is not yet > in the field. Prior to 4.12 syn-recv sockets did not show up in the list, so it could not be closed through the diag API. I did not chase it to a specific commit between 4.11 (no syn-recv in 'ss -ta') and 4.12 (syn-recv shows up in 'ss -ta').
On 7/8/18 11:24 PM, Lorenzo Colitti wrote: > On Sat, Jul 7, 2018 at 10:56 PM David Ahern <dsa@cumulusnetworks.com> wrote: >>> Here is a packetdrill test : >> >> So I have to either learn how to use packetdrill or install a web server >> and put load on it. If the Android tests are not publicly available then >> the reference should be removed from the commit log. > > It is publicly available. The test is at > https://android.googlesource.com/kernel/tests/+/master/net/test/sock_diag_test.py > . Documentation is at > https://source.android.com/devices/architecture/kernel/network_tests . > > Note that not all the tests pass on upstream kernels. We do try to > keep them working, but because they are also run as conformance tests > for Android devices they must test out-of-tree behaviour, such as the > fact that unlike upstream, Android kernels use RFC-compliant SHA256 > truncation when the PF_KEY API is used. > > Again, I don't think you'll see this particular issue on the ARCH=um > tests that run by default. But if you run sock_diag_test.py on a VM or > physical machine as root, you should. > ok. thanks for reference. I general I am always looking for more easy to run tests to catch problems like this.
On 07/09/2018 07:59 AM, David Ahern wrote: > On 7/8/18 10:53 PM, Lorenzo Colitti wrote: >> On Sat, Jul 7, 2018 at 10:29 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: >>>>> Tested: passes Android sock_diag_test.py, which exercises this codepath >>>> >>>> If this Android test case exercises this path, why didn't it trigger >>>> the double free and thus cause this bug to be found much sooner? >>>> >>>> Just curious. >>> >>> Presumably android has not backported yet the refcount_t stuff in their kernels. >> >> That's correct. We only started seeing this on 4.14, which is not yet >> in the field. > > Prior to 4.12 syn-recv sockets did not show up in the list, so it could > not be closed through the diag API. I did not chase it to a specific > commit between 4.11 (no syn-recv in 'ss -ta') and 4.12 (syn-recv shows > up in 'ss -ta'). > I suggest you look more closely to tcp_abort(), the function you changed in commit d7226c7a4dd1 (linux-4.8) Then you will see it had support for syn-recv since linux-4.5 So I do not understand your 4.11/4.12 mentions at all. "ss -ta" always had dumped syn-recv request sockets.
On 7/9/18 9:17 AM, Eric Dumazet wrote: > > > On 07/09/2018 07:59 AM, David Ahern wrote: >> On 7/8/18 10:53 PM, Lorenzo Colitti wrote: >>> On Sat, Jul 7, 2018 at 10:29 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: >>>>>> Tested: passes Android sock_diag_test.py, which exercises this codepath >>>>> >>>>> If this Android test case exercises this path, why didn't it trigger >>>>> the double free and thus cause this bug to be found much sooner? >>>>> >>>>> Just curious. >>>> >>>> Presumably android has not backported yet the refcount_t stuff in their kernels. >>> >>> That's correct. We only started seeing this on 4.14, which is not yet >>> in the field. >> >> Prior to 4.12 syn-recv sockets did not show up in the list, so it could >> not be closed through the diag API. I did not chase it to a specific >> commit between 4.11 (no syn-recv in 'ss -ta') and 4.12 (syn-recv shows >> up in 'ss -ta'). >> > > I suggest you look more closely to tcp_abort(), the function you changed > in commit d7226c7a4dd1 (linux-4.8) > > Then you will see it had support for syn-recv since linux-4.5 > > So I do not understand your 4.11/4.12 mentions at all. > > "ss -ta" always had dumped syn-recv request sockets. > > Perhaps it is something with my config, settings, ss version or test program, but I do not see it on 4.11: # ip addr sh dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 02:e0:f9:1c:00:37 brd ff:ff:ff:ff:ff:ff inet 10.100.1.4/24 scope global eth1 valid_lft forever preferred_lft forever ... # uname -r 4.11.0 # iptables -A OUTPUT -p tcp --sport 12345 --tcp-flags SYN,ACK SYN,ACK -j DROP # vrf-test -s & [1] 823 --> connection attempt into this node from 10.100.1.254 # ss -ta | grep 1234 LISTEN 0 5 *:12345 *:* No, SYN-RECV in output. A local connection does show up (but I was not looking at that earlier): # vrf-test -r 10.100.1.4 & [2] 827 # ss -ta | grep 1234 LISTEN 0 5 *:12345 *:* SYN-RECV 0 0 10.100.1.4:12345 10.100.1.254:54276 SYN-RECV 0 0 10.100.1.4:12345 10.100.1.4:52316 SYN-SENT 0 1 10.100.1.4:52316 10.100.1.4:12345 Same commands on 4.12: # uname -r 4.12.0 # iptables -A OUTPUT -p tcp --sport 12345 --tcp-flags SYN,ACK SYN,ACK -j DROP # vrf-test -s & [1] 815 --> connection attempt into this node from 10.100.1.254 # ss -ta | grep 1234 LISTEN 0 5 *:12345 *:* SYN-RECV 0 0 10.100.1.4:12345 10.100.1.254:54272 So, SYN-RECV is there. However, in writing this response and repeating the steps over and over I have noticed inconsistencies even with 4.12. For example, I reboot 4.12 and try the steps again: # uname -r 4.12.0 # iptables -A OUTPUT -p tcp --sport 12345 --tcp-flags SYN,ACK SYN,ACK -j DROP # vrf-test -s & [1] 821 --> connection attempt into this node from 10.100.1.254 # ss -ta | grep 1234 LISTEN 0 5 *:12345 *:* no SYN-RECV. Local test: # vrf-test -r 10.100.1.4 & [2] 824 # ss -ta | grep 1234 LISTEN 0 5 *:12345 *:* SYN-SENT 0 1 10.100.1.4:59922 10.100.1.4:12345 No SYN-RECV. # ss --version ss utility, iproute2-ss150831
On 07/09/2018 09:14 AM, David Ahern wrote: > Perhaps it is something with my config, settings, ss version or test > program, but I do not see it on 4.11: > Maybe some vrf issue, I dunno.
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index e7b53d2a97..c959bb6ea4 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3720,8 +3720,7 @@ int tcp_abort(struct sock *sk, int err) struct request_sock *req = inet_reqsk(sk); local_bh_disable(); - inet_csk_reqsk_queue_drop_and_put(req->rsk_listener, - req); + inet_csk_reqsk_queue_drop(req->rsk_listener, req); local_bh_enable(); return 0; }
When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first frees it by calling inet_csk_reqsk_queue_drop_and_and_put in tcp_abort, and then frees it again by calling sock_gen_put. Since tcp_abort only has one caller, and all the other codepaths in tcp_abort don't free the socket, just remove the free in that function. Cc: David Ahern <dsa@cumulusnetworks.com> Tested: passes Android sock_diag_test.py, which exercises this codepath Fixes: d7226c7a4dd1 ("net: diag: Fix refcnt leak in error path destroying socket") Signed-off-by: Lorenzo Colitti <lorenzo@google.com> --- net/ipv4/tcp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)