Message ID | 1271849253.7895.1929.camel@edumazet-laptop |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
On Wed, Apr 21, 2010 at 4:27 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Here is the patch I use now and my test application is now able to open
> and connect 1000000 sockets (ulimit -n 1000000)

I believe we hit this very issue yesterday in our test lab. We had a stress test running of one of our applications, with about a dozen instances of it running on the box. Suddenly DNS requests began failing with the complaint that a request could not be made because there were no sockets.

root@champagne:/proc/sys/net/ipv4> host gh
host: isc_socket_bind: address in use

Netstat showed 61580 total sockets (UDP and TCP) on the address being used by the above DNS request (local port range 1025 65535). That DNS request should not have been failing.

I noticed that the number of UDP sockets was close to the maximum allowed by the port range, but they were spread across different IP addresses. No single IP address had too many, and there should have been available ports on all IP addresses.

Further, the number of UDP sockets in use seemed to hit the wall at a little above 64,000, and I never got above that number. If that is the normal behavior of the kernel, it could be a big problem for scaling the application.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 21, 2010 at 01:27:33PM +0200, Eric Dumazet (eric.dumazet@gmail.com) wrote:
> Here is the patch I use now and my test application is now able to open
> and connect 1000000 sockets (ulimit -n 1000000)
>
> Trick is bind_conflict() must refuse a socket to bind to a port on a non
> null IP if another socket already uses same port on same IP.
>
> Plus the previous patch sent (check a conflict before exiting the search
> loop)
>
> What do you think ?

Looks good, but do we want to check only reused socket's address there?
What if one of the sockets does not have reuse option turned on, will it
break?
On Wednesday, 21 April 2010 at 22:27 +0400, Evgeniy Polyakov wrote:
> On Wed, Apr 21, 2010 at 01:27:33PM +0200, Eric Dumazet (eric.dumazet@gmail.com) wrote:
> > Here is the patch I use now and my test application is now able to open
> > and connect 1000000 sockets (ulimit -n 1000000)
> >
> > Trick is bind_conflict() must refuse a socket to bind to a port on a non
> > null IP if another socket already uses same port on same IP.
> >
> > Plus the previous patch sent (check a conflict before exiting the search
> > loop)
> >
> > What do you think ?
>
> Looks good, but do we want to check only reused socket's address there?
> What if one of the sockets does not have reuse option turned on, will it
> break?
>

Well, if one socket doesn't have the reuse option turned on, the previous test already works?

	if (!reuse || !sk2->sk_reuse || sk2->sk_state == TCP_LISTEN) {
		if (!sk2_rcv_saddr || !sk_rcv_saddr ||
		    sk2_rcv_saddr == sk_rcv_saddr)
			break;
	} else if (reuse && sk2->sk_reuse &&
		   sk2_rcv_saddr &&
		   sk2_rcv_saddr == sk_rcv_saddr)
		break;

I failed to factorize this complex test :(
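The two-branch test quoted above can be easier to follow as a standalone predicate. What follows is only a userspace sketch of that snippet, not kernel code: `struct model_sock` and `model_bind_conflict()` are hypothetical names invented here, the address is a plain integer with 0 standing for the wildcard, and the device and other checks that surround the test in `inet_csk_bind_conflict()` are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few socket fields the quoted test reads. */
struct model_sock {
	bool     reuse;      /* models sk->sk_reuse */
	bool     listening;  /* models sk->sk_state == TCP_LISTEN */
	uint32_t rcv_saddr;  /* bound IPv4 address, 0 means wildcard */
};

/*
 * Returns true when sk2, which already owns the port, would stop sk
 * from binding to it under the test quoted in the mail above.
 */
static bool model_bind_conflict(const struct model_sock *sk,
				const struct model_sock *sk2)
{
	assert(sk != NULL && sk2 != NULL);

	if (!sk->reuse || !sk2->reuse || sk2->listening) {
		/* Strict branch: a wildcard on either side, or equal
		 * addresses, is a conflict. */
		return !sk2->rcv_saddr || !sk->rcv_saddr ||
		       sk2->rcv_saddr == sk->rcv_saddr;
	}

	/* Both sockets have reuse set and sk2 is not listening: only an
	 * identical, non-wildcard address is a conflict. */
	return sk2->rcv_saddr != 0 && sk2->rcv_saddr == sk->rcv_saddr;
}
```

In this model, two reuse sockets on the same non-wildcard address conflict, two reuse sockets on different addresses do not, and clearing the reuse flag on either socket falls back to the strict branch, which is the case Evgeniy asked about.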
On Wed, Apr 21, 2010 at 08:43:36PM +0200, Eric Dumazet (eric.dumazet@gmail.com) wrote:
> On Wednesday, 21 April 2010 at 22:27 +0400, Evgeniy Polyakov wrote:
> > On Wed, Apr 21, 2010 at 01:27:33PM +0200, Eric Dumazet (eric.dumazet@gmail.com) wrote:
> > > Here is the patch I use now and my test application is now able to open
> > > and connect 1000000 sockets (ulimit -n 1000000)
> > >
> > > Trick is bind_conflict() must refuse a socket to bind to a port on a non
> > > null IP if another socket already uses same port on same IP.
> > >
> > > Plus the previous patch sent (check a conflict before exiting the search
> > > loop)
> > >
> > > What do you think ?
> >
> > Looks good, but do we want to check only reused socket's address there?
> > What if one of the sockets does not have reuse option turned on, will it
> > break?
> >
>
> Well, if one socket doesnt have reuse option turned on, the previous
> test already works ?
>
> 	if (!reuse || !sk2->sk_reuse || sk2->sk_state == TCP_LISTEN) {
> 		if (!sk2_rcv_saddr || !sk_rcv_saddr ||
> 		    sk2_rcv_saddr == sk_rcv_saddr)
> 			break;
> 	} else if (reuse && sk2->sk_reuse &&
> 		   sk2_rcv_saddr &&
> 		   sk2_rcv_saddr == sk_rcv_saddr)
> 		break;
>
> I failed to factorize this complex test :(

Damn it, I tried multiple times :)
You are right of course!
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index e0a3e35..78cbc39 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -70,13 +70,17 @@ int inet_csk_bind_conflict(const struct sock *sk,
 		    (!sk->sk_bound_dev_if ||
 		     !sk2->sk_bound_dev_if ||
 		     sk->sk_bound_dev_if == sk2->sk_bound_dev_if)) {
+			const __be32 sk2_rcv_saddr = inet_rcv_saddr(sk2);
+
 			if (!reuse || !sk2->sk_reuse ||
 			    sk2->sk_state == TCP_LISTEN) {
-				const __be32 sk2_rcv_saddr = inet_rcv_saddr(sk2);
 				if (!sk2_rcv_saddr || !sk_rcv_saddr ||
 				    sk2_rcv_saddr == sk_rcv_saddr)
 					break;
-			}
+			} else if (reuse && sk2->sk_reuse &&
+				   sk2_rcv_saddr &&
+				   sk2_rcv_saddr == sk_rcv_saddr)
+				break;
 		}
 	}
 	return node != NULL;
@@ -120,9 +124,11 @@ again:
 					smallest_size = tb->num_owners;
 					smallest_rover = rover;
 					if (atomic_read(&hashinfo->bsockets) > (high - low) + 1) {
-						spin_unlock(&head->lock);
-						snum = smallest_rover;
-						goto have_snum;
+						if (!inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb)) {
+							spin_unlock(&head->lock);
+							snum = smallest_rover;
+							goto have_snum;
+						}
 					}
 				}
 				goto next;
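
The second hunk guards the early exit from the port search. As a rough illustration only, the sketch below models that decision in userspace: `model_port_range_size()` and `model_take_shortcut()` are hypothetical helpers, `bsockets` stands in for `atomic_read(&hashinfo->bsockets)`, and `conflict` stands in for the result of the `bind_conflict(sk, tb)` call the patch adds. Locking and the surrounding loop are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of ports in the inclusive local port range [low, high]. */
static int model_port_range_size(int low, int high)
{
	assert(low <= high);
	return (high - low) + 1;
}

/*
 * Returns true when the search may stop early and reuse the port of the
 * smallest bucket seen so far. Before the patch the count check alone
 * decided this; with the patch the bucket must also be conflict-free.
 */
static bool model_take_shortcut(int bsockets, int low, int high,
				bool conflict)
{
	if (bsockets > model_port_range_size(low, high))
		return !conflict;

	/* Not enough bound sockets yet: keep scanning for a free port. */
	return false;
}
```

With the range 1025 to 65535 mentioned earlier in the thread, the range size in this model is 64511, so the shortcut is only considered once more sockets than that are bound, and it is skipped whenever the candidate bucket reports a conflict.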