From patchwork Wed Jan 16 21:09:10 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vijay Subramanian X-Patchwork-Id: 213058 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 637642C0040 for ; Thu, 17 Jan 2013 08:09:31 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756332Ab3APVJ1 (ORCPT ); Wed, 16 Jan 2013 16:09:27 -0500 Received: from mail-da0-f48.google.com ([209.85.210.48]:43633 "EHLO mail-da0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751969Ab3APVJ0 (ORCPT ); Wed, 16 Jan 2013 16:09:26 -0500 Received: by mail-da0-f48.google.com with SMTP id k18so749406dae.21 for ; Wed, 16 Jan 2013 13:09:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:date:from:x-x-sender:to:cc:subject:in-reply-to :message-id:references:user-agent:mime-version:content-type; bh=XXzZPbqKZC/Wj8VSNLPLnOmpjR1fviFkSIBx2aIfKWw=; b=iwD638oxVoOHj0HD0s45CMVeXylOa6SlRDaVuzSS4dPBTzQIMiHjGNi5vWggzMUeQ6 OlRgvHuZr7cWTXWXosjehgcMuXgT2bBCW091zDVQNOp+K4U3uhi8JdgWRN7xhDsGnOju tvoxlPJfiR5gLu8VO6HBdNCHVk0yHAg1aS0uVnrUsuzQr1sItcx2rhbsEpDJZURDd84w otRnbFPFrJ/Q78TQXYcqfErHEiiw5DceOgK+MLrye81JJa0456S8kBerfJAQ3K90lq6e /HJXE/F7KVlQFZsdNs85ltYyT4mWJTY9sBQf5qx+GEpj+AuPOO1LNTyf46JdCDO6CP05 QZ7w== X-Received: by 10.68.253.137 with SMTP id aa9mr6432557pbd.102.1358370565924; Wed, 16 Jan 2013 13:09:25 -0800 (PST) Received: from [2001:420:2ca:4109:a800:4ff:fe00:a04] ([2001:420:2ca:4109:a800:4ff:fe00:a04]) by mx.google.com with ESMTPS id jo6sm12901386pbb.5.2013.01.16.13.09.24 (version=TLSv1 cipher=RC4-SHA bits=128/128); Wed, 16 Jan 2013 13:09:25 -0800 (PST) Date: Wed, 16 Jan 2013 13:09:10 -0800 (PST) From: Vijay Subramanian X-X-Sender: root@cleese To: Tom Herbert cc: netdev@vger.kernel.org, davem@davemloft.net, netdev@markandruth.co.uk, eric.dumazet@gmail.com Subject: Re: [PATCH 2/5] soreuseport: TCP/IPv4 implementation In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (DEB 1167 2008-08-23) MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org > * Unlike other sk lookup places we do not check > @@ -73,8 +75,11 @@ int inet_csk_bind_conflict(const struct sock *sk, > (!sk->sk_bound_dev_if || > !sk2->sk_bound_dev_if || > sk->sk_bound_dev_if == sk2->sk_bound_dev_if)) { > - if (!reuse || !sk2->sk_reuse || > - sk2->sk_state == TCP_LISTEN) { > + if ((!reuse || !sk2->sk_reuse || > + sk2->sk_state == TCP_LISTEN) && > + (!reuseport || !sk2->sk_reuseport || > + (sk2->sk_state != TCP_TIME_WAIT && > + uid != sock_i_uid(sk2)))) { > const __be32 sk2_rcv_saddr = sk_rcv_saddr(sk2); > if (!sk2_rcv_saddr || !sk_rcv_saddr(sk) || > sk2_rcv_saddr == sk_rcv_saddr(sk)) How about introducing some helper functions to make inet_csk_bid_conflict() and inet6_csk_bind_conflict() more readable as in patch below? We can add another test for reuseport() for soreuseport patches. udp.c already has ipv4_rcv_saddr_equal() but it seems to call ipv6_only_sock() and not inet_v6_ipv6only() which is needed in inet{6}_csk_bid_conflict().So I added sk_rcv_saddr_equal(). Also the bind_conflict functions can return bool instead of int (not implemented in patch below). If patch idea below is ok, I will send it officially. Thanks, Vijay --- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 1832927..c15d2eb 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -22,6 +22,8 @@ #include #include +#include +#include #define INET_CSK_DEBUG 1 @@ -205,6 +207,37 @@ static inline void inet_csk_clear_xmit_timer(struct sock *sk, const int what) #endif } +/* The port cannot be reused if the older socket is in LISTEN state or if + * either the old or new one does not allow reuse + */ +static inline bool sk_reuse_equal(int reuse, const struct sock *sk2) +{ + return !reuse || !sk2->sk_reuse || sk2->sk_state == TCP_LISTEN; +} + +/* The port cannot be reused if both sockets are bound to the same device or + * if either one is not bound + */ +static inline bool sk_bound_dev_equal(const struct sock *sk, + const struct sock *sk2) +{ + return !sk->sk_bound_dev_if || !sk2->sk_bound_dev_if || + sk->sk_bound_dev_if == sk2->sk_bound_dev_if; +} + +/* The port cannot be reused if both sockets have the same rcv_saddr + * or if either rcv_saddr is NULL + */ +static inline bool sk_rcv_saddr_equal(const struct sock *sk1, + const struct sock *sk2) +{ + __be32 sk1_rcv_saddr = sk_rcv_saddr(sk1); + __be32 sk2_rcv_saddr = sk_rcv_saddr(sk2); + + return !sk2_rcv_saddr || !sk1_rcv_saddr || + sk2_rcv_saddr == sk1_rcv_saddr; +} + /* * Reset the retransmission timer */ diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index d0670f0..375cca3 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -68,26 +68,15 @@ int inet_csk_bind_conflict(const struct sock *sk, */ sk_for_each_bound(sk2, node, &tb->owners) { - if (sk != sk2 && - !inet_v6_ipv6only(sk2) && - (!sk->sk_bound_dev_if || - !sk2->sk_bound_dev_if || - sk->sk_bound_dev_if == sk2->sk_bound_dev_if)) { - if (!reuse || !sk2->sk_reuse || - sk2->sk_state == TCP_LISTEN) { - const __be32 sk2_rcv_saddr = sk_rcv_saddr(sk2); - if (!sk2_rcv_saddr || !sk_rcv_saddr(sk) || - sk2_rcv_saddr == sk_rcv_saddr(sk)) - break; - } - if (!relax && reuse && sk2->sk_reuse && - sk2->sk_state != TCP_LISTEN) { - const __be32 sk2_rcv_saddr = sk_rcv_saddr(sk2); + if (sk != sk2 && !inet_v6_ipv6only(sk2) && + sk_bound_dev_equal(sk, sk2)) { + if (sk_reuse_equal(reuse, sk2) && + sk_rcv_saddr_equal(sk, sk2)) + break; - if (!sk2_rcv_saddr || !sk_rcv_saddr(sk) || - sk2_rcv_saddr == sk_rcv_saddr(sk)) - break; - } + if (!relax && sk_reuse_equal(reuse, sk2) && + sk_rcv_saddr_equal(sk, sk2)) + break; } } return node != NULL; diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c index 3064785..8ebe20d 100644 --- a/net/ipv6/inet6_connection_sock.c +++ b/net/ipv6/inet6_connection_sock.c @@ -39,13 +39,9 @@ int inet6_csk_bind_conflict(const struct sock *sk, * vs net namespaces issues. */ sk_for_each_bound(sk2, node, &tb->owners) { - if (sk != sk2 && - (!sk->sk_bound_dev_if || - !sk2->sk_bound_dev_if || - sk->sk_bound_dev_if == sk2->sk_bound_dev_if) && - (!sk->sk_reuse || !sk2->sk_reuse || - sk2->sk_state == TCP_LISTEN) && - ipv6_rcv_saddr_equal(sk, sk2)) + if (sk != sk2 && sk_bound_dev_equal(sk, sk2) && + sk_reuse_equal(sk->sk_reuse, sk2) && + ipv6_rcv_saddr_equal(sk, sk2)) break; }