[net-next,2/2] tcp: add TCP_CC_INFO socket option

Message ID 1430263429-4069-3-git-send-email-edumazet@google.com
State Accepted
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet April 28, 2015, 11:23 p.m.
Some Congestion Control modules can provide per flow information,
but current way to get this information is to use netlink.

Like TCP_INFO, let's add TCP_CC_INFO so that applications can
issue a getsockopt() if they have a socket file descriptor,
instead of playing complex netlink games.

Sample usage would be :

  union tcp_cc_info info;
  socklen_t len = sizeof(info);

  if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
---
 include/uapi/linux/tcp.h |  1 +
 net/ipv4/tcp.c           | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)

Comments

Neal Cardwell April 29, 2015, 1 a.m. | #1
On Tue, Apr 28, 2015 at 7:23 PM, Eric Dumazet <edumazet@google.com> wrote:
> Some Congestion Control modules can provide per flow information,
> but current way to get this information is to use netlink.
>
> Like TCP_INFO, let's add TCP_CC_INFO so that applications can
> issue a getsockopt() if they have a socket file descriptor,
> instead of playing complex netlink games.
>
> Sample usage would be :
>
>   union tcp_cc_info info;
>   socklen_t len = sizeof(info);
>
>   if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> ---

Acked-by: Neal Cardwell <ncardwell@google.com>

A very useful facility. Thanks for putting this together, Eric!

neal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann April 29, 2015, 8:17 a.m. | #2
On 04/29/2015 01:23 AM, Eric Dumazet wrote:
> Some Congestion Control modules can provide per flow information,
> but current way to get this information is to use netlink.
>
> Like TCP_INFO, let's add TCP_CC_INFO so that applications can
> issue a getsockopt() if they have a socket file descriptor,
> instead of playing complex netlink games.
>
> Sample usage would be :
>
>    union tcp_cc_info info;
>    socklen_t len = sizeof(info);
>
>    if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>

Presuming other cc algorithms would in future also export
internal information through this interface, would it make
sense to put tcp_cc_info into a container structure so we
don't miss out attr (vegas, dctcp, ...), like:

   struct tcp_cc_exp {
     u32               kind;
     union tcp_cc_info info;
   };

Otherwise looks good:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet April 29, 2015, 11:07 a.m. | #3
On Wed, 2015-04-29 at 10:17 +0200, Daniel Borkmann wrote:
> On 04/29/2015 01:23 AM, Eric Dumazet wrote:
> > Some Congestion Control modules can provide per flow information,
> > but current way to get this information is to use netlink.
> >
> > Like TCP_INFO, let's add TCP_CC_INFO so that applications can
> > issue a getsockopt() if they have a socket file descriptor,
> > instead of playing complex netlink games.
> >
> > Sample usage would be :
> >
> >    union tcp_cc_info info;
> >    socklen_t len = sizeof(info);
> >
> >    if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Yuchung Cheng <ycheng@google.com>
> > Cc: Neal Cardwell <ncardwell@google.com>
> 
> Presuming other cc algorithms would in future also export
> internal information through this interface, would it make
> sense to put tcp_cc_info into a container structure so we
> don't miss out attr (vegas, dctcp, ...), like:
> 
>    struct tcp_cc_exp {
>      u32               kind;
>      union tcp_cc_info info;
>    };
> 
> Otherwise looks good:
> 
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>

I thought of this, but I really believe this is not needed, as the
application can already fetch CC name (if really its does not know yet)

And I also wanted to get same layout for info provided by netlink and
getsockopt()

Thanks !


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann April 29, 2015, 11:15 a.m. | #4
On 04/29/2015 01:07 PM, Eric Dumazet wrote:
> On Wed, 2015-04-29 at 10:17 +0200, Daniel Borkmann wrote:
>> On 04/29/2015 01:23 AM, Eric Dumazet wrote:
>>> Some Congestion Control modules can provide per flow information,
>>> but current way to get this information is to use netlink.
>>>
>>> Like TCP_INFO, let's add TCP_CC_INFO so that applications can
>>> issue a getsockopt() if they have a socket file descriptor,
>>> instead of playing complex netlink games.
>>>
>>> Sample usage would be :
>>>
>>>     union tcp_cc_info info;
>>>     socklen_t len = sizeof(info);
>>>
>>>     if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
>>>
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> Cc: Yuchung Cheng <ycheng@google.com>
>>> Cc: Neal Cardwell <ncardwell@google.com>
>>
>> Presuming other cc algorithms would in future also export
>> internal information through this interface, would it make
>> sense to put tcp_cc_info into a container structure so we
>> don't miss out attr (vegas, dctcp, ...), like:
>>
>>     struct tcp_cc_exp {
>>       u32               kind;
>>       union tcp_cc_info info;
>>     };
>>
>> Otherwise looks good:
>>
>> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
>
> I thought of this, but I really believe this is not needed, as the
> application can already fetch CC name (if really its does not know yet)
>
> And I also wanted to get same layout for info provided by netlink and
> getsockopt()

Ok, I'm fine with that. Presumably, applications making use of
this facility would most likely set the cc op themselves from
the application, so they know exactly what to expect here. Or,
might have fetched it via the cc name albeit a bit inconvenient,
but not impossible.

Thanks,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yuchung Cheng April 29, 2015, 4:15 p.m. | #5
On Tue, Apr 28, 2015 at 4:23 PM, Eric Dumazet <edumazet@google.com> wrote:
> Some Congestion Control modules can provide per flow information,
> but current way to get this information is to use netlink.
>
> Like TCP_INFO, let's add TCP_CC_INFO so that applications can
> issue a getsockopt() if they have a socket file descriptor,
> instead of playing complex netlink games.
>
> Sample usage would be :
>
>   union tcp_cc_info info;
>   socklen_t len = sizeof(info);
>
>   if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>

Nice extension!

> ---
>  include/uapi/linux/tcp.h |  1 +
>  net/ipv4/tcp.c           | 21 +++++++++++++++++++++
>  2 files changed, 22 insertions(+)
>
> diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
> index 3b9718328d8b..937ec387276f 100644
> --- a/include/uapi/linux/tcp.h
> +++ b/include/uapi/linux/tcp.h
> @@ -112,6 +112,7 @@ enum {
>  #define TCP_FASTOPEN           23      /* Enable FastOpen on listeners */
>  #define TCP_TIMESTAMP          24
>  #define TCP_NOTSENT_LOWAT      25      /* limit number of unsent bytes in write queue */
> +#define TCP_CC_INFO            26      /* Get Congestion Control (optional) info */
>
>  struct tcp_repair_opt {
>         __u32   opt_code;
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 8c5cd9efebbc..1e06c75a6837 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -252,6 +252,7 @@
>  #include <linux/types.h>
>  #include <linux/fcntl.h>
>  #include <linux/poll.h>
> +#include <linux/inet_diag.h>
>  #include <linux/init.h>
>  #include <linux/fs.h>
>  #include <linux/skbuff.h>
> @@ -2734,6 +2735,26 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
>                         return -EFAULT;
>                 return 0;
>         }
> +       case TCP_CC_INFO: {
> +               const struct tcp_congestion_ops *ca_ops;
> +               union tcp_cc_info info;
> +               size_t sz = 0;
> +               int attr;
> +
> +               if (get_user(len, optlen))
> +                       return -EFAULT;
> +
> +               ca_ops = icsk->icsk_ca_ops;
> +               if (ca_ops && ca_ops->get_info)
> +                       sz = ca_ops->get_info(sk, ~0U, &attr, &info);
> +
> +               len = min_t(unsigned int, len, sz);
> +               if (put_user(len, optlen))
> +                       return -EFAULT;
> +               if (copy_to_user(optval, &info, len))
> +                       return -EFAULT;
> +               return 0;
> +       }
>         case TCP_QUICKACK:
>                 val = !icsk->icsk_ack.pingpong;
>                 break;
> --
> 2.2.0.rc0.207.ga3a616c
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller April 29, 2015, 10:12 p.m. | #6
From: Eric Dumazet <edumazet@google.com>
Date: Tue, 28 Apr 2015 16:23:49 -0700

> Some Congestion Control modules can provide per flow information,
> but current way to get this information is to use netlink.
> 
> Like TCP_INFO, let's add TCP_CC_INFO so that applications can
> issue a getsockopt() if they have a socket file descriptor,
> instead of playing complex netlink games.
> 
> Sample usage would be :
> 
>   union tcp_cc_info info;
>   socklen_t len = sizeof(info);
> 
>   if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 3b9718328d8b..937ec387276f 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -112,6 +112,7 @@  enum {
 #define TCP_FASTOPEN		23	/* Enable FastOpen on listeners */
 #define TCP_TIMESTAMP		24
 #define TCP_NOTSENT_LOWAT	25	/* limit number of unsent bytes in write queue */
+#define TCP_CC_INFO		26	/* Get Congestion Control (optional) info */
 
 struct tcp_repair_opt {
 	__u32	opt_code;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 8c5cd9efebbc..1e06c75a6837 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -252,6 +252,7 @@ 
 #include <linux/types.h>
 #include <linux/fcntl.h>
 #include <linux/poll.h>
+#include <linux/inet_diag.h>
 #include <linux/init.h>
 #include <linux/fs.h>
 #include <linux/skbuff.h>
@@ -2734,6 +2735,26 @@  static int do_tcp_getsockopt(struct sock *sk, int level,
 			return -EFAULT;
 		return 0;
 	}
+	case TCP_CC_INFO: {
+		const struct tcp_congestion_ops *ca_ops;
+		union tcp_cc_info info;
+		size_t sz = 0;
+		int attr;
+
+		if (get_user(len, optlen))
+			return -EFAULT;
+
+		ca_ops = icsk->icsk_ca_ops;
+		if (ca_ops && ca_ops->get_info)
+			sz = ca_ops->get_info(sk, ~0U, &attr, &info);
+
+		len = min_t(unsigned int, len, sz);
+		if (put_user(len, optlen))
+			return -EFAULT;
+		if (copy_to_user(optval, &info, len))
+			return -EFAULT;
+		return 0;
+	}
 	case TCP_QUICKACK:
 		val = !icsk->icsk_ack.pingpong;
 		break;