Message ID:   1489420786-19547-1-git-send-email-johunt@akamai.com
State:        RFC, archived
Delegated to: David Miller
On Mon, Mar 13, 2017 at 8:59 AM, Josh Hunt <johunt@akamai.com> wrote:
> Allows application to read the amount of data sitting in the receive
> queue.
>
> Signed-off-by: Josh Hunt <johunt@akamai.com>
> ---
>
> A team here is looking for a way to get the amount of data in a UDP socket's
> receive queue. It seems like this should be SIOCINQ, but for UDP sockets that
> returns the size of the next pending datagram. I implemented the patch below,
> but am wondering if this is the right place for this change? I was debating
> between this or a new UDP ioctl.

But what is the 'amount of data' exactly? The number of packets, or the
number of bytes to read from these packets?

You chose to report kernel memory usage, which is not guaranteed to be
the same among kernel versions (or kernel configs).

If we export these internals, I would export the whole thing, like we
did with netlink, i.e. tweak sock_diag_put_meminfo() and export the
SK_MEMINFO_VARS, so that we avoid adding other options in the future.
On 03/13/2017 11:12 AM, Eric Dumazet wrote:
> On Mon, Mar 13, 2017 at 8:59 AM, Josh Hunt <johunt@akamai.com> wrote:
>> Allows application to read the amount of data sitting in the receive
>> queue.
>>
>> Signed-off-by: Josh Hunt <johunt@akamai.com>
>> ---
>>
>> A team here is looking for a way to get the amount of data in a UDP socket's
>> receive queue. It seems like this should be SIOCINQ, but for UDP sockets that
>> returns the size of the next pending datagram. I implemented the patch below,
>> but am wondering if this is the right place for this change? I was debating
>> between this or a new UDP ioctl.
>
> But what is the 'amount of data' exactly?
> Number of packets, amount of bytes to read from these packets?

I meant bytes. I will clarify in the next version.

> You chose to report kernel memory usage, which is not guaranteed to be
> the same among kernel versions (or kernel configs)
>
> If we export these internals, I would export the whole thing, like we
> did with netlink
>
> ie tweak sock_diag_put_meminfo() and export the SK_MEMINFO_VARS
>
> So that we avoid adding other options in the future.

OK, so you're suggesting we provide all of the SK_MEMINFO_VARS data via
a socket option then? That sounds good to me. I will send that in a v2.

Josh
From: Josh Hunt <johunt@akamai.com>
Date: Mon, 13 Mar 2017 12:38:39 -0500

> On 03/13/2017 11:12 AM, Eric Dumazet wrote:
>> On Mon, Mar 13, 2017 at 8:59 AM, Josh Hunt <johunt@akamai.com> wrote:
>>> Allows application to read the amount of data sitting in the receive
>>> queue.
>>>
>>> Signed-off-by: Josh Hunt <johunt@akamai.com>
>>> ---
>>>
>>> A team here is looking for a way to get the amount of data in a UDP
>>> socket's receive queue. It seems like this should be SIOCINQ, but for
>>> UDP sockets that returns the size of the next pending datagram. I
>>> implemented the patch below, but am wondering if this is the right
>>> place for this change? I was debating between this or a new UDP ioctl.
>>
>> But what is the 'amount of data' exactly?
>> Number of packets, amount of bytes to read from these packets?
>
> I meant bytes. I will clarify in the next version.

As Eric is hinting, the calculation you are using doesn't represent
this.

You need to do something like walk the receive queue and add the
skb->len values together.

sk->sk_rmem_alloc is usually much larger than the sum of the skb->len
values in the socket receive queue. I don't see how this accumulation
of skb->truesize values is useful, whereas I can see how an application
could want the summation of the skb->len values.
On 03/13/2017 02:39 PM, David Miller wrote:
> From: Josh Hunt <johunt@akamai.com>
> Date: Mon, 13 Mar 2017 12:38:39 -0500
>
>> On 03/13/2017 11:12 AM, Eric Dumazet wrote:
>>> On Mon, Mar 13, 2017 at 8:59 AM, Josh Hunt <johunt@akamai.com> wrote:
>>>> Allows application to read the amount of data sitting in the receive
>>>> queue.
>>>>
>>>> Signed-off-by: Josh Hunt <johunt@akamai.com>
>>>> ---
>>>>
>>>> A team here is looking for a way to get the amount of data in a UDP
>>>> socket's receive queue. It seems like this should be SIOCINQ, but for
>>>> UDP sockets that returns the size of the next pending datagram. I
>>>> implemented the patch below, but am wondering if this is the right
>>>> place for this change? I was debating between this or a new UDP ioctl.
>>>
>>> But what is the 'amount of data' exactly?
>>> Number of packets, amount of bytes to read from these packets?
>>
>> I meant bytes. I will clarify in the next version.
>
> As Eric is hinting, the calculation you are using doesn't represent
> this.
>
> You need to do something like walk the receive queue and add the
> skb->len values together.
>
> sk->sk_rmem_alloc is usually much larger than the sum of the skb->len
> values in the socket receive queue. I don't see how this accumulation
> of skb->truesize values is useful, whereas I can see how an application
> could want the summation of the skb->len values.

In this particular case they really do want to know the total # of bytes
in the receive queue, not the data bytes they can consume from an
application pov. The kernel currently only exposes this value through
netlink or /proc/net/udp from what I saw.

I believe Eric's suggestion in his previous mail was to export all of
these meminfo metrics via a single socket option call, similar to how
it's done in netlink. We could then use that for both call sites.

I agree that it would be useful to also have the data you and Eric are
suggesting exposed somewhere: the total # of skb->len bytes sitting in
the receive queue. I could add that as a second socket option. Does this
sound reasonable?

Josh
From: Josh Hunt <johunt@akamai.com>
Date: Mon, 13 Mar 2017 18:34:41 -0500

> In this particular case they really do want to know total # of bytes
> in the receive queue, not the data bytes they can consume from an
> application pov. The kernel currently only exposes this value through
> netlink or /proc/net/udp from what I saw.

Can you explain in what way this is useful?

The difference between skb->len and skb->truesize is really a kernel
internal implementation detail, and I'm trying to figure out why this
would be useful to an application.
On Mon, 2017-03-13 at 18:34 -0500, Josh Hunt wrote:
> In this particular case they really do want to know total # of bytes in
> the receive queue, not the data bytes they can consume from an
> application pov. The kernel currently only exposes this value through
> netlink or /proc/net/udp from what I saw.
>
> I believe Eric's suggestion in his previous mail was to export all of
> these meminfo metrics via a single socket option call similar to how its
> done in netlink. We could then use that for both call sites.
>
> I agree that it would be useful to also have the data you and Eric are
> suggesting exposed somewhere, the total # of skb->len bytes sitting in
> the receive queue. I could add that as a second socket option.

Please note that the UDP stack does not maintain a per-socket
sum(skb->len).

Implementing this in a system call would require locking the receive
queue (blocking BH) and iterating over a potentially huge skb list, or
adding a new socket field and adding/subtracting every skb->len as
packets are added to/removed from the receive queue.

So I would prefer not to provide this information; it looks like quite a
bit of bloat.
On 03/13/2017 07:10 PM, David Miller wrote:
> From: Josh Hunt <johunt@akamai.com>
> Date: Mon, 13 Mar 2017 18:34:41 -0500
>
>> In this particular case they really do want to know total # of bytes
>> in the receive queue, not the data bytes they can consume from an
>> application pov. The kernel currently only exposes this value through
>> netlink or /proc/net/udp from what I saw.
>
> Can you explain in what way this is useful?
>
> The difference between skb->len and skb->truesize is really kernel
> internal implementation detail, and I'm trying to figure out why
> this would be useful to an application.

First, it looks like my original patch was against an old kernel which
did not have the updated UDP accounting code. Not sure how that
happened; apologies for that. There's no need to add in the backlog, at
least for UDP now, so sk_rmem_alloc is all that is needed for my case.

The application here is interested in monitoring the amount of data in
the receive buffer: looking for and identifying overflows, and also
understanding how full it is. I know we already have SO_RXQ_OVFL, but
that only shows the # of drops on overflow.

We expose this (skmem) information via /proc and netlink today. It seems
like unnecessary overhead to require an application to also create a
netlink socket to get this data. Creating a socket option to mimic the
behavior of sock_diag_put_meminfo() and export all meminfo_vars would be
great if that's something you'd accept.

Josh
diff --git a/include/net/sock.h b/include/net/sock.h
index 5e59976..bcfed2ae 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -854,6 +854,11 @@ static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb)
 	skb->next = NULL;
 }
 
+static unsigned int sk_rcvqueue_size(const struct sock *sk)
+{
+	return sk->sk_backlog.len + atomic_read(&sk->sk_rmem_alloc);
+}
+
 /*
  * Take into account size of receive queue and backlog queue
  * Do not take into account this skb truesize,
@@ -861,7 +866,7 @@ static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb)
  */
 static inline bool sk_rcvqueues_full(const struct sock *sk, unsigned int limit)
 {
-	unsigned int qsize = sk->sk_backlog.len + atomic_read(&sk->sk_rmem_alloc);
+	unsigned int qsize = sk_rcvqueue_size(sk);
 
 	return qsize > limit;
 }
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 2c748dd..36b8cd5 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -94,4 +94,6 @@
 
 #define SCM_TIMESTAMPING_OPT_STATS	54
 
+#define SO_RCVQUEUE_SIZE	55
+
 #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index f6fd79f..fa69864 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1269,6 +1269,10 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 		v.val = sk->sk_incoming_cpu;
 		break;
 
+	case SO_RCVQUEUE_SIZE:
+		v.val = sk_rcvqueue_size(sk);
+		break;
+
 	default:
 		/* We implement the SO_SNDLOWAT etc to not be settable
 		 * (1003.1g 7).
Allows application to read the amount of data sitting in the receive
queue.

Signed-off-by: Josh Hunt <johunt@akamai.com>
---

A team here is looking for a way to get the amount of data in a UDP socket's
receive queue. It seems like this should be SIOCINQ, but for UDP sockets that
returns the size of the next pending datagram. I implemented the patch below,
but am wondering if this is the right place for this change? I was debating
between this or a new UDP ioctl.

 include/net/sock.h                | 7 ++++++-
 include/uapi/asm-generic/socket.h | 2 ++
 net/core/sock.c                   | 4 ++++
 3 files changed, 12 insertions(+), 1 deletion(-)