diff mbox

net: add tracepoints to socket api

Message ID 20090126195930.GA28208@hmsreliant.think-freely.org
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Neil Horman Jan. 26, 2009, 7:59 p.m. UTC
Hey all-
	I've been working for a bit with the various tracepoints that have been
developed in the lttng tree.  The infrastructure is available in the main line
kernel.  All thats missing are the tracepoints themselves.  While some seem to
be contentious for various reasons, most seem benign, and can be rather usefull,
so I was thinking we could start importing Some for the net stack.  Heres a
patch for the top level socket api tracepoints.  If these are acceptable, I'll
begin porting over the rest for the remainder of the network stack.   This patch
applies cleanly against the net-next tree.


Regards
Neil

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 socket.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Jan. 27, 2009, 1:22 a.m. UTC | #1
From: Neil Horman <nhorman@tuxdriver.com>
Date: Mon, 26 Jan 2009 14:59:30 -0500

> @@ -95,6 +95,7 @@
>  
>  #include <net/sock.h>
>  #include <linux/netfilter.h>
> +#include <trace/socket.h>

Where is this mysterious include/trace/socket.h file?
I can't seem to find it :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnaldo Carvalho de Melo Jan. 27, 2009, 1:31 a.m. UTC | #2
Em Mon, Jan 26, 2009 at 05:22:53PM -0800, David Miller escreveu:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Mon, 26 Jan 2009 14:59:30 -0500
> 
> > @@ -95,6 +95,7 @@
> >  
> >  #include <net/sock.h>
> >  #include <linux/netfilter.h>
> > +#include <trace/socket.h>
> 
> Where is this mysterious include/trace/socket.h file?
> I can't seem to find it :-)

Its something along the lines of include/trace/block.h, with the
DECLARE_TRACE that match the DEFINE_TRACE entries he added, probably he
forgot to git-add this new file :-)

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Jan. 27, 2009, 5:57 a.m. UTC | #3
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Mon, 26 Jan 2009 23:31:27 -0200

> Em Mon, Jan 26, 2009 at 05:22:53PM -0800, David Miller escreveu:
> > From: Neil Horman <nhorman@tuxdriver.com>
> > Date: Mon, 26 Jan 2009 14:59:30 -0500
> > 
> > > @@ -95,6 +95,7 @@
> > >  
> > >  #include <net/sock.h>
> > >  #include <linux/netfilter.h>
> > > +#include <trace/socket.h>
> > 
> > Where is this mysterious include/trace/socket.h file?
> > I can't seem to find it :-)
> 
> Its something along the lines of include/trace/block.h, with the
> DECLARE_TRACE that match the DEFINE_TRACE entries he added, probably he
> forgot to git-add this new file :-)

I kind of figured as much, so now Neil just needs to respin his
patch with the relevant bits of that header file included.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman Jan. 27, 2009, 12:09 p.m. UTC | #4
On Mon, Jan 26, 2009 at 09:57:03PM -0800, David Miller wrote:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date: Mon, 26 Jan 2009 23:31:27 -0200
> 
> > Em Mon, Jan 26, 2009 at 05:22:53PM -0800, David Miller escreveu:
> > > From: Neil Horman <nhorman@tuxdriver.com>
> > > Date: Mon, 26 Jan 2009 14:59:30 -0500
> > > 
> > > > @@ -95,6 +95,7 @@
> > > >  
> > > >  #include <net/sock.h>
> > > >  #include <linux/netfilter.h>
> > > > +#include <trace/socket.h>
> > > 
> > > Where is this mysterious include/trace/socket.h file?
> > > I can't seem to find it :-)
> > 
> > Its something along the lines of include/trace/block.h, with the
> > DECLARE_TRACE that match the DEFINE_TRACE entries he added, probably he
> > forgot to git-add this new file :-)
> 
> I kind of figured as much, so now Neil just needs to respin his
> patch with the relevant bits of that header file included.
> 

Grr, knew something like this would happen.  Its apparently impossible for me to
send something up without missing something obvious :).  My bad.  I'll
respin/repost in just a bit
Neil

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Jan. 27, 2009, 5:18 p.m. UTC | #5
On Mon, Jan 26, 2009 at 02:59:30PM -0500, Neil Horman wrote:
> Hey all-
> 	I've been working for a bit with the various tracepoints that have been
> developed in the lttng tree.  The infrastructure is available in the main line
> kernel.  All thats missing are the tracepoints themselves.  While some seem to
> be contentious for various reasons, most seem benign, and can be rather usefull,
> so I was thinking we could start importing Some for the net stack.  Heres a
> patch for the top level socket api tracepoints.  If these are acceptable, I'll
> begin porting over the rest for the remainder of the network stack.   This patch
> applies cleanly against the net-next tree.

Do we now have lttng in linux-next?  Otherwise these trace point would
be useless there without more patching.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Jan. 27, 2009, 5:20 p.m. UTC | #6
From: Christoph Hellwig <hch@infradead.org>
Date: Tue, 27 Jan 2009 12:18:57 -0500

> Do we now have lttng in linux-next?  Otherwise these trace point would
> be useless there without more patching.

We're merging this stuff to solve the chicken and egg
problem wherein the lttng tree merge is basically
stalled.

If the individual subsystem annotations go in, the hope
is that they'll have less to merge and thus it's more likely
to actually happen.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Jan. 27, 2009, 5:23 p.m. UTC | #7
On Tue, Jan 27, 2009 at 09:20:17AM -0800, David Miller wrote:
> From: Christoph Hellwig <hch@infradead.org>
> Date: Tue, 27 Jan 2009 12:18:57 -0500
> 
> > Do we now have lttng in linux-next?  Otherwise these trace point would
> > be useless there without more patching.
> 
> We're merging this stuff to solve the chicken and egg
> problem wherein the lttng tree merge is basically
> stalled.
> 
> If the individual subsystem annotations go in, the hope
> is that they'll have less to merge and thus it's more likely
> to actually happen.

I'm rather concerned as I haven't seen any progress on the lttng core
lately.  I'd really prefer to have a version in close to mergeable shape
first, that's actively beeing pushed.  Adding the instrumentation is
trivial as seen by this small patch, but getting the core right (and who
knows if that involves changing the way actual instrumentation works, it's
not like that hasn't changed n million times yet) is essential.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Jan. 27, 2009, 6:21 p.m. UTC | #8
From: Christoph Hellwig <hch@infradead.org>
Date: Tue, 27 Jan 2009 12:23:23 -0500

> On Tue, Jan 27, 2009 at 09:20:17AM -0800, David Miller wrote:
> > We're merging this stuff to solve the chicken and egg
> > problem wherein the lttng tree merge is basically
> > stalled.
> > 
> > If the individual subsystem annotations go in, the hope
> > is that they'll have less to merge and thus it's more likely
> > to actually happen.
> 
> I'm rather concerned as I haven't seen any progress on the lttng core
> lately.  I'd really prefer to have a version in close to mergeable shape
> first, that's actively beeing pushed.  Adding the instrumentation is
> trivial as seen by this small patch, but getting the core right (and who
> knows if that involves changing the way actual instrumentation works, it's
> not like that hasn't changed n million times yet) is essential.

The block layer already merged trace annotations.  I therefore see no
harm in merging the net bits.

If you disagree, either submit a revert of the block layer
annotations, or content yourself in being a hypocrite :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Jan. 27, 2009, 6:47 p.m. UTC | #9
On Tue, Jan 27, 2009 at 10:21:41AM -0800, David Miller wrote:
> The block layer already merged trace annotations.  I therefore see no
> harm in merging the net bits.

The block layer also has a consumer for them (blktrace) and soon a
second one (the ftrace plugin)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman Jan. 27, 2009, 8:42 p.m. UTC | #10
On Tue, Jan 27, 2009 at 01:47:04PM -0500, Christoph Hellwig wrote:
> On Tue, Jan 27, 2009 at 10:21:41AM -0800, David Miller wrote:
> > The block layer already merged trace annotations.  I therefore see no
> > harm in merging the net bits.
> 
> The block layer also has a consumer for them (blktrace) and soon a
> second one (the ftrace plugin)
> 
> 

FWIW, ftrace is just as usefull for the network syscalls as any other set of
syscalls (hence my starting with the top layer of tracepoints).  I've also got
another use case thats been requested of me.  Specifically its been requested
that we have some unified method for tracking network drops in one place, so
that users don't have to use an amalgamation of tools to check statistics in
several places to find where their netowork packets are getting dropped.  I
thought defining a set of tracepoints to detect such drops might make the start
of a good solution.  Granted the existing tracepoints don't cover that use case
very well, but I like the idea of having the existing tracepoints in place
before I go adding new ones (to ensure that I don't duplicate effort).

Regards
Neil

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman Jan. 28, 2009, 9:28 p.m. UTC | #11
On Tue, Jan 27, 2009 at 01:47:04PM -0500, Christoph Hellwig wrote:
> On Tue, Jan 27, 2009 at 10:21:41AM -0800, David Miller wrote:
> > The block layer already merged trace annotations.  I therefore see no
> > harm in merging the net bits.
> 
> The block layer also has a consumer for them (blktrace) and soon a
> second one (the ftrace plugin)
> 
> 

FWIW, ftrace will also work well on these syscalls (which I think is rather
usefull).  Another use case came across my path the other day too.  I've had
several requests from various people to find a way to provide unifying frame
drop indicators.  Basically people hate to have to dig through various stat
files and tools to look at all the places a frame can be dropped within our
network stack.  I think a nice solution to this problem might be some
tracepoints that people could use to gather comprehensive tx/rx frame drop
information using one tool.  Of course, these provided tracepoints don't do
that, but I think it would be good to get the existing tracepoint set in before
I go poking at an implementation of that, just to ensure I avoid duplicate work

Neil


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Feb. 1, 2009, 10:02 a.m. UTC | #12
From: Neil Horman <nhorman@tuxdriver.com>
Date: Wed, 28 Jan 2009 16:28:09 -0500

> Of course, these provided tracepoints don't do that, but I think it
> would be good to get the existing tracepoint set in before I go
> poking at an implementation of that, just to ensure I avoid
> duplicate work

Christoph's point is that it doesn't make any sense to merge
this stuff in without any upstream users.

The block stuff has such users, this net syscall trace stuff doesn't.

Now your "where is the packet dropped" trace thing, that would
something we could merge once that patch and a suitable tool exists.
Just like the block layer cases.

So I have to toss this patch for now, sorry.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman Feb. 1, 2009, 6:53 p.m. UTC | #13
On Sun, Feb 01, 2009 at 02:02:47AM -0800, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Wed, 28 Jan 2009 16:28:09 -0500
> 
> > Of course, these provided tracepoints don't do that, but I think it
> > would be good to get the existing tracepoint set in before I go
> > poking at an implementation of that, just to ensure I avoid
> > duplicate work
> 
> Christoph's point is that it doesn't make any sense to merge
> this stuff in without any upstream users.
> 
> The block stuff has such users, this net syscall trace stuff doesn't.
> 
Not sure thats entirely true.  Frysk can use the net syscall trace points just
as well as it can monitor the block syscalls (Unless I'm missing somethig here).
Granted thats not applicable for the deeper tracepoints, but for the top level
stuff...

> Now your "where is the packet dropped" trace thing, that would
> something we could merge once that patch and a suitable tool exists.
> Just like the block layer cases.
> 
> So I have to toss this patch for now, sorry.
> 
Thats ok, thats why I started small. :).  I'll take a more serious look at
implementing a consise drop monitoring utilty and come back to this.

Thanks for the eyes all!
Neil

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/socket.c b/net/socket.c
index 35dd737..6cf37f8 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -95,6 +95,7 @@ 
 
 #include <net/sock.h>
 #include <linux/netfilter.h>
+#include <trace/socket.h>
 
 static int sock_no_open(struct inode *irrelevant, struct file *dontcare);
 static ssize_t sock_aio_read(struct kiocb *iocb, const struct iovec *iov,
@@ -155,6 +156,11 @@  static const struct net_proto_family *net_families[NPROTO] __read_mostly;
 
 static DEFINE_PER_CPU(int, sockets_in_use) = 0;
 
+DEFINE_TRACE(socket_sendmsg);
+DEFINE_TRACE(socket_recvmsg);
+DEFINE_TRACE(socket_create);
+DEFINE_TRACE(socket_call);
+
 /*
  * Support routines.
  * Move socket addresses back and forth across the kernel/user
@@ -574,6 +580,7 @@  int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	ret = __sock_sendmsg(&iocb, sock, msg, size);
 	if (-EIOCBQUEUED == ret)
 		ret = wait_on_sync_kiocb(&iocb);
+	trace_socket_sendmsg(sock, msg, size, ret);
 	return ret;
 }
 
@@ -653,10 +660,12 @@  int sock_recvmsg(struct socket *sock, struct msghdr *msg,
 	int ret;
 
 	init_sync_kiocb(&iocb, NULL);
+
 	iocb.private = &siocb;
 	ret = __sock_recvmsg(&iocb, sock, msg, size, flags);
 	if (-EIOCBQUEUED == ret)
 		ret = wait_on_sync_kiocb(&iocb);
+	trace_socket_recvmsg(sock, msg, size, flags, ret);
 	return ret;
 }
 
@@ -1242,6 +1251,7 @@  SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol)
 	if (retval < 0)
 		goto out_release;
 
+	trace_socket_create(sock, retval);
 out:
 	/* It may be already another descriptor 8) Not kernel problem. */
 	return retval;
@@ -2064,6 +2074,8 @@  SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args)
 	a0 = a[0];
 	a1 = a[1];
 
+	trace_socket_call(call, a0);
+
 	switch (call) {
 	case SYS_SOCKET:
 		err = sys_socket(a0, a1, a[2]);