diff mbox

[nft,RFC] monitor: Support printing processes which caused the event

Message ID 20170510105510.891-1-phil@nwl.cc
State RFC
Delegated to: Pablo Neira
Headers show

Commit Message

Phil Sutter May 10, 2017, 10:55 a.m. UTC
This adds support for printing the process ID and name for changes which
'nft monitor' reports:

| nft -a -p monitor
| add chain ip t2 bla3 # pid 11616 (nft)

If '-n' was given in addition to '-p', parsing the process name from
/proc/<pid>/cmdline is suppressed.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Cc: Florian Westphal <fw@strlen.de>
---
 include/nftables.h |  1 +
 src/main.c         | 12 ++++++++++-
 src/netlink.c      | 60 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 src/rule.c         |  2 --
 4 files changed, 67 insertions(+), 8 deletions(-)

Comments

Florian Westphal May 10, 2017, 11:27 a.m. UTC | #1
Phil Sutter <phil@nwl.cc> wrote:
> This adds support for printing the process ID and name for changes which
> 'nft monitor' reports:
> 
> | nft -a -p monitor
> | add chain ip t2 bla3 # pid 11616 (nft)

This prints something else, see below.

> diff --git a/src/netlink.c b/src/netlink.c
> index 7e7261fe1e1d4..67a2f2a901ebe 100644
> --- a/src/netlink.c
> +++ b/src/netlink.c
> @@ -2068,6 +2068,40 @@ next:
>  	nftnl_expr_iter_destroy(nlrei);
>  }
>  
> +static const char *pid2name(uint32_t pid)
> +{
> +	static char buf[512];
> +	int fd, rc;
> +	char *p;
> +
> +	snprintf(buf, sizeof(buf), "/proc/%u/cmdline", pid);
> +	fd = open(buf, O_RDONLY);
> +	if (fd == -1)
> +		return "";
> +
> +	rc = read(fd, buf, sizeof(buf));

This should do a

buf[sizeof(buf) - 1] = 0;

to be on safe side.

> +static void print_pid(const struct nlmsghdr *nlh)
> +{
> +	const char *name;
> +
> +	if (!pid_output)
> +		return;
> +	printf(" # pid %u", nlh->nlmsg_pid);

nlmsg_pid is the netlink portid.

While most programs set it to their process id there is no guarantee.
Its just a (unique) 32 bit identifier.

Afaics one has to use /proc/net/netlink to map the portid to the inode
and then walk /proc/*/fd/* to find the socket with that inode.

Perhaps there is a simpler way, maybe you can check what ss is doing
and what info can be obtained via netlink diag.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso May 10, 2017, 11:34 a.m. UTC | #2
On Wed, May 10, 2017 at 12:55:10PM +0200, Phil Sutter wrote:
> This adds support for printing the process ID and name for changes which
> 'nft monitor' reports:
>
> | nft -a -p monitor
> | add chain ip t2 bla3 # pid 11616 (nft)
> 
> If '-n' was given in addition to '-p', parsing the process name from
> /proc/<pid>/cmdline is suppressed.

That is not the process ID, that is the netlink portID, that is pretty
much useless if we get a robot process with multiple netlink sockets
(likely given that it may want to lookup from interface index to
interface name).
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso May 10, 2017, 11:38 a.m. UTC | #3
On Wed, May 10, 2017 at 01:27:24PM +0200, Florian Westphal wrote:
> nlmsg_pid is the netlink portid.

Right.

> While most programs set it to their process id there is no guarantee.
> Its just a (unique) 32 bit identifier.

It's actually the kernel that decides what portID the socket gets
IIRC. For the first socket it uses the process ID, then for follow up
sockets, it just looks for a spare ID in the negative range of the 32
bit, if my memory serves well.

> Afaics one has to use /proc/net/netlink to map the portid to the inode
> and then walk /proc/*/fd/* to find the socket with that inode.
> 
> Perhaps there is a simpler way, maybe you can check what ss is doing
> and what info can be obtained via netlink diag.

I wouldn't be surprise if we need more kernel infrastructure to deal
with this. Parsing /proc for a netlink thing is definitely not ideal.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal May 10, 2017, 11:57 a.m. UTC | #4
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Wed, May 10, 2017 at 01:27:24PM +0200, Florian Westphal wrote:
> > While most programs set it to their process id there is no guarantee.
> > Its just a (unique) 32 bit identifier.
> 
> It's actually the kernel that decides what portID the socket gets
> IIRC. For the first socket it uses the process ID, then for follow up
> sockets, it just looks for a spare ID in the negative range of the 32
> bit, if my memory serves well.

Argh, yes, of course...

> > Afaics one has to use /proc/net/netlink to map the portid to the inode
> > and then walk /proc/*/fd/* to find the socket with that inode.
> > 
> > Perhaps there is a simpler way, maybe you can check what ss is doing
> > and what info can be obtained via netlink diag.
> 
> I wouldn't be surprise if we need more kernel infrastructure to deal
> with this. Parsing /proc for a netlink thing is definitely not ideal.

Yes.  From nft monitor point of view the most easy solution would be
if the process id (or even the name?) would be sent back to userspace in a netlink
attribute.  Do you think we can extend nf_tables to include
get_task_comm() name and/or pid when/if we send update notifications?

(The pid is actually not that useful as process might have exited
already).
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arturo Borrero Gonzalez May 10, 2017, 12:52 p.m. UTC | #5
On 10 May 2017 at 12:55, Phil Sutter <phil@nwl.cc> wrote:
> This adds support for printing the process ID and name for changes which
> 'nft monitor' reports:
>
> | nft -a -p monitor
> | add chain ip t2 bla3 # pid 11616 (nft)
>
> If '-n' was given in addition to '-p', parsing the process name from
> /proc/<pid>/cmdline is suppressed.
>
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> Cc: Florian Westphal <fw@strlen.de>
> ---
>  include/nftables.h |  1 +
>  src/main.c         | 12 ++++++++++-
>  src/netlink.c      | 60 +++++++++++++++++++++++++++++++++++++++++++++++++-----
>  src/rule.c         |  2 --
>  4 files changed, 67 insertions(+), 8 deletions(-)
>

If you are about to parse the textual nft output anyway, (which
doesn't seems like a good idea BTW),
why you don't simply add a rule comment?:

% nft add rule inet filter input counter comment "added by my app"
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Phil Sutter May 10, 2017, 2:02 p.m. UTC | #6
On Wed, May 10, 2017 at 02:52:49PM +0200, Arturo Borrero Gonzalez wrote:
> On 10 May 2017 at 12:55, Phil Sutter <phil@nwl.cc> wrote:
> > This adds support for printing the process ID and name for changes which
> > 'nft monitor' reports:
> >
> > | nft -a -p monitor
> > | add chain ip t2 bla3 # pid 11616 (nft)
> >
> > If '-n' was given in addition to '-p', parsing the process name from
> > /proc/<pid>/cmdline is suppressed.
> >
> > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > Cc: Florian Westphal <fw@strlen.de>
> > ---
> >  include/nftables.h |  1 +
> >  src/main.c         | 12 ++++++++++-
> >  src/netlink.c      | 60 +++++++++++++++++++++++++++++++++++++++++++++++++-----
> >  src/rule.c         |  2 --
> >  4 files changed, 67 insertions(+), 8 deletions(-)
> >
> 
> If you are about to parse the textual nft output anyway, (which
> doesn't seems like a good idea BTW),
> why you don't simply add a rule comment?:
> 
> % nft add rule inet filter input counter comment "added by my app"

Sometimes you don't control the instance adding the rule, then this is
not an option.

Cheers, Phil
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Phil Sutter May 10, 2017, 2:39 p.m. UTC | #7
Hi,

On Wed, May 10, 2017 at 01:57:48PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Wed, May 10, 2017 at 01:27:24PM +0200, Florian Westphal wrote:
> > > While most programs set it to their process id there is no guarantee.
> > > Its just a (unique) 32 bit identifier.
> > 
> > It's actually the kernel that decides what portID the socket gets
> > IIRC. For the first socket it uses the process ID, then for follow up
> > sockets, it just looks for a spare ID in the negative range of the 32
> > bit, if my memory serves well.
> 
> Argh, yes, of course...
> 
> > > Afaics one has to use /proc/net/netlink to map the portid to the inode
> > > and then walk /proc/*/fd/* to find the socket with that inode.
> > > 
> > > Perhaps there is a simpler way, maybe you can check what ss is doing
> > > and what info can be obtained via netlink diag.
> > 
> > I wouldn't be surprise if we need more kernel infrastructure to deal
> > with this. Parsing /proc for a netlink thing is definitely not ideal.
> 
> Yes.  From nft monitor point of view the most easy solution would be
> if the process id (or even the name?) would be sent back to userspace in a netlink
> attribute.  Do you think we can extend nf_tables to include
> get_task_comm() name and/or pid when/if we send update notifications?
> 
> (The pid is actually not that useful as process might have exited
> already).

The question is where to put it. Looking at the netlink message
structure, I see two options:

A) Extend struct nfgenmsg to contain PID and process name (a buffer of
   length TASK_COMM_LEN).
B) Add type-specific attributes for each type, like NFTA_RULE_PID and
   NFTA_RULE_PNAME.

The problem with A) is that it will break older user space expecting
sizeof(struct nfgenmsg) to be shorter. Additional attributes should not
be a problem here, but having to add them for each object type seems to
be a really ugly solution.

Cheers, Phil
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal May 10, 2017, 2:54 p.m. UTC | #8
Phil Sutter <phil@nwl.cc> wrote:
> Hi,
> 
> On Wed, May 10, 2017 at 01:57:48PM +0200, Florian Westphal wrote:
> > Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > On Wed, May 10, 2017 at 01:27:24PM +0200, Florian Westphal wrote:
> > > > While most programs set it to their process id there is no guarantee.
> > > > Its just a (unique) 32 bit identifier.
> > > 
> > > It's actually the kernel that decides what portID the socket gets
> > > IIRC. For the first socket it uses the process ID, then for follow up
> > > sockets, it just looks for a spare ID in the negative range of the 32
> > > bit, if my memory serves well.
> > 
> > Argh, yes, of course...
> > 
> > > > Afaics one has to use /proc/net/netlink to map the portid to the inode
> > > > and then walk /proc/*/fd/* to find the socket with that inode.
> > > > 
> > > > Perhaps there is a simpler way, maybe you can check what ss is doing
> > > > and what info can be obtained via netlink diag.
> > > 
> > > I wouldn't be surprise if we need more kernel infrastructure to deal
> > > with this. Parsing /proc for a netlink thing is definitely not ideal.
> > 
> > Yes.  From nft monitor point of view the most easy solution would be
> > if the process id (or even the name?) would be sent back to userspace in a netlink
> > attribute.  Do you think we can extend nf_tables to include
> > get_task_comm() name and/or pid when/if we send update notifications?
> > 
> > (The pid is actually not that useful as process might have exited
> > already).
> 
> The question is where to put it. Looking at the netlink message
> structure, I see two options:
> 
> A) Extend struct nfgenmsg to contain PID and process name (a buffer of
>    length TASK_COMM_LEN).
> B) Add type-specific attributes for each type, like NFTA_RULE_PID and
>    NFTA_RULE_PNAME.
> 
> The problem with A) is that it will break older user space expecting
> sizeof(struct nfgenmsg) to be shorter.

Right, A) is a non-starter.

> Additional attributes should not
> be a problem here, but having to add them for each object type seems to
> be a really ugly solution.

I don't find it ugly, but alternatively we could add a new type of info
sent at the beginning of the commit phase (before all the table/rule etc
updates) and include it there.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Phil Sutter May 10, 2017, 3:11 p.m. UTC | #9
On Wed, May 10, 2017 at 04:54:37PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > Hi,
> > 
> > On Wed, May 10, 2017 at 01:57:48PM +0200, Florian Westphal wrote:
> > > Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > > On Wed, May 10, 2017 at 01:27:24PM +0200, Florian Westphal wrote:
> > > > > While most programs set it to their process id there is no guarantee.
> > > > > Its just a (unique) 32 bit identifier.
> > > > 
> > > > It's actually the kernel that decides what portID the socket gets
> > > > IIRC. For the first socket it uses the process ID, then for follow up
> > > > sockets, it just looks for a spare ID in the negative range of the 32
> > > > bit, if my memory serves well.
> > > 
> > > Argh, yes, of course...
> > > 
> > > > > Afaics one has to use /proc/net/netlink to map the portid to the inode
> > > > > and then walk /proc/*/fd/* to find the socket with that inode.
> > > > > 
> > > > > Perhaps there is a simpler way, maybe you can check what ss is doing
> > > > > and what info can be obtained via netlink diag.
> > > > 
> > > > I wouldn't be surprise if we need more kernel infrastructure to deal
> > > > with this. Parsing /proc for a netlink thing is definitely not ideal.
> > > 
> > > Yes.  From nft monitor point of view the most easy solution would be
> > > if the process id (or even the name?) would be sent back to userspace in a netlink
> > > attribute.  Do you think we can extend nf_tables to include
> > > get_task_comm() name and/or pid when/if we send update notifications?
> > > 
> > > (The pid is actually not that useful as process might have exited
> > > already).
> > 
> > The question is where to put it. Looking at the netlink message
> > structure, I see two options:
> > 
> > A) Extend struct nfgenmsg to contain PID and process name (a buffer of
> >    length TASK_COMM_LEN).
> > B) Add type-specific attributes for each type, like NFTA_RULE_PID and
> >    NFTA_RULE_PNAME.
> > 
> > The problem with A) is that it will break older user space expecting
> > sizeof(struct nfgenmsg) to be shorter.
> 
> Right, A) is a non-starter.
> 
> > Additional attributes should not
> > be a problem here, but having to add them for each object type seems to
> > be a really ugly solution.
> 
> I don't find it ugly, but alternatively we could add a new type of info
> sent at the beginning of the commit phase (before all the table/rule etc
> updates) and include it there.

You mean as a separate netlink message? Then we would have to map that
message to the actual notification, no? Or do you think it's sufficient
to cache the last PID/name received and use it for the monitor messages
until an update to them comes in?

Thanks, Phil
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal May 10, 2017, 5:59 p.m. UTC | #10
Phil Sutter <phil@nwl.cc> wrote:
> > I don't find it ugly, but alternatively we could add a new type of info
> > sent at the beginning of the commit phase (before all the table/rule etc
> > updates) and include it there.
> 
> You mean as a separate netlink message? Then we would have to map that
> message to the actual notification, no? Or do you think it's sufficient
> to cache the last PID/name received and use it for the monitor messages
> until an update to them comes in?

I didn't check yet but all the notifications should contain the current
generation id and we also send a genid update after a transaction is
done so I suspect its enough to emit the name once per transaction.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso May 11, 2017, 6:41 a.m. UTC | #11
On Wed, May 10, 2017 at 07:59:20PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > > I don't find it ugly, but alternatively we could add a new type of info
> > > sent at the beginning of the commit phase (before all the table/rule etc
> > > updates) and include it there.
> > 
> > You mean as a separate netlink message? Then we would have to map that
> > message to the actual notification, no? Or do you think it's sufficient
> > to cache the last PID/name received and use it for the monitor messages
> > until an update to them comes in?
> 
> I didn't check yet but all the notifications should contain the current
> generation id and we also send a genid update after a transaction is
> done so I suspect its enough to emit the name once per transaction.

What is the usecase for this? Please don't tell me the obvious the
answer: I just want to know what process has modified what.

If the point is to know if someone else, not myself as a process, has
modified the ruleset, that is very easy to know with the netlink
infrastructure.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal May 11, 2017, 6:59 a.m. UTC | #12
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> What is the usecase for this? Please don't tell me the obvious the
> answer: I just want to know what process has modified what.
> 
> If the point is to know if someone else, not myself as a process, has
> modified the ruleset, that is very easy to know with the netlink
> infrastructure.

Yes, thats in fact more important than 'know what process has modified
what', although I think it would be nice from a debug-point of view,
i.e.

$self adds a rule
something else adds a rule at same time

How can $self learn/know the handle assigned by kernel?

The larger picture is to start thinking in direction of libnft,
i.e. get the groundwork going so we don't have to tell 3rd party tools
like firewalld to parse nft text output.

Ideally, I'd like to see a mechanism where the 3rd party tool can:
1. queue an arbitrary amount of updates (add/delete of rules, set
   elements etc.)
2. learn the unique handles assigned to these rules
   so that it can identifiy/remove each one of these rules.

Thomas Woerner suggested a way where userspace can assign unique handles
instead of the kernel but I don't like it because i found no way how the
kernel could enforce that such user-handles are unique without walking
all rules of a table for every transaction.

But currently its impossible to delete a rule again without parsing
'nft -a list table'.  'delete-by-name' is good of course, but, has
the same problems we have with iptables.  I like that we have unique
handles that would allow to 1:1 map every rule to a uniqeue identifier.

But right now its more of a guessing game as the inserting program
doesn't see the handle(s) synchronously, just via monitor.

[ I suspect I now see where you're going with this as the notifications
  contain the nlportid so at least a tool that inserts + listens for
  updates will know if a notification is due to changes by someone else
  or not ...].

Do all of these patches make more sense now?

But, ideally, I'd like to see traction in direction of libnft (I'd
prefer to split nft for this, i.e. gplv2-licensed library).
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Phil Sutter May 11, 2017, 8:27 a.m. UTC | #13
On Thu, May 11, 2017 at 08:59:27AM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > What is the usecase for this? Please don't tell me the obvious the
> > answer: I just want to know what process has modified what.
> > 
> > If the point is to know if someone else, not myself as a process, has
> > modified the ruleset, that is very easy to know with the netlink
> > infrastructure.
> 
> Yes, thats in fact more important than 'know what process has modified
> what', although I think it would be nice from a debug-point of view,
> i.e.
> 
> $self adds a rule
> something else adds a rule at same time
> 
> How can $self learn/know the handle assigned by kernel?
> 
> The larger picture is to start thinking in direction of libnft,
> i.e. get the groundwork going so we don't have to tell 3rd party tools
> like firewalld to parse nft text output.

Which is a rather pointless argument regarding the monitor changes -
neither parsing 'nft monitor' nor 'nft -a list ruleset' output is a
particularly good option. Assuming that there will be at least optional
NLM_F_ECHO in libnftables[1], programs will receive the handles for the
rules they add, probably even along with the full rule depending on the
yet to define API.

> Ideally, I'd like to see a mechanism where the 3rd party tool can:
> 1. queue an arbitrary amount of updates (add/delete of rules, set
>    elements etc.)
> 2. learn the unique handles assigned to these rules
>    so that it can identifiy/remove each one of these rules.
> 
> Thomas Woerner suggested a way where userspace can assign unique handles
> instead of the kernel but I don't like it because i found no way how the
> kernel could enforce that such user-handles are unique without walking
> all rules of a table for every transaction.
> 
> But currently its impossible to delete a rule again without parsing
> 'nft -a list table'.  'delete-by-name' is good of course, but, has
> the same problems we have with iptables.  I like that we have unique
> handles that would allow to 1:1 map every rule to a uniqeue identifier.

My point with all this is that delete-by-name simply doesn't exist yet,
and given the obstacles it will face I guess it will take a little while
until it's there. My synchronous handle output patch and these 'nft
monitor' enhancements will provide an alternative at least until
delete-by-name is there and actually usable. Also I don't know of a
single point why not both ways should be made available - it only
increases acceptance in users, which is a good thing per se.

> But right now its more of a guessing game as the inserting program
> doesn't see the handle(s) synchronously, just via monitor.

This is only reliable if a project uses libnftnl directly and hence
knows it's own portid.

Cheers, Phil

[1] I prefer the longer name just because it's more different to
libnftnl than just 'libnft'.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso May 11, 2017, 9:58 a.m. UTC | #14
On Thu, May 11, 2017 at 10:27:46AM +0200, Phil Sutter wrote:
> On Thu, May 11, 2017 at 08:59:27AM +0200, Florian Westphal wrote:
> > Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > What is the usecase for this? Please don't tell me the obvious the
> > > answer: I just want to know what process has modified what.
> > > 
> > > If the point is to know if someone else, not myself as a process, has
> > > modified the ruleset, that is very easy to know with the netlink
> > > infrastructure.
> > 
> > Yes, thats in fact more important than 'know what process has modified
> > what', although I think it would be nice from a debug-point of view,
> > i.e.
> > 
> > $self adds a rule
> > something else adds a rule at same time
> > 
> > How can $self learn/know the handle assigned by kernel?
> > 
> > The larger picture is to start thinking in direction of libnft,
> > i.e. get the groundwork going so we don't have to tell 3rd party tools
> > like firewalld to parse nft text output.
> 
> Which is a rather pointless argument regarding the monitor changes -
> neither parsing 'nft monitor' nor 'nft -a list ruleset' output is a
> particularly good option. Assuming that there will be at least optional
> NLM_F_ECHO in libnftables[1], programs will receive the handles for the
> rules they add, probably even along with the full rule depending on the
> yet to define API.
> 
> > Ideally, I'd like to see a mechanism where the 3rd party tool can:
> > 1. queue an arbitrary amount of updates (add/delete of rules, set
> >    elements etc.)
> > 2. learn the unique handles assigned to these rules
> >    so that it can identifiy/remove each one of these rules.
> > 
> > Thomas Woerner suggested a way where userspace can assign unique handles
> > instead of the kernel but I don't like it because i found no way how the
> > kernel could enforce that such user-handles are unique without walking
> > all rules of a table for every transaction.
> > 
> > But currently its impossible to delete a rule again without parsing
> > 'nft -a list table'.  'delete-by-name' is good of course, but, has
> > the same problems we have with iptables.  I like that we have unique
> > handles that would allow to 1:1 map every rule to a uniqeue identifier.
> 
> My point with all this is that delete-by-name simply doesn't exist yet,
> and given the obstacles it will face I guess it will take a little while
> until it's there. My synchronous handle output patch and these 'nft
> monitor' enhancements will provide an alternative at least until
> delete-by-name is there and actually usable. Also I don't know of a
> single point why not both ways should be made available - it only
> increases acceptance in users, which is a good thing per se.

Add a --echo option to nft that emerges the NLM_F_ECHO flag, so
basically this prints the result of what you added into the kernel.
This will also work with nft -i. No value returned via variable
though.

> > But right now its more of a guessing game as the inserting program
> > doesn't see the handle(s) synchronously, just via monitor.
> 
> This is only reliable if a project uses libnftnl directly and hence
> knows it's own portid.
> 
> Cheers, Phil
> 
> [1] I prefer the longer name just because it's more different to
> libnftnl than just 'libnft'.

We can start a poll with a bunch pre-selected choices, actually I only
see two at this moment.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso May 11, 2017, 9:59 a.m. UTC | #15
On Thu, May 11, 2017 at 08:59:27AM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > What is the usecase for this? Please don't tell me the obvious the
> > answer: I just want to know what process has modified what.
> > 
> > If the point is to know if someone else, not myself as a process, has
> > modified the ruleset, that is very easy to know with the netlink
> > infrastructure.
> 
> Yes, thats in fact more important than 'know what process has modified
> what', although I think it would be nice from a debug-point of view,
> i.e.
> 
> $self adds a rule
> something else adds a rule at same time
> 
> How can $self learn/know the handle assigned by kernel?

Using the NLM_F_ECHO flag. From the netlink multicast side, the
nlmsg_pid also tell us if it's being self who added this entry.

> The larger picture is to start thinking in direction of libnft,
> i.e. get the groundwork going so we don't have to tell 3rd party tools
> like firewalld to parse nft text output.

> Ideally, I'd like to see a mechanism where the 3rd party tool can:
> 1. queue an arbitrary amount of updates (add/delete of rules, set
>    elements etc.)
> 2. learn the unique handles assigned to these rules
>    so that it can identifiy/remove each one of these rules.
>
> Thomas Woerner suggested a way where userspace can assign unique handles
> instead of the kernel but I don't like it because i found no way how the
> kernel could enforce that such user-handles are unique without walking
> all rules of a table for every transaction.

We already have a temporary id, see NFTA_SET_ID and NFTA_RULE_ID. This
id though is only valid during the transaction, so it just goes away
after it. But it is sufficient to solve the problem you describe. It
should be easy to place it in the event message, so we send it back to
userspace. Thus, userspace can correlate the NFTA_RULE_ID that you
selected and the NFTA_RULE_HANDLE the kernel assigns for you if you
want know what rule you added relates to the one you got,
specifically.

> But currently its impossible to delete a rule again without parsing
> 'nft -a list table'.  'delete-by-name' is good of course, but, has
> the same problems we have with iptables.  I like that we have unique
> handles that would allow to 1:1 map every rule to a uniqeue identifier.

Command line is a different front. This is already solved from a
programmatic point of view. So I think it is just a matter of emerging
the echo feature from there.

> But right now its more of a guessing game as the inserting program
> doesn't see the handle(s) synchronously, just via monitor.
> 
> [ I suspect I now see where you're going with this as the notifications
>   contain the nlportid so at least a tool that inserts + listens for
>   updates will know if a notification is due to changes by someone else
>   or not ...].
> 
> Do all of these patches make more sense now?
> 
> But, ideally, I'd like to see traction in direction of libnft (I'd
> prefer to split nft for this, i.e. gplv2-licensed library).

Eric Leblond handed over several patches to me, I made some progress
on them back in december too. Let me sort out this a bit so we can
bootstrap a separated repository and keep discussing on this.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/nftables.h b/include/nftables.h
index 6f541557364ba..9107b56b26151 100644
--- a/include/nftables.h
+++ b/include/nftables.h
@@ -29,6 +29,7 @@  extern unsigned int numeric_output;
 extern unsigned int stateless_output;
 extern unsigned int ip2name_output;
 extern unsigned int handle_output;
+extern unsigned int pid_output;
 extern unsigned int debug_level;
 extern const char *include_paths[INCLUDE_PATHS_MAX];
 
diff --git a/src/main.c b/src/main.c
index 1cc8b39ff4ab9..670e6791e8877 100644
--- a/src/main.c
+++ b/src/main.c
@@ -33,6 +33,7 @@  unsigned int numeric_output;
 unsigned int stateless_output;
 unsigned int ip2name_output;
 unsigned int handle_output;
+unsigned int pid_output;
 #ifdef DEBUG
 unsigned int debug_level;
 #endif
@@ -51,10 +52,11 @@  enum opt_vals {
 	OPT_IP2NAME		= 'N',
 	OPT_DEBUG		= 'd',
 	OPT_HANDLE_OUTPUT	= 'a',
+	OPT_PID_OUTPUT		= 'p',
 	OPT_INVALID		= '?',
 };
 
-#define OPTSTRING	"hvf:iI:vnsNa"
+#define OPTSTRING	"hvf:iI:vnsNap"
 
 static const struct option options[] = {
 	{
@@ -103,6 +105,10 @@  static const struct option options[] = {
 		.val		= OPT_HANDLE_OUTPUT,
 	},
 	{
+		.name		= "pid",
+		.val		= OPT_PID_OUTPUT,
+	},
+	{
 		.name		= NULL
 	}
 };
@@ -125,6 +131,7 @@  static void show_help(const char *name)
 "  -s, --stateless		Omit stateful information of ruleset.\n"
 "  -N				Translate IP addresses to names.\n"
 "  -a, --handle			Output rule handle.\n"
+"  -p, --pid			Output process information in monitor.\n"
 "  -I, --includepath <directory>	Add <directory> to the paths searched for include files.\n"
 #ifdef DEBUG
 "  --debug <level [,level...]>	Specify debugging level (scanner, parser, eval, netlink, mnl, proto-ctx, segtree, all)\n"
@@ -333,6 +340,9 @@  int main(int argc, char * const *argv)
 		case OPT_HANDLE_OUTPUT:
 			handle_output++;
 			break;
+		case OPT_PID_OUTPUT:
+			pid_output++;
+			break;
 		case OPT_INVALID:
 			exit(NFT_EXIT_FAILURE);
 		}
diff --git a/src/netlink.c b/src/netlink.c
index 7e7261fe1e1d4..67a2f2a901ebe 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -2068,6 +2068,40 @@  next:
 	nftnl_expr_iter_destroy(nlrei);
 }
 
+static const char *pid2name(uint32_t pid)
+{
+	static char buf[512];
+	int fd, rc;
+	char *p;
+
+	snprintf(buf, sizeof(buf), "/proc/%u/cmdline", pid);
+	fd = open(buf, O_RDONLY);
+	if (fd == -1)
+		return "";
+
+	rc = read(fd, buf, sizeof(buf));
+	close (fd);
+	if (rc < 0)
+		return "";
+
+	p = strrchr(buf, '/');
+	return p ? p + 1 : buf;
+}
+
+static void print_pid(const struct nlmsghdr *nlh)
+{
+	const char *name;
+
+	if (!pid_output)
+		return;
+	printf(" # pid %u", nlh->nlmsg_pid);
+
+	if (numeric_output)
+		return;
+	name = pid2name(nlh->nlmsg_pid);
+	printf(" (%s)", name ? : "?");
+}
+
 static int netlink_events_table_cb(const struct nlmsghdr *nlh, int type,
 				   struct netlink_mon_handler *monh)
 {
@@ -2089,8 +2123,10 @@  static int netlink_events_table_cb(const struct nlmsghdr *nlh, int type,
 
 		family = nftnl_table_get_u32(nlt, NFTNL_TABLE_FAMILY);
 
-		printf("%s %s\n", family2str(family),
+		printf("%s %s", family2str(family),
 		       nftnl_table_get_str(nlt, NFTNL_TABLE_NAME));
+		print_pid(nlh);
+		printf("\n");
 		break;
 	case NFTNL_OUTPUT_XML:
 	case NFTNL_OUTPUT_JSON:
@@ -2125,12 +2161,16 @@  static int netlink_events_chain_cb(const struct nlmsghdr *nlh, int type,
 			c = netlink_delinearize_chain(monh->ctx, nlc);
 			chain_print_plain(c);
 			chain_free(c);
+			print_pid(nlh);
+			printf("\n");
 			break;
 		case NFT_MSG_DELCHAIN:
 			family = nftnl_chain_get_u32(nlc, NFTNL_CHAIN_FAMILY);
-			printf("delete chain %s %s %s\n", family2str(family),
+			printf("delete chain %s %s %s", family2str(family),
 			       nftnl_chain_get_str(nlc, NFTNL_CHAIN_TABLE),
 			       nftnl_chain_get_str(nlc, NFTNL_CHAIN_NAME));
+			print_pid(nlh);
+			printf("\n");
 			break;
 		}
 		break;
@@ -2170,14 +2210,17 @@  static int netlink_events_set_cb(const struct nlmsghdr *nlh, int type,
 			}
 			set_print_plain(set);
 			set_free(set);
+			print_pid(nlh);
 			printf("\n");
 			break;
 		case NFT_MSG_DELSET:
 			family = nftnl_set_get_u32(nls, NFTNL_SET_FAMILY);
-			printf("delete set %s %s %s\n",
+			printf("delete set %s %s %s",
 			       family2str(family),
 			       nftnl_set_get_str(nls, NFTNL_SET_TABLE),
 			       nftnl_set_get_str(nls, NFTNL_SET_NAME));
+			print_pid(nlh);
+			printf("\n");
 			break;
 		}
 		break;
@@ -2257,6 +2300,7 @@  static int netlink_events_setelem_cb(const struct nlmsghdr *nlh, int type,
 		}
 		printf("element %s %s %s ", family2str(family), table, setname);
 		expr_print(dummyset->init);
+		print_pid(nlh);
 		printf("\n");
 
 		set_free(dummyset);
@@ -2294,15 +2338,18 @@  static int netlink_events_obj_cb(const struct nlmsghdr *nlh, int type,
 			}
 			obj_print_plain(obj);
 			obj_free(obj);
+			print_pid(nlh);
 			printf("\n");
 			break;
 		case NFT_MSG_DELOBJ:
 			family = nftnl_obj_get_u32(nlo, NFTNL_OBJ_FAMILY);
-			printf("delete %s %s %s %s\n",
+			printf("delete %s %s %s %s",
 			       obj_type_name(nftnl_obj_get_u32(nlo, NFTNL_OBJ_TYPE)),
 			       family2str(family),
 			       nftnl_obj_get_str(nlo, NFTNL_OBJ_TABLE),
 			       nftnl_obj_get_str(nlo, NFTNL_OBJ_NAME));
+			print_pid(nlh);
+			printf("\n");
 			break;
 		}
 		break;
@@ -2351,13 +2398,16 @@  static int netlink_events_rule_cb(const struct nlmsghdr *nlh, int type,
 
 			printf("add rule %s %s %s ", family, table, chain);
 			rule_print(r);
+			print_pid(nlh);
 			printf("\n");
 
 			rule_free(r);
 			break;
 		case NFT_MSG_DELRULE:
-			printf("delete rule %s %s %s handle %u\n",
+			printf("delete rule %s %s %s handle %u",
 			       family, table, chain, (unsigned int)handle);
+			print_pid(nlh);
+			printf("\n");
 			break;
 		}
 		break;
diff --git a/src/rule.c b/src/rule.c
index 209cf2d7d9f60..10095320348eb 100644
--- a/src/rule.c
+++ b/src/rule.c
@@ -691,8 +691,6 @@  void chain_print_plain(const struct chain *chain)
 		       chain->type, chain->hookstr,
 		       chain->priority, chain_policy2str(chain->policy));
 	}
-
-	printf("\n");
 }
 
 struct table *table_alloc(void)