diff mbox

[iptables] iptables: use IPC semaphore instead of abstract unix sockets

Message ID 1421615616-23053-1-git-send-email-pablo@netfilter.org
State Not Applicable
Delegated to: Pablo Neira
Headers show

Commit Message

Pablo Neira Ayuso Jan. 18, 2015, 9:13 p.m. UTC
Abstract unix sockets cannot be used to synchronize several concurrent
instances of iptables since an unpriviledged process can create them and
prevent the legitimate iptables instance from running.

This patch introduces a semaphore that is identified by the path to the
iptables binary, it also relies on SEM_UNDO so the kernel performs the
up() operation at process exit to avoid races with signals. This also
avoid file locks that require a writable filesystem.

Fixes: 93587a0 ("ip[6]tables: Add locking to prevent concurrent instances")
Reported-by: Lennart Poettering <lennart@poettering.net>
Cc: Phil Oester <kernel@linuxace.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/xshared.c |  122 ++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 105 insertions(+), 17 deletions(-)

Comments

Jan Engelhardt Jan. 18, 2015, 9:28 p.m. UTC | #1
On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote:

>This patch introduces a semaphore
>index b18022e..80eed2c 100644
>--- a/iptables/xshared.c
>+++ b/iptables/xshared.c
>+static int xtables_check_owner(int semid)
>+{
>+	int ret;
>+	struct semid_ds ds;
>+
>+	ret = semctl(semid, 0, IPC_STAT, &ds);

Is there a particular reason you are not using the POSIX semaphores?
[sem_open/shm_open as per sem_overview(7)].
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Jan. 19, 2015, 12:24 p.m. UTC | #2
On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote:
> On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote:
> 
> >This patch introduces a semaphore
> >index b18022e..80eed2c 100644
> >--- a/iptables/xshared.c
> >+++ b/iptables/xshared.c
> >+static int xtables_check_owner(int semid)
> >+{
> >+	int ret;
> >+	struct semid_ds ds;
> >+
> >+	ret = semctl(semid, 0, IPC_STAT, &ds);
> 
> Is there a particular reason you are not using the POSIX semaphores?
> [sem_open/shm_open as per sem_overview(7)].

Please, read the patch description:

"This patch introduces a semaphore that is identified by the path to
the iptables binary, it also relies on SEM_UNDO so the kernel performs
the up() operation at process exit to avoid races with signals. This
also avoids file locks that require a writable filesystem."
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy Jan. 19, 2015, 12:51 p.m. UTC | #3
On 19.01, Pablo Neira Ayuso wrote:
> On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote:
> > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote:
> > 
> > >This patch introduces a semaphore
> > >index b18022e..80eed2c 100644
> > >--- a/iptables/xshared.c
> > >+++ b/iptables/xshared.c
> > >+static int xtables_check_owner(int semid)
> > >+{
> > >+	int ret;
> > >+	struct semid_ds ds;
> > >+
> > >+	ret = semctl(semid, 0, IPC_STAT, &ds);
> > 
> > Is there a particular reason you are not using the POSIX semaphores?
> > [sem_open/shm_open as per sem_overview(7)].
> 
> Please, read the patch description:
> 
> "This patch introduces a semaphore that is identified by the path to
> the iptables binary, it also relies on SEM_UNDO so the kernel performs
> the up() operation at process exit to avoid races with signals. This
> also avoids file locks that require a writable filesystem."

Is it wise to use the path? Not that its very common, but multiple
binaries would still race. Any reason you chose not to use something
globally unique?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy Jan. 19, 2015, 12:59 p.m. UTC | #4
On 19.01, Pablo Neira Ayuso wrote:
> On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote:
> > On 19.01, Pablo Neira Ayuso wrote:
> > > On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote:
> > > > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote:
> > > > 
> > > > >This patch introduces a semaphore
> > > > >index b18022e..80eed2c 100644
> > > > >--- a/iptables/xshared.c
> > > > >+++ b/iptables/xshared.c
> > > > >+static int xtables_check_owner(int semid)
> > > > >+{
> > > > >+	int ret;
> > > > >+	struct semid_ds ds;
> > > > >+
> > > > >+	ret = semctl(semid, 0, IPC_STAT, &ds);
> > > > 
> > > > Is there a particular reason you are not using the POSIX semaphores?
> > > > [sem_open/shm_open as per sem_overview(7)].
> > > 
> > > Please, read the patch description:
> > > 
> > > "This patch introduces a semaphore that is identified by the path to
> > > the iptables binary, it also relies on SEM_UNDO so the kernel performs
> > > the up() operation at process exit to avoid races with signals. This
> > > also avoids file locks that require a writable filesystem."
> > 
> > Is it wise to use the path? Not that its very common, but multiple
> > binaries would still race. Any reason you chose not to use something
> > globally unique?
> 
> What kind of race are you worrying about?

Multiple iptables binaries, which would obviously have different paths.
As I said, not common, but possible.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Jan. 19, 2015, 1 p.m. UTC | #5
On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote:
> On 19.01, Pablo Neira Ayuso wrote:
> > On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote:
> > > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote:
> > > 
> > > >This patch introduces a semaphore
> > > >index b18022e..80eed2c 100644
> > > >--- a/iptables/xshared.c
> > > >+++ b/iptables/xshared.c
> > > >+static int xtables_check_owner(int semid)
> > > >+{
> > > >+	int ret;
> > > >+	struct semid_ds ds;
> > > >+
> > > >+	ret = semctl(semid, 0, IPC_STAT, &ds);
> > > 
> > > Is there a particular reason you are not using the POSIX semaphores?
> > > [sem_open/shm_open as per sem_overview(7)].
> > 
> > Please, read the patch description:
> > 
> > "This patch introduces a semaphore that is identified by the path to
> > the iptables binary, it also relies on SEM_UNDO so the kernel performs
> > the up() operation at process exit to avoid races with signals. This
> > also avoids file locks that require a writable filesystem."
> 
> Is it wise to use the path? Not that its very common, but multiple
> binaries would still race. Any reason you chose not to use something
> globally unique?

What kind of race are you worrying about?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Jan. 19, 2015, 1:06 p.m. UTC | #6
On Mon, Jan 19, 2015 at 02:00:24PM +0100, Pablo Neira Ayuso wrote:
> On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote:
> > On 19.01, Pablo Neira Ayuso wrote:
> > > On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote:
> > > > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote:
> > > > 
> > > > >This patch introduces a semaphore
> > > > >index b18022e..80eed2c 100644
> > > > >--- a/iptables/xshared.c
> > > > >+++ b/iptables/xshared.c
> > > > >+static int xtables_check_owner(int semid)
> > > > >+{
> > > > >+	int ret;
> > > > >+	struct semid_ds ds;
> > > > >+
> > > > >+	ret = semctl(semid, 0, IPC_STAT, &ds);
> > > > 
> > > > Is there a particular reason you are not using the POSIX semaphores?
> > > > [sem_open/shm_open as per sem_overview(7)].
> > > 
> > > Please, read the patch description:
> > > 
> > > "This patch introduces a semaphore that is identified by the path to
> > > the iptables binary, it also relies on SEM_UNDO so the kernel performs
> > > the up() operation at process exit to avoid races with signals. This
> > > also avoids file locks that require a writable filesystem."
> > 
> > Is it wise to use the path? Not that its very common, but multiple
> > binaries would still race. Any reason you chose not to use something
> > globally unique?
> 
> What kind of race are you worrying about?

Oh, I get it. Several different iptables binaries located in different
paths. This patch cannot address that situation, we can select a
hardcoded key but we may conflict with other applications.

Regarding the use of posix semaphores, there is no SEM_UNDO feature,
so we can have problem if this receives a kill signal or it
abort/crash somewhere in the code.

I think the best solution is to use to flock() as others do but then
we need a writable filesystem() which is what Phil was trying to skip.

Question is if we should really care. I mean, this locking solution
was introduced as a workaround given we couldn't solve this in the
kernel.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy Jan. 19, 2015, 1:08 p.m. UTC | #7
On 19.01, Pablo Neira Ayuso wrote:
> On Mon, Jan 19, 2015 at 02:00:24PM +0100, Pablo Neira Ayuso wrote:
> > On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote:
> > > On 19.01, Pablo Neira Ayuso wrote:
> > > > "This patch introduces a semaphore that is identified by the path to
> > > > the iptables binary, it also relies on SEM_UNDO so the kernel performs
> > > > the up() operation at process exit to avoid races with signals. This
> > > > also avoids file locks that require a writable filesystem."
> > > 
> > > Is it wise to use the path? Not that its very common, but multiple
> > > binaries would still race. Any reason you chose not to use something
> > > globally unique?
> > 
> > What kind of race are you worrying about?
> 
> Oh, I get it. Several different iptables binaries located in different
> paths. This patch cannot address that situation, we can select a
> hardcoded key but we may conflict with other applications.

Sure, but that risk also exists with using the path.

> Regarding the use of posix semaphores, there is no SEM_UNDO feature,
> so we can have problem if this receives a kill signal or it
> abort/crash somewhere in the code.
> 
> I think the best solution is to use to flock() as others do but then
> we need a writable filesystem() which is what Phil was trying to skip.
> 
> Question is if we should really care. I mean, this locking solution
> was introduced as a workaround given we couldn't solve this in the
> kernel.

I think your patch is fine, just wanted to point out that we might
want to choose a hardcoded name. I think the risk of clashes with
other applications is absolutely minimal.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick Schaaf Jan. 19, 2015, 1:19 p.m. UTC | #8
On Monday 19 January 2015 14:06:34 Pablo Neira Ayuso wrote:
> On Mon, Jan 19, 2015 at 02:00:24PM +0100, Pablo Neira Ayuso wrote:
> 
> I think the best solution is to use to flock() as others do but then
> we need a writable filesystem() which is what Phil was trying to skip.

This appears to work: flock /sys/module/ip_tables

best regards
  Patrick
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Jan. 19, 2015, 1:21 p.m. UTC | #9
On Mon, Jan 19, 2015 at 01:08:33PM +0000, Patrick McHardy wrote:
> On 19.01, Pablo Neira Ayuso wrote:
> > On Mon, Jan 19, 2015 at 02:00:24PM +0100, Pablo Neira Ayuso wrote:
> > > On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote:
> > > > On 19.01, Pablo Neira Ayuso wrote:
> > > > > "This patch introduces a semaphore that is identified by the path to
> > > > > the iptables binary, it also relies on SEM_UNDO so the kernel performs
> > > > > the up() operation at process exit to avoid races with signals. This
> > > > > also avoids file locks that require a writable filesystem."
> > > > 
> > > > Is it wise to use the path? Not that its very common, but multiple
> > > > binaries would still race. Any reason you chose not to use something
> > > > globally unique?
> > > 
> > > What kind of race are you worrying about?
> > 
> > Oh, I get it. Several different iptables binaries located in different
> > paths. This patch cannot address that situation, we can select a
> > hardcoded key but we may conflict with other applications.
> 
> Sure, but that risk also exists with using the path.
> 
> > Regarding the use of posix semaphores, there is no SEM_UNDO feature,
> > so we can have problem if this receives a kill signal or it
> > abort/crash somewhere in the code.
> > 
> > I think the best solution is to use to flock() as others do but then
> > we need a writable filesystem() which is what Phil was trying to skip.
> > 
> > Question is if we should really care. I mean, this locking solution
> > was introduced as a workaround given we couldn't solve this in the
> > kernel.
> 
> I think your patch is fine, just wanted to point out that we might
> want to choose a hardcoded name. I think the risk of clashes with
> other applications is absolutely minimal.

Makes sense to me. This is how ftok() calculates the key based on the
path:

  key = ((st.st_ino & 0xffff) | ((st.st_dev & 0xff) << 16)
         | ((proj_id & 0xff) << 24));

We can a select a magic number for that key, any candidates?

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick Schaaf Jan. 19, 2015, 1:34 p.m. UTC | #10
On Monday 19 January 2015 14:19:14 Patrick Schaaf wrote:
>  This appears to work: flock /sys/module/ip_tables

<blush/> also works for ordinary users......

/sys/module/ip_tables/uevent would work, but then it's only present when 
iptables has been built as a module. And stuff like /proc/net/ip_tables does 
not work, as apparently /proc does not support flock...

Sorry for the noise.

Patrick

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lennart Poettering Jan. 19, 2015, 2:17 p.m. UTC | #11
On Sun, 18.01.15 22:13, Pablo Neira Ayuso (pablo@netfilter.org) wrote:

> Abstract unix sockets cannot be used to synchronize several concurrent
> instances of iptables since an unpriviledged process can create them and
> prevent the legitimate iptables instance from running.
> 
> This patch introduces a semaphore that is identified by the path to the
> iptables binary, it also relies on SEM_UNDO so the kernel performs the
> up() operation at process exit to avoid races with signals. This also
> avoid file locks that require a writable filesystem.

Please, don't use SysV IPC for any new code, it's the worst
choice. (the token API is already such a desaster!) The API is awful,
and dated. At least use the POSIX APIs instead.

I'd really recommend using BSD file locks on a file in /run though,
they have the nicest semantics of all, and are supported on Linux
since a long time.

Lennart
Lennart Poettering Jan. 19, 2015, 2:21 p.m. UTC | #12
On Mon, 19.01.15 13:24, Pablo Neira Ayuso (pablo@netfilter.org) wrote:

> On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote:
> > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote:
> > 
> > >This patch introduces a semaphore
> > >index b18022e..80eed2c 100644
> > >--- a/iptables/xshared.c
> > >+++ b/iptables/xshared.c
> > >+static int xtables_check_owner(int semid)
> > >+{
> > >+	int ret;
> > >+	struct semid_ds ds;
> > >+
> > >+	ret = semctl(semid, 0, IPC_STAT, &ds);
> > 
> > Is there a particular reason you are not using the POSIX semaphores?
> > [sem_open/shm_open as per sem_overview(7)].
> 
> Please, read the patch description:
> 
> "This patch introduces a semaphore that is identified by the path to
> the iptables binary, it also relies on SEM_UNDO so the kernel performs
> the up() operation at process exit to avoid races with signals. This
> also avoids file locks that require a writable filesystem."

/run is always writable. /dev/shm too. And BSD file locks are robust
too (i.e. are automatically unlocked on exit). And you can have that
for POSIX locks too.

Lennart
Jan Engelhardt Jan. 19, 2015, 3:54 p.m. UTC | #13
On Monday 2015-01-19 14:06, Pablo Neira Ayuso wrote:
>
>I think the best solution is to use to flock() as others do but then
>we need a writable filesystem() which is what Phil was trying to skip.

If semaphores are no longer on the table, using shm_open for an
flockable fd seems like an option.
The /dev/shm directory implicitly used for that should be there in
a normal system, so as to support POSIX shm/sem in the first place.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lennart Poettering Jan. 23, 2015, 3:04 a.m. UTC | #14
On Mon, 19.01.15 16:54, Jan Engelhardt (jengelh@inai.de) wrote:

> 
> On Monday 2015-01-19 14:06, Pablo Neira Ayuso wrote:
> >
> >I think the best solution is to use to flock() as others do but then
> >we need a writable filesystem() which is what Phil was trying to skip.
> 
> If semaphores are no longer on the table, using shm_open for an
> flockable fd seems like an option.
> The /dev/shm directory implicitly used for that should be there in
> a normal system, so as to support POSIX shm/sem in the first place.

/dev/shm is a world-writable directory. If you use that, then
unprivileged processes can play games with you again by taking the
name away from you, and the original problem is back.

This really should be a file in /run, and nothing else.

Lennart
diff mbox

Patch

diff --git a/iptables/xshared.c b/iptables/xshared.c
index b18022e..80eed2c 100644
--- a/iptables/xshared.c
+++ b/iptables/xshared.c
@@ -9,12 +9,15 @@ 
 #include <sys/socket.h>
 #include <sys/un.h>
 #include <unistd.h>
+#include <limits.h>
+#include <sys/types.h>
+#include <sys/ipc.h>
+#include <sys/sem.h>
+#include <string.h>
+#include <errno.h>
 #include <xtables.h>
 #include "xshared.h"
 
-#define XT_SOCKET_NAME "xtables"
-#define XT_SOCKET_LEN 8
-
 /*
  * Print out any special helps. A user might like to be able to add a --help
  * to the commandline, and see expected results. So we call help for all
@@ -243,24 +246,109 @@  void xs_init_match(struct xtables_match *match)
 		match->init(match->m);
 }
 
+static int xtables_check_owner(int semid)
+{
+	int ret;
+	struct semid_ds ds;
+
+	ret = semctl(semid, 0, IPC_STAT, &ds);
+	if (ret < 0) {
+		if (errno == EACCES)
+			return 0;
+
+		goto err;
+	}
+
+	/* I don't own this semaphore, release it and create it again */
+	if (ds.sem_perm.cuid != 0)
+		goto err;
+
+	return 0;
+err:
+	/* delete this semaphore and try again */
+	ret = semctl(semid, 0, IPC_RMID);
+	if (ret < 0)
+		return 0;
+
+	return -1;
+}
+
+static bool xtables_sem_down(int semid, int wait)
+{
+	struct sembuf down = {
+		.sem_num	= 0,
+		.sem_op		= -1,
+		.sem_flg	= IPC_NOWAIT | SEM_UNDO,
+	};
+	int ret;
+
+	ret = semop(semid, &down, 1);
+	if (ret < 0 && errno == EAGAIN)
+		return false;
+
+	return true;
+}
+
+static int xtables_sem_up(int semid)
+{
+	struct sembuf up = {
+		.sem_num	= 0,
+		.sem_op		= 1,
+	};
+	int ret;
+
+	ret = semop(semid, &up, 1);
+	if (ret < 0)
+		perror("xtables cannot up semaphore");
+
+	return ret;
+}
+
+static int xtables_sem_create(void)
+{
+	int semid, key;
+	char iptables_path[PATH_MAX], proc_path[PATH_MAX];
+	ssize_t ret;
+again:
+	snprintf(proc_path, sizeof(proc_path), "/proc/%u/exe", getpid());
+	proc_path[sizeof(proc_path) - 1] = '\0';
+	ret = readlink(proc_path, iptables_path, sizeof(iptables_path));
+	if (ret < 0)
+		return -1;
+
+	key = ftok(iptables_path, 1);
+	semid = semget(key, 1, 0600 | IPC_CREAT | IPC_EXCL);
+	if (semid < 0) {
+		semid = semget(key, 1, 0);
+		if (semid < 0)
+			return -1;
+
+		if (xtables_check_owner(semid) < 0)
+			goto again;
+	} else {
+		if (xtables_check_owner(semid) < 0)
+			goto again;
+
+		/* Linux initializes IPC semaphore counters to zero after
+		 * semget(), set this counter to one initially.
+		 */
+		xtables_sem_up(semid);
+	}
+	return semid;
+}
+
 bool xtables_lock(int wait)
 {
-	int i = 0, ret, xt_socket;
-	struct sockaddr_un xt_addr;
-	int waited = 0;
-
-	memset(&xt_addr, 0, sizeof(xt_addr));
-	xt_addr.sun_family = AF_UNIX;
-	strcpy(xt_addr.sun_path+1, XT_SOCKET_NAME);
-	xt_socket = socket(AF_UNIX, SOCK_STREAM, 0);
-	/* If we can't even create a socket, fall back to prior (lockless) behavior */
-	if (xt_socket < 0)
+	int semid, waited = 0, i = 0;
+
+	semid = xtables_sem_create();
+	if (semid < 0) {
+		perror("Cannot get iptables semaphore, giving up on locking");
 		return true;
+	}
 
-	while (1) {
-		ret = bind(xt_socket, (struct sockaddr*)&xt_addr,
-			   offsetof(struct sockaddr_un, sun_path)+XT_SOCKET_LEN);
-		if (ret == 0)
+        while (1) {
+		if (xtables_sem_down(semid, wait))
 			return true;
 		else if (wait >= 0 && waited >= wait)
 			return false;