Message ID | 1421615616-23053-1-git-send-email-pablo@netfilter.org |
---|---|
State | Not Applicable |
Delegated to: | Pablo Neira |
Headers | show |
On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote: >This patch introduces a semaphore >index b18022e..80eed2c 100644 >--- a/iptables/xshared.c >+++ b/iptables/xshared.c >+static int xtables_check_owner(int semid) >+{ >+ int ret; >+ struct semid_ds ds; >+ >+ ret = semctl(semid, 0, IPC_STAT, &ds); Is there a particular reason you are not using the POSIX semaphores? [sem_open/shm_open as per sem_overview(7)]. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote: > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote: > > >This patch introduces a semaphore > >index b18022e..80eed2c 100644 > >--- a/iptables/xshared.c > >+++ b/iptables/xshared.c > >+static int xtables_check_owner(int semid) > >+{ > >+ int ret; > >+ struct semid_ds ds; > >+ > >+ ret = semctl(semid, 0, IPC_STAT, &ds); > > Is there a particular reason you are not using the POSIX semaphores? > [sem_open/shm_open as per sem_overview(7)]. Please, read the patch description: "This patch introduces a semaphore that is identified by the path to the iptables binary, it also relies on SEM_UNDO so the kernel performs the up() operation at process exit to avoid races with signals. This also avoids file locks that require a writable filesystem." -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 19.01, Pablo Neira Ayuso wrote: > On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote: > > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote: > > > > >This patch introduces a semaphore > > >index b18022e..80eed2c 100644 > > >--- a/iptables/xshared.c > > >+++ b/iptables/xshared.c > > >+static int xtables_check_owner(int semid) > > >+{ > > >+ int ret; > > >+ struct semid_ds ds; > > >+ > > >+ ret = semctl(semid, 0, IPC_STAT, &ds); > > > > Is there a particular reason you are not using the POSIX semaphores? > > [sem_open/shm_open as per sem_overview(7)]. > > Please, read the patch description: > > "This patch introduces a semaphore that is identified by the path to > the iptables binary, it also relies on SEM_UNDO so the kernel performs > the up() operation at process exit to avoid races with signals. This > also avoids file locks that require a writable filesystem." Is it wise to use the path? Not that its very common, but multiple binaries would still race. Any reason you chose not to use something globally unique? -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 19.01, Pablo Neira Ayuso wrote: > On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote: > > On 19.01, Pablo Neira Ayuso wrote: > > > On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote: > > > > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote: > > > > > > > > >This patch introduces a semaphore > > > > >index b18022e..80eed2c 100644 > > > > >--- a/iptables/xshared.c > > > > >+++ b/iptables/xshared.c > > > > >+static int xtables_check_owner(int semid) > > > > >+{ > > > > >+ int ret; > > > > >+ struct semid_ds ds; > > > > >+ > > > > >+ ret = semctl(semid, 0, IPC_STAT, &ds); > > > > > > > > Is there a particular reason you are not using the POSIX semaphores? > > > > [sem_open/shm_open as per sem_overview(7)]. > > > > > > Please, read the patch description: > > > > > > "This patch introduces a semaphore that is identified by the path to > > > the iptables binary, it also relies on SEM_UNDO so the kernel performs > > > the up() operation at process exit to avoid races with signals. This > > > also avoids file locks that require a writable filesystem." > > > > Is it wise to use the path? Not that its very common, but multiple > > binaries would still race. Any reason you chose not to use something > > globally unique? > > What kind of race are you worrying about? Multiple iptables binaries, which would obviously have different paths. As I said, not common, but possible. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote: > On 19.01, Pablo Neira Ayuso wrote: > > On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote: > > > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote: > > > > > > >This patch introduces a semaphore > > > >index b18022e..80eed2c 100644 > > > >--- a/iptables/xshared.c > > > >+++ b/iptables/xshared.c > > > >+static int xtables_check_owner(int semid) > > > >+{ > > > >+ int ret; > > > >+ struct semid_ds ds; > > > >+ > > > >+ ret = semctl(semid, 0, IPC_STAT, &ds); > > > > > > Is there a particular reason you are not using the POSIX semaphores? > > > [sem_open/shm_open as per sem_overview(7)]. > > > > Please, read the patch description: > > > > "This patch introduces a semaphore that is identified by the path to > > the iptables binary, it also relies on SEM_UNDO so the kernel performs > > the up() operation at process exit to avoid races with signals. This > > also avoids file locks that require a writable filesystem." > > Is it wise to use the path? Not that its very common, but multiple > binaries would still race. Any reason you chose not to use something > globally unique? What kind of race are you worrying about? -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 19, 2015 at 02:00:24PM +0100, Pablo Neira Ayuso wrote: > On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote: > > On 19.01, Pablo Neira Ayuso wrote: > > > On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote: > > > > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote: > > > > > > > > >This patch introduces a semaphore > > > > >index b18022e..80eed2c 100644 > > > > >--- a/iptables/xshared.c > > > > >+++ b/iptables/xshared.c > > > > >+static int xtables_check_owner(int semid) > > > > >+{ > > > > >+ int ret; > > > > >+ struct semid_ds ds; > > > > >+ > > > > >+ ret = semctl(semid, 0, IPC_STAT, &ds); > > > > > > > > Is there a particular reason you are not using the POSIX semaphores? > > > > [sem_open/shm_open as per sem_overview(7)]. > > > > > > Please, read the patch description: > > > > > > "This patch introduces a semaphore that is identified by the path to > > > the iptables binary, it also relies on SEM_UNDO so the kernel performs > > > the up() operation at process exit to avoid races with signals. This > > > also avoids file locks that require a writable filesystem." > > > > Is it wise to use the path? Not that its very common, but multiple > > binaries would still race. Any reason you chose not to use something > > globally unique? > > What kind of race are you worrying about? Oh, I get it. Several different iptables binaries located in different paths. This patch cannot address that situation, we can select a hardcoded key but we may conflict with other applications. Regarding the use of posix semaphores, there is no SEM_UNDO feature, so we can have problem if this receives a kill signal or it abort/crash somewhere in the code. I think the best solution is to use to flock() as others do but then we need a writable filesystem() which is what Phil was trying to skip. Question is if we should really care. I mean, this locking solution was introduced as a workaround given we couldn't solve this in the kernel. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 19.01, Pablo Neira Ayuso wrote: > On Mon, Jan 19, 2015 at 02:00:24PM +0100, Pablo Neira Ayuso wrote: > > On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote: > > > On 19.01, Pablo Neira Ayuso wrote: > > > > "This patch introduces a semaphore that is identified by the path to > > > > the iptables binary, it also relies on SEM_UNDO so the kernel performs > > > > the up() operation at process exit to avoid races with signals. This > > > > also avoids file locks that require a writable filesystem." > > > > > > Is it wise to use the path? Not that its very common, but multiple > > > binaries would still race. Any reason you chose not to use something > > > globally unique? > > > > What kind of race are you worrying about? > > Oh, I get it. Several different iptables binaries located in different > paths. This patch cannot address that situation, we can select a > hardcoded key but we may conflict with other applications. Sure, but that risk also exists with using the path. > Regarding the use of posix semaphores, there is no SEM_UNDO feature, > so we can have problem if this receives a kill signal or it > abort/crash somewhere in the code. > > I think the best solution is to use to flock() as others do but then > we need a writable filesystem() which is what Phil was trying to skip. > > Question is if we should really care. I mean, this locking solution > was introduced as a workaround given we couldn't solve this in the > kernel. I think your patch is fine, just wanted to point out that we might want to choose a hardcoded name. I think the risk of clashes with other applications is absolutely minimal. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday 19 January 2015 14:06:34 Pablo Neira Ayuso wrote: > On Mon, Jan 19, 2015 at 02:00:24PM +0100, Pablo Neira Ayuso wrote: > > I think the best solution is to use to flock() as others do but then > we need a writable filesystem() which is what Phil was trying to skip. This appears to work: flock /sys/module/ip_tables best regards Patrick -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 19, 2015 at 01:08:33PM +0000, Patrick McHardy wrote: > On 19.01, Pablo Neira Ayuso wrote: > > On Mon, Jan 19, 2015 at 02:00:24PM +0100, Pablo Neira Ayuso wrote: > > > On Mon, Jan 19, 2015 at 12:51:19PM +0000, Patrick McHardy wrote: > > > > On 19.01, Pablo Neira Ayuso wrote: > > > > > "This patch introduces a semaphore that is identified by the path to > > > > > the iptables binary, it also relies on SEM_UNDO so the kernel performs > > > > > the up() operation at process exit to avoid races with signals. This > > > > > also avoids file locks that require a writable filesystem." > > > > > > > > Is it wise to use the path? Not that its very common, but multiple > > > > binaries would still race. Any reason you chose not to use something > > > > globally unique? > > > > > > What kind of race are you worrying about? > > > > Oh, I get it. Several different iptables binaries located in different > > paths. This patch cannot address that situation, we can select a > > hardcoded key but we may conflict with other applications. > > Sure, but that risk also exists with using the path. > > > Regarding the use of posix semaphores, there is no SEM_UNDO feature, > > so we can have problem if this receives a kill signal or it > > abort/crash somewhere in the code. > > > > I think the best solution is to use to flock() as others do but then > > we need a writable filesystem() which is what Phil was trying to skip. > > > > Question is if we should really care. I mean, this locking solution > > was introduced as a workaround given we couldn't solve this in the > > kernel. > > I think your patch is fine, just wanted to point out that we might > want to choose a hardcoded name. I think the risk of clashes with > other applications is absolutely minimal. Makes sense to me. This is how ftok() calculates the key based on the path: key = ((st.st_ino & 0xffff) | ((st.st_dev & 0xff) << 16) | ((proj_id & 0xff) << 24)); We can a select a magic number for that key, any candidates? Thanks! -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday 19 January 2015 14:19:14 Patrick Schaaf wrote:
> This appears to work: flock /sys/module/ip_tables
<blush/> also works for ordinary users......
/sys/module/ip_tables/uevent would work, but then it's only present when
iptables has been built as a module. And stuff like /proc/net/ip_tables does
not work, as apparently /proc does not support flock...
Sorry for the noise.
Patrick
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 18.01.15 22:13, Pablo Neira Ayuso (pablo@netfilter.org) wrote: > Abstract unix sockets cannot be used to synchronize several concurrent > instances of iptables since an unpriviledged process can create them and > prevent the legitimate iptables instance from running. > > This patch introduces a semaphore that is identified by the path to the > iptables binary, it also relies on SEM_UNDO so the kernel performs the > up() operation at process exit to avoid races with signals. This also > avoid file locks that require a writable filesystem. Please, don't use SysV IPC for any new code, it's the worst choice. (the token API is already such a desaster!) The API is awful, and dated. At least use the POSIX APIs instead. I'd really recommend using BSD file locks on a file in /run though, they have the nicest semantics of all, and are supported on Linux since a long time. Lennart
On Mon, 19.01.15 13:24, Pablo Neira Ayuso (pablo@netfilter.org) wrote: > On Sun, Jan 18, 2015 at 10:28:32PM +0100, Jan Engelhardt wrote: > > On Sunday 2015-01-18 22:13, Pablo Neira Ayuso wrote: > > > > >This patch introduces a semaphore > > >index b18022e..80eed2c 100644 > > >--- a/iptables/xshared.c > > >+++ b/iptables/xshared.c > > >+static int xtables_check_owner(int semid) > > >+{ > > >+ int ret; > > >+ struct semid_ds ds; > > >+ > > >+ ret = semctl(semid, 0, IPC_STAT, &ds); > > > > Is there a particular reason you are not using the POSIX semaphores? > > [sem_open/shm_open as per sem_overview(7)]. > > Please, read the patch description: > > "This patch introduces a semaphore that is identified by the path to > the iptables binary, it also relies on SEM_UNDO so the kernel performs > the up() operation at process exit to avoid races with signals. This > also avoids file locks that require a writable filesystem." /run is always writable. /dev/shm too. And BSD file locks are robust too (i.e. are automatically unlocked on exit). And you can have that for POSIX locks too. Lennart
On Monday 2015-01-19 14:06, Pablo Neira Ayuso wrote: > >I think the best solution is to use to flock() as others do but then >we need a writable filesystem() which is what Phil was trying to skip. If semaphores are no longer on the table, using shm_open for an flockable fd seems like an option. The /dev/shm directory implicitly used for that should be there in a normal system, so as to support POSIX shm/sem in the first place. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 19.01.15 16:54, Jan Engelhardt (jengelh@inai.de) wrote: > > On Monday 2015-01-19 14:06, Pablo Neira Ayuso wrote: > > > >I think the best solution is to use to flock() as others do but then > >we need a writable filesystem() which is what Phil was trying to skip. > > If semaphores are no longer on the table, using shm_open for an > flockable fd seems like an option. > The /dev/shm directory implicitly used for that should be there in > a normal system, so as to support POSIX shm/sem in the first place. /dev/shm is a world-writable directory. If you use that, then unprivileged processes can play games with you again by taking the name away from you, and the original problem is back. This really should be a file in /run, and nothing else. Lennart
diff --git a/iptables/xshared.c b/iptables/xshared.c index b18022e..80eed2c 100644 --- a/iptables/xshared.c +++ b/iptables/xshared.c @@ -9,12 +9,15 @@ #include <sys/socket.h> #include <sys/un.h> #include <unistd.h> +#include <limits.h> +#include <sys/types.h> +#include <sys/ipc.h> +#include <sys/sem.h> +#include <string.h> +#include <errno.h> #include <xtables.h> #include "xshared.h" -#define XT_SOCKET_NAME "xtables" -#define XT_SOCKET_LEN 8 - /* * Print out any special helps. A user might like to be able to add a --help * to the commandline, and see expected results. So we call help for all @@ -243,24 +246,109 @@ void xs_init_match(struct xtables_match *match) match->init(match->m); } +static int xtables_check_owner(int semid) +{ + int ret; + struct semid_ds ds; + + ret = semctl(semid, 0, IPC_STAT, &ds); + if (ret < 0) { + if (errno == EACCES) + return 0; + + goto err; + } + + /* I don't own this semaphore, release it and create it again */ + if (ds.sem_perm.cuid != 0) + goto err; + + return 0; +err: + /* delete this semaphore and try again */ + ret = semctl(semid, 0, IPC_RMID); + if (ret < 0) + return 0; + + return -1; +} + +static bool xtables_sem_down(int semid, int wait) +{ + struct sembuf down = { + .sem_num = 0, + .sem_op = -1, + .sem_flg = IPC_NOWAIT | SEM_UNDO, + }; + int ret; + + ret = semop(semid, &down, 1); + if (ret < 0 && errno == EAGAIN) + return false; + + return true; +} + +static int xtables_sem_up(int semid) +{ + struct sembuf up = { + .sem_num = 0, + .sem_op = 1, + }; + int ret; + + ret = semop(semid, &up, 1); + if (ret < 0) + perror("xtables cannot up semaphore"); + + return ret; +} + +static int xtables_sem_create(void) +{ + int semid, key; + char iptables_path[PATH_MAX], proc_path[PATH_MAX]; + ssize_t ret; +again: + snprintf(proc_path, sizeof(proc_path), "/proc/%u/exe", getpid()); + proc_path[sizeof(proc_path) - 1] = '\0'; + ret = readlink(proc_path, iptables_path, sizeof(iptables_path)); + if (ret < 0) + return -1; + + key = ftok(iptables_path, 1); + semid = semget(key, 1, 0600 | IPC_CREAT | IPC_EXCL); + if (semid < 0) { + semid = semget(key, 1, 0); + if (semid < 0) + return -1; + + if (xtables_check_owner(semid) < 0) + goto again; + } else { + if (xtables_check_owner(semid) < 0) + goto again; + + /* Linux initializes IPC semaphore counters to zero after + * semget(), set this counter to one initially. + */ + xtables_sem_up(semid); + } + return semid; +} + bool xtables_lock(int wait) { - int i = 0, ret, xt_socket; - struct sockaddr_un xt_addr; - int waited = 0; - - memset(&xt_addr, 0, sizeof(xt_addr)); - xt_addr.sun_family = AF_UNIX; - strcpy(xt_addr.sun_path+1, XT_SOCKET_NAME); - xt_socket = socket(AF_UNIX, SOCK_STREAM, 0); - /* If we can't even create a socket, fall back to prior (lockless) behavior */ - if (xt_socket < 0) + int semid, waited = 0, i = 0; + + semid = xtables_sem_create(); + if (semid < 0) { + perror("Cannot get iptables semaphore, giving up on locking"); return true; + } - while (1) { - ret = bind(xt_socket, (struct sockaddr*)&xt_addr, - offsetof(struct sockaddr_un, sun_path)+XT_SOCKET_LEN); - if (ret == 0) + while (1) { + if (xtables_sem_down(semid, wait)) return true; else if (wait >= 0 && waited >= wait) return false;
Abstract unix sockets cannot be used to synchronize several concurrent instances of iptables since an unpriviledged process can create them and prevent the legitimate iptables instance from running. This patch introduces a semaphore that is identified by the path to the iptables binary, it also relies on SEM_UNDO so the kernel performs the up() operation at process exit to avoid races with signals. This also avoid file locks that require a writable filesystem. Fixes: 93587a0 ("ip[6]tables: Add locking to prevent concurrent instances") Reported-by: Lennart Poettering <lennart@poettering.net> Cc: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- iptables/xshared.c | 122 ++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 105 insertions(+), 17 deletions(-)