diff mbox

xtables: Add locking to prevent concurrent instances

Message ID 20130522173511.GB860@gmail.com
State Superseded
Headers show

Commit Message

Phil Oester May 22, 2013, 5:35 p.m. UTC
There have been numerous complaints and bug reports over the years when admins
attempt to run more than one instance of iptables simultaneously.  Currently
open bug reports which are related:

325: Parallel execution of the iptables is impossible
758: Retry iptables command on transient failure
764: Doing -Z twice in parallel breaks counters
822: iptables shows negative or other bad packet/byte counts

As Patrick notes in 325:  "Since this has been a problem people keep running
into, I'd suggest to simply add some locking to iptables to catch the most
common case."

I started looking into alternatives to add locking, and of course the most
common/obvious solution is to use a pidfile.  But this has various downsides,
such as if the application is terminated abnormally and the pidfile isn't
cleaned up.  And this also requires a writable filesystem.  Using a UNIX domain
socket file (e.g. in /var/run) has similar issues.

Starting in 2.2, Linux added support for abstract sockets.  These sockets
require no filesystem, and automatically disappear once the application
terminates.  This is the locking solution I chose to implement in xtables-multi.
As an added bonus, since each network namespace has its own socket pool, an
iptables instance running in one namespace will not lock out an iptables
instance running in another namespace.  A filesystem approach would have
to recognize and handle multiple network namespaces.

As long as I was adding locking, I also chose to add a retry loop, with 3
attempts made to grab the lock before giving up.

Phil


Signed-off-by: Phil Oester <kernel@linuxace.com>

Comments

Patrick McHardy May 22, 2013, 7:29 p.m. UTC | #1
Phil Oester <kernel@linuxace.com> schrieb:

>There have been numerous complaints and bug reports over the years when
>admins
>attempt to run more than one instance of iptables simultaneously. 
>Currently
>open bug reports which are related:
>
>325: Parallel execution of the iptables is impossible
>758: Retry iptables command on transient failure
>764: Doing -Z twice in parallel breaks counters
>822: iptables shows negative or other bad packet/byte counts
>
>As Patrick notes in 325:  "Since this has been a problem people keep
>running
>into, I'd suggest to simply add some locking to iptables to catch the
>most
>common case."
>
>I started looking into alternatives to add locking, and of course the
>most
>common/obvious solution is to use a pidfile.  But this has various
>downsides,
>such as if the application is terminated abnormally and the pidfile
>isn't
>cleaned up.  And this also requires a writable filesystem.  Using a
>UNIX domain
>socket file (e.g. in /var/run) has similar issues.
>
>Starting in 2.2, Linux added support for abstract sockets.  These
>sockets
>require no filesystem, and automatically disappear once the application
>terminates.  This is the locking solution I chose to implement in
>xtables-multi.
>As an added bonus, since each network namespace has its own socket
>pool, an
>iptables instance running in one namespace will not lock out an
>iptables
>instance running in another namespace.  A filesystem approach would
>have
>to recognize and handle multiple network namespaces.
>
>As long as I was adding locking, I also chose to add a retry loop, with
>3
>attempts made to grab the lock before giving up.

If I'm not mistaken, we retry failed operations since a few years now, so is this actually still a problem? 

If so, why abort after three tries instead of waiting until the lock can be acquired?


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Phil Oester May 22, 2013, 8:04 p.m. UTC | #2
On Wed, May 22, 2013 at 09:29:06PM +0200, Patrick McHardy wrote:
> If I'm not mistaken, we retry failed operations since a few years now, so is this actually still a problem? 
> 
> If so, why abort after three tries instead of waiting until the lock can be acquired?

Not sure what you are referring to with retrying operations.  The race condition
between multiple simultaneous instances of iptables running still exists in
latest git.  For example, fire up a few simultanous instances of this:

# while : ;  do iptables -A INPUT -j ACCEPT ; iptables -D INPUT -j ACCEPT ; done
iptables: Resource temporarily unavailable.
iptables: Resource temporarily unavailable.
iptables: Invalid argument. Run `dmesg' for more information.
iptables: Resource temporarily unavailable.
iptables: Resource temporarily unavailable.

Bug 764 is another race problem where two simultaneous "iptables -Z" causes
counters to get reset to invalid numbers (and is reproducible with current git).  

As to your second question re: why only three tries (in 3 seconds): I suppose we
could just wait "forever".  In most normal cases I wouldn't expect a wait > 1
second for an instance to complete.  But there could be pathological corner cases
such as dumping a huge ruleset with DNS resolution (i.e. without -n) which could
take an eternity to complete.  Should we spit out (to stderr) some kind of
message in these cases, or just sit silently?  Perhaps every 5 seconds?

Phil
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy May 22, 2013, 8:11 p.m. UTC | #3
Phil Oester <kernel@linuxace.com> schrieb:

>On Wed, May 22, 2013 at 09:29:06PM +0200, Patrick McHardy wrote:
>> If I'm not mistaken, we retry failed operations since a few years
>now, so is this actually still a problem? 
>> 
>> If so, why abort after three tries instead of waiting until the lock
>can be acquired?
>
>Not sure what you are referring to with retrying operations.  The race
>condition
>between multiple simultaneous instances of iptables running still
>exists in
>latest git.  For example, fire up a few simultanous instances of this:
>
># while : ;  do iptables -A INPUT -j ACCEPT ; iptables -D INPUT -j
>ACCEPT ; done
>iptables: Resource temporarily unavailable.
>iptables: Resource temporarily unavailable.
>iptables: Invalid argument. Run `dmesg' for more information.
>iptables: Resource temporarily unavailable.
>iptables: Resource temporarily unavailable.
>
>Bug 764 is another race problem where two simultaneous "iptables -Z"
>causes
>counters to get reset to invalid numbers (and is reproducible with
>current git).  

I see, thanks.

>As to your second question re: why only three tries (in 3 seconds): I
>suppose we
>could just wait "forever".  In most normal cases I wouldn't expect a
>wait > 1
>second for an instance to complete.  But there could be pathological
>corner cases
>such as dumping a huge ruleset with DNS resolution (i.e. without -n)
>which could
>take an eternity to complete.  Should we spit out (to stderr) some kind
>of
>message in these cases, or just sit silently?  Perhaps every 5 seconds?

Printing a message sounds fine. But I think the main point of using locking is to provide a guarantee that a correct operation won't randomly fail. So I think we should retry forever.

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/iptables/xtables-multi.c b/iptables/xtables-multi.c
index 8014d5f..9c22a82 100644
--- a/iptables/xtables-multi.c
+++ b/iptables/xtables-multi.c
@@ -1,8 +1,12 @@ 
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <unistd.h>
 #include "xshared.h"
 
+#include "xtables.h"
 #include "xtables-multi.h"
 
 #ifdef ENABLE_IPV4
@@ -35,7 +39,31 @@  static const struct subcommand multi_subcommands[] = {
 	{NULL},
 };
 
+#define XTMSOCKET_NAME "xtables_multi"
+#define XTMSOCKET_LEN 14
+
 int main(int argc, char **argv)
 {
-	return subcmd_main(argc, argv, multi_subcommands);
+	int i = 0, ret, xtm_socket;
+	struct sockaddr_un xtm_addr;
+
+	memset(&xtm_addr, 0, sizeof(xtm_addr));
+	xtm_addr.sun_family = AF_UNIX;
+	strcpy(xtm_addr.sun_path+1, XTMSOCKET_NAME);
+	xtm_socket = socket(AF_UNIX, SOCK_STREAM, 0);
+	/* If we can't even create a socket, just revert to prior (lockless) behavior */
+	if (xtm_socket < 0)
+		return subcmd_main(argc, argv, multi_subcommands);
+
+	do {
+		ret = bind(xtm_socket, (struct sockaddr*)&xtm_addr,
+			   offsetof(struct sockaddr_un, sun_path)+XTMSOCKET_LEN);
+		if (ret == 0)
+			return subcmd_main(argc, argv, multi_subcommands);
+		i++;
+		sleep(1);
+	} while (i <= 2);
+
+	fprintf(stderr, "ERROR: unable to obtain lock - another instance is already running\n");
+	exit(RESOURCE_PROBLEM);
 }