diff mbox

netfilter: have r->cost != 0 case work

Message ID 1348302412-2811-1-git-send-email-jengelh@inai.de
State Accepted
Headers show

Commit Message

Jan Engelhardt Sept. 22, 2012, 8:26 a.m. UTC
Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
a running state is saved to userspace and then reinstated from there.

Make sure that priv is initialized with some values when that happens.

Signed-off-by: Jan Engelhardt <jengelh@inai.de>
---
 net/netfilter/xt_limit.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Patrick McHardy Sept. 23, 2012, 12:43 p.m. UTC | #1
Jan Engelhardt <jengelh@inai.de> schrieb:

>Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
>a running state is saved to userspace and then reinstated from there.
>
>Make sure that priv is initialized with some values when that happens.
>
>Signed-off-by: Jan Engelhardt <jengelh@inai.de>
>---
> net/netfilter/xt_limit.c |    8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c
>index 5c22ce8..a4c1e45 100644
>--- a/net/netfilter/xt_limit.c
>+++ b/net/netfilter/xt_limit.c
>@@ -117,11 +117,11 @@ static int limit_mt_check(const struct
>xt_mtchk_param *par)
> 
> 	/* For SMP, we only want to use one set of state. */
> 	r->master = priv;
>+	/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
>+	   128. */
>+	priv->prev = jiffies;
>+	priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
> 	if (r->cost == 0) {
>-		/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
>-		   128. */
>-		priv->prev = jiffies;
>-		priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
> 		r->credit_cap = priv->credit; /* Credits full. */
> 		r->cost = user2credits(r->avg);
> 	}

I don't think we can do any better than this. We could reuse the old state from userspace, but that might have changed in the kernel as well.

Perhaps for the future we can find some way to optionally associate elements of the new ruleset with the old one and keep states when parameters haven't changed. Probably hard though since the new ruleset is constructed in the kernel while the old one is still active.

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Engelhardt Sept. 24, 2012, 1:02 p.m. UTC | #2
On Sunday 2012-09-23 14:43, Patrick McHardy wrote:
>Jan Engelhardt <jengelh@inai.de> schrieb:
>
>>Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
>>a running state is saved to userspace and then reinstated from there.
>>
>>Make sure that priv is initialized with some values when that happens.
>>
>>Signed-off-by: Jan Engelhardt <jengelh@inai.de>
>>---
>> net/netfilter/xt_limit.c |    8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>>diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c
>>index 5c22ce8..a4c1e45 100644
>>--- a/net/netfilter/xt_limit.c
>>+++ b/net/netfilter/xt_limit.c
>>@@ -117,11 +117,11 @@ static int limit_mt_check(const struct
>>xt_mtchk_param *par)
>> 
>> 	/* For SMP, we only want to use one set of state. */
>> 	r->master = priv;
>>+	/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
>>+	   128. */
>>+	priv->prev = jiffies;
>>+	priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
>> 	if (r->cost == 0) {
>>-		/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
>>-		   128. */
>>-		priv->prev = jiffies;
>>-		priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
>> 		r->credit_cap = priv->credit; /* Credits full. */
>> 		r->cost = user2credits(r->avg);
>> 	}
>
>I don't think we can do any better than this.

"This" being the state as of 3.5, or this patch?
priv-> really should be initialized, somehow.
Being a kernel-only structure, can't rely on userspace to do it.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy Sept. 24, 2012, 1:12 p.m. UTC | #3
On Mon, 24 Sep 2012, Jan Engelhardt wrote:

>
> On Sunday 2012-09-23 14:43, Patrick McHardy wrote:
>> Jan Engelhardt <jengelh@inai.de> schrieb:
>>
>>> Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
>>> a running state is saved to userspace and then reinstated from there.
>>>
>>> Make sure that priv is initialized with some values when that happens.
>>>
>>> Signed-off-by: Jan Engelhardt <jengelh@inai.de>
>>> ---
>>> net/netfilter/xt_limit.c |    8 ++++----
>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c
>>> index 5c22ce8..a4c1e45 100644
>>> --- a/net/netfilter/xt_limit.c
>>> +++ b/net/netfilter/xt_limit.c
>>> @@ -117,11 +117,11 @@ static int limit_mt_check(const struct
>>> xt_mtchk_param *par)
>>>
>>> 	/* For SMP, we only want to use one set of state. */
>>> 	r->master = priv;
>>> +	/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
>>> +	   128. */
>>> +	priv->prev = jiffies;
>>> +	priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
>>> 	if (r->cost == 0) {
>>> -		/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
>>> -		   128. */
>>> -		priv->prev = jiffies;
>>> -		priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
>>> 		r->credit_cap = priv->credit; /* Credits full. */
>>> 		r->cost = user2credits(r->avg);
>>> 	}
>>
>> I don't think we can do any better than this.
>
> "This" being the state as of 3.5, or this patch?
> priv-> really should be initialized, somehow.
> Being a kernel-only structure, can't rely on userspace to do it.

"This" as in your patch.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jim Robinson Sept. 24, 2012, 8:34 p.m. UTC | #4
[Retry with text formatting to try to make the vger.kernel.org spam 
filter happy this time]


Patrick McHardy wrote:
 >
 > Jan Engelhardt <jengelh@inai.de> schrieb:
 >
 >> Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
 >> a running state is saved to userspace and then reinstated from there.
 >>
 >> Make sure that priv is initialized with some values when that happens.
 >>
 >> Signed-off-by: Jan Engelhardt <jengelh@inai.de>
 >> ---
 >> net/netfilter/xt_limit.c |    8 ++++----
 >> 1 file changed, 4 insertions(+), 4 deletions(-)
 >>
 >> diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c
 >> index 5c22ce8..a4c1e45 100644
 >> --- a/net/netfilter/xt_limit.c
 >> +++ b/net/netfilter/xt_limit.c
 >> @@ -117,11 +117,11 @@ static int limit_mt_check(const struct
 >> xt_mtchk_param *par)
 >>
 >>     /* For SMP, we only want to use one set of state. */
 >>     r->master = priv;
 >> +    /* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
 >> +       128. */
 >> +    priv->prev = jiffies;
 >> +    priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
 >>     if (r->cost == 0) {
 >> -        /* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
 >> -           128. */
 >> -        priv->prev = jiffies;
 >> -        priv->credit = user2credits(r->avg * r->burst); /* Credits 
full. */
 >>         r->credit_cap = priv->credit; /* Credits full. */
 >>         r->cost = user2credits(r->avg);
 >>     }
 > I don't think we can do any better than this. We could reuse the old 
state from userspace, but that might have changed in the kernel as well.
 >
 > Perhaps for the future we can find some way to optionally associate 
elements of the new ruleset with the old one and keep states when 
parameters haven't changed. Probably hard though since the new ruleset 
is constructed in the kernel while the old one is still active.
 >

Can you possibly elaborate on why my initial patch does not actually fix 
the problem? Specifically, the concern seems to be that the values in 
'r->master' in limit_mt_check() may not be reliable. In my testing I saw 
that r->master was the same (both address and values) as had been being 
used in limit_mt() calls for the rule. So if you could elaborate on the 
circumstances that could cause the values to not be the same, it would 
be appreciated.  In particular I don't understand the reference to 
'userspace'.

Below is the debugging trace I included as one the attachments I sent. 
It shows the calls applied to the same rate limit rule. Note that 
'r->master' in limit_mt_check() is the same address as 'r->master' in 
the previous limit_mt() calls. My testing indicated that the values were 
the same as well. Also note that the call to limit_mt_check() was 
triggered by an iptables change unrelated to that rule.

BTW, I am not at all saying you're wrong - you're the experts here. I'm 
just trying to understand why my patch doesn't actually fix the problem. 
This is for my own benefit and also because I'll have to explain it to 
my own powers-that-be, since the proposed fix basically resets the 
values and will guarantee a match in the next mt_limit() call(s) 
regardless of what should actually happen.

And as something of an aside, can you say why limit_mt_check() is called 
on all the rate limit rules when an unrelated iptables change occurs? 
This is what I observed.

Thanks
Jim

     limit_mt(): par=ffff880028203bd0 r=ffffc90003cfb138 
master=ffff88020f930fa0
     limit_mt(): par=ffff880028203bd0 r=ffffc90003cfb138 
master=ffff88020f930fa0
     limit_mt(): par=ffff880028203bd0 r=ffffc90003cfb138 
master=ffff88020f930fa0
     # bug triggered here
     limit_mt_check(): par=ffff880176a3bd98 r=ffffc90003d39138 
master=ffff88020f930fa0
                       new-master=ffff88020f930140 [this is the 
kmalloc()ed 'priv']
     limit_mt_destroy(): par=ffff880176a3bd38 r=ffffc90003cfb138 
master=ffff88020f930fa0
     limit_mt(): par=ffff880028203bd0 r=ffffc90003d39138 
master=ffff88020f930140
     limit_mt(): par=ffff880028203bd0 r=ffffc90003d39138 
master=ffff88020f930140
     limit_mt(): par=ffff880028203bd0 r=ffffc90003d39138 
master=ffff88020f930140




--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Engelhardt Sept. 24, 2012, 9:30 p.m. UTC | #5
On Monday 2012-09-24 22:34, Jim Robinson wrote:
>
>Can you possibly elaborate on why my initial patch does not actually
>fix the problem? Specifically, the concern seems to be that the
>values in 'r->master' in limit_mt_check() may not be reliable. In my
>testing I saw that r->master was the same (both address and values)
>as had been being used in limit_mt() calls for the rule.

The ruleset may have already changed by the time the iptables
userspace process gets to send in the modified ruleset (with
an old r->master).

1. iptables instance 1 fetches ruleset
2. iptables instance 2 flushes the ruleset
3. iptables instance 1 reuploads the ruleset, now with a
   r->master pointing into the woods.
4. ???
5. boom.

Or, for extra giggles:

1. iptables instance 1 fetches ruleset
2. iptables instance 1 uploads ruleset
3. the new temporary table, i.e. the limit match inside it,
   (wrongly) reuses r->master
4. the old table frees all its private data.
   the new limit rule's r->master now points into the woods.
5. ???
6. boom.

>And as something of an aside, can you say why limit_mt_check() is called on all
>the rate limit rules when an unrelated iptables change occurs? This is what I
>observed.

First the new table is constructed, then atomically swapped with
the old one, after which the old one is freed.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Sept. 25, 2012, 2:39 p.m. UTC | #6
On Sat, Sep 22, 2012 at 10:26:52AM +0200, Jan Engelhardt wrote:
> Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
> a running state is saved to userspace and then reinstated from there.
> 
> Make sure that priv is initialized with some values when that happens.

Just to clarify: With this patch, we go back to the situation in which
the state is reset on unrelated rule updates, right?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Engelhardt Sept. 25, 2012, 2:52 p.m. UTC | #7
On Tuesday 2012-09-25 16:39, Pablo Neira Ayuso wrote:
>On Sat, Sep 22, 2012 at 10:26:52AM +0200, Jan Engelhardt wrote:
>> Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
>> a running state is saved to userspace and then reinstated from there.
>> 
>> Make sure that priv is initialized with some values when that happens.
>
>Just to clarify: With this patch, we go back to the situation in which
>the state is reset on unrelated rule updates, right?

Before this patch, when userspace updated a rule, random things 
would happen on matching due to use of uninitialized memory.

Now, the values are initialized, and implies that the credit is reset on 
table update.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Sept. 25, 2012, 11:27 p.m. UTC | #8
On Tue, Sep 25, 2012 at 04:52:32PM +0200, Jan Engelhardt wrote:
> On Tuesday 2012-09-25 16:39, Pablo Neira Ayuso wrote:
> >On Sat, Sep 22, 2012 at 10:26:52AM +0200, Jan Engelhardt wrote:
> >> Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
> >> a running state is saved to userspace and then reinstated from there.
> >> 
> >> Make sure that priv is initialized with some values when that happens.
> >
> >Just to clarify: With this patch, we go back to the situation in which
> >the state is reset on unrelated rule updates, right?
> 
> Before this patch, when userspace updated a rule, random things 
> would happen on matching due to use of uninitialized memory.

I know that ;-)

> Now, the values are initialized, and implies that the credit is reset on 
> table update.

This is what I wanted to confirm. Thanks Jan.

I'll pass this to David. Let's see if we can get this into 3.6-rc.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Engelhardt Sept. 26, 2012, 1:28 p.m. UTC | #9
On Wednesday 2012-09-26 01:27, Pablo Neira Ayuso wrote:
>
>> Now, the values are initialized, and implies that the credit is reset on 
>> table update.
>
>This is what I wanted to confirm. Thanks Jan.
>
>I'll pass this to David. Let's see if we can get this into 3.6-rc.

This would even be -stable material.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Sept. 26, 2012, 2:47 p.m. UTC | #10
On Wed, Sep 26, 2012 at 03:28:27PM +0200, Jan Engelhardt wrote:
> 
> On Wednesday 2012-09-26 01:27, Pablo Neira Ayuso wrote:
> >
> >> Now, the values are initialized, and implies that the credit is reset on 
> >> table update.
> >
> >This is what I wanted to confirm. Thanks Jan.
> >
> >I'll pass this to David. Let's see if we can get this into 3.6-rc.
> 
> This would even be -stable material.

Yes. Likely that we missed 3.6-rc7, so David may pass this to -stable
himself.

I have a long queue of things to pass to -stable. My plan is to spend
time on getting things into it now that the merge window will be
closed.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c
index 5c22ce8..a4c1e45 100644
--- a/net/netfilter/xt_limit.c
+++ b/net/netfilter/xt_limit.c
@@ -117,11 +117,11 @@  static int limit_mt_check(const struct xt_mtchk_param *par)
 
 	/* For SMP, we only want to use one set of state. */
 	r->master = priv;
+	/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
+	   128. */
+	priv->prev = jiffies;
+	priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
 	if (r->cost == 0) {
-		/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
-		   128. */
-		priv->prev = jiffies;
-		priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
 		r->credit_cap = priv->credit; /* Credits full. */
 		r->cost = user2credits(r->avg);
 	}