Patchwork bridge/netfilter: regression in 2.6.39.1

login
register
mail settings
Submitter Eric Dumazet
Date June 6, 2011, 2:26 p.m.
Message ID <1307370363.3098.37.camel@edumazet-laptop>
Download mbox | patch
Permalink /patch/98931/
State Accepted
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - June 6, 2011, 2:26 p.m.
From: Alexander Holler <holler@ahsoftware.de>

Le lundi 06 juin 2011 à 15:29 +0200, Alexander Holler a écrit :
> > Nice, now please submit a patch with 0972ddb237 as a guideline.
> >
> > BTW, you could also check other struct dst_ops methods for
> > bridge/netfilter:
> 
> Sorry, but I prefer to submit patches I understand by myself and for 
> stuff I know something about. The patch in my first mail was just meant 
> as a quick fix (which seemed to work here).
> 
> So even if I might be able to construct a working patch using commit 
> 0972ddb237, I don't think I should do that.

OK, I'll do it for you then, I am surprised you dont understand my
review / suggestion.

One patch submitter is supposed to followup and send a new version to
take into account reviews/comments, not wait that eventually everybody
says "OK, lets take it as is"


[PATCH] bridge: provide a cow_metrics method for fake_ops

Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
well.

This fixes a regression coming from commits 62fa8a846d7d (net: Implement
read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
initialize fake_rtable metrics)

ip link set mybridge mtu 1234
-->
[  136.546243] Pid: 8415, comm: ip Tainted: P 
2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc.         V1Sn 
        /V1Sn
[  136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
[  136.546268] EIP is at 0x0
[  136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
[  136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
[  136.546285]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80 
task.ti=f15c2000)
[  136.546297] Stack:
[  136.546301]  f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80 
ffffffa1 f15c3bbc
[  136.546315]  c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80 
ffffffa6 f15c3be4
[  136.546329]  00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae 
00000000 00000000
[  136.546343] Call Trace:
[  136.546359]  [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
[  136.546372]  [<c12cb9c8>] dev_set_mtu+0x38/0x80
[  136.546381]  [<c12da347>] do_setlink+0x1a7/0x860
[  136.546390]  [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
[  136.546400]  [<c11d6cae>] ? nla_parse+0x6e/0xb0
[  136.546409]  [<c12db931>] rtnl_newlink+0x361/0x510
[  136.546420]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[  136.546429]  [<c1362762>] ? error_code+0x5a/0x60
[  136.546438]  [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
[  136.546446]  [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
[  136.546454]  [<c12db180>] ? __rtnl_unlock+0x20/0x20
[  136.546463]  [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
[  136.546471]  [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
[  136.546479]  [<c12edafa>] netlink_unicast+0x23a/0x280
[  136.546487]  [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
[  136.546497]  [<c12bb828>] sock_sendmsg+0xc8/0x100
[  136.546508]  [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
[  136.546517]  [<c11d0602>] ? _copy_from_user+0x42/0x60
[  136.546525]  [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
[  136.546534]  [<c12bd805>] sys_sendmsg+0x1c5/0x200
[  136.546542]  [<c10c2150>] ? __do_fault+0x310/0x410
[  136.546549]  [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
[  136.546557]  [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
[  136.546565]  [<c12bd1af>] ? sys_getsockname+0x7f/0x90
[  136.546574]  [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
[  136.546582]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[  136.546589]  [<c10233b3>] ? do_page_fault+0x173/0x3d0
[  136.546596]  [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
[  136.546605]  [<c12bdd83>] sys_socketcall+0x293/0x2d0
[  136.546614]  [<c13629d0>] sysenter_do_call+0x12/0x26
[  136.546619] Code:  Bad EIP value.
[  136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
[  136.546645] CR2: 0000000000000000
[  136.546652] ---[ end trace 6909b560e78934fa ]---

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
---
net/bridge/br_netfilter.c |    6 ++++++
 1 file changed, 6 insertions(+)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman - June 6, 2011, 3:32 p.m.
On Mon, Jun 06, 2011 at 04:26:03PM +0200, Eric Dumazet wrote:
> From: Alexander Holler <holler@ahsoftware.de>
> 
> Le lundi 06 juin 2011 à 15:29 +0200, Alexander Holler a écrit :
> > > Nice, now please submit a patch with 0972ddb237 as a guideline.
> > >
> > > BTW, you could also check other struct dst_ops methods for
> > > bridge/netfilter:
> > 
> > Sorry, but I prefer to submit patches I understand by myself and for 
> > stuff I know something about. The patch in my first mail was just meant 
> > as a quick fix (which seemed to work here).
> > 
> > So even if I might be able to construct a working patch using commit 
> > 0972ddb237, I don't think I should do that.
> 
> OK, I'll do it for you then, I am surprised you dont understand my
> review / suggestion.
> 
> One patch submitter is supposed to followup and send a new version to
> take into account reviews/comments, not wait that eventually everybody
> says "OK, lets take it as is"
> 
> 
> [PATCH] bridge: provide a cow_metrics method for fake_ops
> 
> Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
> dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
> well.
> 
> This fixes a regression coming from commits 62fa8a846d7d (net: Implement
> read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
> initialize fake_rtable metrics)
> 
> ip link set mybridge mtu 1234
> -->
> [  136.546243] Pid: 8415, comm: ip Tainted: P 
> 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc.         V1Sn 
>         /V1Sn
> [  136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [  136.546268] EIP is at 0x0
> [  136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
> [  136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
> [  136.546285]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [  136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80 
> task.ti=f15c2000)
> [  136.546297] Stack:
> [  136.546301]  f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80 
> ffffffa1 f15c3bbc
> [  136.546315]  c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80 
> ffffffa6 f15c3be4
> [  136.546329]  00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae 
> 00000000 00000000
> [  136.546343] Call Trace:
> [  136.546359]  [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
> [  136.546372]  [<c12cb9c8>] dev_set_mtu+0x38/0x80
> [  136.546381]  [<c12da347>] do_setlink+0x1a7/0x860
> [  136.546390]  [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
> [  136.546400]  [<c11d6cae>] ? nla_parse+0x6e/0xb0
> [  136.546409]  [<c12db931>] rtnl_newlink+0x361/0x510
> [  136.546420]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [  136.546429]  [<c1362762>] ? error_code+0x5a/0x60
> [  136.546438]  [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
> [  136.546446]  [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
> [  136.546454]  [<c12db180>] ? __rtnl_unlock+0x20/0x20
> [  136.546463]  [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
> [  136.546471]  [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
> [  136.546479]  [<c12edafa>] netlink_unicast+0x23a/0x280
> [  136.546487]  [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
> [  136.546497]  [<c12bb828>] sock_sendmsg+0xc8/0x100
> [  136.546508]  [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
> [  136.546517]  [<c11d0602>] ? _copy_from_user+0x42/0x60
> [  136.546525]  [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
> [  136.546534]  [<c12bd805>] sys_sendmsg+0x1c5/0x200
> [  136.546542]  [<c10c2150>] ? __do_fault+0x310/0x410
> [  136.546549]  [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
> [  136.546557]  [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
> [  136.546565]  [<c12bd1af>] ? sys_getsockname+0x7f/0x90
> [  136.546574]  [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
> [  136.546582]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [  136.546589]  [<c10233b3>] ? do_page_fault+0x173/0x3d0
> [  136.546596]  [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
> [  136.546605]  [<c12bdd83>] sys_socketcall+0x293/0x2d0
> [  136.546614]  [<c13629d0>] sysenter_do_call+0x12/0x26
> [  136.546619] Code:  Bad EIP value.
> [  136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
> [  136.546645] CR2: 0000000000000000
> [  136.546652] ---[ end trace 6909b560e78934fa ]---
> 
> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Neil Horman <nhorman@tuxdriver.com>
> CC: Herbert Xu <herbert@gondor.apana.org.au>
> ---
> net/bridge/br_netfilter.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 3fa1231..23b43d2 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -104,10 +104,16 @@ static void fake_update_pmtu(struct dst_entry *dst, u32 mtu)
>  {
>  }
>  
> +static u32 *fake_cow_metrics(struct dst_entry *dst, unsigned long old)
> +{
> +	return NULL;
> +}
> +
>  static struct dst_ops fake_dst_ops = {
>  	.family =		AF_INET,
>  	.protocol =		cpu_to_be16(ETH_P_IP),
>  	.update_pmtu =		fake_update_pmtu,
> +	.cow_metrics =		fake_cow_metrics,
>  };
>  
>  /*
> 
> 
Not to drag this out further, but since you illustrated the correct way to do
this with the blackhole_ops test, and this modification now gives us two
instances of that case, would it perhaps be better to just do this in
dst_metrics_write_ptr:

return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;

Then we could eliminate the two functions that do nothing be retun NULL (along
with their respective call instructions), and save any future users from having
to remember to include a dummy cow_metrics method if they happen to set the read
only flag on thier dst_ops?

Neil

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - June 6, 2011, 4:11 p.m.
Le lundi 06 juin 2011 à 11:32 -0400, Neil Horman a écrit :

> Not to drag this out further, but since you illustrated the correct way to do
> this with the blackhole_ops test, and this modification now gives us two
> instances of that case, would it perhaps be better to just do this in
> dst_metrics_write_ptr:
> 
> return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;
> 
> Then we could eliminate the two functions that do nothing be retun NULL (along
> with their respective call instructions), and save any future users from having
> to remember to include a dummy cow_metrics method if they happen to set the read
> only flag on thier dst_ops?

Well, I prefer how David coded the thing.
We can add selective traces where we want.

Having a default behavior might give much more work to find a bug in
this area. A NULL pointer access gives us an immediate indication.

Its a bit late to add an "if (dst->ops->cow_metrics)" test now that we
covered all call sites ;)

But we probably have more bugs elsewhere, because of many dst changes in
2.6.39



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman - June 6, 2011, 5:07 p.m.
On Mon, Jun 06, 2011 at 06:11:45PM +0200, Eric Dumazet wrote:
> Le lundi 06 juin 2011 à 11:32 -0400, Neil Horman a écrit :
> 
> > Not to drag this out further, but since you illustrated the correct way to do
> > this with the blackhole_ops test, and this modification now gives us two
> > instances of that case, would it perhaps be better to just do this in
> > dst_metrics_write_ptr:
> > 
> > return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;
> > 
> > Then we could eliminate the two functions that do nothing be retun NULL (along
> > with their respective call instructions), and save any future users from having
> > to remember to include a dummy cow_metrics method if they happen to set the read
> > only flag on thier dst_ops?
> 
> Well, I prefer how David coded the thing.
> We can add selective traces where we want.
> 
> Having a default behavior might give much more work to find a bug in
> this area. A NULL pointer access gives us an immediate indication.
> 
> Its a bit late to add an "if (dst->ops->cow_metrics)" test now that we
> covered all call sites ;)
> 
> But we probably have more bugs elsewhere, because of many dst changes in
> 2.6.39
Ok, sounds reasonable to me.

Reviewed-by: Neil Horman <nhorman@tuxdriver.com>

> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - June 7, 2011, 7:52 a.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 06 Jun 2011 16:26:03 +0200

> [PATCH] bridge: provide a cow_metrics method for fake_ops
> 
> Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
> dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
> well.
> 
> This fixes a regression coming from commits 62fa8a846d7d (net: Implement
> read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
> initialize fake_rtable metrics)
> 
> ip link set mybridge mtu 1234
> -->
 ...
> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>


Applied, thanks everyone.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 3fa1231..23b43d2 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -104,10 +104,16 @@  static void fake_update_pmtu(struct dst_entry *dst, u32 mtu)
 {
 }
 
+static u32 *fake_cow_metrics(struct dst_entry *dst, unsigned long old)
+{
+	return NULL;
+}
+
 static struct dst_ops fake_dst_ops = {
 	.family =		AF_INET,
 	.protocol =		cpu_to_be16(ETH_P_IP),
 	.update_pmtu =		fake_update_pmtu,
+	.cow_metrics =		fake_cow_metrics,
 };
 
 /*