Message ID | 1334855781.2395.203.camel@edumazet-glaptop |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
On Thu, Apr 19, 2012 at 07:16:21PM +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> It seems there is a logic error in trace_drop_common(), since we store
> only 64 drops, even if they are from same location.
>
> This fix is a one liner, but we probably need more work to avoid useless
> atomic dec/inc
>
> Now I can watch 1 Mpps drops through dropwatch...
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> ---
> Neil, it seems this code is not SMP/preempt safe. Worker can free our
> data under us, and genlmsg_new() can return NULL under stress.
>
>  net/core/drop_monitor.c |    1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
> index 7f36b38..5c3c81a 100644
> --- a/net/core/drop_monitor.c
> +++ b/net/core/drop_monitor.c
> @@ -150,6 +150,7 @@ static void trace_drop_common(struct sk_buff *skb, void *location)
>  	for (i = 0; i < msg->entries; i++) {
>  		if (!memcmp(&location, msg->points[i].pc, sizeof(void *))) {
>  			msg->points[i].count++;
> +			atomic_inc(&data->dm_hit_count);
>  			goto out;
>  		}
>  	}
>

I spent a good deal of time going through it to make sure it was preempt safe,
but it's certainly possible that I missed something. I'll look at it shortly.
Thanks for this update though

Acked-by: Neil Horman <nhorman@tuxdriver.com>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Thursday 19 April 2012 at 13:28 -0400, Neil Horman wrote:

> I spent a good deal of time going through it to make sure it was preempt safe,
> but it's certainly possible that I missed something. I'll look at it shortly.
> Thanks for this update though
>

You might want to check this problem as well:

[   38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85
[   38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch
[   38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71
[   38.352582] Call Trace:
[   38.352592]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[   38.352599]  [<ffffffff81063f2a>] __might_sleep+0xca/0xf0
[   38.352606]  [<ffffffff81655b16>] mutex_lock+0x26/0x50
[   38.352610]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[   38.352616]  [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90
[   38.352621]  [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170
[   38.352625]  [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40
[   38.352630]  [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0
[   38.352636]  [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0
[   38.352640]  [<ffffffff8154a600>] ? genl_rcv+0x30/0x30
[   38.352645]  [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0
[   38.352649]  [<ffffffff8154a5f0>] genl_rcv+0x20/0x30
[   38.352653]  [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0
[   38.352658]  [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310
[   38.352663]  [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130
[   38.352668]  [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0
[   38.352673]  [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0
[   38.352677]  [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390
[   38.352682]  [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210
[   38.352687]  [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0
[   38.352693]  [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0
[   38.352699]  [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10
[   38.352703]  [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140
[   38.352708]  [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80
[   38.352713]  [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b

On Thu, Apr 19, 2012 at 11:25:33PM +0200, Eric Dumazet wrote:
> On Thursday 19 April 2012 at 13:28 -0400, Neil Horman wrote:
>
> > I spent a good deal of time going through it to make sure it was preempt safe,
> > but it's certainly possible that I missed something. I'll look at it shortly.
> > Thanks for this update though
> >
>
> You might want to check this problem as well:
>
> [   38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85
> [   38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch
> [   38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71
> [   38.352582] Call Trace:
> [   38.352592]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
> [   38.352599]  [<ffffffff81063f2a>] __might_sleep+0xca/0xf0
> [   38.352606]  [<ffffffff81655b16>] mutex_lock+0x26/0x50
> [   38.352610]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
> [   38.352616]  [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90
> [   38.352621]  [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170
> [   38.352625]  [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40
> [   38.352630]  [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0
> [   38.352636]  [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0
> [   38.352640]  [<ffffffff8154a600>] ? genl_rcv+0x30/0x30
> [   38.352645]  [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0
> [   38.352649]  [<ffffffff8154a5f0>] genl_rcv+0x20/0x30
> [   38.352653]  [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0
> [   38.352658]  [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310
> [   38.352663]  [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130
> [   38.352668]  [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0
> [   38.352673]  [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0
> [   38.352677]  [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390
> [   38.352682]  [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210
> [   38.352687]  [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0
> [   38.352693]  [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0
> [   38.352699]  [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10
> [   38.352703]  [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140
> [   38.352708]  [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80
> [   38.352713]  [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b
>

Will do, thanks!
Neil

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 19 Apr 2012 19:16:21 +0200

> From: Eric Dumazet <edumazet@google.com>
>
> It seems there is a logic error in trace_drop_common(), since we store
> only 64 drops, even if they are from same location.
>
> This fix is a one liner, but we probably need more work to avoid useless
> atomic dec/inc
>
> Now I can watch 1 Mpps drops through dropwatch...
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>

Applied, thanks.

diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 7f36b38..5c3c81a 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -150,6 +150,7 @@ static void trace_drop_common(struct sk_buff *skb, void *location)
 	for (i = 0; i < msg->entries; i++) {
 		if (!memcmp(&location, msg->points[i].pc, sizeof(void *))) {
 			msg->points[i].count++;
+			atomic_inc(&data->dm_hit_count);
 			goto out;
 		}
 	}