[RFC,net] neigh: do not modify unlinked entries

Message ID 1434484599-5875-1-git-send-email-ja@ssi.bg
State Accepted, archived
Delegated to: David Miller

Commit Message

Julian Anastasov June 16, 2015, 7:56 p.m. UTC
Lockless lookups can return an entry that is already unlinked.
Sometimes they take a reference before the last neigh_cleanup_and_release,
sometimes they do not need a reference. Later, any
modification attempt may result in the following problems:

1. The entry is not destroyed immediately because neigh_update
can start the timer for a dead entry, e.g. on a change to the NUD_REACHABLE
state. As a result, the entry lives for some time but is invisible
and out of control.

2. __neigh_event_send can run in parallel with neigh_destroy
while refcnt=0, but if the timer is started and then expires, refcnt can
reach 0 for a second time, leading to a second neigh_destroy and
a possible crash.

Thanks to Eric Dumazet and Ying Xue for their work and analysis
on the __neigh_event_send change.

Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
 net/core/neighbour.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

  This is an RFC, so that it can get a proper commit message,
testing and reports. In fact, I'm interested in seeing valid
stack dumps for the "NEIGH: BUG, double timer add, state is %x"
message without this patch and without any debug patches that
dump the stack from neigh_hold or other places...

Comments

Eric Dumazet June 19, 2015, 6:40 a.m. UTC | #1
On Tue, 2015-06-16 at 22:56 +0300, Julian Anastasov wrote:
> The lockless lookups can return entry that is unlinked.
> Sometimes they get reference before last neigh_cleanup_and_release,
> sometimes they do not need reference. Later, any
> modification attempts may result in the following problems:
> 
> 1. entry is not destroyed immediately because neigh_update
> can start the timer for dead entry, eg. on change to NUD_REACHABLE
> state. As result, entry lives for some time but is invisible
> and out of control.
> 
> 2. __neigh_event_send can run in parallel with neigh_destroy
> while refcnt=0 but if timer is started and expired refcnt can
> reach 0 for second time leading to second neigh_destroy and
> possible crash.
> 
> Thanks to Eric Dumazet and Ying Xue for their work and analyze
> on the __neigh_event_send change.
> 
> Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
> Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
> Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Ying Xue <ying.xue@windriver.com>
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
> ---

Seems good to me, Julian!

Acked-by: Eric Dumazet <edumazet@google.com>


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hideaki Yoshifuji June 19, 2015, 7:14 a.m. UTC | #2
Hi,

Julian Anastasov wrote:
> The lockless lookups can return entry that is unlinked.
> Sometimes they get reference before last neigh_cleanup_and_release,
> sometimes they do not need reference. Later, any
> modification attempts may result in the following problems:
> 
> 1. entry is not destroyed immediately because neigh_update
> can start the timer for dead entry, eg. on change to NUD_REACHABLE
> state. As result, entry lives for some time but is invisible
> and out of control.
> 
> 2. __neigh_event_send can run in parallel with neigh_destroy
> while refcnt=0 but if timer is started and expired refcnt can
> reach 0 for second time leading to second neigh_destroy and
> possible crash.
> 
> Thanks to Eric Dumazet and Ying Xue for their work and analyze
> on the __neigh_event_send change.
> 
> Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
> Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
> Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Ying Xue <ying.xue@windriver.com>
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
> ---
>  net/core/neighbour.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
>   This is an RFC, so that it can get proper commit message,
> testing and reports. In fact, I'm interested to see valid
> stack dumps for the "NEIGH: BUG, double timer add, state is %x"
> message without this patch and without any debug patches that
> dump stack from neigh_hold or other places...
> 
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 3de6542..2237c1b 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -957,6 +957,8 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>  	rc = 0;
>  	if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
>  		goto out_unlock_bh;
> +	if (neigh->dead)
> +		goto out_dead;
>  
>  	if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
>  		if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
> @@ -1013,6 +1015,13 @@ out_unlock_bh:
>  		write_unlock(&neigh->lock);
>  	local_bh_enable();
>  	return rc;
> +
> +out_dead:
> +	if (neigh->nud_state & NUD_STALE)
> +		goto out_unlock_bh;
> +	write_unlock_bh(&neigh->lock);
> +	kfree_skb(skb);
> +	return 1;
>  }
>  EXPORT_SYMBOL(__neigh_event_send);
>  

Shouldn't we always drop the packet here, since the entry
is already dead?

--yoshfuji

> @@ -1076,6 +1085,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>  	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
>  	    (old & (NUD_NOARP | NUD_PERMANENT)))
>  		goto out;
> +	if (neigh->dead)
> +		goto out;
>  
>  	if (!(new & NUD_VALID)) {
>  		neigh_del_timer(neigh);
> @@ -1225,6 +1236,8 @@ EXPORT_SYMBOL(neigh_update);
>   */
>  void __neigh_set_probe_once(struct neighbour *neigh)
>  {
> +	if (neigh->dead)
> +		return;
>  	neigh->updated = jiffies;
>  	if (!(neigh->nud_state & NUD_FAILED))
>  		return;
>

Julian Anastasov June 19, 2015, 8:24 a.m. UTC | #3
Hello,

On Fri, 19 Jun 2015, YOSHIFUJI Hideaki/吉藤英明 wrote:

> Shouldn't we always drop the packet here, since the entry
> is already dead?

	It can be a NETDEV_CHANGEADDR event; eth_header()
will build a valid header. It can be some race condition
with neigh_forced_gc and neigh_periodic_work where we cannot
fall back to neigh_create. It is not our job to
drop packets, so I preferred to avoid it...

Regards

--
Julian Anastasov <ja@ssi.bg>

David Miller June 21, 2015, 4:43 p.m. UTC | #4
From: Julian Anastasov <ja@ssi.bg>
Date: Tue, 16 Jun 2015 22:56:39 +0300

> The lockless lookups can return entry that is unlinked.
> Sometimes they get reference before last neigh_cleanup_and_release,
> sometimes they do not need reference. Later, any
> modification attempts may result in the following problems:
> 
> 1. entry is not destroyed immediately because neigh_update
> can start the timer for dead entry, eg. on change to NUD_REACHABLE
> state. As result, entry lives for some time but is invisible
> and out of control.
> 
> 2. __neigh_event_send can run in parallel with neigh_destroy
> while refcnt=0 but if timer is started and expired refcnt can
> reach 0 for second time leading to second neigh_destroy and
> possible crash.
> 
> Thanks to Eric Dumazet and Ying Xue for their work and analyze
> on the __neigh_event_send change.
> 
> Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
> Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
> Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Ying Xue <ying.xue@windriver.com>
> Signed-off-by: Julian Anastasov <ja@ssi.bg>

Applied and queued up for -stable, thanks!

Patch

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 3de6542..2237c1b 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -957,6 +957,8 @@  int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
 	rc = 0;
 	if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
 		goto out_unlock_bh;
+	if (neigh->dead)
+		goto out_dead;
 
 	if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
 		if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
@@ -1013,6 +1015,13 @@  out_unlock_bh:
 		write_unlock(&neigh->lock);
 	local_bh_enable();
 	return rc;
+
+out_dead:
+	if (neigh->nud_state & NUD_STALE)
+		goto out_unlock_bh;
+	write_unlock_bh(&neigh->lock);
+	kfree_skb(skb);
+	return 1;
 }
 EXPORT_SYMBOL(__neigh_event_send);
 
@@ -1076,6 +1085,8 @@  int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
 	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
 	    (old & (NUD_NOARP | NUD_PERMANENT)))
 		goto out;
+	if (neigh->dead)
+		goto out;
 
 	if (!(new & NUD_VALID)) {
 		neigh_del_timer(neigh);
@@ -1225,6 +1236,8 @@  EXPORT_SYMBOL(neigh_update);
  */
 void __neigh_set_probe_once(struct neighbour *neigh)
 {
+	if (neigh->dead)
+		return;
 	neigh->updated = jiffies;
 	if (!(neigh->nud_state & NUD_FAILED))
 		return;