Message ID | 1431495814-22759-1-git-send-email-ying.xue@windriver.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
On Wed, 2015-05-13 at 13:43 +0800, Ying Xue wrote:
> Once modifying a pending timer of a neighbour, it's insufficient to
> post a warning message. Instead we should not take the neighbour's
> reference count at the same time, otherwise, it causes an issue that
> the neighbour cannot be freed forever.
>
> Signed-off-by: Ying Xue <ying.xue@windriver.com>
> ---
>  net/core/neighbour.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 3de6542..5595db3 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -164,10 +164,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>
>  static void neigh_add_timer(struct neighbour *n, unsigned long when)
>  {
> -	neigh_hold(n);
> -	if (unlikely(mod_timer(&n->timer, when))) {
> -		printk("NEIGH: BUG, double timer add, state is %x\n",
> -		       n->nud_state);
> +	if (likely(!mod_timer(&n->timer, when))) {
> +		neigh_hold(n);
> +	} else {
> +		pr_warn("NEIGH: BUG, double timer add, state is %x\n",
> +			n->nud_state);
>  		dump_stack();
>  	}
>  }

Have you hit this condition?

If yes, there is a bug elsewhere and we need to fix it, not try to
recover.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/13/2015 02:20 PM, Eric Dumazet wrote:
> Have you hit this condition?
>

No, I just found the issue when I was reading the code.

> If yes, there is a bug elsewhere and we need to fix it, not try to
> recover.
>

Sorry, I don't know the background here, so could you please give some
hints?

Thanks,
Ying
On 05/13/2015 03:25 PM, Ying Xue wrote:
> On 05/13/2015 02:20 PM, Eric Dumazet wrote:
>> Have you hit this condition?
>>
>
> No, I just found the issue when I was reading the code.
>
>> If yes, there is a bug elsewhere and we need to fix it, not try to
>> recover.
>>

After a search, I found that the bug you mentioned is probably this one:

https://patchwork.ozlabs.org/patch/458112/

But I think that bug is a different issue from the one this patch
addresses. When that bug happened, neigh_add_timer() took a reference on
an object whose refcount was already zero, which means somebody had
already decreased the refcount to zero before neigh_add_timer() was
called. So that bug is unrelated to neigh_add_timer(), although it
really does need to be fixed.

The behaviour this patch tries to fix is wrong because it violates the
usual pattern for taking an object's refcount in the object's timer
handler and when its timer is started or deleted. Specifically, when the
neighbour's timer being modified is already pending, we should not take
its refcount; otherwise the neighbour is leaked forever.

Therefore, the patch is still needed even if the bug you mentioned is
not fixed.

Regards,
Ying

>
> Sorry, I don't know the background here, so could you please give some
> hints?
>
> Thanks,
> Ying
On Wed, 2015-05-13 at 18:32 +0800, Ying Xue wrote:
> On 05/13/2015 03:25 PM, Ying Xue wrote:
> > On 05/13/2015 02:20 PM, Eric Dumazet wrote:
> >> Have you hit this condition?
> >>
> >
> > No, I just found the issue when I was reading the code.
> >
> >> If yes, there is a bug elsewhere and we need to fix it, not try to
> >> recover.
> >>
>
> After a search, I found that the bug you mentioned is probably this one:
>
> https://patchwork.ozlabs.org/patch/458112/
>
> But I think that bug is a different issue from the one this patch
> addresses. When that bug happened, neigh_add_timer() took a reference on
> an object whose refcount was already zero, which means somebody had
> already decreased the refcount to zero before neigh_add_timer() was
> called. So that bug is unrelated to neigh_add_timer(), although it
> really does need to be fixed.
>
> The behaviour this patch tries to fix is wrong because it violates the
> usual pattern for taking an object's refcount in the object's timer
> handler and when its timer is started or deleted. Specifically, when the
> neighbour's timer being modified is already pending, we should not take
> its refcount; otherwise the neighbour is leaked forever.
>
> Therefore, the patch is still needed even if the bug you mentioned is
> not fixed.

Sorry, your patch is wrong.

As I said, if the issue is there, a different fix is needed.

The caller of the function must own a refcount by definition.
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 3de6542..5595db3 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -164,10 +164,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
 
 static void neigh_add_timer(struct neighbour *n, unsigned long when)
 {
-	neigh_hold(n);
-	if (unlikely(mod_timer(&n->timer, when))) {
-		printk("NEIGH: BUG, double timer add, state is %x\n",
-		       n->nud_state);
+	if (likely(!mod_timer(&n->timer, when))) {
+		neigh_hold(n);
+	} else {
+		pr_warn("NEIGH: BUG, double timer add, state is %x\n",
+			n->nud_state);
 		dump_stack();
 	}
 }
Once modifying a pending timer of a neighbour, it's insufficient to
post a warning message. Instead we should not take the neighbour's
reference count at the same time, otherwise, it causes an issue that
the neighbour cannot be freed forever.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/core/neighbour.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
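
For readers following the refcount argument in this thread, here is a
minimal userspace sketch of the accounting in question. It is not kernel
code: every toy_* name, the two add_timer_* variants, and
refs_after_double_add() are hypothetical stand-ins invented for
illustration. The only kernel behaviour assumed is that mod_timer()
returns 1 when the timer was already pending and 0 otherwise, and that
the timer's reference is dropped only when a pending timer is actually
cancelled. The sketch shows what happens to the count if a double add
does occur; it says nothing about whether a caller can legitimately
trigger one, which is the point raised in the review.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy stand-ins for struct timer_list and struct neighbour. */
struct toy_timer {
	bool pending;
	unsigned long expires;
};

struct toy_neigh {
	int refcnt;
	struct toy_timer timer;
};

/* Assumed to mirror mod_timer(): returns 1 if the timer was already pending. */
static int toy_mod_timer(struct toy_timer *t, unsigned long when)
{
	int was_pending = t->pending;

	t->pending = true;
	t->expires = when;
	return was_pending;
}

/* Assumed to mirror del_timer(): returns 1 if a pending timer was cancelled. */
static int toy_del_timer(struct toy_timer *t)
{
	int was_pending = t->pending;

	t->pending = false;
	return was_pending;
}

static void toy_hold(struct toy_neigh *n)
{
	n->refcnt++;
}

static void toy_put(struct toy_neigh *n)
{
	assert(n->refcnt > 0);
	n->refcnt--;
}

/* Pre-patch shape: take the reference first, then arm the timer. */
static void add_timer_current(struct toy_neigh *n, unsigned long when)
{
	toy_hold(n);
	(void)toy_mod_timer(&n->timer, when); /* the kernel warns if this is 1 */
}

/* Shape proposed by the patch: take the reference only on a fresh arm. */
static void add_timer_proposed(struct toy_neigh *n, unsigned long when)
{
	if (!toy_mod_timer(&n->timer, when))
		toy_hold(n);
}

/* Drop the timer's reference only if a pending timer was cancelled. */
static int toy_neigh_del_timer(struct toy_neigh *n)
{
	if (toy_del_timer(&n->timer)) {
		toy_put(n);
		return 1;
	}
	return 0;
}

/*
 * Start with the caller's own reference, add the timer twice, cancel it
 * once, and report the refcount that is left.
 */
static int refs_after_double_add(void (*add)(struct toy_neigh *, unsigned long))
{
	struct toy_neigh n = { .refcnt = 1 };

	add(&n, 10);
	add(&n, 20);
	toy_neigh_del_timer(&n);
	return n.refcnt;
}
```

Under these assumptions, refs_after_double_add(add_timer_current) leaves
one reference beyond the caller's own, while
refs_after_double_add(add_timer_proposed) returns to the caller's single
reference.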