Message ID | 20191106062610.12039-2-hoang.h.le@dektech.com.au |
---|---|
State | Accepted |
Delegated to: | David Miller |
Series | [net-next,1/2] tipc: update cluster capabilities if node deleted |
Acked-by: Jon

> -----Original Message-----
> From: Hoang Le <hoang.h.le@dektech.com.au>
> Sent: 6-Nov-19 01:26
> To: Jon Maloy <jon.maloy@ericsson.com>; maloy@donjonn.com; netdev@vger.kernel.org; tipc-discussion@lists.sourceforge.net
> Subject: [net-next 2/2] tipc: reduce sensitive to retransmit failures
>
> With huge cluster (e.g >200nodes), the amount of that flow:
> gap -> retransmit packet -> acked will take time in case of STATE_MSG
> dropped/delayed because a lot of traffic. This lead to 1.5 sec tolerance
> value criteria made link easy failure around 2nd, 3rd of failed
> retransmission attempts.
>
> Instead of re-introduced criteria of 99 faled retransmissions to fix the
> issue, we increase failure detection timer to ten times tolerance value.
>
> Fixes: 77cf8edbc0e7 ("tipc: simplify stale link failure criteria")
> Acked-by: Jon Maloy <jon.maloy@ericsson.com>
> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
> ---
>  net/tipc/link.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 038861bad72b..2aed7a958a8c 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1087,7 +1087,7 @@ static bool link_retransmit_failure(struct tipc_link *l, struct tipc_link *r,
>  		return false;
>
>  	if (!time_after(jiffies, TIPC_SKB_CB(skb)->retr_stamp +
> -			msecs_to_jiffies(r->tolerance)))
> +			msecs_to_jiffies(r->tolerance * 10)))
>  		return false;
>
>  	hdr = buf_msg(skb);
> --
> 2.20.1
From: Hoang Le <hoang.h.le@dektech.com.au>
Date: Wed, 6 Nov 2019 13:26:10 +0700

> With huge cluster (e.g >200nodes), the amount of that flow:
> gap -> retransmit packet -> acked will take time in case of STATE_MSG
> dropped/delayed because a lot of traffic. This lead to 1.5 sec tolerance
> value criteria made link easy failure around 2nd, 3rd of failed
> retransmission attempts.
>
> Instead of re-introduced criteria of 99 faled retransmissions to fix the
> issue, we increase failure detection timer to ten times tolerance value.
>
> Fixes: 77cf8edbc0e7 ("tipc: simplify stale link failure criteria")
> Acked-by: Jon Maloy <jon.maloy@ericsson.com>
> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>

Applied.
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 038861bad72b..2aed7a958a8c 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1087,7 +1087,7 @@ static bool link_retransmit_failure(struct tipc_link *l, struct tipc_link *r,
 		return false;
 
 	if (!time_after(jiffies, TIPC_SKB_CB(skb)->retr_stamp +
-			msecs_to_jiffies(r->tolerance)))
+			msecs_to_jiffies(r->tolerance * 10)))
 		return false;
 
 	hdr = buf_msg(skb);
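For readers who want to see the effect of the changed comparison outside the kernel, here is a minimal userspace sketch. It is not the TIPC code. The names `ticks_after`, `ms_to_ticks` and `retransmit_window_expired` are hypothetical, and it assumes one tick equals one millisecond, whereas real jiffies depend on HZ. With the 1.5 sec tolerance from the commit message, the window before a retransmission counts as failed grows from 1500 ms to 15000 ms.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's time_after(a, b): true if a is later
 * than b. The signed difference keeps it correct across counter wraparound. */
static bool ticks_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Hypothetical helper. Assumption: 1 tick == 1 ms (real jiffies depend on HZ). */
static uint32_t ms_to_ticks(uint32_t ms)
{
	return ms;
}

/* Mirrors the shape of the patched test: the retransmission only counts as
 * failed once `now` is past retr_stamp + tolerance * factor.
 * factor == 1 models the old check, factor == 10 the patched one. */
static bool retransmit_window_expired(uint32_t now, uint32_t retr_stamp,
				      uint32_t tolerance_ms, uint32_t factor)
{
	return ticks_after(now, retr_stamp + ms_to_ticks(tolerance_ms * factor));
}
```

In this model, a packet first retransmitted at tick 1000 and still unacknowledged at tick 4000 is past the old 1500 ms window but well inside the new 15000 ms one, so the link would not be declared failed yet.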