[net,1/1] tipc: initialize broadcast link stale counter correctly

Message ID 1539288149-24122-1-git-send-email-jon.maloy@ericsson.com
State Under Review
Delegated to: David Miller
Headers show
Series
  • [net,1/1] tipc: initialize broadcast link stale counter correctly
Related show

Commit Message

Jon Maloy Oct. 11, 2018, 8:02 p.m.
In the commit referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
unicast links to the affected node.

Unfortunately, this 'improvement' introduced two bugs, which each and
one alone cause only limited problems, but combined lead to seemingly
stochastic unicast link resets, depending on the amount of broadcast
traffic transmitted.

The first issue, a missing initialization of the 'tolerance' field of
the receiver broadcast link, was recently fixed by commit 047491ea334a
("tipc: set link tolerance correctly in broadcast link").

Ths second issue, where we omit to reset the 'stale_cnt' field of
the same link after a 'stale' period is over, leads to this counter
accumulating over time, and in the absence of the 'tolerance' criteria
leads to the above described symptoms. This commit adds the missing
initialization.

Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet
retransmission")

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/link.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Ying Xue Oct. 12, 2018, 12:41 a.m. | #1
On 10/12/2018 04:02 AM, Jon Maloy wrote:
> In the commit referred to below we added link tolerance as an additional
> criteria for declaring broadcast transmission "stale" and resetting the
> unicast links to the affected node.
> 
> Unfortunately, this 'improvement' introduced two bugs, which each and
> one alone cause only limited problems, but combined lead to seemingly
> stochastic unicast link resets, depending on the amount of broadcast
> traffic transmitted.
> 
> The first issue, a missing initialization of the 'tolerance' field of
> the receiver broadcast link, was recently fixed by commit 047491ea334a
> ("tipc: set link tolerance correctly in broadcast link").
> 
> Ths second issue, where we omit to reset the 'stale_cnt' field of
> the same link after a 'stale' period is over, leads to this counter
> accumulating over time, and in the absence of the 'tolerance' criteria
> leads to the above described symptoms. This commit adds the missing
> initialization.
> 
> Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet
> retransmission")
> 
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Acked-by: Ying Xue <ying.xue@windriver.com>

> ---
>  net/tipc/link.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index f6552e4..201c3b5 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1041,6 +1041,7 @@ static int tipc_link_retrans(struct tipc_link *l, struct tipc_link *r,
>  	if (r->last_retransm != buf_seqno(skb)) {
>  		r->last_retransm = buf_seqno(skb);
>  		r->stale_limit = jiffies + msecs_to_jiffies(r->tolerance);
> +		r->stale_cnt = 0;
>  	} else if (++r->stale_cnt > 99 && time_after(jiffies, r->stale_limit)) {
>  		link_retransmit_failure(l, skb);
>  		if (link_is_bc_sndlink(l))
>

Patch

diff --git a/net/tipc/link.c b/net/tipc/link.c
index f6552e4..201c3b5 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1041,6 +1041,7 @@  static int tipc_link_retrans(struct tipc_link *l, struct tipc_link *r,
 	if (r->last_retransm != buf_seqno(skb)) {
 		r->last_retransm = buf_seqno(skb);
 		r->stale_limit = jiffies + msecs_to_jiffies(r->tolerance);
+		r->stale_cnt = 0;
 	} else if (++r->stale_cnt > 99 && time_after(jiffies, r->stale_limit)) {
 		link_retransmit_failure(l, skb);
 		if (link_is_bc_sndlink(l))