diff mbox series

[net,1/1] tipc: initialize broadcast link stale counter correctly

Message ID 1539288149-24122-1-git-send-email-jon.maloy@ericsson.com
State Accepted, archived
Delegated to: David Miller
Headers show
Series [net,1/1] tipc: initialize broadcast link stale counter correctly | expand

Commit Message

Jon Maloy Oct. 11, 2018, 8:02 p.m. UTC
In the commit referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
unicast links to the affected node.

Unfortunately, this 'improvement' introduced two bugs, which each and
one alone cause only limited problems, but combined lead to seemingly
stochastic unicast link resets, depending on the amount of broadcast
traffic transmitted.

The first issue, a missing initialization of the 'tolerance' field of
the receiver broadcast link, was recently fixed by commit 047491ea334a
("tipc: set link tolerance correctly in broadcast link").

Ths second issue, where we omit to reset the 'stale_cnt' field of
the same link after a 'stale' period is over, leads to this counter
accumulating over time, and in the absence of the 'tolerance' criteria
leads to the above described symptoms. This commit adds the missing
initialization.

Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet
retransmission")

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/link.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Ying Xue Oct. 12, 2018, 12:41 a.m. UTC | #1
On 10/12/2018 04:02 AM, Jon Maloy wrote:
> In the commit referred to below we added link tolerance as an additional
> criteria for declaring broadcast transmission "stale" and resetting the
> unicast links to the affected node.
> 
> Unfortunately, this 'improvement' introduced two bugs, which each and
> one alone cause only limited problems, but combined lead to seemingly
> stochastic unicast link resets, depending on the amount of broadcast
> traffic transmitted.
> 
> The first issue, a missing initialization of the 'tolerance' field of
> the receiver broadcast link, was recently fixed by commit 047491ea334a
> ("tipc: set link tolerance correctly in broadcast link").
> 
> Ths second issue, where we omit to reset the 'stale_cnt' field of
> the same link after a 'stale' period is over, leads to this counter
> accumulating over time, and in the absence of the 'tolerance' criteria
> leads to the above described symptoms. This commit adds the missing
> initialization.
> 
> Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet
> retransmission")
> 
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Acked-by: Ying Xue <ying.xue@windriver.com>

> ---
>  net/tipc/link.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index f6552e4..201c3b5 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1041,6 +1041,7 @@ static int tipc_link_retrans(struct tipc_link *l, struct tipc_link *r,
>  	if (r->last_retransm != buf_seqno(skb)) {
>  		r->last_retransm = buf_seqno(skb);
>  		r->stale_limit = jiffies + msecs_to_jiffies(r->tolerance);
> +		r->stale_cnt = 0;
>  	} else if (++r->stale_cnt > 99 && time_after(jiffies, r->stale_limit)) {
>  		link_retransmit_failure(l, skb);
>  		if (link_is_bc_sndlink(l))
>
David Miller Oct. 16, 2018, 5:04 a.m. UTC | #2
From: Jon Maloy <jon.maloy@ericsson.com>
Date: Thu, 11 Oct 2018 22:02:29 +0200

> In the commit referred to below we added link tolerance as an additional
> criteria for declaring broadcast transmission "stale" and resetting the
> unicast links to the affected node.
> 
> Unfortunately, this 'improvement' introduced two bugs, which each and
> one alone cause only limited problems, but combined lead to seemingly
> stochastic unicast link resets, depending on the amount of broadcast
> traffic transmitted.
> 
> The first issue, a missing initialization of the 'tolerance' field of
> the receiver broadcast link, was recently fixed by commit 047491ea334a
> ("tipc: set link tolerance correctly in broadcast link").
> 
> Ths second issue, where we omit to reset the 'stale_cnt' field of
> the same link after a 'stale' period is over, leads to this counter
> accumulating over time, and in the absence of the 'tolerance' criteria
> leads to the above described symptoms. This commit adds the missing
> initialization.
> 
> Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet
> retransmission")
> 
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Applied, thanks Jon.

In the future, please always make your Fixes: tag one single line no
matter how long it is.  And also do not separate it from other tags
like the signoff with empty lines.  All of the tags should be
collected together without any empty lines in between.

Thank you.
diff mbox series

Patch

diff --git a/net/tipc/link.c b/net/tipc/link.c
index f6552e4..201c3b5 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1041,6 +1041,7 @@  static int tipc_link_retrans(struct tipc_link *l, struct tipc_link *r,
 	if (r->last_retransm != buf_seqno(skb)) {
 		r->last_retransm = buf_seqno(skb);
 		r->stale_limit = jiffies + msecs_to_jiffies(r->tolerance);
+		r->stale_cnt = 0;
 	} else if (++r->stale_cnt > 99 && time_after(jiffies, r->stale_limit)) {
 		link_retransmit_failure(l, skb);
 		if (link_is_bc_sndlink(l))