diff mbox series

net: tcp: ratelimit warnings in tcp_recvmsg

Message ID 5fa93ef0.1c69fb81.bff98.2afc@mx.google.com
State Rejected
Delegated to: David Miller
Headers show
Series net: tcp: ratelimit warnings in tcp_recvmsg | expand

Checks

Context Check Description
jkicinski/cover_letter success Link
jkicinski/fixes_present success Link
jkicinski/patch_count success Link
jkicinski/tree_selection success Guessed tree name to be net-next
jkicinski/subject_prefix warning Target tree name not specified in the subject
jkicinski/source_inline success Was 0 now: 0
jkicinski/verify_signedoff success Link
jkicinski/module_param success Was 0 now: 0
jkicinski/build_32bit success Errors and warnings before: 3 this patch: 3
jkicinski/kdoc success Errors and warnings before: 0 this patch: 0
jkicinski/verify_fixes success Link
jkicinski/checkpatch warning WARNING: line length of 93 exceeds 80 columns
jkicinski/build_allmodconfig_warn success Errors and warnings before: 3 this patch: 3
jkicinski/header_inline success Link
jkicinski/stable success Stable not CCed

Commit Message

Menglong Dong Nov. 9, 2020, 1:06 p.m. UTC
From: Menglong Dong <dong.menglong@zte.com.cn>

'before(*seq, TCP_SKB_CB(skb)->seq) == true' means that one or more
skbs are lost somehow. Once this happen, it seems that it will
never recover automatically. As a result, a warning will be printed
and a '-EAGAIN' will be returned in non-block mode.

As a general suituation, users call 'poll' on a socket and then receive
skbs with 'recv' in non-block mode. This mode will make every
arriving skb of the socket trigger a warning. Plenty of skbs will cause
high rate of kernel log.

Besides, WARN is for indicating kernel bugs only and should not be
user-triggable. Replace it with 'net_warn_ratelimited' here.

Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
---
 net/ipv4/tcp.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

Comments

Eric Dumazet Nov. 9, 2020, 1:36 p.m. UTC | #1
On Mon, Nov 9, 2020 at 2:07 PM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <dong.menglong@zte.com.cn>
>
> 'before(*seq, TCP_SKB_CB(skb)->seq) == true' means that one or more
> skbs are lost somehow. Once this happen, it seems that it will
> never recover automatically. As a result, a warning will be printed
> and a '-EAGAIN' will be returned in non-block mode.
>
> As a general suituation, users call 'poll' on a socket and then receive
> skbs with 'recv' in non-block mode. This mode will make every
> arriving skb of the socket trigger a warning. Plenty of skbs will cause
> high rate of kernel log.
>
> Besides, WARN is for indicating kernel bugs only and should not be
> user-triggable. Replace it with 'net_warn_ratelimited' here.
>
> Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>

I do not think this patch is useful. That is simply code churn.

Can you trigger the WARN() in the latest upstream version ?
If yes this is a serious bug that needs urgent attention.

Make sure you have backported all needed fixes into your kernel, if
you get this warning on a non pristine kernel.
Menglong Dong Nov. 9, 2020, 2:48 p.m. UTC | #2
On Mon, Nov 9, 2020 at 9:36 PM Eric Dumazet <edumazet@google.com> wrote:
>
> I do not think this patch is useful. That is simply code churn.
>
> Can you trigger the WARN() in the latest upstream version ?
> If yes this is a serious bug that needs urgent attention.
>
> Make sure you have backported all needed fixes into your kernel, if
> you get this warning on a non pristine kernel.

Theoretically, this WARN() shouldn't be triggered in any branches.
Somehow, it just happened in kernel v3.10. This really confused me. I
wasn't able to keep tracing it, as it is a product environment.

I notice that the codes for tcp skb receiving didn't change much
between v3.10 and the latest upstream version, and guess the latest
version can be triggered too.

If something is fixed and this WARN() won't be triggered, just ignore me.

Cheers,
Menglong Dong
Eric Dumazet Nov. 9, 2020, 3:39 p.m. UTC | #3
On 11/9/20 3:48 PM, Menglong Dong wrote:
> On Mon, Nov 9, 2020 at 9:36 PM Eric Dumazet <edumazet@google.com> wrote:
>>
>> I do not think this patch is useful. That is simply code churn.
>>
>> Can you trigger the WARN() in the latest upstream version ?
>> If yes this is a serious bug that needs urgent attention.
>>
>> Make sure you have backported all needed fixes into your kernel, if
>> you get this warning on a non pristine kernel.
> 
> Theoretically, this WARN() shouldn't be triggered in any branches.
> Somehow, it just happened in kernel v3.10. This really confused me. I
> wasn't able to keep tracing it, as it is a product environment.
> 
> I notice that the codes for tcp skb receiving didn't change much
> between v3.10 and the latest upstream version, and guess the latest
> version can be triggered too.
> 
> If something is fixed and this WARN() won't be triggered, just ignore me.
> 

Yes, I confirm this WARN() should not trigger.

The bug is not in tcp recvmsg(), that is why you do not see obvious
fix for this issue in 3.10
diff mbox series

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b2bc3d7fe9e8..5e38dfd03036 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2093,11 +2093,12 @@  int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
 			/* Now that we have two receive queues this
 			 * shouldn't happen.
 			 */
-			if (WARN(before(*seq, TCP_SKB_CB(skb)->seq),
-				 "TCP recvmsg seq # bug: copied %X, seq %X, rcvnxt %X, fl %X\n",
-				 *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
-				 flags))
+			if (unlikely(before(*seq, TCP_SKB_CB(skb)->seq))) {
+				net_warn_ratelimited("TCP recvmsg seq # bug: copied %X, seq %X, rcvnxt %X, fl %X\n",
+						     *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
+						     flags);
 				break;
+			}
 
 			offset = *seq - TCP_SKB_CB(skb)->seq;
 			if (unlikely(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)) {
@@ -2108,9 +2109,11 @@  int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
 				goto found_ok_skb;
 			if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
 				goto found_fin_ok;
-			WARN(!(flags & MSG_PEEK),
-			     "TCP recvmsg seq # bug 2: copied %X, seq %X, rcvnxt %X, fl %X\n",
-			     *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt, flags);
+
+			if (!(flags & MSG_PEEK))
+				net_warn_ratelimited("TCP recvmsg seq # bug 2: copied %X, seq %X, rcvnxt %X, fl %X\n",
+						     *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
+						     flags);
 		}
 
 		/* Well, if we have backlog, try to process it now yet. */