diff mbox series

[linux-next,6f5b24e] WARNING: CPU: 0 PID: 10288 at net/ipv4/tcp_input.c:889 tcp_update_reordering+0x1c4/0x1e0

Message ID 1507716177.3792.35.camel@abdul.in.ibm.com (mailing list archive)
State Not Applicable
Headers show
Series [linux-next,6f5b24e] WARNING: CPU: 0 PID: 10288 at net/ipv4/tcp_input.c:889 tcp_update_reordering+0x1c4/0x1e0 | expand

Commit Message

Abdul Haleem Oct. 11, 2017, 10:02 a.m. UTC
Hi,

CPU OFF & ON in a loop on next kernel results in WARNING in dmesg.

Machine Type: Power 8 PowerVM LPAR
Kernel : 4.14.0-rc4-next-20171009
Gcc : 6.3.1
config : attached
Test: CPU toggle

WARN_ON was first introduced with the commit:

commit 6f5b24eed0278136c29c27f2a7b3a2b6a202ac68
Author: Soheil Hassas Yeganeh <soheil@google.com>
Date:   Tue May 16 17:39:02 2017 -0400

    tcp: warn on negative reordering values
    
    Commit bafbb9c73241 ("tcp: eliminate negative reordering
    in tcp_clean_rtx_queue") fixes an issue for negative
    reordering metrics.
    
    To be resilient to such errors, warn and return
    when a negative metric is passed to tcp_update_reordering().
    
    Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

 

Trace:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 2380 at net/ipv4/tcp_input.c:889 tcp_update_reordering+0x120/0x140
Modules linked in: rpadlpar_io rpaphp xt_addrtype xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto rtc_generic pseries_rng autofs4
CPU: 1 PID: 2380 Comm: sh Not tainted 4.14.0-rc4-next-20171009 #4
task: c0000000fc206480 task.stack: c0000000fc7b4000
NIP:  c000000000b16b30 LR: c000000000b1b838 CTR: 0000000000000000
REGS: c00000077ffdb2d0 TRAP: 0700   Not tainted  (4.14.0-rc4-next-20171009)
MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 88024428  XER: 20000018
CFAR: c000000000b16a48 SOFTE: 1
GPR00: c000000000b1b838 c00000077ffdb550 c0000000015bd600 c0000000fe22f080
GPR04: ffffffffffffffe5 0000000000000000 49dd7cee49dd7c2e c00000077ffdb5b0
GPR08: 00000000ffffffff 0000000000000003 000000000000001a d000000006d11b20
GPR12: 0000000048024422 c00000000e740a80 c0000000fe22f080 00000000ffffffff
GPR16: 0000000000000000 0000000000000000 0000000000000000 c00000077ffdb704
GPR20: c00000077ffdb6e4 0000000000000003 0000000049dd7cee 0000000000000001
GPR24: c00000077ffdb5b0 0000000000000000 c00000077ffdb6e0 0000000000000001
GPR28: c0000000fe22f080 ffffffffffffffe5 0000000000000000 c0000000fe22f080
NIP [c000000000b16b30] tcp_update_reordering+0x120/0x140
LR [c000000000b1b838] tcp_sacktag_write_queue+0x568/0xa70
Call Trace:
[c00000077ffdb550] [c00000077ffdb6e0] 0xc00000077ffdb6e0 (unreliable)
[c00000077ffdb590] [c000000000b1b838] tcp_sacktag_write_queue+0x568/0xa70
[c00000077ffdb650] [c000000000b1e054] tcp_ack+0x3b4/0x14c0
[c00000077ffdb7e0] [c000000000b21954] tcp_rcv_established+0x1d4/0x6b0
[c00000077ffdb840] [c000000000b2f1e0] tcp_v4_do_rcv+0x1a0/0x2c0
[c00000077ffdb880] [c000000000b3281c] tcp_v4_rcv+0xc6c/0xea0
[c00000077ffdb970] [c000000000afccc0] ip_local_deliver_finish+0x170/0x350
[c00000077ffdb9c0] [c000000000afd58c] ip_local_deliver+0x5c/0x130
[c00000077ffdba20] [c000000000afd0a8] ip_rcv_finish+0x208/0x4b0
[c00000077ffdbab0] [c000000000afd9c8] ip_rcv+0x368/0x480
[c00000077ffdbb20] [c000000000a8fb00] __netif_receive_skb_core+0xa80/0xeb0
[c00000077ffdbbe0] [c000000000a98688] netif_receive_skb_internal+0xd8/0x130
[c00000077ffdbc20] [c000000000a996cc] napi_gro_receive+0x11c/0x1d0
[c00000077ffdbc60] [c0000000008e23bc] ibmveth_poll+0x31c/0x700
[c00000077ffdbd50] [c000000000a98f0c] net_rx_action+0x37c/0x480
[c00000077ffdbe50] [c000000000c3e730] __do_softirq+0x170/0x3dc
[c00000077ffdbf40] [c00000000010a208] irq_exit+0xe8/0x120
[c00000077ffdbf60] [c000000000016f8c] __do_irq+0x8c/0x1d0
[c00000077ffdbf90] [c00000000002b1a0] call_do_irq+0x14/0x24
[c0000000fc7b7de0] [c000000000017168] do_IRQ+0x98/0x140
[c0000000fc7b7e30] [c000000000008ac4] hardware_interrupt_common+0x114/0x120
Instruction dump:
7c0803a6 4e800020 60000000 60420000 e93f0620 3940001d 7928e721 4182ff9c
790affe2 214a001c 4bffff90 60420000 <0fe00000> 38210040 e8010010 eba1ffe8
---[ end trace c8b2e99246cd4592 ]---

Comments

Soheil Hassas Yeganeh Oct. 11, 2017, 2:08 p.m. UTC | #1
On Wed, Oct 11, 2017 at 6:02 AM, Abdul Haleem
<abdhalee@linux.vnet.ibm.com> wrote:
> Hi,
>
> CPU OFF & ON in a loop on next kernel results in WARNING in dmesg.
>
> Machine Type: Power 8 PowerVM LPAR
> Kernel : 4.14.0-rc4-next-20171009
> Gcc : 6.3.1
> config : attached
> Test: CPU toggle
>
> WARN_ON was first introduced with the commit:
>
> commit 6f5b24eed0278136c29c27f2a7b3a2b6a202ac68
> Author: Soheil Hassas Yeganeh <soheil@google.com>
> Date:   Tue May 16 17:39:02 2017 -0400
>
>     tcp: warn on negative reordering values
>
>     Commit bafbb9c73241 ("tcp: eliminate negative reordering
>     in tcp_clean_rtx_queue") fixes an issue for negative
>     reordering metrics.
>
>     To be resilient to such errors, warn and return
>     when a negative metric is passed to tcp_update_reordering().
>
>     Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
>     Signed-off-by: Neal Cardwell <ncardwell@google.com>
>     Signed-off-by: Yuchung Cheng <ycheng@google.com>
>     Signed-off-by: Eric Dumazet <edumazet@google.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index bbadd79..2fa55f5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -887,6 +887,9 @@ static void tcp_update_reordering(struct sock *sk,
> const int metric,
>         struct tcp_sock *tp = tcp_sk(sk);
>         int mib_idx;
>
> +       if (WARN_ON_ONCE(metric < 0))
> +               return;
> +

Thank you for your report, Abdul! This warning was added to find the
root cause of insanely high reordering in TCP, which luckily I believe
you have reproduced.

Yuchung had pointed out offline that we are using inconsistent types
in using in tcp_sacktag_state vs tcp_sock: int for reord and
fack_count in tcp_sacktag_state and u32 for their related fields in
tcp_sock. This is likely to be the culprit here.

Would you mind giving the following patch a try please? If that fixes
the issue for you, we will mail that as a fix. Thanks!

From: Soheil Hassas Yeganeh <soheil@google.com>
Date: Wed, 11 Oct 2017 09:52:49 -0400
Subject: [PATCH net] tcp: fix type of fack_count and reord in sack_tag_state

This bug is found by Yuchung.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 net/ipv4/tcp_input.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d0682ce2a5d6..2fada7acbdc5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1132,8 +1132,8 @@ static bool tcp_check_dsack(struct sock *sk,
const struct sk_buff *ack_skb,
 }

 struct tcp_sacktag_state {
-       int reord;
-       int fack_count;
+       u32 reord;
+       u32 fack_count;
        /* Timestamps for earliest and latest never-retransmitted segment
         * that was SACKed. RTO needs the earliest RTT to stay conservative,
         * but congestion control should still get an accurate delay signal.
@@ -1209,7 +1209,7 @@ static u8 tcp_sacktag_one(struct sock *sk,
                        u64 xmit_time)
 {
        struct tcp_sock *tp = tcp_sk(sk);
-       int fack_count = state->fack_count;
+       u32 fack_count = state->fack_count;

        /* Account D-SACK for retransmitted packet. */
        if (dup_sack && (sacked & TCPCB_RETRANS)) {
diff mbox series

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index bbadd79..2fa55f5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -887,6 +887,9 @@  static void tcp_update_reordering(struct sock *sk,
const int metric,
        struct tcp_sock *tp = tcp_sk(sk);
        int mib_idx;
 
+       if (WARN_ON_ONCE(metric < 0))
+               return;
+
        if (metric > tp->reordering) {
                tp->reordering = min(sysctl_tcp_max_reordering, metric);