Message ID | 1482327763.8944.26.camel@edumazet-glaptop3.roam.corp.google.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
> -----Original Message----- > From: Eric Dumazet [mailto:eric.dumazet@gmail.com] > Sent: Wednesday, December 21, 2016 3:43 PM > > Madalin reported crashes happening in tcp_tasklet_func() on powerpc64 > > Before TSQ_QUEUED bit is cleared, we must ensure the changes done > by list_del(&tp->tsq_node); are committed to memory, otherwise > corruption might happen, as an other cpu could catch TSQ_QUEUED > clearance too soon. > > We can notice that old kernels were immune to this bug, because > TSQ_QUEUED was cleared after a bh_lock_sock(sk)/bh_unlock_sock(sk) > section, but they could have missed a kick to write additional bytes, > when NIC interrupts for a given flow are spread to multiple cpus. > > Affected TCP flows would need an incoming ACK or RTO timer to add more > packets to the pipe. So overall situation should be better now. > > Fixes: b223feb9de2a ("tcp: tsq: add shortcut in tcp_tasklet_func()") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Madalin Bucur <madalin.bucur@nxp.com> > Tested-by: Madalin Bucur <madalin.bucur@nxp.com> It's actually tested by Xing Lei: Tested-by: Xing Lei <xing.lei@nxp.com> Thank you for the fast resolution. > --- > net/ipv4/tcp_output.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index > b45101f3d2bd2e0f0077305a061add4f7ea0de27..31a255b555ad86a3537c077862e3ea38 > f9b44284 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -769,6 +769,7 @@ static void tcp_tasklet_func(unsigned long data) > list_del(&tp->tsq_node); > > sk = (struct sock *)tp; > + smp_mb__before_atomic(); > clear_bit(TSQ_QUEUED, &sk->sk_tsq_flags); > > if (!sk->sk_lock.owned && >
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 21 Dec 2016 05:42:43 -0800 > From: Eric Dumazet <edumazet@google.com> > > Madalin reported crashes happening in tcp_tasklet_func() on powerpc64 > > Before TSQ_QUEUED bit is cleared, we must ensure the changes done > by list_del(&tp->tsq_node); are committed to memory, otherwise > corruption might happen, as an other cpu could catch TSQ_QUEUED > clearance too soon. > > We can notice that old kernels were immune to this bug, because > TSQ_QUEUED was cleared after a bh_lock_sock(sk)/bh_unlock_sock(sk) > section, but they could have missed a kick to write additional bytes, > when NIC interrupts for a given flow are spread to multiple cpus. > > Affected TCP flows would need an incoming ACK or RTO timer to add more > packets to the pipe. So overall situation should be better now. > > Fixes: b223feb9de2a ("tcp: tsq: add shortcut in tcp_tasklet_func()") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Madalin Bucur <madalin.bucur@nxp.com> > Tested-by: Madalin Bucur <madalin.bucur@nxp.com> Applied, thanks Eric.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index b45101f3d2bd2e0f0077305a061add4f7ea0de27..31a255b555ad86a3537c077862e3ea38f9b44284 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -769,6 +769,7 @@ static void tcp_tasklet_func(unsigned long data) list_del(&tp->tsq_node); sk = (struct sock *)tp; + smp_mb__before_atomic(); clear_bit(TSQ_QUEUED, &sk->sk_tsq_flags); if (!sk->sk_lock.owned &&