
[v2] tcp: verify the checksum of the first data segment in a new connection

Message ID 5b203e1b.vy4yU6CwMEwLmNtj%fllinden@amazon.com
State Changes Requested, archived
Delegated to: David Miller
Series [v2] tcp: verify the checksum of the first data segment in a new connection

Commit Message

Frank van der Linden June 12, 2018, 9:41 p.m. UTC
commit 079096f103fa ("tcp/dccp: install syn_recv requests into ehash
table") introduced an optimization for the handling of child sockets
created for a new TCP connection.

But this optimization passes any data associated with the last ACK of the
connection handshake up the stack without verifying its checksum, because it
calls tcp_child_process(), which in turn calls tcp_rcv_state_process()
directly.  These lower-level processing functions do not do any checksum
verification.

Insert a tcp_checksum_complete() call in the TCP_NEW_SYN_RECV path to
fix this.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
---
 net/ipv4/tcp_ipv4.c | 10 +++++++++-
 net/ipv6/tcp_ipv6.c | 10 +++++++++-
 2 files changed, 18 insertions(+), 2 deletions(-)
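
In outline, the v2 patch (shown in full at the bottom of this page) verifies the
segment checksum before tcp_check_req() is allowed to create the child socket, and
defers the jump to csum_error until the existing cleanup has run. A condensed
paraphrase of the IPv4 hunks, with the surrounding control flow in tcp_v4_rcv()
omitted (not verbatim kernel source):

/* Gate child-socket creation on the full segment checksum
 * (condensed paraphrase of the v2 IPv4 hunks below).
 */
csumerr = tcp_checksum_complete(skb);
if (!csumerr)
	nsk = tcp_check_req(sk, skb, req, false, &req_stolen);

/* ...and on the shared discard path, report the checksum error: */
if (csumerr)
	goto csum_error;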

Comments

Eric Dumazet June 12, 2018, 9:50 p.m. UTC | #1
On 06/12/2018 02:41 PM, Frank van der Linden wrote:
> commit 079096f103fa ("tcp/dccp: install syn_recv requests into ehash
> table") introduced an optimization for the handling of child sockets
> created for a new TCP connection.
> 
> But this optimization passes any data associated with the last ACK of the
> connection handshake up the stack without verifying its checksum, because it
> calls tcp_child_process(), which in turn calls tcp_rcv_state_process()
> directly.  These lower-level processing functions do not do any checksum
> verification.
> 
> Insert a tcp_checksum_complete() call in the TCP_NEW_SYN_RECV path to
> fix this.
> 
> Signed-off-by: Frank van der Linden <fllinden@amazon.com>


This is way too complicated.

You should call tcp_checksum_complete() earlier and avoid all this mess.


IPv4 part shown here:

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fed3f1c6616708997f621535efe9412e4afa0a50..7b5f32aa3835b0124b0a9bd342c371df7b46f471 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1730,6 +1730,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
                        reqsk_put(req);
                        goto discard_it;
                }
+               if (unlikely(tcp_checksum_complete(skb))) {
+                       reqsk_put(req);
+                       goto csum_error;
+               }
                if (unlikely(sk->sk_state != TCP_LISTEN)) {
                        inet_csk_reqsk_queue_drop_and_put(sk, req);
                        goto lookup;
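
The IPv6 side is not shown, but tcp_v6_rcv() has the same shape in its
TCP_NEW_SYN_RECV branch (see the v2 patch at the bottom of this page), so the
counterpart hunk would presumably be the mirror image. A sketch, not a tested patch:

/* Presumed IPv6 counterpart in net/ipv6/tcp_ipv6.c, tcp_v6_rcv(),
 * TCP_NEW_SYN_RECV branch; sketch only, not a tested patch.
 */
if (unlikely(tcp_checksum_complete(skb))) {
	reqsk_put(req);
	goto csum_error;
}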
Frank van der Linden June 12, 2018, 9:53 p.m. UTC | #2
The convention seems to be to call tcp_checksum_complete() after tcp_filter() has had a chance to deal with the packet. I wanted to preserve that.

If that is not a concern, then I agree that this is a far better way to go.

Frank

Eric Dumazet June 12, 2018, 10:03 p.m. UTC | #3
On 06/12/2018 02:53 PM, van der Linden, Frank wrote:
> The convention seems to be to call tcp_checksum_complete after tcp_filter has a chance to deal with the packet. I wanted to preserve that.
> 
> If that is not a concern, then I agree that this is a far better way to go.
> 
> Frank

Given that we can drop the packet earlier from:

if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
     goto csum_error;

I am quite sure we really do not care whether tcp_filter() is hit
by packets with a bad checksum.

Thanks
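
Roughly speaking, skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo)
already rejects many bad packets at this point (for example when the NIC has
supplied a full checksum), and otherwise records the pseudo-header sum in
skb->csum so that a later tcp_checksum_complete() can finish the sum over the
TCP header and payload. For readers who want the arithmetic spelled out, here
is a minimal standalone sketch of that one's-complement check (RFC 1071); the
helper names are invented for illustration and are not the kernel's checksum
helpers:

#include <stdint.h>
#include <stddef.h>

/* One's-complement sum over a buffer, as big-endian 16-bit words (RFC 1071). */
static uint32_t ones_complement_sum(const uint8_t *p, size_t len, uint32_t sum)
{
	while (len > 1) {
		sum += ((uint32_t)p[0] << 8) | p[1];
		p += 2;
		len -= 2;
	}
	if (len)			/* odd trailing byte is padded with zero */
		sum += (uint32_t)p[0] << 8;
	return sum;
}

/* Fold the carries back into 16 bits and complement. */
static uint16_t ones_complement_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Verify a received segment: sum the pseudo-header (12 bytes for IPv4,
 * 40 for IPv6) plus the TCP header and payload, checksum field included.
 * A valid segment folds to zero.
 */
static int segment_checksum_ok(const uint8_t *pseudo, size_t pseudo_len,
			       const uint8_t *segment, size_t segment_len)
{
	uint32_t sum = ones_complement_sum(pseudo, pseudo_len, 0);

	sum = ones_complement_sum(segment, segment_len, sum);
	return ones_complement_fold(sum) == 0;
}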
Frank van der Linden June 12, 2018, 10:30 p.m. UTC | #4
Sure, fair enough. I was assuming there might be a reason why tcp_filter was always done before the data (not pseudo header) checksum. If there isn't (and obviously the possible MD5 checks are done before it too), then that's definitely the right thing to do.

I'll resend. Though if you have the simpler change already lined up, I'll happily refrain from sending it myself.

Frank

Frank van der Linden June 12, 2018, 11:12 p.m. UTC | #5
Ok, patch v3 sent.

It was rightly pointed out to me that I shouldn't commit the mortal sin of top-posting - but bear with me, guys, I'll dig up my 25-year-old .muttrc :-)

Frank

Balbir Singh June 13, 2018, 5:08 a.m. UTC | #6
On Wed, Jun 13, 2018 at 7:50 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> This is way too complicated.
>
> You should call tcp_checksum_complete() earlier and avoid all this mess.
>
>
> IPV4 part shown here :
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index fed3f1c6616708997f621535efe9412e4afa0a50..7b5f32aa3835b0124b0a9bd342c371df7b46f471 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1730,6 +1730,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
>                         reqsk_put(req);
>                         goto discard_it;
>                 }
> +               if (unlikely(tcp_checksum_complete(skb))) {
> +                       reqsk_put(req);
> +                       goto csum_error;
> +               }

I like this variant; it is not under the socket lock, but it skips
tcp_filter(), as Frank pointed out.

I tested a variant of this patch that increments the MIB checksum error
counter and jumps to discard_and_relse, and it seemed to pass my testing.


Balbir Singh.
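
For what it's worth, the variant Balbir describes (an explicit counter bump plus
reuse of an existing drop path) would presumably look something like the sketch
below. This is a hypothetical reconstruction, not his actual test patch, and it
conservatively reuses the reqsk_put()/goto discard_it pattern already present in
this branch rather than discard_and_relse:

/* Hypothetical sketch of the tested variant (not the posted patch):
 * count the checksum error explicitly, then drop the request and the skb.
 */
if (unlikely(tcp_checksum_complete(skb))) {
	__TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
	reqsk_put(req);
	goto discard_it;
}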

Patch

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f70586b..f361cf9 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1639,6 +1639,7 @@  int tcp_v4_rcv(struct sk_buff *skb)
 	const struct iphdr *iph;
 	const struct tcphdr *th;
 	bool refcounted;
+	bool csumerr = false;
 	struct sock *sk;
 	int ret;
 
@@ -1703,7 +1704,12 @@  int tcp_v4_rcv(struct sk_buff *skb)
 			th = (const struct tcphdr *)skb->data;
 			iph = ip_hdr(skb);
 			tcp_v4_fill_cb(skb, iph, th);
-			nsk = tcp_check_req(sk, skb, req, false, &req_stolen);
+
+			csumerr = tcp_checksum_complete(skb);
+			if (!csumerr) {
+				nsk = tcp_check_req(sk, skb, req, false,
+						    &req_stolen);
+			}
 		}
 		if (!nsk) {
 			reqsk_put(req);
@@ -1798,6 +1804,8 @@  int tcp_v4_rcv(struct sk_buff *skb)
 	sk_drops_add(sk, skb);
 	if (refcounted)
 		sock_put(sk);
+	if (csumerr)
+		goto csum_error;
 	goto discard_it;
 
 do_time_wait:
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 6d664d8..17a20fa 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1425,6 +1425,7 @@  static int tcp_v6_rcv(struct sk_buff *skb)
 	const struct tcphdr *th;
 	const struct ipv6hdr *hdr;
 	bool refcounted;
+	bool csumerr = false;
 	struct sock *sk;
 	int ret;
 	struct net *net = dev_net(skb->dev);
@@ -1486,7 +1487,12 @@  static int tcp_v6_rcv(struct sk_buff *skb)
 			th = (const struct tcphdr *)skb->data;
 			hdr = ipv6_hdr(skb);
 			tcp_v6_fill_cb(skb, hdr, th);
-			nsk = tcp_check_req(sk, skb, req, false, &req_stolen);
+
+			csumerr = tcp_checksum_complete(skb);
+			if (!csumerr) {
+				nsk = tcp_check_req(sk, skb, req, false,
+						    &req_stolen);
+			}
 		}
 		if (!nsk) {
 			reqsk_put(req);
@@ -1577,6 +1583,8 @@  static int tcp_v6_rcv(struct sk_buff *skb)
 	sk_drops_add(sk, skb);
 	if (refcounted)
 		sock_put(sk);
+	if (csumerr)
+		goto csum_error;
 	goto discard_it;
 
 do_time_wait: