diff mbox

tcp: RST: binding oif to iif for tcp v4

Message ID 1328300212-18836-1-git-send-email-shawn.lu@ericsson.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Shawn Lu Feb. 3, 2012, 8:16 p.m. UTC
Binding RST packet outgoing interface to incomming interface
for tcp v4. This has few benefits:
1. tcp_v6_send_reset already did that.
2. This helps tcp connect with SO_BINDTODEVICE set. When connection
is lost, we still able to sending out RST using same interface.
3. limit RST traffic in ingress interface reduce the impact of
RST attack.

Signed-off-by: Shawn Lu <shawn.lu@ericsson.com>
---
 net/ipv4/tcp_ipv4.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

Comments

Eric Dumazet Feb. 3, 2012, 9:31 p.m. UTC | #1
Le vendredi 03 février 2012 à 12:16 -0800, Shawn Lu a écrit :
> Binding RST packet outgoing interface to incomming interface
> for tcp v4. This has few benefits:
> 1. tcp_v6_send_reset already did that.

I dont think so. ipv6 makes no special provision for RST.

> 2. This helps tcp connect with SO_BINDTODEVICE set. When connection
> is lost, we still able to sending out RST using same interface.

I dont understand this.

> 3. limit RST traffic in ingress interface reduce the impact of
> RST attack.
> 

I dont understand this.

Me confused.

Why RST are special and should bypass/force routing decisions ?

This is going to break some setups.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Greear Feb. 3, 2012, 9:39 p.m. UTC | #2
On 02/03/2012 01:31 PM, Eric Dumazet wrote:
> Le vendredi 03 février 2012 à 12:16 -0800, Shawn Lu a écrit :
>> Binding RST packet outgoing interface to incomming interface
>> for tcp v4. This has few benefits:
>> 1. tcp_v6_send_reset already did that.
>
> I dont think so. ipv6 makes no special provision for RST.
>
>> 2. This helps tcp connect with SO_BINDTODEVICE set. When connection
>> is lost, we still able to sending out RST using same interface.
>
> I dont understand this.

I have been using a similar patch for years now.  When using routing
rules that depend on binding to an interface, the RST isn't always bound
properly without this fix.

My code looks like this:

@@ -650,6 +650,7 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
  				      arg.iov[0].iov_len, IPPROTO_TCP, 0);
  	arg.csumoffset = offsetof(struct tcphdr, check) / 2;
  	arg.flags = (sk && inet_sk(sk)->transparent) ? IP_REPLY_ARG_NOSRCCHECK : 0;
+	arg.bound_dev_if = skb_rtable(skb)->rt_iif;

  	net = dev_net(skb_dst(skb)->dev);
  	ip_send_reply(net->ipv4.tcp_sock, skb, ip_hdr(skb)->saddr,


It was originally written by Patrick McHardy, but I've hacked it
a bit over the years to keep it compiling.  It seems to work fine.

Thanks,
Ben
Shawn Lu Feb. 3, 2012, 9:43 p.m. UTC | #3
See inline.
-----Original Message-----
From: Eric Dumazet [mailto:eric.dumazet@gmail.com] 
Sent: Friday, February 03, 2012 1:32 PM
To: Shawn Lu
Cc: davem@davemloft.net; netdev@vger.kernel.org; xiaoclu@gmail.com
Subject: Re: [PATCH] tcp: RST: binding oif to iif for tcp v4

Le vendredi 03 février 2012 à 12:16 -0800, Shawn Lu a écrit :
> Binding RST packet outgoing interface to incomming interface for tcp 
> v4. This has few benefits:
> 1. tcp_v6_send_reset already did that.

I dont think so. ipv6 makes no special provision for RST.
[shawn LU] it's in  tcp_v6_send_response line 899 of tcp_ipv6.c
 fl6.flowi6_oif = inet6_iif(skb);

> 2. This helps tcp connect with SO_BINDTODEVICE set. When connection is 
> lost, we still able to sending out RST using same interface.

I dont understand this.

[shawn Lu] ok.  Tcp socket is bind to device using SO_BINDTODEVICE  to
Limit traffic to specifc interface.  Sometime, it may not have a valid
Source address to get through ip_route_output_key. 

> 3. limit RST traffic in ingress interface reduce the impact of RST 
> attack.
> 
[shawn Lu] sometime, we want to limit unimportant traffic to certain interface
To reduce impact on other interface 

I dont understand this.

Me confused.

Why RST are special and should bypass/force routing decisions ?

This is going to break some setups.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shawn Lu Feb. 3, 2012, 9:56 p.m. UTC | #4
I have the same issue when using  SO_BINDTODEVICE to bind to specifc interface,  RST is getting lost in ip_route_output_key.  

-----Original Message-----
From: Ben Greear [mailto:greearb@candelatech.com] 
Sent: Friday, February 03, 2012 1:40 PM
To: Eric Dumazet
Cc: Shawn Lu; davem@davemloft.net; netdev@vger.kernel.org; xiaoclu@gmail.com
Subject: Re: [PATCH] tcp: RST: binding oif to iif for tcp v4

On 02/03/2012 01:31 PM, Eric Dumazet wrote:
> Le vendredi 03 février 2012 à 12:16 -0800, Shawn Lu a écrit :
>> Binding RST packet outgoing interface to incomming interface for tcp 
>> v4. This has few benefits:
>> 1. tcp_v6_send_reset already did that.
>
> I dont think so. ipv6 makes no special provision for RST.
>
>> 2. This helps tcp connect with SO_BINDTODEVICE set. When connection 
>> is lost, we still able to sending out RST using same interface.
>
> I dont understand this.

I have been using a similar patch for years now.  When using routing rules that depend on binding to an interface, the RST isn't always bound properly without this fix.

My code looks like this:

@@ -650,6 +650,7 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
  				      arg.iov[0].iov_len, IPPROTO_TCP, 0);
  	arg.csumoffset = offsetof(struct tcphdr, check) / 2;
  	arg.flags = (sk && inet_sk(sk)->transparent) ? IP_REPLY_ARG_NOSRCCHECK : 0;
+	arg.bound_dev_if = skb_rtable(skb)->rt_iif;

  	net = dev_net(skb_dst(skb)->dev);
  	ip_send_reply(net->ipv4.tcp_sock, skb, ip_hdr(skb)->saddr,


It was originally written by Patrick McHardy, but I've hacked it a bit over the years to keep it compiling.  It seems to work fine.

Thanks,
Ben


--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Feb. 4, 2012, 7:44 a.m. UTC | #5
Le vendredi 03 février 2012 à 16:43 -0500, Shawn Lu a écrit :
> See inline.
> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com] 
> Le vendredi 03 février 2012 à 12:16 -0800, Shawn Lu a écrit :
> > Binding RST packet outgoing interface to incomming interface for tcp 
> > v4. This has few benefits:
> > 1. tcp_v6_send_reset already did that.
> 
> I dont think so. ipv6 makes no special provision for RST.
> [shawn LU] it's in  tcp_v6_send_response line 899 of tcp_ipv6.c
>  fl6.flowi6_oif = inet6_iif(skb);
> 

tcp_v6_send_response() is used to send RST _and_ ACK

Its not reserved to RST. Thats why I said "no _special_ provision for
RST"

I repeat my question :

Why do you believe only RST should be handled in a different manner than
other TCP messages ?

If we decide to break asymmetric routing, we should do it completely and
document it.

It might already be broken for IPv6 and nobody cared / noticed.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Feb. 4, 2012, 8:06 a.m. UTC | #6
Le vendredi 03 février 2012 à 16:43 -0500, Shawn Lu a écrit :

> [shawn Lu] ok.  Tcp socket is bind to device using SO_BINDTODEVICE  to
> Limit traffic to specifc interface.  Sometime, it may not have a valid
> Source address to get through ip_route_output_key. 
> 

Define "Sometime" ?

We have to force the oif only if requested by the socket in this case.

arg.bound_dev_if = sk ? sk->sk_bound_dev_if : 0;

Thats what we do in tcp_v4_send_ack() :
	if (oif)
		arg.bound_dev_if = oif;




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shawn Lu Feb. 4, 2012, 6:15 p.m. UTC | #7
Hi, Eric:

> From: Eric Dumazet [mailto:eric.dumazet@gmail.com] 
> I repeat my question :
> 
> Why do you believe only RST should be handled in a different 
> manner than other TCP messages ?
> 
> If we decide to break asymmetric routing, we should do it 
> completely and document it.
> 
> It might already be broken for IPv6 and nobody cared / noticed.

RST is specical. if a packet caused RST, connection is gone.
We lost all socket option (or bind information) for that socket.
So we build reply only basing on parameters arrived with
The segement.  It's nature to use another one from segment: inet_iif(skb).

In the same time, I agree with you that it breaks asymmetric routing.

It may be more propertly to do it when routing fails.
Will send out another patch to replace this patch. 

> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com] 
> Sent: Saturday, February 04, 2012 12:07 AM
> To: Shawn Lu
> Cc: davem@davemloft.net; netdev@vger.kernel.org; xiaoclu@gmail.com
> Subject: RE: [PATCH] tcp: RST: binding oif to iif for tcp v4
> 
> Le vendredi 03 février 2012 à 16:43 -0500, Shawn Lu a écrit :
> 
> > [shawn Lu] ok.  Tcp socket is bind to device using 
> SO_BINDTODEVICE  to 
> > Limit traffic to specifc interface.  Sometime, it may not 
> have a valid 
> > Source address to get through ip_route_output_key.
> > 
> 
> Define "Sometime" ?
My case is in router. Forwarding is done in line card. Any packet that can't process by
Forwarding(such as bgp) will "punt" to control plane(another box, we called RP). In RP,
routing is bypassed if packet is from line card, because we know it is target to
local. On egress path, we also need bind traffic to this interface to be able to go back
To line card. In TCP case, when connection is gone, all binding information is gone, we can't
Figure out egress path depend only destination address. 
But we still able to send it back if iif is used. 
> 
> We have to force the oif only if requested by the socket in this case.
> 
> arg.bound_dev_if = sk ? sk->sk_bound_dev_if : 0;
> 
> Thats what we do in tcp_v4_send_ack() :
> 	if (oif)
> 		arg.bound_dev_if = oif;
> 
> 
> 
> 
> 
It won't work.  For RST, socket is gone. In most case, there is no socket
Associated with it.  Only under one case (it fall in to listener that is not bind to specifc
Address), we can only get sk_bound_dev_if from listener. In addition, in this case,
The listener is most likely not bind to any interface to be able to get connection
Request from all interface.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Feb. 4, 2012, 6:29 p.m. UTC | #8
Le samedi 04 février 2012 à 13:15 -0500, Shawn Lu a écrit :

> It won't work.  For RST, socket is gone. 

If this was true, we would not have :

tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)

but :

tcp_v4_send_reset(struct sk_buff *skb)


Think about it.

You write a lot, but you dont really understand what I wrote in my
previous mails, do you ?



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shawn Lu Feb. 4, 2012, 6:42 p.m. UTC | #9
Eric:

Yes, you are right.  It helps if we do as your suggested.

Do you want to send out a patch to address this? 

Thanks!

Shawn
> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com] 
> Sent: Saturday, February 04, 2012 10:30 AM
> To: Shawn Lu
> Cc: davem@davemloft.net; netdev@vger.kernel.org; xiaoclu@gmail.com
> Subject: RE: [PATCH] tcp: RST: binding oif to iif for tcp v4
> 
> Le samedi 04 février 2012 à 13:15 -0500, Shawn Lu a écrit :
> 
> > It won't work.  For RST, socket is gone. 
> 
> If this was true, we would not have :
> 
> tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
> 
> but :
> 
> tcp_v4_send_reset(struct sk_buff *skb)
> 
> 
> Think about it.
> 
> You write a lot, but you dont really understand what I wrote 
> in my previous mails, do you ?
> 
> 
> 
> --
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 90e4793..994b1ea 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -676,6 +676,7 @@  static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
 				      arg.iov[0].iov_len, IPPROTO_TCP, 0);
 	arg.csumoffset = offsetof(struct tcphdr, check) / 2;
 	arg.flags = (sk && inet_sk(sk)->transparent) ? IP_REPLY_ARG_NOSRCCHECK : 0;
+	arg.bound_dev_if = inet_iif(skb);
 
 	net = dev_net(skb_dst(skb)->dev);
 	arg.tos = ip_hdr(skb)->tos;