diff mbox

[RFC] sctp: Don't lookup dst if transport dst is still valid

Message ID 51DF74BE.6000001@windriver.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

fan.du July 12, 2013, 3:15 a.m. UTC
On 2013年07月09日 23:11, Vlad Yasevich wrote:
> On 07/05/2013 10:09 AM, Vlad Yasevich wrote:
>> On 07/03/2013 10:33 PM, Fan Du wrote:
>>>
>>>
>>> On 2013年07月03日 21:23, Vlad Yasevich wrote:
>>>> On 07/02/2013 10:18 PM, Fan Du wrote:
>>>>>
>>>>>
>>>>> On 2013年07月02日 22:29, Vlad Yasevich wrote:
>>>>>> On 07/02/2013 02:39 AM, Fan Du wrote:
>>>>>>> When sctp sits on IPv6, sctp_transport_dst_check pass cookie as ZERO,
>>>>>>> as a result ip6_dst_check always fail out. This behaviour makes
>>>>>>> transport->dst useless, because every sctp_packet_transmit must look
>>>>>>> for valid dst(Is this what supposed to be?)
>>>>>>>
>>>>>>> One aggressive way is to call rt_genid_bump which invalid all dst to
>>>>>>> make new dst for transport, apparently it also hurts others.
>>>>>>> I'm sure this may not be the best for all, so any commnets?
>>>>>>>
>>>>>>> Signed-off-by: Fan Du <fan.du@windriver.com>
>>>>>>> ---
>>>>>>> include/net/sctp/sctp.h | 18 ++++++++++++------
>>>>>>> net/sctp/ipv6.c | 2 ++
>>>>>>> 2 files changed, 14 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
>>>>>>> index cd89510..f05af01 100644
>>>>>>> --- a/include/net/sctp/sctp.h
>>>>>>> +++ b/include/net/sctp/sctp.h
>>>>>>> @@ -719,14 +719,20 @@ static inline void sctp_v4_map_v6(union
>>>>>>> sctp_addr *addr)
>>>>>>> addr->v6.sin6_addr.s6_addr32[2] = htonl(0x0000ffff);
>>>>>>> }
>>>>>>>
>>>>>>> -/* The cookie is always 0 since this is how it's used in the
>>>>>>> - * pmtu code.
>>>>>>> - */
>>>>>>> +/* Set cookie with the right one for IPv6 and zero for others */
>>>>>>> static inline struct dst_entry *sctp_transport_dst_check(struct
>>>>>>> sctp_transport *t)
>>>>>>> {
>>>>>>> - if (t->dst && !dst_check(t->dst, 0)) {
>>>>>>> - dst_release(t->dst);
>>>>>>> - t->dst = NULL;
>>>>>>> +
>>>>>>> + if (t->dst) {
>>>>>>> + struct rt6_info *rt = (struct rt6_info *)t->dst;
>>>>>>> + u32 cookie = 0;
>>>>>>> +
>>>>>>> + if ((t->af_specific->sa_family == AF_INET6) && rt->rt6i_node)
>>>>>>> + cookie = rt->rt6i_node->fn_sernum;
>>>>>>> + if (!dst_check(t->dst, cookie)) {
>>>>>>> + dst_release(t->dst);
>>>>>>> + t->dst = NULL;
>>>>>>> + }
>>>>>>> }
>>>>>>
>>>>>> I think it would be better if we stored the dst_cookie in the
>>>>>> transport structure and initialized it at lookup time. If you do that,
>>>>>> then if the route table changes, we'd correctly detect it without
>>>>>> artificially bumping rt_genid (and hurting ipv4).
>>>>>
>>>>> Hi Vlad/Neil
>>>>>
>>>>> Is this what you mean?
>>>>
>>>> Yes, exactly.
>>>>
>>>
>>> Hi Vlad
>>>
>>> I thinks twice about below patch, this is actually a chicken-egg issue.
>>> Look below scenario:
>>> (1) The first time we push packet through a transport, dst_cookie is 0,
>>> so sctp_transport_dst_check also pass cookie as 0, then return dst
>>> as NULL.
>>> Then we lookup dst by sctp_transport_route, and in there we
>>> initiate dst_cookie
>>> with rt->rt6i_node->fn_sernum
>>>
>>> (2) Then the next time we push packet through this transport again,
>>> we pass dst_cookie(rt->rt6i_node->fn_sernum) to ip6_dst_check, and
>>> return valid dst without bothering to lookup dst again.
>>
>> No, if the route was removed rt6i_node will be NULL, and NULL will be
>> returned from ip6_dst_check(). If the route still exists then we'll
>> compare the serial number with a cookie.
>>
>>>
>>> BUT, suppose when deleting the source address of this dst after
>>> transport->dst_cookie
>>> has been well initialized. transport->dst_cookie still holds
>>> rt->rt6i_node->fn_sernum,
>>> meaning ip6_dst_check will return valid dst, which it shouldn't in this
>>> case, the
>>> result will be association ABORT.
>>
>> No, removing the address cause the route for that prefix to be removed
>> as well. This will set rt6i_node to NULL.
>>
>>>
>>> Other way is invalid all transport->dst which using the deleting address
>>> as source address
>>> without bumping gen_id, problem is the traverse times depends heavily on
>>> transport number,
>>> and also need to take account locking issue it will introduce.
>>>
>>> >
>>> > No, you are not missing anything. IPv4 doesn't use the cookie and
>>> always seems to pass it as 0.
>>> >
>>> > Yes, ipv4 will bump the gen_id thus invalidating all routes (there
>>> has been disagreement about it).
>>> > IPv6 doesn't do that. In ipv6, when the addresses are added or
>>> removed, routes are also added or removed and
>>> > any time the route is added it will have a new serial number. So, you
>>> don't have to disturb ipv4 cache when ipv6 routing info changes.
>>>
>>> Thank you very much for your explanation!
>>>
>>> IPv6 don't bump gen_id, when adding/deleting address, and tag an serial
>>> number with each route.
>>> Doing this way loose the semantic of dst_check, because SCTP depends no
>>> dst_check fulfill its
>>> duty to actually check whether the holding dst is still valid, well most
>>> other Layer 4 protocol
>>> simply rely on ip6_route_output/ip6_dst_lookup_flow to grab dst every
>>> time sending data out.
>>
>> Look at how other protocols (tcp, dccp) do this. It is sufficient to
>> cache the route serial number into the dst_cookie at the time the route
>> was lookup-up and cached. Then the cookie is passed to dst_check to
>> validate the route.
>
>
> Hi Fan
>
> Have you tried the updated patch you sent? Based on what the tcp/udp code is doing, the updated patch should work correctly. If it does, can you re-post with attribution/sign-off

Hello Vlad and Neil

Could you please kindly review below patch?

Thanks


 From ec43c8ec27e16a72d87d6d4b07a1f494083261a2 Mon Sep 17 00:00:00 2001
From: Fan Du <fan.du@windriver.com>
Date: Fri, 12 Jul 2013 10:22:35 +0800
Subject: [PATCH] sctp: Don't lookup dst if transport dst is still valid

When sctp sits on IPv6, current sctp_transport_dst_check pass ZERO
ip6_dst_check, which result in dst lookup every time for IPv6.
We use dst_cookie to check against fn_sernum to avoid unnecessary
lookup.

But problem still arise when we attempt to delete address
in multi-home mode, deleting an IPv6 address does not invalidate
any dst which source address is the same at the deleted one.
Which means sctp cannot rely on ip6_dst_check in this scenario.

To enforce multi-home functionality, introducing a sctp_genid similar
with rt genid, but only in sctp territory, so we don't hurt any other
rt validness against other layer 4 protocol.

Signed-off-by: Fan Du <fan.du@windriver.com>
---
  include/net/netns/sctp.h   |    5 +++++
  include/net/sctp/sctp.h    |   11 +++++++++--
  include/net/sctp/structs.h |    3 +++
  net/sctp/ipv6.c            |   24 +++++++++++++++++++++---
  net/sctp/protocol.c        |    7 +++++++
  5 files changed, 45 insertions(+), 5 deletions(-)




> Thanks
> -vlad
>
>>
>> -vlad
>>>
>>> So please pronounce a final judgment.
>>>
>>>> -vlad
>>>>
>>>>>
>>>>> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
>>>>> index cd89510..0a646a5 100644
>>>>> --- a/include/net/sctp/sctp.h
>>>>> +++ b/include/net/sctp/sctp.h
>>>>> @@ -724,7 +724,7 @@ static inline void sctp_v4_map_v6(union sctp_addr
>>>>> *addr)
>>>>> */
>>>>> static inline struct dst_entry *sctp_transport_dst_check(struct
>>>>> sctp_transport *t)
>>>>> {
>>>>> - if (t->dst && !dst_check(t->dst, 0)) {
>>>>> + if (t->dst && !dst_check(t->dst, t->dst_cookie)) {
>>>>> dst_release(t->dst);
>>>>> t->dst = NULL;
>>>>> }
>>>>> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
>>>>> index 1bd4c41..cafdd19 100644
>>>>> --- a/include/net/sctp/structs.h
>>>>> +++ b/include/net/sctp/structs.h
>>>>> @@ -946,6 +946,8 @@ struct sctp_transport {
>>>>> __u64 hb_nonce;
>>>>>
>>>>> struct rcu_head rcu;
>>>>> +
>>>>> + u32 dst_cookie;
>>>>> };
>>>>>
>>>>> struct sctp_transport *sctp_transport_new(struct net *, const union
>>>>> sctp_addr *,
>>>>> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
>>>>> index 8ee553b..82a420f 100644
>>>>> --- a/net/sctp/ipv6.c
>>>>> +++ b/net/sctp/ipv6.c
>>>>> @@ -350,6 +350,7 @@ out:
>>>>> struct rt6_info *rt;
>>>>> rt = (struct rt6_info *)dst;
>>>>> t->dst = dst;
>>>>> + t->dst_cookie = rt->rt6i_node ? rt->rt6i_node->fn_sernum
>>>>> : 0;
>>>>> SCTP_DEBUG_PRINTK("rt6_dst:%pI6 rt6_src:%pI6\n",
>>>>> &rt->rt6i_dst.addr, &fl6->saddr);
>>>>> } else {
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> -vlad
>>>>>>
>>>>>>>
>>>>>>> return t->dst;
>>>>>>> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
>>>>>>> index 8ee553b..cfae77e 100644
>>>>>>> --- a/net/sctp/ipv6.c
>>>>>>> +++ b/net/sctp/ipv6.c
>>>>>>> @@ -137,6 +137,8 @@ static int sctp_inet6addr_event(struct
>>>>>>> notifier_block *this, unsigned long ev,
>>>>>>> break;
>>>>>>> }
>>>>>>>
>>>>>>> + /* invalid all transport dst forcing to look up new dst */
>>>>>>> + rt_genid_bump(net);
>>>>>>> return NOTIFY_DONE;
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>

Comments

David Miller July 12, 2013, 10:58 p.m. UTC | #1
From: Fan Du <fan.du@windriver.com>
Date: Fri, 12 Jul 2013 11:15:10 +0800

> But problem still arise when we attempt to delete address
> in multi-home mode, deleting an IPv6 address does not invalidate
> any dst which source address is the same at the deleted one.
> Which means sctp cannot rely on ip6_dst_check in this scenario.

I still cannot understand why this is an SCTP specific issue.

Specifically, I cannot see why address addition/deletion doesn't
cause problems for cached ipv6 routes in UDP and TCP sockets too.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vladislav Yasevich July 13, 2013, 12:18 p.m. UTC | #2
On 07/12/2013 06:58 PM, David Miller wrote:
> From: Fan Du <fan.du@windriver.com>
> Date: Fri, 12 Jul 2013 11:15:10 +0800
>
>> But problem still arise when we attempt to delete address
>> in multi-home mode, deleting an IPv6 address does not invalidate
>> any dst which source address is the same at the deleted one.
>> Which means sctp cannot rely on ip6_dst_check in this scenario.
>
> I still cannot understand why this is an SCTP specific issue.
>
> Specifically, I cannot see why address addition/deletion doesn't
> cause problems for cached ipv6 routes in UDP and TCP sockets too.
>

Trying to figure this out.

-vlad
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
fan.du July 16, 2013, 9:58 a.m. UTC | #3
On 2013年07月13日 06:58, David Miller wrote:
> From: Fan Du<fan.du@windriver.com>
> Date: Fri, 12 Jul 2013 11:15:10 +0800
>
>> But problem still arise when we attempt to delete address
>> in multi-home mode, deleting an IPv6 address does not invalidate
>> any dst which source address is the same at the deleted one.
>> Which means sctp cannot rely on ip6_dst_check in this scenario.
>
> I still cannot understand why this is an SCTP specific issue.

It's not SCTP specific, it's shared by all all layer 4 protocol IMHO.
The issue of SCTP IPv6 doesn't check IPv6 dst validness has been addressed
using *only* dst_cookie as other layer 4 protocol does for its sock.
But this scheme cannot cover scenario when delete primary address to support
SCTP multi-home feature, this is where the concern is.

Use netsend to send a large file using DCCP, considering the sender
host has two IPv6 address, while sending, delete the one netsend currently
using. Wireshark could catch the sender is still transmit packet out using
the deleted address in a slowly manner.

All of those boils down to one question that I cannot resist to ask:
If delete an IPv6 address(*1*), whether the original rt/dst destinate for a
remote address(*2*) using the deleted address as source address is still legal
for subsequent usage in current kernel IPv6 routing implementation???

btw, (*1*) and (*2*) are on quite different node in the binary tree as well
as different leaf.

> Specifically, I cannot see why address addition/deletion doesn't
> cause problems for cached ipv6 routes in UDP and TCP sockets too.
>
diff mbox

Patch

diff --git a/include/net/netns/sctp.h b/include/net/netns/sctp.h
index 3573a81..83204db 100644
--- a/include/net/netns/sctp.h
+++ b/include/net/netns/sctp.h
@@ -129,6 +129,11 @@  struct netns_sctp {

  	/* Threshold for autoclose timeout, in seconds. */
  	unsigned long max_autoclose;
+
+#if IS_ENABLED(CONFIG_IPV6)	
+	/* sctp IPv6 specific */
+	atomic_t	addr_genid;
+#endif
  };

  #endif /* __NETNS_SCTP_H__ */
diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index d8e37ec..c344ea8 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -613,11 +613,18 @@  static inline void sctp_v4_map_v6(union sctp_addr *addr)
   */
  static inline struct dst_entry *sctp_transport_dst_check(struct sctp_transport *t)
  {
-	if (t->dst && !dst_check(t->dst, 0)) {
+	struct sctp_af *af = t->af_specific;
+
+	/* We handle two sceanario here:
+	 * One: using dst_cookie to check against any routing information change
+	 * Two: using sctp_cookie to check against address deletion
+	 * Either of them force a new dst lookup for transport
+	 */
+	if ((t->dst && !dst_check(t->dst, t->dst_cookie)) ||
+		af->check_addr_change(t)) {
  		dst_release(t->dst);
  		t->dst = NULL;
  	}
-
  	return t->dst;
  }

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index e745c92..9849915 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -492,6 +492,7 @@  struct sctp_af {
  	void		(*seq_dump_addr)(struct seq_file *seq,
  					 union sctp_addr *addr);
  	void		(*ecn_capable)(struct sock *sk);
+	int 		(*check_addr_change)(struct sctp_transport *t);
  	__u16		net_header_len;
  	int		sockaddr_len;
  	sa_family_t	sa_family;
@@ -946,6 +947,8 @@  struct sctp_transport {
  	__u64 hb_nonce;

  	struct rcu_head rcu;
+	u32 dst_cookie;
+	u32 addr_cookie;
  };

  struct sctp_transport *sctp_transport_new(struct net *, const union sctp_addr *,
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index 09ffcc9..343cd10 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -137,6 +137,8 @@  static int sctp_inet6addr_event(struct notifier_block *this, unsigned long ev,
  		break;
  	}

+	/* force a lookup on all transport->dst for stcp IPv6 ONLY */	
+	atomic_inc(&net->sctp.addr_genid);
  	return NOTIFY_DONE;
  }

@@ -348,12 +350,20 @@  static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
  out:
  	if (!IS_ERR_OR_NULL(dst)) {
  		struct rt6_info *rt;
+		struct net *net;

  		rt = (struct rt6_info *)dst;
  		t->dst = dst;
-
+		net = dev_net(rt->rt6i_idev->dev);		
+	
  		pr_debug("rt6_dst:%pI6 rt6_src:%pI6\n", &rt->rt6i_dst.addr,
-			 &fl6->saddr);
+			&fl6->saddr);
+	
+		/* use fn_sernum to detect routing change */
+		t->dst_cookie = rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0;
+
+		/* record addr_genid to check against address delete */
+		t->addr_cookie = atomic_read(&net->sctp.addr_genid);
  	} else {
  		t->dst = NULL;

@@ -724,6 +734,14 @@  static void sctp_v6_ecn_capable(struct sock *sk)
  	inet6_sk(sk)->tclass |= INET_ECN_ECT_0;
  }

+static int sctp_v6_check_addr_change(struct sctp_transport *t)
+{
+	struct rt6_info *rt = (struct rt6_info *)t->dst;
+	struct net *net = dev_net(rt->rt6i_idev->dev);
+	
+	return (t->addr_cookie != atomic_read(&net->sctp.addr_genid) ? 1 : 0);
+}
+
  /* Initialize a PF_INET6 socket msg_name. */
  static void sctp_inet6_msgname(char *msgname, int *addr_len)
  {
@@ -1008,6 +1026,7 @@  static struct sctp_af sctp_af_inet6 = {
  	.is_ce		   = sctp_v6_is_ce,
  	.seq_dump_addr	   = sctp_v6_seq_dump_addr,
  	.ecn_capable	   = sctp_v6_ecn_capable,
+	.check_addr_change = sctp_v6_check_addr_change,
  	.net_header_len	   = sizeof(struct ipv6hdr),
  	.sockaddr_len	   = sizeof(struct sockaddr_in6),
  #ifdef CONFIG_COMPAT
@@ -1076,7 +1095,6 @@  int sctp_v6_add_protocol(void)

  	if (inet6_add_protocol(&sctpv6_protocol, IPPROTO_SCTP) < 0)
  		return -EAGAIN;
-
  	return 0;
  }

diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 4a17494d..b9335d7 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -595,6 +595,12 @@  static void sctp_v4_ecn_capable(struct sock *sk)
  	INET_ECN_xmit(sk);
  }

+static int sctp_v4_check_addr_change(struct sctp_transport *t)
+{
+	/* Always return 0, as IPv4 bump rt genid when address changes */
+	return 0;
+}
+
  static void sctp_addr_wq_timeout_handler(unsigned long arg)
  {
  	struct net *net = (struct net *)arg;
@@ -1064,6 +1070,7 @@  static struct sctp_af sctp_af_inet = {
  	.is_ce		   = sctp_v4_is_ce,
  	.seq_dump_addr	   = sctp_v4_seq_dump_addr,
  	.ecn_capable	   = sctp_v4_ecn_capable,
+	.check_addr_change = sctp_v4_check_addr_change,
  	.net_header_len	   = sizeof(struct iphdr),
  	.sockaddr_len	   = sizeof(struct sockaddr_in),
  #ifdef CONFIG_COMPAT