Patchwork [RFC,net-next,(V2,RESENT)] ipv6: Queue fragments per interface for multicast/link-local addresses.

login
register
mail settings
Submitter YOSHIFUJI Hideaki / 吉藤英明
Date Feb. 16, 2013, 4:44 p.m.
Message ID <511FB776.8000901@linux-ipv6.org>
Download mbox | patch
Permalink /patch/220980/
State RFC
Delegated to: David Miller
Headers show

Comments

YOSHIFUJI Hideaki / 吉藤英明 - Feb. 16, 2013, 4:44 p.m.
We should queue fragments for the same link-local address on
different interfaces (e.g. fe80::1%eth0 and fe80::1%eth1) to the
different queue, because of nature of addressing architecture.

Similarly, we should queue fragments for multicast on different
interface to the different queue.  This is okay because
application joins group on speicific interface, and multicast
traffic is expected only on that interface.

CC: Ben Greear <greearb@candelatech.com>
CC: Vlad Yasevich <vyasevic@redhat.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
---
 include/net/ipv6.h                      |    1 +
 net/ipv6/netfilter/nf_conntrack_reasm.c |    9 +++++++--
 net/ipv6/reassembly.c                   |   10 ++++++++--
 3 files changed, 16 insertions(+), 4 deletions(-)
Erik Hugne - Feb. 18, 2013, 1:19 p.m.
On Sun, Feb 17, 2013 at 01:44:38AM +0900, YOSHIFUJI Hideaki wrote:
> We should queue fragments for the same link-local address on
> different interfaces (e.g. fe80::1%eth0 and fe80::1%eth1) to the
> different queue, because of nature of addressing architecture.
> 
> Similarly, we should queue fragments for multicast on different
> interface to the different queue.  This is okay because
> application joins group on speicific interface, and multicast
> traffic is expected only on that interface.
>

Your patch does solve the reassembly problem when macvlans are defined, thanks!

A tad unrelated, but i think there's still some ipv6 multicast filtering 
work to be done in the macvlan driver. If you for example join an all scope 
mcast address, ff09::1 on macvlan 0 it will implicitly be 'joined' on all 
other macvlans aswell..

I can buy that linklocal multicast packets are cloned out to all macvlan devices,
but if a specific prefix is joined on one of them, i dont think it's correct that
all sibling devices should receive that traffic, unless they asked for it.

//E
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa - March 16, 2013, 7:47 a.m.
On Sun, Feb 17, 2013 at 01:44:38AM +0900, YOSHIFUJI Hideaki wrote:
> We should queue fragments for the same link-local address on
> different interfaces (e.g. fe80::1%eth0 and fe80::1%eth1) to the
> different queue, because of nature of addressing architecture.
> 
> Similarly, we should queue fragments for multicast on different
> interface to the different queue.  This is okay because
> application joins group on speicific interface, and multicast
> traffic is expected only on that interface.
> 
> CC: Ben Greear <greearb@candelatech.com>
> CC: Vlad Yasevich <vyasevic@redhat.com>
> CC: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

I just found this patch while cleaning up my tree. I don't know its state
(netdev patchworks says RFC and netfilter patchworks still lists it as
new). However, I also do think that the per interface matching would be
the right thing to do for multicast|linklocal fragments. Perhaps this patch
should be resend?

Yoshifuji, do you think we should also implement RFC 3168 5.3 ECN
fragmentation protection in reassembly.c? I think it should be
straightforward because it is already implemented for ipv4 and the
relevant bits just need to be moved to inet_fragment.c and become a bit
more generalized.

Thanks,

  Hannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
YOSHIFUJI Hideaki / 吉藤英明 - March 17, 2013, 12:50 a.m.
Hannes Frederic Sowa wrote:
> On Sun, Feb 17, 2013 at 01:44:38AM +0900, YOSHIFUJI Hideaki wrote:
>> We should queue fragments for the same link-local address on
>> different interfaces (e.g. fe80::1%eth0 and fe80::1%eth1) to the
>> different queue, because of nature of addressing architecture.
>>
>> Similarly, we should queue fragments for multicast on different
>> interface to the different queue.  This is okay because
>> application joins group on speicific interface, and multicast
>> traffic is expected only on that interface.
>>
>> CC: Ben Greear <greearb@candelatech.com>
>> CC: Vlad Yasevich <vyasevic@redhat.com>
>> CC: Eric Dumazet <eric.dumazet@gmail.com>
>> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> 
> I just found this patch while cleaning up my tree. I don't know its state
> (netdev patchworks says RFC and netfilter patchworks still lists it as
> new). However, I also do think that the per interface matching would be
> the right thing to do for multicast|linklocal fragments. Perhaps this patch
> should be resend?

Will do.

> Yoshifuji, do you think we should also implement RFC 3168 5.3 ECN
> fragmentation protection in reassembly.c? I think it should be
> straightforward because it is already implemented for ipv4 and the
> relevant bits just need to be moved to inet_fragment.c and become a bit
> more generalized.

OK.

--yoshfuji

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 851d541..1af98aa 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -463,6 +463,7 @@  enum ip6_defrag_users {
 struct ip6_create_arg {
 	__be32 id;
 	u32 user;
+	int ifindex;
 	const struct in6_addr *src;
 	const struct in6_addr *dst;
 };
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index c674f15..acd1820 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -163,7 +163,8 @@  static void nf_ct_frag6_expire(unsigned long data)
 
 /* Creation primitives. */
 static inline struct frag_queue *fq_find(struct net *net, __be32 id,
-					 u32 user, struct in6_addr *src,
+					 u32 user, int ifindex,
+					 struct in6_addr *src,
 					 struct in6_addr *dst)
 {
 	struct inet_frag_queue *q;
@@ -171,6 +172,10 @@  static inline struct frag_queue *fq_find(struct net *net, __be32 id,
 	unsigned int hash;
 
 	arg.id = id;
+	if (ipv6_addr_type(dst) & (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL))
+		arg.ifindex = ifindex;
+	else
+		arg.ifindex = 0;
 	arg.user = user;
 	arg.src = src;
 	arg.dst = dst;
@@ -572,7 +577,7 @@  struct sk_buff *nf_ct_frag6_gather(struct sk_buff *skb, u32 user)
 	inet_frag_evictor(&net->nf_frag.frags, &nf_frags, false);
 	local_bh_enable();
 
-	fq = fq_find(net, fhdr->identification, user, &hdr->saddr, &hdr->daddr);
+	fq = fq_find(net, fhdr->identification, user, dev->ifindex, &hdr->saddr, &hdr->daddr);
 	if (fq == NULL) {
 		pr_debug("Can't find and can't create new queue\n");
 		goto ret_orig;
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index bab2c27..18c5e3c 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -113,6 +113,7 @@  bool ip6_frag_match(struct inet_frag_queue *q, void *a)
 
 	fq = container_of(q, struct frag_queue, q);
 	return	fq->id == arg->id &&
+		(!arg->ifindex || fq->iif == arg->ifindex) &&
 		fq->user == arg->user &&
 		ipv6_addr_equal(&fq->saddr, arg->src) &&
 		ipv6_addr_equal(&fq->daddr, arg->dst);
@@ -182,13 +183,17 @@  static void ip6_frag_expire(unsigned long data)
 }
 
 static __inline__ struct frag_queue *
-fq_find(struct net *net, __be32 id, const struct in6_addr *src, const struct in6_addr *dst)
+fq_find(struct net *net, __be32 id, int ifindex, const struct in6_addr *src, const struct in6_addr *dst)
 {
 	struct inet_frag_queue *q;
 	struct ip6_create_arg arg;
 	unsigned int hash;
 
 	arg.id = id;
+	if (ipv6_addr_type(dst) & (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL))
+		arg.ifindex = ifindex;
+	else
+		arg.ifindex = 0;
 	arg.user = IP6_DEFRAG_LOCAL_DELIVER;
 	arg.src = src;
 	arg.dst = dst;
@@ -534,7 +539,8 @@  static int ipv6_frag_rcv(struct sk_buff *skb)
 		IP6_ADD_STATS_BH(net, ip6_dst_idev(skb_dst(skb)),
 				 IPSTATS_MIB_REASMFAILS, evicted);
 
-	fq = fq_find(net, fhdr->identification, &hdr->saddr, &hdr->daddr);
+	fq = fq_find(net, fhdr->identification, skb->dev->ifindex,
+		     &hdr->saddr, &hdr->daddr);
 	if (fq != NULL) {
 		int ret;