From patchwork Tue Jul 10 14:41:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hangbin Liu X-Patchwork-Id: 942011 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="dup2/hFE"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 41Q4fc4ZP4z9s0W for ; Wed, 11 Jul 2018 00:42:16 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933778AbeGJOmJ (ORCPT ); Tue, 10 Jul 2018 10:42:09 -0400 Received: from mail-pg1-f195.google.com ([209.85.215.195]:36406 "EHLO mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933353AbeGJOmH (ORCPT ); Tue, 10 Jul 2018 10:42:07 -0400 Received: by mail-pg1-f195.google.com with SMTP id m19-v6so2077225pgv.3 for ; Tue, 10 Jul 2018 07:42:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=3NbEWvomeTi5zcDcpY8057Qiq5RwkTBADOQQc/Sf8KA=; b=dup2/hFEObhybw9M6WJoyS1VEIXNjFhoj8nfe43XcC0hELeE7S6HYig+zPDeejCmCv CtktHcWZnj6EwpJLs67DqOpC7yw6/KYD69ZSNbVxnUmTgqN8OMhwu2zVqg9DBIv1k52F ovTjGPZask36e87vi4g91Hj6U5d38CBgjVb1O0Ldk2Vi3UnLKUEVMzaq42zfXpfGbYX4 jsVq88f/n7hbbfVJlwvg2k2+H4K8nlGUy7pE1U+FpsDDOMeHp3ns6CfhjNhuAdSUkTs8 ZLDPXz0jipGMY4vw67YjqvSMBLIr+9ZId1/vrZYLhTVrwYVhdrr+lEPBCZ7f5i6dFQej fHqw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=3NbEWvomeTi5zcDcpY8057Qiq5RwkTBADOQQc/Sf8KA=; b=lCp7UnPlE/1IzNlqJVpJvT1tFL6FaDDENNJBIWbdVCepkUrkqm62jMPZKHCFptwGBG FU4r5b8zWEbCAP+6ZDcy6yRxiTXN9LiWgU/teRD6AZ3X31e9mR/k9ijd8AdMlcrofQKW 7WJcqA94i4SgKp7BoU6448lduq/ktJCSutbk0OGl0SBPtAVA9iCMyLr15bGTtoiw56G+ EIkQxOhb/kX9TMqiwyt/w/s1zAQrnOzOpw4w3mr9Asiwx6Ib1wZPXKCLSL0TnvRaGwnV wKqDjuWlVFHQcVH909GSJjPrMgw5nfeFqYzKQri+9IMVeTJzX3H/P8mjDYa3CdIRHt8p AgcQ== X-Gm-Message-State: APt69E0sTsGVoAF0/awrKHScuR6d4F2PqFuSfT86u3rxWepu0Y+7NW+b CCCw6wP892pWQXBPmGOGwnxQa8VA X-Google-Smtp-Source: AAOMgpf/zQkkD/sMkWteJsD96E2hg+5OW9cqL0+v3BS8l4cKldqETpKi/q6mw6XFRp5tPVRmTOxotg== X-Received: by 2002:a63:4f1a:: with SMTP id d26-v6mr17318395pgb.121.1531233726692; Tue, 10 Jul 2018 07:42:06 -0700 (PDT) Received: from leo.usersys.redhat.com ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id d23-v6sm41980118pfe.2.2018.07.10.07.42.03 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Jul 2018 07:42:06 -0700 (PDT) From: Hangbin Liu To: netdev@vger.kernel.org Cc: David Miller , Stefano Brivio , Paolo Abeni , Daniel Borkmann , WANG Cong , YOSHIFUJI Hideaki , Flavio Leitner , Hangbin Liu Subject: [PATCHv2 net 1/2] ipv4/igmp: init group mode as INCLUDE when join source group Date: Tue, 10 Jul 2018 22:41:26 +0800 Message-Id: <1531233687-28744-2-git-send-email-liuhangbin@gmail.com> X-Mailer: git-send-email 2.5.5 In-Reply-To: <1531233687-28744-1-git-send-email-liuhangbin@gmail.com> References: <1528871551-17879-1-git-send-email-liuhangbin@gmail.com> <1531233687-28744-1-git-send-email-liuhangbin@gmail.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Based on RFC3376 5.1 If no interface state existed for that multicast address before the change (i.e., the change consisted of creating a new per-interface record), or if no state exists after the change (i.e., the change consisted of deleting a per-interface record), then the "non-existent" state is considered to have a filter mode of INCLUDE and an empty source list. Which means a new multicast group should start with state IN(). Function ip_mc_join_group() works correctly for IGMP ASM(Any-Source Multicast) mode. It adds a group with state EX() and inits crcount to mc_qrv, so the kernel will send a TO_EX() report message after adding group. But for IGMPv3 SSM(Source-specific multicast) JOIN_SOURCE_GROUP mode, we split the group joining into two steps. First we join the group like ASM, i.e. via ip_mc_join_group(). So the state changes from IN() to EX(). Then we add the source-specific address with INCLUDE mode. So the state changes from EX() to IN(A). Before the first step sends a group change record, we finished the second step. So we will only send the second change record. i.e. TO_IN(A). Regarding the RFC stands, we should actually send an ALLOW(A) message for SSM JOIN_SOURCE_GROUP as the state should mimic the 'IN() to IN(A)' transition. The issue was exposed by commit a052517a8ff65 ("net/multicast: should not send source list records when have filter mode change"). Before this change, we used to send both ALLOW(A) and TO_IN(A). After this change we only send TO_IN(A). Fix it by adding a new parameter to init group mode. Also add new wrapper functions so we don't need to change too much code. v1 -> v2: In my first version I only cleared the group change record. But this is not enough. Because when a new group join, it will init as EXCLUDE and trigger an filter mode change in ip/ip6_mc_add_src(), which will clear all source addresses' sf_crcount. This will prevent early joined address sending state change records if multi source addressed joined at the same time. In v2 patch, I fixed it by directly initializing the mode to INCLUDE for SSM JOIN_SOURCE_GROUP. I also split the original patch into two separated patches for IPv4 and IPv6. Fixes: a052517a8ff65 ("net/multicast: should not send source list records when have filter mode change") Reviewed-by: Stefano Brivio Signed-off-by: Hangbin Liu --- include/linux/igmp.h | 2 ++ net/ipv4/igmp.c | 58 ++++++++++++++++++++++++++++++++++++-------------- net/ipv4/ip_sockglue.c | 4 ++-- 3 files changed, 46 insertions(+), 18 deletions(-) diff --git a/include/linux/igmp.h b/include/linux/igmp.h index f823185..119f539 100644 --- a/include/linux/igmp.h +++ b/include/linux/igmp.h @@ -109,6 +109,8 @@ struct ip_mc_list { extern int ip_check_mc_rcu(struct in_device *dev, __be32 mc_addr, __be32 src_addr, u8 proto); extern int igmp_rcv(struct sk_buff *); extern int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr); +extern int ip_mc_join_group_ssm(struct sock *sk, struct ip_mreqn *imr, + unsigned int mode); extern int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr); extern void ip_mc_drop_socket(struct sock *sk); extern int ip_mc_source(int add, int omode, struct sock *sk, diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index b26a81a..ef72499 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -1200,13 +1200,14 @@ static void igmpv3_del_delrec(struct in_device *in_dev, struct ip_mc_list *im) spin_lock_bh(&im->lock); if (pmc) { im->interface = pmc->interface; - im->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; im->sfmode = pmc->sfmode; if (pmc->sfmode == MCAST_INCLUDE) { im->tomb = pmc->tomb; im->sources = pmc->sources; for (psf = im->sources; psf; psf = psf->sf_next) - psf->sf_crcount = im->crcount; + psf->sf_crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; + } else { + im->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; } in_dev_put(pmc->interface); kfree(pmc); @@ -1288,7 +1289,7 @@ static void igmp_group_dropped(struct ip_mc_list *im) #endif } -static void igmp_group_added(struct ip_mc_list *im) +static void igmp_group_added(struct ip_mc_list *im, unsigned int mode) { struct in_device *in_dev = im->interface; #ifdef CONFIG_IP_MULTICAST @@ -1316,7 +1317,13 @@ static void igmp_group_added(struct ip_mc_list *im) } /* else, v3 */ - im->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; + /* Based on RFC3376 5.1, for newly added INCLUDE SSM, we should + * not send filter-mode change record as the mode should be from + * IN() to IN(A). + */ + if (mode == MCAST_EXCLUDE) + im->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; + igmp_ifc_event(in_dev); #endif } @@ -1381,8 +1388,7 @@ static void ip_mc_hash_remove(struct in_device *in_dev, /* * A socket has joined a multicast group on device dev. */ - -void ip_mc_inc_group(struct in_device *in_dev, __be32 addr) +void __ip_mc_inc_group(struct in_device *in_dev, __be32 addr, unsigned int mode) { struct ip_mc_list *im; #ifdef CONFIG_IP_MULTICAST @@ -1394,7 +1400,7 @@ void ip_mc_inc_group(struct in_device *in_dev, __be32 addr) for_each_pmc_rtnl(in_dev, im) { if (im->multiaddr == addr) { im->users++; - ip_mc_add_src(in_dev, &addr, MCAST_EXCLUDE, 0, NULL, 0); + ip_mc_add_src(in_dev, &addr, mode, 0, NULL, 0); goto out; } } @@ -1408,8 +1414,8 @@ void ip_mc_inc_group(struct in_device *in_dev, __be32 addr) in_dev_hold(in_dev); im->multiaddr = addr; /* initial mode is (EX, empty) */ - im->sfmode = MCAST_EXCLUDE; - im->sfcount[MCAST_EXCLUDE] = 1; + im->sfmode = mode; + im->sfcount[mode] = 1; refcount_set(&im->refcnt, 1); spin_lock_init(&im->lock); #ifdef CONFIG_IP_MULTICAST @@ -1426,12 +1432,17 @@ void ip_mc_inc_group(struct in_device *in_dev, __be32 addr) #ifdef CONFIG_IP_MULTICAST igmpv3_del_delrec(in_dev, im); #endif - igmp_group_added(im); + igmp_group_added(im, mode); if (!in_dev->dead) ip_rt_multicast_event(in_dev); out: return; } + +void ip_mc_inc_group(struct in_device *in_dev, __be32 addr) +{ + __ip_mc_inc_group(in_dev, addr, MCAST_EXCLUDE); +} EXPORT_SYMBOL(ip_mc_inc_group); static int ip_mc_check_iphdr(struct sk_buff *skb) @@ -1688,7 +1699,7 @@ void ip_mc_remap(struct in_device *in_dev) #ifdef CONFIG_IP_MULTICAST igmpv3_del_delrec(in_dev, pmc); #endif - igmp_group_added(pmc); + igmp_group_added(pmc, pmc->sfmode); } } @@ -1751,7 +1762,7 @@ void ip_mc_up(struct in_device *in_dev) #ifdef CONFIG_IP_MULTICAST igmpv3_del_delrec(in_dev, pmc); #endif - igmp_group_added(pmc); + igmp_group_added(pmc, pmc->sfmode); } } @@ -2130,8 +2141,8 @@ static void ip_mc_clear_src(struct ip_mc_list *pmc) /* Join a multicast group */ - -int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr) +static int __ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr, + unsigned int mode) { __be32 addr = imr->imr_multiaddr.s_addr; struct ip_mc_socklist *iml, *i; @@ -2172,15 +2183,30 @@ int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr) memcpy(&iml->multi, imr, sizeof(*imr)); iml->next_rcu = inet->mc_list; iml->sflist = NULL; - iml->sfmode = MCAST_EXCLUDE; + iml->sfmode = mode; rcu_assign_pointer(inet->mc_list, iml); - ip_mc_inc_group(in_dev, addr); + __ip_mc_inc_group(in_dev, addr, mode); err = 0; done: return err; } + +/* Join ASM (Any-Source Multicast) group + */ +int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr) +{ + return __ip_mc_join_group(sk, imr, MCAST_EXCLUDE); +} EXPORT_SYMBOL(ip_mc_join_group); +/* Join SSM (Source-Specific Multicast) group + */ +int ip_mc_join_group_ssm(struct sock *sk, struct ip_mreqn *imr, + unsigned int mode) +{ + return __ip_mc_join_group(sk, imr, mode); +} + static int ip_mc_leave_src(struct sock *sk, struct ip_mc_socklist *iml, struct in_device *in_dev) { diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index 57bbb06..5f28607 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -982,7 +982,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, mreq.imr_multiaddr.s_addr = mreqs.imr_multiaddr; mreq.imr_address.s_addr = mreqs.imr_interface; mreq.imr_ifindex = 0; - err = ip_mc_join_group(sk, &mreq); + err = ip_mc_join_group_ssm(sk, &mreq, MCAST_INCLUDE); if (err && err != -EADDRINUSE) break; omode = MCAST_INCLUDE; @@ -1059,7 +1059,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, mreq.imr_multiaddr = psin->sin_addr; mreq.imr_address.s_addr = 0; mreq.imr_ifindex = greqs.gsr_interface; - err = ip_mc_join_group(sk, &mreq); + err = ip_mc_join_group_ssm(sk, &mreq, MCAST_INCLUDE); if (err && err != -EADDRINUSE) break; greqs.gsr_interface = mreq.imr_ifindex;