IGMP Join dropping multicast packets

Message ID 91bdcedb0903141316j2dbf4160wb348a5a9e3bde8ad@mail.gmail.com
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Dave Boutcher March 14, 2009, 8:16 p.m. UTC
I'm running into an interesting problem with joining multiple
multicast feeds.  If you join multiple multicast feeds using
setsockopt(...,IP_ADD_MEMBERSHIP...) it causes packets on UNRELATED
multicast feeds to get dropped.  We have a multicast feed on a rock
solid network, and we were very surprised to see dropped packets.  The
cause was a different process/program being run by a different user
joining a bunch of multicast feeds.

I can recreate this with a fairly simple testcase (attached below.)
The problem doesn't happen with unicast UDP data, and it doesn't
happen with loopback, so you need at least two systems to run this
(and what subscriber to netdev doesn't have at least two systems.)  To
recreate, run "receiver" on one system, "sender", on another, and then
"joiner" on the receiving system.  You should see a message pop out
saying that packets have been dropped.  I've recreated this on a few

Comments

Eric Dumazet March 15, 2009, 2:37 a.m. UTC | #1
Dave Boutcher a écrit :
> I'm running into an interesting problem with joining multiple
> multicast feeds.  If you join multiple multicast feeds using
> setsockopt(...,IP_ADD_MEMBERSHIP...) it causes packets on UNRELATED
> multicast feeds to get dropped.  We have a multicast feed on a rock
> solid network, and we were very surprised to see dropped packets.  The
> cause was a different process/program being run by a different user
> joining a bunch of multicast feeds.
> 
> I can recreate this with a fairly simple testcase (attached below.)
> The problem doesn't happen with unicast UDP data, and it doesn't
> happen with loopback, so you need at least two systems to run this
> (and what subscriber to netdev doesn't have at least two systems.)  To
> recreate, run "receiver" on one system, "sender", on another, and then
> "joiner" on the receiving system.  You should see a message pop out
> saying that packets have been dropped.  I've recreated this on a few
> different kernel versions (the latest being 2.6.28) and a few
> different sets of hardware.  I HAVEN'T recreated it if the system
> doing the IP_ADD_MEMBERSHIP specifies a specific interface rather than
> INADDR_ANY.  I'm not sure if that is core to the issue or not.  You
> may also need to bump the value in
> /proc/sys/net/ipv4/igmp_max_memberships (though that hasn't seemed
> necessary for me.)
> 
> I poked around in igmp.c, but its mojo exceeds my threshold.  If
> anyone has any ideas or questions I'd be happy to hear them.
> 

I could not reproduce the problem on my machines (bnx2 adapter), even if changing
NUMSOCK from 55 to 200 in joiner.c

Is your network a 100Mb one or Gigabit ?
Try to slow down your joiner ?
(Could be a flood of IGMP messages your router/switch cannot cope with)

Please describe your "rock solid" network setup (kind of network adapters you have, kind of router...)

Each time an address is added, the NIC driver has to reprogram the device's
multicast filter. Some NICs may drop packets while that happens...

Does using tcpdump to force promiscuous mode on the device also trigger packet losses?

(see also ifconfig ethX promisc|allmulti)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Boutcher March 16, 2009, 2:04 a.m. UTC | #2
On Sat, Mar 14, 2009 at 9:37 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
> Dave Boutcher a écrit :
>> I'm running into an interesting problem with joining multiple
>> multicast feeds.  If you join multiple multicast feeds using
>> setsockopt(...,IP_ADD_MEMBERSHIP...) it causes packets on UNRELATED
>> multicast feeds to get dropped.  We have a multicast feed on a rock
>> solid network, and we were very surprised to see dropped packets.  The
>> cause was a different process/program being run by a different user
>> joining a bunch of multicast feeds.
>
> I could not reproduce the problem on my machines (bnx2 adapter), even if changing
> NUMSOCK from 55 to 200 in joiner.c

Thanks for trying, Eric.  Based on your email I did some more testing and
thus far I've only recreated this on x86_64 arches, not on i386.  Which arch
did you try it on?

> Is your network a 100Mb one or Gigabit ?
> Try to slow down your joiner ?
> (Could be a flood of IGMP messages your router/switch cannot cope with)
>
> Please describe your "rock solid" network setup (kind of network adapters you have, kind of router...)

The problem originally manifested itself at work on a 24-core Dell server
with 6 NICs.  The network is gigabit with a Cisco 4900 switch.  I recreated
it in my basement on my little white-box system and a cheap Netgear switch.
The NIC at work uses the Intel e1000e driver, and the one at home is also
e1000.

> If using tcpdump to force promiscuous mode on the device also triggers packet losses ?
>
> (see also ifconfig ethX promisc|allmulti)

I haven't had a chance to play with promiscuous yet...
Eric Dumazet March 16, 2009, 7:01 p.m. UTC | #3
Dave Boutcher a écrit :
> On Sat, Mar 14, 2009 at 9:37 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
>> Dave Boutcher a écrit :
>>> I'm running into an interesting problem with joining multiple
>>> multicast feeds.  If you join multiple multicast feeds using
>>> setsockopt(...,IP_ADD_MEMBERSHIP...) it causes packets on UNRELATED
>>> multicast feeds to get dropped.  We have a multicast feed on a rock
>>> solid network, and we were very surprised to see dropped packets.  The
>>> cause was a different process/program being run by a different user
>>> joining a bunch of multicast feeds.
>> I could not reproduce the problem on my machines (bnx2 adapter), even if changing
>> NUMSOCK from 55 to 200 in joiner.c
> 
> Thanks for trying Eric.  Based on your email I did some more testing
> and thus far I've
> only recreated this on x86_64 arches, not on i386.  Which arch did you
> try it on?

I tried both, 32 and 64 bit kernels. No problems so far.

Could you post a linux kernel .config of a non 'working' machine, and dmesg output ?

> 
>> Is your network a 100Mb one or Gigabit ?
>> Try to slow down your joiner ?
>> (Could be a flood of IGMP messages your router/switch cannot cope with)
>>
>> Please describe your "rock solid" network setup (kind of network adapters you have, kind of router...)
> 
> The problem originally manifested itself at work on a 24-core Dell
> server with 6 NICs.   The network
> is gigabit with a Cisco 4900 switch.  I recreated it in my basement on
> my little white-box
> system and a cheap netgear switch.  The NIC at work is Intel e1000e
> driver, the one
> at home is also e1000.
> 
>> If using tcpdump to force promiscuous mode on the device also triggers packet losses ?
>>
>> (see also ifconfig ethX promisc|allmulti)
> 
> I haven't had a chance to play with promiscuous yet...
> 


Eric Dumazet March 17, 2009, 7:08 a.m. UTC | #4
Eric Dumazet a écrit :
> Dave Boutcher a écrit :
>> On Sat, Mar 14, 2009 at 9:37 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
>>> Dave Boutcher a écrit :
>>>> I'm running into an interesting problem with joining multiple
>>>> multicast feeds.  If you join multiple multicast feeds using
>>>> setsockopt(...,IP_ADD_MEMBERSHIP...) it causes packets on UNRELATED
>>>> multicast feeds to get dropped.  We have a multicast feed on a rock
>>>> solid network, and we were very surprised to see dropped packets.  The
>>>> cause was a different process/program being run by a different user
>>>> joining a bunch of multicast feeds.
>>> I could not reproduce the problem on my machines (bnx2 adapter), even if changing
>>> NUMSOCK from 55 to 200 in joiner.c
>> Thanks for trying Eric.  Based on your email I did some more testing
>> and thus far I've
>> only recreated this on x86_64 arches, not on i386.  Which arch did you
>> try it on?
> 
> I tried both, 32 and 64 bit kernels. No problems so far.
> 
> Could you post a linux kernel .config of a non 'working' machine, and dmesg output ?
> 

Also, does running your joiner program on a third machine trigger
packet losses too?


Dave Boutcher March 18, 2009, 3:50 a.m. UTC | #5
On Mon, Mar 16, 2009 at 2:01 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
> Dave Boutcher a écrit :
>> On Sat, Mar 14, 2009 at 9:37 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
>>> Dave Boutcher a écrit :
>>>> I'm running into an interesting problem with joining multiple
>>>> multicast feeds.  If you join multiple multicast feeds using
>>>> setsockopt(...,IP_ADD_MEMBERSHIP...) it causes packets on UNRELATED
>>>> multicast feeds to get dropped.  We have a multicast feed on a rock
>>>> solid network, and we were very surprised to see dropped packets.  The
>>>> cause was a different process/program being run by a different user
>>>> joining a bunch of multicast feeds.
>>> I could not reproduce the problem on my machines (bnx2 adapter), even if changing
>>> NUMSOCK from 55 to 200 in joiner.c
>>
>> Thanks for trying Eric.  Based on your email I did some more testing
>> and thus far I've
>> only recreated this on x86_64 arches, not on i386.  Which arch did you
>> try it on?
>
> I tried both, 32 and 64 bit kernels. No problems so far.
>
> Could you post a linux kernel .config of a non 'working' machine, and dmesg output ?

Eric, based on your inability to recreate this, I tried on some other
hardware I had lying around that has an AMD chipset built-in NIC.
I could not recreate the problem on that hardware.  I'm starting to
think this is an e1000 problem.  In both the e1000 and e1000e
drivers they do the following logic:

      /* clear the old settings from the multicast hash table */

       for (i = 0; i < mta_reg_count; i++) {
               E1000_WRITE_REG_ARRAY(hw, MTA, i, 0);
               E1000_WRITE_FLUSH();
       }

       /* load any remaining addresses into the hash table */

       for (; mc_ptr; mc_ptr = mc_ptr->next) {
               hash_value = e1000_hash_mc_addr(hw, mc_ptr->da_addr);
               e1000_mta_set(hw, hash_value);
       }

There's clearly a window where the NIC doesn't have the multicast
addresses loaded.  This may just be broken-as-designed.  If anyone
else happens to have some e1000 hardware and wants to see if you
can recreate this, I'd be curious.
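
The window described above can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions: `fake_nic`, `toy_hash`, the 128-register table size, and the "sample the filter after every register write" model are all hypothetical stand-ins, not the e1000 hardware or its real hash function. It compares the clear-then-load pattern quoted above with building the table in a local shadow array and writing each register once.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MTA_REGS 128    /* hypothetical table size: 128 x 32-bit registers */

struct fake_nic {
    uint32_t mta[MTA_REGS];   /* stand-in for the hardware hash table */
    uint32_t watched_hash;    /* hash of a group an unrelated receiver uses */
    unsigned missed_writes;   /* writes after which that group was filtered */
};

/* Toy hash, NOT the e1000 algorithm: fold the MAC into 12 bits. */
static uint32_t toy_hash(const uint8_t mac[6])
{
    uint32_t h = 0;
    for (int i = 0; i < 6; i++)
        h = h * 31u + mac[i];
    return h & 0xfffu;
}

static int nic_accepts(const struct fake_nic *nic, uint32_t hash)
{
    return (nic->mta[hash >> 5] >> (hash & 0x1f)) & 1u;
}

/* Every simulated register write goes through here and samples the filter. */
static void reg_write(struct fake_nic *nic, unsigned idx, uint32_t val)
{
    assert(idx < MTA_REGS);
    nic->mta[idx] = val;
    if (!nic_accepts(nic, nic->watched_hash))
        nic->missed_writes++;
}

/* Pattern from the quoted driver code: zero everything, then set bits. */
static void update_clear_then_load(struct fake_nic *nic,
                                   const uint32_t *hashes, size_t n)
{
    for (unsigned i = 0; i < MTA_REGS; i++)
        reg_write(nic, i, 0);
    for (size_t i = 0; i < n; i++) {
        unsigned idx = hashes[i] >> 5;
        reg_write(nic, idx, nic->mta[idx] | (1u << (hashes[i] & 0x1f)));
    }
}

/* Alternative: build the table locally, then write each register once. */
static void update_via_shadow(struct fake_nic *nic,
                              const uint32_t *hashes, size_t n)
{
    uint32_t shadow[MTA_REGS];

    memset(shadow, 0, sizeof(shadow));
    for (size_t i = 0; i < n; i++)
        shadow[hashes[i] >> 5] |= 1u << (hashes[i] & 0x1f);
    for (unsigned i = 0; i < MTA_REGS; i++)
        reg_write(nic, i, shadow[i]);
}

/* Count register writes that left an already-joined group filtered out. */
unsigned gap_writes(int use_shadow)
{
    /* 01:00:5e:40:01:01 is the Ethernet address for 239.192.1.1 */
    static const uint8_t watched[6] = {0x01, 0x00, 0x5e, 0x40, 0x01, 0x01};
    struct fake_nic nic;
    uint32_t hashes[4];

    memset(&nic, 0, sizeof(nic));
    nic.watched_hash = toy_hash(watched);
    nic.mta[nic.watched_hash >> 5] = 1u << (nic.watched_hash & 0x1f);

    hashes[0] = nic.watched_hash;           /* the group stays in the list */
    for (int i = 1; i < 4; i++) {           /* three newly joined groups */
        uint8_t mac[6] = {0x01, 0x00, 0x5e, 0x40, 0x02, (uint8_t)i};
        hashes[i] = toy_hash(mac);
    }

    if (use_shadow)
        update_via_shadow(&nic, hashes, 4);
    else
        update_clear_then_load(&nic, hashes, 4);
    return nic.missed_writes;
}
```

In this model `gap_writes(0)` is non-zero because the watched group's bit stays cleared from the moment its register is zeroed until the reload loop sets it again, while `gap_writes(1)` is zero because a register only ever changes from its old value to its new value. Whether real hardware behaves this cleanly is exactly the open question in the thread.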

Some other notes just FYI...

- RcvbufErrors in /proc/net/snmp doesn't get incremented when this happens
- there are no messages in dmesg
- frames get dropped when the program calls exit() and all the sockets
  get closed (and multicast joins dropped) as well as when the
  ADD_MEMBERSHIPs happen
- The problem happens even when adding a sleep(1) in between each of the
  ADD_MEMBERSHIP calls.
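
For anyone wanting to check the first note on their own system, the following is a minimal sketch of reading the UDP `RcvbufErrors` counter. It assumes the usual Linux layout of /proc/net/snmp, where each protocol has a header line of field names followed by a line of values; the function names `snmp_column` and `udp_rcvbuf_errors` are made up for this example. A counter that stays flat while the receiver reports sequence gaps would be consistent with frames being discarded before they reach the socket buffer.

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up column `name` given one header line and its matching value line.
 * Returns the counter, or -1 if the column is absent or a line is too long.
 */
long long snmp_column(const char *hdr, const char *val, const char *name)
{
    char h[1024], v[1024];
    char *hs, *vs, *ht, *vt;

    assert(hdr && val && name);
    if (strlen(hdr) >= sizeof(h) || strlen(val) >= sizeof(v))
        return -1;
    strcpy(h, hdr);
    strcpy(v, val);

    /* Walk both lines in lockstep so names and values stay paired. */
    ht = strtok_r(h, " \n", &hs);
    vt = strtok_r(v, " \n", &vs);
    while (ht && vt) {
        if (strcmp(ht, name) == 0)
            return strtoll(vt, NULL, 10);
        ht = strtok_r(NULL, " \n", &hs);
        vt = strtok_r(NULL, " \n", &vs);
    }
    return -1;
}

/* Read the "Udp:" header/value pair from /proc/net/snmp (Linux only). */
long long udp_rcvbuf_errors(void)
{
    char hdr[1024], val[1024];
    long long result = -1;
    FILE *f = fopen("/proc/net/snmp", "r");

    if (!f)
        return -1;
    while (fgets(hdr, sizeof(hdr), f)) {
        if (strncmp(hdr, "Udp:", 4) != 0)
            continue;
        if (fgets(val, sizeof(val), f))
            result = snmp_column(hdr, val, "RcvbufErrors");
        break;
    }
    fclose(f);
    return result;
}
```

Calling `udp_rcvbuf_errors()` before and after running the joiner and comparing the two values is one way to repeat the observation.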
Eric Dumazet March 18, 2009, 7:38 a.m. UTC | #6
Dave Boutcher a écrit :
> On Mon, Mar 16, 2009 at 2:01 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
>> Dave Boutcher a écrit :
>>> On Sat, Mar 14, 2009 at 9:37 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
>>>> Dave Boutcher a écrit :
>>>>> I'm running into an interesting problem with joining multiple
>>>>> multicast feeds.  If you join multiple multicast feeds using
>>>>> setsockopt(...,IP_ADD_MEMBERSHIP...) it causes packets on UNRELATED
>>>>> multicast feeds to get dropped.  We have a multicast feed on a rock
>>>>> solid network, and we were very surprised to see dropped packets.  The
>>>>> cause was a different process/program being run by a different user
>>>>> joining a bunch of multicast feeds.
>>>> I could not reproduce the problem on my machines (bnx2 adapter), even if changing
>>>> NUMSOCK from 55 to 200 in joiner.c
>>> Thanks for trying Eric.  Based on your email I did some more testing
>>> and thus far I've
>>> only recreated this on x86_64 arches, not on i386.  Which arch did you
>>> try it on?
>> I tried both, 32 and 64 bit kernels. No problems so far.
>>
>> Could you post a linux kernel .config of a non 'working' machine, and dmesg output ?
> 
> Eric, based on your inability to recreate this, I tried on some other
> hardware I had lying around that has an AMD chipset built-in NIC.
> I could not recreate the problem on that hardware.  I'm starting to
> think this is an e1000 problem.  In both the e1000 and e1000e
> drivers they do the following logic:
> 
>       /* clear the old settings from the multicast hash table */
> 
>        for (i = 0; i < mta_reg_count; i++) {
>                E1000_WRITE_REG_ARRAY(hw, MTA, i, 0);
>                E1000_WRITE_FLUSH();
>        }
> 
>        /* load any remaining addresses into the hash table */
> 
>        for (; mc_ptr; mc_ptr = mc_ptr->next) {
>                hash_value = e1000_hash_mc_addr(hw, mc_ptr->da_addr);
>                e1000_mta_set(hw, hash_value);
>        }
> 
> There's clearly a window where the NIC doesn't have the multicast
> addresses loaded.  This may just be broken-as-designed.  If anyone
> else happens to have some e1000 hardware and wants to see if you
> can recreate this, I'd be curious.
> 

Ouch, you are probably right, this code needs a change.

tg3, for example, has a loop building hash values in a local array,
and then writes this array to the NIC.

                for (i = 0, mclist = dev->mc_list; mclist && i < dev->mc_count;
                     i++, mclist = mclist->next) {

                        crc = calc_crc (mclist->dmi_addr, ETH_ALEN);
                        bit = ~crc & 0x7f;
                        regidx = (bit & 0x60) >> 5;
                        bit &= 0x1f;
                        mc_filter[regidx] |= (1 << bit);
                }

                tw32(MAC_HASH_REG_0, mc_filter[0]);
                tw32(MAC_HASH_REG_1, mc_filter[1]);
                tw32(MAC_HASH_REG_2, mc_filter[2]);
                tw32(MAC_HASH_REG_3, mc_filter[3]);
        }

Another example, bnx2, uses the same logic:

               memset(mc_filter, 0, 4 * NUM_MC_HASH_REGISTERS);

                for (i = 0, mclist = dev->mc_list; mclist && i < dev->mc_count;
                     i++, mclist = mclist->next) {

                        crc = ether_crc_le(ETH_ALEN, mclist->dmi_addr);
                        bit = crc & 0xff;
                        regidx = (bit & 0xe0) >> 5;
                        bit &= 0x1f;
                        mc_filter[regidx] |= (1 << bit);
                }

                for (i = 0; i < NUM_MC_HASH_REGISTERS; i++) {
                        REG_WR(bp, BNX2_EMAC_MULTICAST_HASH0 + (i * 4),
                               mc_filter[i]);
                }



> Some other notes just FYI...
> 
> - RcvbufErrors in /proc/net/snmp doesn't get incremented when this happens
> - there are no messages in dmesg
> - frames get dropped when the program calls exit() and all the sockets
> get closed
>   (and multicast joins dropped) as well as when the ADD_MEMBERSHIPs happen
> - The problem happens even when adding a sleep(1) in between each of the
>   ADD_MEMBERSHIP calls.
> 


Jesse Brandeburg March 18, 2009, 5:24 p.m. UTC | #7
On Tue, 17 Mar 2009, Dave Boutcher wrote:
> Eric, based on your inability to recreate this, I tried on some other
> hardware I had lying around that has an AMD chipset built-in NIC.
> I could not recreate the problem on that hardware.  I'm starting to
> think this is an e1000 problem.  In both the e1000 and e1000e
> drivers they do the following logic:
> 
>       /* clear the old settings from the multicast hash table */
> 
>        for (i = 0; i < mta_reg_count; i++) {
>                E1000_WRITE_REG_ARRAY(hw, MTA, i, 0);
>                E1000_WRITE_FLUSH();
>        }
> 
>        /* load any remaining addresses into the hash table */
> 
>        for (; mc_ptr; mc_ptr = mc_ptr->next) {
>                hash_value = e1000_hash_mc_addr(hw, mc_ptr->da_addr);
>                e1000_mta_set(hw, hash_value);
>        }
> 
> There's clearly a window where the NIC doesn't have the multicast
> addresses loaded.  This may just be broken-as-designed.  If anyone
> else happens to have some e1000 hardware and wants to see if you
> can recreate this, I'd be curious.
> 
> Some other notes just FYI...
> 
> - RcvbufErrors in /proc/net/snmp doesn't get incremented when this happens
> - there are no messages in dmesg
> - frames get dropped when the program calls exit() and all the sockets
> get closed
>   (and multicast joins dropped) as well as when the ADD_MEMBERSHIPs happen
> - The problem happens even when adding a sleep(1) in between each of the
>   ADD_MEMBERSHIP calls.

Interesting, this code has been there for eons (and probably this 
behavior) but that doesn't mean it's not a problem.

We are in the process of figuring out if there are any hardware corner 
cases to changing this code (particularly in e1000)

Initial thoughts are:
1) kcalloc an array that we then populate with the hash functions, and 
   then program every location only once (never flush)
2) only program a single hash value each time a multicast is added (bad 
   because we can't tell the difference in the list since the last time 
   the OS gave us the list)
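
One possible middle ground between the two options above, sketched in user space: build the new table in a freshly allocated array as in option 1, but also keep a copy of what was last programmed so that only registers whose value changed are written. This is an illustration under assumptions, not the driver's code: `mta_state`, `mta_sync`, and the 128-register size are hypothetical, `calloc` stands in for `kcalloc`, and the hash values are taken as already computed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MTA_REGS 128    /* hypothetical number of 32-bit hash registers */

struct mta_state {
    uint32_t hw[MTA_REGS];     /* simulated hardware registers */
    uint32_t prev[MTA_REGS];   /* what the driver last programmed */
    unsigned writes;           /* simulated register writes so far */
};

/*
 * Program the filter for the complete list of hash values the OS handed
 * us.  Registers are never cleared first; each one that differs from the
 * previous table is written exactly once with its final value.
 * Returns 0 on success, -1 if the temporary table cannot be allocated.
 */
int mta_sync(struct mta_state *st, const uint32_t *hashes, size_t n)
{
    uint32_t *next = calloc(MTA_REGS, sizeof(*next));

    if (!next)
        return -1;

    for (size_t i = 0; i < n; i++) {
        assert(hashes[i] < MTA_REGS * 32u);
        next[hashes[i] >> 5] |= 1u << (hashes[i] & 0x1f);
    }

    for (unsigned i = 0; i < MTA_REGS; i++) {
        if (next[i] == st->prev[i])
            continue;              /* unchanged register: leave it alone */
        st->hw[i] = next[i];
        st->prev[i] = next[i];
        st->writes++;
    }

    free(next);
    return 0;
}
```

Because the comparison is against the stored table rather than against the previous address list, the driver does not need to know which addresses were added or removed. Any hardware corner cases around partial updates would still need checking on the real devices.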

It really seems like this should be fixable, and I agree that the driver 
behavior is far from optimal, however well entrenched.

Jesse
Dave Boutcher March 19, 2009, 1:51 a.m. UTC | #8
On Wed, Mar 18, 2009 at 12:24 PM, Brandeburg, Jesse
<jesse.brandeburg@intel.com> wrote:
>
> On Tue, 17 Mar 2009, Dave Boutcher wrote:
> > Eric, based on your inability to recreate this, I tried on some other
> > hardware I had lying around that has an AMD chipset built-in NIC.
> > I could not recreate the problem on that hardware.  I'm starting to
> > think this is an e1000 problem.  In both the e1000 and e1000e
> > drivers they do the following logic:
> >
> >       /* clear the old settings from the multicast hash table */
> >
> >        for (i = 0; i < mta_reg_count; i++) {
> >                E1000_WRITE_REG_ARRAY(hw, MTA, i, 0);
> >                E1000_WRITE_FLUSH();
> >        }
> >
> >        /* load any remaining addresses into the hash table */
> >
> >        for (; mc_ptr; mc_ptr = mc_ptr->next) {
> >                hash_value = e1000_hash_mc_addr(hw, mc_ptr->da_addr);
> >                e1000_mta_set(hw, hash_value);
> >        }
> >
> > There's clearly a window where the NIC doesn't have the multicast
> > addresses loaded.  This may just be broken-as-designed.  If anyone
> > else happens to have some e1000 hardware and wants to see if you
> > can recreate this, I'd be curious.
> >
> > Some other notes just FYI...
> >
> > - RcvbufErrors in /proc/net/snmp doesn't get incremented when this happens
> > - there are no messages in dmesg
> > - frames get dropped when the program calls exit() and all the sockets
> > get closed
> >   (and multicast joins dropped) as well as when the ADD_MEMBERSHIPs happen
> > - The problem happens even when adding a sleep(1) in between each of the
> >   ADD_MEMBERSHIP calls.
>
> Interesting, this code has been there for eons (and probably this
> behavior) but that doesn't mean it's not a problem.

Hi Jesse, thanks for the response...

If you go back in this thread I had a dead easy unprivileged user-land
testcase that causes frame loss.  We ran into this in a production
environment (and I kind of glossed over how long it took to figure out why
the hell we were dropping frames...you can only increase rmem_max so many
times ;-)  OTOH not that many people use multicast, and even fewer notice a
few dropped frames, so the priority is probably lowish.

On the other other hand, I'm working in the financial trading space these days,
where Linux is pretty much king....and they're all about multicast.

--
Dave B
David Miller March 19, 2009, 5:46 a.m. UTC | #9
From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
Date: Wed, 18 Mar 2009 10:24:18 -0700 (Pacific Daylight Time)

> Interesting, this code has been there for eons (and probably this 
> behavior) but that doesn't mean it's not a problem.
> 
> We are in the process of figuring out if there are any hardware corner 
> cases to changing this code (particularly in e1000)
> 
> Initial thoughts are:
> 1) kcalloc an array that we then populate with the hash functions, and 
>    then program every location only once (never flush)
> 2) only program a single hash value each time a multicast is added (bad 
>    because we can't tell the difference in the list since the last time 
>    the OS gave us the list)
> 
> It really seems like this should be fixable, and I agree that the driver 
> behavior is far from optimal, however well entrenched.

Just do what tg3 does to fix this now, get fancy and "beautiful"
later.

Patch

different kernel versions (the latest being 2.6.28) and a few
different sets of hardware.  I HAVEN'T recreated it if the system
doing the IP_ADD_MEMBERSHIP specifies a specific interface rather than
INADDR_ANY.  I'm not sure if that is core to the issue or not.  You
may also need to bump the value in
/proc/sys/net/ipv4/igmp_max_memberships (though that hasn't seemed
necessary for me.)
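
For comparison with the INADDR_ANY join used in the attached joiner.c, here is a minimal sketch of joining on a specific interface by putting that interface's local IPv4 address in `imr_interface`. The helper names `fill_mreq` and `join_on_interface` and the example addresses are made up for illustration; only `struct ip_mreq`, `inet_pton`, `IN_MULTICAST`, and `IP_ADD_MEMBERSHIP` are standard.

```c
#define _DEFAULT_SOURCE   /* struct ip_mreq under strict glibc feature macros */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Fill an ip_mreq for `group`, tied to the interface that owns `ifaddr`.
 * Returns 0 on success, -1 if either string is not a usable IPv4 address
 * or `group` is not a multicast address.
 */
int fill_mreq(struct ip_mreq *mreq, const char *group, const char *ifaddr)
{
    assert(mreq && group && ifaddr);
    memset(mreq, 0, sizeof(*mreq));

    if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
        return -1;
    if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr)))
        return -1;                      /* outside 224.0.0.0/4 */
    if (inet_pton(AF_INET, ifaddr, &mreq->imr_interface) != 1)
        return -1;
    return 0;
}

/* Join `group` on socket `sd` through the interface that owns `ifaddr`. */
int join_on_interface(int sd, const char *group, const char *ifaddr)
{
    struct ip_mreq mreq;

    if (fill_mreq(&mreq, group, ifaddr))
        return -1;
    return setsockopt(sd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}
```

In joiner.c this would replace the `htonl(INADDR_ANY)` assignment, for example `join_on_interface(sd, ipaddr, "192.0.2.10")` with the receiving NIC's real address in place of the placeholder. Why the drops were not seen in that configuration is not established in this thread.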

I poked around in igmp.c, but its mojo exceeds my threshold.  If
anyone has any ideas or questions I'd be happy to hear them.

diff -uNr null/joiner.c multicast/joiner.c
--- null/joiner.c	1969-12-31 18:00:00.000000000 -0600
+++ multicast/joiner.c	2009-03-14 15:04:10.000000000 -0500
@@ -0,0 +1,44 @@ 
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <unistd.h>
+
+#define NUMSOCK 55
+
+int main(int argc, char **argv)
+{
+	struct ip_mreq mreq;
+	int i;
+	int sd;
+	char ipaddr[64];
+
+	for (i=0; i<NUMSOCK; i++) {
+		sd = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
+		
+		if (sd < 0) {
+			perror("socket");
+			exit(0);
+		}
+		
+		sprintf(ipaddr,"239.192.2.%d",i+1);
+		
+		mreq.imr_multiaddr.s_addr = inet_addr(ipaddr);
+		mreq.imr_interface.s_addr = htonl(INADDR_ANY);
+		
+		if (setsockopt(sd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq))) {
+			perror("IP_ADD_MEMBERSHIP");
+			exit(0);
+		}
+	}
+
+	printf("Sleeping for 10 seconds\n");
+	sleep(10);
+
+	exit(0);
+}
+
diff -uNr null/Makefile multicast/Makefile
--- null/Makefile	1969-12-31 18:00:00.000000000 -0600
+++ multicast/Makefile	2009-03-14 15:13:09.000000000 -0500
@@ -0,0 +1,9 @@ 
+CFLAGS = -Wall -g
+
+all: sender receiver joiner
+
+receiver:
+
+sender:
+
+joiner:
diff -uNr null/mctest.h multicast/mctest.h
--- null/mctest.h	1969-12-31 18:00:00.000000000 -0600
+++ multicast/mctest.h	2009-03-14 14:47:13.000000000 -0500
@@ -0,0 +1,10 @@ 
+#ifndef __MCTEST_H__
+#define __MCTEST_H__
+
+struct mcdata {
+	int32_t seq1;
+	char data[60];
+	int32_t seq2;
+};
+
+#endif
Binary files null/receiver and multicast/receiver differ
diff -uNr null/receiver.c multicast/receiver.c
--- null/receiver.c	1969-12-31 18:00:00.000000000 -0600
+++ multicast/receiver.c	2009-03-14 14:48:25.000000000 -0500
@@ -0,0 +1,71 @@ 
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "mctest.h"
+
+int main(int argc, char **argv)
+{
+	int sd = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
+	int on = 1;
+	struct sockaddr_in addr;
+	uint32_t seq = 1;
+	int bytes;
+	struct ip_mreq mreq;
+
+	struct mcdata data;
+
+	if (sd < 0) {
+		perror("socket");
+		exit(0);
+	}
+
+	mreq.imr_multiaddr.s_addr = inet_addr("239.192.1.1");
+	mreq.imr_interface.s_addr = htonl(INADDR_ANY);
+
+	if (setsockopt(sd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq))) {
+		perror("IP_ADD_MEMBERSHIP");
+		exit(0);
+	}
+
+	if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(int))) {
+		perror("SO_REUSEADDR");
+		exit(0);
+	}
+
+	bzero(&addr, sizeof(addr));
+	addr.sin_family = AF_INET;
+	addr.sin_port = htons(60604);
+	addr.sin_addr.s_addr = inet_addr("239.192.1.1");
+
+	if (bind(sd, (struct sockaddr*)&addr, sizeof(addr))) {
+		perror("bind");
+		exit(0);
+	}
+
+	while(1) {
+		bytes = recv(sd, &data, sizeof(data), 0);
+		if (bytes != sizeof(data)) {
+			printf("recv got %d, expected %zu\n",
+			       bytes, sizeof(data));
+			exit(0);
+		}
+
+		if ((ntohl(data.seq1) != seq) ||
+		    (ntohl(data.seq2) != seq)) {
+			printf("Mismatched seq! Expected %u, got %u/%u\n",
+			       seq, ntohl(data.seq1), ntohl(data.seq2));
+		}
+
+		seq = ntohl(data.seq1)+1;
+		if (seq % 10000 == 0)
+			printf("got seq %u\n",seq);
+	}
+}
+
diff -uNr null/sender.c multicast/sender.c
--- null/sender.c	1969-12-31 18:00:00.000000000 -0600
+++ multicast/sender.c	2009-03-14 14:47:13.000000000 -0500
@@ -0,0 +1,55 @@ 
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "mctest.h"
+
+int main(int argc, char **argv)
+{
+	int sd = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
+	int on = 1;
+	struct sockaddr_in addr;
+	uint32_t seq = 1;
+	int bytes;
+
+	struct mcdata data;
+
+	if (sd < 0) {
+		perror("socket");
+		exit(0);
+	}
+
+	if (setsockopt(sd, IPPROTO_IP, IP_MULTICAST_LOOP, &on, sizeof(int))) {
+		perror("IP_MULTICAST_LOOP");
+		exit(0);
+	}
+
+	bzero(&addr, sizeof(addr));
+	addr.sin_family = AF_INET;
+	addr.sin_port = htons(60604);
+	addr.sin_addr.s_addr = inet_addr("239.192.1.1");
+
+	memset(data.data, 0xdb, sizeof(data.data));
+
+	while (1) {
+		data.seq1 = data.seq2 = htonl(seq);
+
+		bytes = sendto(sd, &data, sizeof(data), 0,
+			       (struct sockaddr *)&addr, sizeof(addr));
+		if (bytes != sizeof(data)) {
+			perror("send");
+			exit(0);
+		}
+
+		seq++;
+		usleep(1000);
+	}
+
+	return 0;
+}