Message ID | 1283895316.2634.248.camel@edumazet-laptop |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
On 2010-09-07 23:35, Eric Dumazet wrote: > Le mardi 07 septembre 2010 à 12:59 -0700, David Miller a écrit : > >> Eric, please just delete the code block instead of leaving it >> there inside of an #if 0 block. >> >> If there is information conveyed by the unused code, add that >> information to the nice comment you're adding :-) >> > > Indeed ;) > > [PATCH] inet: dont set inet_rcv_saddr in connect() > > The following sequence : > > socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) > > connect(fd, {sa_family=AF_INET, sin_port=htons(xx), > sin_addr=inet_addr("1.2.3.4")}, 28) > > 1) Does an implicit inet_autobind() > (using an INADDR_ANY address, and selecting a random port). > > 2) Does an ip4_datagram_connect() to specify the address/port of > remote end point. > > Problem is ip4_datagram_connect() also sets inet->inet_rcv_saddr (from > INADDR_ANY to IP source address, given the current route to remote end > point). Only the first connect() on the socket does this. Following ones > dont change the (possibly wrong) source address. > > This breaks the secondary UDP hash, based on (ADDRESS, port), that was > computed by inet_autobind(). If more than 10 sockets are linked in > primary hash chain, we can drop incoming packets because searches are > done in wrong secondary hash chain. > > This also potentially breaks multiple connect() to change remote > endpoints, because old source address might be non usable for packets to > new destination. > > If route happens to change, then we should automatically change our > source address too, at next sendmsg() call, and UDP code deals with this > just fine. > > If an application needs to specify a precise source address, it must use > bind() system call. connect() man page only refers to remote address, > not local one. > > Reported-by: Krzysztof Olędzki <ole@ans.pl> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Best regards, Krzysztof Olędzki -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/07/2010 05:35 PM, Eric Dumazet wrote: > Problem is ip4_datagram_connect() also sets inet->inet_rcv_saddr (from > INADDR_ANY to IP source address, given the current route to remote end > point). Only the first connect() on the socket does this. Following ones > dont change the (possibly wrong) source address. > > This breaks the secondary UDP hash, based on (ADDRESS, port), that was > computed by inet_autobind(). If more than 10 sockets are linked in > primary hash chain, we can drop incoming packets because searches are > done in wrong secondary hash chain. I can't confirm this since I don't have 10 sockets in my primary hash chain at the moment... > This also potentially breaks multiple connect() to change remote > endpoints, because old source address might be non usable for packets to > new destination. > > If route happens to change, then we should automatically change our > source address too, at next sendmsg() call, and UDP code deals with this > just fine. > > If an application needs to specify a precise source address, it must use > bind() system call. connect() man page only refers to remote address, > not local one. Is this really the right thing to do? Linux has been doing this forever, right? Just like BSD has done it forever. The way I've always "cleared" a local address is to set the address family to AF_UNSPEC on the next connect() call, as mentioned on the man page. I just want to make sure we're not changing something just to work around a broken application, sendto()/sendmsg() work perfect in this case by not setting the local address. BTW, it seems as though the reason this might only happen sometimes is that if the first connect() is to 127.0.0.1, you won't be able to then try and connect to say, 192.168.1.1. If you first connect() to a remote address things will probably just work. My possibly wrong $.02 -Brian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Brian Haley <brian.haley@hp.com> Date: Tue, 07 Sep 2010 22:34:59 -0400 > Is this really the right thing to do? Linux has been doing this forever, > right? Just like BSD has done it forever. Indeed, I checked for this in Stevens volume 2 when I reviewed Eric's patch, the logic is identical. > The way I've always "cleared" a local address is to set the address > family to AF_UNSPEC on the next connect() call, as mentioned on the > man page. I just want to make sure we're not changing something > just to work around a broken application, sendto()/sendmsg() work > perfect in this case by not setting the local address. > > BTW, it seems as though the reason this might only happen sometimes is > that if the first connect() is to 127.0.0.1, you won't be able to then > try and connect to say, 192.168.1.1. If you first connect() to a remote > address things will probably just work. Actually I have enough doubts about this too. So I'm going to hold off on the patch until we sort this out. As Eric stated, sendmsg() on raw and UDP sockets pick the source address based upon the bound or looked up route anyways. But still I think something still might end up being not-kosher with Eric's change. More so, I suspect, the issue is simply that the ordering of events is simply inconvenient for the secondary hash implementation :-) The port bind happens before the source address is bound, and that is what screws everything up. In this sense leaving the source addresses at INADDR_ANY is just a workaround for the secondary hash table :-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 07 septembre 2010 à 20:34 -0700, David Miller a écrit : > From: Brian Haley <brian.haley@hp.com> > Date: Tue, 07 Sep 2010 22:34:59 -0400 > > > Is this really the right thing to do? Linux has been doing this forever, > > right? Just like BSD has done it forever. > > Indeed, I checked for this in Stevens volume 2 when I reviewed > Eric's patch, the logic is identical. > > > The way I've always "cleared" a local address is to set the address > > family to AF_UNSPEC on the next connect() call, as mentioned on the > > man page. I just want to make sure we're not changing something > > just to work around a broken application, sendto()/sendmsg() work > > perfect in this case by not setting the local address. > > Problem is AF_UNSPEC always clears the remote address (as stated in manual), and sometimes local one (as not stated) if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK)) inet_reset_saddr(sk); This is the workaround that was coded years ago in Linux to undo the local addr setting ;) Following program produces this output : local addr=0x7f000001 sin_port=37877 after connect(AF_UNSPEC) local addr=0x0 sin_port=0 local addr=0x7f000001 sin_port=37877 after connect(AF_UNSPEC) local addr=0x7f000001 sin_port=0 #include <stdlib.h> #include <unistd.h> #include <sys/socket.h> #include <arpa/inet.h> #include <string.h> #include <stdio.h> int main(int argc, char *argv[]) { int fd; struct sockaddr_in target, me; socklen_t len; if ((fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1) { perror("socket"); return 1; } memset(&target, 0, sizeof(target)); target.sin_family = AF_INET; target.sin_port = 123; target.sin_addr.s_addr = htonl(0x7f000001); if (connect(fd, (const struct sockaddr *)&target, sizeof(target)) == -1) { perror("connect"); close(fd); return 2; } len = sizeof(me); if (getsockname(fd, (struct sockaddr *)&me, &len)== -1) { perror("getsockname"); close(fd); return 3; } printf("local addr=0x%x sin_port=%u\n", ntohl(me.sin_addr.s_addr), ntohs(me.sin_port)); memset(&target, 0, sizeof(target)); target.sin_family = AF_UNSPEC; if (connect(fd, (const struct sockaddr *)&target, sizeof(target)) == -1) { perror("connect AF_UNSPEC"); close(fd); return 4; } len = sizeof(me); if (getsockname(fd, (struct sockaddr *)&me, &len)== -1) { perror("getsockname 2"); close(fd); return 3; } printf("after connect(AF_UNSPEC) local addr=0x%x sin_port=%u\n", ntohl(me.sin_addr.s_addr), ntohs(me.sin_port)); memset(&target, 0, sizeof(target)); target.sin_family = AF_INET; target.sin_addr.s_addr = htonl(0x7f000001); if (bind(fd, (const struct sockaddr *)&target, sizeof(target)) == -1) { perror("bind"); close(fd); return 2; } len = sizeof(me); if (getsockname(fd, (struct sockaddr *)&me, &len)== -1) { perror("getsockname"); close(fd); return 3; } printf("local addr=0x%x sin_port=%u\n", ntohl(me.sin_addr.s_addr), ntohs(me.sin_port)); memset(&target, 0, sizeof(target)); target.sin_family = AF_UNSPEC; if (connect(fd, (const struct sockaddr *)&target, sizeof(target)) == -1) { perror("connect AF_UNSPEC"); close(fd); return 4; } len = sizeof(me); if (getsockname(fd, (struct sockaddr *)&me, &len)== -1) { perror("getsockname 2"); close(fd); return 3; } printf("after connect(AF_UNSPEC) local addr=0x%x sin_port=%u\n", ntohl(me.sin_addr.s_addr), ntohs(me.sin_port)); return 0; } -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 07 septembre 2010 à 22:34 -0400, Brian Haley a écrit : > Is this really the right thing to do? Linux has been doing this forever, > right? Just like BSD has done it forever. The way I've always "cleared" > a local address is to set the address family to AF_UNSPEC on the next > connect() call, as mentioned on the man page. I just want to make sure > we're not changing something just to work around a broken application, > sendto()/sendmsg() work perfect in this case by not setting the local address. > > BTW, it seems as though the reason this might only happen sometimes is > that if the first connect() is to 127.0.0.1, you won't be able to then > try and connect to say, 192.168.1.1. If you first connect() to a remote > address things will probably just work. I believe we have the following choice : 1) connect(AF_UNIX) sets the remote address/port bind() sets the local port (and optionally address) connect(AF_UNSPEC) clears remote addess/port, let local address/port unchanged 2) Correct UDP hashing, when local address changes from 0 to x.y.z.t (cost : two locks taken), and a possible packet drop during the operation. Document that connect() also sets local address, and that before doing a second connect() to change remote address, its mandatory to first issue a connect(AF_UNSPEC) to clear local address (if not locked by a prior bind() call) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 08 Sep 2010 06:57:37 +0200 > Document that connect() also sets local address, and that before > doing a second connect() to change remote address, its mandatory to > first issue a connect(AF_UNSPEC) to clear local address (if not locked > by a prior bind() call) For connectionless sockets, the application may connect() as many times as it wishes to change the remote address. The local address remains set if it were set before such a re-associating connect(). It need only issue a connect(AF_UNSPEC) to make the socket have no remote association, and as you state this operation will also wipe out any local address settings not created by a bind() call. And nicely our man pages are very clear about this :-) as is BSD and Steven's volume 2. This has been legal for decades, so we have to keep working this way. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 08 Sep 2010 06:42:45 +0200 > Problem is AF_UNSPEC always clears the remote address (as stated in > manual), and sometimes local one (as not stated) > > if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK)) > inet_reset_saddr(sk); > > This is the workaround that was coded years ago in Linux to undo the > local addr setting ;) Ok, I mis-remembered about AF_UNSPEC, I thought BSD did it too, it does not. But we really offer the most flexibility here because we offer this. In BSD case, connect() on an already connect()'d UDP socket unconditionally zaps the local address setting (because, as Stevens states, that local address "might" have been created by a previous connect()). This forces the application to re-bind() the local address if it wants to use something other than what connect() is going to choose. We leave the local settings alone for normal connect() on already connect()'d sockets, and provide connect(AF_UNSPEC) to explicitly kill connect() created local and remote associations. This allows bind() settings to survive through multiple connect() calls if the user wants, a facility BSD does not provide. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 07 septembre 2010 à 22:36 -0700, David Miller a écrit : > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Wed, 08 Sep 2010 06:57:37 +0200 > > > Document that connect() also sets local address, and that before > > doing a second connect() to change remote address, its mandatory to > > first issue a connect(AF_UNSPEC) to clear local address (if not locked > > by a prior bind() call) > > For connectionless sockets, the application may connect() as many > times as it wishes to change the remote address. The local address > remains set if it were set before such a re-associating connect(). > > It need only issue a connect(AF_UNSPEC) to make the socket have no > remote association, and as you state this operation will also wipe out > any local address settings not created by a bind() call. > > And nicely our man pages are very clear about this :-) as is BSD and > Steven's volume 2. > > This has been legal for decades, so we have to keep working this way. Yes, its also buggy, if 2nd remote address is not reachable on same interface. Even if we try a connect(AF_UNSPEC), the local address stay as is : after bind(port 5555) local addr=0x0:5555 after connect(123) local addr=0x7f000001:5555 remote addr=0x7f000001:123 Could not connect, errno=22 after connect(AF_UNSPEC) local addr=0x7f000001:5555 connect: Invalid argument I'll work on UDP fix anyway. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mercredi 08 septembre 2010 à 07:52 +0200, Eric Dumazet a écrit : > Le mardi 07 septembre 2010 à 22:36 -0700, David Miller a écrit : > > From: Eric Dumazet <eric.dumazet@gmail.com> > > Date: Wed, 08 Sep 2010 06:57:37 +0200 > > > > > Document that connect() also sets local address, and that before > > > doing a second connect() to change remote address, its mandatory to > > > first issue a connect(AF_UNSPEC) to clear local address (if not locked > > > by a prior bind() call) > > > > For connectionless sockets, the application may connect() as many > > times as it wishes to change the remote address. The local address > > remains set if it were set before such a re-associating connect(). > > > > It need only issue a connect(AF_UNSPEC) to make the socket have no > > remote association, and as you state this operation will also wipe out > > any local address settings not created by a bind() call. > > > > And nicely our man pages are very clear about this :-) as is BSD and > > Steven's volume 2. > > > > This has been legal for decades, so we have to keep working this way. > > Yes, its also buggy, if 2nd remote address is not reachable on same interface. > Even if we try a connect(AF_UNSPEC), the local address stay as is : > > after bind(port 5555) local addr=0x0:5555 > after connect(123) local addr=0x7f000001:5555 remote addr=0x7f000001:123 > Could not connect, errno=22 > after connect(AF_UNSPEC) local addr=0x7f000001:5555 > connect: Invalid argument > I run the program on FreeBSD 8.1, and this OS does change the source address at connect() time. It also change it each time connect() is called, not only once. fd = socket() connect(fd, "127.0.0.1:ntp") system("netstat -p udp | grep udp") connect(fd, "192.168.20.110:ntp") system("netstat -p udp | grep udp") -> udp4 0 0 127.0.0.1.35974 127.0.0.1.ntp udp4 0 0 192.168.20.80.35974 192.168.20.110.ntp while on Linux we refuse the second connect() -> EINVAL (Because no route can be found from 127.0.0.1 to 192.168.20.110) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c index f055094..656dc73 100644 --- a/net/ipv4/datagram.c +++ b/net/ipv4/datagram.c @@ -60,10 +60,13 @@ int ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) ip_rt_put(rt); return -EACCES; } - if (!inet->inet_saddr) - inet->inet_saddr = rt->rt_src; /* Update source address */ - if (!inet->inet_rcv_saddr) - inet->inet_rcv_saddr = rt->rt_src; + /* + * Should connect() change inet_rcv_saddr / inet_saddr ? + * It should not, because we want to specify the peer to which + * datagrams are to be sent, regardless of our source address that + * might change in the future, after a route change. + * To specify our source address, bind() is the right API. + */ inet->inet_daddr = rt->rt_dst; inet->inet_dport = usin->sin_port; sk->sk_state = TCP_ESTABLISHED;