Message ID | 20130520103017.054ae605@nehalam.linuxnetplumber.net |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
> From: Stephen Hemminger <stephen@networkplumber.org> > \This commit 823aa873bc782f1c51b1ce8ec6da7cfcaf93836e > Author: stephen hemminger <stephen@networkplumber.org> > Date: Sat Apr 27 11:31:57 2013 +0000 > > vxlan: allow choosing destination port per vxlan > > is broken revert it. The change allowed setting per port for transmit > but did not add additional listening sockets, which made any vxlan's > defined with non-default port send only. This allows you to specify a different default port for transmits, which is what you want to do if your own instance of VXLAN is the odd one. I don't see any requirement for multiple listen ports for that to be useful, since those sending to you can have complete fdb tables even if the local instance doesn't and relies on the default. Not to mention using an agent to fill the fdb triggered by packets sent to the default, so the receiver is not necessarily even a VXLAN instance. The receiver side and transmit side ports can be completely independent of each other, as in any other client-server system. I think an administrator should have full flexibility to specify the ports and destinations as s/he sees fit and if you don't think that's a useful feature, you don't have to use it; the defaults work fine, too. +-DLS -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 20 May 2013 14:15:59 -0400 David Stevens <dlstevens@us.ibm.com> wrote: > > From: Stephen Hemminger <stephen@networkplumber.org> > > > \This commit 823aa873bc782f1c51b1ce8ec6da7cfcaf93836e > > Author: stephen hemminger <stephen@networkplumber.org> > > Date: Sat Apr 27 11:31:57 2013 +0000 > > > > vxlan: allow choosing destination port per vxlan > > > > is broken revert it. The change allowed setting per port for transmit > > but did not add additional listening sockets, which made any vxlan's > > defined with non-default port send only. > > This allows you to specify a different default port for > transmits, which is what you want to do if your own instance > of VXLAN is the odd one. I don't see any requirement for multiple > listen ports for that to be useful, since those sending to you > can have complete fdb tables even if the local instance doesn't > and relies on the default. Not to mention using an agent to > fill the fdb triggered by packets sent to the default, so the > receiver is not necessarily even a VXLAN instance. The receiver > side and transmit side ports can be completely independent of > each other, as in any other client-server system. Vxlan's are a weird beast. They can be viewed as either bridge like entities or tunnel like entities. I view them more as bridge type devices where user configures two hosts with equivalent values and they learn about each other. In that case the code in 3.10 is broken; but the version with the learning in net-next works. Your view is that VXLAN's are more like tunnels, where each host has static entries to know about every other host. In that mode, 3.10 is useable, but the same effect can be had by defining static neighbour entries. Both views are valid, but I don't want the behaviour to change from 3.10 to 3.11, since 3.10 is not released yet, it makes more sense to go back and remove IFLA_VXLAN_PORT from 3.10 and make it work like 3.9. > I think an administrator should have full flexibility to specify > the ports and destinations as s/he sees fit and if you don't think > that's a useful feature, you don't have to use it; the defaults > work fine, too. They can do that with neighbour entries. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 5/20/2013 11:30 AM, Stephen Hemminger wrote: > On Mon, 20 May 2013 14:15:59 -0400 > David Stevens <dlstevens@us.ibm.com> wrote: > >>> From: Stephen Hemminger <stephen@networkplumber.org> >> >>> \This commit 823aa873bc782f1c51b1ce8ec6da7cfcaf93836e >>> Author: stephen hemminger <stephen@networkplumber.org> >>> Date: Sat Apr 27 11:31:57 2013 +0000 >>> >>> vxlan: allow choosing destination port per vxlan >>> >>> is broken revert it. The change allowed setting per port for transmit >>> but did not add additional listening sockets, which made any vxlan's >>> defined with non-default port send only. >> This allows you to specify a different default port for >> transmits, which is what you want to do if your own instance >> of VXLAN is the odd one. I don't see any requirement for multiple >> listen ports for that to be useful, since those sending to you >> can have complete fdb tables even if the local instance doesn't >> and relies on the default. Not to mention using an agent to >> fill the fdb triggered by packets sent to the default, so the >> receiver is not necessarily even a VXLAN instance. The receiver >> side and transmit side ports can be completely independent of >> each other, as in any other client-server system. > Vxlan's are a weird beast. They can be viewed as either bridge like > entities or tunnel like entities. I view them more as bridge type > devices where user configures two hosts with equivalent values and > they learn about each other. In that case the code in 3.10 is broken; > but the version with the learning in net-next works. > > Your view is that VXLAN's are more like tunnels, where each host > has static entries to know about every other host. In that mode, > 3.10 is useable, but the same effect can be had by defining static > neighbour entries. how can we send to a different dst port using static neighbor entries? Thanks Sridhar -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 20 May 2013 16:59:44 -0700 Sridhar Samudrala <samudrala.sridhar@gmail.com> wrote: > On 5/20/2013 11:30 AM, Stephen Hemminger wrote: > > On Mon, 20 May 2013 14:15:59 -0400 > > David Stevens <dlstevens@us.ibm.com> wrote: > > > >>> From: Stephen Hemminger <stephen@networkplumber.org> > >> > >>> \This commit 823aa873bc782f1c51b1ce8ec6da7cfcaf93836e > >>> Author: stephen hemminger <stephen@networkplumber.org> > >>> Date: Sat Apr 27 11:31:57 2013 +0000 > >>> > >>> vxlan: allow choosing destination port per vxlan > >>> > >>> is broken revert it. The change allowed setting per port for transmit > >>> but did not add additional listening sockets, which made any vxlan's > >>> defined with non-default port send only. > >> This allows you to specify a different default port for > >> transmits, which is what you want to do if your own instance > >> of VXLAN is the odd one. I don't see any requirement for multiple > >> listen ports for that to be useful, since those sending to you > >> can have complete fdb tables even if the local instance doesn't > >> and relies on the default. Not to mention using an agent to > >> fill the fdb triggered by packets sent to the default, so the > >> receiver is not necessarily even a VXLAN instance. The receiver > >> side and transmit side ports can be completely independent of > >> each other, as in any other client-server system. > > Vxlan's are a weird beast. They can be viewed as either bridge like > > entities or tunnel like entities. I view them more as bridge type > > devices where user configures two hosts with equivalent values and > > they learn about each other. In that case the code in 3.10 is broken; > > but the version with the learning in net-next works. > > > > Your view is that VXLAN's are more like tunnels, where each host > > has static entries to know about every other host. In that mode, > > 3.10 is useable, but the same effect can be had by defining static > > neighbour entries. > > how can we send to a different dst port using static neighbor entries? > > Thanks > Sridhar It is part of this commit in 3.10 already. commit 6681712d67eef14c4ce793561c3231659153a320 Author: David Stevens <dlstevens@us.ibm.com> Date: Fri Mar 15 04:35:51 2013 +0000 vxlan: generalize forwarding tables This patch generalizes VXLAN forwarding table entries allowing an administrator to: 1) specify multiple destinations for a given MAC 2) specify alternate vni's in the VXLAN header 3) specify alternate destination UDP ports 4) use multicast MAC addresses as fdb lookup keys 5) specify multicast destinations 6) specify the outgoing interface for forwarded packets The combination allows configuration of more complex topologies using VXLAN encapsulation. Changes since v1: rebase to 3.9.0-rc2 Signed-Off-By: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 5/20/2013 5:13 PM, Stephen Hemminger wrote: > On Mon, 20 May 2013 16:59:44 -0700 > Sridhar Samudrala <samudrala.sridhar@gmail.com> wrote: > >> On 5/20/2013 11:30 AM, Stephen Hemminger wrote: >>> On Mon, 20 May 2013 14:15:59 -0400 >>> David Stevens <dlstevens@us.ibm.com> wrote: >>> >>>>> From: Stephen Hemminger <stephen@networkplumber.org> >>>> >>>>> \This commit 823aa873bc782f1c51b1ce8ec6da7cfcaf93836e >>>>> Author: stephen hemminger <stephen@networkplumber.org> >>>>> Date: Sat Apr 27 11:31:57 2013 +0000 >>>>> >>>>> vxlan: allow choosing destination port per vxlan >>>>> >>>>> is broken revert it. The change allowed setting per port for transmit >>>>> but did not add additional listening sockets, which made any vxlan's >>>>> defined with non-default port send only. >>>> This allows you to specify a different default port for >>>> transmits, which is what you want to do if your own instance >>>> of VXLAN is the odd one. I don't see any requirement for multiple >>>> listen ports for that to be useful, since those sending to you >>>> can have complete fdb tables even if the local instance doesn't >>>> and relies on the default. Not to mention using an agent to >>>> fill the fdb triggered by packets sent to the default, so the >>>> receiver is not necessarily even a VXLAN instance. The receiver >>>> side and transmit side ports can be completely independent of >>>> each other, as in any other client-server system. >>> Vxlan's are a weird beast. They can be viewed as either bridge like >>> entities or tunnel like entities. I view them more as bridge type >>> devices where user configures two hosts with equivalent values and >>> they learn about each other. In that case the code in 3.10 is broken; >>> but the version with the learning in net-next works. >>> >>> Your view is that VXLAN's are more like tunnels, where each host >>> has static entries to know about every other host. In that mode, >>> 3.10 is useable, but the same effect can be had by defining static >>> neighbour entries. >> how can we send to a different dst port using static neighbor entries? >> >> Thanks >> Sridhar > It is part of this commit in 3.10 already. > > commit 6681712d67eef14c4ce793561c3231659153a320 > Author: David Stevens <dlstevens@us.ibm.com> > Date: Fri Mar 15 04:35:51 2013 +0000 > > vxlan: generalize forwarding tables > > This patch generalizes VXLAN forwarding table entries allowing an administrator > to: > 1) specify multiple destinations for a given MAC > 2) specify alternate vni's in the VXLAN header > 3) specify alternate destination UDP ports > 4) use multicast MAC addresses as fdb lookup keys > 5) specify multicast destinations > 6) specify the outgoing interface for forwarded packets > > OK. I am aware of specifying dst port via static fdb(forwarding table) entries. I was confused when you said neighbor entries. Thanks Sridhar -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Stephen Hemminger <stephen@networkplumber.org> Date: Mon, 20 May 2013 11:30:00 -0700 > On Mon, 20 May 2013 14:15:59 -0400 > David Stevens <dlstevens@us.ibm.com> wrote: > >> I think an administrator should have full flexibility to specify >> the ports and destinations as s/he sees fit and if you don't think >> that's a useful feature, you don't have to use it; the defaults >> work fine, too. > > They can do that with neighbour entries. David, please come to some kind of agreement with Stephen about what we're going to do about this. Thank you. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller <davem@davemloft.net> wrote on 05/22/2013 06:08:30 PM: > David, please come to some kind of agreement with Stephen about what > we're going to do about this. Well, I think being able to specify an alternate port for the default fdb entry is a useful feature in its own right. It is not equivalent to individual fdb entries with alternate ports because those require the host to know all the destinations in advance. It's certainly reasonable to use an alternate port for the default entry as well, and is not "useless" without multiple listen ports; it is as useful for the default port as it is for individual fdb entries. I'm not sure what the benefit of reverting any working feature is, only to re-add it later, which I believe is what Stephen is proposing. I think it is useful for some configurations now, even if it only becomes useful for other configurations after multiple listen ports. Stephen, can you explain what the benefit of reverting this is? I think we want it long term, but I can live without it temporarily. I just don't see any reasoning for removing it temporarily in the first place. It isn't broken, or breaking anything else, after all. +-DLS -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 22 May 2013 19:18:12 -0400 David Stevens <dlstevens@us.ibm.com> wrote: > David Miller <davem@davemloft.net> wrote on 05/22/2013 06:08:30 PM: > > > David, please come to some kind of agreement with Stephen about what > > we're going to do about this. > > Well, I think being able to specify an alternate port for the > default fdb entry is a useful feature in its own right. It is > not equivalent to individual fdb entries with alternate ports > because those require the host to know all the destinations in > advance. It's certainly reasonable to use an alternate port for > the default entry as well, and is not "useless" without multiple > listen ports; it is as useful for the default port as it is for > individual fdb entries. > > I'm not sure what the benefit of reverting any working feature > is, only to re-add it later, which I believe is what Stephen > is proposing. I think it is useful for some configurations now, > even if it only becomes useful for other configurations after > multiple listen ports. > > Stephen, can you explain what the benefit of reverting this is? > > I think we want it long term, but I can live without it > temporarily. I just don't see any reasoning for removing it > temporarily in the first place. It isn't broken, or breaking > anything else, after all. > > +-DLS > There are two different config things here 1. FDB - NDA_PORT 2. VXLAN - IFLA_VXLAN_PORT The FDB stuff is seperate and fine, Dave did a complete job on that. I was confused, but it's fine. The per-vxlan device stuff is the issue. I don't think it works as is in 3.10 because there is no real way to use it to receive. And don't want to let it out broken. By reverting that part, we avoid raising false expectations. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Stephen Hemminger <stephen@networkplumber.org> wrote on 05/22/2013 08:39:54 PM: > The per-vxlan device stuff is the issue. I don't think it works > as is in 3.10 because there is no real way to use it to receive. > And don't want to let it out broken. By reverting that part, we avoid > raising false expectations. > Of course you can use it to receive; the remote hosts can be listening on a different port and this is the only mechanism to send to those if there are not fdb entries on the local machine. So, for example, host A: binds to 1234 has no fdb entries, but sets the default remote port to 4789 host B binds to 4789 has no fdb entries and sets the default remote port to 1234 ...or you have an agent listening on a multicast address with port 5429 and the agent's job is to dynamically populate the fdb tables of any hosts that don't have matching entries. All the VXLAN instances bind to, say 4789, and the agent binds to port 5429. The agent receives a packet, and adds an entry to the fdb table on the sending host where the destination is the actual VXLAN receiver with port 4789. But the default port on all of them for this kind of set-up is correctly the different 5429, not 4789. These configurations have nothing to do with multiple listen ports, and reverting the patch prohibits an admin from doing it at all. In the first set up, you can link two segments on different ports without knowing any destination MAC addresses. In the second, you can separate traffic not matching any destination MAC from traffic with a known destination, and use that to fill forwarding tables. I know these configurations don't match your sense of how VXLAN should be used, but these are not "broken", do receives perfectly fine, and have no dependency at all on multiple listen ports. I don't know what "false expectations" you think are there-- I think anyone changing the default destination port to a different value from the binding port simply expects that transmits using the default will go to one port and received packets will go to another. Deciding in advance that that is never appropriate is short-sighted; that decision belongs with the admin setting it up. Reverting the patch removes that choice. +-DLS -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 22 May 2013 22:14:39 -0400 David Stevens <dlstevens@us.ibm.com> wrote: > Stephen Hemminger <stephen@networkplumber.org> wrote on 05/22/2013 > 08:39:54 PM: > > > The per-vxlan device stuff is the issue. I don't think it works > > as is in 3.10 because there is no real way to use it to receive. > > And don't want to let it out broken. By reverting that part, we avoid > > raising false expectations. > > > > Of course you can use it to receive; the remote hosts can be > listening on a different port and this is the only mechanism > to send to those if there are not fdb entries on the local machine. > > So, for example, host A: > binds to 1234 > has no fdb entries, but sets the default remote port to 4789 > > host B > binds to 4789 > has no fdb entries and sets the default remote port to 1234 > > ...or you have an agent listening on a multicast address with port 5429 > and the agent's job is to dynamically populate the fdb tables of any > hosts that don't have matching entries. All the VXLAN instances bind > to, say 4789, and the agent binds to port 5429. The agent receives a > packet, and adds an entry to the fdb table on the sending host where > the destination is the actual VXLAN receiver with port 4789. But the > default port on all of them for this kind of set-up is correctly the > different 5429, not 4789. > > These configurations have nothing to do with multiple listen ports, > and reverting the patch prohibits an admin from doing it at all. > In the first set up, you can link two segments on different ports > without knowing any destination MAC addresses. In the second, you > can separate traffic not matching any destination MAC from traffic > with a known destination, and use that to fill forwarding tables. > > I know these configurations don't match your sense of how VXLAN > should be used, but these are not "broken", do receives perfectly fine, > and have no dependency at all on multiple listen ports. I don't know > what "false expectations" you think are there-- I think anyone > changing the default destination port to a different value from > the binding port simply expects that transmits using the default > will go to one port and received packets will go to another. Deciding > in advance that that is never appropriate is short-sighted; that > decision belongs with the admin setting it up. Reverting the patch > removes that choice. > > > +-DLS > Let's go back to basic use cases. User wants to setup a vxlan between two hosts with custom port On both hosts, he/she does: # ip li add vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 dstport 4789 With net-next it works, with 3.10 it doesn't. Sure it works if they have your agent, or manually configure N-M static FDB entries, but that is corner case. No matter if Dave decides to the revert the kernel part or not, I am pulling the option from 3.10 version of iproute to avoid user confusion. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Stephen Hemminger <stephen@networkplumber.org> wrote on 05/23/2013 01:08:54 PM: > Let's go back to basic use cases. > User wants to setup a vxlan between two hosts with custom port > On both hosts, he/she does: > # ip li add vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 dstport 4789 > > With net-next it works, with 3.10 it doesn't. This does not "work" in net-next, either. If you set the one-and-only bind port to one port and the one and only send port to another port on both systems, they can't talk to each other in net-next, either. If you separately have it listen on multiple ports, and if potential non-linux systems you're talking to can do that too, then you can configure it to work, but the extra listen ports are entirely unnecessary. You're starting with the assumption, which I disagree with, that the receive and send ports need to be the same. In fact, for two systems listening on different ports to talk to each other, they *must* be different. It doesn't require multiple ports, it only requires the listen and send ports be *different*. If I have two existing systems, one listening on port 8472 and one listening on port 4789, and I want to connect them, how would you do that? Yes, you *can* add individual MAC entries with different ports for all VMs connected on both machines, but if you have 2 hosts and 10,000 MAC addresses, that's a ridiculous waste. If you simply don't know all the MAC addresses on the remote machine, you can't do it. But if you can set the destination port for the default fdb entry, which is what dstport does, then you can do both. The dstport and dst are simply the default entry for the fdb table, used if there is no exact match in the fdb. They should have every capability to use alternate data that the fdb table does. > Sure it works if they have your agent, or manually configure N-M static FDB > entries, but that is corner case. > > No matter if Dave decides to the revert the kernel part or not, I am > pulling the option > from 3.10 version of iproute to avoid user confusion. I don't know what confusion you're concerned about here. How can anyone be confused by packets going to a different destination port when they use an option "dstport XXXX"? (!?!?!?!?) You can't really prohibit people from using VXLAN however they choose to. It simply means people wanting to do more interesting things will have to patch your code, jump through hoops to edit packets with netfilter, or use alternative implementations that are more flexible. None of those things benefits VXLAN on linux. +-DLS -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 23 May 2013 14:45:51 -0400 David Stevens <dlstevens@us.ibm.com> wrote: > Stephen Hemminger <stephen@networkplumber.org> wrote on 05/23/2013 > 01:08:54 PM: > > > Let's go back to basic use cases. > > User wants to setup a vxlan between two hosts with custom port > > On both hosts, he/she does: > > # ip li add vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 dstport > 4789 > > > > With net-next it works, with 3.10 it doesn't. > > This does not "work" in net-next, either. If you set the one-and-only bind > port to one port and the one and only send port to another port on both > systems, they can't talk to each other in net-next, either. If you > separately > have it listen on multiple ports, and if potential non-linux systems > you're > talking to can do that too, then you can configure it to work, but the > extra > listen ports are entirely unnecessary. With the patch davem already included, the dstport is enough to add additional listener. > > You're starting with the assumption, which I disagree with, that the > receive > and send ports need to be the same. In fact, for two systems listening on > different ports to talk to each other, they *must* be different. It > doesn't > require multiple ports, it only requires the listen and send ports be > *different*. Another use case is: # ip link add vxlan-old type vxlan id 42 group 239.1.1.1 dev eth0 # ip link add vxlan-new type vxlan id 42 group 239.1.1.1 dev eth0 dstport 4789 This also works with net-next and not with 3.10 > > If I have two existing systems, one listening on port 8472 and one > listening > on port 4789, and I want to connect them, how would you do that? You can either do: A. Define two vxlan's and route or bridge them B. Define one vxlan with a destination port and add exception FDB entry to talk to the outlier Yes, you > *can* add individual MAC entries with different ports for all VMs > connected > on both machines, but if you have 2 hosts and 10,000 MAC addresses, that's > a ridiculous waste. If you simply don't know all the MAC addresses on the > remote machine, you can't do it. But if you can set the destination port > for > the default fdb entry, which is what dstport does, then you can do both. User's may reasonably want to define multiple overlay VXLAN's each on a different port, or create one VXLAN and add excptions. > > The dstport and dst are simply the default entry for the fdb table, used > if > there is no exact match in the fdb. They should have every capability to > use > alternate data that the fdb table does. > > > Sure it works if they have your agent, or manually configure N-M static > FDB > > entries, but that is corner case. > > > > No matter if Dave decides to the revert the kernel part or not, I am > > pulling the option > > from 3.10 version of iproute to avoid user confusion. > > I don't know what confusion you're concerned about here. How can anyone > be confused by packets going to a different destination port when they use > an option "dstport XXXX"? (!?!?!?!?) The dstport option is not in a released kernel or iproute. > > You can't really prohibit people from using VXLAN however they choose to. > It simply means people wanting to do more interesting things will > have to patch your code, jump through hoops to edit packets with > netfilter, > or use alternative implementations that are more flexible. None of those > things benefits VXLAN on linux. > > +-DLS > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Stephen Hemminger <stephen@networkplumber.org> wrote on 05/23/2013 03:18:04 PM: > With the patch davem already included, the dstport is enough > to add additional listener. If you're saying that using the dstport changes the listen port, or adds another listen port, then I think that behaviour is wrong and should be reverted. An admin should be able to specify the source and destination ports independently of each other. If dstport has a side-effect that is unrelated to changing the destination port, that's what I'd call "confusing." IMHO, "port" should change the listen port (only) and "dstport" should change the send port (only). And yes, both of those should allow multiple ports, and destinations. So, binding should be a list of the form: "[IP:]port[,[IP:]port]*" and destinations should be the same as in the fdb, allowing multiple destinations and different ports, and different vni's. It should be simply a "default" fdb entry in all respects. > Another use case is: > # ip link add vxlan-old type vxlan id 42 group 239.1.1.1 dev eth0 > # ip link add vxlan-new type vxlan id 42 group 239.1.1.1 dev eth0 > dstport 4789 > > This also works with net-next and not with 3.10 Yes, I wholeheartedly approve of this feature. It simply is not the only use case for wanting different destination ports, so is not linked to the dstport feature, in my opinion. > > If I have two existing systems, one listening on port 8472 and one > > listening > > on port 4789, and I want to connect them, how would you do that? > > You can either do: > A. Define two vxlan's and route or bridge them > B. Define one vxlan with a destination port and add exception FDB entry > to talk to the outlier So, you're requiring an extra bind port (and the specific port the remote device is using may not be available on the local machine) and an extra hop through a bridge, when all you really want as an admin is to say "listen on this port, send on that port" without that also meaning you have to know and specify the MAC address of every possible remote VM. > User's may reasonably want to define multiple overlay VXLAN's each > on a different port, or create one VXLAN and add excptions. Sure. I think multiple listen ports is a good idea, but it is not the only reason for wanting a different destination port, and not a reason to link destination and listen ports. > The dstport option is not in a released kernel or iproute. I don't think we're going to get consensus here. If you intend the artificial requirement that the bind and listen ports be the same, and that admins need to jump through hoops to get the reasonable effect of making them different, then I simply disagree with you. +-DLS -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 5/23/2013 1:06 PM, David Stevens wrote: > Stephen Hemminger <stephen@networkplumber.org> wrote on 05/23/2013 > 03:18:04 PM: > > >> With the patch davem already included, the dstport is enough >> to add additional listener. > If you're saying that using the dstport changes the > listen port, or adds another listen port, then I think that > behaviour is wrong and should be reverted. I agree that using 'dstport' option to also create a socket and binding to that port for receives is confusing. As the name suggests, it should only be used as a default dst port for fdb entries. > An admin should be able to specify the source and destination > ports independently of each other. If dstport has a side-effect that > is unrelated to changing the destination port, that's what I'd call > "confusing." > IMHO, "port" should change the listen port (only) and "dstport" > should change the send port (only). And yes, both of those should allow > multiple ports, and destinations. So, binding should be a list of > the form: "[IP:]port[,[IP:]port]*" and destinations should be the same > as in the fdb, allowing multiple destinations and different ports, and > different vni's. It should be simply a "default" fdb entry in all > respects. Currently 'port' option takes 2 values that indicate the range of ports that can be used as source port when sending vxlan packets. So we don't have a good way to specify listening port when creating a vxlan device using the existing options. It may be a good idea to revert dstport in linux-3.10 and multiple listening ports patch in net-next and re-implement them with 2 different options that can take a list of ports/addresses as David suggested. Thanks Sridhar -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
I'm deciding to keep the patch in, and not revert. A very concise and incredibly strong argument will need to be given to me to change my mind. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 02 Jun 2013 22:40:37 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > > I'm deciding to keep the patch in, and not revert. > > A very concise and incredibly strong argument will need to be > given to me to change my mind. > > Thanks. Ok, I decided to drop the ability to configure this in 3.10 iproute2 because the behaviour changes in 3.11 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/drivers/net/vxlan.c 2013-05-20 08:13:49.078440463 -0700 +++ b/drivers/net/vxlan.c 2013-05-20 09:31:32.752105130 -0700 @@ -110,7 +110,6 @@ struct vxlan_dev { struct net_device *dev; struct vxlan_rdst default_dst; /* default destination */ __be32 saddr; /* source address */ - __be16 dst_port; __u16 port_min; /* source port range */ __u16 port_max; __u8 tos; /* TOS override */ @@ -193,7 +192,7 @@ static int vxlan_fdb_info(struct sk_buff if (send_ip && nla_put_be32(skb, NDA_DST, rdst->remote_ip)) goto nla_put_failure; - if (rdst->remote_port && rdst->remote_port != vxlan->dst_port && + if (rdst->remote_port && rdst->remote_port != htons(vxlan_port) && nla_put_be16(skb, NDA_PORT, rdst->remote_port)) goto nla_put_failure; if (rdst->remote_vni != vxlan->default_dst.remote_vni && @@ -480,7 +479,7 @@ static int vxlan_fdb_add(struct ndmsg *n return -EINVAL; port = nla_get_be16(tb[NDA_PORT]); } else - port = vxlan->dst_port; + port = htons(vxlan_port); if (tb[NDA_VNI]) { if (nla_len(tb[NDA_VNI]) != sizeof(u32)) @@ -591,7 +590,7 @@ static void vxlan_snoop(struct net_devic err = vxlan_fdb_create(vxlan, src_mac, src_ip, NUD_REACHABLE, NLM_F_EXCL|NLM_F_CREATE, - vxlan->dst_port, + vxlan_port, vxlan->default_dst.remote_vni, 0, NTF_SELF); spin_unlock(&vxlan->hash_lock); @@ -982,7 +981,7 @@ static netdev_tx_t vxlan_xmit_one(struct __be16 df = 0; __u8 tos, ttl; - dst_port = rdst->remote_port ? rdst->remote_port : vxlan->dst_port; + dst_port = rdst->remote_port ? rdst->remote_port : htons(vxlan_port); vni = rdst->remote_vni; dst = rdst->remote_ip; @@ -1326,7 +1325,6 @@ static void vxlan_setup(struct net_devic inet_get_local_port_range(&low, &high); vxlan->port_min = low; vxlan->port_max = high; - vxlan->dst_port = htons(vxlan_port); vxlan->dev = dev; @@ -1349,7 +1347,6 @@ static const struct nla_policy vxlan_pol [IFLA_VXLAN_RSC] = { .type = NLA_U8 }, [IFLA_VXLAN_L2MISS] = { .type = NLA_U8 }, [IFLA_VXLAN_L3MISS] = { .type = NLA_U8 }, - [IFLA_VXLAN_PORT] = { .type = NLA_U16 }, }; static int vxlan_validate(struct nlattr *tb[], struct nlattr *data[]) @@ -1479,9 +1476,6 @@ static int vxlan_newlink(struct net *net vxlan->port_max = ntohs(p->high); } - if (data[IFLA_VXLAN_PORT]) - vxlan->dst_port = nla_get_be16(data[IFLA_VXLAN_PORT]); - SET_ETHTOOL_OPS(dev, &vxlan_ethtool_ops); err = register_netdevice(dev); @@ -1517,7 +1511,6 @@ static size_t vxlan_get_size(const struc nla_total_size(sizeof(__u32)) + /* IFLA_VXLAN_AGEING */ nla_total_size(sizeof(__u32)) + /* IFLA_VXLAN_LIMIT */ nla_total_size(sizeof(struct ifla_vxlan_port_range)) + - nla_total_size(sizeof(__be16))+ /* IFLA_VXLAN_PORT */ 0; } @@ -1554,8 +1547,7 @@ static int vxlan_fill_info(struct sk_buf nla_put_u8(skb, IFLA_VXLAN_L3MISS, !!(vxlan->flags & VXLAN_F_L3MISS)) || nla_put_u32(skb, IFLA_VXLAN_AGEING, vxlan->age_interval) || - nla_put_u32(skb, IFLA_VXLAN_LIMIT, vxlan->addrmax) || - nla_put_be16(skb, IFLA_VXLAN_PORT, vxlan->dst_port)) + nla_put_u32(skb, IFLA_VXLAN_LIMIT, vxlan->addrmax)) goto nla_put_failure; if (nla_put(skb, IFLA_VXLAN_PORT_RANGE, sizeof(ports), &ports))
\This commit 823aa873bc782f1c51b1ce8ec6da7cfcaf93836e Author: stephen hemminger <stephen@networkplumber.org> Date: Sat Apr 27 11:31:57 2013 +0000 vxlan: allow choosing destination port per vxlan is broken revert it. The change allowed setting per port for transmit but did not add additional listening sockets, which made any vxlan's defined with non-default port send only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- This patch needs to go -net only and not -net-next where the problem has been resolved. Not sure what the process flow is to make that happen. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html