Message ID | 1417683438-10935-2-git-send-email-jiri@resnulli.us |
---|---|
State | Changes Requested, archived |
Delegated to: | stephen hemminger |
Headers | show |
On 12/04/14 03:57, Jiri Pirko wrote: > Signed-off-by: Jiri Pirko <jiri@resnulli.us> > --- > include/linux/if_link.h | 1 + > ip/ipaddress.c | 8 ++++++++ > 2 files changed, 9 insertions(+) > > diff --git a/include/linux/if_link.h b/include/linux/if_link.h > index 4732063..a6e2594 100644 > --- a/include/linux/if_link.h > +++ b/include/linux/if_link.h > @@ -145,6 +145,7 @@ enum { > IFLA_CARRIER, > IFLA_PHYS_PORT_ID, > IFLA_CARRIER_CHANGES, > + IFLA_PHYS_SWITCH_ID, > __IFLA_MAX > }; > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> cheers, jamal -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 04, 2014 at 09:57:13AM +0100, Jiri Pirko wrote: > Signed-off-by: Jiri Pirko <jiri@resnulli.us> > --- > include/linux/if_link.h | 1 + > ip/ipaddress.c | 8 ++++++++ > 2 files changed, 9 insertions(+) > > diff --git a/include/linux/if_link.h b/include/linux/if_link.h > index 4732063..a6e2594 100644 > --- a/include/linux/if_link.h > +++ b/include/linux/if_link.h > @@ -145,6 +145,7 @@ enum { > IFLA_CARRIER, > IFLA_PHYS_PORT_ID, > IFLA_CARRIER_CHANGES, > + IFLA_PHYS_SWITCH_ID, Serious question for Stephen et al, once we take this to iproute2 are we going to be able to change the name but the string diplayed if needed? I had a patch that called this IFLA_PARENT_ID that I was using with the older github tree used by Jiri and Scott before these were in net-next. I wanted to submit that as a change to what became 82f2841291cfaf4d225aa1766424280254d3e3b2, but was waiting for things to be accepted and the dust settled. I like the parent/child/sibling nomenclature better for 4 reasons: - Most did not seem to like the term 'offload' since that term would be confusing with, GRO, GSO, etc. - A *significant* use case for many of the high-end ASICs in datacenters is routing. - switchid does not make sense in the OVS/flow case because it is all about flows, not switches or routers, and parent made sense there. - I wanted to combine this for use with SR-IOV use case, so one can more easily map PF->VF using this. Can you give me a bit (a day) to clean-up that patch and submit an alternative proposal to these? > __IFLA_MAX > }; > > diff --git a/ip/ipaddress.c b/ip/ipaddress.c > index 4d99324..bd36a07 100644 > --- a/ip/ipaddress.c > +++ b/ip/ipaddress.c > @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, > b1, sizeof(b1))); > } > > + if (tb[IFLA_PHYS_SWITCH_ID]) { > + SPRINT_BUF(b1); > + fprintf(fp, "switchid %s ", > + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), > + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), > + b1, sizeof(b1))); > + } > + > if (tb[IFLA_OPERSTATE]) > print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); > > -- > 1.9.3 > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/4/14, 6:20 AM, Andy Gospodarek wrote: > On Thu, Dec 04, 2014 at 09:57:13AM +0100, Jiri Pirko wrote: >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >> --- >> include/linux/if_link.h | 1 + >> ip/ipaddress.c | 8 ++++++++ >> 2 files changed, 9 insertions(+) >> >> diff --git a/include/linux/if_link.h b/include/linux/if_link.h >> index 4732063..a6e2594 100644 >> --- a/include/linux/if_link.h >> +++ b/include/linux/if_link.h >> @@ -145,6 +145,7 @@ enum { >> IFLA_CARRIER, >> IFLA_PHYS_PORT_ID, >> IFLA_CARRIER_CHANGES, >> + IFLA_PHYS_SWITCH_ID, > Serious question for Stephen et al, once we take this to iproute2 are we > going to be able to change the name but the string diplayed if needed? > > I had a patch that called this IFLA_PARENT_ID that I was using with the > older github tree used by Jiri and Scott before these were in net-next. > I wanted to submit that as a change to what became > 82f2841291cfaf4d225aa1766424280254d3e3b2, but was waiting for things to > be accepted and the dust settled. > > I like the parent/child/sibling nomenclature better for 4 reasons: > > - Most did not seem to like the term 'offload' since that term would be > confusing with, GRO, GSO, etc. > - A *significant* use case for many of the high-end ASICs in datacenters > is routing. > - switchid does not make sense in the OVS/flow case because it is all > about flows, not switches or routers, and parent made sense there. > - I wanted to combine this for use with SR-IOV use case, so one can more > easily map PF->VF using this. ack.., I have voted for calling it a generic parent id in the past. > > Can you give me a bit (a day) to clean-up that patch and submit an > alternative proposal to these? > >> __IFLA_MAX >> }; >> >> diff --git a/ip/ipaddress.c b/ip/ipaddress.c >> index 4d99324..bd36a07 100644 >> --- a/ip/ipaddress.c >> +++ b/ip/ipaddress.c >> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, >> b1, sizeof(b1))); >> } >> >> + if (tb[IFLA_PHYS_SWITCH_ID]) { >> + SPRINT_BUF(b1); >> + fprintf(fp, "switchid %s ", >> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), >> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), >> + b1, sizeof(b1))); >> + } >> + >> if (tb[IFLA_OPERSTATE]) >> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); >> >> -- >> 1.9.3 >> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 03:20:15PM CET, gospo@cumulusnetworks.com wrote: >On Thu, Dec 04, 2014 at 09:57:13AM +0100, Jiri Pirko wrote: >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >> --- >> include/linux/if_link.h | 1 + >> ip/ipaddress.c | 8 ++++++++ >> 2 files changed, 9 insertions(+) >> >> diff --git a/include/linux/if_link.h b/include/linux/if_link.h >> index 4732063..a6e2594 100644 >> --- a/include/linux/if_link.h >> +++ b/include/linux/if_link.h >> @@ -145,6 +145,7 @@ enum { >> IFLA_CARRIER, >> IFLA_PHYS_PORT_ID, >> IFLA_CARRIER_CHANGES, >> + IFLA_PHYS_SWITCH_ID, > >Serious question for Stephen et al, once we take this to iproute2 are we >going to be able to change the name but the string diplayed if needed? > >I had a patch that called this IFLA_PARENT_ID that I was using with the >older github tree used by Jiri and Scott before these were in net-next. >I wanted to submit that as a change to what became >82f2841291cfaf4d225aa1766424280254d3e3b2, but was waiting for things to >be accepted and the dust settled. > >I like the parent/child/sibling nomenclature better for 4 reasons: > >- Most did not seem to like the term 'offload' since that term would be > confusing with, GRO, GSO, etc. >- A *significant* use case for many of the high-end ASICs in datacenters > is routing. >- switchid does not make sense in the OVS/flow case because it is all > about flows, not switches or routers, and parent made sense there. well ovs is all about flows and has the "switch" word in the name. I believe that people are talking about "switches" in case of these "flow devices" as well. I see nothing wrong in that. I think that "switch" became generic name for "packet forwarding machines". "parent" is very generic and may mean 100 things... >- I wanted to combine this for use with SR-IOV use case, so one can more > easily map PF->VF using this. Ugh, please don't mix this up with pf, vf. That is completely different thing. pf vf mapping is done in sysfs. In netlink, physportid is used for that purpose. We can expose this phys port id for pf as well (as a different attr) and we are done. > >Can you give me a bit (a day) to clean-up that patch and submit an >alternative proposal to these? > >> __IFLA_MAX >> }; >> >> diff --git a/ip/ipaddress.c b/ip/ipaddress.c >> index 4d99324..bd36a07 100644 >> --- a/ip/ipaddress.c >> +++ b/ip/ipaddress.c >> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, >> b1, sizeof(b1))); >> } >> >> + if (tb[IFLA_PHYS_SWITCH_ID]) { >> + SPRINT_BUF(b1); >> + fprintf(fp, "switchid %s ", >> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), >> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), >> + b1, sizeof(b1))); >> + } >> + >> if (tb[IFLA_OPERSTATE]) >> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); >> >> -- >> 1.9.3 >> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 04, 2014 at 03:33:06PM +0100, Jiri Pirko wrote: > Thu, Dec 04, 2014 at 03:20:15PM CET, gospo@cumulusnetworks.com wrote: > >On Thu, Dec 04, 2014 at 09:57:13AM +0100, Jiri Pirko wrote: > >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> > >> --- > >> include/linux/if_link.h | 1 + > >> ip/ipaddress.c | 8 ++++++++ > >> 2 files changed, 9 insertions(+) > >> > >> diff --git a/include/linux/if_link.h b/include/linux/if_link.h > >> index 4732063..a6e2594 100644 > >> --- a/include/linux/if_link.h > >> +++ b/include/linux/if_link.h > >> @@ -145,6 +145,7 @@ enum { > >> IFLA_CARRIER, > >> IFLA_PHYS_PORT_ID, > >> IFLA_CARRIER_CHANGES, > >> + IFLA_PHYS_SWITCH_ID, > > > >Serious question for Stephen et al, once we take this to iproute2 are we > >going to be able to change the name but the string diplayed if needed? > > > >I had a patch that called this IFLA_PARENT_ID that I was using with the > >older github tree used by Jiri and Scott before these were in net-next. > >I wanted to submit that as a change to what became > >82f2841291cfaf4d225aa1766424280254d3e3b2, but was waiting for things to > >be accepted and the dust settled. > > > >I like the parent/child/sibling nomenclature better for 4 reasons: > > > >- Most did not seem to like the term 'offload' since that term would be > > confusing with, GRO, GSO, etc. > >- A *significant* use case for many of the high-end ASICs in datacenters > > is routing. > >- switchid does not make sense in the OVS/flow case because it is all > > about flows, not switches or routers, and parent made sense there. > > well ovs is all about flows and has the "switch" word in the name. I > believe that people are talking about "switches" in case of these "flow > devices" as well. I see nothing wrong in that. I think that "switch" > became generic name for "packet forwarding machines". Just because one chose to use it that way does not mean I agree with it and that we should copy their bad decision. :-) > "parent" is very generic and may mean 100 things... But in this case, it means 1 thing. The netdev you are using is connected to another device that controls it rather than being just a NIC. > > > >- I wanted to combine this for use with SR-IOV use case, so one can more > > easily map PF->VF using this. > > Ugh, please don't mix this up with pf, vf. That is completely different > thing. > > pf vf mapping is done in sysfs. In netlink, physportid is used for that > purpose. We can expose this phys port id for pf as well (as a different > attr) and we are done. I know that attribute is there, but I find it more valuable for solutions like nPAR than for PF/VF use-case. Parent/child relationship makes more sense to me since for forwarding will be controlled by the embedded switch on those devices. (Notice I specifically chose not to use 'master' since that is already overloaded by bridge, bonding, teaming, etc.) > > > > >Can you give me a bit (a day) to clean-up that patch and submit an > >alternative proposal to these? > > > >> __IFLA_MAX > >> }; > >> > >> diff --git a/ip/ipaddress.c b/ip/ipaddress.c > >> index 4d99324..bd36a07 100644 > >> --- a/ip/ipaddress.c > >> +++ b/ip/ipaddress.c > >> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, > >> b1, sizeof(b1))); > >> } > >> > >> + if (tb[IFLA_PHYS_SWITCH_ID]) { > >> + SPRINT_BUF(b1); > >> + fprintf(fp, "switchid %s ", > >> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), > >> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), > >> + b1, sizeof(b1))); > >> + } > >> + > >> if (tb[IFLA_OPERSTATE]) > >> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); > >> > >> -- > >> 1.9.3 > >> > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 03:57:43PM CET, gospo@cumulusnetworks.com wrote: >On Thu, Dec 04, 2014 at 03:33:06PM +0100, Jiri Pirko wrote: >> Thu, Dec 04, 2014 at 03:20:15PM CET, gospo@cumulusnetworks.com wrote: >> >On Thu, Dec 04, 2014 at 09:57:13AM +0100, Jiri Pirko wrote: >> >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >> >> --- >> >> include/linux/if_link.h | 1 + >> >> ip/ipaddress.c | 8 ++++++++ >> >> 2 files changed, 9 insertions(+) >> >> >> >> diff --git a/include/linux/if_link.h b/include/linux/if_link.h >> >> index 4732063..a6e2594 100644 >> >> --- a/include/linux/if_link.h >> >> +++ b/include/linux/if_link.h >> >> @@ -145,6 +145,7 @@ enum { >> >> IFLA_CARRIER, >> >> IFLA_PHYS_PORT_ID, >> >> IFLA_CARRIER_CHANGES, >> >> + IFLA_PHYS_SWITCH_ID, >> > >> >Serious question for Stephen et al, once we take this to iproute2 are we >> >going to be able to change the name but the string diplayed if needed? >> > >> >I had a patch that called this IFLA_PARENT_ID that I was using with the >> >older github tree used by Jiri and Scott before these were in net-next. >> >I wanted to submit that as a change to what became >> >82f2841291cfaf4d225aa1766424280254d3e3b2, but was waiting for things to >> >be accepted and the dust settled. >> > >> >I like the parent/child/sibling nomenclature better for 4 reasons: >> > >> >- Most did not seem to like the term 'offload' since that term would be >> > confusing with, GRO, GSO, etc. >> >- A *significant* use case for many of the high-end ASICs in datacenters >> > is routing. >> >- switchid does not make sense in the OVS/flow case because it is all >> > about flows, not switches or routers, and parent made sense there. >> >> well ovs is all about flows and has the "switch" word in the name. I >> believe that people are talking about "switches" in case of these "flow >> devices" as well. I see nothing wrong in that. I think that "switch" >> became generic name for "packet forwarding machines". > >Just because one chose to use it that way does not mean I agree with it >and that we should copy their bad decision. :-) > >> "parent" is very generic and may mean 100 things... > >But in this case, it means 1 thing. The netdev you are using is >connected to another device that controls it rather than being just a >NIC. when we discussed linkage now called upper/lower devices, there was a proposal to use "parent" as well. And then it ment only 1 thing as well. Why for example pf is not parent of vf? Using "parent" word here and assume that it just makes sense for this case is IMHO wrong. Too generic. "switch_parent" - that I am okay with. > >> >> >> >- I wanted to combine this for use with SR-IOV use case, so one can more >> > easily map PF->VF using this. >> >> Ugh, please don't mix this up with pf, vf. That is completely different >> thing. >> >> pf vf mapping is done in sysfs. In netlink, physportid is used for that >> purpose. We can expose this phys port id for pf as well (as a different >> attr) and we are done. > >I know that attribute is there, but I find it more valuable for >solutions like nPAR than for PF/VF use-case. Why it is not good enough for pf/vf? There are SR-IOV drivers using this: $ git grep ndo_get_phys_port_id drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c: .ndo_get_phys_port_id = bnx2x_get_phys_port_id, drivers/net/ethernet/intel/i40e/i40e_main.c: .ndo_get_phys_port_id = i40e_get_phys_port_id, drivers/net/ethernet/mellanox/mlx4/en_netdev.c: .ndo_get_phys_port_id = mlx4_en_get_phys_port_id, drivers/net/ethernet/mellanox/mlx4/en_netdev.c: .ndo_get_phys_port_id = mlx4_en_get_phys_port_id, drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c: .ndo_get_phys_port_id = qlcnic_get_phys_port_id, >Parent/child relationship >makes more sense to me since for forwarding will be controlled by the >embedded switch on those devices. (Notice I specifically chose not to >use 'master' since that is already overloaded by bridge, bonding, >teaming, etc.) But is that embedded switch the same thing as PF? I doubt that. Let the aspect of PF/VF aside and look at it as on ordinary switch. The eswitch driver needs te be implemented and add netdevs for internal ports (they do not exist now). You have to look at PF/VF and eSwitch as on separate things. > >> >> > >> >Can you give me a bit (a day) to clean-up that patch and submit an >> >alternative proposal to these? >> > >> >> __IFLA_MAX >> >> }; >> >> >> >> diff --git a/ip/ipaddress.c b/ip/ipaddress.c >> >> index 4d99324..bd36a07 100644 >> >> --- a/ip/ipaddress.c >> >> +++ b/ip/ipaddress.c >> >> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, >> >> b1, sizeof(b1))); >> >> } >> >> >> >> + if (tb[IFLA_PHYS_SWITCH_ID]) { >> >> + SPRINT_BUF(b1); >> >> + fprintf(fp, "switchid %s ", >> >> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), >> >> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), >> >> + b1, sizeof(b1))); >> >> + } >> >> + >> >> if (tb[IFLA_OPERSTATE]) >> >> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); >> >> >> >> -- >> >> 1.9.3 >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >-- >To unsubscribe from this list: send the line "unsubscribe netdev" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 03:57:43PM CET, gospo@cumulusnetworks.com wrote: >On Thu, Dec 04, 2014 at 03:33:06PM +0100, Jiri Pirko wrote: >> Thu, Dec 04, 2014 at 03:20:15PM CET, gospo@cumulusnetworks.com wrote: >> >On Thu, Dec 04, 2014 at 09:57:13AM +0100, Jiri Pirko wrote: >> >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >> >> --- >> >> include/linux/if_link.h | 1 + >> >> ip/ipaddress.c | 8 ++++++++ >> >> 2 files changed, 9 insertions(+) >> >> >> >> diff --git a/include/linux/if_link.h b/include/linux/if_link.h >> >> index 4732063..a6e2594 100644 >> >> --- a/include/linux/if_link.h >> >> +++ b/include/linux/if_link.h >> >> @@ -145,6 +145,7 @@ enum { >> >> IFLA_CARRIER, >> >> IFLA_PHYS_PORT_ID, >> >> IFLA_CARRIER_CHANGES, >> >> + IFLA_PHYS_SWITCH_ID, >> > >> >Serious question for Stephen et al, once we take this to iproute2 are we >> >going to be able to change the name but the string diplayed if needed? >> > >> >I had a patch that called this IFLA_PARENT_ID that I was using with the >> >older github tree used by Jiri and Scott before these were in net-next. >> >I wanted to submit that as a change to what became >> >82f2841291cfaf4d225aa1766424280254d3e3b2, but was waiting for things to >> >be accepted and the dust settled. >> > >> >I like the parent/child/sibling nomenclature better for 4 reasons: >> > >> >- Most did not seem to like the term 'offload' since that term would be >> > confusing with, GRO, GSO, etc. >> >- A *significant* use case for many of the high-end ASICs in datacenters >> > is routing. >> >- switchid does not make sense in the OVS/flow case because it is all >> > about flows, not switches or routers, and parent made sense there. >> >> well ovs is all about flows and has the "switch" word in the name. I >> believe that people are talking about "switches" in case of these "flow >> devices" as well. I see nothing wrong in that. I think that "switch" >> became generic name for "packet forwarding machines". > >Just because one chose to use it that way does not mean I agree with it >and that we should copy their bad decision. :-) > >> "parent" is very generic and may mean 100 things... > >But in this case, it means 1 thing. The netdev you are using is >connected to another device that controls it rather than being just a >NIC. > >> >> >> >- I wanted to combine this for use with SR-IOV use case, so one can more >> > easily map PF->VF using this. >> >> Ugh, please don't mix this up with pf, vf. That is completely different >> thing. >> >> pf vf mapping is done in sysfs. In netlink, physportid is used for that >> purpose. We can expose this phys port id for pf as well (as a different >> attr) and we are done. > >I know that attribute is there, but I find it more valuable for >solutions like nPAR than for PF/VF use-case. Parent/child relationship >makes more sense to me since for forwarding will be controlled by the >embedded switch on those devices. (Notice I specifically chose not to see, "embedded switch", not "embedded parent" or "embedded offload". And yes, the "embedded switch" also may (and some of them do today) work with flows. >use 'master' since that is already overloaded by bridge, bonding, >teaming, etc.) > >> >> > >> >Can you give me a bit (a day) to clean-up that patch and submit an >> >alternative proposal to these? >> > >> >> __IFLA_MAX >> >> }; >> >> >> >> diff --git a/ip/ipaddress.c b/ip/ipaddress.c >> >> index 4d99324..bd36a07 100644 >> >> --- a/ip/ipaddress.c >> >> +++ b/ip/ipaddress.c >> >> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, >> >> b1, sizeof(b1))); >> >> } >> >> >> >> + if (tb[IFLA_PHYS_SWITCH_ID]) { >> >> + SPRINT_BUF(b1); >> >> + fprintf(fp, "switchid %s ", >> >> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), >> >> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), >> >> + b1, sizeof(b1))); >> >> + } >> >> + >> >> if (tb[IFLA_OPERSTATE]) >> >> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); >> >> >> >> -- >> >> 1.9.3 >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >-- >To unsubscribe from this list: send the line "unsubscribe netdev" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 04, 2014 at 04:15:12PM +0100, Jiri Pirko wrote: > Thu, Dec 04, 2014 at 03:57:43PM CET, gospo@cumulusnetworks.com wrote: > >On Thu, Dec 04, 2014 at 03:33:06PM +0100, Jiri Pirko wrote: > >> Thu, Dec 04, 2014 at 03:20:15PM CET, gospo@cumulusnetworks.com wrote: > >> >On Thu, Dec 04, 2014 at 09:57:13AM +0100, Jiri Pirko wrote: > >> >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> > >> >> --- > >> >> include/linux/if_link.h | 1 + > >> >> ip/ipaddress.c | 8 ++++++++ > >> >> 2 files changed, 9 insertions(+) > >> >> > >> >> diff --git a/include/linux/if_link.h b/include/linux/if_link.h > >> >> index 4732063..a6e2594 100644 > >> >> --- a/include/linux/if_link.h > >> >> +++ b/include/linux/if_link.h > >> >> @@ -145,6 +145,7 @@ enum { > >> >> IFLA_CARRIER, > >> >> IFLA_PHYS_PORT_ID, > >> >> IFLA_CARRIER_CHANGES, > >> >> + IFLA_PHYS_SWITCH_ID, > >> > > >> >Serious question for Stephen et al, once we take this to iproute2 are we > >> >going to be able to change the name but the string diplayed if needed? > >> > > >> >I had a patch that called this IFLA_PARENT_ID that I was using with the > >> >older github tree used by Jiri and Scott before these were in net-next. > >> >I wanted to submit that as a change to what became > >> >82f2841291cfaf4d225aa1766424280254d3e3b2, but was waiting for things to > >> >be accepted and the dust settled. > >> > > >> >I like the parent/child/sibling nomenclature better for 4 reasons: > >> > > >> >- Most did not seem to like the term 'offload' since that term would be > >> > confusing with, GRO, GSO, etc. > >> >- A *significant* use case for many of the high-end ASICs in datacenters > >> > is routing. > >> >- switchid does not make sense in the OVS/flow case because it is all > >> > about flows, not switches or routers, and parent made sense there. > >> > >> well ovs is all about flows and has the "switch" word in the name. I > >> believe that people are talking about "switches" in case of these "flow > >> devices" as well. I see nothing wrong in that. I think that "switch" > >> became generic name for "packet forwarding machines". > > > >Just because one chose to use it that way does not mean I agree with it > >and that we should copy their bad decision. :-) > > > >> "parent" is very generic and may mean 100 things... > > > >But in this case, it means 1 thing. The netdev you are using is > >connected to another device that controls it rather than being just a > >NIC. > > > >> > >> > >> >- I wanted to combine this for use with SR-IOV use case, so one can more > >> > easily map PF->VF using this. > >> > >> Ugh, please don't mix this up with pf, vf. That is completely different > >> thing. > >> > >> pf vf mapping is done in sysfs. In netlink, physportid is used for that > >> purpose. We can expose this phys port id for pf as well (as a different > >> attr) and we are done. > > > >I know that attribute is there, but I find it more valuable for > >solutions like nPAR than for PF/VF use-case. Parent/child relationship > >makes more sense to me since for forwarding will be controlled by the > >embedded switch on those devices. (Notice I specifically chose not to > > see, "embedded switch", not "embedded parent" or "embedded offload". And > yes, the "embedded switch" also may (and some of them do today) work > with flows. It's fine to exclude the PF/VF part for now since you seem disagree with that as a valid reason. The other 3 reasons are in my mind, so that was why I asked to Stephen would mind waiting for me to post my set. I'm not asking for you to do anything here but to be patient and let me get the set out. > > > >use 'master' since that is already overloaded by bridge, bonding, > >teaming, etc.) > > > >> > >> > > >> >Can you give me a bit (a day) to clean-up that patch and submit an > >> >alternative proposal to these? > >> > > >> >> __IFLA_MAX > >> >> }; > >> >> > >> >> diff --git a/ip/ipaddress.c b/ip/ipaddress.c > >> >> index 4d99324..bd36a07 100644 > >> >> --- a/ip/ipaddress.c > >> >> +++ b/ip/ipaddress.c > >> >> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, > >> >> b1, sizeof(b1))); > >> >> } > >> >> > >> >> + if (tb[IFLA_PHYS_SWITCH_ID]) { > >> >> + SPRINT_BUF(b1); > >> >> + fprintf(fp, "switchid %s ", > >> >> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), > >> >> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), > >> >> + b1, sizeof(b1))); > >> >> + } > >> >> + > >> >> if (tb[IFLA_OPERSTATE]) > >> >> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); > >> >> > >> >> -- > >> >> 1.9.3 > >> >> > >> -- > >> To unsubscribe from this list: send the line "unsubscribe netdev" in > >> the body of a message to majordomo@vger.kernel.org > >> More majordomo info at http://vger.kernel.org/majordomo-info.html > >-- > >To unsubscribe from this list: send the line "unsubscribe netdev" in > >the body of a message to majordomo@vger.kernel.org > >More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/04/14 at 09:57am, Andy Gospodarek wrote: > I know that attribute is there, but I find it more valuable for > solutions like nPAR than for PF/VF use-case. Parent/child relationship > makes more sense to me since for forwarding will be controlled by the > embedded switch on those devices. (Notice I specifically chose not to > use 'master' since that is already overloaded by bridge, bonding, > teaming, etc.) So that would make it link, master, and parent. Not sure we can make it any more confusing ;-) TBH, I'm really struggling on how to best name this. Trying to reach for 100% correctness leads to generic, amgbigous names. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jiri Pirko <jiri@resnulli.us> writes: Would someone please explain to me what a switch id is? I looked in the kernel source, and I looked here and while I know switches I don't have a clue what a switch id is. My primary concern at this point is that you have introduced a global identifier that is isn't a hardware property (it certainly does not look like a mac address) and that is unique across network namespaces and thus breaks checkpoint/restart (aka CRIU). Also what in the world does PHYS mean in IFLA_PHYS_SWITCH_ID? Does that mean we can't have a purely software implementation of this interface? Given that we will want a software implementation at some point including PHYS in the name seems completely wrong. > Signed-off-by: Jiri Pirko <jiri@resnulli.us> > --- > include/linux/if_link.h | 1 + > ip/ipaddress.c | 8 ++++++++ > 2 files changed, 9 insertions(+) > > diff --git a/include/linux/if_link.h b/include/linux/if_link.h > index 4732063..a6e2594 100644 > --- a/include/linux/if_link.h > +++ b/include/linux/if_link.h > @@ -145,6 +145,7 @@ enum { > IFLA_CARRIER, > IFLA_PHYS_PORT_ID, > IFLA_CARRIER_CHANGES, > + IFLA_PHYS_SWITCH_ID, > __IFLA_MAX > }; > > diff --git a/ip/ipaddress.c b/ip/ipaddress.c > index 4d99324..bd36a07 100644 > --- a/ip/ipaddress.c > +++ b/ip/ipaddress.c > @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, > b1, sizeof(b1))); > } > > + if (tb[IFLA_PHYS_SWITCH_ID]) { > + SPRINT_BUF(b1); > + fprintf(fp, "switchid %s ", > + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), > + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), > + b1, sizeof(b1))); > + } > + > if (tb[IFLA_OPERSTATE]) > print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 05:15:04PM CET, ebiederm@xmission.com wrote: >Jiri Pirko <jiri@resnulli.us> writes: > >Would someone please explain to me what a switch id is? > >I looked in the kernel source, and I looked here and while I know >switches I don't have a clue what a switch id is. > >My primary concern at this point is that you have introduced a global >identifier that is isn't a hardware property (it certainly does not look >like a mac address) and that is unique across network namespaces and >thus breaks checkpoint/restart (aka CRIU). IFLA_PHYS_SWITCH_ID is very similar to IFLA_PHYS_PORT_ID. It is generated by the driver and ensures that there is the same switch id for all ports belonging to the same switch chip/asic. It is up to the driver how to implement the id. I would like to point you to driver implementing ndo_get_phys_port_id > >Also what in the world does PHYS mean in IFLA_PHYS_SWITCH_ID? Does that >mean we can't have a purely software implementation of this interface? >Given that we will want a software implementation at some point >including PHYS in the name seems completely wrong. We can remove the "PHYS", no problem. I do not understand what you say about "software implementation". The point is to allow hw switch/ish chips to be supported. > >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >> --- >> include/linux/if_link.h | 1 + >> ip/ipaddress.c | 8 ++++++++ >> 2 files changed, 9 insertions(+) >> >> diff --git a/include/linux/if_link.h b/include/linux/if_link.h >> index 4732063..a6e2594 100644 >> --- a/include/linux/if_link.h >> +++ b/include/linux/if_link.h >> @@ -145,6 +145,7 @@ enum { >> IFLA_CARRIER, >> IFLA_PHYS_PORT_ID, >> IFLA_CARRIER_CHANGES, >> + IFLA_PHYS_SWITCH_ID, >> __IFLA_MAX >> }; >> >> diff --git a/ip/ipaddress.c b/ip/ipaddress.c >> index 4d99324..bd36a07 100644 >> --- a/ip/ipaddress.c >> +++ b/ip/ipaddress.c >> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, >> b1, sizeof(b1))); >> } >> >> + if (tb[IFLA_PHYS_SWITCH_ID]) { >> + SPRINT_BUF(b1); >> + fprintf(fp, "switchid %s ", >> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), >> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), >> + b1, sizeof(b1))); >> + } >> + >> if (tb[IFLA_OPERSTATE]) >> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jiri Pirko <jiri@resnulli.us> writes: > Thu, Dec 04, 2014 at 05:15:04PM CET, ebiederm@xmission.com wrote: >>Jiri Pirko <jiri@resnulli.us> writes: >> >>Would someone please explain to me what a switch id is? >> >>I looked in the kernel source, and I looked here and while I know >>switches I don't have a clue what a switch id is. >> >>My primary concern at this point is that you have introduced a global >>identifier that is isn't a hardware property (it certainly does not look >>like a mac address) and that is unique across network namespaces and >>thus breaks checkpoint/restart (aka CRIU). > > IFLA_PHYS_SWITCH_ID is very similar to IFLA_PHYS_PORT_ID. It is > generated by the driver and ensures that there is the same switch id for > all ports belonging to the same switch chip/asic. It is up to the driver > how to implement the id. I would like to point you to driver > implementing ndo_get_phys_port_id Looking at ndo_get_phys_port_id it is just the per port mac address. Or guid in the case of infiniband. Which really makes me wonder why we didn't use the same abstractions in the code for address types that we do for hardware addresses. Using mac address or other hardware addresses that are used for layer 2 addressing makes sense to me. There is a long tradition of that and as I recall protocols like STP actually requiring having a different mac address per port. When I asked the question I thought the switch id was going to be something like the ifindex, the software index of a network device. Finally having tracked down the rocker implementation of rocker_port_switch_parent_id_get I see it you are reading some 64bit hardware register. Which leads me to ask what are the semantics of switch_id? Is the switch id an identifier with a prefix from IEEE and assigned by the manufacture so that it is guaranteed to the tolerances of the manufacturing process to be globally unique? Is the switch id a random number that is statistically likely to be globally unique because you have enough bits? As I recall you need at least 128 bits to have a reasonable chance of a random number avoiding the birthday paradox. Do we need some kind of manufacturer id to tell one switch id from another? Is the switch id persistent across reboots? >>Also what in the world does PHYS mean in IFLA_PHYS_SWITCH_ID? Does that >>mean we can't have a purely software implementation of this interface? >>Given that we will want a software implementation at some point >>including PHYS in the name seems completely wrong. > > We can remove the "PHYS", no problem. I do not understand what you say > about "software implementation". The point is to allow hw switch/ish > chips to be supported. If we are talking about something typically stored in a eeprom like a mac address phys seems appropriate. Still having a definition of this switch id clean clear enough that net/bridge and drivers/net/macvlan can implement it seems important. Even more important is having a definition of switch id clear enough that userspace can use the switch id to do something useful. Right now switch id looks like one of those weird one manufacturer properties that is fine to expose as a driver specific property but I don't yet see it being a generic property I that can be used usefully in userspace. So can we please get some clear semantics or failing that can we please not expose this to userspace as generic property. Thanks, Eric >>> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >>> --- >>> include/linux/if_link.h | 1 + >>> ip/ipaddress.c | 8 ++++++++ >>> 2 files changed, 9 insertions(+) >>> >>> diff --git a/include/linux/if_link.h b/include/linux/if_link.h >>> index 4732063..a6e2594 100644 >>> --- a/include/linux/if_link.h >>> +++ b/include/linux/if_link.h >>> @@ -145,6 +145,7 @@ enum { >>> IFLA_CARRIER, >>> IFLA_PHYS_PORT_ID, >>> IFLA_CARRIER_CHANGES, >>> + IFLA_PHYS_SWITCH_ID, >>> __IFLA_MAX >>> }; >>> >>> diff --git a/ip/ipaddress.c b/ip/ipaddress.c >>> index 4d99324..bd36a07 100644 >>> --- a/ip/ipaddress.c >>> +++ b/ip/ipaddress.c >>> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, >>> b1, sizeof(b1))); >>> } >>> >>> + if (tb[IFLA_PHYS_SWITCH_ID]) { >>> + SPRINT_BUF(b1); >>> + fprintf(fp, "switchid %s ", >>> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), >>> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), >>> + b1, sizeof(b1))); >>> + } >>> + >>> if (tb[IFLA_OPERSTATE]) >>> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/4/14, 9:52 AM, Eric W. Biederman wrote: > Jiri Pirko <jiri@resnulli.us> writes: > >> Thu, Dec 04, 2014 at 05:15:04PM CET, ebiederm@xmission.com wrote: >>> Jiri Pirko <jiri@resnulli.us> writes: >>> >>> Would someone please explain to me what a switch id is? >>> >>> I looked in the kernel source, and I looked here and while I know >>> switches I don't have a clue what a switch id is. >>> >>> My primary concern at this point is that you have introduced a global >>> identifier that is isn't a hardware property (it certainly does not look >>> like a mac address) and that is unique across network namespaces and >>> thus breaks checkpoint/restart (aka CRIU). >> IFLA_PHYS_SWITCH_ID is very similar to IFLA_PHYS_PORT_ID. It is >> generated by the driver and ensures that there is the same switch id for >> all ports belonging to the same switch chip/asic. It is up to the driver >> how to implement the id. I would like to point you to driver >> implementing ndo_get_phys_port_id > Looking at ndo_get_phys_port_id it is just the per port mac address. Or > guid in the case of infiniband. Which really makes me wonder why we > didn't use the same abstractions in the code for address types that we > do for hardware addresses. > > Using mac address or other hardware addresses that are used for layer 2 > addressing makes sense to me. There is a long tradition of that and as > I recall protocols like STP actually requiring having a different mac > address per port. > > When I asked the question I thought the switch id was going to be > something like the ifindex, the software index of a network device. > > > Finally having tracked down the rocker implementation of > rocker_port_switch_parent_id_get I see it you are reading some 64bit > hardware register. > > Which leads me to ask what are the semantics of switch_id? > > Is the switch id an identifier with a prefix from IEEE and assigned by > the manufacture so that it is guaranteed to the tolerances of the > manufacturing process to be globally unique? > > Is the switch id a random number that is statistically likely to be > globally unique because you have enough bits? As I recall you need > at least 128 bits to have a reasonable chance of a random number > avoiding the birthday paradox. > > Do we need some kind of manufacturer id to tell one switch id from > another? > > Is the switch id persistent across reboots? > >>> Also what in the world does PHYS mean in IFLA_PHYS_SWITCH_ID? Does that >>> mean we can't have a purely software implementation of this interface? >>> Given that we will want a software implementation at some point >>> including PHYS in the name seems completely wrong. >> We can remove the "PHYS", no problem. I do not understand what you say >> about "software implementation". The point is to allow hw switch/ish >> chips to be supported. > If we are talking about something typically stored in a eeprom like a > mac address phys seems appropriate. > > Still having a definition of this switch id clean clear enough that > net/bridge and drivers/net/macvlan can implement it seems important. > > Even more important is having a definition of switch id clear enough > that userspace can use the switch id to do something useful. > > Right now switch id looks like one of those weird one manufacturer > properties that is fine to expose as a driver specific property > but I don't yet see it being a generic property I that can be used > usefully in userspace. > > So can we please get some clear semantics or failing that can we please > not expose this to userspace as generic property. Agree..100%. This was my original concern as well and i have raised it earlier. Don't expose it to userspace if the semantics are still not clear. Thanks. >>>> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >>>> --- >>>> include/linux/if_link.h | 1 + >>>> ip/ipaddress.c | 8 ++++++++ >>>> 2 files changed, 9 insertions(+) >>>> >>>> diff --git a/include/linux/if_link.h b/include/linux/if_link.h >>>> index 4732063..a6e2594 100644 >>>> --- a/include/linux/if_link.h >>>> +++ b/include/linux/if_link.h >>>> @@ -145,6 +145,7 @@ enum { >>>> IFLA_CARRIER, >>>> IFLA_PHYS_PORT_ID, >>>> IFLA_CARRIER_CHANGES, >>>> + IFLA_PHYS_SWITCH_ID, >>>> __IFLA_MAX >>>> }; >>>> >>>> diff --git a/ip/ipaddress.c b/ip/ipaddress.c >>>> index 4d99324..bd36a07 100644 >>>> --- a/ip/ipaddress.c >>>> +++ b/ip/ipaddress.c >>>> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, >>>> b1, sizeof(b1))); >>>> } >>>> >>>> + if (tb[IFLA_PHYS_SWITCH_ID]) { >>>> + SPRINT_BUF(b1); >>>> + fprintf(fp, "switchid %s ", >>>> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), >>>> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), >>>> + b1, sizeof(b1))); >>>> + } >>>> + >>>> if (tb[IFLA_OPERSTATE]) >>>> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 06:52:49PM CET, ebiederm@xmission.com wrote: >Jiri Pirko <jiri@resnulli.us> writes: > >> Thu, Dec 04, 2014 at 05:15:04PM CET, ebiederm@xmission.com wrote: >>>Jiri Pirko <jiri@resnulli.us> writes: >>> >>>Would someone please explain to me what a switch id is? >>> >>>I looked in the kernel source, and I looked here and while I know >>>switches I don't have a clue what a switch id is. >>> >>>My primary concern at this point is that you have introduced a global >>>identifier that is isn't a hardware property (it certainly does not look >>>like a mac address) and that is unique across network namespaces and >>>thus breaks checkpoint/restart (aka CRIU). >> >> IFLA_PHYS_SWITCH_ID is very similar to IFLA_PHYS_PORT_ID. It is >> generated by the driver and ensures that there is the same switch id for >> all ports belonging to the same switch chip/asic. It is up to the driver >> how to implement the id. I would like to point you to driver >> implementing ndo_get_phys_port_id > >Looking at ndo_get_phys_port_id it is just the per port mac address. Or >guid in the case of infiniband. Which really makes me wonder why we >didn't use the same abstractions in the code for address types that we >do for hardware addresses. > >Using mac address or other hardware addresses that are used for layer 2 >addressing makes sense to me. There is a long tradition of that and as >I recall protocols like STP actually requiring having a different mac >address per port. > >When I asked the question I thought the switch id was going to be >something like the ifindex, the software index of a network device. > > >Finally having tracked down the rocker implementation of >rocker_port_switch_parent_id_get I see it you are reading some 64bit >hardware register. > >Which leads me to ask what are the semantics of switch_id? > >Is the switch id an identifier with a prefix from IEEE and assigned by >the manufacture so that it is guaranteed to the tolerances of the >manufacturing process to be globally unique? It is up to the driver what to use. It can use mac addr. This is same as for phys port id. > >Is the switch id a random number that is statistically likely to be >globally unique because you have enough bits? As I recall you need >at least 128 bits to have a reasonable chance of a random number >avoiding the birthday paradox. > >Do we need some kind of manufacturer id to tell one switch id from >another? > >Is the switch id persistent across reboots? Yes it is (as for phys port id). > >>>Also what in the world does PHYS mean in IFLA_PHYS_SWITCH_ID? Does that >>>mean we can't have a purely software implementation of this interface? >>>Given that we will want a software implementation at some point >>>including PHYS in the name seems completely wrong. >> >> We can remove the "PHYS", no problem. I do not understand what you say >> about "software implementation". The point is to allow hw switch/ish >> chips to be supported. > >If we are talking about something typically stored in a eeprom like a >mac address phys seems appropriate. Yes, we are. > >Still having a definition of this switch id clean clear enough that >net/bridge and drivers/net/macvlan can implement it seems important. I don't understand why would net/bridge or driver/net/macvlan want to implement this. The purpose is to group ports/netdevs which are part of the same hw switch. > >Even more important is having a definition of switch id clear enough >that userspace can use the switch id to do something useful. Userspace threats this the same it treats phys port id. if two ports/netdevs has same switch id, they belong under same hw switch. That's all. > >Right now switch id looks like one of those weird one manufacturer >properties that is fine to expose as a driver specific property >but I don't yet see it being a generic property I that can be used >usefully in userspace. > >So can we please get some clear semantics or failing that can we please >not expose this to userspace as generic property. I thought that the semantics is clean. Looks like I will have to update Documentation/networking/switchdev.txt adding some more info about this. > >Thanks, >Eric > > > >>>> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >>>> --- >>>> include/linux/if_link.h | 1 + >>>> ip/ipaddress.c | 8 ++++++++ >>>> 2 files changed, 9 insertions(+) >>>> >>>> diff --git a/include/linux/if_link.h b/include/linux/if_link.h >>>> index 4732063..a6e2594 100644 >>>> --- a/include/linux/if_link.h >>>> +++ b/include/linux/if_link.h >>>> @@ -145,6 +145,7 @@ enum { >>>> IFLA_CARRIER, >>>> IFLA_PHYS_PORT_ID, >>>> IFLA_CARRIER_CHANGES, >>>> + IFLA_PHYS_SWITCH_ID, >>>> __IFLA_MAX >>>> }; >>>> >>>> diff --git a/ip/ipaddress.c b/ip/ipaddress.c >>>> index 4d99324..bd36a07 100644 >>>> --- a/ip/ipaddress.c >>>> +++ b/ip/ipaddress.c >>>> @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, >>>> b1, sizeof(b1))); >>>> } >>>> >>>> + if (tb[IFLA_PHYS_SWITCH_ID]) { >>>> + SPRINT_BUF(b1); >>>> + fprintf(fp, "switchid %s ", >>>> + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), >>>> + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), >>>> + b1, sizeof(b1))); >>>> + } >>>> + >>>> if (tb[IFLA_OPERSTATE]) >>>> print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jiri Pirko <jiri@resnulli.us> writes: > Thu, Dec 04, 2014 at 06:52:49PM CET, ebiederm@xmission.com wrote: >>Jiri Pirko <jiri@resnulli.us> writes: >> >>> Thu, Dec 04, 2014 at 05:15:04PM CET, ebiederm@xmission.com wrote: >>>>Jiri Pirko <jiri@resnulli.us> writes: >>>> >>>>Would someone please explain to me what a switch id is? >>>> >>>>I looked in the kernel source, and I looked here and while I know >>>>switches I don't have a clue what a switch id is. >>>> >>>>My primary concern at this point is that you have introduced a global >>>>identifier that is isn't a hardware property (it certainly does not look >>>>like a mac address) and that is unique across network namespaces and >>>>thus breaks checkpoint/restart (aka CRIU). >>> >>> IFLA_PHYS_SWITCH_ID is very similar to IFLA_PHYS_PORT_ID. It is >>> generated by the driver and ensures that there is the same switch id for >>> all ports belonging to the same switch chip/asic. It is up to the driver >>> how to implement the id. I would like to point you to driver >>> implementing ndo_get_phys_port_id >> >>Looking at ndo_get_phys_port_id it is just the per port mac address. Or >>guid in the case of infiniband. Which really makes me wonder why we >>didn't use the same abstractions in the code for address types that we >>do for hardware addresses. >> >>Using mac address or other hardware addresses that are used for layer 2 >>addressing makes sense to me. There is a long tradition of that and as >>I recall protocols like STP actually requiring having a different mac >>address per port. >> >>When I asked the question I thought the switch id was going to be >>something like the ifindex, the software index of a network device. >> >> >>Finally having tracked down the rocker implementation of >>rocker_port_switch_parent_id_get I see it you are reading some 64bit >>hardware register. >> >>Which leads me to ask what are the semantics of switch_id? >> >>Is the switch id an identifier with a prefix from IEEE and assigned by >>the manufacture so that it is guaranteed to the tolerances of the >>manufacturing process to be globally unique? > > It is up to the driver what to use. It can use mac addr. This is same as > for phys port id. My reading of the code says the phys port id is the layer 2 hardware address of the port. Certainly that is the case for all of the implementations now and it would be insane for it to be anything else. >>Is the switch id a random number that is statistically likely to be >>globally unique because you have enough bits? As I recall you need >>at least 128 bits to have a reasonable chance of a random number >>avoiding the birthday paradox. >> >>Do we need some kind of manufacturer id to tell one switch id from >>another? >> >>Is the switch id persistent across reboots? > > Yes it is (as for phys port id). >>>>Also what in the world does PHYS mean in IFLA_PHYS_SWITCH_ID? Does that >>>>mean we can't have a purely software implementation of this interface? >>>>Given that we will want a software implementation at some point >>>>including PHYS in the name seems completely wrong. >>> >>> We can remove the "PHYS", no problem. I do not understand what you say >>> about "software implementation". The point is to allow hw switch/ish >>> chips to be supported. >> >>If we are talking about something typically stored in a eeprom like a >>mac address phys seems appropriate. > > Yes, we are. > >> >>Still having a definition of this switch id clean clear enough that >>net/bridge and drivers/net/macvlan can implement it seems important. > > I don't understand why would net/bridge or driver/net/macvlan want to > implement this. The purpose is to group ports/netdevs which are part of > the same hw switch. Certainly all of the macvlan devices are part of the same logical switch, and are all the same hardware. >>Even more important is having a definition of switch id clear enough >>that userspace can use the switch id to do something useful. > > Userspace threats this the same it treats phys port id. > > if two ports/netdevs has same switch id, they belong under same hw > switch. That's all. So this id needs to be globally unique? How does the rocket driver achieve global uniquness? Is the rocket driver using an IEEE assigned prefix based id like the ethernet mac address? This is an important detail as ensuring global uniqueness can take a lot of work so there needs to be a general agreement on how global uniqueness is achieved. Saying it is up to the driver how to achieve a globally unique id across every different switch driver is really insufficient. If I have two drivers that each implement an id that is 64bit long and they use two completely different methods there is a real chance of collision. So either I need a driver id that I also use in the comparison between switch ids or how a driver generates a globally unique id need to be specified. >>Right now switch id looks like one of those weird one manufacturer >>properties that is fine to expose as a driver specific property >>but I don't yet see it being a generic property I that can be used >>usefully in userspace. >> >>So can we please get some clear semantics or failing that can we please >>not expose this to userspace as generic property. > > I thought that the semantics is clean. Looks like I will have to update > Documentation/networking/switchdev.txt adding some more info about > this. The semantics as described so far will not achieve a useful id, which makes them rubbish. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 07:57:41PM CET, ebiederm@xmission.com wrote: >Jiri Pirko <jiri@resnulli.us> writes: > >> Thu, Dec 04, 2014 at 06:52:49PM CET, ebiederm@xmission.com wrote: >>>Jiri Pirko <jiri@resnulli.us> writes: >>> >>>> Thu, Dec 04, 2014 at 05:15:04PM CET, ebiederm@xmission.com wrote: >>>>>Jiri Pirko <jiri@resnulli.us> writes: >>>>> >>>>>Would someone please explain to me what a switch id is? >>>>> >>>>>I looked in the kernel source, and I looked here and while I know >>>>>switches I don't have a clue what a switch id is. >>>>> >>>>>My primary concern at this point is that you have introduced a global >>>>>identifier that is isn't a hardware property (it certainly does not look >>>>>like a mac address) and that is unique across network namespaces and >>>>>thus breaks checkpoint/restart (aka CRIU). >>>> >>>> IFLA_PHYS_SWITCH_ID is very similar to IFLA_PHYS_PORT_ID. It is >>>> generated by the driver and ensures that there is the same switch id for >>>> all ports belonging to the same switch chip/asic. It is up to the driver >>>> how to implement the id. I would like to point you to driver >>>> implementing ndo_get_phys_port_id >>> >>>Looking at ndo_get_phys_port_id it is just the per port mac address. Or >>>guid in the case of infiniband. Which really makes me wonder why we >>>didn't use the same abstractions in the code for address types that we >>>do for hardware addresses. >>> >>>Using mac address or other hardware addresses that are used for layer 2 >>>addressing makes sense to me. There is a long tradition of that and as >>>I recall protocols like STP actually requiring having a different mac >>>address per port. >>> >>>When I asked the question I thought the switch id was going to be >>>something like the ifindex, the software index of a network device. >>> >>> >>>Finally having tracked down the rocker implementation of >>>rocker_port_switch_parent_id_get I see it you are reading some 64bit >>>hardware register. >>> >>>Which leads me to ask what are the semantics of switch_id? >>> >>>Is the switch id an identifier with a prefix from IEEE and assigned by >>>the manufacture so that it is guaranteed to the tolerances of the >>>manufacturing process to be globally unique? >> >> It is up to the driver what to use. It can use mac addr. This is same as >> for phys port id. > >My reading of the code says the phys port id is the layer 2 hardware >address of the port. Certainly that is the case for all of the >implementations now and it would be insane for it to be anything else. > >>>Is the switch id a random number that is statistically likely to be >>>globally unique because you have enough bits? As I recall you need >>>at least 128 bits to have a reasonable chance of a random number >>>avoiding the birthday paradox. >>> >>>Do we need some kind of manufacturer id to tell one switch id from >>>another? >>> >>>Is the switch id persistent across reboots? >> >> Yes it is (as for phys port id). > >>>>>Also what in the world does PHYS mean in IFLA_PHYS_SWITCH_ID? Does that >>>>>mean we can't have a purely software implementation of this interface? >>>>>Given that we will want a software implementation at some point >>>>>including PHYS in the name seems completely wrong. >>>> >>>> We can remove the "PHYS", no problem. I do not understand what you say >>>> about "software implementation". The point is to allow hw switch/ish >>>> chips to be supported. >>> >>>If we are talking about something typically stored in a eeprom like a >>>mac address phys seems appropriate. >> >> Yes, we are. >> >>> >>>Still having a definition of this switch id clean clear enough that >>>net/bridge and drivers/net/macvlan can implement it seems important. >> >> I don't understand why would net/bridge or driver/net/macvlan want to >> implement this. The purpose is to group ports/netdevs which are part of >> the same hw switch. > >Certainly all of the macvlan devices are part of the same logical >switch, and are all the same hardware. > >>>Even more important is having a definition of switch id clear enough >>>that userspace can use the switch id to do something useful. >> >> Userspace threats this the same it treats phys port id. >> >> if two ports/netdevs has same switch id, they belong under same hw >> switch. That's all. > >So this id needs to be globally unique? No. It is enough to be unique within a single system. It serves for no more than to find out 2 ids are same or not, no other info value. So when the drivers uses sane ids (like mac for example, or in case of rocker an id which is passed by qemu command line), the chances of collision are very very close to none (never say never). > >How does the rocket driver achieve global uniquness? Is the rocket >driver using an IEEE assigned prefix based id like the ethernet mac >address? > >This is an important detail as ensuring global uniqueness can take a lot >of work so there needs to be a general agreement on how global >uniqueness is achieved. > >Saying it is up to the driver how to achieve a globally unique id across >every different switch driver is really insufficient. If I have two >drivers that each implement an id that is 64bit long and they use two >completely different methods there is a real chance of collision. > >So either I need a driver id that I also use in the comparison between >switch ids or how a driver generates a globally unique id need to be >specified. > >>>Right now switch id looks like one of those weird one manufacturer >>>properties that is fine to expose as a driver specific property >>>but I don't yet see it being a generic property I that can be used >>>usefully in userspace. >>> >>>So can we please get some clear semantics or failing that can we please >>>not expose this to userspace as generic property. >> >> I thought that the semantics is clean. Looks like I will have to update >> Documentation/networking/switchdev.txt adding some more info about >> this. > >The semantics as described so far will not achieve a useful id, which >makes them rubbish. > >Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jiri Pirko <jiri@resnulli.us> writes: >>So this id needs to be globally unique? > > No. It is enough to be unique within a single system. It serves for no > more than to find out 2 ids are same or not, no other info value. > > So when the drivers uses sane ids (like mac for example, or in case of > rocker an id which is passed by qemu command line), the chances of > collision are very very close to none (never say never). So the switch id isn't necessarily even as good as a manufacturers serial number that changes between different models, and anyone who uses this id must look at the driver to get uniqueness. Yes what is needed in the comparison to get uniqueness in a single system badly needs to be documented because that switch id is very much not enough on it's own. phys_port_id on the other hand is globally unique. So please stop saying that this is anything like phys_port_id. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 08:26:14PM CET, ebiederm@xmission.com wrote: >Jiri Pirko <jiri@resnulli.us> writes: > >>>So this id needs to be globally unique? >> >> No. It is enough to be unique within a single system. It serves for no >> more than to find out 2 ids are same or not, no other info value. >> >> So when the drivers uses sane ids (like mac for example, or in case of >> rocker an id which is passed by qemu command line), the chances of >> collision are very very close to none (never say never). > >So the switch id isn't necessarily even as good as a manufacturers >serial number that changes between different models, and anyone who uses >this id must look at the driver to get uniqueness. > >Yes what is needed in the comparison to get uniqueness in a single >system badly needs to be documented because that switch id is very much >not enough on it's own. > >phys_port_id on the other hand is globally unique. So please stop >saying that this is anything like phys_port_id. I'm sorry. But is is the same. And the purpose is very similar. In case of phys port id to find out if 2 netdevices share the same physical port. In case of phys switch id to find out if 2 netdevices share the same switch parent. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
ebiederm@xmission.com (Eric W. Biederman) writes: > Jiri Pirko <jiri@resnulli.us> writes: > >>>So this id needs to be globally unique? >> >> No. It is enough to be unique within a single system. It serves for no >> more than to find out 2 ids are same or not, no other info value. >> >> So when the drivers uses sane ids (like mac for example, or in case of >> rocker an id which is passed by qemu command line), the chances of >> collision are very very close to none (never say never). Thinking about what you said a little more. Two different sources of persistent numbers picking numbers by completely different algorithms can give you no assurance that you don't produce conflicts. The switch id as desisgned can not work. There are expected to be between 2**36 to 2**40 devices in this world. Your first switch id is a 64it number. At the very best by the birthday pardox predicts there will be a conflict ever 2**32 devices or between 2**4 and 2**8 devices in the world with conflicts. If the ids are not randomly distributed (which they won't be) things could easily be much much worse. That is just good enough the code could get out there and run for years before you have the nightmare of having to fix all of userspace. That is a nightmare no one needs. So please remove this broken code, and this broken concept from the kernel and go back to the drawing board. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 09:06:14PM CET, ebiederm@xmission.com wrote: >ebiederm@xmission.com (Eric W. Biederman) writes: > >> Jiri Pirko <jiri@resnulli.us> writes: >> >>>>So this id needs to be globally unique? >>> >>> No. It is enough to be unique within a single system. It serves for no >>> more than to find out 2 ids are same or not, no other info value. >>> >>> So when the drivers uses sane ids (like mac for example, or in case of >>> rocker an id which is passed by qemu command line), the chances of >>> collision are very very close to none (never say never). > >Thinking about what you said a little more. > >Two different sources of persistent numbers picking numbers by >completely different algorithms can give you no assurance that you don't >produce conflicts. > >The switch id as desisgned can not work. > >There are expected to be between 2**36 to 2**40 devices in this world. >Your first switch id is a 64it number. At the very best by the birthday >pardox predicts there will be a conflict ever 2**32 devices or between >2**4 and 2**8 devices in the world with conflicts. If the ids are not >randomly distributed (which they won't be) things could easily be much >much worse. > >That is just good enough the code could get out there and run for years >before you have the nightmare of having to fix all of userspace. That >is a nightmare no one needs. > >So please remove this broken code, and this broken concept from the >kernel and go back to the drawing board. In that case the phys port id is broken in the same way. Let's rather think about how to avoid conflicts for both. Given the fact the conflicts should be avoided only on a single baremetal, that should be doable (for (bad) example using driver name mixed with driver created id). -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jiri Pirko <jiri@resnulli.us> writes: > Thu, Dec 04, 2014 at 09:06:14PM CET, ebiederm@xmission.com wrote: >>ebiederm@xmission.com (Eric W. Biederman) writes: >> >>> Jiri Pirko <jiri@resnulli.us> writes: >>> >>>>>So this id needs to be globally unique? >>>> >>>> No. It is enough to be unique within a single system. It serves for no >>>> more than to find out 2 ids are same or not, no other info value. >>>> >>>> So when the drivers uses sane ids (like mac for example, or in case of >>>> rocker an id which is passed by qemu command line), the chances of >>>> collision are very very close to none (never say never). >> >>Thinking about what you said a little more. >> >>Two different sources of persistent numbers picking numbers by >>completely different algorithms can give you no assurance that you don't >>produce conflicts. >> >>The switch id as desisgned can not work. >> >>There are expected to be between 2**36 to 2**40 devices in this world. >>Your first switch id is a 64it number. At the very best by the birthday >>pardox predicts there will be a conflict ever 2**32 devices or between >>2**4 and 2**8 devices in the world with conflicts. If the ids are not >>randomly distributed (which they won't be) things could easily be much >>much worse. >> >>That is just good enough the code could get out there and run for years >>before you have the nightmare of having to fix all of userspace. That >>is a nightmare no one needs. >> >>So please remove this broken code, and this broken concept from the >>kernel and go back to the drawing board. > > In that case the phys port id is broken in the same way. Let's rather > think about how to avoid conflicts for both. Given the fact the > conflicts should be avoided only on a single baremetal, that should be > doable (for (bad) example using driver name mixed with driver created > id). No. phys_port_id is not broken in the same way, and phys_port_id does not have the same set of properties. phys_port_id's in practice all have an IEEE prefix that identifies the manufacturer and a manufacture assigned serial number. Aka a mac address or a EUID-64. What the mlx4 ethernet driver is doing retunring a 64bit EUID-64 I don't know. If there are problems in the worst case issues with phys_port_id are fixable by simple driver tweaks, because fundamentally we are working with globally uniuqe identifiers. Well globally unique baring manufacturing bugs in eeproms. I agree with you that the switch id concept can be saved. But I think we should fix switch id before we export it to userspace so we don't have to break userspace later. My intuition says we want something like ifindex, but I am not at all certain how switch id is planned to be used. Given that it is single box I don't expect you are sending it out over the wire. *shrug* Why does switch id need to be persistent? Why can't switch id be property like ifindex? What are the actual requirements. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thu, Dec 04, 2014 at 09:55:07PM CET, ebiederm@xmission.com wrote: >Jiri Pirko <jiri@resnulli.us> writes: > >> Thu, Dec 04, 2014 at 09:06:14PM CET, ebiederm@xmission.com wrote: >>>ebiederm@xmission.com (Eric W. Biederman) writes: >>> >>>> Jiri Pirko <jiri@resnulli.us> writes: >>>> >>>>>>So this id needs to be globally unique? >>>>> >>>>> No. It is enough to be unique within a single system. It serves for no >>>>> more than to find out 2 ids are same or not, no other info value. >>>>> >>>>> So when the drivers uses sane ids (like mac for example, or in case of >>>>> rocker an id which is passed by qemu command line), the chances of >>>>> collision are very very close to none (never say never). >>> >>>Thinking about what you said a little more. >>> >>>Two different sources of persistent numbers picking numbers by >>>completely different algorithms can give you no assurance that you don't >>>produce conflicts. >>> >>>The switch id as desisgned can not work. >>> >>>There are expected to be between 2**36 to 2**40 devices in this world. >>>Your first switch id is a 64it number. At the very best by the birthday >>>pardox predicts there will be a conflict ever 2**32 devices or between >>>2**4 and 2**8 devices in the world with conflicts. If the ids are not >>>randomly distributed (which they won't be) things could easily be much >>>much worse. >>> >>>That is just good enough the code could get out there and run for years >>>before you have the nightmare of having to fix all of userspace. That >>>is a nightmare no one needs. >>> >>>So please remove this broken code, and this broken concept from the >>>kernel and go back to the drawing board. >> >> In that case the phys port id is broken in the same way. Let's rather >> think about how to avoid conflicts for both. Given the fact the >> conflicts should be avoided only on a single baremetal, that should be >> doable (for (bad) example using driver name mixed with driver created >> id). > >No. phys_port_id is not broken in the same way, and phys_port_id does >not have the same set of properties. > >phys_port_id's in practice all have an IEEE prefix that identifies the >manufacturer and a manufacture assigned serial number. Aka a mac >address or a EUID-64. What the mlx4 ethernet driver is doing retunring >a 64bit EUID-64 I don't know. If there are problems in the worst >case issues with phys_port_id are fixable by simple driver tweaks, >because fundamentally we are working with globally uniuqe identifiers. >Well globally unique baring manufacturing bugs in eeproms. Well the fact that phys_post_id's are now implemented mostly by putting mac into it does not mean that other drivers cannot do it differently. So once again, phys_port_id and phys_switch_id are the same in this matter. > > > >I agree with you that the switch id concept can be saved. But I think >we should fix switch id before we export it to userspace so we don't >have to break userspace later. > >My intuition says we want something like ifindex, but I am not at all >certain how switch id is planned to be used. Given that it is single >box I don't expect you are sending it out over the wire. No, it is not to be send out. > >*shrug* > >Why does switch id need to be persistent? Why can't switch id be >property like ifindex? Well I can imagine that multiple ports of the same switch chip could be passed through to the virtual machines (similar to SR-IOV pf/vf). > >What are the actual requirements. They are actually very similar to phys_port_id. Therefore I made that the same. > >Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jiri Pirko <jiri@resnulli.us> writes: >>No. phys_port_id is not broken in the same way, and phys_port_id does >>not have the same set of properties. >> >>phys_port_id's in practice all have an IEEE prefix that identifies the >>manufacturer and a manufacture assigned serial number. Aka a mac >>address or a EUID-64. What the mlx4 ethernet driver is doing retunring >>a 64bit EUID-64 I don't know. If there are problems in the worst >>case issues with phys_port_id are fixable by simple driver tweaks, >>because fundamentally we are working with globally uniuqe identifiers. >>Well globally unique baring manufacturing bugs in eeproms. > > Well the fact that phys_post_id's are now implemented mostly by putting > mac into it does not mean that other drivers cannot do it differently. > So once again, phys_port_id and phys_switch_id are the same in this > matter. No exclusively implemented that way. And yes other drivers can implement bugs, that doesn't mean those bugs will be toleraged and won't break userspace. >>I agree with you that the switch id concept can be saved. But I think >>we should fix switch id before we export it to userspace so we don't >>have to break userspace later. >> >>My intuition says we want something like ifindex, but I am not at all >>certain how switch id is planned to be used. Given that it is single >>box I don't expect you are sending it out over the wire. > > No, it is not to be send out. > >> >>*shrug* >> >>Why does switch id need to be persistent? Why can't switch id be >>property like ifindex? > > Well I can imagine that multiple ports of the same switch chip could be > passed through to the virtual machines (similar to SR-IOV pf/vf). In that case you do need something that is globally unique. Because if I migrate your virtual machine onto a different physical machine seeing the same switch id and I might make the mistaken assumption that I am remain on the same switch if the ids are not global. >>What are the actual requirements. > > > They are actually very similar to phys_port_id. Therefore I made that > the same. Then please for rocket use a non-buggy implementation with a globally unique id. Given your descriptions of the requirements I can't see how any other implementation isn't buggy. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/04/14 at 10:10pm, Jiri Pirko wrote: > Thu, Dec 04, 2014 at 09:55:07PM CET, ebiederm@xmission.com wrote: > >My intuition says we want something like ifindex, but I am not at all > >certain how switch id is planned to be used. Given that it is single > >box I don't expect you are sending it out over the wire. > > No, it is not to be send out. We discussed this before and stated we would fix this as soon as it's merged. Let's just define a simple and quick switch id prefix generator ala ifindex which serves as a unique prefix to all switch ids. I have time early next week if Jiri doesn't. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric W. Biederman > Jiri Pirko <jiri@resnulli.us> writes: > > > Thu, Dec 04, 2014 at 09:06:14PM CET, ebiederm@xmission.com wrote: > >>ebiederm@xmission.com (Eric W. Biederman) writes: > >> > >>> Jiri Pirko <jiri@resnulli.us> writes: > >>> > >>>>>So this id needs to be globally unique? > >>>> > >>>> No. It is enough to be unique within a single system. It serves for no > >>>> more than to find out 2 ids are same or not, no other info value. > >>>> > >>>> So when the drivers uses sane ids (like mac for example, or in case of > >>>> rocker an id which is passed by qemu command line), the chances of > >>>> collision are very very close to none (never say never). > >> > >>Thinking about what you said a little more. > >> > >>Two different sources of persistent numbers picking numbers by > >>completely different algorithms can give you no assurance that you don't > >>produce conflicts. > >> > >>The switch id as desisgned can not work. > >> > >>There are expected to be between 2**36 to 2**40 devices in this world. > >>Your first switch id is a 64it number. At the very best by the birthday > >>pardox predicts there will be a conflict ever 2**32 devices or between > >>2**4 and 2**8 devices in the world with conflicts. If the ids are not > >>randomly distributed (which they won't be) things could easily be much > >>much worse. > >> > >>That is just good enough the code could get out there and run for years > >>before you have the nightmare of having to fix all of userspace. That > >>is a nightmare no one needs. > >> > >>So please remove this broken code, and this broken concept from the > >>kernel and go back to the drawing board. > > > > In that case the phys port id is broken in the same way. Let's rather > > think about how to avoid conflicts for both. Given the fact the > > conflicts should be avoided only on a single baremetal, that should be > > doable (for (bad) example using driver name mixed with driver created > > id). > > No. phys_port_id is not broken in the same way, and phys_port_id does > not have the same set of properties. > > phys_port_id's in practice all have an IEEE prefix that identifies the > manufacturer and a manufacture assigned serial number. Aka a mac > address or a EUID-64. What the mlx4 ethernet driver is doing retunring > a 64bit EUID-64 I don't know. If there are problems in the worst > case issues with phys_port_id are fixable by simple driver tweaks, > because fundamentally we are working with globally uniuqe identifiers. > Well globally unique baring manufacturing bugs in eeproms. Manufacturers have to generate unique MAC addresses - otherwise people complain. But can't be assumed to put different 'serial numbers' in other devices. If you look at USB memory sticks you are likely to find that the serial number in the (equivalent of the) ATA identify response isn't unique. So I doubt you can use the value to distinguish between devices. You also get the situation where ethernet MAC addresses only have to be unique within a collision domain. Many old sun systems used a single MAC address - valid because they assumed/required that multiple interfaces be connected to different networks. So even MAC addresses aren't per-interface. David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Laight <David.Laight@ACULAB.COM> writes: > From: Eric W. Biederman >> Jiri Pirko <jiri@resnulli.us> writes: >> >> > Thu, Dec 04, 2014 at 09:06:14PM CET, ebiederm@xmission.com wrote: >> >>ebiederm@xmission.com (Eric W. Biederman) writes: >> >> >> >>> Jiri Pirko <jiri@resnulli.us> writes: >> >>> >> >>>>>So this id needs to be globally unique? >> >>>> >> >>>> No. It is enough to be unique within a single system. It serves for no >> >>>> more than to find out 2 ids are same or not, no other info value. >> >>>> >> >>>> So when the drivers uses sane ids (like mac for example, or in case of >> >>>> rocker an id which is passed by qemu command line), the chances of >> >>>> collision are very very close to none (never say never). >> >> >> >>Thinking about what you said a little more. >> >> >> >>Two different sources of persistent numbers picking numbers by >> >>completely different algorithms can give you no assurance that you don't >> >>produce conflicts. >> >> >> >>The switch id as desisgned can not work. >> >> >> >>There are expected to be between 2**36 to 2**40 devices in this world. >> >>Your first switch id is a 64it number. At the very best by the birthday >> >>pardox predicts there will be a conflict ever 2**32 devices or between >> >>2**4 and 2**8 devices in the world with conflicts. If the ids are not >> >>randomly distributed (which they won't be) things could easily be much >> >>much worse. >> >> >> >>That is just good enough the code could get out there and run for years >> >>before you have the nightmare of having to fix all of userspace. That >> >>is a nightmare no one needs. >> >> >> >>So please remove this broken code, and this broken concept from the >> >>kernel and go back to the drawing board. >> > >> > In that case the phys port id is broken in the same way. Let's rather >> > think about how to avoid conflicts for both. Given the fact the >> > conflicts should be avoided only on a single baremetal, that should be >> > doable (for (bad) example using driver name mixed with driver created >> > id). >> >> No. phys_port_id is not broken in the same way, and phys_port_id does >> not have the same set of properties. >> >> phys_port_id's in practice all have an IEEE prefix that identifies the >> manufacturer and a manufacture assigned serial number. Aka a mac >> address or a EUID-64. What the mlx4 ethernet driver is doing retunring >> a 64bit EUID-64 I don't know. If there are problems in the worst >> case issues with phys_port_id are fixable by simple driver tweaks, >> because fundamentally we are working with globally uniuqe identifiers. >> Well globally unique baring manufacturing bugs in eeproms. > > Manufacturers have to generate unique MAC addresses - otherwise people complain. > But can't be assumed to put different 'serial numbers' in other devices. > If you look at USB memory sticks you are likely to find that the serial > number in the (equivalent of the) ATA identify response isn't unique. > So I doubt you can use the value to distinguish between devices. When I said serial number I was talking about MAC addresses. > You also get the situation where ethernet MAC addresses only have to be > unique within a collision domain. Many old sun systems used a single MAC > address - valid because they assumed/required that multiple interfaces > be connected to different networks. > So even MAC addresses aren't per-interface. The old sun system issue does not apply to switches and switch like devices. There are common layer two protocls (Spanning Tree? LACP?) that require each switch port to have a unique mac address. So people will complain and software will break if there is not a mac address per port. So for switches and things that strongly resemble switches assuming a unique mac address per port is a reasonable assumption. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/if_link.h b/include/linux/if_link.h index 4732063..a6e2594 100644 --- a/include/linux/if_link.h +++ b/include/linux/if_link.h @@ -145,6 +145,7 @@ enum { IFLA_CARRIER, IFLA_PHYS_PORT_ID, IFLA_CARRIER_CHANGES, + IFLA_PHYS_SWITCH_ID, __IFLA_MAX }; diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 4d99324..bd36a07 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -589,6 +589,14 @@ int print_linkinfo(const struct sockaddr_nl *who, b1, sizeof(b1))); } + if (tb[IFLA_PHYS_SWITCH_ID]) { + SPRINT_BUF(b1); + fprintf(fp, "switchid %s ", + hexstring_n2a(RTA_DATA(tb[IFLA_PHYS_SWITCH_ID]), + RTA_PAYLOAD(tb[IFLA_PHYS_SWITCH_ID]), + b1, sizeof(b1))); + } + if (tb[IFLA_OPERSTATE]) print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE]));
Signed-off-by: Jiri Pirko <jiri@resnulli.us> --- include/linux/if_link.h | 1 + ip/ipaddress.c | 8 ++++++++ 2 files changed, 9 insertions(+)