Message ID | 20170725051344.2040-1-jakub.kicinski@netronome.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote: > We are still in position where we can suggest uniform naming > convention for ndo_get_phys_port_name(). switchdev.txt file > already contained a suggestion of how to name external ports. > Since the use of switchdev for SR-IOV NIC's eswitches is growing, > establish a format for ports of those devices as well. > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> This is a nice addition and I suspect there could be even more done to update this file to cover the VF rep usage. > --- > Documentation/networking/switchdev.txt | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt > index 3e7b946dea27..7c4b6025fb4b 100644 > --- a/Documentation/networking/switchdev.txt > +++ b/Documentation/networking/switchdev.txt > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ > ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" > > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y > -is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 > -would be sub-port 0 on port 1 on switch 1. > +Suggested formats of the port name returned by ndo_get_phys_port_name are: > + - pA for external ports; > + - pAsB for split external ports; > + - pfC for PF ports (so called PF representors); > + - pfCvfD for VF ports (so called VF representors). I hate to clutter this up, but might be also need to add: - pfCsB for split PF ports (so called PF representors); - pfCsBvfD for split VF ports (so called VF representors). or are we comfortable that these additions to the name for split ports are implied? > +Where A is the port name or ID; B is the sub-port name or ID; C is PCIe > +Physical Function name or ID and D is PCIe Virtual Function name or ID. > + > +Suggested naming convention for switches is "swX", where X is the switch name > +or ID, plus the port name. For example, sw1p1s0 would be sub-port 0 on port 1 > +on switch 1. > > Port Features > ^^^^^^^^^^^^^
On Tue, 25 Jul 2017 11:22:41 -0400, Andy Gospodarek wrote: > On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote: > > We are still in position where we can suggest uniform naming > > convention for ndo_get_phys_port_name(). switchdev.txt file > > already contained a suggestion of how to name external ports. > > Since the use of switchdev for SR-IOV NIC's eswitches is growing, > > establish a format for ports of those devices as well. > > > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > > This is a nice addition and I suspect there could be even more done to > update this file to cover the VF rep usage. > > > --- > > Documentation/networking/switchdev.txt | 14 +++++++++++--- > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt > > index 3e7b946dea27..7c4b6025fb4b 100644 > > --- a/Documentation/networking/switchdev.txt > > +++ b/Documentation/networking/switchdev.txt > > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique > > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ > > ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" > > > > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y > > -is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 > > -would be sub-port 0 on port 1 on switch 1. > > +Suggested formats of the port name returned by ndo_get_phys_port_name are: > > + - pA for external ports; > > + - pAsB for split external ports; > > + - pfC for PF ports (so called PF representors); > > + - pfCvfD for VF ports (so called VF representors). > > I hate to clutter this up, but might be also need to add: > > - pfCsB for split PF ports (so called PF representors); > - pfCsBvfD for split VF ports (so called VF representors). > > or are we comfortable that these additions to the name for split ports > are implied? Hm.. What is a split PF port? Splits happen on the physical port - see my rant on the thread this is a reply to ;) PFs are PCIe functions, on the opposite side of the eswitch from the wires.
On Tue, Jul 25, 2017 at 03:26:47PM -0700, Jakub Kicinski wrote: > On Tue, 25 Jul 2017 11:22:41 -0400, Andy Gospodarek wrote: > > On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote: > > > We are still in position where we can suggest uniform naming > > > convention for ndo_get_phys_port_name(). switchdev.txt file > > > already contained a suggestion of how to name external ports. > > > Since the use of switchdev for SR-IOV NIC's eswitches is growing, > > > establish a format for ports of those devices as well. > > > > > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > > > > This is a nice addition and I suspect there could be even more done to > > update this file to cover the VF rep usage. > > > > > --- > > > Documentation/networking/switchdev.txt | 14 +++++++++++--- > > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > > > > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt > > > index 3e7b946dea27..7c4b6025fb4b 100644 > > > --- a/Documentation/networking/switchdev.txt > > > +++ b/Documentation/networking/switchdev.txt > > > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique > > > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ > > > ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" > > > > > > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y > > > -is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 > > > -would be sub-port 0 on port 1 on switch 1. > > > +Suggested formats of the port name returned by ndo_get_phys_port_name are: > > > + - pA for external ports; > > > + - pAsB for split external ports; > > > + - pfC for PF ports (so called PF representors); > > > + - pfCvfD for VF ports (so called VF representors). > > > > I hate to clutter this up, but might be also need to add: > > > > - pfCsB for split PF ports (so called PF representors); > > - pfCsBvfD for split VF ports (so called VF representors). > > > > or are we comfortable that these additions to the name for split ports > > are implied? > > Hm.. What is a split PF port? Splits happen on the physical port - see > my rant on the thread this is a reply to ;) PFs are PCIe functions, > on the opposite side of the eswitch from the wires. I'm with you that I think there is value in separate netdevs to represent "PFs, VFs and external ports/MACs" -- particularly for the use-case you to create rules to control PF<->VF traffic. So while I'm not saying it is a _great_ idea to support such a thing as port-splitting of PFs, I suggested this addition as I'm not willing to restrict such a design/implementation if a vendor or customer desired. It seemed useful to provde some guidance on how to name them -- even if we do not like them. :-)
On Tue, 25 Jul 2017 21:48:15 -0400, Andy Gospodarek wrote: > On Tue, Jul 25, 2017 at 03:26:47PM -0700, Jakub Kicinski wrote: > > On Tue, 25 Jul 2017 11:22:41 -0400, Andy Gospodarek wrote: > > > On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote: > > > > We are still in position where we can suggest uniform naming > > > > convention for ndo_get_phys_port_name(). switchdev.txt file > > > > already contained a suggestion of how to name external ports. > > > > Since the use of switchdev for SR-IOV NIC's eswitches is growing, > > > > establish a format for ports of those devices as well. > > > > > > > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > > > > > > This is a nice addition and I suspect there could be even more done to > > > update this file to cover the VF rep usage. > > > > > > > --- > > > > Documentation/networking/switchdev.txt | 14 +++++++++++--- > > > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt > > > > index 3e7b946dea27..7c4b6025fb4b 100644 > > > > --- a/Documentation/networking/switchdev.txt > > > > +++ b/Documentation/networking/switchdev.txt > > > > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique > > > > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ > > > > ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" > > > > > > > > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y > > > > -is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 > > > > -would be sub-port 0 on port 1 on switch 1. > > > > +Suggested formats of the port name returned by ndo_get_phys_port_name are: > > > > + - pA for external ports; > > > > + - pAsB for split external ports; > > > > + - pfC for PF ports (so called PF representors); > > > > + - pfCvfD for VF ports (so called VF representors). > > > > > > I hate to clutter this up, but might be also need to add: > > > > > > - pfCsB for split PF ports (so called PF representors); > > > - pfCsBvfD for split VF ports (so called VF representors). > > > > > > or are we comfortable that these additions to the name for split ports > > > are implied? > > > > Hm.. What is a split PF port? Splits happen on the physical port - see > > my rant on the thread this is a reply to ;) PFs are PCIe functions, > > on the opposite side of the eswitch from the wires. > > I'm with you that I think there is value in separate netdevs to > represent "PFs, VFs and external ports/MACs" -- particularly for the > use-case you to create rules to control PF<->VF traffic. > > So while I'm not saying it is a _great_ idea to support such a thing as > port-splitting of PFs, I suggested this addition as I'm not willing to restrict > such a design/implementation if a vendor or customer desired. It seemed > useful to provde some guidance on how to name them -- even if we do not > like them. :-) If I understand you correctly split PF would be a situation where device has multiple port instances on the PCIe PF side? IOW switch sees multiple endpoints on the PF side? Let me attempt an ASCII diagram :) HOST A || HOST B || PF A | V | V | V | V || PF B | V | V | V | F | F | F | F ... || | F | F | F ... port A0 | port A1 | 0 | 1 | 2 | 3 || port B0 | port B1 | 0 | 1 | 2 || PCI Express link || PCI Express link \ \ \ | | | | | / / / \ \ \ | | | | | / / / \______\______\' | | | '____/___/___/ /---------------------------\ |<<========== | | ==========>> | | SR-IOV e-switch | |<<========== | | ==========>> | \---------------------------/ | | | | | | || || MAC 0 || MAC 1 || MAC 2 || || Seems to be a valid configuration, perhaps this would actually be of some use in container workloads, especially if the ports could be instantiated at runtime in high numbers. I would be cautious though with calling the instances splits. The more different PFs look from MACs the better IMHO. Do you actually have that problem today? Is there any HW supported upstream which would benefit from this? Could we decide on naming when we have an example implementation? In theory nothing stops us from splitting VFs the same way. Another note on PF netdevs, perhaps the most awkward thing about them, is that they result in two netdevs being visible to the host. This is not incorrect, since VFs if unassigned to VMs will end up creating an "actual" netdev and the switchdev port representor too, but it rubs some people the wrong way. Which in turn makes those people try to not spawn separate netdevs, which is incorrect IMHO, and breaks down e.g. when the real netdev gets assigned to a namespace. I'm not sure if this clarifies my thinking, I have, however, seem to have drawn a moose :)
Tue, Jul 25, 2017 at 07:13:44AM CEST, jakub.kicinski@netronome.com wrote: >We are still in position where we can suggest uniform naming >convention for ndo_get_phys_port_name(). switchdev.txt file >already contained a suggestion of how to name external ports. >Since the use of switchdev for SR-IOV NIC's eswitches is growing, >establish a format for ports of those devices as well. > >Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> >--- > Documentation/networking/switchdev.txt | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > >diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt >index 3e7b946dea27..7c4b6025fb4b 100644 >--- a/Documentation/networking/switchdev.txt >+++ b/Documentation/networking/switchdev.txt >@@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ > ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" > >-Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y >-is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 >-would be sub-port 0 on port 1 on switch 1. >+Suggested formats of the port name returned by ndo_get_phys_port_name are: >+ - pA for external ports; >+ - pAsB for split external ports; >+ - pfC for PF ports (so called PF representors); >+ - pfCvfD for VF ports (so called VF representors). I think it would make sense if the driver would just fill-up a struct in the ndo call and core would generate the string. >+Where A is the port name or ID; B is the sub-port name or ID; C is PCIe >+Physical Function name or ID and D is PCIe Virtual Function name or ID. >+ >+Suggested naming convention for switches is "swX", where X is the switch name >+or ID, plus the port name. For example, sw1p1s0 would be sub-port 0 on port 1 >+on switch 1. > > Port Features > ^^^^^^^^^^^^^ >-- >2.11.0 >
On Wed, 26 Jul 2017 07:48:40 +0200, Jiri Pirko wrote: > Tue, Jul 25, 2017 at 07:13:44AM CEST, jakub.kicinski@netronome.com wrote: > >We are still in position where we can suggest uniform naming > >convention for ndo_get_phys_port_name(). switchdev.txt file > >already contained a suggestion of how to name external ports. > >Since the use of switchdev for SR-IOV NIC's eswitches is growing, > >establish a format for ports of those devices as well. > > > >Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > >--- > > Documentation/networking/switchdev.txt | 14 +++++++++++--- > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > >diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt > >index 3e7b946dea27..7c4b6025fb4b 100644 > >--- a/Documentation/networking/switchdev.txt > >+++ b/Documentation/networking/switchdev.txt > >@@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique > > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ > > ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" > > > >-Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y > >-is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 > >-would be sub-port 0 on port 1 on switch 1. > >+Suggested formats of the port name returned by ndo_get_phys_port_name are: > >+ - pA for external ports; > >+ - pAsB for split external ports; > >+ - pfC for PF ports (so called PF representors); > >+ - pfCvfD for VF ports (so called VF representors). > > I think it would make sense if the driver would just fill-up a struct in > the ndo call and core would generate the string. I do like the idea of core generating the string. I have a temptation to try to involve devlink in this somehow, since port id and split info is already available there. Perhaps that would only overcomplicate things. The other question is: can we remove the option to generate an arbitrary string completely? I think Or and mlx5 folks may object since it would mean mlx5 VF repr names would change.
On Wed, Jul 26, 2017 at 11:13 AM, Jakub Kicinski <kubakici@wp.pl> wrote: > On Wed, 26 Jul 2017 07:48:40 +0200, Jiri Pirko wrote: >> I think it would make sense if the driver would just fill-up a struct in >> the ndo call and core would generate the string. > I do like the idea of core generating the string. I have a temptation > to try to involve devlink in this somehow, since port id and split info > is already available there. Perhaps that would only overcomplicate > things. > The other question is: can we remove the option to generate an arbitrary > string completely? I think Or and mlx5 folks may object since it would > mean mlx5 VF repr names would change. What we have today is the representor driver instance setting the VF index as the of phys port name and we're telling users to have this sort of udev rule: SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ ATTR{phys_port_name}!="", NAME="$PF_NIC$attr{phys_port_name}" that has affiliation to the PF where this VF belongs. AFAIK this model/assumption is now under push to higher level (open stack), so lets see if/what we want to change here w.r.t to sriov offloading drivers. I would opt for an approach where the value returned by the kernel is the minimal possible and user-space has flexibility to further orchestrate that with udev rules. I wasn't fully clear on Jakub's suggestion which parts must come from the kernel. Do we have any length limitation for the phys port name? Or.
On Tue, Jul 25, 2017 at 07:34:47PM -0700, Jakub Kicinski wrote: > On Tue, 25 Jul 2017 21:48:15 -0400, Andy Gospodarek wrote: > > On Tue, Jul 25, 2017 at 03:26:47PM -0700, Jakub Kicinski wrote: > > > On Tue, 25 Jul 2017 11:22:41 -0400, Andy Gospodarek wrote: > > > > On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote: > > > > > We are still in position where we can suggest uniform naming > > > > > convention for ndo_get_phys_port_name(). switchdev.txt file > > > > > already contained a suggestion of how to name external ports. > > > > > Since the use of switchdev for SR-IOV NIC's eswitches is growing, > > > > > establish a format for ports of those devices as well. > > > > > > > > > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > > > > > > > > This is a nice addition and I suspect there could be even more done to > > > > update this file to cover the VF rep usage. > > > > > > > > > --- > > > > > Documentation/networking/switchdev.txt | 14 +++++++++++--- > > > > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > > > > > > > > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt > > > > > index 3e7b946dea27..7c4b6025fb4b 100644 > > > > > --- a/Documentation/networking/switchdev.txt > > > > > +++ b/Documentation/networking/switchdev.txt > > > > > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique > > > > > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ > > > > > ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" > > > > > > > > > > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y > > > > > -is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 > > > > > -would be sub-port 0 on port 1 on switch 1. > > > > > +Suggested formats of the port name returned by ndo_get_phys_port_name are: > > > > > + - pA for external ports; > > > > > + - pAsB for split external ports; > > > > > + - pfC for PF ports (so called PF representors); > > > > > + - pfCvfD for VF ports (so called VF representors). > > > > > > > > I hate to clutter this up, but might be also need to add: > > > > > > > > - pfCsB for split PF ports (so called PF representors); > > > > - pfCsBvfD for split VF ports (so called VF representors). > > > > > > > > or are we comfortable that these additions to the name for split ports > > > > are implied? > > > > > > Hm.. What is a split PF port? Splits happen on the physical port - see > > > my rant on the thread this is a reply to ;) PFs are PCIe functions, > > > on the opposite side of the eswitch from the wires. > > > > I'm with you that I think there is value in separate netdevs to > > represent "PFs, VFs and external ports/MACs" -- particularly for the > > use-case you to create rules to control PF<->VF traffic. > > > > So while I'm not saying it is a _great_ idea to support such a thing as > > port-splitting of PFs, I suggested this addition as I'm not willing to restrict > > such a design/implementation if a vendor or customer desired. It seemed > > useful to provde some guidance on how to name them -- even if we do not > > like them. :-) > > If I understand you correctly split PF would be a situation where > device has multiple port instances on the PCIe PF side? IOW switch sees > multiple endpoints on the PF side? Let me attempt an ASCII diagram :) > > > HOST A || HOST B > || > PF A | V | V | V | V || PF B | V | V | V > | F | F | F | F ... || | F | F | F ... > port A0 | port A1 | 0 | 1 | 2 | 3 || port B0 | port B1 | 0 | 1 | 2 > || > PCI Express link || PCI Express link > \ \ \ | | | | | / / / > \ \ \ | | | | | / / / > \______\______\' | | | '____/___/___/ > /---------------------------\ > |<<========== | > | ==========>> | > | SR-IOV e-switch | > |<<========== | > | ==========>> | > \---------------------------/ > | | | > | | | > || || > MAC 0 || MAC 1 || MAC 2 > || || > > > Seems to be a valid configuration, perhaps this would actually be of > some use in container workloads, especially if the ports could be > instantiated at runtime in high numbers. I would be cautious though > with calling the instances splits. The more different PFs look from > MACs the better IMHO. Do you actually have that problem today? > > Is there any HW supported upstream which would benefit from this? Could we > decide on naming when we have an example implementation? In theory > nothing stops us from splitting VFs the same way. Not that I know about right now. > Another note on PF netdevs, perhaps the most awkward thing about them, > is that they result in two netdevs being visible to the host. This is > not incorrect, since VFs if unassigned to VMs will end up creating an > "actual" netdev and the switchdev port representor too, but it rubs > some people the wrong way. Which in turn makes those people try to not > spawn separate netdevs, which is incorrect IMHO, and breaks down e.g. > when the real netdev gets assigned to a namespace. For me, the most awkward part of having a separate netdev for the PF and the MAC is that is really not how things were thought about in the nominal switching case (the non e-switch case). Since idea behind switchdev when it was created was to make sure that each front-panel port on a switch was represented by a netdev in the kernel (and the 'CPU interface' was abstracted away by the driver) I was always a bit uneasy about having a separate netdev allocated for the CPU port when in the switching case it really wasn't necessary. > I'm not sure if this clarifies my thinking, I have, however, seem to > have drawn a moose :) Which looks great, BTW. The moose may turn out to be one of the major benefits from this thread!
On Wed, 26 Jul 2017 12:23:17 +0300, Or Gerlitz wrote: > On Wed, Jul 26, 2017 at 11:13 AM, Jakub Kicinski <kubakici@wp.pl> wrote: > > On Wed, 26 Jul 2017 07:48:40 +0200, Jiri Pirko wrote: > > >> I think it would make sense if the driver would just fill-up a struct in > >> the ndo call and core would generate the string. > > > I do like the idea of core generating the string. I have a temptation > > to try to involve devlink in this somehow, since port id and split info > > is already available there. Perhaps that would only overcomplicate > > things. > > > The other question is: can we remove the option to generate an arbitrary > > string completely? I think Or and mlx5 folks may object since it would > > mean mlx5 VF repr names would change. > > What we have today is the representor driver instance setting the VF index as > the of phys port name and we're telling users to have this sort of udev rule: > > SUBSYSTEM=="net", ACTION=="add", > ATTR{phys_switch_id}=="<phys_switch_id>", \ ATTR{phys_port_name}!="", > NAME="$PF_NIC$attr{phys_port_name}" Example names generated by this rule would be pfnic_0, pfnic_1 for vf representors 0 and 1? > that has affiliation to the PF where this VF belongs. AFAIK this > model/assumption is now under push to > higher level (open stack), so lets see if/what we want to change here > w.r.t to sriov offloading drivers. > > I would opt for an approach where the value returned by the kernel is > the minimal possible and > user-space has flexibility to further orchestrate that with udev > rules. I wasn't fully clear on Jakub's suggestion > which parts must come from the kernel. Do we have any length > limitation for the phys port name? Looks like the limit today is IFNAMSIZ. I'm in favor of leaving the flexibility to the userspace, why I suggested adding the pf%d or pf%dvf%d to the name is that I don't think we have any other way today of telling whether representor is for a physical port, VF or PF. If I understand mlx5 code, you're not reporting port ids for physical ports so presence of the name already implies it's a VF but in case you want to add port splitting support, for example, reporting the name on physical ports will become more of a necessity. If we adopt Jiri's suggestion of returning structured data it will be very easy to give user space type and indexes separately, but we should probably still return the string for backwards compatibility.
On Wed, Jul 26, 2017 at 11:11 PM, Jakub Kicinski <kubakici@wp.pl> wrote: > On Wed, 26 Jul 2017 12:23:17 +0300, Or Gerlitz wrote: >> On Wed, Jul 26, 2017 at 11:13 AM, Jakub Kicinski <kubakici@wp.pl> wrote: >> > On Wed, 26 Jul 2017 07:48:40 +0200, Jiri Pirko wrote: >> >> >> I think it would make sense if the driver would just fill-up a struct in >> >> the ndo call and core would generate the string. >> >> > I do like the idea of core generating the string. I have a temptation >> > to try to involve devlink in this somehow, since port id and split info >> > is already available there. Perhaps that would only overcomplicate >> > things. >> >> > The other question is: can we remove the option to generate an arbitrary >> > string completely? I think Or and mlx5 folks may object since it would >> > mean mlx5 VF repr names would change. >> >> What we have today is the representor driver instance setting the VF index as >> the of phys port name and we're telling users to have this sort of udev rule: >> >> SUBSYSTEM=="net", ACTION=="add", >> ATTR{phys_switch_id}=="<phys_switch_id>", \ ATTR{phys_port_name}!="", >> NAME="$PF_NIC$attr{phys_port_name}" > > Example names generated by this rule would be pfnic_0, pfnic_1 for vf > representors 0 and 1? YES >> that has affiliation to the PF where this VF belongs. AFAIK this >> model/assumption is now under push to >> higher level (open stack), so lets see if/what we want to change here >> w.r.t to sriov offloading drivers. >> >> I would opt for an approach where the value returned by the kernel is >> the minimal possible and >> user-space has flexibility to further orchestrate that with udev >> rules. I wasn't fully clear on Jakub's suggestion >> which parts must come from the kernel. Do we have any length >> limitation for the phys port name? > > Looks like the limit today is IFNAMSIZ. I'm in favor of leaving the > flexibility to the userspace, why I suggested adding the pf%d or > pf%dvf%d to the name is that I don't think we have any other way today > of telling whether representor is for a physical port, VF or PF. > If I understand mlx5 code, you're not reporting port ids for physical > ports so presence of the name already implies it's a VF but in case you yes, currently in our model the PF serves as the uplink representor when running in offloads mode. > want to add port splitting support, for example, reporting the name on > physical ports will become more of a necessity. > If we adopt Jiri's suggestion of returning structured data it will be > very easy to give user space type and indexes separately, but we should > probably still return the string for backwards compatibility. I am not still clear how the structured data would look like
diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt index 3e7b946dea27..7c4b6025fb4b 100644 --- a/Documentation/networking/switchdev.txt +++ b/Documentation/networking/switchdev.txt @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y -is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 -would be sub-port 0 on port 1 on switch 1. +Suggested formats of the port name returned by ndo_get_phys_port_name are: + - pA for external ports; + - pAsB for split external ports; + - pfC for PF ports (so called PF representors); + - pfCvfD for VF ports (so called VF representors). +Where A is the port name or ID; B is the sub-port name or ID; C is PCIe +Physical Function name or ID and D is PCIe Virtual Function name or ID. + +Suggested naming convention for switches is "swX", where X is the switch name +or ID, plus the port name. For example, sw1p1s0 would be sub-port 0 on port 1 +on switch 1. Port Features ^^^^^^^^^^^^^
We are still in position where we can suggest uniform naming convention for ndo_get_phys_port_name(). switchdev.txt file already contained a suggestion of how to name external ports. Since the use of switchdev for SR-IOV NIC's eswitches is growing, establish a format for ports of those devices as well. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> --- Documentation/networking/switchdev.txt | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)