diff mbox

[V2,1/3] Documentation: devicetree: add multiple cpu port DSA binding

Message ID 20170530104419.6052-1-john@phrozen.org
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

John Crispin May 30, 2017, 10:44 a.m. UTC
Extend the DSA binding documentation, adding the new property required
when there is more than one CPU port attached to the switch.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: John Crispin <john@phrozen.org>
---
 Documentation/devicetree/bindings/net/dsa/dsa.txt | 61 ++++++++++++++++++++++-
 1 file changed, 60 insertions(+), 1 deletion(-)

Comments

Florian Fainelli May 30, 2017, 9:32 p.m. UTC | #1
On 05/30/2017 03:44 AM, John Crispin wrote:
> Extend the DSA binding documentation, adding the new property required
> when there is more than one CPU port attached to the switch.
> 
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: devicetree@vger.kernel.org
> Signed-off-by: John Crispin <john@phrozen.org>
> ---
>  Documentation/devicetree/bindings/net/dsa/dsa.txt | 61 ++++++++++++++++++++++-
>  1 file changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt
> index cfe8f64eca4f..c164eb38ccc5 100644
> --- a/Documentation/devicetree/bindings/net/dsa/dsa.txt
> +++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt
> @@ -55,6 +55,11 @@ A user port has the following optional property:
>  - label			: Describes the label associated with this port, which
>                            will become the netdev name.
>  
> +- cpu			: Option for non "cpu"/"dsa" ports. A phandle to a
> +			  "cpu" port, which will be used for passing packets
> +			  from this port to the host. If not present, the first
> +			  "cpu" port will be used.

So this option essentially allow us to "partition" the switch between
vectors of ports and their upstream/CPU port.

While using Device Tree is an obvious choice for making the initial
partitioning, it seems like we are missing a configuration mechanism
whereby we can properly assign ports to a specific upstream CPU port.

Let's move the actual discussion into patch 2 in order not to pollute
the DT maintainers' inbox.

> +
>  Port child nodes may also contain the following optional standardised
>  properties, described in binding documents:
>  
> @@ -71,7 +76,7 @@ properties, described in binding documents:
>  			  Documentation/devicetree/bindings/net/fixed-link.txt
>  			  for details.
>  
> -Example
> +Examples
>  
>  The following example shows three switches on three MDIO busses,
>  linked into one DSA cluster.
> @@ -264,6 +269,60 @@ linked into one DSA cluster.
>  	};
>  };
>  
> +The following example shows a switch that has two cpu ports each connecting
> +to a different MAC.
> +
> +&mdio0 {
> +	switch@0 {
> +		compatible = "mediatek,mt7530";
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +		reg = <0>;
> +
> +		ports {
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +			port@0 {
> +				reg = <0>;
> +				label = "lan0";
> +				cpu = <&cpu_port1>;
> +			};
> +
> +			port@1 {
> +				reg = <1>;
> +				label = "lan1";
> +				cpu = <&cpu_port1>;
> +			};
> +
> +			port@2 {
> +				reg = <2>;
> +				label = "lan2";
> +				cpu = <&cpu_port1>;
> +			};
> +
> +			port@3 {
> +				reg = <3>;
> +				label = "wan";
> +				cpu = <&cpu_port2>;
> +			};
> +
> +			cpu_port2: port@5 {
> +				reg = <5>;
> +				label = "cpu";
> +				ethernet = <&gmac2>;
> +				phy-mode = "trgmii";
> +			};
> +
> +			cpu_port1: port@6 {
> +				reg = <6>;
> +				label = "cpu";
> +				ethernet = <&gmac1>;
> +				phy-mode = "trgmii";
> +			};
> +		};
> +	};
> +};
> +
>  Deprecated Binding
>  ------------------
>  
>
Rob Herring June 7, 2017, 9:10 p.m. UTC | #2
On Tue, May 30, 2017 at 02:32:29PM -0700, Florian Fainelli wrote:
> On 05/30/2017 03:44 AM, John Crispin wrote:
> > Extend the DSA binding documentation, adding the new property required
> > when there is more than one CPU port attached to the switch.
> > 
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Cc: devicetree@vger.kernel.org
> > Signed-off-by: John Crispin <john@phrozen.org>
> > ---
> >  Documentation/devicetree/bindings/net/dsa/dsa.txt | 61 ++++++++++++++++++++++-
> >  1 file changed, 60 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt
> > index cfe8f64eca4f..c164eb38ccc5 100644
> > --- a/Documentation/devicetree/bindings/net/dsa/dsa.txt
> > +++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt
> > @@ -55,6 +55,11 @@ A user port has the following optional property:
> >  - label			: Describes the label associated with this port, which
> >                            will become the netdev name.
> >  
> > +- cpu			: Option for non "cpu"/"dsa" ports. A phandle to a
> > +			  "cpu" port, which will be used for passing packets
> > +			  from this port to the host. If not present, the first
> > +			  "cpu" port will be used.
> 
> So this option essentially allow us to "partition" the switch between
> vectors of ports and their upstream/CPU port.

Could this be more generic? This is basically saying route all packets 
on this port to another port. Maybe there's some usecase to route to 
non-cpu ports?

> While using Device Tree is an obvious choice for making the initial
> partitioning, it seems like we are missing a configuration mechanism
> whereby we can properly assign ports to a specific upstream CPU port.

What determines how things are routed/partitioned? If it is purely user 
choice then I don't think this should be in DT.

> Let's move the actual discussion into patch 2 in order not to pollute
> the DT maintainers' inbox.
Florian Fainelli June 7, 2017, 9:35 p.m. UTC | #3
On 06/07/2017 02:10 PM, Rob Herring wrote:
> On Tue, May 30, 2017 at 02:32:29PM -0700, Florian Fainelli wrote:
>> On 05/30/2017 03:44 AM, John Crispin wrote:
>>> Extend the DSA binding documentation, adding the new property required
>>> when there is more than one CPU port attached to the switch.
>>>
>>> Cc: Rob Herring <robh+dt@kernel.org>
>>> Cc: devicetree@vger.kernel.org
>>> Signed-off-by: John Crispin <john@phrozen.org>
>>> ---
>>>  Documentation/devicetree/bindings/net/dsa/dsa.txt | 61 ++++++++++++++++++++++-
>>>  1 file changed, 60 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt
>>> index cfe8f64eca4f..c164eb38ccc5 100644
>>> --- a/Documentation/devicetree/bindings/net/dsa/dsa.txt
>>> +++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt
>>> @@ -55,6 +55,11 @@ A user port has the following optional property:
>>>  - label			: Describes the label associated with this port, which
>>>                            will become the netdev name.
>>>  
>>> +- cpu			: Option for non "cpu"/"dsa" ports. A phandle to a
>>> +			  "cpu" port, which will be used for passing packets
>>> +			  from this port to the host. If not present, the first
>>> +			  "cpu" port will be used.
>>
>> So this option essentially allow us to "partition" the switch between
>> vectors of ports and their upstream/CPU port.
> 
> Could this be more generic? This is basically saying route all packets 
> on this port to another port. Maybe there's some usecase to route to 
> non-cpu ports?

Yes, we absolutely want it to be more generic. The "problem" is that
before an user has a chance to come in and type some commands that
reconfigure the switch, there is a default state in which we need to
assume one particular CPU/management port is going to be used to receive
traffic. This can be pushed as a driver decision as to which of the
multiple CPU ports is the most suitable IMHO.

> 
>> While using Device Tree is an obvious choice for making the initial
>> partitioning, it seems like we are missing a configuration mechanism
>> whereby we can properly assign ports to a specific upstream CPU port.
> 
> What determines how things are routed/partitioned? If it is purely user 
> choice then I don't think this should be in DT.

Right now we don't have any mechanism, and statically doing this from
Device Tree is too inflexible. I have been working on a parallel path
where we use the bridge (which is already accelerated when there is a
switch) in order to define groups of ports, the idea would be do to e.g:

brctl addbr br-lan
brctl addbr br-lan eth0
brctl addbr br-lan lan1
...
brctl addbr br-lan lan4

brctl addbr br-wan
brctl addbr br-wan eth1
brctl addbr br-wan wan

Assuming that lan1-lan4 are your LAN ports, and wan is your WAN port and
you have two CPU ports.

Until we configure these bridges though, we would be in the same state
that we currently are, which is that traffic is only possible between
lan1 <-> eth0, lan2 <-> eth0 and so on (every port is isolated from each
other).
Andrew Lunn June 7, 2017, 9:42 p.m. UTC | #4
On Wed, Jun 07, 2017 at 04:10:02PM -0500, Rob Herring wrote:
> On Tue, May 30, 2017 at 02:32:29PM -0700, Florian Fainelli wrote:
> > On 05/30/2017 03:44 AM, John Crispin wrote:
> > > Extend the DSA binding documentation, adding the new property required
> > > when there is more than one CPU port attached to the switch.
> > > 
> > > Cc: Rob Herring <robh+dt@kernel.org>
> > > Cc: devicetree@vger.kernel.org
> > > Signed-off-by: John Crispin <john@phrozen.org>
> > > ---
> > >  Documentation/devicetree/bindings/net/dsa/dsa.txt | 61 ++++++++++++++++++++++-
> > >  1 file changed, 60 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt
> > > index cfe8f64eca4f..c164eb38ccc5 100644
> > > --- a/Documentation/devicetree/bindings/net/dsa/dsa.txt
> > > +++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt
> > > @@ -55,6 +55,11 @@ A user port has the following optional property:
> > >  - label			: Describes the label associated with this port, which
> > >                            will become the netdev name.
> > >  
> > > +- cpu			: Option for non "cpu"/"dsa" ports. A phandle to a
> > > +			  "cpu" port, which will be used for passing packets
> > > +			  from this port to the host. If not present, the first
> > > +			  "cpu" port will be used.
> > 
> > So this option essentially allow us to "partition" the switch between
> > vectors of ports and their upstream/CPU port.
> 
> Could this be more generic? This is basically saying route all packets 
> on this port to another port.

Hi Rob

No, it is not saying that.

The CPU port of the switch is special. It is used by the switch for
frames it does not know what to do with. e.g, it has not learned the
destination MAC address, it is an IGMP management packet, etc. Or the
MAC address is that of the CPU, or the CPU needs to bridge it out
another interface, e.g. a L2 VPN. The switch will add an additional
header indicating what the ingress port was, and pass it to the CPU
via this port.

There is a presentation Florian, Vivien and I made at netdev 2.1
earlier this year which talks about this.

If you want to mirror all packets from one port to another, you can
use tc and the mirror action.

> > While using Device Tree is an obvious choice for making the initial
> > partitioning, it seems like we are missing a configuration mechanism
> > whereby we can properly assign ports to a specific upstream CPU port.
> 
> What determines how things are routed/partitioned? If it is purely user 
> choice then I don't think this should be in DT.

> > Let's move the actual discussion into patch 2 in order not to pollute
> > the DT maintainers' inbox.

We are aiming to load balance traffic to/from the CPU and the switch.

The ports of a switch can very in characteristics. Sometimes two are
able to do 10Gbps, while others are just 1Gbps. So it would make sense
to give those higher speed ports a bigger fraction of the available
CPU bandwidth, etc.

This binding gives us the option to start with a sensible default for
a typical application of the hardware. For something like a WiFi box,
this is probably sufficient. However, there is also a lot of usage of
DSA in industrial application, and more flexibility is needed. For
that we probably need a user API of some sort which allows the
defaults in the device tree to be modified to a specific user case.

	 Andrew
Andrew Lunn June 8, 2017, 7:31 p.m. UTC | #5
> Right now we don't have any mechanism, and statically doing this from
> Device Tree is too inflexible. I have been working on a parallel path
> where we use the bridge (which is already accelerated when there is a
> switch) in order to define groups of ports, the idea would be do to e.g:
> 
> brctl addbr br-lan
> brctl addbr br-lan eth0
> brctl addbr br-lan lan1
> ...
> brctl addbr br-lan lan4
> 
> brctl addbr br-wan
> brctl addbr br-wan eth1
> brctl addbr br-wan wan
> 
> Assuming that lan1-lan4 are your LAN ports, and wan is your WAN port and
> you have two CPU ports.

Hi Florian

I don't like this, on multiple levels.

My wan port typically never has more than 40Mbps of traffic on it. So
dedicating a whole 1Gbps ethernet to it makes no sense. I want to
share eth1 bandwidth with the wan port, and some of the other
ports. Meaning i would have to add eth1 to br-lan as well as
br-wan. Does the bridge allow that? And what sort of hacks do you have
to allow a port to be added to a bridge, but not used by the bridge?

And what is the point of br-wan? It only has one real port in it. So
i'm adding per-packet overhead which i don't need, just in order to
signal to the hardware how it should statically route cpu traffic for
a port.

Now say i have one of the bigger marvell switches, with 11 ports, in
an industrial application. I setup 3 or 4 bridges. I then need to add
eth0 and eth1 to two different bridges. And say i use some ports
without a bridge. How do i configure them?

And how do i dump the current mapping?

For me, this is the wrong architecture. What CPU port is used should
be a port property, not a bridge property. I think we should use
devlink. Add commands to dump the current mapping, and set it.

	 Andrew
Florian Fainelli June 8, 2017, 7:57 p.m. UTC | #6
On 06/08/2017 12:31 PM, Andrew Lunn wrote:
>> Right now we don't have any mechanism, and statically doing this from
>> Device Tree is too inflexible. I have been working on a parallel path
>> where we use the bridge (which is already accelerated when there is a
>> switch) in order to define groups of ports, the idea would be do to e.g:
>>
>> brctl addbr br-lan
>> brctl addbr br-lan eth0
>> brctl addbr br-lan lan1
>> ...
>> brctl addbr br-lan lan4
>>
>> brctl addbr br-wan
>> brctl addbr br-wan eth1
>> brctl addbr br-wan wan
>>
>> Assuming that lan1-lan4 are your LAN ports, and wan is your WAN port and
>> you have two CPU ports.
> 
> Hi Florian
> 
> I don't like this, on multiple levels.
> 
> My wan port typically never has more than 40Mbps of traffic on it. So
> dedicating a whole 1Gbps ethernet to it makes no sense. I want to
> share eth1 bandwidth with the wan port, and some of the other
> ports. Meaning i would have to add eth1 to br-lan as well as
> br-wan. Does the bridge allow that? And what sort of hacks do you have
> to allow a port to be added to a bridge, but not used by the bridge?

This is AN example of how to configure a port grouping based on John's
example and use case, not THE only example ;) The idea is obviously that
you can define association between user-facing ports and CPU ports as
you see fit, the idea is to use bridge to do that association because
that's already what controls VLAN membership and forwarding.

> 
> And what is the point of br-wan? It only has one real port in it. So
> i'm adding per-packet overhead which i don't need, just in order to
> signal to the hardware how it should statically route cpu traffic for
> a port.

No, it has two ports in it, adding eth1 is necessary do define the VLAN
membership and forwarding rules, when eth1 is added we resolve the CPU
port this corresponds to in the switch and we use that to define the
forwarding decision between ports.

The overhead per-packet is extremely limited because the first thing
br_handle_frame() will do is see that eth1 is a DSA master network
device and pass packets back up the stack for processing through dst->rcv().

> 
> Now say i have one of the bigger marvell switches, with 11 ports, in
> an industrial application. I setup 3 or 4 bridges. I then need to add
> eth0 and eth1 to two different bridges. And say i use some ports> without a bridge. How do i configure them?

If you don't add your conduit interface to the bridge, then the default
CPU interface (which could be left to the driver to decide which one is
relevant) gets used and things work as expected. When you add a DSA
master network device to the bridge, only then do we use that
information to refine the forwarding decision and map to the appropriate
port vectors. Doing this becomes necessary if you create a second (or
more) bridge to isolate a group of ports.

> 
> And how do i dump the current mapping?

You look at both (or more) bridges' members, what's wrong or even any
different from what happens today?

> 
> For me, this is the wrong architecture. What CPU port is used should
> be a port property, not a bridge property. I think we should use
> devlink. Add commands to dump the current mapping, and set it.

In premise I agree, although just like we need a bridge today to
configure VLAN memberships, it did not seem to me like a big stretch to
use a bridge to configure which CPU port you want.
diff mbox

Patch

diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt
index cfe8f64eca4f..c164eb38ccc5 100644
--- a/Documentation/devicetree/bindings/net/dsa/dsa.txt
+++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt
@@ -55,6 +55,11 @@  A user port has the following optional property:
 - label			: Describes the label associated with this port, which
                           will become the netdev name.
 
+- cpu			: Option for non "cpu"/"dsa" ports. A phandle to a
+			  "cpu" port, which will be used for passing packets
+			  from this port to the host. If not present, the first
+			  "cpu" port will be used.
+
 Port child nodes may also contain the following optional standardised
 properties, described in binding documents:
 
@@ -71,7 +76,7 @@  properties, described in binding documents:
 			  Documentation/devicetree/bindings/net/fixed-link.txt
 			  for details.
 
-Example
+Examples
 
 The following example shows three switches on three MDIO busses,
 linked into one DSA cluster.
@@ -264,6 +269,60 @@  linked into one DSA cluster.
 	};
 };
 
+The following example shows a switch that has two cpu ports each connecting
+to a different MAC.
+
+&mdio0 {
+	switch@0 {
+		compatible = "mediatek,mt7530";
+		#address-cells = <1>;
+		#size-cells = <0>;
+		reg = <0>;
+
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			port@0 {
+				reg = <0>;
+				label = "lan0";
+				cpu = <&cpu_port1>;
+			};
+
+			port@1 {
+				reg = <1>;
+				label = "lan1";
+				cpu = <&cpu_port1>;
+			};
+
+			port@2 {
+				reg = <2>;
+				label = "lan2";
+				cpu = <&cpu_port1>;
+			};
+
+			port@3 {
+				reg = <3>;
+				label = "wan";
+				cpu = <&cpu_port2>;
+			};
+
+			cpu_port2: port@5 {
+				reg = <5>;
+				label = "cpu";
+				ethernet = <&gmac2>;
+				phy-mode = "trgmii";
+			};
+
+			cpu_port1: port@6 {
+				reg = <6>;
+				label = "cpu";
+				ethernet = <&gmac1>;
+				phy-mode = "trgmii";
+			};
+		};
+	};
+};
+
 Deprecated Binding
 ------------------