
[net-next,v2,1/2] net: dsa: mv88e6xxx: Save switch rules

Message ID 20190125095507.29334-2-miquel.raynal@bootlin.com
State Changes Requested
Delegated to: David Miller
Series mv88e6xxx DSA suspend to RAM support

Commit Message

Miquel Raynal Jan. 25, 2019, 9:55 a.m. UTC
The user might apply a specific switch configuration, with specific
forwarding rules, VLAN, bridges, etc.

During suspend to RAM the switch power will be turned off and the
switch will lose its configuration. In an attempt to bring S2RAM
support to the mv88e6xxx DSA, let's first save these rules in a
per-chip list thanks to the mv88e6xxx_add/del_xxx_rule()
helpers. These helpers are then called from various callbacks:
* mv88e6xxx_port_fdb_add/del()
* mv88e6xxx_port_mdb_add/del()
* mv88e6xxx_port_vlan_add/del()
* mv88e6xxx_port_bridge_join/leave()
* mv88e6xxx_crosschip_bridge_join/leave()

To avoid recursion problems when replaying the rules, the content of
the above *_add()/*_join() callbacks has been moved into separate
helpers with a '_' prefix. Hence, each callback just calls the
corresponding helper and the corresponding *_add_xxx_rule().

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
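
For illustration only, the callback/helper split described in the commit
message can be modelled in plain userspace C. This is a sketch under stated
assumptions, not the driver code: the names struct fdb_rule, struct chip,
_port_fdb_add(), port_fdb_add() and replay_rules() are hypothetical, only one
rule type is modelled, and a counter stands in for the hardware table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified model: FDB rules only. */
struct fdb_rule {
	int port;
	unsigned char addr[6];
	unsigned short vid;
	struct fdb_rule *next;
};

struct chip {
	struct fdb_rule *rules;	/* saved rules, oldest first */
	int hw_entries;		/* stands in for the hardware table */
};

/* Helper with the '_' prefix: programs the "hardware" only. */
static int _port_fdb_add(struct chip *chip, int port,
			 const unsigned char *addr, unsigned short vid)
{
	(void)port;
	(void)addr;
	(void)vid;
	chip->hw_entries++;
	return 0;
}

/* Callback: programs the hardware, then records the rule for later replay. */
static int port_fdb_add(struct chip *chip, int port,
			const unsigned char *addr, unsigned short vid)
{
	struct fdb_rule *rule, **pos;
	int err;

	err = _port_fdb_add(chip, port, addr, vid);
	if (err)
		return err;

	rule = calloc(1, sizeof(*rule));
	if (!rule)
		return -1;

	rule->port = port;
	memcpy(rule->addr, addr, sizeof(rule->addr));
	rule->vid = vid;

	/* Append at the tail so that replay preserves the original order. */
	for (pos = &chip->rules; *pos; pos = &(*pos)->next)
		;
	*pos = rule;
	return 0;
}

/* Replay calls the helper, never the callback, so the list cannot grow. */
static int replay_rules(struct chip *chip)
{
	struct fdb_rule *rule;
	int count = 0;

	assert(chip);
	for (rule = chip->rules; rule; rule = rule->next) {
		if (_port_fdb_add(chip, rule->port, rule->addr, rule->vid))
			return -1;
		count++;
	}
	return count;
}
```

Because replay_rules() only goes through the '_' helper, running it twice
leaves the saved list unchanged, which is the recursion problem the commit
message refers to.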

Changes since v1:

Comments

Florian Fainelli Jan. 25, 2019, 6:37 p.m. UTC | #1
Hi Miquel,

On 1/25/19 1:55 AM, Miquel Raynal wrote:
> The user might apply a specific switch configuration, with specific
> forwarding rules, VLAN, bridges, etc.
> 
> During suspend to RAM the switch power will be turned off and the
> switch will lose its configuration. In an attempt to bring S2RAM
> support to the mv88e6xxx DSA, let's first save these rules in a
> per-chip list thanks to the mv88e6xxx_add/del_xxx_rule()
> helpers. These helpers are then called from various callbacks:
> * mv88e6xxx_port_fdb_add/del()
> * mv88e6xxx_port_mdb_add/del()
> * mv88e6xxx_port_vlan_add/del()
> * mv88e6xxx_port_bridge_join/leave()
> * mv88e6xxx_crosschip_bridge_join/leave()
> 
> To avoid recursion problems when replaying the rules, the content of
> the above *_add()/*_join() callbacks has been moved into separate
> helpers with a '_' prefix. Hence, each callback just calls the
> corresponding helper and the corresponding *_add_xxx_rule().

None of this should be done in the driver IMHO, because this is
presumably applicable to all switch devices that lose their state during
suspend/resume, so at best, this should be moved to the core DSA layer,
but doing this means that we should also have a well established
contract between the DSA layer and individual switch drivers as far as
quiescing/saving/restoring state goes.

By moving things to the core we can also more tightly control what data
structures get used to represent e.g.: VLANs, FDBs, MDBs etc and
possibly push/utilize caching into the original subsystem. For instance
VLAN/bridge already do maintain caches of VLANs, so if we could somehow
expose those, we would not bloat the kernel's memory footprint by having
an additional layer to maintain with identical information.

Just my 2 cents.
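
One possible shape for such a contract, sketched in plain userspace C. Every
name here is hypothetical (struct sw, struct sw_ops, core_vlan_add(),
core_suspend(), core_resume() and the toy driver); none of it is the DSA API.
The assumption is that the core owns the cache and drivers only program
hardware.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VLANS 8

struct sw;

/* Hypothetical contract: drivers only program hardware, they cache nothing. */
struct sw_ops {
	int (*quiesce)(struct sw *sw);
	int (*vlan_add)(struct sw *sw, int port, unsigned short vid);
};

struct vlan_entry {
	int port;
	unsigned short vid;
};

struct sw {
	const struct sw_ops *ops;
	struct vlan_entry vlans[MAX_VLANS];	/* cache owned by the core */
	size_t n_vlans;
	void *priv;				/* driver private data */
};

/* Core entry point: program the driver, then remember the entry. */
static int core_vlan_add(struct sw *sw, int port, unsigned short vid)
{
	int err;

	assert(sw && sw->ops);
	if (sw->n_vlans == MAX_VLANS)
		return -1;

	err = sw->ops->vlan_add(sw, port, vid);
	if (err)
		return err;

	sw->vlans[sw->n_vlans].port = port;
	sw->vlans[sw->n_vlans].vid = vid;
	sw->n_vlans++;
	return 0;
}

static int core_suspend(struct sw *sw)
{
	return sw->ops->quiesce(sw);
}

/* Resume replays the core-owned cache through the same driver op. */
static int core_resume(struct sw *sw)
{
	size_t i;
	int err;

	for (i = 0; i < sw->n_vlans; i++) {
		err = sw->ops->vlan_add(sw, sw->vlans[i].port,
					sw->vlans[i].vid);
		if (err)
			return err;
	}
	return 0;
}

/* Toy driver: counts programmed VLANs; quiesce also models the power loss. */
struct toy_hw {
	int vlan_count;
};

static int toy_quiesce(struct sw *sw)
{
	((struct toy_hw *)sw->priv)->vlan_count = 0;
	return 0;
}

static int toy_vlan_add(struct sw *sw, int port, unsigned short vid)
{
	(void)port;
	(void)vid;
	((struct toy_hw *)sw->priv)->vlan_count++;
	return 0;
}

static const struct sw_ops toy_ops = {
	.quiesce = toy_quiesce,
	.vlan_add = toy_vlan_add,
};
```

In this sketch a second driver would only need its own ops table; the
save/restore logic stays in one place.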
Miquel Raynal Jan. 28, 2019, 2:24 p.m. UTC | #2
Hi Florian,

Florian Fainelli <f.fainelli@gmail.com> wrote on Fri, 25 Jan 2019
10:37:38 -0800:

> Hi Miquel,
> 
> On 1/25/19 1:55 AM, Miquel Raynal wrote:
> > The user might apply a specific switch configuration, with specific
> > forwarding rules, VLAN, bridges, etc.
> > 
> > During suspend to RAM the switch power will be turned off and the
> > switch will lose its configuration. In an attempt to bring S2RAM
> > support to the mv88e6xxx DSA, let's first save these rules in a
> > per-chip list thanks to the mv88e6xxx_add/del_xxx_rule()
> > helpers. These helpers are then called from various callbacks:
> > * mv88e6xxx_port_fdb_add/del()
> > * mv88e6xxx_port_mdb_add/del()
> > * mv88e6xxx_port_vlan_add/del()
> > * mv88e6xxx_port_bridge_join/leave()
> > * mv88e6xxx_crosschip_bridge_join/leave()
> > 
> > To avoid recursion problems when replaying the rules, the content of
> > the above *_add()/*_join() callbacks has been moved into separate
> > helpers with a '_' prefix. Hence, each callback just calls the
> > corresponding helper and the corresponding *_add_xxx_rule().  
> 
> None of this should be done in the driver IMHO, because this is
> presumably applicable to all switch devices that lose their state during
> suspend/resume, so at best, this should be moved to the core DSA layer,
> but doing this means that we should also have a well established
> contract between the DSA layer and individual switch drivers as far as
> quiescing/saving/restoring state goes.
> 
> By moving things to the core we can also more tightly control what data
> structures get used to represent e.g.: VLANs, FDBs, MDBs etc and
> possibly push/utilize caching into the original subsystem. For instance
> VLAN/bridge already do maintain caches of VLANs, so if we could somehow
> expose those, we would not bloat the kernel's memory footprint by having
> an additional layer to maintain with identical information.

So you suggest to move the intelligence of FDBs/MDBs in net/dsa/port.c,
is this right?

I don't see where VLAN and bridge information are cached, can you point
me to the relevant locations?

What about cross-chip bridges? There is nothing about them in
net/dsa/port.c. The implementation I see in the mv88e6xxx driver
only touches the PVT but I don't get whether we should handle this
calls like regular bridge-join/leave events or not (maybe they are
cached with regular bridge events?).


Thanks,
Miquèl
Andrew Lunn Jan. 28, 2019, 2:44 p.m. UTC | #3
> I don't see where VLAN and bridge information are cached, can you point
> me to the relevant locations?

Miquèl

The bridge should have all that information. You need to ask it to
enumerate the current configuration and replay it to the switch.

There might be something in the Mellanox driver you can copy? But i've
not looked, i'm just guessing.

We also need to think about how we are going to test this. There is a
lot of state information in a switch. So we are going to need some
pretty good tests to show we have recreated all of it.

       Andrew
Miquel Raynal Jan. 28, 2019, 3:57 p.m. UTC | #4
Hi Andrew,

Thanks for helping!

Andrew Lunn <andrew@lunn.ch> wrote on Mon, 28 Jan 2019 15:44:17 +0100:

> > I don't see where VLAN and bridge information are cached, can you point
> > me to the relevant locations?  
> 
> Miquèl
> 
> The bridge should have all that information. You need to ask it to
> enumerate the current configuration and replay it to the switch.
> 
> There might be something in the Mellanox driver you can copy? But i've
> not looked, i'm just guessing.

I am still searching but so far I did not find a mechanism reading the
configuration of the bridge out of a 'net' object. Indeed there are
multiple lists with the configuration but they are all 'mellanox'
objects, they do not belong to the core.

Maybe I don't find this configuration because I don't know what it is.
I imagine this configuration being one (or multiple) list(s), stored
somewhere in a net_device being a bridge. Am I on the wrong path?

Otherwise I might just save my own structures in net/dsa/switch.c like
I did for the mv88e6xxx driver, and once this works, net-folks might
want to optimize the memory consumption and re-use the bridge
configuration directly?

> We also need to think about how we are going to test this. There is a
> lot of state information in a switch. So we are going to need some
> pretty good tests to show we have recreated all of it.

My understanding of all this is rather limited; until now I used what
you proposed in the v1 of this series but I am all ears if I need to
add anything to my test list.


Thanks,
Miquèl
Andrew Lunn Jan. 28, 2019, 5:42 p.m. UTC | #5
On Mon, Jan 28, 2019 at 04:57:49PM +0100, Miquel Raynal wrote:
> Hi Andrew,
> 
> Thanks for helping!
> 
> Andrew Lunn <andrew@lunn.ch> wrote on Mon, 28 Jan 2019 15:44:17 +0100:
> 
> > > I don't see where VLAN and bridge information are cached, can you point
> > > me to the relevant locations?  
> > 
> > Miquèl
> > 
> > The bridge should have all that information. You need to ask it to
> > enumerate the current configuration and replay it to the switch.
> > 
> > There might be something in the Mellanox driver you can copy? But i've
> > not looked, i'm just guessing.
> 
> I am still searching but so far I did not find a mechanism reading the
> configuration of the bridge out of a 'net' object. Indeed there are
> multiple lists with the configuration but they are all 'mellanox'
> objects, they do not belong to the core.

Hi Miquèl

Look at how iproute2 works. How does the bridge command enumerate the
fdb and mdb's? How does bridge vlan show work? bridge link show? See
if you can use this infrastructure within the kernel.

> > We also need to think about how we are going to test this. There is a
> > lot of state information in a switch. So we are going to need some
> > pretty good tests to show we have recreated all of it.
> 
> My understanding of all this is rather limited; until now I used what
> you proposed in the v1 of this series but I am all ears if I need to
> add anything to my test list.

What you probably need is a generic DSA test suite, with a number of
hardware devices, with different generations of mv88e6xxx devices, and
ideally different sf2, ksz, etc switches. Set up a configuration and
test it works correctly. Suspend, resume, and test it still works. And
you probably need to go through a number of cycles of suspend/resume.
And you are going to need to maintain that for a number of years,
testing every release, to see what breaks as we add new features and
new devices.

There also needs to be some thought put into what happens when the
network changes while the switch is suspended. A port loses its link,
a port comes up, an SFP module is ejected, an SFP module is
inserted. The PTP grand master moves, etc. I hope the usual mechanisms
just work, but it all needs testing.

S2RAM is hard for a device like this. It is not something i personally
would want to do :-(

      Andrew
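
As a rough illustration of the "configure, test, suspend, resume, test again,
repeat" loop described above, here is a userspace C model. It is a sketch
under stated assumptions: struct sw_state, struct model_switch and
check_cycles() are hypothetical names, the state is reduced to a PVID and a
bridged flag per port, and the forget_bridging flag only exists to simulate
an incomplete restore. A real test would have to drive actual hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N_PORTS 4

/* Hypothetical, heavily reduced view of the switch state. */
struct sw_state {
	unsigned short pvid[N_PORTS];
	unsigned char bridged[N_PORTS];
};

struct model_switch {
	struct sw_state hw;	/* lost when power is removed */
	struct sw_state saved;	/* kept in RAM across suspend */
	bool forget_bridging;	/* simulates an incomplete restore */
};

static bool state_equal(const struct sw_state *a, const struct sw_state *b)
{
	return !memcmp(a->pvid, b->pvid, sizeof(a->pvid)) &&
	       !memcmp(a->bridged, b->bridged, sizeof(a->bridged));
}

static void model_suspend(struct model_switch *s)
{
	s->saved = s->hw;
	memset(&s->hw, 0, sizeof(s->hw));
}

static void model_resume(struct model_switch *s)
{
	s->hw = s->saved;
	if (s->forget_bridging)
		memset(s->hw.bridged, 0, sizeof(s->hw.bridged));
}

/*
 * Run several suspend/resume cycles and compare the state against the
 * snapshot taken before the first one. Returns 0 if every cycle matched,
 * otherwise the number of the first cycle that did not.
 */
static int check_cycles(struct model_switch *s, int cycles)
{
	const struct sw_state before = s->hw;
	int i;

	assert(cycles > 0);
	for (i = 1; i <= cycles; i++) {
		model_suspend(s);
		model_resume(s);
		if (!state_equal(&before, &s->hw))
			return i;
	}
	return 0;
}
```

The point of comparing against a snapshot taken before the first cycle is
that a restore which silently drops one class of state is caught on the
first cycle rather than going unnoticed.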
Miquel Raynal Jan. 29, 2019, 9:01 a.m. UTC | #6
Hi Andrew,

Andrew Lunn <andrew@lunn.ch> wrote on Mon, 28 Jan 2019 18:42:46 +0100:

> On Mon, Jan 28, 2019 at 04:57:49PM +0100, Miquel Raynal wrote:
> > Hi Andrew,
> > 
> > Thanks for helping!
> > 
> > Andrew Lunn <andrew@lunn.ch> wrote on Mon, 28 Jan 2019 15:44:17 +0100:
> >   
> > > > I don't see where VLAN and bridge information are cached, can you point
> > > > me to the relevant locations?    
> > > 
> > > Miquèl
> > > 
> > > The bridge should have all that information. You need to ask it to
> > > enumerate the current configuration and replay it to the switch.
> > > 
> > > There might be something in the Mellanox driver you can copy? But i've
> > > not looked, i'm just guessing.  
> > 
> > I am still searching but so far I did not find a mechanism reading the
> > configuration of the bridge out of a 'net' object. Indeed there are
> > multiple lists with the configuration but they are all 'mellanox'
> > objects, they do not belong to the core.  
> 
> Hi Miquèl
> 
> Look at how iproute2 works. How does the bridge command enumerate the
> fdb and mdb's? How does bridge vlan show work? bridge link show? See
> if you can use this infrastructure within the kernel.

Thanks!

> 
> > > We also need to think about how we are going to test this. There is a
> > > lot of state information in a switch. So we are going to need some
> > > pretty good tests to show we have recreated all of it.  
> > 
> > My understanding of all this is rather limited; until now I used what
> > you proposed in the v1 of this series but I am all ears if I need to
> > add anything to my test list.  
> 
> What you probably need is a generic DSA test suite, with a number of
> hardware devices, with different generations of mv88e6xxx devices, and
> ideally different sf2, ksz, etc switches. Set up a configuration and
> test it works correctly. Suspend, resume, and test it still works. And
> you probably need to go through a number of cycles of suspend/resume.
> And you are going to need to maintain that for a number of years,
> testing every release, to see what breaks as we add new features and
> new devices.

I am very sorry but I kind of disagree with the above proposal. Usually
contributors try to write the best solution with the help of the
community, test on the hardware they have in hand and propose the
changes. I cannot commit to a 10-year involvement in testing several
switches over the releases.

Today, there is no S2RAM support for switches. First, I proposed to add
suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid
crashing the kernel. It was reported that the configuration was lost so
I wrote a rule-saving mechanism to replay the rules at resume. I was
told that this mechanism would best fit in the DSA core directly. I am
open to doing that, I don't think it is that much work. But it is also
required that I use as little memory as possible. This is going to take
more time but I think I can do it as well. At least for a minimal set of
configuration.

Then, why not let other people improve things as they need? IIUC Switch
S2RAM does not work at all, I may try to improve the situation but I
do not have the abilities nor the time to do it exhaustively for every
piece of hardware and every situation.

> 
> There also needs to be some thought put into what happens when the
> network changes while the switch is suspended. A port loses its link,
> a port comes up, an SFP module is ejected, an SFP module is
> inserted. The PTP grand master moves, etc. I hope the usual mechanisms
> just work, but it all needs testing.

Is this really specific to switches? I know it is an issue and I
understand you would prefer not to support S2RAM at all rather than
addressing part of it, but isn't it better to support the simplest
situation first, than supporting nothing at all?


Thanks Andrew for your guidance and help anyway,
Miquèl
Andrew Lunn Jan. 29, 2019, 2:51 p.m. UTC | #7
On Tue, Jan 29, 2019 at 10:01:17AM +0100, Miquel Raynal wrote:
> Hi Andrew,
> 
> Andrew Lunn <andrew@lunn.ch> wrote on Mon, 28 Jan 2019 18:42:46 +0100:
> 
> > On Mon, Jan 28, 2019 at 04:57:49PM +0100, Miquel Raynal wrote:
> > > Hi Andrew,
> > > 
> > > Thanks for helping!
> > > 
> > > Andrew Lunn <andrew@lunn.ch> wrote on Mon, 28 Jan 2019 15:44:17 +0100:
> > >   
> > > > > I don't see where VLAN and bridge information are cached, can you point
> > > > > me to the relevant locations?    
> > > > 
> > > > Miquèl
> > > > 
> > > > The bridge should have all that information. You need to ask it to
> > > > enumerate the current configuration and replay it to the switch.
> > > > 
> > > > There might be something in the Mellanox driver you can copy? But i've
> > > > not looked, i'm just guessing.  
> > > 
> > > I am still searching but so far I did not find a mechanism reading the
> > > configuration of the bridge out of a 'net' object. Indeed there are
> > > multiple lists with the configuration but they are all 'mellanox'
> > > objects, they do not belong to the core.  
> > 
> > Hi Miquèl
> > 
> > Look at how iproute2 works. How does the bridge command enumerate the
> > fdb and mdb's? How does bridge vlan show work? bridge link show? See
> > if you can use this infrastructure within the kernel.
> 
> Thanks!
> 
> > 
> > > > We also need to think about how we are going to test this. There is a
> > > > lot of state information in a switch. So we are going to need some
> > > > pretty good tests to show we have recreated all of it.  
> > > 
> > > My understanding of all this is rather limited; until now I used what
> > > you proposed in the v1 of this series but I am all ears if I need to
> > > add anything to my test list.  
> > 
> > What you probably need is a generic DSA test suite, with a number of
> > hardware devices, with different generations of mv88e6xxx devices, and
> > ideally different sf2, ksz, etc switches. Set up a configuration and
> > test it works correctly. Suspend, resume, and test it still works. And
> > you probably need to go through a number of cycles of suspend/resume.
> > And you are going to need to maintain that for a number of years,
> > testing every release, to see what breaks as we add new features and
> > new devices.
> 
> I am very sorry but I kind of disagree with the above proposal. Usually
> contributors try to write the best solution with the help of the
> community, test on the hardware they have in hand and propose the
> changes. I cannot commit to a 10-year involvement in testing several
> switches over the releases.

Hi Miquèl

I was trying to point out this is a very hard subject to tackle. And
to do it right is not going to be a few patches. It needs a lot of
work, and a lot of testing, and it needs ongoing work because the
mv88e6xxx driver is not complete, there are more features to add,
which are going to need suspend/resume support adding.

> Today, there is no S2RAM support for switches. First, I proposed to add
> suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid
> crashing the kernel.

Then i would suggest the mv88e6xxx refuses the suspend. Actually that
probably is the first correct step. We don't have suspend support, so
stop the suspend happening, so preventing the kernel crash.

Having to maintain the mv88e6xxx, i don't want a suspend which might
work in the simplest configuration, but fails badly for more complex
configurations. Before accepting any patches, i want a good feeling it
works correctly. I would be willing to accept support and testing on
one Marvell family of switches, but again, i want to know it is well
tested. And i want to know somebody is going to stay around and look
after the support as the switch driver develops new features, which
are going to need suspend/resume support.

If you are only willing to consider a limited number of features, you
need to track if the switch is still within that limited set of
features, and refuse the suspend if not.

> > There also needs to be some thought put into what happens when the
> > network changes while the switch is suspended. A port loses its link,
> > a port comes up, an SFP module is ejected, an SFP module is
> > inserted. The PTP grand master moves, etc. I hope the usual mechanisms
> > just work, but it all needs testing.
> 
> Is this really specific to switches? I know it is an issue and I
> understand you would prefer not to support S2RAM at all rather than
> addressing part of it, but isn't it better to support the simplest
> situation first, than supporting nothing at all?

Worst case scenario, you induce a loop in your network, and a
broadcast storm takes down the whole network. It is unlikely, but it
is very disruptive if it does happen. It is also the sort of situation
which is probably not going to get tested, making it more likely to
actually happen. And this is specific to switches. A single network
card cannot do this, you need two ports to form a loop.

     Andrew
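
A minimal userspace sketch of "track the features in use and refuse the
suspend when outside the supported set". All names are hypothetical
(enum feature, RESTORABLE, struct chip, chip_suspend()), and both the choice
of which features count as restorable and the -EOPNOTSUPP return value are
assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical feature flags, set while a feature is configured. */
enum feature {
	FEAT_FDB	= 1 << 0,
	FEAT_VLAN	= 1 << 1,
	FEAT_MDB	= 1 << 2,
	FEAT_CROSSCHIP	= 1 << 3,
};

/* Assumption for the sketch: only these can be restored after resume. */
#define RESTORABLE (FEAT_FDB | FEAT_VLAN)

struct chip {
	unsigned int in_use;	/* bitmask of enum feature */
};

static void feature_enable(struct chip *chip, unsigned int feat)
{
	chip->in_use |= feat;
}

static void feature_disable(struct chip *chip, unsigned int feat)
{
	chip->in_use &= ~feat;
}

/* Refuse the suspend as soon as any non-restorable feature is in use. */
static int chip_suspend(const struct chip *chip)
{
	assert(chip);
	if (chip->in_use & ~RESTORABLE)
		return -EOPNOTSUPP;
	return 0;
}
```

With RESTORABLE set to 0 the same function degenerates to refusing the
suspend whenever anything is configured.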
Vivien Didelot Jan. 29, 2019, 3:46 p.m. UTC | #8
Hi Miquèl,

On Tue, 29 Jan 2019 15:51:57 +0100, Andrew Lunn <andrew@lunn.ch> wrote:

> > Today, there is no S2RAM support for switches. First, I proposed to add
> > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid
> > crashing the kernel.
> 
> Then i would suggest the mv88e6xxx refuses the suspend. Actually that
> probably is the first correct step. We don't have suspend support, so
> stop the suspend happening, so preventing the kernel crash.
 
I am not comfortable with adding support for S2RAM in mv88e6xxx yet because,
as was explained, we are aware of too many complicated scenarios out there
to pretend that DSA /partly/ supports suspend-resume. The preferred approach
for the moment is to keep things simple and not support this feature yet,
especially at the mv88e6xxx driver level.

However crashing is unacceptable so I'm backing Andrew's point here, please
submit a fix to prevent the suspend (and crash) for the moment.

Sorry if you felt that your work is being delayed, it is much appreciated!


Thanks,

	Vivien
Miquel Raynal Jan. 30, 2019, 9:46 a.m. UTC | #9
Hi Vivien & Andrew,

Vivien Didelot <vivien.didelot@gmail.com> wrote on Tue, 29 Jan 2019
10:46:13 -0500:

> Hi Miquèl,
> 
> On Tue, 29 Jan 2019 15:51:57 +0100, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > > Today, there is no S2RAM support for switches. First, I proposed to add
> > > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid
> > > crashing the kernel.  
> > 
> > Then i would suggest the mv88e6xxx refuses the suspend. Actually that
> > probably is the first correct step. We don't have suspend support, so
> > stop the suspend happening, so preventing the kernel crash.  
>  
> I am not comfortable with adding support for S2RAM in mv88e6xxx yet because,
> as was explained, we are aware of too many complicated scenarios out there
> to pretend that DSA /partly/ supports suspend-resume. The preferred approach
> for the moment is to keep things simple and not support this feature yet,
> especially at the mv88e6xxx driver level.
> 
> However crashing is unacceptable so I'm backing Andrew's point here, please
> submit a fix to prevent the suspend (and crash) for the moment.
> 
> Sorry if you felt that your work is being delayed, it is much appreciated!

Thanks for the more detailed explanation, I got your point and I better
understand your reluctance.

So your proposal is to refuse suspending when using a mv88e6xxx switch.
What about the current situation where suspending is allowed, but all
the configuration is gone? As long as all the ports are disabled during
suspend, it should not hurt anything, right? Plus, this is what the
bcm_sf2 and qca8k drivers are doing. I can even add an error message in
the resume path to warn about this drawback.


Thanks,
Miquèl
Andrew Lunn Jan. 30, 2019, 2:54 p.m. UTC | #10
> So your proposal is to refuse suspending when using a mv88e6xxx switch.

Hi Miquèl

That is the first step. It makes the mv88e6xxx suspend compliant, in
that it currently does not support suspend.

> What about the current situation where suspending is allowed, but all
> the configuration is gone?

That is broken. The whole point of suspending is that you resume back
to the original state.

   Andrew
Vivien Didelot Jan. 31, 2019, 12:46 a.m. UTC | #11
Hi Miquèl,

On Wed, 30 Jan 2019 10:46:06 +0100, Miquel Raynal <miquel.raynal@bootlin.com> wrote:

> > > > Today, there is no S2RAM support for switches. First, I proposed to add
> > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid
> > > > crashing the kernel.  
> > > 
> > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that
> > > probably is the first correct step. We don't have suspend support, so
> > > stop the suspend happening, so preventing the kernel crash.  

Actually can you show me the crash that is happening?
Miquel Raynal Feb. 1, 2019, 11:01 a.m. UTC | #12
Hi Vivien,

Vivien Didelot <vivien.didelot@gmail.com> wrote on Wed, 30 Jan 2019
19:46:08 -0500:

> Hi Miquèl,
> 
> On Wed, 30 Jan 2019 10:46:06 +0100, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> 
> > > > > Today, there is no S2RAM support for switches. First, I proposed to add
> > > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid
> > > > > crashing the kernel.    
> > > > 
> > > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that
> > > > probably is the first correct step. We don't have suspend support, so
> > > > stop the suspend happening, so preventing the kernel crash.    
> 
> Actually can you show me the crash that is happening?

Sure, here it is: http://code.bulix.org/swwb11-569137

Actually it is a silent crash but the platform never resumes. I am
pretty sure this is due to the kthread_queue_delayed_work() loop which
might access registers before it is allowed to do so. In my proposal I
just canceled it at suspend and restarted it at resume.

Next week I will send a patch to refuse the suspend as you both
suggested and if people want to suspend, they will have to remove the
switch support.


Thanks,
Miquèl
Andrew Lunn Feb. 1, 2019, 2:08 p.m. UTC | #13
On Fri, Feb 01, 2019 at 12:01:19PM +0100, Miquel Raynal wrote:
> Hi Vivien,
> 
> Vivien Didelot <vivien.didelot@gmail.com> wrote on Wed, 30 Jan 2019
> 19:46:08 -0500:
> 
> > Hi Miquèl,
> > 
> > On Wed, 30 Jan 2019 10:46:06 +0100, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > 
> > > > > > Today, there is no S2RAM support for switches. First, I proposed to add
> > > > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid
> > > > > > crashing the kernel.    
> > > > > 
> > > > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that
> > > > > probably is the first correct step. We don't have suspend support, so
> > > > > stop the suspend happening, so preventing the kernel crash.    
> > 
> > Actually can you show me the crash that is happening?
> 
> Sure, here it is: http://code.bulix.org/swwb11-569137
> 
> Actually it is a silent crash but the platform never resumes. I am
> pretty sure this is due to the kthread_queue_delayed_work() loop which
> might access registers before it is allowed to do so.

Hi Miquel

That sounds like it is an MDIO driver problem, or at least, a resume
ordering problem. You need to ensure that the MDIO bus driver is
resumed before the switch driver. Also, that the switch is suspended
before the MDIO bus driver is suspended.

       Andrew
Miquel Raynal Feb. 1, 2019, 2:43 p.m. UTC | #14
Hi Andrew,

Andrew Lunn <andrew@lunn.ch> wrote on Fri, 1 Feb 2019 15:08:31 +0100:

> On Fri, Feb 01, 2019 at 12:01:19PM +0100, Miquel Raynal wrote:
> > Hi Vivien,
> > 
> > Vivien Didelot <vivien.didelot@gmail.com> wrote on Wed, 30 Jan 2019
> > 19:46:08 -0500:
> >   
> > > Hi Miquèl,
> > > 
> > > On Wed, 30 Jan 2019 10:46:06 +0100, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > >   
> > > > > > > Today, there is no S2RAM support for switches. First, I proposed to add
> > > > > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid
> > > > > > > crashing the kernel.      
> > > > > > 
> > > > > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that
> > > > > > probably is the first correct step. We don't have suspend support, so
> > > > > > stop the suspend happening, so preventing the kernel crash.      
> > > 
> > > Actually can you show me the crash that is happening?  
> > 
> > Sure, here it is: http://code.bulix.org/swwb11-569137
> > 
> > Actually it is a silent crash but the platform never resumes. I am
> > pretty sure this is due to the kthread_queue_delayed_work() loop which
> > might access registers before it is allowed to do so.  
> 
> Hi Miquel
> 
> That sounds like it is an MDIO driver problem, or at least, a resume
> ordering problem. You need to ensure that the MDIO bus driver is
> resumed before the switch driver. Also, that the switch is suspended
> before the MDIO bus driver is suspended.

I don't think there is an ordering problem. The MDIO bus is suspended
after the switch and resumed before. But if there is no cancellation of
the work thread (which is always automatically restarted) in the suspend
path, sooner or later this work will run at a time when the MDIO bus is
still not accessible and will freeze the platform entirely.


Thanks,
Miquèl
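
The race described above can be modelled in userspace C as follows. This is
only a sketch with hypothetical names (struct model_chip, poll_work(),
model_suspend(), model_resume()); booleans stand in for the MDIO bus state
and for whether the periodic work is armed. In the kernel, the matching
operations would presumably be kthread_cancel_delayed_work_sync() in the
suspend path and kthread_queue_delayed_work() in the resume path.

```c
#include <assert.h>
#include <stdbool.h>

struct model_chip {
	bool bus_accessible;	/* false while the MDIO bus is suspended */
	bool work_armed;	/* true while the periodic work may run */
	int polls;		/* register polls done with the bus up */
	int bad_accesses;	/* polls attempted while the bus was down */
};

/* One run of the periodic work: polls a register if it is still armed. */
static void poll_work(struct model_chip *chip)
{
	assert(chip);
	if (!chip->work_armed)
		return;

	if (chip->bus_accessible)
		chip->polls++;
	else
		chip->bad_accesses++;	/* on real hardware this may hang */
}

/* Suspend: optionally cancel the work before the bus goes away. */
static void model_suspend(struct model_chip *chip, bool cancel_work)
{
	if (cancel_work)
		chip->work_armed = false;
	chip->bus_accessible = false;
}

/* Resume: the bus comes back first, then the work is re-armed. */
static void model_resume(struct model_chip *chip)
{
	chip->bus_accessible = true;
	chip->work_armed = true;
}
```

Even with correct suspend/resume ordering between the bus and the switch,
the model only avoids a bad access when the work is cancelled in the
suspend path.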

Patch

=================
* New patch: saves the forwarding/VLAN/bridge rules in a list when they
  are applied. This way, when the second patch adds S2RAM support, we
  just need to replay these rules.

 drivers/net/dsa/mv88e6xxx/chip.c | 270 +++++++++++++++++++++++++++----
 drivers/net/dsa/mv88e6xxx/chip.h |  33 ++++
 2 files changed, 271 insertions(+), 32 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 8a517d8fb9d1..428177f80abd 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -52,6 +52,111 @@  static void assert_reg_lock(struct mv88e6xxx_chip *chip)
 	}
 }
 
+static void mv88e6xxx_add_rule(struct mv88e6xxx_chip *chip, int port,
+			       enum mv88e6xxx_rule_type type,
+			       const unsigned char *addr, u16 vid,
+			       const struct switchdev_obj_port_vlan *vlan,
+			       int dev, struct net_device *br)
+{
+	struct mv88e6xxx_rule *rule;
+
+	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+	if (!rule)
+		return;
+
+	rule->port = port;
+	rule->type = type;
+	switch (type) {
+	case FDB_RULE:
+	case MDB_RULE:
+		ether_addr_copy(rule->params.db.addr, addr);
+		rule->params.db.vid = vid;
+		break;
+	case VLAN_RULE:
+		rule->params.vlan = *vlan;
+		break;
+	case BRIDGE_RULE:
+		rule->params.br = br;
+		break;
+	case CC_BRIDGE_RULE:
+		rule->params.crosschip.dev = dev;
+		rule->params.crosschip.br = br;
+		break;
+	}
+
+	list_add_tail(&rule->node, &chip->rules);
+}
+
+static void mv88e6xxx_del_rule(struct mv88e6xxx_chip *chip, int port,
+			       enum mv88e6xxx_rule_type type,
+			       const unsigned char *addr, u16 vid,
+			       const struct switchdev_obj_port_vlan *vlan,
+			       int dev, struct net_device *br)
+{
+	struct mv88e6xxx_rule *rule, *next;
+
+	list_for_each_entry_safe(rule, next, &chip->rules, node) {
+		if (rule->port != port || rule->type != type)
+			continue;
+
+		switch (type) {
+		case FDB_RULE:
+		case MDB_RULE:
+			if (!memcmp(rule->params.db.addr, addr, ETH_ALEN) &&
+			    rule->params.db.vid == vid)
+				goto found;
+			break;
+		case VLAN_RULE:
+			if (rule->params.vlan.flags == vlan->flags &&
+			    rule->params.vlan.vid_begin == vlan->vid_begin &&
+			    rule->params.vlan.vid_end == vlan->vid_end)
+				goto found;
+			break;
+		case BRIDGE_RULE:
+			if (rule->params.br == br)
+				goto found;
+			break;
+		case CC_BRIDGE_RULE:
+			if (rule->params.crosschip.br == br &&
+			    rule->params.crosschip.dev == dev)
+				goto found;
+			break;
+		default:
+			dev_warn(chip->dev, "Unknown rule type\n");
+			break;
+		}
+	}
+
+	dev_info(chip->dev, "Cannot delete rule: not found\n");
+
+	return;
+
+found:
+	list_del(&rule->node);
+	kfree(rule);
+}
+
+#define mv88e6xxx_add_fdb_rule(chip, port, addr, vid) \
+	mv88e6xxx_add_rule(chip, port, FDB_RULE, addr, vid, NULL, 0, NULL)
+#define mv88e6xxx_del_fdb_rule(chip, port, addr, vid) \
+	mv88e6xxx_del_rule(chip, port, FDB_RULE, addr, vid, NULL, 0, NULL)
+#define mv88e6xxx_add_mdb_rule(chip, port, addr, vid) \
+	mv88e6xxx_add_rule(chip, port, MDB_RULE, addr, vid, NULL, 0, NULL)
+#define mv88e6xxx_del_mdb_rule(chip, port, addr, vid) \
+	mv88e6xxx_del_rule(chip, port, MDB_RULE, addr, vid, NULL, 0, NULL)
+#define mv88e6xxx_add_vlan_rule(chip, port, vlan) \
+	mv88e6xxx_add_rule(chip, port, VLAN_RULE, NULL, 0, vlan, 0, NULL)
+#define mv88e6xxx_del_vlan_rule(chip, port, vlan) \
+	mv88e6xxx_del_rule(chip, port, VLAN_RULE, NULL, 0, vlan, 0, NULL)
+#define mv88e6xxx_add_bridge_rule(chip, port, br) \
+	mv88e6xxx_add_rule(chip, port, BRIDGE_RULE, NULL, 0, NULL, 0, br)
+#define mv88e6xxx_del_bridge_rule(chip, port, br) \
+	mv88e6xxx_del_rule(chip, port, BRIDGE_RULE, NULL, 0, NULL, 0, br)
+#define mv88e6xxx_add_cc_bridge_rule(chip, port, dev, br) \
+	mv88e6xxx_add_rule(chip, port, CC_BRIDGE_RULE, NULL, 0, NULL, dev, br)
+#define mv88e6xxx_del_cc_bridge_rule(chip, port, dev, br) \
+	mv88e6xxx_del_rule(chip, port, CC_BRIDGE_RULE, NULL, 0, NULL, dev, br)
+
 /* The switch ADDR[4:1] configuration pins define the chip SMI device address
  * (ADDR[0] is always zero, thus only even SMI addresses can be strapped).
  *
@@ -1674,8 +1779,8 @@  static int mv88e6xxx_broadcast_setup(struct mv88e6xxx_chip *chip, u16 vid)
 	return 0;
 }
 
-static int _mv88e6xxx_port_vlan_add(struct mv88e6xxx_chip *chip, int port,
-				    u16 vid, u8 member)
+static int __mv88e6xxx_port_vlan_add(struct mv88e6xxx_chip *chip, int port,
+				     u16 vid, u8 member)
 {
 	struct mv88e6xxx_vtu_entry vlan;
 	int err;
@@ -1693,19 +1798,19 @@  static int _mv88e6xxx_port_vlan_add(struct mv88e6xxx_chip *chip, int port,
 	return mv88e6xxx_broadcast_setup(chip, vid);
 }
 
-static void mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port,
+static int _mv88e6xxx_port_vlan_add(struct mv88e6xxx_chip *chip, int port,
 				    const struct switchdev_obj_port_vlan *vlan)
 {
-	struct mv88e6xxx_chip *chip = ds->priv;
 	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
 	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
 	u8 member;
 	u16 vid;
+	int err = 0;
 
 	if (!chip->info->max_vid)
-		return;
+		return -EINVAL;
 
-	if (dsa_is_dsa_port(ds, port) || dsa_is_cpu_port(ds, port))
+	if (dsa_is_dsa_port(chip->ds, port) || dsa_is_cpu_port(chip->ds, port))
 		member = MV88E6XXX_G1_VTU_DATA_MEMBER_TAG_UNMODIFIED;
 	else if (untagged)
 		member = MV88E6XXX_G1_VTU_DATA_MEMBER_TAG_UNTAGGED;
@@ -1714,16 +1819,37 @@  static void mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port,
 
 	mutex_lock(&chip->reg_lock);
 
-	for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid)
-		if (_mv88e6xxx_port_vlan_add(chip, port, vid, member))
-			dev_err(ds->dev, "p%d: failed to add VLAN %d%c\n", port,
-				vid, untagged ? 'u' : 't');
+	for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
+		err = __mv88e6xxx_port_vlan_add(chip, port, vid, member);
+		if (err) {
+			dev_err(chip->dev, "p%d: failed to add VLAN %d%c\n",
+				port, vid, untagged ? 'u' : 't');
+			goto out;
+		}
+	}
 
-	if (pvid && mv88e6xxx_port_set_pvid(chip, port, vlan->vid_end))
-		dev_err(ds->dev, "p%d: failed to set PVID %d\n", port,
-			vlan->vid_end);
+	if (pvid) {
+		err = mv88e6xxx_port_set_pvid(chip, port, vlan->vid_end);
+		if (err)
+			dev_err(chip->dev, "p%d: failed to set PVID %d\n",
+				port, vlan->vid_end);
+	}
 
+out:
 	mutex_unlock(&chip->reg_lock);
+
+	return err;
+}
+
+static void mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port,
+				    const struct switchdev_obj_port_vlan *vlan)
+{
+	struct mv88e6xxx_chip *chip = ds->priv;
+
+	if (_mv88e6xxx_port_vlan_add(chip, port, vlan))
+		return;
+
+	mv88e6xxx_add_vlan_rule(chip, port, vlan);
 }
 
 static int _mv88e6xxx_port_vlan_del(struct mv88e6xxx_chip *chip,
@@ -1769,6 +1895,8 @@  static int mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int port,
 	if (!chip->info->max_vid)
 		return -EOPNOTSUPP;
 
+	mv88e6xxx_del_vlan_rule(chip, port, vlan);
+
 	mutex_lock(&chip->reg_lock);
 
 	err = mv88e6xxx_port_get_pvid(chip, port, &pvid);
@@ -1793,10 +1921,9 @@  static int mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int port,
 	return err;
 }
 
-static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
-				  const unsigned char *addr, u16 vid)
+static int _mv88e6xxx_port_fdb_add(struct mv88e6xxx_chip *chip, int port,
+				   const unsigned char *addr, u16 vid)
 {
-	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
 	mutex_lock(&chip->reg_lock);
@@ -1807,12 +1934,29 @@  static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 	return err;
 }
 
+static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
+				  const unsigned char *addr, u16 vid)
+{
+	struct mv88e6xxx_chip *chip = ds->priv;
+	int err;
+
+	err = _mv88e6xxx_port_fdb_add(chip, port, addr, vid);
+	if (err)
+		return err;
+
+	mv88e6xxx_add_fdb_rule(chip, port, addr, vid);
+
+	return 0;
+}
+
 static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
 				  const unsigned char *addr, u16 vid)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
+	mv88e6xxx_del_fdb_rule(chip, port, addr, vid);
+
 	mutex_lock(&chip->reg_lock);
 	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
 					   MV88E6XXX_G1_ATU_DATA_STATE_UNUSED);
@@ -1945,31 +2089,68 @@  static int mv88e6xxx_bridge_map(struct mv88e6xxx_chip *chip,
 	return 0;
 }
 
+static int _mv88e6xxx_port_bridge_join(struct mv88e6xxx_chip *chip, int port,
+				       struct net_device *br)
+{
+	int err;
+
+	mutex_lock(&chip->reg_lock);
+	err = mv88e6xxx_bridge_map(chip, br);
+	mutex_unlock(&chip->reg_lock);
+
+	return err;
+}
+
 static int mv88e6xxx_port_bridge_join(struct dsa_switch *ds, int port,
 				      struct net_device *br)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
-	mutex_lock(&chip->reg_lock);
-	err = mv88e6xxx_bridge_map(chip, br);
-	mutex_unlock(&chip->reg_lock);
+	err = _mv88e6xxx_port_bridge_join(chip, port, br);
+	if (err)
+		return err;
 
-	return err;
+	mv88e6xxx_add_bridge_rule(chip, port, br);
+
+	return 0;
 }
 
 static void mv88e6xxx_port_bridge_leave(struct dsa_switch *ds, int port,
 					struct net_device *br)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
+	int err;
+
+	mv88e6xxx_del_bridge_rule(chip, port, br);
 
 	mutex_lock(&chip->reg_lock);
-	if (mv88e6xxx_bridge_map(chip, br) ||
-	    mv88e6xxx_port_vlan_map(chip, port))
+	err = mv88e6xxx_bridge_map(chip, br);
+	if (!err)
+		err = mv88e6xxx_port_vlan_map(chip, port);
+
+	if (err)
 		dev_err(ds->dev, "failed to remap in-chip Port VLAN\n");
+
 	mutex_unlock(&chip->reg_lock);
 }
 
+static int _mv88e6xxx_crosschip_bridge_join(struct mv88e6xxx_chip *chip,
+					    int dev, int port,
+					    struct net_device *br)
+{
+	int err;
+
+	if (!mv88e6xxx_has_pvt(chip))
+		return 0;
+
+	mutex_lock(&chip->reg_lock);
+	err = mv88e6xxx_pvt_map(chip, dev, port);
+	mutex_unlock(&chip->reg_lock);
+
+	return err;
+}
+
 static int mv88e6xxx_crosschip_bridge_join(struct dsa_switch *ds, int dev,
 					   int port, struct net_device *br)
 {
@@ -1979,11 +2160,13 @@  static int mv88e6xxx_crosschip_bridge_join(struct dsa_switch *ds, int dev,
 	if (!mv88e6xxx_has_pvt(chip))
 		return 0;
 
-	mutex_lock(&chip->reg_lock);
-	err = mv88e6xxx_pvt_map(chip, dev, port);
-	mutex_unlock(&chip->reg_lock);
+	err = _mv88e6xxx_crosschip_bridge_join(chip, dev, port, br);
+	if (err)
+		return err;
 
-	return err;
+	mv88e6xxx_add_cc_bridge_rule(chip, port, dev, br);
+
+	return 0;
 }
 
 static void mv88e6xxx_crosschip_bridge_leave(struct dsa_switch *ds, int dev,
@@ -1994,6 +2177,8 @@  static void mv88e6xxx_crosschip_bridge_leave(struct dsa_switch *ds, int dev,
 	if (!mv88e6xxx_has_pvt(chip))
 		return;
 
+	mv88e6xxx_del_cc_bridge_rule(chip, port, dev, br);
+
 	mutex_lock(&chip->reg_lock);
 	if (mv88e6xxx_pvt_map(chip, dev, port))
 		dev_err(ds->dev, "failed to remap cross-chip Port VLAN\n");
@@ -4534,17 +4719,34 @@  static int mv88e6xxx_port_mdb_prepare(struct dsa_switch *ds, int port,
 	return 0;
 }
 
+static int _mv88e6xxx_port_mdb_add(struct mv88e6xxx_chip *chip, int port,
+				   const unsigned char *addr, u16 vid)
+{
+	int err;
+
+	mutex_lock(&chip->reg_lock);
+	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
+					   MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC);
+	mutex_unlock(&chip->reg_lock);
+
+	if (err)
+		dev_err(chip->dev,
+			"p%d: failed to load multicast MAC address\n", port);
+
+	return err;
+}
+
 static void mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port,
 				   const struct switchdev_obj_port_mdb *mdb)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
+	int err;
 
-	mutex_lock(&chip->reg_lock);
-	if (mv88e6xxx_port_db_load_purge(chip, port, mdb->addr, mdb->vid,
-					 MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC))
-		dev_err(ds->dev, "p%d: failed to load multicast MAC address\n",
-			port);
-	mutex_unlock(&chip->reg_lock);
+	err = _mv88e6xxx_port_mdb_add(chip, port, mdb->addr, mdb->vid);
+	if (err)
+		return;
+
+	mv88e6xxx_add_mdb_rule(chip, port, mdb->addr, mdb->vid);
 }
 
 static int mv88e6xxx_port_mdb_del(struct dsa_switch *ds, int port,
@@ -4553,6 +4755,8 @@  static int mv88e6xxx_port_mdb_del(struct dsa_switch *ds, int port,
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
+	mv88e6xxx_del_mdb_rule(chip, port, mdb->addr, mdb->vid);
+
 	mutex_lock(&chip->reg_lock);
 	err = mv88e6xxx_port_db_load_purge(chip, port, mdb->addr, mdb->vid,
 					   MV88E6XXX_G1_ATU_DATA_STATE_UNUSED);
@@ -4766,6 +4970,8 @@  static int mv88e6xxx_probe(struct mdio_device *mdiodev)
 	if (err)
 		goto out_mdio;
 
+	INIT_LIST_HEAD(&chip->rules);
+
 	return 0;
 
 out_mdio:
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index f9ecb7872d32..b5c8d6a8d9e7 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -203,6 +203,36 @@  struct mv88e6xxx_port {
 	int serdes_irq;
 };
 
+enum mv88e6xxx_rule_type {
+	FDB_RULE,
+	MDB_RULE,
+	VLAN_RULE,
+	BRIDGE_RULE,
+	CC_BRIDGE_RULE,
+};
+
+struct mv88e6xxx_db_rule {
+	u8 addr[ETH_ALEN];
+	u16 vid;
+};
+
+struct mv88e6xxx_br_rule {
+	int dev;
+	struct net_device *br;
+};
+
+struct mv88e6xxx_rule {
+	int port;
+	enum mv88e6xxx_rule_type type;
+	union {
+		struct mv88e6xxx_db_rule db;
+		struct switchdev_obj_port_vlan vlan;
+		struct net_device *br;
+		struct mv88e6xxx_br_rule crosschip;
+	} params;
+	struct list_head node;
+};
+
 struct mv88e6xxx_chip {
 	const struct mv88e6xxx_info *info;
 
@@ -285,6 +315,9 @@  struct mv88e6xxx_chip {
 
 	/* Array of port structures. */
 	struct mv88e6xxx_port ports[DSA_MAX_PORTS];
+
+	/* Saved FDB/MDB/VLAN/bridge rules to replay at resume */
+	struct list_head rules;
 };
 
 struct mv88e6xxx_bus_ops {
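
For readers following the diff, the save-and-replay idea can be sketched
outside the kernel. This is a minimal user-space illustration, not the driver
code: struct rule, rule_add(), rule_del(), rule_replay() and rule_count_cb()
are hypothetical names, a plain singly linked list stands in for the kernel's
struct list_head, and only FDB-style rules (port, MAC address, VID) are
modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical, simplified stand-in for struct mv88e6xxx_rule (FDB only). */
struct rule {
	int port;
	uint8_t addr[ETH_ALEN];
	uint16_t vid;
	struct rule *next;
};

/* Save a rule once the hardware accepted it. Returns 0, or -1 on ENOMEM. */
static int rule_add(struct rule **head, int port, const uint8_t *addr,
		    uint16_t vid)
{
	struct rule *r;

	assert(head && addr);

	r = calloc(1, sizeof(*r));
	if (!r)
		return -1;

	r->port = port;
	memcpy(r->addr, addr, ETH_ALEN);
	r->vid = vid;
	r->next = *head;	/* newest rule first in this sketch */
	*head = r;

	return 0;
}

/* Forget a saved rule. Returns 0 if one was removed, -1 if none matched. */
static int rule_del(struct rule **head, int port, const uint8_t *addr,
		    uint16_t vid)
{
	struct rule **pp;

	assert(head && addr);

	for (pp = head; *pp; pp = &(*pp)->next) {
		struct rule *r = *pp;

		if (r->port == port && r->vid == vid &&
		    !memcmp(r->addr, addr, ETH_ALEN)) {
			*pp = r->next;
			free(r);
			return 0;
		}
	}

	return -1;
}

/* Replay all saved rules through apply(); stop at the first failure. */
static int rule_replay(const struct rule *head,
		       int (*apply)(const struct rule *r, void *ctx),
		       void *ctx)
{
	const struct rule *r;
	int err;

	for (r = head; r; r = r->next) {
		err = apply(r, ctx);
		if (err)
			return err;
	}

	return 0;
}

/* Example callback: counts rules instead of writing them to hardware. */
static int rule_count_cb(const struct rule *r, void *ctx)
{
	(void)r;
	(*(int *)ctx)++;
	return 0;
}
```

In this sketch the apply() callback plays the role of the '_'-prefixed helpers
in the patch: replay goes through the helper that only touches the hardware,
never through the callback that also saves the rule, so replaying does not add
rules to the list again.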