
[net-next,v11,2/5] netvsc: refactor notifier/event handling code to use the failover framework

Message ID 1526954781-35359-3-git-send-email-sridhar.samudrala@intel.com
State Changes Requested, archived
Delegated to: David Miller
Series Enable virtio_net to act as a standby for a passthru device

Commit Message

Samudrala, Sridhar May 22, 2018, 2:06 a.m. UTC
Use the registration/notification framework supported by the generic
failover infrastructure.

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
---
 drivers/net/hyperv/Kconfig      |   1 +
 drivers/net/hyperv/hyperv_net.h |   2 +
 drivers/net/hyperv/netvsc_drv.c | 133 +++++++---------------------------------
 3 files changed, 25 insertions(+), 111 deletions(-)

Comments

Jiri Pirko May 22, 2018, 9:06 a.m. UTC | #1
Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>Use the registration/notification framework supported by the generic
>failover infrastructure.
>
>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>

In previous patchset versions, the common code did
netdev_rx_handler_register() and netdev_upper_dev_link() etc
(netvsc_vf_join()). Now, this is still done in netvsc. Why?

This should be part of the common "failover" code.
Jiri Pirko May 22, 2018, 9:08 a.m. UTC | #2
Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>>Use the registration/notification framework supported by the generic
>>failover infrastructure.
>>
>>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>
>In previous patchset versions, the common code did
>netdev_rx_handler_register() and netdev_upper_dev_link() etc
>(netvsc_vf_join()). Now, this is still done in netvsc. Why?
>
>This should be part of the common "failover" code.
>

Also note that in the current patchset you use IFF_FAILOVER flag for
master, yet for the slave you use IFF_SLAVE. That is wrong.
IFF_FAILOVER_SLAVE should be used.
Michael S. Tsirkin May 22, 2018, 1:12 p.m. UTC | #3
On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
> >>Use the registration/notification framework supported by the generic
> >>failover infrastructure.
> >>
> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> >
> >In previous patchset versions, the common code did
> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
> >
> >This should be part of the common "failover" code.
> >
> 
> Also note that in the current patchset you use IFF_FAILOVER flag for
> master, yet for the slave you use IFF_SLAVE. That is wrong.
> IFF_FAILOVER_SLAVE should be used.

Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
Jiri Pirko May 22, 2018, 1:14 p.m. UTC | #4
Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
>On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
>> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>> >>Use the registration/notification framework supported by the generic
>> >>failover infrastructure.
>> >>
>> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> >
>> >In previous patchset versions, the common code did
>> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
>> >
>> >This should be part of the common "failover" code.
>> >
>> 
>> Also note that in the current patchset you use IFF_FAILOVER flag for
>> master, yet for the slave you use IFF_SLAVE. That is wrong.
>> IFF_FAILOVER_SLAVE should be used.
>
>Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?

No. IFF_SLAVE is for bonding.
Michael S. Tsirkin May 22, 2018, 1:17 p.m. UTC | #5
On Tue, May 22, 2018 at 03:14:22PM +0200, Jiri Pirko wrote:
> Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
> >On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
> >> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
> >> >>Use the registration/notification framework supported by the generic
> >> >>failover infrastructure.
> >> >>
> >> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> >> >
> >> >In previous patchset versions, the common code did
> >> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
> >> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
> >> >
> >> >This should be part of the common "failover" code.
> >> >
> >> 
> >> Also note that in the current patchset you use IFF_FAILOVER flag for
> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
> >> IFF_FAILOVER_SLAVE should be used.
> >
> >Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
> 
> No. IFF_SLAVE is for bonding.

What breaks if we reuse it for failover?
Jiri Pirko May 22, 2018, 1:26 p.m. UTC | #6
Tue, May 22, 2018 at 03:17:37PM CEST, mst@redhat.com wrote:
>On Tue, May 22, 2018 at 03:14:22PM +0200, Jiri Pirko wrote:
>> Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
>> >On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
>> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> >> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>> >> >>Use the registration/notification framework supported by the generic
>> >> >>failover infrastructure.
>> >> >>
>> >> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> >> >
>> >> >In previous patchset versions, the common code did
>> >> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> >> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
>> >> >
>> >> >This should be part of the common "failover" code.
>> >> >
>> >> 
>> >> Also note that in the current patchset you use IFF_FAILOVER flag for
>> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
>> >> IFF_FAILOVER_SLAVE should be used.
>> >
>> >Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
>> 
>> No. IFF_SLAVE is for bonding.
>
>What breaks if we reuse it for failover?

This is exposed to userspace. IFF_SLAVE is expected for bonding slaves.
And failover slave is not a bonding slave.
Michael S. Tsirkin May 22, 2018, 1:39 p.m. UTC | #7
On Tue, May 22, 2018 at 03:26:26PM +0200, Jiri Pirko wrote:
> Tue, May 22, 2018 at 03:17:37PM CEST, mst@redhat.com wrote:
> >On Tue, May 22, 2018 at 03:14:22PM +0200, Jiri Pirko wrote:
> >> Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
> >> >On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
> >> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
> >> >> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
> >> >> >>Use the registration/notification framework supported by the generic
> >> >> >>failover infrastructure.
> >> >> >>
> >> >> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> >> >> >
> >> >> >In previous patchset versions, the common code did
> >> >> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
> >> >> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
> >> >> >
> >> >> >This should be part of the common "failover" code.
> >> >> >
> >> >> 
> >> >> Also note that in the current patchset you use IFF_FAILOVER flag for
> >> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
> >> >> IFF_FAILOVER_SLAVE should be used.
> >> >
> >> >Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
> >> 
> >> No. IFF_SLAVE is for bonding.
> >
> >What breaks if we reuse it for failover?
> 
> This is exposed to userspace. IFF_SLAVE is expected for bonding slaves.
> And failover slave is not a bonding slave.

That does not really answer the question.  I'd claim it's sufficiently
like a bond slave for IFF_SLAVE to make sense.

In fact you will find that netvsc already sets IFF_SLAVE, and so
does e.g. the eql driver.

The advantage of using IFF_SLAVE is that userspace knows to skip it.  If
we don't set IFF_SLAVE existing userspace tries to use the lowerdev.
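
The userspace behaviour described above can be sketched roughly as follows. This is only an illustration of the assumption being discussed (tools ignore any interface whose flags word carries IFF_SLAVE); the helper names are hypothetical and do not come from any real tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <linux/if.h>   /* IFF_SLAVE and the other userspace-visible IFF_* flags */

/*
 * Hypothetical helper: decide whether a configuration tool should leave
 * an interface alone, given the flags word it read for that interface
 * (for example ifi_flags from an RTM_NEWLINK message).
 */
static bool demo_iface_should_be_skipped(unsigned int flags)
{
    /* A device marked IFF_SLAVE is expected to be driven via its master. */
    return (flags & IFF_SLAVE) != 0;
}

/* Count the entries in a flags array that the tool would still configure. */
static size_t demo_count_configurable(const unsigned int *flags, size_t n)
{
    size_t i, kept = 0;

    assert(flags != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (!demo_iface_should_be_skipped(flags[i]))
            kept++;
    return kept;
}
```

With this kind of check, a lower device that lacks IFF_SLAVE is counted as configurable, which is the situation the message above is worried about.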
Jiri Pirko May 22, 2018, 3:13 p.m. UTC | #8
Tue, May 22, 2018 at 03:39:33PM CEST, mst@redhat.com wrote:
>On Tue, May 22, 2018 at 03:26:26PM +0200, Jiri Pirko wrote:
>> Tue, May 22, 2018 at 03:17:37PM CEST, mst@redhat.com wrote:
>> >On Tue, May 22, 2018 at 03:14:22PM +0200, Jiri Pirko wrote:
>> >> Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
>> >> >On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
>> >> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> >> >> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>> >> >> >>Use the registration/notification framework supported by the generic
>> >> >> >>failover infrastructure.
>> >> >> >>
>> >> >> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> >> >> >
>> >> >> >In previous patchset versions, the common code did
>> >> >> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> >> >> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
>> >> >> >
>> >> >> >This should be part of the common "failover" code.
>> >> >> >
>> >> >> 
>> >> >> Also note that in the current patchset you use IFF_FAILOVER flag for
>> >> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
>> >> >> IFF_FAILOVER_SLAVE should be used.
>> >> >
>> >> >Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
>> >> 
>> >> No. IFF_SLAVE is for bonding.
>> >
>> >What breaks if we reuse it for failover?
>> 
>> This is exposed to userspace. IFF_SLAVE is expected for bonding slaves.
>> And failover slave is not a bonding slave.
>
>That does not really answer the question.  I'd claim it's sufficiently
>like a bond slave for IFF_SLAVE to make sense.
>
>In fact you will find that netvsc already sets IFF_SLAVE, and so

netvsc does the whole failover thing in a wrong way. This patchset is
trying to fix it.

>does e.g. the eql driver.
>
>The advantage of using IFF_SLAVE is that userspace knows to skip it.  If

The userspace should know how to skip other types of slaves - team,
bridge, ovs, etc. The "master link" should be the one to look at.


>we don't set IFF_SLAVE existing userspace tries to use the lowerdev.

Each master type has an IFF_ master flag and an IFF_ slave flag. In private
flags. I don't see any reason to break this pattern here.
Samudrala, Sridhar May 22, 2018, 3:28 p.m. UTC | #9
On 5/22/2018 2:08 AM, Jiri Pirko wrote:
> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>>> Use the registration/notification framework supported by the generic
>>> failover infrastructure.
>>>
>>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> In previous patchset versions, the common code did
>> netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> (netvsc_vf_join()). Now, this is still done in netvsc. Why?
>>
>> This should be part of the common "failover" code.

Based on Stephen's feedback on earlier patches, I tried to minimize the changes to
netvsc and only commonize the notifier and the main event handler routine.
Another complication is that netvsc does part of registration in a delayed workqueue.

It should be possible to move some of the code from net_failover.c to generic
failover.c in future if Stephen is ok with it.


>>
> Also note that in the current patchset you use IFF_FAILOVER flag for
> master, yet for the slave you use IFF_SLAVE. That is wrong.
> IFF_FAILOVER_SLAVE should be used.

Not sure which code you are referring to.  I only set IFF_FAILOVER_SLAVE
in patch 3.
Michael S. Tsirkin May 22, 2018, 3:32 p.m. UTC | #10
On Tue, May 22, 2018 at 05:13:43PM +0200, Jiri Pirko wrote:
> Tue, May 22, 2018 at 03:39:33PM CEST, mst@redhat.com wrote:
> >On Tue, May 22, 2018 at 03:26:26PM +0200, Jiri Pirko wrote:
> >> Tue, May 22, 2018 at 03:17:37PM CEST, mst@redhat.com wrote:
> >> >On Tue, May 22, 2018 at 03:14:22PM +0200, Jiri Pirko wrote:
> >> >> Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
> >> >> >On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
> >> >> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
> >> >> >> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
> >> >> >> >>Use the registration/notification framework supported by the generic
> >> >> >> >>failover infrastructure.
> >> >> >> >>
> >> >> >> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> >> >> >> >
> >> >> >> >In previous patchset versions, the common code did
> >> >> >> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
> >> >> >> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
> >> >> >> >
> >> >> >> >This should be part of the common "failover" code.
> >> >> >> >
> >> >> >> 
> >> >> >> Also note that in the current patchset you use IFF_FAILOVER flag for
> >> >> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
> >> >> >> IFF_FAILOVER_SLAVE should be used.
> >> >> >
> >> >> >Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
> >> >> 
> >> >> No. IFF_SLAVE is for bonding.
> >> >
> >> >What breaks if we reuse it for failover?
> >> 
> >> This is exposed to userspace. IFF_SLAVE is expected for bonding slaves.
> >> And failover slave is not a bonding slave.
> >
> >That does not really answer the question.  I'd claim it's sufficiently
> >like a bond slave for IFF_SLAVE to make sense.
> >
> >In fact you will find that netvsc already sets IFF_SLAVE, and so
> 
> netvsc does the whole failover thing in a wrong way. This patchset is
> trying to fix it.

Maybe, but we don't need gratuitous changes either, especially if they
break userspace.

> >does e.g. the eql driver.
> >
> >The advantage of using IFF_SLAVE is that userspace knows to skip it.  If
> 
> The userspace should know how to skip other types of slaves - team,
> bridge, ovs, etc.
> The "master link" should be the one to look at.
> 

How should existing userspace know which ones to skip and which one is
the master?  Right now userspace seems to assume whatever does not have
IFF_SLAVE should be looked at. Are you saying that's not the right thing
to do and userspace should be fixed? What should userspace do in
your opinion that will be forward compatible with future kernels?

> 
> >we don't set IFF_SLAVE existing userspace tries to use the lowerdev.
> 
> Each master type has an IFF_ master flag and an IFF_ slave flag.

Could you give some examples please?

> In private
> flags. I don't see any reason to break this pattern here.

Other masters are set up from userspace, this one is set up automatically
by kernel. So the bar is higher, we need an interface that existing
userspace knows about.  We can't just say "oh if userspace set this up
it should know to skip lowerdevs".

Otherwise multiple interfaces with same mac tend to confuse userspace.
Jiri Pirko May 22, 2018, 3:36 p.m. UTC | #11
Tue, May 22, 2018 at 05:28:42PM CEST, sridhar.samudrala@intel.com wrote:
>
>On 5/22/2018 2:08 AM, Jiri Pirko wrote:
>> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> > Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>> > > Use the registration/notification framework supported by the generic
>> > > failover infrastructure.
>> > > 
>> > > Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> > In previous patchset versions, the common code did
>> > netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> > (netvsc_vf_join()). Now, this is still done in netvsc. Why?
>> > 
>> > This should be part of the common "failover" code.
>
>Based on Stephen's feedback on earlier patches, I tried to minimize the changes to
>netvsc and only commonize the notifier and the main event handler routine.
>Another complication is that netvsc does part of registration in a delayed workqueue.

:( This kind of degrades the whole effort of having a single solution
in the "failover" module. I think that common parts, such as
netdev_rx_handler_register() and others, certainly should be inside
the common module. This is not a good time to minimize changes. Let's do
the thing properly and fix the netvsc mess now.


>
>It should be possible to move some of the code from net_failover.c to generic
>failover.c in future if Stephen is ok with it.
>
>
>> > 
>> Also note that in the current patchset you use IFF_FAILOVER flag for
>> master, yet for the slave you use IFF_SLAVE. That is wrong.
>> IFF_FAILOVER_SLAVE should be used.
>
>Not sure which code you are referring to.  I only set IFF_FAILOVER_SLAVE
>in patch 3.

The existing netvsc driver.
Jiri Pirko May 22, 2018, 3:45 p.m. UTC | #12
Tue, May 22, 2018 at 05:32:30PM CEST, mst@redhat.com wrote:
>On Tue, May 22, 2018 at 05:13:43PM +0200, Jiri Pirko wrote:
>> Tue, May 22, 2018 at 03:39:33PM CEST, mst@redhat.com wrote:
>> >On Tue, May 22, 2018 at 03:26:26PM +0200, Jiri Pirko wrote:
>> >> Tue, May 22, 2018 at 03:17:37PM CEST, mst@redhat.com wrote:
>> >> >On Tue, May 22, 2018 at 03:14:22PM +0200, Jiri Pirko wrote:
>> >> >> Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
>> >> >> >On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
>> >> >> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> >> >> >> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>> >> >> >> >>Use the registration/notification framework supported by the generic
>> >> >> >> >>failover infrastructure.
>> >> >> >> >>
>> >> >> >> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> >> >> >> >
>> >> >> >> >In previous patchset versions, the common code did
>> >> >> >> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> >> >> >> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
>> >> >> >> >
>> >> >> >> >This should be part of the common "failover" code.
>> >> >> >> >
>> >> >> >> 
>> >> >> >> Also note that in the current patchset you use IFF_FAILOVER flag for
>> >> >> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
>> >> >> >> IFF_FAILOVER_SLAVE should be used.
>> >> >> >
>> >> >> >Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
>> >> >> 
>> >> >> No. IFF_SLAVE is for bonding.
>> >> >
>> >> >What breaks if we reuse it for failover?
>> >> 
>> >> This is exposed to userspace. IFF_SLAVE is expected for bonding slaves.
>> >> And failover slave is not a bonding slave.
>> >
>> >That does not really answer the question.  I'd claim it's sufficiently
>> >like a bond slave for IFF_SLAVE to make sense.
>> >
>> >In fact you will find that netvsc already sets IFF_SLAVE, and so
>> 
>> netvsc does the whole failover thing in a wrong way. This patchset is
>> trying to fix it.
>
>Maybe, but we don't need gratuitous changes either, especially if they
>break userspace.

What do you mean by "break"? It was a mistake to reuse IFF_SLAVE in
the first place; let's fix it. If some userspace depends on that flag, it
is broken anyway.


>
>> >does e.g. the eql driver.
>> >
>> >The advantage of using IFF_SLAVE is that userspace knows to skip it.  If
>> 
>> The userspace should know how to skip other types of slaves - team,
>> bridge, ovs, etc.
>> The "master link" should be the one to look at.
>> 
>
>How should existing userspace know which ones to skip and which one is
>the master?  Right now userspace seems to assume whatever does not have
>IFF_SLAVE should be looked at. Are you saying that's not the right thing

Why do you say so? What do you mean by "looked at"? Certainly not.
IFLA_MASTER is the attribute that should be looked at, nothing else.
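
As a rough illustration of that suggestion, a userspace tool could look for IFLA_MASTER among the attributes of an RTM_NEWLINK message instead of testing IFF_SLAVE. The sketch below assumes the caller already has the attribute area and its length; the function name is hypothetical, while struct rtattr, the RTA_* macros and IFLA_MASTER come from the Linux uapi headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/if_link.h>    /* IFLA_MASTER */
#include <linux/rtnetlink.h>  /* struct rtattr, RTA_OK, RTA_NEXT, RTA_DATA */

/*
 * Hypothetical helper: walk a block of link attributes and return the
 * ifindex carried by IFLA_MASTER, or 0 if the link has no master.
 */
static int demo_link_master(const void *attrs, int len)
{
    const struct rtattr *rta = attrs;

    assert(attrs != NULL || len == 0);
    for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
        if (rta->rta_type == IFLA_MASTER &&
            RTA_PAYLOAD(rta) >= sizeof(int)) {
            int ifindex;

            /* Copy out rather than dereference, to avoid alignment traps. */
            memcpy(&ifindex, RTA_DATA(rta), sizeof(ifindex));
            return ifindex;
        }
    }
    return 0;
}
```

A non-zero result means the link is enslaved to some upper device, whatever its type, so the tool would configure the master instead.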


>to do and userspace should be fixed? What should userspace do in
>your opinion that will be forward compatible with future kernels?
>
>> 
>> >we don't set IFF_SLAVE existing userspace tries to use the lowerdev.
>> 
>> Each master type has an IFF_ master flag and an IFF_ slave flag.
>
>Could you give some examples please?

enum netdev_priv_flags {
        IFF_EBRIDGE                     = 1<<1,
        IFF_BRIDGE_PORT                 = 1<<9,
        IFF_OPENVSWITCH                 = 1<<20,
        IFF_OVS_DATAPATH                = 1<<10,
        IFF_L3MDEV_MASTER               = 1<<18,
        IFF_L3MDEV_SLAVE                = 1<<21,
        IFF_TEAM                        = 1<<22,
        IFF_TEAM_PORT                   = 1<<13,
};
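
To make the pairing in that excerpt explicit, here is a small standalone sketch that groups the listed bits into (master, lower) pairs and looks a device up by its private flags. The bit values are copied from the excerpt above; the table, struct and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bit values as listed in the enum excerpt above. */
enum {
    IFF_EBRIDGE       = 1 << 1,
    IFF_BRIDGE_PORT   = 1 << 9,
    IFF_OVS_DATAPATH  = 1 << 10,
    IFF_TEAM_PORT     = 1 << 13,
    IFF_L3MDEV_MASTER = 1 << 18,
    IFF_OPENVSWITCH   = 1 << 20,
    IFF_L3MDEV_SLAVE  = 1 << 21,
    IFF_TEAM          = 1 << 22,
};

/* One entry per master type: its master flag and its lower-device flag. */
struct demo_flag_pair {
    const char *type;
    unsigned int master_flag;
    unsigned int lower_flag;
};

static const struct demo_flag_pair demo_pairs[] = {
    { "bridge", IFF_EBRIDGE,       IFF_BRIDGE_PORT  },
    { "ovs",    IFF_OPENVSWITCH,   IFF_OVS_DATAPATH },
    { "l3mdev", IFF_L3MDEV_MASTER, IFF_L3MDEV_SLAVE },
    { "team",   IFF_TEAM,          IFF_TEAM_PORT    },
};

/* Return the master type this device is a lower device of, or NULL. */
static const char *demo_lower_type(unsigned int priv_flags)
{
    size_t i;

    for (i = 0; i < sizeof(demo_pairs) / sizeof(demo_pairs[0]); i++)
        if (priv_flags & demo_pairs[i].lower_flag)
            return demo_pairs[i].type;
    return NULL;
}

/* True if the device is a lower device of the named master type. */
static int demo_is_lower_of(unsigned int priv_flags, const char *type)
{
    const char *found = demo_lower_type(priv_flags);

    return found != NULL && strcmp(found, type) == 0;
}
```

Under this pattern, a failover pair would simply be one more row in the table.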


>
>> In private
>> flags. I don't see any reason to break this pattern here.
>
>Other masters are set up from userspace, this one is set up automatically
>by kernel. So the bar is higher, we need an interface that existing
>userspace knows about.  We can't just say "oh if userspace set this up
>it should know to skip lowerdevs".
>
>Otherwise multiple interfaces with same mac tend to confuse userspace.

No difference, really.
Regardless of who does the setup, an independent userspace daemon should
react accordingly.
Michael S. Tsirkin May 22, 2018, 3:46 p.m. UTC | #13
On Tue, May 22, 2018 at 05:36:14PM +0200, Jiri Pirko wrote:
> Tue, May 22, 2018 at 05:28:42PM CEST, sridhar.samudrala@intel.com wrote:
> >
> >On 5/22/2018 2:08 AM, Jiri Pirko wrote:
> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
> >> > Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
> >> > > Use the registration/notification framework supported by the generic
> >> > > failover infrastructure.
> >> > > 
> >> > > Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> >> > In previous patchset versions, the common code did
> >> > netdev_rx_handler_register() and netdev_upper_dev_link() etc
> >> > (netvsc_vf_join()). Now, this is still done in netvsc. Why?
> >> > 
> >> > This should be part of the common "failover" code.
> >
> >Based on Stephen's feedback on earlier patches, I tried to minimize the changes to
> >netvsc and only commonize the notifier and the main event handler routine.
> >Another complication is that netvsc does part of registration in a delayed workqueue.
> 
> :( This kind of degrades the whole effort of having a single solution
> in the "failover" module. I think that common parts, such as
> netdev_rx_handler_register() and others, certainly should be inside
> the common module. This is not a good time to minimize changes. Let's do
> the thing properly and fix the netvsc mess now.
> 
> 
> >
> >It should be possible to move some of the code from net_failover.c to generic
> >failover.c in future if Stephen is ok with it.
> >
> >
> >> > 
> >> Also note that in the current patchset you use IFF_FAILOVER flag for
> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
> >> IFF_FAILOVER_SLAVE should be used.
> >
> >Not sure which code you are referring to.  I only set IFF_FAILOVER_SLAVE
> >in patch 3.
> 
> The existing netvsc driver.

We really can't change netvsc's flags now, even if its interface is
messy, it's being used in the field. We can add a flag that makes netvsc
behave differently, and if this flag also allows enhanced functionality
userspace will gradually switch.

Anything breaking userspace I fully expect Stephen to nack and
IMO with good reason.
Jiri Pirko May 22, 2018, 4:12 p.m. UTC | #14
Fixing the subj, sorry about that.

Tue, May 22, 2018 at 05:46:21PM CEST, mst@redhat.com wrote:
>On Tue, May 22, 2018 at 05:36:14PM +0200, Jiri Pirko wrote:
>> Tue, May 22, 2018 at 05:28:42PM CEST, sridhar.samudrala@intel.com wrote:
>> >
>> >On 5/22/2018 2:08 AM, Jiri Pirko wrote:
>> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> >> > Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>> >> > > Use the registration/notification framework supported by the generic
>> >> > > failover infrastructure.
>> >> > > 
>> >> > > Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> >> > In previous patchset versions, the common code did
>> >> > netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> >> > (netvsc_vf_join()). Now, this is still done in netvsc. Why?
>> >> > 
>> >> > This should be part of the common "failover" code.
>> >
>> >Based on Stephen's feedback on earlier patches, I tried to minimize the changes to
>> >netvsc and only commonize the notifier and the main event handler routine.
>> >Another complication is that netvsc does part of registration in a delayed workqueue.
>> 
>> :( This kind of degrades the whole effort of having a single solution
>> in the "failover" module. I think that common parts, such as
>> netdev_rx_handler_register() and others, certainly should be inside
>> the common module. This is not a good time to minimize changes. Let's do
>> the thing properly and fix the netvsc mess now.
>> 
>> 
>> >
>> >It should be possible to move some of the code from net_failover.c to generic
>> >failover.c in future if Stephen is ok with it.
>> >
>> >
>> >> > 
>> >> Also note that in the current patchset you use IFF_FAILOVER flag for
>> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
>> >> IFF_FAILOVER_SLAVE should be used.
>> >
>> >Not sure which code you are referring to.  I only set IFF_FAILOVER_SLAVE
>> >in patch 3.
>> 
>> The existing netvsc driver.
>
>We really can't change netvsc's flags now, even if its interface is
>messy, it's being used in the field. We can add a flag that makes netvsc
>behave differently, and if this flag also allows enhanced functionality
>userspace will gradually switch.

Okay, although in this case, it really does not make much sense, so be
it. Leave netvsc setting the IFF_SLAVE flag as it is doing
now. (This once-wrong-forever-wrong policy is frustrating me.)

But since this patchset introduces the private flags IFF_FAILOVER and
IFF_FAILOVER_SLAVE, and we set IFF_FAILOVER on the netvsc netdev
instance, we should also set IFF_FAILOVER_SLAVE on the enslaved VF
netdevice to get at least some consistency between virtio_net and
netvsc.


>
>Anything breaking userspace I fully expect Stephen to nack and
>IMO with good reason.
>
>-- 
>MST
Michael S. Tsirkin May 22, 2018, 4:52 p.m. UTC | #15
On Tue, May 22, 2018 at 05:45:01PM +0200, Jiri Pirko wrote:
> Tue, May 22, 2018 at 05:32:30PM CEST, mst@redhat.com wrote:
> >On Tue, May 22, 2018 at 05:13:43PM +0200, Jiri Pirko wrote:
> >> Tue, May 22, 2018 at 03:39:33PM CEST, mst@redhat.com wrote:
> >> >On Tue, May 22, 2018 at 03:26:26PM +0200, Jiri Pirko wrote:
> >> >> Tue, May 22, 2018 at 03:17:37PM CEST, mst@redhat.com wrote:
> >> >> >On Tue, May 22, 2018 at 03:14:22PM +0200, Jiri Pirko wrote:
> >> >> >> Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
> >> >> >> >On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
> >> >> >> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
> >> >> >> >> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
> >> >> >> >> >>Use the registration/notification framework supported by the generic
> >> >> >> >> >>failover infrastructure.
> >> >> >> >> >>
> >> >> >> >> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> >> >> >> >> >
> >> >> >> >> >In previous patchset versions, the common code did
> >> >> >> >> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
> >> >> >> >> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
> >> >> >> >> >
> >> >> >> >> >This should be part of the common "failover" code.
> >> >> >> >> >
> >> >> >> >> 
> >> >> >> >> Also note that in the current patchset you use IFF_FAILOVER flag for
> >> >> >> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
> >> >> >> >> IFF_FAILOVER_SLAVE should be used.
> >> >> >> >
> >> >> >> >Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
> >> >> >> 
> >> >> >> No. IFF_SLAVE is for bonding.
> >> >> >
> >> >> >What breaks if we reuse it for failover?
> >> >> 
> >> >> This is exposed to userspace. IFF_SLAVE is expected for bonding slaves.
> >> >> And failover slave is not a bonding slave.
> >> >
> >> >That does not really answer the question.  I'd claim it's sufficiently
> >> >like a bond slave for IFF_SLAVE to make sense.
> >> >
> >> >In fact you will find that netvsc already sets IFF_SLAVE, and so
> >> 
> >> netvsc does the whole failover thing in a wrong way. This patchset is
> >> trying to fix it.
> >
> >Maybe, but we don't need gratuitous changes either, especially if they
> >break userspace.
> 
> What do you mean by "break"? It was a mistake to reuse IFF_SLAVE in
> the first place; let's fix it. If some userspace depends on that flag, it
> is broken anyway.
> 
> 
> >
> >> >does e.g. the eql driver.
> >> >
> >> >The advantage of using IFF_SLAVE is that userspace knows to skip it.  If
> >> 
> >> The userspace should know how to skip other types of slaves - team,
> >> bridge, ovs, etc.
> >> The "master link" should be the one to look at.
> >> 
> >
> >How should existing userspace know which ones to skip and which one is
> >the master?  Right now userspace seems to assume whatever does not have
> >IFF_SLAVE should be looked at. Are you saying that's not the right thing
> 
> Why do you say so? What do you mean by "looked at"? Certainly not.
> IFLA_MASTER is the attribute that should be looked at, nothing else.
> 
> 
> >to do and userspace should be fixed? What should userspace do in
> >your opinion that will be forward compatible with future kernels?
> >
> >> 
> >> >we don't set IFF_SLAVE existing userspace tries to use the lowerdev.
> >> 
> >> Each master type has an IFF_ master flag and an IFF_ slave flag.
> >
> >Could you give some examples please?
> 
> enum netdev_priv_flags {
>         IFF_EBRIDGE                     = 1<<1,
>         IFF_BRIDGE_PORT                 = 1<<9,
>         IFF_OPENVSWITCH                 = 1<<20,
>         IFF_OVS_DATAPATH                = 1<<10,
>         IFF_L3MDEV_MASTER               = 1<<18,
>         IFF_L3MDEV_SLAVE                = 1<<21,
>         IFF_TEAM                        = 1<<22,
>         IFF_TEAM_PORT                   = 1<<13,
> };

That's not in uapi, is it? The comment above it says:

These flags are invisible to userspace



> 
> >
> >> In private
> >> flags. I don't see any reason to break this pattern here.
> >
> >Other masters are set up from userspace, this one is set up automatically
> >by kernel. So the bar is higher, we need an interface that existing
> >userspace knows about.  We can't just say "oh if userspace set this up
> >it should know to skip lowerdevs".
> >
> >Otherwise multiple interfaces with same mac tend to confuse userspace.
> 
> No difference, really.
> Regardless of who does the setup, an independent userspace daemon should
> react accordingly.

If the daemon does the setup itself, it's reasonable to require that it
learns about new flags each time we add a new driver.  If it doesn't,
then I think it's less reasonable.
Jiri Pirko May 22, 2018, 5:38 p.m. UTC | #16
Tue, May 22, 2018 at 06:52:21PM CEST, mst@redhat.com wrote:
>On Tue, May 22, 2018 at 05:45:01PM +0200, Jiri Pirko wrote:
>> Tue, May 22, 2018 at 05:32:30PM CEST, mst@redhat.com wrote:
>> >On Tue, May 22, 2018 at 05:13:43PM +0200, Jiri Pirko wrote:
>> >> Tue, May 22, 2018 at 03:39:33PM CEST, mst@redhat.com wrote:
>> >> >On Tue, May 22, 2018 at 03:26:26PM +0200, Jiri Pirko wrote:
>> >> >> Tue, May 22, 2018 at 03:17:37PM CEST, mst@redhat.com wrote:
>> >> >> >On Tue, May 22, 2018 at 03:14:22PM +0200, Jiri Pirko wrote:
>> >> >> >> Tue, May 22, 2018 at 03:12:40PM CEST, mst@redhat.com wrote:
>> >> >> >> >On Tue, May 22, 2018 at 11:08:53AM +0200, Jiri Pirko wrote:
>> >> >> >> >> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> >> >> >> >> >Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>> >> >> >> >> >>Use the registration/notification framework supported by the generic
>> >> >> >> >> >>failover infrastructure.
>> >> >> >> >> >>
>> >> >> >> >> >>Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> >> >> >> >> >
>> >> >> >> >> >In previous patchset versions, the common code did
>> >> >> >> >> >netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> >> >> >> >> >(netvsc_vf_join()). Now, this is still done in netvsc. Why?
>> >> >> >> >> >
>> >> >> >> >> >This should be part of the common "failover" code.
>> >> >> >> >> >
>> >> >> >> >> 
>> >> >> >> >> Also note that in the current patchset you use IFF_FAILOVER flag for
>> >> >> >> >> master, yet for the slave you use IFF_SLAVE. That is wrong.
>> >> >> >> >> IFF_FAILOVER_SLAVE should be used.
>> >> >> >> >
>> >> >> >> >Or drop IFF_FAILOVER_SLAVE and set both IFF_FAILOVER and IFF_SLAVE?
>> >> >> >> 
>> >> >> >> No. IFF_SLAVE is for bonding.
>> >> >> >
>> >> >> >What breaks if we reuse it for failover?
>> >> >> 
>> >> >> This is exposed to userspace. IFF_SLAVE is expected for bonding slaves.
>> >> >> And failover slave is not a bonding slave.
>> >> >
>> >> >That does not really answer the question.  I'd claim it's sufficiently
>> >> >like a bond slave for IFF_SLAVE to make sense.
>> >> >
>> >> >In fact you will find that netvsc already sets IFF_SLAVE, and so
>> >> 
>> >> netvsc does the whole failover thing in a wrong way. This patchset is
>> >> trying to fix it.
>> >
>> >Maybe, but we don't need gratuitous changes either, especially if they
>> >break userspace.
>> 
>> What do you mean by the "break"? It was a mistake to reuse IFF_SLAVE at
>> the first place, lets fix it. If some userspace depends on that flag, it
>> is broken anyway.
>> 
>> 
>> >
>> >> >does e.g. the eql driver.
>> >> >
>> >> >The advantage of using IFF_SLAVE is that userspace knows to skip it.  If
>> >> 
>> >> The userspace should know how to skip other types of slaves - team,
>> >> bridge, ovs, etc.
>> >> The "master link" should be the one to look at.
>> >> 
>> >
>> >How should existing userspace know which ones to skip and which one is
>> >the master?  Right now userspace seems to assume whatever does not have
>> >IFF_SLAVE should be looked at. Are you saying that's not the right thing
>> 
>> Why do you say so? What do you mean by "looked at"? Certainly not.
>> IFLA_MASTER is the attribute that should be looked at, nothing else.
>> 
>> 
>> >to do and userspace should be fixed? What should userspace do in
>> >your opinion that will be forward compatible with future kernels?
>> >
>> >> 
>> >> >we don't set IFF_SLAVE existing userspace tries to use the lowerdev.
>> >> 
>> >> Each master type has a IFF_ master flag and IFF_ slave flag.
>> >
>> >Could you give some examples please?
>> 
>> enum netdev_priv_flags {
>>         IFF_EBRIDGE                     = 1<<1,
>>         IFF_BRIDGE_PORT                 = 1<<9,
>>         IFF_OPENVSWITCH                 = 1<<20,
>>         IFF_OVS_DATAPATH                = 1<<10,
>> 	IFF_L3MDEV_MASTER               = 1<<18,
>>         IFF_L3MDEV_SLAVE                = 1<<21,
>>         IFF_TEAM                        = 1<<22,
>>         IFF_TEAM_PORT                   = 1<<13,
>> };
>
>That's not in uapi, is it?  the comment above that says:

Correct.


>
>These flags are invisible to userspace
>
>
>
>> 
>> >
>> >> In private
>> >> flag. I don't see no reason to break this pattern here.
>> >
>> >Other masters are setup from userspace, this one is set up automatically
>> >by kernel. So the bar is higher, we need an interface that existing
>> >userspace knows about.  We can't just say "oh if userspace set this up
>> >it should know to skip lowerdevs".
>> >
>> >Otherwise multiple interfaces with same mac tend to confuse userspace.
>> 
>> No difference, really.
>> Regardless of who does the setup, an independent userspace daemon should
>> react accordingly.
>
>If the daemon does the setup itself, it's reasonable to require that it
>learns about new flags each time we add a new driver.  If it doesn't,
>then I think it's less reasonable.

No need. The "IFLA_MASTER" attr is always there to be looked at. That is
enough.
Michael S. Tsirkin May 22, 2018, 7:54 p.m. UTC | #17
On Tue, May 22, 2018 at 07:38:44PM +0200, Jiri Pirko wrote:
> >> >> In private
> >> >> flag. I don't see no reason to break this pattern here.
> >> >
> >> >Other masters are setup from userspace, this one is set up automatically
> >> >by kernel. So the bar is higher, we need an interface that existing
> >> >userspace knows about.  We can't just say "oh if userspace set this up
> >> >it should know to skip lowerdevs".
> >> >
> >> >Otherwise multiple interfaces with same mac tend to confuse userspace.
> >> 
> >> No difference, really.
> >> Regardless of who does the setup, an independent userspace daemon should
> >> react accordingly.
> >
> >If the daemon does the setup itself, it's reasonable to require that it
> >learns about new flags each time we add a new driver.  If it doesn't,
> >then I think it's less reasonable.
> 
> No need. The "IFLA_MASTER" attr is always there to be looked at. That is
> enough.

Oh, so if it has a master, skip it? Sorry, I misunderstood what you were
saying earlier.

Thanks, this makes sense to me.
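
For illustration only, here is a minimal userspace sketch of the check
discussed above: look for IFLA_MASTER in an RTM_NEWLINK message and skip
the link if it is present. It assumes the daemon already receives link
messages from a NETLINK_ROUTE socket; link_master_ifindex() is a
hypothetical helper name, not an existing API.

```c
#include <linux/rtnetlink.h>
#include <linux/if_link.h>
#include <string.h>

/*
 * Hypothetical helper: return the ifindex carried in IFLA_MASTER for an
 * RTM_NEWLINK message, or 0 if the link has no master (or the message
 * is not a well-formed RTM_NEWLINK).
 */
static int link_master_ifindex(struct nlmsghdr *nlh)
{
	struct ifinfomsg *ifi;
	struct rtattr *rta;
	int len;

	if (nlh->nlmsg_type != RTM_NEWLINK)
		return 0;
	if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifi)))
		return 0;

	ifi = NLMSG_DATA(nlh);
	rta = IFLA_RTA(ifi);
	len = IFLA_PAYLOAD(nlh);

	/* Walk the attributes that follow struct ifinfomsg */
	for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
		if (rta->rta_type == IFLA_MASTER &&
		    RTA_PAYLOAD(rta) >= sizeof(int)) {
			int master;

			memcpy(&master, RTA_DATA(rta), sizeof(master));
			return master;
		}
	}

	return 0;
}
```

A daemon following this approach would ignore any link for which the
helper returns non-zero, regardless of which kind of master (bond, team,
bridge, failover) enslaved it.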
Samudrala, Sridhar May 22, 2018, 8:54 p.m. UTC | #18
On 5/22/2018 9:12 AM, Jiri Pirko wrote:
> Fixing the subj, sorry about that.
>
> Tue, May 22, 2018 at 05:46:21PM CEST, mst@redhat.com wrote:
>> On Tue, May 22, 2018 at 05:36:14PM +0200, Jiri Pirko wrote:
>>> Tue, May 22, 2018 at 05:28:42PM CEST, sridhar.samudrala@intel.com wrote:
>>>> On 5/22/2018 2:08 AM, Jiri Pirko wrote:
>>>>> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>>>>>> Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>>>>>>> Use the registration/notification framework supported by the generic
>>>>>>> failover infrastructure.
>>>>>>>
>>>>>>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>>>>> In previous patchset versions, the common code did
>>>>>> netdev_rx_handler_register() and netdev_upper_dev_link() etc
>>>>>> (netvsc_vf_join()). Now, this is still done in netvsc. Why?
>>>>>>
>>>>>> This should be part of the common "failover" code.
>>>> Based on Stephen's feedback on earlier patches, i tried to minimize the changes to
>>>> netvsc and only commonize the notifier and the main event handler routine.
>>>> Another complication is that netvsc does part of registration in a delayed workqueue.
>>> :( This kind of degrades the whole effort of having a single solution
>>> in the "failover" module. I think that common parts, such as
>>> netdev_rx_handler_register() and others, certainly should be inside
>>> the common module. This is not a good time to minimize changes. Let's do
>>> the thing properly and fix the netvsc mess now.
>>>
>>>
>>>> It should be possible to move some of the code from net_failover.c to generic
>>>> failover.c in future if Stephen is ok with it.
>>>>
>>>>
>>>>> Also note that in the current patchset you use IFF_FAILOVER flag for
>>>>> master, yet for the slave you use IFF_SLAVE. That is wrong.
>>>>> IFF_FAILOVER_SLAVE should be used.
>>>> Not sure which code you are referring to.  I only set IFF_FAILOVER_SLAVE
>>>> in patch 3.
>>> The existing netvsc driver.
>> We really can't change netvsc's flags now, even if its interface is
>> messy, it's being used in the field. We can add a flag that makes netvsc
>> behave differently, and if this flag also allows enhanced functionality
>> userspace will gradually switch.
> Okay, although in this case, it really does not make much sense, so be
> it. Leave the netvsc set the ->priv flag to IFF_SLAVE as it is doing
> now. (This once-wrong-forever-wrong policy is frustrating me).
>
> But since this patchset introduces private flag IFF_FAILOVER and
> IFF_FAILOVER_SLAVE, and we set IFF_FAILOVER to the netvsc netdev
> instance, we should also set IFF_FAILOVER_SLAVE to the enslaved VF
> netdevice to get at least some consistency between virtio_net and
> netvsc.

OK. I can make this change to set/unset IFF_FAILOVER_SLAVE in the netvsc
register/unregister routines so that it is consistent with virtio_net.

Based on your discussion with mst, I think we can even remove the IFF_SLAVE
setting on netvsc, as it should not impact userspace.  If Stephen is OK
with it, we can make this change too.

Do you see any other items that need to be resolved for this series to go in
this merge window?



>
>> Anything breaking userspace I fully expect Stephen to nack and
>> IMO with good reason.
>>
>> -- 
>> MST
Jiri Pirko May 23, 2018, 6:27 a.m. UTC | #19
Tue, May 22, 2018 at 10:54:29PM CEST, sridhar.samudrala@intel.com wrote:
>
>
>On 5/22/2018 9:12 AM, Jiri Pirko wrote:
>> Fixing the subj, sorry about that.
>> 
>> Tue, May 22, 2018 at 05:46:21PM CEST, mst@redhat.com wrote:
>> > On Tue, May 22, 2018 at 05:36:14PM +0200, Jiri Pirko wrote:
>> > > Tue, May 22, 2018 at 05:28:42PM CEST, sridhar.samudrala@intel.com wrote:
>> > > > On 5/22/2018 2:08 AM, Jiri Pirko wrote:
>> > > > > Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>> > > > > > Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>> > > > > > > Use the registration/notification framework supported by the generic
>> > > > > > > failover infrastructure.
>> > > > > > > 
>> > > > > > > Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> > > > > > In previous patchset versions, the common code did
>> > > > > > netdev_rx_handler_register() and netdev_upper_dev_link() etc
>> > > > > > (netvsc_vf_join()). Now, this is still done in netvsc. Why?
>> > > > > > 
>> > > > > > This should be part of the common "failover" code.
>> > > > Based on Stephen's feedback on earlier patches, i tried to minimize the changes to
>> > > > netvsc and only commonize the notifier and the main event handler routine.
>> > > > Another complication is that netvsc does part of registration in a delayed workqueue.
>> > > :( This kind of degrades the whole effort of having a single solution
>> > > in the "failover" module. I think that common parts, such as
>> > > netdev_rx_handler_register() and others, certainly should be inside
>> > > the common module. This is not a good time to minimize changes. Let's do
>> > > the thing properly and fix the netvsc mess now.
>> > > 
>> > > 
>> > > > It should be possible to move some of the code from net_failover.c to generic
>> > > > failover.c in future if Stephen is ok with it.
>> > > > 
>> > > > 
>> > > > > Also note that in the current patchset you use IFF_FAILOVER flag for
>> > > > > master, yet for the slave you use IFF_SLAVE. That is wrong.
>> > > > > IFF_FAILOVER_SLAVE should be used.
>> > > > Not sure which code you are referring to.  I only set IFF_FAILOVER_SLAVE
>> > > > in patch 3.
>> > > The existing netvsc driver.
>> > We really can't change netvsc's flags now, even if its interface is
>> > messy, it's being used in the field. We can add a flag that makes netvsc
>> > behave differently, and if this flag also allows enhanced functionality
>> > userspace will gradually switch.
>> Okay, although in this case, it really does not make much sense, so be
>> it. Leave the netvsc set the ->priv flag to IFF_SLAVE as it is doing
>> now. (This once-wrong-forever-wrong policy is frustrating me).
>> 
>> But since this patchset introduces private flag IFF_FAILOVER and
>> IFF_FAILOVER_SLAVE, and we set IFF_FAILOVER to the netvsc netdev
>> instance, we should also set IFF_FAILOVER_SLAVE to the enslaved VF
>> netdevice to get at least some consistency between virtio_net and
>> netvsc.
>
>OK. I can make this change to set/unset IFF_FAILOVER_SLAVE in the netvsc
>register/unregister routines so that it is consistent with virtio_net.
>
>Based on your discussion with mst, i think we can even remove IFF_SLAVE
>setting on netvsc as it should not impact userspace.  If Stephen is OK
>we can make this change too.
>
>Do you see any other items that need to be resolved for this series to go in
>this merge window?

As I wrote previously, the common parts, including rx_handler registration
and the setting of flags and the master link, should be done in common code,
moved out of the netvsc code.

Thanks.


>
>
>
>> 
>> > Anything breaking userspace I fully expect Stephen to nack and
>> > IMO with good reason.
>> > 
>> > -- 
>> > MST
>
Samudrala, Sridhar May 23, 2018, 4:16 p.m. UTC | #20
On 5/22/2018 11:27 PM, Jiri Pirko wrote:
> Tue, May 22, 2018 at 10:54:29PM CEST, sridhar.samudrala@intel.com wrote:
>>
>> On 5/22/2018 9:12 AM, Jiri Pirko wrote:
>>> Fixing the subj, sorry about that.
>>>
>>> Tue, May 22, 2018 at 05:46:21PM CEST, mst@redhat.com wrote:
>>>> On Tue, May 22, 2018 at 05:36:14PM +0200, Jiri Pirko wrote:
>>>>> Tue, May 22, 2018 at 05:28:42PM CEST, sridhar.samudrala@intel.com wrote:
>>>>>> On 5/22/2018 2:08 AM, Jiri Pirko wrote:
>>>>>>> Tue, May 22, 2018 at 11:06:37AM CEST, jiri@resnulli.us wrote:
>>>>>>>> Tue, May 22, 2018 at 04:06:18AM CEST, sridhar.samudrala@intel.com wrote:
>>>>>>>>> Use the registration/notification framework supported by the generic
>>>>>>>>> failover infrastructure.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>>>>>>> In previous patchset versions, the common code did
>>>>>>>> netdev_rx_handler_register() and netdev_upper_dev_link() etc
>>>>>>>> (netvsc_vf_join()). Now, this is still done in netvsc. Why?
>>>>>>>>
>>>>>>>> This should be part of the common "failover" code.
>>>>>> Based on Stephen's feedback on earlier patches, i tried to minimize the changes to
>>>>>> netvsc and only commonize the notifier and the main event handler routine.
>>>>>> Another complication is that netvsc does part of registration in a delayed workqueue.
>>>>> :( This kind of degrades the whole effort of having a single solution
>>>>> in the "failover" module. I think that common parts, such as
>>>>> netdev_rx_handler_register() and others, certainly should be inside
>>>>> the common module. This is not a good time to minimize changes. Let's do
>>>>> the thing properly and fix the netvsc mess now.
>>>>>
>>>>>
>>>>>> It should be possible to move some of the code from net_failover.c to generic
>>>>>> failover.c in future if Stephen is ok with it.
>>>>>>
>>>>>>
>>>>>>> Also note that in the current patchset you use IFF_FAILOVER flag for
>>>>>>> master, yet for the slave you use IFF_SLAVE. That is wrong.
>>>>>>> IFF_FAILOVER_SLAVE should be used.
>>>>>> Not sure which code you are referring to.  I only set IFF_FAILOVER_SLAVE
>>>>>> in patch 3.
>>>>> The existing netvsc driver.
>>>> We really can't change netvsc's flags now, even if its interface is
>>>> messy, it's being used in the field. We can add a flag that makes netvsc
>>>> behave differently, and if this flag also allows enhanced functionality
>>>> userspace will gradually switch.
>>> Okay, although in this case, it really does not make much sense, so be
>>> it. Leave the netvsc set the ->priv flag to IFF_SLAVE as it is doing
>>> now. (This once-wrong-forever-wrong policy is frustrating me).
>>>
>>> But since this patchset introduces private flag IFF_FAILOVER and
>>> IFF_FAILOVER_SLAVE, and we set IFF_FAILOVER to the netvsc netdev
>>> instance, we should also set IFF_FAILOVER_SLAVE to the enslaved VF
>>> netdevice to get at least some consistency between virtio_net and
>>> netvsc.
>> OK. I can make this change to set/unset IFF_FAILOVER_SLAVE in the netvsc
>> register/unregister routines so that it is consistent with virtio_net.
>>
>> Based on your discussion with mst, i think we can even remove IFF_SLAVE
>> setting on netvsc as it should not impact userspace.  If Stephen is OK
>> we can make this change too.
>>
>> Do you see any other items that need to be resolved for this series to go in
>> this merge window?
> As I wrote previously, the common code including rx_handler registration
> and setting of flags and master link should be done in a common code,
> moved away from netvsc code.
>
This requires re-introducing the two additional ops, pre_register and
pre_unregister, that I removed in the last couple of revisions to minimize
netvsc changes and the indirect calls that Stephen expressed some concern about.

But as these calls don't happen in the hot path, I guess it should not be a big
issue and is the right way to go.
I will submit a v12 with these updates.
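
To make the shape of that change concrete, here is a rough userspace
sketch (not the v12 code) of how a pre_register hook could let the driver
veto a slave before the common code does the shared setup. All names
below (netdev_sketch, failover_ops_sketch, common_slave_register, the
demo_* callbacks) are hypothetical stand-ins, and the "linked" field only
represents the rx_handler/master-link/flag setup that would live in the
common module.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; not the kernel type. */
struct netdev_sketch {
	int has_slave;	/* failover side: a slave is already attached */
	int linked;	/* slave side: common setup has been done */
};

/* Hypothetical ops table with the extra pre_register hook. */
struct failover_ops_sketch {
	int (*slave_pre_register)(struct netdev_sketch *slave,
				  struct netdev_sketch *failover);
	int (*slave_register)(struct netdev_sketch *slave,
			      struct netdev_sketch *failover);
};

/* Common code: driver veto first, then shared setup, then driver hook. */
static int common_slave_register(struct netdev_sketch *slave,
				 struct netdev_sketch *failover,
				 const struct failover_ops_sketch *ops)
{
	int err;

	if (ops->slave_pre_register) {
		err = ops->slave_pre_register(slave, failover);
		if (err)
			return err;
	}

	/* Stands in for rx_handler registration, master link and flags */
	slave->linked = 1;

	if (ops->slave_register) {
		err = ops->slave_register(slave, failover);
		if (err) {
			slave->linked = 0;	/* undo the common setup */
			return err;
		}
	}

	return 0;
}

/* Example driver callbacks: allow only one slave per failover device. */
static int demo_pre_register(struct netdev_sketch *slave,
			     struct netdev_sketch *failover)
{
	(void)slave;
	return failover->has_slave ? -EBUSY : 0;
}

static int demo_register(struct netdev_sketch *slave,
			 struct netdev_sketch *failover)
{
	(void)slave;
	failover->has_slave = 1;
	return 0;
}

static const struct failover_ops_sketch demo_ops = {
	.slave_pre_register	= demo_pre_register,
	.slave_register		= demo_register,
};
```

Since registration is a control-path operation, the extra indirect calls
in this arrangement stay out of the per-packet path.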

Patch

diff --git a/drivers/net/hyperv/Kconfig b/drivers/net/hyperv/Kconfig
index 0765d5f61714..23a2d145813a 100644
--- a/drivers/net/hyperv/Kconfig
+++ b/drivers/net/hyperv/Kconfig
@@ -2,5 +2,6 @@  config HYPERV_NET
 	tristate "Microsoft Hyper-V virtual network driver"
 	depends on HYPERV
 	select UCS2_STRING
+	select FAILOVER
 	help
 	  Select this option to enable the Hyper-V virtual network driver.
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 1be34d2e3563..99d8e7398a5b 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -932,6 +932,8 @@  struct net_device_context {
 	u32 vf_alloc;
 	/* Serial number of the VF to team with */
 	u32 vf_serial;
+
+	struct failover *failover;
 };
 
 /* Per channel data */
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index da07ccdf84bf..6c77a81b7266 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -43,6 +43,7 @@ 
 #include <net/pkt_sched.h>
 #include <net/checksum.h>
 #include <net/ip6_checksum.h>
+#include <net/failover.h>
 
 #include "hyperv_net.h"
 
@@ -1763,46 +1764,6 @@  static void netvsc_link_change(struct work_struct *w)
 	rtnl_unlock();
 }
 
-static struct net_device *get_netvsc_bymac(const u8 *mac)
-{
-	struct net_device *dev;
-
-	ASSERT_RTNL();
-
-	for_each_netdev(&init_net, dev) {
-		if (dev->netdev_ops != &device_ops)
-			continue;	/* not a netvsc device */
-
-		if (ether_addr_equal(mac, dev->perm_addr))
-			return dev;
-	}
-
-	return NULL;
-}
-
-static struct net_device *get_netvsc_byref(struct net_device *vf_netdev)
-{
-	struct net_device *dev;
-
-	ASSERT_RTNL();
-
-	for_each_netdev(&init_net, dev) {
-		struct net_device_context *net_device_ctx;
-
-		if (dev->netdev_ops != &device_ops)
-			continue;	/* not a netvsc device */
-
-		net_device_ctx = netdev_priv(dev);
-		if (!rtnl_dereference(net_device_ctx->nvdev))
-			continue;	/* device is removed */
-
-		if (rtnl_dereference(net_device_ctx->vf_netdev) == vf_netdev)
-			return dev;	/* a match */
-	}
-
-	return NULL;
-}
-
 /* Called when VF is injecting data into network stack.
  * Change the associated network device from VF to netvsc.
  * note: already called with rcu_read_lock
@@ -1915,24 +1876,15 @@  static void netvsc_vf_setup(struct work_struct *w)
 	rtnl_unlock();
 }
 
-static int netvsc_register_vf(struct net_device *vf_netdev)
+static int netvsc_register_vf(struct net_device *vf_netdev,
+			      struct net_device *ndev)
 {
-	struct net_device *ndev;
 	struct net_device_context *net_device_ctx;
 	struct netvsc_device *netvsc_dev;
 
 	if (vf_netdev->addr_len != ETH_ALEN)
 		return NOTIFY_DONE;
 
-	/*
-	 * We will use the MAC address to locate the synthetic interface to
-	 * associate with the VF interface. If we don't find a matching
-	 * synthetic interface, move on.
-	 */
-	ndev = get_netvsc_bymac(vf_netdev->perm_addr);
-	if (!ndev)
-		return NOTIFY_DONE;
-
 	net_device_ctx = netdev_priv(ndev);
 	netvsc_dev = rtnl_dereference(net_device_ctx->nvdev);
 	if (!netvsc_dev || rtnl_dereference(net_device_ctx->vf_netdev))
@@ -1949,17 +1901,13 @@  static int netvsc_register_vf(struct net_device *vf_netdev)
 }
 
 /* VF up/down change detected, schedule to change data path */
-static int netvsc_vf_changed(struct net_device *vf_netdev)
+static int netvsc_vf_changed(struct net_device *vf_netdev,
+			     struct net_device *ndev)
 {
 	struct net_device_context *net_device_ctx;
 	struct netvsc_device *netvsc_dev;
-	struct net_device *ndev;
 	bool vf_is_up = netif_running(vf_netdev);
 
-	ndev = get_netvsc_byref(vf_netdev);
-	if (!ndev)
-		return NOTIFY_DONE;
-
 	net_device_ctx = netdev_priv(ndev);
 	netvsc_dev = rtnl_dereference(net_device_ctx->nvdev);
 	if (!netvsc_dev)
@@ -1972,15 +1920,11 @@  static int netvsc_vf_changed(struct net_device *vf_netdev)
 	return NOTIFY_OK;
 }
 
-static int netvsc_unregister_vf(struct net_device *vf_netdev)
+static int netvsc_unregister_vf(struct net_device *vf_netdev,
+				struct net_device *ndev)
 {
-	struct net_device *ndev;
 	struct net_device_context *net_device_ctx;
 
-	ndev = get_netvsc_byref(vf_netdev);
-	if (!ndev)
-		return NOTIFY_DONE;
-
 	net_device_ctx = netdev_priv(ndev);
 	cancel_delayed_work_sync(&net_device_ctx->vf_takeover);
 
@@ -1994,6 +1938,12 @@  static int netvsc_unregister_vf(struct net_device *vf_netdev)
 	return NOTIFY_OK;
 }
 
+static struct failover_ops netvsc_failover_ops = {
+	.slave_register		= netvsc_register_vf,
+	.slave_unregister	= netvsc_unregister_vf,
+	.slave_link_change	= netvsc_vf_changed,
+};
+
 static int netvsc_probe(struct hv_device *dev,
 			const struct hv_vmbus_device_id *dev_id)
 {
@@ -2083,8 +2033,14 @@  static int netvsc_probe(struct hv_device *dev,
 		goto register_failed;
 	}
 
+	net_device_ctx->failover = failover_register(net, &netvsc_failover_ops);
+	if (IS_ERR(net_device_ctx->failover))
+		goto err_failover;
+
 	return ret;
 
+err_failover:
+	unregister_netdev(net);
 register_failed:
 	rndis_filter_device_remove(dev, nvdev);
 rndis_failed:
@@ -2125,13 +2081,15 @@  static int netvsc_remove(struct hv_device *dev)
 	rtnl_lock();
 	vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
 	if (vf_netdev)
-		netvsc_unregister_vf(vf_netdev);
+		failover_slave_unregister(vf_netdev);
 
 	if (nvdev)
 		rndis_filter_device_remove(dev, nvdev);
 
 	unregister_netdevice(net);
 
+	failover_unregister(ndev_ctx->failover);
+
 	rtnl_unlock();
 	rcu_read_unlock();
 
@@ -2158,54 +2116,8 @@  static struct  hv_driver netvsc_drv = {
 	.remove = netvsc_remove,
 };
 
-/*
- * On Hyper-V, every VF interface is matched with a corresponding
- * synthetic interface. The synthetic interface is presented first
- * to the guest. When the corresponding VF instance is registered,
- * we will take care of switching the data path.
- */
-static int netvsc_netdev_event(struct notifier_block *this,
-			       unsigned long event, void *ptr)
-{
-	struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
-
-	/* Skip our own events */
-	if (event_dev->netdev_ops == &device_ops)
-		return NOTIFY_DONE;
-
-	/* Avoid non-Ethernet type devices */
-	if (event_dev->type != ARPHRD_ETHER)
-		return NOTIFY_DONE;
-
-	/* Avoid Vlan dev with same MAC registering as VF */
-	if (is_vlan_dev(event_dev))
-		return NOTIFY_DONE;
-
-	/* Avoid Bonding master dev with same MAC registering as VF */
-	if ((event_dev->priv_flags & IFF_BONDING) &&
-	    (event_dev->flags & IFF_MASTER))
-		return NOTIFY_DONE;
-
-	switch (event) {
-	case NETDEV_REGISTER:
-		return netvsc_register_vf(event_dev);
-	case NETDEV_UNREGISTER:
-		return netvsc_unregister_vf(event_dev);
-	case NETDEV_UP:
-	case NETDEV_DOWN:
-		return netvsc_vf_changed(event_dev);
-	default:
-		return NOTIFY_DONE;
-	}
-}
-
-static struct notifier_block netvsc_netdev_notifier = {
-	.notifier_call = netvsc_netdev_event,
-};
-
 static void __exit netvsc_drv_exit(void)
 {
-	unregister_netdevice_notifier(&netvsc_netdev_notifier);
 	vmbus_driver_unregister(&netvsc_drv);
 }
 
@@ -2225,7 +2137,6 @@  static int __init netvsc_drv_init(void)
 	if (ret)
 		return ret;
 
-	register_netdevice_notifier(&netvsc_netdev_notifier);
 	return 0;
 }