diff mbox series

[net] bonding: fix feature flag setting at init time

Message ID 20201123031716.6179-1-jarod@redhat.com
State Superseded
Headers show
Series [net] bonding: fix feature flag setting at init time | expand

Commit Message

Jarod Wilson Nov. 23, 2020, 3:17 a.m. UTC
Have run into a case where bond_option_mode_set() gets called before
hw_features has been filled in, and very bad things happen when
netdev_change_features() then gets called, because the empty hw_features
wipes out almost all features. Further reading of netdev feature flag
documentation suggests drivers aren't supposed to touch wanted_features,
so this changes bond_option_mode_set() to use netdev_increment_features()
and &= ~BOND_XFRM_FEATURES on mode changes and then only calling
netdev_features_change() if there was actually a change of features. This
specifically fixes bonding on top of mlxsw interfaces, and has been
regression-tested with ixgbe interfaces. This change also simplifies the
xfrm-specific code in bond_setup() a little bit as well.

Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
Reported-by: Ivan Vecera <ivecera@redhat.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Thomas Davis <tadavis@lbl.gov>
Cc: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
 drivers/net/bonding/bond_main.c    | 10 ++++------
 drivers/net/bonding/bond_options.c | 14 +++++++++++---
 2 files changed, 15 insertions(+), 9 deletions(-)

Comments

Ivan Vecera Nov. 23, 2020, 8:19 a.m. UTC | #1
On Sun, 22 Nov 2020 22:17:16 -0500
Jarod Wilson <jarod@redhat.com> wrote:

> Have run into a case where bond_option_mode_set() gets called before
> hw_features has been filled in, and very bad things happen when
> netdev_change_features() then gets called, because the empty hw_features
> wipes out almost all features. Further reading of netdev feature flag
> documentation suggests drivers aren't supposed to touch wanted_features,
> so this changes bond_option_mode_set() to use netdev_increment_features()
> and &= ~BOND_XFRM_FEATURES on mode changes and then only calling
> netdev_features_change() if there was actually a change of features. This
> specifically fixes bonding on top of mlxsw interfaces, and has been
> regression-tested with ixgbe interfaces. This change also simplifies the
> xfrm-specific code in bond_setup() a little bit as well.

Hi Jarod,

the reason is not correct... The problem is not with empty ->hw_features but
with empty ->wanted_features.
During bond device creation bond_newlink() is called. It calls bond_changelink()
first and afterwards register_netdevice(). The problem is that ->wanted_features
are initialized in register_netdevice() so during bond_changlink() call
->wanted_features is 0. So...

bond_newlink()
-> bond_changelink()
   -> __bond_opt_set()
      -> bond_option_mode_set()
         -> netdev_change_features()
            -> __netdev_update_features()
               features = netdev_get_wanted_features()
                          { dev->features & ~dev->hw_features) | dev->wanted_features }

dev->wanted_features is here zero so the rest of the expression clears a bunch of
bits from dev->features...

In case of mlxsw it is important that NETIF_F_HW_VLAN_CTAG_FILTER bit is cleared in
bonding device because in this case vlan_add_rx_filter_info() does not call
bond_vlan_rx_add_vid() so mlxsw_sp_port_add_vid() is not called as well.

Later this causes a WARN in mlxsw_sp_inetaddr_port_vlan_event() because
instance of mlxsw_sp_port_vlan does not exist as mlxsw_sp_port_add_vid() was not
called.

Btw. it should be enough to call existing snippet in bond_option_mode_set() only
when device is already registered?

E.g.:
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 9abfaae1c6f7..ca4913fee5a9 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -768,11 +768,13 @@ static int bond_option_mode_set(struct bonding *bond,
                bond->params.tlb_dynamic_lb = 1;
 
 #ifdef CONFIG_XFRM_OFFLOAD
-       if (newval->value == BOND_MODE_ACTIVEBACKUP)
-               bond->dev->wanted_features |= BOND_XFRM_FEATURES;
-       else
-               bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
-       netdev_change_features(bond->dev);
+       if (dev->reg_state == NETREG_REGISTERED) {
+               if (newval->value == BOND_MODE_ACTIVEBACKUP)
+                       bond->dev->wanted_features |= BOND_XFRM_FEATURES;
+               else
+                       bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
+               netdev_change_features(bond->dev);
+       }
 #endif /* CONFIG_XFRM_OFFLOAD */


Thanks,
Ivan
kernel test robot Nov. 23, 2020, 3:32 p.m. UTC | #2
Hi Jarod,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net-next/master]
[also build test ERROR on next-20201123]
[cannot apply to net/master linux/master linus/master sparc-next/master v5.10-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jarod-Wilson/bonding-fix-feature-flag-setting-at-init-time/20201123-111956
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git f9e425e99b0756c1479042afe761073779df2a30
config: x86_64-rhel (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/6d883c4c2b01573ba9dddcb9fe109f961a8b7f10
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jarod-Wilson/bonding-fix-feature-flag-setting-at-init-time/20201123-111956
        git checkout 6d883c4c2b01573ba9dddcb9fe109f961a8b7f10
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/net/bonding/bond_options.c: In function 'bond_option_mode_set':
>> drivers/net/bonding/bond_options.c:752:38: error: 'BOND_XFRM_FEATURES' undeclared (first use in this function)
     752 |  netdev_features_t mask = features & BOND_XFRM_FEATURES;
         |                                      ^~~~~~~~~~~~~~~~~~
   drivers/net/bonding/bond_options.c:752:38: note: each undeclared identifier is reported only once for each function it appears in
   drivers/net/bonding/bond_options.c:752:20: warning: unused variable 'mask' [-Wunused-variable]
     752 |  netdev_features_t mask = features & BOND_XFRM_FEATURES;
         |                    ^~~~

vim +/BOND_XFRM_FEATURES +752 drivers/net/bonding/bond_options.c

   747	
   748	static int bond_option_mode_set(struct bonding *bond,
   749					const struct bond_opt_value *newval)
   750	{
   751		netdev_features_t features = bond->dev->features;
 > 752		netdev_features_t mask = features & BOND_XFRM_FEATURES;
   753	
   754		if (!bond_mode_uses_arp(newval->value)) {
   755			if (bond->params.arp_interval) {
   756				netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n",
   757					   newval->string);
   758				/* disable arp monitoring */
   759				bond->params.arp_interval = 0;
   760			}
   761	
   762			if (!bond->params.miimon) {
   763				/* set miimon to default value */
   764				bond->params.miimon = BOND_DEFAULT_MIIMON;
   765				netdev_dbg(bond->dev, "Setting MII monitoring interval to %d\n",
   766					   bond->params.miimon);
   767			}
   768		}
   769	
   770		if (newval->value == BOND_MODE_ALB)
   771			bond->params.tlb_dynamic_lb = 1;
   772	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff mbox series

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 71c9677d135f..b8e0cb4f9480 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4721,15 +4721,13 @@  void bond_setup(struct net_device *bond_dev)
 				NETIF_F_HW_VLAN_CTAG_FILTER;
 
 	bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
-#ifdef CONFIG_XFRM_OFFLOAD
-	bond_dev->hw_features |= BOND_XFRM_FEATURES;
-#endif /* CONFIG_XFRM_OFFLOAD */
 	bond_dev->features |= bond_dev->hw_features;
 	bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
 #ifdef CONFIG_XFRM_OFFLOAD
-	/* Disable XFRM features if this isn't an active-backup config */
-	if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
-		bond_dev->features &= ~BOND_XFRM_FEATURES;
+	bond_dev->hw_features |= BOND_XFRM_FEATURES;
+	/* Only enable XFRM features if this is an active-backup config */
+	if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
+		bond_dev->features |= BOND_XFRM_FEATURES;
 #endif /* CONFIG_XFRM_OFFLOAD */
 }
 
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 9abfaae1c6f7..bce34648d97d 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -748,6 +748,9 @@  const struct bond_option *bond_opt_get(unsigned int option)
 static int bond_option_mode_set(struct bonding *bond,
 				const struct bond_opt_value *newval)
 {
+	netdev_features_t features = bond->dev->features;
+	netdev_features_t mask = features & BOND_XFRM_FEATURES;
+
 	if (!bond_mode_uses_arp(newval->value)) {
 		if (bond->params.arp_interval) {
 			netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n",
@@ -769,10 +772,15 @@  static int bond_option_mode_set(struct bonding *bond,
 
 #ifdef CONFIG_XFRM_OFFLOAD
 	if (newval->value == BOND_MODE_ACTIVEBACKUP)
-		bond->dev->wanted_features |= BOND_XFRM_FEATURES;
+		features = netdev_increment_features(features,
+						     BOND_XFRM_FEATURES, mask);
 	else
-		bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
-	netdev_change_features(bond->dev);
+		features &= ~BOND_XFRM_FEATURES;
+
+	if (bond->dev->features != features) {
+		bond->dev->features = features;
+		netdev_features_change(bond->dev);
+	}
 #endif /* CONFIG_XFRM_OFFLOAD */
 
 	/* don't cache arp_validate between modes */