Message ID | 20181223152930.7925-1-mironov.ivan@gmail.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Series | bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw |
4.20 release is affected too.

On Sun, 2018-12-23 at 20:29 +0500, Ivan Mironov wrote:
> [...]
> This regression was introduced after v4.20-rc7.
> [...]
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index b164f705709d..0e37c2484ac2 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -8504,15 +8504,23 @@ int bnx2x_set_vlan_one(struct bnx2x *bp, u16 vlan,
 static int bnx2x_del_all_vlans(struct bnx2x *bp)
 {
 	struct bnx2x_vlan_mac_obj *vlan_obj = &bp->sp_objs[0].vlan_obj;
-	unsigned long ramrod_flags = 0, vlan_flags = 0;
 	struct bnx2x_vlan_entry *vlan;
-	int rc;
 
-	__set_bit(RAMROD_COMP_WAIT, &ramrod_flags);
-	__set_bit(BNX2X_VLAN, &vlan_flags);
-	rc = vlan_obj->delete_all(bp, vlan_obj, &vlan_flags, &ramrod_flags);
-	if (rc)
-		return rc;
+	/* The whole *vlan_obj structure may be not initialized if VLAN
+	 * filtering offload is not supported by hardware. Currently this is
+	 * true for all hardware covered by CHIP_IS_E1x().
+	 */
+	if (vlan_obj->delete_all) {
+		unsigned long ramrod_flags = 0, vlan_flags = 0;
+		int rc;
+
+		__set_bit(RAMROD_COMP_WAIT, &ramrod_flags);
+		__set_bit(BNX2X_VLAN, &vlan_flags);
+		rc = vlan_obj->delete_all(bp, vlan_obj, &vlan_flags,
+					  &ramrod_flags);
+		if (rc)
+			return rc;
+	}
 
 	/* Mark that hw forgot all entries */
 	list_for_each_entry(vlan, &bp->vlan_reg, link)
This happened when I tried to boot a normal Fedora 29 system with the
latest available kernel (from Fedora rawhide, plus some unrelated custom
patches):

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    PGD 0 P4D 0
    Oops: 0010 [#1] SMP PTI
    CPU: 6 PID: 1422 Comm: libvirtd Tainted: G I 4.20.0-0.rc7.git3.hpsa2.1.fc29.x86_64 #1
    Hardware name: HP ProLiant BL460c G6, BIOS I24 05/21/2018
    RIP: 0010: (null)
    Code: Bad RIP value.
    RSP: 0018:ffffa47ccdc9fbe0 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 00000000000003e8 RCX: ffffa47ccdc9fbf8
    RDX: ffffa47ccdc9fc00 RSI: ffff97d9ee7b01f8 RDI: ffff97d9f0150b80
    RBP: ffff97d9f0150b80 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
    R13: ffff97d9ef1e53e8 R14: 0000000000000009 R15: ffff97d9f0ac6730
    FS: 00007f4d224ef700(0000) GS:ffff97d9fa200000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffffffffffffd6 CR3: 00000011ece52006 CR4: 00000000000206e0
    Call Trace:
     ? bnx2x_chip_cleanup+0x195/0x610 [bnx2x]
     ? bnx2x_nic_unload+0x1e2/0x8f0 [bnx2x]
     ? bnx2x_reload_if_running+0x24/0x40 [bnx2x]
     ? bnx2x_set_features+0x79/0xa0 [bnx2x]
     ? __netdev_update_features+0x244/0x9e0
     ? netlink_broadcast_filtered+0x136/0x4b0
     ? netdev_update_features+0x22/0x60
     ? dev_disable_lro+0x1c/0xe0
     ? devinet_sysctl_forward+0x1c6/0x211
     ? proc_sys_call_handler+0xab/0x100
     ? __vfs_write+0x36/0x1a0
     ? rcu_read_lock_sched_held+0x79/0x80
     ? rcu_sync_lockdep_assert+0x2e/0x60
     ? __sb_start_write+0x14c/0x1b0
     ? vfs_write+0x159/0x1c0
     ? vfs_write+0xba/0x1c0
     ? ksys_write+0x52/0xc0
     ? do_syscall_64+0x60/0x1f0
     ? entry_SYSCALL_64_after_hwframe+0x49/0xbe

After some investigation I figured out that recently added cleanup code
tries to call a VLAN filtering de-initialization function which exists
only for newer hardware. The corresponding function pointer is not
initialized (== 0) for older hardware, namely these chips:

    #define CHIP_NUM_57710 0x164e
    #define CHIP_NUM_57711 0x164f
    #define CHIP_NUM_57711E 0x1650

And I have one of those in my test system:

    02:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe [14e4:1650]
    02:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe [14e4:1650]

The function bnx2x_init_vlan_mac_fp_objs() from
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h decides whether or not
to initialize the relevant pointers in bnx2x_sp_objs.vlan_obj.

This regression was introduced after v4.20-rc7.

Fixes: 04f05230c5c13 ("bnx2x: Remove configured vlans as part of unload sequence.")
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
---
 .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 22 +++++++++++++------
 1 file changed, 15 insertions(+), 7 deletions(-)
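
For illustration only, the failure mode and the guard described above can be reduced to a small userspace sketch. This is not the driver code: struct nic, struct vlan_obj, chip_is_old(), nic_init(), del_all_vlans() and hw_delete_all() are hypothetical stand-ins, and the assumption that exactly the three listed chip IDs lack VLAN filtering offload is taken from the commit message rather than checked against the driver. Only the CHIP_NUM_* values are copied from the text.

```c
#include <stddef.h>

/* Chip IDs quoted in the commit message. */
#define CHIP_NUM_57710  0x164e
#define CHIP_NUM_57711  0x164f
#define CHIP_NUM_57711E 0x1650

/* Hypothetical stand-in for struct bnx2x_vlan_mac_obj. */
struct vlan_obj {
	int (*delete_all)(struct vlan_obj *o);
};

/* Hypothetical stand-in for the per-device driver state. */
struct nic {
	unsigned int chip_num;
	struct vlan_obj vlan_obj;
};

/* Counts how often the "hardware" callback actually ran. */
static int delete_all_calls;

/* Hypothetical callback that only newer hardware provides. */
static int hw_delete_all(struct vlan_obj *o)
{
	(void)o;
	delete_all_calls++;
	return 0;
}

/* Assumption: the three listed chips are the ones without the offload. */
static int chip_is_old(unsigned int chip_num)
{
	return chip_num == CHIP_NUM_57710 ||
	       chip_num == CHIP_NUM_57711 ||
	       chip_num == CHIP_NUM_57711E;
}

/* Init leaves the callback NULL on old chips, as the message describes. */
static void nic_init(struct nic *n, unsigned int chip_num)
{
	n->chip_num = chip_num;
	n->vlan_obj.delete_all = NULL;
	if (!chip_is_old(chip_num))
		n->vlan_obj.delete_all = hw_delete_all;
}

/* Cleanup path with the guard: skip the call when the pointer is unset. */
static int del_all_vlans(struct nic *n)
{
	struct vlan_obj *o = &n->vlan_obj;

	if (o->delete_all) {
		int rc = o->delete_all(o);

		if (rc)
			return rc;
	}
	return 0;
}
```

Calling del_all_vlans() on a struct nic initialized with CHIP_NUM_57711E returns 0 without invoking any callback, while a device initialized with any other ID goes through hw_delete_all(). Without the if (o->delete_all) check, the first case would jump through a NULL function pointer, which in the kernel shows up as the "RIP: 0010: (null)" oops quoted above.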