Patchwork [RESEND,BUGFIX,3/3] PCI: check whether pci device has been removed when remove a pci device by sysfs

Submitter Yijing Wang
Date Aug. 25, 2012, 9:59 a.m.
Message ID <5038A21C.4070200@huawei.com>
Permalink /patch/179959/
State Superseded

Comments

Yijing Wang - Aug. 25, 2012, 9:59 a.m.
A PCI device can be removed through sysfs like this:
echo 1 > /sys/bus/pci/devices/xxxx:xx:xx.x/remove
This calls remove_store(), which queues the actual removal on sysfs_workqueue
via device_schedule_callback(). If a PCI root port and an endpoint device below
it are removed concurrently, the endpoint has already been removed by the time
the root port's removal work completes. When the endpoint's own removal work
then starts, it operates on a device that is already gone, which results in an
oops.
Fix this by checking pdev->is_added in remove_callback() and skipping the
removal if the device has already been removed.

CallTrace:
kworker/u:2[220]: Oops 11003706212352 [1]
Modules linked in: cpufreq_conservative cpufreq_userspace cpufreq_powersave
acpi_cpufreq binfmt_misc fuse nls_iso8859_1 loop ipmi_si ipmi_devintf
ipmi_msghandler dm_mod igb ppdev iTCO_wdt parport_pc iTCO_vendor_support
i2c_i801 parport sg mptctl serio_raw i2c_core lpc_ich mfd_core hid_generic
button container usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod
crc_t10dif ext3 mbcache jbd fan processor ide_pci_generic ide_core ata_piix
libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal thermal_sys
hwmon

Pid: 220, CPU 30, comm:          kworker/u:2
psr : 0000121008526030 ifs : 8000000000000388 ip  : [<a0000001004b3081>]    Not tainted (3.5.0-rc6yijing-repo)
ip is at __pci_remove_bus_device+0x101/0x1e0
unat: 0000000000000000 pfs : 0000000000000388 rsc : 0000000000000003
rnat: ffffffffffffffff bsps: ffffffffffffffff pr  : 0000080001919585
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c9e70433f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a0000001004b3060 b6  : a0000001004c2400 b7  : a0000001000faae0
f6  : 000000000000000000000 f7  : 1003e00000000000057cd
f8  : 1003e0000000050000003 f9  : 1003e000001cb8678a0d0
f10 : 1003e9a05b7a39369e270 f11 : 1003e000000000000008f
r1  : a0000001014e63c0 r2  : e000001f075dec00 r3  : 0000000000000000
r8  : 0000000000000008 r9  : a0000001012e7308 r10 : 0000000004000000
r11 : e000000f0006e800 r12 : e000001f08dbfe00 r13 : e000001f08db0000
r14 : 0000000000000000 r15 : 0000000000000000 r16 : 0000000000000000
r17 : e000000f0006f008 r18 : 000000000f000000 r19 : a0000001012f3910
r20 : 0000000000100001 r21 : a000000101a62990 r22 : a000000100344580
r23 : 0000000000000000 r24 : 0000000000001000 r25 : 0000000000000000
r26 : a000000101a62988 r27 : e000003f0fc37e60 r28 : e000003f0fc37e68
r29 : e000002f07012be0 r30 : 0000000082aa0260 r31 : 0000000000004000

Call Trace:
 [<a000000100016500>] show_stack+0x80/0xa0
                                sp=e000001f08dbf9c0 bsp=e000001f08db1388
 [<a000000100016b60>] show_regs+0x640/0x920
                                sp=e000001f08dbfb90 bsp=e000001f08db1330
 [<a000000100040770>] die+0x190/0x2c0
                                sp=e000001f08dbfba0 bsp=e000001f08db12f0
 [<a000000100908f60>] ia64_do_page_fault+0x7e0/0xac0
                                sp=e000001f08dbfba0 bsp=e000001f08db1290
 [<a00000010000c0a0>] ia64_native_leave_kernel+0x0/0x270
                                sp=e000001f08dbfc30 bsp=e000001f08db1290
 [<a0000001004b3080>] __pci_remove_bus_device+0x100/0x1e0
                                sp=e000001f08dbfe00 bsp=e000001f08db1250
 [<a0000001004b32f0>] pci_stop_and_remove_bus_device+0x30/0x60
                                sp=e000001f08dbfe00 bsp=e000001f08db1230
 [<a0000001004c2440>] remove_callback+0x40/0x80
                                sp=e000001f08dbfe00 bsp=e000001f08db1208
 [<a0000001003445d0>] sysfs_schedule_callback_work+0x50/0x120
                                sp=e000001f08dbfe00 bsp=e000001f08db11d0
 [<a0000001000bc2d0>] process_one_work+0x6f0/0xae0
                                sp=e000001f08dbfe00 bsp=e000001f08db1158
 [<a0000001000bcf70>] worker_thread+0x3b0/0xc80
                                sp=e000001f08dbfe00 bsp=e000001f08db1060
 [<a0000001000cf050>] kthread+0x110/0x140
                                sp=e000001f08dbfe00 bsp=e000001f08db1028
 [<a000000100014590>] kernel_thread_helper+0x30/0x60
                                sp=e000001f08dbfe30 bsp=e000001f08db1000
 [<a00000010000a0c0>] start_kernel_thread+0x20/0x40
                                sp=e000001f08dbfe30 bsp=e000001f08db1000
Disabling lock debugging due to kernel taint
Unable to handle kernel NULL pointer dereference (address 0000000000000048)
kworker/u:2[220]: Oops 11012296146944 [2]

Pid: 220, CPU 30, comm:          kworker/u:2
psr : 0000121008022038 ifs : 8000000000000288 ip  : [<a0000001000c4961>]    Tainted: G      D      (3.5.0-rc6yijing-repo)
ip is at wq_worker_sleeping+0x61/0x200
unat: 0000000000000000 pfs : 0000000000000288 rsc : 0000000000000003
rnat: 0000121008026038 bsps: a0000001000407e0 pr  : 965a684515516955
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a0000001000c4920 b6  : a0000001000f9fc0 b7  : a0000001000faae0
f6  : 000000000000000000000 f7  : 1003e9e3779b97f4a7c16
f8  : 1003e0000000050000003 f9  : 1003e000001cb87e8a5a8
f10 : 1003e9a78b92717b9f0f8 f11 : 1003e000000000000008f
r1  : a0000001014e63c0 r2  : 0000000000000000 r3  : fffffffffffc1200
r8  : 0000000000000000 r9  : 000000000000001e r10 : a000000101432530
r11 : a000000101432530 r12 : e000001f08dbfb70 r13 : e000001f08db0000
r14 : 0000000000001000 r15 : a000000101432620 r16 : e000003000245d40
r17 : fffffffffffc5c00 r18 : e000003000245d00 r19 : 00000000000000f8
r20 : e000001f08db0070 r21 : 0000000000000048 r22 : e000003000245ce8
r23 : e000003000245ce0 r24 : a000000101a638e0 r25 : ffffffffff48e500
r26 : e000003f088a0098 r27 : 0000000000000400 r28 : 0000000000000001
r29 : 000000000420806c r30 : e000001f08db0014 r31 : 0000000000000000

Call Trace:
 [<a000000100016500>] show_stack+0x80/0xa0
                                sp=e000001f08dbf730 bsp=e000001f08db16f8
 [<a000000100016b60>] show_regs+0x640/0x920
                                sp=e000001f08dbf900 bsp=e000001f08db16a0
 [<a000000100040770>] die+0x190/0x2c0
                                sp=e000001f08dbf910 bsp=e000001f08db1660
 [<a000000100908f60>] ia64_do_page_fault+0x7e0/0xac0
                                sp=e000001f08dbf910 bsp=e000001f08db1600
 [<a00000010000c0a0>] ia64_native_leave_kernel+0x0/0x270
                                sp=e000001f08dbf9a0 bsp=e000001f08db1600
 [<a0000001000c4960>] wq_worker_sleeping+0x60/0x200
                                sp=e000001f08dbfb70 bsp=e000001f08db15b8
 [<a0000001009007e0>] __schedule+0x14c0/0x18c0
                                sp=e000001f08dbfb70 bsp=e000001f08db1440
 [<a000000100900ea0>] schedule+0x60/0x140
                                sp=e000001f08dbfb80 bsp=e000001f08db13e0
 [<a000000100090d10>] do_exit+0xef0/0x1740
                                sp=e000001f08dbfb80 bsp=e000001f08db1330
 [<a000000100040840>] die+0x260/0x2c0
                                sp=e000001f08dbfba0 bsp=e000001f08db12f0
 [<a000000100908f60>] ia64_do_page_fault+0x7e0/0xac0
                                sp=e000001f08dbfba0 bsp=e000001f08db1290
 [<a00000010000c0a0>] ia64_native_leave_kernel+0x0/0x270
                                sp=e000001f08dbfc30 bsp=e000001f08db1290
 [<a0000001004b3080>] __pci_remove_bus_device+0x100/0x1e0
                                sp=e000001f08dbfe00 bsp=e000001f08db1250
 [<a0000001004b32f0>] pci_stop_and_remove_bus_device+0x30/0x60
                                sp=e000001f08dbfe00 bsp=e000001f08db1230
 [<a0000001004c2440>] remove_callback+0x40/0x80
                                sp=e000001f08dbfe00 bsp=e000001f08db1208
 [<a0000001003445d0>] sysfs_schedule_callback_work+0x50/0x120
                                sp=e000001f08dbfe00 bsp=e000001f08db11d0
 [<a0000001000bc2d0>] process_one_work+0x6f0/0xae0
                                sp=e000001f08dbfe00 bsp=e000001f08db1158
 [<a0000001000bcf70>] worker_thread+0x3b0/0xc80
                                sp=e000001f08dbfe00 bsp=e000001f08db1060
 [<a0000001000cf050>] kthread+0x110/0x140
                                sp=e000001f08dbfe00 bsp=e000001f08db1028
 [<a000000100014590>] kernel_thread_helper+0x30/0x60
                                sp=e000001f08dbfe30 bsp=e000001f08db1000
 [<a00000010000a0c0>] start_kernel_thread+0x20/0x40
                                sp=e000001f08dbfe30 bsp=e000001f08db1000
Fixing recursive fault but reboot is needed!
Modules linked in: cpufreq_conservative cpufreq_userspace cpufreq_powersave
acpi_cpufreq binfmt_misc fuse nls_iso8859_1 loop ipmi_si ipmi_devintf
ipmi_msghandler dm_mod igb ppdev iTCO_wdt parport_pc iTCO_vendor_support
i2c_i801 parport sg mptctl serio_raw i2c_core lpc_ich mfd_core hid_generic
button container usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod
crc_t10dif ext3 mbcache jbd fan processor ide_pci_generic ide_core ata_piix
libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal thermal_sys
hwmon

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/pci-sysfs.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)
Jiang Liu - Aug. 25, 2012, 2:39 p.m.
Hi Yijing,
	The patch only partially fixes the issue; a small race window still
exists because pdev->is_added isn't a reliable flag to depend on.
	--Gerry


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yijing Wang - Aug. 27, 2012, 6:42 a.m.
On 2012/8/25 22:39, Jiang Liu wrote:
> Hi Yijing,
> 	The patch only partially fixes the issue; a small race window still
> exists because pdev->is_added isn't a reliable flag to depend on.
> 	--Gerry
> 

Hi Gerry,
    You are right, adding the pdev->is_added check here only closes the race window
between the remove and rescan sysfs interfaces. We probably need a more comprehensive
solution for the races among hotplug/remove/rescan actions. Next, I will test the
[RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations
patches in more detail. That series is the more comprehensive solution.

---------
Thanks.
Yijing
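
One way to read the concern discussed above is sketched below as a small userspace C model. It is only an illustration, not kernel code: struct toy_dev, sysfs_remove_check(), sysfs_remove_act() and unserialized_remove() are hypothetical names, and the model assumes that some other removal path (for example a hotplug driver) changes the device state without taking pci_remove_rescan_mutex. Under that assumption the is_added test and the teardown are not atomic with respect to that path, so the test can pass and the device can still be gone by the time the teardown runs.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the relevant part of struct pci_dev. */
struct toy_dev {
	bool is_added;	/* mirrors pdev->is_added */
};

/* Outcome of one removal attempt in this model. */
enum remove_result { REMOVED, SKIPPED, STALE };

/* Step 1 of the sysfs path: the is_added test added by the patch. */
static bool sysfs_remove_check(const struct toy_dev *dev)
{
	return dev->is_added;
}

/* Step 2 of the sysfs path: the teardown, run only if the check passed. */
static enum remove_result sysfs_remove_act(struct toy_dev *dev, bool check_passed)
{
	if (!check_passed)
		return SKIPPED;		/* device was already removed */
	if (!dev->is_added)
		return STALE;		/* device vanished after the check */
	dev->is_added = false;
	return REMOVED;
}

/*
 * A removal path that does not serialize against the sysfs path.
 * Assumption for illustration: it never takes the sysfs mutex.
 */
static void unserialized_remove(struct toy_dev *dev)
{
	dev->is_added = false;
}
```

Calling sysfs_remove_check(), then unserialized_remove(), then sysfs_remove_act() on the same device yields STALE in this model, which corresponds to the remaining window. Serializing every removal path on one lock, as the bus lock series proposes, would rule out that interleaving.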


Patch

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 6869009..b0be682 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -332,7 +332,10 @@  static void remove_callback(struct device *dev)
 	struct pci_dev *pdev = to_pci_dev(dev);

 	mutex_lock(&pci_remove_rescan_mutex);
+	if (!pdev->is_added)
+		goto out;
 	pci_stop_and_remove_bus_device(pdev);
+out:
 	mutex_unlock(&pci_remove_rescan_mutex);
 }
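
The effect of the added check can be sketched in a small userspace C model. This is a simplified illustration under stated assumptions, not the kernel implementation: struct model_dev, model_stop_and_remove() and model_remove_callback() are hypothetical stand-ins for struct pci_dev, pci_stop_and_remove_bus_device() and remove_callback(), and the assert() stands in for the oops seen when an already removed device is torn down again.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct pci_dev; not the kernel type. */
struct model_dev {
	bool is_added;			/* mirrors pdev->is_added */
	struct model_dev *child;	/* endpoint below a root port, or NULL */
	int teardown_count;		/* how many times teardown really ran */
};

/* Plays the role of pci_remove_rescan_mutex in this model. */
static pthread_mutex_t remove_rescan_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Model of pci_stop_and_remove_bus_device(): children first, then self. */
static void model_stop_and_remove(struct model_dev *dev)
{
	/* Tearing down a device twice is the bug; the kernel oopses here. */
	assert(dev->is_added);

	if (dev->child && dev->child->is_added)
		model_stop_and_remove(dev->child);

	dev->is_added = false;
	dev->teardown_count++;
}

/* Model of the patched remove_callback(): skip devices already removed. */
static void model_remove_callback(struct model_dev *dev)
{
	pthread_mutex_lock(&remove_rescan_mutex);
	if (dev->is_added)
		model_stop_and_remove(dev);
	pthread_mutex_unlock(&remove_rescan_mutex);
}
```

Running model_remove_callback() on a root port and then on its endpoint tears each device down exactly once. Without the is_added test in model_remove_callback(), the second call would hit the assert(), which is the situation the patch guards against for callers that go through the same mutex.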