diff mbox

9p: disconnect channel when PCI device is removed

Message ID 1334353716-19483-1-git-send-email-levinsasha928@gmail.com
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Sasha Levin April 13, 2012, 9:48 p.m. UTC
When a virtio_9p pci device is being removed, we should close down any
active channels and free up resources, we're not supposed to BUG() if there's
still an open channel since it's a valid case when removing the PCI device.

Otherwise, removing the PCI device with an open channel would cause the
following BUG():

[ 1184.671416] ------------[ cut here ]------------
[ 1184.672057] kernel BUG at net/9p/trans_virtio.c:618!
[ 1184.672057] invalid opcode: 0000 [#1] PREEMPT SMP
[ 1184.672057] CPU 3
[ 1184.672057] Pid: 5, comm: kworker/u:0 Tainted: G        W    3.4.0-rc2-next-20120413-sasha-dirty #76
[ 1184.672057] RIP: 0010:[<ffffffff825c9116>]  [<ffffffff825c9116>] p9_virtio_remove+0x16/0x90
[ 1184.672057] RSP: 0018:ffff88000d653ac0  EFLAGS: 00010202
[ 1184.672057] RAX: ffffffff836bfb40 RBX: ffff88000c9b2148 RCX: ffff88000d658978
[ 1184.672057] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff880028868000
[ 1184.672057] RBP: ffff88000d653ad0 R08: 0000000000000000 R09: 0000000000000000
[ 1184.672057] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880028868000
[ 1184.672057] R13: ffffffff835aa7c0 R14: ffff880041630000 R15: ffff88000d653da0
[ 1184.672057] FS:  0000000000000000(0000) GS:ffff880035a00000(0000) knlGS:0000000000000000
[ 1184.672057] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1184.672057] CR2: 0000000001181000 CR3: 000000000eba1000 CR4: 00000000000406e0
[ 1184.672057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
x000000000117a190 *[ 1184.672057] DR3: 00000000000000**
00 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1184.672057] Process kworker/u:0 (pid: 5, threadinfo ffff88000d652000, task ffff88000d658000)
[ 1184.672057] Stack:
[ 1184.672057]  ffff880028868000 ffffffff836bfb40 ffff88000d653af0 ffffffff8193661b
[ 1184.672057]  ffff880028868008 ffffffff836bfb40 ffff88000d653b10 ffffffff81af1c81
[ 1184.672057]  ffff880028868068 ffff880028868008 ffff88000d653b30 ffffffff81af257a
[ 1184.795301] Call Trace:
[ 1184.795301]  [<ffffffff8193661b>] virtio_dev_remove+0x1b/0x60
[ 1184.795301]  [<ffffffff81af1c81>] __device_release_driver+0x81/0xd0
[ 1184.795301]  [<ffffffff81af257a>] device_release_driver+0x2a/0x40
[ 1184.795301]  [<ffffffff81af0d48>] bus_remove_device+0x138/0x150
[ 1184.795301]  [<ffffffff81aef08d>] device_del+0x14d/0x1b0
[ 1184.795301]  [<ffffffff81aef138>] device_unregister+0x48/0x60
[ 1184.795301]  [<ffffffff8193694d>] unregister_virtio_device+0xd/0x10
[ 1184.795301]  [<ffffffff8265fc74>] virtio_pci_remove+0x2a/0x6c
[ 1184.795301]  [<ffffffff818a95ad>] pci_device_remove+0x4d/0x110
[ 1184.795301]  [<ffffffff81af1c81>] __device_release_driver+0x81/0xd0
[ 1184.795301]  [<ffffffff81af257a>] device_release_driver+0x2a/0x40
[ 1184.795301]  [<ffffffff81af0d48>] bus_remove_device+0x138/0x150
[ 1184.795301]  [<ffffffff81aef08d>] device_del+0x14d/0x1b0
[ 1184.795301]  [<ffffffff81aef138>] device_unregister+0x48/0x60
[ 1184.795301]  [<ffffffff818a36fa>] pci_stop_bus_device+0x6a/0x90
[ 1184.795301]  [<ffffffff818a3791>] pci_stop_and_remove_bus_device+0x11/0x20
[ 1184.795301]  [<ffffffff818c21d9>] remove_callback+0x9/0x10
[ 1184.795301]  [<ffffffff81252d91>] sysfs_schedule_callback_work+0x21/0x60
[ 1184.795301]  [<ffffffff810cb1a1>] process_one_work+0x281/0x430
[ 1184.795301]  [<ffffffff810cb140>] ? process_one_work+0x220/0x430
[ 1184.795301]  [<ffffffff81252d70>] ? sysfs_read_file+0x1c0/0x1c0
[ 1184.795301]  [<ffffffff810cc613>] worker_thread+0x1f3/0x320
[ 1184.795301]  [<ffffffff810cc420>] ? manage_workers.clone.13+0x130/0x130
[ 1184.795301]  [<ffffffff810d30b2>] kthread+0xb2/0xc0
[ 1184.795301]  [<ffffffff826783f4>] kernel_thread_helper+0x4/0x10
[ 1184.795301]  [<ffffffff810deb18>] ? finish_task_switch+0x78/0xf0
[ 1184.795301]  [<ffffffff82676574>] ? retint_restore_args+0x13/0x13
[ 1184.795301]  [<ffffffff810d3000>] ? kthread_flush_work_fn+0x10/0x10
[ 1184.795301]  [<ffffffff826783f0>] ? gs_change+0x13/0x13
[ 1184.795301] Code: c1 9e 0a 00 48 83 c4 08 5b c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 49 89 fc 53 48 8b 9f a8 04 00 00 80 3b 00 74 0a <0f> 0b 0f 1f 84 00 00 00 00 00 48 8b 87 88 04 00 00 ff 50 30 31
[ 1184.795301] RIP  [<ffffffff825c9116>] p9_virtio_remove+0x16/0x90
[ 1184.795301]  RSP <ffff88000d653ac0>
[ 1184.952618] ---[ end trace a307b3ed40206b4c ]---

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 net/9p/trans_virtio.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

Comments

Aneesh Kumar K.V April 15, 2012, 12:27 p.m. UTC | #1
Sasha Levin <levinsasha928@gmail.com> writes:

> When a virtio_9p pci device is being removed, we should close down any
> active channels and free up resources, we're not supposed to BUG() if there's
> still an open channel since it's a valid case when removing the PCI device.
>
> Otherwise, removing the PCI device with an open channel would cause the
> following BUG():
>
...

> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
> index 3d43206..5af18d1 100644
> --- a/net/9p/trans_virtio.c
> +++ b/net/9p/trans_virtio.c
> @@ -615,7 +615,8 @@ static void p9_virtio_remove(struct virtio_device *vdev)
>  {
>  	struct virtio_chan *chan = vdev->priv;
>
> -	BUG_ON(chan->inuse);
> +	if (chan->inuse)
> +		p9_virtio_close(chan->client);
>  	vdev->config->del_vqs(vdev);
>
>  	mutex_lock(&virtio_9p_lock);

But an umount should have resulted in p9_virtio_close ? How are you
removing the device ? Are you removing the device with file system
mounted  ?. In that case may be we should return EBUSY ?

-aneesh

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sasha Levin April 16, 2012, 4:35 a.m. UTC | #2
On Sun, Apr 15, 2012 at 2:27 PM, Aneesh Kumar K.V
<aneesh.kumar@linux.vnet.ibm.com> wrote:
> Sasha Levin <levinsasha928@gmail.com> writes:
>
>> When a virtio_9p pci device is being removed, we should close down any
>> active channels and free up resources, we're not supposed to BUG() if there's
>> still an open channel since it's a valid case when removing the PCI device.
>>
>> Otherwise, removing the PCI device with an open channel would cause the
>> following BUG():
>>
> ...
>
>> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
>> index 3d43206..5af18d1 100644
>> --- a/net/9p/trans_virtio.c
>> +++ b/net/9p/trans_virtio.c
>> @@ -615,7 +615,8 @@ static void p9_virtio_remove(struct virtio_device *vdev)
>>  {
>>       struct virtio_chan *chan = vdev->priv;
>>
>> -     BUG_ON(chan->inuse);
>> +     if (chan->inuse)
>> +             p9_virtio_close(chan->client);
>>       vdev->config->del_vqs(vdev);
>>
>>       mutex_lock(&virtio_9p_lock);
>
> But an umount should have resulted in p9_virtio_close ? How are you
> removing the device ? Are you removing the device with file system
> mounted  ?. In that case may be we should return EBUSY ?

I signal the underlying PCI device to remove (echo 1 >
/sys/devices/pci0000\:00/[...]/remove), we can't really prevent that
thing so we must clean up ourselves.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Aneesh Kumar K.V April 16, 2012, 10:59 a.m. UTC | #3
Sasha Levin <levinsasha928@gmail.com> writes:

> On Sun, Apr 15, 2012 at 2:27 PM, Aneesh Kumar K.V
> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>> Sasha Levin <levinsasha928@gmail.com> writes:
>>
>>> When a virtio_9p pci device is being removed, we should close down any
>>> active channels and free up resources, we're not supposed to BUG() if there's
>>> still an open channel since it's a valid case when removing the PCI device.
>>>
>>> Otherwise, removing the PCI device with an open channel would cause the
>>> following BUG():
>>>
>> ...
>>
>>> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
>>> index 3d43206..5af18d1 100644
>>> --- a/net/9p/trans_virtio.c
>>> +++ b/net/9p/trans_virtio.c
>>> @@ -615,7 +615,8 @@ static void p9_virtio_remove(struct virtio_device *vdev)
>>>  {
>>>       struct virtio_chan *chan = vdev->priv;
>>>
>>> -     BUG_ON(chan->inuse);
>>> +     if (chan->inuse)
>>> +             p9_virtio_close(chan->client);
>>>       vdev->config->del_vqs(vdev);
>>>
>>>       mutex_lock(&virtio_9p_lock);
>>
>> But an umount should have resulted in p9_virtio_close ? How are you
>> removing the device ? Are you removing the device with file system
>> mounted  ?. In that case may be we should return EBUSY ?
>
> I signal the underlying PCI device to remove (echo 1 >
> /sys/devices/pci0000\:00/[...]/remove), we can't really prevent that
> thing so we must clean up ourselves.

What does that mean for the mounted file system ? What would happen to
the pending fs operations in that case ?

-aneesh

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sasha Levin April 28, 2012, 8:55 a.m. UTC | #4
On Mon, Apr 16, 2012 at 12:59 PM, Aneesh Kumar K.V
<aneesh.kumar@linux.vnet.ibm.com> wrote:
> Sasha Levin <levinsasha928@gmail.com> writes:
>
>> On Sun, Apr 15, 2012 at 2:27 PM, Aneesh Kumar K.V
>> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>> Sasha Levin <levinsasha928@gmail.com> writes:
>>>
>>>> When a virtio_9p pci device is being removed, we should close down any
>>>> active channels and free up resources, we're not supposed to BUG() if there's
>>>> still an open channel since it's a valid case when removing the PCI device.
>>>>
>>>> Otherwise, removing the PCI device with an open channel would cause the
>>>> following BUG():
>>>>
>>> ...
>>>
>>>> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
>>>> index 3d43206..5af18d1 100644
>>>> --- a/net/9p/trans_virtio.c
>>>> +++ b/net/9p/trans_virtio.c
>>>> @@ -615,7 +615,8 @@ static void p9_virtio_remove(struct virtio_device *vdev)
>>>>  {
>>>>       struct virtio_chan *chan = vdev->priv;
>>>>
>>>> -     BUG_ON(chan->inuse);
>>>> +     if (chan->inuse)
>>>> +             p9_virtio_close(chan->client);
>>>>       vdev->config->del_vqs(vdev);
>>>>
>>>>       mutex_lock(&virtio_9p_lock);
>>>
>>> But an umount should have resulted in p9_virtio_close ? How are you
>>> removing the device ? Are you removing the device with file system
>>> mounted  ?. In that case may be we should return EBUSY ?
>>
>> I signal the underlying PCI device to remove (echo 1 >
>> /sys/devices/pci0000\:00/[...]/remove), we can't really prevent that
>> thing so we must clean up ourselves.
>
> What does that mean for the mounted file system ? What would happen to
> the pending fs operations in that case ?

I'm guessing that all of them should be canceled. virtio-pci simulates
a PCI device, if the PCI device is unplugged there's not much to do
about the filesystem or pending requests.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Aneesh Kumar K.V April 28, 2012, 6:16 p.m. UTC | #5
Sasha Levin <levinsasha928@gmail.com> writes:

> On Mon, Apr 16, 2012 at 12:59 PM, Aneesh Kumar K.V
> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>> Sasha Levin <levinsasha928@gmail.com> writes:
>>
>>> On Sun, Apr 15, 2012 at 2:27 PM, Aneesh Kumar K.V
>>> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>>> Sasha Levin <levinsasha928@gmail.com> writes:
>>>>
>>>>> When a virtio_9p pci device is being removed, we should close down any
>>>>> active channels and free up resources, we're not supposed to BUG() if there's
>>>>> still an open channel since it's a valid case when removing the PCI device.
>>>>>
>>>>> Otherwise, removing the PCI device with an open channel would cause the
>>>>> following BUG():
>>>>>
>>>> ...
>>>>
>>>>> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
>>>>> index 3d43206..5af18d1 100644
>>>>> --- a/net/9p/trans_virtio.c
>>>>> +++ b/net/9p/trans_virtio.c
>>>>> @@ -615,7 +615,8 @@ static void p9_virtio_remove(struct virtio_device *vdev)
>>>>>  {
>>>>>       struct virtio_chan *chan = vdev->priv;
>>>>>
>>>>> -     BUG_ON(chan->inuse);
>>>>> +     if (chan->inuse)
>>>>> +             p9_virtio_close(chan->client);
>>>>>       vdev->config->del_vqs(vdev);
>>>>>
>>>>>       mutex_lock(&virtio_9p_lock);
>>>>
>>>> But an umount should have resulted in p9_virtio_close ? How are you
>>>> removing the device ? Are you removing the device with file system
>>>> mounted  ?. In that case may be we should return EBUSY ?
>>>
>>> I signal the underlying PCI device to remove (echo 1 >
>>> /sys/devices/pci0000\:00/[...]/remove), we can't really prevent that
>>> thing so we must clean up ourselves.
>>
>> What does that mean for the mounted file system ? What would happen to
>> the pending fs operations in that case ?
>
> I'm guessing that all of them should be canceled.

Pending operation we can cancel, but what about dirty pages in cached
mode ? Also how does virtio-blk handle this ? 

> virtio-pci simulates
> a PCI device, if the PCI device is unplugged there's not much to do
> about the filesystem or pending requests.

Ideal thing to do would be to make remove return -EBUSY; let the user
umount. But looking at the code, I guess there is no easy way to return
error from remove callback ?

-aneesh

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sasha Levin April 29, 2012, 3:21 a.m. UTC | #6
On Sat, Apr 28, 2012 at 8:16 PM, Aneesh Kumar K.V
<aneesh.kumar@linux.vnet.ibm.com> wrote:
> Sasha Levin <levinsasha928@gmail.com> writes:
>
>> On Mon, Apr 16, 2012 at 12:59 PM, Aneesh Kumar K.V
>> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>> Sasha Levin <levinsasha928@gmail.com> writes:
>>>
>>>> On Sun, Apr 15, 2012 at 2:27 PM, Aneesh Kumar K.V
>>>> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>>>> Sasha Levin <levinsasha928@gmail.com> writes:
>>>>>
>>>>>> When a virtio_9p pci device is being removed, we should close down any
>>>>>> active channels and free up resources, we're not supposed to BUG() if there's
>>>>>> still an open channel since it's a valid case when removing the PCI device.
>>>>>>
>>>>>> Otherwise, removing the PCI device with an open channel would cause the
>>>>>> following BUG():
>>>>>>
>>>>> ...
>>>>>
>>>>>> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
>>>>>> index 3d43206..5af18d1 100644
>>>>>> --- a/net/9p/trans_virtio.c
>>>>>> +++ b/net/9p/trans_virtio.c
>>>>>> @@ -615,7 +615,8 @@ static void p9_virtio_remove(struct virtio_device *vdev)
>>>>>>  {
>>>>>>       struct virtio_chan *chan = vdev->priv;
>>>>>>
>>>>>> -     BUG_ON(chan->inuse);
>>>>>> +     if (chan->inuse)
>>>>>> +             p9_virtio_close(chan->client);
>>>>>>       vdev->config->del_vqs(vdev);
>>>>>>
>>>>>>       mutex_lock(&virtio_9p_lock);
>>>>>
>>>>> But an umount should have resulted in p9_virtio_close ? How are you
>>>>> removing the device ? Are you removing the device with file system
>>>>> mounted  ?. In that case may be we should return EBUSY ?
>>>>
>>>> I signal the underlying PCI device to remove (echo 1 >
>>>> /sys/devices/pci0000\:00/[...]/remove), we can't really prevent that
>>>> thing so we must clean up ourselves.
>>>
>>> What does that mean for the mounted file system ? What would happen to
>>> the pending fs operations in that case ?
>>
>> I'm guessing that all of them should be canceled.
>
> Pending operation we can cancel, but what about dirty pages in cached
> mode ? Also how does virtio-blk handle this ?

virtio-blk does pretty much the same thing:

sh-4.2# echo 1 > 0000\:00\:05.0/remove
sh-4.2# ls -al
[hang]

>> virtio-pci simulates
>> a PCI device, if the PCI device is unplugged there's not much to do
>> about the filesystem or pending requests.
>
> Ideal thing to do would be to make remove return -EBUSY; let the user
> umount. But looking at the code, I guess there is no easy way to return
> error from remove callback ?

There's no way to return error, the removal callback is just a notifier.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rusty Russell May 7, 2012, 3:15 a.m. UTC | #7
On Fri, 13 Apr 2012 17:48:36 -0400, Sasha Levin <levinsasha928@gmail.com> wrote:
> When a virtio_9p pci device is being removed, we should close down any
> active channels and free up resources, we're not supposed to BUG() if there's
> still an open channel since it's a valid case when removing the PCI device.
> 
> Otherwise, removing the PCI device with an open channel would cause the
> following BUG():

(Damn changed notmuch.el bindings!  Previous reply went only to Sasha).

Applied thanks,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Aneesh Kumar K.V May 7, 2012, 7:14 a.m. UTC | #8
Rusty Russell <rusty@rustcorp.com.au> writes:

> On Fri, 13 Apr 2012 17:48:36 -0400, Sasha Levin <levinsasha928@gmail.com> wrote:
>> When a virtio_9p pci device is being removed, we should close down any
>> active channels and free up resources, we're not supposed to BUG() if there's
>> still an open channel since it's a valid case when removing the PCI device.
>> 
>> Otherwise, removing the PCI device with an open channel would cause the
>> following BUG():
>
> (Damn changed notmuch.el bindings!  Previous reply went only to Sasha).
>
> Applied thanks,
> Rusty.

I am not sure whether the patch is sufficient, p9_virtio_remove does a
kfree(chan) and since we are not doing anything at the file system
level, we would still allow new 9p client request. That means
p9_virtio_request would be dereferencing a freed memory.

-aneesh

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 3d43206..5af18d1 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -615,7 +615,8 @@  static void p9_virtio_remove(struct virtio_device *vdev)
 {
 	struct virtio_chan *chan = vdev->priv;
 
-	BUG_ON(chan->inuse);
+	if (chan->inuse)
+		p9_virtio_close(chan->client);
 	vdev->config->del_vqs(vdev);
 
 	mutex_lock(&virtio_9p_lock);