[net] ibmvnic: Clean RX pools only during a hard reset

Message ID 1518970121-8539-1-git-send-email-tlfalcon@linux.vnet.ibm.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Thomas Falcon Feb. 18, 2018, 4:08 p.m. UTC
Sorry, this fixes a bug in commit d0869c0071e4. The cause of the
bug is that "stale" RX buffers containing packet data are returned
to the driver after device close and open. While most buffers will be
returned with an error and handled by the polling routine, some buffers
will be returned as containing valid data. Unfortunately, the socket
buffers allocated were already freed when the device was closed, so
attempts to process them result in a panic.

RX pools still need to be cleaned in some cases, such as during
a fatal reset. In all other cases, the socket buffers will either
be freed in the polling routine or processed by the kernel.

Fixes: d0869c0071e4 ("ibmvnic: Clean RX pool buffers during device close")
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

David Miller Feb. 19, 2018, 4:37 p.m. UTC | #1
From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Date: Sun, 18 Feb 2018 10:08:40 -0600

> Sorry, this fixes a bug in commit d0869c0071e4. The cause of the
> bug is that "stale" RX buffers containing packet data are returned
> to the driver after device close and open. While most buffers will be
> returned with an error and handled by the polling routine, some buffers
> will be returned as containing valid data. Unfortunately, the socket
> buffers allocated were already freed when the device was closed, so
> attempts to process them result in a panic.
> 
> RX pools still need to be cleaned in some cases, such as during
> a fatal reset. In all other cases, the socket buffers will either
> be freed in the polling routine or processed by the kernel.
> 
> Fixes: d0869c0071e4 ("ibmvnic: Clean RX pool buffers during device close")
> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>

It really shouldn't matter who calls clean_rx_pools(), or how many
times it is called.

Anyone who calls it and frees the SKBs will mark the SKB slots as NULL,
so any subsequent call cannot possibly double free the buffers.

At best you need to explain the problem better in the commit message.

Thomas Falcon Feb. 19, 2018, 7:24 p.m. UTC | #2
On 02/19/2018 10:37 AM, David Miller wrote:
> From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
> Date: Sun, 18 Feb 2018 10:08:40 -0600
>
>> Sorry, this fixes a bug in commit d0869c0071e4. The cause of the
>> bug is that "stale" RX buffers containing packet data are returned
>> to the driver after device close and open. While most buffers will be
>> returned with an error and handled by the polling routine, some buffers
>> will be returned as containing valid data. Unfortunately, the socket
>> buffers allocated were already freed when the device was closed, so
>> attempts to process them result in a panic.
>>
>> RX pools still need to be cleaned in some cases, such as during
>> a fatal reset. In all other cases, the socket buffers will either
>> be freed in the polling routine or processed by the kernel.
>>
>> Fixes: d0869c0071e4 ("ibmvnic: Clean RX pool buffers during device close")
>> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
> It really shouldn't matter who calls clean_rx_pools(), or how many
> times it is called.
>
> Anyone who calls it and frees the SKBs will mark the SKB slots as NULL,
> so any subsequent call cannot possibly double free the buffers.
>
> At best you need to explain the problem better in the commit message.

Sorry, I should explain it better. It's not that there is a double free.
It's that the driver is receiving RX descriptors from the previous
session for which the socket buffers have already been freed. The
driver's polling routine tries to copy data into the socket buffer, but
the buffer has been freed and its slot is NULL, so the copy goes to a
NULL pointer.

Tom

David Miller Feb. 19, 2018, 7:30 p.m. UTC | #3
From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Date: Mon, 19 Feb 2018 13:24:52 -0600

> Sorry, I should explain it better. It's not that there is a double free.
> It's that the driver is receiving RX descriptors from the previous
> session for which socket buffers have been freed. The driver's
> polling routine tries to copy data to the socket buffer, but it's
> been freed, so it's trying to copy to a NULL pointer.

That's kinda hairy, is this resend of the old descriptors guaranteed
to always happen in this situation?

Maybe it's better to have some way for the RX descriptor receiving
path to detect this situation (is SKB slot NULL?) to handle the
problem there.

Thanks.
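
The check suggested above (is the SKB slot NULL?) can be sketched as a
small userspace model. This is only an illustration under stated
assumptions, not the ibmvnic code: struct rx_slot, struct rx_pool,
struct rx_desc, fill_rx_pool(), clean_rx_pool() and process_rx_desc()
are hypothetical names, and plain heap buffers stand in for socket
buffers. It shows two things discussed in this thread: cleaning the pool
twice is harmless because freed slots are set to NULL, and a descriptor
left over from a previous session is dropped instead of being copied to
a NULL pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POOL_SIZE 4
#define BUF_LEN   64

/* Hypothetical stand-in for one RX pool slot; not the driver's struct. */
struct rx_slot {
	unsigned char *skb;	/* models the socket buffer pointer */
};

struct rx_pool {
	struct rx_slot slot[POOL_SIZE];
};

/* Hypothetical RX completion descriptor handed back by the device. */
struct rx_desc {
	size_t index;		/* pool slot the data belongs to */
	const void *data;
	size_t len;
};

/* Allocate a buffer for every slot, as on device open. */
static int fill_rx_pool(struct rx_pool *pool)
{
	memset(pool, 0, sizeof(*pool));
	for (size_t i = 0; i < POOL_SIZE; i++) {
		pool->slot[i].skb = calloc(1, BUF_LEN);
		if (!pool->slot[i].skb)
			return -1;	/* caller cleans up with clean_rx_pool() */
	}
	return 0;
}

/* Free every buffer and mark the slot NULL, so a repeated call is a no-op. */
static void clean_rx_pool(struct rx_pool *pool)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		free(pool->slot[i].skb);	/* free(NULL) is allowed */
		pool->slot[i].skb = NULL;
	}
}

/*
 * Process one completion. Returns true if the data was copied, false if
 * the descriptor was dropped because it is malformed or stale.
 */
static bool process_rx_desc(struct rx_pool *pool, const struct rx_desc *desc)
{
	assert(pool && desc);

	if (desc->index >= POOL_SIZE || desc->len > BUF_LEN)
		return false;

	unsigned char *skb = pool->slot[desc->index].skb;
	if (!skb)
		return false;	/* buffer was freed in a previous session */

	memcpy(skb, desc->data, desc->len);
	return true;
}
```

With a guard like this in the receive path, the close path could keep
freeing the RX buffers unconditionally, which is the direction the reply
below takes for v2.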

Thomas Falcon Feb. 19, 2018, 8:01 p.m. UTC | #4
On 02/19/2018 01:30 PM, David Miller wrote:
> From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
> Date: Mon, 19 Feb 2018 13:24:52 -0600
>
>> Sorry, I should explain it better. It's not that there is a double free.
>> It's that the driver is receiving RX descriptors from the previous
>> session for which socket buffers have been freed. The driver's
>> polling routine tries to copy data to the socket buffer, but it's
>> been freed, so it's trying to copy to a NULL pointer.
> That's kinda hairy, is this resend of the old descriptors guaranteed
> to always happen in this situation?
>
> Maybe it's better to have some way for the RX descriptor receiving
> path to detect this situation (is SKB slot NULL?) to handle the
> problem there.

It is something we can expect to happen in this situation. Thanks for
the suggestion. With a check like that in the RX path, the driver can
still free the RX buffers when it closes. I'll try to get a v2 out soon.

Thanks again.

> Thanks.
>
Patch

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 996f475..6710313 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1179,7 +1179,9 @@  static int __ibmvnic_close(struct net_device *netdev)
 			}
 		}
 	}
-	clean_rx_pools(adapter);
+	if (unlikely(adapter->resetting &&
+		     adapter->reset_reason != VNIC_RESET_NON_FATAL))
+		clean_rx_pools(adapter);
 	clean_tx_pools(adapter);
 	adapter->state = VNIC_CLOSED;
 	return rc;