Message ID | 1518970121-8539-1-git-send-email-tlfalcon@linux.vnet.ibm.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Series | [net] ibmvnic: Clean RX pools only during a hard reset |
From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Date: Sun, 18 Feb 2018 10:08:40 -0600

> Sorry, this fixes a bug in commit d0869c0071e4. The cause of the
> bug is that "stale" RX buffers containing packet data are returned
> to the driver after device close and open. While most buffers will be
> returned with an error and handled by the polling routine, some buffers
> will be returned as containing valid data. Unfortunately, the socket
> buffers allocated were already freed when the device was closed, so
> attempts to process them result in a panic.
>
> RX pools still need to be cleaned in some cases, such as during
> a fatal reset. In all other cases, the socket buffers will either
> be freed in the polling routine or processed by the kernel.
>
> Fixes: d0869c0071e4 ("ibmvnic: Clean RX pool buffers during device close")
> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>

It really shouldn't matter who, or how many times, clear_rx_pools() is
called.

Anyone who calls it and frees the SKBs will mark the SKB slots as NULL,
so any subsequent call cannot possibly double free the buffers.

At best you need to explain the problem better in the commit message.
On 02/19/2018 10:37 AM, David Miller wrote:
> From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
> Date: Sun, 18 Feb 2018 10:08:40 -0600
>
>> Sorry, this fixes a bug in commit d0869c0071e4. The cause of the
>> bug is that "stale" RX buffers containing packet data are returned
>> to the driver after device close and open. While most buffers will be
>> returned with an error and handled by the polling routine, some buffers
>> will be returned as containing valid data. Unfortunately, the socket
>> buffers allocated were already freed when the device was closed, so
>> attempts to process them result in a panic.
>>
>> RX pools still need to be cleaned in some cases, such as during
>> a fatal reset. In all other cases, the socket buffers will either
>> be freed in the polling routine or processed by the kernel.
>>
>> Fixes: d0869c0071e4 ("ibmvnic: Clean RX pool buffers during device close")
>> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
> It really shouldn't matter who, or how many times, clear_rx_pools() is
> called.
>
> Anyone who calls it and frees the SKBs will mark the SKB slots as NULL,
> so any subsequent call cannot possibly double free the buffers.
>
> At best you need to explain the problem better in the commit message.

Sorry, I should explain it better. It's not there is a double free.
It's that the driver is receiving RX descriptors from the previous
session for which socket buffers have been freed. The driver's
polling routine tries to copy data to the socket buffer, but it's
been freed, so it's trying to copy to a NULL pointer.

Tom
From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Date: Mon, 19 Feb 2018 13:24:52 -0600

> Sorry, I should explain it better. It's not there is a double free.
> It's that the driver is receiving RX descriptors from the previous
> session for which socket buffers have been freed. The driver's
> polling routine tries to copy data to the socket buffer, but it's
> been freed, so it's trying to copy to a NULL pointer.

That's kinda hairy, is this resend of the old descriptors guaranteed
to always happen in this situation?

Maybe it's better to have some way for the RX descriptor receiving
path to detect this situation (is SKB slot NULL?) to handle the
problem there.

Thanks.
On 02/19/2018 01:30 PM, David Miller wrote:
> From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
> Date: Mon, 19 Feb 2018 13:24:52 -0600
>
>> Sorry, I should explain it better. It's not there is a double free.
>> It's that the driver is receiving RX descriptors from the previous
>> session for which socket buffers have been freed. The driver's
>> polling routine tries to copy data to the socket buffer, but it's
>> been freed, so it's trying to copy to a NULL pointer.
> That's kinda hairy, is this resend of the old descriptors guaranteed
> to always happen in this situation?
>
> Maybe it's better to have some way for the RX descriptor receiving
> path to detect this situation (is SKB slot NULL?) to handle the
> problem there.

It is something we can expect to happen in this situation. Thanks for
the suggestion. That way the driver can free up that memory when it
closes. I'll try to get a v2 out soon. Thanks again.

> Thanks.
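The check suggested in the thread (drop a completion whose SKB slot is already NULL rather than copying into it) might look roughly like the sketch below. It is a minimal user-space model under stated assumptions, not the ibmvnic code: `fake_skb`, `rx_slot`, `rx_desc`, `rx_pool`, and `rx_handle_desc()` are hypothetical names invented for illustration, and the real driver's descriptor layout and polling loop are not shown in this thread.

```c
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4
#define BUF_LEN   64

/* Hypothetical stand-ins for illustration; not the ibmvnic structures. */
struct fake_skb {
	unsigned char data[BUF_LEN];
	size_t len;
};

struct rx_slot {
	struct fake_skb *skb;	/* NULL once the buffer has been freed */
};

struct rx_desc {
	size_t index;			/* pool slot this completion refers to */
	const unsigned char *payload;
	size_t len;
};

struct rx_pool {
	struct rx_slot slots[POOL_SIZE];
};

/*
 * Handle one RX completion. Returns 1 if the payload was copied into the
 * slot's buffer, 0 if the descriptor was dropped (out-of-range index,
 * oversized payload, or a stale descriptor whose buffer is already gone).
 */
static int rx_handle_desc(struct rx_pool *pool, const struct rx_desc *desc)
{
	struct fake_skb *skb;

	if (desc->index >= POOL_SIZE || desc->len > BUF_LEN)
		return 0;

	skb = pool->slots[desc->index].skb;
	if (skb == NULL)
		return 0;	/* stale descriptor from a previous session */

	memcpy(skb->data, desc->payload, desc->len);
	skb->len = desc->len;

	/* The buffer is handed up the stack; the slot no longer owns it. */
	pool->slots[desc->index].skb = NULL;
	return 1;
}
```

With a guard of this kind in the receive path, the close path could free every RX buffer unconditionally, since a late completion for a freed slot would be dropped instead of dereferencing a NULL pointer.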
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 996f475..6710313 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1179,7 +1179,9 @@ static int __ibmvnic_close(struct net_device *netdev)
 			}
 		}
 	}
-	clean_rx_pools(adapter);
+	if (unlikely(adapter->resetting &&
+		     adapter->reset_reason != VNIC_RESET_NON_FATAL))
+		clean_rx_pools(adapter);
 	clean_tx_pools(adapter);
 	adapter->state = VNIC_CLOSED;
 	return rc;
Sorry, this fixes a bug in commit d0869c0071e4. The cause of the
bug is that "stale" RX buffers containing packet data are returned
to the driver after device close and open. While most buffers will be
returned with an error and handled by the polling routine, some buffers
will be returned as containing valid data. Unfortunately, the socket
buffers allocated were already freed when the device was closed, so
attempts to process them result in a panic.

RX pools still need to be cleaned in some cases, such as during
a fatal reset. In all other cases, the socket buffers will either
be freed in the polling routine or processed by the kernel.

Fixes: d0869c0071e4 ("ibmvnic: Clean RX pool buffers during device close")
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
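
The condition the diff adds to `__ibmvnic_close()` can be read as a small predicate. The sketch below restates it in standalone C so the cases are easy to enumerate. `adapter_state`, `should_clean_rx_pools()`, and the enum members are hypothetical names for illustration; only the shape of the test (`resetting` set and the reason not equal to the non-fatal value) comes from the diff.

```c
#include <stdbool.h>

/* Hypothetical mirror of the two adapter fields the diff tests. */
enum reset_reason {
	RESET_NON_FATAL,	/* stands in for VNIC_RESET_NON_FATAL */
	RESET_FATAL,		/* assumed name for a fatal reset */
	RESET_OTHER		/* any other reason; assumed for illustration */
};

struct adapter_state {
	bool resetting;
	enum reset_reason reset_reason;
};

/*
 * True only when a reset is in progress and its reason is anything other
 * than non-fatal. An ordinary close (resetting == false) never cleans the
 * RX pools, whatever value reset_reason happens to hold.
 */
static bool should_clean_rx_pools(const struct adapter_state *a)
{
	return a->resetting && a->reset_reason != RESET_NON_FATAL;
}
```

Written this way, the test covers every reset reason except the non-fatal one, which is somewhat broader than the "fatal reset" example given in the commit message.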