[net,v3] ibmvnic fix NULL tx_pools and rx_pools issue at do_reset

Message ID: 20200825172641.806912-1-drt@linux.ibm.com
State: Handled Elsewhere

Checks

Context                          Check    Description
snowpatch_ozlabs/apply_patch     success  Successfully applied on branch powerpc/merge (d4ecce4dcc8f8820286cf4e0859850c555e89854)
snowpatch_ozlabs/build-ppc64le   warning  Upstream build failed, couldn't test patch
snowpatch_ozlabs/build-ppc64be   warning  Upstream build failed, couldn't test patch
snowpatch_ozlabs/build-ppc64e    warning  Upstream build failed, couldn't test patch
snowpatch_ozlabs/build-pmac32    warning  Upstream build failed, couldn't test patch
snowpatch_ozlabs/checkpatch      warning  total: 0 errors, 1 warnings, 2 checks, 43 lines checked
snowpatch_ozlabs/needsstable     success  Patch has no Fixes tags

Commit Message

Dany Madden Aug. 25, 2020, 5:26 p.m. UTC
From: Mingming Cao <mmc@linux.vnet.ibm.com>

During do_reset, ibmvnic tries to re-initialize the tx_pools
and rx_pools to avoid re-allocating the long term buffers. However,
there is a window inside do_reset in which the tx_pools and
rx_pools have been freed but not yet re-initialized, making it possible
to dereference NULL pointers.

Fix this by always checking, after ibmvnic_login, whether the tx_pool
and rx_pool are NULL and, if so, re-allocating the pools. This avoids
calling reset_tx_pools/reset_rx_pools with NULL adapter tx_pool/rx_pool
pointers. Also add NULL pointer checks in reset_tx_pools and
reset_rx_pools to safely handle the NULL pointer case.

Signed-off-by: Mingming Cao <mmc@linux.vnet.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
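
The decision described in the commit message can be sketched in standalone C. This is only an illustration under stated assumptions: `struct sketch_adapter`, `pools_need_realloc()` and `reset_pool_guarded()` are hypothetical names that do not exist in the driver, which works on struct ibmvnic_adapter and its own release/init/reset routines.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the adapter; only the pool pointers matter here. */
struct sketch_adapter {
	void *rx_pool;
	void *tso_pool;
	void *tx_pool;
};

/*
 * True when any pool array is missing. In that case the pools have to be
 * released and allocated again instead of being reset in place.
 */
static bool pools_need_realloc(const struct sketch_adapter *a)
{
	return !a->rx_pool || !a->tso_pool || !a->tx_pool;
}

/*
 * Guarded reset of one pool array: refuse to touch a pointer that was
 * already freed. The -1 mirrors the value used in the patch.
 */
static int reset_pool_guarded(void *pool)
{
	if (!pool)
		return -1;

	/* Per-pool re-initialization would go here. */
	return 0;
}
```

The first helper corresponds to the extra conditions added to the if statement in do_reset; the second corresponds to the early returns added to reset_rx_pools and reset_tx_pools.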

Comments

David Miller Aug. 26, 2020, 12:31 a.m. UTC | #1
From: Dany Madden <drt@linux.ibm.com>
Date: Tue, 25 Aug 2020 13:26:41 -0400

> From: Mingming Cao <mmc@linux.vnet.ibm.com>
> 
> During do_reset, ibmvnic tries to re-initialize the tx_pools
> and rx_pools to avoid re-allocating the long term buffers. However,
> there is a window inside do_reset in which the tx_pools and
> rx_pools have been freed but not yet re-initialized, making it possible
> to dereference NULL pointers.
> 
> Fix this by always checking, after ibmvnic_login, whether the tx_pool
> and rx_pool are NULL and, if so, re-allocating the pools. This avoids
> calling reset_tx_pools/reset_rx_pools with NULL adapter tx_pool/rx_pool
> pointers. Also add NULL pointer checks in reset_tx_pools and
> reset_rx_pools to safely handle the NULL pointer case.
> 
> Signed-off-by: Mingming Cao <mmc@linux.vnet.ibm.com>
> Signed-off-by: Dany Madden <drt@linux.ibm.com>

Applied, but:

> +	if (!adapter->rx_pool)
> +		return -1;
> +

This driver has poor error code usage; it's a random mix of hypervisor
error codes, normal error codes like -EINVAL, and internal error codes,
sometimes all used in the same function.

For example:

static int ibmvnic_send_crq(struct ibmvnic_adapter *adapter,
			    union ibmvnic_crq *crq)
 ...
	if (!adapter->crq.active &&
	    crq->generic.first != IBMVNIC_CRQ_INIT_CMD) {
		dev_warn(dev, "Invalid request detected while CRQ is inactive, possible device state change during reset\n");
		return -EINVAL;
	}
 ...
	rc = plpar_hcall_norets(H_SEND_CRQ, ua,
				cpu_to_be64(u64_crq[0]),
				cpu_to_be64(u64_crq[1]));

	if (rc) {
		if (rc == H_CLOSED) {
 ...
	return rc;

So obviously this function returns a mix of negative error codes
and hypervisor codes such as H_CLOSED.

And stuff like:

	rc = __ibmvnic_open(netdev);
	if (rc)
		return IBMVNIC_OPEN_FAILED;

Mingming Cao Aug. 26, 2020, 1:14 a.m. UTC | #2
> On Aug 25, 2020, at 5:31 PM, David Miller <davem@davemloft.net> wrote:
> 
> From: Dany Madden <drt@linux.ibm.com>
> Date: Tue, 25 Aug 2020 13:26:41 -0400
> 
>> From: Mingming Cao <mmc@linux.vnet.ibm.com>
>> 
>> During do_reset, ibmvnic tries to re-initialize the tx_pools
>> and rx_pools to avoid re-allocating the long term buffers. However,
>> there is a window inside do_reset in which the tx_pools and
>> rx_pools have been freed but not yet re-initialized, making it possible
>> to dereference NULL pointers.
>> 
>> Fix this by always checking, after ibmvnic_login, whether the tx_pool
>> and rx_pool are NULL and, if so, re-allocating the pools. This avoids
>> calling reset_tx_pools/reset_rx_pools with NULL adapter tx_pool/rx_pool
>> pointers. Also add NULL pointer checks in reset_tx_pools and
>> reset_rx_pools to safely handle the NULL pointer case.
>> 
>> Signed-off-by: Mingming Cao <mmc@linux.vnet.ibm.com>
>> Signed-off-by: Dany Madden <drt@linux.ibm.com>
> 
> Applied, but:
> 
>> +	if (!adapter->rx_pool)
>> +		return -1;
>> +
> 
> This driver has poor error code usage; it's a random mix of hypervisor
> error codes, normal error codes like -EINVAL, and internal error codes,
> sometimes all used in the same function.
> 

Agreed, this needs to improve. For this fix, -1 was chosen to follow other parts of the driver that check for a NULL pointer and return -1. We should go through all of the -1 cases and replace them with proper error codes. That should be a separate patch.

> For example:
> 
> static int ibmvnic_send_crq(struct ibmvnic_adapter *adapter,
> 			    union ibmvnic_crq *crq)
> ...
> 	if (!adapter->crq.active &&
> 	    crq->generic.first != IBMVNIC_CRQ_INIT_CMD) {
> 		dev_warn(dev, "Invalid request detected while CRQ is inactive, possible device state change during reset\n");
> 		return -EINVAL;
> 	}
> ...
> 	rc = plpar_hcall_norets(H_SEND_CRQ, ua,
> 				cpu_to_be64(u64_crq[0]),
> 				cpu_to_be64(u64_crq[1]));
> 
> 	if (rc) {
> 		if (rc == H_CLOSED) {
> ...
> 	return rc;
> 
> So obviously this function returns a mix of negative error codes
> and hypervisor codes such as H_CLOSED.
> 
> And stuff like:
> 
> 	rc = __ibmvnic_open(netdev);
> 	if (rc)
> 		return IBMVNIC_OPEN_FAILED;

Agree. 

Mingming
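
One way to address the mixed return values raised in the review is to translate hypervisor return codes into negative errno values at a single point, so callers only ever see one kind of error. The following is a minimal sketch of that idea, not code from the driver: `sketch_hcall_to_errno()` and the `SKETCH_H_*` constants are hypothetical stand-ins (the real H_* values come from the powerpc hypervisor call header), and the choice of -ENOTCONN and -EIO is an assumption made for illustration.

```c
#include <errno.h>

/* Illustrative stand-ins for hypervisor return codes (assumed values). */
#define SKETCH_H_SUCCESS 0L
#define SKETCH_H_CLOSED  2L

/*
 * Translate a hypervisor return code into 0 or a negative errno.
 * The specific errno chosen for each code is a hypothetical mapping.
 */
static int sketch_hcall_to_errno(long hrc)
{
	switch (hrc) {
	case SKETCH_H_SUCCESS:
		return 0;
	case SKETCH_H_CLOSED:
		return -ENOTCONN;	/* peer side of the queue is closed */
	default:
		return -EIO;		/* anything else: generic I/O failure */
	}
}
```

With a helper of this kind, a function such as ibmvnic_send_crq could return only 0 or a negative errno, and callers that need to react to a closed queue would test for one specific errno instead of H_CLOSED.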

Patch

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 5afb3c9c52d2..d3a774331afc 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -479,6 +479,9 @@  static int reset_rx_pools(struct ibmvnic_adapter *adapter)
 	int i, j, rc;
 	u64 *size_array;
 
+	if (!adapter->rx_pool)
+		return -1;
+
 	size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
 		be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size));
 
@@ -649,6 +652,9 @@  static int reset_tx_pools(struct ibmvnic_adapter *adapter)
 	int tx_scrqs;
 	int i, rc;
 
+	if (!adapter->tx_pool)
+		return -1;
+
 	tx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs);
 	for (i = 0; i < tx_scrqs; i++) {
 		rc = reset_one_tx_pool(adapter, &adapter->tso_pool[i]);
@@ -2011,7 +2017,10 @@  static int do_reset(struct ibmvnic_adapter *adapter,
 		    adapter->req_rx_add_entries_per_subcrq !=
 		    old_num_rx_slots ||
 		    adapter->req_tx_entries_per_subcrq !=
-		    old_num_tx_slots) {
+		    old_num_tx_slots ||
+		    !adapter->rx_pool ||
+		    !adapter->tso_pool ||
+		    !adapter->tx_pool) {
 			release_rx_pools(adapter);
 			release_tx_pools(adapter);
 			release_napi(adapter);
@@ -2024,10 +2033,14 @@  static int do_reset(struct ibmvnic_adapter *adapter,
 		} else {
 			rc = reset_tx_pools(adapter);
 			if (rc)
+				netdev_dbg(adapter->netdev, "reset tx pools failed (%d)\n",
+						rc);
 				goto out;
 
 			rc = reset_rx_pools(adapter);
 			if (rc)
+				netdev_dbg(adapter->netdev, "reset rx pools failed (%d)\n",
+						rc);
 				goto out;
 		}
 		ibmvnic_disable_irqs(adapter);
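
A point worth checking in the last hunk: the netdev_dbg() calls are added between `if (rc)` and `goto out;` without braces. As the diff reads, only the debug call is governed by the condition, and the `goto out` that follows appears to run whether or not the reset failed. The standalone sketch below shows the difference. The function names are hypothetical, fprintf() stands in for netdev_dbg(), and a return value of 1 stands in for taking the `goto out` path.

```c
#include <stdio.h>

/*
 * How the compiler parses the unbraced form: only the log call is
 * conditional. Indentation here follows the actual control flow.
 */
static int reset_step_unbraced(int rc)
{
	if (rc)
		fprintf(stderr, "reset tx pools failed (%d)\n", rc);
	return 1;	/* stands in for "goto out": taken even when rc == 0 */
}

/* Braced form: logging and the early exit are both conditional on rc. */
static int reset_step_braced(int rc)
{
	if (rc) {
		fprintf(stderr, "reset tx pools failed (%d)\n", rc);
		return 1;	/* stands in for "goto out" */
	}
	return 0;		/* success: continue with the rest of the reset */
}
```

Calling both helpers with rc == 0 shows the problem: the unbraced version still takes the exit path, while the braced version carries on. Wrapping each netdev_dbg()/goto pair in braces would restore the behavior the original `if (rc) goto out;` had.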