[net-next,v2,2/2] ethernet/intel: consolidate napi and napi exit

Message ID 20181108225532.26078-1-jeffrey.t.kirsher@intel.com
State Under Review
Delegated to: Jeff Kirsher
Series
  • Untitled series #74900

Commit Message

Jeff Kirsher Nov. 8, 2018, 10:55 p.m.
From: Jesse Brandeburg <jesse.brandeburg@intel.com>

While reviewing code, I noticed that Eric Dumazet recommends that
drivers check the return code of napi_complete_done, and use that
to decide whether to enable interrupts when exiting poll.  One of
the Intel drivers was already fixed (ixgbe).

Upon looking at the Intel drivers as a whole, we are handling our
polling and napi exit in a few different ways based on whether we
have multiqueue and whether we have tx cleanup included. Several
drivers had the bug of exiting napi with return 0 regardless of how
much work was done, which appears to mess up the accounting in the
stack.
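
One plausible reading of the accounting problem, shown as a simplified
user-space model rather than kernel code: the stack subtracts each poll's
return value from an overall budget, so a poll routine that always reports
0 is never charged for the packets it cleaned. All names below
(mock_rx_action, poll_reports_work, poll_reports_zero) are hypothetical
and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical poll callback: cleans `pending` packets and returns the
 * amount of work it reports to the stack. Not a kernel API.
 */
typedef int (*mock_poll_fn)(int pending, int weight);

/* Reports the packets actually cleaned. */
static int poll_reports_work(int pending, int weight)
{
	assert(pending < weight); /* model only the "all work done" exit */
	return pending;
}

/* Cleans the same packets but reports 0, like a "return 0" exit path. */
static int poll_reports_zero(int pending, int weight)
{
	assert(pending < weight);
	return 0;
}

/* Simplified stand-in for the stack's softirq loop (an assumption, not
 * the real implementation): charge each poll's return value against a
 * total budget and stop once the budget is used up. Returns how many
 * instances were polled in this pass.
 */
static int mock_rx_action(mock_poll_fn poll, const int *pending, int n,
			  int weight, int total_budget)
{
	int polled = 0;

	for (int i = 0; i < n; i++) {
		total_budget -= poll(pending[i], weight);
		polled++;
		if (total_budget <= 0)
			break;
	}
	return polled;
}
```

In this model, four instances with 40 packets each and a total budget of
100 stop after three polls when work is reported, but all four are polled
when every poll reports 0.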

Consolidate all the napi routines to use the best known way of exiting,
so that they mostly look like each other:
1) check the return code of napi_complete_done to control interrupt enable
2) return the actual amount of work done
3) return budget immediately if napi needs to poll again
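
The three rules above can be sketched as a user-space mock of a poll
routine. This is only an illustration under stated assumptions:
mock_napi, mock_nic, mock_napi_complete_done, mock_rx_clean and mock_poll
are hypothetical stand-ins, not kernel APIs, and the busy_poll_pending
flag is a simplification of whatever state makes napi_complete_done
return false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects; not real kernel APIs. */
struct mock_napi {
	bool busy_poll_pending; /* stack intends to poll again right away */
	bool scheduled;         /* instance is still in polling mode */
};

struct mock_nic {
	struct mock_napi napi;
	int rx_pending;  /* packets waiting in the ring */
	int irq_enables; /* how many times the interrupt was re-armed */
};

/* Models the return value of napi_complete_done(): false means the
 * caller should not re-arm its interrupt.
 */
static bool mock_napi_complete_done(struct mock_napi *napi, int work_done)
{
	(void)work_done;
	if (napi->busy_poll_pending)
		return false;
	napi->scheduled = false;
	return true;
}

/* Cleans up to `budget` packets and returns how many were cleaned. */
static int mock_rx_clean(struct mock_nic *nic, int budget)
{
	int n = nic->rx_pending < budget ? nic->rx_pending : budget;

	nic->rx_pending -= n;
	return n;
}

static int mock_poll(struct mock_nic *nic, int budget)
{
	int work_done;

	assert(budget > 0);
	work_done = mock_rx_clean(nic, budget);

	/* 3) budget fully consumed: stay in polling mode */
	if (work_done == budget)
		return budget;

	/* 1) only re-enable the interrupt if the stack agrees we are done */
	if (mock_napi_complete_done(&nic->napi, work_done))
		nic->irq_enables++;

	/* 2) report the work actually done */
	return work_done;
}
```

With 100 packets pending and a budget of 64, the first call returns 64
and leaves the interrupt masked; the second call returns 36 and re-arms
it, unless busy_poll_pending is set, in which case the interrupt stays
masked.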

Tested the changes on e1000e with a high interrupt rate set, and
it shows about an 8% reduction in the CPU utilization when busy
polling because we aren't re-enabling interrupts when we're about
to be polled.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---
v2: fixed conflicts when applied to my next-queue tree, dev-queue
    branch.  Most notably the hunk toward the ice driver did not apply
    due to upstream driver changes already applied to my tree

 drivers/net/ethernet/intel/e100.c             | 10 +++++----
 drivers/net/ethernet/intel/e1000/e1000_main.c | 11 +++++-----
 drivers/net/ethernet/intel/e1000e/netdev.c    | 17 +++++++-------
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 10 ++++-----
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  9 ++++----
 drivers/net/ethernet/intel/iavf/iavf_txrx.c   |  9 ++++----
 drivers/net/ethernet/intel/ice/ice_txrx.c     | 10 +++++----
 drivers/net/ethernet/intel/igb/igb_main.c     | 10 +++++----
 drivers/net/ethernet/intel/igbvf/netdev.c     |  9 +++++---
 drivers/net/ethernet/intel/igc/igc_main.c     | 10 +++++----
 .../net/ethernet/intel/ixgbevf/ixgbevf_main.c | 22 +++++++++++--------
 11 files changed, 73 insertions(+), 54 deletions(-)

Comments

Bowers, AndrewX Nov. 9, 2018, 6:26 p.m. | #1

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>

Keller, Jacob E Nov. 9, 2018, 11:29 p.m. | #2

The fm10k changes look good to me. Thanks Jesse! For those:

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Patch

diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c
index 7c4b55482f72..5e5c57db0d3f 100644
--- a/drivers/net/ethernet/intel/e100.c
+++ b/drivers/net/ethernet/intel/e100.c
@@ -2225,11 +2225,13 @@  static int e100_poll(struct napi_struct *napi, int budget)
 	e100_rx_clean(nic, &work_done, budget);
 	e100_tx_clean(nic);
 
-	/* If budget not fully consumed, exit the polling mode */
-	if (work_done < budget) {
-		napi_complete_done(napi, work_done);
+	/* If budget fully consumed, continue polling */
+	if (work_done == budget)
+		return budget;
+
+	/* only re-enable interrupt if stack agrees polling is really done */
+	if (likely(napi_complete_done(napi, work_done)))
 		e100_enable_irq(nic);
-	}
 
 	return work_done;
 }
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 43b6d3cec3b3..8fe9af0e2ab7 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -3803,14 +3803,15 @@  static int e1000_clean(struct napi_struct *napi, int budget)
 
 	adapter->clean_rx(adapter, &adapter->rx_ring[0], &work_done, budget);
 
-	if (!tx_clean_complete)
-		work_done = budget;
+	if (!tx_clean_complete || work_done == budget)
+		return budget;
 
-	/* If budget not fully consumed, exit the polling mode */
-	if (work_done < budget) {
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done))) {
 		if (likely(adapter->itr_setting & 3))
 			e1000_set_itr(adapter);
-		napi_complete_done(napi, work_done);
 		if (!test_bit(__E1000_DOWN, &adapter->flags))
 			e1000_irq_enable(adapter);
 	}
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index a387b21312e8..4244983fcd37 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -2653,9 +2653,9 @@  static int e1000_alloc_queues(struct e1000_adapter *adapter)
 /**
  * e1000e_poll - NAPI Rx polling callback
  * @napi: struct associated with this polling callback
- * @weight: number of packets driver is allowed to process this poll
+ * @budget: number of packets driver is allowed to process this poll
  **/
-static int e1000e_poll(struct napi_struct *napi, int weight)
+static int e1000e_poll(struct napi_struct *napi, int budget)
 {
 	struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter,
 						     napi);
@@ -2669,16 +2669,17 @@  static int e1000e_poll(struct napi_struct *napi, int weight)
 	    (adapter->rx_ring->ims_val & adapter->tx_ring->ims_val))
 		tx_cleaned = e1000_clean_tx_irq(adapter->tx_ring);
 
-	adapter->clean_rx(adapter->rx_ring, &work_done, weight);
+	adapter->clean_rx(adapter->rx_ring, &work_done, budget);
 
-	if (!tx_cleaned)
-		work_done = weight;
+	if (!tx_cleaned || work_done == budget)
+		return budget;
 
-	/* If weight not fully consumed, exit the polling mode */
-	if (work_done < weight) {
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done))) {
 		if (adapter->itr_setting & 3)
 			e1000_set_itr(adapter);
-		napi_complete_done(napi, work_done);
 		if (!test_bit(__E1000_DOWN, &adapter->state)) {
 			if (adapter->msix_entries)
 				ew32(IMS, adapter->rx_ring->ims_val);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 5b2a50e5798f..6fd15a734324 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -1465,11 +1465,11 @@  static int fm10k_poll(struct napi_struct *napi, int budget)
 	if (!clean_complete)
 		return budget;
 
-	/* all work done, exit the polling mode */
-	napi_complete_done(napi, work_done);
-
-	/* re-enable the q_vector */
-	fm10k_qv_enable(q_vector);
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done)))
+		fm10k_qv_enable(q_vector);
 
 	return min(work_done, budget - 1);
 }
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index c4d44096cdaf..a0b1575468fc 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2667,10 +2667,11 @@  int i40e_napi_poll(struct napi_struct *napi, int budget)
 	if (vsi->back->flags & I40E_TXR_FLAGS_WB_ON_ITR)
 		q_vector->arm_wb_state = false;
 
-	/* Work is done so exit the polling mode and re-enable the interrupt */
-	napi_complete_done(napi, work_done);
-
-	i40e_update_enable_itr(vsi, q_vector);
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done)))
+		i40e_update_enable_itr(vsi, q_vector);
 
 	return min(work_done, budget - 1);
 }
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
index 3b1dc77ae368..9b4d7cec2e18 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
@@ -1761,10 +1761,11 @@  int iavf_napi_poll(struct napi_struct *napi, int budget)
 	if (vsi->back->flags & IAVF_TXR_FLAGS_WB_ON_ITR)
 		q_vector->arm_wb_state = false;
 
-	/* Work is done so exit the polling mode and re-enable the interrupt */
-	napi_complete_done(napi, work_done);
-
-	iavf_update_enable_itr(vsi, q_vector);
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done)))
+		iavf_update_enable_itr(vsi, q_vector);
 
 	return min(work_done, budget - 1);
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 4b92863b3500..49fc38094185 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1103,10 +1103,12 @@  int ice_napi_poll(struct napi_struct *napi, int budget)
 	if (!clean_complete)
 		return budget;
 
-	/* Work is done so exit the polling mode and re-enable the interrupt */
-	napi_complete_done(napi, work_done);
-	if (test_bit(ICE_FLAG_MSIX_ENA, pf->flags))
-		ice_irq_dynamic_ena(&vsi->back->hw, vsi, q_vector);
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done)))
+		if (test_bit(ICE_FLAG_MSIX_ENA, pf->flags))
+			ice_irq_dynamic_ena(&vsi->back->hw, vsi, q_vector);
 
 	return min(work_done, budget - 1);
 }
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index b0f17d9f3cb0..bb4f3f64fbf0 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -7752,11 +7752,13 @@  static int igb_poll(struct napi_struct *napi, int budget)
 	if (!clean_complete)
 		return budget;
 
-	/* If not enough Rx work done, exit the polling mode */
-	napi_complete_done(napi, work_done);
-	igb_ring_irq_enable(q_vector);
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done)))
+		igb_ring_irq_enable(q_vector);
 
-	return 0;
+	return min(work_done, budget - 1);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index 820d49eb41ab..4eab83faec62 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -1186,10 +1186,13 @@  static int igbvf_poll(struct napi_struct *napi, int budget)
 
 	igbvf_clean_rx_irq(adapter, &work_done, budget);
 
-	/* If not enough Rx work done, exit the polling mode */
-	if (work_done < budget) {
-		napi_complete_done(napi, work_done);
+	if (work_done == budget)
+		return budget;
 
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done))) {
 		if (adapter->requested_itr & 3)
 			igbvf_set_itr(adapter);
 
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index d002055c0623..28ffe98f8921 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2852,11 +2852,13 @@  static int igc_poll(struct napi_struct *napi, int budget)
 	if (!clean_complete)
 		return budget;
 
-	/* If not enough Rx work done, exit the polling mode */
-	napi_complete_done(napi, work_done);
-	igc_ring_irq_enable(q_vector);
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done)))
+		igc_ring_irq_enable(q_vector);
 
-	return 0;
+	return min(work_done, budget - 1);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 196b890467b2..2de81f046fb5 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -1293,16 +1293,20 @@  static int ixgbevf_poll(struct napi_struct *napi, int budget)
 	/* If all work not completed, return budget and keep polling */
 	if (!clean_complete)
 		return budget;
-	/* all work done, exit the polling mode */
-	napi_complete_done(napi, work_done);
-	if (adapter->rx_itr_setting == 1)
-		ixgbevf_set_itr(q_vector);
-	if (!test_bit(__IXGBEVF_DOWN, &adapter->state) &&
-	    !test_bit(__IXGBEVF_REMOVING, &adapter->state))
-		ixgbevf_irq_enable_queues(adapter,
-					  BIT(q_vector->v_idx));
 
-	return 0;
+	/* Exit the polling mode, but don't re-enable interrupts if stack might
+	 * poll us due to busy-polling
+	 */
+	if (likely(napi_complete_done(napi, work_done))) {
+		if (adapter->rx_itr_setting == 1)
+			ixgbevf_set_itr(q_vector);
+		if (!test_bit(__IXGBEVF_DOWN, &adapter->state) &&
+		    !test_bit(__IXGBEVF_REMOVING, &adapter->state))
+			ixgbevf_irq_enable_queues(adapter,
+						  BIT(q_vector->v_idx));
+	}
+
+	return min(work_done, budget - 1);
 }
 
 /**