From patchwork Sat Sep 22 10:30:34 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirsher, Jeffrey T" X-Patchwork-Id: 186121 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id CB5032C0094 for ; Sat, 22 Sep 2012 20:31:14 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755415Ab2IVKbK (ORCPT ); Sat, 22 Sep 2012 06:31:10 -0400 Received: from mga11.intel.com ([192.55.52.93]:8125 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755338Ab2IVKbI (ORCPT ); Sat, 22 Sep 2012 06:31:08 -0400 Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga102.fm.intel.com with ESMTP; 22 Sep 2012 03:30:54 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.80,467,1344236400"; d="scan'208";a="225405598" Received: from unknown (HELO jtkirshe-mobl.amr.corp.intel.com) ([10.255.13.235]) by fmsmga002.fm.intel.com with ESMTP; 22 Sep 2012 03:30:53 -0700 From: Jeff Kirsher To: davem@davemloft.net Cc: Alexander Duyck , netdev@vger.kernel.org, gospo@redhat.com, sassmann@redhat.com, Jeff Kirsher Subject: [net-next 5/7] igb: Change how we populate the RSS indirection table Date: Sat, 22 Sep 2012 03:30:34 -0700 Message-Id: <1348309836-7107-6-git-send-email-jeffrey.t.kirsher@intel.com> X-Mailer: git-send-email 1.7.11.4 In-Reply-To: <1348309836-7107-1-git-send-email-jeffrey.t.kirsher@intel.com> References: <1348309836-7107-1-git-send-email-jeffrey.t.kirsher@intel.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Alexander Duyck This patch cleans up our RSS indirection table configuration so that we generate the same table regardless of CPU endianness. In addition it changes the table setup so that instead of doing a modulo based setup it is instead a divisor based setup. The advantage to this is that we should be able to take the Rx hash and compute the Rx queue with very little CPU overhead if needed. Signed-off-by: Alexander Duyck Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/igb/igb_main.c | 55 +++++++++++++++---------------- 1 file changed, 26 insertions(+), 29 deletions(-) diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 91f542c..27688d9 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -2834,11 +2834,7 @@ static void igb_setup_mrqc(struct igb_adapter *adapter) { struct e1000_hw *hw = &adapter->hw; u32 mrqc, rxcsum; - u32 j, num_rx_queues, shift = 0, shift2 = 0; - union e1000_reta { - u32 dword; - u8 bytes[4]; - } reta; + u32 j, num_rx_queues, shift = 0; static const u8 rsshash[40] = { 0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2, 0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0, 0xd0, 0xca, 0x2b, 0xcb, @@ -2856,35 +2852,36 @@ static void igb_setup_mrqc(struct igb_adapter *adapter) num_rx_queues = adapter->rss_queues; - if (adapter->vfs_allocated_count) { - /* 82575 and 82576 supports 2 RSS queues for VMDq */ - switch (hw->mac.type) { - case e1000_i350: - case e1000_82580: - num_rx_queues = 1; - shift = 0; - break; - case e1000_82576: + switch (hw->mac.type) { + case e1000_82575: + shift = 6; + break; + case e1000_82576: + /* 82576 supports 2 RSS queues for SR-IOV */ + if (adapter->vfs_allocated_count) { shift = 3; num_rx_queues = 2; - break; - case e1000_82575: - shift = 2; - shift2 = 6; - default: - break; } - } else { - if (hw->mac.type == e1000_82575) - shift = 6; + break; + default: + break; } - for (j = 0; j < (32 * 4); j++) { - reta.bytes[j & 3] = (j % num_rx_queues) << shift; - if (shift2) - reta.bytes[j & 3] |= num_rx_queues << shift2; - if ((j & 3) == 3) - wr32(E1000_RETA(j >> 2), reta.dword); + /* + * Populate the indirection table 4 entries at a time. To do this + * we are generating the results for n and n+2 and then interleaving + * those with the results with n+1 and n+3. + */ + for (j = 0; j < 32; j++) { + /* first pass generates n and n+2 */ + u32 base = ((j * 0x00040004) + 0x00020000) * num_rx_queues; + u32 reta = (base & 0x07800780) >> (7 - shift); + + /* second pass generates n+1 and n+3 */ + base += 0x00010001 * num_rx_queues; + reta |= (base & 0x07800780) << (1 + shift); + + wr32(E1000_RETA(j), reta); } /*