Patchwork [net-next,4/5] be2net: get rid of AMAP_SET/GET macros in TX path

login
register
mail settings
Submitter Sathya Perla
Date Sept. 27, 2012, 6:32 a.m.
Message ID <92de1988-73db-4864-bf19-10ed11dac557@CMEXHTCAS2.ad.emulex.com>
Download mbox | patch
Permalink /patch/187289/
State Changes Requested
Delegated to: David Miller
Headers show

Comments

Sathya Perla - Sept. 27, 2012, 6:32 a.m.
The AMAP macros are used in be2net for setting and parsing bits in HW
descriptors. The macros do this by calculating the mask & offset of each
field from the AMAP structure definition.
In the TX patch, replace the usage of these macros with code to explicitly
shift & mask each field. Doing this reduces instructions and improves
readability.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_hw.h   |   62 ++++++++-------------------
 drivers/net/ethernet/emulex/benet/be_main.c |   58 ++++++++++---------------
 2 files changed, 41 insertions(+), 79 deletions(-)
David Miller - Sept. 28, 2012, 2:29 a.m.
From: Sathya Perla <sathya.perla@emulex.com>
Date: Thu, 27 Sep 2012 12:02:47 +0530

> The AMAP macros are used in be2net for setting and parsing bits in HW
> descriptors. The macros do this by calculating the mask & offset of each
> field from the AMAP structure definition.
> In the TX patch, replace the usage of these macros with code to explicitly
> shift & mask each field. Doing this reduces instructions and improves
> readability.
> 
> Signed-off-by: Sathya Perla <sathya.perla@emulex.com>

Now you have endianness bugs.  The previous code worked with 8-bit
struct members and as such was endian neutral.

Now you're working with words, so you thus have to take endianness
into consideration.

The readability aspect is also extremely questionable, here's why.
The old code accessed struct members with _NAMES_ which described what
the values are and what they do.

This new code puts values into opaque "word" array members.  That's
about as crappy as it comes wrt. readability.  What in the world
does word[0] do?  I can't tell from it's name.  Yet with the existing
"struct amap_eth_hdr_wrb" there is none of that kind of confusion.

So don't pretend this new code even looks better, it looks like opaque
garbage to me.

Just admit this is an optimization that broke things on the endianness
you did not test.

be2net patches have been of a very low quality lately.  So I suggest
you improve things or else your submissions will be processed with an
extremely low priority.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sathya Perla - Sept. 28, 2012, 5:47 a.m.
>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net]
>
>Now you have endianness bugs.  The previous code worked with 8-bit
>struct members and as such was endian neutral.
>
>Now you're working with words, so you thus have to take endianness
>into consideration.

Dave,
endianness is handled even in this patch. The call to wrb_fill_hdr()
is followed by be_dws_cpu_to_le() to handle this.

The old code did not set the 8-bit members in the amap_eth_wrb_hdr struct.
The TX hdr wrb structure is actually 4 words (32-bit * 4) long. Each word
has some fields with different bit sizes. In the old code a *separate* psuedo
structure -- called the amap_eth_wrb_hdr -- was defined, wherein a field
of size x bits in the actual TX-hrd-wrb was defined as *x bytes* long. This was
done to be able to calculate the mask and shift values of each bit-field without having to define
them manually.

There has been feedback in the community earlier that this method of calculating
bit-field masks and shifts is unusual and to be consistent with other drivers' code we
manually define the mask and shift values for each field.

>
>The readability aspect is also extremely questionable, here's why.
>The old code accessed struct members with _NAMES_ which described what
>the values are and what they do.
>
>This new code puts values into opaque "word" array members.  That's
>about as crappy as it comes wrt. readability.  What in the world
>does word[0] do?  I can't tell from it's name.  Yet with the existing
>"struct amap_eth_hdr_wrb" there is none of that kind of confusion.

The word[2] of TX-hdr-wrb consists for various 1-bit fields.
I could call this word "flags". But, word[3] consists of unrelated fields like
wrb-num (5bits),  lso-mss-len(14 bits) etc . So, I just used dw[3].

I could rename the current definition of TX-hdr-wrb from
struct be_eth_hdr_wrb {
	u32 dw[4];
};

to

struct be_eth_hdr_wrb {
	u32 rsvd0;
	u32 rsvd0;
	/* compl, evt, crc & cksum bits */
	u32 flags;
	/* num-wrb, lso-mss, vlan-tci etc */
	u32 info;
};

Pls let me know if this addresses your concerns...

thanks,
-Sathya

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Sept. 28, 2012, 6:40 a.m.
From: "Perla, Sathya" <Sathya.Perla@Emulex.Com>
Date: Fri, 28 Sep 2012 05:47:25 +0000

> endianness is handled even in this patch. The call to wrb_fill_hdr()
> is followed by be_dws_cpu_to_le() to handle this.

That swap_dws() thing is the most inefficient thing I've ever seen.

Instead of being able to benefit from compile time optimizations
such as byte swaps of constants, you do everything hidden from the
compiler so nothing gets optimized.

The aspect you are changing is the least of your problems in this
area.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sathya Perla - Sept. 28, 2012, 7:05 a.m.
>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net]
>
>> endianness is handled even in this patch. The call to wrb_fill_hdr()
>> is followed by be_dws_cpu_to_le() to handle this.
>
>That swap_dws() thing is the most inefficient thing I've ever seen.
>
>Instead of being able to benefit from compile time optimizations
>such as byte swaps of constants, you do everything hidden from the
>compiler so nothing gets optimized.

I'd like to clarify that swap_dws() was being used/needed even in the old code. 
The AMAP_SET_BITS() macros (that this patch removes) did nothing to take care of endian byte-swapping.
They are just calculating the mask/shift values of each bit-field and setting the value
in host-endian.

But, as you are convinced that this patch doesn't provide much value, I'll just drop it and 
re-send the patch set....
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Sept. 28, 2012, 7:10 a.m.
From: "Perla, Sathya" <Sathya.Perla@Emulex.Com>
Date: Fri, 28 Sep 2012 07:05:27 +0000

>>-----Original Message-----
>>From: David Miller [mailto:davem@davemloft.net]
>>
>>> endianness is handled even in this patch. The call to wrb_fill_hdr()
>>> is followed by be_dws_cpu_to_le() to handle this.
>>
>>That swap_dws() thing is the most inefficient thing I've ever seen.
>>
>>Instead of being able to benefit from compile time optimizations
>>such as byte swaps of constants, you do everything hidden from the
>>compiler so nothing gets optimized.
> 
> I'd like to clarify that swap_dws() was being used/needed even in
> the old code.

I fully understand this.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/net/ethernet/emulex/benet/be_hw.h b/drivers/net/ethernet/emulex/benet/be_hw.h
index b755f70..32a84ab 100644
--- a/drivers/net/ethernet/emulex/benet/be_hw.h
+++ b/drivers/net/ethernet/emulex/benet/be_hw.h
@@ -273,56 +273,30 @@  struct be_eth_wrb {
 	u32 frag_len;		/* dword 3: bits 0 - 15 */
 } __packed;
 
-/* Pseudo amap definition for eth_hdr_wrb in which each bit of the
- * actual structure is defined as a byte : used to calculate
- * offset/shift/mask of each field */
-struct amap_eth_hdr_wrb {
-	u8 rsvd0[32];		/* dword 0 */
-	u8 rsvd1[32];		/* dword 1 */
-	u8 complete;		/* dword 2 */
-	u8 event;
-	u8 crc;
-	u8 forward;
-	u8 lso6;
-	u8 mgmt;
-	u8 ipcs;
-	u8 udpcs;
-	u8 tcpcs;
-	u8 lso;
-	u8 vlan;
-	u8 gso[2];
-	u8 num_wrb[5];
-	u8 lso_mss[14];
-	u8 len[16];		/* dword 3 */
-	u8 vlan_tag[16];
-} __packed;
+#define TX_HDR_WRB_COMPL		1		/* word 2 */
+#define TX_HDR_WRB_EVT			(1 << 1)	/* word 2 */
+#define TX_HDR_WRB_CRC			(1 << 2)	/* word 2 */
+#define TX_HDR_WRB_LSO6			(1 << 4)	/* word 2 */
+#define TX_HDR_WRB_IPCS			(1 << 6)	/* word 2 */
+#define TX_HDR_WRB_UDPCS		(1 << 7)	/* word 2 */
+#define TX_HDR_WRB_TCPCS		(1 << 8)	/* word 2 */
+#define TX_HDR_WRB_LSO			(1 << 9)	/* word 2 */
+#define TX_HDR_WRB_VLAN			(1 << 10)	/* word 2 */
+#define TX_HDR_WRB_NUM_SHIFT		13		/* word 2: bits 13:17 */
+#define TX_HDR_WRB_NUM_MASK		0x1F		/* word 2: bits 13:17 */
+#define TX_HDR_WRB_MSS_SHIFT		18		/* word 2: bits 18:31 */
+#define TX_HDR_WRB_MSS_MASK		0x3FFF		/* word 2: bits 18:31 */
+#define TX_HDR_WRB_LEN_MASK		0xFFFF		/* word 3: bits 0:15 */
+#define TX_HDR_WRB_VLAN_TCI_SHIFT	16		/* word 3: bits 16:31 */
+#define TX_HDR_WRB_VLAN_TCI_MASK	0xFFFF		/* word 3: bits 16:31 */
 
 struct be_eth_hdr_wrb {
 	u32 dw[4];
 };
 
 /* TX Compl Queue Descriptor */
-
-/* Pseudo amap definition for eth_tx_compl in which each bit of the
- * actual structure is defined as a byte: used to calculate
- * offset/shift/mask of each field */
-struct amap_eth_tx_compl {
-	u8 wrb_index[16];	/* dword 0 */
-	u8 ct[2]; 		/* dword 0 */
-	u8 port[2];		/* dword 0 */
-	u8 rsvd0[8];		/* dword 0 */
-	u8 status[4];		/* dword 0 */
-	u8 user_bytes[16];	/* dword 1 */
-	u8 nwh_bytes[8];	/* dword 1 */
-	u8 lso;			/* dword 1 */
-	u8 cast_enc[2];		/* dword 1 */
-	u8 rsvd1[5];		/* dword 1 */
-	u8 rsvd2[32];		/* dword 2 */
-	u8 pkts[16];		/* dword 3 */
-	u8 ringid[11];		/* dword 3 */
-	u8 hash_val[4];		/* dword 3 */
-	u8 valid;		/* dword 3 */
-} __packed;
+#define TX_COMPL_WRB_IDX_MASK		0xFFFF		/* word 0: bits 0:15 */
+#define TX_COMPL_VALID			(1 << 31)	/* word 3 */
 
 struct be_eth_tx_compl {
 	u32 dw[4];
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 4855dd6..c74906d 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -585,45 +585,34 @@  static int be_vlan_tag_chk(struct be_adapter *adapter, struct sk_buff *skb)
 static void wrb_fill_hdr(struct be_adapter *adapter, struct be_eth_hdr_wrb *hdr,
 		struct sk_buff *skb, u32 wrb_cnt, u32 len)
 {
-	u16 vlan_tag;
-
-	memset(hdr, 0, sizeof(*hdr));
-
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, crc, hdr, 1);
+	u32 dw2 = 0, dw3 = 0;
 
 	if (skb_is_gso(skb)) {
-		AMAP_SET_BITS(struct amap_eth_hdr_wrb, lso, hdr, 1);
-		AMAP_SET_BITS(struct amap_eth_hdr_wrb, lso_mss,
-			hdr, skb_shinfo(skb)->gso_size);
+		dw2 |= TX_HDR_WRB_LSO;
+		dw2 |= (skb_shinfo(skb)->gso_size & TX_HDR_WRB_MSS_MASK) <<
+			TX_HDR_WRB_MSS_SHIFT;
 		if (skb_is_gso_v6(skb) && !lancer_chip(adapter))
-			AMAP_SET_BITS(struct amap_eth_hdr_wrb, lso6, hdr, 1);
-		if (lancer_chip(adapter) && adapter->sli_family  ==
-							LANCER_A0_SLI_FAMILY) {
-			AMAP_SET_BITS(struct amap_eth_hdr_wrb, ipcs, hdr, 1);
-			if (is_tcp_pkt(skb))
-				AMAP_SET_BITS(struct amap_eth_hdr_wrb,
-								tcpcs, hdr, 1);
-			else if (is_udp_pkt(skb))
-				AMAP_SET_BITS(struct amap_eth_hdr_wrb,
-								udpcs, hdr, 1);
-		}
+			dw2 |= TX_HDR_WRB_LSO6;
 	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		if (is_tcp_pkt(skb))
-			AMAP_SET_BITS(struct amap_eth_hdr_wrb, tcpcs, hdr, 1);
+			dw2 |= TX_HDR_WRB_TCPCS;
 		else if (is_udp_pkt(skb))
-			AMAP_SET_BITS(struct amap_eth_hdr_wrb, udpcs, hdr, 1);
+			dw2 |= TX_HDR_WRB_UDPCS;
 	}
 
 	if (vlan_tx_tag_present(skb)) {
-		AMAP_SET_BITS(struct amap_eth_hdr_wrb, vlan, hdr, 1);
-		vlan_tag = be_get_tx_vlan_tag(adapter, skb);
-		AMAP_SET_BITS(struct amap_eth_hdr_wrb, vlan_tag, hdr, vlan_tag);
+		dw2 |= TX_HDR_WRB_VLAN;
+		dw3 = (be_get_tx_vlan_tag(adapter, skb) & 0xFFFF) <<
+				TX_HDR_WRB_VLAN_TCI_SHIFT;
 	}
+	dw2 |= TX_HDR_WRB_CRC | TX_HDR_WRB_EVT | TX_HDR_WRB_COMPL |
+		(wrb_cnt & TX_HDR_WRB_NUM_MASK) << TX_HDR_WRB_NUM_SHIFT;
+	dw3 |= len & TX_HDR_WRB_LEN_MASK;
 
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, event, hdr, 1);
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, complete, hdr, 1);
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, num_wrb, hdr, wrb_cnt);
-	AMAP_SET_BITS(struct amap_eth_hdr_wrb, len, hdr, len);
+	hdr->dw[2] = dw2;
+	hdr->dw[3] = dw3;
+	hdr->dw[0] = 0;
+	hdr->dw[1] = 0;
 }
 
 static void unmap_tx_frag(struct device *dev, struct be_eth_wrb *wrb,
@@ -1554,13 +1543,14 @@  static struct be_eth_tx_compl *be_tx_compl_get(struct be_queue_info *tx_cq)
 {
 	struct be_eth_tx_compl *txcp = queue_tail_node(tx_cq);
 
-	if (txcp->dw[offsetof(struct amap_eth_tx_compl, valid) / 32] == 0)
+	/* valid bit is bit 31 of dw[3] */
+	if (!txcp->dw[3])
 		return NULL;
 
 	rmb();
 	be_dws_le_to_cpu(txcp, sizeof(*txcp));
 
-	txcp->dw[offsetof(struct amap_eth_tx_compl, valid) / 32] = 0;
+	txcp->dw[3] = 0;
 
 	queue_tail_inc(tx_cq);
 	return txcp;
@@ -1686,9 +1676,7 @@  static void be_tx_compl_clean(struct be_adapter *adapter)
 		for_all_tx_queues(adapter, txo, i) {
 			txq = &txo->q;
 			while ((txcp = be_tx_compl_get(&txo->cq))) {
-				end_idx =
-					AMAP_GET_BITS(struct amap_eth_tx_compl,
-						      wrb_index, txcp);
+				end_idx = txcp->dw[0] & TX_COMPL_WRB_IDX_MASK;
 				num_wrbs += be_tx_compl_process(adapter, txo,
 								end_idx);
 				cmpl++;
@@ -2040,8 +2028,8 @@  static bool be_process_tx(struct be_adapter *adapter, struct be_tx_obj *txo,
 		if (!txcp)
 			break;
 		num_wrbs += be_tx_compl_process(adapter, txo,
-				AMAP_GET_BITS(struct amap_eth_tx_compl,
-					wrb_index, txcp));
+						txcp->dw[0] &
+						TX_COMPL_WRB_IDX_MASK);
 	}
 
 	if (work_done) {