diff mbox

net: skb_tx_hash() improvements

Message ID 49FAC112.6090808@cosmosbay.com
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet May 1, 2009, 9:29 a.m. UTC
Eric Dumazet a écrit :
> Eric Dumazet a écrit :
>> David, here is the followup I promised
>>
>> Thanks
>>
>> [PATCH] net: skb_tx_hash() improvements
>>
>> When skb_rx_queue_recorded() is true, we dont want to use jhash distribution
>> as the device driver exactly told us which queue was selected at RX time.
>> jhash makes a statistical shuffle, but this wont work with only 8 different inputs.
>>
>> We also need to implement a true reciprocal division, to not disturb
>> symmetric setups (when number of tx queues matches number of rx queues)
>> and cpu affinities.
>>
>> This patch introduces a new helper, dev_real_num_tx_queues_set()
>> to set both real_num_tx_queues and its reciprocal value,
>> and makes all drivers use this helper.
> 
> Oh well, this was wrong, I took divide result while we want a modulo !
> 
> Need to think a litle bit more :)
> 

So no need of a true reciprocal divide, just a refinement of first patch.

(Avoiding the divide if possible)

If incoming device has 4 rx queues, and outgoing device has 8 queues,
only 4 of tx queues are used, I wonder if we need some further improvement
here to better use all available tx queues ? Probably not in generic code...

[PATCH] net: skb_tx_hash() improvement

When skb_rx_queue_recorded() is true, we dont want to use jhash distribution
as the device driver exactly told us which queue was selected at RX time.
jhash makes a statistical shuffle, but this wont work with only 8 different inputs.

Same thing for the 'modulo' operation, that works only if inputs are
enough random (ie use all available 32 bits)

This patch avoids jhash computation (which cost ~50 instructions), but might
still need a modulo operation, in case number of tx queues is smaller
than number of rx queues.

Reported-by: Andrew Dickinson <andrew@whydna.net>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller May 1, 2009, 4:17 p.m. UTC | #1
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 01 May 2009 11:29:54 +0200

> -	} else if (skb->sk && skb->sk->sk_hash) {
> +		/*
> +		 * Try to avoid an expensive divide, for symmetric setups :
> +		 *   number of tx queues of output device ==
> +		 *   number of rx queues of incoming device
> +		 */
> +		if (hash >= dev->real_num_tx_queues)
> +			hash %= dev->real_num_tx_queues;
> +		return hash;
> +	}

Subtraction in a while() loop is almost certainly a lot
faster.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 308a7d0..b3acb51 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1737,9 +1737,19 @@  u16 skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb)
 
 	if (skb_rx_queue_recorded(skb)) {
 		hash = skb_get_rx_queue(skb);
-	} else if (skb->sk && skb->sk->sk_hash) {
+		/*
+		 * Try to avoid an expensive divide, for symmetric setups :
+		 *   number of tx queues of output device ==
+		 *   number of rx queues of incoming device
+		 */
+		if (hash >= dev->real_num_tx_queues)
+			hash %= dev->real_num_tx_queues;
+		return hash;
+	}
+
+	if (skb->sk && skb->sk->sk_hash)
 		hash = skb->sk->sk_hash;
-	} else
+	else
 		hash = skb->protocol;
 
 	hash = jhash_1word(hash, skb_tx_hashrnd);