bnx2/BCM5709: why 5 interrupts on a 4 core system (2.6.33.3)

Message ID 1274042826.2299.26.camel@edumazet-laptop
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Eric Dumazet May 16, 2010, 8:47 p.m. UTC
On Sunday 16 May 2010 at 22:34 +0200, Krzysztof Olędzki wrote:
> On 2010-05-16 22:15, Eric Dumazet wrote:

> > All tx packets through bonding will use txqueue 0, since bnx2 doesn't
> > provide an ndo_select_queue() function.
> 
> OK, that explains everything. Thank you Eric. I assume it may take some 
> time for bonding to become multiqueue aware and/or bnx2x to provide 
> ndo_select_queue?
> 

bonding might become multiqueue aware, there are several patches
floating around.

But with your ping tests, it won't change the selected txqueue anyway: it
will be the same for any target, because skb_tx_hash() won't hash the
destination address, only skb->protocol.

> BTW: With a normal router workload, should I expect big performance drop 
> when receiving and forwarding the same packet using different CPUs? 
> Bonding provides very important functionality, I'm not able to drop it. :(
> 

Not sure what you mean by forwarding the same packet using different CPUs.
You probably meant different queues, because in the normal case only one
CPU is involved (the one receiving the packet is also the one
transmitting it, unless you have congestion or traffic shaping).

If you have 4 CPUs, you can use the following patch to make bonding
transparent to multiqueue. The bonding xmit path still takes a global
rwlock, though, so performance will not match what you get without bonding.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

George B. May 16, 2010, 9:06 p.m. UTC | #1
2010/5/16 Eric Dumazet <eric.dumazet@gmail.com>:
> On Sunday 16 May 2010 at 22:34 +0200, Krzysztof Olędzki wrote:
>> On 2010-05-16 22:15, Eric Dumazet wrote:
>
>> > All tx packets through bonding will use txqueue 0, since bnx2 doesnt
>> > provide a ndo_select_queue() function.
>>
>> OK, that explains everything. Thank you Eric. I assume it may take some
>> time for bonding to become multiqueue aware and/or bnx2x to provide
>> ndo_select_queue?
>>
>
> bonding might become multiqueue aware, there are several patches
> floating around.
>
> But with your ping tests, it wont change the selected txqueue anyway (it
> will be the same for any targets, because skb_tx_hash() wont hash the
> destination address, only the skb->protocol.
>
>> BTW: With a normal router workload, should I expect big performance drop
>> when receiving and forwarding the same packet using different CPUs?
>> Bonding provides very important functionality, I'm not able to drop it. :(
>>
>
> Not sure what you mean by forwarding same packet using different CPUs.
> You probably meant different queues, because in normal case, only one
> cpu is involved (the one receiving the packet is also the one
> transmitting it, unless you have congestion or trafic shaping)
>
> If you have 4 cpus, you can use following patch and have a transparent
> bonding against multiqueue. Still bonding xmit path hits a global
> rwlock, so performance is not what you can get without bonding.
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 5e12462..2c257f7 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -5012,8 +5012,8 @@ int bond_create(struct net *net, const char *name)
>
>        rtnl_lock();
>
> -       bond_dev = alloc_netdev(sizeof(struct bonding), name ? name : "",
> -                               bond_setup);
> +       bond_dev = alloc_netdev_mq(sizeof(struct bonding), name ? name : "",
> +                               bond_setup, 4);
>        if (!bond_dev) {
>                pr_err("%s: eek! can't alloc netdev!\n", name);
>                rtnl_unlock();
>

FWIW, later this week I will be comparing VLANs on top of bonded ethernet
interfaces against bonds built from VLAN interfaces (create a VLAN on each
of two interfaces and bond them together), to see whether I can notice any
performance difference. I expect I will when two or more VLANs are
experiencing heavy traffic.  What concerns me is whether, if one ethernet
goes away, the bond interface will see that the ethernet underlying the
VLAN interface has gone down.

So in summary, rather than bonding ethernet interfaces and then
applying VLANs to the bond, I intend to create VLANs on the ethernet
interfaces and bond them: one bond interface per VLAN, plus one for
the "raw" interfaces.  I am hoping that will allow better throughput
with multiple processors (and less head-of-line blocking for VLANs
with low traffic rates).  Note: that configuration doesn't work with
2.6.32, and I haven't tried it with 2.6.33. 2.6.34-rc7 lets me configure
it, though I haven't yet tested it on a multiqueue ethernet with
multiple processors.  I should have some systems to test with later
this week.
Krzysztof Oledzki May 16, 2010, 9:12 p.m. UTC | #2
On 2010-05-16 22:47, Eric Dumazet wrote:
> On Sunday 16 May 2010 at 22:34 +0200, Krzysztof Olędzki wrote:
>> On 2010-05-16 22:15, Eric Dumazet wrote:
>
>>> All tx packets through bonding will use txqueue 0, since bnx2 doesnt
>>> provide a ndo_select_queue() function.
>>
>> OK, that explains everything. Thank you Eric. I assume it may take some
>> time for bonding to become multiqueue aware and/or bnx2x to provide
>> ndo_select_queue?
>>
>
> bonding might become multiqueue aware, there are several patches
> floating around.
>
> But with your ping tests, it wont change the selected txqueue anyway (it
> will be the same for any targets, because skb_tx_hash() wont hash the
> destination address, only the skb->protocol.

What do you mean by "won't hash the destination address, only the
skb->protocol"? It won't hash the destination address for ICMP, or for
all IP protocols?

My normal workload is TCP and UDP based, so if it is only ICMP then there
is no problem. Actually I have noticeably more UDP traffic than an
average network, mainly because of LWAPP/CAPWAP, so I'm interested in
good performance for both TCP and UDP.

During my initial tests ICMP ping showed the same behavior as UDP/TCP
with iperf, so I stuck with it. I'll redo everything with UDP and TCP
of course. :)

>> BTW: With a normal router workload, should I expect big performance drop
>> when receiving and forwarding the same packet using different CPUs?
>> Bonding provides very important functionality, I'm not able to drop it. :(
>>
>
> Not sure what you mean by forwarding same packet using different CPUs.
> You probably meant different queues, because in normal case, only one
> cpu is involved (the one receiving the packet is also the one
> transmitting it, unless you have congestion or trafic shaping)

I mean receiving it on one CPU and sending it on a different one. I
would like to assign different vectors (eth1-0 .. eth1-4) to different
CPUs, but with bnx2x+bonding packets are received on queues 1-4 (eth1-1
.. eth1-4) and sent from queue 0 (eth1-0). So, for a single packet, two
different CPUs will be involved (RX on q1-q4, TX on q0).

> If you have 4 cpus, you can use following patch and have a transparent
> bonding against multiqueue.

Thanks! If I get it right: with the patch, packets should be sent using 
the same CPU (queue?) that was used when receiving?

> Still bonding xmit path hits a global
> rwlock, so performance is not what you can get without bonding.

It may not be perfect, but it should be much better than nothing, right?

Best regards,

			Krzysztof Olędzki
Eric Dumazet May 16, 2010, 9:26 p.m. UTC | #3
On Sunday 16 May 2010 at 23:12 +0200, Krzysztof Olędzki wrote:
> On 2010-05-16 22:47, Eric Dumazet wrote:
> > On Sunday 16 May 2010 at 22:34 +0200, Krzysztof Olędzki wrote:
> >> On 2010-05-16 22:15, Eric Dumazet wrote:
> >
> >>> All tx packets through bonding will use txqueue 0, since bnx2 doesnt
> >>> provide a ndo_select_queue() function.
> >>
> >> OK, that explains everything. Thank you Eric. I assume it may take some
> >> time for bonding to become multiqueue aware and/or bnx2x to provide
> >> ndo_select_queue?
> >>
> >
> > bonding might become multiqueue aware, there are several patches
> > floating around.
> >
> > But with your ping tests, it wont change the selected txqueue anyway (it
> > will be the same for any targets, because skb_tx_hash() wont hash the
> > destination address, only the skb->protocol.
> 
> What do you mean by "wont hash the destination address, only the 
> skb->protocol"? It won't hash the destination address for ICMP or for 
> all IP protocols?

Locally generated ICMP packets all use the same tx queue, because
sk->sk_hash is not set:

        if (skb->sk && skb->sk->sk_hash)
                hash = skb->sk->sk_hash;
        else
                hash = (__force u16) skb->protocol;

        hash = jhash_1word(hash, hashrnd);

        return (u16) (((u64) hash * dev->real_num_tx_queues) >> 32);
 



However, replies will be spread across the four queues if the hardware is
capable of hashing ICMP packets on their IP addresses (source/destination).

> 
> My normal workload is TCP and UDP based so if it is only ICMP then there 
> is no problem. Actually I have noticeably more UDP traffic than an 
> average network, mainly because of LWAPP/CAPWAP, so I'm interested in 
> good performance for both TCP and UDP.
> 
> During my initial tests ICMP ping showed the same behavior like UDP/TCP 
> with iperf, so I sticked with it. I'll redo everyting with UDP and TCP 
> of course. :)
> 
> >> BTW: With a normal router workload, should I expect big performance drop
> >> when receiving and forwarding the same packet using different CPUs?
> >> Bonding provides very important functionality, I'm not able to drop it. :(
> >>
> >
> > Not sure what you mean by forwarding same packet using different CPUs.
> > You probably meant different queues, because in normal case, only one
> > cpu is involved (the one receiving the packet is also the one
> > transmitting it, unless you have congestion or trafic shaping)
> 
> I mean to receive it on a one CPU and to send it on a different one. I 
> would like to assing different vectors (eth1-0 .. eth1-4) to different 
> CPUs, but with bnx2x+bonding packets are received on queues 1-4 (eth1-1 
> .. eth1-4) and sent from queue 0 (eth1-0). So, for a one packet, two 
> different CPUs will be involved (RX on q1-q4, TX on q0).

As I said, unless you use RPS, one forwarded packet only uses one CPU.
How the tx queue is selected is another story. We try to do a 1-1 mapping.

> 
> > If you have 4 cpus, you can use following patch and have a transparent
> > bonding against multiqueue.
> 
> Thanks! If I get it right: with the patch, packets should be sent using 
> the same CPU (queue?) that was used when receiving?

Yes, for forwarding loads.

(You might use 5 or 8 instead of 4, because it's not clear to me whether bnx2
has 5 txqueues or 4 in your case)

> 
> > Still bonding xmit path hits a global
> > rwlock, so performance is not what you can get without bonding.
> 
> It may not be perfect, but it should be much better than nothing, right?
> 

Sure.


Krzysztof Oledzki May 18, 2010, 2:22 p.m. UTC | #4
On 2010-05-16 23:26, Eric Dumazet wrote:

<CUT>

>> My normal workload is TCP and UDP based so if it is only ICMP then there
>> is no problem. Actually I have noticeably more UDP traffic than an
>> average network, mainly because of LWAPP/CAPWAP, so I'm interested in
>> good performance for both TCP and UDP.
>>
>> During my initial tests ICMP ping showed the same behavior like UDP/TCP
>> with iperf, so I sticked with it. I'll redo everyting with UDP and TCP
>> of course. :)
>>
>>>> BTW: With a normal router workload, should I expect big performance drop
>>>> when receiving and forwarding the same packet using different CPUs?
>>>> Bonding provides very important functionality, I'm not able to drop it. :(
>>>>
>>>
>>> Not sure what you mean by forwarding same packet using different CPUs.
>>> You probably meant different queues, because in normal case, only one
>>> cpu is involved (the one receiving the packet is also the one
>>> transmitting it, unless you have congestion or trafic shaping)
>>
>> I mean to receive it on a one CPU and to send it on a different one. I
>> would like to assing different vectors (eth1-0 .. eth1-4) to different
>> CPUs, but with bnx2x+bonding packets are received on queues 1-4 (eth1-1
>> .. eth1-4) and sent from queue 0 (eth1-0). So, for a one packet, two
>> different CPUs will be involved (RX on q1-q4, TX on q0).
>
> As I said, (unless you use RPS), one forwarded packet only uses one CPU.
> How tx queue is selected is another story. We try to do a 1-1 mapping.

OK, but with a multi-queue NIC, I can assign each queue to a different
CPU. So, while forwarding packets from a flow, I would like to use
the same queue on both input and output.

>>> If you have 4 cpus, you can use following patch and have a transparent
>>> bonding against multiqueue.
>>
>> Thanks! If I get it right: with the patch, packets should be sent using
>> the same CPU (queue?) that was used when receiving?
>
> Yes, for forwarding loads.
>
> (You might use 5 or 8 instead of 4, because its not clear to me if bnx2
> has 5 txqueues or 4 in your case)

Thank you. What happens if I set it to a lower or bigger value than the
number of txqueues available in the NIC?

Best regards,

			Krzysztof Olędzki
Eric Dumazet May 18, 2010, 2:26 p.m. UTC | #5
On Tuesday 18 May 2010 at 16:22 +0200, Krzysztof Olędzki wrote:

> Thank you. What happens if I set it to a lower/bigger value, than 
> avaliable txqueues in a NIC?

lower values -> same situation as today (not all txqueues will be
used)

bigger values -> it will be capped, so it's only a bit more RAM
allocated.



Krzysztof Oledzki May 18, 2010, 2:55 p.m. UTC | #6
On 2010-05-18 16:26, Eric Dumazet wrote:
> On Tuesday 18 May 2010 at 16:22 +0200, Krzysztof Olędzki wrote:
>
>> Thank you. What happens if I set it to a lower/bigger value, than
>> avaliable txqueues in a NIC?
>
> lower values ->  same situation than today (not all txqueues will be
> used)
>
> bigger values ->  it will be capped, so its only a bit more ram
> allocated.

So it is safe to put a slightly bigger value there than needed. Thanks.

Best regards,

			Krzysztof Olędzki
Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 5e12462..2c257f7 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -5012,8 +5012,8 @@  int bond_create(struct net *net, const char *name)
 
 	rtnl_lock();
 
-	bond_dev = alloc_netdev(sizeof(struct bonding), name ? name : "",
-				bond_setup);
+	bond_dev = alloc_netdev_mq(sizeof(struct bonding), name ? name : "",
+				bond_setup, 4);
 	if (!bond_dev) {
 		pr_err("%s: eek! can't alloc netdev!\n", name);
 		rtnl_unlock();