
[net-next] net: introduce napi_schedule_irqoff()

Message ID 1414553435.631.41.camel@edumazet-glaptop2.roam.corp.google.com
State RFC, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Oct. 29, 2014, 3:30 a.m. UTC
On Tue, 2014-10-28 at 18:17 -0700, Alexei Starovoitov wrote:

> Interesting! Are you trying to optimize some of such NIC drivers? ;)

??

> but all of 10G+ and virtio won't apply, right?


Which 10G+ driver could not use this ?

mlx4 patch would be :




Comments

Alexei Starovoitov Oct. 29, 2014, 4:17 a.m. UTC | #1
On Tue, Oct 28, 2014 at 8:30 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2014-10-28 at 18:17 -0700, Alexei Starovoitov wrote:
>
>> Interesting! Are you trying to optimize some of such NIC drivers? ;)
>
> ??
>
>> but all of 10G+ and virtio won't apply, right?
>
>
> Which 10G+ driver could not use this ?
>
> mlx4 patch would be :

right.
but it's not a critical path for napi drivers. Why bother?
Eric Dumazet Oct. 29, 2014, 4:40 a.m. UTC | #2
On Tue, 2014-10-28 at 21:17 -0700, Alexei Starovoitov wrote:
> On Tue, Oct 28, 2014 at 8:30 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tue, 2014-10-28 at 18:17 -0700, Alexei Starovoitov wrote:
> >
> >> Interesting! Are you trying to optimize some of such NIC drivers? ;)
> >
> > ??
> >
> >> but all of 10G+ and virtio won't apply, right?
> >
> >
> > Which 10G+ driver could not use this ?
> >
> > mlx4 patch would be :
> 
> right.
> but it's not a critical path for napi drivers. Why bother?

You probably are missing something.

This is a critical path, of course.

Try netperf -t TCP_RR for example.




Alexei Starovoitov Oct. 29, 2014, 5:13 a.m. UTC | #3
On Tue, Oct 28, 2014 at 9:40 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2014-10-28 at 21:17 -0700, Alexei Starovoitov wrote:
>> On Tue, Oct 28, 2014 at 8:30 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On Tue, 2014-10-28 at 18:17 -0700, Alexei Starovoitov wrote:
>> >
>> >> Interesting! Are you trying to optimize some of such NIC drivers? ;)
>> >
>> > ??
>> >
>> >> but all of 10G+ and virtio won't apply, right?
>> >
>> >
>> > Which 10G+ driver could not use this ?
>> >
>> > mlx4 patch would be :
>>
>> right.
>> but it's not a critical path for napi drivers. Why bother?
>
> You probably are missing something.
>
> This is a critical path, of course.
>
> Try netperf -t TCP_RR for example.

tried 50 parallel netperf -t TCP_RR runs over ixgbe
and perf top showed tcp stack bits, qdisc locks and netperf itself.
What do you see?
Eric Dumazet Oct. 29, 2014, 6:20 a.m. UTC | #4
On Tue, 2014-10-28 at 22:13 -0700, Alexei Starovoitov wrote:

> tried 50 parallel netperf -t TCP_RR runs over ixgbe
> and perf top showed tcp stack bits, qdisc locks and netperf itself.
> What do you see?

You are kidding, right?

If you save 30 nsec (2 * 15 nsec) per transaction, and the rtt is about 20
usec, it's a 0.15% gain. Not bad for a trivial patch.

Why are you using 50 parallel netperf runs, instead of trying a single
netperf? I mentioned latency impact, not overall throughput.

Do you believe typical servers in data centers only send & receive bulk
packets, with no interrupts, and one cpu busy polling in the NAPI
handler?


Every atomic op we remove or avoid, every irq masking/unmasking we remove,
every cache line miss, extra bus transaction or TLB miss we avoid, is a
step toward better latency.

You should take a look at the recent commits I did; you'll get the general
picture if you missed it.

git log --oneline --author dumazet | head -100
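
For reference, the win on the RX/TX hard-IRQ path comes from skipping one
local_irq_save()/local_irq_restore() pair: a device interrupt handler already
runs with hard irqs masked, so the _irqoff variant can queue the NAPI context
directly. A minimal sketch of the two paths (shape only, assuming the new
helper mirrors the existing __napi_schedule()):

void __napi_schedule(struct napi_struct *n)
{
	unsigned long flags;

	/* Callable from any context: must save and restore IRQ state. */
	local_irq_save(flags);
	____napi_schedule(this_cpu_ptr(&softnet_data), n);
	local_irq_restore(flags);
}

/*
 * Caller guarantees hard irqs are already masked (e.g. the device's
 * interrupt handler), so the flags save/restore above is not needed.
 */
void __napi_schedule_irqoff(struct napi_struct *n)
{
	____napi_schedule(this_cpu_ptr(&softnet_data), n);
}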



Alexei Starovoitov Oct. 29, 2014, 3:26 p.m. UTC | #5
On Tue, Oct 28, 2014 at 11:20 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2014-10-28 at 22:13 -0700, Alexei Starovoitov wrote:
>
>> tried 50 parallel netperf -t TCP_RR runs over ixgbe
>> and perf top showed tcp stack bits, qdisc locks and netperf itself.
>> What do you see?
>
> You are kidding, right?
>
> If you save 30 nsec (2 * 15 nsec) per transaction, and the rtt is about 20
> usec, it's a 0.15% gain. Not bad for a trivial patch.

agreed.
I wasn't arguing against the patch at all; I was just curious what
performance gain we'd see. 0.15% is tiny and some might say the code
bloat is not worth it, but imo it's a good one. ack.

> Every atomic op we remove or avoid, every irq masking/unmasking we remove,
> every cache line miss, extra bus transaction or TLB miss we avoid, is a
> step toward better latency.

yes. We're saying the same thing.

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index c8e75dab80553c876b195361456fb49587231055..c562c1468944f9ad4319e5faaf19bf9e66d15eaf 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -878,8 +878,8 @@  void mlx4_en_rx_irq(struct mlx4_cq *mcq)
 	struct mlx4_en_cq *cq = container_of(mcq, struct mlx4_en_cq, mcq);
 	struct mlx4_en_priv *priv = netdev_priv(cq->dev);
 
-	if (priv->port_up)
-		napi_schedule(&cq->napi);
+	if (likely(priv->port_up))
+		napi_schedule_irqoff(&cq->napi);
 	else
 		mlx4_en_arm_cq(priv, cq);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 34c137878545fc672dad1a3d86e11c034c0ac368..5c4062921cdf46f1a7021a39705275c33ca4de77 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -479,8 +479,8 @@  void mlx4_en_tx_irq(struct mlx4_cq *mcq)
 	struct mlx4_en_cq *cq = container_of(mcq, struct mlx4_en_cq, mcq);
 	struct mlx4_en_priv *priv = netdev_priv(cq->dev);
 
-	if (priv->port_up)
-		napi_schedule(&cq->napi);
+	if (likely(priv->port_up))
+		napi_schedule_irqoff(&cq->napi);
 	else
 		mlx4_en_arm_cq(priv, cq);
 }
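
Not part of the mlx4 diff above: the driver-facing helper these hunks call.
Assuming it mirrors napi_schedule(), the wrapper would look roughly like the
sketch below; the only difference is that the IRQ-flags save/restore is
dropped, because mlx4_en_rx_irq()/mlx4_en_tx_irq() run in hard-IRQ context
with interrupts already masked.

static inline void napi_schedule_irqoff(struct napi_struct *n)
{
	/* Claim NAPI_STATE_SCHED first, exactly as napi_schedule() does... */
	if (napi_schedule_prep(n))
		/* ...then queue the poll without touching the IRQ flags. */
		__napi_schedule_irqoff(n);
}

With a wrapper of that shape, the two mlx4 hunks are a drop-in substitution:
behaviour when the port is down is unchanged, since the else branch still
re-arms the CQ.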