diff mbox

[RFC] csum experts, csum_replace2() is too expensive

Message ID 1395341341.9114.93.camel@edumazet-glaptop2.roam.corp.google.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet March 20, 2014, 6:49 p.m. UTC
csum_replace2() uses about 29 cycles, while a plain ip_send_check() is
way faster (16 cycles)

csum_partial() is not really meant for doing checksums over 8 bytes !

Any idea how to make the thing really fast as intended ?

I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
is insane...

Following patch might be the fastest thing ?

(At this point we already have validated IP checksum)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Herbert Xu March 21, 2014, 12:13 a.m. UTC | #1
On Thu, Mar 20, 2014 at 11:49:01AM -0700, Eric Dumazet wrote:
> csum_replace2() uses about 29 cycles, while a plain ip_send_check() is
> way faster (16 cycles)
> 
> csum_partial() is not really meant for doing checksums over 8 bytes !
> 
> Any idea how to make the thing really fast as intended ?
> 
> I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> is insane...
> 
> Following patch might be the fastest thing ?

Your patch makes sense to me.  However does it actually help
your 1% in the GRO profile?

Cheers,
Andi Kleen March 21, 2014, 1:56 a.m. UTC | #2
Eric Dumazet <eric.dumazet@gmail.com> writes:
>
> I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> is insane...


Couldn't it just be the cache miss?

-Andi
Eric Dumazet March 21, 2014, 2:51 a.m. UTC | #3
On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
> Eric Dumazet <eric.dumazet@gmail.com> writes:
> >
> > I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> > is insane...
> 
> 
> Couldn't it just be the cache miss?

Well, no, because we aggregate frames that were already fully checked
(by GRO engine/layers) in the same NAPI run.

In this particular case, it seems we aggregate few frames per run (its a
40Gb Intel NIC, it apparently is tuned for low latency)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller March 21, 2014, 6:07 p.m. UTC | #4
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 20 Mar 2014 11:49:01 -0700

> csum_replace2() uses about 29 cycles, while a plain ip_send_check() is
> way faster (16 cycles)
> 
> csum_partial() is not really meant for doing checksums over 8 bytes !
> 
> Any idea how to make the thing really fast as intended ?
> 
> I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> is insane...
> 
> Following patch might be the fastest thing ?
> 
> (At this point we already have validated IP checksum)
 ...
> @@ -1434,8 +1434,8 @@ static int inet_gro_complete(struct sk_buff *skb, int nhoff)
>  	int proto = iph->protocol;
>  	int err = -ENOSYS;
>  
> -	csum_replace2(&iph->check, iph->tot_len, newlen);
>  	iph->tot_len = newlen;
> +	ip_send_check(&iph);

Yeah the csum_replace*() are extremely suboptimal.

We should be able to cons up something cheap like the trick that
ip_decrease_ttl() uses.

https://tools.ietf.org/html/rfc1624
https://tools.ietf.org/html/rfc1141
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet March 21, 2014, 6:30 p.m. UTC | #5
On Fri, 2014-03-21 at 14:07 -0400, David Miller wrote:

> Yeah the csum_replace*() are extremely suboptimal.
> 
> We should be able to cons up something cheap like the trick that
> ip_decrease_ttl() uses.
> 
> https://tools.ietf.org/html/rfc1624
> https://tools.ietf.org/html/rfc1141

Thanks for the pointers, I'll cook a patch.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 8c54870db792..86c924c16f3c 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1434,8 +1434,8 @@  static int inet_gro_complete(struct sk_buff *skb, int nhoff)
 	int proto = iph->protocol;
 	int err = -ENOSYS;
 
-	csum_replace2(&iph->check, iph->tot_len, newlen);
 	iph->tot_len = newlen;
+	ip_send_check(&iph);
 
 	rcu_read_lock();
 	ops = rcu_dereference(inet_offloads[proto]);