diff mbox series

[nf-next,v4,1/5] net: sched: Micro-optimize egress handling

Message ID a2a8af1622dff2bfd51d446aa8da2c1d2f6f543c.1611304190.git.lukas@wunner.de
State Awaiting Upstream
Delegated to: Pablo Neira
Headers show
Series Netfilter egress hook | expand

Commit Message

Lukas Wunner Jan. 22, 2021, 8:47 a.m. UTC
sch_handle_egress() returns either the skb or NULL to signal to its
caller __dev_queue_xmit() whether a packet should continue to be
processed.

The skb is always non-NULL, otherwise __dev_queue_xmit() would hit a
NULL pointer deref right at its top.

But the compiler doesn't know that.  So if sch_handle_egress() signals
success by returning the skb, the "if (!skb) goto out;" statement
results in a gratuitous NULL pointer check in the Assembler output.

Avoid by telling the compiler that __dev_queue_xmit() is never passed a
NULL skb.  This also eliminates another gratuitous NULL pointer check in
__dev_queue_xmit()
  qdisc_pkt_len_init()
    skb_header_pointer()
      __skb_header_pointer()

The speedup is barely measurable:
Before: 1877 1875 1878 1874 1882 1873 Mb/sec
After:  1877 1877 1880 1883 1888 1886 Mb/sec

However we're about to add a netfilter egress hook to __dev_queue_xmit()
and without the micro-optimization, it will result in a performance
degradation which is indeed measurable:
With netfilter hook:               1853 1852 1850 1848 1849 1851 Mb/sec
With netfilter hook + micro-optim: 1874 1877 1881 1875 1876 1876 Mb/sec

The performance degradation is caused by a JNE instruction ("if (skb)")
being flipped to a JE instruction ("if (!skb)") once the netfilter hook
is added.  The micro-optimization removes the test and jump instructions
altogether.

Measurements were performed on a Core i7-3615QM.  Reproducer:
ip link add dev foo type dummy
ip link set dev foo up
tc qdisc add dev foo clsact
tc filter add dev foo egress bpf da bytecode '1,6 0 0 0,'
modprobe pktgen
echo "add_device foo" > /proc/net/pktgen/kpktgend_3
samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh -i foo -n 400000000 -m "11:11:11:11:11:11" -d 1.1.1.1

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
---
 net/core/dev.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Jakub Kicinski Jan. 24, 2021, 3:26 a.m. UTC | #1
On Fri, 22 Jan 2021 09:47:01 +0100 Lukas Wunner wrote:
> sch_handle_egress() returns either the skb or NULL to signal to its
> caller __dev_queue_xmit() whether a packet should continue to be
> processed.
> 
> The skb is always non-NULL, otherwise __dev_queue_xmit() would hit a
> NULL pointer deref right at its top.
> 
> But the compiler doesn't know that.  So if sch_handle_egress() signals
> success by returning the skb, the "if (!skb) goto out;" statement
> results in a gratuitous NULL pointer check in the Assembler output.

Which exact compiler are we talking about it? Did you report this?
As Eric pointed the compiler should be able to figure this out quite
easily.

> Avoid by telling the compiler that __dev_queue_xmit() is never passed a
> NULL skb.  This also eliminates another gratuitous NULL pointer check in
> __dev_queue_xmit()
>   qdisc_pkt_len_init()
>     skb_header_pointer()
>       __skb_header_pointer()
> 
> The speedup is barely measurable:
> Before: 1877 1875 1878 1874 1882 1873 Mb/sec
> After:  1877 1877 1880 1883 1888 1886 Mb/sec
> 
> However we're about to add a netfilter egress hook to __dev_queue_xmit()
> and without the micro-optimization, it will result in a performance
> degradation which is indeed measurable:
> With netfilter hook:               1853 1852 1850 1848 1849 1851 Mb/sec
> With netfilter hook + micro-optim: 1874 1877 1881 1875 1876 1876 Mb/sec
> 
> The performance degradation is caused by a JNE instruction ("if (skb)")
> being flipped to a JE instruction ("if (!skb)") once the netfilter hook
> is added.  The micro-optimization removes the test and jump instructions
> altogether.
> 
> Measurements were performed on a Core i7-3615QM.  Reproducer:
> ip link add dev foo type dummy
> ip link set dev foo up
> tc qdisc add dev foo clsact
> tc filter add dev foo egress bpf da bytecode '1,6 0 0 0,'
> modprobe pktgen
> echo "add_device foo" > /proc/net/pktgen/kpktgend_3
> samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh -i foo -n 400000000 -m "11:11:11:11:11:11" -d 1.1.1.1
> 
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Thomas Graf <tgraf@suug.ch>
> ---
>  net/core/dev.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 7afbb642e203..4c16b9932823 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4072,6 +4072,7 @@ struct netdev_queue *netdev_core_pick_tx(struct net_device *dev,
>   *      the BH enable code must have IRQs enabled so that it will not deadlock.
>   *          --BLG
>   */
> +__attribute__((nonnull(1)))
>  static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>  {
>  	struct net_device *dev = skb->dev;
Lukas Wunner Jan. 24, 2021, 10:46 a.m. UTC | #2
On Sat, Jan 23, 2021 at 07:26:24PM -0800, Jakub Kicinski wrote:
> On Fri, 22 Jan 2021 09:47:01 +0100 Lukas Wunner wrote:
> > sch_handle_egress() returns either the skb or NULL to signal to its
> > caller __dev_queue_xmit() whether a packet should continue to be
> > processed.
> > 
> > The skb is always non-NULL, otherwise __dev_queue_xmit() would hit a
> > NULL pointer deref right at its top.
> > 
> > But the compiler doesn't know that.  So if sch_handle_egress() signals
> > success by returning the skb, the "if (!skb) goto out;" statement
> > results in a gratuitous NULL pointer check in the Assembler output.
> 
> Which exact compiler are we talking about it? Did you report this?
> As Eric pointed the compiler should be able to figure this out quite
> easily.

I tested with gcc 8, 9, 10.

No need to report as it's the expected behavior with
-fno-delete-null-pointer-checks, whose motivation appears
questionable though (per my preceding e-mail).

Thanks,

Lukas
diff mbox series

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 7afbb642e203..4c16b9932823 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4072,6 +4072,7 @@  struct netdev_queue *netdev_core_pick_tx(struct net_device *dev,
  *      the BH enable code must have IRQs enabled so that it will not deadlock.
  *          --BLG
  */
+__attribute__((nonnull(1)))
 static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
 {
 	struct net_device *dev = skb->dev;