
[net-next,3/3] bpf: Only set node->ref = 1 if it has not been set

Message ID: 20170901062713.1842249-4-kafai@fb.com
State: Accepted, archived
Delegated to: David Miller
Series: bpf: Improve LRU map lookup performance

Commit Message

Martin KaFai Lau Sept. 1, 2017, 6:27 a.m. UTC
This patch writes 'node->ref = 1' only if node->ref is currently 0.
The number of lookups/s for a ~1M-entry LRU map increased by ~30%
(260097 to 343313).

The writes of 'node->ref = 0' elsewhere are left unchanged: in those
cases the same cache line has to be modified anyway, so skipping the
store would save nothing.
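
To make the idiom concrete, here is a minimal stand-alone C sketch of
the test-before-set pattern (struct and function names here are
illustrative, not the kernel's).  The idea is that once ref is set,
every later lookup performs only a load, so the cache line holding
ref can stay shared between CPUs instead of being dirtied on each hit.

#include <stdio.h>

/* Hypothetical stand-in for struct bpf_lru_node; only the ref byte
 * matters for this sketch.
 */
struct node_sketch {
	unsigned char ref;
};

/* The pattern the patch applies: read first, store only on the
 * 0 -> 1 transition.  A lookup that finds ref already set (the
 * common case for hot keys) now does a load instead of a store.
 */
static void set_ref_sketch(struct node_sketch *node)
{
	if (!node->ref)
		node->ref = 1;
}

int main(void)
{
	struct node_sketch n = { .ref = 0 };

	set_ref_sketch(&n);	/* first lookup: stores 1 */
	set_ref_sketch(&n);	/* later lookups: load only */
	printf("ref = %u\n", n.ref);
	return 0;
}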

First column: Size of the LRU hash
Second column: Number of lookups/s

Before:
> echo "$((2**20+1)): $(./map_perf_test 1024 1 $((2**20+1)) 10000000 | awk '{print $3}')"
1048577: 260097

After:
> echo "$((2**20+1)): $(./map_perf_test 1024 1 $((2**20+1)) 10000000 | awk '{print $3}')"
1048577: 343313

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/bpf_lru_list.h | 3 ++-
 kernel/bpf/hashtab.c      | 7 ++++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

Comments

Daniel Borkmann Sept. 1, 2017, 9:28 a.m. UTC | #1

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Alexei Starovoitov Sept. 1, 2017, 2:22 p.m. UTC | #2

Acked-by: Alexei Starovoitov <ast@kernel.org>

Patch

diff --git a/kernel/bpf/bpf_lru_list.h b/kernel/bpf/bpf_lru_list.h
index 5c35a98d02bf..7d4f89b7cb84 100644
--- a/kernel/bpf/bpf_lru_list.h
+++ b/kernel/bpf/bpf_lru_list.h
@@ -69,7 +69,8 @@ static inline void bpf_lru_node_set_ref(struct bpf_lru_node *node)
 	/* ref is an approximation on access frequency.  It does not
 	 * have to be very accurate.  Hence, no protection is used.
 	 */
-	node->ref = 1;
+	if (!node->ref)
+		node->ref = 1;
 }
 
 int bpf_lru_init(struct bpf_lru *lru, bool percpu, u32 hash_offset,
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 682f4543fefa..431126f31ea3 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -519,9 +519,14 @@ static u32 htab_lru_map_gen_lookup(struct bpf_map *map,
 {
 	struct bpf_insn *insn = insn_buf;
 	const int ret = BPF_REG_0;
+	const int ref_reg = BPF_REG_1;
 
 	*insn++ = BPF_EMIT_CALL((u64 (*)(u64, u64, u64, u64, u64))__htab_map_lookup_elem);
-	*insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 2);
+	*insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 4);
+	*insn++ = BPF_LDX_MEM(BPF_B, ref_reg, ret,
+			      offsetof(struct htab_elem, lru_node) +
+			      offsetof(struct bpf_lru_node, ref));
+	*insn++ = BPF_JMP_IMM(BPF_JNE, ref_reg, 0, 1);
 	*insn++ = BPF_ST_MEM(BPF_B, ret,
 			     offsetof(struct htab_elem, lru_node) +
 			     offsetof(struct bpf_lru_node, ref),
 			     1);
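
For reference, here is a hand-written C equivalent of the instruction
sequence that htab_lru_map_gen_lookup() emits after this patch.  It is
a sketch with stub types so it compiles on its own, not kernel source.
Note that the BPF_JEQ offset grows from 2 to 4 because a lookup miss
must now also jump over the two newly inserted instructions.

/* Stub types standing in for the kernel's struct bpf_lru_node and
 * struct htab_elem; only the ref byte matters for this sketch.
 */
struct lru_node_stub {
	unsigned char ref;
};

struct htab_elem_stub {
	struct lru_node_stub lru_node;
};

/* What the inlined BPF instructions do, expressed in C.  'l' is the
 * pointer returned by __htab_map_lookup_elem() in BPF_REG_0.
 */
static void inlined_lookup_sketch(struct htab_elem_stub *l)
{
	/* BPF_JMP_IMM(BPF_JEQ, ret, 0, 4): a miss skips everything */
	if (l == NULL)
		return;

	/* BPF_LDX_MEM + BPF_JMP_IMM(BPF_JNE, ref_reg, 0, 1):
	 * load lru_node.ref and skip the store if it is already set
	 */
	if (l->lru_node.ref == 0)
		/* BPF_ST_MEM(BPF_B, ...): the only remaining store */
		l->lru_node.ref = 1;
}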