
[bpf-next,03/11] libbpf: unify and speed up BTF string deduplication

Message ID 20201029005902.1706310-4-andrii@kernel.org
State Not Applicable
Delegated to: BPF Maintainers
Series libbpf: split BTF support

Checks

Context Check Description
jkicinski/cover_letter success Link
jkicinski/fixes_present success Link
jkicinski/patch_count success Link
jkicinski/tree_selection success Clearly marked for bpf-next
jkicinski/subject_prefix success Link
jkicinski/source_inline success Was 0 now: 0
jkicinski/verify_signedoff success Link
jkicinski/module_param success Was 0 now: 0
jkicinski/build_32bit fail Errors and warnings before: 4 this patch: 4
jkicinski/kdoc success Errors and warnings before: 0 this patch: 0
jkicinski/verify_fixes success Link
jkicinski/checkpatch fail Link
jkicinski/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
jkicinski/header_inline success Link
jkicinski/stable success Stable not CCed

Commit Message

Andrii Nakryiko Oct. 29, 2020, 12:58 a.m. UTC
From: Andrii Nakryiko <andriin@fb.com>

Revamp BTF dedup's string deduplication to match the approach of writable BTF
string management. This allows transferring the deduplicated string index back
to the BTF object after deduplication without expensive extra memory copying
and hash map re-construction. It also simplifies and speeds up the code,
because hashmap-based string deduplication is faster than the sort + unique
approach.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
 1 file changed, 99 insertions(+), 166 deletions(-)
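The hashmap-based scheme the commit message describes can be sketched in isolation like this (a minimal stand-in with a fixed-size open-addressing table, not libbpf's hashmap API; all names here are illustrative):

```c
#include <stdlib.h>
#include <string.h>

/* A BTF-style string section is a blob of NUL-terminated strings where
 * offset 0 is the empty string. dedup_add() appends a string only if an
 * equal one isn't already present and returns its (old or new) offset,
 * mirroring the HASHMAP_ADD logic in the patch's remap callback. */

#define TAB_SZ 1024

struct strs_dedup {
	char *data;        /* deduplicated strings blob */
	size_t len;
	long tab[TAB_SZ];  /* open-addressing table of offsets; -1 = empty */
};

static size_t str_hash(const char *s)
{
	size_t h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h;
}

static long dedup_add(struct strs_dedup *d, const char *s)
{
	size_t i = str_hash(s) % TAB_SZ;
	size_t len = strlen(s) + 1;
	long off;

	while (d->tab[i] != -1) {
		if (strcmp(d->data + d->tab[i], s) == 0)
			return d->tab[i]; /* duplicate: reuse existing offset */
		i = (i + 1) % TAB_SZ;
	}
	off = (long)d->len;
	memcpy(d->data + off, s, len); /* assumes data has room */
	d->len += len;
	d->tab[i] = off;
	return off;
}

static void dedup_init(struct strs_dedup *d, char *buf)
{
	int i;

	d->data = buf;
	d->data[0] = '\0'; /* offset 0 = empty string, as in BTF */
	d->len = 1;
	for (i = 0; i < TAB_SZ; i++)
		d->tab[i] = -1;
}
```

Each string is looked up and inserted in amortized O(1), which is where the speedup over sort + unique comes from.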

Comments

Song Liu Oct. 30, 2020, 11:32 p.m. UTC | #1
> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> From: Andrii Nakryiko <andriin@fb.com>
> 
> Revamp BTF dedup's string deduplication to match the approach of writable BTF
> string management. This allows to transfer deduplicated strings index back to
> BTF object after deduplication without expensive extra memory copying and hash
> map re-construction. It also simplifies the code and speeds it up, because
> hashmap-based string deduplication is faster than sort + unique approach.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

LGTM, with a couple of nitpicks below:

Acked-by: Song Liu <songliubraving@fb.com>

> ---
> tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
> 1 file changed, 99 insertions(+), 166 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 89fecfe5cb2b..db9331fea672 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -90,6 +90,14 @@ struct btf {
> 	struct hashmap *strs_hash;
> 	/* whether strings are already deduplicated */
> 	bool strs_deduped;
> +	/* extra indirection layer to make strings hashmap work with stable
> +	 * string offsets and ability to transparently choose between
> +	 * btf->strs_data or btf_dedup->strs_data as a source of strings.
> +	 * This is used for BTF strings dedup to transfer deduplicated strings
> +	 * data back to struct btf without re-building strings index.
> +	 */
> +	void **strs_data_ptr;
> +
> 	/* BTF object FD, if loaded into kernel */
> 	int fd;
> 
> @@ -1363,17 +1371,19 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
> 
> static size_t strs_hash_fn(const void *key, void *ctx)
> {
> -	struct btf *btf = ctx;
> -	const char *str = btf->strs_data + (long)key;
> +	const char ***strs_data_ptr = ctx;
> +	const char *strs = **strs_data_ptr;
> +	const char *str = strs + (long)key;

Can we keep using btf as the ctx here? "char ***" hurts my eyes...

[...]

> -	d->btf->hdr->str_len = end - start;
> +	/* replace BTF string data and hash with deduped ones */
> +	free(d->btf->strs_data);
> +	hashmap__free(d->btf->strs_hash);
> +	d->btf->strs_data = d->strs_data;
> +	d->btf->strs_data_cap = d->strs_cap;
> +	d->btf->hdr->str_len = d->strs_len;
> +	d->btf->strs_hash = d->strs_hash;
> +	/* now point strs_data_ptr back to btf->strs_data */
> +	d->btf->strs_data_ptr = &d->btf->strs_data;
> +
> +	d->strs_data = d->strs_hash = NULL;
> +	d->strs_len = d->strs_cap = 0;
> 	d->btf->strs_deduped = true;
> +	return 0;
> +
> +err_out:
> +	free(d->strs_data);
> +	hashmap__free(d->strs_hash);
> +	d->strs_data = d->strs_hash = NULL;
> +	d->strs_len = d->strs_cap = 0;
> +
> +	/* restore strings pointer for existing d->btf->strs_hash back */
> +	d->btf->strs_data_ptr = &d->strs_data;

We have quite a bit of duplicated code between the err_out and success cases.
How about we add a helper function, like:

void free_strs_data(struct btf_dedup *d)
{
	free(d->strs_data);
	hashmap__free(d->strs_hash);
	d->strs_data = d->strs_hash = NULL;
	d->strs_len = d->strs_cap = 0;	
}

?
Andrii Nakryiko Nov. 3, 2020, 4:51 a.m. UTC | #2
On Fri, Oct 30, 2020 at 4:33 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > From: Andrii Nakryiko <andriin@fb.com>
> >
> > Revamp BTF dedup's string deduplication to match the approach of writable BTF
> > string management. This allows to transfer deduplicated strings index back to
> > BTF object after deduplication without expensive extra memory copying and hash
> > map re-construction. It also simplifies the code and speeds it up, because
> > hashmap-based string deduplication is faster than sort + unique approach.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>
> LGTM, with a couple nitpick below:
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> > ---
> > tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
> > 1 file changed, 99 insertions(+), 166 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 89fecfe5cb2b..db9331fea672 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -90,6 +90,14 @@ struct btf {
> >       struct hashmap *strs_hash;
> >       /* whether strings are already deduplicated */
> >       bool strs_deduped;
> > +     /* extra indirection layer to make strings hashmap work with stable
> > +      * string offsets and ability to transparently choose between
> > +      * btf->strs_data or btf_dedup->strs_data as a source of strings.
> > +      * This is used for BTF strings dedup to transfer deduplicated strings
> > +      * data back to struct btf without re-building strings index.
> > +      */
> > +     void **strs_data_ptr;
> > +
> >       /* BTF object FD, if loaded into kernel */
> >       int fd;
> >
> > @@ -1363,17 +1371,19 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
> >
> > static size_t strs_hash_fn(const void *key, void *ctx)
> > {
> > -     struct btf *btf = ctx;
> > -     const char *str = btf->strs_data + (long)key;
> > +     const char ***strs_data_ptr = ctx;
> > +     const char *strs = **strs_data_ptr;
> > +     const char *str = strs + (long)key;
>
> Can we keep using btf as the ctx here? "char ***" hurts my eyes...
>

yep, changed to struct btf *

> [...]
>
> > -     d->btf->hdr->str_len = end - start;
> > +     /* replace BTF string data and hash with deduped ones */
> > +     free(d->btf->strs_data);
> > +     hashmap__free(d->btf->strs_hash);
> > +     d->btf->strs_data = d->strs_data;
> > +     d->btf->strs_data_cap = d->strs_cap;
> > +     d->btf->hdr->str_len = d->strs_len;
> > +     d->btf->strs_hash = d->strs_hash;
> > +     /* now point strs_data_ptr back to btf->strs_data */
> > +     d->btf->strs_data_ptr = &d->btf->strs_data;
> > +
> > +     d->strs_data = d->strs_hash = NULL;
> > +     d->strs_len = d->strs_cap = 0;
> >       d->btf->strs_deduped = true;
> > +     return 0;
> > +
> > +err_out:
> > +     free(d->strs_data);
> > +     hashmap__free(d->strs_hash);
> > +     d->strs_data = d->strs_hash = NULL;
> > +     d->strs_len = d->strs_cap = 0;
> > +
> > +     /* restore strings pointer for existing d->btf->strs_hash back */
> > +     d->btf->strs_data_ptr = &d->strs_data;
>
> We have quite some duplicated code in err_out vs. succeed_out cases.
> How about we add a helper function, like

nope, that won't work: it's free(d->strs_data) vs free(d->btf->strs_data),
and the same for hashmap__free(); plus there are strict requirements about
the exact sequence of assignments in the success case

>
> void free_strs_data(struct btf_dedup *d)
> {
>         free(d->strs_data);
>         hashmap__free(d->strs_hash);
>         d->strs_data = d->strs_hash = NULL;
>         d->strs_len = d->strs_cap = 0;
> }
>
> ?
Alexei Starovoitov Nov. 3, 2020, 4:59 a.m. UTC | #3
On Wed, Oct 28, 2020 at 05:58:54PM -0700, Andrii Nakryiko wrote:
> From: Andrii Nakryiko <andriin@fb.com>
> 
> Revamp BTF dedup's string deduplication to match the approach of writable BTF
> string management. This allows to transfer deduplicated strings index back to
> BTF object after deduplication without expensive extra memory copying and hash
> map re-construction. It also simplifies the code and speeds it up, because
> hashmap-based string deduplication is faster than sort + unique approach.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
>  1 file changed, 99 insertions(+), 166 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 89fecfe5cb2b..db9331fea672 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -90,6 +90,14 @@ struct btf {
>  	struct hashmap *strs_hash;
>  	/* whether strings are already deduplicated */
>  	bool strs_deduped;
> +	/* extra indirection layer to make strings hashmap work with stable
> +	 * string offsets and ability to transparently choose between
> +	 * btf->strs_data or btf_dedup->strs_data as a source of strings.
> +	 * This is used for BTF strings dedup to transfer deduplicated strings
> +	 * data back to struct btf without re-building strings index.
> +	 */
> +	void **strs_data_ptr;

I thought one of the ideas of dedup algo was that strings were deduped first,
so there is no need to rebuild them.
Then split BTF cannot touch base BTF strings and they're immutable.
But the commit log is talking about transfer of strings and
hash map re-construction? Why split BTF would reconstruct anything?
It either finds a string in a base BTF or adds to its own strings section.
Is it all due to the switch to hash? The speedup motivation is clear, but then
it sounds like the speedup is causing all these issues.
The strings could have stayed as-is, just a bit slower?
Andrii Nakryiko Nov. 3, 2020, 6:01 a.m. UTC | #4
On Mon, Nov 2, 2020 at 8:59 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Oct 28, 2020 at 05:58:54PM -0700, Andrii Nakryiko wrote:
> > From: Andrii Nakryiko <andriin@fb.com>
> >
> > Revamp BTF dedup's string deduplication to match the approach of writable BTF
> > string management. This allows to transfer deduplicated strings index back to
> > BTF object after deduplication without expensive extra memory copying and hash
> > map re-construction. It also simplifies the code and speeds it up, because
> > hashmap-based string deduplication is faster than sort + unique approach.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
> >  1 file changed, 99 insertions(+), 166 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 89fecfe5cb2b..db9331fea672 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -90,6 +90,14 @@ struct btf {
> >       struct hashmap *strs_hash;
> >       /* whether strings are already deduplicated */
> >       bool strs_deduped;
> > +     /* extra indirection layer to make strings hashmap work with stable
> > +      * string offsets and ability to transparently choose between
> > +      * btf->strs_data or btf_dedup->strs_data as a source of strings.
> > +      * This is used for BTF strings dedup to transfer deduplicated strings
> > +      * data back to struct btf without re-building strings index.
> > +      */
> > +     void **strs_data_ptr;
>
> I thought one of the ideas of dedup algo was that strings were deduped first,
> so there is no need to rebuild them.

Ugh.. many things to unpack here. Let's try.

Yes, the idea of dedup is to have only unique strings. But we were always
rebuilding strings during dedup; here we are just changing the string
deduplication algorithm from sort + uniq to a hash table. We were deduping
strings unconditionally because we don't know how the BTF strings section was
created in the first place or whether it's already deduplicated, so we always
had to do it.

With BTF write APIs the situation became a bit more nuanced. If we
create BTF programmatically from scratch (btf_new_empty()), then
libbpf guarantees (by construction) that all added strings are
auto-deduped. In such a case btf->strs_deduped will be set to true and
during btf_dedup() we'll skip string deduplication. It's purely a
performance improvement and it benefits the main btf_dedup workflow in
pahole.

But if a ready-built BTF was loaded from somewhere first and then
modified with BTF write APIs, then it's a bit different. For existing
strings, when we transition from read-only BTF to writable BTF, we
build a string lookup hashmap, but we don't deduplicate and remap string
offsets. So if the loaded BTF had string duplicates, it will continue
having string duplicates. The string lookup index will pick an arbitrary
instance of a duplicated string as a unique key, but the strings data will
still have duplicates, and there will be types that still reference a
duplicated string. Until (and if) we do btf_dedup(). At that time
we'll create another unique hash table *and* will remap all string
offsets across all types.

I did it this way intentionally (not remapping strings when doing
read-only -> writable BTF transition) to not accidentally corrupt
.BTF.ext strings. If I were to do full string dedup for r/o ->
writable transition, I'd need to add APIs to "link" struct btf_ext to
struct btf, so that libbpf could remap .BTF.ext strings transparently.
But I didn't want to add those APIs (yet) and didn't want to deal with
mutable struct btf_ext (yet).

So, in short, for strings dedup fundamentally nothing changed at all.

> Then split BTF cannot touch base BTF strings and they're immutable.

This is exactly the case right now. Nothing in base BTF changes, ever.

> But the commit log is talking about transfer of strings and
> hash map re-construction? Why split BTF would reconstruct anything?

This transfer of strings is for split BTF's strings data only. In
general case, we have some unknown strings data in split BTF. When we
do dedup, we need to make sure that split BTF strings are deduplicated
(we don't touch base BTF strings at all). For that we need to
construct a new hashmap. Once we constructed it, we have new strings
data with deduplicated strings, so to avoid creating another big copy
for struct btf, we just "transfer" that data to struct btf from struct
btf_dedup. void **strs_data_ptr just allows reusing the same (already
constructed) hashmap, the same underlying blob of deduplicated string
data, and the same hashing and equality functions.
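The double indirection can be illustrated on its own (str_at() is a hypothetical helper mirroring what strs_hash_fn() does in the patch, not actual libbpf code):

```c
#include <string.h>

/* The hash/equality callbacks receive ctx = &strs_data (a void **), so
 * the buffer that offsets resolve against can be swapped with a single
 * pointer assignment -- no hashmap rebuild, no entry rewriting. */
static const char *str_at(void *ctx, long off)
{
	void **strs_data_ptr = ctx; /* extra indirection layer */

	return (const char *)*strs_data_ptr + off;
}
```

After a successful dedup the patch just repoints d->btf->strs_data_ptr from &d->strs_data back to &d->btf->strs_data, and every existing hash table entry keeps resolving correctly.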

> It either finds a string in a base BTF or adds to its own strings section.
> Is it all due to switch to hash? The speedup motivation is clear, but then
> it sounds like that the speedup is causing all these issues.
> The strings could have stayed as-is. Just a bit slower ?

Previously we were able to rewrite strings in-place and strings data
was never reallocated (because BTF was read-only always). So it was
all a bit simpler. By using double-indirection we don't have to build
a third hashmap once we are done with strings dedup, we just replace
struct btf's own string lookup hashmap and string data memory.
The alternative is another expensive memory allocation and a potentially
pretty big hashmap copy.

Apart from double indirection, the algorithm is much simpler now. If I
were writing original BTF dedup in C++, I'd use a hashmap approach
back then. But we didn't have hashmap in libbpf yet, so sort + uniq
was chosen.
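For contrast, the sort + uniq scheme being replaced boils down to something like this (a simplified sketch, not the removed libbpf code):

```c
#include <stdlib.h>
#include <string.h>

/* Sort + unique dedup: sort string pointers by content, then collapse
 * adjacent equal entries in place. Returns the unique count. */
static int cmp_str(const void *a, const void *b)
{
	return strcmp(*(const char * const *)a, *(const char * const *)b);
}

static size_t sort_uniq(const char **ptrs, size_t cnt)
{
	size_t i, j = 0;

	if (!cnt)
		return 0;
	qsort(ptrs, cnt, sizeof(*ptrs), cmp_str);
	for (i = 1; i < cnt; i++) {
		if (strcmp(ptrs[i], ptrs[j]) != 0)
			ptrs[++j] = ptrs[i];
	}
	return j + 1;
}
```

The hashmap approach replaces this O(n log n) sort plus the bsearch-based offset remapping passes with amortized O(1) lookups per string.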

Patch

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 89fecfe5cb2b..db9331fea672 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -90,6 +90,14 @@  struct btf {
 	struct hashmap *strs_hash;
 	/* whether strings are already deduplicated */
 	bool strs_deduped;
+	/* extra indirection layer to make strings hashmap work with stable
+	 * string offsets and ability to transparently choose between
+	 * btf->strs_data or btf_dedup->strs_data as a source of strings.
+	 * This is used for BTF strings dedup to transfer deduplicated strings
+	 * data back to struct btf without re-building strings index.
+	 */
+	void **strs_data_ptr;
+
 	/* BTF object FD, if loaded into kernel */
 	int fd;
 
@@ -1363,17 +1371,19 @@  int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
 
 static size_t strs_hash_fn(const void *key, void *ctx)
 {
-	struct btf *btf = ctx;
-	const char *str = btf->strs_data + (long)key;
+	const char ***strs_data_ptr = ctx;
+	const char *strs = **strs_data_ptr;
+	const char *str = strs + (long)key;
 
 	return str_hash(str);
 }
 
 static bool strs_hash_equal_fn(const void *key1, const void *key2, void *ctx)
 {
-	struct btf *btf = ctx;
-	const char *str1 = btf->strs_data + (long)key1;
-	const char *str2 = btf->strs_data + (long)key2;
+	const char ***strs_data_ptr = ctx;
+	const char *strs = **strs_data_ptr;
+	const char *str1 = strs + (long)key1;
+	const char *str2 = strs + (long)key2;
 
 	return strcmp(str1, str2) == 0;
 }
@@ -1418,8 +1428,11 @@  static int btf_ensure_modifiable(struct btf *btf)
 	memcpy(types, btf->types_data, btf->hdr->type_len);
 	memcpy(strs, btf->strs_data, btf->hdr->str_len);
 
+	/* make hashmap below use btf->strs_data as a source of strings */
+	btf->strs_data_ptr = &btf->strs_data;
+
 	/* build lookup index for all strings */
-	hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, btf);
+	hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, &btf->strs_data_ptr);
 	if (IS_ERR(hash)) {
 		err = PTR_ERR(hash);
 		hash = NULL;
@@ -2824,19 +2837,11 @@  struct btf_dedup {
 	size_t hypot_cap;
 	/* Various option modifying behavior of algorithm */
 	struct btf_dedup_opts opts;
-};
-
-struct btf_str_ptr {
-	const char *str;
-	__u32 new_off;
-	bool used;
-};
-
-struct btf_str_ptrs {
-	struct btf_str_ptr *ptrs;
-	const char *data;
-	__u32 cnt;
-	__u32 cap;
+	/* temporary strings deduplication state */
+	void *strs_data;
+	size_t strs_cap;
+	size_t strs_len;
+	struct hashmap* strs_hash;
 };
 
 static long hash_combine(long h, long value)
@@ -3063,64 +3068,41 @@  static int btf_for_each_str_off(struct btf_dedup *d, str_off_fn_t fn, void *ctx)
 	return 0;
 }
 
-static int str_sort_by_content(const void *a1, const void *a2)
-{
-	const struct btf_str_ptr *p1 = a1;
-	const struct btf_str_ptr *p2 = a2;
-
-	return strcmp(p1->str, p2->str);
-}
-
-static int str_sort_by_offset(const void *a1, const void *a2)
-{
-	const struct btf_str_ptr *p1 = a1;
-	const struct btf_str_ptr *p2 = a2;
-
-	if (p1->str != p2->str)
-		return p1->str < p2->str ? -1 : 1;
-	return 0;
-}
-
-static int btf_dedup_str_ptr_cmp(const void *str_ptr, const void *pelem)
-{
-	const struct btf_str_ptr *p = pelem;
-
-	if (str_ptr != p->str)
-		return (const char *)str_ptr < p->str ? -1 : 1;
-	return 0;
-}
-
-static int btf_str_mark_as_used(__u32 *str_off_ptr, void *ctx)
+static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx)
 {
-	struct btf_str_ptrs *strs;
-	struct btf_str_ptr *s;
+	struct btf_dedup *d = ctx;
+	long old_off, new_off, len;
+	const char *s;
+	void *p;
+	int err;
 
 	if (*str_off_ptr == 0)
 		return 0;
 
-	strs = ctx;
-	s = bsearch(strs->data + *str_off_ptr, strs->ptrs, strs->cnt,
-		    sizeof(struct btf_str_ptr), btf_dedup_str_ptr_cmp);
-	if (!s)
-		return -EINVAL;
-	s->used = true;
-	return 0;
-}
+	s = btf__str_by_offset(d->btf, *str_off_ptr);
+	len = strlen(s) + 1;
 
-static int btf_str_remap_offset(__u32 *str_off_ptr, void *ctx)
-{
-	struct btf_str_ptrs *strs;
-	struct btf_str_ptr *s;
+	new_off = d->strs_len;
+	p = btf_add_mem(&d->strs_data, &d->strs_cap, 1, new_off, BTF_MAX_STR_OFFSET, len);
+	if (!p)
+		return -ENOMEM;
 
-	if (*str_off_ptr == 0)
-		return 0;
+	memcpy(p, s, len);
 
-	strs = ctx;
-	s = bsearch(strs->data + *str_off_ptr, strs->ptrs, strs->cnt,
-		    sizeof(struct btf_str_ptr), btf_dedup_str_ptr_cmp);
-	if (!s)
-		return -EINVAL;
-	*str_off_ptr = s->new_off;
+	/* Now attempt to add the string, but only if the string with the same
+	 * contents doesn't exist already (HASHMAP_ADD strategy). If such
+	 * string exists, we'll get its offset in old_off (that's old_key).
+	 */
+	err = hashmap__insert(d->strs_hash, (void *)new_off, (void *)new_off,
+			      HASHMAP_ADD, (const void **)&old_off, NULL);
+	if (err == -EEXIST) {
+		*str_off_ptr = old_off;
+	} else if (err) {
+		return err;
+	} else {
+		*str_off_ptr = new_off;
+		d->strs_len += len;
+	}
 	return 0;
 }
 
@@ -3137,118 +3119,69 @@  static int btf_str_remap_offset(__u32 *str_off_ptr, void *ctx)
  */
 static int btf_dedup_strings(struct btf_dedup *d)
 {
-	char *start = d->btf->strs_data;
-	char *end = start + d->btf->hdr->str_len;
-	char *p = start, *tmp_strs = NULL;
-	struct btf_str_ptrs strs = {
-		.cnt = 0,
-		.cap = 0,
-		.ptrs = NULL,
-		.data = start,
-	};
-	int i, j, err = 0, grp_idx;
-	bool grp_used;
+	char *s;
+	int err;
 
 	if (d->btf->strs_deduped)
 		return 0;
 
-	/* build index of all strings */
-	while (p < end) {
-		if (strs.cnt + 1 > strs.cap) {
-			struct btf_str_ptr *new_ptrs;
-
-			strs.cap += max(strs.cnt / 2, 16U);
-			new_ptrs = libbpf_reallocarray(strs.ptrs, strs.cap, sizeof(strs.ptrs[0]));
-			if (!new_ptrs) {
-				err = -ENOMEM;
-				goto done;
-			}
-			strs.ptrs = new_ptrs;
-		}
-
-		strs.ptrs[strs.cnt].str = p;
-		strs.ptrs[strs.cnt].used = false;
-
-		p += strlen(p) + 1;
-		strs.cnt++;
-	}
-
-	/* temporary storage for deduplicated strings */
-	tmp_strs = malloc(d->btf->hdr->str_len);
-	if (!tmp_strs) {
-		err = -ENOMEM;
-		goto done;
-	}
-
-	/* mark all used strings */
-	strs.ptrs[0].used = true;
-	err = btf_for_each_str_off(d, btf_str_mark_as_used, &strs);
-	if (err)
-		goto done;
-
-	/* sort strings by context, so that we can identify duplicates */
-	qsort(strs.ptrs, strs.cnt, sizeof(strs.ptrs[0]), str_sort_by_content);
+	s = btf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1);
+	if (!s)
+		return -ENOMEM;
+	/* initial empty string */
+	s[0] = 0;
+	d->strs_len = 1;
 
-	/*
-	 * iterate groups of equal strings and if any instance in a group was
-	 * referenced, emit single instance and remember new offset
+	/* temporarily switch to use btf_dedup's strs_data for strings for hash
+	 * functions; later we'll just transfer hashmap to struct btf as is,
+	 * along the strs_data
 	 */
-	p = tmp_strs;
-	grp_idx = 0;
-	grp_used = strs.ptrs[0].used;
-	/* iterate past end to avoid code duplication after loop */
-	for (i = 1; i <= strs.cnt; i++) {
-		/*
-		 * when i == strs.cnt, we want to skip string comparison and go
-		 * straight to handling last group of strings (otherwise we'd
-		 * need to handle last group after the loop w/ duplicated code)
-		 */
-		if (i < strs.cnt &&
-		    !strcmp(strs.ptrs[i].str, strs.ptrs[grp_idx].str)) {
-			grp_used = grp_used || strs.ptrs[i].used;
-			continue;
-		}
+	d->btf->strs_data_ptr = &d->strs_data;
 
-		/*
-		 * this check would have been required after the loop to handle
-		 * last group of strings, but due to <= condition in a loop
-		 * we avoid that duplication
-		 */
-		if (grp_used) {
-			int new_off = p - tmp_strs;
-			__u32 len = strlen(strs.ptrs[grp_idx].str);
-
-			memmove(p, strs.ptrs[grp_idx].str, len + 1);
-			for (j = grp_idx; j < i; j++)
-				strs.ptrs[j].new_off = new_off;
-			p += len + 1;
-		}
-
-		if (i < strs.cnt) {
-			grp_idx = i;
-			grp_used = strs.ptrs[i].used;
-		}
+	d->strs_hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, &d->btf->strs_data_ptr);
+	if (IS_ERR(d->strs_hash)) {
+		err = PTR_ERR(d->strs_hash);
+		d->strs_hash = NULL;
+		goto err_out;
 	}
 
-	/* replace original strings with deduped ones */
-	d->btf->hdr->str_len = p - tmp_strs;
-	memmove(start, tmp_strs, d->btf->hdr->str_len);
-	end = start + d->btf->hdr->str_len;
-
-	/* restore original order for further binary search lookups */
-	qsort(strs.ptrs, strs.cnt, sizeof(strs.ptrs[0]), str_sort_by_offset);
+	/* insert empty string; we won't be looking it up during strings
+	 * dedup, but it's good to have it for generic BTF string lookups
+	 */
+	err = hashmap__insert(d->strs_hash, (void *)0, (void *)0,
+			      HASHMAP_ADD, NULL, NULL);
+	if (err)
+		goto err_out;
 
 	/* remap string offsets */
-	err = btf_for_each_str_off(d, btf_str_remap_offset, &strs);
+	err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);
 	if (err)
-		goto done;
+		goto err_out;
 
-	d->btf->hdr->str_len = end - start;
+	/* replace BTF string data and hash with deduped ones */
+	free(d->btf->strs_data);
+	hashmap__free(d->btf->strs_hash);
+	d->btf->strs_data = d->strs_data;
+	d->btf->strs_data_cap = d->strs_cap;
+	d->btf->hdr->str_len = d->strs_len;
+	d->btf->strs_hash = d->strs_hash;
+	/* now point strs_data_ptr back to btf->strs_data */
+	d->btf->strs_data_ptr = &d->btf->strs_data;
+
+	d->strs_data = d->strs_hash = NULL;
+	d->strs_len = d->strs_cap = 0;
 	d->btf->strs_deduped = true;
+	return 0;
+
+err_out:
+	free(d->strs_data);
+	hashmap__free(d->strs_hash);
+	d->strs_data = d->strs_hash = NULL;
+	d->strs_len = d->strs_cap = 0;
+
+	/* restore strings pointer for existing d->btf->strs_hash back */
+	d->btf->strs_data_ptr = &d->strs_data;
 
-done:
-	free(tmp_strs);
-	free(strs.ptrs);
 	return err;
 }