um: allow using glibc string functions instead of generics

Message ID 20201110163034.22963-1-anton.ivanov@cambridgegreys.com
State Superseded

Commit Message

Anton Ivanov Nov. 10, 2020, 4:30 p.m. UTC
From: Anton Ivanov <anton.ivanov@cambridgegreys.com>

The UML kernel runs as a normal userspace process and can use the
optimized glibc string functions such as strcpy, memcpy, etc.

The support is optional and is turned on/off using a config
option.

Using glibc functions results in a slightly smaller executable
when linked dynamically, as well as a performance improvement of
between 1% and 5%.

Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
---
 arch/um/Kconfig                    | 11 +++++
 arch/um/include/asm/string.h       | 72 +++++++++++++++++++++++++++
 arch/um/include/shared/os_string.h | 30 ++++++++++++
 arch/um/os-Linux/Makefile          |  4 +-
 arch/um/os-Linux/string.c          | 78 ++++++++++++++++++++++++++++++
 5 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 arch/um/include/asm/string.h
 create mode 100644 arch/um/include/shared/os_string.h
 create mode 100644 arch/um/os-Linux/string.c

Comments

Johannes Berg Nov. 10, 2020, 4:40 p.m. UTC | #1
On Tue, 2020-11-10 at 16:30 +0000, anton.ivanov@cambridgegreys.com
wrote:
> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
> 
> UML kernel runs as a normal userspace process and can use the
> optimized glibc strings functions like strcpy, memcpy, etc.
> 
> The support is optional and is turned on/of using a config
> option.
> 
> Using glibc functions results in a slightly smaller executable
> when linked dynamically as well as anything between 1% and 5%
> performance improvements.

Nice! :-)

> diff --git a/arch/um/Kconfig b/arch/um/Kconfig
> index 4b799fad8b48..961cf3af3ff0 100644
> --- a/arch/um/Kconfig
> +++ b/arch/um/Kconfig
> @@ -189,6 +189,17 @@ config UML_TIME_TRAVEL_SUPPORT
>  
>  	  It is safe to say Y, but you probably don't need this.
>  
> +config UML_USE_HOST_STRINGS
> +	bool
> +	default y
> +	prompt "Use glibc strings and memory functions"
> +	help
> +	  UML runs as a normal userspace process. As a result it can use
> +	  the optimized strcpy, memcpy, etc from glibc instead of the
> +          kernel generic equivalents. This provides some minimal speedup
> +	  in the 1% or so range for most applications. It also results in
> +	  a smaller executable.

Looks like some inconsistent tab/spaces indentation there :)

> +++ b/arch/um/os-Linux/string.c
> @@ -0,0 +1,78 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
> + */
> +
> +#include <stddef.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <os_string.h>
> +
> +inline char *os_strcpy(char *dest, const char *src)

those 'inline' annotations seem strange - they can't possibly do
anything useful here?

johannes
Anton Ivanov Nov. 10, 2020, 4:54 p.m. UTC | #2
On 10/11/2020 16:40, Johannes Berg wrote:
> On Tue, 2020-11-10 at 16:30 +0000, anton.ivanov@cambridgegreys.com
> wrote:
>> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>>
>> UML kernel runs as a normal userspace process and can use the
>> optimized glibc strings functions like strcpy, memcpy, etc.
>>
>> The support is optional and is turned on/of using a config
>> option.
>>
>> Using glibc functions results in a slightly smaller executable
>> when linked dynamically as well as anything between 1% and 5%
>> performance improvements.
> 
> Nice! :-)
> 
>> diff --git a/arch/um/Kconfig b/arch/um/Kconfig
>> index 4b799fad8b48..961cf3af3ff0 100644
>> --- a/arch/um/Kconfig
>> +++ b/arch/um/Kconfig
>> @@ -189,6 +189,17 @@ config UML_TIME_TRAVEL_SUPPORT
>>   
>>   	  It is safe to say Y, but you probably don't need this.
>>   
>> +config UML_USE_HOST_STRINGS
>> +	bool
>> +	default y
>> +	prompt "Use glibc strings and memory functions"
>> +	help
>> +	  UML runs as a normal userspace process. As a result it can use
>> +	  the optimized strcpy, memcpy, etc from glibc instead of the
>> +          kernel generic equivalents. This provides some minimal speedup
>> +	  in the 1% or so range for most applications. It also results in
>> +	  a smaller executable.
> 
> Looks like some inconsistent tab/spaces indentation there :)

Will fix it in v2.

> 
>> +++ b/arch/um/os-Linux/string.c
>> @@ -0,0 +1,78 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
>> + */
>> +
>> +#include <stddef.h>
>> +#include <unistd.h>
>> +#include <string.h>
>> +#include <os_string.h>
>> +
>> +inline char *os_strcpy(char *dest, const char *src)
> 
> those 'inline' annotations seem strange - they can't possibly do
> anything useful here?

Indeed. The compiler inlines them anyway. I can remove that. There is no difference (I tried both).


> 
> johannes
> 
>
Richard Weinberger Nov. 10, 2020, 8:13 p.m. UTC | #3
----- Original Message -----
> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
> To: "linux-um" <linux-um@lists.infradead.org>
> CC: "richard" <richard@nod.at>, "anton ivanov" <anton.ivanov@cambridgegreys.com>
> Sent: Tuesday, November 10, 2020 17:30:34
> Subject: [PATCH] um: allow using glibc string functions instead of generics

> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
> 
> UML kernel runs as a normal userspace process and can use the
> optimized glibc strings functions like strcpy, memcpy, etc.
> 
> The support is optional and is turned on/of using a config
> option.
> 
> Using glibc functions results in a slightly smaller executable
> when linked dynamically as well as anything between 1% and 5%
> performance improvements.

On what workload did you see such a huge performance improvement?
The in-kernel variants of memcpy and such are already well optimized.
So I'm a little surprised.

Thanks,
//richard
Anton Ivanov Nov. 10, 2020, 8:56 p.m. UTC | #4
On 10/11/2020 20:13, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>> To: "linux-um" <linux-um@lists.infradead.org>
>> CC: "richard" <richard@nod.at>, "anton ivanov" <anton.ivanov@cambridgegreys.com>
>> Sent: Tuesday, November 10, 2020 17:30:34
>> Subject: [PATCH] um: allow using glibc string functions instead of generics
> 
>> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>>
>> UML kernel runs as a normal userspace process and can use the
>> optimized glibc strings functions like strcpy, memcpy, etc.
>>
>> The support is optional and is turned on/of using a config
>> option.
>>
>> Using glibc functions results in a slightly smaller executable
>> when linked dynamically as well as anything between 1% and 5%
>> performance improvements.
> 
> On what workload did you see such a huge performance improvement?

File IO ~ 1% or thereabouts, iperf - 2-4%.

> The in-kernel variants of memcpy and such are already well optimized.

UML has no string.h in asm, which means it falls back to
asm-generic/string.h, which in turn pulls in the ones from lib/string.c.

These are not optimized.

Example - memcpy:

void *memcpy(void *dest, const void *src, size_t count)
{
	char *tmp = dest;
	const char *s = src;

	while (count--)
		*tmp++ = *s++;
	return dest;
}


> So I'm a little surprised.

I am actually surprised the gain is so low. I was expecting up to 15%.
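
As a rough illustration of the point above (an editorial sketch, not part of the patch): the byte-at-a-time fallback and the libc routine produce identical results, so any difference between them is one of speed only. The helper names `bytewise_memcpy` and `copies_match` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte-at-a-time copy, same shape as the generic fallback quoted above. */
static void *bytewise_memcpy(void *dest, const void *src, size_t count)
{
	char *tmp = dest;
	const char *s = src;

	while (count--)
		*tmp++ = *s++;
	return dest;
}

/*
 * Copy src into two scratch buffers, once per implementation, and
 * report whether the results match byte for byte (1 = match).
 */
static int copies_match(const void *src, size_t n, void *out_a, void *out_b)
{
	assert(src && out_a && out_b);
	bytewise_memcpy(out_a, src, n);
	memcpy(out_b, src, n);
	return memcmp(out_a, out_b, n) == 0;
}
```

Timing the two copies over large buffers on the host would be the natural next step; no numbers are claimed here.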

> 
> Thanks,
> //richard
>
Richard Weinberger Nov. 10, 2020, 9:29 p.m. UTC | #5
----- Original Message -----
> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>> On what workload did you see such a huge performance improvement?
> 
> File IO ~ 1% or thereabouts, iperf - 2-4%.
> 
>> The in-kernel variants of memcpy and such are already well optimized.
> 
> UML has no string.h in asm which means it falls back to
> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>
> These are not optimized.

Hmmm, I think it should use the highly optimized variants from arch/x86.

Thanks,
//richard
Anton Ivanov Nov. 10, 2020, 9:33 p.m. UTC | #6
On 10/11/2020 21:29, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>> On what workload did you see such a huge performance improvement?
>> File IO ~ 1% or thereabouts, iperf - 2-4%.
>>
>>> The in-kernel variants of memcpy and such are already well optimized.
>> UML has no string.h in asm which means it falls back to
>> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>>
>> These are not optimized.
> Hmmm, I think it should use the highly optimized variants from arch/x86.

That is the other option - to bring in string32.h and string64.h from x86.

>
> Thanks,
> //richard
>
Richard Weinberger Nov. 10, 2020, 9:39 p.m. UTC | #7
----- Original Message -----
> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
> To: "richard" <richard@nod.at>
> CC: "linux-um" <linux-um@lists.infradead.org>
> Sent: Tuesday, November 10, 2020 22:33:48
> Subject: Re: [PATCH] um: allow using glibc string functions instead of generics

> On 10/11/2020 21:29, Richard Weinberger wrote:
>> ----- Original Message -----
>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>> On what workload did you see such a huge performance improvement?
>>> File IO ~ 1% or thereabouts, iperf - 2-4%.
>>>
>>>> The in-kernel variants of memcpy and such are already well optimized.
>>> UML has no string.h in asm which means it falls back to
>>> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>>>
>>> These are not optimized.
>> Hmmm, I think it should use the highly optimized variants from arch/x86.
> 
> That is the other option - to bring in string32.h and string64.h from x86.

Yes, I thought we did so already. I fear we lost this feature after some code
cleanup a long time ago.

I'm happy with either option.

Thanks,
//richard
Anton Ivanov Nov. 11, 2020, 7:13 a.m. UTC | #8
On 10/11/2020 21:39, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>> To: "richard" <richard@nod.at>
>> CC: "linux-um" <linux-um@lists.infradead.org>
>> Sent: Tuesday, November 10, 2020 22:33:48
>> Subject: Re: [PATCH] um: allow using glibc string functions instead of generics
> 
>> On 10/11/2020 21:29, Richard Weinberger wrote:
>>> ----- Original Message -----
>>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>>> On what workload did you see such a huge performance improvement?
>>>> File IO ~ 1% or thereabouts, iperf - 2-4%.
>>>>
>>>>> The in-kernel variants of memcpy and such are already well optimized.
>>>> UML has no string.h in asm which means it falls back to
>>>> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>>>>
>>>> These are not optimized.
>>> Hmmm, I think it should use the highly optimized variants from arch/x86.
>>
>> That is the other option - to bring in string32.h and string64.h from x86.
> 
> Yes, I thought we do so already. I fear we list this feature after some code
> cleanup a long time ago.
> 
> I'm happy with either option.

I will have a look if we lost other optimized code as a result of the
asm cleanup and sort it out in the next version.

The advantage of glibc is that it is guaranteed to choose the correct
flavor for the CPU.

I do not think that this is the case for the kernel ones, because they rely
on boot-time CPU feature detection, which does not happen in the case of
UML. So in order to use them properly, we may have to implement that.

Otherwise, the code in the glibc tree and in the kernel is nearly
identical. Just glibc was easier as it did not require figuring out CPU
detection :)
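
A minimal sketch of what choosing the flavor at run time can look like in a userspace program, assuming GCC or Clang on x86, where the `__builtin_cpu_init()` and `__builtin_cpu_supports()` builtins are available. glibc's own selection mechanism is more elaborate than this; the function names are hypothetical and `copy_tuned` merely stands in for an optimized variant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *dest, const void *src, size_t n);

/* Portable fallback: one byte per iteration. */
static void *copy_bytewise(void *dest, const void *src, size_t n)
{
	char *d = dest;
	const char *s = src;

	while (n--)
		*d++ = *s++;
	return dest;
}

/* Stand-in for a tuned variant; here it simply defers to libc memcpy. */
static void *copy_tuned(void *dest, const void *src, size_t n)
{
	return memcpy(dest, src, n);
}

/* Probe the host CPU and return the implementation to use. */
static copy_fn select_copy(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	__builtin_cpu_init();
	if (__builtin_cpu_supports("sse2"))
		return copy_tuned;
#endif
	return copy_bytewise;
}
```

A caller would run `select_copy()` once at startup and keep the returned pointer, which is roughly the step the kernel performs at boot and UML currently skips.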

> 
> Thanks,
> //richard
> 
> _______________________________________________
> linux-um mailing list
> linux-um@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-um
>
Anton Ivanov Nov. 11, 2020, 8:26 a.m. UTC | #9
On 11/11/2020 07:13, Anton Ivanov wrote:
> On 10/11/2020 21:39, Richard Weinberger wrote:
>> ----- Original Message -----
>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>> To: "richard" <richard@nod.at>
>>> CC: "linux-um" <linux-um@lists.infradead.org>
>>> Sent: Tuesday, November 10, 2020 22:33:48
>>> Subject: Re: [PATCH] um: allow using glibc string functions instead of generics
>>
>>> On 10/11/2020 21:29, Richard Weinberger wrote:
>>>> ----- Original Message -----
>>>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>>>> On what workload did you see such a huge performance improvement?
>>>>> File IO ~ 1% or thereabouts, iperf - 2-4%.
>>>>>
>>>>>> The in-kernel variants of memcpy and such are already well optimized.
>>>>> UML has no string.h in asm which means it falls back to
>>>>> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>>>>>
>>>>> These are not optimized.
>>>> Hmmm, I think it should use the highly optimized variants from arch/x86.
>>>
>>> That is the other option - to bring in string32.h and string64.h from x86.
>>
>> Yes, I thought we do so already. I fear we list this feature after some code
>> cleanup a long time ago.
>>
>> I'm happy with either option.
> 
> I will have a look if we lost other optimized code as a result of the asm cleanup and sort it out in the next version.

Out of the important bits, we have lost the x86 optimized code for:

1. memcpy and other string.h functions
2. cksum
3. xor

While memcpy and friends can be picked up from glibc, the others can't. So we might as well figure out how to pick them up from the x86 tree.
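
For readers unfamiliar with item 2, here is a portable sketch of the Internet checksum (RFC 1071), assuming that is what "cksum" refers to here. It is an editorial illustration, not kernel code; the name `inet_checksum` is made up, and the optimized x86 routines compute the same kind of sum with wider loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Ones' complement sum over 16-bit big-endian words, folded to
 * 16 bits and inverted, as described in RFC 1071.
 */
static uint16_t inet_checksum(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint64_t sum = 0;	/* wide accumulator, so carries cannot be lost */

	assert(data || len == 0);

	while (len > 1) {
		sum += ((uint32_t)p[0] << 8) | p[1];
		p += 2;
		len -= 2;
	}
	if (len)		/* odd trailing byte is padded with zero */
		sum += (uint32_t)p[0] << 8;

	while (sum >> 16)	/* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Running the function over data that already includes its own checksum yields zero, which is how receivers verify a packet.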

> 
> The advantage of glibc is that it is guaranteed to chose the correct flavor for the CPU.
> 
> I do not think that this the case for the kernel ones, because they rely on boottime CPU features detection which does not happen in the case of UML. So in order to use them properly, we may have to implement that.
> 
> Otherwise, the code in the glibc tree and in the kernel is nearly identical. Just glibc was easier as it did not require figuring out CPU detection :)
> 
>>
>> Thanks,
>> //richard
Anton Ivanov Nov. 11, 2020, 9:49 a.m. UTC | #10
On 10/11/2020 21:39, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>> To: "richard" <richard@nod.at>
>> CC: "linux-um" <linux-um@lists.infradead.org>
>> Sent: Tuesday, November 10, 2020 22:33:48
>> Subject: Re: [PATCH] um: allow using glibc string functions instead of generics
>> On 10/11/2020 21:29, Richard Weinberger wrote:
>>> ----- Original Message -----
>>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>>> On what workload did you see such a huge performance improvement?
>>>> File IO ~ 1% or thereabouts, iperf - 2-4%.
>>>>
>>>>> The in-kernel variants of memcpy and such are already well optimized.
>>>> UML has no string.h in asm which means it falls back to
>>>> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>>>>
>>>> These are not optimized.
>>> Hmmm, I think it should use the highly optimized variants from arch/x86.
>> That is the other option - to bring in string32.h and string64.h from x86.
> Yes, I thought we do so already. I fear we list this feature after some code
> cleanup a long time ago.
>
> I'm happy with either option.

I did a quick and ugly hack to bring in xor from the x86 tree (just SSE, not AVX); the difference is 117%.

I had to edit/hack quite a few things though.

I am now going to reset my trees and see how we can do this properly by bringing in the original files "as is", defining things as NOOPs, and doing fake defines for the CPU features. That should also allow us to replace the fake defines with actual host CPU detection later, so we can use AVX and other features not present on all 64-bit platforms.
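
One way to picture the "fake defines" idea is sketched below. This is an editorial illustration under stated assumptions, not the eventual implementation: every name (`um_cpu_has`, `UM_FEATURE_*`, `pick_xor_variant`) is hypothetical, and the choice to report SSE2 as present and AVX as absent is an assumption based on SSE2 being part of the x86-64 baseline.

```c
#include <assert.h>

/* Hypothetical feature IDs for this sketch only. */
enum um_cpu_feature {
	UM_FEATURE_SSE2,
	UM_FEATURE_AVX,
};

/*
 * Fake define: answer feature queries with a compile-time constant.
 * Assumption: SSE2 is always reported present, AVX always absent.
 * Real host CPU detection could later replace this macro without
 * touching any caller.
 */
#define um_cpu_has(feature) ((feature) == UM_FEATURE_SSE2)

/* Pick the best routine the (fake) feature set allows. */
static const char *pick_xor_variant(void)
{
	/* The imported SSE code would be unusable without the baseline. */
	assert(um_cpu_has(UM_FEATURE_SSE2));

	if (um_cpu_has(UM_FEATURE_AVX))
		return "avx";
	return "sse";
}
```

Because callers only ever go through `um_cpu_has()`, swapping the constant for a run-time probe is a change in one place.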

>
> Thanks,
> //richard
Anton Ivanov Nov. 11, 2020, 3:14 p.m. UTC | #11
On 11/11/2020 07:13, Anton Ivanov wrote:
> On 10/11/2020 21:39, Richard Weinberger wrote:
>> ----- Original Message -----
>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>> To: "richard" <richard@nod.at>
>>> CC: "linux-um" <linux-um@lists.infradead.org>
>>> Sent: Tuesday, November 10, 2020 22:33:48
>>> Subject: Re: [PATCH] um: allow using glibc string functions instead of generics
>>
>>> On 10/11/2020 21:29, Richard Weinberger wrote:
>>>> ----- Original Message -----
>>>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>>>> On what workload did you see such a huge performance improvement?
>>>>> File IO ~ 1% or thereabouts, iperf - 2-4%.
>>>>>
>>>>>> The in-kernel variants of memcpy and such are already well optimized.
>>>>> UML has no string.h in asm which means it falls back to
>>>>> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>>>>>
>>>>> These are not optimized.
>>>> Hmmm, I think it should use the highly optimized variants from arch/x86.
>>>
>>> That is the other option - to bring in string32.h and string64.h from x86.
>>
>> Yes, I thought we do so already. I fear we list this feature after some code
>> cleanup a long time ago.
>>
>> I'm happy with either option.
>
> I will have a look if we lost other optimized code as a result of the asm cleanup and sort it out in the next version.
>
> The advantage of glibc is that it is guaranteed to chose the correct flavor for the CPU.
>
> I do not think that this the case for the kernel ones, because they rely on boottime CPU features detection which does not happen in the case of UML. So in order to use them properly, we may have to implement that.
>
> Otherwise, the code in the glibc tree and in the kernel is nearly identical. Just glibc was easier as it did not require figuring out CPU detection :)

I did XOR, the difference on its own benchmarks on my machine is 117%.

That was the relatively easy part.
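
For context, the operation being benchmarked is a block XOR of one buffer into another. A plain C version, one machine word at a time, might look like the sketch below; the name `xor_words` is made up for this illustration, and the x86 variants do the same work with SSE or AVX registers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * XOR src into dst one unsigned long at a time.
 * bytes must be a multiple of sizeof(unsigned long).
 */
static void xor_words(size_t bytes, unsigned long *dst,
		      const unsigned long *src)
{
	size_t words = bytes / sizeof(unsigned long);

	assert(bytes % sizeof(unsigned long) == 0);

	for (size_t i = 0; i < words; i++)
		dst[i] ^= src[i];
}
```

Applying the same source block twice restores the original contents, which is the property parity-based recovery relies on.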

We have issues applying the same approach to checksum.h:

1. The x86 checksum_32.h and checksum_64.h files do not guard csum_and_copy_from_user() and its to_user counterpart with the appropriate ifdefs. In order to use these instead of our own copies, we need to add the missing ifdefs.

2. Checksum needs tidying up. Both the 32-bit and 64-bit versions of UML bring in an old copy of the 32-bit x86 checksum which has been duplicated into arch/x86/um. There is some 64-bit code which, unless I am mistaken, is never pulled in because it is still ifdefed on CONFIG_X86_32 instead of CONFIG_64BIT.

string.h looks more straightforward; I will probably do it next, leaving cksum last.

A.


>
>>
>> Thanks,
>> //richard
Anton Ivanov Nov. 11, 2020, 3:59 p.m. UTC | #12
On 11/11/2020 15:14, Anton Ivanov wrote:
>
> On 11/11/2020 07:13, Anton Ivanov wrote:
>> On 10/11/2020 21:39, Richard Weinberger wrote:
>>> ----- Original Message -----
>>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>> To: "richard" <richard@nod.at>
>>>> CC: "linux-um" <linux-um@lists.infradead.org>
>>>> Sent: Tuesday, November 10, 2020 22:33:48
>>>> Subject: Re: [PATCH] um: allow using glibc string functions instead of generics
>>>
>>>> On 10/11/2020 21:29, Richard Weinberger wrote:
>>>>> ----- Original Message -----
>>>>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>>>>> On what workload did you see such a huge performance improvement?
>>>>>> File IO ~ 1% or thereabouts, iperf - 2-4%.
>>>>>>
>>>>>>> The in-kernel variants of memcpy and such are already well optimized.
>>>>>> UML has no string.h in asm which means it falls back to
>>>>>> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>>>>>>
>>>>>> These are not optimized.
>>>>> Hmmm, I think it should use the highly optimized variants from arch/x86.
>>>>
>>>> That is the other option - to bring in string32.h and string64.h from x86.
>>>
>>> Yes, I thought we do so already. I fear we list this feature after some code
>>> cleanup a long time ago.
>>>
>>> I'm happy with either option.
>>
>> I will have a look if we lost other optimized code as a result of the asm cleanup and sort it out in the next version.
>>
>> The advantage of glibc is that it is guaranteed to chose the correct flavor for the CPU.
>>
>> I do not think that this the case for the kernel ones, because they rely on boottime CPU features detection which does not happen in the case of UML. So in order to use them properly, we may have to implement that.
>>
>> Otherwise, the code in the glibc tree and in the kernel is nearly identical. Just glibc was easier as it did not require figuring out CPU detection :)
>
> I did XOR, the difference on its own benchmarks on my machine is 117%.
>
> That was the relatively easy part.
>
> We have issues applying the same approach to checksum.h
>
> 1. The x86 checksum_32.h and checksum_64.h files do not ifdef with the appropriate ifdefs csum_and_copy_from_user() and to_user. In order to use these instead of our own copies, we need to add the missing ifdefs.
>
> 2. Checksum needs tidyng up. Both 32 bit and 64 bit versions of UML bring in an old copy of the 32 bit x86 checksum which has been duplicated into arch/x86/um. There is some 64 bit code which unless I am mistaken is never pulled in because it is still ifdefed on CONFIG_X86_32 instead of CONFIG_64BIT

Actually, x86/um/Kconfig defines CONFIG_X86_32/CONFIG_X86_64, so the 64-bit code is pulled in. My mistake.

The rest still applies: some of the files are copies of older versions of the x86 ones, and we should be picking up the newer ones wherever possible.

A.

>
> string.h looks more straight forward, I will probably do it next leaving cksum last.
>
> A.
>
>
>>
>>>
>>> Thanks,
>>> //richard
Anton Ivanov Nov. 11, 2020, 6:09 p.m. UTC | #13
On 11/11/2020 15:14, Anton Ivanov wrote:
>
> On 11/11/2020 07:13, Anton Ivanov wrote:
>> On 10/11/2020 21:39, Richard Weinberger wrote:
>>> ----- Original Message -----
>>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>> To: "richard" <richard@nod.at>
>>>> CC: "linux-um" <linux-um@lists.infradead.org>
>>>> Sent: Tuesday, November 10, 2020 22:33:48
>>>> Subject: Re: [PATCH] um: allow using glibc string functions instead of generics
>>>
>>>> On 10/11/2020 21:29, Richard Weinberger wrote:
>>>>> ----- Original Message -----
>>>>>> From: "anton ivanov" <anton.ivanov@cambridgegreys.com>
>>>>>>> On what workload did you see such a huge performance improvement?
>>>>>> File IO ~ 1% or thereabouts, iperf - 2-4%.
>>>>>>
>>>>>>> The in-kernel variants of memcpy and such are already well optimized.
>>>>>> UML has no string.h in asm which means it falls back to
>>>>>> asm-generic/string.h which in turn pulls in the ones from lib/string.c
>>>>>>
>>>>>> These are not optimized.
>>>>> Hmmm, I think it should use the highly optimized variants from arch/x86.
>>>>
>>>> That is the other option - to bring in string32.h and string64.h from x86.
>>>
>>> Yes, I thought we do so already. I fear we list this feature after some code
>>> cleanup a long time ago.
>>>
>>> I'm happy with either option.
>>
>> I will have a look if we lost other optimized code as a result of the asm cleanup and sort it out in the next version.
>>
>> The advantage of glibc is that it is guaranteed to chose the correct flavor for the CPU.
>>
>> I do not think that this the case for the kernel ones, because they rely on boottime CPU features detection which does not happen in the case of UML. So in order to use them properly, we may have to implement that.
>>
>> Otherwise, the code in the glibc tree and in the kernel is nearly identical. Just glibc was easier as it did not require figuring out CPU detection :)
>
> I did XOR, the difference on its own benchmarks on my machine is 117%.
>
> That was the relatively easy part.
>
> We have issues applying the same approach to checksum.h
>
> 1. The x86 checksum_32.h and checksum_64.h files do not ifdef with the appropriate ifdefs csum_and_copy_from_user() and to_user. In order to use these instead of our own copies, we need to add the missing ifdefs.
>
> 2. Checksum needs tidyng up. Both 32 bit and 64 bit versions of UML bring in an old copy of the 32 bit x86 checksum which has been duplicated into arch/x86/um. There is some 64 bit code which unless I am mistaken is never pulled in because it is still ifdefed on CONFIG_X86_32 instead of CONFIG_64BIT
>
> string.h looks more straight forward, I will probably do it next leaving cksum last.

I managed to untangle checksum so it 100% works using x86 "upstream" files and I have a working version, but I will need some testing for 32/64 bit cases to make sure it is OK.

I will probably push it before the end of the week.

A.

>
> A.
>
>
>>
>>>
>>> Thanks,
>>> //richard
Patch

diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 4b799fad8b48..961cf3af3ff0 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -189,6 +189,17 @@  config UML_TIME_TRAVEL_SUPPORT
 
 	  It is safe to say Y, but you probably don't need this.
 
+config UML_USE_HOST_STRINGS
+	bool
+	default y
+	prompt "Use glibc strings and memory functions"
+	help
+	  UML runs as a normal userspace process. As a result it can use
+	  the optimized strcpy, memcpy, etc from glibc instead of the
+	  kernel generic equivalents. This provides some minimal speedup
+	  in the 1% or so range for most applications. It also results in
+	  a smaller executable.
+
 endmenu
 
 source "arch/um/drivers/Kconfig"
diff --git a/arch/um/include/asm/string.h b/arch/um/include/asm/string.h
new file mode 100644
index 000000000000..1fba8d59afe5
--- /dev/null
+++ b/arch/um/include/asm/string.h
@@ -0,0 +1,72 @@ 
+#ifndef __ASM_UM_STRING_H
+#define __ASM_UM_STRING_H
+
+#ifdef CONFIG_UML_USE_HOST_STRINGS
+
+/* UML saves and restores registers when going to/from
+ * userspace. This allows the use of normal userspace
+ * functions for strings with all relevant glibc processor
+ * optimizations
+ */
+
+#include <os_string.h>
+
+#define __HAVE_ARCH_STRCPY
+
+#define strcpy(dest, src) os_strcpy(dest, src)
+
+#define __HAVE_ARCH_STRNCPY
+
+#define strncpy(dest, src, count) os_strncpy(dest, src, count)
+
+#define __HAVE_ARCH_STRCAT
+
+#define strcat(dest, src) os_strcat(dest, src)
+
+#define __HAVE_ARCH_STRNCAT
+
+#define strncat(dest, src, count) os_strncat(dest, src, count)
+
+#define __HAVE_ARCH_STRCMP
+
+#define strcmp(cs, ct) os_strcmp(cs, ct)
+
+#define __HAVE_ARCH_STRNCMP
+
+#define strncmp(cs, ct, count) os_strncmp(cs, ct, count)
+
+#define __HAVE_ARCH_STRCHR
+
+#define strchr(s, c) os_strchr(s, c)
+
+#define __HAVE_ARCH_STRLEN
+
+#define strlen(s) os_strlen(s)
+
+#define __HAVE_ARCH_MEMCPY
+
+#define memcpy(dst, src, n) os_memcpy(dst, src, n)
+
+#define __HAVE_ARCH_MEMMOVE
+
+#define memmove(dest, src, n) os_memmove(dest, src, n)
+
+#define __HAVE_ARCH_MEMCHR
+
+#define memchr(cs, c, count) os_memchr(cs, c, count)
+
+#define __HAVE_ARCH_STRNLEN
+
+#define strnlen(s, count) os_strnlen(s, count)
+
+#define __HAVE_ARCH_STRSTR
+
+#define strstr(cs, ct) os_strstr(cs, ct)
+
+#define __HAVE_ARCH_MEMSET
+
+#define memset(dst, c, n) os_memset(dst, c, n)
+
+#endif
+
+#endif /* __ASM_UM_STRING_H */
diff --git a/arch/um/include/shared/os_string.h b/arch/um/include/shared/os_string.h
new file mode 100644
index 000000000000..b9662f622e77
--- /dev/null
+++ b/arch/um/include/shared/os_string.h
@@ -0,0 +1,30 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2015 Anton Ivanov (aivanov@{brocade.com,kot-begemot.co.uk})
+ * Copyright (C) 2015 Thomas Meyer (thomas@m3y3r.de)
+ * Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
+ */
+
+#ifndef __OS_STRING_H__
+#define __OS_STRING_H__
+
+#include <stddef.h>
+
+/* string.c */
+
+extern char *os_strcpy(char *dest, const char *src);
+extern char *os_strncpy(char *dest, const char *src, size_t count);
+extern char *os_strcat(char *dest, const char *src);
+extern char *os_strncat(char *dest, const char *src, size_t count);
+extern int os_strcmp(const char *cs, const char *ct);
+extern int os_strncmp(const char *cs, const char *ct, size_t count);
+extern char *os_strchr(const char *s, int c);
+extern size_t os_strlen(const char *s);
+extern void *os_memcpy(void *dest, const void *src, size_t n);
+extern void *os_memmove(void *dest, const void *src, size_t n);
+extern void *os_memchr(const void *cs, int c, size_t count);
+extern size_t os_strnlen(const char *s, size_t count);
+extern char *os_strstr(const char *cs, const char *ct);
+extern void *os_memset(void *s, int c, size_t n);
+
+#endif
diff --git a/arch/um/os-Linux/Makefile b/arch/um/os-Linux/Makefile
index 839915b8c31c..f117f2514191 100644
--- a/arch/um/os-Linux/Makefile
+++ b/arch/um/os-Linux/Makefile
@@ -8,12 +8,12 @@  KCOV_INSTRUMENT                := n
 
 obj-y = execvp.o file.o helper.o irq.o main.o mem.o process.o \
 	registers.o sigio.o signal.o start_up.o time.o tty.o \
-	umid.o user_syms.o util.o drivers/ skas/
+	umid.o user_syms.o util.o string.o drivers/ skas/
 
 obj-$(CONFIG_ARCH_REUSE_HOST_VSYSCALL_AREA) += elf_aux.o
 
 USER_OBJS := $(user-objs-y) elf_aux.o execvp.o file.o helper.o irq.o \
 	main.o mem.o process.o registers.o sigio.o signal.o start_up.o time.o \
-	tty.o umid.o util.o
+	tty.o umid.o util.o string.o
 
 include arch/um/scripts/Makefile.rules
diff --git a/arch/um/os-Linux/string.c b/arch/um/os-Linux/string.c
new file mode 100644
index 000000000000..abc76e2aecb3
--- /dev/null
+++ b/arch/um/os-Linux/string.c
@@ -0,0 +1,78 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
+ */
+
+#include <stddef.h>
+#include <unistd.h>
+#include <string.h>
+#include <os_string.h>
+
+inline char *os_strcpy(char *dest, const char *src)
+{
+	return strcpy(dest, src);
+}
+
+inline char *os_strncpy(char *dest, const char *src, size_t count)
+{
+	return strncpy(dest, src, count);
+}
+
+inline char *os_strcat(char *dest, const char *src)
+{
+	return strcat(dest, src);
+}
+inline char *os_strncat(char *dest, const char *src, size_t count)
+{
+	return strncat(dest, src, count);
+}
+
+inline int os_strcmp(const char *cs, const char *ct)
+{
+	return strcmp(cs, ct);
+}
+
+inline int os_strncmp(const char *cs, const char *ct, size_t count)
+{
+	return strncmp(cs, ct, count);
+}
+
+inline char *os_strchr(const char *s, int c)
+{
+	return strchr(s, c);
+}
+inline size_t os_strlen(const char *s)
+{
+	return strlen(s);
+}
+
+inline void *os_memcpy(void *dest, const void *src, size_t n)
+{
+	return memcpy(dest, src, n);
+}
+
+inline void *os_memmove(void *dest, const void *src, size_t n)
+{
+	return memmove(dest, src, n);
+}
+
+inline void *os_memchr(const void *cs, int c, size_t count)
+{
+	return memchr(cs, c, count);
+}
+
+inline size_t os_strnlen(const char *s, size_t count)
+{
+	return strnlen(s, count);
+}
+
+inline char *os_strstr(const char *cs, const char *ct)
+{
+	return strstr(cs, ct);
+}
+
+inline void *os_memset(void *s, int c, size_t n)
+{
+	return memset(s, c, n);
+}
+
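
The header in this patch maps each kernel string function onto an `os_` wrapper through a function-like macro. The single-file sketch below illustrates that pattern in ordinary userspace C; `wrap_strcpy`, `k_strcpy`, and `describe` are hypothetical names chosen so as not to redefine any standard identifier. It also shows why such a macro body is safest without a trailing semicolon: the call site supplies its own, so the macro keeps working in `if`/`else` bodies and in expressions.

```c
#include <assert.h>
#include <string.h>

/* Thin wrapper around the libc routine, in the spirit of os_strcpy(). */
static char *wrap_strcpy(char *dest, const char *src)
{
	assert(dest && src);
	return strcpy(dest, src);
}

/*
 * Caller-facing name mapped onto the wrapper. No trailing semicolon
 * in the macro body: with one, the if/else below would not compile,
 * because the extra empty statement would detach the else.
 */
#define k_strcpy(dest, src) wrap_strcpy(dest, src)

/* Copy "yes" or "no" into out (at least 4 bytes) depending on flag. */
static const char *describe(int flag, char *out)
{
	if (flag)
		k_strcpy(out, "yes");
	else
		k_strcpy(out, "no");
	return out;
}
```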