diff mbox series

[RFC] Come up with TARGET_HAS_FAST_MEMPCPY_ROUTINE (PR middle-end/90263).

Message ID e8923b5d-2a1f-c1b4-9b7f-a970ae88f8a2@suse.cz
State New
Headers show
Series [RFC] Come up with TARGET_HAS_FAST_MEMPCPY_ROUTINE (PR middle-end/90263). | expand

Commit Message

Martin Liška May 10, 2019, 9:04 a.m. UTC
Hi.

This is updated version of the patch I've sent here:
https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00149.html

The patch is about introduction of a new target macro that will
drive how we expand mempcpy. Having a target with TARGET_HAS_FAST_MEMPCPY_ROUTINE == 1,
we do not expand using memcpy, but mempcy.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

@Wilco: Can you please come up with a test-case for aarch64?

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-05-09  Martin Liska  <mliska@suse.cz>

	PR middle-end/90263
	* builtins.c (expand_builtin_memory_copy_args): When having a
	target with fast mempcpy implementation do now use memcpy.
	* config/i386/i386.h (TARGET_HAS_FAST_MEMPCPY_ROUTINE): New.
	* defaults.h (TARGET_HAS_FAST_MEMPCPY_ROUTINE): By default
	target does not have fast mempcpy routine.
	* doc/tm.texi: Document the new hook.
	* doc/tm.texi.in: Likewise.
	* expr.h (emit_block_move_hints): Add 2 new arguments.
	* expr.c (emit_block_move_hints): Bail out when libcall
	to memcpy would be used.

gcc/testsuite/ChangeLog:

2019-05-09  Martin Liska  <mliska@suse.cz>

	* gcc.c-torture/execute/builtins/mempcpy.c: Use mempcpy
	without LHS.
---
 gcc/builtins.c                                 | 18 ++++++++++++++++--
 gcc/config/i386/i386.h                         |  3 +++
 gcc/defaults.h                                 |  7 +++++++
 gcc/doc/tm.texi                                |  5 +++++
 gcc/doc/tm.texi.in                             |  5 +++++
 gcc/expr.c                                     | 13 ++++++++++++-
 gcc/expr.h                                     |  4 +++-
 .../gcc.c-torture/execute/builtins/mempcpy.c   |  5 ++---
 8 files changed, 53 insertions(+), 7 deletions(-)

Comments

Jakub Jelinek May 10, 2019, 9:21 a.m. UTC | #1
On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>  
>  #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
>  
> +/* C library provides fast implementation of mempcpy function.  */
> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
> +

1) we shouldn't be adding further target macros, but target hooks
2) I don't think this is a property of the x86 target, but of x86 glibc,
   so you should set it on x86 glibc only (i.e. i?86/x86_64 linux and hurd
   when using glibc, not newlib, nor bionic/android, nor uclibc, nor musl)

> --- a/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
> @@ -56,9 +56,8 @@ main_test (void)
>    if (__builtin_mempcpy (p, "ABCDE", 6) != p + 6 || memcmp (p, "ABCDE", 6))
>      abort ();
>  
> -  /* If the result of mempcpy is ignored, gcc should use memcpy.
> -     This should be optimized always, so set inside_main again.  */
> -  inside_main = 1;
> +  /* Set inside main in order to not abort because of usafe of mempcpy.  */
> +  inside_main = 0;
>    mempcpy (p + 5, s3, 1);
>    if (memcmp (p, "ABCDEFg", 8))
>      abort ();

Why this?  Do you mean mempcpy is called here, even when the lhs is unused?
We should be calling memcpy in that case.

	Jakub
Wilco Dijkstra May 10, 2019, 2:21 p.m. UTC | #2
Hi Martin,

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

However I guess some existing tests checking for mempcpy may fail on
other targets, so might need to be changed.

> @Wilco: Can you please come up with a test-case for aarch64?

A simplified version gcc/testsuite/gcc.dg/fold-bcopy.c should be good enough
if existing tests don't provide enough cover. Literally just compiling the example
I gave and checking it calls memcpy using a target exclude will do, eg.

int *f (int *p, int *q, long n)
{
  return __builtin_mempcpy (p, q, n);
}

/* { dg-final { scan-assembler "mempcpy" { target { i?86-*-* x86_64-*-* } } } } */
/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-* x86_64-*-* } } } } } */

Wilco
Martin Liška May 13, 2019, 10:14 a.m. UTC | #3
On 5/10/19 11:21 AM, Jakub Jelinek wrote:
> On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
>> --- a/gcc/config/i386/i386.h
>> +++ b/gcc/config/i386/i386.h
>> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>>  
>>  #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
>>  
>> +/* C library provides fast implementation of mempcpy function.  */
>> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
>> +
> 
> 1) we shouldn't be adding further target macros, but target hooks

Done.

> 2) I don't think this is a property of the x86 target, but of x86 glibc,
>    so you should set it on x86 glibc only (i.e. i?86/x86_64 linux and hurd
>    when using glibc, not newlib, nor bionic/android, nor uclibc, nor musl)

I've implemented the in i386.c with DEFAULT_LIBC == LIBC_GLIBC. Hope it's correct?

> 
>> --- a/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
>> +++ b/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
>> @@ -56,9 +56,8 @@ main_test (void)
>>    if (__builtin_mempcpy (p, "ABCDE", 6) != p + 6 || memcmp (p, "ABCDE", 6))
>>      abort ();
>>  
>> -  /* If the result of mempcpy is ignored, gcc should use memcpy.
>> -     This should be optimized always, so set inside_main again.  */
>> -  inside_main = 1;
>> +  /* Set inside main in order to not abort because of usafe of mempcpy.  */
>> +  inside_main = 0;
>>    mempcpy (p + 5, s3, 1);
>>    if (memcmp (p, "ABCDEFg", 8))
>>      abort ();
> 
> Why this?  Do you mean mempcpy is called here, even when the lhs is unused?
> We should be calling memcpy in that case.

Yes, that's a stupid mistake. I'm attaching updated version of the patch
that works fine in x86_64-linux-gnu.

Martin

> 
> 	Jakub
>
Wilco Dijkstra May 13, 2019, 1:02 p.m. UTC | #4
Hi Martin,

+++ b/gcc/testsuite/gcc.c-torture/pr90263.c
@@ -0,0 +1,9 @@
+/* PR middle-end/90263 */
+
+int *f (int *p, int *q, long n)
+{
+  return __builtin_mempcpy (p, q, n);
+}
+
+/* { dg-final { scan-assembler "mempcpy" { target { i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-* x86_64-*-* } } } } } */


This should be gcc/testsuite/gcc.c-torture/compile/pr90263.c, otherwise it is
ignored. The scan-assembler would also need to check for GLIBC unless you
always skip it on x86 (we don't seem to have a check_effective_target_glibc,
maybe it's worth adding?).

Wilco
Jakub Jelinek May 13, 2019, 1:07 p.m. UTC | #5
On Mon, May 13, 2019 at 12:14:37PM +0200, Martin Liška wrote:
> On 5/10/19 11:21 AM, Jakub Jelinek wrote:
> > On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
> >> --- a/gcc/config/i386/i386.h
> >> +++ b/gcc/config/i386/i386.h
> >> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
> >>  
> >>  #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
> >>  
> >> +/* C library provides fast implementation of mempcpy function.  */
> >> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
> >> +
> > 
> > 1) we shouldn't be adding further target macros, but target hooks
> 
> Done.
> 
> > 2) I don't think this is a property of the x86 target, but of x86 glibc,
> >    so you should set it on x86 glibc only (i.e. i?86/x86_64 linux and hurd
> >    when using glibc, not newlib, nor bionic/android, nor uclibc, nor musl)
> 
> I've implemented the in i386.c with DEFAULT_LIBC == LIBC_GLIBC. Hope it's correct?

No, that would be correct only in the rare SINGLE_LIBC configurations.
Either you can do
#ifdef OPTION_GLIBC
  return OPTION_GLIBC;
#else
  return false;
#endif
or define something in config/linux.h (or .[ch]) that you can then use in
i386.c.

	Jakub
Martin Liška May 14, 2019, 2:55 p.m. UTC | #6
On 5/13/19 3:07 PM, Jakub Jelinek wrote:
> On Mon, May 13, 2019 at 12:14:37PM +0200, Martin Liška wrote:
>> On 5/10/19 11:21 AM, Jakub Jelinek wrote:
>>> On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
>>>> --- a/gcc/config/i386/i386.h
>>>> +++ b/gcc/config/i386/i386.h
>>>> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>>>>  
>>>>  #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
>>>>  
>>>> +/* C library provides fast implementation of mempcpy function.  */
>>>> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
>>>> +
>>>
>>> 1) we shouldn't be adding further target macros, but target hooks
>>
>> Done.
>>
>>> 2) I don't think this is a property of the x86 target, but of x86 glibc,
>>>    so you should set it on x86 glibc only (i.e. i?86/x86_64 linux and hurd
>>>    when using glibc, not newlib, nor bionic/android, nor uclibc, nor musl)
>>
>> I've implemented the in i386.c with DEFAULT_LIBC == LIBC_GLIBC. Hope it's correct?
> 
> No, that would be correct only in the rare SINGLE_LIBC configurations.
> Either you can do
> #ifdef OPTION_GLIBC
>   return OPTION_GLIBC;
> #else
>   return false;
> #endif
> or define something in config/linux.h (or .[ch]) that you can then use in
> i386.c.
> 
> 	Jakub
> 

Hi.

You always have nice ideas. I'm sending updated patch which addresses both Jakub's
and Wilco's comments.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
Martin Sebor May 14, 2019, 3:07 p.m. UTC | #7
On 5/14/19 8:55 AM, Martin Liška wrote:
> On 5/13/19 3:07 PM, Jakub Jelinek wrote:
>> On Mon, May 13, 2019 at 12:14:37PM +0200, Martin Liška wrote:
>>> On 5/10/19 11:21 AM, Jakub Jelinek wrote:
>>>> On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
>>>>> --- a/gcc/config/i386/i386.h
>>>>> +++ b/gcc/config/i386/i386.h
>>>>> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>>>>>   
>>>>>   #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
>>>>>   
>>>>> +/* C library provides fast implementation of mempcpy function.  */
>>>>> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
>>>>> +
>>>>
>>>> 1) we shouldn't be adding further target macros, but target hooks
>>>
>>> Done.
>>>
>>>> 2) I don't think this is a property of the x86 target, but of x86 glibc,
>>>>     so you should set it on x86 glibc only (i.e. i?86/x86_64 linux and hurd
>>>>     when using glibc, not newlib, nor bionic/android, nor uclibc, nor musl)
>>>
>>> I've implemented the in i386.c with DEFAULT_LIBC == LIBC_GLIBC. Hope it's correct?
>>
>> No, that would be correct only in the rare SINGLE_LIBC configurations.
>> Either you can do
>> #ifdef OPTION_GLIBC
>>    return OPTION_GLIBC;
>> #else
>>    return false;
>> #endif
>> or define something in config/linux.h (or .[ch]) that you can then use in
>> i386.c.
>>
>> 	Jakub
>>
> 
> Hi.
> 
> You always have nice ideas. I'm sending updated patch which addresses both Jakub's
> and Wilco's comments.
> 

index 66cee075018..7bff5cbd313 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5797,6 +5797,12 @@ DEFHOOK
   const char *, (void),
   hook_constcharptr_void_null)

+DEFHOOK
+(has_fast_mempcpy_routine,
+ "Return true if a target has a fast mempcpy routine.",
+ bool, (void),
+ hook_bool_void_false)
+

Not to be too nit-picky about the name but target.def refers to
functions rather than routines.  It also defines a hook called
libc_has_function with the obvious semantics so if there's
a chance that it could be useful to query whether another libc
function is "fast" I would suggest to consider defining the hook
correspondingly, i.e.,

   bool libc_has_fast_function (enum function_class)

and naming the macro similarly.

Martin
Martin Liška May 15, 2019, 11:35 a.m. UTC | #8
On 5/14/19 5:07 PM, Martin Sebor wrote:
> On 5/14/19 8:55 AM, Martin Liška wrote:
>> On 5/13/19 3:07 PM, Jakub Jelinek wrote:
>>> On Mon, May 13, 2019 at 12:14:37PM +0200, Martin Liška wrote:
>>>> On 5/10/19 11:21 AM, Jakub Jelinek wrote:
>>>>> On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
>>>>>> --- a/gcc/config/i386/i386.h
>>>>>> +++ b/gcc/config/i386/i386.h
>>>>>> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>>>>>>     #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
>>>>>>   +/* C library provides fast implementation of mempcpy function.  */
>>>>>> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
>>>>>> +
>>>>>
>>>>> 1) we shouldn't be adding further target macros, but target hooks
>>>>
>>>> Done.
>>>>
>>>>> 2) I don't think this is a property of the x86 target, but of x86 glibc,
>>>>>     so you should set it on x86 glibc only (i.e. i?86/x86_64 linux and hurd
>>>>>     when using glibc, not newlib, nor bionic/android, nor uclibc, nor musl)
>>>>
>>>> I've implemented the in i386.c with DEFAULT_LIBC == LIBC_GLIBC. Hope it's correct?
>>>
>>> No, that would be correct only in the rare SINGLE_LIBC configurations.
>>> Either you can do
>>> #ifdef OPTION_GLIBC
>>>    return OPTION_GLIBC;
>>> #else
>>>    return false;
>>> #endif
>>> or define something in config/linux.h (or .[ch]) that you can then use in
>>> i386.c.
>>>
>>>     Jakub
>>>
>>
>> Hi.
>>
>> You always have nice ideas. I'm sending updated patch which addresses both Jakub's
>> and Wilco's comments.
>>
> 
> index 66cee075018..7bff5cbd313 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -5797,6 +5797,12 @@ DEFHOOK
>   const char *, (void),
>   hook_constcharptr_void_null)
> 
> +DEFHOOK
> +(has_fast_mempcpy_routine,
> + "Return true if a target has a fast mempcpy routine.",
> + bool, (void),
> + hook_bool_void_false)
> +
> 
> Not to be too nit-picky about the name but target.def refers to
> functions rather than routines.  It also defines a hook called
> libc_has_function with the obvious semantics so if there's
> a chance that it could be useful to query whether another libc
> function is "fast" I would suggest to consider defining the hook
> correspondingly, i.e.,
> 
>   bool libc_has_fast_function (enum function_class)
> 
> and naming the macro similarly.
> 
> Martin

Hi Martin.

That's a very good suggestion and I'm implementing that!

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin
Jeff Law May 17, 2019, 3:38 p.m. UTC | #9
On 5/15/19 5:35 AM, Martin Liška wrote:
> On 5/14/19 5:07 PM, Martin Sebor wrote:
>> On 5/14/19 8:55 AM, Martin Liška wrote:
>>> On 5/13/19 3:07 PM, Jakub Jelinek wrote:
>>>> On Mon, May 13, 2019 at 12:14:37PM +0200, Martin Liška wrote:
>>>>> On 5/10/19 11:21 AM, Jakub Jelinek wrote:
>>>>>> On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
>>>>>>> --- a/gcc/config/i386/i386.h
>>>>>>> +++ b/gcc/config/i386/i386.h
>>>>>>> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>>>>>>>     #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
>>>>>>>   +/* C library provides fast implementation of mempcpy function.  */
>>>>>>> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
>>>>>>> +
>>>>>> 1) we shouldn't be adding further target macros, but target hooks
>>>>> Done.
>>>>>
>>>>>> 2) I don't think this is a property of the x86 target, but of x86 glibc,
>>>>>>     so you should set it on x86 glibc only (i.e. i?86/x86_64 linux and hurd
>>>>>>     when using glibc, not newlib, nor bionic/android, nor uclibc, nor musl)
>>>>> I've implemented the in i386.c with DEFAULT_LIBC == LIBC_GLIBC. Hope it's correct?
>>>> No, that would be correct only in the rare SINGLE_LIBC configurations.
>>>> Either you can do
>>>> #ifdef OPTION_GLIBC
>>>>    return OPTION_GLIBC;
>>>> #else
>>>>    return false;
>>>> #endif
>>>> or define something in config/linux.h (or .[ch]) that you can then use in
>>>> i386.c.
>>>>
>>>>     Jakub
>>>>
>>> Hi.
>>>
>>> You always have nice ideas. I'm sending updated patch which addresses both Jakub's
>>> and Wilco's comments.
>>>
>> index 66cee075018..7bff5cbd313 100644
>> --- a/gcc/target.def
>> +++ b/gcc/target.def
>> @@ -5797,6 +5797,12 @@ DEFHOOK
>>   const char *, (void),
>>   hook_constcharptr_void_null)
>>
>> +DEFHOOK
>> +(has_fast_mempcpy_routine,
>> + "Return true if a target has a fast mempcpy routine.",
>> + bool, (void),
>> + hook_bool_void_false)
>> +
>>
>> Not to be too nit-picky about the name but target.def refers to
>> functions rather than routines.  It also defines a hook called
>> libc_has_function with the obvious semantics so if there's
>> a chance that it could be useful to query whether another libc
>> function is "fast" I would suggest to consider defining the hook
>> correspondingly, i.e.,
>>
>>   bool libc_has_fast_function (enum function_class)
>>
>> and naming the macro similarly.
>>
>> Martin
> Hi Martin.
> 
> That's a very good suggestion and I'm implementing that!
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Thanks,
> Martin
> 
> 
> 0001-Come-up-with-hook-libc_has_fast_function-PR-middle-e.patch
> 
> From f865ba74008acc651e9ef2fbfc9f2d4bb9ce8fb8 Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Mon, 29 Apr 2019 13:46:25 +0200
> Subject: [PATCH] Come up with hook libc_has_fast_function (PR
>  middle-end/90263).
> 
> gcc/ChangeLog:
> 
> 2019-05-15  Martin Liska  <mliska@suse.cz>
> 
> 	PR middle-end/90263
> 	* builtins.c (expand_builtin_memory_copy_args): When having a
> 	target with fast mempcpy implementation do now use memcpy.
> 	* config/i386/i386.c (ix86_libc_has_fast_function): New.
> 	(TARGET_LIBC_HAS_FAST_FUNCTION): Likewise.
> 	* doc/tm.texi: Likewise.
> 	* doc/tm.texi.in: Likewise.
> 	* target.def:
> 	* expr.c (emit_block_move_hints): Add 2 new arguments.
> 	* expr.h (emit_block_move_hints): Bail out when libcall
> 	to memcpy would be used.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-05-15  Martin Liska  <mliska@suse.cz>
> 
> 	PR middle-end/90263
> 	* gcc.c-torture/compile/pr90263.c: New test.
> 	* lib/target-supports.exp: Add check_effective_target_glibc.
> ---
>  gcc/builtins.c                                | 17 +++++++++++++++--
>  gcc/config/i386/i386.c                        | 15 +++++++++++++++
>  gcc/doc/tm.texi                               |  5 +++++
>  gcc/doc/tm.texi.in                            |  2 ++
>  gcc/expr.c                                    | 13 ++++++++++++-
>  gcc/expr.h                                    |  4 +++-
>  gcc/target.def                                |  7 +++++++
>  gcc/testsuite/gcc.c-torture/compile/pr90263.c | 10 ++++++++++
>  gcc/testsuite/lib/target-supports.exp         | 11 +++++++++++
>  9 files changed, 80 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr90263.c
>
OK
jeff
Rainer Orth May 20, 2019, 8:43 a.m. UTC | #10
Hi Martin,

> On 5/14/19 5:07 PM, Martin Sebor wrote:
>> On 5/14/19 8:55 AM, Martin Liška wrote:
>>> On 5/13/19 3:07 PM, Jakub Jelinek wrote:
>>>> On Mon, May 13, 2019 at 12:14:37PM +0200, Martin Liška wrote:
>>>>> On 5/10/19 11:21 AM, Jakub Jelinek wrote:
>>>>>> On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
>>>>>>> --- a/gcc/config/i386/i386.h
>>>>>>> +++ b/gcc/config/i386/i386.h
>>>>>>> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>>>>>>>     #define CLEAR_RATIO(speed) ((speed) ? MIN (6,
>>>>>>> ix86_cost->move_ratio) : 2)
>>>>>>>   +/* C library provides fast implementation of mempcpy function.  */
>>>>>>> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
>>>>>>> +
>>>>>>
>>>>>> 1) we shouldn't be adding further target macros, but target hooks
>>>>>
>>>>> Done.
>>>>>
>>>>>> 2) I don't think this is a property of the x86 target, but of x86 glibc,
>>>>>>     so you should set it on x86 glibc only (i.e. i?86/x86_64 linux
>>>>>> and hurd
>>>>>>     when using glibc, not newlib, nor bionic/android, nor uclibc, nor
>>>>>> musl)
>>>>>
>>>>> I've implemented the in i386.c with DEFAULT_LIBC == LIBC_GLIBC. Hope
>>>>> it's correct?
>>>>
>>>> No, that would be correct only in the rare SINGLE_LIBC configurations.
>>>> Either you can do
>>>> #ifdef OPTION_GLIBC
>>>>    return OPTION_GLIBC;
>>>> #else
>>>>    return false;
>>>> #endif
>>>> or define something in config/linux.h (or .[ch]) that you can then use in
>>>> i386.c.
>>>>
>>>>     Jakub
>>>>
>>>
>>> Hi.
>>>
>>> You always have nice ideas. I'm sending updated patch which addresses
>>> both Jakub's
>>> and Wilco's comments.
>>>
>> 
>> index 66cee075018..7bff5cbd313 100644
>> --- a/gcc/target.def
>> +++ b/gcc/target.def
>> @@ -5797,6 +5797,12 @@ DEFHOOK
>>   const char *, (void),
>>   hook_constcharptr_void_null)
>> 
>> +DEFHOOK
>> +(has_fast_mempcpy_routine,
>> + "Return true if a target has a fast mempcpy routine.",
>> + bool, (void),
>> + hook_bool_void_false)
>> +
>> 
>> Not to be too nit-picky about the name but target.def refers to
>> functions rather than routines.  It also defines a hook called
>> libc_has_function with the obvious semantics so if there's
>> a chance that it could be useful to query whether another libc
>> function is "fast" I would suggest to consider defining the hook
>> correspondingly, i.e.,
>> 
>>   bool libc_has_fast_function (enum function_class)
>> 
>> and naming the macro similarly.
>> 
>> Martin
>
> Hi Martin.
>
> That's a very good suggestion and I'm implementing that!
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

as usual, the new effective-target keyworks needs documenting in
sourcebuild.texi.  Besides, it doesn't seem to be a very useful property
of a target, just a collection for all sorts of properties where we are
too lazy to introduce a new one that really describes what this is
about...

	Rainer
Martin Liška May 20, 2019, 10:43 a.m. UTC | #11
On 5/20/19 10:43 AM, Rainer Orth wrote:
> Hi Martin,
> 
>> On 5/14/19 5:07 PM, Martin Sebor wrote:
>>> On 5/14/19 8:55 AM, Martin Liška wrote:
>>>> On 5/13/19 3:07 PM, Jakub Jelinek wrote:
>>>>> On Mon, May 13, 2019 at 12:14:37PM +0200, Martin Liška wrote:
>>>>>> On 5/10/19 11:21 AM, Jakub Jelinek wrote:
>>>>>>> On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
>>>>>>>> --- a/gcc/config/i386/i386.h
>>>>>>>> +++ b/gcc/config/i386/i386.h
>>>>>>>> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>>>>>>>>     #define CLEAR_RATIO(speed) ((speed) ? MIN (6,
>>>>>>>> ix86_cost->move_ratio) : 2)
>>>>>>>>   +/* C library provides fast implementation of mempcpy function.  */
>>>>>>>> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
>>>>>>>> +
>>>>>>>
>>>>>>> 1) we shouldn't be adding further target macros, but target hooks
>>>>>>
>>>>>> Done.
>>>>>>
>>>>>>> 2) I don't think this is a property of the x86 target, but of x86 glibc,
>>>>>>>     so you should set it on x86 glibc only (i.e. i?86/x86_64 linux
>>>>>>> and hurd
>>>>>>>     when using glibc, not newlib, nor bionic/android, nor uclibc, nor
>>>>>>> musl)
>>>>>>
>>>>>> I've implemented the in i386.c with DEFAULT_LIBC == LIBC_GLIBC. Hope
>>>>>> it's correct?
>>>>>
>>>>> No, that would be correct only in the rare SINGLE_LIBC configurations.
>>>>> Either you can do
>>>>> #ifdef OPTION_GLIBC
>>>>>    return OPTION_GLIBC;
>>>>> #else
>>>>>    return false;
>>>>> #endif
>>>>> or define something in config/linux.h (or .[ch]) that you can then use in
>>>>> i386.c.
>>>>>
>>>>>     Jakub
>>>>>
>>>>
>>>> Hi.
>>>>
>>>> You always have nice ideas. I'm sending updated patch which addresses
>>>> both Jakub's
>>>> and Wilco's comments.
>>>>
>>>
>>> index 66cee075018..7bff5cbd313 100644
>>> --- a/gcc/target.def
>>> +++ b/gcc/target.def
>>> @@ -5797,6 +5797,12 @@ DEFHOOK
>>>   const char *, (void),
>>>   hook_constcharptr_void_null)
>>>
>>> +DEFHOOK
>>> +(has_fast_mempcpy_routine,
>>> + "Return true if a target has a fast mempcpy routine.",
>>> + bool, (void),
>>> + hook_bool_void_false)
>>> +
>>>
>>> Not to be too nit-picky about the name but target.def refers to
>>> functions rather than routines.  It also defines a hook called
>>> libc_has_function with the obvious semantics so if there's
>>> a chance that it could be useful to query whether another libc
>>> function is "fast" I would suggest to consider defining the hook
>>> correspondingly, i.e.,
>>>
>>>   bool libc_has_fast_function (enum function_class)
>>>
>>> and naming the macro similarly.
>>>
>>> Martin
>>
>> Hi Martin.
>>
>> That's a very good suggestion and I'm implementing that!
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> as usual, the new effective-target keyworks needs documenting in
> sourcebuild.texi.

Hi.

It's not a target keywords, but it's a new hook. I've briefly read the file
and I can't find any other hooks mentioned in the file.

Martin

>  Besides, it doesn't seem to be a very useful property
> of a target, just a collection for all sorts of properties where we are
> too lazy to introduce a new one that really describes what this is
> about...
> 
> 	Rainer
>
Rainer Orth May 20, 2019, 11:08 a.m. UTC | #12
Hi Martin,

>> as usual, the new effective-target keyworks needs documenting in
>> sourcebuild.texi.
>
> It's not a target keywords, but it's a new hook. I've briefly read the file
> and I can't find any other hooks mentioned in the file.

I'm not talking about the hook, but about the glibc effective-target
keyword implemented in target-supports.exp
(check_effective_target_glibc).  All questions about its utility aside,
they are supposed to be documented in sourcebuild.texi (and I'm working
on a patch to both add those that have been missed and check that no new
missing ones are introduced at build time).

	Rainer
Jakub Jelinek May 20, 2019, 9:28 p.m. UTC | #13
On Fri, May 17, 2019 at 09:38:31AM -0600, Jeff Law wrote:
> > 2019-05-15  Martin Liska  <mliska@suse.cz>
> > 
> > 	PR middle-end/90263
> > 	* gcc.c-torture/compile/pr90263.c: New test.
> > 	* lib/target-supports.exp: Add check_effective_target_glibc.

The test is:
+UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O0   scan-assembler mempcpy
+UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O1   scan-assembler mempcpy
+UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O2   scan-assembler mempcpy
+UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O2 -flto   scan-assembler mempcpy
+UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O2 -flto -flto-partition=none   scan-assembler mempcpy
+UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O3 -g   scan-assembler mempcpy
+UNRESOLVED: gcc.c-torture/compile/pr90263.c   -Os   scan-assembler mempcpy
compile.exp defaults to dg-do assemble, so it doesn't emit assembly, but
object file and so you can't scan-assembler it (unless -save-temps).
Why have you put it into gcc.c-torture/compile/ rather than gcc.dg/
or gcc.dg/torture/ and made dg-do compile there?

	Jakub
Martin Liška May 21, 2019, 6:53 a.m. UTC | #14
On 5/20/19 11:28 PM, Jakub Jelinek wrote:
> On Fri, May 17, 2019 at 09:38:31AM -0600, Jeff Law wrote:
>>> 2019-05-15  Martin Liska  <mliska@suse.cz>
>>>
>>> 	PR middle-end/90263
>>> 	* gcc.c-torture/compile/pr90263.c: New test.
>>> 	* lib/target-supports.exp: Add check_effective_target_glibc.
> 
> The test is:
> +UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O0   scan-assembler mempcpy
> +UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O1   scan-assembler mempcpy
> +UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O2   scan-assembler mempcpy
> +UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O2 -flto   scan-assembler mempcpy
> +UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O2 -flto -flto-partition=none   scan-assembler mempcpy
> +UNRESOLVED: gcc.c-torture/compile/pr90263.c   -O3 -g   scan-assembler mempcpy
> +UNRESOLVED: gcc.c-torture/compile/pr90263.c   -Os   scan-assembler mempcpy
> compile.exp defaults to dg-do assemble, so it doesn't emit assembly, but
> object file and so you can't scan-assembler it (unless -save-temps).
> Why have you put it into gcc.c-torture/compile/ rather than gcc.dg/
> or gcc.dg/torture/ and made dg-do compile there?

You are right. I chose a strangle location for the test-case.

I'm going to install attached patch.
Martin

> 
> 	Jakub
>
Jakub Jelinek May 21, 2019, 7:08 a.m. UTC | #15
On Tue, May 21, 2019 at 08:53:59AM +0200, Martin Liška wrote:
> >From 3e01ffe133f9e62156599555732b9d14d9293025 Mon Sep 17 00:00:00 2001
> From: Martin Liska <mliska@suse.cz>
> Date: Tue, 21 May 2019 08:51:06 +0200
> Subject: [PATCH] Move a test-case (PR testsuite/90551).
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-05-21  Martin Liska  <mliska@suse.cz>
> 
> 	PR testsuite/90551
> 	* pr90263.c: Move from gcc.c-torture/compile
> 	into gcc.dg.

Note in gcc.dg/ the default options will be -O0, is that what you want to
test (or say add -O2 in dg-options, or cycle through torture options by
having the test in gcc.dg/torture/)?

>  gcc/testsuite/{gcc.c-torture/compile => gcc.dg}/pr90263.c | 1 +
>  1 file changed, 1 insertion(+)
>  rename gcc/testsuite/{gcc.c-torture/compile => gcc.dg}/pr90263.c (92%)
> 
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr90263.c b/gcc/testsuite/gcc.dg/pr90263.c
> similarity index 92%
> rename from gcc/testsuite/gcc.c-torture/compile/pr90263.c
> rename to gcc/testsuite/gcc.dg/pr90263.c
> index df3ab0fc1cd..acf3db16640 100644
> --- a/gcc/testsuite/gcc.c-torture/compile/pr90263.c
> +++ b/gcc/testsuite/gcc.dg/pr90263.c
> @@ -1,4 +1,5 @@
>  /* PR middle-end/90263 */
> +/* { dg-do compile } */
>  /* { dg-require-effective-target glibc } */
>  
>  int *f (int *p, int *q, long n)
> -- 
> 2.21.0
> 


	Jakub
Martin Liška May 21, 2019, 7:32 a.m. UTC | #16
On 5/21/19 9:08 AM, Jakub Jelinek wrote:
> On Tue, May 21, 2019 at 08:53:59AM +0200, Martin Liška wrote:
>> >From 3e01ffe133f9e62156599555732b9d14d9293025 Mon Sep 17 00:00:00 2001
>> From: Martin Liska <mliska@suse.cz>
>> Date: Tue, 21 May 2019 08:51:06 +0200
>> Subject: [PATCH] Move a test-case (PR testsuite/90551).
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-05-21  Martin Liska  <mliska@suse.cz>
>>
>> 	PR testsuite/90551
>> 	* pr90263.c: Move from gcc.c-torture/compile
>> 	into gcc.dg.
> 
> Note in gcc.dg/ the default options will be -O0, is that what you want to
> test (or say add -O2 in dg-options, or cycle through torture options by
> having the test in gcc.dg/torture/)?

I prefer to use -O2 which is done in patch I'm attaching.

Thanks,
Martin

> 
>>  gcc/testsuite/{gcc.c-torture/compile => gcc.dg}/pr90263.c | 1 +
>>  1 file changed, 1 insertion(+)
>>  rename gcc/testsuite/{gcc.c-torture/compile => gcc.dg}/pr90263.c (92%)
>>
>> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr90263.c b/gcc/testsuite/gcc.dg/pr90263.c
>> similarity index 92%
>> rename from gcc/testsuite/gcc.c-torture/compile/pr90263.c
>> rename to gcc/testsuite/gcc.dg/pr90263.c
>> index df3ab0fc1cd..acf3db16640 100644
>> --- a/gcc/testsuite/gcc.c-torture/compile/pr90263.c
>> +++ b/gcc/testsuite/gcc.dg/pr90263.c
>> @@ -1,4 +1,5 @@
>>  /* PR middle-end/90263 */
>> +/* { dg-do compile } */
>>  /* { dg-require-effective-target glibc } */
>>  
>>  int *f (int *p, int *q, long n)
>> -- 
>> 2.21.0
>>
> 
> 
> 	Jakub
>
diff mbox series

Patch

diff --git a/gcc/builtins.c b/gcc/builtins.c
index d37d73fc4a0..09d5b540ae8 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3839,6 +3839,8 @@  expand_builtin_memory_copy_args (tree dest, tree src, tree len,
   unsigned HOST_WIDE_INT max_size;
   unsigned HOST_WIDE_INT probable_max_size;
 
+  bool is_move_done;
+
   /* If DEST is not a pointer type, call the normal function.  */
   if (dest_align == 0)
     return NULL_RTX;
@@ -3888,11 +3890,23 @@  expand_builtin_memory_copy_args (tree dest, tree src, tree len,
   if (CALL_EXPR_TAILCALL (exp)
       && (retmode == RETURN_BEGIN || target == const0_rtx))
     method = BLOCK_OP_TAILCALL;
-  if (retmode == RETURN_END && target != const0_rtx)
+  if (TARGET_HAS_FAST_MEMPCPY_ROUTINE
+      && retmode == RETURN_END
+      && target != const0_rtx)
     method = BLOCK_OP_NO_LIBCALL_RET;
   dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx, method,
 				     expected_align, expected_size,
-				     min_size, max_size, probable_max_size);
+				     min_size, max_size, probable_max_size,
+				     TARGET_HAS_FAST_MEMPCPY_ROUTINE
+				     && retmode == RETURN_END,
+				     &is_move_done);
+
+  /* Bail out when a mempcpy call would be expanded as libcall and when
+     we have a target that provides a fast implementation
+     of mempcpy routine.  */
+  if (!is_move_done)
+    return NULL_RTX;
+
   if (dest_addr == pc_rtx)
     return NULL_RTX;
 
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 3fee779296f..7d20178f432 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1906,6 +1906,9 @@  typedef struct ix86_args {
 
 #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
 
+/* C library provides fast implementation of mempcpy function.  */
+#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
+
 /* Define if shifts truncate the shift count which implies one can
    omit a sign-extension or zero-extension of a shift count.
 
diff --git a/gcc/defaults.h b/gcc/defaults.h
index b7534256119..eca19d1977f 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1348,6 +1348,13 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define SET_RATIO(speed) MOVE_RATIO (speed)
 #endif
 
+/* By default do not generate libcall to mempcpy and rather use
+   libcall to memcpy and adjustment of return value.  */
+
+#ifndef TARGET_HAS_FAST_MEMPCPY_ROUTINE
+#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 0
+#endif
+
 /* Supply a default definition of STACK_SAVEAREA_MODE for emit_stack_save.
    Normally move_insn, so Pmode stack pointer.  */
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8c8978bb13a..cf709dfb843 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6735,6 +6735,11 @@  optimized for speed rather than size.
 If you don't define this, it defaults to the value of @code{MOVE_RATIO}.
 @end defmac
 
+@defmac TARGET_HAS_FAST_MEMPCPY_ROUTINE
+By default do not generate libcall to mempcpy and rather use
+libcall to memcpy and adjustment of return value.
+@end defmac
+
 @defmac USE_LOAD_POST_INCREMENT (@var{mode})
 A C expression used to determine whether a load postincrement is a good
 thing to use for a given mode.  Defaults to the value of
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index fe1194ef91a..d05c52a36f9 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4619,6 +4619,11 @@  optimized for speed rather than size.
 If you don't define this, it defaults to the value of @code{MOVE_RATIO}.
 @end defmac
 
+@defmac TARGET_HAS_FAST_MEMPCPY_ROUTINE
+By default do not generate libcall to mempcpy and rather use
+libcall to memcpy and adjustment of return value.
+@end defmac
+
 @defmac USE_LOAD_POST_INCREMENT (@var{mode})
 A C expression used to determine whether a load postincrement is a good
 thing to use for a given mode.  Defaults to the value of
diff --git a/gcc/expr.c b/gcc/expr.c
index fa15b7eceae..c78bc74c0d9 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1561,12 +1561,16 @@  emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method,
 		       unsigned int expected_align, HOST_WIDE_INT expected_size,
 		       unsigned HOST_WIDE_INT min_size,
 		       unsigned HOST_WIDE_INT max_size,
-		       unsigned HOST_WIDE_INT probable_max_size)
+		       unsigned HOST_WIDE_INT probable_max_size,
+		       bool bail_out_libcall, bool *is_move_done)
 {
   int may_use_call;
   rtx retval = 0;
   unsigned int align;
 
+  if (is_move_done)
+    *is_move_done = true;
+
   gcc_assert (size);
   if (CONST_INT_P (size) && INTVAL (size) == 0)
     return 0;
@@ -1628,6 +1632,13 @@  emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method,
 	   && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (x))
 	   && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (y)))
     {
+      if (bail_out_libcall)
+	{
+	  if (is_move_done)
+	    *is_move_done = false;
+	  return retval;
+	}
+
       if (may_use_call < 0)
 	return pc_rtx;
 
diff --git a/gcc/expr.h b/gcc/expr.h
index 17c3962436a..6eb70bf12f1 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -114,7 +114,9 @@  extern rtx emit_block_move_hints (rtx, rtx, rtx, enum block_op_methods,
 			          unsigned int, HOST_WIDE_INT,
 				  unsigned HOST_WIDE_INT,
 				  unsigned HOST_WIDE_INT,
-				  unsigned HOST_WIDE_INT);
+				  unsigned HOST_WIDE_INT,
+				  bool bail_out_libcall = false,
+				  bool *is_move_done = NULL);
 extern rtx emit_block_cmp_hints (rtx, rtx, rtx, tree, rtx, bool,
 				 by_pieces_constfn, void *);
 extern bool emit_storent_insn (rtx to, rtx from);
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c b/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
index d82e2232d7b..0b84d229cef 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
@@ -56,9 +56,8 @@  main_test (void)
   if (__builtin_mempcpy (p, "ABCDE", 6) != p + 6 || memcmp (p, "ABCDE", 6))
     abort ();
 
-  /* If the result of mempcpy is ignored, gcc should use memcpy.
-     This should be optimized always, so set inside_main again.  */
-  inside_main = 1;
+  /* Set inside main in order to not abort because of usafe of mempcpy.  */
+  inside_main = 0;
   mempcpy (p + 5, s3, 1);
   if (memcmp (p, "ABCDEFg", 8))
     abort ();