Fix string/tst-xbzero-opt if build with gcc head.

Message ID a1bd8b21-c935-125f-0b9c-381f0ed19b5a@linux.ibm.com
State New
Headers show
Series
  • Fix string/tst-xbzero-opt if build with gcc head.
Related show

Commit Message

Stefan Liebler July 12, 2018, 2:22 p.m.
Fix string/tst-xbzero-opt is build with gcc head.

On s390x, the test string/tst-xbzero-opt is failing if build with GCC head:
FAIL: no clear/prepare: expected 32 got 0
FAIL: no clear/test: expected some got 0
FAIL: ordinary clear/prepare: expected 32 got 0
INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
PASS: explicit clear/prepare: expected 32 got 32
PASS: explicit clear/test: expected 0 got 0

In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy 
loop in prepare_test_buffer. Thus count_test_patterns does not find any 
of the test_pattern.

This patch introduces a compiler barrier just after filling the buffer.

If okay, shall this be committed before or after the release?

Bye
Stefan

---

ChangeLog:

	* string/tst-xbzero-opt.c (prepare_test_buffer):
	Add compiler barrier.

Comments

Zack Weinberg July 12, 2018, 4:42 p.m. | #1
On Thu, Jul 12, 2018 at 10:22 AM, Stefan Liebler <stli@linux.ibm.com> wrote:
> Fix string/tst-xbzero-opt is build with gcc head.
...
> In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in
> prepare_test_buffer. Thus count_test_patterns does not find any of the
> test_pattern.
>
> This patch introduces a compiler barrier just after filling the buffer.

I think I understand why the call to swapcontext in
prepare_test_buffer is not a sufficient compiler barrier, but I am not
a fan of asm volatile ("" ::: "memory"), because I fully expect some
future compiler to decide that there are no actual assembly
instructions being inserted so the statement can be completely
ignored.  I would prefer us to find some kind of construct that
actually does make externally-visible side effects depend on the
contents of 'buf' in terms of the C abstract machine.

zw
Stefan Liebler July 16, 2018, 11:05 a.m. | #2
On 07/12/2018 06:42 PM, Zack Weinberg wrote:
> On Thu, Jul 12, 2018 at 10:22 AM, Stefan Liebler <stli@linux.ibm.com> wrote:
>> Fix string/tst-xbzero-opt is build with gcc head.
> ...
>> In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in
>> prepare_test_buffer. Thus count_test_patterns does not find any of the
>> test_pattern.
>>
>> This patch introduces a compiler barrier just after filling the buffer.
> 
> I think I understand why the call to swapcontext in
> prepare_test_buffer is not a sufficient compiler barrier, but I am not
> a fan of asm volatile ("" ::: "memory"), because I fully expect some
> future compiler to decide that there are no actual assembly
> instructions being inserted so the statement can be completely
> ignored.  I would prefer us to find some kind of construct that
> actually does make externally-visible side effects depend on the
> contents of 'buf' in terms of the C abstract machine.
> 
> zw
> 

Okay. Then here is a new version of the patch without the empty asm.
If build with GCC head on s390x, setup_no_clear is now filling the 
buffer and setup_ordinary_clear is just a tail call to setup_no_clear 
(same behaviour as before).

Is this C construct okay?

Bye.
Stefan
commit a42ce495317df04f1abdc3670c1fcbb68f329c3d
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Fri Jul 13 15:58:48 2018 +0200

    Fix string/tst-xbzero-opt if build with gcc head.
    
    On s390x, the test string/tst-xbzero-opt is failing if build with gcc head:
    FAIL: no clear/prepare: expected 32 got 0
    FAIL: no clear/test: expected some got 0
    FAIL: ordinary clear/prepare: expected 32 got 0
    INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
    PASS: explicit clear/prepare: expected 32 got 32
    PASS: explicit clear/test: expected 0 got 0
    
    In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop
    in prepare_test_buffer. Thus count_test_patterns does not find any of the
    test_pattern.
    
    This patch calls use_test_buffer in order to force the compiler to really copy
    the pattern to buf.
    
    ChangeLog:
    
            * string/tst-xbzero-opt.c (use_test_buffer): New function.
             (prepare_test_buffer): Call use_test_buffer as compiler barrier.

diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c
index cf7041f37a..b00fafd237 100644
--- a/string/tst-xbzero-opt.c
+++ b/string/tst-xbzero-opt.c
@@ -97,6 +97,17 @@ static const unsigned char test_pattern[16] =
 
 static ucontext_t uc_main, uc_co;
 
+static __attribute__ ((noinline)) int
+use_test_buffer (unsigned char *buf)
+{
+  unsigned int sum = 0;
+
+  for (unsigned int i = 0; i < PATTERN_REPS; i++)
+    sum += buf[i * PATTERN_SIZE];
+
+  return (sum == 2 * PATTERN_REPS) ? 0 : 1;
+}
+
 /* Always check the test buffer immediately after filling it; this
    makes externally visible side effects depend on the buffer existing
    and having been filled in.  */
@@ -108,6 +119,10 @@ prepare_test_buffer (unsigned char *buf)
 
   if (swapcontext (&uc_co, &uc_main))
     abort ();
+
+  /* Force the compiler to really copy the pattern to buf.  */
+  if (use_test_buffer (buf))
+    abort ();
 }
 
 static void
Zack Weinberg July 16, 2018, 12:36 p.m. | #3
On Mon, Jul 16, 2018 at 7:05 AM, Stefan Liebler <stli@linux.ibm.com> wrote:
> On 07/12/2018 06:42 PM, Zack Weinberg wrote:
>> I would prefer us to find some kind of construct that
>> actually does make externally-visible side effects depend on the
>> contents of 'buf' in terms of the C abstract machine.
>>
>
> Okay. Then here is a new version of the patch without the empty asm.
> If build with GCC head on s390x, setup_no_clear is now filling the buffer
> and setup_ordinary_clear is just a tail call to setup_no_clear (same
> behaviour as before).

Yes, this change is OK.

zw
Stefan Liebler July 16, 2018, 1:13 p.m. | #4
On 07/16/2018 02:36 PM, Zack Weinberg wrote:
> On Mon, Jul 16, 2018 at 7:05 AM, Stefan Liebler <stli@linux.ibm.com> wrote:
>> On 07/12/2018 06:42 PM, Zack Weinberg wrote:
>>> I would prefer us to find some kind of construct that
>>> actually does make externally-visible side effects depend on the
>>> contents of 'buf' in terms of the C abstract machine.
>>>
>>
>> Okay. Then here is a new version of the patch without the empty asm.
>> If build with GCC head on s390x, setup_no_clear is now filling the buffer
>> and setup_ordinary_clear is just a tail call to setup_no_clear (same
>> behaviour as before).
> 
> Yes, this change is OK.
> 
> zw
> 

Okay. Thank you.
Shall I commit this test-fix before or after the release?

Bye.
Stefan
Zack Weinberg July 16, 2018, 1:18 p.m. | #5
On Mon, Jul 16, 2018 at 9:13 AM, Stefan Liebler <stli@linux.ibm.com> wrote:
> On 07/16/2018 02:36 PM, Zack Weinberg wrote:
>> On Mon, Jul 16, 2018 at 7:05 AM, Stefan Liebler <stli@linux.ibm.com>
>> wrote:
>>> Okay. Then here is a new version of the patch without the empty asm.
>>> If build with GCC head on s390x, setup_no_clear is now filling the buffer
>>> and setup_ordinary_clear is just a tail call to setup_no_clear (same
>>> behaviour as before).
>>
>>
>> Yes, this change is OK.
>
> Okay. Thank you.
> Shall I commit this test-fix before or after the release?

Carlos has the last word but we usually accept bugfixes to test cases
right up till the hard freeze.

zw
Stefan Liebler July 23, 2018, 6:42 a.m. | #6
On 07/16/2018 03:18 PM, Zack Weinberg wrote:
> On Mon, Jul 16, 2018 at 9:13 AM, Stefan Liebler <stli@linux.ibm.com> wrote:
>> On 07/16/2018 02:36 PM, Zack Weinberg wrote:
>>> On Mon, Jul 16, 2018 at 7:05 AM, Stefan Liebler <stli@linux.ibm.com>
>>> wrote:
>>>> Okay. Then here is a new version of the patch without the empty asm.
>>>> If build with GCC head on s390x, setup_no_clear is now filling the buffer
>>>> and setup_ordinary_clear is just a tail call to setup_no_clear (same
>>>> behaviour as before).
>>>
>>>
>>> Yes, this change is OK.
>>
>> Okay. Thank you.
>> Shall I commit this test-fix before or after the release?
> 
> Carlos has the last word but we usually accept bugfixes to test cases
> right up till the hard freeze.
> 
> zw
> 

Carlos,

what about this bugfix for this test?
Can I commit it now or shall I wait commit it after the release?

Bye
Stefan
Carlos O'Donell July 26, 2018, 1:33 p.m. | #7
On 07/16/2018 07:05 AM, Stefan Liebler wrote:
> On 07/12/2018 06:42 PM, Zack Weinberg wrote:
>> On Thu, Jul 12, 2018 at 10:22 AM, Stefan Liebler <stli@linux.ibm.com> wrote:
>>> Fix string/tst-xbzero-opt is build with gcc head.
>> ...
>>> In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in
>>> prepare_test_buffer. Thus count_test_patterns does not find any of the
>>> test_pattern.
>>>
>>> This patch introduces a compiler barrier just after filling the buffer.
>>
>> I think I understand why the call to swapcontext in
>> prepare_test_buffer is not a sufficient compiler barrier, but I am not
>> a fan of asm volatile ("" ::: "memory"), because I fully expect some
>> future compiler to decide that there are no actual assembly
>> instructions being inserted so the statement can be completely
>> ignored.  I would prefer us to find some kind of construct that
>> actually does make externally-visible side effects depend on the
>> contents of 'buf' in terms of the C abstract machine.
>>
>> zw
>>
> 
> Okay. Then here is a new version of the patch without the empty asm. 
> If build with GCC head on s390x, setup_no_clear is now filling the
> buffer and setup_ordinary_clear is just a tail call to setup_no_clear
> (same behaviour as before).

Thanks for fixing this.

> commit a42ce495317df04f1abdc3670c1fcbb68f329c3d
> Author: Stefan Liebler <stli@linux.ibm.com>
> Date:   Fri Jul 13 15:58:48 2018 +0200
> 
>     Fix string/tst-xbzero-opt if build with gcc head.
>     
>     On s390x, the test string/tst-xbzero-opt is failing if build with gcc head:
>     FAIL: no clear/prepare: expected 32 got 0
>     FAIL: no clear/test: expected some got 0
>     FAIL: ordinary clear/prepare: expected 32 got 0
>     INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
>     PASS: explicit clear/prepare: expected 32 got 32
>     PASS: explicit clear/test: expected 0 got 0
>     
>     In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop
>     in prepare_test_buffer. Thus count_test_patterns does not find any of the
>     test_pattern.
>     
>     This patch calls use_test_buffer in order to force the compiler to really copy
>     the pattern to buf.

OK.

>     
>     ChangeLog:
>     
>             * string/tst-xbzero-opt.c (use_test_buffer): New function.
>              (prepare_test_buffer): Call use_test_buffer as compiler barrier.

OK for 2.28. Consider noclone also.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> 
> diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c
> index cf7041f37a..b00fafd237 100644
> --- a/string/tst-xbzero-opt.c
> +++ b/string/tst-xbzero-opt.c
> @@ -97,6 +97,17 @@ static const unsigned char test_pattern[16] =
>  
>  static ucontext_t uc_main, uc_co;
>  
> +static __attribute__ ((noinline)) int

Suggest: Add noclone.

As optimization barriers we have also used "noclone" here
e.g. malloc/tst-mallocstate.c (my_free).

I can't see a case where a specialization of this function avoids
the copy, but it might be possible? Just for safety's sake use noclone
too.

> +use_test_buffer (unsigned char *buf)
> +{
> +  unsigned int sum = 0;
> +
> +  for (unsigned int i = 0; i < PATTERN_REPS; i++)
> +    sum += buf[i * PATTERN_SIZE];
> +
> +  return (sum == 2 * PATTERN_REPS) ? 0 : 1;
> +}

OK.

> +
>  /* Always check the test buffer immediately after filling it; this
>     makes externally visible side effects depend on the buffer existing
>     and having been filled in.  */
> @@ -108,6 +119,10 @@ prepare_test_buffer (unsigned char *buf)
>  
>    if (swapcontext (&uc_co, &uc_main))
>      abort ();
> +
> +  /* Force the compiler to really copy the pattern to buf.  */
> +  if (use_test_buffer (buf))
> +    abort ();

OK.

>  }
>  
>  static void
Stefan Liebler July 26, 2018, 3:13 p.m. | #8
On 07/26/2018 03:33 PM, Carlos O'Donell wrote:
> On 07/16/2018 07:05 AM, Stefan Liebler wrote:
>> On 07/12/2018 06:42 PM, Zack Weinberg wrote:
>>> On Thu, Jul 12, 2018 at 10:22 AM, Stefan Liebler <stli@linux.ibm.com> wrote:
>>>> Fix string/tst-xbzero-opt is build with gcc head.
>>> ...
>>>> In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in
>>>> prepare_test_buffer. Thus count_test_patterns does not find any of the
>>>> test_pattern.
>>>>
>>>> This patch introduces a compiler barrier just after filling the buffer.
>>>
>>> I think I understand why the call to swapcontext in
>>> prepare_test_buffer is not a sufficient compiler barrier, but I am not
>>> a fan of asm volatile ("" ::: "memory"), because I fully expect some
>>> future compiler to decide that there are no actual assembly
>>> instructions being inserted so the statement can be completely
>>> ignored.  I would prefer us to find some kind of construct that
>>> actually does make externally-visible side effects depend on the
>>> contents of 'buf' in terms of the C abstract machine.
>>>
>>> zw
>>>
>>
>> Okay. Then here is a new version of the patch without the empty asm.
>> If build with GCC head on s390x, setup_no_clear is now filling the
>> buffer and setup_ordinary_clear is just a tail call to setup_no_clear
>> (same behaviour as before).
> 
> Thanks for fixing this.
> 
>> commit a42ce495317df04f1abdc3670c1fcbb68f329c3d
>> Author: Stefan Liebler <stli@linux.ibm.com>
>> Date:   Fri Jul 13 15:58:48 2018 +0200
>>
>>      Fix string/tst-xbzero-opt if build with gcc head.
>>      
>>      On s390x, the test string/tst-xbzero-opt is failing if build with gcc head:
>>      FAIL: no clear/prepare: expected 32 got 0
>>      FAIL: no clear/test: expected some got 0
>>      FAIL: ordinary clear/prepare: expected 32 got 0
>>      INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
>>      PASS: explicit clear/prepare: expected 32 got 32
>>      PASS: explicit clear/test: expected 0 got 0
>>      
>>      In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop
>>      in prepare_test_buffer. Thus count_test_patterns does not find any of the
>>      test_pattern.
>>      
>>      This patch calls use_test_buffer in order to force the compiler to really copy
>>      the pattern to buf.
> 
> OK.
> 
>>      
>>      ChangeLog:
>>      
>>              * string/tst-xbzero-opt.c (use_test_buffer): New function.
>>               (prepare_test_buffer): Call use_test_buffer as compiler barrier.
> 
> OK for 2.28. Consider noclone also.
> 
> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> 

Okay. I've just committed this patch with noclone.

Thanks
Stefan

>>
>> diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c
>> index cf7041f37a..b00fafd237 100644
>> --- a/string/tst-xbzero-opt.c
>> +++ b/string/tst-xbzero-opt.c
>> @@ -97,6 +97,17 @@ static const unsigned char test_pattern[16] =
>>   
>>   static ucontext_t uc_main, uc_co;
>>   
>> +static __attribute__ ((noinline)) int
> 
> Suggest: Add noclone.
> 
> As optimization barriers we have also used "noclone" here
> e.g. malloc/tst-mallocstate.c (my_free).
> 
> I can't see a case where a specialization of this function avoids
> the copy, but it might be possible? Just for safety's sake use noclone
> too.
> 
>> +use_test_buffer (unsigned char *buf)
>> +{
>> +  unsigned int sum = 0;
>> +
>> +  for (unsigned int i = 0; i < PATTERN_REPS; i++)
>> +    sum += buf[i * PATTERN_SIZE];
>> +
>> +  return (sum == 2 * PATTERN_REPS) ? 0 : 1;
>> +}
> 
> OK.
> 
>> +
>>   /* Always check the test buffer immediately after filling it; this
>>      makes externally visible side effects depend on the buffer existing
>>      and having been filled in.  */
>> @@ -108,6 +119,10 @@ prepare_test_buffer (unsigned char *buf)
>>   
>>     if (swapcontext (&uc_co, &uc_main))
>>       abort ();
>> +
>> +  /* Force the compiler to really copy the pattern to buf.  */
>> +  if (use_test_buffer (buf))
>> +    abort ();
> 
> OK.
> 
>>   }
>>   
>>   static void
>

Patch

commit 65f9b078c5053faa93e1f572282463685a869864
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Thu Jul 12 16:07:26 2018 +0200

    Fix string/tst-xbzero-opt if build with gcc head.
    
    On s390x, the test string/tst-xbzero-opt is failing if build with gcc head:
    FAIL: no clear/prepare: expected 32 got 0
    FAIL: no clear/test: expected some got 0
    FAIL: ordinary clear/prepare: expected 32 got 0
    INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
    PASS: explicit clear/prepare: expected 32 got 32
    PASS: explicit clear/test: expected 0 got 0
    
    In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop
    in prepare_test_buffer. Thus count_test_patterns does not find any of the
    test_pattern.
    
    This patch introduces a compiler barrier just after filling the buffer
    and the filling of the test_pattern is not omitted.
    
    ChangeLog:
    
            * string/tst-xbzero-opt.c (prepare_test_buffer): Add compiler barrier.

diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c
index cf7041f37a..4c2be0c197 100644
--- a/string/tst-xbzero-opt.c
+++ b/string/tst-xbzero-opt.c
@@ -106,6 +106,9 @@  prepare_test_buffer (unsigned char *buf)
   for (unsigned int i = 0; i < PATTERN_REPS; i++)
     memcpy (buf + i*PATTERN_SIZE, test_pattern, PATTERN_SIZE);
 
+  /* Force the compiler to really copy the pattern to buf.  */
+  __asm__ __volatile__ ("" ::: "memory");
+
   if (swapcontext (&uc_co, &uc_main))
     abort ();
 }