Patchwork [2/2] Support __ATOMIC_HLE_RELEASE for __atomic_clear/store_n

Submitter Uros Bizjak
Date Jan. 13, 2013, 7:59 p.m.
Message ID <CAFULd4Ztbccsq9h50Tp-0JGHr4CoFJ2QByf0WgS4Ed92_09=Rw@mail.gmail.com>
Permalink /patch/211649/
State New

Comments

Uros Bizjak - Jan. 13, 2013, 7:59 p.m.
Hello!

> __atomic_clear and __atomic_store_n didn't have code to generate
> the TSX HLE RELEASE prefix. Add this plus test cases.

+(define_insn "atomic_store_hle_release<mode>"
+  [(set (match_operand:ATOMIC 0 "memory_operand")
+	(unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand")
+			(match_operand:SI 2 "const_int_operand")]
+		       UNSPEC_MOVA_RELEASE))]
+  ""
+  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")

This pattern doesn't have any constraints! Also, mov insn can store
immediates directly.

+      if (model & IX86_HLE_RELEASE)
+        {
+	  emit_insn (gen_atomic_store_hle_release<mode> (operands[0],
+							 operands[1],
+							 operands[2]));
+	  DONE;
+        }
+
       /* For seq-cst stores, when we lack MFENCE, use XCHG.  */
       if (model == MEMMODEL_SEQ_CST && !(TARGET_64BIT || TARGET_SSE2))

What about __ATOMIC_SEQ_CST? Should

  __atomic_clear (p, __ATOMIC_SEQ_CST | __ATOMIC_HLE_RELEASE);

emit an mfence at the end, i.e., for your test:

        xrelease movb   $0, (%rdi)
        mfence
        ret
?

+
+void
+hle_clear (int *p, int v)

hle_clear (char *p)

This argument should correspond to a bool; please see the documentation.

I have also attached the patch that implements sync.md fixes.

Uros.
Andi Kleen - Jan. 13, 2013, 8:36 p.m.
On Sun, Jan 13, 2013 at 08:59:15PM +0100, Uros Bizjak wrote:
> Hello!
> 
> > __atomic_clear and __atomic_store_n didn't have code to generate
> > the TSX HLE RELEASE prefix. Add this plus test cases.
> 
> +(define_insn "atomic_store_hle_release<mode>"
> +  [(set (match_operand:ATOMIC 0 "memory_operand")
> +	(unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand")
> +			(match_operand:SI 2 "const_int_operand")]
> +		       UNSPEC_MOVA_RELEASE))]
> +  ""
> +  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")
> 
> This pattern doesn't have any constraints! Also, mov insn can store
> immediates directly.

Can you suggest a better pattern?

> 
> +      if (model & IX86_HLE_RELEASE)
> +        {
> +	  emit_insn (gen_atomic_store_hle_release<mode> (operands[0],
> +							 operands[1],
> +							 operands[2]));
> +	  DONE;
> +        }
> +
>        /* For seq-cst stores, when we lack MFENCE, use XCHG.  */
>        if (model == MEMMODEL_SEQ_CST && !(TARGET_64BIT || TARGET_SSE2))
> 
> What about __ATOMIC_SEQ_CST? Should
> 
>   __atomic_clear (p, __ATOMIC_SEQ_CST | __ATOMIC_HLE_RELEASE);
> 
> emit an mfence at the end, i.e., for your test:

Originally I thought not, but on reconsideration it's needed for
older CPUs that don't know about XRELEASE. And it may even be needed
with TSX, for the non-transactional fallback execution. I'll fix the patch.

> +
> +void
> +hle_clear (int *p, int v)
> 
> hle_clear (char *p)
> 
> This argument should correspond to a bool; please see the documentation.


Not sure I understand. Which documentation? This is just a random test case.

>    DONE;
>  })
>  
> +(define_insn "atomic_store<mode>_1"
> +  [(set (match_operand:ATOMIC 0 "memory_operand" "=m")
> +	(unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>")
> +			(match_operand:SI 2 "const_int_operand")]
> +		       UNSPEC_MOVA))]
> +  ""
> +  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")

Is that the updated pattern you wanted? It looks similar to mine.

-Andi
Uros Bizjak - Jan. 13, 2013, 8:59 p.m.
On Sun, Jan 13, 2013 at 9:36 PM, Andi Kleen <andi@firstfloor.org> wrote:

>> > __atomic_clear and __atomic_store_n didn't have code to generate
>> > the TSX HLE RELEASE prefix. Add this plus test cases.
>>
>> +(define_insn "atomic_store_hle_release<mode>"
>> +  [(set (match_operand:ATOMIC 0 "memory_operand")
>> +     (unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand")
>> +                     (match_operand:SI 2 "const_int_operand")]
>> +                    UNSPEC_MOVA_RELEASE))]
>> +  ""
>> +  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")
>>
>> This pattern doesn't have any constraints! Also, mov insn can store
>> immediates directly.
>
> Can you suggest a better pattern?

It is implemented in the patch attached to my previous message.

>>
>> +      if (model & IX86_HLE_RELEASE)
>> +        {
>> +	  emit_insn (gen_atomic_store_hle_release<mode> (operands[0],
>> +							 operands[1],
>> +							 operands[2]));
>> +	  DONE;
>> +        }
>> +
>>        /* For seq-cst stores, when we lack MFENCE, use XCHG.  */
>>        if (model == MEMMODEL_SEQ_CST && !(TARGET_64BIT || TARGET_SSE2))
>>
>> What about __ATOMIC_SEQ_CST? Should
>>
>>   __atomic_clear (p, __ATOMIC_SEQ_CST | __ATOMIC_HLE_RELEASE);
>>
>> emit an mfence at the end, i.e., for your test:
>
> Originally I thought not, but on reconsideration it's needed for
> older CPUs that don't know about XRELEASE. And it may even be needed
> with TSX, for the non-transactional fallback execution. I'll fix the patch.

Also fixed in my patch. It emits an mfence at the end.

>> +
>> +void
>> +hle_clear (int *p, int v)
>>
>> hle_clear (char *p)
>>
>> This argument should correspond to a bool; please see the documentation.
>
>
> Not sure I understand. Which documentation? This is just a random test case.

Ah, I was referring to the gcc documentation about __atomic_clear.

>> +(define_insn "atomic_store<mode>_1"
>> +  [(set (match_operand:ATOMIC 0 "memory_operand" "=m")
>> +     (unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>")
>> +                     (match_operand:SI 2 "const_int_operand")]
>> +                    UNSPEC_MOVA))]
>> +  ""
>> +  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")
>
> Is that the updated pattern you wanted? It looks similar to mine.

Yes, the attached patch implements all proposed fixes.

Uros.
Andi Kleen - Jan. 13, 2013, 10:12 p.m.
> >> +(define_insn "atomic_store<mode>_1"
> >> +  [(set (match_operand:ATOMIC 0 "memory_operand" "=m")
> >> +     (unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>")
> >> +                     (match_operand:SI 2 "const_int_operand")]
> >> +                    UNSPEC_MOVA))]
> >> +  ""
> >> +  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")
> >
> > Is that the updated pattern you wanted? It looks similar to mine.
> 
> Yes, the attached patch implements all proposed fixes.

Ok great. Can you just commit it then? It looks good to me.

-Andi

Patch

Index: config/i386/sync.md
===================================================================
--- config/i386/sync.md	(revision 195137)
+++ config/i386/sync.md	(working copy)
@@ -224,8 +224,12 @@ 
 	  DONE;
 	}
 
-      /* Otherwise use a normal store.  */
-      emit_move_insn (operands[0], operands[1]);
+      /* Otherwise use a store...  */
+      if (INTVAL (operands[2]) & IX86_HLE_RELEASE)
+	emit_insn (gen_atomic_store<mode>_1 (operands[0], operands[1],
+					     operands[2]));
+      else
+	emit_move_insn (operands[0], operands[1]);
     }
   /* ... followed by an MFENCE, if required.  */
   if (model == MEMMODEL_SEQ_CST)
@@ -233,6 +237,14 @@ 
   DONE;
 })
 
+(define_insn "atomic_store<mode>_1"
+  [(set (match_operand:ATOMIC 0 "memory_operand" "=m")
+	(unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>")
+			(match_operand:SI 2 "const_int_operand")]
+		       UNSPEC_MOVA))]
+  ""
+  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")
+
 (define_insn_and_split "atomic_storedi_fpu"
   [(set (match_operand:DI 0 "memory_operand" "=m,m,m")
 	(unspec:DI [(match_operand:DI 1 "register_operand" "x,m,?r")]