Message ID | CAFULd4Ztbccsq9h50Tp-0JGHr4CoFJ2QByf0WgS4Ed92_09=Rw@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Sun, Jan 13, 2013 at 08:59:15PM +0100, Uros Bizjak wrote: > Hello! > > > __atomic_clear and __atomic_store_n didn't have code to generate > > the TSX HLE RELEASE prefix. Add this plus test cases. > > +(define_insn "atomic_store_hle_release<mode>" > + [(set (match_operand:ATOMIC 0 "memory_operand") > + (unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand") > + (match_operand:SI 2 "const_int_operand")] > + UNSPEC_MOVA_RELEASE))] > + "" > + "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}") > > This pattern doesn't have any constraints! Also, mov insn can store > immediates directly. Can you suggest a better pattern? > > + if (model & IX86_HLE_RELEASE) > + { > + emit_insn (gen_atomic_store_hle_release<mode> (operands[0], > operands[1], > + operands[2])); > + DONE; > + } > + > /* For seq-cst stores, when we lack MFENCE, use XCHG. */ > if (model == MEMMODEL_SEQ_CST && !(TARGET_64BIT || TARGET_SSE2)) > > What about __ATOMIC_SEQ_CST; should > > __atomic_clear (p, __ATOMIC_SEQ_CST | __ATOMIC_HLE_RELEASE); > > emit a mfence at the end; in case of for your test: Originally I thought not, but now on reconsideration it's needed for older CPUs that don't know about the XRELEASE. And it may be even needed with TSX for the non transactional fallback execution. I'll fix the patch. > + > +void > +hle_clear (int *p, int v) > > hle_clear (char *p) > > This argument should correspond to a bool, please see documentation. Not sure I understand? Which documentation? This is just a random test case > DONE; > }) > > +(define_insn "atomic_store<mode>_1" > + [(set (match_operand:ATOMIC 0 "memory_operand" "=m") > + (unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>") > + (match_operand:SI 2 "const_int_operand")] > + UNSPEC_MOVA))] > + "" > + "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}") Is that the updated pattern you wanted? It looks similar to mine. -Andi
On Sun, Jan 13, 2013 at 9:36 PM, Andi Kleen <andi@firstfloor.org> wrote: >> > __atomic_clear and __atomic_store_n didn't have code to generate >> > the TSX HLE RELEASE prefix. Add this plus test cases. >> >> +(define_insn "atomic_store_hle_release<mode>" >> + [(set (match_operand:ATOMIC 0 "memory_operand") >> + (unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand") >> + (match_operand:SI 2 "const_int_operand")] >> + UNSPEC_MOVA_RELEASE))] >> + "" >> + "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}") >> >> This pattern doesn't have any constraints! Also, mov insn can store >> immediates directly. > > Can you suggest a better pattern? It is implemented in the patch, attached to my previous message. >> >> + if (model & IX86_HLE_RELEASE) >> + { >> + emit_insn (gen_atomic_store_hle_release<mode> (operands[0], >> operands[1], >> + operands[2])); >> + DONE; >> + } >> + >> /* For seq-cst stores, when we lack MFENCE, use XCHG. */ >> if (model == MEMMODEL_SEQ_CST && !(TARGET_64BIT || TARGET_SSE2)) >> >> What about __ATOMIC_SEQ_CST; should >> >> __atomic_clear (p, __ATOMIC_SEQ_CST | __ATOMIC_HLE_RELEASE); >> >> emit a mfence at the end; in case of for your test: > > Originally I thought not, but now on reconsideration it's needed for > older CPUs that don't know about the XRELEASE. And it may be even needed > with TSX for the non transactional fallback execution. I'll fix the patch. Also fixed in my patch. It emits mfence at the end. >> + >> +void >> +hle_clear (int *p, int v) >> >> hle_clear (char *p) >> >> This argument should correspond to a bool, please see documentation. > > > Not sure I understand? Which documentation? This is just a random test case Ah, I was referring to the gcc documentation about __atomic_clear. >> +(define_insn "atomic_store<mode>_1" >> + [(set (match_operand:ATOMIC 0 "memory_operand" "=m") >> + (unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>") >> + (match_operand:SI 2 "const_int_operand")] >> + UNSPEC_MOVA))] >> + "" >> + "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}") > > Is that the updated pattern you wanted? It looks similar to mine. Yes the attached patch actually implements all proposed fixes. Uros.
> >> +(define_insn "atomic_store<mode>_1" > >> + [(set (match_operand:ATOMIC 0 "memory_operand" "=m") > >> + (unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>") > >> + (match_operand:SI 2 "const_int_operand")] > >> + UNSPEC_MOVA))] > >> + "" > >> + "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}") > > > > Is that the updated pattern you wanted? It looks similar to mine. > > Yes the attached patch actually implements all proposed fixes. Ok great. Can you just commit it then? It looks good to me. -Andi
Index: config/i386/sync.md =================================================================== --- config/i386/sync.md (revision 195137) +++ config/i386/sync.md (working copy) @@ -224,8 +224,12 @@ DONE; } - /* Otherwise use a normal store. */ - emit_move_insn (operands[0], operands[1]); + /* Otherwise use a store... */ + if (INTVAL (operands[2]) & IX86_HLE_RELEASE) + emit_insn (gen_atomic_store<mode>_1 (operands[0], operands[1], + operands[2])); + else + emit_move_insn (operands[0], operands[1]); } /* ... followed by an MFENCE, if required. */ if (model == MEMMODEL_SEQ_CST) @@ -233,6 +237,14 @@ DONE; }) +(define_insn "atomic_store<mode>_1" + [(set (match_operand:ATOMIC 0 "memory_operand" "=m") + (unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>") + (match_operand:SI 2 "const_int_operand")] + UNSPEC_MOVA))] + "" + "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}") + (define_insn_and_split "atomic_storedi_fpu" [(set (match_operand:DI 0 "memory_operand" "=m,m,m") (unspec:DI [(match_operand:DI 1 "register_operand" "x,m,?r")]