Message ID | 53AAB0AF.5060802@gaisler.com |
---|---|
State | New |
Headers | show |
> The memory barriers generated for SPARC are targeting the weakest memory > model allowed for SPARC. That's not quite true, they are targeting the sparc_memory_model, which is the memory model selected for the architecture/OS pair by default and which can be overridden by the user with -mmemory-model=[default|rmo|pso|tso|sc]. > The LEON3/4 SPARC processors are using a stronger memory model and thus have > less requirements on the memory barriers. My understanding is that they use TSO, in which case... > For LEON3/4, StoreStore is compiler-only, instead of "stbar", ..."stdbar" should never be generated since #StoreStore is implied by TSO. > and StoreLoad can be achieved with a normal byte write "stb", instead of > an atomic byte read-write "ldstub". OK, thanks. Does this result in a significance performance gain? > The provided patch changes the previously mentioned memory barriers for > TARGET_LEON3. I think that only the membar_storeload_leon3 pattern is necessary. Couple of more nits: the new pattern is not "multi", it's "store" and you need to add: && !TARGET_LEON3 to the original membar_storeload since TARGET_LEON3 is also TARGET_V8.
Hi, Thank you for your comments. > ..."stdbar" should never be generated since #StoreStore is implied by TSO. I missed that #StoreStore is never generated for TSO, so I'm removing that pattern. > OK, thanks. Does this result in a significance performance gain? stb seems to be at least twice as fast as ldstub. > I think that only the membar_storeload_leon3 pattern is necessary. The full barrier pattern membar_leon3 also gets generated so I think that one should be kept also. I'm changing the pattern type to "store" and the condition on the original patterns to "&& !TARGET_LEON3" and resubmitting the patch. On 2014-07-10 11:37, Eric Botcazou wrote: >> The memory barriers generated for SPARC are targeting the weakest memory >> model allowed for SPARC. > > That's not quite true, they are targeting the sparc_memory_model, which is the > memory model selected for the architecture/OS pair by default and which can be > overridden by the user with -mmemory-model=[default|rmo|pso|tso|sc]. > >> The LEON3/4 SPARC processors are using a stronger memory model and thus have >> less requirements on the memory barriers. > > My understanding is that they use TSO, in which case... > >> For LEON3/4, StoreStore is compiler-only, instead of "stbar", > > ..."stdbar" should never be generated since #StoreStore is implied by TSO. > >> and StoreLoad can be achieved with a normal byte write "stb", instead of >> an atomic byte read-write "ldstub". > > OK, thanks. Does this result in a significance performance gain? > >> The provided patch changes the previously mentioned memory barriers for >> TARGET_LEON3. > > I think that only the membar_storeload_leon3 pattern is necessary. Couple of > more nits: the new pattern is not "multi", it's "store" and you need to add: > > && !TARGET_LEON3 > > to the original membar_storeload since TARGET_LEON3 is also TARGET_V8. >
> The full barrier pattern membar_leon3 also gets generated so I think > that one should be kept also. Do you have a testcase? membar is generated by sparc_emit_membar_for_model and, for the TSO model of LEON3, implied = StoreStore | LoadLoad | LoadStore so mm can only be StoreLoad, which means that membar_storeload will match so the full barrier never will.
That was an error on my side. The wrong memory model had gotten cached in a generated make script. Lets drop membar_leon3 also then :) On 2014-07-11 11:15, Eric Botcazou wrote: >> The full barrier pattern membar_leon3 also gets generated so I think >> that one should be kept also. > > Do you have a testcase? membar is generated by sparc_emit_membar_for_model > and, for the TSO model of LEON3, implied = StoreStore | LoadLoad | LoadStore > so mm can only be StoreLoad, which means that membar_storeload will match so > the full barrier never will. >
> That was an error on my side. The wrong memory model had gotten cached > in a generated make script. Lets drop membar_leon3 also then :) Fine with me but, on second thoughts, if a mere "stb" is a #StoreLoad memory barrier for LEON3, doesn't this simply mean that the memory model of the LEON3 is Strong Consistency and not TSO? In which case, the only thing to change is the default setting for LEON3 in sparc_option_override.
LEON3 has a store buffer of length 1, so an additional store is required to be sure that the one preceding it has been written to memory. I am not familiar enough with the internals of gcc to pick the optimal place to encode this behavior, so any suggestions from you are appreciated :) On 2014-07-12 11:18, Eric Botcazou wrote: >> That was an error on my side. The wrong memory model had gotten cached >> in a generated make script. Lets drop membar_leon3 also then :) > > Fine with me but, on second thoughts, if a mere "stb" is a #StoreLoad memory > barrier for LEON3, doesn't this simply mean that the memory model of the LEON3 > is Strong Consistency and not TSO? In which case, the only thing to change is > the default setting for LEON3 in sparc_option_override. >
> LEON3 has a store buffer of length 1, so an additional store is required > to be sure that the one preceding it has been written to memory. I am > not familiar enough with the internals of gcc to pick the optimal place > to encode this behavior, so any suggestions from you are appreciated :) OK, that's not Strong Consistency so I'm going to apply your latest patch.
diff --git a/gcc/config/sparc/sync.md b/gcc/config/sparc/sync.md index e6e237f..26173a7 100644 --- a/gcc/config/sparc/sync.md +++ b/gcc/config/sparc/sync.md @@ -56,6 +56,15 @@ [(set_attr "type" "multi") (set_attr "length" "0")]) +;; For LEON3, membar #StoreStore is compiler-only. +(define_insn "*membar_storestore_leon3" + [(set (match_operand:BLK 0 "" "") + (unspec:BLK [(match_dup 0) (const_int 8)] UNSPEC_MEMBAR))] + "TARGET_LEON3" + "" + [(set_attr "type" "multi") + (set_attr "length" "0")]) + ;; For V8, STBAR is exactly membar #StoreStore, by definition. (define_insn "*membar_storestore" [(set (match_operand:BLK 0 "" "") @@ -64,6 +73,14 @@ "stbar" [(set_attr "type" "multi")]) +;; For LEON3, STB has the effect of membar #StoreLoad. +(define_insn "*membar_storeload_leon3" + [(set (match_operand:BLK 0 "" "") + (unspec:BLK [(match_dup 0) (const_int 2)] UNSPEC_MEMBAR))] + "TARGET_LEON3" + "stb\t%%g0, [%%sp-1]" + [(set_attr "type" "multi")]) + ;; For V8, LDSTUB has the effect of membar #StoreLoad. (define_insn "*membar_storeload" [(set (match_operand:BLK 0 "" "") @@ -72,6 +89,15 @@ "ldstub\t[%%sp-1], %%g0" [(set_attr "type" "multi")]) +;; For LEON3, membar #StoreLoad is enough for a full barrier. +(define_insn "*membar_leon3" + [(set (match_operand:BLK 0 "" "") + (unspec:BLK [(match_dup 0) (match_operand:SI 1 "const_int_operand")] + UNSPEC_MEMBAR))] + "TARGET_LEON3" + "stb\t%%g0, [%%sp-1]" + [(set_attr "type" "multi")]) + ;; Put the two together, in combination with the fact that V8 implements PSO ;; as its weakest memory model, means a full barrier. Match all remaining