diff mbox

Generate more efficient memory barriers for LEON3

Message ID 53AAB0AF.5060802@gaisler.com
State New
Headers show

Commit Message

Daniel Cederman June 25, 2014, 11:21 a.m. UTC
Hello,

The memory barriers generated for SPARC are targeting the weakest memory 
model allowed for SPARC. The LEON3/4 SPARC processors are using a 
stronger memory model and thus have less requirements on the memory 
barriers. For LEON3/4, StoreStore is compiler-only, instead of "stbar", 
and StoreLoad can be achieved with a normal byte write "stb", instead of 
an atomic byte read-write "ldstub". The provided patch changes the 
previously mentioned memory barriers for TARGET_LEON3.

Best regards,
Daniel Cederman

ChangeLog:

2014-06-25  Daniel Cederman  <cederman@gaisler.com>

gcc/config/sparc/
	* sync.md: Generate more efficient memory barriers for LEON3

  ;; instances of the membar pattern for Sparc V8.

Comments

Eric Botcazou July 10, 2014, 9:37 a.m. UTC | #1
> The memory barriers generated for SPARC are targeting the weakest memory
> model allowed for SPARC.

That's not quite true, they are targeting the sparc_memory_model, which is the 
memory model selected for the architecture/OS pair by default and which can be 
overridden by the user with -mmemory-model=[default|rmo|pso|tso|sc].

> The LEON3/4 SPARC processors are using a stronger memory model and thus have
> less requirements on the memory barriers.

My understanding is that they use TSO, in which case...

> For LEON3/4, StoreStore is compiler-only, instead of "stbar",

..."stdbar" should never be generated since #StoreStore is implied by TSO.

> and StoreLoad can be achieved with a normal byte write "stb", instead of
> an atomic byte read-write "ldstub".

OK, thanks.  Does this result in a significance performance gain?

> The provided patch changes the previously mentioned memory barriers for
> TARGET_LEON3.

I think that only the membar_storeload_leon3 pattern is necessary.  Couple of 
more nits: the new pattern is not "multi", it's "store" and you need to add:

  && !TARGET_LEON3

to the original membar_storeload since TARGET_LEON3 is also TARGET_V8.
Daniel Cederman July 11, 2014, 7:55 a.m. UTC | #2
Hi,

Thank you for your comments.

 > ..."stdbar" should never be generated since #StoreStore is implied by 
TSO.

I missed that #StoreStore is never generated for TSO, so I'm removing 
that pattern.

 > OK, thanks.  Does this result in a significance performance gain?

stb seems to be at least twice as fast as ldstub.

 > I think that only the membar_storeload_leon3 pattern is necessary.

The full barrier pattern membar_leon3 also gets generated so I think 
that one should be kept also.

I'm changing the pattern type to "store" and the condition on the 
original patterns to "&& !TARGET_LEON3" and resubmitting the patch.

On 2014-07-10 11:37, Eric Botcazou wrote:
>> The memory barriers generated for SPARC are targeting the weakest memory
>> model allowed for SPARC.
>
> That's not quite true, they are targeting the sparc_memory_model, which is the
> memory model selected for the architecture/OS pair by default and which can be
> overridden by the user with -mmemory-model=[default|rmo|pso|tso|sc].
>
>> The LEON3/4 SPARC processors are using a stronger memory model and thus have
>> less requirements on the memory barriers.
>
> My understanding is that they use TSO, in which case...
>
>> For LEON3/4, StoreStore is compiler-only, instead of "stbar",
>
> ..."stdbar" should never be generated since #StoreStore is implied by TSO.
>
>> and StoreLoad can be achieved with a normal byte write "stb", instead of
>> an atomic byte read-write "ldstub".
>
> OK, thanks.  Does this result in a significance performance gain?
>
>> The provided patch changes the previously mentioned memory barriers for
>> TARGET_LEON3.
>
> I think that only the membar_storeload_leon3 pattern is necessary.  Couple of
> more nits: the new pattern is not "multi", it's "store" and you need to add:
>
>    && !TARGET_LEON3
>
> to the original membar_storeload since TARGET_LEON3 is also TARGET_V8.
>
Eric Botcazou July 11, 2014, 9:15 a.m. UTC | #3
> The full barrier pattern membar_leon3 also gets generated so I think
> that one should be kept also.

Do you have a testcase?  membar is generated by sparc_emit_membar_for_model 
and, for the TSO model of LEON3, implied = StoreStore | LoadLoad | LoadStore 
so mm can only be StoreLoad, which means that membar_storeload will match so 
the full barrier never will.
Daniel Cederman July 11, 2014, 10:24 a.m. UTC | #4
That was an error on my side. The wrong memory model had gotten cached 
in a generated make script. Lets drop membar_leon3 also then :)

On 2014-07-11 11:15, Eric Botcazou wrote:
>> The full barrier pattern membar_leon3 also gets generated so I think
>> that one should be kept also.
>
> Do you have a testcase?  membar is generated by sparc_emit_membar_for_model
> and, for the TSO model of LEON3, implied = StoreStore | LoadLoad | LoadStore
> so mm can only be StoreLoad, which means that membar_storeload will match so
> the full barrier never will.
>
Eric Botcazou July 12, 2014, 9:18 a.m. UTC | #5
> That was an error on my side. The wrong memory model had gotten cached
> in a generated make script. Lets drop membar_leon3 also then :)

Fine with me but, on second thoughts, if a mere "stb" is a #StoreLoad memory 
barrier for LEON3, doesn't this simply mean that the memory model of the LEON3 
is Strong Consistency and not TSO?  In which case, the only thing to change is 
the default setting for LEON3 in sparc_option_override.
Daniel Cederman July 14, 2014, 11:41 a.m. UTC | #6
LEON3 has a store buffer of length 1, so an additional store is required 
to be sure that the one preceding it has been written to memory. I am 
not familiar enough with the internals of gcc to pick the optimal place 
to encode this behavior, so any suggestions from you are appreciated :)

On 2014-07-12 11:18, Eric Botcazou wrote:
>> That was an error on my side. The wrong memory model had gotten cached
>> in a generated make script. Lets drop membar_leon3 also then :)
>
> Fine with me but, on second thoughts, if a mere "stb" is a #StoreLoad memory
> barrier for LEON3, doesn't this simply mean that the memory model of the LEON3
> is Strong Consistency and not TSO?  In which case, the only thing to change is
> the default setting for LEON3 in sparc_option_override.
>
Eric Botcazou July 19, 2014, 9:28 a.m. UTC | #7
> LEON3 has a store buffer of length 1, so an additional store is required
> to be sure that the one preceding it has been written to memory. I am
> not familiar enough with the internals of gcc to pick the optimal place
> to encode this behavior, so any suggestions from you are appreciated :)

OK, that's not Strong Consistency so I'm going to apply your latest patch.
diff mbox

Patch

diff --git a/gcc/config/sparc/sync.md b/gcc/config/sparc/sync.md
index e6e237f..26173a7 100644
--- a/gcc/config/sparc/sync.md
+++ b/gcc/config/sparc/sync.md
@@ -56,6 +56,15 @@ 
    [(set_attr "type" "multi")
     (set_attr "length" "0")])

+;; For LEON3, membar #StoreStore is compiler-only.
+(define_insn "*membar_storestore_leon3"
+  [(set (match_operand:BLK 0 "" "")
+       (unspec:BLK [(match_dup 0) (const_int 8)] UNSPEC_MEMBAR))]
+  "TARGET_LEON3"
+  ""
+  [(set_attr "type" "multi")
+  (set_attr "length" "0")])
+
  ;; For V8, STBAR is exactly membar #StoreStore, by definition.
  (define_insn "*membar_storestore"
    [(set (match_operand:BLK 0 "" "")
@@ -64,6 +73,14 @@ 
    "stbar"
    [(set_attr "type" "multi")])

+;; For LEON3, STB has the effect of membar #StoreLoad.
+(define_insn "*membar_storeload_leon3"
+  [(set (match_operand:BLK 0 "" "")
+       (unspec:BLK [(match_dup 0) (const_int 2)] UNSPEC_MEMBAR))]
+  "TARGET_LEON3"
+  "stb\t%%g0, [%%sp-1]"
+  [(set_attr "type" "multi")])
+
  ;; For V8, LDSTUB has the effect of membar #StoreLoad.
  (define_insn "*membar_storeload"
    [(set (match_operand:BLK 0 "" "")
@@ -72,6 +89,15 @@ 
    "ldstub\t[%%sp-1], %%g0"
    [(set_attr "type" "multi")])

+;; For LEON3, membar #StoreLoad is enough for a full barrier.
+(define_insn "*membar_leon3"
+  [(set (match_operand:BLK 0 "" "")
+       (unspec:BLK [(match_dup 0) (match_operand:SI 1 "const_int_operand")]
+                   UNSPEC_MEMBAR))]
+  "TARGET_LEON3"
+  "stb\t%%g0, [%%sp-1]"
+  [(set_attr "type" "multi")])
+
  ;; Put the two together, in combination with the fact that V8 
implements PSO
  ;; as its weakest memory model, means a full barrier.  Match all remaining