
[libitm] avoid non-portable branch mnemonics

Message ID 4EBADADF.5060602@redhat.com
State New

Commit Message

Richard Henderson Nov. 9, 2011, 7:56 p.m. UTC
I said elsewhere that I would convert this to __atomic, but then
I re-read my commentary about using cmpxchg *without* a lock prefix.
What we're looking for here is a store that's more or less non-interruptible,
rather than atomic.  And apparently I benchmarked this a while back as a
10x performance improvement.

Seems like the easiest thing is simply to emit the prefix byte with .byte
instead of relying on the ,pn mnemonic suffix.
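
As a stand-alone illustration of both points, here is a minimal sketch of
the technique (illustrative only -- the function name, the mask-expansion
loop, the full-mask early-out, and the "cc" clobber below are mine, not
taken from the libitm sources):

    #include <stdint.h>

    /* Merge the bytes of S selected by the byte mask M into *D.  The
       read-modify-write of the bytes we do not own happens inside a
       single *unlocked* cmpxchg, so it is non-interruptible without
       paying for a bus lock.  */
    static void
    store_mask_sketch (uint32_t *d, uint32_t s, uint8_t m)
    {
      /* Expand the low 4 bits of the byte mask into a bit mask,
         e.g. m = 0b0101 -> bm = 0x00ff00ff.  */
      uint32_t bm = 0;
      for (int i = 0; i < 4; i++)
        if (m & (1u << i))
          bm |= (uint32_t) 0xff << (i * 8);

      if (bm == 0xffffffffu)
        {
          *d = s;               /* Full mask: a plain store will do.  */
          return;
        }

      uint32_t n, o = *d;
      __asm__ ("0: mov     %[o], %[n]\n\t"
               "   and     %[m], %[n]\n\t"
               "   or      %[s], %[n]\n\t"
               "   cmpxchg %[n], %[d]\n\t"   /* no lock prefix */
               "   .byte   0x2e\n\t"         /* CS prefix = not-taken hint */
               "   jnz     0b"               /* retry if *d changed under us */
               : [d] "+m" (*d), [n] "=&r" (n), [o] "+a" (o)
               : [s] "r" (s & bm), [m] "r" (~bm)
               : "cc");
    }

The .byte 0x2e emits the CS segment-override prefix, which Intel documents
as a static "branch not taken" hint when it precedes a conditional jump;
an assembler that rejects the ,pn suffix will still accept the raw byte.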

Committed.


r~
commit f3210a53394de39a8aa74ec9dcb23f2cc0551322
Author: rth <rth@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed Nov 9 19:51:49 2011 +0000

    libitm: Avoid non-portable x86 branch prediction mnemonic.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@181233 138bc75d-0d04-0410-961f-82ee72b054a4

Patch

diff --git a/libitm/ChangeLog b/libitm/ChangeLog
index e78716d..0501d16 100644
--- a/libitm/ChangeLog
+++ b/libitm/ChangeLog
@@ -1,5 +1,8 @@ 
 2011-11-09  Richard Henderson  <rth@redhat.com>
 
+	* config/x86/cacheline.h (gtm_cacheline::store_mask): Use .byte
+	to emit branch prediction hint.
+
 	* config/x86/sjlj.S: Protect elf directives with __ELF__.
 	Protect .note.GNU-stack with __linux__.
 
diff --git a/libitm/config/x86/cacheline.h b/libitm/config/x86/cacheline.h
index 15a95b0..f91d7cc 100644
--- a/libitm/config/x86/cacheline.h
+++ b/libitm/config/x86/cacheline.h
@@ -144,7 +144,7 @@  gtm_cacheline::operator= (const gtm_cacheline & __restrict s)
 }
 #endif
 
-// ??? Support masked integer stores more efficiently with an unlocked cmpxchg
+// Support masked integer stores more efficiently with an unlocked cmpxchg
 // insn.  My reasoning is that while we write to locations that we do not wish
 // to modify, we do it in an uninterruptable insn, and so we either truely
 // write back the original data or the insn fails -- unlike with a
@@ -171,7 +171,8 @@  gtm_cacheline::store_mask (uint32_t *d, uint32_t s, uint8_t m)
 		"and	%[m], %[n]\n\t"
 		"or	%[s], %[n]\n\t"
 		"cmpxchg %[n], %[d]\n\t"
-		"jnz,pn	0b"
+		".byte	0x2e\n\t"	// predict not-taken, aka jnz,pn
+		"jnz	0b"
 		: [d] "+m"(*d), [n] "=&r" (n), [o] "+a"(o)
 		: [s] "r" (s & bm), [m] "r" (~bm));
 	}
@@ -198,7 +199,8 @@  gtm_cacheline::store_mask (uint64_t *d, uint64_t s, uint8_t m)
 		"and	%[m], %[n]\n\t"
 		"or	%[s], %[n]\n\t"
 		"cmpxchg %[n], %[d]\n\t"
-		"jnz,pn	0b"
+		".byte	0x2e\n\t"	// predict not-taken, aka jnz,pn
+		"jnz	0b"
 		: [d] "+m"(*d), [n] "=&r" (n), [o] "+a"(o)
 		: [s] "r" (s & bm), [m] "r" (~bm));
 #else