Patchwork [1/2] Document HLE / RTM intrinsics

login
register
mail settings
Submitter Andi Kleen
Date Jan. 12, 2013, 3:28 p.m.
Message ID <1358004522-16358-1-git-send-email-andi@firstfloor.org>
Download mbox | patch
Permalink /patch/211504/
State New
Headers show

Comments

Andi Kleen - Jan. 12, 2013, 3:28 p.m.
From: Andi Kleen <ak@linux.intel.com>

The TSX HLE/RTM intrinsics were missing documentation. Add this to the
manual.

Ok for release / trunk?

2013-01-11  Andi Kleen  <ak@linux.intel.com>

	* doc/extend.texi: Document __ATOMIC_HLE_ACQUIRE,
	__ATOMIC_HLE_RELEASE. Document __builtin_ia32 TSX intrincs.
	Document _x* TSX intrinsics.
---
 gcc/doc/extend.texi |  115 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
aldot - Jan. 12, 2013, 5:04 p.m.
On 12 January 2013 16:28:41 Andi Kleen <andi@firstfloor.org> wrote:
> From: Andi Kleen <ak@linux.intel.com>

> +Returns _XBEGIN_STARTED when the transaction
> +started successfully (not this is not 0, so the constant has to be

not this is not 0? Or note?

Thanks,


Sent with AquaMail for Android
http://www.aqua-mail.com
Andi Kleen - Jan. 12, 2013, 5:20 p.m.
On Sat, Jan 12, 2013 at 06:04:19PM +0100, Bernhard Reutner-Fischer wrote:
> On 12 January 2013 16:28:41 Andi Kleen <andi@firstfloor.org> wrote:
> >From: Andi Kleen <ak@linux.intel.com>
> 
> >+Returns _XBEGIN_STARTED when the transaction
> >+started successfully (not this is not 0, so the constant has to be
> 
> not this is not 0? Or note?

"note"

Thanks. Will fix before comitting.

-Andi
Richard Guenther - Jan. 14, 2013, 7:24 a.m.
On Sat, Jan 12, 2013 at 6:20 PM, Andi Kleen <andi@firstfloor.org> wrote:
> On Sat, Jan 12, 2013 at 06:04:19PM +0100, Bernhard Reutner-Fischer wrote:
>> On 12 January 2013 16:28:41 Andi Kleen <andi@firstfloor.org> wrote:
>> >From: Andi Kleen <ak@linux.intel.com>
>>
>> >+Returns _XBEGIN_STARTED when the transaction
>> >+started successfully (not this is not 0, so the constant has to be
>>
>> not this is not 0? Or note?
>
> "note"
>
> Thanks. Will fix before comitting.

I think (somewhere else) we agreed to only document intrinsics,
not the __builtin_ia32_ variants (they are an implementation detail).
Yes, we're not consistent with that, but we do miss a lot of
documentation for these kind of builtins.  I suppose we also do not
document all intrinsics either (though that's desired, as we provide
those headers - even manpages would be nice for them I suppose).

Target maintainers?

Thanks,
Richard.

> -Andi
Andi Kleen - Jan. 14, 2013, 6:22 p.m.
> I think (somewhere else) we agreed to only document intrinsics,
> not the __builtin_ia32_ variants (they are an implementation detail).

They are all (poorly) documented.

I didn't really document them, just list them.

> Yes, we're not consistent with that, but we do miss a lot of
> documentation for these kind of builtins.  I suppose we also do not
> document all intrinsics either (though that's desired, as we provide
> those headers - even manpages would be nice for them I suppose).

It would be nice if gcc had proper documentation for all the
<xxxintrin.h> functions. But that's a lot of work.

But I would like to have TSX properly documented at least.

-Andi
Andi Kleen - Jan. 20, 2013, 6:50 p.m.
Andi Kleen <andi@firstfloor.org> writes:

> From: Andi Kleen <ak@linux.intel.com>
>
> The TSX HLE/RTM intrinsics were missing documentation. Add this to the
> manual.
>
> Ok for release / trunk?

Could someone please review/approve this (documentation only) patch?

Thanks.

-Andi

> 2013-01-11  Andi Kleen  <ak@linux.intel.com>
>
> 	* doc/extend.texi: Document __ATOMIC_HLE_ACQUIRE,
> 	__ATOMIC_HLE_RELEASE. Document __builtin_ia32 TSX intrincs.
> 	Document _x* TSX intrinsics.
Andi Kleen - Jan. 26, 2013, 10:54 p.m.
Andi Kleen <andi@firstfloor.org> writes:

PING^2!!

> Andi Kleen <andi@firstfloor.org> writes:
>
>> From: Andi Kleen <ak@linux.intel.com>
>>
>> The TSX HLE/RTM intrinsics were missing documentation. Add this to the
>> manual.
>>
>> Ok for release / trunk?
>
> Could someone please review/approve this (documentation only) patch?

> Thanks.
>
> -Andi
>
>> 2013-01-11  Andi Kleen  <ak@linux.intel.com>
>>
>> 	* doc/extend.texi: Document __ATOMIC_HLE_ACQUIRE,
>> 	__ATOMIC_HLE_RELEASE. Document __builtin_ia32 TSX intrincs.
>> 	Document _x* TSX intrinsics.
Florian Weimer - Jan. 27, 2013, 6:15 p.m.
On 01/12/2013 04:28 PM, Andi Kleen wrote:

> The TSX HLE/RTM intrinsics were missing documentation. Add this to the
> manual.

Are these intrinsics restricted to free-standing implementations?  Or 
are these instructions designed in such a way that they work as expected 
even if the threading library uses them internally?  (That would be 
quite a feat.)
Andi Kleen - Jan. 27, 2013, 9:22 p.m.
On Sun, Jan 27, 2013 at 07:15:42PM +0100, Florian Weimer wrote:
> On 01/12/2013 04:28 PM, Andi Kleen wrote:
> 
> >The TSX HLE/RTM intrinsics were missing documentation. Add this to the
> >manual.
> 
> Are these intrinsics restricted to free-standing implementations?  Or 
> are these instructions designed in such a way that they work as expected 
> even if the threading library uses them internally?  (That would be 
> quite a feat.)

They can be combined with a threading library, with some restrictions.
See the manual for details. All transactions are flattened.

http://software.intel.com/sites/default/files/m/a/b/3/4/d/41604-319433-012a.pdf
chapter 8

"restrictions" may lead to not eliding or abort, but never to
correctness problems.

Documenting all that is out of scope for the gcc manual though.

-Andi
Andi Kleen - Feb. 14, 2013, 9:34 p.m.
Andi Kleen <andi@firstfloor.org> writes:

PING^3

I'm about to give up on this, concluding that there is no interest
in improving the gcc documentation.

-Andi

> Andi Kleen <andi@firstfloor.org> writes:
>
> PING^2!!
>
>> Andi Kleen <andi@firstfloor.org> writes:
>>
>>> From: Andi Kleen <ak@linux.intel.com>
>>>
>>> The TSX HLE/RTM intrinsics were missing documentation. Add this to the
>>> manual.
>>>
>>> Ok for release / trunk?
>>
>> Could someone please review/approve this (documentation only) patch?
>
>> Thanks.
>>
>> -Andi
>>
>>> 2013-01-11  Andi Kleen  <ak@linux.intel.com>
>>>
>>> 	* doc/extend.texi: Document __ATOMIC_HLE_ACQUIRE,
>>> 	__ATOMIC_HLE_RELEASE. Document __builtin_ia32 TSX intrincs.
>>> 	Document _x* TSX intrinsics.

Patch

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index cc20ed2..fb0d4bc 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -81,6 +81,7 @@  extensions, accepted by GCC in C90 mode and in C++.
 * Offsetof::            Special syntax for implementing @code{offsetof}.
 * __sync Builtins::     Legacy built-in functions for atomic memory access.
 * __atomic Builtins::   Atomic built-in functions with memory model.
+* x86 specific memory model extensions for transactional memory:: x86 memory models.
 * Object Size Checking:: Built-in functions for limited buffer overflow
                         checking.
 * Other Builtins::      Other built-in functions.
@@ -7466,6 +7467,37 @@  alignment.  A value of 0 indicates typical alignment should be used.  The
 compiler may also ignore this parameter.
 @end deftypefn
 
+@node x86 specific memory model extensions for transactional memory
+@section x86 specific memory model extensions for transactional memory
+
+The i386 architecture supports additional memory ordering flags
+to mark lock critical sections for hardware lock elision. 
+These must be specified in addition to an existing memory model to 
+atomic intrinsics.
+
+@table @code
+@item __ATOMIC_HLE_ACQUIRE
+Start lock elision on a lock variable.
+Memory model must be @code{__ATOMIC_ACQUIRE} or stronger.
+@item __ATOMIC_HLE_RELEASE
+End lock elision on a lock variable.
+Memory model must be @code{__ATOMIC_RELEASE} or stronger.
+@end table
+
+When a lock acquire fails it's required for good performance to abort
+the transaction quickly. This can be done with a @code{_mm_pause}
+
+@smallexample
+#include <immintrin.h> // For _mm_pause
+
+/* Acquire lock with lock elision */
+while (__atomic_exchange_n(&lockvar, 1, __ATOMIC_ACQUIRE|__ATOMIC_HLE_ACQUIRE))
+    _mm_pause(); /* Abort failed transaction */
+...
+/* Free lock with lock elision */
+__atomic_clear(&lockvar, __ATOMIC_RELEASE|__ATOMIC_HLE_RELEASE);
+@end smallexample
+
 @node Object Size Checking
 @section Object Size Checking Built-in Functions
 @findex __builtin_object_size
@@ -8737,6 +8769,7 @@  instructions, but allow the compiler to schedule those calls.
 * Blackfin Built-in Functions::
 * FR-V Built-in Functions::
 * X86 Built-in Functions::
+* X86 transactional memory intrinsics::
 * MIPS DSP Built-in Functions::
 * MIPS Paired-Single Support::
 * MIPS Loongson Built-in Functions::
@@ -10917,6 +10950,88 @@  v2sf __builtin_ia32_pswapdsf (v2sf)
 v2si __builtin_ia32_pswapdsi (v2si)
 @end smallexample
 
+The following built-in functions are available when @option{-mrtm} is used
+They are used for restricted transactional memory. These are the internal
+low level functions. Normally the functions in 
+@ref{X86 transactional memory intrinsics} should be used instead.
+
+@smallexample
+int __builtin_ia32_xbegin ()
+void __builtin_ia32_xend ()
+void __builtin_ia32_xabort (status)
+int __builtin_ia32_xtest ()
+@end smallexample
+
+@node X86 transactional memory intrinsics
+@subsection X86 transaction memory intrinsics
+
+Hardware transactional memory intrinsics for i386. These allow to use
+memory transactions with RTM (Restricted Transactional Memory).
+For using HLE (Hardware Lock Elision) see @ref{x86 specific memory model extensions for transactional memory} instead.
+This support is enabled with the @option{-mrtm} option.
+
+A memory transaction commits all changes to memory in an atomic way,
+as visible to other threads. If the transaction fails it is rolled back
+and all side effects discarded.
+
+Generally there is no guarantee that a memory transaction ever suceeds
+and suitable fallback code always needs to be supplied.
+
+@deftypefn {RTM Function} {unsigned} _xbegin ()
+Start a RTM (Restricted Transactional Memory) transaction. 
+Returns _XBEGIN_STARTED when the transaction
+started successfully (not this is not 0, so the constant has to be 
+explicitely tested). When the transaction aborts all side effects
+are undone and an abort code is returned. There is no guarantee
+any transaction ever succeeds, so there always needs to be a valid
+tested fallback path.
+@end deftypefn
+
+@smallexample
+#include <immintrin.h>
+
+if ((status = _xbegin ()) == _XBEGIN_STARTED) @{
+    ... transaction code...
+    _xend ();
+@} else @{
+    ... non transactional fallback path...
+@}
+@end smallexample
+
+Valid abort status bits (when the value is not @code{_XBEGIN_STARTED}) are:
+
+@table @code
+@item _XABORT_EXPLICIT
+Transaction explicitely aborted with @code{_xabort}. The parameter passed
+to @code{_xabort} is available with @code{_XABORT_CODE(status)}
+@item _XABORT_RETRY
+Transaction retry is possible.
+@item _XABORT_CONFLICT
+Transaction abort due to a memory conflict with another thread
+@item _XABORT_CAPACITY
+Transaction abort due to the transaction using too much memory
+@item _XABORT_DEBUG
+Transaction abort due to a debug trap
+@item _XABORT_NESTED
+Transaction abort in a inner nested transaction
+@end table
+
+@deftypefn {RTM Function} {void} _xend ()
+Commit the current transaction. When no transaction is active this will
+fault. All memory side effects of the transactions will become visible
+to other threads in an atomic matter.
+@end deftypefn
+
+@deftypefn {RTM Function} {int} _xtest ()
+Return a value not zero when a transaction is currently active, otherwise 0.
+@end deftypefn
+
+@deftypefn {RTM Function} {void} _xabort (status)
+Abort the current transaction. When no transaction is active this is a no-op.
+status must be a 8bit constant, that is included in the status code returned
+by @code{_xbegin}
+@end deftypefn
+
 @node MIPS DSP Built-in Functions
 @subsection MIPS DSP Built-in Functions