[V6,1/9] elf: Add new powerpc specifc core note sections

Message ID	1417507013-11948-2-git-send-email-khandual@linux.vnet.ibm.com (mailing list archive)
State	Superseded
Delegated to:	Michael Ellerman
Headers	show Return-Path: <linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org> Gateway: Authorized Use Only! Violators will be prosecuted for <linuxppc-dev@ozlabs.org> from <khandual@linux.vnet.ibm.com>; Tue, 2 Dec 2014 18:06:13 +1000 Gateway: Authorized Use Only! Violators will be prosecuted; Tue, 2 Dec 2014 18:06:12 +1000 From: Anshuman Khandual <khandual@linux.vnet.ibm.com> To: linux-kernel@vger.kernel.org, linuxppc-dev@ozlabs.org Subject: [PATCH V6 1/9] elf: Add new powerpc specifc core note sections Date: Tue, 2 Dec 2014 13:26:45 +0530 Message-Id: <1417507013-11948-2-git-send-email-khandual@linux.vnet.ibm.com> In-Reply-To: <1417507013-11948-1-git-send-email-khandual@linux.vnet.ibm.com> References: <1417507013-11948-1-git-send-email-khandual@linux.vnet.ibm.com> Cc: shuahkh@osg.samsung.com, mikey@neuling.org, james.hogan@imgtec.com, avagin@openvz.org, Paul.Clothier@imgtec.com, peterz@infradead.org, palves@redhat.com, oleg@redhat.com, davem@davemloft.net, dhowells@redhat.com, kirjanov@gmail.com, davej@redhat.com, akpm@linux-foundation.org, sukadev@linux.vnet.ibm.com, tglx@linutronix.de, sam.bobroff@au1.ibm.com Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" <linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>

Anshuman Khandual Dec. 2, 2014, 7:56 a.m. UTC

This patch adds four new ELF core note sections for powerpc
transactional memory and one new ELF core note section for
powerpc general miscellaneous debug registers. These addition
of new ELF core note sections extends the existing ELF ABI
without affecting it in any manner.

Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 include/uapi/linux/elf.h | 5 +++++
 1 file changed, 5 insertions(+)

Michael Ellerman Dec. 3, 2014, 5:22 a.m. UTC | #1

On Tue, 2014-02-12 at 07:56:45 UTC, Anshuman Khandual wrote:
> This patch adds four new ELF core note sections for powerpc
> transactional memory and one new ELF core note section for
> powerpc general miscellaneous debug registers. These addition
> of new ELF core note sections extends the existing ELF ABI
> without affecting it in any manner.
> 
> Acked-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  include/uapi/linux/elf.h | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
> index ea9bf25..2260fc0 100644
> --- a/include/uapi/linux/elf.h
> +++ b/include/uapi/linux/elf.h
> @@ -379,6 +379,11 @@ typedef struct elf64_shdr {
>  #define NT_PPC_VMX	0x100		/* PowerPC Altivec/VMX registers */
>  #define NT_PPC_SPE	0x101		/* PowerPC SPE/EVR registers */
>  #define NT_PPC_VSX	0x102		/* PowerPC VSX registers */
> +#define NT_PPC_TM_SPR	0x103		/* PowerPC TM special registers */
> +#define NT_PPC_TM_CGPR	0x104		/* PowerpC TM checkpointed GPR */
> +#define NT_PPC_TM_CFPR	0x105		/* PowerPC TM checkpointed FPR */
> +#define NT_PPC_TM_CVMX	0x106		/* PowerPC TM checkpointed VMX */
> +#define NT_PPC_MISC	0x107		/* PowerPC miscellaneous registers */

This is a really terrible name, "MISC".

Having said that, I guess it's accurate. We have a whole bunch of regs that
have accrued over recent years that aren't accessible via ptrace.

It seems to me if we're adding a misc regset we should be adding everything we
might want to it that is currenty architected.

But currently you only include the PPR, TAR & DSCR.

Looking at Power ISA v2.07, I see the following that could be included:

  MMCR2
  MMCRA
  PMC1
  PMC2
  PMC3
  PMC4
  PMC5
  PMC6
  MMCR0
  EBBHR
  EBBRR
  BESCR
  SIAR
  SDAR
  CFAR?

Those are all new in 2.07 except for CFAR.

There might be more I missed, that was just a quick scan.

Some are only accessible when EBB is in use, maybe those could be a separate
regset.

cheers

Anshuman Khandual Dec. 3, 2014, 6:48 a.m. UTC | #2

On 12/03/2014 10:52 AM, Michael Ellerman wrote:
> On Tue, 2014-02-12 at 07:56:45 UTC, Anshuman Khandual wrote:
>> This patch adds four new ELF core note sections for powerpc
>> transactional memory and one new ELF core note section for
>> powerpc general miscellaneous debug registers. These addition
>> of new ELF core note sections extends the existing ELF ABI
>> without affecting it in any manner.
>>
>> Acked-by: Andrew Morton <akpm@linux-foundation.org>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>>  include/uapi/linux/elf.h | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
>> index ea9bf25..2260fc0 100644
>> --- a/include/uapi/linux/elf.h
>> +++ b/include/uapi/linux/elf.h
>> @@ -379,6 +379,11 @@ typedef struct elf64_shdr {
>>  #define NT_PPC_VMX	0x100		/* PowerPC Altivec/VMX registers */
>>  #define NT_PPC_SPE	0x101		/* PowerPC SPE/EVR registers */
>>  #define NT_PPC_VSX	0x102		/* PowerPC VSX registers */
>> +#define NT_PPC_TM_SPR	0x103		/* PowerPC TM special registers */
>> +#define NT_PPC_TM_CGPR	0x104		/* PowerpC TM checkpointed GPR */
>> +#define NT_PPC_TM_CFPR	0x105		/* PowerPC TM checkpointed FPR */
>> +#define NT_PPC_TM_CVMX	0x106		/* PowerPC TM checkpointed VMX */
>> +#define NT_PPC_MISC	0x107		/* PowerPC miscellaneous registers */
> 
> This is a really terrible name, "MISC".
> 
> Having said that, I guess it's accurate. We have a whole bunch of regs that
> have accrued over recent years that aren't accessible via ptrace.
> 
> It seems to me if we're adding a misc regset we should be adding everything we
> might want to it that is currenty architected.

But I believe they also need to be part of the thread_struct structure to be
accessible from ptrace.
 
> 
> But currently you only include the PPR, TAR & DSCR.

Yeah, thats what we started with.

> 
> Looking at Power ISA v2.07, I see the following that could be included:
> 
>   MMCR2
>   MMCRA
>   PMC1
>   PMC2
>   PMC3
>   PMC4
>   PMC5
>   PMC6
>   MMCR0
>   EBBHR
>   EBBRR
>   BESCR
>   SIAR
>   SDAR
>   CFAR?

MMCRA, PMC[1..6], EBBHR, BESCR, EBBRR, CFAR are not part of the thread struct.

> 
> Those are all new in 2.07 except for CFAR.
> 
> There might be more I missed, that was just a quick scan.
> 
> Some are only accessible when EBB is in use, maybe those could be a separate
> regset.

Yeah we can have one more regset for EBB specific registers.

Anshuman Khandual Dec. 8, 2014, 10:08 a.m. UTC | #3

On 12/03/2014 12:18 PM, Anshuman Khandual wrote:
> On 12/03/2014 10:52 AM, Michael Ellerman wrote:
>> On Tue, 2014-02-12 at 07:56:45 UTC, Anshuman Khandual wrote:
>>> This patch adds four new ELF core note sections for powerpc
>>> transactional memory and one new ELF core note section for
>>> powerpc general miscellaneous debug registers. These addition
>>> of new ELF core note sections extends the existing ELF ABI
>>> without affecting it in any manner.
>>>
>>> Acked-by: Andrew Morton <akpm@linux-foundation.org>
>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>> ---
>>>  include/uapi/linux/elf.h | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
>>> index ea9bf25..2260fc0 100644
>>> --- a/include/uapi/linux/elf.h
>>> +++ b/include/uapi/linux/elf.h
>>> @@ -379,6 +379,11 @@ typedef struct elf64_shdr {
>>>  #define NT_PPC_VMX	0x100		/* PowerPC Altivec/VMX registers */
>>>  #define NT_PPC_SPE	0x101		/* PowerPC SPE/EVR registers */
>>>  #define NT_PPC_VSX	0x102		/* PowerPC VSX registers */
>>> +#define NT_PPC_TM_SPR	0x103		/* PowerPC TM special registers */
>>> +#define NT_PPC_TM_CGPR	0x104		/* PowerpC TM checkpointed GPR */
>>> +#define NT_PPC_TM_CFPR	0x105		/* PowerPC TM checkpointed FPR */
>>> +#define NT_PPC_TM_CVMX	0x106		/* PowerPC TM checkpointed VMX */
>>> +#define NT_PPC_MISC	0x107		/* PowerPC miscellaneous registers */
>>
>> This is a really terrible name, "MISC".
>>
>> Having said that, I guess it's accurate. We have a whole bunch of regs that
>> have accrued over recent years that aren't accessible via ptrace.
>>
>> It seems to me if we're adding a misc regset we should be adding everything we
>> might want to it that is currenty architected.
> 
> But I believe they also need to be part of the thread_struct structure to be
> accessible from ptrace.

Currently we dont context save/restore the PMC count registers (PMC1-PMC6)
during the process context switch. So the values of PMC1..PMC6 are not
thread specific in the structure. To be able to access them in ptrace
when the tracee has stopped, we need to context save these counters
in the thread struct. Shall we do that ? Then we can add them to the
MISC regset bucket irrespective of whats the value we get in there when
we probe through ptrace.

The same goes for MMCRA, CFAR registers as well.

>  
>>
>> But currently you only include the PPR, TAR & DSCR.
> 
> Yeah, thats what we started with.
> 
>>
>> Looking at Power ISA v2.07, I see the following that could be included:
>>
>>   MMCR2
>>   MMCRA
>>   PMC1
>>   PMC2
>>   PMC3
>>   PMC4
>>   PMC5
>>   PMC6
>>   MMCR0
>>   EBBHR
>>   EBBRR
>>   BESCR
>>   SIAR
>>   SDAR
>>   CFAR?
> 
> MMCRA, PMC[1..6], EBBHR, BESCR, EBBRR, CFAR are not part of the thread struct.

Sorry. EBBRR, EBBHR, BESCR registers are part of the thread struct.

> 
>>
>> Those are all new in 2.07 except for CFAR.
>>
>> There might be more I missed, that was just a quick scan.
>>
>> Some are only accessible when EBB is in use, maybe those could be a separate
>> regset.
> 
> Yeah we can have one more regset for EBB specific registers.

Should the new EBB specific regset include only EBBRR, EBBHR, BESCR registers
or should it also include SIAR, SDAR, SIER, MMCR0, MMCR2 registers as well. I
was thinking about putting these five registers into the MISC bucket instead.
But from the perf code, it looks like these five registers are also related to
the EBB context as well.

Some clarity on these points would really help.

Edjunior Barbosa Machado Dec. 19, 2014, 7:28 p.m. UTC | #4

On 12/08/2014 08:08 AM, Anshuman Khandual wrote:
> On 12/03/2014 12:18 PM, Anshuman Khandual wrote:
>> On 12/03/2014 10:52 AM, Michael Ellerman wrote:
>>> On Tue, 2014-02-12 at 07:56:45 UTC, Anshuman Khandual wrote:
>>>> This patch adds four new ELF core note sections for powerpc
>>>> transactional memory and one new ELF core note section for
>>>> powerpc general miscellaneous debug registers. These addition
>>>> of new ELF core note sections extends the existing ELF ABI
>>>> without affecting it in any manner.
>>>>
>>>> Acked-by: Andrew Morton <akpm@linux-foundation.org>
>>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>>> ---
>>>>  include/uapi/linux/elf.h | 5 +++++
>>>>  1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
>>>> index ea9bf25..2260fc0 100644
>>>> --- a/include/uapi/linux/elf.h
>>>> +++ b/include/uapi/linux/elf.h
>>>> @@ -379,6 +379,11 @@ typedef struct elf64_shdr {
>>>>  #define NT_PPC_VMX	0x100		/* PowerPC Altivec/VMX registers */
>>>>  #define NT_PPC_SPE	0x101		/* PowerPC SPE/EVR registers */
>>>>  #define NT_PPC_VSX	0x102		/* PowerPC VSX registers */
>>>> +#define NT_PPC_TM_SPR	0x103		/* PowerPC TM special registers */
>>>> +#define NT_PPC_TM_CGPR	0x104		/* PowerpC TM checkpointed GPR */
>>>> +#define NT_PPC_TM_CFPR	0x105		/* PowerPC TM checkpointed FPR */
>>>> +#define NT_PPC_TM_CVMX	0x106		/* PowerPC TM checkpointed VMX */
>>>> +#define NT_PPC_MISC	0x107		/* PowerPC miscellaneous registers */
>>>
>>> This is a really terrible name, "MISC".
>>>
>>> Having said that, I guess it's accurate. We have a whole bunch of regs that
>>> have accrued over recent years that aren't accessible via ptrace.
>>>
>>> It seems to me if we're adding a misc regset we should be adding everything we
>>> might want to it that is currenty architected.
>>
>> But I believe they also need to be part of the thread_struct structure to be
>> accessible from ptrace.
> 
> Currently we dont context save/restore the PMC count registers (PMC1-PMC6)
> during the process context switch. So the values of PMC1..PMC6 are not
> thread specific in the structure. To be able to access them in ptrace
> when the tracee has stopped, we need to context save these counters
> in the thread struct. Shall we do that ? Then we can add them to the
> MISC regset bucket irrespective of whats the value we get in there when
> we probe through ptrace.
> 
> The same goes for MMCRA, CFAR registers as well.
> 
>>  
>>>
>>> But currently you only include the PPR, TAR & DSCR.
>>
>> Yeah, thats what we started with.
>>
>>>
>>> Looking at Power ISA v2.07, I see the following that could be included:
>>>
>>>   MMCR2
>>>   MMCRA
>>>   PMC1
>>>   PMC2
>>>   PMC3
>>>   PMC4
>>>   PMC5
>>>   PMC6
>>>   MMCR0
>>>   EBBHR
>>>   EBBRR
>>>   BESCR
>>>   SIAR
>>>   SDAR
>>>   CFAR?
>>
>> MMCRA, PMC[1..6], EBBHR, BESCR, EBBRR, CFAR are not part of the thread struct.
> 
> Sorry. EBBRR, EBBHR, BESCR registers are part of the thread struct.
> 
>>
>>>
>>> Those are all new in 2.07 except for CFAR.
>>>
>>> There might be more I missed, that was just a quick scan.
>>>
>>> Some are only accessible when EBB is in use, maybe those could be a separate
>>> regset.
>>
>> Yeah we can have one more regset for EBB specific registers.
> 
> Should the new EBB specific regset include only EBBRR, EBBHR, BESCR registers
> or should it also include SIAR, SDAR, SIER, MMCR0, MMCR2 registers as well. I
> was thinking about putting these five registers into the MISC bucket instead.
> But from the perf code, it looks like these five registers are also related to
> the EBB context as well.
> 
> Some clarity on these points would really help.

Hi,

from the provided testcase using ptrace interface, reviewing with the help
of Ulrich, it looks OK from GDB perspective, with the exception of a few
concerns:

The patchset seems to change the "original" ptrace requests (i.e.
PTRACE_GETREGS/GETFPREGS/GETVRREGS...) to return the "transactional" state, and
adds new register sets to return the "checkpointed" state. Considering that
whenever you get a debugger interception inside a transactional block, the
transaction will abort, we're wondering if it wouldn't make more sense to
display the 'checkpointed' state as the normal registers since this is where the
execution will continue from.

Also, we've noticed that the 'misc' regset contains registers from different ISA
versions (dscr and ppr appear in ISA 2.05, tar is from 2.07). I'm not sure if
there is a way to detect presence/validity of such registers, but perhaps it
might be a good idea to separate registers from different ISAs in different
regsets.

Regarding the inclusion of other registers along with the EBB-related ones, I'm
sorry but I'm not familiar with them.

Thanks and regards,
--
Edjunior

Anshuman Khandual Jan. 1, 2015, 8:08 a.m. UTC | #5

On 12/20/2014 12:58 AM, Edjunior Barbosa Machado wrote:
> On 12/08/2014 08:08 AM, Anshuman Khandual wrote:
>> On 12/03/2014 12:18 PM, Anshuman Khandual wrote:
>>> On 12/03/2014 10:52 AM, Michael Ellerman wrote:
>>>> On Tue, 2014-02-12 at 07:56:45 UTC, Anshuman Khandual wrote:
>>>>> This patch adds four new ELF core note sections for powerpc
>>>>> transactional memory and one new ELF core note section for
>>>>> powerpc general miscellaneous debug registers. These addition
>>>>> of new ELF core note sections extends the existing ELF ABI
>>>>> without affecting it in any manner.
>>>>>
>>>>> Acked-by: Andrew Morton <akpm@linux-foundation.org>
>>>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>>>> ---
>>>>>  include/uapi/linux/elf.h | 5 +++++
>>>>>  1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
>>>>> index ea9bf25..2260fc0 100644
>>>>> --- a/include/uapi/linux/elf.h
>>>>> +++ b/include/uapi/linux/elf.h
>>>>> @@ -379,6 +379,11 @@ typedef struct elf64_shdr {
>>>>>  #define NT_PPC_VMX	0x100		/* PowerPC Altivec/VMX registers */
>>>>>  #define NT_PPC_SPE	0x101		/* PowerPC SPE/EVR registers */
>>>>>  #define NT_PPC_VSX	0x102		/* PowerPC VSX registers */
>>>>> +#define NT_PPC_TM_SPR	0x103		/* PowerPC TM special registers */
>>>>> +#define NT_PPC_TM_CGPR	0x104		/* PowerpC TM checkpointed GPR */
>>>>> +#define NT_PPC_TM_CFPR	0x105		/* PowerPC TM checkpointed FPR */
>>>>> +#define NT_PPC_TM_CVMX	0x106		/* PowerPC TM checkpointed VMX */
>>>>> +#define NT_PPC_MISC	0x107		/* PowerPC miscellaneous registers */
>>>>
>>>> This is a really terrible name, "MISC".
>>>>
>>>> Having said that, I guess it's accurate. We have a whole bunch of regs that
>>>> have accrued over recent years that aren't accessible via ptrace.
>>>>
>>>> It seems to me if we're adding a misc regset we should be adding everything we
>>>> might want to it that is currenty architected.
>>>
>>> But I believe they also need to be part of the thread_struct structure to be
>>> accessible from ptrace.
>>
>> Currently we dont context save/restore the PMC count registers (PMC1-PMC6)
>> during the process context switch. So the values of PMC1..PMC6 are not
>> thread specific in the structure. To be able to access them in ptrace
>> when the tracee has stopped, we need to context save these counters
>> in the thread struct. Shall we do that ? Then we can add them to the
>> MISC regset bucket irrespective of whats the value we get in there when
>> we probe through ptrace.
>>
>> The same goes for MMCRA, CFAR registers as well.
>>
>>>  
>>>>
>>>> But currently you only include the PPR, TAR & DSCR.
>>>
>>> Yeah, thats what we started with.
>>>
>>>>
>>>> Looking at Power ISA v2.07, I see the following that could be included:
>>>>
>>>>   MMCR2
>>>>   MMCRA
>>>>   PMC1
>>>>   PMC2
>>>>   PMC3
>>>>   PMC4
>>>>   PMC5
>>>>   PMC6
>>>>   MMCR0
>>>>   EBBHR
>>>>   EBBRR
>>>>   BESCR
>>>>   SIAR
>>>>   SDAR
>>>>   CFAR?
>>>
>>> MMCRA, PMC[1..6], EBBHR, BESCR, EBBRR, CFAR are not part of the thread struct.
>>
>> Sorry. EBBRR, EBBHR, BESCR registers are part of the thread struct.
>>
>>>
>>>>
>>>> Those are all new in 2.07 except for CFAR.
>>>>
>>>> There might be more I missed, that was just a quick scan.
>>>>
>>>> Some are only accessible when EBB is in use, maybe those could be a separate
>>>> regset.
>>>
>>> Yeah we can have one more regset for EBB specific registers.
>>
>> Should the new EBB specific regset include only EBBRR, EBBHR, BESCR registers
>> or should it also include SIAR, SDAR, SIER, MMCR0, MMCR2 registers as well. I
>> was thinking about putting these five registers into the MISC bucket instead.
>> But from the perf code, it looks like these five registers are also related to
>> the EBB context as well.
>>
>> Some clarity on these points would really help.
> 
> Hi,
> 
> from the provided testcase using ptrace interface, reviewing with the help
> of Ulrich, it looks OK from GDB perspective, with the exception of a few
> concerns:
> 
> The patchset seems to change the "original" ptrace requests (i.e.
> PTRACE_GETREGS/GETFPREGS/GETVRREGS...) to return the "transactional" state, and
> adds new register sets to return the "checkpointed" state. Considering that
> whenever you get a debugger interception inside a transactional block, the
> transaction will abort, we're wondering if it wouldn't make more sense to 
> display the 'checkpointed' state as the normal registers since this is where the
> execution will continue from.

Debugger interception (trace interrupt) in between any transaction block will abort
it ? I doubt that. The tracee process will just stop, it's context gets saved in the
kernel so that it can again start executing from the exact same point onward when it
resumes. If this happens when inside any transaction block, the transaction's running
context and check pointed context will get saved. The execution will again start from
the running context values instead of check pointed when the process resumes. Check
pointed values will be loaded back into the context when the transaction finishes.
Inside transaction both running and check pointed values can be probed independently.

> 
> Also, we've noticed that the 'misc' regset contains registers from different ISA
> versions (dscr and ppr appear in ISA 2.05, tar is from 2.07). I'm not sure if
> there is a way to detect presence/validity of such registers, but perhaps it
> might be a good idea to separate registers from different ISAs in different
> regsets.

Thats right, will use feature CPU_FTR_ARCH_207S (which checks whether we are v2.07
compliant) to detect whether TAR register is available or not. 

> 
> Regarding the inclusion of other registers along with the EBB-related ones, I'm
> sorry but I'm not familiar with them.

Michael Ellerman mentioned that we can look into them separately sometime later
not in this patch series.

Anshuman Khandual Jan. 14, 2015, 4:44 a.m. UTC | #6

On 01/01/2015 01:38 PM, Anshuman Khandual wrote:
>> > Also, we've noticed that the 'misc' regset contains registers from different ISA
>> > versions (dscr and ppr appear in ISA 2.05, tar is from 2.07). I'm not sure if
>> > there is a way to detect presence/validity of such registers, but perhaps it
>> > might be a good idea to separate registers from different ISAs in different
>> > regsets.
> Thats right, will use feature CPU_FTR_ARCH_207S (which checks whether we are v2.07
> compliant) to detect whether TAR register is available or not. 
> 

Need to correct something here. Run time detection of the presence of TAR register
through the feature bit CPU_FTR_ARCH_207S as I had mentioned before is not required.
Right now we take care of the compile time availability of the individual registers
the same way it is present on the thread struct. In the systems which do not have the
TAR register, thread.tar is always going to be 0 which is exactly the same value we
would get by excluding tar register copy after the run time detection of the feature.

Michael Neuling Jan. 21, 2015, 11:39 p.m. UTC | #7

On Thu, 2015-01-01 at 13:38 +0530, Anshuman Khandual wrote:
> On 12/20/2014 12:58 AM, Edjunior Barbosa Machado wrote:
> > On 12/08/2014 08:08 AM, Anshuman Khandual wrote:
> >> On 12/03/2014 12:18 PM, Anshuman Khandual wrote:
> >>> On 12/03/2014 10:52 AM, Michael Ellerman wrote:
> >>>> On Tue, 2014-02-12 at 07:56:45 UTC, Anshuman Khandual wrote:
> >>>>> This patch adds four new ELF core note sections for powerpc
> >>>>> transactional memory and one new ELF core note section for
> >>>>> powerpc general miscellaneous debug registers. These addition
> >>>>> of new ELF core note sections extends the existing ELF ABI
> >>>>> without affecting it in any manner.
> >>>>>
> >>>>> Acked-by: Andrew Morton <akpm@linux-foundation.org>
> >>>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> >>>>> ---
> >>>>>  include/uapi/linux/elf.h | 5 +++++
> >>>>>  1 file changed, 5 insertions(+)
> >>>>>
> >>>>> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
> >>>>> index ea9bf25..2260fc0 100644
> >>>>> --- a/include/uapi/linux/elf.h
> >>>>> +++ b/include/uapi/linux/elf.h
> >>>>> @@ -379,6 +379,11 @@ typedef struct elf64_shdr {
> >>>>>  #define NT_PPC_VMX	0x100		/* PowerPC Altivec/VMX registers */
> >>>>>  #define NT_PPC_SPE	0x101		/* PowerPC SPE/EVR registers */
> >>>>>  #define NT_PPC_VSX	0x102		/* PowerPC VSX registers */
> >>>>> +#define NT_PPC_TM_SPR	0x103		/* PowerPC TM special registers */
> >>>>> +#define NT_PPC_TM_CGPR	0x104		/* PowerpC TM checkpointed GPR */
> >>>>> +#define NT_PPC_TM_CFPR	0x105		/* PowerPC TM checkpointed FPR */
> >>>>> +#define NT_PPC_TM_CVMX	0x106		/* PowerPC TM checkpointed VMX */
> >>>>> +#define NT_PPC_MISC	0x107		/* PowerPC miscellaneous registers */
> >>>>
> >>>> This is a really terrible name, "MISC".
> >>>>
> >>>> Having said that, I guess it's accurate. We have a whole bunch of regs that
> >>>> have accrued over recent years that aren't accessible via ptrace.
> >>>>
> >>>> It seems to me if we're adding a misc regset we should be adding everything we
> >>>> might want to it that is currenty architected.
> >>>
> >>> But I believe they also need to be part of the thread_struct structure to be
> >>> accessible from ptrace.
> >>
> >> Currently we dont context save/restore the PMC count registers (PMC1-PMC6)
> >> during the process context switch. So the values of PMC1..PMC6 are not
> >> thread specific in the structure. To be able to access them in ptrace
> >> when the tracee has stopped, we need to context save these counters
> >> in the thread struct. Shall we do that ? Then we can add them to the
> >> MISC regset bucket irrespective of whats the value we get in there when
> >> we probe through ptrace.
> >>
> >> The same goes for MMCRA, CFAR registers as well.
> >>
> >>>  
> >>>>
> >>>> But currently you only include the PPR, TAR & DSCR.
> >>>
> >>> Yeah, thats what we started with.
> >>>
> >>>>
> >>>> Looking at Power ISA v2.07, I see the following that could be included:
> >>>>
> >>>>   MMCR2
> >>>>   MMCRA
> >>>>   PMC1
> >>>>   PMC2
> >>>>   PMC3
> >>>>   PMC4
> >>>>   PMC5
> >>>>   PMC6
> >>>>   MMCR0
> >>>>   EBBHR
> >>>>   EBBRR
> >>>>   BESCR
> >>>>   SIAR
> >>>>   SDAR
> >>>>   CFAR?
> >>>
> >>> MMCRA, PMC[1..6], EBBHR, BESCR, EBBRR, CFAR are not part of the thread struct.
> >>
> >> Sorry. EBBRR, EBBHR, BESCR registers are part of the thread struct.
> >>
> >>>
> >>>>
> >>>> Those are all new in 2.07 except for CFAR.
> >>>>
> >>>> There might be more I missed, that was just a quick scan.
> >>>>
> >>>> Some are only accessible when EBB is in use, maybe those could be a separate
> >>>> regset.
> >>>
> >>> Yeah we can have one more regset for EBB specific registers.
> >>
> >> Should the new EBB specific regset include only EBBRR, EBBHR, BESCR registers
> >> or should it also include SIAR, SDAR, SIER, MMCR0, MMCR2 registers as well. I
> >> was thinking about putting these five registers into the MISC bucket instead.
> >> But from the perf code, it looks like these five registers are also related to
> >> the EBB context as well.
> >>
> >> Some clarity on these points would really help.
> > 
> > Hi,
> > 
> > from the provided testcase using ptrace interface, reviewing with the help
> > of Ulrich, it looks OK from GDB perspective, with the exception of a few
> > concerns:
> > 
> > The patchset seems to change the "original" ptrace requests (i.e.
> > PTRACE_GETREGS/GETFPREGS/GETVRREGS...) to return the "transactional" state, and
> > adds new register sets to return the "checkpointed" state. Considering that
> > whenever you get a debugger interception inside a transactional block, the
> > transaction will abort, we're wondering if it wouldn't make more sense to 
> > display the 'checkpointed' state as the normal registers since this is where the
> > execution will continue from.
> 
> Debugger interception (trace interrupt) in between any transaction block will abort
> it ? I doubt that.

The trace interrupt will not abort the transaction explicitly...

> The tracee process will just stop, it's context gets saved in the
> kernel so that it can again start executing from the exact same point onward when it
> resumes. 

... unfortunately, this save *will* doom the transaction.  To save, a
treclaim instruction is run which will always explicitly doom the
transaction.

> If this happens when inside any transaction block, the transaction's running
> context and check pointed context will get saved. The execution will again start from
> the running context values instead of check pointed when the process resumes. Check
> pointed values will be loaded back into the context when the transaction finishes.

Although since the transaction has been explicitly doomed, the hardware
will *always* abort at this point and start execution from the
checkpointed values.

> Inside transaction both running and check pointed values can be probed independently.

Yep, that's the idea, although setting the running values won't change
anything since the the translation is already doomed and will abort once
the cpu starts executing it.  

Mikey

> > 
> > Also, we've noticed that the 'misc' regset contains registers from different ISA
> > versions (dscr and ppr appear in ISA 2.05, tar is from 2.07). I'm not sure if
> > there is a way to detect presence/validity of such registers, but perhaps it
> > might be a good idea to separate registers from different ISAs in different
> > regsets.
> 
> Thats right, will use feature CPU_FTR_ARCH_207S (which checks whether we are v2.07
> compliant) to detect whether TAR register is available or not. 
> 
> > 
> > Regarding the inclusion of other registers along with the EBB-related ones, I'm
> > sorry but I'm not familiar with them.
> 
> Michael Ellerman mentioned that we can look into them separately sometime later
> not in this patch series.
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

Ulrich Weigand Jan. 22, 2015, 3:55 p.m. UTC | #8

Michael Neuling <mikey@neuling.org> wrote on 22.01.2015 00:39:57:
> On Thu, 2015-01-01 at 13:38 +0530, Anshuman Khandual wrote:
> > On 12/20/2014 12:58 AM, Edjunior Barbosa Machado wrote:
> > > The patchset seems to change the "original" ptrace requests (i.e.
> > > PTRACE_GETREGS/GETFPREGS/GETVRREGS...) to return the
> > > "transactional" state, and adds new register sets to return
> > > the "checkpointed" state.  Considering that whenever you get
> > > a debugger interception inside a transactional block, the
> > > transaction will abort, we're wondering if it wouldn't make
> > > more sense to display the 'checkpointed' state as the normal
> > > registers since this is where the execution will continue from.
> >
> > Debugger interception (trace interrupt) in between any transaction
> > block will abort it ? I doubt that.
>
> The trace interrupt will not abort the transaction explicitly...
>
> > The tracee process will just stop, it's context gets saved in the
> > kernel so that it can again start executing from the exact same
> > point onward when it resumes.
>
> ... unfortunately, this save *will* doom the transaction.  To save, a
> treclaim instruction is run which will always explicitly doom the
> transaction.
>
> > If this happens when inside any transaction block, the transaction's
running
> > context and check pointed context will get saved. The execution
> > will again start from the running context values instead of check
> > pointed when the process resumes. Check pointed values will be loaded
> > back into the context when the transaction finishes.
>
> Although since the transaction has been explicitly doomed, the hardware
> will *always* abort at this point and start execution from the
> checkpointed values.
>
> > Inside transaction both running and check pointed values can be
> > probed independently.
>
> Yep, that's the idea, although setting the running values won't change
> anything since the the translation is already doomed and will abort once
> the cpu starts executing it.

So this looks to me like the overall effect on debugging transactional
code should be the same on Power and z, even if some internal details
are different (on z, the exception will automatically abort the
transaction; on p, the exception itself will not abort, but *restarting*
user space execution will).

From a GDB perspective, it would therefore be preferable if the ptrace
interface were to behave in a similar fashion on p as on z: that is,
if an exception interrupting a transaction results in a ptrace intercept,
at this point:

- the "normal" ptrace register set commands should access the
  *checkpointed* registers (allowing both read and write access)

  -- GDB will use this to display current position (already reflecting
  the fact that the transaction will abort), and use it when changing
  register values e.g. to effect an inferior function call

- a new ptrace register set should allow access (read-only) to the
  *running* register values

  -- GDB can use this to display the position inside the transaction
  at the point it aborted, using new transaction-specific commands


Bye,
Ulrich

Michael Neuling Jan. 22, 2015, 9:44 p.m. UTC | #9

> > > Inside transaction both running and check pointed values can be
> > > probed independently.
> >
> > Yep, that's the idea, although setting the running values won't change
> > anything since the the translation is already doomed and will abort once
> > the cpu starts executing it.
> 
> So this looks to me like the overall effect on debugging transactional
> code should be the same on Power and z, even if some internal details
> are different (on z, the exception will automatically abort the
> transaction; on p, the exception itself will not abort, but *restarting*
> user space execution will).

Yep

> From a GDB perspective, it would therefore be preferable if the ptrace
> interface were to behave in a similar fashion on p as on z: that is,
> if an exception interrupting a transaction results in a ptrace intercept,
> at this point:

Agreed.

> - the "normal" ptrace register set commands should access the
>   *checkpointed* registers (allowing both read and write access)

OK, this is a change from what we've been proposing with Anshuman's
patch set but I'm happy to change it to make it consistent with other
architectures.  It's relatively arbitrary which goes where, so I'm happy
to change.

>   -- GDB will use this to display current position (already reflecting
>   the fact that the transaction will abort), and use it when changing
>   register values e.g. to effect an inferior function call

"Current position" depends on your perspective.  Is it the last executed
instruction or the next executed instruction?  If it's the last executed
instruction, then it's the running values.  If it's the next, then it's
the check pointed.  

Anyway, I'm happy to make it the check pointed values for the sake of
ptrace/gdb.

> - a new ptrace register set should allow access (read-only) to the
>   *running* register values

This is because changing them won't ever result in a side effect?  

>   -- GDB can use this to display the position inside the transaction
>   at the point it aborted, using new transaction-specific commands

Yep.

Mikey

PS in the subject s/specifc/specific/

Michael Neuling Jan. 28, 2015, 4:28 a.m. UTC | #10

On Fri, 2015-01-23 at 08:44 +1100, Michael Neuling wrote:
> > > > Inside transaction both running and check pointed values can be
> > > > probed independently.
> > >
> > > Yep, that's the idea, although setting the running values won't change
> > > anything since the the translation is already doomed and will abort once
> > > the cpu starts executing it.
> > 
> > So this looks to me like the overall effect on debugging transactional
> > code should be the same on Power and z, even if some internal details
> > are different (on z, the exception will automatically abort the
> > transaction; on p, the exception itself will not abort, but *restarting*
> > user space execution will).
> 
> Yep
> 
> > From a GDB perspective, it would therefore be preferable if the ptrace
> > interface were to behave in a similar fashion on p as on z: that is,
> > if an exception interrupting a transaction results in a ptrace intercept,
> > at this point:
> 
> Agreed.
> 
> > - the "normal" ptrace register set commands should access the
> >   *checkpointed* registers (allowing both read and write access)
> 
> OK, this is a change from what we've been proposing with Anshuman's
> patch set but I'm happy to change it to make it consistent with other
> architectures.  It's relatively arbitrary which goes where, so I'm happy
> to change.
> 
> >   -- GDB will use this to display current position (already reflecting
> >   the fact that the transaction will abort), and use it when changing
> >   register values e.g. to effect an inferior function call
> 
> "Current position" depends on your perspective.  Is it the last executed
> instruction or the next executed instruction?  If it's the last executed
> instruction, then it's the running values.  If it's the next, then it's
> the check pointed.  
> 
> Anyway, I'm happy to make it the check pointed values for the sake of
> ptrace/gdb.

Uli,

Sorry, I'm rethinking this as we didn't consider user suspended
transactions.

It makes sense for normal transactions but for user suspended
transactions the running values are the ones you want to modify since
that is where you'll end up restarting from.  The hardware will only
abort/rollback once a tresume is encountered.
      * 
So we could do what you're talking about for normal transactions and
then switch them for suspended transactions.  But this just seems to be
making the kernel interface overly complicated.  

So I'm keen on just keeping it the way Anshuman has now and GDB has to
understand the program flow better to know which ones it wants to
modify.  The kernel always provides the "normal" set as running and the
new set as check pointed.  GDB then has to check the MSR to work out
what it wants to modify.

> > - a new ptrace register set should allow access (read-only) to the
> >   *running* register values
> 
> This is because changing them won't ever result in a side effect?  

For the same reason as above, we need to be able to modify the running
values when it's a user suspended transaction.  So I don't agree with
this any more in the case of user suspended transaction.  We need to be
able to modify both sets of registers.

Mikey

Ulrich Weigand Feb. 6, 2015, 2:47 p.m. UTC | #11

Michael Neuling <mikey@neuling.org> wrote on 28.01.2015 05:28:09:

> Sorry, I'm rethinking this as we didn't consider user suspended
> transactions.
>
> It makes sense for normal transactions but for user suspended
> transactions the running values are the ones you want to modify since
> that is where you'll end up restarting from.  The hardware will only
> abort/rollback once a tresume is encountered.

OK, I see.  I hadn't really looked into suspended transactions before ...

> So we could do what you're talking about for normal transactions and
> then switch them for suspended transactions.  But this just seems to be
> making the kernel interface overly complicated.

Agreed.  Given that there seems to be an inevitable difference in how
transactions are seen by the debugger between Power and z (there are
no suspended transactions on z), I guess it makes more sense to have
the interface naturally model the hardware register sets on Power,
and have GDB cope with the differences.

> So I'm keen on just keeping it the way Anshuman has now and GDB has to
> understand the program flow better to know which ones it wants to
> modify.  The kernel always provides the "normal" set as running and the
> new set as check pointed.  GDB then has to check the MSR to work out
> what it wants to modify.

So I'm thinking how this all would work out for the case of GDB wanting
to call an inferior function while the process is stopped within a
transaction (in either transactional or suspended state).

(A) In suspended state, I guess GDB could modify the "normal" register
set and ask the kernel to continue.  The kernel would then transfer
control to the inferior function entry point while still in suspended
state, and the inferior function would execute in suspended state until
it hits the GDB-placed breakpoint at the end.  At this point, GDB would
reload the original values to the "normal" register set, and ask the
kernel to continue.  The kernel would then transfer control to the
originally-interrupted piece of code in suspended state, which would
continue to execute until the tresume, at which point the transaction
would abort (due to TDOOMED).  Right?

(B) In transactional state, GDB could modify the "checkpointed" register
set and ask the kernel to continue.  The kernel would transfer control
to the originally-interrupted code in transactional state.  At this
point, the transaction would immediately abort, and control would
transfer to the state specified in the checkpointed register set,
and the inferior function would now execute in non-transactional
state, until it hits the breakpoint.  At this point, GDB would now
have to restore the original values of the *checkpointed* register
set into the *normal* register set (a bit weird from a GDB internals
perspective), and ask the kernel to continue.  Execution would now
resume at the abort handler of the originally interrupted transaction.
(Hmm.  Maybe GDB would also have to set cr0 or something?)

Is it possible for GDB to change the state (transactional vs.
suspended) where the kernel ought to let the application continue in?
I guess via modifying the appropriate MSR bits via ptrace?

If so, there would be other options to handle this:

(A') In the transactional state, GDB could set the MSR to suspended,
and modify the "normal" register set.  The inferior function would
execute in the suspended state as in (A) above.  Upon return, GDB
would restore the normal register set (including restoring the MSR
to transactional state).  Once the kernel resumes the application,
the transaction would abort at this point.

(B') In the suspended state, GDB could set the MSR to transactional,
and modify the "checkpointed" register set.  Once the kernel resumes
the application, the transaction immediately aborts and control
transfers to the inferior function, executing in nontransactional
state as in (B).  Upon return, GDB would again need to restore the
original checkpointed register set into the normal register set,
set cr0 to indicate transactional abort, and continue.


Using the combination of (A)+(A') would be easiest to implement
in GDB without modifying a lot of common code, and would have the
advantage that the inferior function always executes in the same
state (suspended), while leaving information about the interrupted
transaction visible.

Using the combination of (B)+(B') would be a bit more difficult
to implement (but certainly feasible), and would have the advantage
that the inferior function always executes in nontransactional state
(which is what it would most likely expect, anyway).  However, the
disadvantage is that after the inferior call returns, GDB is unable
to fully restore the visible inferior state as it was before (since
we're now in nontransactional state, and there is probably no way
to force us back into transactional/suspended state ...).

Am I missing something here?  Any additional thoughts?


> > > - a new ptrace register set should allow access (read-only) to the
> > >   *running* register values
> >
> > This is because changing them won't ever result in a side effect?
>
> For the same reason as above, we need to be able to modify the running
> values when it's a user suspended transaction.  So I don't agree with
> this any more in the case of user suspended transaction.  We need to be
> able to modify both sets of registers.

So I guess the kernel, on resuming the application, first loads the
checkpointed register set into normal registers, issues the trechkpt
instruction to move them to the checkpointed register, and then loads
the normal register set, before returning to user space?  Then it does
indeed appear that both sets *are* modifyable, and thus the kernel
interface should export them as such.

Bye,
Ulrich

Michael Neuling Feb. 23, 2015, 4:51 a.m. UTC | #12

Uli,

Sorry for the slow response.

> Michael Neuling <mikey@neuling.org> wrote on 28.01.2015 05:28:09:
> 
> > Sorry, I'm rethinking this as we didn't consider user suspended
> > transactions.
> >
> > It makes sense for normal transactions but for user suspended
> > transactions the running values are the ones you want to modify since
> > that is where you'll end up restarting from.  The hardware will only
> > abort/rollback once a tresume is encountered.
> 
> OK, I see.  I hadn't really looked into suspended transactions before ...
>
> > So we could do what you're talking about for normal transactions and
> > then switch them for suspended transactions.  But this just seems to be
> > making the kernel interface overly complicated.
> 
> Agreed.  Given that there seems to be an inevitable difference in how
> transactions are seen by the debugger between Power and z (there are
> no suspended transactions on z), I guess it makes more sense to have
> the interface naturally model the hardware register sets on Power,
> and have GDB cope with the differences.
>
> > So I'm keen on just keeping it the way Anshuman has now and GDB has to
> > understand the program flow better to know which ones it wants to
> > modify.  The kernel always provides the "normal" set as running and the
> > new set as check pointed.  GDB then has to check the MSR to work out
> > what it wants to modify.
> 
> So I'm thinking how this all would work out for the case of GDB wanting
> to call an inferior function while the process is stopped within a
> transaction (in either transactional or suspended state).

Should this inferior function be run in the current mode of the
processor?  ie if the process is currently transactional and the
transaction aborts, should we be able to see any global state change
because of an inferior function being run in GDB?  

Also, if you modify the stack in suspend mode, that'll be persistent.
So it's possible that you could corrupt your stack if you abort.  For
example, if your tbegin is inside a function (one or more deep) that
returns (one or more times while transactional), you need make sure you
don't touch the stack non-transactionally if you want to be able to
abort and not corrupt your stack.

I think what you're proposing with running the inferior function in
suspend mode may end up corrupting the stack in this way.  You'd need to
be really careful to make sure the inferior function is run on the stack
pointer of the checkpointed registers.  

We do something like this in the kernel when laying out a signal frame
on the user stack when the user is transactional.  We lay it out on the
checkpointed stack pointer/r1.  (The best way for users to avoid this
problem is to use sigaltstack() but we can't rely on that).

> (A) In suspended state, I guess GDB could modify the "normal" register
> set and ask the kernel to continue.  The kernel would then transfer
> control to the inferior function entry point while still in suspended
> state, and the inferior function would execute in suspended state until
> it hits the GDB-placed breakpoint at the end.  At this point, GDB would
> reload the original values to the "normal" register set, and ask the
> kernel to continue.  The kernel would then transfer control to the
> originally-interrupted piece of code in suspended state, which would
> continue to execute until the tresume, at which point the transaction
> would abort (due to TDOOMED).  Right?

I think so.

> (B) In transactional state, GDB could modify the "checkpointed" register
> set and ask the kernel to continue.  The kernel would transfer control
> to the originally-interrupted code in transactional state.  At this
> point, the transaction would immediately abort, and control would
> transfer to the state specified in the checkpointed register set,
> and the inferior function would now execute in non-transactional
> state, until it hits the breakpoint.  At this point, GDB would now
> have to restore the original values of the *checkpointed* register
> set into the *normal* register set (a bit weird from a GDB internals
> perspective), and ask the kernel to continue.  Execution would now
> resume at the abort handler of the originally interrupted transaction.
> (Hmm.  Maybe GDB would also have to set cr0 or something?)

OK, but do we want the inferior function to be run non-transactionally?
Should it changes be seen after the transaction aborts?  

> Is it possible for GDB to change the state (transactional vs.
> suspended) where the kernel ought to let the application continue in?
> I guess via modifying the appropriate MSR bits via ptrace?

Yes, you can change the MSR to change the TM mode.

> If so, there would be other options to handle this:
> 
> (A') In the transactional state, GDB could set the MSR to suspended,
> and modify the "normal" register set.  The inferior function would
> execute in the suspended state as in (A) above.  Upon return, GDB
> would restore the normal register set (including restoring the MSR
> to transactional state).  Once the kernel resumes the application,
> the transaction would abort at this point.

This seems ok, although again, the inferior function would have side
effects even if the transaction aborts.  Is that what we want?  Can we
say to users that the inferior function is actually be run after the
transaction aborts?

Also, we need to be careful with the stack.  

> (B') In the suspended state, GDB could set the MSR to transactional,
> and modify the "checkpointed" register set.  Once the kernel resumes
> the application, the transaction immediately aborts and control
> transfers to the inferior function, executing in nontransactional
> state as in (B).  Upon return, GDB would again need to restore the
> original checkpointed register set into the normal register set,
> set cr0 to indicate transactional abort, and continue.

OK.

> Using the combination of (A)+(A') would be easiest to implement
> in GDB without modifying a lot of common code, and would have the
> advantage that the inferior function always executes in the same
> state (suspended), while leaving information about the interrupted
> transaction visible.
> 
> Using the combination of (B)+(B') would be a bit more difficult
> to implement (but certainly feasible), and would have the advantage
> that the inferior function always executes in nontransactional state
> (which is what it would most likely expect, anyway).  However, the
> disadvantage is that after the inferior call returns, GDB is unable
> to fully restore the visible inferior state as it was before (since
> we're now in nontransactional state, and there is probably no way
> to force us back into transactional/suspended state ...).

Yep.

> Am I missing something here?  Any additional thoughts?
>
> > > > - a new ptrace register set should allow access (read-only) to the
> > > >   *running* register values
> > >
> > > This is because changing them won't ever result in a side effect?
> >
> > For the same reason as above, we need to be able to modify the running
> > values when it's a user suspended transaction.  So I don't agree with
> > this any more in the case of user suspended transaction.  We need to be
> > able to modify both sets of registers.
> 
> So I guess the kernel, on resuming the application, first loads the
> checkpointed register set into normal registers, issues the trechkpt
> instruction to move them to the checkpointed register, and then loads
> the normal register set, before returning to user space?  

Correct.

> Then it does
> indeed appear that both sets *are* modifyable, and thus the kernel
> interface should export them as such.

Correct.

Getting back to the kernel interface, are you happy with what Anshuman
has proposed in the current series?

Regards,
Mikey

Ulrich Weigand March 18, 2015, 12:53 p.m. UTC | #13

Michael Neuling <mikey@neuling.org> wrote on 23.02.2015 05:51:50:

> Sorry for the slow response.

Same here :-(

> Should this inferior function be run in the current mode of the
> processor?  ie if the process is currently transactional and the
> transaction aborts, should we be able to see any global state change
> because of an inferior function being run in GDB?

Inferior functions IMO should *not* run in transactional mode.
A typical inferior call is to malloc to get some memory to be used
by GDB to store data in the inferior ... you wouldn't want to call
malloc inside a transaction.

Now, whether inferior functions can run in *suspended* mode
(i.e. causing the transaction to abort after the inferior call
has returned) or have to run in nontransactional mode (aborting
the transaction befer the inferior call), I'm not sure.  Unless
there are some unexpected side effect, it seems cleaner to run
in suspended mode, since this way we can restore the state we
were in before the inferior call after the call (which is what
user would tend to expect).

On the other hand, inferior function calls do not necessarily
return.  In fact, they could throw or longjmp to some routine
higher up on the stack, in which case GDB will never get control
back.  The rest of program execution would then be done in suspended
mode (and not nontransactional mode), at least until code attempts
to start the next transaction.  Not sure if this is an issue.

> Also, if you modify the stack in suspend mode, that'll be persistent.
> So it's possible that you could corrupt your stack if you abort.  For
> example, if your tbegin is inside a function (one or more deep) that
> returns (one or more times while transactional), you need make sure you
> don't touch the stack non-transactionally if you want to be able to
> abort and not corrupt your stack.

I see.

> I think what you're proposing with running the inferior function in
> suspend mode may end up corrupting the stack in this way.  You'd need to
> be really careful to make sure the inferior function is run on the stack
> pointer of the checkpointed registers.

On the other hand, if code called a subroutine after the tbegin, if we
were using the checkpointed r1, this might corrupt the stack of the
transactional code.  (This code will never actually *run* again since
the transaction is doomed, but we can still *inspect* it in GDB after
the inferior call has returned, so the stack should remain unchanged.
Well .. if the transaction is suspended, the code might in fact still
run, so it should remain unchanged either way.)

I guess we could use the minimum of transactional and checkpointed r1
in that case, to be safe either way.

> > (A) In suspended state, I guess GDB could modify the "normal" register
> > set and ask the kernel to continue.  The kernel would then transfer
> > control to the inferior function entry point while still in suspended
> > state, and the inferior function would execute in suspended state until
> > it hits the GDB-placed breakpoint at the end.  At this point, GDB would
> > reload the original values to the "normal" register set, and ask the
> > kernel to continue.  The kernel would then transfer control to the
> > originally-interrupted piece of code in suspended state, which would
> > continue to execute until the tresume, at which point the transaction
> > would abort (due to TDOOMED).  Right?
>
> I think so.

Good.

> > (B) In transactional state, GDB could modify the "checkpointed"
register
> > set and ask the kernel to continue.  The kernel would transfer control
> > to the originally-interrupted code in transactional state.  At this
> > point, the transaction would immediately abort, and control would
> > transfer to the state specified in the checkpointed register set,
> > and the inferior function would now execute in non-transactional
> > state, until it hits the breakpoint.  At this point, GDB would now
> > have to restore the original values of the *checkpointed* register
> > set into the *normal* register set (a bit weird from a GDB internals
> > perspective), and ask the kernel to continue.  Execution would now
> > resume at the abort handler of the originally interrupted transaction.
> > (Hmm.  Maybe GDB would also have to set cr0 or something?)
>
> OK, but do we want the inferior function to be run non-transactionally?
> Should it changes be seen after the transaction aborts?

Yes, see above.

> > Is it possible for GDB to change the state (transactional vs.
> > suspended) where the kernel ought to let the application continue in?
> > I guess via modifying the appropriate MSR bits via ptrace?
>
> Yes, you can change the MSR to change the TM mode.

OK, that's good.

> > If so, there would be other options to handle this:
> >
> > (A') In the transactional state, GDB could set the MSR to suspended,
> > and modify the "normal" register set.  The inferior function would
> > execute in the suspended state as in (A) above.  Upon return, GDB
> > would restore the normal register set (including restoring the MSR
> > to transactional state).  Once the kernel resumes the application,
> > the transaction would abort at this point.
>
> This seems ok, although again, the inferior function would have side
> effects even if the transaction aborts.  Is that what we want?  Can we
> say to users that the inferior function is actually be run after the
> transaction aborts?
>
> Also, we need to be careful with the stack.

See above.

> > (B') In the suspended state, GDB could set the MSR to transactional,
> > and modify the "checkpointed" register set.  Once the kernel resumes
> > the application, the transaction immediately aborts and control
> > transfers to the inferior function, executing in nontransactional
> > state as in (B).  Upon return, GDB would again need to restore the
> > original checkpointed register set into the normal register set,
> > set cr0 to indicate transactional abort, and continue.
>
> OK.
>
> > Using the combination of (A)+(A') would be easiest to implement
> > in GDB without modifying a lot of common code, and would have the
> > advantage that the inferior function always executes in the same
> > state (suspended), while leaving information about the interrupted
> > transaction visible.
> >
> > Using the combination of (B)+(B') would be a bit more difficult
> > to implement (but certainly feasible), and would have the advantage
> > that the inferior function always executes in nontransactional state
> > (which is what it would most likely expect, anyway).  However, the
> > disadvantage is that after the inferior call returns, GDB is unable
> > to fully restore the visible inferior state as it was before (since
> > we're now in nontransactional state, and there is probably no way
> > to force us back into transactional/suspended state ...).
>
> Yep.

So right now I'd tend to prefer (A)+(A'), but the important thing is
that the kernel seems to provide all features required for GDB to
implement any of the above, so we can still make that decision later.

> Getting back to the kernel interface, are you happy with what Anshuman
> has proposed in the current series?

Given the discussion above, this seems fine to me now.

Bye,
Ulrich

Michael Neuling March 18, 2015, 10:45 p.m. UTC | #14

On Wed, 2015-03-18 at 13:53 +0100, Ulrich Weigand wrote:
> Michael Neuling <mikey@neuling.org> wrote on 23.02.2015 05:51:50:
> 
> > Sorry for the slow response.
> 
> Same here :-(

I'm going to break the cycle and respond in a few hours :-)


> > I think what you're proposing with running the inferior function in
> > suspend mode may end up corrupting the stack in this way.  You'd need to
> > be really careful to make sure the inferior function is run on the stack
> > pointer of the checkpointed registers.
> 
> On the other hand, if code called a subroutine after the tbegin, if we
> were using the checkpointed r1, this might corrupt the stack of the
> transactional code.  (This code will never actually *run* again since
> the transaction is doomed, but we can still *inspect* it in GDB after
> the inferior call has returned, so the stack should remain unchanged.
> Well .. if the transaction is suspended, the code might in fact still
> run, so it should remain unchanged either way.)
> 
> I guess we could use the minimum of transactional and checkpointed r1
> in that case, to be safe either way.

Sounds good.

<snip>

> > > Using the combination of (A)+(A') would be easiest to implement
> > > in GDB without modifying a lot of common code, and would have the
> > > advantage that the inferior function always executes in the same
> > > state (suspended), while leaving information about the interrupted
> > > transaction visible.
> > >
> > > Using the combination of (B)+(B') would be a bit more difficult
> > > to implement (but certainly feasible), and would have the advantage
> > > that the inferior function always executes in nontransactional state
> > > (which is what it would most likely expect, anyway).  However, the
> > > disadvantage is that after the inferior call returns, GDB is unable
> > > to fully restore the visible inferior state as it was before (since
> > > we're now in nontransactional state, and there is probably no way
> > > to force us back into transactional/suspended state ...).
> >
> > Yep.
> 
> So right now I'd tend to prefer (A)+(A'), but the important thing is
> that the kernel seems to provide all features required for GDB to
> implement any of the above, so we can still make that decision later.
> 
> > Getting back to the kernel interface, are you happy with what Anshuman
> > has proposed in the current series?
> 
> Given the discussion above, this seems fine to me now.

Great, we'll push through with this in mind.

Thanks!
Mikey

Michael Neuling March 18, 2015, 10:50 p.m. UTC | #15

On Thu, 2015-03-19 at 09:45 +1100, Michael Neuling wrote:
> On Wed, 2015-03-18 at 13:53 +0100, Ulrich Weigand wrote:
> > Michael Neuling <mikey@neuling.org> wrote on 23.02.2015 05:51:50:
> > 
> > > Sorry for the slow response.
> > 
> > Same here :-(
> 
> I'm going to break the cycle and respond in a few hours :-)
> 
> 
> > > I think what you're proposing with running the inferior function in
> > > suspend mode may end up corrupting the stack in this way.  You'd need to
> > > be really careful to make sure the inferior function is run on the stack
> > > pointer of the checkpointed registers.
> > 
> > On the other hand, if code called a subroutine after the tbegin, if we
> > were using the checkpointed r1, this might corrupt the stack of the
> > transactional code.  (This code will never actually *run* again since
> > the transaction is doomed, but we can still *inspect* it in GDB after
> > the inferior call has returned, so the stack should remain unchanged.
> > Well .. if the transaction is suspended, the code might in fact still
> > run, so it should remain unchanged either way.)
> > 
> > I guess we could use the minimum of transactional and checkpointed r1
> > in that case, to be safe either way.
> 
> Sounds good.
> 
> <snip>
> 
> > > > Using the combination of (A)+(A') would be easiest to implement
> > > > in GDB without modifying a lot of common code, and would have the
> > > > advantage that the inferior function always executes in the same
> > > > state (suspended), while leaving information about the interrupted
> > > > transaction visible.
> > > >
> > > > Using the combination of (B)+(B') would be a bit more difficult
> > > > to implement (but certainly feasible), and would have the advantage
> > > > that the inferior function always executes in nontransactional state
> > > > (which is what it would most likely expect, anyway).  However, the
> > > > disadvantage is that after the inferior call returns, GDB is unable
> > > > to fully restore the visible inferior state as it was before (since
> > > > we're now in nontransactional state, and there is probably no way
> > > > to force us back into transactional/suspended state ...).
> > >
> > > Yep.
> > 
> > So right now I'd tend to prefer (A)+(A'), but the important thing is
> > that the kernel seems to provide all features required for GDB to
> > implement any of the above, so we can still make that decision later.
> > 
> > > Getting back to the kernel interface, are you happy with what Anshuman
> > > has proposed in the current series?
> > 
> > Given the discussion above, this seems fine to me now.
> 
> Great, we'll push through with this in mind.

Anshuman,

With that in mind, do we have a way to set the top 32bits of the MSR
(which contain the TM bits) when ptracing 32 bit processes?  I can't
find anything like that in this patch set.  

Mikey

Anshuman Khandual March 23, 2015, 10:34 a.m. UTC | #16

On 03/19/2015 04:20 AM, Michael Neuling wrote:
> On Thu, 2015-03-19 at 09:45 +1100, Michael Neuling wrote:
>> On Wed, 2015-03-18 at 13:53 +0100, Ulrich Weigand wrote:
>>> Michael Neuling <mikey@neuling.org> wrote on 23.02.2015 05:51:50:
>>>
>>>> Sorry for the slow response.
>>>
>>> Same here :-(
>>
>> I'm going to break the cycle and respond in a few hours :-)
>>
>>
>>>> I think what you're proposing with running the inferior function in
>>>> suspend mode may end up corrupting the stack in this way.  You'd need to
>>>> be really careful to make sure the inferior function is run on the stack
>>>> pointer of the checkpointed registers.
>>>
>>> On the other hand, if code called a subroutine after the tbegin, if we
>>> were using the checkpointed r1, this might corrupt the stack of the
>>> transactional code.  (This code will never actually *run* again since
>>> the transaction is doomed, but we can still *inspect* it in GDB after
>>> the inferior call has returned, so the stack should remain unchanged.
>>> Well .. if the transaction is suspended, the code might in fact still
>>> run, so it should remain unchanged either way.)
>>>
>>> I guess we could use the minimum of transactional and checkpointed r1
>>> in that case, to be safe either way.
>>
>> Sounds good.
>>
>> <snip>
>>
>>>>> Using the combination of (A)+(A') would be easiest to implement
>>>>> in GDB without modifying a lot of common code, and would have the
>>>>> advantage that the inferior function always executes in the same
>>>>> state (suspended), while leaving information about the interrupted
>>>>> transaction visible.
>>>>>
>>>>> Using the combination of (B)+(B') would be a bit more difficult
>>>>> to implement (but certainly feasible), and would have the advantage
>>>>> that the inferior function always executes in nontransactional state
>>>>> (which is what it would most likely expect, anyway).  However, the
>>>>> disadvantage is that after the inferior call returns, GDB is unable
>>>>> to fully restore the visible inferior state as it was before (since
>>>>> we're now in nontransactional state, and there is probably no way
>>>>> to force us back into transactional/suspended state ...).
>>>>
>>>> Yep.
>>>
>>> So right now I'd tend to prefer (A)+(A'), but the important thing is
>>> that the kernel seems to provide all features required for GDB to
>>> implement any of the above, so we can still make that decision later.
>>>
>>>> Getting back to the kernel interface, are you happy with what Anshuman
>>>> has proposed in the current series?
>>>
>>> Given the discussion above, this seems fine to me now.
>>
>> Great, we'll push through with this in mind.
> 
> Anshuman,
> 
> With that in mind, do we have a way to set the top 32bits of the MSR
> (which contain the TM bits) when ptracing 32 bit processes?  I can't
> find anything like that in this patch set.

No, we dont have that yet. When ptracing in 32-bit mode the MSR value
which can be viewed or set from the user space through PTRACE_GETREGS
PTRACE_SETREGS call is it's lower 32 bits only. Either we can club
the upper 32 bits of MSR as part of one of the ELF core notes we are
adding in the patch series or we can create one more separate ELF core
note for that purpose. Let me know your opinion on this.

Ulrich Weigand April 8, 2015, 5:50 p.m. UTC | #17

Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 23.03.2015
11:34:30:

> > With that in mind, do we have a way to set the top 32bits of the MSR
> > (which contain the TM bits) when ptracing 32 bit processes?  I can't
> > find anything like that in this patch set.
>
> No, we dont have that yet. When ptracing in 32-bit mode the MSR value
> which can be viewed or set from the user space through PTRACE_GETREGS
> PTRACE_SETREGS call is it's lower 32 bits only. Either we can club
> the upper 32 bits of MSR as part of one of the ELF core notes we are
> adding in the patch series or we can create one more separate ELF core
> note for that purpose. Let me know your opinion on this.

I'm not sure I understand this.  I thought we had the following:

- If the process calling ptrace is itself 64-bit (which is how GDB is
  built on all current Linux distributions), then PTRACE_GETREGS etc.
  will *always* operate on 64-bit register sets, even if the target
  process is 32-bit.

- If the process calling ptrace is 32-bit, then PTRACE_GETREGS will
  operate on 32-bit register sets.   However, there is a separate
  PTRACE_GETREGS64 / PTRACE_SETREGS64 call that will also provide
  the opportunity to operate on the full 64-bit register set.  Both
  apply independently of whether the target process is 32-bit or
  64-bit.

Is this not correct?

Bye,
Ulrich

Michael Neuling April 8, 2015, 11:11 p.m. UTC | #18

On Wed, 2015-04-08 at 19:50 +0200, Ulrich Weigand wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 23.03.2015
> 11:34:30:
> 
> > > With that in mind, do we have a way to set the top 32bits of the MSR
> > > (which contain the TM bits) when ptracing 32 bit processes?  I can't
> > > find anything like that in this patch set.
> >
> > No, we dont have that yet. When ptracing in 32-bit mode the MSR value
> > which can be viewed or set from the user space through PTRACE_GETREGS
> > PTRACE_SETREGS call is it's lower 32 bits only. Either we can club
> > the upper 32 bits of MSR as part of one of the ELF core notes we are
> > adding in the patch series or we can create one more separate ELF core
> > note for that purpose. Let me know your opinion on this.
> 
> I'm not sure I understand this.  I thought we had the following:
> 
> - If the process calling ptrace is itself 64-bit (which is how GDB is
>   built on all current Linux distributions), then PTRACE_GETREGS etc.
>   will *always* operate on 64-bit register sets, even if the target
>   process is 32-bit.
> 
> - If the process calling ptrace is 32-bit, then PTRACE_GETREGS will
>   operate on 32-bit register sets.   However, there is a separate
>   PTRACE_GETREGS64 / PTRACE_SETREGS64 call that will also provide
>   the opportunity to operate on the full 64-bit register set.  Both
>   apply independently of whether the target process is 32-bit or
>   64-bit.
> 
> Is this not correct?

I think you're correct.  We should be right.  I'd forgotten about the
GET/SETREGS64 interfaces.

Mikey

Anshuman Khandual April 9, 2015, 12:50 p.m. UTC | #19

On 04/09/2015 04:41 AM, Michael Neuling wrote:
> On Wed, 2015-04-08 at 19:50 +0200, Ulrich Weigand wrote:
>> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 23.03.2015
>> 11:34:30:
>>
>>>> With that in mind, do we have a way to set the top 32bits of the MSR
>>>> (which contain the TM bits) when ptracing 32 bit processes?  I can't
>>>> find anything like that in this patch set.
>>>
>>> No, we dont have that yet. When ptracing in 32-bit mode the MSR value
>>> which can be viewed or set from the user space through PTRACE_GETREGS
>>> PTRACE_SETREGS call is it's lower 32 bits only. Either we can club
>>> the upper 32 bits of MSR as part of one of the ELF core notes we are
>>> adding in the patch series or we can create one more separate ELF core
>>> note for that purpose. Let me know your opinion on this.
>>
>> I'm not sure I understand this.  I thought we had the following:
>>
>> - If the process calling ptrace is itself 64-bit (which is how GDB is
>>   built on all current Linux distributions), then PTRACE_GETREGS etc.
>>   will *always* operate on 64-bit register sets, even if the target
>>   process is 32-bit.
>>
>> - If the process calling ptrace is 32-bit, then PTRACE_GETREGS will
>>   operate on 32-bit register sets.   However, there is a separate
>>   PTRACE_GETREGS64 / PTRACE_SETREGS64 call that will also provide
>>   the opportunity to operate on the full 64-bit register set.  Both
>>   apply independently of whether the target process is 32-bit or
>>   64-bit.
>>
>> Is this not correct?
> 
> I think you're correct.  We should be right.  I'd forgotten about the
> GET/SETREGS64 interfaces.

In that case, is the patch series complete and okay ? Is there any thing
else we need to verify other than waiting for the GDB test results which
Edjunior has been working on. But I am not aware of the status on the GDB
test development front.

Edjunior,

Do you have any updates ?

Regards
Anshuman

Michael Neuling April 10, 2015, 3:03 a.m. UTC | #20

On Thu, 2015-04-09 at 18:20 +0530, Anshuman Khandual wrote:
> On 04/09/2015 04:41 AM, Michael Neuling wrote:
> > On Wed, 2015-04-08 at 19:50 +0200, Ulrich Weigand wrote:
> >> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 23.03.2015
> >> 11:34:30:
> >>
> >>>> With that in mind, do we have a way to set the top 32bits of the MSR
> >>>> (which contain the TM bits) when ptracing 32 bit processes?  I can't
> >>>> find anything like that in this patch set.
> >>>
> >>> No, we dont have that yet. When ptracing in 32-bit mode the MSR value
> >>> which can be viewed or set from the user space through PTRACE_GETREGS
> >>> PTRACE_SETREGS call is it's lower 32 bits only. Either we can club
> >>> the upper 32 bits of MSR as part of one of the ELF core notes we are
> >>> adding in the patch series or we can create one more separate ELF core
> >>> note for that purpose. Let me know your opinion on this.
> >>
> >> I'm not sure I understand this.  I thought we had the following:
> >>
> >> - If the process calling ptrace is itself 64-bit (which is how GDB is
> >>   built on all current Linux distributions), then PTRACE_GETREGS etc.
> >>   will *always* operate on 64-bit register sets, even if the target
> >>   process is 32-bit.
> >>
> >> - If the process calling ptrace is 32-bit, then PTRACE_GETREGS will
> >>   operate on 32-bit register sets.   However, there is a separate
> >>   PTRACE_GETREGS64 / PTRACE_SETREGS64 call that will also provide
> >>   the opportunity to operate on the full 64-bit register set.  Both
> >>   apply independently of whether the target process is 32-bit or
> >>   64-bit.
> >>
> >> Is this not correct?
> > 
> > I think you're correct.  We should be right.  I'd forgotten about the
> > GET/SETREGS64 interfaces.
> 
> In that case, is the patch series complete and okay ? Is there any thing
> else we need to verify other than waiting for the GDB test results which
> Edjunior has been working on. But I am not aware of the status on the GDB
> test development front.

I think we are good.

Mikey

> 
> Edjunior,
> 
> Do you have any updates ?
> 
> Regards
> Anshuman
> 
>

Anshuman Khandual April 10, 2015, 9:10 a.m. UTC | #21

On 04/10/2015 08:33 AM, Michael Neuling wrote:
> On Thu, 2015-04-09 at 18:20 +0530, Anshuman Khandual wrote:
>> On 04/09/2015 04:41 AM, Michael Neuling wrote:
>>> On Wed, 2015-04-08 at 19:50 +0200, Ulrich Weigand wrote:
>>>> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 23.03.2015
>>>> 11:34:30:
>>>>
>>>>>> With that in mind, do we have a way to set the top 32bits of the MSR
>>>>>> (which contain the TM bits) when ptracing 32 bit processes?  I can't
>>>>>> find anything like that in this patch set.
>>>>>
>>>>> No, we dont have that yet. When ptracing in 32-bit mode the MSR value
>>>>> which can be viewed or set from the user space through PTRACE_GETREGS
>>>>> PTRACE_SETREGS call is it's lower 32 bits only. Either we can club
>>>>> the upper 32 bits of MSR as part of one of the ELF core notes we are
>>>>> adding in the patch series or we can create one more separate ELF core
>>>>> note for that purpose. Let me know your opinion on this.
>>>>
>>>> I'm not sure I understand this.  I thought we had the following:
>>>>
>>>> - If the process calling ptrace is itself 64-bit (which is how GDB is
>>>>   built on all current Linux distributions), then PTRACE_GETREGS etc.
>>>>   will *always* operate on 64-bit register sets, even if the target
>>>>   process is 32-bit.
>>>>
>>>> - If the process calling ptrace is 32-bit, then PTRACE_GETREGS will
>>>>   operate on 32-bit register sets.   However, there is a separate
>>>>   PTRACE_GETREGS64 / PTRACE_SETREGS64 call that will also provide
>>>>   the opportunity to operate on the full 64-bit register set.  Both
>>>>   apply independently of whether the target process is 32-bit or
>>>>   64-bit.
>>>>
>>>> Is this not correct?
>>>
>>> I think you're correct.  We should be right.  I'd forgotten about the
>>> GET/SETREGS64 interfaces.
>>
>> In that case, is the patch series complete and okay ? Is there any thing
>> else we need to verify other than waiting for the GDB test results which
>> Edjunior has been working on. But I am not aware of the status on the GDB
>> test development front.
> 
> I think we are good.

I had posted a newer version [V7] of this patch series couple of months back
which got ignored while the discussion continued in this version.

V7: https://lkml.org/lkml/2015/1/14/19

Apart from the last gitignore related patch which already got merged into
mainline separately, all other patches should be as good even today. I will
try rebasing the series, running the base tests again and re post it in some
time.

Ulrich Weigand April 10, 2015, 10:33 a.m. UTC | #22

Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 10.04.2015
11:10:35:

> I had posted a newer version [V7] of this patch series couple of months
back
> which got ignored while the discussion continued in this version.
>
> V7: https://lkml.org/lkml/2015/1/14/19

Ah, with all the back-and-forth on the checkpointed state, I never looked
at this.  Unfortunately, there's still a number of issues with this, I
think:

- You provide checkpointed FPR and VMX registers, but there doesn't seem
  to be any way to get at the checkpointed *VSX* registers (i.e. the part
  that is neither covered by FPR or VMX, corresponding to NT_PPC_VSX).

- We may have had this discussion in the past, but I still do not like the
  notion of a "misc" register set, in particular since the three registers
  in it are available at different architecture levels and categories.

  I would much prefer three separate regsets (e.g. NT_PPC_DSCR, NT_PPC_PPR,
  and NT_PPC_TAR), each of which is available and valid if and only if the
  current processor actually has the register in question.

  If we do have a single regset, at the very least a "get" operation should
  set registers unvailable on the machine to a defined state (zero?)
  instead of simply leaving memory uninitialized.

- Similarly, the NT_PPC_TM_SPR regset as currently defined mixes and
matches
  registers with different "lifetimes".  The transactional memory registers
  (TFHAR, TEXASR, TFIAR) are available *always* on machines that support
  transactions.  But the other registers in that regset are checkpointed
  versions that are only available/valid within a transaction.  I think a
  better way to faithfully represent this would be to have the
NT_PPC_TM_SPR
  regset only contain the transcational memory registers, and use separate
  regsets for the checkpointed registers -- those should parallel the non-
  checkpointed register regset.

  For example, if we have NT_PPC_DSCR, there should be a NT_PPC_CDSCR for
  the checkpointed version etc.  (If we do stay with MISC, there should
then
  be a CMISC).

- Particularly confusing to me is the "checkpointed original MSR" which
  currently also resides in NT_PPC_TM_SPR.  What exactly is this?  How
  does that differ from the MSR slot in the NT_PPC_TM_CGPR regset?

  I may be misreading kernel code, but it seems the kernel does not
actually
  use the ckpt_regs.msr slot at all, and therefore the corresponding slot
of
  the NT_PPC_TM_CGPR regset is likewise undefined/unused.  Would it not be
  more consistent to use that slot to pass the checkpointed MSR?

  In any case, it seems the ptrace set-register case currently allows user
  space to restore *any* arbitrary value into the checkpointed MSR, which
  would presumably get restored into the real MSR at some point, unless I'm
  missing something here.  Do we not need a check that only safe bits are
  modified, just like with ptrace access to the real MSR?

Bye,
Ulrich

Anshuman Khandual April 13, 2015, 8:48 a.m. UTC | #23

On 04/10/2015 04:03 PM, Ulrich Weigand wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 10.04.2015
> 11:10:35:
> 
>> I had posted a newer version [V7] of this patch series couple of months
> back
>> which got ignored while the discussion continued in this version.
>>
>> V7: https://lkml.org/lkml/2015/1/14/19
> 
> Ah, with all the back-and-forth on the checkpointed state, I never looked
> at this.  Unfortunately, there's still a number of issues with this, I
> think:
> 
> - You provide checkpointed FPR and VMX registers, but there doesn't seem
>   to be any way to get at the checkpointed *VSX* registers (i.e. the part
>   that is neither covered by FPR or VMX, corresponding to NT_PPC_VSX).

Will change vsr_get, vsr_set functions as we have done for fpr_get and fpr_set
functions. Also will add one more ELF core note NT_PPC_TM_CVSX to fetch the
check pointed state of VSX register while inside the transaction.

> 
> - We may have had this discussion in the past, but I still do not like the
>   notion of a "misc" register set, in particular since the three registers
>   in it are available at different architecture levels and categories.

Misc category as always stands for registers which can not be easily classified
into any meaningful categories.

> 
>   I would much prefer three separate regsets (e.g. NT_PPC_DSCR, NT_PPC_PPR,
>   and NT_PPC_TAR), each of which is available and valid if and only if the
>   current processor actually has the register in question.

Thats like adding one ELF core note for every single register because we cannot
put them in any category. Then as Michael Ellerman had pointed out to include
a lot more registers in this MISC category (which we are not doing right now
in the interest of having minimum support available before we look at the full
possible list of MISC registers), we should add one ELF core note section for
each of those individual registers ? I am not sure.

> 
>   If we do have a single regset, at the very least a "get" operation should
>   set registers unvailable on the machine to a defined state (zero?)
>   instead of simply leaving memory uninitialized.

Yeah sure, we can do that.

> 
> - Similarly, the NT_PPC_TM_SPR regset as currently defined mixes and
> matches
>   registers with different "lifetimes".  The transactional memory registers
>   (TFHAR, TEXASR, TFIAR) are available *always* on machines that support
>   transactions.  But the other registers in that regset are checkpointed
>   versions that are only available/valid within a transaction.  I think a
>   better way to faithfully represent this would be to have the
> NT_PPC_TM_SPR
>   regset only contain the transcational memory registers, and use separate
>   regsets for the checkpointed registers -- those should parallel the non-
>   checkpointed register regset.

Right now, we support NT_PPC_TM_SPR only inside the transaction, so we dont
have the problem with different "lifetimes" registers accessed together. But
yes, I get your point.

> 
>   For example, if we have NT_PPC_DSCR, there should be a NT_PPC_CDSCR for
>   the checkpointed version etc.  (If we do stay with MISC, there should
> then
>   be a CMISC).

But then NT_PPC_MISC and NT_PPC_CMISC can contain different set of registers.
NT_PPC_CMISC will contain the orig_msr for now which the other elf core note
does not have and NT_PPC_MISC will grow to have lot more registers in the
future leaving behind NT_PPC_CMISC as it is. Need to take care of these.

> 
> - Particularly confusing to me is the "checkpointed original MSR" which
>   currently also resides in NT_PPC_TM_SPR.  What exactly is this?  How
>   does that differ from the MSR slot in the NT_PPC_TM_CGPR regset?

I believed it stores the check pointed MSR value which was in the register
before the transaction started. But then how it is different from the
ckpt_regs.msr, I am not sure. Mikey or Michael should be able to clarify
more on this. I can see "orig_msr" getting used in many places to hold the
check pointed value of MSR.

> 
>   I may be misreading kernel code, but it seems the kernel does not
> actually
>   use the ckpt_regs.msr slot at all, and therefore the corresponding slot
> of
>   the NT_PPC_TM_CGPR regset is likewise undefined/unused.  Would it not be
>   more consistent to use that slot to pass the checkpointed MSR?

Hmm. Its a valid point. Would like to get some more clarity on this from
Mikey whether that slot can be used for check pointed MSR value or not.
Then why did we have these two slots to hold the same check pointed MSR
value in the first place at all. Getting confused a bit.

> 
>   In any case, it seems the ptrace set-register case currently allows user
>   space to restore *any* arbitrary value into the checkpointed MSR, which
>   would presumably get restored into the real MSR at some point, unless I'm
>   missing something here.  Do we not need a check that only safe bits are
>   modified, just like with ptrace access to the real MSR?

Where and which safe bits do we check before writing any value into the MSR
register from ptrace interface ? May be I am missing something here.

Anshuman Khandual April 20, 2015, 6:42 a.m. UTC | #24

On 04/13/2015 02:18 PM, Anshuman Khandual wrote:
> On 04/10/2015 04:03 PM, Ulrich Weigand wrote:
>> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 10.04.2015
>> 11:10:35:
> 
> I believed it stores the check pointed MSR value which was in the register
> before the transaction started. But then how it is different from the
> ckpt_regs.msr, I am not sure. Mikey or Michael should be able to clarify
> more on this. I can see "orig_msr" getting used in many places to hold the
> check pointed value of MSR.

tm_orig_msr is used during process context switch only. ckpt_regs
gets used in the signal context where we save all userspace context.
So ptrace would look into the saved MSR value correctly inside
ckpt_regs.msr slot. I believe thats the check pointed MSR value
we are interested in from the ptrace perspective not the tm_orig_msr
which just gets used during process context switch.

>>   I may be misreading kernel code, but it seems the kernel does not
>> actually
>>   use the ckpt_regs.msr slot at all, and therefore the corresponding slot
>> of
>>   the NT_PPC_TM_CGPR regset is likewise undefined/unused.  Would it not be
>>   more consistent to use that slot to pass the checkpointed MSR?
> 
> Hmm. Its a valid point. Would like to get some more clarity on this from
> Mikey whether that slot can be used for check pointed MSR value or not.
> Then why did we have these two slots to hold the same check pointed MSR
> value in the first place at all. Getting confused a bit.

Using ckpt_regs.msr during process context switch instead of tm_orig_msr seems
to be working fine and all the basic TM tests pass, in which case we can drop
tm_orig_msr from thread_struct. Will post a patch on this and see whats the
response from others.

Ulrich Weigand April 20, 2015, 12:27 p.m. UTC | #25

Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 13.04.2015
10:48:57:
> On 04/10/2015 04:03 PM, Ulrich Weigand wrote:
> > - You provide checkpointed FPR and VMX registers, but there doesn't
seem
> >   to be any way to get at the checkpointed *VSX* registers (i.e. the
part
> >   that is neither covered by FPR or VMX, corresponding to NT_PPC_VSX).
>
> Will change vsr_get, vsr_set functions as we have done for fpr_get and
fpr_set
> functions. Also will add one more ELF core note NT_PPC_TM_CVSX to fetch
the
> check pointed state of VSX register while inside the transaction.

OK.

> >   I would much prefer three separate regsets (e.g. NT_PPC_DSCR,
NT_PPC_PPR,
> >   and NT_PPC_TAR), each of which is available and valid if and only if
the
> >   current processor actually has the register in question.
>
> Thats like adding one ELF core note for every single register
> because we cannot
> put them in any category. Then as Michael Ellerman had pointed out to
include
> a lot more registers in this MISC category (which we are not doing right
now
> in the interest of having minimum support available before we look at the
full
> possible list of MISC registers), we should add one ELF core note section
for
> each of those individual registers ? I am not sure.

This confuses me a bit.  My understanding was that ptrace regsets, once
defined, should never change in the future.  (GDB will only check whether
or not a regset is supported; if it is, it will expect the contents to be
as it expects them to be.)  "Including a lot more registers" would
therefore
seem to require adding new regsets anyway, which is one of the reasons why
I disagree a "MISC" regset is particularly useful.

> > - Similarly, the NT_PPC_TM_SPR regset as currently defined mixes and
> > matches
> >   registers with different "lifetimes".  The transactional memory
registers
> >   (TFHAR, TEXASR, TFIAR) are available *always* on machines that
support
> >   transactions.  But the other registers in that regset are
checkpointed
> >   versions that are only available/valid within a transaction.  I think
a
> >   better way to faithfully represent this would be to have the
> > NT_PPC_TM_SPR
> >   regset only contain the transcational memory registers, and use
separate
> >   regsets for the checkpointed registers -- those should parallel the
non-
> >   checkpointed register regset.
>
> Right now, we support NT_PPC_TM_SPR only inside the transaction, so we
dont
> have the problem with different "lifetimes" registers accessed together.
But
> yes, I get your point.

Since the transactional SPRs are accessible from user space outside of a
transaction, it would make sense for them to accessible from ptrace as
well.
If the current patch set doesn't do that, I guess it would be better to
change that.

> > - Particularly confusing to me is the "checkpointed original MSR" which
> >   currently also resides in NT_PPC_TM_SPR.  What exactly is this?  How
> >   does that differ from the MSR slot in the NT_PPC_TM_CGPR regset?
>
> I believed it stores the check pointed MSR value which was in the
register
> before the transaction started. But then how it is different from the
> ckpt_regs.msr, I am not sure. Mikey or Michael should be able to clarify
> more on this. I can see "orig_msr" getting used in many places to hold
the
> check pointed value of MSR.

Your other mail states that the orig_mst may be irrelevant for ptrace
anyway ... that would be OK with me as well.

> >   In any case, it seems the ptrace set-register case currently allows
user
> >   space to restore *any* arbitrary value into the checkpointed MSR,
which
> >   would presumably get restored into the real MSR at some point, unless
I'm
> >   missing something here.  Do we not need a check that only safe bits
are
> >   modified, just like with ptrace access to the real MSR?
>
> Where and which safe bits do we check before writing any value into the
MSR
> register from ptrace interface ? May be I am missing something here.

All ptrace accesses to *set* the regular msr go via this routine:

static int set_user_msr(struct task_struct *task, unsigned long msr)
{
        task->thread.regs->msr &= ~MSR_DEBUGCHANGE;
        task->thread.regs->msr |= msr & MSR_DEBUGCHANGE;
        return 0;
}

I think we'd need to do the equivalent whenever changing the checkpointed
MSR.

Bye,
Ulrich

Anshuman Khandual April 21, 2015, 4:55 a.m. UTC | #26

On 04/20/2015 05:57 PM, Ulrich Weigand wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 13.04.2015
> 10:48:57:
>> On 04/10/2015 04:03 PM, Ulrich Weigand wrote:
>>> - You provide checkpointed FPR and VMX registers, but there doesn't
> seem
>>>   to be any way to get at the checkpointed *VSX* registers (i.e. the
> part
>>>   that is neither covered by FPR or VMX, corresponding to NT_PPC_VSX).
>>
>> Will change vsr_get, vsr_set functions as we have done for fpr_get and
> fpr_set
>> functions. Also will add one more ELF core note NT_PPC_TM_CVSX to fetch
> the
>> check pointed state of VSX register while inside the transaction.
> 
> OK.
> 
>>>   I would much prefer three separate regsets (e.g. NT_PPC_DSCR,
> NT_PPC_PPR,
>>>   and NT_PPC_TAR), each of which is available and valid if and only if
> the
>>>   current processor actually has the register in question.
>>
>> Thats like adding one ELF core note for every single register
>> because we cannot
>> put them in any category. Then as Michael Ellerman had pointed out to
> include
>> a lot more registers in this MISC category (which we are not doing right
> now
>> in the interest of having minimum support available before we look at the
> full
>> possible list of MISC registers), we should add one ELF core note section
> for
>> each of those individual registers ? I am not sure.
> 
> This confuses me a bit.  My understanding was that ptrace regsets, once
> defined, should never change in the future.  (GDB will only check whether
> or not a regset is supported; if it is, it will expect the contents to be
> as it expects them to be.)  "Including a lot more registers" would
> therefore
> seem to require adding new regsets anyway, which is one of the reasons why
> I disagree a "MISC" regset is particularly useful.

Yeah right. Started thinking that (NT_PPC_TAR, NT_PPC_CTAR),
(NT_PPC_PPR, NT_PPC_CPPR), (NT_PPC_DSCR, NT_PPC_CDSCR) kind of combinations
make more sense !

> 
>>> - Similarly, the NT_PPC_TM_SPR regset as currently defined mixes and
>>> matches
>>>   registers with different "lifetimes".  The transactional memory
> registers
>>>   (TFHAR, TEXASR, TFIAR) are available *always* on machines that
> support
>>>   transactions.  But the other registers in that regset are
> checkpointed
>>>   versions that are only available/valid within a transaction.  I think
> a
>>>   better way to faithfully represent this would be to have the
>>> NT_PPC_TM_SPR
>>>   regset only contain the transcational memory registers, and use
> separate
>>>   regsets for the checkpointed registers -- those should parallel the
> non-
>>>   checkpointed register regset.
>>
>> Right now, we support NT_PPC_TM_SPR only inside the transaction, so we
> dont
>> have the problem with different "lifetimes" registers accessed together.
> But
>> yes, I get your point.
> 
> Since the transactional SPRs are accessible from user space outside of a
> transaction, it would make sense for them to accessible from ptrace as
> well.
> If the current patch set doesn't do that, I guess it would be better to
> change that.

Yeah I agree. Will change it.

> 
>>> - Particularly confusing to me is the "checkpointed original MSR" which
>>>   currently also resides in NT_PPC_TM_SPR.  What exactly is this?  How
>>>   does that differ from the MSR slot in the NT_PPC_TM_CGPR regset?
>>
>> I believed it stores the check pointed MSR value which was in the
> register
>> before the transaction started. But then how it is different from the
>> ckpt_regs.msr, I am not sure. Mikey or Michael should be able to clarify
>> more on this. I can see "orig_msr" getting used in many places to hold
> the
>> check pointed value of MSR.
> 
> Your other mail states that the orig_mst may be irrelevant for ptrace
> anyway ... that would be OK with me as well.

Yeah. The variable tm_orig_msr is used to compute MSR state inside
the kernel or what would be passed to the user space while returning
at various stages of the transaction, where as ckpt_regs.msr contains
the latest check pointed MSR value to be fetched by ptrace. Thats my
understanding as of now.

> 
>>>   In any case, it seems the ptrace set-register case currently allows
> user
>>>   space to restore *any* arbitrary value into the checkpointed MSR,
> which
>>>   would presumably get restored into the real MSR at some point, unless
> I'm
>>>   missing something here.  Do we not need a check that only safe bits
> are
>>>   modified, just like with ptrace access to the real MSR?
>>
>> Where and which safe bits do we check before writing any value into the
> MSR
>> register from ptrace interface ? May be I am missing something here.
> 
> All ptrace accesses to *set* the regular msr go via this routine:
> 
> static int set_user_msr(struct task_struct *task, unsigned long msr)
> {
>         task->thread.regs->msr &= ~MSR_DEBUGCHANGE;
>         task->thread.regs->msr |= msr & MSR_DEBUGCHANGE;
>         return 0;
> }
> 
> I think we'd need to do the equivalent whenever changing the checkpointed
> MSR.

Agree, will incorporate this change.

In summary, after putting together all the issues that we have
discussed till now regarding the number and scope of all new ELF
core note sections being added, the probable elements there in
can be listed as below.

Changed ELF core note sections
------------------------------
These core note sections need to be changed to accommodate the in
transaction ptrace requests when the running/current value of these
registers will reside some where else instead of the original places
of thread_struct.

/* Running register state */
(1) NT_PRFPREG				(Accessible always)
(2) NT_PPC_VMX				(Accessible always)
(3) NT_PPC_VSX				(Accessible always)

New ELF core note sections
--------------------------
/* TM check pointed register set */
(1) NT_PPC_TM_CGPR	--> NT_PRSTATUS	(Accessible inside TM)	
(2) NT_PPC_TM_CFPR	--> NT_PRFPREG	(Accessible inside TM)
(3) NT_PPC_TM_CVMX	--> NT_PPC_VMX	(Accessible inside TM)
(4) NT_PPC_TM_CVSX	--> NT_PPC_VSX	(Accessible inside TM)

NOTE: The register set data structure for these ELF core not
sections would exactly match with that of the corresponding
running value register sets indicated above.

/* TM SPR set */			(Accessible always)
(5) NT_PPC_TM_SPR	thread->tm_tfhar
			thread->tm_tfiar
			thread->ttm_exasr

/* TM check pointed misc register set */
(6) NT_PPC_TM_TAR	thread->tm_tar	(Accessible inside TM)
(7) NT_PPC_TM_PPR	thread->tm_ppr	(Accessible inside TM)
(8) NT_PPC_TM_DSCR	thread->tm_dscr (Accessible inside TM)

NOTE: Application can have a different set of TAR, PPR and DSCR
registers inside the transaction compared that of the outside.
Also seems like they are *not* the check pointed ones, will
double check on this. Changed the core note section name from
NT_PPC_TM_CTAR to just NT_PPC_TM_TAR and for all the others.

/* Running misc register set */
(9) 	NT_PPC_TAR	thread->tar	(Accessible always)
(10)	NT_PPC_PPR	thread->ppr	(Accessible always)
(11)	NT_PPC_DSCR	thread->dscr	(Accessible always)

NOTE: They are like any other special purpose register which can
be changed from the user space. So the elf core note section name
can be generic. Here are some optional ELF core note sections
which we can also add like the above ones.

(12)	NT_PPC_EBBRR	thread->ebbrr	(Accessible inside EBB)
(13)	NT_PPC_EBBHR	thread->ebbhr	(Accessible inside EBB)
(14)	NT_PPC_BESCR	thread->bescr	(Accessible inside EBB)
(15)	NT_PPC_SIAR	thread->siar	(Accessible inside EBB)
(16) 	NT_PPC_SDAR	thread->sdar	(Accessible inside EBB)
(17)	NT_PPC_SIER	thread->sier	(Accessible inside EBB)
(18)	NT_PPC_MMCR2	thread->mmcr2	(Accessible inside EBB)
(19)	NT_PPC_MMCR0	thread->mmcr0	(Accessible inside EBB)

Ulrich, Mikey, MPE,

	Please do let me know your thoughts on this.

Regards
Anshuman

Ulrich Weigand April 21, 2015, 2:41 p.m. UTC | #27

Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 21.04.2015
06:55:24:

> Changed ELF core note sections
> ------------------------------
> These core note sections need to be changed to accommodate the in
> transaction ptrace requests when the running/current value of these
> registers will reside some where else instead of the original places
> of thread_struct.
>
> /* Running register state */
> (1) NT_PRFPREG            (Accessible always)
> (2) NT_PPC_VMX            (Accessible always)
> (3) NT_PPC_VSX            (Accessible always)
>
> New ELF core note sections
> --------------------------
> /* TM check pointed register set */
> (1) NT_PPC_TM_CGPR   --> NT_PRSTATUS   (Accessible inside TM)
> (2) NT_PPC_TM_CFPR   --> NT_PRFPREG   (Accessible inside TM)
> (3) NT_PPC_TM_CVMX   --> NT_PPC_VMX   (Accessible inside TM)
> (4) NT_PPC_TM_CVSX   --> NT_PPC_VSX   (Accessible inside TM)
>
> NOTE: The register set data structure for these ELF core not
> sections would exactly match with that of the corresponding
> running value register sets indicated above.

OK.

> /* TM SPR set */         (Accessible always)
> (5) NT_PPC_TM_SPR   thread->tm_tfhar
>          thread->tm_tfiar
>          thread->ttm_exasr

OK.

> /* TM check pointed misc register set */
> (6) NT_PPC_TM_TAR   thread->tm_tar   (Accessible inside TM)
> (7) NT_PPC_TM_PPR   thread->tm_ppr   (Accessible inside TM)
> (8) NT_PPC_TM_DSCR   thread->tm_dscr (Accessible inside TM)
>
> NOTE: Application can have a different set of TAR, PPR and DSCR
> registers inside the transaction compared that of the outside.
> Also seems like they are *not* the check pointed ones, will
> double check on this. Changed the core note section name from
> NT_PPC_TM_CTAR to just NT_PPC_TM_TAR and for all the others.

Huh?  How is this not the checkpointed set?  I would have
expected that NT_PPC_TAR contains the current tar (which is
the one in the transaction if we're within one), while
NT_PPC_TM_CTAR contains the checkpointed value that will be
restored once the transaction aborts.  Why is this not the
case?

> NOTE: They are like any other special purpose register which can
> be changed from the user space. So the elf core note section name
> can be generic. Here are some optional ELF core note sections
> which we can also add like the above ones.
>
> (12)   NT_PPC_EBBRR   thread->ebbrr   (Accessible inside EBB)
> (13)   NT_PPC_EBBHR   thread->ebbhr   (Accessible inside EBB)
> (14)   NT_PPC_BESCR   thread->bescr   (Accessible inside EBB)
> (15)   NT_PPC_SIAR   thread->siar   (Accessible inside EBB)
> (16)    NT_PPC_SDAR   thread->sdar   (Accessible inside EBB)
> (17)   NT_PPC_SIER   thread->sier   (Accessible inside EBB)
> (18)   NT_PPC_MMCR2   thread->mmcr2   (Accessible inside EBB)
> (19)   NT_PPC_MMCR0   thread->mmcr0   (Accessible inside EBB)

So I'm not really familiar with the EBB stuff.  But just as a
general note, if those are in fact related (i.e. on every machine
that has EBB, all those registers will be available), it might
indeed make more sense to collect them into a single note section
(NT_PPC_EBB?) after all.

Bye,
Ulrich

Anshuman Khandual April 22, 2015, 9:24 a.m. UTC | #28

On 04/21/2015 08:11 PM, Ulrich Weigand wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote on 21.04.2015
> 06:55:24:
> 
>> Changed ELF core note sections
>> ------------------------------
>> These core note sections need to be changed to accommodate the in
>> transaction ptrace requests when the running/current value of these
>> registers will reside some where else instead of the original places
>> of thread_struct.
>>
>> /* Running register state */
>> (1) NT_PRFPREG            (Accessible always)
>> (2) NT_PPC_VMX            (Accessible always)
>> (3) NT_PPC_VSX            (Accessible always)
>>
>> New ELF core note sections
>> --------------------------
>> /* TM check pointed register set */
>> (1) NT_PPC_TM_CGPR   --> NT_PRSTATUS   (Accessible inside TM)
>> (2) NT_PPC_TM_CFPR   --> NT_PRFPREG   (Accessible inside TM)
>> (3) NT_PPC_TM_CVMX   --> NT_PPC_VMX   (Accessible inside TM)
>> (4) NT_PPC_TM_CVSX   --> NT_PPC_VSX   (Accessible inside TM)
>>
>> NOTE: The register set data structure for these ELF core not
>> sections would exactly match with that of the corresponding
>> running value register sets indicated above.
> 
> OK.
> 
>> /* TM SPR set */         (Accessible always)
>> (5) NT_PPC_TM_SPR   thread->tm_tfhar
>>          thread->tm_tfiar
>>          thread->ttm_exasr
> 
> OK.
> 
>> /* TM check pointed misc register set */
>> (6) NT_PPC_TM_TAR   thread->tm_tar   (Accessible inside TM)
>> (7) NT_PPC_TM_PPR   thread->tm_ppr   (Accessible inside TM)
>> (8) NT_PPC_TM_DSCR   thread->tm_dscr (Accessible inside TM)
>>
>> NOTE: Application can have a different set of TAR, PPR and DSCR
>> registers inside the transaction compared that of the outside.
>> Also seems like they are *not* the check pointed ones, will
>> double check on this. Changed the core note section name from
>> NT_PPC_TM_CTAR to just NT_PPC_TM_TAR and for all the others.
> 
> Huh?  How is this not the checkpointed set?  I would have
> expected that NT_PPC_TAR contains the current tar (which is
> the one in the transaction if we're within one), while
> NT_PPC_TM_CTAR contains the checkpointed value that will be
> restored once the transaction aborts.  Why is this not the
> case?

Yeah you are right. There is one running value always which is
'thread->tm_tar' when TM is active and it is 'thread->tar' when
TM is not active. This can be accessed *always* with NT_PPC_TAR.
The check pointed TAR is 'thread->tar' only when *TM is active*
which can be accessed via NT_PPC_TM_CTAR. Will update the ELF
core note list with lifetimes.

> 
>> NOTE: They are like any other special purpose register which can
>> be changed from the user space. So the elf core note section name
>> can be generic. Here are some optional ELF core note sections
>> which we can also add like the above ones.
>>
>> (12)   NT_PPC_EBBRR   thread->ebbrr   (Accessible inside EBB)
>> (13)   NT_PPC_EBBHR   thread->ebbhr   (Accessible inside EBB)
>> (14)   NT_PPC_BESCR   thread->bescr   (Accessible inside EBB)
>> (15)   NT_PPC_SIAR   thread->siar   (Accessible inside EBB)
>> (16)    NT_PPC_SDAR   thread->sdar   (Accessible inside EBB)
>> (17)   NT_PPC_SIER   thread->sier   (Accessible inside EBB)
>> (18)   NT_PPC_MMCR2   thread->mmcr2   (Accessible inside EBB)
>> (19)   NT_PPC_MMCR0   thread->mmcr0   (Accessible inside EBB)
> 
> So I'm not really familiar with the EBB stuff.  But just as a
> general note, if those are in fact related (i.e. on every machine
> that has EBB, all those registers will be available), it might
> indeed make more sense to collect them into a single note section
> (NT_PPC_EBB?) after all.

I agree that would save us precious ELF core note section entry slots
which are 256 in total ever for powerpc arch. Right now all these
register states in thread_struct are getting used for EBB only. But
again these are PMU related registers, in future we may need to access
them for purposes other than EBB. Yes, that will be a problem for us
to solve later on. Right now it makes more sense to group them together
under EBB heading and pass them as a group which saves ELF core note
entries for powerpc. Hopefully Mikey and Michael will agree on this.

[V6,1/9] elf: Add new powerpc specifc core note sections

Commit Message

Comments

Patch