diff mbox

[0/1] PPC32: fix ptrace() access to FPU registers

Message ID 20190610232758.19010-1-radu.rendec@gmail.com (mailing list archive)
State Not Applicable
Headers show

Checks

Context Check Description
snowpatch_ozlabs/apply_patch fail Failed to apply to any branch
snowpatch_ozlabs/apply_patch warning Failed to apply on branch next (a3bf9fbdad600b1e4335dd90979f8d6072e4f602)

Commit Message

Radu Rendec June 10, 2019, 11:27 p.m. UTC
Hi Everyone,

I'm following up on the ptrace() problem that I reported a few days ago.
I believe my version of the code handles all cases correctly. While the
problem essentially boils down to dividing the fpidx by 2 on PPC32, it
becomes tricky when the same code must work correctly on both PPC32 and
PPC64.

One other thing that I believe was handled incorrectly in the previous
version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
PT_FPR0 + 2*32 still corresponds to a possible address that userspace
can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
cause an invalid access to the FPU registers array.

I tested the patch on 4.9.179, but that part of the code is identical in
recent kernels so it should work just the same.

I wrote a simple test program than can be used to quickly test (on an
x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
The code is included below.

I also tested with gdbserver (test patch included below) and verified
that it generates two ptrace() calls for each FPU register, with
addresses between 0xc0 and 0x1bc.

8<--------------- Makefile ---------------------------------------------
.PHONY: all clean

all: ptrace-fpregs-32 ptrace-fpregs-64

ptrace-fpregs-32: ptrace-fpregs.c
	$(CC) -o ptrace-fpregs-32 -Wall -O2 -m32 ptrace-fpregs.c

ptrace-fpregs-64: ptrace-fpregs.c
	$(CC) -o ptrace-fpregs-64 -Wall -O2 ptrace-fpregs.c

clean:
	rm -f ptrace-fpregs-32 ptrace-fpregs-64
8<--------------- ptrace-fpregs.c --------------------------------------
#include <stdio.h>
#include <errno.h>

#define PT_FPR0	48

#ifndef __x86_64

#define PT_FPR31 (PT_FPR0 + 2*31)
#define PT_FPSCR (PT_FPR0 + 2*32 + 1)

#else

#define PT_FPSCR (PT_FPR0 + 32)

#endif

int test_access(unsigned long addr)
{
	int ret;

	do {
		unsigned long index, fpidx;

		ret = -EIO;

		/* convert to index and check */
		index = addr / sizeof(long);
		if ((addr & (sizeof(long) - 1)) || (index > PT_FPSCR))
			break;

		if (index < PT_FPR0) {
			ret = printf("ptrace_put_reg(%lu)", index);
			break;
		}

		ret = 0;
#ifndef __x86_64
		if (index == PT_FPSCR - 1) {
			/* corner case for PPC32; do nothing */
			printf("corner_case");
			break;
		}
#endif
		if (index == PT_FPSCR) {
			printf("fpscr");
			break;
		}

		/*
		 * FPR is always 64-bit; on PPC32, userspace does two 32-bit
		 * accesses. Add bit2 to allow accessing the upper half on
		 * 32-bit; on 64-bit, bit2 is always 0 (we validate it above).
		 */
		fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
		printf("TS_FPR[%lu] + %lu", fpidx, addr & 4);
		break;
	} while (0);

	return ret;
}

int main(void)
{
	unsigned long addr;
	int rc;

	for (addr = 0; addr < PT_FPSCR * sizeof(long) + 16; addr++) {
		printf("0x%04lx: ", addr);
		rc = test_access(addr);
		if (rc < 0)
			printf("!err!");
		printf("\t<%d>\n", rc);
	}

	return 0;
}
8<--------------- gdb.patch --------------------------------------------
8<----------------------------------------------------------------------

Radu Rendec (1):
  PPC32: fix ptrace() access to FPU registers

 arch/powerpc/kernel/ptrace.c | 85 ++++++++++++++++++++++--------------
 1 file changed, 52 insertions(+), 33 deletions(-)

Comments

Daniel Axtens June 13, 2019, 7:59 a.m. UTC | #1
Radu Rendec <radu.rendec@gmail.com> writes:

> Hi Everyone,
>
> I'm following up on the ptrace() problem that I reported a few days ago.
> I believe my version of the code handles all cases correctly. While the
> problem essentially boils down to dividing the fpidx by 2 on PPC32, it
> becomes tricky when the same code must work correctly on both PPC32 and
> PPC64.
>
> One other thing that I believe was handled incorrectly in the previous
> version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
> is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
> PT_FPR0 + 2*32 still corresponds to a possible address that userspace
> can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
> cause an invalid access to the FPU registers array.
>
> I tested the patch on 4.9.179, but that part of the code is identical in
> recent kernels so it should work just the same.

I've been looking into this. Something is definitely up, but I'm not
sure that we want to fix it in exactly the way you identified. I'll keep
you updated.

Regards,
Daniel

>
> I wrote a simple test program than can be used to quickly test (on an
> x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
> The code is included below.
>
> I also tested with gdbserver (test patch included below) and verified
> that it generates two ptrace() calls for each FPU register, with
> addresses between 0xc0 and 0x1bc.
>
> 8<--------------- Makefile ---------------------------------------------
> .PHONY: all clean
>
> all: ptrace-fpregs-32 ptrace-fpregs-64
>
> ptrace-fpregs-32: ptrace-fpregs.c
> 	$(CC) -o ptrace-fpregs-32 -Wall -O2 -m32 ptrace-fpregs.c
>
> ptrace-fpregs-64: ptrace-fpregs.c
> 	$(CC) -o ptrace-fpregs-64 -Wall -O2 ptrace-fpregs.c
>
> clean:
> 	rm -f ptrace-fpregs-32 ptrace-fpregs-64
> 8<--------------- ptrace-fpregs.c --------------------------------------
> #include <stdio.h>
> #include <errno.h>
>
> #define PT_FPR0	48
>
> #ifndef __x86_64
>
> #define PT_FPR31 (PT_FPR0 + 2*31)
> #define PT_FPSCR (PT_FPR0 + 2*32 + 1)
>
> #else
>
> #define PT_FPSCR (PT_FPR0 + 32)
>
> #endif
>
> int test_access(unsigned long addr)
> {
> 	int ret;
>
> 	do {
> 		unsigned long index, fpidx;
>
> 		ret = -EIO;
>
> 		/* convert to index and check */
> 		index = addr / sizeof(long);
> 		if ((addr & (sizeof(long) - 1)) || (index > PT_FPSCR))
> 			break;
>
> 		if (index < PT_FPR0) {
> 			ret = printf("ptrace_put_reg(%lu)", index);
> 			break;
> 		}
>
> 		ret = 0;
> #ifndef __x86_64
> 		if (index == PT_FPSCR - 1) {
> 			/* corner case for PPC32; do nothing */
> 			printf("corner_case");
> 			break;
> 		}
> #endif
> 		if (index == PT_FPSCR) {
> 			printf("fpscr");
> 			break;
> 		}
>
> 		/*
> 		 * FPR is always 64-bit; on PPC32, userspace does two 32-bit
> 		 * accesses. Add bit2 to allow accessing the upper half on
> 		 * 32-bit; on 64-bit, bit2 is always 0 (we validate it above).
> 		 */
> 		fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
> 		printf("TS_FPR[%lu] + %lu", fpidx, addr & 4);
> 		break;
> 	} while (0);
>
> 	return ret;
> }
>
> int main(void)
> {
> 	unsigned long addr;
> 	int rc;
>
> 	for (addr = 0; addr < PT_FPSCR * sizeof(long) + 16; addr++) {
> 		printf("0x%04lx: ", addr);
> 		rc = test_access(addr);
> 		if (rc < 0)
> 			printf("!err!");
> 		printf("\t<%d>\n", rc);
> 	}
>
> 	return 0;
> }
> 8<--------------- gdb.patch --------------------------------------------
> --- gdb/gdbserver/linux-low.c.orig	2019-06-10 11:45:53.810882669 -0400
> +++ gdb/gdbserver/linux-low.c	2019-06-10 11:49:32.272929766 -0400
> @@ -4262,6 +4262,8 @@ store_register (struct regcache *regcach
>    pid = lwpid_of (get_thread_lwp (current_inferior));
>    for (i = 0; i < size; i += sizeof (PTRACE_XFER_TYPE))
>      {
> +      printf("writing register #%d offset %d at address %#x\n",
> +             regno, i, (unsigned int)regaddr);
>        errno = 0;
>        ptrace (PTRACE_POKEUSER, pid,
>  	    /* Coerce to a uintptr_t first to avoid potential gcc warning
> 8<----------------------------------------------------------------------
>
> Radu Rendec (1):
>   PPC32: fix ptrace() access to FPU registers
>
>  arch/powerpc/kernel/ptrace.c | 85 ++++++++++++++++++++++--------------
>  1 file changed, 52 insertions(+), 33 deletions(-)
>
> -- 
> 2.20.1
Daniel Axtens June 17, 2019, 1:19 a.m. UTC | #2
Radu Rendec <radu.rendec@gmail.com> writes:

> Hi Everyone,
>
> I'm following up on the ptrace() problem that I reported a few days ago.
> I believe my version of the code handles all cases correctly. While the
> problem essentially boils down to dividing the fpidx by 2 on PPC32, it
> becomes tricky when the same code must work correctly on both PPC32 and
> PPC64.
>
> One other thing that I believe was handled incorrectly in the previous
> version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
> is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
> PT_FPR0 + 2*32 still corresponds to a possible address that userspace
> can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
> cause an invalid access to the FPU registers array.
>
> I tested the patch on 4.9.179, but that part of the code is identical in
> recent kernels so it should work just the same.
>
> I wrote a simple test program than can be used to quickly test (on an
> x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
> The code is included below.
>
> I also tested with gdbserver (test patch included below) and verified
> that it generates two ptrace() calls for each FPU register, with
> addresses between 0xc0 and 0x1bc.

Thanks for looking in to this. I can confirm your issue. What I'm
currently wondering is: what is the behaviour with a 32-bit userspace on
a 64-bit kernel? Should they also be going down the 32-bit path as far
as calculating offsets goes?

Regards,
Daniel

>
> 8<--------------- Makefile ---------------------------------------------
> .PHONY: all clean
>
> all: ptrace-fpregs-32 ptrace-fpregs-64
>
> ptrace-fpregs-32: ptrace-fpregs.c
> 	$(CC) -o ptrace-fpregs-32 -Wall -O2 -m32 ptrace-fpregs.c
>
> ptrace-fpregs-64: ptrace-fpregs.c
> 	$(CC) -o ptrace-fpregs-64 -Wall -O2 ptrace-fpregs.c
>
> clean:
> 	rm -f ptrace-fpregs-32 ptrace-fpregs-64
> 8<--------------- ptrace-fpregs.c --------------------------------------
> #include <stdio.h>
> #include <errno.h>
>
> #define PT_FPR0	48
>
> #ifndef __x86_64
>
> #define PT_FPR31 (PT_FPR0 + 2*31)
> #define PT_FPSCR (PT_FPR0 + 2*32 + 1)
>
> #else
>
> #define PT_FPSCR (PT_FPR0 + 32)
>
> #endif
>
> int test_access(unsigned long addr)
> {
> 	int ret;
>
> 	do {
> 		unsigned long index, fpidx;
>
> 		ret = -EIO;
>
> 		/* convert to index and check */
> 		index = addr / sizeof(long);
> 		if ((addr & (sizeof(long) - 1)) || (index > PT_FPSCR))
> 			break;
>
> 		if (index < PT_FPR0) {
> 			ret = printf("ptrace_put_reg(%lu)", index);
> 			break;
> 		}
>
> 		ret = 0;
> #ifndef __x86_64
> 		if (index == PT_FPSCR - 1) {
> 			/* corner case for PPC32; do nothing */
> 			printf("corner_case");
> 			break;
> 		}
> #endif
> 		if (index == PT_FPSCR) {
> 			printf("fpscr");
> 			break;
> 		}
>
> 		/*
> 		 * FPR is always 64-bit; on PPC32, userspace does two 32-bit
> 		 * accesses. Add bit2 to allow accessing the upper half on
> 		 * 32-bit; on 64-bit, bit2 is always 0 (we validate it above).
> 		 */
> 		fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
> 		printf("TS_FPR[%lu] + %lu", fpidx, addr & 4);
> 		break;
> 	} while (0);
>
> 	return ret;
> }
>
> int main(void)
> {
> 	unsigned long addr;
> 	int rc;
>
> 	for (addr = 0; addr < PT_FPSCR * sizeof(long) + 16; addr++) {
> 		printf("0x%04lx: ", addr);
> 		rc = test_access(addr);
> 		if (rc < 0)
> 			printf("!err!");
> 		printf("\t<%d>\n", rc);
> 	}
>
> 	return 0;
> }
> 8<--------------- gdb.patch --------------------------------------------
> --- gdb/gdbserver/linux-low.c.orig	2019-06-10 11:45:53.810882669 -0400
> +++ gdb/gdbserver/linux-low.c	2019-06-10 11:49:32.272929766 -0400
> @@ -4262,6 +4262,8 @@ store_register (struct regcache *regcach
>    pid = lwpid_of (get_thread_lwp (current_inferior));
>    for (i = 0; i < size; i += sizeof (PTRACE_XFER_TYPE))
>      {
> +      printf("writing register #%d offset %d at address %#x\n",
> +             regno, i, (unsigned int)regaddr);
>        errno = 0;
>        ptrace (PTRACE_POKEUSER, pid,
>  	    /* Coerce to a uintptr_t first to avoid potential gcc warning
> 8<----------------------------------------------------------------------
>
> Radu Rendec (1):
>   PPC32: fix ptrace() access to FPU registers
>
>  arch/powerpc/kernel/ptrace.c | 85 ++++++++++++++++++++++--------------
>  1 file changed, 52 insertions(+), 33 deletions(-)
>
> -- 
> 2.20.1
Radu Rendec June 17, 2019, 2:27 a.m. UTC | #3
On Mon, 2019-06-17 at 11:19 +1000, Daniel Axtens wrote:
> Radu Rendec <
> radu.rendec@gmail.com
> > writes:
> 
> > Hi Everyone,
> > 
> > I'm following up on the ptrace() problem that I reported a few days ago.
> > I believe my version of the code handles all cases correctly. While the
> > problem essentially boils down to dividing the fpidx by 2 on PPC32, it
> > becomes tricky when the same code must work correctly on both PPC32 and
> > PPC64.
> > 
> > One other thing that I believe was handled incorrectly in the previous
> > version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
> > is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
> > PT_FPR0 + 2*32 still corresponds to a possible address that userspace
> > can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
> > cause an invalid access to the FPU registers array.
> > 
> > I tested the patch on 4.9.179, but that part of the code is identical in
> > recent kernels so it should work just the same.
> > 
> > I wrote a simple test program than can be used to quickly test (on an
> > x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
> > The code is included below.
> > 
> > I also tested with gdbserver (test patch included below) and verified
> > that it generates two ptrace() calls for each FPU register, with
> > addresses between 0xc0 and 0x1bc.
> 
> Thanks for looking in to this. I can confirm your issue. What I'm
> currently wondering is: what is the behaviour with a 32-bit userspace on
> a 64-bit kernel? Should they also be going down the 32-bit path as far
> as calculating offsets goes?

Thanks for reviewing this. I haven't thought about the 32-bit userspace
on a 64-bit kernel, that is a very good question. Userspace passes a
pointer, so in theory it could go down either path as long as the
pointer points to the right data type.

I will go back to the gdb source code and try to figure out if that case
is handled in a special way. If not, it's probably safe to assume that a
32-bit userspace should always go down the 32-bit path regardless of the
kernel bitness (in which case I think I have to change my patch).

Best regards,
Radu

> > 8<--------------- Makefile ---------------------------------------------
> > .PHONY: all clean
> > 
> > all: ptrace-fpregs-32 ptrace-fpregs-64
> > 
> > ptrace-fpregs-32: ptrace-fpregs.c
> > 	$(CC) -o ptrace-fpregs-32 -Wall -O2 -m32 ptrace-fpregs.c
> > 
> > ptrace-fpregs-64: ptrace-fpregs.c
> > 	$(CC) -o ptrace-fpregs-64 -Wall -O2 ptrace-fpregs.c
> > 
> > clean:
> > 	rm -f ptrace-fpregs-32 ptrace-fpregs-64
> > 8<--------------- ptrace-fpregs.c --------------------------------------
> > #include <stdio.h>
> > #include <errno.h>
> > 
> > #define PT_FPR0	48
> > 
> > #ifndef __x86_64
> > 
> > #define PT_FPR31 (PT_FPR0 + 2*31)
> > #define PT_FPSCR (PT_FPR0 + 2*32 + 1)
> > 
> > #else
> > 
> > #define PT_FPSCR (PT_FPR0 + 32)
> > 
> > #endif
> > 
> > int test_access(unsigned long addr)
> > {
> > 	int ret;
> > 
> > 	do {
> > 		unsigned long index, fpidx;
> > 
> > 		ret = -EIO;
> > 
> > 		/* convert to index and check */
> > 		index = addr / sizeof(long);
> > 		if ((addr & (sizeof(long) - 1)) || (index > PT_FPSCR))
> > 			break;
> > 
> > 		if (index < PT_FPR0) {
> > 			ret = printf("ptrace_put_reg(%lu)", index);
> > 			break;
> > 		}
> > 
> > 		ret = 0;
> > #ifndef __x86_64
> > 		if (index == PT_FPSCR - 1) {
> > 			/* corner case for PPC32; do nothing */
> > 			printf("corner_case");
> > 			break;
> > 		}
> > #endif
> > 		if (index == PT_FPSCR) {
> > 			printf("fpscr");
> > 			break;
> > 		}
> > 
> > 		/*
> > 		 * FPR is always 64-bit; on PPC32, userspace does two 32-bit
> > 		 * accesses. Add bit2 to allow accessing the upper half on
> > 		 * 32-bit; on 64-bit, bit2 is always 0 (we validate it above).
> > 		 */
> > 		fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
> > 		printf("TS_FPR[%lu] + %lu", fpidx, addr & 4);
> > 		break;
> > 	} while (0);
> > 
> > 	return ret;
> > }
> > 
> > int main(void)
> > {
> > 	unsigned long addr;
> > 	int rc;
> > 
> > 	for (addr = 0; addr < PT_FPSCR * sizeof(long) + 16; addr++) {
> > 		printf("0x%04lx: ", addr);
> > 		rc = test_access(addr);
> > 		if (rc < 0)
> > 			printf("!err!");
> > 		printf("\t<%d>\n", rc);
> > 	}
> > 
> > 	return 0;
> > }
> > 8<--------------- gdb.patch --------------------------------------------
> > --- gdb/gdbserver/linux-low.c.orig	2019-06-10 11:45:53.810882669 -0400
> > +++ gdb/gdbserver/linux-low.c	2019-06-10 11:49:32.272929766 -0400
> > @@ -4262,6 +4262,8 @@ store_register (struct regcache *regcach
> >    pid = lwpid_of (get_thread_lwp (current_inferior));
> >    for (i = 0; i < size; i += sizeof (PTRACE_XFER_TYPE))
> >      {
> > +      printf("writing register #%d offset %d at address %#x\n",
> > +             regno, i, (unsigned int)regaddr);
> >        errno = 0;
> >        ptrace (PTRACE_POKEUSER, pid,
> >  	    /* Coerce to a uintptr_t first to avoid potential gcc warning
> > 8<----------------------------------------------------------------------
> > 
> > Radu Rendec (1):
> >   PPC32: fix ptrace() access to FPU registers
> > 
> >  arch/powerpc/kernel/ptrace.c | 85 ++++++++++++++++++++++--------------
> >  1 file changed, 52 insertions(+), 33 deletions(-)
> > 
> > -- 
> > 2.20.1
Daniel Axtens June 18, 2019, 6:42 a.m. UTC | #4
Radu Rendec <radu.rendec@gmail.com> writes:

> On Mon, 2019-06-17 at 11:19 +1000, Daniel Axtens wrote:
>> Radu Rendec <
>> radu.rendec@gmail.com
>> > writes:
>> 
>> > Hi Everyone,
>> > 
>> > I'm following up on the ptrace() problem that I reported a few days ago.
>> > I believe my version of the code handles all cases correctly. While the
>> > problem essentially boils down to dividing the fpidx by 2 on PPC32, it
>> > becomes tricky when the same code must work correctly on both PPC32 and
>> > PPC64.
>> > 
>> > One other thing that I believe was handled incorrectly in the previous
>> > version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
>> > is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
>> > PT_FPR0 + 2*32 still corresponds to a possible address that userspace
>> > can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
>> > cause an invalid access to the FPU registers array.
>> > 
>> > I tested the patch on 4.9.179, but that part of the code is identical in
>> > recent kernels so it should work just the same.
>> > 
>> > I wrote a simple test program than can be used to quickly test (on an
>> > x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
>> > The code is included below.
>> > 
>> > I also tested with gdbserver (test patch included below) and verified
>> > that it generates two ptrace() calls for each FPU register, with
>> > addresses between 0xc0 and 0x1bc.
>> 
>> Thanks for looking in to this. I can confirm your issue. What I'm
>> currently wondering is: what is the behaviour with a 32-bit userspace on
>> a 64-bit kernel? Should they also be going down the 32-bit path as far
>> as calculating offsets goes?
>
> Thanks for reviewing this. I haven't thought about the 32-bit userspace
> on a 64-bit kernel, that is a very good question. Userspace passes a
> pointer, so in theory it could go down either path as long as the
> pointer points to the right data type.
>
> I will go back to the gdb source code and try to figure out if that case
> is handled in a special way. If not, it's probably safe to assume that a
> 32-bit userspace should always go down the 32-bit path regardless of the
> kernel bitness (in which case I think I have to change my patch).

It doesn't seem to reproduce on a 64-bit kernel with 32-bit
userspace. Couldn't tell you why - if you can figure it out from the gdb
source code I'd love to know! I have, however, tried it - and all the fp
registers look correct and KASAN doesn't pick up any memory corruption.

Regards,
Daniel
>
> Best regards,
> Radu
>
>> > 8<--------------- Makefile ---------------------------------------------
>> > .PHONY: all clean
>> > 
>> > all: ptrace-fpregs-32 ptrace-fpregs-64
>> > 
>> > ptrace-fpregs-32: ptrace-fpregs.c
>> > 	$(CC) -o ptrace-fpregs-32 -Wall -O2 -m32 ptrace-fpregs.c
>> > 
>> > ptrace-fpregs-64: ptrace-fpregs.c
>> > 	$(CC) -o ptrace-fpregs-64 -Wall -O2 ptrace-fpregs.c
>> > 
>> > clean:
>> > 	rm -f ptrace-fpregs-32 ptrace-fpregs-64
>> > 8<--------------- ptrace-fpregs.c --------------------------------------
>> > #include <stdio.h>
>> > #include <errno.h>
>> > 
>> > #define PT_FPR0	48
>> > 
>> > #ifndef __x86_64
>> > 
>> > #define PT_FPR31 (PT_FPR0 + 2*31)
>> > #define PT_FPSCR (PT_FPR0 + 2*32 + 1)
>> > 
>> > #else
>> > 
>> > #define PT_FPSCR (PT_FPR0 + 32)
>> > 
>> > #endif
>> > 
>> > int test_access(unsigned long addr)
>> > {
>> > 	int ret;
>> > 
>> > 	do {
>> > 		unsigned long index, fpidx;
>> > 
>> > 		ret = -EIO;
>> > 
>> > 		/* convert to index and check */
>> > 		index = addr / sizeof(long);
>> > 		if ((addr & (sizeof(long) - 1)) || (index > PT_FPSCR))
>> > 			break;
>> > 
>> > 		if (index < PT_FPR0) {
>> > 			ret = printf("ptrace_put_reg(%lu)", index);
>> > 			break;
>> > 		}
>> > 
>> > 		ret = 0;
>> > #ifndef __x86_64
>> > 		if (index == PT_FPSCR - 1) {
>> > 			/* corner case for PPC32; do nothing */
>> > 			printf("corner_case");
>> > 			break;
>> > 		}
>> > #endif
>> > 		if (index == PT_FPSCR) {
>> > 			printf("fpscr");
>> > 			break;
>> > 		}
>> > 
>> > 		/*
>> > 		 * FPR is always 64-bit; on PPC32, userspace does two 32-bit
>> > 		 * accesses. Add bit2 to allow accessing the upper half on
>> > 		 * 32-bit; on 64-bit, bit2 is always 0 (we validate it above).
>> > 		 */
>> > 		fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
>> > 		printf("TS_FPR[%lu] + %lu", fpidx, addr & 4);
>> > 		break;
>> > 	} while (0);
>> > 
>> > 	return ret;
>> > }
>> > 
>> > int main(void)
>> > {
>> > 	unsigned long addr;
>> > 	int rc;
>> > 
>> > 	for (addr = 0; addr < PT_FPSCR * sizeof(long) + 16; addr++) {
>> > 		printf("0x%04lx: ", addr);
>> > 		rc = test_access(addr);
>> > 		if (rc < 0)
>> > 			printf("!err!");
>> > 		printf("\t<%d>\n", rc);
>> > 	}
>> > 
>> > 	return 0;
>> > }
>> > 8<--------------- gdb.patch --------------------------------------------
>> > --- gdb/gdbserver/linux-low.c.orig	2019-06-10 11:45:53.810882669 -0400
>> > +++ gdb/gdbserver/linux-low.c	2019-06-10 11:49:32.272929766 -0400
>> > @@ -4262,6 +4262,8 @@ store_register (struct regcache *regcach
>> >    pid = lwpid_of (get_thread_lwp (current_inferior));
>> >    for (i = 0; i < size; i += sizeof (PTRACE_XFER_TYPE))
>> >      {
>> > +      printf("writing register #%d offset %d at address %#x\n",
>> > +             regno, i, (unsigned int)regaddr);
>> >        errno = 0;
>> >        ptrace (PTRACE_POKEUSER, pid,
>> >  	    /* Coerce to a uintptr_t first to avoid potential gcc warning
>> > 8<----------------------------------------------------------------------
>> > 
>> > Radu Rendec (1):
>> >   PPC32: fix ptrace() access to FPU registers
>> > 
>> >  arch/powerpc/kernel/ptrace.c | 85 ++++++++++++++++++++++--------------
>> >  1 file changed, 52 insertions(+), 33 deletions(-)
>> > 
>> > -- 
>> > 2.20.1
Radu Rendec June 18, 2019, 12:16 p.m. UTC | #5
On Tue, 2019-06-18 at 16:42 +1000, Daniel Axtens wrote:
> Radu Rendec <
> radu.rendec@gmail.com
> > writes:
> 
> > On Mon, 2019-06-17 at 11:19 +1000, Daniel Axtens wrote:
> > > Radu Rendec <
> > > radu.rendec@gmail.com
> > > 
> > > > writes:
> > > > Hi Everyone,
> > > > 
> > > > I'm following up on the ptrace() problem that I reported a few days ago.
> > > > I believe my version of the code handles all cases correctly. While the
> > > > problem essentially boils down to dividing the fpidx by 2 on PPC32, it
> > > > becomes tricky when the same code must work correctly on both PPC32 and
> > > > PPC64.
> > > > 
> > > > One other thing that I believe was handled incorrectly in the previous
> > > > version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
> > > > is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
> > > > PT_FPR0 + 2*32 still corresponds to a possible address that userspace
> > > > can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
> > > > cause an invalid access to the FPU registers array.
> > > > 
> > > > I tested the patch on 4.9.179, but that part of the code is identical in
> > > > recent kernels so it should work just the same.
> > > > 
> > > > I wrote a simple test program than can be used to quickly test (on an
> > > > x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
> > > > The code is included below.
> > > > 
> > > > I also tested with gdbserver (test patch included below) and verified
> > > > that it generates two ptrace() calls for each FPU register, with
> > > > addresses between 0xc0 and 0x1bc.
> > > 
> > > Thanks for looking in to this. I can confirm your issue. What I'm
> > > currently wondering is: what is the behaviour with a 32-bit userspace on
> > > a 64-bit kernel? Should they also be going down the 32-bit path as far
> > > as calculating offsets goes?
> > 
> > Thanks for reviewing this. I haven't thought about the 32-bit userspace
> > on a 64-bit kernel, that is a very good question. Userspace passes a
> > pointer, so in theory it could go down either path as long as the
> > pointer points to the right data type.
> > 
> > I will go back to the gdb source code and try to figure out if that case
> > is handled in a special way. If not, it's probably safe to assume that a
> > 32-bit userspace should always go down the 32-bit path regardless of the
> > kernel bitness (in which case I think I have to change my patch).
> 
> It doesn't seem to reproduce on a 64-bit kernel with 32-bit
> userspace. Couldn't tell you why - if you can figure it out from the gdb
> source code I'd love to know! I have, however, tried it - and all the fp
> registers look correct and KASAN doesn't pick up any memory corruption.

I looked at the gdb source code and all it seems to care about is the
architecture it was compiled for. In other words, 32-bit gdb always
assumes 32-bit register layout, regardless of whether it's running on a
32-bit or 64-bit kernel.

So it's no surprise that the problem didn't reproduce and KASAN didn't
pick up anything in your experiment. Since the kernel is 64-bit, it
divides addresses by 8, so all indexes are within bounds. The part that
I don't get is how the FP registers looked correct.

Since you already have a working setup, it would be nice if you could
add a printk to arch_ptrace() to print the address and confirm what I
believe happens (by reading the gdb source code).

So for 32-bit gdb on 64-bit kernel, I think gdb will generate 48 x 4-
byte aligned addresses for the CPU registers, then 32 x 2 x 4-byte
aligned addresses for the FPU registers. The indexes will not overflow,
but access to all registers (CPU and FPU) will be broken because:
 * The kernel always divides by 8. The CPU register address range that
   gdb generates will be half of what the kernel expects and "odd" (i.e.
   non 8-byte aligned) addresses will fail with EIO because the kernel
   code checks that the last 3 bits are all zero.
 * The FPU register indexes will be off by 24. For example, when gdb
   thinks it asks for FPR0, it calculates the address as 4 x 48, but the
   kernel divides by 8, so it will get index 24, which it thinks is a
   CPU register.

When it stops on a breakpoint, gdb seems to read all registers (both CPU
and FPU) in ascending index order. So if you print the address in the
kernel it's actually very easy to verify my theory. I expect addresses
from 0 to 48 x 4 in increments of 4 (for the CPU registers), then
addresses from 48 x 4 to 48 x 4 + 32 x 2 x 4 in increments of 4 (for the
FPU registers).

Best regards,
Radu
Andreas Schwab June 18, 2019, 6:09 p.m. UTC | #6
On Jun 18 2019, Radu Rendec <radu.rendec@gmail.com> wrote:

> Since you already have a working setup, it would be nice if you could
> add a printk to arch_ptrace() to print the address and confirm what I
> believe happens (by reading the gdb source code).

A ppc32 ptrace syscall goes through compat_arch_ptrace.

Andreas.
Daniel Axtens June 19, 2019, 12:36 a.m. UTC | #7
Andreas Schwab <schwab@linux-m68k.org> writes:

> On Jun 18 2019, Radu Rendec <radu.rendec@gmail.com> wrote:
>
>> Since you already have a working setup, it would be nice if you could
>> add a printk to arch_ptrace() to print the address and confirm what I
>> believe happens (by reading the gdb source code).
>
> A ppc32 ptrace syscall goes through compat_arch_ptrace.


Ah right, and that (in ptrace32.c) contains code that will work:


			/*
			 * the user space code considers the floating point
			 * to be an array of unsigned int (32 bits) - the
			 * index passed in is based on this assumption.
			 */
			tmp = ((unsigned int *)child->thread.fp_state.fpr)
				[FPRINDEX(index)];

FPRINDEX is defined above to deal with the various manipulations you
need to do.

Radu: I think we want to copy that working code back into ptrace.c. The
challenge will be unpicking the awful mess of ifdefs in ptrace.c and
making it somewhat more comprehensible.

Regards,
Daniel

>
> Andreas.
>
> -- 
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."
Radu Rendec June 19, 2019, 12:57 p.m. UTC | #8
On Wed, 2019-06-19 at 10:36 +1000, Daniel Axtens wrote:
> Andreas Schwab <
> schwab@linux-m68k.org
> > writes:
> 
> > On Jun 18 2019, Radu Rendec <
> > radu.rendec@gmail.com
> > > wrote:
> > 
> > > Since you already have a working setup, it would be nice if you could
> > > add a printk to arch_ptrace() to print the address and confirm what I
> > > believe happens (by reading the gdb source code).
> > 
> > A ppc32 ptrace syscall goes through compat_arch_ptrace.

Right. I completely overlooked that part.

> Ah right, and that (in ptrace32.c) contains code that will work:
> 
> 
> 			/*
> 			 * the user space code considers the floating point
> 			 * to be an array of unsigned int (32 bits) - the
> 			 * index passed in is based on this assumption.
> 			 */
> 			tmp = ((unsigned int *)child->thread.fp_state.fpr)
> 				[FPRINDEX(index)];
> 
> FPRINDEX is defined above to deal with the various manipulations you
> need to do.

Correct. Basically it does the same that I did in my patch: it divides
the index again by 2 (it's already divided by 4 in compat_arch_ptrace()
so it ends up divided by 8), then takes the least significant bit and
adds it to the index. I take bit 2 of the original address, which is the
same thing (because in FPRHALF() the address is already divided by 4).

So we have this in ptrace32.c:

#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
#define FPRHALF(i) (((i) - PT_FPR0) & 1)
#define FPRINDEX(i) TS_FPRWIDTH * FPRNUMBER(i) * 2 + FPRHALF(i)

index = (unsigned long) addr >> 2;
(unsigned int *)child->thread.fp_state.fpr)[FPRINDEX(index)]


And we have this in my patch:

fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
(void *)&child->thread.TS_FPR(fpidx) + (addr & 4)

> Radu: I think we want to copy that working code back into ptrace.c. 

I'm not sure that would work. There's a subtle difference: the code in
ptrace32.c is always compiled on a 64-bit kernel and the user space
calling it is always 32-bit; on the other hand, the code in ptrace.c can
be compiled on either a 64-bit kernel or a 32-bit kernel and the user
space calling it always has the same "bitness" as the kernel.

One difference is the size of the CPU registers. On 64-bit they are 8
byte long and user space knows that and generates 8-byte aligned
addresses. So you have to divide the address by 8 to calculate the CPU
register index correctly, which compat_arch_ptrace() currently doesn't.

Another difference is that on 64-bit `long` is 8 bytes, so user space
can read a whole FPU register in a single ptrace call. 

Now that we are all aware of compat_arch_ptrace() (which handles the
special case of a 32-bit process running on a 64-bit kernel) I would say
the patch is correct and does the right thing for both 32-bit and 64-bit 
kernels and processes.

> The challenge will be unpicking the awful mess of ifdefs in ptrace.c
> and making it somewhat more comprehensible.

I'm not sure what ifdefs you're thinking about. The only that are used
inside arch_ptrace() are PT_FPR0, PT_FPSCR and TS_FPR, which seem to be
correct.

But perhaps it would be useful to change my patch and add a comment just
before arch_ptrace() that explains how the math is done and that the
code must work on both 32-bit and 64-bit, the user space address
assumptions, etc.

By the way, I'm not sure the code in compat_arch_ptrace() handles
PT_FPSCR correctly. It might (just because fpscr is right next to fpr[]
in memory - and that's a hack), but I can't figure out if it accesses
the right half.

Radu
Christophe Leroy June 11, 2021, 6:02 a.m. UTC | #9
Le 19/06/2019 à 14:57, Radu Rendec a écrit :
> On Wed, 2019-06-19 at 10:36 +1000, Daniel Axtens wrote:
>> Andreas Schwab <
>> schwab@linux-m68k.org
>>> writes:
>>
>>> On Jun 18 2019, Radu Rendec <
>>> radu.rendec@gmail.com
>>>> wrote:
>>>
>>>> Since you already have a working setup, it would be nice if you could
>>>> add a printk to arch_ptrace() to print the address and confirm what I
>>>> believe happens (by reading the gdb source code).
>>>
>>> A ppc32 ptrace syscall goes through compat_arch_ptrace.
> 
> Right. I completely overlooked that part.
> 
>> Ah right, and that (in ptrace32.c) contains code that will work:
>>
>>
>> 			/*
>> 			 * the user space code considers the floating point
>> 			 * to be an array of unsigned int (32 bits) - the
>> 			 * index passed in is based on this assumption.
>> 			 */
>> 			tmp = ((unsigned int *)child->thread.fp_state.fpr)
>> 				[FPRINDEX(index)];
>>
>> FPRINDEX is defined above to deal with the various manipulations you
>> need to do.
> 
> Correct. Basically it does the same that I did in my patch: it divides
> the index again by 2 (it's already divided by 4 in compat_arch_ptrace()
> so it ends up divided by 8), then takes the least significant bit and
> adds it to the index. I take bit 2 of the original address, which is the
> same thing (because in FPRHALF() the address is already divided by 4).
> 
> So we have this in ptrace32.c:
> 
> #define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
> #define FPRHALF(i) (((i) - PT_FPR0) & 1)
> #define FPRINDEX(i) TS_FPRWIDTH * FPRNUMBER(i) * 2 + FPRHALF(i)
> 
> index = (unsigned long) addr >> 2;
> (unsigned int *)child->thread.fp_state.fpr)[FPRINDEX(index)]
> 
> 
> And we have this in my patch:
> 
> fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
> (void *)&child->thread.TS_FPR(fpidx) + (addr & 4)
> 
>> Radu: I think we want to copy that working code back into ptrace.c.
> 
> I'm not sure that would work. There's a subtle difference: the code in
> ptrace32.c is always compiled on a 64-bit kernel and the user space
> calling it is always 32-bit; on the other hand, the code in ptrace.c can
> be compiled on either a 64-bit kernel or a 32-bit kernel and the user
> space calling it always has the same "bitness" as the kernel.
> 
> One difference is the size of the CPU registers. On 64-bit they are 8
> byte long and user space knows that and generates 8-byte aligned
> addresses. So you have to divide the address by 8 to calculate the CPU
> register index correctly, which compat_arch_ptrace() currently doesn't.
> 
> Another difference is that on 64-bit `long` is 8 bytes, so user space
> can read a whole FPU register in a single ptrace call.
> 
> Now that we are all aware of compat_arch_ptrace() (which handles the
> special case of a 32-bit process running on a 64-bit kernel) I would say
> the patch is correct and does the right thing for both 32-bit and 64-bit
> kernels and processes.
> 
>> The challenge will be unpicking the awful mess of ifdefs in ptrace.c
>> and making it somewhat more comprehensible.
> 
> I'm not sure what ifdefs you're thinking about. The only that are used
> inside arch_ptrace() are PT_FPR0, PT_FPSCR and TS_FPR, which seem to be
> correct.
> 
> But perhaps it would be useful to change my patch and add a comment just
> before arch_ptrace() that explains how the math is done and that the
> code must work on both 32-bit and 64-bit, the user space address
> assumptions, etc.
> 
> By the way, I'm not sure the code in compat_arch_ptrace() handles
> PT_FPSCR correctly. It might (just because fpscr is right next to fpr[]
> in memory - and that's a hack), but I can't figure out if it accesses
> the right half.
> 


Does the issue still exists ? If yes, the patch has to be rebased.
Radu Rendec June 11, 2021, 2:37 p.m. UTC | #10
On Fri, 2021-06-11 at 08:02 +0200, Christophe Leroy wrote:
>Le 19/06/2019 à 14:57, Radu Rendec a écrit :
>> On Wed, 2019-06-19 at 10:36 +1000, Daniel Axtens wrote:
>>> Andreas Schwab <
>>> schwab@linux-m68k.org
>>>> writes:
>>>
>>>> On Jun 18 2019, Radu Rendec <
>>>> radu.rendec@gmail.com
>>>>> wrote:
>>>>
>>>>> Since you already have a working setup, it would be nice if you could
>>>>> add a printk to arch_ptrace() to print the address and confirm what I
>>>>> believe happens (by reading the gdb source code).
>>>>
>>>> A ppc32 ptrace syscall goes through compat_arch_ptrace.
>>
>> Right. I completely overlooked that part.
>>
>>> Ah right, and that (in ptrace32.c) contains code that will work:
>>>
>>>
>>> 			/*
>>> 			 * the user space code considers the floating point
>>> 			 * to be an array of unsigned int (32 bits) - the
>>> 			 * index passed in is based on this assumption.
>>> 			 */
>>> 			tmp = ((unsigned int *)child->thread.fp_state.fpr)
>>> 				[FPRINDEX(index)];
>>>
>>> FPRINDEX is defined above to deal with the various manipulations you
>>> need to do.
>>
>> Correct. Basically it does the same that I did in my patch: it divides
>> the index again by 2 (it's already divided by 4 in compat_arch_ptrace()
>> so it ends up divided by 8), then takes the least significant bit and
>> adds it to the index. I take bit 2 of the original address, which is the
>> same thing (because in FPRHALF() the address is already divided by 4).
>>
>> So we have this in ptrace32.c:
>>
>> #define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
>> #define FPRHALF(i) (((i) - PT_FPR0) & 1)
>> #define FPRINDEX(i) TS_FPRWIDTH * FPRNUMBER(i) * 2 + FPRHALF(i)
>>
>> index = (unsigned long) addr >> 2;
>> (unsigned int *)child->thread.fp_state.fpr)[FPRINDEX(index)]
>>
>>
>> And we have this in my patch:
>>
>> fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
>> (void *)&child->thread.TS_FPR(fpidx) + (addr & 4)
>>
>>> Radu: I think we want to copy that working code back into ptrace.c.
>>
>> I'm not sure that would work. There's a subtle difference: the code in
>> ptrace32.c is always compiled on a 64-bit kernel and the user space
>> calling it is always 32-bit; on the other hand, the code in ptrace.c can
>> be compiled on either a 64-bit kernel or a 32-bit kernel and the user
>> space calling it always has the same "bitness" as the kernel.
>>
>> One difference is the size of the CPU registers. On 64-bit they are 8
>> byte long and user space knows that and generates 8-byte aligned
>> addresses. So you have to divide the address by 8 to calculate the CPU
>> register index correctly, which compat_arch_ptrace() currently doesn't.
>>
>> Another difference is that on 64-bit `long` is 8 bytes, so user space
>> can read a whole FPU register in a single ptrace call.
>>
>> Now that we are all aware of compat_arch_ptrace() (which handles the
>> special case of a 32-bit process running on a 64-bit kernel) I would say
>> the patch is correct and does the right thing for both 32-bit and 64-bit
>> kernels and processes.
>>
>>> The challenge will be unpicking the awful mess of ifdefs in ptrace.c
>>> and making it somewhat more comprehensible.
>>
>> I'm not sure what ifdefs you're thinking about. The only that are used
>> inside arch_ptrace() are PT_FPR0, PT_FPSCR and TS_FPR, which seem to be
>> correct.
>>
>> But perhaps it would be useful to change my patch and add a comment just
>> before arch_ptrace() that explains how the math is done and that the
>> code must work on both 32-bit and 64-bit, the user space address
>> assumptions, etc.
>>
>> By the way, I'm not sure the code in compat_arch_ptrace() handles
>> PT_FPSCR correctly. It might (just because fpscr is right next to fpr[]
>> in memory - and that's a hack), but I can't figure out if it accesses
>> the right half.
>>
>
>Does the issue still exists ? If yes, the patch has to be rebased.

Hard to say. I'm still using 4.9 (stable) on the systems that I created
the patch for. I tried to rebase, and the patch no longer applies. It
looks like there have been some changes around that area, notably your
commit e009fa433542, so it could actually be fixed now.

It's been exactly two years since I sent the patch and I don't remember
all the details. I will have to go back and look. Also, running a recent
kernel on my PPC32 systems is not an option because there are a bunch of
custom patches that would have to be ported. I will try in a VM and get
back to you, hopefully early next week.

Best regards,
Radu
Radu Rendec July 18, 2021, 6:07 p.m. UTC | #11
On Fri, 2021-06-11 at 10:37 -0400, Radu Rendec wrote:
>On Fri, 2021-06-11 at 08:02 +0200, Christophe Leroy wrote:
>>Le 19/06/2019 à 14:57, Radu Rendec a écrit :
>>> On Wed, 2019-06-19 at 10:36 +1000, Daniel Axtens wrote:
>>>> Andreas Schwab <
>>>> schwab@linux-m68k.org
>>>>> writes:
>>>>
>>>>> On Jun 18 2019, Radu Rendec <
>>>>> radu.rendec@gmail.com
>>>>>> wrote:
>>>>>
>>>>>> Since you already have a working setup, it would be nice if you could
>>>>>> add a printk to arch_ptrace() to print the address and confirm what I
>>>>>> believe happens (by reading the gdb source code).
>>>>>
>>>>> A ppc32 ptrace syscall goes through compat_arch_ptrace.
>>>
>>> Right. I completely overlooked that part.
>>>
>>>> Ah right, and that (in ptrace32.c) contains code that will work:
>>>>
>>>>
>>>> 			/*
>>>> 			 * the user space code considers the floating point
>>>> 			 * to be an array of unsigned int (32 bits) - the
>>>> 			 * index passed in is based on this assumption.
>>>> 			 */
>>>> 			tmp = ((unsigned int *)child->thread.fp_state.fpr)
>>>> 				[FPRINDEX(index)];
>>>>
>>>> FPRINDEX is defined above to deal with the various manipulations you
>>>> need to do.
>>>
>>> Correct. Basically it does the same that I did in my patch: it divides
>>> the index again by 2 (it's already divided by 4 in compat_arch_ptrace()
>>> so it ends up divided by 8), then takes the least significant bit and
>>> adds it to the index. I take bit 2 of the original address, which is the
>>> same thing (because in FPRHALF() the address is already divided by 4).
>>>
>>> So we have this in ptrace32.c:
>>>
>>> #define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
>>> #define FPRHALF(i) (((i) - PT_FPR0) & 1)
>>> #define FPRINDEX(i) TS_FPRWIDTH * FPRNUMBER(i) * 2 + FPRHALF(i)
>>>
>>> index = (unsigned long) addr >> 2;
>>> (unsigned int *)child->thread.fp_state.fpr)[FPRINDEX(index)]
>>>
>>>
>>> And we have this in my patch:
>>>
>>> fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
>>> (void *)&child->thread.TS_FPR(fpidx) + (addr & 4)
>>>
>>>> Radu: I think we want to copy that working code back into ptrace.c.
>>>
>>> I'm not sure that would work. There's a subtle difference: the code in
>>> ptrace32.c is always compiled on a 64-bit kernel and the user space
>>> calling it is always 32-bit; on the other hand, the code in ptrace.c can
>>> be compiled on either a 64-bit kernel or a 32-bit kernel and the user
>>> space calling it always has the same "bitness" as the kernel.
>>>
>>> One difference is the size of the CPU registers. On 64-bit they are 8
>>> byte long and user space knows that and generates 8-byte aligned
>>> addresses. So you have to divide the address by 8 to calculate the CPU
>>> register index correctly, which compat_arch_ptrace() currently doesn't.
>>>
>>> Another difference is that on 64-bit `long` is 8 bytes, so user space
>>> can read a whole FPU register in a single ptrace call.
>>>
>>> Now that we are all aware of compat_arch_ptrace() (which handles the
>>> special case of a 32-bit process running on a 64-bit kernel) I would say
>>> the patch is correct and does the right thing for both 32-bit and 64-bit
>>> kernels and processes.
>>>
>>>> The challenge will be unpicking the awful mess of ifdefs in ptrace.c
>>>> and making it somewhat more comprehensible.
>>>
>>> I'm not sure what ifdefs you're thinking about. The only that are used
>>> inside arch_ptrace() are PT_FPR0, PT_FPSCR and TS_FPR, which seem to be
>>> correct.
>>>
>>> But perhaps it would be useful to change my patch and add a comment just
>>> before arch_ptrace() that explains how the math is done and that the
>>> code must work on both 32-bit and 64-bit, the user space address
>>> assumptions, etc.
>>>
>>> By the way, I'm not sure the code in compat_arch_ptrace() handles
>>> PT_FPSCR correctly. It might (just because fpscr is right next to fpr[]
>>> in memory - and that's a hack), but I can't figure out if it accesses
>>> the right half.
>>>
>>
>>Does the issue still exists ? If yes, the patch has to be rebased.
>
>Hard to say. I'm still using 4.9 (stable) on the systems that I created
>the patch for. I tried to rebase, and the patch no longer applies. It
>looks like there have been some changes around that area, notably your
>commit e009fa433542, so it could actually be fixed now.
>
>It's been exactly two years since I sent the patch and I don't remember
>all the details. I will have to go back and look. Also, running a recent
>kernel on my PPC32 systems is not an option because there are a bunch of
>custom patches that would have to be ported. I will try in a VM and get
>back to you, hopefully early next week.

I finally had time to test everything properly. I can now confirm that
the original problem no longer exists, so the patch doesn't need to be
rebased.

I tested all three variants: 32-bit program on 32-bit kernel, 32-bit
program on 64-bit kernel and 64-bit program on 64-bit kernel.

Best regards,
Radu
diff mbox

Patch

--- gdb/gdbserver/linux-low.c.orig	2019-06-10 11:45:53.810882669 -0400
+++ gdb/gdbserver/linux-low.c	2019-06-10 11:49:32.272929766 -0400
@@ -4262,6 +4262,8 @@  store_register (struct regcache *regcach
   pid = lwpid_of (get_thread_lwp (current_inferior));
   for (i = 0; i < size; i += sizeof (PTRACE_XFER_TYPE))
     {
+      printf("writing register #%d offset %d at address %#x\n",
+             regno, i, (unsigned int)regaddr);
       errno = 0;
       ptrace (PTRACE_POKEUSER, pid,
 	    /* Coerce to a uintptr_t first to avoid potential gcc warning