diff mbox series

The patch solves the type error of the parameter “off” in syscall mmap on the ARM64 platform.

Message ID 1555504525-17962-1-git-send-email-zhouby_cn@126.com
State New
Headers show
Series The patch solves the type error of the parameter “off” in syscall mmap on the ARM64 platform. | expand

Commit Message

Boyang Zhou April 17, 2019, 12:35 p.m. UTC
The error information is that “offset value too large for defined data type”.
Reason:
On the X86 platform, the data type of “off" is unsigned long; but on the ARM64 platform, the data type is defined as off_t, and off_t is by type long instead of unsigned long.
When the off right shifts in the function “sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT)"on ARM64, high address of off is filled with sign bit 1instead of 0.
In our case, we mmap GPU doorbell on both platform. On the x86 platform, the value of off is f009c00000000000, after shift the value becomes f009c00000000; while on the ARM64, the value of off changes from ed35c00000000000 to fffed35c00000000. This value is treated as unsigned long in later functions. So it is too big for off and the error happened.
We have tested the patchs in Huawei ARM64 server with a couples of AMD GPUs.

Signed-off-by: Boyang Zhou <zhouby_cn@126.com>
---
 arch/arm64/kernel/sys.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Mark Rutland April 18, 2019, 11:28 a.m. UTC | #1
[adding linux-arch and relevant folk]

On Wed, Apr 17, 2019 at 08:35:25PM +0800, Boyang Zhou wrote:
> The error information is that “offset value too large for defined data type”.
> Reason:
> On the X86 platform, the data type of “off" is unsigned long; but on the ARM64 platform, the data type is defined as off_t, and off_t is by type long instead of unsigned long.
> When the off right shifts in the function “sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT)"on ARM64, high address of off is filled with sign bit 1instead of 0.
> In our case, we mmap GPU doorbell on both platform. On the x86 platform, the value of off is f009c00000000000, after shift the value becomes f009c00000000; while on the ARM64, the value of off changes from ed35c00000000000 to fffed35c00000000. This value is treated as unsigned long in later functions. So it is too big for off and the error happened.
> We have tested the patchs in Huawei ARM64 server with a couples of AMD GPUs.

It looks like the generic mmap uses unsigned long, as do sparc and x86.

However, arm64, microblase, powerpc and riscv all use off_t.

Should those all be using unsigned long? If so, that seems like it
should be a treewide cleanup.

Similar applies to pgoff for mmap2.

Thanks,
Mark.

> 
> Signed-off-by: Boyang Zhou <zhouby_cn@126.com>
> ---
>  arch/arm64/kernel/sys.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
> index b44065f..6f91e81 100644
> --- a/arch/arm64/kernel/sys.c
> +++ b/arch/arm64/kernel/sys.c
> @@ -31,7 +31,7 @@
>  
>  SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
>  		unsigned long, prot, unsigned long, flags,
> -		unsigned long, fd, off_t, off)
> +		unsigned long, fd, unsigned long, off)
>  {
>  	if (offset_in_page(off) != 0)
>  		return -EINVAL;
> -- 
> 2.7.4
>
Will Deacon April 23, 2019, 5:04 p.m. UTC | #2
On Thu, Apr 18, 2019 at 12:28:18PM +0100, Mark Rutland wrote:
> [adding linux-arch and relevant folk]
> 
> On Wed, Apr 17, 2019 at 08:35:25PM +0800, Boyang Zhou wrote:
> > The error information is that “offset value too large for defined data type”.
> > Reason:
> > On the X86 platform, the data type of “off" is unsigned long; but on the ARM64 platform, the data type is defined as off_t, and off_t is by type long instead of unsigned long.
> > When the off right shifts in the function “sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT)"on ARM64, high address of off is filled with sign bit 1instead of 0.
> > In our case, we mmap GPU doorbell on both platform. On the x86 platform, the value of off is f009c00000000000, after shift the value becomes f009c00000000; while on the ARM64, the value of off changes from ed35c00000000000 to fffed35c00000000. This value is treated as unsigned long in later functions. So it is too big for off and the error happened.
> > We have tested the patchs in Huawei ARM64 server with a couples of AMD GPUs.
> 
> It looks like the generic mmap uses unsigned long, as do sparc and x86.
> 
> However, arm64, microblase, powerpc and riscv all use off_t.
> 
> Should those all be using unsigned long? If so, that seems like it
> should be a treewide cleanup.

This is more than just a cleanup, since it changes the behaviour of the
system call and I'd much rather each architecture made this choice
themselves. I also don't think we should change the type of the parameter,
so something like the following instead:

diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
index b44065fb1616..77dc5f006bc4 100644
--- a/arch/arm64/kernel/sys.c
+++ b/arch/arm64/kernel/sys.c
@@ -31,8 +31,10 @@
 
 SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
 		unsigned long, prot, unsigned long, flags,
-		unsigned long, fd, off_t, off)
+		unsigned long, fd, off_t, pgoff)
 {
+	unsigned long off = pgoff;
+
 	if (offset_in_page(off) != 0)
 		return -EINVAL;
 

> Similar applies to pgoff for mmap2.

Does it? Looks like unsigned long is used consistently there afaict. Did
you spot something I missed?

Will
Mark Rutland April 23, 2019, 5:20 p.m. UTC | #3
On Tue, Apr 23, 2019 at 06:04:45PM +0100, Will Deacon wrote:
> On Thu, Apr 18, 2019 at 12:28:18PM +0100, Mark Rutland wrote:
> > [adding linux-arch and relevant folk]
> > 
> > On Wed, Apr 17, 2019 at 08:35:25PM +0800, Boyang Zhou wrote:
> > > The error information is that “offset value too large for defined data type”.
> > > Reason:
> > > On the X86 platform, the data type of “off" is unsigned long; but on the ARM64 platform, the data type is defined as off_t, and off_t is by type long instead of unsigned long.
> > > When the off right shifts in the function “sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT)"on ARM64, high address of off is filled with sign bit 1instead of 0.
> > > In our case, we mmap GPU doorbell on both platform. On the x86 platform, the value of off is f009c00000000000, after shift the value becomes f009c00000000; while on the ARM64, the value of off changes from ed35c00000000000 to fffed35c00000000. This value is treated as unsigned long in later functions. So it is too big for off and the error happened.
> > > We have tested the patchs in Huawei ARM64 server with a couples of AMD GPUs.
> > 
> > It looks like the generic mmap uses unsigned long, as do sparc and x86.
> > 
> > However, arm64, microblase, powerpc and riscv all use off_t.
> > 
> > Should those all be using unsigned long? If so, that seems like it
> > should be a treewide cleanup.
> 
> This is more than just a cleanup, since it changes the behaviour of the
> system call and I'd much rather each architecture made this choice
> themselves. I also don't think we should change the type of the parameter,
> so something like the following instead:
> 
> diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
> index b44065fb1616..77dc5f006bc4 100644
> --- a/arch/arm64/kernel/sys.c
> +++ b/arch/arm64/kernel/sys.c
> @@ -31,8 +31,10 @@
>  
>  SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
>  		unsigned long, prot, unsigned long, flags,
> -		unsigned long, fd, off_t, off)
> +		unsigned long, fd, off_t, pgoff)
>  {
> +	unsigned long off = pgoff;
> +
>  	if (offset_in_page(off) != 0)
>  		return -EINVAL;

To be honest, I think changing the parameter itself would be cleaner,
but I'm not the maintainer. :)

> > Similar applies to pgoff for mmap2.
> 
> Does it? Looks like unsigned long is used consistently there afaict. Did
> you spot something I missed?

Sorry, I meant tree-wide where csky and riscv use off_t, and all others
use unsigned long.

Thanks,
Mark.
David Howells April 24, 2019, 8:31 a.m. UTC | #4
Will Deacon <will.deacon@arm.com> wrote:

> -		unsigned long, fd, off_t, off)
> +		unsigned long, fd, off_t, pgoff)
>  {
> +	unsigned long off = pgoff;
> +

Would loff_t be better?

David
Will Deacon April 24, 2019, 1:50 p.m. UTC | #5
On Wed, Apr 24, 2019 at 09:31:28AM +0100, David Howells wrote:
> Will Deacon <will.deacon@arm.com> wrote:
> 
> > -		unsigned long, fd, off_t, off)
> > +		unsigned long, fd, off_t, pgoff)
> >  {
> > +	unsigned long off = pgoff;
> > +
> 
> Would loff_t be better?

I don't think so, isn't that still signed? I also think it's the same
size as off_t on arm64 (long long vs long).

Will
diff mbox series

Patch

diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
index b44065f..6f91e81 100644
--- a/arch/arm64/kernel/sys.c
+++ b/arch/arm64/kernel/sys.c
@@ -31,7 +31,7 @@ 
 
 SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
 		unsigned long, prot, unsigned long, flags,
-		unsigned long, fd, off_t, off)
+		unsigned long, fd, unsigned long, off)
 {
 	if (offset_in_page(off) != 0)
 		return -EINVAL;