remap_file_pages needs to check for cache coherency

Message ID 20131227180018.GC4945@linux.intel.com
State RFC
Delegated to: David Miller

Commit Message

Matthew Wilcox Dec. 27, 2013, 6 p.m. UTC
It seems to me that while (for example) on SPARC, it's not possible to
create a non-coherent mapping with mmap(), after we've done an mmap,
we can then use remap_file_pages() to create a mapping that no longer
aliases in the D-cache.

I have only compile-tested this patch.  I don't have any SPARC hardware,
and my PA-RISC hardware hasn't been turned on in six years ... I noticed
this while wandering around looking at some other stuff.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Dec. 27, 2013, 6:48 p.m. UTC | #1
From: Matthew Wilcox <willy@linux.intel.com>
Date: Fri, 27 Dec 2013 13:00:18 -0500

> It seems to me that while (for example) on SPARC, it's not possible to
> create a non-coherent mapping with mmap(), after we've done an mmap,
> we can then use remap_file_pages() to create a mapping that no longer
> aliases in the D-cache.
> 
> I have only compile-tested this patch.  I don't have any SPARC hardware,
> and my PA-RISC hardware hasn't been turned on in six years ... I noticed
> this while wandering around looking at some other stuff.

I suppose this is needed, but only in the case where the mapping is
shared and writable, right?  I don't see you testing those conditions,
but with them I'd be OK with this change.

John David Anglin Dec. 27, 2013, 7:13 p.m. UTC | #2
On 27-Dec-13, at 1:00 PM, Matthew Wilcox wrote:

> +#ifdef __ARCH_FORCE_SHMLBA
> +	/* Is the mapping cache-coherent? */
> +	if ((pgoff ^ linear_page_index(vma, start)) &
> +	    ((SHMLBA-1) >> PAGE_SHIFT))
> +		goto out;
> +#endif


I think this will cause problems on PA-RISC.  The reason is we have an
additional offset for mappings.  See get_offset() in sys_parisc.c.

SHMLBA is 4 MB on PA-RISC.  If we limit ourselves to aligned mappings,
we run out of memory very quickly.  Even with our current implementation,
we fail the perl locales test with locales-all installed.

Dave
--
John David Anglin	dave.anglin@bell.net



Matthew Wilcox Dec. 27, 2013, 7:20 p.m. UTC | #3
On Fri, Dec 27, 2013 at 01:48:14PM -0500, David Miller wrote:
> From: Matthew Wilcox <willy@linux.intel.com>
> Date: Fri, 27 Dec 2013 13:00:18 -0500
> 
> > It seems to me that while (for example) on SPARC, it's not possible to
> > create a non-coherent mapping with mmap(), after we've done an mmap,
> > we can then use remap_file_pages() to create a mapping that no longer
> > aliases in the D-cache.
> > 
> > I have only compile-tested this patch.  I don't have any SPARC hardware,
> > and my PA-RISC hardware hasn't been turned on in six years ... I noticed
> > this while wandering around looking at some other stuff.
> 
> I suppose this is needed, but only in the case where the mapping is
> shared and writable, right?  I don't see you testing those conditions,
> but with them I'd be OK with this change.

VM_SHARED is checked a few lines above; too far to be visible in the
original context diff:

        if (!vma || !(vma->vm_flags & VM_SHARED))
                goto out;
 
        if (!vma->vm_ops || !vma->vm_ops->remap_pages)
                goto out;
 
        if (start < vma->vm_start || start + size > vma->vm_end)
                goto out;
 
+#ifdef __ARCH_FORCE_SHMLBA
+       /* Is the mapping cache-coherent? */
+       if ((pgoff ^ linear_page_index(vma, start)) &
+           ((SHMLBA-1) >> PAGE_SHIFT))
+               goto out;
+#endif

I don't understand why we need to check for writable here.  We don't
seem to check VM_WRITE in arch_get_unmapped_area(), so I don't see why
we should be checking it here.  To put it another way: if I mmap() a file
with PROT_READ only, should I be able to see stale data after another
thread has written to it?
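
As a rough illustration of the two policies being discussed, the sketch
below compares gating the coherency test on VM_SHARED alone (what the
code above the patch already does) with gating it on VM_SHARED and
VM_WRITE together (the suggestion in the reply).  The EX_-prefixed flag
values are copied from include/linux/mm.h for illustration, and both
helper names are hypothetical; this is not kernel code.

```c
#include <stdbool.h>

/* Illustrative copies of the kernel's vm_flags bits (include/linux/mm.h). */
#define EX_VM_WRITE  0x00000002UL
#define EX_VM_SHARED 0x00000008UL

/* Policy A: apply the coherency test to every shared mapping. */
static bool gate_shared_only(unsigned long vm_flags)
{
	return (vm_flags & EX_VM_SHARED) != 0;
}

/* Policy B: apply the coherency test only to shared, writable mappings. */
static bool gate_shared_writable(unsigned long vm_flags)
{
	const unsigned long both = EX_VM_SHARED | EX_VM_WRITE;

	return (vm_flags & both) == both;
}
```

Under policy B a shared PROT_READ mapping skips the test, which is the
case the question above is about: the reader could still be remapped to
a different cache colour from a writer of the same file.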
Matthew Wilcox Dec. 27, 2013, 7:33 p.m. UTC | #4
On Fri, Dec 27, 2013 at 02:13:16PM -0500, John David Anglin wrote:
> On 27-Dec-13, at 1:00 PM, Matthew Wilcox wrote:
> 
> >+#ifdef __ARCH_FORCE_SHMLBA
> >+	/* Is the mapping cache-coherent? */
> >+	if ((pgoff ^ linear_page_index(vma, start)) &
> >+	    ((SHMLBA-1) >> PAGE_SHIFT))
> >+		goto out;
> >+#endif
> 
> 
> I think this will cause problems on PA-RISC.  The reason is we have
> an additional offset for mappings.  See get_offset() in sys_parisc.c.

I don't think it will cause any additional problems.  The test merely
asks "Is the offset to put at this address cache-coherent with the offset
that was at this address when the mmap was established?"
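
The arithmetic of that test can be tried in user space.  The following
is a minimal sketch, assuming 4 KB pages and the 4 MB SHMLBA quoted for
PA-RISC; the EX_ constants and the function name are made up for the
example, and old_pgoff stands in for what linear_page_index(vma, start)
would return in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the kernel takes these from the architecture. */
#define EX_PAGE_SHIFT 12UL         /* assumed 4 KB pages */
#define EX_SHMLBA     (4UL << 20)  /* 4 MB, the PA-RISC figure from the thread */

/*
 * Return true when the two file page offsets fall on the same cache
 * colour, i.e. they agree in every page-index bit below the aliasing
 * boundary.  This mirrors the expression in the patch:
 *   (pgoff ^ linear_page_index(vma, start)) & ((SHMLBA-1) >> PAGE_SHIFT)
 */
static bool same_cache_colour(unsigned long old_pgoff, unsigned long new_pgoff,
			      unsigned long shmlba, unsigned long page_shift)
{
	unsigned long colour_mask;

	/* The mask trick only works for a power-of-two boundary. */
	assert(shmlba != 0 && (shmlba & (shmlba - 1)) == 0);

	colour_mask = (shmlba - 1) >> page_shift;
	return ((old_pgoff ^ new_pgoff) & colour_mask) == 0;
}
```

With these values the mask covers 1024 pages, so replacing file page 5
with page 5 + 1024 passes, while replacing it with page 6 does not.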

> SHMLBA is 4 MB on PA-RISC.  If we limit ourselves to aligned
> mappings, we run out of memory very quickly.  Even with our current
> implementation, we fail the perl locales test with locales-all
> installed.

I know the large SHMLBA is problematic for PA-RISC, but I don't think
there's a lot of code out there using remap_file_pages().  code.google.com
found almost nothing, and a regular google search found only a couple
of little toys.

Have you considered measuring SHMLBA on different CPU models and
reducing it at boot time?  I know that 4MB is the architectural guarantee
(actually, I seem to remember that 16MB was the architectural guarantee,
but jsm found some CPU architects who said it would never exceed 4MB).
I bet some CPUs have considerably lower cache coherency limits.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John David Anglin Dec. 27, 2013, 7:47 p.m. UTC | #5
On 27-Dec-13, at 2:33 PM, Matthew Wilcox wrote:

> Have you considered measuring SHMLBA on different CPU models and
> reducing it at boot time?  I know that 4MB is the architectural
> guarantee (actually, I seem to remember that 16MB was the
> architectural guarantee, but jsm found some CPU architects who said
> it would never exceed 4MB).
> I bet some CPUs have considerably lower cache coherency limits.


It's worth looking at.  The value is supposed to be returned by the
PDC_CACHE PDC call but I know my rp3440 returns a value of 0 indicating
that the aliasing boundary is unknown and may be greater than 16MB.
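
A boot-time selection along those lines might look like the sketch
below.  It is only an illustration: the function name is hypothetical,
the reported value is passed in as a plain number rather than read
through any real PDC interface, and the choice to fall back to the
compile-time default when firmware reports 0 is an assumption, not
something the thread settles.

```c
/*
 * Choose the aliasing boundary to use at boot.
 *
 * reported: boundary in bytes as reported by firmware, 0 meaning unknown
 * fallback: the conservative compile-time default (e.g. 4 MB)
 */
static unsigned long pick_alias_boundary(unsigned long reported,
					 unsigned long fallback)
{
	/* Unknown boundary: keep the conservative default. */
	if (reported == 0)
		return fallback;

	/* A boundary that is not a power of two cannot be used as a mask. */
	if (reported & (reported - 1))
		return fallback;

	return reported;
}
```

A machine that reports 0, as the rp3440 does, would therefore keep the
existing boundary, and only CPUs that report a usable smaller value
would benefit.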

Dave
--
John David Anglin	dave.anglin@bell.net



John David Anglin Dec. 27, 2013, 8:14 p.m. UTC | #6
On 27-Dec-13, at 2:47 PM, John David Anglin wrote:

> It's worth looking at.  The value is supposed to be returned by the
> PDC_CACHE PDC call but I know my rp3440 returns a value of 0
> indicating that the aliasing boundary is unknown and may be greater
> than 16MB.

c3750 data cache has an aliasing boundary of 4 MB, so I think we are
stuck with large SHMLBA.

Dave
--
John David Anglin	dave.anglin@bell.net



Patch

diff --git a/mm/fremap.c b/mm/fremap.c
index 5bff081..01fc2e7 100644
--- a/mm/fremap.c
+++ b/mm/fremap.c
@@ -19,6 +19,7 @@ 
 
 #include <asm/mmu_context.h>
 #include <asm/cacheflush.h>
+#include <asm/shmparam.h>
 #include <asm/tlbflush.h>
 
 #include "internal.h"
@@ -177,6 +178,13 @@  SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	if (start < vma->vm_start || start + size > vma->vm_end)
 		goto out;
 
+#ifdef __ARCH_FORCE_SHMLBA
+	/* Is the mapping cache-coherent? */
+	if ((pgoff ^ linear_page_index(vma, start)) &
+	    ((SHMLBA-1) >> PAGE_SHIFT))
+		goto out;
+#endif
+
 	/* Must set VM_NONLINEAR before any pages are populated. */
 	if (!(vma->vm_flags & VM_NONLINEAR)) {
 		/*