diff mbox series

[01/19] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags

Message ID 20171011200603.27442-2-jack@suse.cz
State Not Applicable, archived
Series dax, ext4, xfs: Synchronous page faults

Commit Message

Jan Kara Oct. 11, 2017, 8:05 p.m. UTC
From: Dan Williams <dan.j.williams@intel.com>

The mmap(2) syscall suffers from the ABI anti-pattern of not validating
unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
mechanism to define new behavior that is known to fail on older kernels
without the support. Define a new MAP_SHARED_VALIDATE flag pattern that
is guaranteed to fail on all legacy mmap implementations.
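
Why the pattern fails on legacy kernels can be illustrated with a small user-space model of the mapping-type switch. This is only a sketch: the SK_* names and legacy_check_map_type() are hypothetical stand-ins, the numeric values are the asm-generic ones (some architectures differ), and the model assumes a pre-patch kernel rejects any type other than shared or private with -EINVAL.

```c
#include <errno.h>

/* Stand-ins for the asm-generic values; names are hypothetical. */
#define SK_MAP_SHARED          0x01
#define SK_MAP_PRIVATE         0x02
#define SK_MAP_SHARED_VALIDATE 0x03	/* value introduced by this patch */
#define SK_MAP_TYPE            0x0f

/*
 * Model of the mapping-type check in a kernel that predates this patch.
 * Returns 0 if the type is accepted, -EINVAL otherwise.
 */
int legacy_check_map_type(unsigned long flags)
{
	switch (flags & SK_MAP_TYPE) {
	case SK_MAP_SHARED:
	case SK_MAP_PRIVATE:
		return 0;
	default:
		/* type 0x3 is neither shared nor private, so it lands here */
		return -EINVAL;
	}
}
```

Because the failure comes from the type field rather than from an extension bit, any new flag OR'ed into a MAP_SHARED_VALIDATE request is rejected as well.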

It is worth noting that the original proposal was for a standalone
MAP_VALIDATE flag. However, when that could not be supported by all
archs, Linus observed:

    I see why you *think* you want a bitmap. You think you want
    a bitmap because you want to make MAP_VALIDATE be part of MAP_SYNC
    etc, so that people can do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED
		    | MAP_SYNC, fd, 0);

    and "know" that MAP_SYNC actually takes.

    And I'm saying that whole wish is bogus. You're fundamentally
    depending on special semantics, just make it explicit. It's already
    not portable, so don't try to make it so.

    Rename that MAP_VALIDATE as MAP_SHARED_VALIDATE, make it have a value
    of 0x3, and make people do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED_VALIDATE
		    | MAP_SYNC, fd, 0);

    and then the kernel side is easier too (none of that random garbage
    playing games with looking at the "MAP_VALIDATE bit", but just another
    case statement in that map type thing).

    Boom. Done.
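
From user space, the calling pattern quoted above might be wrapped as follows. This is a hedged sketch, not part of the patch: map_sync_or_shared() is a hypothetical helper, the fallback #defines exist only for older libc headers, and the MAP_SYNC value is an assumption taken from later mainline headers (this patch does not define it).

```c
#include <stddef.h>
#include <sys/mman.h>

/* Fallback definitions for libc headers that predate these flags. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000	/* assumption: not defined by this patch */
#endif

/*
 * Hypothetical helper: ask for a validated MAP_SYNC mapping first and
 * fall back to plain MAP_SHARED if the kernel or file rejects it.
 * *got_sync is set to 1 only when the validated request succeeded.
 */
void *map_sync_or_shared(int fd, size_t size, int *got_sync)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	if (p != MAP_FAILED) {
		*got_sync = 1;
		return p;
	}

	/* The validated request failed, so no sync semantics are promised. */
	*got_sync = 0;
	return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

A caller that sees *got_sync == 0 cannot assume the new consistency model and would have to keep using msync() or fsync() to make its writes durable.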

Similar to ->fallocate() we also want the ability to validate the
support for new flags on a per ->mmap() 'struct file_operations'
instance basis.  Towards that end arrange for flags to be generically
validated against a mmap_supported_mask exported by 'struct
file_operations'. By default all existing flags are implicitly
supported, but new flags require MAP_SHARED_VALIDATE and
per-instance-opt-in.
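
The per-instance opt-in described above can be modelled in a few lines. Treat this as a sketch under assumptions: struct sk_file_ops, sk_validate_flags(), SK_LEGACY_MASK and SK_MAP_NEWFLAG are hypothetical names with placeholder values, and returning -EOPNOTSUPP mirrors the error this patch uses when an extension flag is not supported.

```c
#include <errno.h>

#define SK_LEGACY_MASK 0xffffUL		/* placeholder for the legacy flag set */
#define SK_MAP_NEWFLAG 0x80000UL	/* placeholder for a new extension flag */

/* Hypothetical slice of file_operations carrying the opt-in mask. */
struct sk_file_ops {
	unsigned long mmap_supported_mask;	/* new flags this instance accepts */
};

/*
 * Accept a request only if every flag is either a legacy flag or one
 * the file_operations instance has explicitly opted in to.
 */
int sk_validate_flags(const struct sk_file_ops *ops, unsigned long flags)
{
	unsigned long allowed = SK_LEGACY_MASK | ops->mmap_supported_mask;

	if (flags & ~allowed)
		return -EOPNOTSUPP;
	return 0;
}
```

An instance that leaves the mask at zero keeps accepting every legacy flag, while a request carrying SK_MAP_NEWFLAG is refused until that instance sets the bit.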

Cc: Jan Kara <jack@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 arch/alpha/include/uapi/asm/mman.h           |  1 +
 arch/mips/include/uapi/asm/mman.h            |  1 +
 arch/mips/kernel/vdso.c                      |  2 +-
 arch/parisc/include/uapi/asm/mman.h          |  1 +
 arch/tile/mm/elf.c                           |  3 ++-
 arch/xtensa/include/uapi/asm/mman.h          |  1 +
 include/linux/fs.h                           |  2 ++
 include/linux/mm.h                           |  2 +-
 include/linux/mman.h                         | 39 ++++++++++++++++++++++++++++
 include/uapi/asm-generic/mman-common.h       |  1 +
 mm/mmap.c                                    | 21 ++++++++++++---
 tools/include/uapi/asm-generic/mman-common.h |  1 +
 12 files changed, 69 insertions(+), 6 deletions(-)

Comments

Christoph Hellwig Oct. 13, 2017, 7:12 a.m. UTC | #1
So did we settle on the new mmap_validate vs adding a new argument
to ->mmap for real now?  I have to say I'd much prefer passing an
additional argument instead, but if higher powers rule that out
this version is ok.

> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 13dab191a23e..5aee97d64cae 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1701,6 +1701,8 @@ struct file_operations {
>  	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
>  	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
>  	int (*mmap) (struct file *, struct vm_area_struct *);
> +	int (*mmap_validate) (struct file *, struct vm_area_struct *,
> +			unsigned long);

Can we make this return a bool for ok vs not ok?  That way we only
need to have the error code discussion in one place instead of every
file system.
Dan Williams Oct. 13, 2017, 3:44 p.m. UTC | #2
On Fri, Oct 13, 2017 at 12:12 AM, Christoph Hellwig <hch@infradead.org> wrote:
> So did we settle on the new mmap_validate vs adding a new argument
> to ->mmap for real now?  I have to say I'd much prefer passing an
> additional argument instead, but if higher powers rule that out
> this version is ok.

MAP_DIRECT also needs the fd now, so it would be two arguments. Before
you say "that's bogus" I want to understand what the alternative looks
like for notifying userspace that its get_user_pages() registration
lease is being invalidated.

>
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 13dab191a23e..5aee97d64cae 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -1701,6 +1701,8 @@ struct file_operations {
>>       long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
>>       long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
>>       int (*mmap) (struct file *, struct vm_area_struct *);
>> +     int (*mmap_validate) (struct file *, struct vm_area_struct *,
>> +                     unsigned long);
>
> Can we make this return a bool for ok vs not ok?  That way we only
> need to have the error code discussion in one place instead of every
> file system.

How does userspace figure out what went wrong if we lose the ability
to return different failure reasons from the ops owner? Maybe an enum
to cut down the possible error codes to 'flag not supported', 'flag
supported but failed to set up the mapping', and 'success'.
Dan Williams Oct. 13, 2017, 6:28 p.m. UTC | #3
On Fri, Oct 13, 2017 at 8:44 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Fri, Oct 13, 2017 at 12:12 AM, Christoph Hellwig <hch@infradead.org> wrote:
>> So did we settle on the new mmap_validate vs adding a new argument
>> to ->mmap for real now?  I have to say I'd much prefer passing an
>> additional argument instead, but if higher powers rule that out
>> this version is ok.
>
> MAP_DIRECT also needs the fd now, so it would be two arguments. Before
> you say "that's bogus" I want to understand what the alternative looks
> like for notifying userspace that its get_user_pages() registration
> lease is being invalidated.
>

Jason straightened me out about how RDMA applications expect to get
notifications and now I see why the fd based scheme is lacking.
Dan Williams Oct. 14, 2017, 3:57 p.m. UTC | #4
On Fri, 2017-10-13 at 00:12 -0700, Christoph Hellwig wrote:
> So did we settle on the new mmap_validate vs adding a new argument
> to ->mmap for real now?  I have to say I'd much prefer passing an
> additional argument instead, but if higher powers rule that out
> this version is ok.

Even if we decide to add a parameter to ->mmap() I think that should be
done after we merge this version. Otherwise there's no way to stage
these changes in advance of the merge window since we need to run the
"add parameter" coccinelle script near or after -rc1 when there's no
risk of new ->mmap() users being added.

> 
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 13dab191a23e..5aee97d64cae 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -1701,6 +1701,8 @@ struct file_operations {
> >  	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
> >  	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
> >  	int (*mmap) (struct file *, struct vm_area_struct *);
> > +	int (*mmap_validate) (struct file *, struct vm_area_struct *,
> > +			unsigned long);
> 
> Can we make this return a bool for ok vs not ok?  That way we only
> need to have the error code discussion in one place instead of every
> file system.

How about the following incremental update? It allows ->mmap_validate()
to be used as a full replacement for ->mmap() and it limits the error
code freedom to a centralized mmap_status_errno() routine:

---

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5aee97d64cae..e07fcac17ba7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1685,6 +1685,13 @@ struct block_device_operations;
 #define NOMMU_VMFLAGS \
 	(NOMMU_MAP_READ | NOMMU_MAP_WRITE | NOMMU_MAP_EXEC)
 
+enum mmap_status {
+	MMAP_SUCCESS,
+	MMAP_UNSUPPORTED_FLAGS,
+	MMAP_INVALID_FILE,
+	MMAP_RESOURCE_FAILURE,
+};
+typedef enum mmap_status mmap_status_t;
 
 struct iov_iter;
 
@@ -1701,7 +1708,7 @@ struct file_operations {
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
-	int (*mmap_validate) (struct file *, struct vm_area_struct *,
+	mmap_status_t (*mmap_validate) (struct file *, struct vm_area_struct *,
 			unsigned long);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
diff --git a/mm/mmap.c b/mm/mmap.c
index 2649c00581a0..c1b6a8c25ce7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1432,7 +1432,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 				vm_flags &= ~VM_MAYEXEC;
 			}
 
-			if (!file->f_op->mmap)
+			if (!file->f_op->mmap && !file->f_op->mmap_validate)
 				return -ENODEV;
 			if (vm_flags & (VM_GROWSDOWN|VM_GROWSUP))
 				return -EINVAL;
@@ -1612,6 +1612,27 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 	return (vm_flags & (VM_NORESERVE | VM_SHARED | VM_WRITE)) == VM_WRITE;
 }
 
+static int mmap_status_errno(mmap_status_t stat)
+{
+	static const int rc[] = {
+		[MMAP_SUCCESS] = 0,
+		[MMAP_UNSUPPORTED_FLAGS] = -EOPNOTSUPP,
+		[MMAP_INVALID_FILE] = -ENOTTY,
+		[MMAP_RESOURCE_FAILURE] = -ENOMEM,
+	};
+
+	switch (stat) {
+	case MMAP_SUCCESS:
+	case MMAP_UNSUPPORTED_FLAGS:
+	case MMAP_INVALID_FILE:
+	case MMAP_RESOURCE_FAILURE:
+		return rc[stat];
+	default:
+		/* unknown mmap_status */
+		return rc[MMAP_UNSUPPORTED_FLAGS];
+	}
+}
+
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
 		struct list_head *uf, unsigned long map_flags)
@@ -1619,6 +1640,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
 	int error;
+	mmap_status_t status;
 	struct rb_node **rb_link, *rb_parent;
 	unsigned long charged = 0;
 
@@ -1697,11 +1719,19 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 * vma_link() below can deny write-access if VM_DENYWRITE is set
 		 * and map writably if VM_SHARED is set. This usually means the
 		 * new file must not have been exposed to user-space, yet.
+		 *
+		 * We require ->mmap_validate in the MAP_SHARED_VALIDATE
+		 * case, prefer ->mmap_validate over ->mmap, and
+		 * otherwise fallback to legacy ->mmap.
 		 */
 		vma->vm_file = get_file(file);
-		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
-			error = file->f_op->mmap_validate(file, vma, map_flags);
-		else
+		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE) {
+			status = file->f_op->mmap_validate(file, vma, map_flags);
+			error = mmap_status_errno(status);
+		} else if (file->f_op->mmap_validate) {
+			status = file->f_op->mmap_validate(file, vma, map_flags);
+			error = mmap_status_errno(status);
+		} else
 			error = call_mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
Christoph Hellwig Oct. 16, 2017, 7:45 a.m. UTC | #5
> How about the following incremental update? It allows ->mmap_validate()
> to be used as a full replacement for ->mmap() and it limits the error
> code freedom to a centralized mmap_status_errno() routine:

Nah - my earlier comment was simply misinformed because I didn't
read the whole patch and the _validate name misled me.

So I think the current calling conventions are ok, I'd just like a
better name (mmap_flags maybe?) and to avoid requiring the file system
to also implement ->mmap.
Jan Kara Oct. 17, 2017, 11:50 a.m. UTC | #6
On Mon 16-10-17 00:45:04, Christoph Hellwig wrote:
> > How about the following incremental update? It allows ->mmap_validate()
> > to be used as a full replacement for ->mmap() and it limits the error
> > code freedom to a centralized mmap_status_errno() routine:
> 
> Nah - my earlier comment was simply misinformed because I didn't
> read the whole patch and the _validate name misled me.
> 
> So I think the current calling conventions are ok, I'd just like a
> better name (mmap_flags maybe?) and to avoid requiring the file system
> to also implement ->mmap.

OK, I can do that. But I had just realized that if MAP_DIRECT isn't going
to end up using mmap(2) interface but something else (and I'm not sure
where discussions on this matter ended), we don't need flags argument for
->mmap at all. MAP_SYNC uses a VMA flag anyway and thus it is fine with the
current ->mmap interface. We still need some opt-in mechanism for
MAP_SHARED_VALIDATE though (probably supported mmap flags as Dan had in one
version of his patch). Thoughts on which way to go for now?

								Honza
Dan Williams Oct. 17, 2017, 7:38 p.m. UTC | #7
On Tue, Oct 17, 2017 at 4:50 AM, Jan Kara <jack@suse.cz> wrote:
> On Mon 16-10-17 00:45:04, Christoph Hellwig wrote:
>> > How about the following incremental update? It allows ->mmap_validate()
>> > to be used as a full replacement for ->mmap() and it limits the error
>> > code freedom to a centralized mmap_status_errno() routine:
>>
>> Nah - my earlier comment was simply misinformed because I didn't
>> read the whole patch and the _validate name misled me.
>>
>> So I think the current calling conventions are ok, I'd just like a
>> better name (mmap_flags maybe?) and to avoid requiring the file system
>> to also implement ->mmap.
>
> OK, I can do that. But I had just realized that if MAP_DIRECT isn't going
> to end up using mmap(2) interface but something else (and I'm not sure
> where discussions on this matter ended), we don't need flags argument for
> ->mmap at all. MAP_SYNC uses a VMA flag anyway and thus it is fine with the
> current ->mmap interface. We still need some opt-in mechanism for
> MAP_SHARED_VALIDATE though (probably supported mmap flags as Dan had in one
> version of his patch). Thoughts on which way to go for now?

The "supported mmap flags" approach also solves the problem you raised
about MAP_SYNC being silently accepted by an ->mmap() handler that
does not know about the new flag, i.e. leading userspace to potentially
assume an invalid data consistency model. I'll revive that approach
now that the MAP_DIRECT problem is going to be solved via a different
interface.
Christoph Hellwig Oct. 18, 2017, 6:59 a.m. UTC | #8
On Tue, Oct 17, 2017 at 01:50:47PM +0200, Jan Kara wrote:
> OK, I can do that. But I had just realized that if MAP_DIRECT isn't going
> to end up using mmap(2) interface but something else (and I'm not sure
> where discussions on this matter ended), we don't need flags argument for
> ->mmap at all. MAP_SYNC uses a VMA flag anyway and thus it is fine with the
> current ->mmap interface. We still need some opt-in mechanism for
> MAP_SHARED_VALIDATE though (probably supported mmap flags as Dan had in one
> version of his patch). Thoughts on which way to go for now?

Yes, I'd much prefer the mmap_flags in file_operations.  The other
option would be a new FMODE_* flag which is what Al did for various
other optional features, but I generally think that is a confusing
interface.

Patch

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 3b26cc62dadb..92823f24890b 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -14,6 +14,7 @@ 
 #define MAP_TYPE	0x0f		/* Mask for type of mapping (OSF/1 is _wrong_) */
 #define MAP_FIXED	0x100		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with OSF/1 defines */
 #define _MAP_HASSEMAPHORE 0x0200
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index da3216007fe0..c77689076577 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -30,6 +30,7 @@ 
 #define MAP_PRIVATE	0x002		/* Changes are private */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with ABI defines */
 #define MAP_RENAME	0x020		/* Assign page to file */
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 019035d7225c..cf10654477a9 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -110,7 +110,7 @@  int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 			   VM_READ|VM_WRITE|VM_EXEC|
 			   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-			   0, NULL);
+			   0, NULL, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index 775b5d5e41a1..36b688d52de3 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -14,6 +14,7 @@ 
 #define MAP_TYPE	0x03		/* Mask for type of mapping */
 #define MAP_FIXED	0x04		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 889901824400..5ffcbe76aef9 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -143,7 +143,8 @@  int arch_setup_additional_pages(struct linux_binprm *bprm,
 		unsigned long addr = MEM_USER_INTRPT;
 		addr = mmap_region(NULL, addr, INTRPT_SIZE,
 				   VM_READ|VM_EXEC|
-				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0, NULL);
+				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0,
+				   NULL, 0);
 		if (addr > (unsigned long) -PAGE_SIZE)
 			retval = (int) addr;
 	}
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index b15b278aa314..ec597900eec7 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -37,6 +37,7 @@ 
 #define MAP_PRIVATE	0x002		/* Changes are private */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with ABI defines */
 #define MAP_RENAME	0x020		/* Assign page to file */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 13dab191a23e..5aee97d64cae 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1701,6 +1701,8 @@  struct file_operations {
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
+	int (*mmap_validate) (struct file *, struct vm_area_struct *,
+			unsigned long);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
 	int (*release) (struct inode *, struct file *);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 065d99deb847..38f6ed954dde 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2133,7 +2133,7 @@  extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long map_flags);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index c8367041fafd..94b63b4d71ff 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -7,6 +7,45 @@ 
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
+/*
+ * Arrange for legacy / undefined architecture specific flags to be
+ * ignored by default in LEGACY_MAP_MASK.
+ */
+#ifndef MAP_32BIT
+#define MAP_32BIT 0
+#endif
+#ifndef MAP_HUGE_2MB
+#define MAP_HUGE_2MB 0
+#endif
+#ifndef MAP_HUGE_1GB
+#define MAP_HUGE_1GB 0
+#endif
+#ifndef MAP_UNINITIALIZED
+#define MAP_UNINITIALIZED 0
+#endif
+
+/*
+ * The historical set of flags that all mmap implementations implicitly
+ * support when a ->mmap_validate() op is not provided in file_operations.
+ */
+#define LEGACY_MAP_MASK (MAP_SHARED \
+		| MAP_PRIVATE \
+		| MAP_FIXED \
+		| MAP_ANONYMOUS \
+		| MAP_DENYWRITE \
+		| MAP_EXECUTABLE \
+		| MAP_UNINITIALIZED \
+		| MAP_GROWSDOWN \
+		| MAP_LOCKED \
+		| MAP_NORESERVE \
+		| MAP_POPULATE \
+		| MAP_NONBLOCK \
+		| MAP_STACK \
+		| MAP_HUGETLB \
+		| MAP_32BIT \
+		| MAP_HUGE_2MB \
+		| MAP_HUGE_1GB)
+
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern unsigned long sysctl_overcommit_kbytes;
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 203268f9231e..ac55d1c0ec0f 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@ 
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /*
  * Flags for mlock
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..a1bcaa9eff42 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1389,6 +1389,18 @@  unsigned long do_mmap(struct file *file, unsigned long addr,
 		struct inode *inode = file_inode(file);
 
 		switch (flags & MAP_TYPE) {
+		case (MAP_SHARED_VALIDATE):
+			if ((flags & ~LEGACY_MAP_MASK) == 0) {
+				/*
+				 * If all legacy mmap flags, downgrade
+				 * to MAP_SHARED, i.e. invoke ->mmap()
+				 * instead of ->mmap_validate()
+				 */
+				flags &= ~MAP_TYPE;
+				flags |= MAP_SHARED;
+			} else if (!file->f_op->mmap_validate)
+				return -EOPNOTSUPP;
+			/* fall through */
 		case MAP_SHARED:
 			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
 				return -EACCES;
@@ -1465,7 +1477,7 @@  unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1602,7 +1614,7 @@  static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long map_flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1687,7 +1699,10 @@  unsigned long mmap_region(struct file *file, unsigned long addr,
 		 * new file must not have been exposed to user-space, yet.
 		 */
 		vma->vm_file = get_file(file);
-		error = call_mmap(file, vma);
+		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
+			error = file->f_op->mmap_validate(file, vma, map_flags);
+		else
+			error = call_mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
 
diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index 203268f9231e..ac55d1c0ec0f 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@ 
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /*
  * Flags for mlock
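
The flag handling that the mm/mmap.c hunks add to do_mmap() can be summarised with a user-space model. This is a sketch only: sk_resolve_type() and the SK_* constants are hypothetical, SK_LEGACY_MAP_MASK is a placeholder rather than the real LEGACY_MAP_MASK, and the presence of ->mmap_validate is reduced to a boolean argument.

```c
#include <errno.h>

#define SK_MAP_SHARED          0x01
#define SK_MAP_PRIVATE         0x02
#define SK_MAP_SHARED_VALIDATE 0x03
#define SK_MAP_TYPE            0x0f
#define SK_LEGACY_MAP_MASK     0xffffUL		/* placeholder legacy flag set */
#define SK_MAP_NEWFLAG         0x80000UL	/* placeholder extension flag */

/*
 * Model of the patched type switch. Returns the flags that would be
 * passed on to mmap_region(), or a negative errno.
 */
long sk_resolve_type(unsigned long flags, int has_mmap_validate)
{
	switch (flags & SK_MAP_TYPE) {
	case SK_MAP_SHARED_VALIDATE:
		if ((flags & ~SK_LEGACY_MAP_MASK) == 0) {
			/* only legacy flags: downgrade to plain MAP_SHARED */
			flags &= ~(unsigned long)SK_MAP_TYPE;
			flags |= SK_MAP_SHARED;
		} else if (!has_mmap_validate) {
			/* extension flag, but the file cannot validate it */
			return -EOPNOTSUPP;
		}
		return (long)flags;
	case SK_MAP_SHARED:
	case SK_MAP_PRIVATE:
		return (long)flags;
	default:
		return -EINVAL;
	}
}
```

In this model a request keeps the MAP_SHARED_VALIDATE type, and is therefore routed to ->mmap_validate() instead of ->mmap(), only when it carries at least one flag outside the legacy mask.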