[10/13] mm: Wire up MAP_SYNC

Message ID 20170821215703.GA24473@linux.intel.com
State New
Headers show

Commit Message

Ross Zwisler Aug. 21, 2017, 9:57 p.m.
On Thu, Aug 17, 2017 at 06:08:12PM +0200, Jan Kara wrote:
> Pretty crude for now...
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

One other thing that should probably be wired up before this is all said and
done is the VmFlag string in /proc/<pid>/smaps.  Right now when we set this
flag it ends up as ??:

7f44e6cbd000-7f44e6dbd000 rw-s 00000000 103:00 12              /root/dax/data
Size:               1024 kB
Rss:                   0 kB
Pss:                   0 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:            0 kB
Anonymous:             0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr sh mr mw me ms ?? mm hg 

The quick one-liner at the end of this patch changes that to:

7fe30e87f000-7fe30e97f000 rw-s 00000000 103:00 12               /root/dax/data
Size:               1024 kB
Rss:                   0 kB
Pss:                   0 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:            0 kB
Anonymous:             0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr sh mr mw me ms sf mm hg

I think that of software can rely on this flag for userspace flushing without
worrying about any new TOCTOU races because there isn't a way to unset the
VM_SYNC flag once it is set - it should be valid as long as the mmap() remains
open and the mmap'd address is valid.

--- 8< ---
 fs/proc/task_mmu.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Jan Kara Aug. 22, 2017, 9:34 a.m. | #1
On Mon 21-08-17 15:57:03, Ross Zwisler wrote:
> On Thu, Aug 17, 2017 at 06:08:12PM +0200, Jan Kara wrote:
> > Pretty crude for now...
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> One other thing that should probably be wired up before this is all said and
> done is the VmFlag string in /proc/<pid>/smaps.  Right now when we set this
> flag it ends up as ??:

Thanks. Patch folded so that I don't forget about this when updating the
patch to the new interface :).

								Honza
Dan Williams Aug. 22, 2017, 5:27 p.m. | #2
On Mon, Aug 21, 2017 at 2:57 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> On Thu, Aug 17, 2017 at 06:08:12PM +0200, Jan Kara wrote:
>> Pretty crude for now...
>>
>> Signed-off-by: Jan Kara <jack@suse.cz>
>
> One other thing that should probably be wired up before this is all said and
> done is the VmFlag string in /proc/<pid>/smaps.  Right now when we set this
> flag it ends up as ??:
>
> 7f44e6cbd000-7f44e6dbd000 rw-s 00000000 103:00 12              /root/dax/data
> Size:               1024 kB
> Rss:                   0 kB
> Pss:                   0 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:         0 kB
> Referenced:            0 kB
> Anonymous:             0 kB
> LazyFree:              0 kB
> AnonHugePages:         0 kB
> ShmemPmdMapped:        0 kB
> Shared_Hugetlb:        0 kB
> Private_Hugetlb:       0 kB
> Swap:                  0 kB
> SwapPss:               0 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr sh mr mw me ms ?? mm hg
>
> The quick one-liner at the end of this patch changes that to:
>
> 7fe30e87f000-7fe30e97f000 rw-s 00000000 103:00 12               /root/dax/data
> Size:               1024 kB
> Rss:                   0 kB
> Pss:                   0 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:         0 kB
> Referenced:            0 kB
> Anonymous:             0 kB
> LazyFree:              0 kB
> AnonHugePages:         0 kB
> ShmemPmdMapped:        0 kB
> Shared_Hugetlb:        0 kB
> Private_Hugetlb:       0 kB
> Swap:                  0 kB
> SwapPss:               0 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr sh mr mw me ms sf mm hg
>
> I think that of software can rely on this flag for userspace flushing without
> worrying about any new TOCTOU races because there isn't a way to unset the
> VM_SYNC flag once it is set - it should be valid as long as the mmap() remains
> open and the mmap'd address is valid.
>
> --- 8< ---
>  fs/proc/task_mmu.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index b836fd6..a2a82ed 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -650,6 +650,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
>                 [ilog2(VM_ACCOUNT)]     = "ac",
>                 [ilog2(VM_NORESERVE)]   = "nr",
>                 [ilog2(VM_HUGETLB)]     = "ht",
> +               [ilog2(VM_SYNC)]        = "sf",
>                 [ilog2(VM_ARCH_1)]      = "ar",
>                 [ilog2(VM_DONTDUMP)]    = "dd",
>  #ifdef CONFIG_MEM_SOFT_DIRTY

So, I'm not sure I agree with this. I'm explicitly *not* advertising
MAP_DIRECT in ->vm_flags in my patches because we've seen applications
try to use smaps as an API rather than a debug tool. The toctou race
is fundamentally unsolvable unless you trust the agent that setup the
mapping will not tear it down while you've observed the sync flag.
Otherwise, if you do trust that the mapping will not be torn down then
userspace can already just trust itself and not rely on the kernel to
communicate the flag state.

I'm not against adding it, but the reasoning should be for debug and
not an api guarantee that applications will rely on, and unfortunately
I think we've seen that applications will rely on smaps no matter how
we document it.

Patch

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index b836fd6..a2a82ed 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -650,6 +650,7 @@  static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_ACCOUNT)]	= "ac",
 		[ilog2(VM_NORESERVE)]	= "nr",
 		[ilog2(VM_HUGETLB)]	= "ht",
+		[ilog2(VM_SYNC)]	= "sf",
 		[ilog2(VM_ARCH_1)]	= "ar",
 		[ilog2(VM_DONTDUMP)]	= "dd",
 #ifdef CONFIG_MEM_SOFT_DIRTY