diff mbox series

[v15,1/2] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()

Message ID 20190426003652.32633-2-richardw.yang@linux.intel.com
State New
Headers show
Series support MAP_SYNC for memory-backend-file | expand

Commit Message

Wei Yang April 26, 2019, 12:36 a.m. UTC
From: Zhang Yi <yi.z.zhang@linux.intel.com>

When a file supporting DAX is used as vNVDIMM backend, mmap it with
MAP_SYNC flag in addition which can ensure file system metadata
synced in each guest writes to the backend file, without other QEMU
actions (e.g., periodic fsync() by QEMU).

Current, We have below different possible use cases:

1. pmem=on is set, shared=on is set, MAP_SYNC supported:
   a: backend is a dax supporting file.
    - MAP_SYNC will active.
   b: backend is not a dax supporting file.
    - mmap will trigger a warning. then MAP_SYNC flag will be ignored

2. The rest of cases:
   - we will never pass the MAP_SYNC to mmap2

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
[ehabkost: Rebased patch to latest code on master]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Tested-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v15: fix compile issue on pre-linux4.15
v14: rebase on top of current upstream
---
 util/mmap-alloc.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

Comments

Eduardo Habkost April 30, 2019, 8:46 p.m. UTC | #1
Hi,

On Fri, Apr 26, 2019 at 08:36:51AM +0800, Wei Yang wrote:
> From: Zhang Yi <yi.z.zhang@linux.intel.com>
> 
> When a file supporting DAX is used as vNVDIMM backend, mmap it with
> MAP_SYNC flag in addition which can ensure file system metadata
> synced in each guest writes to the backend file, without other QEMU
> actions (e.g., periodic fsync() by QEMU).
> 
> Current, We have below different possible use cases:
> 
> 1. pmem=on is set, shared=on is set, MAP_SYNC supported:
>    a: backend is a dax supporting file.
>     - MAP_SYNC will active.
>    b: backend is not a dax supporting file.
>     - mmap will trigger a warning. then MAP_SYNC flag will be ignored
> 
> 2. The rest of cases:
>    - we will never pass the MAP_SYNC to mmap2
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
> [ehabkost: Rebased patch to latest code on master]
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> Tested-by: Wei Yang <richardw.yang@linux.intel.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> ---
> v15: fix compile issue on pre-linux4.15
> v14: rebase on top of current upstream

Note that v14 was already merged, so the build fix would need to
be submitted separately.  However:

[...]
> +#ifdef CONFIG_LINUX
> +#include <linux/mman.h>
> +#endif /* CONFIG_LINUX */
> +
> +#ifndef MAP_SYNC
> +#define MAP_SYNC 0
> +#endif
> +#ifndef MAP_SHARED_VALIDATE
> +#define MAP_SHARED_VALIDATE 0
> +#endif

Why would we need this, if we added copies of mman.h to
linux-headers?
Wei Yang April 30, 2019, 10:36 p.m. UTC | #2
On Tue, Apr 30, 2019 at 05:46:36PM -0300, Eduardo Habkost wrote:
>Hi,
>
>On Fri, Apr 26, 2019 at 08:36:51AM +0800, Wei Yang wrote:
>> From: Zhang Yi <yi.z.zhang@linux.intel.com>
>> 
>> When a file supporting DAX is used as vNVDIMM backend, mmap it with
>> MAP_SYNC flag in addition which can ensure file system metadata
>> synced in each guest writes to the backend file, without other QEMU
>> actions (e.g., periodic fsync() by QEMU).
>> 
>> Current, We have below different possible use cases:
>> 
>> 1. pmem=on is set, shared=on is set, MAP_SYNC supported:
>>    a: backend is a dax supporting file.
>>     - MAP_SYNC will active.
>>    b: backend is not a dax supporting file.
>>     - mmap will trigger a warning. then MAP_SYNC flag will be ignored
>> 
>> 2. The rest of cases:
>>    - we will never pass the MAP_SYNC to mmap2
>> 
>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
>> [ehabkost: Rebased patch to latest code on master]
>> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> Tested-by: Wei Yang <richardw.yang@linux.intel.com>
>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> 
>> ---
>> v15: fix compile issue on pre-linux4.15
>> v14: rebase on top of current upstream
>
>Note that v14 was already merged, so the build fix would need to
>be submitted separately.  However:
>

Sure, I would spin a separate patch.

>[...]
>> +#ifdef CONFIG_LINUX
>> +#include <linux/mman.h>
>> +#endif /* CONFIG_LINUX */
>> +
>> +#ifndef MAP_SYNC
>> +#define MAP_SYNC 0
>> +#endif
>> +#ifndef MAP_SHARED_VALIDATE
>> +#define MAP_SHARED_VALIDATE 0
>> +#endif
>
>Why would we need this, if we added copies of mman.h to
>linux-headers?

This is reported by Stefan.

https://www.mail-archive.com/qemu-devel@nongnu.org/msg612168.html

>
>-- 
>Eduardo
Eduardo Habkost April 30, 2019, 10:48 p.m. UTC | #3
On Tue, Apr 30, 2019 at 10:36:18PM +0000, Wei Yang wrote:
[...]
> >> +#ifdef CONFIG_LINUX
> >> +#include <linux/mman.h>
> >> +#endif /* CONFIG_LINUX */
> >> +
> >> +#ifndef MAP_SYNC
> >> +#define MAP_SYNC 0
> >> +#endif
> >> +#ifndef MAP_SHARED_VALIDATE
> >> +#define MAP_SHARED_VALIDATE 0
> >> +#endif
> >
> >Why would we need this, if we added copies of mman.h to
> >linux-headers?
> 
> This is reported by Stefan.
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg612168.html

Stefan, did you hit a build failure, or it was just theoretical?

linux-headers/*/mman.h is updated by "linux-headers: add
linux/mman.h" (commit 8cf108c5d159).  If the build really fails,
something else is broken in our build system.
Michael S. Tsirkin April 30, 2019, 10:50 p.m. UTC | #4
On Tue, Apr 30, 2019 at 07:48:16PM -0300, Eduardo Habkost wrote:
> On Tue, Apr 30, 2019 at 10:36:18PM +0000, Wei Yang wrote:
> [...]
> > >> +#ifdef CONFIG_LINUX
> > >> +#include <linux/mman.h>
> > >> +#endif /* CONFIG_LINUX */
> > >> +
> > >> +#ifndef MAP_SYNC
> > >> +#define MAP_SYNC 0
> > >> +#endif
> > >> +#ifndef MAP_SHARED_VALIDATE
> > >> +#define MAP_SHARED_VALIDATE 0
> > >> +#endif
> > >
> > >Why would we need this, if we added copies of mman.h to
> > >linux-headers?
> > 
> > This is reported by Stefan.
> > 
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg612168.html
> 
> Stefan, did you hit a build failure, or it was just theoretical?
> 
> linux-headers/*/mman.h is updated by "linux-headers: add
> linux/mman.h" (commit 8cf108c5d159).  If the build really fails,
> something else is broken in our build system.

I think it's for non-linux hosts. linux-headers/ is only used
on linux hosts.

> -- 
> Eduardo
Paolo Bonzini April 30, 2019, 11:11 p.m. UTC | #5
On 01/05/19 00:50, Michael S. Tsirkin wrote:
>> Stefan, did you hit a build failure, or it was just theoretical?
>>
>> linux-headers/*/mman.h is updated by "linux-headers: add
>> linux/mman.h" (commit 8cf108c5d159).  If the build really fails,
>> something else is broken in our build system.
> I think it's for non-linux hosts. linux-headers/ is only used
> on linux hosts.

Yes, it is.  Maybe the #ifndef/#define  part should be only used for
non-Linux.

Paolo
Michael S. Tsirkin April 30, 2019, 11:38 p.m. UTC | #6
On Wed, May 01, 2019 at 01:11:34AM +0200, Paolo Bonzini wrote:
> On 01/05/19 00:50, Michael S. Tsirkin wrote:
> >> Stefan, did you hit a build failure, or it was just theoretical?
> >>
> >> linux-headers/*/mman.h is updated by "linux-headers: add
> >> linux/mman.h" (commit 8cf108c5d159).  If the build really fails,
> >> something else is broken in our build system.
> > I think it's for non-linux hosts. linux-headers/ is only used
> > on linux hosts.
> 
> Yes, it is.  Maybe the #ifndef/#define  part should be only used for
> non-Linux.
> 
> Paolo

Makes sense. We'd rather have an error on linux than stub it out as 0.
Eduardo Habkost May 1, 2019, 4:58 p.m. UTC | #7
On Tue, Apr 30, 2019 at 07:38:59PM -0400, Michael S. Tsirkin wrote:
> On Wed, May 01, 2019 at 01:11:34AM +0200, Paolo Bonzini wrote:
> > On 01/05/19 00:50, Michael S. Tsirkin wrote:
> > >> Stefan, did you hit a build failure, or it was just theoretical?
> > >>
> > >> linux-headers/*/mman.h is updated by "linux-headers: add
> > >> linux/mman.h" (commit 8cf108c5d159).  If the build really fails,
> > >> something else is broken in our build system.
> > > I think it's for non-linux hosts. linux-headers/ is only used
> > > on linux hosts.
> > 
> > Yes, it is.  Maybe the #ifndef/#define  part should be only used for
> > non-Linux.
> > 
> > Paolo
> 
> Makes sense. We'd rather have an error on linux than stub it out as 0.


I'm confused by these replies.  When exactly do you expect an
error on Linux?

For context, I'm talking about the code in v14 of the series (which was already
merged):

#ifdef CONFIG_LINUX
#include <linux/mman.h>
#else  /* !CONFIG_LINUX */
#define MAP_SYNC              0x0
#define MAP_SHARED_VALIDATE   0x0
#endif /* CONFIG_LINUX */

I fail to see any case where the build would fail in v14.
Stefan Hajnoczi May 1, 2019, 5:26 p.m. UTC | #8
On Tue, Apr 30, 2019 at 07:48:16PM -0300, Eduardo Habkost wrote:
> On Tue, Apr 30, 2019 at 10:36:18PM +0000, Wei Yang wrote:
> [...]
> > >> +#ifdef CONFIG_LINUX
> > >> +#include <linux/mman.h>
> > >> +#endif /* CONFIG_LINUX */
> > >> +
> > >> +#ifndef MAP_SYNC
> > >> +#define MAP_SYNC 0
> > >> +#endif
> > >> +#ifndef MAP_SHARED_VALIDATE
> > >> +#define MAP_SHARED_VALIDATE 0
> > >> +#endif
> > >
> > >Why would we need this, if we added copies of mman.h to
> > >linux-headers?
> > 
> > This is reported by Stefan.
> > 
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg612168.html
> 
> Stefan, did you hit a build failure, or it was just theoretical?
> 
> linux-headers/*/mman.h is updated by "linux-headers: add
> linux/mman.h" (commit 8cf108c5d159).  If the build really fails,
> something else is broken in our build system.

I wasn't aware that QEMU ships its own mman.h.  In that case we don't
need the ifndef for Linux hosts.

Stefan
diff mbox series

Patch

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 9713f4b960..b8f94d618a 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -10,6 +10,17 @@ 
  * later.  See the COPYING file in the top-level directory.
  */
 
+#ifdef CONFIG_LINUX
+#include <linux/mman.h>
+#endif /* CONFIG_LINUX */
+
+#ifndef MAP_SYNC
+#define MAP_SYNC 0
+#endif
+#ifndef MAP_SHARED_VALIDATE
+#define MAP_SHARED_VALIDATE 0
+#endif
+
 #include "qemu/osdep.h"
 #include "qemu/mmap-alloc.h"
 #include "qemu/host-utils.h"
@@ -82,6 +93,7 @@  void *qemu_ram_mmap(int fd,
                     bool is_pmem)
 {
     int flags;
+    int map_sync_flags = 0;
     int guardfd;
     size_t offset;
     size_t pagesize;
@@ -132,9 +144,40 @@  void *qemu_ram_mmap(int fd,
     flags = MAP_FIXED;
     flags |= fd == -1 ? MAP_ANONYMOUS : 0;
     flags |= shared ? MAP_SHARED : MAP_PRIVATE;
+    if (shared && is_pmem) {
+        map_sync_flags = MAP_SYNC | MAP_SHARED_VALIDATE;
+    }
+
     offset = QEMU_ALIGN_UP((uintptr_t)guardptr, align) - (uintptr_t)guardptr;
 
-    ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE, flags, fd, 0);
+    ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
+               flags | map_sync_flags, fd, 0);
+
+    if (ptr == MAP_FAILED && map_sync_flags) {
+        if (errno == ENOTSUP) {
+            char *proc_link, *file_name;
+            int len;
+            proc_link = g_strdup_printf("/proc/self/fd/%d", fd);
+            file_name = g_malloc0(PATH_MAX);
+            len = readlink(proc_link, file_name, PATH_MAX - 1);
+            if (len < 0) {
+                len = 0;
+            }
+            file_name[len] = '\0';
+            fprintf(stderr, "Warning: requesting persistence across crashes "
+                    "for backend file %s failed. Proceeding without "
+                    "persistence, data might become corrupted in case of host "
+                    "crash.\n", file_name);
+            g_free(proc_link);
+            g_free(file_name);
+        }
+        /*
+         * if map failed with MAP_SHARED_VALIDATE | MAP_SYNC,
+         * we will remove these flags to handle compatibility.
+         */
+        ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
+                   flags, fd, 0);
+    }
 
     if (ptr == MAP_FAILED) {
         munmap(guardptr, total);