Message ID | 14736_1666122204_634F01DB_14736_494_1_b3efa290e26b85b54ab27728bf190316cf2ab1ee.1666122184.git.yann.morin@orange.com |
---|---|
State | Changes Requested |
Series | [1/6,v3] package/skeleton-systemd: move /var factory tmpfiles out of /etc |
On Tue, 18 Oct 2022 at 21:43, <yann.morin@orange.com> wrote:
>
> While the /var factory seems to be working in most cases, there have
> been suggestions that it may be slightly and subtely borken in some
> (rare? edge?) cases, especially about symlinks.

I had to dig up an old post of mine [1] (not the only one touching on that). Some of the issues are:

- it kills previous files in /usr/share/factory/var
- it doesn't handle symlinks (just think of /var already containing a symlink into that factory), especially relative ones
- it has side effects with tmpfiles .conf files that are ordered before it and touch /var
- it has side effects with other PRE_CMD_HOOKS touching /var

(The post is from mid-2020, so forgive me if my memory is fuzzy, but I already had practical problems with at least the last two of those.)

To me this is just not a robust solution.

>
> An other solution is to pre-populate /var at build time, by way of
> calling systemd-tmpfiles, and mounting an overlayfs on-top of it at
> runtime.
>
> This is slightly accrobatic, though, and requires a few hoops:
> - first, we create a tmpfs
> - there, we create three directories:
> - the first to bind-mount /var as it is, i.e. read-only
> - the second as the read-write upper for the overlayfs
> - the third as the "working area" for the overlays

...and we depend on overlayfs.

>
> This is done with two systemd units:
> - rootfs-bindmount-var.service: prepares up to bind-mounting /var into
> the tmpfs
> - var.mount: a mount unit which actually mounts the overlayfs.
>
> Users who want to provide an actual storage to keep /var across reboots,
> will have to provide their own mount units and make it RequiredBy and
> BoundBy our var.mount unit.
>
> Systemd units courtesy Norbert, with slight tweaks and cleanups.

Yeah, I'm not fine with the tweaks that drop the symlink
/usr/lib/systemd/system/var.mount -> ../var.mount
(and with the added install section).

First, in the same local-fs "target" you could mount /etc, making this a complicated hidden issue. I don't know when systemd reloads; I believe only after that target.

Second, this should be enabled by default, and in a way that works even when /etc is borked or not ready.

If a user really wants to disable the mount, he can mask it.

>
> Signed-off-by: Yann E. MORIN <yann.morin@orange.com>
> Cc: Norbert Lange <nolange79@gmail.com>
> Cc: Romain Naour <romain.naour@smile.fr>
> Cc: Jérémy Rosen <jeremy.rosen@smile.fr>
> ---
> .../{ => factory}/var.mount | 0
> .../overlayfs/rootfs-bindmount-var.service | 21 ++++++++++++++++
> .../skeleton-init-systemd/overlayfs/var.mount | 15 ++++++++++++
> .../skeleton-init-systemd.mk | 20 +++++++++++++---
> system/Config.in | 24 +++++++++++++------
> 5 files changed, 70 insertions(+), 10 deletions(-)
> rename package/skeleton-init-systemd/{ => factory}/var.mount (100%)
> create mode 100644 package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
> create mode 100644 package/skeleton-init-systemd/overlayfs/var.mount
>
> diff --git a/package/skeleton-init-systemd/var.mount b/package/skeleton-init-systemd/factory/var.mount
> similarity index 100%
> rename from package/skeleton-init-systemd/var.mount
> rename to package/skeleton-init-systemd/factory/var.mount
> diff --git a/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service b/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
> new file mode 100644
> index 0000000000..e412a56c49
> --- /dev/null
> +++ b/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
> @@ -0,0 +1,21 @@
> +[Unit]
> +Description=Bind-mount variable storage (/var)
> +Documentation=man:file-hierarchy(7)
> +ConditionPathIsSymbolicLink=!/var
> +# ConditionPathIsReadWrite=!/var
> +DefaultDependencies=no
> +Conflicts=umount.target
> +Before=local-fs.target umount.target
> +After=local-fs-pre.target

I am actually considering changing that to:

    Before=local-fs-pre.target umount.target
    # After=local-fs-pre.target

It does not depend on anything, so there is no technical reason to order it after anything. And it is technically a preparation for the actual local-fs.target.

> +
> +[Service]
> +Type=oneshot
> +RemainAfterExit=yes
> +ExecStartPre=-/bin/mkdir /run/varoverlay
> +ExecStartPre=/bin/mount --make-private -n -t tmpfs tmpfs_root_ovl /run/varoverlay
> +ExecStartPre=/bin/mkdir /run/varoverlay/lower /run/varoverlay/upper /run/varoverlay/work
> +ExecStart=/bin/mount --make-private -n --bind /var /run/varoverlay/lower
> +
> +ExecStop=/bin/umount -n /run/varoverlay/lower
> +ExecStopPost=/bin/umount -n /run/varoverlay
> +ExecStopPost=/bin/rmdir /run/varoverlay
> diff --git a/package/skeleton-init-systemd/overlayfs/var.mount b/package/skeleton-init-systemd/overlayfs/var.mount
> new file mode 100644
> index 0000000000..fab223c27b
> --- /dev/null
> +++ b/package/skeleton-init-systemd/overlayfs/var.mount
> @@ -0,0 +1,15 @@
> +[Unit]
> +Description=variable storage (/var)
> +Documentation=man:file-hierarchy(7)
> +ConditionPathIsSymbolicLink=!/var
> +After=rootfs-bindmount-var.service
> +BindsTo=rootfs-bindmount-var.service
> +
> +[Mount]
> +What=overlay_var
> +Where=/var
> +Type=overlay
> +Options=lowerdir=/run/varoverlay/lower,upperdir=/run/varoverlay/upper,workdir=/run/varoverlay/work,redirect_dir=on,index=on,xino=on
> +
> +[Install]
> +WantedBy=local-fs.target

See above.

> diff --git a/package/skeleton-init-systemd/skeleton-init-systemd.mk b/package/skeleton-init-systemd/skeleton-init-systemd.mk
> index 69991265a5..07a4180db0 100644
> --- a/package/skeleton-init-systemd/skeleton-init-systemd.mk
> +++ b/package/skeleton-init-systemd/skeleton-init-systemd.mk
> @@ -33,7 +33,7 @@ define SKELETON_INIT_SYSTEMD_ROOT_RO_OR_RW
> endef
>
> ifeq ($(BR2_INIT_SYSTEMD_VAR_FACTORY),y)
> -define SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR
> +define SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR_FACTORY
> rm -rf $(TARGET_DIR)/usr/share/factory/var
> mv $(TARGET_DIR)/var $(TARGET_DIR)/usr/share/factory/var
> mkdir -p $(TARGET_DIR)/var
> @@ -52,11 +52,25 @@ define SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR
> || exit 1; \
> fi; \
> done >$(TARGET_DIR)/usr/lib/tmpfiles.d/buildroot-factory.conf
> - $(INSTALL) -D -m 0644 $(SKELETON_INIT_SYSTEMD_PKGDIR)/var.mount \
> + $(INSTALL) -D -m 0644 $(SKELETON_INIT_SYSTEMD_PKGDIR)/factory/var.mount \
> $(TARGET_DIR)/usr/lib/systemd/system/var.mount
> endef
> -SKELETON_INIT_SYSTEMD_ROOTFS_PRE_CMD_HOOKS += SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR
> +SKELETON_INIT_SYSTEMD_ROOTFS_PRE_CMD_HOOKS += SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR_FACTORY
> endif # BR2_INIT_SYSTEMD_VAR_FACTORY
> +
> +ifeq ($(BR2_INIT_SYSTEMD_VAR_OVERLAYFS),y)
> +define SKELETON_INIT_SYSTEMD_LINUX_CONFIG_FIXUPS
> + $(call KCONFIG_ENABLE_OPT,CONFIG_OVERLAY_FS)
> +endef
> +define SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR_OVERLAYFS
> + $(INSTALL) -D -m 0644 $(SKELETON_INIT_SYSTEMD_PKGDIR)/overlayfs/var.mount \
> + $(TARGET_DIR)/usr/lib/systemd/system/var.mount
> + $(INSTALL) -D -m 0644 $(SKELETON_INIT_SYSTEMD_PKGDIR)/overlayfs/rootfs-bindmount-var.service \
> + $(TARGET_DIR)/usr/lib/systemd/system/rootfs-bindmount-var.service
> +endef
> +SKELETON_INIT_SYSTEMD_ROOTFS_PRE_CMD_HOOKS += SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR_OVERLAYFS
> +endif # BR2_INIT_SYSTEMD_VAR_OVERLAYFS
> +
> endif # BR2_TARGET_GENERIC_REMOUNT_ROOTFS_RW
>
> ifeq ($(BR2_INIT_SYSTEMD_POPULATE_TMPFILES),y)
> diff --git a/system/Config.in b/system/Config.in
> index 074fda509c..0c064b8211 100644
> --- a/system/Config.in
> +++ b/system/Config.in
> @@ -164,6 +164,14 @@ choice
> Select how Buildroot provides a read-write /var when the
> rootfs is not remounted read-write.
>
> + Note: Buildroot uses a tmpfs, either as a mount point or as
> + the upper of an overlayfs, so as to at least make the system
> + bootable out of the box; mounting a filesystem from actual
> + storage is left to the integration, as it is too specific and
> + may need preparatory work like partitionning a device and/or
> + formatting a filesystem first, which falls out of the scope
> + of Buildroot.
> +
> config BR2_INIT_SYSTEMD_VAR_FACTORY
> bool "build a factory to populate a tmpfs"
> help
> @@ -176,17 +184,19 @@ config BR2_INIT_SYSTEMD_VAR_FACTORY
> It probably does not play very well with triggering a call
> to systemd-tmpfiles at build time (below).
>
> - Note: Buildroot mounts a tmpfs on /var to at least make the
> - system bootable out of the box; mounting a filesystem from
> - actual storage is left to the integration, as it is too
> - specific and may need preparatory work like partitionning a
> - device and/or formatting a filesystem first, so that falls
> - out of the scope of Buildroot.
> -
> To use persistent storage, provide a systemd dropin for the
> var.mount unit, that overrides the What and Type, and possibly
> the Options and After, fields.
>
> +config BR2_INIT_SYSTEMD_VAR_OVERLAYFS
> + bool "mount an overlayfs backed by a tmpfs"
> + help
> + Mount an overlayfs on /var, with the upper as a tmpfs.
> +
> + To use a persistent storage, provide your own systemd unit(s)
> + that eventually mount that persistent storage on
> + /run/varoverlay/upper/

Perhaps pull in or depend on overlayfs here.

> +
> config BR2_INIT_SYSTEMD_VAR_CUSTOM
> bool "something else"
> help
> --
> 2.25.1

Generally, the unit and directory names could be more logical, and I am for allowing the user to specify a custom mount by reading an EnvironmentFile in the rootfs-bindmount-var unit.

Regards, Norbert

[1] http://lists.busybox.net/pipermail/buildroot/2020-July/287016.html

PS: I am planning to add some more comments next week; I should find some time then, as there are huge commit messages to get through.
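
For readers following the thread, the order of operations performed by the two units in the patch above (tmpfs, three directories, bind mount, overlay) can be traced with a small dry-run script. This is only a sketch under assumptions: the function name `overlay_var_plan` is hypothetical and not part of the patch, and the last line is the command-line equivalent of the `var.mount` unit rather than something the patch runs itself. Nothing is mounted; the commands are only printed.

```shell
#!/bin/sh
# Dry-run sketch: print, in order, the commands behind the overlayfs-on-/var
# scheme quoted above. Nothing is executed or mounted.
# overlay_var_plan is a hypothetical helper name, not part of the patch.
overlay_var_plan() {
    base="${1:-/run/varoverlay}"   # default path taken from the quoted units

    # Steps from rootfs-bindmount-var.service
    printf '%s\n' "mkdir $base"
    printf '%s\n' "mount --make-private -n -t tmpfs tmpfs_root_ovl $base"
    printf '%s\n' "mkdir $base/lower $base/upper $base/work"
    printf '%s\n' "mount --make-private -n --bind /var $base/lower"

    # Command-line equivalent of var.mount (What/Where/Type/Options)
    opts="lowerdir=$base/lower,upperdir=$base/upper,workdir=$base/work"
    opts="$opts,redirect_dir=on,index=on,xino=on"
    printf '%s\n' "mount -t overlay -o $opts overlay_var /var"
}

overlay_var_plan
```

Passing another base directory, for example `overlay_var_plan /run/.varoverlay`, shows how a renamed working area would propagate into the overlay options.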
Norbert, All,

Thank you for your feedback! :-)

On 2022-10-23 23:47 +0200, Norbert Lange spake thusly:
> On Tue, 18 Oct 2022 at 21:43, <yann.morin@orange.com> wrote:
> > While the /var factory seems to be working in most cases, there have
> > been suggestions that it may be slightly and subtely borken in some
> > (rare? edge?) cases, especially about symlinks.
>
> Had to dig up an old post of mine (not the only one touching on that),
> some issues are:
>
> - it kills previous files in /usr/share/factory/var
> - it doesn't handle symlinks (just think of /var already containing
> a symlink into that factory),
> especially relative ones.
> - it has sideeffects with tmpfile .confs that are ordered before and
> touch /var
> - it has sideeffects with other PRE_CMD_HOOKS touching /var
>
> (Post is from mid 2020, so forgive me if my memory is fuzzy,
> but I had already practical problems with atleast the last 2 of those).

Forgive me if my memory is fuzzy, but I don't recall seeing any patch to fix those issues with the factory... ;-)

> To me this is just not a robust solution

Yet, there are some people for whom the factory does work just fine (first-hand experience here, and besides your comments, we have had no one reporting actual issues in the 5+ years since we implemented the factory, AFAICR). So, we do not want to break the situation for them.

Once the overlayfs scheme has been in place for some time and has been exercised, we can consider switching the default, and eventually we can get rid of the factory if it proves to be unfixable (again, without concrete examples that do break it, we cannot devise a fix).

If we can't yet agree on how to integrate the overlayfs-based scheme, we need a way to sort out the conflict between the factory and running tmpfiles at build time, which is what patches 1-4 are for, since they do not change the current behaviour, but clarify the current situation.

So, those are the patches where we should concentrate for now.

Patches 5-6 introduce the new overlayfs scheme as an alternative to the factory, a new feature, so they can go in later...

And yes, I did test the overlayfs scheme in our use-case here, and yes it does work as advertised, so yes, this is a good feature!

> > An other solution is to pre-populate /var at build time, by way of
> > calling systemd-tmpfiles, and mounting an overlayfs on-top of it at
> > runtime.
> >
> > This is slightly accrobatic, though, and requires a few hoops:
> > - first, we create a tmpfs
> > - there, we create three directories:
> > - the first to bind-mount /var as it is, i.e. read-only
> > - the second as the read-write upper for the overlayfs
> > - the third as the "working area" for the overlays
> ..and we depend on overlayfs

I just had to enable overlayfs in the kernel, and it worked without any other package.

FTR, I have added new runtime tests to validate both the factory and the overlayfs scenarios. I will post them in the coming days when I've cleaned them up (I have to run them locally, as my GitLab free minutes are exhausted):

https://gitlab.com/ymorin/buildroot/-/tree/systemdify-var

[--SNIP--]
> > Systemd units courtesy Norbert, with slight tweaks and cleanups.
>
> Yeah, Im not fine with the tweaks to drop the symlink
> /usr/lib/systemd/system/var.mount -> ../var.mount
> (and the added intstall section)
>
> First in the same local-fs "target" you could mount /etc,
> making this a complicated hidden issue, I don't know
> when systemd reloads, I believe only after that target.
>
> Second, this should be enabled by default, and
> in a way even when /etc is borked/not ready.

So, currently, Buildroot does not work (does nothing to officially work) seamlessly with an empty /etc, because we explicitly run "systemctl preset-all" at the end of the build (as a prefs_cmd), and that fills in /etc/systemd/system/.

As you said, supporting an empty /etc will require *way* more explicit support in Buildroot.

> If a user really wants to disable the mount, he can mask it.

That is true if using either the symlink or the install section, no? I.e. they'd just provide a preset that reads:

    disable rootfs-bindmount-var.service
    disable var.mount

> > Signed-off-by: Yann E. MORIN <yann.morin@orange.com>
> > Cc: Norbert Lange <nolange79@gmail.com>
> > Cc: Romain Naour <romain.naour@smile.fr>
> > Cc: Jérémy Rosen <jeremy.rosen@smile.fr>
[--SNIP--]
> > --- /dev/null
> > +++ b/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
> > @@ -0,0 +1,21 @@
> > +[Unit]
> > +Description=Bind-mount variable storage (/var)
> > +Documentation=man:file-hierarchy(7)
> > +ConditionPathIsSymbolicLink=!/var
> > +# ConditionPathIsReadWrite=!/var
> > +DefaultDependencies=no
> > +Conflicts=umount.target
> > +Before=local-fs.target umount.target
> > +After=local-fs-pre.target
>
> A am actually considering changing that to:
>
>     Before=local-fs-pre.target umount.target
>     # After=local-fs-pre.target

There is no reason to keep comments in units.

> It does not depend an anything,

It does depend on /run being mounted.

> so no technical reason to order
> it after anything. And it is technically a preparation for the
> actual local-fs.target.

This.

We do not have a vision of the grand scheme of how systemd organises stuff, what each .target means and how they depend on each other. Maybe my search skills are getting rusty as time passes, but I could never find such a design doc, and it is sorely lacking. Manpages are good at explaining each detail, but they do not provide a global overview...

[--SNIP--]
> > +config BR2_INIT_SYSTEMD_VAR_OVERLAYFS
> > + bool "mount an overlayfs backed by a tmpfs"
> > + help
> > + Mount an overlayfs on /var, with the upper as a tmpfs.
> > +
> > + To use a persistent storage, provide your own systemd unit(s)
> > + that eventually mount that persistent storage on
> > + /run/varoverlay/upper/
> perhaps pull in or depend on overlayfs here

I did not need anything besides enabling overlayfs in the kernel (see the runtime tests branch I pointed to above).

> Generally the unit and directory names could be more logical,

Yes, I agree that we should have a kind of naming scheme for this. But I have no good idea...

What I was thinking, though, is that we maybe should make dotted directories, i.e. /run/.varoverlay/{lower,upper,work}

> and for allowing the user to specify a custom mount
> by reading an EnvironmentFile in the rootfs-bindmount-var unit.

It was my understanding that users could provide their own unit(s), something that would ultimately end up with a mount unit like:

    # cat run_varoverlay_upper.mount
    [Unit]
    After=rootfs-bindmount-var.service
    BindsTo=rootfs-bindmount-var.service

    [Mount]
    What=/dev/something
    Where=/run/varoverlay/upper
    Type=ext4-or-whatever

    [Install]
    BoundBy=var.mount
    WantedBy=var.mount

That way, they do not have to override any of our units; they would just intersperse their unit in the existing dependency graph, between the rootfs-bindmount-var.service and the var.mount.

And the content of the filesystem on /dev/something would only get the content of /var, not the {lower,upper,work} directories, which could be a bit confusing.

> Planning to add some more comments next week, should find dome time here,
> huge commit msgs to get through.

Commit messages are important, because they do provide all the rationale and reasoning behind a change. They will be there forever, and in the future, we can refer to them to understand why the code ended up like it is, and with new insight then, we can understand where our reasoning was flawed, or if it was correct, how the environment around has changed.

I consider a good commit log more important than the actual change.

Regards,
Yann E. MORIN.
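
A side note on the example mount unit in the message above: according to systemd.mount(5), a mount unit has to be named after the mount point it controls, so a unit with `Where=/run/varoverlay/upper` would be expected to live in a file called `run-varoverlay-upper.mount` rather than `run_varoverlay_upper.mount`. On a system with systemd installed, `systemd-escape -p --suffix=mount /run/varoverlay/upper` computes that name. The sketch below does the same for simple paths only; `mount_unit_name` is a hypothetical helper, and it deliberately ignores the extra escaping systemd applies to `-` and other special characters.

```shell
#!/bin/sh
# Hypothetical helper: derive a mount unit name from a mount point.
# Simplified on purpose: it only handles paths whose components contain no
# '-' or other characters that systemd would escape as \xNN sequences.
mount_unit_name() {
    path="${1#/}"      # drop the leading slash
    path="${path%/}"   # drop a trailing slash, if any
    printf '%s.mount\n' "$(printf '%s' "$path" | tr '/' '-')"
}

mount_unit_name /run/varoverlay/upper   # prints: run-varoverlay-upper.mount
```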
On Tue, 25 Oct 2022 at 10:08, <yann.morin@orange.com> wrote:
>
> Norbert, All,
>
> Thank you for your feedback! :-)
>
> On 2022-10-23 23:47 +0200, Norbert Lange spake thusly:
> > On Tue, 18 Oct 2022 at 21:43, <yann.morin@orange.com> wrote:
> > > While the /var factory seems to be working in most cases, there have
> > > been suggestions that it may be slightly and subtely borken in some
> > > (rare? edge?) cases, especially about symlinks.
> >
> > Had to dig up an old post of mine (not the only one touching on that),
> > some issues are:
> >
> > - it kills previous files in /usr/share/factory/var
> > - it doesn't handle symlinks (just think of /var already containing
> > a symlink into that factory),
> > especially relative ones.
> > - it has sideeffects with tmpfile .confs that are ordered before and
> > touch /var
> > - it has sideeffects with other PRE_CMD_HOOKS touching /var
> >
> > (Post is from mid 2020, so forgive me if my memory is fuzzy,
> > but I had already practical problems with atleast the last 2 of those).
>
> Forgive me if my memory is fuzzy, but I don't recall seeing any patch to
> fix those issues with the factory... ;-)

I am not sure a complete fix for the factory would even be a computable problem: once you know the implementation, you can "break it". There are far simpler solutions for "make a copy of that stuff".

>
> > To me this is just not a robust solution
>
> Yet, there are some people for whom the factory does work just fine
> (first-hand experience here, and besides your comments, we have had
> noone reporting actual issues in the 5+ years we've implemented the
> factory, AFAICR). So, we do not want to break the situation for them.
>
> Once the overlayfs scheme has been in place for some time and it got
> exercised, we can consider switching the default, and eventualy we can
> get rid of the factory if it proves to be unfixable (again, without
> concrete examples that do break it, we can devise a fix).

That's a sunk cost fallacy ;)

>
> If we can't yet agree on how to integrate the overlayfs based scheme, we
> need a way to sort out the conflict between the factory and running
> tmpfiles at build time, which is what patches 1-4 are for, since they do
> not change the current behaviour, but clarifies the current situation.
>
> So, those are the patches where we should concentrate for now.
>
> Patches 5-6 introduce the new overlayfs scheme as an alternative to the
> factory, a new feature, so they can go in later...
>
> And yes, I did test the overlayfs scheme in our use-case here, and yes
> it does work as advertised, so yes, this is a good feature!

Glad to hear it. I am going to clean it up a bit.

>
> > > An other solution is to pre-populate /var at build time, by way of
> > > calling systemd-tmpfiles, and mounting an overlayfs on-top of it at
> > > runtime.
> > >
> > > This is slightly accrobatic, though, and requires a few hoops:
> > > - first, we create a tmpfs
> > > - there, we create three directories:
> > > - the first to bind-mount /var as it is, i.e. read-only
> > > - the second as the read-write upper for the overlayfs
> > > - the third as the "working area" for the overlays
> > ..and we depend on overlayfs
>
> I just had to enable overlayfs in the kernel, and it worked without any
> other package.

I meant that it is not an option if the kernel does not provide overlayfs, which would be an argument against it.

>
> FTR, I have added new runtime tests to validate both the factory and
> the overlayfs scenarii. I will post them in the coming days when I've
> cleaned them up (have to run locally, as my gtilab free minutes are
> exhausted):
>
> https://gitlab.com/ymorin/buildroot/-/tree/systemdify-var
>
> [--SNIP--]
> > > Systemd units courtesy Norbert, with slight tweaks and cleanups.
> >
> > Yeah, Im not fine with the tweaks to drop the symlink
> > /usr/lib/systemd/system/var.mount -> ../var.mount
> > (and the added intstall section)
> >
> > First in the same local-fs "target" you could mount /etc,
> > making this a complicated hidden issue, I don't know
> > when systemd reloads, I believe only after that target.
> >
> > Second, this should be enabled by default, and
> > in a way even when /etc is borked/not ready.
>
> So, currently, Buildroot does not work (does nothing to officialy work)
> seemlessly with an empty /etc, because we explicitly run "systemctl
> preset-all" at the end of the build (as a prefs_cmd), and that fills in
> /etc/systemd/system/.
>
> As you said, supporting an empty /etc will require *way* more explicit
> support in Buildroot

It helps if there aren't any additional obstacles in the way; systemd envisions your rootfs "master" living under /usr. Key services should be statically linked (for example, dbus.socket).

> > If a user really wants to disable the mount, he can mask it.
>
> That is true if using either the symlink or the install section, no?
> I.e. they'd just provide a preset that reads:
>
>     disable rootfs-bindount-var.service
>     disable var.mount

Yeah, it should be a "hard" default, and not affected by the usual enable/disable/preset operations. The mask/unmask operations are the "hard" stuff.

Let's turn it around: what arguments can you muster for using the install functionality?

>
> > > Signed-off-by: Yann E. MORIN <yann.morin@orange.com>
> > > Cc: Norbert Lange <nolange79@gmail.com>
> > > Cc: Romain Naour <romain.naour@smile.fr>
> > > Cc: Jérémy Rosen <jeremy.rosen@smile.fr>
> [--SNIP--]
> > > --- /dev/null
> > > +++ b/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
> > > @@ -0,0 +1,21 @@
> > > +[Unit]
> > > +Description=Bind-mount variable storage (/var)
> > > +Documentation=man:file-hierarchy(7)
> > > +ConditionPathIsSymbolicLink=!/var
> > > +# ConditionPathIsReadWrite=!/var
> > > +DefaultDependencies=no
> > > +Conflicts=umount.target
> > > +Before=local-fs.target umount.target
> > > +After=local-fs-pre.target
> >
> > A am actually considering changing that to:
> >
> >     Before=local-fs-pre.target umount.target
> >     # After=local-fs-pre.target
>
> No reason to keep comment in units.

That was only for displaying the change; expect a series from me later.

>
> > It does not depend an anything,
>
> It does depend on /run being mounted.

Which is a *given invariant* with systemd, one of the first things that happens.

>
> > so no technical reason to order
> > it after anything. And it is technically a preparation for the
> > actual local-fs.target.
>
> This.
>
> We do not have a vision of the grand scheme of how systemd organises
> stuff, what each .target means and how they depend on each others. Maybe
> my search skills are getting rusted as time passes, but I could never
> find such a design doc, and it lacks sorely. Manpages are only so good
> as to explain each details, but they do not provide a global overview...

Like this one?
https://www.freedesktop.org/software/systemd/man/bootup.html

>
> [--SNIP--]
> > > +config BR2_INIT_SYSTEMD_VAR_OVERLAYFS
> > > + bool "mount an overlayfs backed by a tmpfs"
> > > + help
> > > + Mount an overlayfs on /var, with the upper as a tmpfs.
> > > +
> > > + To use a persistent storage, provide your own systemd unit(s)
> > > + that eventually mount that persistent storage on
> > > + /run/varoverlay/upper/
> > perhaps pull in or depend on overlayfs here
>
> I did not need anything beside enabling overlayfs in the kernel (see the
> runtime tests branch I pointed to above).
>
> > Generally the unit and directory names could be more logical,
>
> Yes, I agree that we should have a kind of naming scheme for this. But I
> have no good idea...
>
> What I was thinking, though, is that we maybe should make dotted
> directories, i.e. /run/.varoverlay/{lower,upper,work}

I'd namespace it, like /run/.buildroot/overlay_var_{lower,upper,work}.

>
> > and for allowing the user to specify a custom mount
> > by reading an EnvironmentFile in the rootfs-bindmount-var unit.
>
> It was my understanding that users could provide their own unit(s),
> something that would ultimately end up with a mount unit like:
>
>     # cat run_varoverlay_upper.mount
>     [Unit]
>     After=rootfs-bindmount-var.service
>     BindsTo=rootfs-bindmount-var.service
>
>     [Mount]
>     What=/dev/something
>     Where=/run/varoverlay/upper
>     Type=ext4-or-whatever
>
>     [Install]
>     BoundBy=var.mount
>     WantedBy=var.mount
>
> That way, they do not have to override any of our units; they would just
> intersperse their unit in the existing dependency graph, between the
> rootfs-bindmount-var.service and the var.mount.
>
> And the content of the filesystem on /dev/something would only get the
> content of /var, not the {lower,upper,work} directories, which could be
> a bit confusing.

Yeah, I haven't thought about customization. I am not sure whether it wouldn't be better to provide a simple one and a customizable one, i.e. the simple one doesn't have to wait for udev, for kernel modules for block devices to be loaded, or for network connections to do its job.

>
> > Planning to add some more comments next week, should find dome time here,
> > huge commit msgs to get through.
>
> Commit messages are important, because they do provide all the rationale
> and reasoning behind a change. They will be there forever, and in the
> future, we can refer to them to understand why the code eneded up like
> it is, and with new insight then, we can understand where our reasoning
> was flawed, or if it was correct, how the environment around has
> changed.
>
> I consider a good commit log more important than the actual change.

It wasn't meant as a complaint.

Regards, Norbert
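
To make the "hard default" versus preset distinction above concrete, here is a minimal sketch that builds both kinds of symlink in a scratch directory instead of a real rootfs. It rests on assumptions: the static symlink is placed in `local-fs.target.wants/`, which is one reading of the symlink Norbert describes (the path in the thread is not spelled out in full), and the scratch tree stands in for `$(TARGET_DIR)`. The masking half mirrors what `systemctl mask` does, namely a symlink to `/dev/null` under `/etc/systemd/system`.

```shell
#!/bin/sh
# Sketch only: operate on a throw-away tree, not on a real root filesystem.
root="$(mktemp -d)"
unitdir="$root/usr/lib/systemd/system"
mkdir -p "$unitdir/local-fs.target.wants" "$root/etc/systemd/system"

# Stand-in for the installed unit file.
: > "$unitdir/var.mount"

# Static enablement shipped under /usr: independent of /etc and of presets.
# (Assumed location; the thread only shows "-> ../var.mount".)
ln -s ../var.mount "$unitdir/local-fs.target.wants/var.mount"

# What a user would do to opt out: mask the unit from /etc.
ln -s /dev/null "$root/etc/systemd/system/var.mount"

ls -l "$unitdir/local-fs.target.wants" "$root/etc/systemd/system"
```

With an [Install] section instead, the first symlink would only appear under /etc once `systemctl preset-all` or `systemctl enable` has run, which is the dependency on /etc that the discussion is about.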
Am Di., 25. Okt. 2022 um 14:12 Uhr schrieb Norbert Lange <nolange79@gmail.com>: > > Am Di., 25. Okt. 2022 um 10:08 Uhr schrieb <yann.morin@orange.com>: > > > > Norbert, All, > > > > Thank you for your feedback! :-) > > > > On 2022-10-23 23:47 +0200, Norbert Lange spake thusly: > > > Am Di., 18. Okt. 2022 um 21:43 Uhr schrieb <yann.morin@orange.com>: > > > > While the /var factory seems to be working in most cases, there have > > > > been suggestions that it may be slightly and subtely borken in some > > > > (rare? edge?) cases, especially about symlinks. > > > > > > Had to dig up an old post of mine (not the only one touching on that), > > > some issues are: > > > > > > - it kills previous files in /usr/share/factory/var > > > - it doesn't handle symlinks (just think of /var already containing > > > a symlink into that factory), > > > especially relative ones. > > > - it has sideeffects with tmpfile .confs that are ordered before and > > > touch /var > > > - it has sideeffects with other PRE_CMD_HOOKS touching /var > > > > > > (Post is from mid 2020, so forgive me if my memory is fuzzy, > > > but I had already practical problems with atleast the last 2 of those). > > > > Forgive me if my memory is fuzzy, but I don't recall seeing any patch to > > fix those issues with the factory... ;-) > > I am not sure a complete fix for the factory would be a computable problem, > know the implementation then you can "break it". > there are way simpler solutions for "make a copy of that stuff". > > > > > > To me this is just not a robust solution > > > > Yet, there are some people for whom the factory does work just fine > > (first-hand experience here, and besides your comments, we have had > > noone reporting actual issues in the 5+ years we've implemented the > > factory, AFAICR). So, we do not want to break the situation for them. 
> >
> > Once the overlayfs scheme has been in place for some time and it got
> > exercised, we can consider switching the default, and eventually we can
> > get rid of the factory if it proves to be unfixable (again, without
> > concrete examples that do break it, we can devise a fix).
>
> That's a sunk cost fallacy ;)
>
> >
> > If we can't yet agree on how to integrate the overlayfs based scheme, we
> > need a way to sort out the conflict between the factory and running
> > tmpfiles at build time, which is what patches 1-4 are for, since they do
> > not change the current behaviour, but clarify the current situation.
> >
> > So, those are the patches where we should concentrate for now.
> >
> > Patches 5-6 introduce the new overlayfs scheme as an alternative to the
> > factory, a new feature, so they can go in later...
> >
> > And yes, I did test the overlayfs scheme in our use-case here, and yes
> > it does work as advertised, so yes, this is a good feature!
>
> Glad to hear, I am going to clean it up a bit
>
> >
> > > > Another solution is to pre-populate /var at build time, by way of
> > > > calling systemd-tmpfiles, and mounting an overlayfs on top of it at
> > > > runtime.
> > > >
> > > > This is slightly acrobatic, though, and requires a few hoops:
> > > >   - first, we create a tmpfs
> > > >   - there, we create three directories:
> > > >     - the first to bind-mount /var as it is, i.e. read-only
> > > >     - the second as the read-write upper for the overlayfs
> > > >     - the third as the "working area" for the overlays
> > > ..and we depend on overlayfs
> >
> > I just had to enable overlayfs in the kernel, and it worked without any
> > other package.
>
> I meant it's not an option if the kernel does not provide overlayfs,
> which would be an argument against it.
>
> >
> > FTR, I have added new runtime tests to validate both the factory and
> > the overlayfs scenarios. I will post them in the coming days when I've
> > cleaned them up (have to run locally, as my gitlab free minutes are
> > exhausted):
> >
> >     https://gitlab.com/ymorin/buildroot/-/tree/systemdify-var
> >
> > [--SNIP--]
> > > > Systemd units courtesy Norbert, with slight tweaks and cleanups.
> > >
> > > Yeah, I'm not fine with the tweaks to drop the symlink
> > > /usr/lib/systemd/system/var.mount -> ../var.mount
> > > (and the added install section)
> > >
> > > First, in the same local-fs "target" you could mount /etc,
> > > making this a complicated hidden issue; I don't know
> > > when systemd reloads, I believe only after that target.
> > >
> > > Second, this should be enabled by default, and
> > > in a way even when /etc is borked/not ready.
> >
> > So, currently, Buildroot does not work (does nothing to officially work)
> > seamlessly with an empty /etc, because we explicitly run "systemctl
> > preset-all" at the end of the build (as a prefs_cmd), and that fills in
> > /etc/systemd/system/.
> >
> > As you said, supporting an empty /etc will require *way* more explicit
> > support in Buildroot
>
> It helps if there aren't any additional blocks in the way; systemd
> envisions your rootfs "master" to live under /usr.
> Key services should be statically linked (like e.g. the dbus.socket).
>
> > > If a user really wants to disable the mount, he can mask it.
> >
> > That is true if using either the symlink or the install section, no?
> > I.e. they'd just provide a preset that reads:
> >
> >     disable rootfs-bindmount-var.service
> >     disable var.mount
>
> Yeah, it should be a "hard" default, and not affected by the usual
> enable/disable/preset operations.
> The mask/unmask operations are the "hard" stuff.
>
> Let's turn it around: what arguments can you muster
> for using the install functionality?
>
> >
> > > > Signed-off-by: Yann E. MORIN <yann.morin@orange.com>
> > > > Cc: Norbert Lange <nolange79@gmail.com>
> > > > Cc: Romain Naour <romain.naour@smile.fr>
> > > > Cc: Jérémy Rosen <jeremy.rosen@smile.fr>
> > [--SNIP--]
> > > > --- /dev/null
> > > > +++ b/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
> > > > @@ -0,0 +1,21 @@
> > > > +[Unit]
> > > > +Description=Bind-mount variable storage (/var)
> > > > +Documentation=man:file-hierarchy(7)
> > > > +ConditionPathIsSymbolicLink=!/var
> > > > +# ConditionPathIsReadWrite=!/var
> > > > +DefaultDependencies=no
> > > > +Conflicts=umount.target
> > > > +Before=local-fs.target umount.target
> > > > +After=local-fs-pre.target
> > >
> > > I am actually considering changing that to:
> > >
> > > Before=local-fs-pre.target umount.target
> > > # After=local-fs-pre.target
> >
> > No reason to keep comments in units.
>
> That was for displaying the change; expect a series from me later.
>
> >
> > > It does not depend on anything,
> >
> > It does depend on /run being mounted.
>
> Which is a *given invariant* with systemd, one
> of the first things that happen.
>
> >
> > > so no technical reason to order
> > > it after anything. And it is technically a preparation for the
> > > actual local-fs.target.
> >
> > This.
> >
> > We do not have a vision of the grand scheme of how systemd organises
> > stuff, what each .target means and how they depend on each other. Maybe
> > my search skills are getting rusty as time passes, but I could never
> > find such a design doc, and it is sorely lacking. Manpages are only so
> > good as to explain each detail, but they do not provide a global
> > overview...
>
> Like that? :
> https://www.freedesktop.org/software/systemd/man/bootup.html
>
> >
> > [--SNIP--]
> > > > +config BR2_INIT_SYSTEMD_VAR_OVERLAYFS
> > > > +	bool "mount an overlayfs backed by a tmpfs"
> > > > +	help
> > > > +	  Mount an overlayfs on /var, with the upper as a tmpfs.
> > > > +
> > > > +	  To use a persistent storage, provide your own systemd unit(s)
> > > > +	  that eventually mount that persistent storage on
> > > > +	  /run/varoverlay/upper/
> > > perhaps pull in or depend on overlayfs here
> >
> > I did not need anything beside enabling overlayfs in the kernel (see the
> > runtime tests branch I pointed to above).
> >
> > > Generally the unit and directory names could be more logical,
> >
> > Yes, I agree that we should have a kind of naming scheme for this. But I
> > have no good idea...
> >
> > What I was thinking, though, is that we maybe should make dotted
> > directories, i.e. /run/.varoverlay/{lower,upper,work}
>
> I'd namespace it like /run/.buildroot/overlay_var_{lower,upper,work}.
>
> >
> > > and for allowing the user to specify a custom mount
> > > by reading an EnvironmentFile in the rootfs-bindmount-var unit.
> >
> > It was my understanding that users could provide their own unit(s),
> > something that would ultimately end up with a mount unit like:
> >
> >     # cat run_varoverlay_upper.mount
> >     [Unit]
> >     After=rootfs-bindmount-var.service
> >     BindsTo=rootfs-bindmount-var.service
> >
> >     [Mount]
> >     What=/dev/something
> >     Where=/run/varoverlay/upper
> >     Type=ext4-or-whatever
> >
> >     [Install]
> >     BoundBy=var.mount
> >     WantedBy=var.mount
> >
> > That way, they do not have to override any of our units; they would just
> > intersperse their unit in the existing dependency graph, between the
> > rootfs-bindmount-var.service and the var.mount.
> >
> > And the content of the filesystem on /dev/something would only get the
> > content of /var, not the {lower,upper,work} directories, which could be
> > a bit confusing.
>
> Yeah, I haven't thought about customization. Not sure if it wouldn't be
> better to provide a simple one and a customizable one.
>
> I.e. the simple one doesn't have to wait for udev, loaded kernel modules
> for block devices, or network connections to do its job.
>
> >
> > > Planning to add some more comments next week, should find some time here,
> > > huge commit msgs to get through.
> >
> > Commit messages are important, because they do provide all the rationale
> > and reasoning behind a change. They will be there forever, and in the
> > future, we can refer to them to understand why the code ended up like
> > it is, and with new insight then, we can understand where our reasoning
> > was flawed, or if it was correct, how the environment around has
> > changed.
> >
> > I consider a good commit log more important than the actual change.
>
> Wasn't meant as a complaint.
>
> Regards, Norbert

I'm still working on this one, please keep this open for now.

The basic idea would be to designate a few Buildroot-specific directories
that can be used by multiple units:

/run/.br        - small filesystem stuff fitting the /run mount
/tmp/.br        - filesystem stuff too fat for /run
/run/.br/bnd/*  - bind mounts for evil trickery, replicating the
                  original path (e.g. /var/hugo is bind-mounted to
                  /run/.br/bnd/var/hugo)

Some feedback on where to document this? Anyone else required to look at
that?

The /var overlay would end up in /tmp/.br/ovl_var - no additional tmpfs
required. E.g. the mount options would be:
lowerdir=/run/.br/bnd/var,upperdir=/tmp/.br/ovl_var/up,workdir=/tmp/.br/ovl_var/wd

Regards, Norbert
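To make the proposed directory layout easier to discuss, here is a minimal shell sketch, under stated assumptions, that only composes the overlay options and prints the resulting mount command instead of running it. The helper names `ovl_opts` and `ovl_mount_cmd` are hypothetical and not part of the patch; the `up` and `wd` subdirectory names are simply taken from the example option string above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the patch: compose the overlayfs
# options for the layout sketched above without mounting anything.

# ovl_opts LOWER BASE
# Print the options string for an overlay whose lower layer is LOWER
# and whose upper and work directories live under BASE.
ovl_opts() {
	printf 'lowerdir=%s,upperdir=%s/up,workdir=%s/wd\n' "$1" "$2" "$2"
}

# ovl_mount_cmd TARGET LOWER BASE
# Print (do not run) the mount command for TARGET.
ovl_mount_cmd() {
	printf 'mount -t overlay overlay -o %s %s\n' "$(ovl_opts "$2" "$3")" "$1"
}

ovl_mount_cmd /var /run/.br/bnd/var /tmp/.br/ovl_var
```

Printing rather than mounting keeps the sketch runnable without root; an actual unit would still have to create the `up` and `wd` directories and order itself after whatever mounts /tmp.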
diff --git a/package/skeleton-init-systemd/var.mount b/package/skeleton-init-systemd/factory/var.mount
similarity index 100%
rename from package/skeleton-init-systemd/var.mount
rename to package/skeleton-init-systemd/factory/var.mount
diff --git a/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service b/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
new file mode 100644
index 0000000000..e412a56c49
--- /dev/null
+++ b/package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
@@ -0,0 +1,21 @@
+[Unit]
+Description=Bind-mount variable storage (/var)
+Documentation=man:file-hierarchy(7)
+ConditionPathIsSymbolicLink=!/var
+# ConditionPathIsReadWrite=!/var
+DefaultDependencies=no
+Conflicts=umount.target
+Before=local-fs.target umount.target
+After=local-fs-pre.target
+
+[Service]
+Type=oneshot
+RemainAfterExit=yes
+ExecStartPre=-/bin/mkdir /run/varoverlay
+ExecStartPre=/bin/mount --make-private -n -t tmpfs tmpfs_root_ovl /run/varoverlay
+ExecStartPre=/bin/mkdir /run/varoverlay/lower /run/varoverlay/upper /run/varoverlay/work
+ExecStart=/bin/mount --make-private -n --bind /var /run/varoverlay/lower
+
+ExecStop=/bin/umount -n /run/varoverlay/lower
+ExecStopPost=/bin/umount -n /run/varoverlay
+ExecStopPost=/bin/rmdir /run/varoverlay
diff --git a/package/skeleton-init-systemd/overlayfs/var.mount b/package/skeleton-init-systemd/overlayfs/var.mount
new file mode 100644
index 0000000000..fab223c27b
--- /dev/null
+++ b/package/skeleton-init-systemd/overlayfs/var.mount
@@ -0,0 +1,15 @@
+[Unit]
+Description=variable storage (/var)
+Documentation=man:file-hierarchy(7)
+ConditionPathIsSymbolicLink=!/var
+After=rootfs-bindmount-var.service
+BindsTo=rootfs-bindmount-var.service
+
+[Mount]
+What=overlay_var
+Where=/var
+Type=overlay
+Options=lowerdir=/run/varoverlay/lower,upperdir=/run/varoverlay/upper,workdir=/run/varoverlay/work,redirect_dir=on,index=on,xino=on
+
+[Install]
+WantedBy=local-fs.target
diff --git a/package/skeleton-init-systemd/skeleton-init-systemd.mk b/package/skeleton-init-systemd/skeleton-init-systemd.mk
index 69991265a5..07a4180db0 100644
--- a/package/skeleton-init-systemd/skeleton-init-systemd.mk
+++ b/package/skeleton-init-systemd/skeleton-init-systemd.mk
@@ -33,7 +33,7 @@ define SKELETON_INIT_SYSTEMD_ROOT_RO_OR_RW
 endef
 
 ifeq ($(BR2_INIT_SYSTEMD_VAR_FACTORY),y)
-define SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR
+define SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR_FACTORY
 	rm -rf $(TARGET_DIR)/usr/share/factory/var
 	mv $(TARGET_DIR)/var $(TARGET_DIR)/usr/share/factory/var
 	mkdir -p $(TARGET_DIR)/var
@@ -52,11 +52,25 @@ define SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR
 			|| exit 1; \
 		fi; \
 	done >$(TARGET_DIR)/usr/lib/tmpfiles.d/buildroot-factory.conf
-	$(INSTALL) -D -m 0644 $(SKELETON_INIT_SYSTEMD_PKGDIR)/var.mount \
+	$(INSTALL) -D -m 0644 $(SKELETON_INIT_SYSTEMD_PKGDIR)/factory/var.mount \
 		$(TARGET_DIR)/usr/lib/systemd/system/var.mount
 endef
-SKELETON_INIT_SYSTEMD_ROOTFS_PRE_CMD_HOOKS += SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR
+SKELETON_INIT_SYSTEMD_ROOTFS_PRE_CMD_HOOKS += SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR_FACTORY
 endif # BR2_INIT_SYSTEMD_VAR_FACTORY
+
+ifeq ($(BR2_INIT_SYSTEMD_VAR_OVERLAYFS),y)
+define SKELETON_INIT_SYSTEMD_LINUX_CONFIG_FIXUPS
+	$(call KCONFIG_ENABLE_OPT,CONFIG_OVERLAY_FS)
+endef
+define SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR_OVERLAYFS
+	$(INSTALL) -D -m 0644 $(SKELETON_INIT_SYSTEMD_PKGDIR)/overlayfs/var.mount \
+		$(TARGET_DIR)/usr/lib/systemd/system/var.mount
+	$(INSTALL) -D -m 0644 $(SKELETON_INIT_SYSTEMD_PKGDIR)/overlayfs/rootfs-bindmount-var.service \
+		$(TARGET_DIR)/usr/lib/systemd/system/rootfs-bindmount-var.service
+endef
+SKELETON_INIT_SYSTEMD_ROOTFS_PRE_CMD_HOOKS += SKELETON_INIT_SYSTEMD_PRE_ROOTFS_VAR_OVERLAYFS
+endif # BR2_INIT_SYSTEMD_VAR_OVERLAYFS
+
 endif # BR2_TARGET_GENERIC_REMOUNT_ROOTFS_RW
 
 ifeq ($(BR2_INIT_SYSTEMD_POPULATE_TMPFILES),y)
diff --git a/system/Config.in b/system/Config.in
index 074fda509c..0c064b8211 100644
--- a/system/Config.in
+++ b/system/Config.in
@@ -164,6 +164,14 @@ choice
 	  Select how Buildroot provides a read-write /var when the rootfs
 	  is not remounted read-write.
 
+	  Note: Buildroot uses a tmpfs, either as a mount point or as
+	  the upper of an overlayfs, so as to at least make the system
+	  bootable out of the box; mounting a filesystem from actual
+	  storage is left to the integration, as it is too specific and
+	  may need preparatory work like partitionning a device and/or
+	  formatting a filesystem first, which falls out of the scope
+	  of Buildroot.
+
 config BR2_INIT_SYSTEMD_VAR_FACTORY
 	bool "build a factory to populate a tmpfs"
 	help
@@ -176,17 +184,19 @@ config BR2_INIT_SYSTEMD_VAR_FACTORY
 	  It probably does not play very well with triggering a call
 	  to systemd-tmpfiles at build time (below).
 
-	  Note: Buildroot mounts a tmpfs on /var to at least make the
-	  system bootable out of the box; mounting a filesystem from
-	  actual storage is left to the integration, as it is too
-	  specific and may need preparatory work like partitionning a
-	  device and/or formatting a filesystem first, so that falls
-	  out of the scope of Buildroot.
-
 	  To use persistent storage, provide a systemd dropin for the
 	  var.mount unit, that overrides the What and Type, and possibly
 	  the Options and After, fields.
 
+config BR2_INIT_SYSTEMD_VAR_OVERLAYFS
+	bool "mount an overlayfs backed by a tmpfs"
+	help
+	  Mount an overlayfs on /var, with the upper as a tmpfs.
+
+	  To use a persistent storage, provide your own systemd unit(s)
+	  that eventually mount that persistent storage on
+	  /run/varoverlay/upper/
+
 config BR2_INIT_SYSTEMD_VAR_CUSTOM
 	bool "something else"
 	help
While the /var factory seems to be working in most cases, there have
been suggestions that it may be slightly and subtly broken in some
(rare? edge?) cases, especially about symlinks.

Another solution is to pre-populate /var at build time, by way of
calling systemd-tmpfiles, and mounting an overlayfs on top of it at
runtime.

This is slightly acrobatic, though, and requires a few hoops:
  - first, we create a tmpfs
  - there, we create three directories:
    - the first to bind-mount /var as it is, i.e. read-only
    - the second as the read-write upper for the overlayfs
    - the third as the "working area" for the overlays

This is done with two systemd units:
  - rootfs-bindmount-var.service: prepares up to bind-mounting /var into
    the tmpfs
  - var.mount: a mount unit which actually mounts the overlayfs.

Users who want to provide actual storage to keep /var across reboots
will have to provide their own mount units and make them RequiredBy and
BoundBy our var.mount unit.

Systemd units courtesy Norbert, with slight tweaks and cleanups.

Signed-off-by: Yann E. MORIN <yann.morin@orange.com>
Cc: Norbert Lange <nolange79@gmail.com>
Cc: Romain Naour <romain.naour@smile.fr>
Cc: Jérémy Rosen <jeremy.rosen@smile.fr>
---
 .../{ => factory}/var.mount                   |  0
 .../overlayfs/rootfs-bindmount-var.service    | 21 ++++++++++++++++
 .../skeleton-init-systemd/overlayfs/var.mount | 15 ++++++++++++
 .../skeleton-init-systemd.mk                  | 20 +++++++++++++---
 system/Config.in                              | 24 +++++++++++++------
 5 files changed, 70 insertions(+), 10 deletions(-)
 rename package/skeleton-init-systemd/{ => factory}/var.mount (100%)
 create mode 100644 package/skeleton-init-systemd/overlayfs/rootfs-bindmount-var.service
 create mode 100644 package/skeleton-init-systemd/overlayfs/var.mount
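
Since the overlayfs scheme only works when the running kernel provides overlayfs (the patch enables CONFIG_OVERLAY_FS through the Linux config fixups), a quick runtime check on the target could look like the sketch below. The helper name `has_fs` is hypothetical, and its optional second argument exists only so the check can be exercised against a sample file; note that a filesystem built as a module that has not been loaded yet may not be listed.

```shell
#!/bin/sh
# has_fs NAME [FILE]
# Succeed if filesystem type NAME is listed in FILE (default:
# /proc/filesystems). Each line there is "[nodev]<TAB>name", so the
# name is always the last field.
has_fs() {
	awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
		"${2:-/proc/filesystems}"
}

if has_fs overlay; then
	echo "overlayfs is available"
else
	echo "overlayfs is not registered with the running kernel"
fi
```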