diff mbox series

[1/2] core: fail if running under systemd but not sockets

Message ID 20230926084151.966511-1-felix.moessbauer@siemens.com
State Changes Requested
Delegated to: Stefano Babic
Headers show
Series [1/2] core: fail if running under systemd but not sockets | expand

Commit Message

Felix Moessbauer Sept. 26, 2023, 8:41 a.m. UTC
When running with CONFIG_SYSTEMD on a systemd bootet system, fail in
case no sockets were created by systemd. This solves the following
issues:

- improper cleanup of sockets: previously the self-created sockets were
  not properly removed in this case
- fail fast instead of silently: Duplicated sockets were created in
  different paths.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 core/network_thread.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Stefano Babic Oct. 17, 2023, 8:34 a.m. UTC | #1
Hi Felix,

On 26.09.23 10:41, 'Felix Moessbauer' via swupdate wrote:
> When running with CONFIG_SYSTEMD on a systemd bootet system, fail in
> case no sockets were created by systemd. This solves the following
> issues:
> 
> - improper cleanup of sockets: previously the self-created sockets were
>    not properly removed in this case
> - fail fast instead of silently: Duplicated sockets were created in
>    different paths.
> 
> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> ---
>   core/network_thread.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/core/network_thread.c b/core/network_thread.c
> index a5b6901..bd973ab 100644
> --- a/core/network_thread.c
> +++ b/core/network_thread.c
> @@ -235,7 +235,8 @@ int listener_create(const char *path, int type)
>   			}
>   		}
>   		if (listenfd == -1) {
> -			TRACE("got no socket at %s from systemd", path);
> +			ERROR("got no socket at %s from systemd", path);
> +			return -1;
>   		} else {
>   			TRACE("got socket fd=%d at %s from systemd", listenfd, path);
>   		}

I think this is too simplistic. This solves the most use case when 
SWUpdate runs as daemon (quite 100% on my projects) and systemd 
overtakes this.

There are other use cases where SWUpdate runs in foreground, when "-d" 
(download) or "-i" (install from file) are passed. In that cases, 
systemd is not involved and it exits here. So this patch breaks some use 
cases and (as far as I know) some projects in field. Yes, not project of 
mine, but it does not matter.

I guess that to improve like you want to have, something more should be 
done. SWUpdate should store if it runs in foreground or not, and raise 
the error just in case it is planned that it is started by systemd.

Best regards,
Stefano
Felix Moessbauer Oct. 18, 2023, 1:42 p.m. UTC | #2
On Tue, 2023-10-17 at 10:34 +0200, Stefano Babic wrote:
> Hi Felix,
> 
> On 26.09.23 10:41, 'Felix Moessbauer' via swupdate wrote:
> > When running with CONFIG_SYSTEMD on a systemd bootet system, fail
> > in
> > case no sockets were created by systemd. This solves the following
> > issues:
> > 
> > - improper cleanup of sockets: previously the self-created sockets
> > were
> >    not properly removed in this case
> > - fail fast instead of silently: Duplicated sockets were created in
> >    different paths.
> > 
> > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > ---
> >   core/network_thread.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/core/network_thread.c b/core/network_thread.c
> > index a5b6901..bd973ab 100644
> > --- a/core/network_thread.c
> > +++ b/core/network_thread.c
> > @@ -235,7 +235,8 @@ int listener_create(const char *path, int type)
> >                         }
> >                 }
> >                 if (listenfd == -1) {
> > -                       TRACE("got no socket at %s from systemd",
> > path);
> > +                       ERROR("got no socket at %s from systemd",
> > path);
> > +                       return -1;
> >                 } else {
> >                         TRACE("got socket fd=%d at %s from
> > systemd", listenfd, path);
> >                 }
> 
> I think this is too simplistic. This solves the most use case when 
> SWUpdate runs as daemon (quite 100% on my projects) and systemd 
> overtakes this.
> 
> There are other use cases where SWUpdate runs in foreground, when "-
> d" 
> (download) or "-i" (install from file) are passed. In that cases, 
> systemd is not involved and it exits here. So this patch breaks some
> use 
> cases and (as far as I know) some projects in field. Yes, not project
> of 
> mine, but it does not matter.

Hi.

Uuh... actually it breaks on systems that are bootet with systemd, but
where the swupdate binary is run in foreground. That's a bigger design
flaw. As there is no reliable way to detect if the service is running
under systemd, we indeed must rely on fixed socket paths. But the
socket paths must not depend on the TMPDIR.

> 
> I guess that to improve like you want to have, something more should
> be 
> done. SWUpdate should store if it runs in foreground or not, and
> raise 
> the error just in case it is planned that it is started by systemd.

The socket cleanup logic needs to handle that case correctly, as it
needs to track if sockets were created and not if the init system is
systemd.

Finally, all the sd_bootet checks can be removed as the output is
meaningless anyways. All sd_* functions also behave correctly on non
systemd bootet systems. That's explicitly stated in the man page.

I'll split this series to first only fix the docs, the socket leak and
get rid of sd_booted. Then we can further discuss how to solve the more
conceptual issues around the TMPDIR (in a possibly backward compatible
manner).

Felix

> 
> Best regards,
> Stefano
>
Stefano Babic Oct. 18, 2023, 3:10 p.m. UTC | #3
Hi Felix,

On 18.10.23 15:42, 'MOESSBAUER, Felix' via swupdate wrote:
> On Tue, 2023-10-17 at 10:34 +0200, Stefano Babic wrote:
>> Hi Felix,
>>
>> On 26.09.23 10:41, 'Felix Moessbauer' via swupdate wrote:
>>> When running with CONFIG_SYSTEMD on a systemd bootet system, fail
>>> in
>>> case no sockets were created by systemd. This solves the following
>>> issues:
>>>
>>> - improper cleanup of sockets: previously the self-created sockets
>>> were
>>>     not properly removed in this case
>>> - fail fast instead of silently: Duplicated sockets were created in
>>>     different paths.
>>>
>>> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
>>> ---
>>>    core/network_thread.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/core/network_thread.c b/core/network_thread.c
>>> index a5b6901..bd973ab 100644
>>> --- a/core/network_thread.c
>>> +++ b/core/network_thread.c
>>> @@ -235,7 +235,8 @@ int listener_create(const char *path, int type)
>>>                          }
>>>                  }
>>>                  if (listenfd == -1) {
>>> -                       TRACE("got no socket at %s from systemd",
>>> path);
>>> +                       ERROR("got no socket at %s from systemd",
>>> path);
>>> +                       return -1;
>>>                  } else {
>>>                          TRACE("got socket fd=%d at %s from
>>> systemd", listenfd, path);
>>>                  }
>>
>> I think this is too simplistic. This solves the most use case when
>> SWUpdate runs as daemon (quite 100% on my projects) and systemd
>> overtakes this.
>>
>> There are other use cases where SWUpdate runs in foreground, when "-
>> d"
>> (download) or "-i" (install from file) are passed. In that cases,
>> systemd is not involved and it exits here. So this patch breaks some
>> use
>> cases and (as far as I know) some projects in field. Yes, not project
>> of
>> mine, but it does not matter.
> 
> Hi.
> 
> Uuh... actually it breaks on systems that are bootet with systemd, but
> where the swupdate binary is run in foreground.

Do you mean you start two instances of SWUpdate ?

If not, I do not see the issue. Systemd does not start SWUpdate, and 
SWUpdate generates itself the sockets.

> That's a bigger design
> flaw. As there is no reliable way to detect if the service is running
> under systemd, we indeed must rely on fixed socket paths. But the
> socket paths must not depend on the TMPDIR.

But we know if SWUpdate is running in foreground as one-shot updater, or 
it is running as daemon.

TMPDIR (name is not the best, I agree) is just the HOME directory for 
SWUpdate's temporary files, including sockets. It can be set to whatever 
we want before starting. Which are the issues you are currently seeing 
with it ?

> 
>>
>> I guess that to improve like you want to have, something more should
>> be
>> done. SWUpdate should store if it runs in foreground or not, and
>> raise
>> the error just in case it is planned that it is started by systemd.
> 
> The socket cleanup logic needs to handle that case correctly, as it
> needs to track if sockets were created and not if the init system is
> systemd.

Well, it is the same. Socket creation outside of SWUpdate happens just 
if systemd did it. There are not other use cases.

> 
> Finally, all the sd_bootet checks can be removed as the output is
> meaningless anyways.
> All sd_* functions also behave correctly on non
> systemd bootet systems.

Ok, I need some more information. Where is coming ds_boot() if we have 
no systemd ?

> That's explicitly stated in the man page.

I do not know what you mean.

On systems without systemd, there is no libsystemd. libsystemd cannot be 
built at all, and I mean on OE build when virtual_init is not set to 
systemd. This requires the introduction of CONFIG_SYSTEMD here, while 
everything could be cleaner just by checking the return code of 
sd_booted, if available.

So please explain me better, I have not yet understood.

> 
> I'll split this series to first only fix the docs, the socket leak and
> get rid of sd_booted. Then we can further discuss how to solve the more
> conceptual issues around the TMPDIR (in a possibly backward compatible
> manner).
> 

Best regards,
Stefano
Felix Moessbauer Oct. 19, 2023, 5:43 a.m. UTC | #4
On Wed, 2023-10-18 at 17:10 +0200, Stefano Babic wrote:
> Hi Felix,
> 
> On 18.10.23 15:42, 'MOESSBAUER, Felix' via swupdate wrote:
> > On Tue, 2023-10-17 at 10:34 +0200, Stefano Babic wrote:
> > > Hi Felix,
> > > 
> > > On 26.09.23 10:41, 'Felix Moessbauer' via swupdate wrote:
> > > > When running with CONFIG_SYSTEMD on a systemd bootet system,
> > > > fail
> > > > in
> > > > case no sockets were created by systemd. This solves the
> > > > following
> > > > issues:
> > > > 
> > > > - improper cleanup of sockets: previously the self-created
> > > > sockets
> > > > were
> > > >     not properly removed in this case
> > > > - fail fast instead of silently: Duplicated sockets were
> > > > created in
> > > >     different paths.
> > > > 
> > > > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > > > ---
> > > >    core/network_thread.c | 3 ++-
> > > >    1 file changed, 2 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/core/network_thread.c b/core/network_thread.c
> > > > index a5b6901..bd973ab 100644
> > > > --- a/core/network_thread.c
> > > > +++ b/core/network_thread.c
> > > > @@ -235,7 +235,8 @@ int listener_create(const char *path, int
> > > > type)
> > > >                          }
> > > >                  }
> > > >                  if (listenfd == -1) {
> > > > -                       TRACE("got no socket at %s from
> > > > systemd",
> > > > path);
> > > > +                       ERROR("got no socket at %s from
> > > > systemd",
> > > > path);
> > > > +                       return -1;
> > > >                  } else {
> > > >                          TRACE("got socket fd=%d at %s from
> > > > systemd", listenfd, path);
> > > >                  }
> > > 
> > > I think this is too simplistic. This solves the most use case
> > > when
> > > SWUpdate runs as daemon (quite 100% on my projects) and systemd
> > > overtakes this.
> > > 
> > > There are other use cases where SWUpdate runs in foreground, when
> > > "-
> > > d"
> > > (download) or "-i" (install from file) are passed. In that cases,
> > > systemd is not involved and it exits here. So this patch breaks
> > > some
> > > use
> > > cases and (as far as I know) some projects in field. Yes, not
> > > project
> > > of
> > > mine, but it does not matter.
> > 
> > Hi.
> > 
> > Uuh... actually it breaks on systems that are bootet with systemd,
> > but
> > where the swupdate binary is run in foreground.
> 
> Do you mean you start two instances of SWUpdate ?

No, just a single instance (with e.g. -i), running outside of a systemd
service (but on a systemd bootet system). The main issue is that
sd_booted() only checks if the system is booted with systemd, but not
if the executable is running inside a systemd service.

> 
> If not, I do not see the issue. Systemd does not start SWUpdate, and 
> SWUpdate generates itself the sockets.
> 
> > That's a bigger design
> > flaw. As there is no reliable way to detect if the service is
> > running
> > under systemd, we indeed must rely on fixed socket paths. But the
> > socket paths must not depend on the TMPDIR.
> 
> But we know if SWUpdate is running in foreground as one-shot updater,
> or 
> it is running as daemon.
> 
> TMPDIR (name is not the best, I agree) is just the HOME directory for
> SWUpdate's temporary files, including sockets. It can be set to
> whatever 
> we want before starting. Which are the issues you are currently
> seeing 
> with it ?

If we change TMPDIR inside the systemd unit file, the sockets provided
by systemd are still created below /run. However swupdate wants to
connect to the sockets under $TMPDIR/ and there are none. Hence it
thinks no sockets were provided by systemd and creates these. Later in
the cleanup, the (incorrect) logic checks if the system was bootet
using systemd and the sockets created by swupdate are not removed.

> 
> > 
> > > 
> > > I guess that to improve like you want to have, something more
> > > should
> > > be
> > > done. SWUpdate should store if it runs in foreground or not, and
> > > raise
> > > the error just in case it is planned that it is started by
> > > systemd.
> > 
> > The socket cleanup logic needs to handle that case correctly, as it
> > needs to track if sockets were created and not if the init system
> > is
> > systemd.
> 
> Well, it is the same. Socket creation outside of SWUpdate happens
> just 
> if systemd did it. There are not other use cases.

Still, we need to track if the sockets were created by swupdate and
hence need to be cleaned up by swupdate.

> 
> > 
> > Finally, all the sd_bootet checks can be removed as the output is
> > meaningless anyways.
> > All sd_* functions also behave correctly on non
> > systemd bootet systems.
> 
> Ok, I need some more information. Where is coming ds_boot() if we
> have 
> no systemd ?

We need to distinguish between compile time and runtime. We already
have ifdefs to enable / disable systemd support. At runtime, we
currently check (in case sd support was compiled in), if the system is
booted with systemd (i.e. pid1 is systemd). These checks are
irrelevant, as all sd_* functions internally already do that.

> 
> > That's explicitly stated in the man page.
> 
> I do not know what you mean.

man sd_booted

> 
> On systems without systemd, there is no libsystemd. libsystemd cannot
> be 
> built at all, and I mean on OE build when virtual_init is not set to 
> systemd. This requires the introduction of CONFIG_SYSTEMD here, while
> everything could be cleaner just by checking the return code of 
> sd_booted, if available.
> 
> So please explain me better, I have not yet understood.

Yes, exactly. My plan is to keep the compile time guards, but remove
the runtime guards, as these are not required.

Best regards,
Felix

> 
> > 
> > I'll split this series to first only fix the docs, the socket leak
> > and
> > get rid of sd_booted. Then we can further discuss how to solve the
> > more
> > conceptual issues around the TMPDIR (in a possibly backward
> > compatible
> > manner).
> > 
> 
> Best regards,
> Stefano
>
Stefano Babic Oct. 19, 2023, 8:35 a.m. UTC | #5
Hi Felix,

On 19.10.23 07:43, 'MOESSBAUER, Felix' via swupdate wrote:
> On Wed, 2023-10-18 at 17:10 +0200, Stefano Babic wrote:
>> Hi Felix,
>>
>> On 18.10.23 15:42, 'MOESSBAUER, Felix' via swupdate wrote:
>>> On Tue, 2023-10-17 at 10:34 +0200, Stefano Babic wrote:
>>>> Hi Felix,
>>>>
>>>> On 26.09.23 10:41, 'Felix Moessbauer' via swupdate wrote:
>>>>> When running with CONFIG_SYSTEMD on a systemd bootet system,
>>>>> fail
>>>>> in
>>>>> case no sockets were created by systemd. This solves the
>>>>> following
>>>>> issues:
>>>>>
>>>>> - improper cleanup of sockets: previously the self-created
>>>>> sockets
>>>>> were
>>>>>      not properly removed in this case
>>>>> - fail fast instead of silently: Duplicated sockets were
>>>>> created in
>>>>>      different paths.
>>>>>
>>>>> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
>>>>> ---
>>>>>     core/network_thread.c | 3 ++-
>>>>>     1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/core/network_thread.c b/core/network_thread.c
>>>>> index a5b6901..bd973ab 100644
>>>>> --- a/core/network_thread.c
>>>>> +++ b/core/network_thread.c
>>>>> @@ -235,7 +235,8 @@ int listener_create(const char *path, int
>>>>> type)
>>>>>                           }
>>>>>                   }
>>>>>                   if (listenfd == -1) {
>>>>> -                       TRACE("got no socket at %s from
>>>>> systemd",
>>>>> path);
>>>>> +                       ERROR("got no socket at %s from
>>>>> systemd",
>>>>> path);
>>>>> +                       return -1;
>>>>>                   } else {
>>>>>                           TRACE("got socket fd=%d at %s from
>>>>> systemd", listenfd, path);
>>>>>                   }
>>>>
>>>> I think this is too simplistic. This solves the most use case
>>>> when
>>>> SWUpdate runs as daemon (quite 100% on my projects) and systemd
>>>> overtakes this.
>>>>
>>>> There are other use cases where SWUpdate runs in foreground, when
>>>> "-
>>>> d"
>>>> (download) or "-i" (install from file) are passed. In that cases,
>>>> systemd is not involved and it exits here. So this patch breaks
>>>> some
>>>> use
>>>> cases and (as far as I know) some projects in field. Yes, not
>>>> project
>>>> of
>>>> mine, but it does not matter.
>>>
>>> Hi.
>>>
>>> Uuh... actually it breaks on systems that are bootet with systemd,
>>> but
>>> where the swupdate binary is run in foreground.
>>
>> Do you mean you start two instances of SWUpdate ?
> 
> No, just a single instance (with e.g. -i), running outside of a systemd
> service (but on a systemd bootet system). The main issue is that
> sd_booted() only checks if the system is booted with systemd, but not
> if the executable is running inside a systemd service.
> 

Ok, got it. And yes, this does not work. SWUpdate should be rebuilt 
without CONFIG_SYSTEMD to get it working, very noisy.

>>
>> If not, I do not see the issue. Systemd does not start SWUpdate, and
>> SWUpdate generates itself the sockets.
>>
>>> That's a bigger design
>>> flaw. As there is no reliable way to detect if the service is
>>> running
>>> under systemd, we indeed must rely on fixed socket paths. But the
>>> socket paths must not depend on the TMPDIR.
>>
>> But we know if SWUpdate is running in foreground as one-shot updater,
>> or
>> it is running as daemon.
>>
>> TMPDIR (name is not the best, I agree) is just the HOME directory for
>> SWUpdate's temporary files, including sockets. It can be set to
>> whatever
>> we want before starting. Which are the issues you are currently
>> seeing
>> with it ?
> 
> If we change TMPDIR inside the systemd unit file, the sockets provided
> by systemd are still created below /run. However swupdate wants to
> connect to the sockets under $TMPDIR/ and there are none. Hence it
> thinks no sockets were provided by systemd and creates these. Later in
> the cleanup, the (incorrect) logic checks if the system was bootet
> using systemd and the sockets created by swupdate are not removed.

Ok, I see. It raises the spontan question how many advantages we have 
letting that sockets are created by systemd. The big advantage is if we 
have other services (applications) that want to connect to SWUpdate, and 
systemd will rules this and all services can be started in parallel. On 
the other side, SWUpdate makes already housekeeping, and sockets are 
created and deleted if CONFIG_SYSTEMD is not set.

> 
>>
>>>
>>>>
>>>> I guess that to improve like you want to have, something more
>>>> should
>>>> be
>>>> done. SWUpdate should store if it runs in foreground or not, and
>>>> raise
>>>> the error just in case it is planned that it is started by
>>>> systemd.
>>>
>>> The socket cleanup logic needs to handle that case correctly, as it
>>> needs to track if sockets were created and not if the init system
>>> is
>>> systemd.
>>
>> Well, it is the same. Socket creation outside of SWUpdate happens
>> just
>> if systemd did it. There are not other use cases.
> 
> Still, we need to track if the sockets were created by swupdate and
> hence need to be cleaned up by swupdate.

I agree that this should be tracked.

> 
>>
>>>
>>> Finally, all the sd_bootet checks can be removed as the output is
>>> meaningless anyways.
>>> All sd_* functions also behave correctly on non
>>> systemd bootet systems.
>>
>> Ok, I need some more information. Where is coming ds_boot() if we
>> have
>> no systemd ?
> 
> We need to distinguish between compile time and runtime. We already
> have ifdefs to enable / disable systemd support. At runtime, we
> currently check (in case sd support was compiled in), if the system is
> booted with systemd (i.e. pid1 is systemd). These checks are
> irrelevant, as all sd_* functions internally already do that.
> 
>>
>>> That's explicitly stated in the man page.
>>
>> I do not know what you mean.
> 
> man sd_booted
> 
>>
>> On systems without systemd, there is no libsystemd. libsystemd cannot
>> be
>> built at all, and I mean on OE build when virtual_init is not set to
>> systemd. This requires the introduction of CONFIG_SYSTEMD here, while
>> everything could be cleaner just by checking the return code of
>> sd_booted, if available.
>>
>> So please explain me better, I have not yet understood.
> 
> Yes, exactly. My plan is to keep the compile time guards, but remove
> the runtime guards, as these are not required.

Agree that sd_booted() should be removed, it is not the right check. Not 
sure it is enough, biut it is in the right direction.

Best regards,
Stefano

> 
> Best regards,
> Felix
> 
>>
>>>
>>> I'll split this series to first only fix the docs, the socket leak
>>> and
>>> get rid of sd_booted. Then we can further discuss how to solve the
>>> more
>>> conceptual issues around the TMPDIR (in a possibly backward
>>> compatible
>>> manner).
>>>
>>
>> Best regards,
>> Stefano
>>
>
diff mbox series

Patch

diff --git a/core/network_thread.c b/core/network_thread.c
index a5b6901..bd973ab 100644
--- a/core/network_thread.c
+++ b/core/network_thread.c
@@ -235,7 +235,8 @@  int listener_create(const char *path, int type)
 			}
 		}
 		if (listenfd == -1) {
-			TRACE("got no socket at %s from systemd", path);
+			ERROR("got no socket at %s from systemd", path);
+			return -1;
 		} else {
 			TRACE("got socket fd=%d at %s from systemd", listenfd, path);
 		}