Message ID | 20210531183702.54869-1-christian.storm@siemens.com |
---|---|
State | Accepted |
Headers | show |
Series | doc: improve 'best practise' spelling and wording | expand |
Hi Christian, On 31.05.21 20:37, Christian Storm wrote: > Signed-off-by: Christian Storm <christian.storm@siemens.com> > --- > doc/source/swupdate-best-practise.rst | 217 +++++++++++++------------- > 1 file changed, 108 insertions(+), 109 deletions(-) > > diff --git a/doc/source/swupdate-best-practise.rst b/doc/source/swupdate-best-practise.rst > index e222fb7..afd5789 100644 > --- a/doc/source/swupdate-best-practise.rst > +++ b/doc/source/swupdate-best-practise.rst > @@ -9,12 +9,12 @@ Swupdate Best Practice > This is intended as general rule to integrate SWUpdate into a custom project. > > SWUpdate is an update agent and it is thought to be a framework. This means it is highly > -configurable and SWUpdate should be configured to fit into a project, and not > -viceversa. SWUpdate makes just few requirements on the system and it has no fix schema. > -There is no restrictions about how many partitions, or which storage you are using. > +configurable and SWUpdate should be configured to fit into a project, not > +vice versa. SWUpdate makes just a few requirements on the system and it has no fixed update schema. > +There is no restriction on how many partitions or which storage you are using. > In some more complex cases, the update depends on a lot of conditions, > -and SWUpdate can run differently according to the `mode` a device is started. > -Think that SWUpdate is not a ready-to-use updater, and you need first to > +and SWUpdate can run differently according to the `mode` a device is started in. > +Think about SWUpdate not being a ready-to-use updater but a framework, and hence you should first > write a meaningful: > > Update Concept > @@ -24,91 +24,92 @@ Take your time and write first an update concept for your device. It is not was > You have to imagine conditions when an update is not working, and try to write > down the use cases when an update can fail and how the device can be restored. > SWUpdate installs new software, but a successful update means that the new software > -is started and runs flawlessly. The interface with the bootloader (or the one that start the > -SW) must be checked in details. > +is started and runs flawlessly. The interface with the bootloader (or the one that starts the > +software) must be checked in details. > A successful update means: > -- SWUpdate runs with success > -- Device reboots > -- Bootloader can start the new software > -- The new software runs, makes some consistency checks, and declare that the transaction (that is from old to new sw) is terminated. > > -This means that some coordination between bootloader and update agent is necessary. > +- SWUpdate runs successfully, > +- the device reboots, > +- the bootloader can start the new software, and > +- the new software runs, makes some consistency checks, and declares that the transaction (that is from old to new software) is terminated. > + > +This means that some coordination between the bootloader and the update agent is necessary. > In most cases, this is done via persistent variables that are available to both > -SWUpdate and bootloader. SWUpdate has two built-in variables: > +SWUpdate and the bootloader. SWUpdate has two built-in variables: > > - *recovery_status*: this is set when SWUpdate starts to write to the device, and it is > - reset after the installation completed with success or it is set to `failure` in case > + unset after the installation completed with success or it is set to `failed` in case > of error. A bootloader can use it to check if an update was interrupted. > - This is a must in case a single copy of software is on the device. > + This is a must in case a single copy of the software is on the device. > > -- *ustate*: this sets a state machine. SWUpdate sets it to `1` (INSTALL) after an update. > - The bootloader can use it to check if a new software must be tested. > - Bootloader should implement a counter mechanism to check how many time it tried to start > - a new software. When a threshold is reached, the bootloader should declare the new sofwtare > - as buggy, and proceed with a fallback in case thew old software is available. > +- *ustate*: this triggers a state machine. SWUpdate sets it to `1` (INSTALL) after an update. > + The bootloader can use it to check whether a new software must be tested. > + The bootloader should implement a counter mechanism to check how many times it tried to start > + a new software. When a threshold is reached, the bootloader should declare the new software > + as buggy, and proceed with a fallback in case the old software is available. > > A *fallback* is always initiated by the bootloader, because it knows > -if the new sofware is running. It should toggle the copies and start the old software set. > -To communicate this to the user space and to SWUpdate, the bootloader sets the ustate variable to > -`FAILED`. SWUpdate uses it in case the result must be forwarded to an external server (like a backend). > +if the new software is running. It should toggle the copies and start the old software set. > +To communicate this to user space and to SWUpdate, the bootloader sets the `ustate` variable to > +`3` (FAILED). SWUpdate uses this information in case the result must be forwarded to an external server (like a backend). > There is a time window when a fallback can take place. In fact, after a reboot and some attempts, > -the update transaction is declared successful or failed, and ater then new update can be executed. > +the update transaction is declared successful or failed, and later a new update can be executed. > When a new update runs, the status of the stand-by copy is unknown, because it could be > -the result of an interrupted update. Running an uncomplete software can lead to unpredictable > +the result of an interrupted update. Running an incomplete software can lead to unpredictable > results and must be strongly avoided. > A common pattern for a toggling in the bootloader is: > > -- ustate is not set or set to '0' (no update pending). The bootloader runs the configured > - copy and won't ever toggle. In case of failure, a rescue can be started if available. > -- ustate is set to *installed* ('1'): the new software is under test. The bootloader initiates > - a fallback if new software is not running and sets ustate to failed. If new software runs, > - it is duty of user space (often SWUpdate or the init script for SWUpdate) to reset the variable > - to '0'. Note that resetting the variable is project specific, and it could be set as last > - action after checking as much as possible that the new SW is running. This includes > - performing in the application a database migration, starts to communicate with peers, whatever. > +- `ustate` is not set or set to `0` (no update pending). The bootloader runs the configured > + copy and won't ever toggle. In case of failure, a rescue system can be started if available. > +- `ustate` is set to `1` (INSTALLED): the new software is under test. The bootloader initiates > + a fallback if the new software is not running and sets `ustate` to `3` (FAILED). If the new software runs, > + it is the duty of user space (often SWUpdate or the init script for SWUpdate) to reset `ustate` > + to `0`. Note that resetting the variable is project specific, and it could be set as last > + action after having sufficiently checked that the new software is running. This includes > + performing in the application a database migration, starting communicating with peers, whatever. > > -Check in advance which security topics are relevant on your project. This includes: > +Check in advance which security topics are relevant for your project. This includes: > > -- signed images (SWU is verified before installing), and then which crypto mechanismus is used > +- signed images (SWU is verified before installing), and then which crypto mechanism is used > (RSA keys, certificates, PKI) > - encryption for artifacts > -- under which user and group SWUpdate and other components are allowed to run, see > - `privilege separation`. Set user id and group id if not root in swupdate.cfg. > +- under which user and group SWUpdate and other components are allowed to run. > + Set user id and group id if not `root` in `swupdate.cfg`. > - if any version can be installed or if you forbid a downgrade, and then be sure to pass > - the range of versions you allow via --M, -R and --max-version. > + the range of versions you allow via `--M`, `-R` and `--max-version`. > - hardware-software compatibility check and how your device knows which hardware > revision is running. > > -Reduce dependencies at minimum > +Reduce dependencies to minimum > ------------------------------ > > An update should be possible in any condition. Even if the system is degraded or in a bad shape, > if an update can work, the device can be functional again without returning it back to the > factory. > -SWUpdate is thought to be self contained: that means it does not make usage of external > -tools. If your system is degraded and filesystems gets corrupted, there are less chance to restore it > -if the update calls external tools. SWUpdate is started at boot time and there are good chance > -it goes on even if your system has some (software) flaws. > +SWUpdate is thought to be self contained: that means it does not make use of external > +tools. If your system is degraded and filesystems get corrupted, there are less chances to restore it > +if the update calls external tools. SWUpdate is started at boot time and there are good chances > +it succeeds even if your system has some (software) flaws. > Be careful to make an update depending on your application or try to reduce the dependencies. > -In fact, application is updated often and an introduction of a new bugs can make the device not > -updatable anymore. Take the dependencies under control, and if you have any, be sure that the > +In fact, the application is updated often and an introduction of new bugs can make the device no > +longer updatable. Take the dependencies under control, and if you have any, be sure that the > update is still working. You can fix any bugs if the update works, but not anymore if the device > cannot be updated. > > -Make a risk analyses > +Make a risk analysis > -------------------- > > -A more accurate analyses brings less surprises in field. Think about twice what you want to update, > -and which components should be updated, and which risks in case of update of single point of failure. > +A more accurate analysis brings less surprises in the field. Think twice about what you want to update, > +which components should be updated, and the risks of updating a single point of failure. > Very often, this means the bootloader. Compare risks and benefits: it happens in many projects that > having the possibility (with some risk) to update the bootloader is better that returning the devices > -back to service. A cost / benefits analyses should be part of the integration of the update agent. > +back to service. A cost / benefits analysis should be part of the integration of the update agent. > > SWUpdate builtin configuration > ============================== > > SWUpdate has a compile time configuration. The default configuration delivered with `meta-swupdate` > -is not suitable for most project. The easy way to check configuration in Yocto is to run: > +is not suitable for most projects. The easy way to check configuration in Yocto is to run: > > :: > > @@ -120,91 +121,89 @@ Outside Yocto, just run in SWUpdate's sources: > > make menuconfig > > -Check security, bootloader and which handlers should be installed. They depend strongly from > +Check security, bootloader, and which handlers should be installed. They depend strongly on > your project. > -If you build with OE, add a swupdate_%.bbappend to one of your layers, and put the resulting > +If you build with OE, add a `swupdate_%.bbappend` to one of your layers, and put the resulting > configuration file as `defconfig` that can be fetched. > Please review the following configuration: > -- security settings > - > -- interfaces required (where the SW is coming from). Disable the interface you do not need. > > -- handlers required for you project. Disable again what you do not need, but consider if > +- Security settings > +- Interfaces required (where the software is coming from). Disable the interface you do not need. > +- Handlers required for your project. Disable what you do not need, but consider if > you could need some of them in future. As example, you can safely disable *ubivol* if > - you do not use raw NAND, but you can let *archive* if you plan to install artifacts > - from tarball in future. > - > -- it is highly recommended to enable Lua to extend runtime behavior. > + you do not use raw NAND, but you can let *archive* enabled if you plan to install artifacts > + from tarballs in future. > +- It is highly recommended to enable Lua to extend runtime behavior. > > SWUpdate startup > ================ > > -An easy way to start SWUpdate is provided only with meta-swupdate and Yocto. A generi SystemV init script or a > -systemd unit for SWUpdate are executing a script `swupdate.sh`, that is delivered together with SWUpdate > +An easy way to start SWUpdate is provided only with meta-swupdate and Yocto. A generic SystemV init script or a > +systemd unit for SWUpdate are executing a script `swupdate.sh`, that is delivered together with the SWUpdate > binaries. > -The script goes through /etc/swupdate/conf.d/ and source all found files. The integrator can use > -a set of predefined variable to configure SWUpdate's command line parameters. > +The script goes through `/etc/swupdate/conf.d/` and sources all found files. The integrator can use > +a set of predefined variables to configure SWUpdate's command line parameters. > > -- *SWUPDATE_WEBSERVER_ARGS* : this string is passed if the webserver must be started. They are the Webserver > - specific parameter. If this variable is set, the script will add `-w` to the list of parameters. > - Note: meta-swupdate contains a default configuration for SWUPDATE_WEBSERVER_ARGS, that use /www as document root > +- *SWUPDATE_WEBSERVER_ARGS* : This string is passed if the webserver must be started. It consists of the webserver > + specific parameters. If this variable is set, the script will add `-w` to the list of parameters. > + Note: meta-swupdate contains a default configuration for SWUPDATE_WEBSERVER_ARGS, that uses /www as document root > for the Website and default port 8080. > - *SWUPDATE_SURICATTA_ARGS* : Suricatta (backend) specific parameters. There is no default. > -- *SWUPDATE_ARGS* : parameteres not belonging to Webserver or Suricatta. > +- *SWUPDATE_ARGS* : Parameters not belonging to Webserver or Suricatta. > > -Note that swupdate.sh sources the file in sorted order, so it is possible to overwrite the variables > +Note that `swupdate.sh` sources the files in sorted order, so it is possible to override the variables > with a configuration file whose filename is loaded at the end. Preferred style is to use SystemV like > names, for example `10-webserver`, `11-suricatta`, and so on. > > Write sw-description > ==================== > > -sw-description is the central file that describe a new release and how a release must be installed. > -It should be a consequence of the update concept. There is not a single right way to do. SWupdate > -uses heaviliy `selections` and links to extract just one part of the whole sw-description, that > -can be used for different situsations and different way to run the device. One use case for > -selection is to implement the dual-copy (often reported as A/B) mode: one section contains instructions > +`sw-description` is the central file that describes a new software release and how a release must be installed. > +It should be a consequence of the update concept. There is not a single right way. SWUpdate heavily > +uses 'selections' and links to extract just one part of the whole `sw-description`, that > +can be used for different situations and different ways to run the device. One use case for > +selections is to implement the dual-copy (often referred to as A/B) mode: one selection contains instructions > for one copy, the other for the second copy. Which copy is the stand-by must be detected > before running SWUpdate and passed via the `-e <selection,mode>` switch. > -Other methods set up link to the standby storage (like /dev/standby) during boot. Or the standby > -device can be detected at runtime with an `embedded-script`, as part of sw-description, with Lua code. > -Please note that for last case, SWUpdate is extended with functions exported to the Lua context that > -simplify the detection. SWUpdate exports a `getroot` that returns type and value for the device used > -for rootfs. See SWUpdate documentation for a complete list of functions exported by SWUpdate that can be > -used in Lua. An embedded Lua script must just start with: > +Other methods set up a link to the standby storage (like `/dev/standby`) during boot. Or the standby > +device can be detected at runtime with an `embedded-script`, as part of `sw-description`, with Lua code. > +Please note that for the last case, SWUpdate is extended with functions exported to the Lua context that > +simplify the detection. SWUpdate exports a `getroot()` function that returns type and value for the device used > +as rootfs. See SWUpdate documentation for a complete list of functions exported by SWUpdate that can be > +used in Lua. An embedded Lua script must just start with > > :: > > require ('swupdate') > > -to make usage of them. > +to make use of them. > > -Use as much as possible OE variables > +Use OE variables as much as possible > ------------------------------------ > > -meta-swupdate replaces special construct in sw-description with the values of build variables. > -The recognized construct in sw-description is delimited by *@@*, that is *@@VARIABLE-NAME@@*. > -The exception (for compatibility reason) is the automatic generation of sha256. The syntax in that case > +meta-swupdate replaces a special construct in `sw-description` with the values of build variables. > +The recognized construct in `sw-description` is delimited by *@@*, that is *@@VARIABLE-NAME@@*. > +The exception (for compatibility reasons) is the automatic generation of `sha256`. The syntax in that case > is : > > :: > > sha256 = "@<name of artifact>" > > -You can use again variable substituion for artifact name. Example: > +You can again use variable substitution for artifact names. Example: > > :: > > sha256 = "@@@SYSTEM_IMAGE@@-@@MACHINE@@@@SWUPDATE_IMAGES_FSTYPES[@@SYSTEM_IMAGE@@]@@"; > > -Please note that each variable is double delimited (at the begin and at the end) by `@@`. > +Please note that each variable is double delimited (at the beginning and at the end) by `@@`. > > -Deliver your scripts instead of relying on installed > ----------------------------------------------------- > +Deliver your scripts instead of relying on them being installed > +--------------------------------------------------------------- > > You have the freedom to call any tools during an update. However, take care if you are using > some tools from the running rootfs / current software. This implies that the current software is running > -flawlessly, as well as the tools you are calling. And this is not maybe always the case. > +flawlessly, as well as the tools you are calling. And this may not always be the case. > > > Prefer Lua to shell scripts > @@ -220,18 +219,18 @@ problem if system is getting rid of resources. > A better approach is to use Lua and to deliver the scripts inside the SWU. In fact, the Lua > interpreter is linked to SWUpdate and runs in context of the SWUpdate process without forking > a child process. Shell is not involved at all. Of course, Lua scripts should be written > -to be selfcontained, too, and executing external tools should be done only if unavoidable. > +to be self-contained, too, and executing external tools should be done only if unavoidable. > > Use installed-directly when possible > ------------------------------------ > > SWUpdate can be enabled for zero-copy (or streaming mode), that is the incoming SWU is analyzed on the fly and it is > -installed by the related handler without any temporary copy. If this is not set, SWUpdate creates > -a temporary copy on $TMPDIR before passing it to the handlers. Note that $TMPDIR generally points to > -a RAMDISK and storing files there reduce the amount of memory available for the application. > +installed by the associated handler without any temporary copy. If this is not set, SWUpdate creates > +a temporary copy in `$TMPDIR` before passing it to the handlers. Note that `$TMPDIR` generally points to > +a RAMDISK and storing files there reduces the amount of memory available for the application. > It makes sense to disable the flag in case the artifact is a single point of failure. > A typical example could be the bootloader (not duplicated on the devices), and if the SWU > -is corrupted or the connection get broken, lets the board in a bricked state. It makes sense > +is corrupted or the connection gets broken, the board is left in a bricked state. It makes sense > then to download the whole artifact before installing. > > Always enable sha256 verification > @@ -243,26 +242,26 @@ weak. Do not trust it, but enable sha256 for each artifact. > Always set the "type" attribute > ------------------------------- > > -SWUpdate sets some default handler if the type is not set. Do not use it, but set explicitely > -the type (that is, which handler should install the artifact) in sw-description. > +SWUpdate sets some default handler if the type is not set. Do not use it, but set explicitly > +the type (that is, which handler should install the artifact) in `sw-description`. > > Do not rely on install order > ---------------------------- > > -SWUpdate does not require that artifacts must be put in CPIO in a specific order. The exception is > -sw-description, that must be the first file in SWU. Avoid dependencies inside the SWU, that is an artifact > -that can be installed only after another one was installed before. If you really need it, for example > +SWUpdate does not require that artifacts are put into the CPIO in a specific order. The exception is > +`sw-description`, that must be the first file in a SWU. Avoid dependencies inside the SWU, that is an artifact > +that can be installed only after another one was installed before. If you really need it, for example if > you want to install a file into a filesystem provided as image, disable `installed-directy` for the file > and enable it for the filesystem image. > > Do not drop atomicity ! > ----------------------- > > -SWUpdate guarantees atomicity until you do not do something that simply breaks it. As example, > -think at the bootloader's environment. In sw-description there is a specific section where > +SWUpdate guarantees atomicity as long as you don't do something that simply breaks it. As example, > +think about the bootloader's environment. In an `sw-description`, there is a specific section where > the environment can be set, adding / modifying / deleting variables. SWUpdate does not change > -the single variable, but generates for the supported bootloader the resulting new environment and > -this is written in one shot in a way (for U-Boot / EFIBootguard, not for GRUB) that is powercut safe. > +single variables, but generates the resulting new environment for the supported bootloader and > +this is written in one shot in a way (for U-Boot / EFIBootguard, not for GRUB) that is power-cut safe. > You can of course change the environment in a postinstall script, like in the following way (for U-Boot): > > :: > @@ -273,13 +272,13 @@ You can of course change the environment in a postinstall script, like in the fo > fw_setenv var4 val4 > fw_setenv var5 val5 > > -If a power cut happens during two calls of fw_setenv, the environment is in a middled of a change and this > +If a power cut happens during two calls of fw_setenv, the environment is in an intermediate state and this > can brick the device. > > Plan to have a rescue system > ============================ > > -Even if you have a double-copy, something can go wrong. Plan to have a rescue system (swupdate-image in meta-swupdate) > -and to install it on a separate storage as the main system, if it is possible. This helps when the main > -storage is corrupted, and device can be restored in field without returning it back to the factory. > -Plan to update the rescue as well: it is software, too, and its bugs should be fixed, too. > +Even if you have a double-copy setup, something can go wrong. Plan to have a rescue system (swupdate-image in meta-swupdate) > +and to install it on a separate storage than the main system, if it is possible. This helps when the main > +storage is corrupted, and the device can be restored in the field without returning it back to the factory. > +Plan to update the rescue system as well: it is software, too, and its bugs should be fixed, too. > Applied to -master, thanks ! Best regards, Stefano Babic
diff --git a/doc/source/swupdate-best-practise.rst b/doc/source/swupdate-best-practise.rst index e222fb7..afd5789 100644 --- a/doc/source/swupdate-best-practise.rst +++ b/doc/source/swupdate-best-practise.rst @@ -9,12 +9,12 @@ Swupdate Best Practice This is intended as general rule to integrate SWUpdate into a custom project. SWUpdate is an update agent and it is thought to be a framework. This means it is highly -configurable and SWUpdate should be configured to fit into a project, and not -viceversa. SWUpdate makes just few requirements on the system and it has no fix schema. -There is no restrictions about how many partitions, or which storage you are using. +configurable and SWUpdate should be configured to fit into a project, not +vice versa. SWUpdate makes just a few requirements on the system and it has no fixed update schema. +There is no restriction on how many partitions or which storage you are using. In some more complex cases, the update depends on a lot of conditions, -and SWUpdate can run differently according to the `mode` a device is started. -Think that SWUpdate is not a ready-to-use updater, and you need first to +and SWUpdate can run differently according to the `mode` a device is started in. +Think about SWUpdate not being a ready-to-use updater but a framework, and hence you should first write a meaningful: Update Concept @@ -24,91 +24,92 @@ Take your time and write first an update concept for your device. It is not was You have to imagine conditions when an update is not working, and try to write down the use cases when an update can fail and how the device can be restored. SWUpdate installs new software, but a successful update means that the new software -is started and runs flawlessly. The interface with the bootloader (or the one that start the -SW) must be checked in details. +is started and runs flawlessly. The interface with the bootloader (or the one that starts the +software) must be checked in details. A successful update means: -- SWUpdate runs with success -- Device reboots -- Bootloader can start the new software -- The new software runs, makes some consistency checks, and declare that the transaction (that is from old to new sw) is terminated. -This means that some coordination between bootloader and update agent is necessary. +- SWUpdate runs successfully, +- the device reboots, +- the bootloader can start the new software, and +- the new software runs, makes some consistency checks, and declares that the transaction (that is from old to new software) is terminated. + +This means that some coordination between the bootloader and the update agent is necessary. In most cases, this is done via persistent variables that are available to both -SWUpdate and bootloader. SWUpdate has two built-in variables: +SWUpdate and the bootloader. SWUpdate has two built-in variables: - *recovery_status*: this is set when SWUpdate starts to write to the device, and it is - reset after the installation completed with success or it is set to `failure` in case + unset after the installation completed with success or it is set to `failed` in case of error. A bootloader can use it to check if an update was interrupted. - This is a must in case a single copy of software is on the device. + This is a must in case a single copy of the software is on the device. -- *ustate*: this sets a state machine. SWUpdate sets it to `1` (INSTALL) after an update. - The bootloader can use it to check if a new software must be tested. - Bootloader should implement a counter mechanism to check how many time it tried to start - a new software. When a threshold is reached, the bootloader should declare the new sofwtare - as buggy, and proceed with a fallback in case thew old software is available. +- *ustate*: this triggers a state machine. SWUpdate sets it to `1` (INSTALL) after an update. + The bootloader can use it to check whether a new software must be tested. + The bootloader should implement a counter mechanism to check how many times it tried to start + a new software. When a threshold is reached, the bootloader should declare the new software + as buggy, and proceed with a fallback in case the old software is available. A *fallback* is always initiated by the bootloader, because it knows -if the new sofware is running. It should toggle the copies and start the old software set. -To communicate this to the user space and to SWUpdate, the bootloader sets the ustate variable to -`FAILED`. SWUpdate uses it in case the result must be forwarded to an external server (like a backend). +if the new software is running. It should toggle the copies and start the old software set. +To communicate this to user space and to SWUpdate, the bootloader sets the `ustate` variable to +`3` (FAILED). SWUpdate uses this information in case the result must be forwarded to an external server (like a backend). There is a time window when a fallback can take place. In fact, after a reboot and some attempts, -the update transaction is declared successful or failed, and ater then new update can be executed. +the update transaction is declared successful or failed, and later a new update can be executed. When a new update runs, the status of the stand-by copy is unknown, because it could be -the result of an interrupted update. Running an uncomplete software can lead to unpredictable +the result of an interrupted update. Running an incomplete software can lead to unpredictable results and must be strongly avoided. A common pattern for a toggling in the bootloader is: -- ustate is not set or set to '0' (no update pending). The bootloader runs the configured - copy and won't ever toggle. In case of failure, a rescue can be started if available. -- ustate is set to *installed* ('1'): the new software is under test. The bootloader initiates - a fallback if new software is not running and sets ustate to failed. If new software runs, - it is duty of user space (often SWUpdate or the init script for SWUpdate) to reset the variable - to '0'. Note that resetting the variable is project specific, and it could be set as last - action after checking as much as possible that the new SW is running. This includes - performing in the application a database migration, starts to communicate with peers, whatever. +- `ustate` is not set or set to `0` (no update pending). The bootloader runs the configured + copy and won't ever toggle. In case of failure, a rescue system can be started if available. +- `ustate` is set to `1` (INSTALLED): the new software is under test. The bootloader initiates + a fallback if the new software is not running and sets `ustate` to `3` (FAILED). If the new software runs, + it is the duty of user space (often SWUpdate or the init script for SWUpdate) to reset `ustate` + to `0`. Note that resetting the variable is project specific, and it could be set as last + action after having sufficiently checked that the new software is running. This includes + performing in the application a database migration, starting communicating with peers, whatever. -Check in advance which security topics are relevant on your project. This includes: +Check in advance which security topics are relevant for your project. This includes: -- signed images (SWU is verified before installing), and then which crypto mechanismus is used +- signed images (SWU is verified before installing), and then which crypto mechanism is used (RSA keys, certificates, PKI) - encryption for artifacts -- under which user and group SWUpdate and other components are allowed to run, see - `privilege separation`. Set user id and group id if not root in swupdate.cfg. +- under which user and group SWUpdate and other components are allowed to run. + Set user id and group id if not `root` in `swupdate.cfg`. - if any version can be installed or if you forbid a downgrade, and then be sure to pass - the range of versions you allow via --M, -R and --max-version. + the range of versions you allow via `--M`, `-R` and `--max-version`. - hardware-software compatibility check and how your device knows which hardware revision is running. -Reduce dependencies at minimum +Reduce dependencies to minimum ------------------------------ An update should be possible in any condition. Even if the system is degraded or in a bad shape, if an update can work, the device can be functional again without returning it back to the factory. -SWUpdate is thought to be self contained: that means it does not make usage of external -tools. If your system is degraded and filesystems gets corrupted, there are less chance to restore it -if the update calls external tools. SWUpdate is started at boot time and there are good chance -it goes on even if your system has some (software) flaws. +SWUpdate is thought to be self contained: that means it does not make use of external +tools. If your system is degraded and filesystems get corrupted, there are less chances to restore it +if the update calls external tools. SWUpdate is started at boot time and there are good chances +it succeeds even if your system has some (software) flaws. Be careful to make an update depending on your application or try to reduce the dependencies. -In fact, application is updated often and an introduction of a new bugs can make the device not -updatable anymore. Take the dependencies under control, and if you have any, be sure that the +In fact, the application is updated often and an introduction of new bugs can make the device no +longer updatable. Take the dependencies under control, and if you have any, be sure that the update is still working. You can fix any bugs if the update works, but not anymore if the device cannot be updated. -Make a risk analyses +Make a risk analysis -------------------- -A more accurate analyses brings less surprises in field. Think about twice what you want to update, -and which components should be updated, and which risks in case of update of single point of failure. +A more accurate analysis brings less surprises in the field. Think twice about what you want to update, +which components should be updated, and the risks of updating a single point of failure. Very often, this means the bootloader. Compare risks and benefits: it happens in many projects that having the possibility (with some risk) to update the bootloader is better that returning the devices -back to service. A cost / benefits analyses should be part of the integration of the update agent. +back to service. A cost / benefits analysis should be part of the integration of the update agent. SWUpdate builtin configuration ============================== SWUpdate has a compile time configuration. The default configuration delivered with `meta-swupdate` -is not suitable for most project. The easy way to check configuration in Yocto is to run: +is not suitable for most projects. The easy way to check configuration in Yocto is to run: :: @@ -120,91 +121,89 @@ Outside Yocto, just run in SWUpdate's sources: make menuconfig -Check security, bootloader and which handlers should be installed. They depend strongly from +Check security, bootloader, and which handlers should be installed. They depend strongly on your project. -If you build with OE, add a swupdate_%.bbappend to one of your layers, and put the resulting +If you build with OE, add a `swupdate_%.bbappend` to one of your layers, and put the resulting configuration file as `defconfig` that can be fetched. Please review the following configuration: -- security settings - -- interfaces required (where the SW is coming from). Disable the interface you do not need. -- handlers required for you project. Disable again what you do not need, but consider if +- Security settings +- Interfaces required (where the software is coming from). Disable the interface you do not need. +- Handlers required for your project. Disable what you do not need, but consider if you could need some of them in future. As example, you can safely disable *ubivol* if - you do not use raw NAND, but you can let *archive* if you plan to install artifacts - from tarball in future. - -- it is highly recommended to enable Lua to extend runtime behavior. + you do not use raw NAND, but you can let *archive* enabled if you plan to install artifacts + from tarballs in future. +- It is highly recommended to enable Lua to extend runtime behavior. SWUpdate startup ================ -An easy way to start SWUpdate is provided only with meta-swupdate and Yocto. A generi SystemV init script or a -systemd unit for SWUpdate are executing a script `swupdate.sh`, that is delivered together with SWUpdate +An easy way to start SWUpdate is provided only with meta-swupdate and Yocto. A generic SystemV init script or a +systemd unit for SWUpdate are executing a script `swupdate.sh`, that is delivered together with the SWUpdate binaries. -The script goes through /etc/swupdate/conf.d/ and source all found files. The integrator can use -a set of predefined variable to configure SWUpdate's command line parameters. +The script goes through `/etc/swupdate/conf.d/` and sources all found files. The integrator can use +a set of predefined variables to configure SWUpdate's command line parameters. -- *SWUPDATE_WEBSERVER_ARGS* : this string is passed if the webserver must be started. They are the Webserver - specific parameter. If this variable is set, the script will add `-w` to the list of parameters. - Note: meta-swupdate contains a default configuration for SWUPDATE_WEBSERVER_ARGS, that use /www as document root +- *SWUPDATE_WEBSERVER_ARGS* : This string is passed if the webserver must be started. It consists of the webserver + specific parameters. If this variable is set, the script will add `-w` to the list of parameters. + Note: meta-swupdate contains a default configuration for SWUPDATE_WEBSERVER_ARGS, that uses /www as document root for the Website and default port 8080. - *SWUPDATE_SURICATTA_ARGS* : Suricatta (backend) specific parameters. There is no default. -- *SWUPDATE_ARGS* : parameteres not belonging to Webserver or Suricatta. +- *SWUPDATE_ARGS* : Parameters not belonging to Webserver or Suricatta. -Note that swupdate.sh sources the file in sorted order, so it is possible to overwrite the variables +Note that `swupdate.sh` sources the files in sorted order, so it is possible to override the variables with a configuration file whose filename is loaded at the end. Preferred style is to use SystemV like names, for example `10-webserver`, `11-suricatta`, and so on. Write sw-description ==================== -sw-description is the central file that describe a new release and how a release must be installed. -It should be a consequence of the update concept. There is not a single right way to do. SWupdate -uses heaviliy `selections` and links to extract just one part of the whole sw-description, that -can be used for different situsations and different way to run the device. One use case for -selection is to implement the dual-copy (often reported as A/B) mode: one section contains instructions +`sw-description` is the central file that describes a new software release and how a release must be installed. +It should be a consequence of the update concept. There is not a single right way. SWUpdate heavily +uses 'selections' and links to extract just one part of the whole `sw-description`, that +can be used for different situations and different ways to run the device. One use case for +selections is to implement the dual-copy (often referred to as A/B) mode: one selection contains instructions for one copy, the other for the second copy. Which copy is the stand-by must be detected before running SWUpdate and passed via the `-e <selection,mode>` switch. -Other methods set up link to the standby storage (like /dev/standby) during boot. Or the standby -device can be detected at runtime with an `embedded-script`, as part of sw-description, with Lua code. -Please note that for last case, SWUpdate is extended with functions exported to the Lua context that -simplify the detection. SWUpdate exports a `getroot` that returns type and value for the device used -for rootfs. See SWUpdate documentation for a complete list of functions exported by SWUpdate that can be -used in Lua. An embedded Lua script must just start with: +Other methods set up a link to the standby storage (like `/dev/standby`) during boot. Or the standby +device can be detected at runtime with an `embedded-script`, as part of `sw-description`, with Lua code. +Please note that for the last case, SWUpdate is extended with functions exported to the Lua context that +simplify the detection. SWUpdate exports a `getroot()` function that returns type and value for the device used +as rootfs. See SWUpdate documentation for a complete list of functions exported by SWUpdate that can be +used in Lua. An embedded Lua script must just start with :: require ('swupdate') -to make usage of them. +to make use of them. -Use as much as possible OE variables +Use OE variables as much as possible ------------------------------------ -meta-swupdate replaces special construct in sw-description with the values of build variables. -The recognized construct in sw-description is delimited by *@@*, that is *@@VARIABLE-NAME@@*. -The exception (for compatibility reason) is the automatic generation of sha256. The syntax in that case +meta-swupdate replaces a special construct in `sw-description` with the values of build variables. +The recognized construct in `sw-description` is delimited by *@@*, that is *@@VARIABLE-NAME@@*. +The exception (for compatibility reasons) is the automatic generation of `sha256`. The syntax in that case is : :: sha256 = "@<name of artifact>" -You can use again variable substituion for artifact name. Example: +You can again use variable substitution for artifact names. Example: :: sha256 = "@@@SYSTEM_IMAGE@@-@@MACHINE@@@@SWUPDATE_IMAGES_FSTYPES[@@SYSTEM_IMAGE@@]@@"; -Please note that each variable is double delimited (at the begin and at the end) by `@@`. +Please note that each variable is double delimited (at the beginning and at the end) by `@@`. -Deliver your scripts instead of relying on installed ----------------------------------------------------- +Deliver your scripts instead of relying on them being installed +--------------------------------------------------------------- You have the freedom to call any tools during an update. However, take care if you are using some tools from the running rootfs / current software. This implies that the current software is running -flawlessly, as well as the tools you are calling. And this is not maybe always the case. +flawlessly, as well as the tools you are calling. And this may not always be the case. Prefer Lua to shell scripts @@ -220,18 +219,18 @@ problem if system is getting rid of resources. A better approach is to use Lua and to deliver the scripts inside the SWU. In fact, the Lua interpreter is linked to SWUpdate and runs in context of the SWUpdate process without forking a child process. Shell is not involved at all. Of course, Lua scripts should be written -to be selfcontained, too, and executing external tools should be done only if unavoidable. +to be self-contained, too, and executing external tools should be done only if unavoidable. Use installed-directly when possible ------------------------------------ SWUpdate can be enabled for zero-copy (or streaming mode), that is the incoming SWU is analyzed on the fly and it is -installed by the related handler without any temporary copy. If this is not set, SWUpdate creates -a temporary copy on $TMPDIR before passing it to the handlers. Note that $TMPDIR generally points to -a RAMDISK and storing files there reduce the amount of memory available for the application. +installed by the associated handler without any temporary copy. If this is not set, SWUpdate creates +a temporary copy in `$TMPDIR` before passing it to the handlers. Note that `$TMPDIR` generally points to +a RAMDISK and storing files there reduces the amount of memory available for the application. It makes sense to disable the flag in case the artifact is a single point of failure. A typical example could be the bootloader (not duplicated on the devices), and if the SWU -is corrupted or the connection get broken, lets the board in a bricked state. It makes sense +is corrupted or the connection gets broken, the board is left in a bricked state. It makes sense then to download the whole artifact before installing. Always enable sha256 verification @@ -243,26 +242,26 @@ weak. Do not trust it, but enable sha256 for each artifact. Always set the "type" attribute ------------------------------- -SWUpdate sets some default handler if the type is not set. Do not use it, but set explicitely -the type (that is, which handler should install the artifact) in sw-description. +SWUpdate sets some default handler if the type is not set. Do not use it, but set explicitly +the type (that is, which handler should install the artifact) in `sw-description`. Do not rely on install order ---------------------------- -SWUpdate does not require that artifacts must be put in CPIO in a specific order. The exception is -sw-description, that must be the first file in SWU. Avoid dependencies inside the SWU, that is an artifact -that can be installed only after another one was installed before. If you really need it, for example +SWUpdate does not require that artifacts are put into the CPIO in a specific order. The exception is +`sw-description`, that must be the first file in a SWU. Avoid dependencies inside the SWU, that is an artifact +that can be installed only after another one was installed before. If you really need it, for example if you want to install a file into a filesystem provided as image, disable `installed-directy` for the file and enable it for the filesystem image. Do not drop atomicity ! ----------------------- -SWUpdate guarantees atomicity until you do not do something that simply breaks it. As example, -think at the bootloader's environment. In sw-description there is a specific section where +SWUpdate guarantees atomicity as long as you don't do something that simply breaks it. As example, +think about the bootloader's environment. In an `sw-description`, there is a specific section where the environment can be set, adding / modifying / deleting variables. SWUpdate does not change -the single variable, but generates for the supported bootloader the resulting new environment and -this is written in one shot in a way (for U-Boot / EFIBootguard, not for GRUB) that is powercut safe. +single variables, but generates the resulting new environment for the supported bootloader and +this is written in one shot in a way (for U-Boot / EFIBootguard, not for GRUB) that is power-cut safe. You can of course change the environment in a postinstall script, like in the following way (for U-Boot): :: @@ -273,13 +272,13 @@ You can of course change the environment in a postinstall script, like in the fo fw_setenv var4 val4 fw_setenv var5 val5 -If a power cut happens during two calls of fw_setenv, the environment is in a middled of a change and this +If a power cut happens during two calls of fw_setenv, the environment is in an intermediate state and this can brick the device. Plan to have a rescue system ============================ -Even if you have a double-copy, something can go wrong. Plan to have a rescue system (swupdate-image in meta-swupdate) -and to install it on a separate storage as the main system, if it is possible. This helps when the main -storage is corrupted, and device can be restored in field without returning it back to the factory. -Plan to update the rescue as well: it is software, too, and its bugs should be fixed, too. +Even if you have a double-copy setup, something can go wrong. Plan to have a rescue system (swupdate-image in meta-swupdate) +and to install it on a separate storage than the main system, if it is possible. This helps when the main +storage is corrupted, and the device can be restored in the field without returning it back to the factory. +Plan to update the rescue system as well: it is software, too, and its bugs should be fixed, too.
Signed-off-by: Christian Storm <christian.storm@siemens.com> --- doc/source/swupdate-best-practise.rst | 217 +++++++++++++------------- 1 file changed, 108 insertions(+), 109 deletions(-)