Commit 77079bf2 authored by Zygmunt Krynicki

Numerous suggestions from Amit and Andrei


Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@huawei.com>
parent db894ea8
......@@ -7,7 +7,7 @@ Over The Air (OTA) Updates
==========================
|main_project_name| provides support for updating Linux devices in the field.
With certain preparations, derivative projects can prepare and distribute
With certain modifications, derivative projects can prepare and distribute
periodic updates to bring in up-to-date security patches, new features and
capabilities.
......@@ -28,22 +28,21 @@ This chapter contains specific advice to the implementer of the update system.
complete product must tune and adjust a number of elements.
Failure to understand and correctly implement the following advice can cause
catastrophic failure in the field. When in doubt, re-test and re-check.
significant failure in the field. When in doubt, re-test and re-check.
Partitions
..........
|main_project_name| devices are using an A/B model with two immutable system
partitions, separate boot partition, separate application data partition and
separate system data partition and separate immutable device data partition.
The roles for those partitions were determined at the design stage and should
|main_project_name| devices use an A/B model with two immutable system
partitions and separate partitions for boot, application data, system data and immutable device data.
The roles for these partitions were determined at the design stage and should
be used in accordance with the intent.
OS, not apps
............
The update stack is designed to update the operating system, not applications.
Applications _may_ be embedded into the operating system image but _should_ be
Applications _may_ be embedded into the operating system image but ideally _should_ be
delivered as separate entities, for example, as system containers, because that
de-couples their life-cycle and upgrade frequency from that of the base system.
......@@ -52,8 +51,8 @@ Care should be taken to plan ahead, so that their sizes are not a constraining
factor during the evolution of the system software. This is also related to any
applications that may be bundled in the system image.
Each update involves system re-boot. In case of failure (total or partial),
another reboot is performed for the rollback operation. In contrast some
Each update requires a system re-boot. In case of failure (total or partial),
another reboot is performed for the rollback operation. In contrast, some
application update stacks may be able to achieve zero-downtime updates.
Plan your updates so that the least downtime and interruption occurs for the
......@@ -75,7 +74,7 @@ product recall.
Space Requirements
..................
Update involves downloading the complete copy of the system partition. The
An update involves downloading the complete copy of the system partition. The
device must either use the data partition (which should have enough storage for
typical use-cases) or must both have enough memory in a RAM-based file system
_and_ use small enough images to ensure that the copy can be fully downloaded.
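The space check described above can be sketched as follows. This is a minimal illustration, not part of the update stack: the helper name ``has_room_for_bundle`` and the 10% safety margin are assumptions made for the example. It relies only on ``shutil.disk_usage`` from the Python standard library.

```python
import shutil
import tempfile


def has_room_for_bundle(download_dir: str, bundle_size: int, margin: float = 0.10) -> bool:
    """Return True if download_dir can hold bundle_size bytes plus a safety margin.

    The margin is an illustrative assumption; tune it for the actual product.
    """
    if bundle_size < 0:
        raise ValueError("bundle_size must not be negative")
    # Free bytes on the file system that contains download_dir.
    free = shutil.disk_usage(download_dir).free
    required = int(bundle_size * (1.0 + margin))
    return free >= required


# Example: check the system temporary directory for a 512 MiB bundle.
enough = has_room_for_bundle(tempfile.gettempdir(), 512 * 1024 * 1024)
```

A device would run such a check against whichever location it uses for downloads (the data partition or a RAM-based file system) before starting the transfer.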
......@@ -91,16 +90,16 @@ Time Requirements
Update frequency incurs proportional load on the update server. A large enough
fleet of devices merely _checking_ for an update can take down any single
server. To alleviate this product design should balance update frequency (in
server. To alleviate this, product design should balance update frequency (in
some cases it can be controlled remotely post-deployment) and spread the load
over time. It is strongly advisable to evenly distribute update checks with a
random element. If any potential updates must occur at a specific local time
(e.g. between three and four AM), then the system must be correctly configured
to observe the correct time zone. The update server can be scaled horizontally,
to an extent. At least for NetOTA care was taken to allow efficiency at scale,
to an extent. At least for NetOTA, care was taken to allow efficiency at scale,
with stateless operation and no need for a traditional database. Any number of
geographically distributed replicas, behind load balancers and geo-routing can
take arbitrarily large load. The update server (both HawkBit and NetOTA) uses
geographically distributed replicas, behind load balancers and geo-routing, can
withstand an arbitrarily large load. The update server (both HawkBit and NetOTA)
separates meta-data from file storage, allowing network traffic to be offloaded to
optimized CDN solutions.
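One way to distribute update checks with a random element is to add uniform jitter to a base interval. The sketch below is illustrative only: the function ``next_check_delay`` and its ``jitter_fraction`` parameter are hypothetical names, not an interface offered by SysOTA, HawkBit or NetOTA.

```python
import random
from typing import Optional


def next_check_delay(
    base_interval_s: float,
    jitter_fraction: float = 0.5,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before the next update check.

    The result is drawn uniformly from
    [base * (1 - jitter_fraction), base * (1 + jitter_fraction)],
    so a fleet that booted at the same moment does not poll in lockstep.
    """
    if base_interval_s <= 0:
        raise ValueError("base_interval_s must be positive")
    if not 0.0 <= jitter_fraction <= 1.0:
        raise ValueError("jitter_fraction must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    spread = base_interval_s * jitter_fraction
    return base_interval_s + rng.uniform(-spread, spread)


# Example: check roughly once a day, anywhere between 12 and 36 hours.
delay = next_check_delay(24 * 3600)
```

Keeping the average interval unchanged while widening the spread lowers the peak load on the server without changing the overall update frequency.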
......@@ -125,22 +124,22 @@ The disk is partitioned into the following partitions:
- system-data (ext4)
- app-data (ext4)
The update stack interacts with the boot partition, the system a and b
partitions and the system data partition. Remaining partitions may be used by
The update stack interacts with the boot partition, the system-a and system-b
partitions and the system-data partition. Remaining partitions may be used by
other parts of the system, but are not directly affected by anything that
happens during the update process.
happens during the system (base OS) update process.
Boot and update process
-----------------------
Platform specific boot loader chooses one of the system partitions, either A or
The platform-specific boot loader chooses one of the system partitions, either A or
B, and boots into it. On EFI systems the kernel is loaded from the system
partition. Other boot loaders may need to load the kernel from the boot
partition. An appropriate redundancy scheme is used to allow more than one kernel
to co-exist.
During early initialization of userspace, the immutable system partition mounted
at `/` is augmented with bind mounts to other partitions. In general application
at `/` is augmented with bind mounts to other partitions. In general, application
data (e.g. containers and other large data sets) is meant to live on the
application data partition, which does not use the A/B update model.
......@@ -150,7 +149,7 @@ Applications that are compiled into the image need overrides for their Yocto to
allow them to persist state. This is handled with the ``WRITABLES`` system, which
is not documented here.
When an update is initiated, a complete image is downloaded to temporary
When an update is initiated, a complete system image is downloaded to temporary
storage. The image is cryptographically verified against the RAUC signing key or
key-chain. Compatibility is checked against the RAUC ``COMPATIBLE`` string.
......@@ -163,7 +162,7 @@ active, then the image is copied to *slot B*. Platform-specific logic is then
used to configure the boot system to boot into the newly written slot **once**.
This acts as a safety mechanism, ensuring that power loss anywhere during the
update process has the effect of reverting to the known-good image. After
the image is written, platform specific post-install schedules the device to
the image is written, a platform-specific post-install hook schedules the device to
reboot, perhaps in a special way to ensure the boot-once constraint.
During boot-up, the platform firmware or the GRUB EFI application detects the boot-once
......@@ -173,13 +172,13 @@ applications. On successful boot, late userspace takes the decision to commit
the update transaction. A committed transaction atomically swaps the
active-inactive role of the two system partitions.
If failure, for example power loss or unexpected software error, prevents
If failure, for example due to power loss or unexpected software error, prevents
reaching the commit stage, then the update is not committed. Depending on the
nature of the failure, the device may restart automatically or may need to be
restarted externally. It is recommended to equip and configure a hardware
watchdog to avoid the need of manual recovery during this critical step.
watchdog to avoid the need for manual recovery during this critical step, while ensuring that the watchdog does not cause a reboot loop.
Once restarted the good slot is booted into automatically and the upgrade is
Once restarted, the known-good slot is booted into automatically and the upgrade is
aborted. Temporary data saved during the update process is removed, so that it
does not accumulate in the boot partition.
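The install, boot-once, commit and rollback sequence described above can be modelled as a small state machine. The sketch below is a simplified illustration and not the SysOTA or RAUC implementation: the ``BootState`` class and its method names are hypothetical, and the real boot-once flag lives in platform-specific boot loader state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BootState:
    active: str = "A"                 # known-good slot
    boot_once: Optional[str] = None   # slot to try exactly once on the next boot

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def install(self) -> str:
        """Write the image to the inactive slot, then arm a one-shot boot into it."""
        target = self.inactive()
        # (the image copy would happen here; the active slot is never touched)
        self.boot_once = target
        return target

    def boot(self) -> str:
        """Select the slot for this boot; the boot-once flag is consumed."""
        slot = self.boot_once or self.active
        self.boot_once = None
        return slot

    def commit(self, running: str) -> None:
        """Called by late userspace after the new slot booted successfully."""
        self.active = running


# Successful update: the new slot becomes the active one.
ok = BootState()
ok.install()
ok.commit(ok.boot())

# Failed update: no commit happens, so the next boot returns to the known-good slot.
failed = BootState()
failed.install()
tried = failed.boot()       # boots "B" once
recovered = failed.boot()   # falls back to "A"
```

Because the flag is cleared as soon as it is read, any reboot before ``commit`` lands in the previously active slot, which is the safety property the boot-once mechanism provides.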
......@@ -191,11 +190,11 @@ Supported update servers
HawkBit is a mature solution and recommended for scenarios where devices are
managed centrally by a single authority. The device manufacturer may sell
white-label boxes, deferring all management to the integrator or reseller. The
integrator must deploy operate and maintain a HawkBit installation for the
integrator must deploy, operate and maintain a HawkBit installation for the
lifetime of the product. All devices deployed in the field must be explicitly
provisioned with location and credentials before updates can be distributed.
NetOTA is not as mature but is recommended for scenarios where no central
NetOTA is still under development but is recommended for scenarios where no central
authority manages devices, but the device manufacturer or vendor still maintains
the software over time, releasing updates that devices may install at any time.
The manufacturer may pre-provision all devices with the location of the update
......@@ -238,7 +237,7 @@ architecture and deploy a scalable installation across multiple machines.
Deploying HawkBit
.................
To deploy HawkBit for a evaluation it is best to use the ``hawkbit`` snap
In order to evaluate HawkBit, it is best to use the ``hawkbit`` snap
package. The package offers several stability levels expressed as distinct snap
tracks. Installation instructions can be found on the `hawkbit snap information
page <https://snapcraft.io/hawkbit>`_.
......@@ -264,7 +263,7 @@ HawkBit for evaluation, set the listen address to `0.0.0.0` or `::`, so that the
service is reachable from all the network interfaces. This can be done with
``snap set hawkbit address=0.0.0.0``.
Once HawkBit is installed, either using the snap or in any other way, it should
Once HawkBit is installed, it should
be configured in one of several ways. The primary deciding factor is how devices
authenticate to HawkBit. The full documentation is beyond the scope of this
document, but for simple deployments we recommend either using *per-device
......@@ -279,7 +278,7 @@ menu. In HawkBit nomenclature, a device is called a _target_. Targets may be
clustered into target types, which aid in maintaining a heterogeneous fleet more
easily. Each target has a *controller ID*, which is a unique string identifying
the device in the system. In some authentication modes, devices need to be
provisioned with not only the URL of the HawkBit server, but also with their
provisioned not only with the URL of the HawkBit server, but also with their
*controller ID* and *security token*. Mass deployments can be performed using
bulk upload or using the management API.
......@@ -297,7 +296,7 @@ Provisioning Devices for HawkBit
SysOTA does not contain a native HawkBit client yet, so it leverages the
``rauc-hawkbit-updater`` program for this role. Said program reads a
configuration file ``/etc/rauc-hawkbit-updater/config.conf``, which must be
owned by the ``rauc-hawkbit`` user, connects to a given HawkBit server
owned by the ``rauc-hawkbit`` user, connects to a given HawkBit server,
authenticates using either device or gateway token and then listens for events.
|main_project_name| images contain a sample configuration file in
......@@ -327,8 +326,8 @@ Working with HawkBit
....................
HawkBit has both a web dashboard and a complex set of REST APIs covering all
aspects of the management story. During exploration and evaluation it is
recommended to use the graphical user interface. As the workflow solidifies it
aspects of the management story. During exploration and evaluation, it is
recommended to use the graphical user interface. As the workflow solidifies, it
is encouraged to switch to the REST APIs and automation.
The general data model related to updates is as follows:
......@@ -340,7 +339,7 @@ The general data model related to updates is as follows:
The |main_project_name| project has created the ``hawkbitctl`` utility, which
makes it easy to create the required scaffolding and to upload the bundle to the server.
While useful, the tool does not cover the entire API surface yet and you may
find that specific functionality is missing. In cases like that custom
find that specific functionality is missing. In cases like that, custom
solutions, for example scripts using ``curl``, may be used as a stop-gap
measure.
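As an illustration of such a stop-gap script, the sketch below builds (but does not send) a request that registers one target through the HawkBit Management API. It rests on assumptions that should be checked against the documentation of the deployed HawkBit version: the ``/rest/v1/targets`` endpoint, the ``controllerId``, ``name`` and ``securityToken`` fields, and HTTP basic authentication. The server URL, credentials and helper name are placeholders.

```python
import base64
import json
import urllib.request


def build_create_target_request(
    base_url: str, user: str, password: str, controller_id: str, security_token: str
) -> urllib.request.Request:
    """Prepare a POST that registers a single target (device) with HawkBit."""
    # Assumed payload shape: a JSON list with one object per target.
    payload = [
        {
            "controllerId": controller_id,
            "name": controller_id,
            "securityToken": security_token,
        }
    ]
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rest/v1/targets",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + credentials,
        },
    )


# Placeholder values; nothing is sent until urllib.request.urlopen(request) is called.
request = build_create_target_request(
    "https://hawkbit.example.com", "admin", "secret", "device-001", "token-001"
)
```

Separating request construction from sending keeps the script easy to inspect and test before it is pointed at a real server.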
......@@ -391,7 +390,7 @@ is deployed and devices are provisioned, is as follows:
- Bind the *software module* to the *distribution set* (by drag-and-drop).
At this stage, the update is uploaded and can be rolled out or assigned to
individual devices. Once a device is asked to update it will download and
individual devices. Once a device is asked to update, it will download and
install the bundle. Basic information about the process is relayed from the
device to HawkBit and can be seen in per-device action history.
......@@ -408,7 +407,7 @@ HTTPS to check if an update is available. In this mode whoever operates the
NetOTA server chooses the composition and number of available system images and
devices can be configured to follow a specific image name and stability level.
Unlike in the HawkBit model, the central server has no control over the devices.
Instead anyone controlling individual devices chooses the server, the image name
Instead, anyone controlling individual devices chooses the server, the image name
and the stability level and then follows along at the pace determined by the
device.
......@@ -430,7 +429,7 @@ documentation <https://gitlab.com/zygoon/netota>`_.
Deploying NetOTA
................
To deploy NetOTA for a evaluation it is best to use the ``netota`` snap
To deploy NetOTA for evaluation, it is best to use the ``netota`` snap
package. The package offers several stability levels expressed as distinct snap
tracks. Installation instructions can be found on the `netota snap information
page <https://snapcraft.io/netota>`_.
......@@ -441,7 +440,7 @@ This version is annotated with the git commit hash and a sequential number
counted since the most recent tag.
Once ``netota`` snap is installed, consult the ``snap info netota`` command and
read the description explaining available configuration options. Those are
read the description explaining available configuration options. Those options are
managed through the snap configuration system. By default, NetOTA listens on
``localhost``, port ``8000`` and is meant to be exposed by a reverse HTTP
proxy. Evaluation installations can use the insecure HTTP protocol directly and
......@@ -453,7 +452,7 @@ address=0.0.0.0:8000``.
NetOTA does not offer any graphical dashboards and is configured by placing
files in the file system. The snap package uses the directory
``/var/snap/netota/common/repository`` as the root of the data set. Upon
installation an ``example`` package is copied there. It can be used to
installation, an ``example`` package is copied there. It can be used to
understand the data structure used by NetOTA. Evaluation deployments can edit
the data in place with a text editor. Production deployments are advised to use
a git repository to track deployment operations. Updates to the repository do
......@@ -514,7 +513,7 @@ is deployed and devices are provisioned, is as follows:
- Choose which stream to publish the bundle to. You can create additional streams
at will, by touching a ``foo.stream`` file. Make sure to create the
corresponding ``foo.stream.d`` directory as well. This will create the stream
``foo``. If you choose an existing stream remember that all the *archives*
``foo``. If you choose an existing stream, remember that all the *archives*
present in that stream must have the exact same version. This means you may
need to perform additional builds, if the package is built for more than one
architecture or ``MACHINE`` value.
......@@ -529,7 +528,7 @@ is deployed and devices are provisioned, is as follows:
services for your fleet.
- If you are doing this for the first time, make sure to read the upstream
documentation of the NetOTA project and consult the sample repository created
by the ``netota`` snap package on first install. Ideally keep the changes
by the ``netota`` snap package on first install. Ideally, keep the changes
you've made in a git repository, so that you can both track changes and
revert to a previous state.
- Restart the NetOTA service or send ``SIGHUP`` to the ``netotad`` process.
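The stream creation step from the list above (touching ``foo.stream`` and creating the matching ``foo.stream.d`` directory) can be scripted. The sketch below is a minimal illustration: the helper ``create_stream`` is hypothetical, and the directory that should contain the stream depends on the NetOTA repository layout, so it is left as a parameter. Consult the ``example`` package installed by the snap for the actual structure.

```python
import tempfile
from pathlib import Path


def create_stream(parent_dir: Path, name: str) -> None:
    """Create an empty <name>.stream file and its <name>.stream.d directory."""
    parent_dir.mkdir(parents=True, exist_ok=True)
    (parent_dir / f"{name}.stream").touch()               # equivalent of `touch`
    (parent_dir / f"{name}.stream.d").mkdir(exist_ok=True)


# Demonstration in a throw-away directory instead of the live repository.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "example"
    create_stream(demo_dir, "foo")
    created = sorted(p.name for p in demo_dir.iterdir())
```

Both calls are idempotent, so re-running the script against an existing stream leaves it unchanged.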
......