
The landscape of software deployment in robotics

Andrew Murtagh
Co-founder, CEO

At Airbotics, we’ve been studying the different ways engineers deploy software to their robots for a while. We’ve talked to hundreds of folks up and down the industry - from students deploying their prototypes, to CTOs responsible for managing thousands of robots in production. We’ve noticed some patterns across them and in this article we detail the main techniques, technologies and practices we’ve seen.

If you’re planning to develop a solution and are interested in learning more about the topic, this article will provide an objective, high-level overview of the landscape of options available.

The landscape

1. Manually updating

Most engineering teams start updating software on their robots with some combination of command line utilities like scp, rsync, git, ssh, etc. This workflow probably sounds familiar:

  1. ssh robot@hostname
  2. git pull origin main
  3. catkin_make
  4. roslaunch robot start.launch

It’s clear to most this method will fall down as a fleet scales. It’s laborious, prone to operator error, not particularly secure, lacks traceability, reproducibility, access controls - the list goes on. It may also leave sensitive source code on robots. Some teams will address this by building software on their workstations and scp or rsync the built artifacts to their robots, but that can introduce cross-compilation and dev/prod parity issues. So there are a lot of things going against manually updating software on robots.

Despite that, it’s where most teams start - and probably should start. When your biggest headache is simply making your robot functional, investing weeks or months into developing an elegant deployment pipeline is probably not the best course of action.

2. Software Configuration Management (SCM) tools

A logical next step is to try to automate this process. Some teams wrap these steps up with custom scripting using <insert your preferred scripting language here>. But most will pull existing tools off the shelf rather than reinvent the wheel (most of which come from the cloud and IT domains).
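As a sketch of what that custom scripting might look like, here's a minimal Python wrapper around the manual steps from the previous section. The hostname, workspace path, and package names are hypothetical placeholders, and a real script would want logging and error handling:

```python
import shlex
import subprocess

# The manual workflow from above, expressed as remote commands.
# Paths and package names here are hypothetical placeholders.
UPDATE_STEPS = [
    "git -C ~/catkin_ws/src/robot pull origin main",
    "bash -lc 'cd ~/catkin_ws && catkin_make'",
    "bash -lc 'roslaunch robot start.launch'",
]

def remote_command(host: str, step: str) -> list:
    """Build the ssh invocation for one update step."""
    return ["ssh", host, step]

def deploy(host: str, dry_run: bool = False) -> list:
    """Run (or, in dry-run mode, just print) every step on one robot."""
    commands = [remote_command(host, step) for step in UPDATE_STEPS]
    for cmd in commands:
        if dry_run:
            print(" ".join(shlex.quote(part) for part in cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

Even a script this small hints at the problems listed earlier: it has no retries, no rollbacks, and no record of what was deployed where.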

The most common ones we’ve seen are Ansible, Salt, and Puppet. At a high level, they all let you specify the desired configuration of software that should exist on your robot(s), expressed in a framework-specific language, and provide an executable utility that tries to bring the state of the machine to match that configuration. Nice.

They’re mostly open-source, mature products that are supported by large companies, have strong developer communities, have been used successfully on many fleets at scale, and offer very high levels of control and flexibility.

On the flip side, they can present a relatively steep learning curve for those who haven’t worked with them before, and configuration drift can often occur. Coming from the cloud domain, where machines are almost always available, many of these tools are push-based. But in the imperfect world of robotics, where machines can be engaged in a task, have a poor network connection, or just be powered off, a push will almost always find some of the fleet unable to accept an update - which can require implementing additional retry logic.
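That retry logic needn’t be complicated. Here’s a sketch of one common pattern, exponential backoff - note that `robot_is_reachable` and `apply_update` are caller-supplied stand-ins for a real transport, not any SCM framework’s API:

```python
import time

def push_update(robot_is_reachable, apply_update, retries=5, base_delay=1.0):
    """Push an update, backing off while the robot is offline.

    robot_is_reachable and apply_update are caller-supplied callables
    (stand-ins for a real transport), not any SCM framework's API.
    Returns True if the update was applied within the retry budget.
    """
    for attempt in range(retries):
        if robot_is_reachable():
            apply_update()
            return True
        time.sleep(base_delay * 2 ** attempt)  # exponential backoff
    return False
```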

You may also need to implement more features to improve security, verify image integrity, implement canary rollouts, make deployments atomic, implement rollbacks, increase observability, create an admin panel, and so on - but you have almost unlimited control over how they should be implemented. Ultimately, these tools are only as good as the time you can put into them.
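As one example of the kind of feature you’d be building yourself, here’s a sketch of deterministic canary selection - hashing each robot’s id into a bucket so the same subset of the fleet is always chosen first:

```python
import hashlib

def in_canary(robot_id: str, percent: int) -> bool:
    """Hash the robot's id into a stable bucket in [0, 100)."""
    digest = hashlib.sha256(robot_id.encode()).hexdigest()
    return int(digest, 16) % 100 < percent

def canary_group(fleet, percent):
    """The subset of the fleet that receives an update first."""
    return [robot for robot in fleet if in_canary(robot, percent)]
```

Because the bucket is derived from the id, widening a rollout from 10% to 20% keeps the original canary robots in the group.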

3. A/B updates

Coming more from the embedded world, A/B updates (or dual-bank updates) are a robust, tried-and-true method for upgrading software on edge devices.

It involves splitting a hard drive into (usually) two partitions, one active (A) and one inactive (B). The A partition runs an image containing everything your application needs. When an update is initiated, the new version is downloaded and written into the B partition and the bootloader is pointed to it. After the next successful reboot, the B partition runs as the active partition and the roles switch. If the boot process fails, the bootloader can be configured to roll back to the known previous working version in the A partition - meaning your robots should always be bootable.
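The slot-switching logic can be modelled in a few lines. This is a toy sketch of the bookkeeping, not any particular provider’s implementation:

```python
class ABSlots:
    """Toy model of dual-bank (A/B) updating: new images go to the
    inactive slot, and the roles only flip after a successful boot."""

    def __init__(self):
        self.active = "A"
        self.staged = None  # slot holding a freshly written image

    def stage_update(self):
        # Download and write the new image into the inactive slot.
        self.staged = "B" if self.active == "A" else "A"

    def reboot(self, boot_ok: bool) -> str:
        # On a good boot the staged slot becomes active; on a bad boot
        # the bootloader falls back, so the robot is always bootable.
        if boot_ok and self.staged is not None:
            self.active = self.staged
        self.staged = None
        return self.active
```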

There are plenty of providers for this (both open-source and commercial), most notably: RAUC, SWUpdate and Mender.

A/B updates are a mature technique and are considered to be very robust. The providers typically offer very stable products with high levels of security.

The immediate downside to this approach is needing to over-provision storage by 2x (thankfully, storage is relatively cheap these days), and unless your provider supports delta updates (only transmitting the changes between versions instead of a new version in its entirety) upgrades can consume a lot of bandwidth - which tends to be in short supply in robotics.
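The idea behind delta updates can be sketched with a content-addressed comparison: hash each file, and only ship the ones that changed. Real implementations work at the filesystem-object or block level; this toy version works on an in-memory dict of paths to bytes:

```python
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def delta(old: dict, new: dict) -> dict:
    """Files to transmit: anything new, or anything whose content changed.

    old and new map file paths to their contents (bytes)."""
    return {
        path: data
        for path, data in new.items()
        if path not in old or digest(old[path]) != digest(data)
    }
```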

Building full system images in a reproducible, source-control-tracked way can present a challenge to robotics engineers who haven’t come from the embedded world. Most teams tend to gravitate towards the likes of NixOS or Yocto to help with this. Maybe we’ll see some Packer implementations of robotics codebases soon.

4. Containers

How could we write about deploying software and not mention containers? Containers have been steadily making their way from cloud computing into robotics for the past few years.

The big player in this space is of course Docker, but for those concerned about a daemon running as root there is also Podman (although it is possible to run the Docker daemon in rootless mode).

Docker itself doesn’t have much in the way of orchestration, monitoring and deployment - but options like Balena, Portainer and Watchtower exist for this. For those who can’t get enough of containers in their lives there’s Kubernetes and its extended family of tools - most notably for robotics, k3s.

The advantages of containers have been written about to death and mostly all transfer to robotics development - great developer experience, very portable, won’t re-pull layers that are already on the system (which may save on network usage), widely supported tooling, reproducible builds, etc., etc.
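That layer-caching behaviour is easy to model: given the ordered layers in an image manifest and the layers already on the host, only the missing ones need to be fetched. A toy sketch, not Docker’s actual implementation:

```python
def layers_to_pull(manifest_layers, local_layers):
    """Return only the image layers missing from the host, keeping
    the manifest's ordering - a toy model of Docker's layer cache."""
    local = set(local_layers)
    return [layer for layer in manifest_layers if layer not in local]
```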

But there are some caveats to consider when using containers for robotics development. You’ll hear mixed opinions about performance overhead (which in theory ought not to be exceedingly onerous, but in practice can be), network and device access (commonly required in robotics) can be a pain, and upgrades don’t have the level of atomicity that A/B upgrades have.

But the real showstopper many teams eventually run into is that they simply need some level of compute or configuration on the host for which Docker would need to be bent out of shape to manage. Things like: network configuration, systemd unit files, udev rules, drivers, real-time kernel patches, display servers, etc. The solution for this is often to pair Docker with SCM tools or A/B updates. So while Docker is great for application level software deployment it isn’t a panacea for robotics development.

5. AWS Greengrass

AWS Greengrass is an approach in a category of its own that has gained increasing traction in robotics.

It runs as an agent on your robot through which you can deploy, run and manage components. Components could be AWS Lambda functions, Docker containers, secrets, and lots of other services.

If you love AWS, this could be a good solution for you. It’s very fully-featured and, naturally, being an AWS service, managed and scaled for you. However, it comes with a high level of lock-in (which may or may not be a drawback for some), creating components can be cumbersome, and, depending on your stance, the AWS console may not provide a high level of observability.

6. Other approaches

There are many other options out there that we’ve happened to see less frequently used in robotics - not that they are inherently worse. The main ones being: snaps, Rapyuta, Nimbus, transferring some kind of archive with some kind of script (e.g. zip and Python), package managers (predominantly apt), and entirely offline updates via a USB stick or similar.

A new way to update software on robots

Like many of the other problems faced in robotics, solutions are often borrowed from other fields. For software deployment, solutions have predominantly come from the cloud or IoT fields.

In assessing the landscape of options available to roboticists, we’ve found many of them fall short in meeting the specific needs of the robotics use case. In particular:

  • Robots go where servers never can and are often constrained by network bandwidth - so updates should be delta by default.
  • Robots typically contain multiple computers that need to be securely updated simultaneously and reported on.
  • Robots always need to boot - updates should therefore be atomic with automatic rollbacks.
  • Robots need to be configured at the lowest level of the compute stack - so full system updates should be possible.
  • Robots can’t be interrupted from their tasks to download and upgrade their software whenever a new update becomes available - so granular control over the update lifecycle should be possible.
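The last point - granular control over the update lifecycle - can be sketched as a small state machine in which downloading happens in the background but applying an update always defers to the robot’s task. This is a toy model for illustration, not Airbotics’ implementation:

```python
class UpdateLifecycle:
    """Toy model of granular update control: fetch in the background,
    apply only when the robot is idle."""

    def __init__(self):
        self.state = "idle"

    def download(self):
        # Downloading can happen at any time without disturbing the task.
        self.state = "downloaded"

    def apply(self, robot_busy: bool) -> bool:
        # Never interrupt a running task; applying is explicitly gated.
        if self.state != "downloaded" or robot_busy:
            return False
        self.state = "applied"
        return True
```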

At Airbotics, we believe that securely, reliably and easily updating software on robots is an important enough task to warrant its own product. Airbotics is dedicated entirely to meeting the specific challenges found in robotics.

Images are built using Yocto to provide full control of the software stack, they’re shipped using OSTree to provide incremental and atomic upgrades, and secured using Uptane. The downsides of our approach is that robots need to be rebooted for the update to take effect, OSTree places some limitations on filesystems, and building an image with Yocto can be an involved process. There's no silver bullet in engineering.

Conclusion

We’ve seen as many ways to deploy software to robots as there are robotics teams. Like most things in robotics, the solutions tend to be pulled from other fields. They can work well for some, and disastrously for others.

Other fields have benefited enormously from investing in good infrastructure and tooling - nimble product teams, faster delivery to customers, greater efficiency and collaboration in teams, better stability - the list goes on.

If robotics is to continue to mature, we’ll also need to invest in better infrastructure, and shipping software should have its place there. We’re very excited to see more activity and solutions in this space so robotics engineers can get back to what they’re good at - robots!
