NI Linux Real-Time Discussions

How to implement a full system self-backup for embedded or field-deployed RIO

Well, the time has finally come. The product I'm working on that OEMs a cRIO needs to be deployed to customers' sites, and that means several things:

  1. The end user will no longer have access to a SW dev for questions or help.
  2. The IT environment the cRIO is in will not be controlled by anyone working with the SW devs.
  3. The cRIO will not be within arm's reach of anyone with a copy of MAX (because that would give them a few godlike powers they shouldn't have).
  4. Most ports on the cRIO will be firewalled to make it safe for use in DMZs or non-NATted networks. Also to protect it against curious IT personnel and blackhat hobbyists.

We need a way to make the cRIO back itself up daily so it can be restored by a product support technician. I'm familiar with NI Sysconfig and many of its limitations and quirks. It gets the job done well from a remote device (like a workstation or laptop) when the network connection is stable with a low ping. But it won't let the cRIO create a system image of itself without restarting into Safe Mode, and that can't be allowed because this is a safety/control application with 24/7 uptime.

I've tried using a shell script to tarball nonvolatile files and then untar them during restoration, but the restored system doesn't work until I use MAX to reinstall "NI Software".

So how do I get a cRIO to create a system image of itself (1) without going offline (2) in such a way that it can be restored automatically by an authorized user?

Message 1 of 9
(5,020 Views)

Have you tried any of the standard Linux backup tools?

Message 2 of 9
(4,557 Views)

Not yet. Have you? I have some imagination for what you're up to, and I suspect you might have already solved this problem.

Message 3 of 9
(4,557 Views)

I have some imagination for what you're up to...

I'm sure you do, seeing as how I still run across your name in our source repository. 

No, I haven't tried any of them yet.  We're in the early stages of moving to Linux-based RIOs and still figuring out the systems, tools, procedures, and coding practices that allow us to support both VxWorks and NI Linux-based units in the field.  As our devices are designed to be 30-year assets with 24/7 uptime, this is no small task.

In any case, creating daily backups on the target and making them available for field techs to restore doesn't fit our business model.  It's too expensive to send someone to the site.  I suppose if I ever get SaltStack working we might look into automatically creating local images and restoring them via remote commands as a recovery option.

One thing I've considered (but have no idea how to do) is creating a recovery partition on the sbRIO with a custom bootstrap image.  The idea is if there is a catastrophic failure on the main partition, some lower-level daemon takes over, reimages the main partition with the bootstrap image, and reboots.  Then we can go through our normal process of downloading and installing the necessary components to get the system back up and running.  But like I said, I have no idea how to do that or if it's even possible on the Linux RIOs.  (I'm pretty sure it can't be done on the VxWorks RIOs.)

Message 4 of 9
(4,557 Views)

Staab_Engineering wrote:

...

We need a way to make the cRIO back itself up daily so it can be restored by a product support technician. I'm familiar with NI Sysconfig and many of its limitations and quirks. It gets the job done well when your network connection is stable with a low ping, if you're using it from a remote device (like a workstation or laptop). But it won't allow me to have the cRIO create a system image of itself without restarting into Safe Mode, and that can't be allowed because this is a safety/control application with 24/7 uptime.

I've tried using a shell script to tarball nonvolatile files and then untar them during restoration, but the restored system doesn't work until I use MAX to reinstall "NI Software".

So how do I get a cRIO to create a system image of itself (1) without going offline (2) in such a way that it can be restored automatically by an authorized user?

I was wondering if I could get a more complete picture of what systems you wish to archive and restore, since attempting to back up a running application in a manner that allows for live restoration is a very tricky task, especially if the services covered by the 24/7 uptime requirement are part of this backup/restore. This is not just a tricky task for LabVIEW, but for all applications (and is most often achieved through system duplication and transparent failover, "system" usually being a virtual machine or container in the networking world).

Message 5 of 9
(4,557 Views)

BradM wrote:

I was wondering if I could get a more complete picture of what systems you wish to archive and restore

I can't share specific details on a public forum because, you know, IP protection and NDA and all that stuff. But I'll say this: I need to back up all applications, daemons, kernel modules, etc. in such a way that I can overwrite what's on disk with the backup files in order to get the product working again. If that involves formatting the primary disk prior to restoration, so be it. I don't need to back up state data that'll be recreated when the system starts running again. I do need to back up app and config files for currently-running processes.
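
For illustration only, the backup scope described above (keep applications and config, skip state data that is recreated at startup) could be sketched with Python's standard `tarfile` module. The excluded directory names, the `rootfs` archive prefix, and the function name `make_backup` are assumptions made for this sketch, not anything NI ships or anything taken from the product in question. The archive should be written outside the tree being backed up.

```python
import tarfile
from pathlib import Path

# Assumption: these top-level directories hold only volatile state.
# The right list for a given cRIO image has to be worked out on the target.
EXCLUDED_DIR_NAMES = {"tmp", "run", "proc", "sys", "dev"}


def make_backup(source_root, archive_path, excluded=EXCLUDED_DIR_NAMES):
    """Write a gzip-compressed tarball of source_root, skipping excluded top-level dirs."""
    source_root = Path(source_root)

    def skip_state(tarinfo):
        # Member names look like "rootfs/<top-level>/...". Returning None drops
        # the entry, and tarfile then does not descend into that directory.
        parts = Path(tarinfo.name).parts
        if len(parts) > 1 and parts[1] in excluded:
            return None
        return tarinfo

    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(str(source_root), arcname="rootfs", filter=skip_state)
    return archive_path
```

A call such as `make_backup("/", "/mnt/backup/image.tar.gz")` would only make sense if the destination's own top-level directory is in the exclusion set; both paths here are hypothetical.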

I don't need live restoration. That is, as you've said, an incredibly difficult feat. We intend to take a system down when it misbehaves and while offline, restore it from a backup and validate the restoration using a technician's software tool and laptop. Then we'll put it back online.

Message 6 of 9
(4,557 Views)

Daklu wrote:

One thing I've considered (but have no idea how to do) is creating a recovery partition on the sbRIO with a custom bootstrap image.  The idea is if there is a catastrophic failure on the main partition, some lower-level daemon takes over, reimages the main partition with the bootstrap image, and reboots.

That's how NI implemented "safe mode". It's a separate ROM (or read-only partition) that houses a failover image which provides configuration services that MAX or Sys Config can touch in order to change the primary image. Based on my experience poking around in the Linux RT boot scripts, it seems entirely possible to create such an image yourself and modify or replace a handful of those scripts to make your partition the "safe mode". You really have a lot of power with the Linux devices, if you're willing to play the dev/ops role and modify NI's default configuration.

Message 7 of 9
(4,557 Views)

If that involves formatting the primary disk prior to restoration, so be it. I don't need to back up state data that'll be recreated when the system starts running again. I do need to back up app and config files for currently-running processes.

I don't need live restoration.

Since that is the case, any of the backup utilities or schemes listed in the link Daklu provided should get you what you need (with a little elbow grease), with the understanding that an image of a live, running system may capture files inconsistently if they are being modified while the backup is created.
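
One rough way to spot the live-system problem mentioned above is to record each file's modification time and size before and after the backup runs, then flag anything that differs. This is only a sketch: the function names `snapshot` and `changed_during` are made up for illustration, and a file rewritten with the same size within the filesystem's timestamp resolution would slip through.

```python
import os


def snapshot(root):
    """Map every file under root to its (mtime in ns, size) pair."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: describe symlinks themselves, not their targets
            except FileNotFoundError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def changed_during(before, after):
    """Return the sorted paths that were added, removed, or modified between snapshots."""
    paths = set(before) | set(after)
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

Taking one snapshot just before the backup and one just after, any path returned by `changed_during` is a candidate for a second copy pass or for exclusion from the image.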

Formatting and reinstallation is the safest bet, but tools like rsync will certainly do precisely what you ask of them (overwriting the files that were backed up with the backup contents and leaving the remainder of the system intact). I have not played around with the tools geared more towards complete system backups (e.g., Clonezilla) or more complex backup schemes (e.g., Bacula), but those seem to be overkill for the problem that you're solving here.
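
To make the overwrite-and-leave-the-rest behavior concrete, here is a minimal pure-Python stand-in for that kind of overlay restore. It is not rsync and does not reproduce rsync's handling of permissions, ownership, symlinks, or deltas; it only copies regular files and directories from a backup tree onto a target tree. The name `overlay_restore` is hypothetical.

```python
import shutil
from pathlib import Path


def overlay_restore(backup_root, target_root):
    """Copy every file in backup_root over target_root; files only in target_root are untouched."""
    backup_root = Path(backup_root)
    target_root = Path(target_root)
    restored = []
    for src in sorted(backup_root.rglob("*")):
        dst = target_root / src.relative_to(backup_root)
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # overwrites dst, preserving timestamps where possible
            restored.append(dst)
    return restored
```

The returned list gives a technician's tool something to validate against after the restore, for example by comparing checksums of each restored path with the backup copy.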

Message 8 of 9
(4,557 Views)

Staab_Engineering wrote:

Daklu wrote:

One thing I've considered (but have no idea how to do) is creating a recovery partition on the sbRIO with a custom bootstrap image.  The idea is if there is a catastrophic failure on the main partition, some lower-level daemon takes over, reimages the main partition with the bootstrap image, and reboots.

That's how NI implemented "safe mode". It's a separate ROM (or read-only partition) that houses a failover image which provides configuration services that MAX or Sys Config can touch in order to change the primary image. Based on my experience poking around in the Linux RT boot scripts, it seems entirely possible to create such an image yourself and modify or replace a handful of those scripts to make your partition the "safe mode". You really have a lot of power with the Linux devices, if you're willing to play the dev/ops role and modify NI's default configuration.

Safemode is a ramdisk included in a so-called Flattened-Image-Tree file (kernel + ramdisk + "device tree" + scripts). It exists on the bootfs partition.

Take care when playing around with the Zynq-based boot flow; we've had an enterprising academic group brick a controller while attempting to modify the boot flow for their needs. At the very least, check here before trying something out so we can warn you if it is likely to cause issues and tell you what to look out for.

Message 9 of 9
(4,557 Views)