LabVIEW Development Best Practices Blog

Elijah_K · ‎02-02-2011

Error handling is one of the most difficult and unappreciated components of a complex application architecture. For many LabVIEW users, the normal error wire and error case structure are adequate, but it's not uncommon to run into scenarios that are difficult to handle with the default error handling in LabVIEW. It's for this very reason that a member of our systems engineering group put together the Specific Error Handler (SEH) Reference Library.

The 'Specific Error Handler' Express VI lets a user configure actions based on specific errors. Perhaps most valuable of all, when an error occurs it's handled by an independent, asynchronous process. You can specify the criticality of each error and the appropriate course of action, be it ignore, retry, or run custom code. For anyone developing even a modestly complex application who is struggling to handle errors, I strongly recommend visiting the link above for more information on how to download and use this free API.
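LabVIEW code is graphical, so the idea can't be shown here as a block diagram, but a rough text sketch may help readers picture what "actions based on specific errors" means. The Python below is only an illustration of the concept, not the SEH library's implementation: the names `Action`, `Rule`, and `SpecificErrorHandler` are made up for this example, and it assumes an operation reports an integer error code with 0 meaning "no error".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Action(Enum):
    IGNORE = "ignore"
    RETRY = "retry"
    CUSTOM = "custom"


@dataclass
class Rule:
    """What to do for one specific error code (hypothetical structure)."""
    action: Action
    max_retries: int = 0
    handler: Optional[Callable[[int], None]] = None


class SpecificErrorHandler:
    """Maps specific error codes to a configured action."""

    def __init__(self, rules):
        self.rules = dict(rules)

    def run(self, operation):
        """Call operation() -> error code (0 = no error); return the final code."""
        attempts = 0
        while True:
            code = operation()
            if code == 0:
                return 0
            rule = self.rules.get(code)
            if rule is None:
                return code              # unconfigured error: pass it on unchanged
            if rule.action is Action.IGNORE:
                return 0                 # clear the error
            if rule.action is Action.RETRY and attempts < rule.max_retries:
                attempts += 1
                continue                 # run the operation again
            if rule.action is Action.CUSTOM and rule.handler is not None:
                rule.handler(code)       # run user-supplied code, then clear
                return 0
            return code                  # retries exhausted or no handler set
```

A caller would build the table once, for example `SpecificErrorHandler({-8002: Rule(Action.RETRY, max_retries=3)})`, and then wrap each functional segment in `run(...)`. Anything not listed in the table passes through untouched, so unexpected errors are never silently cleared.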

This is an image of the configuration dialog that allows users to specify actions for specific errors.

Elijah Kerry
NI Director, Software Community

ryank · ‎02-03-2011

I wanted to clarify one thing: the version of SEH used in the example is best used with an asynchronous central error handler, as Eli states, but the user is responsible for handling the communication to the asynchronous task (usually via a queue). I have a version that also provides a built-in asynchronous communication mechanism, as well as templates for the central error handling loop and some other features (such as an RT-specific version optimized for performance), but unfortunately I haven't gotten it documented and posted yet. If you're interested in those features and willing to figure them out without documentation, feel free to send me a private message and I'll be happy to share. I'll put a post in this area once I get the fully documented version posted.

Fire · ‎02-04-2011

The SEH reminds me of my proposal in the Idea Exchange forum:

http://forums.ni.com/t5/LabVIEW-Idea-Exchange/Error-and-Exception-Handling/idi-p/1009269

The SEH as described looks promising. I will try it in the next few days.

But, without a list of errors thrown by a subVI as part of its public interface it will become difficult to determine which errors have to be handled.

Regards Holger

~Samuli · ‎02-11-2011

This is great! We really need better tools for error handling. For example, the Merge Errors function only returns the first error it finds, so there is a big risk of losing errors. Say we have two errors, -8002 and -8003, where -8002 = "INI File is missing" and -8003 = "Critical Error! Temperature limit exceeded!". If we wire those two errors to the Merge Errors function, we lose error -8003. And because error -8002 (INI File is missing) is not that critical, users usually just use a case structure to handle -8002 and clear it. So the software could keep running even though critical error -8003 is also present.
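To make the risk concrete, here is a small Python sketch of the two merging strategies. It is only a model of the behaviour described above: `merge_first` mimics "first error wins" (it ignores warnings, which the real Merge Errors function also handles), and `merge_by_priority` with its `PRIORITY` table is a hypothetical alternative, not something LabVIEW provides.

```python
from typing import NamedTuple, Optional, Sequence


class Error(NamedTuple):
    code: int
    source: str


def merge_first(errors: Sequence[Optional[Error]]) -> Optional[Error]:
    """Return the first error found, discarding any later ones."""
    for err in errors:
        if err is not None:
            return err
    return None


# Hypothetical priorities for this example: higher number = more critical.
PRIORITY = {-8002: 1, -8003: 10}


def merge_by_priority(errors: Sequence[Optional[Error]]) -> Optional[Error]:
    """Return the most critical error; on a tie, keep the earliest one."""
    present = [e for e in errors if e is not None]
    if not present:
        return None
    return max(present, key=lambda e: PRIORITY.get(e.code, 0))


ini_missing = Error(-8002, "INI File is missing")
over_temp = Error(-8003, "Critical Error! Temperature limit exceeded!")

print(merge_first([ini_missing, over_temp]).code)        # -8002
print(merge_by_priority([ini_missing, over_temp]).code)  # -8003
```

With first-wins merging, clearing -8002 downstream leaves nothing behind, and the temperature error is gone. Merging by priority keeps the critical one, though it still drops the other, which is why collecting every error for logging is a separate concern.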

We would also need optimized functions for error handling. For example, Simple Error Handler is just a wrapper for General Error Handler, which adds some overhead. That is not much, but I think even the General Error Handler VI is too heavy when there is no error (it executes code even when there is no error). I know that many companies build their own custom error handlers because of that. We would also need proper error logging functions. It is very important to log errors, not just trap them.

Style is Everything

ryank · ‎02-11-2011

Samuli calls out a number of good issues in his comment.

1. Passing errors - LabVIEW's traditional mechanism of error handling relies on the concept that by passing an error through a section of code, the later sections of the code should not execute if an error occurs early in the code. However, this is a flawed concept for a couple of reasons. First, as Samuli points out, there are certain sections of code that need to execute whether there is an incoming error or not (checking a critical temperature limit for example). Second, because of the first effect, and because of varied coding standards among LabVIEW code, it's very difficult to actually predict what a given code segment will do given an incoming error unless you explicitly test. I developed the SEH library to address this problem by attempting to create a localized method of handling errors so that they don't need to be passed through large sections of code (in theory, you could drop one of those SEH blocks after each function call, but it's more common to drop them after a functional segment of code such as a state or loop iteration).

2. Performance - There are two performance problems with the traditional method of handling errors in LabVIEW. The first is the use of the source string. String manipulation is expensive and non-deterministic, which makes it a particular problem for control systems (RT). The second is that most of the built-in error handling palette is designed for flexibility and to return as much information as possible, rather than for performance. In general, the SEH is written for performance, although the error classification still requires string manipulation. In the next version of the SEH, I've made some improvements to the general "no-error" performance (the execution time when there are no errors to handle) and included an "RT" version that drops the source string and uses alternate methods for error classification to avoid string manipulation or memory allocation.

3. Logging errors - Logging errors is actually a pretty tricky problem. First, you need to capture all of the errors (as described in Samuli's post, passing them along can lead to some being missed). You generally need to offload them to some asynchronous process that writes them to file, as doing file I/O throughout your system could have a number of nasty side effects. Next, you need to deal with the possibility that a section of code repeatedly throws the same error and overloads your transfer or logging mechanisms. It's also desirable to be able to prioritize errors, so that a low-priority error like Samuli's ini file issue doesn't end up preventing a high-priority issue like the temperature limit from being logged. Finally, you need to make sure that the information in the error log is actually going to help you do something proactive about problems in the system. This often requires more information about the error than just the code and location. For example, in Samuli's example, you'd probably want to know the path where the system was looking for the ini file, and the temperature and channel number of the limit error. It's also worth noting that almost all of these concerns apply equally to displaying an error message to the user via a dialog box or other mechanism. There is not currently any one solution to all of these problems that I'm aware of. I've built some of the needed functionality into the next generation of the SEH library, and I'm working on building the rest, but it's slow going as it's kind of a side project for me.
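The logging requirements in point 3 (offload to an asynchronous process, throttle repeated errors, prioritize, carry extra context) can be sketched in a few lines of Python. This is an illustration of the ideas only, not how the SEH library or any LabVIEW code does it. `ErrorRecord`, `ErrorLogger`, the `min_interval` throttle, and the "lower number means more urgent" convention are all assumptions made for the example.

```python
import queue
import threading
import time
from dataclasses import dataclass, field


@dataclass(order=True)
class ErrorRecord:
    priority: int                       # lower number = more urgent
    seq: int                            # tie-breaker that preserves arrival order
    code: int = field(compare=False)
    source: str = field(compare=False)
    context: dict = field(compare=False, default_factory=dict)
    timestamp: float = field(compare=False, default_factory=time.time)


class ErrorLogger:
    """Queues errors for a background thread so callers never do file I/O."""

    def __init__(self, sink, min_interval=1.0):
        self._queue = queue.PriorityQueue()
        self._sink = sink                   # callable that writes one record
        self._min_interval = min_interval   # seconds between identical errors
        self._last_seen = {}                # (code, source) -> last accepted time
        self._seq = 0
        self._lock = threading.Lock()
        threading.Thread(target=self._run, daemon=True).start()

    def report(self, code, source, priority, **context):
        """Enqueue an error; return False if it was throttled as a repeat."""
        now = time.monotonic()
        with self._lock:
            key = (code, source)
            last = self._last_seen.get(key)
            if last is not None and now - last < self._min_interval:
                return False
            self._last_seen[key] = now
            self._seq += 1
            seq = self._seq
        self._queue.put(ErrorRecord(priority, seq, code, source, context))
        return True

    def _run(self):
        while True:
            record = self._queue.get()      # most urgent pending record first
            try:
                self._sink(record)
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until every queued record has been handed to the sink."""
        self._queue.join()
```

A caller might write `log.report(-8003, "thermal", priority=0, channel=3, temperature=91.5)`, where the keyword arguments are the extra context that makes the log entry actionable. Priority only matters when a backlog builds up, which is exactly the overload case described above.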

B.Settles · ‎02-14-2011

I have definitely tried to overcome the same issues ryank points out, especially the points about creating an asynchronous error message handler and logging. These two become very interesting when you have a distributed network of RT systems and one user interface that presents/controls information to/from all of them. Right now, we use an error queue with an error handling process that logs the errors (and whatever information we decided to code when we remembered to) to a file on each RT system. An error flag is then passed to our data network to indicate to the user when an error occurs and on what target. Our solution seems to work, but here are some of the things we would like to improve:

Time stamps for when errors occur: We are using LabVIEW 8.6, and there is no easy way to synchronize the time on all the RT systems. Thus, when we do get errors on multiple systems, it's a pain to put together a timeline of events. Granted, this is more of an RT problem, but my point is that encoding the time of the error into the error code/string would be helpful.
Distributed network error handling: I was going to start looking into logging all the errors from my networked systems into one file using this API I found the other day, http://zone.ni.com/devzone/cda/epd/p/id/5980. I could do this one of two ways; 1) designate one of my RT systems as the error logger and send messages to it, or 2) create a service that runs on a windows machine and logs errors in the background. The problem with option 2 is that our user interface can be run from any Windows machine on the test subnet.

To summarize: 1) Can NI finally include a native solution for synchronizing RT controllers with a NIST or Windows time server to set the system time? 2) Encode the time when an error occurs into the error code/string. 3) Support distributed system error logging, which is quite different from error handling, which should only be performed locally.

"All truths are easy to understand once they are discovered; the point is to discover them." -- Galileo Galilei

JimMacD · ‎02-14-2011

settlesj,

Time sync of cRIOs to an SNTP server is something I'm playing around with right now. I'm actually making a SCADA system using DSC and need to have all the RT targets reporting data (via shared variables) to a central database with correct timestamps. What I have found is a beta version of NI Timesync 1.2 http://forums.ni.com/t5/LabVIEW/How-do-I-setup-cRIO-SNTP-Time-Sync-in-MAX/m-p/1411132

As of now, I have a cRIO synced to an internet time server. I'm working on getting it to sync with my main network server but haven't had success yet; it might be a blocked port.

--------------------------------------------------------------------------------------------------

--CLD--
LV 6.1 to 2015 SP1

B.Settles · ‎02-14-2011

JimMacD,

I saw that cRIOs can be configured from a NIST time server or something similar, but this ability isn’t available on the RT embedded controllers as far as I’ve seen.

"All truths are easy to understand once they are discovered; the point is to discover them." -- Galileo Galilei

viScience · ‎02-14-2011

Also, if you need greater accuracy, you can sync cRIOs to IEEE 1588 PTP for <100 µs error.

ryank · ‎02-14-2011

settlesj,

1) I'm not sure if the TimeSync Beta works with LV 8.6, but the ini file method definitely works, I've used it before:

http://digital.ni.com/public.nsf/allkb/F2B057C72B537EA2862572D100646D43?OpenDocument

EDIT: Nevermind, just saw your response. This method only works on VxWorks (i.e. cRIO). If you are using ETS (i.e. PC or PXI) you should try TimeSync. If TimeSync doesn't end up working with LV 8.6 then there are some libraries out there that you can use to implement a SNTP client yourself, but if I remember right the challenge is actually setting the system clock with subsecond accuracy. If you can't figure out how to do that you might try just ignoring the system clock entirely and using something like this http://zone.ni.com/devzone/cda/epd/p/id/5568 .

2) For RT, I avoid encoding the timestamp into the source string to avoid memory allocation from string manipulation in time-critical code. Instead, when sending errors to my asynchronous process, I bundle the error code, timestamp, and some other info together into a separate data type, and use that as the data type for my queue (I actually use an FGV-based priority queue rather than the queue primitives because I want filtering and priority).

3) Syslog is a very good tool for distributed error notification or logging, and it's what I use for error notification on RT systems. The only thing you should be aware of is that it's UDP based, and therefore delivery can fail without retry in the event of a network collision. For most systems this is fine, but if you're working on an application where it is essential that you get the information about all of your errors, then this could be a drawback and you may still want to maintain a small local log as a backup.
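For readers who haven't used syslog, point 3 boils down to "send one small UDP datagram per error, and keep a local copy in case it never arrives". The Python sketch below illustrates that pattern under some assumptions: the message follows the classic BSD syslog layout (`<PRI>timestamp host tag: text`), the `local0` facility and `error` severity are arbitrary choices, and `rt-target-1`, `app`, `format_syslog`, and `notify` are placeholder names invented for the example. It is not the LabVIEW syslog library mentioned earlier in the thread.

```python
import socket
import time

FACILITY_LOCAL0 = 16   # syslog facility code for local0
SEVERITY_ERROR = 3     # syslog severity code for "error"


def format_syslog(code, text, host="rt-target-1", tag="app"):
    """Build a BSD-style syslog line; host and tag are placeholder values."""
    pri = FACILITY_LOCAL0 * 8 + SEVERITY_ERROR
    now = time.localtime()
    # Classic syslog timestamps pad the day of month with a space, not a zero.
    stamp = f"{time.strftime('%b', now)} {now.tm_mday:2d} {time.strftime('%H:%M:%S', now)}"
    return f"<{pri}>{stamp} {host} {tag}: error {code}: {text}"


def notify(code, text, server, backup_path):
    """Append to a local backup log first, then send one UDP datagram (no retry)."""
    line = format_syslog(code, text)
    with open(backup_path, "a", encoding="utf-8") as backup:
        backup.write(line + "\n")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), server)   # fire and forget
    return line
```

Writing the local line before sending means a lost datagram still leaves a record on the target. On a real RT system the file write would normally happen in the asynchronous error process rather than in time-critical code.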

B.Settles · ‎02-15-2011

Ryank,

I’ve tried the INI file keys before, but they don’t work on PXI RT controllers, which is all we are using. I’m also not familiar with the acronym FGV.

"All truths are easy to understand once they are discovered; the point is to discover them." -- Galileo Galilei

Luiz Maia · ‎02-15-2011

FGV is a functional global variable...

Here's an example... http://decibel.ni.com/content/docs/DOC-12876

Luiz Carlos Maia Junior

B.Settles · ‎02-15-2011

Got it. Use it all the time. A human head can only hold so many acronyms and my new job has pushed me past my limit.

"All truths are easy to understand once they are discovered; the point is to discover them." -- Galileo Galilei

Re: Advanced Error Handling in LabVIEW