How to handle hanging processes in QSM

TBM · ‎04-24-2019

Hello everyone,

This is a follow-up thread to my initial question "How to add a timeout failsafe in QSM-PC" available here: https://forums.ni.com/t5/LabVIEW/Adding-a-timeout-fail-safe-in-QSM-PC/td-p/3905791?profile.language=...

I've spent a good amount of time searching and reading on the forum, but no thread seems to really cover what I'm looking for.

I'd like to spark a conversation around the concept of handling hanging processes in QSM-PC and see how some of you do it in your apps. Since I discovered this architecture thanks to a paper by Mezintel (http://www.mezintel.com/blog/labview-queued-state-machine/), I've been using it widely, but the concept of timeouts and watchdog timers in QSM has always been a bit fuzzy to me.

I understand how we can split code into parts, poll for various check-ups, then interrupt if necessary and change the entire queue of events, which gives us great flexibility to handle errors while structuring our code. Still, I've always wondered how to handle timeouts properly, especially in terms of user input.

This has never bothered me too much, as I tend to take a "dumb proof" approach to my code and try to cover all angles. But this is the first time I'm working on a large-scale application that might be used by multiple companies with hundreds of employees, so I'm a little more cautious than usual.

While I have control over the code, I don't have control over what data users are going to feed into the system. There are of course safety measures I can think of to mitigate the risk of errors but there's always a part of me that thinks that there may be a way to tick all the green checks of our safety net and yet manage to input corrupted data or data that will cause undesired hanging as no errors might be generated.

There are also situations where we might use a chunk of code from a 3rd-party provider that doesn't implement a timeout logic, making it impossible to generate an error after a set time directly within the process.
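
To make that last case concrete: one general pattern for code that offers no timeout of its own is to run it in a separate OS process that the caller can kill. This is an assumption brought in for illustration, not something from the linked thread, and it is written in Python only because LabVIEW diagrams can't be pasted as text. The helper name `run_isolated` is hypothetical.

```python
import subprocess
import sys


def run_isolated(python_code: str, timeout_s: float) -> str:
    """Run a snippet in a child interpreter and kill it if it overruns."""
    try:
        subprocess.run([sys.executable, "-c", python_code],
                       timeout=timeout_s, check=True)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising this exception,
        # so the caller is free to go back to its idle state.
        return "timed out"
    except subprocess.CalledProcessError:
        return "failed"
    return "completed"


if __name__ == "__main__":
    # Stand-in for a third-party call that never returns on its own.
    print(run_isolated("import time; time.sleep(30)", timeout_s=0.5))
```

The trade-off is that anything the hung code held (file handles, instrument sessions) is dropped when the child is killed, so this only suits work that is safe to abandon.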

How do you handle these situations? Is there any way to interrupt a process and return to the idle state?

All ideas are welcome!

mcduff · ‎04-24-2019

Sorry I do not use a QSM, I never liked it.

But here are some suggestions, which you may already follow but did not show in this demo.

1. Make your cluster a type-def in your state machine.
2. Your state machine looks like a JKI State Machine, except you made it with enums instead of strings. I prefer string-based state machines; they offer a lot more flexibility.
3. Not sure what your watchdog is doing.
4. Not sure how to handle a case where you interrupt a loop. A lot depends on what you want to do after you interrupt the loop: do you want to exit the program, restart the loop, etc.?

As far as watchdogs go for monitoring loops, you could check the status of your queue: if it is constantly growing in size AND the same state is waiting to be dequeued, that may mean your loop is stuck. Or maybe just check whether the same state has been waiting to be dequeued for too long.
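
A rough sketch of that check in Python (LabVIEW being graphical, text is easier to post). It assumes the watchdog periodically records a snapshot of the queue length and the state at the head of the queue; the function name `looks_stuck`, the `min_samples` threshold, and the state name "Acquire" are all made up for illustration.

```python
from typing import List, Optional, Tuple

# One snapshot per watchdog tick: (queue length, state at the head of the queue)
Snapshot = Tuple[int, Optional[str]]


def looks_stuck(snapshots: List[Snapshot], min_samples: int = 3) -> bool:
    """True when the queue kept growing while the same state sat at its head."""
    if len(snapshots) < min_samples:
        return False
    recent = snapshots[-min_samples:]
    sizes = [size for size, _ in recent]
    heads = {head for _, head in recent}
    growing = all(later > earlier for earlier, later in zip(sizes, sizes[1:]))
    same_head = len(heads) == 1 and None not in heads
    return growing and same_head


if __name__ == "__main__":
    history = [(1, "Acquire"), (2, "Acquire"), (4, "Acquire")]
    print(looks_stuck(history))
```

This only detects a probable hang. It says nothing about how to get the consumer out of the state it is stuck in.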

What I don't like about the QSM is that you need queues for every independent loop, so depending on the number of loops you need, that could be a lot.

I like the JKI State Machine, and you seem to have a handle on it. I connect independent loops via user events, which is a lot easier than queues, if you are looking for something new to try.

mcduff

TBM · ‎04-24-2019

Thanks a lot for the reply and suggestions, mcduff. I'm always looking to expand my knowledge and try new things.

I actually played with both the JKI-SM and QSM architectures and found myself more comfortable with enums, so I find it interesting that you mention string-based SMs are more flexible.

Could you give an example where a string-based SM offers more flexibility than enums w/ variant data?

Regarding #3, the watchdog currently doesn't do anything and just acts as a reverse counter that starts when a task is initiated and resets on task completion/idle state. I was testing a few things with it, like trying to abort the consumer loop when the timeout is reached, without luck. Figured I'd just leave it in the demo.

About #4, this would probably depend on what the software was doing when timeout was reached but the general approach would be to allow for a) Resume operation(s) (reset watchdog / restart timeout) or, b) Abort operation(s) and return to state "x" or, c) Generate error report, abort operation(s) and then return to state "x".

There shouldn't be much challenge in detecting the things you mentioned (queue size, same state waiting for dequeue), but I'm still not sure how to abort a process, should we decide to return to the Idle state (or whatever appropriate state), when data is still being processed.

When you say independent loops, do you mean parallel processes? I haven't had to work on a project where this has become a major hurdle, but I'll definitely keep it in mind.

--

I'm currently reading about some folks who adapted the JKI State Machine into the producer-consumer loop. Wouldn't it be plagued with the same queues issue you're referring to?

The conversation goes on to talk about GOOP. It seems like overkill, but I'll read more about it; it can't be bad to explore other horizons!

Bob_Schor · ‎04-24-2019

Advantages to Strings -- easy to add a new State (just make up a new String name and add a Case Statement with that name).

Disadvantage to Strings -- if you mis-spell a State Name, you won't get a match, which will be "caught" by the (required) Default case. If you add a State (string) but forget to program a Case for it, it is also treated as a "mis-spelling". If you want to find where your States are being used, the "search" process is pretty tedious.

Advantage to Enums (my method-of-choice) -- can (must, in my case) have a TypeDef, making finding them easy (search for the TypeDef). Adding a State but forgetting to program a Case for the State (or a Default) leads to a Broken Arrow, easy to find and fix.

Disadvantage to Enums -- large changes to Enum (and sometimes removing a State) can "scramble" (or leave two identically-named) Case options, requiring a little thinking of "is this the right code for this State?". There's a little more effort in creating and maintaining the Enum/TypeDef compared to inventing a new String.
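
To put the trade-off in text form, here is a small Python sketch. Python has no equivalent of LabVIEW's Broken Arrow, so the enum half settles for the closest thing available: a check that fails as soon as the module loads if a state has no handler. The state names and handler strings are invented for the example.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    ACQUIRE = "Acquire"


def run_string_state(name: str) -> str:
    """String selector: a misspelling is only noticed when this line runs."""
    handlers = {"Idle": lambda: "idling", "Acquire": lambda: "acquiring"}
    handler = handlers.get(name)
    if handler is None:
        return f"default case: unknown state {name!r}"
    return handler()


ENUM_HANDLERS = {State.IDLE: lambda: "idling", State.ACQUIRE: lambda: "acquiring"}
# Fails at load time, not at run time, if a State member has no handler.
_missing = set(State) - set(ENUM_HANDLERS)
assert not _missing, f"unhandled states: {_missing}"


def run_enum_state(state: State) -> str:
    """Enum selector: only members of State can ever reach this lookup."""
    return ENUM_HANDLERS[state]()


if __name__ == "__main__":
    print(run_string_state("Aquire"))      # typo falls into the default case
    print(run_enum_state(State.ACQUIRE))
```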

My Recommendation -- You Pays Your Money, You Makes Your Choice. Be consistent.

Bob Schor

TBM · ‎04-24-2019

Thanks for joining in Bob.

These are the same conclusions I came to after playing with the two structures.

I tend to prefer the "defined" state of enums over strings, even if it means a little more work to maintain, but if there are other advantages in terms of flexibility that I haven't considered, I'm certainly curious about them.

With that said, I don't think choosing Strings or Enums would affect how we can tackle the timeout issue.

mcduff · ‎04-24-2019

@TBM wrote:

First things first about enums:

- Definitely faster than strings; if you have a tight loop and want every nanosecond of performance, enums are the way to go.
- Impossible to have a wrong state, i.e., a misspelled state, etc.

JKI SM - String Based (may only apply to the JKI SM, as that is what I use, and these are only my opinions):

- Built-in error handling (in case of a misspelled state).
- Built-in Event Structure. I use it for FP events and dynamic events, such as User events to communicate between loops or a DAQmx event such as Change State to trigger something, or I use the DAQmx event for the number of points in the buffer to periodically download data. Why events? No polling needed.
- I prefer to make a multi-line "macro" of states to execute, rather than build an array of enums.
- If I make a multi-line "macro" of states, it is easy to comment out a state for debugging purposes rather than remove an element from an array. Below is an example of a macro; if I want to comment out a state, I add a # at the beginning. It is also easier to reorder a bunch of lines of text than an array.
- Easy to add arguments to a state. I would rather have one state with two arguments than two separate states. For example, UI: FrontPanel >> Open and UI: FrontPanel >> Close. The advantage is that I can add more arguments if needed, like hidden, maximized, etc., without adding more states and largely duplicated code. Or I have a message state and the argument is the message.
- JKI added some nice tools, like State Machine Explorer, that make programming easier.
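
For anyone who hasn't seen the macro idea, here is a minimal text sketch of it in Python. It only mimics the two conventions described above (a leading # disables a line, and >> separates a state from its argument) and is not the JKI implementation. The function name `parse_macro` and the state "Data: Acquire" are hypothetical.

```python
from typing import List, Tuple


def parse_macro(macro: str) -> List[Tuple[str, str]]:
    """Turn a multi-line macro into (state, argument) pairs, in order."""
    steps = []
    for raw in macro.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a state commented out for debugging
        state, _, argument = line.partition(">>")
        steps.append((state.strip(), argument.strip()))
    return steps


if __name__ == "__main__":
    macro = """
    UI: FrontPanel >> Open
    #Data: Acquire
    UI: FrontPanel >> Close
    """
    for state, argument in parse_macro(macro):
        print(state, "|", argument)
```

Reordering states is then just a matter of moving lines of text, which is the convenience described in the list above.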

Could you give an example where a string-based SM offers more flexibility than enums w/ variant data?

I use user events to communicate between loops instead of queues, and I use a string and a variant as the data payload. Between loops, I think you need variants for flexibility.

Regarding #3, the watchdog currently doesn't do anything and just acts as a reverse counter that starts when a task is initiated and resets on task completion/idle state. I was testing a few things with it, like trying to abort the consumer loop when the timeout is reached, without luck. Figured I'd just leave it in the demo.

Stopping a loop is hard. You can monitor the queue status and determine if the loop is stuck, but after that comes the hard part. Your loop is not stuck at the queue, it is stuck in some state, so how do you communicate with that state to tell it to stop? Not easy. Unless you have communication hooks into the stuck state, there is not much you can do except Ctrl-Alt-Del or force quit the application. (If your loop was stuck at the queue, you could always destroy the reference to stop it.)
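
As a text illustration of what a "communication hook" might look like, here is a small Python sketch using a shared abort flag. The long-running state has to cooperate by checking the flag between steps. If it never checks, the watchdog can set the flag all day and nothing happens, which is exactly the problem described above. The names `long_running_state` and `run_with_timeout` are made up.

```python
import threading
import time


def long_running_state(abort: threading.Event, steps: int = 1000,
                       step_time: float = 0.01) -> str:
    """A state that does its work in small steps and checks the hook each time."""
    for _ in range(steps):
        if abort.is_set():
            return "aborted"  # clean exit, the loop can go back to idle
        time.sleep(step_time)  # stand-in for one chunk of real work
    return "completed"


def run_with_timeout(timeout_s: float) -> str:
    """Watchdog side: wait up to timeout_s, then ask the state to stop."""
    abort = threading.Event()
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(outcome=long_running_state(abort)))
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        abort.set()    # request the abort
        worker.join()  # returns once the state notices the flag
    return result["outcome"]


if __name__ == "__main__":
    print(run_with_timeout(0.05))
```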

About #4, this would probably depend on what the software was doing when timeout was reached but the general approach would be to allow for a) Resume operation(s) (reset watchdog / restart timeout) or, b) Abort operation(s) and return to state "x" or, c) Generate error report, abort operation(s) and then return to state "x".

In my programs when an error occurs, I make a dialog window that describes the error and allows the user to try and continue (ignore the error) or end the application. (I end the application gracefully, not a force quit.) In this case the program is still responsive, there is just an error.

When you say independent loops, do you mean parallel processes? I haven't had to work on a project where this has become a major hurdle, but I'll definitely keep it in mind.

Yes. LabVIEW is great for parallel processing, try to learn to design your programs this way. It is hard, but definitely worth it as programs become more complicated and require more resources.

I'm currently reading about some folks who adapted the JKI State Machine into the producer-consumer loop. Wouldn't it be plagued with the same queues issue you're referring to?

I use User Events to communicate between loops; they are lossless and can send messages to many loops. So rather than send a message to a particular loop, I send my message to every loop. A helper loop decides whether to "forward" the message.

I am probably not explaining this well, but I use some modifications of this link for communicating between loops: Message Bus
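
In case a text version helps, here is a very loose Python sketch of the broadcast-and-filter idea. It is not the linked Message Bus code and it does not use LabVIEW User Events. Each subscriber's inbox is a plain `queue.Queue`, and the `wants` filter stands in for the helper loop that decides whether to forward a message. All names are hypothetical.

```python
import queue
from typing import Any, Callable, Dict, Tuple


class MessageBus:
    """Every message is offered to every loop; a filter decides who keeps it."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Tuple[queue.Queue, Callable[[str], bool]]] = {}

    def subscribe(self, loop_name: str, wants: Callable[[str], bool]) -> queue.Queue:
        """Register a loop and return the inbox it should read from."""
        inbox: queue.Queue = queue.Queue()
        self._subscribers[loop_name] = (inbox, wants)
        return inbox

    def broadcast(self, name: str, payload: Any = None) -> None:
        """Send (name, payload) to every loop whose filter accepts the name."""
        for inbox, wants in self._subscribers.values():
            if wants(name):
                inbox.put((name, payload))


if __name__ == "__main__":
    bus = MessageBus()
    ui_inbox = bus.subscribe("ui", lambda name: name.startswith("UI:"))
    daq_inbox = bus.subscribe("daq", lambda name: name.startswith("DAQ:"))
    bus.broadcast("UI: Refresh", {"rate": 10})
    print(ui_inbox.get_nowait(), daq_inbox.empty())
```

The sender never needs to know which loops exist, which is the appeal of broadcasting over one queue per loop.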

The following link is an excellent presentation/tutorial about how User Events work:

https://libraries.io/github/JackDunaway/LabVIEW-User-Events-Tips-Tricks-and-Sundry

Lastly, use what you are comfortable with.

mcduff

Ben · ‎04-25-2019

Drawback of strings:

Typos cannot be detected until the offending code is executed, so testing requires that every possible transition is invoked. A typo may therefore not be found until the end user does something that was not expected.

Advantage of Enums:

Unhandled states are flagged via a broken wire at development time. Typos? Who cares? After all, this is LV, not a text-based language.

Ben

Retired Senior Automation Systems Architect with Data Science Automation, LabVIEW Champion, Knight of NI, and Prepper

Bob_Schor · ‎04-25-2019

Not to be too contrary, but even I must admit that the JKI State Machine is a(n evolving) Thing of Beauty.

Bob Schor

Kevin_Price · ‎04-25-2019

As you can see, there isn't a clear consensus on the enum vs string question. I started out favoring enums but in recent years have converted to strings.

The main driving factor was some effort with a coworker to establish a few common-use libraries, packages, and templates: stuff that would provide a good head start on the kind of code we had tended to re-implement over and over. We decided that strings would allow us to build utilities that were more universal, unlike enums, which tend to be unique and specific to each individual app.

Here are some "habits" that have helped:

- The case structure that receives the strings is configured to be case-INsensitive. Nothing good comes from having separate cases for "END", "End" and "end".

- Our "Default" case is used to catch illegal strings (whether typo, misspelling, or whatever) and pop up an error window. These are expected to be caught during development and debugging. In theory they're a bit of a time bomb but in practice, we haven't gotten burned and don't even trip them much during development.

- We don't build up our strings with parsing and formatting functions. They're just string constants, much like an enum would have been. As a matter of convention, we've chosen not to extend the flexibility strings offer into code that accommodates dynamically-defined string messages.
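
A short Python sketch of the first two habits, for readers who think better in text. The case structure is stood in for by a dictionary with case-folded keys, and the "Default" case raises an error instead of popping up a window. The names `make_dispatcher` and "End" are invented for the example.

```python
from typing import Callable, Dict


def make_dispatcher(handlers: Dict[str, Callable[[], str]]) -> Callable[[str], str]:
    """Build a case-insensitive dispatcher with a loud default case."""
    table = {name.casefold(): handler for name, handler in handlers.items()}

    def dispatch(message: str) -> str:
        handler = table.get(message.casefold())
        if handler is None:
            # "Default" case: illegal strings should surface during development.
            raise ValueError(f"Illegal state string: {message!r}")
        return handler()

    return dispatch


if __name__ == "__main__":
    dispatch = make_dispatcher({"End": lambda: "stopping"})
    print(dispatch("END"), dispatch("end"))
```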

All that said, I've got nothing against enums and still use them in lower layers of code that's app-specific.

-Kevin P

ALERT! LabVIEW's subscription-only policy came to an end (finally!). Unfortunately, pricing favors the captured and committed over new adopters -- so tread carefully.

JÞB · ‎04-25-2019

I'll add my take on the String vs Enum state selector.

Look at the owner. Is it an lvlib, lvlibp, or lvproj?

For libraries, use strings! This allows you to extend library functionality without refactoring the code of the projects that use it. The worst case is a project that calls an extended function but uses a library that had to be downgraded to an earlier version. The default case can easily say "Error: state %s not available in library %s version %d.%d", so now you have information telling the integration team just what the developer did wrong (other than not reading the change history in the library Readme file).
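
Just to show that message filled in, here is a throwaway Python sketch using the same format string. The library name, state name, and version numbers are placeholders, and `unsupported_state_error` is a hypothetical helper.

```python
def unsupported_state_error(state: str, library: str, major: int, minor: int) -> str:
    """Build the default-case message for a state this library version lacks."""
    return "Error: state %s not available in library %s version %d.%d" % (
        state, library, major, minor)


if __name__ == "__main__":
    print(unsupported_state_error("Calibrate", "MyInstrument.lvlib", 1, 2))
```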

For projects, use an Enum unless you are the one person who never writes a bug or sees any change orders (scope creep) during any part of a development life cycle.

Now, if you don't know if your case structure belongs in a library or belongs in a project, what a reuse library is or you have no source code control... just wing it! In the long run whatever you do is going to cause problems in the future. Use what hurts you less to debug and qualify.

"Should be" isn't "Is" -Jay
