I am developing an embedded application that collects data from cRIO modules.
The application uses the Actor Framework (AF) and LabVIEW 2018 SP1 on a cRIO-9033.
Aside from module data acquisition, it has two other actors: one that does the error logging and another that sends the collected data through MQTT.
To log the errors, I overrode "Handle Error.vi" so that it logs any errors that occur and avoids stopping the actor, with the exception of Error 43/1608.
The MQTT Actor is constantly receiving messages to send to our server, but whenever the server happens to be down, the methods inside throw an error, and that error is logged.
This works nicely, but once the server has been down for a while (even just minutes), the log file gets polluted and nothing useful comes out of it.
Is there a known way to handle this situation?
My first thought was: monitor the actors, and when an actor throws the same error repeatedly (let's say 5 times), shut that actor down, then wait a while before relaunching it.
Is this a correct approach? Have any of you ever done something like this?
I haven't done anything like this, but your idea of shutting down after 5 errors seems reasonable.
You might also code the MQTT Actor to detect when the server seems to be down, and go into a stalled state. In the stalled state, maybe it just pings the server every 60 seconds to see when the link is back up. That will clean up your error log too. Depending on your application, you can drop or buffer the incoming data while the link is stalled.
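To make the stalled-state idea concrete, here is a rough sketch in Python, since a LabVIEW block diagram can't be pasted as text. Everything in it is an assumption for illustration: the class name `StallingSender`, the `send` and `ping` callables (stand-ins for whatever your MQTT client exposes), and the choice to signal a failed send with `OSError`. It buffers incoming data in a bounded queue while stalled and probes the link at most once per interval.

```python
import time
from collections import deque


class StallingSender:
    """Sends while the link is up; buffers and probes periodically while stalled.

    `send(message)` is assumed to raise OSError when the server is unreachable,
    and `ping()` is assumed to return True when the server answers. Both are
    hypothetical callables supplied by the caller.
    """

    def __init__(self, send, ping, probe_interval_s=60.0, max_buffered=1000,
                 clock=time.monotonic):
        self._send = send
        self._ping = ping
        self._probe_interval_s = probe_interval_s
        self._clock = clock
        # Bounded buffer: once full, the oldest message is dropped first.
        self._buffer = deque(maxlen=max_buffered)
        self._next_probe = 0.0
        self.stalled = False

    def submit(self, message):
        if self.stalled:
            self._buffer.append(message)
            self._maybe_probe()
            return
        try:
            self._send(message)
        except OSError:
            # First failure: keep the message and enter the stalled state.
            self._buffer.append(message)
            self._stall()

    def _stall(self):
        self.stalled = True
        self._next_probe = self._clock() + self._probe_interval_s

    def _maybe_probe(self):
        now = self._clock()
        if now < self._next_probe:
            return  # too early for another probe
        if self._ping():
            self.stalled = False
            self._flush()
        else:
            self._next_probe = now + self._probe_interval_s

    def _flush(self):
        # Send buffered messages oldest first; stall again on any failure.
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except OSError:
                self._stall()
                return
            self._buffer.popleft()
```

In this sketch the probe is driven by incoming messages rather than by a separate timer, which works when data arrives steadily. Setting `max_buffered` to a small value approximates dropping data; a larger value approximates buffering it.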
I have a similar situation where we are collecting large amounts of data and logging it to a database. I wrote an error logger that overrides Handle Error and writes the errors it receives to the database as well, in a new table.
To mitigate large amounts of repeat errors, I wrote a filter that looks for repeat errors within a given time, and then it ignores those errors until the time has elapsed, and then goes through the cycle again.
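For anyone who wants to see the shape of such a filter, here is a minimal sketch in Python (my actual implementation is in LabVIEW, so this is only an approximation of the idea). The class name `RepeatErrorFilter`, the `window_s` parameter, and keying on the error code alone are assumptions made for the example.

```python
import time


class RepeatErrorFilter:
    """Lets the first occurrence of an error code through, then suppresses
    repeats of that code until `window_s` seconds have elapsed."""

    def __init__(self, window_s=300.0, clock=time.monotonic):
        self._window_s = window_s
        self._clock = clock
        # Error code -> time at which the current suppression window started.
        self._window_start = {}

    def should_log(self, code):
        now = self._clock()
        start = self._window_start.get(code)
        if start is not None and now - start < self._window_s:
            return False  # repeat inside the window: ignore it
        # First occurrence, or the window has elapsed: log and restart the cycle.
        self._window_start[code] = now
        return True
```

The error handler would call `should_log(code)` before writing to the log and skip the write when it returns `False`. Different error codes are tracked independently, so a new kind of error is still logged right away.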
Of course, we alert the operator to the error, but unless it is fatal, we do not shut down the application or the troubled module.
Sorry for the long delay in answering the post with the adopted solution.
Actually, shortly after I posted here, the system went into production, and until now I haven't had time to improve this actor and roll out the modifications in the field.
After studying the alternatives, I completely discarded the idea of shutting down the actor and went with the "stalled state" option.
So after a couple of errors, I change the actor state to "suspended". On subsequent attempts to send messages, the actor checks the connection before sending and switches its state back if the check succeeds.
The only downside of this approach is that I can lose the data that it was supposed to send, but in the current usage, there is no problem in discarding data.
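Since I can't paste the block diagram as text, here is roughly what the logic looks like, written as a Python sketch. The names (`SuspendableSender`, `send`, `is_connected`, `max_errors`) are made up for the example, and using `OSError` to signal a failed send is an assumption, not how the LabVIEW MQTT methods report errors.

```python
class SuspendableSender:
    """Suspends after `max_errors` consecutive send failures; while suspended,
    checks the connection before each send and discards data if it is down."""

    def __init__(self, send, is_connected, max_errors=2):
        self._send = send                  # hypothetical: raises OSError on failure
        self._is_connected = is_connected  # hypothetical: returns True if server is up
        self._max_errors = max_errors
        self._consecutive_errors = 0
        self.suspended = False
        self.dropped = 0

    def handle_message(self, message):
        """Returns the exception that should be logged, or None."""
        if self.suspended:
            if not self._is_connected():
                self.dropped += 1
                return None  # still down: discard quietly, nothing to log
            # Connection is back: leave the suspended state and send normally.
            self.suspended = False
            self._consecutive_errors = 0
        try:
            self._send(message)
        except OSError as exc:
            self._consecutive_errors += 1
            self.dropped += 1
            if self._consecutive_errors >= self._max_errors:
                self.suspended = True
            return exc  # logged only until the actor becomes suspended
        self._consecutive_errors = 0
        return None
```

With this shape, only the first couple of failures reach the log. Everything that arrives while the actor is suspended is counted in `dropped` and thrown away, which matches the data loss mentioned above.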