Undeterministic/ Strange behavior in the Real-Time System

Bob_Schor · ‎05-06-2016

Am I correct that the four "anonymous VIs" that you show in your Picture (I really dislike pictures of code, just attach the code) are what you call A, B, C, and D? When you say they are not sub-VIs, what are they? [If they are running inside of another VI, as these seem to be, then pretty much, by definition, they are "sub-VIs", at least as I understand the term].

The Picture shows code that never stops, and where the timing is completely undetermined (it is "reasonable" that all four Frames will start at the same time, but they don't have to -- in principle, the last VI could start running before the first ...).

A much better way to start 4 sub-VIs at known times is ... to start four sub-VIs at known times! The Error Line is Your Friend, use it!

But as I understand your original problem, it wasn't how to start these VIs, but how to stop them, right? Or are you talking about something else?

Bob Schor

edjones93 · ‎05-07-2016

You've mentioned changing front panel items. Are you doing this programmatically? The graphical user interface is not deterministic, and updating it will result in non-deterministic behaviour. If you wish to programmatically change Controls/Indicators, you should do this in a separate, non-timed loop using a producer/consumer architecture: update your controls and indicators in the consumer loop, not in a timed loop. This document also provides other methodologies for such non-deterministic tasks:

http://zone.ni.com/reference/en-XX/help/371361K-01/lvconcepts/vi_execution_speed/

This is a producer/consumer architecture:

http://www.ni.com/white-paper/3023/en/

It should be used to separate deterministic and non-deterministic tasks. Non-deterministic tasks include user interface updates, network communication and file I/O, among others.
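The producer/consumer split can be sketched in text form. LabVIEW code is graphical, so the following is only a Python analogy, not LabVIEW code: the names `producer`, `consumer`, `run`, and `display` are hypothetical, and `queue.Queue` stands in for a LabVIEW queue. The point it illustrates is that the time-critical side only enqueues data, while anything slow or unpredictable happens on the other side of the queue.

```python
import queue
import threading

def producer(q, samples):
    """Time-critical side: only acquires data and hands it off."""
    for s in samples:
        q.put(s)           # enqueue only; no UI, file, or network work here
    q.put(None)            # sentinel telling the consumer to stop

def consumer(q, display):
    """Non-deterministic side: UI updates, logging, network I/O."""
    while True:
        item = q.get()     # blocks until data is available
        if item is None:
            break
        display.append(item)   # stand-in for updating an indicator

def run(samples):
    """Start both loops, wait for them to finish, return what was 'displayed'."""
    q = queue.Queue()
    display = []
    t_cons = threading.Thread(target=consumer, args=(q, display))
    t_prod = threading.Thread(target=producer, args=(q, samples))
    t_cons.start()
    t_prod.start()
    t_prod.join()
    t_cons.join()
    return display
```

Calling `run([1, 2, 3])` passes the three samples through the queue in order; the producer never touches `display` directly.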

Additionally, you have four timed loops in your original post's attached picture, each with a dt value of 1 ms. This will not be possible, as each loop is a thread. It is a system-level limitation that an operating system can only execute one thread at a time on each core, and each thread takes at least 1 ms, as this is the rate of the operating system's interrupt from the CMOS chip. So, when you set 1 ms on each loop, you will actually achieve only 4 ms with one core, 2 ms with two cores, and 1 ms only if you have four cores. Additionally, this leaves no time for other background processes/services to run unless you have more than four cores.
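The arithmetic in that paragraph can be written out as a small calculation. This is a sketch of the simplified model stated in the post (one loop per time slice per core, a 1 ms slice), not a description of how any particular RT scheduler actually behaves; the function name and the `slice_ms` parameter are illustrative assumptions.

```python
import math

def effective_period_ms(num_loops, num_cores, slice_ms=1.0):
    """Best-case period per loop, assuming each loop needs one time slice
    of slice_ms and each core runs only one loop at a time."""
    rounds = math.ceil(num_loops / num_cores)   # slices needed to serve every loop once
    return rounds * slice_ms

# Four 1 ms loops on 1, 2 and 4 cores, as in the post
for cores in (1, 2, 4):
    print(cores, "core(s):", effective_period_ms(4, cores), "ms")
```

Under the same assumptions, 20 loops on 4 cores would give a best-case period of 5 ms per loop.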

So when you update controls/indicators with this setup (since your system has 4 cores), at least one core will be taken over by the operating system for some time while it updates the user interface, before control returns to the timed loop.

Finally, with regard to controlling subVIs and closing them programmatically, you can use VI Server to call the subVIs and Close Reference to halt their execution. However, I believe the issue is more related to you updating user interfaces when you have already consumed all CPU resources, and doing so within a timed loop.

Why do you need to run your threads at 1ms intervals?

More information is needed here: the logic involved in the loops, what is being read from the FPGA, the rate at which it is pushing data to the FIFOs, how much data that is, and how large these FIFOs are.

This document discusses what I have outlined above more thoroughly:

http://zone.ni.com/reference/en-XX/help/370622M-01/lvrtconcepts/deterministic_apps_timed_loop/

This document outlines non-deterministic tasks:

http://zone.ni.com/reference/en-XX/help/370622M-01/lvrtbestpractices/rt_time_budgeting/

MindyN · ‎05-09-2016

Hi Edjones93,

Thank you for your response.

You've made an excellent point about the limitation of system processor usage.

FYI, my main VI is HUGE. It has 8-9 timed loops and flat sequences running in parallel. I thought it had enough room (it doesn't use up all the CPU cores), but I didn't think of the fact that each core can only run 1 thread at a time. As a result, I sometimes notice that some timed loop didn't get called.

Here is my program architecture:

The MAIN VI has 8 timed loops. Inside 4 of the flat sequences, there is a VI/subVI that has another 5 flat sequences and timed loops.

So, I expected to see a total of ~20 timed loops running at the same time (reading in data for 20 channels from the DMA FIFO).

Do you think reducing it to 3-4 timed loops would help?

Do you recommend any method for running 20 different timed loops effectively (they all work independently)? They don't have to be synchronized, but they need to get data every clock cycle (all 20 timed loops are while loops with a case structure, and each case has a different clock frequency; the fastest case has a 1 MHz clock and the slowest case has a 1 kHz clock).

The 1 ms interval is just an example. I just want them to run one after the other, so I was supposed to have them as 1 ms, 2 ms, 3 ms, and 4 ms.

Thank you in advance,

Mindy

natasftw · ‎05-09-2016

What was the CPU usage for those 20 loops when they ran independently? You can see from there whether it's reasonable for them to all run simultaneously. (The answer is no.)

You need to start looking at your application in a less fragmented way. You want to collect data from 20 sensors. That's easy. What's keeping you from doing so in a single loop? What is the difference in timing? Does it matter if you oversample?
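One way to picture the single-loop idea is a loop that visits every channel on each pass but never waits on a channel that has no data. The sketch below is a Python analogy under stated assumptions: `queue.Queue` stands in for a per-channel FIFO read with zero timeout, and `service_channels` and `handle` are hypothetical names, not LabVIEW or NI APIs.

```python
import queue

def service_channels(channels, handle):
    """One pass over all channels; never waits on an empty channel.

    Returns the number of samples handled during this pass.
    """
    handled = 0
    for idx, ch in enumerate(channels):
        try:
            sample = ch.get_nowait()   # non-blocking read
        except queue.Empty:
            continue                   # no data yet: move on to the next channel
        handle(idx, sample)
        handled += 1
    return handled
```

Because an empty channel is skipped immediately, a slow sensor does not hold up the others; the cost is that each pass touches all 20 channels even when most are idle.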

MindyN · ‎05-10-2016

Hi natasftw,

Thank you for your response.

1. My program is a bit complicated, so let me keep it simple. There are 2 cases in the case structure for each of the 20 loops/sensor readers. In case 1, the timed-loop frequency is 1 MHz (because it needs to sync and do all the security/data integrity checks), which takes up a lot of CPU. This sync process only happens once, at the beginning. For 1 sensor running independently, it takes a total of 100% out of 400% CPU (I have 4 cores). Thus, I cannot do initialization/sync for 20 sensors at the SAME TIME. So, I add a time delay for each of them so that one happens after the other. After it syncs, the timed-loop frequency will be 10 kHz (100 times slower), and CPU usage drops dramatically to 8% out of 400%. Once all sensors have synced, all 20 sensors can run at the same time.

2. For the reason above, I cannot run everything in a single loop. Plus, each sensor has its own code word and they are not synchronized. They have data at different times. I need a real-time system; I cannot afford to wait for data from 20 sensors before I move on to the next case. There are 2 parts here: the FPGA part gets the data and deals with oversampling/undersampling (the FPGA part is doing great), and the RT part receives data via the DMA FIFO. Good data will be in the queue, so I don't have to worry about oversampling or undersampling.

Please let me know if you have any ideas or recommendations for improving this performance,
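The CPU figures in point 1 can be turned into a rough budget. This is a back-of-the-envelope sketch that assumes the per-sensor loads simply add up and ignores OS and scheduling overhead; the constant and function names are illustrative, and the numbers are the ones quoted above.

```python
TOTAL_CPU = 400      # percent available (4 cores)
SYNC_COST = 100      # percent used by one sensor while it syncs at 1 MHz
RUN_COST = 8         # percent used by one sensor after sync
NUM_SENSORS = 20

def load_during_staggered_sync(already_running):
    """CPU load while one sensor syncs and `already_running` others have finished syncing."""
    return SYNC_COST + RUN_COST * already_running

steady_state = RUN_COST * NUM_SENSORS                            # all 20 synced
worst_staggered = load_during_staggered_sync(NUM_SENSORS - 1)    # last sensor syncing
all_at_once = SYNC_COST * NUM_SENSORS                            # every sensor syncing together

print("steady state:", steady_state, "of", TOTAL_CPU)
print("worst case while staggering:", worst_staggered, "of", TOTAL_CPU)
print("all syncing at once:", all_at_once, "of", TOTAL_CPU)
```

Under these assumptions the staggered start stays within the 400% total, while syncing all 20 sensors at once would need five times what the machine has.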

Thank you,

Mindy

MrJackHamilton · ‎05-12-2016

Sorry for posting late here...

I've done a lot of RT systems. There are a few tips and caveats you need to be aware of when developing RT systems.

1. You don't have the protection of a multi-threaded OS, so you can thread-starve a loop, in which case anything NOT happening is possible. I've seen Queue data go missing. This is what you trade off for working in RT.

2. Never have a loop running without SOME timing delay. Even State machines, add a small delay between state calls. This minimizes CPU loading when those loops run.

3. No. 2 is also true if you employ USRs: these single-shot loops, called repeatedly without any delay, will fully load the CPU.

4. Consider the syncopation between multiple loops, meaning that if they synchronize and task the CPU at the same time, you'll get a CPU load spike and, again, stuff does happen.

5. Now that the VxWorks OS is gone, I've seen Linux RT systems hang hard and crash when you overload the cores.

6. FIFOs, if implemented poorly, can add a significant CPU load. You have to 'tune' your RT FIFO receive code to optimize for minimal loading. NI has not implemented any internal loop interrupt on the RT FIFO read. I've come across numerous advanced LabVIEW guys getting tripped up on this.

7. If you don't have the "Get RT Core % Usage" as part of your code...you're not paying attention to an important metric in developing your RT code.

8. Minimize Class calls on the RT. I've recently seen lots of odd problems when deploying classes on the RT, by very experienced LV class programmers. Stuff I've never come across in my years of RT prior to classes. My assumption is that there is more overhead in them and this loads the RT.

9. Keep your code lean. Queues and Notifiers work beautifully. Use Queues to pass data, use Notifiers to pass status information. String arrays have a large LV memory footprint.

10. Structure your code to perform ALL the acquisition and control tasks continuously, i.e. acquire the data even if it's not being saved to disk. Control loops should run, but in a disabled state (shunt out the changing of the control line state, but still run data through the control processing loops). This helps identify what the 'normal operating load' of the RT is; otherwise it is hard to identify problems if the code goes from an idle state to an everything-is-happening state.

Without careful consideration of CPU loading, relatively slow code can tax the RT.
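Tip 2 (always put some delay in a loop, even between state-machine states) can be illustrated with a short Python analogy. This is a minimal sketch, not LabVIEW code: `run_state_machine` is a hypothetical name, the 1 ms default delay is an arbitrary illustrative value, and the `sleep` parameter exists only so the delay can be swapped out when experimenting.

```python
import time

def run_state_machine(states, delay_s=0.001, sleep=time.sleep):
    """Run each state once, in order, yielding the CPU between states."""
    results = []
    for state in states:
        results.append(state())   # do this state's work
        sleep(delay_s)            # small delay so other loops get CPU time
    return results
```

Without the `sleep` call the loop would run its states back to back and hold the core for the whole sequence; with it, every state transition gives the scheduler a chance to run something else.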

Good Luck and Regards

Jack Hamilton
