Best practices to manage large array inside an event loop?

Gurdas · ‎05-30-2013

My program uses a multi-event structure, where each event corresponds to a Boolean value change by the user. The program begins by reading a .csv file of string values, typically 50000 rows and 100 columns. I add some columns, predict the final array size, and write to two array controls: input and output. The input array always shows the original data as read from the file. The output array shows the current state of the data. The user then presses buttons that trigger events; in each event I read the output array value, manipulate the data, and write it back to the output array. If an event has to manipulate only the 7th column (for example), then I index that column, manipulate it, and replace that column in the control. I am using the 'Value' property node to read and write data to the output array. I NEVER use the Build Array function. I have to update the control at the end of each event to allow the user to see the effect of the event.

Typically, my program runs fast. I can go through all events and even repeat them at a good speed. However, when I read a .csv file a second time, the program slows down, and the third time it is even slower. Each time I read a file, I predict the final size of the input and output arrays and refresh the controls with the new values from the .csv file. What am I doing wrong?

1. Should I use a shift register in my event structure to read and update the output array instead of using a property node? Or should I read and write using a reference to the control?

2. When I read a control using a property node, am I creating a second copy of the data?

3. I believe each time I read or write a control or indicator, LabVIEW switches threads. This may be causing some delay, but I doubt it is the reason for my troubles. Right or wrong?

4. Is there a way to deallocate all input and output array control memory? Would it help to write a null array before I refill the controls with a new .csv file?

I can add screenshots of my code. I cannot provide the complete program due to security issues.

Using LV 8.5 Professional on Windows 7 64-bit.

Thanks,

Gurdas

Gurdas Sandhu, Ph.D.
ORISE Research Fellow at US EPA

moderator1983 · ‎05-30-2013

Sharing your code and file will help us get some insight into the problem.
If you've implemented your code using a single loop, you may want to try a 'Producer/Consumer Design Pattern (Events)' architecture.

I am not allergic to Kudos, in fact I love Kudos.

Make your LabVIEW experience more CONVENIENT.

Gurdas · ‎05-30-2013

I've attached two screenshots. One showing the event where I read the .csv file and write to two controls - "Data from File" and "Output Data". The second screenshot shows how for an event, I read "Output Data", manipulate what I need to, and update "Output Data". My program has about 20 events where I manipulate the data held in "Output Data" control.

moderator1983 · ‎05-30-2013

You are certainly facing issue(s) related to memory management.

Reading a file, executing loops, etc. inside an event structure is not advisable. It would improve your program to change the architecture, perhaps to two loops: one to handle all user events (producer) and the other to work on the data (consumer). That alone is not going to solve the issue, though, because you are reading the file multiple times, and that is building up the memory consumption of your LabVIEW application.

If reading the file repeatedly really is a requirement, then think of doing it in a smarter way so that additional memory allocation can be avoided.

It seems there is a lot of scope to make your code more efficient.

Gurdas · ‎05-30-2013

I looked at this page about the Producer/Consumer design: http://www.ni.com/white-paper/3023/en

I doubt my application will benefit from this. I never have two competing loops. The user never needs data from two files simultaneously. Once the data from a file is read, the user clicks on buttons to process the data. This process is always one button/process at a time. Once the user is done processing, the data held in the "Output Data" control is saved. At this time, the user can either close the software OR read another file and process the data.

crossrulz · ‎05-30-2013

Gurdas wrote:

1. Should I use a shift register in my event structure to read and update the output array instead of using a property node? Or should I read and write using a reference to the control?

2. When I read a control using a property node, am I creating a second copy of the data?

3. I believe each time I read or write a control or indicator, LabVIEW switches threads. This may be causing some delay, but I doubt it is the reason for my troubles. Right or wrong?

4. Is there a way to deallocate all input and output array control memory? Would it help to write a null array before I refill the controls with a new .csv file?

1. Yes, you really should be saving the data in a shift register.

2. You are making copies you don't need to.

3. When you read from or write to a property node, you force the UI thread to run. So you cause a thread swap for each read and write of the array.

4. Just write the new array to the shift register.
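
LabVIEW is graphical, so the difference between answers 1 and 2 can only be sketched in a text language. The Python below is a toy model, not LabVIEW code: the `Indicator` class and its copy-on-read, copy-on-write behavior are assumptions made for illustration, and `double_column` is a hypothetical stand-in for one event's processing. It only counts how many full-array copies each approach makes.

```python
import copy


class Indicator:
    """Toy stand-in for a front-panel array (hypothetical, not a LabVIEW API)."""

    def __init__(self):
        self._data = []
        self.copies = 0  # number of full-array copies made so far

    def get_value(self):
        # Assumption: reading the value hands the caller its own copy.
        self.copies += 1
        return copy.deepcopy(self._data)

    def set_value(self, data):
        # Assumption: writing stores a separate copy for display.
        self.copies += 1
        self._data = copy.deepcopy(data)


def double_column(table, col):
    """Example 'event': modify one column in place."""
    for row in table:
        row[col] *= 2


def run_via_indicator(table, n_events):
    """Every event round-trips the whole array through the indicator."""
    ind = Indicator()
    ind.set_value(table)
    for _ in range(n_events):
        data = ind.get_value()   # copy out
        double_column(data, 0)
        ind.set_value(data)      # copy back in
    return ind._data, ind.copies


def run_via_loop_state(table, n_events):
    """The array stays in a loop variable, playing the role of a shift register."""
    ind = Indicator()
    state = table
    for _ in range(n_events):
        double_column(state, 0)  # works on the same buffer on every pass
    ind.set_value(state)         # a single display update at the end
    return state, ind.copies
```

In this model the first version makes two copies per event plus one for the initial write, while the second makes one copy in total, whatever the number of events.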

There are only two ways to tell somebody thanks: Kudos and Marked Solutions
Unofficial Forum Rules and Guidelines
"Not that we are sufficient in ourselves to claim anything as coming from us, but our sufficiency is from God" - 2 Corinthians 3:5

Gurdas · ‎05-30-2013

Thanks, crossrulz, that helps.

1. I suspected a shift register would help and will implement that during the weekend. The nature of the application is that sometimes the user would like to see the result of a processing step. Maybe instead of auto-updating the output control after each step, I should provide a dedicated button to write the current shift register data to the control?

2. This surprised me. I thought local variables create copies but reading values through a property node does not. Is there an application document somewhere that explains more about what exactly happens when I read a value through a property node? I use property nodes so extensively that I want to be sure I need to go fix all of them. I had Task Manager running and yes, each time I clicked a button to manipulate data, LabVIEW's memory usage jumped significantly. By the time I got to the end, LV was consuming 2 GB of physical memory. It was 1 GB when I first read the file and wrote to the output control. If I delete the wires writing to the input and output arrays, LV was consuming 0.5 GB at the end of the file read.

2a. If property nodes are not an option, and neither are wires, local variables, or shift registers, what should I do? If I use a reference to the control, is that any better?

3. Okay.

4. Great. If you look at the screenshot attached in an earlier post, you will notice that I read a file, add more columns, and write to an output control. All of this is an event triggered by the user clicking a "read file" button. My question about using a shift register is: do I need to initialize it outside of my events? Even if I did, I will not know the required size until I've read the file. Or should I simply write the array read from the file to the shift register? Note, once I write to the shift register, I do NOT change its size. I am always replacing elements or columns. But that file read and shift register update is a one-time change in the size of the shift register. Is that okay?

- Gurdas

crossrulz · ‎05-30-2013

Gurdas wrote:

1. I suspected a shift register would help and will implement that during the weekend. The nature of the application is that sometimes the user would like to see the result of a processing step. Maybe instead of auto-updating the output control after each step, I should provide a dedicated button to write the current shift register data to the control?

Is it a control or an indicator? If it is just an indicator, write to the terminal right before the event structure (from the shift register).

Gurdas wrote:

2. This surprised me. I thought local variables create copies but reading values through a property node does not. Is there an application document somewhere that explains more about what exactly happens when I read a value through a property node? I use property nodes so extensively that I want to be sure I need to go fix all of them. I had Task Manager running and yes, each time I clicked a button to manipulate data, LabVIEW's memory usage jumped significantly. By the time I got to the end, LV was consuming 2 GB of physical memory. It was 1 GB when I first read the file and wrote to the output control. If I delete the wires writing to the input and output arrays, LV was consuming 0.5 GB at the end of the file read.

A property node and local variable pretty much work the same way, except they go through different hoops to get the value to the control/indicator. The property node is a lot more involved.

Gurdas wrote:
2a. If property nodes are not an option, and neither are wires, local variables, or shift registers, what should I do? If I use a reference to the control, is that any better?

Nope. References use property nodes, so it is just as bad. Why wouldn't wires or shift registers be an option?

Gurdas wrote:
4. Great. If you look at the screenshot attached in an earlier post, you will notice that I read a file, add more columns, and write to an output control. All of this is an event triggered by the user clicking a "read file" button. My question about using a shift register is: do I need to initialize it outside of my events? Even if I did, I will not know the required size until I've read the file. Or should I simply write the array read from the file to the shift register? Note, once I write to the shift register, I do NOT change its size. I am always replacing elements or columns. But that file read and shift register update is a one-time change in the size of the shift register. Is that okay?

Initialize it with your first read. If you don't read until an event is performed, then just initialize the shift register as an empty array. If you read again, just write the new value to the shift register and let LabVIEW worry about the memory management.

Gurdas · ‎05-30-2013

crossrulz, I updated my code and replaced all property nodes of the output array indicator with a shift register. I see some speed/memory improvements (more about that later).

1. Sorry about the wrong terminology. My output array is an indicator and not a control. I was using the word since I use this array to control the saved output. I am not sure I understood what you meant by writing right before the event structure. Did you mean placing the terminal between the outer while loop (which has the shift register) and the event structure? Wouldn't that update the indicator each time the while loop runs? For lack of a better solution, I created an event to update the indicator using the shift register value. See attached screenshot, "Gurdas_3*.jpg"

2. Okay.

2a. Wires won't work because controls and indicators are being manipulated in different events. Shift registers could work, but that means having many shift registers, one for each control or indicator. I've read about Feedback Nodes, but I am not sure if they exist in LV 8.5.1. If they do, is that an option?

4. See attached screenshot "Gurdas_4*.jpg". I initialize the shift register with an empty array and then update it with the data when the file is read. Is this what you meant? I still have an input data indicator that I am writing to. But this happens only once; this indicator is not read or updated anywhere else.

4a. Notice the 3 data wires branching out just after the file read? Is that creating 3 copies by any chance?

5. See attached screenshot "Gurdas_5*.jpg". This is from the most computation-intensive event that uses the current data from the shift register. I am indexing many columns out. Is that creating extra copies of the columns?

About the improvements from using shift register. LV's memory usage as reported by Task Manager:

A. Before file read 70 MB
B. After file read 750 MB (includes writing one time to the input data indicator; if I do not write, memory usage is 450 MB)
C. After completing all steps but never writing to the output indicator array 1.1 GB
D. After completing all steps and writing to the output indicator array 1.5 GB (this used to be 2 GB previously)

The file I am reading is 47000 rows and 78 columns. I read the file as strings. The on-disk size of the file is less than 20 MB.

6. Comparing steps A and B above, why does memory use increase so much more than the actual size of the file being read?

Thanks for helping me here!

johnsold · ‎05-30-2013

Gurdas,

Regarding items 5 and 6. Read from Spreadsheet File.vi tends to make copies of the data. I am not sure when the output is a string array whether it makes copies or not. Certainly when doing the standard conversion to an array of double precision numerics it makes copies.

It appears that you eventually convert everything except the header rows to numeric. I suggest you read as strings only the header rows. Then read in chunks of a thousand rows or so at a time using the automatic conversion to numeric arrays and Replace Array Subset to populate the numeric array. This method (assuming you do not display the results on indicators or make extra copies elsewhere in your code) should allow you to read the file, get the headers, and have an array of the numeric data without going above about 40-50 MB of RAM.
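
A quick back-of-the-envelope check of the figures reported in this thread, written as a short Python calculation. It uses only numbers quoted above (47000 rows, 78 columns, under 20 MB on disk, 70 MB before and 450 MB after the read without the indicator write). Decimal megabytes are an assumption, and the per-cell figure for strings is only an upper bound on the cost per string, since the Task Manager number also includes any copies the read makes.

```python
ROWS, COLS = 47_000, 78           # file dimensions reported in the thread
MB = 1_000_000                    # assumption: decimal megabytes

cells = ROWS * COLS               # total number of values in the file

# Storage if every cell is held as one 8-byte double-precision number.
double_bytes = cells * 8

# Observed growth when the file is read as strings (70 MB -> 450 MB).
string_growth_bytes = (450 - 70) * MB
string_bytes_per_cell = string_growth_bytes / cells

# Average on-disk size per cell, using the 20 MB upper bound.
disk_bytes_per_cell = 20 * MB / cells

print(f"cells: {cells}")
print(f"as doubles: {double_bytes / MB:.1f} MB")
print(f"observed growth per cell as strings: {string_bytes_per_cell:.0f} bytes")
print(f"on-disk size per cell: {disk_bytes_per_cell:.1f} bytes")
```

A single numeric copy of the data comes to roughly 29 MB, which leaves room for headers and a working chunk within the 40-50 MB estimate, whereas the observed growth with strings is on the order of 100 bytes per cell against about 5 bytes per cell on disk.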

Keeping all the data from the file in a string array and converting most of it to numeric arrays guarantees that you have multiple copies.

When crossrulz was referring to initializing an array, he was likely suggesting that the array be created outside the loop at a size equal to or larger than the maximum size of the expected array. Then inside the loop use Replace Array Subset to put the data from the file into the shift register. This reuses the allocated memory for the array and does not make copies.
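
The read-in-chunks and preallocate-then-replace idea can be sketched in a text language as follows. This is a rough Python analogue rather than LabVIEW code: the function name `read_numeric_csv`, its parameters, and the assumption that the row and column counts are known (or overestimated) before reading are all hypothetical. The slice assignment plays the role of Replace Array Subset.

```python
import csv
import io
from array import array
from itertools import islice


def read_numeric_csv(text_stream, n_header_rows, n_rows, n_cols, chunk_rows=1000):
    """Read header rows as strings, then fill a preallocated numeric buffer."""
    reader = csv.reader(text_stream)

    # Only the header rows are kept as strings.
    headers = [next(reader) for _ in range(n_header_rows)]

    # Allocate the full row-major buffer once, before any data is read.
    data = array("d", [0.0]) * (n_rows * n_cols)

    row = 0
    while row < n_rows:
        # Pull at most chunk_rows rows, never more than the buffer can hold.
        chunk = list(islice(reader, min(chunk_rows, n_rows - row)))
        if not chunk:
            break  # file ended early; remaining rows stay at 0.0
        for fields in chunk:
            values = array("d", (float(f) for f in fields))
            if len(values) != n_cols:
                raise ValueError(f"row {row} has {len(values)} columns, expected {n_cols}")
            start = row * n_cols
            # Same-length slice assignment overwrites in place; the buffer is not resized.
            data[start:start + n_cols] = values
            row += 1

    return headers, data, row


if __name__ == "__main__":
    sample = "a,b,c\n1,2,3\n4,5,6\n7,8,9\n"
    headers, data, filled = read_numeric_csv(io.StringIO(sample), 1, 3, 3, chunk_rows=2)
    print(headers, list(data), filled)
```

Only one chunk of strings is alive at a time, so peak memory is roughly the numeric buffer plus one chunk, instead of the whole file held as strings alongside its numeric conversion.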

Lynn
