
Large data Application

 

Dear Community,

I have a requirement to develop a program that works with large datasets. Normally, when working with small or medium datasets, I use clusters that contain all the data needed; these are available in every case of my state machine via shift registers.

I am thinking about what to use in the new program instead of shifting clusters with large datasets around. I was considering an FGV (functional global variable) to set and get data values, or even going object-oriented. I am new to OOP. One question I am asking myself is:

If I define the class data to be the large data cluster, what is the best way to implement class methods to read and set the values of that cluster? Bear in mind that sometimes I just want to read/update one value, and other times up to 300 or 400 values.

 

Appreciate your feedback.

 

Message 1 of 7

How much is "large"? It may be that your approach of passing the data around in a shift register works fine. Have you seen any issues? Don't change things unless you actually need to.

 

If you are actually going to change things, an FGV isn't going to help: FGVs are used for encapsulating functions or making data available without a wire, not for solving large-data problems.

A class is also not inherently helpful. Classes, too, are for encapsulation purposes (and others I won't go into); on the wire, a class is still pretty much just a cluster.
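To make the encapsulation point concrete, here is a rough text-language analogy in Python (an assumption of this sketch; LabVIEW is graphical and has no direct textual equivalent). An FGV is essentially static storage behind get/set actions, and a "get" still delivers the whole dataset to each caller:

```python
# Rough Python analogy of a LabVIEW FGV ("action engine").
# The module-level dict plays the role of the uninitialized shift register.
_store = {}

def fgv(action, key=None, value=None):
    """'set' stores a value under a key; 'get' returns it.

    In LabVIEW, the 'get' action still puts the whole cluster on the
    caller's wire, so an FGV encapsulates access to the data but does
    not by itself reduce copies or memory use."""
    if action == "set":
        _store[key] = value
    elif action == "get":
        return _store[key]

fgv("set", "setpoint", 42.0)
current = fgv("get", "setpoint")
```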

 

If you have large-data problems (typically running out of RAM), there are various methods that could help. Post your VI and a bit more information, such as how fast data is being acquired and how big the data array gets. Careful array manipulation can solve most issues, often using the In-Place Element structure.

Ian
LabVIEW since 2012
Message 2 of 7

I recently faced a similar situation where I was collecting and monitoring 24 behavioral stations, each running an asynchronous "clone" VI that took data from a VISA port at 10 points/sec.  I wanted to be able to see a graph of the last 5, 25, or 125 minutes of data from a selected station (displayed in a sub-panel on the top-level VI).  I implemented this with three 600-point arrays (one for each "scale factor"), updating the data as needed when a new data point came in (i.e. every 100 msec).

 

Needless to say, when what you mainly want to do is change a single point without taking too much time (as you have 24 of these tasks trying to run simultaneously), managing three 600-point arrays by passing the arrays themselves around is not a good idea.

 

I'd heard about Data Value References as a means of working "in place" with large data structures, but had never tried them.  They turned out to be fairly straightforward and worked really well, meaning that (a) they were fairly easy to program, (b) it was fairly easy to understand and explain to my colleagues what was happening, and (c) everything seemed to work very nicely.  I didn't time how long it takes to update a point, but looking at the routine called "Add a Data Point" (which might have to "touch" all three of the 600-point graphs, and which takes a data point plus the indexing information stored alongside the actual graph), I'd guesstimate it runs in considerably less than a millisecond at most, possibly microseconds if the new point doesn't need to be plotted on the highest-resolution plot.  The secret, of course, is that DVRs work with an In-Place Element structure, so the only thing you pass around is the address of the graph arrays, not the arrays themselves.
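For readers outside LabVIEW, the multi-scale history idea above can be sketched in Python (an analogy, not the actual VI; the 600-point size and the 5/25/125-sample decimation factors are assumptions that match 5/25/125 minutes of history at 10 points/sec):

```python
from collections import deque

class MultiScaleHistory:
    """Keep fixed-size histories of one channel at several time scales;
    each new sample touches at most three buffers."""

    def __init__(self, n_points=600, decimation=(5, 25, 125)):
        # deque(maxlen=...) silently drops the oldest point: a ring buffer.
        self.buffers = {d: deque(maxlen=n_points) for d in decimation}
        self.count = 0

    def add_point(self, value):
        self.count += 1
        for d, buf in self.buffers.items():
            if self.count % d == 0:
                buf.append(value)  # O(1), independent of history length
```

The cost per sample is constant regardless of how much history is kept, which is what makes 24 such tasks viable at 10 points/sec each.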

 

Bob Schor 

Message 3 of 7

I like to use DVRs for large data sets. You can pass around the DVR reference in your state machine or to another loop to limit any data copies.

 

The big question is: what do you want to do with your data? If you just want to save it, then I recommend saving it in chunks. If you want to do analysis on it, things can get tricky as far as memory copies go, depending on what you are doing.
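As a rough illustration of "saving in chunks" (a Python sketch with an arbitrary chunk size, not mcduff's actual code), the point is to flush a small buffer to disk repeatedly so memory stays bounded no matter how long the acquisition runs:

```python
import array

def save_in_chunks(samples, path, chunk=1000):
    """Append float samples to a binary file, flushing every `chunk`
    values so the in-memory buffer stays small and bounded."""
    buf = array.array("d")
    with open(path, "ab") as f:
        for s in samples:
            buf.append(s)
            if len(buf) >= chunk:
                buf.tofile(f)   # write this chunk to disk
                del buf[:]      # reuse the same buffer
        if buf:
            buf.tofile(f)       # flush any remainder
```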

 

cheers,

mcduff

Message 4 of 7

Thank you for the replies. I am building the program from scratch. I have about 1000 database entries that I need to read into a cluster and then update over the course of various events, such as writing values from different instruments. The number of DB entries could also grow to around 2000. I do not yet know whether I will get memory issues if I program the application using clusters and pass them around; naturally, memory usage also depends on the application's overall tasks. I am thinking ahead of time about what alternatives I have if I do run into memory problems.

From your posts I gather the most reasonable way would be to use data value references. 

Message 5 of 7

The suggestion to use DVRs makes some assumptions.  The first is that, whatever the situation, all of the data (your array of 2000 clusters, for example) can fit in memory at once.  If each cluster fits in a few KBytes, you should be OK on this front, as you can certainly accommodate a multi-MByte array.  What the DVR approach accomplishes is avoiding passing this monster around when you want to access a particular element, say "update the ID String in array element 1234".  With DVRs, you pass the DVR (a pointer, in essence) to an In-Place Element (IPE) structure that gives you access to the data structure.  Inside this IPE you put another IPE that gets you array element 1234, which you pass to yet another Unbundle/Bundle IPE that gets you the ID String.  While it looks a little intimidating with all of these nested In-Place Element structures, each is a very simple "go get me one thing" operation and, best of all, it works!
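The nested "reference in, touch one thing, reference out" pattern can be sketched in a text language (a Python analogy with made-up field names; LabVIEW's DVR and IPE structures have no direct textual equivalent):

```python
from dataclasses import dataclass

@dataclass
class Entry:
    id_string: str
    value: float

# The "DVR": one shared list of 2000 entries. Every holder of this
# name has a reference to the same list, not a private copy.
entries = [Entry(id_string=f"ID{i}", value=0.0) for i in range(2000)]

def update_id(entries_ref, index, new_id):
    # Mutate one field of one element in place; the 2000-element list
    # is never copied, only the reference is passed around.
    entries_ref[index].id_string = new_id

update_id(entries, 1234, "UPDATED")
```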

 

Bob Schor

Message 6 of 7

A DVR may work for your case; it depends on what is in your cluster and how many clusters you would like to keep in memory at one time.

 

My typical use case for a DVR is when I want to hold a large array (millions of points or more) in memory for analysis, display, etc. That being said, DVRs can be a hassle to use, and getting the use case right is often a difficult proposition. See https://forums.ni.com/t5/LabVIEW/LabVIEW-2015-Buffer-Allocation-Bug/td-p/3300392

 

I'll probably be accused of blasphemy here, but in the DVR case I do not think the LabVIEW compiler is smart. With a DVR you are explicitly telling the compiler to do an operation in place, yet the compiler sometimes does not follow your instructions (see the link above). If you are using LabVIEW 2017, the situation is better and DVRs are a little bit smarter.

 

So what I am trying to say is: if your cluster is not particularly large, a DVR may be more trouble than it is worth.

 

mcduff 

Message 7 of 7