How to do a data acquisition with fast read and write with tdms

McPhazer · ‎01-19-2019

Hello,

I need to do a data acquisition with fast read (data display) and write for a fairly large dataset. Every 200 ms, an array of 1000 x 1000 doubles comes in (think of it as a camera image). Each array belongs to one value of a control variable, like this:

var=1 array1

var=2 array2

...

var=N arrayN

However, I may scan over the same variable value more than once:

var=1 arrayN+1

So in this example I have two arrays belonging to the same variable value. In that case, I want to store the average of all the arrays with that value of the variable.

It is clear that I cannot hold all the data in memory (at least not in 32-bit, which I am using). The need to store an averaged array rules out using one TDMS file for all the data, because I cannot modify data in an existing TDMS file (at least not without copying the whole file). On the other hand, I would like to display the data in near real time. The easiest approach I came up with is to store the data for each variable value in a separate TDMS file and to replace that file with the averaged dataset whenever more data with the same var comes in.
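The one-file-per-value scheme described above can be sketched as follows. This is only an illustration in Python with a plain binary file standing in for TDMS; the function name `update_average_file`, the file layout (frame count followed by the averaged samples), and the file name `var_1.bin` are assumptions of the sketch, not part of the original setup. It uses only the standard library; for real 1000 x 1000 frames an array library would be the natural choice.

```python
import os
import tempfile
from array import array


def update_average_file(path, new_frame):
    """Fold new_frame into the running average stored at path.

    Assumed layout: one float64 holding the number of frames averaged
    so far, followed by the averaged samples as float64 values.
    Returns the new frame count.
    """
    new = array("d", new_frame)
    if os.path.exists(path):
        old = array("d")
        with open(path, "rb") as f:
            old.frombytes(f.read())
        count = int(old[0])
        avg = array("d", ((a * count + x) / (count + 1)
                          for a, x in zip(old[1:], new)))
        count += 1
    else:
        count, avg = 1, new
    # Write a temporary file first, then swap it in, so a reader
    # never opens a half-written average.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        array("d", [count]).tofile(f)
        avg.tofile(f)
    os.replace(tmp, path)
    return count


# Two acquisitions for the same variable value (tiny stand-in frames).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "var_1.bin")
update_average_file(path, [1.0, 2.0])
update_average_file(path, [3.0, 6.0])
```

Each update reads and rewrites exactly one frame-sized file, so the cost per update does not depend on how many variable values exist.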

So far so good. The question now is whether my approach is also the fastest way to read and display the data, or whether there is a more elegant way.

I would be thankful for any suggestions.

mcduff · ‎01-19-2019

Create a running average of the values you want to average; that way you only need one copy of the array for the average. See below for an example. You can add logic to reset it after a certain number of iterations. The example below needs only two arrays: one for the new values and one for the average.

mcduff

McPhazer · ‎01-19-2019

Thank you for your reply, but this does not help me. Please read my post again. I understand the concept of an average and have no problem applying it. Maybe this did not come across clearly: I do not want the average of all the arrays, but only the average of those with the same var value. The arrays in between (in my example N-1 arrays, with N>1000) are too large to hold in memory. I need to store them to disk, and I am worried about the read time needed to reconstruct the dataset.

mcduff · ‎01-19-2019

I guess I do not understand what you are looking for.

You have a 1000 x 1000 image; assuming 16 bits per pixel, that is 2 MB of data per image. Your images arrive every 200 ms, so you need to write 10 MB/s to disk. That seems doable to me with a modern computer.
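The arithmetic behind that estimate can be checked with a few lines. This is a minimal sketch; the helper name `write_rate_mb_per_s` is made up for the illustration, and 1 MB is taken as 10^6 bytes. The second call uses 8-byte doubles, as stated in the first post, rather than the 16-bit samples assumed in the reply.

```python
def write_rate_mb_per_s(width, height, bytes_per_sample, period_s):
    """Sustained write rate in MB/s (1 MB = 10**6 bytes), one frame per period."""
    frame_bytes = width * height * bytes_per_sample
    return frame_bytes / period_s / 1e6


# 16-bit samples, as assumed in the reply: about 10 MB/s.
rate_16bit = write_rate_mb_per_s(1000, 1000, 2, 0.2)

# 8-byte doubles, as described in the first post: about 40 MB/s.
rate_double = write_rate_mb_per_s(1000, 1000, 8, 0.2)
```

With doubles, each frame is 8 MB rather than 2 MB, so the required rate is four times higher.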

McPhazer wrote: "On the other hand, I would like to display the data in near real time. The easiest approach I came up with is to store the data for each variable value in a separate TDMS file and to replace that file with the averaged dataset whenever more data with the same var comes in."

In the example I showed before, the average is held in the shift register of the while loop, so there is no need to store all the previous arrays, read them back, and then average them.

McPhazer wrote: "The arrays in between (in my example N-1 arrays, with N>1000) are too large to hold in memory. I need to store them to disk, and I am worried about the read time needed to reconstruct the dataset."

That is why you use a running average. The sum of the previous N-1 iterations is equal to (N-1)*average. To get the new average add the current value to this sum and divide by N. This means rather than reading N-1 datasets, you just store 1 dataset. This is how an oscilloscope averages data.
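Written out, the update is avg_N = ((N-1) * avg_(N-1) + x_N) / N. A minimal text sketch of the same rule in Python follows, since a LabVIEW diagram cannot be shown as text; the pair `(average, count)` plays the role of the shift registers, and the function name `update_running_average` and the tiny two-element frames are assumptions of the sketch.

```python
def update_running_average(average, count, new_frame):
    """Return (new_average, new_count) after folding in one more frame.

    `average` holds the mean of the first `count` frames and is
    ignored when count is 0.
    """
    n = count + 1
    if count == 0:
        return list(new_frame), n
    # (count * average) is the sum of all previous frames.
    return [(a * count + x) / n for a, x in zip(average, new_frame)], n


average, count = [], 0
# Stand-ins for flattened images arriving one at a time.
for frame in ([1.0, 10.0], [3.0, 20.0], [5.0, 30.0]):
    average, count = update_running_average(average, count, frame)
```

Only the current frame and the average are alive at any time, regardless of how many frames have been folded in.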

That is where I am trying to help: a running average consists of just one dataset, so you can write all the previous data to disk and keep a single running-averaged dataset. Otherwise, post some code or restate your question, because I do not understand.

mcduff

EDIT:

McPhazer wrote: "I do not want the average of all the arrays, but only the average of those with the same var value."

Add the logic for this case in your loop, along with another shift register to keep track of the count.
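One way to read that suggestion is a running average and a count kept per var value. The sketch below is in Python with hypothetical names (`PerVarAverager`, `add`, `count`). It keeps every average in memory, which shows the bookkeeping only: with more than 1000 full-size frames on a 32-bit system, the averages themselves would have to live on disk.

```python
class PerVarAverager:
    """Keep one running average and one frame count per var value."""

    def __init__(self):
        self._average = {}
        self._count = {}

    def add(self, var, frame):
        """Fold `frame` into the average for `var` and return that average."""
        n = self._count.get(var, 0) + 1
        if n == 1:
            self._average[var] = list(frame)
        else:
            previous = self._average[var]
            # Incremental form of ((n-1)*avg + x) / n.
            self._average[var] = [a + (x - a) / n
                                  for a, x in zip(previous, frame)]
        self._count[var] = n
        return self._average[var]

    def count(self, var):
        """Number of frames averaged so far for `var`."""
        return self._count.get(var, 0)


averager = PerVarAverager()
averager.add(1, [2.0, 4.0])
averager.add(2, [10.0, 10.0])
repeat = averager.add(1, [4.0, 8.0])  # second scan over var=1
```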

McPhazer · ‎01-19-2019

Thank you. See my edited post. The point is that I do not want the average of all the arrays but the average of those with the same var value.

McPhazer · ‎01-19-2019

The other point is that I want to display the data at the same time. This is then a 3D dataset (X, Y, var), and I want to display certain planes that do not correspond to a single var value, e.g. a plane (X0, Y, var) with X0 constant and Y and var running over all values.
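To illustrate what reading such a plane costs, here is a sketch that assumes the frames sit back to back in one plain binary file of float64 values, ordered by var, then X, then Y. That layout, the function name `read_plane_x0`, and the tiny demo dimensions are all assumptions for the illustration and say nothing about how TDMS lays out its data. Under this layout, a plane at fixed X0 needs one seek and one contiguous row read per var value.

```python
import os
import tempfile
from array import array

ITEM = 8  # bytes per float64 sample


def read_plane_x0(path, n_var, n_x, n_y, x0):
    """Return the plane at fixed x0 as n_var rows of n_y values each."""
    plane = []
    with open(path, "rb") as f:
        for v in range(n_var):
            # Jump to the start of row x0 inside frame v.
            f.seek((v * n_x * n_y + x0 * n_y) * ITEM)
            row = array("d")
            row.fromfile(f, n_y)
            plane.append(list(row))
    return plane


# Build a tiny demo cube where each sample encodes its own indices.
n_var, n_x, n_y = 3, 2, 4
demo_path = os.path.join(tempfile.mkdtemp(), "cube.bin")
with open(demo_path, "wb") as f:
    for v in range(n_var):
        array("d", (100 * v + 10 * x + y
                    for x in range(n_x)
                    for y in range(n_y))).tofile(f)

plane = read_plane_x0(demo_path, n_var, n_x, n_y, x0=1)
```

A plane at fixed Y0 would be less friendly under the same layout, because its samples are not contiguous within a frame.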

McPhazer · ‎01-19-2019

The full dataset is typically larger than 2 MB * 1000.

mcduff · ‎01-19-2019

Now I think I understand what you want. You have a bunch of 2D slices of a 3D object, and you want to average repeated acquisitions of the same slice together. With that many images at that size, and without a 64-bit system, you will probably need to store them in a file. You could make multiple files or one ginormous file. I would test both cases; I am not sure whether TDMS holds all of that data in memory while the file is open, or only what has just been written. I would try the ginormous file first: write a 1000 x 1000 array to it in a loop and watch whether the memory grows. With one file you could keep the reference open and loop through the channels for averaging, and yes, it will not be real time unless you have a RAID array.

Good luck.

mcduff

McPhazer · ‎01-20-2019

I am thankful for your willingness to help me, and I agree that my post is not easy to grasp. Nevertheless, comparing the speed of one large TDMS file with many small TDMS files is not an experiment I need to run; I know the outcome. As written in my first post, the problem is that I cannot replace data in a TDMS file. The only way to replace data is to open the file and write it out to a new file, substituting the array I need to replace. I therefore already know that one large (non-partitioned) TDMS file costs too much time, because the time grows with N.

I have no problem adding a RAID 0 system (or just a latest-generation Samsung SSD as scratch space). Nevertheless, before throwing hardware at the problem, the software architecture has to be right.

What I am more interested in discussing is whether the described TDMS architecture (possibly with asynchronous read/write) is the right choice, or whether there are other solutions. I have heard of HDF5 (but I do not know whether I can replace data there on the fly with multiple clients accessing the same file), and I guess there might be a way to do the same with an SQL database. However, I do not know how these options differ in speed or in which hardware resources they load; I imagine that SQL would put load on the network controller, which might be a problem for me because I need to synchronize to an NTP server and that load might slow down the synchronization.
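For comparison only, the requirement itself (replace one averaged frame without touching the others) can be sketched with a plain binary file of fixed-size slots, one slot per var value. This is neither TDMS, HDF5, nor SQL, and it is not something proposed in this thread; the slot layout, the function name `fold_frame_in_place`, and the tiny frame size are assumptions of the sketch, and it does not address concurrent access by multiple clients.

```python
import os
import tempfile
from array import array

ITEM = 8  # bytes per float64 sample


def fold_frame_in_place(f, slot, count, new_frame):
    """Update the running average stored in `slot` of the open file f.

    `count` is the number of frames already averaged in that slot.
    Only this slot is read and rewritten. Returns the new count.
    """
    n = len(new_frame)
    offset = slot * n * ITEM
    f.seek(offset)
    if count == 0:
        avg = array("d", new_frame)
    else:
        old = array("d")
        old.fromfile(f, n)
        avg = array("d", (a + (x - a) / (count + 1)
                          for a, x in zip(old, new_frame)))
        f.seek(offset)  # go back to the start of the slot before writing
    avg.tofile(f)
    return count + 1


FRAME_LEN, N_SLOTS = 2, 3
store_path = os.path.join(tempfile.mkdtemp(), "averages.bin")
with open(store_path, "w+b") as f:
    f.truncate(N_SLOTS * FRAME_LEN * ITEM)  # preallocate zero-filled slots
    count = fold_frame_in_place(f, 1, 0, [2.0, 4.0])
    count = fold_frame_in_place(f, 1, count, [4.0, 8.0])
    f.seek(1 * FRAME_LEN * ITEM)
    slot1 = array("d")
    slot1.fromfile(f, FRAME_LEN)
```

The cost of one update is one frame read plus one frame write, independent of N, which is the property the single large TDMS file lacks according to the discussion above.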

mcduff · ‎01-20-2019

Depending on which version of LabVIEW you have, you could theoretically read the data from the channel you want to average, delete that data (using the TDMS Delete Data function), and then replace the data in the channel with the new averaged data. I do not have LabVIEW here to try it out.

mcduff
