TDMS flexibility / performance


@YongqingYe wrote:

From my personal perspective, the VI attached is still too complicated. Is it possible to get a much simpler VI just to reproduce the problem of TDMS performance you described, without any queue, case structure, and so on?


Yes, that might be possible if I could freely choose the format in which the data is presented to the routine, but alas I cannot.  We have a process running on an RT controller sending data packets back regularly.  These measurements can scale up to 32 SGL channels with up to 1M points per channel (128 MB), forward and backward scans (256 MB), up to one thousand repeats (256 GB), and then up to a thousand of these in the 2D direction and possibly also in the 3D direction, making a grand total of up to 256 PB of data.  Of course typical datasets will be far smaller, but our current implementation tries to absolutely minimise memory footprint, since even 256 MB for a single 1D scan starts to get uncomfortable once we factor in certain data copies (querying the data for display and so on).
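Just to spell out the arithmetic behind those numbers, here is a quick Python sketch (assuming 4 bytes per SGL value and decimal units):

SGL_BYTES = 4                                  # single-precision float
single_scan  = 32 * 1_000_000 * SGL_BYTES      # 128 MB: 32 channels x 1M points each
fwd_and_back = 2 * single_scan                 # 256 MB for a forward + backward pair
all_repeats  = 1000 * fwd_and_back             # 256 GB for one thousand repeats
full_2d      = 1000 * all_repeats              # 256 TB across the 2D direction
full_3d      = 1000 * full_2d                  # 256 PB worst case with the 3D direction
print(single_scan, fwd_and_back, all_repeats, full_2d, full_3d)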

 

A lot of the final functionality is not included here.  We need to maintain a running average of forward and backward scans over each repeat, and I have been able to utilise the advanced TDMS functions for this.  In this way I can read in the old values, add the new values and overwrite the old values (something not possible with the standard TDMS functions).  Then, when the repeats are all finished, we simply read in the data piece by piece, divide by the number of repeats and again overwrite.  All this is required by the final version, but even the simpler arrangement shown here is showing some big problems.
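Since I can only post the idea as text here, the accumulate-and-finalise pattern looks roughly like the following Python sketch (a numpy memmap on a plain binary file; the file name, point count and function names are made up for illustration, and this is not the TDMS Advanced API itself):

import numpy as np

POINTS   = 1_000_000                  # points per channel, assumed for the sketch
ACC_FILE = "running_average.f32"      # hypothetical accumulator file on disk

# Create the accumulator once, zero-filled.
np.memmap(ACC_FILE, dtype=np.float32, mode="w+", shape=POINTS)[:] = 0.0

def accumulate(new_scan):
    # Read the old values, add the new scan, overwrite the old values in place.
    acc = np.memmap(ACC_FILE, dtype=np.float32, mode="r+", shape=POINTS)
    acc += new_scan
    acc.flush()

def finalise(repeats):
    # After the last repeat, read the sums back, divide by the repeat count and overwrite.
    acc = np.memmap(ACC_FILE, dtype=np.float32, mode="r+", shape=POINTS)
    acc /= repeats
    acc.flush()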

 

I was originally surprised that my own buffering routine (in combination with disabling buffering for TDMS) actually made the write-time deterioration significantly worse.

 

Here is a graph of the write times required by my routine over a number of iterations.  Note that the final file size here is approximately 1.5 GB.

 

Bare basics TDMS write.vi Front Panel _.png

 

Initially, both data and meta data are written; then I switch off the meta data portion.  The write speed becomes constant (3300 ms), but re-activating the meta data writing seems to significantly slow down the entire process, and it also seems to get slower as a function of the file size.  The second period of writing only data is again constant, but the final write speeds are over five times slower than in the first data-only period (approx. 17000 ms versus 3300 ms).  Something weird is going on with the writing of meta data here.  If I'm doing something wrong, I'd love to know what.  Does writing new meta data have to touch already-written meta data for the same channel / group?
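For anyone who wants to see the shape of the test without opening the VI: each iteration writes one block of samples and, in the "meta data" phases, also defines a fresh channel.  A rough text-only analogue in Python, using the npTDMS package purely for illustration (whether npTDMS re-emits meta data for an unchanged channel is its own implementation detail, so this only mirrors the structure of the comparison, not our LabVIEW code):

import time
import numpy as np
from nptdms import TdmsWriter, ChannelObject

def benchmark(path, iterations=100, points=1_000_000, new_channel_each_time=False):
    # Time every write; a new channel name per iteration forces new meta data.
    data = np.zeros(points, dtype=np.float32)
    times = []
    with TdmsWriter(path) as writer:
        for i in range(iterations):
            name = "Chan1_%03d" % i if new_channel_each_time else "Chan1_000"
            t0 = time.perf_counter()
            writer.write_segment([ChannelObject("Group1", name, data)])
            times.append(time.perf_counter() - t0)
    return times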

 

Seeing how our data will most likely have a multitude of groups and channels, this is worrying.

 

Is anyone else seeing at least the same trend when running my code?

Message 11 of 45

What I'm pretty sure of is that TDMS writing just appends new data, so the writing performance should not be related to the size of the file.  What I can recommend is to write a simple VI with some simple TDMS Advanced API nodes, write data with it, and benchmark the writing performance.

Message 12 of 45

@YongqingYe wrote:

What I'm pretty sure of is that TDMS writing just appends new data, so the writing performance should not be related to the size of the file.  What I can recommend is to write a simple VI with some simple TDMS Advanced API nodes, write data with it, and benchmark the writing performance.


How is that different from the VI I have posted?  It's not a very complicated VI at all.

 

There are only two cases which do anything at all: Write SGL and New Group.  These write data to TDMS files, as you say.  If the first element of the SGL array received in any iteration is -1, a new group (or new channels in the same group) is created; otherwise the data is appended to the TDMS file.  There's nothing more to it, I'm afraid.
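In pseudo-Python the dispatch is no more than this (the sentinel value, the channel naming and the assumption that non-sentinel packets cycle through the active channels are mine, and npTDMS stands in for the LabVIEW TDMS functions):

import numpy as np
from nptdms import TdmsWriter, ChannelObject

SENTINEL = -1.0      # first element of the incoming SGL array marks "new group / channels"

def consume(packets, path, channel_count=3):
    # The stream is assumed to start with a sentinel packet.
    repeat, slot, names = 0, 0, []
    with TdmsWriter(path) as writer:
        for packet in packets:
            if packet[0] == SENTINEL:
                repeat += 1                       # new sub-group => write new meta data
                names = ["Chan%d_%03d" % (c + 1, repeat) for c in range(channel_count)]
                slot = 0
            else:                                 # otherwise just append the samples
                writer.write_segment([ChannelObject("Scan", names[slot],
                                                    packet.astype(np.float32))])
                slot = (slot + 1) % channel_count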

 

The rest of the complexity comes from the format of the data we receive, and this I cannot change for benchmarking because that would defeat the whole purpose of the exercise.

 

The data is organised into several groups, each of which consists of many channels.  These channels are in turn grouped into several distinct sub-groups determined by which channels are being recorded.  In the example below, three channels are being recorded with multiple repeat measurements.  The numbers represent the order in which the data arrives.  Because of the interleaved group structure of the received data, we cannot simply define ALL channels and write until the group is full.  We need to wait until each sub-group is finished (Chan1_001, Chan2_001 and Chan3_001) and only then define the next channels and continue writing.  It seems that this meta data writing is hurting our performance, aside from the multitude of individual TDMS Write calls (which we could reduce by buffering, say, 1M points).

As mentioned previously, we cannot always guarantee that we can hold the entirety of this interleaved data set (up to 32 channels of 1M SGL data points = 128 MB) in a single block of memory.  128 MB tends to start getting troublesome if it's not the only large memory allocation required in the software, and we have a few places where we need to handle such data sets, so 128 MB is most likely already going to be a limitation for a contiguous block of RAM.  What we have considered is writing each interleaved group to a separate TDMS file, then reading it back non-interleaved and writing it as decimated data to the final file.  This is added complexity we would like to avoid if at all possible.
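To make the arrival order concrete without the attached table, a toy enumeration in Python (three channels and four repeats are assumed purely for illustration; exactly how the chunks interleave within a sub-group doesn't matter for the point, which is that a fresh set of channel definitions, i.e. meta data, is needed at every sub-group boundary):

channels, repeats = 3, 4      # small numbers purely for illustration
packet = 0
for rep in range(1, repeats + 1):
    # A new sub-group starts here, so new channel definitions (meta data) must be written.
    print("define channels Chan1_%03d .. Chan%d_%03d" % (rep, channels, rep))
    for ch in range(1, channels + 1):
        packet += 1
        print("packet %3d -> Chan%d_%03d" % (packet, ch, rep))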

 

TDMS excel data representation.png

 

PS: Have you run my code?  Have you also observed the increase in write times?

Message 13 of 45

And just for completeness' sake, here's a version without queues and without the case structure.  I placed the Write Data and Create Group functionality into inlined sub-VIs to make things clearer.

 

It shows exactly the same behaviour.

 

Shane

Message 14 of 45

So, after waiting for a while, still no answers.

 

Lovely.

 

I suppose I'll have to proceed without TDMS for this application then.

 

If anyone can at least post whether they have seen the same write time increases as I have shown or not, that would be nice.

 

 

Message 15 of 45

Running your code I see the same linear increase of the write time.

I wonder what is going on here; it looks like some increasingly complex operation is required for each write (e.g. parsing the file meta data or moving file contents).

Write performance decreases so fast that after a few writes the write time will be longer than the acquisition time, which will cause problems quickly...

 

 

Message 16 of 45

I ran your code and saw an increase in write time.

It didn't seem as linear on my machine though.

10 2.png

Message 17 of 45

Is that the same disk as your OS is residing on?

 

All of the tests I did were on separate disks in order to avoid other processes accessing the disk in parallel.  Also, which LV version is that?

Message 18 of 45

Separate disk.

LV 2014 SP1 on Win7 64b.

 

The image above was taken with repeats=10 and 2D iterations=2.

When I use the standard settings (repeats=10 and 2D iterations=10), it all takes way longer:

 

10 10.png

Message 19 of 45

I ran another test with the default settings (repeats = 10 and 2D iterations = 10).

LV 2012 SP1 on Win7 64bit. Separate disk.

 

Quite different results from Florian's on his LV 2014 system.

 

TDMS Write Test.png

Between Iterations 112 and 140 other processes were running on the same system (CPU, memory, hard drive).

I seem to start much faster than Florian (less than 2 s per iteration), but the time increases linearly at a fast pace. Around iteration 160 it takes 26 s per iteration (write speed ~0.5 MB/s).

Total file size after 166 iterations is ~2.2GB.

 

Message 20 of 45