LabVIEW


ASCII vs. TDMS


I have this big packet that claims TDMS files are smaller than ASCII files.

 

 

I have a 4x36790 array that I save to both an ASCII file and a TDMS file. The ASCII file (862 KB) is significantly smaller than the TDMS file (1.12 MB).

 

Can anyone explain why my results don't match this NI chart?

Message Edited by elset191 on 05-08-2009 03:42 PM
--
Tim Elsey
Certified LabVIEW Architect
Message 1 of 5
Solution
Accepted by topic author elset191

I can't be sure (I can't run your code right now), but I'd be willing to bet that you aren't saving both files to the same precision. If each value in your ASCII file is only 7 characters long, the file would be the same size as a binary file of doubles (I'm assuming a one-byte delimiter in your text file).
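The 7-character break-even point can be checked with a quick Python sketch (not LabVIEW code). The `%.3f` format, the sample value, and the tab delimiter are assumptions for illustration; the format string actually used in the VI isn't known here.

```python
import struct

value = 123.456                          # hypothetical sample value
text_field = "%.3f" % value + "\t"       # 7 characters plus a 1-byte tab delimiter
binary_field = struct.pack("<d", value)  # one IEEE 754 double

print(len(text_field.encode("ascii")))   # bytes used by the text field
print(len(binary_field))                 # bytes used by the double
```

Any value that formats to fewer than 7 characters makes the text file smaller than the binary one; any value that needs more makes it larger.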

 

Try reading back both files and subtracting them from each other -- I'll bet you'll find a difference.  So TDMS files are smaller than ASCII files given the same precision.
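As a rough illustration of that read-back test, here is a Python sketch that round-trips a few values through text and through binary, then subtracts the results. The sample values and the 3-decimal text format are assumptions, not taken from the posted VI.

```python
import struct

original = [0.1234567, 2.7182818, 36790.000049]  # hypothetical samples

# Text round trip at 3 decimal places (assumed format).
from_text = [float("%.3f" % v) for v in original]

# Binary round trip as 8-byte doubles.
fmt = "<%dd" % len(original)
from_binary = list(struct.unpack(fmt, struct.pack(fmt, *original)))

# Nonzero wherever the text format dropped digits.
diff = [b - t for b, t in zip(from_binary, from_text)]
print(diff)
```

The binary copy comes back bit-for-bit identical, while the text copy differs from it by up to half of the last printed decimal place.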

-Matt Bradley

************ kudos always appreciated, but only when deserved **************************




Message 2 of 5

What is the magnitude of the numbers you are storing? 

 

If your dataset is relatively small, the overhead the TDMS file format adds for channel names and grouping outweighs the space you save by storing the data as binary. Your dataset isn't huge, but it isn't all that small either.

 

However, you are saving doubles, which cost 8 bytes per array element. With ASCII, you are storing each element in about 6 or so characters, since the default setting is 3 decimal places: a tab separator, the decimal point, 3 decimal places, and 1 character for each digit on the integer side of the decimal point.
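To make the per-value count concrete, here is a small Python sketch. The helper `ascii_bytes` is hypothetical and only mimics a fixed-decimal format with a one-byte delimiter; it is not how LabVIEW writes the file internally.

```python
def ascii_bytes(value, decimals=3, delimiter="\t"):
    """Bytes one value takes as fixed-decimal text plus its delimiter."""
    return len(("%.*f" % (decimals, value)) + delimiter)

DOUBLE_BYTES = 8  # a double always costs 8 bytes in binary

for v in (3.0, 42.5, 123.456, 36790.0):
    print(v, ascii_bytes(v), DOUBLE_BYTES)
```

With 3 decimal places, text only beats a double while the integer part has one or two digits; at three digits the two are equal, and raising `decimals` to 6 pushes even a single-digit value past 8 bytes.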

 

Unfortunately, the VI you posted was missing some subVIs and typedefs and didn't match the image you posted. After cleaning it up, I ran it with a small 4x4 array containing a 3 in the lower right corner and zeroes everywhere else: I got 107 bytes for ASCII and 616 bytes for TDMS, and the TDMS index file added another 456 bytes. So my test run showed both effects. There was very little data overall, so the extra information in the TDMS format made up a larger share of the file, and the numbers were small enough to take only 6 bytes each to write and separate in ASCII, as opposed to 8 bytes each to store as doubles in binary.
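A back-of-the-envelope split of that 616-byte TDMS file, as a Python sketch. It assumes the 16 samples are stored as raw 8-byte doubles and treats everything else in the file as overhead; the exact layout of that overhead isn't examined here.

```python
elements = 4 * 4
raw_bytes = elements * 8            # 16 doubles at 8 bytes each
tdms_bytes = 616                    # TDMS file size reported for the 4x4 test
overhead = tdms_bytes - raw_bytes   # everything that is not sample data

print(raw_bytes, overhead, round(overhead / tdms_bytes, 2))
```

Under that assumption, most of the tiny file is metadata rather than samples, which is why a test this small says little about larger datasets.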

 

Judging by your file sizes, each value you are storing costs about 2 more bytes in binary than as an ASCII string with 3 decimal places (147,160 array elements, ~258,000 bytes difference in file size).
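The arithmetic behind that estimate, sketched in Python. Whether the reported sizes use 1000-byte or 1024-byte kilobytes isn't stated, so both readings are shown as assumptions.

```python
elements = 4 * 36790                       # 147,160 values

# Decimal reading of the reported sizes (1 KB = 1000 bytes).
diff_decimal = 1.12e6 - 862e3              # about 258,000 bytes
# Binary reading (1 KB = 1024 bytes).
diff_binary = 1.12 * 1024**2 - 862 * 1024

print(elements)
print(round(diff_decimal / elements, 2))   # extra bytes per value, decimal units
print(round(diff_binary / elements, 2))    # extra bytes per value, binary units
```

Either way the difference works out to roughly 2 bytes per value.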

 

One disadvantage of ASCII is that the formatting to text will limit your precision to some amount that you define by the format string, but if you remain in binary, the value will be stored with all the precision/resolution that was present in its native datatype.

Message 3 of 5

Oops.  Here's the right VI.

Turns out I was in fact saving to different precisions, and when I switched the ASCII to 6 decimal places it is indeed bigger than the TDMS. Thanks for your insights, guys.

 

Is there a way to change the precision of TDMS files?

--
Tim Elsey
Certified LabVIEW Architect
Message 4 of 5

TDMS saves the data as raw binary data. So an 8-byte double takes up 8 bytes for that data point (indexing information aside). If you really want smaller files at the price of precision, you could save single precision floating point numbers, which are only 4 bytes. Check out the Numeric Conversion palette in the Numeric palette for functions to convert numeric data types.
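The size and precision trade-off between doubles and singles can be seen in a short Python sketch using the standard `struct` module. This only illustrates the two IEEE 754 formats; it doesn't touch the TDMS API, and the sample value is hypothetical.

```python
import struct

value = 0.1234567890123                # hypothetical sample

as_double = struct.pack("<d", value)   # 8 bytes
as_single = struct.pack("<f", value)   # 4 bytes

back_from_single = struct.unpack("<f", as_single)[0]
print(len(as_double), len(as_single))
print(abs(value - back_from_single))   # small but nonzero rounding error
```

Single precision keeps roughly 7 significant decimal digits, so it halves the storage per data point only if that resolution is enough for the measurement.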

 

When you start actually integrating TDMS into your application, there are other considerations for keeping your files smaller and more efficient.

Message Edited by Jarrod S. on 05-11-2009 09:27 AM
Jarrod S.
National Instruments
Message 5 of 5