Datastore options: TDMS or Database?

When I need a datastore, the first tool I reach for in any non-trivial case is a database (MySQL, SQL Server, etc).

 

A colleague recently bid a job specifying that the datastore will be implemented with a TDMS file.  I have never used this technology before, so I am unsure whether it is a good choice or a bad one.

 

I like databases for the following reasons:

 

  1. They use "standard" SQL language.
  2. The content is fairly open.
  3. I can get the data from multiple locations using a database tool or Excel.
  4. The capacity scales well.
  5. It is moderately easy to write LV code to manipulate the data.
  6. I can post reports to web pages using the data directly.
How does TDMS compare on these points?  Are there any other points that should be mentioned?

 

Message 1 of 13

Let me respond to each of your concerns.

 

1. They use "standard" SQL language.

While TDMS files do not support querying with SQL, you can still query data sets scattered across your computer.  You can store metadata (properties) with each file, channel, or group of channels, and that metadata can be searched across an entire system.  A product called DataFinder will index your system's metadata so that queries run very quickly.
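For illustration, here is a minimal pure-Python sketch (not DataFinder itself; the file names and property keys are hypothetical) of the kind of property-based lookup this enables once file-level metadata has been indexed:

```python
# Each entry models the scalar properties stored at the file level of a
# TDMS file, as a DataFinder-style index might hold them.
index = [
    {"path": "run_001.tdms", "operator": "kc64", "rig": "A", "max_temp": 71.2},
    {"path": "run_002.tdms", "operator": "kc64", "rig": "B", "max_temp": 88.9},
    {"path": "run_003.tdms", "operator": "jane", "rig": "A", "max_temp": 65.0},
]

def query(index, **conditions):
    """Return paths of files whose properties match every condition."""
    hits = []
    for entry in index:
        if all(entry.get(k) == v for k, v in conditions.items()):
            hits.append(entry["path"])
    return hits
```

A call like `query(index, operator="kc64", rig="A")` then plays the role of a `WHERE` clause, without any data files being opened.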

 

2. The content is fairly open.

The TDMS file format is openly documented (the binary layout is published).  There are also plugins for various third-party environments (MATLAB, OpenOffice, Excel), as well as a generic C DLL for accessing TDMS files from virtually any environment.
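The openness is concrete enough to demonstrate: under NI's published layout, every TDMS segment begins with a fixed 28-byte lead-in. As a sketch (Python used here for illustration; field order follows the published layout, but verify against ni.com/tdms before relying on it), a few lines of struct unpacking can read it:

```python
import struct

def parse_lead_in(blob):
    """Parse the 28-byte lead-in that starts every TDMS segment:
    a 'TDSm' tag, a table-of-contents bitmask, a version number,
    and two offsets (next segment, raw data)."""
    tag, toc, version, next_seg, raw_off = struct.unpack("<4siIQQ", blob[:28])
    if tag != b"TDSm":
        raise ValueError("not a TDMS segment")
    return {
        "toc_mask": toc,
        "version": version,
        "next_segment_offset": next_seg,
        "raw_data_offset": raw_off,
    }
```

Because the layout is documented down to this level, any environment that can read bytes can read TDMS, which is how the third-party plugins and the C DLL are possible.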

 

3. I can get the data from multiple locations using a database tool or Excel.

Same thing with TDMS.  If you double-click a TDMS file in Windows, it opens in Excel.  Leveraging the TDMS API, you could easily build a web service (or even a web page) that accesses and queries your TDMS files and displays them in a UI.

 

4. The capacity scales well.

This is definitely an area where TDMS wins.  Databases scale pretty well, but ultimately they are not designed for high-rate measurement data.  If you acquire data for weeks at a decent rate, your data can easily exceed 2 GB.  Some databases could handle this, but you would probably be storing the samples in a binary "blob" field, which *should* work, but I wouldn't feel great about it.

 

5. It is moderately easy to write LV code to manipulate the data.

Again, this is an area where TDMS definitely wins.  It's easier to write LabVIEW code to manipulate data in a TDMS file.  Even a simple Express VI (Read From Measurement File) can provide your data.

 

6. I can post reports to web pages using the data directly.

You can do this with TDMS as well (just have the back end of your web page access TDMS instead of a database).  Similarly, LabVIEW makes it easy to create a web service, so you could design a web page hosted on a remote server that accesses your LabVIEW web service to gather the data.

 

How does TDMS compare on these points?  Are there any other points that should be mentioned?
TDMS is a much better option than a database for streaming high-throughput measurement data.  It supports most of the features you would find in a database and should serve as a solid foundation for the long term.  Even if you prefer a database for hosting your metadata (though, as I pointed out, that is unnecessary since a TDMS file can store such information directly), I would still recommend leaving the actual "data" part to TDMS.  That is, have a "filePath" field alongside the other information that stores the location of the TDMS file where the actual data resides.
Size:
Note as well that TDMS is a binary file format, so a double-precision value is stored in 8 bytes.  Furthermore, if you use the DAQmx data-logging feature (integrated into the driver software) to stream data to TDMS, only 2 bytes per sample are stored to disk, with scaling information stored alongside the data so that it is automatically scaled when the file is read back through the TDMS API.  Similarly, even outside of DAQmx, you can write compressed data and use the "Create Scaling Information" VI to store the scaling values.  This feature helps a lot with disk space.
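To make the size argument concrete, here is a small Python sketch (illustrative only; the gain and offset numbers are made up) of the raw-plus-scale scheme described above: 2-byte integer codes on disk, full doubles after read-back scaling:

```python
def apply_linear_scale(raw_samples, gain, offset):
    """Scale raw integer ADC codes to engineering units on read-back,
    as the TDMS API does when scaling information is stored in the file."""
    return [r * gain + offset for r in raw_samples]

raw = [0, 16384, 32767]            # int16 codes: 2 bytes each on disk
volts = apply_linear_scale(raw, gain=10.0 / 32768, offset=0.0)

bytes_raw = len(raw) * 2           # what streaming raw codes costs
bytes_f64 = len(raw) * 8           # what storing doubles would cost
```

The same samples take a quarter of the disk space, and the scale travels with the file, so nothing is lost on read-back.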
Thanks,

Andy McRorie
NI R&D
Message 2 of 13

Please refer to www.ni.com/tdms for more information.  It also contains a general comparison of TDMS vs. databases and some other file formats.

Message 3 of 13

Thank you both for your helpful posts.  I want to mark them both as solutions. 🙂

Message 4 of 13

So, my boss wants me to write the solution in TDMS and the client wants the same thing.

 

Now I have some configuration data to load into the datastore (IP addresses, COM port assignments, etc.).  With my database background, I would just open the table and type the data into it, or copy the data from an Excel sheet provided by the client and import it into the table.  What is the best method for manually entering data into a TDMS file?

Message 5 of 13

Hi kc64,

 

I'm afraid you cannot manually edit a TDMS file in a text editor and save it back, but for your case I would recommend using the TDMS Set Properties node in LabVIEW to set the configuration information.
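To illustrate the shift in thinking, here is a plain-Python sketch (no real TDMS I/O; the group and property names are hypothetical) of what the Set Properties approach amounts to: each configuration value becomes a named scalar property attached to a file, group, or channel, rather than a row typed into a table:

```python
# Models the property store of a TDMS file: (group, property_name) -> value.
tdms_properties = {}

def set_properties(group, **props):
    """Analogue of the TDMS Set Properties node: attach scalar
    properties to a group programmatically."""
    for name, value in props.items():
        tdms_properties[(group, name)] = value

# Configuration data that would have been table rows in a database:
set_properties("Config", ip_address="192.168.1.10", com_port="COM3")
set_properties("Config", baud_rate=115200)
```

A small one-off VI (or utility program) that reads the client's Excel sheet and makes calls like these is the usual substitute for typing directly into a table.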

Message 6 of 13

For the benefit of those who come after, I would like to point out that I really miss the relational abilities of the database solution.  I failed to realize, until this project, just how much code I have avoided writing by using SQL to relate, filter, sort, and otherwise manage complex data manipulations.  As far as I can tell, there is no TDMS substitute except writing custom code to do what SQL could have done easily.
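As an illustration of that point, here is roughly what a one-line SQL join-and-filter turns into as hand-written code (Python here for readability; the record layouts are hypothetical):

```python
# Two "tables" that, in TDMS, would live as properties scattered
# across files rather than as relational rows.
runs = [
    {"run_id": 1, "rig": "A", "max_temp": 71.2},
    {"run_id": 2, "rig": "B", "max_temp": 88.9},
]
rigs = [
    {"rig": "A", "location": "Lab 1"},
    {"rig": "B", "location": "Lab 2"},
]

# Hand-rolled equivalent of:
#   SELECT r.run_id, g.location FROM runs r
#   JOIN rigs g ON r.rig = g.rig
#   WHERE r.max_temp > 80 ORDER BY r.run_id
by_rig = {g["rig"]: g for g in rigs}          # build the join index
report = sorted(
    (r["run_id"], by_rig[r["rig"]]["location"])
    for r in runs
    if r["max_temp"] > 80
)
```

Every join, filter, and sort the database engine would have handled becomes explicit code like this, which is exactly the maintenance burden being described.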

Message 7 of 13

Hi kc64,

 

Based on your statement that you are missing the built-in ability to run relational queries, it sounds like you have not yet tried using NI's DataFinder to provide that functionality.  The DataFinder automatically indexes your TDMS data files and populates records in its relational database, so you write to the database simply by creating TDMS files with scalar (queryable) properties at the File, Group, or Channel level, with each Channel containing a 1D array of (non-queryable) data values.  The LabVIEW DataFinder Toolkit (included with Developer Suite Core) provides an easy-to-use set of VIs to query the DataFinder database.  Note that there is a small deployment fee when you build an application that contains the LabVIEW DataFinder Toolkit, but that includes the cost of the deployed DataFinder database.

NI also makes a product called DIAdem that provides a full environment for browsing and querying your TDMS data files, and then inspecting, analyzing, and reporting on the loaded data.  The tree-view and querying user interfaces to the DataFinder database are built-in features of the DIAdem environment.  DIAdem installs the DataFinder automatically, and the price of the DataFinder is already built into the price of DIAdem.  DIAdem can also be automated to provide batch processing and reporting, much like the macro recorder in Excel.  If you're interested, check us out at www.ni.com/diadem or just email me at brad.turpin@ni.com.

 

Cheers,

Brad Turpin

DIAdem Product Support Engineer
National Instruments

Message 8 of 13

How does Citadel fit into this context?  I've just begun learning the DSC module and found that a Citadel database can be used to log historical data.  I never knew Citadel existed until today.

Message 9 of 13

Hi kc64,

 

Citadel was designed as a way to efficiently store data and alarm status for long-term monitoring applications, originally for the Lookout product and later for what eventually became LabVIEW DSC.  Shared variables in LabVIEW can stream directly to Citadel, for instance.  Citadel natively supports time-window querying and selective channel loading.  There is a way to create an ODBC connection to the Citadel database so that you can additionally query on channel values using SQL.  The big difference from normal SQL databases is that Citadel does not store any other relational information (what the TDMS world calls properties), so aside from time and data values there is remarkably little to query on.  You just ask to load data from selected channels over selected time windows, and that's it, which is all you want when you're analyzing the results from the last batch or the last month.
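As a sketch of that retrieval model (Python, with made-up channel data): you name a channel and a time window, and get back only the samples inside it, with nothing else to query on:

```python
import bisect
from datetime import datetime, timedelta

# Hypothetical logged data: one minute-rate channel over an hour.
t0 = datetime(2024, 1, 1)
timestamps = [t0 + timedelta(minutes=m) for m in range(60)]
channels = {"TankTemp": [20.0 + 0.1 * m for m in range(60)]}

def load_window(channel, start, end):
    """Citadel-style retrieval: (timestamp, value) pairs for one
    channel in the half-open window [start, end)."""
    lo = bisect.bisect_left(timestamps, start)
    hi = bisect.bisect_left(timestamps, end)
    return list(zip(timestamps[lo:hi], channels[channel][lo:hi]))
```

Because the timestamps are stored in order, the window lookup is a binary search rather than a table scan, which is what makes this model efficient for long-running monitoring data.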

 

Brad Turpin

DIAdem Product Support Engineer

National Instruments

Message 10 of 13