LabVIEW

cancel
Showing results for 
Search instead for 
Did you mean: 

Industry standard for storing results in a database

Hi all,

 

We're building a networked test system that uses a centralised database for retrieving and storing information for each test station, of which there are many. As a part of this, each test station will generate data that needs to be stored in the database, and we've been thinking about the best way to achieve this.

 

These are our requirements:

  • Must store single values, such as Boolean (Pass/Fail), Numeric (DBL, such as 3.24V), Text (serial numbers), Timestamp (Date record).
  • Must store short arrays of numerics, up to a few thousand points only.
  • Must be readable from various environments (not just LabVIEW), so we can't use peculiar data formats.

Regarding frequency, the quicker test stations will produce a single result (Pass) every 30 seconds, whilst the slower systems maybe an array of DBLs once every five minutes. If the database gets 'large' then we'll look to archiving out old data once in a while, so size isn't a concern.

 

We expect that the problem of needing to flexibly store various data in a database has been solved a thousand times before, so to save us reinventing the wheel we were wondering if there's a published solution out there, an industry standard approach perhaps, that we can refer to?

 

Any advice thankfully received!

 

RTD

0 Kudos
Message 1 of 11
(6,410 Views)

As far as I know, there isn't really an 'industry standard' for storing test results because every company has their own requirements for what data they want to store/retrieve (e.g. the data stored with each test). There are probably COTS (commercial off the shelf) solutions available but they may not do what you need or may be completely overkill.

 

As a rough idea, this is something I put together recently as a small system for storing test measurement data:

2015-07-17_09-15-29.png

 

This allows me to store any number of measurements associated with a specific test run. If I needed to store anything other than floats I would either add a str_data column or create a table for each type of measurement I'm expecting. I've also done some work for storing production line test data - that was a lot more complicated and ran into the order of 15 or so tables.

 

As for the frequency/size of the database - this isn't really an issue with how big the database gets, they are designed to work with millions of rows (as long as you choose appropriate id number formats and indexing). You're more likely to run out of disk space on the server before you run into any real limitations. It should be pretty easy to estimate the amount of space your database will use over time (add up column record size in kb and multiply by number of rows generated over X months/years).

 

As for a database engine - MySQL and SQL Server are common, but neither of these are really suited to storing arrays of data (you'd have to convert the array into rows and or serialise it into a string which makes calculations harder) - you might want to look at something like MongoDB (stores records as 'documents' and they don't have to have a rigid structure).

 

Pretty much all database engines have some sort of interface which you can use with LabVIEW or other applications (e.g. ODBC, command line, TCP/IP, DLL).


LabVIEW Champion, CLA, CLED, CTD
(blog)
Message 2 of 11
(6,383 Views)

Hi Sam,


Thank you for your reply. Our current approach is similar to your suggestion, and includes a table of Reports (each with primary keys) and multiple tables of Results. Each Results table handles one data type, and each entry references the Report key. With this we can create Reports with multiple Results entries of various datatypes. It's also expandable to new datatypes. But we're not sure about the chronology - how can we ensure the data in the Report is returned from the database in the same order it was created? For example, if we acquire a numeric, then a boolean, then another numeric results, those three results might come back from the database as two numerics and a boolean. The order is now changed, and that will affect report content generation.

 

Storing arrays seems troublesome. We're using SQL Server, so we might need to serialise the data and keep it interpretable for other applications.

 

I'd be interested in seeing how others solve this problem. We need flexibility to accommodate unknown future datatypes (which we have), and potentially large data sets too. For this we expect we would simply store a path to a file and keep the data outside of the database itself.

 

 

0 Kudos
Message 3 of 11
(6,373 Views)

You could put an 'order' column on the results table? Either that or if they need to be sorted chronologically then put a timestamp on the results. You can also sort by the id.

 

If you're not bothered about the database doing calculations on the arrays (so it's just for archiving) then you can just comma-separate them and store them in the array as text. We did that for storing configuration keys - we wanted to know which configuration keys were used for a test but weren't too bothered about lookups on the data so we just comma-separated them and stored the ids as text.

 

Also - most database engines support a binary BLOB data type - you can use this to store binary data such as files in the database itself. Store a BLOB of file data and the file name in a 'files' table. You can use this for any custom data types as well (but be aware of things like endianness when dealing with binary data).

 

MongoDB is good for these sorts of flexible applications because it has no rigid structure - your 'records' can contain any fields/data types.


LabVIEW Champion, CLA, CLED, CTD
(blog)
Message 4 of 11
(6,352 Views)

In actuality, I got the description of our current approach wrong. This is a snippet of the design:

tables.png

We have a TestReport table, and a TestResult table. Each Report is linked to muliple Results entries, which each reference one of the smaller specific tables, such as ResultBOOL.

This gives us our ordering through the ID field and we can expand to additional datatypes by creating more specific tables on the right hand side.

 

But it's not just about the storage, it's about reasonable levels of retrieval too. A query to this database to return all the Results for a provided ReportID will result in what? When you think about it, you would get a large 2D variant array (as that's how LabVIEW returns data), with columns from all the Specific tables concatenated into one. How do we know what datatype each cell is?

 

Alternatively, we ask for all the BOOL results, then all the INT results, then all the STR results etc., which makes for multiple calls to the database.

 

Is there a better way to store the data in the database such that retrieving it is simpler?

0 Kudos
Message 5 of 11
(6,330 Views)

Have you considered the way TestStand stores data?

Jim
You're entirely bonkers. But I'll tell you a secret. All the best people are. ~ Alice
For he does not know what will happen; So who can tell him when it will occur? Eccl. 8:7

Message 6 of 11
(6,303 Views)

@jcarmody wrote:

Have you considered the way TestStand stores data?


I've long toyed with the idea of writing a LabVIEW API for this.  Someday. 😄

Jim
You're entirely bonkers. But I'll tell you a secret. All the best people are. ~ Alice
For he does not know what will happen; So who can tell him when it will occur? Eccl. 8:7

0 Kudos
Message 7 of 11
(6,252 Views)

I've had a look at the link, I think that will take some time to understand fully. Does it have more flexibility than the aforementioned solution? I can't see any enquiries to this being simple.

0 Kudos
Message 8 of 11
(6,227 Views)
TestStand itself is a very flexible tool for managing tests and has a lot more options than your test manager so the schema reflects that. Some of the rows in the tables can be removed. One thing I suggest that you add to your schema is the test result status. It is useful to be able to find all tests that have passed or all tests that have failed without having to compare each result to the limits after you have stored them.
Message 9 of 11
(6,200 Views)

@RTDSafety wrote:

 

Alternatively, we ask for all the BOOL results, then all the INT results, then all the STR results etc., which makes for multiple calls to the database.

 

Is there a better way to store the data in the database such that retrieving it is simpler?


We basically store all results in tResults with attributes ID, type, value(dbl), value(string). Int, bool and single/doubles can all easily be stored as doubles with the Type attribute determining how it should be returned. If type is String it's ofcourse the varchar field that's used. One clever solution of the double specials of +/-Inf and NaN, which cannot be stored in SQL server, is stored as StringNumber instead, meaning it should be stored in the String field, but through Extract from string and Format into string. The benefit is ofcourse one data format and one question for all normal data.

 

Arrays are a special type which uses a sub-table and the string can store the ID/guid of it.

 

/Y

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems
Certified-LabVIEW-Developer
Message 10 of 11
(6,113 Views)