09-30-2010 09:02 AM
Hi Jeff,
That does make sense and clears up the mystery-- I was assuming that the second column was the number of elapsed seconds for the day listed in the first column. Obviously, I didn't look too closely into that assumption, as it led to an extra day being added on...
I can fix the DataPlugin. I'll post a new one soon. Tell me, though, is this data file you sent the typical size you have, or are some data files much larger? What's the largest data file size you'd expect? Having to parse out each DBL from the second column as a series of compressed strings and build up each datetime stamp manually will take some time for a really big data file.
Brad Turpin
DIAdem Product Support Engineer
National Instruments
09-30-2010 09:16 AM
Brad,
Thanks. The files will get quite large. There are X, Y, and Z accelerometer files that get up to ~75MB each (the sensor gathers data for up to 11 hours at one sample per second). The file formats are exactly the same. It is OK if it takes some time to process; there will not be new datasets very often.
There is also a fourth file type, data from an audio sensor, that has the exact same file format except that it has 512 data points on a line instead of 256. Thus the file size can be about double, roughly 150MB max. The sampling rate on this sensor is 30 kHz. Is it easy to modify the DataPlugin to handle this case as well? I am happy to try to make these modifications myself if you point me in the right direction-- I don't want to take too much of your time if this is burdensome.
On a side note, this sensor will be run on its first field test tomorrow. We are measuring sound and vibration response on a train for a certain application. I am excited to be able to use DIAdem and the new mapping features to visualize the data.
Regards
Jeff
09-30-2010 10:26 AM
Hi Jeff,
You were right, the synchronization of those 2 files looks great now that the JTFA_XLS DataPlugin reads the time stamp correctly. Currently the DataPlugin only reads the first 256 data values on each line, so it will "work" for the audio files, but it will also ignore half the data values. It would be easy to edit the DataPlugin to read 256 data values for vibration files and 512 data values for audio files... ASSUMING that there is some way for the DataPlugin to tell which file it is loading. Is there some string in the data file that the DataPlugin can reliably latch onto to reveal which type of file it is? Would you post one of these audio files for me to work with?
When you get very large data files, I suggest that you load them into DIAdem the first time with the DataPlugin (which will take a while), then IMMEDIATELY save them back to disk as a TDM/TDX data file (DIAdem default format). The save will only take a few seconds, and loading the TDM/TDX data file back into DIAdem will also only take a few seconds. A binary data file of size 100MB is very manageable, but an ASCII data file in that range performs very slowly. If I were you, I'd pay that loading speed hit only once and use the binary TDM/TDX data file every subsequent time you want to load that data set. This will work on all your 3 data file types.
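The parse-once, cache-binary pattern described above can be sketched in Python (a stand-in for the DIAdem workflow -- here a NumPy `.npy` file plays the role of the binary TDM/TDX pair; the function name and cache scheme are illustrative assumptions, not DIAdem API):

```python
import os
import numpy as np

def load_cached(csv_path):
    """Parse the slow ASCII file once, then reuse a fast binary cache.

    Python stand-in for the DIAdem workflow described above
    (DataPlugin load, then an immediate TDM/TDX save).
    """
    cache_path = csv_path + ".npy"
    if os.path.exists(cache_path):
        return np.load(cache_path)              # binary load: seconds
    data = np.loadtxt(csv_path, delimiter=",")  # slow ASCII parse, paid once
    np.save(cache_path, data)
    return data
```

The second and every later call hits the binary cache, which is the same "pay the loading speed hit only once" idea as the TDM/TDX round trip.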
Brad Turpin
DIAdem Product Support Engineer
National Instruments
09-30-2010 10:44 AM
Brad,
Thanks, I will check out your updated DataPlugin. I am attaching an audio file for you to have a look at; it was sampled at the same time as the other two files. A key difference is that the audio data is sampled at 30 kHz instead of 3200 Hz-- I guess that change needs to be accounted for in the VBScript rather than the DataPlugin, right?
There is not a string in the data file itself that describes the type. All of the data points are on the same line, however, so you could count until you encounter an LF. A simpler solution might be the file name-- accel files will always start with "Accel" and audio files will always start with "Audio".
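The count-until-LF idea above could be sketched like this in Python (an illustration only-- the assumption that each line carries 2 leading date/time fields before the data values is mine, not from the file spec):

```python
def data_values_on_first_line(path, header_fields=2):
    """Count comma-separated fields up to the first LF.

    With an assumed 2 leading date/time fields per line, a vibration
    file would yield 256 and an audio file 512.
    """
    with open(path, "r") as f:
        first_line = f.readline()
    return len(first_line.rstrip("\r\n").split(",")) - header_fields
```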
Thanks
Jeff
09-30-2010 11:15 AM
Hi Jeff,
I think I will rewrite the DataPlugin to count out the first line of values and apply that number to all the rest of the lines. But if the sampling rate is different between audio and vibration files, then the DataPlugin still needs to distinguish between them-- currently the 3200 S/sec rate is hardcoded in the DataPlugin (because it's nowhere inside the file). I noticed that the audio file you posted had the file extension CSV, whereas your vibration file had the file extension XLS-- is that a reliable distinction, or can there be vibration files that end in CSV and audio files that end in XLS?
I was also thinking about your larger/longer data files some more. You probably also want to run the JTFA.vbs on each of those (which will also take a while) and save the resulting data into a TDM/TDX file for future reference. You won't want to have to run all the FFT calculations more than once.
Brad Turpin
DIAdem Product Support Engineer
National Instruments
09-30-2010 12:02 PM
Brad,
Sounds like a good approach. All files are ASCII CSV; they will never be Excel files. If you look at the Accel file attached to my first post, it was CSV and it works with your plugin-- I'm not sure where XLS entered the picture.
Thanks
Jeff
09-30-2010 12:20 PM
Hi Jeff,
It was the file extension. The first 2 data files you sent me had file names that ended with "*.xls", but the audio file you sent had a file name that ended with "*.csv". I know they're all ASCII files because I'm parsing them as ASCII files with the DataPlugin functions. Right now I'm posting a version of the DataPlugin that uses 3200 Hz and 256 columns UNLESS the file name starts with "Audio", in which case it uses 30000 Hz and 512 columns.
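That rule (vibration defaults unless the file name starts with "Audio") boils down to something like this-- a Python sketch of the logic, not the actual VBScript DataPlugin code, with illustrative key names:

```python
import os

def plugin_settings(path):
    """Mirror the rule above: 3200 Hz / 256 values per line by default,
    30000 Hz / 512 values per line when the name starts with "Audio"."""
    name = os.path.basename(path)
    if name.startswith("Audio"):
        return {"sample_rate_hz": 30000, "values_per_line": 512}
    return {"sample_rate_hz": 3200, "values_per_line": 256}
```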
I'm concerned that with the current approach we're going to run out of DIAdem channels, which max out around 65,536 (2^16). We could alternatively load 256 or 512 channels and have the timestamps and their values in rows-- then we wouldn't have to worry about the length of the recording. Will the maximum number of timestamps in any of these data files exceed 65,000 or so?
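The two layouts can be contrasted with a quick NumPy sketch (the array shapes follow the numbers in this thread; the variable names are illustrative):

```python
import numpy as np

# 39,600 one-second snapshots of 256 samples each (one 11-hour vibration file):
snapshots = np.zeros((39600, 256))

# (a) One channel per snapshot: channel count grows with recording length
#     and approaches the ~65,536-channel ceiling once several sensors load.
channels_per_sensor_a = snapshots.shape[0]   # 39,600

# (b) 256 (or 512) fixed channels, one row per snapshot: channel count
#     stays constant no matter how long the recording runs.
by_column = snapshots.T
channels_per_sensor_b = by_column.shape[0]   # 256
```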
I noticed with both the acceleration and the audio files that the frequency content didn't start until after 13:00, about the time that the NMEA file started. That means that the acceleration and audio files were only 20% filled with usable data and 80% filled with silence. Perhaps we should consider chopping off the timestamps that are outside the corresponding NMEA data file before calculating the FFTs?
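Chopping off the out-of-window timestamps could look like this (a Python sketch, assuming numeric timestamps and one row of values per stamp-- the actual trimming would happen in the VBScript before the FFTs):

```python
import numpy as np

def clip_to_gps_window(timestamps, values, window_start, window_end):
    """Drop snapshots whose timestamps fall outside the NMEA (GPS-fix)
    window before running the FFTs."""
    timestamps = np.asarray(timestamps)
    values = np.asarray(values)
    keep = (timestamps >= window_start) & (timestamps <= window_end)
    return timestamps[keep], values[keep]
```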
Brad Turpin
DIAdem Product Support Engineer
National Instruments
09-30-2010 12:32 PM
Hi Jeff,
OK, this is confusing. I went back and looked at your original post; the attached file has a link name ending in *.csv, but when I right-click on that link and choose "Save Target As..." I get a file dialog that suggests the name AccelX001.xls and a file type of "Excel".
So I suppose the Discussion Forum may be automatically converting that file during the save process, or perhaps you posted an Excel file with a *.csv file extension.
Are they all *.CSV files, then, on your end?
Brad Turpin
DIAdem Product Support Engineer
National Instruments
09-30-2010 12:35 PM
OK, sorry about the file extension confusion; the bottom line is that the files will always be ASCII comma-separated values. Thanks for the DataPlugin revision, I will give it a try.
Currently we can have about 11 hours of data in one file, and there is a new timestamp approximately every second, so that gives us 11*3600 = 39,600 timestamps in a file. However, I would like to be able to have data from all four sensor files open at the same time if possible (AccelX, AccelY, AccelZ, Audio), so the worst case number of "snapshots" is 4*39,600 = 158,400. Sounds like we may run into a problem hitting the max number of channels, then.
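Writing that worst case out (note that 4 x 39,600 comes to 158,400, well past the 2^16 channel ceiling):

```python
timestamps_per_file = 11 * 3600                # 11 h at one stamp/second -> 39,600
worst_case_channels = 4 * timestamps_per_file  # 4 sensor files open at once

# With one channel per timestamp, this exceeds the 2**16 channel limit:
assert worst_case_channels > 2 ** 16           # 158,400 > 65,536
```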
Regarding your last point, it would be perfectly fine to chop off data that falls outside of the time when there is valid GPS fix (NMEA times).
Thanks
Jeff
09-30-2010 12:38 PM
Brad,
They are all CSV files on my end. Not sure how the file extension is being renamed... strange.
Jeff