LabVIEW

Defrag TDMS File With Progress

Have you ever had a very large TDMS file that comes with a very large index file?  Have you ever been told to defrag it so it is smaller and easier to work with?  How long did it take to defrag your file?  The current implementation of the TDMS Defrag is a black box with no indication that anything is happening.  You simply have to wait a very long time and hope it finishes.

 

What I've developed is some code that reports progress on a TDMS defrag and lets you cancel midway through.  If you cancel, the part of the file that has already been defragged is kept.  So if half the file was defragged, running the defrag again will find that first half already taken care of.

 

It does this by breaking the TDMS file up into chunks.  By default it makes 10 separate files from the one file.  It then defrags the first file, combines it with the next file and defrags the combined file, then combines that with the next file and defrags the result again.  It repeats this until all files have been processed or until the user cancels.
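
The loop described above can be sketched in text form. This is only a toy model in Python, not the attached VIs: a "file" is a list of (channel, samples) segments, `defrag` simply merges each channel's segments, and the names `defrag_with_progress` and `on_progress` are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Toy stand-in for a TDMS segment: (channel name, samples). Not the real file format.
Segment = Tuple[str, List[float]]


def defrag(segments: List[Segment]) -> List[Segment]:
    """Merge all segments of each channel into one, keeping first-seen channel order."""
    merged: Dict[str, List[float]] = {}
    for channel, samples in segments:
        merged.setdefault(channel, []).extend(samples)
    return list(merged.items())


def split_into_chunks(segments: List[Segment], n_chunks: int) -> List[List[Segment]]:
    """Split the segment list into at most n_chunks consecutive pieces."""
    size = max(1, -(-len(segments) // n_chunks))  # ceiling division
    return [segments[i:i + size] for i in range(0, len(segments), size)]


def defrag_with_progress(
    segments: List[Segment],
    n_chunks: int = 10,
    on_progress: Callable[[int, int], bool] = lambda done, total: True,
) -> List[Segment]:
    """Defrag chunk by chunk; on_progress returns False to cancel."""
    chunks = split_into_chunks(segments, n_chunks)
    combined: List[Segment] = []
    for index, chunk in enumerate(chunks, start=1):
        # Combine the already-defragged part with the next chunk, then defrag again.
        combined = defrag(combined + chunk)
        if not on_progress(index, len(chunks)):
            # Cancelled: keep the defragged part and append the untouched remainder.
            remainder = [seg for rest in chunks[index:] for seg in rest]
            return combined + remainder
    return combined
```

The callback receives (chunks done, total chunks) after every step, which is where a progress bar and a cancel button would hook in. A cancelled run still returns all of the data, with the processed part already merged.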

 

The downside to this technique is that the total time to defrag a file will be much longer.  Still, this is a first attempt, and there are other techniques that could improve its performance, like defragging each of the 10 files at once and then combining them.  That is a bit more tricky.
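
To get a rough feel for how much longer, here is a back-of-the-envelope estimate. It assumes defrag time is proportional to the size of the file being defragged, which is a simplification and not something I have measured; `relative_cost` is just an illustrative name.

```python
def relative_cost(n_chunks: int) -> float:
    """Work done by the incremental scheme relative to a single full defrag.

    Assumption: defragging a file made of k chunks costs k units of work.
    The incremental scheme defrags files of 1, 2, ..., n_chunks chunks in turn.
    """
    incremental = sum(range(1, n_chunks + 1))  # 1 + 2 + ... + n
    single_pass = n_chunks                     # one defrag of the whole file
    return incremental / single_pass


print(relative_cost(10))  # 5.5
```

Under that assumption the default of 10 chunks costs about 5.5 times a single pass, and in general (n + 1) / 2 times for n chunks. Fewer chunks means less rework but coarser progress reporting.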

Message 1 of 10
The reason that TDMS files get fragmented is that, to make writes fast, the format doesn't worry about keeping different samples of the same channel in contiguous storage. Consequently, as samples are added to the file, any one channel's data ends up spread throughout the file.

Something you can try is to read each channel one at a time and write it back to a new TDMS file all at once. In the end, the new file will contain all the data in the old one, but each channel will now sit in one contiguous block -- effectively defragged.
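
As a rough illustration of that idea, here is a toy model in Python. It is not TDMS code: the "file" is just a list of (channel, samples) segments in the order they were written, and all of the function names are made up for this sketch.

```python
from typing import List, Tuple

# Toy stand-in for a fragmented file: segments in the order they were written.
Segment = Tuple[str, List[float]]


def channel_names(segments: List[Segment]) -> List[str]:
    """Channel names in the order they first appear in the file."""
    return list(dict.fromkeys(name for name, _ in segments))


def read_channel(segments: List[Segment], channel: str) -> List[float]:
    """Collect every sample of one channel, scanning the whole file once."""
    return [s for name, samples in segments if name == channel for s in samples]


def rewrite_by_channel(segments: List[Segment]) -> List[Segment]:
    """Build the new file: one contiguous segment per channel."""
    return [(name, read_channel(segments, name)) for name in channel_names(segments)]


fragmented = [("time", [0.0]), ("volts", [1.0]), ("time", [0.1]), ("volts", [1.5])]
print(rewrite_by_channel(fragmented))
# [('time', [0.0, 0.1]), ('volts', [1.0, 1.5])]
```

The trade-off is one full scan of the old file per channel, with only a single channel's data held in memory at a time.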

Mike...

Certified Professional Instructor
Certified LabVIEW Architect
LabVIEW Champion

"... after all, He's not a tame lion..."

For help with grief and grieving.
Message 2 of 10
I know what makes files fragmented, and I know techniques to avoid fragmented files. I have lots of data made by someone else who didn't know the right way of doing it. We are talking about TDMS files that are multiple GB, with corresponding index files that are also multiple GB. Just doing a TDMS open takes several minutes, let alone the time needed to read any data. This way I can run a batch on all these files, without needing to open them or read data, and I get some feedback on progress.
Message 3 of 10
How do you break up the original file into chunks? How long does the entire process take? Man, GB files! 64-bit machine right? Any idea how big the files are after defragging?

I just had a thought: a hybrid approach might be something to try. Break up the files into chunks as you do now, but then read the data out of the chunks and resave it, as I was suggesting. When dealing with really large datasets you never know what might make a difference.

Other than that, all I have to offer are a few random thoughts. I was reading a paper from a guy at NI that was actually about floating-point representation, but it talked about the difference that having debugging enabled can make to the way LV's compiler optimizes code, and the difference can be huge. So you want to make sure that debugging is disabled on this code.

And you will want to run it on a computer with the most free disk space possible, and as little other stuff running in the background as possible. No network connection, no antivirus, no firewall, fastest HD possible. Do you have a computer available with a RAID array for disk storage? Parallelizing disk accesses would certainly not hurt.

You might also want to build this into an executable and see if that makes a difference.

Mike...

PS: I gave my original answer last night and didn't actually notice who posted the question (the icons, as they appear on my phone, are often way too small for me to see). In any case, I guess I went into more detail than you needed, no offense intended.

Message 4 of 10

Hi, I like this idea and tried to play with it. However, it seems like some sub-VIs are missing from your attached VIs.

Message 5 of 10
@Mike no offense taken, glad you came to realize I'm not a noob.

@Deppsu I'm pretty sure all the VIs are there other than OpenG, which I didn't mention as a requirement, but personally I think no LabVIEW developer should be without OpenG.
Message 6 of 10

Update: LabVIEW 2015 now has a defrag with an option to read progress.  A shipped example in the Example Finder shows how to query the progress of a defrag.  Also, defrag times have been improved in 2015.

Message 7 of 10

That is a cool new feature.  Is it possible to back-save the new defrag to LV2014 for those of us who have not upgraded yet?

Message 8 of 10

Maybe, but with a decent amount of work. 

 

This isn't a purely G implementation. It relies on the TDMS DLL, which on my machine is at the following path:

 

C:\Program Files (x86)\National Instruments\Shared\tdms\tdms.dll

 

To support this progress bar, new functions had to be added:

 

TdsFileSetDefragmentProgressSwitch_NT

and

TdsFileSetDefragmentProgressSwitch

 

So if you were to update your TDMS DLL, or confirm that your version already has the added functions, then a back-saved VI might work.  Attached is the example shipped in 2015, back-saved to 2014.

 

EDIT: Reattached the code because I forgot the vi.lib VIs.
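
If you want to check whether your installed DLL has the two functions before trying the back-saved VI, something like the following Python sketch could work. It only looks the names up and never calls them, since I don't know their signatures. Treat it as an assumption-laden sketch: the helper names are made up, the DLL has to be loadable from your Python (a DLL under Program Files (x86) is presumably 32-bit and would need a 32-bit Python), and the exports have to be visible under exactly these undecorated names.

```python
import ctypes

# Export names taken from the post above.
REQUIRED_EXPORTS = (
    "TdsFileSetDefragmentProgressSwitch_NT",
    "TdsFileSetDefragmentProgressSwitch",
)


def missing_exports(library, names=REQUIRED_EXPORTS):
    """Return the names that cannot be looked up on the loaded library.

    ctypes raises AttributeError when a symbol is not exported.
    """
    missing = []
    for name in names:
        try:
            getattr(library, name)
        except AttributeError:
            missing.append(name)
    return missing


def check_tdms_dll(path):
    """Load the DLL at `path` (raises OSError if it cannot be loaded) and list missing exports."""
    return missing_exports(ctypes.CDLL(path))
```

For example, `check_tdms_dll(r"C:\Program Files (x86)\National Instruments\Shared\tdms\tdms.dll")` returning an empty list would suggest both functions are present in that DLL.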

Message 9 of 10

I guess it's time to install LV 2015 🙂

 

Actually I just tried the example in LV2014 and it worked!

Message 10 of 10