Memory Problem - Split File

dan07 · ‎11-30-2008

Hello..

I created a VI to split large .txt (waveform) files. After starting it, I just need to specify the sample rate of the files and choose a folder. The VI lists all files in the folder and shows them to me in an array of clusters. For each file I can specify the start and end time for the split. Then I just click RUN and wait (hours) until all the files have been split.

The problem is that the files are around 100 MB, and after the VI has split some files I get the message "Not enough memory to complete this operation". I don't know where the problem is in my VI, but I think that I am creating copies of the data inside the VI.

Does anyone know where the problem is?

The VI is attached.

Thanks in advance

Dan07

TCPlomp · ‎12-01-2008

Some pointers:

Never build long-running code into an event structure; each event should finish in roughly 100 ms.
Use proper primitives:

Your code (in the sequence structure) will only work on Windows.
The same goes for the new name building:
There is no need to pass the number of files between the event cases; the array will determine the number of iterations of the for loop.
To get the path in the 'run' case I would use a local variable.
To get the start and end points of the array there is no need to build a second (very big) array; some easy math is enough.

Now on to the real task, I would figure out how much data is stored in one line (you only read the first column), read only the part that is necessary (with the code from the last bullet), and resave that data.
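As a rough text-form sketch of the "easy math" mentioned in the last bullet (written in Python only because a LabVIEW diagram cannot be shown as text): assuming the samples are uniformly spaced and stored one per line, the start and end times translate directly into a first index and a count. The function name `sample_range` and its parameters are hypothetical, not taken from the attached VI.

```python
def sample_range(start_s, end_s, sample_rate_hz, total_samples):
    """Return (first_index, count) of the samples between two times.

    Assumes uniform sampling, with sample 0 taken at t = 0 s.
    """
    if end_s < start_s:
        raise ValueError("end time must not precede start time")
    # Convert times to sample indices and clamp them to the file length.
    first = max(0, int(round(start_s * sample_rate_hz)))
    last = min(total_samples, int(round(end_s * sample_rate_hz)))
    return first, max(0, last - first)


# Example: minute 1 to minute 2 of a 10 kHz recording with 9 million samples.
first, count = sample_range(60, 120, 10_000, 9_000_000)
print(first, count)
```

No time array is built; only two integers are computed, so memory use does not depend on the file size.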

Ton


Free Code Capture Tool! Version 2.1.3 with comments, web-upload, back-save and snippets!
Nederlandse

LabVIEW user groep www.lvug.nl
My LabVIEW Ideas

LabVIEW, programming like it should be!

DFGray · ‎12-01-2008

I agree with Ton. There is no need to convert these points to binary and back to text. Read the input file in 65,000 byte chunks. Locate the first point in the file (be aware it may span a chunk boundary). Start streaming the same text to another file (once again in 65,000 byte chunks) while looking for your end point. Stop when you get to the end. No need to parse the whole file or have all the data in memory at once. Note that the 65,000 byte chunk size is Windows specific (the optimum on other platforms will depend on the OS and disk formatting). With a modern processor, you should be able to stream 5 - 10 MBytes/sec or more, depending on how well you implement it. Parallel producer/consumer loops are your friend.
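The chunked approach described above could look roughly like the following in a text language (Python here, as a stand-in for the LabVIEW file primitives). It is only a sketch under assumptions: the split points are given as line numbers, one value per line, and lines end in a newline byte. The name `split_lines` is hypothetical, and the producer/consumer parallelism is left out for brevity.

```python
CHUNK = 65_000  # chunk size suggested above for Windows


def split_lines(src, dst, first_line, line_count, chunk_size=CHUNK):
    """Copy line_count lines, starting at first_line, from src to dst.

    The text is streamed as raw bytes and never parsed into numbers.
    """
    to_skip = first_line
    to_copy = line_count
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while to_copy > 0:
            chunk = fin.read(chunk_size)
            if not chunk:
                break  # reached end of file
            pos = 0
            # Skip whole lines until the first wanted line is reached.
            while to_skip > 0:
                nl = chunk.find(b"\n", pos)
                if nl == -1:
                    pos = len(chunk)
                    break
                pos = nl + 1
                to_skip -= 1
            if to_skip > 0:
                continue  # start point lies in a later chunk
            start = pos
            # Count the lines to copy; a line may span a chunk boundary.
            while to_copy > 0:
                nl = chunk.find(b"\n", pos)
                if nl == -1:
                    pos = len(chunk)
                    break
                pos = nl + 1
                to_copy -= 1
            fout.write(chunk[start:pos])
```

Only one chunk is held in memory at a time, so the memory footprint stays near `chunk_size` regardless of how large the input file is.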

Osvaldo_Santos · ‎12-01-2008

See below a simple example to split binary files. It works well for files larger than 500 MB. You can improve it with the information posted by other users.

I hope it helps.

Regards,

Osvaldo Santos

RavensFan · ‎12-01-2008

Your picture didn't appear because it pointed to this location: http://forums.ni.com/ni/attachments/ni/170/preview/0/temp-118466-ni-170-8884-attachment. You have to submit the message with the attachment first, then edit the message to insert the image pointing to its permanent location on the forum servers.

Here is that image. It is located at http://forums.ni.com/ni/attachments/ni/170/371801/1/Split1.PNG

dan07 · ‎12-01-2008

TonP

I applied all your suggestions to my code, but I am still getting the message about memory problems.

- Since I removed the for loop (it was heavy) that was used to create a time series, do I still need to put my code in flat sequences?

- The new way to create the path is really very nice.

- I don't know how to implement the code to read only a part of the file. I read something about 65,000-byte chunks, but I don't know how to implement this in my code. Sorry about this, I am new to LabVIEW.

Attached is the updated version of the VI.

Thanks for help

Dan07

dan07 · ‎12-01-2008

DFGray

I read your material about how to manage large files in LabVIEW, but since I am a new LabVIEW user I was not able to implement it in my code. I made the changes suggested by Ton and the code is "lighter", but I still get the message about memory problems. What is the simplest way to modify my code and implement the 65,000-byte chunk approach?

I need to know this because in other VIs I load large files and plot graphs of the data, and I always have problems with slow cursors on the graphs due to the large amount of data.

In my last message I sent the file Split_modified.VI, which is the updated version.

Thanks in advance

Daniel

TCPlomp · ‎12-01-2008

Is the number of bytes per line always the same?

It would help if you could send a file with 20 lines or so.

Opening up the 'Read From Spreadsheet File.vi' and figuring out what is happening will help you as well.

Ton


dan07 · ‎12-01-2008

TonP

The files loaded don't always have the same size. They have around 9 million values in a single column, since the sample rate is 10 kHz and each file is 15 min long. The files are around 90 MB in size.

Attached is a file with 65,536 values in a single column, just as an example.

When I run the VI, the full error that I get is:

The first window, with an OK button, shows: Not enough memory to complete this operation

The second window shows:

Error 2 occurred at Write to Text in Write Spreadsheet String.vi->Write To Spreadsheet File (DBL).vi->Split_modified.vi
Possible reason(s):
LabVIEW: Memory is full.

I tried to use Read From Spreadsheet File and perform the operation on a single file, using constants instead of a control (for the sample rate) or an array (for the start and end times), but I had no success. I get the error messages:

LabVIEW: memory is full

VI "Split_modified2" was stopped at unknown " " at a call to "Split_modified2"

This VI is also attached.

Thanks for help and attention

Dan07

altenbach · ‎12-01-2008

You are making all this way too complicated. Since you start with a string and end with a string, there is absolutely no need to parse it into a DBL and back. Just operate on the strings directly.

Look inside the spreadsheet file VIs to see what they do!

So, you read a 90 MB file containing about 9M lines as a string, parse it into a 2D DBL array, transpose, slice out the first column, then (on writing) you make your 1D array into a 2D array, transpose it, convert it to a spreadsheet string, and write it to a file. In any case, you are creating a mind-boggling number of data copies in memory while constantly hopping between datatypes.

It looks like your input file is well behaved and has a fixed number of characters per line (is this guaranteed?). Thus your subset numbers can easily be translated into file offsets. All you need to do is read your file as a string from the calculated offset with the desired length and write the string back into a new file. Try it!
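A minimal text-form sketch of this offset idea, written in Python as a stand-in for the LabVIEW file functions. It assumes every line, including its line ending, occupies exactly the same number of bytes, which is the open question above. The names `bytes_per_line` and `copy_fixed_width` are hypothetical.

```python
def bytes_per_line(path):
    """Measure the width of the first line, including its line ending."""
    with open(path, "rb") as f:
        return len(f.readline())


def copy_fixed_width(src, dst, first_line, line_count, width,
                     chunk_size=65_000):
    """Copy line_count fixed-width lines starting at first_line.

    Only valid if every line in src is exactly `width` bytes long.
    """
    remaining = line_count * width
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # Jump straight to the first wanted line; nothing before it is read.
        fin.seek(first_line * width)
        while remaining > 0:
            chunk = fin.read(min(chunk_size, remaining))
            if not chunk:
                break  # the file ended before the requested range did
            fout.write(chunk)
            remaining -= len(chunk)
```

With a fixed width there is no searching at all: the start offset is `first_line * width` and the length is `line_count * width`. If the widths turn out to vary, the offsets will land mid-line and a line-counting approach is needed instead.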

LabVIEW Champion.
