05-17-2006 05:36 AM
I have a 300 MB binary file (actually it is a 2D array of DBL) that I need to transpose in order to read it faster in the next step.
If I read the data directly from the file, calculating the location of the next number to retrieve on the fly, it takes an extremely long time.
If I read the whole 300 MB into memory in order to use "Transpose Array" and then save it to a new file, LabVIEW gives me a "Memory Full" error.
Any ideas how I could convert such a large data file?
05-17-2006 07:28 AM
Is the data always the same size, or is it variable? Since you are trying to transpose the data, it seems that it is arranged in a fixed width x height. The only real solution to me would be to do it all with file pointers: have one pointer reading the source file down the columns while the other writes the rows to the destination file. It may be slower than doing it in memory, but it would remove the "Memory Full" errors. If the file is a fixed size it should not be too painful.
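In text form, the idea would look something like this (a Python sketch just to show the pattern, not LabVIEW; the row/column counts and the 8-byte DBL element size are assumptions you would have to supply):

ELEM = 8  # size of one DBL in bytes (assumed)

def transpose_with_pointers(src_path, dst_path, rows, cols):
    """Walk the source down each column; write the destination row by row."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for c in range(cols):                     # each source column becomes one output row
            for r in range(rows):
                src.seek((r * cols + c) * ELEM)   # jump to element (r, c) in row-major layout
                dst.write(src.read(ELEM))         # append it to the current output row

Every element costs a seek, so this is the slow end of the trade-off, but it never needs more than one element in memory at a time.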
The other solution is obvious: buy more memory. We are starting to see new PCs sold with 2 GB of RAM.
Matt
05-17-2006 07:36 AM
05-17-2006 08:39 AM - edited 05-17-2006 08:39 AM
Thank you for your suggestions.
1. The program has to work on any reasonable PC, so the trick of buying more memory will not work.
2. I have a huge swap size; it does not help.
3. The way I do it now: I read a number from one file at a specific place and save it to another file (transposing the data). It takes an hour and a half to convert a 300 MB file this way. This is too long, and I wanted to find a quicker way.
4. It looks like the only thing left to try is to "Create an array of the 'inverse' size" as suggested by LuI, but the problem is that if instead of a 300 MB file I have a 500 MB one (the size of the data is not fixed and depends on the model calculation that produces the results I need to analyse), then it will be a problem even just to load it into one array.
05-17-2006 10:34 AM
Try the redim.zip file located here: http://darkfader.net/toolbox/
It contains an exe file and a .cpp file. I have not run this program.
05-17-2006 10:42 AM
I still think the best approach is to leave the file on disk and use file pointers to read, transpose, and write the data. It sounds like a good challenge if you have the time to try it. If you are pressed for time, then use more direct methods for solving the problem.
Matt
05-17-2006 10:50 AM
@jonni wrote:
Thanks you for your suggestions,
3. The way I do it now: I read number from one file at the specific place and save it in another file (transposing data), it takes one and half hour to convert 300Mb file this way. This is to long and I wanted to find quicker way.
You should be able to dramatically speed this up by finding the right balance between memory usage and speed. You can easily read large chunks of adjacent elements corresponding to N columns (e.g. 10-25% of your data), transpose the subarray, and write it to the new file. Then go back and get the next chunk of columns, do the same, and append it to the output file. Repeat until all columns are processed.
For example, if you have an array:
abcd
efgh
ijkl
mnop
Read "ab, ef, ij, mn" with four read operations, transpose the 2D subarray, then write the rows:
aeim
bfjn
Repeat with the remaining columns and append.
To save memory, don't use any array indicators on the FP.
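Outside LabVIEW, the same chunked approach looks roughly like this (a Python sketch only, assuming a row-major file of 8-byte DBLs with known rows/cols; the chunk width is just a tuning knob, and the raw bytes are shuffled without ever being interpreted):

ELEM = 8  # size of one DBL in bytes (assumed)

def transpose_in_chunks(src_path, dst_path, rows, cols, chunk_cols=256):
    """Transpose by reading a band of adjacent columns at a time, never the whole file."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for c0 in range(0, cols, chunk_cols):
            width = min(chunk_cols, cols - c0)
            band = []                                 # 'rows' strips of 'width' elements each
            for r in range(rows):
                src.seek((r * cols + c0) * ELEM)      # one read per row grabs 'width' adjacent elements
                band.append(src.read(width * ELEM))
            for j in range(width):                    # source column (c0 + j) becomes one output row
                dst.write(b"".join(strip[j * ELEM:(j + 1) * ELEM] for strip in band))

The wider the chunk, the fewer passes you make over the source file, at the cost of holding a bigger band in memory; 10-25% of the data, as suggested above, is a reasonable starting point.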
05-17-2006 10:52 AM - edited 05-17-2006 10:52 AM
05-17-2006 10:58 AM
Agreed, in most cases disk operations are a thousand or more times slower than memory. In this case, though, the first priority is to make the application work on any computer; the first goal is always to make it work on the target.
And since the file is so large, the memory would be paging to disk anyway, so it has to hit the hard drive; any improvement from running this in memory is negated by the disk caching. At least this way the program will run on any PC.
Just an opinion,
Matt
05-17-2006 01:09 PM
Thank you all for the suggestions.
I am now reading a whole column from one file and saving it as a row to another file, instead of reading number by number. This didn't increase the speed much.
I guess I have to live with that 🙂