
LabVIEW


Trying to merge large files

Solved!
Go to solution

Hi,

I am trying to merge several large files into a single file. Each file contains several hundred megabytes of data, in several columns of tab-delimited text files.

If I just open them, merge them in a concatenating For Loop, and write a new file, I constantly get a "not enough memory" error.

What is an easy way to merge several files into one while avoiding the memory overload issue? Each file contains 4 columns with millions of rows of data; the final file should have the same number of rows as before, but with each file's columns appended at the end.

Thank you for your help

Message 1 of 11

Adding columns to a text file is difficult to impossible.

 

If you must use a text file, then I suggest the following (it may not work):

Open the first file and convert it to binary; close the file. Open the second file, convert it to binary, and append it to the binary array. Repeat for the third file.

 

Now you have a binary array of points. Convert it to text in chunks, say 100k rows at a time, and append each chunk to the new text file.
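A rough Python sketch of this approach (the file paths and the 4-column tab-delimited layout are assumptions from the original post); parsed numeric rows stand in for LabVIEW's in-memory binary array:

```python
# Sketch of the "convert to binary, then write text in chunks" idea above.
# All input data is held in memory as numbers, but the text output is
# written a chunk of rows at a time. File names are hypothetical.

def load_columns(path):
    """Read a tab-delimited text file into a list of rows of floats."""
    with open(path) as f:
        return [[float(v) for v in line.split("\t")] for line in f if line.strip()]

def merge_to_text(in_paths, out_path, chunk_rows=100_000):
    tables = [load_columns(p) for p in in_paths]   # the "binary array" of points
    n_rows = len(tables[0])
    with open(out_path, "w") as out:
        for start in range(0, n_rows, chunk_rows):
            lines = []
            for i in range(start, min(start + chunk_rows, n_rows)):
                merged = [v for t in tables for v in t[i]]  # columns side by side
                lines.append("\t".join(repr(v) for v in merged))
            out.write("\n".join(lines) + "\n")
```

As noted, this still needs enough memory to hold every file's numbers at once; it only avoids building the giant output string in one piece.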

 

mcduff

Message 2 of 11

@LukasSc wrote:

Hi,

I am trying to merge several large files into a single file. Each file contains several hundred megabytes of data, in several columns of tab-delimited text files.

If I just open them, merge them in a concatenating For Loop, and write a new file, I constantly get a "not enough memory" error.

What is an easy way to merge several files into one while avoiding the memory overload issue? Each file contains 4 columns with millions of rows of data; the final file should have the same number of rows as before, but with each file's columns appended at the end.

Thank you for your help


This task would be better performed in Excel itself.

 

If you were clever you could:

  1. open several of the files
  2. record or program a macro of yourself appending a column from one sheet to another
  3. play back that macro for the rest
========================
=== Engineer Ambiguously ===
========================
Message 3 of 11

I would read all of the files one line at a time, merge everything into a single line, and write that line. Repeat until you reach the end of at least one of the files.
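A minimal Python sketch of this line-at-a-time merge (paths are hypothetical); `zip` stops as soon as the shortest file runs out, which matches the stopping condition above:

```python
# Line-by-line merge sketch: keep one open handle per input file, join
# corresponding lines with a tab, and stop at the end of the shortest file.
# Only one row per file is ever in memory at a time.
def merge_line_by_line(in_paths, out_path):
    files = [open(p) for p in in_paths]
    try:
        with open(out_path, "w") as out:
            for rows in zip(*files):                      # one line from each file
                merged = "\t".join(r.rstrip("\n") for r in rows)
                out.write(merged + "\n")
    finally:
        for f in files:
            f.close()
```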


GCentral
There are only two ways to tell somebody thanks: Kudos and Marked Solutions
Unofficial Forum Rules and Guidelines
"Not that we are sufficient in ourselves to claim anything as coming from us, but our sufficiency is from God" - 2 Corinthians 3:5
Message 4 of 11

Command window (or system exec)

copy [file 1]+[file 2]+..+[file n] [destination file] /Y

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems
Certified-LabVIEW-Developer
Message 5 of 11

I think copy will append them row by row (adding rows, not increasing the column count).

 

You may be able to hold all of your file references open and scan blocks from each file, using Spreadsheet String To Array (of strings) and appropriate Transpose and Build Array operations to go faster than line by line. But it seems likely you cannot read all of any one file at once and still make progress.
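A chunked Python sketch of the idea above (paths and chunk size are hypothetical): read a block of lines from each open file per iteration instead of a single line, merge the blocks column-wise, and append to the output.

```python
# Chunked merge sketch: per iteration, pull up to chunk_rows lines from each
# open file, stitch corresponding lines together with tabs, and append them
# to the output. Memory use is bounded by chunk_rows * number_of_files lines.
from itertools import islice

def merge_in_chunks(in_paths, out_path, chunk_rows=100_000):
    files = [open(p) for p in in_paths]
    try:
        with open(out_path, "w") as out:
            while True:
                blocks = [list(islice(f, chunk_rows)) for f in files]
                if not all(blocks):                       # some file is exhausted
                    break
                for rows in zip(*blocks):
                    out.write("\t".join(r.rstrip("\n") for r in rows) + "\n")
    finally:
        for f in files:
            f.close()
```

Larger chunks trade memory for fewer I/O calls, which is usually faster than strict line-by-line reading.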


GCentral
Message 6 of 11
Solution
Accepted by topic author LukasSc

OK, I think you have some ideas for a solution. But storing hundreds of megabytes of data as text files is not very good, and merging several of them into a gigabyte-sized text file is worse. What do you need it for? It will be extremely slow to use and extract data from. Consider using TDMS for storing that amount of data instead.

Certified LabVIEW Architect
Message 7 of 11

Wow, TDMS files do indeed seem promising. I might have to rewrite some of my data acquisition procedures, but it seems worth it. Thank you for pointing that out; I wasn't even aware of the format.

Message 8 of 11

@LukasSc wrote:

Wow, TDMS files do indeed seem promising. I might have to rewrite some of my data acquisition procedures, but it seems worth it. Thank you for pointing that out; I wasn't even aware of the format.


If you're willing to switch to TDMS files, you have the significant advantage of being able to write new columns separately.

That would allow you to parse entire files at a time, and just keep adding new columns (perhaps based on a header row in the existing files?) to the TDMS file.

In TDMS, these would probably be Channels (although you can consider using Groups to group similar measurements, perhaps to separate by source file, or however you wish).


GCentral
Message 9 of 11

TDMS really is the Koenigsegg and Rolls-Royce of file formats for storing data. It is simple to start using, versatile, rock-solid reliable, and faster than anything.

Certified LabVIEW Architect
Message 10 of 11