
Merging CSV files


Hi all,


I'm using LV to create multiple CSV files, and then I want to merge them into one CSV file (so basically csvfile1 would be columns 1-3, csvfile2 would be columns 4-5, etc.). Is this possible? I've found a lot of references that come close to this, but nothing that outright states whether it's doable or not.


Thanks in advance!



Message 1 of 4
Accepted by topic author Laura_H



Yes, it is doable.


It might be messy, depending on the details. Basically you need to read each file and convert it into a suitable array. Then you need to create the final array and place the data from each file into the appropriate places. Finally you write the array to the new file.


Arrays in LabVIEW are indexed from zero, so the first column is column 0. You suggest that csvfile1 has three columns while csvfile2 has only two. Do all the files have the same number of rows? Arrays in LV are always rectangular, so if csvfile1 has 3 columns and 10 rows, csvfile2 has 2 columns and 8 rows, and you want something else in column zero, your final array would have at least 6 columns and 10 rows. The last two rows in the csvfile2 columns would have default values (zero for numerics).



Message 2 of 4

Hi Lynn,


Thanks for your quick response!


I implemented what you said in a simple file that I was testing with. Each of the files will have the same number of rows but different numbers of columns. Not sure if this is the easiest way to do what you said, but it seems to get the job done. Thanks again!



Message 3 of 4



Generally the use of stacked sequence structures is discouraged because it obscures code, is very inflexible, and makes it hard to extend or modify the functionality of the program. In this case, since you are using one frame to generate test files and the other to convert them, it is not completely inappropriate.


I think you may have one problem with the second frame.  It does not show up in this simple test but might in a larger, more realistic use case.  The way you are indexing the Insert Into Array will become a problem if you ever have more columns than rows. Then the Array Minimum function would return the number of rows rather than the number of columns.  Use Index Array and select index 1.  That will always give you the number of columns.
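The pitfall with Array Minimum can be shown in text form with a small Python stand-in (the actual fix is the LabVIEW wiring described above; this just illustrates why taking the minimum of the two dimension sizes goes wrong):

```python
def column_count(table):
    """Number of columns of a 2-D table.
    Taking min(rows, cols) -- the Array Minimum approach --
    only works while rows >= cols; reading dimension 1
    directly (here: the row length) is always correct."""
    return len(table[0]) if table else 0

wide = [[1, 2, 3, 4, 5]]                    # 1 row, 5 columns
assert min(len(wide), len(wide[0])) == 1    # Array Minimum picks the row count
assert column_count(wide) == 5              # Index Array with index 1
```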


Next, consider the expansion to N source CSV files. If all the files are in one folder and they are the only files in that folder, then you can use the List Folder function from the Advanced File palette to get an array of all the filenames in the folder. By passing the array of names through an autoindexing terminal on a for loop, you can read the files one at a time, automatically. However, your present system will not know where to put the data from the i-th file. You could write to the final file and then re-read it each time, but that just beats on the hard drive.


The better approach would be to accumulate the data in a shift register in the loop and then do one write outside the loop after it finishes.  The best way to do that is to pre-allocate memory for the array outside the loop so the array does not grow continually inside the loop. Use Initialize Array for the pre-allocation. Inside the loop use Replace Array Subset. This does not change the size of the array or make a copy so it is quite efficient.  Keep the current column number in a shift register. On each iteration add the number of columns replaced to the index value.


To determine the number of columns for Initialize Array, you can multiply the size of the array of filenames by the maximum number of columns in any source file. This will produce an array which is larger than you need, since some of the source files will have fewer columns than the maximum. When the loop finishes, use Array Subset to get the portion with real data. The value in the index shift register tells you the subset size (possibly minus 1, depending on your bookkeeping).



Message 4 of 4