LabVIEW


Fast processing of mixed representation binary files


@Taki1999 wrote:

Sounds like you've got a pretty good handle on it.

Here's how I'd do it, assuming that I'd downcast back to U8 when necessary at a later point.

[Attached image: snippet-ParseBinaryFile.png]


That's a very concise-looking method to join the numbers. I'll check that out, thank you 🙂

 

Aside: yes, your definition cluster is very similar to what I already have in my main program. I already have such a processor working with delimited text files of arbitrary data types (encoded as text); implementing binary data types is the next (requested) step.

 

What are your thoughts on parallelizing that loop in your example? Will there be a large overhead from (multiple) memory copies of the U8 2D array?

 

 

Message 11 of 13

@nemi wrote:

 

What are your thoughts on parallelizing that loop in your example? Will there be a large overhead from (multiple) memory copies of the U8 2D array?

 

 


The profiler says that the loop isn't parallelizable due to the shift register.

We could make it parallelizable by adding the column index to the column definition cluster and using that instead of the shift register.
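Since LabVIEW is graphical, here is a rough textual sketch of that idea in Python, offered only as an illustration. It assumes each row of the U8 2D array is one fixed-width record and that every column definition carries its own byte offset; the names `ColumnDef`, `with_offsets`, `parse_column` and `parse_all` are hypothetical and not taken from the snippet. The running offset (the shift register) is computed once up front, after which each column can be decoded without depending on any other iteration.

```python
import struct
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnDef:
    name: str    # hypothetical column label
    fmt: str     # struct format code, e.g. "<H" = little-endian U16
    offset: int  # byte offset of this column inside one row


def with_offsets(specs):
    """Build column definitions from (name, fmt) pairs.

    The running offset is the only sequential step; storing it in the
    definition removes the loop-carried dependency later on.
    Returns (definitions, row size in bytes).
    """
    defs, offset = [], 0
    for name, fmt in specs:
        defs.append(ColumnDef(name, fmt, offset))
        offset += struct.calcsize(fmt)
    return defs, offset


def parse_column(rows, col):
    """Decode one column from every row; rows are only read, never modified."""
    return [struct.unpack_from(col.fmt, row, col.offset)[0] for row in rows]


def parse_all(rows, defs):
    """Decode every column independently, one task per column."""
    with ThreadPoolExecutor() as pool:
        columns = pool.map(lambda col: parse_column(rows, col), defs)
        return {col.name: values for col, values in zip(defs, columns)}
```

The thread pool here only demonstrates that the per-column work is independent; in CPython it is unlikely to give a real speed-up, whereas a LabVIEW parallel For Loop would distribute the iterations across cores.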

 

How large of data sets are we talking about? What sort of machines are you running on?

It's been a long time since I've worked with large data sets that had to be chunked on multiple processors in parallel.

I'd see what the performance of the standard algorithm is before expending the additional effort to optimize for parallel operation.

Message 12 of 13

 


@Taki1999 wrote:

The profiler says that the loop isn't parallelizable due to the shift register.

We could make it parallelizable by adding the column index to the column definition cluster and using that instead of the shift register.

 

How large of data sets are we talking about? What sort of machines are you running on?

It's been a long time since I've worked with large data sets that had to be chunked on multiple processors in parallel.

I'd see what the performance of the standard algorithm is before expending the additional effort to optimize for parallel operation.


Gigabytes of 300+ column data. 😛

 

I already load my files in chunks to parse them and have successfully used parallel loops on the text-file-based data (2D string array parsing).
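For the binary case, chunked loading could look roughly like the following Python sketch. It is only an illustration under stated assumptions: records are fixed-width, the row size is known in advance, and the stream returns full reads until end of file (as a regular file or `io.BytesIO` does). The function name `iter_row_chunks` and its parameters are hypothetical.

```python
def iter_row_chunks(stream, row_size, rows_per_chunk):
    """Yield lists of fixed-width rows read from a binary stream.

    Each chunk holds at most rows_per_chunk rows. A trailing partial
    row at the end of the stream is dropped (an assumption of this sketch).
    """
    chunk_bytes = row_size * rows_per_chunk
    while True:
        buf = stream.read(chunk_bytes)
        if not buf:
            break
        usable = len(buf) - len(buf) % row_size  # ignore an incomplete last row
        if usable:
            yield [buf[i:i + row_size] for i in range(0, usable, row_size)]
```

Each yielded chunk can then be handed to whatever per-column parser is in use, so that only one chunk of a multi-gigabyte file is held in memory at a time.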

 

The major speed-up came from using parallel for loops for multi-threaded data mining of the 100+ individual 1D binary array files that result from the first parsing step (we parse once to 1D binary files and analyze them multiple times).

 

 

 

BTW, to answer my own question:

By the looks of it, there should be no memory overhead for reading from a source array in multiple places:

 

http://zone.ni.com/reference/en-XX/help/371361J-01/lvconcepts/vi_memory_usage/

 

 

"...However, in this case the Index Array function does not modify the input array. If you pass data to multiple locations, all of which read the data without modifying it, LabVIEW does not make a copy of the data. As a result, all the data is in-place....."
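As a loose analogy only (this is Python, not LabVIEW, and says nothing about how the LabVIEW compiler decides on buffer copies), the same read-without-copy idea can be sketched with `memoryview` slices, which reference the original buffer instead of duplicating it. The helper names `split_views` and `checksum` are hypothetical.

```python
def split_views(buffer, parts):
    """Return `parts` equal-length read windows onto `buffer` without copying.

    Any remainder bytes at the end are left out of the windows.
    """
    view = memoryview(buffer)
    step = len(view) // parts
    return [view[i * step:(i + 1) * step] for i in range(parts)]


def checksum(view):
    """A read-only consumer: sums the bytes it can see."""
    return sum(view)
```

Every window returned by `split_views` reports the original object through its `.obj` attribute, so several readers can work on different parts of one source array while only a single copy of the data exists.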

 

Message 13 of 13