Labview performance with convolution math

NCLS · ‎08-29-2005

Hi,

If anyone is familiar with how LabVIEW performs calculations within nested for loops (in this case 4 nested for loops), I need help understanding why a sub-VI (screenshot attached) is consuming tons of CPU resources, thus causing the application to fail its timing requirement. The basic overview of the sub-VI's function is as follows:

An input 2D array containing a 1-pixel object (values range from 0x0000 to 0x0FFF) against a background (also 0x0000 to 0x0FFF) comes into the sub-VI, is indexed, and is convolved with a filter kernel. The filter kernel is a 1D floating-point array, but everything inside the nested loops is type-converted to signed 32-bit integers so that no coercion dots exist. The output is a 2D array with modified values for the 1-pixel object and its 8 neighboring pixels.

The computer that the application runs on is an IBM PC with a 3.0GHz processor and 512MB of RAM, with hyperthreading disabled in its BIOS. The overall application gets its input data via a serial port and outputs data on a 1394a FireWire channel. What I have observed is that as the input data rate increases from 1Hz to 10Hz and onward, CPU usage increases from 5-6% at 1Hz to 33% at 10Hz, and beyond a 50Hz input rate the CPU becomes saturated. On the output side I can see on the FireWire analyzer that data at higher rates fails to meet the timing requirement (100Hz data should be 10ms apart but is instead around 35ms or more apart as the input data rate increases).

I have experimented with the VI "Execution" properties by changing the priorities and unchecking the "Allow Debugging" checkbox to reduce any unnecessary overhead. I have also followed NI's application note to preallocate arrays outside of the loops and use only the Replace Array Subset node to avoid calling the memory manager. Am I doing anything stupid in LabVIEW?

If the underlying problem is LabVIEW's processing, the next step may be to perform the convolution in C and call it from LabVIEW through a CLN. If anyone has experience with this approach, could you comment on it?

Any comments or questions are appreciated. Thanks!!

Neal

NCLS · ‎08-29-2005

I forgot to mention that I ran the metrics, and the bulk of the processing time and memory usage is spent in this sub-VI. That's why I can pinpoint this function as the problem.

altenbach · ‎08-29-2005

(This thread belongs in the LabVIEW forum, not the feedback forum).

Well, if this VI does the bulk of the computations and all other VIs are just for the user interface, it is expected that it will use most of the CPU. It seems to shuffle a lot of bits around. 4kx4k is a big array!

Would you mind attaching the actual VI?

Somehow, I have the suspicion that this VI can be significantly simplified.... (btw: Why are you converting U16 to U32 to I32? Why not convert directly to I32? "Round to nearest" has no effect on integers. Personally, I would also get rid of the formula node.)

Message Edited by altenbach on 08-29-2005 09:50 AM

Message Edited by altenbach on 08-29-2005 09:51 AM

LabVIEW Champion.

NCLS · ‎08-29-2005

Oops, I just noticed the forum. Could an admin kindly move this thread to the LabVIEW board?

Now, attached is the actual VI. As for the suggestion to remove the formula node, I did that and didn't notice any increase in performance.

NCLS · ‎08-29-2005

Re-attaching the VI...

altenbach · ‎08-29-2005

Thanks,

Looking at it for a few minutes....

The formula node does the same calculation over and over. It needs to be done exactly once outside the loop. You can then multiply the 3x3 result with a 3x3 subset of the array. This eliminates the two inner loops.
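For readers without LabVIEW at hand, the idea can be sketched in text form. This is only a rough Python illustration, not the attached VI: `formula_weights` is a hypothetical stand-in for the formula node (the real expression is not shown in this thread, so an outer product of the 1D kernel is assumed), and the kernel and image values are made up. In LabVIEW the multiply is a single array primitive; the explicit loops below are just how plain Python spells it.

```python
def formula_weights(kernel):
    # Hypothetical stand-in for the formula node: the thread never shows the
    # real expression, so an outer product of the 1D kernel is assumed here.
    return [[kernel[r] * kernel[c] for c in range(3)] for r in range(3)]


def scale_block(image, row, col, weights):
    """Multiply the 3x3 block centred on (row, col) by the 3x3 weights, in place."""
    for dr in range(3):
        for dc in range(3):
            image[row - 1 + dr][col - 1 + dc] *= weights[dr][dc]


kernel = [0.5, 1.0, 0.5]            # made-up filter kernel
weights = formula_weights(kernel)   # evaluated once, outside any per-pixel loop

image = [[4.0] * 5 for _ in range(5)]   # made-up 5x5 frame
scale_block(image, 2, 2, weights)       # target pixel at the centre
```

The point is only where `formula_weights` is called: once per kernel, rather than once per element inside the innermost loop.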

The attached is a very rough, quick modification (LabVIEW 7.1). I have not tested it (I don't have any real data!), but it should give you some ideas. Please modify as needed so it gives the desired results. 😉 There are probably bugs and oversights....

Let me know if this makes sense 🙂


NCLS · ‎08-29-2005

Thank you, I will be trying and experimenting with it. Regardless, it helps to see how an experienced LabVIEW programmer implements code in a certain way.

NCLS · ‎08-30-2005

Just an update: with a couple of changes to where type conversion occurs, ConvolveTgtMOD.vi can now handle processing data at 200Hz. I guess it's a numbers game with LabVIEW where performance is concerned: the fewer times LabVIEW needs to copy or process the data, the better for tight timing requirements. Attached is the VI; at least now I can concentrate on the functionality of the application. Thanks!

altenbach · ‎08-30-2005

I am not sure why you are carrying the kernel as DBL, but maybe you need fractional numbers for it (would SGL be OK?). In this case, you should replace the "To I32" in the center of the big loop with a "To DBL". Right now you are converting the 3x3 U16 array to I32 and then immediately to DBL (notice the grey coercion dot on the multiply!). The middle step is just extra work and will not change the result. (You could even simply delete the "To I32", because the coercion will automatically convert it to DBL for the multiplication.)
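A quick way to convince yourself that the middle step changes nothing is to check every possible U16 value. The sketch below is plain Python using `ctypes` fixed-width types as an approximation of the LabVIEW conversions; the function names are made up for illustration.

```python
import ctypes


def via_i32(value):
    # U16 -> I32 -> DBL, mirroring the chain currently on the diagram
    u16 = ctypes.c_uint16(value).value
    i32 = ctypes.c_int32(u16).value
    return ctypes.c_double(i32).value


def direct(value):
    # U16 -> DBL in a single step
    u16 = ctypes.c_uint16(value).value
    return ctypes.c_double(u16).value


# Compare both paths over the full 16-bit range
same = all(via_i32(v) == direct(v) for v in range(0x10000))
```

Every U16 value fits in an I32 and in a DBL without loss, so both paths should agree for all inputs; the intermediate conversion only costs time.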

It depends on the size of the convolution area vs. the total image, but if they are similar, you should insert a "To DBL" right between the "Perfect Target" control and the loop. This way, each element is converted exactly once. If you do it inside the loop as described in the first paragraph, you do up to 9x more conversions. (This will only be OK if you typically convolve significantly less than 10% of the total image area. Do the math!)
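Here is that math as a small Python sketch. The image size and target counts are made-up numbers, and it assumes each convolved target converts its own 3x3 block, so overlapping blocks are converted repeatedly.

```python
def conversions_outside_loop(image_pixels):
    # One "To DBL" on the whole array before the loop: each pixel converted once
    return image_pixels


def conversions_inside_loop(target_pixels):
    # One "To DBL" on a 3x3 subset per convolved target: 9 conversions each
    return 9 * target_pixels


image_pixels = 100 * 100                 # hypothetical 100x100 image
whole = conversions_outside_loop(image_pixels)

few = conversions_inside_loop(500)       # 5% of the pixels are targets
many = conversions_inside_loop(2000)     # 20% of the pixels are targets

break_even = image_pixels / 9            # targets at which both cost the same
```

With these numbers, converting inside the loop wins at 500 targets (4500 vs. 10000 conversions) and loses at 2000 targets (18000 vs. 10000). The break-even point sits at roughly 11% of the pixels, which is where the "significantly less than 10%" rule of thumb comes from.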

If this is often called, you should also pre-compute the formula and embed the 9 values as a diagram constant. They never change! (I assume that the Filter Kernel input varies). See attached (LabVIEW 7.1).

I am still not sure why you need duplicate inputs for the perfect target. Are they different when called?

