I am researching the use of Nvidia GPUs in pattern detection algorithms. OpenCV has two versions of the Hough circle detection algorithm, one for the CPU and one for the GPU, and I want to compare their execution times. My main application is written in LabVIEW, and I run OpenCV code and custom CUDA code from DLLs through the Call Library Function Node (CLFN). I have no problems calling the standard OpenCV functions. I can also call my own CUDA functions, using the available LabVIEW blocks to initialize/deinitialize the GPU, allocate/deallocate memory, and upload/download data. The main part of the code runs as a CLFN call. In general, I am following the approach described in LabVIEW GPU Analysis Toolkit - Calling Custom GPU Functions.
The problem appears when I try to use the OpenCV GPU functions from LabVIEW. My application always crashes with a memory access fault, on the line that calls the main OpenCV function. I suspect it is related to how I handle data that is already in GPU memory: it is allocated and uploaded by the LabVIEW functions, but I have to convert it to the OpenCV type GpuMat. Maybe someone has already done something like this. If not, I can describe the whole process, but it is quite complicated and would take a lot of time, both for me and for whoever tries to help... But of course I can do it ;).
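To narrow this down, one thing I could check before the conversion is whether the buffer description I pass along is consistent. GpuMat has a constructor that wraps user-allocated memory and takes rows, cols, type, the data pointer, and a step in bytes, so a wrong row step or a null pointer would be an obvious candidate for an access fault. The following is only a minimal sketch of such a check in plain C++; the `DeviceImage` struct, the `WrapStatus` codes, and `checkDeviceImage` are hypothetical names for illustration, not part of OpenCV or the LabVIEW toolkit.

```cpp
#include <cstddef>

// Hypothetical descriptor of a 2-D buffer that already lives in GPU memory.
struct DeviceImage {
    const void* data;      // device pointer handed over by the caller
    int rows;              // image height in pixels
    int cols;              // image width in pixels
    std::size_t elemSize;  // bytes per pixel, e.g. 1 for 8-bit grayscale
    std::size_t step;      // bytes per row, including any padding
};

// Hypothetical status codes returned by the check.
enum WrapStatus {
    WRAP_OK = 0,
    WRAP_NULL_POINTER = 1,
    WRAP_BAD_SIZE = 2,
    WRAP_BAD_STEP = 3
};

// Verify that the descriptor is plausible before wrapping it in a matrix header.
inline WrapStatus checkDeviceImage(const DeviceImage& img) {
    if (img.data == nullptr) {
        return WRAP_NULL_POINTER;
    }
    if (img.rows <= 0 || img.cols <= 0 || img.elemSize == 0) {
        return WRAP_BAD_SIZE;
    }
    // A row can never be shorter than its pixels; padding may make it longer.
    if (img.step < static_cast<std::size_t>(img.cols) * img.elemSize) {
        return WRAP_BAD_STEP;
    }
    return WRAP_OK;
}
```

This does not prove that the pointer is a valid device address; it only rules out inconsistent sizes before the data reaches the OpenCV call.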
To find the cause of my problem, I implemented a test application in C++ with CUDA and OpenCV. It works correctly, but only as a standalone application (run from the exe file).
The same may be true in LabVIEW, so I am trying to build the application there as well. I included the required DLL files, and the application starts without asking for them and without any errors. But when I click the button that calls the external CUDA function, nothing happens: no effect and no errors. The error output of the Call Library Function Node does not show an error either. What could cause this behavior?
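To make a silent failure like this visible, one option I am considering is to have the exported function return a status code and write a message into a buffer supplied by the caller, instead of relying only on the node's error output. Below is a minimal sketch under that assumption. The names `detectCircles`, `runDetection`, and `copyMessage` are hypothetical, and `runDetection` is only a stand-in for the real CUDA/OpenCV work. On Windows the entry point would additionally need to be exported from the DLL, for example with `__declspec(dllexport)`.

```cpp
#include <cstdint>
#include <cstring>
#include <exception>
#include <stdexcept>

// Copy a message into a caller-owned buffer and always NUL-terminate it.
static void copyMessage(const char* msg, char* buf, int32_t bufLen) {
    if (buf == nullptr || bufLen <= 0) {
        return;
    }
    std::strncpy(buf, msg, static_cast<std::size_t>(bufLen) - 1);
    buf[bufLen - 1] = '\0';
}

// Hypothetical stand-in for the real GPU work; throws on invalid input.
static void runDetection(int32_t width, int32_t height) {
    if (width <= 0 || height <= 0) {
        throw std::invalid_argument("image size must be positive");
    }
}

// Entry point intended to be called from outside the library.
// Returns 0 on success, 1 for a known error, 2 for an unknown one.
extern "C" int32_t detectCircles(int32_t width, int32_t height,
                                 char* errBuf, int32_t errLen) {
    try {
        runDetection(width, height);
        copyMessage("", errBuf, errLen);  // clear any stale message
        return 0;
    } catch (const std::exception& e) {
        copyMessage(e.what(), errBuf, errLen);
        return 1;
    } catch (...) {
        copyMessage("unknown error", errBuf, errLen);
        return 2;
    }
}
```

With this shape, the calling side can display the returned code and message after every call. Note that a hardware access violation is not a C++ exception, so this pattern would only surface errors that the library itself throws or detects.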