I want to run CUDA operations from LabVIEW to multiply large matrices on the GPU and measure the execution time. I used the GPU Analysis Toolkit, but it gives no information about, or control over, the block and grid size. The execution time is certainly lower than on the CPU, but I need to measure GPU performance across different tile and block sizes, i.e. by setting dimGrid and dimBlock myself. Do I need to write a complete DLL on its own and import it into LabVIEW, without using the GPU toolkit VIs (LVCUBLAS, LVCUDA, ...)?
In other words, does the DLL replace the GPU toolkit, or is it used alongside it? Can anyone give me an example of a similar operation?
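
To make the question concrete, this is roughly the kind of standalone DLL I have in mind. It is only a sketch under my own assumptions: the exported name `matmul_timed`, its parameters, and the naive one-thread-per-element kernel are hypothetical, not anything from the GPU Analysis Toolkit. The block edge is passed in at run time so that dimBlock and dimGrid can be varied from the caller, and the kernel is timed with CUDA events.

```cuda
#include <cuda_runtime.h>

#ifdef _WIN32
#define EXPORT extern "C" __declspec(dllexport)
#else
#define EXPORT extern "C"
#endif

// Naive kernel: one thread computes one element of C = A * B
// (square n x n matrices, row-major storage).
__global__ void matmul_kernel(const float* A, const float* B, float* C, int n)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k)
            sum += A[row * n + k] * B[k * n + col];
        C[row * n + col] = sum;
    }
}

// Hypothetical entry point. hA, hB, hC are host buffers of n*n floats.
// `block` is the edge of the square thread block (dimBlock = block x block).
// On success returns 0 and writes the kernel time in milliseconds to
// *kernel_ms; otherwise returns -1 for bad arguments or a cudaError_t code.
EXPORT int matmul_timed(const float* hA, const float* hB, float* hC,
                        int n, int block, float* kernel_ms)
{
    // Assumption: at most 1024 threads per block, as on current GPUs.
    if (!hA || !hB || !hC || !kernel_ms) return -1;
    if (n <= 0 || block <= 0 || block * block > 1024) return -1;

    size_t bytes = (size_t)n * n * sizeof(float);
    float *dA = 0, *dB = 0, *dC = 0;
    cudaEvent_t start = 0, stop = 0;

    cudaError_t err = cudaMalloc((void**)&dA, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dB, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dC, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaEventCreate(&start);
    if (err == cudaSuccess) err = cudaEventCreate(&stop);

    if (err == cudaSuccess) {
        dim3 dimBlock(block, block);
        dim3 dimGrid((n + block - 1) / block, (n + block - 1) / block);

        // Time only the kernel, not the host/device copies.
        cudaEventRecord(start);
        matmul_kernel<<<dimGrid, dimBlock>>>(dA, dB, dC, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        err = cudaGetLastError();
        if (err == cudaSuccess) err = cudaEventElapsedTime(kernel_ms, start, stop);
    }
    if (err == cudaSuccess) err = cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    // Release whatever was created; cudaFree accepts a null pointer.
    if (start) cudaEventDestroy(start);
    if (stop) cudaEventDestroy(stop);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return (int)err;
}
```

My idea would be to build this with nvcc as a shared library and call `matmul_timed` from a Call Library Function Node, looping over `block` values such as 8, 16 and 32 and plotting the returned time. Is that the right approach, or can the same thing be done through the toolkit VIs?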