02-11-2014 03:00 PM
That is a huge improvement! The OpenCL FFT gives you the option to run in single precision, which LabVIEW doesn't, and that is even better if you don't need the extra precision.
02-11-2014 03:01 PM
Let's just say that trying to force a bigger difference by changing to a width of 2048 didn't work out well. 🙂
(Graphics driver reset and VI hang)
/Y
02-11-2014 03:03 PM
Yeah, OpenCL on the GPU depends on how much memory the card can allocate. If you scroll down in the device properties, they will give you a heads-up about the limit, but nothing in the VIs enforces it.
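For a rough idea of whether a transform will fit, here is a minimal sketch in Python that estimates the buffer size and compares it to the device limits. It assumes a square 2D complex FFT with one input and one output buffer. The limit values are hypothetical placeholders that you would replace with the numbers from your own device properties (in OpenCL terms, `CL_DEVICE_MAX_MEM_ALLOC_SIZE` and `CL_DEVICE_GLOBAL_MEM_SIZE`). The function names are made up for illustration and are not part of the toolkit.

```python
# Rough memory estimate for a 2D complex FFT buffer.
# The helper names and the device limits below are hypothetical;
# read the real limits from your own device properties.

# A complex sample is two floats: 2 x 4 bytes (single) or 2 x 8 bytes (double).
BYTES_PER_COMPLEX = {"single": 8, "double": 16}


def fft_buffer_bytes(width, height, precision="single"):
    """Bytes needed for one width x height buffer of complex samples."""
    return width * height * BYTES_PER_COMPLEX[precision]


def fits_on_device(buffer_bytes, max_alloc_bytes, global_mem_bytes, n_buffers=2):
    """Check one buffer against the per-allocation limit and
    all buffers together against the total global memory."""
    return (buffer_bytes <= max_alloc_bytes
            and n_buffers * buffer_bytes <= global_mem_bytes)


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical limits, NOT taken from any real card.
    max_alloc = 128 * MIB
    global_mem = 512 * MIB

    for precision in ("single", "double"):
        size = fft_buffer_bytes(2048, 2048, precision)
        ok = fits_on_device(size, max_alloc, global_mem)
        print(f"{precision}: {size // MIB} MiB per buffer, fits={ok}")
```

This only counts the data buffers. An FFT implementation may allocate temporary buffers of its own, so treat the result as a lower bound.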
02-11-2014 03:10 PM
New run gave 32 ms and 44 ms, respectively.
Very nice improvement, and I'm in no way accusing you; I just thought the massive parallelism would yield more. 🙂
This will be fun to play with. 😄
/Y
02-11-2014 03:14 PM
The more you can put into the OpenCL kernel to avoid data transfers between device and host, and even accesses to global memory on the device, the better off you are. Check the OpenCL cheatsheet if you are interested in writing kernels; there is also a write-up in the manual.
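To see why this matters, here is a toy cost model in Python. It is only a sketch: it assumes every processing step costs the same, that each host/device copy costs the same, and that nothing overlaps. The function names and the example timings are hypothetical and are not measurements from this thread.

```python
# Toy cost model: round-tripping data for every step versus keeping
# it on the device. All timings are hypothetical illustration values.


def round_trip_ms(n_steps, transfer_ms, compute_ms):
    """Each step uploads its input, computes, and downloads its output."""
    return n_steps * (2 * transfer_ms + compute_ms)


def on_device_ms(n_steps, transfer_ms, compute_ms):
    """Upload once, run every step on the device, download once."""
    return 2 * transfer_ms + n_steps * compute_ms


if __name__ == "__main__":
    steps, transfer, compute = 3, 4.0, 1.0  # hypothetical values
    print("round trip per step:", round_trip_ms(steps, transfer, compute), "ms")
    print("kept on device:     ", on_device_ms(steps, transfer, compute), "ms")
```

When the transfer time dominates the compute time, as in the made-up numbers above, the copies set the total run time, which would explain a GPU speedup that looks smaller than the raw parallelism suggests.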
06-18-2018 05:38 PM
Why did you use Write/Read buffer instead of Map/Unmap? Map/Unmap will speed up the transfer because it uses DMA.
Can you update your source to use Map/Unmap?
06-20-2018 09:47 AM
Hey, thanks for using OpenCL for LabVIEW. I don't have the time to support this any more, but it is free on GitHub if you'd like to modify it. I would be happy to add you to the GitHub project so your changes can be merged into the main project.
07-01-2018 05:08 PM - edited 07-01-2018 05:08 PM
OK, I changed it to Map/Unmap. The speedup is up to 100x.
07-01-2018 08:09 PM
Thanks so much! I merged your changes into the main branch. Glad to see some folks using this!