My C language control application calls the DAQmx API 200 times a second to update some AOs, using the following call:
DAQmxWriteAnalogF64(taskHandle,1, TRUE, 0.0, DAQmx_Val_GroupByScanNumber, data, NULL, NULL);
On a system with a single cDAQ-9178 chassis and two NI 9264 AO modules, it works correctly with a task of up to 4 channels. If I add channels to the AO task, the output rate slows down below 200 Hz (5 ms). Here are the timings:
How can I write 1 sample per channel, over 32 channels, every 5 ms? It is a control app which responds to inputs, so I cannot write multiple samples per call.
I did some quick measurements and my performance numbers pretty much match yours. The problem is that, for static AO tasks, the data for each channel is sent separately, as seen in your data. It's something that could definitely be optimized. I'd recommend filing a feature request (which puts it on NI's radar), but that doesn't help you right now.
Normally I would tell you to configure a hardware-timed analog output task at 200 Hz, disable write regeneration, and write your output values as you calculate them. Unfortunately, you will not be able to sustain your task at that rate without underflowing. cDAQ is not really set up for hardware-timed single point.
Short of future optimizations, I do not think it is currently possible to achieve what you want with 32 channels at 200 Hz.
I'm afraid the latency on Ethernet is worse than on USB, which would make it even less likely that you can achieve what you are trying to do. For hardware-timed single-point applications, I would highly recommend looking at the PCI / PXI / PCIe / PXIe buses. In addition, for faster rates, you will likely require a real-time operating system.
I'm only asking for 200 Hz! Sending 32 16-bit values every 5 ms requires a throughput of 100 kbps. That's much less than 1% of the capacity of Ethernet or USB. I find it hard to believe that the most modern NI technologies could not achieve this, if the driver writers only tried.
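As a quick check of the arithmetic above, here is a minimal C sketch. The function names are made up for illustration, and the 100 Mbit/s link rate used in the usage note is an assumed nominal figure, not something measured in this thread. Only sample payload is counted; protocol overhead is ignored.

```c
#include <assert.h>

/* Payload throughput in bits per second for single-sample updates.
   Counts sample data only; protocol overhead is ignored. */
long payload_bits_per_second(long channels, long bits_per_sample,
                             long updates_per_second)
{
    assert(channels > 0 && bits_per_sample > 0 && updates_per_second > 0);
    return channels * bits_per_sample * updates_per_second;
}

/* Payload as a percentage of a nominal link rate (bits per second). */
double percent_of_link(long payload_bps, double link_bps)
{
    assert(link_bps > 0.0);
    return 100.0 * (double)payload_bps / link_bps;
}
```

payload_bits_per_second(32, 16, 200) gives 102400 bit/s, i.e. roughly 100 kbps, and percent_of_link(102400, 100e6) is about 0.1% of an assumed 100 Mbit/s link. This only confirms that raw bandwidth is not the limiting factor.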
I think the important thing to note about bus protocols, such as USB, Ethernet, and PCI, is that throughput is not the same thing as latency. For higher-latency bus protocols, you have to send a larger number of samples across the bus at a time to account for the latency and handle higher throughput. So, when doing hardware-timed acquisition / generation, we can get much higher throughput by intelligently sending larger chunks of samples each time we send data across the bus. With hardware-timed single-point applications, you cannot do this because you have to send each sample, as fast as you can, across the bus. Our R&D group diligently works with bus protocols to ensure every optimization with the protocol is made to give you the most performance possible. It just turns out that USB and Ethernet bus systems are not designed for hardware-timed single-point applications.
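To make the throughput-versus-latency point concrete, here is a rough C model. It assumes the cost of one update is simply the number of separate bus transfers times a fixed per-transfer latency. The 1.1 ms figure in the usage note is a hypothetical value chosen only to be consistent with the report that 4 channels fit in 5 ms and more do not; it is not a measurement, and the function names are illustrative.

```c
#include <assert.h>

/* Time for one update when it takes `transfers` separate bus transfers,
   each costing `latency_s` seconds (assumed fixed and independent of size). */
double update_time_s(int transfers, double latency_s)
{
    assert(transfers > 0 && latency_s > 0.0);
    return (double)transfers * latency_s;
}

/* Highest sustainable update rate under the same model, in Hz. */
double max_update_rate_hz(int transfers, double latency_s)
{
    return 1.0 / update_time_s(transfers, latency_s);
}
```

With an assumed 1.1 ms per transfer, max_update_rate_hz(4, 0.0011) is above 200 Hz, max_update_rate_hz(5, 0.0011) is already below it, and max_update_rate_hz(32, 0.0011) is under 30 Hz. If all 32 channels travelled in a single transfer, the same model gives max_update_rate_hz(1, 0.0011), which is above 900 Hz. Under this model the limit is the number of round trips, not the number of bytes.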
I am not using hardware timed generation. As I stated above, I am using 1 Sample (On Demand) Generation Mode.
The strange thing about all this is, as you can see in my API call above, I set the timeout to zero. But the more channels in the task, the longer it takes the API call to return to my code (see timing table above). When it does return, it signals no error. So it must be waiting for something but it should not wait because timeout is zero.
My apologies, I didn't see the On-demand, but the effect is still the same. You are updating the output a single sample at a time. Therefore, the bus is still only transferring a single sample per channel at a time (which is inefficient on a higher-latency bus). Since the driver doesn't know the rate at which you want to update in an on-demand task, it cannot intelligently package your output samples together when it sends them across the bus. It has to send them as they are presented (on demand).
As for the behavior you are seeing with multiple channels, I agree that this is strange and R&D is looking into the behavior to see if they can improve it. Have you tried using a hardware-timed output task with regeneration to see if you can get faster updates? This way you could update the channel with a voltage and have it regenerate (regeneration is on by default) that same voltage on each clock edge until you presented it with a new voltage. This would effectively be the same as on-demand, except that your update would be delayed by up to one sample clock period. My hope is that the multiple-channel behavior you are seeing only exists on software-timed task types and not on our hardware-timed tasks.
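The extra delay from waiting for the next clock edge can be estimated with a small C sketch. It assumes clock edges at 0, T, 2T, ... with T = 1 / rate, and it deliberately ignores bus transfer time and any buffering in the driver or device, which may well dominate on USB or Ethernet. The function name is illustrative.

```c
#include <assert.h>

/* Time from a write at t_write_s until the next sample clock edge,
   for edges at 0, T, 2T, ... where T = 1 / clock_rate_hz.
   A write that lands exactly on an edge has zero delay. */
double delay_to_next_edge_s(double t_write_s, double clock_rate_hz)
{
    assert(t_write_s >= 0.0 && clock_rate_hz > 0.0);
    double edges = t_write_s * clock_rate_hz; /* elapsed time in clock periods */
    long long k = (long long)edges;           /* truncates toward zero */
    if ((double)k < edges) {
        k++;                                  /* round up to the next edge */
    }
    return (double)k / clock_rate_hz - t_write_s;
}
```

For a 600 Hz sample clock the period is about 1.67 ms, so a write at t = 1 ms waits about 0.67 ms under this model; at 200 Hz the worst case approaches 5 ms.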
I have tried it in Continuous Samples Mode at 600 Hz with buffer sizes of 3 and 600, but I keep getting a -200292 DAQmxErrorSamplesCanNotYetBeWritten error. I will keep trying with other settings, if you think it will work.