08-10-2012 11:40 AM
I'm on LV 2010 and can't see your code, so I can only comment based on what I've read here. Given the things
that have already been suggested, it sounds like there's quite a bit of room for improving your throughput.
1. You refer to a need for real-time processing. What exactly drives this need and what are the specs? How much latency can you accept? If you're trying to do processing/decision making/outputs between individual samples at 250 kHz, you need to start thinking about an FPGA or going embedded.
2. If you're reading 1 sample at a time, DON'T unless *absolutely* necessary! The overhead of >200k calls per sec to DAQmx Read is likely the biggest factor limiting your processing speed.
I dunno your overall app and what all the bits mean, so maybe you can't do this. But if at all possible, I'd limit the pace of the DAQmx Read loop to something more like 100 calls per sec. That'll leave a much larger % of your time for actual processing rather than the overhead of the driver call.
3. You should then retry the producer-consumer approach now that you're working with arrays of data rather than scalars.
4. Yes, you're almost definitely better off preallocating an oversized array and then writing only the filled-in subset rather than using "build array", especially if you're building in a loop.
5. I don't know your board in detail but am not aware that it can do a pattern-based acquisition. Still, you might be able to use change detection to reduce the sampling and processing rate. Just realize that if the *time* of those changes is important, you'll also need a way to timestamp the change-detect event. This is easy enough to do with a DAQ counter, but I'm not sure if your board has any.
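Point 2 above (pacing DAQmx Read down to ~100 calls/sec) can be sketched in Python. This is only an illustration, since LabVIEW is graphical; `read_chunk` is a hypothetical stand-in for a DAQmx Read call, and the rates come from the numbers in this thread:

```python
SAMPLE_RATE = 250_000                   # board rate from the thread: 250 kHz
CALLS_PER_SEC = 100                     # target pace for DAQmx Read calls
CHUNK = SAMPLE_RATE // CALLS_PER_SEC    # 2500 samples per driver call

def read_chunk(n):
    """Stand-in for one DAQmx Read call returning n samples."""
    return [0.0] * n

def acquire(num_chunks):
    """Pay the driver-call overhead ~100x/s instead of 250,000x/s."""
    samples = []
    for _ in range(num_chunks):
        samples.extend(read_chunk(CHUNK))   # one call returns a whole chunk
    return samples
```

With these numbers, `acquire(4)` makes 4 driver calls to get 10,000 samples, where single-sample reads would have made 10,000 calls for the same data.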
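The producer-consumer split from point 3 looks roughly like this in text form (a Python sketch only; in LabVIEW this would be two loops linked by a queue, and the fixed chunk here just stands in for one DAQmx Read):

```python
import queue
import threading

q = queue.Queue(maxsize=64)     # bounded, so a stalled consumer is detectable

def producer(n_chunks):
    """Fast loop: only acquires and enqueues; no processing here."""
    for i in range(n_chunks):
        chunk = [float(i)] * 2500   # stand-in for one DAQmx Read of 2500 samples
        q.put(chunk)
    q.put(None)                     # sentinel: acquisition finished

def consumer(results):
    """Slower loop: dequeues whole chunks and does the heavy processing."""
    while True:
        chunk = q.get()
        if chunk is None:
            break
        results.append(sum(chunk) / len(chunk))  # placeholder processing

results = []
t1 = threading.Thread(target=producer, args=(10,))
t2 = threading.Thread(target=consumer, args=(results,))
t1.start(); t2.start()
t1.join(); t2.join()
```

The key property: the acquisition loop never waits on the processing, so the hardware buffer stays drained even when processing momentarily falls behind.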
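And point 4 (preallocate, then write the filled-in subset) in the same Python shorthand; in LabVIEW terms the first function is Initialize Array plus Replace Array Subset, the second is Build Array in a loop:

```python
def collect_prealloc(chunks, max_samples):
    """Preallocate once, write into place, return only the filled subset."""
    buf = [0.0] * max_samples            # one up-front allocation
    n = 0
    for chunk in chunks:
        buf[n:n + len(chunk)] = chunk    # in-place write, no resize
        n += len(chunk)
    return buf[:n]                       # only the filled-in portion

def collect_build(chunks):
    """Equivalent of Build Array in a loop: grows (and copies) every pass."""
    buf = []
    for chunk in chunks:
        buf = buf + chunk                # fresh allocation each iteration
    return buf
```

Both return identical data; the difference is that the first touches memory once while the second reallocates and copies the growing array on every iteration.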
-Kevin P
08-10-2012 11:46 AM
I could try using finite samples and just reading a chunk of the buffer rather than all the samples available on the buffer. I'll try that, thanks.
I tried pattern recognition using a trigger, but it seemed to result in a lot of dropped data. It's very important that certain words come in sequence, so any dropped data can quickly snowball into lots of dropped data.
I can accept a little latency as long as it doesn't compound. The problem I have now is that as the buffer keeps filling the processing gets more and more behind.
But I'll try reducing calls to DAQmx read. Thanks
08-10-2012 12:48 PM
Once again, the extra filtering needed when N samples are read at a time weighs things down more than reading a single sample does. I'll keep poking at it, though. At this point, I think that a hardware 0-screen in front of the DAQ cards is the only way to go.
08-10-2012 02:08 PM
That doesn't sound right. Even in a worst case scenario where the processing must look at every element individually, 1*(Read N) + N*(Process 1) should be able to run faster than N*(Read 1) + N*(Process 1).
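That inequality can be checked with a quick back-of-envelope model. The overhead and per-sample costs below are made-up placeholder numbers, not measurements; the point is that with any positive per-call overhead the chunked read wins:

```python
READ_OVERHEAD = 20e-6   # assumed fixed cost per DAQmx Read call, in seconds
PER_SAMPLE = 0.5e-6     # assumed per-sample processing cost, in seconds
N = 250_000             # samples per second, per the thread

one_big_read = 1 * READ_OVERHEAD + N * PER_SAMPLE        # read N, process N
many_small_reads = N * READ_OVERHEAD + N * PER_SAMPLE    # read 1 N times

# The processing term is identical in both; only the overhead term differs,
# so chunked reading is cheaper by (N - 1) * READ_OVERHEAD.
assert one_big_read < many_small_reads
```

With these placeholder numbers, the single-sample version spends several seconds of driver overhead per second of data, which is exactly the "buffer keeps filling and processing gets behind" symptom described earlier.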
Did you *also* try the producer-consumer split and work to avoid using "build array"? Can you post a screenshot of the processing? Can you capture based on change detection to reduce your data rate?
It's generally not easy to out-optimize LV's built-in thread management, but another avenue of approach might be to assign different cores to the DAQ part vs the processing part. You can do this with Timed Loop structures which allow you to specify a preferred core. Just watch out that you don't overconstrain the loop timing. And don't be surprised if things actually get worse when you try to override LV and manually optimize the cores.
-Kevin P
08-10-2012 02:16 PM
Yeah, it doesn't make sense to me either that none of these ideas have helped, especially the producer/consumer. I know the VI is fairly large, but can you post your latest attempts as the actual VIs?