07-23-2009 11:32 AM - edited 07-23-2009 11:38 AM
I am in the process of replacing our "VXI-fiber optic slot0/bridge" with the "VXI-USB slot 0/bridge". The data throughput for large blocks seems to be fine, but the throughput for small requests is causing issues. I suspect latency, but want to consult with an applications engineer or expert to confirm. I would also like to see documentation comparing the latency of the USB solution with that of the fiber-optic and other bridges. Most importantly, I want to understand the root cause and learn any tips and tricks to overcome the issue.
Our system:
My application of the VXI chassis is in an automated test system. The system is mounted in a vehicle. Sampling is triggered off of an encoder on the vehicle wheel. We have up to 8 channels of data per VXI module, and up to 3 modules per complete system. Typically, each channel collects ~32 bytes of data per sample. The sampling rate (speed dependent) is anywhere from 0 to ~1500 Hertz. So the data needs are anywhere between 0 kB per second and (1500 Hertz * 8 channels * 32 bytes ~=) 400 kB per second per module, or about 1.15 MB per second with all 3 modules. On each channel is a 32kB FIFO (32 bits wide * 8k deep) with a half full flag and a "data status" register. To pull data from the channel, I read the data status register that tells me either:
- no data
- one (32 bytes) or more cycles of data
- half full FIFO (16kB) of data
To read the "data status" register and to pull one cycle of data from the FIFO, I map the VXI addresses and use viPeek32(). To pull in a block (16kB) of data from the FIFO, I use viMoveIn32() as recommended by the manual.
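As a quick check of the figures above, the following Python sketch multiplies out the peak data rate and the FIFO sizes. The constant names are only illustrative, and the 32 bytes per sample is the approximate value quoted above.

```python
# Peak data rate and FIFO sizes implied by the figures quoted in the post.
SAMPLE_RATE_HZ = 1500        # maximum encoder-triggered sample rate
MODULES = 3
CHANNELS_PER_MODULE = 8
BYTES_PER_SAMPLE = 32        # approximate, per channel

per_module_bytes_per_s = SAMPLE_RATE_HZ * CHANNELS_PER_MODULE * BYTES_PER_SAMPLE
total_bytes_per_s = per_module_bytes_per_s * MODULES

FIFO_BYTES = 4 * 8 * 1024    # 32 bits wide * 8k deep
HALF_FULL_BYTES = FIFO_BYTES // 2
CYCLES_PER_HALF_FIFO = HALF_FULL_BYTES // BYTES_PER_SAMPLE

print(f"per module : {per_module_bytes_per_s / 1000:.0f} kB/s")
print(f"3 modules  : {total_bytes_per_s / 1e6:.2f} MB/s")
print(f"half FIFO  : {HALF_FULL_BYTES} bytes = {CYCLES_PER_HALF_FIFO} cycles")
```

So the ~400 kB per second figure corresponds to one fully populated module, and a half-full FIFO holds 512 cycles of 32 bytes for one channel.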
The observed behavior:
Compared to the old fiber-optic system (which used the NI-VXI library and could dereference pointers directly in mapped windows), the USB mapped-address calls using viPeekXX() are very limited in data rate. By the design of our test system, I must pull 32 bits, then resample the "data status" register. I cannot assume a full 32 byte (4 reads deep) cycle is in the FIFO because the sampling hardware is asynchronous and may be in the middle of a scan.
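The read pattern described above can be sketched in Python against a simulated channel. Everything here is a stand-in: the status codes, the FakeChannel class and its method names are hypothetical, and no VISA calls are made. The sketch only counts how many bus transactions each path needs.

```python
from collections import deque

# Hypothetical status codes; the real register layout is not given in the post.
NO_DATA, HAS_DATA, HALF_FULL = 0, 1, 2
HALF_FULL_WORDS = 4096  # 16 kB expressed as 32-bit words


class FakeChannel:
    """Stand-in for one channel; counts transactions instead of doing I/O."""

    def __init__(self, words):
        self.fifo = deque(words)
        self.transactions = 0

    def read_status(self):
        # On real hardware this would be a viPeek32() of the status register.
        self.transactions += 1
        if len(self.fifo) >= HALF_FULL_WORDS:
            return HALF_FULL
        return HAS_DATA if self.fifo else NO_DATA

    def read_word(self):
        # On real hardware this would be a viPeek32() of the FIFO.
        self.transactions += 1
        return self.fifo.popleft()

    def read_block(self, count):
        # On real hardware this would be a single viMoveIn32() call.
        self.transactions += 1
        return [self.fifo.popleft() for _ in range(count)]


def drain(channel):
    """Poll the status, then read one word or one block, until empty."""
    out = []
    while True:
        status = channel.read_status()
        if status == NO_DATA:
            return out
        if status == HALF_FULL:
            out.extend(channel.read_block(HALF_FULL_WORDS))
        else:
            out.append(channel.read_word())


slow = FakeChannel(range(4))
drain(slow)
fast = FakeChannel(range(HALF_FULL_WORDS))
drain(fast)
print(slow.transactions, fast.transactions)
```

In this model, four words read one at a time cost nine transactions, while 4096 words read as a block cost three. If each transaction carries a fixed latency, that difference is the gap between the two paths.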
I have found the limit of the "poll data status and viPeek32()" method to be on the order of 10 Hertz sampling in our system: 4 reads / cycle * 2 viPeek32() calls / read * 24 channels * 10 Hertz = ~1900 viPeek32() calls / second. This number may not be exact, but I feel it is the right order of magnitude.
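Multiplying out the factors above gives the call rate and, from it, a rough per-call time. This Python sketch assumes the viPeek32() calls run back to back with nothing else consuming time, so the result is an upper bound on per-call latency rather than a measurement.

```python
# Call rate at the observed ~10 Hz limit, using the factors from the post.
READS_PER_CYCLE = 4
PEEKS_PER_READ = 2           # one status peek plus one data peek
CHANNELS = 24                # 3 modules * 8 channels
SAMPLE_RATE_HZ = 10

peeks_per_second = READS_PER_CYCLE * PEEKS_PER_READ * CHANNELS * SAMPLE_RATE_HZ

# Assumption: the loop does nothing but peek, so each call takes at most this long.
max_latency_us = 1e6 / peeks_per_second

print(f"{peeks_per_second} peeks/s, at most {max_latency_us:.0f} us per peek")
```

That works out to 1920 calls per second, or roughly half a millisecond per call at most.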
What happens is that anywhere between 10 and 1500 Hertz, I cannot read data via viPeek32() as fast as it stacks up in the FIFO/registers. So eventually the half full flag goes high, and then I can do a block read (a 16kB viMoveIn32()) and catch up. The funny thing is, the faster we go, the better the system flows. The problem is the region just above 10 Hertz, where it takes a long time until I know there is a block to read; consequently my data becomes more and more delayed, followed by one huge chunk of catch-up data.
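A simple fill model shows why the region just above 10 Hertz is the worst case. This Python sketch assumes the peek loop keeps draining at the equivalent of 10 Hertz while the FIFO fills at the sample rate, and that one cycle is 32 bytes. Both are simplifications of the behavior described above, and the function name is made up for illustration.

```python
HALF_FULL_BYTES = 16 * 1024
BYTES_PER_CYCLE = 32
PEEK_LIMIT_HZ = 10.0  # observed limit of the peek loop


def seconds_until_half_full(sample_rate_hz, drain_rate_hz=PEEK_LIMIT_HZ):
    """Time for the backlog to reach the half full flag, or None if it never does."""
    net_fill_hz = sample_rate_hz - drain_rate_hz
    if net_fill_hz <= 0:
        return None  # the peek loop keeps up, so no backlog builds
    cycles_to_flag = HALF_FULL_BYTES / BYTES_PER_CYCLE
    return cycles_to_flag / net_fill_hz


for rate in (11, 20, 100, 1500):
    print(f"{rate:5d} Hz -> {seconds_until_half_full(rate):7.2f} s")
```

Under these assumptions the flag takes minutes to rise at 11 Hertz but about a third of a second at 1500 Hertz, which matches the observation that the system flows better the faster the vehicle goes.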
Why the behavior is detrimental to our system:
We count on a real time system. Also, on the PC end, the data is displayed as a scrolling scan to the operator. The scroll must be smooth and true to vehicle speed. I cannot achieve both real time display and smooth, true-to-speed movement at sample rates between the viPeekXX() limit and the viMoveInXX() threshold. Also, I cannot just sit back and wait for the block read, as this has to be a true-zero and near-zero capable system (with real-time display). The speed information comes to me from the VXI info, so I cannot know how much data to expect at any given moment either.
USB vs older fiber-optic bridge:
The fiber-optic bridge apparently did not have nearly as large a gap in performance between low-level (peek) one-at-a-time reads and high-level block reads. In fact, the old code never even used block reads; it used low-level reads for everything and kept up. (Note: the old system used the NI-VXI library and pointer dereferencing.)
-----
So this all leads me to believe that the USB system has significant latency (somewhere around 100 microseconds or more) and the older fiber-optic system did not. My guess at the moment is that the USB calls are more dependent on the operating system and must wait in line through various Windows calls, while the true memory mapping in the old system could bypass a lot of this (with direct pointer dereferencing into the mapped memory), resulting in a big latency discrepancy between the systems. Regardless, I need to know whether it is indeed latency, what the root cause is, and whether anything can be done to improve it many times over.
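One way to confirm the latency guess would be to time a large number of single register reads and average them. The Python sketch below shows only the timing harness. The function fake_register_read is a placeholder that sleeps for 500 microseconds and would have to be replaced by an actual register read on the bridge under test.

```python
import time


def mean_call_time_us(call, repeats=1000):
    """Average wall-clock time of one call, in microseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        call()
    elapsed = time.perf_counter() - start
    return elapsed / repeats * 1e6


def fake_register_read():
    # Placeholder only: stands in for a single-register read over the bridge.
    time.sleep(0.0005)


print(f"{mean_call_time_us(fake_register_read, repeats=200):.0f} us per call")
```

Running the same harness against both bridges, with the same register and the same repeat count, would give directly comparable per-call figures.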
Thanks and Regards,
Jeff
07-24-2009 11:37 AM
Hey Jeff,
As I mentioned in our earlier email exchange, we are looking into this on our end and will get back to you once we can track down the information. If another customer has seen this or has more information on it, I hope they will post on this forum.
Thank you and have a great day!
01-07-2010 05:10 PM
We ran fairly extensive benchmarks a while back and found that, in general, there is significant latency with USB as well as Ethernet VXI controllers. PCIe or MXI were much faster.