Jeremy:
I have written several high-speed data acquisition systems for large
wind tunnels for the Korean gov't and for Ford Motor Co. (systems that
ran as fast as 100 frames of 64K data per second under Windows NT 4.0
on dual 200MHz Pentium Pro and later 500MHz P-II PC's), so I have some
useful tips to offer you about high speed data systems.
To begin with, I assume you are operating in a Windows (98/ME/XP) or
Windows NT (NT 4.0/2000/XP Pro) environment. You need to be aware
that these operating systems are not true real time operating systems
and WILL NOT operate reliably below about 10 ms without special effort.
WINDOWS IS NOT A REAL-TIME DETERMINISTIC O/S AND MICROSOFT WILL EVEN
TELL YOU SO ON THEIR WEBSITE.
They can be used as "soft" real time O/S's as long as you recognize
their limitations and don't try to use them for controlling or
measuring something where life or property is at a significant risk.
Windows 32 bit O/S's are pre-emptive multi-tasking operating systems
and are by definition non-deterministic.
This means that another application or the operating system can steal
your applications time-slice from it if it has higher priority than
your program. You can control priority to some extent in LabVIEW at
the execution settings window and/or the task manager.
Unfortunately, unless you write a kernel-mode (Ring 0) device driver,
the operating system can always steal your time slice from you, because
O/S code runs at a privileged level with higher priority than any
application code, which runs at Ring 3.
The Windows operating system has a time scheduler that switches
between running applications using a period of time that Microsoft
calls a "quantum". You can do a few tricks with this scheduler to make
things better. Typically a quantum is about 10-20 ms, depending on the
number of processors, the processor speed of your machine, and whether
you are running an NT derivative or one of the less capable 98
derivatives.
First, you can adjust the quantum to run longer or shorter (the
shortest under NT is about 1 millisecond, but this may not be
desirable, as there is a lot of CPU overhead in switching applications)
if this suits your need.
Second you can raise the priority of your application above other
applications so that they can't interrupt you (although the O/S still
can).
Finally, supposedly, even though your application may be a regular
(Ring 3) application, if you keep adjusting your own priority level,
the scheduler keeps resetting the clock on your quantum so that it
doesn't expire and switch to another application. (You would have to
make a Windows API DLL call to do this trick, so I don't know how easy
it would be from LV.)
There are real-time operating systems such as VxWorks with
deterministic, priority-based schedulers. This way each task knows
that it will always get its time slice when it is its turn, for a
known amount of time. If you are doing hard real-time acquisition, you
will probably need a real-time O/S like VxWorks.
You could look into embedded XP or Windows CE. I haven't investigated
them much but basically they are stripped down versions of Windows
designed to run embedded computers in consumer electronic devices like
TV's and stereo components, etc. They are supposed to be better than
plain jane desktop Windows or Server Windows for this type of data
acquisition function.
Microsoft has tried to push them for other lines of products but
industry has generally not moved the Bill Gates way in this matter.
If you could collect your data from a DOS-based application on a
machine running actual DOS 6.22 and then post-process the data later
on a Windows machine, this could be a very attractive alternative to
Win32. DOS is a single-threaded, mono-process, command-line based
operating system with relatively low latency and fairly high
determinism (compared to Windows). It is much simpler and more
reliable than Windows for these types of applications, and DOS 6.22
forms the basis of the O/S for many types of older embedded PC
products.
Avoid 16 bit Windows like the plague (Windows 3.1 or Windows for
Workgroups, etc.) for data acquisition work. It uses "co-operative"
multi-tasking. In this form of multitasking, there isn't really any
multi-threading (multiple execution threads within one single
application) and applications don't really EVER have to give back the
CPU if they don't want to. That is why old Windows 3.1 would get hung
so often and need to be rebooted. If one app gets hung up under
Win3.1 they all do.
You should also be aware that timing measurements made below about
20 ms in resolution with LabVIEW in 32-bit Windows aren't very
realistic, so please don't use them for actual test data. Don't accept
that a 10 ms Wait timer in LV is really 10 ms. It probably isn't.
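You can see this for yourself. Here is a quick sketch (in Python, which stands in for LabVIEW here since G code doesn't paste into a newsgroup post) that requests the same fixed sleep over and over and records what the O/S actually delivers; on a stock desktop O/S the actual sleeps scatter above the request:

```python
import time

def measure_sleep_jitter(requested_ms, samples=20):
    """Request a fixed sleep repeatedly and record the actual durations.
    On a general-purpose O/S the actual sleep is quantized by the
    scheduler tick, so short requests routinely overshoot."""
    actuals = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_ms / 1000.0)
        actuals.append((time.perf_counter() - start) * 1000.0)
    return actuals

jitter = measure_sleep_jitter(10)
print("requested 10 ms, actually slept %.2f to %.2f ms"
      % (min(jitter), max(jitter)))
```

Run it on your own machine; the spread between min and max is the jitter you would be baking into your "measured" timestamps.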
The best software-only timer in Windows is something called a
multimedia timer, available through a Windows API DLL call. It is
accurate to about a millisecond, but there is high overhead involved.
The best real-time timing available to you on a PC is on the CPU chip,
in something called the high-performance counter, which is microsecond
accurate and relatively low overhead because it is a real hardware
counter, not something in software. You will need to make Windows API
DLL calls to QueryPerformanceCounter (and QueryPerformanceFrequency to
scale its ticks to seconds) in order to use it.
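For what it's worth, modern language runtimes wrap this counter for you; Python's time.perf_counter(), for example, sits on top of QueryPerformanceCounter on Windows. A rough sketch to estimate the finest tick you can actually observe:

```python
import time

def counter_resolution(samples=100000):
    """Estimate the finest observable tick of the high-resolution counter
    by taking back-to-back readings and keeping the smallest nonzero
    delta between them."""
    best = float("inf")
    last = time.perf_counter()
    for _ in range(samples):
        now = time.perf_counter()
        if now != last:
            best = min(best, now - last)
            last = now
    return best

print("finest observed tick: %.3f microseconds" % (counter_resolution() * 1e6))
```

Compare that number to the 10-20 ms you get from ordinary software timers and you can see why the hardware counter is the one to use for timestamps.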
If this is not satisfactory, then you will need a dedicated high
performance timer/counter that is GPS or LORAN signal based or has an
oven controlled quartz timing crystal.
I am wondering why you need to read thermocouples so fast. I never
heard of a thermocouple that was so fast. When I used them, the
values didn't really change very quickly from one second to the next
so we typically measured at a few times a second. Remember the
thermocouple and the item being tested have a given volume, mass,
surface area and conductivity and thus have a certain "thermal mass"
or thermodynamic inertia to overcome when changing temperatures and
will not do so instantaneously.
If you are doing combustion testing or something, you might really
need to go this fast but pay attention to the characteristic times of
the test article, measured process, and the sensors and don't kill
yourself coding faster than the physics says matters.
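If you want to put numbers on that "thermal inertia", the usual back-of-the-envelope tool is a first-order lag model. Here is a Python sketch (the 0.5 s time constant is just an assumed example value, not a property of any particular probe):

```python
import math

def thermocouple_response(t, tau, t_initial, t_ambient):
    """First-order lag model of a sensor with thermal mass:
    T(t) = T_amb + (T0 - T_amb) * exp(-t / tau).
    After one time constant tau, the sensor has covered ~63.2% of the
    step; after five time constants it has essentially settled."""
    return t_ambient + (t_initial - t_ambient) * math.exp(-t / tau)

# Hypothetical probe with tau = 0.5 s, stepped from 20 C air into 100 C gas:
tau = 0.5
for t in (0.0, tau, 3 * tau, 5 * tau):
    print("t = %.2f s  ->  %.1f C"
          % (t, thermocouple_response(t, tau, 20.0, 100.0)))
```

If your probe's tau is half a second, samples every 16 ms are mostly redundant; match the sample rate to the sensor physics, not the other way around.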
Here are 30 of my best tricks to improve high speed data acquisition
and cut back on the interruptions from the O/S and other applications:
1) Raise the priority of whatever VI is doing the heavy lifting so
that it will be less likely to be interrupted by other VI's and
applications.
2) Get a faster PC. Remember hardware is almost always cheaper than
software. Better yet, use a faster multi-processor PC and implement
your code in a multi-threaded manner using parallel loops, dynamic
VI's, occurrences, notifiers, queues, etc. to reduce any bottlenecks
that might stop the whole machine while it is waiting on some
particular bit of I/O to occur.
3) Mark VI's as thread-safe and re-entrant and then design them to
actually be thread-safe and re-entrant, using occurrences, notifiers,
semaphores, mutexes, and queues wherever possible. Use VI templates if
spawning duplicate VI's helps to solve the problem.
4) Mark VI's to run in the data-acquisition thread under the execution
window or in any thread but the user interface thread. This will
improve their performance and help to isolate them from mouse moves
and clicks, keyboard hits, etc.
5) To go even further, you could build a multi-PC based distributed
solution using high bandwidth connectivity technologies like fiber
optic reflective memory cards, giga-bit ethernet, Firewire (IEEE1394),
or USB 2.0 technologies. Design your code to take advantage of
parallelism and then find hardware ways of scaling up using parallel
hardware.
6) Use dedicated data acquisition hardware with its own real-time
controllers, buffer memory, and dedicated hardware analysis functions.
This takes the hard work off of your CPU and places it on hardware
that was specifically designed to do it. You might consider Neff
Instruments 470, 495, or 500/620 series or a VXI/PXI approach.
7) Buy lots of RAM. I put 1.5GB in my home system for about $165.
RAM is the cheapest PC performance improver you can get.
8) Switch to a real-time O/S like VxWorks, or get a real-time Linux
variant (SUSE has one available), or get real-time kernel
modifications for Windows NT or 2000. (There are a few specialty
companies who deal in these modifications.) The down-side to the
WinNT/2000 RT modifications as opposed to Linux is that while Linux is
open-source code, WinNT is a black box. Who knows what the
consequences are of modifying the NT kernel?
9) NI has some dedicated real-time controllers that run LV now. You
should probably look at them.
10) You could also switch to a PLC (programmable logic controller)
like those made by Allen Bradley or GE/FANUC or Siemens. They run
real time deterministic operating systems so that you can guarantee
the time between successive executions of the main loop typically down
to 4-5mS. They are designed specifically for use with thermocouples
and other industrial sensors. You can write code for them either in
their native ladder logic development environments or in some cases
using NI's BridgeVIEW, which is basically LabVIEW for PLC's. Ladder
logic is a graphical language like LabVIEW anyway and should be no
problem for a LabVIEW developer to pick up.
The downside to PLC's is that they are more expensive. On the other
hand, they are orders of magnitude more reliable and deterministic
than PC-based data acquisition and control.
11) Use PCI-DMA based cards in your PC whenever possible. Avoid ISA
or EISA bus cards. DMA based cards can perform their own memory moves
without bothering the CPU which is very important. Also their
bandwidth is at least 4 times that of an ISA card and they are easier
to configure (plug and play).
12) After you buy lots of RAM, disable swap files on your PC. Swap
files burn up lots of CPU time while they thrash around moving pages
in and out of memory. If you put 2GB in a system, that usually will
be plenty for whatever you are doing.
13) Get rid of any unnecessary "fluff" services or applications
running on the PC. Dump anti-virus and firewall software (but keep
your data acq PC safe by putting it behind a separate firewall machine
and restricting access and use to testing only), unnecessary network
protocols or services, MSN, AOL, MS FastFind Indexer, MS Messenger,
the Messenger service, the Alerter service, and any identifiable
service that you aren't actually going to need to do your job. If you
open Task Manager under NT with nothing "running", you will see that
there are anywhere from 30 to 100 processes actually running, eating
up your valuable CPU and memory.
14) Separate execution loops by functionality; don't make unrelated
parts of your code wait on each other if they don't have to. Decouple
unrelated tasks and use multithreaded signalling techniques to let
some tasks wait for other tasks to complete asynchronously. This
avoids the high overhead associated with tight polling loops.
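The decoupling pattern looks the same in any threaded language. Here is a minimal Python sketch of an acquisition loop handing frames to a processing loop through a queue (the None sentinel marking end-of-acquisition is my own convention for the example):

```python
import queue
import threading

def producer(q, n_frames):
    """Simulated acquisition loop: pushes frames and never waits on the
    consumer, so acquisition timing is decoupled from processing."""
    for i in range(n_frames):
        q.put(("frame", i))
    q.put(None)                    # sentinel: acquisition finished

def consumer(q, results):
    """Processing loop: q.get() blocks (sleeps) until data arrives,
    so there is no CPU-burning polling loop."""
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item[1])

q = queue.Queue()
results = []
t1 = threading.Thread(target=producer, args=(q, 100))
t2 = threading.Thread(target=consumer, args=(q, 100 * [0] and results))
t1.start(); t2.start()
t1.join(); t2.join()
print("frames processed:", len(results))  # 100
```

The queue is the only point of contact between the two loops, which is exactly what keeps a slow consumer from ever stalling the acquisition side.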
15) Watch out! When you are writing very high speed multi-threaded
code, use the appropriate multi-threaded signalling techniques,
notifiers, queueing, etc. Don't use globals etc. to pass information
between threads or you could end up with a race condition or worse yet
with code that deadlocks.
16) Make sure that if there is a piece of hardware that must be
accessed by one caller at a time that you control access using
resource locks or mutexes, or semaphores to keep from passing the
wrong data to the wrong caller or worse yet accepting control
information from the wrong caller.
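A minimal sketch of the idea in Python, using a made-up instrument class (the query protocol is invented for illustration): the mutex makes the send-command/read-reply pair atomic, so replies can't get crossed between callers:

```python
import threading

class SharedInstrument:
    """Fake single-channel instrument: only one caller may talk to it
    at a time. The mutex serializes each command/reply pair."""
    def __init__(self):
        self._lock = threading.Lock()
        self._last_cmd = None

    def query(self, cmd):
        with self._lock:                        # exclusive access
            self._last_cmd = cmd                # "send" the command
            return ("reply-to", self._last_cmd)  # "read" the matching reply

inst = SharedInstrument()
mismatches = []

def worker(tag):
    for _ in range(500):
        r = inst.query(tag)
        if r != ("reply-to", tag):              # reply for somebody else?
            mismatches.append((tag, r))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print("crossed replies:", len(mismatches))      # 0 with the lock held
```

Delete the `with self._lock:` line and the send and read can interleave between threads, which is precisely the "wrong data to the wrong caller" failure.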
17) If possible, put data acquisition for such devices in simple loops
and copy their returned data into a notifier so that all threads can
wait for new data from the hardware via the notifier at one time at
low CPU rates and so that one thread doesn't make other threads wait
while it hogs the scarce and valuable hardware resource. Notifiers
basically "broadcast" the hardware data to all interested parties
rather than calling them one at a time on the phone.
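Here is a rough Python sketch of that notifier ("broadcast the latest value") behavior, built on a condition variable. The class is my own toy, not any library's API, but it shows the semantics: readers sleep until a new value arrives, and a slow reader skips stale values rather than stalling the writer:

```python
import threading
import time

class Notifier:
    """Latest-value broadcast, like a LabVIEW notifier: every waiter
    wakes on a new value; slow readers may skip intermediate values
    but never block the writer."""
    def __init__(self):
        self._cond = threading.Condition()
        self._value = None
        self._version = 0

    def send(self, value):
        with self._cond:
            self._value = value
            self._version += 1
            self._cond.notify_all()     # wake every waiting reader at once

    def wait_for_new(self, last_seen):
        with self._cond:
            while self._version == last_seen:
                self._cond.wait()       # truly asleep - no CPU polling
            return self._value, self._version

n = Notifier()
seen = {0: [], 1: [], 2: []}

def reader(rid):
    version = 0
    while True:
        value, version = n.wait_for_new(version)
        seen[rid].append(value)
        if value == "stop":
            break

readers = [threading.Thread(target=reader, args=(i,)) for i in seen]
for t in readers: t.start()
for sample in [10, 20, 30, "stop"]:
    n.send(sample)
    time.sleep(0.01)   # pause so this demo's readers keep up
for t in readers: t.join()
print({rid: values[-1] for rid, values in seen.items()})
```

Note the version counter: a reader that was busy while several values went by wakes up with only the newest one, which is exactly the lossy broadcast you want for hardware data.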
18) Don't put any unnecessary code into FOR or WHILE loops. Don't do
array manipulations, SQL calls, OLE automation, global variable
access, or display updates if you don't need to. Don't forget to put
a timer in these loops, though, to give other parallel code a
breathing space to execute.
19) TRYING TO WRITE DIRECTLY TO EXCEL IN A HIGH SPEED REAL TIME
APPLICATION IS SO NOT RECOMMENDED!! Writing to an Excel workbook
would involve either OLE automation (relatively fast but not nearly
fast enough) or DDE (terribly slow and unreliable.) There is a high
overhead involved in OLE automation and if it is not absolutely
necessary it is to be avoided.
It would be far better to log your data to disk in raw, undecoded A/D
binary form in 64K blocks and pull it back from the hard drive later
for post-test processing into a more Excel-friendly engineering-units
format.
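A sketch of that raw-then-post-process flow in Python (the counts-per-volt scale factor here is an assumed example for a +/-10 V, 16-bit A/D, not any particular board's calibration):

```python
import array
import os
import tempfile

def log_raw_frames(path, frames):
    """During the test: append raw I16 counts to disk with no
    per-sample conversion - just a block memory move."""
    with open(path, "ab") as f:
        for frame in frames:
            array.array("h", frame).tofile(f)

def post_process(path, counts_per_volt=3276.8):
    """After the test: read the raw counts back and scale them to
    engineering units at leisure, off the critical path."""
    raw = array.array("h")
    with open(path, "rb") as f:
        raw.frombytes(f.read())
    return [c / counts_per_volt for c in raw]

path = os.path.join(tempfile.mkdtemp(), "run1.bin")
log_raw_frames(path, [[0, 3277, -3277], [16384, -16384, 32767]])
volts = post_process(path)
print(["%.3f" % v for v in volts])
```

All the division and formatting happens after the run is over, when CPU time is free; during acquisition the disk sees nothing but raw block writes.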
20) Handling UI graphics and mouse clicks takes a lot of overhead that
you can eliminate if you do what I say in this tip:
Separate user interface controls into their own while loop away from
actual "real" code execution, use the new event structure or use a
slow polling loop to read controls and update indicators. Launch VI's
dynamically from this user interface loop and don't wait on them to
complete before looping.
Remember human time scales are more like 4 times a second or at the
very best 20 times a second and not like 100 times/second. Use a 100
or 250 ms timer in this loop if you don't use the new event structure.
(If you don't have LV 6.1, you could just create an occurrence, put a
Wait on Occurrence into this loop, and set its timeout to 250 ms but
never have anything trigger that occurrence. This way it will "sleep"
between timeouts and not really use much CPU in this loop for
polling.)
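The same "sleep until timeout" trick, sketched in Python with Event.wait standing in for Wait on Occurrence: the loop body runs a few times a second and the thread burns essentially no CPU in between:

```python
import threading
import time

stop = threading.Event()   # never set in this demo, like the untriggered occurrence

def ui_loop(poll_period_s, max_passes):
    """UI polling loop that sleeps between passes instead of spinning.
    Event.wait(timeout) parks the thread until the timeout expires
    (or a real event fires), so idle CPU cost is near zero."""
    passes = 0
    while passes < max_passes and not stop.is_set():
        stop.wait(timeout=poll_period_s)   # sleep here; the "timeout" wakes us
        passes += 1                        # read controls / update indicators
    return passes

start = time.perf_counter()
n = ui_loop(poll_period_s=0.05, max_passes=4)
elapsed = time.perf_counter() - start
print("%d UI passes in %.2f s" % (n, elapsed))
```

At a real 250 ms period that is 4 control reads per second, which is plenty for human fingers and eyes.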
21) Disable debugging in LV. This will speed up your code.
22) Avoid dynamically building or manipulating arrays. Preallocate
arrays where possible. It takes a lot of CPU to move large arrays
around. The same goes for charts with large numbers of data points.
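The difference, in Python terms (same output, very different allocation behavior):

```python
def acquire_growing(n):
    """Grows the array inside the loop: every pass builds a brand-new
    list and copies the old contents - O(n^2) work overall."""
    data = []
    for i in range(n):
        data = data + [i * 2]
    return data

def acquire_preallocated(n):
    """One allocation up front, then in-place writes - no reallocation
    or copying inside the acquisition loop."""
    data = [0] * n
    for i in range(n):
        data[i] = i * 2
    return data

assert acquire_growing(1000) == acquire_preallocated(1000)
print("same result, very different memory churn")
```

In LabVIEW the equivalent is Initialize Array once outside the loop and Replace Array Subset inside it, instead of Build Array inside the loop.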
23) Decouple display update rates from data acquisition rates. Just
because you want to acquire 100 frames of data a second, doesn't mean
that you necessarily need to analyze or display it at 100 frames a
second. Even if you don't do what I said in #20 above, at least don't
display every frame of data if you don't need to.
24) Avoid graphs and charts if numbers or better yet PASS/FAIL LED's
will do.
There is a lot less overhead in redrawing simpler controls like
numeric indicators or LED's. Keep them small if overhead is a
problem.
25) Get a higher performance graphics card. A lot of CPU overhead is
in redraws of screens.
26) Use the profiler to identify parts of your code that are called
many times and / or that use the highest amounts of CPU and/or memory.
Try to figure out ways to re-write these sections to run faster and
lighter. Use the NT task manager to find out what kind of load you
are putting on the CPU. Try to target your design for <10%. This
number seems to work well in my experience. A lot of 100% spikes is a
BAD thing. Remember the higher the load, the less deterministic and
reliable your application will be.
27) Avoid making a lot of I/O calls that exchange only small amounts
of information. Try to make fewer calls that move larger amounts of
information. This is usually where a lot of CPU time goes. It really
doesn't take any longer to move 64K of data than it does to move 1K
with existing DMA hardware and PCI buses. Try collecting 100 frames
of data once a second rather than 1 frame a hundred times a second.
The same goes for logging data to disk. Instead of writing data in
frames, try writing in 64K chunks. (You can easily modify the built
in Write I16 Integer array as binary VI to write U8's or U32's
instead.) This is usually the most efficient size for most H/D
systems. If for example your program collects data 4K at a time, save
up 16 frames before you write to disk. It is important to use a
multiple of 65536 bytes when moving data; don't send 65538 bytes at a
time, for example.
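A sketch of that accumulate-and-flush idea in Python (an in-memory sink stands in for a real file so the example is self-contained):

```python
import io

class ChunkedLogger:
    """Accumulate small frames in RAM and flush to the sink in 64 KiB
    chunks, so the drive sees a few large writes instead of many
    small ones."""
    CHUNK = 64 * 1024

    def __init__(self, sink):
        self._sink = sink
        self._buf = bytearray()
        self.flushes = 0

    def write_frame(self, frame_bytes):
        self._buf += frame_bytes
        while len(self._buf) >= self.CHUNK:
            self._sink.write(self._buf[:self.CHUNK])  # one big 64 KiB write
            del self._buf[:self.CHUNK]
            self.flushes += 1

    def close(self):
        if self._buf:                  # final partial chunk at end of test
            self._sink.write(self._buf)
            self._buf = bytearray()

sink = io.BytesIO()
log = ChunkedLogger(sink)
for _ in range(20):                    # 20 frames of 4 KiB = 80 KiB total
    log.write_frame(b"\x00" * 4096)
log.close()
print(log.flushes, "full flushes,", len(sink.getvalue()), "bytes total")
```

Twenty 4 KiB frames become one full 64 KiB write plus a tail at close, instead of twenty separate disk hits.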
28) Get a RAID array of Ultra2 SCSI H/D's with some serious buffer
memory, or better yet a Fibre Channel RAID array. This will reduce
I/O delays while your application is waiting for the hard drive head
to seek to the right location, and it will increase overall disk
bandwidth. A typical H/D seek on a regular IDE drive is about 7-10 ms
these days.
Keep your H/D's defragged and check them for errors regularly. A
seriously fragmented H/D will really slow you down. (Executive
Software's Diskeeper Lite is a good freeware defragger I've used in
the past. I use Norton SystemWorks at home.) Use NTFS formatting for
your H/D, as it is faster than FAT16 and usually faster than FAT32 or
OS/2's HPFS.
Also beware of inflated advertising claims of hard-drive throughput.
I once wrote a C++ program that logged about 6.4 MB/sec to disk on a
system with a single Ultra-2 SCSI H/D that advertised 80 MB/sec. I
couldn't understand why performance was only 8% of the advertised
rate. After speaking to the manufacturer however, I found out that
80MB/sec was a "burst" rate and not sustainable over any meaningful
period of time. The best sustained rate was actually about 8MB/sec if
I would be willing to switch from C to assembly language.
29) If possible, just store all your data in a pre-allocated big array
(or circular buffer if you feel like implementing one) until either
you can get around to writing the data, or the test is over. This
will eliminate disk hits altogether but it limits test time to the
available memory so you will need to calculate memory requirements.
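If you do want the circular buffer, it is only a few lines. Here is a Python sketch of a preallocated ring that keeps the most recent N samples:

```python
class RingBuffer:
    """Fixed-size circular buffer, preallocated once; writes never
    allocate. When full, the oldest sample is overwritten - right for
    'keep the last N seconds of data' acquisition."""
    def __init__(self, capacity):
        self._data = [None] * capacity   # single up-front allocation
        self._capacity = capacity
        self._head = 0                   # next write position
        self._count = 0                  # samples stored so far

    def push(self, sample):
        self._data[self._head] = sample
        self._head = (self._head + 1) % self._capacity
        self._count = min(self._count + 1, self._capacity)

    def snapshot(self):
        """Return stored samples in oldest-to-newest order."""
        start = (self._head - self._count) % self._capacity
        return [self._data[(start + i) % self._capacity]
                for i in range(self._count)]

rb = RingBuffer(4)
for sample in [10, 11, 12, 13, 14, 15]:
    rb.push(sample)
print(rb.snapshot())   # the four most recent samples, oldest first
```

Size the capacity from your frame rate times your maximum test duration and you have the memory requirement calculation I mentioned.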
30) If you need really high speed I/O, or if you find that some part
of your code is really slow after profiling, and that rewriting it in
LabVIEW just isn't going to make it any faster, consider going to MS
Visual C or C++. C will be about the fastest you can reasonably go
without really getting down into the ones and zeros business.
Douglas De Clue
LabVIEW developer/Test Engineer
ddeclue@bellsouth.net
P.S. If you know of anybody who needs a real "top gun" LV or C/C++
developer, I'll be needing a new job soon, so drop me a line.
kml37 wrote in message news:<50650000000500000021820000-1023576873000@exchange.ni.com>...
> Jeremy:
> Sorry I was so vague in my earlier description of the process. What
> we are doing is collecting 2 types of data and trying to correlate
> them. We are running a thermocouple through LabVIEW and trying to
> correlate each temperature reading to a visual image we are taking at
> a rate of 60 frames/sec. However, the camera is not LabVIEW
> compatible. To get around the problem, we have written a Labview
> program that sends a true message when the labview window is frontmost
> (i.e.-the active screen). By doing this, we know the exact time we
> push RECORD on the software of the camera we are using.
>
> The reason we are going for such a high speed is because we would like
> one temperature reading for every visual image. To achieve this, a
> delay time of 16-17 ms is required between each reading (which gives
> about 60 readings per second).
>
> Now here is where it gets complicated. I am using a "wait" function
> in the Labview program. When I set it to wait 20 ms, it has no
> problem recording the data into Excel. When I set it to wait 10 ms,
> it also works. However, when I set it to 15 or 16 or 17 ms, it
> alternates waiting 10 and 20 ms between each reading so that the
> average time comes out to be 15ms. Is there a way around this?
>
> You mentioned in your reply that we could store the data in memory
> until the aquisition is done. How can I do that? Are you talking
> about the memory on the NI card? I am new at all this, so please be
> explicit.
>
> Thanks for your help.
>
> Kurt