We are looking into using a PXIe-6548 digital I/O card to test a custom FPGA device.
The spec sheet says that this board can link and loop through multiple patterns using its onboard hardware.
Before we buy it, we need to know (1) whether the following pseudo-code can be implemented and (2) whether any hardware issues would prevent it.
Load 100000 or many different waveform sequences of variable sizes (4 to 256 cycles) into the board memory
The script commands would be loaded using the C++ function call.
Execute script (realtime on board) that does the following:
Do until X>100000
    Wait for a status flag from DUT (script trigger)
Load the entire test sequence as one large waveform.
Use subsets of the waveform where each waveform subset is loaded from an array.
The array values would be loaded as the sequence is read in from a text file.
Do until X>200000
    Generate wfm subset(A,B)
    Wait for status flag from DUT (script trigger)
From my understanding, you are dynamically generating a sequence of custom digital waveforms, each triggered by a response from the device under test (the assertion of a status flag). You should be able to do so using the PXIe-6548. However, a few limiting factors, such as minimum waveform size requirements and memory, should be considered.
To perform this sequence, the generation mode of the board must be set to Scripted mode. Scripted mode allows you to generate the next waveform once a certain condition is met (in this case, the assertion of a status flag). Moreover, the minimum waveform size depends on your sample rate. If you are sampling at 200 MHz using a stepped sequence, then your waveform size must be at least 128 S (see specifications). This may cause issues, since your minimum pattern size is 4 cycles. Nevertheless, sampling at a lower frequency and using a different waveform configuration can yield better results. You may want to refer to Common Scripting Use Cases in the NI Digital Waveform Generator/Analyzer Help.
Furthermore, keep in mind that the PXIe-6548 has 256 MB of onboard memory. Scripts and waveforms share this memory space. Theoretically, the waveforms will consume approximately 3.2 MB (256 cycles/waveform x 100000 waveforms / 8 cycles/byte). However, there will be overhead memory usage when storing the waveforms. For more information, you may want to refer to Scripts and Generation Onboard Memory in the NI Digital Waveform Generator/Analyzer Help.
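The 3.2 MB figure above works out to one bit stored per cycle. As a rough sanity check, here is a minimal Python sketch of the same arithmetic for a few storage widths. The function name and the 1, 8, and 32 bit widths are illustrative assumptions, not values taken from the PXIe-6548 documentation, and the per-waveform overhead mentioned above is ignored.

```python
def waveform_memory_bytes(num_waveforms, cycles_per_waveform, bits_per_cycle):
    """Raw pattern storage in bytes, ignoring any per-waveform overhead."""
    total_bits = num_waveforms * cycles_per_waveform * bits_per_cycle
    return total_bits // 8


# Worst case from the original question: 100000 waveforms of 256 cycles each.
# The storage widths below are assumptions chosen for illustration only.
for bits_per_cycle in (1, 8, 32):
    used = waveform_memory_bytes(100_000, 256, bits_per_cycle)
    print(f"{bits_per_cycle:2d} bit(s) per cycle: {used / 1e6:.1f} MB")
```

With these assumed widths the raw storage comes to 3.2 MB, 25.6 MB, and 102.4 MB respectively, so how the board actually packs samples matters when comparing against the 256 MB of onboard memory.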
Thanks for the reply, Roman. I'm looking through the NI Digital Waveform Generator/Analyzer Help as well as the script documentation. As I understand from the docs, script triggers 0, 1, 2, and 3 can be configured for digital edge or digital level. According to the triggering documentation, the script trigger comes in from the backplane of the card, not from the device. This is described in the docs for NIHSDIO_ATTR_DIGITAL_LEVEL_SCRIPT_TRIGGER_SOURCE. So the hardware script trigger does not appear to be what we would use for this testing.
There is a footnote in the triggers summary that says the pattern match trigger is valid only for acquisition sessions. I assume this means that it triggers the card to acquire data from the device under test. The 6548 would monitor the device in acquisition mode, not in real-time compare mode. For generation mode, it says that external triggers from the PCI bus control the data generation. This tells me that the 6548 cannot perform the scripting for data generation on its own.
It appears that the only way to do this is with software running on the host controller to handle the script decisions. The host must wait for the pattern match trigger asserted by the 6548. The host would then send a start command to execute the next waveform. If this overhead is in the 200-500 usec range, then it would be acceptable. Ideally it should be less than 50 usec, because this pattern matching event would probably take place 25000 to 50000 times for the sequences we have in mind. If it takes more than 200 usec for each match, then execution time becomes a big concern.
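To put the latency numbers above in perspective, the short Python sketch below multiplies the per-match host overhead by the expected number of pattern match events. The function name is hypothetical, and the calculation counts only the host round trip, not generation or DUT busy time.

```python
def total_overhead_seconds(num_events, latency_us):
    """Cumulative host-side overhead for a full run, in seconds."""
    return num_events * latency_us / 1_000_000


# Event counts and per-event latencies quoted in the post above.
for num_events in (25_000, 50_000):
    for latency_us in (50, 200, 500):
        total = total_overhead_seconds(num_events, latency_us)
        print(f"{num_events} events x {latency_us} usec = {total:.2f} s")
```

Under these figures the added overhead ranges from 1.25 s (25000 events at 50 usec) to 25 s (50000 events at 500 usec) per complete test sequence.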
Are there any benchmarks for this host controller execution time? I assume we would need an embedded host controller mounted in the chassis in order to have the fastest possible execution time. True?
It would also be nice to use RAID hardware to stream the patterns. Would this work with the pattern matching steps as I've described?
How many digital lines are you using for acquisition and generation? Are some of the lines bidirectional? If so, how are your signals being synchronized (handshaking, timed wait, etc.)? If your system requires bidirectional lines, one issue that may arise is accidentally driving a line from both the PXIe-6548 and the DUT.
If your system can have dedicated input and output lines such that each line is unidirectional, I would recommend using dynamic acquisition of multiple records and retriggerable pattern generation. Essentially, the PXIe-6548 will listen for a certain bit pattern. Once the pattern is matched, a trigger will assert (e.g., on PFI0), causing the generation of a particular waveform. The system will repeat, and subsequent acquisitions will trigger the generation of different patterns.
Dynamic Acquisition of Multiple Records can be located by navigating to ..\National Instruments\NI-HSDIO\examples\c\Dynamic Acquisition\DynamicAcquisitionOfMultipleRecords for the C example or searching “records” in the NI LabVIEW Example Finder and selecting that specific file in the list for the LabVIEW example.
Dynamic Generation of Multiple Waveforms can be located by navigating to ..\National Instruments\NI-HSDIO\examples\c\Dynamic Generation\DynamicGenerationOfMultipleWaveforms for the C example or searching “dynamic” in the NI LabVIEW Example Finder and selecting that specific file in the list for the LabVIEW example.
With regard to execution time, if you require determinism and quick response rates, an FPGA would be most appropriate. However, your program will execute efficiently if you commit the patterns, in particular the generation patterns, to memory prior to executing your script. This optimizes execution because the program counter simply points at the beginning of the memory address of the pattern and steps through the addresses for generation. If the patterns are not created in advance, they will be built in-process and add latency to execution due to memory and processing overheads.
You can also provide external storage for streaming data between the PXIe-6548 and that storage. Please note that the bandwidth is limited by the PXIe bus, which is on the order of 600 MB/s.
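As a rough check on whether streaming fits within the bus limit quoted above, the Python sketch below compares the required data rate with 600 MB/s. The 200 MHz rate comes from the earlier reply; the bytes-per-sample values and function names are illustrative assumptions rather than documented PXIe-6548 data widths.

```python
BUS_LIMIT_MB_PER_S = 600  # approximate PXIe bus figure quoted above


def required_mb_per_s(sample_rate_hz, bytes_per_sample):
    """Sustained data rate needed to stream samples continuously."""
    return sample_rate_hz * bytes_per_sample / 1_000_000


def max_sample_rate_hz(bytes_per_sample):
    """Highest sample rate that stays within the assumed bus limit."""
    return BUS_LIMIT_MB_PER_S * 1_000_000 / bytes_per_sample


# Bytes per sample are assumptions for illustration (1, 2, or 4 bytes).
for bytes_per_sample in (1, 2, 4):
    needed = required_mb_per_s(200_000_000, bytes_per_sample)
    verdict = "fits" if needed <= BUS_LIMIT_MB_PER_S else "exceeds the limit"
    print(f"{bytes_per_sample} byte(s)/sample at 200 MHz: {needed:.0f} MB/s ({verdict})")
```

Under these assumptions, 200 MHz streaming fits at 1 or 2 bytes per sample but would need 800 MB/s at 4 bytes per sample, which is above the quoted limit.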
The signals are bidirectional. The device is a NAND flash memory with 8 I/O pins. Sometimes the lines are used for writing and reading data from memory, and sometimes they are used to read a status register that holds device status bits. For example, a read command is executed to tell the device to retrieve data from memory. The 8 lines are polled until all 8 bits are in the correct state. Then the memory data, which has been transferred to a FIFO, is clocked out on the same 8 I/O lines. The same procedure is used for writes and erases. After executing a command, the device goes into busy mode for some random time (100 ns to 5 ms, depending on what it is doing). These times vary from device to device. The typical flash tester has the ability to branch based on a pass or fail result. It can have a vector set like this:
1000100000 // Drive a command
XXXXXXXX // Don't care; drivers are disabled
XXXXXXX // Don't care; drivers are disabled
HLHLLLLL // Check status from device
branch on error to label
LLHLLLLH // Begin reading data from the device
continue with more reads
The tester will branch back to label until HLHLLLLL appears or a timeout occurs.
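The branch-until-match behavior described above can be sketched in a few lines of Python. This is only a software model of the tester logic, not NI-HSDIO code: the function names, the MSB-first bit order, and the sample values are assumptions made for illustration.

```python
def compile_vector(vector):
    """Turn an H/L/X string (MSB first) into (mask, value) integers."""
    mask = 0
    value = 0
    for ch in vector:
        mask <<= 1
        value <<= 1
        if ch == "H":
            mask |= 1
            value |= 1
        elif ch == "L":
            mask |= 1
        elif ch != "X":  # X means don't care: leave the mask bit at 0
            raise ValueError(f"unexpected vector character: {ch!r}")
    return mask, value


def poll_until_match(samples, vector, max_polls):
    """Return the index of the first matching sample, or None on timeout."""
    mask, value = compile_vector(vector)
    for index, sample in enumerate(samples):
        if index >= max_polls:
            break
        if (sample & mask) == value:
            return index
    return None


# Hypothetical status reads: the DUT is busy for three polls, then ready.
status_reads = [0x00, 0x00, 0x20, 0xA0]
print(poll_until_match(status_reads, "HLHLLLLL", max_polls=10))
print(poll_until_match(status_reads, "HLHLLLLL", max_polls=3))
```

The first call finds the status pattern on the fourth read; the second gives up after three polls, which models the timeout branch.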
It sounds like you are performing memory response tests with your current system. Would the following DevZone article, named Memory Test Reference Design, be helpful as a reference and example? Although the article was created for using NI-655x series cards, the example should be compatible with your PXIe-6548.
Not really. The Memory Test example does show complex generation in conjunction with acquisition of signals from the DUT, but the results of the acquisition are processed by the host controller. It is not a real-time compare. Also, the test patterns we run are not algorithmic.
The method would require generating a waveform sequence, waiting for a pattern match, and then acquiring samples. Those samples would then be post processed before starting the next waveform sequence. If the post processing is fast (<200 usec), then it might work. This process would continue for all 100000 sequences to complete the entire test sequence.
The question is how long it takes to post process the acquisition memory. If it happened only once, it would not be a problem. But since this is a flash memory, we have to wait for the status flags at the end of each command that the DUT executes.
The big hurdle is how to handle the pattern matching in a reasonable time. We were hoping the 6548 could do it internally with its FPGA. Another option might be to use one of the FPGA cards that we would program ourselves to handle the pattern matching. The FPGA could get vectors from the RAID system or some other type of memory on the PCI bus.
Thanks and keep the ideas coming.
Before going into too much detail on what I believe might work for you, I want to first state that if this is going to be used for production test, the best solution for the fastest test times would be to use a board with an open FPGA, such as our R Series line or the PXIe-6583. I understand that this adds extra development time for LabVIEW FPGA work, but it would be the optimal solution in the end.
If you wish to use the PXIe-6548, I will do my best to help you devise a solution.
Goal (please clarify if this is incorrect):
Generate x samples on 8 lines, immediately wait for a pattern on the same lines, acquire y samples, and post process these samples. Then repeat the same test 100,000 times.
The complication with your setup comes from having a pattern match trigger that needs to start an acquisition session immediately following generation on the same lines. Since our hardware cannot be set to have the acquisition session begin looking for a pattern match trigger immediately following generation (hardware timed), the acquisition session needs to be looking for the pattern match while we generate on the same lines. This will require you to set up an additional line as a gate. First, you generate your x samples on lines 0-7 while holding line 8 at 0. Then, after generation is done, you tristate lines 0-7 and set line 8 to 1. The pattern match for the acquisition is set across 9 lines instead of 8, which ensures that the pattern cannot match until the 9th line (the gating line) is high. After this, we simply acquire the data, process it in software, and restart the task. I think that the post processing will be quick and you will likely be able to meet your 200 usec requirement, but this would need to be benchmarked to confirm.
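To illustrate why the gating line helps, here is a small Python model of the 9-line pattern match. It is a software sketch only, not NI-HSDIO code: the sample values are invented, and the status pattern 0xA0 is borrowed from the HLHLLLLL vector earlier in the thread as a hypothetical target.

```python
GATE_BIT = 1 << 8      # line 8, the extra gating line
STATUS_PATTERN = 0xA0  # hypothetical ready status on lines 0-7 (HLHLLLLL)


def first_match(samples, use_gate):
    """Index of the first sample that satisfies the pattern match, or None."""
    if use_gate:
        target, width_mask = STATUS_PATTERN | GATE_BIT, 0x1FF  # 9 lines
    else:
        target, width_mask = STATUS_PATTERN, 0x0FF             # 8 lines
    for index, sample in enumerate(samples):
        if (sample & width_mask) == target:
            return index
    return None


# Generation phase: gate low, the card drives lines 0-7. The second sample
# happens to equal the status pattern even though it is stimulus data.
generation = [0x88, 0xA0, 0x10]
# Response phase: lines 0-7 tristated, gate high, the DUT drives its status.
response = [GATE_BIT | 0x00, GATE_BIT | 0x20, GATE_BIT | 0xA0]
samples = generation + response

print(first_match(samples, use_gate=False))
print(first_match(samples, use_gate=True))
```

In this model the 8-line match fires early on stimulus data at index 1, while the 9-line match waits until the DUT reports the status pattern at index 5.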
I didn't suggest a hardware compare solution for a few reasons.
First, for hardware compare to work, you have to start your acquisition session within a limited number of clock cycles after the generation session starts (when doing stimulus and response, which you need). For response only, this wouldn't be a problem. I'm not sure how many samples you need to generate or how long after generation the pattern match will occur. If this is relatively quick (<256 clock cycles), we might be able to get hardware compare to work.
Second, I wasn't sure if the delay from generation to receiving the pattern match trigger is fixed in your system. If it is, we might be able to start the acquisition session immediately and throw away the samples until the pattern match is set to come in.
One problem that might arise with this setup is how you want to clock everything. Typically, with hardware compare we use something called "source synchronous" clocking. This is where you send your generation clock to the DUT, which passes it back to the HSDIO card to be used by the acquisition / hardware compare. This keeps the clock in sync with your data by sending it through the same path delay as the data. I'm not sure what you have planned, but without knowing more, there might be other complications with the above setups. Please let me know if you aren't using the onboard clock and have something like this in mind.
Again, I believe this is the type of setup you are looking for. Please let me know if I misunderstood something. In addition, this might be more easily discussed over the phone. With your permission, I can obtain your contact information from our IT group and call you. Would this be something you would be interested in?
Yes, absolutely, I would like to talk about this. Any time next week would be OK. I work until 4pm Mountain Time.
I will try to call you early next week. I look forward to talking with you.