So I'm starting a new thread to discuss the possibility of a presentation covering FPGA 'Best Practices', or common pitfalls/design choices.
This began in the main "I wish there was a presentation on..." thread but is continued here to avoid derailing that thread too much further.
I'd like a presentation on common patterns in FPGA code.
This presentation from 2014(?) is somewhat similar: LabVIEW FPGA Design Patterns and Best Practices (NIWeek 2014?)
I'd like to know about common mistakes and the better way of writing FPGA-based code.
Tom McQuillan: Possible presentation on GoF designs (with Sam Taggart?)
Me: Might not be exactly what I'm imagining - GoF patterns often require dynamic dispatch.
Terry Stratoudakis (Terry_ALE):
I am interested in this but first some comments and questions.
... software optimizations and techniques are mostly single-core-minded, whereas on an FPGA things are spatial, and so forth.
Has a new thread for this been made?
Yes, here now.
I gave a talk in May 2020 https://www.youtube.com/watch?v=i_nC_sGOqUw&t which talks about some of these techniques at a general level. How does this compare to what you are looking for?
I enjoyed that presentation but it mainly focused (as I understand it) on making faster things possible.
My problems are not necessarily related to making things as fast as possible, but rather about making them as readable/conceptually understandable as possible.
There are good LabVIEW FPGA shipping examples that have best practices as well.
Perhaps some review of these best examples could form the beginning of this hypothetical presentation (or if nobody submits this presentation, I'd be happy to receive some pointers here)
Other best practices can be found in the VST2 and RTSA code but they are not openly available. A talk could be made that speaks to those practices without revealing the code.
Also, what is typical application and NI hardware (i.e. cRIO or PXI)?
For me, cRIO, but I'd like to think that the problems I'm facing might not be specific to the hardware or the clock speeds. I guess that as speeds get faster and faster, more sacrifices to readability might need to be made though...
To give a concrete example of what I might mean with regards to pitfalls/design choices, I'll describe some cRIO code I've been recently rewriting.
My system uses some NI-9402 modules to communicate via SPI with a PCB that I designed, which contains an ADC and an "octal switch" (see ADG714). The switch controls various "almost static" inputs to the ADC, for example the shutdown, reset and oversampling digital inputs.
Most of the time, the ADC acquires continuously (this could be controlled by either the RT system, or by a switch using an NI-9344 module). The results are streamed over DMA FIFO to the RT system, which bundles them together in nice packages for communication to a desktop system, for logging, display, further analysis, etc.
Sometimes we might want to change some settings - e.g. oversampling ratio, or the sample rate, etc. To change something like the oversampling rate, the ADC must stop acquiring, the ADG needs to be updated with new values, the ADC must be reset (again requiring a pair of changes to the ADG switches), and then the sampling should resume.
Previously, the code ran in a sort of nested state machine structure. To update the settings, the RT system would change some FPGA controls, then set a boolean ("Requesting Update", or something) to true. The FPGA would poll that control, then go through a series of states like "Updating", "Finished Updating", and "Ready to Acquire", allowing the RT system to wait for "Ready to Acquire", then empty the FIFO, then set "Start" to true, resuming the acquisition.
This required lots of different booleans and states, and seemingly worked, at best, "most" of the time. Clearly there were some situations in which the end state was not valid, but digging into this mess was pretty tricky: keeping all the changes to state in your head continuously wasn't very practical.
This situation was vastly simplified by a recent change I made: now, the FPGA always acquires a "block" of data of a certain length, depending on an enum "Sample Rate" value, which also encodes the number of channels to sample (typical values are "10kHz x 8Ch", "50kHz x 3Ch", or similar).
The DMA sends a 'header' element that conveys the contents of the upcoming block: how many elements, and how many channels they represent.
By promising to always output that number of elements (even if some of them are 0, because the acquisition died due to e.g. power failure to the board, or a broken wire, or whatever), the RT system is much simpler.
Now, a new setting request can simply be enqueued on a FIFO to the FPGA, and when the end of a block is reached, the FIFO can be checked to see if it should continue sampling, or change something.
No complicated handshaking is necessary between RT and FPGA.
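To make the idea concrete, here is a minimal Python sketch of that block-based protocol (LabVIEW FPGA is graphical, so this is only a textual stand-in; the block sizes, the `BLOCK_SPECS` table, and all function names are hypothetical, not taken from the actual project):

```python
from collections import deque

# Hypothetical (samples per channel per block, channel count) per
# "Sample Rate" enum value -- illustrative numbers only.
BLOCK_SPECS = {
    "10kHz x 8Ch": (4, 8),
    "50kHz x 3Ch": (6, 3),
}

def acquire_blocks(sample_stream, settings_fifo, num_blocks, rate="10kHz x 8Ch"):
    """Sketch of the FPGA loop: always emit a complete block, then poll settings."""
    dma_out = []
    for _ in range(num_blocks):
        n, ch = BLOCK_SPECS[rate]
        # Header element tells the RT side what the upcoming block contains.
        dma_out.append(("header", n * ch, ch))
        for _ in range(n * ch):
            # A missing sample becomes 0, so the promised block length is kept
            # even if the acquisition dies (power failure, broken wire, ...).
            dma_out.append(next(sample_stream, 0))
        # Setting requests are only honoured at a block boundary, so no
        # RT<->FPGA handshaking booleans are needed.
        if settings_fifo:
            rate = settings_fifo.popleft()
    return dma_out
```

The RT side can then read one header, read exactly that many elements, and repeat; a rate change simply shows up as a different header on the next block.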
I don't know whether this is a common problem, or a common solution (enforcing a block of data rather than individual elements, or e.g. one sample cycle with N results, one per channel sampled), but in hindsight it wouldn't surprise me to learn that it is. If I'd considered this approach a long time ago, I could probably have saved a non-negligible amount of time and effort.
At the same time, modifying various parts of the code to use objects and simpler abstractions (e.g. a VI that carries out "Pulse Reset", rather than setting the ADG switches value to 28, then setting "Update Switches", then waiting for "Finished", then setting the value to 12, then "Update Switches", then...) makes it easier to spot problems in the code. For example, the ADC is triggered by a pulse on one line, but the results are actually transferred partially during the next sampling cycle. If the sample rate increased, it was previously possible for the "Conversion Start" line to pulse repeatedly during the transfer of a previous sample, leading to a whole collection of "Start Time" values being put on a FIFO with no accompanying data.
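The "Pulse Reset" idea can be sketched like this (again in Python as a stand-in for a subVI; the register values 28 and 12 come from the description above, but the helper names are hypothetical):

```python
# Values from the description above; what they mean depends on the ADG714 wiring.
ADG_RESET_ASSERTED = 28
ADG_RESET_RELEASED = 12

def pulse_reset(set_switches, update, wait_finished):
    """Wrap the raw ADG switch sequence behind one named operation, so calling
    code reads as intent ("pulse reset") rather than a run of magic numbers."""
    for value in (ADG_RESET_ASSERTED, ADG_RESET_RELEASED):
        set_switches(value)   # write the switch pattern
        update()              # assert "Update Switches"
        wait_finished()       # block until "Finished" goes true
```

The point isn't the three callbacks; it's that the sequencing lives in exactly one place with a name that states what it does.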
Now it's clearer that this can be a problem, and when the sample rate changes, an additional pause is inserted between the last CONVST of the previous "block" and the first CONVST of the new block at the different rate.
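The required pause is easy to reason about once it's named: the previous sample's transfer must finish before the first CONVST at the new (possibly faster) rate. A tiny sketch, with purely illustrative tick counts (the real numbers depend on the ADC and SPI clock):

```python
def inter_block_pause(transfer_ticks, new_period_ticks):
    """Extra ticks to wait after the last CONVST of the old block so the
    previous sample's transfer completes before the new rate starts pulsing.
    Illustrative only -- real designs also need the ADC's own timing margins."""
    shortfall = transfer_ticks - new_period_ticks
    return max(shortfall, 0)
```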
Thanks for the context and feedback. Really helpful. I have not done many cRIO systems but I understand the (general) challenges faced.
One common principle that we look to apply in our projects is a more pronounced design phase that is outside of the LabVIEW environment. For this we look to UML diagram templates.
Another is simulating the system, and not just literally in the FPGA sense. This can surface integration issues up front, in an environment that is easier to troubleshoot. A few years ago we were working on a cRIO-based system where the deployment was overseas and there wasn't a lot of room for back and forth. The FPGA was very simple, but the RT had its share of complexity. We made a simulated model in Windows and were able to exercise all known scenarios. We are applying this (in concept) to PXI-based FPGA systems; though they are RF and high-bandwidth, we do this to test the interfaces and the low-bandwidth logic. There are modules to help shake out issues where we need to run at higher bandwidths.
Anyway, I think applying design techniques outside of LabVIEW tends to be counterintuitive in the LabVIEW world (myself included). NI teaches us that "it is easy" and "no coding needed", and even if we ignore these statements, they still influence us at some level. The simulations are another aspect that I would count among best practices.
That said, I wonder if this 'talk' could be a panel with different perspectives, with questions planned (pre-submitted), asked on the fly, or a combination of the two. I feel like I know quite a bit on the subject, but I still see things that keep me humble.
The other general issue is that LabVIEW FPGA has a much smaller and quieter community than LabVIEW. The reason is understandable, but the result is that there are fewer resources and discussions happening. I find that LabVIEW FPGA projects tend to be more proprietary, which leads to fewer discussions. The best thing would be to decouple the principles from the projects. This is not easy, but it is really the only way a success can be repeated and a failure avoided.
I assume you know of the cRIO Developers Guide http://www.ni.com/pdf/products/us/fullcriodevguide.pdf. Though dated, I am sure it has good stuff in it. I haven't studied it, but I assume what you are looking for goes beyond it, and maybe some things have changed since it was published.
It's not Best Practices, nor is it Common Design Patterns, but I did a presentation on creating a Time Weighted Data Averaging Mechanism in FPGA. It covered how a simple piece of code in Windows could eat too many resources in FPGA until it was converted from a parallel approach to a state machine approach. If that interests you, I can submit it. It was presented to the LabVIEW Architects Forum user group in February 2017 (link to video recording below).
Averaging data in FPGA
Averaging data can take many forms. For a project, I was asked to implement a time-weighted average to smooth data spikes. The time-weighting formula could be adjusted to a user-specified value (# of averages), and also had to account for situations in which older data did not yet exist. FPGA code requires data structures to be fixed in length. Although it was not successfully implemented in FPGA during the project (it had to move up to the RT layer), this presentation shows how I was eventually able to implement the code in the FPGA, and some of the changes made to make it scalable to a large number of channels while using minimal FPGA resources.
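One common formulation of such an average is the exponential update avg += (x - avg) / N, which keeps only one register of state per channel. A hedged Python sketch, using integer-only arithmetic so it maps onto fixed-point FPGA logic, and a growing divisor to handle the "older data does not yet exist" startup case (this is my own illustration, not the code from the presentation):

```python
def make_time_weighted_averager(n_averages):
    """Running average that ramps up to a time-weighted (exponential) average.
    Integer arithmetic only, as a stand-in for fixed-point FPGA logic."""
    state = {"avg": 0, "seen": 0}

    def update(sample):
        # Until N samples have arrived, divide by the number of samples that
        # actually exist, so missing history doesn't drag the result to zero.
        state["seen"] = min(state["seen"] + 1, n_averages)
        state["avg"] += (sample - state["avg"]) // state["seen"]
        return state["avg"]

    return update
```

Scaling this to many channels is then a matter of storing one `avg` word per channel (e.g. in block RAM) and stepping through them with a state machine, rather than instantiating the arithmetic in parallel per channel, which matches the parallel-to-state-machine conversion the presentation describes.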