Okay so I'm trying to understand this concept a little better here. I was just doing some reading on how all cases of a case structure execute in parallel on an FPGA, and how the longest path through the case structure is the critical path. I have been working on an application where I sample an AI card at both high and low speeds to measure different frequencies. I've oversimplified the logic of the code to show my question below. In one case I need to pull in a large number of low-frequency signals on the NI 9205. Since that card can go up to 250 kS/s, I figured I was more than fine. However, I also want a mode in my program where I sample a much higher-frequency signal. Up till now I was thinking that I could do that on the same card in a different case of the structure, as shown below, if I only sample a single channel at that time. Now I'm wondering if the other case will cause my high-frequency input to be limited to (250 kS/s)/(# channels).
I had actually already built this and was pulling in a 25 kHz sine wave generated by another device, and it looked perfect. If I can't reach the max sample rate, how did I get such a nice input? Am I just looking at an alias? I appreciate any help.
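For anyone wanting to sanity-check the alias worry with numbers, here is a rough Python sketch of the arithmetic. It assumes the 250 kS/s aggregate rate is split evenly across the scanned channels; the channel counts (1, 8, 32) and the function names are made up for illustration and are not taken from the actual VI.

```python
def per_channel_rate(aggregate_rate_hz, n_channels):
    """Per-channel sample rate when one multiplexed ADC is shared evenly."""
    return aggregate_rate_hz / n_channels


def apparent_frequency(signal_hz, sample_rate_hz):
    """Frequency a sine appears to have after sampling (folded into 0..fs/2)."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


AGGREGATE_HZ = 250_000.0  # NI 9205 aggregate rate quoted in the post
SIGNAL_HZ = 25_000.0      # the 25 kHz test sine

for n_channels in (1, 8, 32):  # hypothetical channel counts
    fs = per_channel_rate(AGGREGATE_HZ, n_channels)
    seen = apparent_frequency(SIGNAL_HZ, fs)
    print(f"{n_channels:2d} channel(s): fs = {fs:9.1f} S/s, 25 kHz appears as {seen:8.1f} Hz")
```

If the single-channel case really runs at the full rate, the 25 kHz tone comes back at 25 kHz. If it were throttled to an 8-channel share, it would fold down to a lower apparent frequency, so comparing the measured frequency against the generator setting is a quick way to tell the two situations apart.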
Okay so I'm trying to understand this concept a little better here. I was just doing some reading on how all cases of a case structure execute in parallel on an FPGA, and how the longest path through the case structure is the critical path.
Not sure where you read this, but it's not correct. The only case that executes is the one that matches the input to the case selector, so that dictates the timing. You can prove this with a small modification to your VI: time each loop iteration by storing the tick count in a shift register, and subtracting the shift register value from the current tick count. You'll see the timing change based on which condition you select, unless your loop timer is so slow that it dominates the analog acquisition time.
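Since a block diagram can't be pasted as text, here is a rough Python sketch of the tick-count measurement described above. The variable `previous` plays the role of the shift register. The 32-bit counter width and the sample tick values are assumptions for illustration; the masked subtraction is there so the result stays correct when the counter rolls over.

```python
TICK_BITS = 32                   # assumed tick counter width
TICK_MASK = (1 << TICK_BITS) - 1


def elapsed_ticks(previous, current):
    """Ticks between two counter readings, correct across one rollover."""
    return (current - previous) & TICK_MASK


def time_iterations(tick_counts):
    """Per-iteration durations from a list of tick counts, one per iteration."""
    durations = []
    previous = tick_counts[0]    # value held in the shift register
    for current in tick_counts[1:]:
        durations.append(elapsed_ticks(previous, current))
        previous = current       # pass the current count to the next iteration
    return durations


# Hypothetical readings: two fast iterations, then two slow ones after
# the case selector changes.
print(time_iterations([100, 260, 420, 1420, 2420]))  # [160, 160, 1000, 1000]
```

On the real target the same subtraction would be wired up with a Tick Count function and a shift register, and the change in the measured value when the selector flips is what shows that only the selected case sets the loop time.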
That is what I have always thought too. And it makes sense with what I am seeing based off the sampled data. I read this idea when I was reading through one of the High Throughput LabVIEW course slides (I have my CPI but I've never done this course so I was looking through it). The slide looks like so:
The text notes for the slide say: "For the case structure, each case is given its own dedicated HW resources. A case executes by calculating the results of all of the cases and selecting which case's outputs are provided to downstream logic. This means that the critical path of the case structure will be the case with the longest propagation delay. For this example the add will be the critical path since it has many more layers of logic than the AND function."
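To make the slide's wording concrete, here is a small Python model of a case structure built as "compute every case, then select". It only illustrates the add/AND example in the slide notes; the function name and the 16-bit width are assumptions, and it says nothing about how I/O nodes are handled.

```python
def case_structure_as_mux(selector, a, b):
    """Model of the slide's example: both cases are evaluated, one is selected."""
    add_result = (a + b) & 0xFFFF   # "add" case, always computed (16-bit wrap assumed)
    and_result = a & b              # "AND" case, always computed
    # The selector only picks which result reaches downstream logic.
    return add_result if selector else and_result


print(case_structure_as_mux(True, 3, 5))   # 8
print(case_structure_as_mux(False, 3, 5))  # 1
```

In this picture the selector behaves like a multiplexer on the outputs, which is why, for purely combinatorial logic, the slower of the two cases sets the propagation delay.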
I did a bit of searching and can't find this documented very well anywhere, which is really the origin of my question. I did come across one forum post that mentions this idea here:
I figured if NI and the forums said it, there might be some truth there. I also thought maybe this was only true for a Single Cycle Timed Loop (which I could see being the case), but I just can't find anything that says that. Also, maybe I just don't understand what that slide and forum post are saying? Maybe you can help me see where I've gone wrong in interpreting them.
Yes, this is specific to single-cycle loops. The thing to remember is that the compiler is trying to get that whole block diagram to calculate in one cycle, so it's going to be limited by the worst case of the case structure plus the logic going into and out of it.
For a regular loop, the compiler doesn't have the same constraint. It can break up what's in the case structure over multiple clock cycles if needed.
I can't prove this, but I suspect that slide is an oversimplification. There's probably an enable signal provided to the I/O node only when that specific case executes, such that the I/O node in the inactive case does not actually execute.
If you've used the Xilinx IP blocks, you may have seen that some have IP/Clock enable signals, and if you don't use them, the block executes on every clock cycle even when the LabVIEW code indicates that the block should not be executing (for example, because it's in an inactive case of a case structure). The I/O Node is probably similar, except with an implicit (rather than explicit) enable signal.
Okay, thanks for confirming that the limiting critical path is only for SCTL. I figured that must be the case but I just couldn't find it documented anywhere.
I have used Xilinx IP blocks before, but it's been quite a long time. So, let me try to sum this all up here. Inside a SCTL, the critical path IS the longest path through any of the cases. Outside a SCTL, enable signals or something else in the FPGA build ensure that only one part of the case structure executes at a time. Which brings me back to my original point, outside a SCTL, having signals read from the same device in multiple cases of a case structure does NOT affect the max sample rate, because they are NOT executing at the same time.
Again, thanks for confirming. I was getting worried about some of my designs after I read that. Perhaps that training slide could use some more explicit descriptions.
Inside a SCTL, the critical path IS the longest path through any of the cases. Outside a SCTL, enable signals or something else in the FPGA build ensure that only one part of the case structure executes at a time. Which brings me back to my original point, outside a SCTL, having signals read from the same device in multiple cases of a case structure does NOT affect the max sample rate, because they are NOT executing at the same time.
That's not quite right. The enable signals apply regardless of whether or not the code is inside a single-cycle timed loop (although, as a side note, most analog I/O nodes can't execute in a single cycle and therefore can't be placed in a SCTL). Within a SCTL, the compiler has to verify that regardless of which case executes, it will complete within a single clock cycle, but if that condition is satisfied then what's inside the case structure doesn't affect the timing because the entire loop always takes exactly one clock cycle. Outside a SCTL, there isn't really a "critical path" because you haven't set any timing constraints.
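A rough Python sketch of that SCTL point, in case it helps: the worst-case case only decides whether the loop compiles at a given clock rate, and the iteration time itself is always one clock period. The 40 MHz clock and the per-case delays are made-up numbers for illustration, not values from any real compile report.

```python
def clock_period_ns(clock_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def meets_timing(case_delays_ns, clock_hz):
    """True if even the slowest case settles within one clock period."""
    return max(case_delays_ns.values()) <= clock_period_ns(clock_hz)


def sctl_iteration_time_ns(clock_hz):
    """An SCTL iteration always takes exactly one clock period."""
    return clock_period_ns(clock_hz)


delays = {"add": 6.0, "and": 1.5}          # hypothetical propagation delays in ns

print(meets_timing(delays, 40_000_000))    # True: 6.0 ns fits in 25 ns
print(meets_timing(delays, 200_000_000))   # False: 6.0 ns does not fit in 5 ns
print(sctl_iteration_time_ns(40_000_000))  # 25.0, whichever case is selected
```

So the longest case matters as a pass/fail check at compile time, and it does not change how long each iteration takes once the design meets timing.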
So, I should probably think about this more as a compile-time consideration. When laying out the longest path through a timed section, the critical path matters. However, once actually on the FPGA, the enable signals prevent that section of code from performing any actions until enabled. Both paths may be "executing" but only one path is enabled. I can thus conclude that this won't affect the AI sample rates, since the disabled signals won't be pulled in until enabled. I mean this for both inside and outside the SCTL, although yes, inside the SCTL I wouldn't be able to do those AIs. This is also why we were saying that it doesn't matter unless we're inside the SCTL, because this only matters for timing constraints.
My thinking about it was wrong because when it said the cases of a case structure run in parallel, I figured from the wording that it must be pulling in both my high and low frequencies at the same time. But if the enable signals are preventing that, then I think I understand now how this is working. In parallel, but disabled.
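Here is a toy Python model of the "in parallel, but disabled" picture. It rests on the unconfirmed guess made earlier in the thread that each I/O node has an implicit enable. The `IONode` class, the 8-channel low-speed scan, and the selector sequences are all hypothetical.

```python
class IONode:
    """Toy stand-in for an analog input node with an implicit enable."""

    def __init__(self, channels):
        self.channels = channels
        self.conversions = 0  # ADC conversions this node has consumed

    def clock(self, enable):
        # The hardware for the node always exists, but it samples only when enabled.
        if enable:
            self.conversions += len(self.channels)


def run(selector_sequence):
    """Step both cases through a sequence of selector values (True = high-speed mode)."""
    low_speed = IONode(channels=list(range(8)))  # hypothetical 8-channel scan
    high_speed = IONode(channels=[0])            # single fast channel
    for high_mode in selector_sequence:
        high_speed.clock(enable=high_mode)
        low_speed.clock(enable=not high_mode)
    return low_speed.conversions, high_speed.conversions


print(run([True] * 5))                 # (0, 5)
print(run([False] * 2 + [True] * 3))   # (16, 3)
```

In this model the low-speed case uses no conversions at all while the high-speed case is selected, which matches the conclusion that the single channel gets the card's full rate.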