Using VI Scripting to generate FPGA VI

jiangliang · ‎11-11-2019

Hello there:

My job requires me to use about 200 "Discrete Delay" functions with different delay settings on the FPGA. I know we can use VI scripting to generate VIs on the host, but I couldn't find a palette for that there. Can anyone help with that?

Thanks!

Darren · ‎11-11-2019

There is an unofficial API for scripting FPGA nodes here:

[LabVIEW 20xx]\vi.lib\rvi\ClientSDK\Core\Script

Here is what the scripting diagram would look like for creating and configuring the Discrete Delay node:

National Instruments does not support using this API, but it should be relatively straightforward to figure out. Good luck!

Intaris · ‎11-12-2019

I made a malleable VI to do something like that for me.

Then I simply input the delay as a parameter to the malleable VI. Case structures within the malleable VI get constant folded, so there's no loss of efficiency.

Look HERE.

Intaris · ‎11-12-2019

@Darren wrote:

There is an unofficial API for scripting FPGA nodes here:

[LabVIEW 20xx]\vi.lib\rvi\ClientSDK\Core\Script

Here is what the scripting diagram would look like for creating and configuring the Discrete Delay node:

National Instruments does not support using this API, but it should be relatively straightforward to figure out. Good luck!

I did NOT know about these, nice.

Oh my lord, I could have used these like 2 years ago.....

wiebe@CARYA · ‎11-12-2019

@jiangliang wrote:

My job requires me to use about 200 "Discrete Delay" functions with different delay settings on the FPGA. I know we can use VI scripting to generate VIs on the host, but I couldn't find a palette for that there. Can anyone help with that?

Why not make one history buffer sized to the maximum delay? A simple array or FPGA memory, with a pointer that wraps around its size? Then you can get all the delayed samples from that buffer. That could be done from a For Loop over the delays, or, if you really want to parallelize it, with copies of a VI. It's probably much less gate hungry than 200 separate delays, each keeping a buffer the size of its own delay.
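To make the shared-buffer idea concrete, here is a minimal behavioral sketch in Python (not LabVIEW FPGA code). The class name `MultiTapDelay` and its `step` method are hypothetical names chosen for illustration. It assumes one sample arrives per call and that the buffer starts out filled with zeros. It also ignores hardware limits, such as how many reads a memory block allows per clock cycle.

```python
class MultiTapDelay:
    """Software model of one shared history buffer read at several delays."""

    def __init__(self, delays):
        if not delays or min(delays) < 1:
            raise ValueError("each delay must be at least 1 sample")
        self.delays = list(delays)
        self.size = max(self.delays)      # buffer sized to the maximum delay
        self.buffer = [0] * self.size     # stands in for an array or FPGA memory
        self.write_ptr = 0                # pointer that wraps around the buffer size

    def step(self, sample):
        """Store one new sample and return the sample delayed by each tap."""
        # Read before writing: the slot at write_ptr still holds the sample
        # from `size` calls ago, so the maximum delay remains readable.
        taps = [self.buffer[(self.write_ptr - d) % self.size]
                for d in self.delays]
        self.buffer[self.write_ptr] = sample
        self.write_ptr = (self.write_ptr + 1) % self.size
        return taps


# Two taps (1 and 3 samples of delay) sharing one 3-element buffer.
line = MultiTapDelay([1, 3])
outputs = [line.step(x) for x in (10, 20, 30, 40, 50)]
```

Each call returns one delayed value per configured tap, so all 200 delays could in principle be served from a single buffer instead of 200 independent ones.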

Search LabVIEW like a graph!

Intaris · ‎11-12-2019

wiebe@CARYA wrote:

one history buffer sized to the maximum delay? A simple array keeping the pointer and wrapping around its size? Then you can get all the delayed samples from that buffer. That could be done from a For Loop over the delays, or, if you really want to parallelize it, with copies of a VI. It's probably much less gate hungry than 200 separate delays, each keeping a buffer the size of its own delay.

It's actually not that inefficient. It can be inferred as either SRLs (where up to 16 or 32 delay stages - pipelined - can be combined into a single LUT) or as BRAM if the delay needs to be really large.

I avoid Arrays as much as possible.

wiebe@CARYA · ‎11-12-2019

@Intaris wrote:

wiebe@CARYA wrote:

one history buffer sized to the maximum delay? A simple array keeping the pointer and wrapping around its size? Then you can get all the delayed samples from that buffer. That could be done from a For Loop over the delays, or, if you really want to parallelize it, with copies of a VI. It's probably much less gate hungry than 200 separate delays, each keeping a buffer the size of its own delay.

It's actually not that inefficient. It can be inferred as either SRLs (where up to 16 or 32 delay stages - pipelined - can be combined into a single LUT) or as BRAM if the delay needs to be really large.

I avoid Arrays as much as possible.

But does it scale up well to 200 delays? Or large delays? For 200 delays, I imagine 32 stages isn't that much. (I'm asking, I really don't know)

I never used the built-in FPGA functions much. I always run into some limit that makes them useless. Somehow my FIR filter is higher order or needs more bits than offered, or I need 8 and only >whatever< number is available...

FPGA Arrays are OK if they're small, but conceptually (sorry if I edited that in after you replied) FPGA memory would work as well.

Intaris · ‎11-12-2019

Wait, what do you mean by "200 delays"? I understood that as 200 individual instances of delays, with a different delay for each.

A delay of 200 cycles is a single delay.

But of course there's nothing stopping you from chaining multiple SRLs one after another to get longer delays: 10 SRLs, each set to a delay of 16, give a total delay of 160 cycles. This scales pretty linearly, but as I said earlier, depending on the bit width, it may be better to do it in BRAM.
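As a rough illustration of that linear scaling, here is a small Python sketch that counts chained SRL stages for a given delay. The function names are hypothetical, and the LUT estimate assumes one LUT per bit per SRL stage, which is a simplification rather than a compiler guarantee. Actual resource usage depends on the target device and on what the compiler infers.

```python
def srl_stages(delay_cycles, srl_depth=16):
    """Number of chained SRLs needed to reach the requested delay."""
    if delay_cycles < 1 or srl_depth < 1:
        raise ValueError("delay and SRL depth must be positive")
    # Integer ceiling division: a partly used SRL still occupies a stage.
    return -(-delay_cycles // srl_depth)


def srl_lut_estimate(delay_cycles, bit_width, srl_depth=16):
    """Rough LUT count, assuming one LUT per bit per SRL stage."""
    return bit_width * srl_stages(delay_cycles, srl_depth)


# The example from the post: a delay of 160 with 16-deep SRLs.
stages_for_160 = srl_stages(160, srl_depth=16)

# A 16-bit value delayed by 200 cycles using 32-deep SRLs.
luts_for_200 = srl_lut_estimate(200, bit_width=16, srl_depth=32)
```

Comparing this estimate against a personal LUT budget is one way to decide when a delay is better moved into BRAM.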

To enable this LUT usage, you just need a feedback node (preferably with its direction reversed) with no default value and with initialisation at compile time. The compiler will then utilise SLICEM if available. The Discrete Delay the OP mentions just forces this allocation but ends up being functionally equivalent.

I also don't use many built-in FPGA functions. Xilinx IPCores excepted.

Intaris · ‎11-12-2019

wiebe@CARYA wrote:

FPGA Arrays are OK if they're small, but conceptually (sorry if I edited that in after you replied) FPGA memory would work as well.

Ooh, sneaky. 😋

The thing with Arrays is they take not only space for the array itself, but rotating and so on take increasingly more resources the larger the array gets (i.e. the number of elements in the array as well as the size of each element). BRAM resource usage stays pretty much constant irrespective of the length of the buffer (unless you need more than 1 BRAM). Indexing is "free" because it's dedicated hardware on the FPGA chip.

My cut-off is normally around 200-300 LUTs or Registers. Then I'll implement something in BRAM (Our design has plenty left over).

Here's an excerpt from a really old document outlining how expensive different operations actually are in LV FPGA. It was back in LV 8.5, but the basic scaling I think remains accurate, even if the numbers are not all 100% correct. Of course, this is for dynamic addressing, not valid when the same indices are always being read (as would probably be the case here). Static indexing is way more efficient.

Note: These costs are for the advertised operation only, they do NOT include the cost of the actual array storage itself.

Increasing cost of array size (Dynamic addressing only)

Intaris · ‎11-12-2019

Just a quick follow-up. It's been annoying me that I couldn't remember where I learned of the Feedback node SRL trick. It's in the High Performance FPGA guide. Page 68.

Here's a clip of the part I am referring to. The previous page is also important, but I'm just highlighting the trick here.

This is a gem of a trick!
