LabVIEW

Pass array between FPGA loops

Hello, I have a LabVIEW project running on an FPGA that looks like this:

Template.png

 

My main question: what is the best way to pass array data between FPGA loops 1 and 2? I've used registers for other single values, but there has to be a better way than using 15 separate registers.

 

Questions 2 & 3 reflect my trying to understand the range of the words "large," "complex," and "fast" when working with FPGAs. For example, I understand that real-time VIs can be used in tandem with FPGAs to ensure more complex procedures don't interfere with the FPGA's data acquisition rate. What types of operations would you look to offload? Further, LabVIEW's tutorial on Improving Efficiency When Accessing DMA FIFOs talks about using an In Place Element Structure to work with a large amount of data. How large does a data set need to be before this practice is worth implementing?

Message 1 of 7

Hi,

 

1 - you can use an FPGA FIFO - this behaves more or less like a DMA FIFO but doesn't transmit to/from the "host", which in the context of FPGA programming is your RT system (not the separate desktop computer). (Edit: Considering the last comment below within this post, perhaps you have no RT system?)

 

2&3 - if you're sending hundreds of singles a second in an array of singles, you are none of "large", "fast" or "complex", which is nice. Complex probably means non-native/default data types - clusters of things, perhaps even nested clusters and so on. In the context of an FPGA, ~single digit kHz is comfortably not fast. I'm not sure exactly where I'd say is "fast", but I'm running at ~150kHz with a packed datatype in U32s (18 bits from an ADC, then some timing information and channel data) and haven't needed to use "read regions" or similar. Perhaps it would be better if I did, but it works without.
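To make the "packed datatype in U32s" idea concrete, here is a minimal sketch in Python (used only as a text stand-in, since the real thing would be LabVIEW FPGA bit operations). The field layout is an assumption for illustration: the post only says 18 ADC bits plus "some timing information and channel data", so the 4-bit channel field, the 10-bit tick field, and the signed interpretation of the sample are hypothetical.

```python
# Hypothetical layout (only the 18-bit ADC width comes from the post):
#   bits  0..17  ADC sample, assumed signed two's complement
#   bits 18..21  channel index (assumed 4 bits)
#   bits 22..31  tick counter (assumed 10 bits, wraps around)
ADC_BITS = 18
CHANNEL_BITS = 4
TICK_BITS = 10

ADC_MASK = (1 << ADC_BITS) - 1
CHANNEL_MASK = (1 << CHANNEL_BITS) - 1
TICK_MASK = (1 << TICK_BITS) - 1


def pack_sample(adc, channel, tick):
    """Pack one sample into a single unsigned 32-bit word."""
    if not -(1 << (ADC_BITS - 1)) <= adc < (1 << (ADC_BITS - 1)):
        raise ValueError("adc value does not fit in 18 signed bits")
    if not 0 <= channel <= CHANNEL_MASK:
        raise ValueError("channel does not fit in 4 bits")
    adc_field = adc & ADC_MASK      # two's complement encoding
    tick_field = tick & TICK_MASK   # tick counter is allowed to wrap
    return (tick_field << (ADC_BITS + CHANNEL_BITS)) | (channel << ADC_BITS) | adc_field


def unpack_sample(word):
    """Recover (adc, channel, tick) from a packed 32-bit word."""
    adc = word & ADC_MASK
    if adc & (1 << (ADC_BITS - 1)):  # sign bit set, so the value is negative
        adc -= 1 << ADC_BITS
    channel = (word >> ADC_BITS) & CHANNEL_MASK
    tick = (word >> (ADC_BITS + CHANNEL_BITS)) & TICK_MASK
    return adc, channel, tick
```

With a layout like this, every sample travels through the FIFO as one native U32 element, which is presumably why no cluster or array handling is needed at that rate.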

 

Does your code as you've shown it work? Are you sure the loop in the bottom right ("Computer <-> FPGA Loop") isn't running on your RT system? I'm not exactly sure what your communication system looks like, but perhaps being able to do it directly depends on the location of your FPGA - is this a PCIe card or similar inside a desktop computer?

 

For a cRIO (which this may not be, but anyway...) you can't directly communicate between FPGA and a desktop, but I imagine with embedded cards the answer is different.


GCentral
Message 2 of 7

Hi J,

 

Q1: when your loops (in the FPGA) run at different, but low, frequencies you could use a simple global variable…

Best regards,
GerdW


using LV2016/2019/2021 on Win10/11+cRIO, TestStand2016/2019
Message 3 of 7

@GerdW wrote:

Hi J,

 

Q1: when your loops (in the FPGA) run at different, but low, frequencies you could use a simple global variable…


I would recommend using Registers (or Handshakes) instead. Behind the scenes it will be the same VHDL code anyway (Registers for two loops with the same clock, Handshakes for loops with different clocks).

 

Registers can be instantiated within the code, Globals cannot. So if you end up with any kind of code you might want to expand to add a second processing loop, it's much easier to just duplicate the registers on the BD than have to go to the global, add the elements, drop the global and then select the correct element.

 

But the main question I would have is: Do you need every element sent? I'm assuming "no" as this will be difficult since your "producer" is running 5x as fast as your consumer. If you don't need every update, how often do you need a "new" value in the consumer loop? I ask because depending on the relationship between the two clocks being used, you may actually only get an update of your register (or handshake) every N cycles at the consumer side (where N is typically between 2 and 4).
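The "do you need every element?" question can be illustrated with a small behavioural model. This is a simplified sketch in Python, not LabVIEW code: it assumes an idealized last-value register with no clock-crossing latency, and the function name and the `ratio` parameter are made up for the example.

```python
def simulate_register(producer_values, ratio):
    """Model a last-value register between a fast producer and a slow consumer.

    The producer writes every value; the consumer reads once per `ratio`
    producer writes. Each write overwrites the previous one, so the
    consumer only ever sees the most recent value.
    """
    register = None
    seen_by_consumer = []
    for count, value in enumerate(producer_values, start=1):
        register = value                  # overwrite, nothing is queued
        if count % ratio == 0:            # consumer iteration
            seen_by_consumer.append(register)
    return seen_by_consumer


# Producer running 5x as fast as the consumer, as in the original post.
print(simulate_register(list(range(1, 16)), ratio=5))  # prints [5, 10, 15]
```

In this model four out of every five values are silently dropped, which is fine for "latest value" signals but not if the consumer needs all 15 elements.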

 

And my other big question would be: Do you really need floating point values? OK; I see you plan on using FXP.

 

Further points:

1) What is the rate of updates between the two loops on the FPGA? Is it always 3 values? This defines the width of your interface, i.e. the datatype used in the solution I propose later.

2) How many elements will you have in total for sending over the DMA? I see 16 being mentioned, but you know you can only send one at a time from the FPGA, right?

3) Personally, I would use a dual-clock BRAM to store the items on the consumer side and then iterate through them to send via DMA.

4) Writing directly to BRAM from Loop 1 will work if you take care to make sure you never read and write the same address at the same time. Using a paging or banking approach will generally prevent this from happening. If there are simultaneous writes and reads in unrelated clock domains, data corruption may occur.

5) You will need some signalling for the consumer to know when it can send. And some indexing to iterate through all of the elements required.
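Points 3 to 5 above can be sketched as a two-bank ("ping-pong") memory with a ready flag. This is only a behavioural model in Python under stated assumptions: the class, its method names, and the single `ready_bank` flag are hypothetical, and real dual-clock BRAM access in LabVIEW FPGA would use memory items and a handshake rather than Python lists.

```python
class PingPongMemory:
    """Two banks: the producer fills one while the consumer reads the other."""

    def __init__(self, bank_size):
        self.bank_size = bank_size
        self.banks = [[0] * bank_size, [0] * bank_size]
        self.write_bank = 0      # bank currently owned by the producer
        self.write_index = 0     # next address inside that bank
        self.ready_bank = None   # signalling: which bank is complete, if any

    def write(self, value):
        """Producer side: store one element and swap banks when full."""
        self.banks[self.write_bank][self.write_index] = value
        self.write_index += 1
        if self.write_index == self.bank_size:
            self.ready_bank = self.write_bank   # tell the consumer it can send
            self.write_bank ^= 1                # continue in the other bank
            self.write_index = 0

    def read_ready(self):
        """Consumer side: return a completed bank, or None if none is ready."""
        if self.ready_bank is None:
            return None
        data = list(self.banks[self.ready_bank])  # iterate through all elements
        self.ready_bank = None
        return data
```

Because the producer never writes to the bank that is flagged as ready, reads and writes never hit the same address at the same time, provided the consumer drains a bank before the producer fills the other one. The sketch does not detect that overrun case.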

Message 4 of 7

I *think*, based on the subdiagram labels for the loops and the numbers given, that the idea is one loop will enqueue 3 elements per iteration and run 5 times as fast as the consumer, which will read 15 elements each iteration.

 

If so, perhaps using Replace Array Subset in the producer and only transferring (via handshake, register, whatever) on the 5th iteration each time would be easier to reason about?
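A rough text version of that idea, written in Python as a stand-in for the block diagram: the slice assignment plays the role of Replace Array Subset, and appending to `transfers` stands in for whatever handshake or register write is used. The constants follow the 3-per-iteration, 5x-faster reading of the original post; the function name is made up.

```python
SAMPLES_PER_ITERATION = 3     # elements produced per producer iteration
ITERATIONS_PER_TRANSFER = 5   # producer runs 5x as fast as the consumer
BUFFER_SIZE = SAMPLES_PER_ITERATION * ITERATIONS_PER_TRANSFER  # 15


def run_producer(batches):
    """Collect 3-element batches into a 15-element array; hand it over every 5th iteration."""
    buffer = [0] * BUFFER_SIZE
    transfers = []
    for iteration, batch in enumerate(batches):
        if len(batch) != SAMPLES_PER_ITERATION:
            raise ValueError("each batch must hold exactly 3 elements")
        slot = iteration % ITERATIONS_PER_TRANSFER
        start = slot * SAMPLES_PER_ITERATION
        # Equivalent of Replace Array Subset at index `start`.
        buffer[start:start + SAMPLES_PER_ITERATION] = batch
        if slot == ITERATIONS_PER_TRANSFER - 1:
            transfers.append(list(buffer))   # transfer on the 5th iteration only
    return transfers
```

The consumer then sees one complete 15-element array per transfer, so it never has to reason about partially filled data.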

I think this (with a FIFO) could also work though...

Example_VI_BD.png

Perhaps this is undesirable because FIFOs are more expensive to implement?


GCentral
Message 5 of 7

I think the desired Clock speeds pretty much mandate using single cycle timed loops (SCTL).

 

Once you have SCTLs, writing or reading a FIFO with a FOR loop is not possible.

 

Is the data from the first loop continuous or packet-based, i.e. is the ordering important?

Message 6 of 7

I've used FPGA memory in the past, but for larger sizes.

 

So one loop reads the memory, the other writes it.

 

You can have a loop that reads (or writes) the memory at an address set by a control, and have it put the memory value in an indicator. The RT VI can then get/set the memory in a loop, one address at a time, to get/set all of it.

 

Race conditions lurk, so be warned...
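One of those races can be shown with a toy model. This is a Python sketch under simplifying assumptions, not LabVIEW code: `MemoryWindow`, `fpga_iteration`, and the `handshake` flag are hypothetical names, and the FPGA loop is reduced to a single method call.

```python
class MemoryWindow:
    """FPGA side: an address control, a value indicator, and the memory behind them."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.address = 0    # control written by the host
        self.value = None   # indicator read by the host

    def fpga_iteration(self):
        """One pass of the FPGA loop: copy memory[address] to the indicator."""
        self.value = self.memory[self.address]


def host_read_all(window, handshake):
    """Host side: step through every address and collect the indicator value."""
    result = []
    for address in range(len(window.memory)):
        window.address = address
        if handshake:
            window.fpga_iteration()      # host waits until the FPGA has updated
            result.append(window.value)
        else:
            result.append(window.value)  # host reads too early: stale value
            window.fpga_iteration()
    return result


print(host_read_all(MemoryWindow([10, 20, 30]), handshake=True))   # prints [10, 20, 30]
print(host_read_all(MemoryWindow([10, 20, 30]), handshake=False))  # prints [None, 10, 20]
```

Without some confirmation that the FPGA loop has processed the new address, the host in this model reads the value belonging to the previous address.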

Message 7 of 7