LabVIEW

Maximizing resources on FPGA by computing in "shifts"

Hi all,

 

I am trying to do linear algebra on an FPGA, as I need fast computation to get an accurate observer for control applications.  I need to compute at >=100kHz; my FPGA target has a 40MHz clock, which gives me 400 clock cycles per update (40MHz / 100kHz).

 

I have broken down my matrices into 1D row arrays, and I am using the Dot Product VI.  (My FPGA target only supports 1D arrays; and I am using the 2013 version.)

 

The computation flow is as follows: Case 0 would compute a1*x1, a2*x2, a3*x3, and a4*x4, and store them in y1, y2, y3, and y4.  Something would trigger the case to advance.  Then, Case 1 would compute a5*x5, a6*x6, a7*x7, and a8*x8, and store them in y5, y6, y7, and y8; and so on.

 

Using 8 Dot Product VIs, I can do all necessary computations in four cases.  (Ten of the computations are size 1x8 * 8x1; sixteen computations are size 1x2 * 2x1.)  I cannot use more than 8 Dot Product VIs because I run out of resources on the FPGA, so I am using case structures to "schedule shifts" of inputs/outputs to a set of 8 Dot Product VIs.  (See screenshot below.)
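
A minimal Python sketch of the shift arithmetic described above, using only numbers from the post (26 dot products, 8 units, 40MHz clock, 100kHz target). The names `schedule`, `jobs`, and `UNITS` are hypothetical, and the even split of the cycle budget across shifts is an assumption for illustration, not something the LabVIEW code enforces.

```python
CLOCK_HZ = 40_000_000   # FPGA base clock from the post
RATE_HZ = 100_000       # required update rate from the post
UNITS = 8               # Dot Product VIs that fit on the target

# Lengths of the 26 dot products: ten of length 8, sixteen of length 2.
jobs = [8] * 10 + [2] * 16

def schedule(jobs, units):
    """Split the job list into consecutive shifts of at most `units` jobs."""
    return [jobs[i:i + units] for i in range(0, len(jobs), units)]

shifts = schedule(jobs, UNITS)
cycles_per_update = CLOCK_HZ // RATE_HZ                  # 400 cycles per update
cycles_per_shift = cycles_per_update // len(shifts)      # budget per case, if split evenly

print(len(shifts), cycles_per_update, cycles_per_shift)  # prints "4 400 100"
```

With these numbers the last shift holds only two jobs, so each case has roughly 100 cycles available before the next one has to start.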

 

However, I am struggling to get the computed result of each Dot Product VI into the right variable.  The Dot Product VI has a computational delay of >=1 cycle that seems to be unpredictable, and this causes my results to be stored in the wrong outputs (e.g. a1*x1 gets stored in y5).

 

Does anyone have any advice on how to make this work better?  Here are some thoughts I have had:

 

1.  Use a boolean timing flag in conjunction with the "Ready for Output" input on the Dot Product VI in order to use them like a register, triggering them to "flush" when the flag is activated.  This sort of works, except if the flag stays on for more than one cycle or doesn't get detected when triggered, things can go wonky.

 

2.  Offset the later cycle by some fixed number of cycles. This only works if the Dot Product VIs have a constant cycle delay.  I've had mixed results with this; under some conditions it seems to work, under others it does not.  I don't fully understand the cycle delay.

 

3.  Try to use the "ready for input" output flag from the Dot Product VI to trigger the case to advance.  Again, I can't quite get consistent results.  It seems like everything triggers at different times.  I also don't fully understand what the "ready for output" and "ready for input" flags cause the Dot Product VI to do (does it cache its computed output until the downstream signals it is ready?).

 

Code is attached (runs on an FPGA target, mine is a cRIO-9030), and here's a screenshot showing the idea I've been working with.

 

I'm quite tired of waiting 10 minutes to compile only to find out it doesn't behave in a way that makes sense over and over!  I would appreciate any kind of insight FPGA experts out there might have.  Thanks 🙂

The general idea I'm exploring here

Message 1 of 6

@phototr0pe wrote:
(...)

 

I'm quite tired of waiting 10 minutes to compile only to find out it doesn't behave in a way that makes sense over and over!  I would appreciate any kind of insight FPGA experts out there might have.  Thanks 🙂

 


One quick tip: if you want to test the VI, you don't have to compile it for the FPGA. You can simply run the VI under the My Computer target (add it to the project). There are also FPGA simulation options available (but this is a quick tip, so I don't have time to elaborate on them now 😉).

Message 2 of 6

In addition to that quick tip, here's a link describing checking FPGA code without compiling:

https://www.digiajay.com/single-post/2017/06/04/No-need-to-install-FPGA-compilers-to-review-or-edit-...


Message 3 of 6

Thanks for the suggestions, all.  Unfortunately, neither Simulation mode on the FPGA target nor running the VI on another target (e.g. the host PC) works, at least with my cRIO-9030 and LabVIEW 2014.  The VI runs, but the Dot Product computation always returns zero.  Perhaps the single-cycle Timed Loop is not implemented for these features?

 

EDIT:  Ok, after clearing the VI from memory and trying the Simulation mode again, it does actually work with the SCTL.  I'm not sure why it didn't before!  Good tip.

Message 4 of 6
Solution
Accepted by topic author phototr0pe

Ok, for anyone else out there who wants to use this technique, I've figured out a few things.

 

  • First, the Dot Product VI doesn't seem to cache or otherwise withhold output, no matter what you wire to "Input Valid" or "Ready for Output."  
  • If you wire "True" to "Input Valid," the Dot Product VI will start computing, and will return "False" for "Ready for Input."  If you wire "False" to "Input Valid," the Dot Product VI will not start computing.

How I solved the problem:

  • I ended up just wiring a "true" constant both to "Input Valid" and "Ready for Output," to force the VI to compute immediately, in case of latency.
  • I then created another case for the outputs only, "Output case"; an empty case (Output case = 4) is the default and means that no output is sent to indicators.  If Output case = {0,1,2,3}, Dot Product outputs are sent to some set of indicators.
  • After a set number of base clock cycles, for one single cycle (cycle [k]) I set a boolean called "Advance Case" to true.  This is the chain of events:
    • Cycle [k-1]: "Advance Case" = false.  "Input case" = 0.  Dot Product VI has had N cycles to compute.  "Output case" = 4 (empty case; no indicators are being updated).
    • Cycle [k]: "Advance Case" = True.  (This is triggered by a cycle counter that resets when Advance Case = True.)  This causes Output Case = 0, sending the computed output of the Dot Product VIs to the correct indicators.  Input Case = Input Case + 1 (now = 1).
    • Cycle [k+1]: Output case = 4 (empty case).  Input case = 1, and the Dot Product VIs receive a new set of inputs to compute.
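
The chain of events above can be sketched as a small cycle-by-cycle simulation in Python. This is only a model under stated assumptions: the Dot Product VI is replaced by a fixed-latency delay line (`latency` is a hypothetical value, not a measured one), and `run`, `dot`, and `period` are illustrative names rather than anything in the LabVIEW project.

```python
EMPTY = 4       # "Output case" value that updates no indicators
NUM_CASES = 4   # input cases 0..3

def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))

def run(inputs, cycles, latency=3, period=10):
    """Simulate the Advance Case scheme; inputs[c] is the (a, x) pair for case c."""
    pipeline = [None] * latency   # assumed fixed latency, modelled as a delay line
    results = {}                  # output case -> value written to its indicators
    input_case, counter = 0, 0
    for _ in range(cycles):
        advance = counter == period - 1          # true for exactly one cycle
        output_case = input_case if advance else EMPTY
        out = pipeline[-1]                       # what entered `latency` cycles ago
        if output_case != EMPTY and out is not None:
            results[output_case] = out
        a, x = inputs[input_case]                # inputs selected by the input case
        pipeline = [dot(a, x)] + pipeline[:-1]
        if advance:
            input_case = (input_case + 1) % NUM_CASES
            counter = 0                          # counter resets on Advance Case
        else:
            counter += 1
    return results

inputs = [([1, 2], [3, 4]), ([1, 1], [1, 1]), ([2, 0], [5, 9]), ([0, 3], [7, 2])]
print(run(inputs, 40))   # {0: 11, 1: 2, 2: 10, 3: 6}
```

In this model the results land in the right slots as long as `period` exceeds `latency`. Calling `run(inputs, 8, latency=3, period=2)` instead reproduces the failure mode from the original question: case 0's product ends up in slot 1.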

labview2.png

 

It's not strictly deterministic, as not all eight states update at the same time; but I can update all elements at >=500kHz, which should be fast enough to appear deterministic to my 1kHz control loop.

Message 5 of 6

@phototr0pe, I'm glad you got things working, although you can make the code a bit more robust with a few changes.

 

First, I would look over the documentation for the Dot Product function. It's possible some of the issues you are seeing are due to differing configurations of the nodes, specifically the pipeline stages. This function uses a high-speed handshake protocol for operation to ensure data can be pipelined through the design safely. If designed properly, you should not need to time anything in the function but can rather just let the data flow through the system using the control flow handshake signals.

 

Like you mentioned, blocks using the handshake protocol only maintain their output value for the cycle the Output Valid signal goes high. If you don't capture the value on that cycle, it will be lost. However, the output will only be delivered if Ready for Output is asserted. 
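
As a rough illustration of capturing on the Output Valid cycle, here is a Python sketch of the consumer side only. It assumes the unit returns results in the same order the inputs were submitted, which this thread does not confirm, and `capture_outputs`, `samples`, and `destinations` are hypothetical names. The point is that the destination is chosen by counting Output Valid pulses rather than clock cycles.

```python
def capture_outputs(samples, destinations):
    """Latch each value on its Output Valid cycle and route it in submission order.

    samples: one (output_valid, value) pair per clock cycle.
    destinations: output names in the order the inputs were submitted.
    """
    captured = {}
    pending = iter(destinations)
    for output_valid, value in samples:
        if output_valid:                       # value is only meaningful this cycle
            captured[next(pending)] = value    # assumes in-order results
    return captured

# Two results arrive after different delays; the stale 99 is ignored.
samples = [(False, 0), (False, 0), (True, 11), (False, 99), (True, 2)]
print(capture_outputs(samples, ["y1", "y5"]))   # {'y1': 11, 'y5': 2}
```

Because routing depends only on the valid pulses, the result no longer depends on knowing the exact latency of the function.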

 

I'm not sure exactly how the application will service inputs and outputs, but using arrays to hold all the values can consume a lot of resources. If possible, it would be good to store the values in a memory and read one row or column at a time. If the values are coming through a DMA channel, there are a number of schemes to use to partition the data efficiently.

 

Again, the handshake signals are helpful here as you have many options to partition the data across the Dot Product functions. For instance, if you increase the pipeline depth of the Dot Product function, and decrease the number of elements per cycle from memory, you may be able to increase the clock rate of the loop up to 120 MHz or higher and actually get better throughput, possibly with fewer Dot Product functions. You should have no problem hitting a 1 kHz response time for an application like this!
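
A back-of-the-envelope Python estimate of what streaming from memory could buy, using the job sizes from the original post and the 40 MHz and 120 MHz clock rates mentioned in this thread. It assumes a single Dot Product function fed `elems_per_cycle` elements per clock and ignores pipeline latency, memory access overhead, and handshake stalls, so treat the numbers as an upper bound; `total_cycles` and `update_rate_hz` are hypothetical helper names.

```python
from math import ceil

jobs = [8] * 10 + [2] * 16   # dot-product lengths from the original post

def total_cycles(jobs, elems_per_cycle):
    """Cycles for one unit to stream every job, ignoring pipeline latency."""
    return sum(ceil(n / elems_per_cycle) for n in jobs)

def update_rate_hz(clock_hz, jobs, elems_per_cycle):
    """Approximate rate at which all jobs complete once, under the assumptions above."""
    return clock_hz / total_cycles(jobs, elems_per_cycle)

for clock_hz in (40e6, 120e6):
    rate = update_rate_hz(clock_hz, jobs, elems_per_cycle=1)
    print(f"{clock_hz / 1e6:.0f} MHz clock: about {rate / 1e3:.0f} kHz")
```

At one element per cycle the 26 dot products need 112 cycles in total, which fits inside the 400-cycle budget at 40 MHz even with a single function, before counting latency.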

Message 6 of 6