FPGA: Optimizing DSP48s usage

Hi, I am using a cRIO-9076, which has 58 DSP48s, to generate a signal and then compute the FFT, Vrms, power, dBm, dBu and dBV of the input signal. However, the code that I have written uses 100% of them, i.e. all 58 DSP48s.

As per my understanding, multiplication operations are what use DSP48s on the FPGA. Is there a way to reduce the DSP48 usage?

One method is to scale by a power of 2, but since my required values are not powers of 2, that is not an option.
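For reference, a constant that is not a power of 2 can still be written as a sum of powers of 2, so a multiply by a fixed constant can be decomposed into shifts and adds, which typically map to general logic rather than to a multiplier block. The sketch below is a plain Python model of that arithmetic, not LabVIEW FPGA code; the function names `shift_add_multiply` and `to_fixed`, the 8-bit fraction, and the example constant 0.7071 are illustrative assumptions and do not come from the attached VI.

```python
def to_fixed(value: float, frac_bits: int) -> int:
    """Quantize a real constant to an integer with `frac_bits` fractional bits."""
    return round(value * (1 << frac_bits))


def shift_add_multiply(x: int, constant: int) -> int:
    """Multiply x by a non-negative integer constant using only shifts and adds.

    Each set bit of `constant` contributes one shifted copy of x,
    so no general-purpose multiplier is needed.
    """
    if constant < 0:
        raise ValueError("constant must be non-negative in this sketch")
    result = 0
    shift = 0
    while constant:
        if constant & 1:            # this bit is set: add x * 2**shift
            result += x << shift
        constant >>= 1
        shift += 1
    return result


# Hypothetical example: scale a sample by roughly 0.7071 using 8 fractional bits.
FRAC_BITS = 8
k = to_fixed(0.7071, FRAC_BITS)                 # integer form of the constant
scaled = shift_add_multiply(1000, k) >> FRAC_BITS
```

The number of adders grows with the number of set bits in the constant, so this only pays off for fixed constants with few set bits or a short word length; it does not help when both operands vary at run time.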

I am attaching my code below. Can anyone help me optimize it?

Message 1 of 4

Hi linu,


why don't you calculate dBm, dBu, dBV, and Power on the RT side of your project?

Do you really need to calculate those values in the FPGA???

Best regards,

using LV2016/2019/2020 on Win8.1/10+cRIO
Message 2 of 4

I could do that, but this is an academic assignment focused on understanding the fundamentals of working within the limitations of the FPGA. I am curious about this issue because multiplication is a very basic operation and I might encounter similar problems in the future. How do I work around this problem?


This thread on the forum:


It refers to a problem similar to the one I am facing; however, people have posted references to CAR documents and links that are accessible only to NI. I have raised a service request for the same, but I have yet to hear back from them.

Message 3 of 4

If this is a homework question then I'll try to answer in a way that lets you put the blocks together. Since an FPGA is actual hardware that you are programming, you are going to have x amount of pathways to process inputs. So you have to imagine the FPGA as a circuit board with a fixed size, and you have to fit ALL the processing you need onto that circuit board.


An opposite problem is: how do you increase the processing power of an FPGA? You parallelize the processing. So instead of feeding an array of singles into a FOR loop (this would process the data one element at a time), you parallelize so that the array (let's say size 8) has all 8 elements processed at one time. This would mean that you would have 8 separate circuit slices of the FPGA for processing arrays or data chunks of size 8 by Single Bytes.
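The trade-off described above can be put into a rough Python model (not LabVIEW code). The function name `multiply_shared` and the assumption that each multiplier finishes exactly one product per clock cycle are illustrative simplifications, not measurements from a cRIO-9076.

```python
def multiply_shared(pairs, num_multipliers):
    """Compute every product using a fixed pool of multipliers.

    Simplifying assumption: each multiplier completes one product per
    clock cycle. Returns (products, cycles_used).
    """
    if num_multipliers < 1:
        raise ValueError("need at least one multiplier")
    products = []
    cycles = 0
    # Each pass of this loop models one clock cycle: at most
    # `num_multipliers` products are computed side by side.
    for start in range(0, len(pairs), num_multipliers):
        batch = pairs[start:start + num_multipliers]
        products.extend(a * b for a, b in batch)
        cycles += 1
    return products, cycles


pairs = [(i, i + 1) for i in range(8)]

parallel, parallel_cycles = multiply_shared(pairs, 8)  # 8 multipliers
serial, serial_cycles = multiply_shared(pairs, 1)      # 1 reused multiplier
```

Both calls return the same products. In this model the 8-multiplier version finishes in 1 cycle and the single-multiplier version takes 8 cycles, so the cost of using fewer multipliers is time rather than correctness.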


So, approaching your problem: your limitation isn't necessarily processing power, but space. What would be the inverse of the solution to the problem above?


Ben Johnson
Message 4 of 4