Multiple unbundles when pipelining too costly on FPGA?

Shiv0921 · ‎02-20-2024

Save me from this wiring pain

I've inherited this code to extend. I'm new to FPGA programming, but my understanding is that pipelining the steps in this loop will make it take up less memory.
But several pieces of data are being sent around and I need to add more, and arranging all these wires and fighting the shift registers every time I need to add something is awful.

My question is: can I just send the cluster wire round, unbundling it at each point? Or will these extra unbundles defeat the point of the pipelining? I don't have a good feel for the cost of things on the FPGA.

This is a similar question to this: https://forums.ni.com/t5/LabVIEW/Cluster-performance-quot-single-unbundle-quot-vs-quot-multiple/td-p...

but the above doesn't refer to FPGA, and while I agree with its summary (simplify one's block diagram to avoid feeling the need for multiple unbundles), I feel like the pipelining is what is causing my need, and I can't change that.

Here's the code. After all my additions it'll have about 30 wires looping around the frame, or 6 unbundles if I can go that route.

Shiv0921 · ‎02-20-2024

Oh, I should add: the data set in question isn't large. It's due to be a maximum of 18 U32 elements in the end.

raphschru · ‎02-20-2024

Wow, you can thank the developer for leaving you this huge plate of spaghetti 😁.

You normally use pipelining when you have a lengthy operation in a tight loop (like a single-cycle timed loop). The idea is to break down this operation into smaller ones that can be executed in parallel. The downside is that by passing data through a shift register, you add a delay at each operation because you use the data from the previous iteration.
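The one-iteration delay described above can be illustrated outside LabVIEW with a rough Python model. This is only a sketch: `stage_a`, `stage_b` and the initial register value of 0 are made-up placeholders, not anything taken from the posted VI, and a Python variable merely stands in for the shift register.

```python
def stage_a(x):
    """First half of the operation (placeholder arithmetic)."""
    return x * 3


def stage_b(y):
    """Second half of the operation (placeholder arithmetic)."""
    return y + 1


def unpipelined(samples):
    """Both stages run back to back within the same iteration."""
    return [stage_b(stage_a(x)) for x in samples]


def pipelined(samples, initial=0):
    """Stage B consumes what stage A stored on the previous iteration."""
    register = initial  # stands in for the shift register between stages
    outputs = []
    for x in samples:
        outputs.append(stage_b(register))  # uses the previous iteration's value
        register = stage_a(x)              # stored for the next iteration
    return outputs


samples = [1, 2, 3, 4]
direct = unpipelined(samples)
delayed = pipelined(samples)
print(direct)   # [4, 7, 10, 13]
print(delayed)  # [1, 4, 7, 10]
```

The pipelined outputs are the same values shifted by one iteration, with the first output coming from the register's initial value. On hardware the two stages would run at the same time, which is what shortens the critical path; the model only shows the latency that comes with it.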

Here it seems more like an overuse of shift registers for no apparent reason. The code appears to only replace values in the same array at fixed positions (which is a no-op in FPGA). So to me, 95% of what we see in the picture is completely useless.

We could probably help you further if you could post the .vi file instead of a truncated picture of its block diagram...

Regards,

Raphaël.

Yamaeda · ‎02-21-2024

Looks like they've used shift registers as a buffer to delay stuff. If I remember correctly, there's a simple block for that. Also, for readability, I'd probably make those loops into inline functions.

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems

Shiv0921 · ‎02-21-2024

Thanks for your thoughts! Glad to hear it sounds like this will be able to look a lot cleaner.

Just to clarify then, pipelining should only be used in single-cycle timed loops? It's used quite heavily in this program, and I had been advised to copy this setup for additions, but perhaps it isn't necessary.

Indeed the purpose of the loop is to convert the data to U32s and insert them in the array for sending.

Yes, apologies for the screenshot. I'd post the VI on its own, but all of the code is in one huge VI, so I've attached the project instead. The code in question is under the FPGA target, in 'FPGA main.vi'. Scroll to the bottom and left; 4 loops from the bottom you'll find the loop I'm referring to, called 'Select channels to send to PC'.

Shiv0921 · ‎02-21-2024

What I've been told is that the pipelining means each step of the process can occur simultaneously, thereby reducing memory/making it faster/some kind of FPGA-specific benefit. So that was the aim rather than introducing any delay. As far as I can tell a delay would serve no purpose. The purpose of the loop is only to receive data from multiple other loops, then convert it and place it into the 'sending' array at its calculated position. It should wait until it has a data block from each of the queues, but once all the data is received the process should just be able to proceed step by step, with the passage of the array that's being altered through these steps dictating the dataflow.

This loop is for sending data to the PC, so timing isn't a concern, but I'm told the FPGA was struggling for memory as it was, so space is a concern. I need to do whatever optimizes the memory use on the FPGA, but a close second is doing something about the readability.

Thanks for your point about inlining the subVIs. That doesn't seem to be available on the FPGA: when I go to the same place in VI Properties as on the PC, the option isn't there, sadly.

Terry_ALE · ‎02-21-2024

Unbundle functions are a no-op in LabVIEW FPGA. That is, an unbundle is just a wiring function; no copies of signals/data are made.

Pipelining should only be used if you get compile failures due to timing. Otherwise you add delay and use resources without reason or benefit.

Certified LabVIEW Architect, Certified Professional Instructor
ALE Consultants

Introduction to LabVIEW FPGA for RF, Radar, and Electronic Warfare Applications

Terry_ALE · ‎02-21-2024

@Shiv0921 wrote:

What I've been told is that the pipelining means each step of the process can occur simultaneously, thereby reducing memory/making it faster/some kind of FPGA-specific benefit. So that was the aim rather than introducing any delay. As far as I can tell a delay would serve no purpose. The purpose of the loop is only to receive data from multiple other loops, then convert it and place it into the 'sending' array at its calculated position. It should wait until it has a data block from each of the queues, but once all the data is received the process should just be able to proceed step by step, with the passage of the array that's being altered through these steps dictating the dataflow.

This loop is for sending data to the PC, so timing isn't a concern, but I'm told the FPGA was struggling for memory as it was, so space is a concern. I need to do whatever optimizes the memory use on the FPGA, but a close second is doing something about the readability.

Thanks for your point about inlining the subVIs. That doesn't seem to be available on the FPGA: when I go to the same place in VI Properties as on the PC, the option isn't there, sadly.

Inlining does not save space on FPGAs. There is no functional overhead for subVIs.

Some of this is hardware vs. software thinking.

raphschru · ‎02-21-2024

This is equivalent, no need for shift registers (that induce delays).

Also, all your insertions at variable indexes are eating lots of resources. Keep in mind that all arrays in FPGA code have a fixed size, so in this case you must allocate arrays for the max number of channels. Then, when you decode the data, you can ignore the elements that are not used.
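As a rough illustration of that host-side decoding, here is a minimal Python sketch. It assumes the FPGA always sends a fixed block of 18 U32 values (the maximum mentioned earlier in the thread) and that the PC already knows which channels are enabled. The `enabled` mask and the `decode_block` helper are hypothetical names, since the thread doesn't say how the host tracks active channels.

```python
MAX_CHANNELS = 18  # maximum number of U32 elements mentioned in the thread


def decode_block(block, enabled):
    """Return {channel_index: value} for the enabled channels only.

    block   -- fixed-size list of U32 values as received from the FPGA
    enabled -- list of booleans, one per channel (hypothetical host-side mask)
    """
    if len(block) != MAX_CHANNELS or len(enabled) != MAX_CHANNELS:
        raise ValueError("block and mask must both have MAX_CHANNELS entries")
    # Unused slots are simply skipped; the FPGA never has to repack the array.
    return {i: value
            for i, (value, on) in enumerate(zip(block, enabled))
            if on}


# Example: only channels 0, 3 and 17 are in use.
block = list(range(100, 100 + MAX_CHANNELS))
enabled = [i in (0, 3, 17) for i in range(MAX_CHANNELS)]
decoded = decode_block(block, enabled)
print(decoded)  # {0: 100, 3: 103, 17: 117}
```

Each channel keeps a fixed slot in the array, so the FPGA side only needs constant-index writes, and the variable part of the logic runs on the PC where it is cheap.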

Regards,

Raphaël.

Shiv0921 · ‎02-21-2024

Ah, thanks Raphaël! That's so beautiful.

Yes, great point. It makes total sense that any cutting up of data should occur on the PC side, rather than bothering the FPGA with being dynamic about how much it sends across.

Seems I need to do a deeper dive into the FPGA developer's manual to be able to spot these things myself. Thanks all for the help!
