Synchronizing DMA FIFO Buffers

SierraNevadaAaron · ‎09-05-2014

Hi All,

I am attempting to transfer four channels of data (each channel is 32 bits) from an FPGA target to a host. I'm stuffing two channels each into a DMA FIFO buffer, so I have two DMA buffers.

The samples need to be synchronized in time, so the data pulled off the two DMA buffers must stay aligned. I have read many posts from people in similar circumstances, but none of the solutions work when the total number of bits to be synchronized exceeds 64.

Is there a clever way to guarantee synchronization of two 64-bit DMA FIFO buffers between an FPGA and a Windows host?

RavensFan · ‎09-05-2014

Are you doing anything else with these FIFOs? Putting data in from multiple places, or pulling it out from multiple places? If you start out with empty FIFOs, put the same number of data elements into each one at the same rate, and then pull out the same number of elements from each at the same rate, they should automatically be synchronized. Element 1 in DMA1 and Element 1 in DMA2 should come out matched up if they were put in at the same time and later come out at the same time. Only if you start throwing in other elements or pulling out other elements unevenly between the two DMAs will they fall out of synchronization.

Could you put all 4 channels into the same DMA? Let each channel be one element. When you add 4 elements on one end, pull out 4 elements on the other end. If you use 32 bits of the element for the data and some of the other bits at the high end to identify which channel it comes from, you can be sure which element comes from where and that you are taking them out in the order you expect, as groups of 4 elements at once.

If you have some sort of clocking mechanism, you can use all or some of the extra 32 bits as an identifier to tell when the data was put into the buffer. On the reading end, you can parse out the extra 32 bits, determine which channel the data came from, grab the clock data you added, and compare it with the other elements you read to sort them out and determine which pieces of data belong together.

SierraNevadaAaron · ‎09-05-2014

Thanks for the reply Fan.

Regarding putting all the data into one FIFO, yes I wish I could do that, but as described I have 128 bits of data, and DMA FIFOs to my knowledge only support datatypes with a maximum of 64 bits, thus I'm forced to use two buffers.

The two buffers are being filled in the same Timed Loop in the FPGA, and are being pulled off in the same Timed Loop in the host, so it is already implemented as you describe. Even with this, however, I get a non-deterministic offset in samples between the buffers.

One concern I have is that the buffers are being configured and started in a sequential procedure. Might it be necessary / possible to start them in parallel? I don't have a separate command to actually 'start' populating the buffers. Maybe I start them sequentially and then have a shared boolean in the FPGA Timed Loop that actually enables filling the buffers?

RavensFan · ‎09-05-2014

But you can put 128 bits of data into the FIFO if you put it in as 2 elements. 1st element has the first 64 bits, 2nd element has the next 64 bits. Or put it in as 4 elements, each with 32 bits of data, and use the other 32 bits that are available in each element as an identifier, such as a clock tick, channel number, or an incrementing serial number. When you pull out 2 elements at once (or 4 at once in the second case) you know those were put in together. And if you have the extra 32 bits, you can inspect the extra data to be sure what channel, what time, or what sequence the data was pushed into the DMA FIFO with.
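The two-element idea can be sketched on paper in Python (the real code would be LabVIEW FPGA and host VIs). This is only a model under stated assumptions: the FIFO is stood in for by a `collections.deque`, and the names `pack_pair`, `write_sample_set`, and `read_sample_set` are hypothetical. Whether two writes per sample set fit the timing of the FPGA loop is a separate question that this sketch does not address.

```python
from collections import deque

MASK32 = 0xFFFFFFFF

def pack_pair(high, low):
    """Pack two 32-bit samples into one 64-bit element (first sample in the high half)."""
    return ((high & MASK32) << 32) | (low & MASK32)

def unpack_pair(element):
    """Split a 64-bit element back into its two 32-bit samples."""
    return (element >> 32) & MASK32, element & MASK32

def write_sample_set(fifo, ch0, ch1, ch2, ch3):
    """Write one 128-bit sample set as two consecutive 64-bit elements."""
    fifo.append(pack_pair(ch0, ch1))
    fifo.append(pack_pair(ch2, ch3))

def read_sample_set(fifo):
    """Read two elements at once so the four channels stay together."""
    first = fifo.popleft()
    second = fifo.popleft()
    return unpack_pair(first) + unpack_pair(second)

fifo = deque()  # stand-in for a single 64-bit DMA FIFO (assumption)
write_sample_set(fifo, 0xAAAAAAAA, 0xBBBBBBBB, 0xCCCCCCCC, 0xDDDDDDDD)
samples = read_sample_set(fifo)
```

The reader must always pull an even number of elements; a single stray read shifts every later pair by one element.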

I would certainly use some sort of boolean or notifier (if available on FPGA) to signal when separate parts of the code should start together. I'd be surprised if you didn't already have that. You must have some mechanism now that puts the two different pieces of 32-bit data together to fill up a 64-bit element in a single FIFO stream.

crossrulz · ‎09-05-2014

You should probably just add a command of some sort (Boolean control going to TRUE?) that the FPGA waits for before entering into the Timed Loop. That way your host can do all of its setup and tell the FPGA when it is ready to run.
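As a rough illustration of why the order matters, here is a toy Python model, not LabVIEW and not the real DMA engine. It assumes, purely for illustration, that a write to a FIFO the host has not yet configured is simply lost; `FpgaModel` and `run` are hypothetical names.

```python
from collections import deque

class FpgaModel:
    """Toy FPGA loop that writes the current tick into two FIFOs when enabled."""

    def __init__(self):
        self.fifo_a = None   # None means the host has not configured it yet
        self.fifo_b = None
        self.enable = False  # the Boolean control the host sets when ready
        self.tick = 0

    def step(self):
        """One loop iteration. Writes to an unconfigured FIFO are dropped (assumption)."""
        if self.enable:
            for fifo in (self.fifo_a, self.fifo_b):
                if fifo is not None:
                    fifo.append(self.tick)
        self.tick += 1

def run(enable_before_setup):
    """Return the first element read from each FIFO for the given start order."""
    fpga = FpgaModel()
    if enable_before_setup:
        fpga.enable = True
    fpga.fifo_a = deque()    # host configures FIFO A
    fpga.step()              # the FPGA loop keeps running meanwhile
    fpga.fifo_b = deque()    # host configures FIFO B one iteration later
    fpga.step()
    fpga.enable = True       # host signals "ready" only after both are set up
    fpga.step()
    return fpga.fifo_a[0], fpga.fifo_b[0]

offset_case = run(enable_before_setup=True)
aligned_case = run(enable_before_setup=False)
```

In this model, enabling before setup leaves the first elements of the two FIFOs one tick apart, while enabling after setup makes both FIFOs start on the same tick.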

There are only two ways to tell somebody thanks: Kudos and Marked Solutions
Unofficial Forum Rules and Guidelines
"Not that we are sufficient in ourselves to claim anything as coming from us, but our sufficiency is from God" - 2 Corinthians 3:5

crossrulz · ‎09-05-2014

@RavensFan wrote:

But you can put 128 bits of data into the FIFO if you put it in as 2 elements. 1st element has first 64 bits, 2nd element has next 64 bits.

That won't work inside of a SSTL. You can only write to a DMA once in a given clock cycle.

RavensFan · ‎09-05-2014

@crossrulz wrote:

@RavensFan wrote:

But you can put 128 bits of data into the FIFO if you put it in as 2 elements. 1st element has first 64 bits, 2nd element has next 64 bits.

That won't work inside of a SSTL. You can only write to a DMA once in a given clock cycle.

SSTL = Single Cycle Timed Loop ????

I don't remember the OP saying they were using that kind of loop. But that is a good point to mention in the event he is.

SierraNevadaAaron · ‎09-05-2014

Agreed crossrulz, I can't interleave the samples either as I wouldn't be able to discern which channel was which.... At least with bit-packing I know which bits are which channels.

EDIT:

Whoops, didn't see your comment about using the remaining bits as channel identifiers.

I was a bit worried about doing something like interlacing the channels due to performance issues on the host side. Is this common practice and I shouldn't be worried about performance on the host? In any case, great suggestions guys, thank you.

I do have a 'write enable' boolean, but it's getting set before the buffers are set up... stupidly. Let me see if that fixes things.

RavensFan · ‎09-05-2014

EDIT: I just saw your response while I was trying to type up a more detailed response below.

I'm sure this method works because I'm doing a similar thing with an FPGA I'm working with, packing 4 channels' worth of data into a single FIFO stream. I'm actually using 62 bits' worth of data and sacrificing the highest 2 bits for a channel number. I was able to pass along more than 38,000 64-bit elements per second from my FPGA to PXI without any indication of lost data. I'm hoping to test the system with ~67,000 elements/second soon.

You can know which element goes to which channel.

Let's say you have 4 channels. I'll label them 0, 1, 2, 3.

You have 32-bits of actual data.

Take your data from channel 0 and join it with a 32-bit integer of 0.

For channel 1, join with the 32-bit integer of 1.

Likewise with 2 and 3.

On the other end, read the 64-bit element. Mask it with a 64-bit value of 00 00 00 00 FF FF FF FF and you've got the data. Take the original element and shift it right by 32 bits and you'll have the channel number. Now you know the data and which channel it came from.

So if channel 0 gave you data of AA AA AA AA, the 64-bit element would look like 00 00 00 00 AA AA AA AA

If channel 1 gave you data of BB BB BB BB, the element would look like 00 00 00 01 BB BB BB BB

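If channel 2 gave you data of CC CC CC CC, the element would look like 00 00 00 02 CC CC CC CC

If channel 3 gave you data of DD DD DD DD, the element would look like 00 00 00 03 DD DD DD DD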

If you read the 4 elements, they should all belong together. And in the event they didn't get put in in a specific order, you now have the channel number to know which is which channel.
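The join, mask, and shift steps above can be written out as a small Python sketch (the actual implementation would be LabVIEW Join Numbers / logical shift nodes). The function names `tag`, `untag`, and `decode_group` are made up for illustration.

```python
MASK32 = 0x00000000FFFFFFFF

def tag(channel, data):
    """Join a 32-bit channel number (high half) with 32 bits of data (low half)."""
    return ((channel & MASK32) << 32) | (data & MASK32)

def untag(element):
    """Recover (channel, data): shift right by 32 for the channel, mask for the data."""
    return element >> 32, element & MASK32

def decode_group(elements):
    """Return the data of 4 elements ordered by channel, whatever order they arrived in."""
    by_channel = dict(untag(e) for e in elements)
    return [by_channel[ch] for ch in range(4)]

# The four example elements from the post, deliberately read out of order.
group = [
    tag(1, 0xBBBBBBBB),
    tag(0, 0xAAAAAAAA),
    tag(3, 0xDDDDDDDD),
    tag(2, 0xCCCCCCCC),
]
ordered = decode_group(group)
```

`decode_group` raises `KeyError` if one of the four channels is missing from the group, which is one cheap way to notice a misaligned read.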

If you still have the risk that the 4 elements don't belong together (perhaps one channel failed to put in data while the other channels started), you can use some of the other bits in the upper 32 bits to hold a serial number. You can check the serial number after decoding on the other end. If the 4 pieces of data don't belong together, you can discard the extras. And you can determine whether you need to read 1, 2, or 3 more elements to resynchronize your package of 4 elements.
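One possible way to do that check on the host, sketched in Python under an assumed bit layout (30-bit serial number in bits 63..34, 2-bit channel in bits 33..32, data in bits 31..0). The layout and the names `pack`, `unpack`, and `complete_groups` are assumptions for illustration, not something prescribed by the FIFO.

```python
def pack(serial, channel, data):
    """Assumed layout: serial in bits 63..34, channel in bits 33..32, data in bits 31..0."""
    return ((serial & 0x3FFFFFFF) << 34) | ((channel & 0x3) << 32) | (data & 0xFFFFFFFF)

def unpack(element):
    """Return (serial, channel, data) from one 64-bit element."""
    return element >> 34, (element >> 32) & 0x3, element & 0xFFFFFFFF

def complete_groups(elements):
    """Yield (serial, [d0, d1, d2, d3]) for each aligned group of 4 elements.

    A window of 4 is accepted only if all serial numbers match and channels
    0..3 are all present; otherwise one element is discarded and the window
    slides forward by one to resynchronize.
    """
    i = 0
    while i + 4 <= len(elements):
        window = [unpack(e) for e in elements[i:i + 4]]
        serials = {s for s, _, _ in window}
        channels = {c for _, c, _ in window}
        if len(serials) == 1 and channels == {0, 1, 2, 3}:
            ordered = sorted(window, key=lambda item: item[1])
            yield window[0][0], [d for _, _, d in ordered]
            i += 4
        else:
            i += 1  # drop the stray element and try again

# Two leftover elements from an incomplete group, then two complete groups.
stream = [pack(7, 2, 0xC), pack(7, 3, 0xD)]
stream += [pack(8, ch, 10 + ch) for ch in range(4)]
stream += [pack(9, ch, 20 + ch) for ch in range(4)]
groups = list(complete_groups(stream))
```

Here the two stray elements with serial 7 are discarded and the reader locks onto the groups with serials 8 and 9.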
