Real-Time Measurement and Control


FPGA jitter (125 nanosecond)


I'm having an issue with writing a programmable event sequencer.  I have an experiment where I need to trigger various pieces of equipment (pulsers, hi-res time converters, lasers, etc.) via TTL lines.  The basic repetition rate is in the 10-100kHz range where the pulsers are running at that rate but the lasers pulse at 100Hz. 


The basic architecture of my FPGA VI is several while loops being controlled by a "master" loop.   The master is a while loop containing a flat sequence with a loop timer which sets the timebase (Sync loop) from which every other event is generated.  All of the external trigger outputs are generated in their own while loops and a case statement looking for the sync pulse to go true. At the point of the sync signal going true, a case statement is entered with a flat sequence of events where each frame is controlled by a timer.  These timers are all configured to be 32-bit wide and programmed in clock ticks.  All of the programmable delays are set via a host VI.


The "jitter" behavior is very predictable.  Delays that are multiples of 125 nanosecond are stable; however, non-multiple delays demonstrate a jitter of 125ns.  That is, the delay/event occurs at time X or X+125ns.  The card's clock frequency is 40MHz and should theoretically give me a 25ns resolution.  


I am running LabVIEW 8.6 with the FPGA option and using a PCI-7811R card. Also, please note, this is my first attempt at LabVIEW. I'm normally a hardware and VHDL person.





Message 1 of 8

Hello, would it be possible for you to attach your code?  The overall architecture is a little hard to picture and it will likely go a long way towards getting to the root of the problem.  If not, what exactly do you mean when you say that each flat sequence is controlled by a timer? 

Message 2 of 8


Here's the VI for the FPGA.

Message 3 of 8
Hello, I have taken a look at your VI and the overall architecture is a little bit confusing. I believe a better way to do this would be to remove the master timing loop and just time each of the individual loops instead, using the same method you used to time the master loop. The loop rate can still be set programmatically. One reason why this would be better is that you are polling the 'sync' variable in the false case of most of the loops. However, each FPGA loop has an overhead of two clock cycles. As a result, you are only really checking the 'sync' variable once every 3 clock cycles. Because the pulse itself is only 1 clock cycle long, the pulse may be missed.
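To illustrate the missed-pulse concern, here is a rough Python model (not LabVIEW code). It assumes the polling loop samples 'sync' on ticks 0, 3, 6, ... and that the pulse is high for `width_ticks` consecutive ticks; the function name and the polling phase are made up for the example.

```python
def pulse_seen(start_tick, width_ticks, poll_period_ticks=3):
    """Return True if a poll tick (0, P, 2P, ...) falls inside the pulse.

    The pulse is high on ticks start_tick .. start_tick + width_ticks - 1.
    This is a simplified model, not the behaviour of the compiled FPGA code.
    """
    # First poll tick at or after the start of the pulse (ceiling division).
    first_poll = -(-start_tick // poll_period_ticks) * poll_period_ticks
    return first_poll < start_tick + width_ticks


if __name__ == "__main__":
    # A 1-tick pulse is only caught when it happens to land on a poll tick.
    caught = sum(pulse_seen(s, 1) for s in range(30))
    print(f"1-tick pulse caught for {caught} of 30 start positions")
    # A pulse at least as wide as the poll period is caught every time.
    print(all(pulse_seen(s, 3) for s in range(30)))
```

In this model a pulse has to be at least as wide as the polling period to be caught regardless of where it starts.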
Message 4 of 8


Thanks for your responses.  


In regards to the sync pulse being only 1 clock cycle wide, various pulse widths have been used ranging from 4 clock cycles down to the 1 cycle shown.  All with the same behavior.


In regards to the suggestion "remove the master timing loop and time each of the individual loops", I am not sure how the loops will all stay synchronized. The timing of the outputs relative to each other is absolutely key.

Finally, these suggestions still don't address the fundamental issue of being unable to set pulses at the 25ns resolution of the FPGA clock. I can only set widths in multiples of 125ns (5 clocks). It doesn't matter if the pulse is 500ns or 2 usec, the pulse width resolution is only 125ns.


This experiment was initially implemented with an NI card, the PCI-6602, which contains 8 timers. The software engineer configured the card to run one timer as the master timebase and the other timers were slaved in various ways off of the master. The experiment has grown and the card is at its limit. The suggestion was made to go to an FPGA implementation. While it would be a simple implementation in VHDL, the LabVIEW interface for controlling experimental parameters via the LV GUI, plus the ability of the user to change the overall architecture easily, led us to choose the NI FPGA-based card.


The effect I'm trying to achieve is a timebase loop generating sync pulses. The hi-rep rate outputs, i.e., the Extraction Pulse and the TDC/TOF Start Pulse generator, occur on every loop but are time shifted with respect to each other. The low-rep outputs, i.e., Laser 1 and Laser 2, occur approximately once every 1000 reps and use the Extraction Count to keep track of the hi-rep loops and trigger when needed. Each output can be viewed as a series of one-shots (implemented with wait loops) being triggered from the sync pulse. The first delay(s) are used to set the time shift relative to the master and then the final wait timer is used to set the output pulse width. Some of the delays are set via the Host VI and remain constant during an experiment while other delays are varied systematically via a table of values stored in the FPGA.


Attached is a rough timing diagram and block diagram.


Thanks for your help.




Message 5 of 8

I believe the problem is with the final loop that free runs and updates the DIO lines. Since you have a VHDL background, you're probably thinking of the VI as a circuit diagram. However, LabVIEW is based on dataflow programming, where a node doesn't execute until all of its inputs have been satisfied. To maintain this dataflow semantic in the LV FPGA VI, the compiler inserts what we call an "enable chain" to ensure the nodes execute sequentially (according to dataflow semantics) as opposed to in parallel. This means that every node or primitive on the block diagram will take one or more clock cycles to execute and will also have an associated flip flop. If you look at your while loop, the longest dataflow chain is composed of reading from a local variable, performing a boolean operation (e.g. AND primitive), and writing the boolean value to an I/O Node. Since each of these operations takes 1 clock cycle (25 ns at 40 MHz), the longest path will take 75 ns to execute. As was mentioned previously, the while loop also takes another 2 clock cycles (50 ns) to reset itself. Adding these two together gives you the minimum 125 ns pulse width you're seeing. This also explains why some outputs jitter and why some don't. If the loops updating the local variables execute in multiples of 125 ns, then they'll stay phase locked to your I/O update loop. Otherwise, you'll see 125 ns of jitter occasionally.
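As a rough illustration of that arithmetic, here is a small Python model (not LabVIEW code). It assumes the I/O loop puts a new value on the pin only once every 5 ticks (3 ticks of dataflow plus 2 ticks of loop overhead) and that an edge requested between updates shows up at the next update. The function names are invented for the sketch, and the real compiled timing may differ in detail.

```python
TICK_NS = 25      # one tick of the 40 MHz FPGA clock
LOOP_TICKS = 5    # assumed I/O loop period: 3 ticks dataflow + 2 ticks overhead


def next_update(tick, loop_ticks=LOOP_TICKS):
    """First I/O update tick at or after `tick` (updates at 0, 5, 10, ...)."""
    return -(-tick // loop_ticks) * loop_ticks


def observed_widths_ns(width_ticks, loop_ticks=LOOP_TICKS):
    """Pulse widths seen on the pin as the phase of the rising edge varies."""
    widths = set()
    for start in range(loop_ticks):          # every possible phase
        rise = next_update(start, loop_ticks)
        fall = next_update(start + width_ticks, loop_ticks)
        widths.add((fall - rise) * TICK_NS)
    return sorted(widths)


if __name__ == "__main__":
    print(observed_widths_ns(10))   # 250 ns request, a multiple of 125 ns
    print(observed_widths_ns(7))    # 175 ns request, not a multiple
    print(observed_widths_ns(7, loop_ticks=1))  # loop updating every tick
```

In this model a width that is a multiple of 5 ticks always comes out the same, while any other width lands on one of two values 125 ns apart depending on phase. With a loop that updates every tick, the requested width comes out exactly.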


Probably the easiest way to fix this is to replace the while loop where the I/O updates are performed with a single cycle timed loop. As the name implies, everything in the single cycle timed loop executes in a single clock cycle, or you'll get a compile error stating you failed to meet timing. The single cycle timed loop still honors dataflow at its edges, however execution inside the loop is normal dataflow and everything effectively executes in parallel. In most cases, this means all of the logic becomes combinatorial logic where the results are stored in flip flops at the edge of the loop. It also means some nodes aren't supported inside the single cycle timed loop. You should check out the LabVIEW help file for more information on what's supported, not supported, and other caveats with this loop structure. Since timing seems to be particularly important for your application, you may also want to replace all of your other while loops with single cycle timed loops, since it will be clearer as to which exact clock cycle a particular variable or output is updated. This will require a little bit of code restructuring on your part (e.g. write it more like a state machine). You can also create derived clocks from the top level clock to act as the source of your timed loops if you need to go faster/slower to meet the timing requirements for your application or give you more breathing room to meet the timing constraints of the compile. Check out the help file if you need more information on this feature. I hope this helps. Good luck!

Message 6 of 8


Thanks for the info regarding the various overheads associated with various structures.  The VI that I posted is something like the fifth major revision of the code.  I think the WHILE loop controlling the outputs was a fairly late addition in an effort to consolidate the output assignments without any feeling for the effect it would have due to overheads, etc.  I felt the consolidation might help in readability and maintainability of the code.


I used your suggestion of converting the output loop to a SCTL with interesting results. The outputs seem to have lost the 125ns discretization of pulse widths and are now showing the full 25ns resolution! Obviously the SCTL removed a lot of the overhead of the generic WHILE loop that I was using. Unfortunately the outputs still show a jitter relative to each other, but now with the improved resolution. That is, the jitter of one output relative to another, including the sync, is still around 100-125 ns, but now there are 4 or 5 discrete potential positions, spaced 25 ns apart, at which the edge may occur.


I also removed one of the outputs from the output SCTL and put the assertions in the timing generation loop for that variable/output.  I saw no difference in behavior between that output and another output in the SCTL. 


I am very pleased to have accessed the full resolution again, but the discrete "jitter" is still an issue.  I suspect it is the overhead associated with the testing of the sync variable that is causing this uncertainty.  Obviously I need to take the next step in re-writing my code and am willing to try your suggestion:


 you may also want to replace all of your other while loops with single cycle timed loops since it will be clearer as to which exact clock cycle a particular variable or output is updated.  This will require a little bit of code restructuring on your part (e.g. write it more like a state machine).


I don't understand what you mean by structuring it like a state machine.  Sorry.  I'm being thick I guess.  I've designed state machines with Karnaugh maps (old school), VHDL, etc. but can't for the life of me understand your suggestion.


To make things easier and the iterations quicker, I'm working with a simpler VI with only one sync loop and one delayed output. I've played some with timed sequences and SCTL but haven't made any progress. I'm open to other methods of syncing multiple loops and/or generating multiple outputs with dynamic delay settings.


Many thanks for your patience and help.



Message 7 of 8

I'm glad that at least the first part of the suggestion helped. When I was referring to a state machine, I was mostly alluding to the fact that you can't use the Wait (tick count) primitives inside the single cycle timed loop, since logically that concept doesn't make sense. Instead, you can use a case structure along with shift registers to implement a state machine. This state machine then runs at the rate of the single cycle timed loop, and state transitions occur at some frequency less than the loop is running. For instance, the first state might wait for a sync pulse. In this state of the case structure, look for a rising edge and change the state to something like "count offset". In this state of the case structure, increment a counter each clock cycle until the desired offset value has been reached. Once the desired offset has been reached, set the desired output high and transition to a "count pulse high" state. Increment another counter in this state until the desired pulse width has been reached, reset the output low, and go back to the wait for sync pulse state. Hopefully this makes sense. If not, you can probably find some reference examples in the example finder that may clarify things.
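Since a block diagram is hard to show in text, here is the same state machine sketched in Python as a model only, not LabVIEW code. Each call to `tick()` stands in for one iteration of the single cycle timed loop, and the instance attributes play the role of the shift registers. The class and state names are made up for the illustration, and the sketch assumes the offset and width are both at least 1 tick.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_SYNC = auto()
    COUNT_OFFSET = auto()
    COUNT_PULSE_HIGH = auto()


class PulseGenerator:
    """One delayed output pulse per sync rising edge, advanced one tick at a time."""

    def __init__(self, offset_ticks, width_ticks):
        self.offset_ticks = offset_ticks   # delay from sync edge to output rise
        self.width_ticks = width_ticks     # output pulse width
        self.state = State.WAIT_SYNC
        self.counter = 0
        self.prev_sync = False             # previous sync level, for edge detection
        self.output = False

    def tick(self, sync):
        """Advance one clock cycle and return the output level for this cycle."""
        if self.state is State.WAIT_SYNC:
            if sync and not self.prev_sync:          # rising edge on sync
                self.counter = 0
                self.state = State.COUNT_OFFSET
        elif self.state is State.COUNT_OFFSET:
            self.counter += 1
            if self.counter >= self.offset_ticks:    # offset reached: go high
                self.output = True
                self.counter = 0
                self.state = State.COUNT_PULSE_HIGH
        elif self.state is State.COUNT_PULSE_HIGH:
            self.counter += 1
            if self.counter >= self.width_ticks:     # width reached: go low
                self.output = False
                self.state = State.WAIT_SYNC
        self.prev_sync = sync
        return self.output


if __name__ == "__main__":
    gen = PulseGenerator(offset_ticks=3, width_ticks=2)
    trace = [gen.tick(t in (2, 12)) for t in range(20)]
    print([t for t, level in enumerate(trace) if level])
```

Because every transition is decided on a known tick, the output edge always lands the same number of ticks after the sync edge in this model, which is the property you are after. Several such generators driven by the same sync in the same loop would stay locked to each other.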


I'm still a little unclear about the jitter.  When you mention jitter, do you mean the phase of the various output pulses is 100 - 125 ns off from where you expect them and remain constant, or do you mean that they are constantly moving around with respect to each other?  If it's the former, the single cycle timed loop should definitely correct the problem.  If it's the latter, I'm not sure what might be causing the problem off the top of my head, but I wouldn't be completely surprised if the single cycle timed loop still fixed the problem.  Anyway, give the single cycle timed loop a try and post back with your findings.

Message 8 of 8