Driver Development Kit (DDK)

x-series dma: hang in tCHInChDMAChannelController::requestStop

hi --

 

i am getting an occasional hang inside the DDK function tCHInChDMAChannelController::requestStop.  

 

basically, the function sets stop in the DMA Channel_Operation_Register, then waits for stop, last link, or error bits to be set in the Channel_Status_Register.

 

what could cause this to be stuck in an infinite loop (channel status is an unchanging 0x10004000)?

 

in the particular case i was looking at today, i was trying to stop the dma by calling nNISTC3::tCHInChDMAChannel::stop() after detecting an overrun error (from AI_Timer.Status_1_Register.getOverrun_St()).

 

 

and, since this condition is possible, what is a valid way to detect it and bail out of this routine (or avoid it completely) without resorting to evil timeouts?

 

thanks,

--spg

 

------------------
scott gillespie
applied brain, inc.
------------------
Message 1 of 5

Hello Scott,

 

I'd like to look into this a little more.  Under what conditions were you getting the overrun (example, device, sample rate, etc)?

 

In aiex3, the first AI DMA example, I see that if overrun is detected the example does not check that the AI timer is disarmed and that the stream fifo is empty.  Could you modify the example to show the AI Timer status registers and the stream circuit status registers just before calling stop dma?

 

AI InTimer Status_1_Register

AI Stream Circuit StreamControlStatusRegister

 

Thanks,

Steven T.

Message 2 of 5

hmm, this is interesting. the problem seems to be triggered by a certain size of data transfer.  i'm currently using a PCIe-6363.  a 220,000 sample (880,000 bytes, i have it in 4-byte fifo mode) acquisition at a 1 microsecond (yes micro) sample period runs fine -- i just ran 100,000 of these in a row with no apparent errors.   however, if i bump that up to 246,000 samples, i run into problems on the first or second run.  the first 230,000 or so samples come in as expected, but at a certain point the tCHInChDMAChannel reports bytes available in memory, but the data is not in the buffer.  this triggers the stop hang eventually, but that may be due to memory getting severely whacked by dma gone wild. 

 

today i'll try to replicate this using aiex3.  

------------------
scott gillespie
applied brain, inc.
------------------
Message 3 of 5

steven t --

 

i am really baffled now.

 

i modified aiex3 to do a single large acquisition (rather than continuous mode).  

 

any size transfer larger than about 300,000 samples exhibits this behavior.  that is, the last 1 to 10K samples of the transfer do not make it into the buffer, even after the stream transfer count is reported complete.   

 

it is not consistent at which point the data stops -- it is generally a transfer somewhere in the last or second to last DMA chunky link.  i do my own sgl construction, and of course that is the first suspect for some type of DMA problem -- but i have been over it with a fine tooth comb and so far i can't find anything wrong with the sgl setup and not even a good theory to fit the facts.

 

once the problem occurs, the next time i run the modified aiex3, i will usually get nothing (no data will transfer), then eventually both StreamControlStatusRegister and AI InTimer Status_1_Register start returning 0xffffffff (that doesn't look good).  this is when the stop() routine hangs, because Status_1 keeps returning 0xffffffff no matter what.

 

at this point, i have to reboot, otherwise i will never get any dma'd data again.
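
A register that reads back as 0xffffffff usually means the read was never answered: an unanswered PCI/PCIe read typically completes as all ones. A wait loop can treat that value as "device gone" and bail out instead of spinning. The following is a minimal sketch of that idea, not the DDK's implementation. `waitForStop`, `StopResult`, and `kDoneMask` are hypothetical names, and the mask value is a placeholder rather than the real Channel_Status_Register layout. The poll budget is a timeout of sorts, kept only as a backstop for a status that never changes.

```cpp
#include <cstdint>
#include <functional>

// Outcome of waiting for a DMA channel to acknowledge a stop request.
enum class StopResult { Stopped, DeviceGone, GaveUp };

// HYPOTHETICAL mask standing in for "stop | last link | error".
// The real bit positions come from the DDK register headers.
constexpr std::uint32_t kDoneMask = 0x00000007u;

// readStatus: any callable returning the current 32-bit status word.
// maxPolls:   upper bound on reads, so a wedged channel cannot hang the caller.
StopResult waitForStop(const std::function<std::uint32_t()>& readStatus,
                       unsigned maxPolls) {
    for (unsigned i = 0; i < maxPolls; ++i) {
        const std::uint32_t status = readStatus();
        // Check all-ones first: it would otherwise match any bit mask.
        if (status == 0xFFFFFFFFu) return StopResult::DeviceGone;
        if (status & kDoneMask) return StopResult::Stopped;
    }
    return StopResult::GaveUp;
}
```

With this placeholder mask, a status stuck at 0x10004000 ends in `GaveUp` once the budget is spent, and a status of 0xffffffff returns `DeviceGone` on the first read.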

 

i have tried this with 2 and 4-byte fifo mode, and a variety of sampling rates and sizes of transfers (up to 10 meg) -- regardless, the problem still occurs.

 

below is an example of output during the initial run (modified to display data across the sampled range, rather than every value).  further below, output of the next run, which hung in stop:

 

--------------------------------------------

--------------------------------------------

--------------------------------------------

 

-- while waiting for the whole transfer to complete, the status control register toggles between 237910b0 and 237900b2. 

-- here, the last 10,528 samples did not come through correctly

-- i am sampling a square wave

 

 

Testing: Speedy X-series 6363 Slot-4

 

Bar0/Bar1/iBus: 0x1b1000/0x0/0x21df50

X-Series Info -- name:PCIe-6363, id=29749, adc:1 ai:32 dac:4

Memrequest: 0000000001000000

Starting finite 100.00-second hardware-timed analog measurement.

Reading 500000-sample chunks from the 500000-sample DMA buffer.

Status_1: 237910b0, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: 337900b2, StreamControlStatusReg: 00000101

...

Status_1: 237900b2, StreamControlStatusReg: 00040101

Status_1: 619010f0, StreamControlStatusReg: 00040101

 

--> dma reports all of the data is available

 

--> last 10528 samples matched to index (i.e. did not tranfer)

 

-----

dump

-----

0) -1.293030 -1.294964 -1.294964 -1.293997 -1.294319 -1.293674 -1.294319 -1.292707 -1.293030 -1.291740

50000) -0.214794 -1.180854 -1.245968 -1.270466 -1.282070 -1.286260 -1.288517 -1.290773 -1.294319 -1.293352

100000) 3.791936 3.792581 3.791936 3.794837 3.789035 3.788068 3.790002 3.790969 3.792259 3.790002

150000) -1.295931 -1.294964 -1.294641 -1.294641 -1.294319 -1.293674 -1.291096 -1.291740 -1.292063 -1.292063

200000) -1.295931 -1.293352 -1.293030 -1.293352 -1.292385 -1.292063 -1.293352 -1.295608 -1.294319 -1.293352

250000) 3.790969 3.792259 3.788391 3.791936 3.789358 3.789680 3.788391 3.792581 3.793226 3.791614

300000) -1.295931 -1.296575 -1.296253 -1.295286 -1.292385 -1.292707 -1.293030 3.649783 3.740684 3.769050

350000) -1.293030 -1.293352 -1.292385 -1.292707 -1.293674 -1.295286 -1.295931 -1.294319 -1.295931 -1.294641

400000) 3.791936 3.787101 3.783233 3.785490 3.789358 3.790002 3.788068 3.792259 3.792581 3.796772

450000) 3.786134 3.788713 3.790969 3.793226 3.791936 3.792581 3.795160 3.793226 3.790002 3.787746

499999) -7.838192

 

Finished finite 100.00-second hardware-timed analog measurement.

Read 500000 samples (without overwriting data) using a 500000-sample DMA buffer.

--------- speedy -- Unload Library
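
The "matched to index" line in the output above suggests the buffer was pre-filled with a value derived from each slot's index, so that slots the DMA never wrote can be recognized afterwards. That reading is an assumption; the modified aiex3 source is not shown in the thread. A minimal sketch of the technique, with hypothetical helper names:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fill every slot with its own index before arming the DMA transfer.
void prefillWithIndex(std::vector<std::uint32_t>& buf) {
    for (std::size_t i = 0; i < buf.size(); ++i)
        buf[i] = static_cast<std::uint32_t>(i);
}

// Count trailing slots that still hold their index, i.e. slots the
// device most likely never wrote.
std::size_t countUntouchedTail(const std::vector<std::uint32_t>& buf) {
    std::size_t n = 0;
    while (n < buf.size()) {
        const std::size_t i = buf.size() - 1 - n;
        if (buf[i] != static_cast<std::uint32_t>(i)) break;
        ++n;
    }
    return n;
}
```

A real sample can coincidentally equal its own index, so the count may be off by a few slots; it is a diagnostic, not proof.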

 

--------------------------------------------

--------------------------------------------

--------------------------------------------

 

-- program loops waiting for data (0 bytes reported available)

-- as soon as Status_1 starts returning 0xffffffff it is all over

 

Testing: Speedy X-series 6363 Slot-4

 

Bar0/Bar1/iBus: 0x1b1000/0x0/0x229f10

X-Series Info -- name:PCIe-6363, id=29749, adc:1 ai:32 dac:4

Memrequest: 0000000001000000

Starting finite 100.00-second hardware-timed analog measurement.

Reading 500000-sample chunks from the 500000-sample DMA buffer.

Status_1: 237910b0, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: 337900b2, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: 337900b2, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00040101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: 337900b2, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: 337900b2, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: 337900b2, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: 337900b2, StreamControlStatusReg: 00000101

Status_1: 237900b2, StreamControlStatusReg: 00000101

Status_1: ffffffff, StreamControlStatusReg: ffffffff  <-- aiError is set true here

 

---> hang in tCHInChDMAChannel::stop

------------------
scott gillespie
applied brain, inc.
------------------
Message 4 of 5
Solution
Accepted by topic author spg

>> i do my own sgl construction, and of course that is the first suspect for some type of

>> DMA problem -- but i have been over it with a fine tooth comb and so far i can't find

>> anything wrong with the sgl setup and not even a good theory to fit the facts.

 

after leaving this on the back burner for a little while, i recently returned to it and found the problem.  as we all suspected, it was a programmer error -- turned out the tail end of the SGL was getting clobbered during setup, under certain conditions, only for large SGLs. the result, apparently, was that the chinch was going out to lunch when it tried to fetch the last SGL chunkylink. now that i have fixed that problem, i can run dma transfers at top speed all day long without errors. 
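
A clobbered SGL tail like this one can be caught before the hardware sees it by walking the list on the host and checking that it terminates and covers the whole transfer. The sketch below assumes a hypothetical link layout (`SglLink` with an index-based `next` field). The real CHInCh chunky-link format is different and is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// HYPOTHETICAL link descriptor, for illustration only.
struct SglLink {
    std::uint64_t address;  // bus address of this chunk
    std::uint32_t size;     // bytes in this chunk
    std::int32_t  next;     // index of the next link, or -1 for the last link
};

// Walk the list from link 0 and confirm that it stays in bounds,
// terminates, has no empty chunks, and covers exactly totalBytes.
bool sglLooksSane(const std::vector<SglLink>& links, std::uint64_t totalBytes) {
    if (links.empty()) return false;
    std::uint64_t covered = 0;
    std::size_t visited = 0;
    std::int32_t idx = 0;
    while (idx != -1) {
        if (idx < 0 || static_cast<std::size_t>(idx) >= links.size()) return false;
        if (++visited > links.size()) return false;  // more hops than links: a cycle
        const SglLink& link = links[static_cast<std::size_t>(idx)];
        if (link.size == 0) return false;
        covered += link.size;
        idx = link.next;
    }
    return covered == totalBytes;
}
```

Running a check of this kind right after SGL construction, and again just before starting the channel, narrows down whether the list was built wrong or overwritten later.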

------------------
scott gillespie
applied brain, inc.
------------------
Message 5 of 5