12-11-2024 07:11 AM
I still haven't heard a clear description of the symptoms of your "crashes" or any DAQmx error codes that might get thrown.
But the fact that you seem to have found a kind of hard limit at 100k makes me kinda suspicious. It turns out that for continuous acquisition with a sample rate of 100 kHz, the default buffer size DAQmx wants to set up is 100k samples.
It still should honor your requests for a *bigger* buffer size, but perhaps there's a subtle problem in the way you're trying to set up your buffer? Can you try a direct call to the buffer config function after setting up sample timing and before starting the task?
I've not seen this kind of problem on the LabVIEW side. Various times I've tested out the ability to request a larger-than-default buffer size in the call to set up sample timing and I've not seen it refuse or ignore the request. It just works. I don't know if there are multiple API functions available on the text language side and maybe a different one needs to be used? Just guessing now, so I'll cut things off here.
-Kevin P
12-11-2024 12:20 PM
My bad, I thought the error was obvious and already mentioned, but that was in other topics. It is the usual error with code -200279, where the application does not keep up with the hardware acquisition.
The framework still seems to honor something, because at some point I get a maximum-buffer-size error. And if I remove a few channels it is fine (the buffer is per channel, so I guess the final size is base size * number of channels).
This is why I had a feeling there is an intermediate buffer.
And if we can increase the buffer size to a very big value, we should be able to have callbacks every 10 s or even every minute. Whereas you seemed to state that you need to stay under 1 s on your side (but you do not oversize your buffer either).
I will see how far I can push this buffer size, but I would not understand why it could not be allocated for a few seconds of buffering.
To be honest, after that, the next step is to go to NI support; I am wasting your time and mine now that we have this many empirical observations.
File "/home/fretur01/Gitlab/nidaq-cli/nidaqcli/nidaq.py", line 198, in callback_nidaq
data = task.read(number_of_samples_per_channel = number_of_samples)
File "/home/fretur01/Gitlab/nidaq-cli/nidaqcli/lib/python3.10/site-packages/nidaqmx/task/_task.py", line 615, in read
_, samples_read = self._interpreter.read_analog_f64(
File "/home/fretur01/Gitlab/nidaq-cli/nidaqcli/lib/python3.10/site-packages/nidaqmx/_library_interpreter.py", line 4166, in read_analog_f64
self.check_for_error(error_code, samps_per_chan_read=samps_per_chan_read.value)
File "/home/fretur01/Gitlab/nidaq-cli/nidaqcli/lib/python3.10/site-packages/nidaqmx/_library_interpreter.py", line 6408, in check_for_error
raise DaqReadError(extended_error_info, error_code, samps_per_chan_read)
nidaqmx.errors.DaqReadError: The application is not able to keep up with the hardware acquisition.
Increasing the buffer size, reading the data more frequently, or specifying a fixed number of samples to read instead of reading all available samples might correct the problem.
Property: DAQmx_Read_RelativeTo
Requested Value: DAQmx_Val_CurrReadPos
Property: DAQmx_Read_Offset
Requested Value: 0
Task Name: Task_Dev1
Status Code: -200279
12-12-2024 10:13 AM
Below are some interesting traces. My feeling is that they show the input buffer is correctly sized to several seconds of samples.
But for callbacks there seems to be another kind of buffer of about 1 s of samples plus ~20%: at 100 kS/s that is 100,000 + ~20%, and at 10 kS/s I see 10,000 + ~25%.
So I am wondering whether the input buffer size is only related to the TDMS logging set up through configure_logging(...), and whether custom callbacks are handled through this 1-second buffer. At least I know I have a ~1 s limit for my callbacks.
As a last experiment, I will later try not registering my callbacks and instead read curr_read_pos and avail_samp_per_chan from a timer-triggered thread, to see whether their limit is the input buffer size.
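Roughly, the monitoring I have in mind would be a minimal sketch like this (assuming the nidaqmx in_stream properties; the polling period is arbitrary):
import threading

def monitor_stream(task, stop_event, period_s=0.25):
    # Dump the stream pointers periodically WITHOUT ever calling task.read(),
    # to see how far avail_samp_per_chan climbs before the task errors out.
    while not stop_event.wait(period_s):
        print("pos", task.in_stream.curr_read_pos,
              "avail", task.in_stream.avail_samp_per_chan)

# usage, after task.start():
# stop = threading.Event()
# threading.Thread(target=monitor_stream, args=(task, stop), daemon=True).start()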
Interesting traces:
* I dump the onboard buffer size, the input buffer size, and curr_read_pos/avail_samp_per_chan before and after calling task.read (callback every 250 ms) => it was still crashing around this 100K limit
Debug onbrd 4095 => onboard buffer as per HW spec
Debug buf 800000 => OK, I set 8 * sampling clock
* Then I did the same without doing the read: we see that read_pos does not increment, and the available samples climb up to 100K and then drop to 0. In fact it is the -200279 crash... but you only get the traceback if you call task.read 😉 (which I am not doing here)
Debug PRE index 0 pos 0 avail 25051
Debug POST index 0 pos 0 avail 25054
Debug PRE index 0 pos 0 avail 50054
Debug POST index 0 pos 0 avail 50057
Debug PRE index 0 pos 0 avail 75089
Debug POST index 0 pos 0 avail 75094
Debug PRE index 0 pos 0 avail 100046
Debug POST index 0 pos 0 avail 100050
Debug PRE index 0 pos 0 avail 0
Debug POST index 0 pos 0 avail 0
Debug PRE index 0 pos 0 avail 0
Debug POST index 0 pos 0 avail 0
Debug PRE index 0 pos 0 avail 0
Debug POST index 0 pos 0 avail 0
* Then I did the same at 10K sampling, but called read after 3 callbacks (not reading all samples, only the number expected for 1 callback), then again after 6 callbacks.
We see a buffer size of 80,000 and the available samples max out at 12,500 (vs 100,000 at the 100K sampling rate). So the available-samples limit seems to be about 1 s worth of samples.
Debug onbrd 4095
Debug buf 80000
Acquisition mode: AcquisitionType.CONTINUOUS, Num samples per channel: 80000
Slave tasks started, master_task start
Configure Timer to end in 10 seconds, started at 2024-12-12 16:51:02.769297
Debug PRE index 0 pos 0 avail 2505
Debug POST index 0 pos 0 avail 2505 => no task.read() call so still 2500 available samples
Debug PRE index 0 pos 0 avail 5002
Debug POST index 0 pos 0 avail 5002
Debug PRE index 0 pos 0 avail 7507
Debug POST index 0 pos 0 avail 7508
Debug PRE index 0 pos 0 avail 10002 => before doing anything we have 10000 available samples
Debug POST index 0 pos 2500 avail 7507 => I have called task.read for 2500 samples (normal value for 1 callback call). So back to 7500 available samples
Debug PRE index 0 pos 2500 avail 10003
Debug POST index 0 pos 2500 avail 10003
Debug PRE index 0 pos 2500 avail 12507
Debug POST index 0 pos 2500 avail 12508
Debug PRE index 0 pos 2500 avail 0
Exception ignored on calling ctypes callback function: functools.partial(<function callback_nidaq at 0x7d71db9f2830>, 0)
Traceback (most recent call last):
File "/home/fretur01/Gitlab/nidaq-cli/nidaqcli/nidaq.py", line 213, in callback_nidaq
data = task.read(number_of_samples_per_channel = number_of_samples)
File "/home/fretur01/Gitlab/nidaq-cli/nidaqcli/lib/python3.10/site-packages/nidaqmx/task/_task.py", line 615, in read
_, samples_read = self._interpreter.read_analog_f64(
File "/home/fretur01/Gitlab/nidaq-cli/nidaqcli/lib/python3.10/site-packages/nidaqmx/_library_interpreter.py", line 4166, in read_analog_f64
self.check_for_error(error_code, samps_per_chan_read=samps_per_chan_read.value)
File "/home/fretur01/Gitlab/nidaq-cli/nidaqcli/lib/python3.10/site-packages/nidaqmx/_library_interpreter.py", line 6408, in check_for_error
raise DaqReadError(extended_error_info, error_code, samps_per_chan_read)
nidaqmx.errors.DaqReadError: The application is not able to keep up with the hardware acquisition.
Increasing the buffer size, reading the data more frequently, or specifying a fixed number of samples to read instead of reading all available samples might correct the problem.
Property: DAQmx_Read_RelativeTo
Requested Value: DAQmx_Val_CurrReadPos
Property: DAQmx_Read_Offset
Requested Value: 0
Task Name: Task_Dev1
Status Code: -200279
12-14-2024 10:35 PM
Can you explain your Callback a bit? I am confused by it. I am not a text programmer either.
As Kevin stated, 100 ms of data is the sweet spot. If you configured DAQmx for "Log and Read", you need to read data every ~100 ms, otherwise you will get the error you see. There is another mode called "Log Only" which logs directly to disk, but you won't be able to view your data.
The callback is confusing to me because in LabVIEW you can configure the task to "trigger an event every N samples", but it seems like you are triggering it manually.
One way to increase efficiency is to set the number of samples per read to an even multiple of the disk sector size (again, use the multiple closest to the 100 ms interval). Continuous acquisition can be quite stable in LabVIEW: I have had USB devices stream 32 MB/s of data continuously for over a week without error, and PXIe devices stream 160 MB/s until storage ran out.
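(As a rough worked example, assuming 512-byte sectors and 64-bit samples: ~100 ms at 100 kS/s is 10,000 samples, and the closest multiple of 64 samples, i.e. one sector's worth, is 9,984 samples per read.)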
12-16-2024 03:39 AM
Thanks for jumping in. I describe my callback and some other details below.
Kevin and I have converged on the empirical finding that the callback needs to be triggered much more often than every second. I got the stability I wanted at 100K with a 250 ms callback; at 100 ms, I guess I can go to the full capability of the card.
The only concern that we (I) have is to understand where this comes from. The API allows defining a buffer size of several seconds of samples, yet you still need to trigger your callback at least every second, and even more often at higher rates. That does not satisfy me as a Power&Perf architect 😉
But this is NI's internal way of doing things (couldn't they just explain it in the API doc? An API doc is meant to cover the static and dynamic behaviour of API calls, including real-time constraints!), so I will post one last experiment and then just live with the empirical results to avoid wasting more time. We also have another support request open with them, so I will just add it to the topics.
To be honest, I find the rest of the NI docs very good and deep... though hard to find: without the exact keywords, the search engine gives you something like 3 pages of unrelated docs, and I generally don't go past 3 pages 😉
========================================================
Indeed, the callback is all text programming on Linux, no LabVIEW; I set up the "trigger an event every N samples" behaviour manually.
My callback fetches the data for display in the UI or for some live processing, while the data is also stored automatically on disk through the logging API for later reading/processing/display (just one API call, then NI does all the work of saving in TDMS format).
I am also leveraging the LOG-only mode, as it should be the most stable. We lose display/live processing, but the use case would be: configure everything with display, sanity-check what we get, then let the test run without the UI for maximum stability.
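For reference, the logging setup is essentially one call on the input stream; a sketch along these lines, with a placeholder path and the enum names quoted from memory:
from nidaqmx.constants import LoggingMode, LoggingOperation

# Log to TDMS while still allowing task.read() in the callback:
task.in_stream.configure_logging(
    "/tmp/run.tdms",                                  # placeholder path
    logging_mode=LoggingMode.LOG_AND_READ,
    operation=LoggingOperation.CREATE_OR_REPLACE)

# "Log only" variant (most stable, but no task.read() / no live display):
# task.in_stream.configure_logging(
#     "/tmp/run.tdms", logging_mode=LoggingMode.LOG,
#     operation=LoggingOperation.CREATE_OR_REPLACE)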
Whoa, a week of streaming at 160 MB/s, that would be a lot of storage!!
12-16-2024 11:14 AM
Typically in LabVIEW programming, the DAQmx Read is in a while loop, and the rate of the while loop is controlled by the DAQmx Read function. By setting the number of samples per read to approximately 1/10 of the sample rate, the loop runs at roughly 100 ms.
The buffer is there in case the reads cannot occur on time; I have seen this happen when Windows is doing something else in the background.
Once again, I am not sure about Python, so please correct me where I am wrong, but you are sending a callback to do a read. My understanding is that Python is not deterministic; therefore you may not be issuing your read when needed. Over time, you can fall behind and run out of buffer, as you have seen.
Below is from a LabVIEW example; you can see the Read is in its own while loop, set to acquire N points per loop. Can you set up your acquisition that way?
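Roughly the same pattern sketched in Python, by analogy with the LabVIEW example (I may have the text API details wrong, so treat this as a sketch; "Dev1/ai0" is just a placeholder):
import nidaqmx
from nidaqmx.constants import AcquisitionType

rate = 100_000
n_per_read = rate // 10                  # ~100 ms of data per read
with nidaqmx.Task() as task:
    task.ai_channels.add_ai_voltage_chan("Dev1/ai0")      # placeholder channel
    task.timing.cfg_samp_clk_timing(rate, sample_mode=AcquisitionType.CONTINUOUS)
    task.start()
    while True:                          # add your own stop condition
        # read() blocks until n_per_read samples are available, which paces the loop
        data = task.read(number_of_samples_per_channel=n_per_read)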
12-17-2024 07:56 AM
Hello,
Don't be confused: I am aligned with your suggestions, and I am able to trigger my callback every 250 ms or 100 ms and read the data, and it is really stable even at high rates. Kevin and you gave me very good advice. It works, but as an engineer I want to understand the underlying mechanisms, because the behaviour does not match my interpretation of the doc (is my interpretation correct? probably not, see below), nor a typical producer/consumer use case.
Last-minute news: I found the issue. Use the input_buf_size property of the API (which is readable AND writable) rather than samp_quant_samp_per_chan, and you can tune very precisely how the producer/consumer use case behaves.
Details below for the text programmers.
===========================================================
Today I use samp_quant_samp_per_chan to (try to) size the data-exchange buffer. The doc states: "samp_quant_samp_per_chan: If samp_quant_samp_mode is AcquisitionType.CONTINUOUS, NI-DAQmx uses this value to determine the buffer size." (and for a finite acquisition it is the exact number of samples to acquire). OK, but which buffer? And it only says it "uses" the value: if I have f(x) = 1, I can say that I use x to produce 1, yet x is not actually involved. Sloppy math terminology.
And when I was dumping input_buf_size (the doc states: "Specifies the number of samples the input buffer can hold for each channel in the task"), I did not realize that this property is also writable. When I write it... the input buffer gets the correct size, and I can size a buffer of 20 seconds and trigger the callback every 5 s or more... at a high acquisition rate.
task.in_stream.input_buf_size = 20 * master_samp_clk => master_samp_clk is the number of samples per channel acquired per second, so this is a 20-second buffer
task.register_every_n_samples_acquired_into_buffer_event(master_samp_clk * K, callback) => callback is called every master_samp_clk * K samples, that is every K seconds
I did this at a 500 kHz sampling rate, and even under a huge CPU load the framework does not care.
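Putting it together, the working setup is roughly the following sketch ("Dev1/ai0" is a placeholder channel and K = 1 here):
import nidaqmx
from nidaqmx.constants import AcquisitionType

master_samp_clk = 500_000                    # samples per channel per second
K = 1                                        # callback period in seconds

task = nidaqmx.Task("Task_Dev1")
task.ai_channels.add_ai_voltage_chan("Dev1/ai0")          # placeholder channel
task.timing.cfg_samp_clk_timing(master_samp_clk,
                                sample_mode=AcquisitionType.CONTINUOUS)

# Explicitly size the input buffer to 20 s of samples per channel:
task.in_stream.input_buf_size = 20 * master_samp_clk

def callback(task_handle, event_type, number_of_samples, callback_data):
    # Consumer side: runs every K seconds, well within the 20 s of buffer margin.
    data = task.read(number_of_samples_per_channel=number_of_samples)
    return 0                                 # required by the event registration

task.register_every_n_samples_acquired_into_buffer_event(K * master_samp_clk,
                                                          callback)
task.start()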
12-17-2024 01:02 PM
So it appears that you are calling the DAQmx Timing function; here is its help text below. Maybe that will be a better explanation for you.
You are not explicitly setting the buffer size; you are passing in a value that is used to determine the buffer size. There are ways to explicitly set the size.
Glad things are working for you.
Sets the source of the Sample Clock, the rate of the Sample Clock, and the number of samples to acquire or generate.
* task/channels in is the name of the task or a list of virtual channels to which the operation applies. If you provide a list of virtual channels, NI-DAQmx creates a task automatically.
* rate specifies the sampling rate in samples per channel per second. If you use an external source for the Sample Clock, set this input to the maximum expected rate of that clock.
* source specifies the source terminal of the Sample Clock. Leave this input unwired to use the default onboard clock of the device.
* active edge specifies on which edges of Sample Clock pulses to acquire or generate samples.
* error in describes error conditions that occur before this VI or function runs. The default is no error. If an error occurred before this VI or function runs, the VI or function passes the error in value to error out. If an error occurs while this VI or function runs, the VI or function runs normally and sets its own error status in error out. Use the Simple Error Handler or General Error Handler VIs to display the description of the error code. Use error in and error out to check errors and to specify execution order by wiring error out from one node to error in of the next node.
* sample mode specifies if the task acquires or generates samples continuously or if it acquires or generates a finite number of samples.
* samples per channel specifies the number of samples to acquire or generate for each channel in the task if sample mode is Finite Samples. If sample mode is Continuous Samples, NI-DAQmx uses this value to determine the buffer size. This VI returns an error if the specified value is negative.
* task out is a reference to the task after this VI or function runs. If you wired a channel or list of channels to task/channels in, NI-DAQmx creates this task automatically.
* error out contains error information. If error in indicates that an error occurred before this VI or function ran, error out contains the same error information. Otherwise, error out describes the error status that this VI or function produces. Right-click the error out indicator on the front panel and select Explain Error from the shortcut menu for more information about the error.
12-18-2024 07:07 AM
Hello,
You are right, this is what I was mentioning in my previous post: it "uses" the value, it does not set it. But I will stand by my position: it uses it the way "f" uses "x" in f(x) = 1, which is intellectually dishonest 😉 and confusing for text programmers (though I agree that NI products are great and the docs are globally of high quality).
Indeed, there are ways to set it precisely. In Python it is the input_buf_size property, in LabVIEW I found https://www.ni.com/docs/en-US/bundle/ni-daqmx-labview-api-ref/page/lvdaqmx/mxinbufcfg.html, and in the C API it is DAQmxCfgInputBuffer(...). Unfortunately, the Python version is the least clear and obvious 😞
Now that I can set it, I can at last have a 20 s input buffer and a 1 s callback, which is sufficient latency for me. If I put a lot of CPU load on the machine, the callback is triggered only every ~1.5 s; I just read 2x or 3x the expected number of samples when the available-sample count gets too big (or even read all available samples), and I get no crash. I am back to a fully controlled producer/consumer use case. Hurray!
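In the callback, the catch-up read looks roughly like this (a sketch; number_of_samples is the value the every-N-samples event was registered with, and task is captured from the enclosing scope):
from nidaqmx.constants import READ_ALL_AVAILABLE

def callback(task_handle, event_type, number_of_samples, callback_data):
    if task.in_stream.avail_samp_per_chan > 3 * number_of_samples:
        # We fell behind (heavy CPU load): drain everything currently buffered.
        data = task.read(number_of_samples_per_channel=READ_ALL_AVAILABLE)
    else:
        data = task.read(number_of_samples_per_channel=number_of_samples)
    return 0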
=========================================================
https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z000000P9PkSAK&l=en-GB explains how the default input buffer size is allocated; it is a discrete (step) function of the sample rate.
But it could depend on the card, so I did my own tests: the default input buffer size depends only on the sampling rate, not on the "samples per channel" parameter. I got discrete values like 1000, 14436, 104448, ... and past some high sampling rate it becomes sampling rate / 2.
So what is the point of having this parameter in the API? Or, at least, could NI specify on which cards it has an effect and point to the right API for setting the buffer size precisely?
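For reference, the test was along these lines (a sketch, with "Dev1/ai0" as a placeholder channel):
import nidaqmx
from nidaqmx.constants import AcquisitionType

for rate in (1_000, 10_000, 100_000, 1_000_000):
    with nidaqmx.Task() as task:
        task.ai_channels.add_ai_voltage_chan("Dev1/ai0")  # placeholder channel
        # samps_per_chan is only a hint here; in my tests it did not change the result
        task.timing.cfg_samp_clk_timing(rate,
                                        sample_mode=AcquisitionType.CONTINUOUS,
                                        samps_per_chan=8 * rate)
        print(rate, task.in_stream.input_buf_size)        # default chosen by DAQmx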
12-18-2024 07:54 AM
FWIW, this explicit call to set the buffer size was what I was trying to allude to back in msg #11, but I didn't know any specific text language syntax to point toward.
But I'm still bothered that DAQmx didn't go beyond its 100k default buffer size for your sample rate even when you specified a much larger "samples per channel" value when configuring timing. Over on the LabVIEW side, I know I've tested that out and it has always been honored. The buffer size would be the *larger* of the default or whatever I asked for.
I'm also surprised that you found a different set of discrete buffer sizes (14436, 104448, rate/2) than what that chart shows. I hadn't known the buffer auto-sizing behavior to be device dependent that way (but don't think I ever made a serious effort to check either). Or could it even be dependent on the language? Maybe the LabVIEW API ends up calling down to different auto-sizing code than the C or python APIs? If so, that seems more like a bug than a feature.
Very tiny FWIW: the buffer-sizing link has a subtle error in that the stated sample-rate ranges slightly overlap, so there's ambiguity at rates of exactly 100 or 10000 S/s. Somewhere on the site (which I can't seem to find right now) is another chart that removes that overlap. As I recall, the ranges then become 0-100, 101-10000, 10001-1000000, and above 1000000.
-Kevin P