We have a test application that uses the Lossy Enqueue function to add images from several cameras to a number of queues. We then call the Preview Queue Element function cyclically to read images off the queue until we find the one we want (or until a timeout), using Dequeue Element on each loop iteration (after the first) to remove the previous element. See the "DequeueLoop.png" attachment for the relevant bit of code. The image is then extracted from the queue data and processed to check compliance with the requirements (other than timestamps).
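For anyone reading without the attachment handy, a rough text-language analogue of the pattern (Python here, since I can't paste G code) looks like this - every name below is mine, not taken from the actual VIs:

```python
import collections
import time

def lossy_enqueue(queue, element):
    """Like Lossy Enqueue: if the queue is full, the oldest element
    is discarded to make room for the new one."""
    if len(queue) == queue.maxlen:
        queue.popleft()            # drop the oldest image
    queue.append(element)

def find_image(queue, wanted_stamp, timeout_s=5.0):
    """Preview-style loop: look at the front element; if it isn't the
    one we want, dequeue (discard) it and look again, until timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if queue:
            stamp, image = queue[0]    # "Preview Queue Element"
            if stamp >= wanted_stamp:
                return stamp, image    # found the image we want
            queue.popleft()            # "Dequeue Element" - remove previous
        else:
            time.sleep(0.001)          # wait for the producer to catch up
    return None                        # timed out
```

The producers (one per camera) call `lossy_enqueue`; the test steps call `find_image` with the timestamp they are looking for.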
This is all controlled by TestStand and works well in the majority of cases - particularly for short sequences.
However, every so often (depending on which system is running it), the Preview Queue Element function returns error 1050 with the text "Trap Notifier is late". The error appears almost immediately after the call is made.
On the LV2010 system in Europe, once this happens it keeps happening. There, the sequence consists of 87 calls to the VI that reads the queue (usually with a CAN message send immediately prior to each).
On our local (LV8.6) system, I have implemented a loop that sends a CAN message, then reads the queue data repeatedly. On this system, approximately every 35th call produces an error. Sometimes the error is followed by the next queue checking VI producing a timeout (in this case, 5 seconds). Usually the error does not "stick" on the local system.
On the face of this evidence, I suspect there's some sort of resource conflict occurring.
If I replace the Lossy Enqueue with a normal Enqueue, the system times out on every queue checking call.
I have also tried adding semaphore acquisition and release around the Lossy Enqueue and Preview Queue Element calls, but this only seems to make matters worse: the error now occurs every 20-30 calls instead of roughly every 35. It is (always) possible that I made an error in the way this was done. Basically, I created a semaphore for each of the queues and bundled its reference into the typedef of the queue data, then used that to acquire and release the locks (releasing the semaphore reference when the queue is destroyed, of course). Should I be doing something other than wiring the output of the "Obtain Semaphore Reference" VI straight into a "Bundle By Name" function?
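In text form, the arrangement I'm describing looks roughly like this (a Python sketch; the Lock stands in for the LabVIEW semaphore reference, and the class stands in for the queue-data typedef - names are mine):

```python
import collections
import threading

class GuardedQueue:
    """A per-queue lock ("semaphore") bundled with the queue itself,
    analogous to bundling the semaphore reference into the typedef."""
    def __init__(self, size=100):
        self.queue = collections.deque(maxlen=size)  # lossy when full
        self.lock = threading.Lock()                 # the bundled "semaphore"

def guarded_lossy_enqueue(gq, element):
    with gq.lock:                  # Acquire Semaphore
        gq.queue.append(element)   # deque(maxlen=...) drops the oldest itself
                                   # lock released on exit = Release Semaphore

def guarded_preview(gq):
    """Look at the front element without removing it, under the lock."""
    with gq.lock:
        return gq.queue[0] if gq.queue else None
```

The intent was simply to serialise the enqueue and preview operations against each other, per queue.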
On the European system, I extended the queue size from 100 elements to 1200 (yes, that's a LOT of memory) and was able to get a trouble-free run - on that particular sequence.
Oh yes, on the local system I was able to "correct" the error with a separate 10ms wait step after each queue reading call.
On the European system, the same "fix" made the issue worse.
There is a lot going on here, but it looks like the central cause of the problem is the source of Error 1050. Checking in LabVIEW under Help » Explain Error, it looks like the possible reason listed for Error 1050 is "LabVIEW: Error occurred while executing script."
Have you checked through the various script nodes in your code? If a script node depends on an external application, you have to make sure that application is open and running properly, etc.
@milan R wrote:
...it looks like the possible reason listed for Error 1050 is "LabVIEW: Error occurred while executing script."
Actually, since Geoff says the error also says "trap notifier is late", I'm assuming this is an internal LV error (it would make sense that the queue primitives use notifications to signal that something happened). I'm also assuming that someone with access to the LV source code could verify fairly easily that this error can be generated from the Preview Queue Element primitive.
But I doubt that much can be done with this without being able to reproduce the bug. Geoff, can you recreate this in a smaller application? If not, NI can take your full code to reproduce it (although I don't know what their process is for that), but it's usually much easier if it can be replicated in a small app.
Also, I don't really understand your design here, so there might be an alternative way to do this. For instance, why do you use a lossy enqueue? Why do you use preview instead of simply using Dequeue (which locks the queue correctly)?
Hi Tst and Milan, and thanks for finding and answering my post.
There were cogent arguments for using a Lossy Enqueue in the first place, as well as using a Preview instead of a Dequeue. Unfortunately, the person who made those decisions no longer works for us and didn't write them down. I do know that using a non-lossy Enqueue gives us timeouts every time, as there's often a delay between the camera tasks being started and the specific part of the test for which we're looking, so our buffers fill up before the test even starts.
Changing to a normal Dequeue instead of using the Preview Queue Element function might well be the best answer for us. The only problem is that I'm not sure what else might break as a result. I'll give it a try.
As for scripts that might have an error - we don't explicitly use scripts. It's all run as sequences from TestStand. We're just using the built-in LabVIEW Queue primitives.
The application is reasonably modular, but breaking it down to something smaller might be problematic. As it stands, we could probably arrange an NDA with NI (if we don't already have one) and send them the installation package we use for distributing our system to internal customers, together with the sequence I've been using to test it locally. I'll try for the Dequeue solution first.
Replacing the Preview Queue Element function with a normal Dequeue still results in errors with about the same frequency.
I have realised that the text of the message actually comes from a function slightly later in the processing chain, as does the error code. This means the error I reported is something of a red herring - at least for our system. It is actually coming from a bit of logic that tests the timestamps and buffer numbers. What seems to be happening is that every so often we get a queue element through with a timestamp that is between 5 ms and 30 ms later than our requirements allow.
In other words, we're dropping some buffers unintentionally. The Windows host is failing to keep up. At least, that's the most likely answer.
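The check that's flagging these elements boils down to something like this (illustrative Python; the function name and the tolerance value are placeholders, not our real requirement):

```python
def lateness_beyond_tolerance(expected_s, actual_s, tolerance_s=0.005):
    """Return how late a buffer's timestamp is beyond the allowed
    tolerance, in seconds, or 0.0 if it is within spec. The failing
    buffers we see come through 5-30 ms later than allowed."""
    return max(0.0, (actual_s - expected_s) - tolerance_s)
```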
The reason I thought it was a queuing error is that I remember seeing the name "Preview Queue Element" in some of the error messages somewhere. Of course, running and re-running the system has produced numerous errors. However, the queue element errors don't occur when the system is running "properly" - just the "late notifier" messages.
As for why we're using Lossy Enqueue - that is documented. Basically, we start the vision system running early in the test sequence and start accumulating images. Then we run the tests as required, with each test looking for appropriate timestamps in the queue.
All of our major sub-systems - Vision, CAN, Analog Inputs, Digital inputs, Current Measurement, Sound, etc - use the Lossy Enqueue method.
Why the Preview Queue Element method? That is not documented, and the people who coded it have moved on to greener pastures (although I remember hearing the reasoning verbally at least once). I suspect it's because the image from any given camera might be used in any of several Vision sub-systems - OCR, Pattern Matching, Angle Measurement, etc. The same method is used in all of our major sub-systems apart from CAN.