Queue reference very, very occasionally invalid

I have a large object-oriented application that is performing laser machining on multiple machines in a factory.  All machines run the same LabVIEW application and have identical software components installed on the PC (Win7, 64-bit LabVIEW, etc.).

All products comprise multiple lines that are machined by a motion system controlled via a .NET interface.  Each line that we machine, around 200,000 per day, is read from file in a producer loop and buffered in a queue before being dequeued by a consumer loop and machined.  The queue is never bigger than 10 lines because the producer loop queries the current queue size before deciding to read more lines.
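For readers who do not have the block diagram to hand, the arrangement described above can be sketched in text form. This is only a Python analogue, not the LabVIEW code: the names `producer`, `consumer`, `run` and `MAX_BUFFERED` are made up for illustration, and `q.qsize()` stands in for the Get Queue Status call.

```python
import queue
import threading
import time

MAX_BUFFERED = 10  # assumed limit, taken from "never bigger than 10 lines"


def producer(lines, q, done, stats):
    """Enqueue each line, but only once the queue has room for it."""
    for index, line in enumerate(lines):
        # Analogue of Get Queue Status: poll the size before enqueueing.
        while q.qsize() >= MAX_BUFFERED:
            time.sleep(0.001)
        q.put({"index": index, "line": line})
        stats["peak"] = max(stats["peak"], q.qsize())
    done.set()  # tell the consumer that no more lines will arrive


def consumer(q, done, machined):
    """Dequeue lines and 'machine' them until the producer has finished."""
    while not (done.is_set() and q.empty()):
        try:
            item = q.get(timeout=0.01)
        except queue.Empty:
            continue
        machined.append(item["index"])  # stand-in for the machining step


def run(lines):
    """Run one producer and one consumer; return (machined order, peak size)."""
    q = queue.Queue()
    done = threading.Event()
    machined, stats = [], {"peak": 0}
    threads = [
        threading.Thread(target=producer, args=(lines, q, done, stats)),
        threading.Thread(target=consumer, args=(q, done, machined)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return machined, stats["peak"]
```

Because only the producer adds elements and it checks the size first, the queue size observed right after each enqueue cannot exceed `MAX_BUFFERED` in this sketch.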

This producer-consumer architecture is a basic LV building block and works well.  The actual queue reference is initialized and stored in a Functional Global VI.  When a line is read from file, the data is used to populate a CLUSTER, which is then passed to Enqueue Element.  The reference wire from the Global goes first to GetQueueStatus, then to Enqueue Element.  It works perfectly every day, but very occasionally, around once every 2 months, we see this error:

"T-1-Get Queue Status in Cut.lvclass:EnqueueLine.vi->Cut.lvclass:RunProcess.vi->Rig_Operation_Main.vi"

Error 1 means an input is invalid.  After the error occurs, I'm able to go back and check the text file reads okay and the CLUSTER populates.  I'm therefore of the belief that the error occurs because the Queue Reference is invalid.

Can anyone offer suggestions as to why this might happen and how to fix the bug?

 

My code writes a text log file as it goes to aid debugging. It looks like this when the error happens:

 

Normal running:

20191021 162903Z EnqueueLine.vi Start
20191021 162903Z Enqueue LineIndex 429 Start Queue Size 18 End Consumer Queue Size 26
20191021 162903Z EnqueueLine.vi Finish

 

With error:

20190824 110326Z EnqueueLine.vi Start
20190824 110326Z Enqueue LineIndex 1041 Start Queue Size 0 End Consumer Queue Size 0 (T-1-Get Queue Status in Cut.lvclass:EnqueueLine.vi->Cut.lvclass:RunProcess.vi->Rig_Operation_Main.vi)
20190824 110326Z EnqueueLine.vi Finish (T-1-Get Queue Status in Cut.lvclass:EnqueueLine.vi->Cut.lvclass:RunProcess.vi->Rig_Operation_Main.vi)

 

Message 1 of 10

The most likely situation is you have a case structure or Event Structure that is using the "Use Default If Unwired" tunnel option and you failed to wire through your queue reference.  That feature should be turned off in 98% of the situations I have seen.


Message 2 of 10

The most likely cause is still some simple programmer error. But:

Queues and very rare instances of invalid refnums... I recalled that a former colleague of mine fixed a bug in G# years ago that had these symptoms. The bugfix note claims "LabVIEW queues ref nums seems to be reused every 4096 element."

 

If you cannot find anything, I can call him and ask him to elaborate on the issue.

Message 3 of 10

Thanks for the suggestions.  Here is a snippet of code to show how simple it is.

The Producer Loop:

ProducerLoop.jpg

 

Inside the Enqueue VI:

Enqueue.jpg

The queue reference comes from the yellow-colored Functional Global Variable VI.  At 200,000 lines per day across 9 manufacturing rigs, that's 73 million enqueues a year.  We're seeing the failure around 6 times per year.

So far, it has only failed at LineIndex greater than 1000.  The largest features machined are around 6000 lines.

You will notice I have put a case structure and a while loop in there.  This is a recent attempt at reversion: if the error occurs, the VI will clear the error, re-read the reference and re-attempt the enqueue.
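The reversion logic described here can be written out in text form. This is a rough Python analogue under stated assumptions, not the actual VI: `InvalidReference` stands in for error 1, `get_reference` stands in for reading the Functional Global, `enqueue` stands in for Enqueue Element, and the attempt limit is a guess, since the post does not say how many times the while loop retries.

```python
class InvalidReference(Exception):
    """Hypothetical stand-in for LabVIEW error 1 (an input is invalid)."""


def enqueue_with_retry(get_reference, enqueue, element, max_attempts=3):
    """Enqueue an element; on an invalid reference, re-read it and retry.

    Returns the number of attempts that were needed. Re-raises the last
    error if every attempt fails.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        ref = get_reference()  # re-read the reference on every attempt
        try:
            enqueue(ref, element)
            return attempt
        except InvalidReference as err:
            last_error = err  # "clear" the error and go round the loop again
    raise last_error
```

Logging the returned attempt count would show how often the retry path is actually taken, which helps tell a fix from a run of good luck.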

Message 4 of 10

It has also occurred to me that GetQueueStatus is also called in the Consumer Loop.

If both the Producer and Consumer loops try to call GetQueueStatus at the same instant, I would imagine that each has its own reference to the call and the LabVIEW scheduler handles it all in the background.  But maybe the potential exists for a bug.

GetQueueStatusInConsumerLoop.jpg

Message 5 of 10

I had the same error, except with a file reference, and it manifested itself in an edge case. It was not a programming error; it was probably an OS or intrinsic LabVIEW problem. If @rolfk is reading this, he would probably know the answer.

 

Below is the offending code.

 

Snap33.png

Basically, I would write a special file header, close the file, then open it using an H5 library. Every now and then, I would get the invalid reference error, where my file ID was not valid, after the H5 Open File function. The error was telling me that the file I just closed before that function was not valid. Note the Close file function did not give any errors. According to data flow, everything should be valid.

 

This was an edge case. When I used the exact same files on an internal disk drive, I never got an error. I only got the error when I was using an external USB drive and had a lot of small files that I was converting, that is, going through the code above. For whatever reason, the latency over the USB bus was not taken into account. My workaround was to catch the error and retry the Open H5 function if there was an error. That is, after a small delay the invalid file reference suddenly became valid. This worked.
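In text form, that workaround amounts to a retry with a short pause between attempts. The following is a Python sketch of the idea only, not the H5 library code: `open_fn` is a placeholder for whatever open call is failing, and the attempt count and delay are arbitrary assumptions.

```python
import time


def open_with_retry(open_fn, path, attempts=5, delay_s=0.05):
    """Call open_fn(path); on OSError, wait briefly and try again.

    Re-raises the error if the final attempt also fails.
    """
    for attempt in range(attempts):
        try:
            return open_fn(path)
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay_s)  # give the device time to settle
```

The delay is the part that matters here: retrying immediately may hit the same transient state, whereas a short wait lets the slow device catch up.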

 

This has no relation to your problem, but if you are opening references on external devices, there may be some unaccounted latency. In my case, it was not really a programming error.

 

mcduff

Message 6 of 10

Hi mcduff, thanks for the info. I'm not using any external storage that might cause a delay; however, your experience gives me hope that my reversion code will solve the bug.

Message 7 of 10

A far-out idea that is most likely not the case, but just in case...

 

If an Action Engine (or any VI) is set as "subroutine", instances of the sub-VI will have a call option "Skip if Busy".

 

If the VI in question is configured to "Skip if Busy", there is a possibility that another thread is using the sub-VI and the configured instance will be skipped, in which case the skipped VI will return the "default-default" of any data returned by the sub-VI.
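To make the failure mode concrete, here is a loose Python analogy of a call that is skipped when busy. It is not how LabVIEW implements subroutine priority: `SkippableCall` is an invented name, and the non-blocking lock simply models "someone else is already inside this VI". The demo forces the busy state by calling the wrapper from inside itself, so that the result does not depend on thread timing.

```python
import threading


class SkippableCall:
    """Wrap a function so that a call made while it is busy is skipped."""

    def __init__(self, func, default=None):
        self._func = func
        self._default = default
        self._lock = threading.Lock()  # non-reentrant on purpose

    def __call__(self, *args):
        if not self._lock.acquire(blocking=False):
            # Busy: skip the call and hand back the default silently.
            return self._default
        try:
            return self._func(*args)
        finally:
            self._lock.release()


def demo():
    """Return (outer result, result of a call made while busy)."""
    inner_results = []

    def get_reference():
        # A second call arrives while the first is still running.
        inner_results.append(call())
        return "queue-ref"

    call = SkippableCall(get_reference, default=None)
    outer = call()
    return outer, inner_results[0]
```

The skipped caller receives no error, only the default value, which is why a skipped read of a reference would look like an invalid refnum further downstream.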

 

If the VI is not set as subroutine, forget about everything I just wrote.

 

Ben 

Message 8 of 10

Thanks, Ben. I searched for callers; the Enqueue VI is used in only one place. It's a sub-VI only to keep the diagram tidy.  Soon the code with the reversion that re-reads the reference will be deployed to the factory floor.  I'll report back if the bug remains.

Message 9 of 10

I can confirm the bug has now gone.  The reversion routine that re-reads the reference must be effective, as we have had no more failures.

Message 10 of 10