.Net RefNum problem in loop

jaketc · ‎03-28-2022

I am debugging a LabVIEW application that uses a Diffraction Limited camera. The code triggers a capture then attempts to download it.

LV Version 2013
Camera SBIG STC-428-OEM

After running for a period of time (400k interactions) the refnum will "disappear" i.e. goes to 0x00000 and code then throws an error.

I am now testing the code with "ignore errors" in the property node set to true.

Please advise and comment.

screenshot

this is how I think the various SW stack up

Who is John Galt?

wiebe@CARYA · ‎03-29-2022

@jaketc wrote:
After running for a period of time (400k interactions) the refnum will "disappear" i.e. goes to 0x00000 and code then throws an error.

Exactly 400.000 interactions? Always exactly 400.000?

Seems to me there's little LabVIEW can do about it. Looks like a counter of some sort inside a resource.

Maybe there's a method to reset the counter, or a property to set it? Even if you can only read it, you could at least do a graceful re-init.

@jaketc wrote:
I am now testing the code with "ignore errors" in the property node set to true.

That will not make a difference.

"Ignore errors inside" ignores the error returned by a property, so that subsequent properties in the same node aren't skipped because of previous error(s). Since you only have one property in the property node, I do not see how it will make any difference.

Search LabVIEW like a graph!

wiebe@CARYA · ‎03-29-2022

Also, you should stop the while loop on an error out of the property node.

The reference going to 0 might be the result of an error, and if you don't stop the loop on this error, you'll never know the nature of the error. Your loop might stop on the next iteration, but by then you've lost the original error.
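As a rough text-language illustration of that point (Python, not LabVIEW; `read_status` and `RefError` are made-up stand-ins for the property node call and its error cluster), compare what is left at the end of the loop when it stops on the first error versus when it keeps running:

```python
class RefError(Exception):
    """Hypothetical error type standing in for a LabVIEW error cluster."""


def read_status(iteration):
    """Hypothetical stand-in for the property node call."""
    if iteration == 3:
        raise RefError("upstream call failed")   # the original error
    if iteration > 3:
        raise RefError("invalid refnum 0x0")     # follow-on symptom
    return "ok"


def run_loop(max_iterations, stop_on_error):
    """Return (iteration, message) of the error visible after the loop."""
    last_error = None
    for i in range(max_iterations):
        try:
            read_status(i)
        except RefError as exc:
            last_error = (i, str(exc))
            if stop_on_error:
                break   # keep the first error, do not overwrite it
    return last_error


print(run_loop(10, stop_on_error=True))   # (3, 'upstream call failed')
print(run_loop(10, stop_on_error=False))  # (9, 'invalid refnum 0x0')
```

Without the early stop, only the follow-on "invalid refnum" symptom is visible, which is the situation in the screenshot.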

As the device has a connection, there could for instance be an expiration time on it.


jaketc · ‎03-29-2022

Thank you for your answer. I should have said over 400k iterations (my point was that it runs a "long" time). The code I included in the screen captures is test code I made to try to isolate the problem. In the actual VI, there is error handling. I cannot post the actual VI as it's protected under NDA. FWIW, I am not the original author.

My thinking is that the problem is in the .NET assembly (which calls a vendor-supplied DLL), or in the order in which various methods are being called and/or references are being closed. I found a couple of places where refs were not being closed, which I'm fairly sure were causing memory leaks. Closing them seemed to solve that problem. I used the Windows Resource Monitor to observe memory usage: before my mods, I could repeatedly observe memory being consumed; after, I did not.

I am running various tests to try to determine where the problem(s) are. Things I'm trying: opening then closing refs, using various methods, etc. I'm also digging into the .NET assembly (which I did not author either) and the vendor DLL API, and trying to make sure the order of operations is OK.

The vendor-supplied application (which calls the DLL) can operate indefinitely without errors or memory problems. Based on that, my assumption is that the problem is in either the .NET wrapper or the (user-written) VIs, not the DLL or LV.

I guess I'm trying to employ the Ishikawa method to determine the root cause of the problem, i.e. man, machine, method, materials (in this case SW). I think it's likely either a SW or a procedural problem; I'm including the order of operations in the procedure category. If it's in the SW domain, I suspect the vendor's DLL and LV are not the problem. This leaves the user-written VIs or the user-written .NET wrapper.

Just for reference, this code is massive, poorly documented, and (IMHO) poorly structured. More than one consultant has walked (run) away from it.
Please comment if you agree that it's likely a problem with the LV code (user-written VIs) or the .Net wrapper (also user-written, but not the same author as the LV code.) Again thanks for your expert advice. I really appreciate it.


wiebe@CARYA · ‎03-30-2022

@jaketc wrote:

Please comment if you agree that it's likely a problem with the LV code (user-written VIs) or the .Net wrapper (also user-written, but not the same author as the LV code.) Again thanks for your expert advice. I really appreciate it.

Seems like you're on the best track to figuring this out...

Finding the root cause is definitely the way to go, much better than looking for a workaround. If you don't have a full understanding, these kinds of problems tend to haunt you forever.

I'd be interested to know if this is an iteration problem or a time problem. If you can speed the error up or slow it down, you're a step closer to the solution.
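One way to set up that experiment, sketched in Python rather than LabVIEW: run the same loop at two different speeds and compare the iteration count and the elapsed time at the first failure. Everything here is hypothetical (`make_fake_device` only simulates a resource that fails after N calls or after T seconds); the real test would wrap the actual camera call.

```python
import time


def make_fake_device(fail_after_calls=None, fail_after_seconds=None):
    """Hypothetical device that fails after N calls or after T seconds."""
    state = {"calls": 0, "start": time.monotonic()}

    def poll():
        state["calls"] += 1
        if fail_after_calls is not None and state["calls"] > fail_after_calls:
            raise RuntimeError("refnum went to 0")
        if (fail_after_seconds is not None
                and time.monotonic() - state["start"] > fail_after_seconds):
            raise RuntimeError("refnum went to 0")

    return poll


def run_until_failure(poll, delay_s, max_calls=100_000):
    """Return (iterations, elapsed seconds) at the first failure, or None."""
    start = time.monotonic()
    for i in range(1, max_calls + 1):
        try:
            poll()
        except RuntimeError:
            return i, time.monotonic() - start
        time.sleep(delay_s)
    return None


def classify(fast, slow, tolerance=0.25):
    """Compare two runs of the same loop made with different delays."""
    (n_fast, t_fast), (n_slow, t_slow) = fast, slow
    if abs(n_fast - n_slow) <= tolerance * max(n_fast, n_slow):
        return "iteration-bound"
    if abs(t_fast - t_slow) <= tolerance * max(t_fast, t_slow):
        return "time-bound"
    return "inconclusive"


fast = run_until_failure(make_fake_device(fail_after_calls=50), delay_s=0.0)
slow = run_until_failure(make_fake_device(fail_after_calls=50), delay_s=0.002)
print(classify(fast, slow))
```

If the failure always lands near the same iteration count regardless of loop speed, a counter or a leaked resource per iteration is the likelier suspect; if it lands near the same elapsed time, something like a connection expiry is.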

If the other application loops 10X slower, it might seem OK, while in fact it could crash, only 10X slower. Also, perhaps they did work around the problem. So it's not entirely safe to rule out that part of the code... Is there source code available?

It could be that it's doing something right that you aren't doing, or that they don't do something wrong that you are doing (you meaning the LabVIEW code you didn't build of course).

Other options are:

Make a C# test to reproduce the simple VI (you don't even need M$ VS)

Examining the .NET wrapper (e.g. dnSpy)

Calling the dll directly, skipping the wrapper

Asking the vendor for advice


jaketc · ‎03-31-2022

Excellent suggestions. I have VS so I could attempt a simple C# test program.

I have the .net wrapper source and will examine it.

I do not have the DLL source code but do have good documentation for the API.

The client has a good relationship with the vendor (buys a lot of their cameras) so hopefully, they will be receptive and helpful.

I’m going to start by writing a report outlining what I’m doing and what the problems (errors) are, and asking them for their opinion/help. I’ll also ask for a higher-level description of what needs to be done (using their API) to command/control the device.

On my last test, LV threw invalid parameter and/or memory errors. In the first case, the refnum was 0x0000 at the time of the error, which I suspect was a result of the memory error. Possibly another memory leak? There are many loops that open constructors and then call various methods. I definitely found one leak: a constructor was being called inside the loop and closed outside of the loop. The loop, in this case, was checking for “transfer complete”. On true (or on a 3 s timeout, which doesn’t seem to work) the loop would terminate, with the refnum being passed to the “close session” LV primitive. Since the loop doesn’t terminate, the constructor keeps being called until it runs out of memory and LV hangs.
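The pattern described above can be sketched in Python (not LabVIEW, and not the actual wrapper API; `FakeRef` is a made-up stand-in that only counts open references). The first function constructs inside the loop and closes once afterwards; the second constructs once, bounds the loop so it acts as a timeout, and always closes:

```python
class FakeRef:
    """Hypothetical stand-in for a .NET object reference."""
    open_count = 0

    def __init__(self):
        FakeRef.open_count += 1

    def transfer_complete(self, poll_number):
        return poll_number >= 5   # pretend the transfer finishes on poll 5

    def close(self):
        FakeRef.open_count -= 1


def leaky_wait(max_polls):
    """Construct inside the loop, close only once after it."""
    ref = None
    for poll in range(1, max_polls + 1):
        ref = FakeRef()                 # a new reference on every iteration
        if ref.transfer_complete(poll):
            break
    ref.close()                         # only the last reference is closed


def fixed_wait(max_polls):
    """Construct once before the loop and always close afterwards."""
    ref = FakeRef()
    try:
        for poll in range(1, max_polls + 1):
            if ref.transfer_complete(poll):
                return True
        return False                    # bounded loop acts as the timeout
    finally:
        ref.close()


leaky_wait(100)
print(FakeRef.open_count)   # 4 references left open
FakeRef.open_count = 0
fixed_wait(100)
print(FakeRef.open_count)   # 0
```

In the leaky version every poll before the last one leaves a reference open, so a loop that never sees “transfer complete” grows without bound.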

I’m writing this on my phone while sitting in my car in the rain. Needed to take a break and enjoy some Seattle sunshine. Please forgive my typing, spelling, and nomenclature mistakes. BTW, I hope I didn’t sound like I was blaming the original author; I was just saying I didn’t write the original code, to provide context. I make lots of mistakes, I just didn’t make these.

Once the problem is determined and (hopefully) the solution implemented I will post the results. I’m seeing lots of LV applications that call (or need to) external code, e.g., python, .net, Matlab these days. This exercise will undoubtedly be useful in future debugging sessions. Thanks again for your help.


FireFist-Redhawk · ‎03-31-2022

@jaketc wrote:

screenshot

I would add a Wait to throttle your loop a little. It's running at a million loops an hour, so to speak.

Redhawk
Test Engineer at Moog Inc.


Kevin_Price · ‎03-31-2022

The screencap you shared looks like a downstream *symptom*. The problem is the upstream error that comes into your function on the 'error in' terminal.

Because of that incoming error, your call to (what looks like) a SensorStatus constructor probably exhibits standard error behavior -- no constructing happens and the output refnum is invalid with a value of 0.

According to the error text, you should move your investigation to "PromiseResult.vi".

-Kevin P


Yamaeda · ‎04-01-2022

It sounds like you're recreating a .NET ref on each iteration and running out of memory space (Windows allows ~1 million refs). I had a similar problem in a program.

Cache and reuse the ref as much as possible and it should work.
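A minimal sketch of the caching idea, in Python rather than LabVIEW. `RefCache` and `CountingRef` are hypothetical names, and the "SensorStatus" key is only borrowed from the constructor mentioned earlier in the thread; in LabVIEW the equivalent would be a shift register or a functional global holding the refnum.

```python
class RefCache:
    """Create each reference once and hand out the same one afterwards."""

    def __init__(self, factory):
        self._factory = factory   # callable that builds a reference by key
        self._refs = {}

    def get(self, key):
        if key not in self._refs:
            self._refs[key] = self._factory(key)
        return self._refs[key]

    def close_all(self):
        for ref in self._refs.values():
            ref.close()
        self._refs.clear()


class CountingRef:
    """Hypothetical reference that counts how many were constructed."""
    created = 0

    def __init__(self, key):
        CountingRef.created += 1
        self.key = key
        self.closed = False

    def close(self):
        self.closed = True


cache = RefCache(CountingRef)
for _ in range(1000):
    status = cache.get("SensorStatus")   # same object on every iteration
print(CountingRef.created)  # 1
cache.close_all()
```

The loop runs 1000 times but only one reference is ever constructed, and it is closed exactly once when the cache is torn down.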


Qestit Systems

wiebe@CARYA · ‎04-01-2022

@jaketc wrote:

Excellent suggestions. I have VS so I could attempt a simple C# test program.

I have the .net wrapper source and will examine it.

There is a C# (and VB) compiler built into .NET (the CSharpCodeProvider class)... No need to install VS.

But if you have VS, you might as well use it.

