LabVIEW

Error 62 intermittent after prolonged disconnect in TCP heartbeat routine

Hi,

 

I am getting an 'Error 62' in a client/server TCP application. The client runs a connection health check routine that is entered only when the connection between client and server is down. The health check is meant to let the client application (running on a cFP) recover after the server (running on a Windows machine) comes back up.

 

The only pattern I can find is that it happens after the Windows machine has been completely shut down overnight. By design, the FieldPoint client app goes into its connection health check state when it can no longer communicate with the server application. When I bring the Windows machine back up the next day and start the server app, the client is able to establish communication, but at STMread (after STMWriteMeta Data) it throws an Error 62.

 

If I close the server app without shutting down the Windows machine, the client app enters its health check state and re-establishes communication just fine once I restart the server app. Likewise, I can restart the Windows machine, then start the server app, and the connection comes back just fine. So the issue seems to be related to a prolonged period of disconnect. Since I need to recover from such outages seamlessly, I need to figure out how to fix this Error 62 problem.

 

[Image: Client Heartbeat]

[Image: server_app.png]

 

I hope the problem is obvious to someone from the code snippets. Just in case, I have included both full VIs. Maybe there is a better-tested open source VI I could use for this connection health checker, one that simply returns true if the connection is good? That's all I'm trying to accomplish here.
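Since a LabVIEW block diagram cannot be shown as text, here is a rough Python sketch of the kind of check described above: a function that returns true only if a TCP connection can be opened within a bounded time. The function name, the default timeout, and the host and port arguments are hypothetical and not taken from the attached VIs. It only shows that the server accepts TCP connections, not that the STM layer on top is healthy.

```python
import socket


def connection_is_good(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) opens within timeout_s.

    Hypothetical helper for illustration; host, port and timeout are assumptions.
    """
    try:
        # create_connection resolves the host and connects; it raises OSError
        # (including timeouts and refused connections) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            # Leaving the "with" block closes the socket, so each probe
            # releases its connection handle explicitly.
            return True
    except OSError:
        return False
```

Closing the probe socket on every attempt is deliberate: a retry loop that leaves connection references open can accumulate resources during a long outage.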

 

 

Message 1 of 2

First off, a couple of VIs that appear to be located in your "My Documents" folder are missing from the snippet. Getting a look at those would help in determining what is actually going on.

 

Your connection from the cFP to the server appears to use a dynamically assigned port number on the outbound socket connection. There is a documented bug on the RT platform (CAR# 242786) whereby dynamically assigned ports on the RT system are not returned to the pool of available ports after a connection fails to be established, is broken, or is closed. In your application, it looks like the cFP system continually tries to connect to the server while the server is not responding. One possibility is that after running all night, depending on how quickly you retry the connection, you run out of ports and the cFP can no longer connect to anything using TCP.
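To get a feel for whether an overnight outage is long enough for this, the arithmetic can be sketched in a few lines of Python. The pool size and retry interval below are made-up illustration values; the actual number of dynamic ports available on the cFP is not stated here and would have to be checked for your system.

```python
def hours_until_port_exhaustion(pool_size: int,
                                retry_interval_s: float,
                                ports_leaked_per_retry: int = 1) -> float:
    """Estimate how long a retry loop takes to use up a pool of dynamic ports,
    assuming every retry permanently consumes ports_leaked_per_retry ports."""
    if pool_size <= 0 or retry_interval_s <= 0 or ports_leaked_per_retry <= 0:
        raise ValueError("all arguments must be positive")
    retries_available = pool_size / ports_leaked_per_retry
    return retries_available * retry_interval_s / 3600.0


# Hypothetical numbers: 4000 usable ports, one retry every 5 seconds.
hours = hours_until_port_exhaustion(pool_size=4000, retry_interval_s=5.0)
print(f"{hours:.1f} h")  # prints "5.6 h"
```

With numbers in that range the pool would be empty well before the next morning, while a short server restart would consume only a handful of ports.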

 

Try running your code again to duplicate the problem. When you see the problem occurring, attempt to connect to the cFP using MAX, FTP or LabVIEW. If none of these can connect, that confirms the pool of available ports has been depleted.

 

If this is in fact the problem, there are a couple of options.

 

1) Change the method you use for your heartbeat. One thought is that the server could periodically broadcast UDP messages to the cFP systems that it normally connects to. The cFP systems could simply monitor for data coming from this UDP connection as a means to verify the server program is up and running.

 

2) Swap which system establishes the connection so that the RT system creates the listener and the Windows system initiates the connection. This would require that the server would then initiate requests for data from the cFP (cFP does not speak unless spoken to).
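As a rough illustration of option 1, here is a Python sketch of a UDP heartbeat: the server side sends a small datagram periodically, and the cFP side only listens and tracks how long ago the last one arrived. The class and function names, the payload, and the staleness threshold are all hypothetical; in LabVIEW the same idea would be built from the UDP Open, UDP Write and UDP Read primitives.

```python
import socket
import time


def make_sender() -> socket.socket:
    """Create a UDP socket that is allowed to send broadcast datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def send_heartbeat(sock: socket.socket, address, payload: bytes = b"server-alive") -> None:
    """Server side: send one heartbeat datagram to (host, port)."""
    sock.sendto(payload, address)


class HeartbeatMonitor:
    """Client side: listens on a UDP port and reports whether a heartbeat
    arrived within the last stale_after_s seconds."""

    def __init__(self, port: int, stale_after_s: float = 5.0, bind_host: str = ""):
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self._sock.bind((bind_host, port))
        self._sock.setblocking(False)  # polling must never block the caller
        self._stale_after_s = stale_after_s
        self._last_seen = None  # monotonic time when a heartbeat was last read

    @property
    def port(self) -> int:
        return self._sock.getsockname()[1]

    def server_is_alive(self) -> bool:
        # Drain every pending datagram, then compare the age of the newest
        # one against the staleness threshold.
        while True:
            try:
                self._sock.recvfrom(1024)
            except BlockingIOError:
                break
            self._last_seen = time.monotonic()
        if self._last_seen is None:
            return False
        return (time.monotonic() - self._last_seen) <= self._stale_after_s

    def close(self) -> None:
        self._sock.close()
```

The point of this design is that the cFP opens no outbound TCP connection while the server is down, so the retry loop that might be consuming dynamic ports goes away; the client would attempt a TCP connection only after `server_is_alive()` returns true.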

 

 

 

 

 

Greg Sussman
Sr Business Manager A/D/G BU
Message 2 of 2