06-21-2017 07:00 PM
I have a system we configured a few years ago with a cRIO-9025 controller using LabVIEW 10 RT and FPGA. The system is a network-attached protocol converter that communicates using UDP multicast packets. The original configuration used Layer 2 Ethernet switches to connect everything and communicate with remote clients, and it worked that way for several years. We recently swapped the dumb Layer 2 switches for "smart" managed Cisco switches, and an issue has popped up that has us stumped. When the app on the cRIO first starts up and opens the multicast connections, everything is good: the open UDP connection VI sends an IGMP join, and the switch properly routes multicast UDP packets from the external client to the cRIO. After about 2 minutes, the router prunes the multicast tree and no longer forwards the packets to the cRIO. My network guy tells me that the router asks the cRIO whether it is still joined (an IGMP thing) and the cRIO doesn't respond, at which point the router decides the cRIO is no longer there and removes that multicast route. Restarting the app re-initiates the join, and everything is good for another couple of minutes. We were able to force the router to keep the group open, but this is a total hack. Has anyone else seen this behavior in a VxWorks-based cRIO? Is there some secret ini file key to change this behavior?
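For anyone checking this kind of problem in a packet capture, here is a minimal Python sketch (not LabVIEW, and not code from the system described above) that builds and classifies the 8-byte IGMPv2 messages involved: the router's membership query and the report a host is expected to send back. The group address 239.1.1.1 in the usage note and the helper names are hypothetical, chosen only for illustration.

```python
import socket
import struct

# IGMP message type codes (first byte of the IGMP payload).
IGMP_TYPES = {
    0x11: "membership query",
    0x12: "IGMPv1 membership report",
    0x16: "IGMPv2 membership report",
    0x17: "IGMPv2 leave group",
    0x22: "IGMPv3 membership report",
}


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, as used by IGMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_v2_report(group: str) -> bytes:
    """Build the 8-byte IGMPv2 report a host sends to stay joined."""
    addr = socket.inet_aton(group)
    unsigned = struct.pack("!BBH4s", 0x16, 0, 0, addr)
    return struct.pack("!BBH4s", 0x16, 0, internet_checksum(unsigned), addr)


def describe_igmp(payload: bytes) -> str:
    """Summarise an IGMP payload (IP header already stripped)."""
    if len(payload) < 8:
        return "too short to be IGMP"
    msg_type, _max_resp, _cksum, group = struct.unpack("!BBH4s", payload[:8])
    name = IGMP_TYPES.get(msg_type, "unknown type 0x%02x" % msg_type)
    if msg_type == 0x22:
        # IGMPv3 reports use a different layout; bytes 4..7 are not a group.
        return name
    status = "ok" if internet_checksum(payload) == 0 else "bad"
    return "%s for %s (checksum %s)" % (name, socket.inet_ntoa(group), status)
```

Calling `describe_igmp(build_v2_report("239.1.1.1"))` shows what a healthy reply to the router's query looks like. In a capture taken on the cRIO's switch port, queries (type 0x11) with no matching reports afterwards would be consistent with what the network engineer described.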
06-22-2017 09:47 AM
That is odd behavior. After some digging, I found this KB that may address your problem. Otherwise, it may be an issue with high-priority loops forcing the processor to stop network communication to ensure enough resources are available to run those loops. You can check your CPU usage to see if it reaches 100% during the time in which the cRIO becomes unresponsive.
06-22-2017 09:59 AM
We monitor the CPU/memory usage in real time, and it seems to hover around 30%. I can isolate the code in question to see if it is some kind of CPU-related issue as in the link you posted. We haven't had any issues with latency or odd behavior with the original Layer 2 switch install, so I am doubtful this is the problem, but I won't rule anything out at this point.
06-22-2017 10:07 AM
I would at least open a console and check the Recv-Q column after typing inetstatShow to see if it increases while the connection is up. What does your UDP code look like? Are you doing UDP reads and writes in parallel?
06-22-2017 10:11 AM
The loop in question opens a read-only connection and waits for characters in the buffer. The commands are low rate, and the loop does not write anything out.
06-22-2017 10:14 AM
Isolate the UDP code and check to see if the same thing occurs. You can use the UDP multicast example.
06-26-2017 06:06 PM
Update: I isolated the code to this on the cRIO and Windows client sides:
The client sends a message every second; the cRIO side receives the message and increments Count. It runs exactly 180 times before it stops. Restarting the cRIO side makes it run for another 3 minutes. If we go into the router and disable multicast pruning for that group, it runs forever.
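The same test is easy to reproduce from a desktop machine to compare against the cRIO. Below is a rough Python equivalent of the two sides described above, not the original LabVIEW code: a sender that emits one datagram per period and a receiver that counts datagrams until the stream goes quiet. The function names and the 5-second idle timeout are assumptions for illustration.

```python
import socket
import time


def send_periodically(sock, dest, count, period_s=1.0):
    """Send `count` numbered datagrams to `dest`, one every `period_s` seconds."""
    for i in range(count):
        sock.sendto(str(i).encode("ascii"), dest)
        time.sleep(period_s)


def count_until_silent(sock, idle_timeout_s=5.0, bufsize=1500):
    """Count received datagrams until none arrives for `idle_timeout_s` seconds."""
    sock.settimeout(idle_timeout_s)
    count = 0
    while True:
        try:
            sock.recvfrom(bufsize)
        except socket.timeout:
            # The stream stopped (or was pruned upstream); report the total.
            return count
        count += 1
```

If a PC joined to the same group on the same switch port keeps counting past 180 while the cRIO stops, that points at the cRIO's IGMP handling and not at the network configuration.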
06-27-2017 10:16 AM
I did some more digging and have some questions for you. Does the cRIO have a static IP, or is it using DHCP? We have a KB that talks about UDP multicast failing on static IP devices. Is the cRIO on the same subnet as the Windows client?
06-27-2017 10:29 AM - edited 06-27-2017 10:30 AM
Good questions. We just went to DHCP when we replaced the Layer 2 switches with routers, and the Windows client and cRIO are NOT on the same subnet. Here is one other thought I had: we are also using the second Ethernet interface on the cRIO to communicate with another device in a point-to-point fashion (this interface uses static IP addresses). I'm wondering whether the cRIO could be responding on this NIC instead of the one it joined on, as if the bindings got mixed up. As another data point, we have solid IP connectivity with the Windows client; if I switch to unicast, it will run all day long. FYI, the KB you mentioned is for Linux-based units; this one is VxWorks.
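To illustrate the binding question on a multi-homed host: with BSD-style sockets, the join request names both the group and the local interface address, so which NIC the membership lives on is an explicit choice. This is a Python sketch of that idea on a desktop OS, not a statement about how the LabVIEW RT UDP VIs or the VxWorks stack behave internally. The function names and the example addresses are hypothetical.

```python
import socket


def build_mreq(group: str, interface_ip: str) -> bytes:
    """Pack an ip_mreq: 4-byte group address followed by 4-byte local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface_ip)


def open_multicast_receiver(group: str, port: int, interface_ip: str) -> socket.socket:
    """Open a UDP socket and join `group` on the NIC that owns `interface_ip`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Passing "0.0.0.0" as interface_ip would let the OS pick the interface,
    # which on a two-NIC host may not be the one facing the multicast router.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_mreq(group, interface_ip),
    )
    return sock
```

For example, `open_multicast_receiver("239.1.1.1", 5000, "10.0.0.5")` would join on the interface holding 10.0.0.5 (both values made up here). If the join or the later query responses end up associated with the point-to-point NIC, the router on the other interface would never see them.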
06-27-2017 10:35 AM
Sorry about that, I forgot you had VxWorks. Could you check whether you see the same behavior with the second Ethernet interface on the cRIO disabled or disconnected?