06-21-2017 07:00 PM
I have a system we configured a few years ago with a cRIO-9025 controller using LabVIEW 10 RT and FPGA. The system is a network-attached protocol converter that communicates using UDP multicast packets. The original configuration used layer 2 Ethernet switches to connect everything and communicate with remote clients, and it has been working like this for several years. We recently swapped the dumb layer 2 switches for "smart" managed Cisco switches, and an issue has popped up that has us stumped.
When the app in the cRIO first starts up and opens the multicast connections, everything is good: the open UDP connection VI sends an IGMP join and the switch properly routes multicast UDP packets from the external client to the cRIO. After about 2 minutes or so the router prunes the multicast tree and no longer forwards the packets to the cRIO. My network guy tells me that the router is asking the cRIO whether it is still joined and it doesn't respond (an IGMP thing), at which point the router decides the cRIO is no longer there and removes that multicast route. Restarting the app re-initiates the join and everything is good for another couple of minutes. We were able to force the router to keep the group open, but this is a total hack.
Has anyone else seen this behavior in a VxWorks-based cRIO? Is there some secret ini file key to change this behavior?
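For readers without the LabVIEW diagram in front of them, the join described above can be sketched with plain BSD-style sockets. This is a minimal Python sketch and not the LabVIEW implementation; the function names are hypothetical, and any group address, port, or interface address you pass in is your own choice.

```python
import socket
import struct


def build_mreq(group: str, iface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq structure: multicast group + local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))


def open_multicast_reader(group: str, port: int, iface: str = "0.0.0.0") -> socket.socket:
    """Open a read-only UDP socket and join `group` on the interface `iface`.

    Setting IP_ADD_MEMBERSHIP is what prompts the host's IP stack to send the
    initial IGMP membership report (the "join"). Answering the router's later
    IGMP queries is the stack's job, not the application's.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group, iface))
    return sock
```

Passing the address of a specific NIC as `iface` pins the join to that port, which may matter on a controller with two Ethernet interfaces. Leaving it at `0.0.0.0` lets the stack pick the interface.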
06-22-2017 09:47 AM
That is odd behavior. After some digging I found this KB that may address your problem. Otherwise, it may be an issue with high-priority loops forcing the processor to stop network communication to ensure enough resources are available to run those loops. You can check whether the CPU reaches 100% during the time in which the cRIO becomes unresponsive.
06-22-2017 09:59 AM
We monitor the CPU/memory usage in real time and it seems to hover around 30%. I can isolate the code in question to see if it is some kind of CPU-related issue as in the link you posted. We haven't had any issues with latency or odd behavior with the original layer 2 switch install, so I am doubtful this is the problem, but I won't rule anything out at this point.
06-22-2017 10:07 AM
I would at least console out and check the Recv-Q column after typing inetstatShow to see if it increases during connection. What does your UDP code look like? Are you doing UDP reads and writes in parallel?
06-22-2017 10:11 AM
The loop in question opens a read-only connection and waits for characters in the buffer. The commands are low rate, and the loop does not write anything out.
06-22-2017 10:14 AM
Isolate the UDP code and check to see if the same thing occurs. You can use the UDP multicast example.
06-26-2017 06:06 PM
Update: I isolated the code to this on the cRIO and Windows client side:
The client sends a message every second; the cRIO side receives the message and increments Count. It runs exactly 180 times before it stops. Restarting the cRIO side gets it running for another 3 minutes. We can go into the router and disable multicast pruning for that group and it will run forever.
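The isolation test described above boils down to a receive loop that counts datagrams until they stop arriving. A rough Python equivalent is sketched here; the original VIs are not reproduced, and `count_messages` and its default timeout are hypothetical.

```python
import socket


def count_messages(sock, idle_timeout: float = 5.0) -> int:
    """Count received datagrams until none arrives for `idle_timeout` seconds.

    `sock` is any already-opened UDP socket (for a multicast test, one that
    has joined the group). The return value plays the role of "Count".
    """
    sock.settimeout(idle_timeout)
    count = 0
    while True:
        try:
            sock.recvfrom(1500)  # payload is ignored; only arrival matters
        except socket.timeout:
            return count  # traffic stopped, or the route was pruned
        count += 1
```

With a sender emitting one message per second, the returned count gives a rough measure of how many seconds the multicast route stayed up.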
06-27-2017 10:16 AM
I did some more digging, and have some questions for you. Does the cRIO have a static IP or is it DHCP? We have a KB that talks about UDP multicast failing on static IP devices. Is the cRIO on the same subnet as the Windows client?
06-27-2017 10:29 AM - edited 06-27-2017 10:30 AM
Good questions. We just went to DHCP when we replaced the layer 2 switches with routers, and the Windows client and cRIO are NOT on the same subnet. Here is one other thought I had: we are also using the 2nd Ethernet interface on the cRIO to communicate with another device in a point-to-point fashion (this interface is using static IP addresses). I'm wondering if it is possible that the cRIO is responding on this NIC instead of the one that it joined on, as if the bindings got mixed up? As another data point, we have solid IP connectivity with the Windows client; if I switch to unicast it will run all day long. FYI, the KB you mentioned is for Linux-based units, and this one is VxWorks.
06-27-2017 10:35 AM
Sorry about that, I forgot you had VxWorks. Could you check if you see the same behavior when disabling/disconnecting the 2nd Ethernet interface on the cRIO?
06-27-2017 10:55 AM
Going to the lab now to run that test.
06-29-2017 10:13 AM
I ran the test; disabling the 2nd Ethernet interface did not fix things. I also tried passing in the interface and got an Error 54. I tried passing the IP address through String to IP and also using String to IP with no input. Both generated Error 54, which is something about a malformed address.
06-29-2017 11:30 AM
Have you tried the machine name instead of the IP address?
Also try changing to static instead of DHCP to see if the same thing occurs.
Lastly, try using a different UDP port number and let me know if you see the same behavior.
Put an indicator after string to IP to verify that it is a valid formatted IP address.
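The "verify that it is a validly formatted address" step can also be done outside LabVIEW as a sanity check on the strings being wired in. The following is a small Python sketch using the standard `ipaddress` module; `check_multicast_address` is a hypothetical helper, and it says nothing about what the LabVIEW String to IP function itself accepts.

```python
import ipaddress


def check_multicast_address(text: str) -> str:
    """Classify a dotted-quad string as malformed, multicast, or not-multicast."""
    try:
        addr = ipaddress.IPv4Address(text.strip())
    except ValueError:
        # Covers empty strings, missing octets, out-of-range octets, etc.
        return "malformed"
    # IPv4 multicast groups live in 224.0.0.0/4.
    return "multicast" if addr.is_multicast else "not-multicast"
```

Running both the group address and the interface address through a check like this helps separate a typo in the address string from a genuine network-stack problem.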
07-11-2017 10:15 AM
I tried everything suggested without success. I found another thread on the group called "IGMP problem with PXI under LabVIEW 2015SP1". It involves a different OS (Phar Lap) but the exact same symptoms. This appears to be a lack of support in the VxWorks kernel for the latest version of IGMP.
07-11-2017 12:47 PM
I think the version of NI-RIO is 3.6 in our cRIO-9025s. Can you tell me if updating to a later version with our LabVIEW 10 will work or possibly fix this? I notice that the latest version that is compatible with LabVIEW 10 SP1 is NI-RIO 13.1.1. I'm wondering if the VxWorks version is different and maybe has support for IGMPv3?
07-18-2017 02:00 PM
Sorry for the late reply. That is a very old version of NI-RIO, so yes try upgrading and let me know what happens.
08-14-2017 11:49 AM
So I attempted to update to the last revision listed as compatible with LabVIEW 10 SP1, which is NI-RIO 13.1. This completely broke NI MAX (which had some rather cryptic error messages) and several other NI software elements, so we are going to have to reload and start over. So my next question is: what is the appropriate way to update NI-RIO? Do I need to download all the revisions since NI-RIO 3.6 and install them in order?
08-15-2017 09:56 AM
You do not need to install everything since 3.6. When you reinstall, do so in this order:
LabVIEW Development
LabVIEW modules such as Real-Time or FPGA
NI Drivers such as NI-RIO 13.1
08-16-2017 04:37 PM
Update: I was able to upgrade NI-RIO to version 13.1 by removing all NI software and reinstalling in the order you recommended. Everything is working as before, but this did NOT fix the IGMP multicast issue. I'm out of ideas on this. We can get the system to work by hardcoding the routers to subscribe to the multicast group themselves. This keeps the route from getting pruned but doesn't explain why the cRIO does not respond to the IGMP query after initially opening the connection.
08-17-2017 11:05 AM
I am unfortunately out of ideas as well. You have somewhat of a workaround, though I understand it's a hack. I'm unsure why the cRIO does not respond to an IGMP query. I searched CARs and found CAR #152908, which implemented IGMPv2 functionality on VxWorks targets. If you have a standard support contract with NI, feel free to open a service request about this, as there may be a fix that R&D has that can only be accessed through a proper escalation chain via a service request.
07-13-2018 11:53 AM
I am uploading a working UDP Multicast routine for the cRIO, tested on two chassis (9039 and 9025) in LabVIEW 2017 SP1.
There are some subtleties to doing this, especially if the chassis has 2 Ethernet ports. Additionally, the VI has code to query and return the cRIO chassis information and Ethernet port configuration, as the old cRIO Configuration toolkit no longer works.
My heart goes out to those who have banged their heads on this... this was all flawless on VxWorks and now is very spotty on Linux RT, IMO.
Regards and enjoy
Jack Hamilton
07-13-2018 04:54 PM
Hi Jack, I am looking at using UDP multicast on cRIO Linux RT. When you say it's spotty, do you mean it's harder to get working but that it works OK after that?
07-13-2018 05:37 PM
The spotty part is which cRIO system supports MCast UDP and which do not. Even NI does not know...
Other than that - it's rock solid.
ALSO: In my example, you have to change the IP address I hardcoded to your actual cRIO address to get it to work. [Upper left corner of the diagram]
I've done a lot of cRIO stuff...and am happy to help...!
07-17-2018 11:19 AM
Ah, that's good to know. It could be a scary thing if you have a deployed system and then for some reason have to upgrade or swap out a CompactRIO for a different model and nothing works anymore.
07-17-2018 12:03 PM
Also, stay away from Shared Variables in the cRIO; they can be a problem. We've had a cRIO fail to boot because of a shared variable problem, or at least that's what we determined. The embedded target should always run, no matter what is happening in the outside world.
However, it appears there is something in the way the library deploys on startup on the embedded that can cause the problem. There likely was a legitimate cause - perhaps the SV engine on the PC was not running or in an error state...but the cRIO should have completed its boot-up.
We stripped all the SV code out...and it works fine.
07-18-2018 11:35 AM
I've been using the DCAF framework. The learning curve can be steep, especially for creating new modules, but it works great and is very reliable. No Shared Variables are used in it at all. I highly recommend it.
07-18-2018 11:44 AM
Lots of good input on this topic. I'll add one more definitive item that we discovered: it appears that the VxWorks kernels installed in the cRIOs do NOT support IGMPv3. This was the original issue for us, characterized by the system initially joining the multicast group but being pruned from the routing tables 3 minutes later (PIM sparse-mode configuration). Rolling back to IGMPv2 solved that problem, as does forcing the router to join the group on that port with a hard route.
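For anyone trying to confirm the same diagnosis from a packet capture, the version of an IGMP membership query can be told apart from its length and Max Resp Code field, following the query version rules in RFC 3376. The Python sketch below assumes you already have the raw IGMP message bytes (the part after the IP header); `classify_igmp_query` is a hypothetical helper, not part of any NI or Cisco tool.

```python
IGMP_MEMBERSHIP_QUERY = 0x11  # IGMP type code shared by v1, v2, and v3 queries


def classify_igmp_query(payload: bytes) -> str:
    """Return which IGMP version a membership query belongs to.

    Rules (RFC 3376, query version distinctions):
      - 8 octets and Max Resp Code == 0  -> IGMPv1 query
      - 8 octets and Max Resp Code != 0  -> IGMPv2 query
      - 12 octets or more                -> IGMPv3 query
    """
    if len(payload) < 8 or payload[0] != IGMP_MEMBERSHIP_QUERY:
        return "not-a-query"
    max_resp_code = payload[1]
    if len(payload) == 8:
        return "IGMPv1" if max_resp_code == 0 else "IGMPv2"
    if len(payload) >= 12:
        return "IGMPv3"
    return "malformed"  # 9 to 11 octets matches none of the query formats
```

If the router's queries classify as IGMPv3 and the host never answers them, that would be consistent with the behavior described above. It is a hint rather than proof, since other causes of a missing report are possible.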
07-18-2018 11:56 AM
Cs,
Thanks for the input. I do recall years ago NOT getting MCast UDP to work on an sbRIO, using code that worked on the cRIO... It took NI support quite a while to find out that they had not fully deployed the TCP/IP stack on that particular series of sbRIO.
It seems [at that time] that the NI Embedded Software Manager decides what's deployed and what's not. The sbRIO/cRIO Product [hardware] Manager may not be aware of the minutiae of the actual implementation.
As I've been using LV for decades, I use my own TCP/IP-based embedded interface code, employing the LabVIEW TCP/IP primitives, and don't use the 'toolkits' (they did not exist when I started!).
That said, the TCP/IP stack implementation in the cRIO/sbRIO is rock solid, and it's stood up to lots of scrutiny [Wireshark] and "accusations" from 'C'-based programmers over numerous projects over the years. Once it's working, it's working!