I am facing the following problem and need your help troubleshooting it.
I have a unit that has been running almost every day for over 5 years. It consists of an NI cRIO 9074 and two (2) NI 9144 EtherCAT modules.
Both the cRIO and the EtherCATs are in FPGA mode (each one has an NI-9478 module for PWM functionality).
So far so good, this setup is up and running without problems for years.
Suddenly this morning both EtherCATs seem to be in "Pre Operational" mode and cannot enter "Operational" mode.
I am running the unit under the LabVIEW development platform and everything compiles without any problem (on both the host and target side). I found out that neither EtherCAT can drive its modules.
The on-board RUN LED is blinking (the on-board ERR LED is not lit), and the "Online Device State..." of each device (inside the LabVIEW project, by right-clicking each device under the EtherCAT tree) confirms that they are indeed in this mode.
When I try to switch to "Safe Operational" mode, so that I can then enter "Operational" mode, I get a "Device Error" with the following explanation:
"Error -2147138407 occurred at an unidentified location
The slave device cannot be transitioned to Safe Operation state
if the device's distributed clock is enabled and the NI Scan Engine is set to Configuration mode."
"Error -2147138442 occurred at an unidentified location
The module cannot be found. If the physical module exists, and the device is in FPGA mode, recompiling and downloading may fix this problem."
By right clicking the cRIO device --> Utilities --> Scan Engine Mode
I can see two options: "Switch to Active" and "Switch to Configuration" with the latter disabled (already in this mode?)
By clicking "Switch to Active" I get the following error message:
"An error occurred while attempting to switch the I/O scan mode.
The module cannot be found. If the physical module exists, and the device is in FPGA mode, recompiling may fix this problem."
I am running the LabVIEW 2009 development platform under Win7 x64.
So far I have recompiled the FPGAs for the three devices but the same problem remains.
Any suggestions on how to proceed?
The problem you describe seems very interesting. You mention that this morning the system stopped working. What exactly happened between yesterday and today? Were there any software updates on the computer? Was there a power short of some sort in the facility? Were the devices operational overnight and then stopped working, or did they show this behavior when you turned them on today?
Some things that come to my mind that you could check or try:
This happened during the weekend. The problem arose on Monday morning. My colleague who operates the unit told me the following:
The devices were not operational during the weekend but the unit was powered (cRIO & EtherCATs).
There were no recent software updates (OS & NI).
There were indeed power shorts during the previous week which affected the operation of the unit. This has happened before without causing any further problems. We also have strong indications that an outage happened during the weekend, too.
On Monday morning an SPST relay on a module (NI 9481) located in the first EtherCAT did not operate (this one triggers another relay in order to power the whole unit). We then found that neither EtherCAT can drive its modules, and further investigation showed that the EtherCATs cannot enter "Operational" mode.
Another clue I recently learned is that during the previous week some AIs were returning zero values when they should not have. The operator of the unit confirmed that the devices were working properly, though, and did not pay more attention to the issue or report it.
All modules are in the same slots as before. Nothing was moved.
I have not tried an example project yet. Can you point me to a link or tutorial to follow so that we have a common reference? Also, is there any chance that doing so will mess up the channels of my project? (It should not, but I am wondering just in case!) It would be a pain, to say the least, to reroute the I/Os from 16 modules on both devices.
The Ethernet cables seem to be fine, since I can read the device states through the LabVIEW project and switch between "Init", "Bootstrap" and "Pre Operational" modes on both EtherCATs. Am I right?
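For reference, the states mentioned above belong to the standard EtherCAT state machine, in which a slave has to pass through "Safe Operational" before it can reach "Operational". The following is a minimal Python sketch of that state machine, not LabVIEW or NI-Industrial Communications code. The names `EcState`, `ALLOWED` and `shortest_path` are hypothetical and only illustrate which transitions a master may request.

```python
from collections import deque
from enum import Enum


class EcState(Enum):
    """EtherCAT application-layer states with their standard state codes."""
    INIT = 0x01
    PRE_OP = 0x02
    BOOTSTRAP = 0x03
    SAFE_OP = 0x04
    OP = 0x08


# Direct transitions allowed by the EtherCAT state machine.
# Illustrative table only; it is not read from any NI driver.
ALLOWED = {
    EcState.INIT: (EcState.PRE_OP, EcState.BOOTSTRAP),
    EcState.PRE_OP: (EcState.INIT, EcState.SAFE_OP),
    EcState.BOOTSTRAP: (EcState.INIT,),
    EcState.SAFE_OP: (EcState.INIT, EcState.PRE_OP, EcState.OP),
    EcState.OP: (EcState.INIT, EcState.PRE_OP, EcState.SAFE_OP),
}


def shortest_path(start, goal):
    """Breadth-first search for the shortest chain of allowed transitions."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] is goal:
            return path
        for nxt in ALLOWED[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    route = shortest_path(EcState.PRE_OP, EcState.OP)
    print(" -> ".join(state.name for state in route))
```

In this model a device that can move between "Init", "Bootstrap" and "Pre Operational" but is refused the step to "Safe Operational" is stuck exactly one transition before process data exchange starts, which matches the symptoms described in this thread.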
A couple of points:
A couple of troubleshooting steps:
I take it remaking a project and re-adding the devices did not clear any of the errors?
You are right, power outages should not be allowed to affect the devices. We will connect them to our main UPS.
The operator, unfortunately, has no more information to contribute. As for the devices, yes, we have rebooted them several times.
I can also confirm that the errors were not cleared after creating a new project and adding the devices.
I will proceed with the troubleshooting steps as soon as possible and post the results.
Regarding disabling the Distributed Clock of Device 1, should I just toggle it, or keep it disabled permanently?
Also which firmware should I download to the EtherCATs?
Should I download from here
or is there a dedicated NI-9144 firmware page available?
I'll just chime in for the benefit of any NI engineer who may be monitoring this thread: I had a related issue with a NI 9144 in December of last year (Service request # 7726816) which could only be resolved by returning the unit to NI for repair.
I have disabled the Distributed Clock and updated the EtherCATs' firmware, with no luck. The problem remains.
I have downloaded the following firmware: https://forums.ni.com/ni/attachments/ni/280/13368/2/NI-9144_rev2_4.zip
Given CFER_STS's comment and the results you have obtained so far, I also recommend checking whether the devices were damaged and need to be replaced. Have you tried the above steps with just one device? For example, when creating the blank project and adding the device with its EtherCAT master, try adding just one EtherCAT device and check whether that works. Do the LEDs on the EtherCAT ports show a correct connection on all three devices?
All the best,
I tried the above with one EtherCAT at a time. Nothing changed. I also tried unplugging all the C Series modules from all three devices, in case one of them was responsible, but no luck either. The EtherCAT ports seem to show a correct connection between the three devices (orange solid, green blinking). cRIO LED status: Power solid, all others off. EtherCAT 1-2: Power solid, Run blinking, all others off.
Any ideas on how to proceed and determine whether the EtherCATs were damaged? It seems pretty strange to me that both EtherCATs would be damaged simultaneously.
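As a cross-check of the LED readings, here is a small Python sketch that maps a RUN LED pattern to an EtherCAT state. It assumes the NI 9144 follows the standard EtherCAT RUN indicator convention (the NI 9144 manual should be checked to confirm this). The names `RUN_LED_TO_STATE` and `decode_run_led` are hypothetical.

```python
# Mapping of RUN LED patterns to EtherCAT states.
# Assumption: the chassis follows the standard EtherCAT indicator convention.
RUN_LED_TO_STATE = {
    "off": "Init",
    "blinking": "Pre-Operational",
    "single flash": "Safe-Operational",
    "on": "Operational",
    "flickering": "Bootstrap",
}


def decode_run_led(pattern):
    """Return the EtherCAT state implied by a RUN LED pattern, or None if unknown."""
    return RUN_LED_TO_STATE.get(pattern.strip().lower())


if __name__ == "__main__":
    print(decode_run_led("Blinking"))
```

Under that assumption, a blinking RUN LED with the ERR LED off is consistent with both chassis sitting in "Pre Operational" without a locally signalled fault, which agrees with the "Online Device State..." readings reported earlier.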
Also, in case a chassis (or two, or all of them) needs replacement, I would like suggestions on migration, because importing all these modules into the project again is a nightmarish scenario.