05-22-2020 08:55 AM
Hey Fancy Folk,
TLDR:
When running a test for several days, one of the CAN ports on a PXI 8512/2 goes into the Error Passive communication state and has a Tx error counter of 128. How can I restart my CAN communications when in the Error Passive state?
Background:
I have developed a test that will last ~3 months. During the initial testing, I noticed that a port on my CAN card "stops communicating" via XNET after several days. I used a secondary device (PCAN) to verify that my UUT is still sending messages, which it is. I started looking through the XNET help and found that the XNET Read (State CAN Comm) VI contains additional fault information that the error out cluster of the XNET Read/Write VIs doesn't have. By this, I mean that when my CAN communication stops, there is no error on my XNET Read/Write error line, but there is information in the XNET Read (State CAN Comm) output that says why the communication stopped.
What I've done:
Originally, I thought that my code was going into Bus Off mode (LED 1 green, LED 2 red), so I added some code that stated "if communication state = Bus Off, start interface only" (page 4-541 in the XNET help) and some code that would display the indicators associated with the XNET Read (State CAN Comm). After 3 days, the State CAN Comm finally changed from default values to the fault value. It turns out the transmit error counter went to 128, the communication state is Error Passive, and the last error is Ack (see image attached). Based on the XNET help, it makes sense that the state went from Error Active to Error Passive since the counter went above 127.
Where I need additional help:
I am confused as to why the transmit error counter is increasing. Yes, I understand the error counter is incremented when an error is detected for a transmitted frame and decremented when a frame is transmitted successfully. However, why is there an error in the transmission? The XNET documentation points to a problem with the physical CAN lines or a termination resistor. I could understand that if the comm state went to Error Passive at start, but it can take anywhere from 3 to 7 days for the comm state to change. I also understand this question is more of a CAN question than an XNET question.
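For reference, the counter behavior discussed above can be modeled in a few lines. This is a minimal Python sketch of the standard CAN fault confinement rules as I understand them (transmit error: +8, successful transmit: -1, Error Passive above 127, Bus Off above 255), not anything taken from XNET. The names `CommState`, `comm_state`, and `next_tec` are made up for illustration, and the Ack exception is simplified: the full rule also requires that no dominant bit is seen while the passive error flag is sent.

```python
from enum import Enum


class CommState(Enum):
    ERROR_ACTIVE = "Error Active"
    ERROR_PASSIVE = "Error Passive"
    BUS_OFF = "Bus Off"


def comm_state(tec: int, rec: int = 0) -> CommState:
    """Classify the node state from its transmit/receive error counters."""
    if tec > 255:
        return CommState.BUS_OFF
    if tec > 127 or rec > 127:
        return CommState.ERROR_PASSIVE
    return CommState.ERROR_ACTIVE


def next_tec(tec: int, tx_ok: bool, ack_error: bool = False) -> int:
    """Return the transmit error counter after one transmission attempt."""
    if tx_ok:
        return max(0, tec - 1)  # successful frame: decrement by 1, floor at 0
    if ack_error and comm_state(tec) is CommState.ERROR_PASSIVE:
        return tec  # simplified exception: passive node, frame not acknowledged
    return tec + 8  # any other transmit error: increment by 8


# Hypothetical scenario: every frame goes unacknowledged.
tec = 0
for _ in range(20):
    tec = next_tec(tec, tx_ok=False, ack_error=True)
print(tec, comm_state(tec).value)  # prints "128 Error Passive"
```

Under these rules, a counter that parks at exactly 128 with a last error of Ack is what you would expect when frames stop being acknowledged by any other node, for example after an intermittent wiring or connector fault. One failed frame also costs as much as eight successful frames recover, so a burst of failures can push a healthy node to Error Passive quickly.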
Additionally, how can I restart my CAN communication on my PXI 8512/2 port? I couldn't find any documentation that explicitly states that restarting the interface only would be fine, like it would be for Bus Off mode. I assume I'd be fine, but you know what they say about assuming. My initial idea is to have the code restart the station (close the XNET sessions and initialize new ones) if the CAN state goes into Error Passive mode.
Additional Notes:
I currently have my PCAN device set up to monitor any faults on my CAN bus via the PCAN's trace ability. Hopefully I can see the messages that are causing the faults.
Thanks,
Matt
02-13-2023 07:12 AM
Hey Matt. Did you ever determine a solution to this? I am having the exact same problem, and it shows up within the first 5 minutes of running code. The hardware has been the same and hasn't changed for many months. The software has only gone through minor changes. I just can't manage to keep my second CAN bus alive.
02-14-2023 06:51 AM
You should try resolving the underlying bus error issue. See the NI-XNET Troubleshooting Guide.
According to XNET Read (State CAN Comm).vi,
When a CAN interface transitions to the bus off state, communication stops for the interface. All NI-XNET sessions for the interface no longer receive or transmit frame values. To restart the CAN interface and all its sessions, call the XNET Start VI.
02-14-2023 08:46 AM
@mazcoder wrote:
Hey Matt. Did you ever determine a solution to this? I am having the exact same problem, and it shows up within the first 5 minutes of running code. The hardware has been the same and hasn't changed for many months. The software has only gone through minor changes. I just can't manage to keep my second CAN bus alive.
Below is a snippet from my code that I used as a workaround. I query the communication state, and if it is Error Passive or Bus Off, I restart the interface only. As far as I remember, I haven't had any issues since dropping this code into my XNET read loop.
Also, we were having some connection issues, since our connectors were inside a thermochamber and the heat cycling made them a bit brittle. We replaced the connectors and wiring after adding this code to the main test. I'm not sure whether this code or replacing the connectors fixed my problem, but I wanted to share that the hardware was also changed when the issue vanished.
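For anyone who wants the workaround logic in text form, here is a rough Python sketch of the check described above: read the communication state on each pass of the read loop and restart the interface only if it reports Error Passive or Bus Off. It is not the LabVIEW code from the test. `read_comm_state` and `restart_interface` are hypothetical callables standing in for wrappers around XNET Read (State CAN Comm) and XNET Start with the interface-only scope, and `FakeInterface` exists only so the sketch runs without hardware.

```python
from typing import Callable

# States that trigger a restart in this sketch (names assumed for illustration).
RESTART_STATES = {"Error Passive", "Bus Off"}


def check_and_recover(read_comm_state: Callable[[], str],
                      restart_interface: Callable[[], None]) -> bool:
    """Restart the interface if it left Error Active; return True if restarted."""
    state = read_comm_state()
    if state in RESTART_STATES:
        restart_interface()
        return True
    return False


class FakeInterface:
    """Stand-in for a CAN port that replays a scripted sequence of states."""

    def __init__(self, states):
        self._states = iter(states)
        self.restarts = 0

    def read_comm_state(self) -> str:
        return next(self._states)

    def restart(self) -> None:
        self.restarts += 1


fake = FakeInterface(["Error Active", "Error Passive", "Error Active", "Bus Off"])
results = [check_and_recover(fake.read_comm_state, fake.restart) for _ in range(4)]
print(results, fake.restarts)  # prints "[False, True, False, True] 2"
```

A restart like this only gets communication going again. It does nothing about whatever is producing the bus errors, so it is probably worth logging each restart with a timestamp to see whether the faults line up with something physical such as temperature cycling.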
02-15-2023 06:40 PM
@Matt_AM yes, that method was what I was thinking I was going to need to do. But the next morning I was standing by the test stand looking at what else might be the problem, and I ended up changing out the power supply for the cRIO, which happened to be the same power supply that the C Series CAN module used to power the CAN buses. Poof. Problem gone. It was only a little 12 V, 1.5 A wall-wart type power supply. I swapped it for a proper adjustable bench supply and the test is running fine.