Real-Time Measurement and Control

cRIO 9068 crashes with kernel "bug"

Hi everyone,

 

I have a cRIO 9068 that has recently started giving me issues. After a few days it simply stops working and does not respond in any way, so I have to press the RESET button to get it back.

At the moment of the crash, the following appears in the nikern log file:

 

2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980674] Stack: (0xdd1e1d50 to 0xdd1e2000)
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980682] 1d40:                                     c0160cbc c03f2df4 00000000 dd37e848
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980693] 1d60: dd1e1d94 00000000 ddecca40 dd1e1d90 df5bbd84 dd1e1dd8 dd1e1e40 0632d0dd
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980703] 1d80: dd1e1dac dd1e1d90 c018de6c c018db38 00000000 dd37e848 ffffff92 01a91c38
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980713] 1da0: dd1e1eb4 dd1e1db0 c018ec24 c018de14 dd1e1dc8 00000000 dd1e1dec dd1e1dd8
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980723] 1dc0: f8a08621 00047722 df5bc440 01a91000 dd28a400 00000c38 dd1e1dd8 00000000
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980733] 1de0: dd061dd8 dd1e1de4 00000000 00000000 ddecca40 dd37e848 dd060000 0000005d
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980743] 1e00: 00000000 00000000 0000005d dd061e0c dd061e0c df5bbd9c dd061e14 ddecca40
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980753] 1e20: df5bbd84 01a91000 dd28a400 00000c38 dd37e840 dd1e1dd8 dd1e1dcc ffffffff
2021-07-16T01:00:58.113+00:00 TestCellMain kernel: [1256891.980763] 1e40: dd1e1e40 00000000 00000000 00000000 f8a08621 00047722 f89fc2d1 00047722
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980772] 1e60: c017c04c dfbe0e00 00000000 dd1e1e6c dd1e1e6c 00000001 00000000 00000000
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980782] 1e80: 00000000 01a91aa4 dd1e1eb4 f89fc2d1 01a91c38 0000000b 00000000 0632d0dd
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980792] 1ea0: 00000000 01a91c54 dd1e1f44 dd1e1eb8 c0190a64 c018e844 01a91c38 dd1e1f50
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980801] 1ec0: 00000001 c0158d28 00000001 dfbe474c 0000000a 0000000b 00000000 dd1e1fb0
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980811] 1ee0: 00000000 c07ba6a7 0000059b dd37e208 8000059b 8000059b dde35c90 dde35c90
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980821] 1f00: 00000001 dd1e1f00 dd1e1f2c 01a91000 dd1e1f2c dd1e1f20 c057ebd0 f89fc2d1
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980831] 1f20: 00047722 0000000b 00000000 0000008b 0632d0dd 01a91c54 dd1e1fa4 dd1e1f48
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980841] 1f40: c0190c90 c0190080 01a91c38 00000000 ffffffff dd1e1f60 c017ea20 00000000
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980851] 1f60: 00132dbb 00000000 f89fc2d1 00047722 00132dbb 3b0c34d1 00000008 01a91c38
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980861] 1f80: 00000000 01a91c50 000000f0 c0108264 dd1e0000 00000000 00000000 dd1e1fa8
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980871] 1fa0: c01080c0 c0190b54 01a91c38 00000000 01a91c54 0000008b 0632d0dd b6283de8
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980880] 1fc0: 01a91c38 00000000 01a91c50 000000f0 00000000 00000000 00000000 01a91c38
2021-07-16T01:00:58.114+00:00 TestCellMain kernel: [1256891.980890] 1fe0: 00000009 b6283d60 00000000 b6cdb084 200b0010 01a91c54 00000000 00000a87
2021-07-16T01:00:58.116+00:00 TestCellMain kernel: [1256891.981041] Code: 1afffff9 e5994014 e3d44001 1a000016 (e7f001f2)
2021-07-16T01:00:58.116+00:00 TestCellMain kernel: [1256891.981343] Kernel BUG at c0669cfc [verbose debug info unavailable]
2021-07-16T01:00:58.116+00:00 TestCellMain kernel: [1256891.981348] Internal error: Oops - BUG: 0 [#2] PREEMPT SMP ARM
2021-07-16T01:00:58.117+00:00 TestCellMain kernel: [1256891.981458] Process LV_Occurrence (pid: 1435, stack limit = 0xdd1e0210)
2021-07-16T01:00:58.117+00:00 TestCellMain kernel: [1256891.981462] Stack: (0xdd1e1ab0 to 0xdd1e2000)
2021-07-16T01:00:58.117+00:00 TestCellMain kernel: [1256891.981470] 1aa0:                                     c0668374 c0146588 c066b6c0 200f0113
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981481] 1ac0: dd1e1ac0 dd1e1b0c dd1e1b34 dd1e1acc c010c5b8 c01013b4 00000000 ddecca40
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981492] 1ae0: dd37e801 ddeccf0c dd1e1b44 df5bbd84 ddeccf68 df5bbd84 dd37e840 ddeccf0c
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981502] 1b00: dd1e1b44 df5bbd84 ddeccf68 dd37e848 dd1e1b34 dd1e1b20 c066b9d0 c0669a20
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981512] 1b20: 00000000 ddecca40 dd1e1b7c dd1e1b38 c018fb58 c066b978 c0166d98 ddeccf68
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981522] 1b40: c018dba8 01a91000 dd28a400 00000c38 dd1e1b74 ddecca40 dd28a400 dd28a400
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981532] 1b60: c018dba8 00000000 c07ba0a9 00000008 dd1e1ba4 dd1e1b80 c011ea94 c018fa94
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981542] 1b80: c01996ac c066bb78 ddecca40 ddecca40 dd28a400 dd1e1c12 dd1e1bc4 dd1e1ba8
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981553] 1ba0: c01249a0 c011ea4c dd1e1bc4 dd1e1bb8 c01211cc c018dba8 dd1e1c44 dd1e1bc8
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981563] 1bc0: c010bdfc c0124740 dd1e0210 0000000b dd2194c8 c07ba0a1 600f0193 bf000000
2021-07-16T01:00:58.118+00:00 TestCellMain kernel: [1256891.981573] 1be0: 3114f5bc 66666661 20396666 39393565 34313034 64336520 30303434 61312031
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981583] 1c00: 30303030 28203631 30663765 32663130 00002029 dfbd5370 0000000b e7f001f2
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981593] 1c20: dd1e1d00 c018dba8 00000000 e7100000 c010c6a4 dd1e0000 dd1e1c54 dd1e1c48
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981603] 1c40: c010bed4 c010ba8c dd1e1cfc dd1e1c58 c0101094 c010be84 00000006 ddeccb80
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981613] 1c60: 00000004 00000000 00030001 c018dba8 dd1e1ca4 dd1e1c80 c01404f4 c01028a0
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981624] 1c80: ffffffff ffffffff dd198000 c01407f8 ffffffff 00000000 dd1e1ccc c01466b0
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981634] 1ca0: dd1e1ce4 dd1e1cb0 c01466b0 c066b69c dd19801c ddecca40 ddecca40 00000000
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981644] 1cc0: dd28a400 dd0b6800 ddecca40 dfbd5340 ddeccd94 00000000 dd1e1d34 dd1e1ce8
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981653] 1ce0: c018dbac 00000000 c010cad8 00000000 dd1e1d8c dd1e1d00 c010c6a4 c010100c
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981663] 1d00: 00000000 dd061dd8 00000000 dd061dd8 00000000 dd37e848 ddecca40 01a91c38
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981673] 1d20: dd1e1e08 dd37e840 00000000 dd1e1d8c dd1e1d38 dd1e1d50 c066a354 c018dba8
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981683] 1d40: 600f0193 ffffffff c018db94 bf000000 c0160cbc c03f2df4 00000000 dd37e848
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981693] 1d60: dd1e1d94 00000000 ddecca40 dd1e1d90 df5bbd84 dd1e1dd8 dd1e1e40 0632d0dd
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981703] 1d80: dd1e1dac dd1e1d90 c018de6c c018db38 00000000 dd37e848 ffffff92 01a91c38
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981714] 1da0: dd1e1eb4 dd1e1db0 c018ec24 c018de14 dd1e1dc8 00000000 dd1e1dec dd1e1dd8
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981724] 1dc0: f8a08621 00047722 df5bc440 01a91000 dd28a400 00000c38 dd1e1dd8 00000000
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981734] 1de0: dd061dd8 dd1e1de4 00000000 00000000 ddecca40 dd37e848 dd060000 0000005d
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981743] 1e00: 00000000 00000000 0000005d dd061e0c dd061e0c df5bbd9c dd061e14 ddecca40
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981754] 1e20: df5bbd84 01a91000 dd28a400 00000c38 dd37e840 dd1e1dd8 dd1e1dcc ffffffff
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981763] 1e40: dd1e1e40 00000000 00000000 00000000 f8a08621 00047722 f89fc2d1 00047722
2021-07-16T01:00:58.119+00:00 TestCellMain kernel: [1256891.981773] 1e60: c017c04c dfbe0e00 00000000 dd1e1e6c dd1e1e6c 00000001 00000000 00000000
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981783] 1e80: 00000000 01a91aa4 dd1e1eb4 f89fc2d1 01a91c38 0000000b 00000000 0632d0dd
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981793] 1ea0: 00000000 01a91c54 dd1e1f44 dd1e1eb8 c0190a64 c018e844 01a91c38 dd1e1f50
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981802] 1ec0: 00000001 c0158d28 00000001 dfbe474c 0000000a 0000000b 00000000 dd1e1fb0
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981812] 1ee0: 00000000 c07ba6a7 0000059b dd37e208 8000059b 8000059b dde35c90 dde35c90
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981822] 1f00: 00000001 dd1e1f00 dd1e1f2c 01a91000 dd1e1f2c dd1e1f20 c057ebd0 f89fc2d1
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981832] 1f20: 00047722 0000000b 00000000 0000008b 0632d0dd 01a91c54 dd1e1fa4 dd1e1f48
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981842] 1f40: c0190c90 c0190080 01a91c38 00000000 ffffffff dd1e1f60 c017ea20 00000000
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981852] 1f60: 00132dbb 00000000 f89fc2d1 00047722 00132dbb 3b0c34d1 00000008 01a91c38
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981862] 1f80: 00000000 01a91c50 000000f0 c0108264 dd1e0000 00000000 00000000 dd1e1fa8
2021-07-16T01:00:58.120+00:00 TestCellMain kernel: [1256891.981872] 1fa0: c01080c0 c0190b54 01a91c38 00000000 01a91c54 0000008b 0632d0dd b6283de8
2021-07-16T01:00:58.121+00:00 TestCellMain kernel: [1256891.981881] 1fc0: 01a91c38 00000000 01a91c50 000000f0 00000000 00000000 00000000 01a91c38
2021-07-16T01:00:58.121+00:00 TestCellMain kernel: [1256891.981890] 1fe0: 00000009 b6283d60 00000000 b6cdb084 200b0010 01a91c54 00000000 00000a87
2021-07-16T01:00:58.123+00:00 TestCellMain kernel: [1256891.982219] Code: 0a000002 e51b305c e1530007 0a000000 (e7f001f2)
2021-07-16T01:00:58.124+00:00 TestCellMain kernel: [1256891.982603] Fixing recursive fault but reboot is needed!
2021-07-16T01:00:58.124+00:00 TestCellMain kernel: [1256891.982609] BUG: scheduling while atomic: LV_Occurrence/1435/0x00000002

It seems to me that some fault occurs, the kernel reports that a reboot is needed, and we then have to do that reboot manually. Is that right?
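
To make an excerpt like this easier to read, the few informative lines can be separated from the hex stack dump. The following is a minimal Python sketch, assuming the log lines keep the `<timestamp> <hostname> kernel: [<uptime>] <message>` shape seen above; the function name `summarize_oops` and the list of marker prefixes are illustrative choices, not part of any NI tool.

```python
import re

# Matches lines shaped like the nikern excerpt above:
# <ISO timestamp> <hostname> kernel: [<uptime seconds>] <message>
LINE_RE = re.compile(r"^(\S+) (\S+) kernel: \[\s*(\d+\.\d+)\] (.*)$")

# Message prefixes worth surfacing; the hex stack dump lines are skipped.
MARKERS = (
    "Kernel BUG at",
    "Internal error:",
    "Process ",
    "Fixing recursive fault",
    "BUG: scheduling while atomic",
)


def summarize_oops(lines):
    """Return (timestamp, uptime_seconds, message) for the key oops lines."""
    summary = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a kernel log line in the expected shape
        timestamp, _host, uptime, message = match.groups()
        if message.startswith(MARKERS):
            summary.append((timestamp, float(uptime), message))
    return summary
```

Run over the excerpt above, this would keep only the "Kernel BUG at", "Internal error", "Process LV_Occurrence", "Fixing recursive fault" and "scheduling while atomic" lines. The uptime stamp of roughly 1256891 s is about 14.5 days since boot, which fits the "after a few days" pattern, and the `[#2]` counter suggests this is the second oops since boot.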

Has anyone seen this before?

I have contacted NI about it, and they suspect a memory leak. I have been watching memory usage closely but have not seen any trend of free memory decreasing.

To complicate things further, I did once catch the cRIO just as it was crashing, and free memory did seem to be dropping very rapidly at that moment.

Is it possible that some error occurs at some point and triggers a rapid memory leak, which in turn causes the issue above?

 

Thank you for reading, and please ask any question that might help. I will try to be quick to answer.

Message 1 of 5

Hi,

Have you tried upgrading the firmware with these steps?

 

Message 2 of 5

Thank you for replying. 

Yes, I have upgraded to the latest software and firmware for the cRIO.

I'm now logging the free memory and the memory used by the run-time process (lvrt), and I actually do see the free memory slowly decreasing, while the lvrt memory is not increasing.

I'm not certain which process is growing, but I'm trying to log that now.
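
For reference, this kind of logging can be done on a Linux target by sampling `/proc`. The following is a minimal Python sketch, assuming Python is available on the target and that `/proc/meminfo` and `/proc/<pid>/status` have their usual Linux layout; the function names are illustrative and this is not the logging code actually used here.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: number} (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def parse_status(text):
    """Return (process name, VmRSS in kB) from /proc/<pid>/status content."""
    name, rss_kb = None, 0  # kernel threads have no VmRSS line
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    return name, rss_kb


def snapshot(proc_root="/proc"):
    """Take one sample: timestamp, MemFree in kB, and resident kB per process."""
    with open(os.path.join(proc_root, "meminfo")) as f:
        mem = parse_meminfo(f.read())
    rss = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, rss_kb = parse_status(f.read())
        except OSError:
            continue  # process exited between listdir and open
        rss[(int(entry), name)] = rss_kb
    return time.time(), mem.get("MemFree"), rss
```

Calling `snapshot()` at a fixed interval (every minute, say) and appending the result to a file gives a per-process history, which shows which process is growing rather than only that free memory is shrinking.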

Message 3 of 5

Hi,

Is there any progress?

Have you discovered which process is eating the RAM?

Message 4 of 5

Hi,

 

I've been monitoring the memory used per process on the cRIO for a while, and the only process that is steadily increasing is syslog-ng.
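
One way to tell a steady increase from noise in such per-process samples is to fit a straight line to each process's memory over time and rank the slopes. The following is a minimal Python sketch, assuming the history is available as `(time in seconds, resident kB)` pairs per process name; the function names and the sample numbers in the usage note are made up for illustration.

```python
def growth_rate(samples):
    """Least-squares slope of (time_s, rss_kb) samples, in kB per second."""
    n = len(samples)
    if n < 2:
        return 0.0  # a slope needs at least two points
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den if den else 0.0


def rank_by_growth(history):
    """Sort process names by fitted growth, fastest-growing first."""
    rates = {name: growth_rate(samples) for name, samples in history.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

With hypothetical data in which one process gains 36 kB per hour and another stays flat, `rank_by_growth` puts the growing process first with a slope of 0.01 kB/s. A process that only jumps once would need a closer look at the raw samples, since a single step also produces a positive slope.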

I write errors and a few other things to syslog, but I don't understand why that should cause a memory leak.

 

Maybe some error was flooding syslog at the time the memory ran out so quickly. At the moment the syslog-ng memory usage is increasing very slowly.

 

Here follows the last feedback from NI support, which I haven't really had a proper look at yet:

 

The Syslog-ng is a process that logs events from the system. 
Please take a look at the content of this link. I think it might be useful. Please scroll down the page to see the contents, in the beginning there are some images. 
Something in your code logs parameters too often. This might lead to such memory usage. Could you please tell me what kind of program you are running on your cRIO?

This might also be useful. In the description, you can find the path of the syslog-ng.cfg which is connected to syslog-ng.
Please let me know if this information is useful.
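
If a repeating error really is flooding syslog, one common mitigation is to throttle identical messages at the source. The following is a minimal Python sketch of that idea only; the application in this thread runs under LabVIEW Real-Time, so the `Throttle` class, its parameters and the per-key grouping are hypothetical and would need to be re-implemented in whatever code writes to syslog.

```python
import time


class Throttle:
    """Forward a message at most once per `interval` seconds per key."""

    def __init__(self, emit, interval=60.0, clock=time.monotonic):
        self._emit = emit          # callable that actually writes the log line
        self._interval = interval  # minimum spacing between forwarded messages
        self._clock = clock        # injectable so the class can be tested
        self._last = {}            # key -> time of the last forwarded message
        self._dropped = {}         # key -> messages suppressed since then

    def log(self, key, message):
        """Forward or suppress `message`; return True if it was forwarded."""
        now = self._clock()
        last = self._last.get(key)
        if last is not None and now - last < self._interval:
            self._dropped[key] = self._dropped.get(key, 0) + 1
            return False
        dropped = self._dropped.pop(key, 0)
        if dropped:
            message = f"{message} ({dropped} similar messages suppressed)"
        self._last[key] = now
        self._emit(message)
        return True
```

Here `emit` is whatever function writes the line to the log, and `key` would typically be an error code or message source. Suppressed messages are counted and reported with the next forwarded one, so a burst remains visible in the log without every repetition being written.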

Message 5 of 5