Real-Time Measurement and Control


CompactRIO sometimes uses high CPU at startup


We have a complex application that runs on a CompactRIO, and we regularly run tests on multiple targets (sbRIO, cRIO and MyRIO) in the office to validate our code. Every once in a while, the application starts and uses 70% of the CPU instead of the usual 35%-40% seen on other targets. To investigate this problem, we put a relay on the power supply line of a CompactRIO 9067 and have been cycling the power every 3 minutes, with the CPU usage being monitored and reported to a server over TCP/IP messages every 5 seconds. Of course, we deploy the built application (rtexe) as the startup program.

 

After greatly reducing the application down to a very simple VI with 4 loops, loading the 9067 at ~5% typically, we get about 2% of the power cycles resulting in 60-70% load and also another 2% of the power-up sequences leading to 95-100% CPU load. Even at 100%, our code generates the message to our server so we know that the application is running.

 

This specific series of tests was performed on the 9067 target with firmware 3.0.0f0 (operating system is NI Linux Real-Time ARMv7-A 4.1.15-rt17-ni-4.0.0f1) and LabVIEW RT 15 SP1. We have seen similar strange behavior on targets with firmware 3.5.0f0 (OS NI Linux Real-Time ARMv7-A 3.14.40-rt37-ni-3.0.0f2) and 4.0.0f0 (OS NI Linux Real-Time ARMv7-A 3.14.46-rt46-ni-3.5.0f0), which makes us believe that the firmware version is not the root cause and/or that the problem has not been fixed yet.

 

To monitor the CPU, we use the "System" resources (System Session.SystemResource ---> System Hardware.CpuLoadTotal). The numbers that we get are the same as the ones seen in MAX, even when the CPU load is very high, so we have every reason to believe that those CPU% numbers are real.

 

Has anybody faced this before? I have included the source code for reference, along with the project file. We have also seen this problem with other build specifications that did not disable debugging on the VIs. All of our code tends to have the flag "Separate Compile Code" enabled.

 

 

Message 1 of 10
(5,229 Views)

Adding a screenshot from our server's data display for two hours this morning to display the change in CPU.

 

Message 2 of 10
(5,225 Views)

Here are more results from last night. The startup is either good (using 5% of CPU) or very bad (using 100% of CPU), but what on earth could make LabVIEW arbitrarily use 55% of CPU?

 

2018-04-26 06_28_38_ Dashboard.png

 

I look forward to hearing about the root cause of this, but with the simple VI attached in the original post, I do not think that it is our code. Are other users also seeing this in their applications? I was talking with my colleagues yesterday and we also think that the CPU load is real because, when the CPU was at 100%, we have seen targets crash after a few hours or a day, while targets starting with a reasonable CPU load can last for many months.

 

I will try to log in through SSH and get the CPU usage from the kernel directly, per thread. Is there anything else specific that NI's AEs would like to see?
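For anyone wanting to try the per-thread measurement mentioned above, here is a minimal sketch in Python. It assumes a Python 3 interpreter is available on the target (it is not part of the software set listed in this thread) and relies only on the standard Linux `/proc/<pid>/task/<tid>/stat` files, where fields 14 and 15 are the user and system clock ticks. The function names are hypothetical, chosen for illustration.

```python
import os
import time


def parse_stat_line(line):
    """Return (comm, utime, stime) from one /proc/<pid>/task/<tid>/stat line."""
    # The thread name is wrapped in parentheses and may contain spaces,
    # so split around the LAST ')' instead of splitting on whitespace.
    head, _, tail = line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # 'tail' starts at field 3 (state); utime and stime are fields 14 and 15.
    return comm, int(fields[11]), int(fields[12])


def thread_ticks(pid):
    """Map thread id -> (name, user + system clock ticks) for one process."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                comm, utime, stime = parse_stat_line(f.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
        result[int(tid)] = (comm, utime + stime)
    return result


def thread_cpu_percent(before, after, interval_s, ticks_per_s):
    """Percent of one core used by each thread between two snapshots."""
    usage = {}
    for tid, (name, ticks) in after.items():
        # Threads that did not exist in the first snapshot are reported as 0.
        prev = before.get(tid, (name, ticks))[1]
        usage[tid] = (name, 100.0 * (ticks - prev) / (ticks_per_s * interval_s))
    return usage


def sample_process(pid, interval_s=5.0):
    """Take two snapshots 'interval_s' apart and return per-thread usage."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    before = thread_ticks(pid)
    time.sleep(interval_s)
    after = thread_ticks(pid)
    return thread_cpu_percent(before, after, interval_s, ticks_per_s)
```

Calling `sample_process(pid)` with the PID of the `lvrt` process would list each thread with its share of one core over the interval. These per-thread counters come from the same kernel tick accounting as the totals, so they are a cross-check rather than an independent measurement.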

Message 3 of 10
(5,184 Views)

Ok, managed to catch a 100% this morning, logged in to the target using SSH and got this from "Top":

 

Mem: 155788K used, 355504K free, 976K shrd, 0K buff, 84232K cached
CPU: 92% usr 7% sys 0% nic 0% idle 0% io 0% irq 0% sirq
Load average: 3.18 1.72 0.70 2/241 1729
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
1281 1273 lvuser S 67948 13% 14% {MainAppThread} ./lvrt
1689 1682 admin R 6172 1% 0% {top} /bin/busybox.nosuid /usr/bin/top
1231 1 lvuser S 17808 3% 0% /usr/local/natinst/bin/tagsrv -start
1678 893 admin S 10468 2% 0% sshd: admin@pts/0
21 2 admin SW 0 0% 0% [ksoftirqd/1]
3 2 admin SW 0 0% 0% [ksoftirqd/0]

 

I'm probably not knowledgeable enough beyond this, where is Linux hiding the extra 90%???
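To put a number on that gap, the snapshot can be parsed and the per-process %CPU column compared against the summary line. The sketch below is plain Python and assumes the BusyBox `top` layout exactly as pasted in this post (eight columns, %CPU in the seventh); `parse_top` and `SAMPLE` are hypothetical names, and rows cut off by the terminal height are not counted.

```python
import re

# Snapshot copied from the 100% case in this post.
SAMPLE = """\
Mem: 155788K used, 355504K free, 976K shrd, 0K buff, 84232K cached
CPU: 92% usr 7% sys 0% nic 0% idle 0% io 0% irq 0% sirq
Load average: 3.18 1.72 0.70 2/241 1729
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
1281 1273 lvuser S 67948 13% 14% {MainAppThread} ./lvrt
1689 1682 admin R 6172 1% 0% {top} /bin/busybox.nosuid /usr/bin/top
1231 1 lvuser S 17808 3% 0% /usr/local/natinst/bin/tagsrv -start
1678 893 admin S 10468 2% 0% sshd: admin@pts/0
21 2 admin SW 0 0% 0% [ksoftirqd/1]
3 2 admin SW 0 0% 0% [ksoftirqd/0]
"""

SUMMARY_RE = re.compile(r"(\d+)%\s+(usr|sys|nic|idle|io|irq|sirq)")


def parse_top(text):
    """Return (summary, processes) parsed from one BusyBox top snapshot."""
    summary = {}
    processes = []
    in_table = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("CPU:"):
            summary = {name: int(pct) for pct, name in SUMMARY_RE.findall(line)}
        elif line.startswith("PID"):
            in_table = True  # header row: process rows follow
        elif in_table and line:
            cols = line.split(None, 7)
            if len(cols) < 8 or not cols[0].isdigit():
                continue  # skip trailing lines such as a timestamp
            processes.append((int(cols[0]), int(cols[6].rstrip("%")), cols[7]))
    return summary, processes


summary, processes = parse_top(SAMPLE)
busy = 100 - summary["idle"]
attributed = sum(cpu for _pid, cpu, _cmd in processes)
print(f"busy={busy}% attributed={attributed}% unaccounted={busy - attributed}%")
```

For this snapshot the summary line reports 100% busy while the listed processes add up to 14%, leaving 86% that no visible process accounts for.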

Message 4 of 10
(5,177 Views)

In comparison, this is the result from "Top" when the application reports a reasonable CPU on the dashboard (~5% CPU). The LabVIEW RT application CPU usage is still clearly different...6% vs 14%.

 

Mem: 155876K used, 355416K free, 976K shrd, 0K buff, 84256K cached
CPU: 3% usr 2% sys 0% nic 94% idle 0% io 0% irq 0% sirq
Load average: 1.46 0.84 0.33 2/243 1970
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
1353 1334 lvuser S 68104 13% 6% {MainAppThread} ./lvrt
1969 1952 admin R 6172 1% 0% {top} /bin/busybox.nosuid /usr/bin/top
1948 892 admin S 10600 2% 0% sshd: admin@pts/0
21 2 admin SW 0 0% 0% [ksoftirqd/1]
3 2 admin SW 0 0% 0% [ksoftirqd/0]
10 2 admin SW 0 0% 0% [rcuop/0]
20 2 admin SW 0 0% 0% [rcuc/1]
142 2 admin SW 0 0% 0% [kworker/1:1]
1205 1204 webserv S 26288 5% 0% {SystemWebServer} /usr/local/natinst/s
1280 1205 webserv S 22572 4% 0% NIWebServiceContainer {C208840D-4A38-1
1289 1 lvuser S 17808 3% 0% /usr/local/natinst/bin/tagsrv

 

I'll try to get the results as well when the CPU is ~55%...

 

Any chance someone from NI read this so far and tested internally on a 9067, MyRIO or Linux sbRIO?

Message 5 of 10
(5,176 Views)

Hello,

 

Thanks for being detailed on all of this. Right now I have no immediate answers, and I'm not sure if I'll be able to replicate this anytime soon. You can also monitor CPU usage in the DSM, as described in the following article:

 

https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z000000P9kJSAS

 

I would monitor there as well. I know it's basic, but I'd also take a look at pages 46 and 47 of the cRIO developers guide:

 

http://www.ni.com/pdf/products/us/fullcriodevguide.pdf

 

If other users don't have much insight, I would highly encourage you to call in and create a service request with us so we can discuss this in more detail and expedite the support timeline. 

CH
Applications Engineering
National Instruments
http://www.ni.com/en-us/support.html
Message 6 of 10
(5,168 Views)

Thanks. I looked at the references. DSM does not seem to want to show the CPU on any of our targets. I checked and the "System State Publisher" is indeed not installed. I opened a service request for the time being. I do not think that the information in the cRIO guide is very helpful here.

 

This might help also. It is the list of software libraries currently installed.

 

CompactRIO Support 16.0
Hardware Configuration Web Support 16.0.0
LabVIEW Real-Time 15.0.1
NI Scan Engine 4.4
NI System Configuration 16.0.0
NI System Configuration Remote Support 16.0.0
NI Web-based Configuration and Monitoring 16.0.0
NI-RIO 16.0
NI-RIO IO Scan 16.0
NI-Serial 9870 and 9871 Scan Engine Support 15.0.0
NI-VISA 16.0.0
NI-VISA Remote Passport 16.0.0
NI-VISA Server 16.0.0
NI-XNET 16.1.0
Network Configuration Web Support 16.0.0
Network Streams 15.0
Network Variable Engine 15.0.0
Remote Panel Server for LabVIEW RT 16.0.0
Run-Time Engine for Web Services 16.0.0
SSL Support for LabVIEW RT 16.0.0
Software Management Web Support 16.0.0
Time Configuration Web Support 16.0.0
Variable Client Support for LabVIEW RT 15.0.0
WebDAV Client with SSL Support 15.0.0
WebDAV Server 16.0.0

Message 7 of 10
(5,160 Views)

I added a System Exec call to redirect the output of "top" to a file, and I caught the per-process CPU usage during a 55% CPU load. There is another process in the list that uses some CPU: "rcuop/0". I hope this helps one of your Linux gurus find the root cause. That process is not in the list when LabVIEW and Linux report 100% CPU usage.

 

Mem: 150792K used, 360500K free, 972K shrd, 0K buff, 82024K cached
CPU: 45% usr 9% sys 0% nic 45% idle 0% io 0% irq 0% sirq
Load average: 2.13 0.75 0.27 1/241 1960
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
1359 1347 lvuser S 68020 13% 23% {MainAppThread} ./lvrt
1960 1959 lvuser R 2592 1% 9% {top} /bin/busybox.nosuid /usr/bin/top
10 2 admin SW 0 0% 5% [rcuop/0]
1217 1216 webserv S 25712 5% 0% {SystemWebServer} /usr/local/natinst/s
1293 1217 webserv S 20480 4% 0% NIWebServiceContainer {C8D6763B-4A9C-1
1308 1 lvuser S 17808 3% 0% /usr/local/natinst/bin/tagsrv -start
1236 1 admin S 13572 3% 0% /usr/local/natinst/share/mxs/nimxs -d
1232 1217 webserv S 12800 3% 0% NIWebServiceContainer {C860C0D8-4A9C-1
874 1 admin S 12696 2% 0% {niauth_daemon} /usr/local/natinst/sha
1275 1 admin S 12668 2% 0% /usr/local/natinst/bin/lkads -start
1255 1217 webserv S 12012 2% 0% NIWebServiceContainer {C89FC22B-4A9C-1
1353 1217 webserv S 11624 2% 0% NIWebServiceContainer {C9264EEA-4A9C-1
1394 1217 webserv S 11176 2% 0% NIWebServiceContainer {C95A1A53-4A9C-1
902 1 admin S 10316 2% 0% /usr/sbin/syslog-ng --process-mode=bac
1326 1217 webserv S 9868 2% 0% NIWebServiceContainer {C90C04B1-4A9C-1
1380 1217 webserv S 9796 2% 0% NIWebServiceContainer {C9478792-4A9C-1
935 1 nobody S 8256 2% 0% /usr/local/natinst/bin/nisvcloc -D -n
896 1 admin S 8248 2% 0% /usr/sbin/sshd
1347 1321 admin S 8076 2% 0% /bin/su -- lvuser -l -c /etc/init.d/lv
1312 1 lvuser S 7144 1% 0% /usr/local/natinst/bin/nixntRpcServer^
Thu Apr 26 20:30:38 GMT+8 2018

Message 8 of 10
(5,146 Views)

I opened a ticket with NI and the first suggestion was to delay my application from running for 30-60 seconds after power up. I tried it by putting a "wait (ms)" of 60,000 inside a structure and then passing an error cluster to the rest of the code to hold off any other processing. Unfortunately, the second restart immediately showed the CPU running at 65%, so that does not solve the problem.

Message 9 of 10
(5,120 Views)
Solution
Accepted by topic author OlivierL

We opened a Service Request with NI about this and ran many more tests over the past month. We were able to reproduce the problem fairly easily and also find a workaround. There is now a CAR (700007) associated with this problem.

 

In short, my understanding is that Linux does not properly count the OS idle time when the timed loop and the CPU scheduler tick meet a specific alignment. Therefore, the CPU is not really in use but is reported as 100% busy (on one core) by the OS. "RT Get CPU Loads" would return the proper CPU usage from LabVIEW, whereas System -> SystemResource -> CpuLoadTotal returns the Linux counters, which can wrongly report 100%.
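To illustrate why a miscounted idle time shows up as a full load, here is a rough Python sketch of how a load figure is typically derived from the kernel tick counters. It assumes the counters in question are the ones exposed on the aggregate `cpu` line of `/proc/stat` (user, nice, system, idle, iowait, irq, softirq, steal); whether the NI System API reads exactly these is not confirmed in this thread. Treating iowait as idle is a common convention, not something taken from NI documentation.

```python
def parse_cpu_line(line):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    values = [int(v) for v in line.split()[1:]]
    # Field order: user nice system idle iowait irq softirq steal [guest guest_nice]
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # Guest time is already included in user/nice, so only sum the first 8 fields.
    total = sum(values[:8])
    return idle, total


def load_percent(before, after):
    """CPU load between two (idle, total) samples, as a percentage."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - idle_delta / total_delta)


def read_cpu_sample(path="/proc/stat"):
    """Read the current aggregate counters from the kernel."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

If the kernel never credits ticks to the idle counter during an interval, `idle_delta` is 0 and `load_percent` returns 100 regardless of how much work was actually done, which would match the behavior described above.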

 

Message 10 of 10
(4,893 Views)