Real-Time Measurement and Control


It is not possible to synchronise the Linux RT target (9053) with the NTP server. Any ideas?

I have a problem with the NI-9053 which is not able to synchronise with the NTP server. I have already posted some details in the comments to this post, but now I have done some measurements and it looks even weirder and worse.

 

Configuration:

  • Compact RIO NI 9053, Linux RT 2024 Q3 (latest as of today) with the latest NI-Sync installed.
  • The final goal is to get it working while connected to a Windows PC via a USB network cable (console port). For now, though, I have connected both the PC and the Compact RIO directly to a LAN with unrestricted internet access in order to follow the manuals as closely as possible.
  • All the required firewall ports are open for anyone.

 

What I did:

  • Followed this manual closely. The only difference is that I left the default "tinker panic 0" and loopback settings enabled (otherwise there is no connection at all).
  • Took into account the note about the Linux OS desynchronisation from the network time, as mentioned in the "Additional information" section of the above-mentioned manual.
  • For the tests with synchronisation with the Windows PC alone, I took into account the information from the help pages article that the Windows time server is a no-no and that the 3rd party Meinberg NTP server should be used instead.

Combinations of the parameters I have tested:

  • IEEE-1588 mode (slave) + synchronisation with the Internet time (time.nist.gov).
  • Same as above + added a couple more servers (pool.ntp.org + local NTP server on Linux).
  • IEEE-1588 mode (slave) + synchronisation with the Windows PC using the W32time NTP server (built into Windows 10).
  • IEEE-1588 mode (slave)  + synchronisation with the Meinberg NTP.
  • All of the above, but in 802.1AS mode.

Outcome:

  • In most cases the connection to the NTP servers was established successfully, and I confirmed this as described here. Additionally, the local NTP servers (both Windows- and Linux-based) were checked from other computers and confirmed to be working well. E.g. when the only NTP server is time.nist.gov:
    admin@CRIO:~# ntpq -pn
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *132.163.96.3    .NIST.           1 u   47   64  377  157.062   -0.075   0.082
  • Meinberg NTP just does not work for me; this is the only case where I was not able to make the Compact RIO connect to the NTP server no matter what.
  • On the other hand, the Windows time NTP server (W32time) seems to be working and it is possible to connect to it.
  • Whatever the combination of parameters, synchronisation does not work, i.e. achieving 1 ms (let alone 1 us) as described here is simply impossible. For example, the clock difference between the reference PC and the Compact RIO over ~15 hours looks like:
    D_mitriy_0-1726816714983.png


    Or like (~1.5 hours): 

    ntp_pic01.png
  • Running grep ntpd /var/log/messages returns output like the below, regardless of the server(s) I am trying to connect to (note the TIME_ERROR):
    2024-09-20T10:47:36.729+03:00 CRIO ntpd[1630]: ntp-4 is maintained by Network Time Foundation,
    2024-09-20T10:47:36.729+03:00 CRIO ntpd[1630]: Inc. (NTF), a non-profit 501(c)(3) public-benefit
    2024-09-20T10:47:36.729+03:00 CRIO ntpd[1630]: corporation. Support and training for ntp-4 are
    2024-09-20T10:47:36.729+03:00 CRIO ntpd[1630]: available at https://www.nwtime.org/support
    2024-09-20T10:47:36.729+03:00 CRIO ntpd[1630]: ----------------------------------------------------
    2024-09-20T10:47:36.737+03:00 CRIO ntpd[1632]: proto: precision = 0.345 usec (-21)
    2024-09-20T10:47:36.738+03:00 CRIO ntpd[1632]: basedate set to 2020-06-11
    2024-09-20T10:47:36.738+03:00 CRIO ntpd[1632]: gps base set to 2020-06-14 (week 2110)
    2024-09-20T10:47:36.738+03:00 CRIO ntpd[1632]: restrict default: KOD does nothing without LIMITED.
    2024-09-20T10:47:36.745+03:00 CRIO ntpd[1632]: Listen and drop on 0 v6wildcard [::]:123
    2024-09-20T10:47:36.745+03:00 CRIO ntpd[1632]: Listen and drop on 1 v4wildcard 0.0.0.0:123
    2024-09-20T10:47:36.746+03:00 CRIO ntpd[1632]: Listen normally on 2 lo 127.0.0.1:123
    2024-09-20T10:47:36.746+03:00 CRIO ntpd[1632]: Listen normally on 3 lo [::1]:123
    2024-09-20T10:47:36.746+03:00 CRIO ntpd[1632]: bind(20) AF_INET6 fe80::280:2fff:fe36:94de%2#123 flags 0x11 failed: Cannot assign requested address
    2024-09-20T10:47:36.746+03:00 CRIO ntpd[1632]: unable to create socket on eth0 (4) for fe80::280:2fff:fe36:94de%2#123
    2024-09-20T10:47:36.746+03:00 CRIO ntpd[1632]: failed to init interface for address fe80::280:2fff:fe36:94de%2
    2024-09-20T10:47:36.746+03:00 CRIO ntpd[1632]: Listening on routing socket on fd #20 for interface updates
    2024-09-20T10:47:36.746+03:00 CRIO ntpd[1632]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
    2024-09-20T10:47:36.746+03:00 CRIO ntpd[1632]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
    2024-09-20T10:47:38.734+03:00 CRIO ntpd[1632]: Listen normally on 5 eth0 [fe80::280:2fff:fe36:94de%2]:123
    2024-09-20T10:47:38.734+03:00 CRIO ntpd[1632]: new interface(s) found: waking up resolver
    2024-09-20T10:47:45.734+03:00 CRIO ntpd[1632]: Listen normally on 6 eth0 192.168.0.95:123
    2024-09-20T10:47:45.734+03:00 CRIO ntpd[1632]: new interface(s) found: waking up resolver
  • If I enable the Linux OS synchronisation, on top of the previous messages I get a "kernel reports TIME_ERROR: 0x2040: Clock Unsynchronized" error.
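For readers trying to interpret the diagnostics quoted in the list above, here is a minimal Python sketch of two helpers. One decodes the hexadecimal status word from the "kernel reports TIME_ERROR" log lines using the clock status bits defined in the Linux header linux/timex.h. The other splits one peer line of `ntpq -pn` output into named fields. The function names `decode_status` and `parse_peer` are illustrative and not part of any NI or ntp tool, and the parser assumes the standard ten-column peer line.

```python
# Clock status bits as defined in the Linux header <linux/timex.h>.
STA_FLAGS = {
    0x0001: "STA_PLL",
    0x0002: "STA_PPSFREQ",
    0x0004: "STA_PPSTIME",
    0x0008: "STA_FLL",
    0x0010: "STA_INS",
    0x0020: "STA_DEL",
    0x0040: "STA_UNSYNC",
    0x0080: "STA_FREQHOLD",
    0x0100: "STA_PPSSIGNAL",
    0x0200: "STA_PPSJITTER",
    0x0400: "STA_PPSWANDER",
    0x0800: "STA_PPSERROR",
    0x1000: "STA_CLOCKERR",
    0x2000: "STA_NANO",
    0x4000: "STA_MODE",
    0x8000: "STA_CLK",
}


def decode_status(status):
    """Return the names of all status bits set in `status`, lowest bit first."""
    return [name for bit, name in sorted(STA_FLAGS.items()) if status & bit]


def parse_peer(line):
    """Parse one peer line of `ntpq -pn` output (hypothetical helper)."""
    line = line.strip()
    # The first character is the tally code; '*' marks the selected system peer.
    tally = line[0] if line[0] in "*+-#ox." else " "
    fields = (line[1:] if tally != " " else line).split()
    return {
        "selected": tally == "*",
        "remote": fields[0],
        "refid": fields[1],
        "stratum": int(fields[2]),
        "reach": int(fields[6], 8),      # reach is printed in octal
        "delay_ms": float(fields[7]),
        "offset_ms": float(fields[8]),
        "jitter_ms": float(fields[9]),
    }


print(decode_status(0x41))    # status word from the log excerpt
print(decode_status(0x2040))  # status word seen with Linux OS sync enabled
peer = parse_peer("*132.163.96.3 .NIST. 1 u 47 64 377 157.062 -0.075 0.082")
print(peer["reach"] == 0o377, peer["offset_ms"])
```

Both status words contain STA_UNSYNC (0x0040), which is the "Clock Unsynchronized" part of the message; the remaining bit is STA_PLL in 0x41 and STA_NANO in 0x2040. A reach value of octal 377 means the last eight polls of that server were all answered. ntpd commonly logs this TIME_ERROR once at startup before the first synchronisation, so a single occurrence may not indicate a fault by itself.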

Summary:

Time synchronisation is poorly documented, and it is impossible to set it up even if you do everything according to the manual. There are a number of manuals on the same subject, but none of them is complete: they are missing important information, they contradict each other in some places (especially the suggested configs), or they are just plain wrong (e.g. the config in the first "official" manual does not even connect to the NTP server without adding the tinker panic and loopback settings; Windows time is at least somewhat functional, unlike Meinberg NTP, etc.). I've spent a lot of time on this this week and still can't get it to work. Maybe it is just me.

 

So any ideas or tips are welcome! Maybe I have done something stupidly wrong and just cannot see it?

Message 1 of 8

I took some measurements over the weekend. The time difference between the local Windows PC and the Compact RIO looks even more strange now.

One point every 20 seconds, so these regular "jumps" happen every ~9.5 hours. Maybe those are the points when the PC synchronises its own clock with time.windows.com, maybe something else.

D_mitriy_0-1727070566906.png
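The interval between such jumps can also be estimated from the logged samples rather than read off the plot. The Python sketch below is an illustration only: the 100 ms threshold, the function names and the synthetic sawtooth data are assumptions, not the measurements from this thread. Only the 20-second sample period comes from the post above.

```python
def find_jumps(offsets_ms, sample_period_s=20.0, threshold_ms=100.0):
    """Return the times (in hours) where consecutive samples differ by more
    than threshold_ms. The threshold is an assumed value, tune it to the data."""
    jumps = []
    for i in range(1, len(offsets_ms)):
        if abs(offsets_ms[i] - offsets_ms[i - 1]) > threshold_ms:
            jumps.append(i * sample_period_s / 3600.0)
    return jumps


def jump_intervals(jump_times_h):
    """Return the spacing (in hours) between successive jumps."""
    return [b - a for a, b in zip(jump_times_h, jump_times_h[1:])]


# Synthetic data: a clock drifting 0.5 ms per sample that is reset
# every 1710 samples (1710 * 20 s = 9.5 h). Not real measurements.
offsets = [(i % 1710) * 0.5 for i in range(5000)]

times = find_jumps(offsets)
print(times)                  # jump times in hours
print(jump_intervals(times))  # spacing between jumps in hours
```

Comparing the detected jump times with the timestamps of W32time events in the Windows event log would be one way to test the guess that the PC's own resynchronisation causes the steps.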

 

Message 2 of 8

I managed to get the recommended Meinberg NTP to work. The NI manual, which recommends starting with an empty config, is wrong here.

And it does not synchronize the CompactRIO anyway, at least in IEEE-1588 mode (see below for overnight timings).

ntp_pic04.png

Message 3 of 8

Thanks for posting this ... I am sure it will help somebody else scratching their head about how to do this (I remember one of my ex-colleagues mentioning something similar, but I can't remember what workaround was devised).

Consultant Control Engineer
www-isc-ltd.com
Message 4 of 8

It would be great if you had a chance to ask...

Quick update, I have repeated the test with the 802.1AS and Meinberg NTP, the results are the same (i.e. synchronization does not work).

Message 5 of 8

Another update. Yesterday I decided that if it doesn't work the way the NI manuals say, then let's do the opposite.

I removed NI-Sync altogether (fortunately I do not need it) and re-enabled native Linux synchronisation (as described here). Rebooted the system. Got below 2 ms time deviation within a minute or two. Did a very quick and dirty test routine (no logging etc, just a plot). The results of the ~14 hour test are below.

D_mitriy_0-1727418258776.png

So there are some jumps (my guess is that this is due to the PC's own time synchronisation with the internet), but the result is at least stable most of the time and it just works. Of course, this is nowhere near what is promised with NI-Sync (and I have no idea what the people who need it would do), but at least it is a decent result.
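For anyone who wants to reproduce this kind of measurement, one possible approach is a plain SNTP query from one machine to the NTP server on the other, using the standard four-timestamp offset and delay formulas. The Python sketch below is not the test routine used for the plot above; the function names are illustrative, and the result is limited by the resolution of the local OS clock and by any asymmetry in the network path.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def offset_and_delay(t0, t1, t2, t3):
    """Standard NTP formulas. t0: client send, t1: server receive,
    t2: server transmit, t3: client receive (all in seconds)."""
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


def ntp_to_unix(seconds, fraction):
    """Convert a 32.32 fixed-point NTP timestamp to Unix seconds."""
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query_offset(server, timeout=2.0):
    """Send one SNTP client request and return (offset, delay) in seconds."""
    request = b"\x23" + 47 * b"\0"  # LI=0, VN=4, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t0 = time.time()
        sock.sendto(request, (server, 123))
        data, _ = sock.recvfrom(512)
        t3 = time.time()
    words = struct.unpack("!12I", data[:48])
    t1 = ntp_to_unix(words[8], words[9])    # receive timestamp
    t2 = ntp_to_unix(words[10], words[11])  # transmit timestamp
    return offset_and_delay(t0, t1, t2, t3)


# Worked example with made-up timestamps (no network needed).
print(offset_and_delay(10.0, 10.5, 10.6, 10.3))
```

Calling `query_offset` with the address of the NTP server under test once every few seconds and logging the first returned value gives an offset series comparable to the plots in this thread. A positive offset means the server clock is ahead of the local clock.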


Message 6 of 8

Another update: It turns out that, for some reason, some of the DAQmx tasks (within the same CompactRIO unit) refuse to synchronise without NI-Sync installed (Error -209836). After reinstalling NI-Sync, all synchronisation stops working again. So it looks like it is possible to get everything working if I can somehow prevent NI-Sync from touching the system clock.

Message 7 of 8

And sort of a final update (?)

 

I spent about 3 or 4 weeks with NI support trying to get accurate synchronisation with the PC or network (via NTP). It turns out that not only does NI-Sync not work properly* with the Linux cRIOs, it also prevents the normal Linux NTP features from working.

The cherry on top is that, due to bug #2866609, you can't use DAQmx without NI-Sync installed on the cRIO, so, at least until this bug is resolved, you can't really synchronize time on the cRIO automatically. With NI-Sync disabled, I achieved sync accuracy under 2 milliseconds with the local network NTP server. But since I need DAQmx, I have no choice but to "happily" live with time offsets of up to 8 seconds**.

 

Notes:

* Unless you have real TSN hardware in your network, i.e. your switches and routers support TSN, and so on and so forth. That is not the case for me, and such hardware is expensive. Other Linux-based cRIOs can't do that job either, because to switch on TSN they need NI-Sync, and with NI-Sync enabled you can't sync them with anything except other TSN devices.

** That's what I measured on my system. The interesting thing is that the difference between the time reported by "niSync Get Time", which I used in my code, and other means of time measurement could be quite large and change over time. So, to be honest, I do not know how to reliably set and measure time on the Linux-based cRIOs.

Message 8 of 8