CentOS – Amazon AWS CentOS 7 system clock is fast

aws centos date

We have an Amazon AWS instance running CentOS Linux 7 (Core), though the problem may not be specific to this system.

A few days ago the system clock (date) began running very fast.
If we sync it with the hardware clock (hwclock), then after about 10-20 minutes the system clock (date) is 48 seconds ahead.
48 seconds is the maximum offset: after a few hours it is still 48 seconds ahead.

I know that a small offset is normal, but a 48-second offset in ~10-20 minutes is not.
I also know that there are tools and libraries such as adjtimex that can use a "delta" value to adjust the system time.
But in my case the clock stops gaining once the offset reaches ~48 seconds.
So, for example, hwclock will print 12:00:00 while date prints 12:00:48.
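
The gap between the two clocks can also be measured as a number instead of being read off two printed times. The following is only a sketch: it assumes the kernel exposes the hardware clock at /sys/class/rtc/rtc0/since_epoch (not every VM does) and that the hardware clock is kept in UTC. offset_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# offset_seconds SYSTEM_EPOCH RTC_EPOCH
# Print how many seconds the system clock is ahead of the hardware clock
# (a negative result means the system clock is behind).
offset_seconds() {
    echo $(( $1 - $2 ))
}

# Assumed sysfs path; skipped quietly if this machine does not provide it.
rtc_file=/sys/class/rtc/rtc0/since_epoch
if [ -r "$rtc_file" ]; then
    echo "system clock is $(offset_seconds "$(date +%s)" "$(cat "$rtc_file")") s ahead of the RTC"
fi
```

With the example above (hwclock at 12:00:00, date at 12:00:48), the helper would report 48.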

I tried:

  1. Installed ntpdate and synced the time via ntpdate pool.ntp.org
  2. Ran hwclock --hctosys to set the system time from the hardware clock. Also tried hwclock --systohc after syncing the time (date) with ntpdate
  3. Created the file /etc/sysconfig/clock with the "HWCLOCK_ADJUST" param set to true. Also tried it with false
  4. Deleted the file /etc/adjtime (or similar), which had UTC and ZERO values in it

None of this helped.

After syncing the time, I ran the following loop:

$ while true; do ntpdate pool.ntp.org; sleep 60; done

16 Jan 15:29:45 ntpdate[20656]: step time server 129.250.35.251 offset -4.977822 sec
16 Jan 15:30:46 ntpdate[20743]: step time server 129.250.35.251 offset -5.117517 sec
16 Jan 15:31:48 ntpdate[20813]: step time server 74.117.214.3 offset -4.853926 sec
16 Jan 15:32:50 ntpdate[20890]: step time server 23.239.26.89 offset -5.583270 sec
16 Jan 15:33:51 ntpdate[20941]: step time server 74.117.214.3 offset -4.983483 sec
16 Jan 15:34:53 ntpdate[20994]: step time server 12.167.151.1 offset -5.150401 sec
16 Jan 15:35:54 ntpdate[21080]: step time server 173.255.206.154 offset -5.256357 sec
16 Jan 15:37:03 ntpdate[21155]: adjust time server 12.167.151.1 offset 0.011276 sec
16 Jan 15:38:09 ntpdate[21205]: adjust time server 108.61.56.35 offset -0.019818 sec
16 Jan 15:39:16 ntpdate[21241]: adjust time server 108.61.56.35 offset -0.285154 sec
16 Jan 15:40:18 ntpdate[21660]: step time server 108.61.56.35 offset -5.227262 sec
16 Jan 15:41:19 ntpdate[21706]: step time server 108.61.73.244 offset -5.474606 sec
16 Jan 15:42:20 ntpdate[21756]: step time server 108.61.73.244 offset -5.286961 sec
16 Jan 15:43:22 ntpdate[21791]: step time server 108.61.73.244 offset -4.808674 sec
16 Jan 15:44:29 ntpdate[21885]: adjust time server 96.244.96.19 offset -0.010287 sec
16 Jan 15:45:36 ntpdate[21952]: adjust time server 96.244.96.19 offset -0.000296 sec
16 Jan 15:46:43 ntpdate[22013]: adjust time server 96.244.96.19 offset -0.012838 sec
16 Jan 15:47:51 ntpdate[22126]: adjust time server 198.206.133.14 offset -0.347436 sec
16 Jan 15:48:53 ntpdate[22220]: step time server 198.206.133.14 offset -5.570427 sec
16 Jan 15:49:57 ntpdate[22300]: step time server 198.206.133.14 offset -5.229636 sec
16 Jan 15:50:58 ntpdate[22367]: step time server 104.131.53.252 offset -5.466987 sec
16 Jan 15:52:00 ntpdate[22407]: step time server 104.131.53.252 offset -5.298659 sec
16 Jan 15:53:01 ntpdate[22462]: step time server 104.131.53.252 offset -5.127748 sec
16 Jan 15:54:03 ntpdate[22578]: step time server 129.6.15.30 offset -5.014787 sec
16 Jan 15:55:05 ntpdate[22617]: step time server 129.6.15.30 offset -5.144181 sec
16 Jan 15:56:06 ntpdate[22694]: step time server 129.6.15.30 offset -5.436509 sec
16 Jan 15:57:08 ntpdate[22733]: step time server 96.238.43.39 offset -5.038639 sec

Can anyone tell me what's going on here?
Does this mean that the system clock sometimes runs correctly for ~3-4 minutes?
Before seeing these logs I thought it always kept gaining until it reached 48 seconds.
The log lines are not exactly 60 seconds apart because ntpdate takes a few seconds to run and prints its line only after syncing.
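
To put numbers on the pattern in the log above (about 5 seconds gained per minute during the fast phases, close to zero during the calm ones), the ntpdate lines can be summarized with awk. This is a rough sketch that assumes the exact line layout shown in the log; summarize_ntpdate and ntpdate.log are hypothetical names.

```shell
#!/bin/sh
# summarize_ntpdate: read ntpdate output lines on stdin and report how many
# corrections were steps, how many were slews ("adjust"), and the mean
# offset of the stepped corrections in seconds.
summarize_ntpdate() {
    awk '
        # Assumed layout:
        # day month time ntpdate[pid]: <mode> time server <ip> offset <seconds> sec
        $5 == "step"   { steps++; step_sum += $(NF-1) }
        $5 == "adjust" { slews++ }
        END {
            mean = steps ? step_sum / steps : 0
            printf "steps=%d slews=%d mean_step_offset=%.3f\n", steps, slews, mean
        }
    '
}
```

If the loop output is saved to a file, it can be fed in with: summarize_ntpdate < ntpdate.log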

We worked around this issue by running ntpdate (ntp) as a service to sync the date automatically.

What are the possible reasons for these sudden, huge speed-ups?

If this is not a common issue, we will contact Amazon support for help.

Best Answer

The problem was probably in one of the hypervisors: its clock could have been skewed by 48 s. It happens, and it is not a problem unique to AWS.

There was also a Xen bug; I have no idea whether it still applies nowadays. (Hasn't AWS migrated to KVM?)

Amazon advises people to install chrony and sync it with one of their NTP servers. Have a look at the AWS docs: EC2, Setting the Time for Your Linux Instance.

As in:

sudo yum erase 'ntp*'
sudo yum install chrony

Edit /etc/chrony.conf so that it contains:

server 169.254.169.123 prefer iburst

And lastly:

sudo service chronyd start
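
Once chronyd is running, it is worth checking that it has actually selected the Amazon time source and how far off the system clock still is. A minimal sketch, assuming chronyc is installed alongside chronyd; system_time_line and check_chrony are hypothetical helper names, not part of chrony.

```shell
#!/bin/sh
# system_time_line: pick the "System time" line out of `chronyc tracking`
# output supplied on stdin. That line reports the remaining clock offset.
system_time_line() {
    grep '^System time'
}

# check_chrony: show the configured sources and the current offset.
# Does nothing if chronyc is not installed.
check_chrony() {
    if command -v chronyc >/dev/null 2>&1; then
        chronyc sources -v      # the currently selected source is marked with '*'
        chronyc tracking | system_time_line
    fi
}
```

Call check_chrony a minute or two after starting the service. To have chronyd come back after a reboot on CentOS 7, sudo systemctl enable chronyd should do it.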

One more thing to try, per a comment from @jordanm, is stopping and starting the EC2 instance. You might get lucky and have it come up on another hypervisor whose clock is not skewed.

If these actions still do not solve the problem, I would open a ticket with Amazon.
