I have set up a small network that does not and cannot have connectivity outside of its LAN. There are 2 ESX hosts connected to a SAN, hosting a number of Windows Server VMs in a domain. As I do not have external connectivity, I have configured the PDCe to be the primary time source and configured both ESX hosts to get their time from the PDCe. In 1 week the PDCe has lost 11 seconds. The ESX host that the PDCe is running on has an offset of 0.037, but the other host has an offset of -3597 ms. All other statistics when running watch ntpq -p on both hosts are very similar.
1. Given that I cannot connect to an external time source, is this the correct configuration?
2. Can anyone advise how to troubleshoot why my ESX host is losing time against the PDCe?
This is a live network so I can see the time sync becoming an issue over the coming months.
The NTP service is running. If I run watch ntpq -p I get this result:
st t when poll reach delay offset jitter
1 u 46 64 377 0.350 -3676.9 1.447
Why is the offset nearly 4 seconds and increasing daily?
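For anyone reading the numbers above: ntpq reports delay, offset and jitter in milliseconds, and reach is an octal bitmask of the last eight polls. The following is only a minimal sketch for decoding a peer line like the one posted. The names PeerStats, parse_peer_line and successful_polls are made up for illustration, and it assumes the last eight whitespace-separated fields are st, t, when, poll, reach, delay, offset, jitter, as in the header shown.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    stratum: int
    when: str         # kept as text: ntpq may print "-", "46" or "12m" here
    poll: int         # poll interval in seconds
    reach: int        # reachability register, parsed from octal
    delay_ms: float
    offset_ms: float
    jitter_ms: float


def parse_peer_line(line: str) -> PeerStats:
    """Parse the trailing eight columns of one `ntpq -p` peer line."""
    st, _type, when, poll, reach, delay, offset, jitter = line.split()[-8:]
    return PeerStats(
        stratum=int(st),
        when=when,
        poll=int(poll),
        reach=int(reach, 8),  # ntpq prints reach in octal
        delay_ms=float(delay),
        offset_ms=float(offset),
        jitter_ms=float(jitter),
    )


def successful_polls(reach: int) -> int:
    """Count how many of the last eight polls were answered."""
    return bin(reach & 0xFF).count("1")


# The peer line from the post above.
stats = parse_peer_line("1 u 46 64 377 0.350 -3676.9 1.447")
print(f"offset: {stats.offset_ms / 1000:.4f} s")
print(f"delay:  {stats.delay_ms} ms")
print(f"polls answered (last 8): {successful_polls(stats.reach)}")
```

A reach of 377 with a sub-millisecond delay suggests the host is reaching the PDCe without trouble, so the growing offset is probably not a network problem. It may be worth checking whether the peer is actually marked as selected (a leading * in the remote column, which is not shown in the output above).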
As general advice: in most environments the structure is like yours, where the ESXi hosts have no outside connectivity for security reasons. So you need to set up an internal NTP server (such as the primary DC). However, that internal NTP server still needs an external NTP server to sync its own clock with. As a quick test for your problem, configure another internal NTP server (such as a router or a firewall) that has a firewall rule allowing NTP traffic to and from the internet (UDP 123), set it as the NTP server for your hosts, then check again by running ntpq -p and post the result after some time.
I explained on following links about NTP config and possible problems:
The network (not just the ESX hosts) cannot have external connectivity, so I cannot point to an external NTP time source. So my first question was whether I should use a VM (e.g. the PDCe) or one of the ESX hosts as the primary source. I decided to use the PDCe (which is a VM) as the primary source, and it is losing 11 seconds a week. I point both ESXi hosts to the PDCe for their time. The host that the PDCe is running on is OK, but the time on the other ESXi host is 3 seconds out, and I don't understand why it is so far out when they are connected to the same LAN\switch.
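One way to narrow this down is to ask the PDCe for the time directly from a single machine (for example the backup server) and compare the answer with what each ESXi host reports. Below is a rough SNTP client sketch using only the Python standard library. It is an illustration rather than a replacement for ntpq: the function names are made up, the host name in the usage note is a placeholder, and it assumes the target answers plain NTP requests on UDP 123.

```python
import socket
import struct
import sys
import time

NTP_UNIX_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def ntp_to_unix(seconds: int, fraction: int) -> float:
    """Convert a 64-bit NTP timestamp (two 32-bit words) to Unix time."""
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def parse_reply(packet: bytes, t1: float, t4: float):
    """Return (offset, delay) in seconds from a 48-byte NTP reply.

    t1 is the local send time and t4 the local receive time.
    """
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    words = struct.unpack("!12I", packet[:48])
    t2 = ntp_to_unix(words[8], words[9])    # server receive timestamp
    t3 = ntp_to_unix(words[10], words[11])  # server transmit timestamp
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def query(server: str, timeout: float = 2.0):
    """Send one client-mode request (version 3) and measure the reply."""
    request = b"\x1b" + 47 * b"\0"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(request, (server, 123))
        packet, _ = sock.recvfrom(512)
        t4 = time.time()
    return parse_reply(packet, t1, t4)


if __name__ == "__main__":
    # Servers to query are taken from the command line.
    for host in sys.argv[1:]:
        offset, delay = query(host)
        print(f"{host}: offset {offset:+.3f} s, delay {delay * 1000:.1f} ms")
```

Saved under a name of your choosing and run with the PDCe's address as the argument (for example pdce.example.local, a placeholder), it prints how far the local clock is from the PDCe. A positive offset means the server's clock is ahead of the local one.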
Do you have another system (not a VM) - e.g. a server - in the subnet, which you can use/configure as an "external" common time server for the ESXi hosts, as well as the DC running on them?
I have 3 physical servers, 2 x esx hosts and 1 backup server which is a member of the AD Domain.
I was expecting some drift in the time but not as much as I'm experiencing in such a short timescale.
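To put the 11 seconds per week into perspective, it can help to express it as a drift rate in parts per million and project it forward. This is just back-of-the-envelope arithmetic on the figure quoted in this thread, with illustrative function names, and it assumes the drift stays constant.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def drift_ppm(drift_seconds: float, interval_seconds: float) -> float:
    """Drift rate in parts per million over the observed interval."""
    return drift_seconds / interval_seconds * 1e6


def projected_drift(drift_seconds: float, interval_seconds: float,
                    horizon_seconds: float) -> float:
    """Accumulated drift over a longer horizon, assuming a constant rate."""
    return drift_seconds / interval_seconds * horizon_seconds


ppm = drift_ppm(11, SECONDS_PER_WEEK)
per_year = projected_drift(11, SECONDS_PER_WEEK, SECONDS_PER_YEAR)
print(f"{ppm:.1f} ppm")
print(f"{per_year / 60:.1f} minutes per year")
```

That works out to roughly 18 ppm, or a little under ten minutes a year if nothing corrects it. A figure of that order is not unusual for a clock that has no reference to discipline it, which is the situation the PDCe is in here.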
I had a similar problem in one of our projects. Time sync was interrupted suddenly after some minutes with the PDC as the NTP server for my ESXi hosts. So I decided to set up a MikroTik RouterBOARD as the NTP server, and then everything worked fine. However, if you want to use the DC as the time source, you must set the PDC as the internal NTP server in your network (other DCs will receive their time from the PDC by default), then set an internal device that has internet access as the PDC's NTP source. That device should then sync its time with an external NTP server.