actyler1001
Enthusiast

NTP broken after ESXi 7u3 upgrade

NTP time sync appears to have broken after the 7U3 upgrade. Has anyone else run into this, and do you have suggestions on how I might fix it? I've deleted and re-created the service, checked the firewall policy, and tried different servers. Nothing helps.

Original Build: VMware ESXi, 7.0.2, 17867351

actyler1001_1-1635089620144.png

 

New Build: VMware ESXi, 7.0.3, 18644231

actyler1001_0-1635089532365.png

 

108 Replies
sramanuja
VMware Employee

I'm a product manager in the vSphere team and have been monitoring this thread for a while. Unfortunately, we have been unable to reproduce internally some of the issues you have reported. Would you be willing to share details of these issues and support bundles from your environment so that we can investigate this further?

CoffeeBlackest
Contributor

This is an excerpt from one of my emails with support:

Although this has been one of the known issues with 7.0 U3, you can definitely apply the workaround steps shown in the article: https://kb.vmware.com/s/article/86255?lang=en_US

That said, I'm not certain my issue is the same as what has been described in some of the above posts.

 

VMware ESXi, 7.0.3, 19193900

Kinnison
Enthusiast

@sramanuja,

Good afternoon. I can only talk about my personal experience and environment.

Starting with ESXi 7.0U3 and before ESXi 7.0U3c, if for any reason no reliable time source was reachable, the management agent on the hosts crashed in less than a minute. I mitigated that issue by adding a local, reliable time source and referencing it by its IP address rather than by FQDN. It was not a big effort, as my lab is a small one with only six ESXi hosts.

Disabling the monitoring of "time sync" related events also mitigated this issue, but left a permanent warning in HOST > Configuration > time configuration stating that the "time service is currently not synchronized", even when the NTP service was working as expected and time sources were reachable over the internet.

Somehow ESXi 7.0U3c fixed that specific issue but introduced others.

With vCenter 7.0U3c, going to HOST > Configuration > time configuration is more or less futile, as no alarms or warnings are generated or tracked when time synchronization is lost, e.g. when I stopped the service via the command line and restarted it some time later (/etc/init.d/ntpd stop, start, restart). So, to be properly notified of potential issues in a timely manner, I decided to rely on modern logging and monitoring tools.
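For anyone who wants to check a time source independently of what the vSphere Client displays, the following is a minimal sketch in Python of a single SNTP query. It is an illustration only, not a VMware tool and not what anyone in this thread actually ran; the function names (`build_request`, `parse_response`, `query`) are hypothetical, and it assumes a machine with Python 3 that can reach the server on UDP port 123.

```python
import socket
import struct
import time

# Seconds from 1900-01-01 (NTP epoch) to 1970-01-01 (Unix epoch).
NTP_EPOCH_DELTA = 2208988800


def build_request() -> bytes:
    """Return a 48-byte SNTP client request (LI=0, version 3, mode 3)."""
    return b"\x1b" + bytes(47)


def _to_unix(seconds: int, fraction: int) -> float:
    """Convert a 64-bit NTP timestamp (seconds, fraction) to Unix time."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def parse_response(packet: bytes, t1: float, t4: float):
    """Return (stratum, offset, delay) in seconds from a server reply.

    t1 is the local time when the request was sent,
    t4 is the local time when the reply arrived.
    """
    if len(packet) < 48:
        raise ValueError("NTP reply shorter than 48 bytes")
    # Header bytes, then eleven 32-bit words (root delay, root dispersion,
    # reference ID, and four 64-bit timestamps split into two words each).
    fields = struct.unpack("!BBbb11I", packet[:48])
    stratum = fields[1]
    t2 = _to_unix(fields[11], fields[12])  # server receive timestamp
    t3 = _to_unix(fields[13], fields[14])  # server transmit timestamp
    offset = ((t2 - t1) + (t3 - t4)) / 2   # estimated local clock error
    delay = (t4 - t1) - (t3 - t2)          # round-trip network delay
    return stratum, offset, delay


def query(server: str, port: int = 123, timeout: float = 2.0):
    """Send one request to `server` and return (stratum, offset, delay)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_request(), (server, port))
        packet, _ = sock.recvfrom(512)
        t4 = time.time()
    return parse_response(packet, t1, t4)
```

Calling `query("192.0.2.10")` (a placeholder address, replace it with your own time source) returns the server's stratum, the estimated offset of the local clock, and the round-trip delay. If the server does not answer within the timeout, the call raises a timeout exception, which is itself a useful signal that the source is unreachable.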

TBH, I have not seen the NTP service restart by itself, but I'm of the "old school", so I tend to keep things as simple as possible, and I suppose I was also luckier than others. Note that in my lab I don't rely on AD, the hosts, or any kind of virtual machine as a (reliable) time source.

Regards,
Ferdinando

sramanuja
VMware Employee

Thank you for your feedback. I've shared it with our engineering team.

shashikrishnak
Contributor

Any workaround? 

LabMasterBeta
Enthusiast

@shashikrishnak 

VMware has now implemented another partial fix for NTP directly in the new ESXi 7.0 Update 3d.

Short snippet from the 7.0u3d release notes:

  • PR 2875575: After upgrading to ESXi 7.0 Update 2d and later, you see an NTP time sync error

    In some environments, after upgrading to ESXi 7.0 Update 2d and later, in the vSphere Client you might see the error Host has lost time synchronization. However, the alarm might not indicate an actual issue.

    This issue is resolved in this release. [70u3d]  The fix replaces the error message with a log function for backtracing and prevents false alarms. 

Unfortunately, this does NOT fix the other NTP issues I described earlier in the thread, which still persist, specifically after a CLEAN INSTALL on SuperMicro SuperServer mini-ATX systems with Xeon D-Series CPUs (we use many in clusters for special-use tasks).

shashikrishnak
Contributor

@LabMasterBeta 


My ESXi hosts are on VMware ESXi, 7.0.3, 19482537 (ESXi 7.0 Update 3d), but I am still seeing this NTP alert on all ESXi hosts.

LabMasterBeta
Enthusiast

@sramanuja - Multiple people have now reproduced this: the NTP issues persist and remain unresolved with the latest ESXi 7.0 Update 3d build.

Please relay that to your engineering team.

Also, if you want someone to upload a "vm-support" bundle, you need to open a courtesy SR# and provide instructions for uploading the log dumps to that SR# so that you receive them securely. These forums do not permit file uploads of that size, nor are the uploads secure.
 
sramanuja
VMware Employee

I see that too, but we are not seeing such issues being reported in our official support channel. Unless someone raises an SR, uploads logs for us to investigate, and provides me with the SR number, I cannot help.

I cannot create SRs because they are supposed to be unique to our customers. Multiple SRs can help raise the visibility, so the best thing you can do is to raise an SR and upload logs.

LabMasterBeta
Enthusiast


@sramanuja wrote:

I see that too, but we are not seeing such issues being reported in our official support channel. Unless someone raises an SR, uploads logs for us to investigate, and provides me with the SR number, I cannot help.

I cannot create SRs because they are supposed to be unique to our customers. Multiple SRs can help raise the visibility, so the best thing you can do is to raise an SR and upload logs.


@sramanuja,

Thanks for replying, and I'm very glad you are able to reproduce the issue!!

Unfortunately, I'm very surprised you are unable to open your own internal SR for the issue you reproduced.

Please private message me a courtesy or internal SR# and method to upload logs to that SR#, and I will gladly do so!

I cannot justify paying for an SR for a known issue that, as you said, even though you reproduced it, will likely get no attention unless multiple customers open SRs to escalate it to engineering for a patch. For such a critical service as NTP, this is not a good answer from VMware.

Regardless, please provide any no-cost SR method so I can assist, and I will.

Thanks!

Kinnison
Enthusiast

@sramanuja, good morning,


I beg your pardon, but I have to agree with LabMasterBeta (and many others).


I have now updated vCenter to the current 7.0U3e and ESXi to the current 7.0U3d.
Nonetheless, as far as I have taken the trouble to verify, all the defects in the user interface or the inaccuracy of some of the information displayed are as they were before.
I mean, if I have doubts about the correct functioning of the NTP service I use the command line, exactly as I did before.


Personally, I don't understand why the user interface keeps elements that have not worked for a long time, like the "famous" action button.

I am not here to argue (and do not want to), but should I raise a support request to report that the power consumption shown in a virtual machine's monitoring cannot amount to "some" thousand kilowatts? Honestly, I don't think so.

From my humble perspective, many people no longer bother asking for technical support for long-standing (and maybe obvious) "unresolved product problems and defects"; in the end they manage some other way or ignore them.


Well, this does not mean that I will no longer use VMware products (I don't even think about it) but certainly now I find myself being more cautious and selective than before.


Regards,
Ferdinando

Markus_Hartmann
Contributor

Same here, on a stretched vSAN environment running build 19482537 (DellEMC VxRail image).

Hope there will be a final fix for that.

tractng
Enthusiast

I am seeing similar issues. My versions are below. Last week one host lost its management interface and reconnected after a while. This is a newly built environment, but we are close to moving to production. Right now we are on 6.5. I am not confident in this version, as we have a new CIO and I don't want to get yelled at, LOL!!

I am getting this message, but the time on the hosts is correct most of the time. A few hosts are randomly off, which is scary.

Time service is currently not synchronized.

VMware vCenter Server version: 7.0.3.00600
VMware vCenter Server build number: 19717403

ESXi version: 7.0.3
ESXi build number: 19482537

TPGOPI007
Contributor

VMware ESXi, 7.0.3, 19898904

Same issue on the latest version. Below is the workaround given for my case, but the issue comes back after a reboot.

1. On vSphere Client, go to Configure -> System -> Time Configuration tab, select "Network Time Protocol" and click on EDIT button
2. From the configuration box, uncheck "Enable monitoring events"
3. Click the OK button

 

TPGOPI007
Contributor

@sramanuja, can you please check? There should be a PR open.

actyler555
Enthusiast

vSphere 7 = Garbage.  6.7 support needs to be extended another 12 mo. at least.

Regards,

Adam Tyler

tractng
Enthusiast

TPGOPI007,

If you apply the workaround (unchecking "Enable monitoring events"), does that mean no events will be logged?

TPGOPI007
Contributor

They are not logged until the next reboot.

tractng
Enthusiast

Thanks!

maxr91
Contributor

VMware has released a new ESXi version:

ESXi 7.0 Update 3e, ESXi_7.0.3-0.40.19898904, released 2022-06-14, build 19898904

Has anyone checked whether the issue is finally solved?
