Hello,
I have a cluster of ESX 4.0 U4 hosts (btw: I'm not the admin for esx...).
I have no problems with time keeping and RHEL 5/6 guests.
However, I do have problems with SuSE 10 SP4 x86_64 guests.
kernel is stock sp4: 2.6.16.60-0.97.1-default
virtual hardware is 7
According to Timekeeping-In-VirtualMachines.pdf, it should not be necessary to tweak any parameter.
But with ntp running, I notice that the time always falls behind the time server (e.g. about -42 seconds after one day).
For example:
# /home/nagios/nrpe/libexec/check_ntp -H my_ntp_server -w 1 -c 2
NTP CRITICAL: Offset -42.98160553 secs|offset=-42.98160553
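To put that offset in perspective, it can be converted into a drift rate. The following is only a rough sketch: the function names ntp_offset and drift_ppm are made up for illustration, and it assumes the offset really accumulated over exactly one day (86400 seconds), which is an approximation of what was described above. For comparison, as far as I know ntpd only disciplines frequency errors up to about 500 ppm.

```shell
#!/bin/sh
# Hypothetical helpers, not part of check_ntp or the Nagios plugins.

# Extract the offset in seconds from check_ntp performance data read on stdin.
ntp_offset() {
    sed -n 's/.*|offset=\(-\{0,1\}[0-9.]*\).*/\1/p'
}

# Convert an accumulated offset into parts per million.
# $1 = offset in seconds, $2 = elapsed seconds (assumed default: one day).
drift_ppm() {
    awk -v off="$1" -v secs="${2:-86400}" \
        'BEGIN { printf "%.1f\n", off / secs * 1e6 }'
}

# Sample line copied from the check_ntp output above.
line='NTP CRITICAL: Offset -42.98160553 secs|offset=-42.98160553'
off=$(printf '%s\n' "$line" | ntp_offset)
echo "offset=${off}s drift=$(drift_ppm "$off") ppm"
```

On the guest, the sample line would be replaced by the live output of check_ntp, and the second argument of drift_ppm by the real number of seconds since the clock was last known to be correct.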
Strangely, if I disable vmware tools the vm keeps time in sync correctly using ntp.
I have 4 other guests with this configuration, and all of them show the same behaviour when enabling/disabling VMware Tools.
I suppose it could be related to the vmci kernel module.
In fact, I remember that in the past vmci was only supported on 32-bit kernels in SuSE, but now with virtual hardware 7 it is installed by default, with no possibility of disabling it.
So, when I stop VMware Tools, in particular I unload the vmci kernel module...
Comparing the loaded modules with VMware Tools stopped and started, I noticed this difference:
20a21,22
> iptable_filter 3456 0
> ip_tables 12624 1 iptable_filter
42a45,50
> vmci 41792 1 vsock
> vmmemctl 16968 0
> vmxnet 22916 0
> vmxnet3 39172 0
> vsock 57984 0
> x_tables 14088 1 ip_tables
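A comparison like the one above can be reproduced from two saved lsmod snapshots. This is a minimal sketch, not necessarily how the diff above was made: new_modules is a made-up helper name and the /tmp paths in the comments are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: print the names of modules that appear in the second
# lsmod snapshot but not in the first, sorted alphabetically.
# $1 = snapshot taken with VMware Tools stopped
# $2 = snapshot taken with VMware Tools started
new_modules() {
    awk 'NR == FNR { seen[$1] = 1; next }   # first file: remember module names
         !($1 in seen) { print $1 }         # second file: print unseen names
        ' "$1" "$2" | sort
}

# Possible usage on the guest (paths are arbitrary):
#   sudo /sbin/service vmware-tools stop;  lsmod > /tmp/lsmod.stopped
#   sudo /sbin/service vmware-tools start; lsmod > /tmp/lsmod.started
#   new_modules /tmp/lsmod.stopped /tmp/lsmod.started
```

The lsmod header line is present in both snapshots, so it is filtered out together with the modules that did not change.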
It seems vmware-config.pl has no option for disabling vmci...
What can I do? Can I safely blacklist the vmci kernel module via modprobe.conf or the modprobe.d directory?
In my case these guests only use LSI for disk and e1000 for network.
The HZ-related configuration settings of this kernel are:
# cat /boot/config-`uname -r` | grep HZ
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_MACHZ_WDT=m
I also tried time sync through VMware Tools (after disabling ntp), but the situation was worse. The guest is not under any particular load, judging by its load average.
Is anyone else experiencing similar problems with this kind of guest OS?
Thanks in advance,
Gianluca
Any possibility of contacting the ESX admin to verify that the VMware Tools are up to date on your guests and that the hosts actually are in sync and using NTP?
Thanks for your answer.
I'm not the ESX sysop, but I have access to power functions in vSphere client.
In fact, I already installed VMware Tools as provided by ESX.
ESX is 4.0 U4 (build 504850) and I used VMwareTools-4.0.0-504850.tar.gz.
It apparently installed cleanly.
I have several RHEL 5 and RHEL 6 guests running on the same physical ESX hosts, with the same NTP server configured, and none of them shows any problem with time keeping.
Also, plenty of VMware guests and physical Linux and AIX systems synchronize with that NTP server, so I don't think the server is the problem.
In the meantime I'm testing the following workaround, as blacklisting the vmci module doesn't prevent VMware Tools from loading it...
$ cd /lib/modules/$(uname -r)/misc
$ sudo mv vmci.o vmci.o.orig
$ sudo mv vsock.o vsock.o.orig
$ sudo /sbin/service vmware-tools start
Checking acpi hot plug done
Starting VMware Tools services in the virtual machine:
Switching to guest configuration: done
Guest memory manager: done
VM communication interface: failed
VM communication interface socket family: failed
Guest operating system daemon: done
And the vSphere client shows VMware Tools as running (I think because guestd is running).
The problem is that at reboot vmware-tools fails to start automatically, and if I then run:
sudo /sbin/service vmware-tools start
Checking acpi hot plug done
VMware Tools is installed, but it has not been
(correctly) configured for the running kernel.
To (re-)configure it, invoke the following command:
/usr/bin/vmware-config-tools.pl.
Looking through the startup script, it seems that an (empty) file named "dsp" is required under /usr/lib/vmware-tools, and I lose it after reboot
(possibly due to the shutdown phase..?)
If I do
$ cd /usr/lib/vmware-tools
$ sudo touch dsp
$ sudo service vmware-tools start
Checking acpi hot plug done
Starting VMware Tools services in the virtual machine:
Switching to guest configuration: done
Guest memory manager: done
VM communication interface: failed
VM communication interface socket family: failed
Guest operating system daemon: done
For sure this is a worse than suboptimal workaround... I know...
Typically, without the vmci (and vsock) kernel modules loaded and with ntpd running, I get this after a few days:
$ sudo /home/nagios/nrpe/libexec/check_ntp -H my_ntp_server -w 1 -c 2
NTP OK: Offset 0.07095170021 secs|offset=0.07095170021
Gianluca
There's a clock problem in the 2.6.16.60-0.97.1 kernels on VMware.
This Novell TID covers it: http://www.novell.com/support/kb/doc.php?id=7010716
There's a SUSE 10 SP4 kernel update that fixes this problem. It installs version 2.6.16.60-0.99.1.
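To check whether a given guest is already at or above that kernel level, something like the following might help. It is only a sketch: kernel_at_least is a made-up function, and the comparison simply splits the release string on dots and dashes and compares the fields numerically (a non-numeric flavour suffix such as "-default" counts as 0). It avoids sort -V on the assumption that the coreutils on such an old distribution may not have it.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if kernel release $1 is at least $2.
# Fields are split on '.' and '-' and compared numerically, left to right;
# missing or non-numeric fields are treated as 0.
kernel_at_least() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, /[.-]/)
        nb = split(b, y, /[.-]/)
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = x[i] + 0; yi = y[i] + 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Version named above as containing the fix.
fixed="2.6.16.60-0.99.1"
if kernel_at_least "$(uname -r)" "$fixed"; then
    echo "kernel $(uname -r) is at or above $fixed"
else
    echo "kernel $(uname -r) is older than $fixed"
fi
```

With the stock 2.6.16.60-0.97.1-default kernel from the first post, this should take the "older than" branch.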