tekka
Enthusiast

time keeping problem with SuSE 10 SP4 x86_64

Hello,

I have a cluster of ESX 4.0 U4 hosts (btw: I'm not the admin for esx...).

Timekeeping works without problems on my RHEL 5/6 guests.

On SuSE 10 SP4 x86_64 guests, however, I do have problems.

The kernel is the stock SP4 one: 2.6.16.60-0.97.1-default

The virtual hardware version is 7.

According to Timekeeping-In-VirtualMachines.pdf it should not be necessary to tweak any parameter.

But with ntp running, I notice that the guest's time always falls behind the time server (e.g. after one day the offset is -42 seconds).

For example:

# /home/nagios/nrpe/libexec/check_ntp -H my_ntp_server -w 1 -c 2

NTP CRITICAL: Offset -42.98160553 secs|offset=-42.98160553
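
To put that number in perspective, the offset can be converted into an average drift rate in parts per million. This is only a rough sketch: it assumes the offset built up over exactly one day (86400 seconds), which the post only states approximately, and `drift_ppm` is a helper name made up for illustration.

```shell
# drift_ppm OFFSET_SECONDS ELAPSED_SECONDS
# Prints the average clock drift in parts per million (ppm).
# Hypothetical helper, not part of ntp or the Nagios plugins.
drift_ppm() {
    awk -v off="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", off / secs * 1e6 }'
}

# Offset taken from the check_ntp output; 86400 s is an assumed elapsed time.
drift_ppm -42.98160553 86400   # prints -497.5
```

A drift of roughly 500 ppm sits right at the limit of the frequency error ntpd can slew away, so it is plausible that ntpd alone cannot hold this clock in sync.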

Strangely, if I disable VMware Tools, the VM keeps its time in sync correctly using ntp.

I have 4 other guests with this configuration, and all of them show the same behaviour when enabling/disabling VMware Tools.

I suppose that it could be related to the vmci kernel module.

In fact, I remember that in the past vmci was only supported on 32-bit kernels in SuSE, while now with virtual hardware 7 it is installed by default with no way to disable it.

So, when I stop VMware Tools, I am in particular unloading the vmci kernel module.

Comparing the loaded modules with VMware Tools stopped and started, I see this difference:

20a21,22
> iptable_filter          3456  0
> ip_tables              12624  1 iptable_filter
42a45,50
> vmci                   41792  1 vsock
> vmmemctl               16968  0
> vmxnet                 22916  0
> vmxnet3                39172  0
> vsock                  57984  0
> x_tables               14088  1 ip_tables
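
A comparison like the one above can be reproduced by saving `lsmod` output in both states and listing the module names that appear only in the second snapshot. This is a minimal sketch, not the exact commands used for the diff above; `module_names`, `new_modules` and the file names in the example are made up for illustration.

```shell
# module_names: read lsmod-style text on stdin, print sorted module names.
# The first line of lsmod output is a header, so it is skipped.
module_names() {
    awk 'NR > 1 { print $1 }' | LC_ALL=C sort
}

# new_modules BEFORE AFTER: print modules listed in AFTER but not in BEFORE.
# Both arguments are files holding saved lsmod output.
new_modules() {
    before=$(mktemp) && after=$(mktemp) || return 1
    module_names < "$1" > "$before"
    module_names < "$2" > "$after"
    LC_ALL=C comm -13 "$before" "$after"
    rm -f "$before" "$after"
}

# Example (file names are placeholders):
#   lsmod > /tmp/lsmod.stopped
#   sudo /sbin/service vmware-tools start
#   lsmod > /tmp/lsmod.started
#   new_modules /tmp/lsmod.stopped /tmp/lsmod.started
```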

It seems vmware-config-tools.pl has no option for disabling vmci...

What can I do? Can I safely blacklist the vmci kernel module via modprobe.conf or the modprobe.d directory?

In my case these guests only use LSI for disk and e1000 for network.

The timer-related configuration options of this kernel are:

# cat /boot/config-`uname -r` | grep HZ
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_MACHZ_WDT=m

I also tried time synchronization through VMware Tools (after disabling ntp), but the situation was worse. The guest is not doing anything particular either, judging from its load average.

Anyone else experiencing similar problems with this kind of guest os?

Thanks in advance,

Gianluca

4 Replies
Tsjo
Enthusiast

Any possibility of contacting the ESX admin to verify that VMware Tools is up to date on your guests and that the hosts actually are in sync and using NTP?

tekka
Enthusiast

Thanks for your answer.

I'm not the ESX sysop, but I have access to power functions in vSphere client.

In fact, I already installed VMware Tools as provided by ESX.

ESX is 4.0 U4 (build 504850) and I used VMwareTools-4.0.0-504850.tar.gz.

It apparently installed cleanly.

I have several RHEL 5 and RHEL 6 guests running on the same ESX server physical hosts and none of them shows any problem with time keeping and the same ntp server configured.

Also, plenty of VMware guests and physical Linux and AIX systems synchronize with that ntp server, so I don't think the server is the problem.

In the meantime I'm testing the following workaround, as blacklisting the vmci module doesn't prevent VMware Tools from loading it...

$ cd /lib/modules/$(uname -r)/misc

$ sudo mv vmci.o vmci.o.orig
$ sudo mv vsock.o vsock.o.orig

$ sudo /sbin/service vmware-tools start
   Checking acpi hot plug                                              done
Starting VMware Tools services in the virtual machine:
   Switching to guest configuration:                                   done
   Guest memory manager:                                               done
   VM communication interface:                                        failed
   VM communication interface socket family:                          failed
   Guest operating system daemon:                                      done

And from vSphere client it shows VMware Tools running (I think due to guestd running)

The problem is that at reboot vmware-tools fails to start automatically and then if I run:

sudo /sbin/service vmware-tools start
   Checking acpi hot plug                                              done
VMware Tools is installed, but it has not been
(correctly) configured for the running kernel.
To (re-)configure it, invoke the following command:
/usr/bin/vmware-config-tools.pl.

Looking through the startup script, it seems an (empty) file named "dsp" is needed under /usr/lib/vmware-tools, and I lose it after reboot

(possibly during the shutdown phase?)

If I do:
$ cd /usr/lib/vmware-tools

$ sudo touch dsp

$ sudo service vmware-tools start
   Checking acpi hot plug                                              done
Starting VMware Tools services in the virtual machine:
   Switching to guest configuration:                                   done
   Guest memory manager:                                               done
   VM communication interface:                                        failed
   VM communication interface socket family:                          failed
   Guest operating system daemon:                                      done

I know this is a worse-than-suboptimal workaround...

Typically, without the vmci (and vsock) kernel modules loaded and with ntpd running, I get this after a few days:

$ sudo /home/nagios/nrpe/libexec/check_ntp -H my_ntp_server -w 1 -c 2
NTP OK: Offset 0.07095170021 secs|offset=0.07095170021

Gianluca

bsdnazz
Contributor

There's a clock problem in the 2.6.16.60-0.97.1 kernels on VMware.

This Novell TID covers it: http://www.novell.com/support/kb/doc.php?id=7010716

Guy Dawson
bsdnazz
Contributor

There's a SUSE 10 SP4 kernel update that fixes this problem. It installs version 2.6.16.60-0.99.1.
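
To check whether a guest is already running that kernel or a later one, something along these lines could be used. It is only a sketch: `kernel_at_least` is a hypothetical helper, and it compares just the numeric fields of the version string, ignoring the flavor suffix such as -default.

```shell
# kernel_at_least HAVE WANT
# Succeeds (exit status 0) if version HAVE is equal to or newer than WANT.
# Fields are split on "." and "-" and compared numerically, left to right;
# only as many fields as WANT has are compared, so a trailing flavor
# such as "-default" in HAVE is ignored.
kernel_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, /[.-]/)
        m = split(want, w, /[.-]/)
        for (i = 1; i <= m; i++) {
            if (h[i] + 0 > w[i] + 0) exit 0
            if (h[i] + 0 < w[i] + 0) exit 1
        }
        exit 0
    }'
}

# Compare the running kernel against the fixed version named above.
if kernel_at_least "$(uname -r)" 2.6.16.60-0.99.1; then
    echo "kernel is at or above 2.6.16.60-0.99.1"
else
    echo "kernel is older than 2.6.16.60-0.99.1"
fi
```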

Guy Dawson