I am running an xServer x3550 with VMware ESX 4.0.0, as shown below:
+ + +
VMware ESX 4.0.0 build-164009
+ + +
Last Sunday at around 10 am, with nobody working (in theory), the machine powered off.
The file "/var/log/vmware/hostd-1.log" has an interesting line (the last one):
+ + +
RefreshVms updated overhead for 1 VM
Ticket issued for CIMOM version 1.0, user root
Actual VM overhead: 153550848 bytes
RefreshVms updated overhead for 1 VM
Recvd. ACPI power event from the vmkernel
+ + +
I have no clue what it means, even though I have read a few pages about the ACPI timer ...
Can someone shed a bit of light on my darkness?
Sebastian.
Sebastian,
>2009-11-01 10:40:40.169 F66256D0 info 'ha-host' Recvd. ACPI power event from the vmkernel
This message is from vmware-hostd, the host agent. The vmkernel reports all activity from ACPI. Try taking a look at /var/log/vmkernel for similar messages.
Normally, if a system powers off or reboots and an ACPI event is recorded in the logs, it indicates that someone either pushed the physical power button or, in your case, activated the power button via the RSA interface.
Also, take a look at /var/log/vmksummary at the same time. Run:
grep halt /var/log/vmksummary
You will probably see a line saying "halting" or "rebooting" indicating a clean shutdown of the system around the time of the message and power-off of the system.
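As a rough sketch of that check, the search can be widened to cover rotated copies of the summary log as well. The function name `find_halt_events` is made up for illustration, and the rotated-file naming (`vmksummary.1` and so on) is an assumption, not something confirmed for this host:

```shell
# Hypothetical helper: list clean-shutdown markers ("halting", "rebooting")
# in a summary log and in any rotated copies next to it.
find_halt_events() {
    # $1: base log path; defaults to the file named in this thread
    base="${1:-/var/log/vmksummary}"
    for f in "$base" "$base".*; do
        # Skip names that do not exist (for example an unmatched glob)
        [ -f "$f" ] || continue
        # -H prefixes each match with its file name; "|| true" keeps a
        # file with no matches from being treated as a failure
        grep -H -i -E 'halt|reboot' "$f" || true
    done
}

# Usage: find_halt_events                (default path)
#        find_halt_events /some/other/log
```

If nothing is printed for the time of the outage, the log may simply have rotated past that date.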
I hope this helps.
Faisal
Faisal - thank you very much for your answer, so concrete and precise.
Meanwhile, I was trying to get the list of log files to read ... hehehe
/var/log/vmware/hostd.log
/var/log/messages
/var/log/vmkernel
/var/log/vmksummary.txt
/var/log/vmkwarning
....
Let me have a look at the file(s) you mention ...
Well, the SUMMARY is too short, so the info from last Sunday is lost.
But "vmkernel.1" has an interesting line at the time of the problem, November 1st, 10:40.
(I started the machine again on Monday, Nov 2nd ...)
But, again, the message "VMKAcpi: 2019: In PowerButton Helper" is still meaningless to me ... Can you point me to any book?
Of course I am using Google ...
Thanks.
+ + +
Nov 1 10:35:57 BCNILOG01 vmkernel: 17:22:50:02.421 cpu1:8999)ScsiNpiv: 1304: GetInfo for adapter vmhba1, , max_vports=0, vports_inuse=0, linktype=0, state=1, failreason=0, rv=0, sts=0
Nov 1 10:40:40 BCNILOG01 vmkernel: 17:22:54:45.055 cpu5:4108)VMKAcpi: 2019: In PowerButton Helper
Nov 2 13:31:26 BCNILOG01 vmkernel: TSC: 0 cpu0:0)Init: 418: cpu 0: early measured tsc speed 2493750228 Hz
+ + +
Sebastian,
Here are the meanings of the following messages:
>Nov 1 10:40:40 BCNILOG01 vmkernel: 17:22:54:45.055 cpu5:4108)VMKAcpi: 2019: In PowerButton Helper
The message above means that somebody has pressed the physical "Power" button on the server or initiated a power-off from the RSA interface in your server.
>Nov 2 13:31:26 BCNILOG01 vmkernel: TSC: 0 cpu0:0)Init: 418: cpu 0: early measured tsc speed 2493750228 Hz
The preceding message is the first message generated by the vmkernel at boot time. It means that it has measured the CPU speed of CPU 0 to be 2493750228 Hz and the "timestamp" is 0.
Essentially, your server shut down because someone initiated the process.
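To pull just these two kinds of lines out of the vmkernel logs, something like the following minimal sketch could be used. The function name `find_power_button_events` is hypothetical, and the patterns are taken only from the two messages quoted in this thread, so other builds may word them differently:

```shell
# Hypothetical helper: show ACPI power-button events together with the
# first boot message of CPU 0, so a power-off and the next start-up
# appear side by side.
find_power_button_events() {
    for f in "$@"; do
        # Ignore arguments that are not regular files
        [ -f "$f" ] || continue
        # Patterns copied from the messages quoted above (assumption:
        # the wording is the same in every log on this host)
        grep -H -E 'VMKAcpi: .*PowerButton|cpu 0: early measured tsc speed' "$f" || true
    done
}

# Usage: find_power_button_events /var/log/vmkernel /var/log/vmkernel.1
```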
Take a look at the following KB article:
Determining if user activity caused ESX host reboot (KB #1004594)
Faisal
Faisal - thanks a lot for your explanation.
Just a single question: where did you find it?
I mean, the article you pointed me to is fine, but how could I have found it if I had to do it by myself?
I would like to be able to do it on my own next time ...
Sebastian.