Hi,
You should check /var/log/vmkernel, /var/log/messages, and /var/log/vmksummary.
Thanks,
Samir
P.S : If you think that the answer is helpful please consider rewarding points.
What hardware do you have it installed on?
Tom Cronin, VCP, VMware vExpert 2009 - Co-Leader Buffalo, NY VMUG
Hi,
I went through the logs and can see the entries showing multiple reboots, but the other log files do not provide conclusive proof of what the problem is. Can you recall the sequence of events before the reboots? That may help in troubleshooting.
Mar 13 00:00:16 vhaiswvsh10 logger: (1236852016) loaded VMkernel
Mar 13 00:25:42 vhaiswvsh10 vmkhalt: (1236853543) Starting system...
Mar 13 00:26:07 vhaiswvsh10 vmkhalt: (1236853567) Rebooting system...
Mar 13 00:29:13 vhaiswvsh10 vmkhalt: (1236853753) Starting system...
Mar 13 00:29:18 vhaiswvsh10 logger: (1236853758) loaded VMkernel
Mar 13 00:45:08 vhaiswvsh10 vmkhalt: (1236854708) Starting system...
Mar 13 00:45:13 vhaiswvsh10 logger: (1236854713) loaded VMkernel
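To make the pattern in this excerpt easier to see, here is a rough Python sketch that labels each "Starting system..." entry as clean or unexpected, depending on whether a shutdown message was logged since the previous start. The function name, the regular expression, and the "Halting system" prefix are my own assumptions for illustration, not anything VMware ships. The first start in any excerpt can show up as unexpected simply because the matching shutdown line sits outside the pasted range.

```python
import re

# Assumed layout of a vmksummary line, based only on the excerpt in this thread:
# "Mar 13 00:25:42 host vmkhalt: (1236853543) Starting system..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]+)\s(?P<host>\S+)\s(?P<tag>[\w-]+):\s"
    r"\((?P<epoch>\d+)\)\s(?P<msg>.*)$"
)

def classify_starts(lines):
    """Return (timestamp, 'clean' | 'unexpected') for every system start."""
    results = []
    shutdown_logged = False
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        msg = match.group("msg")
        # "Halting system" is an assumed message; only "Rebooting system..."
        # appears in the excerpt above.
        if msg.startswith(("Rebooting system", "Halting system")):
            shutdown_logged = True
        elif msg.startswith("Starting system"):
            label = "clean" if shutdown_logged else "unexpected"
            results.append((match.group("stamp"), label))
            shutdown_logged = False
    return results

EXCERPT = """\
Mar 13 00:00:16 vhaiswvsh10 logger: (1236852016) loaded VMkernel
Mar 13 00:25:42 vhaiswvsh10 vmkhalt: (1236853543) Starting system...
Mar 13 00:26:07 vhaiswvsh10 vmkhalt: (1236853567) Rebooting system...
Mar 13 00:29:13 vhaiswvsh10 vmkhalt: (1236853753) Starting system...
Mar 13 00:29:18 vhaiswvsh10 logger: (1236853758) loaded VMkernel
Mar 13 00:45:08 vhaiswvsh10 vmkhalt: (1236854708) Starting system...
Mar 13 00:45:13 vhaiswvsh10 logger: (1236854713) loaded VMkernel
""".splitlines()

if __name__ == "__main__":
    for stamp, label in classify_starts(EXCERPT):
        print(stamp, label)
```

On the lines above, the starts at 00:25:42 and 00:45:08 come out as unexpected, and only the 00:29:13 start follows a logged reboot.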
Thanks,
Samir
There is basically nothing happening on the box at this point. All VMs are down and it still reboots. The only activity is that I am trying to scp one of the vital systems off the host, to no avail.
Are you getting any errors at boot-up? I have observed a few errors related to iSCSI.
Thanks,
Samir
There are no boot-up errors. However, the "Restoring S/W iscsi volumes" step takes a long time to complete, and at this point we are not using any iSCSI connections.
If you are not using swiscsi, can you disable it when you have a chance to log in to ESX? Also consider opening a support ticket with VMware to troubleshoot this.
Thanks,
Samir
Has the server been physically moved? We moved a new server to our DR site after building it, and it rebooted daily. We reseated the memory, CPUs, and PCI cards, and have had two months of bliss since.
The server has not been moved at all.
Could you point out to me where and in which log you saw the hostd memory issues?
Hi,
Looks like it lost connectivity to its iSCSI datastores, and then a rescan was performed.
Mar 11 09:33:34 vhaiswvsh10 watchdog-cimserver: '/var/pegasus/bin/cimserver daemon=false' exited after 117 seconds
Mar 11 09:33:34 vhaiswvsh10 watchdog-cimserver: Executing '/var/pegasus/bin/cimserver daemon=false'
Mar 11 09:33:40 vhaiswvsh10 cimserver: trying to popen /sbin/modprobe edd 2>&1
Mar 11 09:33:40 vhaiswvsh10 cimserver: trying to popen /sbin/modprobe edd 2>&1
Mar 11 09:33:40 vhaiswvsh10 vmware-hostd[1899]: Accepted password for user root from 127.0.0.1
Mar 11 09:33:41 vhaiswvsh10 cimserver: created VICimInstanceBuilder
Mar 11 09:33:41 vhaiswvsh10 cimserver: created VICimMethodMgr
Mar 11 09:35:02 vhaiswvsh10 vmkiscsid[22622]: cannot make connection to 10.208.55.70:3260: Connection refused
Mar 11 09:35:02 vhaiswvsh10 vmkiscsid[22622]: Connection to Discovery Address 10.208.55.70 failed
Mar 11 09:35:03 vhaiswvsh10 vmkiscsid[22622]: cannot make connection to 10.208.55.70:3260: Connection refused
Mar 11 09:35:03 vhaiswvsh10 vmkiscsid[22622]: Connection to Discovery Address 10.208.55.70 failed
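If you want to see how often this happens across the whole of /var/log/messages, a small Python sketch like the following can tally the vmkiscsid connection failures per target and reason. The function name and the regular expression are assumptions derived only from the lines pasted above, so treat it as a starting point rather than a complete parser.

```python
import re
from collections import Counter

# Pattern assumed from the excerpt above:
# "vmkiscsid[22622]: cannot make connection to 10.208.55.70:3260: Connection refused"
FAIL_RE = re.compile(
    r"vmkiscsid\[\d+\]: cannot make connection to "
    r"(?P<target>[\d.]+:\d+): (?P<reason>.+)$"
)

def count_iscsi_failures(lines):
    """Count failed iSCSI connection attempts, keyed by (target, reason)."""
    counts = Counter()
    for line in lines:
        match = FAIL_RE.search(line.rstrip())
        if match:
            counts[(match.group("target"), match.group("reason"))] += 1
    return counts

if __name__ == "__main__":
    # The path is the one mentioned earlier in this thread.
    with open("/var/log/messages") as log:
        for (target, reason), n in count_iscsi_failures(log).most_common():
            print(f"{n:6d}  {target}  {reason}")
```

A steady stream of "Connection refused" against a discovery address that is not supposed to be in use would support disabling the software iSCSI initiator.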
Does it still have any connectivity to that datastore?
Did the iSCSI client port get closed locally or was it a remote event?
vExpert 2009
Actually there were no iscsi datastores connected. The firewall was on and swiscsi was enabled, but there was no connection.
OK, then as already advised, disable it for now.
I would also disable the Pegasus cimserver.
The reset occurs after this event:
Mar 13 00:42:13 vhaiswvsh10 cimserver: trying to popen /sbin/modprobe edd 2>&1
Mar 13 00:45:16 vhaiswvsh10 syslogd 1.4.1: restart.
Mar 13 00:45:16 vhaiswvsh10 syslog: syslogd startup succeeded
Every time!
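To check the "every time" claim across the whole log rather than by eye, a Python sketch along these lines can list the last message written before each syslogd restart. The function name is made up for illustration, and it assumes the "syslogd ... restart." line marks the first entry after a boot, as it appears to in the excerpt above. Bear in mind that after a hard reset the last buffered messages may never reach disk, so the line found is only the last one that was actually written.

```python
def last_lines_before_restart(lines):
    """Return the log line that immediately precedes each syslogd restart."""
    found = []
    previous = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if "syslogd" in line and "restart" in line and previous is not None:
            found.append(previous)
        previous = line
    return found

if __name__ == "__main__":
    # The path is the one mentioned earlier in this thread.
    with open("/var/log/messages") as log:
        for entry in last_lines_before_restart(log):
            print(entry)
```

If nearly every entry it prints is the cimserver "trying to popen /sbin/modprobe edd" line, that strengthens the case for stopping Pegasus as a test.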
Use
service pegasus stop
vExpert 2009
We had a similar problem with one of our ESX servers, with no errors in the logs besides what you had. It was a production ESX server, so we did not have time to troubleshoot or to risk it rebooting randomly. We backed up the logs and reinstalled the server.
We went through the logs again but found nothing. After the reinstall, the ESX server never rebooted again.