Hi,
Today one of our servers (Dell PE 2950) had routine maintenance: a Dell technician replaced the RAID controller battery.
On that server we have an ESXi installation (the free version, Update 4, v3.5.0 build 153875, standalone, without VirtualCenter). After the RAID controller battery was replaced, ESXi booted, but:
- lost the IP address of the management interface/console (the IP address was 0.0.0.0)
- lost all network setup (vnics/vswitches and iSCSI)
- went automatically into lockdown mode
After a couple of reboots, working at the console, I managed to turn lockdown mode off (initially it refused to disable lockdown).
Then I had to rebuild all the network setup (at the console and then in the VI client: rebuilt the vswitches and assigned the vnics) and reconfigure iSCSI.
I tested the host with some reboots but:
- about one boot in three it comes up in lockdown mode, even though I had disabled it
- at every boot the automatic startup/shutdown setup is lost (all VMs are in the manual startup group, even those I placed in automatic startup in a given sequence) and the VMs do not start automatically.
So is there a way to do an "integrity check" of the ESXi installation/local HD to make sure that it is OK?
By the way, I enabled SSH login.
Thank you
Guido
I had a similar problem before. The problem is that your config should automatically get backed up at 1 minute past the hour, every hour. You more than likely have a corrupt disk partition, meaning the backup cannot function properly, so your changes become session-only changes.
Follow the link below for info on how to resolve the disk issue:
http://www.vm-help.com/esx/esx3i/check_system_partitions.php
Then wait until about 5 past the hour and review logs for any failed backup errors.
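To make the log review a bit less manual, here is a minimal sketch that prints only the syslog lines logged between 1 and 5 minutes past any hour that mention a backup. The function name backup_window_lines is made up for illustration, and the assumption that backup messages contain the word "backup" is only a guess, so adjust the keyword to whatever your host actually logs. It also assumes awk is available; if it is not on the host, copy the log to a workstation first.

```shell
#!/bin/sh
# backup_window_lines LOGFILE   (hypothetical helper name)
# Prints syslog lines stamped HH:01 to HH:05 that contain "backup"
# in any letter case. The keyword is an assumption, not a known
# ESXi log message.
backup_window_lines() {
    awk '
        tolower($0) ~ /backup/ {
            # Field 3 of a syslog line is the HH:MM:SS timestamp.
            split($3, t, ":")
            minute = t[2] + 0
            if (minute >= 1 && minute <= 5) print
        }
    ' "$1"
}
```

For example: backup_window_lines /var/log/messages (assuming that is where your host writes its log).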
Many thanks for the reply and for the useful link to the docs about checking the ESXi file system partitions.
These are my ESXi partitions:
Device                   Boot  Start    End    Blocks  Id  System
/dev/disks/vmhba1:0:0:1            5    750    763904   5  Extended
/dev/disks/vmhba1:0:0:2          751   4845   4193280   6  FAT16
/dev/disks/vmhba1:0:0:3         4846  69376  66079744  fb  VMFS
/dev/disks/vmhba1:0:0:4  *         1      4      4080   4  FAT16 <32M
/dev/disks/vmhba1:0:0:5            5     52     49136   6  FAT16
/dev/disks/vmhba1:0:0:6           53    100     49136   6  FAT16
/dev/disks/vmhba1:0:0:7          101    210    112624  fc  VMKcore
/dev/disks/vmhba1:0:0:8          211    750    552944   6  FAT16
Partition table entries are not in disk order
~ # esxcfg-vmhbadevs -f
vmhba1:0:0:8 /vmfs/devices/disks/vmhba1:0:0:8 a79407ec-71c546c0-1368-0fc9b0ac7595
vmhba1:0:0:6 /vmfs/devices/disks/vmhba1:0:0:6 5bc62b73-7c202adb-f01f-97b43777d751
vmhba1:0:0:5 /vmfs/devices/disks/vmhba1:0:0:5 3a02cc70-99a3b655-3dd7-64ab30093543
vmhba1:0:0:2 /vmfs/devices/disks/vmhba1:0:0:2 49e5e6f3-43b0a7f7-c3f6-002219a79228
~ # ls -l | grep vmfs
l-- 0 root root 1984 Jan 1 1970 altbootbank -> /vmfs/volumes/5bc62b73-7c202adb-f01f-97b43777d751
l 0 root root 1984 Jan 1 1970 bootbank -> /vmfs/volumes/3a02cc70-99a3b655-3dd7-64ab30093543
l-- 0 root root 1984 Jan 1 1970 scratch -> /vmfs/volumes/49e5e6f3-43b0a7f7-c3f6-002219a79228
l 0 root root 1984 Jan 1 1970 store -> /vmfs/volumes/a79407ec-71c546c0-1368-0fc9b0ac7595
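Matching the two listings above by hand is error-prone, so here is a rough sketch that joins them on the volume UUID and prints which partition sits behind each symlink. The function name map_banks and the two file arguments are made up for illustration; it assumes awk is available and that both outputs were saved to text files first.

```shell
#!/bin/sh
# map_banks DEVS_FILE LINKS_FILE   (hypothetical helper name)
# DEVS_FILE:  saved output of "esxcfg-vmhbadevs -f"
# LINKS_FILE: saved output of "ls -l /"
# Prints "<symlink name> <device>" for every symlink whose target
# ends in a volume UUID listed in DEVS_FILE.
map_banks() {
    awk '
        # First file: remember which device carries each volume UUID.
        NR == FNR { dev[$3] = $1; next }
        # Second file: look for "name -> target" symlink entries.
        {
            for (i = 2; i < NF; i++) {
                if ($i == "->") {
                    n = split($(i + 1), parts, "/")
                    uuid = parts[n]   # last path component of the target
                    if (uuid in dev) print $(i - 1), dev[uuid]
                }
            }
        }
    ' "$1" "$2"
}
```

With the two outputs above saved to files, this should report altbootbank on vmhba1:0:0:6, bootbank on vmhba1:0:0:5, scratch on vmhba1:0:0:2 and store on vmhba1:0:0:8.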
I ran a dosfsck pass on each of the four partitions that look meaningful to me: one (/scratch) ended with "Success", the others without error.
May I consider them OK?
~ # dosfsck -t -r /dev/disks/vmhba1:0:0:2
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
Seek to 2147491840:Success
~ # dosfsck -t -r /dev/disks/vmhba1:0:0:5
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
/dev/disks/vmhba1:0:0:5: 10 files, 39607/48927 clusters
~ # dosfsck -t -r /dev/disks/vmhba1:0:0:6
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
/dev/disks/vmhba1:0:0:6: 2 files, 1/48927 clusters
~ # dosfsck -t -r /dev/disks/vmhba1:0:0:8
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
/dev/disks/vmhba1:0:0:8: 34 files, 11546/34549 clusters
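One thing that stands out in the runs above: partitions 5, 6 and 8 each end with a "N files, X/Y clusters" summary line, while the run on partition 2 printed only the "Seek to ..." line and no summary. That may mean the check on that partition stopped before it finished, although that is only a reading of the output. As a minimal sketch, the helper below (fsck_completed is a made-up name) reports whether a saved dosfsck output contains such a summary line.

```shell
#!/bin/sh
# fsck_completed   (hypothetical helper name)
# Reads dosfsck output on stdin. Exit status 0 if it contains a
# "<device>: N files, X/Y clusters" summary line, non-zero otherwise.
# The absence of a summary is treated as "did not finish", which is
# an assumption based only on the outputs shown in this thread.
fsck_completed() {
    grep -Eq ': [0-9]+ files, [0-9]+/[0-9]+ clusters'
}
```

Usage, with each run's output saved to a file first: fsck_completed < part2.txt && echo "summary found" || echo "no summary line"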
Which logfile should I check for errors at 5 minutes past the hour?
In /var/log/messages I can't find anything meaningful:
Sep 15 14:04:19 sfcb[57902]: storelib Physical Device Device ID : 0x2
last message repeated 1 times
Sep 15 14:04:43 vmkernel: 0:04:07:52.202 cpu4:1977)WARNING: UserSocketInet: 588: waiters list not empty!
Sep 15 14:04:43 Hostd: Activation : Invoke done on
Sep 15 14:04:43 Hostd: Throw vmodl.fault.RequestCanceled
Sep 15 14:04:43 Hostd: Result:
Sep 15 14:04:43 Hostd: (vmodl.fault.RequestCanceled) { dynamicType = <unset>, msg = "" }
Sep 15 14:04:43 Hostd:
Sep 15 14:04:43 Hostd: Failed to send response to the client: Broken pipe
Sep 15 14:05:21 sfcb[2654]: storelib Physical Device Device ID : 0x2
last message repeated 11 times
Sep 15 14:05:51 sfcb[58229]: storelib Physical Device Device ID : 0x2
last message repeated 3 times
Sep 15 14:05:51 sfcb[58234]: storelib Physical Device Device ID : 0x2