We are experiencing disconnections and slowness on VMs in our cluster. When we checked one VM, its host was reporting the error 'Configuration issue, connection timed out'.
We are unable to migrate VMs from that host. After some time that error disappeared, but the problem started growing. One after another, VMs are showing slowness and disconnections. Unfortunately, all of those VMs are critical servers.
If there are several VMs on the same host having the issue, I would think your problem is more host related than VM related. It could be load related. Check the logs in /var/log for errors.
When you say "We are unable to migrate VMs from that host", why? Is there an error?
You also say "After some time that error disappeared". Do you mean the disconnects? If so, again this points to load. Have you looked at the performance tab for the ESX host?
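For the log check suggested in this reply, here is a rough Python sketch that counts suspicious lines per log file. The keyword list and the function names (count_matches, scan_log_dir) are my own assumptions for illustration, not anything VMware ships, so adjust the pattern to whatever your logs actually contain.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed keywords; tune these to the messages you actually see.
PATTERN = re.compile(r"error|fail|timed out|timeout", re.IGNORECASE)


def count_matches(lines):
    """Return how many of the given lines match PATTERN."""
    return sum(1 for line in lines if PATTERN.search(line))


def scan_log_dir(log_dir="/var/log"):
    """Count matching lines in each *.log file directly under log_dir."""
    counts = Counter()
    for path in sorted(Path(log_dir).glob("*.log")):
        try:
            with path.open(errors="replace") as handle:
                counts[path.name] = count_matches(handle)
        except OSError:
            # Skip files that cannot be read (permissions, rotation, etc.).
            continue
    return counts


if __name__ == "__main__":
    # Print the noisiest files first; files with no hits are omitted.
    for name, hits in scan_log_dir().most_common():
        if hits:
            print(f"{name}: {hits} matching lines")
```

A file with far more hits than the others is a reasonable place to start reading; the count alone does not tell you what is wrong.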
Not all VMs are showing issues, only a few. There are no packet drops on any of the ESXi hosts, no performance issues, etc. While migrating from the faulty ESXi host to another host, the error was 'an error occurred while communicating with the host'. We are using 4 ESXi hosts in our cluster, and we are worried about how 3 of them started showing issues together. VMs running on them were showing strange behaviour.
1) On one ESXi host we were unable to run management commands to restart the management services
2) We were unable to migrate VMs between these hosts
3) VMs were disconnecting very frequently
4) 3 VMs became inactive and we were unable to do anything with them, so we removed those VMs from the inventory and re-added them. But then we were unable to power them on. The error was 'operation cannot be completed in current state'. So we tried to manually register them on the host, but that did not work. Not sure why.
5) We also rebooted all these hosts once.
Are you patched up to the latest patches?
Check the hostd and vpxa logs.
View the logs from the VI client or at https://<esxi ip address>/host
Here is how to browse them via the console: http://sparrowangelstechnology.blogspot.com/2012/07/browsing-vmware-logs-in-dcui-view-esxi.html
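Once you have the hostd and vpxa logs in hand, it can help to read them side by side around the time of a disconnect. Below is a minimal sketch, assuming each log line starts with an ISO 8601 timestamp such as 2012-07-10T12:34:56.789Z (check that against your own files first). The function names are hypothetical and the fractional seconds and time zone suffix are ignored.

```python
from datetime import datetime, timedelta


def parse_timestamp(line):
    """Parse a leading YYYY-MM-DDTHH:MM:SS timestamp, or return None.

    Lines without a leading timestamp (for example continuation
    lines) yield None and are skipped by the callers below.
    """
    try:
        return datetime.strptime(line[:19], "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, incident, minutes=5):
    """Return (timestamp, text) pairs within +/- minutes of incident."""
    window = timedelta(minutes=minutes)
    selected = []
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None and abs(ts - incident) <= window:
            selected.append((ts, line.rstrip("\n")))
    return selected


def merged_view(named_logs, incident, minutes=5):
    """Merge several logs into one time-ordered list.

    named_logs maps a label (e.g. "hostd") to an iterable of lines.
    Returns (timestamp, label, text) tuples sorted by time.
    """
    merged = []
    for label, lines in named_logs.items():
        for ts, text in lines_near(lines, incident, minutes):
            merged.append((ts, label, text))
    merged.sort(key=lambda item: (item[0], item[1]))
    return merged
```

You would pass in open file handles, for example {"hostd": open("hostd.log"), "vpxa": open("vpxa.log")}, plus the approximate time the VM dropped off.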
Could it be storage related?
If you are having problems with your storage, it would cause the hosts to have problems.
That would make more sense actually, since you have 3 hosts acting weird like this.
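One rough way to test the shared-storage theory is to see whether storage-looking messages turn up in the logs of several hosts rather than just one. This is only a sketch: the phrases in the pattern are assumptions about what such messages might contain, the host labels are whatever you choose, and the function names are hypothetical.

```python
import re

# Assumed phrases; the exact wording in your logs may differ.
STORAGE_HINTS = re.compile(
    r"latency|lost access|scsi|all paths down", re.IGNORECASE
)


def storage_hits_by_host(host_logs):
    """Count storage-looking lines per host.

    host_logs maps a host label to an iterable of log lines,
    for example lines read from a copy of that host's kernel log.
    """
    return {
        host: sum(1 for line in lines if STORAGE_HINTS.search(line))
        for host, lines in host_logs.items()
    }


def hosts_with_storage_hits(host_logs, threshold=1):
    """Return the sorted labels of hosts with at least threshold hits."""
    counts = storage_hits_by_host(host_logs)
    return sorted(host for host, n in counts.items() if n >= threshold)
```

If the three misbehaving hosts all show hits and the healthy one does not, that points toward the storage they share; if only one host shows hits, the problem is more likely local to it.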