We are currently running HA and DRS, and we have configured the HA Isolation Response to Leave Powered On. This places a lock file on the VMs running on the host; if the host fails hard down, HA will be unable to move these machines to another available host in the cluster. We considered changing this setting to Shut Down, which would gracefully shut down the VMs and release the lock file, allowing another host to boot the VMs.
We prefer to keep the setting at Leave Powered On. My question is: if we rebuild the failed machine on another host under the same name, would HA allow us to start the failed VMs on the new host?
Hi Mesteller,
Isolation and host failure are two different problems.
You need to decide what to do in case one of your hosts becomes isolated. If you are absolutely sure that the VM network will still be available, then you can decide to keep your applications up, as they can still communicate with their clients.
If a host fails, even if it uses lock files, other hosts will be able to restart the failed VMs. The other hosts monitor the file locks. If the locks are not refreshed within a short timeout, they consider the host failed and try to restart the VMs.
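The stale-lock idea described above can be sketched roughly as follows. This is only an illustrative model, not how ESX or HA is actually implemented: the `LockRecord` type, its fields, and the 15-second timeout are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class LockRecord:
    """Hypothetical view of a VM lock as seen by another host."""
    owner: str             # name of the host holding the lock
    last_heartbeat: float  # time the owner last refreshed the lock (seconds)


def is_stale(lock: LockRecord, now: float, timeout_s: float = 15.0) -> bool:
    """A lock is stale if its owner has not refreshed it within timeout_s.

    The 15 s default is an assumed value for illustration only.
    """
    return (now - lock.last_heartbeat) > timeout_s


def vms_to_restart(locks: dict, now: float, timeout_s: float = 15.0) -> list:
    """Return the names of VMs whose locks are stale, sorted alphabetically."""
    return sorted(vm for vm, lock in locks.items() if is_stale(lock, now, timeout_s))
```

With this model, a VM whose owning host stopped refreshing its lock 60 seconds ago would be a restart candidate, while one refreshed 5 seconds ago would be left alone.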
You can't have two VMs with the same name in a vCenter.
Good luck.
Regards
Franck
Hi
So the definition of isolation is being unable to communicate with the cluster, meaning a loss of the network? We recently had a hard drive failure that caused the Linux (VMware) kernel to stop working; all the VMs were still running on that host. To recover, we replaced the drive and rebooted. We attempted to migrate the VMs to another host and were unable to because of the lock file. If I am understanding this correctly, if we had disabled the host's network connections, would HA have migrated the VMs?
As for having two hosts with the same name: we would remove the host from inventory and then rebuild. After thinking about this strategy, it isn't a very good one; it would take longer to rebuild than to have the isolation setting changed to Shut Down.
Neal Mesteller
Sr Analyst, Distributed Systems LAN/PC
neal.mesteller@kennametal.com
T 724-539-5341
M 724-331-5990
F 724-539-5031
Kennametal Inc. | 1600 Technology Way | Latrobe, PA 15650 | www.kennametal.com
From: FranckRookie <communities-emailer@vmware.com>
To: <neal.mesteller@kennametal.com>
Date: 10/27/2010 03:45 PM
Isolation means that a host is unable to communicate with the other members of the cluster through the admin (management) network. The problem you had is different and very annoying.
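To make the distinction between the cases in this thread concrete, here is a minimal sketch. The function name, its parameters, and the state labels are hypothetical and chosen only for illustration; they do not correspond to any VMware API.

```python
def classify_host(host_running: bool,
                  mgmt_network_reachable: bool,
                  service_console_responsive: bool) -> str:
    """Classify a host using the three cases discussed in this thread.

    All inputs and labels are illustrative assumptions, not HA internals.
    """
    if not host_running:
        # Hard failure: other hosts can restart the VMs.
        return "failed"
    if not mgmt_network_reachable:
        # Host is up but cut off from the cluster: isolation response applies.
        return "isolated"
    if not service_console_responsive:
        # Host and network are up, but management is hung (the drive-failure case).
        return "management_hung"
    return "healthy"
```

The third case is the one described in the original problem: the VMs keep running and the host is neither failed nor isolated, so neither the failure restart nor the isolation response applies.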
The best solution would have been to move your running VMs to another host, either manually one by one or by asking the ESX host to enter maintenance mode. But if the service console has crashed, there is a good chance that it will not accept the move request or react properly to an isolation event. So you have two possibilities:
- Stop your VMs from the inside with an OS shutdown and then restart them on other hosts. Finally, reboot or repair your host.
- Perform a hard shutdown of your host by unplugging the power cable. HA will treat this as a host failure and restart the VMs on other hosts.
Needless to say, it is always better to close your applications cleanly using the first solution.
Regards
Franck
Hi
We suspect the service console had crashed. We did attempt to enter maintenance mode but could not; I believe the option was greyed out. We attempted to gracefully shut down the virtual machines running on that host and then boot them on a different host by removing them from inventory and re-adding them. This still did not release the lock file and left the VMs in an unbootable state. Having the isolation response set to Leave Powered On and then forcing the host to fail by pulling the power would cause the running VMs to power crash, and they would still not be restarted by HA. According to VMware support, we had to gracefully shut down our virtual machines, replace the failed drive, then reboot the host.
Our network is set up for HA: we have management, vMotion, and production networks.
Reviewing these settings, it appears there isn't a good option:
- Power off (power crash): performs a virtual power-down that crashes the VM's OS. This is hard on the VM, but it allows the VM to be booted on another host.
- Leave powered on: the cluster can no longer monitor the host or the VMs, and because of the lock file issues the VMs cannot be brought up on another host.
- Shut down: if a VM fails to power down there is no way to force the shutdown; if the shutdown succeeds, HA will work.
What would be the best setting to use to minimize outages to our VMs? It appears that Leave Powered On is the best option.
Neal Mesteller
From: FranckRookie <communities-emailer@vmware.com>
To: <neal.mesteller@kennametal.com>
Date: 10/28/2010 08:46 AM
Hi,
Sorry if this causes more confusion than helps.
I note that you tried to gracefully shut down virtual machines by removing them from the inventory. This does not shut down virtual machines. With an isolated host, the only ways to do this are as follows:
- via some sort of remote tools to the virtual machines, i.e. RDC if Windows servers (this assumes the virtual machine network is still up and running)
- via the Service Console (assumes you have console access if the Service console network is down)
- physically power the host off
Obviously the preference is to gracefully power the virtual machines off. However, this is not always possible, and if the virtual machines are still running but not contactable, the business will often dictate the nasty power-off. Having said that, most OSes these days can recover from a power-off after rolling some logs back or running a chkdsk or two.
With regard to the best option for Host Isolation, the answer really depends on your environment. Personally, I tend to prefer the Leave VMs Powered On option, as this allows me to determine whether I actually have an isolated host that affects my virtual machines, or just a host that has no access to the Service Console. If the host actually goes belly up, this option will still allow the virtual machines to start on another host. For example, let's say the vSwitch that hosts the Service Console loses its connection to the network (e.g. both NICs die) but the vSwitch the virtual machines run on is fine. Do you really want the virtual machines restarting during the day, or would you rather control this and take the host down after hours?
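The reasoning in this thread can be summarised as a small decision helper. This is a sketch of the rule of thumb discussed here, not an official recommendation: the function name, the input flag, and the choice of "shut_down" as the fallback are assumptions made for illustration.

```python
def suggest_isolation_response(vm_network_survives_isolation: bool) -> str:
    """Suggest an isolation response from one (hypothetical) input.

    vm_network_survives_isolation: True if the VM network is expected to stay
    up when the Service Console network is lost, e.g. because the two are on
    separate vSwitches with separate NICs.
    """
    if vm_network_survives_isolation:
        # VMs can still serve clients, so keep them running and deal with
        # the host at a time of your choosing.
        return "leave_powered_on"
    # Assumed fallback: VMs are unreachable anyway, so shut them down
    # cleanly so that another host can restart them.
    return "shut_down"
```

For a setup with separate management, vMotion, and production networks, as described earlier in the thread, the helper would return `"leave_powered_on"`.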
Kind regards.
Hi
Yes, we did try to gracefully shut down from Remote Desktop; sorry I did not make that clear. Anyway, from what I am gathering, it is better to keep the isolation setting at Leave Powered On because of the way we have our network set up for HA.
thanks
Neal Mesteller
From: ThompsG <communities-emailer@vmware.com>
To: <neal.mesteller@kennametal.com>
Date: 10/29/2010 06:54 AM