I've been thinking a lot about this, but I've never tested it in real life.
Anyway, if VMsrv1 goes down in an uncontrolled fashion, the VM state will be lost, naturally, since the VM is running in the crashing host's memory. To have any chance of continuing, you'd need to snapshot the VM before the crash, much like vMotion does.
But as soon as you have a consistent VM on your iSCSI storage, you *should* be able to start it up from a snapshotted state. In fact, I'm running a test on this while typing.
The iSCSI handling is of course a different issue, as VMware ignores the technology behind the host, so how you handle storage is out of my hands. Anyway, I'm pursuing the same ideas here, but for now I'm moving VMs over Samba for testing only.
Any word on your implementation?
For anyone that's tested or knows:
Is it possible to set the VM to use physical disk space instead of actual memory (at a cost in performance)? And if the system's power is pulled, could the VM then be brought up "running" in the same state on another machine? Sure, there would be some issues regarding data validity during the downtime, but we don't want the OS to run through file system checks or reboot.
I'm sure there has to be a poor-man's solution for getting some sort of high availability with VMware Server...
However, what I am not sure of is how VMware Server behaves. If the primary VMware Server goes down and the secondary kicks up, will the virtual machines be in the SAME STATE they were in (powered up, still running)? Or will they be powered off, resulting in file system checks when powered up? Or will VMware Server think that the volume simply disconnected, and reconnect it and continue running?
We know that this can be accomplished with ESX.
First, the VMware guests are executing on the physical host system's CPU(s): if that host goes down, so does the VM.
Putting all the data out on a network storage device at least gives you the possibility of starting that guest up again quickly on a new host system. But depending on what was happening in the guest at the time of the crash, you may or may not have to do some sort of fsck etc. in the guest itself.
You could leave the disk-backed memory feature enabled at the expense of overall performance if you were especially paranoid, and it's possible it *MIGHT* make it easier to recover in the event of a crash/startup, but it's pretty unlikely the guest would ever just resume cleanly.
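For reference, the "disk backed memory" feature being discussed corresponds to a .vmx setting. Assuming a VMware Server 1.x-era config, something like the following keeps guest memory backed by a named .vmem file on disk rather than anonymous host memory; treat the exact option name as a sketch, since it can vary by version:

```
# Back guest RAM with a named .vmem file in the VM's directory
# (slower, but the memory image lives on the same storage as the VM)
mainMem.useNamedFile = "TRUE"
```

As the poster says, even with the memory image on shared storage, a crashed host almost certainly leaves it in a state the guest cannot cleanly resume from.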
If you are trying to build a poor man's ESX, consider what your own time is worth. ESX accomplishes its vMotion magic with many features/capabilities not inherent in your typical Linux distro, e.g. a clustered filesystem. If it were easy to do, you would have seen offerings from the other virtualization vendors, which are trying to make a business out of this stuff, and none of them have anything production-grade going yet.
There is no cheap way to get perfect availability.
Alternatively, you may want to consider some sort of block-level data replication that makes copies of the source VM data to another host (and ideally episodic snapshots to facilitate rollback, should there be a data corruption situation in the source VM dataset). That technology is less expensive than ESX and gives a better offsite DR story anyway. It's very feasible that you could keep a remote copy of the VM data in sync to within minutes of the source, which could be quickly brought online.
Even with ESX, if you lose a physical server the VMs go down. If you have Virtual Center it will start the VMs back up on another running physical server, but the VMs will not be in the same state as they were when the physical server went down.
I'm interested in looking at this sort of setup, too. First, there's no way to keep the guests running if a physical server crashes - ESX doesn't even do this - it just restarts the guests on another physical server automatically. I think this could probably be accomplished on VMware Server on Linux, as well.
I'm running three development servers right now with an iSCSI disk share using the OCFS2 filesystem to share the VMs between the hosts. In my experience, heartbeat isn't that easy to configure, but I think it would probably be possible to configure heartbeat between the servers so that, if one server goes down, the others will automatically take over the services and start back up the VMs that went down.
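As a rough illustration of the heartbeat idea, a v1-style config could look like the fragment below. The node names, the floating IP, and the `startvms` resource script are all hypothetical placeholders, not a tested setup:

```
# /etc/ha.d/haresources (identical on both nodes):
# node1 normally owns the floating IP and the VM-start script;
# if heartbeat declares node1 dead, node2 takes both over.
node1 IPaddr::192.168.10.50 startvms
```

The `startvms` script would be where the lock-file cleanup and VM power-on happens; getting that part right is the hard bit, not the heartbeat config itself.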
One of the catches is that, in order for the VMs to come back properly, all of the network settings on the hosts must be identical (at least for bridged networks). I have about 6 or 7 VLANs connected to various bridged interfaces, and I need to make sure that they're set up identically on all of my hosts.
Anyway, I'm interested to see if something can be done that would make this work at least halfway decently.
> alternatively you may want to consider using some sort of block-level data replication that is making copies of the source vm data to another host (and ideally episodic snapshots to facilitate rollback should there be data corruption situation in the source vm dataset). that technology is less expensive than ESX and gives a better offsite DR story anyway. its very feasible you could keep a remote copy of the vm data in sync to within minutes of the source which could be quickly brought online
I've thought about that, but then, even if we rely on the underlying data location (iSCSI or some other storage) to perform block-level replication that's transparent to anything accessing it, we would still be in the same mess, right?
Would it be any different than having both the memory/disk caching disabled and having everything out on a networked storage - then restarting those?
I don't mind doing that for backup purposes (our iSCSI solution is mirrored, and we can also set it to snapshot).
The biggest problem I have is trying to figure out what actually goes on with the virtual machine when:
1 - One host goes down, and another fires up the same files (after deleting the WRITELOCK and VMEM files if needed). The system would think it was hard powered off, right? However, if we disable any caching, it should simply perform a check and we should be good to go, minus any transactions that failed at the time of the failure?
2 - If we rely on snapshotting at the block level, wouldn't it be impossible to restore that snapshot while the VM is running? We would need to power it off, restore, and then boot up, right? I know there are probably specific methods of restoring snapshots, but I just can't grasp restoring a snapshot without having to power down the VM. And once the snapshot is restored, the VM is not in a running state, so we're back to the first issue: file system checks, and since it was snapshotted while running, I don't know how it would react.
This is what I've been thinking of building; it's just one of those things that takes time, and I might need help...
One server = Manager (can be any spec, all it will do is monitor the state of VM hosts)
VMHosts = all VMware Server hosts, any number...
On the VMHosts (based on Linux for this example):
Connect each to the same repository containing Virtual Machines (NAS, iSCSI, FC, etc...)
Each VMHost has a script running every minute (cron) which will gather the current state of each running VM (i.e. CPU usage, Mem usage at the host level and at the VM level)
On the Manager (based on Linux for this example):
Manager performs basic ping checks on each host every minute (or other method to verify host is still alive on the network)
Manager receives reports from each VMHost every minute.
Manager can monitor CPU/Memory usage per host, if it feels like either is excessive, it will power down the VM and boot it on another host with less used resources
Manager monitors each VMHost; if it doesn't hear from a host, it can remove the WRITELOCK/VMEM files and tell another host to boot the affected VMs
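The failover step in that last point could be sketched in shell roughly as below. The repository path, hostnames, lock-file names, and the `vmware-cmd` invocation are all assumptions about a particular setup, not a tested recipe:

```shell
#!/bin/sh
# Sketch of the manager's failover path: a VMHost stops answering,
# so clear its stale lock files on shared storage and power the VM
# up on a surviving host.

VMDIR=/mnt/iscsi/vms        # shared VM repository (hypothetical path)

# Remove stale lock/memory files left behind by a dead host so
# another host is allowed to take the VM over.
clean_locks() {
    vm_dir=$1
    rm -rf "$vm_dir"/*.WRITELOCK
    rm -f  "$vm_dir"/*.vmem
}

# Called for each VM that was running on the dead host.
failover_vm() {
    vm_dir=$1
    clean_locks "$vm_dir"
    # vmware-cmd ships with VMware Server; "start" powers the VM on.
    vmware-cmd "$vm_dir"/*.vmx start
}

# Example trigger: if vmhost2 misses three pings, fail its VM over.
if [ "${1:-}" = "run" ]; then
    if ! ping -c 3 -W 2 vmhost2 >/dev/null 2>&1; then
        failover_vm "$VMDIR/webserver01"
    fi
fi
```

The dangerous part is deciding a host is really dead: removing the lock files while the "dead" host is still writing to the VM would corrupt it, which is exactly why real clusters add fencing on top of this.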
Technically, this should be VERY EASY to do... However, I just don't know how to get the VMHost to report back the CPU/Mem usage of each VM... I guess I could just find the PID of the running machine and grep it out of top or ps or something?
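The ps-based approach suggested there is workable. A minimal sketch of the per-host reporting cron job might look like this; the guest process name `vmware-vmx` and the `manager` hostname are assumptions about the environment:

```shell
#!/bin/sh
# Sketch of the VMHost reporting job: pull %CPU / %MEM for the
# running guest processes out of ps and ship the totals to the
# manager once a minute from cron.

# Sum the %CPU and %MEM columns of "pcpu pmem" lines on stdin.
sum_usage() {
    awk '{ cpu += $1; mem += $2 } END { printf "%.1f %.1f\n", cpu, mem }'
}

# VMware Server runs one vmware-vmx process per powered-on guest;
# ps -C selects them by name, -o prints just the two columns.
report() {
    ps -C vmware-vmx -o pcpu=,pmem= | sum_usage
}

if [ "${1:-}" = "run" ]; then
    # Ship the numbers to the manager, e.g. over ssh (hypothetical host).
    report | ssh manager "cat >> /var/spool/vmreports/$(hostname)"
fi
```

Per-VM (rather than per-host) numbers would just mean skipping the summing step and keying each ps line by its PID or .vmx path.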
What would be nice is to figure out how to move the process. I know it can be accomplished with Xen and other apps. That way, the machine doesn't need to power off (if moved due to high resource usage on the host); it simply has its process transferred.
I'm desperately seeking an affordable snapshot & host-based data replication solution for Linux, but this doesn't seem to exist.
Real-time replication costs too much performance, and asynchronous (non-snapshot-based) replication seems too unsafe to me. Nor do I want to have centralized storage.
Does anybody know how to do asynchronous, snapshot-based transfer of VMware Server VMs?
Has anyone ever tried to make a diskless VMHost?
PXE boot an image to run VMware?
I've been looking at what to do, and I'm going to try the following:
PXE Server to PXE boot VMHost nodes (so they are diskless)
They are set to connect to iSCSI host
Manager then delegates VMs to each node
If node dies, VMs are restarted on an available node
If the manager notices excessive resource usage, it will power down VMs (until usage is normal) and restart them on a node with less used resources
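For the PXE part of the plan above, a minimal pxelinux config could look like the fragment below. The kernel/initrd names, the NFS server address, and the export path are placeholders; any diskless-boot method (NFS root, initramfs with an iSCSI initiator, etc.) would do:

```
# /tftpboot/pxelinux.cfg/default
# Boot every node into the same diskless VMHost image over NFS.
DEFAULT vmhost
LABEL vmhost
  KERNEL vmlinuz-vmhost
  APPEND initrd=initrd-vmhost.img root=/dev/nfs nfsroot=10.0.0.1:/exports/vmhost ip=dhcp
```

Since every node boots the identical image, the per-node differences (hostname, bridge/VLAN config) have to come from DHCP or a post-boot script.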
It doesn't seem too difficult at all, now that I look at the steps of what is going on.
I just want to know if anyone has ever made a diskless VMHost. Or any other method to get a node without drives to run VMware Server?
> anybody knows better how to asynchronously doing snapshot-based transfer of VMware Server VMs?
If you're hosting your VMs on LVM, rumour has it that yabus (http://eric.windisch.us/software/yabus/yabus) may soon be usable for such VMware tasks (in version 2.1). But you're right that things won't be as good as VCB on ESX.
What do you mean by "host-based data replication solution?" I'm unfamiliar with that term and just looking for some clarification on exactly what you're looking for. Linux has LVM2, which allows for snapshots, and DRBD, which can replicate block devices over the network to another block device. You might be able to combine the two by taking a snapshot, then writing out a simple config file for DRBD that would start DRBD on that snapshot and replicate it somewhere else. DRBD is also tunable for resource utilization while it's running, so you can throttle it so that it won't eat your entire machine while replicating. Not sure if that's what you're looking for, but I thought I'd mention it.
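A simpler variant of the same snapshot-then-replicate idea, with the DRBD step swapped for a plain dd over ssh: take an LVM snapshot so the copy is point-in-time consistent while the guest keeps running, ship it, drop it. The volume group, LV, snapshot size, and standby hostname are all hypothetical:

```shell
#!/bin/sh
# Sketch only: periodic snapshot-based shipping of a VM's disk LV.
# A real setup would replicate deltas (e.g. via DRBD) instead of
# re-sending the whole device each time.

# Build a timestamped snapshot name for a given LV.
snap_name() {
    echo "$1-snap-$(date +%Y%m%d%H%M)"
}

if [ "${1:-}" = "run" ]; then
    lv=vm0                      # LV holding the guest's disk (hypothetical)
    snap=$(snap_name "$lv")
    # Freeze a point-in-time copy while the guest keeps running.
    lvcreate -s -L 2G -n "$snap" "/dev/vg0/$lv"
    # Ship the snapshot to the standby host, then drop it.
    dd if="/dev/vg0/$snap" bs=1M | ssh standby "dd of=/dev/vg0/$lv bs=1M"
    lvremove -f "/dev/vg0/$snap"
fi
```

Note this gives crash-consistency only: the guest on the standby side will look as if it was hard powered off at the moment of the snapshot, which circles back to the fsck discussion earlier in the thread.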
Also, why don't you want centralized storage? I can understand if cost is a factor - it can get very expensive very fast, especially if you're looking for a lot of redundancy, but is there some other reason besides cost?
I've played with this in my free time - haven't made it very far with it. You can certainly use iSCSI on the host to get the VM's disk, then have the VM use that disk, but that doesn't require PXE boot and still requires that the host do the iSCSI->VMware Disk translation.
I've started to play with loading an iSCSI module into the Linux initrd to get Linux to boot natively over iSCSI in a VM. There are a few sites out there that give you instructions on how to do it - Google it and you'll find them. I haven't had much success, yet, but I also haven't spent a lot of time trying to track the problems down.
Diskless hosts are relatively easy to do, assuming your central storage or O/S repository is available. We're starting to boot our servers over our SAN - we install HBAs (iSCSI or FC) and then run an O/S installation onto the SAN volume presented to the host. Our SAN also features the ability to create a "master" O/S image, then create snapshots and present those to each of the servers, so we only have to do the O/S install once - the rest is just reconfiguration.
I was just trying to see if I could duplicate the features and functionality of VirtualIron's system which does all of this, but is based on Xen.
In any case, PXE booting (diskless nodes) isn't that big of a deal, so we can install small drives in each system and have them connect over the iSCSI network for the images.
But, I'm still toying with the Manager/Host model I described above... I think it has a chance of working.
Mgr - monitor hosts, delegate VMs to Hosts, and so on
Hosts - report to Mgr, run the VMs
I've been playing with commands and trying to make some scripts that can do the reporting part... All the hosts need to do is send numbers to the Mgr... The Manager can analyze them and then decide what to do.
>What do you mean by "host-based data replication solution?"
I mean replication from one host to another host, without anything centralized or "in the middle".
> I'm unfamiliar with that term and just looking for some clarification on what exactly you're looking for.
I'm not sure if that's absolutely the right term to use, but I think I have read articles calling "host to host" replication this way.
OK, I know of LVM2 and DRBD, but LVM gives a significant performance loss with snapshots, and I didn't find a dedicated reference for a combined setup with DRBD, so it's sort of hackerish to work this out for myself.
>Also, why don't you want centralized storage?
I don't want it because of cost and because it's a single point of failure (OK, you can make it redundant, but that's another significant cost factor).
In theory, some "host to host" replication should be doable. For example, esxReplicator from visioncore looks promising, but that's ESX-only, and expensive, too.