ShahidSheikh
Enthusiast

Moving VMs from one ESXi server to another - what a pain!!!

This is where I miss my setup with VMware Server. I do not have iSCSI or NFS central storage. Before I started the move to ESXi, all my VMware Server (1.0.7) machines were Ubuntu 7.04 servers, each of which exported its VM directories through NFS and cross-mounted the others'. So on each server I had the directory structure:

/vmfs/svm01

/vmfs/svm02

/vmfs/svm03

/vmfs/svm04

Only one directory was local; the remaining three were NFS mounts from the other three servers. This made moving VMs around a breeze. I could even run vmware-vdiskmanager with the input and output VMs on different servers. The only thing that was slow was provisioning thick vmdks.

Now under ESXi moving VMs is a pain. Apparently scp in ESXi does not support the -C (compression) flag. Using scp, I can either copy thick vmdks from one ESXi host to another as-is, or clone them into thin vmdks first and then scp those. Either way it's a painfully slow process, and scp tops out at about 2.5 MBytes/sec (20 Mbps) on my 1 Gbit NICs (the VM NIC is separate from the management NIC). All servers are either Dell (2850, 2950) or HP 2U (DL380 G4) servers, all with 4 or more U320 SCSI drives in a hardware RAID.
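For reference, the two-step version I'm stuck with looks roughly like this (datastore, VM, and host names are just examples):

# On the source ESXi host's unsupported console: clone the thick disk to
# a thin-provisioned copy first, so there is less data to push.
vmkfstools -i /vmfs/volumes/datastore1/myvm/myvm.vmdk \
    /vmfs/volumes/datastore1/myvm/myvm-thin.vmdk -d thin

# Then push it to the destination host (ESXi's scp has no -C compression).
scp /vmfs/volumes/datastore1/myvm/myvm-thin.vmdk \
    root@esxi02:/vmfs/volumes/datastore1/myvm/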

I was hoping that with the RCLI I would be able to run the vmkfstools.pl script across machines, but alas, it too only runs vmkfstools operations against a single host.
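For example, something like this works, but note the single --server flag: both paths have to live on that one host (the hostname and paths are examples, and I'm assuming the clone (-i) operation is in your RCLI build):

# RCLI from a remote workstation; everything still happens on one host.
vmkfstools.pl --server esxi01.example.com --username root \
    -i /vmfs/volumes/datastore1/myvm/myvm.vmdk \
    /vmfs/volumes/datastore1/myvm/myvm-copy.vmdk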

Right now I am trying to move one of my VMs that has a 120 GB vmdk, and it says it's going to take 15 hours. I can reduce the server downtime significantly (I think) by taking a snapshot, moving the base disk first, then downing the VM and moving just the deltas, but all of this work for something I could do fairly painlessly in VMware Server seems like a lot.
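The plan, sketched out (the VM id, names, and paths below are all hypothetical; vim-cmd lives in the unsupported console):

# 1. Snapshot the running VM so its base -flat.vmdk stops changing
#    (get the VM id from getallvms first; 16 is an example).
vim-cmd vmsvc/getallvms
vim-cmd vmsvc/snapshot.create 16 pre-move "migration snapshot" 0 0

# 2. Copy the now-static base disk while the VM keeps running.
scp /vmfs/volumes/datastore1/bigvm/bigvm-flat.vmdk \
    root@esxi02:/vmfs/volumes/datastore1/bigvm/

# 3. Shut the VM down (guest shutdown needs VMware Tools), then copy the
#    small snapshot deltas plus the descriptor and config files.
vim-cmd vmsvc/power.shutdown 16
scp /vmfs/volumes/datastore1/bigvm/bigvm-00000*.vmdk \
    /vmfs/volumes/datastore1/bigvm/bigvm.vmdk \
    /vmfs/volumes/datastore1/bigvm/bigvm.vmx \
    root@esxi02:/vmfs/volumes/datastore1/bigvm/

After registering the VM on the destination, deleting the snapshot there merges the delta back into the base disk.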

So my question: does anyone have a better way of moving VMs from one ESXi host to another? I will eventually have a centralized NFS store, but that is not going to happen for another two months.

It would be very cool if there were a way to NFS-export the local datastores in ESXi.

ShahidSheikh
Enthusiast

kpc, what NFS device are you using? If there was one thing I was impressed with in ESXi, it was the NFS support and speed. What kind of speeds are you seeing?

kpc
Contributor

Hi Shahid

I was seeing really slow write speeds to my NFS share. However, I've been doing some tests, and it seems that if I export the share with the 'async' option I get decent speeds (it was 'sync' before). Funny how this doesn't affect ESX.
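For anyone else who hits this, the change is one word in the server's /etc/exports (the path and network below are examples):

# /etc/exports on the NFS server; 'async' lets the server acknowledge
# writes before they reach disk (faster, but riskier on power loss).
/export/vmstore  192.168.1.0/24(rw,no_root_squash,async)

# Then re-export without restarting the NFS daemon:
exportfs -ra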

ssapp80
Contributor

When I drop your compiled rsync onto ESXi and try to run it, I get "permission denied". Any particular way I need to place it there?

ShahidSheikh
Enthusiast

Funny how this doesn't affect ESX.

Hi kpc

I'm assuming you are referring to ESX/ESXi reading/writing to NFS vs. tools like vmkfstools reading/writing to NFS through an SSH session on the ESXi host, and not a comparison between ESX and ESXi.

I have not sniffed the traffic between ESXi and the NFS server to see what the difference is between the vmkernel accessing data on NFS and the Linux OS in the management console accessing it. I suspect the caching built into ESX masks a lot of the latency introduced by sync exports. I also suspect the NFS volume is not mounted in two separate places (vmkernel and management console); rather, the vmkernel just exposes its own NFS-mounted volume to the management console through the /vmfs/volumes mount. I don't have an ESXi box in front of me right now to verify.

You know, I have always exported NFS in async mode and never had a problem. But then I've never really experienced a bad power failure either. With all the cache built into RAID controllers these days, it would be interesting to find out how badly an async-exported NFS volume on a server with a good RAID controller suffers data loss/corruption when the power dies. Of course, that would also depend on the write-caching policy of the RAID controller.

The SourceForge documentation on NFS and its performance tuning is a very good place to start. Really well written: http://nfs.sourceforge.net/nfs-howto/ar01s05.html

BThunderW
Contributor

Did you chmod +x?

when I drop your compiled rsync in esxi and try to run it i get "permission denied"..........any particular method in which i need to place it there?
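In other words, something like this (the hostname and paths are just examples):

scp rsync root@esxi01:/bin/rsync   # copy the binary onto the host
chmod +x /bin/rsync                # then make it executable on the ESXi console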

kpc
Contributor

Thanks for the helpful insight, Shahid. Thinking back, you set up an NFS share totally differently on ESXi than on ESX (or at least I did). I'm just glad I'm getting decent speeds now. :)

glim
Contributor

NFS is kind of a kernel-space thing. In a Linux-like OS, NFS is now commonly implemented in kernel space for both the client and the server, and the userland utilities only serve to assist/configure the kernel.

Altering whatever NFS support ESX has would therefore require rebuilding its kernel.

There does exist a very old implementation of a userspace NFS server, but I don't think that would buy you much here.

I think that the main limitation on speed is that the maintenance-VM is bandwidth-limited by ESX itself.

If someone has anything else they'd like built, I can try to take a look at it.

EDIT:

And apologies for forgetting the chmod +x if it caused any confusion...

ssapp80
Contributor

Thanks, BThunderW. That did the trick.

ssapp80
Contributor

BThunderW, I'm using the rsync you compiled and it's working great. I'm sending to a remote rsync daemon and getting 10 MB/s on a 100 Mbit connection. Many thanks!
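For anyone curious, pushing to an rsync daemon from the ESXi console looks roughly like this (the host and module names are examples; 'backups' would be a module defined in the remote /etc/rsyncd.conf):

# Push a VM directory straight to the remote rsyncd; no ssh involved.
rsync -av --progress /vmfs/volumes/datastore1/myvm/ \
    rsync://backuphost/backups/myvm/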

josby
Contributor

Thanks, glim, I did this and it works great. However, the rsync binary in /bin disappears on reboot. I am guessing the root filesystem in ESXi is a ramdisk that gets created from a compressed image file on each boot. Did you experience this as well? Any thoughts on a workaround? I doubt updating the image file ESXi uses to include the binary would be easy.

Oh, wait, there they are, in /vmfs/volumes/Hypervisor1. Just tar files...I guess that wouldn't be that difficult after all.

But that's more than I want to mess with. I ended up making a bin directory in /vmfs/volumes/datastore1 and putting rsync in there so it persists across reboots, then added "--rsync-path=/vmfs/volumes/datastore1/bin/rsync" to the rsync command on the remote system that initiates the copying.
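So the command on the remote system ends up looking something like this (the hostname and destination path are examples):

# Pull from the ESXi host; --rsync-path points at the copy of rsync
# that survives reboots on the ESXi side.
rsync -av --rsync-path=/vmfs/volumes/datastore1/bin/rsync \
    root@esxi01:/vmfs/volumes/datastore1/myvm/ /backups/myvm/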

TechFan
Contributor

Hm, interesting. I tried this out, initiating the rsync from an SSH shell on the ESXi test box. I am still only getting 4 MB/s transfer speed, same as scp. I guess I will have to try initiating it from a remote box. I need to get past this 4 MB/s limit (wherever it is).

So how are you doing this remotely, anyway? Are you mounting the local filesystem, and if so, how?

TechFan
Contributor

Hm. So how were you connecting to the ESX host to run rsync? Through the SSH connection?

wessie
Enthusiast

I haven't had time to look into it, but shouldn't scripting against multiple hosts be a job for the VIMA product?

It allows connecting to multiple hosts, and you can specify the target host per RCLI command.


glim
Contributor

Yes, I did. I am a bit ill at the moment, but yes: use the "-e" option.
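Something like this, with the hostname and paths as examples:

# Run from a remote Linux box; "-e ssh" tunnels rsync over the ESXi
# host's ssh server, so nothing extra needs to listen on the ESXi side.
rsync -av -e ssh root@esxi01:/vmfs/volumes/datastore1/myvm/ /backups/myvm/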


TechFan
Contributor

Thanks, I got it working. It depends on your goals, but I found it is faster to transfer via ssh to localhost with the NFS mount in place. Unfortunately, I think all the data still gets transferred, but I am working against a remote NetApp anyway. At this time I am not trying to sync local ESX resources (in that case going directly to the ESX server would be best), but stuff on a NetApp.

Thanks for responding. I hope you feel better soon.

lamw
Community Manager

Take a look at this script; it can be set up to help you replicate your VMs on both ESX and ESXi servers.

ghettoVCB.sh

I also agree with wessie that this can be easily scripted using VIMA and the RCLI. The problem is that VIMA is extremely slow right now, but if time is not important then you can definitely make a script that does what you need.

TechFan
Contributor

What script? Were you going to put in a link?

Speed is important in that I was hoping to rsync backup copies of my VMs each night to another SAN/NAS-type storage, then offsite if I can get it to send only the changes efficiently.

lamw
Community Manager

I'm sorry, the link must not have made it in when I made the post.

http://communities.vmware.com/docs/DOC-8760

TechFan
Contributor

Thanks. I guess I need to figure out what is really going to work best, since I am trying to pull data off a NetApp that already has snapshots. I'm not sure if using vmkfstools -i is better than an rsync, and things run from the console seem a lot slower than running them from another Linux VM, even.

I was trying to find a solution that would let me back up only the changes, but it looks like no matter what I do it is going to copy the entire VM and its data, since it is coming from the NetApp and I can't run rsync on the NetApp itself. I am trying to accomplish two things: have a backup outside of the NetApp in case of physical failure (speed most important), and have a backup offsite (bandwidth used most important). It looks like I just need the fastest transfer method for #1, and then need to rsync from #1 (instead of from the NetApp) for #2.

Thanks again.

ShahidSheikh
Enthusiast

Think of the console as a VM running on top of the vmkernel with very limited resources. I believe it is intentionally designed that way.

I have no experience with NetApp, so forgive me if this is a given. Is your NetApp device serving NFS?

If it is, and if you are already using snapshots, then why not mount the NFS volumes on a dedicated Linux machine and use rsync there to copy the static snapshots over to another NFS store?
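Something along these lines; the hostnames, export paths, and snapshot name are all made up, and I'm assuming NetApp exposes its read-only snapshots under a .snapshot directory:

# On the dedicated Linux box: mount both filers, then copy a frozen snapshot.
mount -t nfs netapp01:/vol/vmstore /mnt/src
mount -t nfs nas02:/export/backup /mnt/dst
rsync -av /mnt/src/.snapshot/nightly.0/myvm/ /mnt/dst/myvm/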
