6 Replies Latest reply on May 25, 2007 1:17 PM by Rumple

    Horrid performance with Cold migration ESX 3.0.1

    taylorb Expert

      I am hoping someone has some suggestions for me. I am trying to do some cold migrations and I am getting 11GB-per-hour transfer rates between my two ESX hosts, local SCSI to local SCSI. I get the same performance whether I use the migrate function from VC or just SCP the files from the command line. If I SCP from either of my 3.0.1 hosts to another box, I get 50+ GB/hr. If I SCP from another Linux box TO either of my 3.0.1 boxes I get even worse performance, around 6GB per hour. So it appears that both boxes perform horribly on the receiving end, but both can send data at a reasonable rate. What could be the issue here? I am using a RAID 5 set with 4x146GB 10k SAS drives, which should be reasonably fast for plain network file transfers. The network doesn't seem to be an issue because I can send files fast enough.

        • 1. Re: Horrid performance with Cold migration ESX 3.0.1
          oreeh Guru

          This is a known problem with ESX 3.

          Writes to the VMFS are slow when done from the console.


          Try exporting the VMDK to a local ext3 partition, then copy it to the other ESX3 host and import it there.
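          The export/import could be sketched roughly like this from the service console (a sketch only -- the datastore names and paths below are made-up examples, and vmkfstools options differ between ESX releases, so check `vmkfstools --help` on your build first):

```sh
# Hypothetical paths throughout -- adjust for your environment.
# Export: clone the disk out of VMFS onto a local ext3 partition.
vmkfstools -i /vmfs/volumes/datastore1/myvm/myvm.vmdk /mnt/ext3/myvm.vmdk

# Copy the exported file between hosts. The caveat about transferring
# VMDKs that live on VMFS via scp doesn't apply here, since this copy
# sits on ext3.
scp /mnt/ext3/myvm.vmdk root@esx2:/mnt/ext3/

# Import: on the destination host, clone it back into VMFS.
vmkfstools -i /mnt/ext3/myvm.vmdk /vmfs/volumes/datastore1/myvm/myvm.vmdk
```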


          FYI: never transfer VMDK files from/to VMFS using cp, scp, ftp, etc.

          Your files may get fragmented or even (rarely) corrupted.

          • 2. Re: Horrid performance with Cold migration ESX 3.0.1
            taylorb Expert

            Update: So I tested an SCP between the two hosts going from SCSI on the first box to a SAN LUN on the second box. The performance is very good (~100GB/hr). This would lead me to believe that the performance issue is with writes on my local SCSI RAID 5 sets. What would be causing such a wide disparity between read and write performance on the local disk? I still think I should be getting close to 50GB per hour on them, read or write.
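            One quick way to confirm whether local writes are the bottleneck is a raw sequential-write test with dd from the service console (a sketch; the path is a placeholder -- point it at a file on the local RAID 5 datastore you want to measure):

```shell
# Crude sequential-write benchmark. conv=fsync flushes the data to disk
# so the elapsed time reflects real write throughput, not page-cache speed.
TESTFILE=/tmp/write-test.bin   # e.g. /vmfs/volumes/local-storage/write-test.bin
time dd if=/dev/zero of="$TESTFILE" bs=1M count=100 conv=fsync
rm -f "$TESTFILE"
```

            As a reference point, 100MB in 10 seconds is 10MB/s, or about 36GB/hr -- in the range being discussed here.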

            • 3. Re: Horrid performance with Cold migration ESX 3.0.1
              taylorb Expert

               This is a known problem with ESX 3. Writes to the VMFS are slow when done from the console.

               Try exporting the VMDK to a local ext3 partition, then copy it to the other ESX3 host and import it there.

               FYI: never transfer VMDK files from/to VMFS using cp, scp, ftp, etc. Your files may get fragmented or even (rarely) corrupted.


               I am not specifically trying to SCP the files; I am just using that as a test. I get similar performance with SCP or cold migrate, so I am assuming that if I can get SCP to go fast, a cold migrate will too.

               If the performance issue is with copying from the command line, then why do I get such good performance when I go from SCSI to SAN, but not SAN to SCSI?

              • 4. Re: Horrid performance with Cold migration ESX 3.0.1
                oreeh Guru

                 "I get similar performance with SCP or cold migrate so I am assuming if I can get SCP to go fast then a cold migrate will too."

                 Yes.

                 I assume (only the engineers know) that access to FC/iSCSI VMFS is handled a little differently than access to local VMFS.

                • 5. Re: Horrid performance with Cold migration ESX 3.0.1
                  taylorb Expert

                   Just wanted to follow up with the real solution to this. It turns out our write cache was disabled because our servers shipped without battery-backed cache cards. I bought a battery, turned on the write cache, and my cold migrations went from 1 hour for a 12GB VM to 5 minutes. So a 12x improvement! Sounds like write cache is a necessity.
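                   Checking the arithmetic on those numbers (a quick sanity check, nothing more):

```shell
# 12GB in 60 minutes without write cache vs. 12GB in 5 minutes with it.
echo "before: $(( 12 * 60 / 60 )) GB/hr"   # 12 GB/hr
echo "after:  $(( 12 * 60 / 5 )) GB/hr"    # 144 GB/hr
echo "speedup: $(( 60 / 5 ))x"             # 12x faster
```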

                  • 6. Re: Horrid performance with Cold migration ESX 3.0.1
                    Rumple Master

                     I've also tested local RAID without battery backup, and it would take 22 minutes to create a 1GB file vs. 57 seconds with battery backup and write-through enabled.