We are having problems with VMotion, and I was told by VMware that the SVC is the problem and that I would need to contact IBM. After talking to IBM, they sent me the following link.
Has anyone seen this version of ESX 3.0.1 - Build 36662?
I'm not 100% sure which of the patches brings your machine to this level, but I applied all 14 patches and my build # is: 37303
Take a look at the patches available for ESX 3.0.1
Patch ESX-2066306 mentions the following VMotion related fixes which could be what you need:
1) Virtual machines experiencing high cpu load during a VMotion migration can hang after the migration is complete.
2) Virtual machines can crash during a NUMA migration due to memory allocation failures.
I had opened a support call when I saw that update on IBM's SVC site, but was told by VMware that build 36662 was an internal development build and that they wouldn't give out a copy. What code release are you running on your SVC? Please post if the patches help with the VMotion issue. We are very anxious to upgrade but need to wait until there is some kind of official support. We are running SVC 126.96.36.199 with 13 ESX 2.5.3 hosts without issues.
Here's an update on my problem. I reinstalled the two hosts running 3.0.1, build 32039, on 03/22/2007 and have not had any problems since. I created a cluster with HA and DRS enabled, and both are working great.
Now! If we can get VMware to support IBM's SVC, we will be in business. I have upper management coming down on me about looking at a different product. I think VMware is the best out there, but I don't get to make the decisions.
If you contact your IBM support contact, you can request an 'RPQ' for this build. We were able to get it a few weeks ago. They will only support Windows 2003 guest OS at this point, but I'm told that they will be certifying more guest OS versions shortly and that the fixes in build 36662 should be included in the next ESX release, sometime near the end of summer or in the fall.
This sounds good but does nothing for us. We are running a few Linux Servers as well as Windows 2000/2003/R2 Servers. Also, if VMware releases a patch we will have to go through IBM to get the patch.
Some unofficial timeframes that I have gotten regarding OSes other than 2003:
-RHEL 2.1, 3
-SLES 8, 9
Hope this helps.
Let's all complain to IBM to get off its collective a$$ and get this solved. I'm extremely concerned about the support the SVC will get with regards to ESX. If it takes 16 months (3.0.0 released June '06) to get standard support for all OSes (RHEL 4 - Q3 '07), I think I'll recommend dropping IBM (we are a large IBM customer). It's just way too much of a pain to deal with.
It's simply ridiculous that this is an issue.
I think both IBM and VMware need to communicate just a little better. Both need to work hand in hand to make the customers they have happy.
Our DASD group just updated the SVC to 188.8.131.52, and that's when our problems started with 3.0.1. We have some hosts running 2.5.3 and have not had any problems with them. I'm not sure why 3.0.1 would have problems and 2.5.3 would not, seeing that they both use the same storage/SVC.
But like I said, after reinstalling 3.0.1 I have not seen any problems. It seems to be working better than before, but I'm not going to hold my breath.
In my humble, yet very upset, opinion, I think IBM bears most of the fault for this. The SVC is certified via a self-certification process and IBM can't say that they needed this long to test the application (and guests) against the hardware. Regardless of what is said, I just don't believe any of it. 16 months to get it working? Absolutely unacceptable.
If this was a priority to IBM to support their customers, they would beat on VMware to assist in getting the issues solved. For VMware, they would probably just recommend taking the SVC out of the picture.
Regarding the 2.5 vs 3.0 issue, I guess the standard drivers were new for 3.0 and were the reason for the issues. The RPQ ISO (36662) includes an updated driver, as I've been told.
Also, SVC support goes GA in the 3.1 release, which I think is the same release that supports XP (Q3). That is, of course, until other minor releases are posted. Then I'm sure it'll take 9 months to get those approved for SVC deployment.
OK, I'll stop my ranting now.
Here's an update after reinstalling with build 36662.
I installed this build less than two weeks ago, and now I am getting the same "Operation timed out" error when trying to migrate a VM. This is really frustrating!
Before I reinstalled with the IBM 36662 build, everything was working like a champ.