As far as I can tell, there is a bug in 3.0.2 that started with the 11/15/07 patch set. Whenever I remove an ESX server from VirtualCenter, rebuild it with the same name, and rejoin it to VirtualCenter, VMotioning a running VM to the rebuilt server fails with a timeout. /var/log/vmware/vpx/vpxa.log on that server indicates that "the object doesn't exist and has never existed".
To resolve this issue, I have to migrate a VM that is powered off or suspended (technically this is a relocation, not a migration). Once that VM has been moved to the rebuilt ESX server, I power it on and migrate it back to the server it was originally on, and then everything seems to work as normal.
I have no problems if I use 3.0.2 Update 1 on its own, but with any of the patches from 11/15/07 through 03/06/08 applied, I get the error. Not sure about 3.5.
Anyone else seen this? Does anyone know if it's resolved in 3.5? VMware, any fix for this coming out soon?
I've just rebuilt a 3.5 server and have exactly the same issue.
Only it's worse, because I can't find any way to get it working again.
A long shot: we had an issue with VMotion / cold migration after installing an ESX 3.5 server with the hostname in UPPERCASE. Did you do that?
Did you try disconnecting the server from VC instead of removing it, performing the rebuild, and then connecting it back?
It appears that disabling DRS will allow you to VMotion between affected machines, but enabling it again will break VMotion, so it's hardly a long-term solution.
I've "fixed" the issue by removing all my hosts from the cluster, removing the cluster, creating a new one and moving all the hosts back into it.
However, VMotion does seem much slower now than it used to be.
Thanks for the suggestions.
Names are all lowercase after that HA bug they had in VC 2.0.2.
I'll try disabling DRS, but I'm pretty sure I tried that before without any effect.
I'll try disconnecting the server too, but I feel more comfortable removing it since it will be rebuilt before it gets reconnected.
As for destroying and recreating the cluster, I don't think I want to go through that every time. What I described above seems to work every time. It's a pain, but probably less effort.
Based on how quickly people responded, it looks like a lot of other people have been having this problem. I'm very sorry to hear it still exists in 3.5. Hopefully VMware will fix it quickly.
Problem solved. It seems the issue is that the ESX servers that weren't rebuilt cache the MAC address of the vmkernel port of the server that was rebuilt. The solution is to do a vmkping from the server you just rebuilt to the other servers in the cluster. At some point I'll write a script that will actually query VirtualCenter to determine the addresses, but for now I just added a section to my network config script that enumerates all the IPs on the subnet and vmkpings them.
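For anyone who wants to try the same workaround, here is a minimal sketch of that kind of subnet sweep. It is not the script described above. The function names and the example subnet are made up for illustration, and it assumes a Python 3 interpreter with the standard ipaddress module, which the ESX 3.x service console may not have, so treat it as something to adapt rather than drop in.

```python
import ipaddress
import subprocess


def vmkping_targets(subnet, skip=()):
    """Return every usable host address in `subnet`, minus those in `skip`."""
    # strict=False also accepts a host address with a prefix, e.g. 192.168.50.5/28
    network = ipaddress.ip_network(subnet, strict=False)
    skipped = {str(ipaddress.ip_address(address)) for address in skip}
    return [str(host) for host in network.hosts() if str(host) not in skipped]


def vmkping_all(subnet, skip=(), count=1, runner=subprocess.call):
    """Run vmkping against each target and return {address: exit status}."""
    results = {}
    for address in vmkping_targets(subnet, skip):
        # vmkping sends the ping out of the vmkernel interface, which is the
        # traffic the other hosts need to see from the rebuilt server.
        results[address] = runner(["vmkping", "-c", str(count), address])
    return results
```

Run on the rebuilt server, a call such as vmkping_all("192.168.50.0/28", skip=["192.168.50.5"]) would sweep a hypothetical VMotion subnet while skipping the host's own vmkernel address. The runner argument exists only so the command can be swapped out for a dry run.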
The original poster has found the solution to his own issue, so the thread is marked as assumed answered.
Tom Howarth
VMTN User Communities Moderator