VMware Cloud Community
Bob_Warrington
Contributor

VMotion Fails Under VC 2.0.1 between ESX 2.5.3 Servers.

I have a large mixed production environment consisting of VC 2.0.1 and a mixture of ESX 3.0.1 and ESX 2.5.3 servers. VMotion between ESX 2.5.3 servers works for some servers but not for others. I have two identical servers (IBM x445s) that will not VMotion between each other. The error I receive is "A general system error occurred: failed to prepare source for VMotion (unknown exception)". VMotion across all ESX 2.5.3 servers used to work under VC 1.3.1.

Any ideas would be much appreciated. Thanks.

9 Replies
radhika1780
Enthusiast

Hope this information will help to solve your problem.

By default, VirtualCenter allows migrations with VMotion between compatible source and destination CPUs only.

In simplest terms, “compatible CPUs” for VMotion purposes means that source and target CPUs must have the same manufacturer (Intel or AMD) and be members of the same basic processor family (Pentium 3 or Pentium 4, for instance). Within a given processor family, the source and target CPUs must also have certain common and extended features implemented (or not implemented, depending on the specific feature) for VMotion to succeed.

To determine if the source and target CPUs meet VMotion requirements, VirtualCenter compares the target CPU to a default bit mask definition. The default bit mask has evolved with each version of VirtualCenter to support (or preclude) VMotion migration given a specific set of CPU features. For example, the VirtualCenter 1.x bit mask doesn’t flag the NX (AMD) or XD (Intel) bits, but the VirtualCenter 2.x bit mask does.

The process of modifying VMotion compatibility settings depends on the VirtualCenter version and the ESX Server 2.x or ESX Server 3.x host system running the virtual machines. Also varying by VirtualCenter Server version and ESX Server version is the scope of modifications to the default bit mask, as follows:

VirtualCenter 1.x—Modifications to the default bit mask apply to all ESX Server 2.x hosts being managed by VirtualCenter.

VirtualCenter 2.x with ESX Server 3.x—Modifications must be made on a per-virtual-machine basis using the VI Client.

VirtualCenter 2.0.1 Patch 2 (and later) with ESX Server 2.x—Modifications are made at the VirtualCenter Server level but can be defined by processor manufacturer, by version number, and other criteria.

VirtualCenter Server 1.x

The VirtualCenter Server 1.x bit mask applies to all virtual machines on the ESX Server 2.x host system. VirtualCenter Server 1.x uses a locally stored text file, config.ini, for configuration information. The default location is:

C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter

Modifying the Default Bit Mask

To modify the default bit mask for all ESX Server 2.x hosts managed by a VirtualCenter Server 1.x system, make the changes in the VirtualCenter Server’s configuration file (config.ini). The process of editing the config.ini file is generally the same regardless of CPU vendor:

Open the config.ini file in a text editor, such as WordPad. (If the config.ini file doesn’t exist, create a text file named “config.ini” and save it in the VirtualCenter application directory.) The config.ini file is typically located in the default VirtualCenter directory:

C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter

Add to the config.ini file the appropriate line for the CPU feature that you want to mask (see the Table).

Save the file and restart the VirtualCenter management server so that the modified bit mask takes effect. The next time VMotion is attempted, the specific feature will be ignored by VirtualCenter, and VMotion should proceed.

CPU Feature              VirtualCenter Server 1.x                       Vendor        Supported?
SSE3                     migrate.ignore.extfeature.bits = 0xE5BD        Intel, AMD    No
SSE4                     migrate.ignore.extfeature.bits = 0xE7BC        Intel         No
Single-core-Dual-core    migrate.ignore.feature.bits = 0x90000000       AMD           Yes
PERF_MSR                 migrate.ignore.extfeature.bits = 0xE5A0        Intel         Yes
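To see how the hexadecimal values in the table differ from one another, it can help to list which bit positions each one sets. The following is only a small sketch: the helper name set_bits is hypothetical, and the idea that each set bit corresponds to one ignored feature bit is an assumption based on the setting names, not something stated above.

```python
def set_bits(value: int) -> list[int]:
    """Return the positions (0 = least significant) of the bits set in value."""
    return [pos for pos in range(value.bit_length()) if (value >> pos) & 1]


# Values copied from the table above.
SSE3 = 0xE5BD
SSE4 = 0xE7BC

# Bit positions that are set in one value but not in the other.
print(set_bits(SSE3 ^ SSE4))  # prints [0, 9]
```

The two values share every bit except positions 0 and 9, which suggests that each table row adds a single bit to a common base value.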

As an example, to effectively enable VMotion between a host based on an Intel CPU that has SSE3 and one that doesn’t, you would add:

migrate.ignore.extfeature.bits = 0xE5BD

Note that VMware neither supports nor recommends modifying the VMotion constraints for SSE3 or SSE4 CPU extended features because of the risk of application or guest OS failure after migration.
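If you would rather script the config.ini change than edit the file by hand, the edit comes down to setting (or later removing) one "key = value" line. The sketch below works on the file contents as a string. The function names set_option and remove_option are hypothetical, and it assumes config.ini is a flat list of "key = value" lines as in the example above; check that against your own file before relying on it.

```python
def _key_of(line: str) -> str:
    """Return the text before the first '=' on a line, stripped of whitespace."""
    return line.split("=", 1)[0].strip()


def set_option(config_text: str, key: str, value: str) -> str:
    """Return config_text with 'key = value' set, replacing any existing line for key."""
    kept = [ln for ln in config_text.splitlines() if _key_of(ln) != key]
    kept.append(f"{key} = {value}")
    return "\n".join(kept) + "\n"


def remove_option(config_text: str, key: str) -> str:
    """Return config_text with every line for key removed (restores the default)."""
    kept = [ln for ln in config_text.splitlines() if _key_of(ln) != key]
    return "\n".join(kept) + "\n" if kept else ""


# Example on an in-memory string; reading and writing the real file is left to you.
text = set_option("", "migrate.ignore.extfeature.bits", "0xE5BD")
print(text, end="")
```

Either way, the VirtualCenter management server still has to be restarted before the change takes effect, as described in the steps above.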

VirtualCenter Server 2.x

VirtualCenter Server 2.x provides two different ways to modify VMotion CPU constraints, depending on the version of ESX Server systems.

For ESX Server 3.x hosts, CPU constraints on a per-virtual-machine basis are editable through the Virtual Infrastructure Client (VI Client) graphical user-interface.

For ESX Server 2.x hosts (managed using VirtualCenter 2.0.1 Patch 2 and subsequent releases only), the CPU constraints can be edited directly in the VirtualCenter configuration file (vpxd.cfg).

ESX Server 3.x

In VirtualCenter 2.x, the default bit mask for VMotion constraints among ESX Server 3.x systems is applied on a per-virtual-machine basis using the VI Client. Regardless of CPU type or mask modifications, accessing the necessary dialog boxes is generally the same, as follows:

Launch VI Client and connect to the VirtualCenter Server host system.

Select the virtual machine that you want to migrate, either by selecting Inventory from the navigation bar and navigating to the virtual machine, or by clicking its name in the inventory panel.

Click Edit Settings (in the Commands section of the Summary tab). The Properties page for the virtual machine displays.

Click the Options tab in the Properties page.

Select Advanced (under Settings on the Options tab) to display several settings-related boxes, including CPU Identification Mask, in the right-hand pane.

For every virtual machine that you want to migrate but that doesn’t meet the CPU compatibility constraints for VMotion, start from the CPU Identification Mask section of the Properties-->Options page.

Modifying Default NX/XD Mask

Navigate to the virtual machine’s Properties page-->Options tab (see steps above, if necessary).

Click the Hide the Nx flag from guest radio button to disable this CPU compatibility check for the selected virtual machine.

Click OK to save the change.

Modifying Default Mask

Navigate to the virtual machine’s Properties page-->Options tab (see steps above, if necessary).

Click Advanced... to open the CPU Identification Mask properties dialog box. Note that the CPU Identification Mask dialog box has two tabs: Virtual Machine Default and AMD Override. Most modifications for Intel CPU features are made on the Virtual Machine Default page. Modifications for AMD CPU features are made on the AMD Override page.

Click the Virtual Machine Default tab to activate the dialog box, if necessary.

To modify the mask for a specific feature, enter the series of dashes and 0s shown in the table.

Click OK when you are finished and exit the CPU Identification Mask dialog box.

Feature    Level    Row    Mask
SSE3       1        ecx    ---- ---- ---- ---- ---- ---- ---0 -0-0
SSE4       1        ecx    ---- ---- ---- ---- ---- --0- ---- ---0

Modifying AMD-Specific Masks

Features that are specific to AMD processors are listed on the AMD Override tab.

Navigate to the virtual machine’s Properties page-->Options tab (see steps above, if necessary).

Click Advanced... to open the CPU Identification Mask properties dialog box.

Click the AMD Override tab to activate the dialog box. The AMD Override page displays.

Enter the appropriate mask as shown in the table. You may need to scroll to find the fields for some register levels.

Click OK to save changes.

Feature    Level       Row    AMD Override
SSE3       1           ecx    ---- ---- ---- ---- ---- ---- ---- ---0
RDTSCP     80000001    edx    ---- 0--- ---- ---- ---- ---- ---- ----
FFXSR      80000001    edx    ---- --0- ---- ---- ---- ---- ---- ----
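Each mask row covers one 32-bit CPUID register, and it can be useful to translate a row into bit numbers to double-check an entry. The sketch below assumes the leftmost position is bit 31 and the rightmost is bit 0; that reading is consistent with RDTSCP being bit 27 and FFXSR being bit 25 of CPUID 80000001 EDX. The function name masked_bits is hypothetical, and the mask strings are written out in full as eight groups of four positions.

```python
def masked_bits(mask: str) -> list[int]:
    """Return the bit numbers (0 to 31) that a mask row forces to 0."""
    cells = mask.replace(" ", "")
    if len(cells) != 32:
        raise ValueError("a mask row must have exactly 32 positions")
    # Leftmost character is taken to be bit 31, rightmost bit 0 (assumption).
    return sorted(31 - i for i, cell in enumerate(cells) if cell == "0")


AMD_SSE3 = "---- ---- ---- ---- ---- ---- ---- ---0"
RDTSCP = "---- 0--- ---- ---- ---- ---- ---- ----"
FFXSR = "---- --0- ---- ---- ---- ---- ---- ----"

print(masked_bits(AMD_SSE3), masked_bits(RDTSCP), masked_bits(FFXSR))
# prints [0] [27] [25]
```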


Combining Mask Modifications

You can combine modifications to the default bit mask to allow migration with VMotion between groups that are incompatible based on more than one CPU feature. For example, to filter out the compatibility check for both SSE3 and SSE4, combine the following:

---- ---- ---- ---- ---- ---- ---0 -0-0

and

---- ---- ---- ---- ---- --0- ---- ---0

to yield

---- ---- ---- ---- ---- --0- ---0 -0-0
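Combining masks is a position-by-position merge: a position is 0 in the result if it is 0 in any of the inputs, and stays a dash otherwise. Here is a minimal sketch of that rule. The function name combine_masks is hypothetical, and the two input strings are the SSE3 and SSE4 rows written out as eight groups of four positions; treat the exact patterns as assumptions to verify against what the CPU Identification Mask dialog box shows.

```python
def combine_masks(*masks: str) -> str:
    """Merge mask rows: a position is '0' if any input has '0' there, else '-'."""
    rows = [m.replace(" ", "") for m in masks]
    if not rows or any(len(r) != 32 for r in rows):
        raise ValueError("each mask row must have exactly 32 positions")
    merged = "".join(
        "0" if any(r[i] == "0" for r in rows) else "-" for i in range(32)
    )
    # Re-insert a space after every four positions for readability.
    return " ".join(merged[i:i + 4] for i in range(0, 32, 4))


SSE3 = "---- ---- ---- ---- ---- ---- ---0 -0-0"
SSE4 = "---- ---- ---- ---- ---- --0- ---- ---0"

print(combine_masks(SSE3, SSE4))
# prints ---- ---- ---- ---- ---- --0- ---0 -0-0
```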

ESX Server 2.x

For ESX Server 2.x hosts managed by VirtualCenter 2.0.1 Patch 2 (and subsequent releases), the bit mask can be modified by manually editing the VirtualCenter configuration file. The VirtualCenter 2.x-series configuration file, vpxd.cfg, contains XML tags defining various elements and settings for the VirtualCenter server. Bit-masks can be modified by adding to the configuration file the appropriate XML tags to define the mask in the context of a guest OS configuration option.

Although a full discussion of how to build these elements is beyond the scope of this article, the About the Tags section provides a brief overview. Specific steps for Editing the VirtualCenter Configuration File and XML tags for Common Mask Patterns—specifically, SSE3, SSE4, and SSE3 and SSE4 Combined—that you can use are below.

About the Tags

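The tags identify the register to which the mask should apply, and for which hosts and which versions. The tags are direct descendants of the [...]

[...] ----:----:----:----:----:----:---x:-x-x [...] ----:----:----:----:----:----:----:---x [...] ----:----:----:----:----:--x-:----:---x [...] ----:----:----:----:----:--x-:---x:-x-x</ecx> </level-1> </default-vendor> <amd> <level-1> <ecx>----:----:----:----:----:----:----:---x</ecx> </level-1> </amd> </cpuFeatureMask>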



Reverting to the Default CPU Compatibility Bit Mask

You can restore the default VMotion compatibility constraints by simply reverting any changes made to the default mask:

For VirtualCenter 1.x: Remove any lines added to the config.ini file, and then restart the VirtualCenter server.

VirtualCenter 2.x [and ESX Server 3.x]: Reset any changed rows to the defaults by clicking the “Reset All to Default” button on the CPU Identification Mask dialog box.

VirtualCenter 2.x [and ESX Server 2.x]: Remove any tags added to the vpxd.cfg file and restart the VirtualCenter service.

radhika1780
Enthusiast

Check this thread to resolve your problem:

www.vmware.com/community/thread.jspa?messageID=458564

Champagne
Contributor

Actually, Bob indicates that the servers are identical. This issue is being experienced on 2 of 8 identical servers, all using exactly the same procs. The only difference is that these two servers have some VMs which access RDM volumes while the other hosts do not. None of the hosted VMs can be migrated, regardless of whether or not RDMs are used. The vmware-vpxa agent was restarted yesterday without effect. Today, the agent was killed and allowed to respawn. Now at least the server is able to report on the VMs' storage volume sizes, but VMotion is still a no-go.

We have a scheduled reboot of the VC server tonight. If that doesn't change the situation, we may try a manual reinstall of the vpx agent. We will probably end up scheduling a reboot of the affected VMs and hosts (not pretty when you are trying to sell decent uptime to clients).

Gary

wearmg
Contributor

I'm having the same problem on 1 of my ESX 2.5.3 servers. I have 6 IBM x366's that I've had no problems using VMotion with in the past but now when I try to move VMs off of 1 of the hosts it goes about 10% on the progress bar and then gives the message: A general system error occurred: failed to prepare source for VMotion (unknown exception). If I look at the vpxa.log file it shows a 'Failed to get disk capacities for VM:' error message. When I look in VC at the properties of a VM on that host it shows 'unavailable' for the disk size. There are no RDM volumes on that host.

It's running 2.5.3 Build 24171. It's been up for over a hundred days without many VMotions, so I'm not sure exactly when the problem started. I've been trying to get all of the other hosts up to a current build for the daylight saving updates and just ran into this on this host. I had recently updated from the base VC 2.0.1 version to Patch 1 and thought that might have something to do with it, so I tried installing Patch 2. During the Patch 2 install, the vpxagent upgrade errored out on that host and I had to install it manually. The manual install went OK and I was able to bring that host back into VirtualCenter, but I'm still getting 'Failed to get disk capacities for VM' in the vpxa.log when I try to move the VMs off of that host.

-Matt

Champagne
Contributor

Matt,

We too encountered the "Failed to get disk capacities..." issue in the vpxa.log. Between a VC reboot and killing/respawning the vmware-vpxa agent on the affected ESX Server we were able to re-establish the disk capacity reporting function. Unfortunately, we still received the same unknown exception error following a failed VMotion attempt. Strange thing is that the failure is occurring before 10% complete now.

Bob opened an SR this past Friday, and so far it appears support is aware of this issue but has no quick fix for us. They recommended Patch 2; however, from what you are seeing, this doesn't seem to fix the problem.

We'll report on anything further we get from the SR.

Gary

Bob_Warrington
Contributor

I would like to thank everyone for your responses. Here's an update on the issue.

The response to the ticket I have open with VMware was to upgrade to VC 2.0.1 Patch 2. I did so after scheduling VC downtime with our users (it's production, after all). I first applied the update in our lab and found that the new license server did not recognize the license file as valid. Removing a blank line and trailing blank characters from the end of the file fixed this. As mentioned in other threads, a few (not all) ESX 2.5.3 and 3.0.1 servers were also reported as disconnected after the upgrade. It is necessary to restart the VC agent on these hosts, and you may have to reconnect the hosts manually.

I then installed VC 2.0.1 Patch 2 in our production environment. Once that was complete I retested VMotion. I was able to VMotion additional (but not all) VMs after restarting the serverd process on the ESX 2.5.3 servers. The remaining VMs still failed on VMotion. The final solution was to edit those VMs and reset the CD and/or floppy device wherever its setting was invalid (typically either an empty field or a value containing the characters '[]'). Note that I had tried this before VC 2.0.1 Patch 2 was applied, without success. Now VMotion is working between our ESX 2.5.3 hosts! I can finally patch the last two ESX servers for DST and continue with our VI3 migration. This mixed VC 2.0.1 and ESX 3.0.1 environment is an 'interesting' challenge.

Bob

rreynol
Enthusiast

Something easy that fixed the same problem for us, also with the same general system error, was to simply restart the vmware-serverd process that runs on ESX 2.5 servers:

killall vmware-serverd

The VMs will disconnect momentarily but when they come back you can see the storage information and VMotion started working again.

Leethal
Contributor

Woo Hoo.... that fixed it for me .. thanks.

bsd1977
Contributor

rreynol, this fixed our situation as well. Thanks!
