losisoft
Contributor

Hi,


We have the same issue while upgrading our systems from 5.x to 5.5u3b.
So far I have been able to reproduce it on 3 different clusters.

Details:
Cluster #1 and cluster #2 each contain 4 nodes.
3 nodes have been upgraded to 5.5u3b - build 3343343
The 4th node is still on 5.1.0 build 1483097
The clusters use a VSS for management and a VDS for the VM servers.
vswitch0 - management, with 2 network cards, explicit failover, one active and one standby adapter
vds - for the VM hosts, with 2 network cards

On clusters #1 and #2 we have one VM server each that I cannot migrate to the upgraded hosts.
I can migrate other VM servers to the old node and then back again, so the problem is specific to the affected VM server.
The affected VM server runs Win 2008R2; the VM tools version on it is 9.0.5 build 1065307
VM hardware is v9

-----------------------------------------
Cluster #3 has 5 nodes
3 nodes have been upgraded to 5.5u3b - build 3343343
Nodes 4 and 5 are still on 5.0.0 build 623860
The cluster uses VSS for both management and the VM servers.
There are two vswitches - one for management, one for the VM servers - each with 2 network cards, explicit failover, one active and one standby adapter


Cluster #3
Here we still have 2 nodes on 5.0. The affected VM can be migrated freely between the two 5.0.0 hosts, but it cannot be migrated to any of the upgraded 5.5 hosts.

The affected VM server runs Win 2008R2; the VM tools version on it is 8.6.5 build 621624
VM hardware is v8

-----------------------

MTU size on all systems is 1500.
Jumbo frames are enabled on the switches, which means the switch MTU is 9216, so this should not be the issue.
On the network side the servers are OK; we have been using them for years without any such issue.
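
The MTU reasoning above can be written down as a quick check: a packet gets through unfragmented only if it is no larger than the smallest MTU on the path. This is a minimal Python sketch; the hop list (one switch between two vmkernel ports) is a simplifying assumption and the function names are made up for illustration.

```python
def path_mtu(hop_mtus):
    """Return the effective MTU of a path: the smallest MTU of any hop."""
    return min(hop_mtus)


def packet_fits(packet_size, hop_mtus):
    """True if a packet of packet_size bytes crosses every hop unfragmented."""
    return packet_size <= path_mtu(hop_mtus)


if __name__ == "__main__":
    # Assumed path: source vmkernel port, switch, destination vmkernel port.
    hops = [1500, 9216, 1500]
    print(path_mtu(hops))
    print(packet_fits(1500, hops))
```

With hosts at 1500 and switches at 9216, the hosts are the limiting hop, so the switch setting cannot be what truncates the vMotion traffic.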


The error during the migration is:
Migration to host xx.XX.XX.XX failed with error Already disconnected (195887150)
vmotion migration [-1062716352:1453282157653493] failed writing stream completion: Already disconnected
vmotion migration [-1062716352:1453282157653493] failed to flush stream buffer: Already disconnected
vmotion migration [-1062716352:1453282157653493] socket connected returned: Already disconnected

This comes up at 14%
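
For anyone comparing the IDs in the client error with the ones in vmware.log: the bracketed migration ID looks like a host address followed by a microsecond Unix timestamp. That reading is an assumption inferred from the values in this post, not something confirmed by VMware documentation, and the helper name decode_migration_id is made up for illustration. A minimal Python sketch under that assumption:

```python
import ipaddress
from datetime import datetime, timedelta, timezone


def decode_migration_id(migration_id, host_base=16):
    """Decode '<host>:<timestamp>' under the ASSUMED layout described above.

    host_base=16 for the hex form seen in vmware.log,
    host_base=10 for the signed decimal form seen in the client error.
    """
    host_part, ts_part = migration_id.strip("[]").split(":")
    # Mask to 32 bits so a negative (signed) decimal maps to the same address.
    host_int = int(host_part, host_base) & 0xFFFFFFFF
    address = ipaddress.IPv4Address(host_int)
    # Assumption: the second field is microseconds since the Unix epoch.
    micros = int(ts_part)
    started = datetime.fromtimestamp(micros // 1_000_000, tz=timezone.utc)
    started += timedelta(microseconds=micros % 1_000_000)
    return str(address), started


if __name__ == "__main__":
    print(decode_migration_id("c0a83c40:1453210779366300"))
    print(decode_migration_id("-1062716352:1453282157653493", host_base=10))
```

Under this reading both forms of the first field decode to the same private address, and the timestamp in c0a83c40:1453210779366300 falls at 2016-01-19 13:39:39 UTC, a few seconds before the 13:39:44 entries in the first log excerpt.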


Snippet from the vmware.log files of two of the affected VM servers - this is what the system logged during the vMotion:

2016-01-19T13:39:44.832Z| vmx| I120: MigratePlatformInitMigration:  DiskOp file set to /vmfs/volumes/51b9ba04-be38ab49-ddfc-d89d67136f18/FRJLOLA01/FRJLOLA01-diskOp.tmp

2016-01-19T13:39:44.852Z| vmx| I120: MigrateWaitForData: waiting for data.

2016-01-19T13:39:44.852Z| vmx| I120: MigrateSetState: Transitioning from state 7 to 8.

2016-01-19T13:39:44.891Z| vmx| I120: MigrateRPC_RetrieveMessages: Informed of a new user message, but can't handle messages in state 4.  Leaving the message queued.

2016-01-19T13:39:44.998Z| vmx| I120: MigrateSetStateFinished: type=2 new state=11

2016-01-19T13:39:44.998Z| vmx| I120: MigrateSetState: Transitioning from state 8 to 11.

2016-01-19T13:39:44.998Z| vmx| I120: [msg.migrate.waitdata.platform] Failed waiting for data.  Error bad0007. Bad parameter.

2016-01-19T13:39:44.998Z| vmx| I120: Migrate: cleaning up migration state.

2016-01-19T13:39:44.999Z| vmx| I120: MigrateSetState: Transitioning from state 11 to 0.

2016-01-19T13:39:44.999Z| vmx| I120: Migrate: Final status reported to VMDB.

2016-01-19T13:39:44.999Z| vmx| I120: Module Migrate power on failed.

2016-01-19T13:39:44.999Z| vmx| I120: VMX_PowerOn: ModuleTable_PowerOn = 0

2016-01-19T13:39:44.999Z| vmx| I120: SVMotion_PowerOff: Not running Storage vMotion. Nothing to do

2016-01-19T13:39:45.000Z| vmx| I120: WORKER: asyncOps=1 maxActiveOps=1 maxPending=0 maxCompleted=0

2016-01-19T13:39:45.000Z| vmx| I120: Vix: [275507 mainDispatch.c:1201]: VMAutomationPowerOff: Powering off.

2016-01-19T13:39:45.001Z| vmx| W110: /vmfs/volumes/51b9ba04-be38ab49-ddfc-d89d67136f18/FRJLOLA01/FRJLOLA01.vmx: Cannot remove symlink /var/run/vmware/root_0/1453210780080218_275507/configFile: No such file or directory

2016-01-19T13:39:45.055Z| vmx| I120: Vix: [275507 mainDispatch.c:3964]: VMAutomation_ReportPowerOpFinished: statevar=1, newAppState=1873, success=1 additionalError=0

2016-01-19T13:39:45.056Z| vmx| I120: Msg_Post: Error

2016-01-19T13:39:45.056Z| vmx| I120: [vob.swap.migrate.invalidindex.mig] The migration swap type is not supported for migration.

2016-01-19T13:39:45.056Z| vmx| I120: [vob.migrate.addpage.swapped.invalidindex] Received invalid swap slot data (0xc000ec8c) for pgNum 0.

2016-01-19T13:39:45.056Z| vmx| I120: [vob.vmotion.addpage.failed.status] vMotion migration [c0a83c40:1453210779366300] failed to add memory page 0 to VM: Bad parameter

2016-01-19T13:39:45.056Z| vmx| I120: [vob.vmotion.stream.completion.complete.fail] vMotion migration [c0a83c40:1453210779366300] failed draining stream completion: Bad parameter

2016-01-19T13:39:45.056Z| vmx| I120: [msg.moduletable.powerOnFailed] Module Migrate power on failed.

2016-01-19T13:39:45.056Z| vmx| I120: [msg.vmx.poweron.failed] Failed to start the virtual machine.
2016-01-19T13:39:45.056Z| vmx| I120: ----------------------------------------
2016-01-19T13:39:45.058Z| vmx| I120: VmdbPipeStreamsOvlError Couldn't read: OVL_STATUS_EOF, (2) No such file or directory.

2016-01-19T13:39:45.058Z| vmx| I120: VmdbCnxDisconnect: Disconnect: closed pipe for pub cnx '/db/connection/#1/' (-32)

2016-01-19T13:39:45.059Z| vmx| I120: VmdbDbRemoveCnx: Removing Cnx from Db for '/db/connection/#1/'

2016-01-19T13:39:45.059Z| vmx| I120: VUIDialogDequeue: found 0 vmdb ui connections; canceling dialogs

2016-01-19T13:39:45.059Z| vmx| I120: Vix: [275507 mainDispatch.c:3964]: VMAutomation_ReportPowerOpFinished: statevar=0, newAppState=1870, success=1 additionalError=0

2016-01-19T13:39:45.059Z| vmx| I120: Transitioned vmx/execState/val to poweredOff

2016-01-19T13:39:45.059Z| vmx| I120: Vix: [275507 mainDispatch.c:3964]: VMAutomation_ReportPowerOpFinished: statevar=0, newAppState=1870, success=0 additionalError=0

2016-01-19T13:39:45.059Z| vmx| I120: Vix: [275507 mainDispatch.c:4003]: Error VIX_E_FAIL in VMAutomation_ReportPowerOpFinished(): Unknown error

2016-01-19T13:39:45.066Z| vmx| I120: Vix: [275507 mainDispatch.c:3964]: VMAutomation_ReportPowerOpFinished: statevar=0, newAppState=1870, success=1 additionalError=0

2016-01-19T13:39:45.066Z| vmx| I120: Transitioned vmx/execState/val to poweredOff

2016-01-19T13:39:45.067Z| vmx| I120: VMIOP: Exit

2016-01-19T13:39:45.069Z| vmx| I120: Vix: [275507 mainDispatch.c:849]: VMAutomation_LateShutdown()
2016-01-19T13:39:45.069Z| vmx| I120: Vix: [275507 mainDispatch.c:799]: VMAutomationCloseListenerSocket. Closing listener socket.


2016-01-20T08:27:44.719Z| vmx| I120: WORKER: Creating new group with numThreads=1 (4)

2016-01-20T08:27:44.720Z| vmx| I120: FTCpt: (0 unk) State transition 0 -> 1

2016-01-20T08:27:44.736Z| vmx| I120: WORKER: Creating new group with numThreads=1 (4)

2016-01-20T08:27:44.736Z| vmx| I120: MigratePlatformInitMigration:  DiskOp file set to /vmfs/volumes/54dcb18a-c0a1ec7e-b76e-d89d671472c4/FRHUOPAP01/FRHUOPAP01-diskOp.tmp

2016-01-20T08:27:44.758Z| vmx| I120: MigrateWaitForData: waiting for data.

2016-01-20T08:27:44.758Z| vmx| I120: MigrateSetState: Transitioning from state 7 to 8.

2016-01-20T08:27:46.112Z| vmx| I120: MigrateSetStateFinished: type=2 new state=11

2016-01-20T08:27:46.112Z| vmx| I120: MigrateSetState: Transitioning from state 8 to 11.

2016-01-20T08:27:46.112Z| vmx| I120: [msg.migrate.waitdata.platform] Failed waiting for data.  Error bad0007. Bad parameter.

2016-01-20T08:27:46.113Z| vmx| I120: Migrate: cleaning up migration state.

2016-01-20T08:27:46.113Z| vmx| I120: MigrateSetState: Transitioning from state 11 to 0.
2016-01-20T08:27:46.114Z| vmx| I120: Migrate: Final status reported to VMDB.

2016-01-20T08:27:46.114Z| vmx| I120: Module Migrate power on failed.

2016-01-20T08:27:46.114Z| vmx| I120: VMX_PowerOn: ModuleTable_PowerOn = 0

2016-01-20T08:27:46.114Z| vmx| I120: SVMotion_PowerOff: Not running Storage vMotion. Nothing to do
2016-01-20T08:27:46.114Z| vmx| I120: WORKER: asyncOps=0 maxActiveOps=0 maxPending=0 maxCompleted=0

2016-01-20T08:27:46.114Z| vmx| I120: Vix: [240094 mainDispatch.c:1201]: VMAutomationPowerOff: Powering off.

2016-01-20T08:27:46.116Z| vmx| W110: /vmfs/volumes/54dcb18a-c0a1ec7e-b76e-d89d671472c4/FRHUOPAP01/FRHUOPAP01.vmx: Cannot remove symlink /var/run/vmware/root_0/1453278463967538_240094/configFile: No such file or directory

2016-01-20T08:27:46.147Z| vmx| I120: Vix: [240094 mainDispatch.c:3964]: VMAutomation_ReportPowerOpFinished: statevar=1, newAppState=1873, success=1 additionalError=0

2016-01-20T08:27:46.148Z| vmx| I120: Msg_Post: Error

2016-01-20T08:27:46.148Z| vmx| I120: [vob.swap.migrate.invalidindex.mig] The migration swap type is not supported for migration.

2016-01-20T08:27:46.148Z| vmx| I120: [vob.migrate.addpage.swapped.invalidindex] Received invalid swap slot data (0xc0029591) for pgNum 0x1ca1.

2016-01-20T08:27:46.148Z| vmx| I120: [vob.vmotion.addpage.failed.status] vMotion migration [c0a8dce0:1453278463248266] failed to add memory page 0x1ca1 to VM: Bad parameter

2016-01-20T08:27:46.148Z| vmx| I120: [vob.vmotion.stream.completion.complete.fail] vMotion migration [c0a8dce0:1453278463248266] failed draining stream completion: Bad parameter

2016-01-20T08:27:46.148Z| vmx| I120: [msg.moduletable.powerOnFailed] Module Migrate power on failed.

2016-01-20T08:27:46.148Z| vmx| I120: [msg.vmx.poweron.failed] Failed to start the virtual machine.

2016-01-20T08:27:46.148Z| vmx| I120: ----------------------------------------
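
Most of the lines in these excerpts are state transitions; the ones that describe the failure carry a bracketed message ID starting with vob. or msg. A small Python sketch for pulling just those lines out of a vmware.log is below. The line layout is taken from the excerpts above; the function name and the choice to filter on the vob./msg. prefixes are my own assumptions, not a VMware tool.

```python
import re

# Layout seen in the excerpts above: "<timestamp>| <thread>| <level>: <message>"
# Only messages that start with a "[vob....]" or "[msg....]" ID are kept.
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\|\s*(?P<thread>\w+)\|\s*(?P<level>[A-Z]\d+):\s*"
    r"\[(?P<msg_id>(?:vob|msg)\.[\w.]+)\]\s*(?P<text>.*)$"
)


def extract_tagged_messages(log_text):
    """Return (timestamp, message_id, text) for every tagged log line."""
    results = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results.append((match["ts"], match["msg_id"], match["text"]))
    return results


if __name__ == "__main__":
    sample = """\
2016-01-19T13:39:44.998Z| vmx| I120: MigrateSetState: Transitioning from state 8 to 11.
2016-01-19T13:39:44.998Z| vmx| I120: [msg.migrate.waitdata.platform] Failed waiting for data.  Error bad0007. Bad parameter.
2016-01-19T13:39:45.056Z| vmx| I120: [vob.swap.migrate.invalidindex.mig] The migration swap type is not supported for migration.
"""
    for ts, msg_id, text in extract_tagged_messages(sample):
        print(ts, msg_id, text)
```

Run against both excerpts, this leaves the same sequence for each VM: the waitdata failure, then the invalid swap index and swap slot messages, then the failed page add and stream completion.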


On the 5.5 host I tried changing
VMkernel.boot.netPktHeapMaxMBperGB from 6 to 12
and
VMkernel.boot.netPktPoolMaxMBperGB from 75 to 100

and on the 5.1 host
VMkernel.Boot.netPktPoolMaxSize from 656 to 800
VMkernel.Boot.netPktHeapMaxSize from 64 to 128

I also tried changing the IP address on the vmkernel interface.

No change. The issue remains.
