Hello,
From my experience, when using vMotion, I usually see a missed ping or none at all at the exact moment a virtual machine cuts over to the destination host.
However, I'm now observing a network disconnect that can be over ten seconds during the vMotion cutover, which is causing issues.
This is happening mostly on HP Proliant DL580s and DL980s with dual 10Gb NICs for vMotion and dual 10Gb NICs for VM traffic. Storage is FC with 2TB LUNs. ESX is 5.1 Update 1 on vCenter 5.1.
The affected VMs are usually between 24 and 64Gb of memory with 16 vCPUs.
I'm also observing long ping timeouts during a Storage vMotion, once again right at the end of the operation.
So far, VMware support have not been able to find a solution to this issue but we are still investigating.
Has anyone observed this issue in their environment?
Let me know if you need additional information.
Thank you.
Is spanning tree port-fast enabled on the physical switches? Are there any errors or issues on the physical switch on the moment of a vMotion?
Do you have VM swap files stored with host instead of with VM folder? Were the VMs swapping during vmotion?
Do you have management port, the VMotion port and all VMs are on the same network subnet.
Good question tomtom901, I am looking into that with our network team at the moment. Just wanted to give an update to your reply. Thanks
tomtom901, spanning tree port-fast was enabled on the VM network switch ports. It wasn't enabled on the vMotion switch ports but activating it did not change anything.
There were no errors on the switch side during vMotions, no dropped packets. Flow control was enabled on the vMotion ports, no change after this was disabled.
For the first time, I actually saw some ping time outs at 33% during a vMotion...
I put one adapter on Standby in each of the hosts' vSwitches with the same results.
One set of hosts is using Standard Switches and another is using Distributed Switches.
wwan, yes the VMs' swap files are stored with the VMs. I did not observe swapping during the vMotions.
chandra_bhatt, we have dedicated physical NICs and subnets for VM, Management and vMotion traffic.
Could you do a vMotion and post some logging of the vmkernel.log file, located in /var/log/vmkernel.log. Select the last 250 lines like this:
tail -n 250 /var/log/vmkernel.log and post this output here.
tomtom901, here is the output of the tail command from one of the hosts where we vMotioned one VM back on forth.
2013-12-16T19:01:01.142Z cpu20:124692)World: vm 124705: 1421: Starting world vmm11:VirtualMachine with flags 8
2013-12-16T19:01:01.143Z cpu20:124692)World: vm 124706: 1421: Starting world vmm12:VirtualMachine with flags 8
2013-12-16T19:01:01.143Z cpu20:124692)World: vm 124707: 1421: Starting world vmm13:VirtualMachine with flags 8
2013-12-16T19:01:01.143Z cpu20:124692)World: vm 124708: 1421: Starting world vmm14:VirtualMachine with flags 8
2013-12-16T19:01:01.143Z cpu20:124692)World: vm 124709: 1421: Starting world vmm15:VirtualMachine with flags 8
2013-12-16T19:01:01.292Z cpu20:124692)Migrate: vm 124693: 3273: Setting VMOTION info: Dest ts = 1387220456095408, src ip = <10.x.x.x> dest ip = <10.x.x.x> Dest wid = 0 using SHARED swap
2013-12-16T19:01:01.292Z cpu20:124692)WARNING: VMotionUtil: 5028: 1387220456095408
vMotion vmknic is undefined. vMotion will attempt to proceed, but may not reach optimal performance.
2013-12-16T19:01:01.310Z cpu20:124692)Tcpip_Vmk: 1213: Affinitizing 10.x.x.x to world 124714, Success
2013-12-16T19:01:01.310Z cpu20:124692)VMotion: 2688: 1387220456095408
Set ip address '10.x.x.x' worldlet affinity to recv World ID 124714
2013-12-16T19:01:01.310Z cpu20:124692)Hbr: 3308: Migration start received (worldID=124693) (migrateType=1) (event=0) (isSource=0) (sharedConfig=1)
2013-12-16T19:01:01.686Z cpu13:17026)MigrateNet: vm 17026: 2077: Accepted connection from <10.x.x.x>
2013-12-16T19:01:01.686Z cpu13:17026)MigrateNet: vm 17026: 2147: dataSocket 0x41001f5da850 receive buffer size is 563272
2013-12-16T19:01:01.688Z cpu13:17026)MigrateNet: vm 17026: 2077: Accepted connection from <10.x.x.x>
2013-12-16T19:01:01.688Z cpu13:17026)MigrateNet: vm 17026: 2147: dataSocket 0x4100470172a0 receive buffer size is 563272
2013-12-16T19:01:01.688Z cpu13:17026)VMotionUtil: 3184: 1387220456095408
Stream connection 1 added.
2013-12-16T19:01:01.689Z cpu13:17026)MigrateNet: vm 17026: 2077: Accepted connection from <10.x.x.x>
2013-12-16T19:01:01.689Z cpu13:17026)MigrateNet: vm 17026: 2147: dataSocket 0x410047017950 receive buffer size is 563272
2013-12-16T19:01:01.689Z cpu13:17026)VMotionUtil: 3184: 1387220456095408
Stream connection 2 added.
2013-12-16T19:04:14.063Z cpu24:124714)VMotionRecv: 638: 1387220456095408
Estimated network bandwidth 115.599 MB/s during pre-copy
2013-12-16T19:04:15.384Z cpu4:124692)VSCSI: 3780: handle 8192(vscsi0:0):Creating Virtual Device for world 124693 (FSS handle 1328284)
2013-12-16T19:04:15.386Z cpu4:124692)VSCSI: 3780: handle 8193(vscsi0:1):Creating Virtual Device for world 124693 (FSS handle 1590429)
2013-12-16T19:04:15.388Z cpu4:124692)VSCSI: 3780: handle 8194(vscsi0:2):Creating Virtual Device for world 124693 (FSS handle 1262747)
2013-12-16T19:04:15.389Z cpu4:124692)VSCSI: 3780: handle 8195(vscsi0:3):Creating Virtual Device for world 124693 (FSS handle 1655962)
2013-12-16T19:04:15.391Z cpu4:124692)VSCSI: 3780: handle 8196(vscsi0:5):Creating Virtual Device for world 124693 (FSS handle 1197215)
2013-12-16T19:04:15.449Z cpu20:124714)VMotionRecv: 2386: 1387220456095408
DONE paging in
2013-12-16T19:04:15.449Z cpu20:124714)VMotionRecv: 2394: 1387220456095408
Estimated network bandwidth 85.067 MB/s during page-in
2013-12-16T19:04:15.936Z cpu2:124693)VMMVMKCall: 208: Received INIT from world 124693
2013-12-16T19:04:15.946Z cpu2:124693)LSI: 1986: LSI: Initialized rings for scsi0 async=1, record=0 replay=0
2013-12-16T19:04:15.948Z cpu2:125131)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:04:15.948Z cpu2:125131)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:04:15.948Z cpu2:125131)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:04:15.948Z cpu2:125131)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:04:15.948Z cpu2:125131)Net: 1858: connected VirtualMachine eth2 to iSCSI, portID 0x5000006
2013-12-16T19:04:15.948Z cpu2:125131)Net: 1858: connected VirtualMachine eth1 to Heartbeat, portID 0x3000007
2013-12-16T19:04:15.950Z cpu2:125131)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:04:15.950Z cpu2:125131)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:04:15.950Z cpu2:125131)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:04:15.950Z cpu2:125131)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:04:15.950Z cpu2:125131)Net: 1858: connected VirtualMachine eth0 to 10.x.x.x, portID 0x4000006
2013-12-16T19:04:15.959Z cpu0:124695)VMMVMKCall: 208: Received INIT from world 124695
2013-12-16T19:04:15.961Z cpu4:124696)VMMVMKCall: 208: Received INIT from world 124696
2013-12-16T19:04:15.963Z cpu19:124697)VMMVMKCall: 208: Received INIT from world 124697
2013-12-16T19:04:15.966Z cpu10:124698)VMMVMKCall: 208: Received INIT from world 124698
2013-12-16T19:04:15.968Z cpu4:124699)VMMVMKCall: 208: Received INIT from world 124699
2013-12-16T19:04:15.969Z cpu10:124700)VMMVMKCall: 208: Received INIT from world 124700
2013-12-16T19:04:15.973Z cpu6:124701)VMMVMKCall: 208: Received INIT from world 124701
2013-12-16T19:04:15.973Z cpu38:124702)VMMVMKCall: 208: Received INIT from world 124702
2013-12-16T19:04:15.975Z cpu29:124703)VMMVMKCall: 208: Received INIT from world 124703
2013-12-16T19:04:15.977Z cpu36:124704)VMMVMKCall: 208: Received INIT from world 124704
2013-12-16T19:04:15.978Z cpu35:124705)VMMVMKCall: 208: Received INIT from world 124705
2013-12-16T19:04:15.979Z cpu30:124706)VMMVMKCall: 208: Received INIT from world 124706
2013-12-16T19:04:15.981Z cpu38:124707)VMMVMKCall: 208: Received INIT from world 124707
2013-12-16T19:04:15.983Z cpu22:124708)VMMVMKCall: 208: Received INIT from world 124708
2013-12-16T19:04:15.984Z cpu23:124709)VMMVMKCall: 208: Received INIT from world 124709
2013-12-16T19:04:16.048Z cpu2:125131)NetPort: 1380: enabled port 0x4000006 with mac 00:50:56:ac:5a:87
2013-12-16T19:04:16.049Z cpu2:125131)NetPort: 1380: enabled port 0x3000007 with mac 00:50:56:ac:5a:88
2013-12-16T19:04:16.050Z cpu2:125131)NetPort: 1380: enabled port 0x5000006 with mac 00:50:56:ac:5a:89
2013-12-16T19:04:16.057Z cpu2:124693)VMotion: 5679: 1387220456095408
Received all changed pages.
2013-12-16T19:04:16.163Z cpu4:124693)VmMemMigrate: vm 124693: 5005: Regular swap file bitmap checks out.
2013-12-16T19:04:16.174Z cpu2:124693)VMotion: 5458: 1387220456095408
Resume handshake successful
2013-12-16T19:04:16.235Z cpu2:125131)Hbr: 3405: Migration end received (worldID=124693) (migrateType=1) (event=1) (isSource=0) (sharedConfig=1)
2013-12-16T19:04:16.290Z cpu17:125147)Swap: vm 124693: 3254: Starting prefault for the migration swap file
2013-12-16T19:04:16.481Z cpu39:17594)Config: 347: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)
2013-12-16T19:04:17.081Z cpu17:125147)Swap: vm 124693: 3429: Finish swapping in migration swap file. (faulted 0 pages, pshared 0 pages). Success.
2013-12-16T19:10:35.987Z cpu24:17632)Config: 347: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)
2013-12-16T19:10:39.817Z cpu19:124692)Migrate: vm 124693: 3273: Setting VMOTION info: Source ts = 1387221025280408, src ip = <10.x.x.x> dest ip = <10.x.x.x> Dest wid = 142487 using SHARED swap
2013-12-16T19:10:39.817Z cpu19:124692)WARNING: VMotionUtil: 5028: 1387221025280408 S: vMotion vmknic is undefined. vMotion will attempt to proceed, but may not reach optimal performance.
2013-12-16T19:10:39.824Z cpu19:124692)Tcpip_Vmk: 1213: Affinitizing 10.x.x.x to world 125561, Success
2013-12-16T19:10:39.824Z cpu19:124692)VMotion: 2688: 1387221025280408 S: Set ip address '10.x.x.x' worldlet affinity to send World ID 125561
2013-12-16T19:10:39.824Z cpu19:124692)Hbr: 3308: Migration start received (worldID=124693) (migrateType=1) (event=0) (isSource=1) (sharedConfig=1)
2013-12-16T19:10:39.826Z cpu6:17026)MigrateNet: vm 17026: 2077: Accepted connection from <10.x.x.x>
2013-12-16T19:10:39.826Z cpu6:17026)MigrateNet: vm 17026: 2147: dataSocket 0x4100470e8580 receive buffer size is 563272
2013-12-16T19:10:39.826Z cpu24:125561)VMotionUtil: 3184: 1387221025280408 S: Stream connection 1 added.
2013-12-16T19:10:39.827Z cpu24:125561)VMotionUtil: 3184: 1387221025280408 S: Stream connection 2 added.
2013-12-16T19:12:53.751Z cpu0:124693)VMotion: 4502: 1387221025280408 S: Another pre-copy iteration needed with 1012394 pages left to send (prev2 9961472, prev 9961472, network bandwidth ~128.293 MB/s, 4680% t2d)
2013-12-16T19:13:23.452Z cpu2:124693)VMotion: 4502: 1387221025280408 S: Another pre-copy iteration needed with 264458 pages left to send (prev2 9961472, prev 68, network bandwidth ~131.425 MB/s, 349% t2d)
2013-12-16T19:13:30.800Z cpu2:124693)VMotion: 4502: 1387221025280408 S: Another pre-copy iteration needed with 86525 pages left to send (prev2 68, prev 115, network bandwidth ~149.583 MB/s, 260% t2d)
2013-12-16T19:13:34.939Z cpu11:124693)VMotion: 4502: 1387221025280408 S: Another pre-copy iteration needed with 29247 pages left to send (prev2 115, prev 153, network bandwidth ~122.496 MB/s, 187% t2d)
2013-12-16T19:13:37.501Z cpu4:124693)VMotion: 4417: 1387221025280408 S: Stopping pre-copy: only 9172 pages left to send, which can be sent within the switchover time goal of 0.500 seconds (network bandwidth ~84.152 MB/s, 182% t2d)
2013-12-16T19:13:37.564Z cpu8:125131)VSCSI: 6340: handle 8192(vscsi0:0):Destroying Device for world 124693 (pendCom 0)
2013-12-16T19:13:37.575Z cpu10:125131)VSCSI: 6340: handle 8193(vscsi0:1):Destroying Device for world 124693 (pendCom 0)
2013-12-16T19:13:37.576Z cpu10:125131)VSCSI: 6340: handle 8194(vscsi0:2):Destroying Device for world 124693 (pendCom 0)
2013-12-16T19:13:37.585Z cpu10:125131)VSCSI: 6340: handle 8195(vscsi0:3):Destroying Device for world 124693 (pendCom 0)
2013-12-16T19:13:37.586Z cpu10:125131)VSCSI: 6340: handle 8196(vscsi0:5):Destroying Device for world 124693 (pendCom 0)
2013-12-16T19:13:37.587Z cpu10:125131)NetPort: 1574: disabled port 0x4000006
2013-12-16T19:13:37.587Z cpu10:125131)NetPort: 1574: disabled port 0x3000007
2013-12-16T19:13:37.587Z cpu10:125131)NetPort: 1574: disabled port 0x5000006
2013-12-16T19:13:38.591Z cpu6:125561)VMotionSend: 3640: 1387221025280408 S: Sent all modified pages to destination (network bandwidth ~120.884 MB/s)
2013-12-16T19:13:40.152Z cpu4:124692)Hbr: 3405: Migration end received (worldID=124693) (migrateType=1) (event=1) (isSource=1) (sharedConfig=1)
2013-12-16T19:13:40.169Z cpu22:16451)WARNING: Uplink: 3622: requesting cap 0x0!
2013-12-16T19:13:40.169Z cpu22:16451)WARNING: Uplink: 3622: requesting cap 0x0!
2013-12-16T19:13:40.169Z cpu22:16451)Net: 2475: disconnected client from port 0x5000006
2013-12-16T19:13:40.169Z cpu22:16451)Net: 2475: disconnected client from port 0x3000007
2013-12-16T19:13:40.170Z cpu22:16451)WARNING: Uplink: 3622: requesting cap 0x0!
2013-12-16T19:13:40.170Z cpu22:16451)WARNING: Uplink: 3622: requesting cap 0x0!
2013-12-16T19:13:40.170Z cpu22:16451)Net: 2475: disconnected client from port 0x4000006
2013-12-16T19:13:45.314Z cpu16:124689)Config: 347: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)
2013-12-16T19:18:39.608Z cpu22:16406)vmnic9 : 87:00.1 Link Down
2013-12-16T19:18:39.811Z cpu30:16414)vmnic9 : 87:00.1 Link Up - 10 Gbps Full Duplex
2013-12-16T19:18:43.329Z cpu20:16448)NetPort: 1574: disabled port 0x3000002
2013-12-16T19:18:43.329Z cpu20:16448)Uplink: 4915: enabled port 0x3000002 with mac 00:9c:02:3c:f8:8c
2013-12-16T19:18:45.038Z cpu36:16420)vmnic9 : 87:00.1 Link Down
2013-12-16T19:18:45.241Z cpu36:16420)vmnic9 : 87:00.1 Link Up - 10 Gbps Full Duplex
2013-12-16T19:18:48.331Z cpu20:16448)NetPort: 1574: disabled port 0x3000002
2013-12-16T19:18:48.331Z cpu20:16448)Uplink: 4915: enabled port 0x3000002 with mac 00:9c:02:3c:f8:8c
2013-12-16T19:19:04.930Z cpu6:16390)vmnic4 : 0e:00.0 Link Down
2013-12-16T19:19:05.139Z cpu6:16390)vmnic4 : 0e:00.0 Link Up - 10 Gbps Full Duplex
2013-12-16T19:19:09.525Z cpu0:16424)vmnic4 : 0e:00.0 Link Down
2013-12-16T19:19:09.744Z cpu0:126038)vmnic4 : 0e:00.0 Link Up - 10 Gbps Full Duplex
2013-12-16T19:26:01.714Z cpu0:124689)Config: 347: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)
2013-12-16T19:26:01.834Z cpu1:126290)MemSched: vm 126290: 7756: extended swap to 8192 pgs
2013-12-16T19:26:02.633Z cpu1:126290)World: vm 126291: 1421: Starting world vmm0:VirtualMachine with flags 8
2013-12-16T19:26:02.635Z cpu1:126290)Sched: vm 126291: 6416: Adding world 'vmm0:VirtualMachine', group 'host/user/pool1', cpu: shares=-3 min=0 minLimit=-1 max=-1, mem: shares=-3 min=0 minLimit=-1 max=-1
2013-12-16T19:26:02.635Z cpu1:126290)Sched: vm 126291: 6431: renamed group 199587 to vm.126290
2013-12-16T19:26:02.635Z cpu1:126290)Sched: vm 126291: 6448: group 199587 is located under group 199582
2013-12-16T19:26:02.641Z cpu1:126290)MemSched: vm 126290: 7756: extended swap to 31221 pgs
2013-12-16T19:26:02.677Z cpu1:126290)World: vm 126293: 1421: Starting world vmm1:VirtualMachine with flags 8
2013-12-16T19:26:02.678Z cpu1:126290)World: vm 126294: 1421: Starting world vmm2:VirtualMachine with flags 8
2013-12-16T19:26:02.678Z cpu1:126290)World: vm 126295: 1421: Starting world vmm3:VirtualMachine with flags 8
2013-12-16T19:26:02.678Z cpu1:126290)World: vm 126296: 1421: Starting world vmm4:VirtualMachine with flags 8
2013-12-16T19:26:02.678Z cpu1:126290)World: vm 126297: 1421: Starting world vmm5:VirtualMachine with flags 8
2013-12-16T19:26:02.678Z cpu1:126290)World: vm 126298: 1421: Starting world vmm6:VirtualMachine with flags 8
2013-12-16T19:26:02.678Z cpu1:126290)World: vm 126299: 1421: Starting world vmm7:VirtualMachine with flags 8
2013-12-16T19:26:02.679Z cpu1:126290)World: vm 126300: 1421: Starting world vmm8:VirtualMachine with flags 8
2013-12-16T19:26:02.679Z cpu1:126290)World: vm 126301: 1421: Starting world vmm9:VirtualMachine with flags 8
2013-12-16T19:26:02.679Z cpu1:126290)World: vm 126302: 1421: Starting world vmm10:VirtualMachine with flags 8
2013-12-16T19:26:02.680Z cpu1:126290)World: vm 126303: 1421: Starting world vmm11:VirtualMachine with flags 8
2013-12-16T19:26:02.680Z cpu1:126290)World: vm 126304: 1421: Starting world vmm12:VirtualMachine with flags 8
2013-12-16T19:26:02.680Z cpu1:126290)World: vm 126305: 1421: Starting world vmm13:VirtualMachine with flags 8
2013-12-16T19:26:02.680Z cpu1:126290)World: vm 126306: 1421: Starting world vmm14:VirtualMachine with flags 8
2013-12-16T19:26:02.680Z cpu1:126290)World: vm 126307: 1421: Starting world vmm15:VirtualMachine with flags 8
2013-12-16T19:26:02.842Z cpu1:126290)Migrate: vm 126291: 3273: Setting VMOTION info: Dest ts = 1387221957736408, src ip = <10.x.x.x> dest ip = <10.x.x.x> Dest wid = 0 using SHARED swap
2013-12-16T19:26:02.842Z cpu1:126290)WARNING: VMotionUtil: 5028: 1387221957736408
vMotion vmknic is undefined. vMotion will attempt to proceed, but may not reach optimal performance.
2013-12-16T19:26:02.860Z cpu1:126290)Tcpip_Vmk: 1213: Affinitizing 10.x.x.x to world 126312, Success
2013-12-16T19:26:02.860Z cpu1:126290)VMotion: 2688: 1387221957736408
Set ip address '10.x.x.x' worldlet affinity to recv World ID 126312
2013-12-16T19:26:02.860Z cpu1:126290)Hbr: 3308: Migration start received (worldID=126291) (migrateType=1) (event=0) (isSource=0) (sharedConfig=1)
2013-12-16T19:26:03.223Z cpu6:17026)MigrateNet: vm 17026: 2077: Accepted connection from <10.x.x.x>
2013-12-16T19:26:03.223Z cpu6:17026)MigrateNet: vm 17026: 2147: dataSocket 0x410047109030 receive buffer size is 563272
2013-12-16T19:26:03.226Z cpu6:17026)MigrateNet: vm 17026: 2077: Accepted connection from <10.x.x.x>
2013-12-16T19:26:03.226Z cpu6:17026)MigrateNet: vm 17026: 2147: dataSocket 0x410047109d90 receive buffer size is 563272
2013-12-16T19:26:03.226Z cpu6:17026)VMotionUtil: 3184: 1387221957736408
Stream connection 1 added.
2013-12-16T19:26:03.227Z cpu6:17026)MigrateNet: vm 17026: 2077: Accepted connection from <10.x.x.x>
2013-12-16T19:26:03.227Z cpu6:17026)MigrateNet: vm 17026: 2147: dataSocket 0x41004710a440 receive buffer size is 563272
2013-12-16T19:26:03.227Z cpu6:17026)VMotionUtil: 3184: 1387221957736408
Stream connection 2 added.
2013-12-16T19:28:31.711Z cpu0:126312)VMotionRecv: 638: 1387221957736408
Estimated network bandwidth 147.645 MB/s during pre-copy
2013-12-16T19:28:32.774Z cpu2:126290)VSCSI: 3780: handle 8197(vscsi0:0):Creating Virtual Device for world 126291 (FSS handle 806346)
2013-12-16T19:28:32.776Z cpu4:126312)VMotionRecv: 2386: 1387221957736408
DONE paging in
2013-12-16T19:28:32.776Z cpu4:126312)VMotionRecv: 2394: 1387221957736408
Estimated network bandwidth 74.964 MB/s during page-in
2013-12-16T19:28:32.776Z cpu2:126290)VSCSI: 3780: handle 8198(vscsi0:1):Creating Virtual Device for world 126291 (FSS handle 871883)
2013-12-16T19:28:32.777Z cpu2:126290)VSCSI: 3780: handle 8199(vscsi0:2):Creating Virtual Device for world 126291 (FSS handle 2706893)
2013-12-16T19:28:32.779Z cpu2:126290)VSCSI: 3780: handle 8200(vscsi0:3):Creating Virtual Device for world 126291 (FSS handle 937422)
2013-12-16T19:28:32.781Z cpu2:126290)VSCSI: 3780: handle 8201(vscsi0:5):Creating Virtual Device for world 126291 (FSS handle 740815)
2013-12-16T19:28:33.319Z cpu2:126291)VMMVMKCall: 208: Received INIT from world 126291
2013-12-16T19:28:33.328Z cpu2:126291)LSI: 1986: LSI: Initialized rings for scsi0 async=1, record=0 replay=0
2013-12-16T19:28:33.330Z cpu2:126558)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:28:33.330Z cpu2:126558)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:28:33.330Z cpu2:126558)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:28:33.330Z cpu2:126558)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:28:33.330Z cpu2:126558)Net: 1858: connected VirtualMachine eth2 to iSCSI, portID 0x5000007
2013-12-16T19:28:33.332Z cpu2:126558)Net: 1858: connected VirtualMachine eth1 to Heartbeat, portID 0x3000008
2013-12-16T19:28:33.332Z cpu2:126558)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:28:33.332Z cpu2:126558)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:28:33.332Z cpu2:126558)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:28:33.332Z cpu2:126558)WARNING: Uplink: 3075: releasing cap 0x0!
2013-12-16T19:28:33.332Z cpu2:126558)Net: 1858: connected VirtualMachine eth0 to 10.x.x.x, portID 0x4000007
2013-12-16T19:28:33.340Z cpu13:126293)VMMVMKCall: 208: Received INIT from world 126293
2013-12-16T19:28:33.344Z cpu0:126294)VMMVMKCall: 208: Received INIT from world 126294
2013-12-16T19:28:33.346Z cpu6:126295)VMMVMKCall: 208: Received INIT from world 126295
2013-12-16T19:28:33.348Z cpu9:126296)VMMVMKCall: 208: Received INIT from world 126296
2013-12-16T19:28:33.350Z cpu12:126297)VMMVMKCall: 208: Received INIT from world 126297
2013-12-16T19:28:33.352Z cpu19:126299)VMMVMKCall: 208: Received INIT from world 126299
2013-12-16T19:28:33.353Z cpu17:126298)VMMVMKCall: 208: Received INIT from world 126298
2013-12-16T19:28:33.354Z cpu28:126300)VMMVMKCall: 208: Received INIT from world 126300
2013-12-16T19:28:33.355Z cpu32:126301)VMMVMKCall: 208: Received INIT from world 126301
2013-12-16T19:28:33.357Z cpu24:126302)VMMVMKCall: 208: Received INIT from world 126302
2013-12-16T19:28:33.358Z cpu29:126303)VMMVMKCall: 208: Received INIT from world 126303
2013-12-16T19:28:33.360Z cpu24:126304)VMMVMKCall: 208: Received INIT from world 126304
2013-12-16T19:28:33.361Z cpu27:126305)VMMVMKCall: 208: Received INIT from world 126305
2013-12-16T19:28:33.363Z cpu21:126306)VMMVMKCall: 208: Received INIT from world 126306
2013-12-16T19:28:33.365Z cpu24:126307)VMMVMKCall: 208: Received INIT from world 126307
2013-12-16T19:28:33.426Z cpu2:126558)NetPort: 1380: enabled port 0x4000007 with mac 00:50:56:ac:5a:87
2013-12-16T19:28:33.428Z cpu2:126558)NetPort: 1380: enabled port 0x3000008 with mac 00:50:56:ac:5a:88
2013-12-16T19:28:33.428Z cpu2:126558)NetPort: 1380: enabled port 0x5000007 with mac 00:50:56:ac:5a:89
2013-12-16T19:28:33.436Z cpu2:126291)VMotion: 5679: 1387221957736408
Received all changed pages.
2013-12-16T19:28:33.535Z cpu2:126291)VmMemMigrate: vm 126291: 5005: Regular swap file bitmap checks out.
2013-12-16T19:28:33.543Z cpu2:126291)VMotion: 5458: 1387221957736408
Resume handshake successful
2013-12-16T19:28:33.633Z cpu2:126558)Hbr: 3405: Migration end received (worldID=126291) (migrateType=1) (event=1) (isSource=0) (sharedConfig=1)
2013-12-16T19:28:33.685Z cpu3:126574)Swap: vm 126291: 3254: Starting prefault for the migration swap file
2013-12-16T19:28:34.117Z cpu0:17632)Config: 347: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)
2013-12-16T19:28:34.488Z cpu3:126574)Swap: vm 126291: 3429: Finish swapping in migration swap file. (faulted 0 pages, pshared 0 pages). Success.
2013-12-16T19:34:21.988Z cpu34:124689)Config: 347: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)
2013-12-16T19:34:25.676Z cpu19:126290)Migrate: vm 126291: 3273: Setting VMOTION info: Source ts = 1387222457240408, src ip = <10.x.x.x> dest ip = <10.x.x.x> Dest wid = 143974 using SHARED swap
2013-12-16T19:34:25.676Z cpu19:126290)WARNING: VMotionUtil: 5028: 1387222457240408 S: vMotion vmknic is undefined. vMotion will attempt to proceed, but may not reach optimal performance.
2013-12-16T19:34:25.683Z cpu19:126290)Tcpip_Vmk: 1213: Affinitizing 10.x.x.x to world 126935, Success
2013-12-16T19:34:25.683Z cpu19:126290)VMotion: 2688: 1387222457240408 S: Set ip address '10.x.x.x' worldlet affinity to send World ID 126935
2013-12-16T19:34:25.683Z cpu19:126290)Hbr: 3308: Migration start received (worldID=126291) (migrateType=1) (event=0) (isSource=1) (sharedConfig=1)
2013-12-16T19:34:25.687Z cpu6:17026)MigrateNet: vm 17026: 2077: Accepted connection from <10.x.x.x>
2013-12-16T19:34:25.687Z cpu6:17026)MigrateNet: vm 17026: 2147: dataSocket 0x410047108980 receive buffer size is 563272
2013-12-16T19:34:25.687Z cpu3:126935)VMotionUtil: 3184: 1387222457240408 S: Stream connection 1 added.
2013-12-16T19:34:25.687Z cpu3:126935)VMotionUtil: 3184: 1387222457240408 S: Stream connection 2 added.
2013-12-16T19:37:37.829Z cpu0:126291)VMotion: 4502: 1387222457240408 S: Another pre-copy iteration needed with 1787207 pages left to send (prev2 9961472, prev 9961472, network bandwidth ~112.991 MB/s, 2727% t2d)
2013-12-16T19:38:24.918Z cpu0:126291)VMotion: 4502: 1387222457240408 S: Another pre-copy iteration needed with 373388 pages left to send (prev2 9961472, prev 74, network bandwidth ~140.506 MB/s, 372% t2d)
2013-12-16T19:38:35.663Z cpu17:126291)VMotion: 4502: 1387222457240408 S: Another pre-copy iteration needed with 115397 pages left to send (prev2 74, prev 69, network bandwidth ~143.951 MB/s, 262% t2d)
2013-12-16T19:38:39.671Z cpu2:126291)VMotion: 4502: 1387222457240408 S: Another pre-copy iteration needed with 32453 pages left to send (prev2 69, prev 32, network bandwidth ~146.465 MB/s, 224% t2d)
2013-12-16T19:38:41.822Z cpu2:126291)VMotion: 4417: 1387222457240408 S: Stopping pre-copy: only 8041 pages left to send, which can be sent within the switchover time goal of 0.500 seconds (network bandwidth ~83.733 MB/s, 215% t2d)
2013-12-16T19:38:41.859Z cpu12:126558)VSCSI: 6340: handle 8197(vscsi0:0):Destroying Device for world 126291 (pendCom 0)
2013-12-16T19:38:41.874Z cpu12:126558)VSCSI: 6340: handle 8198(vscsi0:1):Destroying Device for world 126291 (pendCom 0)
2013-12-16T19:38:41.880Z cpu12:126558)VSCSI: 6340: handle 8199(vscsi0:2):Destroying Device for world 126291 (pendCom 0)
2013-12-16T19:38:41.892Z cpu12:126558)VSCSI: 6340: handle 8200(vscsi0:3):Destroying Device for world 126291 (pendCom 0)
2013-12-16T19:38:41.892Z cpu12:126558)VSCSI: 6340: handle 8201(vscsi0:5):Destroying Device for world 126291 (pendCom 0)
2013-12-16T19:38:41.894Z cpu12:126558)NetPort: 1574: disabled port 0x4000007
2013-12-16T19:38:41.894Z cpu12:126558)NetPort: 1574: disabled port 0x3000008
2013-12-16T19:38:41.894Z cpu12:126558)NetPort: 1574: disabled port 0x5000007
2013-12-16T19:38:43.081Z cpu24:126935)VMotionSend: 3640: 1387222457240408 S: Sent all modified pages to destination (network bandwidth ~67.584 MB/s)
2013-12-16T19:38:44.703Z cpu2:126290)Hbr: 3405: Migration end received (worldID=126291) (migrateType=1) (event=1) (isSource=1) (sharedConfig=1)
2013-12-16T19:38:44.720Z cpu22:16451)WARNING: Uplink: 3622: requesting cap 0x0!
2013-12-16T19:38:44.720Z cpu22:16451)WARNING: Uplink: 3622: requesting cap 0x0!
2013-12-16T19:38:44.720Z cpu22:16451)Net: 2475: disconnected client from port 0x5000007
2013-12-16T19:38:44.720Z cpu22:16451)Net: 2475: disconnected client from port 0x3000008
2013-12-16T19:38:44.720Z cpu22:16451)WARNING: Uplink: 3622: requesting cap 0x0!
2013-12-16T19:38:44.720Z cpu22:16451)WARNING: Uplink: 3622: requesting cap 0x0!
2013-12-16T19:38:44.720Z cpu22:16451)Net: 2475: disconnected client from port 0x4000007
2013-12-16T19:38:49.949Z cpu19:17594)Config: 347: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)
Thanks!
"WARNING: VMotionUtil: 5028: 1387221025280408 S: vMotion vmknic is undefined. vMotion will attempt to proceed, but may not reach optimal performance." came up couple of times. If you are using multi-NIC vmotion, you might need to check whether each vmotion vmnic is bound to a physical nic using nic teaming and make sure setting are consistent over all hosts. See this two links (bit old but still useful). You should see something like "...Successfully bound connection to vmknic..."
http://www.yellow-bricks.com/2011/09/17/multiple-nic-vmotion-in-vsphere-5/
wwan, thank you for the links, I will look into those.
After reading your last post I put only one NIC as active in the vMotion port group on both hosts. Same results.
vCenter is showing that the VM is operating on the destination host for several seconds (sometimes as much as ten) before its VM interface IP stops timing out.
I do not recall having these issues when our hosts were on version 5.0. The hosts were upgraded, not freshly installed.
I dont recall that either. Usually the setup of vmotion is quite straight forward. Plus there is not much change in 5.1. Might be worth to scrap the vmotion vswitch on both hosts and re-create them. The mentioned log entires definitely dont look right.
Dialectic, has this been resolved for you? I noticed no updates to this post after December?
We are experiencing the same exact issue as you. We have VDs on all hosts and they are in a cluster. It was happening with 5.1 Update 1 and now it continues to occur with Update 2. It happens like you mentioned on VMs with higher CPU and Memory with a greater impact on the network timeout (when pinging)
Background on my infrastructure: HP Blade Gen8 Chassis with Flex 10/10 card connected to Nexus 5000 switches.
Hey MikeF0209,
I apologize for the delay in responding.
Is there a specific application on the VMs you are seeing issue?
For us, the only VMs affected by the issue are clustered SQL VMs.
Our VMware Technical Account Manager recently informed us that vMotion and Storage vMotion is not officially supported by VMware.
You CAN vMotion and Storage vMotion VMs with SQL, but it's possible that some SQL instances might fail and have to be restarted after the vMotion cutover.
We now failover instances on another VM when we need to vMotion/svMotion a SQL VM.
We have other VMs with high IO but we do not see the kind of issues as with the SQL VMs when using vMotion/svMotion.
I hope that helps.
I'm having a similar issue. We have Nexus 5Ks (also happening on the 6509s) with the 1000V. When I vmotion a machine 64GB RAM, I dropped over 10 pings... the other host is on the same switch. This hurts our Exchange 2010 environment. vmotion is supported with 2010, although there are some caveats with these settings,
cluster /prop SameSubnetDelay=2000:DWORD
cluster /prop CrossSubnetDelay=4000:DWORD
cluster /prop CrossSubnetThreshold=10:DWORD
cluster /prop SameSubnetThreshold=10:DWORD
but that doesn't help if I'm dropping off the network for that long.
We are still experiencing the issue. We've updated to 5.1 update 2, have called VMWare to take a look at our config same with HP and they all report everything is configured correctly.
Next step is to get them both on the same call together.
