tdubb123
Expert
Expert

Migration to host x.x.x.x failed with error Already disconnected (195887150).

I am getting this timeout error in vcenter when doing a vmotion of a vm. any idea what this is?

this is on esxi 5.1 914609

Migration to host x.x.x.x failed with error Already disconnected (195887150).
vMotion migration [171704789:1358656707894315] failed writing stream completion: Already disconnected
vMotion migration [171704789:1358656707894315] failed to flush stream buffer: Already disconnected
vMotion migration [171704789:1358656707894315] socket connected returned: Already disconnected
0 Kudos
25 Replies
Sreec
VMware Employee
VMware Employee

Hi,

    Looks like its matching with KB:http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=103026... do confirm the status.

Cheers,
Sree | CKA|CKAD|VCIX-3X| VCAP-4X| VExpert 5x
Please KUDO helpful posts and mark the thread as solved if answered
0 Kudos
tdubb123
Expert
Expert

I checked on all of them. Even changed vmk ip addesses. but still same problem.  tried vmkping/ping each hosts vmk ips and works. Its random. sometimes vmotion works but most of the time it doesnot and times out.

Tried it again and now it gives this

Failed to initialize migration at source. Error 195887109. Already exists.

vMotion migration Re: Migration to host x.x.x.x failed with error Already disconnected (195887150). The requested scheduler group already exists. Failing migration. A previously attempted migration may still be cleaning up state. Please retry in a few minutes.

The vMotion failed because the destination host did not receive data from the source host on the vMotion network. Please check your vMotion network settings and physical network configuration and ensure they are correct.

0 Kudos
Sreec
VMware Employee
VMware Employee

Hi,

    Can you please post the vmkernel/hostd messages when the vmotion failed.

Cheers,
Sree | CKA|CKAD|VCIX-3X| VCAP-4X| VExpert 5x
Please KUDO helpful posts and mark the thread as solved if answered
0 Kudos
tdubb123
Expert
Expert

here are the vmotion logs from the source when it fails

2013-01-20T15:00:37.200Z cpu12:9141)Config: 347: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)

2013-01-20T15:00:37.917Z cpu6:44048)Migrate: vm 44049: 3273: Setting VMOTION info: Source ts = 1358694037317522, src ip = <10.60.1.249> dest ip = <10.60.1.213> Dest wid = 42592 using SHARED swap

2013-01-20T15:00:37.920Z cpu4:44242)MigrateNet: 1165: 1358694037317522 S: Successfully bound connection to vmknic '10.60.1.249'

2013-01-20T15:00:37.920Z cpu6:44048)Tcpip_Vmk: 1213: Affinitizing 10.60.1.249 to world 44242, Success

2013-01-20T15:00:37.920Z cpu6:44048)VMotion: 2688: 1358694037317522 S: Set ip address '10.60.1.249' worldlet affinity to send World ID 44242

2013-01-20T15:00:37.920Z cpu6:44048)Hbr: 3308: Migration start received (worldID=44049) (migrateType=1) (event=0) (isSource=1) (sharedConfig=1)

2013-01-20T15:00:37.920Z cpu14:8796)MigrateNet: vm 8796: 2061: Accepted connection from <10.60.1.213>

2013-01-20T15:00:37.920Z cpu14:8796)MigrateNet: vm 8796: 2131: dataSocket 0x4100264a3a00 receive buffer size is 563724

2013-01-20T15:00:37.921Z cpu4:44242)MigrateNet: 1165: 1358694037317522 S: Successfully bound connection to vmknic '10.60.1.249'

2013-01-20T15:00:37.921Z cpu4:44242)VMotionUtil: 3184: 1358694037317522 S: Stream connection 1 added.

2013-01-20T15:00:37.921Z cpu4:44242)MigrateNet: 1165: 1358694037317522 S: Successfully bound connection to vmknic '10.60.1.250'

2013-01-20T15:00:37.921Z cpu4:44242)VMotionUtil: 3184: 1358694037317522 S: Stream connection 2 added.

2013-01-20T15:00:57.942Z cpu17:44240)WARNING: VMotionUtil: 2098: 1358694037317522 S: Stream write completion 0x4125b00133d0 inactive for at least 20 seconds! Progressed to 60392/527876 bytes.

2013-01-20T15:00:57.942Z cpu17:44240)VMotionUtil: 5119: 1358694037317522 S: Socket 0x410026469400 streamHealth pending: 0/563724 snd 144 rcv

2013-01-20T15:00:57.942Z cpu19:44241)WARNING: VMotionUtil: 2098: 1358694037317522 S: Stream write completion 0x4125b00131b0 inactive for at least 20 seconds! Progressed to 35816/527876 bytes.

2013-01-20T15:00:57.942Z cpu19:44241)VMotionUtil: 5119: 1358694037317522 S: Socket 0x41001b835a00 streamHealth pending: 0/563724 snd 144 rcv

2013-01-20T15:01:17.963Z cpu19:44241)WARNING: VMotionUtil: 2098: 1358694037317522 S: Stream write completion 0x4125b00133d0 inactive for at least 20 seconds! Progressed to 60392/527876 bytes.

2013-01-20T15:01:17.963Z cpu19:44241)VMotionUtil: 5119: 1358694037317522 S: Socket 0x410026469400 streamHealth pending: 0/563724 snd 144 rcv

2013-01-20T15:01:17.963Z cpu17:44240)WARNING: VMotionUtil: 2098: 1358694037317522 S: Stream write completion 0x4125b00131b0 inactive for at least 20 seconds! Progressed to 35816/527876 bytes.

2013-01-20T15:01:17.963Z cpu17:44240)VMotionUtil: 5119: 1358694037317522 S: Socket 0x41001b835a00 streamHealth pending: 0/563724 snd 144 rcv

2013-01-20T15:01:37.983Z cpu19:44241)WARNING: VMotionUtil: 2098: 1358694037317522 S: Stream write completion 0x4125b00131b0 inactive for at least 20 seconds! Progressed to 35816/527876 bytes.

2013-01-20T15:01:37.983Z cpu19:44241)VMotionUtil: 5119: 1358694037317522 S: Socket 0x41001b835a00 streamHealth pending: 0/563724 snd 144 rcv

2013-01-20T15:01:37.983Z cpu17:44240)WARNING: VMotionUtil: 2098: 1358694037317522 S: Stream write completion 0x4125b00133d0 inactive for at least 20 seconds! Progressed to 60392/527876 bytes.

2013-01-20T15:01:37.983Z cpu17:44240)VMotionUtil: 5119: 1358694037317522 S: Socket 0x410026469400 streamHealth pending: 0/563724 snd 144 rcv

2013-01-20T15:01:58.004Z cpu17:44240)WARNING: VMotionUtil: 2098: 1358694037317522 S: Stream write completion 0x4125b00133d0 inactive for at least 20 seconds! Progressed to 60392/527876 bytes.

2013-01-20T15:01:58.004Z cpu17:44240)VMotionUtil: 5119: 1358694037317522 S: Socket 0x410026469400 streamHealth pending: 0/563724 snd 144 rcv

2013-01-20T15:01:58.004Z cpu19:44241)WARNING: VMotionUtil: 2098: 1358694037317522 S: Stream write completion 0x4125b00131b0 inactive for at least 20 seconds! Progressed to 35816/527876 bytes.

2013-01-20T15:01:58.004Z cpu19:44241)VMotionUtil: 5119: 1358694037317522 S: Socket 0x41001b835a00 streamHealth pending: 0/563724 snd 144 rcv

2013-01-20T15:02:01.346Z cpu19:44241)WARNING: VMotionUtil: 4526: 1358694037317522 S: socket connected returned: Already disconnected

2013-01-20T15:02:01.346Z cpu19:44241)WARNING: VMotionUtil: 6990: 1358694037317522 S: failed to flush stream buffer: Already disconnected

2013-01-20T15:02:01.346Z cpu17:44240)WARNING: VMotionUtil: 4526: 1358694037317522 S: socket connected returned: Already disconnected

2013-01-20T15:02:01.346Z cpu19:44241)WARNING: VMotionUtil: 1617: 1358694037317522 S: failed writing stream completion: Already disconnected

2013-01-20T15:02:01.346Z cpu17:44240)WARNING: VMotionUtil: 6990: 1358694037317522 S: failed to flush stream buffer: Already disconnected

2013-01-20T15:02:01.346Z cpu17:44240)WARNING: VMotionUtil: 1617: 1358694037317522 S: failed writing stream completion: Already disconnected

2013-01-20T15:02:01.346Z cpu19:44241)WARNING: Migrate: 269: 1358694037317522 S: Failed: Already disconnected (0xbad002e) @0x0

2013-01-20T15:02:01.346Z cpu19:44241)VMotionUtil: 5119: 1358694037317522 S: Socket 0x410026469400 streamSock pending: 563724/563724 snd 144 rcv

2013-01-20T15:02:01.346Z cpu19:44241)VMotionUtil: 5119: 1358694037317522 S: Socket 0x41001b835a00 streamSock pending: 563724/563724 snd 144 rcv

2013-01-20T15:02:01.422Z cpu0:44059)WARNING: Migrate: 5244: 1358694037317522 S: Migration considered a failure by the VMX. It is most likely a timeout, but check the VMX log for the true error.

2013-01-20T15:02:01.422Z cpu0:44059)Hbr: 3405: Migration end received (worldID=44049) (migrateType=1) (event=1) (isSource=1) (sharedConfig=1)

2013-01-20T15:02:01.422Z cpu12:44243)VMotionUtil: 5119: 1358694037317522 S: Socket 0x4100264a3a00 rcvMigFree pending: 31424/35792 snd 0 rcv

2013-01-20T15:02:01.455Z cpu2:9138)Config: 347: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)

0 Kudos
tdubb123
Expert
Expert

vmkernel logs on the destination vmotion host

2013-01-20T15:10:23.547Z cpu14:9160)Config: 347: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)

2013-01-20T15:10:23.613Z cpu0:43396)MemSched: vm 43396: 7756: extended swap to 8192 pgs

2013-01-20T15:10:23.766Z cpu0:43396)World: vm 43397: 1421: Starting world vmm0:ucstest with flags 8

2013-01-20T15:10:23.766Z cpu0:43396)Sched: vm 43397: 6416: Adding world 'vmm0:ucstest', group 'host/user', cpu: shares=-3 min=0 minLimit=-1 max=-1, mem: shares=-3 min=0 minLimit=-1 max=-1

2013-01-20T15:10:23.766Z cpu0:43396)Sched: vm 43397: 6431: renamed group 63064 to vm.43396

2013-01-20T15:10:23.766Z cpu0:43396)Sched: vm 43397: 6448: group 63064 is located under group 4

2013-01-20T15:10:23.772Z cpu0:43396)MemSched: vm 43396: 7756: extended swap to 27336 pgs

2013-01-20T15:10:23.808Z cpu1:43396)Migrate: vm 43397: 3273: Setting VMOTION info: Dest ts = 1358694623315922, src ip = <10.60.1.249> dest ip = <10.60.1.213> Dest wid = 0 using SHARED swap

2013-01-20T15:10:23.812Z cpu1:43396)Tcpip_Vmk: 1213: Affinitizing 10.60.1.213 to world 43403, Success

2013-01-20T15:10:23.812Z cpu1:43396)VMotion: 2688: 1358694623315922 😧 Set ip address '10.60.1.213' worldlet affinity to recv World ID 43403

2013-01-20T15:10:23.812Z cpu1:43396)Hbr: 3308: Migration start received (worldID=43397) (migrateType=1) (event=0) (isSource=0) (sharedConfig=1)

2013-01-20T15:10:23.931Z cpu6:8796)MigrateNet: vm 8796: 2061: Accepted connection from <10.60.1.249>

2013-01-20T15:10:23.931Z cpu6:8796)MigrateNet: vm 8796: 2131: dataSocket 0x41001b8bc810 receive buffer size is 563724

2013-01-20T15:10:23.931Z cpu9:43402)MigrateNet: 1165: 1358694623315922 😧 Successfully bound connection to vmknic '10.60.1.213'

2013-01-20T15:10:23.931Z cpu6:8796)MigrateNet: vm 8796: 2061: Accepted connection from <10.60.1.249>

2013-01-20T15:10:23.931Z cpu6:8796)MigrateNet: vm 8796: 2131: dataSocket 0x41002709c810 receive buffer size is 563724

2013-01-20T15:10:23.931Z cpu6:8796)VMotionUtil: 3184: 1358694623315922 😧 Stream connection 1 added.

2013-01-20T15:10:23.931Z cpu6:8796)MigrateNet: vm 8796: 2061: Accepted connection from <10.60.1.250>

2013-01-20T15:10:23.931Z cpu6:8796)MigrateNet: vm 8796: 2131: dataSocket 0x41001bac1e10 receive buffer size is 563724

2013-01-20T15:10:23.932Z cpu6:8796)VMotionUtil: 3184: 1358694623315922 😧 Stream connection 2 added.

2013-01-20T15:10:43.934Z cpu9:43402)WARNING: Migrate: 269: 1358694623315922 😧 Failed: Timeout (0xbad0021) @0x41802eabde57

2013-01-20T15:10:43.934Z cpu9:43402)VMotionSend: 3978: 1358694623315922 😧 Waiting for channel references prior to cleanup...

2013-01-20T15:10:43.935Z cpu18:43396)WARNING: Migrate: 5244: 1358694623315922 😧 Migration considered a failure by the VMX. It is most likely a timeout, but check the VMX log for the true error.

2013-01-20T15:10:43.935Z cpu18:43396)Hbr: 3405: Migration end received (worldID=43397) (migrateType=1) (event=1) (isSource=0) (sharedConfig=1)

2013-01-20T15:10:43.935Z cpu0:43401)VMotionUtil: 5119: 1358694623315922 😧 Socket 0x4100270b7a10 sendSocket pending: 563644/563724 snd 0 rcv

2013-01-20T15:10:43.957Z cpu12:20650)Config: 347: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)

0 Kudos
Sreec
VMware Employee
VMware Employee

Hi,

    We need to check Virtual machine logs(Vmware) to figure out the exact error message>"Migration considered a failure by the VMX.  It is most likely a timeout, but check the VMX log for the true error".Can you please post the virtual machine logs.

1.Is this the only Virtual machine where in vmotion is failing?

Cheers,
Sree | CKA|CKAD|VCIX-3X| VCAP-4X| VExpert 5x
Please KUDO helpful posts and mark the thread as solved if answered
0 Kudos
tdubb123
Expert
Expert

its happening for all vms. not just one. If I waited a bit, then vmotion might go through. Usually a reboot as well will work. but it will fails after a while

0 Kudos
Sreec
VMware Employee
VMware Employee

Hi,

    Thanks for the update.Can you please upload the virtual machine logs(Failed Vmotion).I guess vmotion was working earlier?

Cheers,
Sree | CKA|CKAD|VCIX-3X| VCAP-4X| VExpert 5x
Please KUDO helpful posts and mark the thread as solved if answered
0 Kudos
spascoe64
Contributor
Contributor

Are you by any chance using a Qlogic 81xx CNA Adapter in this configuration?

I'm gettting the same type of errors that you are on a system that was working fine before I upgraded it from ESXi 4.1 to ESXi 5.1 recently.

Scott

0 Kudos
tiagomartinez
Enthusiast
Enthusiast

I am having this same problem at one customer.
here we are using ESXi 5.1 Build 799733, Qlogic 8152 Driver 2.00.00.54 , firmware 1.40.3.
Every time that 3 or more VMotion operations starts simultaneously this error appears.
anyone have any ideas??
Blog: http://vmwarebrasil.blogspot.com Consider awarding points for helpful or correct answers!
0 Kudos
spascoe64
Contributor
Contributor

It this point, I've got an open ticket with VMware trying to identify the issue.

We are currently focused on trying to rule in/rule out this KB issue: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=204287...

My VMware support technician is in the process of trying to define what the net packet pools size should be set to in this scenario.

That is all I know at this point.  Scott

0 Kudos
tdubb123
Expert
Expert

have you checked all your mtu settings to make sure they match - vmk, vswitch, physical port and physical switches?

0 Kudos
spascoe64
Contributor
Contributor

Oh Yeah. We've been through the servers with a fine tooth comb.... It is definitely not an obvious problem. I think it is a driver issue, but it could be a throughput/buffer issue as well.

0 Kudos
tsanches
Contributor
Contributor

anyone had this problem solved? I have a support ticket open for 40 days but so far nothing.

I can not upgrade the environment to version 5.1 with this problem!

0 Kudos
tdubb123
Expert
Expert

what hardware you running this on? have you tried vmkping -s -s 9000 to the storage?

are you using jumbo frames?

0 Kudos
tsanches
Contributor
Contributor

The hosts are running on a PowerEdge R710 with a Qlogic 8152. And we are using jumbo frames.

Access to the storage is OK. The vMotion and Storage Networs are in different subnet's.

0 Kudos
tdubb123
Expert
Expert

and vmkping to the storage is ok?

vmkping -s 9000 storage_ip

0 Kudos
tsanches
Contributor
Contributor

Yes. It's OK.

0 Kudos
tdubb123
Expert
Expert

how are you vmkernel interfaces configured? how many nics for vmotion? is vmotion and iscsi vlan separated?

0 Kudos