VMware Cloud Community
BPMalan
Contributor

ESX host hang

Guys

I had one of my ESX hosts hang in the middle of the night. I have ESX 4.1.0 Build 348481. Below is a snippet taken from the vmkernel log, from well before the failure (some time after 04:00) until I powered it off and on again.

Jun 1 18:24:38 BPMVM2 vmkernel: 19:03:57:45.060 cpu2:14252)NetPort: 1157: disabled port 0x100000c

Jun 1 18:24:38 BPMVM2 vmkernel: 19:03:57:45.064 cpu3:14251)VSCSI: 6495: handle 8536(vscsi0:0):Destroying Device for world 14252 (pendCom 0)

Jun 1 18:24:38 BPMVM2 vmkernel: 19:03:57:45.337 cpu2:14251)CBT: 682: Disconnecting the cbt device d1c802b-cbt with filehandle 219971627

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.232 cpu3:14252)NetPort: 982: enabled port 0x100000c with mac 00:0c:29:79:53:45

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.336 cpu2:14251)CBT: 1029: Created device 5db802a-cbt for cbt driver with filehandle 98271274

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.428 cpu2:14251)CBT: 1029: Created device 9d0c02e-cbt for cbt driver with filehandle 164675630

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.434 cpu2:14251)CBT: 682: Disconnecting the cbt device 5db802a-cbt with filehandle 98271274

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.438 cpu2:14251)CBT: 682: Disconnecting the cbt device 9d0c02e-cbt with filehandle 164675630

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.478 cpu2:14251)CBT: 1029: Created device 9d1402e-cbt for cbt driver with filehandle 164708398

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.543 cpu2:14251)CBT: 1029: Created device 9d60032-cbt for cbt driver with filehandle 165019698

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.548 cpu2:14251)CBT: 682: Disconnecting the cbt device 9d1402e-cbt with filehandle 164708398

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.553 cpu2:14251)CBT: 682: Disconnecting the cbt device 9d60032-cbt with filehandle 165019698

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.645 cpu3:14251)CBT: 1029: Created device 9d1c02e-cbt for cbt driver with filehandle 164741166

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.653 cpu3:14251)VSCSI: 3482: handle 8537(vscsi0:0):Using sync mode due to sparse disks

Jun 1 18:24:40 BPMVM2 vmkernel: 19:03:57:47.653 cpu3:14251)VSCSI: 3523: handle 8537(vscsi0:0):Creating Virtual Device for world 14252 (FSS handle 346980399)

Jun 1 18:24:47 BPMVM2 vmkernel: 19:03:57:54.568 cpu0:14252)NetPort: 1157: disabled port 0x100000c

Jun 1 18:24:47 BPMVM2 vmkernel: 19:03:57:54.570 cpu1:14251)VSCSI: 6495: handle 8537(vscsi0:0):Destroying Device for world 14252 (pendCom 0)

Jun 1 18:24:48 BPMVM2 vmkernel: 19:03:57:55.250 cpu1:14251)CBT: 682: Disconnecting the cbt device 9d1c02e-cbt with filehandle 164741166

Jun 1 18:24:49 BPMVM2 vmkernel: 19:03:57:55.933 cpu1:14251)CBT: 1029: Created device 5dc402a-cbt for cbt driver with filehandle 98320426

Jun 1 18:24:49 BPMVM2 vmkernel: 19:03:57:55.937 cpu1:14251)CBT: 682: Disconnecting the cbt device 5dc402a-cbt with filehandle 98320426

Jun 1 18:24:49 BPMVM2 vmkernel: 19:03:57:56.675 cpu1:14251)CBT: 1029: Created device b3ec029-cbt for cbt driver with filehandle 188661801

Jun 1 18:24:49 BPMVM2 vmkernel: 19:03:57:56.706 cpu1:14251)CBT: 1029: Created device 12a9002d-cbt for cbt driver with filehandle 313065517

Jun 1 18:24:49 BPMVM2 vmkernel: 19:03:57:56.710 cpu1:14251)CBT: 682: Disconnecting the cbt device 12a9002d-cbt with filehandle 313065517

Jun 1 18:24:49 BPMVM2 vmkernel: 19:03:57:56.714 cpu1:14251)CBT: 682: Disconnecting the cbt device b3ec029-cbt with filehandle 188661801

Jun 1 18:24:49 BPMVM2 vmkernel: 19:03:57:56.788 cpu1:14251)CBT: 1029: Created device 12a9802d-cbt for cbt driver with filehandle 313098285

Jun 1 18:24:51 BPMVM2 vmkernel: 19:03:57:58.487 cpu1:14251)CBT: 682: Disconnecting the cbt device 12a9802d-cbt with filehandle 313098285

Jun 1 18:24:52 BPMVM2 vmkernel: 19:03:57:58.992 cpu0:14252)NetPort: 982: enabled port 0x100000c with mac 00:0c:29:79:53:45

Jun 1 18:24:52 BPMVM2 vmkernel: 19:03:57:59.037 cpu1:14251)CBT: 1029: Created device 7c2802c-cbt for cbt driver with filehandle 130187308

Jun 1 18:24:52 BPMVM2 vmkernel: 19:03:57:59.094 cpu1:14251)VSCSI: 3523: handle 8538(vscsi0:0):Creating Virtual Device for world 14252 (FSS handle 164773934)

Jun 1 20:30:54 BPMVM2 vmkernel: 19:06:04:00.939 cpu3:4259)NetPort: 1157: disabled port 0x100000e

Jun 1 20:30:54 BPMVM2 vmkernel: 19:06:04:00.941 cpu1:4258)VSCSI: 6495: handle 8525(vscsi0:0):Destroying Device for world 4259 (pendCom 0)

Jun 1 20:30:54 BPMVM2 vmkernel: 19:06:04:01.809 cpu1:4258)CBT: 682: Disconnecting the cbt device 9f9c06d-cbt with filehandle 167362669

Jun 1 20:30:57 BPMVM2 vmkernel: 19:06:04:03.937 cpu3:4259)NetPort: 982: enabled port 0x100000e with mac 00:0c:29:c0:1b:27

Jun 1 20:30:57 BPMVM2 vmkernel: 19:06:04:04.102 cpu1:4258)CBT: 1029: Created device c28006f-cbt for cbt driver with filehandle 203948143

Jun 1 20:30:57 BPMVM2 vmkernel: 19:06:04:04.122 cpu3:4258)VSCSI: 3482: handle 8539(vscsi0:0):Using sync mode due to sparse disks

Jun 1 20:30:57 BPMVM2 vmkernel: 19:06:04:04.122 cpu3:4258)VSCSI: 3523: handle 8539(vscsi0:0):Creating Virtual Device for world 4259 (FSS handle 30310513)

Jun 2 02:43:07 BPMVM2 vmkernel: 19:12:16:13.625 cpu2:4259)NetPort: 1157: disabled port 0x100000e

Jun 2 02:43:07 BPMVM2 vmkernel: 19:12:16:13.628 cpu2:4258)VSCSI: 6495: handle 8539(vscsi0:0):Destroying Device for world 4259 (pendCom 0)

Jun 2 02:43:07 BPMVM2 vmkernel: 19:12:16:13.854 cpu3:4258)CBT: 682: Disconnecting the cbt device c28006f-cbt with filehandle 203948143

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:15.838 cpu3:4259)NetPort: 982: enabled port 0x100000e with mac 00:0c:29:c0:1b:27

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.080 cpu1:4258)CBT: 1029: Created device c79006e-cbt for cbt driver with filehandle 209256558

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.241 cpu1:4258)CBT: 1029: Created device d554072-cbt for cbt driver with filehandle 223690866

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.244 cpu1:4258)CBT: 682: Disconnecting the cbt device c79006e-cbt with filehandle 209256558

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.247 cpu1:4258)CBT: 682: Disconnecting the cbt device d554072-cbt with filehandle 223690866

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.277 cpu1:4258)CBT: 1029: Created device d55c072-cbt for cbt driver with filehandle 223723634

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.349 cpu1:4258)CBT: 1029: Created device 19cc076-cbt for cbt driver with filehandle 27050102

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.353 cpu1:4258)CBT: 682: Disconnecting the cbt device d55c072-cbt with filehandle 223723634

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.355 cpu1:4258)CBT: 682: Disconnecting the cbt device 19cc076-cbt with filehandle 27050102

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.434 cpu1:4258)CBT: 1029: Created device d564072-cbt for cbt driver with filehandle 223756402

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.439 cpu1:4258)VSCSI: 3482: handle 8540(vscsi0:0):Using sync mode due to sparse disks

Jun 2 02:43:09 BPMVM2 vmkernel: 19:12:16:16.439 cpu1:4258)VSCSI: 3523: handle 8540(vscsi0:0):Creating Virtual Device for world 4259 (FSS handle 152879219)

Jun 2 02:53:23 BPMVM2 vmkernel: 19:12:26:30.200 cpu2:4259)NetPort: 1157: disabled port 0x100000e

Jun 2 02:53:23 BPMVM2 vmkernel: 19:12:26:30.202 cpu1:4258)VSCSI: 6495: handle 8540(vscsi0:0):Destroying Device for world 4259 (pendCom 0)

Jun 2 02:53:23 BPMVM2 vmkernel: 19:12:26:30.293 cpu1:4258)CBT: 682: Disconnecting the cbt device d564072-cbt with filehandle 223756402

Jun 2 02:53:24 BPMVM2 vmkernel: 19:12:26:31.242 cpu1:4258)CBT: 1029: Created device c7a006e-cbt for cbt driver with filehandle 209322094

Jun 2 02:53:24 BPMVM2 vmkernel: 19:12:26:31.246 cpu1:4258)CBT: 682: Disconnecting the cbt device c7a006e-cbt with filehandle 209322094

Jun 2 02:53:26 BPMVM2 vmkernel: 19:12:26:33.419 cpu2:4259)NetPort: 982: enabled port 0x100000e with mac 00:0c:29:c0:1b:27

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.538 cpu1:4258)CBT: 1029: Created device c7d806e-cbt for cbt driver with filehandle 209551470

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.638 cpu1:4258)CBT: 1029: Created device d56c072-cbt for cbt driver with filehandle 223789170

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.641 cpu1:4258)CBT: 682: Disconnecting the cbt device c7d806e-cbt with filehandle 209551470

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.644 cpu1:4258)CBT: 682: Disconnecting the cbt device d56c072-cbt with filehandle 223789170

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.672 cpu1:4258)CBT: 1029: Created device d574072-cbt for cbt driver with filehandle 223821938

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.705 cpu1:4258)CBT: 1029: Created device 19e0076-cbt for cbt driver with filehandle 27132022

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.709 cpu1:4258)CBT: 682: Disconnecting the cbt device d574072-cbt with filehandle 223821938

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.712 cpu1:4258)CBT: 682: Disconnecting the cbt device 19e0076-cbt with filehandle 27132022

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.798 cpu1:4258)CBT: 1029: Created device d57c072-cbt for cbt driver with filehandle 223854706

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.868 cpu1:4258)VSCSI: 3482: handle 8541(vscsi0:0):Using sync mode due to sparse disks

Jun 2 02:53:27 BPMVM2 vmkernel: 19:12:26:33.868 cpu1:4258)VSCSI: 3523: handle 8541(vscsi0:0):Creating Virtual Device for world 4259 (FSS handle 152961139)

Jun 2 02:53:33 BPMVM2 vmkernel: 19:12:26:39.579 cpu1:4259)NetPort: 1157: disabled port 0x100000e

Jun 2 02:53:33 BPMVM2 vmkernel: 19:12:26:39.581 cpu2:4258)VSCSI: 6495: handle 8541(vscsi0:0):Destroying Device for world 4259 (pendCom 0)

Jun 2 02:53:33 BPMVM2 vmkernel: 19:12:26:40.425 cpu3:4258)CBT: 682: Disconnecting the cbt device d57c072-cbt with filehandle 223854706

Jun 2 02:53:34 BPMVM2 vmkernel: 19:12:26:40.985 cpu2:4258)CBT: 1029: Created device c7e406e-cbt for cbt driver with filehandle 209600622

Jun 2 02:53:34 BPMVM2 vmkernel: 19:12:26:40.990 cpu2:4258)CBT: 682: Disconnecting the cbt device c7e406e-cbt with filehandle 209600622

Jun 2 02:53:34 BPMVM2 vmkernel: 19:12:26:41.427 cpu3:4258)CBT: 1029: Created device a10006d-cbt for cbt driver with filehandle 168820845

Jun 2 02:53:35 BPMVM2 vmkernel: 19:12:26:41.550 cpu3:4258)CBT: 1029: Created device 1d0c071-cbt for cbt driver with filehandle 30457969

Jun 2 02:53:35 BPMVM2 vmkernel: 19:12:26:41.555 cpu3:4258)CBT: 682: Disconnecting the cbt device 1d0c071-cbt with filehandle 30457969

Jun 2 02:53:35 BPMVM2 vmkernel: 19:12:26:41.559 cpu3:4258)CBT: 682: Disconnecting the cbt device a10006d-cbt with filehandle 168820845

Jun 2 02:53:35 BPMVM2 vmkernel: 19:12:26:41.627 cpu3:4258)CBT: 1029: Created device 1d14071-cbt for cbt driver with filehandle 30490737

Jun 2 02:53:36 BPMVM2 vmkernel: 19:12:26:43.110 cpu3:4258)CBT: 682: Disconnecting the cbt device 1d14071-cbt with filehandle 30490737

Jun 2 02:53:36 BPMVM2 vmkernel: 19:12:26:43.452 cpu1:4259)NetPort: 982: enabled port 0x100000e with mac 00:0c:29:c0:1b:27

Jun 2 02:53:37 BPMVM2 vmkernel: 19:12:26:43.589 cpu2:4258)CBT: 1029: Created device 14cc070-cbt for cbt driver with filehandle 21807216

Jun 2 02:53:37 BPMVM2 vmkernel: 19:12:26:43.599 cpu2:4258)VSCSI: 3523: handle 8542(vscsi0:0):Creating Virtual Device for world 4259 (FSS handle 223887474)

Jun 2 08:25:07 BPMVM2 vmkernel: TSC: 0 cpu0:0)Init: 388: cpu 0: early measured tsc speed is 2833334167 Hz

Jun 2 08:25:07 BPMVM2 vmkernel: TSC: 12223 cpu0:0)Init: 389: vmkLoadEntry = $[0x190dacf0]

We have backups running inside the Windows VMs and none appear to have crashed, but even if they had, I wouldn't expect them to take down the whole host. We also replicate the VMs off site using vRanger 5.0; the last replication finished at around 02:54, which is the last entry shown above before I rebooted at 08:25. The next replication starts at about 05:00, which obviously didn't run.

We received emails up until about 04:00, which would indicate that the VMs were still working fine until then, so again it doesn't look like the replication caused the issue.

I basically had to hold in the power switch to get the server to restart; it wasn't accepting commands from the console, and I tried to connect with PuTTY but couldn't. The above log isn't giving me much of an indication as to the cause of the hang. Is there anywhere else I could look?
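For anyone wanting to check the timeline in a longer copy of the log, here is a rough Python sketch that lists the silent gaps between consecutive vmkernel entries and flags the ones that end in a boot. It is only a sketch under assumptions: the function names are made up for illustration, the year is a placeholder because the syslog prefix does not carry one, and treating a message that starts with "TSC:" as the first line of a boot is a guess based solely on the two lines logged at 08:25 above.

```python
import re
from datetime import datetime

# Month abbreviations as they appear in the syslog prefix ("Jun 2 02:53:37 ...").
MONTHS = {m: i + 1 for i, m in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])}

# Matches e.g. "Jun 2 02:53:37 BPMVM2 vmkernel: 19:12:26:43.599 cpu2:4258)VSCSI: ..."
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+\S+\s+vmkernel:\s+(.*)$")


def parse_line(line, year=2000):
    """Return (timestamp, message) for a vmkernel syslog line, or None.

    The year is NOT in the log; the default is an arbitrary placeholder.
    """
    m = LINE_RE.match(line.strip())
    if not m or m.group(1) not in MONTHS:
        return None
    mon, day, hh, mm, ss, msg = m.groups()
    ts = datetime(year, MONTHS[mon], int(day), int(hh), int(mm), int(ss))
    return ts, msg


def find_gaps(lines, min_gap_seconds=3600, year=2000):
    """List (last_seen, next_seen, gap_seconds, followed_by_boot) tuples.

    followed_by_boot is True when the entry after the gap starts with "TSC:",
    which is ASSUMED here to mark the start of a fresh boot.
    """
    gaps = []
    prev_ts = None
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue  # blank lines and anything that is not a vmkernel entry
        ts, msg = parsed
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((prev_ts, ts, delta, msg.startswith("TSC:")))
        prev_ts = ts
    return gaps
```

Calling find_gaps(open("vmkernel").readlines()) on a saved copy of the log (the file name is just an example) would list every quiet period of an hour or more. In the snippet above there are several such gaps between replication runs, so a gap by itself proves nothing; only the one between 02:53:37 and 08:25:07 ends in a boot, and all it tells you is that the vmkernel logged nothing after the last replication finished.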

1 Reply
Ted_O_
Contributor

I have a few hosts that do exactly what you are describing. The most recent occurrences were on an ESX 4.1 host with a slightly older build than yours. It's an IBM x3850. There are never any clues in the logs as to why it has happened. The VMs on the host remain accessible through remote desktop and such, but the host itself is completely hung. You cannot log in through the console or through SSH; the only thing you can do is ping it. The only way to get it back online has been to power it off and back on, which of course knocks all the VMs it's hosting offline...

If you come across a way to diagnose this issue or prevent it from happening, I would certainly be interested in finding out about it. I have logged an SR with VMware regarding this issue in the past, but never got a useful answer.

This time around I'm going to do a system update (drivers, firmware, BIOS, etc.), rebuild the OS and patch up to the latest build. It's the only thing I can think of, but from past experience even going through all this has not made the problem go away.
