Hi,
One of my ESXi 5.1 hosts is disconnected in vCenter, and I'm unable to reconnect it.
When restarting the services with services.sh, I get this error on several services:
Connect to localhost failed: Connection failure
Running vpxa restart
Connect to localhost failed: Connection failure
Running sfcbd-watchdog restart
Connect to localhost failed: Connection failure
~ # tail -f var/log/vmkernel.log
2013-04-04T06:36:23.424Z cpu13:1211854)WARNING: Tcpip: 1304: socreate(type=1, proto=6) failed with error No buffer space available (55)
2013-04-04T06:36:23.424Z cpu13:1211854)WARNING: Tcpip: 1304: socreate(type=1, proto=6) failed with error No buffer space available (55)
2013-04-04T06:36:26.003Z cpu18:1211111)WARNING: Tcpip: 1304: socreate(type=2, proto=0) failed with error No buffer space available (55)
2013-04-04T06:36:26.014Z cpu18:1211111)WARNING: Tcpip: 1304: socreate(type=2, proto=0) failed with error No buffer space available (55)
2013-04-04T06:39:18.185Z cpu23:1212818)WARNING: UserLinux: 1331: unsupported: (void)
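The socreate warnings above can be tallied to see which kinds of socket are failing. This is only a rough sketch, assuming a POSIX shell with awk (either the ESXi shell or a workstation holding a copy of the log). The function name count_socreate_failures is made up for illustration. In the usual BSD socket numbering, type=1 with proto=6 is a TCP stream socket and type=2 is a datagram socket, so seeing both would suggest the shortage is not limited to one protocol.

```shell
# Hypothetical helper: count "No buffer space available" failures
# in a vmkernel log, grouped by the type/proto pair passed to socreate.
count_socreate_failures() {
    awk '
        /socreate\(/ && /No buffer space available/ {
            # Keep only the text between the parentheses of socreate(...).
            s = $0
            sub(/.*socreate\(/, "", s)
            sub(/\).*/, "", s)
            count[s]++
        }
        END { for (k in count) print count[k], k }
    ' "$1" | sort
}
```

Run it as `count_socreate_failures /var/log/vmkernel.log`. Each output line is a count followed by the type/proto pair.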
~ # tail -f /var/log/vpxa.log
2013-04-04T06:42:53.764Z [73C59B90 verbose 'hostdcnx'] [VpxaHalCnxHostagent] Creating temporary connect spec: localhost:443
2013-04-04T06:42:53.765Z [73C17B90 error 'HttpConnectionPool-000000'] [ConnectComplete] Connect failed to <cs p:0db52800, TCP:localhost:443>; cnx: (null), error: N7Vmacore15SystemExceptionE(Connection reset by peer)
2013-04-04T06:42:53.765Z [73C59B90 error 'httphttpUtil'] [HttpUtil::ExecuteRequest] Error in sending request - Connection reset by peer
2013-04-04T06:42:53.765Z [73C59B90 error 'hostdcnx'] [VpxaHalCnxHostagent] Failed to discover version: vim.fault.HttpFault
2013-04-04T06:42:53.765Z [73C59B90 warning 'hostdcnx'] [VpxaHalCnxHostagent] Could not resolve version for authenticating to host agent
2013-04-04T06:43:13.767Z [73C38B90 verbose 'hostdcnx'] [VpxaHalCnxHostagent] Creating temporary connect spec: localhost:443
2013-04-04T06:43:13.768Z [FFB99B90 error 'HttpConnectionPool-000000'] [ConnectComplete] Connect failed to <cs p:0db60350, TCP:localhost:443>; cnx: (null), error: N7Vmacore15SystemExceptionE(Connection reset by peer)
2013-04-04T06:43:13.768Z [73C38B90 error 'httphttpUtil'] [HttpUtil::ExecuteRequest] Error in sending request - Connection reset by peer
The ESXi host is pingable by DNS name and by IP, also from other hosts.
Does anyone know how I can solve this problem without a reboot?
Regards,
Thijs
Check if your /scratch partition is full on the host. Example from my host showing 10MB of 4GB used:
Partition is not full:
~ # ls -ld /scratch
lrwxrwxrwx 1 root root 49 Mar 18 23:39 /scratch -> /vmfs/volumes/51249a32-49af51fd-5a45-6c3be5b19130
~ # df -h | grep /vmfs/volumes/51249a32-49af51fd-5a45-6c3be5b19130
vfat 4.0G 36.9M 4.0G 1% /vmfs/volumes/51249a32-49af51fd-5a45-6c3
This is also what I get:
~ # tail -f /var/log/vmkwarning.log
2013-04-05T01:09:10.932Z cpu16:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T01:09:26.314Z cpu18:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T01:09:41.807Z cpu23:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T01:10:58.807Z cpu17:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T01:13:56.896Z cpu20:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T01:15:45.410Z cpu20:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T01:18:56.099Z cpu19:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T04:02:49.241Z cpu14:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T04:18:29.443Z cpu18:14104)WARNING: CBT: 986: Unsupported ioctl 43
2013-04-05T06:32:02.619Z cpu13:1273528)WARNING: UserLinux: 1331: unsupported: (void)
Interesting. What hardware are you running ESXi 5.1 on? What is the hardware for this NIC? Hopefully you verified it is supported under ESXi 5.1: http://www.vmware.com/resources/compatibility/search.php
Also, run these commands to list your installed VIBs and get info on the NIC driver being used. This may help me determine if you're using an unsupported/outdated driver:
Check info on NICs and associated driver:
# esxcli network nic list
Example from one of our HP ProLiant servers, showing be2net driver being used. I've edited output for simplicity:
Command to list VIBs installed. Hopefully you're using a VMwareCertified driver that's compatible with your hardware + ESXi 5.1:
# esxcli software vib list
Example from one of our HP ProLiant servers, using a razor-thin custom ESXi 5.1 image:
~ # esxcli software vib list
Name Version Vendor Acceptance Level Install Date
----------------------------- ---------------------------------- --------------- ---------------- ------------
char-hpcru 5.0.3.09-1OEM.500.0.0.434156 Hewlett-Packard PartnerSupported -
hp-smx-provider 500.03.01.10.2-434156 Hewlett-Packard VMwareAccepted -
hpacucli 9.20-9.0 Hewlett-Packard PartnerSupported -
hpbootcfg 01-01.02 Hewlett-Packard PartnerSupported -
ehci-ehci-hcd 1.0-3vmw.510.0.0.799733 VMware VMwareCertified -
esx-base 5.1.0-0.9.914609 VMware VMwareCertified -
esx-dvfilter-generic-fastpath 5.1.0-0.0.799733 VMware VMwareCertified -
esx-xlibs 5.1.0-0.0.799733 VMware VMwareCertified -
ipmi-ipmi-devintf 39.1-4vmw.510.0.0.799733 VMware VMwareCertified -
ipmi-ipmi-msghandler 39.1-4vmw.510.0.0.799733 VMware VMwareCertified -
ipmi-ipmi-si-drv 39.1-4vmw.510.0.0.799733 VMware VMwareCertified -
misc-cnic-register 1.1-1vmw.510.0.0.799733 VMware VMwareCertified -
misc-drivers 5.1.0-0.0.799733 VMware VMwareCertified -
net-be2net 4.1.255.11-1vmw.510.0.0.799733 VMware VMwareCertified -
net-bnx2 2.0.15g.v50.11-7vmw.510.0.0.799733 VMware VMwareCertified -
net-nx-nic 4.0.558-3vmw.510.0.0.799733 VMware VMwareCertified -
scsi-hpsa 5.0.0-21vmw.510.0.0.799733 VMware VMwareCertified -
scsi-lpfc820 8.2.3.1-127vmw.510.0.0.799733 VMware VMwareCertified -
uhci-usb-uhci 1.0-3vmw.510.0.0.799733 VMware VMwareCertified -
vmware-fdm 5.1.0-947673 VMware VMwareCertified -
hpnmi 2.0.11-434156 hp PartnerSupported -
We are running HP ProLiant DL360 G7s with ESXi 5.1.0 build 1021289 (not a custom HP image).
~ # esxcli network nic list
Name PCI Device Driver Link Speed Duplex MAC Address MTU Description
------ ------------- ------ ---- ----- ------ ----------------- ---- --------------------------------------------------------------
vmnic0 0000:003:00.0 bnx2 Up 1000 Full 6c:3b:e5:b1:93:c8 9000 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
vmnic1 0000:003:00.1 bnx2 Up 1000 Full 6c:3b:e5:b1:93:ca 9000 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
vmnic2 0000:004:00.0 bnx2 Up 1000 Full 6c:3b:e5:b1:93:c0 1500 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
vmnic3 0000:004:00.1 bnx2 Up 1000 Full 6c:3b:e5:b1:93:c2 1500 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
vmnic4 0000:00b:00.0 e1000e Up 1000 Full e8:39:35:12:3f:95 1500 Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
vmnic5 0000:00b:00.1 e1000e Up 1000 Full e8:39:35:12:3f:94 1500 Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
vmnic6 0000:00c:00.0 e1000e Down 0 Half e8:39:35:12:3f:97 1500 Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
vmnic7 0000:00c:00.1 e1000e Down 0 Half e8:39:35:12:3f:96 1500 Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
~ # esxcli software vib list
Name Version Vendor Acceptance Level Install Date
----------------------------- ---------------------------------- ------ ---------------- ------------
ata-pata-amd 0.3.10-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ata-pata-atiixp 0.4.6-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ata-pata-cmd64x 0.2.5-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ata-pata-hpt3x2n 0.3.4-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ata-pata-pdc2027x 1.0-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ata-pata-serverworks 0.4.3-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ata-pata-sil680 0.4.8-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ata-pata-via 0.3.3-2vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
block-cciss 3.6.14-10vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ehci-ehci-hcd 1.0-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
esx-base 5.1.0-0.10.1021289 VMware VMwareCertified 2013-03-19
esx-dvfilter-generic-fastpath 5.1.0-0.0.799733 VMware VMwareCertified 2013-02-19
esx-tboot 5.1.0-0.0.799733 VMware VMwareCertified 2013-02-19
esx-xlibs 5.1.0-0.0.799733 VMware VMwareCertified 2013-02-19
esx-xserver 5.1.0-0.0.799733 VMware VMwareCertified 2013-02-19
ima-qla4xxx 2.01.31-1vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ipmi-ipmi-devintf 39.1-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ipmi-ipmi-msghandler 39.1-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ipmi-ipmi-si-drv 39.1-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
misc-cnic-register 1.1-1vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
misc-drivers 5.1.0-0.0.799733 VMware VMwareCertified 2013-02-19
net-be2net 4.1.255.11-1vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-bnx2 2.0.15g.v50.11-7vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-bnx2x 1.61.15.v50.3-1vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-cnic 1.10.2j.v50.7-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-e1000 8.0.3.1-2vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-e1000e 1.1.2-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-enic 1.4.2.15a-1vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-forcedeth 0.61-2vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-igb 2.1.11.1-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-ixgbe 3.7.13.6iov-10vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-nx-nic 4.0.558-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-r8168 8.013.00-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-r8169 6.011.00-2vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-s2io 2.1.4.13427-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-sky2 1.20-2vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-tg3 3.110h.v50.4-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
net-vmxnet3 1.1.3.0-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
ohci-usb-ohci 1.0-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
sata-ahci 3.0-13vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
sata-ata-piix 2.12-6vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
sata-sata-nv 3.5-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
sata-sata-promise 2.12-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
sata-sata-sil24 1.1-1vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
sata-sata-sil 2.3-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
sata-sata-svw 2.3-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-aacraid 1.1.5.1-9vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-adp94xx 1.0.8.12-6vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-aic79xx 3.1-5vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-bnx2i 1.9.1d.v50.1-5vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-fnic 1.5.0.3-1vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-hpsa 5.0.0-21vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-ips 7.12.05-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-lpfc820 8.2.3.1-127vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-megaraid-mbox 2.20.5.1-6vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-megaraid-sas 5.34-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-megaraid2 2.00.4-9vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-mpt2sas 10.00.00.00-5vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-mptsas 4.23.01.00-6vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-mptspi 4.23.01.00-6vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-qla2xxx 902.k1.1-9vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-qla4xxx 5.01.03.2-4vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
scsi-rste 2.0.2.0088-1vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
uhci-usb-uhci 1.0-3vmw.510.0.0.799733 VMware VMwareCertified 2013-02-19
vmware-fdm 5.1.0-947673 VMware VMwareCertified 2013-03-31
tools-light 5.1.0-0.9.914609 VMware VMwareCertified 2013-02-28
On the affected ESXi host I get this:
~ # esxcli network nic list
Connect to localhost failed: Connection failure
~ # esxcli software vib list
Connect to localhost failed: Connection failure
Okay so the drivers in use then are bnx2 and e1000e, and they are VMware Certified. And I'm not seeing a newer version of either available from VMware. Are you configuring any specific settings for either driver?
When did you upgrade to ESXi 5.1 build 1021289? Were you having the same issue on a prior build of ESXi 5.1?
This error is definitely of interest when running esxcli: "Connect to localhost failed: Connection failure"
This VMware KB came up in a couple different places related to that error, suggesting the hostd process has crashed or is hung:
http://kb.vmware.com/kb/1002849
In particular, I wonder if your hostd process needs to be restarted:
http://kb.vmware.com/kb/2030663
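For reference, restarting the management agents from the ESXi shell could look roughly like the sketch below. It assumes the standard init scripts /etc/init.d/hostd and /etc/init.d/vpxa. The function name restart_mgmt_agents and the INIT_DIR override are hypothetical and only there so the function can be tried off-host. hostd goes first because vpxa talks to hostd on localhost:443.

```shell
# Hypothetical wrapper: restart hostd, then vpxa, stopping at the first failure.
# INIT_DIR defaults to /etc/init.d; overriding it is only meant for dry runs.
restart_mgmt_agents() {
    init_dir=${INIT_DIR:-/etc/init.d}
    for svc in hostd vpxa; do
        "$init_dir/$svc" restart || {
            echo "restart of $svc failed" >&2
            return 1
        }
    done
}
```

Call `restart_mgmt_agents` from an SSH or console session on the host. If the host is out of socket buffers, the restart itself may fail in the same way as the other commands.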
I had the same issue, and others (purple screens), with previous versions of ESXi 5.1.0. I upgraded to the latest version on 16-03-2013.
I am not configuring any specific settings for the drivers.
I've tried several KBs, but nothing seems to work (only a reboot). I have already opened a support ticket with VMware.
~ # /etc/init.d/hostd start
hostd is running.
~ # /etc/init.d/vpxa status
vpxa is running
~ # tail -f var/log/hostd.log
2013-04-09T20:12:44.028Z [4C9A4B90 warning 'UserDirectory'] Group lookup failed for 'MANAGEMENT\ESX Admins'
2013-04-09T20:12:46.791Z [4C6EAB90 verbose 'Cimsvc'] Ticket issued for CIMOM version 1.0, user root
DJGetComputerDN: 0x80047: 0x995 - Unknown error
Stack Trace:
DJGetComputerDN: 0x80047: 0x251E - Unknown error
Stack Trace:
2013-04-09T20:13:00.099Z [4C68BB90 warning 'vim.PerformanceManager'] Basil read workload parameters for d4330b54-7587ac4a are out of range,oIO = 294178 ioSizeBytes = 4114
2013-04-09T20:13:00.099Z [4C68BB90 warning 'vim.PerformanceManager'] Basil write workload parameters for d4330b54-7587ac4a are out of range,oIO = 26770 ioSizeBytes = 157
2013-04-09T20:13:11.047Z [4C7CCB90 verbose 'DvsManager'] PersistAllDvsInfo called
~ # tail -f var/log/vpxa.log
2013-04-08T11:35:18.010Z [7B45EB90 warning 'Default'] Closing Response processing in unexpected state: 3
2013-04-08T11:35:18.024Z [7B45EB90 verbose 'commonvpxXml'] [VpxXml] Error fetching /definitions/import/@namespace from /sdk/vimService?wsdl: 503 (Service Unavailable)
2013-04-08T11:35:18.024Z [7B45EB90 warning 'Default'] Closing Response processing in unexpected state: 3
2013-04-08T11:35:18.024Z [7B45EB90 warning 'hostdcnx'] [VpxaHalCnxHostagent] Could not resolve version for authenticating to host agent
2013-04-08T11:35:38.024Z [FFD8BB90 verbose 'hostdcnx'] [VpxaHalCnxHostagent] Creating temporary connect spec: localhost:443
2013-04-08T11:35:38.039Z [FFD8BB90 verbose 'commonvpxXml'] [VpxXml] Error fetching /sdk/vimServiceVersions.xml: 503 (Service Unavailable)
2013-04-08T11:35:38.039Z [FFD8BB90 warning 'Default'] Closing Response processing in unexpected state: 3
2013-04-08T11:35:38.054Z [FFD8BB90 verbose 'commonvpxXml'] [VpxXml] Error fetching /definitions/import/@namespace from /sdk/vimService?wsdl: 503 (Service Unavailable)
2013-04-08T11:35:38.054Z [FFD8BB90 warning 'Default'] Closing Response processing in unexpected state: 3
2013-04-08T11:35:38.054Z [FFD8BB90 warning 'hostdcnx'] [VpxaHalCnxHostagent] Could not resolve version for authenticating to host agent
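To see how long vpxa has been failing to reach hostd, the two failure messages seen in vpxa.log in this thread ("Connection reset by peer" and "503 (Service Unavailable)") can be counted along with the first and last timestamp. A minimal sketch, assuming a POSIX shell with awk; the function name hostd_failure_window is made up.

```shell
# Hypothetical helper: print "<count> <first timestamp> <last timestamp>"
# for vpxa.log lines that show hostd refusing or resetting connections.
hostd_failure_window() {
    awk '
        /Connection reset by peer/ || /503 \(Service Unavailable\)/ {
            if (n == 0) first = $1   # timestamp is the first field
            last = $1
            n++
        }
        END { if (n > 0) print n, first, last; else print 0 }
    ' "$1"
}
```

Run it as `hostd_failure_window /var/log/vpxa.log`. A wide gap between the two timestamps means the agent has been unreachable for a while, not just during the last restart attempt.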
What SPP set are you running on your HP DL360 G7?
I don't know. I installed ESXi out of the box.
ESXi host crashed today with a PSOD.... host is now working normally and reconnected in vCenter.
Bittersweet I suppose. It's working for now, until it happens again.
I would check the firmware and BIOS configuration on the server to ensure it's supported for ESXi 5.1.
Checking HP's support matrix, your hardware is supported under ESXi 5.1:
http://h18004.www1.hp.com/products/servers/vmware/supportmatrix/hpvmware.html
However, in terms of firmware (the SPP firmware set), HP started supporting ESXi 5.1 as of SPP 2012.08.00:
http://ftp.hp.com/pub/softlib2/software1/doc/p1822529277/v81463/SPP2012.08.0rev1ReleaseNotes.pdf
You could take the easy route and just apply the latest SPP set to the server (SPP 2013.02.0):
Here's all the drivers/firmware HP supports for DL360 G7 with ESXi 5.1. I don't personally use HP selected drivers over VMware Certified drivers, but firmware and BIOS are important to keep in sync:
I would also run diagnostics via HP SmartStart 8.70 to see if any hardware tests fail. We generally run all tests (except the HDD tests, because they are very time consuming) for 3 to 4 days to ensure the hardware holds up without error before deploying ESXi on it.
Today I updated one ESXi host with SPP 2013.02.0. Hopefully this resolves the issues.
I don't understand what would cause this in the first place, but here is the article from VMware.
Link
Sorry, I guess VMware doesn't like links to their own site being posted here. I guess you will have to go hunting.
Hint: 2056181 is the Article ID in the Knowledge Base
Hi,
Do you happen to use Veeam Backup and Replication and/or Veeam One?
We are using these Veeam products and suspect they are the cause of this issue.
@ThijsW
As I see, you are getting events >>>failed with error No buffer space available (55)<<<<. This could mean that a few applications have failed to close their TCP sockets. Have you tried rebooting your host?
Also check this VMware KB: ESXi 5.1 host is disconnected in vCenter Server and reports the error:No buffer space ava...
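One way to look for sockets that are not being closed, on a host that is still responding, is to tally TCP connections by state. This is a rough sketch that reads saved output of `esxcli network ip connection list` on stdin. The function name count_tcp_states is hypothetical, and the state being the sixth whitespace-separated field is an assumption about the column layout, so check it against your own output and set STATE_COL if it differs.

```shell
# Hypothetical helper: tally TCP connections by state from connection-list
# output on stdin. STATE_COL (default 6) is an assumed column position.
count_tcp_states() {
    awk -v col="${STATE_COL:-6}" '
        $1 ~ /^tcp/ { states[$col]++ }   # skip headers and non-TCP rows
        END { for (s in states) print states[s], s }
    ' | sort -rn
}
```

For example: `esxcli network ip connection list | count_tcp_states`. A count for one state, such as CLOSE_WAIT, that keeps growing over time would point toward an application that is not closing its connections.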
Rajeev
I strongly believe the application in our case is Veeam One and/or Veeam Backup and Replication.
Now to the question: why would this cause ESXi to become unstable? Does it mean an ESXi 5.1 host is easily subject to DoS attacks?
Hi,
All issues are gone after using HP Service Pack for ProLiant (SPP).
Regards,
Thijs
Hi Thijs,
Do you happen to use any Veeam product on this setup?
Yes, Veeam One monitor and Backup and Replication