VMware Cloud Community
ttrulis
Contributor

ESX host disconnected, unable to add to VC

It began with me being unable to power on a VM I had created, due to "Insufficient resources to satisfy configured failover level for HA", which is fine because that is how I set up HA. I tried to reconfigure HA for the host by right-clicking the host and selecting that option. It failed at about 30% with "An error occurred during the configuration of the HA Agent on the host", and the networking page disappeared from the configuration tab, similar to what a lot of people reported (I think) after upgrading from a previous version. That is not the case here; I haven't upgraded anything.

I then restarted the management service with service mgmt-vmware restart, and the ESX host disconnected from VC altogether. I cannot connect to the host in question with the VI client, but I am connected via the service console. All the VMs and the host are pingable and everything is running. I disabled HA and DRS and tried to add the ESX host again, which didn't work.

Any help is appreciated. Thank you.


Accepted Solutions
avlieshout
VMware Employee

I found this KB article, which discusses the same symptoms you describe.

Check if your esx.conf file is corrupted. Use the KB article guidelines to restore or recreate it.
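
Before restoring or recreating the file, it can help to confirm what state it is actually in, since a missing or zero-byte file is a different situation from a subtly damaged one. A minimal sketch, assuming the file is at the usual /etc/vmware/esx.conf location; conf_status is a hypothetical helper name, not a VMware tool:

```shell
#!/bin/sh
# conf_status FILE: print "missing", "empty", or "present (N bytes)".
# Hypothetical helper for illustration; it only inspects the file and changes nothing.
conf_status() {
    f="$1"
    if [ ! -e "$f" ]; then
        echo "missing"
    elif [ ! -s "$f" ]; then
        echo "empty"
    else
        # wc -c counts bytes; tr strips the padding some wc versions add.
        echo "present ($(wc -c < "$f" | tr -d ' ') bytes)"
    fi
}

conf_status /etc/vmware/esx.conf
```

A "present" result only says the file is non-empty. It does not prove the contents are valid, so the checks in the KB article still apply.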

Arnim van Lieshout
Blogging: http://www.van-lieshout.com
Twitter: http://www.twitter.com/avlieshout
20 Replies
anujmodi1
Hot Shot

Are you able to connect to the ESX host directly with the VI Client? What error message do you get when connecting with the VI Client?

Anuj Modi
Blogs: http://communities.vmware.com/blogs/amodi, anujmodi.wordpress.com
ttrulis
Contributor

Connecting to ESX through VI client fails: "Details: A connection failure occurred."

I can connect via service console.

anujmodi1
Hot Shot

Check whether this KB article helps:

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1006156&sl...

anujmodi1
Hot Shot

One more article that may help: http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1003561&sl...

dickybird
Enthusiast

Check the vpx logs on the ESX server in /var/log/vmware/vpx.

Also, did you try using the FQDN when adding the ESX server to VC?

In version 3.0.2, installing the VC agents sometimes fails. Try the following:

  1. Log in to the ESX server via an SSH client as the root user

  2. cd /tmp

  3. mkdir vmware-root

  4. Try reconnecting the host to VirtualCenter

Also try the following. At the service console, issue:

  1. service mgmt-vmware restart

  2. service vmware-vpxa restart

ttrulis
Contributor

@anujmodi1

The first article you linked refers to a licensing issue. I checked the hostd logs and do not see exactly the same thing, although there is a backtrace:

[2009-07-30 16:03:41.242 'HostsvcPlugin' 3076440992 info] Resource pool configuration: /etc/vmware/hostd/pools.xml
[2009-07-30 16:03:41.246 'App' 3076440992 panic] error: FileIO error: No space left : /etc/vmware/hostd/pools.xml.tmp
[2009-07-30 16:03:41.246 'App' 3076440992 panic] backtrace:
eip 0x12c6a8e
eip 0x1184769
eip 0x1135c15
eip 0x12c7bdc
eip 0x12c8a93
eip 0x1154483
eip 0x11545fd
eip 0x1158fe4
eip 0x1159b2f
eip 0x115e920
eip 0x1150b66
eip 0x520d2f4
eip 0x5209e3d
eip 0x520ef46
eip 0x520c546
eip 0x520cef4
eip 0x52109f4
eip 0x55bbd88
eip 0x55bc03b
eip 0x55ce937
eip 0x113fe62
eip 0x113bd66
eip 0x562cec7
eip 0x563bd72
eip 0x56445ee
eip 0x139e79a
eip 0x50fc3d1

The second link you provided times out for me.
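
The panic above is a write failure ("No space left") on /etc/vmware/hostd/pools.xml.tmp, which usually means the filesystem holding /etc is full. A minimal sketch for checking this from the service console, assuming the standard df, awk, and du utilities are available; full_filesystems is a hypothetical helper, and the 90% threshold and the /var/log path are arbitrary choices for illustration:

```shell
#!/bin/sh
# full_filesystems [THRESHOLD]: list mount points whose usage is at or above
# THRESHOLD percent (default 90). Hypothetical helper for illustration.
full_filesystems() {
    threshold="${1:-90}"
    # df -P prints one line per filesystem; column 5 is capacity, column 6 the mount point.
    df -P | awk -v t="$threshold" '
        NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= t) print $6 " " $5 }
    '
}

full_filesystems 90

# Show the five largest entries under /var/log (sizes in KB), a common place
# for growth. The path is an assumption; look wherever df reports as full.
du -sk /var/log/* 2>/dev/null | sort -n | tail -5
```

If no filesystem is near 100%, df -i shows inode usage, since running out of inodes can produce the same error.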

ttrulis
Contributor

I tried service vmware-vpxa restart and it did not work.

I am using the FQDN when adding the ESX host (I would have facepalmed if that had been it).

vpx logs:

Last stats polling used ms
Creating temporary connect spec: localhost:443
Failed to discover namespace: Connection refused
Could not resolve namespace for authenticating to host agent

I'm not sure what the above suggests, but from searching it looks like a DNS resolution issue. The hosts file in /etc/ has the correct IPs and FQDNs. I can also ping the ESX host from the VC server.

I'm on ESX 3.5, so I skipped creating vmware-root.

Thanks for the suggestions.
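
The "Connection refused" on localhost:443 in the log above suggests that nothing on the host was accepting connections on that port when vpxa tried, which may point at the host agent not running rather than at name resolution. A minimal sketch for checking this, assuming netstat is available on the service console; port_listening is a hypothetical helper:

```shell
#!/bin/sh
# port_listening PORT: read `netstat -an` output on stdin and succeed if a
# TCP socket is in LISTEN state on PORT. Hypothetical helper for illustration.
port_listening() {
    awk -v p="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" && $4 ~ (":" p "$") { found = 1 }
        END { exit !found }
    '
}

if netstat -an 2>/dev/null | port_listening 443; then
    echo "something is listening on 443"
else
    echo "nothing is listening on 443"
fi
```

If nothing is listening, the hostd log is the next place to look for the reason the agent stopped.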

dickybird
Enthusiast

You can try putting the ESX host's IP in the lmhosts file on the VC server.

Verify that the VC server is pointing to the right license file.

What does nslookup resolve to for both the ESX and VC servers?

ttrulis
Contributor

I don't think that will help as our VC server is able to resolve both ESX servers.

Nothing has changed with the license file. nslookup resolves fine.

ttrulis
Contributor

This morning I restarted mgmt-vmware and, for some reason, the host reconnected. However, the networking page under the configuration tab for the ESX host is still blank. Is this a corrupt .xml file? How can it be fixed?

dickybird
Enthusiast

A blank networking page means it did not recognize your NICs.

Please check /var/log/messages or the host.log files on the ESX server for possible error messages.

avlieshout
VMware Employee

Try reinstalling the vpxa and aam agents.

  1. Right-click your ESX host in VC and select Disconnect

  2. Connect to your ESX host using an SSH client as root

  3. Run the command "rpm -qa | grep vpxa" to list the vpxa agent

  4. Run "rpm -e <output from last command>" to remove vpxa

  5. Run the command "rpm -qa | grep aam" to list the aam agents

  6. Remove both aam agents using "rpm -e <output from last command>"

  7. Go back to VC, right-click your host and select Connect. This will reinstall the vpxa and aam agents.
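
The removal steps above can be previewed before anything is uninstalled. A minimal sketch that reads a package list and prints the rpm -e commands it would run instead of executing them; agent_removal_commands is a hypothetical helper, and matching on the substrings vpxa and aam simply mirrors the grep patterns in the steps:

```shell
#!/bin/sh
# agent_removal_commands: read package names on stdin (one per line, as printed
# by `rpm -qa`) and print an `rpm -e` command for each vpxa or aam package.
# Dry run only: nothing is removed. Hypothetical helper for illustration.
agent_removal_commands() {
    grep -E 'vpxa|aam' | while read -r pkg; do
        echo "rpm -e $pkg"
    done
}

rpm -qa 2>/dev/null | agent_removal_commands
```

Review the printed commands, then run them by hand. Disconnecting the host in VC (step 1) still comes first.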

ttrulis
Contributor

I have followed your instructions and the networking page under config tab is still blank. This error appears:

"Cannot access a disposed object. Object name: 'ListViewEx'"

ttrulis
Contributor

I didn't see anything helpful in /var/log/messages, but the host.log files had this:

2009-07-31 09:01:01 (21163) ERROR: Could not copy '/boot/initrd-vmnix.img' to '/tmp/vmware.0.tmp': No such file or directory
2009-07-31 09:01:01 (21163) ERROR: Could not write out new initrds.

The thread at http://communities.vmware.com/thread/83138 claims "The esx.conf file was corrupted somehow and lost all the configuration settings."

I don't think I have backups of these config files. Is there any way to get them back, or is this a different but similar issue?

avlieshout
VMware Employee

That error sounds like a VirtualCenter error to me. Try removing the affected host from VirtualCenter and re-adding it afterwards.

dickybird
Enthusiast

It's a .NET error. Did you check by connecting to the ESX host directly with the VI Client?

VC corruption can easily be fixed by running the VC installer with the repair option.

Try removing the ESX host one more time and adding it again.

ttrulis
Contributor

To remove it from VC I would need it in maintenance mode, and therefore would need to VMotion the VMs off. I cannot do this, as it does not see a physical network assigned for VMotion. I attempted to connect directly to the ESX host and the results are the same.

Another concern is that all VMs are currently reachable via ping and RDP, as is the ESX host. Should I just recreate the VMotion network? Why is that network gone and everything else okay?

Stranger still, esxcfg-vswif -l shows nothing listed. I think the service console should be there.

esxcfg-vswitch -l shows all the switches and ports and what is dedicated to what. I could recreate everything from what is shown here, but this is a production box and I don't want to risk interrupting service.
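
Before recreating anything on a production host, it may be worth capturing the current state so there is a record to compare against. A minimal sketch, assuming the esxcfg-nics, esxcfg-vswitch, esxcfg-vswif, and esxcfg-vmknic tools each accept -l to list their configuration; snapshot_network_config and the output directory are hypothetical:

```shell
#!/bin/sh
# snapshot_network_config DIR: save the output of the esxcfg-* listing commands
# (and a copy of esx.conf, if present) into DIR. Read-only with respect to the
# host configuration. Hypothetical helper for illustration.
snapshot_network_config() {
    outdir="$1"
    mkdir -p "$outdir" || return 1
    for tool in esxcfg-nics esxcfg-vswitch esxcfg-vswif esxcfg-vmknic; do
        if command -v "$tool" >/dev/null 2>&1; then
            "$tool" -l > "$outdir/$tool.txt" 2>&1
        else
            echo "$tool not found" > "$outdir/$tool.txt"
        fi
    done
    # Keep a copy of the current esx.conf alongside the listings.
    [ -f /etc/vmware/esx.conf ] && cp -p /etc/vmware/esx.conf "$outdir/esx.conf"
    return 0
}

# The destination is an assumption; point it at a filesystem with free space.
snapshot_network_config "${TMPDIR:-/tmp}/netcfg-snapshot"
```

The saved listings are small text files, but on a host that is short on disk space the destination should be chosen with care.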

avlieshout
VMware Employee

To remove a live host from a cluster, first disconnect the host. Once it is disconnected, you can remove it.

BUT, since you have the same problem with the VI Client connected directly to the ESX host, removing and re-adding will not help you, as the problem is probably within the ESX host itself.

I do not know your networking setup. Recreating things could interrupt network service for the running VMs, so be careful and make sure you know 100% what you are doing.

Perhaps it is best to schedule some downtime and reboot the ESX host after you shut down all guests properly. Maybe this simply resolves your problem.

You could try to create a new vSwitch, see if that succeeds, and check whether the new vSwitch shows up in the networking tab.

If your VMotion network setup is separate from your VM network, you could then try to recreate a VMkernel port, assign a physical NIC to the new setup, and see if you can reach the network through the VMkernel port by using vmkping from the console.

If you succeed you can use vmotion again and evacuate the running vms.
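
The sequence suggested above (new vSwitch, VMkernel port, physical NIC, then vmkping) could look roughly like the following on the service console. This is a dry-run sketch that only prints the commands. The vSwitch name, port group name, NIC, and addresses are all placeholders, and the esxcfg-vswitch and esxcfg-vmknic options shown are my understanding of the ESX 3.x tools, so confirm them against each tool's built-in help before running anything; print_vmotion_plan is a hypothetical helper:

```shell
#!/bin/sh
# print_vmotion_plan VSWITCH PORTGROUP VMNIC IP NETMASK PEER_IP
# Prints (does not run) the commands to build a separate VMotion network:
# create the vSwitch, add a port group, link an uplink NIC, add a VMkernel
# NIC on the port group, then test reachability with vmkping.
print_vmotion_plan() {
    vswitch="$1"; portgroup="$2"; vmnic="$3"; ip="$4"; netmask="$5"; peer="$6"
    echo "esxcfg-vswitch -a $vswitch"
    echo "esxcfg-vswitch -A \"$portgroup\" $vswitch"
    echo "esxcfg-vswitch -L $vmnic $vswitch"
    echo "esxcfg-vmknic -a -i $ip -n $netmask \"$portgroup\""
    echo "vmkping $peer"
}

# Placeholder values only; substitute an unused NIC and your own addressing.
print_vmotion_plan vSwitch9 VMotion-test vmnic3 192.0.2.10 255.255.255.0 192.0.2.11
```

Printing the plan first makes it easy to check that the NIC chosen is not already an uplink for a vSwitch carrying VM traffic.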

One last stupid question. Have you tried the refresh button on the networking tab?

Good luck. Let me know how things work out.
