VMware Cloud Community
durrie
Contributor

ESX server will not add back into Virtual Center?

I have an ESX server which was showing up as (disconnected) in VC. I removed it from VC, logged in via SSH (PuTTY), killed the hostd process and restarted the management service (mgmt-vmware). When I attempted to add the ESX server back into VC I got a totally unhelpful error message, "A general system error occurred"...see the attached file for the message received.

The strange thing is that this ESX host is now not manageable via a direct VIC connection or via a web browser. Connecting directly to the host with the VIC lets me in, but it is very slow, will not list the VMs on the host, and does not return any data under any of the sections of the Configuration tab.

The VMs are running and working as normal...I can RDP to them and manage them but I cannot manage the ESX host in any way nor reconnect it to the VC.

Any tips or pointers for log files I can check to help diagnose this host would be greatly appreciated...?

12 Replies
Troy_Clavell
Immortal

Have you tried stopping the vpx agent?

service vmware-vpxa stop

Once stopped, try to add back to vCenter.

ShantanuVCP
Contributor

I have a feeling that the ESX box has stopped listening on ports 902/903. These are the ports on which the VI Client and the Web Interface connect. Please log in to the Service Console and check the list of open ports. You can use the command esxcfg-firewall -q to check the list of enabled services and then restart the VMware Management Interface (/etc/init.d/httpd.vmware restart).
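
An open firewall rule only shows that a port is permitted, not that a daemon is actually listening on it. A minimal sketch for checking the listening side, assuming `netstat` is available in the Service Console and prints its usual column layout (`listening_on` is a made-up helper name, not a VMware tool):

```shell
#!/bin/sh
# Reads `netstat -tln` style output on stdin and prints the lines that are in
# LISTEN state on the given TCP port. Assumes the local address is column 4
# and the state is column 6, as in standard net-tools output.
listening_on() {
  awk -v p="$1" '$6 == "LISTEN" {
    n = split($4, a, ":")      # the port is the last ":"-separated piece
    if (a[n] == p) print
  }'
}

# Possible usage on the host:
#   netstat -tln | listening_on 902
#   netstat -tln | listening_on 443
```

If nothing is printed for a port, nothing is listening there, whatever the firewall rules say.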

durrie
Contributor

Troy...tried to re-add the host while vpxa was in a stopped state...no joy. I also restarted this service as part of the initial troubleshooting.

ShantanuVCP...not exactly sure what I'm looking for in the output of the command supplied...here is the information returned...what does this actually tell me?

Thanks in advance...

497K   50M valid-source-address-udp  udp  --  *      *       0.0.0.0/0            0.0.0.0/0
  826 39788 valid-source-address  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp flags:0x16/0x02
4613  292K icmp-in    icmp --  *      *       0.0.0.0/0            0.0.0.0/0
180K   22M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0          state RELATED,ESTABLISHED
    9   432 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:902 state NEW
   16   768 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:80 state NEW
  255 12344 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:443 state NEW
1114  370K ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp spts:67:68 dpts:67:68
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp dpt:427
   24  1152 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:427 state NEW
  120  5760 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:5989 state NEW
2804  218K ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp dpt:161
   18   868 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:22 state NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:2301
   24  1152 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:2381

Chain FORWARD (policy DROP 0 packets, 0 bytes)
pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy DROP 0 packets, 0 bytes)
pkts bytes target     prot opt in     out     source               destination
8354K 9692M ACCEPT     all  --  *      lo      0.0.0.0/0            0.0.0.0/0
80011   22M valid-tcp-flags  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0
4613  292K icmp-out   icmp --  *      *       0.0.0.0/0            0.0.0.0/0
   29  2033 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp spts:1024:65535 dpt:53
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp spts:1024:65535 dpt:53
186K   31M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0          state RELATED,ESTABLISHED
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:902 state NEW
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp spts:67:68 dpts:67:68
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp spt:427
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp spt:427 state NEW
  934 70984 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp dpt:123
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:443 state NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:902 state NEW
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp dpt:162
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0          udp dpt:902 state NEW
   13   780 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:27000 state NEW
   13   780 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:27010 state NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp dpt:280
  389 28791 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0          reject-with icmp-port-unreachable

Chain icmp-in (1 references)
pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0          icmp type 0
4613  292K ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0          icmp type 8
    0     0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0          icmp type 3 code 4
    0     0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain icmp-out (1 references)
pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0          icmp type 8
4613  292K ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0          icmp type 0
    0     0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain log-and-drop (7 references)
pkts bytes target     prot opt in     out     source               destination
    0     0 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0          LOG flags 6 level 7
    0     0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain valid-source-address (2 references)
pkts bytes target     prot opt in     out     source               destination
    0     0 DROP       all  --  *      *       127.0.0.1            0.0.0.0/0
    0     0 DROP       all  --  *      *       0.0.0.0/8            0.0.0.0/0
    0     0 DROP       all  --  *      *       0.0.0.0/0            255.255.255.255

Chain valid-source-address-udp (1 references)
pkts bytes target     prot opt in     out     source               destination
    0     0 DROP       all  --  *      *       127.0.0.1            0.0.0.0/0
   83 45746 DROP       all  --  *      *       0.0.0.0/8            0.0.0.0/0

Chain valid-tcp-flags (2 references)
pkts bytes target     prot opt in     out     source               destination
    0     0 log-and-drop  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp flags:0x3F/0x00
    0     0 log-and-drop  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp flags:0x11/0x01
    0     0 log-and-drop  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp flags:0x18/0x08
    0     0 log-and-drop  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp flags:0x30/0x20
    0     0 log-and-drop  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp flags:0x03/0x03
    0     0 log-and-drop  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp flags:0x06/0x06
    0     0 log-and-drop  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0          tcp flags:0x05/0x05


Incoming and outgoing ports blocked by default.
Enabled services: CIMSLP ntpClient VCB CIMHttpsServer snmpd vpxHeartbeats LicenseClient sshServer

Opened ports:
        hp-sim              : port 2301 tcp.in
        hpim                : port 2381 tcp.in
        sim-cert            : port 280 tcp.out
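
Reading a dump like this by eye is tedious. As a rough sketch (the function name `accept_hits` is made up for illustration), the ACCEPT rules for a single destination port and their packet counters can be pulled out with awk. It assumes the column layout shown above and does not distinguish between the INPUT and OUTPUT chains:

```shell
#!/bin/sh
# Reads `esxcfg-firewall -q` / `iptables -L -n -v` style output on stdin and
# prints "<packets> <port>" for every ACCEPT rule whose destination port
# matches the argument. Column 1 is the packet counter, column 3 the target.
accept_hits() {
  awk -v port="$1" '$3 == "ACCEPT" {
    for (i = 4; i <= NF; i++)
      if ($i == ("dpt:" port)) print $1, port
  }'
}

# Possible usage on the host:
#   esxcfg-firewall -q | accept_hits 902
#   esxcfg-firewall -q | accept_hits 443
```

In the output above, the incoming rules for ports 443 and 902 are ACCEPT with non-zero packet counts, which suggests the firewall is letting those connections through and is probably not what is blocking the VI Client.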

Troy_Clavell
Immortal

When did this disconnect happen? What version of vCenter are you licensed for? Are there any disk space issues on the ESX host that will not add back to inventory?

bulletprooffool
Champion

First port of call here would be network connectivity / DNS / hosts files.

I realise that you can connect (with limited functionality) at the moment, but this does not mean that your networking is functioning 100% - you may be losing packets, routing incorrectly or similar.

I assume that all of these are valid and have been tested?

I am also guessing that you have tried connecting the host by IP address rather than hostname?

The next thing to consider is that a network change has happened somewhere and is affecting you (NAT etc.) - is it possible to get a machine on the same subnet as your ESX host and then test connectivity to the host (i.e. take networking / gateways etc. out of the equation)?

Is your storage properly visible to your ESX host - have you had any recent storage changes that may affect the host?

One day I will virtualise myself . . .

durrie
Contributor

Guys…sorry for the essay…but I do appreciate your time if and when you may have time to spare and read my reply…?

Troy – Licensing shouldn’t be an issue, this host is one of 9 hosts running our Virtual Server Infrastructure.

All the others are running fine and are set up in the same manner, using the same license server. The problem box is, however, not picking up the add-ins that our other servers show: vCenter agent for ESX Server, VMotion and VMware Consolidated Backup are not showing up under the licensing config tab. I just assumed these are missing because the box is not connecting to the VC?

Disk space also seems fine…

[root@vh1 root]# df -h
Filesystem          Size   Used  Avail  Use%  Mounted on
/dev/cciss/c0d0p2   4.9G   1.5G   3.2G   32%  /
/dev/cciss/c0d0p1    99M    26M    68M   28%  /boot
none                391M      0   391M    0%  /dev/shm
/dev/cciss/c0d0p6   2.0G  1006M   856M   55%  /var/log

This feels more and more like a network issue to me?

Bulletprooffool – yes I tried connecting the box into VC by IP and hostname…all I get is that totally unhelpful “general system error” error message!

On your second suggestion…ALL our ESX hosts in our VM server cluster are in the same subnet. Only this particular box is playing up! She is an old girl (ProLiant DL585 G1) but we have 4 other physically identical G1 boxes in this cluster working fine?

I have noticed that the VC server address is NOT in this problem box’s hosts file. In fact, because we have never used HA or VMotion, none of our hosts have anything other than their own address set in their hosts files…strange that it is not affecting any other boxes, but I am going to add the VC IP and FQDN to the problem box’s hosts file and see what happens…?
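
A quick way to confirm whether a name is already present before editing the file, as a minimal sketch: `in_hosts_file` is a made-up helper and `vc01.example.local` is a placeholder for the real VC FQDN, which is not given in this thread.

```shell
#!/bin/sh
# Returns 0 if the given name appears as a hostname or alias in a
# hosts-format file, 1 otherwise. Comment lines and trailing "# ..."
# comments are ignored.
in_hosts_file() {
  awk -v name="$1" '
    /^[[:space:]]*#/ { next }            # skip whole-line comments
    {
      for (i = 2; i <= NF; i++) {        # field 1 is the IP address
        if ($i ~ /^#/) break             # stop at a trailing comment
        if ($i == name) found = 1
      }
    }
    END { exit !found }
  ' "$2"
}

# Possible usage (placeholder name):
#   in_hosts_file vc01.example.local /etc/hosts && echo present || echo missing
```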

On the storage point…hmm…I’m afraid this is a headache for us at present!

We have an EVA which is being difficult, in that not even HP themselves can figure out why our ESX hosts auto-configure their paths to route overwhelmingly through only one of the two Fibre Channel controllers on the EVA. To make matters worse, they all set their preferred path to the same one of the 4 ports on each FC controller, which means that when something like McAfee tries to push out an update our EVA grinds to a halt with maxed out I/O…!!!

Disk presentation does however seem fine on this box…despite the same port and controller preferred path issue!

Disk vmhba1:0:14 /dev/sdb (460800MB) has 8 paths and policy of Fixed
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9a vmhba1:0:14 On active preferred
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c98 vmhba1:1:14 On
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9e vmhba1:2:14 On
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9c vmhba1:3:14 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9b vmhba2:0:14 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c99 vmhba2:1:14 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9f vmhba2:2:14 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9d vmhba2:3:14 On

Disk vmhba1:0:10 /dev/sda (512000MB) has 8 paths and policy of Fixed
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9a vmhba1:0:10 On active preferred
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c98 vmhba1:1:10 On
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9e vmhba1:2:10 On
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9c vmhba1:3:10 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9b vmhba2:0:10 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c99 vmhba2:1:10 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9f vmhba2:2:10 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9d vmhba2:3:10 On

Disk vmhba3:0:0 /dev/cciss/c0d0 (140006MB) has 1 paths and policy of Fixed
 Local 6:4.0 vmhba3:0:0 On active preferred

RAID Controller (SCSI-3) vmhba1:1:0 (0MB) has 8 paths and policy of Fixed
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c98 vmhba1:1:0 On active preferred
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9e vmhba1:2:0 On
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9c vmhba1:3:0 On
 FC 3:7.0 10000000c96459bd<->50001fe1500d4c9a vmhba1:0:0 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9b vmhba2:0:0 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c99 vmhba2:1:0 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9f vmhba2:2:0 On
 FC 3:8.0 10000000c9673ed6<->50001fe1500d4c9d vmhba2:3:0 On

This looks more and more like an ESX box reboot...although I hate doing this because it doesn't teach me anything or help us find possible problems with the system!

bulletprooffool
Champion

I agree - I also like to solve a problem before rebooting / rebuilding - for future reference etc.

If you are having storage issues though, I'd definitely not exclude this from my troubleshooting.

Could you try removing the faulty storage from this host and use different storage to ascertain whether this has any effect on the behaviour of the ESX host?

durrie
Contributor

UPDATE:...for those interested...

I opened an SR with VMware and they said I need to update my ESX 3.5 to Update 5.

I have a Windows 2008 VM on this host and apparently unless your ESX is 3.5 Update 5 it will have problems loading those VMs into its inventory which is why the direct VIC connection to the Host does not load the VMs into view either...and because the host cannot load its own VM inventory the host cannot be loaded into VC.

I have to unregister the Win2008 box from the host...update my ESX server from Update 4 to Update 5, then load the Win2008 box back into inventory, at which point the host should load back into VC...

Watch this space...I will report the results here once we have a maintenance window agreed by our "change board"...probably Thursday coming...

bulletprooffool
Champion

Thanks for posting back.

durrie
Contributor

UPDATE: I can confirm that ESX 3.5 Update 5 resolved this issue for me.

1.) Shut down all VMs on the host

2.) Unregister all VMs (and templates if you have any) that are Windows 2008 on the ESX host

3.) Update the host to ESX 3.5 Update 5

4.) Re-register all the VMs (and templates if you have any) that are Windows 2008 on the ESX host

5.) Re-add the ESX host back to VC
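
The unregister and re-register steps above can also be done from the Service Console. A minimal dry-run sketch, assuming the `vmware-cmd -s unregister` and `vmware-cmd -s register` forms from ESX 3.x; `plan_vm_cmds` is a made-up helper name and the .vmx path in the comment is a placeholder. It only prints the commands, so they can be reviewed before anything is run:

```shell
#!/bin/sh
# Dry run: prints (does not execute) one vmware-cmd call per .vmx path.
# Usage: plan_vm_cmds unregister|register /path/to/a.vmx [/path/to/b.vmx ...]
plan_vm_cmds() {
  action="$1"
  shift
  case "$action" in
    unregister|register) ;;
    *) echo "usage: plan_vm_cmds unregister|register <vmx>..." >&2; return 2 ;;
  esac
  for vmx in "$@"; do
    printf 'vmware-cmd -s %s "%s"\n' "$action" "$vmx"
  done
}

# Example with a placeholder path:
#   plan_vm_cmds unregister /vmfs/volumes/datastore1/w2k8/w2k8.vmx
```

`vmware-cmd -l` should list the .vmx paths currently registered on the host, which is a convenient source for the arguments; keep the list so the same paths can be passed to the register step after the update.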

bulletprooffool
Champion

Please mark the thread as answered, as this helps people who search for solutions in future to identify threads that actually have answers to the questions asked. Thanks again for posting back with your result.

a_p_
Leadership

Marked as "Assumed Answered"

André
