VMware Cloud Community
iliketurtles
Contributor

Unable to Enable SSH on ESXi 5.5

I need to stop a service on a host to troubleshoot a problem.

This is the issue: http://h20565.www2.hp.com/hpsc/doc/public/display?sp4ts.oid=5177949&docId=emr_na-c03624914&docLocale...

So I want to stop the AMS service with /etc/init.d/hp-ams.sh stop

Please correct me if I am wrong, but I believe I need to PuTTY/SSH into the host to do that. My problem is that I am unable to enable SSH on the host.

I have used iLO to open the remote console, but when I change Enable SSH it will not save. When I go back to Troubleshooting Mode Options, ESXi Shell has changed back to Disabled. I also tried to make the change on the host using vSphere, with no success.

I am fairly new at my current company, so maybe I am missing something that is blocking this change? I know Group Policy would not affect this; maybe Update Manager is keeping me from making this change? I am not sure where else to look.


Accepted Solutions
BostonTechGuy
Enthusiast

iliketurtles (and everyone on this thread),

I am in the middle of this very problem as we speak, and I have made some progress. I am sharing my experience in the hope that it will help.

Starting off like a drug commercial or daytime lawyer ad:

  • Do you have HP Gen8 gear in your datacenter (blades in my case)?
  • Unable to boot new VMs, or existing VMs from a powered-off state?
  • Can't vMotion VMs to another host?
  • Have you tried to enable SSH only to find it won't work?
  • Hi, my name is attorney BLAH BLAH..

OK, OK, I am trying for comic relief because this issue sucks.

Here is what happened to me:

Read the VMware KB: ESXi host cannot initiate vMotion or enable services and reports the error: Heap globalCa...

My issue was exactly this. I got HEAP errors, but this wasn't a HEAP problem. Once again the Communities came to my rescue and pointed me to HP-AMS. Then I ran into the exact same issue as you with SSH: it wouldn't turn on on the host no matter what I did.


In the KB there is a brief mention of the error message CANT FORK. Here is what you need to do.

  • Log into the iLO of the host and look at the console of the ESXi host, aka the yellow and gray page.
  • Press ALT-F1. This will get you into the ESXi Shell, which used to be known as Tech Support Mode.
  • If all you see are the words CANT FORK over and over, the host is stuck (see the attached image, cant fork.PNG).

I ran into this issue on all the hosts in the cluster that was giving me trouble. No matter what I did, I could not migrate a single VM to any other host so that I could reboot. So I had to schedule an outage with the company for at least one host. After contacting the business owners and getting change approval, I logged into each guest and shut it down gracefully. Once all the VMs on the host were down, I rebooted the host. I watched the complete shutdown and boot from the iLO console on the HP gear.

Once the host came back online, I was able to enable SSH, log in and proceed to the next steps.
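If you would rather enable SSH from the local ESXi Shell (the ALT-F1 console) than from the DCUI or the vSphere Client, a minimal sketch follows. It assumes the shell is responsive again after the reboot. The two vim-cmd calls are standard ESXi commands; DRY_RUN, run and enable_ssh are illustrative helper names, not part of ESXi.

```shell
#!/bin/sh
# Sketch only: enable and start SSH from the local ESXi Shell.
# DRY_RUN and run() are illustrative helpers, not ESXi features.
DRY_RUN="${DRY_RUN:-1}"   # default is to print the commands, not run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

enable_ssh() {
    run vim-cmd hostsvc/enable_ssh   # mark the SSH service as enabled
    run vim-cmd hostsvc/start_ssh    # start the SSH service now
}
```

Calling enable_ssh as written only prints the two commands. Set DRY_RUN=0 before calling it to actually run them on the host.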

It was at this point that I could start using the commands to disable HP-AMS. The workaround in the KB says to remove it completely. I did that on the first host and it worked perfectly. On that host I was able to boot new VMs and reboot existing VMs. I still couldn't migrate VMs off the host, though. I found out later that this had to do with HP-AMS on the other hosts. However, I was able to migrate VMs from the other broken hosts to the host with the workaround. vMotion was back, if only one way.
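For reference, the stop-and-remove step can be sketched as below. The init script path comes from the first post in this thread. The VIB name hp-ams is an assumption, so confirm it with esxcli software vib list on your own host before removing anything. DRY_RUN, run and remove_hp_ams are illustrative helper names.

```shell
#!/bin/sh
# Sketch only: stop the HP-AMS agent and remove its VIB over SSH.
# DRY_RUN and run() are illustrative helpers, not ESXi features.
DRY_RUN="${DRY_RUN:-1}"   # default is to print the commands, not run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

remove_hp_ams() {
    # Stop the running agent (script path as quoted in this thread).
    run /etc/init.d/hp-ams.sh stop
    # Remove the VIB. The name "hp-ams" is an assumption; check the
    # output of "esxcli software vib list" first.
    run esxcli software vib remove -n hp-ams
    # A host reboot is still needed afterwards; it is left as a manual step.
}
```

With the default DRY_RUN=1 the function only prints what it would do, which is a cheap sanity check before touching a production host.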

So I vMotioned all the VMs off a second host and then rebooted that host, as it had the CANT FORK problem. Once it had rebooted, I again enabled SSH and removed HP-AMS. All went perfectly. I am in the process of removing HP-AMS from all the hosts. Now that vMotion works on the servers from which HP-AMS has been removed, I can at least do this in the middle of the day and not worry about outages.

It's the next steps I am not certain of. My company is in the process of truly moving to a Software-Defined Data Center. At least I hope so, because it's my design. Anyway, I have been told I need HP-AMS for certain products to work, like the latest version of HP Insight. So far I have not installed the newer version of HP-AMS that is mentioned in the KB. Has anyone had any luck installing it? Or should I just rebuild the entire ESXi host with the latest HP ESXi image? Now that vMotion is semi-working again, I can at least move VMs off the host, pull it out of the cluster, wipe it, install the latest HP ESXi image with the updated HP-AMS, and then put it back in. Wash, rinse, repeat. I don't mind rebuilding an ESXi host; the process takes very little time.

Hopefully this helps you with the same problem I am having.
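For anyone weighing the same choice, here is a rough sketch of checking which HP-AMS VIB is installed and installing a newer offline bundle. It assumes the VIB name contains hp-ams, and the bundle path in the usage note is a placeholder. filter_ams, install_bundle, DRY_RUN and run are illustrative helper names.

```shell
#!/bin/sh
# Sketch only: inspect the installed HP-AMS VIB and install a newer bundle.
# DRY_RUN and run() are illustrative helpers, not ESXi features.
DRY_RUN="${DRY_RUN:-1}"   # default is to print the commands, not run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Reads "esxcli software vib list" output on stdin and keeps only the
# lines that mention hp-ams (case-insensitive). The name is an assumption.
filter_ams() {
    grep -i 'hp-ams'
}

# Installs an offline bundle; $1 is the full path to the .zip file.
install_bundle() {
    run esxcli software vib install -d "$1"
}
```

On the host this would be used as esxcli software vib list | filter_ams, followed by install_bundle with the full datastore path to whatever bundle you downloaded.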

Thanks,

Boston Tech Guy

8 Replies
Wscholz
Enthusiast

How did you enable SSH?

Please check your Firewall and Service Settings for SSH according to:

Enable SSH on VMware ESXi 5.5 via vSphere Client | Thomas Maurer

------------------------------------------------------------------------- If you found my answer useful please consider marking it as "correct" or "helpful". Thanks a lot
iliketurtles
Contributor

Looks like my Host has more of an issue than just trying to enable SSH.

As I stated, I can't enable SSH. I tried to enable it from the host as well as from vSphere (using the Thomas Maurer link as a guide).

I tried to migrate the VMs to another host, and it failed with a timeout error. I also can't view any logs: I try to select any of the logs but nothing happens. I assume some kind of buffer or log space is full. I believe a reboot will fix the problem, which isn't terrible on this host, but I have the same problem on another host with 30+ VMs, some of them production.

I believe the root cause is the hp-ams issue. Is there any way to fix this without having to reboot the host?

Wscholz
Enthusiast

Can you stop the service manually?

The problem description in the KB article is that a log file fills up quickly, so unless this log file fills your whole disk, I doubt that your problems with the host are related to this. Did vMotion work before?

What do you mean when you say that you cannot view log files? If you log in to the shell and go to /var/log, can you see and open log files like hostd.log or vmkwarning.log?
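If the shell is reachable, a quick way to check those logs is to pull out the most recent lines that match a keyword. The sketch below assumes a POSIX shell with grep and tail available; recent_matches is a hypothetical helper name, and "heap" is only an example search term.

```shell
#!/bin/sh
# Sketch only: show the newest lines in a log file that match a pattern.
# recent_matches is a hypothetical helper name.
recent_matches() {
    logfile="$1"        # e.g. /var/log/vmkwarning.log
    pattern="$2"        # e.g. heap
    count="${3:-20}"    # how many matching lines to keep (default 20)
    # -i makes the match case-insensitive; tail keeps the newest matches.
    grep -i -- "$pattern" "$logfile" | tail -n "$count"
}
```

For example, recent_matches /var/log/vmkwarning.log heap 10 would print up to the last ten lines mentioning heap.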

iliketurtles
Contributor

"Can you stop the service manually?" According to the documentation I have seen, I can only do this using SSH/PuTTY, and that is the main obstacle to turning off the service.

"The problem description in the KB article is that a log file fills up quickly, so unless this log file fills your whole disk, I doubt that your problems with the host are related to this. Did vMotion work before?" Yes, it worked before.

"What do you mean when you say that you cannot view log files? If you log in to the shell and go to /var/log, can you see and open log files like hostd.log or vmkwarning.log?" Apologies, I gave you bad information. I am able to view the logs using the DCUI.

Wscholz
Enthusiast

And did you see any error messages?

Alistar
Expert

The only thing that comes to my mind to properly solve this is to schedule a downtime with the app owners, shut down the VMs, reboot the host, immediately migrate the VMs to another host in the cluster, and reinstall a .vib that is not affected by this bug.

Good luck!

Stop by my blog if you'd like 🙂 I dabble in vSphere troubleshooting, PowerCLI scripting and NetApp storage - and I share my journeys at http://vmxp.wordpress.com/
iliketurtles
Contributor

Thanks all for your input on this issue!

Boston Tech: I appreciate the comprehensive info; it is very relevant to me.

Looks like a reboot will have to happen; there is no way around it.
