VMware Cloud Community
g_feiner
Contributor

connectivity to service console lost

hi community,

I have a strange problem with one of my blades in an IBM BladeCenter. Over the last few days I noticed several times that VI complained about the blade having trouble with the HA agent. I then reconfigured for HA and everything was fine. Today, however, I have no connectivity to the blade's service console.

The blade is equipped with 4 NICs; two of them, vmnic0 and vmnic3, are connected to the service console and the VMkernel. I suspected a hardware problem or a switch misconfiguration, but after checking both I still have the same trouble. The ESX server is not responding to pings and cannot ping the outside. I do, however, see ARP requests when running tcpdump on vswif0!?

The strange thing is that I can ping the VMkernel interface, which resides on the very same two NICs as vswif0 does ...

I even reinstalled the ESX host. Apparently it continues to use the old configuration, including a second vSwitch I created for the old installation. Where does it get that information from? There has to be some connectivity left, but I can't find it?!

so, to summarize:

- service console not responding

- vmkernel responding

- config retained after complete reinstallation of esx-server on the blade

- arp requests in tcpdump, but nothing else

- cannot reach the system from the outside and cannot ping from the inside

Does anybody have a clue what's going on here? Any help would be appreciated.

regards

gerd


Accepted Solutions
wpatton
Expert

When you do a re-install, be sure you choose Install and not Upgrade, as the Upgrade option retains the existing settings.

If none of the other blades in that enclosure are having any issues, I would start looking at the blade itself. It could be a hardware issue with its connection to the backplane, etc.

*Disclaimer: VMware Employee* If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

7 Replies
g_feiner
Contributor

Before someone asks: the other blades are on the same subnet and are working just fine, so I ruled out layer 3 issues as such.

regards

gerd

g_feiner
Contributor

I did choose Install. The strange thing is that the VMkernel interface resides on the exact same hardware, and it is responding. So I think I can rule out hardware issues; I even swapped the blade to a different position, with the same result. The config stays, and the service console, as opposed to the VMkernel, is unresponsive.

regards

gerd

g_feiner
Contributor

I found my error ... you should never, even inadvertently, install a new ESX host into the datastore ... That deleted all of my virtual machines and shredded the whole thing ... Good thing it is just a testbed at the moment. But it will go into production, and this raises the question of how one would recover from such a disaster. Is there a reasonable way of backing up the whole datastore?

regards

gerd

wpatton
Expert

That is ALWAYS of utmost importance when installing a new ESX host or re-installing an existing one. We never install a host with any LUNs presented to it except the boot LUN. If any other LUNs are detected, the admin is required to immediately power off the system and revisit the storage configuration. Always take any steps you can to remove the "human factor".
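
To make the "no LUNs except the boot LUN" rule harder to skip, it could be written down as a small go/no-go check. What follows is only a minimal sketch in Python, assuming the admin supplies the LUN identifiers seen during the pre-install storage check by hand. The function names and the example labels are made up for illustration; nothing here queries ESX or the SAN.

```python
from typing import Iterable, List, Optional


def unexpected_luns(detected: Iterable[str],
                    boot_lun: Optional[str] = None) -> List[str]:
    """Return every detected LUN that is not the designated boot LUN.

    Identifiers are plain strings typed in by the admin (hypothetical
    labels); this function does not talk to any storage system.
    """
    return sorted({lun for lun in detected if lun != boot_lun})


def safe_to_install(detected: Iterable[str],
                    boot_lun: Optional[str] = None) -> bool:
    """True only when no LUN other than the boot LUN is visible."""
    return not unexpected_luns(detected, boot_lun)


if __name__ == "__main__":
    # Example input, invented for illustration.
    visible = ["boot-lun", "vmfs-datastore-1"]
    extra = unexpected_luns(visible, boot_lun="boot-lun")
    if extra:
        print("STOP: unexpected LUNs visible:", ", ".join(extra))
    else:
        print("OK to install")
```

For a host that boots from local disk, leave `boot_lun` unset; any visible LUN is then treated as a reason to stop.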

I would encourage you to look at VCB (VMware Consolidated Backup).

Also, depending on your storage, you may have snapshotting abilities as well. This is a good lesson to learn early: plan for a disaster, because you never know why or when it is coming.

g_feiner
Contributor

Thanks for the hint. I think I learned my lesson :) We do indeed have a DS4700 with snapshots licensed, so I will have a look at that, as well as VCB.

What's the catch with the "helpful" or "correct" answers, anyway?

regards, gerd

wpatton
Expert

The points are just for the rankings on these forums. They really serve little purpose in real life, but they virtually increase my ego and make me feel better about myself! :D

Glad you are on your way, and it sounds like you have a good setup going. Keep reading up on the docs and stay on the forums here to help others avoid the same mistakes!

Cheers.
