I recently upgraded two ESXi boxes that had been running 5.5 forever to version 6. The upgrades ran fine with no issues. However, since the upgrade the boxes go to disconnected in vSphere and are unreachable on the network. I can go to the servers and log in on the console when this happens. Restarting the agents doesn't seem to fix it, and the box eventually locks up. The only way to get it back is a hard reboot.
Hi Mad Scotsman,
Silly question, but have you upgraded your vCenter to 6 as well?
Do you have any logs from around the disconnection time?
Have you tested with a clean build?
Thanks.
- Keiran.
My Dell C6100s are locking up. Same issue. The system appears to be OK when hitting the console. I attempt to reboot it and nothing works. I have to hard reset it with the DRAC.
Note: I believe some were upgrades from 5.5 to 6.0 and some were fresh installations.
I had to fresh install vCenter Server 6.0. The upgrade did not work for me.
Yes, I upgraded everything to 6. I found another thread on this, so I removed both nodes from vCenter, uninstalled the vpxa agent, and re-added them. Waiting to see if that does anything.
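For anyone else trying this, the agent restart part can be done from the host's console or an SSH session. A minimal sketch using the standard ESXi init scripts (restart only; the vpxa removal itself happened through vCenter when I disconnected and removed the hosts):

```shell
# Sketch: restart the ESXi management agents from the host shell.
# The hostd/vpxa init scripts exist only on ESXi, so guard the calls
# for anyone pasting this somewhere else.
for svc in /etc/init.d/hostd /etc/init.d/vpxa; do
  if [ -x "$svc" ]; then
    "$svc" restart
  else
    echo "skipping $svc (not an ESXi host)"
  fi
done
```

On the host itself, `services.sh restart` restarts the whole set of management agents in one go.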
No, I haven't done a clean install. I have thought about it and will maybe see how it goes this week. Looking at edmandsj's Dell issue, it sounds like he has done both.
Which logs would you like to look at? I will try to gather them at the next lock-up.
Not sure. I gathered vobd.log, vmkernel.log, and vpxa.log.
My logs indicate total loss of network connectivity, with lots of intermittent connectivity right around the event.
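If grabbing individual logs gets hit and miss, the standard way to collect everything at once is a support bundle; a sketch (the command exists only on the ESXi host):

```shell
# Sketch: collect a full ESXi support bundle (includes vobd, vmkernel,
# vpxa and friends). vm-support is an ESXi-only command, so guard it.
if command -v vm-support >/dev/null 2>&1; then
  vm-support   # prints the path of the generated bundle when done
else
  echo "run 'vm-support' on the ESXi host to collect all logs at once"
fi
```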
Just to take a second and compare networking setups. I am using:
Intel 82576 and 82571EB NICs (2 onboard and 2 in expansion), both on the HCL for 6.0
2 x software iSCSI HBAs (1 from onboard and 1 from expansion), with iSCSI port binding of course. 1500 MTU on the vmkernel ports and the switch. All adapters active, notify switches on, failback on, link status only for failover detection, route based on originating virtual port ID, promiscuous mode rejected.
2 x VM network traffic + management network (1 from onboard and 1 from expansion). Same policy: all adapters active, notify switches on, failback on, link status only, route based on originating virtual port ID, promiscuous mode rejected.
Connected to Netgear ProSafe stackable switches.
No vMotion enabled.
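Since driver/firmware differences could matter as much as the NIC models, it may be worth comparing those too. A sketch of the usual esxcli queries (vmnic0 is just an example name; substitute your own uplinks):

```shell
# Sketch: show the NIC inventory plus driver/firmware details on ESXi.
# esxcli exists only on the host, so guard it for copy-paste safety.
if command -v esxcli >/dev/null 2>&1; then
  esxcli network nic list            # uplinks, link state, driver in use
  esxcli network nic get -n vmnic0   # per-NIC driver and firmware versions
else
  echo "run these esxcli commands on the ESXi host itself"
fi
```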
edmandsj,
So my servers aren't Dell, but the symptoms are the same: total loss of network. I have 2 onboard Intel 82574L NICs; they are shared for iSCSI, the VM network, etc.
Mine probably aren't; they are older Tyan servers that have been running for 4 years, so they have had a few upgrades (memory, disks) over time, but never a problem until 6.
I pinged a consultant friend, and he has heard that build 2809209 has a major bug in the netdev process causing issues with NICs and HBAs. A co-worker of his has stayed on the release build 2494585 without any issues.
I have wiped my servers and rolled back to that build. I will let you know if this works; it usually takes 1-2 days before I have to reboot.
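To confirm which build a host actually ends up on after a rollback, the version commands are the easiest check; a sketch:

```shell
# Sketch: print the ESXi version and build number (e.g. "build-2494585").
# Guarded so it degrades gracefully off the host.
if command -v vmware >/dev/null 2>&1; then
  vmware -vl
else
  echo "run 'vmware -vl' (or 'esxcli system version get') on the host"
fi
```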
It didn't take long at all: a couple of hours later I had a node go down. Maybe it's time to go back to VMware 5.5 or Hyper-V :smileyshocked:
Hi there,
Can you please upload vmkernel.log and vmkwarning.log from /var/log? If you could also point out the time that the host stopped responding so that we can correlate, we could gain some more insight into the issue.
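To make the correlation easier, it can help to trim the log to a window around the event before uploading. A sketch with made-up sample lines (the timestamps and messages here are hypothetical placeholders, not from the affected hosts):

```shell
# Sketch: pull vmkernel.log entries from a window around the failure.
# The lines below are made-up placeholders standing in for the real
# /var/log/vmkernel.log on the affected host.
cat > /tmp/vmkernel.log <<'EOF'
2015-04-20T14:30:12.001Z cpu2:33100)igb: vmnic0: link is up
2015-04-20T14:32:05.417Z cpu0:32786)WARNING: igb: vmnic0: link is down
2015-04-20T14:33:41.902Z cpu1:32790)iscsi_vmk: iSCSI connection lost
EOF
# Keep only the minutes around the event (14:32-14:33 in this example):
grep '2015-04-20T14:3[23]' /tmp/vmkernel.log
```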
Closing this question and giving up. Maybe by Update 2 or 3, ESXi 6 will be good.