VMware Cloud Community
TiBoReR
Enthusiast

Upgrade from ESX 3.0.2 to 3.5 - Kernel Panic

The server is a HP ML350G5.

HP Insight agent 7.91 is installed on ESX 3.0.2, with all patches applied through Jan 2, 2008.

I applied the upgrade files with the "esxupdate update" command.

Everything was going well until I received a kernel panic: VFS: Unable to mount root on fs 00:00. Since then, I have not been able to boot the ESX host.

I was doing the upgrade over SSH and forgot to put the host in maintenance mode.

Any ideas about what happened and what I can do to get my VMs back online?

Thanks!

TiBoReR
Enthusiast

I'm trying to reinstall (with the upgrade option selected) using the 3.5 install CD.

Will let you know...

TiBoReR
Enthusiast

Exact same error...

see screenshot attached...

J-D
Enthusiast

Have you tried booting into maintenance mode from the GRUB bootloader? I wonder whether there was disk corruption even before you tried to upgrade. If the debug mode (or whatever it is called) doesn't let you see the contents of e.g. /etc, then your disk is probably corrupted.

I think you'll have to do a new install...

TiBoReR
Enthusiast

Yeah, already tried that but same error.

I opened a SR and we reinstalled 3.5 keeping VMFS and VMs.

All is okay now, but I have to go through the configuration of the ESX host again.

I will wait a couple of hours and then reinstall HP Agent 7.91 to see whether it is what caused the issue.

Will let you know if I find the source of the problem.

J-D
Enthusiast

Just to be clear, by reinstall I meant installing 3.5 (and not upgrading).

So a reinstall with the upgrade option didn't work, but a fresh install that kept the VMFS and VMs intact did work?

That would mean that a real reinstall recreated your local /, /var/log and /boot. An upgrade would reuse those mount points but is likely to fail if there was disk corruption.

I am betting on disk corruption, not an upgrade issue...and I truly hope the HP management software doesn't have anything to do with the upgrade...many of us are using that version.

Please keep us updated about what you'll consider the root cause.

WillFulmer
Enthusiast

Same issue here. What was your resolution? A clean install from the 3.5 media as well as reconfiguring the host? Did this happen only on one box or all of your ESX servers? What model server are you using?

TiBoReR
Enthusiast

Yeah, install (not upgrade) from the boot CD keeping the VMFS and reconfiguring the host...

I only tried it on one of my ESX hosts.

Server HP ML350G5.

WillFulmer
Enthusiast

I'm using an HP BL460c. Same issue... Is it a hardware or firmware issue, or will I have to do clean installs on all blades?

TiBoReR
Enthusiast

I don't know. The HP Firmware Maintenance CD 7.91 had been applied, and HP Agents 7.91 were installed in the service console (SC).

After I did the install from the CD, I reinstalled HP Agents 7.91 and no problem so far.

wila
Immortal

You said you can't boot into debug mode, so that rules out the standard troubleshooting steps.

However, when I look at your screenshots, it appears that something is wrong with the GRUB setup.

If you have a Linux live CD, it would be interesting to see why the system doesn't recognize the UUID as root.

The other day I had this myself when I cloned the original IDE disk that holds ESX to a CD disk in my lab. As a result, the GRUB setup on the MBR was flawed.

In my case I managed to get it resolved as follows:

After a ghost clone to another disk, my ESX host in the lab came up with the much appreciated "GRUB" message at the boot loader prompt, without an option to do anything. It turns out the GRUB record in the MBR was damaged. You cannot rerun the upgrade ISO CD image to fix this, as it will not refresh that part unless you reinstall instead of upgrading.

So I found myself a Red Hat 9 install CD and performed the following:

Boot into rescue mode

linux rescue

Click through the defaults until the final screen, where it tells you that it will not mount the drive because it didn't find any Linux installation. You are then given the prompt that you need: click "Continue" and you get a Linux prompt.

Now suppose your boot disk is /dev/hda, the boot partition is /dev/hda1 and the root partition is /dev/hda2 (you can check this yourself with fdisk -l).

mkdir /mnt/sysimage

mount /dev/hda2 /mnt/sysimage

chroot /mnt/sysimage

Ok, so now we are back in our system, but we still don't have access to our boot partition and the grub installer will need that.

mount /dev/hda1 /boot

Check that the boot folder isn't empty anymore

ls -lh /boot

Now if everything went well you can put back the MBR record by issuing the following command.

grub-install /dev/hda

If it doesn't complain, you can now reboot the host and boot again from your normal storage.
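For reference, the rescue steps above can be collected into one small script. This is only a sketch under assumptions: the device names are the example ones from this post, the `run` and `repair_grub` helpers and the DRY_RUN switch are made up for illustration, and /boot is mounted under /mnt/sysimage before the chroot rather than after it, which should be equivalent. By default it only prints the commands, so you can review them first.

```shell
#!/bin/sh
# Sketch of the GRUB repair sequence, meant to be run from a rescue shell.
# DISK, BOOT_PART and ROOT_PART are assumptions; confirm them with `fdisk -l`.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.

DISK="${DISK:-/dev/hda}"
BOOT_PART="${BOOT_PART:-${DISK}1}"
ROOT_PART="${ROOT_PART:-${DISK}2}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repair_grub() {
    run mkdir -p /mnt/sysimage &&
    run mount "$ROOT_PART" /mnt/sysimage &&
    # Mount /boot inside the installed system so grub-install can see it.
    run mount "$BOOT_PART" /mnt/sysimage/boot &&
    # Rewrite the MBR using the grub-install of the installed system.
    run chroot /mnt/sysimage grub-install "$DISK"
}

repair_grub
```

Set DRY_RUN=0 only after the printed commands match your disk layout.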

PS: grub-install stores its drive mapping in /boot/grub/device.map; if you have issues, check with cat that it is correct.

PPS: I also tried this with Knoppix and a Linux System Rescue CD; the latter didn't work, as it was looking for the zsh shell.

| Author of Vimalin. The virtual machine Backup app for VMware Fusion, VMware Workstation and Player |
| More info at vimalin.com | Twitter @wilva

wila
Immortal

Additionally check that the UUID in grub.conf is indeed the correct one.

The command to show you the UUIDs of the attached disks is: blkid
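A rough way to do that comparison from a live CD is sketched below. It assumes the kernel line in grub.conf carries a root=UUID=... argument and that blkid prints UUID="..." pairs; the function names and the default path are made up for illustration.

```shell
#!/bin/sh
# Print every root=UUID=... value found on kernel lines of a GRUB config.
# The default path is an assumption; pass the real file as $1.
grub_root_uuids() {
    conf="${1:-/boot/grub/grub.conf}"
    sed -n 's/^[[:space:]]*kernel[[:space:]].*root=UUID=\([^[:space:]]*\).*/\1/p' "$conf"
}

# Succeed if UUID $1 appears in blkid-style output read from stdin.
uuid_is_known() {
    grep -q "UUID=\"$1\""
}

# Example (not run here), with the ESX boot partition mounted at /mnt/boot:
#   for u in $(grub_root_uuids /mnt/boot/grub/grub.conf); do
#       blkid | uuid_is_known "$u" || echo "no partition has UUID $u"
#   done
```

If a UUID from grub.conf matches no partition, that would explain why the kernel cannot find its root filesystem.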

--

Wil

J-D
Enthusiast

I didn't know the "blkid" command, so I tried it out, but I don't think it exists on ESX 3.0.2. Is it probably only on "normal" Linux distributions, or on 3.5?

wila
Immortal

Ah, I'm sorry, I actually knew that blkid isn't in ESX 3.0.x, and it appears that it didn't make it into 3.5 either.

Without that command I'm not sure how you would find this out easily (I do know a workaround, as I had to code an alternative way in the vm-relocate.sh script that you can find on the forum). According to the man page, blkid has been part of the e2fsprogs package since version 1.26. The e2fsprogs install on ESX 3.5 is version 1.32, so I'm not sure why it isn't included.

For security reasons? Can't think of any...

But as I mentioned in the post, you can take a simple Linux live CD and use the blkid command from there.
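One possible alternative when blkid is missing, sketched here as an assumption and not necessarily what vm-relocate.sh does: tune2fs, which also ships with e2fsprogs, prints a "Filesystem UUID:" line for ext2/ext3 partitions. The helper name below is made up for illustration.

```shell
#!/bin/sh
# Extract the filesystem UUID from `tune2fs -l` output read on stdin.
# ext2/ext3 only; assumes tune2fs from e2fsprogs is installed.
uuid_from_tune2fs() {
    sed -n 's/^Filesystem UUID:[[:space:]]*//p'
}

# Example (not run here; /dev/hda2 is the example root partition):
#   tune2fs -l /dev/hda2 | uuid_from_tune2fs
```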

--

Wil

TiBoReR
Enthusiast

I found a KB describing the bug and the solution.

J-D
Enthusiast

Thanks for posting the link.

The "might" is scary... so I guess you should always apply that patch before upgrading. What a major issue...

CPM
Contributor

You can load the "Trouble Shooting" option from the main GRUB boot menu. From there you will get to the login prompt, where you enter the "root" credentials. From the CLI, enter the following command:

esxcfg-boot -p

Once back at the command line, run:

esxcfg-boot -b

Reboot and try returning to the main VMware prompt at GRUB to load the ESX Server.

Worked for me. Hope this helps.
