VMware Cloud Community
dclark
Enthusiast

Kernel Panic: VFS: Unable to mount root fs

Hello

Running ESX 3.0.1 patched with 1006511 and 2066306. The patches have been installed for approximately three months. I rebooted the server on Sunday and got this error:

Kernel Panic: VFS: Unable to mount root fs

In the end I booted into troubleshooting mode and ran esxcfg-boot -b, which fixed the problem.
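
In case it helps anyone searching later, that was the whole fix from the Troubleshooting mode console:

    esxcfg-boot -b    # rebuild the boot configuration; this recreated the initrd
    reboot            # boot normally to confirm the fix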

The worry I have is that these servers are at a remote site; if we need to reboot remotely, we won't know whether the problem will recur until it is too late, and then I have a 150-mile drive to the remote site to fix it.

It looks as if a few others have had this problem, but does anyone know why it occurs, and is there a proper fix or patch available?

Many thanks

7 Replies
acmcnick
Enthusiast

I had a similar issue, but it didn't happen on a reboot; it happened during production.

It was a bug in the PERC 5/i firmware that caused the logical disk to become "un-attached", for lack of a better word. We updated the firmware and all was well.

A couple of things:

1. Make sure to review all patches for possible bugs

2. Make sure you actually need to apply the patches

3. Make sure your company lets you buy iLO for at least your remote servers, if not all of them, to avoid a drive. (We have sites between 300 and 1,000 miles away.)

4. Don't just look for VMware patches; hardware vendor patches for disk controllers have been known to fix major problems and to cause major problems.

Hope this helps.

Schorschi
Expert

Can you list what firmware version the 5/i was at? And what version you updated to?

We had almost the EXACT same issue with the PERC 4e/Di controller running 5.22A, and we had to back-rev to 5.21X to get stable again. We lost 2 servers out of about 30 or so. They either came with 5.22A or were flashed to 5.22A before ESX 3.0.1 was installed (all patches up to 051807).

Both VMware and Dell are STILL trying to figure out what happened! We have yet to see any issue with the PERC 5/i on our newer servers, but this thread scares the you-know-what out of you-know-where; it sounds so close to our PERC 4 issues, it is scary!

acmcnick
Enthusiast

http://support.dell.com/support/downloads/download.aspx?c=us&l=en&s=gen&releaseid=R149666&SystemID=P...

From the release notes: "If all physical disks making up a virtual disk are removed from the system or enclosure, the RAID controller will delete the virtual disk. If the disks are later re-inserted, they will be marked foreign and can be imported using a management application."

I believe we upgraded from firmware 1.02.10 to 1.03.10-0216, which is in the link above.
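
Since the disks come back marked foreign, here is one way to import them from the Service Console, assuming Dell OpenManage Server Administrator (OMSA) is installed; the controller ID below is a placeholder, so list yours first:

    # list the controllers and note the ID of the PERC 5/i
    omreport storage controller

    # import the foreign configuration (controller ID 0 is an assumption)
    omconfig storage controller action=importforeignconfig controller=0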

acmcnick
Enthusiast

R149666

Description:

Dell PowerEdge RAID Controller 5/I Integrated Firmware Release

Package Version: 5.1.1-0040

Texiwill
Leadership

Hello,

This implies that the boot volume of the Service Console (SC) is no longer accessible. You have only a few options.

One is to use 'rescue' media or boot into Troubleshooting mode. If Troubleshooting mode works, then there is a problem with the initrd for the ESX Server, and you will need to recreate it using esxcfg-boot.

If you cannot boot into Troubleshooting mode, then you need the rescue media and an understanding of how to mount Linux partitions, use fsck and chroot, and rebuild the master boot record.

I would try troubleshooting mode first.
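
If it does come to the rescue media, a rough sketch of those steps, with device names assumed for illustration (/dev/sda2 as the SC root, /dev/sda1 as /boot; check yours with fdisk -l first):

    # repair the Service Console root filesystem
    fsck -y /dev/sda2

    # mount the repaired root plus /boot and /proc, then switch into it
    mount /dev/sda2 /mnt
    mount /dev/sda1 /mnt/boot
    mount -t proc none /mnt/proc
    chroot /mnt

    # rebuild the master boot record (the target disk is also an assumption)
    grub-install /dev/sda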

Best regards,

Edward

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
Schorschi
Expert

Dell and VMware have not figured out the original issue, but after some research it appears to be related to an older Linux boot issue that gets mucked up when kernel changes are made or grub gets confused. However, neither of those related issues makes sense per se in our situation, since we made no kernel change and no grub configuration change when we experienced the issue. The back-rev/forward-rev symptom I referenced before, which now appears unrelated to the original problem, has been documented by Dell and published by VMware at the following link. Be careful: the technote applies to ESX 2.5.x and 3.0.x, even though the title and the products-affected notes state ESX 3.0.x only. Dell said they are correcting this soon.

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1001577&sl...


Flan5ter
Contributor

Edward,

You star! I had the following message on boot, and it would go no further: "Kernel Panic: VFS: Unable to mount root fs on 00:00"

I tried troubleshooting mode, which told me I had an unclean shutdown and offered to repair the volumes, which I did. This did not solve the problem, but it led me to believe that the boot configuration was corrupt.

I then went into Debug mode and ran the following command:

esxcfg-boot -g -b -p -r

This fixed it.
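
For anyone who finds this thread later, my reading of what that one-liner rebuilds; the flag descriptions are my own understanding rather than official docs, so confirm against esxcfg-boot's help output on your build:

    # -g  regenerate the grub configuration
    # -b  update the boot configuration / initrd
    # -p  refresh the stored PCI device information
    # -r  rebuild the initrd image(s)
    esxcfg-boot -g -b -p -r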

Regards

Andy
