VMware Cloud Community
blabili
Contributor

File system error in VM

Hi everyone,

I just started using ESXi 3.5 and am already getting file system errors in my virtual machine. I reinstalled the VM, but it happened again, so there must be some underlying cause.

It's a minimal installation of Ubuntu 10.04 LTS. It runs a remote backup system that wakes up every hour and backs up some clients on the network. The virtual machine's image is stored on a datastore that is connected via NFS. This is the output I get from the virtual machine:

screenshot.png

Does anyone understand why this happens?

Thank you for your help. Cheers, Mike

3 Replies
mcowger
Immortal

Linux aborts the journal when the underlying storage has been unavailable for more than 60 seconds. From that point on, until a reboot, you won't be able to access anything.

In all likelihood, your NFS datastore became unavailable for 60s or more, leading to this condition.
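One way to check whether the guest has hit this condition is to look for disk-backed filesystems that are currently mounted read-only, since an ext3/ext4 filesystem mounted with errors=remount-ro (as in Ubuntu's default fstab for the root filesystem) is switched to read-only after a journal abort. A minimal sketch, assuming the usual /proc/mounts layout; the function name `list_ro_mounts` is made up for illustration:

```shell
#!/bin/sh
# list_ro_mounts [FILE]: print "<device> <mount point>" for every /dev-backed
# filesystem whose mount options include the "ro" flag.
# FILE defaults to /proc/mounts; passing another path is only useful for testing.
list_ro_mounts() {
    awk '$1 ~ /^\/dev\// && $4 ~ /(^|,)ro(,|$)/ { print $1, $2 }' "${1:-/proc/mounts}"
}
```

Running `list_ro_mounts` with no argument inside the guest reads /proc/mounts. If the root filesystem shows up in the output after a storage stall, that is consistent with the explanation above, though it does not prove the NFS datastore was the cause.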

BTW - why are you using ESXi 3.5?  5.0 is the current version...

--Matt VCDX #52 blog.cowger.us
blabili
Contributor

So could that be because of a power-saving function on the NAS that turns off the disks when they have been unused for a longer time? But it shouldn't take 60 seconds to power up the disks again, so I don't think turning off this power-saving function would solve the problem. Can I extend the timeout on Linux? Could that solve the problem?

I'm using 3.5 because of 32-bit hardware. As far as I know, this is the newest version of ESXi that is available for 32-bit hardware. Am I right?

Dracolith
Enthusiast

blabili wrote:

So could that be because of a power-saving function on the NAS that turns off the disks when they have been unused for a longer time? But it shouldn't take 60 seconds to power up the disks again, so I don't think turning off this power-saving function would solve the problem. Can I extend the timeout on Linux?

Honestly, we can't really be sure whether the power-saving function is related or not.

What kind of NAS is this?

It is possible your NAS simply isn't designed in a manner compatible with ESX(i), but it's more likely that a storage problem, hardware problem, connectivity problem, or software bug is coming into play.

Without further information, yes, I would suggest shutting off any power-saving or other features that might negatively impact the NAS's performance.

Make sure you have the latest available version of VMware Tools installed and running in the guest.

If you were using ESXi 5, I would suggest using a Paravirtualized SCSI controller type for your guest OS.

Regarding the SCSI I/O timeouts that could cause an aborted journal: the read and write I/O timeouts are hard-wired into the LSI driver. There is only one timeout you can really configure in the Linux guest at runtime, the "command timeout".

E.g.

# Raise the SCSI command timeout to 180 seconds on every sd* disk (run as root)
for dev in /sys/block/sd*; do
    echo 180 > "$dev/device/timeout"
done
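To confirm the change took effect, the same sysfs files can be read back. A sketch, assuming the guest's SCSI disks appear as sd* under /sys/block; the function name `show_scsi_timeouts` and its optional directory argument are hypothetical and exist only so the snippet can be tried against a scratch directory:

```shell
#!/bin/sh
# show_scsi_timeouts [DIR]: print "<disk> <timeout in seconds>" for each sd* disk.
# DIR defaults to /sys/block; overriding it is only useful for testing.
show_scsi_timeouts() {
    dir="${1:-/sys/block}"
    for dev in "$dir"/sd*; do
        # Skip the unexpanded glob (no sd* disks) and disks without a timeout file.
        [ -r "$dev/device/timeout" ] || continue
        echo "$(basename "$dev") $(cat "$dev/device/timeout")"
    done
}
```

Values written to sysfs are runtime settings, so they are lost on reboot; the loop would have to be re-run at boot (for example from an init script) if the longer timeout turns out to help.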

I'm using 3.5 because of 32-bit hardware. As far as I know, this is the newest version of ESXi that is available for 32-bit hardware. Am I right?

It may be the newest version that can run on 32 bit hardware, but at this point it is ancient.

32 bit hardware at this point is not enterprise grade server hardware.

Heck, most people are utilizing 64 bit workstations these days.

vSphere 4 and vSphere 5 are definitely more reliable in many ways.

In my experience, 3.5 was buggy.

If you wish to use version 3.5 of the software, I would strongly recommend using a downgrade option to ESX 3.5 and staying away from ESXi.
