VMware

6 Replies Last post: Nov 17, 2009 2:53 PM by wjs  

SCSI errors on VMs on NFS datastores posted: Oct 20, 2009 8:17 AM

deltajoka (Novice, 9 posts since Dec 22, 2008)
Hi,

We have recently been experiencing problems with our VMs on NFS datastores on an IBM N series/NetApp filer. We have about 5 to 10 short incidents per month where VMs (simultaneously on multiple ESX hosts) experience some kind of SCSI "lag"/timeout. The following types of messages are then logged on the console (and in kern.log) during these "glitches":

mptscsih: ioc0: attempting task abort! (sc=de2e0280)
mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
mptscsih: ioc0: task abort: SUCCESS (sc=de2e0280)
mptbase: ioc0: IOCStatus(0x0002): Busy
mptbase: ioc0: IOCStatus(0x0002): Busy

Usually, some of the ESX hosts (sometimes only one, sometimes several, even though VMs on all hosts are affected) log single occurrences of the following types of NFS-related errors:

esx03 vmkernel: 118:20:41:44.851 cpu1:1110)WARNING: NFS: 4590: Can't find call with serial number -2146566064
esx04 kernel: nfs_statfs64: statfs error = 5
esx01 kernel: nfs_statfs: statfs error = 5
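
To see how often each kind of message shows up during an incident, the excerpts above can be tallied from saved log files. The following is only a minimal sketch: it assumes the log lines look like the ones quoted in this post, and the function name `tally_messages` and the category labels are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Patterns derived from the log excerpts quoted in this thread.
# The category labels are illustrative, not VMware or Linux terminology.
PATTERNS = {
    "guest_task_abort": re.compile(r"mptscsih: ioc\d+: attempting task abort"),
    "guest_ioc_busy": re.compile(r"mptbase: ioc\d+: IOCStatus\(0x0002\): Busy"),
    "host_nfs_lost_call": re.compile(r"NFS: \d+: Can't find call with serial number"),
    "host_nfs_statfs_error": re.compile(r"nfs_statfs(64)?: statfs error = \d+"),
}


def tally_messages(lines):
    """Count how many lines match each known message pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break  # each line is counted in at most one category
    return counts


# Example use on a saved copy of a log (the path is a placeholder):
#     with open("kern.log", errors="replace") as fh:
#         print(tally_messages(fh))
```

Running this over the guest kern.log and the host logs for the same time window would show whether the guest-side task aborts line up with the host-side NFS errors.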

We have been investigating counters on the switches and on the filer. There seems to be some retransmission of TCP packets, but no dropped packets or packets with bad headers, invalid checksums, or similar.

If these problems were the result of high I/O or latency on the filer, wouldn't the effect be slower transfers rather than VMs simply "losing" their disks for a short period of time?

The ESX hosts are HP DL360 G5, running ESX 3.5u4. The switches are Cisco 2960 (gigabit), with flow control disabled.

Any input on this matter is most welcome!

Re: SCSI errors on VMs on NFS datastores

1. Oct 20, 2009 9:29 AM in response to: deltajoka
TobiasKracht (Expert, 508 posts since Aug 31, 2009)
It looks like you have trouble with disk performance: the host copies files faster than the client writes them.

StarWind Software R&D

Re: SCSI errors on VMs on NFS datastores

4. Nov 4, 2009 7:06 AM in response to: deltajoka
bobross (Hot Shot, 132 posts since Nov 1, 2007)
This phenomenon is why we stopped running swap on NFS a long time ago - it's just not a good architectural choice. We run swap on Xiotech DAS now and it cooks.

Re: SCSI errors on VMs on NFS datastores

6. Nov 17, 2009 2:53 PM in response to: deltajoka
wjs (Lurker, 3 posts since Aug 5, 2006)
Any luck with this? I am experiencing the exact same issue. My CentOS images see the problem, but my Windows images do not. My hardware is a bit different, however. I assumed in my case that my NFS server is too busy and that I need to move some workloads around.
