
This Question is Not Answered

784 Views 5 Replies Last post: Jul 13, 2010 6:21 PM by ChrisKas10
ChrisKas10 Novice 17 posts since
Feb 9, 2009

Jul 12, 2010 1:33 PM

Can't backup or copy some of my VM Clients

 

Running ESXi 4 on an HP DL380 G5. The datastore for my VMs is on an internal RAID 1+0 array that had some issues last week. First I lost one drive; after hot-swapping it, two others went to predictive failure. HP determined it was likely a firmware issue and sent me more drives and a backplane just in case. Got all that swapped, the arrays are rebuilt, and all the lights are green.

 

Before the swap I had some issues trying to back up the client VMs, which I thought might be related to the hardware problems. Post swap, I'm afraid I'm still having the same issues. I should mention that the client machines all appear to be functioning fine.

 

 

Issue 1 is that one of the VM clients (Windows 2003 R2) logs a lot of events when under heavy disk I/O such as backups (lots of de-duping activity). The verification portion of the backup process tends to fail.

 

 

Event Type:    Error
Event Source:    Disk
Event Category:    None
Event ID:    15
Date:        7/11/2010
Time:        8:50:39 PM
User:        N/A
Computer:    EFDEV
Description:
*The device, \Device\Harddisk0, is not ready for access yet.*

 

I'll get 15 - 20 of those over the span of a couple hours while the backup job is running.

 

 

 

Issue 2: I can't get a full copy of the vmdk files. I've tried shutting down the machine and using the vSphere datastore browser, FTP, and SCP. In all cases the copy of two of the machine's vmdk files eventually stalls or times out. I've tried ghettoVCB to a few different NFS targets with the same result.
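Since every copy method stalls partway through, it may help to find out where in the file the reads fail. The following is only a minimal sketch, assuming a Python interpreter is available somewhere that can read the vmdk directly (that is an assumption, not something stated in this thread). The function name `find_unreadable_regions` and the 1 MiB chunk size are made up for illustration.

```python
import os


def find_unreadable_regions(path, chunk_size=1024 * 1024):
    """Read `path` chunk by chunk and note offsets where a read fails.

    Returns (bytes_read_ok, bad), where bad is a list of
    (offset, error_message) tuples for chunks that raised an I/O error.
    """
    bad = []
    ok_bytes = 0
    size = os.path.getsize(path)
    # buffering=0 gives unbuffered reads, so an error maps to a known offset.
    with open(path, "rb", buffering=0) as f:
        offset = 0
        while offset < size:
            try:
                f.seek(offset)
                data = f.read(min(chunk_size, size - offset))
            except OSError as exc:
                # Record the failing offset and skip ahead one chunk.
                bad.append((offset, str(exc)))
                offset += chunk_size
                continue
            if not data:
                break  # file ended earlier than its reported size
            ok_bytes += len(data)
            offset += len(data)
    return ok_bytes, bad
```

If the failing offsets cluster in one region of the flat vmdk, that would be consistent with a localized problem on the underlying array rather than a general transport timeout, though this check alone cannot prove either.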

 

 

 

Here's a ghettoVCB snippet from the most recent error:

 

 

Cloning disk '/vmfs/volumes/VMs/EFdev-2k3-web2/EFdev-2k3-web2.vmdk'...

*Clone: 67% done.Failed to clone disk : Connection timed out (7208969).*

2010-07-12 20:19:05 -- info: Removing snapshot from EFdev-2k3-web2 ...

ls: /vmfs/volumes/VMs/EFdev-2k3-web2/EFdev-2k3-web2-000001.vmdk: No such file or directory

ls: /vmfs/volumes/VMs/EFdev-2k3-web2/EFdev-2k3-web2-000001-delta.vmdk: No such file or directory

2010-07-12 20:19:21 -- info: Backup Duration: 39.03 Minutes

2010-07-12 20:19:21 -- info: Successfully completed backup for EFdev-2k3-web2!
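Note that the snippet above ends with "Successfully completed backup" even though the clone failed at 67%, so the final status line alone cannot be trusted. A minimal sketch of a check that scans the log text for failure lines is shown here; the function name `backup_log_problems` and the pattern list are illustrative assumptions, not part of ghettoVCB.

```python
import re

# Patterns taken from the failure line quoted in this thread; extend as needed.
FAILURE_PATTERNS = [
    re.compile(r"Failed to clone disk"),
    re.compile(r"timed out", re.IGNORECASE),
]


def backup_log_problems(log_text):
    """Return the log lines that match any known failure pattern."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(p.search(line) for p in FAILURE_PATTERNS)
    ]
```

A wrapper could treat a non-empty result as a failed backup regardless of what the last log line says.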

 

 

 

 

When that error happens, I find entries like this in /var/log/messages:

 

 

 

Jul 12 20:06:43 vmkernel: 3:20:14:38.244 cpu0:8661)NMP: nmp_CompleteCommandForPath: Command 0x28 (0x4100050b7780) to NMP device "mpx.vmhba1:C0:T1:L0" failed on physical path "vmhba1:C0:T1:L0" H:0x3 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.

Jul 12 20:06:43 vmkernel: 3:20:14:38.244 cpu0:8661)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "mpx.vmhba1:C0:T1:L0" state in doubt; requested fast path state update...

Jul 12 20:06:43 vmkernel: 3:20:14:38.244 cpu0:8661)ScsiDeviceIO: 747: Command 0x28 to device "mpx.vmhba1:C0:T1:L0" failed H:0x3 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.

Jul 12 20:06:43 vmkernel: 3:20:14:38.417 cpu0:8661)<4>cciss: cmd 0x4100b1002270 has CHECK CONDITION  byte 2 = 0x3

Jul 12 20:06:43 vmkernel: 3:20:14:38.424 cpu0:8661)NMP: nmp_CompleteCommandForPath: Command 0x28 (0x410005142c80) to NMP device "mpx.vmhba1:C0:T1:L0" failed on physical path "vmhba1:C0:T1:L0" H:0x3 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.

Jul 12 20:06:43 vmkernel: 3:20:14:38.424 cpu0:8661)ScsiDeviceIO: 747: Command 0x28 to device "mpx.vmhba1:C0:T1:L0" failed H:0x3 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.

Jul 12 20:06:44 Hostd: 2010-07-12 20:06:44.761 147CFB90 verbose 'vm:/vmfs/volumes/4b053e54-176e0886-5440-001b784635e0/EFfax/EFfax.vmx' Updating current heartbeatStatus: yellow

Jul 12 20:06:44 vmkernel: 3:20:14:39.588 cpu0:8144)<4>cciss: cmd 0x4100b1002000 has CHECK CONDITION  byte 2 = 0x3

Jul 12 20:06:44 vmkernel: 3:20:14:39.588 cpu0:8144)NMP: nmp_CompleteCommandForPath: Command 0x28 (0x4100050ef780) to NMP device "mpx.vmhba1:C0:T1:L0" failed on physical path "vmhba1:C0:T1:L0" H:0x3 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.

Jul 12 20:06:44 vmkernel: 3:20:14:39.588 cpu0:8144)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "mpx.vmhba1:C0:T1:L0" state in doubt; requested fast path state update...

Jul 12 20:06:44 vmkernel: 3:20:14:39.588 cpu0:8144)ScsiDeviceIO: 747: Command 0x28 to device "mpx.vmhba1:C0:T1:L0" failed H:0x3 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.

Jul 12 20:06:44 vmkernel: 3:20:14:39.767 cpu0:8675)<4>cciss: cmd 0x4100b1002750 has CHECK CONDITION  byte 2 = 0x3

Jul 12 20:06:44 vmkernel: 3:20:14:39.773 cpu0:8675)NMP: nmp_CompleteCommandForPath: Command 0x28 (0x4100050edb80) to NMP device "mpx.vmhba1:C0:T1:L0" failed on physical path "vmhba1:C0:T1:L0" H:0x3 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.

Jul 12 20:06:45 Hostd: 2010-07-12 20:06:45.720 75503B90 verbose 'vm:/vmfs/volumes/4b053e54-176e0886-5440-001b784635e0/EFdev/EFdev.vmx' Actual VM overhead: 148062208 bytes

Jul 12 20:06:45 Hostd: 2010-07-12 20:06:45.727 75503B90 verbose 'Vmsvc' RefreshVms updated overhead for 1 VM

 

 

Many repetitions.
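With this many repetitions, it can be easier to tally the failures than to read them line by line. The sketch below decodes the fields that appear in the excerpt above. It rests on assumptions that should be checked against VMware's and HP's documentation: that the `H:` value is a SCSI host status (where 0x3 is commonly documented as a timeout), that command 0x28 is the standard SCSI READ(10) opcode, and that the cciss "byte 2" value is the sense key from fixed-format sense data (where 0x3 is MEDIUM ERROR). The function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Host status names as commonly documented; verify against VMware's KB.
HOST_STATUS = {0x0: "OK", 0x1: "NO_CONNECT", 0x2: "BUS_BUSY", 0x3: "TIMEOUT",
               0x4: "BAD_TARGET", 0x5: "ABORT", 0x6: "PARITY", 0x7: "ERROR",
               0x8: "RESET"}
# Standard SCSI sense keys (low nibble of sense byte 2).
SENSE_KEY = {0x0: "NO SENSE", 0x1: "RECOVERED ERROR", 0x2: "NOT READY",
             0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR",
             0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION"}
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}

# Matches only the ScsiDeviceIO lines, so NMP lines are not double counted.
IO_FAIL = re.compile(
    r'Command (?P<op>0x[0-9a-f]+) to device "(?P<dev>[^"]+)" failed '
    r'H:(?P<host>0x[0-9a-f]+)')
CCISS = re.compile(r'cciss: cmd \S+ has CHECK CONDITION\s+byte 2 = (?P<key>0x[0-9a-f]+)')


def summarize(log_text):
    """Count I/O failures per (device, opcode, host status) and cciss sense keys."""
    io_failures = Counter()
    cciss_sense = Counter()
    for line in log_text.splitlines():
        m = IO_FAIL.search(line)
        if m:
            op = int(m.group("op"), 16)
            host = int(m.group("host"), 16)
            io_failures[(m.group("dev"), OPCODES.get(op, hex(op)),
                         HOST_STATUS.get(host, hex(host)))] += 1
        m = CCISS.search(line)
        if m:
            key = int(m.group("key"), 16) & 0x0F
            cciss_sense[SENSE_KEY.get(key, hex(key))] += 1
    return io_failures, cciss_sense
```

Under those assumptions, the lines quoted above would be tallied as READ(10) commands to "mpx.vmhba1:C0:T1:L0" ending in a host-side timeout, alongside cciss reports with a MEDIUM ERROR sense key. That would point at reads failing on the logical drive itself rather than at the VMFS layer, but it is an interpretation to confirm with HP's array diagnostics, not a diagnosis.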

 

 

My theory is that I might have issues with the VMFS file system?

 

 

 

 

 

I'm looking for suggestions on how to proceed. What would you try next to fix this?

 

 

 

 

 

Thanks in advance for any suggestions.

 

 

 

 

 

DSTAVERT Guru User Moderators vExpert 10,090 posts since
Nov 30, 2003
1. Jul 12, 2010 2:05 PM in response to: ChrisKas10
Re: Can't backup or copy some of my VM Clients

Have a look at the following KB article for starters. http://kb.vmware.com/kb/289902

 

Do you have the BBWC module installed for the controller and write caching enabled? Firmware up to date for ESXi? Use the specific firmware CD for VMware on the HP download site. Are you using the HP specific version of ESXi? Which build of ESXi?

-- David -- VMware Communities Moderator
DSTAVERT Guru User Moderators vExpert 10,090 posts since
Nov 30, 2003
3. Jul 12, 2010 3:09 PM in response to: ChrisKas10
Re: Can't backup or copy some of my VM Clients

The BBWC is an HP option for the disk controller that provides a battery-backed RAM cache. Write caching "greatly" improves disk controller performance. It is a must in a virtual environment. To check write caching you need to use the SmartStart CD or the ACU (Array Configuration Utility) CD. Do not enable write caching unless the battery module is installed. I don't remember whether the module shows up in the ACU screens.

 

The software download page for your particular HP server model lists OS versions for the firmware CD, and VMware is listed. The firmware CD is 9 so you should be OK, but there are some critical disk controller updates listed that are newer than the firmware CD. I might consider them. I would scan the HP forums for any potential issues with VMware and those updates. Make very sure you use drivers from the VMware OS download page. There are / have been firmware versions for some components that are NOT recommended for specific versions of VMware.

 

There is a specific version of ESXi for HP servers that you can download from HP. There is an upgrade bundle on the same list as the firmware that gives you the components missing from a non-HP install of ESXi.

 

You might want to consider upgrading to ESXi Update 2.

-- David -- VMware Communities Moderator
