About 25 VMs across 4 hosts became unresponsive and then BSOD'd. Judging from the SNMP chart, the outage probably lasted about an hour. The Windows event logs show the messages below, all starting at roughly the same time. From googling, I found that these point to storage problems.
Storage is a DL380 G5 with 12GB RAM, a P400 with 512MB BBWC, and 8 x 146GB 10K SAS disks in RAID 10, running OpenSolaris 2009.06 with an NFS share. Compression=on and atime=off. It worked fine for about 15 days. I've checked the CPU and network load and found nothing out of the ordinary. Disk I/O is always low, probably < 5MB/sec, and the load average is < 0.1. Nothing useful can be gathered from the /var/svc/log/network-nfs* logs.
Any tips on where else I should be looking? The hosts are DL385 G6, so they are on the HCL. I understand the storage is not on the HCL, although it is similar to the Sun Unified Storage 7000, but I would still like to find out where the problem is.
Event Type: Error
Event Source: symmpi
Event Category: None
Event ID: 15
Date: 11/4/2009
Time: 12:21:18 PM
User: N/A
Computer: SERVER1
Description:
The device, \Device\Scsi\symmpi1, is not ready for access yet.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 0f 1a 18 00 01 00 6e 00 ......n.
0008: 00 00 00 00 0f 00 04 c0 .......
0010: 03 01 00 00 a3 00 00 c0 ......
0018: 35 15 00 00 00 00 00 00 5.......
0020: 00 00 00 00 00 00 00 00 ........
0028: 00 00 00 00 00 00 00 00 ........
0030: 00 00 00 00 08 00 00 00 ........
0038: 07 00 00 00 05 00 00 00 ........
Event Type: Error
Event Source: Disk
Event Category: None
Event ID: 11
Date: 11/4/2009
Time: 12:21:18 PM
User: N/A
Computer: SERVER2
Description:
The driver detected a controller error on \Device\Harddisk0.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 04 06 68 00 01 00 ba 00 ..h....
0008: 00 00 00 00 0b 00 04 c0 .......
0010: 03 01 00 00 00 00 00 00 ........
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
0028: 8e 8c 2f 01 00 00 00 00 ??/.....
0030: ff ff ff ff 07 00 00 00 ....
0038: 40 00 00 05 08 00 00 00 @.......
0040: 58 20 0a 12 82 03 20 40 X ..?. @
0048: 00 10 00 00 3c 00 00 00 ....<...
0050: 00 00 00 00 a0 75 93 89 ....u??
0058: 00 00 00 00 70 73 93 89 ....ps??
0060: c8 ed c3 89 72 59 b3 01 ?rY.
0068: 2a 00 01 b3 59 72 00 00 *..Yr..
0070: 08 00 00 00 00 00 00 00 ........
0078: 00 00 00 00 00 00 00 00 ........
0080: 00 00 00 00 00 00 00 00 ........
0088: 00 00 00 00 00 00 00 00 ........
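In case it helps anyone reading the dumps, here is a minimal sketch of how the Data bytes above can be decoded. It assumes the data follows the IO_ERROR_LOG_PACKET layout from the Windows driver headers, with ErrorCode at offset 0x0C and FinalStatus at offset 0x14, both little-endian. The function name and the small lookup table are only for illustration, and the table lists just the codes that show up in these two events.

```python
import struct

# Illustrative lookup table: only the codes seen in the two events above.
KNOWN_CODES = {
    0xC004000B: "IO_ERR_CONTROLLER_ERROR",
    0xC004000F: "IO_ERR_NOT_READY",
    0xC00000A3: "STATUS_DEVICE_NOT_READY",
    0x00000000: "STATUS_SUCCESS",
}

def decode_event_data(hex_bytes):
    """Return (ErrorCode, FinalStatus) names from an event's Data bytes.

    Assumes the IO_ERROR_LOG_PACKET layout: ErrorCode at offset 0x0C and
    FinalStatus at offset 0x14, each a little-endian 32-bit value.
    """
    raw = bytes.fromhex(hex_bytes)
    (error_code,) = struct.unpack_from("<I", raw, 0x0C)
    (final_status,) = struct.unpack_from("<I", raw, 0x14)

    def name(code):
        return KNOWN_CODES.get(code, "unknown") + " (0x%08X)" % code

    return name(error_code), name(final_status)

# First 24 bytes of the symmpi event (offsets 0000-0017).
symmpi = ("0f 1a 18 00 01 00 6e 00 "
          "00 00 00 00 0f 00 04 c0 "
          "03 01 00 00 a3 00 00 c0")

# First 24 bytes of the Disk event (offsets 0000-0017).
disk = ("04 06 68 00 01 00 ba 00 "
        "00 00 00 00 0b 00 04 c0 "
        "03 01 00 00 00 00 00 00")

print(decode_event_data(symmpi))
print(decode_event_data(disk))
```

Under that assumption, the symmpi entry decodes to IO_ERR_NOT_READY with a final status of STATUS_DEVICE_NOT_READY, and the Disk entry to IO_ERR_CONTROLLER_ERROR, which matches the event descriptions but does not by itself say why the storage stopped responding.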