Hello,
We had a problem with our P2K recently, but after we brought the storage array back some of the hosts are not able to access the LUNs.
2018-02-05T17:18:07.272Z cpu18:36053 opID=42052f24)World: 14369: VC opID 92D4B216-0000587C-db-cc maps to vmkernel opID 42052f24
2018-02-05T17:18:07.272Z cpu18:36053 opID=42052f24)Partition: 423: Failed read for "naa.600c0ff000195c8a26ea705301000000": I/O error
2018-02-05T17:18:07.272Z cpu18:36053 opID=42052f24)Partition: 1003: Failed to read protective mbr on "naa.6xxxxxxx" : I/O error
2018-02-05T17:18:07.272Z cpu18:36053 opID=42052f24)WARNING: Partition: 1112: Partition table read from device naa.XXXXXXX failed: I/O error
2018-02-05T17:18:17.283Z cpu12:32817)ScsiDeviceIO: 2369: Cmd(0x413642d33b40) 0x1a, CmdSN 0x3215 from world 0 to dev "naa.600c0ff000195c8ae7e9705303000000" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Any suggestions?
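To make the last log line easier to read, here is a minimal Python sketch that pulls the device ID, SCSI opcode and the H/D/P status values out of a ScsiDeviceIO line. The host status table is reproduced from memory of VMware's published SCSI sense code notes and is only partial, so treat the names as an assumption and verify them against the vendor documentation. The function name `decode_scsi_failure` is made up for this example.

```python
import re

# Host-side status codes (assumption: partial table, verify against VMware's KB).
HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIMEOUT",
    0x4: "BAD_TARGET",
    0x5: "ABORT",
    0x7: "ERROR",
    0x8: "RESET",
}

# A few standard SCSI operation codes; unknown ones are shown as hex.
SCSI_OPCODES = {0x1A: "MODE SENSE(6)", 0x28: "READ(10)", 0x2A: "WRITE(10)"}

SCSI_LINE = re.compile(
    r'Cmd\(0x[0-9a-f]+\) (?P<op>0x[0-9a-f]+),.*?'
    r'to dev "(?P<dev>[^"]+)" failed '
    r'H:(?P<h>0x[0-9a-f]+) D:(?P<d>0x[0-9a-f]+) P:(?P<p>0x[0-9a-f]+)',
    re.IGNORECASE,
)

def decode_scsi_failure(line):
    """Return a dict describing a failed SCSI command, or None if no match."""
    m = SCSI_LINE.search(line)
    if m is None:
        return None
    op = int(m.group("op"), 16)
    host = int(m.group("h"), 16)
    return {
        "device": m.group("dev"),
        "opcode": SCSI_OPCODES.get(op, hex(op)),
        "host_status": HOST_STATUS.get(host, hex(host)),
        "device_status": int(m.group("d"), 16),
        "plugin_status": int(m.group("p"), 16),
    }

# The ScsiDeviceIO line from the log above.
sample = ('2018-02-05T17:18:17.283Z cpu12:32817)ScsiDeviceIO: 2369: '
          'Cmd(0x413642d33b40) 0x1a, CmdSN 0x3215 from world 0 to dev '
          '"naa.600c0ff000195c8ae7e9705303000000" failed H:0x5 D:0x0 P:0x0 '
          'Possible sense data: 0x0 0x0 0x0.')
print(decode_scsi_failure(sample))
```

With D and P both zero and a non-zero H value, the failure was reported on the host side of the path and not by the device itself, which fits the later observation in this thread that other hosts may still read the LUN.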
Check with the storage vendor. I/O errors like those can often mean data corruption has occurred.
So you can still read the content from some hosts?
Then do the following:
- unmount the volume on those hosts that can no longer read it (DO THIS ASAP)
- select one host that still can read the volume and unmount the volume on all other hosts
- use this one host to copy all VMs to another datastore
- rebuild the affected datastore from scratch and avoid RAID 5
Assume that the situation may deteriorate if you do not act soon - so do not waste too much time in the current state.
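The order of those steps matters, so here is a small Python sketch that turns "which hosts can still read the volume" into an ordered checklist. It only produces text and does not touch any host. The function name `plan_evacuation` and the host names are hypothetical.

```python
def plan_evacuation(readable_by_host):
    """Build the ordered checklist described above.

    readable_by_host maps a host name to True if that host can still
    read the volume. Returns (keeper_host, list_of_steps).
    """
    failing = sorted(h for h, ok in readable_by_host.items() if not ok)
    healthy = sorted(h for h, ok in readable_by_host.items() if ok)
    if not healthy:
        raise ValueError("no host can read the volume; nothing to copy from")

    # Keep exactly one healthy host mounted; the choice here is arbitrary.
    keeper, others = healthy[0], healthy[1:]

    steps = [f"unmount volume on {h} (cannot read it)" for h in failing]
    steps += [f"unmount volume on {h} (leave only {keeper} mounted)" for h in others]
    steps.append(f"copy all VMs from {keeper} to another datastore")
    steps.append("rebuild the affected datastore from scratch")
    return keeper, steps

# Hypothetical cluster: esx1 reports I/O errors, esx2 and esx3 still read the LUN.
keeper, steps = plan_evacuation({"esx1": False, "esx2": True, "esx3": True})
for number, step in enumerate(steps, start=1):
    print(number, step)
```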
Try performing a LUN reset (KB below), followed by the commands below in sequence.
1. esxcfg-rescan -A
2. vmkfstools -V
3. Check the availability of the datastore in question.
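The steps above can be sketched as a small Python wrapper that runs the commands in order and stops at the first failure. This assumes a Python 3 interpreter is available where the commands run. The first two commands are the ones listed above; `esxcli storage filesystem list` is my assumption for how step 3 could be checked from the shell. The script defaults to a dry run that only prints what it would execute.

```python
import subprocess

# Steps 1 and 2 come from the list above; the third entry is an assumed
# way to perform step 3 (listing datastores known to the host).
RECOVERY_COMMANDS = [
    ["esxcfg-rescan", "-A"],
    ["vmkfstools", "-V"],
    ["esxcli", "storage", "filesystem", "list"],
]

def run_recovery(dry_run=True):
    """Run the commands in sequence; return them as printable strings.

    With dry_run=True nothing is executed. With dry_run=False each command
    runs in turn and a non-zero exit code raises CalledProcessError.
    """
    executed = []
    for cmd in RECOVERY_COMMANDS:
        executed.append(" ".join(cmd))
        if dry_run:
            continue
        subprocess.run(cmd, check=True)
    return executed

for line in run_recovery(dry_run=True):
    print(line)
```

Call `run_recovery(dry_run=False)` on the affected host only after the LUN reset has been done.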
For further troubleshooting, check the storage array logs if required.
Best Regards,
Deepak Koshal
CNE|CLA|CWMA|VCP4|VCP5|CCAH
> Check with the storage vendor. I/O errors like those can often mean data corruption has occurred.
From several years of VMFS recovery I have learned that if one host in a cluster complains about I/O errors, this does not necessarily mean that the actual data is corrupted.
Often another host or a linux system can still read the data without problems.
So whenever I come across this issue I first try to read the data using another host or a Linux LiveCD.
So the surprising lesson here is that I/O errors in a vmkernel log do not immediately mean corruption, but rather that "this" host does not want to cooperate.
Yes, I know how crazy that sounds.
Also, when one host complains that a LUN has no partition table, it does not mean that there is no partition table.
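One way to check this from a Linux LiveCD is to read the first two sectors of the LUN and look for the standard signatures yourself. The sketch below is a minimal Python example, assuming 512-byte logical sectors and a GPT disk with a protective MBR (which is what the "Failed to read protective mbr" message refers to). The function names and any device path passed to `read_device_start` are placeholders of mine, and reading a raw device normally needs root.

```python
SECTOR = 512  # assumption: 512-byte logical sectors

def inspect_partition_table(first_two_sectors):
    """Report which partition-table signatures exist in LBA 0 and LBA 1."""
    if len(first_two_sectors) < 2 * SECTOR:
        raise ValueError("need at least the first two sectors")
    mbr = first_two_sectors[:SECTOR]
    lba1 = first_two_sectors[SECTOR:2 * SECTOR]

    # The MBR boot signature is the byte pair 0x55 0xAA at offset 510.
    has_mbr_signature = mbr[510:512] == b"\x55\xaa"
    # Four 16-byte partition entries start at offset 446; type byte is at +4.
    types = [mbr[446 + 16 * i + 4] for i in range(4)]

    return {
        "mbr_signature": has_mbr_signature,
        # A protective MBR carries a partition of type 0xEE.
        "protective_mbr": has_mbr_signature and 0xEE in types,
        # The GPT header in LBA 1 starts with the ASCII text "EFI PART".
        "gpt_header": lba1[:8] == b"EFI PART",
    }

def read_device_start(path):
    """Read the first two sectors of a block device or image file (read-only)."""
    with open(path, "rb") as dev:
        return dev.read(2 * SECTOR)

# Self-contained demonstration on an in-memory image, no real device needed.
fake = bytearray(2 * SECTOR)
fake[450] = 0xEE                 # type byte of the first partition entry
fake[510:512] = b"\x55\xaa"      # MBR boot signature
fake[512:520] = b"EFI PART"      # GPT header signature in LBA 1
print(inspect_partition_table(bytes(fake)))
```

If this reports the signatures as present while an ESXi host claims the partition table cannot be read, the table itself is likely intact and the problem is on that host's path to the LUN.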
I shut down all VMs and ESXi hosts, restarted both storage array controllers, and then powered them back up.
Thank you, everyone.