SCSI reservation conflicts seen by the ESX Server are caused by too many simultaneous activities against a single LUN; the VMFS metadata is getting locked constantly. Since you are seeing them within ESX, we should investigate there.
Can you give us the pertinent parts of the log file?
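If you're not sure which parts are pertinent, the reservation messages land in the vmkernel log on the service console. A minimal way to pull them out (assuming the standard ESX 3.x log location):

grep -i reservation /var/log/vmkernel | tail -20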
Actions that can cause a conflict:
open, close file/link to RAW/RDM
change size of file/link to RAW/RDM
create/delete file/link to RAW/RDM
update access/mod/create times of file/link to RAW/RDM
So when you create a RAW, you are creating a link to the RAW within the metadata, updating access, modification, and create times, and setting an initial size.
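To make that concrete, creating a physical-mode RDM with vmkfstools is exactly this kind of metadata transaction: it creates the mapping file/link, stamps the times, and sets the initial size. A sketch with a made-up device path and datastore name:

vmkfstools -z /vmfs/devices/disks/vmhba1:0:3:0 /vmfs/volumes/storage1/clusternode1/quorum-rdm.vmdk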
Are you doing ANY other actions on the LUN when you do this?
The EVA6000 should be able to handle up to 4 such actions simultaneously.
Also, the number of blades that see the LUN has an impact. How many see it? More than 8?
The LUNs in question are all specifically for the purpose of presenting disk to MSCS (Microsoft Clustering) nodes. They are not used for anything else. Sorry... I was reading through my post and realized I probably wasn't clear enough.
It has to do with rescanning to add new storage (unrelated to those LUNs) on any ESX host in the cluster. There aren't any errors for other LUNs in the logs, just for the ones used as raw device mappings to the MSCS VMs.
We have other Windows VMs with raw device mappings (to other LUNs) and we have no problems with those. I suspect this problem comes from the fact that MSCS places a SCSI reservation on each clustered LUN to tie it to whichever node owns the disk resource at the time. ESX tries to rescan those reserved LUNs and gets reservation errors because the underlying MSCS VM holds the reservation. Theory only...
My question is whether this behavior is by design or whether I have something messed up in the config. If the former, is there any workaround? If the latter, does anyone have suggestions on how I might fix it? I followed the MSCS best-practices document from VMware when setting these up (the raw device mappings are physical).
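One workaround I've been wondering about, though I haven't tried it, is the Disk.MaskLUNs advanced setting, which hides specific paths from the storage stack so that hosts not running the cluster VMs never probe the reserved LUNs during a rescan. A sketch from the service console, with made-up paths:

esxcfg-advcfg -s "vmhba1:0:3;vmhba2:0:3" /Disk/MaskLUNs
esxcfg-advcfg -g /Disk/MaskLUNs

If anyone has actually used that for MSCS LUNs, I'd love to hear about it.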
In answer to your questions: I am doing nothing but rescanning to add new storage, and that alone is causing the reservation conflicts and timeouts. The only other thing happening is that the MSCS cluster is up and serving data (it's a QA SQL Server cluster).
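I've been kicking the rescans off from the client, by the way. I suppose I could also run them per-HBA from the service console, which would at least sidestep the client timeout (adapter names here are just examples):

esxcfg-rescan vmhba1
esxcfg-rescan vmhba2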
There are currently 10 blades that can see the LUNs in question.
This problem has been ongoing, but it was never enough of an annoyance to warrant the time to figure it out (we only add a VMFS volume every couple of months). The last time I went to add a new VMFS volume, though, I couldn't do it without increasing the timeout value in the client to almost 10 minutes.
Many thanks for the quick response.
I have the exact same issue, running MSCS with a physical and a virtual node. I currently have a case open with VMware about it, but not much has come of it yet. Beware that it may get to the point where a rescan can crash your ESX host; that happened to me, which is what alerted me to all the reservation errors. I had called VMware before the crash because rescans had gotten very slow, but all they did was change the timeout and tell me it was a result of the number of LUNs connected to our hosts. Shame on me for not digging further and seeing all the reservation errors that were causing the slow rescans.
I run Emulex 4Gb cards, so I am currently in violation of this since I'm on 3.0.1. Fortunately, I am moving to a new EMC DMX SAN in the coming months and upgrading to 3.0.2 at the same time, which now supports MSCS with the 4Gb drivers, as I read the other day (also shown in the last link of your post).
Could this possibly clear the reservation issues?
All of the blades are HP p-Class (8 are G1, 2 are G2) and are connected to the switches at 2 Gbps.
Looking at it further, however, it appears that I'm already using the 4Gb drivers:
[root@enc1bl1 root]# vmkload_mod -l
Name R/O Addr Length R/W Addr Length ID Loaded
vmkapimod 0x7b5000 0x1000 0x1dff070 0x1000 1 Yes
vmklinux 0x7b6000 0x18000 0x1e8b610 0x3e000 2 Yes
cciss 0x7ce000 0x6000 0x1ed3ab8 0x2000 3 Yes
qla2300_707 0x7d4000 0x44000 0x1ed7ba0 0x72000 4 Yes
tg3 0x818000 0x12000 0x1f53c48 0x4000 5 Yes
tcpip 0x82a000 0x3b000 0x1f58670 0x1b000 6 Yes
cosShadow 0x865000 0x3b000 0x1f756b8 0x1b000 7 Yes
migration 0x8a0000 0xe000 0x1f926d0 0x1000 8 Yes
lvmdriver 0x8ae000 0xc000 0x1f93888 0x2000 9 Yes
nfsclient 0x8ba000 0x11000 0x1f968a8 0x1000 10 Yes
vmfs3 0x8cb000 0x23000 0x1f99bc0 0x1000 11 Yes
vmfs2 0x8ee000 0x11000 0x1f9c460 0x11000 12 Yes
I'll get patching coordinated and we'll see if getting to 3.0.2 (or at least getting the right drivers installed) fixes everything. I'll re-post the results, but it'll be a week or so.
Thanks for the info.
This is going to sound crazy, but rather than using RDM, if it's available, mount the storage directly via the Windows iSCSI initiator inside the guest. It's tons faster than RDM. I couldn't tell whether you were using fibre or iSCSI. At VMworld, Bluelock showed that using the iSCSI initiator inside the VM for the data drives was significantly faster than an RDM of a LUN through the ESX initiator.
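If you want to try it, the rough flow with Microsoft's initiator from the guest's command line looks like this (the portal address and target IQN are placeholders; use your array's values):

iscsicli QAddTargetPortal 192.168.0.50
iscsicli ListTargets
iscsicli QLoginTarget iqn.1992-01.com.example:storage.target01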
Please do post your results, I may not be able to get my tests done by then and would like to know what happens.
This is 200% true. For now, guest-OS iSCSI initiator access is much faster than ESX's own mapping. Unfortunately...
Wow... what really sucks in that case is that we're using fibre, through-and-through. I don't have iSCSI in my environment at all (yet).
So far, upgrading a couple of the blades to 3.0.2 has not resolved the issue. I want to wait until they are all upgraded to 3.0.2 before I make any hard statements about it, though (we're in the middle of upgrading today... and there are 10 blades to do).
The other bothersome thing for me is that we have 4Gb HBAs (mezzanine cards) in the blades, connected to 4Gb switches, running the 4Gb drivers, and yet I'm running at 2Gb. The next thing I'm going to do is down a blade, force it to 4Gb on the switch, and bring the blade back up. Any thoughts on whether this will help are appreciated. We are connecting to both an EVA6000 running at 2Gb (2x2Gb per switch) and an EVA8000 running at 4Gb (4x4Gb per switch).
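For reference, on the switch side the plan looks something like this, using Brocade syntax purely as an illustration (the port number is hypothetical, and other vendors' commands differ):

portcfgspeed 5 4
switchshow

where the 4 locks the port at 4 Gb/s instead of auto-negotiating, and switchshow confirms the negotiated speed afterwards.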
Thanks for the additional info and I'll keep everyone posted on our results.
Just a quick update. We've upgraded to the latest and greatest on 9 of the 10 blades. At this point the problem is still occurring. I'm going to get the last blade upgraded, just to be sure, and then place a support call to VMware. LUN rescans are still getting hung up on the LUNs presented for MSCS because they have reservations. Given all the LUN problems with presenting RAW devices to a VM, I think we're going to look at iSCSI as soon as we can.
iSCSI with the initiator in the VM has worked flawlessly for me every time.
Just curious about the status of this... were you ever able to get it resolved?