I do not have a Fibre Channel SAN to look at, but for those of you who do: what is an OK number of SCSI reservation conflict messages to have in the logs? If there were a threshold at which this becomes an issue, what would that number be?
According to Texiwill and some posts I have read, this is a very finicky protocol. You get a SCSI reservation any time a file is opened or closed. This includes ISO images on VMFS and various other files (such as logs, which is why I turn logging off for each VM).
We still get them no matter how much qladepth I give, what I tweak, or what kind of testing I do on the SAN; nothing seems to change the fact that SCSI reservations persist. The interesting thing is that the 'threshold' you seek is reached with just 1 VM. I set up an ESX server and deliberately tried to reproduce the issue, hoping to figure out what could cause it. I had all kinds of monitors on the fibre switches and the cards, I saved logs, I did everything. I called VMware and QLogic, and no one had a clue. So I gave up.
Another odd thing: QLogic spits out WAY more messages than Emulex. So this MUST be a driver issue, not ESX / SAN connectivity. Performance doesn't suffer unless the queue depth is too low, but when we got an IBM 3950 with Emulex (keep in mind the SAN, VMs, fibre and switches did not change...) the SCSI reservations were almost completely gone.
Interesting. By comparison, the Dell 2950s (not picking on Dell, but they are the majority of the machines we have) see roughly 20 or 30 SCSI reservations an hour. That is the same model I used for the 1 VM test, and even with 1 VM it still had SCSI reservations during power on, power off, and access (something that perplexed QLogic, because they claim that should not happen). We also have Dell R900s with Emulex, and there the SCSI reservations were cut in half but were still numerous. Only the IBM machines show virtually no SCSI reservations, even using the default qladepth of 16, and they have almost 4 times the VMs.
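If you want to put a number on "reservations an hour" for your own hosts, something like the rough Python sketch below could tally them from a saved log. Everything specific in it is an assumption on my part, not something confirmed in this thread: the syslog-style timestamp layout, the search text "reservation conflict", and the function name conflicts_per_hour are all just for illustration, so adjust them to whatever your log actually contains.

```python
import re
import sys
from collections import Counter

# ASSUMPTION: lines start with a syslog-style stamp such as
# "Mar  5 14:01:02 host ...". Change the pattern if your log differs.
STAMP = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}):\d{2}:\d{2}\s")


def conflicts_per_hour(lines, needle="reservation conflict"):
    """Count lines containing `needle` (case-insensitive), grouped by hour.

    `needle` is an assumed message fragment, not a confirmed log string.
    """
    counts = Counter()
    for line in lines:
        if needle in line.lower():
            match = STAMP.match(line)
            if match:
                # Collapse padding so "Mar  5 14" becomes "Mar 5 14".
                hour = " ".join(match.group(1).split())
                counts[hour] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_conflicts.py <path-to-saved-log>
    with open(sys.argv[1], errors="replace") as handle:
        for hour, total in sorted(conflicts_per_hour(handle).items()):
            print(hour, total)
```

Running it against a copy of the log from each host would give per-hour counts you can compare side by side, for example a QLogic host against an Emulex host on the same SAN.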
So if any machine were going to show SCSI reservations, you would think the ESX servers with the most VMs would show the most, but they don't. I conclude there is a bottleneck somewhere on the backplane, which would explain why the IBMs have better disk IO than their Dell counterparts on the SAME SAN / fibre storage.
So I think this is more a combination of things: driver, ESX version and hardware baseline. The best combination thus far, in my opinion, is IBM (not knocking Dell, I am a huge Dell fan), but the numbers speak for themselves.
In the end, the R900s have no performance degradation on IO; again, the difference is QLogic vs Emulex. The bottom line, I think, is that Emulex is a better hardware / driver combo than QLogic.