johnjackson
Contributor

Virtual machine dies with SCSI reservation conflict

Do SCSI reservations applied from a Linux virtual machine work on virtual disks? Are there tricks/tips on how they have to be configured?

Here's the background after doing research on how we think this is supposed to work ...

We are running two ESX 3.5 servers (plus some patches) with an FC-connected Sun 3510 storage array between them. A 1.4 TByte RAID-5 LUN is presented to both servers as a VMFS volume. There are two RH4_U6 32-bit virtual machines, one on each ESX server, each with its own private OS virtual disk (scsi0:0/sda) and two other virtual data disks (scsi1:0/sdb and scsi1:1/sdc). The data disks were created thick and are presented to both VMs as independent-persistent. The scsi1 controller is defined as lsilogic and sharedBus is set to physical (since the VMs are on separate servers).

After the VMs kickstart, I can do the normal setup things (fdisk, vgcreate, mke2fs, etc.). Then the application load sequence starts, which includes installing and configuring Sybase ASE 12.5 pointing to normal disk files stored in an ext3 file system in an LVM on one of the two virtual data disks (one VM uses one virtual disk, the other VM uses the other). As Sybase is initializing itself (running dataserver, I think), the kernel throws errors and the process hangs:

Journal commit I/O error
scsi1 (0,0,0): reservation conflict
SCSI error <1 0 0 0> return code 0x18
Buffer I/O error, lost page write

VM1 complains about the disk "assigned" to it (sdb, as shown above) and VM2 complains about the disk assigned to it (sdc, not shown, but the same messages with a different SCSI target). They never complain about the "other" disk (the one the other VM is working on and that they should not be paying any attention to).

Both drives need to be presented to both VMs for failover, although I don't know anything about how that failover is done or about Sybase in general. The virtual machines need to be on separate servers, also for failover. I have almost no control over the application and how it's being loaded, although if there were some Sybase config file that needed tweaking, it might be possible to sneak that into the sequence. Changes to the OS, ESX, or .vmx files are completely under my control and can easily be handled.
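In case it helps, the relevant scsi1 section of each VM's .vmx looks roughly like this (the datastore path and disk file names below are placeholders, not our real ones):

    # shared data controller, bus sharing across hosts
    scsi1.present = "true"
    scsi1.virtualDev = "lsilogic"
    scsi1.sharedBus = "physical"
    # first shared data disk (sdb in the guest)
    scsi1:0.present = "true"
    scsi1:0.deviceType = "scsi-hardDisk"
    scsi1:0.fileName = "/vmfs/volumes/shared-vmfs/data/data1.vmdk"
    scsi1:0.mode = "independent-persistent"
    # second shared data disk (sdc in the guest)
    scsi1:1.present = "true"
    scsi1:1.deviceType = "scsi-hardDisk"
    scsi1:1.fileName = "/vmfs/volumes/shared-vmfs/data/data2.vmdk"
    scsi1:1.mode = "independent-persistent"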

I've run a number of tests with the following observations:

  • Failures happen very quickly if both VMs are roughly in sync, i.e. the Sybase initialization starts at close to the same time (within a couple of seconds).

  • I've seen failures when only one VM was running and the other was shut down, although the Sybase initialization usually runs quite a while in this case before failing.

  • Sometimes (not often) one of the two servers will complete successfully after the other has failed.

  • Putting both VMs on a single server works (although I only tested it once). I don't recall whether I changed sharedBus to virtual for this case (probably not).

  • Putting both VMs on a single server and using local storage instead of the SAN worked (although again only tested once).

  • Putting the virtual disks on a SAN RAID-10 volume instead of the RAID-5 worked.

  • Recreating the entire environment on a completely different set of hardware (but exactly the same configuration) still fails, so I don't think this is a hardware problem.

  • Using raw LUNs presented as RDMs in physical compatibility mode (rdmp) works. We used to run this way, but the requirements for these servers keep changing and we really, really want the flexibility of virtual disks.

  • I saw a reference to setting disk.locking = "false" in the .vmx in one web search, but it didn't help.

I'm not convinced there is really a difference between the RAID-10 and RAID-5 LUNs -- I think I may have just gotten lucky with the very few tests run on the RAID-10. I'm less certain what to make of the local-storage result.

I looked through the various VMware log files but didn't see anything particularly worrisome.

We have to make a decision on our storage layout very shortly (in the next couple of days). I've submitted a support request to VMware, but am once again hoping for help from the forum. :)

Do you think this setup should work (shared virtual disks supporting Sybase)?

Is there something else I need to do to allow SCSI reservations to work on a virtual disk?

Is this a Sybase-specific thing, and is there a workaround/fix?

jhanekom
Virtuoso

I don't know about your particular situation, but on a Windows cluster your shared data disks must be raw LUNs (RDMs, to be exact) for sharing across hosts to work (using VMDKs is permissible only for cluster-in-a-box scenarios). Unless I'm mistaken, it has something to do with the fact that the extra level of indirection makes it problematic for VMware to keep track of who's writing what, when. (I previously believed this was mostly because there would then be little control over how snapshots etc. are dealt with, but maybe there's something more to it?)

This could also explain why it worked when you put both VMs on the same box.

Could you try your tests with RDMs and see if that helps at all?

johnjackson
Contributor

Thank you for the reply.

As I said (buried in the original observations), we know raw LUNs work. That's how our storage used to be laid out. We just really hoped we could use virtual disks, because our customers can't make up their minds on how many data areas they need or what size they should be, and virtual disks would allow a lot more flexibility (i.e. less hassle for me, which is, of course, the most important goal :) ).

I assume by "cluster in a box" you mean all the VMs running on a single server? If so, I understand how that would make this work (and it matches my test results), but we can't do that because the load has to be spread across physical servers for redundancy.

Texiwill
Leadership

Hello,

When setting up a shared-disk cluster, whether MSCS or RedHat/Linux, you should follow the instructions in http://www.vmware.com/pdf/vi3_301_201_mscs.pdf as the basic setup of the VMs is the same. You can either use an RDM as the shared disk location (recommended) or a VMDK formatted as thick (which can only be done from the CLI at the moment). Then the SCSI reservation requests should work properly.
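For example, creating the shared disk as an eagerly zeroed thick VMDK from the service console would look something like this (size and path are placeholders only):

    # create a 10 GB eagerzeroedthick virtual disk on the shared VMFS volume
    vmkfstools -c 10g -d eagerzeroedthick /vmfs/volumes/shared-vmfs/cluster/shared-data.vmdk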


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education. As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

kjb007
Immortal

The problem you will run into here is that sharing virtual disks (VMDKs) across physical servers is also an unsupported configuration. In order to run this type of config with support, you have to run the boot disks on local storage and the shared disks on shared SAN storage as RDMs. That is the supported config for clustering across hosts. That being said, the shared-VMDK config has worked for other people using MSCS, but again, it is not a config supported by VMware.

As you said, sharing disks across hosts requires physical, not virtual, bus-sharing mode for your disks.

I have configured this before in a test setup, and another thing you want to make sure of is that the disks are presented to the servers in the same order. What is sdb on one may be sdc on the other, and then you run into those issues as well. Other than that, the setup is fairly straightforward, although I have not tried this with Sybase; I have played with it with Oracle.
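A quick sanity check is to compare the scsi1 entries of the two .vmx files from the service console and then confirm the ordering inside each guest (paths below are placeholders):

    # on the hosts: the same vmdk should sit at the same scsi1 target in both files
    grep "^scsi1" /vmfs/volumes/shared-vmfs/vm1/vm1.vmx
    grep "^scsi1" /vmfs/volumes/shared-vmfs/vm2/vm2.vmx

    # inside each guest: see which target sdb and sdc actually attached to
    dmesg | grep -i "attached scsi disk"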

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
jhanekom
Virtuoso

Edward, just to expand on your statement: when clustering across separate ESX boxes, the only supported configuration at the moment is to use RDMs in physical compatibility mode. VMDKs are only supported for "cluster-in-a-box" (single physical server) solutions.

Texiwill
Leadership

Hello,

Thank you, jhanekom. I thought the poster mentioned CiB. Even more clarification: shared-disk clusters across multiple ESX servers require RDMs; furthermore, they require the boot volume for each cluster node to be on LOCAL storage, not shared storage. Cluster-in-a-box is not as restrictive, but you can only cluster 2 nodes with CiB, while up to 8 are possible using the other method.

It is very important to follow the MSCS guide very carefully when setting up any shared-disk cluster, or else things do not work.


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education. As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

jamieabbott
Enthusiast

Hi John,

I don't know if you managed to resolve this yet, but I came across the post while checking for compatibility between a 3510 and ESX 3.5.

A Sun 3510 array is fine on the 3.0.x HCL but is excluded from the 3.5 HCL, and I was hoping that is because it is going EOL soon rather than because of a fundamental change in how it works. It sounds like the VMFS side of things is OK, but this may prove a gotcha in terms of getting VMware support, as they could argue the hardware is technically unsupported.

Good luck!

Jamie

wbp
Contributor

Hi

I am experiencing this issue with an Oracle RAC TEST environment.

The setup is a 2-node CiB running RHEL ES 4 Update 6 with OCFS2 on the shared disks. Both the boot and shared disks are on an EMC CX500 SAN. I have tried both physical and virtual sharing configurations for the shared drives. Why is a boot drive on local storage a requirement?

I am looking at the MSCS config doc that was recommended - is there a corresponding Linux doc?

Did you use OCFS2 for your Oracle RAC environment? If so, are there any special settings for OCFS2 in a VM environment?

Thanks, wbp.

Texiwill
Leadership

Hello,

The virtual hardware setup from the MSCS doc applies to any shared-disk clustering within ESX. The boot volume needs to be on local disk, as there are otherwise locking issues when you are clustering across ESX servers. CiB also requires this, and the shared disk, if it's NOT an RDM, needs to be a zeroed thick disk. However, it works better as an RDM.
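If the shared VMDK already exists in the default format, one way to get it fully zeroed is to clone it from the service console, something like this (paths are placeholders, and this assumes "zeroed thick" here means the eagerzeroedthick format from the MSCS guide):

    # clone the existing shared disk to an eagerly zeroed thick copy
    vmkfstools -i /vmfs/volumes/shared-vmfs/cluster/shared.vmdk /vmfs/volumes/shared-vmfs/cluster/shared-ezt.vmdk -d eagerzeroedthick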


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

CIO Virtualization Blog: http://www.cio.com/blog/index/topic/168354

As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

alrad2002
Contributor

Hi Folks!

Has anyone who has experienced this issue in the past found a resolution? My setup is not too different from wbp's:

* 2 Linux RHEL 5.2 guests running Oracle 10 RAC and OCFS2 filesystem on the shared (VMDK) storage

* Guests set up and run on the same ESX 3.5 update 1 host server (which has boot volume on local storage)

* VMDK built following the MSCS "cluster-in-a-box" configuration documentation

I get the same SCSI reservation conflict noted above, which causes Oracle to reboot the guest OS. One outstanding question that comes to mind is the SCSI-2 vs. SCSI-3 reservation issue... I can't seem to find any documentation that specifies which reservation protocol Oracle/OCFS2 uses.
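One diagnostic I've been thinking about (the device name is a placeholder, and the commands may simply be rejected on a virtual disk) is to check from inside a guest whether anything is actually registering SCSI-3 persistent reservations on the shared disk, using sg_persist from sg3_utils:

    # list any SCSI-3 persistent reservation keys registered on the shared disk
    sg_persist --in --read-keys /dev/sdb
    # show the active persistent reservation, if there is one
    sg_persist --in --read-reservation /dev/sdb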

Any ideas?

Cheers!

Allen

Texiwill
Leadership

Hello,

ESX uses SCSI-2 LUN Locking for VMFS only. Any locking within the VMDK is ignored. If you are using RDMs, then any lock can be used. SCSI-2 reservations happen whenever the metadata of the VMFS changes. These also depend on the SAN/iSCSI device in use.


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

CIO Virtualization Blog: http://www.cio.com/blog/index/topic/168354

As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization
