Hi there,
I have had a SCSI reservation problem since new storage was added to my VI3 (ESX 3.5) environment.
BACKGROUND:
10 ESX 3.5 Hosts(SunFire x4150/Intel Xeon/32GB/8 GigE Ports)
3 Sun StorageTek 2510 iSCSI Storage
Software iSCSI connection
VMkernel ports for iSCSI are teamed across 2 NICs
VMware HA/DRS/VMotion
250 Windows XP VMs (VDI)
ESX 3.5 U4 and VC 2.5 U4
Successfully operating for 6 months with no SCSI Reservation Error
CHANGES :
Added new iSCSI storage
Sun Unified Storage 7410 (iSCSI)
Each Windows XP VM has 2 virtual disks: one (C:) is on the STK2510 storage, the other (D:) is on the 7410 storage.
Each VMFS volume holds about 20 VMs and is 500~600GB in size.
Since the Sun 7410 storage was added, many SCSI reservation conflict errors have occurred. They degrade I/O performance
or cause operations that need a SCSI reservation (VM power-on/off, VMotion) to fail.
/var/log/vmkwarning
Jul 15 15:29:35 esx01 vmkernel: 0:01:31:52.787 cpu2:1051)WARNING: SCSI: 5350: vml.0200000000600144f04a5de86e00000c29593bf000534f4c415249: Too many failed retries 33 (32), Returning I/O failure. 0x12 1/0x0 0x2 0x8 0x0
Jul 16 02:27:34 esx01 vmkernel: 0:06:26:50.747 cpu1:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts
Jul 16 02:28:39 esx01 vmkernel: 0:06:27:55.471 cpu1:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts
Jul 16 02:28:44 esx01 vmkernel: 0:06:28:00.615 cpu1:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts
Jul 16 02:28:49 esx01 vmkernel: 0:06:28:05.782 cpu0:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts
Jul 16 02:29:11 esx01 vmkernel: 0:06:28:27.387 cpu0:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts
Jul 16 02:29:16 esx01 vmkernel: 0:06:28:32.295 cpu0:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts
Jul 16 02:29:24 esx01 vmkernel: 0:06:28:40.299 cpu2:1086)WARNING: J3: 2151: Aborting txn to slot 0 because can't verify locks
Jul 16 02:30:41 esx01 vmkernel: 0:06:29:57.349 cpu2:1039)WARNING: FS3: 4787: Reservation error: Timeout
Jul 16 02:30:41 esx01 vmkernel: 0:06:29:57.349 cpu2:1039)WARNING: FS3: 4981: Reclaiming timed out heartbeat [HB state abcdef02 offset 3258368 gen 76 stamp 23352554968 uuid 4a5db706-ae5a83ca-edb2-00144f8dd1ce jrnl <FB 12403> drv 4.31] failed: Timeout
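To see whether the conflicts cluster around particular times (for example a scheduled task), the warnings above can be tallied per minute. This is only a minimal sketch: the regular expression is written against the line layout shown in this excerpt, the function name summarize_reservation_warnings is made up for illustration, and other ESX builds may format these lines differently.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# "Jul 16 02:27:34 esx01 vmkernel: <uptime> cpuN:NNNN)WARNING: SCSI: 119: <message>"
LINE_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+(?P<host>\S+)\s+vmkernel:.*?"
    r"WARNING:\s+(?P<module>\w+):\s+\d+:\s+(?P<msg>.*)$"
)

def summarize_reservation_warnings(lines):
    """Count warnings that mention 'reservation', per minute and per module."""
    per_minute = Counter()
    per_module = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a vmkernel WARNING line in the assumed layout
        if "reservation" not in match.group("msg").lower():
            continue  # some other warning (e.g. the J3 lock message)
        per_minute[match.group("minute")] += 1
        per_module[match.group("module")] += 1
    return per_minute, per_module
```

Feeding it the lines of /var/log/vmkwarning (for example via open(path).readlines()) gives two counters; a sharp peak in the per-minute counter would point at a specific operation rather than steady background contention.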
When the SCSI reservation conflict errors occurred, I checked the "Pending Reservations" value of each volume using esxcfg-info (see "Resolving SCSI reservation conflicts", KB1002293).
At those times, one or more volumes on the Sun 7410 showed "Pending Reservations=1":
|----Console Device................................../dev/sdd
|----Devfs Path....................................../vmfs/devices/disks/vml.0200000000600144f04a5e892100000c29667ceb00534f4c415249
|----SCSI Level......................................6
|----Queue Depth.....................................32
|----Is Pseudo.......................................false
|----Is Reserved.....................................false
|----Pending Reservations............................1
In my opinion, there is a problem with releasing reservations, but I'm not sure what the root cause is. Additionally, both the Sun 2510 and Sun 7410 storage are listed in the VMware HCL.
Any ideas?
Thanks in advance.
DH,
Have you verified that the firmware of your SAN, fabric switches, and HBAs is all at the latest level? Can you also check your metadata activity for anything in particular, such as custom scripts or "tasks" that run and touch your SAN constantly and would create a lot of I/O activity? It does not look like your virtual machines or processes are hitting storage I/O too hard. The recommendation is to run 10-16 VMs per LUN, but in your case 20 is not much to worry about. How many XP VMs do you have on each 7410 LUN?
Have you tried testing the I/O load with IOMeter or other tools, and what do you see with the esxtop command?
Regards,
Stefan Nguyen
VMware vExpert 2009
iGeek Systems Inc.
VMware, Citrix, Microsoft Consultant
Thanks, Stefan.
There are 20 VMs on each Sun 7410 LUN, and storage I/O is not high (about 10MB/s). However, I'm not sure how much I/O activity comes from metadata changes.
DH,
Hello,
SCSI reservations are generally caused by the actions you take, not by something within the hardware or software. For example, you get a SCSI reservation (actually several of them) when you clone a VM.
Check out the excerpt from my book, VMware ESX Server in the Enterprise, for a complete review of what causes SCSI reservations.
Best regards,
Edward L. Haletky, VMware Communities User Moderator, VMware vExpert 2009, Virtualization Practice Analyst
Now Available: 'VMware vSphere(TM) and Virtual Infrastructure Security: Securing the Virtual Environment'
Also available: 'VMware ESX Server in the Enterprise'
SearchVMware Pro (http://www.astroarch.com/wiki/index.php/Blog_Roll) | Blue Gears | Top Virtualization Security Links | Virtualization Security Round Table Podcast
Hi,
You might want to decrease the Scsi.ConflictRetries parameter in Configuration->Advanced Settings->SCSI.
The default value is 80. Change it to 10. It will not solve your main problem with reservations, but it will alleviate the timeout issues during disk operations.
rgds,