VMware Cloud Community
Donghwan_Kang
Contributor

SCSI Reservation Conflicts in a multi-storage (iSCSI) environment, need help!

Hi there,

I have had a SCSI Reservation problem since new storage was added to my VI3 (ESX 3.5) environment.

BACKGROUND:

  • 10 ESX 3.5 Hosts(SunFire x4150/Intel Xeon/32GB/8 GigE Ports)

  • 3 Sun StorageTek 2510 iSCSI Storage

  • Software iSCSI connection

  • VMkernel ports for iSCSI are teamed across 2 NICs

  • VMware HA/DRS/VMotion

  • 250 Windows XP VMs (VDI)

  • ESX 3.5 U4 and VC 2.5 U4

  • Successfully operating for 6 months with no SCSI Reservation Error

CHANGES :

  • Added new iSCSI storage

  • Sun Unified Storage 7410 (iSCSI)

Each Windows XP VM has 2 virtual disks: one (C:) is on the STK2510 storage, the other (D:) is on the 7410 storage.

Each VMFS volume holds about 20 VMs and is 500~600GB in size.

Since the Sun 7410 storage was added, a large number of SCSI Reservation Conflict errors have occurred, degrading I/O performance or causing operations that need a SCSI reservation, such as VM power-on/off and VMotion, to fail.

/var/log/vmkwarning

Jul 15 15:29:35 esx01 vmkernel: 0:01:31:52.787 cpu2:1051)WARNING: SCSI: 5350: vml.0200000000600144f04a5de86e00000c29593bf000534f4c415249: Too many failed retries 33 (32), Returning I/O failure. 0x12 1/0x0 0x2 0x8 0x0

Jul 16 02:27:34 esx01 vmkernel: 0:06:26:50.747 cpu1:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts

Jul 16 02:28:39 esx01 vmkernel: 0:06:27:55.471 cpu1:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts

Jul 16 02:28:44 esx01 vmkernel: 0:06:28:00.615 cpu1:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts

Jul 16 02:28:49 esx01 vmkernel: 0:06:28:05.782 cpu0:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts

Jul 16 02:29:11 esx01 vmkernel: 0:06:28:27.387 cpu0:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts

Jul 16 02:29:16 esx01 vmkernel: 0:06:28:32.295 cpu0:1085)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts

Jul 16 02:29:24 esx01 vmkernel: 0:06:28:40.299 cpu2:1086)WARNING: J3: 2151: Aborting txn to slot 0 because can't verify locks

Jul 16 02:30:41 esx01 vmkernel: 0:06:29:57.349 cpu2:1039)WARNING: FS3: 4787: Reservation error: Timeout

Jul 16 02:30:41 esx01 vmkernel: 0:06:29:57.349 cpu2:1039)WARNING: FS3: 4981: Reclaiming timed out heartbeat [HB state abcdef02 offset 3258368 gen 76 stamp 23352554968 uuid 4a5db706-ae5a83ca-edb2-00144f8dd1ce jrnl <FB 12403> drv 4.31] failed: Timeout
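To see when the conflicts cluster, warnings like the ones above can be tallied per host and hour. The following is only a rough Python sketch: it assumes every line follows the syslog layout shown in this excerpt, and the function name `count_reservation_warnings` is made up for illustration. It would be run on a copy of the log, not necessarily on the ESX host itself.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# "Jul 16 02:27:34 esx01 vmkernel: <uptime> cpuN:WID)WARNING: MODULE: LINE: message"
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"(?P<host>\S+)\s+vmkernel:.*?WARNING:\s+(?P<module>\w+):\s+\d+:\s+(?P<msg>.*)$"
)


def count_reservation_warnings(lines):
    """Count warnings whose message mentions 'reservation', per (host, hour)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # blank line or a layout this sketch does not know
        if "reservation" not in match.group("msg").lower():
            continue  # other warnings (retries, journal aborts) are ignored
        bucket = "{} {:02d} {}h".format(
            match.group("month"), int(match.group("day")), match.group("hour")
        )
        counts[(match.group("host"), bucket)] += 1
    return counts
```

Passing it the lines of a copied `/var/log/vmkwarning` from each host and comparing the hourly buckets against scheduled tasks (backups, snapshots, power operations) may help narrow down what triggers the conflicts.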

When the SCSI Reservation Conflict errors occurred, I checked the "Pending Reservations" value of the volumes using esxcfg-info (see "Resolving SCSI reservation conflicts", KB1002293).

In that situation, one or more volumes on the Sun 7410 show "Pending Reservations = 1":

|----Console Device................................../dev/sdd

|----Devfs Path....................................../vmfs/devices/disks/vml.0200000000600144f04a5e892100000c29667ceb00534f4c415249

|----SCSI Level......................................6

|----Queue Depth.....................................32

|----Is Pseudo.......................................false

|----Is Reserved.....................................false

|----Pending Reservations............................1
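Checking this by eye across many LUNs and hosts gets tedious, so the check could be scripted. The following is a minimal Python sketch, assuming the `Key......value` layout shown above and assuming that each device record begins with a "Console Device" line as in this excerpt; the function names are hypothetical.

```python
import re


def parse_field(line):
    """Split a 'Key......value' line into (key, value); None if it does not fit."""
    stripped = line.strip().lstrip("|-\\ ")
    parts = re.split(r"\.{2,}", stripped, maxsplit=1)
    if len(parts) != 2 or not parts[0]:
        return None
    return parts[0].strip(), parts[1].strip()


def devices_with_pending_reservations(lines):
    """Return the device records whose 'Pending Reservations' is above zero."""
    devices = []
    current = None
    for line in lines:
        field = parse_field(line)
        if field is None:
            continue
        key, value = field
        if key == "Console Device":
            # Assumption: this field opens a new device record.
            current = {}
            devices.append(current)
        if current is not None:
            current[key] = value
    flagged = []
    for device in devices:
        pending = device.get("Pending Reservations", "0")
        if pending.isdigit() and int(pending) > 0:
            flagged.append(device)
    return flagged
```

Fed the saved esxcfg-info output from each host, the returned records carry the "Devfs Path" of every LUN that still has a reservation pending, which makes it easier to confirm whether only the 7410 volumes are affected.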

In my opinion, there is a problem with releasing reservations, but I'm not sure what the root cause is. Additionally, both the Sun 2510 and Sun 7410 storage are listed in the VMware HCL.

Any idea?

Thanks in advance.

DH,

4 Replies
azn2kew
Champion

Have you verified that the firmware of your SAN, fabric switches, and HBAs is at the latest level? Can you check for metadata activity, in particular custom scripts or "tasks" that run and touch your SAN constantly and would create a lot of I/O? It seems that neither your virtual machines nor your processes are hitting storage I/O too hard. The recommendation is to run 10-16 VMs per LUN, but in your case 20 isn't enough to be worried about. How many XP VMs do you have on each 7410 LUN?

Have you tried testing the I/O load using IOMeter or other tools, and what do you see with the esxtop command?

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!

Regards,

Stefan Nguyen

VMware vExpert 2009

iGeek Systems Inc.

VMware, Citrix, Microsoft Consultant

Donghwan_Kang
Contributor

Thanks, Stefan.

There are 20 VMs on each Sun 7410 storage LUN, and storage I/O is not high (about 10MB/s). However, I'm not sure how much I/O activity is caused by metadata changes.

DH,

Texiwill
Leadership

Hello,

SCSI reservations are generally caused by the actions you take, not by something within the hardware or software. For example, cloning a VM triggers a SCSI reservation, in fact several of them.

Check out the excerpt from my book, VMware ESX Server in the Enterprise, for a complete review of what causes SCSI reservations.


Best regards,

Edward L. Haletky, VMware Communities User Moderator, VMware vExpert 2009, Virtualization Practice Analyst
Now Available: 'VMware vSphere(TM) and Virtual Infrastructure Security: Securing the Virtual Environment'
Also available: 'VMware ESX Server in the Enterprise'
SearchVMware Pro (http://www.astroarch.com/wiki/index.php/Blog_Roll) | Blue Gears | Top Virtualization Security Links | Virtualization Security Round Table Podcast

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
rubensluque
Enthusiast

Hi,

You might want to decrease the Scsi.ConflictRetries parameter under Configuration -> Advanced Settings -> SCSI.

The default value is 80. Change it to 10. It will not solve your main problem with reservations, but it will alleviate the timeout issues during disk operations.

rgds,
