VMware Cloud Community
enieuw
Contributor

3ware 9650SE bad disk performance

I'm using the drivers from the 3ware KB over here: http://www.3ware.com/kb/article.aspx?id=15416 and I can't seem to get decent transfer rates either. The Linux guests give a message like

"mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated"

in their dmesg logs whenever the ESXi message log gives

"Dec 21 23:33:10 vmkernel: 20:07:31:13.409 cpu0:5759053)<4>3w-9xxx: scsi1: WARNING: (0x06:0x002c): Unit #0: Command (0x2a) timed out, resetting card.

Dec 21 23:33:10 vmkernel: 20:07:31:26.316 cpu0:5759053)<4>3w-9xxx: scsi1: AEN: INFO (0x04:0x005e): Cache synchronization completed:unit=0."

Any bright ideas?

dtalk
Contributor

I have this card and have never seen this problem. I would suggest first making sure the 9650SE's firmware is up to date before you spend any more time troubleshooting. Check the obvious things, too: bad disk, bad cable, poor connection, etc.

tijuhasz
Contributor

I've seen these errors in ESXi 4 too. I use 3ware's VMware-certified driver, and the problem occurs on both of my hosts, which have completely different hardware. The firmware on the 3ware 9690SA cards is the latest, and 3ware support has no idea what's happening...

Aug 25 13:17:38 vmkernel: 0:06:13:50.767 cpu3:4172)<4>3w-9xxx:8:0:1:0 :: WARNING: (0x06:0x002c): Command (0x2a) timed out, resetting card.

Aug 25 13:17:58 vmkernel: 0:06:14:11.475 cpu1:4172)<4>3w-9xxx: scsi8: AEN: INFO (0x04:0x005e): Cache synchronization completed:unit=1.

Aug 25 13:19:14 vmkernel: 0:06:15:26.799 cpu1:84014)<4>3w-9xxx:8:0:1:0 :: WARNING: (0x06:0x002c): Command (0x2a) timed out, resetting card.

Aug 25 13:19:35 vmkernel: 0:06:15:47.515 cpu3:84014)<4>3w-9xxx: scsi8: AEN: INFO (0x04:0x005e): Cache synchronization completed:unit=1.

Aug 25 15:17:34 vmkernel: 0:08:13:47.049 cpu0:84381)<4>3w-9xxx:8:0:1:0 :: WARNING: (0x06:0x002c): Command (0x2a) timed out, resetting card.

Aug 25 15:17:55 vmkernel: 0:08:14:08.203 cpu2:84381)<4>3w-9xxx: scsi8: AEN: INFO (0x04:0x005e): Cache synchronization completed:unit=1.

Aug 25 15:17:56 vmkernel: 0:08:14:08.899 cpu1:107109)<4>3w-9xxx:8:0:1:0 :: WARNING: (0x06:0x002c): Command (0x2a) timed out, resetting card.

Aug 25 15:18:16 vmkernel: 0:08:14:29.107 cpu3:107109)<4>3w-9xxx: scsi8: AEN: INFO (0x04:0x005e): Cache synchronization completed:unit=1.
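
To put numbers on log excerpts like the one above, a small script can pair each "timed out, resetting card" warning with the following "Cache synchronization completed" message and report how long the controller was unavailable. This is only a rough sketch: it assumes the vmkernel uptime stamp is days:hours:minutes:seconds, and the function names are made up for illustration. For reference, SCSI opcode 0x2a is WRITE(10), so the commands timing out here are writes.

```python
import re

# Assumption: the stamp after "vmkernel:" is uptime as days:hours:minutes:seconds.
UPTIME = re.compile(r"vmkernel: (\d+):(\d+):(\d+):(\d+\.\d+) ")
RESET = "timed out, resetting card"
SYNC = "Cache synchronization completed"


def uptime_seconds(line):
    """Return the vmkernel uptime of a log line in seconds, or None if absent."""
    m = UPTIME.search(line)
    if m is None:
        return None
    days, hours, minutes, seconds = m.groups()
    return int(days) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def reset_durations(lines):
    """Pair each reset warning with the next cache-sync message; return gaps in seconds."""
    durations = []
    pending = None  # uptime of the most recent unmatched reset warning
    for line in lines:
        t = uptime_seconds(line)
        if t is None:
            continue
        if RESET in line:
            pending = t
        elif SYNC in line and pending is not None:
            durations.append(t - pending)
            pending = None
    return durations


# First reset/sync pair copied from the log excerpt in this post.
SAMPLE = [
    "Aug 25 13:17:38 vmkernel: 0:06:13:50.767 cpu3:4172)<4>3w-9xxx:8:0:1:0 :: "
    "WARNING: (0x06:0x002c): Command (0x2a) timed out, resetting card.",
    "Aug 25 13:17:58 vmkernel: 0:06:14:11.475 cpu1:4172)<4>3w-9xxx: scsi8: "
    "AEN: INFO (0x04:0x005e): Cache synchronization completed:unit=1.",
]

if __name__ == "__main__":
    for gap in reset_durations(SAMPLE):
        print(f"controller unavailable for about {gap:.1f} s")
```

Running it over a full vmkernel log instead of `SAMPLE` would show whether every reset stalls I/O for roughly the same time, which may be useful detail for a support ticket.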

vsu
Contributor

Same problem here with ESXi4 and 3ware 9690SA. The problem is reliably triggered by running IOMeter 2006.07.27 in a Windows Server 2003 VM with this workload on a 32 GB vmdk:

  • Block size = 8 KB, alignment = 8 KB

  • 50% read/write distribution

  • 100% random access

  • Outstanding I/Os (queue depth) = 256

Timeouts and resets occur at least every 5-10 minutes while the test is running.

Disks are 4 x Seagate ST31000340NS (SN06 firmware - according to 3ware, upgrade to BN06 in a 4-disk configuration is not required); RAID5 (tried both 256K and 64K stripe - no changes).

Installed Ubuntu 9.04 on the physical machine for testing - unfortunately, IOMeter does not really work on Linux, but with fio the adapter does not get reset even with some insane parameters (like --iodepth=40960).
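
For anyone wanting to repeat the bare-metal comparison, the IOMeter workload listed above can be approximated with fio roughly as sketched below. The exact fio options used in the original test were not posted (only `--iodepth=40960` is mentioned), so the job name, the `libaio` engine, direct I/O, the run time, and the target path are all assumptions.

```python
import shlex


def fio_command(target, size="32g", runtime_s=600):
    """Build an fio argv approximating the IOMeter workload from this post.

    target and runtime_s are illustrative placeholders, not values from the thread.
    """
    return [
        "fio",
        "--name=iometer-like",    # hypothetical job name
        f"--filename={target}",   # file or block device to test
        f"--size={size}",         # 32 GB test area, like the vmdk
        "--ioengine=libaio",      # assumption: Linux native async I/O
        "--direct=1",             # assumption: bypass the page cache
        "--rw=randrw",            # 100% random access, mixed reads and writes
        "--rwmixread=50",         # 50% read / 50% write
        "--bs=8k",                # 8 KB blocks (fio aligns random offsets to the block size by default)
        "--iodepth=256",          # 256 outstanding I/Os
        "--time_based",
        f"--runtime={runtime_s}",
    ]


if __name__ == "__main__":
    # Print a shell-ready command line; the path is a placeholder.
    print(shlex.join(fio_command("/path/to/testfile")))
```

Pointing `--filename` at a raw device will overwrite data on it, so a scratch file or a spare volume is the safer target.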

LucasAlbers
Expert

What are your cache settings set to?

What are your background disk check/rebuild settings set to?

tijuhasz
Contributor

cache settings:

We use a BBU with the card, so the write cache was ON, but there were many problems (card resets), so we had to turn it OFF. Since then there are fewer card resets, but we still have them.

background disk check/rebuild:

Auto verify was also turned ON, but because of the many card resets we decided to turn it OFF.

LucasAlbers
Expert

Linux on the physical hardware works, so I am guessing it might be the ESXi driver.

Does the error still occur with the ESX version?

tijuhasz
Contributor

I haven't tried it under ESX, but it should work under both!

LucasAlbers
Expert

From your complaints it does not appear to work under esxi even though 3ware supports it.

Our development guys are considering getting this controller for their ESXi system.

I am advising them to get a 9690 until I learn how this issue gets resolved.

salmonj
Contributor

Lucas,

We're actually experiencing the same behavior on the 9690SA as well, so it might be a driver issue. Has anyone tried opening a support ticket with 3ware for this?

tijuhasz
Contributor

Hello,

I contacted support, but they can't reproduce these errors. I sent them all the information about our systems, and they said they would try to build the same configuration to test it. I've been waiting for about a month and a half!

salmonj
Contributor

Just out of curiosity, what kind of RAID setup are you experiencing this on, and what kind of hard drives are in it?

In our case we have ESXi 3.5 Update 3 running (booting) off a RAID-10 volume of four 1 TB Seagate ST31000640SS drives, and we can almost reliably PSOD an entire host by introducing heavy disk I/O (in terms of IOPS, not raw linear reads). The PSOD is preceded by command timeouts on the volume and VSCSIFS busy messages.
