Easycom20111014
Contributor

Areca 1212 Failure

After a few months of running ESXi 4.1 on a Supermicro server with an Areca 1212 RAID controller, the following errors show up after a PSOD:

Disks are in RAID 1.

Any help appreciated.

WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver ARCMSR ARECA SATA/SAS RAID ControllerDriver Version 1.20.00.15.vmk.100202, for vmhba2
66:16:44:51.630 cpu0:10101331)WARNING: arcmsr5: abort device command(0xe01fc00) of scsi id = 0 lun = 0
66:16:44:51.630 cpu0:10101331)WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver ARCMSR ARECA SATA/SAS RAID ControllerDriver Version 1.20.00.15.vmk.100202, for vmhba2
66:16:44:51.843 cpu0:10101331)WARNING: arcmsr5: abort device command(0xe030400) of scsi id = 0 lun = 0
66:16:44:51.843 cpu0:10101331)

and

WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver ARCMSR ARECA SATA/SAS RAID ControllerDriver Version 1.20.00.15.vmk.100202, for vmhba2
66:16:45:19.941 cpu1:4161)WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver ARCMSR ARECA SATA/SAS RAID ControllerDriver Version 1.20.00.15.vmk.100202, for vmhba2
66:16:45:19.941 cpu1:4161)WARNING: arcmsr5: abort device command(0xe02e800) of scsi id = 0 lun = 1
66:16:45:21.018 cpu0:4098)<5>arcmsr5: pCCB ='0x0x4100b50e9400' isr got aborted command
66:16:45:21.018 cpu0:4098)<5>arcmsr5: isr get an illegal ccb command     done acb = '0x0x41000b813588'ccb = '0x0x4100b50e9400' ccbacb = '0x0x41000b813588' startdone = 0x0 ccboutstandingcount = 41
66:16:45:21.018 cpu0:4098)ALERT: LinScsi: SCSILinuxCmdDone: Attempted double completion
66:16:45:21.019 cpu0:4098)Backtrace for current CPU #0, worldID=4098, ebp=0x417f800179a8
66:16:45:21.019 cpu0:4098)0x417f800179a8:[0x4180278577b5]PanicLogBacktrace@vmkernel:nover+0x18 stack: 0x417f800179d8, 0x417f8
66:16:45:21.019 cpu0:4098)0x417f80017ae8:[0x4180278579f4]PanicvPanicInt@vmkernel:nover+0x1ab stack: 0x417f80017bd8, 0x4180278
66:16:45:21.020 cpu0:4098)0x417f80017af8:[0x418027857fdd]Panic_vPanic@vmkernel:nover+0x18 stack: 0x3000000008, 0x417f80017be8
66:16:45:21.020 cpu0:4098)0x417f80017bd8:[0x41802788a572]vmk_Panic@vmkernel:nover+0xa1 stack: 0x24c0, 0xd0b50e9400, 0x417f800
66:16:45:21.021 cpu0:4098)0x417f80017c48:[0x418027c6a35e]SCSILinuxCmdDone@esx:nover+0x2c1 stack: 0x202, 0x418027d78e38, 0x410
66:16:45:21.021 cpu0:4098)0x417f80017c88:[0x418027d78df6]arcmsr_interrupt@esx:nover+0x241 stack: 0xd000000023, 0x418027d78e38
66:16:45:21.021 cpu0:4098)0x417f80017cc8:[0x418027c7bd38]Linux_IRQHandler@esx:nover+0x77 stack: 0xd0, 0x417f80017d08, 0x417f8
66:16:45:21.021 cpu0:4098)0x417f80017d58:[0x418027832201]IDTDoInterrupt@vmkernel:nover+0x348 stack: 0x4100b6030150, 0x417f800
66:16:45:21.022 cpu0:4098)0x417f80017d98:[0x4180278324da]IDT_HandleInterrupt@vmkernel:nover+0x85 stack: 0x2637d9eee01956, 0x4
66:16:45:21.022 cpu0:4098)0x417f80017db8:[0x418027832e2d]IDT_IntrHandler@vmkernel:nover+0xc4 stack: 0x417f80017ec0, 0x418027a
66:16:45:21.022 cpu0:4098)0x417f80017dc8:[0x4180278da747]gate_entry@vmkernel:nover+0x46 stack: 0x4018, 0x4018, 0x0, 0x0, 0x0
66:16:45:21.023 cpu0:4098)0x417f80017ec0:[0x418027aacb36]Power_HaltPCPU@vmkernel:nover+0x27d stack: 0x417f80017f70, 0x1, 0x26
66:16:45:21.023 cpu0:4098)0x417f80017fd0:[0x4180279cbe1e]CpuSchedIdleLoopInt@vmkernel:nover+0x985 stack: 0x417f80017ff0, 0x41
66:16:45:21.024 cpu0:4098)0x417f80017fe0:[0x4180279d15ee]CpuSched_IdleLoop@vmkernel:nover+0x15 stack: 0x417f80017ff8, 0x0, 0x
66:16:45:21.024 cpu0:4098)0x417f80017ff0:[0x418027834916]HostPCPUIdle@vmkernel:nover+0xd stack: 0x0, 0x0, 0x0, 0x0, 0x0
66:16:45:21.024 cpu0:4098)0x417f80017ff8:[0x0]<unknown> stack: 0x0, 0x0, 0x0, 0x0, 0x0
66:16:45:21.025 cpu0:4098)VMware ESXi 4.1.0 [Releasebuild-260247 X86_64]
66:16:45:21.025 cpu0:4098)Failed at vmkdrivers/src_v4/vmklinux26/vmware/linux_scsi.c:2190 -- NOT REACHED
66:16:45:21.025 cpu0:4098)cr0=0x80010039 cr2=0x0 cr3=0x10ce1000 cr4=0x16c
66:16:45:21.025 cpu0:4098)pcpu:0 world:4098 name:"idle0" (I)
66:16:45:21.025 cpu0:4098)pcpu:1 world:7146 name:"sfcb-vmware_bas" (U)
@BlueScreen: Failed at vmkdrivers/src_v4/vmklinux26/vmware/linux_scsi.c:2190 -- NOT REACHED
66:16:45:21.025 cpu0:4098)Code start: 0x418027800000 VMK uptime: 66:16:45:21.025
66:16:45:21.025 cpu0:4098)0x417f80017af8:[0x418027857fd8]Panic_vPanic@vmkernel:nover+0x13 stack: 0x3000000008
66:16:45:21.026 cpu0:4098)0x417f80017bd8:[0x41802788a572]vmk_Panic@vmkernel:nover+0xa1 stack: 0x24c0
66:16:45:21.026 cpu0:4098)0x417f80017c48:[0x418027c6a35e]SCSILinuxCmdDone@esx:nover+0x2c1 stack: 0x202
66:16:45:21.026 cpu0:4098)0x417f80017c88:[0x418027d78df6]arcmsr_interrupt@esx:nover+0x241 stack: 0xd000000023
66:16:45:21.027 cpu0:4098)0x417f80017cc8:[0x418027c7bd38]Linux_IRQHandler@esx:nover+0x77 stack: 0xd0
66:16:45:21.027 cpu0:4098)0x417f80017d58:[0x418027832201]IDTDoInterrupt@vmkernel:nover+0x348 stack: 0x4100b6030150
66:16:45:21.027 cpu0:4098)0x417f80017d98:[0x4180278324da]IDT_HandleInterrupt@vmkernel:nover+0x85 stack: 0x2637d9eee01956
66:16:45:21.028 cpu0:4098)0x417f80017db8:[0x418027832e2d]IDT_IntrHandler@vmkernel:nover+0xc4 stack: 0x417f80017ec0
66:16:45:21.028 cpu0:4098)0x417f80017dc8:[0x4180278da747]gate_entry@vmkernel:nover+0x46 stack: 0x4018
66:16:45:21.028 cpu0:4098)0x417f80017ec0:[0x418027aacb36]Power_HaltPCPU@vmkernel:nover+0x27d stack: 0x417f80017f70
66:16:45:21.029 cpu0:4098)0x417f80017fd0:[0x4180279cbe1e]CpuSchedIdleLoopInt@vmkernel:nover+0x985 stack: 0x417f80017ff0
66:16:45:21.029 cpu0:4098)0x417f80017fe0:[0x4180279d15ee]CpuSched_IdleLoop@vmkernel:nover+0x15 stack: 0x417f80017ff8
66:16:45:21.030 cpu0:4098)0x417f80017ff0:[0x418027834916]HostPCPUIdle@vmkernel:nover+0xd stack: 0x0
66:16:45:21.030 cpu0:4098)0x417f80017ff8:[0x0]<unknown> stack: 0x0
66:16:45:21.038 cpu0:4098)FSbase:0x0 GSbase:0x418040000000 kernelGSbase:0x0
66:16:45:21.018 cpu0:4098)LinScsi: SCSILinuxCmdDone: Attempted double completion
0:00:00:28.899 cpu1:4825)Elf: 3028: Kernel module arcmsr was loaded, but has no signature attached
Coredump to disk.
Slot 1 of 1.
usb storage message on vmhba32: Bulk command transfer result=0
usb storage message on vmhba32: Bulk data transfer result 0x1
0:00:00:46.061 cpu0:5552)ScsiScan: 1059: Path 'vmhba1:C0:T0:L0': Vendor: 'TEAC    '  Model: 'DV-28S-W        '  Rev: '1.2A'
0:00:00:46.061 cpu0:5552)ScsiScan: 1062: Path 'vmhba1:C0:T0:L0': Type: 0x5, ANSI rev: 5, TPGS: 0 (none)
0:00:00:46.063 cpu0:5552)ScsiUid: 273: Path 'vmhba1:C0:T0:L0' does not support VPD Device Id page.
0:00:00:46.072 cpu0:5552)VMWARE SCSI Id: Could not get disk id for vmhba1:C0:T0:L0
0:00:00:46.073 cpu0:5552)ScsiScan: 1059: Path 'vmhba2:C0:T16:L0': Vendor: 'Areca   '  Model: 'RAID controller '  Rev: 'R001'
0:00:00:46.073 cpu0:5552)ScsiScan: 1062: Path 'vmhba2:C0:T16:L0': Type: 0x3, ANSI rev: 0, TPGS: 0 (none)
0:00:00:46.073 cpu0:5552)WARNING: ScsiScan: 116: Path 'vmhba2:C0:T16:L0': Unsupported pre SCSI-2 device (ansi=0)
0:00:00:46.073 cpu0:5552)ScsiScan: 1059: Path 'vmhba2:C0:T0:L0': Vendor: 'Areca   '  Model: 'ARC-1212-VOL#000'  Rev: 'R001'
0:00:00:46.073 cpu0:5552)ScsiScan: 1062: Path 'vmhba2:C0:T0:L0': Type: 0x0, ANSI rev: 5, TPGS: 0 (none)
0:00:00:46.073 cpu0:5552)ScsiScan: 1059: Path 'vmhba2:C0:T0:L1': Vendor: 'Areca   '  Model: 'ARC-1212-VOL#001'  Rev: 'R001'
0:00:00:46.073 cpu0:5552)ScsiScan: 1062: Path 'vmhba2:C0:T0:L1': Type: 0x0, ANSI rev: 5, TPGS: 0 (none)
0:00:00:46.075 cpu0:4493)usb storage warning (0 throttled) on vmhba32 (SCSI cmd INQUIRY): clearing endpoint halt for pipe 0xc0008280
usb storage message on vmhba32: scsi cmd done
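For reference, here's how I checked which driver the controller is bound to and that the module is actually loaded (run from Tech Support Mode; a rough sketch, with the module name taken from the log above):

esxcfg-scsidevs -a            # lists HBAs and the driver bound to each (vmhba2 shows arcmsr)
vmkload_mod -l | grep arcmsr  # confirms the arcmsr module is loaded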

madmax14304
Enthusiast

Zero crashes for me since the upgrade.  Totally solid and I pound these disks hard.

Easycom20111014
Contributor

So, a few weeks later: still NO crashes since applying the driver update! Thanks everyone for the assistance!
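For anyone landing on this thread later: I applied the updated arcmsr driver as an offline bundle from a vSphere CLI machine, roughly like this (the host name and bundle filename are placeholders for my setup; put the host in maintenance mode first and reboot afterwards):

vihostupdate.pl --server esxi-host --username root --install --bundle offline-bundle.zip
vihostupdate.pl --server esxi-host --username root --query    # verify the new bulletin shows up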

bendiy
Contributor

I just ran into this error today with a Supermicro MB and an Areca 1880.

I'm getting a very similar PSOD when using ghettoVCB to do a backup. I get the error right when it starts pushing massive amounts of data as it clones the vmdk between two datastores. It's weird that I didn't have any issues with ghettoVCB in testing, but I guess now that the system is under a higher production load, the RAID card can't handle it.

I'll update the driver tonight and report back with my findings.
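In case it helps anyone reproduce this outside of ghettoVCB: as far as I can tell, the script is essentially doing a vmkfstools clone between datastores, so something like the following generates the same kind of sequential load (the paths are placeholders for my setup):

vmkfstools -i /vmfs/volumes/datastore1/myvm/myvm.vmdk \
    -d thin /vmfs/volumes/datastore2/backup/myvm.vmdk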

bendiy
Contributor

No luck: I'm now getting a new PSOD when it clones the vmdk. The new driver itself seems to work, but I've got some other issues.

See attached.

UPDATE:

Backup works fine if I'm backing up to my NFS datastore, but not if I'm backing up to another datastore in another RAID array on the same Areca 1880 controller.
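So for now my workaround is simply pointing the backup destination at the NFS datastore in the ghettoVCB config; something like this, if I'm remembering the variable names right (the datastore names are mine):

VM_BACKUP_VOLUME=/vmfs/volumes/nfs_backup   # NFS target: backups complete fine
# VM_BACKUP_VOLUME=/vmfs/volumes/raid2_ds   # second array on the same 1880: PSODs under load
DISK_BACKUP_FORMAT=thin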

Easycom1
Contributor

OK, after a few weeks I had some new errors: at random times my VMs froze or became very slow. In the logs I could see the message "lost connection to datastore..." (this being the datastore my VMs ran from), followed by a message stating the connection was restored. I ended up replacing the controller with a brand-new 1212, with the same result. So I switched my controllers from Areca to LSI 9260, and so far everything is running very fast and stable. I would not recommend using the Areca 1212 together with ESXi.

vanhaakonnen
Contributor

Maybe there is a chance to get new drivers from Areca; this seems to be a driver issue in ESXi... This controller (an ARC-1231) with the same hard disks (Samsung HD204UI) performs very well under FreeNAS. I wrote an e-mail to Areca; maybe a few more people with the same problem could do the same.

NTShad0w
Enthusiast

Mates, bendiy,

Hmm, looks like I have to join the "team of happy Areca controller owners"... :(

I have a Supermicro server on which I was using an HP P400 + HP P800 without any issues (the server never saw a PSOD in 7 months), and I have now swapped those controllers for a single Areca ARC-1882ix-24 with 1GB of cache and, of course, a BBU (an expensive solution for a home server).

And the problems started...:(

HW:

1x Supermicro X8DAH+-F motherboard, 144GB of RAM, 2x L5638 CPUs

1x Areca ARC-1882ix-24 with 1GB cache and BBU

6x Seagate ST31000528AS SATA HDDs (2 of them have CC34 firmware, the rest have CC38)

6x Maxtor STM31000528AS SATA HDDs (same as the Seagates: one has CC34 firmware, the rest have CC38)

SW:

ESXi v5.0.0-504890

40-70 running VMs...

This is my lab environment.

1. During heavy copying/migration of data running simultaneously on 3 datastores (across 2 RAID sets, one RAID 5 and one RAID 10), with some machines running on those same datastores, the ESXi v5.0.0-504890 server crashes with a PSOD after some time... hard to say exactly how long, but not long. It has done this twice in one week now :( I'm really not so happy with my new controller... :(

2. Strange things happen with performance... Sometimes after ESXi starts, performance is very poor, something like 5x slower than it can be. When does it recover? That's the problem: I don't know. At some moments, after changing some parameters, the controller miraculously goes back to the good speed/performance I expect, the same as right after a server start. But when I start migrating a lot of data between datastores, for example, it slows down for a while... and may or may not speed up again. I don't understand it: copying the same VM four times from one datastore to a second may run at 240MB/s... or at 35MB/s, and there are rarely values between that max and min (of course copying slows down when the VMs are doing something at the same time, but a lot of the time it slows down without any good reason :( )

- For example... everything looks good: 10 VMs running, one of them a file server, and CIFS speed is about 80MB/s... pretty good. Then I copy/move some machines between datastores, and after the operation ends the same file server has slowed to 12-15MB/s without any good reason (the disks blink lightly, not seeking hard for data)... why?? I don't know. During the copy the file server is of course extremely slow, dropping to 0.1-1.5MB/s, far, far slower than my P400, which in the same situation managed about 8-10MB/s!!!! OK, there are differences in drivers/hardware/tuning... but what the hell is going on after the copy, when the controller works 5 times slower than before??!!! (I've put the quick throughput test I use below, after point 3.)

3. I'm not sure that, when the server is online and I change options and advanced settings on the controller via the web interface, everything actually takes effect in real time and/or without a restart (yes, some options require a restart, but for the ones that don't... they sometimes look like they didn't apply, either without a restart or immediately after the change; when do they apply? I don't know... :( )
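(This is the quick throughput test I mentioned under point 2; a rough sketch, and the datastore path is from my lab, so substitute your own. I spell out the block size because I'm not sure the BusyBox dd on ESXi takes the 1M shorthand. Watch the adapter in esxtop's disk view while it runs.)

dd if=/dev/zero of=/vmfs/volumes/r5_ds/ddtest.bin bs=1048576 count=4096   # write ~4GB of zeros to one array at a time
rm /vmfs/volumes/r5_ds/ddtest.bin                                         # clean up the test file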

bendiy:

Yep, it looks the same here (I have "only" two crashes in one week of using the Areca): it crashes when I clone or move a lot of data between datastores on the same controller but on different RAID sets (one is a RAID 10 of 4 HDDs, the second a RAID 5 of 6 HDDs), while some virtual machines are running on those same datastores at the same time... so the storage is heavily loaded with different types of IOPS. When I move data between datastores on the same RAID set but in different volumes it never crashes; I've copied about 10TB in different ways over the past week.
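For the record, this is roughly the loop I run to reproduce the crash: repeated clones from the RAID 10 datastore to the RAID 5 one while VMs keep running on both (the VM and datastore names are from my lab):

for i in 1 2 3 4; do
  vmkfstools -i /vmfs/volumes/r10_ds/testvm/testvm.vmdk \
      -d zeroedthick /vmfs/volumes/r5_ds/testvm_copy$i.vmdk
done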

Note: before, during, and after the ESXi crash (PSOD), the Areca controller itself works fine; its web-based management keeps working and shows nothing in the event log. So the problem is not the controller hardware but the driver and/or its compatibility with high workloads under ESX/ESXi v4/v5. As I read on some other sites, when Areca controllers were stress-tested under Windows 7/2k8 on some low-cost PC motherboards with SSD drives, they crashed there too (with a BSOD on Windows). I don't know whether that is the same driver problem or just a low-cost motherboard problem, but it doesn't look good for such an (in theory) high-end controller card like the Areca ARC-1882ix-24... :(((

To Areca support: please solve this issue. I will send you a detailed report of the problem tomorrow.

kind regards

NTShad0w
