VMware Cloud Community
Easycom20111014
Contributor

Areca 1212 Failure

After a few months running ESXi 4.1 on a Supermicro server with an Areca 1212 RAID controller, the following errors show up after a PSOD:

Disks are in RAID 1.

Any help appreciated.

WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver ARCMSR ARECA SATA/SAS RAID ControllerDriver Version 1.20.00.15.vmk.100202, for vmhba2
66:16:44:51.630 cpu0:10101331)WARNING: arcmsr5: abort device command(0xe01fc00) of scsi id = 0 lun = 0
66:16:44:51.630 cpu0:10101331)WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver ARCMSR ARECA SATA/SAS RAID ControllerDriver Version 1.20.00.15.vmk.100202, for vmhba2
66:16:44:51.843 cpu0:10101331)WARNING: arcmsr5: abort device command(0xe030400) of scsi id = 0 lun = 0
66:16:44:51.843 cpu0:10101331)

and

WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver ARCMSR ARECA SATA/SAS RAID ControllerDriver Version 1.20.00.15.vmk.100202, for vmhba2
66:16:45:19.941 cpu1:4161)WARNING: LinScsi: SCSILinuxAbortCommands: Failed, Driver ARCMSR ARECA SATA/SAS RAID ControllerDriver Version 1.20.00.15.vmk.100202, for vmhba2
66:16:45:19.941 cpu1:4161)WARNING: arcmsr5: abort device command(0xe02e800) of scsi id = 0 lun = 1
66:16:45:21.018 cpu0:4098)<5>arcmsr5: pCCB ='0x0x4100b50e9400' isr got aborted command
66:16:45:21.018 cpu0:4098)<5>arcmsr5: isr get an illegal ccb command     done acb = '0x0x41000b813588'ccb = '0x0x4100b50e9400' ccbacb = '0x0x41000b813588' startdone = 0x0 ccboutstandingcount = 41
66:16:45:21.018 cpu0:4098)ALERT: LinScsi: SCSILinuxCmdDone: Attempted double completion
66:16:45:21.019 cpu0:4098)Backtrace for current CPU #0, worldID=4098, ebp=0x417f800179a8
66:16:45:21.019 cpu0:4098)0x417f800179a8:[0x4180278577b5]PanicLogBacktrace@vmkernel:nover+0x18 stack: 0x417f800179d8, 0x417f8
66:16:45:21.019 cpu0:4098)0x417f80017ae8:[0x4180278579f4]PanicvPanicInt@vmkernel:nover+0x1ab stack: 0x417f80017bd8, 0x4180278
66:16:45:21.020 cpu0:4098)0x417f80017af8:[0x418027857fdd]Panic_vPanic@vmkernel:nover+0x18 stack: 0x3000000008, 0x417f80017be8
66:16:45:21.020 cpu0:4098)0x417f80017bd8:[0x41802788a572]vmk_Panic@vmkernel:nover+0xa1 stack: 0x24c0, 0xd0b50e9400, 0x417f800
66:16:45:21.021 cpu0:4098)0x417f80017c48:[0x418027c6a35e]SCSILinuxCmdDone@esx:nover+0x2c1 stack: 0x202, 0x418027d78e38, 0x410
66:16:45:21.021 cpu0:4098)0x417f80017c88:[0x418027d78df6]arcmsr_interrupt@esx:nover+0x241 stack: 0xd000000023, 0x418027d78e38
66:16:45:21.021 cpu0:4098)0x417f80017cc8:[0x418027c7bd38]Linux_IRQHandler@esx:nover+0x77 stack: 0xd0, 0x417f80017d08, 0x417f8
66:16:45:21.021 cpu0:4098)0x417f80017d58:[0x418027832201]IDTDoInterrupt@vmkernel:nover+0x348 stack: 0x4100b6030150, 0x417f800
66:16:45:21.022 cpu0:4098)0x417f80017d98:[0x4180278324da]IDT_HandleInterrupt@vmkernel:nover+0x85 stack: 0x2637d9eee01956, 0x4
66:16:45:21.022 cpu0:4098)0x417f80017db8:[0x418027832e2d]IDT_IntrHandler@vmkernel:nover+0xc4 stack: 0x417f80017ec0, 0x418027a
66:16:45:21.022 cpu0:4098)0x417f80017dc8:[0x4180278da747]gate_entry@vmkernel:nover+0x46 stack: 0x4018, 0x4018, 0x0, 0x0, 0x0
66:16:45:21.023 cpu0:4098)0x417f80017ec0:[0x418027aacb36]Power_HaltPCPU@vmkernel:nover+0x27d stack: 0x417f80017f70, 0x1, 0x26
66:16:45:21.023 cpu0:4098)0x417f80017fd0:[0x4180279cbe1e]CpuSchedIdleLoopInt@vmkernel:nover+0x985 stack: 0x417f80017ff0, 0x41
66:16:45:21.024 cpu0:4098)0x417f80017fe0:[0x4180279d15ee]CpuSched_IdleLoop@vmkernel:nover+0x15 stack: 0x417f80017ff8, 0x0, 0x
66:16:45:21.024 cpu0:4098)0x417f80017ff0:[0x418027834916]HostPCPUIdle@vmkernel:nover+0xd stack: 0x0, 0x0, 0x0, 0x0, 0x0
66:16:45:21.024 cpu0:4098)0x417f80017ff8:[0x0]<unknown> stack: 0x0, 0x0, 0x0, 0x0, 0x0
66:16:45:21.025 cpu0:4098)VMware ESXi 4.1.0 [Releasebuild-260247 X86_64]
66:16:45:21.025 cpu0:4098)Failed at vmkdrivers/src_v4/vmklinux26/vmware/linux_scsi.c:2190 -- NOT REACHED
66:16:45:21.025 cpu0:4098)cr0=0x80010039 cr2=0x0 cr3=0x10ce1000 cr4=0x16c
66:16:45:21.025 cpu0:4098)pcpu:0 world:4098 name:"idle0" (I)
66:16:45:21.025 cpu0:4098)pcpu:1 world:7146 name:"sfcb-vmware_bas" (U)
@BlueScreen: Failed at vmkdrivers/src_v4/vmklinux26/vmware/linux_scsi.c:2190 -- NOT REACHED
66:16:45:21.025 cpu0:4098)Code start: 0x418027800000 VMK uptime: 66:16:45:21.025
66:16:45:21.025 cpu0:4098)0x417f80017af8:[0x418027857fd8]Panic_vPanic@vmkernel:nover+0x13 stack: 0x3000000008
66:16:45:21.026 cpu0:4098)0x417f80017bd8:[0x41802788a572]vmk_Panic@vmkernel:nover+0xa1 stack: 0x24c0
66:16:45:21.026 cpu0:4098)0x417f80017c48:[0x418027c6a35e]SCSILinuxCmdDone@esx:nover+0x2c1 stack: 0x202
66:16:45:21.026 cpu0:4098)0x417f80017c88:[0x418027d78df6]arcmsr_interrupt@esx:nover+0x241 stack: 0xd000000023
66:16:45:21.027 cpu0:4098)0x417f80017cc8:[0x418027c7bd38]Linux_IRQHandler@esx:nover+0x77 stack: 0xd0
66:16:45:21.027 cpu0:4098)0x417f80017d58:[0x418027832201]IDTDoInterrupt@vmkernel:nover+0x348 stack: 0x4100b6030150
66:16:45:21.027 cpu0:4098)0x417f80017d98:[0x4180278324da]IDT_HandleInterrupt@vmkernel:nover+0x85 stack: 0x2637d9eee01956
66:16:45:21.028 cpu0:4098)0x417f80017db8:[0x418027832e2d]IDT_IntrHandler@vmkernel:nover+0xc4 stack: 0x417f80017ec0
66:16:45:21.028 cpu0:4098)0x417f80017dc8:[0x4180278da747]gate_entry@vmkernel:nover+0x46 stack: 0x4018
66:16:45:21.028 cpu0:4098)0x417f80017ec0:[0x418027aacb36]Power_HaltPCPU@vmkernel:nover+0x27d stack: 0x417f80017f70
66:16:45:21.029 cpu0:4098)0x417f80017fd0:[0x4180279cbe1e]CpuSchedIdleLoopInt@vmkernel:nover+0x985 stack: 0x417f80017ff0
66:16:45:21.029 cpu0:4098)0x417f80017fe0:[0x4180279d15ee]CpuSched_IdleLoop@vmkernel:nover+0x15 stack: 0x417f80017ff8
66:16:45:21.030 cpu0:4098)0x417f80017ff0:[0x418027834916]HostPCPUIdle@vmkernel:nover+0xd stack: 0x0
66:16:45:21.030 cpu0:4098)0x417f80017ff8:[0x0]<unknown> stack: 0x0
66:16:45:21.038 cpu0:4098)FSbase:0x0 GSbase:0x418040000000 kernelGSbase:0x0
66:16:45:21.018 cpu0:4098)LinScsi: SCSILinuxCmdDone: Attempted double completion
0:00:00:28.899 cpu1:4825)Elf: 3028: Kernel module arcmsr was loaded, but has no signature attached
Coredump to disk.
Slot 1 of 1.
storage message on vmhba32: Bulk command transfer result=0
usb storage message on vmhba32: Bulk data transfer result 0x1
0:00:00:46.061 cpu0:5552)ScsiScan: 1059: Path 'vmhba1:C0:T0:L0': Vendor: 'TEAC    '  Model: 'DV-28S-W        '  Rev: '1.2A'
0:00:00:46.061 cpu0:5552)ScsiScan: 1062: Path 'vmhba1:C0:T0:L0': Type: 0x5, ANSI rev: 5, TPGS: 0 (none)
0:00:00:46.063 cpu0:5552)ScsiUid: 273: Path 'vmhba1:C0:T0:L0' does not support VPD Device Id page.
0:00:00:46.072 cpu0:5552)VMWARE SCSI Id: Could not get disk id for vmhba1:C0:T0:L0
0:00:00:46.073 cpu0:5552)ScsiScan: 1059: Path 'vmhba2:C0:T16:L0': Vendor: 'Areca   '  Model: 'RAID controller '  Rev: 'R001'
0:00:00:46.073 cpu0:5552)ScsiScan: 1062: Path 'vmhba2:C0:T16:L0': Type: 0x3, ANSI rev: 0, TPGS: 0 (none)
0:00:00:46.073 cpu0:5552)WARNING: ScsiScan: 116: Path 'vmhba2:C0:T16:L0': Unsupported pre SCSI-2 device (ansi=0)
0:00:00:46.073 cpu0:5552)ScsiScan: 1059: Path 'vmhba2:C0:T0:L0': Vendor: 'Areca   '  Model: 'ARC-1212-VOL#000'  Rev: 'R001'
0:00:00:46.073 cpu0:5552)ScsiScan: 1062: Path 'vmhba2:C0:T0:L0': Type: 0x0, ANSI rev: 5, TPGS: 0 (none)
0:00:00:46.073 cpu0:5552)ScsiScan: 1059: Path 'vmhba2:C0:T0:L1': Vendor: 'Areca   '  Model: 'ARC-1212-VOL#001'  Rev: 'R001'
0:00:00:46.073 cpu0:5552)ScsiScan: 1062: Path 'vmhba2:C0:T0:L1': Type: 0x0, ANSI rev: 5, TPGS: 0 (none)
0:00:00:46.075 cpu0:4493)usb storage warning (0 throttled) on vmhba32 (SCSI cmd INQUIRY): clearing endpoint halt for pipe 0xc0008280
usb storage message on vmhba32: scsi cmd done

1 Solution

Accepted Solutions
madmax14304
Enthusiast

I sent a couple of requests to Areca for support, and they got back to me today instructing me to update my VMware drivers and pointing me to their website. The drivers page lists it as a 4.0 driver, but it installed fine.

This is the driver I've installed:

http://www.areca.us/support/s_vmware/esx_4/vmware-esx-drivers-scsi-arcmsr-400.1.20.00.15.vmk.110418....

I'll check back with an update on whether it crashes or stays stable. I've re-enabled NCQ too.

26 Replies
DSTAVERT
Immortal

Welcome to the Communities

Sorry you are having trouble. A PSOD usually indicates a hardware issue. I don't know whether you have the battery-backed write cache (BBWC) module installed. There is a knowledge base article that applies to your controller, http://kb.vmware.com/kb/1012794, and I believe not having the BBWC module installed would be equivalent to having a discharged battery.

Since the drivers for the controller are added after installation, can I assume this is installed to USB?

I can't say whether this would work or not, but I would try disabling each disk in turn and see whether ESXi will start. My first job would be copying the VMs to alternate locations; I would not want to run them on the existing drives.

-- David -- VMware Communities Moderator
madmax14304
Enthusiast

I seem to be having the exact same issue. I have an ARC-1212 with a BBU installed. Any advice would be helpful. ESXi runs fine for a few days and then eventually crashes.

[Attachment: Screen shot 2011-04-22 at 4.00.36 PM.png]

pdxvmug
Contributor

I am also running a SuperMicro server, with an Areca 1224 RAID controller, and am getting the same PSOD as madmax is reporting every 2-3 weeks. RAID 5; rebooting the ESXi 4.1 host brings it all back.

DSTAVERT
Immortal

Here is an Areca link that may have some relevance: http://faq.areca.com.tw/index.php?option=com_quickfaq&view=items&cid=3:Hardware&id=406:Q10100810&Ite... While this is specifically about applying firmware updates to Seagate drives, there may be others.

One of the real downsides to building your own server is that you take on the role of chief technologist, helpdesk, and onsite service tech. With a general-purpose operating system (Windows, Linux, etc.) most servers aren't really stressed, and slight inconsistencies don't necessarily matter much. With virtualization you may now have many operating systems fighting for access to the same resources that a single operating system used to have to itself. Slight timing differences can have a great impact on the stability of the host machine.

-- David -- VMware Communities Moderator
madmax14304
Enthusiast

I'm using 4 WD WD20EADS 2TB SATA drives. I was using these same drives with a different Areca RAID card (an ARC-1230) without issue until that card died and I installed the ARC-1212 and its associated drivers. I wonder if the issue is just immature VMware drivers for this specific RAID adapter. I'm also wondering whether it would be stable if I used something like Windows Server 2008 with the Areca RAID drivers, as they probably put more cycles into developing drivers for Windows. Thanks for the responses. If I come up with a solution, I'll post it.

madmax14304
Enthusiast

It looks as if this error is related to NCQ on the RAID adapter.

See http://communities.vmware.com/thread/292358?start=15&tstart=0

I've disabled NCQ and seen no crashing yet. I'll check back if it crashes again.

pdxvmug
Contributor

Good luck with that. It didn't help our server; two weeks later the PSOD was back...


madmax14304
Enthusiast

I sent a couple of requests to Areca for support, and they got back to me today instructing me to update my VMware drivers and pointing me to their website. The drivers page lists it as a 4.0 driver, but it installed fine.

This is the driver I've installed:

http://www.areca.us/support/s_vmware/esx_4/vmware-esx-drivers-scsi-arcmsr-400.1.20.00.15.vmk.110418....

I'll check back with an update on whether it crashes or stays stable. I've re-enabled NCQ too.
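If you want to double-check which arcmsr driver the host actually loaded after the swap, something like this should work from the Tech Support Mode shell (the `esxupdate query` line is the one I used; the `vmkload_mod -s` usage is from memory, so verify the output format on your own build):

```shell
# List installed/retired arcmsr driver bulletins
esxupdate query --vib-view | grep arcmsr

# Ask the vmkernel for details on the loaded arcmsr module
# (should include the driver version string)
vmkload_mod -s arcmsr
```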

dkairo
Contributor

I have the same issue with ESXi 4.1 on a Supermicro server with an Areca 1212 RAID controller. Will I lose all of my existing VMs and have to rebuild the RAID if I upgrade the Areca driver?

madmax14304
Enthusiast

I've had NO crashes since the driver upgrade, so this is looking good. You will not lose anything with a driver upgrade. It was a 10-minute process to remove and replace the driver.

dkairo
Contributor

Thanks, I will give it a try.

How can I tell that the old Areca driver has been removed?

Areca's instructions say:

2.1) If you have an older driver, please remove it from the ESXi terminal with:
# esxupdate -b ARC remove --maintenancemode

I connected via SSH, and I get no response from this command.

madmax14304
Enthusiast

Once removed, it'll look like this:

~ # esxupdate query --vib-view | grep arcmsr
cross_vmware-esx-drivers-scsi-arcmsr_400.1.20.00.15.vmk.110418-1OEM        installed     2011-05-03T11:39:12.202313+00:00
cross_vmware-esx-drivers-scsi-arcmsr_400.1.20.00.15.vmk.100202-1OEM        retired       2011-05-03T11:39:12.250458+00:00

I prefer to update this driver from a Windows box with the VMware vSphere CLI installed. I just used the vihostupdate.pl program to update the driver.
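For reference, the vihostupdate.pl invocation looks roughly like this (host address, credentials, and bundle path are placeholders; double-check the flags against your CLI version — `--nosigcheck` is needed because the Areca driver module is unsigned):

```shell
# Run from the vSphere CLI bin directory on the Windows box.
# <esxi-host> and <password> are placeholders for your environment.
perl vihostupdate.pl -server <esxi-host> -username root -password <password> -b C:\temp\offline-bundle.zip --nosigcheck -i
```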

dkairo
Contributor

I was able to remove the old driver successfully.

~ # esxupdate query --vib-view | grep arcmsr
cross_vmware-esx-drivers-scsi-arcmsr_400.1.20.00.15.vmk.100202-1OEM        uninstalled   2011-05-26T00:52:16.649237+00:00

However, using the vSphere CLI to update with vihostupdate.pl gave me this error:

C:\Program Files\VMware\VMware vSphere CLI\bin>perl vihostupdate.pl -server 192.168.1.211 -username root -password ******* -b F:\offline-bundle\offline-bundle.zip --nosigcheck -i
Please wait patch installation is in progress ...
The format of the metadata is invalid.Duplicate definitions of bulletin ARC with unequal attributes.

I checked the offline-bundle.zip file on the CD and the correct version is there "vmware-esx-drivers-scsi-arcmsr-400.1.20.00.15.vmk.110418-1OEM.x86_64.vib"

I originally used the vSphere CLI to install the old version of the driver, but now it is not working. I looked up that error but was not able to find any information on it.

I appreciate any help on this.

Thanks!

madmax14304
Enthusiast

You need to reboot the system to make it see that the driver was removed.  I had the same issue.  Once rebooted, try installing the driver.

madmax14304
Enthusiast

Oh,

Also do this:

rm /tmp/stage/firmware/usr/lib/ipkg/bulletins.zip

Took me a long time to find that.
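Putting the pieces from this thread together, my whole driver-swap sequence looked roughly like this (the ARC bulletin ID and the bulletins.zip path came from Areca's instructions and this thread; treat it as a sketch for your own system, not gospel):

```shell
# 1. On the ESXi host: remove the old arcmsr driver bulletin
esxupdate -b ARC remove --maintenancemode

# 2. Clear the cached bulletin metadata so the new install
#    doesn't see a duplicate "ARC" definition
rm /tmp/stage/firmware/usr/lib/ipkg/bulletins.zip

# 3. Reboot so the host actually sees the driver as removed
reboot

# 4. After reboot, install the new bundle (e.g. via vihostupdate.pl
#    from a remote vSphere CLI box), then verify on the host:
esxupdate query --vib-view | grep arcmsr
```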

dkairo
Contributor

Thank you!   The driver upgrade worked.

Easycom1
Contributor

Lost track of this case, so sorry about the delayed feedback. The problem server had been running fine for a few weeks, until this morning. Drives have been replaced, but to no avail. The strange thing is that I have an identical server running which doesn't give me any problems at all (so far). Can someone confirm that the driver update fixed the problem? Thanks again.

dkairo
Contributor

The driver update worked for me. I have had no issues in the three weeks since the upgrade.

David

Easycom1
Contributor

I installed the new driver; I will wait a few weeks to see if crashes still occur. Thanks for the help provided so far.
