adisplayname
Contributor

Passing through tape drive causes PSOD

Hello,

We have an ESX 3.5 (build 123630) Dell PE2900 III server with an LTO4 tape drive which, I'm fairly sure, is connected to a PERC6/i that also has 5 disks in a RAID5 as the core datastore. We have another 2 disks as RAID1 on a SAS5i card. However, I inherited the system, and it's not documented.

I've built a new Windows Server 2008 R2 x64 guest to test Backup Exec 12.5. I configured the tape drive as a SCSI device by telling it to connect SCSI device 1 to Tape IBM as SCSI0:1 on the guest. It shows up as /vmfs/devices/genscsi/vml.010100000etc

The server boots OK, and the drive is detected and identified by the OS. I installed Backup Exec, which told me that the drive is supported but currently using Windows drivers, and asked whether I wanted the BEX drivers, so I said yes. The guest then pulled down the host with a PSOD:

Exception type 13 in world, with a stack including: scsi complete path command, scsistartpathcommands, scsiissueasyncpthcommand, scsilegacympissuecommand, scsistartdevice, scsiasyncdevicecommand, scsi_issueasyncdevicetoken, vscsi_rawdiskcommand and so on. So I'm pretty certain it's related to this tape drive somehow.

Everything rebooted, so I went back into the OS and into Backup Exec, chose the unidentified media in the drive, and tried to catalog it. Bang, down we went with a PSOD again.

At this point, I've confirmed that I'm not trying to share the device with any other guest. I know the console has been using it, but I'm trying to get away from that option.

I've found an article which tells me I need to set bus sharing to none, but it's already set to none. That article also suggests I move the tape drive onto a SCSI card all by itself, which ideally we would also avoid.

Am I missing something obvious that I need to do? If I had thought about it, I would have just left the Windows driver in place and seen how the backup program went, but this is where we're at now. Any suggestions appreciated :)

4 Replies
VMmatty
Virtuoso

A few things come to mind that could be the issue here. Right off the bat I'm suspicious of Server 2008 R2, which isn't officially supported in any version of ESX (even the latest version). People have reported instability and other problems when trying to run R2, so I wouldn't run any production workloads on R2 until there is official support.

The situation you describe, however, is nearly identical to a problem I was having with a very similar setup. It ended up being resolved in a patch for ESX that, judging by your build number, is not installed on your system. The link below gives details of the patch, which is specifically for ESXi, but the latest build of ESX may also include the fix. I'd say try going to the latest version of ESX 3.5 and see if that resolves your issue. The other option is to try an OS other than Server 2008, as I didn't see the problem in Windows 2003 even on an unpatched host.

Here is the patch:

Here is the specific item from the patch that resolves the SCSI issue:

Fixes an issue where ESX might fail if a SCSI read command is split for a tape device. Splitting commands is not supported for tape devices. After this fix is applied split SCSI commands will fail but this will not cause ESX to fail.
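To illustrate why splitting a read is a problem for tape in particular, here is a toy sketch in Python. It is not ESX or driver code: the `ToyTape` class and both helper functions are made up for illustration, and it assumes a simplified variable-block model in which every read consumes exactly one record from the tape, however many bytes were asked for.

```python
class ToyTape:
    """Hypothetical, simplified tape: each read consumes exactly one record."""

    def __init__(self, records):
        self.records = list(records)
        self.position = 0  # index of the next record to be read

    def read(self, max_bytes):
        """Return up to max_bytes of the next record, then move past it."""
        record = self.records[self.position]
        self.position += 1  # the tape advances even if the data was truncated
        return record[:max_bytes]


def read_whole(tape, size):
    """Issue the read as a single command."""
    return tape.read(size)


def read_split(tape, size):
    """Issue the same read as two half-size commands, as a splitting layer would."""
    half = size // 2
    return tape.read(half) + tape.read(size - half)


records = [b"A" * 8, b"B" * 8]
whole = read_whole(ToyTape(records), 8)
split = read_split(ToyTape(records), 8)
print(whole, split)
```

In this model the single read returns the first record intact, while the split read returns half of the first record followed by half of the second. A disk can be re-addressed by block number so a split is harmless there, but a sequential device cannot, which is consistent with the patch note saying split commands are simply failed for tape.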

Matt | http://www.thelowercasew.com | @mattliebowitz
asatoran
Immortal

We have an ESX 3.5 (build 123630) Dell PE2900 III server with an LTO4 tape drive which, I'm fairly sure, is connected to a PERC6/i that also has 5 disks in a RAID5 as the core datastore. We have another 2 disks as RAID1 on a SAS5i card. However, I inherited the system, and it's not documented.

The tape drive is connected to the RAID controller? I've always been told that is a no-no. A tape drive belongs on a non-RAID controller, preferably on its own channel. Whether that is causing or contributing to your issue, I do not know.

As for your issue, I originally had PSODs with BE 12.5 in a 32-bit Server 2008 VM (not R2). The tape drive was originally on a physical machine with an HP-branded LSI SCSI card. After moving the backups to a VM, I got a PSOD whenever the tape drive was accessed (e.g. during an inventory). I tried different hosts (different manufacturers and models) but still got PSODs. I tried a spare Adaptec 39160 SCSI card: PSOD. On a lark, I loaded ESXi 4 with the 39160 card. It worked! But performance was very poor because the tape drive had compatibility issues with the Ultra160 speed of the 39160. In the end, we got an HP DL385 G5 host and an Adaptec 29320LPE Ultra320 card. The Adaptec 29320 ALS is on the compatibility list, but not the LPE. The ALS is a RAID card and the LPE is a non-RAID, low-profile PCIe card. I just hoped that it was close enough, which it was. Knock on wood, performance is about as good as when it was physical, with no PSODs, so I think I'm in the clear.

My point is that in addition to the RAID controller, there could be other compatibility issues as well. As mentioned, perhaps 2008 R2. In my case, I needed to update to ESX 4. I know you said that you'd prefer to avoid moving the tape drive to a separate SCSI card, but having it on the RAID controller raises quite a red flag in my mind.

Josh26
Virtuoso

We have an ESX 3.5 (build 123630) Dell PE2900 III server with an LTO4 tape drive which, I'm fairly sure, is connected to a PERC6/i that also has 5 disks in a RAID5 as the core datastore. We have another 2 disks as RAID1 on a SAS5i card. However, I inherited the system, and it's not documented.

The tape drive is connected to the RAID controller? I've always been told that is a no-no. A tape drive belongs on a non-RAID controller, preferably on its own channel. Whether that is causing or contributing to your issue, I do not know.

Agreed. Some time ago I had a physical Windows server that would BSOD under exactly these circumstances. We logged a ticket with Symantec and received the same advice to move the tape drive to a dedicated card, which corrected the issue.

adisplayname
Contributor

Thank you, folks. I know we're due to upgrade to ESX 4; however, we have another system to do first, so in the meantime I'll try:

- downgrading from R2

- checking whether the tape drive is on a controller by itself.

- even potentially trying 2003 x64...

I know the drive is going to be slower, but I'm not hugely concerned about that.
