VMware Cloud Community
Jollyluke
Contributor

ESXi 5 VMs inaccessible after QNAP check disk

Hi all

In September my company bought a QNAP TS-859+ (with eight 2 TB disks in RAID 6) to be used with a Dell PowerEdge T110 II. The purpose of these purchases was the implementation of a new booking application (we're a model agency). The whole system went into production in late November 2011. On the Dell server we installed VMware ESXi 5, and all the virtual machines were stored on a single dedicated iSCSI target on the QNAP unit. The virtual machines were: the application environment itself (Windows Server 2008 R2 Standard), a web server (Windows 2008 R2 Web Edition) and two Zimbra mail servers (which were NOT production servers; they were there just for test purposes). In addition, some other iSCSI targets were created, but just for file sharing. For each target, only a single LUN was created.

After the system went into production, the NAS hung twice in one week. We then noticed that new firmware had been released, so we installed it at the beginning of December. After that, no more hangs occurred until last Thursday (January 26th). I have to say that nothing really bad happened until today: whenever the QNAP unit stopped responding, we just had to reboot the QNAP and then the VMware physical server, and all the virtual machines worked again.

Last Friday, since the QNAP NAS log had for some time been showing messages asking IT personnel to perform a chkdsk, I ran a thorough analysis using the appropriate function in the QNAP web interface. I started the analysis of ALL disks at 11pm and it ended the next day at 8am. After that, hard disk number 5 was no longer detected (the web interface said NO DISK). I restarted the QNAP unit, and it again reported that hard disk number 5 was damaged. Of course I wasn't worried at that moment: I have RAID 6 and all the backups. Since I had to replace one disk and didn't remember which HD model we had installed, I checked in the QNAP unit's web interface. Then (I don't know why I did it, but in the end it was a GOOD decision) I went to QNAP's website to see which HD models are compatible with my unit. Well, I was petrified: all the disks in my unit are Seagate Barracuda Green model ST2000DL003, which were listed as COMPLIANT when I bought both the NAS unit and the disks, but are now listed as NOT RECOMMENDED. A note says that this HD model passed all lab tests, but due to customer complaints it is no longer recommended. I started to understand our problems with the NAS unit not responding, etc.

But unfortunately, the worst news was still to come. After the analysis of all hard disks had finished and we had restarted both the unit itself and the VMware ESXi server, we couldn't boot the virtual machines anymore. All virtual machines were marked as "Unknown" and (Inaccessible). The iSCSI targets we use only for file sharing have no problems, but all virtual machines are currently unusable. I performed a rescan many times, etc., always with no success. The first few times I restarted the VMware physical server, it reported that the target hosting all the VMs had a size of 0 bytes. After three or four restarts, and also a restart of the NAS unit, VMware now detects the correct 6 TB size, but it still doesn't show me the VM names and keeps saying that all virtual machines are inaccessible.

After a lot of attempts and searching the Internet for a possible solution, I decided to reinstall VMware ESXi 5 on the server. That did not solve the problem: once again, the first time I performed a rescan, VMware detected the target with a size of 0 bytes, and after stopping and restarting that target on the QNAP unit, VMware again detected the correct size, but I still can't get it to detect the old virtual machines.

Why all of this? Any suggestions? It's incredible to have all these troubles only for performing a disk check!!! Ok, we have a damaged disk, but we have a RAID6!!!

Thank you all for your support,

Gianluca

2 Replies
NickMarshall9
VMware Employee

Hi Gianluca,

Unfortunately you're in a tough situation. I do love QNAPs (I run one for my home lab), but they can be picky when certifying disks to use.

The reason your "green" disks are not certified is that they are built to less stringent guidelines / tolerances for error than their "enterprise" grade counterparts. They are not built to run 24x7 either.

The disks I use in my QNAP NAS are the Samsung HD204UI 1AQ1 - these are NOT enterprise grade disks either but they are listed as compatible on QNAP's support page:

http://www.qnap.com/pro_compatibility.asp

Although mine had to have the following firmware update applied:

Note10 (Samsung HD204UI 2TB)
To use this hard disk model with QNAP products, please back up the disk data (if any) and follow the guide below to apply the patch for improved data integrity.
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=223451&NewLang=en

Might I suggest you try to purchase a new set of HDDs that are marked as compatible, and see if you can get a return on the old disks. Since you have backups, unfortunately it might be a long night of restoring the data :(

Sorry your post went unanswered for so long.

Cheers,

Book - Mastering VMware vSphere 5.5 | Blog - LabGuides.com & NickMarshall.com.au | Podcast - vBrownBag.com
scottyyyc
Enthusiast

So you've tried re-adding the machines into the inventory? Are the actual VM files present?
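One rough way to answer the "are the VM files present?" question, assuming the datastore can be browsed as an ordinary directory tree (ESXi mounts VMFS datastores under /vmfs/volumes/) and that Python is available wherever the check is run. The helper name `find_vmx_files` is made up for this sketch, and the path passed to it is whatever your datastore path happens to be:

```python
import os
import sys


def find_vmx_files(datastore_root):
    """Return the sorted paths of all .vmx files found under datastore_root.

    datastore_root is an assumption: any directory through which the
    datastore contents are visible.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(datastore_root):
        for name in filenames:
            # A .vmx file is the configuration file that defines a VM.
            if name.lower().endswith(".vmx"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Usage: python find_vmx.py <path-to-datastore>
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_vmx_files(root):
        print(path)
```

If this prints nothing for the datastore that held the VMs, re-adding them to the inventory will not help, and restoring from backup is the more likely route.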

Hopefully, with a good backup, you're not in too much of a pinch. I too love QNAPs, but would be a little hesitant about putting them into production. I see far too many gotchas with hard drive types, and far too many glitches in their firmware. I like that QNAP regularly offers new firmware with new features, but sometimes that can be as much of a liability as a plus in the business world. Storage is the one thing you have to be really particular about in VM setups, as your whole environment depends on it. Maybe if the funds allow at some point in the future, look into a Dell MD series or HP P2000 array. A bit more expensive than a QNAP, but rightfully so.

Also, from personal experience in these exact same scenarios, whenever I'm dealing with a new type of product, I usually break the crap out of it before I put it into production - i.e. remove drives, reboot it, run disk checks, all to see what happens in the event of an emergency. I've seen many a RAID setup fail miserably in some respect with only 1 dead drive.
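For reference, the nominal RAID 6 numbers for the array described in the original post (eight 2 TB disks) can be sketched as below. This is textbook arithmetic only, not anything QNAP-specific: the function name `raid6_summary` is invented for the example, and real usable space will be lower because of filesystem overhead and TB versus TiB reporting.

```python
def raid6_summary(disk_count, disk_size_tb):
    """Nominal RAID 6 figures: two disks' worth of capacity go to parity."""
    if disk_count < 4:
        raise ValueError("RAID 6 needs at least four disks")
    parity_disks = 2
    return {
        "raw_tb": disk_count * disk_size_tb,
        "usable_tb": (disk_count - parity_disks) * disk_size_tb,
        # The array should keep running with up to this many failed disks.
        "tolerated_failures": parity_disks,
    }


# The configuration from the original post: eight 2 TB disks.
summary = raid6_summary(8, 2)
print(summary)
# prints {'raw_tb': 16, 'usable_tb': 12, 'tolerated_failures': 2}
```

On paper, then, one dead disk out of eight leaves the array with a disk of redundancy to spare. The figures only cover whole-disk failures, though, and say nothing about damage at the filesystem or LUN level, which is why testing failure scenarios before going into production matters.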
