The company I work for is expanding rapidly, and we are facing a growing problem: finding a highly available, scalable way to host files for our users. At my previous job we used Microsoft Clustering Services on Windows 2003 to keep two bare-metal file servers up, but I see a lot of people moving to virtualization now, and I am honestly not sure that is still the right approach given the advancements in Server 2008.
I am currently finishing my last year of college and don't have that "real-world" knowledge in this area yet, so I was curious if anyone has good resources or pointers on best practices for building a highly available file server.
We are looking to provide at least 2 TB of usable space right off the bat; however, we will need this to be easily growable over the next few years.
Thanks in advance for any ideas!
I would say that for a file server, MS clustering is still often the way to go, particularly with the improvements in clustering that came in with Windows 2008/2008 R2. There's nothing to stop you combining this with virtualisation, however, by using virtual cluster nodes.
The normal high availability you tend to get with virtualisation will help protect you against hardware failure, but not against software problems such as misconfiguration, patching issues, viruses, etc.
I'd disagree. Do you want the complexity that is introduced with Windows Clustering? I'd prefer to stay away from it with the simple goal of a redundant file server.
Try the Windows Server 2008 Distributed File System.
* initial note: this configuration will require double the space, whereas clustering only uses a single copy.
Set up two VMs, use anti-affinity rules to keep them off the same physical VMware host, and set up DFS to keep your copies of files replicated and synced between the two nodes. They are presented as one virtual IP, so the users won't know the difference.
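If it helps, here's roughly what that looks like from the command line on Server 2008. This is only a sketch — the server names (FS1/FS2), share name, and replication group name are all made up, and you should double-check the exact `dfsutil`/`dfsradmin` syntax with `/?` on your build:

```shell
rem Create a domain-based namespace (the single entry point users see).
rem Assumes the DFS role services are already installed on both servers.
dfsutil root adddom \\FS1\files

rem Create a DFS-R replication group and add both file servers to it.
dfsradmin rg new /rgname:FilesRG
dfsradmin mem new /rgname:FilesRG /memname:FS1
dfsradmin mem new /rgname:FilesRG /memname:FS2

rem Define the replicated folder, then two-way connections between members.
dfsradmin rf new /rgname:FilesRG /rfname:files
dfsradmin conn new /rgname:FilesRG /sendmem:FS1 /recvmem:FS2
dfsradmin conn new /rgname:FilesRG /sendmem:FS2 /recvmem:FS1
```

The same thing can be done through the DFS Management MMC snap-in if you prefer clicking through a wizard.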
This is a good way to do it, but it does not help with the aforementioned misconfiguration, patching problems, viruses infecting the data on the file server, etc. It does give you the added flexibility of vMotion, Storage vMotion, and DRS, which MS clustering prevents you from enabling on the clustered VMs.
What do you think?
I would say that this is another solution; however, it's worth pointing out that if you're replicating files, there is a certain overhead, not least the need for additional storage.
Definitely. The overhead is there, but for a simple file server, it's pretty lightweight. The need for double the storage, however, is substantial.
I just don't like messing with clusters unless it is demanded by the business.
@DavidKlee, I think I understand what you are saying. So rather than having two machines clustered, you have two individual file servers on different hosts that replicate data between each other? Then in the event that one of the machines goes down, the other will take over (kind of like how Domain Controllers replicate AD info)?
If this is true, then you would just create a separate drive and mount it to the VM? Then as you need the drive to grow, you simply change the drive size in the VM and use Windows to adjust for the new size? If so, how do we plan for surpassing VMware's 2TB drive size limit (I think it had to do with block-level sizes?).
Yup - it'd be two distinct file server VMs, configured with anti-affinity rules so they do not run on the same piece of VMware hardware so as to minimize the risk of a host failure.
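For the anti-affinity piece, if you manage the cluster with VMware PowerCLI, a DRS "keep apart" rule is a one-liner. Sketch only — the vCenter address, cluster name, and VM names below are placeholders:

```shell
# Assumes VMware PowerCLI is installed and you have DRS licensing.
Connect-VIServer vcenter.example.com

# KeepTogether:$false = anti-affinity: the two file-server VMs will
# never be placed on the same ESX(i) host.
New-DrsRule -Cluster (Get-Cluster "Prod-Cluster") `
            -Name "Separate-FileServers" `
            -KeepTogether $false `
            -VM (Get-VM "FS1"), (Get-VM "FS2")
```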
The two file servers would replicate their data between each other, and they're presented as a single point of entry, so users never know there are two servers behind it. One will take over as primary if the other goes down.
Each VM would get its own independent VMDK virtual hard drives, and each are mounted in their respective VMs. To increase the drive size, just grow the drive inside VMware, do a rescan within the Windows OS, and extend the partition once Windows can see the new space.
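The rescan-and-extend step inside the guest can be done entirely with diskpart. Sketch only — the volume number is an example; run `list volume` to find yours:

```shell
rem Run after growing the VMDK in vSphere.
diskpart
DISKPART> rescan              rem pick up the new disk size
DISKPART> list volume         rem identify the volume to grow
DISKPART> select volume 2     rem example number -- use yours
DISKPART> extend              rem consume all newly visible space
DISKPART> exit
```

On Server 2008 this works online for data volumes; no reboot needed.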
The 2TB drive size limit imposed by VMware is no longer a function of VMFS block size (a VMFS5 datastore can go to 64TB), but individual VMDKs are still capped at 2TB because of the way VMware presents the disks. However, with Windows DFS, you can add a second VMDK file and add it as another mounted folder under your DFS tree.
If it were me, here's how I would architect it. This is a fictitious organization I'm creating.
* OS: C: drive. 50GB VMDK.
* Marketing drive: D: drive. 200GB VMDK, mounted as /fileshare/marketing
* Finance drive: E: drive. 100GB VMDK, mounted as /fileshare/finance
* IT drive: F: drive. 500GB VMDK (awww yeah), mounted as /fileshare/IT
Permissions would be set on these subfolders accordingly. If one drive needed to be grown, you can grow it without affecting the others, and if one org folder needed to exceed 2TB it could be broken into two (or more) subfolders with ease.
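Surfacing one of those extra VMDKs as a mounted folder instead of a drive letter is also a diskpart job. Sketch only — the disk number and mount path are examples, and a brand-new disk may first need to be brought online:

```shell
rem Attach the new VMDK in vSphere first, then inside the guest:
diskpart
DISKPART> rescan
DISKPART> list disk
DISKPART> select disk 4                rem example -- the new blank disk
DISKPART> create partition primary
DISKPART> format fs=ntfs quick
DISKPART> assign mount="D:\fileshare\marketing2"
DISKPART> exit
```

The empty target folder (D:\fileshare\marketing2 here) must exist on an NTFS volume before you assign the mount point.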
Very cool idea! I guess my last question is: have you seen how DFS behaves in a production environment with TBs of data? My DCs don't usually sync a TB of data, so I'd be curious to see how it handles a bit more replication traffic.
For just two servers in a replicated pair, it's no big deal. If you have 5,000 people making hundreds of tiny changes a second, I'd put a second, isolated network between the two VMs, but in normal, everyday usage scenarios you're in great shape as-is! Once you get that initial replication out of the way, you're done!