Hi guys,
our ESX servers are running on Dell PowerEdge servers, and the VMFS partitions are on EMC SATAII drives.
Every metaLUN is 200GB (2x 100GB LUNs) and is striped over 14 hard disks.
We have serious performance problems and I don't know if this is because of the SATA drives...
I looked around on the VMware website, but I could not find anything about ESX and SATA drives...
The reason VMware isn't saying anything about ESX on SATA drives is that it's not supported by ESX. Now, I do know of people who actually did get ESX to run on SATA, but I don't really know of performance problems.
SATAII is quite fast. Is this maybe NFS/NAS storage over iSCSI?
No, it's a regular SAN array over an FC HBA...
If it's a SAN array over FC, I don't really think ESX will know it's writing (in the end) to SATA disks.
I've never used a SATA-based SAN, so I don't have any reference point for their performance.
Sorry.
Direct-attached SATA isn't supported, though it can be gotten to work. On a SAN it shouldn't matter, as long as it meets your needs.
FC SCSI will give you more IOPS than SATA. The nature of the VMs would determine if you need them.
I have a lot of VMs running on SATA on a NetApp filer using the iSCSI protocol.
The LUN is striped over 22 disks. I haven't seen performance issues, but for the most part the servers aren't high I/O. I do have a Sybase development server on SATA; it hasn't had disk performance issues.
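The FC-vs-SATA IOPS gap can be put in rough numbers. A back-of-envelope sketch; the per-drive figures below are assumed ballpark values for 7200rpm SATA and 15k FC, not measurements from any particular array:

```python
# Back-of-envelope random-read IOPS for a striped RAID 5 group.
# Per-drive figures are assumed ballpark values, not measurements.
PER_DRIVE_IOPS = {"sata_7200": 80, "fc_15k": 180}

def raid5_read_iops(drives: int, drive_type: str) -> int:
    """Approximate random-read IOPS: all spindles in the group serve reads."""
    return drives * PER_DRIVE_IOPS[drive_type]

print(raid5_read_iops(14, "sata_7200"))  # 1120 with 14 SATA spindles
print(raid5_read_iops(14, "fc_15k"))     # 2520 with the same count of 15k FC
```

So with identical spindle counts, FC buys you roughly twice the random IOPS; whether that matters depends entirely on how I/O-heavy the guests are.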
I ran my production environment on SATAII disk for a while without knowing (ID10T error on my part).
Anyhow, there are a couple of things to think about with the design. First off, I have a suspicion you are running an AX150 series SAN vs. the CX series, but you know how ASSumptions work. The AX series is not known as a high-performing system.
It also sounds as if you might be running a RAID 5 group using all 14 disks (assuming disk 15 is the hot spare for the tray). That's roughly a 6.5TB RAID group if you are using the 500GB SATAII drives (multiple LUNs in the same RAID group are not going to decrease disk contention, i.e. increase performance).
I would normally recommend creating a RAID 10 group using two trays vs. all the disks on a single tray. We've gotten better performance that way on the CX series SANs.
If you only have the single tray available, then it might be better to look at breaking the RAID group into multiple RAID 5 groups (which obviously uses more disks for parity), but you get less contention on the disks. Try to keep the LUNs down around the 500GB size, and I'd almost try to design it as 1 RAID 5 group = 1 LUN (instead of the norm of 1 RAID 5 group = multiple LUNs).
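The capacity trade-off in the split is easy to work out. A quick sketch, using the 500GB drives from the thread; the split into two 7-disk groups is one hypothetical layout, not a recommendation for specific numbers:

```python
def raid5_usable_gb(disks: int, disk_gb: int) -> int:
    """RAID 5 usable capacity: one disk's worth of space goes to parity."""
    return (disks - 1) * disk_gb

# One 14-disk RAID 5 group vs two 7-disk groups (500 GB SATAII drives):
print(raid5_usable_gb(14, 500))     # 6500 GB in a single group
print(2 * raid5_usable_gb(7, 500))  # 6000 GB total across two smaller groups
```

You give up one extra disk's worth of capacity to the second parity set, in exchange for two independent groups that don't contend with each other.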
Outside of the SAN design there is also the question of what you have virtualized vs. what is causing performance issues.
Typically there are a couple of schools of thought with virtualization. One is to use it for HA (i.e., I need a secondary server for this service which is flexible enough to not be hardware-dependent and/or easy to recover). The other school is, I need to bring up the efficiency of my hardware (i.e., my domain controller is sitting there on a dual-core box idling along with my web server and file server, etc.).
If you are trying to run your entire environment on ESX, then you need some really careful planning to ensure you have accounted for the performance hit associated with running transactional systems in a virtualized environment (i.e., SQL, Oracle, Exchange, etc.). Sometimes those make sense to virtualize... sometimes they don't.
If you clarify the SAN you have, the RAID group size(s), as well as what you are running for the ESX host(s) and the guests, that might help us understand better.
I need to learn to type and spell at the same time.
OK, here is some info about our environment:
EMC CLARiiON CX3-20c with 4x DAEs with 500GB SATAII drives.
We are using metaLUNs for ESX.
1 LUN = 100GB over 7 drives on DAE no. 1 with RAID 5
1 LUN = 100GB over 7 drives on DAE no. 2 with RAID 5
=> So, 1 metaLUN = 200GB over 14 drives and two DAEs...
I hope you can understand^^
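For what it's worth, that layout adds up as follows; a quick arithmetic sanity check using only the figures from the post:

```python
# Sanity check of the metaLUN layout posted above (figures from the thread).
component_luns = 2           # one 100 GB LUN per DAE
lun_gb = 100
drives_per_component = 7     # each component is a 6+1 RAID 5 group

meta_gb = component_luns * lun_gb                            # 200 GB metaLUN
total_spindles = component_luns * drives_per_component       # 14 drives in total
data_spindles = component_luns * (drives_per_component - 1)  # 12 carry data

print(meta_gb, total_spindles, data_spindles)  # 200 14 12
```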
Yup... makes sense... and just blew most of my wonderful post out of the water.
Could you provide information on the hosts/guests (i.e., what services the guests are providing, whether they are single-vCPU or SMP guests, etc.)?
hmm
Mostly we have standard Windows 2003 Server guests with a single vCPU, like print servers, Citrix Secure Gateway, a fax server, Citrix application servers for up to 5 users, proxy servers... nothing special, I think.
We have some SAP test systems with 2 or 4 vCPUs, too.
The hosts all have two dual-core Xeon CPUs (no hyperthreading) and are connected to the storage with 4Gb HBAs.
The ESX system itself is running on local hard disks (2x Serial Attached SCSI drives in RAID 1).
You never said what performance issues you have.
If you look at the performance monitor, what do you see?
That's hard to describe...
We had all ESX servers on local hard disks (3x 500GB RAID 5, Serial Attached SCSI) before.
Now we have changed to the EMC SAN, and all VMs that were transferred to the new hosts are slower than before.
Windows itself feels slower; for example, opening Task Manager takes up to 10 seconds, and working over RDP is nearly impossible.
The ESX performance monitor shows nothing special.
I sometimes get the message: Host disk usage is high (red alert).
What type of metalun did you create? How many components does it show in Navisphere? What multiplier value did you specify?
OK, I am not an EMC SAN expert.
I suppose the storage can be reached through different paths.
When you take a look at one of your VMFS volumes, how many paths show up?
I'll give an example.
When I look at a VMFS volume on my EVA8000, I can see 8 paths, and I can select each of these paths for the LUN/VMFS.
Here is a picture of one of the metaLUNs:
http://home.arcor.de/tedshp/metalun.JPG
http://home.arcor.de/tedshp/metalun2.JPG
The type of the metaLUN is striping, not concatenation...
It shows 4 paths, but only one is active; 3 are standby...
Besides striping and concatenation, there are also hybrid metaLUNs (which are usually created by mistake). Based upon your pics you are using striping, but your element size multiplier should actually be 3, based upon the number of effective disks, rather than the default of 4 (which is based upon a 4+1 configuration).
While that issue alone shouldn't cause significant performance problems, often all the little errors added together become significant. For instance, not aligning the partitions at the ESX level and in the guest will cost you another 3%-20%. Carving other non-metaLUNs on one of the RAID groups but not the other will cost you some percentage, and so on.
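The alignment point can be checked with simple arithmetic. A minimal sketch, assuming the common 64 KB stripe element boundary (verify the actual element size for your RAID groups in Navisphere):

```python
SECTOR_BYTES = 512  # standard sector size

def is_aligned(start_sector: int, boundary_kb: int = 64) -> bool:
    """True if the partition's byte offset falls on the stripe-element boundary."""
    return (start_sector * SECTOR_BYTES) % (boundary_kb * 1024) == 0

# Windows 2003 defaults to starting the first partition at sector 63 -> misaligned:
print(is_aligned(63))   # False: 63 * 512 bytes straddles the 64 KB boundary
print(is_aligned(128))  # True: 128 * 512 bytes = exactly 64 KB
```

A misaligned guest partition means some I/Os cross a stripe element boundary and touch two spindles instead of one, which is where that 3%-20% penalty comes from.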
Can I change that while the LUN is active? I don't think so.
I also can't find any option like that...
No, you can't change that after the fact, same as partition alignment. The only choice if you want to optimize either is to re-create.
