VMware Cloud Community
beckhamk
Enthusiast

EqualLogic Cost/Performance

I am looking for some feedback on using the EqualLogic iSCSI SANs.

First, I would like to ask, truly, how well does a PS100/PS300 run for VMware using SATA disks? SATA scares the crap out of us and I just cannot see using the SATA versions.

Second, cost. We had gotten a few quotes from CDW, which were supposedly discounted, for a PS100, PS300, and PS3600. All I can say is WOW. We were quite surprised to see how much these EqualLogic systems cost compared to an FC CX3-10 or an FC HDS AMS200. For example, the AMS200 we had quoted, with 30x 300GB FC drives and 24/7 4-hour support, was about $47k before discounts. Then one PS100 with premium support was $43k, all before taxes. I understand that the EqualLogic boxes come with a lot of software included, but to us the FC costs seem like a no-brainer compared to the EqualLogic costs. Are we missing something here?

Any comments are welcome.

48 Replies
sstelter
Enthusiast

A quick word of caution - losing any one box in a group takes all spanned volumes offline. EqualLogic doesn't offer synchronous replication between boxes, so you have no protection against a chassis / backplane / rack / PDU failure. For example, if you're swapping a drive in one PS array in a group of 8 PS arrays and the backplane fails due to flex, all spanned volumes will go offline, with possible data loss. Perhaps that's why they've imposed a limit of 8 instead of 32. There is also no protection against a double-disk fault or an unrecoverable read error during rebuild - no RAID6 - so be careful when you consider larger groups of PS arrays.

On performance - it is truly amazing what they can do with that aging Broadcom processor - hats off to them. They won that eWeek Excellence Award for Storage Hardware for a reason.

For all the capabilities and performance of EqualLogic plus synchronous replication, protection against double disk faults, hardware independence, and a host of other capabilities, you need to look at SAN/iQ from my employer, LeftHand Networks. Seamless failover and failback of virtual machines - between buildings, between floors in a building, between racks in a datacenter. We won that eWeek Award for Storage Software for a reason.

christianZ
Champion

I agree - I like LeftHand for their features - but I was a little surprised by the poor performance here:

http://www.vmware.com/community/thread.jspa?threadID=73745&start=225&tstart=0

(results from aschaef).

I hope someone can find the bottleneck here.

sstelter
Enthusiast

ChristianZ,

Your test is great for comparing systems that have the same architecture - a controller shelf with disk shelves attached, or a single shelf of disks with integrated controllers - it allows the basic storage subsystems to be compared. It does not simulate the impact of multiple ESX servers with multiple guest operating systems each attached to data volumes. LeftHand's aggregate performance increases as you increase the connections to the system. Other storage systems tend to fall off in performance as you add load to the system.

The ESG performance analysis available on http://www.lefthandnetworks.com confirms this and shows how we scale in performance to disk (not just cache) from 6TB to 100TB. From ESG: "LeftHand has one of the best price/performance SAN storage systems that we've tested."

That said, it is still entirely possible that aschaef has a configuration issue. It would be useful to know which version of SAN/iQ they are running. I will PM aschaef and see if I can help.

See you later...

stisidore
Enthusiast

Check with technical support on the ability to expand group members beyond 8 with the new firmware. But, to your point, the release notes DO say it supports 8.

It needs to be pointed out that there is no limit to the number of groups, and you have the added flexibility of segmenting a group into pools (today it's 4) within a single group. This adds tiering functionality to meet departmental or application SLAs - again, all within one management pane of glass.

In regard to linear scaling of performance, check out the Veritest report (http://www.lionbridge.com/competitive_analysis/reports/equallogic/EqualLogic_PS_Series_Test_Final_Report.pdf), which speaks to this.

stisidore
Enthusiast

LH is fantastic at evangelizing the flexibility of IP SANs and virtualizing the storage environment, but...

Alas, at the end of the day, LH is still x86 architecture, and the enterprise rules of SAN design apply: multiple controllers, both maintaining access to the spindles with mirrored cache, eliminate the need for multiple nodes to overcome the SPOF of Intel motherboards.

Check out how million-dollar enterprise solutions such as the EMC Symmetrix and HDS Lightning are designed - two controllers accessing the same disks.

I'm curious why a backplane would fail due to swapping a hard drive. Can you give a personal account of this ever occurring? I agree that RAID 6 is a nice-to-have (I am a NetApp Certified Engineer as well, and familiar with RAID-DP), but in my years of supporting NetApp installs I never faced the "END OF THE WORLD" scenario implied by the need for RAID 6.

In terms of the aging processor, I would recommend you do a little more homework on Broadcom.

sstelter
Enthusiast

stisidore,

Are you running multiple, enterprise applications inside virtual machines on x86 servers today? You avoid the single point of failure of x86 motherboards through the magic of VMware Enterprise's HA, VMotion, and DRS. Think of LeftHand Networks' SAN/iQ as ESX for storage.

In my time on the planet, I have seen how standards and market forces win over elegant, proprietary hardware designs for the majority of applications. See Ethernet vs. Token Ring, PCI vs. MCA, VHS vs. Beta, and on and on. My electrical engineering degree cries about this often.

But the proof, as they say, is in the pudding. The performance of million dollar enterprise solutions like those you mentioned can be matched and exceeded by a cluster of HP ProLiant DL320s SAS storage modules powered by LeftHand Networks' SAN/iQ. See:

http://technet.microsoft.com/en-us/exchange/bb412164.aspx

On backplanes:

One of my former employers built storage arrays whose backplanes would flex on drive insertion / removal. This was enough to disrupt the signaling to neighboring drives. I have personally watched a RAID5 array go offline in an unrecoverable fashion due to a drive swap. I've also seen SANs go offline on controller insertion (misaligned pins caused a marvelous arc). Stuff happens.

RAID6 should be increasingly viewed as a must-have option for storage solutions - especially those employing SATA drives. The bit error rate of SATA drives combined with their large capacities makes the probability of an unrecoverable read error on rebuild fairly frightening. On a 4+1 array of 36GB drives, the probability was low. On a 6+1 array of 750GB SATA drives (two of these in a PS400E configured in RAID50), you have to hope that 6x750GB of data is readable for your rebuild to succeed (I've sketched the rough math just after the link below). For a discussion on the benefits of "drive burn in" (or lack thereof), please read:

http://www.networkcomputing.com/channels/storageandservers/showArticle.jhtml?articleID=198700132
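To put a rough number on that rebuild risk, here's a back-of-the-envelope sketch in Python. It assumes the commonly quoted unrecoverable-read-error rate of 1 in 10^14 bits for desktop-class SATA drives and independent errors per bit read - those are my assumptions, not figures from any spec quoted in this thread:

# Rough odds of hitting at least one unrecoverable read error (URE)
# while reading the surviving drives during a RAID5 rebuild.
# Assumed: URE rate of 1e-14 per bit (typical SATA), independent errors.
def rebuild_ure_probability(drives_read, drive_capacity_gb, ber=1e-14):
    bits_read = drives_read * drive_capacity_gb * 1e9 * 8
    return 1 - (1 - ber) ** bits_read

print(rebuild_ure_probability(4, 36))   # 4+1 set of 36GB drives: ~0.01 (about 1%)
print(rebuild_ure_probability(6, 750))  # 6+1 set of 750GB SATA: ~0.30 (roughly 1 in 3)

The exact numbers move around with the real bit error rate, but the trend is the point: bigger SATA drives make a clean rebuild much less of a sure thing.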

LeftHand has RAID6, but even better than that, we've got 2-, 3-, and 4-way replication (configurable by LUN/volume on the fly). This provides multiple copies of your data on the SAN, so you can tolerate multiple disk or controller failures without downtime. You get the ability to tolerate site failures for your virtual machines and data with automatic failover and failback. For example, with a simple 2-node configuration using the 12-drive HP DL320s in RAID5, I have 24 spindles at work. I can configure any or all of the volumes on the SAN such that I can lose 14 of these drives and still be running. I could put one of the two nodes in one building and the other one in another building - complete site-level redundancy for my virtual machines and data. With a single site, I could choose to split the nodes into different racks. And all of this scales - with 10 nodes, I could tolerate 5 controller failures (a toy model of why is sketched below). How well do your "million dollar enterprise solutions" survive 5 controller failures? How well do they hold up to the idiot from the vendor when he pulls the wrong power tail during a power supply swap?
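If it helps, here's a toy availability model in Python. The replica placement below is a made-up example for illustration, not SAN/iQ's actual layout algorithm - the point is simply that data stays online as long as no volume loses every node holding one of its replicas:

# A volume stays available if at least one node holding a replica is still up.
def volume_available(replica_nodes, failed_nodes):
    return any(n not in failed_nodes for n in replica_nodes)

# Hypothetical 10-node cluster; each volume 2-way replicated on adjacent nodes.
volumes = {f"vol{i}": (i, (i + 1) % 10) for i in range(10)}

failed = {1, 3, 5, 7, 9}   # lose every other node: 5 of 10 down
print(all(volume_available(r, failed) for r in volumes.values()))  # True

failed = {2, 3}            # lose two adjacent nodes: both replicas of vol2 gone
print(all(volume_available(r, failed) for r in volumes.values()))  # False

Which failures you survive depends on where the replicas land, so "tolerate 5 controller failures" means the right 5, not any 5.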

On Broadcom:

The homework I've done is from EQL's website. It shows that the 4-year-old, SATA-based PS100E turns out the same IOPS as the recently released, top-of-the-line PS3900XV. 60,000 IOPS to cache. The chipset spec sheets from Broadcom indicate a dual core processor at about 1GHz with an architecture that can't support 10 gigabit Ethernet. I'm sure Broadcom has some exciting stuff; none of it seems to be in PS-arrays though. Can someone explain to me why you can team the Broadcom interfaces in a $3,000 server but you can't team the interfaces in a $65,000 SAN using Broadcom hardware?

At the end of the day, you're locked in to the "enterprise" storage hardware vendor's price for a disk drive, rather than the market's price. You're locked in to that storage vendor's hardware roadmap. You'll get 10 gigabit when they're ready, not when you're ready. You'll get RAID6 when they're ready, not when you need it.

These are my opinions and they are not necessarily shared by my employer, LeftHand Networks.

adehart
Contributor

CrazeX,

Can you elaborate a bit more on your reasons for choosing Compellent over EqualLogic?

We're in the evaluation phase right now and are considering EqualLogic and Compellent. We'll actually have a Compellent and EqualLogic unit on hand soon to test with. We're still on the fence with LeftHand too and may yet get a demo unit to look at. Our original plans for one to be here at the same time as the other arrays fell through.

After looking at both EQL and Compellent solutions extensively, the volume of support for EqualLogic, while clearly overwhelming, seems to be mostly due to more customers and more voices. In my book, popularity doesn't necessarily mean better or best.

I like EqualLogic's all-in-one pricing approach, and while the à la carte pricing from Compellent puts them at a slight disadvantage with me, it seems that when raw storage needs to be added, it is significantly less expensive to do this with Compellent. Just being able to mix and match FC and SATA drives is a big plus.

Also, after looking at both software interfaces, Compellent's just seems very straightforward, elegant, and feature-rich compared to EqualLogic's, which is saying a lot since EqualLogic's is quite good. The stats and data provided on the array in Compellent's software are great.

What other things am I missing or not missing?

So far, I appreciate all the sharing of opinions here.

I also wouldn't mind some suggestions on how best to test these arrays when evaluating them for use with VMware. I've got some ideas but would love to hear what others are doing to put these through their paces.

Thanks.

Anthony

stisidore
Enthusiast

Sorry for the late reply - I have been away for a while.

The one consistent theme I notice from the guys at sstelter's company is the almost immediate reaction of vendor bashing. Instead of sticking to solving the user's basic problem, the conversation gets muddled with chipset manufacturers or "what we have that they don't".

Other examples -

http://www.vmware.com/community/thread.jspa?messageID=754927&tstart=0#754927

I don't get it - you have a decent product, and I am confused why you need to preach other vendors' road maps instead of sticking to yours.

skip181sg
Contributor

"sstelter" - Since you have obvious passion could you please post the negative aspects of LeftHand. You are very good at talking about the warts of others - I think its fair to the community for us to hera about LeftHands skeletons and warts aswell.

Thank you so much for doing that

theguber
Contributor

Sstelter - some points.

1. Market forces versus elegant proprietary solutions - Interesting point, but there is also a litany of proprietary technologies that have become industry de facto standards. I think your statement is a stretch.

Microsoft Windows versus everyone else (Linux, Sun's JDS, Mac, and more) - 90%-plus desktop share and 50%-plus server share.

Cisco IOS versus everyone else (e.g., OpenROUTE and more).

Telephone exchange voice compression - dominated by ECI.

Just to name three that we all use every day, I am sure... :) All proprietary.

2. Processors - I think EqualLogic is using a 64-bit dual-core NPU, meaning all its cycles are dedicated to shifting packets - the same reason Cisco and other data-oriented vendors use that architecture - rather than a general-purpose x86 CPU designed for general-purpose computing like word processing, playing Doom, generating graphics, surfing the net, etc. What's the CPU got to do with it anyway, when the product keeps spanking the competition in independent tests - Microsoft ESRP, VeriTest, Network Computing, and so on? Really lost me on that one.

3. Flex - Totally agree with you. Big issue. How is that mitigated on LH? I believe EqualLogic's array architecture has no backplane or arbitrated loop, rather a set of dedicated point-to-point connectors between the drives and controllers. Nothing to "flex" except your hand muscles as you insert the drive.

4. 10Gb - How do you fill a 10Gb pipe with a standard off-the-shelf server that does not have the OS or bus speed to support it? I think 10Gb is relevant in the backbone, but not so relevant at the server or the SAN itself, unless you can fill an aggregate of 10Gb. Besides, the 10GBASE-T interface is not yet finished, so widespread adoption is some ways off. Apart from very specific customers (like real-time film rendering, CAD/CAM, and similar), the mainstream mid-tier to F500/F100 enterprise just doesn't need it, nor ask for it.

I think we need to keep the vendor bashing to a minimum, or best of all eliminate it.

sstelter
Enthusiast

I'm glad to do that, skip181sg - feel free to PM me with any questions you might have after reading this post.

First let me say that in my opinion, a single EqualLogic box is the perfect solution for many customers, especially when the PS array has redundant controllers. I only chimed in on this post when the talk of spanning PS arrays arose. I admit that I am a freak about box-level redundancy and protecting data. I apologize if it turned into vendor bashing.

Negative aspects of LeftHand Networks:

1. Our solutions are not the cheapest $/TB storage on the planet. You can get more raw capacity for your money elsewhere. You can't get more features, but you can get more capacity.

2. For extremely small environments, we may not be an appropriate fit, although our Virtual SAN Appliance (VSA), announced at VMworld, brings the advantages of SAN/iQ and VMware to smaller shops and remote offices. You can download the VSA and even manage the storage on your EQL array if you like: http://www.lefthandnetworks.com/campaigns/vsa.php

3. For extremely large environments >200TB, adding a controller every time you add disk may not be the most efficient means to add capacity, especially if you don't need the additional performance. Note that as individual drives increase in capacity, the definition of "extremely large" changes. See #1.

4. If you prefer a heterogeneous environment, where your storage comes from one vendor and your servers from another, LeftHand might not be the best for you. Although you could always use HP servers for compute and IBM servers for storage, which could actually give you more leverage in pricing from one or the other.

5. If you see no value in buying storage over time, instead of all at once, we're probably not the best fit for you.

6. We use liquid nitrogen to remove our warts and we bury our skeletons. Warts are hard to revive, but ask about LeftHand when you next meet with your EqualLogic rep and I'm sure they will wheel out the skeletons for you. It is true that four years ago, the state of the art in x86 would get outperformed by the same box EqualLogic will sell you today as brand new. But it isn't true anymore.

sstelter
Enthusiast

Hi theguber,

Interesting points. Sorry if I wasn't clear. To clarify:

1. Please note that I was talking about proprietary hardware - I appreciate that at least two of your examples involve software (Windows, IOS) that runs on different hardware platforms. Decoupling the SAN software from the SAN hardware allows customers to win, in much the same way that decoupling the operating system from the underlying hardware allows customers to win. You get more choices.

2. The processor architecture question talks to support for things like full data rate on the SAS side and sufficient network pipes to handle the traffic that the drives in the unit can generate. But that's not all. When you think about other things that add value - advanced block manipulation, de-duping, IPSec, compression, etc. it is nice to know that you can add another processor or more memory to your HP DL320s or IBM x3650 to extend the capabilities of your existing SAN hardware. Most SAN vendors would much rather sell you a new, bigger, shinier version of their box. LeftHand's openness allows us to leverage enterprise-class RAID technology like HP's ADG (RAID6) without taxing the CPU. (One quick aside - are you aware that the CX3-80 from EMC has Intel Xeon inside? I'm pretty sure I wouldn't want to play Doom on that.) Intel makes lots of processors really well and improves them frequently. Our customers benefit from that. HP and IBM build lots of enterprise-class servers really well and they improve them frequently too. Our customers benefit from that as well. That is my point. I hope that is clearer now.

3. Glad we agree on something ;-). We use the same point-to-point technologies that other modern storage vendors use - SAS and SATA. We all have backplanes - that's what enables hot swap. So how does LeftHand mitigate flex? Box-level redundancy. In a LeftHand Networks storage cluster, members of the cluster can fail in their entirety without impacting the accessibility of data. So you have enterprise-class servers from HP and IBM, you know, the ones that have been running critical applications in the world's largest datacenters, handling storage blocks for you. Even if one (or more) of them fails in your storage cluster, your data is accessible. Great defense against chassis flex, but really great defense against site failure, power failure, PDU failure, circuit failure, etc. So how much life insurance should you buy for your data? As much as you can afford. If you pay the same money and get the same performance and the same usable capacity (after RAID, snapshot reserves, etc.), why would you accept less protection?

4. How are we ever going to fill 100Mb/s? How are we ever going to fill 1Gb/s? These questions have been asked with each iteration of network technology, yet now it is difficult to buy a laptop without a 1Gb Ethernet port. 10Gb Ethernet is coming. How ready do you want to be? As an aside, we are able to saturate 4x 1Gb/s pipes with 12 SAS drives today. On the theoretical front, the importance of higher speeds at the network is even more dramatic - 12x 3.0Gb/s SAS = 36Gb/s >>> 10Gb/s. More practically, a single SAS drive could do 100MB/s with the right payload (and probably without trying too hard), which gives 1200MB/s > 1000MB/s (quick numbers below). 6.0Gb/s SAS is on its way. Solid State Disks are making progress. Apparently you're talking to different people in the F500 than we are.
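For anyone who wants to check my arithmetic, here it is spelled out in Python - note the 100MB/s-per-drive figure is my own rough assumption for a streaming workload, not a measured spec:

# Raw SAS lane capacity vs. a single 10GbE port
print(12 * 3.0)       # 36.0 Gb/s of SAS lanes, far more than 10 Gb/s
# Sustained throughput if each of 12 drives streams ~100 MB/s (assumed)
print(12 * 100)       # 1200 MB/s, more than the ~1000 MB/s usable on 10GbE
# And today's 4x 1GbE front end tops out around
print(4 * 1000 / 8)   # 500.0 MB/s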

Ok, I spit all of that out without a single negative comment about another vendor. Tough to do when you're on the receiving end of it on a daily basis.

Thanks for the dialog - it is an interesting discussion to be sure.

selfservicepric
Contributor

Your costs are too high - get a Self-Service Quote on EqualLogic at www.federalappliance.com. No hassles and no registration, just takes a couple of minutes.

dale

pasikarkkainen
Contributor

http://www.equallogic.com/uploadedfiles/resources/product_documentation/ps3900xv-60000-users.pdf

There you have an EqualLogic benchmark setup with 12 arrays in a group. It was done by EqualLogic itself and verified by Microsoft (because it's an Exchange performance benchmark).

netbsoft
Contributor

I don't know how I ended up on this chain, or what it has to do with VMware, but for some oddball reason I ended up reading it. I would like to point out a few anomalies in the discussion/debate:

First of all, earlier in the chain pasikarkkainen points out that EqualLogic claims 60,000 IOPS per array. Based on this Microsoft ESRP test run, using their 0.4 IOPS-per-user setting, the 12-array system produced 24,000 IOPS, which is less than the published 60,000 IOPS spec for one array. This is why I can't stand vendors that publish 100% cache hit numbers, because customers will NEVER experience it. It's kind of like painting the picture of a Ferrari on the side of a Volkswagen.

In sstelter's defense, it appears that EqualLogic configured this ESRP test with 4 groups of 3 arrays, so the Exchange volumes only spanned 3 arrays. The whole idea behind clustering is getting as many spindles behind an Exchange group volume as possible. The excuse doesn't hold water either. If you're worried about corruption or a virus problem on a single LUN, then take some snapshots.

Based on the paltry backup performance of 470MB/s in the report (EQL's spec claims 300MB/s per array), it appears that EqualLogic's "clustering" software may be nothing more than a logical volume manager that just happens to manage more than one array.
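Spelled out in Python, and assuming the 470MB/s figure covers the whole 12-array configuration (that's how the report reads to me):

arrays = 12
per_array_spec_mbps = 300                    # EQL's claimed backup rate per array
print(arrays * per_array_spec_mbps)          # 3600 MB/s that the spec sheet implies
print(470 / (arrays * per_array_spec_mbps))  # ~0.13 -> measured backup is about 13% of that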

Don't bother to reply, because I'm not a big blogger and probably won't be back (unless I have a VMware problem), and I also won't be buying storage from a "smoke-and-mirrors" vendor.

pasikarkkainen
Contributor

Yes, EqualLogic's 60,000 IOPS per array is from cache. They call it "cached hits" themselves. Nothing weird in that; it has been proven with tests.

Performance with random-I/O applications (like this Exchange 2007 benchmark) is a different game.

In the ESRP test with 60,000 Exchange 2007 users, EqualLogic achieved 174 IOPS per disk, or 313 users per disk in other words. The EqualLogic units in the test used 192 15k rpm disks (192 disks for 60,000 users).

The EqualLogic result is the best ESRP result per disk, AFAIK.

For example, the EMC CMX3 from the same test with 84,000 users used 448 15k rpm disks (448 disks for 84,000 users, or 187.5 users per disk).

And yes, EqualLogic units are spanning and striping the data across arrays; it's not just a "volume manager".

netbsoft
Contributor

I don't understand your math.

0.4 IOPS per user and 60,000 users is 24,000 IOPS, which divided by 192 disks is 125 IOPS per disk. You must be from EqualLogic.

netbsoft
Contributor

Oh, and one more observation:

If that's the DMX-3 result you're referring to (not sure what a "CMX3" is), then I smell a rat. The read and write latency on the databases and logs for the DMX-3 were much lower than EqualLogic's. For example, the DMX-3 read latency on most databases was 10ms and the write latency was 3ms, and log reads and writes were 1ms or less. EqualLogic's read latency hit a whopping 18ms on some mailboxes, write latency was 11-16ms, and log writes were 2ms or more. This tells me that the DMX-3 had much more headroom and they were not bringing it to the knife edge like EqualLogic.

This means that EqualLogic's results are only superficially better (remember the Ferrari-painted-on-a-Volkswagen analogy?). Who knows what the DMX-3 can hit when stressed to the limit. Therefore your IOPS-per-disk superiority claim is somewhat meaningless, kind of like 60,000 IOPS and 300MB/s.

If EqualLogic called on me I would run and hide.

marcfarley
Contributor

This is Marc Farley from EqualLogic. The community member netbsoft is most probably John Spiers, CTO of LeftHand Networks, a company that competes with EqualLogic. Run a WHOIS on the domain name netbsoft.com and you will find that the administrative and technical contact for this domain is John Spiers of Louisville, Colorado. Alternatively, you can look at the attached PDF file of the WHOIS domain registration information for netbsoft.com.

Clearly, his comments and arguments regarding storage products lack credibility.

Marc Farley

pasikarkkainen
Contributor

60,000 users divided by 192 disks is 312.5 users per disk. Actually, this would be more if you didn't count the hot-spare disks.

In the test, the measured IOPS ("average database disk transfers/sec") was 29,068.47.

Each array has 2 hot-spare disks, so there were 168 disks in use in this test.

29,068.47 / 168 = 173 IOPS per disk.
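For anyone following along, both calculations in this sub-thread can be reproduced from the figures quoted above - here's a quick Python check (which disk count is the "fair" denominator is exactly what's being argued):

users = 60_000
arrays = 12
disks_total = 192                        # disks shipped across the 12 arrays
disks_active = disks_total - 2 * arrays  # 168, excluding 2 hot spares per array

# netbsoft's number: Microsoft's 0.4 IOPS-per-user profile over all 192 disks
print(users * 0.4 / disks_total)         # 125.0 IOPS per disk

# my number: measured database transfers over the 168 active disks
measured_iops = 29_068.47
print(measured_iops / disks_active)      # ~173.0 IOPS per disk
print(users / disks_total)               # 312.5 users per disk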

And sorry about the wrong EMC product name - I obviously meant DMX-3.
