TristanT
Contributor

vSphere and IBM XIV Storage

You run XIV for vSphere workloads in your datacenter. You love it. Your performance is good. You enjoy stability, scalability, and ease of management. The hardware and software are reliable. IBM support is fantastic.

So...please tell me more! I read all kinds of positive and constructive things about storage solutions from HP, NetApp, EMC, Compellent, etc. I read very little from vSphere environments that are running on XIV. Google and VMTN searches yield far fewer search results with XIV than all other storage solutions.

You want to tell me all about XIV and what you love about it. You know you do... Thanks in advance!

0 Kudos
22 Replies
ChrisDearden
Expert

We run a couple of XIVs - they seem OK so far. Pretty easy to admin, but there isn't much scope for user tuning of performance (IBM claims this isn't needed, however!).

They haven't failed yet, so I can't comment on the reliability.

If this post has been useful , please consider awarding points. @chrisdearden http://jfvi.co.uk http://vsoup.net
TristanT
Contributor

Thanks for the feedback Chris. There just isn't a lot of customer/user feedback out there on this platform. If you know of any good sources - please send them my way. Take care and thanks again!

0 Kudos
Brian_Laws
Enthusiast

We're running about 215 VMs on 16 hosts on an XIV. This includes something like 85% of our Tier 1 apps, including all of Exchange 2007, sales systems, heavy SQL databases, etc. I'm not the storage guy, so I can't really speak to the administration side of it (I've been told, though, that it's immensely easy, almost a non-event). But I can tell you that the performance is extremely good. Our slowest datastore showed a latency of < 1 ms a couple of weeks ago. So we have no complaints at all on the performance front. The cost is good as well. In the 9 months or so since we've had it, we've lost around 3 or 4 drives. There are basically no tools for the XIV, though, which is our biggest complaint. We're told they'll start coming out at the beginning of the year, though. I can't wait for some of the storage migration pieces to come out.

When we researched the XIV last fall, we found virtually nothing at all about it. That didn't help us feel very confident. However, so far it's proving itself. Give us some tools (vStorage APIs) and we'd be very happy with the box.

0 Kudos
amvmware
Expert

Brian

I have a customer that is using one of these IBM SANs with SATA storage. I have been told this is not a performance issue, as the data is primarily held in memory - is this statement correct?

0 Kudos
runclear
Expert

We have two XIV units here for VMware....

Currently configured with 6 nodes (3650 M2s) with dual Brocade 1020 CNAs running FCoE.... no complaints here...

| VCP[3] | VCP[4]

-------------------- What the f* is the cloud?!
0 Kudos
Brian_Laws
Enthusiast

I don't know if that's true or not (I'm not the storage guy). I don't believe so, though. I believe they're able to achieve high performance by striping across more than 180 spindles in 15 servers, with three of them serving as heads/supervisors. Unfortunately, I don't know enough about how the XIV works to tell you.

0 Kudos
Hpapy_Gilmore10
Contributor

We're currently running approx 400 VMs over 20 hosts, as well as a host of other P-Series and dedicated servers, all attached to 4 x XIV arrays, and no complaints here; simple to manage, simple to expand, simple to provision, more than reasonable performance.

I think this type of storage is an administrator's dream, as there is no tuning required, no hot spots can occur, and there is no upfront disk layout design required; it's pretty much plug and play, which for this type of storage is amazing. It has detailed reports built into the GUI which can show individual volume performance to help diagnose any potential performance issues, which will usually be in the SAN fabric or the hosts themselves and not the XIV.

In terms of resilience, another 2 thumbs up. Each array has 3 x UPS within the rack. Unfortunately, one of ours was damaged during installation, so we only had 2 operational for a short while, and wouldn't you know it, we had a major power failure in the dataroom - but the XIV stayed up and we had no storage issues at all once power was restored and the hosts came back online. Disk failure rates also seem pretty low; the IBM salesman will tell you this is because all of the disks are in use and running at consistent speeds, not being thrashed as can happen in more traditional arrays. I don't know if this is true, but I do know that when we do have a disk failure, an IBM engineer turns up the next day to replace it without us doing anything....

People tend to get scared of XIV because of the SATA drives, but I always explain that XIV isn't meant to provide jaw-dropping performance; it's supposed to provide good performance in a nice, easy to manage, easy to grow, single-rack solution that is competitively priced, and I think it does exactly what it says on the tin....

The only thing I would be slightly careful of is that you'll only ever get the best performance out of a fully loaded XIV unit (all 15 modules), if you have a partial array you obviously have less spindles and less cache to utilise.

FYI - I don't work for IBM, I work for a large outsourcer who is very happy with XIV... :)

0 Kudos
amvmware
Expert

Thanks for the update - some really interesting info.

When I heard this was SATA storage, my concerns were around performance and reliability. Someone suggested to me that performance was maintained by having lots of cache, so the data was retrieved from the cache rather than the disks - is this statement correct from your experience of this storage?

0 Kudos
Hpapy_Gilmore10
Contributor

Yeah, I think it's probably fair to say that XIV relies on 3 key elements to provide the performance from what are effectively slow SATA drives:

1 - large numbers of spindles behind each block of data. Everything within XIV is split into 1 GB chunks and spread over every single disk in the entire array. In a fully loaded XIV there are 160 spindles, which in my opinion is a lot, and more than compensates for the slower speed of SATA compared to faster FC disks.

2 - large amounts of cache; a fully loaded XIV currently has 120 GB (and up to 21 CPUs), which again is quite a considerable amount compared to more traditional arrays, and I'm sure these numbers will rise over time as the technology continues to evolve. Because it uses standard components, the move to quad-, hex- or octo-core processors, for example, should be relatively quick.

3 - concurrent processing; not only is all of the data within XIV spread over every disk spindle, there is also a large amount of I/O connectivity provided by the 6 Data Modules (there are 24 x 4 Gb FC ports, 12 of which are used for hosts; the rest are for mirroring and data migrations) which, with clever zoning, can ensure the heavier host servers get lots of potential bandwidth.

Another thing that worries people about XIV is that its architecture is still 4 Gb based, but I don't see any real issues with this at the moment due to it having lots of concurrent bandwidth, and I'm sure 8 Gb can't be that far away.

Reliability also isn't an issue, in my experience. Over the past 12 months we've probably had approx 1 disk failure a month across the 4 arrays, which, out of 640 disks, I don't think is too bad. The rebuild times for failed disks are also very quick: it takes a maximum of 30 minutes to rebuild a failed disk, which means the data is very quickly re-protected. Other than the DOA UPS mentioned in my earlier post, to date we've not had any other type of component failure within the XIV arrays, so I can't comment on how well they continue to operate when something major does fail, but they are supposed to be fully N+1.

I can also confirm that firmware upgrades are non-disruptive.
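Point 1 above (1 GB chunks spread over every spindle) can be illustrated with a toy Python model. This is not IBM's actual placement algorithm - just a deterministic, hash-based sketch showing how pseudo-random placement of a chunk and its mirror copy keeps per-disk load roughly even across 160 spindles:

```python
import hashlib
from collections import Counter

def place_chunks(num_chunks: int, num_disks: int) -> dict:
    """Toy model of XIV-style placement: each 1 GB chunk gets a
    primary and a mirror copy on two distinct disks, chosen by hash."""
    placement = {}
    for chunk in range(num_chunks):
        # Hash the chunk id to pick a primary disk deterministically.
        digest = hashlib.sha256(str(chunk).encode()).hexdigest()
        primary = int(digest, 16) % num_disks
        # Offset in 1..num_disks-1 guarantees the mirror lands elsewhere.
        mirror = (primary + 1 + chunk % (num_disks - 1)) % num_disks
        placement[chunk] = (primary, mirror)
    return placement

# 10,000 chunks (~10 TB at 1 GB each) over the 160 spindles quoted above.
placement = place_chunks(10_000, 160)
load = Counter(disk for pair in placement.values() for disk in pair)
print(f"busiest disk: {max(load.values())} chunks, "
      f"quietest: {min(load.values())} chunks")
```

The per-disk chunk counts come out close to even with no tuning at all, which is the property that lets this kind of layout avoid hot spots and disk layout design.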

0 Kudos
legisilver
Contributor

180+ VMs

5 PowerEdge M610 hosts in a Dell M1000e chassis

8 CPU x 2.659 GHz

Intel(R) Xeon(R) CPU X5550 @ 2.67GHz

47.99 GB RAM

Datastores are LUNs on an EMC CLARiiON

I'm currently in the planning phase of migrating from an EMC CLARiiON SAN to two XIVs (one for prod, the other for the DR site).  Things are going okay, and I'll try to update this thread as the process goes on.

Our XIV prod rack will have all bays filled with 1TB drives (I wanted the 2TB, trust me) and the DR site will be a half-height with 2TB drives.

In addition, we are also buying into an nSeries NetApp, and I'm not exactly 100% certain how this works, as I was just brought onboard around a week ago.  I plan on going to the nSeries classes and getting some official training, but I've heard that the XIV is VERY easy to use and I can't wait to see the interface.  From what I've heard so far, the nSeries will host all of our shares from our file servers using the XIV, and we'll only really use the XIV directly for our Unix environment(?).

There's still a lot to be worked out, and I'm learning about this company at a breakneck pace, so I'm sure things will become clearer as time goes on.  I'll be sure to update this post.

~Brandon

01.26.2011


0 Kudos
meistermn
Expert

Do you use a central backup solution with XIV, like NetApp with SMVI, or do you use a backup client in every VM?

0 Kudos
Desparado9
Contributor

Recently migrated a 3-host vSphere cluster, plus AIX and Windows storage, from an older IBM storage array to an XIV with 2 TB drives ... very pleased so far.

a. migration tools were amazing; AIX and Windows hosts (2 TB+) moved with very little host downtime. Data copies from the old storage to the XIV out of band to the host, while the host is happily back in production on the XIV. VMware Storage vMotion used for VMs .. zero downtime

b. XIV GUI is hugely advanced compared with my old storage server interface

c. Storage changes and snapshots are virtually instantaneous

d. Thin-provisioned LUNs and snapshots ... yeah

e. XIV GUI real-time and historical performance data granularity: IOPS/latency by interface, LUN or host

f. xCLI ... yeah

g. performance is very good ..... sub ms latency

h. reliability ... no downtime to date / no single point of failure that I'm aware of

Wish list: a script that snapshots my critical VMs, initiates an XIV snapshot, and then cleans up/deletes those VM snapshots
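That wish-list workflow can be sketched as an orchestration skeleton in Python. The three callables are stand-ins for the real operations (e.g. pyVmomi snapshot tasks and an XCLI invocation - the exact command names in the comments are assumptions, not tested against an XIV); the sequencing and guaranteed cleanup are the point:

```python
def consistent_xiv_snapshot(vms, vm_snapshot, xiv_snapshot, vm_snapshot_delete):
    """Take a VMware snapshot of each critical VM, capture one XIV
    volume snapshot while the VMs are quiesced, then remove the (now
    redundant) VMware snapshots - even if the XIV step fails."""
    taken = []
    try:
        for vm in vms:
            vm_snapshot(vm)       # e.g. a quiesced pyVmomi CreateSnapshot_Task
            taken.append(vm)
        xiv_snapshot()            # e.g. an XCLI snap_create subprocess call
    finally:
        for vm in taken:
            vm_snapshot_delete(vm)  # e.g. RemoveSnapshot_Task per VM

# Dry run with recording stubs instead of real vCenter/XIV calls:
log = []
consistent_xiv_snapshot(
    ["sql01", "exch01"],
    vm_snapshot=lambda vm: log.append(("vm_snap", vm)),
    xiv_snapshot=lambda: log.append(("xiv_snap",)),
    vm_snapshot_delete=lambda vm: log.append(("vm_del", vm)),
)
print(log)
```

Passing the operations in as callables keeps the ordering/cleanup logic testable without a live vCenter or XIV; in production you would wire in the real API calls.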

0 Kudos
legisilver
Contributor

The IBM guys are talking about a migration utility that will move the LUNs.  Did you use this?

0 Kudos
idle-jam
Immortal

You mean SVC to perform the LUN migration?

0 Kudos
ChrisDearden
Expert

I'm told by my storage people there is a built-in utility for LUN migrations which makes it really quite straightforward.

If this post has been useful , please consider awarding points. @chrisdearden http://jfvi.co.uk http://vsoup.net
0 Kudos
legisilver
Contributor

vSphere has a Data Recovery plugin, but to be honest it doesn't seem to fit our environment due to the large number of VMs we're backing up.  You almost have to have a second storage area for the backed-up VMs, but that's a lot of space to maintain and a short window in which to perform backups.  Like I said earlier, I was just brought onboard, but I'm going to look into multiple backup schedules or something, because right now that Data Recovery VM just isn't cutting it.

0 Kudos
legisilver
Contributor

If you're talking about Storage vMotion, no, I was referring specifically to migrating the LUNs themselves.  But yes, our current plan is to present new storage from the XIV to vSphere as new datastores (I was told 1 TB LUNs and no higher!) and then use Storage vMotion to move the VMs to the new SAN.  Right now it's just a matter of identifying the critical VMs and making sure that they aren't moved until we've seen a couple of other VMs go flawlessly.

Anyone else hit me up because I'm knee-deep in this now.
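A rough way to plan that 1 TB datastore layout is simple first-fit-decreasing packing. The VM names and sizes below are hypothetical, and the 100 GB headroom per datastore is an assumed allowance for snapshots and swap, not a quoted figure:

```python
def plan_datastores(vm_sizes_gb, lun_size_gb=1024, headroom_gb=100):
    """First-fit-decreasing packing of VMs into fixed-size LUNs,
    leaving free headroom per datastore for snapshots/swap files."""
    usable = lun_size_gb - headroom_gb
    datastores = []  # each entry: [used_gb, [vm names]]
    for name, size in sorted(vm_sizes_gb.items(), key=lambda kv: -kv[1]):
        if size > usable:
            raise ValueError(f"{name} ({size} GB) won't fit a {lun_size_gb} GB LUN")
        for ds in datastores:
            if ds[0] + size <= usable:  # first datastore with room wins
                ds[0] += size
                ds[1].append(name)
                break
        else:
            datastores.append([size, [name]])  # open a new 1 TB LUN
    return datastores

# Hypothetical VM inventory (sizes in GB):
vms = {"sql01": 600, "exch01": 500, "web01": 120, "web02": 120, "file01": 400}
for i, (used, names) in enumerate(plan_datastores(vms), 1):
    print(f"datastore{i}: {used} GB used -> {names}")
```

Sorting largest-first before placing keeps the big VMs from being stranded; with the sample inventory the five VMs fit in two 1 TB LUNs.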

0 Kudos
legisilver
Contributor

Alright, mid-project, so here's an update:

  • IBM CEs came out and set up the XIV as expected.
    • We had heating issues immediately!  Make sure you get the heat output figures for the XIV and talk to your datacenter cooling team; this thing runs so hot it's unbelievable.
    • In order to cool our datacenter we had to force on the 4th cooler, and it's barely keeping it under 74 degrees.
  • We also purchased an n6040 IBM Storage System, which is actually just a re-branded NetApp.  In fact, I think IBM simply changed the front face-plate, TBH.
    • This thing has 1TB of SSD "Flash Cache", which is supposed to speed up pretty much all traffic due to the read speed.
    • A NetApp (or Network Appliance) is basically a gateway device for other storage.  So, you buy a NetApp and then you can add any amount of storage you want by connecting it to the NetApp on the back end.  In our case, we are connecting the XIV.
    • The NetApp can act like a SAN or a NAS device or a combination of anything you want.
    • I'm EXTREMELY IMPRESSED with the NetApp and will be attending training later to learn more about the CLI functions (GUI-only so far).
  • The NetApp is also the gateway for our VMware datastores.
    • This is the most important part of it all.  With the NetApp as our head in front of the XIV, we are seeing nearly 60-70% data deduplication on our datastores.  Oh yeah, the NetApp offers data dedupe as well. :)

More to come later as I test.  Right now we've made several new datastores on the NetApp (hosted on XIV space), and I'll post more once I have IOPS and read/write speed info, as we're migrating our first server this weekend!
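The mechanics behind a dedupe figure like that are easy to reproduce in miniature: block-level dedupe hashes fixed-size blocks and stores each unique block only once. The sketch below uses synthetic data (three cloned "VMDKs" sharing a common base image); the 4 KB block size matches NetApp's dedupe granularity, but everything else is illustrative:

```python
import hashlib
import os

def dedupe_ratio(volumes, block_size=4096):
    """Estimate block-level dedupe savings by hashing every fixed-size
    block and counting how many are unique."""
    seen = set()
    total = 0
    for data in volumes:
        for off in range(0, len(data), block_size):
            total += 1
            seen.add(hashlib.sha256(data[off:off + block_size]).digest())
    return 1 - len(seen) / total  # fraction of blocks that dedupe away

# Synthetic example: three "VMDKs" cloned from the same 256 KB base
# image, each with one unique 4 KB block appended.
base = os.urandom(64 * 4096)
vols = [base + bytes([i]) * 4096 for i in range(1, 4)]
print(f"estimated dedupe savings: {dedupe_ratio(vols):.0%}")  # prints "estimated dedupe savings: 66%"
```

With 64 shared blocks plus one unique block per clone, the savings work out to 1 - 67/195 ≈ 66% - the same ballpark as the 60-70% reported above, and exactly why cloned VM datastores dedupe so well.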

0 Kudos
AureusStone
Expert

I am not a storage guy, but that is maybe why I like the XIV.  It is easy to manage.

At the place I used to work we had all EMC SANs.  They were pretty good, but it would always take a fair while to provision storage, as the SAN people had to write up and QA scripts to add it.  With the XIV it is a two-second job.

There are more disk failures, but they are easily managed.

I guess the main advantage of EMC is that you will always get new features first, for obvious reasons.  XIV currently does not support VAAI on vSphere.

0 Kudos