VMware Cloud Community
kellino
Enthusiast

10Gb iSCSI or 4Gb Fibre Channel?

This may turn out to be an active thread. :)

We are looking at a new infrastructure to support about 50 initial VMs (production environment), and now that 10Gb Ethernet is supported, I am wondering what the pros and cons are for 10Gb iSCSI versus 4Gb Fibre Channel.

I see this question was raised here in the past, but not since 10Gig support was added in ESX 3.5 Update 1.

Not being an expert, my initial thought is that 10Gb should be less expensive and may even offer better performance under high loads, but it is still a bit "new", while Fibre Channel is the "safe", traditional choice, and I'm not sure what "gotchas" to be aware of.

I am curious to hear experiences from others and their thoughts about which network technology would be the better choice for new VI3 deployments, hosting production servers.

Thanks!

azn2kew
Champion

We've used 10Gb NFS and iSCSI solutions and they work great with decent storage gear such as NetApp or Dell EqualLogic. It depends on whether you already have FC infrastructure (fibre switches and backbone) in place; if so, staying with FC makes sense. If this is a new project, 10Gb iSCSI/NFS is the way to go to save money and simplify management and provisioning. NFS with NetApp deduplication in particular can save 40%+ of disk space, and support is easier.

One of our clients hosts 70+ VMs per ESX host over NFS and it runs great as well. Just make sure you build redundancy into your NFS and iSCSI design, both at the NIC level and with EtherChannel on the switches, for failover purposes. If you want better bandwidth management, Neterion 10Gb NICs are great as well. I believe there is roughly 30% bandwidth overhead with iSCSI/NFS. There are good best-practice guides and blogs on this topic at www.vmware-land.com.
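As a side note on that overhead figure, here is a rough back-of-envelope sketch (my own model, assuming a 64 KB data PDU and default header sizes) of the wire-level framing cost of iSCSI at standard and jumbo MTU. Header overhead alone works out to only a few percent; the ~30% figure above is presumably dominated by CPU-side protocol processing rather than headers.

```python
# Back-of-envelope framing overhead for iSCSI on plain Ethernet. This only
# counts bytes on the wire; CPU-side protocol processing is not modelled.

MTU = 1500                       # Ethernet payload bytes per frame
WIRE_OVERHEAD = 14 + 4 + 8 + 12  # Ethernet header + FCS + preamble + inter-frame gap
IP_HDR, TCP_HDR = 20, 20
ISCSI_BHS = 48                   # iSCSI basic header segment, once per PDU

def data_efficiency(mtu=MTU, pdu_data=64 * 1024):
    """Fraction of wire bytes carrying SCSI data for one large iSCSI data PDU."""
    tcp_payload = mtu - IP_HDR - TCP_HDR
    total = pdu_data + ISCSI_BHS
    frames = -(-total // tcp_payload)                        # ceiling division
    wire = total + frames * (IP_HDR + TCP_HDR + WIRE_OVERHEAD)
    return pdu_data / wire

print(f"1500 MTU: ~{data_efficiency():.1%} of wire bytes are data")
print(f"9000 MTU: ~{data_efficiency(mtu=9000):.1%} with jumbo frames")
```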

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!

Regards,

Stefan Nguyen

iGeek Systems Inc.

VMware, Citrix, Microsoft Consultant

StorageMonkeys
Enthusiast

I think the conventional wisdom is that 10Gb iSCSI will be cheaper, but FCoE (Fibre Channel over Ethernet) will be the emerging protocol of choice. If you can wait another 6-12 months, you may want to see where the dust settles on this. I would not want to make this choice right now.

-Tim

StorageMonkeys.com - social networking for storage end-users

mreferre
Champion

I agree that FCoE will be the converged strategic choice for storage in the future (Cisco / QLogic / Emulex already have products out there, I think). It is also true that we are at the beginning, so I wouldn't bet my entire data center strategy on it at this moment either.

As mentioned, the iSCSI vs. FC argument has been around for a long time, and it clearly cannot be reduced to 10Gbit vs. 4Gbit speeds. There is much more to it than that: skills in the organization, flow control, maturity, etc. Quite frankly, even in the 2Gbit FC era I haven't seen many circumstances where ESX deployments were bottlenecked at the FC link level (which is different from being bottlenecked at the storage server level, which depends on its configuration).

Massimo.

Massimo Re Ferre' VMware vCloud Architect twitter.com/mreferre www.it20.info
RParker
Immortal

Also, let's not forget that QLogic has announced 8Gb Fibre Channel as well...

The price for going to Fibre is MUCH higher than iSCSI / NFS. So far iSCSI / NFS is not all that spectacular, if you ask me. I don't like the problems with iSCSI; for example, someone on here was asking how to reset the iSCSI connection, and basically the only option was to reboot the ESX host. That's not a good alternative. We had a few problems with NFS, but mostly NFS has proven to be very stable.

Fibre is much easier to deal with, and Ethernet can be flaky if your storage segments are combined with other machines on the same network, so you have to be careful to keep iSCSI / NFS traffic properly segregated.

Also, the overhead on the host is about twice that of Fibre (network packets and CPU), so Fibre is the most efficient method of SAN connectivity...

RParker
Immortal

If you already have 10Gb Ethernet (10Gb switches, 10Gb NICs, etc.), then go with iSCSI / NFS.

IF you are starting fresh and have to buy ALL the equipment new, go with Fibre. Fibre is a much more solid connection (it's on a private network, so to speak); it doesn't interact with anything else. Fibre is the fastest of ALL topologies (and that's a FACT, not fiction). People compare NFS with iSCSI in terms of speed; neither can touch Fibre, period.

Also, QLogic has 8Gb Fibre out now, so Fibre isn't going anywhere anytime soon. And Fibre Channel has been around for what, 10-15 years? It has the history.

And one more thing: Unix, AIX, AS/400... they don't favor iSCSI / NFS connectivity (they can use it, but Fibre is preferred and recommended). So why would the largest and most established platforms still use Fibre? The answer should tell you something about Fibre right there...

iSCSI and NAS are cheap alternatives because they use existing ethernet technology, but if you want to do it right and for the long term, Fibre is your BEST choice.

mreferre
Champion

I agree with your analysis... I would just challenge the speculation that FC will be the long-term strategic choice.

One of the drawbacks of using iSCSI / NFS today is that storage traffic has to traverse the entire TCP/IP stack (either the OS stack or the HBA's embedded stack). Either way, this is not very efficient.

Part of the problem is that Ethernet was never meant to be a reliable medium, so all the retransmit algorithms had to be implemented in a higher protocol layer (i.e. TCP/IP). FCoE, along with other key enterprise technologies falling under the umbrella of what Cisco calls DCE (Data Center Ethernet), solves these problems by extending the basic Ethernet implementation. This new "thing" is going to be the foundation for both standard IP network traffic and native storage traffic (i.e. not via TCP/IP). For the record, there have been discussions over the years about whether this "unified" I/O would be InfiniBand or Ethernet... well, it appears that Ethernet has won hands down.

The Cisco Nexus 5000 Series and the Emulex 21000 Series are two implementations of this new architecture.

Massimo.

Massimo Re Ferre' VMware vCloud Architect twitter.com/mreferre www.it20.info
MikaA
Contributor

One thing iSCSI has done for FC is bring its price down considerably, which is a Good Thing. Sure, Gig NICs are dirt cheap, but good switches aren't, and you don't really want to mix and match "storage switches" with other networking gear if you want to avoid problems. In that respect it's basically the same situation as with FC.

Luckily many storage boxes come with both interfaces so you can start with GigE and switch to FC if necessary - or use both.

StorageMonkeys
Enthusiast

I would list the protocols in this order:

1) iSCSI - cheap, easy to use and deploy; it may not be as fast as FC, but the risk relative to the return is lower than with any other technology

2) Wait for FCoE - the best of both worlds but still requires new infrastructure to implement

3) Fibre Channel - way too expensive end-to-end, especially for a new environment; it's much more complicated than iSCSI, and it's a protocol that is slowly dying

-Tim Masters

Storage Monkeys - Social networking for storage users

kellino
Enthusiast

Thanks everyone for the feedback. All of the responses were helpful, but I tried to pick the 2 that helped me the most to allocate points.

I think we will go with 4Gb FC for a couple of different reasons.

Fibre Channel is probably more effective than iSCSI in every area but cost. I also looked at QLogic's iSCSI HBAs and noticed that all of them are 1Gb only.

Also, we haven't decided on storage yet, but we are considering the CLARiiON, which as I understand it only supports iSCSI up to the Series 40, while the Series 80 (which we would likely scale to within 1-2 years) is FC only.

Since FCoE isn't an option today, it's really FC versus iSCSI and while FC will be more expensive, I feel it is the safe choice for this new environment.

Thanks!

kellino
Enthusiast

Actually, I just noticed QLogic's QLE8042, which supports FCoE over 10Gb Ethernet.

The issue today, I believe, would be getting FCoE support into ESX (which I don't believe it has today) as well as into the SAN.

MikaA
Contributor

CLARiiON CX3-40 has both iSCSI and Fibre, -80 has only Fibre.

But I guess if you feel you're scaling to 300+ terabytes in two years, you may want to go with Fibre right from the start; one would think the somewhat higher cost of FC is not the biggest issue. :)

kellino
Enthusiast

> But I guess if you feel you're scaling to 300+ terabytes in two years, you may want to go with Fibre right from the start; one would think the somewhat higher cost of FC is not the biggest issue.

Oh, no doubt. :) This is a bit different since our overseas parent company owns the project; they are asking us to build and own the infrastructure and gave us only some basic parameters and an initial budget. We're just going through the exercise of understanding our options and what can fit into the budget without boxing ourselves into a corner on scalability. :)

shawnconaway
Contributor

The performance difference can be even more dramatic if you consider teaming a pair of 10Gb NICs for a 20Gb team vs. an 8Gb HBA pair that cannot be teamed. Has anyone built a configuration similar to this?

  • A NetApp SAN (either FAS3070 or FAS3170) with NFS for a datastore.

  • Teamed 10Gb NICs (20Gb effective) on all ESX hosts and a pair of 10Gb NICs on the NetApp.

I have to stand up a brand new virtual infrastructure, and (for now) I intend to go 10Gb. The alternative is an 8Gb fibre infrastructure, because I'll need it anyway for connecting Exchange, SQL, etc. to the SAN. I'd rather go 20Gb Ethernet, though. VMotions have got to scream over a pipe that wide.
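For a rough sense of what the extra bandwidth could buy, here is a simple best-case transfer-time comparison (the 16 GiB payload and 90% link efficiency are made-up illustrative assumptions; as Massimo points out below, the software stack may cap you well before the link does):

```python
# Rough, best-case transfer times over different pipes. Real numbers will be
# lower: protocol overhead, the storage back end, and the host's software
# stack matter more than raw link speed once the pipe is this wide.

GIB = 2 ** 30  # bytes per GiB

def transfer_seconds(data_gib, link_gbit, efficiency=0.9):
    """Seconds to push data_gib GiB over a link_gbit link at an assumed efficiency."""
    bits = data_gib * GIB * 8
    return bits / (link_gbit * 1e9 * efficiency)

for label, gbit in [("4Gb FC", 4), ("8Gb FC", 8),
                    ("10Gb Ethernet", 10), ("2 x 10Gb team", 20)]:
    print(f"{label:14s} ~{transfer_seconds(16, gbit):5.1f} s to move 16 GiB (best case)")
```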

mcowger
Immortal

Why can't you team Fibre? MPIO and round robin work great for us.

--Matt

--Matt VCDX #52 blog.cowger.us
williambishop
Expert

Agreed. You can get blistering performance out of fibre. I have one server with 4x4Gb HBAs. Wanna guess what kind of performance I get off that beast?

--"Non Temetis Messor."
mreferre
Champion

> The performance difference can be even more dramatic if you consider teaming a pair of 10Gb NICs for a 20Gb team vs. an 8Gb HBA pair that cannot be teamed

Again, my word of warning is: don't focus on how wide your highway is (i.e. the number of lanes) if you are driving a slow car. My feeling is that the iSCSI / NFS software stacks within ESX are today geared towards functionality rather than raw performance, and just adding bandwidth to the pipe doesn't automatically speed things up.

This is similar to installing a poorly written single-threaded application on a brand new 16-core system... you know what you get.
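To put the analogy in concrete terms, a trivial illustration (the 6 Gbit/s stack ceiling is a made-up number, not a measured ESX figure): whichever of the link or the host-side stack is slower sets the effective throughput.

```python
# Purely illustrative: effective throughput is capped by whichever is slower,
# the link or the host-side protocol processing.

STACK_LIMIT_GBIT = 6  # hypothetical ceiling of one host's iSCSI/NFS software stack

def effective_gbit(link_gbit, stack_limit_gbit=STACK_LIMIT_GBIT):
    """Achievable throughput given a link speed and a software-stack ceiling."""
    return min(link_gbit, stack_limit_gbit)

for link in (1, 4, 10, 20):
    print(f"{link:2d} Gbit link -> {effective_gbit(link)} Gbit/s effective")
```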

Massimo.

Massimo Re Ferre' VMware vCloud Architect twitter.com/mreferre www.it20.info
shawnconaway
Contributor

Are your 4x4 HBAs aggregating into 8x2 active/passive or 16Gb active/active? My understanding was that, until PowerPath is integrated in ESX 4, the HBAs would not be able to aggregate bandwidth, only provide failover, much like NICs using "route based on source MAC hash" instead of IP hash. I would be delighted if someone could provide a link that proves me wrong.

It does appear that Round Robin may aggregate, but it is still experimental (see the sketch after the quoted excerpts below).

-


From Fibre Channel SAN Configuration Guide ESX Server 3.5, ESX Server 3i version 3.5 VirtualCenter 2.5

https://www.vmware.com/pdf/vi3_35/esx_3/r35/vi3_35_25_san_cfg.pdf

Setting a LUN Multipathing Policy

By default, the ESX Server host uses only one path, called the active path, to communicate with a particular storage device at any given time. When you select the active path, ESX server follows these multipathing policies:

Fixed – The ESX Server host always uses the designated preferred path to the disk when that path is available. If it cannot access the disk through the preferred path, it tries the alternate paths. Fixed is the default policy for active/active storage devices.

Most Recently Used – The ESX Server host uses the most recent path to the disk until this path becomes unavailable. That is, the ESX Server host does not automatically revert back to the preferred path. Most Recently Used is the default policy for active/passive storage devices and is required for those devices.

Round Robin – The ESX Server host uses an automatic path selection rotating through all available paths. In addition to path failover, round robin supports load balancing across the paths.

NOTE Round robin load balancing is experimental and not supported for production use. See the Round Robin Load Balancing white paper.

-


From http://www.vmware.com/pdf/vi3_san_design_deploy.pdf

Multipathing and Path Failover

A path describes a route:

  • From a specific HBA port in the host,

  • Through the switches in the fabric, and

  • Into a specific storage port on the storage array.

A given host might be able to access a volume on a storage array through more than one path. Having more than one path from a host to a volume is called multipathing. By default, VMware ESX systems use only one path from the host to a given volume at any time. If the path actively being used by the VMware ESX system fails, the server selects another of the available paths. The process of detecting a failed path by the built-in ESX multipathing mechanism and switching to another path is called path failover. A path fails if any of the components along the path fails, which may include the HBA, cable, switch port, or storage processor. This method of server-based multipathing may take up to a minute to complete, depending on the recovery mechanism used by the SAN components (that is, the SAN array hardware components).
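To summarize the three policies, here is a minimal, hypothetical sketch of how such a path selector could behave. This is not ESX's implementation; the path names and structure are invented purely for illustration.

```python
# Hypothetical sketch of the three multipathing policies described above:
# Fixed prefers one path, MRU sticks with the last working path (no failback),
# Round Robin rotates I/O across all live paths.
from itertools import cycle

class PathSelector:
    def __init__(self, paths, policy="fixed", preferred=None):
        self.paths = list(paths)              # e.g. ["vmhba1:0:1", "vmhba2:0:1"]
        self.policy = policy                  # "fixed", "mru", or "rr"
        self.preferred = preferred or self.paths[0]
        self.current = self.preferred
        self._rr = cycle(self.paths)

    def next_path(self, dead=frozenset()):
        alive = [p for p in self.paths if p not in dead]
        if not alive:
            raise RuntimeError("all paths down")
        if self.policy == "fixed":
            # Always fail back to the preferred path when it is available.
            self.current = self.preferred if self.preferred in alive else alive[0]
        elif self.policy == "mru":
            # Stick with the current path; only move when it dies (no failback).
            if self.current not in alive:
                self.current = alive[0]
        else:  # "rr": rotate through live paths on every request
            while True:
                candidate = next(self._rr)
                if candidate in alive:
                    self.current = candidate
                    break
        return self.current

sel = PathSelector(["vmhba1:0:1", "vmhba2:0:1"], policy="rr")
print([sel.next_path() for _ in range(4)])    # alternates between the two paths
```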

williambishop
Expert

Sorry, email alerts are off for me. Yes, round robin, and yes, it works well!

Bloody hell, will someone at VMware get this supported for production?!

--"Non Temetis Messor."
zman442
Contributor

I know this is an older thread, but I only saw InfiniBand mentioned once. InfiniBand has a theoretical maximum aggregated bandwidth limit of 128 Gbit with actual data throughput of approximately 96 Gbit. This far exceeds the performance of FC, iSCSI and FCoE solutions. I do not know if InfiniBand is supported by ESX yet, but InfiniBand is the backbone of high-speed supercomputing clusters for a good reason, and I recently had the pleasure of observing a rather impressive 20Gbit, 24-port InfiniBand-connected portable Oracle cluster used for training... Oooh... Aaah.
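For what it's worth, my reading of where the ~96 Gbit figure comes from (treat this as an assumption): a 12X QDR InfiniBand link signals at 120 Gbit/s, and the 8b/10b line encoding used by SDR/DDR/QDR leaves 80% of that as data throughput.

```python
# Where the "~96 Gbit actual" figure plausibly comes from: a 12X QDR
# InfiniBand link with 8b/10b line encoding (8 data bits per 10 line bits).
LANES = 12                    # 12X link
QDR_LANE_GBAUD = 10           # QDR signalling rate per lane, Gbit/s
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b encoding used by SDR/DDR/QDR

signalling = LANES * QDR_LANE_GBAUD
data = signalling * ENCODING_EFFICIENCY
print(f"12X QDR: {signalling} Gbit/s signalling, ~{data:.0f} Gbit/s data")
```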

We are also looking for a good small-scale storage solution and are currently running some smaller iSCSI and 2Gbit FC SANs in the 10TB range; the performance difference is not noticeable. Our iSCSI traffic is on isolated storage networks with multiple teamed Gbit cards, and dedicated iSCSI initiators in some instances, and we have not observed a difference in real-world performance between the ESX iSCSI software stack and the QLogic hardware iSCSI solution. The only limitation is that the software option is not bootable, but we don't need that in our case.

We are currently considering the 4Gbit FC HP MSA solutions for smaller lab environments, but we still need a 3-5 year roadmap for expandability to 8Gbit FC if needed, and we are undecided whether FC or iSCSI is the way to go, given the temptation of 10Gbit iSCSI.

In response to a comment in an earlier post: the fastest-performing storage network I have read about was iSCSI, not FC, but that implementation was far more expensive than any FC option currently available. I don't recall where I read it, but I was surprised nonetheless. If I find the link I will post it back here.

My mistake: it looks like Mellanox InfiniBand HCAs are supported in ESX 3.5...
