bobbyxdigital
Contributor

infiniband and esxi 5.0

I have an InfiniBand card in my ESXi 5.0 server that I'd like to use.  I see Mellanox only provides drivers for ESXi 4.0.  Can someone point me in the right direction to find drivers for ESXi 5.0?  Thank you.

41 Replies
mcowger
Immortal

Mellanox would have to provide them.  Ask them if they have beta drivers.

--Matt VCDX #52 blog.cowger.us
markokobal
Enthusiast

Hi,

-- Kind regards, Marko. VCP5
yezdi
Virtuoso

A driver for the Mellanox ConnectX Ethernet adapters is available on the VMware portal. Is it the one you are looking for?

http://downloads.vmware.com/d/details/dt_esxi50_mellanox_connectx/dHRAYnRqdEBiZHAlZA==

bobbyxdigital
Contributor

The link above is for Ethernet.  I'm looking for InfiniBand support.

bobbyxdigital
Contributor

I spoke with Mellanox support, who let me know that IB drivers for ESXi 5.0 apparently haven't been released yet.

bobbyxdigital
Contributor

Mellanox support will not release the beta version of their driver to me.

yezdi
Virtuoso

Thanks for the update, Bobby.

matteowiz
Contributor

Just to let you know that Mellanox has released the IB OFED drivers for vSphere ESXi 5.0 initiators.

Sadly, I cannot see my cards as storage adapters; they are only listed as 10Gbit Ethernet. Where is SRP??

Anyone?

--

matteo

Macky99
Contributor

No, they don't work!!! NO SRP, only iSCSI.

I tried them a few times and wasted the better part of 2 days.

I got the driver working, with the 20Gb ports showing as Ethernet, added them to a vSwitch and enabled iSCSI.

2 links via 2 NICs @ 20Gb/s, vmk3 & vmk4, port group policy is compliant, but no iSCSI link. See the JPGs.
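For reference, the vmk-to-iSCSI binding step above is normally done on ESXi 5.0 with esxcli; a rough sketch, assuming the software iSCSI adapter is vmhba33 (the adapter and vmk names are placeholders, check yours first):

```shell
# Find the actual names on your host
esxcli iscsi adapter list
esxcli network ip interface list

# Bind the IPoIB vmkernel ports to the software iSCSI adapter
esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk3
esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk4

# Verify the bindings and rescan for storage
esxcli iscsi networkportal list --adapter=vmhba33
esxcli storage core adapter rescan --adapter=vmhba33
```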

I could then ping my SAN via IPoIB. BUT I could not see the iSCSI storage.

I then connected via a 1Gb link with no issues, so all the config was correct.

I could see active iSCSI connections on the target SAN, but none on the IB ports.

Footnote

- This works on ESX 4 U1 with the same config; no problems connecting via iSCSI, except the awful performance: 5Gb/s, compared to SRP where I get 1.8GB/s. Yes, that's correct: 1800MB/s from the SAN! VARROOOMMM. The problem is it's unstable, and A/A MPIO doesn't improve bandwidth. When I say unstable: SRP sessions connect, but with more than 3 nodes in a cluster, if I add a 4th the SRP connection wants to reformat the LUN! WTF!

FYI: QDR ConnectX-2 with the latest firmware applied.

Then I noticed that on each IB NIC the MTU was 1500 instead of 2044 (2044 is supposed to be the default according to the manual; it seems the manual is wrong, no surprise there!).
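For anyone else trying this, the MTU change on ESXi 5.0 involves both the vSwitch and the vmkernel interfaces; a minimal sketch, assuming vSwitch1/vmk3/vmk4 (substitute your own names):

```shell
# Raise the vSwitch MTU first (it must be >= the vmk MTU)
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=2044

# Then raise the vmkernel interfaces carrying the iSCSI traffic
esxcli network ip interface set --interface-name=vmk3 --mtu=2044
esxcli network ip interface set --interface-name=vmk4 --mtu=2044

# Confirm
esxcli network ip interface list
```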

So I changed the MTU, and everything FROZE!

A reboot reveals it's stuck booting iscsi_vmk. Gee, didn't we see this with ESX 4.1: continual attempts to log onto the iSCSI target! Yes we did... yawn... ZZZ. 10 minutes later, guess what:

still no iSCSI connections!!! YAY.

Frankly, Mellanox are doing a very, very poor job of supporting their IB products. Unless you pay them.

Do we see that attitude from other NIC vendors? ... NO WE DON'T! GRIPE!

If anyone at Mellanox wants IB to get market acceptance, then they should support this crap!

lennyburns
Enthusiast

Any more news on this front?

My company is going to make a large effort to leverage InfiniBand in our future designs, in an effort to further converge and decouple management of legacy I/O devices.

We have a lab where we are building a POC, and are speaking to Mellanox today at 2:30.

I would welcome your input prior to this call.

Feel free to reply or message me if you want to discuss.

bobbyxdigital
Contributor

I can't even get the driver installed properly...
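For what it's worth, on ESXi 5.0 the driver bundle is installed with esxcli; a sketch, with the bundle path and filename as placeholders for whatever Mellanox ships:

```shell
# Copy the offline bundle to the host (e.g. via scp), then install it.
# The filename here is a placeholder -- use the actual bundle name.
esxcli software vib install -d /tmp/MLNX-OFED-ESX-bundle.zip

# Check it registered, then reboot the host
esxcli software vib list | grep -i mlx
reboot
```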

matteowiz
Contributor

I just heard from Mellanox that the current (1.8.0) driver for vSphere 5 does not support SRP. There are plans to support SRP in the future, but I don't have more details.

I agree with the previous post, Mellanox is doing badly: almost a year after vSphere 5 was released, the IB stack still does not support SRP. What are we supposed to do with our brand new ConnectX-3 adapters? We paid good money to get IB, and now? Only IPoIB, with such slow performance?

What do you guys think?

regards,

--

matteo

lennyburns
Enthusiast

I just ran across a presentation that talks about compiling your own Mellanox drivers for unsupported Linux distros.

Here they are; are they helpful?

Mellanox InfiniBand Training (PDF): "SRP in a Nut Shell. Maintain local disk access semantics. Plugs into the bottom of the SCSI mid-layer..."

HOWTO: Infiniband SRP Target on CentOS 6 incl RPM SPEC
sonicostech.wordpress.com/.../howto-infiniband-srp-target-on-centos...
On rebuilding the Mellanox RPMs: "We need to rebuild the Mellanox ISO to support our current kernel. We need to mount it up, run the scripts, and ..."


Macky99
Contributor

Hi,

Yes, using OFED you can recompile your Linux kernel to integrate the IB stack; that's easy.

On Ubuntu 10.04 and 12.04 the drivers are already installed.  I've worked with www.Quantastor.com to help them integrate InfiniBand into their SAN.  It works great!!
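For reference, rebuilding the Mellanox OFED bundle against a non-stock kernel looks roughly like this (the ISO name and version are examples; some releases use a separate mlnx_add_kernel_support.sh script rather than the flag):

```shell
# Mount the MLNX_OFED ISO and rebuild it against the running kernel
mount -o ro,loop MLNX_OFED_LINUX-ubuntu12.04-x86_64.iso /mnt
cd /mnt
./mlnxofedinstall --add-kernel-support

# Load the stack and check the IB ports come up
/etc/init.d/openibd restart
ibstat
```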

The issues are VMware and Xen (Xen doesn't even support IB unless you recompile the CentOS kernel).  Does anyone know how to write ESXi 5 VIBs for the OFED drivers?  If so, then we need a community VIB driver for ESXi 5 to integrate the SRP driver for IB.

I think this may be the only way: forget Mellanox SRP and do it ourselves.  OFED is open source, so if anyone at VMware is willing to help with an OFED-to-VIB port, then please post here.
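One practical hurdle worth noting: an unsigned community VIB won't install at the default host acceptance level, so any such driver would need the host dropped to CommunitySupported first. A sketch (the .vib filename is hypothetical):

```shell
# Allow community-supported (unsigned-by-VMware) VIBs on this host
esxcli software acceptance set --level=CommunitySupported
esxcli software acceptance get

# Then the hypothetical community SRP driver could be installed
esxcli software vib install -v /tmp/community-srp-driver.vib
```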

Please see my latest bandwidth test: 2 images in 1; the left is SRP on ESXi 4.1 U2, the right is iSCSI on ESXi 5 U1.

also view this http://communities.vmware.com/docs/DOC-18796

http://cto.vmware.com/rdma-on-vsphere-status-and-future-directions/

implies that they have not given up, but they won't talk about it!!!

I'm about ready to chuck it all in the bin anyway!

matteowiz
Contributor

Hi Macky,

I see a larger performance impact using IPoIB in ESX 5 compared to SRP in ESX 4. Furthermore, Mellanox has to explain why we have to pay $1000 for an HBA that is VPI/IB capable but lacks support in vSphere ESX 5, compared to the network-only adapters sold for just $400??

I had a big cluster on ESX 4 built from InfiniHost III Ex cards, and everything was working fine. Upgraded to ESX 5: no more support for those cards. OK, we know that sometimes we have to replace older cards with new ones, so we bought some ConnectX-3 for testing and... NO SRP SUPPORT??? What are we supposed to do now with those cards? We have rolled our hosts back to ESX 4 for the moment, but... WHAT A SHAME.

Mellanox lacks software support, commercial support (you buy things and you're not 100% sure they will work) and, in my personal opinion, human support (communication is difficult/problematic; when you say hello they immediately ask for money).

That said, I'd like to contribute to porting the SRP OFED open source stack to ESX.

I'm going to look for decent documentation about compiling software for ESX (although I honestly don't know if it's possible).

Let's show Mellanox how to do IT the right way!

Best Regards,

--

matteo

Macky99
Contributor

Hi Matteo,

Hehehe, yeah, it's like buying a new BMW with a 3-year warranty, then asking for help with warranty repairs after 3 months and being told to F off!

OK, how do you want to proceed?  Perhaps we can set up a subdomain under one of mine and create a public forum for this?  How about srpdev.v365.com.au?

We can use the OFED drivers as a start (www.openfabrics.org/index.php).

I tried to decompile and reverse engineer the VIB drivers for ESXi 4 and found out quite a bit of information.  I also suggest we ask the SCST project for help (http://scst.sourceforge.net/), as they have a nice SRP target.  I'll forward them this link and see what they say.

Also, my business email is bruce.m@v365.com.au.

We are in the same position as you.  We invested in second-hand, barely used QDR switches and cards (ConnectX-2) to try using SRP as a solution.  We have dual QDR PCIe cards, all updated, and HP mezzanine cards of a similar version for the C7000 chassis blades.  I guess we've wasted about 50K so far 😞

We partnered with OSNEXUS.COM to get an easy SAN solution working, as it's based on Ubuntu Server, and we use it as an iSCSI and SRP target for ESXi 4 U2 clusters.  But there are issues with the SRP driver, even with SRP on ESXi 4.

We use iSCSI to create and bind one path to the SAN, then get the SRP paths online, set ESXi 4 to RR MPIO, and then disable the iSCSI IQN on the SAN.
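For anyone copying this setup, the RR MPIO step looks roughly like this on ESXi 5 (ESXi 4.1 uses the older `esxcli nmp device setpolicy` syntax; naa.xxxx is a placeholder for the actual device ID):

```shell
# Find the device ID, then switch it to Round Robin path selection
esxcli storage nmp device list
esxcli storage nmp device set --device=naa.xxxx --psp=VMW_PSP_RR

# Optionally switch paths every I/O instead of the default 1000,
# which often helps multi-path bandwidth
esxcli storage nmp psp roundrobin deviceconfig set \
    --device=naa.xxxx --type=iops --iops=1
```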

We have 12x 1TB WD VelociRaptor drives (200MB/s each) in SAN1 and 12x 1TB WD Black in SAN2.  Performance is about 20% lower with the WD Blacks.

With the 12 WD VRs it's pushing the LSI 9265 and the PCIe bus to max I/O @ 5GT/s; I see bandwidth as high as 2.5GB/s, so it won't go much faster until PCIe 3 is more common.  It delivers 2000-2400MB/s to each SRP target, and about half that when doing concurrent I/O tests over 2+ nodes.  Real-world file I/O with heavy video editing or databases, depending on the files, is 300MB/s to 1200MB/s between SRP initiators.  So it's BLOODY FAST!  Why on EARTH MELLANOX WON'T SUPPORT IT IS FRANKLY... DUMB!

So we could wait for Mellanox, who I think will eventually start (the links to the CTO office suggest they are trying something), but we need a basic SRP VIB driver NOW!

So, in that light:

I will join OFED.

I will join VMware as a basic partner.

I will ask the SCST project for guidance, and OFED also.

I will publish a dev site for members.

I will also ask the director of OSNexus for some guidance.

Please email me for further info and I'll post a link here for people to join up.

Cheers

Bruce M

matteowiz
Contributor

Hi Bruce, we are on the same f. ship!

I joined the SCST project many years ago, and I know that Bart Van Assche is our man... I'm writing a private email to him with this thread, asking what he thinks about this situation.

But are you sure that joining the VMware partner program is sufficient to develop a signed package for ESX?

Can you provide a link to this program? I'd like to join too...

Regards,

--

matteo

Macky99
Contributor

Hi all,

As a test of ESX 5 using iSCSI over IPoIB, I'm migrating a test ESX cluster to a blade/node on the C7000 chassis, and will run vCenter from it and see how it performs with vMotions etc.  3x BL460 in the cluster; all management, iSCSI, FT and vMotion networks are on IPoIB, with dual NICs in a team and MPIO A/A.  Dual 10Gb NICs via HP Virtual Connect switches for external IPs (15 VLANs).

FYI, with some playing with the MTU and the iSCSI target settings using PV volumes, I'm getting just about 8Gb/s, FC-SAN-like performance... it works, but stability is an issue... I will leave it running for a few days to see how it goes.  Ping times and storage adapter latency are averaging around 20-80ms, which seems bad for an IB-based network.  I will log it to PRTG.
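A quick way to sanity-check the MTU end to end before trusting those latency numbers, assuming a 2044-byte MTU and a placeholder SAN address:

```shell
# 2044-byte MTU minus 28 bytes of IP+ICMP headers = 2016-byte payload;
# -d sets don't-fragment, so this fails if any hop has a smaller MTU.
# 10.0.0.10 is a placeholder for the SAN's IPoIB address.
vmkping -d -s 2016 10.0.0.10

# Watch per-device latency live while the test runs:
esxtop   # press 'u' for the disk-device view, watch DAVG/cmd
```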

hope to have some results in 2-3 days.

Macky99
Contributor

IT'S AWFUL.

iSCSI on InfiniBand is quite bad.....!

ATM we're now looking at GlusterFS 3.3 with InfiniBand, and working on code to mount Gluster shares as SCST RDMA targets, which will present to ESXi 4.

And if Mellanox ever release an SRP driver, then it will work with ESX 5, perhaps one day!

But at this point we're stuck with ESXi 4.1.
