VMware Cloud Community
srodenburg
Expert

LSI SAS 9207-8i and vSAN 6.5 ?

Hello,

Can someone tell me if the LSI SAS 9207-8i (which is certified for 6.0 U2) still works without problems in 6.5.0a? (It's not on the 6.5 vSAN HCL yet.)



Kind regards,

Steven Rodenburg

1 Solution

Accepted Solutions
BrianMathis
Contributor

Hi Steven

I just installed vSAN 6.5a using LSI SAS 9207-8i controllers (2-node ROBO, all-flash, direct attached), and so far I haven't experienced any issues at all using IT firmware 19 and the drivers delivered with ESXi 6.5a. Performance has improved over 6.0 U2.

Cheers,

Brian

16 Replies
GodfatherX64
Enthusiast

I think it will work just fine; its rebranded sibling (HP H220) is vSAN 6.5 certified.

VMware Compatibility Guide - I/O Device Search

^^ open this

admin
Immortal

It is not yet certified by VMware for vSAN 6.5, but it should work fine. You will, however, get a warning on the cluster for it.

I will keep you posted here in case I get some more details on this.

Cheers!

-Shivam

admin
Immortal

Thank you Brian, that's good to know. Thanks for sharing your experience with us.

Cheers!

-Shivam

srodenburg
Expert

Thanks everybody for your feedback. I bit the bullet and did the upgrade. Everything works fine so far. I disabled the "Controller Firmware and driver" alerts to make the system shut up about not being HCL compliant... 😉

I also use IT firmware 19 (as 20 is a disaster) and the drivers that come with 6.5.0a.

admin
Immortal

That's great Steven. Keep the good vibes flowing, and the happiness growing.

Cheers!

-Shivam

Chuckak
Contributor

I have been running with the warnings for the last 4 months with no issues (4-node cluster, 115 TB).

When is VMware coming out with the updated driver? It would be nice to get rid of the warning.

Chuckak
Contributor

Now running vSAN 6.6 with no issues. Still have all the warnings.

srodenburg
Expert

A few months have passed since I wrote my opening post.

Situation today: all but one or two LSI controllers (the 9361-8i and 9362-8i) have, after so many months, still not made it onto the v6.5 HCL and probably never will.

It seems VMware and/or LSI are not interested in certifying the other adapters in the LSI portfolio that were fully supported up until version 6.5. And they work fine too, by the way.

Customers with LSI controllers from the pre-v6.5 era are essentially stuck with the warning in vCenter that their environment is not, and never will be, HCL compliant. One cannot expect customers to rip out all the LSI controllers from all their nodes and replace them with newer models.

I find this strategy disappointing. If this is how VSAN hardware-compatibility works now and in the future, then I have a bad taste in my mouth...

ManivelR
Hot Shot

Has anyone tested this LSI card (the LSI SAS 9207-8i) with vSAN 6.7.0 U1?

Is performance good?

Thanks,

Manivel R

srodenburg
Expert

Hello Manivel,

We have a cluster that started out with this card and vSphere 6.0 and the card was certified back then. Since then, we kept upgrading it through 6.5 and now to 6.7 U2 and it still works fine. We run 8 drives on each card (2 diskgroups, each with 1x SAS SSD and 3x SAS 10k HDD) and performance is ok.

Important: LSI card Firmware v19. Not v20 or higher as you will get stability issues. Downgrade cards to FW v19 if needed. v19 is also the only certified FW version and for good reason.

We also have an HP ProLiant DL380 Gen8 cluster with the HP version of this exact card (it simply has an HP sticker on it). DELL, Lenovo, etc. also used the exact same card, branded as their own.

Kind regards,

Steven Rodenburg

srodenburg
Expert

"I have 6 DELL R 820 power edge rack servers.

Each server has 2 disk groups as same as your design except All Flash VSAN in ours.

SSD disk vendors are Micron(Capacity disks) and ADATA(Cache disks)

1st DG(1*1 TB-->Cache Tier & 3*2 TB Capacity tier disks)

2nd DG(1*1 TB-->Cache Tier & 3*2 TB Capacity tier disks)

Disk is SSD and bus protocol is SATA(i see this info in DRAC under storage)"

I think your problem could (also) be SATA. The SATA protocol is half-duplex: it can only read OR write per turn, not both simultaneously as SAS, a full-duplex protocol, can. That is one issue with SATA. Another is queue depth: SATA devices have a queue depth of 32 (half-duplex) versus 256 (full-duplex) for SAS.
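To put the queue-depth difference in numbers, here is a rough sketch using Little's law (sustained throughput = outstanding I/Os / time per I/O). The 0.5 ms per-I/O latency is a made-up value, assumed identical for both device types purely for illustration, and the function name is hypothetical. It ignores link bandwidth and the half-duplex penalty, so treat the results as upper bounds per device, not predictions.

```python
def max_iops(queue_depth: int, latency_ms: float) -> float:
    """Upper bound on IOPS for one device, from Little's law.

    queue_depth: how many I/Os the device can have outstanding at once.
    latency_ms:  average service time of a single I/O in milliseconds.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return queue_depth * 1000.0 / latency_ms


# Hypothetical latency, the same for both devices (assumption, not a measurement).
LATENCY_MS = 0.5

sata_ceiling = max_iops(32, LATENCY_MS)   # queue depth quoted above for SATA
sas_ceiling = max_iops(256, LATENCY_MS)   # queue depth quoted above for SAS

print(f"SATA ceiling: {sata_ceiling:,.0f} IOPS")
print(f"SAS ceiling:  {sas_ceiling:,.0f} IOPS")
print(f"Ratio: {sas_ceiling / sata_ceiling:.0f}x")
```

At equal latency the ceiling scales linearly with queue depth, which is why a SATA device runs out of headroom much earlier once the workload keeps many I/Os in flight.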

SATA is acceptable if one does not push them too hard.

We have large all-flash clusters with SATA flash devices (enterprise grade and vSAN HCL certified) and they perform OK, but only as long as we don't hammer them too much. If we go bonkers on those hosts, we run into performance issues. That's expected, and these clusters were never intended to be hammered so hard, so it's fine for us.

In other words, SATA flash devices do have a valid use-case (which is also why some devices are good enough to make it onto the HCL). Simply don't expect wonders from them. SAS Devices are faster in higher performance envelopes. NVMe even more.

In your case, it's more likely that the controller is being saturated. But to make a truly informed decision, run a SexiGraf virtual appliance. It shows you exactly where the bottleneck is. If it's the controller, then you empty your wallet on new controllers. That is what I would do. Measure first, then decide and invest.

The 9207 was the top-of-the-line SAS 6g card back then. No LSI card from that era (6g) was all-flash certified. Hybrid only.

Anything SAS all-flash certified is 12g SAS, like the LSI 9300-8i or 9305-16i, which are good cards and certified for 6.7 U2 as well. You just have to make sure they work with the 6g SAS backplane in your servers. Your devices are 6g SATA and, again, I hope that in your use-case you are not beating the crap out of SATA devices in the first place (for the reasons I gave above).

What you might lose when you go away from DELL controllers, is backplane control. SES (the SCSI Enclosure Services protocol) will likely not be able to get info from the backplane anymore like device-position, thermal status, firmware functions, all that stuff (it depends). Maybe get the equivalent card from DELL instead of LSI. It's the same chip, just with DELL modifications to be able to talk to the backplane.

ManivelR
Hot Shot

Thanks so much, Steven, for the detailed response. I will take a look at the "SexiGraf virtual appliance" to understand where the bottleneck is.

In the DELL R820, as I said, the H710P is very bad and there are no other options to get another RAID controller card from DELL. I thought of going with the 730 PERC RAID controller from DELL, but it is not compatible with my DELL R820 servers.

I don't know whether my server supports the LSI 9300-8i or 9305-16i if I go with either of them. (Anything SAS all-flash certified is 12g SAS like the LSI 9300-8i or 9305-16i which are good cards and certified for 6.7 U2 as well. You just have to make sure they work with the 6g SAS Backplane in your servers.)

I need to take a look with the help of DC technician.

DRAC info:

[Screenshots: pastedImage_0.png, pastedImage_1.png, pastedImage_3.png, pastedImage_2.png]

Thanks,

Manivel R

ManivelR
Hot Shot

Hi Steven,

I'm trying to dig into this tool, SexiGraf, regarding my PERC H710 RAID controller: has my RAID controller been saturated or not?

Where can I find that info in SexiGraf? If you have any ideas, please let me know.

pastedImage_0.png

Thanks,

Manivel R

srodenburg
Expert

Hello Manivel,

Concerning SexiGraf: you must use the "VMware VSAN Monitor" entries. Then, in the "layer" dropdown box, select "client" to see if the vSAN layer attached to the VM has issues. From the same dropdown box, select "disk" to see what happens on the disk layer. Look for Congestion and Outstanding IO as key indicators.

Don't be alarmed when Read-cache-hits read 0. That's normal as all-flash does not have read-cache in that way (hybrid does).

Also, use "esxtop" to fetch the queue depth of the controller and see if the number of outstanding read and write entries hits the adapter's limit during periods when you have performance problems. If that happens, the controller definitely cannot handle the sheer amount of traffic.

There is a lot of information on the net on "what to look for" besides the things I wrote above. Pasting 1000 screenshots here, asking "is this good, is this bad" is not optimal. I also don't have a lot of time, but maybe others have. Teach yourself the key indicators concerning vSAN performance to make an informed decision as to which component is the culprit.

If the controller can handle it easily (queuedepth limit is not or rarely reached) but the disks themselves struggle (huge individual disk latencies), then you are pushing the performance-envelope of what the disks can handle (bottleneck most likely SATA, not the flash itself) and upgrading the HBA-controllers will not bring anything.

If the controller has smoke coming out of its ears under load while the disks are just waiting to be fed, then that's your bad guy, etc.
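The controller-versus-disks reasoning above can be sketched as a small classifier over samples you collected yourself (for example, queue usage read from esxtop and per-disk latency from SexiGraf). Everything here is illustrative: the `Sample` fields, the function name, and all three thresholds are assumptions of mine, not values from VMware or from this thread, so tune them to your own environment.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    adapter_queue_used: int    # outstanding commands on the HBA at sample time
    adapter_queue_limit: int   # queue depth of the HBA
    disk_latency_ms: float     # worst per-device latency at sample time


def classify_bottleneck(
    samples: Sequence[Sample],
    queue_full_ratio: float = 0.95,   # hypothetical: "queue is effectively full"
    problem_share: float = 0.20,      # hypothetical: share of samples that counts as "often"
    high_latency_ms: float = 20.0,    # hypothetical: "huge" individual disk latency
) -> str:
    """Return 'controller', 'disks' or 'neither' for a set of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    total = len(samples)
    queue_full = sum(
        1 for s in samples
        if s.adapter_queue_used >= queue_full_ratio * s.adapter_queue_limit
    )
    disk_slow = sum(1 for s in samples if s.disk_latency_ms >= high_latency_ms)

    # Controller queue regularly at its limit: the HBA cannot keep up.
    if queue_full / total >= problem_share:
        return "controller"
    # Queue has headroom but the disks answer slowly: new HBAs will not help.
    if disk_slow / total >= problem_share:
        return "disks"
    return "neither"


busy_hba = [Sample(600, 600, 5.0)] * 10
slow_disks = [Sample(50, 600, 40.0)] * 10
print(classify_bottleneck(busy_hba))    # controller
print(classify_bottleneck(slow_disks))  # disks
```

The controller check comes first on purpose: when the adapter queue is full, disk latencies measured above it are inflated by queueing and are not a reliable sign that the disks themselves are the problem.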

Something else. If you change controllers in a node, make sure you completely evacuate all disk groups in that node and empty out the drives (reboot, then select "delete all partitions" to make them empty). The new controller sometimes (especially if it's a very different model) gives different names to the same disks, which screws up the local host. So empty everything out completely, then change controllers and reclaim the disks (which only works when they are empty). Then let it rebuild the disk groups, resync, etc. before you go on to the next host. Repair object problems afterwards if needed, and always make sure that vSAN health is completely "green" before you get to the next node.
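As a way to keep the ordering straight, the rolling swap described above could be written down roughly like this. It is only a sketch of the sequence: `health_is_green` and `do_step` are hypothetical callbacks you would have to implement yourself (or perform by hand), and nothing here talks to vCenter or ESXi.

```python
from typing import Callable, Iterable, List

# Per-host steps, in the order given in the post above.
STEPS = (
    "evacuate all disk groups on the host",
    "reboot and delete all partitions on the drives",
    "swap the controller",
    "reclaim the empty disks",
    "rebuild the disk groups and wait for resync",
    "repair object problems if needed",
)


def rolling_swap(
    hosts: Iterable[str],
    health_is_green: Callable[[], bool],
    do_step: Callable[[str, str], None],
) -> List[str]:
    """Process hosts one at a time; stop as soon as vSAN health is not green.

    Returns the list of hosts that were fully processed.
    """
    done: List[str] = []
    for host in hosts:
        # Never start on the next node while the cluster is unhealthy.
        if not health_is_green():
            break
        for step in STEPS:
            do_step(host, step)
        done.append(host)
    return done


# Dry run that only prints the plan (host names are placeholders).
finished = rolling_swap(
    ["esx1", "esx2"],
    health_is_green=lambda: True,
    do_step=lambda host, step: print(f"{host}: {step}"),
)
print("completed:", finished)
```

The health gate sits at the top of the loop so that a problem left behind by one host blocks work on the next one, which is the point of doing the swap host by host.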

ManivelR
Hot Shot

Thanks very much for the detailed explanation on this topic. Much appreciated.
