vSphere vNetwork

Major issues with HP DL580 G5 and Intel X520-DA2

  • 1.  Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 10, 2011 04:08 PM

    Hi,

    We are experiencing major issues with our HP DL580 G5 and Intel X520-DA2 NICs. You might want to grab a cup of coffee. This could take a while...

    We currently have 5 DL580 G5's running ESXi 4.1 with all of the latest patches. All of these hosts are running the latest firmware revisions. All of these hosts are exhibiting the problematic behavior.

    We HAD been using the HP branded NetXen cards (NC522SFP) but had a lot of issues with those cards. If you do a search on the message board here, you should be able to find plenty of information on the troubles these cards can cause..

    SO, in order to save myself some aggravation, I decided to go with Intel X520-DA2 nics. At first, everything seemed OK. However, we have been experiencing strange issues since switching over to these cards.

    We have two standard vswitches set up. vSwitch 0 has a pair of 1gb copper for uplinks (vmnic0,vmnic1). It handles the management traffic, as well as vMotion.

    Everything else is trunked in on a pair of 10gb fiber, plugged into the Intel X520's. These serve as uplinks for vSwitch1 (vmnic2, vmnic4), which handles all of the VM data, as well as iSCSI traffic to a pair of EqualLogic arrays. We are using the EqualLogic Multipathing Plugin.

    Now for the problem.. Every so often, VMNIC2 freaks out. It still appears to be in a "connected" state, but it no longer passes any traffic. VM's that were using that nic for an uplink lose network connectivity. They cannot ping out, nor do they respond to pings. Removing VMNIC2 from the vSwitch uplinks restores network connectivity, as they fail over to VMNIC4.

    Shortly after this happens, the host will PSOD, as requested by the HP NMI driver. For grins, I tried uninstalling the HP NMI driver from some of those hosts.

    When this occurs on a host without the NMI driver, I just get a message saying:

    "cpu0:4120) NMI: 2540: LINT1 motherboard interrupt (1 forwarded so far). This is a hardware problem; please contact your hardware vendor."

    My incredible deductive reasoning skills led me to believe this was a hardware problem, so I contacted my vendor.

    They have been unable to find the issue.

    I ran hardware diagnostics on several servers. On one server, I went so far as to run over 3000 iterations of the hardware diagnostics over two weeks, and no problem was ever discovered.

    When the NMI driver is not installed, the host will not PSOD. However, it will not behave properly again until it is rebooted.

    We are, of course, plugged into two switches. One is a Cisco 6509, and the other is a nexus 5000. I thought perhaps there was a problem with one of the switches, so I swapped all of the network cables (so what was plugged into the 6509 is now plugged into the 5000, and vice versa).

    However, the problem occurred again, and it was still VMNIC2 that freaked out. It did not follow the switch.

    I have logged a support ticket with vmware. It has been open since about Dec. 13th I think.

    Also, I logged a support ticket with HP around the same time. Nobody seems to know what to do.

    If anyone has an idea, I'd be quite grateful to hear it. Thanks!

    Jason



  • 2.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 10, 2011 04:19 PM

    You mention HP and VMware support but did you also try Intel for support??



  • 3.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 11, 2011 03:07 PM

    DSTAVERT,

    I have not. Thinking this was a pretty good idea, I went to their website to get their support number.

    I came across a rather disturbing little blurb on their support page for the X520's.

    They are saying that you HAVE to use their SFP's. I am not. I am using cisco SFP's, which I figured would work fine.

    Their support page is pretty clear though. They don't say things like "it's not supported" or "not certified".

    They flatly declare it WILL NOT WORK.

    Doesn't make much sense to me, but at this phase I am willing to try just about anything.

    I ordered a few intel SFP's for testing. I will let folks know how that goes.



  • 4.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 10, 2011 05:40 PM

    Did you install the HP OEM version of ESXi 4.1? The HP version has the HP Management agents already installed. There were problems reported on the non-HP versions of ESXi 4.1 for some models. I understand the G6 and G7 had some issues with the non-HP versions. If you don't have the HP version you can still install the HP Management agents via CLI. There is a special package for ESXi, so don't use the ones for ESX.



  • 5.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 11, 2011 03:11 PM

    msemon1,

    I am using the standard vmware distribution of ESXi 4.1, with the HP management agents installed via the CLI.

    That should be fine, right?



  • 6.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 11, 2011 04:30 PM

    That should be fine. The HP Agents have been known to do bad things when installed or configured incorrectly.



  • 7.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 17, 2011 05:48 PM

    1. Try: play around with the ixgbe parameters. Try without MSI-X support (esxcfg-module -s "InterruptType=0" ixgbe && esxcfg-boot -b). vmkload_mod -s ixgbe shows enough buttons to press :smileyhappy:.

    2. Try other slots. The DL580 G5 has a strange PCIe sub-bus layout with too few lanes. Also try to get a dedicated IRQ assigned.

    Here the PCIe device layout: (assuming the 580 G5 has the PCIe sub IO board)


    The 580G5 is limited to 28 PCIe lanes out of the North Bridge. Of these, only 24 lanes (3ea x8) go to the slots:
    1ea x8 PCIe     shared through a switch to slots 1, 2 & 3 (sub IO board)
    1ea x8 PCIe     shared through a switch to slots 4, 5 & 6
    1ea x8 PCIe     shared through a switch to slots 7, 8, 9, 10 & 11


    (slots 8-11 are x4 PCIe slots. The rest are x8.)
    To maximize system IO bandwidth you need to equally load all three PCIe switches.

    3. There is a patched version of the IPMI driver out - do you use it ?

    4. I wonder since when HP supports installing 3rd party NICs. I would switch to an NC55x card (Emulex CNA).
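
    The slot layout in point 2 can be turned into a quick balance check. This is only a sketch: the slot-to-switch grouping is copied from the layout above, while the switch labels "A", "B", "C", the names `SWITCH_OF_SLOT`, `cards_per_switch` and `is_balanced`, and the example slot numbers are made up for illustration and do not come from HP documentation.

```python
from collections import Counter

# Slot-to-switch grouping as described in the post (DL580 G5 with the
# PCIe sub IO board). Each switch hangs off one x8 uplink to the north bridge.
# The labels "A", "B", "C" are hypothetical names for the three switches.
SLOT_GROUPS = {"A": (1, 2, 3), "B": (4, 5, 6), "C": (7, 8, 9, 10, 11)}
SWITCH_OF_SLOT = {slot: sw for sw, slots in SLOT_GROUPS.items() for slot in slots}


def cards_per_switch(populated_slots):
    """Count how many populated slots share each x8 uplink."""
    counts = Counter({sw: 0 for sw in SLOT_GROUPS})
    counts.update(SWITCH_OF_SLOT[slot] for slot in populated_slots)
    return dict(counts)


def is_balanced(populated_slots):
    """True when the busiest switch carries at most one card more than the idlest."""
    counts = cards_per_switch(populated_slots).values()
    return max(counts) - min(counts) <= 1


if __name__ == "__main__":
    # Hypothetical placements, not taken from the thread.
    for slots in ([1, 2], [1, 4], [1, 4, 7]):
        print(slots, cards_per_switch(slots), "balanced:", is_balanced(slots))
```

    With two 10Gb cards, placing them in slots behind different switches (for example one in slots 1-3 and one in slots 4-6) keeps them from competing for the same x8 uplink.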



  • 8.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Feb 10, 2011 03:11 PM

    Thanks for the input guys. I have not abandoned this thread, I have just been waiting to see if my implementation of some of your suggestions had helped me or not..

    I took your advice, Saturnous, and changed slots for my 10gb cards. I had been hopeful that the situation was resolved, but alas, it was not.

    It did seem to help the situation, as I went for about 20 days on several of the servers without a problem.

    However, it did occur again.

    I finally got an answer from VMware that may make a bit of sense.

    We have 4 10gb nics in the system, even though only 2 are being used.

    In addition, we have two 1gb copper nics, being used for management.

    This violates the config maximums. Apparently when you have 4 10gb nics you cannot have ANY 1gb nics in use.

    The VMware rep said that he has seen situations where this config maximum was violated and nics would occasionally just stop forwarding traffic.

    Which sounds like what we are seeing.

    I have gone into the BIOS of the HP servers, and disabled the unused nic ports, so ESXi only sees two 10gb nics and two 1gb nics.

    Vmware says this should be a good config.

    We'll see. I'll report back after a week or so and let people know if this was the solution or not.

    Thanks for your help



  • 9.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Sep 02, 2011 05:54 PM

    We've recently hit the same type of issues with the NetXen cards and were wondering whether disabling the unused 10G nics resolved the issue for you?



  • 10.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Sep 02, 2011 06:12 PM

    Not really.

    This issue has been plaguing us for over a year. Every time I thought I had it licked, it would rear its ugly head again a few weeks later.

    These intermittent problems are always the worst.

    I worked with VMware support extensively and FINALLY they got enough information from us that they believed it was a problem with the Intel driver.

    Intel apparently provided a new debug driver for version 4.0 and 4.1 just this week, which I am supposed to install and test.

    I am looking at doing that the first part of next week.

    I will update the thread when I get more info.

    If you think you are having the same issue, you might want to contact VMWare support. You can reference my SR# 11057191404

    Hope that helps

    Jason



  • 11.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Sep 02, 2011 07:19 PM

    I've opened a ticket and also found this latest advisory for the NC522SFP that also talks about it

    http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c02964542&lang=en&cc=us&taskId=101&prodSeriesId=3913537&prodTypeId=329290

    Thanks for the update.  I think 10G is going to be a pita for a while, no matter what brand we go with...



  • 12.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 02:21 PM

    We're building out a datacenter with 20 DL380 G7s, each with 2 NC522SFP 10GbE cards (attaching to a pair of Nexus 5548s). Since it was on the HCL with no footnote, I assumed it was stable, but I have since found various KB articles and posts indicating the contrary.

    Can someone verify if this is resolved? Anyone using this who has had no issues at all? Trying to determine if I need to do anything drastic prior to build.

    We will be using ESXi 4.1 U1 (latest build). I'll be applying the latest firmware and drivers for the card.



  • 13.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 02:33 PM

    We are in the process of replacing all the NC522 cards with the Intel X520 cards…we are still experiencing pause framing taking down the hosts with the latest firmware from HP and the latest vmware driver.

    So far the X520’s have been solid….



  • 14.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 02:52 PM

    That's not great. I originally had ordered NC523SFPs for these, but they were backordered and a decision was made to replace them with available NC522SFPs. Comparing the feature sets, there is very little difference between the 2, so I expected to see similar issue posts for it. Perhaps there are physical differences that make it more stable.

    I don't envy what you've gone through but, considering the issue takes down hosts, I'm a little surprised that I don't see more posts about it. I'm curious how widespread it is and if it might be a result of specific environmental circumstances (i.e. DL580 G5, 522 and downstream switch combination, etc).

    Not doubting any of this.....just not loving the idea of replacing 40-cards and would like to make sure before starting that ball rolling.



  • 15.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 03:12 PM

    http://longwhiteclouds.com/2011/08/31/hp-critical-advisory-nc522-and-nc523-10gbs-server-adapters/

    this was the last advisory…the exact issues we see are pause frames on the cisco side…we lose all connectivity…and even bouncing physical ports on nexus 5000 switches or unplugging cables to cause etherchannel to failover doesn’t work.

    Only recourse is a reboot of the host.

    I tried turning fans to maximum, making sure I wasn’t in slots 1/3 and ensuring I had latest firmware of everything.

    The network guys see massive traffic on one of the nic’s just before it all shuts down…



  • 16.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 03:12 PM

    PS – when doing the firmware update as per the HP advisory, make sure you pick RedHat Enterprise 5 x64…the x86 is a different firmware filename and isn’t recognized.

    We are also on the latest and greatest vmware patch levels…



  • 17.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 03:24 PM

    Thanks for the advisory link and the firmware update note. Irritating but appreciated. There is vague note about workload...

    Note: There is a low probability of this occurring when operating under a normal network workload.

    ...however I'm sure an ESX host would generally be considered "above normal workload".



  • 18.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 03:57 PM

    We have only about 100 vm’s across 7 hosts…cpu utilization…3-5%...

    Glad I wasn’t the one to build this out…explaining that ROI would be a tough sell :o)



  • 19.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 03:26 PM

    BTW...Had you applied this most recent advisory recommended updates prior to the replacement work? I assumed by your post history that you had.



  • 20.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 03:57 PM

    Yup..we had put that newest firmware on there about 3 days before one of the damn servers dropped…

    I have 2/5 servers changed over…feel like I am sitting on a timebomb…

    Usually it takes 1-4 weeks before the problem shows up on average. So far I have had 3 different hosts across 2 different pairs of nexus 5000’s take a dive….



  • 21.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 06:13 PM

    You're a real black cloud Rumple :-) I'll sound the alarm and see what the decision makers say...not sure our timeline will allow for a replacement. Looking forward to a very uneasy datacenter migration. At least I'll know what to monitor for and maybe get some warning. Appreciate all the information and insight though.



  • 22.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 06:32 PM

    Trust me…I wasn’t happy about it either…when we hit it, we had just migrated from one datacenter to a new datacenter with all new network gear, new ESX environment on 10G…then things started falling over…

    /me was not the popular boy in town let me tell you…

    What also bit us was when it was setup by the other consultant they forgot that you can have 4x10G cards…or 2x10G cards and 1G together…in the NC522 you cannot disable any of the ports on the 10g cards so even though they plugged in 2x 10G ports…vmware would see 4x ports…so while the 1G would work…it was unsupported configuration and upon reboot, there is always the possibility that depending on memory load order, your 10g ports could get knocked out…

    Sigh…



  • 23.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 06:36 PM

    I was able to disable ports on the NC522's without any problem.

    I just disabled the pci device for that port in the server bios.



  • 24.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 06:49 PM

    Wow..I hadn't heard about the 4-10GbE maximum but, just found it http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020808.

    I was only planning to connect 1 port on each card to start but also planned to utilize the 4-1GB Onboard NICs.

    1. Is it then true that if the 2 unused 10GbE ports are disabled, technically I will have 2 10GbE ports with respect to this KB and be allowed to utilize the 1GB?
    2. Manfriday - What server model are you using?


  • 25.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 07:02 PM

    If you can get the 2 unused ports on the nic's to disable then perfect...

    Qlogic and hp both indicated it could not be done...and in the device section I only saw port 1

    My suspicion was that with port 2 unplugged it never showed in bios but I worked with vmware and they showed all 4 ports enumerating...



  • 26.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Nov 23, 2011 11:33 AM

    Hi Rumple,

    Sorry to hear you're having so much trouble with your systems. I'm the author of longwhiteclouds.com. I'm running the Intel X520-T2 and I'm not having any problems at all. The cards have been rock solid. I understand that the SFP version of the same card type is also pretty rock solid. The customer that I had with the NC522SFP's is also now stable after the last driver and firmware updates.

    Have you considered switching to vSphere 5? The maximums for NIC ports are much better than on 4.x. On vSphere 5 you can have up to 6 x 10Gb/s ports AND 4 x 1Gb/s ports. Just in case you decide to go down this path, the config maximums document is at this location:  http://www.vmware.com/pdf/vsphere5/r50/vsphere-50-configuration-maximums.pdf

    I hope you get a new driver that works, or have some success with vSphere 5. IMHO vSphere 5 is well worth the upgrade.



  • 27.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Nov 23, 2011 01:35 PM

    Since we replaced all 14 of the hp 10g 522 cards with the X520 single port sfp versions we have not had a single incident.



  • 28.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Dec 28, 2011 08:20 PM

    We have 20 DL380 G7s in 2 separate datacenters.  Each server has 2 NC523SFP dual port cards.  We are connecting one port on each nic to 2 Nexus 5548.  We are etherchanneling, and are using one vmkernel with Active / Active nics.  Randomly we are seeing one nic drop for about 2 seconds which triggers a redundancy lost alarm.  We have been working with HP because this http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c02964542&aoid=35252 didn't solve the problem.  We are using the 4.0.727 driver with firmware 4.8.22.  When the problem happens we see in the message logs "firmware hang detected".

    We then ordered 2 NC522SFP to put into one of the servers and that just ended up worse.  When the nic flapped on this one, the network connection would not come back up until I bounced the server.

    We have involved HP, VMware, and Cisco, and all fingers seem to point to HP firmware.  Please tell me that I am not the only one out here having this issue.  Unless I can come up with some other ideas, we are now looking into the Intel® Ethernet Server Adapter X520-DA2.

    Any help would be appreciated,

    Matt



  • 29.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Dec 28, 2011 08:37 PM

    We were experiencing the issues you described when we were running the NC522SFP NICs in EtherChannel mode with the same Nexus line (maybe the smaller 5520 series). We ended up replacing all 14 NICs with the single-port Intel X520-SR1 (non-HP branded) and have not had a single issue since we did that over 2 months ago…previous to that, we’d have a server fall over every day or 3.

    We had a single port on each HP NetXen SFP card connected, and when one failed it would take out the entire server. The switch guys were seeing a massive amount of port flooding prior to and during the outage. As you found, only a reboot of the server brought it back.

    Check out this thread as well

    http://wahlnetwork.com/2011/08/16/identifying-and-resolving-netxen-nx_nic-qlogic-nic-failures/



  • 30.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 01, 2012 03:18 AM

    same problem here Matt..

    1 x NC523 latest firmware & vmware driver - both ports connected to Cisco 3750x latest ios, DL380G6, vSphere 4.1 348481, few vm's lightly loaded host.

    006248: Dec 30 19:57:23.911: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/1/2, changed state to down
    006249: Dec 30 19:57:23.945: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet3/1/1, changed state to down
    006250: Dec 30 19:57:24.918: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/1/2, changed state to down
    006251: Dec 30 19:57:25.086: %LINK-3-UPDOWN: Interface TenGigabitEthernet3/1/1, changed state to down
    006252: Dec 30 19:57:36.628: %LINK-3-UPDOWN: Interface TenGigabitEthernet3/1/1, changed state to up
    006253: Dec 30 19:57:36.628: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/1/2, changed state to up
    006254: Dec 30 19:57:38.725: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/1/2, changed state to up
    006255: Dec 30 19:57:38.742: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet3/1/1, changed state to up
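
    To put a number on how long each port was actually down, the switch log above can be fed through a small parser. This is a rough sketch, not a tool anyone in the thread used: the function name `flap_durations` and the `year` default are assumptions (the log lines carry no year), and it only pairs the %LINK-3-UPDOWN down/up events, skipping the LINEPROTO lines.

```python
import re
from datetime import datetime

# Log lines copied from the post above.
LOG = """\
006248: Dec 30 19:57:23.911: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/1/2, changed state to down
006249: Dec 30 19:57:23.945: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet3/1/1, changed state to down
006250: Dec 30 19:57:24.918: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/1/2, changed state to down
006251: Dec 30 19:57:25.086: %LINK-3-UPDOWN: Interface TenGigabitEthernet3/1/1, changed state to down
006252: Dec 30 19:57:36.628: %LINK-3-UPDOWN: Interface TenGigabitEthernet3/1/1, changed state to up
006253: Dec 30 19:57:36.628: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/1/2, changed state to up
006254: Dec 30 19:57:38.725: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/1/2, changed state to up
006255: Dec 30 19:57:38.742: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet3/1/1, changed state to up
"""

# Matches only the physical link events, e.g.
# "006250: Dec 30 19:57:24.918: %LINK-3-UPDOWN: Interface X, changed state to down"
LINK_RE = re.compile(
    r"^\d+: (?P<ts>\w{3} +\d+ [\d:.]+): %LINK-3-UPDOWN: "
    r"Interface (?P<port>\S+), changed state to (?P<state>up|down)$"
)


def flap_durations(text, year=2011):
    """Return [(port, seconds_down), ...] for each down -> up pair.

    The log carries no year, so `year` is an assumed value; it only
    matters if a flap happens to span New Year.
    """
    went_down = {}
    flaps = []
    for line in text.splitlines():
        m = LINK_RE.match(line.strip())
        if m is None:
            continue  # skip LINEPROTO and unrelated lines
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S.%f")
        port = m["port"]
        if m["state"] == "down":
            went_down[port] = ts
        elif port in went_down:
            flaps.append((port, (ts - went_down.pop(port)).total_seconds()))
    return flaps


if __name__ == "__main__":
    for port, seconds in flap_durations(LOG):
        print(f"{port}: link down for {seconds:.3f} s")
```

    Run against a longer capture, the same pairing would show whether both ports always drop together or whether the outages are independent.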



  • 31.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 06, 2012 02:18 AM

    We have had problems with the NC522SFP for about 18 months now. Each time we upgrade the firmware and/or drivers the problems morph but never go away. We continue to see transmit timeouts, excessive Xoff pause frames, port resets, and PSOD.

    Even our new ESXi 5.0 hosts with the most current NC522SFP firmware and drivers still have the problems.

    We still have about 60 hosts with NC522SFP adapters.

    • HP ProLiant DL380 G6, G7, and DL580 G7 servers
    • NC522SFP ports connected to separate Cisco Nexus 5000 switches
    • ESXi 4.1 U1, U2, and ESXi 5.0
    • NC522SFP firmware = 4.0.579
    • ESXi 5.0 nx_nic driver = 5.0.601

    We have open and active cases with HP and VMware. Both have acknowledged a problem, but as of today we still don’t have a fix. I have lost all confidence in the NC522SFP.

    Time to move on...



  • 32.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 06, 2012 02:52 PM

    Yeah, we started with the 523 and then tried out the 522 (made things worse).  Just yesterday I replaced 4 NC523SFP with Intel X520-DA2 cards in two of our servers.  I will post in about a week if the cards are stable.



  • 33.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 07, 2012 01:09 AM

    That would be great I hope it goes well. I think we will need to go down this path also..

    =====

    also has anyone tried the firmware that vmware state on the HCL?

    Model: NC523SFP 10Gb 2-port Server Adapter
    Device Type: Network
    Partner Name: HP
    Number of Ports: 2
    VID: 1077
    DID: 8020
    SVID: 103c
    SSID: 3733
    Firmware Version: 4.6.31 (firmware); 4.0.702 (driver)

    ESXi 5.0: qlcnic version 5.0.727 async
    Footnotes: Download driver from http://www.vmware.com/download/vsphere/drivers_tools.html

    ESX / ESXi 4.1 U2: qlcnic version 4.0.727


  • 34.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 14, 2012 03:39 AM

    Hi Guys,

    Any updates?

    Thx



  • 35.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 16, 2012 03:39 PM

    It has been a week and a half and we have had no issue with the intel nics.  Today I am replacing the remaining NC523SFP and shipping them back.

    Best of all, HP decided to close my ticket with them this weekend, without contacting me.

    I edited this post because before I mentioned turning on vmdq.  I have tested on two systems, and the performance seems worse when you actually configure it instead of using it with the default setting.  I recommend not messing with the vmdq setting.

    Message was edited by: JonesytheGrea…



  • 36.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Jan 23, 2012 03:50 PM

    One last update.  We have replaced both of our datacenters with the Intel x520-DA2 cards and after updating the drivers to the most current version, I have had no more issues.  Ditching the Qlogic cards was the solution.



  • 37.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Mar 06, 2012 11:48 AM

    ManFriday,

    Your comment:

    "They are saying that you HAVE to use their SFP's. I am not. I am using cisco SFP's, which I figured would work fine.

    Their support page is pretty clear though. They dont say things like "it's not supported" or "not certified".

    They flatly declare it WILL NOT WORK."

    Can you please send me the link that says this?  I would like to check this out.

    Just want to update everyone on the SR 11057191404 that was opened by ManFriday.  It is still open and under investigation by both Cisco and VMware.

    So this is a big issue.



  • 38.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 06:26 PM

    Hi David,

    Here is a link to the Intel web-page:

    http://www.intel.com/support/network/adapter/pro100/sb/CS-030612.htm

    What are the SFP+ optical module requirements for the Intel® Ethernet Server Adapter Series?

    • Intel® Ethernet SFP+ SR Optics and Intel® Ethernet SFP+ LR Optics
    • Other SFP+ modules are not allowed and cannot be used with these adapters.

    I just realized you asked me this like a month ago.. Sorry, somehow it slipped by unnoticed in my inbox until just now.

    Embarrassing.



  • 39.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 06:31 PM

    Oh, but while I am talking about intel SFP's, I noticed a new behavior with the Intel X520-DA2 nics w/ non-intel SFP's under version 5.

    In ESXi 4 the Cisco and Advantage Optics SFP's did actually seem to work, despite not being supported by Intel.

    In ESXi 5, the port is actually DISABLED if it has a non-Intel SFP plugged in.

    Neat!



  • 40.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Oct 06, 2011 07:02 PM

    I worked with HP and QLogic, and in the DL380 the only thing that showed up in the BIOS device list was port 1. I could disable the entire card easily enough, but I could not disable port 2 on each card and use port 1 for connectivity.



  • 41.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 09, 2012 12:53 PM

    We have also unfortunately purchased the NC523SFP cards.

    We have been running these cards for about a year, and they have been trouble from the start.

    Although there have been various firmware and driver updates, these cards have intermittently suffered Link Loss issues. Generally the cards recover within a few seconds.

    A week or so ago we experienced the same link loss but this time on both cards at the same time. Of course this means production outage..

    I took the plunge and upgraded one host to ESXi 5, applied the new firmware and drivers

    http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=4345880&prodTypeId=329290&objectID=c02964542

    I'd be lying if I said this had improved the situation. It's in fact much worse.

    We don't suffer the Link Loss issues anymore. The cards appear fine, they just don't transmit packets. Oh, and somehow CPU utilisation of the host also flatlines during this issue. At times the host recovers, sometimes I have to reboot the host to get it back.

    We are using the NC522SFP cards in our G6 hosts. They have been stable for the past 2 years but did not start out that way..

    I'm also trialing the Emulex rebranded card the NC552SFP, so far so good..

    We will need to make some hasty decisions on this issue this week, it's no longer a workable solution. The NC523SFP's need to go.

    I'll get hold of a Intel X520-DA2 and trial it along side the NC552SFP.



  • 42.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 09, 2012 03:51 PM

    an update.

    There is a later driver for the NC523SFP (or QLogic QLE3242) from QLogic, and it is available for download from VMware.

    This obviously means HP don't support the driver but qLogic and VMWare do..

    I'll do some testing and report back.

    The driver

    http://downloads.vmware.com/d/details/dt_esxi50_qlcnic_5_0_741/dHRAYndlaCpiZHAlJQ==



  • 43.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 09, 2012 05:18 PM

    Have you tested this?

    Please let us know.

    milton123 


  • 44.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 12:31 AM

    I've effectively been testing the driver for about 12 hours. Unfortunately the result is the same.

    The NFS stores are periodically going off line but they do appear to recover after a few seconds. You could say it's improved but not at all workable

    An extract from the logs

    Lost connection to server fasdc01nfs10gb mount point /vol/esx_aggr3_file_01/esx_aggr3_file_01_qtree mounted as 7327dc8f-d2c7c3a1-0000-000000000000 (sannfssata01).
    error
    10/04/2012 10:16:57 AM
    ServerName

    Restored connection to server fasdc01nfs10gb mount point /vol/esx_aggr3_file_01/esx_aggr3_file_01_qtree mounted as 7327dc8f-d2c7c3a1-0000-000000000000 (sannfssata01).
    info
    10/04/2012 10:17:12 AM
    ServerName



  • 45.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 12:49 AM

    just a thought; your server has not exceeded the configuration maximum's has it? i.e how many NIC's in total do you have in this system?



  • 46.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 01:20 AM

    Good thought, damicall, I'd not thought of that one..

    The server has 16 NIC's (as in ports). I'm not sure what the supported number of NIC's is with ESXi 5 but it was 20 under ESX 4 so I can assume we are within a supported configuration



  • 47.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 01:29 AM

    10gb nics complicate the config maximums a little bit.

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020808

    Looks like the limit in 5.0 is "six 10gb and 4 1gb ports"
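
    For anyone who wants to sanity-check a host against the combinations mentioned in this thread, here is a minimal sketch. Treat it as an illustration only and check the KB above for the authoritative numbers: the 5.0 entry ("six 10gb and 4 1gb") and the 4.x rule ("four 10Gb on their own, or two 10Gb together with 1Gb ports") are as quoted by posters here, the 1Gb count of 4 in the 4.x mixed entry is an assumption because nobody in the thread gives that number, and the names `SUPPORTED_COMBOS` and `within_maximums` are hypothetical.

```python
# (max 10Gb ports, max 1Gb ports) combinations.
# "5.0": as quoted in this thread.
# "4.x": (4, 0) and "two 10Gb plus some 1Gb" are as described in this thread;
#        the 1Gb limit of 4 in (2, 4) is an ASSUMPTION, not stated by any poster.
SUPPORTED_COMBOS = {
    "4.x": [(4, 0), (2, 4)],
    "5.0": [(6, 4)],
}


def within_maximums(version, ten_g, one_g):
    """Return True if the port mix fits one of the listed combinations.

    Counts every port the hypervisor enumerates, cabled or not.
    Only the mixed 10Gb/1Gb rule is modelled here.
    """
    if ten_g == 0:
        raise ValueError("1Gb-only hosts fall under per-driver limits not modelled here")
    return any(
        ten_g <= max_10g and one_g <= max_1g
        for max_10g, max_1g in SUPPORTED_COMBOS[version]
    )


if __name__ == "__main__":
    # Port mixes described earlier in the thread.
    print(within_maximums("4.x", ten_g=4, one_g=2))   # four 10Gb seen by ESXi plus two 1Gb
    print(within_maximums("4.x", ten_g=2, one_g=2))   # after disabling the unused 10Gb ports
    print(within_maximums("5.0", ten_g=4, one_g=12))  # the 16-port hosts discussed below
```

    Note that the count is of ports ESXi sees, not ports that are cabled, which is why disabling the unused ports in the BIOS changes the answer.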



  • 48.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 01:38 AM

    You need to be very careful about the number of 1G vs number of 10G NIC's. 2 x 10G and 8 x 1G (Which I have in my hosts), but also this is normally only for 1500 MTU. I run my lab environment at Jumbo MTU 9000 all the time however, but I wouldn't recommend this for a production environment.




  • 49.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 04:49 AM

    We have 16 NIC's in our Dl585 g6's

    12 1Gb NIC's and 4x 10Gb NIC's 

    The 1Gb are the on board + 2x NC365T (Intel Chipset)

    the 10Gb are NC522SFP's (the older qLogic) 2 of these port run with Jumbo's

    No problem with these server



  • 50.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 05:08 AM

    That is currently an unsupported host NIC configuration as you are exceeding the maximums. Just because something works doesn't mean it's supported. The maximums are simply what has been tested, and the main driver for the number of NICs is CPU cores and memory buffers. Provided you have lots of cores and memory, configurations exceeding the maximums can work, even if they aren't officially tested and supported. VMware is going to be doing some more testing of different combinations of NICs in future releases from what I hear.



  • 51.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 06:00 AM

    Hi Michael

    The DL585 G7s in question have 48 cores and 320GB of memory.

    Although I've not read the entire document relating to supported NIC configurations, I would have thought this NIC configuration was OK given the core count and memory.

    Your advice on this would be appreciated..



  • 52.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 07:48 AM

    You would think so, but unfortunately that's not the way the supported maximums work. I recently had a customer with 1TB RAM and 160 cores per host. They had 4 x 10G NIC ports and 8 x 1G NIC ports. This is also not a supported configuration. Unless a combination is listed as supported and as having been tested, it is not supported. You may still get lucky and it might still work, but you may have difficulty if you log a call and they determine the root cause could be related to running too many NICs per host. In my experience VMware support will still try to help on a best-efforts basis, but may end up asking you to remove some NICs from the host. Hopefully the limits and combinations are changed in the next release.

    --

    Michael Webster, VCDX

    Director

    IT Solutions 2000 Ltd

    Mob: 021 500 432 | longwhiteclouds.com | twitter.com/vcdxnz001



  • 53.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 11:25 AM

    "4 x 10G NIC Ports and 8 x 1G NIC ports" IS NOT A SUPPORTED CONFIGURATION and I can confirm this for you.  Michael is ABSOLUTELY correct on this.  This partly explains a lot of the issues you are seeing so far.

    I've had many conversations with VMware TAC on this issue and they have OFFICIALLY confirmed this.

    Let's say that your system has 4x10G and 8x1G NIC ports, and you use only 2x10G and 2x1G ports with nothing plugged into the remaining ports.  THIS IS NOT GOOD ENOUGH.  YOU MUST DISABLE THE REMAINING PORTS IN THE SYSTEM BIOS SO THAT VMWARE ESX CANNOT SEE THEM AT BOOT TIME.  IF VMWARE SEES THOSE UNUSED PORTS, YOU NOW HAVE AN UNSUPPORTED CONFIGURATION AND AN UNSTABLE SYSTEM.  Simple as that.  With this configuration, the more load you put on the ESX systems, the more unstable they become.

    There was a similar discussion on another post:  http://communities.vmware.com/message/2005863#2005863



  • 54.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 05:11 PM

    Hi David

    Thanks for your response.

    I must say "4 x 10G NIC Ports and 8 x 1G NIC ports" is in itself a very limited configuration.

    The article https://www.vmware.com/pdf/vsphere5/r50/vsphere-50-configuration-maximums.pdf contradicts this.

    As you can see, the article only gives one example. In this example they give 6x 10Gb NICs and 4x 1Gb NICs as a maximum configuration.

    I'm aware there's no set formula to calculate this maximum configuration, as each NIC type appears to utilise differing levels of resources, but as a rough guide I've always thought of it as "each 10Gb port = 4 1Gb ports".

    Obviously there are no hard and fast rules here.
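
    To make that rough guide concrete, here is a minimal Python sketch that converts a port mix into 1Gb-port equivalents. The weight of 4 is only the rule of thumb quoted above, not a VMware formula, and the function name and labels are made up for illustration. A mix that scores at or below the document's example can still be unsupported if it is not a listed combination.

```python
# Rough 1Gb-equivalent port count using the informal rule of thumb from this
# thread ("each 10Gb port = 4 1Gb ports"). This is NOT an official VMware formula.

TEN_GB_WEIGHT = 4  # assumption: the rough guide quoted above


def one_gb_equivalents(ten_gb_ports: int, one_gb_ports: int) -> int:
    """Return the port count expressed in 1Gb-port equivalents."""
    if ten_gb_ports < 0 or one_gb_ports < 0:
        raise ValueError("port counts must be non-negative")
    return ten_gb_ports * TEN_GB_WEIGHT + one_gb_ports


# Port mixes quoted in this thread (labels are descriptive only).
configs = {
    "maximums document example (6x 10Gb + 4x 1Gb)": (6, 4),
    "4x 10Gb + 8x 1Gb": (4, 8),
    "4x 10Gb + 12x 1Gb": (4, 12),
}

for label, (ten_gb, one_gb) in configs.items():
    print(f"{label}: {one_gb_equivalents(ten_gb, one_gb)} x 1Gb equivalents")
```

    The number is only useful for comparing mixes against each other; it says nothing about driver type or overhead.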

    There are also some inaccuracies in the "configuration-maximums" document, e.g. the nx_nic driver, listed as a 10Gb QLogic driver, is in fact a 1Gb QLogic NetXen driver. The QLogic 10Gb driver is qlcnic.

    We have

    2x NC523SFP NIC's

    2x NC375T NIC's

    Onboard 375i ports (4 Ports)

    This configuration functions in ESX 4.1 but is unstable with ESXi 5.

    Initially with ESX 4.1 the NICs would suffer link loss. Although I only updated the firmware and driver about a week ago, I've not seen any link loss issues since the updates were applied. This may be a positive move forward by QLogic.

    I'm in the process of reducing the NIC count.

    As a test I have removed one of the NC523 NIC's.

    Although it's only been a few hours so far it's stable.

    If this continues to be stable I can alter our servers by removing one of the NC375T cards. If this isn't stable I'll have to look at redesigning the solution.

    Also I have requested a trial of 2 NC552SFP's. These should arrive tomorrow,  I'll update the thread post testing.



  • 55.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 06:03 PM

    I am just telling you what VMware TAC gave me as an official response.  The TAC case is 12150432902.  My TAC case is with ESXi 4.1, NOT 5.x.

    If ESXi 4.1 sees 4x 10Gig plus additional 1Gig ports at boot time, then you will have an unstable system.



  • 56.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 10, 2012 09:18 PM

    It's not just the number of NICs, it's also the type. Different driver versions have different overheads on the hypervisor. The best way to either find out if it's a supported configuration or get it supported is to log a support request with VMware and have them bless the configuration.

    --

    Michael Webster, VCDX

    Director

    IT Solutions 2000 Ltd

    Mob: 021 500 432 | longwhiteclouds.com | twitter.com/vcdxnz001



  • 57.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 11, 2012 12:24 AM

    David

    I don't mean to sound argumentative at all and appreciate the input.

    Currently the server is running with one 10Gb "NC523SFP" removed..

    It's not lost the plot yet so we may have a winner.

    Another odd thing I saw:

    I had 7 NFS targets connected on the server. When I tried to add another NFS target it failed, complaining that the number of available NFS connections had been exceeded.

    Odd since ESXi 5 supports 256 NFS targets!



  • 58.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 11, 2012 12:28 AM

    By default it's set to 8 (the NFS.MaxVolumes advanced setting); you need to increase that in the options.

    That would suggest none of the other recommended settings you usually change with NFS are set either.

    If it's a NetApp, get VSC 2.1.1 installed and it will help you configure your NFS settings.



  • 59.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 11, 2012 12:16 PM

    Thanks Rumple. I was reading the maximum document when I saw that NFS target number.

    The bad news.

    With one NC523SFP removed from the Host server it worked fine for about 20 hours before falling over.

    I've had to reboot the host to resolve the issue.



  • 60.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 11, 2012 12:23 PM

    So what is your NIC count now?

    You may still need to remove more to get a supported and stable system.



  • 61.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 11, 2012 08:37 PM

    Back to the topic which was HP NIC's & instability.

    Found this in my error logs on esxi after I noticed a NC523 link go up and down this morning.

    Apr 11 18:59:46 vmkernel: 88:15:28:15.794 cpu11:4409)<6>qlcnic 0000:07:00.0: Firmware Hang Detected
    Apr 11 18:59:46 vmkernel: 88:15:28:15.794 cpu11:4409)<6>qlcnic 0000:07:00.0: Disabled bus mastering.
    Apr 11 18:59:46 vmkernel: 88:15:28:15.795 cpu11:4409)IDT: 1565: 0x82
    Apr 11 18:59:46 vmkernel: 88:15:28:15.795 cpu11:4409)IDT: 1634: <vmnic12[0]>
    Apr 11 18:59:46 vmkernel: 88:15:28:15.871 cpu0:4412)<6>qlcnic 0000:07:00.1: Firmware reset request received.
    Apr 11 18:59:46 vmkernel: 88:15:28:15.871 cpu0:4412)<6>qlcnic 0000:07:00.1: Disabled bus mastering.
    Apr 11 18:59:46 vmkernel: 88:15:28:15.871 cpu0:4412)IDT: 1565: 0xc2
    Apr 11 18:59:46 vmkernel: 88:15:28:15.871 cpu0:4412)IDT: 1634: <vmnic13[0]>

    and using vmware driver VMware ESX/ESXi 4.x Networking Driver (qlcnic) Version 4.0.727

    http://downloads.vmware.com/d/details/dt_esxi40_qlogic_qlcnic_40727/ZHcqYnQqQGhiZEBlZA

    and some additional info, as people are seeing the same

    http://wahlnetwork.com/2011/08/16/identifying-and-resolving-netxen-nx_nic-qlogic-nic-failures/

    http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&externalId=2012455&sliceId=1&docTypeID=DT_KB_1_1&dialogID=286415943&stateId=0%200%20286421091
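
    For anyone wanting to check their own hosts, here is a minimal Python sketch that picks the "Firmware Hang Detected" entries out of vmkernel log lines in the format pasted above and reports the PCI address of the affected port. The regular expression is an assumption based only on those sample lines, and the function name is made up for illustration; other driver versions may log differently.

```python
import re

# Matches qlcnic messages in the format shown above, e.g.
# "...<6>qlcnic 0000:07:00.0: Firmware Hang Detected"
# (assumption: the pattern is derived only from the sample lines in this post).
QLCNIC_LINE = re.compile(
    r"qlcnic (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): (?P<msg>.*)"
)


def firmware_hangs(lines):
    """Return the PCI addresses of qlcnic ports that logged a firmware hang."""
    hung = []
    for line in lines:
        match = QLCNIC_LINE.search(line)
        if match and "firmware hang detected" in match.group("msg").lower():
            hung.append(match.group("pci"))
    return hung


# Sample lines copied from the log excerpt above.
sample = [
    "Apr 11 18:59:46 vmkernel: 88:15:28:15.794 cpu11:4409)<6>qlcnic 0000:07:00.0: Firmware Hang Detected",
    "Apr 11 18:59:46 vmkernel: 88:15:28:15.794 cpu11:4409)<6>qlcnic 0000:07:00.0: Disabled bus mastering.",
    "Apr 11 18:59:46 vmkernel: 88:15:28:15.871 cpu0:4412)<6>qlcnic 0000:07:00.1: Firmware reset request received.",
]

print(firmware_hangs(sample))  # ['0000:07:00.0']
```

    Feeding it the lines of a saved log file (for example via `open(path)`) works the same way, since it only needs an iterable of strings.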



  • 62.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 17, 2012 12:44 PM

    damicall,

    We were seeing the exact error that you describe where every time there would be a failure you could see that a firmware hang was detected.  I passed log after log to HP and didn't get anywhere.  We finally replaced all of our dual 523SFPs with dual Intel 520 cards and we have not had an issue since.  If you want to save yourself a headache, I would change the cards out if you can.

    I have upgraded two servers so far to ESXi 5 and am not having any issues.  We are using cisco cables and SFPs and there has been no issue with them and the Intel cards.  

    For networking I am using 2 onboard 1GB Broadcom nics for management and vmotion and I have 2 10GB Intel 520 that handle my VM traffic.  The host does see 2 extra 10GB nics out there because I cannot disable them in the bios.  So far this configuration has been very stable.

    Jonesy



  • 63.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 17, 2012 02:34 PM

    "For networking I am using 2 onboard 1GB Broadcom nics for management and vmotion and I have 2 10GB Intel 520 that handle my VM traffic.  The host does see 2 extra 10GB nics out there because I cannot disable them in the bios.  So far this configuration has been very stable."

    The reason you have a stable system is because you are using ESXi 5 and NOT ESX 4/4i.

    This configuration is UNSTABLE in ESX 4/4i because the host sees 4x 10Gb NICs and 2x 1Gb onboard NICs, thus violating the maximum network configuration in ESX 4/4i.



  • 64.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 18, 2012 01:18 PM

    Well David2009, I'm not entirely sure I agree with the idea that ESXi 5 is more stable than ESX4.

    I can produce the same result in both versions.

    I was looking at 2 servers with identical hardware, one was running ESXi 5 and the other ESX4.

    My ESX4 server was stable except for the odd link loss issue but I never entirely lost the path due to NIC redundancy.

    The ESXi 5 server was the one that was suffering path loss and would require a reboot to get it back on line..

    I noticed I'd made a mistake when I was setting up the ESX4 server.

    I'd enabled jumbos on the vSwitch used for IP storage traffic, but I'd forgotten to enable jumbos on the associated vNIC. Of course I thought bugger, so I corrected the configuration. About 2 hours later the ESX4 server lost access to the NFS stores.

    By correcting my mistake I'd discovered how to break the ESX4 host..

    I've checked the physical switches and the end point device; jumbos are enabled all the way down the line. This is not a configuration issue but may be related to another (ANOTHER) issue with these QLogic NICs, or possibly the port count.
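
    The "all the way down the line" check can be sketched in a few lines of Python: every hop has to carry the wanted MTU, and the largest ping payload that fits unfragmented is the MTU minus the IPv4 and ICMP headers. The hop names and values below are hypothetical and simply mirror the mistake described above; the function names are made up for illustration.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def undersized_hops(hop_mtus, wanted_mtu=9000):
    """Return the names of hops whose MTU is below the wanted end-to-end MTU."""
    return [name for name, mtu in hop_mtus.items() if mtu < wanted_mtu]


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


# Hypothetical path: vSwitch at 9000, host-side storage interface left at 1500.
path = {
    "vSwitch": 9000,
    "storage interface": 1500,
    "physical switch": 9000,
    "NFS endpoint": 9000,
}

print(undersized_hops(path))   # ['storage interface']
print(max_ping_payload(9000))  # 8972
```

    The 8972 figure is the payload size to use with a do-not-fragment ping (for example `vmkping -d -s 8972` from the host) when confirming jumbos really pass end to end.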

    Our Hong Kong associates have advised that a similar issue has been occurring for them. HP has admitted the NC375T is likely the actual cause and is advising that the NICs be replaced with the Intel version, the NC365T.

    I know for a fact these NC365Ts utilise 128MB of system RAM, which I guess is no big deal. We have these NICs in a few of the DL585 G6 servers. Oddly, these DL585 G6s have never had any NIC issues.



  • 65.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 18, 2012 04:11 PM

    Markzz,

    Are you using iSCSI?  Is it possible that something got messed up with the binding of nics to hba?



  • 66.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 19, 2012 05:12 AM

    Hi JonesytheGrea…

    No iSCSI used here.. All IP Storage is NFS.



  • 67.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 18, 2012 04:09 PM

    David2009,

    I am upgrading from ESXi 4.1 update 2 to ESXi 5.  The configuration I'm running on 5 was also stable for me on 4.1.  If I could have disabled the extra 10GB, I would have.  The other part of the discussion is that some people are having problems with cables that are not Intel cables.  That is not the case with us; we are using Cisco cables for our 10GB SFP connections.



  • 68.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 18, 2012 08:09 PM

    I have this same issue with 10 ESXi hosts on version 4.1 update 2.  They run on DL380 G7s and have two NC523SFPs each.  Only two 10Gb links are being used, and we are using Cisco cables.  About every 24hrs or so I get the "Firmware Hang Detected" message on at least one ESXi host.  I've tried disabling the 4 onboard 1Gb NICs so ESXi only sees the 4 10Gb NICs.  Again, I'm only using 2 10Gb NICs for all traffic, and I had the same failures.  I opened a ticket with HP and they sent me the same exact card, so I'm not sure it's going to fix anything.  I find it hard to believe I have 20 bad cards.  I also opened a ticket with VMware and they seemed to think it may be something to do with the firmware/driver or maybe the network configuration.  Since I have the latest firmware and driver loaded, I started to simplify the network.  I'm currently testing an explicit failover setup instead of using "iphash" and a port channel with our Nexus switches.  I have a few tasks set up to vMotion VMs back and forth 4 times an hour.



  • 69.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 19, 2012 05:21 AM

    Hi caledunn

    Your testing sounds complex and comprehensive.

    I agree with your point regarding the NC523SFP cards.

    You may have 1 or 2 faulty cards but not all 20. I therefore wonder if HP have updated the hardware version of the cards and therefore changed something.

    I've got 2 NC523SFP cards here currently which HP have sent me. These are hardware revision 0d.. I've not compared these cards to the current NC523's in the prod servers.. I'll get to that one tomorrow.

    After upgrading the firmware and driver on the NC523 I have not seen another link loss (firmware hang). I now seem to suffer an issue where the cards won't transmit packets. Great job, QLogic.



  • 70.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 19, 2012 05:32 AM

    All our cards are

    HP P/N 593715-001  REV 0B   (white sticker on SFP+ Slot)



  • 71.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 19, 2012 05:38 AM

    I'll check mine tomorrow evening.



  • 72.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 19, 2012 01:02 PM

    My test passed the 24hr mark on the ESXi hosts in the test cluster, BUT I just got the "Firmware Hang Detected" message on an ESXi host in our Exchange cluster.  I had removed iphash and the port channel on that host also, so it doesn't look like the config is the issue.  I think the next thing I'm going to do is get an HP NC550SFP and an X520-DA2 and test with them.

    Interesting about the rev on the NC523SFP, because the ones I have are 593715-001 Rev:0C and HP sent me Rev:0C.  I'm going to double check some of my servers and see if they are Rev:0C.  I still have 13 unboxed that I believe are all Rev:0C also, so I'm not optimistic.



  • 73.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 19, 2012 06:12 PM

    I double-checked the 10Gb cards in the server that failed this morning and they are at Rev: 0C, which is the same Rev that HP sent me.



  • 74.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 22, 2012 01:18 PM

    Last night one of the ESXi hosts in the vSphere cluster I'm using to test finally failed with the "Firmware Hang Detected" message.  So the changes helped: instead of failing every 24hrs, it lasted for almost 5 days.  I guess the next step is to try a new card.  We are getting 10 NC552SFP cards this week.  We sent what we hadn't unboxed yet back to swap for the NC552SFP cards.



  • 75.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 22, 2012 05:58 PM

    caledunn

    We use the NC522SFP in a number of DL585g6's.

    They were quite troublesome in the early days but have been stable since their last firmware and driver update.

    If I recall, the firmware released late last year is stable.

    Driver and firmware information.

    ~ # ethtool -i vmnic2
    driver: nx_nic
    version: 5.0.601
    firmware-version: 4.0.579
    bus-info: 0000:08:00.0
    ~ # ethtool -i vmnic0
    driver: nx_nic
    version: 5.0.601
    firmware-version: 4.0.579
    bus-info: 0000:02:00.0
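
    If you need to compare driver and firmware versions across many hosts, the `ethtool -i` output above is easy to parse. This is only a minimal Python sketch that assumes the plain "key: value" layout shown here; the function name is made up for illustration.

```python
def parse_ethtool_info(text):
    """Parse `ethtool -i` style "key: value" lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so PCI addresses in bus-info stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Sample copied from the vmnic2 output above.
sample = """\
driver: nx_nic
version: 5.0.601
firmware-version: 4.0.579
bus-info: 0000:08:00.0
"""

info = parse_ethtool_info(sample)
print(info["driver"], info["version"], info["firmware-version"])
```

    Collecting one such dict per vmnic makes it simple to spot a host that missed a firmware or driver update.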

    OH another thought.

    I've had reports the NC375T are part of the problem and these should also be replaced with NC365T's

    The DL585g6's mentioned use NC364T addin nics (the NC364T is the early version of the NC365T both use an Intel chipset)



  • 76.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 23, 2012 01:01 PM

    Is the latest ESXi driver for the NC523SFP card still 4.0.727 or is it now 4.0.739?  In the hardware compatibility guide it's still listed as 4.0.727, but when I go to the ESXi driver CD I only see 4.0.739. If I go to the HP advisory page and click the download link, it does take me to a page for 4.0.727.

    hardware compatibility guide:

    http://www.vmware.com/resources/compatibility/detail.php?deviceCategory=io&productid=19311&deviceCategory=io&partner=41&releases=158&deviceTypes=6&page=5&display_interval=10&sortColumn=Partner&sortOrder=Asc

    Hp advisory:

    http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c02964542&lang=en&cc=us&taskId=110&prodSeriesId=4345880&prodTypeId=329290

    vmware 4.0.727 page:

    https://my.vmware.com/web/vmware/details/dt_esxi40_qlogic_qlcnic_40727/ZHcqYnQqQGhiZEBlZA

    Vmware driver cd:

    https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESX4x-QLOGIC-QLCNIC-40739&productId=230#dt_version_history



  • 77.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 22, 2012 04:22 PM

    Friday evening I installed the new NC523SFP's HP support sent to me.

    The original cards are REV:0A. The new cards are REV:0D.

    Unfortunately one of these cards appeared to promptly fail. Regardless, I continued the test with both cards installed but only one functioning.

    After about 20 hours the same issue occurred and I could not transmit packets over the 10Gb interface.



  • 78.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 22, 2012 05:47 PM

    Mark,

    I was thinking about adding a second riser and moving the second card over so I could then have both in slot 1 on the riser.  Currently I have both NC523SFPs on the same riser in slot 2 and slot 3. But I'm not sure this would make a difference.  How do you have yours set up?



  • 79.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 22, 2012 06:06 PM

    caledunn, we do use some DL385 G7s; generally they are purchased with the 2nd riser for expansion.

    I would put the 2nd 10Gb NIC on the 2nd riser for redundancy and to distribute the load across the buses.

    In our DL585s the expansion board which adds 3 more PCIe interfaces is in fact called a riser. I do the same with this, where 1 of the 10Gb cards is installed in it.

    To be honest it doesn't seem to make any difference. The NC523s still fail.

    Your plan with the NC522 is sensible, BUT they run HOT HOT HOT..

    I'd suggest you set your system bios to maximum cooling..  and be sure the servers are getting plenty of nice cold air on them.



  • 80.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 23, 2012 01:44 PM

    If I remember correctly

    The recommended driver for ESXi 5 is 5.0.727

    The recommended driver for ESX /i 4 is 4.0.727

    But there are later version drivers available

    For ESXi 5 qlcnic-esx50-5.0.741-635278.zip (Version 5.0.741)

    I'm not sure if there is a later version than 4.0.727 for ESX 4.



  • 81.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 23, 2012 01:49 PM

    Just like to add we are running the latest versions.

    NC523SFP

    ~ # ethtool -i vmnic12
    driver: qlcnic
    version: 5.0.741
    firmware-version: 4.8.22
    bus-info: 0000:81:00.0
    ~ #

    And it's still terribly unstable.

    Although if I keep my vNIC MTU at 1500, I only suffer the link loss issues. If I push my vNIC to 9000 (the vSwitch is running an MTU of 9000), the NC523 ports with these MTU values stop passing packets after some time.



  • 82.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted May 06, 2012 07:12 PM

    Just a very quick update on this debacle of a situation.

    After a few weeks battling with HP support, who seem more confused by this issue than I am, and VMware, who were really no help at all, I managed to get hold of our Enterprise account manager.

    He has been very helpful and we have made significant progress by involving the local Australian Technical team and making some rather radical changes.

    HP Agreed to send over 2x NC552SFP (10Gb Emulex) which replaced the 2x NC523SFP's and 2x NC365T (intel 1Gb) which replaced the 2x NC375T.

    This combination has now been running for 5 days.

    The new NIC's have not reported any failures, link loss, anything at all..

    To put this into perspective, these servers have been running for over 12 months, and there has never been a period of 5 days where they have not experienced link loss or some other NIC port failure.

    The onboard NC375i ports have been behaving better but they are still the one thing I'm not confident of. I've seen a couple of vMotions fail. I've not done any monitoring or diagnosis yet, but it seems when these ports hit 90% utilisation they pause and no longer pass packets (sounds like some other QLogic NICs).

    I did read a forum entry where it was stated HP and QLogic recognise there is an issue with the onboard NC375i chipset and have a replacement riser available which resolves this issue..

    At this point I'm going to be kind to HP and begin discussing replacing the NC523SFPs and NC375Ts in our other 585 G7s. I'm not sure if this will be a swap-out or purchase situation; either way I'm just happy to see some improvement in stability.



  • 83.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted May 07, 2012 03:20 PM

    I've also been testing with the X520-DA2, NC550SFP and NC552SFP cards, and without making a single configuration change they have yet to go down.  I have several vSphere clusters I've been testing with, and in each cluster I have at least one ESXi host with the NC523SFP cards and the others with the Intel or Emulex cards.  The hosts with the NC523SFP cards go down every few days but the others stay up.  The only thing that changed was replacing the cards.  We are moving forward with purchasing more NC552SFP cards and that will be our solution to the problem.  We will just eat the cost on the 20+ cards we have.  We are hoping we can reuse them with our Windows servers.  I would have responded earlier but I wanted to give it a couple of weeks to make sure the Emulex and Intel cards were stable.  I'll let you know if I run into any problems with the new cards.  I'll add that VMware support was actually pretty helpful for us.  They didn't offer a solution but they helped troubleshoot and narrow down where the issue is, and kept the ticket open.



  • 84.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Dec 04, 2013 01:33 PM

    I'd like to update this post with where we are at and the stability of the NC375T, NC523SFP, NC375i.

    I would like to start by saying our end game solution with these QLogic network cards has simply been to replace them. I do however have some lab servers which still use the NC375T NIC.

    It would be wonderful to report there was an achievable solution which stabilised these network cards, BUT THERE IS NOT.

    QLogic have continued to release firmware and drivers in an attempt to resolve the various performance and stability issues; nonetheless, it does not appear they have achieved an acceptable result.

    My advice is to just avoid these network cards.

    Emulex, Brocade and Intel chipset cards are available from HP. These may be marginally more expensive but they work. If I was asked for a recommendation:

    NC552SFP cards are stable and fast.

    NC365T cards are again stable and fast.

    Intel cards are always expensive but they simply work..

    (my opinions are my own, my experience is what I share)



  • 85.  RE: Major issues with HP DL580 G5 and Intel X520-DA2

    Posted Apr 14, 2014 02:26 PM

    Are these cards related to the QLogic QLE3242 dual port 10GbE adapter?

    We were having trouble maintaining new NFS storage connectivity across these adapters in ESXi5 U3, 1489271.  So far the fix (I hope it's a fix) was to update the firmware and driver to these versions:

    driver: qlcnic

    version: 5.1.178

    firmware-version: 4.16.34

    Originally we had driver 5.0.727 and firmware 4.9.x.

    I found another thread on here with the same NIC and poor iSCSI stability when using jumbo frames.  Going back to 1500 MTU would stabilize it for them, but then they upgraded to firmware 4.12.x and jumbo was stable for them.  That post was quite some time ago, so now, as you can see, 4.16.34 is out.  I also installed the QLogic CIM provider on each host and the vCenter server plugin, so I can also now view and manage these cards.

    I made this change only a week ago but so far so good.  Here's knocking on wood.....