VMware Cloud Community
Mc20piece
Contributor

Update manager still shows 6.5 Update 1 needed even though host is running 6.5 U1 installed from HPE Custom ISO

Update Manager still shows "6.5 Complete Update 1" as needed even though the host is already running 6.5 U1, installed from the HPE Custom ISO.

There are 3 VMware updates listed including the full update ALL of which are in the U1 release notes... What gives?

ixgben patch from 3/8/17 and ixgben update from 7/26/17

This is the case for two separate customers with completely different gear. If these updates are installed, the host purple-screens (PSOD) and rolls back.

The servers are HPE DL380 Gen8

31 Replies
danb12376
Contributor

This is the response back from HPE.

I have not had a chance to try this. Hopefully I will get time this afternoon.

PCIe NIC FW/Drivers Details:

NIC Model: HP Ethernet 10Gb 2-port 560SFP+ Adapter (Slot 3)

Driver: ixgbe Version: 4.5.2-iov

Firmware Version: 0x80000835, 1.1618.0

Per the PSOD footprint, this issue could be related to the intelcim-provider installed on the server in question.

Intel_bootbank_intelcim-provider_0.5-3.3:

Name: intelcim-provider

Version: 0.5-3.3

Type: bootbank

Vendor: Intel

Action Plan:

>> To isolate, remove intelcim-provider from server in question. Ideally removing intelcim-provider should not affect functionality of 560SFP+ card installed on server.

NOTE: A host reboot might be required for changes to take effect.

Below instructions could be followed to remove intelcim-provider.

>> Make sure the ESXi host is in maintenance or tech support mode before implementing this change.

>> Connect to the ESXi host using the putty (SSH) application. Please refer to the links below to enable SSH for the putty session.

http://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.vcli.getstart.doc%2FGUID-C3A44A30-E...

OR

http://masteringvmware.com/how-to-enable-ssh-on-esxi-6-using-vsphere-web-client/

>> Once connected to ESXi host using putty (SSH), run below command to stop CIM provider service.

/etc/init.d/sfcbd-watchdog stop

/etc/init.d/sfcbd status

>> Further remove intelcim-provider from ESXi host in question.

esxcli software vib remove -n=intelcim-provider

>> Further run below command to start sfcbd again.

/etc/init.d/sfcbd-watchdog start

>> Then reboot the host for the changes to take effect.

>> Re-connect to ESXi host using putty application & run below command to confirm sfcbd running.

chkconfig sfcbd-watchdog

chkconfig sfcbd

/etc/init.d/sfcbd status

/etc/init.d/sfcbd-watchdog status

>> Run below command to confirm intelcim-provider no longer present.

esxcli software vib list | grep intelcim-provider

(If this shows no output, intelcim-provider is no longer present.)

>> Then take the ESXi host out of maintenance or tech support mode and monitor the server for 48-72 hours to confirm whether the issue recurs.
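
The HPE steps above can be collected into one small script. This is only a sketch, assuming the host is already in maintenance mode and that you run it from an SSH session on the ESXi host. The DRY_RUN switch and the run() helper are hypothetical additions (not part of ESXi or of HPE's instructions); by default the script only prints the commands instead of running them. It uses --vibname, the long form of esxcli's -n option.

```shell
#!/bin/sh
# Sketch of the HPE removal procedure for intelcim-provider.
# Assumption: the host is already in maintenance mode.
# DRY_RUN and run() are hypothetical helpers added for illustration.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remove_intelcim_provider() {
    # Stop the CIM service watchdog and show the sfcbd status.
    run /etc/init.d/sfcbd-watchdog stop
    run /etc/init.d/sfcbd status
    # Remove the provider VIB (--vibname is the long form of -n).
    run esxcli software vib remove --vibname=intelcim-provider
    # Start the CIM service again.
    run /etc/init.d/sfcbd-watchdog start
    echo "Reboot the host, then check: esxcli software vib list | grep intelcim-provider"
}
```

Calling remove_intelcim_provider as-is prints the four commands for review; setting DRY_RUN=0 before the call would actually execute them. The reboot and the follow-up checks are left as manual steps on purpose.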

danb12376
Contributor

I think this problem is fixed for me now.

I ran the commands HPE wanted me to run and removed the intelcim-provider.

After that was done, and after a reboot, I was able to run VUM with no PSOD!

The only thing is that after removing the intelcim-provider, VUM thinks the HPE 6.5 U1 ISO needs to be run again, so I just detached that and VUM is happy.

I also fixed the "system logs stored on non-persistent storage" warning by doing this:

1. /etc/init.d/hostd stop

2. localcli software vib remove -n elxiscsi -n elx-esx-libelxima.so

3. reboot

After doing this everything seems to be working. The server boots much faster after fixing the scratch log non-persistent storage issue.

pirx666
Contributor

So this fixes the problem, but what is the intelcim-provider good for, and what do we lose if we uninstall it?

handsy
Enthusiast

I'm also suffering this issue after updating my DL380 Gen9s with the HPE 6.5U1 iso.

Do you think it safe to assume that HPE will release a second, repaired version of their shoddy 6.5U1 iso?

danb12376
Contributor

This is what I received back from VMware...

When running the HP customized 6.5U1 ESXi image, the system may hit a PSOD like XXXX. To work around it:

1) if the system has upgraded to 6.5U1, remove the intelcim-provider VIB (providing the esxcli command);

2) if the system has a fresh install of 6.5U1, install ESXi 6.5 GA first, then upgrade to ESXi 6.5U1 and follow the steps of (1);

3) if the system is not yet installed, install the 6.5U1 VMware stock image, then install the HP OEM bundle (link provided) but exclude/remove intelcim-provider VIB after that.
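
To tell which hosts still carry the VIB that all three cases end up removing, a small filter over the output of esxcli software vib list can help. This is a minimal sketch: has_vib is a hypothetical helper name, and it assumes the VIB name is the first column of each row of that listing. Unlike a plain grep, it matches the name exactly.

```shell
#!/bin/sh
# has_vib NAME: reads `esxcli software vib list` output on stdin and
# succeeds only if some row's first column equals NAME exactly.
# has_vib is a hypothetical helper, not an ESXi command.
has_vib() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit (found ? 0 : 1) }'
}

# Example use on an ESXi host:
#   esxcli software vib list | has_vib intelcim-provider && echo "still installed"
```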

handsy
Enthusiast

I tried this method (recommended by HPE and VMware) and I still see VUM asking me to patch to 6.5U1.

danb12376
Contributor

After you remove the intelcim-provider and reboot you can then run the updates that VUM shows.

You will see three: two ixgben updates and one 6.5 Complete Update 1.

It then installed these updates for me and didn't PSOD. Now VUM shows everything up to date.

handsy
Enthusiast

OK thanks I will try this.

Can you reinstall intelcim-provider after this 'fix' is applied?

danb12376
Contributor

No problem.

I have not tried to reinstall intelcim-provider.

My guess is if you reinstall it the server will probably PSOD but can't say for sure. I'm going to leave it uninstalled and wait till a new HPE iso is released.

Mc20piece
Contributor

Nice to see the issue has been narrowed down. I'll be waiting until a proper HPE Custom ISO is released before moving anyone else to 6.5.

nparas5
Enthusiast

You are seeing PSOD error code PF Exception 14. A page fault may be caused by either a hardware or a software issue. As the cause may vary significantly for these types of exceptions, a core-dump review may be performed by VMware. It seems the dump could not be written to the partition properly; refer to the KB article below to set up a partition for dump files:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=20042...

First work on resolving the PSOD issue, then on the host update. The sequence is: Scan Entity > Remediate.

Scan the entity again for the upgrade and patches, then apply the host upgrade first and the patches after it.

handsy
Enthusiast

Does anyone have any updates on this?

I've raised a ticket with VMware support and they've not been particularly helpful. They seem to be stalling in anticipation of the next update (6.5U2?).

Cheers
