VMware Cloud Community
Nouha
Contributor

ESXi 5 host crashes since I activated HA on it

Hi,

I have an ESXi 5 host that crashes at least twice a day.

The server is an IBM 3850 with 16 GB of RAM.

This is what I find in vmkwarning.log:

2013-04-01T14:39:35.600Z cpu13:2714)WARNING: iscsi_vmk: iscsivmk_StartConnection: Conn [CID: 0 L: 172.16.0.121:58313 R: 172.16.0.40:3260]
TSC: 4855271224 cpu0:0)WARNING: SVGAConsole: 266: Extended TTY not supported. Ignoring on tty 0
TSC: 4855755600 cpu0:0)WARNING: SVGAConsole: 266: Extended TTY not supported. Ignoring on tty 2
TSC: 4856060720 cpu0:0)WARNING: SVGAConsole: 266: Extended TTY not supported. Ignoring on tty 3
TSC: 6312114056 cpu0:0)WARNING: HPET: 592: HPET counter runs at (12921597882%) of rated speed
0:00:00:04.106 cpu0:2048)WARNING: IOAPIC: 627: pin 16 is not masked
0:00:00:04.106 cpu0:2048)WARNING: IOAPIC: 627: pin 17 is not masked
0:00:00:04.107 cpu0:2048)WARNING: IOAPIC: 627: pin 16 is not masked
0:00:00:04.107 cpu0:2048)WARNING: IOAPIC: 627: pin 19 is not masked
0:00:00:04.107 cpu0:2048)WARNING: IOAPIC: 627: pin 20 is not masked
0:00:00:04.107 cpu0:2048)WARNING: IOAPIC: 605: ioapic 0, version 32 does not match ioapic0 version 17
0:00:00:04.110 cpu0:2048)WARNING: CacheSched: 801: Already disabled : Cache aware scheduling already disabled
0:00:00:04.126 cpu0:2048)WARNING: SVGAConsole: 266: Extended TTY not supported. Ignoring on tty 4
0:00:00:04.126 cpu0:2048)WARNING: SVGAConsole: 266: Extended TTY not supported. Ignoring on tty 5
2013-04-02T09:43:12.472Z cpu11:2658)WARNING: VMK_PCI: 1128: device 000:000:31.1 has no legacy interrupt(s)
2013-04-02T09:43:12.472Z cpu11:2658)WARNING: LinPCI: LinuxPCILegacyIntrVectorSet:80:Could not allocate legacy PCI interrupt for device 0000:00:1f.1
2013-04-02T09:43:12.699Z cpu2:2693)WARNING: LinuxSignal: 761: ignored unexpected signal flags 0x2 (sig 17)
2013-04-02T09:44:01.886Z cpu13:2658)WARNING: ScsiScan: 1485: Failed to add path vmhba1:C0:T0:L0 : Not found
2013-04-02T09:44:02.902Z cpu13:2658)WARNING: ScsiScan: 1485: Failed to add path vmhba1:C0:T1:L0 : Not found
2013-04-02T09:44:06.834Z cpu2:2705)WARNING: LinuxSignal: 761: ignored unexpected signal flags 0x2 (sig 17)
2013-04-02T09:44:11.957Z cpu13:2694)WARNING: UserObj: 3232: Unimplemented operation on 0x41000ffe5b60/RPC
2013-04-02T09:44:11.957Z cpu13:2694)WARNING: UserObj: 675: Failed to crossdup fd 9, cnxId: 0x80000000 type RPC: Not implemented
2013-04-02T09:44:17.690Z cpu15:2658)WARNING: APEI: 175: Could not initialize HEST
2013-04-02T09:44:19.520Z cpu13:2714)WARNING: iscsi_vmk: iscsivmk_StartConnection: vmhba33:CH:0 T:0 CN:0: iSCSI connection is being marked "ONLINE"
2013-04-02T09:44:19.520Z cpu13:2714)WARNING: iscsi_vmk: iscsivmk_StartConnection: Sess [ISID: 00023d000001 TARGET: iqn.1992-04.com.emc:storage.NAS-STORAGE.VNAS-DATASTORE TPGT: 1 TSIH: 0]
2013-04-02T09:44:19.520Z cpu13:2714)WARNING: iscsi_vmk: iscsivmk_StartConnection: Conn [CID: 0 L: 172.16.0.121:51719 R: 172.16.0.40:3260]
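
To see which subsystems the warnings above come from, the lines can be tallied by component. The following is a minimal Python sketch, not a VMware tool: the regular expression is inferred only from the lines pasted in this post, and the function name `count_warnings_by_component` is hypothetical.

```python
import re
from collections import Counter

# Assumption: every warning line looks like "<timestamp> cpuN:M)WARNING: <component>: <message>",
# which matches the excerpt in this post but is not a documented log format.
WARNING_RE = re.compile(r"cpu\d+:\d+\)WARNING: (?P<component>\w+): (?P<message>.*)")


def count_warnings_by_component(lines):
    """Return a Counter mapping each component name to its number of warning lines."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:  # lines that do not fit the assumed pattern are skipped
            counts[match.group("component")] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the excerpt above, used as sample input.
    sample = [
        "0:00:00:04.106 cpu0:2048)WARNING: IOAPIC: 627: pin 16 is not masked",
        "0:00:00:04.106 cpu0:2048)WARNING: IOAPIC: 627: pin 17 is not masked",
        "2013-04-02T09:44:17.690Z cpu15:2658)WARNING: APEI: 175: Could not initialize HEST",
    ]
    for component, count in count_warnings_by_component(sample).most_common():
        print(component, count)
```

A tally like this only shows which components log most often; it does not by itself identify the cause of a crash.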


Can anybody help me?
PS: I've just added new memory modules, but I don't think they are the source of the problem.

Thanks in advance.
7 Replies
a_p_
Leadership

Welcome to the Community,

did you add the additional memory at the same time you activated HA? Just asking to see whether one of these two changes can be ruled out as the cause of this issue.

To see whether it's an issue with the memory, you may want to run a hardware check (memtest).

André

NeilChapman1
Contributor

How long had this host been stable before you activated HA?

Does the host become stable again (for more than 1/2 a day) if HA is disabled?

Neil

NeilChapman1
Contributor

You may want to post /var/log/fdm.log

Neil

Nouha
Contributor

Hi,

I've deactivated HA and the host crashed again.

You will find fdm.log in the attached file.

Regards,

depping
Leadership

The 3850 is on the HCL (at least several variants of it; check here: http://www.vmware.com/resources/compatibility/search.php). Just log a support call to find out what is causing this.

NeilChapman1
Contributor

Since it's crashing without HA, do you want to investigate a different cause, or do you still think it's HA-related?

N

mdavidh
Contributor

I'm experiencing the exact same problem: ESXi 5.0 Update 2, same server (IBM 3850), same error messages in the logs. Did you find out what was causing this issue?
