Dear all,
I need help.
Problem:
ESXi 5.5.0 Centos VM freezes(becomes unresponsive) or ESXi host crashes (with core dump) error
Task:
I need to build a backup system using PowerVault 124T (SCSI interface) and Linux application running under Centos on EXSi 5.5 platform.
What I have
ESXi Physical host - PowerEdge R720 rack server.
EXSi 5.5.0 Build 1331820 installed on PowerEdge R720 Server
CentOS release 6.8 (Final), Linux 2.6.32-642.4.2.el6.x86_64 installed and configured on EXSi 5.5.0
PowerVault 124T (SCSI interface) connected to PowerEdge R720 SCSI interface
What has been done.
1. Connected PowerVault 124T via SCSI interface to EXSi 5.5.0 Build 1331820
2. Installed and configured CentOS release 6.8 (Final), Linux 2.6.32-642.4.2.el6.x86_64
3. Configured SCSI I/O path through through ESXi 5.5.0 to Centos vm
- Connected PowerVault 124T and PowerEdge R720 using SCSI cable
- Under V Sphere client located a new device (see below)
- under vm created 2 SCSI devices and assigned tape drive and tape changer to them.
Tape Drive
Media Changer
Both devices use the same ID but different channels.
Maybe it could be the reason?
Note: ESXi/ESX supports vendor-supported parallel SCSI tape drives and tape libraries only. If you are using a tape drive library (versus using a stand-alone tape drive), the library must be multi-target, and not multi-LUN. Ensure that the tape drive's virtual target ID in the virtual machine's configuration is set to be the same as the physical target ID. The tape hardware must be SCSI ID addressed and cannot be LUN addressed. https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=10164...https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=10164...
In my case Dell Powervault 124T tape library has the same SCSI ID. I use the same SCSI id and different channels (see pics above)
Could it be the reason or not?
- I located both devices under Centos VM
[1:0:0:0] cd/dvd NECVMWar VMware IDE CDR10 1.00 /dev/sr0
[2:0:0:0] disk VMware Virtual disk 1.0 /dev/sda
[2:0:1:0] disk VMware Virtual disk 1.0 /dev/sdb
[2:0:2:0] disk VMware Virtual disk 1.0 /dev/sdc
[3:0:6:0] tape IBM ULTRIUM-TD3 6B20 /dev/st0
[4:0:6:0] mediumx DELL PV-124T 0053 /dev/sch0
[1:0:0:0] cd/dvd NECVMWar VMware IDE CDR10 1.00 /dev/sr0 /dev/sg0
[2:0:0:0] disk VMware Virtual disk 1.0 /dev/sda /dev/sg1
[2:0:1:0] disk VMware Virtual disk 1.0 /dev/sdb /dev/sg2
[2:0:2:0] disk VMware Virtual disk 1.0 /dev/sdc /dev/sg3
[3:0:6:0] tape IBM ULTRIUM-TD3 6B20 /dev/st0 /dev/sg4
[4:0:6:0] mediumx DELL PV-124T 0053 /dev/sch0 /dev/sg5
!!!! What I mentioned here is SCSI adapter channel number is different for both Tape drive and Media Changer under Centos VM and ESXI itself. I'm not sure if it is critical!!!
I use tar command for backup operation.
Issues:
- ESXi 5.5.0 host froze with (core dump error) - Hard reboot helped
-Media changer becomes unresponsive from time to time.
- vm froze once during backup process as it was able to locate SCSI media changer - Hard reboot helped
.
QUESTION:
What could be the reason of unstable operation of SCSI device under ESXi 5.5.0 ?
Please, see logs below:
/var/log/hostd.log
2016-10-10T22:15:47.433Z [30440B70 verbose 'Default' opID=14179AAA-0000095C user=root] AdapterServer: target='vmodl.query.PropertyCollector:ha-property-collector', method='waitForUpdates'
2016-10-10T22:16:00.045Z [312C1B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not initialized. Please see the VMk
2016-10-10T22:16:00.046Z [312C1B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection. Turn on 'trivia' log for details
2016-10-10T22:16:19.280Z [309C8B70 verbose 'Cimsvc'] Ticket issued for CIMOM version 1.0, user root
2016-10-10T22:16:20.060Z [312C1B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not initialized. Please see the VMk
2016-10-10T22:16:20.060Z [312C1B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection. Turn on 'trivia' log for details
2016-10-10T22:16:40.060Z [304C3B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not initialized. Please see the VMk
2016-10-10T22:16:40.060Z [304C3B70 verbose 'Statssvc.vim.PerformanceManager'] HostCtl Exception in stats collection. Turn on 'trivia' log for details
Section for VMware ESX, pid=33896, version=5.5.0, build=1331820, option=Release
------ In-memory logs start --------
mem> 2016-10-11T02:00:41.975Z [FFC885B0 verbose 'Default'] No update forwarding configured
mem> 2016-10-11T02:00:41.976Z [FFC885B0 info 'Default'] Supported VMs 256
mem> 2016-10-11T02:00:41.976Z [FFC885B0 info 'Handle checker'] Setting system limit of 2560
mem> 2016-10-11T02:00:41.976Z [FFC885B0 info 'Handle checker'] Set system limit to 2560
mem> 2016-10-11T02:00:41.976Z [FFC885B0 info 'Default'] Setting malloc mmap threshold to 32 k
mem> 2016-10-11T02:00:41.976Z [FFC885B0 info 'Default'] getrlimit(RLIMIT_NPROC): curr=64 max=128, return code = Success
mem> 2016-10-11T02:00:41.976Z [FFC885B0 info 'Default'] setrlimit(RLIMIT_NPROC): curr=128 max=128, return code = Success
------ In-memory logs end --------
2016-10-11T02:00:41.976Z [FFC885B0 info 'Default'] Logging uses fast path: false
2016-10-11T02:00:41.976Z [FFC885B0 info 'Default'] Handling bora/lib logs with VmaCore facilities
/var/log/vobd.log
2016-10-08T01:35:02.110Z: [scsiCorrelator] 34842003385001us: [vob.scsi.scsipath.pathstate.dead] scsiPath vmhba1:C0:T6:L0 changed state from on
2016-10-08T01:35:02.110Z: [APDCorrelator] 34842003385058us: [vob.storage.apd.start] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Paths Down sta
2016-10-08T01:35:02.110Z: [APDCorrelator] 34842691695354us: [esx.problem.storage.apd.start] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Paths
2016-10-08T01:35:02.110Z: [scsiCorrelator] 34842691695819us: [esx.problem.storage.connectivity.lost] Lost connectivity to storage device t10.IBM_____ULTRIUM2DTD3_____1210354899. Path vmhba1:C0:T6:L
2016-10-08T01:35:02.367Z: [scsiCorrelator] 34842003642412us: [vob.scsi.scsipath.pathstate.dead] scsiPath vmhba1:C0:T6:L1 changed state from on
2016-10-08T01:35:02.367Z: [APDCorrelator] 34842003642467us: [vob.storage.apd.start] Device or filesystem with identifier [naa.500e09efff0df959] has entered the All Paths Down state.
2016-10-08T01:35:02.367Z: [APDCorrelator] 34842691952647us: [esx.problem.storage.apd.start] Device or filesystem with identifier [naa.500e09efff0df959] has entered the All Paths Down state.
2016-10-08T01:35:02.368Z: [scsiCorrelator] 34842691953017us: [esx.problem.storage.connectivity.lost] Lost connectivity to storage device naa.500e09efff0df959. Path vmhba1:C0:T6:L1 is down. Affected
2016-10-08T01:37:22.114Z: [APDCorrelator] 34842143387115us: [vob.storage.apd.timeout] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Paths Down T
2016-10-08T01:37:22.114Z: [APDCorrelator] 34842831699773us: [esx.problem.storage.apd.timeout] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Path
2016-10-08T01:37:22.372Z: [APDCorrelator] 34842143644489us: [vob.storage.apd.timeout] Device or filesystem with identifier [naa.500e09efff0df959] has entered the All Paths Down Timeout state after
2016-10-08T01:37:22.372Z: [APDCorrelator] 34842831957149us: [esx.problem.storage.apd.timeout] Device or filesystem with identifier [naa.500e09efff0df959] has entered the All Paths Down Timeout stat
2016-10-08T01:40:01.865Z: [scsiCorrelator] 34842303134518us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba1:C0:T6:L0
2016-10-08T01:40:01.865Z: [scsiCorrelator] 34842303135351us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba1:C0:T6:L1
2016-10-08T01:40:01.879Z: [scsiCorrelator] 34842303148915us: [vob.scsi.scsipath.pathstate.on] scsiPath vmhba1:C0:T6:L0 changed state from dead
2016-10-08T01:40:01.879Z: [APDCorrelator] 34842303148950us: [vob.storage.apd.exit] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has exited the All Paths Down state
2016-10-08T01:40:01.879Z: [APDCorrelator] 34842991464975us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has exited the All Paths Down
2016-10-08T01:40:01.880Z: [scsiCorrelator] 34842991465527us: [esx.clear.storage.connectivity.restored] Connectivity to storage device t10.IBM_____ULTRIUM2DTD3_____1210354899 (Datastores: Unknown) r
2016-10-08T01:40:01.881Z: [scsiCorrelator] 34842303151048us: [vob.scsi.scsipath.pathstate.on] scsiPath vmhba1:C0:T6:L1 changed state from dead
2016-10-08T01:40:01.881Z: [APDCorrelator] 34842303151083us: [vob.storage.apd.exit] Device or filesystem with identifier [naa.500e09efff0df959] has exited the All Paths Down state.
2016-10-08T01:40:01.881Z: [APDCorrelator] 34842991466938us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [naa.500e09efff0df959] has exited the All Paths Down state.
2016-10-08T01:40:01.882Z: [scsiCorrelator] 34842991467150us: [esx.clear.storage.connectivity.restored] Connectivity to storage device naa.500e09efff0df959 (Datastores: Unknown) restored. Path vmhba
2016-10-08T11:07:53.380Z: [scsiCorrelator] 34876373972883us: [vob.scsi.device.io.latency.high] Device naa.6c81f660e97c93001af69254192c0dcb performance has deteriorated. I/O latency increased from a
2016-10-08T11:07:53.380Z: [scsiCorrelator] 34877062965764us: [e
2016-10-08T01:40:01.881Z: [APDCorrelator] 34842303151083us: [vob.storage.apd.exit] Device or filesystem with identifier [naa.500e09efff0df959] has exited the All Paths Down state.
2016-10-08T01:40:01.881Z: [APDCorrelator] 34842991466938us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [naa.500e09efff0df959] has exited the All Paths Down state.
2016-10-08T01:40:01.882Z: [scsiCorrelator] 34842991467150us: [esx.clear.storage.connectivity.restored] Connectivity to storage device naa.500e09efff0df959 (Datastores: Unknown) restored. Path vmhba
2016-10-08T11:07:53.380Z: [scsiCorrelator] 34876373972883us: [vob.scsi.device.io.latency.high] Device naa.6c81f660e97c93001af69254192c0dcb performance has deteriorated. I/O latency increased from a
2016-10-08T11:07:53.380Z: [scsiCorrelator] 34877062965764us: [esx.problem.scsi.device.io.latency.high] Device naa.6c81f660e97c93001af69254192c0dcb performance has deteriorated. I/O latency increase
2016-10-08T11:08:03.133Z: [scsiCorrelator] 34876383725580us: [vob.scsi.device.io.latency.improved] Device naa.6c81f660e97c93001af69254192c0dcb performance has improved. I/O latency reduced from 972
2016-10-08T11:08:03.133Z: [scsiCorrelator] 34877072718506us: [esx.clear.scsi.device.io.latency.improved] Device naa.6c81f660e97c93001af69254192c0dcb performance has improved. I/O latency reduced fr
2016-10-08T11:08:18.285Z: [scsiCorrelator] 34876398877310us: [vob.scsi.device.io.latency.improved] Device naa.6c81f660e97c93001af69254192c0dcb performance has improved. I/O latency reduced from 192
2016-10-08T11:08:18.285Z: [scsiCorrelator] 34877087870730us: [esx.clear.scsi.device.io.latency.improved] Device naa.6c81f660e97c93001af69254192c0dcb performance has improved. I/O latency reduced fr
2016-10-08T11:09:32.987Z: [scsiCorrelator] 34876473577292us: [vob.scsi.device.io.latency.high] Device naa.6c81f660e97c93001af69254192c0dcb performance has deteriorated. I/O latency increased from a
2016-10-08T11:09:32.987Z: [scsiCorrelator] 34877162572240us: [esx.problem.scsi.device.io.latency.high] Device naa.6c81f660e97c93001af69254192c0dcb performance has deteriorated. I/O latency increase
2016-10-08T11:09:35.394Z: [scsiCorrela
2016-10-10T10:15:06.174Z: [APDCorrelator] 35046003403873us: [vob.storage.apd.start] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Paths Down sta
2016-10-10T10:15:06.174Z: [APDCorrelator] 35046695759512us: [esx.problem.storage.apd.start] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Paths
2016-10-10T10:15:06.175Z: [scsiCorrelator] 35046695759982us: [esx.problem.storage.connectivity.lost] Lost connectivity to storage device t10.IBM_____ULTRIUM2DTD3_____1210354899. Path vmhba1:C0:T6:L
2016-10-10T10:15:06.431Z: [scsiCorrelator] 35046003660761us: [vob.scsi.scsipath.pathstate.dead] scsiPath vmhba1:C0:T6:L1 changed state from on
2016-10-10T10:15:06.431Z: [APDCorrelator] 35046003660818us: [vob.storage.apd.start] Device or filesystem with identifier [naa.500e09efff0df959] has entered the All Paths Down state.
2016-10-10T10:15:06.431Z: [APDCorrelator] 35046696016388us: [esx.problem.storage.apd.start] Device or filesystem with identifier [naa.500e09efff0df959] has entered the All Paths Down state.
2016-10-10T10:15:06.431Z: [scsiCorrelator] 35046696016778us: [esx.problem.storage.connectivity.lost] Lost connectivity to storage device naa.500e09efff0df959. Path vmhba1:C0:T6:L1 is down. Affected
2016-10-10T10:15:10.924Z: [scsiCorrelator] 35046008154422us: [vob.scsi.scsipath.pathstate.on] scsiPath vmhba1:C0:T6:L1 changed state from dead
2016-10-10T10:15:10.924Z: [APDCorrelator] 35046008154453us: [vob.storage.apd.exit] Device or filesystem with identifier [naa.500e09efff0df959] has exited the All Paths Down state.
2016-10-10T10:15:10.924Z: [APDCorrelator] 35046700509953us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [naa.500e09efff0df959] has exited the All Paths Down state.
2016-10-10T10:15:10.925Z: [scsiCorrelator] 35046700510152us: [esx.clear.storage.connectivity.restored] Connectivity to storage device naa.500e09efff0df959 (Datastores: Unknown) restored. Path vmhba
2016-10-10T10:17:26.177Z: [APDCorrelator] 35046143404009us: [vob.storage.apd.timeout] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Paths Down T
2016-10-10T10:17:26.177Z: [APDCorrelator] 35046835762367us: [esx.problem.storage.apd.timeout] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Path
2016-10-10T10:20:05.929Z: [scsiCorrelator] 35046303153759us: [vob.scsi.scsipath.pathstate.on] scsiPath vmhba1:C0:T6:L0 changed state from dead
2016-10-10T10:20:05.930Z: [APDCorrelator] 35046303153784us: [vob.storage.apd.exit] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has exited the All Paths Down state
2016-10-10T10:20:05.930Z: [APDCorrelator] 35046995515102us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has exited the All Paths Down
2016-10-10T10:20:05.930Z: [scsiCorrelator] 35046995515305us: [esx.clear.storage.connectivity.restored] Connectivity to storage device t10.IBM_____ULTRIUM2DTD3_____1210354899 (Datastores: Unknown) r
2016-10-10T13:38:44.421Z: [APDCorrelator] 35058221406489us: [vob.storage.apd.start] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Paths Down sta
2016-10-10T13:38:44.421Z: [scsiCorrelator] 35058221406452us: [vob.scsi.scsipath.pathstate.dead] scsiPath vmhba1:C0:T6:L0 changed state from on
2016-10-10T13:38:44.421Z: [APDCorrelator] 35058914006161us: [esx.problem.storage.apd.start] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has entered the All Paths
2016-10-10T13:38:44.421Z: [scsiCorrelator] 35058914006467us: [esx.problem.storage.connectivity.lost] Lost connectivity to storage device t10.IBM_____ULTRIUM2DTD3_____1210354899. Path vmhba1:C0:T6:L
2016-10-10T13:38:45.677Z: [scsiCorrelator] 35058222663340us: [vob.scsi.scsipath.pathstate.dead] scsiPath vmhba1:C0:T6:L1 changed state from on
2016-10-10T13:38:45.677Z: [APDCorrelator] 35058222663375us: [vob.storage.apd.start] Device or filesystem with identifier [naa.500e09efff0df959] has entered the All Paths Down state.
2016-10-10T13:38:45.678Z: [APDCorrelator] 35058915263037us: [esx.problem.storage.apd.start] Device or filesystem with identifier [naa.500e09efff0df959] has entered the All Paths Down state.
2016-10-10T13:38:45.678Z: [scsiCorrelator] 35058915263225us: [esx.problem.storage.connectivity.lost] Lost connectivity to storage device naa.500e09efff0df959. Path vmhba1:C0:T6:L1 is down. Affected
2016-10-10T13:38:49.438Z: [scsiCorrelator] 35058226424298us: [vob.scsi.scsipath.pathstate.on] scsiPath vmhba1:C0:T6:L1 changed state from dead
2016-10-10T13:38:49.438Z: [APDCorrelator] 35058226424328us: [vob.storage.apd.exit] Device or filesystem with identifier [naa.500e09efff0df959] has exited the All Paths Down state.
2016-10-10T13:38:49.439Z: [APDCorrelator] 35058919024038us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [naa.500e09efff0df959] has exited the All Paths Down state.
2016-10-10T13:38:49.439Z: [scsiCorrelator] 35058919024218us: [esx.clear.storage.connectivity.restored] Connectivity to storage device naa.500e09efff0df959 (Datastores: Unknown) restored. Path vmhba
2016-10-10T13:38:50.172Z: [APDCorrelator] 35058227157695us: [vob.storage.apd.exit] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has exited the All Paths Down state
2016-10-10T13:38:50.172Z: [APDCorrelator] 35058919757461us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [t10.IBM_____ULTRIUM2DTD3_____1210354899] has exited the All Paths Down
2016-10-10T13:38:50.172Z: [scsiCorrelator] 35058227157664us: [vob.scsi.scsipath.pathstate.on] scsiPath vmhba1:C0:T6:L0 changed state from dead
2016-10-10T13:38:50.172Z: [scsiCorrelator] 35058919757828us: [esx.clear.storage.connectivity.restored] Connectivity to storage device t10.IBM_____ULTRIUM2DTD3_____1210354899 (Datastores: Unknown) r
2016-10-11T02:00:03.854Z: [netCorrelator] 21573056us: [vob.net.vmnic.linkstate.up] vmnic vmnic2 linkstate up
/var/log/vmkernel.log
2016-10-10T22:15:38.351Z cpu13:92288823)scsi0: Command abort returning 0x2003
2016-10-10T22:15:38.351Z cpu13:92288823)WARNING: LinScsi: SCSILinuxAbortCommands:1837: Failed, Driver Adaptec AIC7902 Ultra320 SCSI adapter, for vmhba1
2016-10-10T22:15:38.351Z cpu13:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
<6>0 :: Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
2016-10-10T22:15:40.352Z cpu14:92288823)scsi0: Command abort returning 0x2003
2016-10-10T22:15:40.352Z cpu14:92288823)WARNING: LinScsi: SCSILinuxAbortCommands:1837: Failed, Driver Adaptec AIC7902 Ultra320 SCSI adapter, for vmhba1
2016-10-10T22:15:40.352Z cpu14:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
<6>0 :: Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
2016-10-10T22:15:42.352Z cpu14:92288823)scsi0: Command abort returning 0x2003
2016-10-10T22:15:42.352Z cpu14:92288823)WARNING: LinScsi: SCSILinuxAbortCommands:1837: Failed, Driver Adaptec AIC7902 Ultra320 SCSI adapter, for vmhba1
2016-10-10T22:15:42.352Z cpu14:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
<6>0 :: Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
2016-10-10T22:15:44.353Z cpu9:92288823)scsi0: Command abort returning 0x2003
2016-10-10T22:15:44.353Z cpu9:92288823)WARNING: LinScsi: SCSILinuxAbortCommands:1837: Failed, Driver Adaptec AIC7902 Ultra320 SCSI adapter, for vmhba1
2016-10-10T22:15:44.353Z cpu9:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
<6>0 :: Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
2016-10-10T22:15:46.355Z cpu9:92288823)scsi0: Command abort returning 0x2003
2016-10-10T22:15:46.355Z cpu9:92288823)WARNING: LinScsi: SCSILinuxAbortCommands:1837: Failed, Driver Adaptec AIC7902 Ultra320 SCSI adapter, for vmhba1
2016-10-10T22:15:46.355Z cpu9:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
<6>0 :: Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
scsi0: Dumping Card State at program address 0x12b Mode 0x33
2016-10-10T22:16:12.790Z cpu14:90826950)Card was paused
INTSTAT[0x0] SELOID[0x6] SELID[0x60] HS_MAILBOX[0x0]
INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x33]
SCSISIGI[0xa4] SCSIPHASE[0x0] SCSIBUS[0xc] LASTPHASE[0xa0]
SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x0] SEQINTCTL[0x0]
SEQ_FLAGS[0x40] SEQ_FLAGS2[0x4] QFREEZE_COUNT[0x1965]
KERNEL_QFREEZE_COUNT[0x1965] MK_MESSAGE_SCB[0x3] MK_MESSAGE_SCSIID[0x67]
SSTAT0[0x0] SSTAT1[0x8] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc0]
SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x0]
2016-10-10T22:16:12.790Z cpu14:90826950)
SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0xffff CURRSCB 0x3 NEXTSCB 0x0
qinstart = 27531 qinfifonext = 27532
QINFIFO: 0x22016-10-10T22:16:12.790Z cpu14:90826950)
WAITING_TID_QUEUES:
Pending list:
2 FIFO_USE[0x0] SCB_CONTROL[0x48] SCB_SCSIID[0x67]
3 FIFO_USE[0x0] SCB_CONTROL[0x0] SCB_SCSIID[0x67] 2016-10-10T22:16:12.790Z cpu14:90826950)
Total 2
Kernel Free SCB list: 1 0
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:
Sequencer On QFreeze and Complete list:
2016-10-10T22:16:12.790Z cpu14:90826950)
scsi0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10] 2016-10-10T22:16:12.790Z cpu14:90826950)
scsi0: FIFO1 Free, LONGJMP == 0x81f8, SCB 0x3
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x4] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
2016-10-10T22:16:12.790Z cpu14:90826950)scsi0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x52
2016-10-10T22:16:12.790Z cpu14:90826950)scsi0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
2016-10-10T22:16:12.790Z cpu14:90826950)scsi0: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
SIMODE0[0xc]
CCSCBCTL[0x4]
2016-10-10T22:16:12.790Z cpu14:90826950)scsi0: REG0 == 0x3, SINDEX = 0x107, DINDEX = 0x10e
2016-10-10T22:16:12.790Z cpu14:90826950)scsi0: SCBPTR == 0x3, SCB_NEXT == 0xff80, SCB_NEXT2 == 0xff74
2016-10-10T22:16:12.790Z cpu14:90826950)CDB 0 0 0 0 0 0
STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x02016-10-10T22:16:12.790Z cpu14:90826950)
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
/var/log/vmkwarning.log
2016-10-10T22:15:40.352Z cpu14:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
2016-10-10T22:15:42.352Z cpu14:92288823)WARNING: LinScsi: SCSILinuxAbortCommands:1837: Failed, Driver Adaptec AIC7902 Ultra320 SCSI adapter, for vmhba1
2016-10-10T22:15:42.352Z cpu14:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
2016-10-10T22:15:44.353Z cpu9:92288823)WARNING: LinScsi: SCSILinuxAbortCommands:1837: Failed, Driver Adaptec AIC7902 Ultra320 SCSI adapter, for vmhba1
2016-10-10T22:15:44.353Z cpu9:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
2016-10-10T22:15:46.355Z cpu9:92288823)WARNING: LinScsi: SCSILinuxAbortCommands:1837: Failed, Driver Adaptec AIC7902 Ultra320 SCSI adapter, for vmhba1
2016-10-10T22:15:46.355Z cpu9:92288823)WARNING: ScsiPath: 6273: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T6:L0
2016-10-10T22:15:57.752Z cpu14:32939)WARNING: NMP: nmpDeviceTaskMgmt:2266: Attempt to issue lun reset on device t10.IBM_____ULTRIUM2DTD3_____1210354899. This will clear any SCSI-2 reservations on t
2016-10-10T22:16:00.837Z cpu0:90826950)WARNING: VSCSI: 3485: handle 8325(vscsi2:6):WaitForCIF: Issuing reset; number of CIF:1
2016-10-10T22:16:00.837Z cpu0:90826950)WARNING: VSCSI: 2407: handle 8325(vscsi2:6):Ignoring double reset
2016-10-10T22:16:12.791Z cpu12:73923820)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "t10.IBM_____ULTRIUM2DTD3_____1210354899" state in doubt; requested fast path state update...
2016-10-10T22:16:12.819Z cpu14:32939)WARNING: NMP: nmpDeviceTaskMgmt:2266: Attempt to issue lun reset on device t10.IBM_____ULTRIUM2DTD3_____1210354899. This will clear any SCSI-2 reservations on t
2016-10-10T22:16:15.834Z cpu14:90826950)WARNING: VSCSI: 3485: handle 8325(vscsi2:6):WaitForCIF: Issuing reset; number of CIF:1
2016-10-10T22:16:15.834Z cpu14:90826950)WARNING: VSCSI: 2407: handle 8325(vscsi2:6):Ignoring double reset
2016-10-10T22:16:37.848Z cpu10:90826950)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "t10.IBM_____ULTRIUM2DTD3_____1210354899" state in doubt; requested fast path state update...
2016-10-10T22:16:37.851Z cpu14:32939)WARNING: NMP: nmpDeviceTaskMgmt:2266: Attempt to issue lun reset on device t10.IBM_____ULTRIUM2DTD3_____1210354899. This will clear any SCSI-2 reservations on t
TSC: 734420 cpu0:1)WARNING: ACPI: 1391: SPCR: Detected unsupported reg bit width (0); will assume 8 bits.
TSC: 739745 cpu0:1)WARNING: ACPI: 1440: SPCR: Detected invalid baud rate (0); will assume 115200
0:00:00:00.000 cpu0:1)WARNING: Serial: 641: Invalid serial port config: mem-mapped to addr 0x0.
0:00:00:05.648 cpu0:32768)WARNING: Cpu: 2145: Cache latency measurement may be inaccurate min= 145 max= 753 avg= 197
2016-10-11T01:59:58.970Z cpu1:33373)WARNING: LinuxSignal: 538: ignored unexpected signal flags 0x2 (sig 17)
Oct 10 11:49:48 bup102 kernel: Not tainted 2.6.32-642.4.2.el6.x86_64 #1
Oct 10 11:49:48 bup102 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 10 11:49:48 bup102 kernel: tar D 0000000000000002 0 11851 11849 0x00000080
Oct 10 11:49:48 bup102 kernel: ffff880002207c18 0000000000000082 0000000000000000 ffff880235502e06
Oct 10 11:49:48 bup102 kernel: ffff880235cd2000 0000000000040000 0001d2340ea2d357 ffff880235cd2000
Oct 10 11:49:48 bup102 kernel: ffff880002207c38 000000011e892b77 ffff88023739c5f8 ffff880002207fd8
Oct 10 11:49:48 bup102 kernel: Call Trace:
Oct 10 11:49:48 bup102 kernel: [<ffffffff81549895>] schedule_timeout+0x215/0x2e0
Oct 10 11:49:48 bup102 kernel: [<ffffffff8129ca57>] ? kobject_put+0x27/0x60
Oct 10 11:49:48 bup102 kernel: [<ffffffff815494f3>] wait_for_common+0x123/0x180
Oct 10 11:49:48 bup102 kernel: [<ffffffff8106c500>] ? default_wake_function+0x0/0x20
Oct 10 11:49:48 bup102 kernel: [<ffffffff8154962d>] wait_for_completion+0x1d/0x20
Oct 10 11:49:48 bup102 kernel: [<ffffffffa02b0509>] st_do_scsi.clone.0+0x2c9/0x420 [st]
Oct 10 11:49:48 bup102 kernel: [<ffffffffa02b5e73>] st_write+0x5e3/0xd80 [st]
Oct 10 11:49:48 bup102 kernel: [<ffffffff81199c28>] vfs_write+0xb8/0x1a0
Oct 10 11:49:48 bup102 kernel: [<ffffffff8119b116>] ? fget_light_pos+0x16/0x50
Oct 10 11:49:48 bup102 kernel: [<ffffffff8119a761>] sys_write+0x51/0xb0
Oct 10 11:49:48 bup102 kernel: [<ffffffff810ee25e>] ? __audit_syscall_exit+0x25e/0x290
Oct 10 11:49:48 bup102 kernel: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
Oct 10 11:53:48 bup102 kernel: INFO: task tar:11851 blocked for more than 120 seconds.
Oct 10 11:53:48 bup102 kernel: Not tainted 2.6.32-642.4.2.el6.x86_64 #1
.
.
.
Oct 10 12:15:47 bup102 kernel: st 3:0:6:0: CDB: Write(6): 0a 00 04 00 00 00
Oct 10 12:15:47 bup102 kernel: mptscsih: ioc1: task abort: SUCCESS (rv=2002) (sc=ffff880209568c80)
Oct 10 12:15:57 bup102 kernel: mptscsih: ioc1: attempting task abort! (sc=ffff880209568c80)
Oct 10 12:15:57 bup102 kernel: st 3:0:6:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Oct 10 12:15:57 bup102 kernel: mptscsih: ioc1: task abort: SUCCESS (rv=2002) (sc=ffff880209568c80)
Oct 10 12:15:57 bup102 kernel: mptscsih: ioc1: attempting target reset! (sc=ffff880209568c80)
Oct 10 12:15:57 bup102 kernel: st 3:0:6:0: CDB: Write(6): 0a 00 04 00 00 00
Oct 10 12:16:02 bup102 kernel: mptscsih: ioc1: target reset: SUCCESS (sc=ffff880209568c80)
Oct 10 12:16:12 bup102 kernel: mptscsih: ioc1: attempting task abort! (sc=ffff880209568c80)
Oct 10 12:16:12 bup102 kernel: st 3:0:6:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Oct 10 12:16:12 bup102 kernel: mptscsih: ioc1: task abort: SUCCESS (rv=2002) (sc=ffff880209568c80)
Oct 10 12:16:12 bup102 kernel: mptscsih: ioc1: attempting bus reset! (sc=ffff880209568c80)
Oct 10 12:16:12 bup102 kernel: st 3:0:6:0: CDB: Write(6): 0a 00 04 00 00 00
Oct 10 12:16:17 bup102 kernel: mptscsih: ioc1: bus reset: SUCCESS (sc=ffff880209568c80)
Oct 10 12:16:37 bup102 kernel: mptscsih: ioc1: attempting task abort! (sc=ffff880209568c80)
Oct 10 12:16:37 bup102 kernel: st 3:0:6:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Oct 10 12:16:42 bup102 kernel: mptscsih: ioc1: task abort: SUCCESS (rv=2002) (sc=ffff880209568c80)
Oct 10 12:16:42 bup102 kernel: mptscsih: ioc1: attempting host reset! (sc=ffff880209568c80)
Oct 10 12:16:42 bup102 kernel: mptscsih: ioc1: host reset: SUCCESS (sc=ffff880209568c80)
Oct 10 16:09:27 bup102 kernel: imklog 5.8.10, log source = /proc/kmsg started.