VMware Cloud Community
kenez
Contributor

Everything freezes, HDD activity light constantly on, when moving or copying large VMDK files

It is a Fujitsu-Siemens blade server running ESXi 4, with two SATA HDDs installed (1 TB each), presented as datastore1 and datastore2. Sometimes I want to move large files between the datastores, or from them to another computer, using PowerCLI or Veeam FastSCP (mainly to duplicate VMDK files for backup).

When these large files are being copied, the system sometimes hangs, roughly one time in three. First the data transfer appears to stop while the green HDD activity light stays on constantly. When I then try to reboot the ESXi server (using the Restart-VMHost command in PowerCLI), at first I can log in but the command gets no response, and even the vSphere client cannot load the inventory; later I cannot connect at all. The HDD light stays on, the virtual machines freeze after a few minutes, and in the end nothing can be done except physically holding down the power button on the blade.

This problem occurs irregularly, but far too often to ignore. Sometimes the very same task finishes without trouble; sometimes it brings the whole system down.

What is happening, and how can I avoid it? I suspect there may be some kind of sync problem when the system reads a lot of data.

Otherwise, when I don't copy or move large files, everything works fine; three Windows servers run on the box without problems.

Please help; any hints are welcome.

Thank you in advance!

Kenez

11 Replies
DSTAVERT
Immortal

Welcome to the forums. I would download the logs using the vSphere client and check whether they show any errors or other indications of the problem. Without them we would only be guessing. The logs may not go back far enough, since they aren't persistent, so I would suggest that you change the log storage location. In the client, go to Configuration tab -> Software -> Advanced Settings -> Syslog and point it to a folder on one of your datastore disks.

Which build/version are you on? Is it up to date?

-- David -- VMware Communities Moderator
kenez
Contributor

Thank you for your tip!

I'm quite new to this... How do I download the logs? Isn't it true that the log is gone when the host reboots?

Anyway, I followed your recommendation and changed the log location. So now we need to wait for another incident. (I don't know if I really want one.) As soon as it happens again, I will post the log content.

Thanks again!

kenez
Contributor

Sorry, I forgot: it is ESXi 4.0.0, build 171294.

DSTAVERT
Immortal

Does your disk controller have an onboard battery-backed cache, and is read/write caching enabled? Do not enable caching if the controller is not equipped with a battery. Caching controllers are important, especially in virtual environments, where I/O demands are high.

The logs can be exported from the administration menu on the vSphere client.

When (or if) you have logs, don't just post them in full. Scan through them for error messages first, and post the relevant sections if they don't lead you anywhere. Look for things such as SCSI resets or read/write errors.
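A minimal sketch of that scanning step in Python, assuming the logs have been exported as plain text files. The keyword list is only an assumption drawn from the hints above, not an official list of ESXi messages, and the function names `scan_log` and `scan_file` are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumed search terms, based only on the hints above (errors, SCSI resets,
# read/write errors). Adjust them to what your own logs actually contain.
DEFAULT_PATTERNS = [
    r"\berrors?\b",
    r"scsi.*reset",
    r"\btimeout\b",
]


def scan_log(lines: Iterable[str],
             patterns: List[str] = DEFAULT_PATTERNS) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line matching any pattern."""
    # Combine all patterns into one case-insensitive alternation.
    regex = re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)
    hits = []
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            hits.append((number, line.rstrip("\r\n")))
    return hits


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Scan one exported log file; undecodable bytes are replaced, not fatal."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return scan_log(handle)
```

For example, `for number, text in scan_file("messages.log"): print(number, text)` prints each suspicious line with its line number, where `messages.log` is a placeholder for whichever exported file you are checking.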

-- David -- VMware Communities Moderator
kenez
Contributor

Good point, I'm getting smarter and smarter :)

I will have the hardware details checked tomorrow, since I was not the one who bought the box.

Unfortunately I find nothing in the vSphere client. Please see the screenshot attached.

kenez
Contributor

P.S.

The only thing I know about the box is that it is a Fujitsu-Siemens "BX630 Dual" blade (_not_ the "S2"). Does that mean anything to anyone?

DSTAVERT
Immortal

Try the File menu in the client, under Export.

-- David -- VMware Communities Moderator
kenez
Contributor

Thank you, I found the logs but, as you suspected, it was too late: there are no error messages from the time the problems occurred. Anyway, from now on the logs are saved on a datastore disk. (Nothing has happened since then, but no large copy has been done either.)

Regarding the hardware, as I mentioned, it is a Fujitsu-Siemens BX630 server blade with a Promise FastTrak S150 TX4 SATA controller in it. I found no information about any battery-backed cache in it.

More large copies are coming up, so I hope I will be able to provide more details about the error.

kenez
Contributor

First of all, sorry for my long silence, and thank you for your efforts. Everything seems to point to the conclusion that this problem is not VMware-related but a hardware incompatibility: the HDD size I use in this blade server is simply not supported. There is no BIOS update for it, nothing; it is just unsupported. I can buy another blade or go back to the original 80 GB HDDs, which should be enough, shouldn't it? (I will give it one more try by jumpering the HDDs down to the slower SATA I speed, to see if that helps.)

If anyone has had similar experiences and, maybe, solutions, please share them.

Thanks again, and take care.

Kenez
