VMware Cloud Community
Ilia_shapira
Contributor

New Install ESXi 3.5 issue

A few days ago I installed ESXi 3.5 on a server with an Adaptec 3408 RAID controller.

Everything was working fine, but the morning after the install the server stopped responding. I was able to ping it, but the local console was not responding and I was also unable to connect remotely. I restarted the server and again everything worked fine all day. The next morning, the same story again.

I'm new to VMware, so can anyone please tell me what the problem could be, where I should start looking and what I should do?

Thanks.

52 Replies
Jasemccarty
Immortal

What type of hardware do you have? Is it a system on the Hardware Compatibility List?

Jase McCarty

http://www.jasemccarty.com

Co-Author of VMware ESX Essentials in the Virtual Data Center

(ISBN:1420070274) from Auerbach

Jase McCarty - @jasemccarty
Ilia_shapira
Contributor

Yes and no :)

This is a Tyan GT24 server with an Adaptec 3408 RAID adapter.

The RAID adapter is supported, but the server is not.

Dave_Mishchenko
Immortal

Welcome to the VMware Community forums. The 3408 is not a supported controller (see the list here - http://www.vmware.com/pdf/vi35_io_guide.pdf) so if you're planning on using this for production it would be best to stick to something on the list.

Are you able to press ALT-F1 or, better, ALT-F12 when this happens? Also, in the VI client, go to Administration \ System Logs and see if you have any errors there.

Ilia_shapira
Contributor

It is there on page 5

Adaptec RAID 3405 aacraid_esx30 1.1.5‐2415 aacraid_esx30 1.1.5‐2415 aacraid_esx30 1.1.5‐2415

I will later try to press ALT-F1 and F12 and let you know.

In the VI client I didn't see any errors while it was responding; now I can't access it.

Dave_Mishchenko
Immortal

If you're not using it, you'll want to use the latest install of ESXi 3.5 Update 1 - Build: 82664. Likewise for the BIOS / firmware for the various hardware components.

Ilia_shapira
Contributor

Of course I'm using the latest firmware for the server and the RAID card, and I'm using the latest version of VMware.

Yes, I can press ALT-F1 and ALT-F12.

Actually, this time I can even access the virtual servers (remote access, file access, etc.). The only thing I can't access is the VMware server itself through the VMware Infrastructure Client. But when I press F12, or try to restart it with F12, it's not responding.

ALT-F1 Shows:

stoping sfcbd

starting sfcbd

child still alive with status of 0

killed

child still alive with status of 0

killed

..

..

child still alive with status of 0

killed

stoping sfcbd

starting sfcbd
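For anyone who wants to tally a paste like this rather than eyeball it, here is a minimal sketch in Python. It is not from the thread: it assumes the console text has been copied by hand into a string, and the function name `summarize_console` is made up for illustration. sfcbd is the Small Footprint CIM Broker daemon, so a repeating stop/start cycle may point at the management side rather than at the VMs, but the count alone does not prove that.

```python
from collections import Counter


def summarize_console(text):
    """Count how often each distinct message appears in pasted console text.

    Blank lines and the ".." placeholders used to abbreviate the paste are skipped.
    """
    messages = (line.strip() for line in text.splitlines())
    return Counter(m for m in messages if m and m != "..")


# Excerpt copied as typed in the post above (spelling left untouched).
SAMPLE = """\
stoping sfcbd
starting sfcbd
child still alive with status of 0
killed
child still alive with status of 0
killed
stoping sfcbd
starting sfcbd
"""

if __name__ == "__main__":
    # Print the most frequent messages first.
    for message, count in summarize_console(SAMPLE).most_common():
        print(f"{count:3d}  {message}")
```

Comparing the counts from two captures taken some time apart would show whether the restart loop is still running.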

Alt-F12 shows a huge amount of information, warnings, etc.

Any ideas what the problem is?

Dave_Mishchenko
Immortal

Depending on the MB the server has, it may have an nVidia or LSI controller that ESX will recognize. It might be worth trying the install without the 4805 card to see if that makes a difference. Have you tried installing the regular ESX 3.5?

Ilia_shapira
Contributor

It has NVIDIA RAID. I tried to install without the Adaptec card, but it won't recognize the NVIDIA RAID. It can work directly with the disks attached to the NVIDIA controller but can't see the NVIDIA RAID 5.

No, I didn't try ESX. Actually, I still don't fully understand the difference between ESX and ESXi. What is important for me is the price: ESXi costs $500, while as far as I understand ESX costs more.

Dave_Mishchenko
Immortal

Here's a thread to look at that discusses ESX / ESXi - http://communities.vmware.com/message/918458#918458. There are others like this in this forum. Essentially with ESXi VMware has removed the service console Linux VM. That has reduced both the size of ESX as well as the security exposure.

While ESX/ESXi will recognize a number of SATA controllers like those from nVidia, the RAID functions won't work on cards which require a software component to make RAID work. You should be able to install it when the drives are set up in native / IDE mode, and then see whether ESXi works OK or fails again. That way you can get an idea of whether the problem is the MB or the Adaptec card.

ESX starts at $1000 - are you planning to keep this as a standalone host or will you add other hosts and VirtualCenter?

Ilia_shapira
Contributor

This is going to be a standalone server so I don't want to invest too much in it and this is why I prefer ESXi

I know why it doesn't work with NVIDIA RAID, and this is exactly why I put the Adaptec in there.

Anyway because I'm still evaluating it I opened a support request and hope to get some answers from them soon.

cdickerson
Enthusiast

I'm experiencing a similar issue and am in the process of trying to narrow down the cause as well. I have two ESX3i boxes (ASUS M2A-VM motherboards, AMD processors, 4 GB of RAM, Intel Pro/1000 NICs, connected via iSCSI to an Openfiler SAN). This is obviously a test lab. I am booting ESX3i from Kingston HyperX USB flash drives.

On one server, VirtualCenter loses connectivity after a few hours and I can't connect to it directly either. I can still ping it, but the console seems locked up. I was going down the path of thinking it was a hardware problem, because the other server with identical hardware doesn't have any problems, so I swapped the flash drives between servers. The problem moved with the flash drive. So I tried another flash drive. When I rebuilt the flash drive, I hadn't removed the server from VirtualCenter yet, and it appears that VC tried to reinstall and reactivate HA/DRS on the newly rebuilt server. Shortly after, the server hung again.

I am leaning toward a Management Agent or HA bug. I have just rebuilt the servers from scratch and will see if they run for a while without being part of VirtualCenter. If so, I am going to add them to VC but not enable HA. I also haven't applied the firmware update patch yet.

-Craig

Ilia_shapira
Contributor

I also have two test servers and the second one works fine, but in my case they have different hardware, so...

I don't know if the "firmware update patch" will help or not, because I can't install it on either of my test servers :( I receive an error described here:

The KB says "please contact VMware Support and create a Support Request" and this is what I did, but I still have no answer from the VMware people.

ocremel
Hot Shot

Could you tell me what your SR # is? Thanks.

Ilia_shapira
Contributor

1114953515

mike_laspina
Champion

Hi,

It could be bad RAM. I had one lab server that would install fine and run fine, then drop without error. I replaced the memory and it worked fine thereafter.

http://blog.laspina.ca/ vExpert 2009
Ilia_shapira
Contributor

No, it's not RAM.

The virtual machines inside this server run fine; the only part that stops responding is management.

cdickerson
Enthusiast

I went back to version 70348 and haven't had any problems. I was even having problems adding a fresh install to VC without the management agents hanging. I don't have HA enabled.

Ilia_shapira
Contributor

I upgraded the server with the latest patch but it still didn't solve the problem.

However, I was able to find a workaround for the problem: when I change the IP of the server from DHCP to static, it works OK.

It's strange, because even when it stops responding I can still ping it.
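A ping only shows that the host's network stack answers; it says nothing about whether the management services still accept connections. A rough way to tell the two apart is to try a plain TCP connection to the management ports. The sketch below is a minimal illustration, not something anyone in this thread ran. It assumes the VI Client talks to the host on TCP 443 and 902, and the address 192.0.2.10 and the function names are placeholders.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_management(host, ports=(443, 902)):
    """Map each port to whether it currently accepts TCP connections.

    The default ports are an assumption about what the VI Client uses.
    """
    return {port: tcp_port_open(host, port) for port in ports}


if __name__ == "__main__":
    # 192.0.2.10 is a documentation placeholder, not the host from this thread.
    for port, is_open in check_management("192.0.2.10").items():
        print(f"port {port}: {'open' if is_open else 'no response'}")
```

If ping answers while these ports do not, that would be consistent with the management side having stopped while the network stack keeps running. An open port only means something is listening, not that the agent behind it is healthy.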

Ilia_shapira
Contributor

Unfortunately I was wrong about DHCP. The server got stuck again, but this time after two days rather than the next day, so switching from DHCP to static only bought an extra day.

I have also still had no word from VMware on my support ticket.
