VMware Cloud Community
shatztal
Contributor

Can't see the datastore after reboot

Hello, I have a cluster with 3 hosts. Every host has a 10 GB disk from the SAN that the OS is installed on, and there is one 650 GB shared datastore that the VMs sit on. I applied patches yesterday, and after all the updates completed successfully the hosts were rebooted a couple of times and everything looked fine.

Today I rebooted the first host, and now that it is back up it cannot see the datastore, only the system disk.

I rebooted another host and had the same problem, so now I don't want to reboot the third, because then nothing will work.

The thing is that I can see the datastore from the non-rebooted host in the console, but the other two don't see it. What can be the problem?

Please, I need help; this is very critical.

17 Replies
NTurnbull
Expert

On one of the hosts that doesn't see the LUNs, if you view the Storage Adapters properties from VC, what do you see? If you're using iSCSI then you should see the iSCSI name, alias, etc. in the bottom pane when you highlight the storage adapter.

Thanks, Neil
NTurnbull
Expert

Getting a little too quick on that post button. It would also be helpful to know what version/build you were on, which patches were applied, whether anything else got patched/upgraded, and what version/build you're on now.

Thanks, Neil
shatztal
Contributor

The hosts see only the 10 GB LUN that the OS is installed on, but they don't see the shared one after the reboot.

So I don't think it has something to do with the LUNs, because they do see one of them.

Now I have ESX Server 3.5.0, build 120512.

I don't remember what I had before, but I had not installed any patches until now, so I believe I had the build from the ESX 3.5 installation.

ncentech
Enthusiast

What is the failover policy on your HBAs, MRU or Fixed? Did you scan the HBAs? What about your connection to the SAN, did any of that change? Are the HBAs present in the Storage Adapters section of the Configuration tab? Sorry, I don't remember if you are using HBAs or iSCSI, but if you are, go ahead and look at the things I just recommended. Let me know. I will post some more information in a few minutes.

shatztal
Contributor

I use HBAs. How do I know what my failover policy is? In Path Location it says FIXED, if that is what you meant.

I did not change anything in the SAN config. I hope you can help me.

ncentech
Enthusiast

Yes, that is where you see it. Fixed it is, then. Is this on your working server? What about the other two servers, are the HBAs visible? Do you have a single HBA or dual HBAs? Do you have access to the physical servers? You need to make sure there is activity on the back of the cards. Have you tried rescanning the HBAs yet?
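
For the rescan, here is a minimal sketch from the ESX 3.x service console. It assumes the `esxcfg-rescan` and `esxcfg-mpath` tools are on the path; the helper name `rescan_and_list` and the adapter names you pass in (for example `vmhba1`) are placeholders, so use whatever adapters your host actually lists.

```shell
#!/bin/sh
# rescan_and_list ADAPTER [ADAPTER...]
# Rescans each named storage adapter, then lists the paths the host sees.
# Adapter names are supplied by the caller; nothing here is specific to one setup.
rescan_and_list() {
    for hba in "$@"; do
        echo "Rescanning $hba"
        esxcfg-rescan "$hba" || return 1   # stop if a rescan fails
    done
    esxcfg-mpath -l                        # show all paths after the rescan
}

# Example call (adapter names are assumptions):
#   rescan_and_list vmhba1 vmhba2
```

If the shared LUN is still missing from the path listing after a rescan, the problem is more likely in what the SAN presents than in the host not having looked.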

shatztal
Contributor

That was on the good host. The other two can't get information about it because they think it does not exist. On all three I see both HBA cards: from the ESX console on each of the 3 I run esxcfg-mpath -l and I see 2 cards, as I should.

I have access to all the servers with no problem. Remember that the system disks on all 3 are from the SAN as well, so why is only the shared one invisible to 2 hosts?

When I run fdisk -l on the good host I see the disk, and on the other 2 I don't.
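
One way to make that comparison exact is to save the same listing on each host and diff the files. This is only a sketch: the function name `missing_on_bad` and the file names are made up for illustration, and raw `fdisk -l` or `esxcfg-mpath -l` lines can differ between hosts for harmless reasons, so read the result as a hint and not as proof.

```shell
#!/bin/sh
# missing_on_bad GOOD_FILE BAD_FILE
# Prints the lines that appear in GOOD_FILE but not in BAD_FILE.
missing_on_bad() {
    good_sorted=$(mktemp) || return 1
    bad_sorted=$(mktemp) || { rm -f "$good_sorted"; return 1; }
    sort -u "$1" > "$good_sorted"
    sort -u "$2" > "$bad_sorted"
    comm -23 "$good_sorted" "$bad_sorted"   # column 1 only: unique to the good host
    rm -f "$good_sorted" "$bad_sorted"
}

# Example (file names are assumptions):
#   on each host:  esxcfg-mpath -l > /tmp/mpath-$(hostname).txt
#   then, with both files in one place:
#   missing_on_bad /tmp/mpath-goodhost.txt /tmp/mpath-badhost.txt
```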

And when I go through the VI client to the hosts, it remembers the datastore, but if I browse the datastore I don't see all the VMs as I did before.

ncentech
Enthusiast

Have you looked at the logs? Go to /var/log and look at the vmkwarning log, the vmkernel log, etc. Post the results from the logs; it might be helpful.
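
A rough sketch for narrowing the logs down to storage-related lines. The keyword list (`vmhba`, `scsi`, `lun`, `lvm`) is just a guess at what is relevant, and the helper name `storage_warnings` is made up; adjust both to suit.

```shell
#!/bin/sh
# storage_warnings LOGFILE
# Prints lines from LOGFILE that mention common storage keywords (case-insensitive).
# The keyword list is an assumption, not an official set of message tags.
storage_warnings() {
    grep -i -E 'vmhba|scsi|lun|lvm' "$1"
}

# Example:
#   storage_warnings /var/log/vmkwarning
#   storage_warnings /var/log/vmkernel
```

Running the same filter on the good host and on a bad host makes it easier to spot messages that only show up on the hosts with the problem.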

shatztal
Contributor

I will look there, but what should I look for? I can't post the logs here because it is an isolated network.

shatztal
Contributor

There is a message in the vmkwarning log that says "could not register a logical device for target vmhba1:0:1".

This message appears on the 2 hosts that have the problem; on the good host it does not appear.

jeremypage
Enthusiast

That looks like you are on the right track. You may want to consider calling VMware; they can probably get you a resolution faster than we can.

ncentech
Enthusiast

Yes, that is your 10GB LUN that your two rebooted hosts cannot see. You should be looking for any errors or warnings related to the HBAs, storage, etc. But it looks like we are on the right track: something happened to the way your HBA registered the LUN prior to the reboot. So did you do an upgrade, as you said, or did you just patch the server? You have a few options, but the easiest way to go about this is to re-install your VMware software. The other is to play with the advanced options, but I really don't recommend it since you might create more problems. In the advanced settings under LVM you have two settings; one is LVM.EnableResignature, which you can change, but read about it before you do. Do a search on that on the VMware site; there is a lot of documentation out there. But again, it's kind of delicate and you can create more work for yourself. If you have tech support you can call VMware and they should be able to guide you through this; if not, proceed with caution.
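
Before changing anything, it is safe to just read the current values on each host. A minimal read-only sketch, assuming the service console's `esxcfg-advcfg` tool and assuming the second LVM setting is `LVM.DisallowSnapshotLun`; the helper name `show_lvm_flags` is made up.

```shell
#!/bin/sh
# show_lvm_flags
# Prints the current values of the two LVM advanced options. Read-only:
# it only uses "-g" (get) and never sets anything.
show_lvm_flags() {
    if ! command -v esxcfg-advcfg >/dev/null 2>&1; then
        echo "esxcfg-advcfg not found; run this on the ESX service console" >&2
        return 1
    fi
    for opt in /LVM/EnableResignature /LVM/DisallowSnapshotLun; do
        esxcfg-advcfg -g "$opt"
    done
}

# Example:
#   show_lvm_flags
```

Comparing the output from the good host with the output from a rebooted host would at least show whether the patches changed either value.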

shatztal
Contributor

But I can't call at this hour. We are on a holiday here in Israel now and they can only give me support tomorrow, so I need to find a solution on my own, with all of you.

shatztal
Contributor

I patched the hosts but upgraded VC/UM. Maybe it is something the patches did? I am afraid to reboot the good one because I know something is going to happen.

ncentech
Enthusiast

Yes, DON'T touch the good server. But that is what it looks like: the resignaturing of the LUN is what is causing the issue. How about storage groups on your SAN? Again, for some reason the SAN is not recognizing the HBAs after the reboot, so it cannot present the LUN to them. Are you able to rebuild one of the bad servers? Currently all of your VMs are running on the good server, is that correct? Or are there any that are offline? If you are going to rebuild the server, please make sure to move all the VMs to the working server, even the ones that are powered off.

shatztal
Contributor

All the VMs run on the good server. I don't know about the storage groups on the SAN, because I didn't configure the SAN.

How should I rebuild the server?

What should I do with the signatures? How do I use them? Which parameter should I change, EnableResignature or DisallowSnapshotLun?

What do they do? Should I change them on all servers, even on the good one? I am sure the good server will do the same thing after a reboot.

ncentech
Enthusiast

Yeah, those are the things you might need to do, but like I said before, you need to read up on them to see what they will do to your environment. I don't have access to your environment so I don't know how it's configured. Have you installed the software before? It's pretty straightforward; you just need to document your current configuration: make sure you have all the partitions, the IP addresses, the vSwitches, the port group names and of course the name of your ESX server.
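
To document the configuration before a rebuild, something like the sketch below can dump the basics into one text file. It assumes the usual ESX 3.x service console commands (`esxcfg-vswif`, `esxcfg-vswitch`, `esxcfg-nics`) plus `hostname` and `fdisk`; the function name `save_host_config` and the output path are made up, and the list is almost certainly not complete for every setup.

```shell
#!/bin/sh
# save_host_config OUTFILE
# Writes the output of a few inventory commands to OUTFILE, one section per command.
# A command that fails or is missing is noted in the file instead of aborting.
save_host_config() {
    out=$1
    : > "$out" || return 1                      # create or truncate the output file
    for cmd in "hostname" "esxcfg-vswif -l" "esxcfg-vswitch -l" "esxcfg-nics -l" "fdisk -l"; do
        echo "### $cmd" >> "$out"
        # $cmd is left unquoted on purpose so the option is split from the command name
        $cmd >> "$out" 2>&1 || echo "(command failed or not available)" >> "$out"
    done
}

# Example (output path is an assumption):
#   save_host_config /tmp/esx-config-$(hostname).txt
```

Copy the resulting file off the host before reinstalling, since anything left on the local system disk will be lost.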

But all in all, I am not an expert; I have just come across this before and that is what it took for me to fix it. Your scenario might be a bit different and other things could actually be the issue.

I hope you have enough information here to start working on your issue. Reboot the bad servers again and look at the logs to see what other issues you are experiencing.