VMware Cloud Community
AllBlack
Expert

Marking (empty) volume offline causing disruptions

Hi there,

When I want to get rid of an iSCSI LUN, I always disconnect the host from the SAN and then rescan the host in vCenter.

I have always been told this is the correct way, and it works just fine. Please correct me if I am wrong.

We have some volumes that need to be removed because the SAN they are on will be decommissioned.

We made sure the volume was completely empty before marking it offline. Obviously this removes all connections to it, and the hosts need to be rescanned.

Before there is actually a chance to rescan the hosts, VMs in the environment start to lose connectivity.

vCenter, for example, gets disconnected for a while, even though it does not even sit on this particular SAN.

I have no doubt it is better to do a host at a time but this is what happened.

Obviously vCenter is running on just one host, and I can understand that the host is losing connectivity to one of its LUNs, but I am not sure why that would impact the entire machine.

There is always going to be a delay between disconnecting the host from the SAN and rescanning it, so I am not sure why this happens.

Please consider marking my answer as "helpful" or "correct"

9 Replies
athlon_crazy
Virtuoso

In the ESX host's vmkernel log, does the vmkernel complain about the remaining LUNs too? If other machines are also affected, I would check the logs on your iSCSI target as well.

BTW, may I know what storage you are using?

vcbMC-1.0.6 Beta

vcbMC-1.0.7 Lite

http://www.no-x.org

AllBlack
Expert

It looks like it is caused by this

athlon_crazy
Virtuoso

Looks like it's a known bug. BTW, thanks for sharing. Just wondering whether you did exactly the same as the steps listed below (from virtualgeek), or did you skip a step?

1. In the vSphere client, vacate the VMs from the datastore being removed (migrate or Storage vMotion).

2. In the vSphere client, remove the datastore.

3. In the vSphere client, remove the storage device.

4. Only then, in your array management tool, remove the LUN from the host.

5. In the vSphere client, rescan the bus.
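
As a rough illustration of why the order matters, the five steps above can be modelled as an ordered checklist and a planned removal can be checked against it. This is only a sketch in Python: the `Step` names and the `problems` helper are made up for illustration, nothing here talks to vCenter or to an array, and it assumes the quoted order is the one to follow.

```python
from enum import IntEnum


class Step(IntEnum):
    """The five steps quoted above, numbered in the quoted order (names are hypothetical)."""
    VACATE_VMS = 1        # migrate or Storage vMotion the VMs off the datastore
    REMOVE_DATASTORE = 2  # vSphere client: remove the datastore
    REMOVE_DEVICE = 3     # vSphere client: remove the storage device
    UNPRESENT_LUN = 4     # array management tool: remove the LUN from the host
    RESCAN = 5            # vSphere client: rescan the bus


def problems(plan):
    """Return one message for each step that runs without all earlier steps done."""
    issues = []
    done = set()
    for step in plan:
        # Every lower-numbered step should already have been carried out.
        missing = [s for s in Step if s < step and s not in done]
        if missing:
            names = ", ".join(s.name for s in missing)
            issues.append(f"{step.name} runs without prior: {names}")
        done.add(step)
    return issues


# The quoted order passes; "unpresent on the SAN, then rescan" does not.
quoted_plan = list(Step)
legacy_plan = [Step.VACATE_VMS, Step.UNPRESENT_LUN, Step.RESCAN]

for issue in problems(legacy_plan):
    print(issue)
```

With `legacy_plan`, the helper flags both the LUN removal and the rescan, because the datastore and the storage device were never removed on the vSphere side first.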






AllBlack
Expert

The workaround listed on virtualgeek does not work for us. It produces the "unable to query live vmfs" error in step 2.

Before we became aware of the error, we removed the volume the way we have always done: make sure no VMs are running on it, remove the host from the volume's ACL, and rescan the host.

Just had VMware support on the phone, and they pretty much confirmed that this bug is what we are experiencing.

athlon_crazy
Virtuoso

By step 2, do you mean you were not able to remove the datastore before unpresenting it from the ESX host? Sorry for asking so much; I'm afraid this could happen to me in the future.

AllBlack
Expert

Yep, right-clicking and choosing Delete in the datastore view produces that error.

As I said, I have always removed the host from the volume on the SAN and then done a rescan to remove LUNs.

I never had any problems (I would have if there had been a VM on it). If this is an incorrect way of doing it, someone should tell me, but it has been recommended to me by several sources over the years.

Guy Defryn

Infrastructure Development Engineer

P.O. Box 11 222

Information Technology Services

Massey University, Palmerston North

athlon_crazy
Virtuoso

A host reboot can solve the problem (http://communities.vmware.com/thread/233373), but since it impacts other VMs and vMotion is not possible, next time we should remove the datastore first and schedule the work for downtime.

athlon_crazy
Virtuoso

Hi AllBlack,

My colleague updated me with this from "xtravirt"

AllBlack
Expert

Yep, that is correct. We applied that patch yesterday in our test environment, and it seems to have solved the problem.
