VMware Cloud Community
Mallik7
Enthusiast

Is there a script available to determine LUN / datastore GPT corruption?

Our backup jobs are failing because of GPT corruption (primary partition table) on the datastores.

I ran a command to check the status. However, we would like to check our entire environment to find out how many datastores have the same issue.

So, is there a PowerCLI script available to detect LUN / datastore GPT corruption, per cluster or for the whole vCenter? Could you please help here?

48 Replies
Mallik7
Enthusiast

The GPT status shows the same for all the hosts, which is incorrect.

I know that the GPT on some of the LUNs is corrupted. If I run the command from a host, it shows an error.

LucD
Leadership

And the VMHost and Diskname properties are different?


Blog: lucd.info  Twitter: @LucD22  Co-author PowerCLI Reference

Mallik7
Enthusiast

The host name and disk name properties are different, but there is a lot of duplication.

For example:

In one cluster there are 10 ESXi hosts and about 30 datastores, and all 30 datastores are mapped to each host in that cluster.

So the output shows 30 datastore entries for each host, which creates a huge number of duplicate entries.

LucD
Leadership

Since these datastores are multi-host, that is the default view afaik.

So how do you want a datastore that is visible on multiple ESXi nodes to appear in the report?
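
One possible answer to that question, as a minimal sketch: collapse the per-host rows into one row per disk and note how many hosts reported it and whether they all agree. The function name `Merge-GptReport` and the `Hosts` and `Consistent` columns are hypothetical; the input is assumed to be rows with `Datastore`, `DiskName` and `GPT Status` properties, like the ones the report produces.

```powershell
function Merge-GptReport {
    param(
        # Rows with at least Datastore, DiskName and 'GPT Status' properties (assumed shape)
        [Parameter(Mandatory)][object[]]$Row
    )

    $Row | Group-Object -Property DiskName | ForEach-Object {
        # Distinct status strings that the hosts reported for this disk
        $statuses = @($_.Group.'GPT Status' | Sort-Object -Unique)

        [pscustomobject]@{
            DiskName     = $_.Name
            Datastore    = ($_.Group | Select-Object -First 1).Datastore
            Hosts        = $_.Count                    # number of hosts that reported this disk
            Consistent   = ($statuses.Count -eq 1)     # $true when every host saw the same status
            'GPT Status' = $statuses -join ' | '
        }
    }
}
```

If the script output is collected in a variable, for example `$report`, then `Merge-GptReport -Row $report` would return one line per disk instead of one line per host and disk.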


Mallik7
Enthusiast

Since the datastores are mapped to each host in a cluster (datastore mapping is limited to each cluster), you may leave it like this. However, I would like the result to be listed as correct.

TIA

LucD
Leadership

What do you mean by "... result to be listed as correct"?


Mallik7
Enthusiast

Sorry for the trouble.

What I mean is that the GPT status column shows the same result for all the LUNs / datastores.

Please see the result below:

NAA ID                                GPT Status
naa.60050768018287cb6800000000000056  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb6800000000000057  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb6800000000000058  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb6800000000000059  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb680000000000005a  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb680000000000005b  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb680000000000005c  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb680000000000005d  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb680000000000005e  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb680000000000005f  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000ba  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000bb  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000bc  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000bd  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000be  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000c3  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000bf  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000c0  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000c1  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb68000000000000c2  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb6800000000000118  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb6800000000000119  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
naa.60050768018287cb680000000000011a  gpt 267349 255 63 4294967296 1 2048 4294967262 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

LucD
Leadership

Not sure what you expect to be different there.

All your LUNs obviously have the same geometry (which is to be expected in most environments).

And the GUID (AA31E02A400F11DB9590000C2911D1B8) just indicates it is a VMFS datastore partition.

See the Related Information section in KB1036609
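
To read such a status string field by field, a minimal sketch follows. It assumes the layout that `partedUtil getptbl` prints: a label, then the geometry (cylinders, heads, sectors per track, total sectors), then one entry per partition (number, start sector, end sector, type GUID, type name, attribute), all joined with spaces as in the report above. The function name `ConvertFrom-GptStatus` and the `HasError` heuristic (any text containing "error" or "corrupt") are illustrative assumptions, not part of partedUtil or PowerCLI.

```powershell
function ConvertFrom-GptStatus {
    param([Parameter(Mandatory)][string]$Status)

    # Label followed by the four geometry numbers
    $geometryPattern  = '(?<label>gpt|msdos|unknown)\s+(?<cyl>\d+)\s+(?<heads>\d+)\s+(?<spt>\d+)\s+(?<sectors>\d+)'
    # One GPT partition entry: number, start, end, 32-hex-digit type GUID, type name, attribute
    $partitionPattern = '(?<num>\d+)\s+(?<start>\d+)\s+(?<end>\d+)\s+(?<guid>[0-9A-Fa-f]{32})\s+(?<type>\S+)\s+(?<attr>\d+)'

    # Assumption: any mention of "error" or "corrupt" marks a problem
    $hasError = $Status -match 'error|corrupt'

    $g = [regex]::Match($Status, $geometryPattern)
    if (-not $g.Success) {
        # No partition table found in the text; return what little is known
        return [pscustomobject]@{ Label = $null; TotalSectors = $null; Partitions = @(); HasError = $hasError }
    }

    # Partition entries follow the geometry
    $rest  = $Status.Substring($g.Index + $g.Length)
    $parts = foreach ($m in [regex]::Matches($rest, $partitionPattern)) {
        [pscustomobject]@{
            Number      = [int]$m.Groups['num'].Value
            StartSector = [long]$m.Groups['start'].Value
            EndSector   = [long]$m.Groups['end'].Value
            TypeGuid    = $m.Groups['guid'].Value
            Type        = $m.Groups['type'].Value
        }
    }

    [pscustomobject]@{
        Label        = $g.Groups['label'].Value
        TotalSectors = [long]$g.Groups['sectors'].Value
        Partitions   = @($parts)
        HasError     = $hasError
    }
}
```

Applied to any row of the report above, this should yield the label `gpt`, a total of 4294967296 sectors, and a single `vmfs` partition starting at sector 2048.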


Mallik7
Enthusiast

In my environment, the GPT on some of the LUNs is corrupted.

If I run a command from PowerCLI shell, it is showing as "GPT primary is got corrupted, but backup table is OK".

But when I run this PowerCLI script, the result is different: it shows the corrupted LUN as good too, which is incorrect.

LucD
Leadership

I'm afraid I'm going to stop answering on this thread, and any future questions you may have.

Normally I don't care if a thread gets a Correct Answer or not, but removing a Correct Answer is a step too far for me.
I think I replied to your initial question, but you kept adding requirements.

And now you obviously want me to explain why a method you proposed is not returning what you expect.


Mallik7
Enthusiast

Hi LucD, I really appreciate and thank you for the amount of time that you have invested in my queries and questions. It's a great help.

Thanks once again.

LucD
Leadership

What command do you mean by '...command from PowerCLI shell, it is showing as "GPT primary is got corrupted, but backup table is OK"'?


Mallik7
Enthusiast

I'm attaching two different screenshots:

GPTtable-corrupt-commandline-output.jpg  ---> the output when I run the command directly on an ESXi server

GPTcorruption-script-output ---> the script output

You can see the difference in the GPT results: the "GPT Error" is not captured in the script output.

Mallik7
Enthusiast

Re-attaching the screenshot:

GPTtable-corrupt-commandline-output.jpg  ---> the output when I run the command directly on an ESXi server

LucD
Leadership

I think I know what happens.
The corruption message is sent to stderr, which is not captured when no pseudo-terminal is used.

This version uses a pseudo-terminal and should also return the stderr text.

Can you try it like this?

# Credentials for the SSH login on each ESXi host (sample values, replace with your own)
$user = 'root'
$pswd = 'VMware1!'
$secPswd = ConvertTo-SecureString $pswd -AsPlainText -Force
$cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $user, $secPswd

Get-VMHost -PipelineVariable esx |
ForEach-Object -Process {
    # Start the SSH service if it is not running; $service is only set when the script started it
    $service = Get-VMHostService -VMHost $esx | where{$_.Key -eq 'TSM-SSH' -and -not $_.Running} |
        Start-VMHostService -Confirm:$false

    # New-SSHSession and New-SSHShellStream come from the Posh-SSH module
    $session = New-SSHSession -ComputerName $esx.Name -Credential $cred -AcceptKey
    $stream = New-SSHShellStream -SSHSession $session -TerminalName tty

    # Discard banner & motd & ...
    while ($stream.DataAvailable)
    {
        $stream.Read() | Out-Null
    }

    Get-Datastore -RelatedObject $esx -PipelineVariable ds |
    where{$_.Type -eq 'VMFS'} |
    ForEach-Object -Process {
        # One partedUtil call per extent (LUN) that backs the datastore
        $ds.ExtensionData.Info.Vmfs.Extent |
        ForEach-Object -Process {
            $dev = $_.DiskName
            $stream.WriteLine("partedUtil getptbl /vmfs/devices/disks/$dev")
            $stream.ReadLine() | Out-Null              # Drop command echo
            $result = $stream.Expect([regex]'] $')     # Wait for prompt & read result

            New-Object -TypeName PSObject -Property ([ordered]@{
                vCenter = ([uri]$esx.ExtensionData.Client.ServiceUrl).Host
                VMHost = $esx.Name
                Datastore = $ds.Name
                DiskName = $dev
                # Drop the prompt line and join the remaining output lines
                'GPT Status' = ($result.split("`n").Where{$_ -notmatch "^\[.*\]"}) -join ' '
            })
        }
    }

    $stream.Close()
    Remove-SSHSession -SSHSession $session | Out-Null

    # Stop the SSH service again if the script started it
    if($service){
        Stop-VMHostService -HostService $service -Confirm:$false | Out-Null
    }
}


Mallik7
Enthusiast

Now it captures the result correctly. I would also like to add the cluster information (this will be my last request). Kindly help. Thanks.

LucD
Leadership

That can be done with just an update of the New-Object part.

   New-Object -TypeName PSObject -Property ([ordered]@{
     vCenter = ([uri]$esx.ExtensionData.Client.ServiceUrl).Host
     Cluster = (Get-Cluster -VMHost $esx).Name
     VMHost = $esx.Name
     Datastore = $ds.Name
     DiskName = $dev
     'GPT Status' = ($result.split("`n").Where{ $_ -notmatch "^\[.*\]" }) -join ' '
   })


Mallik7
Enthusiast

Hello again,

I found a problem with this script that I want to bring to your notice. Please help.

If the root password for a host is wrong, or the script is otherwise unable to connect to that host, it reports the same result (I mean, the value stored from the previous host) for that host. Could you please correct this?

If a host cannot be contacted, the script should capture whatever error is thrown (for example, "unable to contact") and show it against that host in the output file.

Also, can you please add a line that prompts for the user ID and password (either root or Active Directory authentication), instead of us entering them manually in the script?

TIA

LucD
Leadership

I know, the current script doesn't have any error handling built in.

You can intercept the result of the New-SshSession cmdlet, and take appropriate action.

You can prompt for the credentials with the Get-Credential cmdlet.
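
A minimal sketch of how those two suggestions could fit together, under stated assumptions: the helper `Invoke-GptCheckPerHost` and its `-Connect` / `-Query` script block parameters are hypothetical names, not part of PowerCLI or Posh-SSH. The point is that the session variable is reset for every host and that a failed connection produces its own row, so a stale result from the previous host cannot leak into the report.

```powershell
function Invoke-GptCheckPerHost {
    param(
        [Parameter(Mandatory)][string[]]$HostName,
        # Hypothetical: opens a session for one host name and throws when it fails
        [Parameter(Mandatory)][scriptblock]$Connect,
        # Hypothetical: receives host name and session, emits the report rows
        [Parameter(Mandatory)][scriptblock]$Query
    )

    foreach ($name in $HostName) {
        $session = $null                      # never reuse the previous host's session
        try {
            $session = & $Connect $name
            & $Query $name $session
        }
        catch {
            # Report the failure against this host instead of a stale result
            [pscustomobject]@{
                VMHost       = $name
                'GPT Status' = "Unable to query host: $($_.Exception.Message)"
            }
        }
    }
}

# Possible use with Posh-SSH (not run here; -ErrorAction Stop turns a failed login into a catchable error):
# $cred = Get-Credential -Message 'ESXi or AD account'
# Invoke-GptCheckPerHost -HostName (Get-VMHost).Name `
#     -Connect { param($n) New-SSHSession -ComputerName $n -Credential $cred -AcceptKey -ErrorAction Stop } `
#     -Query   { param($n, $s) <# run partedUtil over $s and emit one row per disk #> }
```

Keeping the connect and query steps as script blocks is only a design choice for this sketch; the same try/catch could be placed directly around `New-SSHSession` inside the existing loop.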


Mallik7
Enthusiast

If I check the datastore GPT manually from a PuTTY session, it shows as corrupt.

But in the script output, the GPT result shows as good, which means the script writes a wrong result to the output file. Can you help to correct this, please?

My suspicion is that whenever the script tries to connect to an ESXi host with the credentials given in the script and fails to authenticate, it reports the previous host's result for the current host. (I hope that explains it better; please let me know if it is unclear.)

Kindly help here.
