VMware Cloud Community
Mallik7

In need of a script to fix the errors

I need a PowerCLI script to fix the GPT error on a long list of datastores.

VMware provided a command to fix the GPT error:

partedUtil fixGpt /vmfs/devices/disks/naa.id

I need to fix this error on 1000+ datastores in one vCenter Server. I have the datastores' NAA IDs and need a script that reads the list of NAA IDs from an input file and fixes them.

TIA

14 Replies
LucD

Is SSH on those ESXi nodes running? Or can you start the service (temporarily)?

And do you have an account to use SSH to the ESXi nodes?


Blog: lucd.info  Twitter: @LucD22  Co-author PowerCLI Reference

Mallik7

The SSH service is stopped on the hosts. It needs to be started temporarily so the script can connect, and stopped again afterwards.

My account has the privileges to start it.

LucD

In that case, isn't this basically the same script I provided to you in Re: Is there a script available to determine LUN / datastore GPT corruption?

You would just need to change the content of the $cmd variable, from getptbl to fixGpt.


Mallik7

I want to run the fix command only against the corrupt datastores, not against every datastore in the vCenter. The script should read the NAA IDs from an input file and fix the GPT error on those.

LucD

Something like this?

$user = 'root'
$pswd = 'VMware1!'

$secPswd = ConvertTo-SecureString $pswd -AsPlainText -Force
$cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $user, $secPswd

# Assumes a CSV with this layout:
# NaaId
# naaid1
# naaid2

$badDisks = Import-Csv -Path .\bad-naaid.csv -UseCulture

Get-VMHost -PipelineVariable esx | ForEach-Object -Process {
    # Start SSH only where it is stopped; $service stays empty if it was already running
    $service = Get-VMHostService -VMHost $esx |
        Where-Object { $_.Key -eq 'TSM-SSH' -and -not $_.Running } |
        Start-VMHostService -Confirm:$false

    # New-SSHSession and Invoke-SSHCommand come from the Posh-SSH module
    $session = New-SSHSession -ComputerName $esx.Name -Credential $cred -AcceptKey
    foreach ($row in $badDisks) {
        Write-Host -ForegroundColor Yellow "ESXI $($esx.Name)  NAA $($row.NaaId)"
        $cmd = "partedUtil fixGpt /vmfs/devices/disks/$($row.NaaId)"
        $result = Invoke-SSHCommand -SSHSession $session -Command $cmd
        $result.Output
    }
    Remove-SSHSession -SSHSession $session | Out-Null

    # Stop SSH again only if this script started it
    if ($service) {
        Stop-VMHostService -HostService $service -Confirm:$false | Out-Null
    }
}


Mallik7

The script is selecting the wrong host to fix the GPT error on a datastore.

In my environment, datastores are mapped per cluster, not across the whole vCenter.

So when I run the script, it picks an ESXi host that the datastore is not mapped to. Can this be fixed?

Also, when we run the fixGpt command for a datastore, it waits for our input on whether to fix (Y/N).

LucD

Try it like this.

I introduced Get-Cluster, so only the hosts in the named cluster are used.

The prompt is answered by piping 'y' into the command.

$user = 'root'
$pswd = 'VMware1!'
# Change 'cluster' to the name of a cluster in your environment
$clusterName = 'cluster'

$secPswd = ConvertTo-SecureString $pswd -AsPlainText -Force
$cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $user, $secPswd

# Assumes a CSV with this layout:
# NaaId
# naaid1
# naaid2

$badDisks = Import-Csv -Path .\bad-naaid.csv -UseCulture

Get-Cluster -Name $clusterName | Get-VMHost -PipelineVariable esx |
ForEach-Object -Process {
    # Start SSH only where it is stopped; $service stays empty if it was already running
    $service = Get-VMHostService -VMHost $esx |
        Where-Object { $_.Key -eq 'TSM-SSH' -and -not $_.Running } |
        Start-VMHostService -Confirm:$false

    # New-SSHSession and Invoke-SSHCommand come from the Posh-SSH module
    $session = New-SSHSession -ComputerName $esx.Name -Credential $cred -AcceptKey
    foreach ($row in $badDisks) {
        Write-Host -ForegroundColor Yellow "ESXI $($esx.Name)  NAA $($row.NaaId)"
        # 'y' is piped in to answer the (Y/N) confirmation prompt
        $cmd = "echo 'y' | partedUtil fixGpt /vmfs/devices/disks/$($row.NaaId)"
        $result = Invoke-SSHCommand -SSHSession $session -Command $cmd
        $result.Output
    }
    Remove-SSHSession -SSHSession $session | Out-Null

    # Stop SSH again only if this script started it
    if ($service) {
        Stop-VMHostService -HostService $service -Confirm:$false | Out-Null
    }
}


Mallik7

It throws the error below:

Get-Cluster : 10/19/2019 9:28:11 AM     Get-Cluster             Cluster with name 'cluster' was not found using the specified filter(s).
At C:\Scripts\GPTfix\gptfix.ps1:15 char:1
+ Get-Cluster -Name $clusterName | Get-VMHost -PipelineVariable esx |
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (:) [Get-Cluster], VimException
    + FullyQualifiedErrorId : Core_OutputHelper_WriteNotFoundError,VMware.VimAutomation.ViCore.Cmdlets.Commands.GetCluster

LucD

You will have to change the value assigned to variable $clusterName to fit your environment.


Mallik7

When the fixGpt command runs, it waits for input twice: first you have to answer Y to the confirmation, and then you have to type "Fix", as shown below.

See the two prompt lines (Y/N and Fix/Ignore/Cancel):

[lab1\admin@testhost1:~] partedUtil fixGpt /vmfs/devices/disks/naa.1234567890
FixGpt tries to fix any problems detected in GPT table.
Please ensure that you don't run this on any RDM (Raw Device Mapping) disk.
Are you sure you want to continue (Y/N): Y
Error: The primary GPT table is corrupt, but the backup appears OK, so that will be used. Fix primary table ? diskPath (/dev/disks/naa.naa.1234567890) diskSize (2147483648) AlternateLBA (1) LastUsableLBA (2147483614)
Fix/Ignore/Cancel? Fix
gpt
133674 255 63 2147483648
1 2048 2147483614 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
[lab1\admin@testhost1:~]

========================================================================

The script waits at the second prompt and then throws the error below:

PS C:\Scripts\GPTfix> .\gptfix.ps1
ESXI testhost1.testlab.com  NAA naa.1234567890
Exception calling "EndExecute" with "1" argument(s): "Command 'echo 'y' | partedUtil fixGpt /vmfs/devices/disks/ naa.1234567890' has timed out."
At C:\Program Files\WindowsPowerShell\Modules\Posh-SSH\2.1\Posh-SSH.psm1:260 char:25
+                         $Output = $_.cmd.EndExecute($_.Async)
+                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : SshOperationTimeoutException

Thanks again.

LucD

Try changing the command to

$cmd = "echo $'y\nfix' | partedUtil fixGpt /vmfs/devices/disks/$($row.NaaId)"
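Since the NAA ID from the CSV is pasted straight into a command that runs as root over SSH, it may be worth building that string in one place and rejecting anything unexpected first. The following is only a sketch: New-FixGptCommand is a hypothetical helper name, and the pattern assumes every ID in the file has the form naa. followed by hex digits (the thread only shows naa.1234567890), so it would need widening for other device ID types. It also trims stray whitespace around the ID.

```powershell
# Hypothetical helper (not part of PowerCLI or Posh-SSH): builds the remote
# command string for one device after a basic sanity check on the ID.
function New-FixGptCommand {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$NaaId
    )

    # Remove leading/trailing whitespace that may come from the CSV
    $id = $NaaId.Trim()

    # Assumption: IDs look like 'naa.' followed by hex digits only.
    # Anything else (spaces, ';', '|', ...) is rejected instead of being sent to the host.
    if ($id -notmatch '^naa\.[0-9a-fA-F]+$') {
        throw "Refusing to build a command for unexpected device ID '$NaaId'"
    }

    # 'y' answers the (Y/N) prompt, 'fix' answers Fix/Ignore/Cancel.
    # The backtick keeps PowerShell from treating $ as the start of a variable;
    # \n is passed through literally and expanded by the remote shell.
    "echo `$'y\nfix' | partedUtil fixGpt /vmfs/devices/disks/$id"
}
```

Inside the foreach loop this would replace the direct assignment, for example $cmd = New-FixGptCommand -NaaId $row.NaaId, with a try/catch around it if a bad row should be skipped rather than stop the run.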


Mallik7

Awesome, it's working, thanks a lot.

However, as I said before, each datastore is mapped to a single cluster, and within that cluster it is mapped to all the hosts.

So the script visits every host the same datastore is mapped to and runs the fix command several times.

Could you remove that duplicated work, please? In each cluster the script only needs to use one host; that should be good enough.

Here is the output I see (the script tries to fix the GPT error for the same datastore 3 times, because there are 3 ESXi hosts in that cluster):

PS C:\Scripts\GPTfix> .\gptfix.ps1
ESXI testhost1.testlab.com  NAA naa.1234567890
FixGpt tries to fix any problems detected in GPT table.
Please ensure that you don't run this on any RDM (Raw Device Mapping) disk.
Are you sure you want to continue (Y/N): gpt
133674 255 63 2147483648
1 2048 2147483614 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

ESXI testhost2.testlab.com  NAA naa.1234567890
FixGpt tries to fix any problems detected in GPT table.
Please ensure that you don't run this on any RDM (Raw Device Mapping) disk.
Are you sure you want to continue (Y/N): gpt
133674 255 63 2147483648
1 2048 2147483614 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

ESXI testhost3.testlab.com  NAA naa.1234567890
FixGpt tries to fix any problems detected in GPT table.
Please ensure that you don't run this on any RDM (Raw Device Mapping) disk.
Are you sure you want to continue (Y/N): gpt
133674 255 63 2147483648
1 2048 2147483614 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

PS C:\Scripts\GPTfix>

LucD

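Change this line

Get-Cluster -Name $clusterName | Get-VMHost -PipelineVariable esx |

into something like this

Get-Cluster -Name $clusterName | Get-VMHost -PipelineVariable esx | Select-Object -First 1 |

or, if you want to pick a random ESXi node in the cluster

Get-Cluster -Name $clusterName | Get-VMHost | Get-Random -PipelineVariable esx |

In the random variant, -PipelineVariable moves to Get-Random. Get-Random only emits its pick after it has received all the hosts, so a variable set on Get-VMHost would hold the last host rather than the chosen one.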
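Picking one host per cluster still sends every NAA ID in the CSV to that host, including IDs of datastores that belong to other clusters. One way to avoid that, sketched here under assumptions, is to compare the list from the file with the devices the chosen host actually reports and only process the overlap. Select-VisibleNaaId is a hypothetical helper name; the comparison ignores case and surrounding whitespace.

```powershell
# Hypothetical helper: returns only those requested IDs that also appear in
# the list of device names a host reports. Pure PowerShell, no PowerCLI calls.
function Select-VisibleNaaId {
    param(
        [string[]]$Requested,   # IDs read from the CSV
        [string[]]$Visible      # canonical device names seen by the host
    )

    # Case-insensitive set of what the host can see
    $seen = [System.Collections.Generic.HashSet[string]]::new(
        [System.StringComparer]::OrdinalIgnoreCase)
    foreach ($name in $Visible) {
        [void]$seen.Add($name.Trim())
    }

    # Emit each requested ID (trimmed) that the host can see, in file order
    foreach ($id in $Requested) {
        $clean = $id.Trim()
        if ($seen.Contains($clean)) { $clean }
    }
}
```

In the script, the visible list could plausibly come from something like (Get-ScsiLun -VmHost $esx -LunType disk).CanonicalName, and the inner loop would then iterate over Select-VisibleNaaId -Requested $badDisks.NaaId -Visible $visible instead of over $badDisks. That Get-ScsiLun usage is an assumption and has not been tested against this environment.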


Mallik7

Thanks a heap, that's a great help.
