ESXi APD (All Paths Down) Issues
All Paths Down (APD) is a condition in which all paths to a storage device are lost or the storage device is removed. The state arises because the change happens in an uncontrolled manner, so the VMkernel core storage stack does not know how long the loss of access to the device will last.
Check with the storage and network teams to determine the cause.
>> Troubleshooting guide
Check host for "APD" message. Get VolumeID and timestamp.
Find volume name and storage IP.
Find vmk (source IP) of mounted storage.
Find switch inventory.
Check ESX/ESXi advanced settings for network and NFS.
Check MTU size, interface, and switch status.
Check host for "APD" message. Get VolumeID and timestamp.
command: grep -i storage /var/log/vmkernel.log | less
output:
2019-04-12T06:00:44.649Z cpu3:1988181)StorageApdHandler: 577: APD Timer killed for ident [2acd53b9-50167f5r]
2019-04-12T06:00:44.649Z cpu3:1988181)StorageApdHandler: 422: Device or filesystem with identifier [2acd53b9-50167f5r] has exited the All Paths Down state.
2019-04-12T06:00:44.649Z cpu3:1988181)StorageApdHandler: 922: APD Exit for ident [2acd53b9-50167f5r]!
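On a busy host the grep above can return many lines. The following is a minimal sketch, not part of the original procedure, that reduces such lines to a timestamp and identifier pair. The function names apd_events and apd_idents are hypothetical, and the sketch assumes the log lines have the same shape as the sample output above (timestamp first, identifier in square brackets).

```shell
#!/bin/sh
# Sketch: summarise APD events from vmkernel log lines read on stdin.
# Assumes each relevant line starts with a timestamp and carries the
# device/filesystem identifier inside [ ... ], as in the sample output.

# Print "timestamp identifier" for every StorageApdHandler line.
apd_events() {
    grep 'StorageApdHandler' |
        sed -n 's/^\([^ ]*\) .*\[\([^]]*\)].*$/\1 \2/p'
}

# Print each affected identifier once.
apd_idents() {
    apd_events | awk '{ print $2 }' | sort -u
}

# Usage on a host (hypothetical invocation, same log path as above):
#   apd_idents < /var/log/vmkernel.log
```

The identifier printed by apd_idents is the value to look up under /vmfs/volumes/ in the next step.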
Find volume name and storage IP.
command: ls -lah /vmfs/volumes/ | grep 2acd53b9-50167f5r
output:
drwxrwxrwx 1 root root 8.0K Sep 19 07:01 2acd53b9-50167f5r
lrwxr-xr-x 1 root root 17 Sep 22 07:05 ISO-FS_nsvf67869s2-06 -> 2acd53b9-50167f5r
Storage IP
command: esxcfg-nas -l | grep ISO-FS_nsvf67869s2-06
output:
ISO-FS_nsvf67869s2-06 is /vol/nsvf4269_nfs01/q_nfs02 from 10.22.177.217 mounted available
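To reuse the storage IP in later commands, it can be pulled out of the esxcfg-nas listing with awk. This is a sketch only: the helper name nas_host is hypothetical, and it assumes each line has the form "<name> is <share> from <host> mounted available" and that the datastore name contains no spaces.

```shell
#!/bin/sh
# Sketch: print the NFS server address for one datastore, given
# `esxcfg-nas -l` style output on stdin.
# Assumed line shape: <name> is <share> from <host> mounted available
nas_host() {
    awk -v vol="$1" '
        $1 == vol {
            # The server address is the field right after the word "from".
            for (i = 2; i < NF; i++)
                if ($i == "from") { print $(i + 1); exit }
        }'
}

# Usage on a host (hypothetical invocation):
#   esxcfg-nas -l | nas_host ISO-FS_nsvf67869s2-06
```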
Find vmk (source IP) of mounted storage.
command: esxcli network ip connection list | grep XXXXX (where XXXXX is the storage IP address found in the previous step)
output:
XXXXXXXXXXXX
command: esxcfg-vmknic -l
output:
XXXXXXXXXXXX
Find switch inventory.
command: vim-cmd hostsvc/net/query_networkhint | egrep '(vmnic|devId|portId)'
output:
XXXXXXXXXXXX
Check MTU size, interface, and switch status.
MTU size:
command: esxcfg-vmknic -l
Interface:
command: esxcfg-nics -l
Switch:
command: esxcfg-vswitch -l | grep -i 'switch'
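The commands above show the configured MTU but do not prove that frames of that size reach the storage. One common way to test this, offered here as an assumption rather than as part of the original procedure, is vmkping with the don't-fragment flag (-d) and a payload of MTU minus 28 bytes (20 bytes IPv4 header plus 8 bytes ICMP header). The helper names and the vmk1 interface in the example are hypothetical; the sketch only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch: build a vmkping command that probes whether a given MTU
# passes end to end without fragmentation.

# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes (IPv4 header) minus 8 bytes (ICMP header).
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Print (do not run) the probe command.
# Arguments: <vmk interface> <MTU> <target IP>
mtu_probe_cmd() {
    echo "vmkping -I $1 -d -s $(icmp_payload "$2") $3"
}

# Example (vmk1 is a placeholder for the vmk found earlier):
#   mtu_probe_cmd vmk1 9000 10.22.177.217
```

If the printed command fails on the host while a default-size vmkping succeeds, an MTU mismatch somewhere between the vmk and the storage is a likely candidate to raise with the network team.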
Based on the results, work with the network and storage teams for further troubleshooting.