Resolving 3PAR/Vmware SCSI reservation conflicts:
Environment:
Vmware 4.1
3PAR 2.3.1 MU2 with VAAI API
Symptoms:
-All the esx host lost the datastores
-When you login to esxhost and do "df -h" it hangs
-You cannot shutdown the guest vm's properly, it will get stuck at say 95%
-Your VM's won't be accessible to power on
Login to 3PAR
3PAR_F200_PG1 cli% showrsv
VVname Host Owner Port Reservation Type
vv-datastore1 esx01 21000024 FF2FCA50 unknown SCSI-2
vv-datastore2 esx01 21000024 FF2FCA50 unknown SCSI-2
VV-datastore5-2MB esx01 21000024 FF2FCA50 unknown SCSI-2
The above output confirms that there are scsi reservation on the luns
VMWARE KB Article to resolve the issue
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002293
Solution from 3PAR side:
3PAR_F200_PG1 cli% setvv -clrrsv vv-datastore1
WARNING: Active hosts may be disrupted.
Are you sure you want to run setvv -clrrsv vv-datastore1
select q=quit y=yes n=no: y
3PAR_F200_PG1 cli% showrsv
VVname Host Owner Port ReservationType
vv-datastore2 esx01 21000024FF2FCA50 unknown SCSI-2
VV-datastore5-2MB esx01 21000024FF2FCA50 unknown SCSI-2
(the above output shows it reset lock on vv-datastore1)
3PAR_F200_PG1 cli% setvv -clrrsv vv-datastore2
WARNING: Active hosts may be disrupted.
Are you sure you want to run setvv -clrrsv vv-datastore2
select q=quit y=yes n=no: y
3PAR_F200_PG1 cli% showrsv
VVname Host Owner Port ReservationType
VV-datastore5-2MB esx01 21000024FF2FCA50 unknown SCSI-2
3PAR_F200_PG1 cli% setvv -clrrsv VV-datastore5-2MB
(the above output shows it reset lock on vv-datastore2)
WARNING: Active hosts may be disrupted.
Are you sure you want to run setvv -clrrsv VV-datastore5-2MB
select q=quit y=yes n=no: y
3PAR_F200_PG1 cli% showrsv
no reservations found
(the above output shows no reservations)
Solution from VMware ESx host side:
List all the pending reservation
#tail -1 /proc/scsi/qla2xxx/[0-9]*
Perform a LUN reset to clear the lock
# vmkfstools --lock lunreset /vmfs/devices/disks/naa.50002ac000080a7e
---------------
ESXi /var/log/messages Error Logs:
---------------
Feb 18 19:11:32 Vpxa: [2011-02-18 19:11:32.385 1D695B90 warning 'App'] [VpxaHalStats] Unexpected return result. Expect 1 sample, receive 2
Feb 18 19:11:35 Vpxa: [2011-02-18 19:11:35.216 1D613B90 verbose 'SoapAdapter.HTTPService'] User agent is 'VMware-client/4.1.0'
Feb 18 19:11:35 Vpxa: [2011-02-18 19:11:35.216 1D613B90 verbose 'SoapAdapter.HTTPService'] HTTP Response: Client: NeedsContentLength: false UnderstandsChunking: true CanKeepAlive: true (PresetContentLength -1)
Feb 18 19:11:35 Vpxa: [2011-02-18 19:11:35.216 1D5D2B90 info 'App' opID=task-internal-22417-dbfb04f9] [VpxLRO] -- BEGIN task-internal-22417 -- -- vpxapi.VpxaService.fetchQuickStats -- 52c7f745-06cb-1a35-9dd0-388015a6d4f4
Feb 18 19:11:35 Vpxa: [2011-02-18 19:11:35.216 1D5D2B90 verbose 'SoapAdapter.HTTPService' opID=task-internal-22417-dbfb04f9] HTTP Response: Complete (processed 396 bytes)
Feb 18 19:11:35 Vpxa: [2011-02-18 19:11:35.216 1D5D2B90 info 'App' opID=task-internal-22417-dbfb04f9] [VpxLRO] -- FINISH task-internal-22417 -- -- vpxapi.VpxaService.fetchQuickStats -- 52c7f745-06cb-1a35-9dd0-388015a6d4f4
############################################################### Error ###############################################################
Feb 18 19:11:37 vmkernel: 10:02:15:41.486 cpu17:4113)ScsiDeviceIO: 1672: Command 0x16 to device "naa.50002ac000050a7e" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Feb 18 19:11:37 vmkernel: 10:02:15:41.486 cpu4:4134)WARNING: FS3: 7030: Reservation error: Timeout
Feb 18 19:11:37 vmkernel: 10:02:15:41.534 cpu4:4134)FS3: 6978: Starting HB reclaim for [HB state abcdef02 offset 3809280 gen 49 stamp 868815074527 uuid 4d5175ef-d3ceec8d-8b50-a4badb47bc9f jrnl drv 8.46]
############################################################### Error ###############################################################
Feb 18 19:11:37 vmkernel: 10:02:15:42.274 cpu17:4113)ScsiDeviceIO: 1672: Command 0x16 to device "naa.50002ac000080a7e" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Feb 18 19:11:37 vmkernel: 10:02:15:42.274 cpu6:4135)WARNING: FS3: 7030: Reservation error: Timeout
Feb 18 19:11:37 vmkernel: 10:02:15:42.274 cpu18:697291)FS3: 7346: Waiting for timed-out heartbeat [HB state abcdef02 offset 3809280 gen 3517 stamp 871821852348 uuid 4d5175ef-d3ceec8d-8b50-a4badb47bc9f jrnl drv 8.46]
Feb 18 19:11:37 vmkernel: 10:02:15:42.274 cpu6:4135)FS3: 6978: Starting HB reclaim for [HB state abcdef02 offset 3809280 gen 3517 stamp 871821852348 uuid 4d5175ef-d3ceec8d-8b50-a4badb47bc9f jrnl drv 8.46]
Feb 18 19:11:38 Vpxa: [2011-02-18 19:11:38.322 1D85CB90 verbose 'SoapAdapter.HTTPService'] User agent is 'VMware-client/4.1.0'
Feb 18 19:11:38 Vpxa: [2011-02-18 19:11:38.323 1D85CB90 verbose 'SoapAdapter.HTTPService'] HTTP Response: Client: NeedsContentLength: false UnderstandsChunking: true CanKeepAlive: true (PresetContentLength -1)
Feb 18 19:11:38 Vpxa: [2011-02-18 19:11:38.323 1D613B90 info 'App' opID=CB8B255C-0000010A-52] [VpxLRO] -- BEGIN task-internal-22418 -- -- vpxapi.VpxaService.queryPerformanceStatistics -- 52c7f745-06cb-1a35-9dd0-388015a6d4f4
Feb 18 19:11:38 Vpxa: [2011-02-18 19:11:38.324 1D613B90 verbose 'App' opID=CB8B255C-0000010A-52] [VpxaMoService::QueryStats] Query last timestamp 1298056298
Feb 18 19:11:38 Vpxa: [2011-02-18 19:11:38.326 1D613B90 verbose 'SoapAdapter.HTTPService' opID=CB8B255C-0000010A-52] HTTP Response: Complete (processed 418 bytes)
Feb 18 19:11:38 Vpxa: [2011-02-18 19:11:38.326 1D613B90 info 'App' opID=CB8B255C-0000010A-52] [VpxLRO] -- FINISH task-internal-22418 -- -- vpxapi.VpxaService.queryPerformanceStatistics -- 52c7f745-06cb-1a35-9dd0-388015a6d4f4
Feb 18 19:11:39 Vpxa: [2011-02-18 19:11:39.244 1D5D2B90 verbose 'App'] [VpxaVMAP] Checking Node Resources
Feb 18 19:11:39 Vpxa: [2011-02-18 19:11:39.249 1D5D2B90 verbose 'App'] [VpxaVMAP] CheckThreshold values stored:(35400,58178,35400,58178) - retrieved:(35400,57085,35400,58129)
Feb 18 19:11:39 Vpxa: [2011-02-18 19:11:39.249 1D5D2B90 verbose 'App'] [VpxaVMAP] CheckThreshold percent changes (0,1,0,0) - threshold 5
Feb 18 19:11:40 Hostd: [2011-02-18 19:11:40.006 4FD49B90 error 'Statssvc'] HostCtl Exception during stats collection: Unable to complete Sysinfo operation. Please see the VMkernel log file for more details.
############################################################### Error ###############################################################
Feb 18 19:11:42 vmkernel: 10:02:15:46.496 cpu17:4113)ScsiDeviceIO: 1672: Command 0x16 to device "naa.50002ac000030a7e" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Feb 18 19:11:42 vmkernel: 10:02:15:46.496 cpu14:4133)WARNING: FS3: 7030: Reservation error: Timeout
Feb 18 19:11:42 vmkernel: 10:02:15:46.496 cpu18:1048314)Fil3: 11853: Max timeout retries exceeded for caller Fil3_FileIO (status 'Timeout')
Feb 18 19:11:42 vmkernel: 10:02:15:46.496 cpu12:2194046)FS3: 7346: Waiting for timed-out heartbeat [HB state abcdef02 offset 3809280 gen 161 stamp 870183428483 uuid 4d5175ef-d3ceec8d-8b50-a4badb47bc9f jrnl drv 8.46]
Feb 18 19:11:42 vmkernel: 10:02:15:46.496 cpu14:4133)FS3: 6978: Starting HB reclaim for [HB state abcdef02 offset 3809280 gen 161 stamp 870183428483 uuid 4d5175ef-d3ceec8d-8b50-a4badb47bc9f jrnl drv 8.46]
Feb 18 19:11:42 vmkernel: 10:02:15:46.496 cpu18:1048314)FS3: 7346: Waiting for timed-out heartbeat [HB state abcdef02 offset 3809280 gen 161 stamp 870183428483 uuid 4d5175ef-d3ceec8d-8b50-a4badb47bc9f jrnl drv 8.46]
Feb 18 19:11:42 Hostd: [2011-02-18 19:11:42.035 4FBA0B90 verbose 'vm:/vmfs/volumes/4caa11fb-4eb23e3b-1ada-a4badb47bc9f/mdc3vr1904/mdc3vr1904.vmx'] Running status of tools changed to: running
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.037 1D591B90 verbose 'VpxaHalCnxHostagent'] Received callback in WaitForUpdatesDone
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'VpxaHalCnxHostagent'] [VpxaHalCnxHostagent::ProcessUpdate] Applying updates from 102653 to 102654 (at 102653)
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.toolsStatus'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.toolsRunningStatus'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.toolsVersion'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.guestId'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.guestFamily'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.guestFullName'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.hostName'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.ipAddress'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.net'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.ipStack'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: GuestInfo changed 'guest.disk'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.screen'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: Runtime changed 'guest.guestState'
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalVmHostagent] 160: appHeartbeatStatus changed to appStatusGray
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaHalServices] appHeartbeatChange Event for vm(11) 160
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.038 1D591B90 verbose 'App'] [VpxaInvtVmChangeListener] App HeartbeatStatus Changed
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.039 1D591B90 verbose 'App'] [VpxaHalServices] RuntimeChange Event for vm(11) 160
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.039 1D591B90 verbose 'App'] [VpxaInvtVmChangeListener] AAM won't be notifyed. vmid 11: notifyVMAP 0,isRestarting 0
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.039 1D591B90 verbose 'App'] [VpxaInvtHost] Increment master gen. no to (4151): VmRuntime:VpxaInvtVmChangeListener::RuntimeChanged
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.039 1D591B90 verbose 'App'] [VpxaHalServices] VmGuestDiskChange Event for vm(11) 160
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.039 1D591B90 verbose 'App'] [VpxaInvtVmChangeListener] Guest DiskInfo Changed
Feb 18 19:11:42 Vpxa: [2011-02-18 19:11:42.039 1D591B90 verbose 'App'] [VpxaInvtHost] Increment master gen. no to (4152): VmRuntime:GuestDiskChanged
Feb 18 19:11:43 Vpxa: [2011-02-18 19:11:43.242 1D799B90 verbose 'SoapAdapter.HTTPService'] User agent is 'VMware-client/4.1.0'
Feb 18 19:11:43 Vpxa: [2011-02-18 19:11:43.242 1D799B90 verbose 'SoapAdapter.HTTPService'] HTTP Response: Client: NeedsContentLength: false UnderstandsChunking: true CanKeepAlive: true (PresetContentLength 129)
Feb 18 19:11:43 Vpxa: [2011-02-18 19:11:43.242 1D799B90 verbose 'SoapAdapter.HTTPService'] HTTP Response: Auto-completing at 129/129 bytes
Feb 18 19:11:45 vmkernel: 10:02:15:49.496 cpu14:1048552)FS3: 7346: Waiting for timed-out heartbeat [HB state abcdef02 offset 3809280 gen 161 stamp 870183428483 uuid 4d5175ef-d3ceec8d-8b50-a4badb47bc9f jrnl drv 8.46]
Feb 18 19:11:45 vmkernel: 10:02:15:49.498 cpu8:2045185)FS3: 7346: Waiting for timed-out heartbeat [HB state abcdef02 offset 3809280 gen 161 stamp 870183428483 uuid 4d5175ef-d3ceec8d-8b50-a4badb47bc9f jrnl drv 8.46]
Feb 18 19:11:45 Vpxa: [2011-02-18 19:11:45.083 1D717B90 verbose 'App'] [VpxaVMAP] Monitoring AAM health: vpxdDasStateOnLastInvocation(running) currentVpxdDasState(running) forceRunOfListNodes(0) isDasEnabled(1) skipOperation(0)
Feb 18 19:11:45 Vpxa: [2011-02-18 19:11:45.083 1D717B90 verbose 'App'] [VpxaVMAP::Invoke]Command to invoke is /opt/vmware/aam/bin/aamPerl /opt/vmware/aam/ha/aam_config_util.pl -z -shortname=md000ysesx01 -uname=VMkernel -cmd=monitornodes -domain=vmwar
Feb 18 19:11:45 e
Feb 18 19:11:45 Vpxa: [2011-02-18 19:11:45.084 1D717B90 info 'SysCommandPosix'] ForkExec(/opt/vmware/aam/bin/aamPerl) 2196440
Feb 18 19:11:45 Vpxa: [2011-02-18 19:11:45.316 1D717B90 verbose 'App'] [VpxaVMAP::Invoke] task percent done is 100
Feb 18 19:11:46 Vpxa: [2011-02-18 19:11:46.119 1D717B90 verbose 'App'] [VpxaVMAP::Invoke] Command output: 02/18/11 19:11:45 [print_args ] Invoked command: 02/18/11 19:11:45 [print_args ] /opt/vmware/aam/bin/ftPerl /opt/vmware/a
Feb 18 19:11:46 am/ha/aam_config_util.pl -z -shortname=md000ysesx01 -uname=VMkernel -cmd=monitornodes -domain=vmware 02/18/11 19:11:45 [print_args ] Environment: 02/18/11 19:11:45 [print_args ] FT_DIR=/opt/vmware/aam 02/18/11 19:11:45 [print_ar
Feb 18 19:11:46 gs ] FT_ISOLATION_TIME=1 02/18/11 19:11:45 [print_args ] GREP=/bin/grep 02/18/11 19:11:45 [print_args ] FT_CONFIG_DIR=/var/lib/vmware/aam 02/18/11 19:11:45 [print_args ] RPCINFO=/bin/rpcinfo 02/1