Friday, May 14, 2010

How to kill zombie CLOSE_WAIT() DataStage processes

When you try to restart DataStage on our AIX 5.3 machine with
$DSHOME/bin/uv -admin -stop

and then
$DSHOME/bin/uv -admin -start

there are no error messages, but the daemon is not running:
Status code = 81016

netstat -a | grep dsrpc shows some lingering CLOSE_WAIT connections on the dsrpc port, which prevent DataStage from starting again.
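
To see at a glance which TCP states the dsrpc port is in, the netstat output can be summarised per state. This is only a sketch: it assumes the AIX/BSD column layout (protocol, receive queue, send queue, local address, foreign address, state), and the helper name service_states is made up for illustration.

```shell
# Hypothetical helper: count TCP states for one service.
# Reads `netstat -a` style output on stdin; assumes the local address
# is field 4 (e.g. "ibm67p1.dsrpc") and the state is field 6.
service_states() {
    awk -v svc="$1" '
        # Keep TCP rows whose local address ends in ".<service>"
        $1 ~ /^tcp/ && $4 ~ ("\\." svc "$") { count[$6]++ }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Usage:
#   netstat -a | service_states dsrpc
```

Any CLOSE_WAIT row in the result is a connection that the remote side has closed but a local process is still holding open.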

Grep the output of the free 'lsof' utility for the connection state, such as 'CLOSE_WAIT', to identify the ID of the owning process (PID), then kill that process.
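
The lookup can be scripted roughly as follows. This is a sketch, not a tested procedure for every lsof build: it assumes the PID is the second column of the lsof output and that the socket is shown as host:service->remote followed by the state in parentheses. The function name close_wait_pids is hypothetical.

```shell
# Hypothetical helper: print the PIDs that hold a CLOSE_WAIT socket
# on the given local service. Reads lsof-style output on stdin.
close_wait_pids() {
    awk -v svc="$1" '
        # Match only CLOSE_WAIT rows whose local side is ":<service>->"
        /\(CLOSE_WAIT\)/ && index($0, ":" svc "->") { print $2 }
    ' | sort -u
}

# Usage (list first, kill only after checking the list):
#   lsof -i | close_wait_pids dsrpc
#   kill <pid>
```

Filtering on both the state and the service keeps unrelated processes, such as ones with ESTABLISHED connections to the same client, out of the kill list.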

2 comments:

Pankaj Gautam said...

# netstat -a | grep dsrpc
tcp4 0 0 ibm67p1.dsrpc ma000xvctx902.fe.3815 CLOSE_WAIT


# lsof | grep ma000xvctx902
lsof: WARNING: compiled for AIX version 5.1.0.0; this is 5.3.0.0.
java 172438 root 518u IPv6 0xf100060007ba5b98 0t0 TCP ibm67p1:corbaloc->ma000xvctx902.federated.fds:afs (ESTABLISHED)
java 172438 root 524u IPv6 0xf1000600075e5398 0t0 TCP ibm67p1:9403->ma000xvctx902.federated.fds:confluent (ESTABLISHED)
java 172438 root 617u IPv6 0xf10006000de9eb98 0t0 TCP ibm67p1:hp-pdl-datastr->ma000xvctx902.federated.fds:lansource (ESTABLISHED)
java 172438 root 631u IPv6 0xf10006000ad12398 0t620 TCP ibm67p1:glrpc->ma000xvctx902.federated.fds:ontime (ESTABLISHED)
dbx 1872474 dsop 5u IPv4 0xf10006000fac2b98 0t0 TCP ibm67p1:dsrpc->ma000xvctx902.federated.fds:3815 (CLOSE_WAIT)

Then kill the PID that holds the CLOSE_WAIT connection on dsrpc (here 1872474, the dbx process).

Pankaj Gautam said...

updated with commands