Thursday, September 9, 2010

"could not open session" when using sudo

Last week we ran into another strange error.
When using sudo to switch to one particular user, we saw this:

[root@server]# sudo su - pankaj
could not open session


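Resolution
cat /etc/security/limits.conf
Look for entries for the affected user (pankaj here) in this file. A limit such as nofile 0 leaves the user unable to open any files, so the session cannot be opened.

Comment out the offending line, for example:
#pankaj - nofile 0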
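
The check above can be scripted. This is only a rough sketch: the function name limits_for_user is made up for illustration, and it assumes the limits live in a single file in the usual "domain type item value" layout. It prints the active (uncommented) lines whose first field is the given user.

```shell
#!/bin/sh
# limits_for_user USER [FILE]
# Hypothetical helper: print active limits entries for USER.
# FILE defaults to /etc/security/limits.conf.
limits_for_user() {
    user=$1
    file=${2:-/etc/security/limits.conf}
    # A commented-out entry such as "#pankaj - nofile 0" has "#pankaj"
    # (or "#") as its first field, so it never matches the bare user name.
    awk -v u="$user" '$1 == u' "$file"
}
```

Run it as limits_for_user pankaj. Entries for a group (@group) or for the wildcard (*) can also apply to a user and are not reported by this sketch.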


negative uptime?

Today we noticed something very unusual.

uptime reports negative 24855 days.
ps shows java processes that have supposedly been running since 1944 or 1945.

New files are created with a timestamp in the year 1874.

JBoss and other processes are going haywire because they can't figure out which files/processes need to be redeployed or cleaned up.

The load average is also through the roof.

$ date
Fri Sep 10 01:12:09 EDT 2010

$ uptime
17:44:25 up -24855 days, -3:-14, 3 users, load average: 66.89, 67.11, 67.98

$ touch pankaj
$ ls -l pankaj
-rw-r--r-- 1 p139pkg ux_mdc 0 Aug 3 1874 pankaj

[root@server /]# uname -a
Linux esu1l100.federated.fds 2.6.9-67.0.7.ELsmp #1 SMP Wed Feb 27 04:47:23 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

[root@server /]# cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 6)


[app@server]$ ps -elf |grep java
0 S pankaj 4468 32506 0 76 0 - 325272 pipe_w 1945 ? 00:11:06 /usr/java/jdk1.5.0_18//bin/java -DCAV_MON_HOME=/home/pankaj/cavisson/monitors/ -DMON_TEST_RUN=3916 -DVECTOR_NAME=WS30 cm_java_gc_ex -f /www/a/logs/AppServer/gc.log -o 2 -i 10000
0 S pankaj 4554 32506 0 76 0 - 325528 pipe_w 1944 ? 00:11:21 /usr/java/jdk1.5.0_18//bin/java -DCAV_MON_HOME=/home/pankaj/cavisson/monitors/ -DMON_TEST_RUN=3532 -DVECTOR_NAME=WS30 cm_java_gc_ex -f /www/a/logs/AppServer/gc.log -o 2 -i 10000
0 S pankaj 7003 32506 0 76 0 - 326040 pipe_w 1944 ? 00:11:12 /usr/java/jdk1.5.0_18//bin/java -DCAV_MON_HOME=/home/pankaj/cavisson/monitors/ -DMON_TEST_RUN=3724 -DVECTOR_NAME=WS30 cm_java_gc_ex -f /www/a/logs/AppServer/gc.log -o 2 -i 10000
0 R pankaj 7103 32506 0 85 0 - 325784 - 1945 ? 00:16:38 /usr/java/jdk1.5.0_18//bin/java -DCAV_MON_HOME=/home/pankaj/cavisson/monitors/ -DMON_TEST_RUN=3919 -DVECTOR_NAME=WS30 cm_java_gc_ex -f /www/a/logs/AppServer/gc.log -o 2 -i 10000
0 S pankaj 9126 32506 0 76 0 - 312984 pipe_w 1944 ? 00:11:11 /usr/java/jdk1.5.0_18//bin/java -DCAV_MON_HOME=/home/pankaj/cavisson/monitors/ -DMON_TEST_RUN=3722 -DVECTOR_NAME=WS30 cm_java_gc_ex -f /www/a/logs/AppServer/gc.log -o 2 -i 10000
0 R app 11084 9687 0 78 0 - 12777 - 2081 pts/1 00:00:00 grep java
0 S pankaj 12101 32506 0 76 0 - 326296 pipe_w 1944 ? 00:11:11 /usr/java/jdk1.5.0_18//bin/java -DCAV_MON_HOME=/home/pankaj/cavisson/monitors/ -DMON_TEST_RUN=3729 -DVECTOR_NAME=WS30 cm_java_gc_ex -f /www/a/logs/AppServer/gc.log -o 2 -i 10000
0 S pankaj 12809 32506 0 76 0 - 326808 pipe_w 1944 ? 00:11:20 /usr/java/jdk1.5.0_18//bin/java -DCAV_MON_HOME=/home/pankaj/cavisson/monitors/ -DMON_TEST_RUN=3717 -DVECTOR_NAME=WS30 cm_java_gc_ex -f /www/a/logs/AppServer/gc.log -o 2 -i 10000
0 S pankaj 12900 32506 0 76 0 - 322456 pipe_w 1944 ? 00:11:19 /usr/java/jdk1.5.0_18//bin/java -DCAV_MON_HOME=/home/pankaj/cavisson/monitors/ -DMON_TEST_RUN=3718 -DVECTOR_NAME=WS30 cm_java_gc_ex -f /www/a/logs/AppServer/gc.log -o 2 -i 10000


Resolution:
I believe this is a bug in older 2.6.x (and earlier) kernels: they cannot handle an uptime of more than 497 days. The uptime counter resets to 0 after 497 days, and this can also cause some funny process start times in ps output.

It is reported to have been fixed in kernel 2.6.14.3.
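
As a sanity check on the numbers above, here is some quick shell arithmetic. It is my own reading of the output, not a confirmed diagnosis: it assumes a 32-bit tick counter at 100 ticks per second (HZ=100) for the 497-day figure, and the helper names are made up for illustration. The odd values all line up with 32-bit overflows.

```shell
#!/bin/sh
# Days until a 32-bit tick counter wraps at HZ ticks per second.
# HZ=100 is an assumption, not something read from this server.
wrap_days() {
    echo $(( 4294967296 / $1 / 86400 ))
}

# Split a number of seconds into "days hours minutes".
split_seconds() {
    s=$1
    echo "$(( s / 86400 )) $(( s % 86400 / 3600 )) $(( s % 3600 / 60 ))"
}

# Year reached by going back 2^32 seconds from the given year,
# using an average year of 365.2425 days (31556952 seconds).
years_back() {
    echo $(( $1 - 4294967296 / 31556952 ))
}

wrap_days 100             # 497
split_seconds 2147483648  # 24855 3 14
years_back 2010           # 1874
```

2^31 seconds is 24855 days, 3 hours and 14 minutes, which matches the "-24855 days, -3:-14" reported by uptime, and 2^32 seconds before 2010 lands in 1874, which matches the timestamp on the new file.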