Friday, October 29, 2010

Unable to authenticate local user

We ran into another odd problem today: someone reported that they could not log in as a local user, and that su to that user from a normal (non-root) account failed as well.

The errors were as follows:
#1.
$ su - pankaj
su: incorrect password

#2.
Users listed in /etc/passwd (local users) cannot log in to the server.


Logs:
/var/log/messages shows:
Oct 29 14:57:15 pgserver01 sshd[1457]: Address 11.22.20.130 maps to l4339284.federated.fds, but this does not map back to the address - POSSIBLE BREAKIN ATTEMPT!
Oct 29 14:57:15 pgserver01 sshd(pam_unix)[1465]: auth could not identify password for [pankaj]
Oct 29 14:57:15 pgserver01 sshd[1457]: error: PAM: Authentication failure for pankaj from 11.22.20.130
Oct 29 14:57:17 pgserver01 sshd(pam_unix)[1457]: auth could not identify password for [pankaj]
Oct 29 14:57:17 pgserver01 sshd[1457]: Failed password for pankaj from ::ffff:11.22.20.130 port 56138 ssh2


We spent a lot of time narrowing down the exact symptoms and restarted all the relevant services to clear out any hung auth modules.
A few of the steps taken:
/etc/init.d/vas restart
/etc/init.d/xinetd restart
/etc/init.d/sshd restart


Resolution:
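We found that /etc/pam.d/system-auth was misconfigured: the pam_unix.so auth line carried the "use_first_pass" option. With that option pam_unix.so never prompts for a password itself, which matches the "auth could not identify password" messages in the log.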

[root@esu1l101 ~]# cat /etc/pam.d/system-auth
#%PAM-1.0
# This file is auto-generated.
# User changes will be destroyed the next time authconfig is run.
auth required /lib/security/$ISA/pam_env.so
auth [ignore=ignore success=done default=die] pam_vas3.so create_homedir
#auth sufficient /lib/security/$ISA/pam_unix.so likeauth nullok use_first_pass <-- replaced this line with line below
auth sufficient /lib/security/$ISA/pam_unix.so likeauth nullok
auth required /lib/security/$ISA/pam_deny.so

account [ignore=ignore success=done default=die] pam_vas3.so
account required /lib/security/$ISA/pam_unix.so
account sufficient /lib/security/$ISA/pam_succeed_if.so uid <>
account required /lib/security/$ISA/pam_permit.so

password [ignore=ignore success=done default=die] pam_vas3.so
password requisite /lib/security/$ISA/pam_cracklib.so retry=3
password sufficient /lib/security/$ISA/pam_unix.so nullok use_authtok md5 shadow nis
password required /lib/security/$ISA/pam_deny.so

session required /lib/security/$ISA/pam_limits.so
session required pam_vas3.so create_homedir
session required /lib/security/$ISA/pam_unix.so


Explanation of the PAM module's optional argument:

use_first_pass
The module should not prompt the user for a password. Instead, it should obtain the previously typed password
(from the preceding auth module), and use that.
If that doesn't work, then the user will not be authenticated.
(This option is intended for auth and password modules only).
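
To check other servers for the same setting, a short script can list the active auth lines that carry use_first_pass. This is only a minimal sketch: the function name find_use_first_pass and the SAMPLE text are made up for illustration, and a hit is not automatically a misconfiguration, because the option is valid when an earlier module in the stack has already collected the password. Treat the output as a list of lines to review by hand.

```python
def find_use_first_pass(pam_text):
    """Return (line_number, line) pairs for active auth lines that set use_first_pass."""
    hits = []
    for number, raw in enumerate(pam_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and commented-out entries.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # The first field is the module type (auth, account, password, session).
        if fields[0] == "auth" and "use_first_pass" in fields[1:]:
            hits.append((number, line))
    return hits


# Hypothetical sample modelled on the system-auth file shown above.
SAMPLE = """\
auth required /lib/security/$ISA/pam_env.so
#auth sufficient /lib/security/$ISA/pam_unix.so likeauth nullok
auth sufficient /lib/security/$ISA/pam_unix.so likeauth nullok use_first_pass
auth required /lib/security/$ISA/pam_deny.so
"""

if __name__ == "__main__":
    for number, line in find_use_first_pass(SAMPLE):
        print(f"line {number}: {line}")
```

To run it against a real file, read /etc/pam.d/system-auth into a string and pass that to the function instead of SAMPLE.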

Thursday, October 28, 2010

A single previous owner was found in the messaging engine's data store

This error comes from the service integration bus running on WebSphere 6.1.0.17 with DB2. We use this enterprise service bus for one of our projects.

You will see this error when the application starts, after it connects to the database. WebSphere itself starts fine, but the messaging bus never accepts any new connections.
If you look at the log, it shows the messaging engine in the Starting state instead of the Started state:
Messaging engine ibm61p2Node_mcomstars_QA.mcomstars_QA_s61p2-MCOMStarsBus is in state Starting.


Error:
A single previous owner was found in the messaging engine's data store, ME_UUID=3FD2CC33B88EB9E

This is also referred to as the WebSphere CWSIS1545I and CWSIS1537I messages in many IBM blogs.

SystemOut.log shows:
[10/18/10 18:21:05:580 EDT] 00000032 ManagedEsServ I com.ibm.wbiserver.sequencing.service.ManagedEsService startEsServiceWithConnRetries() cannot create connection. esApps.isEmpty=true esStarted=false wasStarted=false meStarted=false isActive=true

[10/18/10 15:05:54:639 EDT] 00000045 SibMessage I [MCOMStarsBus:mcomstars_cluster01.000-MCOMStarsBus] CWSIS1538I: The messaging engine, ME_UUID=3FD2CC33B88EB9E6, INC_UUID=7D467D46C0BBCCCD, is attempting to obtain an exclusive lock on the data store.

[10/18/10 15:05:54:732 EDT] 00000046 SibMessage I [MCOMStarsBus:mcomstars_cluster01.000-MCOMStarsBus] CWSIS1545I: A single previous owner was found in the messaging engine's data store, ME_UUID=3FD2CC33B88EB9E6, INC_UUID=2BA62BA6B8B62F0C

[10/18/10 15:05:55:139 EDT] 00000045 SibMessage I [MCOMStarsBus:mcomstars_cluster01.000-MCOMStarsBus] CWSIS1537I: The messaging engine, ME_UUID=3FD2CC33B88EB9E6, INC_UUID=7D467D46C0BBCCCD, has acquired an exclusive lock on the data store.

Exception: com.ibm.ws.sib.msgstore.PersistenceException: CWSIS1500E: The dispatcher cannot accept work.
[10/18/10 15:02:00:559 EDT] 00000045 SibMessage E [:] CWSIP0002E: An internal messaging error occurred in com.ibm.ws.sib.processor.impl.BaseDestinationHandler, 1:1977:1.692.1.7, com.ibm.wsspi.sib.core.exception.SIRollbackException: CWSIS1002E: An unexpected exception was caught during transaction completion. Exception: com.ibm.ws.sib.msgstore.PersistenceException: CWSIS1500E: The dispatcher cannot accept work.

After a healthy startup, SystemOut.log should look like this:
[10/22/10 15:27:54:215 EDT] 0000002a SibMessage I [MCOMStarsBus:mcomstars_cluster01.000-MCOMStarsBus] CWSIS1538I: The messaging engine, ME_UUID=3FD2CC33B88EB9E6, INC_UUID=671A671AD5695F63, is attempting to obtain an exclusive lock on the data store.

[10/22/10 15:27:54:304 EDT] 0000002b SibMessage I [MCOMStarsBus:mcomstars_cluster01.000-MCOMStarsBus] CWSIS1543I: No previous owner was found in the messaging engines data store.

[10/22/10 15:27:54:318 EDT] 0000002a SibMessage I [MCOMStarsBus:mcomstars_cluster01.000-MCOMStarsBus] CWSIS1537I: The messaging engine, ME_UUID=3FD2CC33B88EB9E6, INC_UUID=671A671AD5695F63, has acquired an exclusive lock on the data store.

[10/22/10 15:28:06:604 EDT] 0000001f SibMessage I [MCOMStarsBus:mcomstars_cluster01.000-MCOMStarsBus] CWSID0016I: Messaging engine mcomstars_cluster01.000-MCOMStarsBus is in state Started.
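
The difference between the two startups comes down to which message ID follows the lock attempt: CWSIS1545I in the failing case, CWSIS1543I in the healthy one. A rough sketch of checking a log for this could look as follows. The function name classify_startup and the labels it returns are invented for this example, and the regular expression only assumes the message ID shape seen in the excerpts above.

```python
import re

# Matches IDs shaped like the ones in the excerpts: CWSIS1545I, CWSID0016I, CWSIP0002E.
MESSAGE_ID = re.compile(r"\b(CWSI[A-Z]\d{4}[IWE])\b")


def classify_startup(log_text):
    """Classify a SystemOut.log excerpt by the data store ownership message it contains."""
    ids = set(MESSAGE_ID.findall(log_text))
    if "CWSIS1545I" in ids:
        # "A single previous owner was found": the case described in this post.
        return "previous-owner"
    if "CWSIS1543I" in ids:
        # "No previous owner was found": the healthy startup.
        return "clean"
    return "unknown"


if __name__ == "__main__":
    bad = "SibMessage I CWSIS1545I: A single previous owner was found in the messaging engine's data store"
    good = "SibMessage I CWSIS1543I: No previous owner was found in the messaging engines data store."
    print(classify_startup(bad))
    print(classify_startup(good))
```

This only looks at message IDs; it does not confirm that the engine reached the Started state, so the CWSID0016I line is still worth checking by eye.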



Resolution:
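This error has several variants, depending on how your environment is set up. In our case, clearing out the messaging engine's tables fixed it.
Run the following against the database (ESBME1 is the schema that holds our messaging engine's tables), then restart the messaging engine: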

delete from ESBME1.SIB000;
delete from ESBME1.SIB001;
delete from ESBME1.SIB002;
delete from ESBME1.SIBCLASSMAP;
delete from ESBME1.SIBKEYS;
delete from ESBME1.SIBLISTING;
delete from ESBME1.SIBOWNER;
delete from ESBME1.SIBXACTS;
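
If your messaging engine uses a different schema, the same statements can be generated rather than edited by hand. This is a small sketch only: the helper delete_statements is hypothetical, the table list is simply copied from the statements above, and your data store may contain a different set of tables, so check the schema before relying on it. The script only prints SQL; it does not connect to DB2.

```python
# Table names copied from the statements in this post; verify against your own schema.
SIB_TABLES = [
    "SIB000",
    "SIB001",
    "SIB002",
    "SIBCLASSMAP",
    "SIBKEYS",
    "SIBLISTING",
    "SIBOWNER",
    "SIBXACTS",
]


def delete_statements(schema, tables=SIB_TABLES):
    """Build one 'delete from SCHEMA.TABLE;' statement per table."""
    # Refuse anything that is not a plain identifier, to avoid building odd SQL.
    if not schema.isidentifier():
        raise ValueError(f"unexpected schema name: {schema!r}")
    return [f"delete from {schema}.{table};" for table in tables]


if __name__ == "__main__":
    for statement in delete_statements("ESBME1"):
        print(statement)
```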


Here is a good explanation of the problem:
This means that either you have 2 messaging engines pointing at the same
database tables, or you had another messaging engine defined that used the
same tables before (e.g. you created a bus/messaging engine, started it,
stopped it, deleted the bus and recreated, then pointed the new messaging
engine at the same database as the previous one).

If you have 2 messaging engines using the same table then the fix is to
point your messaging engine at a different set of tables (e.g. different schema or database)
If you have deleted and recreated a bus/messaging engine then clear out the
content of the messaging engine's database tables and restart it. This is a feature that prevents 2 different messaging engines from using the same tables in the database.

--
Martin Phillips
mphillip at uk.ibm.com