Monday, December 17, 2012

Is there a way to find dd status on Solaris?

Well, this all started when we had to move around 24TB of data between two data centers and decided to use the dd command. I guess it is much easier on Linux, where you can use a progress bar and where dd itself can report how much it has copied so far. Unfortunately I couldn't find anything that fancy on Solaris.

If you are running a dd command like the one below and want to find out when it will finish, what options do you have?
dd if=/share/disks/disk-402.img | ssh root@server "dd of=/dev/dsk/c6t50002AC000AE0AE2d0s0"


One option is the following, which I found on the net, but it didn't work for me. It does send a user signal 1 (SIGUSR1) to the process, but on Solaris that kills dd: the Solaris dd does not handle SIGUSR1, so the signal's default action (terminate) applies instead of a status report.

Printing dd status

I recently used dd to zero out some hard drives on my Fedora Core workstation, and found that this operation takes a good deal of time (even when large blocksizes are used, it still takes a while). The dd utility doesn’t report status information by default, but when fed a SIGUSR1 signal it will dump the status of the current operation:
dd if=/dev/zero of=/dev/hda1 bs=512 &
kill -SIGUSR1 1749
1038465+0 records in
1038465+0 records out
531694080 bytes (532 MB) copied, 11.6338 seconds, 45.7 MB/s
watch -n 10 kill -USR1
It still amazes me how much stuff I have left to learn about the utilities I use daily.



The other option was to check from the storage side how many blocks had been written so far, and from that estimate how long it would take to write the whole 24TB to that LUN. The problem with a fully provisioned LUN is that the storage allocates all the blocks to the LUN as soon as it is provisioned, so you cannot see how much has actually been written. There may be a way to see this if the LUN is thin provisioned.



The easiest option is to use iostat to see how fast you are writing to the device, and do the math.

-bash-3.00#  iostat -xnmMpz 1
                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.0    0.0    0.0  0.0  0.0    0.0    5.5   0   1 c1t0d0
    0.0    1.0    0.0    0.0  0.0  0.0    0.0    5.5   0   1 c1t0d0s0 (/)
  982.5  982.5    7.7    7.7  0.0  1.1    0.0    0.5   1  62 c6t50002AC000B10AE2d0
  982.5  982.5    7.7    7.7  0.0  1.1    0.0    0.5   1  62 c6t50002AC000B10AE2d0s0
                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
  989.6  989.6    7.7    7.7  0.0  1.1    0.0    0.5   1  61 c6t50002AC000B10AE2d0
  989.6  989.6    7.7    7.7  0.0  1.1    0.0    0.5   1  61 c6t50002AC000B10AE2d0s0


As we see here, we are writing at 7.7MB/s.
At that rate, writing 1TB should take around 36 hrs, and 24TB would take around 865 hrs (roughly 36 days).
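
The math above can be sketched in a few lines of Python. This is only a rough estimate under my own assumptions: the write rate stays constant, 1TB is counted as 10^12 bytes, and the Mw/s value is the fourth column of an `iostat -xnM` device line (the function names are made up for illustration).

```python
def parse_write_mb_per_s(iostat_line):
    """Return the Mw/s column from one `iostat -xnM` device line."""
    fields = iostat_line.split()
    # Column order: r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
    return float(fields[3])


def eta_hours(total_bytes, mb_per_s):
    """Hours needed to write total_bytes at a steady rate (decimal MB/s)."""
    seconds = total_bytes / (mb_per_s * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    line = "  982.5  982.5    7.7    7.7  0.0  1.1    0.0    0.5   1  62 c6t50002AC000B10AE2d0"
    rate = parse_write_mb_per_s(line)
    TB = 10**12  # assumption: decimal terabytes
    for size_tb in (1, 24):
        hours = eta_hours(size_tb * TB, rate)
        print(f"{size_tb} TB at {rate} MB/s: {hours:.1f} h ({hours / 24:.1f} days)")
```

If you count a TB as 2^40 bytes instead, the estimate grows by about 5%, so treat the result as a ballpark figure.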

Saturday, October 13, 2012

How to rename QFS fsname?


1: umount /new

2: vi /etc/vfstab     change new to new4
3: vi mcf      change new to new4 in all places.

cat /etc/opt/SUNWsamfs/mcf

new4                                            500    ms      new4   on        shared
/dev/dsk/c6t600A0B80004715700000074D47FD082Ed0s0 501    md      new4   on
/dev/dsk/c6t600A0B8000471570000007D247FD0A22d0s0 502    md      new4   on
/dev/dsk/c6t600A0B80004715700000086F49F1D551d0s0 504    md      new4   on


4: samd config
5: samfsck -F -R new4          (this will change the file system name to new4)

Example:
bash-3.00# samfsck -F -R new4
name:     bcombackups       version:     2A    shared
First pass
ALERT:  ino 1743721017.967461289, Object flag set, should be clear, meta_flag 1, size 8581431105406506118
ALERT:  ino -796177020.1556009446, Object flag set, should be clear, meta_flag 0, size 3201155233938255577
ALERT:  ino 22367104.671481968, Object flag set, should be clear, meta_flag 0
ALERT:  ino 700653835.-1820544200, Object flag set, should be clear, meta_fla
ALERT:  ino -2107630336.-937590376, Object flag set, should be clear, meta_fl
ALERT:  ino 738287680.101681192, Object flag set, should be clear, meta_flag
ALERT:  ino 503702500.2047090706, Object flag set, should be clear, meta_flag
ALERT:  ino -445008776.-1635288612, Object flag set, should be clear, meta_fl
ALERT:
Second pass
Third pass
ALERT:  Invalid block:        ino 1853 marked damaged
NOTICE: ino 2214.1,     Repaired link count from 688 to 674
ALERT:  Invalid block:        ino 4218 marked damaged
ALERT:  Invalid block:        ino 5893 marked damaged
ALERT:  Invalid block:        ino 8836 marked damaged
ALERT:  Invalid block:        ino 12528 marked damaged
ALERT:  Invalid block:        ino 16594 marked damaged
ALERT:  Invalid block:        ino 18916 marked damaged

Inodes processed: 8066560

total data kilobytes       = 5858959296
total data kilobytes free  = 1617932544
NOTICE: Reclaimed 332916064256 bytes

Another example:
root@dam-app2 # samfsck -F -R new4
name:     opsbackup       version:     2A    shared
First pass
Second pass
Third pass

Inodes processed: 2048

total data kilobytes       = 3382270208
total data kilobytes free  = 3382259904



6: mount new4

new4                  4.7T    34G   4.7T     1%    /nearline3


Netgear N600 and DOCSIS 3.0

I've been in the IT Infrastructure and Operations world for more than 15 years now, helping companies and folks resolve all sorts of performance issues. Here is my horrible experience with the internet connection @ home :)

I had been using AT&T for almost a decade and finally decided to go to Comcast.
I upgraded to a Motorola SB6120 DOCSIS 3.0 modem and also bought a Netgear N600 dual-band wireless router for Comcast internet, to avoid any monthly equipment charges.

Configured and customized the router the way I wanted, and the internet was all good. By the way, I have CAT 6 structured wiring at home that covers almost all the rooms.

The problem I was having was that when connecting to the internet through the wifi,
there was a lag and websites would load way slower.

Symptoms:
Frankly, I didn't notice any lag; however, my wife started complaining that her VPN dropped its connection after we switched to Comcast, and that it was horribly slow compared to AT&T. We were on AT&T Pro, which was 3 Mbps, and Comcast promised at least 10 Mbps. Speedtest confirmed double that speed; it never came in below 22 Mbps. Mathematically there was no way the internet should be slow, so I ignored the problem. The complaints continued, so finally I decided to spend some time on the issue.

I still could not see the difference visually with google.com or yahoo.com.

But yes, it could be better: there was a lag of 3-5 seconds while pulling up a site, which may be annoying to some.
The problem became very obvious when I started a download: it ran at 70KB/s, which means my effective internet speed was only 560kbps.
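
The conversion is just bytes to bits (multiply by 8). A tiny Python sketch, with function names of my own choosing, makes it easy to compare download rates against what speedtest reports:

```python
def kbytes_to_kbits(kilobytes_per_s):
    """Convert a download rate in KB/s to kbps (1 byte = 8 bits)."""
    return kilobytes_per_s * 8


def mbytes_to_mbits(megabytes_per_s):
    """Convert a download rate in MB/s to Mbps (1 byte = 8 bits)."""
    return megabytes_per_s * 8


if __name__ == "__main__":
    print(kbytes_to_kbits(70), "kbps")     # the slow download through the router
    print(mbytes_to_mbits(2.4), "Mbps")    # the 2.4 MB/s seen when plugged straight into the modem
```

Protocol overhead is ignored here, so a real download will always sit a little below the line rate.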


Troubleshooting steps:
#Packet loss:
The first thing I noticed was the packet loss. While pinging the router, after every 25 packets at 1 ms latency there were 4 packets at 100 ms. Called Netgear and spent a couple of hours with tech support. They tried lots of combinations: no security, different channels (I believe 6 and 11 are reliable :-)), power-cycling the modem, reflashing the modem, etc. I had already downloaded inSSIDer to see if there was any interference from the channels my neighbors were using. Finally they changed the default fragmentation length from 2346 to 2345 and there was some improvement in packet loss, but ultimately they gave up.

#Wireless speed test
Tried a speedtest using wireless, it says 22Mbps however download sucks (50-100 KB/s)

#Wired speedtest
Tried a speedtest directly connecting to router, it says 22Mbps however download still sucks (50-100 KB/s)

#Connected directly to modem
Connected directly to the modem, speedtest is good (27 Mbps) and so is the download speed (2.4 MB/s),
so now the problem boils down to the router.

I googled the Netgear problem with Comcast and found that we have to disable WMM on the Netgear. It was already disabled from when I had initially configured it. Again played with a few things, no change. There were a lot of suggestions to replace the router and go with the Comcast router.

However, I decided to reset the router again and used all the default options to connect wirelessly.

What's the download speed? 4MB/s !!

Now I have a backup of the old N600 config and of the new config, which works at the expected speed. The problem is that netgear.cfg is encrypted, so comparing the two is not straightforward. Do we want to solve that problem now? Maybe, whenever time permits next time...

It took me more than a month to solve my home internet speed problem :(

Thursday, September 13, 2012

3PAR inserv not showing WWNs on the FC switch


After the 3PAR installation was complete and we were creating zones for the hosts,
we noticed that the FC ports coming from the InServ nodes did not show their WWNs on the switch.


Here is some information learned while talking to 3PAR support:

All the ports connected to a cage/chassis should be set to initiator.
All the ports facing the hosts should be configured as target.

I believe all ports are configured in initiator mode by default.


Step #0. showport

Result:
3par cli% showport
N:S:P      Mode     State ----Node_WWN---- -Port_WWN/HW_Addr- Type Protocol
2:0:1 initiator     ready 2FF70002AC001477   22010002AC001477 disk       FC
2:0:2 initiator     ready 2FF70002AC001477   22020002AC001477 disk       FC
2:1:1    target loss_sync 2FF70002AC001477   22110002AC001477 free       FC
2:1:2    target loss_sync 2FF70002AC001477   22120002AC001477 free       FC
2:2:1 initiator     ready 2FF70002AC001477   22210002AC001477 disk       FC
2:2:2 initiator     ready 2FF70002AC001477   22220002AC001477 disk       FC
2:2:3 initiator     ready 2FF70002AC001477   22230002AC001477 free       FC  <--- should be target
2:2:4 initiator     ready 2FF70002AC001477   22240002AC001477 free       FC  <--- should be target
2:6:1      peer loss_sync                -       0002AC69249B rcip       IP
3:0:1 initiator     ready 2FF70002AC001477   23010002AC001477 disk       FC
3:0:2 initiator     ready 2FF70002AC001477   23020002AC001477 disk       FC
3:1:1    target loss_sync 2FF70002AC001477   23110002AC001477 free       FC
3:1:2    target loss_sync 2FF70002AC001477   23120002AC001477 free       FC
3:2:1 initiator     ready 2FF70002AC001477   23210002AC001477 disk       FC
3:2:2 initiator     ready 2FF70002AC001477   23220002AC001477 disk       FC
3:2:3 initiator     ready 2FF70002AC001477   23230002AC001477 free       FC  <--- should be target
3:2:4 initiator     ready 2FF70002AC001477   23240002AC001477 free       FC  <--- should be target
3:6:1      peer loss_sync                -       0002AC6A2867 rcip       IP
---------------------------------------------------------------------------
   18



Step #1. Change all ports which will connect to hosts to targets
Assuming you have 4 connections to the FC switch:
controlport rst -m target  2:2:3
controlport rst -m target  2:2:4
controlport rst -m target  3:2:3
controlport rst -m target  3:2:4


Results:
3par cli% controlport rst -m target  2:2:3
Are you sure you want to run controlport rst -m target on port 2:2:3?
select q=quit y=yes n=no: y

>>>>> Repeat the same for the other ports


Step #2. showport

Result:
3par_F400_mcomNY01 cli% showport
N:S:P      Mode     State ----Node_WWN---- -Port_WWN/HW_Addr- Type Protocol
2:0:1 initiator     ready 2FF70002AC001477   22010002AC001477 disk       FC
2:0:2 initiator     ready 2FF70002AC001477   22020002AC001477 disk       FC
2:1:1    target loss_sync 2FF70002AC001477   22110002AC001477 free       FC
2:1:2    target loss_sync 2FF70002AC001477   22120002AC001477 free       FC
2:2:1 initiator     ready 2FF70002AC001477   22210002AC001477 disk       FC
2:2:2 initiator     ready 2FF70002AC001477   22220002AC001477 disk       FC
2:2:3    target     ready 2FF70002AC001477   22230002AC001477 free       FC <-- converted as target
2:2:4    target     ready 2FF70002AC001477   22240002AC001477 free       FC <-- converted as target
2:6:1      peer loss_sync                -       0002AC69249B rcip       IP
3:0:1 initiator     ready 2FF70002AC001477   23010002AC001477 disk       FC
3:0:2 initiator     ready 2FF70002AC001477   23020002AC001477 disk       FC
3:1:1    target loss_sync 2FF70002AC001477   23110002AC001477 free       FC
3:1:2    target loss_sync 2FF70002AC001477   23120002AC001477 free       FC
3:2:1 initiator     ready 2FF70002AC001477   23210002AC001477 disk       FC
3:2:2 initiator     ready 2FF70002AC001477   23220002AC001477 disk       FC
3:2:3    target     ready 2FF70002AC001477   23230002AC001477 free       FC <-- converted as target
3:2:4    target     ready 2FF70002AC001477   23240002AC001477 free       FC <-- converted as target
3:6:1      peer loss_sync                -       0002AC6A2867 rcip       IP
---------------------------------------------------------------------------



Even after converting the ports to target, the switch still doesn't show the WWNs.

Step #3. showport -c
Result:

3par cli% showport -c
N:S:P      Mode Device Pos Config     Topology  Rate Cls Mode_change
2:0:1 initiator  cage0   0  valid private_loop 4Gbps   3  prohibited
                 cage4   1  valid private_loop 4Gbps   3  prohibited
2:0:2 initiator  cage1   0  valid private_loop 4Gbps   3  prohibited
                 cage5   1  valid private_loop 4Gbps   3  prohibited
2:1:1    target    ---   -    ---          n/a   n/a n/a     allowed
2:1:2    target    ---   -    ---          n/a   n/a n/a     allowed
2:2:1 initiator  cage2   0  valid private_loop 4Gbps   3  prohibited
                 cage6   1  valid private_loop 4Gbps   3  prohibited
2:2:2 initiator  cage3   0  valid private_loop 4Gbps   3  prohibited
                 cage7   1  valid private_loop 4Gbps   3  prohibited
2:2:3    target    ---   -    --- private_loop 4Gbps   3     allowed <-- Topology has to be fabric instead of private_loop
2:2:4    target    ---   -    --- private_loop 4Gbps   3     allowed <-- Topology has to be fabric instead of private_loop
3:0:1 initiator  cage0   1  valid private_loop 4Gbps   3  prohibited
                 cage4   0  valid private_loop 4Gbps   3  prohibited
3:0:2 initiator  cage1   1  valid private_loop 4Gbps   3  prohibited
                 cage5   0  valid private_loop 4Gbps   3  prohibited
3:1:1    target    ---   -    ---          n/a   n/a n/a     allowed
3:1:2    target    ---   -    ---          n/a   n/a n/a     allowed
3:2:1 initiator  cage2   1  valid private_loop 4Gbps   3  prohibited
                 cage6   0  valid private_loop 4Gbps   3  prohibited
3:2:2 initiator  cage3   1  valid private_loop 4Gbps   3  prohibited
                 cage7   0  valid private_loop 4Gbps   3  prohibited
3:2:3    target    ---   -    --- private_loop 4Gbps   3     allowed <-- Topology has to be fabric instead of private_loop
3:2:4    target    ---   -    --- private_loop 4Gbps   3     allowed <-- Topology has to be fabric instead of private_loop
--------------------------------------------------------------------
   24
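
With more than a handful of ports, eyeballing this listing gets tedious. Here is a small Python sketch that scans `showport -c` text for target ports still in private_loop. It assumes the column layout shown above (Mode in the second column, Topology in the sixth); the function name and the SAMPLE text are just for illustration, and I have only tried it against output shaped like this one.

```python
import re

PORT_RE = re.compile(r"^\d+:\d+:\d+$")  # N:S:P, e.g. 2:2:3

SAMPLE = """N:S:P      Mode Device Pos Config     Topology  Rate Cls Mode_change
2:0:1 initiator  cage0   0  valid private_loop 4Gbps   3  prohibited
                 cage4   1  valid private_loop 4Gbps   3  prohibited
2:1:1    target    ---   -    ---          n/a   n/a n/a     allowed
2:2:3    target    ---   -    --- private_loop 4Gbps   3     allowed
2:2:4    target    ---   -    --- private_loop 4Gbps   3     allowed
"""


def loop_targets(showport_c_output):
    """Return the N:S:P ids of target ports whose topology is private_loop."""
    flagged = []
    for line in showport_c_output.splitlines():
        fields = line.split()
        # Port rows start with N:S:P; the header, cage continuation rows,
        # separator and count lines do not, so they are skipped.
        if len(fields) < 6 or not PORT_RE.match(fields[0]):
            continue
        mode, topology = fields[1], fields[5]
        if mode == "target" and topology == "private_loop":
            flagged.append(fields[0])
    return flagged


if __name__ == "__main__":
    print(loop_targets(SAMPLE))
```

Unconnected target ports (topology n/a) are deliberately not flagged, since there is nothing plugged into them yet.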



Step #4. Set up the ports as fabric instead of loop

Results:
To enable the port on the switch:
3par cli% controlport offline  2:2:3
Are you sure you want to run controlport offline on port 2:2:3?
select q=quit y=yes n=no: y

3par cli% controlport config host -ct point   2:2:3
Are you sure you want to run controlport config host -ct point on port 2:2:3?
select q=quit y=yes n=no: y

3par cli% controlport rst  2:2:3
Are you sure you want to run controlport rst on port 2:2:3?
select q=quit y=yes n=no: y


>>>>> Repeat the same for the other ports
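
To avoid typos when repeating this for every port, the command sequence can be generated rather than typed. The Python sketch below only prints the three controlport commands used above for each port; it does not talk to the array, the function name is made up, and the port list is the four host-facing ports from this setup. Each command still asks for confirmation when you run it in the 3PAR CLI.

```python
def fabric_commands(ports):
    """Build the offline / config / reset sequence for each host-facing port."""
    commands = []
    for port in ports:
        commands.append(f"controlport offline {port}")
        commands.append(f"controlport config host -ct point {port}")
        commands.append(f"controlport rst {port}")
    return commands


if __name__ == "__main__":
    for cmd in fabric_commands(["2:2:3", "2:2:4", "3:2:3", "3:2:4"]):
        print(cmd)
```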


Step #5. Finally it should look as below

3par cli% showport -c
N:S:P      Mode Device Pos Config     Topology  Rate Cls Mode_change
2:0:1 initiator  cage0   0  valid private_loop 4Gbps   3  prohibited
                 cage4   1  valid private_loop 4Gbps   3  prohibited
2:0:2 initiator  cage1   0  valid private_loop 4Gbps   3  prohibited
                 cage5   1  valid private_loop 4Gbps   3  prohibited
2:1:1    target    ---   -    ---          n/a   n/a n/a     allowed
2:1:2    target    ---   -    ---          n/a   n/a n/a     allowed
2:2:1 initiator  cage2   0  valid private_loop 4Gbps   3  prohibited
                 cage6   1  valid private_loop 4Gbps   3  prohibited
2:2:2 initiator  cage3   0  valid private_loop 4Gbps   3  prohibited
                 cage7   1  valid private_loop 4Gbps   3  prohibited
2:2:3    target    ---   -    ---       fabric 4Gbps   3     allowed <-- Mode as target and Topology fabric
2:2:4    target    ---   -    ---       fabric 4Gbps   3     allowed <-- Mode as target and Topology fabric
3:0:1 initiator  cage0   1  valid private_loop 4Gbps   3  prohibited
                 cage4   0  valid private_loop 4Gbps   3  prohibited
3:0:2 initiator  cage1   1  valid private_loop 4Gbps   3  prohibited
                 cage5   0  valid private_loop 4Gbps   3  prohibited
3:1:1    target    ---   -    ---          n/a   n/a n/a     allowed
3:1:2    target    ---   -    ---          n/a   n/a n/a     allowed
3:2:1 initiator  cage2   1  valid private_loop 4Gbps   3  prohibited
                 cage6   0  valid private_loop 4Gbps   3  prohibited
3:2:2 initiator  cage3   1  valid private_loop 4Gbps   3  prohibited
                 cage7   0  valid private_loop 4Gbps   3  prohibited
3:2:3    target    ---   -    ---       fabric 4Gbps   3     allowed <-- Mode as target and Topology fabric
3:2:4    target    ---   -    ---       fabric 4Gbps   3     allowed <-- Mode as target and Topology fabric
--------------------------------------------------------------------
   24

samsharefs -u -R [QFS share name]

Problem: changed the host ip and now not able to mount a QFS file system

server old ip : 1.10.10.1
server new ip: 10.1.1.1

Getting an error:
SC_mount() error: Transport endpoint is not connected


samsharefs -R shows the old IP address of the server

# samsharefs -u -R 
#
# Host file for family set 'share1'
#
# Version: 4    Generation: 18    Count: 6
# Server = host 0/titan, length = 192
#
1.10.10.1 1 0 server
1.10.10.2 2 0
1.10.10.3 3 0


Solution:
samsharefs -u -R

# samsharefs -u -R 
#
# Host file for family set 'share1'
#
# Version: 4    Generation: 18    Count: 6
# Server = host 0/titan, length = 192
#
10.1.1.1 1 0 server
10.1.1.2 2 0
10.1.1.3 3 0



 This procedure allows you to add a new client or to change IP addresses or add secondary servers.

Edit the file /etc/opt/SUNWsamfs/hosts. on the server to add the new client, change IP addresses or make any other change to the configuration of the file system.

Update the binary hosts file on the server:

samsharefs -u

if the file system is mounted OR

samsharefs -u  -R

if the file system is unmounted (counterintuitive, but correct).
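
Since the -R flag is easy to get backwards, here is a minimal Python sketch of the rule above. It only builds the argument list; the function name is hypothetical, and the file system name is passed as the last argument, as in the title of this post.

```python
def samsharefs_update_cmd(fs_name, mounted):
    """Return the samsharefs command that updates the binary hosts file."""
    cmd = ["samsharefs", "-u"]
    if not mounted:
        # Unmounted file system: add -R
        cmd.append("-R")
    cmd.append(fs_name)
    return cmd


if __name__ == "__main__":
    print(" ".join(samsharefs_update_cmd("share1", mounted=True)))
    print(" ".join(samsharefs_update_cmd("share1", mounted=False)))
```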

Removing a client requires that you unmount the file system on the server, which means you must first unmount all clients. It is possible to unmount and unconfigure the client, then to do the server unconfiguration during scheduled downtime. Leaving the client in the configuration is a security hole, however, so it should be removed as soon as possible.

Solaris EFI and SMI format labels


Solaris EFI and SMI Labels

We ran into some disk corruption a few days back and learned about the newer Solaris disk label format.
We also discovered one weird Solaris bug that is supposed to be fixed in coming releases.

The problematic device is a 155GB LUN from IBM DS4800 storage, exported to only one host.

More info on EFI
http://docs.sun.com/app/docs/doc/819-2723/disksconcepts-14?a=view


SMI is the conventional (VTOC) label format, with slices 0-7.
The EFI format is used with disks greater than 1 terabyte, but it can also be applied to smaller disks.


Symptoms and Error:
1. Unable to mount the file systems
#mount /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6 /mnt
 I/O error (I don't have the exact error )

2. When we ran fsck on the device, it gave up on one of the inodes and dumped core.

3. We were also getting this corrupt label magic error
WARNING: /scsi_vhci/ssd@g600a0b80004715fc000007bc47fe38e6 (ssd196):
        Corrupt label; wrong magic number
However, this error was showing up on almost all the devices on the system.
We tried to label the disk using format and then print the VTOC, which threw the error below.

4. Print vtoc error
# prtvtoc /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6
prtvtoc: /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6: Unable to read Disk geometry errno = 0x16

After printing the VTOC, if you go back to format again, it says the disk is not formatted.

5. And finally, when we decided to run newfs, again an I/O error.
# newfs /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6
/dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6: I/O error

Solution:
The resolution was to label the device in EFI format: once we did that, we were able to run newfs and mount the file system. With the conventional (SMI) label, newfs gives an I/O error.

#format -e
and label it as EFI


Error with SMI label :
We tried experimenting by changing from slice 6 to slice 0

# prtvtoc /dev/dsk/c4t600A0B80004715FC000007BC47FE38E6d0s0
* /dev/dsk/c4t600A0B80004715FC000007BC47FE38E6d0s0 partition map
*
* Dimensions:
*     512 bytes/sector
*      64 sectors/track
*     128 tracks/cylinder
*    8192 sectors/cylinder
*   40320 cylinders
*   40318 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00          0 330285056 330285055
       1      3    01     262144    262144    524287
       2      5    01          0 330285056 330285055
       6      4    00     524288 329760768 330285055
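
A quick sanity check on this partition map can be done in Python. The sketch below assumes 512-byte sectors (as the prtvtoc header says) and uses helper names of my own. It recomputes each slice's last sector and size, and checks whether two slices share sectors. Slice 2 is left out since it conventionally covers the whole disk.

```python
SECTOR_BYTES = 512  # "512 bytes/sector" in the prtvtoc header


def last_sector(first, count):
    """Last sector of a slice, as printed in the 'Last Sector' column."""
    return first + count - 1


def size_mb(count):
    """Slice size in MB (2**20 bytes), the unit newfs reports."""
    return count * SECTOR_BYTES / 2**20


def overlaps(a, b):
    """True when two (first, count) slices share at least one sector."""
    return a[0] <= last_sector(*b) and b[0] <= last_sector(*a)


if __name__ == "__main__":
    slices = {0: (0, 330285056), 1: (262144, 262144), 6: (524288, 329760768)}
    for num, (first, count) in slices.items():
        print(f"slice {num}: last sector {last_sector(first, count)}, {size_mb(count):.1f} MB")
    print("slice 0 overlaps slice 6:", overlaps(slices[0], slices[6]))
```

In this map slice 0 spans the same sectors as slices 1 and 6 combined. I can't say whether that had anything to do with the errors, but it is worth knowing before running newfs on either slice.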


SMI format:
# newfs /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s0
newfs: construct a new file system /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s0: (y/n)? y
/dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s0: cannot open

EFI Label works:

# newfs /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6
newfs: construct a new file system /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6: (y/n)? y
/dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6:      329760768 sectors in 53672 cylinders of 48 tracks, 128 sectors
        161016.0MB in 3355 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
..................................................................
super-block backups for last 10 cylinder groups at:
 328829088, 328927520, 329025952, 329124384, 329222816, 329321248, 329419680,
 329518112, 329616544, 329714976


After newfs completed successfully with the EFI label, we changed the label back to SMI.

newfs started OK but again ended with an I/O error:

# newfs /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6
newfs: construct a new file system /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6: (y/n)? y
/dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6:      329760768 sectors in 53672 cylinders of 48 tracks, 128 sectors
        161016.0MB in 3355 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
..................................................................
super-block backups for last 10 cylinder groups at:
 328829088, 328927520, 329025952, 329124384, 329222816, 329321248, 329419680,
 329518112, 329616544, 329714976
fsirand: Cannot open /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6: I/O error
/usr/sbin/fsirand /dev/rdsk/c4t600A0B80004715FC000007BC47FE38E6d0s6: failed, status = 256


Sample of both formats:

SMI Label
9. c6t50002AC0001D1477d0 <3PARdata-VV-3111 cyl 860 alt 2 hd 8 sec 304>  701
          /scsi_vhci/ssd@g50002ac0001d1477

EFI Label
9. c6t50002AC0001D1477d0 <3PARdata-VV-3111-1.00GB>
          /scsi_vhci/ssd@g50002ac0001d1477

Wednesday, September 12, 2012

QFS does not support dynamic LUN sizing


Here is the scenario:

We have a shared QFS (SAM) file system on Solaris 10 with QFS 5.2, using 3 LUNs as below:
1 x 2TB
1 x 2TB
1 x 1.5TB
The file system mounted from the above LUNs is 5.5TB.
We extended the third LUN (1.5TB) to 2TB from the storage side (3PAR).

The host sees it as 2TB. We updated the QFS config and tried to run samgrowfs.


format output:
32. c5t50002AC0000E1477d0 <3PARdata-VV-3111-2.00TB>  601
          /scsi_vhci/ssd@g50002ac0000e1477
33. c5t50002AC0000F1477d0 <3PARdata-VV-3111-2.00TB>  602
          /scsi_vhci/ssd@g50002ac0000f1477
34. c5t50002AC0001A1477d0 <3PARdata-VV-3111-2.00TB>  603   <-- changed from 1.5TB to 2TB
          /scsi_vhci/ssd@g50002ac0001a1477


umount fs
samd config
samgrowfs
mount fs

The file system is still 5.5TB.

Unfortunately, the only thing we can do here is:
back up the file system: samfsdump
recreate the file system: sammkfs -S
restore the file system: samfsrestore



Wednesday, July 4, 2012

luxadm -e port shows "NOT CONNECTED"


Here is a situation:
We had been using two dual-port HBA cards, initially with only one port on the host used for storage. When we started setting up a second port to connect to the second storage array, we found out that one of the HBA cards was bad, and got it replaced. All 4 ports are online now. The Brocade sees all the WWNs from the FC cards, and the cards on the host show online as well.

bash-3.00# fcinfo hba-port -l | grep -i state
        State: online
        State: online
        State: online
        State: online

Now we run luxadm -e port and only one port is communicating. After replacing the card, we updated the Brocade with the new WWN of the first port, which was already in use. We also zoned the second port from both FC cards to the second storage array. So we were expecting at least 2 ports to be communicating, as before.

Why is luxadm showing the card as not connected?

bash-3.00# luxadm -e port
/devices/pci@0/pci@0/pci@8/pci@0/pci@2/SUNW,qlc@0/fp@0,0:devctl    NOT CONNECTED
/devices/pci@0/pci@0/pci@8/pci@0/pci@2/SUNW,qlc@0,1/fp@0,0:devctl  NOT CONNECTED
/devices/pci@0/pci@0/pci@8/pci@0/pci@8/SUNW,qlc@0,1/fp@0,0:devctl  NOT CONNECTED
/devices/pci@0/pci@0/pci@8/pci@0/pci@8/SUNW,qlc@0/fp@0,0:devctl    CONNECTED

Before going further into troubleshooting mode, I just wanted to understand what "NOT CONNECTED" means.

## The “luxadm -e port” command is used to verify that an HBA has established communication with a node.

# luxadm -e port
/devices/pci@1f,4000/SUNW,qlc@4/fp@0,0:devctl CONNECTED
/devices/pci@1f,4000/SUNW,qlc@4,1/fp@0,0:devctl CONNECTED

NOTE:  
“CONNECTED” means the HBA has established communication with some other node (initiator or target). “NOT CONNECTED” means the HBA has not established communication with any other node, or is connected to a switch that has no target (including not being zoned to a target).

Now, to our understanding, we had verified the switch and everything looked good.


Findings:
When the HBA went bad, its WWN (zone member) was automatically taken out of the zone. That was the reason for the port in the "NOT CONNECTED" state.

Everything was configured correctly for the second ports going to the other storage array, but unfortunately the new zones had not been added to the zone group. If you are familiar with the Brocade GUI (sorry, never tried the command line for FC zoning), the last step after creating aliases and zones is to add the newly created zones to the zone group.

Bottom line: if the HBAs are online, it is most likely a switch problem when we see the "NOT CONNECTED" state.
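
On a host with many HBA ports, a small Python sketch like this can pull out just the ports that need a look on the switch side. It assumes each line of `luxadm -e port` output ends in either CONNECTED or NOT CONNECTED, as in the listing earlier in this post; the function name is my own.

```python
SAMPLE = """/devices/pci@0/pci@0/pci@8/pci@0/pci@2/SUNW,qlc@0/fp@0,0:devctl    NOT CONNECTED
/devices/pci@0/pci@0/pci@8/pci@0/pci@2/SUNW,qlc@0,1/fp@0,0:devctl  NOT CONNECTED
/devices/pci@0/pci@0/pci@8/pci@0/pci@8/SUNW,qlc@0,1/fp@0,0:devctl  NOT CONNECTED
/devices/pci@0/pci@0/pci@8/pci@0/pci@8/SUNW,qlc@0/fp@0,0:devctl    CONNECTED
"""

MARKER = "NOT CONNECTED"


def unconnected_ports(luxadm_output):
    """Return the device paths that luxadm reports as NOT CONNECTED."""
    ports = []
    for line in luxadm_output.splitlines():
        line = line.strip()
        if line.endswith(MARKER):
            # Drop the status text and the padding in front of it.
            ports.append(line[: -len(MARKER)].strip())
    return ports


if __name__ == "__main__":
    for path in unconnected_ports(SAMPLE):
        print("check zoning for:", path)
```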

Friday, June 22, 2012

ARC (Audio Return Channel)


Are you looking to avoid optical cable and make ARC port work on your TV?

Here is my story; hope it saves you some time.
I recently purchased a new Samsung 55D 7000 series TV and connected it to my AV receiver (Onkyo TX SR-604).
This receiver is pretty good; I’ve been using it for the last 8 yrs. It’s 7.1 channels and 630 watts, but unfortunately it only has 2 HDMI inputs and 1 HDMI output.

The TV has these hdmi ports labeled as below
HDMI/DVI 1 IN
HDMI 2 IN (ARC)
HDMI 3 IN
HDMI 4 IN

After connecting
TV HDMI/DVI 1 –-> Receiver HDMI out
Set top box –-> Receiver HDMI in
Blu ray player –-> Receiver HDMI in

Everything worked as expected, except that when starting any application from Smart Hub (Samsung apps) there was no audio. I then connected the TV's optical out to the receiver's optical in, and that fixed the problem. But I didn’t want to run an extra cable for this audio out, because I had already wall-mounted my TV with 2 HDMI cables hidden inside the wall, and an optical cable running over the wall would make for an ugly setup. One other related requirement was to run a slide show from external NAS storage along with background music.

After doing some research, I figured out there is a way to avoid the extra digital optical cable from the TV to the receiver by using the ARC HDMI port. Now the question is how?

This is a good reference for ARC and how to set it up:

“An Audio Return Channel-enabled TV allows you to send audio over an already connected HDMI 1.4 cable. Ordinarily, in order to transfer audio from your source (ie antenna or VIA widgets) to your home theater system you would need a separate audio cable (analog audio or digital optical) going from your TV to the home theater system. With the ARC function, you will be able to send any audio from your TV back to an ARC enabled home theater system and listen to your TV's audio through the home theater system without having to connect an optical cable.”
How to use ARC:
These are the steps that you will need to take to enable ARC (Audio Return Channel)
• Make sure your audio receiver is ARC compatible and your TV is ARC compatible.
• Ensure that you have an HDMI 1.4 cable that is multi-directional
• Use HDMI 1 on your TV and it will need to be plugged into the HDMI OUT of the receiver
• Change Digital Audio on your TV to Dolby Digital
• Turn off the TV internal speakers
• Turn CEC on
• Ensure the receiver is in TV control and is in discoverable mode (You may have to reference the manual for your receiver or contact the manufacturer of your receiver)
• Search for the device under CEC menu
• This will unlock System Audio Control (under CEC), turn to ON and enjoy ARC!


I guess the first most important thing you would need is an AV receiver which supports ARC. Don’t worry much about HDMI cables.

To make that work I ordered a Pioneer VSX-822-K receiver, which is ARC compatible, has 6 HDMI input ports, is network ready, and supports iPhone/iPod and lots of other media. After connecting all the HDMI ports the same way they were connected before, the problem remained the same.

Follow this connection layout,
and make sure you use the TV's ARC port instead of HDMI/DVI.

TV                      RECEIVER             DEVICE
HDMI/DVI 1 IN
HDMI 2 IN (ARC)   -->   HDMI OUT
HDMI 3 IN               HDMI SAT/CBL   -->   Comcast/AT&T set top HDMI OUT
HDMI 4 IN               HDMI BD        -->   Samsung Bluray Player HDMI OUT
                        HDMI GAME
                        HDMI DVR/BDR
                        HDMI VIDEO
                        HDMI DVD


Once all the HDMI cables are connected as above,
enable ARC on the receiver; it seems ARC is disabled by default.


That’s all for enabling audio for Smart Hub using external speakers.
Now, can we also play background music once we start a slideshow? :)

I guess unless Samsung supports multitasking or adds a background music feature to the smart apps, the only option we have is to connect an iPod over USB (front panel) on the receiver while running the slideshow. No, wait! How about AirPlay?

We can use AirPlay to stream music from an iPhone while the slideshow runs.

Monday, May 21, 2012

Alternatives usage


Alternatives lets you switch between several installed versions of a binary using symlinks. Its default administrative directory is /var/lib/alternatives, where it keeps all the metadata.


The most common alternatives commands are:
alternatives --install
alternatives --config
alternatives --remove

usage: alternatives --install <link> <name> <path> <priority>
update-alternatives --install "/usr/bin/java" "java" "/usr/java/default/bin/java" 3

Note that update-alternatives here is just a symlink to alternatives:
lrwxrwxrwx. 1 root root 12 Feb 22 20:04 /usr/sbin/update-alternatives -> alternatives

ls -l /etc/alternatives/java
lrwxrwxrwx 1 root root 35 Jun  8  2010 /etc/alternatives/java -> /usr/lib/jvm/jre-1.4.2-gcj/bin/java


[root@pg ~]# alternatives --config java

There is 1 program that provides 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           /usr/lib/jvm/jre-1.4.2-gcj/bin/java


update-alternatives --install /usr/bin/java java /usr/bin/jade 3

[root@pg ~]# alternatives --config java

There are 2 programs which provide 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           /usr/lib/jvm/jre-1.4.2-gcj/bin/java
   2           /usr/bin/jade

Enter to keep the current selection[+], or type selection number:



[root@pg ~]# alternatives --config java

There are 2 programs which provide 'java'.

  Selection    Command
-----------------------------------------------
*  1           /usr/lib/jvm/jre-1.4.2-gcj/bin/java
 + 2           /usr/bin/jade

Enter to keep the current selection[+], or type selection number:


[root@pg ~]# ls -l /etc/alternatives/java
lrwxrwxrwx 1 root root 13 Apr 24 01:33 /etc/alternatives/java -> /usr/bin/jade
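
The session above changes only the /etc/alternatives/java link; /usr/bin/java itself keeps pointing at it. Here is a minimal sketch of that two-level symlink chain, rebuilt in a scratch directory so it needs neither root nor the alternatives tool. The directory layout and the stand-in "gcj-java" and "jade" scripts are made up for illustration.

```shell
#!/bin/sh
# Sketch only: mimics the alternatives symlink chain in a temp directory.
# Nothing here touches the real /usr/bin or /etc/alternatives.
demo=$(mktemp -d)
mkdir -p "$demo/usr/bin" "$demo/etc/alternatives" "$demo/opt"

# Two stand-in programs that only print their own name (hypothetical).
for prog in gcj-java jade; do
    printf '#!/bin/sh\necho %s\n' "$prog" > "$demo/opt/$prog"
    chmod +x "$demo/opt/$prog"
done

# The fixed link that users run; it always points into etc/alternatives.
ln -s "$demo/etc/alternatives/java" "$demo/usr/bin/java"

# Selecting an alternative repoints only this second link.
ln -sfn "$demo/opt/gcj-java" "$demo/etc/alternatives/java"
first=$("$demo/usr/bin/java")

ln -sfn "$demo/opt/jade" "$demo/etc/alternatives/java"
second=$("$demo/usr/bin/java")

echo "before switch: $first"
echo "after switch:  $second"

rm -rf "$demo"
```

On a real system, `alternatives --display java` shows the same chain along with the registered priorities.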



Why apache shows multiple processes?


The multiple Apache processes we see are there by design, to handle multiple requests. Each of those processes is basically a listener, so that Apache can handle simultaneous traffic more efficiently. (With the prefork MPM each listener is single-threaded, which matches the LWP count of 1 in the listing below; the worker MPM runs multiple threads per process.)

By default, Apache spawns about 10 processes and each process generally takes about 10MB. Assuming all the child processes take the same memory, that is about 100MB for Apache running on a machine.

Example of apache running:
#top -U nobody

Memory: 8064M phys mem, 2620M free mem, 12G total swap, 9550M free swap

   PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 14436 nobody     1  59    0   12M 3272K sleep    0:09  0.00% httpd
 14435 nobody     1  59    0   12M 3232K sleep    0:08  0.00% httpd
 17184 nobody     1  59    0   12M 4488K sleep    0:06  0.00% httpd
 14437 nobody     1  59    0   12M 4152K sleep    0:06  0.00% httpd
 17182 nobody     1  59    0   12M 4040K sleep    0:06  0.00% httpd
 14439 nobody     1  59    0   12M 3272K sleep    0:06  0.00% httpd
 16956 nobody     1  59    0   12M 4032K sleep    0:05  0.00% httpd
 14438 nobody     1  59    0   12M 3272K sleep    0:04  0.00% httpd
 17579 nobody     1  59    0   12M 3208K sleep    0:04  0.00% httpd
 17183 nobody     1  59    0   12M 3144K sleep    0:03  0.00% httpd
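
The 10MB-per-process figure is a rough estimate, so it can help to add up the resident sizes directly. The helper below is a sketch: `sum_rss` is a made-up name, and it assumes a `ps` that accepts the `rss` output column (Linux and Solaris do, but it is not one of the POSIX-mandated columns).

```shell
#!/bin/sh
# Sum resident memory (KB) for every process whose command name contains $1.
# Reads "rss command" pairs on stdin, e.g. from: ps -eo rss,comm
sum_rss() {
    awk -v name="$1" '
        index($2, name) { n++; kb += $1 }
        END { printf "%d processes, %d KB resident\n", n, kb }
    '
}

# Live usage; index() is used because some systems print the full
# path of the command in the comm column.
ps -eo rss,comm | sum_rss httpd
```

For the ten httpd processes in the top listing above, the RES column adds up to 36112 KB, roughly 35MB. Shared pages are counted once per process, so even that overstates the real footprint.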



The defaults in httpd.conf include:

StartServers 8
MinSpareServers 5
User apache


That means start 8 listeners (total 9 processes - root httpd process + 8 servers) and have a minimum of 5 idle listeners at all times (dynamically creating new listeners as necessary). The 'User' directive controls which non-root account the listeners run as (usually 'apache' or 'nobody').
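
To check what a given server is actually configured with, the active directives can be pulled out of the config file. This is a small sketch; the function name is made up, and the config path is an assumption (Red Hat style systems usually keep it at /etc/httpd/conf/httpd.conf).

```shell
#!/bin/sh
# Print the process-related directives from an Apache config file.
# Lines that start with '#' are skipped because the pattern anchors
# the directive name at the start of the line.
show_prefork_settings() {
    grep -E '^[[:space:]]*(StartServers|MinSpareServers|MaxSpareServers|User)[[:space:]]' "$1"
}

# Example (path is an assumption, adjust for your system):
# show_prefork_settings /etc/httpd/conf/httpd.conf
```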

There are lots of other related directives but, unless you have low traffic and are trying to save memory, I wouldn't recommend changing the defaults.

Thursday, April 5, 2012

sar reports different device than the actual storage lun

How do we know how the storage devices are mapped to local devices?

bash-3.00# sar -d 2
SunOS mdc4ps001 5.10 Generic_144488-14 sun4v 04/05/2012 17:36:49
device     %busy  avque  r+w/s  blks/s  avwait  avserv
ssd19          0    0.0      1     178     0.0     4.8
ssd19,a        0    0.0      1     178     0.0     4.8
ssd19,h        0    0.0      0       0     0.0     0.0


Devices exported from the SAN, as listed in the QFS mcf:
/dev/dsk/c6t600A0B800047157000001D0E4EBACAD5d0s0
/dev/dsk/c6t600A0B800047157000001D124EBACF17d0s0
/dev/dsk/c6t600A0B800047157000001D164EBACFECd0s0

output of format
1. c6t600A0B800047157000001D0E4EBACAD5d0
/scsi_vhci/ssd@g600a0b800047157000001d0e4ebacad5
2. c6t600A0B800047157000001D124EBACF17d0
/scsi_vhci/ssd@g600a0b800047157000001d124ebacf17
3. c6t600A0B800047157000001D164EBACFECd0
/scsi_vhci/ssd@g600a0b800047157000001d164ebacfec

Steps:
#1. Look in /etc/path_to_inst for device instance "19"
bash-3.00# cat /etc/path_to_inst | grep 19
"/scsi_vhci/ssd@g600a0b800047157000001d124ebacf17" 19 "ssd"
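
A plain `grep 19` can also match an instance such as 119, or a "19" that happens to appear inside a device GUID. Below is a sketch of a stricter lookup that compares the instance and driver fields exactly. The `inst_to_path` name is made up, and it assumes the three-field layout seen in the line above (quoted path, instance number, quoted driver name).

```shell
#!/bin/sh
# Print the physical path for a driver instance, e.g.: inst_to_path ssd 19
# An optional third argument overrides the file (default /etc/path_to_inst).
inst_to_path() {
    driver=$1
    instance=$2
    file=${3:-/etc/path_to_inst}
    awk -v drv="$driver" -v inst="$instance" '
        {
            d = $3
            gsub(/"/, "", d)        # strip the quotes around the driver name
        }
        $2 == inst && d == drv {
            p = $1
            gsub(/"/, "", p)        # strip the quotes around the path
            print p
        }
    ' "$file"
}

# Example on the Solaris host:
# inst_to_path ssd 19
```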

#2. So the device we are looking for, ssd19, is the one whose name ends with f17