Thursday, October 27, 2016

Application performance is slow

Scenario:

App login time has increased. Some one of the Sales and application support folks reported/complained that login takes login time.

We did confirm that login page sometimes just sits there and times out.

This was reported form all the PODs intermittently and all customers on the POD in general are showing higher login time.

Engineer Team mentioned there is a production_234234xx.js which sometimes takes longer than expected, its about 2MB file which show max of almost 5-6mins to load

Local Dublin Office speed = 70.63mbps download, 71.24mpbs upload
Download 8MB/sec

PA Office = download 17.87mbps
Download speed = 2.1MB/sec


I did few more scenario tests yesterday using the same curl command syntax

1. Download of production.js 3.1MB from (San Jose POD), twice every min (time Oct 9 14:30 - Oct 10 11:04 CDT)

2. Download of production.js 2.6MB from (Sterling POD), twice every min (time Oct 9 19:35 - Oct 10 11:04 CDT)

3. Download of 3.2MB tar file from stage4(ST), this test was without login into the application, twice every min (time Oct 9 19:11 - Oct 10 11:04 CDT)

All these tests were from Softlayer DC Datacenter (198.11.247.15)


1. We did 2470 downloads from San Jose, 87 of them took 6.5 sec rest shows 1.5 sec or under. About 3% with slow response.

2. We did 1789 downloads from Sterling, 49 of them took 5.2 sec rest shows 0.2 sec or under. About 3% again with slow response.

3. We did 1886 downloads from stage4 without using the application, 42 of them took 5.1 sec rest shows around 0.1 sec. About 2% with slow response.


All the output files are attached

This time I traced all the packets initiated from softlayer to stage4. This includes all the packets from test #2 and test #3


Surprisingly/Unfortunately the trace shows that all these are transactions took between 0.12 sec - 2.27 sec. It doesn't have any downloads showing 5 sec duration


Here are the trace files for review

https://drive.google.com/a/mywork.com/file/d/0B7-LikX8hKvKSFd3bjREbmk0Rzg/view?usp=sharing

https://drive.google.com/a/mywork.com/file/d/0B7-LikX8hKvKd0hlN080eGFTOG8/view?usp=sharing


Now, I'm trying to get into the curl results. Here is one example from vv1 downloads

time_namelookup: 5.006
time_connect: 5.076
time_appconnect: 5.273
time_pretransfer: 5.273
time_redirect: 0.000
time_starttransfer: 5.375
num_connects: 1
--------
time_total: 6.338

In all the cases, when we see the time 6.0+ sec.


The "time_namelookup: 5.006" is the one which makes it off the regular trend of 1.5 sec. If we simply remove this namelookup time we have a stable trend.


Now I'm not sure why curl is reporting this huge namelookup time, I will have to do some more digging on this ..


The 3 curl commands are as below:

1. curl "https://mywork-qms.myworkvault.com/ui/resources/scripts/minified/vault.main.production.js?1412277409" -H "Pragma: no-cache" -H "Accept-Encoding: gzip,deflate" -H "Accept-Language: en-US,en;q=0.8" -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36" -H "Accept: /" -H "Referer: https://mywork-qms.myworkvault.com/ui/" -H "Cookie: showNewsAlert=true; timestamp=; tx_regionMarkStyle=B1:#000000:3:0:#ffffff:1:2:1:0; tx_freeMode=region; tx_textMarkStyle=h103; tx_colorIndex=103; prodTS=1412277409; myworkUserSsoSettings=""sp#https://myworksys.okta.com/app/template_saml_2_0/kbzmgd97JZKLWKPZYTZJ/sso/saml""; JSESSIONID=EB1C5CA4AAAAC1C57CBEF9921166C69A.2; TK=07161E0AE282F531970F42BB3214F706ACA7EC30E22A3F03FB3936B093023A780C9D378E5CA012E6017C4DB142D1A421; TS=1412875533682; LT=sso; STS=""1412875533742,1412875534024""" -H "Connection: keep-alive" -H "Cache-Control: no-cache" --compressed -o prod.js -v --trace-time -w "@curltime.txt"


2. curl "https://stage4.myworkvault.com/resources/JMVC/mywork_vault/production_1407268926.js" -H "Pragma: no-cache" -H "Accept-Encoding: gzip,deflate" -H "Accept-Language: en-US,en;q=0.8" -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36" -H "Accept: /" -H "Referer: https://stage4.myworkvault.com/ui/" -H "Cookie: selectedLang=en; myworkUserSsoSettings=""""; JSESSIONID=A2F83783D8F70A357FA8E5B7150F7469.2; TK=9A7FF56D81BE2D50D1114479657B3A43A6B1CC259C7FB5A1F279ECFBF307E94F4711A09495E25A3ACDF9EBC7C3661F14; prodTS=1407268926; showNewsAlert=true; timestamp=" -H "Connection: keep-alive" -H "Cache-Control: no-cache" --compressed -v --trace-time -o stage4app.js -w "@curltime.txt"


3. curl -o curlstage4.tar https://downloadstage4.myworkvault.com/curltest.tar -w "@curltime.txt"



==========

working on stage4 first, changed ttl from 13 mins to 24hrs.

------------

QUESTIONS:

stage4.myworkvault.com, type = A, class = IN

ANSWERS:

 stage4.myworkvault.com

internet address = 309.20.232.135

ttl = 86400

AUTHORITY RECORDS:

ADDITIONAL RECORDS:

------------

Non-authoritative answer:

Name: stage4.myworkvault.com

Address: 309.20.232.135
stage4.myworkvault.com

Server: 10.0.80.11

Address: 10.0.80.11#53

============

We have some good results with higher TTL

As stated before, stage4.vvevavault.com TTL was set for 24hrs. and we also did additional test with mywork-qms.myworkvault.com setting the TTL for 1hr.

#1. Stage4 test ran from Oct 14 11:30 - Oct 15 18:05, about 5508 curl downloads, 3 times every min. It came with 0 slow responses.

#2. QMS test ran from Oct 14 17:03 Oct 15 18:05, about 4508 curl downloads, 3 times every min. It came with 23 slow responses. (around 0.5%)

At this point we can suggest setting up most of the active domain TTL to 24hrs to fix this problem.

However since we do domain migration more occasionally then traditionally done with a url, we should decide what is the best optimal TTL to set so that there is less administrative work on these domain.

=============










No comments: