Difference between revisions of "SitePerformanceMonitoringTools"

Latest revision as of 05:44, 19 October 2007

OurWork (4) SitePerformanceMonitoringTools (Ethan)

What (summary)

Instrumentation that provides a history of performance statistics for each part of the page load pipeline.

Related pages:
- AboutUsPerformanceMonitoring
- SystemMonitoringDashboard

Why this is important

The responsiveness and performance of the site make a big difference in how many pages visitors will view and how often they will come back. A poorly performing site will also wear out our active members, causing some of them to leave.

DoneDone

A Dashboard with Red, Yellow or Green for each pipeline item.
The history for each pipeline item stored in the database.
Definitions of acceptable benchmarks for each pipeline item.
Definitions of acceptable freshness for each pipeline item.
All monitoring meets the definitions for minimum freshness.
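
A minimal sketch of how the latest sample for one pipeline item could be mapped to a dashboard color. The threshold values, the freshness limit, the rule that a stale sample counts as red, and every name below are assumptions made for illustration; this page does not define them.

```python
from dataclasses import dataclass


@dataclass
class Benchmark:
    """Hypothetical limits for one pipeline item (all values are assumptions)."""
    warn_ms: float    # above this the item turns yellow
    fail_ms: float    # above this the item turns red
    max_age_s: float  # a sample older than this is considered stale


def dashboard_color(value_ms: float, age_s: float, bench: Benchmark) -> str:
    """Return "green", "yellow" or "red" for the latest sample of an item."""
    if age_s > bench.max_age_s:
        # Assumption: a stale measurement cannot vouch for current health.
        return "red"
    if value_ms > bench.fail_ms:
        return "red"
    if value_ms > bench.warn_ms:
        return "yellow"
    return "green"


# Example with made-up limits for a DNS lookup probe.
dns_bench = Benchmark(warn_ms=100, fail_ms=500, max_age_s=300)
print(dashboard_color(42, 60, dns_bench))
```

Treating freshness as part of the color keeps a probe that has silently stopped reporting from showing green forever.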

Performance Priorities

View normal page
View random page
Edit click until available
Save click until rendered
Render invalidated frontpage

Instrumentation Steps

End to end on each

Deploy instrumentation boxes in various locations
Determine and instrument the pieces
- MediaWiki profiling
- Raw database queries

Steps to DoneDone

~~Articulate the request pipeline~~
- ~~Identify each request in the pipeline~~
- ~~How to perform / retrieve data for each request~~
~~Aggregate pipeline benchmarks~~
- Tuesday ... Implement the probes
- Wednesday portland devs ... Articulate database schema
- Wednesday portland devs ... Push XML results to central HTTPS server
- Wednesday portland devs ... Remote location / benchmark results stored in database
- The project was revised to do local monitoring only, using Zabbix. A service on nimbus communicates with agents installed on each server, and the results are stored in MySQL.
~~Integrate into monitoring~~
- ~~Dashboard to identify overall health~~
- ~~Notifications via email/paging when a critical problem arises~~
- Available at https://admin.aboutus.org/zabbix/
~~Analyze pipeline benchmarks~~
- ~~XML output for monitoring integration~~
- ~~Graph performance for each location~~
- ~~Detailed graph view on each request~~
- Information available at http://www.aboutus.org/AboutUsPerformanceMonitoring/
~~Define acceptable benchmarks for each request~~

Repositories

cd into the directory you want to check the client out into
git clone nimbus:/opt/git/geophone-client
make a few changes in that directory
git status ... to see what is different
git diff ... to go through the change and make sure you aren't including something accidentally
git add YYYY ... to include modified or new file YYYY in the commit
git commit -m 'Here is why I made the changes I did for this commit'
git push ... to make sure that the remote repository has your changes
gitk ... from within the directory shows the tree of revisions

Pipeline

DNS request - www.aboutus.org & images.aboutus.org

Local resolver / cache

Queries against the resolvers at the remote location provide little insight into the health of the www.aboutus.org site. If the record does not exist in the local resolver cache (or its TTL has expired), the resolver contacts the DNS root servers and then the authoritative servers. If the record already exists in the cache, the resolver responds immediately. If the local resolver does not reply as expected, the issue likely lies with the remote location or possibly the authoritative name servers (or somewhere in between).

Authoritative name server

 ns1.dnscloud.com
 ns2.dnscloud.com

Response time of the authoritative server is critical. It can be measured from any location, though network latency and connectivity will be a factor.

dig www.aboutus.org @ns1.dnscloud.com

dig www.aboutus.org @ns2.dnscloud.com

dig images.aboutus.org @ns1.dnscloud.com

dig images.aboutus.org @ns2.dnscloud.com

Results

www.aboutus.org
- ns1.dnscloud.com: IP address, query time (ms)
- ns2.dnscloud.com: IP address, query time (ms)

images.aboutus.org
- ns1.dnscloud.com: IP address, query time (ms)
- ns2.dnscloud.com: IP address, query time (ms)
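
The two result fields above can be pulled out of the text that dig prints. The following is a rough sketch, not the probe actually deployed: the function name and the sample output (which uses documentation-only addresses) are made up for illustration, and it only looks at the first A record it finds.

```python
import re
from typing import Optional, Tuple


def parse_dig(output: str) -> Tuple[Optional[str], Optional[int]]:
    """Extract the first A-record address and the query time (ms) from dig output."""
    ip = None
    query_ms = None
    for line in output.splitlines():
        if line.startswith(";; Query time:"):
            match = re.search(r"(\d+)\s*msec", line)
            if match:
                query_ms = int(match.group(1))
        elif line.strip() and not line.startswith(";"):
            fields = line.split()
            # Record lines look like: name TTL class type data
            if ip is None and len(fields) >= 5 and fields[3] == "A":
                ip = fields[4]
    return ip, query_ms


# Hypothetical sample in the shape dig prints; the addresses are placeholders.
SAMPLE = """\
;; ANSWER SECTION:
www.aboutus.org.\t300\tIN\tA\t192.0.2.10

;; Query time: 23 msec
;; SERVER: 198.51.100.1#53(198.51.100.1)
"""

print(parse_dig(SAMPLE))
```

Either value comes back as None when dig printed no matching line, for example on a timeout, so the caller can record the probe as failed rather than storing a bogus number.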

IP Connectivity

Network connectivity and latency can be measured with the ping and traceroute utilities. Most connectivity issues are likely to be caused by network problems between the two locations, which we have no control over. In some cases the cause could be a router, switch, or load balancer on the AboutUs side, but such problems will affect all remote locations.

ping -c 5 -i 0.2 -q www.aboutus.org

traceroute -n www.aboutus.org

Results - Five ICMP packets

packet loss (%)
average response time (ms)
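
A sketch of how the two result values could be read from the summary that `ping -q` prints. The function name and the sample text are illustrative assumptions; the patterns are meant to match the Linux (`rtt min/avg/max/mdev`) and BSD (`round-trip min/avg/max/stddev`) summary lines, and other ping implementations may need adjustments.

```python
import re
from typing import Optional, Tuple


def parse_ping_summary(output: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (packet loss in %, average round-trip time in ms) from ping -q output."""
    loss = re.search(r"([\d.]+)% packet loss", output)
    # The average is the second of the slash-separated numbers after the '='.
    rtt = re.search(r"min/avg/max/[a-z]+ = [\d.]+/([\d.]+)/", output)
    return (
        float(loss.group(1)) if loss else None,
        float(rtt.group(1)) if rtt else None,
    )


# Hypothetical sample in the shape printed by ping on Linux.
SAMPLE = """\
--- www.aboutus.org ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 804ms
rtt min/avg/max/mdev = 11.204/12.006/13.117/0.701 ms
"""

print(parse_ping_summary(SAMPLE))
```

When every packet is lost, ping prints no round-trip line, so the average comes back as None while the loss is reported as 100.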

HTTP Frontpage

/index.php and requisite pages

Response time of the frontpage request is critical. Performance relies on a number of factors.

Physical server load
- CPU
- Disk I/O
- Available memory - too little memory causes swapping, which degrades disk I/O performance
- Network throughput

Apache process performance
- CPU usage
- Available threads

Memcached
- Down cache / timeout

Database query
- ?Query for each /index.php request?
- physical DB (slave) server loads
- replication
- MySQL performance

DNS request - images.aboutus.org

Image GET request
- image size
- number of images per page
- NFS server load (disk I/O)
- network throughput

curl --silent --output /dev/null --write-out '%{time_total}\n' http://www.aboutus.org/index.php

Useful --write-out variables: http_code time_total time_namelookup size_download speed_download
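
One possible way to collect all five write-out variables in a single request is sketched below. The helper names are hypothetical, every value is stored as a float for simplicity, and `/dev/null` assumes a Unix-like probe host.

```python
import subprocess
from typing import Dict, List

# The curl --write-out variables listed above, in the order they are requested.
FIELDS = ["http_code", "time_total", "time_namelookup",
          "size_download", "speed_download"]


def curl_command(url: str) -> List[str]:
    """Build a curl call that discards the body and prints one line of metrics."""
    fmt = " ".join("%{" + name + "}" for name in FIELDS) + "\\n"
    return ["curl", "--silent", "--output", "/dev/null",
            "--write-out", fmt, url]


def parse_write_out(text: str) -> Dict[str, float]:
    """Turn the space-separated metrics line into a dict keyed by variable name."""
    values = text.split()
    if len(values) != len(FIELDS):
        raise ValueError("unexpected curl output: %r" % text)
    return {name: float(value) for name, value in zip(FIELDS, values)}


def probe(url: str) -> Dict[str, float]:
    """Run curl against url and return the parsed metrics (needs network access)."""
    result = subprocess.run(curl_command(url), capture_output=True,
                            text=True, check=True)
    return parse_write_out(result.stdout)
```

Calling `probe("http://www.aboutus.org/index.php")` would return a dict such as `{"http_code": 200.0, "time_total": ...}`; times are in seconds as reported by curl, so multiply by 1000 before storing them as milliseconds.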

Acceptable Benchmarks

Max cold-request to fully rendered time for front page

- 0

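HTTP Save

edittime=`curl --location http://www.aboutus.org/index.php?title=ObsidiansAnd.com\&action=edit | grep wpEdittime | awk -F\" '{print $2}'`

curl --location --form wpSave=Save\ page --form wpTextbox1=replacement\ text --form wpEditToken=\\ --form wpEdittime=$edittime http://www.aboutus.org/index.php?title=ObsidiansAnd.com\&action=submit

Warning: this replaces the entire article body with the text in wpTextbox1.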

Results

Request time (ms)

HTTP Render Invalidated

curl http://www.aboutus.org/Wiki -d action=purge

Results

Request time (ms)

Potential Hurdles

False positives
Caching

Questions

Does the pipeline include all of these? Record a history of how long it takes to:

Look up DNS for www.aboutus.org, images.aboutus.org, ... from different parts of the world
Set up a port 80 TCP connection with each of the squal boxes from different parts of the world
Load the frontpage without any client caching
Retrieve a memcache item from each combination of two squal boxes (one client, one memcached server)
Load the core CSS files
Load the core JS files

...
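
For the TCP connection question above, a probe could time the handshake directly. This is only a sketch under assumptions: the function name and default timeout are made up, and when a host name is passed instead of an IP address the measured time also includes the DNS lookup.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int = 80, timeout: float = 5.0) -> float:
    """Time how long it takes to open a TCP connection, in milliseconds."""
    start = time.perf_counter()
    # create_connection raises OSError on refusal or timeout; callers that
    # record history would catch that and store the probe as failed.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

Passing the IP address of each squal box rather than a name keeps the DNS measurement and the TCP measurement separate, which matches how the pipeline items are listed.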

@@ Line 1: / Line 1: @@
-<noinclude><big>[[OurWork]] < [[DevelopmentTeam]] < [[DevelopmentTeamPriorities|Priorities]] < </noinclude>('''10''') [[SitePerformanceMonitoringTools]] ('''[[Ethan]]''') {{JustTinyEditIcon|SitePerformanceMonitoringTools}}<noinclude></big>
+<noinclude><big>[[OurWork]] < [[DevelopmentTeam]] < [[DevelopmentTeamPriorities|Priorities]] < </noinclude><strike>('''[[PairDays|4]]''') [[SitePerformanceMonitoringTools]] ('''[[Ethan]]''') {{JustTinyEditIcon|SitePerformanceMonitoringTools}}</strike><noinclude></big>
 == What (summary) ==
 Instrumentation that provides a history of performance statistics for each part of the page load pipeline.
+Related pages:
+* [[AboutUsPerformanceMonitoring]]
+* [[SystemMonitoringDashboard]]
 == Why this is important ==
@@ Line 8: / Line 12: @@
 == [[DoneDone]] ==
-* A Dashboard with Red, Yellow or Green for each pipeline item.
+* :A Dashboard with Red, Yellow or Green for each pipeline item.
-* The history for each pipeline item stored in the database.
+* :The history for each pipeline item stored in the database.
-* Definitions of acceptable benchmarks for each pipeline item.
+* :Definitions of acceptable benchmarks for each pipeline item.
-* Definitions of acceptable freshness for each pipeline item.
+* :Definitions of acceptable freshness for each pipeline item.
-* All monitoring meets the definitions for minimum freshness.
+* :All monitoring meets the definitions for minimum freshness.
 ==Performance Priorities==
@@ Line 29: / Line 33: @@
 ==Steps to DoneDone==
-* Articulate the request pipeline
+* <strike>Articulate the request pipeline</strike>
-** Identify each request in the pipeline
+** <strike>Identify each request in the pipeline</strike>
-** How to perform / retrieve data for each request
+** <strike>How to perform / retrieve data for each request</strike>
-** Define acceptable benchmarks to each request
+* <strike>Aggregate pipeline benchmarks</strike>
+** <strike>'''Tuesday''' ... Implement the probes</strike>
-* Aggregate pipeline benchmarks
+** <strike>'''Wednesday portland devs''' ... Articulate database schema</strike>
+** <strike>'''Wednesday portland devs''' ... Push XML results to central HTTPS server</strike>
-* Push XML results to central HTTPS server
+** <strike>'''Wednesday portland devs''' ... Remote location / benchmark results stored in database</strike>
-** Articulate database schema
+** Project revised for local monitoring only using Zabbix. Service on nimbus communicates with agents installed on each server and stored in MySQL.
-** Remote location / benchmark results stored in database
+* <strike>Integrate into monitoring</strike>
+** <strike>Dashboard to identify overall health</strike>
-* Analyze pipeline benchmarks
+** <strike>Notifications via email/paging critical problem arises</strike>
-** Graph performance for each location
+** Available at https://admin.aboutus.org/zabbix/
-** Detailed graph view on each request
+* <strike>Analyze pipeline benchmarks</strike>
-** XML output for monitoring integration
+** <strike>XML output for monitoring integration</strike>
+** <strike>Graph performance for each location</strike>
+** <strike>Detailed graph view on each request</strike>
+** Information available at http://www.aboutus.org/AboutUsPerformanceMonitoring/
+* <strike>Define acceptable benchmarks for each request</strike>
-* Integrate into monitoring
+== Repositories ==
-** Dashboard to identify overall health
+* cd into the directory you want to check the client out into
-** Notifications via email/paging critical problem arises
+* git clone nimbus:/opt/git/geophone-client
+* make a few changes in that directory
+* git status ... to see what is different
+* git diff ... to go through the change and make sure you aren't including something accidentally
+* git add YYYY ... to include modified or new file YYYY in the commit
+* git commit -m 'Here is why I made the changes I did for this commit'
+* git push ... to make sure that the remote repository has your changes
+* gitk ... from within the directory shows the tree of revisions
 ==Pipeline==
@@ Line 54: / Line 69: @@
 * Local resolver / cache
-: Queries against the local resolver at the remote location provides little insight into health of the www.aboutus.org site. If the record does not exist in the local resolver cache (or the TTL has expired), the DNS root servers will be contacted and the authoritative servers. If the record already exists in the cache then it will respond immediately. If the local resolver does not reply as expected, then the issue likely lies with the remote location or possibly the authoritative name servers (or somewhere between).
+Queries against the resolvers at the remote location provides little insight into health of the www.aboutus.org site. If the record does not exist in the local resolver cache (or the TTL has expired), the DNS root servers will be contacted and the authoritative servers. If the record already exists in the cache then it will respond immediately. If the local resolver does not reply as expected, then the issue likely lies with the remote location or possibly the authoritative name servers (or somewhere between).
 * Authoritative name server
@@ Line 61: / Line 76: @@
    ns2.dnscloud.com
-: Response time of the authoritative server is critical. This can also be measured from any location, though, network latency and connectivity will be a factor.
+Response time of the authoritative server is critical. This can also be measured from any location, though, network latency and connectivity will be a factor.
+: dig www.aboutus.org @ns1.dnscloud.com
+: dig www.aboutus.org @ns2.dnscloud.com
+: dig images.aboutus.org @ns1.dnscloud.com
+: dig images.aboutus.org @ns2.dnscloud.com
+'''Results'''
+* www.aboutus.org
+: ns1.dnscloud.com
+:: IP Address
+:: Query time (ms)
+: ns2.dnscloud.com
+:: IP Address
+:: Query time (ms)
+* images.aboutus.org
+: ns1.dnscloud.com
+:: IP Address
+:: Query time (ms)
+: ns2.dnscloud.com
+:: IP Address
+:: Query time (ms)
-=== IP connectivity [R] ===
+=== IP Connectivity ===
+Network connectivity and latency can be measured using ping and traceroute utilities. Most issues with connectivity will most likely be caused by network problems between the two locations which we have no control over. In some cases, the issues could be caused by router, switch, or load-balancer issues on the AboutUs side, but these items will affect all remote locations.
-: Network connectivity and latency can be measured using ping and traceroute utilities. Most issues with connectivity will most likely be caused by network problems between the two locations which we have no control over. In some cases, the issues could be caused by router, switch, or load-balancer issues on the AboutUs side, but these items will affect all remote locations.
+: ping -c 5 -i 0.2 -q www.aboutus.org
+: traceroute -n www.aboutus.org
-=== HTTP request - /index.php ===
+'''Results''' - <i>Five ICMP packets</i>
-: Response time of a single /index.php GET request is critical. Performance relies on a number of factors.
+* packet loss (%)
+* average response time (ms)
+=== HTTP Frontpage ===
+<i>/index.php and requisite pages</i>
+Response time of the frontpage request is critical. Performance relies on a number of factors.
 * Physical server load
@@ Line 97: / Line 142: @@
 ** NFS server load (disk I/O)
 ** network throughput
+: curl --silent --write-out %{time_total} --output > /dev/null http://www.aboutus.org/index.php | tail -1
+: <i>write-output variables: http_code time_total time_namelookup size_download speed_download</i>
 ==== Acceptable Benchmarks ====
@@ Line 105: / Line 153: @@
 ** 3 < t sec is unacceptable
-=== HTTP search ===
+'''Results'''
+* Request time (ms)
+=== HTTP Search ===
+Response time for search results to be displayed displayed.
+: curl --location --form auSearch=aboutus.org http://www.aboutus.org/Special:AboutUsSearch
-: Response time for search results to be displayed displayed.
+'''Results'''
+* Request time (ms)
-: curl -L -F auSearch=aboutus.org http://www.aboutus.org/Special:AboutUsSearch
+=== HTTP Random ===
+Response time for a random page to be displayed.
+: curl --location --silent  http://www.aboutus.org/Special:Random
-=== HTTP random page ===
+'''Results'''
+* Request time (ms)
-: Response time for a random page to be displayed.
+=== HTTP/HTTPS Authentication ===
+Measure the time it takes for a user to log into the www.aboutus.org site.
-=== HTTP/HTTPS user login ===
+: curl --location --cookie-jar - --form wpName=user\ name --form wpPassword=password http://www.aboutus.org/index.php?title=Special:Userlogin\&action=submitlogin\&type=login\&returnto=Wiki
+: curl --location --insecure --cookie-jar - --form wpName=user\ name --form wpPassword=password https://www.aboutus.org/index.php?title=Special:Userlogin\&action=submitlogin\&type=login\&returnto=Wiki
-: Measure the time it takes for a user to log into the www.aboutus.org site.
+'''Results'''
+* Request time (ms)
-:  curl -Lc - -F wpName=user\ name -F wpPassword=password http://www.aboutus.org/index.php?title=Special:Userlogin\&action=submitlogin\&type=login\&returnto=Wiki
+=== HTTP Edit ===
-:  curl -Lkc - -F wpName=user\ name -F wpPassword=password https://www.aboutus.org/index.php?title=Special:Userlogin\&action=submitlogin\&type=login\&returnto=Wiki
+: http://www.aboutus.org/index.php?title=domain.com&action=edit
-=== HTTP page edit ===
+'''Results'''
+* Request time (ms)
-=== HTTP edit page save ===
+=== HTTP Save ===
+: edittime=`curl --location http://www.aboutus.org/index.php?title=ObsidiansAnd.com\&action=edit | grep wpEdittime | awk -F\" '{print $2}'`<p>
+: curl --location --form wpSave=Save\ page --form wpTextbox1=replacement\ text --form wpEditToken=\\ --form wpEdittime=$edittime http://www.aboutus.org/index.php?title=ObsidiansAnd.com\&action=submit
+: <i>warning: will replace all article body with text in wpTextbox1</i>
-=== Render invalidated frontpage ===
+'''Results'''
+* Request time (ms)
+=== HTTP Render Invalidated ===
+: curl http://www.aboutus.org/Wiki -d action=purge
+'''Results'''
+* Request time (ms)
 ==Potential Hurdles==
@@ Line 145: / Line 216: @@
+</noinclude>
 [[Category:DevelopmentTeamProject]]
-</noinclude>
