User:GreenReaper/MakingMediaWikiFly


 * Introduction
 * Speed matters
 * http://www.flickr.com/photos/crucially/3716344792/
 * < 1 sec or bust for usability
 * MediaWiki can be slow...but it doesn't have to be!
 * Less is more - fewer hits mean the same server can take a higher load
 * It's _not_ hard
 * The problem
 * Wikimedia has a whole server farm: http://meta.wikimedia.org/wiki/Server_layout_diagrams
 * You can afford (part of?) one server
 * Hard memory, CPU, bandwidth/transfer limits
 * You gotta squeeze every byte
 * Code, data, web caches
 * The idea: deflect expensive requests to cache
 * APC - vital speedup for code and data
 * Share one MediaWiki code base between your sites, skip on-disk message cache files, and tune the APC user cache TTL to more than a day
 * Build APC with spin locking or (failing that) pthread mutexes - avoid file/IPC locking
 * Web cache - Squid or (better?) Varnish; either serves cached pages far faster than Apache (see the LocalSettings.php sketch below)
 * DB query cache? Maybe. Some results already in APC.
 * Don't use memcached - APC is fine for a single server
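
A minimal LocalSettings.php sketch of the caching setup above, assuming APC is installed and a Squid or Varnish instance sits in front of Apache; the cache address and TTL are illustrative only. The APC tuning itself (apc.stat, apc.ttl, apc.user_ttl) lives in php.ini, not here.

    <?php
    # LocalSettings.php (sketch) - single-server caching with APC + a web cache.
    # CACHE_ACCEL keeps MediaWiki's object, message and parser caches in the
    # PHP accelerator (APC here), so no memcached daemon is needed.
    $wgMainCacheType    = CACHE_ACCEL;
    $wgMessageCacheType = CACHE_ACCEL;
    $wgParserCacheType  = CACHE_ACCEL;
    # Skip the on-disk local message cache files; APC already holds the messages.
    $wgUseLocalMessageCache = false;

    # Let Squid/Varnish serve whole pages to anonymous visitors, and tell
    # MediaWiki where the cache lives so it can send purges.
    $wgUseSquid     = true;
    $wgSquidServers = array( '127.0.0.1' );  # assumed: cache on the same box
    $wgSquidMaxage  = 86400;                 # s-maxage for anonymous page views
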
 * Reducing and spreading requests
 * Latency is the number one killer - only ~4 KB fits in the initial transfer, against a ~17 KB receive window
 * HTTP compression - the CPU cost is minimal compared with the latency saved, and the compressed output gets cached
 * Use long cache times and ETags (ignore YSlow's advice to drop ETags - it only matters on multi-server setups)
 * Share skin images and CSS between sites too
 * Cut headers, comments, merge CSS/JS where appropriate
 * Cuts took us from 1.6 KB to 1.06 KB in the headers - saved on every page
 * Consider CSS sprites for homepage images
 * Minification of CSS/JS and image recompression
 * Huge win on first page load = first impression
 * Handheld devices may only cache small files
 * Consider Extension:Minify for user CSS/JS, but check CPU
 * Don't obfuscate for further wins - heavy client CPU penalty
 * Use indexed PNGs for images with fewer than 256 colors (our logo is 2 KB - you'd never know)
 * Try Google Page Speed - it will compress images for you
 * Multiple subdomains as a virtual CDN - spread the load (see the sketch after this group)
 * Work around per-host HTTP connection limits
 * Don't overdo it - two to four max
 * Less necessary with newer browsers
 * Try to keep the image domain cookie-free (second best: use a pool)
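
A hedged sketch of the virtual-CDN idea, assuming the extra host names below (all placeholders) simply point back at the same server: skins and uploads come from cookie-free static subdomains, while session cookies stay pinned to the wiki's own host.

    <?php
    # LocalSettings.php (sketch) - spread static requests over extra subdomains.
    # Host names are placeholders; they can just be aliases of the same machine.
    $wgStylePath  = 'http://static.example.org/w/skins';    # skin CSS/JS/images
    $wgUploadPath = 'http://images.example.org/w/images';   # uploaded files
    $wgLogo       = 'http://images.example.org/w/images/wiki-logo.png';
    # Scope cookies to the wiki host so they never ride along with requests
    # to the static subdomains.
    $wgCookieDomain = 'www.example.org';

Two or three such hosts are plenty; each extra name costs a DNS lookup, and newer browsers already open more connections per host.
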
 * Summary
 * Speed matters: fast pages = more users and more edits
 * Cache everything, client and server side
 * Cut what you don't need, shrink what you do
 * Spread requests across multiple domains
 * Measure - Webalizer, APC, profiler, MRTG ...
 * Bonus tips
 * Block robots from /w/ and Special:Random
 * Use HTML5 to trim the doctype and drop optional markup (soon in WP)
 * Tune your server's TCP stack for a larger initial congestion window
 * Use HTCP to purge the web cache (settings sketch at the end of this list)
 * Are features really necessary?
 * We cut buttons, headers, and the http: prefix in external/interwiki/language links
 * Broke (and patched) MediaWiki multiple times + several web crawlers :-)
 * Double-check output for unnecessary comments (parser debug output!)
 * Put logs, the database and the web server on separate hard disks if available
 * Profiling: heavy extensions like DPL and Google Maps can be a problem
 * Use DPL's dplcache=XXX and allowcachedresults=yes parameters
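
Two of these tips map directly onto LocalSettings.php; a sketch follows, assuming a MediaWiki version new enough to offer $wgHtml5 and a front-end cache that listens for HTCP. The multicast address is only an example.

    <?php
    # LocalSettings.php (sketch) - HTML5 output and HTCP cache purging.
    $wgHtml5 = true;   # shorter doctype; optional markup can be dropped

    # Send HTCP CLR packets on edits instead of slower HTTP PURGE requests.
    $wgUseSquid             = true;
    $wgSquidServers         = array( '127.0.0.1' );
    $wgHTCPMulticastAddress = '239.128.0.112';   # example multicast group
    $wgHTCPPort             = 4827;
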
 * Beyond one server
 * http://meta.wikimedia.org/wiki/Server_layout_diagrams
 * memcached on separate machine[s] (settings sketch at the end of this list)
 * Web cache on different machines, even on a different continent
 * Dedicated servers for search [Lucene, Sphinx] and images [nginx, lighttpd?]
 * Database server on separate machine, add slaves
 * Dedicate one Apache to a popular language
 * Store revision text in a secondary database to reduce load on the primary
 * Concatenate and compress that revision text further
 * More++ - see Domas' presentations
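
A sketch of the multi-server settings mentioned above; every host name, database name and password is a placeholder, and the array keys should be checked against your MediaWiki version.

    <?php
    # LocalSettings.php (sketch) - growing beyond one machine.
    # Shared object cache on dedicated memcached boxes:
    $wgMainCacheType    = CACHE_MEMCACHED;
    $wgMemCachedServers = array( '10.0.0.10:11211', '10.0.0.11:11211' );

    # One writable master plus read-only slaves ('load' => 0 keeps reads off
    # the master; writes always go to it):
    $wgDBservers = array(
        array( 'host' => 'db-master', 'dbname' => 'wikidb', 'user' => 'wiki',
               'password' => 'secret', 'type' => 'mysql', 'load' => 0 ),
        array( 'host' => 'db-slave1', 'dbname' => 'wikidb', 'user' => 'wiki',
               'password' => 'secret', 'type' => 'mysql', 'load' => 1 ),
    );

    # Keep bulky revision text in a secondary "external store" database;
    # maintenance/storage/compressOld.php can then concatenate and gzip
    # old revisions to shrink it further.
    $wgDefaultExternalStore = array( 'DB://cluster1' );
    $wgExternalServers['cluster1'] = array(
        array( 'host' => 'text-db', 'dbname' => 'textdb', 'user' => 'wiki',
               'password' => 'secret', 'type' => 'mysql', 'load' => 1 ),
    );
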