User:GreenReaper/MakingMediaWikiFly
From WikiFur, the furry encyclopedia.
- Introduction
- Speed matters
- http://www.flickr.com/photos/crucially/3716344792/
- < 1 sec or bust for usability
- MediaWiki can be slow, but it doesn't have to be!
- Less is more - fewer hits mean the server can take a higher load
- It's _not_ hard
- The problem
- Wikimedia has http://meta.wikimedia.org/wiki/Server_layout_diagrams
- You can afford: (part of?) one server
- Hard memory, CPU, bandwidth/transfer limits
- You gotta squeeze every byte
- Code, data, web caches
- The idea: deflect expensive requests to cache
- APC - vital speedup for code and data
- Share code between wikis, no message cache files, tune the APC user TTL (apc.user_ttl) to > 1 day
- Use spin locks or (failing that) pthread mutexes for APC locking - avoid file/IPC locking
- Web cache - Squid or (better?) Varnish. Faster than Apache.
- DB query cache? Maybe. Some results already in APC.
- Don't use memcached - APC is fine for a single server
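
The "deflect expensive requests to cache" idea can be sketched as a small cache-aside helper with a time-to-live. This is only an illustration in Python, not how APC or MediaWiki implement it; `TTLCache`, `get_or_compute`, and `render_page` are hypothetical names invented for the sketch.

```python
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (a stand-in for a user cache)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without waiting
        self.store = {}     # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        """Return the cached value, or run compute() once and cache the result."""
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = compute()
        self.store[key] = (now + self.ttl, value)
        return value


def render_page(title):
    # Hypothetical stand-in for an expensive parse/render step.
    return "<h1>" + title + "</h1>"


cache = TTLCache(ttl_seconds=86400)  # one day, echoing the "> 1 day" TTL advice
first = cache.get_or_compute("Main_Page", lambda: render_page("Main_Page"))
second = cache.get_or_compute("Main_Page", lambda: render_page("Main_Page"))
```

The second call is served from the cache, so the expensive step runs once per TTL period rather than once per request.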
- Reducing and spreading requests
- Latency is the number one killer - 4 KB initial transfer, 17 KB receive window
- HTTP compression - CPU cost minimal vs. latency, and cached
- Use long cache times and ETags (ignore YSlow there)
- Share skin images and CSS between sites too
- Cut headers, comments, merge CSS/JS where appropriate
- Cuts took the headers from 1.6 KB to 1.06 KB - saved on every page
- Consider CSS sprites for homepage images
- Minification of CSS/JS and image recompression
- Huge win on first page load = first impression
- Handheld devices may only cache small files
- Consider Extension:Minify for user CSS/JS, but check CPU
- Don't obfuscate for further wins - heavy client CPU penalty
- Indexed PNGs with 256 colors or fewer (our logo is 2 KB, you'd never know)
- Try Google Page Speed; it will compress images for you
- Multiple subdomains as virtual CDN - spread the load
- Subvert HTTP connection limits
- Don't overdo it - two to four max
- Less necessary with newer browsers
- Try to keep image domain cookie-free (second best: use pool)
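
A minimal sketch of the "multiple subdomains as virtual CDN" point, assuming two image hosts: each path is hashed so that it always maps to the same host, which keeps one cacheable URL per file. The host names and the `image_url` helper are hypothetical, not part of MediaWiki.

```python
import zlib

# Hypothetical host names; the outline suggests two to four at most.
IMAGE_HOSTS = ["img1.example.org", "img2.example.org"]


def image_url(path, hosts=IMAGE_HOSTS):
    """Pick a host deterministically from the path so the URL never changes."""
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return "http://" + hosts[index] + path


logo = image_url("/images/logo.png")
```

Choosing the host at random per request would spread load too, but the same image would then appear under several URLs and defeat browser and web caches.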
- Summary
- Speed matters: fast pages = more users, edits
- Cache everything, client and server side
- Cut what you don't need, shrink what you do
- Spread requests across multiple domains
- Measure - Webalizer, APC, profiler, MRTG, ...
- Bonus tips
- Block robots from /w/ and Special:Random
- HTML5 to trim the doctype and cut optional tags - like <head> (soon in Wikipedia)
- Tune your server with a large congestion window
- Use HTCP to purge the web cache
- Are features really necessary?
- We cut buttons, headers, and the http: prefix in external/interwiki/language links
- Broke (and patched) MediaWiki multiple times + several web crawlers :-)
- Double-check output for unnecessary comments (parser debug output!)
- Separate hard disks for logs, DB, web server if available
- Profiling: Heavy extensions like DPL, Google Maps can be a problem
- Use dplcache=XXX and allowcachedresults=yes
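
The robots tip above can be checked before deploying. The sketch below uses Python's standard `urllib.robotparser` to confirm that a candidate robots.txt blocks `/w/` and Special:Random while leaving articles crawlable. The `/wiki/` article prefix, the `ExampleBot` user agent, and the `allowed` helper are assumptions made for the example; adjust them to the wiki's real URL layout.

```python
from urllib.robotparser import RobotFileParser

# Candidate rules; the /wiki/ prefix is an assumed article path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
Disallow: /wiki/Special:Random
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path):
    """True if a generic crawler may fetch the given path under these rules."""
    return parser.can_fetch("ExampleBot", path)


checks = {
    path: allowed(path)
    for path in ("/wiki/Main_Page", "/w/index.php", "/wiki/Special:Random")
}
```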
- Beyond one server
- http://meta.wikimedia.org/wiki/Server_layout_diagrams
- memcached on separate machine(s)
- Web cache on different machines, different continent
- Dedicated server for search [Lucene, Sphinx] and images [nginx, lighttpd?]
- Database server on separate machine, add slaves
- Dedicate one Apache to a popular language
- Store revision text in secondary database, reduce primary load
- concatenate and compress revision text further
- More++ - see Domas' presentations
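
Why concatenating revision text before compressing helps can be shown with a small experiment: successive revisions of a page share most of their content, so compressing them together lets the compressor reuse that shared text. This is only a sketch with made-up revision data and plain `zlib`; it is not MediaWiki's actual storage format.

```python
import zlib

# Hypothetical revisions: ten edits of one page that differ only in the last line.
base = " ".join("Sentence %d about caching and page speed." % n for n in range(40))
revisions = [(base + " Edit number %d." % i).encode("utf-8") for i in range(10)]

# Option 1: compress each revision on its own.
separate = sum(len(zlib.compress(r)) for r in revisions)

# Option 2: join the revisions (NUL never occurs in this text) and compress once.
blob = zlib.compress(b"\x00".join(revisions))
combined = len(blob)

# The individual revisions can still be recovered from the combined blob.
restored = zlib.decompress(blob).split(b"\x00")
```

The trade-off is that reading one old revision now means decompressing the whole group, which is usually acceptable because old revisions are rarely read.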