The SEO overlords say “Thou shalt provide links on your 404 page”, or something to that effect. Consider this quick tip simply a word of caution for you power users.
In most cases, a WordPress 404 page is not cached by your performance plugin of choice, or even by Varnish. A typical 404 page template may look something like this:
wp_list_pages( 'title_li=' );
wp_list_categories( 'orderby=name&title_li=' );
wp_list_authors( 'exclude_admin=0&optioncount=1' );
wp_get_archives( 'type=monthly' );
wp_get_archives( 'type=postbypost' );
All well and good, right? 95% of the time, no harm, no foul. Except, possibly, in the following two circumstances:
- Your site has 500, 1000, or even 10,000 posts/authors/categories/tags etc.
- Your permalinks get botched by a plugin, and everything but your homepage is 404’ing. (We see this at Page.ly from time to time; some plugins just love to rewrite permalinks — poorly.)
Here is the cascading effect of multiple 404 pages on a big blog under a large amount of traffic:
- the web server opens a connection,
- pings MySQL to make these queries (grabs 242 pages, 634 categories, 25 authors, 6325 posts…),
- holds open the connection while MySQL burns,
- more connections to the web server stack up, waiting,
- New Zealand falls into the ocean.
But wait, there must be an easy fix, you say! Indeed there is: use the limit, depth, number, and other arguments, which will typically add a LIMIT or a more narrowly focused SELECT to the MySQL queries. This will help a 404 page render much faster.
So we would modify our list above to look like this:
wp_list_pages( 'title_li=&depth=1' );
wp_list_categories( 'orderby=name&title_li=&depth=1&number=10' );
wp_list_authors( 'exclude_admin=0&optioncount=1&number=10' );
wp_get_archives( 'type=monthly&limit=10' );
wp_get_archives( 'type=postbypost&limit=10' );
The W3 Total Cache plugin also has a setting to handle 404s: it lets the web server serve up its default (and less attractive) 404 page, without the pain of loading the WordPress stack.
And of course, always try to identify your 404s through your access logs and 301 redirect them to a working page.
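If you want a rough starting point for digging through those logs, here is a minimal sketch in plain PHP (no WordPress required). It assumes your web server writes the common or combined log format. The function name top_404_paths() and the log path in the usage note are made up for illustration, so adjust both to your setup.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): count how often each
 * requested path returned a 404, and return the most frequent ones.
 * Assumes lines in the common/combined access log format, e.g.
 * 203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /old/ HTTP/1.1" 404 512
 */
function top_404_paths( array $lines, int $limit = 10 ): array {
    $counts = array();

    foreach ( $lines as $line ) {
        // Capture the request path and the status code that follows the quoted request.
        if ( preg_match( '/"[A-Z]+ (\S+) [^"]*" (\d{3})\b/', $line, $m ) && '404' === $m[2] ) {
            $counts[ $m[1] ] = ( $counts[ $m[1] ] ?? 0 ) + 1;
        }
    }

    // Highest hit count first, then keep only the top $limit paths.
    arsort( $counts );
    return array_slice( $counts, 0, $limit, true );
}
```

To try it, read your log into an array, for example with file( '/var/log/nginx/access.log', FILE_IGNORE_NEW_LINES ) (that path is an assumption), and pass the result to top_404_paths(). The paths at the top of the list are the first candidates for a 301 redirect.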