High-Performance
Database Queries
in WordPress

Matthew Boynes / @senyob / alley.co

https://mboynes.github.io/wc-portland-2018/

  • We are a full-service digital agency
  • WordPress.com VIP partner
  • Hiring!

What makes a query fast or slow?

The most basic reason a query doesn’t perform well is because it’s working with too much data. Some queries just have to sift through a lot of data and can’t be helped. That’s unusual, though; most bad queries can be changed to access less data.

High Performance MySQL, 3rd Edition by Baron Schwartz; Peter Zaitsev; Vadim Tkachenko

What makes a query fast or slow?

What is the most basic way to speed up a database query?

Reduce the amount of data the database has to read.

Indexes

(Specifically, B-tree indexes...)

  • Indexes reduce the amount of data the server has to examine.
  • Indexes help the server avoid sorting and temporary tables.
  • Indexes turn random I/O into sequential I/O.

High Performance MySQL, 3rd Edition by Baron Schwartz; Peter Zaitsev; Vadim Tkachenko

Indexes

Cons

  • Speed of write operations
  • Storage space
  • Memory
  • Flexibility

Indexes in WordPress

wp_posts

Keyname Type Field
PRIMARYPRIMARYID
post_name INDEX post_name
type_status_date INDEX post_type
post_status
post_date
ID
post_parent INDEX post_parent
post_author INDEX post_author

Indexes in WordPress

wp_postmeta

Keyname Type Field
PRIMARYPRIMARYmeta_id
post_id INDEX post_id
meta_key INDEX meta_key

Indexes in WordPress

wp_term_relationships

Keyname Type Field
PRIMARYPRIMARYobject_id
term_taxonomy_id
term_taxonomy_idINDEXterm_taxonomy_id

Joins

  • Are they always bad? No!
  • But they can be
  • JOIN order is important, as is proper indexing
  • Many JOINs can cause a lot of CPU overhead

Archive Queries

  • Filters by post type, post status, orders by date
  • Remember that type_status_date index?

Archive Queries


						SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
						FROM wp_posts
						WHERE 1=1
							AND wp_posts.post_type = 'post'
							AND (wp_posts.post_status = 'publish'
							OR wp_posts.post_status = 'private')
						ORDER BY wp_posts.post_date DESC
						LIMIT 0, 10
						

~73ms with 100k posts

Archive Queries

Why is this so slow?


						> EXPLAIN SELECT ... \G
						*************************** 1. row ***************************
						           id: 1
						  select_type: SIMPLE
						        table: wp_posts
						         type: range
						possible_keys: type_status_date
						          key: type_status_date
						      key_len: 164
						          ref: NULL
						         rows: 49702
						        Extra: Using where; Using index; Using filesort
						

4ms with order/limit removed

Archive Queries

  • The query itself is fast, sorting it is slow.
  • Can we save time by sorting it in PHP? I'm glad you asked...
  • It takes longer just to deliver the data over a local connection.
  • All told, it takes about 14x longer!

Taxonomy Queries

  • Typical tax query uses indexed columns
  • Basic tax query uses single JOIN
  • JOIN happens on indexed column

Taxonomy Queries


						SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
						FROM wp_posts
						LEFT JOIN wp_term_relationships
							ON (wp_posts.ID = wp_term_relationships.object_id)
						WHERE 1=1
							AND ( wp_term_relationships.term_taxonomy_id IN (2) )
							AND wp_posts.post_type = 'post'
							AND (wp_posts.post_status = 'publish'
							OR wp_posts.post_status = 'private')
						GROUP BY wp_posts.ID
						ORDER BY wp_posts.post_date DESC
						LIMIT 0, 10
						

~200ms with 100k posts in the term

Taxonomy Queries

Multiple Taxonomies


						SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
						FROM wp_posts
						LEFT JOIN wp_term_relationships
							ON (wp_posts.ID = wp_term_relationships.object_id)
						LEFT JOIN wp_term_relationships AS tt1
							ON (wp_posts.ID = tt1.object_id)
						WHERE 1=1
							AND ( wp_term_relationships.term_taxonomy_id IN (2)
							AND tt1.term_taxonomy_id IN (9) )
							AND wp_posts.post_type = 'post'
							AND (wp_posts.post_status = 'publish'
							OR wp_posts.post_status = 'private')
						GROUP BY wp_posts.ID
						ORDER BY wp_posts.post_date DESC
						LIMIT 0, 10
						

~1ms with 100k posts in one term and 1 post in the other

Meta Queries

  • Typical meta query leverages indexed columns (meta_key, post_id)
  • JOIN happens on indexed column
  • meta_value is not indexed (out-of-the-box)

Meta Queries


						SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
						FROM wp_posts
						INNER JOIN wp_postmeta
							ON ( wp_posts.ID = wp_postmeta.post_id )
						WHERE 1=1
							AND ( ( wp_postmeta.meta_key = 'city'
							AND wp_postmeta.meta_value = 'portland' ) )
							AND wp_posts.post_type = 'post'
							AND (wp_posts.post_status = 'publish'
							OR wp_posts.post_status = 'private')
						GROUP BY wp_posts.ID
						ORDER BY wp_posts.post_date DESC
						LIMIT 0, 10
						

~410ms with 99,999 matching posts

~340ms with 1 matching post

Meta Queries

A query is either fast or slow based on uniqueness of the meta key

When there's only 1 post with the 'city' key: 1ms

Planning and Data Architecture

  • Know Thy Data
  • ... especially tomorrow's data
  • Working on an open-source plugin? Assume the worst.

Search Queries

  • Simply put: not scalable
  • For large sites, use something external

Refactoring

Add Date Limitations

  • Example: Widget that shows X latest posts
  • Add date query with a safe range

Add Date Limitations


						$query = new \WP_Query(
							[
								/* ... */
								'date_query' => [
									[
										'after' => '-6 months',
									],
								],
							]
						);
						

Add Index to meta_value

Courtesy of WordPress.com VIP

  • Index of meta_key and first 100 chars of meta_value

Creative Meta Keys

  • Limit cardinality
  • Example: Post relationships
  • Old: [key: related_post, value: 12345]
  • New: [key: related_post_12345, value: 👍]

Caching

Transients API

Caching


						$cache_key = 'popular_posts';
						$post_ids = get_transient( $cache_key );
						if ( false === $post_ids ) {
							$post_ids = get_posts(
								[
									/* ... */
									'fields' => 'ids',
								]
							);
							set_transient( $cache_key, $post_ids, 10 * MINUTE_IN_SECONDS );
						}
						$posts = array_map( 'get_post', $post_ids );
						

Caching

Ideally, cache indefinitely and only clear when the data changes


						function clear_popular_posts( $post_id, $post ) {
							// Some logic determining if the cache should clear.
							delete_transient( 'popular_posts' );
						}
						add_action( 'clean_post_cache', 'clear_popular_posts', 10, 2 );
						

Pre-warming Caches

  • Don't want an unlucky user to warm the cache?
  • Kick off an async task using WP-Cron

Pre-warming Caches


						add_action(
							'save_post',
							function() {
								wp_schedule_single_event( time() + 5, 'warm_cache' );
							}
						);

						add_action(
							'warm_cache',
							function() {
								// Run laborious query and cache results.
							}
						);
						

Convert Meta Queries to Tax Queries

  • Meta may make more sense as taxonomy terms for querying
  • Don't let admin presentation influence storage needs
  • Hidden taxonomies!

Elasticsearch

  • Sometimes, MySQL just isn't well-suited for the job!
  • Elasticsearch is a dedicated search engine
  • ... not just for keyword searches
  • SearchPress and ES_WP_Query

The End


Want to become a better developer?
We're hiring.

alley.co/careers