OPcache, APC, memcached, browser cache, MySQL query cache: what, why and why not?

What is caching?

Caching, generally speaking, means temporarily storing recently used data in memory for faster retrieval. A cache usually has a fixed allocated size. When requested data is found in the cache, that is called a cache hit, and retrieval is much faster. When it is not found, it must be loaded or calculated (which is slower), and that is called a cache miss. Many caching mechanisms keep the most recently used data on top of a stack and push the least recently used data out once new data comes in, so data that is used often tends to stay in the cache.
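To make the hit/miss/eviction idea concrete, here is a minimal sketch of a least-recently-used cache in Python. The class name LRUCache, the capacity of 2 and the square loader are made up for illustration; real caches (memcached, OPcache and so on) are far more elaborate.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny least-recently-used cache with a fixed capacity (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest entry last
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        """Return the cached value, or call load(key) on a miss and store the result."""
        if key in self.items:
            self.hits += 1
            self.items.move_to_end(key)  # mark as most recently used
            return self.items[key]
        self.misses += 1
        value = load(key)  # the slow path: load or calculate
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry
        return value


def square(n):
    return n * n


cache = LRUCache(capacity=2)
cache.get(2, square)  # miss
cache.get(3, square)  # miss
cache.get(2, square)  # hit, 2 becomes the most recently used key
cache.get(4, square)  # miss, the cache is full so 3 is evicted
```

After these four calls the cache has recorded one hit and three misses, and only the keys 2 and 4 are still stored.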

Types of caching mechanisms

While there are different types of caching mechanisms and different scenarios in which they are used, only a few will be explained here (relevant to web application server performance).

Browser cache

Browser cache is a mechanism implemented in web browsers for caching files (images, css, js, html and other documents) on a user’s machine. By loading content from the browser’s cache, server bandwidth is saved and latency is reduced. Most of the time browser caching is used for images, css, flash and js, but it can also be applied to html and other documents if the data is rarely updated or if it is updated at fixed time intervals. A few important points about browser cache:

  • if used in the wrong way (in a website that dynamically generates data), it can serve outdated content to users
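  • if a long expiration period is set (like one year) for a resource (like css), and the content of the resource changes, users who still have the old copy cached will keep getting it until it expires, so pages using it may be displayed incorrectly
  • browser caching also applies to secure (HTTPS) content, as long as the response headers allow it; only some older browsers refused to store HTTPS content on disk unless it was explicitly marked with Cache-Control: public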

Browser caching can be enabled from within Apache using mod_expires (full documentation for this module can be found here), which is probably the best way to set the cache lifetime for images, css, js and similar resources. Cache rules can also be set dynamically in PHP by using the header() function (see this for more info). The lifetime of the cache in HTTP headers can be set using Expires (an absolute date, more widely supported) or Cache-Control (introduced in HTTP/1.1, giving more functionality). A great article on browser cache can be found here, or you can read about browser caching in the HTTP specification here.
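As a rough illustration of what the two headers look like for the same lifetime, here is a small Python sketch. The function name cache_headers and the one-hour lifetime are assumptions made for the example; in practice mod_expires or PHP’s header() would emit these for you.

```python
from email.utils import formatdate


def cache_headers(max_age_seconds, now):
    """Build Expires and Cache-Control headers for a given lifetime.

    `now` is a Unix timestamp, passed in so the result is reproducible.
    """
    return {
        # HTTP/1.1 style: a lifetime in seconds
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Older style: an absolute expiry date in GMT
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }


# One hour of caching, counted from the Unix epoch to keep the output fixed
headers = cache_headers(max_age_seconds=3600, now=0)
for name, value in headers.items():
    print(f"{name}: {value}")
```

Cache-Control expresses the lifetime as a number of seconds, while Expires needs a full date, which is one reason the former is easier to get right.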

memcached

memcached (http://www.memcached.org/) is a distributed data caching system. It caches data in RAM for faster retrieval (using key -> value pairs), rather than calculating it or loading it from files or database queries each time. The memcached service itself is a separate daemon, and there are two PHP extensions that act as clients for it:

Memcache:

Memcache module provides handy procedural and object oriented interface to memcached, highly effective caching daemon, which was especially designed to decrease database load in dynamic web applications.

The Memcache module also provides a session handler (memcache).

Memcached:

memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

This extension uses the libmemcached library to provide an API for communicating with memcached servers. It also provides a session handler (memcached).

You can find installation/setup instructions along with the documentation and usage examples in the links shown above. memcached should be used for boosting the performance of frequent or complex queries (so don’t try replacing all queries with memcached, because memcached is not a database). And since memcached uses RAM pretty much the same way a database does, having a well designed schema and optimized queries is a step you should take before deciding where memcached should be used.
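The usual way to put memcached in front of an expensive query is to check the cache first and only hit the database on a miss. Here is a minimal Python sketch of that pattern. FakeMemcached is a hypothetical in-process stand-in with a get/set interface, not a real client library, and the key name and the 60 second lifetime are assumptions.

```python
import time


class FakeMemcached:
    """Hypothetical in-process stand-in for a memcached client (get/set with a TTL)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.monotonic() + ttl)


def cached_query(cache, key, run_query, ttl=60):
    """Try the cache first, fall back to the database on a miss."""
    value = cache.get(key)
    if value is None:
        value = run_query()  # the slow path
        cache.set(key, value, ttl)
    return value


calls = []


def expensive_report():
    calls.append(1)  # count how often the "database" is actually hit
    return [("alice", 3), ("bob", 5)]


cache = FakeMemcached()
first = cached_query(cache, "report:top_users", expensive_report)
second = cached_query(cache, "report:top_users", expensive_report)
```

The second call is served from the cache, so the query function runs only once. Only queries that are frequent or expensive are worth wrapping like this.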

OPcache

OPcache (formerly known as Zend Optimizer+) is a PHP extension that caches precompiled scripts, removing the need to parse and compile the code on each request. It is widely considered one of the fastest opcode caches available for PHP. Its main purpose is to speed up code execution, and it usually reduces the server’s overall memory consumption as well, all without altering the code itself. Keep in mind that other extensions that alter the code flow (like Xdebug or Zend Debugger) can have a negative effect on OPcache. This extension is bundled with PHP 5.5.0 and later, and the installation and configuration manual for older versions of PHP can be found here.

APC

APC (Alternative PHP Cache) is, like OPcache, a PHP accelerator. Though reportedly slower than OPcache (by 5%-20%), it is still a widely used engine and comes packed with a data caching mechanism of its own. The manuals for APC can be found here.

MySQL Query Cache

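MySQL query cache is a feature of MySQL database servers that enables caching of query results. Note that it was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so everything below applies to older versions only. As the folks from Oracle put it (look at the full documentation here):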

The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again…

But, keep in mind that (also from their documentation: “How Query Cache Operates”):

Incoming queries are compared to those in the query cache before parsing, so the following two queries are regarded as different by the query cache:

SELECT * FROM tbl_name
Select * from tbl_name

Queries must be exactly the same (byte for byte) to be seen as identical. In addition, query strings that are identical may be treated as different for other reasons. Queries that use different databases, different protocol versions, or different default character sets are considered different queries and are cached separately.

Basically, the whole statement is stored as the key and the query result is stored as the value, and the caching system functions much like other caching mechanisms do (keeping recently used queries and removing the least recently used ones when it runs out of space). Another thing to keep in mind is that when a table is modified, even by updating a single row, all cached queries that use that table are removed from the cache. Oh, and another thing (also from their documentation on “How Query Cache Operates”):

Prepared statements that are issued using the binary protocol using mysql_stmt_prepare() and mysql_stmt_execute() (see Section 21.8.8, “C API Prepared Statements”), are subject to limitations on caching.
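The key/value behaviour and the per-table invalidation described above can be modelled in a few lines of Python. This is only a toy model, not how MySQL implements it: the class ToyQueryCache, its method names and the database names shop and blog are all invented for the example.

```python
class ToyQueryCache:
    """Toy model: the exact statement text is the key, table writes invalidate."""

    def __init__(self):
        self._results = {}  # (database, statement text) -> (tables, rows)

    def lookup(self, database, statement):
        entry = self._results.get((database, statement))
        return None if entry is None else entry[1]

    def store(self, database, statement, tables, rows):
        self._results[(database, statement)] = (frozenset(tables), rows)

    def invalidate_table(self, table):
        """Drop every cached result that read from the modified table."""
        self._results = {
            key: entry
            for key, entry in self._results.items()
            if table not in entry[0]
        }


qc = ToyQueryCache()
qc.store("shop", "SELECT * FROM tbl_name", ["tbl_name"], [(1, "a")])

same = qc.lookup("shop", "SELECT * FROM tbl_name")            # identical text
different_case = qc.lookup("shop", "Select * from tbl_name")  # differs byte for byte
other_db = qc.lookup("blog", "SELECT * FROM tbl_name")        # different database

qc.invalidate_table("tbl_name")                               # e.g. after an UPDATE
after_update = qc.lookup("shop", "SELECT * FROM tbl_name")
```

Only the first lookup returns the stored rows. The differently cased statement and the other database are misses, and after the table is modified the original entry is gone too.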

OK, so what is MySQL query caching good for? It’s good for systems whose databases are rarely updated. It’s good if you don’t want to change the code of your application. It’s good for quick-and-dirty optimization of poorly designed databases and queries. But most of the time it just depends on the scenario. My personal opinion is that you can always use the RAM for memcached instead of MySQL’s query cache, as both store results in memory as key/value pairs (whereas memcached can use smaller keys and you can actually control what data is purged on update). If you ever decide to try out MySQL’s query cache, look it up under Optimization/Buffering and Caching.

Redirecting all HTTP requests except images, css, flash and JavaScript file requests using .htaccess

All you need to do is create a .htaccess file in your project’s root and put this inside:

RewriteEngine on

RewriteRule !\.(ico|gif|jpg|png|css|js|swf|flv)$ index.php [QSA]

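The RewriteRule stated above will internally rewrite all requests matching the pattern “!\.(ico|gif|jpg|png|css|js|swf|flv)$” to index.php (the leading “!” negates the match, so the pattern covers anything not ending with one of the extensions listed in the parentheses). The browser is not sent an HTTP redirect; the URL in the address bar stays the same. The [QSA] flag stands for Query String Append, meaning that any query string passed with the URL will be appended to the rewritten URL. It matters mainly when the substitution carries a query string of its own; with a plain index.php the original query string is passed through either way.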
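If you want to check which request paths the pattern lets through, the same regular expression can be tried out in Python. The helper name goes_to_index and the sample paths are made up for the example, and this only mimics the pattern matching, not the rest of mod_rewrite.

```python
import re

# Same pattern as in the RewriteRule, without the leading "!" negation
STATIC_FILE = re.compile(r"\.(ico|gif|jpg|png|css|js|swf|flv)$")


def goes_to_index(path):
    """True if the rule would rewrite this request path to index.php."""
    return STATIC_FILE.search(path) is None


for path in ["blog/post/42", "css/site.css", "img/logo.PNG", "data/archive.json"]:
    print(path, "->", "index.php" if goes_to_index(path) else "served as a file")
```

Note that the match is case sensitive, so img/logo.PNG would still be rewritten to index.php unless the rule is given the [NC] flag.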

A NOTE ON .htaccess:

Full documentation on how to use .htaccess can be found here.

The use of .htaccess should be limited to certain scenarios, and Apache’s main configuration file should be used instead wherever possible, as stated in the documentation:

However, in general, use of .htaccess files should be avoided when possible. Any configuration that you would consider putting in a .htaccess file, can just as effectively be made in a <Directory> section in your main server configuration file.

There are two main reasons to avoid the use of .htaccess files.

The first of these is performance. When AllowOverride is set to allow the use of .htaccess files, httpd will look in every directory for .htaccess files. Thus, permitting .htaccess files causes a performance hit, whether or not you actually even use them! Also, the .htaccess file is loaded every time a document is requested.

The second consideration is one of security. You are permitting users to modify server configuration, which may result in changes over which you have no control. Carefully consider whether you want to give your users this privilege.

DO IT THE PROPER WAY WITH APACHE’S CONFIGURATION FILE!