A cache is a temporary data store that usually contains pre-computed data. Its purpose is to serve that data the next time someone asks for it, without having to re-compute it. For example, suppose your website runs a complex query to fetch the results for a user request. The query takes time and consumes system resources, and it has to run every time someone clicks the link. If you cache the results, you can serve them from the cache rather than re-querying the data. This puts less load on the server's resources, and the user request is fulfilled faster.
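As a minimal sketch, the idea looks like this in Python, where run_complex_query is a hypothetical stand-in for the real query:

```python
import time

def run_complex_query(request_key):
    # Hypothetical stand-in for a slow, resource-hungry query.
    time.sleep(2)
    return f"results for {request_key}"

_results_cache = {}

def get_results(request_key):
    # The first request pays the full query cost; repeat requests
    # for the same key are answered from the in-memory cache.
    if request_key not in _results_cache:
        _results_cache[request_key] = run_complex_query(request_key)
    return _results_cache[request_key]
```

The first call to get_results for a given key takes the full two seconds; every later call for that key returns immediately.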

The caching layer usually sits between the application's business logic and the database. In a simple architecture consisting of a single server, you can add a caching layer to the application server itself. More complex architectures need a solution that can scale horizontally; one option is to run a dedicated caching server. If you are on the cloud, you can use Amazon ElastiCache or a comparable product offered by your vendor.
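From the application's point of view, a dedicated caching server is just another endpoint to connect to. A rough sketch with the redis-py client (the hostname here is a made-up example; an ElastiCache endpoint would drop in the same way):

```python
import redis

# Connect to a dedicated cache server; the hostname is an
# illustrative placeholder, not a real address.
cache = redis.Redis(host="cache.internal.example.com", port=6379)
cache.ping()  # raises ConnectionError if the server is unreachable
```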

Caching can be write-through or lazy loading. A write-through cache updates the cache whenever content is created or updated. A lazy loading cache stores content the first time someone asks for it; the next time someone asks, the content is served from the cache. In practice, both methods are used together with a time to live (TTL), which defines when a cached entry expires so its memory can be released.
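Here is a minimal sketch of both strategies, assuming a Redis cache via the redis-py client; the db object and its save and fetch methods are hypothetical stand-ins for your data layer:

```python
import json
import redis

cache = redis.Redis(host="localhost", port=6379)
TTL_SECONDS = 300  # cached entries expire after five minutes

def save_article(db, article_id, article):
    # Write-through: update the database and the cache together,
    # so the next read finds the cache already warm.
    db.save(article_id, article)  # hypothetical data layer
    cache.setex(f"article:{article_id}", TTL_SECONDS, json.dumps(article))

def get_article(db, article_id):
    # Lazy loading: serve from the cache when possible; otherwise
    # fetch from the database and cache the result on the way out.
    cached = cache.get(f"article:{article_id}")
    if cached is not None:
        return json.loads(cached)
    article = db.fetch(article_id)  # hypothetical data layer
    cache.setex(f"article:{article_id}", TTL_SECONDS, json.dumps(article))
    return article
```

With setex, Redis deletes the key automatically once the TTL elapses, so stale entries do not pile up in memory.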

Memcached and Redis are two commonly used caching engines. Memcached is very scalable, but it does not support persistence. Redis is also very scalable and does support persistence. For large-scale applications, use Redis.
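From application code, the two engines look much alike. A quick sketch, assuming the pymemcache and redis-py client libraries with both engines running on their default local ports:

```python
from pymemcache.client.base import Client as MemcachedClient
import redis

# Memcached over its default port.
mc = MemcachedClient(("localhost", 11211))
mc.set("greeting", "hello from memcached")
print(mc.get("greeting"))  # returned as bytes

# Redis over its default port.
r = redis.Redis(host="localhost", port=6379)
r.set("greeting", "hello from redis")
print(r.get("greeting"))  # returned as bytes
```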

ElastiCache supports both Memcached and Redis.