I like the definition used in Wikipedia: “a cache is a temporary storage area where often accessed data can be stored for quick access”. The idea is to get ‘often accessed data’ from a database and store it in memory (RAM or as a file in your local file system). This is because:
- it’s quicker for a machine to read from memory than to connect to a database and query data.
- it’s more efficient for the database to not waste time and resources returning the same dataset multiple times when it could be focusing on other tasks.
As long as the data, in this scenario from the database, doesn’t change, there is no need to query it again.
Resources are limited on systems and to take advantage of your resources, you need to make sure time isn’t spent on tasks that could be handled better elsewhere. Here is a silly real world example. Imagine on a daily basis, I have to track how many magazines I have and send this information to Person X. I get new magazines at the beginning of each month only. To track the number of magazines I have every day I could
- Count them, one by one every day and send Person X the total. If I have 50 magazines this could take some time and assume I get 10 more every month, after a year or two I could spend all day just counting how many magazines I have instead of working. Sound productive?
- Count them once and write the number down on a piece of paper (caching!). Everyday when Person X asks how many magazines I have, I read the number from the piece of paper. Only when I get new magazines (once a month) do I count them again (or just add the current number + the new amount) to get my new total. Then I update my piece of paper with the new total (updating the value in cache).
The latter is definitely the more productive choice.
The same idea applies to computer systems. In the web, you have static and dynamic files. Static files are quicker to serve on a server because the server only has to read the contents of the file and send it to the browser requesting it. Dynamic pages take more time and resources because the server needs to execute the code in the page and only once it’s done can it send the request back. PHP can be used to create dynamic pages. The server executes the php code and spits out a file that then is read by the browser. If a database is involved, then the database has to run it’s task as well before the final file is returned.
When ever possible, it’s more efficient to serve a static file or static content. We use cache to accomplish this. In this post I’m going to talk about caching files and database queries to local files on the server. Read the rest of this entry »