---
title: The Battle of the Caches
author: Brett Langdon
date: 2013-08-01
template: article.jade
---
A co-worker and I set out to each build our own HTTP proxy cache.
One of them was written in Go and the other as a C++ plugin for
Kyoto Tycoon.
---
So, I know what most people are thinking: “Not another cache benchmark post,
with skewed or biased results.” Luckily, that is not what this post is about;
there are no opinionated graphs showing that my favorite caching system happens
to be better than all the others. Instead, this post is about why, at work, we
decided to write our own API caching system rather than use <a href="http://www.varnish-cache.org/" target="_blank">Varnish</a>
(a tried-and-true HTTP caching system).
Let us discuss the problem we have to solve. The system we have is a simple
request/response HTTP server that needs very low latency (a few
milliseconds, usually 2-3 on average), and we are adding a third-party HTTP API
call to almost every request that we see. I am sure some people see the issue
right away: any network call is going to add at least half a millisecond to a
full millisecond to your processing time, and that is if the two servers are in
the same datacenter; more if they are not. That is just network traffic. On top
of that we must rely on the performance of the third-party API, hoping that they
are able to maintain a consistent response time under heavy load. If, in total,
this third-party API call adds more than 2 milliseconds of response time to each
request that our system is processing, it greatly reduces the capacity of our
system: at 2-3 milliseconds per request, an extra 2 milliseconds nearly halves
the number of requests each worker can serve per second.
THE SOLUTION! Let’s use Varnish. This is the logical solution: put a caching
system in front of the API. The content we are requesting isn’t changing very often
(every few days, if that), so a cache should hide most of the latency added by the
API call. So, we tried this but had very little luck; no matter what we tried we could
not get Varnish to respond in under 2 milliseconds per request (which was a main
requirement of the solution we were looking for). That meant Varnish was out, and the
next option was to write our own caching system.
Now, before people start flooding the comments calling me a troll or yelling at me
for not trying this or that or some other thing, let me explain why
we decided to write our own cache rather than spend extra days investigating
Varnish or some other known HTTP cache. We have a fairly specific requirement for
our cache: very low and consistent latency. “Consistent” is the key word that really
matters to us. We decided fairly early on that getting no response on a cache miss
is better for our application than blocking and waiting for a response from the
proxy call. This is a very odd requirement, and most HTTP caching systems do not
support it since it almost defeats their purpose (be “slow” 1-2 times so you can be
fast all the other times). HTTP is also not a requirement on our end: the cache must
use HTTP to talk to the API server, but our application does not have to use HTTP to
talk to the cache. Headers add extra bandwidth and processing that our application
does not need.
So we decided that our ideal cache would have 3 main requirements:
1. Must have a consistent response time, returning nothing quickly rather than waiting for a proper response
2. Must support the <a href="https://github.com/memcached/memcached/blob/master/doc/protocol.txt" target="_blank">Memcached Protocol</a>
3. Must support TTLs on the cached data
This behavior works basically like so: call the cache; on a cache miss,
return an empty response right away and queue the request for a background process
to make the call to the API server. Every identical request that comes in before
the proxy call returns a result also receives an empty response, but does not add
the request to the queue again. As soon as the proxy call returns, update the cache,
and every identical call coming in will yield the proper response. After a given TTL,
consider the data in the cache old and re-fetch it.
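As a rough illustration, here is a minimal sketch of that flow in Go. The names
(`Cache`, `entry`, the `fetch` callback) and the plain in-memory map are assumptions
made for this example, not Ferrite’s actual code; both real implementations used
Kyoto Cabinet as the store underneath.

```go
package cache

import (
	"sync"
	"time"
)

// entry holds a cached value and the time it was stored, so
// reads can decide whether it has passed its TTL.
type entry struct {
	value  []byte
	stored time.Time
}

// Cache never blocks a caller on a miss; a miss answers empty
// immediately and triggers a single background fetch.
type Cache struct {
	mu       sync.Mutex
	data     map[string]entry
	inflight map[string]bool
	ttl      time.Duration
	fetch    func(key string) ([]byte, error) // the proxied API call
}

func New(ttl time.Duration, fetch func(string) ([]byte, error)) *Cache {
	return &Cache{
		data:     make(map[string]entry),
		inflight: make(map[string]bool),
		ttl:      ttl,
		fetch:    fetch,
	}
}

// Get returns the cached value, or nil immediately on a miss or
// an expired entry. The first miss for a key queues a background
// fetch; identical requests arriving before it completes also get
// nil, but do not queue another fetch.
func (c *Cache) Get(key string) []byte {
	c.mu.Lock()
	defer c.mu.Unlock()

	if e, ok := c.data[key]; ok && time.Since(e.stored) < c.ttl {
		return e.value // fresh hit: answer from the cache
	}
	if !c.inflight[key] {
		c.inflight[key] = true
		go c.refresh(key) // a goroutine instead of a worker queue
	}
	return nil // miss: respond empty rather than block
}

// refresh performs the slow proxy call off the request path and
// stores the result, at which point identical calls become hits.
func (c *Cache) refresh(key string) {
	value, err := c.fetch(key)
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.inflight, key)
	if err == nil {
		c.data[key] = entry{value: value, stored: time.Now()}
	}
}
```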
This then turned into a challenge between a co-worker,
<a href="http://late.am/" target="_blank">Dan Crosta</a>, and myself to see who
could write the better/faster caching system with these requirements. His solution,
called “CacheOrBust”, was a
<a href="http://fallabs.com/kyototycoon/" target="_blank">Kyoto Tycoon</a> plugin
written in C++ which used a subset of the memcached protocol along with
background workers and a request queue to perform the fetching. My solution,
<a href="https://github.com/brettlangdon/ferrite" target="_blank">Ferrite</a>, is a
custom server written in <a href="http://golang.org/" target="_blank">Go</a>
(originally written in C) with the same functionality (except using
<a href="http://golang.org/doc/effective_go.html#goroutines" target="_blank">goroutines</a>
rather than background workers and a queue). Both servers used
<a href="http://fallabs.com/kyotocabinet/" target="_blank">Kyoto Cabinet</a>
as the underlying caching data structure.
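To make requirement 2 concrete, here is a rough sketch of what a client-side lookup
looks like over the memcached text protocol: a plain TCP `get`, no HTTP headers
involved. The address and key below are placeholders, and the response parsing is
deliberately simplified; on a miss the server answers with just `END`, which is the
“empty response” described above.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

func main() {
	// Placeholder address: point this at the cache server.
	conn, err := net.Dial("tcp", "127.0.0.1:11211")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// memcached text protocol request: "get <key>\r\n".
	fmt.Fprintf(conn, "get %s\r\n", "some-api-request-key")

	reader := bufio.NewReader(conn)
	line, err := reader.ReadString('\n')
	if err != nil {
		panic(err)
	}
	if strings.HasPrefix(line, "END") {
		// Miss: the cache answered immediately with no value;
		// the real fetch is now happening in the background.
		fmt.Println("cache miss")
		return
	}
	// Hit: "VALUE <key> <flags> <bytes>\r\n" followed by the data
	// block. (Simplified: assumes the value contains no newlines.)
	data, _ := reader.ReadString('\n')
	fmt.Print("cache hit: ", data)
}
```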
So… results already! As with most competitions, it is always a
sad day when there is a tie. That’s right: two similar solutions, written in two
different programming languages, yielded similar results (we probably have
Kyoto Cabinet to thank). Both of our caching systems were able to give us the
results we wanted, **consistent** sub-millisecond response times, averaging about
0.5-0.6 milliseconds per response (different physical servers, but the same datacenter),
regardless of whether the response was a cache hit or a cache miss. Usually the
moral of the story is: “do not re-invent the wheel, use something that already
exists that does what you want,” but realistically sometimes this isn’t an option.
Sometimes you have to bend the rules a little to get exactly what your application
needs; especially when dealing with low latency systems, every millisecond counts.
Just be smart about the decisions you make and make sure you have sound
justification for them, especially if you decide to build it yourself.