---
title: Why Benchmarking Tools Suck
author: Brett Langdon
date: 2012-10-22
template: article.jade
---
A brief aside on why I think no benchmarking tool is exactly right
and why I wrote my own.

---

Benchmarking is (or should be) a fairly important part of most developers' jobs:
it is how we determine the load that the systems we build can withstand. We are
currently at a point in our development lifecycle at work where load testing is a
fairly high priority. We need to be able to answer questions like: what kind of
load can our servers currently handle as a whole? What kind of load can a single
server handle? How much throughput can we gain by adding X more servers? What
happens when we overload our servers? What happens when our concurrency doubles?
These are all questions that most developers have probably been asked at some
point in their careers.

Luckily, there is a plethora of HTTP benchmarking tools to help try to answer
these questions, tools like
<a href="http://httpd.apache.org/docs/2.2/programs/ab.html" target="_blank">ab</a>,
<a href="http://www.joedog.org/siege-home/" target="_blank">siege</a>,
<a href="https://github.com/newsapps/beeswithmachineguns" target="_blank">beeswithmachineguns</a>,
<a href="http://curl-loader.sourceforge.net/" target="_blank">curl-loader</a>
and one I wrote recently (today),
<a href="https://github.com/brettlangdon/tommygun" target="_blank">tommygun</a>.

Every single one of those tools sucks, including the one I wrote (and will
probably keep using and maintaining). Why? Don’t a lot of people use them? Yes,
almost everyone I know has used ab (most of you probably have) and I know a
decent handful of people who use siege, but that does not mean that they are
the most useful for all use cases. In fact, they tend to be useful only for a
limited set of tests. Ab is great if you want to test a single web page, but
what if you need to test multiple pages at once, or in a sequence? I’ve also
personally experienced huge performance issues running ab from a Mac. These
limitations of ab make way for other tools such as siege and curl-loader, which
can test multiple pages at a time or in a sequence, but at what cost? Currently at
work we are having issues getting siege to properly parse and test a few hundred
thousand URLs, some of which contain binary POST data.

On top of only really covering a limited set of use cases, each benchmarking tool
also introduces overhead on the machine that you are benchmarking from. Ab might
be able to test your servers faster and with more concurrency than curl-loader
can, but if curl-loader can test your specific use case, which do you use?
Curl-loader can probably benchmark exactly what you are trying to test, but if it
cannot generate the amount of load you are looking for, then how useful a tool
is it? What if you need to scale your benchmarking tool, and how would you do it?
What if you are running the test from the same machine as your development
environment? What kind of effect will running the benchmarking tool itself have
on your application?
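
One rough way to get a feel for that overhead is to time a task that does
nothing, so that whatever time is measured belongs to the harness itself. The
Python sketch below is only an illustration: `noop` and `harness_overhead` are
hypothetical names made up for this example, and a single-threaded loop ignores
network, scheduling, and threading costs that a real tool would also pay.

```python
import time


def noop() -> None:
    """A task that does nothing, so any time measured belongs to the harness."""


def harness_overhead(calls: int = 100_000) -> float:
    """Return the average seconds of bookkeeping spent per task call."""
    start = time.perf_counter()
    for _ in range(calls):
        t0 = time.perf_counter()  # the same per-call timing a real run would do
        noop()
        latency = time.perf_counter() - t0  # measured but unused here
    return (time.perf_counter() - start) / calls


if __name__ == "__main__":
    per_call = harness_overhead()
    print(f"harness overhead: {per_call * 1e6:.2f} microseconds per call")
    print(f"single-thread ceiling: about {1 / per_call:,.0f} calls/sec")
```

The second number is an optimistic upper bound on what one thread on this
machine could offer as load. If the load you want is anywhere near it, the
results will say more about the benchmarking machine than about the servers.
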

So, what is the solution then? I think that instead of trying to develop these
command line tools to fit each scenario, we should try to develop a benchmarking
framework with all of the right pieces that we need. For example, develop a
platform that has the functionality to run a given task concurrently, but where
you supply the task for it to run. This way the benchmarking tool does not become
obsolete and useless as your application evolves. This also paves the way for the
tool to be protocol agnostic, allowing people to easily write tests for HTTP web
applications or even for services that do not speak HTTP, such as message queues
or in-memory stores. This framework should also provide a way to scale the tool
out, to allow more throughput and to overload your system. Last but not least,
this platform should be lightweight and try to introduce as little overhead as
possible, for those who do not have EC2 available to them for testing, or who do
not have spare servers lying around to test from.
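
To make the "you supply the task" idea a little more concrete, here is a minimal
Python sketch of what the core of such a framework might look like. It is not
tommygun and not any existing tool: `run_benchmark`, `Result`, and
`fake_request` are hypothetical names invented for this example, and a thread
pool is only one assumed way to get concurrency. Scaling across machines is
left out entirely.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    """Aggregate outcome of one benchmark run."""
    completed: int           # tasks that returned normally
    failed: int              # tasks that raised an exception
    elapsed: float           # wall-clock seconds for the whole run
    latencies: List[float]   # per-task seconds, successful tasks only

    @property
    def throughput(self) -> float:
        """Completed tasks per second of wall-clock time."""
        return self.completed / self.elapsed if self.elapsed > 0 else 0.0


def run_benchmark(task: Callable[[], object], total: int, concurrency: int) -> Result:
    """Run `task` `total` times across `concurrency` worker threads."""

    def timed_call(_: int):
        start = time.perf_counter()
        try:
            task()
        except Exception:
            return None  # count as a failure and keep the run going
        return time.perf_counter() - start

    began = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        outcomes = list(pool.map(timed_call, range(total)))
    elapsed = time.perf_counter() - began

    latencies = [o for o in outcomes if o is not None]
    return Result(len(latencies), total - len(latencies), elapsed, latencies)


def fake_request() -> None:
    """Stand-in task: swap in an HTTP call, a queue publish, a cache read, etc."""
    time.sleep(0.001)


if __name__ == "__main__":
    result = run_benchmark(fake_request, total=200, concurrency=20)
    print(f"{result.completed} ok, {result.failed} failed, "
          f"{result.throughput:.0f} tasks/sec")
```

The runner knows nothing about HTTP. Anything that can be wrapped in a function
with no arguments can be load tested, which is the protocol-agnostic part.
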

I am not saying that up until now load testing has been nothing but a pain, or
that the tools that we have available to us (for free) are the worst things out
there and should not be trusted. I just feel that they do not and cannot meet
every use case, and I have been plagued by this issue in the past. How can you
properly load test your application if you do not have the right load testing
tool for the job?

So, I know what some might be thinking: “sounds neat, when will your framework
be ready for me to use?” That is a nice idea, but if the past few months are any
indication of how much free time I have, I might not be able to get anything done
right away (seeing how I only managed to write my load testing tool while on
vacation). I am, however, more than willing to contribute to anyone else’s
attempt at this framework, and I am especially willing to help test it.

**Side Note:** If anyone knows of any tool or framework that currently tries to
achieve my “goal”, please let me know. I was unable to find any tools out there
that worked as I described or that even got close, but I might not have searched
for the right thing, or maybe I skipped over the right link.