port over article http://brett.is/writing/about/why-benchmarking-tools-suck/

12 years ago · f033a50596
--- a/contents/writing/about/why-benchmarking-tools-suck/index.md
+++ b/contents/writing/about/why-benchmarking-tools-suck/index.md
@@ -0,0 +1,86 @@
 ---
 title: Why Benchmarking Tools Suck
 author: Brett Langdon
 date: 2012-10-22
 template: article.jade
 ---

 A brief aside on why I think no benchmarking tool is exactly right,
 and why I wrote my own.

 ---

 Benchmarking is (or should be) a fairly important part of most developers’ jobs:
 determining the load that the systems they build can withstand. We are
 currently at a point in our development lifecycle at work where load testing is a
 fairly high priority. We need to be able to answer questions like: What kind of
 load can our servers currently handle as a whole? What kind of load can a single
 server handle? How much throughput can we gain by adding X more servers? What
 happens when we overload our servers? What happens when our concurrency doubles?
 These are all questions that most developers have probably been asked at some
 point in their careers. Luckily, there is a plethora of HTTP benchmarking tools to
 help try to answer these questions, tools like
 <a href="http://httpd.apache.org/docs/2.2/programs/ab.html" target="_blank">ab</a>,
 <a href="http://www.joedog.org/siege-home/" target="_blank">siege</a>,
 <a href="https://github.com/newsapps/beeswithmachineguns" target="_blank">beeswithmachineguns</a>,
 <a href="http://curl-loader.sourceforge.net/" target="_blank">curl-loader</a>
 and one I wrote recently (today),
 <a href="https://github.com/brettlangdon/tommygun" target="_blank">tommygun</a>.

 Every single one of those tools sucks, including the one I wrote (and will
 probably keep using and maintaining). Why? Don’t a lot of people use them? Yes,
 almost everyone I know has used ab (most of you probably have), and I know a
 decent handful of people who use siege, but that does not mean that they are
 the most useful for all use cases. In fact, they tend to be useful only for a
 limited set of tests. ab is great if you want to test a single web page, but
 what if you need to test multiple pages at once, or in a sequence? I’ve also
 personally experienced huge performance issues when running ab from a Mac. These
 limitations of ab make way for other tools such as siege and curl-loader, which
 can test multiple pages at a time or in a sequence, but at what cost? Currently at
 work we are having issues getting siege to properly parse and test a few hundred
 thousand URLs, some of which contain binary POST data.

 On top of having only a limited set of use cases, each benchmarking tool
 also introduces overhead on the machine that you are benchmarking from. ab might
 be able to test your servers faster and with more concurrency than curl-loader
 can, but if curl-loader is the tool that covers your specific use case, which do
 you use? curl-loader can probably benchmark exactly what you’re trying to test,
 but if it cannot generate the amount of load you are looking for, then how useful
 a tool is it? What if you need to scale your benchmarking tool, and how would you
 even do that? What if you are running the test from the same machine as
 your development environment? What kind of effect will running the benchmarking
 tool itself have on your application?

 So, what is the solution then? I think that instead of trying to develop a
 command line tool to fit each scenario, we should try to develop a benchmarking
 framework with all of the right pieces that we need. For example, develop a
 platform that has the functionality to run a given task concurrently, but where
 you supply the task for it to run. This way the benchmarking tool does not become
 obsolete and useless as your application evolves. This also paves the way for the
 tool to be protocol agnostic, allowing people to easily write tests for HTTP web
 applications or even for services that do not speak HTTP, such as message queues
 or in-memory stores. This framework should also provide a way to scale the tool
 out, to allow more throughput and to overload your system. Last but not least,
 this platform should be lightweight and introduce as little overhead as
 possible, for those who do not have EC2 available to them for testing, or who do
 not have spare servers lying around to test from.
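
 To make the “you supply the task” idea a bit more concrete, here is a minimal
 sketch in Python using only the standard library. It is not how tommygun works
 and it is not a finished framework; the names `run_benchmark`, `Result` and
 `summarize` are made up for illustration. It assumes the task is I/O bound (so
 threads are good enough) and it runs on a single machine, so it does nothing
 about scaling the tool out.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    """Outcome of one task invocation (hypothetical type for this sketch)."""
    ok: bool
    seconds: float


def run_benchmark(task: Callable[[], None], total: int, concurrency: int) -> List[Result]:
    """Run `task` `total` times across `concurrency` worker threads.

    The harness knows nothing about HTTP or any other protocol. It only
    times each call and records whether the call raised an exception.
    """
    def timed_call(_: int) -> Result:
        start = time.perf_counter()
        try:
            task()
            ok = True
        except Exception:
            ok = False
        return Result(ok=ok, seconds=time.perf_counter() - start)

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(total)))


def summarize(results: List[Result]) -> dict:
    """Reduce raw results to a few headline numbers."""
    if not results:
        raise ValueError("no results to summarize")
    durations = [r.seconds for r in results]
    return {
        "requests": len(results),
        "failures": sum(1 for r in results if not r.ok),
        "mean_seconds": sum(durations) / len(durations),
        "max_seconds": max(durations),
    }


if __name__ == "__main__":
    # Stand-in task: sleep instead of doing real I/O. Swap in an HTTP
    # request, a message queue publish, a cache read, and so on.
    def fake_task() -> None:
        time.sleep(0.01)

    print(summarize(run_benchmark(fake_task, total=50, concurrency=10)))
```

 The only thing the harness asks of you is a callable that raises on failure, so
 the same loop can drive a web page, a message queue or an in-memory store
 without the tool itself having to change.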

 I am not saying that up until now load testing has been nothing but a pain, or
 that the tools we have available to us (for free) are the worst things out there
 and should not be trusted. I just feel that they do not and cannot meet every use
 case, and I have been plagued by this issue in the past. How can you properly
 load test your application if you do not have the right load testing tool for
 the job?

 So, I know what some might be thinking: “sounds neat, when will your framework
 be ready for me to use?” That is a nice idea, but if the past few months are any
 indication of how much free time I have, I might not be able to get anything done
 right away (seeing as I was only able to write my load testing tool while on
 vacation). I am, however, more than willing to contribute to anyone else’s attempt
 at this framework, and I am especially willing to help test anyone else’s
 framework.

 **Side Note:** If anyone knows of any existing tool or framework that tries to
 achieve my “goal”, please let me know. I was unable to find any tools out there
 that worked as I described or that even got close, but I might not have searched
 for the right thing, or maybe I skipped over the right link.