Normally I find quite a bit of ambiguity when people talk about performance tests, some restrict it to response time whereas some use it to cover a gamut of things they are testing or measuring. In this post, I will put across few thoughts on contrasting between them. Ideally a lot depends on what you are trying to measure. The terms that you will frequently hear in this arena are – Response Time, Latency, Throughput, Load, Scalability, Stress, Robustness, etc. I will try explaining these terms below also throwing some light on how can you measure them.
Response Time – Amount of time system takes to process a request after it has received one. For instance you have API and you want to find how much time that API takes to execute once invoked, you are in fact measuring response time. So how do we measure them? Simple use a StopWatch (System.Diagnostics) – start it before calling API & stop it after API returns. The duration arrived here is quite small so a preferred practice is to call that API in sequential loops say 1000 times, or pass variable load to the API if possible (input/output varies from KBs/MBs/GBs e.g. returning customer array of varied lengths).
Latency – In simplest terms this is Remote Response time. For instance, you want to invoke a web service or access a web page. Apart from the processing time that is needed on the server to process your request, there is a delay involved for your request to reach to server. While referring to Latency, it’s that delay we are talking about. This becomes a big issue for a remote data center which is hosting your service/page. Imagine your data center in US, and accessing it from India. If ignored, latency can trigger your SLA’s. Though it’s quite difficult to improve latency it’s important to measure it. How we measure Latency? There are some network simulation tools out there that can help you – one such tool can be found here.
Throughput – transactions per second your application can handle (motivation / result of load testing). A typical enterprise application will have lots of users performing lots of different transactions. You should ensure that your application meets the required capacity of enterprise before it hits production. Load testing is the solution for that. Strategy here is to pick up a mix of transactions (frequent, critical, and intensive) and see how many pass successfully in an acceptable time frame governed by your SLAs. How to measure it? You normally require a high end professional tool here like Visual Studio Team System (Load Testing feature). Of course, you can try to simulate load through your custom made applications /code but my experience says custom code are good to test for response times; whereas writing custom code for load testing is too much of work. A good load testing tool like VSTS allows you to pick a mix of transactions, simulate network latency, incorporate user think times, test iterations, etc. I would also strongly recommend this testing to be as close as possible to real world with live data.
Scalability – is the measure of how your system responds when additional hardware is added. Does it take new increased load by making use of added resources? This becomes quite important while taking into consideration the growth projections for your application in future. Here we have two options – scale vertically/up (better machine) or horizontally/out (more machines), latter is usually more preferred one. A challenge to scale out is to ensure that your design doesn’t have any server affinity, so that a Load balancer can adjust load across servers. Measuring scalability can be done with help of load balancing tools with a software/hardware NLB in place ensuring system is able to take on new load without any issues. One can monitor performance counters to see whether actual request load has been balanced/shared across servers (I plan to cover NLB in a future post).
Stress testing – Many people confuse this or relate it to load testing. My take which I have found easy to explain is, if you find yourself running tests for more than 24 hours you are doing a stress test (precise would be your production time i.e. duration before you take your machine offline for a patch, etc.). Motivation behind stress test is to find out how easily your system can recover from over loaded (stressed) conditions. Does it limp back to normalcy or gives up completely? Robustness an attribute that is measured as part of stress testing relates to long running systems with almost negligible down time. A simple example here could be memory leak. Does your system release memory after working at peak loads? Another, what happens if a disk fails due to constant heavy I/O load? Does your system lose data? Finding and addressing such concerns is motivation behind stress testing.
I will look forward to read your thoughts on above .