Performance Testing: Common Myths/Confusions
Performance Testing is still considered a niche skill, and for that reason many myths linger around it. These myths give rise to false notions that ultimately affect the picture presented to the client. Performance Testing is often met with statements such as “Delivery-impacting testing”, “Redundant testing”, or “Not sure if we have budget for such things”. Below are some of these myths and a possible resolution for each. Please do post a comment if you think otherwise or have something similar in mind.
#1-“I always consider the average response time while presenting to the client.” Question: “Why so?” Answer: “Not sure. That’s what my client wants.”
In most cases, the answer I receive to this question is “I will decide as per my client’s need.” I don’t think this is quite right, and if you agree with it you must have a strong reason to support it. This poses two issues. First, is the reasoning “I will decide as per my client’s need” acceptable? Second, if it is not, how should we decide which reading to report? For the sake of this discussion, let us concentrate on the second question.
Let’s consider ten response-time readings (in seconds) for a transaction:
3 18 2 4 20 7 3 12 12 10
Arranging them in ascending order:
2 3 3 4 7 10 12 12 18 20
The average of these readings is 9.1 secs and the 80th percentile is 12 secs.
Now this tells me:
- 80 out of 100 times this transaction will complete in 12 secs or faster. Which percentile you report (75th, 80th, 90th) depends on the criticality of the functionality under test, but it gives you an idea of the range in which the majority of your transactions fall. The average cannot tell you this, because it carries no information about how the individual readings are distributed. Suppose that, due to some circumstance (inherent system behaviour such as an SQL taking more time, or extreme paging), the two peak readings rise from 18 and 20 secs to 40 and 50 secs. The average jumps from 9.1 to 14.3 secs, while the 80th percentile remains at 12 secs. You might ask: did we just ignore major issues like a slow SQL or extreme paging? No. We know there is an issue, but, as I said before, the percentile we report follows the criticality. Had we reported the 90th percentile here, it would be 40 secs and would be a concern to me and the client. The 80th percentile, however, says that 80 times out of 100 I won’t face any issue; if the remaining 20 are tolerable, you can sign off the readings, or else you can look closer.
- Also, when a second round of the same test is conducted and the 80th percentile for this transaction comes out at 15 secs, one can be sure there is a 25% degradation in this transaction. This helps a lot if your system is undergoing tuning or performance improvements: comparing percentiles tells you exactly how much better or worse a change has made your responses. The average, again, cannot give you a reliable comparison, because it carries no information about the distribution of the times from which it was calculated.
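The contrast above can be reproduced with a few lines of Python. The nearest-rank percentile function below is a minimal sketch (one of several common percentile definitions) that matches the readings used in this article:

```python
import math

def percentile(readings, p):
    """Nearest-rank percentile: the smallest reading such that at least
    p percent of the sorted readings are at or below it."""
    data = sorted(readings)
    rank = math.ceil(p / 100 * len(data))  # 1-based rank
    return data[rank - 1]

readings = [3, 18, 2, 4, 20, 7, 3, 12, 12, 10]
print(sum(readings) / len(readings))   # average: 9.1
print(percentile(readings, 80))        # 80th percentile: 12

# The two peaks rise from 18 and 20 to 40 and 50 (e.g. a slow SQL, extreme paging):
spiky = [3, 40, 2, 4, 50, 7, 3, 12, 12, 10]
print(sum(spiky) / len(spiky))         # average jumps to 14.3
print(percentile(spiky, 80))           # 80th percentile stays at 12
print(percentile(spiky, 90))           # 90th percentile exposes the spike: 40
```

The same function makes regression checks between test rounds mechanical: if the 80th percentile moves from 12 to 15 secs, (15 - 12) / 12 gives the 25% degradation directly.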
#2-Once the bottleneck is found, the next step is not an integral part of Performance Testing
How far a tester goes beyond this mostly depends on the experience level of the person involved in the project, and that should be respected. However, pointing out the issue should only be the first step of Performance Testing. With experience, the people involved should develop the ability to understand the system and suggest possible workarounds for such scenarios. This may sound unpragmatic, but believe me, this is what the market demands and it should be answered. At the very least, a tester should reach the level of experience where they can suggest what the potential issue could be. The example below gives a better insight into the approach; of course, it is just one possibility out of the ocean of such issues you might face.
Let’s assume, for example, that one of the transactions you are dealing with shows a higher response time. What will your first step be? Personally, I would like to check the database logs. Let’s further assume, at the risk of becoming specific, that the database involved is Oracle. Ask for AWR reports. Analysing an AWR report requires an understanding of SQLs, execution plans, parsing, and bind variables. You can either go for the ADDM report (an automated analysis of the AWR report) or do the analysis yourself. Getting back to the question at hand: since the issue is response time, we can safely start with the top SQLs in the AWR report. Just looking at the top SQL will not solve your problem; you must read the query and confirm that it corresponds to the transaction in question. For instance, if your UI transaction is doing a submit and the top SQL is a select query, that is not the correct data to analyse.
Next, look at the query that rises to the top of the pile by elapsed time, since response time is the question. If the query ran only once or twice during the execution period and the elapsed time is still quite high, this could be your potential bottleneck. You can also consider the hard-parse to soft-parse ratio, which is normally high in cases where there is an SQL with high elapsed time as discussed. If this is what you conclude, one suggestion you can make as a possible resolution is to look for hard-coded values in the SQL queries.
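As a rough illustration of that triage, the sketch below ranks a couple of hand-made records by elapsed time and flags the "few executions, high time per execution" pattern described above. The dicts stand in for rows from an AWR "SQL ordered by Elapsed Time" section; the field names and thresholds are illustrative assumptions, not Oracle APIs:

```python
# Hypothetical stand-ins for AWR "SQL ordered by Elapsed Time" rows.
sql_stats = [
    {"sql_id": "a1b2c3", "text": "SELECT ... WHERE id = 12345", "elapsed_s": 480.0, "executions": 2},
    {"sql_id": "d4e5f6", "text": "INSERT INTO orders ...",      "elapsed_s": 35.0,  "executions": 900},
]

# Rank by total elapsed time, since response time is the complaint.
suspects = sorted(sql_stats, key=lambda s: s["elapsed_s"], reverse=True)

for s in suspects:
    per_exec = s["elapsed_s"] / s["executions"]
    # Few executions but high elapsed time per run: the pattern discussed above.
    if s["executions"] <= 5 and per_exec > 60:
        print(f"Potential bottleneck: {s['sql_id']} ({per_exec:.0f}s per execution)")
        # A literal such as 'id = 12345' where a bind variable (:id) belongs
        # forces hard parses; cross-check the hard/soft parse ratio in AWR.
```

The point is not the thresholds (which you would tune per system) but the order of reasoning: total elapsed time first, then per-execution cost, then the query text itself.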
#3-It is assumed that all kinds of Performance Testing (Load, Stress, Endurance, assisting in DR, etc.) will be executed if Performance Testing is employed for a project
This is something that always annoys me when I interview people; I get all kinds of answers. “I will do the load test first, and if that is successful I will consider doing stress and endurance tests.” This is true as far as it goes: if your load test is not successful, you cannot go ahead with a stress or endurance test. But it is not the reason for including the other tests in your plan. As I understand it, the tests in scope should follow from the Non-Functional Requirements (NFRs). Consider, for example, something like this:
“NFR # – The system should cater for 5% growth in user load YoY.” This clearly translates to you, as the Performance Tester, including a stress test in your plan. How you design that test is another topic.
“NFR # – The system should be able to sustain the stated application up time.” This translates to you including an endurance (long-haul) test in your plan.
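The NFR-to-test mapping above can be caricatured in a few lines. This is purely an illustrative sketch: the keyword lists are hypothetical, and real scoping is done by reading each NFR, not by string matching:

```python
# Hypothetical keyword hints mapping NFR wording to the test types discussed above.
NFR_HINTS = {
    "load":      ["concurrent users", "response time under"],
    "stress":    ["yoy", "growth", "peak beyond"],
    "endurance": ["up time", "sustain", "long haul"],
}

def suggest_tests(nfr: str) -> list[str]:
    """Return the test types whose hint phrases appear in the NFR text."""
    text = nfr.lower()
    return [t for t, hints in NFR_HINTS.items() if any(h in text for h in hints)]

print(suggest_tests("The system should cater for 5% growth in user load YoY."))
# -> ['stress']
print(suggest_tests("The system should sustain the stated application up time."))
# -> ['endurance']
```

The takeaway is the direction of the derivation: each in-scope test type should trace back to a concrete NFR, rather than being assumed because "Performance Testing is employed".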
There are many more such myths associated with Performance Testing, precisely because it is still considered a niche skill. But I strongly believe that experience and proactiveness are of utmost importance in resolving these myths and confusions, and in changing how we deliver to the client.
About the Author