[UPDATE] The original survey can be found at this link at Electric Cloud.
Electric Cloud, a leading provider of software production management solutions, completed a survey of software development professionals (developers, testers and managers). One of the major leads in the results was “the majority of software bugs are attributed to poor testing procedures or infrastructure limitations, not design problems.” Obviously, I was going to keep reading. Granted, the quote was a little overstated, but there were some very interesting points in the survey results.
So, what did they find? First, “Fifty-eight percent of respondents pointed to problems in the testing process or infrastructure as the cause of their last major bug found in delivered or deployed software.” Now, let me translate that. A major defect is typically a problem in software that causes the application to crash, save data in an inappropriate state or display information in an inappropriate manner. As a general rule, if a major defect is found after the application is deployed into production, it is a failing of the testing process. When I say testing process, I am spreading blame around. Developers did not have appropriate unit tests, QA did not have the appropriate test plans and the management team likely did not allocate the appropriate people or hardware to the project. Essentially, major defects are the fault of the entire team and should not be occurring.
There were two pieces of very bad news for the software industry. First, only 12 percent of software development organizations are using fully automated test systems. This is a damn shame, but there are also some good reasons for it. Automating user interface (UI) testing is extremely hard, time-consuming and fragile. So, it does not surprise me that fully automated testing is rare. Thankfully, fewer than 10 percent are doing only manual testing with no automation. The other major issue was that 53 percent of respondents said their testing is limited by compute resources. Given how cheap hardware is these days, this should not be an issue. However, limitations of the hardware environment continue to plague the industry. I know people want to buy a decent server for testing, but sometimes a simple desktop machine that costs less than $500 will suffice, especially if it is a UI automation machine. I understand we all need to control costs, but the money spent on hardware is nothing compared to the man-hours needed for manual testing.
People not in software development may wonder why all of this talk about defects and automated testing matters. Well, the survey had a really good point about this. Fifty-six percent estimated that their last significant software bug resulted in an average of $250,000 in lost revenue and 20 developer hours to correct. Lost revenue is always a big deal, but do not underestimate the cost of developer hours. Let’s assume that the cost of a developer is $800 per day, or $100 per hour on average. 20 developer hours equates to $2000, which does not seem like a big deal. However, there is also the testing time required, about 30% of development time, adding about $600. There is also management time required, which for defect resolution is fairly high. Add in another 25 to 30 percent of the development time for each management person involved, typically a development manager, a QA manager and a VP/Director-level person. The management team will typically cost more as well, about $125 per hour, which adds roughly $725 per person, or $2175. There are also the operational costs of deploying the corrected application into a QA environment and then the production environment, about three hours each for the people involved, roughly 18 hours in total at the developer rate, so we add another $1800. This brings our total cost, $2000 + $600 + $2175 + $1800, to $6575, plus all of the stress that goes with the fire drill. There is also the potential opportunity cost of delaying other projects because of the people required to fix this defect, but opportunity cost is hard to quantify without a specific context. All of this may not seem like a lot of money, but for a smaller departmental budget it could be very important. Also, compare the $6575 to the cost of the desktop PC that could have been used for automated tests that find that defect. This defect could have cost more than ten times the cost of the hardware.
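The back-of-the-envelope math above can be sketched as a quick script. All of the rates, hours and percentages here are the illustrative assumptions from the paragraph, not numbers from the survey itself.

```python
# Rough cost model for fixing one major production defect.
# Every rate and hour count below is an illustrative assumption.
DEV_RATE = 100   # dollars per hour for a developer
MGMT_RATE = 125  # dollars per hour for a manager

dev_hours = 20
dev_cost = dev_hours * DEV_RATE          # $2,000 of developer time

test_cost = 0.30 * dev_cost              # testing at ~30% of dev time, ~$600

mgmt_people = 3                          # dev manager, QA manager, VP/Director
mgmt_hours_each = 5.8                    # roughly 25-30% of the dev hours
mgmt_cost = mgmt_people * mgmt_hours_each * MGMT_RATE   # ~$2,175

ops_hours = 18                           # combined QA + production deploy effort
ops_cost = ops_hours * DEV_RATE          # $1,800

total = dev_cost + test_cost + mgmt_cost + ops_cost
print(f"Total: ${total:,.0f}")           # about $6,575
```

Swapping in your own team's rates and head counts is the whole point; even conservative inputs tend to dwarf the cost of a cheap test machine.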
A lot of people are probably agreeing with me in terms of the costs, but disagreeing with my simple “just add testing” idea. Typically, the problem is that basic automated testing is not easy. Good automated testing is hard. Automated web testing is even harder. However, automated testing is not expensive. There are a bunch of free tools available that many of your developers are probably familiar with. Some of the tools are useful for unit testing, others help with testing with a database and there are others that help with web testing. Another issue that keeps getting raised is that it is too hard to get 100% test coverage. I am not the first person to say this, but do not even try to get 100% test coverage. Developers do need a significant level of unit test coverage, probably above 90%, but acceptance tests and integration tests do not need that much test coverage. One of the best parts of testing is that once a defect is found, you can write tests for it. That way you ensure that if all of your tests pass, you have fixed the defect and it should not recur in the future.
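The “write a test for every defect you find” idea above can be sketched in a few lines. The function, the bug and the test names here are hypothetical examples made up for illustration, not anything from the survey.

```python
# Hypothetical defect: the cart total calculation crashed on an empty cart.
def cart_total(prices, discount=0.0):
    """Return the total price after applying a fractional discount."""
    if not prices:          # the fix: an empty cart is simply zero
        return 0.0
    return sum(prices) * (1.0 - discount)

# Regression test written the day the defect was found. It fails against
# the old code and passes once the fix is in, so the bug cannot quietly recur.
def test_empty_cart_does_not_crash():
    assert cart_total([]) == 0.0

def test_discount_applied():
    assert abs(cart_total([10.0, 20.0], discount=0.1) - 27.0) < 1e-9
```

Run under any test runner (pytest picks up `test_` functions automatically), the suite now documents the defect forever: a green build means that particular bug stayed fixed.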
So, there is nothing really stopping you from a solid testing plan. If people and opinions get in the way, then you can point to some of the information in the survey. If management says that writing automated tests will cost too much, ask them if it costs more than $250,000.
22 thoughts on “Survey Says: Developers Think Testing Is Failing”
Hi Rob Diana,
>> First, “Fifty-eight percent of respondents pointed to problems in the testing process or infrastructure as the cause of their last major bug found in delivered or deployed software.”
Who were the respondents? Testers? Checkers? How did they come to the conclusion? Do they understand testing?
I would be very happy to know what testing means to you. Is the task of finding major bugs called testing?
>> There is also the potential opportunity cost of delaying other projects because of the people required to fix this defect, but opportunity cost is hard to quantify without a specific context.
Reading the above statement and the word ‘context’ brought a smile on my face. Could you please explain the term ‘context’?
According to the survey, the respondents were software development professionals, described as developers, testers and managers. Obviously, this does not mean that they really understand the testing process.
To me, testing is the process of validating that an application works correctly or that there are defects. In some cases, a defect could include some minor CSS styling on a web page. However, it was not my survey, so I cannot tell you what they think testing is.
Regarding “context” and opportunity cost, my thinking was that opportunity cost is highly dependent upon the state of your competitors and possibly time (both calendar time and time to market). Hopefully, that clears things up for you.
With respect, I think that many of your comments are unsupportable, and bespeak a sad (albeit common) set of misconceptions of what testing can and cannot do. Testing problems aren’t the cause of defects, major or minor, any more than problems at the fire station are the cause of burning houses. Test automation will no more prevent development problems than power steering will prevent traffic accidents. Coding errors are not the source of most software problems. Indeed, crashes, malfunctions and data loss are major problems, but these are no less significant threats to a business than a product that simply fails to solve the problem for the person using that product.
However automation might assist us in testing (and I agree with you that it most emphatically can), automation cannot solve problems on its own until we change the way we think. Testing, whether done by programmers or by testers (and I agree that it’s crucial that excellent testing must be done by both) must be done with skill. Automation extends or accelerates testing indiscriminately, whether the testing is good or bad. Lousy testing, when automated, will result in lousy testing done faster (or, perhaps, slower when you include development cost).
I would recommend reading Jerry Weinberg’s book, Perfect Software and Other Illusions About Testing. I’d recommend reading some of Cem Kaner’s work on approaches to non-trivial test automation, in particular pages 2-14 and 56-62. I would, somewhat less modestly, recommend that you read a few recent blog posts that I’ve written on the subject: Is Unit Testing Automated? and the series of posts that starts with Testing vs. Checking.
I’ll leave it to others to address other problems in the post.
I believe you may have confused my opinions with the data in the survey (which is from a 3rd party), or I may misunderstand what you are concerned about. The testing process can help avoid finding defects in production, which is the real target. I wrote about the survey mainly because it makes it sound like there are many companies that still do not know how testing should be completed.
I also agree wholeheartedly with your 2nd paragraph, especially “Lousy testing, when automated, will result in lousy testing done faster”. Skill in testing is just as important as skill in development.
Lastly, regarding your blog, I have subscribed.
It seems to me that defects are most commonly caused by mistakes in programming, often caused by mistakes in communication between requirements givers and programmers, sometimes caused by mistaken requirements.
Defects are never caused by testers or testing. Sometimes mistakes by these other people could have been detected by better testing (by someone), but the cause of software defects may not appropriately be assigned to “poor testing” in my opinion.
I agree that defects are caused by mistakes in programming or communication. The survey does seem to “blame testers” but I am not of that opinion. If a major defect gets to production, testing does get some of the blame, mainly because they did not find the defect before production. However, as you stated, poor testing does not cause defects.
I also feel I don’t understand. What was the breakdown in the survey between self-described “developers”, “testers” and “managers”? I am curious whether the results indicate a bias in the survey toward one of the three groups. I think that is what Ajay was asking.
Anyhow, you say you aren’t blaming testers, but it seems you spend a great deal of time talking about these major defects and the costs to fix them. Based on the software engineering classes I had in college, the earlier you catch a bug, be it in the design, requirements, development or integration phase, the lower the long-term cost to fix it. I think a large number of these production bugs, though not all, are likely the result of errors in the conception or design of particular routines.
A well-implemented poor design is IMO still a poor design. If the error is in the design, then that’s where the bug should be caught and fixed. The key is to have testing involved all through the development project, finding the bugs as early as possible. As the TDD school would say, “fail early, fail often”.
BTW, I’d like to meet some of these developers that cost 100 bucks an hour to fix bugs, and yet want to blame the testers for failing to deliver quality code. Testers don’t assure quality; they give information about the state of the product. Quality has to be built in from the very beginning; it does not just happen at the very end.
That is not to absolve the testing side of things. I once worked at a company whose testing consisted of only testing what’s “been added”. It was no surprise that bugs began to creep in, because there were no real regression scripts to ensure existing functions still worked correctly. I was the one who took the initiative and began writing test scripts for very basic functions to check. They weren’t that detailed, but the list grew over time, and I’m told by a friend who still works for said company that he still uses and adds new tests to it as the software grows.
Also, automation is portrayed in some circles as a silver bullet. For starters, it isn’t; some things just are not automatable. Other things may be automatable, but due to the nature of change in the product, the test cases upon which they are based may quickly be made obsolete. Not to mention the cost of maintaining automation; automated tests often require a good measure of care and time to keep them current.
Automation is an important tool in the testing toolbox, but it should not be the screwdriver that we use as a hammer, a chisel, etc when that is not really what it was meant for.
The amount of maintenance and time to build a scripted test may seem small, assuming the same test never needs maintenance, but the first time you have to re-record a test or make a validation change, you’ve added time to the testing process. What’s more, unit testing is not necessarily automation, is it? Unit testing to me is only useful in verifying certain aspects of the code itself.
What’s more, since not all features can be automated, you still need testers working manually, exploring the project and putting the program through its paces. Unit testing can be a great help, but unit tests, like test automation frameworks, are only as good as the person maintaining them. You can have syntactically correct code that passes all of your unit tests, but if you don’t have testers beyond that point, many bugs, many serious errors in meeting requirements, will be missed.
Getting back to the survey: without knowing the demographic breakdown of who was surveyed and what they as groups said, it doesn’t sound all that statistically sound to me.
Respectfully I’d like to point out:
100% code coverage != 100% tested
As an example, I’d like to share that I work with some of the most talented TDD developers in the world, who produce outstanding code coverage, yet I find defects every test session. The most significant problems I find are most often oversights. There is no way to automate the unforeseen.
>> If a major defect gets to production, testing does get some of the blame, mainly because they did not find the defect before production.
What if testing team found the defect but the stakeholders did not agree to fix the defect?
Maybe the post http://bit.ly/cpLeO2 by Michael Bolton would help the readers more than the survey results.
I’d say you have misinterpreted the infrastructure problems mentioned above. From my testing experience, infrastructure issues cannot be resolved with $500. It helps, but not with those defects.
The problem is that production environments have:
1) More complex network structures
2) More firewalls
3) Load balancers
4) Different user/system security restrictions
5) No dev meddling with “just a small setting here should fix that”
6) Different systems which react differently no matter what the sales guy said
7) Larger data volumes
8) Different usage patterns
9) Different upgrade/update cycles (usually slower/more conservative)
10) The unexpected in the widest sense and many more…
Those are all things that either cannot be reproduced at all or cost so much that the $6750 or whatever you calculated is a pittance. Take a load balancer: those can easily cost $50k+ for one, and without the exact model you will not have a true 1:1 representation.
When you get to performance and load testing these differences can be the key to success or failure. I have customers spending hundreds of thousands maybe even millions just to build a system that is close to production for performance testing (nothing is ever equal!) and pre-production testing.
Even the $500 desktop you mention costs more than that ($500 + power + the person who manages it + licensing costs + network costs + …). I’m not saying your argument is wrong. Testers need systems, and the more the better (as is true for almost everything), but there is a bit more involved than you highlight here. And maybe testers need to push for more (equality) when we see the $$$ involved. Add a zero: $5000, or another: $50000. If the lost opportunity is a quarter million, then that won’t hurt, but I’d advise doing a decent job of it rather than trying to survive on a $500 system that still leaves 80% of risks unmitigated.
Maybe that is also why decent systems for testing don’t readily exist. They are complex to set up, expensive, and not easy to manage and maintain. I might be wrong, too, but this is my experience.
BTW: EVERY project has infrastructure problems on first release. Expecting anything else is like expecting to be able to walk seconds after being born. It is a learning process. What I criticise is that managers insist on going live and letting live users on the system pretty much the same day. I’d always advise having at least two weeks between the two!
A suggestion: please edit the post to include a link to the survey (or at least the press release, http://www.electric-cloud.com/news/2010-0602.php ). Give the reader easy access to primary data.
I must commend you on your choice of title for this post, which is a lot more honest than the title of Electric Cloud’s PR: “Survey Finds 58% of Software Bugs Result from Test Infrastructure and Process, Not Design Defects”. If a study *had* found such a thing and substantiated it, that would be big news indeed. But no: the survey found no such thing, it’s grossly misleading to put it that way, and it’s disheartening that a number of “news” outlets are spreading such misinformation around.
Such a survey reports on opinions, and thus is likely to be biased in any number of ways. (Not the least of which is that it is a survey commissioned and designed by a firm that sells software to development organizations.) In particular, when asking developers, who are famously subject to confirmation bias (see Myers’ 1976 Software Reliability Principles and Practices) what they think causes defects, it’s a little naive to expect them to point to their own errors as the culprit.
And yet – since software is, directly or indirectly, written by humans, we can confidently assert that the *cause* of all defects is a mismatch in some programmer’s mind, between what they think is true of the software and what is actually the case. We can speculate on whether improving the testing process, or stepping up automation, would improve the detection rate, or the cost to correct, and so on. We can even test such hypotheses (and my personal experience suggests that practices such as Test-Driven Development can help a skilled developer slash their defect rate).
But to assert that poor testing is the cause of defects *rather than* design errors, and to so assert on the basis of an opinion survey, is simply irresponsible. It shows a complete misunderstanding of how software development works, and it can only lull developers into a “I’m not to blame” attitude. The focus on “procedures” in particular is likely to be counter-productive, compared to a focus on *appropriate skills*, from technical skills to people skills.
I happen to agree with the conclusions you apparently think the survey supports, but the survey itself has no useful empirical basis; it’s simply a marketing gimmick, or at best an indicator of trends in the sociology of software development.
I’ll join your call for improved testing among software professionals. But “if people and opinions get in the way”, let’s point them to something more solid than silly press releases based on silly surveys. Please. (For instance, there’s a conference called Agile2010 where people could get a lot of concrete and useful information about such things as test-driven development, acceptance testing, exploratory testing, programmer-tester pairing, but also teamwork and collaboration. See http://agile2010.agilealliance.org/ )
“100% code coverage != 100% tested” is absolutely correct. 100% code coverage just means that each line of code is tested. This does not mean you have proper error handling, data validation or all sorts of other unexpected issues. This is one of the reasons that I dislike a target coverage percentage, it gives a false sense of security.
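A contrived sketch of that point: one test can execute every line of this function, giving 100% line coverage, and still miss the defect entirely. The code is purely illustrative.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    total = 0
    for v in values:
        total += v
    return total / len(values)   # defect: crashes on an empty list

# This single test executes every line above, so line coverage is 100%...
assert average([2, 4, 6]) == 4.0

# ...yet the edge case is untested: average([]) still raises
# ZeroDivisionError. Coverage measures execution, not correctness.
```

Which is exactly why a coverage percentage makes a poor target: it tells you which code ran, not whether the right behaviors were checked.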
The $500 PC was to argue against the idea that testing teams do not have enough resources to do the testing. In my experience testing teams get very little budget when compared to development teams. So a small PC to be used for automation would be a big win. This is not meant to address the real infrastructure that you are talking about.
Overall, I agree with what you are saying with regards to hardware like load balancers and how things never work the same as they do in production.
Thanks for the link to the original survey. I thought I had linked to it, but I must have missed it.
Part of the reason I wrote this post was because I was so surprised by the responses. If development teams had barely started unit testing, obviously they were going to have more problems.
I have written several posts about testing in various forms, and have lead the unit testing effort of legacy systems a few times as well. As you said, even for skilled developers, unit testing and TDD can slash defect rates.
If developers think testing is failing, why is Google spending more than 250 million bucks every year on testing?
May be it is YES, to the ones who do not know what testing is all about. The survey is RIGHT to the ones who, being testers, do not know how the code/program works.
What you say is probably true, “May be it is YES, to the ones who do not know what testing is all about.” People who have worked with good testing teams and developers who use TDD or even just write unit tests always state how important and helpful testing is.
(a) There were two pieces of very bad news for the software industry. (b) Only 12 percent of software development organizations are using fully automated test systems.
(b) is not true. NO software development organization uses a fully automated test system, just as no software development organization does, or could, use a fully automated programming system. (a) is not true either, because if organizations did have “100% test automation”, it would be very bad news for people.
There are certain kinds of testing Kool-Aid out there that seem tasty; but we know what happens when certain cults drink Kool-Aid. One is that if only we had enough machines or automated tests, our software would be perfect. This is like saying that if only we had enough robots, our cars would be perfect. The problem with the analysis is that it leaves out the fact development is a process of design, not assembly, and that testing is a questioning process, not a confirmatory one. It leaves out the idea that there are many kinds of threats to a product’s value, and that it’s not just coding or functional errors that threaten the value of the product.
“Testing the UI through automation” is not testing the UI at all; it’s checking certain functions in the program by inspecting a UI element.
Also, compare the $6575 to the cost of the desktop PC that could have been used for automated tests that find that defect.
Automated tests never find a defect. At best, they find a mismatch between one certain kind of expectation and a result. But it is a tester (or a programmer) who finds the defect, who decides whether the mismatch is a problem or not. The automation may have a role in assisting the tester to find the defect. The notion that automation will simply go out and find bugs is fallacious.
In your analysis of the cost of automated testing, you’ve successfully identified equipment cost in the form of what the machine costs. But you’re leaving out some key elements: development cost, maintenance cost, and transfer cost. You’ve accounted for some of these costs in the “fixing” column, but not in the development column, which weakens what otherwise might be a more plausible argument.
For automation-assisted testing, these costs tend to get lower as they get closer to the functional aspects of the code–that is, when they’re in the form of unit tests. When applied to the GUI level, they can become enormous and a burden on change; they also, typically, replicate a lot of what the unit tests could or should be doing, while removing a good deal of human observation and interaction from the package.
You appear to believe that test automation is all about checking. It isn’t. Test automation can be used to generate test data, to assist in searching and sort through results, to provide test oracles, to aid in investigation and exploration.
Your description of “manual testing” in a previous post goes like this:
The last level of testing is purely manual. I have heard this described as “bench testing”, “poke testing” and “monkey testing”. The idea is that you are using the system like any random user and seeing if anything breaks. The poke and monkey terms come from the idea that you are poking at random parts of the system or acting like a monkey and just pounding on the keyboard to see what happens.
It might be worth your while to study testing and recognize that the important bits–the exploration, discovery, investigation, and learning about a program–cannot be automated, and that “just pounding around on the keyboard to see what happens” is no more a part of skilled testing than “banging in a few commands and then hitting ‘compile'” is a part of skilled programming. Such a description is unworthy, and wrecks your credibility with actual testers and with everyone else, except with those who haven’t really thought about testing much.
The points that I’m calling out (there are more, but it’s getting late) bespeak a very simplified model of development and of testing. “Just add more machines” is not a solid testing plan. “Making sure that tests pass” is not a solid testing plan either; in fact, such an approach actively biases people away from discovering problems in the product.
When I first stumbled across this blog entry, I admit that my first reaction was “This guy doesn’t get it.” I then thought a bit more and considered that this was based on the press release that resulted from a survey. My next thought was “I wonder if the survey was created to direct respondents to a specific path or group of related answers?”
I did not post anything then, as time was short and the project was desperately behind.
When I revisited the blog, I found the link to the press release had been added – very helpful. Thanks.
I reread the post and the comments, including the many comments added since I had read it before. As a recovering code writer and systems analyst who currently participates in software development as a tester, I tend to agree with the bulk of the comments about testing – particularly Michael’s testing/checking observations.
I read the press release and all my alarm bells went off. I went to the company’s website (Electric Cloud) and on the first page opened – I see this statement. They are “the leading provider of solutions that automate, accelerate and analyze the software build – test – deploy process.”
This “study” and press release looks to be a marketing push and not much more.
As for other aspects of testing, I find that the views expressed by non-testers of software are similar to views I probably had when I was writing code. In the way-more-than 20 years I have been involved in all aspects of software development, I have found the challenges in testing to be remarkable. I also find the attitude of some code-writers remarkable.
Thanks for waiting before passing judgment 🙂 The blog post was obviously flawed as I did not have the press release link (not sure how I missed it), and I did not write it in a way that made it obvious that the statistics and some information was from the survey and not my opinion.
Disappointingly, there are still a lot of developers that do not see the value in automated unit tests or even a strong QA group. That is probably one of the few things the survey properly represented, the lack of understanding a lot of people have about testing.
Oh, I don’t know Rob – The posting from June of last year on testing paints a big target that’s pretty easy to hit… for testers. Depending on environment and application in question, “automated testing tools” may help – it is not a certainty in all cases. Building the discipline to do good testing is a challenge. The core issue revolves around how testing is developed.
The shop where I currently work has rigorous rules on developers unit testing their work. Then core code is sent through the “continuous unit test” environment (which runs constantly, with a variation of test steps being invoked). Some still try to do the bare minimum, counting on the continuous test to find errors. Then the poor folks blink and look on in amazement when the QA/Testing group asks if they did any unit testing, because their build failed the “sanity test.”
I know that when I was programming, my unit tests tended to be “happy-path-heavy” and just brush on conditions outside the happy path. The work involved in expanding the non-happy-path tests was time-consuming and required different thought processes.
Most people I know who write code have absolute confidence in their own abilities (although the guy in the next cube is an idiot), so they see anything more than this as a waste of time – “Of course it’s right! I wrote it!”
I would strongly suggest you check out the Weinberg book Michael mentioned. Of course, any of his books are extremely good and well worth the money.
“The posting from June of last year on testing paints a big target that’s pretty easy to hit… for testers.” Yes, I would agree when you say “for testers” 🙂 My target audience for most of what I write is software developers or software managers. I still run into plenty of developers that do not understand the benefits of unit testing, and really do not see the value in anything but formal QA testing.
Regarding the unit tests, they typically are “happy path” heavy to start mainly because they are testing what should work first. Once that is complete, most development teams do not seem to maintain tests in the same manner as their code. Most teams that I become a part of will have a rule that if a defect is found, we need to write a unit test that exercises the defect, then fix the code to make the test pass. This begins the process of “unhappy path” testing. Unfortunately, testing bad data and edge cases will typically be removed from the requirements when schedules get tight.
I do have confidence in my own abilities, but everyone’s abilities erode under stress. So, it is good to unit test at all times. I will check out the Weinberg book eventually, but I have a stack of books ahead of it.