Sunday, March 14, 2021
Wednesday, January 27, 2021
We just completed a 1.5-week intensive manual regression test phase in which we executed almost the complete set of all (several hundred) test cases. We are in a lucky situation: our documented test cases represent nearly 100% of all implemented features. If we achieve 70-80% test coverage, we get a really good picture of the overall quality of the product increment. That means that, aside from the many automated tests, it's worth doing some manual end-to-end regression testing from time to time.
While tracking the regression testing progress in a cloud-based test case management tool, we looked at the activity graph and it made us smile. It's exactly what we expected.
Last but not least, the weekend is getting closer. The first enthusiasm is gone and you're starting to get bored.
You hear music from your neighbour. The cafeteria gets louder. The sound of clinking glasses reaches your desk. It's time for a break, time to reboot your brain. TGIF! And now it's weekend time!
Monday, January 11, 2021
I recently stumbled over a small article about the "Semmelweis-Reflex". It was published in a Swiss magazine and I found it quite interesting. I translated and summarized it, because I saw an analogy to software testing:
In 1846, the Hungarian gynecologist Ignaz Semmelweis noticed in a delivery unit that the rate of mothers dying in one department was 4 percent, while in another within the same hospital it was 10 percent.
While analyzing the reason, he identified that the department with the higher death rate was mainly operated by doctors who were also involved in performing post-mortem examinations. Right after an autopsy, they went to attend to mothers without properly disinfecting their hands.
The other department, with the lower death rate, was staffed mainly by midwives who were not involved in any such autopsies. Based on this observation, Semmelweis advised all employees to disinfect their hands with chlorinated lime. The result: the death rate decreased remarkably.
Despite the clear evidence, the employees' reaction remained sceptical; some were even hostile. Traditional beliefs were stronger than empiricism.
Even though this happened more than 150 years ago, people haven't changed much since then. We still bias ourselves a lot. The tendency to reject new arguments that do not match our current beliefs is still common today. It is known as the "Semmelweis-Reflex". We all have our own experience and convictions. This is all fine, but it is important to understand that these are personal convictions that must not be mistaken for a general truth.
How can we fight such bias? Be curious. If you spontaneously react with antipathy to something new, force yourself to find pieces in the presenter's arguments that could still be interesting, despite your disbelief in the whole.
Second, make it a common practice to question yourself by saying "I might be wrong". This attitude helps you overcome prejudice by allowing new information to be considered and processed.
Source: translated to English and summarized from the original article "Warum wir die Wahrheit nicht hören wollen" ("Why we don't want to hear the truth") by Krogerus & Tschäppeler, Das Magazin, Switzerland, March 20, 2021
Back to testing:
The lesson I take from this article: in case we didn't do it before, we should start questioning ourselves more often. We should learn to listen and not hide behind dismissive arguments simply because what we are told doesn't match our current view of the world.
But this story has two sides. If I can be biased, then others may be biased, too. Not all information reaching my desk is necessarily right. The presenter's view of the world may be equally limited.
My personal credo therefore is not only to question myself, but also to question statements made by others, even if these confirm my current view of the world and even if it all sounds reasonable.
The presenter may also intend to achieve something without having to answer critical questions, or may want to avoid being challenged into discussions that uncover his or her own "sloppy" analysis.
Call me a doubting Thomas. I don't believe anything until I've seen it.
- If someone tells you "a user will never do that", question it.
- If someone tells you "users typically do it this way", question it.
- If someone tells you "this can't happen", question it.
- If someone tells you "we didn't change anything", question it.
I trust in facts only. This is the result of 20 years of testing software.
Come on, I don't even trust my own code. I must be some kind of maniac =;O)
Tuesday, January 5, 2021
This happened so many years back that I can't remember whether he really meant it seriously or was just making fun.
I mention this episode here because, even if you are looking at software from a black-box or business-oriented perspective, you should still be interested in how things work from a technical point of view.
If you don't know what's going on under the hood, you are going to miss important test cases. The more you know, the more targeted and effective your tests become.
James Bach (Satisfice) once gave a great example in one of his talks, asking the audience "how many test cases do we need?" while showing a diagram that consisted of a few simple rectangles and diamonds. Various numbers were shouted toward the speaker's desk, most of them wrong or inappropriate. When James Bach revealed the details behind this simplified diagram, the scales fell from the audience's eyes: there was much more behind this graph. Where people had guessed numbers like 2, 3 or 5 tests, it now became clear that this was just the start and that the real number of required tests would be many times higher than their first guess.
But I also have my own story to share with you, to demonstrate why it is important to ask more questions about the software under test.
A quick intro
Every passenger vehicle is registered using a 17-character vehicle identification number (VIN) which is unique worldwide. The first 3 characters identify the manufacturer.
- WVWZZZ6NZTY099964 representing a 1994 VW Polo
- WF0WXXGBBW6268343 representing a 2006 Ford Mondeo V6
- WBAPX71050C101516 representing a 2007 BMW 5series
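As a toy illustration (not code from the original project), the manufacturer prefix, known as the World Manufacturer Identifier (WMI), can be read off the first three characters of a VIN. The small WMI-to-manufacturer mapping below is an illustrative sample I added, not an authoritative table:

```python
# Toy sketch: extract the World Manufacturer Identifier (WMI),
# i.e. the first three characters of a 17-character VIN.
# The mapping below is a small illustrative sample, not a complete table.
WMI_TO_MANUFACTURER = {
    "WVW": "Volkswagen",
    "WF0": "Ford (Germany)",
    "WBA": "BMW",
}

def manufacturer_of(vin: str) -> str:
    """Return the manufacturer name for a VIN, or 'unknown'."""
    if len(vin) != 17:
        raise ValueError("a VIN must be exactly 17 characters long")
    return WMI_TO_MANUFACTURER.get(vin[:3], "unknown")

print(manufacturer_of("WVWZZZ6NZTY099964"))  # Volkswagen
print(manufacturer_of("WBAPX71050C101516"))  # BMW
```

This is exactly the kind of detail a tester can exploit: grouping test VINs by their WMI prefix, as described further below.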
In order to retrieve the details of each car's equipment, such as paint color, 2-door or 4-door, tinted glass, sunroof, etc., we submitted a VIN as input and received the complete vehicle information as XML output.
This service was an important part of a cloud-based service we developed and sold as part of a bigger solution to insurance companies and body shops. The customers used it to estimate the total repair cost of damaged cars. Frankly speaking, in order to perform an accurate estimation, one needed at least the part prices and the repair steps or workload. Our company was great in this business, since it had all that data from almost all car manufacturers. I have to admit, this is a simplified explanation of a more complex piece of software.
Back to testing
Our job was to make sure the vehicle identification service returned the correct car-related data. To test that, we took our own and our friends' cars as references, because that was the easiest way for us to verify that the data made sense.
What we didn't know yet: our company wasn't the owner of all the data. Depending on the manufacturer of the queried vehicle, our system either retrieved data from our own database or had to call a third-party web service to request the required information. This was the case for BMW and PSA (Peugeot, Citroen, Opel, Vauxhall), just to name two examples.
For the end user, this wasn't visible. All the user saw was a 17-character VIN field with a button to trigger the search, and then waited until we returned the complete vehicle information along with the equipment data.
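A rough sketch of the routing we testers eventually pieced together. All names here (`OWN_DATA_WMIS`, `EXTERNAL_WMIS`, `resolve_vehicle`) are hypothetical; the real implementation was never shown to us:

```python
# Hypothetical sketch of the routing logic as we imagined it from the outside.
# The manufacturer-to-source assignments below are made up for illustration.
OWN_DATA_WMIS = {"WVW", "WF0"}           # manufacturers whose data we owned
EXTERNAL_WMIS = {"WBA": "bmw-service"}   # manufacturers served by third parties

def resolve_vehicle(vin, own_db, external_services):
    """Return the vehicle XML, either from our own DB or a third-party call."""
    wmi = vin[:3]
    if wmi in OWN_DATA_WMIS:
        return own_db[vin]                           # our own database
    service = external_services[EXTERNAL_WMIS[wmi]]  # e.g. the BMW web service
    return service(vin)                              # third-party call
```

The point for testing: two VINs that look identical to the end user can exercise completely different back ends, so VIN selection silently decides which code path is covered.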
What does that mean for the development of test cases?
Knowing that the system, in some cases, communicates with third-party components to get the information is essential when testing this service. Simply submitting one VIN to see that the service responds correctly is not enough.
We needed to know what data we owned and what data was gathered by calling external services. Unfortunately, we never managed to get a comprehensive list to shed a little light on the innards of the software. Therefore, the only weapon for us testers was to test the service using a broad variety of exemplary VINs covering as many different manufacturers as possible.
But this is just half the story. When a request was submitted, our system cached the generated XML response for an unspecified period of time.
That means that if you submit a particular BMW VIN for the first time and later call the service again with the exact same VIN, the second call would no longer trigger the same code paths.
We also needed to know for how long the data was cached. Unfortunately, the owners of the product didn't reveal that secret to the testers. I could now start a new article on testability and rave about the necessity of extra functions for testers to flush the cache when needed, but that goes beyond the scope of this article.
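The caching behavior described above can be sketched like this. Since the real TTL was never disclosed to us, it is a parameter here, and the class name and structure are my own invention for illustration:

```python
import time

# Sketch of the caching behavior: the first lookup hits the backend
# (our DB or a third-party service); repeats within the TTL do not.
class VinCache:
    def __init__(self, lookup, ttl_seconds):
        self._lookup = lookup    # the expensive backend call
        self._ttl = ttl_seconds  # the real value was unknown to us testers
        self._store = {}         # vin -> (response, timestamp)

    def get(self, vin):
        hit = self._store.get(vin)
        if hit is not None and time.monotonic() - hit[1] < self._ttl:
            return hit[0]        # served from cache: backend NOT called
        response = self._lookup(vin)
        self._store[vin] = (response, time.monotonic())
        return response
```

This makes the testing problem concrete: a second identical query within the TTL never reaches the backend, so repeating a test run with the same VINs exercises only the cache path.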
Now that we have two different kinds of tests, namely...
- consider different brand names when testing VIN
- consider submitting the same VIN a second time to test the cache
..we should start thinking about what happens when one of the external services is unavailable.
How does our system react in such scenario, especially if the response isn't cached yet?
How would we even know that it's the external service and not our own code that got broken?
One of the approaches was to add extra tests examining the external services in isolation from our implementation.
Yet another approach, and this is what we did (because we were not given the details), was to have sets of different VINs belonging to the same manufacturer. For example, a set of 5 BMWs, another set of 5 Peugeots, 5 Fords, etc.
If an external service is down, then a whole set of such test cases would fail. If instead only 1 test within a set failed, then the root cause was probably somewhere else.
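The diagnostic heuristic above can be sketched as a small function; the function name and result wording are mine, but the logic is exactly the set-based reasoning described: a whole set failing points at the external service behind it, a single failure points elsewhere.

```python
# Sketch of the set-based failure heuristic: group test results per
# manufacturer set and interpret whole-set failures differently from
# isolated ones.
def diagnose(results_by_set):
    """results_by_set: {"BMW": [True, False, ...], ...}  (True = test passed)"""
    suspicions = {}
    for name, results in results_by_set.items():
        if not any(results):
            suspicions[name] = "external service for this set may be down"
        elif not all(results):
            suspicions[name] = "isolated failure, root cause probably elsewhere"
    return suspicions
```

For example, five failing BMW tests together would flag the BMW service, while one failing Ford test among five would be flagged as an isolated failure.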
The circles below demonstrate it visually. At the beginning, one knows little to nothing about the system or component under test (1). By questioning how things work, one discovers new areas that may or may not be relevant while developing new test cases (2). Knowing and understanding these areas enlarges the circle of awareness and minimizes the risk of skipping important test cases.
Developing test cases is not a monotonous job. On the contrary: it requires asking questions, sometimes critical questions, and having a good understanding of which test scenarios are meaningful to cover and which can be skipped without accepting high risks. Sometimes you need to be persistent when it comes to interviewing techies to get the details, especially when asking for testability hooks.
Although testing activities face a lot of repeating patterns of software defects, developing appropriate test cases is a task that involves a lot of thinking. That's why I say: no test is the same, use your brain!
BTW, I forgot to add: not all external service providers allowed us to cache their data, even if only for one day. That's an additional test to think about, and I am sure I forgot to mention many more tests we actually developed at that time.
Below, find a simplified flowchart of the vehicle identification service.
Sunday, November 22, 2020
Sunday, October 25, 2020
Saturday, October 10, 2020
During my holidays, I read two fantastic books about problem solving and focusing on the important stuff. "Range" by David Epstein fascinated me with its many examples of how people without specialized knowledge in a particular area could find solutions to problems where the best experts got stuck. The book rejects the common belief that one has to start and specialize early in order to really get good at something. It lists plenty of famous people who demonstrated the contrary, such as Roger Federer, Vincent van Gogh, Dave Brubeck, etc.
Experimenting in your career and trying out different things broadens your ability to look at problems from different angles and to find solutions that are much harder to identify if you can't get out of your box. It also explains the success story of Nintendo, which was once a small company and not very attractive to highly talented graduates.
The other book, "Simplicity" by Benedikt Weibel, former CEO of the Swiss Federal Railways (SBB), goes in a similar direction with different and less detailed examples. He analyzes how the best chess players think in terms of patterns and how to focus on the essential stuff. Less is more. Weibel encourages making more use of checklists, and he also makes a heretical remark when he says that "without a great checklist, Sullenberger would not have managed to bring the Airbus down on the Hudson safely" (that's not really my opinion; I think it was a mixture of everything: great experience, courage, and a little bit of luck).
Big Data is an interesting tool, but it does not solve our problems and it is not free of failures (examples in the book). I am not advertising but simply sharing my thoughts on two great books full of valuable hints and references, although I know it will be difficult not to fall back into old habits.
Interesting related reading:
- The Carter Racing Case Study: https://www.academia.edu/20358932/Carter_racing_case
- The missing bullet: https://onebiteblog.com/finding-the-missing-bullet-holes/
Thursday, October 1, 2020
The tickets were raised a long time ago but, because of release pressure, they were parked in the backlog and assigned low priority. We called it the "parking place". At that time, a restrictive process of prioritizing tickets was needed to manage the deadlines. The limitations were accepted for a while. But over time, things changed. What was once rated low priority for release N all of a sudden became more important in the next release, and at a bad time. For many stakeholders, including me, these tickets came out of nowhere like submarines; and of course, all at the same time.
As a lesson learned, we as testers have decided to get better by regularly reviewing parked tickets. This is to help everyone in the team become aware earlier of potentially dangerous new "submarines" intending to surface soon. There would still be enough time then to tag them before they have a chance to shoot.
Sunday, September 27, 2020
I struggled with some of our new features when looking at them from an end-to-end workflow view. I was convinced we hadn't taken the users' unexpressed requirements into account enough to not just get a user's job done, but to get it done quickly.
To find out whether we were right, we prepared a plan.
When the customers were invited to our offices the next time to test our new features, for once we didn't prepare the typical test case checklist. This time, instead, we prepared just one simple and realistic end-to-end scenario that combined the features of the old version with the new ones. Don't get me wrong, I love checklists; they are a perfect tool to guarantee we don't forget anything. But for this particular case, we needed to step out of the standard operating procedure. Instead of ticking off an atomic feature list, this time we took into account each feature's place in a real-life scenario.
We were really surprised by the reaction of the domain experts. This approach revealed the "pilikia" (Hawaiian for "trouble"). All experts agreed this process required improvement.
As a result, initially low-rated internal tickets received higher attention and unfortunately had to be implemented as late change requests in the middle of the stabilization phase. My bad. The timing was a disaster. It was so bad, it was almost good. These late changes paid off. The improvement was significant and worth it. The customer appreciated the correction and our ability to respond quickly to their concerns.
They probably did because we "shocked" them with what they were getting next. Okay, it worked this time, but I guess we shouldn't do this too often.
Sunday, September 6, 2020
Sunday, August 30, 2020
As a software tester, I often feel like an archeologist. I analyze bits and pieces found in requirements, use cases, user stories, emails, phone calls, balcony talks, meetings, defects, etc. I collect all these widespread pieces of information and try to put them back together as accurately as possible. Unlike the two men in the excavation, I like this kind of job. When the connected pieces turn into a nice picture in my head, I glue them together in the form of well-documented test cases or, if it is a really large "dinosaur", in a final report that contains all the information needed. I do it mainly for myself, but I also do it for other stakeholders so they don't need to go through the same laborious process when someone has the same questions.
Another analogy for a tester is the explorer who collects facts and numbers, painstakingly noting observed behavior and measurements. The explorer then groups the data and analyzes the collected material, trying to find patterns that allow for interesting new findings that either confirm or refute assumptions made upfront. These conclusions are then (hopefully) used to make accurate decisions elsewhere (and also at your own desk).
Yet another analogy is that of a detective. Testers are regularly cluttered with fake information. Of course, we also get accurate information, but it's all mixed in one bucket and we need to sort it out. Unlike a criminal who tries to protect himself with lies, we may get information that has either not been accurately analyzed or was invented to stop others from asking and investigating further. We, the testers, get information that is based on assumptions, and it is our job to question everything. Some nice guy once stated: "testers question everything, this includes authority".
There is also an analogy to medical practitioners. The patient is interviewed by the doctor, whose goal is to identify the root cause of the suffering. The software under test is the patient, while the tester is the doctor who examines the sick patient. When successful, the doctor identifies the root cause and solves the problem by prescribing selective medicine for treatment. The tester does something similar: she raises a defect and describes the symptoms as accurately as possible. If the examination reveals a problem that is not within the doctor's specialized field, she may delegate additional examinations to someone who is more specialized in the area where the doctor suspects the problem lies. Likewise, the tester assigns the ticket to a developer or DevOps engineer for further investigation.
In case you have more examples of such analogies, simply post a comment to this blog or drop me a message on LinkedIn or whatever channel you like.