Showing posts with label QA. Show all posts

Wednesday, October 2, 2024

The Birthday Paradox - and how AI helped resolve a bug

While testing a new web-based solution intended to replace the current fat client, I was assigned an interesting customer problem report.

The customer claimed that since the deployment of the new application, they noticed at least two incidents per day where the sending of electronic documents contained a corrupt set of attachments. For example, instead of 5 medical reports, only 4 were included in the generated mailing. The local IT administrator traced the incidents to duplicate keys being assigned to the attachments from time to time.

But they could not determine whether the duplicates were created by the old fat client or the new web-based application. Both products were used in parallel: some departments used the new web-based product, while others still stuck to the old one. For compatibility reasons, the new application also inherited antiquated algorithms, to avoid messing up the database and to guarantee parallel operation of both products.

One of these relics was a four-character alphanumeric code for each document created in the database. The code had to be unique only within one person's file of documents. If another person had a document using the same code, that was still OK. The code was used (e.g., in court) to refer to a specific document, so it had to be easy to remember.

At first, it seemed very unlikely that a person's document could be assigned a duplicate code, and there was a debate between the customer and different stakeholders on our side.
The new web application was claimed to be free of creating duplicates, but I was not so sure about that. The customer ticket was left untouched, and the customer lost out, until we found a moment, took the initiative, and developed a script to monitor all new documents created during our automated tests and during manual regression testing of other features.
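A minimal sketch of what such a duplicate-watch script can look like, assuming a hypothetical `documents` table with `personid` and `mycode` columns (the real schema and database driver were different; SQLite stands in here for illustration):

```python
import sqlite3  # stand-in for the real DB driver; table/column names are illustrative


def find_duplicate_codes(conn):
    """Return (personid, code, count) rows where the same code was
    assigned more than once within a single person's documents."""
    return conn.execute(
        """SELECT personid, mycode, COUNT(*) AS num
           FROM documents
           GROUP BY personid, mycode
           HAVING num > 1"""
    ).fetchall()
```

Scheduled hourly (e.g. as a Jenkins job, as we did), it only needs to raise an alarm whenever the result set is non-empty.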

The script was executed once an hour. For a week we never saw any duplicates; then, all of a sudden, the Jenkins job raised an alarm reporting a duplicate. That moment was like Christmas, and we were excited to analyze the two documents.

In fact, both documents belonged to the same person. Now we wanted to know who had created them and what scenario had been applied in this case.

Unfortunately, it was impossible to determine the author of the documents. My test team claimed not to have done anything with the target person. The name of the person for whom the duplicates were created occurred only once in our test case management tool, and not in a scenario that could have explained the phenomenon. The user ID (author of the documents) belonged to the product owner. He assured us he had not done anything with that person that day and explained that many other stakeholders could have used the same user ID in the test environment where the anomaly was detected.

An appeal in the developer group chat did not help resolve the mystery either. The only theory on the table was "it must have happened during creation or copying of a document". The simplest explanation would have been the latter: the copy procedure.

Our theory was that copying a document could result in the same code being assigned to the new instance. But we tested that; copying documents worked as expected. The copied instance received a new unique code that was different from its origin. Too easy anyway.

Eager to resolve the mystery, we asked ChatGPT about the likelihood of duplicates occurring in our scenario. The response was juicy.

It claimed an almost 2% chance of duplicates if the person already had 200 assigned codes (within his/her documents). That was really surprising, and when we asked ChatGPT further, it turned out the probability climbed to 25% if the person had 2,000 distinct codes assigned in her documents.

This result is based on the so-called Birthday Paradox, which states that it takes only 23 random individuals to get a 50% chance of a shared birthday. Wow!
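The underlying math is easy to check yourself. A short sketch of the exact birthday-problem formula, applicable both to birthdays (d = 365) and to our code space:

```python
def collision_probability(n, d):
    """Probability that at least two of n random draws from d
    equally likely values collide (the birthday problem)."""
    p_unique = 1.0
    for i in range(n):
        p_unique *= (d - i) / d
    return 1.0 - p_unique


# The classic case: 23 people, 365 possible birthdays
print(round(collision_probability(23, 365), 3))  # → 0.507
```

Plugging in the size of the code space instead of 365 gives the collision chances for a person's documents; the probability grows surprisingly fast with the number of assigned codes.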

Since I am not a mathematician, I wanted to test the theory with my own experiment. I started writing down the birthdays of 23 random people within my circle of acquaintances. Actually, I could already stop at 18: within this small set I had identified four people who shared a birthday. Fascinating!

That egged us on to develop yet another script and flood one fresh exemplary person record with hundreds of automatically created documents.

The result was revealing:


Number of assigned codes for 1 person:     500   1000   1500   2000   2500
Number of identified duplicates (Run 1):     0      0      5      8     11
Number of identified duplicates (Run 2):     0      0      3      4      6

With these two test runs, we could prove that the new application produced duplicates, provided the person already had enough documents assigned upfront.

The resolution could be as simple as this:

  • When assigning a code, check the existing codes and re-generate if needed (could be time-consuming depending on the number of existing documents).
  • When generating the mailing, check all selected attachments, then either automatically correct duplicates by re-generating their codes, or warn the user so they can correct them manually.
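The first option can be sketched in a few lines. This is a hypothetical illustration, not the product's actual code; the alphabet, code length, and function name are assumptions:

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits  # assumed 36-character alphabet
CODE_LEN = 4


def assign_unique_code(existing_codes, rng=None):
    """Draw a random 4-character code and simply re-draw on collision.
    `existing_codes` is the set of codes already used within one person's file."""
    rng = rng or random.SystemRandom()  # OS entropy; cannot forget to "randomize"
    if len(existing_codes) >= len(ALPHABET) ** CODE_LEN:
        raise RuntimeError("code space exhausted for this person")
    while True:
        code = "".join(rng.choice(ALPHABET) for _ in range(CODE_LEN))
        if code not in existing_codes:
            return code
```

With realistic document counts the retry loop almost never runs more than once, so the check is cheap as long as the existing codes are available as a set or indexed column.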

What followed was a nicely documented internal ticket with all our analysis work. The fix was given highest priority and made it into the next hotfix.

When people ask me what I find so fascinating about software testing, this story is a perfect example. Yes, sure, often we have to deal with boring regression testing, or with repeatedly throwing pieces of code back to the developer because something was obviously wrong. But the really exciting moments for me are puzzles like this one: fuzzy ticket descriptions, false claims, obscure statements, contradictory or incomplete information, accusations, and nobody really finding the time to dig deeper into the "forensics".

That is the moment where I cannot resist and love to jump in. 

But the most interesting finding in this story has not been revealed yet. While looking at the duplicates, we noticed that both ended with the character Q.


And when looking closer at the other, non-duplicated codes, we noticed that actually ALL codes ended with the character Q. This was even more exciting. This fact reduced the number of possibilities from 1.6 million variants down to only 46,656, and with it raised the probability of duplicates.
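The arithmetic behind that reduction, assuming a 36-character alphabet (A-Z plus 0-9), can be checked in a couple of lines:

```python
# Full space of a 4-character alphanumeric code (A-Z plus 0-9 assumed):
full_space = 36 ** 4       # all four characters vary
# With the last character fixed to 'Q', only three characters vary:
reduced_space = 36 ** 3
print(full_space, reduced_space)  # → 1679616 46656
```

A 36-fold smaller code space makes collisions correspondingly more likely for the same number of assigned codes.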
 

In the end, I could also prove that the old legacy program was not 100% free of creating duplicates, even though it was close to impossible. The only way to force the old program to create a duplicate was to flood one exemplary person with 46,656 records, meaning all codes were then used up. But that is such an unrealistic scenario that it is purely academic.

As it turned out, the problem was a configuration issue in a third-party vendor's randomizer API. To make the randomizer create random values, you had to explicitly tell it to RANDOMIZE. =;O) The title of this blog post could therefore also be "Randomize the Randomizer" =;O)
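The vendor's API is not named here, so as an illustrative sketch only: Python's `random` module seeds itself automatically, but the same trap exists wherever a generator starts from a fixed or default seed. Two generators with the same seed emit the exact same "random" sequence:

```python
import random

# Two generator instances created with the same fixed seed produce the
# exact same "random" sequence -- the trap behind "Randomize the Randomizer":
gen_a = random.Random(42)
gen_b = random.Random(42)
assert ([gen_a.randrange(36 ** 4) for _ in range(5)]
        == [gen_b.randrange(36 ** 4) for _ in range(5)])

# Seeding from OS entropy (the default when no seed is given)
# makes each run's sequence unpredictable:
gen_c = random.Random()
```

If every application instance starts from the same default seed, identical code sequences across sessions are guaranteed, which matches the duplicates we observed.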


Further notes for testers
Duplicate records can be easily identified in Excel: on the Data tab, in the Data Tools group, choose "Remove Duplicates".


What was also helpful in my case was the online tool at https://www.zickty.com/duplicatefinder, where you can paste your full data set and it returns all duplicates.

Yet another alternative is to use an SQL GROUP BY query, as in my case:

SELECT COUNT(*) AS num, mycode FROM table WHERE personid=xxx GROUP BY mycode HAVING COUNT(*) > 1 ORDER BY num DESC;



Thursday, October 1, 2020

Tag submarines before they shoot

In the retrospective meeting, it was claimed that we could probably have raised some hot issues earlier. Addressing these items late put people under pressure shortly before roll-out. Although it is true that we had quite a few late change requests, which hurt not only devs but also testers, most of the issues were not late discoveries but rather late awareness of their importance.

The tickets had been raised long before but, because of release pressure, were parked in the backlog and assigned low priority. We called it the "parking place". At that time, a restrictive process of prioritizing tickets was needed to manage the deadlines. The limitations were accepted for a while. But over time, things changed. What was once rated low priority for release N suddenly became more important in the next release, and at a bad time. For many stakeholders, including me, these tickets came out of nowhere like submarines; and of course, all at the same time.

As a lesson learned, we as testers have decided to do better by regularly reviewing parked tickets. This helps everyone in the team become aware earlier of potentially dangerous new "submarines" intending to surface soon. There would still be enough time then to tag them before they get a chance to shoot.



Saturday, August 29, 2020

It was a bear, for sure (overvalued bugs)


Almost every year we go skiing in the mountains for about a week. We usually rent the same house, which is surrounded by a huge garden. One morning - it's quite a while back now - we encountered a series of big footprints in the garden. They were really large. Next to them was a big bloody bone, which I assumed to be the mortal remains of a sheep. "What the heck happened here, and whose footprints could these be?"

After measuring the size and analyzing the shape, I put a 2-Swiss-franc coin into the track for scale and muttered to myself, "this was clearly a bear". I must add that, until this date, the only tracks I had ever reliably identified were those of a small rabbit. Regardless, everyone agreed: my wife, my kids, and my parents-in-law, who were joining us on holiday. It was enough confirmation to take some photos and show them at the local tourist information bureau. The lady there was also amazed by the photos. She picked up the phone and called the forest ranger. While waiting at the counter and listening to the call, I heard a cat being mentioned. Wait a minute...!

"Are you kidding me?", I protested. "Can you please tell the ranger about my photos and send those to him?", I added mortally offended. 

If a cat can be mentioned at all, then only because these footprints had the size of the mark a cat leaves during its after-lunch nap in the snow. That's for sure.

The ranger promised to check our garden shortly before dawn and let us know. Of course, he never did, and so I showed the photos to a local farmer. He confirmed he had come to the same conclusion as me. He guessed that the ranger and the tourist office may both have been afraid the story and the pictures could make it into the news and scare away the tourists.

That's good. Finally, an expert who confirmed what I had thought from the very beginning. Back home two weeks later, I sorted my photos and decided to send a few of them directly to the ranger who had never visited our holiday house. I was confident that once he saw the pictures, he would be convinced and confirm my theory.

But the ranger - who responded promptly - came up with a very interesting new theory. He claimed it was not likely the tracks had been left by a bear: at that time of year, bears are usually still hibernating. He stated the tracks were left either by a wolf or by a big dog.

He explained that the huge size of the footprints may have been caused by atmospheric conditions that can make footprints grow by at least 50% in size within a relatively short time.

To be honest, I'd have loved to hear something different, but it was an interesting argument, and these statements made a lot more sense than the original comparison to a domestic cat. Of course, I am still not really convinced, but at least I am now at a point where I have to admit there may be more than just my one explanation.
 
[Photos: the assumed bear's footprint; the assumed mortal remains of a sheep]
 
How is that story related to testing? 

From time to time we find cool and unexpected bugs that have the potential to be a really big thing; the one killer bug. We may get euphoric, record it immediately, and probably all too easily forget to collect more facts before we celebrate the great finding and label it highest priority. Often, after looking more closely at an ugly-looking bug and spending more time understanding its impact, the probability of the anomaly occurring in real life, and the number of affected users, what's left may be the sobering insight that we've just found another low, maybe medium priority bug, and that we made a mountain out of a molehill upfront. It's great to find bugs, but be careful with the initial rating without having a second pair of eyes look at it.

By the way...., I am still convinced it was a bear, for sure =;O)