Saturday, May 1, 2021

Apostrophe, Ampersand, Exclamation mark & Co.

 by T. J. Zelger

Since I am in the testing business, I am preaching the use of special characters and umlauts in whatever we test, wherever we test it.

Umlauts (ä, ö, ü) and special characters such as é, â, etc. are very common in Europe. In the US and elsewhere, you often see last names with an apostrophe, such as O'Neill or O'Connor, or think of company names like Macy's and McDonald's.

Sometimes we forget to include these characters in our tests, and then we promptly pay the price in the form of bugs reported in the field.

But awareness is increasing in our team, and I am really delighted when I see a colleague posting a note in the group chat:

"Hey Buddy, there were no special chars in your demo.Will it work with an exclamation mark, too?"

I love that, but I also recommend first testing a new user story with simple inputs. There is no point in attacking an application with special chars if it can't even deal with simple ones. Once I've convinced myself that the happy case works, I move on to feeding the program more difficult input.

It's always exciting to watch how a system deals with a combination of chars like
"{äöü},[áàâ]/|éèê|\ë!:;+(&)'%@", but if it fails, and if this is the exact string you include in a defect report, developers and/or business analysts will likely give you the hairy eyeball and close the issue unfixed, thereby suffering from the delusion that these inputs won't be entered in production.

What has worked better for me is to analyze which character causes a problem and then use it in a meaningful context, so it becomes obvious to the stakeholders that the problem you spotted is worth fixing.

For example, I've made it a habit to use M&M's Worldstore Inc. whenever I use or create a company address. That's not because I love chocolate so much, but rather because it has a mix of special characters that have all caused headaches in my past and current career as a software tester: the ampersand, the apostrophe and the dot.

The same approach applies to street names and ZIP codes. It is a common misbelief that ZIP codes have to be numeric. Visit the UK and have a look at what their postcodes look like. Street names can contain slashes, dashes and single quotes. Street numbers can be alphanumeric or separated by a slash.

A blank in a name isn't exotic at all. Look at Leonardo Di Caprio. I've seen programs that cut away the second part, leaving the famous actor with the short name "Di".

Below find a few examples of names I use regularly in testing:

  • M&M's Worldstore
  • Mr. & Mrs. O'Leary
  • Léonardo Di Caprio 
  • Renée-Louise Silberschneider-Kärcher
  • Praxisgemeinschaft D'Amico & Ägerter GmbH

Example street names:

  • 29, Queen's Gardens
  • 4711 Crandon Blvd., Apartment F#1000
  • Rue du Général-Dufour 100-102
  • 55b, Rue de l'Université 
  • Elftausendjungferngässlein 12b

and ZIP codes:

  • 4142 Münchenstein 1 (Switzerland)
  • 33149 Key Biscayne, FL (USA)
  • EH4 2DA Edinburgh (Scotland)
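If you want to wire such values directly into automated checks, a small table of test inputs goes a long way. Below is a minimal sketch in Python; the record layout and the create_address() function are hypothetical placeholders for whatever your system under test actually offers.

    # Hypothetical sketch: feed the "nasty but realistic" names, streets and
    # ZIP codes from above into one simple round-trip check.
    ADDRESSES = [
        ("M&M's Worldstore", "29, Queen's Gardens", "EH4 2DA Edinburgh"),
        ("Mr. & Mrs. O'Leary", "55b, Rue de l'Université", "4142 Münchenstein 1"),
        ("Praxisgemeinschaft D'Amico & Ägerter GmbH",
         "Elftausendjungferngässlein 12b", "33149 Key Biscayne, FL"),
    ]

    def create_address(name: str, street: str, zip_code: str) -> dict:
        """Placeholder for the real call into the system under test; here it
        simply echoes the input so the example runs."""
        return {"name": name, "street": street, "zip": zip_code}

    def test_addresses_with_special_characters():
        for name, street, zip_code in ADDRESSES:
            stored = create_address(name, street, zip_code)
            # The round trip must not swallow apostrophes, ampersands,
            # umlauts or the blank in a name.
            assert stored == {"name": name, "street": street, "zip": zip_code}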

Special characters are also interesting in places where you can submit different kinds of text.

If you submit a message like "meet you at Lexington Av/59 St" to a program that stores information in an XML file without properly escaping the input or embedding it in a CDATA section, you may have found an interesting bug.
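To make that concrete, here is a minimal sketch, assuming a program that builds its XML naively via string concatenation; the element name is made up, and Python's standard xml.sax.saxutils.escape is used purely for illustration.

    # Naive XML building vs. escaping the user input first.
    from xml.sax.saxutils import escape

    message = 'meet you at "Lexington Av/59 St" & <platform 4>'

    broken = "<note>" + message + "</note>"        # not well-formed XML
    safe = "<note>" + escape(message) + "</note>"  # & and < get escaped

    print(broken)
    print(safe)  # <note>meet you at "Lexington Av/59 St" &amp; &lt;platform 4&gt;</note>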

Backslashes are used in many programs to escape characters, and quotes and single quotes are used to terminate text literals. The latter is a common tool for hackers trying to manipulate the executed SQL statement by attaching an additional OR clause. The goal is to make the program return information for which one is not authorized or for which the query wasn't prepared.
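A minimal sketch of that idea, using an in-memory SQLite database purely for illustration; the table and column names are made up:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, secret TEXT)")
    conn.execute("INSERT INTO customers VALUES ('O''Leary', 'classified')")

    term = "nobody' OR '1'='1"

    # Naive string concatenation: the attached OR clause returns every row,
    # and even a harmless search for O'Leary would raise a syntax error.
    leaky = conn.execute(
        "SELECT * FROM customers WHERE name = '" + term + "'").fetchall()

    # Parameterized query: the input is treated as data, not as SQL.
    safe = conn.execute(
        "SELECT * FROM customers WHERE name = ?", (term,)).fetchall()

    print(leaky)  # [("O'Leary", 'classified')]
    print(safe)   # []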

Semicolons can confuse web client logic or cause problems when information is stored in CSV files. 
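Again a minimal sketch; the field values are made up, and the csv module's quoting is shown as the counterpart to naive string joining:

    import csv, io

    row = ["M&M's Worldstore", "meet at Lexington Av/59 St; bring ID"]

    naive = ";".join(row)  # the embedded ';' shifts every later column

    buf = io.StringIO()
    csv.writer(buf, delimiter=";").writerow(row)  # the field containing ';' is quoted

    print(naive)
    print(buf.getvalue(), end="")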

Question marks and ampersands both have special meaning in URLs, curly braces are common markers in REST payloads, etc.
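For the URL case, a minimal sketch with Python's urllib; the host and parameter names are invented:

    from urllib.parse import urlencode

    params = {"company": "M&M's Worldstore", "q": "why? & how"}
    print("https://example.test/search?" + urlencode(params))
    # https://example.test/search?company=M%26M%27s+Worldstore&q=why%3F+%26+how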

I've never really understood what's so great about using the pangram "the quick brown fox jumps over the lazy dog" or the filler "Lorem Ipsum" in testing. Neither contains any special characters. When testing German-language programs, I prefer to use "Victor jagt zwölf Boxkämpfer quer über den großen Sylter Deich" because it contains umlauts and this weird ß character. On Wikipedia you will find a lot more interesting pangrams.

Most often, I use my own extended version of the brown fox pangram that I have adapted to my personal needs:

René Di Agostino-Cruise has a fox living in a box opposite "Ottawa's Deacon Brodie" (Scottish Pub & Largest Whisky Collection). Each day he visits his friend and was once seen jumping quickly over the neighbours' cat  with a measured speed of > 50 km/h! How amazing is that?  

Agreed, it's a little bumpy. There is a lot of room for improvement, but it's a text I can remember, and it contains a whole set of interesting characters to challenge text processors.

One of my funniest treasures is the Canadian town St. Louis du Ha! Ha! It is the weirdest town name I've ever heard of, because it contains two exclamation marks. I have to admit, I never reported defects when submitting such a town name failed. The likelihood of such a city ever being entered by our customers is close to zero. Canadians living in that area of Quebec would probably disagree. Testing and test input is always a question of context.

Null, True, All, Test and other funny bugs related to people's names

Besides the recommendation to use special characters in our tests, it is also worth sneaking a peek at reported stories related to people's names like Jennifer Null and Rachel True. Their names were processed and misinterpreted as NULL [BBC16] or, in the case of Rachel, as a boolean value [9TO5], causing a problem with her iCloud account. I've not experienced either of the two cases myself in any of my tests, but we found a similar issue where submitting the search term "Alli" returned documents titled "Alligator" and "Allister". All fine, but submitting "All" ended in an exception.

Stephen O, a Korean native living in the US, was hassled by credit card companies because his last name was too short. When he applied the workaround of adding an extra O to end up with "OO", it didn't take long until he had a meeting with the government because he had provided false information [NYT91].
 
A similar story was mentioned in a column by Rod Ackerman in the series "Made in USA" in the Swiss newspaper Basler Zeitung. A US agency insisted on entering two initials for his first names. Since he had only one, he asked the clerk to enter R.R. Ackerman to work around the problem. Rod got into legal trouble for misrepresentation [BAZ].

Graham-Cumming could not register his name on a website because the hyphen was considered an invalid character [GRA10].

There is also Natalie Weiner, who could not register either because her name was considered offensive [PAN18].

Yet another example is the story of William and Katie Test, who were both unable to book airplane tickets simply because the system scanned the names for the term "test". A match triggered a security procedure that prevented the program from booking anything in the production environment [COY17].

Wooha! That's exactly what one of my clients had implemented, too! I was surprised there were never any reported issues with it. So I investigated a little and found out that in Switzerland (where the software was used) there are only around 50-60 people carrying the name "Tester". Second, the investigation revealed the system really did check for occurrences of the term "test"; luckily not in the last name but in a different field. Doozy!

References:

Further similar reading:

Monday, April 26, 2021

Esteem or Truth


At the drop of a hat, they decided to stay on the hazardous highway.

Sunday, March 14, 2021

Architecture Review

"Our architects recently studied all your ill-favored coding styles, but the good news are...we don't need any obfuscator tools anymore!"

Wednesday, January 27, 2021

Interpreting the Regression Test Activity Graph

We just completed a 1.5-week intensive manual regression test phase in which we executed almost the complete set of all (several hundred) test cases. We are in a lucky situation: our documented test cases represent nearly 100% of all implemented features. If we achieve 70-80% test coverage, we get a really good picture of the overall quality of the product increment. That means, aside from the many automated tests, it's worth doing some manual end-to-end regression testing from time to time.

While tracking the regression testing progress in a cloud-based test case management tool, we were looking at the activity graph and it made us smile. It's exactly what we expected.


 
At the beginning, testers focus on executing those test cases that are well documented, have clear instructions and rely on previously well-prepared test data. I mean objects that are already in specific states, so testers can execute just the transition from one state to the next and not worry about the laborious setup steps.
 
Then, testers switch to more complex test cases which take a little more time to understand and test. This is when the progress curve reaches its peak and progress starts to slow down.
 
Of course, we also find anomalies. Bugs can slow you down, because analyzing and understanding where and when defects were introduced takes additional time. After a few days, the first bugfixes are delivered, too. Developers require your attention to test their fixes, which interrupts testers working on their suite. The rate of passed tests decreases, but still in a constant and expected way.
 
In parallel, developers are already working on the next generation of the product, meaning their user stories get shipped and require testing too. The tester's brain is now busy with a lot of context switching; clearly more than at the beginning of the sprint.
 
Now that we are more than halfway through, we switch to the monster test cases. I call them that because they do not consist of simple steps; they contain several tests expressed in tables of inputs and expected outputs. That's why I think it's nonsense to talk about the number of test cases. One test case can be atomic and executed in seconds, yet another can keep you busy for half an hour or more.

Some of the test cases may be poorly documented and require maintenance or correction. Some require the help of a domain expert. The additional information gained should be documented in the test suite, so we don't have to ask the same questions next time. These are all activities running in parallel.

Last but not least, the weekend is getting closer. The first enthusiasm is gone and you're starting to get bored.
You hear music from your neighbour. The cafeteria gets louder. The sound of clinking glasses reaches your desk. It's time for a break, time to reboot your brain. TGIF! And now it's weekend time!
 
And then, Monday is back! It's time for another final boost and time to say thank you. Great progress.
 
We made it Yogi! 
 
....and I like that graph.
 
 
 

 


Monday, January 11, 2021

About Bias and Critical Thinking

 

I recently stumbled over a small article about the "Semmelweis reflex". It was published in a Swiss magazine and I found it quite interesting. I translated and summarized it, because I found some analogy to software testing:

In 1846, the Hungarian gynecologist Ignaz Semmelweis noticed that the rate of mothers dying in one maternity department was 4 percent, while in another department within the same hospital the rate was 10 percent.

While analyzing the reason for this, he found that the department with the higher death rate was mainly staffed by doctors who were also involved in performing post-mortem examinations. Right after the autopsies, they went to help mothers without properly disinfecting their hands.

The other department with the lower death rate was run mainly by midwives who were not involved in any such autopsies. Based on this observation, Semmelweis advised all employees to disinfect their hands with chlorinated lime. The result: the death rate decreased remarkably.

Despite the clear evidence, the employees' reaction remained sceptical, and some were even hostile. Traditional beliefs were stronger than empiricism.

Even though this was more than 150 years ago, people haven't changed much since then. We still bias ourselves a lot. The tendency to reject new arguments that do not match our current beliefs is still common today. It is known as the "Semmelweis reflex". We all have our own experience and convictions. This is all fine, but it is important to understand that these are personal convictions that must not be mistaken for a general truth.

How can we fight such bias? Be curious. If you spontaneously react with antipathy to something new, force yourself to find pieces in the presenter's arguments that could still be interesting, despite your overall disbelief.

Second, make it a common practice to question yourself by saying "I might be wrong". This attitude helps you overcome prejudice by allowing new information to be considered and processed.

Sources: text translated to English and summarized from the original article "Warum wir die Wahrheit nicht hören wollen" ("Why we don't want to hear the truth") by Krogerus & Tschäppeler, Magazin, Switzerland, March 20, 2021.

Back to testing:
What I take from this article is that, in case we didn't do it before, we should start questioning ourselves more often. We should learn to listen and not hide behind dismissive arguments simply because what we are told doesn't match our current view of the world.

But this story has two sides. If I can be biased, then others may be biased, too. Not all information reaching my desk must be right. The presenter's view of the world may be equally limited.

My personal credo therefore is NOT ONLY to question myself, but also to question statements made by others, even if they confirm my current view of the world and even if they sound perfectly reasonable.
The other person may be trying to achieve something without having to answer critical questions, or may want to avoid being challenged into discussions that would uncover their own "sloppy" analysis.

Call me a doubting Thomas. I don't believe anything until I've seen it.

  • If someone tells you "a user will never do that", question it.
  • If someone tells you "users typically do it this way", question it.
  • If someone tells you "this can't happen", question it.
  • If someone tells you "we didn't change anything", question it.
  • ...

I trust in facts only. This is the result of 20 years of testing software.

Come on, I don't even trust my own piece of code. I must be some kind of maniac =;O)

Tuesday, January 5, 2021

No test is the same, use your brain

 About a decade ago, a manager wondered why I asked questions about how the software-under-test was supposed to work. He responded: "You don't need to know all that stuff, all you need to do is test".

This happened so many years back that I can't remember whether he really meant it seriously or was just making fun.

I mention this episode here because, even if you are looking at software from a black-box or business-oriented perspective, you should still be interested in how things work from a technical point of view.

If you don't know what's going on under the hood, you are going to miss important test cases. The more you know, the more targeted and effective your tests become.

James Bach (Satisfice) once provided a great example in one of his talks when he asked the audience "how many test cases do we need?" while showing a diagram that consisted of a few simple rectangles and diamonds. Various numbers were shouted towards the speaker's desk, most of them wrong or inappropriate. When James Bach revealed the details behind this simplified diagram, the scales fell from the audience's eyes. It became clear that there was much more behind this graph: people had guessed numbers like 2, 3 or 5 tests, but that was just the start, and the real number of required tests would be many times higher than their first guess.

But I also have my own story to share with you to demonstrate why it is important to ask more questions about the software-under-test.

A quick intro
Every passenger vehicle is registered using a 17-character vehicle identification number (VIN) which is unique worldwide. The first 3 characters identify the manufacturer (a small validation sketch follows the examples below).

For example:

  • WVWZZZ6NZTY099964 representing a 1994 VW Polo
  • WF0WXXGBBW6268343 representing a 2006 Ford Mondeo V6
  • WBAPX71050C101516 representing a 2007 BMW 5 Series
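As promised, here is a rough VIN sanity check as a minimal Python sketch. The 17-character length, the exclusion of the letters I, O and Q, and the first three characters being the world manufacturer identifier (WMI) are standard VIN rules; the function name is made up, and the check-digit rules are deliberately left out.

    import re

    # 17 characters, letters and digits, but never I, O or Q.
    VIN_PATTERN = re.compile(r"^[A-HJ-NPR-Z0-9]{17}$")

    def split_vin(vin: str):
        """Return (wmi, rest) or raise ValueError for an obviously invalid VIN."""
        if not VIN_PATTERN.match(vin):
            raise ValueError(f"not a valid VIN: {vin!r}")
        return vin[:3], vin[3:]

    print(split_vin("WVWZZZ6NZTY099964"))  # ('WVW', 'ZZZ6NZTY099964') -> a VW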

In order to retrieve the details of each car's equipment, such as paint color, 2-door or 4-door, tinted glass, sunroof, etc., we sent a VIN as input and received the complete vehicle information as XML output.

This service was an important part of a cloud-based solution we developed and sold as part of a bigger offering to insurance companies and body shops. The customers used it to estimate the total repair cost of damaged cars. Frankly speaking, to perform an accurate estimation one needed at least the part prices and the repair steps or workload. Our company was great in this business, since it had all that data from almost all car manufacturers. I have to admit, this is a simplified explanation of a more complex piece of software.

Back to testing
Our job was to make sure the vehicle identification service returned the correct car-related data. To test that, we took our own and our friends' cars as a reference, because that was the easiest way for us to verify that the data made any sense.

What we didn't know yet: our company wasn't the owner of all the data. Depending on the manufacturer of the queried vehicle, our system either retrieved data from our own database or had to call a third-party web service to request the required information. This was the case for BMW and PSA (Peugeot, Citroën, Opel, Vauxhall), just to name two examples.

For the end-user, this wasn't visible. All the user saw was a 17-character VIN field with a button to trigger the search, and then they waited until we returned the complete vehicle information along with the equipment data.

What does that mean for the development of test cases?

Knowing that the system, in some cases, communicates with third-party components to get the information is essential when testing this service. Simply submitting one VIN to see that the service responds correctly is not enough.
We needed to know which data we own and which data is gathered by calling external services. Unfortunately, we never managed to get a comprehensive list to shed a little light on the inner workings of the software. Therefore, the only weapon for us testers was to test the service using a broad variety of exemplary VINs covering as many different manufacturers as possible.

But this is just half the story. When a request was being fired, our system cached the generated XML response for an unspecified period of time. 

That means, if you submit a particular BMW VIN for the first time and later call that service again using the exact same VIN, the second call will no longer trigger the same code paths.

Also, we would need to know for how long the data is cached. Unfortunately, the owners of the product didn't reveal that secret to the testers. I could now start a new article on testability and rave about the necessity of extra functions that let testers flush the cache when needed, but this goes beyond the scope of this article.
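Here is a minimal sketch of the "same VIN twice" check, under the assumption that there is some client function to call; identify_vehicle() is a made-up placeholder for the real service call, and without a cache-flush hook this is about as far as a tester can go:

    def identify_vehicle(vin: str) -> str:
        """Placeholder for the real identification-service call; here it just
        returns a fixed XML snippet so the example runs."""
        return "<vehicle vin='%s'/>" % vin

    def test_repeated_vin_lookup_is_consistent():
        vin = "WBAPX71050C101516"       # the BMW example from above

        first = identify_vehicle(vin)   # likely triggers the external call
        second = identify_vehicle(vin)  # likely served from the cache

        # Whichever path the data took, the user must see the same vehicle.
        assert first == second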

What else?

Having now two different kinds of tests such as...

  • consider different brand names when testing VINs
  • consider submitting the same VIN a second time to test the cache

...we should start thinking about what happens when one of the external services is unavailable.
How does our system react in such a scenario, especially if the response isn't cached yet?
How would we even know that it's the external service and not our own code that is broken?

One approach was to add extra tests examining the external services in isolation from our own implementation.
Yet another approach, and this is what we did (because we were not given the details), was to have sets of different VINs that belong to the same manufacturer: for example, a set of 5 BMWs, another set of 5 Peugeots, 5 Fords, etc.

If an external service is down, then a whole set of such test cases fails. If instead only one test within a set fails, then the root cause is probably somewhere else.
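A minimal, hypothetical sketch of that "one set per manufacturer" idea using pytest parameterization; lookup_vehicle() again stands in for the real client, and the VINs are the three examples from above (in reality each set contained around five):

    import pytest

    VIN_SETS = {
        "VW": ["WVWZZZ6NZTY099964"],
        "Ford": ["WF0WXXGBBW6268343"],
        "BMW": ["WBAPX71050C101516"],  # served via an external third-party service
    }

    def lookup_vehicle(vin: str) -> dict:
        """Placeholder for the real identification-service client; here it only
        maps the WMI prefix so the example runs."""
        wmi_map = {"WVW": "VW", "WF0": "Ford", "WBA": "BMW"}
        return {"manufacturer": wmi_map.get(vin[:3], "unknown")}

    @pytest.mark.parametrize(
        "manufacturer,vin",
        [(m, v) for m, vins in VIN_SETS.items() for v in vins],
    )
    def test_vin_lookup_returns_expected_manufacturer(manufacturer, vin):
        result = lookup_vehicle(vin)
        assert result["manufacturer"] == manufacturer

    # If every VIN of one manufacturer fails at once, suspect the external
    # service; if only a single VIN fails, look for the root cause elsewhere.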

Visualized
The circles below demonstrate this visually. At the beginning, one knows little to nothing about the system or component under test (1). By questioning how things work, one detects new areas that may or may not be relevant while developing new test cases (2). Knowing and understanding these areas enlarges the circle of awareness and minimizes the risk of skipping important test cases.

Closing remarks
Developing test cases is not a monotonous job. Quite the contrary: it requires asking questions, sometimes critical questions, and having a good understanding of which test scenarios are meaningful to cover and which can be skipped without accepting high risks. Sometimes you need to be pushy when it comes to interviewing techies to get the details, especially when asking for testability hooks.
Although testing activities face a lot of repeating patterns of software defects, developing appropriate test cases is a task that involves a lot of thinking. That's why I say: no test is the same, use your brain!

BTW, I forgot to add... not all external service providers allowed us to cache their data, even if it was cached for just one day. That's an additional test to think about, and I am sure I forgot to mention many more tests we actually developed at that time.

 Below, find a simplified flowchart of the vehicle identification service.