Saturday, May 1, 2021

Apostrophe, Ampersand, Exclamation mark & Co.

Since I am in the testing business, I am "preaching" the use of special characters and umlauts wherever and whatever we test. 

Umlauts (ä,ö,ü) and chars like é,â, etc. are very common in Europe. In the US and elsewhere, you often see last names with a single quote such as O'Neill, O'Connor or, think about company names like Macy's and McDonald's.

Sometimes, we forget to include these characters in our tests and often we promptly pay the price in the form of bugs reported in the field.

But, awareness is increasing in our team. I am really delighted when I see a colleague posting notes into the group-chat yelling: 

"Hey Buddy, there were no special chars in your demo.Will it work with an exclamation mark, too?"

  I love that, but I also recommend to first test the happy case using simple inputs. There is no means to attack an application with special chars if it can't even deal with the basics. Once I've convinced myself that the happy case works, I go over to feeding the program with more interesting input.

 It's always exciting to watch how a system deals with a combination of chars like
"{äöü},[áàâ]/|éèê|\ë!:;+(&)'%@". But, if such a test fails, developers and/or business analysts will likely give you the hairy eyeball when they see a defect documented with such example input. They probably close the issue unfixed, assuming such inputs are unrealistic and won't be seen in production.

  What has worked better for me: Use these characters in a more meaningful context so it becomes obvious to all stakeholders that the problem you spotted is worth getting fixed.

For example, I've made it a habit to use M&M's Worldstore Inc. whenever I use or create a company address. That's not because I love chocolate so much but rather because it has a mix of special characters that all caused headaches in my past and current career as a software tester: the ampersand, the apostrophe and the dot.

Same approach I apply for street-names and ZIP codes. It is a common misbelief that ZIP codes need to be numeric. Go visit the UK and check how their ZIP codes look alike. Street names can contain slashes, dashes and single quotes. Street numbers can be alphanumeric separated by a slash.

A blank in a name isn't exotic. Look at Leonardo Di Caprio or Carla Del Ponte. I've seen programs that cut away the second part, leaving the famous actor with the short name "Di" or "Del".

Below find a few examples of names I use regularly in testing:

  • M&M's Worldstore
  • Mr. & Mrs. O'Leary
  • Léonardo Di Caprio 
  • Renée-Louise Silberschneider-Kärcher
  • Praxisgemeinschaft D'Amico & Ägerter GmbH

Example street names:

  • 29, Queen's Gardens
  • 4711 Crandon Blvd., Appartment F#1000
  • Rue du Général-Dufour 100-102
  • 55b, Rue de l'Université 
  • Elftausendjungferngässlein 12b

and ZIP-Codes:

  • 4142 Münchenstein 1 (Switzerland)
  • 33149 Key Biscayne, FL (USA)
  • EH4 2DA Edinburgh (Scotland)

   Special characters are interesting wherever you can submit text.

If you post a message like "meet you at Lexington Av/59 St" and, if that text is stored/exported in an XML file without properly escaping the slash or embedding the input in a CDATA tag, you may find interesting bugs.

Backslashes are used in many programs to escape characters. Quotes and single quotes are used to terminate text elements. The latter is a common way for hackers to manipulate SQL statements so the command fired forces the back-end to reveal information not intended for the normal user. Semicolons can confuse web client logic or cause problems when information is exported to CSV files. Question marks and ampersands are both used in URL links, curly braces are common markers in REST payloads, etc. 

I've never really understood what's so funny using the pangram the quick brown fox jumps over the lazy dog in testing or the filler "Lore Ipsum". Both do not contain any interesting characters. When testing German language based programs, I prefer to use  Victor jagt zwölf Boxkämpfer quer über den großen Sylter Deich because it contains umlauts and this weird ß character which - in Switzerland - is unknown, even in the German speaking part.

Most often, I use my own pangram which I have further adapted over time:

Chloë Valérie O'Loughlin-Bäcker runs the "Deacon Brodie" (Scottish Pub & Largest Whisky Collection) at Saint-Louis du Ha!Ha!, a town in Québec Canada. Her car is a Jaguar X and runs a max. of > 200 km/h. Isn't this amazingly fast?

It contains all letters of the alphabet, plus umlauts and a selection of interesting characters that challenge text processors. But, it has not helped me later find a bug in our system where an exception was thrown when the annotation text contained a capital Ü. Other Umlauts were no problem and even the lowercase ü didn’t cause any harm. The system really failed only with the capital version of it. Crazy!

One of my most funny treasures is the Canadian town St. Louis du Ha! Ha! This is the weirdest town-name I've ever heard of, because it has two exclamation marks. Okay, the likelihood of such a city ever being entered by our customers is close to zero. Canadian citizens living near the area of Quebec probably disagree. Testing and test input is a question of context. Westward Ho! is a town in the UK which also houses an exclamation mark.


 

Null, True, All, Test and other funny bugs related to people's names

Besides the recommendation of using special characters, it is also worth to sneak a peek at reported stories related to people's names like Jennifer Null and Rachel True. Their names were processed and misinterpreted as NULL [BBC16] or - in the case of Rachel - a boolean value [9TO5] causing a problem using her iCloud account. I've not experienced either of the two cases myself in any of my tests but we found a similar issue where submitting the search term "Alli" returned documents titled "Alligator" and "Allister". All fine, but submitting "All" ended in an exception. 

Stephen O, a Korean native living in the US had been hassled by credit card companies, because his last name was too short. When he applied the workaround by adding an extra "O" to end up with "OO", it didn't take long until he had a meeting with the government because he provided false information [NYT91].
 
Graham-Cumming could not register his name on a web-site because the hyphen was considered an invalid character [GRA10].

There is also Natalie Weiner who could not register either because here name was considered offensive [PAN18].

Yet another example is the story of William and Katie Test who were both unable to book airplane tickets simply because the system scanned the names for the term "test". A match triggered a different procedure and disallowed the process of booking tickets in production [COY17].

Wooha! That's exactly what one of my clients had implemented, too! I was surprised there were never any reported issues with it. So I investigated a little and found out, in Switzerland there are only around 50-60 people carrying the last name "Tester". Second, my investigation also revealed the system did NOT scan the last name but rather the first name for the term "test" to avoid operations executed in production; Doozy! How clever!

References:

Further similar reading: