Saturday, May 1, 2021

Apostrophe, Ampersand, Exclamation mark & Co.

by T. J. Zelger

Ever since I got into the testing business, I have been preaching the use of special characters and umlauts wherever and whatever we test.

Umlauts (ä, ö, ü) and special chars such as é and â are very common in Europe. In the US and elsewhere, you often see last names with a single quote such as O'Neill or O'Connor, or company names like Macy's and McDonald's.

Sometimes we forget to include these characters in our tests, and often we promptly pay the price in the form of bugs reported from the field.

But awareness is increasing in our team. I am really delighted when I see a colleague posting a note in the group chat, yelling:

"Hey Buddy, there were no special chars in your demo. Will it work with an exclamation mark, too?"

I love that, but I also recommend testing the happy case first, using simple inputs. There is no point in attacking an application with special chars if it can't even deal with the basics. Once I've convinced myself that the happy case works, I move on to feeding the program more interesting input.

It's always exciting to watch how a system deals with a combination of chars like
"{äöü},[áàâ]/|éèê|\ë!:;+(&)'%@". But if such a test fails, developers and/or business analysts will likely give you the hairy eyeball when they see a defect documented with such example input. They will probably close the issue unfixed, under the delusion that these inputs are unrealistic and won't be entered in production.

What has worked better for me is to use these characters in a more meaningful context, so that it becomes obvious to all stakeholders that the problem you spotted is worth fixing.

For example, I've made it a habit to use M&M's Worldstore Inc. whenever I use or create a company address. That's not because I love chocolate so much, but rather because it has a mix of special characters that have all caused headaches in my past and current career as a software tester: the ampersand, the apostrophe and the dot.

I apply the same approach to street names and ZIP codes. It is a common misconception that ZIP codes need to be numeric. Go visit the UK and check what their ZIP codes look like. Street names can contain slashes, dashes and single quotes. Street numbers can be alphanumeric, with parts separated by a slash.

A blank in a name isn't exotic. Look at Leonardo Di Caprio or Carla Del Ponte. I've seen programs that cut away the second part, leaving these famous people with the short last names "Di" or "Del".
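One way such a truncation can come about is code that assumes a full name consists of exactly two words. The following is only a minimal sketch of that assumption; the function names are hypothetical and not taken from any of the programs mentioned above.

```python
def naive_last_name(full_name: str) -> str:
    """Buggy on purpose: assumes 'first last' and takes the second word only."""
    return full_name.split(" ")[1]


def safer_last_name(full_name: str) -> str:
    """Keeps everything after the first word, so multi-part last names survive."""
    _first, _separator, rest = full_name.strip().partition(" ")
    return rest


for name in ("Leonardo Di Caprio", "Carla Del Ponte"):
    print(name, "->", repr(naive_last_name(name)), "vs", repr(safer_last_name(name)))
```

Even the second variant is only a heuristic, since it breaks for people with several first names. Storing first and last name in separate fields avoids the guessing altogether.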

Below are a few examples of names I use regularly in testing:

  • M&M's Worldstore
  • Mr. & Mrs. O'Leary
  • Léonardo Di Caprio 
  • Renée-Louise Silberschneider-Kärcher
  • Praxisgemeinschaft D'Amico & Ägerter GmbH

Example street names:

  • 29, Queen's Gardens
  • 4711 Crandon Blvd., Apartment F#1000
  • Rue du Général-Dufour 100-102
  • 55b, Rue de l'Université 
  • Elftausendjungferngässlein 12b

Example ZIP codes:

  • 4142 Münchenstein 1 (Switzerland)
  • 33149 Key Biscayne, FL (USA)
  • EH4 2DA Edinburgh (Scotland)
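To see why the "ZIP codes are numeric" assumption hurts, here is a small sketch that runs the three codes above through a digits-only check. Both functions are hypothetical, and the lenient rule (2 to 10 letters, digits, spaces or dashes) is just an assumption for illustration, not an official postal standard.

```python
def naive_is_valid_zip(zip_code: str) -> bool:
    """Buggy on purpose: assumes every postal code consists of digits only."""
    return zip_code.isdigit()


def lenient_is_valid_zip(zip_code: str) -> bool:
    """Hypothetical lenient rule: 2 to 10 letters, digits, spaces or dashes."""
    stripped = zip_code.strip()
    return 2 <= len(stripped) <= 10 and all(
        ch.isalnum() or ch in " -" for ch in stripped
    )


for code in ("4142", "33149", "EH4 2DA"):
    print(code, naive_is_valid_zip(code), lenient_is_valid_zip(code))
```

The digits-only check accepts the Swiss and the US code but rejects the one from Edinburgh.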

Special characters are interesting wherever you can submit text.

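If you post a message like "meet you at Lexington Av/59 St" and that text is stored or exported in an XML file by code that glues strings together, without escaping the input upfront or embedding the whole input in a CDATA section, you may have found an interesting bug. A proper XML writer leaves the slash alone; the characters that must be escaped in XML text are the ampersand and the less-than sign.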
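A minimal sketch of such an export bug, using only the Python standard library. The message text extends the example with an ampersand and angle brackets, and the `<message>` element name is made up for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

message = "meet you at Lexington Av/59 St, bring M&M's <now>"

# Naive export: the raw '&' and '<' make the document ill-formed.
naive_xml = "<message>" + message + "</message>"
try:
    ET.fromstring(naive_xml)
    naive_ok = True
except ET.ParseError:
    naive_ok = False

# Escaping first turns '&', '<' and '>' into '&amp;', '&lt;' and '&gt;'.
escaped_xml = "<message>" + escape(message) + "</message>"
round_trip = ET.fromstring(escaped_xml).text

print("naive export parses:", naive_ok)
print("escaped export round-trips:", round_trip == message)
```

Building the document with an XML library instead of string concatenation makes the escaping automatic.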

Backslashes are used in many programs to escape characters, while double and single quotes are used to terminate text elements. The latter are a common tool for hackers trying to manipulate an executed SQL statement, for example by attaching an additional OR condition. The goal is to make the program return information for which the user is not authorized or for which the query wasn't designed.
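The following sketch shows both effects with an in-memory SQLite database: a harmless apostrophe that breaks a concatenated query, and a crafted input that appends an OR condition. The table, its columns and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (last_name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("O'Neill", "alpha"), ("Miller", "beta")],
)


def find_unsafe(last_name):
    # Vulnerable on purpose: user input is pasted into the SQL text.
    sql = "SELECT secret FROM customers WHERE last_name = '" + last_name + "'"
    return conn.execute(sql).fetchall()


def find_safe(last_name):
    # Parameterized query: the value travels separately from the SQL text.
    sql = "SELECT secret FROM customers WHERE last_name = ?"
    return conn.execute(sql, (last_name,)).fetchall()


# A perfectly legitimate last name already breaks the unsafe variant ...
try:
    find_unsafe("O'Neill")
    apostrophe_error = False
except sqlite3.OperationalError:
    apostrophe_error = True

# ... and a crafted input with an extra OR condition returns every row.
leaked = find_unsafe("x' OR '1'='1")

print("apostrophe breaks unsafe query:", apostrophe_error)
print("rows leaked by injection:", len(leaked))
print("safe lookup:", find_safe("O'Neill"))
```

With the parameterized variant, O'Neill is found and the crafted input simply matches nothing.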

Semicolons can confuse web client logic or cause problems when information is exported to CSV files. 
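A short sketch of the CSV case, assuming a semicolon-delimited export as is common in Europe. The row is made-up test data, with a semicolon hidden in the address field.

```python
import csv
import io

row = ["M&M's Worldstore Inc.", "29, Queen's Gardens; rear entrance", "EH4 2DA"]

# Naive export: the semicolon inside the address silently creates a fourth column.
naive_line = ";".join(row)
naive_fields = naive_line.split(";")

# csv.writer quotes any field that contains the delimiter.
buffer = io.StringIO()
csv.writer(buffer, delimiter=";").writerow(row)
parsed = next(csv.reader(io.StringIO(buffer.getvalue()), delimiter=";"))

print("columns after naive export:", len(naive_fields))
print("csv module round-trips:", parsed == row)
```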

Question marks and ampersands are both used in URLs, curly braces are common markers in REST payloads, etc.
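As a sketch of what can go wrong, the snippet below puts the company name from earlier into a query string, once by plain concatenation and once with `urlencode`. The host `example.test` and the parameter names are placeholders. The last lines show that a JSON library treats quotes, braces and backslashes inside a string as plain data.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

company = "M&M's Worldstore Inc."

# Naive URL building: the '&' in the name is read as a parameter separator.
naive_url = "https://example.test/search?company=" + company + "&page=1"
naive_params = parse_qs(urlsplit(naive_url).query)

# urlencode percent-encodes reserved characters in the values.
safe_url = "https://example.test/search?" + urlencode({"company": company, "page": 1})
safe_params = parse_qs(urlsplit(safe_url).query)

# json.dumps escapes quotes and backslashes; braces inside a string stay data.
note = 'He said "hi" {really} \\o/'
payload = json.dumps({"note": note})

print(naive_params["company"])
print(safe_params["company"])
print(json.loads(payload)["note"] == note)
```

In the naive variant the server only ever sees the company name "M".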

I've never really understood what's so great about using the pangram "The quick brown fox jumps over the lazy dog" in testing, or the filler text "Lorem ipsum". Neither contains any interesting characters. When testing German-language programs, I prefer to use "Victor jagt zwölf Boxkämpfer quer über den großen Sylter Deich" because it contains umlauts and this weird ß character we don't have in the German-speaking part of Switzerland. On Wikipedia, you will find a lot more interesting pangrams.

Most often, I use my own extended version of the brown fox pangram that I have adapted to my personal needs:

Chloë Valérie O'Loughlin-Bäcker runs the "Deacon Brodie" (Scottish Pub & Largest Whisky Collection) at Saint-Louis du Ha!Ha!, a town in Québec Canada. Her car is a Jaguar X and runs a max. of > 200 km/h. Isn't this amazingly fast?

It contains all letters of the alphabet, some umlauts and a selection of interesting characters to challenge text processors.
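A quick way to double-check such a sentence is a few lines of Python. This is just a sketch with made-up helper names; it reports which of the letters a to z are missing and which characters go beyond plain letters, digits and spaces.

```python
import string

SENTENCE = (
    'Chloë Valérie O\'Loughlin-Bäcker runs the "Deacon Brodie" '
    "(Scottish Pub & Largest Whisky Collection) at Saint-Louis du Ha!Ha!, "
    "a town in Québec Canada. Her car is a Jaguar X and runs a max. of "
    "> 200 km/h. Isn't this amazingly fast?"
)


def missing_letters(text):
    """Return the letters a-z that do not occur in the text."""
    return sorted(set(string.ascii_lowercase) - set(text.lower()))


def special_characters(text):
    """Return the distinct characters that are not ASCII letters, digits or spaces."""
    plain = set(string.ascii_letters + string.digits + " ")
    return sorted(set(text) - plain)


print("missing letters:", missing_letters(SENTENCE))
print("special characters:", "".join(special_characters(SENTENCE)))
```

Run against "the quick brown fox jumps over the lazy dog", the second function returns an empty list, which is exactly the problem with that sentence.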

One of my funniest treasures is the Canadian town Saint-Louis-du-Ha! Ha! This is the weirdest town name I've ever heard of, because it has two exclamation marks. Okay, the likelihood of such a town ever being entered by our customers is close to zero. Canadian citizens living in that part of Québec would probably disagree. Testing and test input are definitely a question of context.

Null, True, All, Test and other funny bugs related to people's names

Besides the recommendation to use special characters in our tests, it is also worth sneaking a peek at reported stories related to people's names, like Jennifer Null and Rachel True. Their names were processed and misinterpreted as NULL [BBC16] or, in the case of Rachel, as a boolean value [9TO5], which caused a problem with her iCloud account. I've not experienced either of the two cases myself in any of my tests, but we found a similar issue: submitting the search term "Alli" returned documents titled "Alligator" and "Allister". All fine, but submitting "All" ended in an exception.
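How exactly the systems in those stories stumbled is not known to me. One plausible mechanism, shown here purely as an assumption, is an import routine that guesses a value's type from its text.

```python
def naive_coerce(value: str):
    """Buggy on purpose: guesses the type from the text, so some names vanish."""
    lowered = value.lower()
    if lowered == "null":
        return None
    if lowered in ("true", "false"):
        return lowered == "true"
    return value


def keep_as_text(value: str) -> str:
    """A name field is always text, so there is nothing to guess."""
    return value


for last_name in ("Null", "True", "O'Neill"):
    print(last_name, "->", repr(naive_coerce(last_name)), "vs", repr(keep_as_text(last_name)))
```

Names such as Null, True and False make cheap additions to any list of test persons.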

Stephen O, a Korean native living in the US, had been hassled by credit card companies because his last name was too short. When he applied a workaround and added an extra "O" to end up with "OO", it didn't take long until he had a meeting with the government because he had provided false information [NYT91].

A similar story was mentioned in a column by Rod Ackerman in the series "Made in USA", regularly published in the Swiss newspaper Basler Zeitung. A US agency insisted that he provide two initials for his first names. Since he had only one, he asked the clerk to enter R.R. Ackerman to work around the problem. Rod, too, got into legal trouble, for misrepresentation [BAZ].

Graham-Cumming could not register his name on a website because the hyphen was considered an invalid character [GRA10].

There is also Natalie Weiner, who could not register either because her name was considered offensive [PAN18].

Yet another example is the story of William and Katie Test, who were both unable to book airplane tickets simply because the system scanned the names for the term "test". A match triggered a security procedure and blocked the booking in production [COY17].

Wooha! That's exactly what one of my clients had implemented, too! I was surprised there were never any reported issues with it. So I investigated a little and found two reasons. First, in Switzerland there are only around 50-60 people carrying the last name "Tester". Second, the system did NOT scan the last name but the first name for the term "test". Doozy! How clever!
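A name scan of this kind can be sketched in a few lines, together with one possible alternative. Everything here is hypothetical, including the `is_test_data` flag. It is not how the airline or my client implemented it.

```python
from dataclasses import dataclass


@dataclass
class Booking:
    first_name: str
    last_name: str
    is_test_data: bool = False  # hypothetical explicit marker set by test tooling


def naive_is_blocked(booking: Booking) -> bool:
    """Buggy on purpose: treats any name containing 'test' as test data."""
    full_name = f"{booking.first_name} {booking.last_name}".lower()
    return "test" in full_name


def flag_based_is_blocked(booking: Booking) -> bool:
    """Relies on an explicit marker instead of guessing from the name."""
    return booking.is_test_data


for booking in (Booking("Katie", "Test"), Booking("Hans", "Tester"), Booking("Ann", "Miller")):
    print(booking.last_name, naive_is_blocked(booking), flag_based_is_blocked(booking))
```

The name scan blocks real customers called Test or Tester, while the explicit flag only blocks what the test tooling has marked.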

