Wednesday, December 16, 2009

Eat it

"You are damned if you do, and you are damned if you don't" (Bart Simpson).
That phrase fits software testers perfectly. It doesn't matter how many great and cool bugs you find. If you miss one, you will find yourself in a defensive mode where you have to explain to managers why you missed it. I have rarely seen anyone asking a developer why he introduced that bug and why it took so long to fix it.

Under pressure, we accept that developers make mistakes and create bugs. Surprisingly, we roast testers if they miss any, regardless of the circumstances. We do not even ask under which conditions they had to test. Did they have the right information at the right time? Were they given any chance to test at all, or was the new version deployed without asking a tester to look at it?


Funnily enough, I have several times seen people celebrate developers for fixing bugs they had introduced themselves. This scenario is especially common if the bug had been a nuisance for a long time. When someone finally fixes the ugly bug, people leap for joy. They are grateful for simple things that should have worked correctly from the very beginning.

Friday, December 11, 2009

Don't Feed Your Plants

Once a month, the gardeners come into our offices and take care of the tortured plants.
As we all know, plants in IT departments put up with a lot: they either suffer from thirst or nearly drown, and they serve as a trash can for many weird leftovers.

That is no different in our offices. Programmers and engineers drink so much coffee, especially when it is free, and they like to share it with their plants...until it gets detected...

Sunday, November 8, 2009

Houston, we have a problem

Engineering was about to execute a database migration script, and we had tested it several times before on a separate environment. Now the weekend had come when we shut down the production server to do our jobs.
My job was to make sure that everything worked fine after the migration. There was only one problem. On that weekend, my VPN connection did not work. It was essential for me because without a connection it wouldn't be possible to test whether the migration was successful or not. When I raised the call, I got a message saying that there is no support on the weekend for internal IT incidents...
...Fortunately, I could test without the VPN, but I could not take part in the decisions and coordination of the process because I could no longer connect to my business email. Although I informed all involved parties about the issue and asked them to send all related information to my private email rather than my business email, I only got half of the messages. It was a small challenge to get all the required information, but I still managed to do the job. Still, I asked myself what those guys would do if someone decided to launch a denial-of-service attack on the weekend.
But I have to be grateful. What surprised me most was that immediately on Monday, a nice guy from IT came to my desk and showed me how I could access our business email from home without the need for VPN....

Tuesday, November 3, 2009

Declining Confidence

Wishful thinking...really? Let's assume someone has a brain fart and suggests converting all bugs into tasks, only to get rid of the bug backlog. Gone are all the bugs in a second. All of a sudden you produce bug-free software. What a great achievement. Believe it or not, this really happened.

Thursday, October 22, 2009

Good Bye

In loving memory of a very good friend of mine and my family, who passed away all of a sudden, leaving an elusive new situation for his wife, his two children and a large number of good friends.

T.

Tuesday, October 13, 2009

Imagine there is a Pandemic

If you followed my blog entry of August 3, titled "What does Swine Flu have in common with Testing?", you may remember the number of deaths the government was expecting...

...until today, there was not a single one.

Monday, October 12, 2009

Someone else's Bug

Most of the really "cool" bugs that I find usually don't show up during automated testing. They also don't show up during the execution of manual regression testing. Of course, both approaches are successful within their well-known restrictions. But the in-depth analysis of an already reported, hard-to-reproduce bug often "generates" new findings of anomalies that I didn't expect in that place.

To give an example, my goal was to find the root cause of a particular error more and more customers ran into. It became a high priority when suddenly 100 customers were affected by it.

When analyzing real customer defects, my mind works differently than during normal testing activities. I now think differently and ask questions like "Which scenarios are candidates for driving the application into the reported behavior?"

While the functional testing techniques worked well without a customer in mind, the new approach puts the customer in the foreground. I now wear a different hat. In order to understand what happened, I need to know how a customer uses the system in an end-to-end environment.

The first approach is usually to check the logs that help-desk provided and then hope to find the cause fast. The help-desk usually provides a log of one error at a specific point in time only. Sometimes you need more information, for instance, how did the user get that object he was unsuccessfully working on, who sent the object originally to him, who accepted it when and who forwarded the object at what time, who added data to the object, etc.

The log from the support gives you the necessary base information. Now I dig deeper into it to find more information about the object's "life cycle". Besides the client application log and besides the web server log files we have something like an Event-Log attached to each object.

The Event-Log provides information about which action has been executed by which user and in which state the object ended up after this operation. That sounds great, but I had always had this strange feeling that something was wrong with this log. There were too many inconsistencies, and I feared I would lose time digging into this other area.

This time I had no choice. I had to understand each action and each state the object ended up in, so I could reproduce the same scenario in our test environment. And once again, I couldn't quite follow the weird order of the log entries, and the time stamps didn't really make sense to me either. It was now time to get into this new area of potential defects. I started to create an object from scratch. I created it and then immediately checked the event log and noted down which actions and states the event log produced. I appended data to the object, sent it to someone else, pushed it through several web services, and for each action took a screenshot of the event log. Step by step, I tried out variations with the sole goal of understanding the patterns of the logs.

What I found was astonishing. My original assumptions were now confirmed. These entries were incorrect. To make it short, these entries didn't tell you the truth. When I asked our support people whether they had ever noticed the clutter in the event log sequence and time stamps, they confirmed that they didn't like to read the logs for the same reason I didn't look at them. They just could not follow these entries. I then talked to the architect, who, to my surprise, confirmed that there were some known issues with the event log. Strengthened by this experience, I raised a bunch of new defects related to this Event Log. This was just one example. The truth is that I usually run into more such "side" effects.

And now what often happens while I am testing had happened again: I drifted away. I was out to find the root cause of a completely different defect and found a bunch of other interesting new anomalies which had nothing to do with the original defect.

BTW, what was the problem that I originally set out to test?




Tuesday, September 1, 2009

Password Expired


If you ask me where I got the inspiration for this cartoon, then I must admit that it was quite a long detour to this scene. Actually, I can't point to one particular incident. How about the diplomatic affair between Libya and the Swiss Government, where our Federal President is just about to learn some basic lessons?

Or what about that nice instruction from IT that strictly forbids us to put any passwords on paper, while IT itself hands out PIN codes to users on paper?

Another example is the story one employee told me after he saw the cartoon. He claimed to have access to a special server room which is important in case of emergency. When he came back from holiday, his access had been disabled. He spent an hour investigating why he could no longer access the room, until he found out that IT had disabled all accounts whose owners they could no longer identify. According to him, it wasn't the first time. Luckily, until now we haven't had any emergency where that access was needed immediately.

 
Coincidence?
I was paralyzed for a moment when, five years after my cartoon was published here (2009), I browsed the pages of a Swiss news magazine today (Feb 12, 2015) and found this cartoon, which is an exact copy of the punchline in my cartoon from 2009. Coincidence or not? I will never know, that's for sure.

Wednesday, August 12, 2009

Functions As Designed

Many customers have different requirements or requirements that change backwards and forwards over time, especially in an early stage of the project. The developer's weapon against such ping-pong requirement changes is:

make it all configurable

Whatever the customer sets is what he gets, even if the settings don't make any sense in their combination.

It is awkward if existing behavior changes without the parties concerned being notified.
Testers raise defects although the symptoms are the result of new features that were implemented and activated by default rather than actual defects.

We have an unwritten law that guarantees customers don't need to adapt their configuration files just to ensure existing functionality doesn't get changed due to some (unwanted) new features or restrictions in the use of some old existing features.

It is not only unpleasant for the customer; our test scripts won't work either. Well, that's the good thing about test automation. We wouldn't have detected those anomalies at the same speed if we had done these tests manually. However, we still struggle to react directly to the findings. There are still some roadblocks to master.

Monday, August 3, 2009

What does Swine Flu have in common with Testing?

The expected worst-case scenario in Switzerland is 2 million cases of Swine Flu infection (one third of the population) and about 5000 deaths. Is it really that alarming or are these messages alone already the disease?

What happens if the Swiss Health officials (BAG) warn us and nothing happens (as already seen with the bird-flu)?

What happens if they don't warn us and we're all left in a big disaster?

My work as a tester sometimes involves similar decisions (in a much smaller and less important world, of course) when thinking about which priority / severity to set for a defect. If I continue to raise many defects with a high priority status, developers soon stop paying attention, especially if a majority of the issues turn out to be far less important than initially rated. For the flu, if that disease is not going to come, people are going to pay even less attention to such warnings.

If I raise a defect and give it a low priority, developers will probably never look at it, because they have too many other important defects to work on.

I might have found an important security issue but not noticed its relevance. Developers may not notice its relevance either because I gave it a low priority. It could be that someone else finds the issue and hacks the system. The consequences can be tough for the company. For the flu, if you don't warn people about a pandemic, the responsible officials get all the blame for failing to inform.

As in most situations, what's left is a decision based on common sense. But I think it is also an egoistic one. Do I really want to raise all sorts of low-rated defects? I will have to re-test all of them later, at a moment when developers have finally worked off their stack of high priority issues and get around to fixing the low priority defects. That happens at a time when I am in the middle of testing all the other high priority issues, and now I am the one who ignores all those low-rated issues, maybe for several months, because there are plenty of other high priority fixes to re-test. You may get an email from an administrator who closed all the low-rated issues because no one seemed to care about them, and therefore he made the right decision. But what did you raise the defects for, then?

Severity vs. Priority
When you as a tester think you found an important issue, it could be looked at differently by the business analyst or the developer or the customer or the project leader. Setting the priority is often a team decision as all team-members may be right or wrong.

A system crash might be a severe issue and rated SEVERITY 1, because that feature is not usable. But the customer may not use this feature in the next couple of weeks. That makes it far less important, hence it gets a lower PRIORITY, at least for a while.

On the other hand, you as a tester might find a bug that you rate as not very important. The developer might notice it and understand that the issue will have severe (other) consequences if not fixed immediately.

Testers and developers may rate translation issues as not very important, but the Business Analyst may have a closer contact to the customer and know that users will not accept a system with tons of text pieces spread over dialogs that are not translated into the user's language.

Whatever you do, it is always a chain of thoughts before you make up your mind.

Thursday, July 16, 2009

The Test Successfully Failed

"The test successfully failed!" This is what my colleague sitting next to me was yelling lately. I smiled when I heard him, because a malfunctioning operation is typically anything but a reason to see happy faces here. Of course, this is different for testers, at least it is for me.

My job is done when I have broken the software I am testing. If all my tests turn green, I get suspicious. Can it really be that all those tests passed? I mean, couldn't it be that I missed some important tests, or that I didn't test thoroughly enough? The possibility that the software could simply work as designed rarely crosses my mind. I don't even trust my own test scripting. I am sure my scripts have the same kinds of bugs that developers continuously add to our software.

At this point I remember an article where I read something about "false positives and false negatives". Applied to my daily work as a test automation scripter: how can I find out quickly whether my new or modified scripts will find errors if they occur, while the feature currently works fine?

We have a lot of scripts which test the API of our service-based application, and I found a really stupid but effective way to smoke-test my own test scripts. I unplugged the network cable and started the test suite. In such a scenario, I expect my complete test suite to "successfully fail" for all tests, but it was a big surprise to see one of the test cases still get the green status icon.

I wasn't online and all my tests were supposed to submit and query information from a service that needs a network connection to get any meaningful results. How can it then be that this test passed without a network connection?

Looking closer at the validation/assertion the script was executing (thereby debugging my own code and trying to understand why this test returned a "passed" status) revealed that I was asserting on an object that ends up in the same state, no matter whether you are online or not.

For example, I was querying a list of items in a database which can be of either one or the other state. For simplicity let's take the states ACTIVE or INACTIVE.

The interface I was testing accepted 3 different values for a parameter passed to the API. Passing "0" gives me the INACTIVE items. Passing "1" gives me the ACTIVE items. Passing an empty string returns all items.

As our test environment is used by many other testers, too, the exact number of items could not be predicted unless you clean the database to start from an initial state.

That could not be done, because it would have affected too many people. Therefore I focused on testing the service by passing all 3 parameter values in sequence and then comparing the numbers of items with each other. I verified that the total number of ACTIVE items plus the total number of INACTIVE items equals the total number of all items, using the following code:

Assert.IsTrue(nNumTotal==(nNumActive+nNumInactive));

That checkpoint becomes (unexpectedly) true if I am offline, because in that case all the variables are filled with zeroes. In other words, this test actually validated that

Assert.IsTrue(0 == (0 + 0));

The fix in my script then was to add an additional Assert-Statement that tests for the number of items being greater than 0.
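A minimal sketch of that fix, written in Python rather than the language of my original script. The function name `check_item_counts` and its three arguments are made up for illustration; they stand in for the counts returned by the three API calls described above.

```python
def check_item_counts(total, active, inactive):
    """Return a list of problems; an empty list means the check passed.

    Hypothetical helper: the three arguments stand for the item counts
    returned when querying all, ACTIVE and INACTIVE items.
    """
    problems = []
    # Guard against the vacuous case: when the service is unreachable,
    # every count is 0 and 0 == 0 + 0 would pass on its own.
    if total <= 0:
        problems.append("no items returned; service may be unreachable")
    # The original checkpoint: ACTIVE plus INACTIVE must equal the total.
    if total != active + inactive:
        problems.append(
            f"total {total} != active {active} + inactive {inactive}"
        )
    return problems


# Offline run: the sum check alone would pass, the guard does not.
print(check_item_counts(0, 0, 0))
# Online run with consistent counts.
print(check_item_counts(12, 7, 5))
```

The guard assumes the shared test database is never legitimately empty. If it could be, the check would need another way to tell "no items" from "no connection".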


Sunday, May 10, 2009

Test Automation Side Effect

Lesson learnt: Run a test on environment X, collect reference data, then re-run the same test scripts and collect new data from a different environment Y. If you expect those results to be identical, good luck. They most likely are not. If you run a test and collect reference data, always run these tests on the exact same physical machine and don't compare data between different machines, unless there is a strong reason for this. There are always things that are different on these machines.

Here is the plan 

The Engineering team expected me to perform just 5 exemplary spot tests to verify whether the response is fine for some web service requests after they upgraded a software component on ACCEPTANCE. I was told I would be informed by email when the ACCEPTANCE environment was ready for an initial test. Their plan was to mirror the data from PRODUCTION to ACCEPTANCE first.

Engineering would email me when this operation was completed, and I would start collecting the reference data by running my scripts. Later, when this was done, Engineering would upgrade the system and inform me, so I could re-run the scripts, collect the new data and compare it with the previously collected reference data. At least, this is what I thought was the plan.

Developing the scripts
I prepared my test-scripts and extended those with additional checkpoints that I found useful and necessary. I prepared over 300 test cases which were firing web service requests. That was far more than the team expected me to do. I tested my scripts on a development environment and collected and stored the results. Everything seemed to work fine. I was ready for the big show on ACCEPTANCE.
I didn’t know much about the exact time line. All I knew was, it started around 1030h on a Sunday morning. I didn't know who was involved and I didn't even know what exactly they changed in this new upgraded software. It was no functional change, just something minor on a database level.

Executing the test
On this sunny Sunday morning at 1030h (my kids running around in our garden and waiting for me) I was sitting in front of my laptop, waiting for the starting shot, that is, an email telling me that the production data had been successfully mirrored to ACCEPTANCE, so I could run my scripts. Actually, I didn't get any emails. Finally, after I asked what was going on, I got a note from SYSADMIN stating that not only the data but also the new software system had already been deployed to ACCEPTANCE and that QA could now start their tests....

Wait a minute… wasn't the idea to first collect the initial reference data so we have something meaningful to compare with? Something went terribly wrong here.

Oops, we have an issue
However, I had no other choice than to simply run my scripts against an already upgraded system with no possibility to compare the results against an untouched system. What I had so far was the data collected from the development test environment from Friday when I was still developing and testing my scripts. When I compared the data I was worried because 20 tests failed and some web service requests fired by my scripts reported a remarkable performance difference between the upgraded ACCEPTANCE environment and the old untouched development environment. The upgraded system for some tests responded 7 times slower compared to DEV.

Investigation of failing tests
After some investigation I found the reason for 19 out of 20 failed tests. It was a false negative result. The reference data collected for these tests on DEV contained an empty response. I took that as a reference, but running the same scripts on a production-like environment returned something different. This explained the failing tests except for one, for which the root cause had not been identified yet. That was one out of 300, and of course, we still had this performance problem with a few tests.
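In hindsight, the comparison step could have flagged those empty baselines instead of reporting them as failures. The following is only a sketch in Python, not the script I used. The function `compare_with_reference`, the test names and the response strings are all invented for illustration.

```python
def compare_with_reference(reference, current):
    """Classify each test by comparing its current response with the reference.

    Hypothetical helper: both arguments map a test name to the response
    collected for that test.
    """
    report = {}
    for name, expected in reference.items():
        actual = current.get(name)
        if not expected:
            # An empty baseline proves nothing about the new system,
            # so flag it instead of counting it as a failed test.
            report[name] = "unusable reference"
        elif actual == expected:
            report[name] = "match"
        else:
            report[name] = "mismatch"
    return report


# Invented sample data: one good baseline, one empty one, one real difference.
reference = {"getItems": "<items>3</items>", "getUsers": "", "getState": "OK"}
current = {"getItems": "<items>3</items>", "getUsers": "<users>9</users>", "getState": "ERROR"}
print(compare_with_reference(reference, current))
```

With a report like this, the 19 tests with empty DEV responses would have shown up as a problem with the reference data, leaving only the real mismatch to investigate.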

I repeated all my tests several times to gain more assurance that I am on the right track. But for each test run executed, I got the same results. During this analysis I suddenly got another note from SYSADMIN, now telling me that also the final production system was upgraded….

WTF….! They upgraded production while I was still investigating what was wrong and before I could even provide them a report?

Consequence
After I vetoed and presented my results (finally I learnt who is my contact person), some managers got nervous, and I mean really nervous. They wanted an exact explanation of the failing test and the performance issue, but we were unable to provide such information in the required short timeframe.
As a result, the managers decided to stop this exercise and instructed Engineering to roll back production immediately, which actually helped avoid a disaster when the customer used our system the very next morning. In my opinion, this decision was wise, given that the communication and coordination of this operation was a disaster.

5 test cases - are you serious?
In the end, what confused me is this: I was instructed to do only 5 tests. Are you serious?
I developed and executed 300 and it ended up with one failing test. What would the result and decision have been if I only executed 5 tests? And what would have been the reaction of our customers?

The other vexing thing is that after the PROD and ACC systems were rolled back, no one asked me to test the rolled-back systems….

And what did I learn here?
OK, ..after spending so much energy on summarizing this exercise, I should of course spend a moment and look at myself. How could I have prevented such a situation from occurring? It probably would have helped if I had started to collect the initial reference data way before the official test started. And despite the fact that I wasn't involved in any planning/organization meetings/emails/telcos, I could still have communicated my testing approach to the people I assumed might be involved. If I had taken the chance to present my testing approach, it might have helped drive the test in a different direction.

And then it happened again
Many years later, btw, I had a similar situation in a different company where a developer asked me to test the performance of single REST requests fired against a system before and after an upgrade. The BEFORE should be tested on environment X while the AFTER should be tested on environment Y. The developer stated the machines were physically identical, but I didn't like that argument. Actually, they were literally identical: X and Y were 2 different web pages running on the same physical machine, accessing the same physical database, but...running 2 different settings and code bases.
X (the old system) was so terribly slow that each request took 40 seconds, and Y (the improved one) responded within 2 seconds. The developer cried "see how the performance increased". Actually, X was so badly configured that it was impossible to prove that the new version Y was faster because of his changes. When I demonstrated to him that there was yet another environment Z where the results were far better than those of X, although Z had no optimization, he realized that X was a corrupt and bad environment to compare against.

Sunday, March 15, 2009

Don't Raise Any More Bugs

Quite a few years ago, believe it or not, I was asked to stop looking for more bugs because it was difficult enough for the manager to "sell" the problems we already had.

"Knowing about more problems would make it more difficult for the customer to accept the release..."

and so I stopped looking for more...

Monday, March 9, 2009

Unconventional Appliance

For those who didn't recognize what's hanging at the other end of the bar...it's a biscuit in the shape of a bear (ursa).

About a year back I tried out some JavaScript injection by typing simple JavaScript commands directly into messages that were part of the communication structure between fat clients and an online application-under-test. Whenever someone queried the list of items stored in the database, that piece of script was executed. The effect was a simple message saying "hello".

Users found it funny, some experts already knew it would become an issue sooner or later if JavaScript could be injected into one of the most important interfaces, but priorities were elsewhere. So, nothing happened for quite a while.
I decided to make it a little bit more obvious what could happen and copied and edited the login page of our web portal, then published it on my own webserver outside the company. Then I injected a JavaScript URL redirection to my webserver into the message. The effect now was that users who queried the items in the list got redirected automatically to my "fake" logon page that looked exactly like the one from the company. Users who fell for this harmless phishing attack shrugged their shoulders for a moment and then re-typed username and password into my logon page, not knowing that the POST request this time did something else, for instance storing their usernames and passwords. Real security experts laugh at this primitive phishing attack, but I was shocked how easily it worked. Amazingly, many testers and other users who were navigating on our test environment realized that something was wrong, but none recognized the significance, or let's say the priority, of fixing this issue.

After quite another while I decided to make it a little bit more obvious again...I replaced the script instruction with an img tag, linking to a picture I found somewhere else on the web. Any person who logged on to the system and queried the database still received the queried list row by row, but one row now contained a huge picture showing a crying ape with his mouth wide open.
This was in a list column where usually only text was expected for display. Believe it or not, it didn't take long after that until injection of such script tags was finally filtered. But all in all, it took more than a year until that issue was fixed.
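How the filter was eventually implemented is not something I can show here. One common way to defuse this kind of injection is to escape stored text before it is written into the page. The sketch below uses Python's standard `html.escape`; the function `render_cell` and the sample inputs are made up for illustration.

```python
import html


def render_cell(text):
    """Escape a stored value before placing it into an HTML table cell.

    Hypothetical helper: html.escape turns <, >, & and quotes into
    entities, so the browser displays the markup as text instead of
    interpreting it.
    """
    return "<td>" + html.escape(text) + "</td>"


# A script tag and an img tag, similar to the ones described above.
print(render_cell('<script>alert("hello")</script>'))
print(render_cell('<img src="http://example.invalid/ape.jpg">'))
```

Escaping on output keeps the stored data untouched and only changes how it is displayed, which is one reason it is often preferred over stripping tags on input.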

Monday, February 23, 2009

Who's to blame?

Sooner or later, either you or your team will be faced with the question: "How come the testers haven't found this bug?" One thing is for sure:

Testers don't create bugs, those are implemented by the developers!

So, the question should rather be:

"Why do we allow developers to implement bugs? And why do we allow them to do it over and over again?".

I cannot think of a single defect where our testers don't ask themselves the question "How could I have prevented this bug from going undetected?" and then update their testware accordingly. But the number of test cases grows with the features while the number of testers is often a constant. What's left are priorities and a supernatural nose for finding defects...sometimes it just doesn't come up to everyone's expectations.

Sunday, February 8, 2009

Greed Without Limits


Inspiration for this cartoon came from an article by Daniel Binswanger, found in "DAS MAGAZIN", issued by the Basler Zeitung on February 7, 2009:

"...Although UBS posted a loss of 4.4 billion in 2007, it paid out almost three times that gigantic loss (12 billion) in performance bonuses to its employees. If one takes this lunacy as the yardstick, the 2008 pay regime does indeed seem reasonably modest. A year ago, however, the Swiss taxpayer could still cling to the illusion that the megalomania of the big bankers would remain the private problem of their shareholders. Today we know that we are the ones who have to foot the bill..."

It is already the second cartoon on the topic of the financial crisis, but if you carry the thought a bit further, you might find some similarities in the business I am working in.

Sunday, January 18, 2009

Fooling the Robot

Changing UIs, unique names and IDs, the position of UI controls, dealing with 3rd-party elements, etc. All this stuff is daily business for test automation engineers.

Besides a few smaller articles, I published two more detailed articles on this topic in the US magazine Software Test & Performance, in 2006 (pg. 33 ff) and 2007 (pg. 31 ff).