Where did IBM blow all that smoke?

Posted September 26, 2007 by Brent
Categories: General Test

I am waaaaaaay too busy to be distracted today. There is a release going out on Friday and, as usual, there is a lot of work to do. So when I got a Dr. Dobb’s e-mail with something that looked somewhat test oriented and pointed me to IBM, I went. The paper seemed to take testing one step beyond record-and-playback: it would try to figure out which ‘test case’ you were executing and offer to finish it out for you. The more I read, the more I felt like I was reading sci-fi. Then I got to this on page 4 of 4: “Although these are currently unimplemented, we will give a brief overview of some of the possible approaches.” Now I knew where the smoke was getting blown. Wow, good luck with that! I feel like someone from the testing community should say: hey, we are working on these problems… and they are real hard to solve.

Should there be a Watir users group in your town?

Posted September 20, 2007 by Brent
Categories: General Test

I think it is time for Austin to get a Watir users group.  Bret Pettichord gauged interest in this about a year ago, and the response was good.  It may be even better to start one now.

In the last year I have seen some great advances in what you can do with Watir to help people really take this technology and start writing practical tests very quickly.  Last week Hugh McGowan and Rafael Torres showed off a flexible data-driven test system that turned each row or column of a worksheet into a set of RSpec tests.  This is open source software on RubyForge.  Shortly after that, Mark Anderson used it to write a data-driven test that hits Google with given search terms to verify the expected top URL.  That code should be on the same RubyForge site now.  The demo was quick and impressive.

Alongside these advances, I saw someone ask on the Austin Watir mailing list how to install Watir on their system.  There was a quick answer.  The Watir-general mailing list is very busy these days with both advanced and newbie questions.

Now seems to be a good time to get an organized Watir group set up in Austin.  I also like the idea of exporting our presentations (and importing as well) to user groups in other cities, to get things moving there too.

I’m thinking a meeting would be called when we have a venue and 2 presenters and might go like this:

5:30-6:15 Early arrival – people can bring questions to see if a more experienced user can help them.  This could be as involved as installing Ruby/Watir/test tools on the machine, or a demo of a recent user-group presentation.

6:15-6:30 Regular arrival – people can announce that they are looking for Watir users and testers, or that they are looking for work in the field.

6:30-7:00 Watir report – Someone reports experience with the Watir or related tools at work.

  • This was the problem.
  • We did this.
  • This worked / this didn’t.

7:10-7:50 Demo – Showing off a new tool, module, or library

  • See Watir in action
  • Learn about coding techniques or tools
  • Later you can download and try out something

Ideas?  Feedback? Would this be good?  What day are you free to do this?

How do I track all the test equipment?

Posted August 29, 2007 by Brent
Categories: Test Harness

I have ideas about how to track it, but I want to do it in a way that others could use the tracking tools too. The key is setting it up so that the test harness can see what is available and working, then reserve it for the duration of the test. At Convio we had test systems that got a build and were set up a certain way that most of the tests needed. If we had had automated tests, those systems would have been the target, and the parameters you feed in would be: host name, DB credentials, and possibly which browser platform to use.

At Uplogix I have 2 kinds of equipment to test: an Envoy that you ssh into to control various devices, and an EMS that has a web front end controlling many Envoys. The configuration is complicated by which devices are physically connected to the Envoys (routers, switches, GPS units, Unix machines, or simulators for these devices) and how they get connected to the EMS.

I want to solve this in a generic way: define resource types (Envoy, EMS, serial device, power controller, …), instances of those resources, and how they are connected. Then I can write scripts to make sure the devices are functioning (they break a lot with what we do to them), configure them when new builds show up, and allow test cases to find what they want and then reserve it. That last part is the most interesting.

So I may have 7 Envoys of two different flavors, with 4 routers connected to them, 3 of which are connected to a power controller. A test case that needs to break into a router by power cycling it, catching the prompt, and sending a new config would have to look for this:

(Envoy: type='TU301', name=$envoy, reserved=false) and (2600: envoy=$envoy, power=$power, port=$port, reserved=false)

Like Prolog or an SQL inner join, it would match the name of the TU301 with the 2600’s envoy so it knows they are wired together (and not reserved by another test or person), and give you the names of the devices, the port id on the Envoy, and the power port id to pass to the test case.

In an SQL transaction it would have to reserve the unreserved equipment, or roll back and wait, or try a different set of gear. Once the tests are complete it would unreserve both pieces of equipment.
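Here is a rough in-memory Ruby sketch of the match-and-reserve idea. The Resource struct, the Lab class, and the attribute names (:type, :envoy, :port, :power) are all made up for illustration; a real version would do this in a database transaction as described above.

```ruby
# Rough in-memory model of the equipment pool. The attribute names
# and the two-step Envoy+router match are illustrative only.
Resource = Struct.new(:kind, :name, :attrs, :reserved) do
  # A resource matches if it is the right kind, not reserved, and
  # every constraint (e.g. envoy: "e1") agrees with its attributes.
  def matches?(kind_wanted, constraints)
    kind == kind_wanted && !reserved &&
      constraints.all? { |k, v| attrs[k] == v }
  end
end

class Lab
  def initialize(resources)
    @resources = resources
  end

  # Find an unreserved Envoy of the given type and an unreserved router
  # wired to it; reserve both, or return nil and reserve nothing.
  def reserve_pair(envoy_type)
    envoy = @resources.find { |r| r.matches?(:envoy, type: envoy_type) }
    return nil unless envoy
    router = @resources.find { |r| r.matches?(:router, envoy: envoy.name) }
    return nil unless router
    envoy.reserved = router.reserved = true
    { envoy: envoy.name, port: router.attrs[:port], power: router.attrs[:power] }
  end

  # Unreserve the named equipment once the test is done.
  def release(*names)
    @resources.each { |r| r.reserved = false if names.include?(r.name) }
  end
end
```

A second call to `reserve_pair` while the gear is still reserved comes back with nothing, which is exactly the point at which the harness would wait or hunt for a different set of gear.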

When that TU301 test is done, we look for ‘Spitfire’ instead and run the test again with the new parameters. If we don’t find matching equipment, we flag the test as skipped due to missing or busy resources, for a person to fix.

For bonus points I can run the sanity tests on both devices after each test; if the router is in ROMMON mode or unresponsive, I know which test caused the problem and can take the device out of service. Possibly disable the test too.

So now I have tests running every night

Posted August 28, 2007 by Brent
Categories: Test Harness

I have tests running every night, great. Now I need to monitor the results.

The tests are written in Perl, using Expect.pm to test a CLI. The tests are not super important and don’t give any measurable coverage, but they let me see right off the bat a problem that needs to be solved now. Right now I have to go look at a generated file and see what the deltas are. Here is how I got to this point.

  • Create Perl libraries to figure out what build is on a machine and load a different one
  • Make an object that encapsulates all the Expect code to communicate with the device I am testing.
  • Generate tests using the object with Perl’s Test::More module.
  • Write a program that looks at the Cruise Control build machine, finds the latest build, uses the library to load it onto the test machine, and then runs the tests.

So, here is the problem now, I am looking at a file every morning to see what the results are. I see this output:

ok 1 - timeout set to 120 minutes
ok 2 - timeout set to 5 minutes
ok 3 - timeout set to 101 minutes
...
ok 11 - timeout 5 minutes after disconnect
ok 12 - timeout set to 120 minutes
1..12

So now I want to move this to my database and have it e-mail me when there is a delta. I can look at the results with some CRUD scaffolding. Later there will be web testing, which will use Watir and be written in Ruby.
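The delta-spotting part is the easy bit, whatever harness ends up around it. A small Ruby sketch, assuming the TAP-style “ok N - description” lines above; the function names are mine, and the database and e-mail pieces are left out:

```ruby
# Parse TAP-style output ("ok 1 - desc" / "not ok 2 - desc") into a
# description => passed? hash. Plan lines like "1..12" are ignored.
def parse_tap(text)
  text.each_line.with_object({}) do |line, results|
    next unless line =~ /\A(not ok|ok) \d+ - (.+?)\s*\z/
    results[Regexp.last_match(2)] = (Regexp.last_match(1) == "ok")
  end
end

# Names of tests whose pass/fail status changed between two runs;
# these are the deltas worth an e-mail.
def deltas(previous, current)
  current.select { |name, passed| previous.key?(name) && previous[name] != passed }.keys
end
```

With last night’s hash stored in a database instead of a file, a cron job could run `deltas` each morning and mail only when it comes back non-empty.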

How do I do this?

Write something like Test::More in Perl and Ruby that connects to a database?

Or extend Test::More to do this? What about Ruby? Extend RSpec to do this?

I like the power you get from tools like RSpec and Test::More but do they work well outside of a unit test environment?

Nobody cares if your automated test passes or fails.

Posted August 7, 2007 by Brent
Categories: Test Harness

Years ago, when I worked at Convex Computer Corp, we used something that is now called the Convex Integrated Test Environment, or CITE. While running a set of tests you would see things like PASSED, which made you feel good, or FAILED – RC 1 expected 0, which made you feel less good. I tested the compilers, libraries, and similar tools, which shipped with very few known errors, but we did have them. The errors would involve things like “should have vectorized this loop” or “got an IEEE NaN instead of Infinity”, but I soon learned that people didn’t care about the failures. That surprised me. We had a tool that post-processed the results and compared them against what was expected: if we expected this bizarre Fortran expression with 10,000 subexpressions to produce a compiler error, and it was still giving the same error, that was fine. Only after the post processor had looked at the results were people interested, and then and only then did people take action. Qualification of a single compiler took about two weeks with nearly 100% test automation, with tests running on millions of dollars of hardware all over the company. Waiting 8 hours for the real test results was tough for a new college graduate with a whole department anxiously awaiting results. This was a great system for the day.

When I left the scientific computing world behind for things built with those compilers and libraries, I found the tolerance for known problems to be much higher. Later, leading the testing effort for a startup, I decided to build the expected and unexpected right in. Now when the tests ran you would see EXPECTED – PASS and feel good, but you would still feel good when you saw EXPECTED – FAIL RC=2 expected 0 (bug 123). This was good, as I was now testing medical software (which they still seem to be selling today at Quovadx; I hope my tests are still running) using less than $50,000 of equipment. The tests had only 24 hours to run, and the software was arguably more complex: an RDB was employed, along with multiple config files, threads in the early days of threads, and eventually multiple platforms, including the dreaded Windows NT.

It turned out that building in a system for expected/unexpected results was a good design when multiple platforms came along with different results. The system that captured a signature for a test failure and remembered the associated bug now had another parameter, platform, added to it. With many sets of expected results you could scan the old ones to look for regressions, or for one platform acting like another. This was good.
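A minimal Ruby sketch of that expected/unexpected idea; the function name and the baseline format (a hash keyed by [test, platform]) are made up for illustration:

```ruby
# The baseline maps [test, platform] to the failure signature we expect
# (or :pass when no failure is known). A result matching the baseline is
# EXPECTED even when it is a failure; anything else is UNEXPECTED and
# worth a human's time.
def classify(test, platform, actual, baseline)
  expected = baseline.fetch([test, platform], :pass)
  status  = (actual == expected) ? "EXPECTED" : "UNEXPECTED"
  verdict = (actual == :pass) ? "PASS" : "FAIL (#{actual})"
  "#{status} - #{verdict}"
end
```

Adding a platform is just another key in the baseline, which is what made the signature-plus-platform design hold up when the results started to diverge per platform.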

Now I am at Uplogix, and I am faced with the same problem. However, the UI is far more complicated. I no longer use canned Fortran source or a Perl script to generate config files and run executables on them. I have a Cisco IOS-ish CLI on a hardware appliance, which I have chosen to drive with Perl’s Expect.pm library. I have also done a tiny amount of web automation with Watir on our web product.

What do I do now? The web management and our Envoy machines are tied very closely together. I’m a huge fan of Perl, but tools like Samie are not at the same level as Watir, and Ruby’s expect is not as powerful as the Tcl or Perl versions. It is time to build the test harness that binds all of this together. We have multiple platforms again, in the form of different browsers. Tests now involve many machines, so remote communication will be needed.

First blog

Posted July 11, 2007 by Brent
Categories: General Test

Last night I was talking with other testers over a few beers and realized I was a non-blogging dinosaur that still read magazines. I was also convinced that I was not a testing dinosaur and that I had something to contribute. So here I am, I updated my RSS feeds to look at things other than photography and cycling and I will start writing. Really, I promise.

My areas of test automation are [I think I need better names]:

  1. Full automation – instead of manual-testing helpers. This includes hooks into Cruise Control to start tests, cron, test lab setup and tear down, and informing people of unexpected results. I have nothing against tests that help manual testers; it just isn’t interesting to me now. At Uplogix I have a lab to manage, issue tracking, and all the equipment that I test with.
  2. Test harnesses – This is the part of the test system that someone else can use regardless of what they are testing. Your test cases follow patterns indicating passing/failing and expected/unexpected results, when a test case gives up due to the environment or failures, and when it continues. It may have to activate hardware or virtual machines to connect to a server and get tests running on that platform. Unit testers have frameworks like RSpec for Ruby, JUnit for Java, or Test::More for Perl, but I don’t know of an open source harness or framework for system testing – something that would understand that the same test run on Windows ME/IE5 and on Windows Vista Business/IE6 is really two different tests.
  3. Test infrastructure – how you organize your test code so the proper code is in the right place.
    • Abstraction layer – gives you an API (in a ‘real language’) to write your test cases to. This might deal with the UI, web services, REST, an underlying RDB, or a file system.
    • Test code – the steps themselves: the bulk of your testing intellectual property, kept safe from changes to the UI or the technology that connects to it.
    • Test Harness – described above
  4. Automated test organization – Version tracking tests and the abstraction layer, tracking known failures, and test results.
  5. Virtualization – using technology like VMware to test different OS/browser combinations in a clean-room environment. Virtualizing machines for upgrade testing could pay off well too.
  6. Web testing, Watir in particular – working with the Ruby test-system group to test the reference web application.
  7. Command line testing – I am currently using Perl’s Expect.pm module to write an abstraction layer.