On UI Testing

I’m part of a secret society of developers that has evolved over the years into something that has had a pretty significant impact on my career. We ask each other for advice, hang out at conferences, discuss trends in the field, and so on and so forth. Pretension is low; content and entertainment value are high. It’s essentially everything I had hoped alt.NET would have been.

A short while ago, we had a chat. It was the latest in a series, depending on how you define “series”, where we gather together to discuss some topic, be it JavaScript frameworks, OO practices, or smoked meat. On this particular day, it was UI testing.

I don’t recall all the participants but it was a good number of the people on this list. Here, I’m going to attempt to summarize the salient points but given my memory, it’ll more likely be a dissertation of my own thoughts. Which is just as well as I recall doing more talking than I should have.

Should you UI test?

This was a common thread throughout. Anyone who has done a significant amount of UI testing has asked a variant of this question. Usually in the form, “Why the &*%$ am I doing this?”

Let it not be said that UI testing is a “set it and forget it” affair. Computers are finicky things, UIs seemingly more so. Sometimes things can take just that one extra second to render and all of a sudden your test starts acting out a Woody Allen scene: Where’s the button? There’s supposed to be a button. YOU TOLD ME THERE WOULD BE A BUTTON!!!

Eventually, we more or less agreed that they are probably worth the pain. From my own experience, working on a small team with no QA department, they saved us on several occasions. Yes, there are the obvious cases where they catch a potential bug. But there was also a time when we had to re-write a large section of functionality with no change to the UI. I felt really good about having the tests then.

One counter-argument was whether you could just have a comprehensive suite of integration tests. But there’s something to be said for having a test that:

  1. Searches for a product
  2. Adds it to the shopping cart
  3. Browses more products
  4. Checks out
  5. Goes to PayPal and pays
  6. Verifies that you got an email

This kind of test is hard to do as an integration test, especially when you want to verify all the little UI things in between, like whether a success message showed up or whether the number of items in the shopping cart incremented by 1.

We also had the opposite debate: If you have a comprehensive suite of UI tests and are practicing BDD, do you still need TDD and unit tests? That was an interesting side discussion that warrants a separate post.

Maintenance

…is ongoing. There’s no getting around that. No matter how bullet-proof you make your tests, the real world will always get in the way. Especially if you integrate with third-party services (<cough>PayPal<cough>). If you plan to introduce UI tests, know that your tests will be needy at times. They’ll fail for reasons unknown for several consecutive runs, then mysteriously pass again. They’ll fail only at certain times of the day, when Daylight Saving Time kicks in, or only on days when Taylor Swift is playing an outdoor venue in the western hemisphere. There will be no rhyme or reason to the failures and you will never, ever be able to reproduce them locally.

You’ll add sleep calls out of frustration and check in with only a vague hope that it will work. Your pull requests will be riddled with variations of “I swear I wouldn’t normally do this” and “I HAVE NO IDEA WHAT’S GOING ON”. You’ll replace elegant CSS selectors with XPath so grotesque that Alan Turing will rise from his grave only to have his rotting eyeballs burst into flames at the sight of it.

This doesn’t really jibe with the “probably worth it” statement earlier. It depends on how often you have to revisit them and how much effort goes into it. From my experience, early on the answer is: often and a lot. As you learn the tricks, it dwindles significantly.

One of those tricks is the PageObject pattern. There was universal agreement that it is required when dealing with UI tests. I’ll admit I hadn’t heard of the pattern before the discussion but at the risk of sounding condescending, it sounds more like common sense than an actual pattern. It’s something that, even if you don’t implement it right away, you’ll move toward naturally as you work with your UI tests.
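For anyone else who hadn’t heard of it, here is roughly what the pattern amounts to: one class per page that owns the selectors, so the tests talk in terms of “sign in” rather than “#sign-in”. This is a minimal sketch, not code from the discussion. The Browser interface, the FakeBrowser, and every selector below are made up for illustration; in real life the Browser would be whatever driver you use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a real driver (Selenium, Canopy, etc.).
interface Browser {
    void type(String selector, String text);
    void click(String selector);
    String textOf(String selector);
}

// Page object: the only place that knows the login page's selectors.
class LoginPage {
    private static final String EMAIL = "#email";
    private static final String PASSWORD = "#password";
    private static final String SUBMIT = "#sign-in";
    private static final String MESSAGE = ".flash-message";

    private final Browser browser;

    LoginPage(Browser browser) {
        this.browser = browser;
    }

    // Tests call this instead of poking at individual elements.
    LoginPage signInAs(String email, String password) {
        browser.type(EMAIL, email);
        browser.type(PASSWORD, password);
        browser.click(SUBMIT);
        return this;
    }

    String message() {
        return browser.textOf(MESSAGE);
    }
}

// In-memory fake so the sketch runs without a real browser.
class FakeBrowser implements Browser {
    private final Map<String, String> fields = new HashMap<>();
    private String flash = "";

    public void type(String selector, String text) {
        fields.put(selector, text);
    }

    public void click(String selector) {
        if (selector.equals("#sign-in")) {
            flash = "Welcome, " + fields.getOrDefault("#email", "");
        }
    }

    public String textOf(String selector) {
        return selector.equals(".flash-message") ? flash : "";
    }
}

class PageObjectDemo {
    public static void main(String[] args) {
        LoginPage page = new LoginPage(new FakeBrowser());
        System.out.println(page.signInAs("hill@billy.edu", "moonshine").message());
    }
}
```

When the markup changes, and it will, the fix goes in LoginPage and the tests themselves stay put.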

Data setup

…is hard, too. At least in the .NET world. Tools like Tarantino can help by creating scripts to prime and tear down a database. You can also create an endpoint (on a web app) that will clear and reset your database with known data.

The issue with these approaches is that the “known” data has to actually be known when you’re writing your tests. If you change anything in it, Odin knows what ramifications that will have.

You can mitigate this a little depending on your technology. If you use SpecFlow, then you may have direct access to the code necessary to prime your database. Otherwise, maybe you can create a utility or API endpoints that allow you to populate your data in a more transparent manner. This is the sort of thing that a REST endpoint can probably do pretty well.

Mobile

Consensus for UI testing on mobile devices is that it sucks more than that time after the family dinner when our cousin, Toothless Maggie, cornered—…umm… we’ll leave it at: it’s pretty bad…

We would love to be proven wrong but to our collective knowledge, there are no decent ways to test a mobile UI in an automated fashion. From what I gather, ain’t no picnic doing it in a manual fashion. Emulators are laughably bad. And there are more than a few different types and versions of mobile device so you have to use these laughably bad options about a dozen different ways.

Outsourcing

What about companies that will run through all your test scripts on multiple browsers and multiple devices? You could save some development pain that way. But I personally wouldn’t feel comfortable unless the test scripts were extremely prescriptive. And if you’re going to that length, you could argue that it’s not a large effort to take those prescriptive steps and automate them.

That said, you might get some quick bang for your buck going this route. I’ve talked to a couple of them and they are always eager to help you. Some of them will even record their test sessions which I would consider a must-have if you decide to use a company for this.

Tooling

I ain’t gonna lie. I like Cucumber and Capybara. I’ve tried SpecFlow and it’s probably as good as you can get in C#, which is decent enough. But it’s hard to beat fill_in 'Email', :with => 'hill@billy.edu' for conciseness and readability. That said, do not underestimate the effort it takes to introduce Ruby to a .NET shop. There is a certain discipline required to maintain your tests and if everyone is scared to dive into your rakefile, you’re already mixing stripes with plaid.

We also discussed Canopy and there was a general appreciation for how it looks though Amir is the only one who has actually used it. Seems to balance the readability of Capybara with the “it’s still .NET” aspect of companies that fear anything non-Microsoft. It’ll be high on my list of things to try the next time I’m given the option.

Of course, there’s Selenium, both the IDE and the driver. We mentioned it mostly because you’re supposed to.

Some versions of Visual Studio also provide support for UI tests, both recorded and coded. The Coded UI tests are supposed to have a pretty nice fluent interface and we generally agreed that coded tests are the way to go instead of recorded ones (as if that were ever in doubt).

Ed. note: Shout out to Protractor as well. We didn’t discuss it directly but as Dave Paquette pointed out later, it helps avoid random Sleep calls in your tests because it knows how to wait until binding is done. Downside is that it’s specific to Angular.

Also: Jasmine and PhantomJS got passing mentions, both favorable.
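On the subject of random Sleep calls: if your tool doesn’t know how to wait for you, the usual substitute is to poll for a condition with a timeout instead of sleeping a fixed amount and hoping. Here is a minimal sketch of that idea. The Wait class and its until method are my own invention for illustration, not part of any of the tools above.

```java
import java.util.function.BooleanSupplier;

class Wait {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition held, false if we gave up.
    static boolean until(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A test would then say something like Wait.until(() -> page.hasButton(), 5000, 100), where hasButton is whatever check your page object exposes. It returns as soon as the button shows up and only burns the full five seconds when something is actually wrong.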

Continuous Integration

This is about as close as we got to disagreement. There was a claim that UI tests shouldn’t be included in CI due to the length of time it takes to run them. Or if they are included, run them on a schedule (e.g. once a night) rather than on every “check in” (by which we mean, every feature).

To me, this is a question of money. If you have a single server and a single build agent, this is probably a valid argument. But if you want to get full value from your UI tests, get a second agent (preferably more) and run only the UI tests on it. If it’s not interfering with your main build, it can run as often as you like. Yes, you may not get the feedback right away but you get it sooner than if you run the UI tests on a schedule.


The main takeaway we drew from the discussion, which you may have gleaned from this summary, is: damn, we should have recorded this. That’s a mistake we hope to rectify for future discussions.

Posted in UI Testing | 4 Comments

CQRS recap, or “How to resuscitate”

I’m fighting a bit with my ego at the moment which is telling me I need to provide at least four paragraphs of update on what I’ve been doing in the three years since I last posted. The fight is with my more practical side which is saying, “Name three people that have noticed.” I’ll compromise with a bulleted list because some of it does have a little bearing on the rest of this post:

  • I’m no longer with BookedIN although they are still going strong.
  • I’ve started recently with Clear Measure who has graciously relaxed their “no hillbilly” hiring policy. Guardedly.
  • For those interested in my extra-curriculars, I’m also blogging at http://kyle.baley.org where you’ll find a recent follow-up to an older post on Life in the Bahamas, which I keep getting emails about for some reason…

This past weekend, Clear Measure hosted a meetup/coding-thingy on CQRS with your host, Gabriel Schenker. Initially intended as an event by and for Clear Measurians, it was opened to the public as a means to garner feedback for future events. It was a 7-hour affair where Gabriel set out the task to perform then left us to our devices to build as much as we could while he provided guidance and answered questions.

The event itself ran as well as I expected, which, me being an optimistic sort, was awesome! And Gabriel, if you’re reading, I did manage to get to the beach today so don’t feel bad about taking that time away from me. I won’t go into logistics but wanted to get my thoughts on CQRS on the table for posterity.

By way of background, my knowledge of CQRS was, up until I started at Clear Measure, pretty vague. Limited mostly to what I knew about the definition of each of the words in the acronym. Gabriel has, in meetings and recently in his blog, increased my awareness of it to some degree to the point where it made sense as an architectural pattern but was still abstract enough that if someone asked me to implement it in a project, I would have first consulted with a local voodoo doctor (i.e. speed dial #4).

The good part

So the major benefit I got from the event is how much CQRS was demystified. It really *is* just segregation of the responsibilities of the commands and the queries. Commands must logically be separated from queries to the point where they don’t even share the same domain model. Even the term “domain model” is misleading since the model for queries is just DTOs, and not even glorified ones at that.

Taking one example, we created ourselves a swanky new TaskService for saving a new task. It takes in a ScheduleTaskDto which contains the basics from the UI: a task name, a due date, some instructions, and a list of assignees. The TaskService uses that info to create a fully-formed Task domain object, setting not only the properties passed in but also maybe the CreateDate, the Status, and the ID. Then maybe it validates the thing, saves it to the repository, and notifies the assignees of the new task. All like a good, well-behaved domain object.

Now we want a list of tasks to show on the dashboard. Here are two things we actively had to *stop* ourselves from doing:

  • Returning a list of Task objects
  • Putting the logic to retrieve the tasks in the TaskService

Instead, we created a DashboardTask DTO containing the TaskId, Name, DueDate, and Status. All the items needed to display a list of tasks and nothing else. We also created a view in the database that retrieves exactly that information. The code to retrieve that info was in a separate class that goes directly to the database, not through the TaskService.

Given more time, I can see how the command/query separation would play out more easily. For the commands, we may have used NHibernate which gives us all the lazy loading and relationship-handling and property mappings and everything else it does well. For the queries, we’d probably stick with views and Dapper, which allow us to query exactly the information we want.

My sense is that we’d have a much bigger set of classes in the query model than in the command model (which would be a full-fledged domain), because the query model requires at least one class for almost every screen in the app. Dashboard listing of tasks for supervisors: SupervisorDashboardTask. List of tasks for a dropdown list: TaskListItem. Retrieve a task for printing on a report: OverdueTask. All separate and all very specific.
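To make the split concrete, here is a rough sketch of the two sides described above. It is not the code we wrote at the event (that was .NET); it is a Java approximation with loud assumptions: TaskStore is a hypothetical in-memory list standing in for both the repository and the database view, DashboardQueries is a name I made up for the “separate class”, the "Scheduled" status is invented, and notifying the assignees is left out entirely.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Command side: what the UI sends when scheduling a task.
class ScheduleTaskDto {
    final String name;
    final LocalDate dueDate;
    final String instructions;
    final List<String> assignees;

    ScheduleTaskDto(String name, LocalDate dueDate, String instructions, List<String> assignees) {
        this.name = name;
        this.dueDate = dueDate;
        this.instructions = instructions;
        this.assignees = assignees;
    }
}

// Command side: the full domain object, including fields the UI never supplies.
class Task {
    final int id;
    final String name;
    final LocalDate dueDate;
    final String instructions;
    final List<String> assignees;
    final LocalDate createDate;
    final String status;

    Task(int id, ScheduleTaskDto dto, LocalDate createDate) {
        this.id = id;
        this.name = dto.name;
        this.dueDate = dto.dueDate;
        this.instructions = dto.instructions;
        this.assignees = new ArrayList<>(dto.assignees);
        this.createDate = createDate;
        this.status = "Scheduled"; // assumed initial status
    }
}

// Query side: exactly the columns the dashboard shows and nothing else.
class DashboardTask {
    final int taskId;
    final String name;
    final LocalDate dueDate;
    final String status;

    DashboardTask(int taskId, String name, LocalDate dueDate, String status) {
        this.taskId = taskId;
        this.name = name;
        this.dueDate = dueDate;
        this.status = status;
    }
}

// Hypothetical in-memory stand-in for the database.
class TaskStore {
    final List<Task> rows = new ArrayList<>();
}

// Command side: validates, builds the domain object, and saves it.
class TaskService {
    private final TaskStore store;

    TaskService(TaskStore store) {
        this.store = store;
    }

    int schedule(ScheduleTaskDto dto) {
        if (dto.name == null || dto.name.trim().isEmpty()) {
            throw new IllegalArgumentException("A task needs a name");
        }
        Task task = new Task(store.rows.size() + 1, dto, LocalDate.now());
        store.rows.add(task);
        return task.id;
    }
}

// Query side: a separate class that never goes through TaskService
// and never hands a Task back to the caller.
class DashboardQueries {
    private final TaskStore store;

    DashboardQueries(TaskStore store) {
        this.store = store;
    }

    List<DashboardTask> tasks() {
        List<DashboardTask> result = new ArrayList<>();
        for (Task t : store.rows) {
            result.add(new DashboardTask(t.id, t.name, t.dueDate, t.status));
        }
        return result;
    }
}
```

The thing to notice is what is missing: TaskService has no “get” methods and DashboardQueries has no way to change anything. In a real app the loop in tasks() would be a query against the view.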

Wrap up

My partner-in-crime for the day was Alper Sunar who is hosting our day’s efforts, such as they are. The big hurdle I had to jump early on was to stop myself from going infrastructure crazy. Early discussions touched on: Bootstrap, RavenDB, IoC, and Angular, all of which would have kept me from my goal: learning CQRS.

I’ve forked the code with the intent of continuing the journey and perhaps looking into something like RavenDB. I have to admit, all the talk around the virtual water cooler about elastic search has me thinking. And not just about finding new sister-wives…

Kyle the Returned

Posted in BookedIN, Clear Measure, CQRS | 2 Comments

Cron and AppEngine

Quick PSA on using cron jobs with Google App Engine because it almost wreaked havoc for us.

App Engine has a lovely feature of having different versions of your app. You can upload a new version but not make it the default until you’re good and ready. We do this all the time for deployment. Deploy to a new version and try it out, then make it the default when we’re ready to unleash it. Often, we deploy to the new version a day or so in advance.

Cron jobs, it seems, are handled outside this versioning mechanism. If you upload a new cron.xml file, it’s active. Right now. Doesn’t matter if the version it was deployed in is the default or not. As soon as it’s uploaded, it’s the new cron-ness.

Where this almost bit us is that we added a new cron job in our most recent release (deployed yesterday but not active) to use a dynamic backend. As soon as the cron job got uploaded, it started running. I didn’t notice until this morning when our backend usage reflected the new cron job. Some quick research and here we are.

What this means long term is that cron.xml is no longer going to be deployed as part of our application. It now becomes an entirely separate process. I’m a little annoyed that we have to wait until we pull the trigger on the new version before we can upload the new cron.xml but it’s a quick step.

Kyle the Mis-scheduled

Posted in Uncategorized | 2 Comments

The Economics of Ergonomics

Let it not be said there are no downsides to living in the Bahamas (though if you’ll permit a little boasting, a shortage of fantastic venues if you’re lucky enough to be in a band is not one of them).

The desk where I work is too high, plain and simple. So much so that I’ve recently abandoned my Kinesis keyboard because it is not what you might consider “low form factor”. I’ve started feeling some twinges in my lower forearm that my unscientific diagnosis is attributing to the height of my hands while I type. Dumping the Kinesis has helped but it has also led to the return of stress in other areas of my hands. And no amount of raccoon skinning seems to alleviate the pain.

Getting a lower desk is easy enough but I’d actually like to do a little experimenting with two alternatives. Alas, neither is easily done in the Bahamas. The underlying problem is availability. The desks/equipment I want to test are not available here so I would have to order them in. Which means both shipping charges and import duties, the latter of which is a major source of income for the Bahamian government to offset the fact that there is no income tax. So returning said equipment is just not practical if it doesn’t work out. Nor is there much of a reseller market.

So I’m hoping I can get some comments from people who have done something similar.

Adjustable height desk

These are, of course, desks where you can adjust the height easily. I like the idea of these for two reasons:

  • They can be set low
  • They can be set high

I’ve never tried a desk that you stand at but I’ve always wanted to. Working on my own, I tend to get up and wander a lot while I’m thinking. I also pace when I’m on the phone with someone for any length of time so it would be more convenient to walk up to the computer during the conversation should the need arise. (“You want to know the right pattern of plaid for a first date with your second cousin? Let me look that up.”)

I went desk-shopping over the weekend and the closest thing I saw was in an office supply store. And it wasn’t on the showroom floor. Off in the corner of the store were the employee desks. They were all essentially plywood based, laminated desktops all mounted in warehouse style shelving frames. They sat on brackets in the frame which means you could set the height to whatever you want. It wasn’t something you could easily adjust on the fly and my wife wasn’t too thrilled at the industrial look so it was a fleeting idea at best.

Command centre

This is an idea I’ve had ruminating in my head for a while now. I would get rid of the desk altogether in favour of a comfortable command-centre style or gaming chair. In front of it, I’d mount my monitors on a couple of flexible arms somehow, possibly on the armrests or on stands on either side of the chair. The important thing is that I can slide the monitors out of my way when I want to get out of the chair, and slide them back in when I sit down.

The keyboard would rest either on my lap or on some flat surface on my lap. Or maybe go with a split keyboard (though one without a wire between the two) and have one piece mounted on each armrest. Haven’t quite worked out how the mouse would fit in though. A trackball on some little platform on the side makes sense but I’ve got one now and it doesn’t feel as productive as just a regular mouse.

I feel like this would be more comfortable and would reduce much of the muscle stress that seems to have become more prominent since hitting 40 earlier this year. All of this kind of makes sense in my head but the logistics of getting the stuff here are such that I don’t want to make the investment unless I’ve had a chance to try it out at least for a few days. There’s a chance my tendency to get up and wander might make this impractical. Or maybe cord management would be an ongoing problem.

The device shown at right (the MYPCE computer workstation), which I discovered while researching this article, is essentially what I’ve described. It’s some US$2750. Duty would add about 50% and shipping would likely bring the total price above five large. There’s another potential hurdle in that it may not be available anymore, given that the company’s domain seems to point to a parking spot. But even building my own custom version would cost enough in non-refundable cash dollars that I’m not heading over to eBay.

Instead, I’ll hunt for a standard desk about four to six inches lower than the one I’ve got. Not as exciting, possibly not as ergonomic, but easier to replace.

So my question to you, my honorary hillbillies, is a request for anecdotal evidence. Have you tried either of these devices? What’s good and bad? Good return for the money or does it sit in the garage next to the Bowflex you bought in a fit of New Year’s anxiety?

Kyle the Unreturnable

Posted in Sundry, Working Remotely | 24 Comments

Audit Fields in Google AppEngine

Executive summary: Here’s how we’re implementing audit fields in AppEngine. IT’S BETTER THAN THE WAY YOU’RE DOING IT!

I considered saying “I hope there’s a better way of doing it” but I believe I’ll get more responses if I frame it in the form of a challenge.

For all entities in our datastore, we want to store:

  • dateCreated
  • dateModified
  • dateDeleted
  • createdByUser
  • modifiedByUser
  • deletedByUser

Here are the options we’ve considered

Datastore callbacks/Lifecycle callbacks

AppEngine supports datastore callbacks natively. If you use Objectify, they have lifecycle callbacks for @PrePersist and @PostLoad. The former works fantastic for dateCreated, dateModified, and dateDeleted. Objectify can handle all three easily as well provided you use soft deletes, which we do. (And they aren’t as bad as people would have you believe, especially in AppEngine. You’d be surprised how many user experience problems you discover strolling through deleted data.)

Both of these led to problems for us when we tried to use them for the createdByUser et al fields. We store the current user in the session and access it through a UserRetrievalService (which, at its core, just retrieves the current HttpSession via a Guice provider).

If we want to use this with the Objectify lifecycle callbacks, we would need to inject either our UserRetrievalService or a Provider<HttpSession> into our domain entities. This isn’t something I’m keen on doing so we didn’t pursue this too rigorously.

The datastore callbacks have an advantage in that they can be stored completely separately from the entities and the repositories. But we ran into two issues.

First, we couldn’t inject anything into them, either via constructor injection or static injection. It looks like there’s something funky about how they hook into the process that I don’t understand and my guess is that they are instantiated explicitly somewhere along the line. Regardless, it meant we couldn’t inject our UserRetrievalService or a Provider<HttpSession> into the class.

The next issue was automating the build. When I tried to compile the project with a callback in it, the javac task complained about a missing datastorecallbacks.xml file. This file gets created when you build the project in Eclipse but something about how I was doing it via ant obviously wasn’t right. This also leads me to believe there’s something going on behind the scenes.

Neither of these problems is insurmountable, I don’t think. There is obviously some way of accessing the current HttpSession because Guice is doing it. And clearly you can compile the application when there’s a callback because Eclipse does it. All the same, both issues remain unsolved by us, which is a shame because I kind of like this option.

Pass the User to Repository

This is what was suggested in the StackOverflow question I posed on the topic. We have repositories for most of our entities so instead of calling put( appointment ), we’d call put( appointment, userWhoPerformedTheAction ).

 

I don’t know that I like this solution (as indicated in my comments). To me, passing the current user into the DAO/Repository layer isn’t something the caller should have to worry about. But that’s because in my .NET/NHibernate/SQL Server experience, you can set things up so you don’t have to. Maybe it’s common practice in AppEngine because it’s still relatively new.

(Side note: This question illustrates a number of reasons why I don’t like asking questions on StackOverflow. I usually put a lot of effort into phrasing the question and people often still end up misunderstanding the goal I’m trying to achieve. Which is my fault more than theirs but still means I tend to shy away from SO as a result.)

Add a User property to each Entity

I can’t remember where I saw this suggestion. It’s kind of the opposite of the previous one. Each entity would have a User property (marked as @Transient) and when the object is loaded, this is set to the current user. Then in your repositories, it’s trivial to set the user who modified or deleted. This has the same issue I brought up with the last one in that the caller is responsible for setting the User object.

Also, when new objects are created, we’d need to set the property there as well. If you’re doing this on the client, you may have some issues there since you won’t have access to the HttpSession until you send it off to the server.

Do it yourself

This is our current implementation. In our repositories, we have a prePersist method that is called before the actual “save this to the datastore” method. Each individual repository can override this as necessary. The UserRetrievalService is injected in and we can use it to set the relevant audit fields before saving to the repository.

This works just fine for us and we’ve extended it to perform other domain-specific prePersist actions for certain entities. I’m not entirely happy with it though. Our repositories tend not to favour composition over inheritance and as such, it is easy to forget to make a call to super.prePersist somewhere along the way. Plus there’s the nagging feeling that it should be cleaner and more testable than this.
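For the curious, here is a stripped-down sketch of the idea, with one tweak aimed at the “forgot to call super.prePersist” problem. To be clear, this is not our production code. The UserRetrievalService shape, the AuditedEntity base class, and the Appointment fields are all assumptions for illustration, and the in-memory list stands in for the datastore. The tweak is that put is final and stamps the audit fields itself, so prePersist becomes an optional hook with nothing in it to forget.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Assumed shape; the real service digs the user out of the HttpSession.
interface UserRetrievalService {
    String currentUser();
}

// Assumed base class holding the audit fields.
abstract class AuditedEntity {
    Date dateCreated;
    Date dateModified;
    String createdByUser;
    String modifiedByUser;
}

abstract class BaseRepository<T extends AuditedEntity> {
    private final UserRetrievalService users;

    BaseRepository(UserRetrievalService users) {
        this.users = users;
    }

    // Final, so no subclass can skip the audit step.
    final void put(T entity) {
        stampAuditFields(entity);
        prePersist(entity);
        save(entity);
    }

    private void stampAuditFields(T entity) {
        Date now = new Date();
        String user = users.currentUser();
        if (entity.dateCreated == null) {
            entity.dateCreated = now;
            entity.createdByUser = user;
        }
        entity.dateModified = now;
        entity.modifiedByUser = user;
    }

    // Optional hook for entity-specific work; no super call required.
    protected void prePersist(T entity) {
    }

    // The actual "save this to the datastore" step.
    protected abstract void save(T entity);
}

class Appointment extends AuditedEntity {
    String title;
}

class AppointmentRepository extends BaseRepository<Appointment> {
    final List<Appointment> saved = new ArrayList<>(); // stand-in for the datastore

    AppointmentRepository(UserRetrievalService users) {
        super(users);
    }

    @Override
    protected void prePersist(Appointment appointment) {
        appointment.title = appointment.title.trim();
    }

    @Override
    protected void save(Appointment appointment) {
        saved.add(appointment);
    }
}
```

It is still inheritance and it still needs the UserRetrievalService injected into every repository, so it doesn’t make the nagging feeling go away. It just removes one way to shoot yourself in the foot.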

Related to this is the underlying problem we’re trying to solve: retrieve the user from the session. In AppEngine, the session is really just the datastore (and memcache) with a fancy HttpSession wrapper around it. So when you get the current user from the session, you’re really just getting it from the datastore anyway using a session ID that is passed back and forth from the client. So if we *really* wanted to roll our own here, we’d implement our own session management which would be more easily accessible from our repositories.

So if you’re an AppEngine user, now’s where you speak up and describe if you went with one of these options or something else. Because this is one of the few areas of our app that fall under the category of “It works but…” And I don’t think it should be.

Kyle the Pre-persistent

Posted in Google App Engine | 8 Comments