Usability Tests at Deveo: Rocket Surgery Experience Report


As part of our ongoing work on making the Deveo signup and onboarding experience as streamlined as possible, we ran some usability tests last week. Here’s a summary of what we did and how it went.

Since we’re a bootstrapping startup, we don’t really have huge budgets or resources set aside for something like this. That means we need to keep it light. Fortunately, there are some excellent resources on doing usability testing in a lightweight manner. Our go-to guide was Steve Krug’s practical and concise Rocket Surgery Made Easy. If you want to learn how to do usability testing on a bootstrap budget, I urge you to read that book.

First Things First: Was It Worth It?

Everyone always says you should do usability testing. It’s kind of a no-brainer, and yet I’ve surprisingly seldom seen it done. When I have seen it done, it’s mostly been in large projects with actual usability professionals around to run it. But can you do it without access to such professionals?

In our case, it turned out that we absolutely can do it, and that it absolutely is worth it. While I don’t know what a wider, deeper study might have uncovered, we made some really important findings in just a few hours with three test participants. What we found were some obvious problems in the user experience that we had been completely blind to. It’s really quite amazing how much you don’t see in a product you’re working on day in, day out.

So, if you’re wondering whether you should do something like this, all I can say is: do it, and do it now!

Finding Test Participants

We wanted to find three people to run tests with. Since our product is a B2B one, built specifically for software development professionals, we couldn’t use just anyone; we had to find people with some experience in doing or managing software development.

We began by announcing the need on this blog and via our social media accounts. While some people did contact us based on that, it didn’t quite give us what we needed: too few of the people who got in touch fit the profile we were looking for.

What we finally did was ask our first-order contacts: people we knew who were not yet intimately familiar with Deveo, and who we thought could give us good feedback. That strategy worked out astonishingly well - we pretty much secured the first people we contacted. (Thanks, guys!)

We scheduled all the tests for a single Wednesday morning, at 9 AM, 10 AM, and 11 AM. Luckily there were no problems organizing the schedule, and the times were suitable for everyone. It probably helped that we’re pretty centrally located in Helsinki.

As compensation for the participants, we offered books (specifically, copies of Rands’ excellent Being Geek). For an hour of a software professional’s time, that’s obviously of value mostly as a token rather than as any kind of realistic compensation. I have a feeling that if we offer slightly larger compensation next time around, it will be easier to attract people who don’t know us yet, and who won’t be inclined to do us a favor.

It also turned out we were a little late ordering the books, and not all of them arrived from Amazon on time, so we had to send some out afterwards. It was slightly annoying to have to admit that during the tests, but our participants were very understanding about it.

Test Cases

For us, it was pretty clear what we needed to test: we were working on improving signups and onboarding, so the tests had to be mostly about that.

We came up with a few specific tasks someone new to Deveo might want to accomplish. We then wrote those up as short hypothetical scenarios to hand out to our test participants. This is what we had:

Find out what Deveo costs for your organization

You work for ACME & Stuff Oy, a Helsinki-based software company. You currently employ 43 people in your software development organization, and estimate growing to over 50 people within a year.

You are looking for a version control product that you will install on your own server. You have been using Subversion for several years, but you’re also considering using Git in future projects. Find out if Deveo is suitable for you, and what it would cost for your organization.

Try Deveo

You want to try out Deveo to see how it works. You haven’t been authorized to spend any money on software trials.

Find out whether it’s possible to sign up for a free trial, and sign up if it is.

Create a trial project

You want to test setting up Deveo for one of the projects in your company: the OMG2000 Hour Reporting System. The project team wants to use Git for version control.

Set everything up so that the project manager Jaakko Parantainen (jaakko.parantainen@acmeoy.fi), and the two developers Timo Silakka (timo.silakka@acmeoy.fi) and Jarmo Mäkiaho (jarmo.makiaho@acmeoy.fi) can get started.

Download the on-premises package

You want to install Deveo on your own Red Hat Enterprise Linux server now. Your IT support person has asked for the packages and materials so she can install Deveo. Find and download everything she will need.

During the tests, we ran these scenarios in order, to simulate what a real Deveo trial experience might look like. I was quite happy with how they worked out, but since I don’t have much data to compare against, I can’t really say whether something could have made them even more useful.

Facilities And Equipment

We ran the tests on our staging server, both because that’s where the latest version of Deveo was running at the time, and so that the test participants wouldn’t need to worry about setting up real accounts or getting billed by the system later.

What we learned is that if you want to run these tests in a staging environment, you really need one that fully matches your production environment from a user experience point of view. I’d say ours is currently about 95% there, which meant we hit a couple of issues caused purely by running in staging (some broken links, basically).

As for the facilities, equipment, and software, we had the following setup:

  • Two meeting rooms booked at our premises for the whole morning: one for running the tests and one for observing them.
  • A MacBook with an external monitor, an external keyboard, and an external mouse for the test participants to use. (Not all of them were Mac users, but that caused no major problems.)
  • A conference phone (mic + speaker) for recording and transmitting our voices from the test room.
  • Google Chrome with a clean user profile for each test.
  • Skype for live transmission of the sound and the screen to the observation room.
  • Silverback for recording the screen and the sound for later analysis. We took advantage of its 30-day free trial for these tests, but we’ll definitely get a license for our next test runs.
  • Google Drive used by the observers for collaborative recording of test observations.

I have no real complaints about any of the equipment we used, apart from some very minor issues: the microphone picked up a bit of noise from the hallway outside the test room, and the Mac keyboard and the Apple Mighty Mouse scrolling surface were unfamiliar to some of our test subjects.

Setting Up

On the morning of the tests, we arrived at the office about 30 minutes before the first test participant was scheduled to arrive. That turned out not to be early enough: when the first test was about to start, we were still scrambling to get all the equipment, VPN connections, and software set up properly.

Next time we’ll definitely arrive at least an hour early, so that it’s easier to concentrate when the tests begin.

Running The Tests

Actually running the tests was my favourite part of the whole endeavour. There’s something very motivating about seeing people use your product, and especially about seeing them struggle with something you’re only just realizing is a problem. When you see it, you know that fixing it will result in major, measurable progress in your product.

The test sessions took about 20-30 minutes each. I was worried they might take a full hour each, cramping the schedule, but fortunately they didn’t, and there was time between sessions to chat with the participants, debrief with the team, and generally catch our breath (the sessions, while friendly and fun, can also be quite intense).

For each participant, we followed a similar script:

  • We told them why they were there, what was about to happen, and that we were testing the product, not them.
  • We made sure they knew their voices and the screen would be recorded, and that they were OK with that. Silverback could actually have recorded video of the test participants’ faces as well, but we opted out of that feature, since we didn’t think we’d really need that data, and because we thought being filmed might have made the participants self-conscious.
  • We started the actual test by having them land on the Deveo home page, and spending a few minutes discussing what they were seeing, what they thought it was about, and what they could do there.
  • We then ran each of the four test scenarios in order, first reading each one out loud and handing it to the participant in written form, and then observing them work through it.
  • After the final test was finished, we checked if the participant or the observers had any questions, and then ended the recording.

Observing The Tests

While I was running the tests with the test participant, the rest of the Deveo team were in an adjacent meeting room, observing what was going on and making notes. The voice and the screen from the test room were streamed live over Skype to the observation room, so the team was able to see and hear everything. The Skype call was muted on the observation side, so the observers could discuss freely without worrying about disturbing the test.

For recording observations, we used a Rainbow Spreadsheet, a tool I originally picked up from Tomer Sharon’s UX Research class at General Assembly, and which he has also written a great article about for Smashing Magazine. We all had the same spreadsheet open on Google Drive, and all of the observers made their notes there, recording things like what the participants noticed or didn’t notice, what they understood or didn’t understand, and what they did or didn’t do.
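
In a Rainbow Spreadsheet, each row is an observation and each participant gets a column (and a color of their own, which is where the name comes from); observers mark a participant’s cell whenever that behavior shows up. A rough sketch of the layout, with made-up observations and plain marks standing in for the colors:

    Observation                              P1  P2  P3
    Did not notice the pricing link          x   x   x
    Hesitated on the signup form                 x
    Asked where to find the trial signup     x       x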

I was also planning to participate in making observations while the test participants were working, but it turned out I really didn’t have time to concentrate on that. Luckily I didn’t have to, since the team picked up everything we needed. (Steve Krug’s book puts some emphasis on techniques for keeping observers engaged. For us there was no need for that - it may have to do with the fact that we’re a startup where everyone is extra engaged, all the time.)

Identifying Tasks

After all the tests were run, we spent some time debriefing together. We first identified a few issues that were both hugely impactful and easy to fix. Those we fixed and deployed the same afternoon, for peace of mind.

For the rest of the observations, we followed some simple steps:

  1. We did some cleanup to remove duplicates.
  2. We ordered the issues by frequency, putting ones that affected all participants at the top, and ones that affected only one participant at the bottom.
  3. We categorized the issues (with color coding), based on whether they were problems, positive remarks, or future development ideas identified either by the test subjects or the observers.
  4. We started thinking about solutions. For most of the problems found, it was actually quite easy to see what we should do about them. Those went straight onto our Trello board, and we’ve already fixed some of them. One of the problems was related to a potentially deeper issue with our domain model, and we want to spend some time thinking about it before rushing into a major feature redesign.
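
We did this triage by hand in the spreadsheet, but the same three steps are easy to script if you ever collect more observations than a spreadsheet comfortably handles. Here’s a minimal sketch in Python; the data shape, sample observations, and category labels are hypothetical, not our actual spreadsheet format:

    # Sketch of the triage steps above: merge duplicates, order by how
    # many participants were affected, and label each issue's category.
    # The data shape and sample observations are made up for illustration.
    from collections import defaultdict

    # Each raw observation: (description, participant, category).
    raw_observations = [
        ("Did not notice the pricing link", "P1", "problem"),
        ("Did not notice the pricing link", "P2", "problem"),
        ("Did not notice the pricing link", "P3", "problem"),
        ("Liked the signup form", "P2", "positive"),
        ("Wanted a Subversion import option", "P1", "idea"),
    ]
    total = len({p for _, p, _ in raw_observations})

    # Step 1: cleanup - merge duplicates, collecting affected participants.
    participants = defaultdict(set)
    category = {}
    for description, participant, cat in raw_observations:
        participants[description].add(participant)
        category[description] = cat

    # Step 2: order by frequency, most widely shared issues first.
    ordered = sorted(participants, key=lambda d: len(participants[d]), reverse=True)

    # Step 3: categorize - a text label here instead of color coding.
    for description in ordered:
        n = len(participants[description])
        print(f"[{category[description]}] {n}/{total} participants: {description}")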

All in all, the tests gave us some very concrete, real-world feedback on what we had gotten right and where we had missed the mark. We will definitely be doing this again soon!

If you’d like to know anything more about our experiences, post a comment below!
