Jakob Nielsen's Alertbox, March 6, 2006:
Summary:
6% of task attempts are extremely slow and constitute outliers in measured user performance. These sad incidents are caused by bad luck that designers can -- and should -- eradicate.
Generally, I advocate qualitative user testing: a handful of users is enough to discover most design flaws. Quantitative testing does have its place, however, and we've recently been running large tests for two different reasons:
One way of assessing a dataset's distribution is to draw a quantile-quantile scatterplot. In a QQ plot, we plot each observation's empirical value on the x-axis and its hypothetical value on the y-axis, under the assumption that the entire set is normally distributed. We draw a straight line to represent a case with identical empirical and hypothetical values.
If our plotted datapoints are very close to the straight line, we conclude that the empirical values are very close to the hypothetical values. In other words, the observed data are the same as what the theory predicted, so the dataset follows a normal distribution.
Any datapoints that are far from the straight line represent cases in which the real and theoretical worlds differ substantially -- in other words, the data doesn't follow the normal distribution.
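For readers who want to see the construction concretely, here is a minimal sketch of how the points of such a QQ plot could be computed. It is not the analysis code behind the studies discussed here: the task times are made up, the function name `qq_points` is hypothetical, and the plotting position (i + 0.5) / n is just one common convention for choosing the theoretical quantiles.

```python
from statistics import NormalDist, mean, stdev


def qq_points(task_times):
    """Return (empirical, theoretical) pairs for a normal QQ plot.

    The theoretical value is the quantile of a normal distribution
    fitted to the sample's own mean and standard deviation.
    """
    xs = sorted(task_times)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    # Assumption: plotting position (i + 0.5) / n; other conventions exist.
    return [(x, fitted.inv_cdf((i + 0.5) / n)) for i, x in enumerate(xs)]


# Hypothetical task times in seconds, with one very slow attempt.
times = [41, 45, 48, 52, 55, 58, 61, 66, 70, 190]

for empirical, theoretical in qq_points(times):
    # Points on the straight line have a difference near zero.
    print(f"{empirical:7.1f} {theoretical:7.1f} {empirical - theoretical:+8.1f}")
```

Plotting the first column on the x-axis against the second on the y-axis gives the scatterplot described above; the slow attempt shows up as the point farthest from the straight line.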
I've plotted seventy QQ plots from our recent quantitative usability studies, and they all look the same, whether they come from website or intranet studies. Here are two typical examples:
QQ plots of two user studies: a test of a content-based magazine site (New York Magazine, on the left) and a test of a transaction-based e-commerce site (Kiehl's, on the right).
Each dot represents the task time of one user. The x-axis indicates measured performance and the y-axis indicates the theoretically matching normal distribution.
Although the dots aren't exactly on the straight line, they're pretty close. There are a few outliers, but it seems safe to conclude that most task times do follow a normal distribution. Close enough for government work -- or more to the point, close enough for any analysis you need in a practical development project.
In usability testing, there's a clear floor effect for measured task times: people simply can't be faster than a certain minimum, no matter how efficiently they use a site. Downloading pages and moving your hand between mouse and keyboard require a certain amount of time. Even the fastest typists still need time to type in search engine queries; the fastest readers still need time to read, regardless of how quickly they can find the salient information on a page.
All the studies I've analyzed included a few fast outliers. These fast (but not quite fast enough) users are easy to explain, however, and I don't think they should impact our thinking about Web usability.
Of 1,520 cases, eighty-seven were outliers with exceedingly slow task times. This means that about 6% of task attempts (87 of 1,520, or 5.7%) are slow outliers. This is too many to ignore. Of course, you should first and foremost improve the user experience for the 94% of attempts that are not outliers, but it's worth allocating some usability resources to that slow 6% as well.
The most seemingly obvious explanation for these outliers is simply that a few people are almost incompetent at using the Web, and they'll show up as slow outliers every time they test something. But this hypothesis is false. Once we recruit people for a study, we ask them to do multiple things, so we know how the slow outliers perform on several other tasks. In general, the same users who were extremely slow on some tasks were fast on other tasks.
Sixty different users were responsible for the eighty-seven slow outliers, for an average of 1.5 outliers each. Given that users were tested on an average of 6.7 tasks across the analyzed studies, each of these users had an average of 5.2 "normal" tasks -- 3.5 times as many as their outlying tasks.
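The arithmetic behind these figures is easy to check. The sketch below uses only the numbers quoted in this article; the variable names are mine. Note that the 5.2 and 3.5 figures follow from first rounding 1.45 outliers per user up to 1.5.

```python
# Figures quoted in the article.
slow_outliers = 87        # slow outlying task attempts
total_cases = 1520        # all measured task attempts
outlier_users = 60        # distinct users behind the slow outliers
avg_tasks_per_user = 6.7  # average tasks per user across the studies

slow_share = slow_outliers / total_cases           # fraction of attempts
outliers_per_user = slow_outliers / outlier_users  # unrounded average

# The article rounds the per-user average to 1.5 before continuing.
rounded_outliers_per_user = 1.5
normal_tasks = avg_tasks_per_user - rounded_outliers_per_user
normal_to_outlier_ratio = normal_tasks / rounded_outliers_per_user

print(f"slow share of attempts:  {slow_share:.1%}")
print(f"outliers per slow user:  {outliers_per_user:.2f}")
print(f"normal tasks per user:   {normal_tasks:.1f}")
print(f"normal-to-outlier ratio: {normal_to_outlier_ratio:.1f}")
```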
This topic clearly needs more research, and would make for several good graduate theses. For now, my best conclusion is that slow outliers are caused by bad luck rather than by a persistent property of the users in question.
Here's an example of good luck from a test with disabled users trying to use the website of the IRS (the U.S. tax authorities). One blind user wanted to find out whether she could deduct money donated to a high school band.
Because the IRS page was long and overwhelming, the user decided to have her screen reader read out the list of links on the page. Further, because the user was looking for tax rules about "donations," she commanded the screen reader to read links that started with a "D." As it turns out, the IRS uses the term "deduction" rather than "donation" -- something the user would never discover from a simple page or site search using the word "donation." However, because both words start with "D" and the person was using a screen reader, she easily happened upon "deduction" as the correct link. A joyful outcome, but one that's purely due to good luck.
(There are a few additional usability notes here. First, by using the term "deduction" rather than "donation," the site opts for system-oriented language over a term describing the user's action, which the site is presumably supposed to support. Second, using the screen reader shortcut is an expert behavior; you shouldn't use it as an excuse for long pages, which hurt less-experienced screen reader users. Finally, the "read links" feature is one of the reasons it's a guideline to avoid links with labels such as "click here" or "more," which don't make sense out of context.)
Given that slow outliers account for 6% of task attempts, it's unacceptable to simply write them off. Although the data shows that most users will avoid bad luck in their next online task, you can't just say "better luck next time"; if you do, their next user experience will likely be on somebody else's website.
People leave websites that hurt them -- they don't know that it's just bad luck, and that next time will be better. It's therefore incumbent on you to hunt down the root causes of bad luck and eradicate them from your site.
Copyright © 2006 by Jakob Nielsen. ISSN 1548-5552.