What to do if your users think you’re something that you’re not October 8, 2008

Posted by jeremyliew in product management.
3 comments

I was talking to the founders of Zintin recently about their iPhone app. Although they had initially expected their dominant use case to be “keep in touch with your friends”, it rapidly became “meet nearby people”. The Zintin team did a great job of rolling with their users and steering their product development towards the dominant use case, rather than treating it as a “user error” problem.

It reminded me of this great quote from one of the founders of IMVU, Eric Ries:

In our first year at IMVU, we thought we were building a 3D avatar chat product. It was only when we asked random people we brought in for usability tests “who do you think of as our competitors?” that we learned different. As product people, we thought of competition in terms of features. So the natural comparison, we thought, would be to other 3D avatar based products, like The Sims and World of Warcraft. But the early customers all compared it to MySpace. This was 2004, and we had never even heard of MySpace, let alone had any understanding of social networking. It required hearing customers say it over and over again for us to take a serious look, and eventually to realize that social networking was core to our business.

The moral of this story: if you disagree with your users about what your product is for, then you are wrong and your users are right. There is no such thing as “user error”.

How to implement reporting and analytics for your startup July 30, 2008

Posted by jeremyliew in A:B testing, analytics, product management, start-up, startup, startups.
2 comments

Andrew Chen has a good post on how a startup should think about implementing analytics that I think applies to companies of all sizes and is worth reading. He notes:

In general, a philosophy on the role of analytics within a startup is:

If you’re not going to do something about it, it may not be worth measuring.

(Similarly, if you want to act to improve something, you’ll want to measure it)

Don’t build metrics that aren’t going to be part of your day-to-day operations or don’t have potential to be incorporated as such. Building reports that no one looks at is just activity without accomplishment, and is a waste of time.

He goes on:

Metrics as a “product tax”
In fact, one way to view analytics is that they are a double-digit “tax” on your product development process because of a couple things:

* It takes engineers lots of time and development effort
* It produces numbers that people argue about
* It requires machines, serious infrastructure, its own software, etc
* Fundamentally, it slows down your feature development

As a rough estimate, I’ve found that it takes between 25-40% of your resources to do analytics REALLY well. So for every 3 engineers working on product features, you’d want to put 1 just on analytics. This may seem like a ton (and it is), but it throws off indispensable knowledge that you can’t get elsewhere, like:

* Validating your assumptions
* Pinpointing bottlenecks and key problems
* Creating the ability to predict/model your business to make future decisions
* It tells you which features actually are good and what features don’t matter

I recommend reading the whole thing.

One additional piece of advice that I’ve found helpful:

1. Ask the product owners to use Excel to mock up EXACTLY the reports that they would like to use, whether charts, tables, or graphs, including time periods and mock data. This is way better than PRDs when it comes to reporting.
2. Go line by line through these reports with the product owners and ask them “what decision will you make with this data”. If the answer is “none”, or if the report is for investigation or theoretical purposes rather than frequent operating decisions, cut the report out.

Happy analytics!

How many A:B tests do I have to run before it is meaningful? July 21, 2008

Posted by jeremyliew in A:B testing, product management.
2 comments

Inside Facebook has a good post about how to not screw up your A:B testing that is a useful reminder about how many tests you need to run before you know that the results are statistically significant.

The author notes:

How many [tests] do we need to declare a statistically significant difference between [a design leading to action success rate] of p1 and one of p2? This is readily calculable:

* number of samples required per cell = 2.7 * (p1*(1-p1) + p2*(1-p2))/(p1-p2)^2

(By the way, the pre-factor of 2.7 has a one-sided confidence level of 95% and power of 50% baked into it. These have to do with the risk of choosing to switch when you shouldn’t and not switching when you should. We’re not running drug trials here so these two choices are fine for our purposes. The above calculation will determine the minimum and also the maximum you need to run.)

Thus, if you did this number of tests and found that the difference in action success was greater than (p1-p2), then you would have a 95% confidence level that the design being tested is responsible for the increase in success rate, and you would move to a new best practice.

The author reminds developers to adhere to A:B testing best practices, including:

# Running the two cells concurrently
# Randomly assigning an individual user to a cell and making sure they stay in that cell during the test
# Scheduling the test to neutralize time-of-day and day-of-week effects.
# Serving users from countries that are of interest.
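One common way to get random but sticky cell assignment is to hash the user ID together with a test name. The sketch below is a minimal illustration of that idea, not something from the Inside Facebook post; `assign_cell`, the test name, and the `test_fraction` parameter are all hypothetical names chosen for the example.

```python
import hashlib


def assign_cell(user_id: str, test_name: str, test_fraction: float = 0.5) -> str:
    """Deterministically map a user to the 'test' or 'control' cell.

    The same (user_id, test_name) pair always lands in the same cell,
    so a user stays put for the duration of the test.
    """
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    # Take the first 8 hex characters as an integer in [0, 2**32)
    # and scale it to a number in [0, 1).
    bucket = int(digest[:8], 16) / 2**32
    return "test" if bucket < test_fraction else "control"


# Hypothetical usage: the cell never changes between calls for the same user.
print(assign_cell("user-12345", "signup-button-copy"))
```

Including the test name in the hash means a user's cell in one test says nothing about their cell in another, which helps when several tests run at once.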

One thing that immediately emerges from this formula is that you don’t need that many tests to determine if a new design is working. For example, testing a design that is expected to increase success from .5% to .575% only needs about 51k tests per cell. For apps and websites that are at scale, this does not take very long.
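To make the arithmetic concrete, here is a small sketch of the quoted formula applied to the .5% to .575% example. The function name `samples_per_cell` is just an illustrative choice; the 2.7 pre-factor is the one given in the quote (one-sided 95% confidence, 50% power).

```python
import math


def samples_per_cell(p1: float, p2: float, prefactor: float = 2.7) -> int:
    """Samples needed per cell to tell success rate p1 apart from p2.

    Implements: prefactor * (p1*(1-p1) + p2*(1-p2)) / (p1-p2)^2
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(prefactor * variance_sum / (p1 - p2) ** 2)


# The example from the text: success rising from 0.5% to 0.575%.
print(samples_per_cell(0.005, 0.00575))
```

With these inputs the result comes out to roughly 51,000 samples per cell. Note how quickly the requirement grows as the expected lift shrinks, since the difference is squared in the denominator.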

The danger is that, because of the overhead of putting up and taking down tests, “bad” test designs stay up for too long, exposing too many users to a worse experience than usual. While some people consider A:B testing to be splitting users into equal groups, there is no such requirement. I’d advise developers to size their test cells to be x% of their total traffic, where x% is a little more than is required to hit the minimums calculated above over a week. This neutralizes time-of-day and day-of-week effects, minimizes the overhead of test setup, and ensures that not too many users are exposed to bad designs. It also allows multiple, independent tests to be run simultaneously.

How to make your product a habit July 14, 2008

Posted by jeremyliew in product management.
8 comments

The NY Times has a great article this Sunday about how habits may be good for you. It is ostensibly about how Dr Val Curtis turned to Procter & Gamble, Colgate-Palmolive and Unilever to help create a new habit in Ghana, washing hands with soap, thereby reducing the death rate from hygiene-related diseases like diarrhea. But why turn to consumer products companies to solve a public health problem?

If you look hard enough, you’ll find that many of the products we use every day — chewing gums, skin moisturizers, disinfecting wipes, air fresheners, water purifiers, health snacks, antiperspirants, colognes, teeth whiteners, fabric softeners, vitamins — are results of manufactured habits. A century ago, few people regularly brushed their teeth multiple times a day. Today, because of canny advertising and public health campaigns, many Americans habitually give their pearly whites a cavity-preventing scrub twice a day, often with Colgate, Crest or one of the other brands advertising that no morning is complete without a minty-fresh mouth….

“For most of our history, we’ve sold newer and better products for habits that already existed,” said Dr. Berning, the P.& G. psychologist. “But about a decade ago, we realized we needed to create new products. So we began thinking about how to create habits for products that had never existed before.”

Academics were also beginning to focus on habit formation. Researchers like Wendy Wood at Duke University and Brian Wansink at Cornell were examining how often smokers quit while vacationing and how much people eat when their plates are deceptively large or small.

Those and other studies revealed that as much as 45 percent of what we do every day is habitual — that is, performed almost without thinking in the same location or at the same time each day, usually because of subtle cues.

For example, the urge to check e-mail or to grab a cookie is likely a habit with a specific prompt. Researchers found that most cues fall into four broad categories: a specific location or time of day, a certain series of actions, particular moods, or the company of specific people. The e-mail urge, for instance, probably occurs after you’ve finished reading a document or completed a certain kind of task. The cookie grab probably occurs when you’re walking out of the cafeteria, or feeling sluggish or blue.

Entrepreneurs should ask themselves the same question: how can they make their product a habit? In some cases, it helps to build on habits that already exist, as the P&G psychologist mentions above. One example is Stardoll, an online version of playing with dolls, already a habit for many girls. Club Penguin relies on a metaphor of feeding and playing with your pet, again already a habit for many of its young players. Digg relies on the habit of sharing interesting links. Relying on existing behavioral cues is always a good place to start.

For many startups, there are no existing behavioral cues, and they will have to find or create a cue that can habitualize their users. This sounds hard because it is hard. Changing consumer behavior is a very tough challenge. Your best bet will be to try to latch on to some existing habit and associate your product with that habit. This is exactly what happened with the case of washing hands in Ghana:

However, the studies also revealed an interesting paradox: Ghanaians used soap when they felt that their hands were dirty — after cooking with grease, for example, or after traveling into the city. This hand-washing habit, studies showed, was prompted by feelings of disgust. And surveys also showed that parents felt deep concerns about exposing their children to anything disgusting.

So the trick, Dr. Curtis and her colleagues realized, was to create a habit wherein people felt a sense of disgust that was cued by the toilet. That queasiness, in turn, could become a cue for soap.

A sense of bathroom disgust may seem natural, but in many places toilets are a symbol of cleanliness because they replaced pit latrines. So Dr. Curtis’s group had to create commercials that taught viewers to feel a habitual sense of unseemliness surrounding toilet use.

Their solution was ads showing mothers and children walking out of bathrooms with a glowing purple pigment on their hands that contaminated everything they touched.

The commercials, which began running in 2003, didn’t really sell soap use. Rather, they sold disgust. Soap was almost an afterthought — in one 55-second television commercial, actual soapy hand washing was shown only for 4 seconds. But the message was clear: The toilet cues worries of contamination, and that disgust, in turn, cues soap.

Implications of “Convenience Beats Quality” June 2, 2008

Posted by jeremyliew in Consumer internet, distribution, product management.
3 comments

Fred Wilson says that convenience beats quality. In his post he is talking about video and photography. The amazing story of the limited-feature Flip camera, which captured 13% share of the video camera market in its first year on sale, bears testimony to this truism.

I think this maxim, that convenience beats quality, is true not just for video and photography, but also for most consumer internet services. It is one of the reasons that many of the apps that have been most successful on Facebook have been lightweight “just for fun” apps.

Some corollaries of this are:

1. The best product is neither necessary nor sufficient
2. Distribution can be more important than functionality
3. Lightweight interactions beat more involved interactions
4. Defaults matter as many people won’t change them
5. Use implicit information whenever you can to avoid asking users for data.

Do readers agree that convenience beats quality? If so, what are other corollaries?

Genius is 1% inspiration and 99% perspiration February 25, 2008

Posted by jeremyliew in game design, game mechanics, product management.
3 comments

Thomas Edison is credited with the saying that “Genius is 1% inspiration and 99% perspiration”.

Many of the most successful web 2.0 companies understand this intuitively and it is reflected in their product management. Although they all have a general vision for their product, it does not spring fully formed from their minds. Rather, they build A:B test harnesses to explicitly test their hypotheses on live users. They don’t ask their users what they want, but rather they watch what they do. They try multiple versions of everything (title text, call to action copy, buttons versus links, number of screens in signup etc) and they let the data decide the direction of the product. They’re not driven by philosophy, but by the scientific method. Examples of companies that take this approach include many of the standout viral growth companies of the current generation, including RockYou, Slide, Plaxo, LinkedIn, Facebook, Tagged, Flixster and many more. (Disclaimer: Lightspeed is an investor in both RockYou and Flixster)

For game design, the equivalent would be the trend towards metrics driven development. Raph Koster wrote up the Master Metrics presentation given at GDC by Dan Arey and Chris Swain from USC. They talk in part about Microsoft’s approach to metrics driven design:

MS User research group… using heatmaps. When a project goes thru MS, 3 people from the user research group assess the gameplay experience. They are a real thought leader in this area.

1. usability testing – can user operate software
2. playability, does user have a good play experience
3. instrumentation, how exactly is the user playing, using tracking software

This is the first year that they are talking about this stuff publicly, the Wired article (Ed Note: Halo 3: How Microsoft Labs invented a new science of play), etc. Here’s a picture showing black dots on the Halo map. So dense on deaths that there is no info. So let’s tie it to color intensity. Then patterns emerge, you can see a pattern of where people tend to die.

In single player:

– tracking time on task, red zone indicates usability problem
– comparing if designer intent matches what players do… designer maybe wants intense “speed through gauntlet” feel, but heatmap shows players moving slowly…

In multiplayer:

– tracking deaths by weapons lets designers read exactly how players use items, more useful than written reports or lists of data. Designers collectively tend to be visual thinkers.
– Designer tuned placement of items and terrain to achieve most satisfying play experience.

User researchers independent from developers. Researchers help quantify into something measurable. Designers say “We want feeling of chaos” — researchers help pin that down.

Researchers are passionate about good game experience, but dispassionate about design specifics. Developers tend to fall in love with their designs.
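The heatmap idea in the notes above, tying the density of deaths to color intensity instead of plotting one black dot per death, can be sketched in a few lines. This is only an illustration of the general technique, not Microsoft's tooling; `death_heatmap`, the coordinate format, and `cell_size` are assumptions made for the example.

```python
from collections import Counter


def death_heatmap(deaths, cell_size=10.0):
    """Bin (x, y) death positions into square grid cells and return a
    color intensity in (0, 1] per cell, relative to the deadliest cell."""
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in deaths
    )
    if not counts:
        return {}
    peak = max(counts.values())
    return {cell: n / peak for cell, n in counts.items()}


# Hypothetical death positions on a map: two near the origin, one further east.
print(death_heatmap([(1, 1), (2, 3), (15, 4)]))
```

Cells with intensity near 1.0 are where players die most often, which is the pattern the black-dot plot hides once the dots overlap.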

Danc had a nice summary of metrics driven game design a couple of years ago that is worth re-reading.

As we see more games move to the web, allowing for much better real time data, true A:B testing against live users (not just beta testers), and shortening development cycle times, I would expect to see even more of this metrics driven approach to game design emerge.

Lightweight self expression for the general public November 21, 2007

Posted by jeremyliew in blogging, communication, Consumer internet, product management, self espression.
2 comments

MIT Technology Review has two good articles about microblogging in the November/December issue. (Both are behind a free registration wall.) The puff piece on Evan Williams and Twitter notes some of his thoughts on microblogging:

The criticism doesn’t seem to bother Williams, in part because he’s heard it before. “Actually, listening to people talk about Twitter over the last few months, you hear that almost all the arguments against it are the exact same arguments that people had against Blogger,” he says. “‘Why would anyone want to do this?’ ‘It’s pointless.’ ‘It’s trivial.’ ‘It’s self-aggrandizing bullshit.’ ‘It’s not technically interesting.’ ‘There’s nothing to it.’ ‘How is this different from X, Y, and Z that’s existed for the past 10 years?'” Indeed, there were blogging tools available when Blogger was released, and others have emerged since–including TypePad from Six Apart, which offers more features. But none has the simple appeal of Blogger, and none is as easy to use. These were the reasons Blogger was such an important force in the blogging revolution.

There is an interesting idea at the heart of all this, and that is the idea of innovation through removing features. By focusing on a subset of core functionality, both Blogger and Twitter (and the other microblogging startups, as well as Facebook’s status) have made the user interaction much lighter weight. In my experience at AOL, Netscape and IAC, lightweight interactions generally work better with the general public.

Last year Gartner predicted that blogging would peak in 2007:

The analysts said that during the middle of next year the number of blogs will level out at about 100 million. The firm has said that 200 million people have already stopped writing their blogs… Gartner analyst Daryl Plummer said the reason for the levelling off in blogging was due to the fact that most people who would ever start a web blog had already done so. He said those who loved blogging were committed to keeping it up, while others had become bored and moved on.
“A lot of people have been in and out of this thing,” Mr Plummer said. “Everyone thinks they have something to say, until they’re put on stage and asked to say it.”

Microblogging removes some of the pressure to write substantive posts, making it a lighter interaction that is easier to keep up.

The public’s preference for lightweight self expression is part of what has made widget providers (such as RockYou, a Lightspeed company), profile layout sites (such as Free Code Source) and quiz sites (such as Quizilla) so successful.

Social Design Best Practices November 5, 2007

Posted by jeremyliew in business models, facebook, game mechanics, google, myspace, open social, product management, social media, social networks, viral, viral marketing, web 2.0, web design.

Bokardo notes a set of social design best practices as recommended by the Google OpenSocial team:

1. Engage Quickly – (my interpretation: provide value within 30 seconds)
2. Mimic Look and Feel – (make your widget look like the page it is in)
3. Enable Self Expression – (let people personalize their widgets)
4. Make it Dynamic – (keep showing new stuff)
5. Expose Friend Activity – (show what friends are doing)
6. Browse the Graph – (let people explore their friends and friends of friends)
7. Drive Communication – (provide commenting features)
8. Build Communities – (expose different axes of similarity)
9. Solve Real World Tasks – (leverage people’s social connections to solve real problems)

It is worth reading the full text from OpenSocial.

In late stage consumer markets, brand matters more than product September 14, 2007

Posted by jeremyliew in branding, business models, Consumer internet, distribution, product management, start-up, startups.

The WSJ today has an article about how hard it is for US auto makers to get “import intenders” to add domestic cars to their consideration set:

Just about every month, CNW Market Research meets with a group of would-be car buyers and plays a trick on them.

Sometimes the company, which specializes in auto sales trends, takes a Toyota Camry, removes any identifying logos, and tells them it’s a new model from one of the U.S.-based auto makers. Or it takes a domestic car and tells them it’s a Toyota or another import make.

Either way, the result is the same. “If they think it’s an American car, the perception of the vehicle falls dramatically,” said Art Spinella, vice president of the Bandon, Ore.-based firm. “Detroit really gets a bum rap in the U.S.”

When I was at AOL we did a similar experiment for search. We took search results from multiple search engines, stripped branding and UI, and asked users what they thought. The marks were pretty even across the board, but when branding was put back, Google was thought to have the best results every time. PC World found similar results in April.

As I’ve mentioned in the past, there are three phases of adoption for a new consumer technology. In the first phase distribution is paramount, in the second product is paramount, and in the third branding is paramount. Competing on the wrong dimension at the wrong time may not move the needle, as Detroit is discovering.