How to estimate Lifetime Value; Sample cohort analysis July 19, 2010
Posted by jeremyliew in Ecommerce, ltv, subscription.trackback
In many businesses, repeat purchase behavior is a key driver of value. Many companies track % of repeat purchases as a key business metric. This is useful in steady state, but can sometimes be quite misleading if the company is showing substantial growth. By definition, growth implies many first time customers, and the mix of these new customers can distort the view into how much repeat purchase behavior is actually occuring.
I prefer to try to analyze repeat pruchase behavior, and hence, estimate lifetime value, by doing cohort analysis. This is approximate by definition, but it can give you some sense of lifetime value well before you actually see a full customer lifetime, which can help in accelerating decisions about marketing and customer acquisition. I recently posted about how you can improve LTV and CAC for your subscription or repeat purchase business. But how do you estimate Lifetime value?
I’ve uploaded a spreadsheet with a sample cohort analysis, using representative but dummy data to illustrate how to do this.
In this particular example, I look at a hypothetical subscription business. Assume that the business has been in operation for one year. First, divide the users into cohorts depending on when they initially subscribed to the service. I calculate retention at the end of month N by dividing the number of subscribers still subscribing after month N by the total number of subscribers that started in each cohort. These are the numbers in blue. Obviously, for the subscribers that started in month 1, we have 12 months of retention data, for the subscribers that started in month 2 we have 11 months of retention data, and so on.
By averaging across the cohorts, you can get an average retention rate at the end of one month, two months and so on. As the cohorts age, there are fewer datapoints to average over, and hence the potential for error is greater. However, it is still a useful exercise to get an early indication of how the business looks.
A typical pattern found in subscription businesses is that after a steep drop off after an initial period, month-on-month attrition rates tend to level off. You can see a similar pattern in this example, where after the first month, month-on-month attrition rates are around -6% (ie month N subs ~ 94% of month [N-1] subs).
If you see a pattern like this, you can extrapolate forward using the same month-on-month attrition across several years. As you can see in the model, we extrapolate an average lifetime of 9.77 months by extrapolating forward over 5 years of data.
So if you were a subscription business charging $20/month with 90% gross margins (after accounting for customer service costs for example), then you would attribute a lifetime value for a new customer of 9.77 x $20 x 90% = $176. This sets an upper bound of what you would be willing to pay to acquire a customer (although in practice, you would prefer to see a ratio of CAC/LTV in the 25-35% range).
This example is for a subscription business where the key value driver is the number of active subscribers. However, you can conduct similar analysis on any type of repeat behavior business. In a social business the metric might be activity (e.g. how many users posted a photo this month), and in a social game the metric might be dollars spent in virtual goods that period. The measurement periods may vary according to the tempo of the business. Many social games do their cohort analysis on a daily or weekly basis, whereas some ecommerce companies whose purchases are less frequent may do their cohort analysis on a quarterly basis. This will dictate how long you have to collect data before you have enough data to project forward.
Different billing mechanisms can complicate this (e.g. an annual billing system will by nature skew average lifetime upwards) and while these can be important levers, it is usually helpful to hold billing constant and compare cohorts on a same-billing basis, at least initially. However, this cohort analysis is also useful tool to see what the impact of changes in billing, registration flow, product features etc can have on retention as you can often see an increase in early month retention from later cohorts.
The spreadsheet for the sample cohort analysis is read only but you can download it to play with it yourself.
I’d love to hear from others how they estimate lifetime value.
UPDATE: June 2012 – I have a new post describing how to estimate lifetime value for an ecommerce business using cohort analysis.
Jeremy, thanks for this writeup. I have (hopefully) quick technical question. I’ve seen a handful of articles about cohort analysis, but haven’t seen any technical writeups for how to efficiently gather / maintain the data for them. Do you have advice on this?
There are two ways I can think of:
One is to have a “created_at” and “deleted_at” date/time for each subscription, and to create the analysis dynamically each time you load it up: Find all users with a “created_at” in Month 1, then create subsets for each subsequent month, removing the subscriptions as you cross their “deleted_at” months.
The other is to create a brand new database table (data warehouse) and, each day (or week, or month), to run a cron job or other automated process to sweep the database, collect data, and pass it into the database, and to then call this (static) data up when you’re generating your report.
Is there another way that I’m missing? Do you have any suggestions for how to actually create these reports? I’d love any insight you have on this.
Thanks.
Charlie,
If you’re doing with a business intelligence solution (you mentioned data warehouse), you should be able to run a report on the subscription data to determine for each customer the first payment and last payment. I assume the first payment would be close to the created_at and the last payment would be deleted_at. This should provide the data to generate the monthly chrun.
You can do the same in SQL. It’ll be more manual with the sorting and grouping, but that can be done in excel.
I can help out if you’re interested.
How do you get to the average of 9.89 months per sub? I don’t see this on the spreadsheet. Thanks
@ Dan – thanks for the catch – i’ve changed it to 9.77 as per the spreadsheet. I anonymized some real data with some random number generators to create some noise and it runs each time i do a re-calc, so had an old # in when I wrote the post
@Larry, thanks for the offer for help. I’ll dig around for a bit and see what I can piece together, but I’ll probably be in touch down the road to show you what I’ve figured out and see if you have suggestions on it. Thanks again.
[…] LSVP posted an interesting article on this as […]
Jeremy,
Thank you for sharing this. This is very helpful.
I run an online tutoring business and think our CLV computation is pretty easy. Most of our customers sick around for anywhere from 1 week to 3 months. And we have been around for 2+ years now. So if we just average the revenue per customer (except current customers) and multiple this by our margins we should be able to get the CLV. Is this correct?
[…] Estimating Lifetime Value: by Jeremy Liew. “You actually see a full customer lifetime, which can help in accelerating decisions about marketing and customer acquisition.” […]
In the healthcare business, the LTV is more complicated. We have to segment the spend based on the DRG code because revenue per customer is dependent on the DRG code. This varies significantly.
Hey, I was wondering if you know what the average expected LTV, CAC and ARPU of a social/casual game on facebook is or if you know where I can find such information?
Ryan Carson took the time to walk a Seedcamp session through his own model for estimating (among other things) LTV for subscription model. http://video.seedcamp.com/video/867124/seedcamp-week-2010-metrics-for
[…] what basis then will winners pull away from the rest? Companies who are able to derive the highest lifetime value (LTV) from their users will squeeze out their competitors with a lower lifetime value. How can you […]
[…] what basis then will winners pull away from the rest? Companies who are able to derive the highest lifetime value (LTV) from their users will squeeze out their competitors with a lower lifetime value. How can you […]
[…] Gross Margin. How can we reduce our cost base to increase our gross margin without having to raise our prices? Can we introduce new, higher margin products into our range? These are all things we need to work on to increase the lifetime value of our customers. For a great post describing how to measure the lifetime value of your customers have a read of this blog post from the Venture Capital from Lightspeed Venture Partners. […]
I was working on my cohort analysis when I stumbled across this. It’s a great help. Shouldn’t Column O be “Percent remaining from last month” though?
@ Chris- yes, you’re right it should. thanks for pointing out the error
If you are are interested in cohort stats for your site, you can try http://www.skyglue.com
They offer cohort web analytics.
Jeremy, great post. Thanks for sharing the sample cohort analysis.
Can you share the rationale behind the methodology you use to calculate expected average customer lifetimes at the bottom of the spreadsheet?
Thanks,
Andrei
Basically, you average out the period on period decay after the first period, and project it out into the future, then sum over the lifetime
Hello, thanks for the example. However I have not understood how you got the 9.77 months value. (I understood that you added all averages (which are percentages)). Also, how is the expected cumulative count as a ‘lifetime’ value. Thanks!
Hi,
Thanks for the post. Great analysis and details. Would you know any working model for online retailing / group discount sites?
And while calculating this, what if our cost for acquiring customer sums up more than LTV? Till what period we can afford to have greater CAC than LTV?
Please suggest
The idea of LTV is that it is over a lifetime, so if CAC>LTV then you’ve got a tough business model.
Jeremy,
How would you tweak this analysis if your business operated using a sales model that required a minimum yearly subscription?
Basically you should see close to 100% retention for year one, followed by likely high churn in month 13, then start to normalize (vs normalizing in month 2)
Great post!
One math question: Why does adding the average retention RATES results on an estimate of a users subscription life in MONTHS? Shouldn’t that be the average cummulative retention rate for yr1, yr2, etc?
Lastly, how would your LTV formula change if you were running a cohort analysis for Groupon?
Thanks!
To calculate the average subsciption life you’d want to calculate:
% of people who only lasted one month x1
+ % of people who lasted exactly two months x 2
+ % of people who lasted exactly three months x3
+…
+ % of people who lasted exactly 60 months x 60
= (month one retention – month 2 retention) x1
+ (month 2 retention – month 3 retention) x2
+ (month 3 retention – month 4 retention) x3
+…
+ (month 60 retention – month 59 retention) x 60
= month one retention + (-1+2)x month 2 retention + (-2+3)x month three retention+ …+ (-50+60)x month 60 retention
= month one retention + month 2 retention + … + month 60 retention
You would do exactly the same thing if you were running a cohort analysis for groupon, except that you would calculate not subscibers but # of people pruchasing that month.
[…] have on retention as you can often see an increase in early month retention from later cohorts.(source:lsvp.wordpress) 分享到: QQ空间 新浪微博 开心网 […]
Jeremy,
Thank you for sharing this. This is very helpful.
I have one question:Why does the “first month retention data” not 100%?
Does the 98.1% in first line mean that if I send email to 1000 people and 981 of them subs? Then the 54.7% below the 98.1% is the precentage of the 981 people who still subs in the second month? Is it correct?
98.1% of people who start month one as subscribers are still subscribers at the end of the first month. and 54.7% of those 981 are still subs at end of month two
Some churn can happen in the first month
why does your analysis not factor in present value of money? Calculating the LTV of a customer without considering present value of the future revenues [which you factor in by multiplying by the average life] would certainly lead to an inflated LTV calculation. Particularly with smaller companies when WACC is often in excess of 20%, including vs. excluding PV will often have huge numerical (& thus operational) impacts.
Also, for non-subscription based e-commerce companies, the churn is often times higher than 70% in the first month alone, depending in large part on the industry and customer loyalty/trademark value. In my opinion, for what it’s worth, I think further individual customer tracking & clustering of the customer base is the best way to truly connect with each level of customer. Often times there’s a core level of customers that react vastly different than non-core customers in terms of a marketing approach, creating an inherent insight on customer retention and maximizing ROI/spend.
Jeremy, great post.
What about gaming industry?
What happens if we have people who haven’t played for certain period and then came back to play? by using your model we get few results of more than 100% retention rate.
Please suggest
I’d treat them as anomalies as at scale you will likely stop seeing these
[…] How to estimate Lifetime Value; Sample cohort analysis […]
Hi Jeremy,
Please could you help me understand how the expected cumulative count is used to calculate the ‘lifetime’.
cumalative sub after 1 year = 5.23. Does that mean 5.23 months, one average customer subscribes for a year?
Sorry if this question was addressed earlier. But I am still unclear.
Your inputs will be valuable.
Thanks,
Yes, Raksha, that is right. “lifetime” is the weighted average of all subscribers lifetimes, which would be 5.23 if you only calculated out 12 months.
[…] Liew of Lightspeed Venture Partners on how to calculate lifetime customer valuewith cohort […]
This article is very helpful! Thank you for sharing. Do you also write other articles about cohort analysis or data analysis in general? What do would you teach a “noob” in terms of the basics of doing data analysis?
I haven’t done anything else so far
[…] couple of years ago I did a post on how to estimate lifetime value for a subscription business where I uploaded a sample cohort analsyis that others can use as a […]
We use http://www.RJMetrics.com to automate this calculation. Makes it a lot easier to keep LTV and cohort analyses up to date with the most recent data.