Nutanix launches and a new era for data center computing is born – No SAN or NAS required!
August 16, 2011. Posted by ravimhatre in 2011, Cloud Computing, data, database, datacenter, enterprise infrastructure, Infrastructure, platforms, Portfolio Company blogs, startup, startups, Storage, Uncategorized.
Tags: data center, datacenter, nas, san, storage, virtualization, vmware
The Nutanix team (ex-Googlers plus VMware and Aster Data alums) has been quietly working to create the world’s first high-performance appliance that enables IT to deploy a complete data center environment (compute, storage, network) from a single 2U appliance.
The platform also scales to much larger configurations with zero downtime or admin changes. Users can run a broad array of mixed workloads, from mail/print/file servers to databases to back-office applications, without having to make upfront decisions about where or how to allocate their scarce hardware resources.
For the first time, an IT administrator in a small or mid-sized company or a branch office can plug in his or her virtual data center and be up and running in a matter of minutes.
Some of the most disruptive elements of Nutanix’s technology, which enable the customer to avoid the expensive SAN and NAS investments typically required for true data center computing, are described on the company’s blog – http://www.nutanix.com/blog/.
Take a look. We believe this represents the beginning of the next generation in data center computing.
We continue to be very enthusiastic about the tremendous amount of opportunity in the Enterprise Infrastructure sector for 2011. In the past few years, we’ve seen significant innovation in technologies such as virtualization, flash memory and distributed databases and applications. When combined with business model shifts (cloud computing) and strong macroeconomic forces (reduced R&D budgets), a “perfect storm” is created where the IT ecosystem becomes ripe for disruption. Startups can take advantage of the changing seas and ride the subsequent waves to emerge as leaders in new categories. For this post, I’ll highlight three categories where I believe we’ll see significant enterprise adoption in 2011 – big data solutions, use cases for cloud and virtualizing the network. Startups in these categories are now at the point where ideas have become stable products and science experiments have transformed into solutions.
1. BIG DATA SOLUTIONS GROW UP
There’s been a lot of “big noise” about “Big Data” for the past couple of years, but there has been “little” clarity for the traditional Enterprise customer. Hadoop, MapReduce, Cassandra, NoSQL – all interesting ideas, but what Enterprise IT needs is solutions. Solutions come when there are products optimized to solve the challenges with specific applications. Most of the exciting, fast growing technology companies we hear about daily (Facebook, Zynga, Twitter, Groupon, LinkedIn, Google, etc.) are incredibly efficient data-centric businesses. These companies collect, analyze and leverage massive amounts of data and use it as a fundamental competitive weapon. In terms of really working with “Big Data,” Google started it. Larry and Sergey taught the world that analyzing more information generates better results than any algorithm. These high-profile web companies created technologies to solve problems other companies had not faced before. In this copycat world we live in, Enterprise IT is ready to follow the consumer-tech leaders. The best enterprise companies are working hard to leverage vast amounts of data in order to make better decisions and deliver better products. At Lightspeed, we invested in companies like DataStax (www.datastax.com) and MapR Technologies (www.maprtech.com) because these are startups building solutions that enable Enterprise IT to work with promising Big Data platforms like Cassandra and Hadoop. With enterprise-grade solutions now available, I expect 2011 to be a year when tinkering leaps to full-scale engagement because these new platforms will deliver a meaningful advantage to Enterprise customers.
2. CLOUD COMPUTING FINDS ITS ENTERPRISE USE CASES
The hype around “Cloud Computing” is officially everywhere. My mom, who is in her sixties (sorry Mom) and just learned to text, recently asked me about Cloud Computing. Apparently she’s seen the commercials. In Enterprise IT circles and VC offices, there’s a lot of discussion around “Public” clouds vs. “Private” clouds; Infrastructure as a Service vs. Platforms as a Service; and the pros and cons of each. It’s all valuable theoretical debate, but people need to focus on the use cases and the specific economics of a particular “cloud” or platform configuration. As of right now, not every Enterprise IT use case fits the cloud model. In fact, most don’t. But there are three in particular that definitely do — application management, network and systems management and tier 2 and 3 storage. At Lightspeed, we’ve invested in a number of companies such as AppDynamics (www.appdynamics.com) and Cirtas (www.cirtas.com) which deliver solutions that are designed from the ground up to enable enterprise class customers to leverage the fundamental advantages of “Cloud Computing” – agility, leveraged resources, and a flexible cost model. Highly dynamic, distributed applications are being developed at an accelerating rate and represent an ideal use case for cloud environments when coupled with a solution like the one offered by AppDynamics which drives resource utilization based on application level demands. Similarly, Enterprise IT storage buyers have gotten smarter about tiering data among various levels of storage media, and infrequently accessed data is a great fit for cloud storage. Cloud controllers like the one offered by Cirtas enable enterprises to have the performance, security and reliability they are used to with traditional internal solutions but leverage the economics of the cloud.
3. VIRTUALIZING THE NETWORK
To date, the story of virtualization has been primarily about servers and storage. Tremendous innovation from VMware led the way for an entirely new set of companies to emerge in the data center infrastructure ecosystem. At Lightspeed, we talk about the fundamental pillars of the data center as application and systems management, servers, storage, and networking. Given all the advancement and activity around the first three, I think it’s about time the network caught up. As Enterprise IT continues to virtualize more of the data center and adopts cloud computing models (public or private), the network fundamentals are being forced to evolve as well. Networking solutions that decouple hardware from software are better aligned with the data center of the future. Companies such as Embrane (www.embrane.com) and Nicira Networks (www.nicira.com) are tackling this challenge head on and I believe 2011 will be the year where this fundamental segment of data center infrastructure starts to see meaningful momentum.
Going viral without going down
June 20, 2008. Posted by jeremyliew in database, flixster, scalability.
As the social web evolves and platforms like Facebook and MySpace open up to applications, many companies and developers are rushing to get distribution to their millions of users by “going viral”. For the successful applications, this can often present a problem (a high-quality one for sure) – how do you actually scale your deployment to handle that growth?
At Flixster, we’ve been riding this growth curve for 2 years now – first with our destination site itself (www.flixster.com), and subsequently on our embedded applications on Facebook and MySpace. Across our properties, we now have over 1 million users logging in each day and we are approaching our 2 billionth movie rating. Like many others, we started out with just a single virtual server in a shared hosting environment. So how did we scale to where we are today?
The Holy Grail for scaling is “pure horizontal scaling” – just add more boxes to service more users. This tends to be relatively easy at the application layer – there are a multitude of cheap and simple clustering and load balancing technologies. The data layer is typically much more difficult to scale, and is where a lot of web startups fall down. High-volume applications simply generate too much traffic for any reasonably-priced database (I’ll assume you’re probably running MySQL as we are). So what are your options?
Buy yourself some time
The overriding mantra to everything we’ve done to scale our database has been: “avoid going to disk at all costs”. Going to disk to retrieve data can be orders of magnitude slower than accessing memory. You should apply this principle at every layer of your application.
Given that, the first thing to do is to throw in a good caching layer. The easiest way to scale your database is to not access it. Caching can give you a ton of mileage, and we still spend a lot of effort optimizing our caching layers.
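To make the caching idea concrete, here is a minimal sketch of a "check the cache first, fall back to the database" lookup. It is only an illustration: the class name, the in-process dictionary and the `load_from_db` callback are all assumptions made for the example, not a description of our actual caching layers, which would more likely sit in a shared cache tier.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Serve reads from memory and touch the database only on a cache miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        # load_from_db is a hypothetical stand-in for a real database query.
        self._load_from_db = load_from_db
        self._cache: Dict[str, str] = {}
        self.db_reads = 0  # counts how often we actually went to the database

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]  # cache hit: no database access
        self.db_reads += 1
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value  # remember the value for the next reader
        return value

    def invalidate(self, key: str) -> None:
        # Call this after a write so the next read reloads fresh data.
        self._cache.pop(key, None)


fake_db = {"movie:1": "Iron Man"}
cache = CacheAside(fake_db.get)
cache.get("movie:1")  # miss: one database read
cache.get("movie:1")  # hit: served from memory
```

The important design choice is the invalidation on write: without it the cache keeps serving the old value after the database changes.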
If you can afford it, you can also buy a bigger box (RAM being the most important thing to upgrade). “Scaling up” like this can be effective to a point, but only buys you so much time because after all, it’s still a single database.
A replication setup can also buy you some time if you have a read-intensive workload and can afford to send some queries to a slave database. This has its problems though, the biggest of which is replication lag (slaves fall behind the master, so reads can return stale data). For most web application workloads, replication is a tool much better suited to solving high-availability problems than scalability ones.
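One way to live with replication lag is to decide per query whether a slave is fresh enough. The sketch below is an assumption-laden illustration: the function name, the five-second threshold and the idea of passing in a measured lag value are all hypothetical. In MySQL of this era the lag could be read from the Seconds_Behind_Master field of SHOW SLAVE STATUS, which is NULL when replication is not running; that case is modeled here as `None`.

```python
from typing import Optional


def choose_read_target(
    replica_lag_seconds: Optional[float],
    max_lag_seconds: float = 5.0,
    needs_fresh_data: bool = False,
) -> str:
    """Return "master" or "slave" for a read query.

    replica_lag_seconds is None when the lag is unknown (for example,
    replication is stopped), in which case we play it safe.
    """
    if needs_fresh_data:
        return "master"  # e.g. a user reading back their own write
    if replica_lag_seconds is None:
        return "master"  # unknown lag: do not trust the slave
    if replica_lag_seconds > max_lag_seconds:
        return "master"  # slave has fallen too far behind
    return "slave"
```

Note that every query sent back to the master is load the slave did not absorb, which is part of why replication only buys a limited amount of time.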
It’s time to break up
Eventually, you’re going to have to find a way to “scale out” at your database layer. Split up your data into chunks. Put the chunks on separate databases. This strategy is often called “sharding” or more generally “data partitioning” (I use the two interchangeably). It works because it reduces the workload (and price tag) for each server. It’s not trivial, but it is very doable.
There is a lot of literature out there on the technical details and challenges of sharding (see the resources section). At Flixster, we’ve followed many of the strategies described by LiveJournal, Flickr and others. One of the critical things for any startup however is figuring out when to do things.
Our primary trigger for deciding to shard a given piece of data is the size of the “active” or “working” set. It all comes back to the principle of never going to disk. All of our database servers have 32GB of memory, which we give almost entirely to the MySQL process. We try to fit most, if not all, of our active data on a given server into that space.
The ratio of active / total data will vary tremendously by application (for us it seems to be in the 10-20% range). One way to figure out if your active data is saturating your available memory is to just look at cycles spent waiting for I/O on your server. This stat more than anything else we monitor drives our partitioning decisions.
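The back-of-the-envelope arithmetic behind this can be sketched in a few lines. The function names are made up for illustration, and the calculation is deliberately rough: it assumes the whole 32GB is available for data and ignores indexes and other memory MySQL needs.

```python
def active_set_gb(total_data_gb: float, active_ratio: float) -> float:
    """Estimated size of the working set, given the active / total ratio."""
    if not 0 < active_ratio <= 1:
        raise ValueError("active_ratio must be in (0, 1]")
    return total_data_gb * active_ratio


def fits_in_memory(
    total_data_gb: float, active_ratio: float, memory_gb: float = 32.0
) -> bool:
    """True if the estimated working set fits in the server's memory."""
    return active_set_gb(total_data_gb, active_ratio) <= memory_gb


def max_total_data_gb(active_ratio: float, memory_gb: float = 32.0) -> float:
    """Largest total data size whose working set still fits in memory."""
    if not 0 < active_ratio <= 1:
        raise ValueError("active_ratio must be in (0, 1]")
    return memory_gb / active_ratio
```

Under these assumptions, a 32GB server with a 10-20% active ratio can hold very roughly 160GB to 320GB of total data before the working set starts spilling to disk. Treat that as a first guess; the I/O wait numbers on the real server are what should drive the decision.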
The other thing we look at for a given table is the raw table size. If a table becomes too big (in terms of # of rows or total data volume) to administer – i.e. we can’t make schema changes easily – we partition it. There’s no magic threshold that fits all applications, but for us we typically decide to shard a table if we expect it to reach 30-40 million rows.
It’s certainly easier to start off with a fully sharded architecture, but most applications do not (we certainly didn’t). In fact, I’d say that if you are spending a lot of time figuring out partitioning strategies before you even have any users, you’re probably wasting development resources. So how do you actually rip the engine out of the car while it’s running? Piece by piece and very, very carefully…
Crawl, walk, run
There are a variety of partitioning strategies, which we’ve employed incrementally as we’ve grown. Here are some of the things we’ve done (in ascending order of difficulty).
If you have a large table with a relatively small “hot spot”, consider putting the active data into a separate table. You will have some additional complexity managing the flow of data from the “active” table to the “archive” table, but at least you have split the problem a bit. This is the strategy we used early on for our movie ratings table, after realizing that 90% of the queries we were writing against it were looking for data from the last 30 days.
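Here is a rough sketch of the "flow of data" from the active table to the archive table. It uses Python's built-in SQLite module purely so the example is runnable on its own; we run MySQL, and the table names, columns and integer day numbers are all invented for the illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE ratings_active  (user_id INTEGER, movie_id INTEGER,
                              stars INTEGER, rated_day INTEGER);
CREATE TABLE ratings_archive (user_id INTEGER, movie_id INTEGER,
                              stars INTEGER, rated_day INTEGER);
"""


def create_tables(conn: sqlite3.Connection) -> None:
    conn.executescript(SCHEMA)


def archive_old_ratings(conn: sqlite3.Connection, cutoff_day: int) -> int:
    """Move ratings older than cutoff_day into the archive table.

    The copy and the delete run in one transaction, so a failure
    part-way through does not lose or duplicate rows.
    Returns the number of rows removed from the active table.
    """
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO ratings_archive "
            "SELECT * FROM ratings_active WHERE rated_day < ?",
            (cutoff_day,),
        )
        cur = conn.execute(
            "DELETE FROM ratings_active WHERE rated_day < ?",
            (cutoff_day,),
        )
    return cur.rowcount
```

A job like this would run on a schedule with the cutoff set to "today minus 30 days", so that most queries only ever touch the small active table.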
Vertical (or feature-based) Partitioning
Your application may have features that are relatively independent. If so, you can put each feature on a separate database. Since the features are independent, separating them shouldn’t violate too many assumptions in your application.
We did this pretty early on, and have had a lot of success with this approach. For example, movie ratings are a core feature that didn’t overlap too much (data-wise) with the rest of the database. Comments are another one. We’ve followed the same strategy for several other “features” and now have six separate feature databases.
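In code, feature-based partitioning can be as simple as a lookup from feature name to database. The sketch below is illustrative only: the database names and the fallback to a "main" database are assumptions, not our real configuration.

```python
# Hypothetical mapping of independent features to their own databases.
FEATURE_DATABASES = {
    "ratings": "db-ratings",
    "comments": "db-comments",
}

# Everything that has not been split out yet stays on the original database.
DEFAULT_DATABASE = "db-main"


def database_for_feature(feature: str) -> str:
    """Return the name of the database that owns the given feature."""
    return FEATURE_DATABASES.get(feature, DEFAULT_DATABASE)
```

The fallback is what makes an incremental migration possible: you can move one feature at a time by adding an entry to the mapping, and everything else keeps working against the original database.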
This was a major step forward for us as it split our big problems into several smaller ones. You might not need to go any further…vertical partitioning may be sufficient. But, then again, you want to grow forever, right?
Horizontal (or user-based) Partitioning
Our success on Facebook drastically increased the load on our feature databases. Even our dedicated ratings database was struggling to keep up. A few months after our Facebook application launch, we deployed our first horizontal partition, separating different users’ ratings onto different physical databases.
One of the challenges of horizontal partitioning is in rewriting your data access code to figure out which database to use. With vertical partitions it’s relatively straightforward – which feature am I coding? With user-based partitioning, the logic can get much more complex. Another challenge in horizontal partitioning is the transition from your single data source into your partitions. The data migration can be painful. Extra hardware eases much of the pain, especially coupled with replication.
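The simplest possible version of "figure out which database to use" is a modulo on the user id, sketched below. This is one common scheme and an assumption for the sake of the example, not a description of our routing code; the class and shard names are hypothetical, and real deployments often use a lookup table instead so that individual users can be moved.

```python
from typing import List


class ShardRouter:
    """Map a user id to one of N user-partitioned databases."""

    def __init__(self, shard_names: List[str]) -> None:
        if not shard_names:
            raise ValueError("at least one shard is required")
        self._shards = list(shard_names)

    def database_for(self, user_id: int) -> str:
        # Every row belonging to a user lives on the same shard, so a
        # query for a single user's ratings touches exactly one database.
        return self._shards[user_id % len(self._shards)]


router = ShardRouter(["ratings-0", "ratings-1", "ratings-2", "ratings-3"])
router.database_for(42)  # 42 % 4 == 2, so this user lives on "ratings-2"
```

The catch is any query that spans users (say, all ratings for one movie): it now has to fan out to every shard or be served from somewhere else, which is where much of the extra complexity comes from.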
Following movie ratings, we have now horizontally partitioned a handful of other tables. We’ve also doubled the size of the partition cluster itself, going from four to eight master-slave pairs. We still use our vertically-partitioned feature databases, but they are under much less stress given the load absorbed by the horizontal partitions. And we continue to partition our high-volume tables on an as-needed basis.
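Doubling a cluster is less disruptive than it sounds if users are assigned by modulo, because each old partition splits cleanly in two. The sketch below assumes that modulo scheme (an assumption for illustration; the function name is made up) and works out which users would have to move.

```python
from typing import Dict, Iterable, Tuple


def plan_doubling(
    user_ids: Iterable[int], old_count: int
) -> Dict[int, Tuple[int, int]]:
    """Return {user_id: (old_shard, new_shard)} for users that must move
    when the number of shards goes from old_count to 2 * old_count."""
    new_count = old_count * 2
    moves = {}
    for uid in user_ids:
        old_shard = uid % old_count
        new_shard = uid % new_count
        if old_shard != new_shard:
            moves[uid] = (old_shard, new_shard)
    return moves


# Going from four to eight shards: users 0-3 stay put, users 4-7 move.
plan = plan_doubling(range(8), old_count=4)
```

With modulo assignment, a user on old shard k either stays on k or moves to k + old_count, so roughly half of each shard's users move and no data ever shuffles between two of the original shards. That property fits nicely with the replication trick: make a copy of each shard, promote it as the new partition, then delete the rows that no longer belong on each side.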
Finally, some tips
• Start small, and bite things off in pieces that are manageable. Massive, several-month-long re-architectures rarely work well.
• Get some advice. We spent a good amount of time gleaning wisdom from the success of others (which they were kind enough to put online for everyone!). See the Resources section.
• Pick the best approach for your specific problems (but you have to know where your problems are – monitor EVERYTHING).
• You’ll never get there if you don’t start.
Bonus tip – come work @ Flixster!
If you’re a DBA and interested in working on these kinds of problems at a company that is already operating at scale, please send us a resume: jobs – at – flixster.com. We’re also hiring Java developers.