About six years ago, Netflix began the move from a monolithic to a cloud-based microservices architecture, openly documenting the journey along the way. Netflix is one of the earliest adopters of microservices, a term that didn't even exist when Netflix began moving away from its monolith. Today, the Netflix application is powered by an architecture featuring an API gateway that handles about two billion API edge requests every day, served by more than 500 microservices. Netflix has been so successful with its architecture that the company has open sourced a great deal of its platform, including the technologies powering its microservices. Netflix has become one of the most well-known examples of a modern microservices architecture; if an article mentions microservices, odds are it also mentions Netflix.
Netflix Moves to the Cloud
Netflix began moving from a monolithic to an AWS cloud-based microservices architecture in 2009, long before the term microservices even existed. Netflix began by moving movie encoding, a non-customer-facing application. In 2010, Netflix began moving customer-facing pieces of the website to AWS, including account sign-up, movie selections, TV selections, metadata, and device configuration. By the end of 2010, the entire customer-facing website had been moved to AWS. By December 2011, Netflix had successfully moved to the cloud, breaking up its monolith into hundreds of fine-grained microservices.
The term microservices was coined by a group of software architects in 2012 but didn’t start gaining popularity until 2014, when software developer and author Martin Fowler began using the term in some of his publications.
Caption: Microservices - Google Search Interest Over Time - Data Source: Google Trends
Adrian Cockcroft, a technology fellow at Battery Ventures, is well known for his role as cloud architect at Netflix, where he often gave presentations about Netflix’s move to the cloud and microservices. Cockcroft has developed his own definition for microservices: "Loosely coupled service oriented architecture with bounded contexts."
Reasons for the Move
There were a number of reasons Netflix made the decision to move from a monolithic datacenter to a cloud-based microservices architecture. The primary reasons for the move, however, had to do with availability, scale, and speed. The company needed an architecture that allowed Netflix to be up and running 24/7, scale to the next order of magnitude, and be optimized for speed.
Back in 2008, when Netflix was still operating as a monolith, a single missing semicolon brought down the entire Netflix website for several hours. Monoliths tend to become spaghetti code, with various components tightly coupled together. With the monolith, when something was broken in the code, all Netflix engineers had to be alerted to check whether it was their code that had caused the problem. While it is still possible to have outages with a microservices architecture due to code errors and other issues, outages can be minimized since the platform is divided into separate services that are loosely coupled: a failure in one service does not have to take down the others. A well-designed microservices architecture allows for better availability.
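The availability argument can be illustrated with a minimal sketch. It is not Netflix's code; the service functions, names, and fallback values below are hypothetical. It assumes a page assembled from two loosely coupled services, one of which fails, and shows that the caller can substitute a fallback instead of failing the whole page.

```python
from typing import Callable, Dict


def call_with_fallback(service: Callable[[], str], fallback: str) -> str:
    """Call one service; if it raises, return the fallback instead of failing the page."""
    try:
        return service()
    except Exception:
        return fallback


def catalog() -> str:
    # Hypothetical healthy service.
    return "catalog: 3 titles"


def recommendations() -> str:
    # Hypothetical service that is down, for example because of a code error.
    raise RuntimeError("recommendations service is down")


def render_home_page() -> Dict[str, str]:
    # Each section depends on a separate service, so one failure stays contained.
    return {
        "catalog": call_with_fallback(catalog, "catalog unavailable"),
        "recommendations": call_with_fallback(recommendations, "popular titles"),
    }


if __name__ == "__main__":
    print(render_home_page())
```

In a monolith, the equivalent error could stop the whole application from starting or serving requests; here only the recommendations section degrades.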
Another reason for the move to a cloud-based microservices architecture had to do with scale. Netflix was unable to build data centers fast enough to keep up with its growth rate. Not only was the number of Netflix users growing very rapidly, but Netflix was also branching out to platforms like Xbox, Wii, and mobile, and was expanding globally. With the AWS cloud, thousands of server instances can be commissioned simultaneously if needed to meet increased demand for services, and Netflix is able to increase or decrease its capacity in minutes. In the past, increasing the capacity of the datacenter could take hours or days. Scale is also undifferentiated for a monolithic application; it doesn’t allow for different components to scale at different rates. For example, a customer service application could not be scaled at a different rate than a product catalog. However, this is possible with a cloud-based microservices architecture.
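To make the point about scaling at different rates concrete, here is a small sketch of the arithmetic. All numbers are invented for illustration, and the two service names simply reuse the example above; none of this reflects actual Netflix traffic or tooling. Each service gets its own instance count, computed from its own load and per-instance capacity.

```python
import math
from typing import Dict, Tuple


def instances_needed(requests_per_second: float, capacity_per_instance: float) -> int:
    """Smallest number of instances that can absorb the given load."""
    return math.ceil(requests_per_second / capacity_per_instance)


def plan_capacity(services: Dict[str, Tuple[float, float]]) -> Dict[str, int]:
    # services maps a name to (load in requests/s, capacity of one instance in requests/s).
    return {
        name: instances_needed(load, capacity)
        for name, (load, capacity) in services.items()
    }


if __name__ == "__main__":
    # Hypothetical loads: the catalog is busy, customer service is quiet.
    plan = plan_capacity({
        "product_catalog": (9500.0, 250.0),
        "customer_service": (120.0, 100.0),
    })
    print(plan)
```

With these assumed figures the product catalog would run on 38 instances while customer service needs only 2, a split that a single undifferentiated monolith cannot express.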
The microservices architecture allowed Netflix to greatly speed up development and deployment of its platform and services. The company was able to build and test global services on a large scale without impacting the current system, and it could quickly roll back if there were problems. The microservices architecture also allowed Netflix to create more than 30 independent engineering teams that could work on different release schedules, which helped increase the agility and productivity of the development process.
There Were a Few Problems
While the Netflix platform is one of the best examples of a modern cloud-based microservices architecture, the move from monolith to microservices was not without some problems. When Netflix first moved the customer-facing website to the cloud, there were a lot of latency issues with the web pages. One of the ways Netflix dealt with this issue was by managing resources within AWS to avoid co-tenancy and by adjusting to AWS networking, which has more variable latency than the Netflix datacenters.
In April 2011, there was an outage in AWS US-East that brought down several popular websites hosted on AWS. While Netflix did not experience any large external outages because of the AWS issue, the company did have to make manual changes to its AWS configuration, moving sets of services out of the affected Amazon Availability Zone (AZ). The company has since automated much of the process, so that failures of this nature are handled without requiring a lot of manual intervention by Netflix engineers.
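The idea of automatically moving services out of a failing zone can be sketched as follows. This is only an illustration under simple assumptions, not a description of Netflix's tooling: the zone names are hypothetical, health is given as a plain boolean per zone, and traffic is split evenly across whichever zones are healthy.

```python
from typing import Dict


def traffic_weights(zone_health: Dict[str, bool]) -> Dict[str, float]:
    """Split traffic evenly across healthy zones; unhealthy zones get none."""
    healthy = [zone for zone, ok in zone_health.items() if ok]
    if not healthy:
        raise RuntimeError("no healthy zone available")
    share = 1.0 / len(healthy)
    return {zone: (share if ok else 0.0) for zone, ok in zone_health.items()}


if __name__ == "__main__":
    # Hypothetical zones; zone-b is failing, so its traffic moves to the others.
    print(traffic_weights({"zone-a": True, "zone-b": False, "zone-c": True}))
```

Running a check like this on a schedule, rather than by hand during an incident, is the kind of step that removes manual intervention from a zone failure.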
Netflix faced a number of issues as it steadily moved to a cloud-based architecture, such as load increases, instance failures, zone and region failures, and performance problems. Netflix has built many technologies and tools to help address and solve these issues.
Netflix Provides Open Source Software
Netflix has been so successful moving to a cloud-based microservices architecture that the company has open sourced many of the tools and components used to build it. In a recent GOTO Conference presentation, Cockcroft explained what led up to Netflix open sourcing much of its code:
“During 2011, there were a bunch of outages where Amazon went down, and Netflix didn’t; everyone else went down, Netflix didn’t. So why was that? Well it’s because we actually architected to be highly available and people said ok well we like that. So in 2012, we [people] said ok we do like these features; it’s scalable, it’s highly available, we’ve got a very fast development pipeline, and you stay up when Amazon goes down, and it’s low cost, and you keep going up at conferences and battling people, so that looks good. But they couldn’t figure out how to do it themselves because the gap from what Netflix was talking about to what everyone else was doing was just too big. So in 2012, we started seriously open sourcing a lot of code.”
The Netflix Open Source Software Center (Netflix OSS) provides tools and technologies that companies can use for building a microservices architecture that runs on AWS.
“When we said we were going to move all of Netflix to cloud everyone said we were completely crazy,” explained Cockcroft at a recent GOTO conference. “They didn’t believe we were actually doing that, they thought we were just making stuff up.”
Today, many companies have moved to the cloud and adopted a modern microservices architecture, including Amazon, Google, IBM, LinkedIn, Nike, Nordstrom, Orbitz, PayPal, Spotify, Target, and Twitter, and the list goes on and on.
Netflix was one of the first companies to move to the cloud and implement microservices, helping to pave the way for other companies to do the same. Netflix has become a technology leader, especially when it comes to cloud computing and microservices. This is why just about every article these days that mentions microservices also mentions Netflix.