By Sean Hull
Usually an article like this one starts out with the technical word "scaling". Unfortunately, as with health care reform, not everyone agrees on what it means, or even on what the goal is. So I deliberately chose not to use the word, and to use non-technical words that we can all agree on instead. Typically, when our database is slowing down, we want it to be faster, stronger, bigger and better!
With that in mind, I'm going to discuss some of the various ways to get there, and hopefully put some of the technology options in perspective. This will help you survey the landscape and plan for your future needs. The first part of the article discussed query tuning and hardware changes, while this installment covers adding more servers and the application changes needed to make that work.
Bigger With More Boxes
Adding more MySQL instances to the mix is one way to get a faster overall response from your application. If you have a server with multiple CPUs, a lot of memory and fast disks, chances are good you're not fully exploiting all that processing power. In that case, running multiple MySQL instances on that one server may let you use more of it. That's because MySQL out of the box is a single process with multiple threads for sessions, so there's a limit to how much hardware it can really make use of.
If your single server is maxed out, you may well benefit from using multiple servers. But whether your multiple MySQL instances are on one server with different ports or on multiple servers, you still need a method for the application to decide where to send queries. Do they make changes? In that case, they'll need to go to your single master database. Are they doing selects? Then a fleet of read-only slaves will work for those queries.
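To make the routing decision concrete, here is a minimal sketch in Python. It assumes one master and a list of read-only slaves; the class name QueryRouter and the server names are hypothetical, and a real application would hand back connections rather than strings. It only looks at the first word of the statement, so treat it as an illustration of the idea rather than a complete router.

```python
import itertools


class QueryRouter:
    """Send changes to the master and spread SELECTs across read-only slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Cycle through the slaves in order; with no slaves, everything goes to the master.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        # The first word of the statement decides where it goes.
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._slaves is not None:
            return next(self._slaves)
        # Anything that is not a plain SELECT is treated as a change.
        return self.master


# Hypothetical server names, for illustration only.
router = QueryRouter("db-master:3306", ["db-slave1:3306", "db-slave2:3306"])
print(router.route("UPDATE users SET name = 'x' WHERE id = 1"))
print(router.route("SELECT * FROM users WHERE id = 1"))
```

Sending every statement that is not a SELECT to the master is the cautious default: a misrouted read only costs some master capacity, while a write sent to a slave would break replication.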
1. Data Partitioning and Sharding
Since many web applications identify users by session, dividing hits among the different slaves by session could make a lot of sense. A-G, H-O, P-Z, for example, might work; a hash of the username or of the userid are other ways to distribute users across servers. The value you split on is called the partition key, and choosing it is an important decision, as it affects how you build out your slaves and how load is distributed across them. It may also determine which data becomes unavailable if one of the slaves goes down.
If you're doing this type of partitioning, you'll need to decide at runtime which database to hit. This can be done with a middle layer like MySQL Proxy. Although it is still in alpha, the concept is good, and some are already using it in production. It sits on a server listening for requests on port 3306, then forwards those queries to the appropriate server behind the scenes, based on logic coded in a fast scripting language called Lua.
The other option is to decide where to send your query in the application itself. This is the most flexible method, as it gives you full control of the decision-making process. You can check the slaves to see whether they are caught up or lagging, and use MASTER_POS_WAIT() as needed. Your particular language or web framework may already provide some support for this kind of logic, so check your documentation. You might also look into Continuent Tungsten, DBIx::DBCluster for Perl, and SQLRelay, which supports a lot of different languages and databases. Also, a CMS like Drupal, for instance, already has support for multiple read-only slaves built in, so you just have to enable it.
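The two partitioning schemes mentioned above, an alphabetical range and a hash, can be sketched in a few lines of Python. The server names in SHARDS and both function names are hypothetical, and sending non-alphabetic usernames to the last server is an assumption made here just to keep the example complete.

```python
import zlib

SHARDS = ["slave-a", "slave-b", "slave-c"]  # hypothetical server names


def shard_by_range(username):
    """Alphabetical split: A-G, H-O, P-Z on the first letter of the username."""
    first = username[:1].upper()
    if "A" <= first <= "G":
        return SHARDS[0]
    if "H" <= first <= "O":
        return SHARDS[1]
    # P-Z, plus anything that does not start with A-O, lands on the last server.
    return SHARDS[2]


def shard_by_hash(key):
    """Hash split: a stable checksum of the key, modulo the number of servers."""
    # zlib.crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return SHARDS[zlib.crc32(str(key).encode("utf-8")) % len(SHARDS)]


print(shard_by_range("alice"), shard_by_range("Henry"), shard_by_range("zoe"))
print(shard_by_hash("alice"), shard_by_hash(1042))
```

The range split is easy to reason about but can be lopsided if many usernames start with the same few letters. The hash split tends to spread users more evenly, but changing the number of servers changes the modulo, so most users would map to a different server afterwards.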
Another consideration when using this type of architecture is deciding when to hit the master and when to hit a slave. The most basic split is based on the query type: all INSERT, UPDATE and DELETE statements go to the master, and SELECT statements go to the slaves. But if you hit a slave directly after a user submits a comment on a blog, for instance, the comment may not be on the slave yet, due to the lag inherent in MySQL's replication architecture. That is what's called an artifact.
Checking for stale data before you read is a better method. It works well for reporting queries that run at night, for example. You just need to make sure replication has caught up before they start.
Another method would be to track database changes by a version number, and verify that you have the latest "version" before reading your data.
Lastly, MySQL provides a function called MASTER_POS_WAIT(), which blocks on the slave until it has applied everything up to a given position in the master's binary log, or until an optional timeout expires.
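Here is a rough sketch of how an application might use that function: note the master's binary log position right after a write, then ask the slave to wait for it before reading. The two helper names are hypothetical. The cursors are assumed to be ordinary Python DB-API cursors that return rows as tuples and use %s placeholders, which is an assumption about your driver, so check its documentation.

```python
def read_master_position(master_cursor):
    """Return (binary log file, position) from the master, taken right after a write."""
    master_cursor.execute("SHOW MASTER STATUS")
    row = master_cursor.fetchone()
    if row is None:
        raise RuntimeError("no master status; is binary logging enabled?")
    # The first two columns of SHOW MASTER STATUS are File and Position.
    return row[0], row[1]


def slave_caught_up(slave_cursor, log_file, log_pos, timeout_seconds=2):
    """Wait on the slave for the given position; True if it got there in time."""
    slave_cursor.execute(
        "SELECT MASTER_POS_WAIT(%s, %s, %s)",
        (log_file, log_pos, timeout_seconds),
    )
    result = slave_cursor.fetchone()[0]
    # NULL (None) means replication is not running or the arguments were bad,
    # -1 means the timeout expired, and 0 or more means the slave has caught up.
    return result is not None and result >= 0
```

If slave_caught_up returns False, the application can fall back to reading from the master for that one request, so the user who just posted a comment still sees it.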
2. Functional Partitioning
Chances are you're already doing a bit of this. It involves creating a copy of the production database for each of several functions, such as one for data warehousing and reporting, another for text searching, and so on.
Better MySQL Through Load Balancing
If all of your slaves have the same read-only data, you may want to do some sort of load balancing to distribute read traffic evenly. You can choose from random, least connections, fastest response, round robin, or some sort of weighted decision. Although some hardware load balancers may provide the functionality you need, they tend to be designed for web traffic and don't have database-specific features.
Fortunately, there are a number of appealing software solutions. The Linux Virtual Server (LVS) project is very mature; it does the kind of distribution you might otherwise do with DNS, but at the IP level, and it is very fast. A couple of related projects are worth a look too, including Wackamole, which is peer-based so you don't have a single point of failure, and Ultra Monkey, which builds on LVS.
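To show how the balancing policies listed earlier differ, here is a small Python sketch of four of them: random, round robin, least connections and weighted. The SlavePool class, the server names and the weights are all hypothetical, and this is not how LVS or any other particular load balancer is implemented. Fastest response is left out because it needs live timing data.

```python
import itertools
import random


class SlavePool:
    """Pick a read-only slave using one of several balancing policies."""

    def __init__(self, weights):
        # weights maps slave name -> relative weight (illustrative values only).
        self.weights = dict(weights)
        self.names = list(self.weights)
        self.active = {name: 0 for name in self.names}  # open connections per slave
        self._ring = itertools.cycle(self.names)

    def pick_random(self):
        return random.choice(self.names)

    def pick_round_robin(self):
        return next(self._ring)

    def pick_least_connections(self):
        # Ties go to whichever slave was listed first.
        return min(self.names, key=lambda name: self.active[name])

    def pick_weighted(self):
        # A slave with weight 3 is chosen about three times as often as one with weight 1.
        relative = [self.weights[name] for name in self.names]
        return random.choices(self.names, weights=relative)[0]

    def acquire(self, name):
        self.active[name] += 1

    def release(self, name):
        self.active[name] -= 1


pool = SlavePool({"slave1": 1, "slave2": 3})
print(pool.pick_round_robin(), pool.pick_least_connections(), pool.pick_weighted())
```

Round robin and random need no state about the slaves, which makes them simple and robust. Least connections and weighted choices are worth the extra bookkeeping when your slaves differ in hardware or when some queries run much longer than others.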
MySQL provides a lot of sophisticated features, but scaling remains a nebulous and exotic technical term, thrown about by a lot of different folks in different circumstances to mean possibly different things. So we've endeavored to cover the topic while using this word less and talking more about what you care about, namely making MySQL faster, stronger, bigger and better.