Offload read queries to a replica DB for better performance
In most web application, which require a lot of find()’s especially if more than a couple of models are involved, you should probably consider offloading those operations to read-only replica of your DB. This is typically achieved by having a master/slave or master/master configuration. In high traffic application you might have a cluster of databases, but for the purpose of this example we’ll only use two data sources: “default” and “replica”.
Therefore our basic database.php will look something like this:
[cc lang=”php”]
public $default = array(
‘driver’ => ‘mysql’,
‘persistent’ => false,
‘host’ => ‘production.example.com’,
‘login’ => ‘user’,
‘password’ => ‘password’,
‘database’ => ‘production’,
‘prefix’ => ‘’,
‘encoding’ => ‘utf8’
);
public $replica = array(
‘driver’ => ‘mysql’,
‘persistent’ => false,
‘host’ => ‘readonly.example.com’,
‘login’ => ‘user’,
‘password’ => ‘password’,
‘database’ => ‘readonly’,
‘prefix’ => ‘’,
‘encoding’ => ‘utf8’
);
[/cc]
This prepares our application to use two data sources, as needed.
Next, let’s imagine we have a good ol’ blog and need to grab various information to build a list of posts.
In our Posts Controller, we’ll have some method that gets the required information:
[cc lang=”php”]
$this->Post->getListofPosts();
[/cc]
The method above will have to involve additional models to get all of the needed info (Author, Tag, (PostsTag for the join table), PostRating… and maybe a few other models). The point is that this is enough operations already to consider offloading them to our read-only DB server.
The actual process is quite simple.
First, we’ll create a generic method in our App Model:
[cc lang=”php”]
protected function _switchDataSource($models, $datasource = ‘default’) {
if (is_array($models)) {
foreach ($models as $model) {
ClassRegistry::init($model)->setDataSource($datasource);
}
}
}
[/cc]
I hope this code is simple enough, but the implementation example is coming up…
It’s worth to note that ClassRegistry::init() will cache your model information (object instance, to be more precise) in memory, therefore in a more complex case (where you might have multiple find()’s) the newly switched data source will persist until switched back. Therefore it is important to remember to “reset” your data source once you are done with the read operation(s).
Now, here’s the basic usage sample (this snippet would be inside of our getListofPosts() method):
[cc lang=”php”]
//let’s switch our DS to replica
$this->_switchDataSource(array(
‘Post’, ‘PostsTag’, ‘Author’, ‘PostRating’
), ‘replica’);
//now we can execute our find with all of the above
//model data coming from the read-only DB
$posts = $this->find(‘all’, array(
‘contain’ => array(
//include our models here
),
‘limit’ => 35
));
//don’t forget to switch the DS back to default
$this->_switchDataSource(array(
‘Post’, ‘PostsTag’, ‘Author’, ‘PostRating’
));
return $posts;
[/cc]
As you see the implementation is pretty simple. The only thing to keep in mind is that if you are getting some SQL errors, chances are you have not included all of the required models for the operation, the most common case is forgetting the join table model. To troubleshoot and see the results the debug kit is very helpful, because it will show you which data source is being used to run a particular set of queries.
Another hint, if you use the same set of models over and over you might as well assign them to a property in your model, so that if you need to add or change something you’d only do it in one place.
Using our example we can modify the code like so:
[cc lang=”php”]
public $postQueryModels = array(‘Post’, ‘PostsTag’, ‘Author’, ‘PostRating’);
….
$this->_switchDataSource($this->postQueryModels, ‘replica’);
[/cc]