huesler.io My 2 cents

Active Record No No's. Be nice to your database

ActiveRecord, the ORM of Ruby on Rails is a great tool to model database backed applications without writing a lot of repetitive SQL queries. By following the principle "Convention over Configuration", you don't need to setup a lot of configuration to get started. So no getter and setters and no mapping definitions if you follow the conventions. At first, it feels a little bit like magic, except that it is not. In the end, it is still generating SQL for you. That means, that even Active Record does a great job for you, it cannot protect you from bad queries.

During the past few months, I have seen my fair share of code that makes your database cry havoc and even worse, takes down an entire application on subsequent requests. Most of these mistakes could have been easily avoided by following one simple guidline: "Be reasonable. Before committing your code, take a look at the generated SQL and keep in mind, that the production database might have more than just 10 records inside."

So let's take a look at some example code that might be trouble for your application: At first sight, this doesn't look very bad. It gets the job done and probably performs nicely in a development environment. First, let's look at the SQL being generated: We are fetching every single column of every user in our system just to get the id's. Even worse, for every user in the database, ActiveRecord creates a new user object. That takes time and uses up memory. Just imagine what would happen if you have a couple of thousands or more user in your database. At best, the request will just timeout after 60 seconds and everything goes back to normal. Worst case, memory consumption rockets, takes out the server and the database is jammed with long running queries. Not bad for just one line of code.

Let's assume this line of code was somewhere inside a view in the admin interface providing the option to assign a project owner. To fix the problem, we can do the following:

  1. Paginate
  2. Don't fetch all columns
  3. Move the code to the controller, that's where it belongs

Here are some other gems I have found hidden somewhere in a view:

Tips

So here are my tips to prevent these kind of time bombs in your application:

  • Look at the generated SQL
  • Let AR/SQL do the math (conditions,count,sum, named_scope, sorting, ordering) instead of using ruby's enumerable functions
  • Paginate, but paginate in SQL not on ruby collections (User.paginate vs. User.all.paginate)
  • Only fetch what you really need (:select => "id, name")
  • Eager load associations with :include to avoid the n + 1 problem (Post.all.each{|post| puts post.author.name} vs. Post.all(:include => :author))
  • Whenever possible fetch all records you need from inside a controller or a model

Check SQL logs

To get a feeling of what is happening behind the scenes, I like to try things out in script/console and look at the generated SQL. To see it directly in irb, I change the Rails logger to STDOUT in my ~/.irbrc file:

To get even more information, there are plenty of cool profiling tools that hook into your Rails app and give you a nice GUI with all the information you need. Try out New Relic, FiveRuns TuneUp or Rack::Bug. There is also great Railscast explaining all of these three tools.