Mar
09
2011

Rails + MySQL scaling on a budget

aka .. can’t throw hardware at it

Just when I decided to focus fulltime on my new startup SupportBee, our first product Muziboo decided to grow (no complaints though). We went from about 10k visits a day to about 30k visits a day in a month’s time. Muziboo was my first web programming experience and in the sprit of a real startup, I always postponed worrying about scaling. However, the cracks showed through this time. Here is what we observed

  • Slow page load
  • High CPU load and disk I/O
  • MySQL consuming very high CPU consistently

Below is a log of what we did to make the site faster

Better Server Monitoring and Logging

You can’t fix what you can’t measure. We researched a bit and installed munin. Apart from the standard plugins (disk, cpu, network) we setup plugins for beanstalk, rails and nginx. This gave us some pretty charts to understand how the system was behaving. Here is an example of what munin charts looks like

Not only can you see the current state of the system, you can also see data for past weeks/months/years. If you are not excited by installing munin, you can look at hosted solution like Server Density

MySQL slow query log for missing indexes

Rails does not (cannot) add indexes to your database. Rails does not (cannot) add indexes to your database automatically. You will have to figure out what indexes to add based on your queries. This is not a problem for small tables (hundreds or thousands of rows) but its a problem with bigger tables. Fortunately, its easy to find out queries that are not using the index. Apart from listing slow queries, MySQL can list all the queries that don’t use the index in slow query log. You can enable it by having this directive in your mysql configuration file

log-queries-not-using-indexes

We enabled this option and found out that we had a bunch of index less queries. As we added these indexes, mysql’s cpu usage started going down bringing the system to a much better shape. You can use mysql’s explain statement to understand your queries better and add indexes. Its a good idea to have explain’s output in your rails’ log during development. I found this snippet that does it for you

unless RAILS_ENV == 'production'
      module ActiveRecord
        module ConnectionAdapters
          class MysqlAdapter < AbstractAdapter             def select_with_explain(sql, name = nil)               explanation = execute_with_disable_logging('EXPLAIN ' + sql)               e = explanation.all_hashes.first               exp = e.collect{|k,v| " | #{k}: #{v} "}.join               log(exp, 'Explain')               select_without_explain(sql, name)             end             def execute_with_disable_logging(sql, name = nil) #:nodoc:               #Run a query without logging               @connection.query(sql)             rescue ActiveRecord::StatementInvalid => exception
              if exception.message.split(":").first =~ /Packets out of order/
                raise ActiveRecord::StatementInvalid, "'Packets out of order' error was received from the database. Please update your mysql bindings (gem install mysql) and read http://dev.mysql.com/doc/mysql/en/password-hashing.html for more information.  If you're on Windows, use the Instant Rails installer to get the updated mysql bindings."
              else
                raise
              end
            end
            alias_method_chain :select, :explain
          end
        end
      end
    end

Put it in your config/initializers directory as mysql_explain.rb and restart your server. Now you should see output from explain in your logs before your sql queries. One point to note is that throwing memcache is not a substitute for having faster queries.

Beware of MySQL’s ORDER BY rand()

Even after adding indexes, one of our queries was not using them. On investigating further I found out that MySQL’s order by rand() is evil. I won’t go into the specifics now but you can read up more on stackoverflow. In short, order by rand() will always do a full table scan. Avoid it.

MySQLDump haz all the locks

mysqldump is a popular choice for backing up databases. However it locks up all your tables during the backup. What this means is that your rails’ instances cannot access the database during that time and most likely cannot serve any requests. We moved to Percona’s xtrabackup to fix this. However this only works for tables that are innodb (in our case, every table).

Moving from mongrel to passenger

Muziboo was started when the world was just moving over from fastcgi to mongrel. Hence we were using a bunch of  mongrels monitored by monit. However, every time you want to add more mongrels, you have to change your monit configuration file to start and monitor more mongrels. Also there is no easy way to reduce the number of mongrels when under less load. Hence we decided to move over to passenger. In passenger, you just have to change the number of max instances in the nginx configuration files and more app instances will be started under load. When the load is lesser, these instances are shut down. All this is automatically done for you.

Taming I/O during rsync

We use rsync to backup uploaded files uploaded by Muziboo’s user. We have about 1 TB of data and even though the backup is incremental, there is still considerable I/O to check which files have to be moved. To make sure that rsync does not overwhelm everything else, we used the following options

nice -20 ionice -c2 -n7 /usr/bin/rsync -avz -e ssh --times --size-only --stats ......

Nice and IONice help you keep the CPU and disk usage low. You can also use it for other cron/background jobs that you run.

With these changes in place, we have been able to accomodate all the growth on a single server (Quadcore + 12 GB RAM). I think with more growth we may have slowers mysql writes and at that point we may need to move to a separate db server.

Read some interesting comments on this post on Hacker News

posted in Uncategorized by prateek

Follow comments via the RSS Feed | Leave a comment | Trackback URL

8 Comments to "Rails + MySQL scaling on a budget"

  1. sushrut wrote:

    Which provider you use for hosting?

  2. Matt wrote:

    When you say “Rails does not (cannot) add indexes to your database”, do you mean automatically? Because you most certainly can add indexes via migrations with the “add_index” function. You can also execute raw SQL to add the indexes if you feel like it.

  3. prateek wrote:

    Hi Matt,

    Yes. I meant automatically. I will update the article now.

    Thanks,
    Prateek

  4. Shai wrote:

    https://github.com/eladmeidar/rails_indexes

  5. Ariejan de Vroom wrote:

    In most apps I work on I see that rendering views takes a long time.

    Using memcached here for fragement caching will speed up requests a lot – and the load on your server – dramatically!

    Still, having a speedy database is always a good thing.

  6. Saiyine wrote:

    Some benchmarks on the “order by rand()” issue and how to solve it.

    http://www.saiyine.com/post.statistics-random-row.php

  7. Nicolas wrote:

    You should have a look at MySQL’s “key_buffer_size” parameter. It’s something a lot of people don’t bother with, but the default value (16M) might lead to a lot of i/o, and slow things down.

  8. Muziboo.com Blog - Goodbye 503 errors wrote:

    [...] if you are an RoR (Ruby on Rails) enthusiast you will have fun reading this post on Muziboo dev blog on how exactly we went about scaling. addthis_url = [...]

Leave Your Comment

 
Powered by Wordpress and MySQL. Theme by openark.org