blog.mhartl | Michael Hartl's tech blog

2008-06-26

Working around the validates_uniqueness_of bug in Ruby on Rails

Filed under: Ruby on Rails — mhartl @ 20:16

The useful validates_uniqueness_of validation in Active Record has a well-known flaw that bit us recently. (Apparently it wasn’t well-known enough. :-) Insoshi uses email addresses as unique logins, which naturally means we have a validation enforcing email uniqueness. And yet, until recently our database contained instances of people with the same email address. How can this be?

The answer is simple: despite its name, validates_uniqueness_of doesn’t actually guarantee uniqueness. Suppose that a new user registers using the address “foobar@example.com”. Rails performs the following steps:

  1. Check the database to see if there is already a person with email address “foobar@example.com”
  2. If not, insert the new record

The problem is when the following sequence occurs:

  1. HTTP request #1 tries to create a new record with email “foobar@example.com”, which Rails marks as valid
  2. HTTP request #2 tries to create a new record with email “foobar@example.com”, which Rails marks as valid
  3. The process from request #1 saves to the database
  4. The process from request #2 saves to the database

The result, contrary to the supposed “uniqueness validation”, is two records with the address “foobar@example.com”! Since the duplication happens silently, this can badly corrupt databases in some cases. (Luckily for us, our email verification uses a save! to save the person, which raises an exception for duplicate records.)

We’re not sure exactly how the problem arose—possibly from double-clicks on the registration button—but fixing it involves making changes at the database level. For the sake of illustration, consider a stripped-down Person model with only an email address and password. Here’s what the migration might look like:

class CreatePeople < ActiveRecord::Migration
  def self.up
    create_table :people do |t|
      t.column :email,    :string
      t.column :password, :string
      t.timestamps
    end
  end

  def self.down
    drop_table :people
  end
end

The validations look like this, including validates_uniqueness_of for the email attribute:

class Person < ActiveRecord::Base

  validates_presence_of     :email
  validates_uniqueness_of   :email
  validates_confirmation_of :password

  attr_accessible :email, :password, :password_confirmation
end

As we’ve seen, this doesn’t actually validate uniqueness in all cases. Let’s write an RSpec test to catch the problem. We’ll be enforcing email uniqueness at the database level, so we expect the database to raise an exception if we try to create two records with the same email address. As we’ll see once we add the constraint, the actual exception raised is of type ActiveRecord::StatementInvalid, so we’ll test for that:

require File.dirname(__FILE__) + '/../spec_helper'

describe Person do
  it "should prevent duplicate emails" do
    person = new_person
    person.save
    duplicate = new_person
    lambda do
      # Pass 'false' to 'save' in order to skip the validations.
      duplicate.save(false)
    end.should raise_error(ActiveRecord::StatementInvalid)
  end
  
  private

    def new_person(options = {})
      Person.new({ :email    => 'foobar@example.com',
                   :password => 'pass',
                   :password_confirmation => 'pass' }.merge(options)))
    end
end

The key here is the line that passes false to Active Record’s save method, which skips the validations (including validates_uniqueness_of). Initially, this test will fail:

$ spec spec/models/person_spec.rb

F

1)
'Person should prevent duplicate emails' FAILED
expected ActiveRecord::StatementInvalid but nothing was raised

Now we enforce data integrity by putting a unique index on the email field:

$ script/generate migration add_email_unique_index

class AddEmailUniqueIndex < ActiveRecord::Migration
  def self.up
    add_index :people, :email, :unique => true
  end

  def self.down
    remove_index :people, :email
  end
end

Now the test should pass:

$ rake db:migrate; rake db:test:prepare
$ spec spec/models/person_spec.rb
.

Finished in 0.138142 seconds

1 example, 0 failures

This catches the problem in the model, but the original problem—the (attempted) creation of an email duplicate—will still raise an exception in our application. The mere attempt to create a dupe isn’t by itself a problem, so it’s probably best simply to ignore the exception by catching it and redirecting somewhere sensible:

  def create
    @person = Person.new(params[:person])
    if @person.save
      self.current_person = @person
      redirect_to '/'
      flash[:notice] = "Thanks for signing up!"
    else
      flash[:error]  = "We couldn't set up that account, sorry."
      render :action => 'new'
    end
  rescue ActiveRecord::StatementInvalid
    redirect_to '/'
  end

With that, we’ve prevented duplication and handled any errors gracefully. Huzzah!

Postscript: If your database is infected with duplicate entries, there’s a quick way to find them using the console. Since people with non-unique email addresses are invalid, we can find and destroy them as follows:

$ script/console
>> duplicates = Person.find(:all).reject { |person| person.valid? }; 0
=> 0
>> duplicates.map(&:email)
=> ["foobar@example.com", "foobar@example.com", "bazquux@example.com", "bazquux@example.com"]
>> duplicate[0].destroy; duplicate[2].destroy

(The ; 0 in the second line just suppresses the (potentially long) list of Person objects.) If you have more duplicates than this, you might want to write a script or rake task to scrub your database, but using the console was sufficient to solve our problem at the Insoshi developer site.

« Newer Posts

Theme: Customized Shocking Blue Green. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.